Develop. Growth Differ. (2008) 50, S131–S140 doi: 10.1111/j.1440-169X.2008.00991.x

Review

Clustered protocadherin family

Takeshi Yagi* KOKORO-Biology Group, Laboratories for Integrated Biology, Graduate School of Frontier Biosciences, Osaka University, 1-3 Yamadaoka, Suita, Osaka 565-0871, Japan

The brain is a complex system composed of enormous numbers of differentiated neurons, and brain structure and function differ among species. To examine the molecular mechanisms underlying brain structure and function, it is important to identify the molecules involved in generating neural diversity and organization. The clustered protocadherin (Pcdh) family is the largest subgroup of the diverse cadherin superfamily. The clustered Pcdhs are predominantly expressed in the brain, and their structures are diversified among vertebrates. In mammals, the clustered Pcdh family consists of three gene clusters: Pcdh-α, Pcdh-β, and Pcdh-γ. During brain development, this family is upregulated by neuronal differentiation, and Pcdh-α is then dramatically downregulated by myelination. Clustered Pcdh expression continues until adulthood in regions that include the olfactory bulb and hippocampus. Structural analysis of the first cadherin domain of Pcdh-α revealed that it lacks the features that classical cadherins require for homophilic adhesiveness, but contains Pcdh-specific loop structures. In Pcdh-α, an RGD motif on a specific loop structure binds β1 integrin. For gene expression, the gene clusters are regulated by multiple promoters and alternative cis-splicing. At the single-cell level, several dozen Pcdh-α and -γ mRNAs are regulated monoallelically, resulting in the combinatorial expression of distinct variable exons. The Pcdh-α and Pcdh-γ proteins also form oligomers, further increasing the molecular diversity at the cell surface. Thus, the unique features of the clustered Pcdh family may provide the molecular basis for generating individual cellular diversity and the complex neural circuitry of the brain.

Key words: brain, CNR, evolution, integrin, neuronal diversity.

Introduction

The brain differs in its structural organization and functioning across species. It contains enormous numbers of differentiated individual neurons that form complex neural networks. To understand the biological mechanisms of the brain, we need to learn, at the molecular level, how neuronal diversity is generated and how the neural networks are organized.

Cell-adhesion molecules are known to be involved in many biological events, including cell recognition, cell signaling, and the formation of neural networks. Cadherins were originally characterized as calcium-dependent cell adhesion molecules, and are identified by their cadherin sequence motifs, which contain about 110 amino acids (Takeichi 1988). The recent sequencing of several vertebrate genomes has revealed the impressive diversity of the cadherin superfamily, in which more than 100 different cadherin-related genes have been identified (Yagi & Takeichi 2000). The protocadherin (Pcdh) family is a large subgroup within the cadherin superfamily in vertebrates, and the Pcdh genes are predominantly expressed in the brain. The Pcdh family is divided into two groups based on their genomic structure: the clustered and nonclustered Pcdh families (Morishita & Yagi 2007). With over 50 members, the clustered Pcdh family represents the largest subgroup in the cadherin superfamily (Kohmura et al. 1998; Wu & Maniatis 1999). In mammals, the clustered Pcdh family consists of the Pcdh-α, Pcdh-β, and Pcdh-γ genes (Hirayama & Yagi 2006). These gene clusters are generally organized into a variable region containing a tandem array of variable exons and a constant region containing one set of constant exons (Fig. 1).

*Author to whom all correspondence should be addressed.
Email: [email protected]
Received 20 November 2007; revised 6 December 2007; accepted 7 December 2007.
© 2008 The Author. Journal compilation © 2008 Japanese Society of Developmental Biologists

[Figure 1: schematic of the Pcdh-α, Pcdh-β, and Pcdh-γ clusters in genomic DNA, showing the variable exons (α1–α12, αC1, αC2; β1–β22; γA1–γA12, γB1–γB8, γC3–γC5), the constant regions, the HS7 and HS5-1 sites, the CSE in the promoters, mRNA splicing, and the protein layout (EC1–EC6, TM, cytoplasmic region) with the RGD (45–47) and C-(X)5-C (70–76) motifs in EC1.]

Fig. 1. Genomic organization of the mouse clustered Pcdh genes. Three clusters are present on chromosome 18: Pcdh-α (α), Pcdh-β (β), and Pcdh-γ (γ). The gene clusters include variable and constant regions, although the Pcdh-β cluster does not have a constant region. The Pcdh-αC1 and -αC2 variable exons exhibit high homology with Pcdh-γC3, -γC4, and -γC5. The small black boxes indicate identified DNase I hypersensitive enhancer sites (HS7 and HS5-1) in the Pcdh-α cluster. A conserved sequence element (CSE) is almost always present in the promoter region upstream of each variable exon. The CSE sequence, however, is not conserved in the Pcdh-αC2 promoter region. The mature mRNA of Pcdh-α is produced from one of these variable exons and a set of three constant region exons by cis-splicing. Each variable exon encodes a single extracellular, transmembrane (TM), and cytoplasmic region, and the Pcdh-α and -γ constant exons encode the cytoplasmic tails. All the clustered Pcdh proteins consist of a signal peptide with six extracellular cadherin (EC) domains in the extracellular region. In addition, there is a Cys-(X)5-Cys (C-X5-C) motif in the EC1 domain of all Pcdh proteins, and an RGD motif is well conserved in the mammalian Pcdh-α proteins. The ribbon diagram shows the representative NMR structure of the EC1 domain of Pcdh-α4 (Morishita et al. 2006).
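As a rough illustration of the cis-splicing rule described in the Fig. 1 legend, the following Python sketch enumerates the mature transcripts each cluster could produce when one variable exon is joined to that cluster's constant exons, if it has any. The exon lists are an assumption read off the labels in Figs 1 and 3 (γB3 does not appear in the allele maps of Fig. 3), and the names `CLUSTERS` and `mature_transcripts` are hypothetical, not taken from the paper.

```python
# Variable exons per mouse cluster, as labelled in Figs 1 and 3.
# ASSUMPTION: these lists are complete; gB3 is omitted because it is
# not shown in the allele maps of Fig. 3.
ALPHA = [f"a{i}" for i in range(1, 13)] + ["aC1", "aC2"]
BETA = [f"b{i}" for i in range(1, 23)]
GAMMA = (
    [f"gA{i}" for i in range(1, 13)]
    + [f"gB{i}" for i in (1, 2, 4, 5, 6, 7, 8)]
    + ["gC3", "gC4", "gC5"]
)

# The Pcdh-beta cluster has no constant region (Fig. 1 legend).
CLUSTERS = {
    "Pcdh-alpha": {"variable": ALPHA, "has_constant": True},
    "Pcdh-beta": {"variable": BETA, "has_constant": False},
    "Pcdh-gamma": {"variable": GAMMA, "has_constant": True},
}


def mature_transcripts(cluster):
    """One transcript per variable exon, cis-spliced to the constant exons if present."""
    suffix = "+constant" if cluster["has_constant"] else ""
    return [exon + suffix for exon in cluster["variable"]]


counts = {name: len(mature_transcripts(c)) for name, c in CLUSTERS.items()}
total = sum(counts.values())
print(counts, total)
```

Under these assumptions the sketch yields 14, 22, and 22 transcripts for the α, β, and γ clusters, a total of 58, which agrees with the statement in the Introduction that the clustered Pcdh family has over 50 members.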

Molecular evolution

Clustered Pcdh genes have been identified in several vertebrate species: human, chimpanzee, rat, mouse, chicken, zebrafish, fugu, and coelacanth (Wu & Maniatis 1999; Sugino et al. 2000; Noonan et al. 2004a,b; Sugino et al. 2004; Tada et al. 2004; Yanase et al. 2004; Wu 2005; Yu et al. 2007; Fig. 2). Invertebrate species contain cadherin-related genes, but not clustered Pcdh genes. In vertebrates, the clustered Pcdh genes are predominantly expressed in the brain. Therefore, the Pcdh gene clusters were generated in the genome of a common vertebrate ancestor and diversified during brain evolution. Phylogenetic analyses of the Pcdh gene clusters demonstrated that the constant region of the Pcdh-α and Pcdh-γ clusters is highly conserved among vertebrates (Fig. 2). Teleost fishes have two unlinked clustered Pcdh loci. Zebrafish has one set of the Pcdh-α and Pcdh-γ clusters in each of these loci (Noonan et al. 2004b; Tada et al. 2004). Fugu possesses one set of the Pcdh-α and Pcdh-γ clusters in one locus, and only the Pcdh-α cluster in the other (Yu et al. 2007). The constant regions of the Pcdh-α and Pcdh-γ clusters in teleosts are also highly conserved (Fig. 2).

The Pcdh variable exons are orthologous among human, rat, and mouse (Sugino et al. 2000; Noonan et al. 2004a; Yanase et al. 2004; Wu 2005). In those of the chicken, zebrafish, fugu, and coelacanth, however, such orthologous relationships are not found (Noonan et al. 2004a,b; Sugino et al. 2004; Tada et al. 2004; Yu et al. 2007). In addition, unusual splicing forms that are not seen in mammals are generated from the more variable region of the Pcdh-α genes in zebrafish. Nevertheless, the putative promoter elements upstream of the variable exons are highly conserved across species (Tada et al. 2004). Analysis of single nucleotide polymorphisms (SNP) of the Pcdh-α variable exons among wild-derived and laboratory mouse strains demonstrated that synonymous (i.e., no change in the amino acid) SNP are concentrated in the first and fifth extracellular cadherin (EC) domains within each variable exon (Taguchi et al. 2005). These domains have high nucleotide homology among the Pcdh-α paralogues, suggesting that gene conversion events in the homologous domains are related to the generation of SNP. A phylogenetic analysis revealed gene conversion events in the EC1 and EC5 domains. Interestingly, the guanine/cytosine (GC) content of the third nucleotide in each codon is higher
in the EC1 and EC5 domains, although a bias for GC to adenine/thymine (AT) substitutions is found in all the codon positions (Taguchi et al. 2005). Paralogous relationships as a result of gene conversion have been demonstrated in several multigene families that show increased GC content at the third codon position (Drouin et al. 1999). Thus, there is evidence for gene conversion events in the Pcdh clusters (Noonan et al. 2004b; Miki et al. 2005; Taguchi et al. 2005). In addition, findings indicate that lineage-specific duplication of variable exons, region-restricted gene conversion within species, and adaptive variation have all driven the evolution of the clustered Pcdh genes (Noonan et al. 2004b).

In humans, the Pcdh gene cluster is located at 5q31 and contains many SNP (Kirov et al. 2003; Noonan et al. 2003; Miki et al. 2005). The polymorphisms in the Pcdh-α gene cluster include amino acid exchanges, extensive linkage disequilibrium, and deletions (Noonan et al. 2003; Miki et al. 2005). This molecular diversity in the human population is compatible with the clustered Pcdh genes being candidate genes for psychiatric diseases as well as for determining functional characteristics of the human brain.

[Figure 2: maps of the Pcdh-α, Pcdh-β, and Pcdh-γ variable and constant regions in human (Chr5, ~750 kb), mouse (Chr18, ~900 kb), chicken, coelacanth, zebrafish (cluster 1, Chr10, ~320 kb; cluster 2, Chr14, >320 kb), and fugu (clusters 1 and 2).]

Fig. 2. Comparison of the human, mouse, chicken, coelacanth, zebrafish (two unlinked), and fugu (two unlinked) Pcdh clusters. Paralogous relationships in each gene cluster are indicated by the vertical colored boxes. Orthologous relationships for mammalian Pcdh-α, -β, and -γ clusters are indicated by the colored bars connected with a line. The variable and constant region exons of Pcdh-α (α), Pcdh-β (β), and Pcdh-γ (γ) are shown at the top. Both zebrafish and fugu contain two unlinked clustered Pcdh gene loci (cluster 1 and cluster 2). The Pcdh-β cluster is absent from teleost fishes (zebrafish and fugu). Several variable genes, which are shown as purple, green, and light purple boxes in chicken, coelacanth, zebrafish, and fugu, are absent from the mammalian genomes. Red exons are the constant exons of the Pcdh-α and Pcdh-γ clusters, which are well conserved in all vertebrates (shown by red bars). The Pcdh-β cluster has no constant exons. Black vertical boxes represent relics or pseudogenes. Adapted from Hirayama & Yagi (2006).

Protein structure and cell adhesion

Protein structural analysis has provided significant insight into the molecular basis for the cell adhesion of classical cadherins (Patel et al. 2003). The homophilic adhesive binding region of classical cadherin is located in the EC1 domain (Boggon et al. 2002; Morishita et al. 2006). To determine whether the clustered Pcdh proteins share these characteristics, the solution protein structure of the EC1 domain of Pcdh-α4 was determined by nuclear magnetic resonance (NMR) (Umitsu et al. 2005; Morishita et al. 2006). The topology, a β-sandwich-like structure composed of two packed β sheets, resembles the EC1 domain of classical cadherins, although the sequence similarity between the EC1 domains of Pcdh-α4 and of
classical cadherins is low (30% maximum). However, the interconnecting loops of the Pcdhs are divergent from those of the classical cadherins in length and biochemical properties.

The Pcdh-α4 EC1 domain does not have a tryptophan (W2) or a deep hydrophobic pocket, which are essential for the homophilic adhesive property of the classical cadherins (Boggon et al. 2002; Morishita et al. 2006). The deep hydrophobic pocket is highly conserved among the classical, type II, and desmosomal cadherins (Patel et al. 2003). Although the Pcdh-α4 EC1 domain has a hydrophobic cluster in the corresponding region, the pocket is not as deep as in classical cadherins. The lack of W2 and a deep hydrophobic pocket suggests that the Pcdhs do not have the homophilic adhesion property of the classical cadherins. Consistent with these structural characteristics, a biochemical analysis of the Pcdh-α4 protein by a protein-coated bead aggregation assay showed little trans-homophilic adhesion activity compared with that of N-cadherin (Morishita et al. 2006). On the other hand, Pcdh proteins are reported to form cis-homodimers (Pcdh-α with Pcdh-α (Triana-Baltzer & Blank 2006), or Pcdh-γ with Pcdh-γ (Hambsch et al. 2005)) and cis-heterodimers (Pcdh-α with Pcdh-γ (Murata et al. 2004)) on the same cell surface. It is possible that Pcdh proteins possess trans-homophilic adhesive activity after they form Pcdh complexes with other Pcdh proteins (see Fig. 3).

Structural characterization has also been performed for the loop regions of the Pcdh EC1 domain, including the Pcdh-specific disulfide-bonded Cys-(X)5-Cys motif and the RGD (Arg-Gly-Asp) motif (Fig. 1). Interestingly, the Cys-(X)5-Cys motif is well conserved among the Pcdhs, but not in classical cadherins (Morishita & Yagi 2007). Structural analysis revealed that the Cys-(X)5-Cys motif is exposed to solvent as a loop, with the two cysteines forming a disulfide bond (Morishita et al. 2006). The disulfide-bonded loop of a Cys-(X)5-Cys motif provides a protein–protein interaction site for viral proteins (Poumbourios et al. 2003), suggesting that the Cys-(X)5-Cys motif specific to the Pcdh family contributes to its function by acting as a novel adhesion site. The RGD motif in the EC1 of Pcdh-α4 is also in a loop region that is exposed to solvent (Fig. 1). The RGD motif is a well-known binding motif for integrin (Ruoslahti & Pierschbacher 1987). Consistent with the structural characteristics of the Pcdh-α4 protein, heterophilic cell adhesion activity was observed between Pcdh-α4 and β1 integrin in a cell aggregation assay with HEK293T cells (Mutoh et al. 2004). The activation of integrin is necessary for cells to bind immobilized Pcdh-α4 EC1 fragments (Morishita et al. 2006). The RGD motif is conserved in mammalian Pcdh-α family members, although it is absent from the Pcdh-α proteins of other vertebrates. Therefore, the interaction between the Pcdh-α family and β1 integrin may determine certain neural features that are characteristic of mammals.

Pcdh proteins are reported to be regulated by proteolysis. The Pcdh-α and Pcdh-γ proteins are specifically cleaved by ADAM10 and presenilin (Haas et al. 2005; Hambsch et al. 2005; Reiss et al. 2006; Bonn et al. 2007), and this proteolysis modulates cell adhesion and the formation of Pcdh-α and Pcdh-γ cis-heterodimer complexes (Fig. 3). In addition, the further proteolysis of Pcdh-α and Pcdh-γ generates cytoplasmic fragments, which translocate into the nucleus from the plasma membrane (Haas et al. 2005; Hambsch et al. 2005; Bonn et al. 2007). Overexpression of the Pcdh-γ cytoplasmic domain can transactivate several Pcdh-γ promoters, but not the Pcdh-α promoters, in a manner that does not depend on the conserved sequence element (CSE) (Hambsch et al. 2005). Therefore, the proteolysis of Pcdh-γ proteins at the cell membrane might induce transactivation of the Pcdh-γ promoters. The downregulation of Pcdh-γ proteins in mutant mice lacking the cytoplasmic tail of Pcdh-γ is thought to be caused by the loss of this feedback reaction. However, the truncated transcripts from the knock-out mice, which possess a frame-shift STOP codon resulting from a skipped third exon, are degraded by the nonsense-mediated mRNA decay (NMD) reaction (Culbertson 1999). Thus, the biochemical mechanisms behind the feedback reaction in the Pcdh-γ gene cluster are not yet clear.

Pcdh protein interactions

During neocortical development, cortical plate neurons express Pcdh-α mRNA, and Pcdh-α proteins are found in the marginal zone, subplate, and intermediate zone (Senzaki et al. 1999; Morishita et al. 2004b). Cellular interactions between the cortical neurons and the Cajal-Retzius cells in the marginal zone are important for the layering and positioning of neocortical neurons. Reelin expressed in the Cajal-Retzius cells is a key molecule for appropriate layering (D’Arcangelo & Curran 1998). We showed that the RGD motif of the Pcdh-α4 protein binds to the first Reelin repeat in the N-terminus of Reelin (Senzaki et al. 1999). We showed this using detergent-solubilized crude extracts from the transfected cells, because the Reelin-alkaline phosphatase (AP) fusion proteins used in the binding experiments were not secreted from the transfected HEK293 cells. However, Jossin et al. reported that the secreted form of Reelin from transfected cells binds to very-low-density lipoprotein
receptor (VLDLR) and apolipoprotein-E receptor type 2 (ApoER2), but not to the Pcdh-α4 protein (Jossin et al. 2004). Reelin binds to VLDLR and ApoER2 via Reelin repeats 3–6, in the middle of the Reelin protein. The authors speculated that the Reelin-AP fusion protein used in our previous study contained misfolded Reelin-AP fusion proteins, and that the misfolded proteins bound the Pcdh-α4 protein non-specifically (Jossin et al. 2004).

[Figure 3: schematic of Pcdh-α and Pcdh-γ proteins (EC1–EC6, TM, CP) at the cell surface, showing RGD binding to integrins, γ-secretase cleavage, and transactivation in the nucleus, together with maps of the maternal and paternal alleles of the α, β, and γ clusters.]

Fig. 3. Schematic overview of the gene regulation and protein interaction of Pcdh-α and Pcdh-γ. In the nucleus, monoallelic and biallelic types of gene regulation in the Pcdh-α and Pcdh-γ clusters can occur in a single neuron. In each maternal or paternal allele, the Pcdh-α1 to α12, Pcdh-γA1 to γA12, and Pcdh-γB1 to γB8 variable exons exhibit monoallelic and combinatorial expression. By contrast, the C-types of the Pcdh-α and Pcdh-γ variable exons are expressed biallelically. This model is based on data obtained from single Purkinje cells. Arrows indicate the strongly expressed variable exons. The differential expression of these genes in individual neurons may specify neuronal identity in the brain. Outside the nucleus, the co-expression of the Pcdh-α (dark blue) and Pcdh-γ (light green) proteins greatly enhances their transportation from the endoplasmic reticulum to the cell surface, compared with the expression of Pcdh-α alone. A portion of the Pcdh-α protein forms a heterocomplex with Pcdh-γ. These protein complexes increase the combinatorial diversity on the cell surface, and contribute to the differential characteristics specifying the individuality of single neurons. The Pcdh-α protein interacts with integrins through its RGD motif. The Pcdh-α and Pcdh-γ cytoplasmic domains are cleaved by γ-secretase after the extracellular domain is cleaved by metalloproteinase. Their C-terminal fragments are transported into the nucleus, and then the Pcdh-γ fragment may transactivate the Pcdh-γ genes. The formation of cis-heteromers between Pcdh-α and Pcdh-γ alters their susceptibility to proteolytic processing. α, Pcdh-α; γ, Pcdh-γ; EC, extracellular cadherin domain; TM, transmembrane domain; CP, cytoplasmic domain; ER, endoplasmic reticulum.

On the other hand, as described above, our recent studies demonstrated that the Pcdh-α4 protein interacts with β1 integrin expressed in HEK293 cells (Mutoh et al. 2004; Morishita et al. 2006). The RGD motif of Pcdh-α4 is required for its binding to β1 integrin. This RGD motif is the same site that we found to bind the Reelin-AP fusion protein (Senzaki et al. 1999). In addition, Dulabon et al. (2000) demonstrated that Reelin binds to the α3β1 integrin. The α3β1 integrin binds directly to a secreted form of Reelin, and the N-terminal region of Reelin is required for this binding (Schmid et al. 2005). The N-terminal region of Reelin, which contains the first Reelin repeat, is the region we identified as binding the Pcdh-α4 protein (Senzaki et al. 1999).

Thus, both the RGD motif of Pcdh-α4 and the N-terminal region of Reelin-AP can bind integrins, suggesting that the Pcdh-α4 protein might bind the secreted form of Reelin indirectly, through integrins. Therefore, in our experiments, the Pcdh-α4 protein may have bound to the integrin/Reelin protein complex through its RGD motif in the protein extracts from the transfected HEK293 cells. It is also possible that the Pcdh-α4 protein does not bind directly to the secreted full-length Reelin protein (as shown in Jossin et al. 2004), but rather to a protein complex of integrin and Reelin. However, the function of the interaction between Pcdh-α4 and integrin (or Reelin) in vivo is not yet known.

Subcellularly, fractionation experiments showed that both the Pcdh-α and Pcdh-γ proteins are enriched in the post-synaptic density (PSD) fraction, and double-label immunostaining showed that their signals partially overlap. It is reported that overexpressed Pcdh-α proteins are barely expressed on the cell surface (Mutoh et al. 2004). Interestingly, when Pcdh-α and Pcdh-γ are coexpressed in HEK293T cells, there is a more than fivefold increase in the cell-surface expression of Pcdh-α (Murata et al. 2004). These results suggest that Pcdh-α and Pcdh-γ form a hetero-protein complex (Fig. 3). The Pcdh-α and Pcdh-γ interaction was confirmed by co-immunoprecipitation studies using extracts of mouse brain and neuroblastoma. This combinatorial expression of the Pcdh-α and Pcdh-γ proteins would allow individual neurons to express more diverse surface molecules, perhaps establishing their neuronal cell identity and serving as molecular signals for neural network formation. However, in Pcdh-γ targeted mice lacking the first constant exon, no marked differences in the Pcdh-α protein levels on the cell surface are found between wild-type and Pcdh-γ targeted neurons. This result indicates that Pcdh-α surface delivery, in cultured neurons and in vivo, is mainly independent of Pcdh-γ expression (Bonn et al. 2007).

Several proteins that interact with the cytoplasmic regions of Pcdh-α and Pcdh-γ have been reported (Kohmura et al. 1998; Gayet et al. 2004; Triana-Baltzer & Blank 2006). A complementary DNA fragment of Pcdh-α was first isolated as a weak binder from a yeast two-hybrid system using the N-terminus of Fyn kinase (Kohmura et al. 1998). The expression patterns and subcellular fractionation (enrichment in the PSD fraction) are similar between the Pcdh-α and Fyn proteins. The Pcdh-α4 protein (recognized with the monoclonal antibody 6-1B) is co-precipitated with an anti-Fyn antibody (Kohmura et al. 1998). In addition, although several anti-Pcdh-α and anti-Pcdh-γ antibodies can co-precipitate both Pcdh-α and Pcdh-γ proteins from brain extracts (Murata et al. 2004), Fyn protein is not included in these immunoprecipitants (T. Yagi, unpubl. data, 2007). Thus, whether there is an in vivo interaction between Pcdh-α and Fyn proteins is not yet clear. On the other hand, neurofilament M (NFM) and fascin were isolated by yeast two-hybrid screening with the two (A-type and B-type) cytoplasmic constant regions of Pcdh-α, respectively (Triana-Baltzer & Blank 2006), suggesting that links between the Pcdh-α protein and the cytoskeleton are mediated by NFM (neurofilament) and fascin (actin filament); however, these protein interactions and their functions in vivo have yet to be determined. It is difficult to obtain conclusive evidence for Pcdh-α-binding proteins from intact-cell-based assays, because heterologously overexpressed Pcdh-α protein is mostly retained in intracellular compartments, such as the endoplasmic reticulum and Golgi (Takei et al. 2001; Blank et al. 2004).

From a yeast two-hybrid binding assay with the cytoplasmic region of Pcdh-γB1, a microtubule-destabilizing protein, SCG10, was isolated (Gayet et al. 2004). Both SCG10 and Pcdh-γB1 protein are concentrated in growth cones, where they are co-localized, and they are highly expressed in the dorsal root entry zone of the spinal cord at specific developmental stages (Stein et al. 1988). However, an interaction between these proteins has not been supported by in vivo immunoprecipitation assays using brain extracts. To clarify the molecular function of the clustered Pcdh proteins, the proteins that bind to their cytoplasmic regions need to be further evaluated and analyzed with respect to their in vivo interactions and functions.

Expression and gene regulation of the clustered Pcdh family

Clustered Pcdhs are expressed during neural development and gradually concentrated in the synaptic region, and their levels greatly decrease as the brain matures (Kohmura et al. 1998; Wang et al. 2002b; Phillips et al. 2003; Blank et al. 2004; Morishita et al. 2004b; Frank et al. 2005; Zou et al. 2007). The levels of the clustered Pcdh gene transcripts peak between postnatal days 0 and 10. The decline in Pcdh-α protein expression is triggered by myelination (Morishita et al. 2004a) (Fig. 4). In the optic nerve, the expression of Pcdh-α proteins increases during nerve development and then dramatically decreases at P10, when the myelin sheath is compacted. In the Shiverer mutant mouse, which carries a deletion of the gene encoding myelin basic protein, the expression of Pcdh-α protein does not decrease rapidly at P10, and is elevated at P19 (Morishita et al. 2004a; Fig. 4). These results may
suggest that the Pcdh-α family contributes to neural circuit maturation during development. Interestingly, recent reports demonstrate that the formation of appropriate neuronal circuits in the visual cortex is terminated by myelination. In the Pcdh-γ cluster, the expression pattern of Pcdh-γC5, a gene located at the end of the Pcdh-γ cluster, is distinct from that of the other clustered Pcdh genes: it has a later postnatal onset and stays high until adulthood (Frank et al. 2005). Thus, Pcdh-γC5 may engage in neuronal adhesion and signaling events related to neuronal maturation.

[Figure 4: (A) Pcdh-α and MAG at E17.5, P4, P7, P10, and P19; (B) Pcdh-α at P10, P19, and adult in wild-type and Shiverer mice.]

Fig. 4. Localization of Pcdh-α protein to axons during brain development and regulation of Pcdh-α protein level by axonal development and myelin maturation. (A) Inverse pattern of Pcdh-α and myelin-associated glycoprotein (MAG) expression in developing axons of the internal capsule. The level of Pcdh-α protein is decreased during myelination. (B) Elevated expression of Pcdh-α protein in the optic nerve of Shiverer mice, which are hypomyelinated mutants. E, embryonic day; P, postnatal day. Adapted from Morishita et al. (2004a).

In situ hybridization analyses reveal that individual neurons express different sets of the clustered Pcdh isoforms (Kohmura et al. 1998; Wang et al. 2002a; Esumi et al. 2005; Fig. 5). The clustered Pcdh genes are also regulated by a unique mechanism (Esumi et al. 2005; Kaneko et al. 2006). Single-cell analysis of Purkinje cells using multiple reverse transcription–polymerase chain reactions (RT–PCR) revealed that the Pcdh-α and -γ isoforms exhibit unusual monoallelic and combinatorial expression patterns in individual neurons (Esumi et al. 2006). Single-cell PCR analyses of Purkinje cells showed that individual neurons express apparently random sets of Pcdh-α1–12 mRNA. Although the same Pcdh-α1–12 isoform can be expressed from both chromosomes, this dual expression is rarely seen, suggesting that the choice of promoters is stochastic within each chromosome and independent of the choice in the other chromosome (Esumi et al. 2005; Fig. 3). The total allelic gene regulation in the Pcdh-α and -γ clusters, including the C-type variable exons (C1 to C5) and the Pcdh-γA and -γB variable exons, was examined in single Purkinje cells. The results showed that all the C-type isoforms are biallelically expressed in almost all Purkinje cells. On the other hand, the Pcdh-α, -γA, and -γB isoforms are differentially regulated in each cell by both monoallelic and combinatorial gene regulation mechanisms (Kaneko et al. 2006; Fig. 3). In the Pcdh-α and -γ clusters, different types of allelic gene regulation (monoallelic and biallelic) occur, and these genes are spliced to the same constant exons. In particular, elucidating the process by which the stochastic expression of Pcdh-α, -γA, and -γB is regulated in individual neurons will be valuable for understanding the molecular mechanisms involved in generating neuronal individuality and diversity. One hypothesis is that their differential expression in individual neurons might provide a unique identity to distinguish between self and non-self neurons. In addition, similar to the clustered Pcdh genes, it is reported that more than 1000 genes in the genome can potentially be monoallelically expressed (Gimelbrant et al. 2007). These findings suggest that gene regulation by random allele inactivation can generate diversity in gene expression that affects cell fate and physiology (Ohlsson 2007). In addition, different states of DNA methylation are observed in variable Pcdh-α promoter regions (Tasic et al. 2002; Kawaguchi et al. 2008).

Long-range cis-regulatory DNA elements in the Pcdh-α gene cluster have been identified (Ribich et al. 2006). Two genomic DNA fragments, HS5-1 and HS7, possess enhancer activity in reporter assays (see Fig. 1). Interestingly, both HS5-1 and HS7 are conserved among vertebrates, and include CSE sequences, which are highly conserved among the promoters in the clustered Pcdh genes. Reporter assays using transgenic mice showed that the HS5-1 element functions in the brain, and the HS7 element functions in the peripheral nervous system. Furthermore, disruption of the HS5-1 region from the Pcdh-α cluster genome in embryonic stem cells revealed that HS5-1 is required for high-level expression from the Pcdh-α1–12 and -αC1 promoters, but not from the
Pcdh-α (constant region) Pcdh-α3 Pcdh-α11 Fig. 5. Expression of Pcdh-a mRNA in the cerebellum of P21 mice. Digoxigenin-labeled RNA probes for Pcdh-a constant, Pcdh- a3, and Pcdh-a11 mRNA were used for in situ hybridization. Signals for Pcdh-a constant were present in all observed Purkinje cells. In contrast, the signals for Pcdh-a3 and Pcdh-a11 were found in a subpopulation of Purkinje cells (arrowheads). Adapted from Esumi et al. (2005).

Pcdh-aC2 promoter. That the HS5-1 element is involved in regulating the majority of the clustered members is consistent with the idea that monoallelic expression of the Pcdh-a variable exon is a consequence of competition among the individual variable exon promoters for the regulatory elements. However, the long-range cis-regulatory DNA elements have not yet been completely identified in the Pcdh-b and Pcdh-g clusters, or even in the Pcdh-a cluster. To clarify the molecular mechanisms responsible for the unusual regulation of the clustered Pcdh genes, the cis-regulatory DNA elements in the genome of the Pcdh gene cluster need to be further identified and evaluated.

In addition to the long-range cis-regulatory elements, recent human genome-wide analyses of the binding sites of the insulator protein CCCTC-binding factor (CTCF) indicate that CTCF-binding sites punctuate the alternative promoters in the Pcdh-a and Pcdh-g cluster loci (Kim et al. 2007; Xie et al. 2007). This finding raises the possibility that insulator elements are involved in the selection of promoters in individual neurons. To better understand the molecular mechanisms involved in generating the diversity of individual neurons, further studies aimed at clarifying the mechanisms of regulation in the Pcdh gene cluster will be valuable.

In vivo function of the clustered Pcdh family

To examine the in vivo function of the clustered Pcdh family, genetic studies in mice have been performed. Deleting the cluster of Pcdh-g genes results in a partial apoptotic loss of spinal interneurons and a reduction in spinal synaptic density (Wang et al. 2002b). To investigate the roles of Pcdh-g specific to synaptogenesis, apoptosis can be minimized by removing BAX or by using a hypomorphic allele of Pcdh-g (Weiner et al. 2005). In these mutant mice, the spinal cord still shows decreased synaptic density, and the activity of the formed synapses is reduced, providing the first evidence for a role of this clustered Pcdh in synaptic development. However, the localized effect of the Pcdh-g deletion on interneurons of the spinal cord also indicates that the expression of different members of the clustered Pcdh family might specify both the survival and the synaptic formation of different neuronal populations. We recently demonstrated that Pcdh-a mutants show abnormal sorting of olfactory sensory neurons into glomeruli (Hasegawa et al. in press). Further analyses of Pcdh-a and Pcdh-b family mutant mice, as well as mice with reduced numbers of clustered Pcdhs, are current research directions.

Conclusion

The clustered Pcdh family is a structurally and functionally related subgroup of the cadherin superfamily, whose members share a unique protein structure and unusual gene regulation mechanisms. Keen interest in the clustered Pcdh family arises from its extensive diversity in the vertebrate brain. In invertebrates such as Drosophila, another , DSCAM, shows extraordinary diversity (Schmucker et al. 2000), which provides each neuron with a unique identity that enables it to distinguish between self and non-self cells (Zipursky et al. 2006). Recent findings on the protein structure and properties, gene regulation, and in vivo functions of the clustered Pcdh family indicate that these diverse proteins play significant roles in the vertebrate brain. Continued investigation of the clustered Pcdh family by various approaches will provide important information about the molecular mechanisms involved in generating the highly complex systems of the vertebrate brain.

Acknowledgments

I am grateful to K Inoue and S Yokota, and Drs S Hamada, K Senzaki, R Kaneko, T Hirayama, and H Morishita for their contributions to preparing the manuscript.


Conflict of Interest

No conflict of interest has been declared by T. Yagi.

References

Blank, M., Triana-Baltzer, G. B., Richards, C. S. & Berg, D. K. 2004. Alpha- are presynaptic and axonal in nicotinic pathways. Mol. Cell. Neurosci. 26, 530–543.
Boggon, T. J., Murray, J., Chappuis-Flament, S., Wong, E., Gumbiner, B. M. & Shapiro, L. 2002. C-cadherin ectodomain structure and implications for cell adhesion mechanisms. Science 296, 1308–1313.
Bonn, S., Seeburg, P. H. & Schwarz, M. K. 2007. Combinatorial expression of alpha- and gamma-protocadherins alters their presenilin-dependent processing. Mol. Cell. Biol. 27, 4121–4132.
Culbertson, M. R. 1999. RNA surveillance. Unforeseen consequences for gene expression, inherited genetic disorders and cancer. Trends Genet. 15, 74–80.
D’Arcangelo, G. & Curran, T. 1998. Reeler: new tales on an old mutant mouse. Bioessays 20, 235–244.
Drouin, G., Prat, F., Ell, M. & Clarke, G. D. 1999. Detecting and characterizing gene conversions between multigene family members. Mol. Biol. Evol. 16, 1369–1390.
Dulabon, L., Olson, E. C., Taglienti, M. G. et al. 2000. Reelin binds alpha3beta1 integrin and inhibits neuronal migration. Neuron 27, 33–44.
Esumi, S., Kakazu, N., Taguchi, Y. et al. 2005. Monoallelic yet combinatorial expression of variable exons of the protocadherin-alpha gene cluster in single neurons. Nat. Genet. 37, 171–176.
Esumi, S., Kaneko, R., Kawamura, Y. & Yagi, T. 2006. Split single-cell RT-PCR analysis of Purkinje cells. Nat. Protoc. 1, 2143–2151.
Frank, M., Ebert, M., Shan, W. et al. 2005. Differential expression of individual gamma-protocadherins during mouse brain development. Mol. Cell. Neurosci. 29, 603–616.
Gayet, O., Labella, V., Henderson, C. E. & Kallenbach, S. 2004. The b1 isoform of protocadherin-gamma (Pcdhgamma) interacts with the microtubule-destabilizing protein SCG10. FEBS Lett. 578, 175–179.
Gimelbrant, A., Hutchinson, J. N., Thompson, B. R. & Chess, A. 2007. Widespread monoallelic expression on human autosomes. Science 318, 1136–1140.
Haas, I. G., Frank, M., Veron, N. & Kemler, R. 2005. Presenilin-dependent processing and nuclear function of gamma-protocadherins. J. Biol. Chem. 280, 9313–9319.
Hambsch, B., Grinevich, V., Seeburg, P. H. & Schwarz, M. K. 2005. γ-Protocadherins, presenilin-mediated release of C-terminal fragment promotes locus expression. J. Biol. Chem. 280, 15 888–15 897.
Hasegawa, S., Hamada, S., Kumode, Y. et al. 2008. The protocadherin-α family is involved in axonal coalescence of olfactory sensory neurons into glomeruli of the olfactory bulb in mouse. Mol. Cell. Neurosci. in press.
Hirayama, T. & Yagi, T. 2006. The role and expression of the protocadherin-alpha clusters in the CNS. Curr. Opin. Neurobiol. 16, 336–342.
Jossin, Y., Ignatova, N., Hiesberger, T., Herz, J., Lambert De Rouvroit, C. & Goffinet, A. M. 2004. The central fragment of Reelin, generated by proteolytic processing in vivo, is critical to its function during cortical plate development. J. Neurosci. 24, 514–521.
Kaneko, R., Kato, H., Kawamura, Y. et al. 2006. Allelic gene regulation of Pcdh-alpha and Pcdh-gamma clusters involving both monoallelic and biallelic expression in single Purkinje cells. J. Biol. Chem. 281, 30 551–30 560.
Kawaguchi, M., Toyama, T., Kaneko, R., Hirayama, T., Kawamura, Y. & Yagi, T. 2008. Relationship between DNA methylation states and transcription of individual isoforms encoded by the protocadherin-alpha gene cluster. J. Biol. Chem. in press.
Kim, T. H., Abdullaev, Z. K., Smith, A. D. et al. 2007. Analysis of the vertebrate insulator protein CTCF-binding sites in the human genome. Cell 128, 1231–1245.
Kirov, G., Georgieva, L., Williams, N. et al. 2003. Variation in the protocadherin gamma A gene cluster. Genomics 82, 433–440.
Kohmura, N., Senzaki, K., Hamada, S. et al. 1998. Diversity revealed by a novel family of cadherins expressed in neurons at a synaptic complex. Neuron 20, 1137–1151.
Miki, R., Hattori, K., Taguchi, Y. et al. 2005. Identification and characterization of coding single-nucleotide polymorphisms within human protocadherin-alpha and -beta gene clusters. Gene 349, 1–14.
Morishita, H. & Yagi, T. 2007. Protocadherin family: diversity, structure, and function. Curr. Opin. Cell Biol. 19, 584–592.
Morishita, H., Kawaguchi, M., Murata, Y. et al. 2004a. Myelination triggers local loss of axonal CNR/protocadherin alpha family protein expression. Eur. J. Neurosci. 20, 2843–2847.
Morishita, H., Murata, Y., Esumi, S., Hamada, S. & Yagi, T. 2004b. CNR/Pcdhalpha family in subplate neurons, and developing cortical connectivity. Neuroreport 15, 2595–2599.
Morishita, H., Umitsu, M., Murata, Y. et al. 2006. Structure of the cadherin-related neuronal receptor/protocadherin-alpha first extracellular cadherin domain reveals diversity across cadherin families. J. Biol. Chem. 281, 33 650–33 663.
Murata, Y., Hamada, S., Morishita, H., Mutoh, T. & Yagi, T. 2004. Interaction with protocadherin-gamma regulates the cell surface expression of protocadherin-alpha. J. Biol. Chem. 279, 49 508–49 516.
Mutoh, T., Hamada, S., Senzaki, K., Murata, Y. & Yagi, T. 2004. Cadherin-related neuronal receptor 1 (CNR1) has cell adhesion activity with beta1 integrin mediated through the RGD site of CNR1. Exp. Cell Res. 294, 494–508.
Noonan, J. P., Li, J., Nguyen, L. et al. 2003. Extensive linkage disequilibrium, a common 16.7-kilobase deletion, and evidence of balancing selection in the human protocadherin alpha cluster. Am. J. Hum. Genet. 72, 621–635.
Noonan, J. P., Grimwood, J., Danke, J. et al. 2004a. Coelacanth genome sequence reveals the evolutionary history of vertebrate genes. Genome Res. 14, 2397–2405.
Noonan, J. P., Grimwood, J., Schmutz, J., Dickson, M. & Myers, R. M. 2004b. Gene conversion and the evolution of protocadherin gene cluster diversity. Genome Res. 14, 354–366.
Ohlsson, R. 2007. Genetics. Widespread monoallelic expression. Science 318, 1077–1078.
Patel, S. D., Chen, C. P., Bahna, F., Honig, B. & Shapiro, L. 2003. Cadherin-mediated cell-cell adhesion: sticking together as a family. Curr. Opin. Struct. Biol. 13, 690–698.
Phillips, G. R., Tanaka, H., Frank, M. et al. 2003. Gamma-protocadherins are targeted to subsets of synapses and intracellular organelles in neurons. J. Neurosci. 23, 5096–5104.


Poumbourios, P., Maerz, A. L. & Drummer, H. E. 2003. Functional evolution of the HIV-1 envelope 120 association site of glycoprotein 41. J. Biol. Chem. 278, 42 149–42 160.
Reiss, K., Maretzky, T., Haas, I. G. et al. 2006. Regulated ADAM10-dependent ectodomain shedding of gamma-protocadherin C3 modulates cell-cell adhesion. J. Biol. Chem. 281, 21 735–21 744.
Ribich, S., Tasic, B. & Maniatis, T. 2006. Identification of long-range regulatory elements in the protocadherin-alpha gene cluster. Proc. Natl Acad. Sci. USA 103, 19 719–19 724.
Ruoslahti, E. & Pierschbacher, M. D. 1987. New perspectives in cell adhesion: RGD and integrins. Science 238, 491–497.
Schmid, R. S., Jo, R., Shelton, S., Kreidberg, J. A. & Anton, E. S. 2005. Reelin, integrin and DAB1 interactions during embryonic cerebral cortical development. Cereb. Cortex 15, 1632–1636.
Schmucker, D., Clemens, J. C., Shu, H. et al. 2000. Drosophila Dscam is an axon guidance receptor exhibiting extraordinary molecular diversity. Cell 101, 671–684.
Senzaki, K., Ogawa, M. & Yagi, T. 1999. Proteins of the CNR family are multiple receptors for Reelin. Cell 99, 635–647.
Stein, R., Mori, N., Matthews, K., Lo, L. C. & Anderson, D. J. 1988. The NGF-inducible SCG10 mRNA encodes a novel membrane-bound protein present in growth cones and abundant in developing neurons. Neuron 1, 463–476.
Sugino, H., Hamada, S., Yasuda, R. et al. 2000. Genomic organization of the family of CNR cadherin genes in mice and humans. Genomics 63, 75–87.
Sugino, H., Yanase, H., Hamada, S. et al. 2004. Distinct genomic sequence of the CNR/Pcdhalpha genes in chicken. Biochem. Biophys. Res. Commun. 316, 437–445.
Tada, M. N., Senzaki, K., Tai, Y. et al. 2004. Genomic organization and transcripts of the zebrafish Protocadherin genes. Gene 340, 197–211.
Taguchi, Y., Koide, T., Shiroishi, T. & Yagi, T. 2005. Molecular evolution of cadherin-related neuronal receptor/protocadherin(alpha) (CNR/Pcdh(alpha)) gene cluster in Mus musculus subspecies. Mol. Biol. Evol. 22, 1433–1443.
Takei, Y., Hamada, S., Senzaki, K., Mutoh, T., Sugino, H. & Yagi, T. 2001. Two novel CNRs from the CNR gene cluster have molecular features distinct from those of CNR1 to 8. Genomics 72, 321–330.
Takeichi, M. 1988. The cadherins: cell-cell adhesion molecules controlling animal morphogenesis. Development 102, 639–655.
Tasic, B., Nabholz, C. E., Baldwin, K. K. et al. 2002. Promoter choice determines splice site selection in protocadherin alpha and gamma pre-mRNA splicing. Mol. Cell 10, 21–33.
Triana-Baltzer, G. B. & Blank, M. 2006. Cytoplasmic domain of protocadherin-alpha enhances homophilic interactions and recognizes cytoskeletal elements. J. Neurobiol. 66, 393–407.
Umitsu, M., Morishita, H., Murata, Y. et al. 2005. 1H, 13C and 15N resonance assignments of the first cadherin domain of Cadherin-related neuronal receptor (CNR)/protocadherin alpha. J. Biomol. NMR 31, 365–366.
Wang, X., Su, H. & Bradley, A. 2002a. Molecular mechanisms governing Pcdh-gamma gene expression: evidence for a multiple promoter and cis- model. Genes Dev. 16, 1890–1905.
Wang, X., Weiner, J. A., Levi, S., Craig, A. M., Bradley, A. & Sanes, J. R. 2002b. Gamma protocadherins are required for survival of spinal interneurons. Neuron 36, 843–854.
Weiner, J. A., Wang, X., Tapia, J. C. & Sanes, J. R. 2005. Gamma protocadherins are required for synaptic development in the spinal cord. Proc. Natl Acad. Sci. USA 102, 8–14.
Wu, Q. 2005. Comparative genomics and diversifying selection of the clustered vertebrate protocadherin genes. Genetics 169, 2179–2188.
Wu, Q. & Maniatis, T. 1999. A striking organization of a large family of human neural cadherin-like cell adhesion genes. Cell 97, 779–790.
Xie, X., Mikkelsen, T. S., Gnirke, A., Lindblad-Toh, K., Kellis, M. & Lander, E. S. 2007. Systematic discovery of regulatory motifs in conserved regions of the human genome, including thousands of CTCF insulator sites. Proc. Natl Acad. Sci. USA 104, 7145–7150.
Yagi, T. & Takeichi, M. 2000. Cadherin superfamily genes: functions, genomic organization, and neurologic diversity. Genes Dev. 14, 1169–1180.
Yanase, H., Sugino, H. & Yagi, T. 2004. Genomic sequence and organization of the family of CNR/Pcdhalpha genes in rat. Genomics 83, 717–726.
Yu, W. P., Yew, K., Rajasegaran, V. & Venkatesh, B. 2007. Sequencing and comparative analysis of fugu protocadherin clusters reveal diversity of protocadherin genes among teleosts. BMC Evol. Biol. 7, 49.
Zipursky, S. L., Wojtowicz, W. M. & Hattori, D. 2006. Got diversity? Wiring the fly brain with Dscam. Trends Biochem. Sci. 31, 581–588.
Zou, C., Huang, W., Ying, G. & Wu, Q. 2007. Sequence analysis and expression mapping of the rat clustered protocadherin gene repertoires. Neuroscience 144, 579–603.
