sDscam isoforms combine homophilic specificities to define unique cell recognition

Fengyan Zhoua, Guozheng Caoa, Songjun Daia, Guo Lia, Hao Lia, Zhu Dinga, Shouqing Houa, Bingbing Xua, Wendong Youb, Gil Wiseglassc, Feng Shia, Xiaofeng Yangb, Rotem Rubinsteinc, and Yongfeng Jina,b,1

aMOE Laboratory of Biosystems Homeostasis & Protection, Innovation Center for Cell Signaling Network, College of Life Sciences, Zhejiang University, ZJ310058 Hangzhou, Zhejiang, China; bDepartment of Neurosurgery, First Affiliated Hospital, School of Medicine, Zhejiang University, ZJ310058 Hangzhou, Zhejiang, China; and cSchool of Neurobiology, Biochemistry and Biophysics, Sagol School of Neuroscience, George S. Wise Faculty of Life Science, Tel Aviv University, 69978 Ramat Aviv, Israel

Edited by Barry Honig, Howard Hughes Medical Institute, Columbia University, New York, NY, and approved August 20, 2020 (received for review December 15, 2019)

Thousands of Down syndrome cell adhesion molecule (Dscam1) isoforms and ∼60 clustered protocadherin (cPcdh) proteins are required for establishing neural circuits in insects and vertebrates, respectively. The strict homophilic specificity exhibited by these proteins has been extensively studied and is thought to be critical for their function in neuronal self-avoidance. In contrast, significantly less is known about the Dscam1-related family of ∼100 shortened Dscam (sDscam) proteins in Chelicerata. We report that Chelicerata sDscamα and some sDscamβ protein trans interactions are strictly homophilic, and that the trans interaction is mediated via the first Ig domain through an antiparallel interface. Additionally, different sDscam isoforms interact promiscuously in cis via membrane proximate fibronectin-type III domains. We report that cell–cell interactions depend on the combined identity of all sDscam isoforms expressed. A single mismatched sDscam isoform can interfere with the interactions of cells that otherwise express an identical set of isoforms. Thus, our data support a model by which sDscam association in cis and trans generates a vast repertoire of combinatorial homophilic recognition specificities. We propose that in Chelicerata, sDscam combinatorial specificity is sufficient to provide each neuron with a unique identity for self–nonself discrimination. Surprisingly, while sDscams are related to Drosophila Dscam1, our results mirror the findings reported for the structurally unrelated vertebrate cPcdh. Thus, our findings suggest a remarkable example of convergent evolution for the process of neuronal self-avoidance and provide insight into the basic principles and evolution of metazoan self-avoidance and self–nonself discrimination.

Down syndrome cell adhesion molecule | homophilic binding | combinatorial specificity | self-recognition | Chelicerata

Significance

Neuronal self-avoidance is a conserved process in vertebrates and invertebrates. In Drosophila, self-avoidance is mediated by the Down syndrome cell adhesion molecule (Dscam1) gene that encodes tens of thousands of proteins through alternative splicing. In vertebrates, an analogous function is performed by ∼60 clustered protocadherins (cPcdh) through promoter choice. Here we use cell aggregation assays to study the binding preferences of ∼100 sDscam proteins in Chelicerata. We report that while related in sequence to the fly Dscam, the scorpion sDscam adopts a strategy that is similar to that of vertebrate cPcdhs, of combined specificity when coexpressed. Our findings identify sDscams as likely candidates to mediate neuronal self-avoidance in Chelicerata, as well as provide a remarkable example of convergent evolution.

Author contributions: Y.J. conceived this project; F.Z., G.C., S.D., G.L., H.L., B.X., and Y.J. designed research; F.Z., G.C., S.D., G.L., and H.L. performed research; F.Z., Z.D., and S.H. performed homology modeling and protein–protein docking; F.Z., Z.D., S.H., B.X., W.Y., G.W., F.S., X.Y., R.R., and Y.J. analyzed data; and F.Z., R.R., and Y.J. wrote the paper.
The authors declare no competing interest.
This article is a PNAS Direct Submission.
Published under the PNAS license.
1To whom correspondence may be addressed. Email: [email protected].
This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.1921983117/-/DCSupplemental.

Patterning of the developing brain is critically affected by the precision of selective recognition and the strength of the interactions between cell adhesion receptors (1, 2). Two large cell adhesion receptor families, Down syndrome cell adhesion molecule (Dscam1) of the immunoglobulin superfamily and clustered protocadherins (cPcdhs) of the cadherin superfamily, play a central role in neural circuit assembly in insects and vertebrates, respectively. These proteins mediate highly selective homophilic interactions and generate a unique molecular identity at the surface of individual neurons, thereby enabling them to distinguish self from nonself and ultimately to self-avoid. Genetic studies using fly and mouse neurons have described a remarkably similar molecular strategy of self-avoidance (3–12). Homophilic interactions between identical repertoires of Dscam/cPcdh proteins on the surface of the same neuron lead to self-recognition and result in neurite repulsion. In contrast, contact by two arbors from distinct neurons, with differing isoform compositions, does not result in homophilic binding and does not trigger an avoidance mechanism.

In Drosophila, neuronal self-avoidance is mediated by stochastic alternative splicing of a single gene, Dscam1, that can encode as many as 19,008 isoforms with distinct ectodomains (13–16). Different isoforms share the same domain organization with 10 Ig domains, 6 fibronectin type III (FNIII) domains, a single transmembrane (TM) region, and a cytoplasmic domain, but differ in the primary sequences of at least 1 of 3 Ig domains. Individual neuronal identities are determined by the stochastic expression of a small set of 10 to 50 distinct Dscam1 isoforms of the tens of thousands of possible isoforms that can be generated via alternative splicing (8, 14, 17, 18). In contrast to insect Dscam1, vertebrate Dscam genes do not produce extensive isoform diversity (19).

In vertebrates, a different set of cell surface adhesion receptors, the cPcdhs, performs an analogous function (20–23). In human and mouse, 53 and 58 cPcdh proteins, respectively, are encoded by three tandemly arranged gene clusters of Pcdhα, Pcdhβ, and Pcdhγ (24, 25). Single neuronal surface identity is achieved by a combination of stochastic promoter selection and alternative splicing (26–28). In addition to engaging in trans (cell-to-cell) through strict homophilic interactions (29, 30), cPcdhs also exhibit an additional independent cis (same cell) interaction that is isoform promiscuous. It is surprising that fewer than 60 proteins are able to mediate the process of neuronal self-avoidance in the complex mammalian brain, as opposed to thousands of isoforms required for an analogous function in Drosophila. Studies using cell aggregation assays have found a possible explanation for this challenge. Specifically, in

these assays recognition of cells that express multiple distinct cPcdh isoforms was observed to be dependent on the combined identity of all expressed isoforms (29, 30). That is, two cells that express a mismatched isoform will not bind to each other even if all other expressed cPcdh isoforms are identical.

The Chelicerata subphylum is a basal branch of arthropods that includes arachnids, such as spiders and scorpions, with relatively complex brains that are similar in magnitude to that of the Drosophila brain (31). In contrast to Drosophila Dscam1, Chelicerata Dscam genes do not generate highly diverse proteins, and Chelicerata do not have cPcdh genes. Recently, we discovered a “hybrid” gene family in the subphylum Chelicerata that is particularly relevant to the remarkable functional convergence of Drosophila Dscam1 and vertebrate cPcdhs. This gene family is composed of Dscam-related genes with tandemly arrayed 5′ cassettes, which encode ∼50 to 100 isoforms each with alternative promoters, with the number of isoforms varying across Chelicerata species (32, 33). Although these Chelicerata Dscams are evolutionarily related to Drosophila Dscam1, they only have 6 extracellular domains (3 Ig and 3 FNIII domains), making them much shorter compared to the 16 domains of Drosophila Dscam1. We therefore refer to this type of Dscam as shortened Dscam (sDscam) to distinguish it from classic Dscam (33). Based on their different variable 5′ cassettes that encode a single or two Ig domains, these sDscams can be subdivided into sDscamα and sDscamβ subfamilies, respectively. Thus, all sDscam isoforms share the same domain organization with different amino acid primary sequences of at least the N-terminal Ig domains.

Interestingly, the 5′ variable regions of Chelicerata sDscams exhibit a remarkable organizational resemblance to those of vertebrate-clustered Pcdhs (32–34). Similar to Drosophila Dscam1 and vertebrate Pcdhs, Chelicerata sDscams are abundantly expressed in the nervous system and their expression is controlled by promoter choice (32, 33). Because Chelicerata sDscams are closely related to Drosophila Dscam1, and exhibit a striking organizational resemblance to the vertebrate-clustered Pcdhs, with the latter two proteins both capable of mediating self-recognition and self-avoidance, we speculate that these sDscam isoforms play analogous roles in Chelicerata species. Therefore, it is essential to perform a systematic examination of the homophilic recognition specificities of these clustered sDscam isoforms to clarify their potential roles in specifying single-cell identities and neural circuit assembly.

In this study, we demonstrate that all tested sDscamαs and some sDscamβs engage in highly specific homophilic interactions via antiparallel self-binding of the variable Ig1 domain. Moreover, we provide compelling evidence that sDscam isoforms associate promiscuously in cis, which is mediated by the constant FNIII1–3. Remarkably, using a cell aggregation assay we found that, as is the case for the cPcdh, cell–cell recognition depends on the combined identity of all isoforms expressed. We propose that these sDscams are sufficient to provide the unique single-cell identity necessary for neuronal self–nonself discrimination. Interestingly, in many respects Chelicerata sDscams exhibit more parallels with the genetically unrelated vertebrate Pcdhs than with the closely related fly Dscam1. Thus, our findings provide mechanistic and evolutionary insight into self–nonself discrimination in metazoans and enhance our understanding of the general biological principles required for endowing cells with distinct molecular identities.

Results

Cluster-wide Analysis of sDscam Homophilic Interactions. The Mesobuthus martensii sDscam gene clusters encode 95 diverse cell adhesion proteins that consist of 40 clustered sDscamα and 55 sDscamβ isoforms. The sDscamβ family is further divided into six additional clusters (β1–β6) with the following distribution: 13 β1, 8 β2, 13 β3, 9 β4, 10 β5, and 2 β6 (Fig. 1A and SI Appendix, Table S1) (32, 33). To investigate whether sDscam isoforms mediate homophilic binding, we expressed the sDscam proteins in Sf9 cells using an insect baculovirus expression system (Fig. 1B). This system is a powerful tool for investigating homophilic interactions between expressed cell surface adhesion molecules (35). An analogous approach, using different cells, was used in studies of trans binding properties of mouse cPcdh and Drosophila Dscam (7, 29, 30). Sf9 cells that expressed constructs encoding sDscamβ6v2, either full-length (β6v2FL-mCherry) or lacking the cytoplasmic domain (β6v2Δcyto-mCherry), exhibited strong aggregation (SI Appendix, Fig. S1A). This finding indicates that the homophilic interaction is mediated by sDscamβ6v2 in trans independent of the cytoplasmic region. We therefore used Δcyto constructs for all sDscam proteins in the cell aggregation assay.

We performed a systematic analysis of the homophilic interactions for 86 of the 95 sDscam proteins (34 of 40 sDscamα, and 52 of 55 sDscamβ1–β6), as 9 sDscam cDNAs failed to be cloned (SI Appendix, Table S1). We found that all of the 34 sDscamαs, which were individually expressed, formed homophilic aggregates (Fig. 1 C and D). In contrast, only 15 of the 52 sDscamβ isoforms tested formed homophilic aggregates; 8 of these belonged to the β5 cluster (SI Appendix, Fig. S1B and Table S1). The size of the cell aggregates varied markedly across the sDscam subfamilies and the individual isoforms according to a quantitative assay (Fig. 1D and SI Appendix, Fig. S1C). Cells expressing sDscamα isoforms exhibited extensive aggregation for all isoforms tested (Fig. 1D, rows 1 to 3, and SI Appendix, Fig. S1B). Similarly, the 2 β6 cluster isoforms and 8 of the 10 β5 cluster isoforms formed homophilic aggregates (Fig. 1D, rows 8 and 9, and SI Appendix, Fig. S1B). In contrast, none of the sDscamβ1 or -β4 cluster isoforms, nor the majority of isoforms of the sDscamβ2 and -β3 clusters, formed any homophilic aggregates (Fig. 1D, rows 4 to 7). We note that aggregates of the cells expressing sDscamβ isoforms were largely smaller than those expressing sDscamαs (Fig. 1D and SI Appendix, Fig. S1 B and C). Additionally, in some cases, cells expressing individual sDscam isoforms, belonging to the same cluster, which differed only in the N-terminal variable region, exhibited markedly different cell aggregation behavior. For example, only three of the eight tested isoforms within the sDscamβ2 subfamily formed homophilic aggregates, while the other five members did not (Fig. 1D). Isoforms from the sDscamβ3 and β5 clusters exhibited a similar within-cluster discrepancy in their cell aggregation behavior (Fig. 1D, rows 6 and 8).

This discrepancy in aggregation activity between sDscam isoforms was likely due to differences in expression, membrane localization, or intrinsic trans-binding affinities of the individual isoforms (30). For example, failure to form cell aggregates in mammalian Pcdhα isoforms is reportedly due to a lack of membrane localization (30, 36, 37). However, immunostaining revealed that both sDscamβ4v3, which does not mediate cell aggregation, and sDscamα14, which engages in homophilic interactions, were detected on the surface of Sf9 cells (SI Appendix, Fig. S1D). Moreover, sDscamβ4v1, which does not mediate cell aggregation, was expressed at a level similar to or higher than that of isoforms that mediate cell aggregation (SI Appendix, Fig. S1E). In general, we did not observe a correlation between isoform expression level and cell aggregation outcome among individual isoforms (SI Appendix, Fig. S1E). Further truncation and domain-swapping experiments indicate that the first two N-terminal Ig domains are essential for homophilic trans binding (SI Appendix, Figs. S2 and S3). In particular, we note that at least some sDscamβ isoforms are able to interact homophilically via their membrane distal Ig domains, yet their membrane proximate FNIII domains and the TM region inhibit homophilic binding (SI Appendix, Figs. S2 C–G and S3A).

sDscams Exhibit Highly Specific Isoform Binding. A striking and functionally crucial property of Drosophila Dscam and vertebrate cPcdh isoforms is the strict homophilic specificity recognition in trans. To analyze the specificity of trans interactions between different sDscams, we assessed cell aggregates formed via mixing


Fig. 1. Cluster-wide analysis of sDscam-mediated homophilic binding in M. martensii. (A) Overview of the M. martensii sDscam gene clusters. Variable exons (colored) are joined via cis-splicing to the constant exons (black) in sDscamα (Left) and sDscamβ1–β6 (Right) subfamilies. Each variable cassette of sDscamα encodes the Ig1 domain, while that of sDscamβ encodes the Ig1–2 domains. The constant exons of sDscamα and sDscamβ encode the Ig2–3 or Ig3 domains, respectively, followed by the FNIII1–3 domains and the TM and cytoplasmic domains. (B) Schematic diagram of the cell aggregation assay. mCherry-tagged sDscam proteins are expressed in Sf9 cells to test their ability to form cell aggregates. As shown in the diagram, cells expressing some sDscam-mCherry proteins do not aggregate, like the mCherry negative control, while strong cell aggregates were observed with cells expressing other sDscam-mCherry proteins, like the Dscam1-mCherry positive control. (C) A summary of the homophilic binding properties, with the evolutionary relationship among distinct sDscam subfamilies shown on the left. (D) The outcome of cell aggregation mediated by 86 sDscam isoforms when assayed individually. mCherry and a fly Dscam1 isoform were used as negative and positive controls, respectively. See also SI Appendix, Fig. S1 B and C. (Scale bar, 100 μm.)
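The isoform counts quoted in the Results and summarized in Fig. 1 can be cross-checked with a short tally. The sketch below uses only the numbers stated in the text; the dictionary names and the `summarize` helper are illustrative and are not part of the authors' analysis.

```python
# Isoform counts per cluster, as stated in the text (Fig. 1A).
ENCODED = {
    "alpha": 40,
    "beta1": 13, "beta2": 8, "beta3": 13,
    "beta4": 9, "beta5": 10, "beta6": 2,
}

# Isoforms tested individually and isoforms that formed homophilic
# aggregates, grouped by subfamily, as stated in the text.
TESTED = {"alpha": 34, "beta": 52}
AGGREGATING = {"alpha": 34, "beta": 15}


def summarize():
    """Derive the totals implied by the per-cluster numbers."""
    total_encoded = sum(ENCODED.values())
    total_tested = sum(TESTED.values())
    return {
        "total_encoded": total_encoded,
        "beta_encoded": total_encoded - ENCODED["alpha"],
        "total_tested": total_tested,
        "not_cloned": total_encoded - total_tested,
        "total_aggregating": sum(AGGREGATING.values()),
        "beta_fraction_aggregating": AGGREGATING["beta"] / TESTED["beta"],
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(key, value)
```

The derived totals agree with the text: 95 encoded isoforms (55 of them sDscamβ), 86 tested, and 9 not cloned.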

of two fluorescently labeled cell populations (Fig. 2A). Each sDscam was expressed with mCherry or an enhanced green fluorescent protein (EGFP) fused to the C terminus and assayed for binding specificity (Fig. 2 C–G). First, we investigated the specificity of the interaction between sDscamα isoforms, which differed only by their Ig1 domain at the N terminus. To determine the stringency of recognition specificity, we generated pairwise sequence identity heat maps of the variable Ig1 domains (Fig. 2B). Using these heat maps, we identified sDscam pairs that had the highest pairwise sequence identity within their Ig1 domains. We tested 14 of the sDscam pairs with the highest pairwise sequence identity (>87% identity) (Fig. 2B) along with 21 more distantly related sDscam pairs (Fig. 2C). In total, we tested 35 unique pairs of sDscams with the sequence identity for nonself pairs ranging from 50 to 97% in the Ig1 domains. Despite the high sequence identity for many of the pairs, all but a single pair of sDscamα isoforms exhibited exclusively homophilic interaction specificity with no observed heterophilic interactions (Fig. 2 C–G and SI Appendix, Fig. S4A). Only self-pairs on the matrix diagonals exhibited intermixing of red and green cell aggregates, while (with one exception) all nonself pairs formed separate, noninteracting homophilic cell aggregates (Fig. 2 C–G and SI Appendix, Fig. S4A). Heterophilic binding was only detected for one pair of isoforms, sDscamα20 and sDscamα36, which are closely related with variable Ig domain amino acid sequences that are ∼97% identical (Fig. 2G). Similar results have been obtained for reciprocal binding pairs. These data indicate that similar to Drosophila Dscam and vertebrate cPcdh, the sDscamα isoforms exhibit a strict homophilic specificity. Additionally, these data indicate that the first variable Ig domain of sDscamα determines binding specificity.

We then investigated the specificity of the interactions between sDscamβ isoforms, which differ in their Ig1–2 domains at the N terminus. We tested pairwise combinations of isoforms that belong to the sDscamβ5 cluster and pairwise combinations of isoforms that belong to the sDscamα and -β5/β6 clusters. All of the sDscamβ5/β5, -α/β5, and -α/β6 pairs tested bound strictly homophilically (SI Appendix, Fig. S4 C and D). Taken together with the results of sDscamα analyses (Fig. 2), these observations demonstrate that sDscamα and sDscamβ5/β6 isoforms exhibit strict homophilic trans binding.

Domain Shuffling Identifies Variable Ig1 as the Specificity-Determining Domain. The sDscamα isoforms differ only in their first Ig domain, which indicates that this domain contributes to the trans binding specificity observed in the cell aggregation assays. In contrast to sDscamαs, all sDscamβ isoforms contain two variable Ig domains at the N terminus. To identify the domains responsible for the specificity of sDscamβ isoform trans interactions, we constructed a series of Ig-domain swapping chimeras. The sDscamβ5v4 Ig1 domain was replaced with the -β5v10 Ig1 domain (Fig. 3A); the two isoforms share 46.7% pairwise sequence identity within their Ig1 domains. As a result, this chimeric construct no longer interacted with its parent sDscamβ5v4; however, it continued to interact with sDscamβ5v10, with which it shares only the Ig1 domain (Fig. 3 A, ii and iv, and SI Appendix, Fig. S5A). In contrast, a chimeric construct encoding sDscamβ5v4 with its Ig2 domain replaced by that of sDscamβ5v10 continued to interact with its parent sDscamβ5v4, but not with sDscamβ5v10 (Fig. 3 A, i and iii). Identical results were obtained by Ig domain swapping between sDscamβ5v8 and -β5v10 (Fig. 3 A, v–viii), and sDscamβ5v5 and -β5v10 (SI Appendix, Fig. S5B).

Isoforms belonging to different clusters differ in their entire sequence (i.e., not only in the alternatively spliced exons). Nevertheless, similar to the results above, chimeric constructs that replaced the Ig1 domain between two parent isoforms from distinct clusters only coaggregate with the parent isoform that has an identical Ig1 domain, despite differences in the constant region (Fig. 3 B and C). We tested the chimeras generated by replacing the first Ig domain of isoforms from the β6 and β5 clusters (Fig. 3B), as well as those from the β3 and α clusters (Fig. 3C). Overall, these domain-swapping experiments suggested that the first Ig domain is the primary determinant of trans interaction specificity (Fig. 3D), and that the constant regions of sDscam isoforms are not involved in defining binding specificity.

sDscams Interact in Trans via Antiparallel Ig1 Self-Binding. To gain insight into how the variable Ig1 domain mediates specific homophilic binding, we carried out homology modeling studies to generate homodimeric complexes of sDscamα Ig1. We used the dimeric structure of the Drosophila Dscam variable Ig7 domain as a template, as previous studies have found that the Ig1 domain of sDscam is homologous to the variable Ig7 domain of Drosophila Dscam1 (33) (SI Appendix, Figs. S6 and S7 A and B). The Ig7 domain of Drosophila Dscam1 is one of three domains (together with the second and third Ig domains) that determine trans binding specificity. We hypothesized that because of the evolutionary relationship, the Ig1 domain of sDscam may maintain the overall molecular mechanism of the Ig7 trans interactions shown in the homodimeric structure of Drosophila Dscam1. Using the SWISS-MODEL program with the crystal structure of Drosophila Dscam1 Ig7.5 (PDB ID code 4WVR, 1.95 Å) (38, 39) as a template, we built an Ig1 homodimeric model of all sDscamα isoforms. These structural models suggested that the Ig1 domain of sDscamα interacts in an antiparallel orientation via residues on the ABDE β-strands (Fig. 4A and SI Appendix, Fig. S8A). Next, we tested these models using mutations designed to disrupt homophilic interactions and binding specificity.

Based on the structural models of two isoforms, sDscamα30 and sDscamα39, we predicted the formation of two salt bridges at the interface. One salt bridge was formed by lysine residue 5 (K5, sDscamα Ig1 numbering) in the A strand and glutamic acid residue 19 (E19) in the B strand, and a second was formed by E17 in the AB loop and K26 in the B strand (Fig. 4A and SI Appendix, Fig. S8A). To experimentally test this binding model, we introduced single mutations that swapped the charges of the predicted salt-bridge residues (K5E, E19K, E17K, and K26E) and assessed the ability of the mutants to mediate cell aggregation. As a result, sDscamα39 mutants did not aggregate homophilically and sDscamα30 mutants exhibited smaller aggregates (Fig. 4B and SI Appendix, Fig. S8 A and B). We note that the mutant protein expression levels and protein stability did not significantly differ from those of wild-type proteins (SI Appendix, Fig. S8 C and D). Although we did not directly investigate the effect that Ig1 domain point mutations have on cellular localization, a series of ΔIg1 mutants were all cell surface-located (SI Appendix, Fig. S1 D, i–vi).

We next identified Ig1 specificity-determining residues. Candidate residues were selected if they were located at the sDscamα trans dimer model interface and differed between closely related sDscamα isoforms. We tested specificity-determining residues by swapping these residues between closely related isoforms and analyzed their recognition preferences in our Sf9 cell aggregation assay (SI Appendix, Fig. S9 A and B). For example, the closely related sDscamα20, -α30, and -α36 isoforms had a greater than 91% sequence identity within their Ig1 domain. As shown above, sDscamα20 and sDscamα36 exhibited both homophilic and heterophilic recognition of each other but did not recognize sDscamα30. Only four residues are identical between sDscamα20 and -α36 but differ in the sDscamα30 sequence (1S/P, 15S/N, 21V/I, and 22I/T) (Fig. 4 C, i); however, only residue 22I/T is found at the interface in our structural trans dimer model. Swapping residue 22 between sDscamα36 and sDscamα30 resulted in switching their binding preferences (Fig. 4 C, ii and iii). Mutated


Fig. 2. sDscam isoforms engaged in highly specific homophilic interactions. (A) Schematic diagram of the binding specificity assay. Cells expressing mCherry- or EGFP-tagged sDscam isoforms were mixed and assayed for homophilic or heterophilic binding. The outcome of cell aggregation included red-green cell segregation and red-green cell coaggregation. (B) Heat map of pairwise amino acid sequence identities of the Ig1 domains of sDscamα isoforms and their evolutionary relationship. Subsets of the isoforms marked by an asterisk (*) and within the boxed region were assayed in C–G. See also SI Appendix, Fig. S4A. (C) Pairwise combinations within representative sDscamαs were assayed for their binding specificity. (D–F) sDscamα isoforms with sequence identity for nonself pairs ranging from 87 to 94% in their Ig domains display strict trans homophilic specificity. Mean coaggregation (CoAg) indices were quantified and illustrated as a heat map. (G) Pairwise combinations within sDscamα20, -30, -36 pairs were assayed for their binding specificity. sDscamα36 exhibited strong heterophilic binding to sDscamα20, but not to sDscamα30. (Scale bars, 100 μm.)
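The pairwise identity heat map in Fig. 2B rests on a simple calculation that can be sketched as follows. The source does not state how identity was computed, so the gap handling here (columns where both sequences have a gap are ignored, and a gap opposite a residue counts as a mismatch) is an assumption, and the three toy sequences and the function names are hypothetical.

```python
from itertools import combinations


def percent_identity(seq_a, seq_b):
    """Percent identity between two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" and b == "-":
            continue  # assumption: ignore columns that are gaps in both
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_matrix(aligned):
    """Map each unordered pair of names to its percent identity."""
    return {
        (n1, n2): percent_identity(aligned[n1], aligned[n2])
        for n1, n2 in combinations(sorted(aligned), 2)
    }


def pairs_above(matrix, threshold):
    """Return pairs whose identity exceeds the threshold, highest first."""
    hits = [(pair, pid) for pair, pid in matrix.items() if pid > threshold]
    return [pair for pair, _ in sorted(hits, key=lambda item: -item[1])]


if __name__ == "__main__":
    # Hypothetical toy sequences, not real sDscam Ig1 domains.
    toy = {"isoA": "ACDEFGHIKL", "isoB": "ACDEFGHIKV", "isoC": "MCDQFGHWKL"}
    matrix = identity_matrix(toy)
    print(matrix)
    print(pairs_above(matrix, 87.0))
```

A threshold such as 87% then selects the most closely related pairs, which are the most stringent test cases for homophilic specificity.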

Fig. 3. sDscam trans-binding specificity is largely dependent on the N-terminal Ig1 domain. (A) Domain-shuffled chimeras of sDscamβ5 isoforms and their parental counterparts were assayed for binding specificity. Chimeras in which the Ig1 domain was replaced with the corresponding domain of another isoform from the same cluster swapped trans-binding specificity, whereas chimeras in which the Ig2 domain was replaced did not. See also SI Appendix, Fig. S5. (B) Swapped specificity was observed in sDscamβ5v10 and sDscamβ6v1 chimeras. These chimeras were constructed by replacing either the Ig1–2 or single Ig1/Ig2 domains with the corresponding domains of isoforms from a different cluster. See also SI Appendix, Fig. S5A. (C) Domain-shuffled chimeras between sDscamα and sDscamβ3 and their parental counterparts were assayed for binding specificity. See also SI Appendix, Fig. S5A. (D) Schematic representation of domain-shuffled sDscam chimeras with a summary of the results of their binding specificity assays. These data indicated that the presence of a single common Ig1 domain might be essential and sufficient to confer coaggregation between sDscam isoforms. (Scale bars, 100 μm.)
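The logic of the domain-shuffling experiments can be expressed as a small model. The sketch below treats each construct as a tuple of domain labels and applies the working rule suggested by Fig. 3D, that two constructs coaggregate when they share an identical Ig1 domain. This rule is a simplification for illustration: it ignores isoforms that do not aggregate at all, and the `Construct` type, the `swap_domain` helper, and the domain labels are hypothetical names rather than the authors' notation.

```python
from typing import NamedTuple


class Construct(NamedTuple):
    name: str
    ig1: str       # label of the variable Ig1 domain
    ig2: str       # label of the Ig2 domain
    constant: str  # label of the constant region


def swap_domain(recipient, donor, domain):
    """Return a chimera of `recipient` carrying one domain from `donor`."""
    label = f"{recipient.name}[{domain}<-{donor.name}]"
    return recipient._replace(name=label, **{domain: getattr(donor, domain)})


def predicted_coaggregation(a, b):
    """Working rule from Fig. 3D: a shared Ig1 domain confers coaggregation."""
    return a.ig1 == b.ig1


b5v4 = Construct("b5v4", ig1="Ig1.b5v4", ig2="Ig2.b5v4", constant="b5")
b5v10 = Construct("b5v10", ig1="Ig1.b5v10", ig2="Ig2.b5v10", constant="b5")

ig1_chimera = swap_domain(b5v4, b5v10, "ig1")
ig2_chimera = swap_domain(b5v4, b5v10, "ig2")

if __name__ == "__main__":
    for chimera in (ig1_chimera, ig2_chimera):
        for parent in (b5v4, b5v10):
            print(chimera.name, parent.name,
                  predicted_coaggregation(chimera, parent))
```

Under this rule the Ig1 chimera pairs with sDscamβ5v10 but not with its parent sDscamβ5v4, and the Ig2 chimera does the opposite, which matches the outcomes reported for Fig. 3A.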

I22T sDscamα36 no longer binds to sDscamα36 or sDscamα20 but does bind to sDscamα30. Similarly, mutated T22I sDscamα30 binds to sDscamα36 and sDscamα20 but not to its parent isoform, sDscamα30 (Fig. 4 C, ii and iii). In contrast, mutating the remaining three residues (1S/P, 15S/N, and 21V/I) did not result in a change in binding specificity, thereby strengthening the validity of our postulated trans binding interface model.

Importantly, we identified the equivalent position of residue 22 as a specificity-determining candidate residue in a different group of three closely related sDscamα isoforms: sDscamα11, -13, and -15. Swapping residue 22 between sDscamα11 and -α15, or between sDscamα13 and -α15, produced a novel homophilic binding specificity without swapping their binding specificity (SI Appendix, Fig. S9C). The finding that the same structural position determines binding specificity in distantly related sDscamα isoforms suggests that this region likely contributes to binding specificity in additional sDscamα isoforms. These observations also suggest that additional residues at the Ig1/Ig1 interface contribute to binding specificity. Using a similar approach, we analyzed the sequences of other closely related isoforms and identified additional candidate specificity-determining residues located at the putative interface (SI Appendix, Fig. S9 D–G). We found that shuffling residue 52 between sDscamα11 and -α13 and between sDscamα23 and -α27 altered, but did not swap, binding specificity (SI Appendix, Fig. S9E). In addition, we found that swapping residues 5, 10, and 56 between sDscamα21 and -α37 and residues 6, 19, and 52 between sDscamα23 and -α27 swapped binding specificity (Fig. 4D and SI Appendix, Fig. S9F). The key residues identified in the above experiments were frequently located at the ABED strands. Overall, the binding preferences of the designed mutations support the structural model based on the Drosophila Dscam Ig7 homophilic interface.

sDscams Form Cis-Multimers Independently of Trans Interactions. Many cell adhesion and signaling receptors form stable homo- and hetero-oligomers, which are important for trans interactions. Studies of classic cadherins have shown that cooperation between cis and trans interactions is crucial for ordered junction formation (40). The evolutionarily related cPcdhs have been shown to form stable homo and hetero cis dimers independent of trans interactions, and these dimers are thought to be critical in forming large oligomeric (zipper) assemblies and in recognition specificity (41, 42). To test the ability of sDscam isoforms to form cis interactions, lysates from Sf9 cells cotransfected with sDscam-HA and sDscam-Myc were coimmunoprecipitated (co-IP) using an HA antibody and detected by Western blot analysis with Myc antibody (Fig. 5A). Because our data indicate that deletion of the cytoplasmic domain did not significantly affect sDscam interactions (SI Appendix, Fig. S10 A and B), we used the Δcyto constructs to study sDscam cis interactions. When HA-sDscamβ6v2 and Myc-sDscamβ6v2, -β4v1, or -β4v2 were coexpressed by coinfection with individual recombinant viruses, Myc-tagged proteins strongly coimmunoprecipitated with HA-β6v2 (Fig. 5B). Similar results were obtained for co-IP experiments with isoforms from different clusters, indicating that sDscam proteins interact with each other and exhibit no specificity between isoforms of the same or different clusters (Fig. 5B and SI Appendix, Fig. S10C). To ensure that the interactions we observed were not trans


Fig. 4. Identification of Ig1 specificity-determining residues. (A) Ig1.39 domain structural modeling. Structural modeling suggests that the Ig1.39 domain might interact in an antiparallel fashion. (Left) The interacting residues (predicted by PDBePISA) are shown in licorice representation in the homodimer model. (Right) Slices of the Ig1.39–Ig1.39 interface between strand AB subunits. Potential interacting residues (K5 and E19; E17 and K26) are shown in licorice representation. (B) Single point mutations of these candidate residues were assayed for cell aggregation. The single point mutations disrupted cell aggregates, supporting the antiparallel binding mode. (C) Residue swapping between sDscamα20, -α30, and -α36 was designed to assess specificity-determining residues. (i) Ig1 docking model and sequence alignments of shuffled regions. Four candidate specificity-determining residues were located on adjacent B strands. (ii) Schematic representation of residue-swapping mutants used in the experiments, along with a summary of the binding specificity results. (iii) The binding specificity of isoforms containing wild-type and swapped residue 22. See also SI Appendix, Fig. S9A. (D) Residue swapping of variable Ig1 between sDscamα21 and -α37. (i) Ig1 docking model and sequence alignments of the shuffled regions. Three candidate specificity-determining residues were located on adjacent A, B, and D strands. (ii) Schematic diagrams of residue-swapping mutants used in the experiments, along with observed binding specificity. (iii) Cell aggregation assays of isoforms containing wild-type and residue-swapped Ig1 domains. Swapping any one of the three residues in Ig1.21 to that of Ig1.37 did not swap the binding specificity, swapping two of the three residues partially swapped the binding specificity, and swapping all three residues fully swapped the binding specificity. See also SI Appendix, Fig. S9G. (Scale bars, 100 μm.)
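The point mutations discussed here use the standard notation of wild-type residue, 1-based position, and replacement residue (for example K5E or I22T). A minimal sketch of how such mutations can be applied and checked against a sequence is given below. The sequence `TOY_IG1` is a hypothetical placeholder constructed only so that the named positions carry the residues mentioned in the text; it is not an sDscam Ig1 sequence, and the helper names are illustrative.

```python
import re

# Hypothetical 26-residue placeholder, NOT a real Ig1 sequence.
# It places K at 5, E at 17, E at 19, I at 22, and K at 26.
TOY_IG1 = (
    "SAGAK"        # positions 1-5
    "AGAGAGAGAG"   # positions 6-15
    "AEAEA"        # positions 16-20
    "GIAGA"        # positions 21-25
    "K"            # position 26
)

MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_mutation(sequence, mutation):
    """Apply one point mutation such as 'K5E' (1-based numbering)."""
    match = MUTATION.match(mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation: {mutation!r}")
    wild_type, position, replacement = match.groups()
    index = int(position) - 1
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


def apply_mutations(sequence, mutations):
    """Apply several point mutations in order."""
    for mutation in mutations:
        sequence = apply_mutation(sequence, mutation)
    return sequence


if __name__ == "__main__":
    # Charge-swap mutations of the predicted salt-bridge residues.
    for mutation in ("K5E", "E19K", "E17K", "K26E"):
        print(mutation, apply_mutation(TOY_IG1, mutation))
```

Checking the wild-type residue before substituting guards against numbering errors, which matters here because the positions follow the sDscamα Ig1 numbering rather than full-length protein numbering.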

interactions, we performed similar co-IP experiments with mutant isoforms designed to lack trans interactions. HA-sDscamβ4v1 and Myc-β4v2, which did not exhibit homophilic trans interactions in cell aggregation assays, were immunoprecipitated with HA antibody and detected using Western blot analysis with Myc antibody (SI Appendix, Fig. S10C). Moreover, Ig1 domain deletions, which ablated homophilic trans interactions in cell aggregation assays, did not affect the co-IP outcome (SI Appendix, Fig. S10 D and E). These results support the notion that sDscams multimerize in cis independently of trans interactions and indicate that the robust sDscam multimers result from cis interactions.

To further characterize the cis interaction between sDscam isoforms, we performed multimer analysis by Western blotting with Myc antibody or isoform-specific antibodies (SI Appendix, Fig. S10F) in both boiled and unboiled samples. The boiled sDscamβ6v2 migrated with a single molecular mass of ∼80 kDa, which corresponded to the size of the monomer (Fig. 5C). However, several large bands migrated behind the monomer in the unboiled samples, which corresponded to the sizes of putative sDscam dimers, tetramers, and larger multimers (Fig. 5C). We observed multimerization to different extents in all sDscam proteins investigated, suggesting that sDscam proteins from different subfamilies are able to cis-multimerize (Fig. 5 D, i). Furthermore, we observed abundant endogenous sDscamβ6v2 multimers in vivo in the cephalothorax of scorpions (Fig. 5 D, ii). Using a series of sDscamβ6v2 extracellular domain truncations from the N terminus, we identified the extracellular region that contributes to cis interactions (Fig. 5F). We found that the truncated proteins lacking Ig1 (β6v2ΔIg1), Ig1–2 (β6v2ΔIg1–2), and Ig1–3 (β6v2ΔIg1–3) exhibited robust multimerization (Fig. 5 F, i, lanes 2–4). Similar results have been obtained in other sDscamα, -β1, -β2, and -β5 proteins (SI Appendix, Fig. S10G). Co-IP experiments and multimerization assays revealed that the deletion constructs containing individual or multiple continuous FNIII domains were capable of binding to sDscamβ6v2 (Fig. 5G). This result is also consistent with computational modeling using the ZDOCK server (43), by which sDscamβ6v2 could form a homodimer via multiple parallel interfacial regions involving the FNIII1–3 domains (SI Appendix, Fig. S10H). Because the sDscamβ6v2 FNIII1–3 domains lack cysteine residues (SI Appendix, Fig. S11), they likely mediate cis multimerization by noncovalent mechanisms. Taken together, these findings strongly suggest that the Ig1–3 domains of sDscam, which include the region that contributes to trans interactions, are not required for cis interactions, and that membrane-proximal FNIII1–3 domains are sufficient for efficient cis multimerization.

To further investigate whether the TM domain of sDscam contributes to cis multimerization, we first examined the effect of TM deletion on self-multimerization of sDscam. Each of the truncated ΔTM proteins was capable of multimerizing, indicating that the extracellular domain is sufficient to confer efficient cis multimerization (SI Appendix, Fig. S12A). However, multimerization efficiency was markedly reduced in most sDscamΔTM mutants. Additionally, we observed dimerization of TM peptides expressed from all six sDscams investigated (SI Appendix, Fig. S12B). A further mutation experiment revealed that mutation of a cysteine residue in the TM domain did not markedly affect the formation of cis-multimers in either the β6v2 or the β6v2ΔIg1–3 construct (SI Appendix, Fig. S12 C and D). Collectively, these results suggest that the TM domain of each sDscam likely mediated cis-multimerization.

Coexpression of Multiple sDscam Isoforms Diversifies Homophilic Specificities. Previous studies have shown that coexpression of distinct cPcdh isoforms results in new cell recognition that depends on the combination of all isoforms expressed. The expression of even a single mismatched isoform between cells expressing common cPcdh isoforms prevented intermixed cell aggregation (29, 30). Similar to cPcdh, sDscamα isoforms and some sDscamβs interact through homophilic trans binding (Fig. 1D) while simultaneously demonstrating promiscuous specificity in cis (Fig. 5). We therefore tested the manner in which combinatorial expression of multiple sDscam isoforms diversified binding specificities. In almost all cases, cells coexpressing a set of two sDscamα isoforms failed to coaggregate with cells expressing a different set of two sDscamαs (Fig. 6A and SI Appendix, Fig. S13 A–C). In contrast, cells that coexpressed the same set of sDscamα isoforms demonstrated robust intermixing of cell aggregates. Consistent with these observations, co-IP experiments revealed that different sDscamαs interact with each other when coexpressed (SI Appendix, Fig. S13D, lanes 1 and 2). Similar data were obtained for each of the sDscamα/β pairs (Fig. 6B and SI Appendix, Fig. S13D, lanes 3 and 4). These results suggest that the identity of both isoforms determines the recognition specificity of cells coexpressing two distinct isoforms.

We also coexpressed distinct sets of three sDscam isoforms and evaluated their ability to coaggregate with cells containing various numbers of mismatches (Fig. 6C). We found that only cells expressing identical isoform combinations formed intermixed aggregates, while cells expressing mismatched isoforms largely displayed separate red and green aggregates (Fig. 6C). Remarkably, even when two of the three isoforms were identical, the cells did not coaggregate (Fig. 6C, panel 3). However, we also observed a “mixed state” with an intermediate coaggregation index, where separate red and green aggregates coexisted with intermixing aggregates (Fig. 6C, panels 2 and 10). Taken together, these results strongly suggest that sDscams interact in cis to create new homophilic specificities that differ from the specificities of the individual sDscam isoforms.

Discussion

Here, we provide compelling evidence that different combinations of sDscam isoforms interact in cis to significantly expand the homophilic trans recognition specificities in Chelicerata. Specifically, we demonstrated that sDscam isoforms engage in a strict homophilic interaction in trans via their Ig1 domains. Furthermore, the Ig1:Ig1 interactions are likely to occur in an antiparallel orientation and be structurally similar to the trans interactions observed for the Ig7 domains of Drosophila Dscam1. We found that different sDscam isoforms interact in cis promiscuously and that the cis interaction is independent of the trans interaction. Our results indicate that sDscam cis interactions are mediated via the membrane-proximal FNIII1–3 domains. Importantly, we found that when multiple sDscam isoforms are coexpressed, cellular recognition depends on the identity of all expressed isoforms. Cells will only bind if both cells express the same set of isoforms. Below, we discuss our data with a particular emphasis on comparison of the Chelicerata sDscam with fly Dscam1 and vertebrate cPcdhs, which carry out the role of self-avoidance in insects and vertebrates, respectively. We also discuss the challenges and requirements for neuronal self-avoidance.

sDscams Mediate Highly Specific Homophilic Recognition via Self-Binding Variable Ig1. Our findings indicate that all sDscamα and some sDscamβ isoforms engage in specific homophilic interactions (Figs. 1D and 2 B–E and SI Appendix, Fig. S4). The strict preference for homophilic interactions is exemplified by our observation that most pairs of sDscams, with sequence identities >90%, do not demonstrate heterophilic interactions. In contrast, the majority of sDscamβ1–β6 isoforms did not interact homophilically in our assay (Fig. 1D). However, sDscamβ chimeric constructs (i.e., sDscamβ4v1 and -β4v3) produced by deleting or replacing part of their constant region did, in fact, engage in homophilic interactions (SI Appendix, Fig. S2 C, xv and xvi, SI Appendix, Fig. S2 F, i and ii, and SI Appendix, Fig. S3 A,


Fig. 5. sDscams form promiscuous cis-multimers mediated by FNIII1–3 domains. (A) Schematic diagram of cis and trans interactions of sDscam. sDscam monomers interact in a parallel fashion to form a homomultimer or heteromultimer complex, while trans-multimers are formed between two opposing cells in an antiparallel fashion. (B) All sDscamα and sDscamβ isoforms tested interacted strongly with each other in co-IP experiments. Lysates from Sf9 cells cotransfected with sDscamα1 and sDscamβ6v2 bearing a C-terminal HA-tag, and different Myc-tagged sDscam isoforms, were immunoprecipitated using anti-HA antibody and probed with anti-Myc or anti-HA antibodies. See also SI Appendix, Fig. S10C. (C) sDscamβ6v2 expressed in Sf9 cells formed cis-multimers. Boiled and unboiled samples were analyzed by Western blot with Myc antibody (Left), and with sDscamβ6v2 antibody (Right). (D) Multimerization assay of lysates from Sf9 cells expressing distinct sDscam isoforms (i) and the scorpion cephalothorax (ii). (E) sDscams formed cis-multimers in the absence of trans interaction. (i) Proteins lacking Ig1–2, in which homophilic trans interactions are abolished, could also form robust multimers. (ii) Single residue mutations, which abolished homophilic cell aggregation, caused increased multimerization. (F) A series of N-terminal truncations of the extracellular domain of sDscamβ6v2 fused with a Myc-tag were examined in multimerization assays. Unboiled and boiled samples were analyzed by Western blot (i and ii). See also SI Appendix, Fig. S10G. (G) Co-IP and multimerization assay of FNIII1–3 domains. sDscamβ6v2 interacted strongly with each truncated protein containing individual or combined FNIII1–3 domains (i), and the truncated proteins could form strong cis-multimers (ii).

panels 15–17). This finding indicates that the failure in homophilic recognition is not due to incompatibility of the trans self-binding interface. Thus, it seems likely that at least some of these sDscamβs mediate self-recognition, although the underlying mechanism remains unknown.

Our structural modeling and mutagenesis experiments demonstrated that sDscam homophilic specificity is determined by an antiparallel Ig1/Ig1 self-binding (Fig. 4 A and B). Because the Ig1 domain of Chelicerata sDscams is homologous to the Ig7 of fly Dscam1 (32, 33), we hypothesized that they have a similar antiparallel self-binding architecture. Indeed, site-directed swapping mutagenesis revealed that Ig1 of Chelicerata sDscams shares several key specificity-determining residue positions with Ig7 of fly Dscam1 isoforms (Fig. 4 C and D and SI Appendix, Fig. S6)

Fig. 6. Combinatorial coexpression of multiple sDscam isoforms generates unique cell surface identities. (A and B) Cells coexpressing an identical or a distinct set of sDscamα (A) and sDscamβ5–β6 isoforms (B) were assayed for coaggregation. Mean CoAg indices for each experiment are shown in the upper right of each representative image. (C) Cells coexpressing three distinct mCherry-tagged sDscam isoforms were assayed for interaction with cells expressing an identical or a distinct set of GFP-tagged sDscam isoforms. The nonmatching isoforms between two cell populations are underlined. (Scale bars A–C, 100 μm.) (D) Illustration of the outcome of cell–cell interaction dictated by combinatorial homophilic specificity of two (Left) and three (Right) distinct sDscam isoforms. The schematic does not depict the cis-multimer; the nonmatching isoform is marked with an asterisk.
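The recognition rule illustrated in Fig. 6D can be sketched in a few lines of Python. This is an idealized model, not the authors' analysis: it treats a cell as a plain set of isoform labels, assumes coaggregation occurs only when the two sets match exactly, and ignores the intermediate "mixed state" and any effect of expression level. The isoform labels, the function names, and the pool sizes of 50 and 100 (taken from the approximate sDscam isoform number quoted in the text) are illustrative assumptions.

```python
from math import comb


def recognizes_as_self(cell_a, cell_b):
    """Idealized rule: cells coaggregate only if their isoform sets are identical."""
    return frozenset(cell_a) == frozenset(cell_b)


def identity_count(pool_size, expressed_per_cell):
    """Number of distinct isoform sets of a given size that a pool can supply."""
    return comb(pool_size, expressed_per_cell)


# Hypothetical isoform labels, chosen only for illustration.
red_cells = {"alpha1", "alpha2", "beta5"}
green_matching = {"alpha1", "alpha2", "beta5"}
green_one_mismatch = {"alpha1", "alpha2", "beta6"}

print(recognizes_as_self(red_cells, green_matching))      # identical sets
print(recognizes_as_self(red_cells, green_one_mismatch))  # one mismatched isoform

# How the number of possible identities grows with the number of coexpressed isoforms.
for pool_size in (50, 100):
    for expressed in (1, 2, 3):
        print(pool_size, expressed, identity_count(pool_size, expressed))
```

Under these assumptions, a pool of 100 isoforms yields 100 identities when one isoform is expressed per cell, 4,950 when two are coexpressed, and 161,700 when three are coexpressed. The number of isoforms actually coexpressed in a chelicerate neuron is not established here, so the counts illustrate only the scaling.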

(38, 39). These findings further support the evolutionary relationship between Drosophila Dscam1 and Chelicerata sDscams. However, while the homophilic recognition of sDscam is achieved via association of a single Ig domain, the homophilic specificity of fly Dscam1 is determined via the combined specificity of three independent antiparallel self-binding domains, Ig2/Ig2, Ig3/Ig3, and Ig7/Ig7 (18, 38, 39, 44). Notably, self-binding of sDscam Ig1 was sufficient for cell adhesion in our cell aggregation assay. In contrast, self-binding of fly Dscam1 Ig7/Ig7 (single domain) was insufficient to sustain homophilic recognition (18). Although structural models are not accurate enough to study details of the interface, we note that the modeled interface of sDscam buries ∼30% more surface area than the Drosophila Dscam1 Ig7 interface (SI Appendix, Fig. S7D). Additionally, fly Dscam, with 16 extracellular domains, is a significantly longer and more flexible molecule than sDscam, which has only six domains. It is possible that because of the additional degrees of freedom in the fly Dscam, the affinity generated by Ig7 self-binding is insufficient and requires the joint participation of other trans-interacting domains to achieve binding.

Chelicerata sDscams Form Promiscuous Cis Interactions. cPcdh recognition is mediated by a mechanism coupling nonspecific cis and specific trans interactions (29, 30, 41, 42, 45). cPcdh isoforms engage homophilically in trans via four membrane-distal domains (EC1–EC4) and in cis via a nonoverlapping and independent interface involving the membrane-proximal EC5–EC6 domains (30, 41, 42, 45–49). Our data indicate that although Chelicerata sDscams and cPcdhs are evolutionarily unrelated, they form similar interactions. sDscams interact in trans via a membrane-distal Ig1 domain and in cis via a nonoverlapping and independent interface involving the membrane-proximal FNIII domains. Several lines of independent evidence support the ability of sDscams to form robust cis interactions. First, distinct sDscam isoforms form different clusters that can be coimmunoprecipitated (Fig. 5B and SI Appendix, Fig. S10C); second, sDscams are present in high-molecular-weight detergent-solubilized assembly complexes from the scorpion cephalothorax (Fig. 5 D, ii); third, high-order oligomers are observed with unboiled sDscam samples; and fourth, we observed an altered recognition specificity when multiple sDscam isoforms were coexpressed (Fig. 6). Importantly, our co-IP experiments revealed that sDscam cis interactions are mediated by membrane-proximal FNIII domains independently of trans interactions. However, the data presented here largely represent biochemical evidence for sDscam cis interactions using in vitro systems. More structural and biophysical measurements will be needed to define the overall architecture of the sDscam recognition unit.

Requirement of Neuronal Self-Avoidance. Neuronal self-avoidance requires that interacting neurons have unique cell surface identities so that they will not recognize each other as self and repel.

When neuron arbors do not overlap with other neurons of the same type, simple receptor–ligand interactions are sufficient for self-avoidance. For example, the PVD nociceptive neurons in Caenorhabditis elegans separately innervate the left and right body walls and use a single unc40/DCC receptor to control dendritic self-avoidance (50). However, neurons with dendrite arbors that overlap with homotypic neurons require a diverse population of receptors and ligands to discriminate self from nonself interactions.

Insects such as Drosophila generate diverse cell surface populations by randomly expressing a small set of Dscam1 proteins from a pool of tens of thousands of isoforms, each with homophilic specificity (22, 23). In vertebrates, only 50 to 60 cPcdh proteins mediate self-avoidance; it is therefore clear that vertebrate cell surface diversity is not generated solely based on the number of isoforms (20–23). Cell aggregation assays revealed that two populations of cells expressing cPcdhs that differ only by a single isoform segregate into aggregates expressing identical isoforms (30). These results suggest that vertebrate cell surface diversity is achieved through the combined identities of all expressed cPcdh isoforms.

While the Chelicerata sDscam proteins are evolutionarily related to Drosophila Dscam1, they encode only 50 to 100 isoforms, which is two orders of magnitude fewer than the number of Drosophila Dscam1 isoforms. Thus, they are unable to produce cell surface diversity in a manner similar to that of Drosophila, which relies solely on the number of isoforms. However, our findings demonstrate that, similar to cPcdhs, the recognition of cells depends on the combined identity of all expressed sDscam isoforms. That is, a single mismatched sDscam isoform ensures that two cells will not recognize each other as self (Fig. 6D). Note that in this study we only showed the potential for sDscam to mediate neuronal self-avoidance in Chelicerata. Further studies, both in vitro and in vivo, will be required before we can confidently state the role of sDscam in neuronal development.

Chelicerata sDscams Have More Parallels with Vertebrate Pcdhs than Drosophila Dscam1. Our findings indicate that Chelicerata sDscams have striking parallels to Drosophila Dscam1 and vertebrate Pcdhs, thereby suggesting analogous roles. These three cell surface adhesion receptor families encode large numbers of neuronal TM protein isoforms, and the encoded proteins interact strictly homophilically (SI Appendix, Fig. S14) (20, 23, 34). In addition, such striking isoform diversity has been shown to underlie neuronal self–nonself discrimination for Drosophila Dscam1 and vertebrate Pcdhs (3–6, 51). Although functional evidence for sDscam diversity is lacking, our results support and extend the notion that different phyla use distinct molecules or mechanisms to implement the analogous principle of self-recognition and self-avoidance during neuronal arborization (SI Appendix, Fig. S14) (20, 22).

From an evolutionary viewpoint, Chelicerata sDscams are related to Drosophila Dscam1 and have no evolutionary relationship with vertebrate cPcdhs (23, 33). In addition, all chelicerates investigated have two to eight genes corresponding to the canonical (long) Dscam (32), which is on a similar scale to the number of Dscam genes in mammals and non-Dscam1 isoforms in insects. Since fly Dscam2–4 and mammalian Dscams function in neuronal assembly (52–54), we speculate that chelicerate canonical Dscams would play similar roles in neuronal assembly. However, in many respects, Chelicerata sDscams exhibit more parallels to vertebrate cPcdhs (SI Appendix, Fig. S14). Both are organized in a tandem array in the 5′ variable region, encoding the same order of magnitude of isoforms (∼50 to 100) via alternative promoters. Both have a similar structural composition comprising six extracellular domains, a single TM domain, and a cytoplasmic region. We have now shown that scorpion sDscam, like mouse Pcdhs, exhibits combinatorial recognition specificities based on the assembly of cis interactions. Thus, our findings highlight molecular strategies for neuronal self-avoidance as striking examples of convergent evolution. It will be interesting to learn whether convergent examples for self-avoidance are available in other animals. Finally, based on the remarkable parallels between Chelicerata sDscams and vertebrate Pcdhs, we wonder whether cadherins, in the kingdom, generate extraordinary isoform diversity via alternative splicing like their fly Dscam1 counterparts. One thing is certain: Insights from extraordinary isoform diversity continue to deepen our understanding of basic biological principles.

Materials and Methods

Cell Culture and Cell Lines. Sf9 cells (a gift from Jian Chen, Zhejiang Sci-Tech University, Hangzhou, China) were cultured in Sf-900 II SFM (Gibco, 10902088) supplemented with 10% fetal bovine serum (Gibco, 10099141) and 1% penicillin-streptomycin (Gibco, 15140163) at 27 °C.

Cell Aggregation Assays. The cell aggregation assay was performed as previously reported with some modification (35). Sf9 cells were infected with recombinant viruses of mCherry- or EGFP-tagged target proteins and incubated at 27 °C for 3 d. Then, the infected cells were allowed to aggregate for 30 min at 27 °C on a gyratory shaker (IKA KS260) at 60 rpm. Finally, a Nikon Ti-S inverted fluorescence microscope was used to capture images (details are in SI Appendix, Supplementary Methods).

Binding Specificity Assay for Cells Expressing Single or Multiple sDscam Isoforms. Sf9 cells were infected with differentially tagged sDscam isoforms as described above. Images were captured using the Nikon Ti-S inverted fluorescence microscope and counted for analysis of binding specificity (details are in SI Appendix, Supplementary Methods).

Additional methods can be found in SI Appendix, Supplementary Methods.

Data Availability. All data used for the study are available in the main text and figures and SI Appendix figures and table.

ACKNOWLEDGMENTS. We thank Heng Ru for technical assistance with homology modeling and for discussion and comments; and Mu Xiao for technical assistance with the multimerization assay and for comments. This work was supported by research grants from the National Natural Science Foundation of China (31630089, 31430050, 91740104) and Israel Science Foundation Grant 1463/19. R.R. is supported by the Israel Cancer Research Fund (ICRF 19-203-RCDA).

1. B. Honig, L. Shapiro, Adhesion protein structure, molecular affinities, and principles of cell-cell recognition. Cell 181, 520–535 (2020).
2. J. R. Sanes, S. L. Zipursky, Synaptic specificity, recognition molecules, and assembly of neural circuits. Cell 181, 536–556 (2020).
3. J. L. Lefebvre, D. Kostadinov, W. V. Chen, T. Maniatis, J. R. Sanes, Protocadherins mediate dendritic self-avoidance in the mammalian nervous system. Nature 488, 517–521 (2012).
4. G. Mountoufaris et al., Multicluster Pcdh diversity is required for mouse olfactory neural circuit assembly. Science 356, 411–414 (2017).
5. D. Hattori et al., Robust discrimination between self and non-self neurites requires thousands of Dscam1 isoforms. Nature 461, 644–648 (2009).
6. D. Hattori et al., Dscam diversity is essential for neuronal wiring and self-recognition. Nature 449, 223–227 (2007).
7. B. J. Matthews et al., Dendrite self-avoidance is controlled by Dscam. Cell 129, 593–604 (2007).
8. X. L. Zhan et al., Analysis of Dscam diversity in regulating axon guidance in Drosophila mushroom bodies. Neuron 43, 673–686 (2004).
9. M. E. Hughes et al., Homophilic Dscam interactions control complex dendrite morphogenesis. Neuron 54, 417–427 (2007).
10. P. Soba et al., Drosophila sensory neurons require Dscam for dendritic self-avoidance and proper dendritic field organization. Neuron 54, 403–416 (2007).
11. T. Hummel et al., Axonal targeting of olfactory receptor neurons in Drosophila is controlled by Dscam. Neuron 37, 221–231 (2003).
12. H. Zhu et al., Dendritic patterning by Dscam and synaptic partner matching in the Drosophila antennal lobe. Nat. Neurosci. 9, 349–355 (2006).
13. S. K. Miura, A. Martins, K. X. Zhang, B. R. Graveley, S. L. Zipursky, Probabilistic splicing of Dscam1 establishes identity at the level of single neurons. Cell 155, 1166–1177 (2013).
14. G. Neves, J. Zucker, M. Daly, A. Chess, Stochastic yet biased expression of multiple Dscam splice variants by individual cells. Nat. Genet. 36, 240–246 (2004).

15. D. Schmucker et al., Drosophila Dscam is an axon guidance receptor exhibiting extraordinary molecular diversity. Cell 101, 671–684 (2000).
16. W. Sun et al., Ultra-deep profiling of alternatively spliced Drosophila Dscam isoforms by circularization-assisted multi-segment sequencing. EMBO J. 32, 2029–2038 (2013).
17. W. M. Wojtowicz, J. J. Flanagan, S. S. Millard, S. L. Zipursky, J. C. Clemens, Alternative splicing of Drosophila Dscam generates axon guidance receptors that exhibit isoform-specific homophilic binding. Cell 118, 619–633 (2004).
18. W. M. Wojtowicz et al., A vast repertoire of Dscam binding specificities arises from modular interactions of variable Ig domains. Cell 130, 1134–1145 (2007).
19. D. Schmucker, B. Chen, Dscam and DSCAM: Complex genes in simple animals, complex animals yet simple genes. Genes Dev. 23, 147–156 (2009).
20. G. Mountoufaris, D. Canzio, C. L. Nwakeze, W. V. Chen, T. Maniatis, Writing, reading, and translating the clustered protocadherin cell surface recognition code for neural circuit assembly. Annu. Rev. Cell Dev. Biol. 34, 471–493 (2018).
21. T. Yagi, Molecular codes for neuronal individuality and cell assembly in the brain. Front. Mol. Neurosci. 5, 45 (2012).
22. S. L. Zipursky, W. B. Grueber, The molecular basis of self-avoidance. Annu. Rev. Neurosci. 36, 547–568 (2013).
23. S. L. Zipursky, J. R. Sanes, Chemoaffinity revisited: Dscams, protocadherins, and neural circuit assembly. Cell 143, 343–353 (2010).
24. Q. Wu, T. Maniatis, A striking organization of a large family of human neural cadherin-like cell adhesion genes. Cell 97, 779–790 (1999).
25. Q. Wu et al., Comparative DNA sequence analysis of mouse and human protocadherin gene clusters. Genome Res. 11, 389–404 (2001).
26. S. Ribich, B. Tasic, T. Maniatis, Identification of long-range regulatory elements in the protocadherin-alpha gene cluster. Proc. Natl. Acad. Sci. U.S.A. 103, 19719–19724 (2006).
27. B. Tasic et al., Promoter choice determines splice site selection in protocadherin alpha and gamma pre-mRNA splicing. Mol. Cell 10, 21–33 (2002).
28. X. Wang, H. Su, A. Bradley, Molecular mechanisms governing Pcdh-gamma gene expression: Evidence for a multiple promoter and cis-alternative splicing model. Genes Dev. 16, 1890–1905 (2002).
29. D. Schreiner, J. A. Weiner, Combinatorial homophilic interaction between gamma-protocadherin multimers greatly expands the molecular diversity of cell adhesion. Proc. Natl. Acad. Sci. U.S.A. 107, 14893–14898 (2010).
30. C. A. Thu et al., Single-cell identity generated by combinatorial homophilic interactions between α, β, and γ protocadherins. Cell 158, 1045–1059 (2014).
31. H. Wolf, The pectine organs of the scorpion, Vaejovis spinigerus: Structure and (glomerular) central projections. Arthropod Struct. Dev. 37, 67–80 (2008).
32. G. Cao et al., A chelicerate-specific burst of nonclassical Dscam diversity. BMC Genomics 19, 66 (2018).
33. Y. Yue et al., A large family of Dscam genes with tandemly arrayed 5′ cassettes in Chelicerata. Nat. Commun. 7, 11252 (2016).
34. Y. Jin, H. Li, Revisiting Dscam diversity: Lessons from clustered protocadherins. Cell. Mol. Life Sci. 76, 667–680 (2019).
35. G. C. Zondag et al., Homophilic interactions mediated by receptor tyrosine phosphatases mu and kappa. A critical role for the novel extracellular MAM domain. J. Biol. Chem. 270, 14247–14250 (1995).
36. S. Bonn, P. H. Seeburg, M. K. Schwarz, Combinatorial expression of alpha- and gamma-protocadherins alters their presenilin-dependent processing. Mol. Cell. Biol. 27, 4121–4132 (2007).
37. Y. Murata, S. Hamada, H. Morishita, T. Mutoh, T. Yagi, Interaction with protocadherin-gamma regulates the cell surface expression of protocadherin-alpha. J. Biol. Chem. 279, 49508–49516 (2004).
38. M. R. Sawaya et al., A double S shape provides the structural basis for the extraordinary binding specificity of Dscam isoforms. Cell 134, 1007–1018 (2008).
39. S. A. Li, L. Cheng, Y. Yu, J. H. Wang, Q. Chen, Structural basis of Dscam1 homodimerization: Insights into context constraint for protein recognition. Sci. Adv. 2, e1501118 (2016).
40. O. J. Harrison et al., The extracellular architecture of adherens junctions revealed by crystal structures of type I cadherins. Structure 19, 244–256 (2011).
41. R. Rubinstein et al., Molecular logic of neuronal self-recognition through protocadherin domain interactions. Cell 163, 629–642 (2015).
42. J. Brasch et al., Visualization of clustered protocadherin neuronal self-recognition complexes. Nature 569, 280–283 (2019).
43. B. G. Pierce et al., ZDOCK server: Interactive docking prediction of protein-protein complexes and symmetric multimers. Bioinformatics 30, 1771–1773 (2014).
44. R. Meijers et al., Structural basis of Dscam isoform specificity. Nature 449, 487–491 (2007).
45. K. M. Goodman et al., Protocadherin cis-dimer architecture and recognition unit diversity. Proc. Natl. Acad. Sci. U.S.A. 114, E9829–E9837 (2017).
46. K. M. Goodman et al., Structural basis of diverse homophilic recognition by clustered α- and β-protocadherins. Neuron 90, 709–723 (2016).
47. J. M. Nicoludis et al., Structure and sequence analyses of clustered protocadherins reveal antiparallel interactions that mediate homophilic specificity. Structure 23, 2087–2098 (2015).
48. J. M. Nicoludis et al., Antiparallel protocadherin homodimers use distinct affinity- and specificity-mediating regions in cadherin repeats 1-4. eLife 5, e18449 (2016).
49. J. M. Nicoludis et al., Interaction specificity of clustered protocadherins inferred from sequence covariation and structural analysis. Proc. Natl. Acad. Sci. U.S.A. 116, 17825–17830 (2019).
50. C. J. Smith, J. D. Watson, M. K. VanHoven, D. A. Colón-Ramos, D. M. Miller 3rd, Netrin (UNC-6) mediates dendritic self-avoidance. Nat. Neurosci. 15, 731–737 (2012).
51. W. V. Chen et al., Pcdhαc2 is required for axonal tiling and assembly of serotonergic circuitries in mice. Science 356, 406–411 (2017).
52. S. S. Millard, J. J. Flanagan, K. S. Pappu, W. Wu, S. L. Zipursky, Dscam2 mediates axonal tiling in the Drosophila visual system. Nature 447, 720–724 (2007).
53. W. Tadros et al., Dscam proteins direct dendritic targeting through adhesion. Neuron 89, 480–493 (2016).
54. A. M. Garrett, A. Khalil, D. O. Walton, R. W. Burgess, DSCAM promotes self-avoidance in the developing mouse retina by masking the functions of cadherin superfamily members. Proc. Natl. Acad. Sci. U.S.A. 115, E10216–E10224 (2018).
