Proc. Natl. Acad. Sci. USA
Vol. 90, pp. 6410-6414, July 1993
Biochemistry

Neurexin IIIα: Extensive alternative splicing generates membrane-bound and soluble forms

YURI A. USHKARYOV AND THOMAS C. SÜDHOF*

Howard Hughes Medical Institute and Department of Molecular Genetics, University of Texas Southwestern Medical School, Dallas, TX 75235

Communicated by Michael S. Brown, March 26, 1993

ABSTRACT The structure of neurexin IIIα was elucidated from overlapping cDNA clones. Neurexin IIIα is highly homologous to neurexins Iα and IIα and shares with them a distinctive domain structure that resembles a cell surface receptor. cDNA cloning and PCR experiments revealed alternative splicing at four positions in the mRNA for neurexin IIIα. Alternative splicing was previously observed at the same positions in either neurexin Iα or neurexin IIα or both, suggesting that the three neurexins are subject to extensive alternative splicing. This results in hundreds of different neurexins with variations in small sequences at similar positions in the proteins. The most extensive alternative splicing of neurexin IIIα was detected at its C-terminal site, which exhibits a minimum of 12 variants. Some of the alternatively spliced sequences at this position contain in-frame stop codons, suggesting the synthesis of secreted proteins. None of the sequences of the other splice sites in this or the other two neurexins include stop codons. RNA blot analysis demonstrates that neurexin IIIα is expressed in a brain-specific pattern. Our results suggest that the neurexins constitute a large family of polymorphic cell surface proteins that includes secreted variants, indicating a possible role as signaling molecules.

The development and function of the nervous system are dependent on an intricate network of cell signaling and cell-cell interaction mechanisms (1). We have identified (2) a family of cell surface receptors named neurexins with characteristics that suggest a role in such cell-cell interactions. Two genes for neurexins were identified; each gene contains two independent promoters that drive transcription of two classes of mRNAs. The two classes of transcripts encode proteins with different N termini; the larger proteins are referred to as α-neurexins and the shorter are β-neurexins. The expression of the neurexins is brain-specific and concentrated in neurons. Neurexins are highly polymorphic due to extensive differential splicing with the potential to express hundreds of different proteins.

Neurexins have a distinctive domain structure. In the α-neurexins, a signal sequence is followed by three copies of a long repeat that make up >75% of the total protein. Each of the long repeats contains a central epidermal growth factor (EGF) domain flanked by right and left arms that are weakly homologous to each other. Data bank searches (GenBank Release 70 and Protein Identification Resource Release 30) demonstrated that the extracellular repeats in neurexins are related to sequences found in several proteins that participate in cell-cell interactions during development, including laminin A (3), agrin (4, 5), and slit (6). After the three long repeats, α-neurexins contain an O-linked sugar domain, a single transmembrane region, and a short cytoplasmic tail. The N-terminal sequence of β-neurexins differs from that of the α-neurexins but is spliced into the α-neurexin sequences at the end of the third EGF domain. α- and β-neurexins have identical C termini including the O-linked sugar region, transmembrane region, and cytoplasmic domain (2).

Structures of the neurexins are suggestive of cell surface receptors. Indeed, the neurexins were originally found because of the identity of the sequence of neurexin Iα and sequences from the high molecular weight subunit of the receptor for a presynaptic neurotoxin, α-latrotoxin (7). Immunocytochemistry of rat brain frozen sections revealed that neurexin I is highly enriched in synapses, in agreement with the localization of α-latrotoxin (2). These findings suggest that at least neurexins Iα and Iβ are synaptic proteins that, based on their structure and homologies, may represent a class of neuronal cell-interaction molecules (2).

We now report the identification, structure, and expression of neurexin IIIα.† By extending the neurexin family of proteins, our results demonstrate the generality of the features of this protein family: their conserved alternative splicing patterns, brain-specific expression, and similar domain structure with distinct patterns of shared and divergent sequences. The expanded number of expressed neurexins lends further support to a putative role of the neurexins as polymorphic identifier molecules. The surprising finding, however, was that neurexin IIIα may be produced in secreted forms, suggesting a higher degree of complexity than indicated by the structures of neurexins I and II.

Abbreviation: EGF, epidermal growth factor.
*To whom reprint requests should be addressed.
†The sequences reported in this paper have been submitted to GenBank (accession no. L14851).
The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

EXPERIMENTAL PROCEDURES

cDNA Cloning and Sequencing. cDNA cloning and mapping were performed as described using 32P-labeled oligonucleotides as probes (8). DNA sequencing was performed on single-stranded DNA from M13 subclones using the dideoxynucleotide-termination method (9) with fluorescent primers and Taq polymerase (Promega). Reaction products were analyzed on an ABI 373A automatic DNA sequencer.

PCRs. First-strand cDNA synthesis was performed in 100 µl by standard procedures (10) using 10 µg of poly(A)+-enriched RNA from rat forebrain, cerebellum, or spinal cord. PCR mixtures contained 2-5 µl of cDNA, 2.5 mM MgCl2, 10 mM Tris·HCl (pH 8.3), 50 mM KCl, 20 pmol of each primer, all four dNTPs (each at 200 µM), and AmpliTaq DNA polymerase (Perkin-Elmer) at 30 units/liter. PCRs were performed in 32 cycles, with the first two cycles at 95°C for 5 min, 60°C for 1 min, and 72°C for 5 min, followed by 30 cycles at the same temperatures but for 1 min, 1 min, and 2 min. Neurexin III was initially identified with two oligonucleotides corresponding to sequences that are highly conserved among neurexins [(nucleotides in parentheses denote redundant positions with multiple nucleotides) GGAGATCTAGTTGCGGTACTTGTACATGGC(G/A)TAGAG(G/A)AGGAT and AGCAAGCTTGGCACCGTGCCCATCGCCATCAA]. Subsequent PCRs to map the extent of alternative splicing were carried out with oligonucleotides I, II, and III in Fig. 3 (I, GCGAAGCTTGTTTCATCTGCTGAATGCT; II, GCGGCATGCCACATTCTTGAAAGTTGC; III, GCCTCTAGAACCCGTCTGATTCCTGGCT). Two types of PCR experiments were performed: (i) PCRs were performed in duplicate with one labeled primer and one unlabeled primer. The products of such reactions were then analyzed by denaturing PAGE and autoradiography to determine the minimal number of fragments generated that incorporate both primers. This avoids artifacts caused by single-oligonucleotide priming or heteroduplex formation (11). (ii) PCRs were performed with unlabeled primers, analyzed by nondenaturing PAGE, and stained by ethidium bromide. PCR products were then subjected to Southern blot analysis with internal oligonucleotides or excised and cloned into M13 vectors and sequenced. All of the PCR products of Fig. 3 were sequenced except for products C, E, and G, which were identified on the basis of their size in relation to the other sequenced products and of their hybridization with an insert-specific oligonucleotide (ACCAAGTACAGGTAGGTC).

RNA Blot Analysis. This procedure was performed as described (2) using total RNA from various rat tissues and single-stranded 32P-labeled DNA probes. Blots were rehybridized with a probe to cyclophilin to control for RNA loading.

RESULTS AND DISCUSSION

Cloning of Neurexin IIIα. Neurexin IIIα was initially identified by PCRs performed with oligonucleotides corresponding to conserved sequences in the C termini of neurexins I and II. Analysis of the PCR products by DNA sequencing demonstrated that in addition to characterized neurexins, another neurexin transcript was amplified (referred to hereafter as neurexin IIIα). An oligonucleotide specific for neurexin IIIα was then synthesized based on the PCR sequence and used to isolate cDNAs encoding neurexin IIIα from a rat brain library. To obtain the 5' end of the neurexin IIIα mRNA, further 5' clones were isolated using additional neurexin IIIα-specific oligonucleotides.

Twelve overlapping cDNA clones were selected for sequence analysis (Fig. 1). These clones show internal differences, suggesting that neurexin IIIα, like the other neurexins, is extensively alternatively spliced (see below). The full-length structure of neurexin IIIα was assembled from the sequences of the cDNA clones and is shown with the deduced amino acid sequence of neurexin IIIα in Fig. 2.

FIG. 1. Structures of the mRNA for neurexin IIIα and the cDNA clones encoding neurexin IIIα used in the current study. Sequence segments deleted by alternative splicing are indicated by thin arched lines. The thick line indicates the coding region. Clones shown: p679-7a, p679-12a, p679-1a, p679-15a, p679-11b, p664-31a, p664-12a, p664-24a, p664-25a, p664-39a, p664-16a, and p664-15a. Scale bar, 1 kb.

The mRNA for neurexin IIIα contains a long 5' untranslated region (>700 bp). The translation initiation codon was identified as the first methionine codon in the sequence that is followed by a long open reading frame. This identification is supported by the fact that the putative initiator methionine is preceded by an in-frame stop codon (underlined in Fig. 2) and that its sequence context conforms well to the consensus site of initiator methionines (12).

Six of the cDNA clones had poly(A) tails at their 3' end. However, these poly(A) tails initiated at three positions in the nucleotide sequence (Fig. 2). None of the poly(A) tails was preceded by a classical polyadenylylation consensus sequence. Therefore, it is doubtful that these poly(A) tracts represent the poly(A) tails of the neurexin IIIα mRNA. The 3' untranslated region of the neurexin IIIα mRNA is rich in adenosines, suggesting that the poly(A) tails in the cDNAs may represent oligo(dT)-priming sites that are not necessarily poly(A) tracts.

Alternative Splicing of Neurexin IIIα. The 12 cDNA clones that were sequenced contained four internal positions at which different clones exhibited sequence variations (Figs. 1 and 2). These differences consisted of the presence or absence of a sequence stretch, suggesting that they arose by alternative splicing. Comparison of the sequences of neurexins Iα, IIα, and IIIα demonstrates that, at each of these four positions, neurexin Iα, neurexin IIα, or both are also alternatively spliced (2). This result raises the possibility that the three neurexins are alternatively spliced at similar positions.

A simple pattern of alternative splicing with only two variants was observed in three of the four alternatively spliced sequences based on the cDNA clones. At these positions, a sequence is either included or deleted from the mRNA (Fig. 2). Remarkably, the third of these sequences, a conserved 90-bp sequence in the C-terminal half of the neurexins, is alternatively spliced in all neurexins.

A complex pattern of alternative splicing was observed at the fourth position (Fig. 2). Here, cDNA clones contained five sequences, some of which exhibited in-frame stop codons, which predict the synthesis of secreted proteins without a transmembrane region. To investigate whether these cDNA clones represented cloning artifacts arising from partially processed heterogeneous nuclear RNA, PCR experiments on single-stranded cDNA made from poly(A)+-enriched RNA from rat brain were carried out. Two sets of oligonucleotide primers (Fig. 3) were used to amplify the alternatively spliced sequences of the neurexin IIIα mRNA at this position. PCR products were analyzed on a Southern blot with internal oligonucleotides and by DNA sequencing.

The PCR experiments confirmed and extended the cDNA cloning results by identifying additional splice variants (see Fig. 3). The results from the PCR experiments agreed well with the cDNA cloning data. Thus, evidence for at least 12 splice variants at this site was obtained. PCR is a very sensitive technique that is prone to artifacts. However, the following findings suggest that the C-terminal splice variants containing stop codons correspond to physiological mRNAs: (i) These forms were independently isolated in multiple cDNA clones. In the course of studying neurexins, we have isolated >100 neurexin cDNAs. None of the sequenced splice sites of these cDNAs contained stop codons except for the C-terminal splice site of neurexin III (ref. 2 and Y.A.U. and T.C.S., unpublished observations). (ii) PCR experiments on different splice sites in all three neurexins led to >100 sequencing events of neurexin splice sites, none of which exhibited in-frame stop codons except for the C-terminal splice site of neurexin IIIα. (iii) PCRs with rat genomic DNA under the same conditions with positive controls failed to amplify a DNA fragment, suggesting that the primers are separated by a large intron. Thus these data suggest that contamination of the cDNA library and PCRs with partially processed heterogeneous nuclear RNA is not a source of artifacts and that contamination by genomic DNA also cannot explain our findings.
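The notion of an in-frame stop codon within a spliced insert can be illustrated with a minimal Python sketch. It is not the procedure used in this study: the function name `first_in_frame_stop` and the toy sequences are hypothetical, none of the sequences is taken from neurexin IIIα, and the sketch assumes that the reading frame entering the insert is already known from the upstream coding sequence.

```python
# Stop codons of the standard genetic code, written as DNA.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_in_frame_stop(seq, frame=0):
    """Return the 0-based position of the first in-frame stop codon, or None.

    `frame` is the offset (0, 1, or 2) of the first complete codon in `seq`.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA letters
    # Step through the sequence one codon at a time in the given frame.
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


# Toy inserts (hypothetical, for illustration only).
insert_with_stop = "ATGGCCTAAGGC"      # codons: ATG GCC TAA GGC
insert_without_stop = "ATGGCCTTAAGGC"  # codons: ATG GCC TTA AGG (+ C)

print(first_in_frame_stop(insert_with_stop))
print(first_in_frame_stop(insert_without_stop))
print(first_in_frame_stop(insert_without_stop, frame=1))
```

In this toy case the second insert is open in frame 0 but closed in frame 1, which is why the frame inherited from the upstream exon, and not the mere presence of a TAA, TAG, or TGA triplet, decides whether a spliced insert would truncate the protein before the transmembrane region.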

FIG. 2. Nucleotide and amino acid sequences of rat neurexin IIIα. Alternatively spliced sequences are indicated by a shaded background for the first three splice sites and are boxed for the complex fourth site. The amino acid sequence is shown below the nucleotide sequence. The in-frame stop codon in the 5' untranslated region preceding the initiator methionine is underlined, and stop codons are marked by asterisks. The N-terminal signal sequence is underlined, and the putative transmembrane region is double underlined. Four sites of alternative splicing were observed. Three consist of simple patterns (shown on a shaded background) that were found to be spliced either in or out [nt 1519-1530, nt 3023-3050 (deletion of this sequence deletes aa 754-762 and changes Ser-763 to Gly), and nt 4390-4479]. The fourth alternatively spliced sequence consists of several independently spliced sequence segments that are boxed. (See Fig. 3 for a diagram of the splice patterns observed, creating at least 12 variants. The alternatively spliced sequences that are boxed here are shaded in Fig. 3.) At the 3' end, clones p679-11b, p664-25a, and p664-16a had a poly(A) tract starting at nt 6962; clone p664-31a had one starting at nt 6973; and clones p664-17a and p664-24a had a poly(A) tract starting at nt 7013. Since none of these poly(A) tracts is preceded by a polyadenylylation consensus sequence, they may represent A-rich oligo(dT)-priming sites in the mRNA and not true poly(A) tracts.
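As a worked check of the transcript count implied by this legend, the following Python sketch multiplies the number of variants at each splice site. It assumes, as argued in the text, that the four sites are spliced independently of each other. The counts are the minimum values reported in this paper (three sites with a sequence either present or absent, and at least 12 variants at the fourth site); the variable names are illustrative only.

```python
from itertools import product
from math import prod

# Minimum number of variants observed at each of the four splice sites:
# three in/out sites and the complex C-terminal site.
variants_per_site = [2, 2, 2, 12]

# Lower bound on distinct transcripts if the sites combine independently.
min_transcripts = prod(variants_per_site)

# Cross-check by explicitly enumerating every combination of site variants.
enumerated = sum(1 for _ in product(*(range(n) for n in variants_per_site)))

print(min_transcripts, enumerated)
```

Both values come to 2 x 2 x 2 x 12 = 96, the figure given in the Conclusion. It is a lower bound, because only the variants actually observed at each site are counted.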

These results suggest that the splice variants observed at the C-terminal site of neurexin IIIα are physiological. Since most of the alternatively spliced sequences in neurexin IIIα are widely separated and occur in multiple combinations, it is likely that they are independent of each other. Combinations of the different inserts at the four sites could generate at least 96 transcripts for neurexin IIIα, indicating that neurexin IIIα is also a very polymorphic protein. Several of the neurexin IIIα transcripts generated by alternative splicing at the C-terminal site contain stop codons before the transmembrane region, suggesting they encode secreted proteins. No splice variant of the other neurexins exhibits this feature.

FIG. 3. Splice-site variants observed at the C-terminal splice site of neurexin IIIα (nt 4868-6199, boxed in Fig. 2). Variants were identified by cDNA cloning and PCR; the relative positions of the three oligonucleotides used for PCR are indicated above variant A (labeled I, II, and III). The method of identification of each variant is indicated on the right. The different spliced sequence segments are marked by different textures. Positions of the first in-frame stop codon in a transcript are shown by asterisks; these stop codons create secreted forms of neurexin IIIα.

Protein Structure of Neurexin IIIα. Neurexin IIIα is highly homologous to neurexins Iα and IIα. Fifty-eight percent of the residues are invariant in the three neurexins, and neurexin IIIα is approximately equally homologous to neurexins Iα and IIα. As expected from the high degree of homology between the neurexins, neurexin IIIα has the same overall domain structure as the other neurexins (Fig. 4). Analysis of the neurexin sequences for signal sequences and signal peptidase cleavage sites (13) predicts that all three neurexins contain cleaved signal sequences at their N termini. This analysis suggests that the conserved sequence Gly-Leu-Glu-Phe represents the probable cleavage site, with the cleavage occurring after the glycine (data not shown).

FIG. 4. Domain models of membrane-bound and secreted forms of neurexin IIIα as predicted from the sequences of the cDNAs and of the PCR products. Similar to neurexins Iα and IIα, neurexin IIIα contains an N-terminal signal peptide (SP) followed by three long repeats (labeled A, B, and C). The three repeats are composed of central EGF-like domains flanked by right and left arms (boxes labeled a and b). The right and left arms of each repeat are weakly homologous to each other and to the G-domain repeats of laminin A and the corresponding sequence repeats found in merosin, agrin, perlecan, and slit (2). After the repeats, neurexin IIIα contains a threonine- and serine-rich sequence typical of O-linked carbohydrate attachment sites (labeled CHO) followed by a transmembrane region and a cytoplasmic C terminus. The diagram shows the four principal protein forms resulting from the extensive C-terminal splice site variants (Fig. 3): two membrane-bound forms (Upper) corresponding to splice variants I to L and two secreted forms (Lower) corresponding to variants A to H.

After the signal peptide, neurexin IIIα, like the other neurexins, contains three overall repeats (labeled A, B, and C in Fig. 4). These three repeats are composed of a central EGF domain flanked by right and left arms that are weakly homologous to each other. The repeats are followed by a threonine- and serine-rich sequence that probably represents an O-linked sugar domain. At the C terminus, neurexin IIIα contains a single transmembrane region and a short cytoplasmic tail.

Analysis of the sequences of the three neurexins demonstrates a patchy distribution of identical and divergent sequences. In the repeat region, the three α-neurexins show the least homology to each other in the left arm of the first overall repeat (33% identity) and the most homology in the right arm of the first overall repeat (83% identity). The remaining left and right arms are between 56% and 70% identical. In some regions, blocks of identical residues in all three neurexins flank completely dissimilar sequences. The C termini of the neurexins are among the most conserved, with the transmembrane region being identical in all neurexins. Recent experiments have demonstrated that the α-latrotoxin receptor and the neurexins interact with synaptotagmin, a synaptic vesicle membrane protein, and that the cytoplasmic C-terminal domains of neurexins bind to synaptotagmin in vitro (14, 15). The high degree of sequence identity in the cytoplasmic domains of neurexins supports a functional role for these interactions.

Data bank searches indicate that the repeats of neurexin IIIα have a similar degree of homology to agrin (4, 5), slit (6), laminin A (3), merosin (16), and perlecan (17-19), as was observed for neurexins Iα and IIα (2). In particular, those residues tend to be conserved among the neurexins, agrin, laminin A, slit, and perlecan that are also conserved among repeats within a neurexin (data not shown). This finding suggests that these repeats serve as protein modules with related structures and functions. Since at least some of the domains corresponding to the neurexin repeats appear to be involved in cell signaling and cell attachment in agrin and laminin A, a possible function of this protein domain is that of a cell-interaction domain.

Most of the alternatively spliced sequences in the neurexins are located in conserved regions in the three repeats and insert or delete small sequences. The exception to this observation is the highly polymorphic splice site in neurexin IIIα that contains at least 12 variants. This site is located between the O-linked sugar domain and the transmembrane region. Several of the splice variants at this alternatively spliced position contain in-frame stop codons, suggesting the synthesis of secreted forms of neurexin IIIα (Fig. 2). These secreted forms contain all repeats of neurexin IIIα and the O-linked sugar region but end in hydrophilic C-terminal sequences of variable length (Fig. 4). Secreted forms of neurexins would have the potential to serve as soluble signaling molecules analogous to agrin, which has a role in synaptogenesis and is also alternatively spliced (4, 5, 20). The presence of secreted and membrane-bound forms of neurexins increases the number of potential signaling roles of these molecules.

Expression of Neurexin IIIα. RNA blots of various rat tissues were hybridized with probes from neurexin IIIα to determine its tissue distribution (Fig. 5). Like the other neurexins, neurexin IIIα was found to be expressed exclusively in brain. Even prolonged exposure failed to visualize a hybridization signal in nonneuronal tissues (data not shown). RNA from the adrenal gland, which expresses many otherwise neuron-specific genes, also showed no expression of neurexin IIIα. In the three brain regions tested (forebrain, cerebellum, and spinal cord), transcripts of different sizes were observed whose levels varied among the brain regions (Fig. 5). It is likely that the different RNA transcripts observed correspond to splice variants of neurexin IIIα and possibly to an as yet unidentified neurexin IIIβ.

FIG. 5. Expression of neurexin IIIα in rat tissues by RNA blot analysis. Total RNA from various tissues (10 µg per lane) was denatured, electrophoresed, and blotted. The blot was hybridized with a uniformly 32P-labeled DNA probe from the 3' end of neurexin IIIα (A). After exposure (1 day at -70°C with screen), the blot was rehybridized with a cyclophilin probe to control for RNA loads (B). Numbers on the right indicate the positions of RNA size markers. Lanes: 1, PC12 cells; 2, brain; 3, cerebellum; 4, spinal cord; 5, adrenal; 6, intestine; 7, kidney; 8, liver; 9, spleen; 10, testis.

CONCLUSION

Neurexins I and II are cell surface proteins that were originally identified by the sequence identity of one of their members with the high molecular weight component of the α-latrotoxin receptor (2, 3). We have now determined the full-length structure, alternative splicing, and expression of neurexin IIIα, a third member of the neurexin gene family. Neurexin IIIα is highly homologous to neurexins Iα and IIα and is also expressed in a brain-specific manner. Neurexins Iα, IIα, and IIIα have the same overall domain structure and homologies to cell-attachment proteins (such as laminin A and agrin). Furthermore, neurexin IIIα is also subject to extensive alternative splicing with the potential to generate at least 96 forms. However, neurexin IIIα differs from other neurexins in that alternative splicing produces truncated and membrane-bound forms, suggesting that some forms of neurexin IIIα may be secreted.

Four positions of alternative splicing in the neurexin IIIα mRNA were observed by cDNA cloning and PCRs. Interestingly, neurexin Iα, neurexin IIα, or both are also alternatively spliced at each of these positions (2), suggesting that the alternative splicing is conserved among neurexins. Most sites of alternative splicing interrupt sequence regions that are highly conserved among neurexins, supporting the notion that these alternative splicing events are functionally important. Except for the C-terminal splice site, all proposed alternative splicing events consist of in-frame deletions/insertions. Some of the inserts in the C-terminal splice site of neurexin IIIα contain in-frame stop codons, a property different from those of other neurexins. At least 12 splice variants were found at this position in neurexin IIIα, suggesting that the total number of neurexin transcripts could be much larger than those currently known.

Neurexins are neuronal cell surface proteins whose structures suggest that they serve as receptors. The homologies of neurexins to laminin A, agrin, perlecan, and slit raise the possibility that neurexins may participate in cell-cell interactions. One unusual feature of the neurexins is their high degree of polymorphism. The finding of a third gene for neurexins indicates that the neurexins may constitute a large and conserved gene family. The conservation of alternative splicing among all neurexins suggests that this alternative splicing may be functionally important and could represent a mechanism by which different neurons display different combinations of these molecules on their surface. The presence of potentially secreted forms of neurexin IIIα suggests that these molecules may be incorporated into the extracellular matrix or may themselves represent signaling molecules.

We thank I. Leznicki and A. Hopkins for excellent technical assistance, and Drs. M. S. Brown and J. L. Goldstein for advice. This work was supported by the Howard Hughes Medical Institute and the Perot Family Foundation.

1. Sheperd, G. M., ed. (1990) The Synaptic Function of the Brain (Oxford Univ. Press, New York), 3rd Ed.
2. Ushkaryov, Y. A., Petrenko, A. G., Geppert, M. & Südhof, T. C. (1992) Science 257, 50-56.
3. Sasaki, M., Kleinman, H. K., Huber, H., Deutzmann, R. & Yamada, Y. (1988) J. Biol. Chem. 263, 16536-16544.
4. Rupp, F., Payan, D. G., Magill-Solc, C., Cowan, D. M. & Scheller, R. H. (1991) Neuron 6, 811-823.
5. Ruegg, M. A., Tsim, K. W. K., Horton, S. E., Kroger, S., Escher, G., Gensch, E. M. & McMahon, U. J. (1992) Neuron 8, 691-699.
6. Rothberg, J. M., Jacobs, J., Goodman, C. S. & Artavanis-Tsakonas, S. (1990) Genes Dev. 4, 2169-2187.
7. Petrenko, A. B., Lazaryeva, V. D., Geppert, M., Tarasyuk, T. A., Moomaw, C., Khokhlatchev, A. D., Ushkaryov, Y. A., Slaughter, C., Nasimov, I. V. & Südhof, T. C. (1993) J. Biol. Chem. 268, 1860-1867.
8. Südhof, T. C., Lottspeich, F., Greengard, P., Mehl, E. & Jahn, R. (1987) Science 238, 1142-1144.
9. Sanger, F., Nicklen, S. & Coulson, A. R. (1977) Proc. Natl. Acad. Sci. USA 87, 4509-4513.
10. Sambrook, J., Fritsch, E. F. & Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Lab. Press, Plainview, NY), 2nd Ed.
11. Zorn, A. M. & Krieg, P. M. (1991) Biotechniques 11, 181-183.
12. Kozak, M. (1991) J. Cell Biol. 115, 887-903.
13. von Heijne, G. (1986) Nucleic Acids Res. 14, 4683-4690.
14. Petrenko, A. G., Perin, M. S., Davletov, B. A., Ushkaryov, Y. A., Geppert, M. & Südhof, T. C. (1991) Nature (London) 353, 65-68.
15. Hata, Y., Davletov, B., Petrenko, A. G., Jahn, R. & Südhof, T. C. (1993) Neuron 10, 307-315.
16. Ehrig, K., Leivo, I., Argraves, W. S., Ruoslahti, E. & Engvall, E. (1990) Proc. Natl. Acad. Sci. USA 87, 3264-3268.
17. Noonan, D. M., Fulle, A., Valente, P., Cai, S., Horigan, E., Sasaki, M., Yamada, Y. & Hassel, J. R. (1991) J. Biol. Chem. 266, 22939-22947.
18. Kallunki, P. & Tryggvason, K. (1992) J. Cell Biol. 116, 559-571.
19. Murdoch, A. D., Dodge, G. R., Cohen, I., Tuan, R. S. & Iozzo, R. V. (1992) J. Biol. Chem. 267, 8544-8855.
20. Ferns, M. J. & Hall, Z. W. (1992) Cell 70, 1-3.