Journal of Cell Science 107, 3591-3600 (1994) Printed in Great Britain © The Company of Biologists Limited 1994

Expression of mouse LSP1/S37 isoforms. S37 is expressed in embryonic mesenchymal cells

V. L. Misener1, C.-c. Hui2, I. A. Malapitan1, M.-E. Ittel1, A. L. Joyner2,3 and J. Jongstra1,*

1The Arthritis Centre-Research Unit, The Toronto Hospital Research Institute and Department of Immunology, University of Toronto, Toronto, Ontario, Canada
2Division of Molecular and Developmental Biology, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto, Ontario, Canada
3Department of Molecular and Medical Genetics, University of Toronto, Toronto, Ontario, Canada
*Author for correspondence at: Toronto Western Hospital, Room 13-419, 399 Bathurst Street, Toronto, Ontario, M5T 2S8 Canada

SUMMARY

Mouse LSP1 is a 330 amino acid intracellular F-actin binding protein expressed in lymphocytes and macrophages but not in non-hematopoietic tissues. A 328 amino acid LSP1-related protein, designated S37, is expressed in murine bone marrow stromal cells, in fibroblasts, and in a myocyte cell line. The two proteins differ only at their N termini, the first 23 amino acid residues of LSP1 being replaced by 21 different residues in S37. The presence of different amino termini suggests that the LSP1 and S37 proteins are encoded by transcripts arising through alternative exon splicing. Here we report the genomic organization of the Lsp1 gene and show that the distinct N termini of LSP1 and S37 are encoded by two alternatively used exons, each containing a translational start codon. We also demonstrate that alternative 3′ acceptor sites are used in the splicing of exon 5. This results in LSP1 and S37 transcripts that either do or do not contain 18 bp encoding the 6 amino acids HLIRHQ of the acidic domain. Therefore, the Lsp1 gene encodes four protein isoforms: full-length LSP1 and S37, designated LSP1-I and S37-I, and the same proteins without the HLIRHQ sequence, designated LSP1-II and S37-II. By in situ hybridization analysis we show that the S37 isoforms are expressed in mesenchymal tissue, but not in adjacent epithelial tissue, of several developing organs during mouse embryogenesis. This, together with our finding that S37 is an F-actin binding protein, suggests that S37 is a cytoskeletal protein of mesenchymal cells, which may play a role in mesenchyme-induced epithelial differentiation during organogenesis.

Key words: LSP1, S37, gene structure, expression, embryonic mesenchyme, F-actin binding
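The isoform arithmetic stated in the summary can be checked with a short sketch. It assumes only the figures given above: 330 and 328 residues for full-length LSP1 and S37, and an in-frame 18 bp segment encoding HLIRHQ. The names `FULL_LENGTH_AA` and `isoform_lengths` are illustrative, and the lengths derived for the -II forms follow from that arithmetic rather than from values reported in the text.

```python
# Minimal sketch: enumerate the four Lsp1 isoforms named in the summary.
# Full-length sizes are taken from the text; everything else is derived
# under the assumption that the 18 bp segment is removed in frame.
FULL_LENGTH_AA = {"LSP1": 330, "S37": 328}
SEGMENT_BP = 18             # first 18 bp of exon 5
SEGMENT_PEPTIDE = "HLIRHQ"  # residues encoded by that segment


def isoform_lengths(full_length=FULL_LENGTH_AA, segment_bp=SEGMENT_BP):
    """Return predicted lengths (amino acids) of the -I and -II isoforms."""
    if segment_bp % 3 != 0:
        raise ValueError("segment is not a whole number of codons")
    residues_lost = segment_bp // 3
    lengths = {}
    for name, length in full_length.items():
        lengths[f"{name}-I"] = length                   # contains HLIRHQ
        lengths[f"{name}-II"] = length - residues_lost  # lacks HLIRHQ
    return lengths


if __name__ == "__main__":
    for isoform, length in isoform_lengths().items():
        print(f"{isoform}: {length} aa")
```

Under these assumptions the two -II isoforms are each six residues shorter than the corresponding -I form, which matches the 18 bp (six codon) difference described above.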

INTRODUCTION

Mouse LSP1 (Jongstra et al., 1988a), also designated pp52 (Gimble et al., 1993), is a 330 amino acid intracellular phosphoprotein expressed in normal B-cells and transformed B-cell lines, in normal T-cells and non-transformed functional T-cell lines, but not in transformed T-cell lines (Jongstra et al., 1988a; Klein et al., 1989). Human LSP1 has a similar expression pattern (Jongstra et al., 1988a) and, in addition, has been found to be expressed in human neutrophils (Howard et al., 1994). A comparison of the mouse and human LSP1 cDNA sequences predicts that the LSP1 proteins have a two-domain structure with an amino-terminal domain rich in acidic residues and a carboxy-terminal domain rich in basic residues. The basic domains of the mouse and human LSP1 proteins are 85% identical, suggesting that the basic domain of LSP1 is of particular importance for LSP1 function (Jongstra-Bilen et al., 1990).

Intracellular fractionation studies showed that a significant fraction of the intracellular LSP1 protein associates with the detergent-insoluble residue after lysis of B-lymphoma cells or normal spleen cells with the non-ionic detergent NP-40 (Jongstra-Bilen et al., 1992; M.-E. Ittel, unpublished). Furthermore, we showed that upon treatment of B-lymphoma cell lines with anti-immunoglobulin (Ig) antibody to induce capping of membrane Ig molecules, the intracellular LSP1 protein aggregates directly underneath the clusters of membrane IgM (Klein et al., 1990). These results suggest that LSP1 protein is associated with the cytoskeleton. Our finding that mouse LSP1 protein binds to filamentous actin (F-actin) with a Kd of 0.2 µM, but does not bind to globular actin (G-actin) suggests that LSP1 protein associates with the cytoskeleton by binding directly to the F-actin-containing microfilaments. We also showed that both F-actin binding in vitro and cytoskeletal binding in vivo occur through the conserved basic domain (Jongstra-Bilen et al., 1992).

Mouse LSP1 protein is encoded by a 1.6 kb mRNA transcript expressed in lymphoid cells, lymphoma cell lines (Jongstra et al., 1988a; Klein et al., 1989) and transformed macrophage cell lines (this paper). Expression is undetectable in several other myeloid cell types tested, including peripheral blood granulocytes, erythroleukemia cell lines and the mastocytoma cell line P815. In addition, expression of mouse LSP1 RNA is not detected in a variety of non-hematopoietic tissues such as adult liver, brain and kidney (Jongstra et al., 1988a). Recently, an LSP1-related cDNA, designated S37, was isolated from the murine bone marrow stromal cell line BMS2. These cells express a 2 kb RNA transcript that hybridizes with the LSP1 cDNA and which is also detected in the fibroblast cell lines NIH3T3 and L-cells, and in the myocyte cell line G7 (Gimble et al., 1993). The nucleotide sequence of the stromal cell-derived S37 clone predicts that it encodes a 328 amino acid protein that is identical to the LSP1 protein expressed in lymphocytes and macrophages, except that the 23 amino-terminal residues of LSP1 are replaced by a different amino terminus of 21 amino acid residues in S37.

The F-actin binding property and the cytoskeletal localization of LSP1 protein led us to suggest that this protein is involved in the proper functioning of the immune system, in which it is uniquely expressed, through its involvement in cytoskeleton-dependent aspects of biological processes such as signal transduction, cellular adhesion or cell motility. As a prelude to generating an LSP1-deficient mouse strain by targeted disruption of the Lsp1 gene in embryonic stem cells, we determined the genomic organization of the Lsp1 gene and investigated its expression in the developing mouse embryo. The genomic organization of the Lsp1 gene demonstrates that the different N termini of the LSP1 and S37 proteins are encoded by separate exons. Further isoform diversity is created by alternative splicing of exon 5, which encodes amino acid residues in the acidic domain. In addition, we extend the findings of previous expression studies (Jongstra et al., 1988a; Klein et al., 1989), and show that transcripts encoding LSP1 are expressed in macrophage cell lines and that transcripts encoding S37 are expressed in certain mesenchymal tissues during mouse embryonic development.

MATERIALS AND METHODS

Cell lines
The macrophage cell lines J774A.1 and IC-21 were purchased from the ATCC and cultured according to the instructions provided. B-lymphoma cell lines BAL17 and WEHI-231 were obtained from M. M. Davis, Stanford University. The T-lymphoma cell line BW5147 was obtained from N. Hozumi, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto. The Abelson murine leukemia virus-transformed pre-B cell line CB32 was obtained from G. Wu, The Wellesley Hospital Research Institute, Toronto. All B, T and pre-B cell lines were grown in RPMI-1640 supplemented with 10% FCS as described (Klein et al., 1989).

Preparation of recombinant LSP1 and S37 protein
Recombinant LSP1 (rLSP1) protein was prepared and purified from Escherichia coli BL21(DE3) using the prokaryotic expression vector pET-3c as described (Klein et al., 1989). Recombinant S37 (rS37) protein was prepared similarly after amplification of the coding sequence of the cDNA clone C-3 (see below) and cloning of the amplified product in pET-3c.

F-actin binding assay
Binding of rLSP1 and rS37 proteins to F-actin was determined by a co-centrifugation assay performed essentially as described (Jongstra-Bilen et al., 1992), except that polymerization reactions were done at room temperature for 1 hour and contained 4 µM G-actin (from rabbit skeletal muscle, a gift from Dr P. A. Janmey, Brigham and Women's Hospital, Boston) and 2 µM recombinant protein.

Oligonucleotides
Sequences of oligonucleotides used in this study are given below. Nucleotides represented in upper case letters correspond to either sense or anti-sense Lsp1 sequences. Those in lower case letters are nonhomologous sequences, incorporating restriction sites designed to facilitate the subcloning of amplified fragments:
A1 (sense): 5′-ATGGCGGAGGCTGCCATCGATCCCAGA-3′;
A2 (anti-sense): 5′-aggtcgaCAGCTCAACGGTGTCTTCTA-3′;
A17 (sense): 5′-CCTGCAGCATGAATGGCCCCGCACTCC-3′;
JJ24 (sense): 5′-aggtcgacAACAGAGGTGCTGGGCCAGA-3′;
JJ25 (anti-sense): 5′-aggtcgaCTCAGCAGCTTCTCCAGGC-3′;
JJ53 (sense): 5′-agggatccAGAGAAACACCAGGAGCC-3′;
JJ54 (anti-sense): 5′-aggtcgacCCTTCGCTGTCTTCTGCA-3′.

RNA isolation and northern blot analysis
Isolation of poly(A)+ RNA from mouse embryos and isolation of total cytoplasmic RNA from cell lines were carried out as described (Jongstra et al., 1988b; Chomczynski and Sacchi, 1987; Kingston, 1987). Samples of poly(A)+ RNA (10 µg) or of total cytoplasmic RNA (10 µg) were subjected to agarose/formaldehyde gel electrophoresis, transferred to nitrocellulose and hybridized with 32P-labelled DNA probes (Jongstra et al., 1988b).

In situ hybridization analysis
In situ hybridization analysis of fixed and acetylated mouse embryo sections with the full-length LSP1 probe was carried out essentially as described (Hui and Joyner, 1993). In experiments using the S37-specific Emb 5′ probe the hybridization and wash conditions were modified as described below. The Emb 5′ probe contains 106 bp of 5′ untranslated sequence and 62 bp of N-terminal coding sequence of S37 RNA. Sense and anti-sense RNA probes were prepared by in vitro transcription of DNA templates in the cloning vector pBluescript II KS(+), using either T3 or T7 RNA polymerase. The Emb 5′ probe was hybridized overnight at 50°C. Modified post-hybridization washes were as follows: slides were immersed at 65°C for 10 minutes in washing buffer (50% formamide, 2× SSC, 0.1% β-mercaptoethanol), rinsed three times (10 minutes each) at 37°C in NTE (0.5 M NaCl, 10 mM Tris-HCl, 5 mM EDTA, pH 7.5), treated for 30 minutes at 37°C with 20 µg/ml RNase A in NTE, and immersed for a further 15 minutes in NTE. Slides were then washed sequentially in washing buffer at 65°C for 5 minutes, 2× SSC at 37°C for 10 minutes and 0.1× SSC at 37°C for 10 minutes, followed by dehydration. Slides were coated with Kodak NTB 2 emulsion and exposed at 4°C for 4-6 weeks. The slides were then developed in Kodak D19 and stained with Toluidine Blue.

RT-PCR analysis
Poly(A)+ or total cytoplasmic RNA was heated at 65°C for 5 minutes in a volume of 11 µl and cooled on ice. First-strand cDNA synthesis was primed with the anti-sense oligonucleotide A2. The reverse transcription (RT) mix consisting of 1 µl (40 units) RNAguard (Pharmacia, Montreal, Quebec), 2 µl of a dNTP mix (2 mM each dATP, dCTP, dGTP and dTTP), 1 µl of A2 primer (20 ng), 4 µl of 5× RT buffer (250 mM Tris-HCl, pH 8.3, 250 mM KCl, 60 mM MgCl2 and 50 mM DTT) and 0.5 µl (12 units) of AMV reverse transcriptase (Boehringer Mannheim, Montreal, Quebec) was added to the RNA. After incubation at 42°C for 1 hour the mixture was heated at 90°C for 2 minutes and cooled on ice. PCR amplification of reaction products was carried out using the A2 primer in combination with either the A1 or A17 sense oligonucleotide. Radiolabelled A2 was included in the PCR reaction (1/50th of total A2), so that reaction

products could be visualized by autoradiography. Radiolabelled A2 was prepared by mixing 50 ng A2 with 7.5 µl [γ-32P]ATP (75 µCi, ~3000 Ci/mmol, Amersham Corp., Oakville, Ontario) and 10 units T4 polynucleotide kinase (Pharmacia) in a final volume of 15 µl 1× One-Phor-AllPlus Buffer (Pharmacia). Incubation was carried out at 37°C for 1 hour, followed by 5 minutes at 55°C. To the first-strand RT mix was added: 2.5 ng 32P-labelled A2, 122.5 ng unlabelled A2, 125 ng A1 or A17, 10 µl 10× PCR buffer (100 mM Tris-HCl, pH 8.3, 500 mM KCl, 20 mM MgCl2 and 0.1% gelatin) and water, to give a final volume of 100 µl. After heating the mixture at 94°C for 2 minutes followed by brief cooling on ice, 5 units of Taq polymerase (Boehringer Mannheim) was added. Samples were mixed and overlaid with 100 µl mineral oil. PCR was carried out for 20 cycles on a Perkin Elmer Thermocycler, each cycle consisting of 1.5 minutes at 94°C, 1.5 minutes at 60°C and 2 minutes at 72°C. Samples (10 µl) of final RT-PCR products were electrophoresed on 4% polyacrylamide gels and visualized by autoradiography of dried gels. Densitometric analysis of autoradiographs was done using a Model GS-670 Imaging Densitometer (Bio-Rad Laboratories, Mississauga, Ontario).

Isolation and characterization of S37 cDNA
A λgt10 cDNA library made from RNA extracted from 11.5-day-old mouse embryos, purchased from Clontech Laboratories, Inc., Palo Alto, CA (cat. no. ML1027a), was screened for LSP1 sequences using 32P-labelled LSP1 cDNA as a probe, according to the instructions provided. The 1.2 kb insert of one of two positive clones, designated C-3, was subcloned into pBluescript II KS(+) and sequenced.

Genomic characterization
Four overlapping genomic clones spanning the Lsp1 gene were isolated by plaque hybridization from a Lambda DASH (Stratagene Cloning Systems, La Jolla, CA) genomic library derived from the 129/Sv mouse strain, a gift from A. Reaume and J. Rossant, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto. The intron/exon structure of the Lsp1 gene was determined by restriction fragment hybridization mapping of the genomic clones and nucleotide sequencing of appropriate fragments.

RESULTS

Expression of the Lsp1 gene during mouse embryonic development
Expression of the Lsp1 gene during mouse embryonic development was analysed by northern blot hybridization of embryonic mouse RNA using LSP1 cDNA as a probe. As shown in lane 3 of Fig. 1, there appear to be two different transcripts present in RNA from 17.5-day-old (d17.5) embryos, 1.6 kb and 2 kb in size. Both transcripts are also detectable in RNA from d14.5 and d16.5 embryos (not shown). Neither transcript is present in the LSP1-negative T-lymphoma cell line BW5147 (lane 1). We noted that the 1.6 kb transcript is similar in size to the LSP1 transcript in the B-lymphoma cell line WEHI-231 (lane 2) and that the size of the larger transcript, 2 kb, is the same as that which has been reported previously for the LSP1-related bone marrow-derived transcript, S37 (Gimble et al., 1993). In order to confirm the identity of the 1.6 kb transcript we used an LSP1-specific probe, which was generated in the following manner. Preliminary mapping experiments have indicated that the initiation site for the LSP1 transcript is located approximately 50 bp upstream of the translational start codon (unpublished). On the basis of genomic sequence data we designed a pair of primers that would allow us to amplify a 121 bp DNA fragment corresponding to the unique 5′ end of the LSP1 transcript. Thus, using WEHI-231 RNA as a template for primer pair JJ53/JJ54, we generated by RT-PCR an LSP1-specific probe consisting of 68 bp of LSP1 5′ coding sequence preceded by 53 bp of 5′ untranslated sequence. As expected, this LSP1-specific probe, designated Lsp5′, hybridizes with the 1.6 kb LSP1 transcript in WEHI-231 (lane 4). Furthermore, Lsp5′ hybridizes with the 1.6 kb transcript but not with the 2 kb transcript in embryonic RNA (lane 5), thus identifying the 1.6 kb transcript present in mouse embryo RNA as LSP1, presumably expressed in fetal lymphoid organs.

To determine the nature of the 2 kb transcript we screened a d11.5 embryonic mouse cDNA library with the LSP1 cDNA and isolated a 1.2 kb clone, designated C-3. DNA sequence analysis of C-3 revealed that its predicted amino acid sequence is identical to that of the S37 cDNA clone isolated from the bone marrow stromal cell line BMS2 (Gimble et al., 1993), except for the absence of 18 bp encoding the amino acids HLIRHQ in its acidic domain. The absence of this 18 bp segment is addressed below in more detail. To determine whether the 1.2 kb embryonic cDNA is derived from the 2 kb transcript, we used the C-3 cDNA as a template with primer pair JJ24/JJ25, to generate by PCR a 168 bp DNA fragment consisting of the unique 62 bp 5′ coding sequence preceded by 106 bp of 5′ untranslated sequence. Used as a probe for northern blots this S37-specific fragment, designated Emb5′, hybridizes with the 2 kb transcript present in the embryonic RNA (Fig. 1, lane 7) but not with the 1.6 kb LSP1 mRNA species (lanes 6 and 7), confirming that the embryonic 2 kb transcript encodes S37 protein as defined by its unique amino terminus. Furthermore, the 18 bp deletion in clone C-3 suggests that at least some proportion of the embryonically expressed S37 transcripts lack this segment, which would otherwise encode 6 amino acids of the acidic domain.

Fig. 1. Expression of a 2 kb S37 transcript in embryonic RNA. A 10 µg sample of poly(A)+ RNA from d17.5 mouse embryo (lanes 3, 5 and 7) and 10 µg total cytoplasmic RNA from the B-lymphoma line WEHI-231 (lanes 2, 4 and 6) and the T-lymphoma line BW5147 (lane 1) was subjected to agarose/formaldehyde gel electrophoresis and analysed by northern blot hybridization using the following 32P-labelled DNA fragments as probes: full-length mouse LSP1 cDNA (lanes 1-3), LSP1-specific Lsp5′ fragment (lanes 4 and 5) and S37-specific Emb 5′ fragment (lanes 6 and 7). The 2 kb S37 and 1.6 kb LSP1 transcripts are indicated by open and filled arrowheads, respectively. Rehybridization of the filter with a tubulin probe indicated similar loading of RNA in lanes 1 and 2 (not shown). RNA molecular mass markers are indicated in kb.

To determine in what tissues of the mouse embryo S37 transcripts are expressed, we performed in situ hybridization analysis of mouse embryo sections, using sense and anti-sense

Fig. 2. In situ hybridization analysis of Lsp1 during mouse embryonic development. Bright-field illumination (A,C,E and G), and dark-field illumination (B,D,F and H) of sagittal sections through d10.5 (A and B), d12.5 (C and D), d13.5 (E and F) and d16.5 (G and H) mouse embryos, hybridized with an anti-sense LSP1 RNA probe. m, mesonephros; k, kidney; u, urogenital sinus; te, testis; p, penis; tr, trachea; lu, lung; e, esophagus; i, intestine; to, tooth; h, hair follicle; li, liver; th, thymus; he, heart.

RNA probes prepared by in vitro transcription of either the full-length LSP1 cDNA or the Emb 5′ DNA fragment. Embryo sections probed with anti-sense LSP1 RNA are shown in Fig. 2. As expected, at d16.5 of gestation the LSP1 probe hybridized strongly with the thymus, which is a major site of lymphopoiesis at this stage of development (Fig. 2H). However, we were unable to detect hybridization in the fetal liver. This is likely due to the fact that hematopoietic cells are dispersed throughout the liver and collectively represent only a small percentage of the total number of cells that constitute this organ.

Hybridization was also detected in several non-hematopoietic tissues at various stages of embryonic development. These included the mesonephros at d10.5 (Fig. 2B), as well as the developing respiratory, digestive and urogenital systems at later stages of development. In the respiratory tract, hybridization was evident in the lung at d12.5 (Fig. 2D). Expression was easily detectable in the trachea at d13.5, whereas hybridization in the lung was diminished at this stage (Fig. 2F), and was undetectable at d16.5 (Fig. 2H). In the digestive tract, signals were easily detectable in the esophagus and intestine (Fig. 2D, F and H) and stomach (not shown), at all stages tested from d12.5 onwards. In the urogenital system, both the developing kidney and the urogenital sinus (which gives rise to later structures of the urogenital system) were positive at d12.5 (Fig. 2D) as was the kidney at d13.5 (Fig. 2F). Expression in the urogenital system continued through to d16.5. At d16.5 strong signals were detected in the kidney, penis, testis and bladder, and in the degenerating mesonephros (H). In addition, hybridization signals were detected at sites of tooth and hair follicle development (H). To determine whether signals observed in non-hematopoietic tissues were due to the expression of S37 transcripts, d10.5, d12.5 and d13.5 embryo sections were probed with anti-sense Emb 5′ RNA. Hybridization signals generated with this S37-specific probe exhibited a similar distribution in non-hematopoietic tissues (Fig. 3 and data not shown), demonstrating that the S37 isoform is expressed in non-hematopoietic tissues of the developing mouse embryo. At present, we cannot exclude the possibility that LSP1 is also expressed in non-hematopoietic tissues, since we cannot

prepare an LSP1-specific RNA probe that is long enough to be useful in our in situ hybridization analysis.

Development of the urogenital, respiratory and digestive organs, as well as teeth and hair follicles, is governed by specific interactions between the developing epithelia and underlying mesenchymal cells, through a process known as secondary induction. Thus we were interested to know if the S37 expression that we observe in these developing systems is confined to a particular cell type. Individual organs in d12.5 embryo sections probed with the S37-specific anti-sense Emb 5′ probe are shown in Fig. 3. Under bright-field illumination, the epithelial cell layer that forms the tubules of the kidney (Fig. 3A), the bronchial structures of the lung (Fig. 3C) and the mucosal lining of the stomach (Fig. 3E), is seen as a densely staining cell layer surrounded by more loosely organized mesenchymal cells. Dark-field illumination of the corresponding sections (Fig. 3B, D and F) shows that hybridization signals in the kidney, lung and stomach are confined to the mesenchymal tissue and are not present in the developing epithelia. Hybridization was specific, as judged by the absence of any signal in the kidney, lung or stomach, when the sense probe was used (Fig. 3G and H). Thus, we conclude that S37 is expressed in mesenchymal tissues adjacent to developing epithelia, in several organs of the embryonic mouse.

Fig. 3. In situ hybridization analysis of S37 expression in the embryonal kidney, lung and stomach. Bright-field illumination (A,C,E and G), and dark-field illumination (B,D,F and H) of the kidney (A and B), lung (C and D) and stomach (E and F) from a d12.5 mouse embryo sagittal section hybridized with the anti-sense Emb 5′ RNA probe. The epithelial cell layers in A, C and E are indicated by arrowheads. The same organs of an embryo section probed with sense Emb 5′ RNA are shown in G and H. k, kidney; lu, lung; s, stomach; li, liver.

S37 is an F-actin binding protein
The binding of LSP1 to F-actin in vitro or to the cytoskeleton in vivo is mediated by sequences in the basic domain (Jongstra-Bilen et al., 1992). Since the basic domain is contained within the sequence common to the LSP1 and S37 proteins, we predicted that S37 is also an F-actin binding protein. To test this we expressed LSP1 and S37 as recombinant proteins in Escherichia coli and compared their ability to bind to F-actin. Fig. 4 shows that both recombinant proteins are soluble under the polymerization conditions used in our assay. Essentially all rLSP1 or rS37 protein remains in the supernatant after centrifugation (lanes 3 and 4). However, when G-actin is included in the polymerization reactions, approximately 50% of rLSP1 or rS37 co-sediments with polymerized actin (lanes 1 and 2). These results show that rS37 can bind to F-actin in vitro, to a similar extent to rLSP1.

Fig. 4. S37 is an F-actin binding protein. F-actin binding assay of rLSP1 (A) and rS37 (B). Equal fractions of the supernatants (S) and pellets (P) were analysed by SDS-PAGE and staining with Coomassie Blue. Numbers at left represent protein molecular mass markers in kDa.

Genomic organization and alternative exon usage of the mouse Lsp1 gene
We have determined the genomic organization of the mouse Lsp1 gene. Four overlapping genomic clones that span the Lsp1 gene, A1-16, A1-6, S350-2 and LSP1-2 (Fig. 5), were isolated from a 129/Sv mouse genomic library and characterized. Clone LSP1-2 was isolated by screening the library with the LSP1 cDNA probe. Restriction fragment hybridization analysis using a series of oligonucleotide probes corresponding to segments of coding sequence along the LSP1 cDNA molecule (not shown), indicated that LSP1-2 contains all but the unique 5′ end of the LSP1 coding sequence. To locate the remaining upstream sequence the library was re-screened using oligonucleotide A1, a probe specific for the unique 5′ end of LSP1. None of the A1+ clones that we isolated overlapped with the 5′ end of LSP1-2. The library was therefore screened a final time using a 0.3 kb BamHI-SacI fragment isolated from the 5′ end of LSP1-2. One of the resulting clones, designated S350-2, which hybridized with this 0.3 kb BamHI-SacI fragment, also hybridized with a 0.6 kb XbaI-EcoRI fragment isolated from the 3′ end of A1-6, thus bridging the gap between the upstream sequences contained in clone A1-6 and the downstream sequences contained in clone LSP1-2.

The intron/exon structure of the Lsp1 gene was determined using a combination of restriction fragment hybridization and nucleotide sequence analysis of the overlapping genomic clones. An exon map of Lsp1 is shown in Fig. 5. The gene encompasses approximately 32 kb. Exon 1 encodes the amino terminus of the LSP1 protein. Exons 2-11 constitute the sequences that are shared by the LSP1 and S37 isoforms. Exon 10 contains the translational stop codon TAG, and is followed approximately 4 kb downstream by exon 11, which is entirely non-coding and contains the polyadenylation signal sequence. As shown in Table 1, sequences adjacent to the 5′ and 3′ ends of exons 2-10 are compatible with the consensus sequences for acceptor and donor splice junctions, respectively (Mount, 1982), as is the sequence adjacent to the 5′ end of exon 11.

Table 1. Summary of Lsp1 exon sizes and splice sites

Exon   Size (bp)       3′SS (acceptor)            5′SS (donor)
Consensus:             (Py)nNPyAG       ------    agGTAAGT
1      >68                                        ggGTGAGT
1′     521                                        agGTAGGT
2      147             TCCTCCAGgc       ------    ctGTAAGC
3      159             CCTTGCAGca       ------    agGCAAGT
4      97              TCCCATAGtt       ------    agGTGAGA
5      93          (a) ATTCTTAGca       ------    aaGTAGGT
                   (b) GTCATCAGgt       ------    same
6      44              CTCCACAGct       ------    agGTGTGT
7      82              ATCCCTAGca       ------    agGTGAAG
8      135             CTGTACAGtc       ------    agGTAGGA
9      78              GCCCCCAGga       ------    agGTAGGT
10     107             CCTAACAGag       ------    tgGTGAGT
11     369             ATGTCCAGga

Intron sequences are shown in upper case letters. Exons are represented by lower case letters connected by a broken line. (a) and (b) represent alternative 3′ splice sites of exon 5. The size of exon 1′ is based on mapping of the S37 transcriptional initiation site (Gimble et al., 1993). The exact size of exon 1 has not been determined.

Fig. 5. Genomic organization and differential exon utilization of the Lsp1 gene. Intron/exon structure of the mouse Lsp1 gene. Exon 1 (shaded bar) and exon 1′ (dotted bar) are alternatively used exons. Exons 1 and 1′ encode the amino termini of the LSP1 and S37 proteins, respectively. Filled bars represent exons 2-11, which are common to both isoforms. The first 18 bp of exon 5 are represented by the open bar. The alternative 3′ acceptor splice site provided by this 18 bp segment is illustrated below the map. Shown above the map are the four overlapping genomic clones, A1-16, A1-6, S350-2 and LSP1-2, from which the genomic structure was determined. Distances are indicated in kb. B, BamHI restriction sites.

To locate the DNA sequence corresponding to the unique 5′ end of S37 we began by probing the A1+ clones with oligonucleotide A17, which is specific for the unique 5′ sequence of S37. Only clone A1-6 hybridized with A17. Restriction fragment hybridization and nucleotide sequence analysis of this clone identified exon 1′, which encodes the unique amino terminus of the S37 protein. The alternatively used exons, 1 and 1′, are situated well upstream of exon 2 and are separated from each other by approximately 11 kb. Sequences adjacent to their 3′ ends conform to the consensus donor splice site sequence (Table 1).

We noted that the combined nucleotide sequence of exons 1-11 differed at 13 positions from the LSP1 cDNA sequence previously derived from the pre-B cell line 220.2 (Jongstra et al., 1988a). Six differences are located within the coding region, of which five result in amino acid changes. However, when we re-sequenced the coding region of the original 220.2 cDNA clone we noted no differences from the genomic sequence. Therefore, the LSP1 sequence originally published contained several incorrectly assigned nucleotides. The amended predicted amino acid sequence is in accordance with that reported for the pp52 protein (Gimble et al., 1993).
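The comparison of the Table 1 junctions with the consensus can be illustrated with a small sketch. It assumes the sequences exactly as they appear in Table 1 of this text (intron bases in upper case, exon bases in lower case); the names `TABLE1`, `intron_part` and `boundary_dinucleotides` are hypothetical, and the check covers only the intron-terminal dinucleotides (AG at acceptors, GT at donors), not the full consensus of Mount (1982).

```python
# Minimal sketch: tabulate the intron-terminal dinucleotides of Table 1.
# Rows are (exon, acceptor, donor) as transcribed in this text; None marks
# a junction that the table does not list. "5a"/"5b" are the alternative
# acceptor sites (a) and (b) of exon 5, which share one donor ("same").
TABLE1 = [
    ("1", None, "ggGTGAGT"),
    ("1'", None, "agGTAGGT"),
    ("2", "TCCTCCAGgc", "ctGTAAGC"),
    ("3", "CCTTGCAGca", "agGCAAGT"),
    ("4", "TCCCATAGtt", "agGTGAGA"),
    ("5a", "ATTCTTAGca", "aaGTAGGT"),
    ("5b", "GTCATCAGgt", "aaGTAGGT"),
    ("6", "CTCCACAGct", "agGTGTGT"),
    ("7", "ATCCCTAGca", "agGTGAAG"),
    ("8", "CTGTACAGtc", "agGTAGGA"),
    ("9", "GCCCCCAGga", "agGTAGGT"),
    ("10", "CCTAACAGag", "tgGTGAGT"),
    ("11", "ATGTCCAGga", None),
]


def intron_part(site):
    """Keep only the upper-case (intron) bases of a junction string."""
    return "".join(base for base in site if base.isupper())


def boundary_dinucleotides(rows=TABLE1):
    """Return ({exon: last two intron bases at the acceptor},
               {exon: first two intron bases at the donor})."""
    acceptors = {exon: intron_part(acc)[-2:] for exon, acc, _ in rows if acc}
    donors = {exon: intron_part(don)[:2] for exon, _, don in rows if don}
    return acceptors, donors


if __name__ == "__main__":
    acceptors, donors = boundary_dinucleotides()
    print("acceptors:", acceptors)
    print("donors:   ", donors)
```

With the table as transcribed here, every listed acceptor ends in AG and every listed donor begins with GT except the one following exon 3, which begins with GC. The sketch cannot tell whether that single difference reflects the gene itself or the transcription of the table.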
Use of alternative exon 5 acceptor splice sites generates additional LSP1/S37 isoforms
We had expected that the 18 bp segment, which is absent in the embryonic S37 cDNA clone C-3, would be a separate differentially used exon of the Lsp1 gene. On the contrary, this segment constitutes the first 18 bp of exon 5. Comparison of the 18 bp sequence with the consensus sequence for acceptor splice sites (see Table 1) shows that it may provide an alternative 3′ splice site. As shown in Fig. 5, the 18 bp sequence ends in an AG dinucleotide with 10 out of the 16 preceding nucleotides being pyrimidines. Splicing of exon 5 at this site would be expected to give rise to mRNA transcripts lacking these 18 nucleotides. To determine if this is a rare splicing event we measured the proportion of S37 transcripts lacking the 18 nucleotides in total mouse embryo RNA, using a semi-quantitative reverse transcriptase-PCR (RT-PCR) assay. Serial dilutions of mouse embryo RNA were subjected to RT-PCR using the primer pair A17/A2, which flanks the insertion site of the 18 bp segment and which can amplify only those sequences containing exon 1′. Radiolabelled A2 was included in the PCR reaction, allowing products to be visualized by autoradiography. An example of the results obtained in this type of RT-PCR analysis is shown in Fig. 6. Two DNA fragments were obtained in the reaction (Fig. 6C, lanes 3-7). The sizes of these fragments are consistent with either the presence or absence of the 18 bp segment, using PCR-amplified cDNA fragments as size standards for comparison (lanes 1, 2). Furthermore, the presence or absence of the 18 bp segment was confirmed by nucleotide sequence analysis of RT-PCR products (not shown). To determine the ratio of the intensities of the bands representing the +18 and −18 bp forms we first determined whether twofold serial dilutions of input embryonic RNA yielded similarly diminishing signals. Fig. 6B shows the densitometric analysis of the upper, 18 bp-containing fragments in lanes 3-7 of Fig. 6C, which shows that over the range of dilutions tested the observed signal is directly proportional to the amount of input RNA. Linearity of the signals allowed a comparison of band intensities between lanes. For example, by comparing lanes 4 and 6, it is evident that the intensity of the lower band in lane 4 is similar to that of the upper band in lane 6, which was generated from 4 times less RNA. Therefore, assuming that both the +18 and −18 bp RNA species are assayed with the same efficiency, it appears that the larger RNA species containing the first 18 bp of exon 5 is approximately 4-fold more abundant than the smaller RNA species lacking these 18 bp. As expected, no products were obtained from the T-lymphoma BW5147 RNA (lane 8), or from embryonic RNA when reverse transcriptase was omitted (lane 9).

Fig. 6. Alternative exon 5 acceptor splice site utilization. (A) Diagrammatic comparison of LSP1 and S37 mRNA species showing primer pairs used for RT-PCR analysis. The thin lines represent sequence that is common between LSP1 and S37, while the filled and open bars represent the unique 5′ sequences. Primer pairs A1/A2 and A17/A2, which flank the 18 bp insertion site, are specific for LSP1 and S37 sequences, respectively. (B) Densitometric analysis of +18 bp S37-derived RT-PCR products generated from mouse embryo RNA (upper bands in C, lanes 3-7). Intensities are expressed as a percentage of the band intensity generated from 200 ng of input RNA. (C) Semi-quantitative RT-PCR analysis to determine the relative abundance of +18 bp vs −18 bp S37 transcripts in embryonic mRNA. RNA from d16.5 mouse embryos was serially diluted in total cytoplasmic RNA from the T-lymphoma BW5147, to give a total of 2 µg of RNA in each sample. Twofold dilutions of input embryonic RNA ranging from 200 ng to 12.5 ng were subjected to RT-PCR using the A17/A2 primer pair (lanes 3-7). Radiolabelled A2 was included in the reactions and products were visualized by autoradiography. Negative controls were 2 µg BW5147 RNA alone (lane 8) and 200 ng embryonic RNA without reverse transcriptase (lane 9). DNA fragments of known size (lanes 1 and 2) were generated by PCR amplification of LSP1 and S37 cDNA templates using primer pairs A1/A2 and A17/A2, respectively. Additional size markers (in bp) are provided by 32P-labelled φX174 RF DNA HaeIII fragments (lane 10).

To address whether LSP1 mRNA also exists in two forms differing with respect to the 18 bp segment of exon 5, we employed RT-PCR with the LSP1-specific primer pair A1/A2, to analyse RNA from mouse embryos and from a panel of LSP1-expressing cell lines. The results of this analysis are shown in Fig. 7. Embryonic RNA (lane 4) and RNA from the B-lymphoma cell lines WEHI-231 and BAL17 and from the pre-B cell line CB32 (lanes 6, 8 and 10) all gave rise to two DNA fragments that were similar in size to the markers in lanes 2 and 3, consistent with the expression of both +18 bp and −18 bp LSP1 transcripts. As expected, when the S37-specific A17/A2 primer pair was used, products were obtained from the embryonic RNA (lane 5) but not from RNA of any of the B-lineage cell lines (lanes 7, 9 and 11). On the basis of our previous finding that the monomyelocytic cell line WEHI-3 expresses a 1.6 kb LSP1 transcript (Jongstra et al., 1988a), we also analysed RNA from several macrophage cell lines. The RT-PCR analysis of macrophage lines IC-21 and J774A.1

demonstrates that these cells also express both +18 and −18 bp forms of LSP1 mRNA (Fig. 7, lanes 12 and 14) but do not express S37 (lanes 13 and 15). We conclude that alternative exon 5 acceptor splice sites are utilized in the generation of both LSP1 and S37 mRNA. Results in Fig. 7 show that the ratios of the intensities of the upper and lower bands are variable. Whether this reflects cell-specific differences in utilization of the splice sites in exon 5 or post-transcriptional differences remains to be determined.

Fig. 7. Expression of +18 bp and −18 bp LSP1 mRNA. RT-PCR analysis of RNA from B-lymphoma, pre-B and macrophage cell lines (2 µg total cytoplasmic RNA) and RNA from d16.5 mouse embryo (100 ng poly(A)+ RNA combined with 1.9 µg total cytoplasmic RNA from T-lymphoma BW5147). Primer pairs used were the LSP1-specific A1/A2 (even numbered lanes) and S37-specific A17/A2 (odd numbered lanes). Size standards in lanes 1-3 are as described in the legend to Fig. 6C.

On the basis of the present findings, alternative splicing pathways for Lsp1 exons give rise to four protein isoforms as summarized in Fig. 8. Alternative utilization of exons 1 and 1′ generates LSP1 and S37 transcripts, respectively. The use of alternative exon 5 acceptor splice sites yields transcripts that contain either exon 5ab (+18 bp) or exon 5b (−18 bp). The resulting proteins are designated LSP1-I or II and S37-I or II, respectively.

Fig. 8. Summary of LSP1/S37 protein products of the Lsp1 gene. Continuous lines represent amino acid sequences that are common to all isoforms. Amino acid sequences that differ between the four proteins are shown.

DISCUSSION

We have determined that the S37 product of the Lsp1 gene is abundantly expressed in several non-hematopoietic tissues during mouse embryonic development. We have detected S37 transcripts in the mesenchymal tissue of the developing urogenital, respiratory and digestive systems, as well as at sites of hair follicle and tooth development. These findings are intriguing, since the development of these specialized structures is known to depend on signals from the mesenchyme, which induce the appropriate cytodifferentiation and morphogenesis of the adjacent epithelia (Birchmeier and Birchmeier, 1993). Since S37 binds F-actin in vitro, to a similar extent as the cytoskeletal protein LSP1, we put forward the hypothesis that S37 is a cytoskeletal protein of mesenchymal cells that plays a role in epithelial-mesenchymal interactions during embryonic development.

The molecular mechanisms by which mesenchymal cells transduce signals to developing epithelia are only beginning to be elucidated. Direct heterotypic cell contact as well as signals transduced via mesenchyme-derived soluble factors and matrix components are likely to contribute to epithelial-mesenchymal interactions in vivo (Lehtonen, 1975; Slavkin and Bringas, 1976; Montesano et al., 1991a,b; Sonnenberg et al., 1993; Birchmeier and Birchmeier, 1993; Sakakura, 1991).

We can postulate different ways in which S37 may play a role in mesenchymal-epithelial interactions. One possibility is that S37 is involved in heterotypic cell adhesion. For example, it might function as an accessory protein that binds to the intracellular domain of some surface adhesion molecule(s), thereby providing a link with the cytoskeleton. Such a role for S37 would be analogous to the role of talin, which links the integrin LFA-1 to the cytoskeleton in T-lymphocytes (Kupfer et al., 1990). In this manner S37 may contribute to the establishment and/or maintenance of cell contacts mediated through adhesion molecules. Alternatively, S37 may play a role in the secretion of mesenchyme-derived soluble factors or ECM-associated molecules required by the developing epithelium, analogous to the role that has been proposed for the neuronally expressed synaptic vesicle-associated phosphoproteins synapsin I and II. In addition to associating with the cytoplasmic face of synaptic vesicles, these proteins can bind and bundle actin filaments and are thought to be involved in regulating the availability of synaptic vesicles for exocytosis, by reversibly tethering them to the actin-based microfilaments of the cytoskeleton (Greengard et al., 1993). It is conceivable that secretion of factors from mesenchymal cells might be similarly regulated by the microfilament-associated S37 protein.

We might also consider the possibility that, rather than functioning as a component of the molecular machinery by which mesenchymal cells transmit inductive signals, S37 may be involved in another important aspect of mesenchymal cell physiology, its conversion into visceral smooth muscle. The smooth muscle present in the walls of the respiratory, digestive and urogenital tracts is derived embryologically from its respective mesenchyme and occurs via a process in which mesenchymal cells retract their processes and gradually become fusiform in shape (Gould, 1973). Remodelling of cell structure is known to be highly dependent on the dynamics of the cytoskeleton and its associated proteins (Stossel, 1993; Bretscher, 1993). Thus it is possible that S37, as a cytoskeletal protein of mesenchymal cells, plays a specific role in the morphological changes that these cells must undergo as they develop into smooth muscle.

The Lsp1 gene appears to be one of only a few eukaryotic genes at present known to encode protein isoforms originating from two different translational start sites contained in alternatively used exons. Two examples of this unusual gene structure are the gene encoding the myosin light chain proteins MLC1 and MLC3 (Nabeshima et al., 1984; Robert et al., 1984) and the gene encoding the cytoskeletal protein caldesmon (Hayashi et al., 1992). MLC1 and MLC3 share a common 141 amino acid residue carboxy-terminal sequence but have unique amino-terminal sequences of 49 and 8 amino acid residues, respectively. The two mRNA molecules encoding MLC1 and MLC3 are transcribed from different promoters (Strehler et al., 1985). Similarly, two isoforms of caldesmon protein have been identified that differ at their amino termini. Although the different amino-terminal sequences are encoded by two alternatively used exons it has not yet been determined if expression of the different caldesmon transcripts is regulated by different promoters or by a single promoter contained within a common untranslated exon. In the case of Lsp1, identification of promoter elements that regulate the expression of LSP1 and S37 transcripts has not been reported. However, since the transcriptional initiation site of S37 has been mapped as the start of exon 1′ (Gimble et al., 1993), we presume that these transcripts are regulated by separate promoters.

Several genes including Lck (Takadera et al., 1989) and the α-amylase gene Amy-1a (Schibler et al., 1983) are known to express different mRNA species transcribed from alternative promoters. However, since the alternative transcripts contain a common translational initiation codon they do not give rise to different protein isoforms. The presence of alternative promoters is thought to allow for greater flexibility in the regulation of gene expression. In the case of the genes encoding the cytoskeletal proteins MLC1/3, caldesmon and LSP1/S37, the presence of two promoters combined with alternative translational start sites would allow not only regulation of protein expression at the transcriptional level, but also regulation of protein function through the inclusion of different amino-terminal ends. Additionally, both caldesmon and LSP1/S37 proteins are subject to further isoform diversity through alternative splicing of internal exons. This leads to the existence of four protein isoforms derived from the Lsp1 gene and five protein isoforms derived from the caldesmon gene. The functional significance of the MLC1/3, caldesmon and LSP1/S37 isoforms is unknown. However, we may speculate that the generation of protein isoforms through differential exon utilization, combined with the regulation of RNA transcription via alternative promoters, is a means by which cytoskeletal proteins involved in the regulation of ubiquitous cytoskeleton-dependent biological processes, such as contraction, motility, adhesion or vesicular transport, can adapt efficiently to the specific requirements of the different cell types in which they are expressed.

We thank Dr J. Jongstra-Bilen for performing the initial in situ hybridization experiment, Mr G. Beebakee and Ms N. King-Trickey for technical help and Mr K. Harpal for assistance with histology. This work was funded by grants from the National Cancer Institute of Canada to J.J. and A.J., and by a grant from the Medical Research Council of Canada (MRC) to A.J. who is an MRC Scientist and International Scholar of the Howard Hughes Medical Institute. C.-c.H. was supported by a postdoctoral fellowship from the MRC. M.-E.I. was a visiting scientist from the Centre National de la Recherche Scientifique of France.

REFERENCES

Birchmeier, C. and Birchmeier, W. (1993). Molecular aspects of mesenchymal-epithelial interactions. Annu. Rev. Cell Biol. 9, 511-540.
Bretscher, A. (1993). Microfilaments and membranes. Curr. Opin. Cell Biol. 5, 653-660.
Chomczynski, P. and Sacchi, N. (1987). Single-step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction. Anal. Biochem. 162, 156-159.
Gimble, J. M., Dorheim, M.-A., Youkhana, K., Hudson, J., Nead, M., Gilly, M., Wood, Jr., W. J., Hermanson, G. G., Kuehl, M., Wall, R. and Kincade, P. W. (1993). Alternatively spliced pp52 mRNA in nonlymphoid stromal cells. J. Immunol. 150, 115-121.
Gould, R. P. (1973). The microanatomy of muscle. In The Structure and Function of Muscle, 2nd edn, vol. 2. (ed. G. H. Bourne). Academic Press, New York.
Greengard, P., Valtorta, F., Czernik, A. J. and Benfenati, F. (1993). Synaptic vesicle phosphoproteins and regulation of synaptic function. Science 259, 780-785.
Hayashi, K., Yano, H., Hashida, T., Takeuchi, R., Takeda, O., Asada, K., Takahashi, E.-I., Kato, I. and Sobue, K. (1992). Genomic structure of the human caldesmon gene. Proc. Nat. Acad. Sci. USA 89, 12122-12126.
Howard, T., Li, Y., Torres, M., Guerrero, A. and Coates, T. (1994). The 47-kD protein increased in neutrophil actin dysfunction with 47- and 89-kD protein abnormalities is lymphocyte-specific protein 1. Blood 83, 231-241.
Hui, C.-c. and Joyner, A. L. (1993). A mouse model of Greig cephalopolysyndactyly syndrome: the extra-toesJ mutation contains an intragenic deletion of the Gli3 gene. Nature Genet. 3, 241-246.
Jongstra, J., Tidmarsh, G. F., Jongstra-Bilen, J. and Davis, M. M. (1988a). A new lymphocyte-specific gene which encodes a putative Ca2+-binding protein is not expressed in transformed T-lymphocyte lines. J. Immunol. 141, 3999-4004.
Jongstra, J., Jongstra-Bilen, J., Tidmarsh, G. F. and Davis, M. M. (1988b). The in vitro translation product of the murine λ5 gene contains a functional signal peptide. Mol. Immunol. 25, 687-693.
Jongstra-Bilen, J., Young, A. J., Chong, R. and Jongstra, J. (1990). Human and mouse LSP1 genes code for highly conserved phosphoproteins. J. Immunol. 144, 1104-1110.
Jongstra-Bilen, J., Janmey, P. A., Hartwig, J. H., Galea, S. and Jongstra, J. (1992). The lymphocyte-specific protein LSP1 binds to F-actin and to the cytoskeleton through its COOH-terminal basic domain. J. Cell Biol. 118, 1443-1453.
Kingston, R. E. (1987). Preparation of poly(A)+ RNA. In Current Protocols in Molecular Biology, vol. 1. Greene Publishing Associates and Wiley-Interscience, New York.
Klein, D. P., Jongstra-Bilen, J., Ogryzlo, K., Chong, R. and Jongstra, J. (1989). Lymphocyte-specific Ca2+-binding protein LSP1 is associated with the cytoplasmic face of the plasma membrane. Mol. Cell. Biol. 9, 3043-3048.
Klein, D. P., Galea, S. and Jongstra, J. (1990). The lymphocyte-specific protein LSP1 is associated with the cytoskeleton and co-caps with membrane IgM. J. Immunol. 145, 2967-2973.
Kupfer, A., Burn, P. and Singer, S. J. (1990). The PMA-induced specific association of LFA-1 and talin in intact cloned T helper cells. J. Mol. Cell. Immunol. 4, 317-325.
Lehtonen, E. (1975). Epithelio-mesenchymal interface during mouse kidney tubule induction in vivo. J. Embryol. Exp. Morph. 34, 695-705.
Montesano, R., Schaller, G. and Orci, L. (1991a). Induction of epithelial tubular morphogenesis in vitro by fibroblast-derived soluble factors. Cell 66, 697-711.
Montesano, R., Matsumoto, K., Nakamura, T. and Orci, L. (1991b). Identification of a fibroblast-derived epithelial morphogen as hepatocyte growth factor. Cell 67, 901-908.
Mount, S. M. (1982). A catalogue of splice junction sequences. Nucl. Acids Res. 10, 459-472.
Nabeshima, Y.-I., Fujii-Kuriyama, Y., Muramatsu, M. and Ogata, K. (1984). Alternative transcription and two modes of splicing result in two myosin light chains from one gene. Nature 308, 333-338.
Robert, B., Daubas, P., Akimenko, M.-A., Guenet, J.-L. and Buckingham, M. (1984). A single locus in the mouse encodes both myosin light chains 1 and 3, a second locus corresponds to a related pseudogene. Cell 39, 129-140.
Sakakura, T. (1991). New aspects of stroma-parenchyma relations in mammary gland differentiation. Int. Rev. Cytol. 125, 165-202.
Schibler, U., Hagenbuechle, O., Wellauer, P. and Pittet, A. (1983). Two promoters of different strengths control the transcription of the mouse α-amylase gene Amy-1a in the parotid gland and the liver. Cell 33, 501-508.
Slavkin, H. C. and Bringas, Jr., P. (1976). Epithelial-mesenchyme interactions during odontogenesis. IV. Morphological evidence for direct heterotypic cell-cell contacts. Dev. Biol. 50, 428-442.
Sonnenberg, E., Meyer, D., Weidner, K. M. and Birchmeier, C. (1993). Scatter factor/hepatocyte growth factor and its receptor, the c-met tyrosine kinase, can mediate a signal exchange between mesenchyme and epithelia during mouse development. J. Cell Biol. 123, 223-235.
Stossel, T. P. (1993). On the crawling of animal cells. Science 260, 1086-1094.
Strehler, E. E., Periasamy, M., Strehler-Page, M.-A. and Nadal-Ginard, B. (1985). Myosin light-chain 1 and 3 gene has two structurally distinct and differentially regulated promoters evolving at different rates. Mol. Cell. Biol. 5, 3168-3182.
Takadera, T., Leung, S., Gernone, A., Koga, Y., Takihara, Y., Miyamoto, N. G. and Mak, T. W. (1989). Structure of the two promoters of the human lck gene: differential accumulation of two classes of lck transcripts in T cells. Mol. Cell. Biol. 9, 2173-2180.

(Received 7 March 1994 - Accepted 19 August 1994)
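
Note on the semi-quantitative RT-PCR estimate. The band-matching argument used in the Results (the lower band in lane 4 is similar in intensity to the upper band in lane 6) can be written out as a small worked calculation. The sketch below is illustrative only and is not part of the original analysis: the function names are hypothetical, the lane order (lane 3 = 200 ng down to lane 7 = 12.5 ng) is an assumption taken from the legend to Fig. 6C, and the calculation relies on the two conditions stated in the text, namely that signal is proportional to input RNA and that both RNA species are assayed with the same efficiency.

```python
def lane_inputs(first_lane, start_ng, n_lanes, fold=2.0):
    """Map lane number -> ng of input embryonic RNA for a serial dilution.

    Assumes (hypothetically) that the most concentrated sample is loaded
    in `first_lane` and each following lane is diluted `fold`-fold.
    """
    return {first_lane + i: start_ng / fold ** i for i in range(n_lanes)}


def fold_abundance(inputs, lane_minor_band, lane_major_band):
    """Estimate how much more abundant the major species is.

    If the minor-species band in one lane matches the major-species band
    in a more dilute lane, and signal is linear with input, the abundance
    ratio equals the ratio of the two input amounts.
    """
    return inputs[lane_minor_band] / inputs[lane_major_band]


# Twofold dilutions from 200 ng to 12.5 ng in lanes 3-7 (Fig. 6C legend).
inputs = lane_inputs(first_lane=3, start_ng=200.0, n_lanes=5)

# Lower (-18 bp) band in lane 4 matches upper (+18 bp) band in lane 6.
ratio = fold_abundance(inputs, lane_minor_band=4, lane_major_band=6)
print(f"+18 bp form is ~{ratio:g}-fold more abundant than the -18 bp form")
```

Under these assumptions the ratio depends only on how many dilution steps separate the two matched lanes, which is why the linearity check in Fig. 6B is a prerequisite for the comparison.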