Proc. Natl. Acad. Sci. USA
Vol. 88, pp. 10811-10815, December 1991
Cell Biology

Structure of the human villin gene
(actin-binding protein/gene evolution/intestinal differentiation)

ERIC PRINGAULT, SYLVIE ROBINE, AND DANIEL LOUVARD

Unité de Biologie des Membranes, Centre National de la Recherche Scientifique, Unité Associée 1149, Département de Biologie Moléculaire, Institut Pasteur, 25 rue du Dr. Roux, 75015 Paris, France

Communicated by François Jacob, August 26, 1991 (received for review July 12, 1991)

ABSTRACT   We have isolated and characterized the complete human villin gene. The villin gene is located on chromosome 2q35-36 in humans and on chromosome 1 in mice. Villin belongs to a family of calcium-regulated actin-binding proteins that share structural and functional homologies. The villin gene is expressed mainly in cells that develop a brush border, such as mucosal cells of the small and large intestine and epithelial cells of the kidney proximal tubules. Villin gene expression is strictly regulated during adult life and embryonic development in the digestive and urogenital tracts and, thus, may be used as a marker of the digestive and renal cell lineages. The human villin gene has one copy per haploid genome, encompasses about 25 kilobases, and contains 19 exons. Analysis of the structural organization of this gene shows that the two mRNAs that encode villin in humans arise by alternative choice of one of the two polyadenylylation signals located within the last exon. The overall organization of the exons reflects the gene duplication event from which this family of actin-binding proteins originated.

The intestinal mucosa provides an attractive model with which to study the physiological role and the regulation of expression of tissue-specific genes. Arising from stem cells located in the intestinal crypts, immature precursors of enterocytes migrate toward the tips of villi, gradually acquiring their differentiated phenotype. This crypt/villus spatiotemporal gradient of differentiation allows us to establish correlations between the expression of specific proteins and the morphological maturation of enterocytes. One of these proteins, villin, is of particular interest since (i) it belongs to a family of actin-binding proteins that are modulated by calcium (1-6), (ii) it plays a key role in the assembly of the brush border (7), (iii) it is expressed in immature precursors of enterocytes (8-12), and (iv) it displays a strict tissue-specific distribution (8). We previously isolated and characterized a complete cDNA clone coding for human villin (13). The primary structure of villin displays a large duplicated domain common to other actin-severing proteins and a specific additional domain necessary for bundling.

Expression of villin mRNA has been studied in various tissues from different species (14). In humans, there are two villin mRNA transcripts; the characterization of the corresponding cDNAs has shown that these two mRNAs differ by an extension of 800 bases in the 3' noncoding region (13). The villin gene maps to human chromosome 2q35-36 and mouse chromosome 1 and belongs to a cluster of genes conserved between the two species (15).

Here we report the cloning and characterization of the complete human villin gene. There is one copy of the villin gene per human haploid genome. This gene encompasses about 25 kilobases (kb) and contains 19 exons. The general distribution of exons and intervening sequences reflects the structural and functional organization of the protein sequence. We suggest that the two transcripts observed in humans arise from a common pre-mRNA by the random choice of one of two polyadenylylation signals located in the last exon.

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

Abbreviation: nt, nucleotide(s).

MATERIALS AND METHODS

Southern Blots. Human genomic DNA was purified from lymphocytes and digested to completion with various restriction enzymes, subjected to electrophoresis, and blotted onto nitrocellulose membranes as described by Southern (16). Blots were then hybridized with three different villin cDNA probes labeled with [α-32P]dCTP by nick-translation.

Genomic Library Screening. A genomic library (a gift of H. Lehrach, European Molecular Biology Laboratory, Heidelberg), constructed in phage λ by cloning partially Sau3AI-digested human lymphocyte genomic DNA into the BamHI sites of the EMBL3 vector (17), was screened with a mixture of overlapping villin cDNA probes spanning the entire length of the mRNA (3.5 kb). Positive clones were then rescreened separately with each probe. Two overlapping clones were selected for further analysis: EMBL3-85e contains the 5' end of the gene, and EMBL3-30a contains the 3' end.

DNA Amplification (PCR). Amplification of fragments of the villin gene was performed using the Perkin-Elmer/Cetus thermal cycler and cloned Taq polymerase (Cetus). The temperature cycle was 92°C, 1 min; 55°C, 1 min; 72°C, 3 min (with an increase in elongation time of 2 sec per cycle). The number of cycles was 30-40 depending on the experiment.

DNA Sequencing. Exons and the junctions between exons and intervening sequences were sequenced (18) directly on DNA purified from phage EMBL3-85e or -30a. Specific oligonucleotide primers chosen from the coding sequence of the villin cDNA were used. The complete villin cDNA sequence has already been published (13) and has been deposited in the EMBL/GenBank data base (HSVILLR).

S1 Nuclease Mapping. A 380-base single-stranded DNA probe spanning the 5' end of the villin mRNA was generated by synthesizing the noncoding strand of a genomic restriction fragment subcloned in the bacteriophage M13 mp18 with Klenow enzyme, from an oligonucleotide primer located in the first exon of the gene. This probe was labeled by incorporating [α-32P]dCTP during synthesis and purified by electrophoresis in a 6% polyacrylamide gel containing 7 M urea. An aliquot of the probe (10,000 cpm) was hybridized at 48°C for 16 hr in the presence of 10 µg of total RNA from either HeLa cells or HT-29 differentiated cells, in 80% formamide/40 mM Pipes/400 mM NaCl/1 mM EDTA with pH adjusted to 6.4. The heteroduplex hybrids were then digested for 1 hr by S1 nuclease (Pharmacia) at concentrations ranging from 0 to 1000 units/ml in 30 mM NaOAc, pH 4.4/280 mM NaCl/4.5 mM Zn(OAc)2 containing salmon

sperm DNA at 50 µg/ml. Protected fragments were analyzed in an 8% polyacrylamide gel containing 7 M urea.

RESULTS AND DISCUSSION

Human Haploid Genome Contains One Copy of the Villin Gene. Previous in situ hybridization experiments with a villin probe on human chromosomes have provided evidence for the existence of a single locus for the villin gene, on chromosome 2, sub-band q35-36 (15). As there are two villin mRNAs in humans, we used Southern hybridization to examine the number of villin gene copies present at this locus.

Total genomic human DNA was digested with various restriction enzymes (EcoRI, BamHI, Bgl II, or combinations), blotted onto nitrocellulose (16), and hybridized with a probe spanning the 5' end of the villin gene from nucleotide (nt) -300 to nt +80 (Fig. 1). This probe contains no restriction sites for the enzymes used. Autoradiography revealed only one hybridizing restriction fragment for each enzyme tested. The experiment was repeated with two different villin cDNA probes, corresponding either to the amino terminus or to the carboxyl terminus of the molecule, giving similar results (data not shown). These data demonstrate that there is only one copy of the villin gene per human haploid genome and that the two mRNAs are therefore transcribed from the same gene.

Isolation of the Human Villin Gene. The previously isolated human villin cDNA (13) was used to screen a bacteriophage library of human genomic DNA in the vector EMBL3 (17). After about 4 human genome equivalents were screened with a mixture of overlapping probes covering the entire villin cDNA, 12 positive phages were isolated and purified. These positive phages were subjected to restriction enzyme digestion and Southern blotting using either 5' or 3' cDNA probes (data not shown). Two phages containing overlapping inserts, EMBL3-85e and EMBL3-30a, corresponding to the 5' and the 3' regions of the cDNA, respectively, were analyzed in detail. EMBL3-85e contained an insert of about 15 kb, of which 2 kb was 5' flanking sequence and 13 kb was from the 5' region of the villin gene. EMBL3-30a contained an insert of 19 kb, consisting of 13 kb of the 3' region of the villin gene and about 6 kb of 3' flanking sequence.

Structure of the Human Villin Gene. PCR DNA amplification combined with DNA sequence analysis allowed us to elucidate the exon/intron structure of the villin gene. DNA amplification from genomic clones using primers located in the cDNA allows one to test for the presence or absence of introns by comparing the length of the amplified fragment with the distance separating the two primers on the cDNA. The difference in size between these fragments should represent the size of the intervening sequence. The exact position of each exon/intron boundary was then determined by sequencing the entire coding sequence and a portion of the flanking introns directly from phage DNA containing the villin gene, using coding oligonucleotides as primers. Exon sequences determined from the villin genomic fragments were consistently identical to the villin cDNA sequence previously obtained from a human colon cell line (13).

Fig. 2 summarizes the results obtained using this strategy and shows amplified DNA fragments obtained using primers bordering introns 13 and 14. Genomic DNA was amplified using a 5' coding primer beginning at nt 1604 and a 3' noncoding primer ending at nt 1749 of the villin cDNA. The predicted length of coding sequence between these two primers is 145 bp whereas the size of the corresponding genomic fragment was about 850 bp. The difference of about 700 bp represents the length of intervening sequences present between nt 1604 and 1749 of the cDNA. To determine whether these intervening sequences were distributed in one or more introns, the coding sequence and exon/intron boundaries were elucidated in this region of the genomic clone. We found that the intervening sequence of 700 bp represents a single intron located between nt 1702 and 1703 of the cDNA (numbered as intron 13 once the entire gene structure was determined). When the same 5' primer was used with a different 3' primer ending at nt 1849 on the cDNA (predicting a coding sequence of 245 bp), the size of the amplified genomic fragment was about 950 bp, confirming that the length of intervening sequence in this region is about 700 bp and indicating that there are no introns between nt 1749 and 1849. Another example is shown in lane 3, where the predicted length of the coding sequence is 200 bp whereas the primers used amplified a genomic fragment of about 1600 bp, suggesting an intervening sequence of about 1400 bp between nt 1794 and nt 1994. Sequence analysis of the genomic fragment

FIG. 1. Hybridization analysis of genomic fragments from the human villin gene. Aliquots of human genomic DNA (15 µg) were digested to completion with EcoRI (lane 1), BamHI (lane 2), Bgl II (lane 3), EcoRI/BamHI (lane 4), and EcoRI/Bgl II (lane 5). Restriction fragments were fractionated in a 0.8% agarose gel, denatured, transferred onto nitrocellulose, and hybridized with a partial villin cDNA probe (300 bp) labeled with [α-32P]dCTP by nick-translation.

FIG. 2. Mapping of intervening sequences in the human villin gene by genomic DNA amplification (PCR). Villin gene fragments were amplified from 1 ng of EMBL3-30a phage DNA, using several pairs of primers from villin cDNA, by the Taq polymerase chain reaction. Results of only three amplification experiments are shown. λ DNA digests (Boehringer Mannheim) were used as molecular size markers. C, no-DNA control. Lanes 1, amplification with 5' primer beginning at nt 1604 and 3' primer ending at nt 1749 of the cDNA; lanes 2, amplification with 5' primer beginning at nt 1604 and 3' primer ending at nt 1849 of the cDNA; lanes 3, amplification with 5' primer beginning at nt 1794 and 3' primer ending at nt 1994 of the mRNA.
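The arithmetic behind the PCR mapping shown in Fig. 2 can be sketched in a few lines of Python. This is an illustration for the reader, not part of the original analysis: the function name `intron_length` and the `LANES` layout are assumptions, while the primer coordinates and approximate fragment sizes are those quoted in the text for lanes 1-3. The cDNA distance is taken as the simple difference between the two primer coordinates, the convention that reproduces the 145-, 245-, and 200-bp values given in the text.

```python
# Illustrative sketch (not from the original paper): estimate the total length
# of intervening sequence between two cDNA primers from the size of the
# genomic PCR product. All names here are hypothetical.

def intron_length(cdna_start: int, cdna_end: int, genomic_bp: int) -> int:
    """Return genomic amplicon size minus the primer-to-primer cDNA distance."""
    cdna_bp = cdna_end - cdna_start  # gives 145 bp for primers at nt 1604 and 1749
    if cdna_bp > genomic_bp:
        raise ValueError("genomic fragment shorter than the cDNA distance")
    return genomic_bp - cdna_bp

# lane -> (5' primer start, 3' primer end, approximate genomic fragment in bp)
LANES = {
    1: (1604, 1749, 850),
    2: (1604, 1849, 950),
    3: (1794, 1994, 1600),
}

if __name__ == "__main__":
    for lane, (start, end, genomic) in LANES.items():
        estimate = intron_length(start, end, genomic)
        print(f"lane {lane}: about {estimate} bp of intervening sequence")
```

Lanes 1 and 2 both give roughly 700 bp, which is what one expects if a single intron lies upstream of both 3' primers, and lane 3 gives about 1400 bp. Fragment sizes read from a gel are approximate, so differences of a few base pairs carry no meaning.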

[Figure 3 map not reproduced: exons 1-19 along the gene, the clones EMBL3-85e and EMBL3-30a beneath it, and a scale from 0 to 25 kb.]

FIG. 3. Exon/intron organization of the human villin gene. Exons are represented by solid boxes numbered 1 to 19 from 5' to 3' in the direction of transcription. The connecting line represents intervening sequences. Question marks indicate three intervening sequences for which DNA amplification products of the intron could not be obtained using flanking primers and thus should be longer than 3 kb. The two EMBL3 phages containing villin genomic DNA that were analyzed in detail are represented below the gene.

showed that this intervening sequence represented a single intron located between nt 1848 and nt 1849 of the cDNA (intron 14).

Proceeding along the entire villin gene, we mapped all the intervening sequences by this method. No PCR amplification of fragments longer than 2.5 kb was obtained, and therefore the absence of amplification products suggested the presence of introns of 3 kb or larger. The precise size of these huge introns could not be determined by this method and was not investigated further.

Fig. 3 shows the general exon/intron organization of the human villin gene determined by detailed analysis of the two genomic fragments contained in phage (EMBL3-85e and EMBL3-30a). The villin gene encompasses 25 kb and is interrupted by 18 introns ranging in size from 83 bp to 3 kb or larger, such as introns 6, 8, and 9, which yielded no amplified fragments in PCR using flanking primers. Thus the human villin gene contains a total of 19 exons including 18 short exons of 79-203 bp and one larger exon (no. 19) of 1041 bp (Table 1). Sequences of the exon/intron boundaries are shown in Table 2. In all cases, the first two bases of the introns are GT, and the last two bases are AG, in agreement with the consensus sequence of donor and acceptor splice sites.

Transcription Initiation Site. To map the site of initiation of villin transcription, an antisense genomic probe overlapping the ATG translation start site of the villin gene was hybridized with total RNA from the intestinal cell line Caco-2 and the heteroduplex was digested with S1 nuclease. Protected fragments were then analyzed in an acrylamide gel (data not shown). Control for the specificity of protection was carried out with total RNA isolated from HeLa cells, which do not produce villin mRNA. A single fragment was protected from digestion by total RNA isolated from Caco-2 cells but not by total RNA from HeLa cells. The length of the protected fragment was determined by comparison with a sequence ladder from the genomic fragment, obtained using the primer used to synthesize the probe. This allowed us to determine directly the position of the first protected nucleotide. The single product extended 20 nt upstream of the translation initiation site, to the bold underlined cytosine residue of the sequence CCTTCTCCCCCAGGCTCACTCACCATGACC. Overexposure of the gel did not reveal any additional protected fragments (data not shown).

This experiment cannot exclude the possibility that an intron longer than 300 bp located just upstream from this cytosine residue could be responsible for the length of the observed protected fragment. Primer extension experiments were unsuccessful, possibly due to the presence of secondary structure on the 5' end of the mRNA. This difficulty was previously encountered when cloning the 5' end of the villin cDNA (13) and has also been reported in the case of the gelsolin gene (19). Nevertheless, the presence of the translation start site at this position can be argued to be due to the absence of an AG splice acceptor site just upstream of the cytosine residue, a feature that appears to be the rule in all introns of the villin gene. Moreover, we have isolated and characterized 2 kb of 5' flanking sequence containing putative

Table 1. Exon positions and sizes

Exon   Position in mRNA, nt*   Size, bp
1      1-97                    97
2      98-172                  75
3      173-369                 197
4      370-478                 109
5      479-589                 111
6      590-792                 203
7      793-871                 79
8      872-970                 99
9      971-1124                154
10     1125-1225               101
11     1226-1363               138
12     1364-1522               159
13     1523-1702               180
14     1703-1848               146
15     1849-1993               145
16     1994-2182               189
17     2183-2251               69
18     2252-2392               141
19     2393-3434               1041

*Numbered 5' to 3' in the direction of transcription.

Table 2. Intron sizes and exon/intron boundaries

Intron   Size, bp   5' donor             3' acceptor
1        450        ATC GAG gtga .....   ttag GCC ATG
2        1400       CTG GCT gtga .....   ctag ATC CAC
3        700        CTT GT gtag .....    tcag G ATC
4        150        GGA GAG gtag .....   ctag GTA GAG
5        1000       CTC AGG gtaa .....   gcag GGC ATG
6        >3000?     TAC CA gtga .....    acag T GTG
7        150        CAC GAG gtaa .....   ctag GAC TGT
8        >3000?     GCG CTG gtgt .....   ccag AAC TTC
9        1000       GTG G gtga .....     ctag CC AAA
10       83         GTG CAG gtat .....   ctag GTG TGG
11       600        TGG CAG gtca .....   ccag GGC AGC
12       1750       TAC CAG gtgt .....   gcag GGG GGC
13       700        GGG AAG gtgt .....   ctag GGT TGT
14       1400       AAG AG gtaa .....    gcag A CTA
15       500        GAC CAG gtag .....   atag GTC TTC
16       1500       TGG AGT gtga .....   gcag AAC ACC
17       >3000?     ACT GCT gtga .....   ctag GAG GTC
18       900        AAG GAG gtag .....   ctag GAG CAC

Introns are numbered from 5' to 3' in the direction of transcription. Approximate sizes of introns determined by DNA amplification are shown. Question marks show three intervening sequence sizes for which DNA amplification products of the intron were never obtained using flanking primers, indicating that the intron is longer than 3 kb (see Results). Splice 5' donor and 3' acceptor sites are also shown. Exon sequences are in uppercase letters. Intron sequences are in lowercase letters. Nucleotides corresponding to the splicing consensus site are in bold letters in the original.
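The sizes listed in Table 1 can be checked against the mRNA coordinates with a short script. This is a minimal sketch for the reader's convenience, not part of the original work: the names `EXONS` and `check_exons` are assumptions, the figures are transcribed from Table 1 as it appears here, and coordinates are assumed to be inclusive, so that size = last nt - first nt + 1.

```python
# Illustrative sketch (assumption: inclusive nt coordinates, so
# size = last - first + 1). Values transcribed from Table 1; names hypothetical.

# exon number -> (first nt, last nt, listed size in bp)
EXONS = {
    1: (1, 97, 97),        2: (98, 172, 75),       3: (173, 369, 197),
    4: (370, 478, 109),    5: (479, 589, 111),     6: (590, 792, 203),
    7: (793, 871, 79),     8: (872, 970, 99),      9: (971, 1124, 154),
    10: (1125, 1225, 101), 11: (1226, 1363, 138),  12: (1364, 1522, 159),
    13: (1523, 1702, 180), 14: (1703, 1848, 146),  15: (1849, 1993, 145),
    16: (1994, 2182, 189), 17: (2183, 2251, 69),   18: (2252, 2392, 141),
    19: (2393, 3434, 1041),
}

def check_exons(exons):
    """Return (gaps, mismatches).

    gaps: exons that do not start one nt after the previous exon ends.
    mismatches: (exon, listed size, computed size) where the two differ.
    """
    gaps, mismatches = [], []
    previous_end = 0
    for number in sorted(exons):
        first, last, listed = exons[number]
        if first != previous_end + 1:
            gaps.append(number)
        computed = last - first + 1
        if computed != listed:
            mismatches.append((number, listed, computed))
        previous_end = last
    return gaps, mismatches

if __name__ == "__main__":
    gaps, mismatches = check_exons(EXONS)
    print("coordinate gaps at exons:", gaps)
    print("size mismatches (exon, listed, computed):", mismatches)
```

With the values as transcribed, the exons tile the mRNA without gaps and 18 of the 19 listed sizes agree with the coordinates. Exon 19 computes to 1042 bp against a listed 1041 bp; whether that one-base difference reflects a coordinate convention or a typesetting slip cannot be determined from the text.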

upstream transcription factor-responsive elements that promote in vitro transcription of a reporter gene when transfected into cells expressing villin, but not when transfected into villin-negative cells (unpublished observations).

Two mRNAs from the Villin Gene. Northern blot hybridization studies have revealed the presence of two mRNA transcripts (2.7 and 3.5 kb) encoding villin in humans (14). By cloning and sequencing the corresponding cDNAs, we have demonstrated that this size difference is entirely due to an 800-nt extension of the 3' noncoding region in the larger mRNA and that coding sequences are identical in the two mRNAs (13). Thus, it is likely that the two mRNAs do not encode different villin protein isoforms. The villin gene appears to have a single transcription start site located 21 nt upstream from the ATG initiation codon, suggesting that these two mRNAs arise from a single precursor. The 3' noncoding region of the larger mRNA must contain the two polyadenylylation signals used to generate the two mRNAs.

FIG. 4. Transcription of the two mRNAs from the human villin gene. (Top) Structural organization of villin, represented from the amino terminus to the carboxyl terminus of the protein, showing the large repeated domain (open boxes), homologous to other severing proteins and containing repeated elements (dashed and black boxes), and the specific domain (HP) (see ref. 13). (Middle) The two mRNAs are depicted by horizontal lines. Positions of translation initiation codons (ATG), stop codons (TGA), polyadenylylation signals (AATAA or AATAAA), and poly(A) extensions are indicated. (Bottom) Schematic representation of the structure of the human villin gene. Exons are represented by solid boxes numbered 1 to 19 from 5' to 3' in the direction of transcription. The thin lines connecting one exon to the next represent intervening sequences drawn to scale. Question marks illustrate intervening sequences greater than 3 kb.
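As a rough consistency check on the gene map, the exon sizes of Table 1 and the approximate intron sizes of Table 2 can be summed and compared with the stated span of about 25 kb. The sketch below is illustrative only and is not part of the original analysis: the names `EXON_BP`, `INTRON_BP`, and `minimum_gene_length` are assumptions, and the three introns marked ">3000?" in Table 2 are counted at exactly 3000 bp, so the result is a lower bound.

```python
# Illustrative sketch: lower bound on the transcribed span of the gene from the
# tabulated exon and intron sizes. Names are hypothetical.

# Listed exon sizes in bp, exons 1-19 (Table 1).
EXON_BP = [97, 75, 197, 109, 111, 203, 79, 99, 154, 101,
           138, 159, 180, 146, 145, 189, 69, 141, 1041]

# Approximate intron sizes in bp, introns 1-18 (Table 2). None marks the
# introns for which no PCR product was obtained; this follows Table 2
# (introns 6, 8, and 17), whereas the running text names introns 6, 8, and 9.
INTRON_BP = [450, 1400, 700, 150, 1000, None, 150, None, 1000,
             83, 600, 1750, 700, 1400, 500, 1500, None, 900]

MIN_UNSIZED_BP = 3000  # assumption: count each unsized intron at its 3-kb floor

def minimum_gene_length(exons, introns, floor=MIN_UNSIZED_BP):
    """Sum exon sizes plus intron sizes, substituting `floor` for unsized introns."""
    intron_total = sum(floor if size is None else size for size in introns)
    return sum(exons) + intron_total

if __name__ == "__main__":
    total = minimum_gene_length(EXON_BP, INTRON_BP)
    print(f"minimum transcribed span: {total} bp ({total / 1000:.1f} kb)")
```

Under these assumptions the sum comes to 24,716 bp, in line with the roughly 25 kb reported for the gene; the true span is larger by however much the three unsized introns exceed 3 kb.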

[Figure 5 graphic not reproduced. Recoverable panel labels: (A) F-actin severing; F-actin bundling; severing domain (exons 1-8); link (exon 9); duplicated domain (exons 10-16); bundling; 3' noncoding; polyadenylylation. (B) Severing domain, exons 1-8; duplicated domain, exons 10-16.]

FIG. 5. Intron/exon distribution of the human villin gene compared to structural and functional domains of the protein. (A) Schematic diagrams of the primary structure of villin (Upper) and of the exon distribution in the human gene (Lower) are aligned. In the protein, large open boxes represent the duplicated domains containing internal repeats (hatched and black boxes). The specific carboxyl-terminal domain is represented by a stippled box (HP). Coding exons are represented by light or dark stippled boxes, whereas 3' noncoding sequence in the last exon is shown by an open box. (B) Exon distributions in the two duplicated domains of the villin gene are aligned.

Here we demonstrate that the two polyadenylylation signals are located in the last exon (no. 19) of the villin gene (Fig. 4). The two mRNAs differing in size in the 3' noncoding region may therefore be generated by alternative choice of polyadenylylation signal rather than by alternative splicing of exons. Finally, since the two villin mRNAs are expressed in equimolar ratio in all villin-positive tissues (normal and tumor) so far analyzed, and since preferential expression of one transcript has not been observed with respect to the stage of differentiation of intestinal cells or in any cells able to produce villin (14), it is legitimate to suggest that the site of polyadenylylation is randomly selected. The physiological significance of an alternative 3' noncoding extension of villin mRNA is not known.

Evolution of the Villin Gene. Villin belongs to a family of four known proteins that share F-actin-severing activity and amino acid sequence homologies (13). These are fragmin and severin, isolated from lower eukaryotes, and gelsolin and villin, found in higher eukaryotes (20-22). These proteins share a large structural domain containing highly conserved repeated motifs, and all four proteins display an F-actin-severing activity. There is one copy of this structural domain in fragmin and severin whereas two tandem copies are found in gelsolin and villin, suggesting a duplication event. Interestingly, villin displays an additional specific carboxyl-terminal domain called the "head-piece," which confers on villin F-actin-bundling activity not observed in the other members of the family. It has been postulated that these proteins may have evolved from a common ancestor, by gene duplication in the case of gelsolin and by gene duplication and subsequent addition of a specific domain in the case of villin (13). Seeking support for these postulated evolutionary events, we have examined the exon/intron distribution in the villin gene with respect to the structural and functional domains of the encoded protein (Fig. 5) and then compared this distribution with that of gelsolin.

The duplicated domains of human villin are encoded by eight and seven small exons (exons 1-8 and 10-16) and are separated by exon 9, which encodes a short hinge sequence (Fig. 5A). Exons 17 and 18 encode the second linking region and the amino terminus of the head-piece, covering two-thirds of this domain. Exon 19 encodes the carboxyl-terminal region of the head-piece, which is responsible for F-actin bundling activity and is required for the morphogenetic effect induced by villin in transfected fibroblasts (7), as well as the 3' untranslated region of both villin mRNAs. This overall organization reflects the postulated gene duplication of an ancient precursor gene and the addition of a specific domain giving rise to villin. In the two duplicated domains of villin, however, splice sites are not conserved, resulting in a different exon/intron distribution (Fig. 5B). Moreover, introns vary in size between the two domains (Fig. 4) and two gaps of 15 and 40 bp appear in the coding sequence of the second domain. The early duplication event has therefore been followed by extensive divergence of the duplicated domains. This exon redistribution and sequence drift may account for the functional difference observed between these two domains, such as the loss of severing activity in the second duplicated domain (23).

We have also compared the genomic organization of human gelsolin (19) with that of human villin. Although these two proteins belong to the same family, extensive differences in gene organization are observed. A single gene, much larger than the villin gene (70 kb) but containing only 14 exons, encodes the two gelsolin protein isoforms, a cytoplasmic isoform and a secreted isoform, which are generated by alternative transcriptional start sites and exon skipping. Exons 1 and 2, located far upstream from exon 3, encode the 5' untranslated region of the mRNA encoding cytoplasmic gelsolin. Exon 3 encodes the signal peptide and specific amino-terminal extension found in secreted gelsolin. In contrast to the first three exons, which are specific to the gelsolin gene, exons 4-14 encode sequences common to the two gelsolin isoforms and are similar to the villin core. This domain is encoded by 16 exons in the case of villin and only 11 exons in the case of gelsolin, indicating an extensive exon redistribution, despite a controlled sequence divergence, since villin and gelsolin share more than 50% homology in their sequences. The precise organization of exons 6-14 of the gelsolin gene has not been reported. Thus, it is not possible to conclude whether or not the genomic organization of gelsolin also reflects a gene duplication.

In conclusion, the structural divergences between the villin and gelsolin genes exclude a simple evolutionary link. After an early gene duplication, these genes may have followed a complex parallel evolution, with the gain of specific domains, such as that allowing secretion, encoded at the 5' end of the gelsolin gene, and that conferring F-actin-bundling activity, encoded at the 3' end of the villin gene.

We thank Dr. H. Lehrach for the gift of the EMBL3 human genomic library and Dr. S. Pellegrini for help in performing Southern blot experiments. We thank Dr. M. Buckingham and Dr. R. Kelly for critically reading the manuscript. This work was supported by grants from the Institut National de la Santé et de la Recherche Médicale (no. 86-7008), the Association pour la Recherche sur le Cancer (no. 6379), the Ligue Nationale Française contre le Cancer, the Fondation pour la Recherche Médicale, and the Association Française de Lutte contre la Mucoviscidose.

1. Bretscher, A. & Weber, K. (1979) Proc. Natl. Acad. Sci. USA 76, 2321-2325.
2. Bretscher, A. & Weber, K. (1980) Cell 20, 839-847.
3. Craig, S. W. & Powell, L. D. (1980) Cell 22, 739-746.
4. Mooseker, M. S., Graves, T. A., Wharton, K. A., Falco, N. & Howe, C. L. (1980) J. Cell Biol. 87, 809-822.
5. Glenney, J. R., Jr., & Weber, K. (1981) Proc. Natl. Acad. Sci. USA 78, 2810-2814.
6. Glenney, J. R., Jr., Geisler, N., Kaulfus, P. & Weber, K. (1981) J. Biol. Chem. 256, 8156-8161.
7. Friederich, E., Huet, C., Arpin, M. & Louvard, D. (1989) Cell 59, 461-475.
8. Robine, S., Huet, C., Moll, R., Sahuquillo-Merino, C., Coudrier, E., Zweibaum, A. & Louvard, D. (1985) Proc. Natl. Acad. Sci. USA 82, 8488-8492.
9. Boller, K., Arpin, M., Pringault, E., Mangeat, P. & Reggio, H. (1988) Differentiation 39, 51-57.
10. Maunoury, R., Robine, S., Pringault, E., Huet, C., Guénet, J. L., Gaillard, J. A. & Louvard, D. (1988) EMBO J. 7, 3321-3329.
11. Ezzel, R. M., Chafel, M. M. & Matsudaira, P. T. (1989) Development 106, 407-419.
12. Pringault, E. (1990) Ann. Inst. Pasteur (Paris) 2, 120-126.
13. Arpin, M., Pringault, E., Finidori, J., Garcia, A., Jeltsch, J. M., Vandekerckhove, J. & Louvard, D. (1988) J. Cell Biol. 107, 1759-1766.
14. Pringault, E., Arpin, M., Garcia, A., Finidori, J. & Louvard, D. (1986) EMBO J. 5, 3119-3124.
15. Rousseau-Merck, M. F., Simon-Chazottes, D., Arpin, M., Pringault, E., Louvard, D., Guénet, J. L. & Berger, R. (1988) Hum. Genet. 78, 130-133.
16. Southern, E. M. (1975) J. Mol. Biol. 98, 503-517.
17. Frischauf, A. M., Lehrach, H., Poustka, A. M. & Murray, N. (1983) J. Mol. Biol. 170, 827-842.
18. Sanger, F., Nicklen, S. & Coulson, A. R. (1977) Proc. Natl. Acad. Sci. USA 74, 5463-5467.
19. Kwiatkowski, D. J., Mehl, R. & Yin, H. L. (1988) J. Cell Biol. 106, 375-384.
20. Kwiatkowski, D. J., Stossel, T. P., Orkin, S. H., Mole, J. E., Colten, H. R. & Yin, H. L. (1986) Nature (London) 323, 455-458.
21. Ampe, C. & Vandekerckhove, J. (1987) EMBO J. 6, 4149-4157.
22. Andre, E., Lottspeich, F., Schleicher, M. & Noegel, A. (1988) J. Biol. Chem. 263, 722-727.
23. Janmey, P. A. & Matsudaira, P. T. (1988) J. Biol. Chem. 263, 16738-16743.