
Proc. Natl. Acad. Sci. USA
Vol. 87, pp. 5667-5671, August 1990
Biochemistry

Complete sequence of human vinculin and assignment of the gene to chromosome 10
(cell-matrix junctions)

P. A. WELLER*, E. P. OGRYZKO†, E. B. CORBEN*, N. I. ZHIDKOVA†, B. PATEL*, G. J. PRICE*‡, N. K. SPURR§, V. E. KOTELIANSKY†, AND D. R. CRITCHLEY*¶

*Department of Biochemistry, University of Leicester, Leicester LE1 7RH, United Kingdom; †Institute of Experimental Cardiology, Cardiology Research Centre of the USSR, Academy of Medical Sciences, 3rd Cherepkovskaya Street 15A, Moscow 121552, USSR; and §Department of Human Genetic Resources, Imperial Cancer Research Fund, Clare Hall Laboratories, Potters Bar, Herts EN6 3LD, United Kingdom

Communicated by Nikolay P. Dubinin, April 13, 1990 (received for review March 1, 1990)

ABSTRACT We have determined the complete sequence of human vinculin, a cytoskeletal protein associated with cell-cell and cell-matrix junctions. Comparison of human and chicken embryo vinculin sequences shows that both contain 1066 amino acids and exhibit a high level of sequence identity (>95%). The region of greatest divergence falls within three 112-amino acid repeats spanning residues 259-589. Interestingly, nematode vinculin lacks one of these central repeats. The regions of human vinculin that are N- and C-terminal to the repeats show 54% and 61% sequence identity, respectively, to nematode vinculin. Southern blots of human genomic DNA hybridized with short vinculin cDNA fragments indicate that there is a single vinculin gene. By using a panel of human-rodent somatic cell hybrids, the human vinculin gene was mapped to chromosome 10q11.2-qter.

Vinculin is a cytoskeletal protein associated with the cytoplasmic face of both cell-cell and cell-matrix adherens-type junctions (1, 2), where it is thought to function as one of several interacting proteins involved in anchoring F-actin to the membrane (reviewed in ref. 3). In cell-matrix junctions, these proteins include α-actinin (4) and talin (5), although talin is absent from cell-cell junctions (6). Evidence in support of such a role for vinculin comes from in vitro biochemical studies that show that vinculin binds to α-actinin (7, 8), an F-actin bundling protein (reviewed in ref. 9), and to talin (10). Talin in turn has been reported to bind to the cytoplasmic domain of the β1 subunit of the fibronectin receptor (11, 12), thus completing the link between the actin cytoskeleton and the extracellular matrix. Interaction between matrix proteins and their receptors is thought to provide the trigger for assembly of these specialized cellular junctions (13), and protein phosphorylation is likely to be involved in the regulation of their stability (reviewed in ref. 3). To obtain a more complete understanding of the role of vinculin in adherens junctions, the sequence of chicken embryo vinculin has been determined (14-16). The sequence of nematode Caenorhabditis elegans vinculin also has been published (17). Here we report the complete sequence of human vinculin‖ together with evidence that there is a single human vinculin gene, which we have localized to chromosome 10.

[Figure 1: restriction map. Sites marked: Sac I, Bcl I, Pst I, Pst I, EcoRI, Kpn I, Kpn I, Pst I, EcoRI. Clones drawn beneath the map include HV1, HV2, HV7, Vh6F, Vh1F, and Vh3E.]

FIG. 1. Restriction map of overlapping human vinculin cDNAs. -, Coding sequence; □, untranslated sequence. Clones HV1, HV2, Vh1F, Vh6F, Vh2E, and Vh3E were completely sequenced on both strands. The sequence across the EcoRI site within the coding region was confirmed by using clone HV6. There were no differences between the sequences of these cDNAs in the regions of overlap. The origins of the cDNA clones are described in Materials and Methods.

‡Present address: Roche Products, Welwyn Garden City, Herts, U.K.
¶To whom reprint requests should be addressed.
‖The sequence reported in this paper has been deposited in the GenBank data base (accession no. M33308).

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

MATERIALS AND METHODS

A partial human vinculin cDNA (HV1) was isolated from a human phage λgt10 fibroblast cDNA library (provided by the late A. R. Macleod of Ludwig Institute, Cambridge, U.K.) by using a 2.89-kilobase (kb) chicken embryo vinculin cDNA probe (14). The HV1 cDNA was used to screen an oligo(dT)-primed human endothelial cell λgt11 library (provided by S. Orkin of Children's Hospital, Boston) (18), and HV2, HV6, and HV7 were isolated. Vh1F was isolated from a human fibroblast λgt11 library (Clontech) by screening with an antibody to human vinculin following standard procedures. Vh2E and Vh3E were isolated from a human endothelial λgt11 library (a gift from M. Chao of Cornell Medical Center, New York) by using Vh1F as a probe. Vh6F was isolated from a human lung fibroblast λgt11 library (Clontech) by using the 0.27-kb 5' EcoRI-BamHI fragment from HV1 as a probe. DNA probes were labeled with [α-32P]dCTP by the random priming method (19). Plaque screening by DNA-DNA hybridization was by standard procedures (20), with a final wash in 0.45 M NaCl/0.045 M sodium citrate, pH 7, at 65°C. cDNAs were subcloned into either M13mp18 or the plasmid vectors Bluescript (Stratagene) or pTZ19 and were sequenced on both strands by the dideoxy chain-termination method (21).

All of the hybrids used in this work have been described (22). Cells were maintained in RPMI 1640 medium supplemented with 10% fetal calf serum at 37°C in 5% CO2/95% air. To retain human chromosomes X or 17, media were supplemented where appropriate with 0.1 mM hypoxanthine, 10 μM methotrexate, and 16 μM thymidine. All hybrids were checked for the presence of human chromosomes by isoenzyme and karyotypic analysis (22) at the same passage as that used to prepare high molecular weight genomic DNA.
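The salt conditions used for hybridization and washing are easier to compare when expressed as multiples of standard saline citrate (SSC). The sketch below assumes the conventional definition of 1x SSC as 0.15 M NaCl/0.015 M sodium citrate, which is not stated in this paper; the function name `ssc_strength` is introduced here purely for illustration. It converts the library-screening wash given above and the posthybridization wash given in the Fig. 5 legend.

```python
import math

# Conventional 1x SSC composition (an assumption; the paper gives molarities only).
SSC_1X_NACL_M = 0.15
SSC_1X_CITRATE_M = 0.015


def ssc_strength(nacl_molar: float, citrate_molar: float) -> float:
    """Return the SSC multiple of a NaCl/sodium citrate buffer.

    Raises ValueError if the two components do not correspond to the
    same multiple of 1x SSC.
    """
    by_nacl = nacl_molar / SSC_1X_NACL_M
    by_citrate = citrate_molar / SSC_1X_CITRATE_M
    if not math.isclose(by_nacl, by_citrate, rel_tol=1e-6):
        raise ValueError("NaCl and citrate are not in the SSC ratio")
    return round(by_nacl, 3)


# Final wash for plaque screening and hybridization buffer for the blots.
screening_wash = ssc_strength(0.45, 0.045)
# Posthybridization wash quoted in the Fig. 5 legend.
blot_wash = ssc_strength(0.03, 0.003)

print(f"0.45 M NaCl/0.045 M citrate = {screening_wash}x SSC")
print(f"0.03 M NaCl/0.003 M citrate = {blot_wash}x SSC")
```

On this convention the plaque-screening wash corresponds to 3x SSC and the Southern blot wash to 0.2x SSC, both at 65°C, so the blot wash is the more stringent of the two.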

[Figure 2: sequence listing. Nucleotide sequence of the human vinculin cDNA, numbered from -44 (start of the 5' untranslated sequence) to 5051, with the derived amino acid sequence (Met-1 to Gln-1066, followed by the termination codon) printed beneath the coding region. Chicken embryo nucleotide and amino acid differences are printed above and below the human sequence, and the repeat boundaries are marked. The sequence is deposited in GenBank under accession no. M33308.]
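As a quick consistency check on the reading frame, the first 60 nucleotides of the coding sequence and the last codons before the termination codon, both copied from the Fig. 2 listing, can be translated with the standard genetic code. The sketch below is illustrative only: the helper `translate` and the constant names are assumptions introduced here, not part of the original analysis.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1 into one-letter amino acids, stopping at a stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":  # termination codon
            break
        protein.append(residue)
    return "".join(protein)


# Nucleotides 1-60 of the coding sequence (Fig. 2).
CODING_START = "ATGCCAGTGTTTCATACGCGCACGATCGAGAGCATCCTGGAGCCGGTGGCACAGCAGATC"
# Last six codons, the stop codon, and a few 3' untranslated bases (Fig. 2).
CODING_END = "AAGACTCCCTGGTACCAGTAGGCACCTGG"

print(translate(CODING_START))  # MPVFHTRTIESILEPVAQQI
print(translate(CODING_END))    # KTPWYQ
print(3 * 1066)                 # 3198, the length of the coding sequence in bp
```

The two peptides agree with residues 1-20 and 1061-1066 of the derived amino acid sequence, and 1066 codons account for the 3198-bp coding sequence reported in Results.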

FIG. 2. Complete nucleotide sequence and derived amino acid sequence of human vinculin. Positions where the chicken embryo vinculin nucleotide and derived amino acid sequences differ from human are shown above and below the human sequence, respectively. The positions of the boundaries of each of the three 112-residue repeats in the amino acid sequence are indicated (R1-R3), and the proline-rich region (residues 837-878) is boxed. One chicken embryo vinculin cDNA (cVin5) analyzed by Price et al. (15) lacks 123 bp encoding residues 167-207. The origin of this heterogeneity remains to be established. However, both the human vinculin cDNAs that span this region (HV2 and HV7) contain this sequence. The chicken embryo vinculin sequence published by Price et al. (GenBank accession no. Y00312; refs. 14 and 15) contains an error: residues 442-447 should now read Thr-Ala-Lys-Leu-Ser-Asp, in agreement with the sequence published by Coutu and Craig (GenBank accession no. J04126; ref. 16).

[Figure 3: dot matrix plot. Axes: human vinculin and nematode vinculin, residue no., with ticks at 500 and 1000.]

FIG. 3. Dot matrix comparison of human and nematode vinculin amino acid sequences using the University of Wisconsin Genetics Computing Group programs COMPARE and DOTPLOT, with window and stringency values of 30 and 8, respectively.

[Figure 4: three-line alignment of the nematode, human, and chick sequences in blocks of up to 50 residues. (A) Residues 1-258. (B) Residues 820-1066.]

FIG. 4. Alignment of the N- and C-terminal regions of human, chicken embryo, and nematode vinculin. Amino acids are numbered relative to the human sequence. Alignments were determined by using the University of Wisconsin Genetics Computing Group program GAP. Residues identical to the human sequence are boxed. (A) N-terminal region, residues 1-258. Arrows denote the boundaries of the 41-amino acid region (residues 167-207) absent from the chicken embryo cDNA cVin5 (15). (B) C-terminal region, residues 820-1066.

RESULTS

The relationship between the eight cDNA clones from which we have determined the complete sequence of human vinculin is shown in Fig. 1. Sequence analysis shows that the cDNAs span 44 base pairs (bp) of 5' untranslated sequence, the complete coding sequence (3198 bp), and 1852 bp of 3' untranslated sequence (Fig. 2). Comparison of the coding sequence of human vinculin with that of chicken embryo vinculin (14-16) shows a high level of sequence identity both at the DNA (82%) and amino acid (95%) levels, although the 5' and long 3' untranslated sequences are divergent. Human vinculin contains 1066 amino acids, as does the chicken embryo protein, and has a deduced molecular mass of 116,732 Da. Most of the amino acid sequence divergence between the human and chicken embryo vinculin sequences occurs in the central region of the protein, which contains three repeats of approximately 112 amino acids (residues 259-589) of unknown function (14). There are 32 amino acid sequence differences in this repeat region, which spans 330 residues. The region N-terminal to the repeats, which contains a talin-binding site (23), is more highly conserved, with only 5 differences in 258 amino acids. Similarly, there are only 13 sequence differences in the 476 amino acids C-terminal to the repeats. This region contains a proline-rich sequence (residues 837-878) thought to be important in separating the globular head of vinculin from its extended tail (14, 15).

The sequence of nematode vinculin also has been determined (17), and similarities with the partial chicken embryo vinculin sequence (14) were noted. Interestingly, there are only two central repeats in nematode vinculin, which accounts for its lower molecular mass estimated by SDS/PAGE. Dot matrix analysis of the complete human and nematode sequences (Fig. 3) confirms that the N- and C-terminal domains of vinculin are the most highly conserved regions of the protein (54% and 61% sequence identity, respectively), and the alignments of these regions are shown in Fig. 4.

Southern Blot Analysis of the Human Vinculin Gene. Southern blot analysis of human DNA was carried out with a number of vinculin cDNA fragments as probes (Fig. 5). When blots were hybridized with fragments spanning the entire coding sequence and part of the 3' untranslated region, a relatively complex pattern of hybridization was obtained (data not shown). In contrast, short fragments encoding residues 447-539 and 704-773 as well as a 499-bp fragment from the 3' untranslated sequence gave rise to single hybridizing bands (Fig. 5 b, c, and e), strongly suggesting that there is a single human vinculin gene. However, a 596-bp fragment consisting of the 5' untranslated region and coding sequence up to residue 182 and a 367-bp fragment encoding residues 943-1065 hybridized to multiple bands in most enzyme digests (Fig. 5 a and d). This result suggests that the organization of the 5' and 3' ends of the gene may be relatively complex. Alternatively, the probes could be detecting pseudogenes or other vinculin-related sequences, although the high stringency at which hybridization was carried out makes this latter possibility unlikely.

Chromosome Assignment of the Human Vinculin Gene. Southern blots of EcoRI-digested somatic cell hybrid DNAs were hybridized with the 1.5-kb human vinculin cDNA HV1 (see Fig. 1), which detects four major fragments in human
DNA of approximately 12, 6, 4, and 3 kb (Fig. 6, lanes 9 and 11). In mouse DNA, this probe detects fragments of 10, 8, and 3.5 kb (Fig. 6, lanes 8 and 10), and in rat DNA, fragments of 11 kb and 2 kb (Fig. 6, lane 5). Typical results obtained from screening seven hybrids, two of which were found to contain the human vinculin gene, are shown (Fig. 6, lanes 1-7). A total of 17 human-rodent cell hybrids were screened, and results from the analysis are shown in Table 1. The only chromosome to show complete concordance with the presence of the human vinculin gene is chromosome 10 (Table 1). The hybrid TRAXK12 (a gift from P. J. Goodfellow, Imperial Cancer Research Fund, London), which contains a chromosome X/10 translocation t(X;10)(p22.3;q11.2), was positive for the human vinculin gene (Fig. 6, lane 12), thus allowing further localization to chromosome 10q11.2-qter.

[Figure 5: Southern blot autoradiographs, panels a-e, lanes E, B, P, and K, with size markers in kb.]

FIG. 5. Analysis of the human vinculin gene by Southern blotting. Human genomic DNA was cleaved with the restriction endonucleases EcoRI (E), BamHI (B), Pst I (P), and Kpn I (K). DNA fragments were separated in a 0.8% agarose gel and transferred to Hybond N (Amersham), and the blot was hybridized at 65°C in 0.45 M NaCl/0.045 M sodium citrate (pH 7) with five different vinculin cDNA fragments in succession. Posthybridization washes were at 65°C in 0.03 M NaCl/0.003 M sodium citrate (pH 7). The filters were stripped between each hybridization, and the efficiency of probe removal was checked by autoradiography. (a) EcoRI-Sac I fragment (596 bp) from clone HV2, including the 5' untranslated sequence and the coding sequence up to residue 182. (b) Bgl II fragment (279 bp) from HV1 encoding residues 447-539. (c) Cla I-EcoRI fragment (208 bp) from HV1 encoding residues 704-773. (d) Kpn I fragment (367 bp) from HV6 encoding residues 943-1065. (e) Nco I-EcoRI fragment (499 bp) from HV6 containing only the 3' untranslated sequence. Migration positions of size markers (phage λ HindIII fragments) are shown in kb.

[Figure 6: Southern blot autoradiograph, lanes 1-14, with size markers in kb.]

FIG. 6. Chromosome assignment of the human vinculin gene. Southern blot of EcoRI-cleaved somatic cell hybrid DNAs (lanes 1-7 and 12-14) and mouse (lanes 8 and 10) and human (lanes 9 and 11) genomic DNA hybridized with a 1559-bp human vinculin cDNA probe (HV1, see Fig. 1). Hybrids in lanes 4 and 5 are human-rat. Migration positions of size markers (phage λ HindIII fragments) are shown in kb.

Table 1. Assignment of the human vinculin gene to chromosome 10

Hybrid name   Chromosome 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X   VINC
3WCL5         . .... + - - + + + - + + + - - + + +
SIF15         - + - - - + + - - + - - + + - + + +
DT1.2         -- +...... + + - + - + + + - + + + + +
HORP9.5       ...... + + + - +...... + + +
MOG34A4       + - + + + + + + - + + + + + + + + + _ + + +
FG10          - + - - + - - + - + - -- + + -- + + +
FST9/10       - - + + - + - + - + + + - + + - + + + +
DT1.2.4       - - +...... + + - -- + + + - + + + +
RJD1†         - - + - - - + - - +...... + + + + +
TRAXK12‡      - - -. .. .- -- * + - - - + + - - + + * +
SIR74ii       + + + +...... + + + - + t - - + t +
DUR4R3        - - + - + - - + - - + + + + - + - + + -
SIF4A31       - - + + + +...... + - +...... +
CTP41A2       - + - - - + +...... + - +...... +
CTP34B4       + + + - + + + + - - + - + - + + + - - +
TWIN19-D12    + - + + - + - + -- + + - + - + + + - + -
HORL9X        ------+

+, Chromosome present; -, chromosome absent; t, trace; *, hybrids containing chromosome fragments; VINC, presence (+) or absence (-) of human vinculin gene.
†Human Genetic Resources, Imperial Cancer Research Fund.
‡Gift from P. J. Goodfellow. For details of other hybrids, see ref. 22.

DISCUSSION

Comparison of the human, chicken embryo, and nematode vinculin sequences shows that the regions N- and C-terminal to the central repeat region are highly conserved and presumably contain domains important to the function of the protein. In the case of the N-terminal region, we have provided evidence that residues 167-207 are important to the talin-binding activity of chicken embryo vinculin (23). It is also apparent that this region of the molecule is important for the ability of the expressed protein to localize to cell-matrix junctions (23, 24). Furthermore, we have shown that the deletion of contiguous regions of as little as 7 amino acids in the sequence spanning residues 167-201 dramatically reduces the talin-binding activity of the N-terminal 398 amino acids of vinculin (23). Although residues 167-207 are identical in human and chicken embryo vinculin, this is a region of considerable divergence in the nematode sequence (17). However, it is striking that residues 178-185 are identical in all three species. The C-terminal region of vinculin (the last 170 amino acids) apparently contains a site involved in targeting the molecule to cell-matrix junctions (24). This is a highly conserved region of the protein, which has also been suggested to be involved in the ability of vinculin to self-associate (25). It has been calculated that the C-terminal region of chicken embryo vinculin (residues 858-1066) has a pI of 9.7, compared with a pI of 5.9 for the intact protein (16). This may account for the apparent ability of vinculin to interact directly with phospholipids (26).

Vinculin has been reported to contain a short region of sequence homology with actin (27). Thus, residues 300-307 in the first repeat of chicken embryo vinculin (Arg-Gln-Ile-Leu-Asp-Glu-Ala-Gly) can be aligned with residues 359-366 in α-actin (Lys-Gln-Glu-Tyr-Asp-Glu-Ala-Gly) (27). A number of actin-binding proteins also contain a similar sequence (reviewed in ref. 27). The human and chicken embryo vinculin sequences are identical over this region, but the nematode sequence is divergent, although it does contain a different short stretch of sequence homology with actin (17). Despite an earlier report, there is no evidence that vinculin binds directly to actin (28). There is no significant sequence similarity between vinculin and any other protein in the current data base. Vinculin does not contain the SH3 domain found in a number of other proteins associated with the membrane cytoskeleton (29).

There is at least one additional isoform of vinculin, termed metavinculin, which has a higher apparent molecular weight than vinculin and is expressed specifically in muscle (30-33). The close similarity between these isoforms suggests that they are generated by alternative splicing of transcripts from a single gene (34). Southern blots of chicken embryo genomic DNA cleaved with a variety of restriction endonucleases and hybridized with a short vinculin cDNA fragment containing only the 5' untranslated sequence detected a single band in each digest, consistent with the existence of a single vinculin gene in chicken (15). The more extensive Southern blot analysis of the human vinculin gene reported here is also consistent with a single functional gene. By using the HV1 cDNA as a probe, the human vinculin gene has been assigned to chromosome 10 in the region q11.2-qter. It is interesting to note that chromosome mapping of the mouse vinculin locus with the same probe has led to assignment of the mouse gene to chromosome 14, which is syntenic to human chromosome 10 (M. Strobel, personal communication). It should be of interest to define more accurately the position of the human vinculin gene, as a number of loci of interest map to the long arm of chromosome 10. These include the cell cycle control gene CDC2 (22, 35) and the disease locus for multiple endocrine neoplasia types 2A and 2B (36, 37). The data on mapped loci on the long arm of chromosome 10 are summarized in the recent Human Gene Mapping Workshop (38).

We are grateful to Dr. V. Kushnirov for help and advice with DNA sequencing and to Ms. M. Kirillova for excellent technical assistance. This work was supported by a grant to D.R.C. from the Medical Research Council (U.K.).

1. Geiger, B. (1979) Cell 10, 193-205.
2. Burridge, K. & Feramisco, J. R. (1980) Cell 19, 587-595.
3. Burridge, K., Fath, K., Kelly, T., Nuckolls, G. & Turner, C. (1988) Ann. Rev. Cell Biol. 4, 487-525.
4. Lazarides, E. & Burridge, K. (1975) Cell 6, 289-298.
5. Burridge, K. & Connell, L. (1983) J. Cell Biol. 97, 359-367.
6. Geiger, B., Volk, T. & Volberg, T. (1985) J. Cell Biol. 101, 1523-1531.
7. Belkin, A. M. & Koteliansky, V. E. (1987) FEBS Lett. 220, 291-294.
8. Wachsstock, D. H., Wilkins, J. A. & Lin, S. (1987) Biochem. Biophys. Res. Commun. 116, 1026-1032.
9. Blanchard, A., Ohanion, V. & Critchley, D. R. (1989) J. Muscle Res. Cell Motil. 10, 280-289.
10. Burridge, K. & Mangeat, P. (1984) Nature (London) 308, 744-746.
11. Horwitz, A., Duggan, K., Buck, C. A., Beckerle, M. C. & Burridge, K. (1986) Nature (London) 320, 531-533.
12. Tapley, P., Horwitz, A., Buck, C., Duggan, K. & Rohrschneider, L. (1989) Oncogene 4, 325-333.
13. Stickel, S. K. & Wang, Y. (1988) J. Cell Biol. 107, 1231-1239.
14. Price, G. J., Jones, P., Davison, M. D., Patel, B., Eperon, I. C. & Critchley, D. R. (1987) Biochem. J. 245, 595-603.
15. Price, G. J., Jones, P., Davison, M. D., Patel, B., Bendori, R., Geiger, B. & Critchley, D. R. (1989) Biochem. J. 259, 453-461.
16. Coutu, M. D. & Craig, S. W. (1988) Proc. Natl. Acad. Sci. USA 85, 8535-8539.
17. Barstead, R. J. & Waterston, R. H. (1989) J. Biol. Chem. 264, 10177-10185.
18. Ginsburg, D., Handin, R. I., Bonthron, D. T., Donlon, T. A., Bruns, G. A. P., Latt, S. A. & Orkin, S. H. (1985) Science 228, 1401-1406.
19. Feinberg, A. P. & Vogelstein, B. (1984) Anal. Biochem. 137, 266-267.
20. Sambrook, J., Fritsch, E. F. & Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Lab., Cold Spring Harbor, NY).
21. Sanger, F., Nicklen, S. & Coulson, A. R. (1977) Proc. Natl. Acad. Sci. USA 74, 5463-5466.
22. Spurr, N. K., Gough, A., Goodfellow, P. J., Goodfellow, P. N., Lee, M. G. & Nurse, P. (1988) Hum. Genet. 78, 333-337.
23. Jones, P., Jackson, P., Price, G. J., Patel, B., Ohanion, V., Lear, A. & Critchley, D. R. (1989) J. Cell Biol. 109, 2917-2927.
24. Bendori, R., Salomon, D. & Geiger, B. (1989) J. Cell Biol. 108, 2383-2393.
25. Milam, L. M. (1985) J. Mol. Biol. 184, 543-545.
26. Niggli, V., Dimitrov, D. P., Brunner, J. & Burger, M. M. (1986) J. Biol. Chem. 261, 6912-6918.
27. Tellam, R. L., Morton, D. J. & Clarke, F. M. (1989) Trends Biochem. Sci. 14, 130-133.
28. Wilkins, J. A. & Lin, S. (1986) J. Cell Biol. 102, 1085-1092.
29. Drubin, D. G., Mulholland, J., Zhu, Z. & Botstein, D. (1990) Nature (London) 343, 288-290.
30. Glukhova, M. A., Kabakov, A. E., Belkin, A. M., Frid, M. G., Ornatsky, O. I., Zhidkova, N. I. & Koteliansky, V. E. (1986) FEBS Lett. 207, 139-141.
31. Belkin, A. M., Ornatsky, O. I., Glukhova, M. A. & Koteliansky, V. E. (1988) J. Cell Biol. 107, 545-553.
32. Belkin, A. M., Ornatsky, O. I., Kabakov, A. E., Glukhova, M. A. & Koteliansky, V. E. (1988) J. Biol. Chem. 263, 6631-6635.
33. Glukhova, M. A., Kabakov, A. E., Frid, M. G., Ornatsky, O. I., Belkin, A. M., Mukhin, D. M., Orehov, A. N., Koteliansky, V. E. & Smirnov, V. N. (1988) Proc. Natl. Acad. Sci. USA 85, 9542-9546.
34. Gimona, M., Small, J. V., Moermans, M., VanDamme, J., Puype, M. & Vandekerckhove, J. (1988) EMBO J. 7, 2329-2334.
35. Myers, S., Wu, J., Goodfellow, P. J., Spurr, N., Kidd, J. R., Carson, N., Castaglione, C. M., Kidd, K. K. & Simpson, N. E. (1988) Genome 30, 239 (abstr.).
36. Mathew, C. G. P., Chin, K. S., Easton, D. F., Thorpe, K., Carter, C., Liou, G. I., Fong, S.-L., Bridges, C. D. B., Haak, H., Nieuwenhuizen Kruseman, A. C., Schifter, S., Hansen, H. H., Telenius, A., Telenius-Berg, M. & Ponder, B. A. J. (1987) Nature (London) 328, 527-528.
37. Jackson, C. E., Norum, R. A., O'Neal, L. W., Nikolai, T. F. & Delaney, J. P. (1988) Am. J. Hum. Genet. 43, A147 (abstr.).
38. Smith, M. & Simpson, N. E. (1989) Cytogenet. Cell Genet. 10, 202-225.