Proc. Natl. Acad. Sci. USA
Vol. 88, pp. 4476-4480, May 1991
Cell Biology

Desmocollins form a distinct subset of the cadherin family of cell adhesion molecules
(desmosomes/adherens junctions/cell-cell interactions)

SUSAN MECHANIC*†, KAREN RAYNOR*†, JOHN E. HILL*, AND PAMELA COWIN*†

Departments of *Cell Biology and †Dermatology and Kaplan Cancer Center, New York University Medical Center, 550 First Avenue, New York, NY 10016

Communicated by David D. Sabatini, February 11, 1991

ABSTRACT The desmosomal adhesive core is formed by four major components: desmoglein (Mr, 165,000), desmocollins I and II (Mr, 120,000 and 110,000, respectively), and a Mr 22,000 protein. Here, we report the cloning and sequencing of cDNAs encoding a bovine desmocollin. The open reading frame found in the longest cDNA, 5 kilobases, contains a region encoding a protein of 839 amino acids. The features of the deduced amino acid sequence imply that the mature 707-amino acid desmocollin is a type I transmembrane protein that is produced by proteolytic cleavage of an 810-amino acid precursor. The ectodomain of desmocollin contains repeats that show extensive sequence similarity to members of the cadherin family of calcium-dependent cell adhesion molecules. A comparison of the amino acid sequences of desmocollin, desmoglein, and the cadherins shows that although these intercellular junctional adhesion molecules share a consensus sequence in their adhesive domains that defines them as a family, several features, including the divergence in the sequence of their cytoplasmic tails, divide them into three distinct subtypes.

Desmosomes are attachment devices between cells that are characterized by a widened intercellular space bisected by a midline density and a pair of electron-dense submembranous mats called plaques, which are the foci of anchorage of intermediate filaments to the membrane (1). The subcellular fraction enriched in the intercellular adhesive cores consists of four major glycoproteins, desmoglein (Mr, 165,000), desmocollins I and II (Mr, 120,000 and 110,000, respectively), and a Mr 22,000 component (2).

Desmocollins I and II are a pair of closely related proteins. Both polypeptides have similar isoelectric points and overall amino acid composition, and proteolysis of each produces similar patterns of fragments (3, 4). Their size difference has been attributed to differential glycosylation and phosphorylation (3, 5). To date, all polyclonal and monoclonal antibodies raised against the desmocollins recognize both desmocollins I and II (5-9). Fab' fractions of an anti-desmocollin antiserum detect desmocollins on the cell surface prior to cell contact and inhibit desmosome assembly when added to the culture medium of subconfluent cells (10). This led to the suggestion that desmocollins play a key role in initiating desmosome assembly and possibly in desmosomal adhesion (10).

Desmosomal assembly and stability are extremely sensitive to calcium ion concentration (11, 12), and desmocollins and desmoglein bind calcium ions in gel-overlay assays (13). There are three families of cell adhesion molecules that show a similar requirement for calcium ions to manifest their adhesive properties: the integrins, the lectins, and the cadherins (14, 15). We (16) and others (17) have shown that desmoglein shares amino acid sequence similarity with the cadherins, which are the major adhesive components of the adherens junction. Herein we present the complete amino acid sequence of a desmocollin precursor deduced from the nucleotide sequence of cDNA clones‡ and demonstrate that desmocollin represents a unique subset within the cadherin family of adhesion molecules. Thus, the adherens junction and the desmosome, which are morphologically similar, contain adhesive components that are structurally related.

MATERIALS AND METHODS

Desmocollin-Enriched Fractions. Individual desmocollin protein bands were gel-purified (6) from subcellular fractions enriched in bovine snout desmosomal glycoprotein cores (18).

Antibody Production. Anti-desmocollin antibodies were produced as described (6). The resulting antibodies were characterized by immunoblot analysis of bovine epidermal desmosomal fractions and by immunofluorescence microscopy and gave results with these techniques similar to those described (6).

Peptide Isolation and Sequencing. Desmocollin proteins were digested with trypsin. Resulting peptides were purified by reverse-phase HPLC and sequenced by William Lane (Harvard Microchemistry Facility, Cambridge, MA). The sequence of the amino-terminal 12 amino acids of the mature protein and those of three tryptic peptides are underlined in Fig. 1 and denoted by the numerals i, ii, iii, and iv.

cDNA Library Construction and Isolation of cDNA Clones Encoding Bovine Epidermal Desmocollin. The complete desmocollin cDNA was obtained by a 3-fold screening strategy. First, a bovine epidermal cDNA library was constructed in λgt11 as described (19), from mRNA isolated from the epidermis of 10 cow snouts, and screened with specific anti-desmocollin antibodies. Four clones were identified. The longest, 720 base pairs (bp), termed DC-700, contained an open reading frame coding for 50 amino acids (aa) followed by a stop codon, 554 bp of noncoding sequence, and a poly(A) tail of 16 bases. Overlapping and complementary oligonucleotides were constructed from the 5' sequence of DC-700 (Fig. 1, bases 2581-2626 and 2644-2595) and used to rescreen the library for longer clones. Seventy positive clones were found in this secondary screen. To produce a probe that could select the longest clones from among these 70 clones, a polymerase chain reaction (PCR) was performed with first-strand bovine epidermal cDNA as the template. The oligonucleotide constructed from bases 2643 to 2595 was used as the 3' primer, and a set of degenerate oligonucleotides encoding the first 6 aa of tryptic peptide ii (see above) was used as the 5' primer. Ethidium bromide staining of the electrophoresed PCR products detected a single band of 1.3 kilobases (kb) (DC-PCR). Sequence adjacent to the 5' region of the PCR product was used

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

Abbreviation: aa, amino acid.
‡The sequence reported in this paper has been deposited in the GenBank data base (accession no. M61750).

[Figure 1: nucleotide sequence and deduced amino acid sequence panel, not reproduced here.]

FIG. 1. Nucleotide and protein sequence of desmocollin. The nucleotide sequence for desmocollin is shown with the corresponding 839 aa in the single-letter code. The dashed lines above residues 1-29 and 683-715 show the signal sequence and transmembrane [...] protein; bars marked ii, iii, and iv represent the sequence of three tryptic peptides. Highly acidic regions corresponding to the putative calcium binding regions in the cadherins are indicated by boxed shaded areas, and an additional area high in acidic residues is shown in an unboxed shaded area. Potential N-linked glycosylation sites are indicated by open triangles, and asterisks represent possible phosphorylation sites. Dots below nucleotides in the sequence indicate base changes in the product of the PCR. Cysteines are highlighted in solid circles. The stop codon is denoted by exclamation marks. The positions of the two polyadenylylation signals are indicated diagrammatically.
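The residue numbering quoted in the Fig. 1 legend and in the Results lends itself to a quick consistency check. The following Python sketch is an editorial illustration rather than part of the original analysis. It recomputes the lengths of the precursor, propeptide, mature protein, transmembrane region, and cytoplasmic domain from the quoted coordinates. The function name `summarize` and the constant names are hypothetical, and treating base 2726 as the last base of the stop codon is an assumption.

```python
# Residue coordinates quoted in the text and the Fig. 1 legend (1-based).
ORF_RESIDUES = 839            # residues encoded by the open reading frame
SIGNAL_END = 29               # signal sequence spans residues 1-29
MATURE_START = 133            # amino terminus of the mature protein
TM_START, TM_END = 683, 715   # single hydrophobic, membrane-spanning region

# Base positions quoted for clone DC-23.
# ASSUMPTION: base 2726 is the last base of the stop codon, so the
# coding bases run from 207 through 2723.
FIRST_MET_BASE = 207
STOP_CODON_LAST_BASE = 2726


def summarize():
    """Recompute domain lengths from the quoted coordinates."""
    coding_bases = STOP_CODON_LAST_BASE - FIRST_MET_BASE + 1 - 3
    return {
        "codons": coding_bases // 3,
        "codon_remainder": coding_bases % 3,
        "precursor": ORF_RESIDUES - SIGNAL_END,
        "propeptide": MATURE_START - SIGNAL_END - 1,
        "mature": ORF_RESIDUES - MATURE_START + 1,
        "transmembrane": TM_END - TM_START + 1,
        "cytoplasmic": ORF_RESIDUES - TM_END,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value}")
```

Under these assumptions the recomputed values agree with the figures given in the text: an 810-aa precursor, a 103-aa propeptide, a 707-aa mature protein, a 33-aa hydrophobic region, and a 124-aa carboxyl-terminal domain.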

to create an oligonucleotide (bases 1370-1347). A tertiary screen with the latter probe identified a single positive 5.0-kb clone (DC-23).

DC-700, DC-PCR, and DC-23 were subcloned into pGEM (Promega) and an exonuclease III deletion series was created from both ends of each, using the Erase-a-Base kit (Promega) according to the manufacturer's instructions, to facilitate sequencing of both strands. Double-stranded DNA sequencing was performed using Sequenase (Promega) with either SP6 or T7 primers. The University of Wisconsin Genetics Computer Group package of sequence analysis programs (20) was used on a VAX 6000-410 computer (Digital Equipment).

Northern Blot Analysis. Bovine snout poly(A)+ mRNA (5-10 μg) was electrophoresed in denaturing formaldehyde/agarose gels, transferred to nylon membranes, and used for Northern blot hybridization as described (19).

RESULTS

Characterization of a Desmocollin Precursor. Three cDNAs were sequenced: the 720-bp clone (DC-700) recognized by the desmocollin antibody, a 1.3-kb internal PCR product (DC-PCR), and the longest (5.0 kb) cDNA (DC-23). The nucleotide and deduced amino acid sequence of the complete coding region of a bovine epidermal desmocollin precursor derived from DC-23 is shown in Fig. 1. The authenticity of this clone was established by a 100% match between regions of the deduced protein sequence and the sequence obtained from three tryptic peptide fragments of desmocollin (Fig. 1, amino acids overlined and denoted by the numerals ii, iii, and iv).

cDNA DC-23 has a single long open reading frame that extends to a stop codon at base 2726. The first methionine at base 207 is followed by a stretch of 29 aa that contains the features required of a signal sequence (21). The 12 aa obtained by sequencing the amino terminus of the mature protein are found at residues 133-144. These results are consistent with desmocollin being synthesized as a precursor of 810 aa that is inserted into the membrane by a cleavable signal sequence of 29 aa and subsequently processed to a mature form of 707 aa by removal of 103 aa from the amino terminus. A single hydrophobic region of 33 aa, sufficient to span the lipid bilayer once, is located at residues 683-715 and is followed by a 124-aa carboxyl-terminal domain. A scan of the Prosite motif data base revealed three potential sites of N-linked glycosylation (NXS/T) in the amino-terminal domain at residues 163, 398, and 545 and a signal for glycosaminoglycan addition (SGXG) at residues 172-176. A total of eight potential phosphorylation sites were found in the carboxyl-terminal domain. Five sites for protein kinase C phosphorylation (S/TXR/K) were found on threonines at residues 716, 720, and 810 and on serines at residues 803 and 830. Two potential sites of casein kinase 2 phosphorylation (S/TXXD/E) were found on the serine at residue 737 and on the threonine at residue 814. A single motif for tyrosine kinase phosphorylation (R/KXXXD/EXXXY) was found at residue 813.

Desmocollins I and II show a molecular weight difference, variously reported, between 8000 and 15,000 (5, 6, 22). Differential glycosylation of the two proteins and phosphorylation only of the larger desmocollin (3, 9) have been postulated to account for the variation in electrophoretic mobilities. We therefore examined our cDNAs for differences that would be consistent with this explanation. The three clones examined were identical with six exceptions, none of which could give rise to differential glycosylation or phosphorylation. The 1.3-kb PCR clone (DC-PCR) that extends from base 1322 to 2643 differed from DC-23 at 5 bases (denoted by dots in Fig. 1); two of these resulted in a change in amino acid residue, from isoleucine to phenylalanine at residue 519 and from tyrosine to cysteine at residue 788. These changes could result from mistakes introduced during the 30 cycles of PCR. However, the number of substitutions in our sequence is almost 2-fold that reported to result from these PCR conditions (1/400 bases) (23). Therefore, at least some of the differences may represent polymorphism in the population.

The sixth difference found between the sequences of the clones lies in the site of polyadenylylation. Clone DC-700 is derived from an mRNA in which polyadenylylation initiates 20 bp downstream of the rare ATTAAA polyadenylylation signal found 525 bp beyond the stop codon. Clone DC-23 represents an mRNA that continues for an additional 1.7 kb before polyadenylylation begins 12 bp downstream from the more conventional AATAAA signal. Northern blot analysis of epidermal mRNA detects a message of ≈5.5 kb, and a faint band is seen (Fig. 2) slightly above the position of plakoglobin mRNA (3.2 kb). These results indicate that DC-23 (5.0 kb) is nearly complete and that the polyadenylylation signal found in DC-700 is used very rarely and/or results in an unstable species of mRNA.

Desmocollins and Cadherins Show Similarity in Their External Domains but Differ in Their Cytoplasmic Tail. Searches of the data bases found significant similarity between the protein sequences of desmocollin and members of the cadherin family (24). The greatest similarity was found to the N-cadherins. Surprisingly, desmocollin showed a stronger resemblance to all the cadherins than it did to desmoglein. The amino-terminal domain of mature desmocollin (residues 133-683) shares between 34 and 38% identity and 55 and 57% similarity with the extracellular domains of members of the cadherin family. This compares to values of 54% identity and 70% similarity between the extracellular domains of E- and P-cadherins (Table 1).

Like cadherins, the extracellular domain of desmocollin is highly repetitive. These extracellular repeats have been designated EC1 to EC5, EC5 being closest to the membrane (25). To define the regions of identity in greater detail, we constructed a multiple sequence alignment between the desmocollin extracellular domain and those of all cadherins for which the complete sequence is known (Fig. 3; for a similar comparison of desmoglein to cadherins see ref. 16). The percentage of identical residues was greatest in EC2 followed by EC4 > EC1 > EC3 > EC5. Identical residues were clustered in the putative calcium-binding regions of the cadherins (marked by arrowheads in Fig. 3) that obey the consensus sequences A/VXDXD and DXNDN (26). This feature was also observed in desmoglein (16). It is generally accepted that calcium ions interact with and alter the conformation of cadherins in such a way as to protect them from proteolytic attack (15). Synthetic peptides constructed to these sequences bind calcium, and point mutations within this motif result in loss of the calcium-conferred protection from tryptic digestion (27). The presence of these calcium binding sites within desmosomal glycoproteins is likely to account for the sensitivity of desmosome assembly and stability to the external concentration of calcium ions (12).

Several other highly repetitive cadherin amino acid motifs of unknown function (e.g., LDRE) are present at conserved positions in the desmocollin and desmoglein sequences. The putative proteolytic cleavage signal for cadherin precursor processing, LXRXKR (Fig. 2) (28), is also present in desmocollin (residues 126-131) and desmoglein. This suggests that the processing of each of these proteins may employ the same protease. Mutations in this cleavage motif inhibit processing and abolish adhesion of cadherins (28). This supports the hypothesis that adhesion molecules are synthesized as precursors to prevent intracellular interaction. Subsequent proteolytic processing would then expose or alter an active site of intermolecular interaction (28).

To determine which residues are crucial to the structure of cadherins, a consensus was constructed of amino acids that are conserved in all members of this family (Fig. 3, line 9). The shaded residues in line 9 indicate that the desmocollin sequence complies with 68% of this consensus, a figure much higher than that derived from comparisons of the overall sequence. Dashes below line 9 indicate amino acids that are conserved in both desmosomal glycoproteins and in all cadherins. Four cysteines found in EC5 (arrows in Fig. 3) and prolines throughout the sequence are highly conserved in cadherins. Both of these features are present in desmocollin but are absent from desmoglein, where they are often replaced by glutamic acid residues (16). Nine cysteines are present in EC1 to EC4 of desmocollins and six cysteines are

FIG. 2. Northern blot analysis. DC-700 cDNA detected a 5.5-kb band in bovine snout epidermal (lane S) or tongue (lane T) mRNA. The arrowhead marks a lower band that probably represents transcripts terminating at the first polyadenylylation signal. The positions of the 28S and 18S rRNA bands are marked by arrows.

Table 1. Comparison of the percentage of identical residues between the extracellular domains of desmocollin, desmoglein, and cadherins

                    % identity
         bDC        mN         mE         mP
mN       39 (57)
mE       35 (55)    47 (64)
mP       36 (53)    47 (64)    54 (70)
bDG      28 (47)    29 (50)    29 (50)    29 (50)

Numbers in parentheses are the percent similarity as defined by the PAM-250 matrix (24). bDC, bovine desmocollin; mN, mouse N-cadherin; mE, mouse E-cadherin; mP, mouse P-cadherin; bDG, bovine desmoglein. The regions comprising EC1 to EC5 were aligned using the program GAP with a gap weight of 3.0 and a gap length weight of 0.1.
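As an editorial illustration of how values such as those in Table 1 can be tallied, the Python sketch below computes percent identity from a pair of already-aligned sequences and evaluates an affine gap penalty using the gap weight (3.0) and gap length weight (0.1) quoted above. It is a minimal sketch under stated assumptions, not the GAP program itself: the function names are hypothetical, the choice to skip gapped columns in the denominator is an assumption, and the penalty is assumed to take the conventional affine form (gap weight plus length weight times gap length). The example sequences are made up.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither sequence is gapped.

    ASSUMPTION: gapped columns are excluded from the denominator; other
    conventions (for example, dividing by the full alignment length) exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


def affine_gap_penalty(gap_length, gap_weight=3.0, length_weight=0.1):
    """Affine penalty: a fixed cost for opening a gap plus a per-residue cost."""
    return gap_weight + length_weight * gap_length


if __name__ == "__main__":
    # Hypothetical toy alignment, not taken from Fig. 3.
    print(round(percent_identity("ACDE-FG", "ACNEKFG"), 1))
    print(affine_gap_penalty(5))
```

With a low per-residue cost relative to the opening cost, as in the quoted settings, a single long gap is penalized far less than several short ones, which suits alignments of repeats that differ by a few insertions.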

[Figure 3: multiple sequence alignment panel for repeats EC1 to EC5, not reproduced here.]

FIG. 3. Multiple protein sequence alignment of bovine desmocollin and cadherins. The five extracellular repeats EC1 to EC5 of desmocollins are shown aligned to the sequences of other cadherins. DC, bovine desmocollin; mE, mouse E-cadherin; cE, chicken E-cadherin; mP, mouse P-cadherin; hN, human N-cadherin; bN, bovine N-cadherin; mN, mouse N-cadherin; cN, chicken N-cadherin. Within each repeat, at most a few small gaps were required to align any pair of sequences. Residues that are conserved in all cadherins form a consensus shown on the line marked *; dashes under the consensus indicate residues conserved among desmocollin, desmoglein, and the cadherins. Shading indicates residues conserved between desmocollin and other cadherins. The arrows indicate cysteine residues in desmocollin. Arrowheads indicate the putative calcium binding domains of cadherins. Bars indicate additional concentrations of acidic amino acids. Each repeat was aligned with desmocollin and each of the other cadherins using the program GAP with a gap weight of 3.0 and a gap-length weight of 0.1. The resulting pairwise alignments were imported into the program LINEup for the final alignment. GenBank accession numbers are as follows: mE, X06115; cE, M16260; mP, X06340; hN, M34064; bN, X53615; mN, M31131; cN, X07277.
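The consensus patterns quoted in the Results (NXS/T, S/TXR/K, S/TXXD/E, A/VXDXD, and DXNDN) map directly onto regular expressions, where X is any residue and a slash separates alternatives at one position. The Python sketch below is an editorial illustration, not the Prosite scan used in the paper. The names `MOTIFS`, `find_motif`, `scan`, and `TOY_SEQUENCE` are hypothetical, and the toy peptide is made up rather than taken from desmocollin. The patterns follow the shorthand as written here, which is looser than the full PROSITE definitions (PROSITE, for instance, excludes proline at the X position of the glycosylation pattern).

```python
import re

# Shorthand from the text translated to regular expressions.
# "." stands for X (any residue); "[ST]" stands for S/T, and so on.
MOTIFS = {
    "N-glycosylation (NXS/T)": "N.[ST]",
    "protein kinase C (S/TXR/K)": "[ST].[RK]",
    "casein kinase 2 (S/TXXD/E)": "[ST]..[DE]",
    "calcium-binding (A/VXDXD)": "[AV].D.D",
    "calcium-binding (DXNDN)": "D.NDN",
}


def find_motif(sequence, pattern):
    """Return 1-based start positions of every match, including overlapping ones."""
    # A zero-width lookahead lets the scan advance one residue at a time,
    # so overlapping occurrences are not skipped.
    lookahead = re.compile(f"(?=({pattern}))")
    return [match.start() + 1 for match in lookahead.finditer(sequence)]


def scan(sequence):
    """Scan a protein sequence for every motif in MOTIFS."""
    return {name: find_motif(sequence, pattern) for name, pattern in MOTIFS.items()}


# Hypothetical toy peptide used only to exercise the patterns.
TOY_SEQUENCE = "AANGSVKDADLDQNDNTPRSGGE"

if __name__ == "__main__":
    for name, positions in scan(TOY_SEQUENCE).items():
        print(f"{name}: {positions}")
```

A match reported by such a scan marks only a potential site; whether a given residue is actually modified or binds calcium has to be established experimentally.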

found in EC1 to EC4 of desmoglein that are not present in cadherins.

Several other intriguing differences exist among these molecules. In particular, the tripeptide amino acid sequence HAV, which is found in EC1 of cadherins, as well as in hemagglutinins (29), is replaced in desmocollin by the sequence YAT and in desmoglein by the sequence RAL. Single-base mutations of the amino acid residues neighboring the HAV motif have been shown to alter significantly the specific homophilic interactions of cadherins (30).

Finally, the largest sequence divergence is seen in the cytoplasmic tails of these molecules. The desmocollin cytoplasmic domain shows little resemblance to the cadherins or any other protein in the data base. Randomized versions of the desmocollin cytoplasmic tail yielded values of between 11 and 22% identity (average, 18%) and 35 and 48% similarity to the cadherin tails, suggesting that the values obtained with the correct desmocollin sequence (20% identity and 45% similarity after introducing extensive gaps) have little significance and are due to chance. In contrast, desmoglein has stretches of unique sequence, a central region with 28% identity and 53% similarity to the cadherin tail, followed by a unique and repetitive sequence that accounts for the larger size of this molecule (17).

DISCUSSION

Desmocollin has significant similarity in its external domain to a family of cell adhesion molecules termed cadherins. This observation and previous results (10) showing that monovalent anti-desmocollin antibodies inhibit desmosomal assembly strongly suggest that desmocollins are directly involved in the adhesive mechanism of the desmosome.

Our data show that desmocollin, desmoglein, and cadherins are structurally related and may be considered a family. These proteins are all intercellular junctional components, i.e., they are functionally related not only in their adhesive capacity but in their ability to concentrate into specific membrane sites. It is therefore a formal possibility that the structural resemblance of these proteins is an example of convergent evolution.

Despite the structural similarity of desmosomal glycoproteins and cadherins in their extracellular domains, there are notable differences in key residues that may alter significantly the secondary folding of these proteins. Prolines are highly conserved among the cadherins, suggesting they play a crucial role in the folding of these molecules. These residues are largely conserved in desmocollin but have been frequently substituted by acidic glutamic acid residues in desmoglein (16). Furthermore, cysteines, which may alter protein structure through their capacity to form disulfide bridges, are found in large numbers in both desmosomal glycoproteins yet have no counterparts in the cadherin sequences. These differences favor the hypothesis that cadherins, desmocollins, and desmoglein may have arisen from a common ancestral molecule that has diverged into three subtypes in response to junction-specific requirements.

The most striking sequence difference among desmocollin, desmoglein, and the cadherins lies in their cytoplasmic tails. Desmocollins show little resemblance to the cytoplasmic tail of either cadherins or desmoglein. Cadherins show 65% identity to each other in this region, and desmoglein contains unique and cadherin-like regions (17). The cytoplasmic tails of desmocollin, desmoglein, and cadherins are in close proximity to the submembranous plaques associated with intercellular junctions. These plaques contain common and specific components and associate with distinct sets of cytoskeletal filaments. The plaques of adherens junctions contain plakoglobin, vinculin, and catenins (31) and associate with actin filaments. The plaques of desmosomes also contain plakoglobin as well as the desmosome-specific components, desmoplakins I and II and band 6, and associate with intermediate filaments. Immunoprecipitations of desmoglein by pemphigus foliaceus sera also coprecipitate plakoglobin, suggesting that these two molecules form a complex (32). Similar experiments have detected complexes between cadherins and three polypeptides called catenins (31) and the cytoskeletal proteins fodrin and ankyrin (33, 34). However, the interactions between desmocollins and other components remain to be determined.

Which structural features lead these adhesive proteins to specifically coalesce into the desmosome and the adherens junctions is an intriguing question. Experiments have shown that the cytoplasmic domain of the cadherins is required to enable them to cluster into adherens-type junctions (31). This suggests that cytoskeletal linkage and/or signal transduction is a function of the carboxyl-terminal domain and is required to collect these proteins into junctional sites.

Experimental perturbations of cadherin function not only lead to loss of adhesive properties between cells but also markedly affect cellular differentiation (15). These results suggest that cadherins play an active role in adhesion, in the process of cellular recognition, and in the transmission of positional information to the cell during embryogenesis. The finding that desmosomal glycoproteins resemble cadherins, therefore, raises questions concerning their role in the process of cellular recognition. It is of some consequence that plakoglobin, a molecule colocalizing with all these molecules, is required for the implementation of positional information in Drosophila embryos (35). It is possible that the desmosomal adhesive molecules may play a far more dynamic role in the cellular recognition process than was previously imagined.

We thank Yigal Aharon and Jianbo Zeng for help with early studies and Drs. Leslie Goodwin, Irwin Freedberg, Dan Rifkin, and David Frendewey for critically reading the manuscript. This work was supported by the American Cancer Society Grant BC-691, the National Institutes of Health Grants 5P30AR39749-03 and 2 T32 GM07238-16, and a PEW Scholar Award to P.C. Biocomputing was supported by National Science Foundation Grant DIR-8908095.

1. Farquhar, M. G. & Palade, G. E. (1963) J. Cell Biol. 17, 375-412.
2. Gorbsky, G., Cohen, S. M., Shida, H., Giudice, G. J. & Steinberg, M. S. (1985) Proc. Natl. Acad. Sci. USA 82, 810-814.
3. Kapprell, H. P., Cowin, P. & Franke, W. W. (1987) Eur. J. Biochem. 166, 505-517.
4. Mueller, H. & Franke, W. W. (1983) J. Mol. Biol. 163, 647-671.
5. Parrish, E. P., Marston, J. E., Mattey, D. L., Measures, H. R., Venning, R. & Garrod, D. R. (1990) J. Cell Sci. 96, 239-248.
6. Cowin, P. & Garrod, D. (1983) Nature (London) 302, 148-150.
7. Cowin, P., Mattey, D. & Garrod, D. R. (1984) J. Cell Sci. 66, 119-132.
8. Giudice, G. J., Cohen, S. M., Patel, N. H. & Steinberg, M. S. (1984) J. Cell Biochem. 26, 35-45.
9. Cohen, S. M., Gorbsky, G. & Steinberg, M. S. (1983) J. Biol. Chem. 258, 2621-2627.
10. Cowin, P., Mattey, D. & Garrod, D. R. (1984) J. Cell Sci. 70, 41-60.
11. Mattey, D. L. & Garrod, D. R. (1986) J. Cell Sci. 85, 113-124.
12. Hennings, H. & Holbrook, K. A. (1983) Exp. Cell Res. 143, 127-142.
13. Steinberg, M. S., Shida, H., Giudice, G. J., Shida, M., Patel, N. H. & Blaschuk, O. W. (1987) Ciba Found. Symp. 125, 3-25.
14. Hynes, R. O. (1987) Cell 48, 549-554.
15. Takeichi, M. (1988) Development 102, 639-655.
16. Goodwin, L. O., Hill, J. H., Raynor, K. R., Rasei, L. K., Manabe, M. & Cowin, P. (1990) Biochem. Biophys. Res. Commun. 173, 1224-1230.
17. Koch, P. J., Walsh, M. J., Schmelz, M., Goldschmidt, M. D., Zimbelmann, R. & Franke, W. W. (1990) Eur. J. Cell Biol. 53, 1-12.
18. Gorbsky, G. & Steinberg, M. S. (1981) J. Cell Biol. 90, 243-248.
19. Cowin, P., Kapprell, H. P., Franke, W. W., Tamkun, J. & Hynes, R. O. (1986) Cell 46, 1063-1073.
20. Devereux, J., Haeberli, P. & Smithies, O. (1984) Nucleic Acids Res. 10, 305-321.
21. Sabatini, D. D., Kreibich, G., Morimoto, T. & Adesnik, M. (1983) in Liver Plasma Proteins, eds. Glaumann, H., Peters, J. & Redman, C. (Academic, New York), pp. 237-255.
22. Penn, E. J., Hobson, C., Rees, D. A. & Magee, A. I. (1989) FEBS Lett. 247, 13-16.
23. Higuchi, R., Krummel, B. & Saiki, K. (1988) Nucleic Acids Res. 16, 7351-7367.
24. Dayhoff, M., Schwartz, R. M. & Orcutt, B. C. (1978) in Atlas of Protein Sequence and Structure, ed. Dayhoff, M. (Natl. Biomed. Res. Found., Washington) Suppl. 5, pp. 345-352.
25. Hatta, K., Nose, A., Nagafuchi, A. & Takeichi, M. (1988) J. Cell Biol. 106, 873-881.
26. Ringwald, M., Schuh, R., Vestweber, D., Eistetter, H., Lottspeich, F., Engel, J., Dolz, R., Jahnig, F., Epplen, J., Mayer, S., Muller, C. & Kemler, R. (1987) EMBO J. 6, 3647-3653.
27. Ozawa, M. & Kemler, R. (1990) Cell 63, 1033-1038.
28. Ozawa, M. & Kemler, R. (1990) J. Cell Biol. 111, 1645-1650.
29. Blaschuk, O. W., Sullivan, R., David, S. & Pouliot, Y. (1990) Dev. Biol. 139, 227-229.
30. Nose, A., Tsuji, K. & Takeichi, M. (1990) Cell 61, 147-155.
31. Ozawa, M., Baribault, H. & Kemler, R. (1989) EMBO J. 8, 1711-1717.
32. Korman, N. J., Eyre, R. W., Klaus-Kovtun, V. & Stanley, J. R. (1989) N. Engl. J. Med. 321, 631-635.
33. Nelson, W. J., Shore, E. M., Wang, A. Z. & Hammerton, R. W. (1990) J. Cell Biol. 110, 349-357.
34. McNeil, H., Ozawa, M., Kemler, R. & Nelson, W. J. (1990) Cell 62, 309-316.
35. Peifer, M. & Wieschaus, E. (1990) Cell 63, 1167-1178.