Proc. Natl. Acad. Sci. USA Vol. 93, pp. 4278-4283, April 1996 Biochemistry

Human plectin: Organization of the gene, sequence analysis, and localization (8q24)

CHANG-GONG LIU*, CHRISTIAN MAERCKER*, MARIA J. CASTAÑÓN†, RUDOLF HAUPTMANN†, AND GERHARD WICHE*

*Institute of Biochemistry and Molecular Biology, University of Vienna-Biocenter, 1030 Vienna, Austria; and †Ernst Boehringer Institut, 1121 Vienna, Austria

Communicated by Gottfried Schatz, Biozentrum der Universität Basel, Basel, Switzerland, January 2, 1996 (received for review August 28, 1995)

ABSTRACT Plectin, a 500-kDa intermediate filament-binding protein, has been proposed to provide mechanical strength to cells and tissues by acting as a cross-linking element of the cytoskeleton. To set the basis for future studies on gene regulation, tissue-specific expression, and pathological conditions involving this protein, we have cloned the human plectin gene, determined its coding sequence, and established its genomic organization. The coding sequence contains 32 exons that extend over 32 kb of the human genome. Most of the introns reside within a region encoding the globular N-terminal domain of the molecule, whereas the entire central rod domain and the entire C-terminal globular domain were found to be encoded by single exons of remarkable length, >3 kb and >6 kb, respectively. Overall, the organization of the human plectin gene was strikingly similar to that of human bullous pemphigoid antigen 1 (BPAG1), confirming that both proteins belong to the same gene family. Comparison of the deduced protein sequences for human and rat plectin revealed that they were 93% identical. By using fluorescence in situ hybridization, we have mapped the plectin gene to the long arm of chromosome 8 within the telomeric region. This gene locus (8q24) has previously been implicated in the human blistering skin disease epidermolysis bullosa simplex Ogna. Detailed knowledge of the structure of the plectin gene and its chromosome localization will aid in the elucidation of whether this or any other pathological conditions are linked to alterations in the plectin gene.

Plectin, one of the largest polypeptides known (Mr >500,000), was originally identified as a major component of intermediate filament (IF) preparations obtained from cultured cells (1). Ultrastructural studies (2) as well as secondary structure prediction based on the deduced amino acid sequence of the cloned rat cDNA (3) revealed that the molecule is dumbbell-shaped, comprising a long central α-helical coiled-coil rod flanked by globular domains. The dominant structural feature of the C-terminal domain, six highly homologous repeats exhibiting in their core a tandemly repeated amino acid sequence motif, occurs in lesser numbers also in other IF-associated proteins, such as the hemidesmosome-associated bullous pemphigoid antigen (BPAG1) (4) and desmoplakin, a constituent of desmosomes (5).

Unlike IF proteins and other IF-associated proteins, which are generally specific to certain cell types and tissues, plectin is abundantly expressed in many different tissues and cell lines (for review, see ref. 6). Furthermore, the protein specifically binds to a variety of IF proteins, including the nuclear lamina constituent lamin B, microtubule-associated proteins, and α-spectrin and its nonerythroid analogue, fodrin, major components of the subplasma membrane skeleton (7, 8). Using transient transfection of cultured cells, plectin's binding sites to vimentin and cytokeratin IFs have been localized within the C-terminal globular domain of the molecule (9). The overexpression of plectin mutants containing this site(s) has a dramatic dominant negative effect on cells, causing the total collapse of cytoplasmic IF networks. Based on the combined information available to date, plectin has been proposed to play a key role as a versatile cross-linking element of the cytoskeleton. Particularly, because of its strategic localization at the cytoskeleton-plasma membrane interface, such as in all types of muscle cells; in basal cell layer cells of stratified and single layer epithelia, including epidermis and cornea; and in cells forming the blood-brain barrier, plectin's functions are expected to contribute to the strengthening of cells toward shearing forces or other mechanical stress. However, the precise biological function of the protein is unknown.

The molecular characterization of human plectin provides a new avenue to study the function of this molecule by extending these studies to the investigation of its possible role in human diseases, whose symptoms correlate with impaired performance of cells exposed to great mechanical stress. Pathological conditions of this kind include muscular dystrophy and blistering skin disorders. In this initial study, we report the exon-intron organization of the human plectin gene, the analysis of its coding sequence, and its chromosome localization. This information is important for the characterization of mutated human genes, identification of any tissue-specific variants, and genetic linkage analysis.

Abbreviations: IF(s), intermediate filament(s); BPAG, bullous pemphigoid antigen; FISH, fluorescence in situ hybridization.

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

The sequences reported in this paper have been deposited in the GenBank data base (accession nos. Z54367 and X59601).

MATERIALS AND METHODS

Characterization of Human Plectin Clones. Isolation of human cDNA clones was performed by several rounds of screening of two human placenta cDNA libraries in λgt11 (one was a generous gift of C. Stratowa, EBI, Vienna, Austria, the other was from Clontech), first with two previously characterized human plectin probes (3), and then with newly derived human probes and a rat plectin subclone. Genomic clones were isolated from a human placental genomic DNA library (Stratagene) in two rounds of screening using 32P-labeled human plectin subclones.

DNA Sequencing. DNA inserts of human plectin cDNA and genomic clones were subcloned into Bluescript vectors (Stratagene) and sequenced by the dideoxynucleotide termination method (10) using the modified T7 polymerase protocol (United States Biochemical) or an automated DNA sequencer (Applied Biosystems). On average, each base in the coding region has been sequenced 6.36 times. All sequence data were compiled, assembled, and analyzed using the software packages LASERGENE (DNASTAR) and DNA STRIDER (11). Splice sites and intron-exon boundaries were determined by alignment with the corresponding cDNAs.

PCR. Human genomic DNA (Promega) and DNA prepared from the isolated genomic clones were used as template for PCR. Synthetic oligonucleotide primers complementary to the flanking exons were used for the DNA amplification. The reaction conditions were as described by the manufacturer (Perkin-Elmer/Cetus).

Fluorescence in Situ Hybridization (FISH). Metaphase chromosome spreads were prepared from an Epstein-Barr virus-immortalized lymphoblastic cell line, D282 (ref. 12; originally supplied by C. Gosden and W. Muir, Edinburgh, and kindly provided to us by A. Weith, IMP, Vienna, Austria), using standard cytogenetic techniques. Pretreatment of slides for FISH, probe labeling, hybridization, and fluorescence detection of hybridized probes were performed according to published procedures (13) with minor modifications. A 14-kb human genomic clone (HPG9) and a 9-kb clone containing the human MOS protooncogene (kindly provided by F. Propst, Vienna-Biocenter) were labeled with biotin-11-dUTP by nick translation according to the supplier's protocol (GIBCO/BRL). For immunofluorescence detection, fluorescein isothiocyanate-avidin (Sigma; 5 µg/ml) in 4x SSC/0.2% Tween 20 was employed, followed by a 1:200 dilution of monoclonal anti-avidin biotin-conjugated antibody (Sigma). Slides, mounted in 0.1% p-phenylenediamine and 0.05% propidium iodide, were viewed using a Bio-Rad MRC-600 confocal microscope. Images were processed using the NIH IMAGE software program.

RESULTS

Isolation and Characterization of cDNA and Genomic Clones. DNA sequencing of two previously described cDNA clones, HP1 and HP2, isolated from a λgt11 human placenta cDNA expression library using antibodies to rat plectin (3), revealed 94% homology to the corresponding rat sequence on the protein level. These two clones, as well as a 2.3-kb probe derived from a 5' rat plectin cDNA clone (C9; ref. 14) and a 1.6-kb subclone (pCGL9b6) containing the 3' end of the human plectin genomic clone HPG9 (see below), were used to rescreen the original library and a second human placenta cDNA library (Clontech). In total, 16 human plectin clones were isolated, some of which were sequenced from the ends and aligned against the rat sequence (Fig. 1A). Together, the clones spanned over 11.5 kb of plectin-coding region, reaching its 3' end, but not in a continuous fashion.

Initial screening of the human genomic DNA library (Stratagene) with two different human plectin cDNAs (pCGL16 and pCGL20; Fig. 1A) yielded three positive clones, among approximately 1.2 x 10^6 bacteriophage plaques. Two of these clones, HPG8 and HPG9 (Fig. 1B), were characterized in detail by restriction enzyme digestions, Southern analysis, and DNA sequencing. Clones HPG8 and HPG9 spanned 20 kb of human genomic DNA and contained most of the coding region, except for exons 1 and 2 and the 3' end of the gene. In a second screening of the same library using two distant probes derived from the 5' and 3' ends of clones HPG8 and HPG9, respectively, 18 additional clones were isolated, but only four were found to extend 5' of clone HPG8, and none were found 3' of clone HPG9. The characterization of the insert DNAs showed that three of them, HPG24, -25, and -29, were identical, while the fourth one, HPG20, overlapped with both HPG24 and HPG8 (Fig. 1B). The sequences of all exons, as well as the 5'- and 3'-flanking regions, were determined by primer walking along the genomic DNA. In order to examine the plectin gene for introns in the C-terminal region where no genomic clones were available (see Fig. 1B), human genomic DNA (Promega) was subjected to PCR amplification with different sets of primers and the size of the PCR products compared with those obtained using the corresponding cDNA as template. Similar to rat plectin (G.W., unpublished data), no introns were found in this region.

[Figure 1 graphic not reproduced. Domain labels: GN, R1, R2, GC. cDNA clones shown in A: pCGL25, pCGL26, pCGL16, pCGL21, HP2, HP1, pCGL20, pCGL52, pCGL53. Genomic clones shown in B: HPG24, 25 & 29; HPG20; HPG8; HPG9.]

FIG. 1. Structural organization of the human plectin gene. (A) The schematic drawing on top represents the three major molecular domains of plectin previously identified (3) as (i) the N-terminal globular domain (horizontal long line), (ii) the α-helical coiled-coil rod domain consisting of subdomains R1 and R2 (empty bar), and (iii) the C-terminal globular domain containing six repeat domains (circles) and a short tail. Underneath, drawn to scale, are the aligned cDNA clones isolated from human placenta cDNA libraries. (B) Organization of the human gene and alignment of genomic clones. The 32 exons comprising the entire coding region of human plectin are shown above. Exons, drawn to scale, are depicted by vertical bars, introns by horizontal lines. The broken line between exons 1 and 2 represents an intron of ~10 kb in size. Gene segments corresponding to the N-terminal globular domain, the R1 and R2 rod domains, and the globular C-terminal repeat region are indicated. The genomic clones used to define the plectin locus are shown below exon locations.

Structural Organization of the Plectin Locus. Comparison of the human genomic sequences with the rat and human cDNA sequences enabled us to establish the organization of the plectin gene. The plectin locus consisted of 32 exons spanning over 32 kb (Fig. 1B). All introns were located within the protein coding region, and all of the splice acceptor and donor sequences conformed to the GT/AG rule (15). Table 1 identifies the sequence of each exon-intron junction and their respective sizes. Most of the splice junctions belong to the boundary class type 0 (74%); 10% belong to type 1, and 16% belong to type 2. These data deviate from values of 41% for type 0, 36% for type 1, and 23% for type 2, previously reported for vertebrate genes (16). In general, diversity in the type of intron phases limits the numbers of mRNA forms that might arise by exon skipping while maintaining the same open reading frame. The lengths of the introns that were not sequenced in full were determined by PCR using primers based on the flanking regions. Intron sizes varied from ~10 kb to only 70 bp. The length of intron 1 could not be determined, as it exceeded 10 kb. The sizes of exons varied from 62 to 6223 bp. Most exons, 2-30, were contained within a ~12-kb region of the gene, in which two exon clusters can be defined, one comprising exons 7-25, the other exons 26-30 (see below). Exon 1 presumably contained the entire 5'-untranslated region preceding the first 175 amino acids. Attempts to confirm the transcription start site by nuclease protection assays as well as primer extension experiments repeatedly failed, presumably due to the unusual length (>15 kb) of the plectin mRNA (3).
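The junction classification described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: it assumes that the paper's boundary types 0, 1, and 2 correspond to the usual intron phases (the number of bases of a split codon left at the 3' end of the exon), and it uses only a handful of rows transcribed from Table 1. The names `JUNCTIONS`, `junction_type`, and `obeys_gt_ag` are hypothetical helpers introduced for this example.

```python
# Illustrative subset of Table 1: exon number -> (codon-grouped 3' end of the
# exon, first bases of the downstream intron, last bases of the upstream intron).
# Exon 1 has no upstream intron listed in the table, hence None.
JUNCTIONS = {
    1: ("GCC ACA G", "gtcagctgcacc", None),
    2: ("CTC ATC AAG", "gttggtggcgca", "agtctcccccag"),
    7: ("CCT GAG G", "gtacgtctgcgt", "tgcttgtcccag"),
    11: ("TTT GAG AG", "gtgggtggggcc", "ccctgtctacag"),
}


def junction_type(exon_end):
    """Return 0, 1, or 2: how many bases of a split codon end the exon."""
    return len(exon_end.split()[-1]) % 3


def obeys_gt_ag(donor, acceptor=None):
    """Check that the intron starts with GT and, if known, ends with AG."""
    donor_ok = donor.lower().startswith("gt")
    acceptor_ok = acceptor is None or acceptor.lower().endswith("ag")
    return donor_ok and acceptor_ok


types = {exon: junction_type(end) for exon, (end, _, _) in JUNCTIONS.items()}
all_canonical = all(obeys_gt_ag(d, a) for _, d, a in JUNCTIONS.values())

# 32 exons imply 31 introns, so the reported shares translate into counts.
reported_share = {0: 0.74, 1: 0.10, 2: 0.16}
implied_counts = {t: round(s * 31) for t, s in reported_share.items()}

print(types)           # {1: 1, 2: 0, 7: 1, 11: 2}
print(all_canonical)   # True
print(implied_counts)  # {0: 23, 1: 3, 2: 5}
```

Under these assumptions, the reported 74%, 10%, and 16% correspond to 23, 3, and 5 of the 31 introns.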

Table 1. Exon-intron organization of the human plectin gene

Exon                                                              Exon size,  Intron size,
number  Exon-intron junctions                                     bp          bp
01                    .... GCC ACA G                gtcagctgcacc  523         ~10,000
02      agtctcccccag  AT GAG CGG .... CTC ATC AAG   gttggtggcgca  62          400
03      tgtccccaccag  GCC CAG AGG .... GAC AGC CTG  gtacgtgtgccc  90          >750
04      ctcccccaacag  CCC CGG GAG .... CAC CGC CAG  gtaaggctgccg  78          87
05      gcccctcagcag  GTG AAG CTG .... GAT ATC CAG  gtagaacggctg  108         522
06      cccgccttgcag  GTG AGT GGG .... CGG CAC AAG  gtccgcgggggg  153         >600
07      tgcttgtcccag  CCC CTG CTC .... CCT GAG G    gtacgtctgcgt  115         80
08      ccttccctgcag  AC GTG GAT .... AGG GCC AAC   gtgagtgggggc  107         80
09      gcctgcccgcag  GAG CTG CAG .... GAG ATT GAG  gtgggcctgccg  120         79
10      cgtccctggcag  ATC CTG TGG .... TCC CTG GAG  gtgagtgggacc  96          185
11      ccctgtctacag  GGA GCG GTG .... TTT GAG AG   gtgggtggggcc  128         159
12      gcgggaggcgag  G CTG GAG .... CTT CAG TCG    gtgagggggtgt  94          600
13      acgaccttgcag  GAT ATC CGG .... TAC CGC AG   gtgggccccgcc  155         85
14      ctgctgccccag  G GTG TAC .... AGT GAC GAG    gtgggtggcgct  319         80
15      catctgttgcag  GGC CAG CTC .... AAG CTG CTG  gtgagtgggggc  78          73
16      tcctgcccacag  AAC TCC TCC .... AGC TAC TCG  gtgagcggtggc  162         165
17      ctctggtgccag  GCG CTG ATG .... ACG GTG GAG  gtggggctccct  105         97
18      gctgtcccgcag  TCC TTC CAG .... TAC TTT CAG  gtgagaaccgga  96          278
19      ccgcttccccag  TTC TTC TCA .... GAT GCC CAG  gtgagggagggg  126         996
20      cctcccccacag  GAC GAG AAG .... CAG GTG GAG  gtgagtgcaggc  153         85
21      tgcccgtcccag  GTG ACT GTG .... GTC ACC AG   gtgggtggcggg  155         81
22      ctgacctgccag  G CTG GAG .... CTG GCC ACG    gtacgcctgccc  127         107
23      tgcttcccccag  TTC CGC ACC .... GAA CAG G    gtagggcacggc  184         74
24      tgtgttccccag  GT GCA CAG .... GAG CAG CAG   gtaggggccgtc  158         132
25      ctgtctttgcag  AAG GCA CAG .... CTG GAG AAG  gtgaattgcaac  180         1,000
26      gctttcccgcag  CTC AAG ACC .... TCT CTG AAG  gtatcgtttcga  138         87
27      tgtccccaacag  AAG CTG CGG .... CAG GAG CAG  gtgggcttgggt  357         74
28      accatcacccag  GCC CTG CTG .... GCC ATC AAG  gtgaggcccagc  84          170
29      gcattgccacag  GAC TAT GAA .... ATC CAG GAG  gtagggtggggc  105         94
30      tctactccacag  TAC GTG GAC .... GAG GAG GAG  gtacagcccgtt  99          900
31      ctgtggccacag  AGG CTG GCT .... TCT GAG GAG  gtaccgccccct  3381        108
32      cttcttctgcag  ATG CAG ACG ....                            >6353

Nucleotide sequences in the introns are shown in lowercase type, those in exons in uppercase type. Exon sizes were determined by direct nucleotide sequencing of the corresponding human cDNAs (see Fig. 1A) or by alignment of human genomic sequences with rat cDNA. Intron sizes were determined either by nucleotide sequencing or estimated from the sizes of the amplified PCR products using primers in the flanking exons.

The rat plectin gene is organized in the same way and spans a length of DNA similar to the human plectin gene (G.W., unpublished results).

Domain Structure and Exon Boundaries. The boundaries of the main structural domains of the plectin molecule closely matched those of the exons. Exons 1-26 encoded the N-terminal globular domain. The 3'-half of exon 26 and exons 27-30 contained the R1 domain. The R2 rod domain was located entirely in exon 31 and the C-terminal globular domain in exon 32. The N-terminal globular domain distinguished itself from other domains by the abundance of introns and the small size of the exons, some of them coding for as little as 21 amino acids. The R1 domain, which comprises frequently interrupted α-helical stretches, and thus may not be part of the coiled-coil rod domain of plectin (3), was contained in exons 26-30, which together formed a cluster, separated from the other exons by relatively large introns. In contrast, the two other major domains of the plectin molecule, the R2 rod domain and the C-terminal globular domain, were encoded by single large exons, over 3 and 6 kb, respectively, in length. Thus, 80% of the introns found in the plectin gene were within one-third of its coding sequence, namely within the N-terminal globular domain, while the central rod and the C-globular domain were flanked but not interrupted by introns. The intron/exon ratio of the gene was 1.28, indicating a relatively high coding density, compared with other genes (16).

Comparison of the Human Plectin and BPAG1 Genes. Because of structural similarities, plectin, desmoplakin, and BPAG1 have been proposed to belong to one gene family (3, 17). Since the genomic sequence of BPAG1 is available (18), the gene organization of these two proteins could be compared. Although plectin is considerably larger than BPAG1 (4684 versus 2649 amino acid residues), both genes contain similar numbers of exons and introns in the regions they have in common. Furthermore, the relative placement of introns is identical for both proteins with only two exceptions. Exon 2 of BPAG1 aligns with exon 12 of plectin, and from then on they are highly related in their primary sequences, with 57.7% of the residues being identical throughout the corresponding N-terminal regions. Exon 1 of BPAG1, however, has no sequence homology to any other exons in plectin. Exon 7 of BPAG1 is split into two exons in plectin, exons 17 and 18, while exons 16 and 17 of BPAG1 correspond to a single exon, exon 27, in plectin. As in plectin, the N-terminal domain of BPAG1 is encoded by many exons, while the central rod and the globular C-terminal domains are encoded by single exons.

Deduced Protein Sequence of Human Plectin. The amino acid sequence of human plectin was deduced from the nucleotide sequences determined for the genomic and cDNA clones. Fig. 2 shows the amino acid sequence alignment of human and rat plectin. The overall sequence identity between both species was 93.2%. The isolation of genomic clones required for the study of the gene organization of human (this report) and rat (G.W., unpublished results) plectin led to an elongation of the rat coding region beyond the previously reported 5' end of rat plectin (3). This recent work also revealed errors at a few positions in the reported rat coding sequence.

[Figure 2 alignment not reproduced. Rows are labeled hPLE (human plectin, numbered to residue 4684) and rPLE (rat plectin, numbered to residue 4687); boundaries are marked for the N-terminal domain, rod 1, rod 2, repeats 1-6, and the C terminus.]

FIG. 2. Comparison of the predicted amino acid sequences of human and rat plectin. The software package DNASTAR was used to align the sequences. Hyphens are gaps inserted to achieve maximum alignment. Exact matches are indicated by dots. Amino acid residues are numbered relative to the putative initiation codon. The N-terminal region, the rod domains R1 and R2, and each of the six tandem repeat domains in the C-terminal region are marked (<>). Note that the rat sequence extends the published sequence (3) at its 5' end. The DNA sequences of human and rat plectin are available from GenBank, accession nos. Z54367 and X59601.
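The dot and hyphen notation used in this alignment can be read mechanically, as the following Python sketch shows. It is only an illustration: the two rows are invented toy strings, not plectin sequence, and the identity convention (identical residues divided by all aligned columns, with gap columns counted as mismatches) is an assumption, since the paper does not state how the 93.2% figure was computed. The function names `expand_dots` and `percent_identity` are hypothetical.

```python
def expand_dots(reference, dotted):
    """Rebuild the second row of an alignment in which '.' means
    'same residue as the reference row at this column'."""
    if len(reference) != len(dotted):
        raise ValueError("aligned rows must have equal length")
    return "".join(r if d == "." else d for r, d in zip(reference, dotted))


def percent_identity(row_a, row_b):
    """Identical residues over all aligned columns (assumed convention);
    columns containing a gap ('-') never count as matches."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    matches = sum(1 for a, b in zip(row_a, row_b) if a == b and a != "-")
    return 100.0 * matches / len(row_a)


# Toy rows for illustration only.
human_row = "ACDEFGHIKL"
rat_row_dotted = ".....-..R."

rat_row = expand_dots(human_row, rat_row_dotted)
print(rat_row)                               # ACDEF-HIRL
print(percent_identity(human_row, rat_row))  # 80.0
```

Other conventions, such as dividing by the length of the shorter sequence or excluding gap columns, would give slightly different percentages for the same alignment.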

(The corrected rat sequence has been deposited in GenBank, accession no. X59601.)

The estimated size of human and rat plectin mRNA is ~15 kb (3). The reported open reading frame of 4684 amino acid residues is thus consistent with this estimate. In rat plectin, the length of the 5'-untranslated region is ~200 bp (unpublished data), and that of the 3'-untranslated region ~1.2 kb (3), which in total account for a ~15.3-kb mRNA. Because our most 3' clone (pCGL53) extended only 134 bp over the stop codon, and due to our difficulties in mapping the transcription initiation site, the precise size of the untranslated regions in human plectin is unknown.

We calculated the molecular mass of human plectin to be 532 kDa, which by far exceeds the value of ~300 kDa deduced by SDS/PAGE. This aberrant migration has been discussed previously (3) and seems common to this family of proteins, including desmoplakin (19) and BPAG1 (20), as well as to several other large fibrous proteins.

The nucleotide sequence of all exons coincided with the cDNA sequence except for Gln706 (exon 14) and Lys1911 (exon 31). The cDNA derived from human placenta (library, C. Stratowa) had Arg (CAG → CGG and AAG → AGG) at these positions. This apparent high degree of sequence conservation suggests that polymorphism of the plectin gene is not extensive.

Chromosome Localization. A 14-kb genomic clone (HPG9; see Fig. 1B) was used for FISH to human metaphase chromosome spreads prepared from a lymphoblastoid cell line. Initial experiments suggested chromosome 8 as a candidate. This was confirmed by double FISH analysis using in parallel a chromosome 8-specific human MOS protooncogene probe (Fig. 3). Both the MOS and the plectin signals were confined to the same chromosome but at distant locations. As reported (21), the MOS-specific signal was observed near the centromere at position 8q11-12, while the plectin probe hybridized to the telomeric region of chromosome 8, at position 8q24 (Fig. 3B).

FIG. 3. Double FISH of a human plectin gene fragment and chromosome 8 marker (protooncogene MOS) to human chromosomes. Probes used and experimental details are described in the text. (A) FISH of plectin probe alone. (B) Double FISH of plectin and MOS probes. (Upper) Overview of chromosomal spread showing two double-labeled chromosomes. (Lower) Selected double-labeled chromosomes. Note distant localization of signals (white dots) in proximal (MOS) and telomeric (plectin) regions along the long arm of chromosome 8. (C) Idiogram of G-banded chromosome 8 illustrating positions of both genes and summarizing results of 10 double- and 12 single-labeled chromosome spreads. Solid circles represent frequency of double-labeled chromosomes displaying both plectin and MOS specific signals. Large circles, units of three double signals; small circles, units of 2.5 single signals at corresponding locations. Open circles represent units of three double signals (large circles) or 2.5 single signals (small circles) on chromosomes labeled with the plectin probe only.

DISCUSSION

This paper describes the structural organization of the gene for human plectin together with its coding sequence. Determination of the organization and structure of the plectin gene provides the basis for investigation of the regulation, tissue-specific expression, and pathological processes in which plectin might be involved. The gene consists of 32 exons and extends over 32 kb on the q24 band of chromosome 8. The description of the genomic organization of the plectin gene structure confirms the existence of a gene family formed by the structurally related proteins, plectin, desmoplakin, and BPAG1, as proposed previously (3, 17). As shown in this report, plectin and BPAG1 display the same genomic arrangement despite their localization on different chromosomes, i.e., chromosomes 8 (this report) and 6 (22), respectively. Indeed, both genes contain all but two introns in the sequences encoding the N-terminal one third of the molecule, whereas the central rod and the entire C-terminal domain, containing in the case of BPAG1 two of the six repeats occurring in plectin (18), both are encoded by uninterrupted exons. Furthermore, exons 2-20 of BPAG1 are of exactly the same size as exons 12-30 of human plectin, except for one exon in BPAG1 (exon 7) that corresponds to two exons in plectin (exons 17 and 18) and two exons in BPAG1 (exons 16 and 17) that correspond to a single exon in plectin (exon 27). A possible functional implication of these differences remains to be determined.

The organization and structure of the plectin and BPAG1 genes are different from those of the IF protein gene family, in which introns are primarily located in the regions encoding the rod and the C-terminal domains of the molecule (23). Plectin's gene organization is also distinct from that of the IF binding protein profilaggrin, which contains only two introns at the 5' end of the gene (24).

The presence of introns opens several possibilities of alternative splicing of the plectin transcripts. The junction of exons 30 and 32 is of particular interest in this context, because it would allow the generation of a splicing variant lacking the entire central rod domain (R2) of plectin. In fact, we have previously isolated a rat cDNA characterized by the absence of exon 31 (clone C4; figure 1 in ref. 3), and an alternative transcript corresponding to such a variant has been observed in several rat tissues (unpublished data).

The human plectin gene is localized close to the telomeric region of chromosome 8 (8q24). The severe human blistering disease epidermolysis bullosa simplex (EBS) Ogna has been mapped to this region by genetic linkage analysis (25). Desmoplakin and BPAG1, the other hitherto identified members of this gene family, both have been implicated in human diseases involving skin blistering. Desmoplakin has been linked to paraneoplastic pemphigus (26), while patients suffering from bullous pemphigoid, a rare autoimmune disease, develop antibodies to BPAG1 (27). Moreover, targeted disruption of the BPAG1 gene in mice resulted in skin blistering due to basal cell rupture along with severe dystonia and sensory nerve degeneration (28). However, neither desmoplakin nor BPAG1 is likely to play a role in EBS Ogna, because in both cases their genes have been mapped to chromosome 6 (22, 29). Similarly, none of the genes, mutations of which have been shown to cause skin blistering diseases (for review, see ref. 23), map to chromosome 8. This lends support to a possible involvement of plectin in this genetic disorder, a hypothesis that is currently under investigation.

Considering plectin's abundance, widespread but distinct cellular distribution, and proposed functional versatility, defects in its gene may be important for a variety of diseases, especially those hallmarked by decreased mechanical strength of tissues such as dystrophic conditions of striated and smooth muscle. The data contained in this report should provide a useful basis for studies aimed at establishing such connections.

We thank Dr. Andreas Weith, IMP-Vienna, for materials and advice regarding FISH analysis, our colleague Dr. Fritz Propst for providing a chromosome 8 marker, and Dr. C. Stratowa, EBI-Vienna, for the human placenta cDNA library. C.M. was supported by a postdoctoral fellowship from the European Community Human Capital and Mobility program. The work was supported by grants from the Austrian Science Research Fund and an EC-Science grant.

1. Pytela, R. & Wiche, G. (1980) Proc. Natl. Acad. Sci. USA 77, 4808-4812.
2. Foisner, R. & Wiche, G. (1987) J. Mol. Biol. 198, 515-531.
3. Wiche, G., Becker, B., Luber, K., Weitzer, G., Castañón, M. J., Hauptmann, R., Stratowa, C. & Stewart, M. (1991) J. Cell Biol. 114, 83-99.
4. Sawamura, D., Li, K., Chu, M.-L. & Uitto, J. (1991) J. Biol. Chem. 266, 17784-17790.
5. Green, K. J., Parry, D. A. D., Steinert, P. M., Virata, M. L. M., Wagner, R. M., Angst, B. D. & Nilles, L. A. (1990) J. Biol. Chem. 265, 2603-2612.
6. Wiche, G. (1989) CRC Crit. Rev. Biochem. Mol. Biol. 24, 41-67.
7. Herrmann, H. & Wiche, G. (1987) J. Biol. Chem. 262, 1320-1325.
8. Foisner, R., Leichtfried, F. E., Herrmann, H., Small, J. V., Lawson, D. & Wiche, G. (1988) J. Cell Biol. 106, 723-733.
9. Wiche, G., Gromov, D., Donovan, A., Castañón, M. J. & Fuchs, E. (1993) J. Cell Biol.
15. Mount, S. M. (1982) Nucleic Acids Res. 10, 459-472.
16. Smith, M. W. (1988) J. Mol. Evol. 27, 45-55.
17. Green, K. J., Virata, M. L. A., Elgart, G. W., Stanley, J. R. & Parry, D. A. D. (1992) Int. J. Biol. Macromol. 14, 145-153.
18. Tamai, K., Sawamura, D., Do, H. C., Tamai, Y., Li, K. & Uitto, J. (1993) J. Clin. Invest. 92, 814-822.
19. Virata, M. L., Wagner, R. M., Parry, D. A. D. & Green, K. J. (1992) Proc. Natl. Acad. Sci. USA 89, 544-548.
20. Sawamura, D., Li, K., Chu, M.-L. & Uitto, J. (1991) J. Biol. Chem. 266, 17784-17790.
21. Testa, J. R., Parsa, N. Z., Le Beau, M. M. & Vande Woude, G. F. (1988) Genomics 3, 44-47.
22. Sawamura, D., Nomura, K., Sugita, Y., Mattei, M.-G., Chu, M.-L., Knowlton, R. & Uitto, J. (1990) Genomics 8, 722-726.
23. Fuchs, E. & Weber, K. (1994) Annu. Rev. Biochem. 63, 345-382.
24. Markova, N. G., Marenkow, L. N., Chipev, C. C., Gan, S.-Q., Idler, W. W. & Steinert, P. M. (1993) Mol. Cell Biol. 13, 613-625.
25. Gedde-Dahl, T., Jr. (1990) in Management of Blistering Diseases,
121, 607-619. eds. Wojnarowska, F. & Briggaman, R. A. (Chapman and Hall, 10. Sanger, F., Nicklen, S. & Coulson, A. R. (1977) Proc. Natl. Acad. London), pp. 189-211. Sci. USA 74, 5463-5467. 26. Camisa, C. & Helm, T. N. (1993) Arch. Dermatol. 129, 883-885. 11. Marck, C. (1988) Nucleic Acids Res. 16, 1829-1836. 27. Stanley, J. R., Hawley-Nelson, P., Yuspa, S. H., Shevach, E. M. 12. Stapleton, P., Weith, A., Urbanek, P., Kozmik, Z. & Busslinger, & Katz, S. I. (1981) Cell 24, 897-903. M. (1993) Nat. Genet. 3, 292-298. 28. Guo, L., Degenstein, L., Dowling, J., Yu, Q.-C., Wollmann, R., 13. Lengauer, C., Henn, T., Onyango, P., Francis, F., Lehrach, H. & Perman, B. & Fuchs, E. (1995) Cell 81, 233-243. Weith, A. (1994) GATA 11 (5-6), 140-147. 29. Arnemann, J., Spurr, N. K., Wheeler, G. N., Parker, A. E. & 14. Becker, B. (1990) Ph.D. thesis (University of Vienna, Austria). Buxton, R. S. (1991) Genomics 10, 640-645. Downloaded by guest on September 28, 2021