
Proc. Natl. Acad. Sci. USA
Vol. 82, pp. 1609-1613, March 1985
Biochemistry

Complete sequence of a gene encoding a human type I keratin: Sequences homologous to enhancer elements in the regulatory region of the gene

(DNA sequence/gene expression/nuclease S1 mapping)

DOUGLAS MARCHUK, SEAN MCCROHON, AND ELAINE FUCHS

Department of Molecular Genetics and Cell Biology, The University of Chicago, Chicago, IL 60637

Communicated by A. A. Moscona, November 5, 1984

ABSTRACT We report here the complete nucleotide sequence of a gene encoding the 50-kDa keratin expressed in abundance in human epidermal cells. According to its sequence, this gene has a single transcriptional initiation site and a single polyadenylylation signal. Nuclease S1 mapping of this gene with total human epidermal mRNA confirmed the presence of a single initiation site for the 50-kDa keratin gene. When the regulatory sequences 5' upstream from this gene were examined, three sequences that share significant homology with viral and immunoglobulin enhancer elements were found. In comparison, the sequence of the regulatory region of vimentin, a structurally similar intermediate filament gene, was highly divergent [Quax, W., Egberts, W. V., Hendriks, W., Quax-Jeuken, Y. & Bloemendal, H. (1983) Cell 35, 215-223]. This finding may provide a clue to understanding the molecular mechanisms underlying the widely varying levels of expression of different intermediate filament genes in different tissues.

The intermediate filaments (IFs) are a family of 8- to 10-nm fibers that constitute a part of the cytoskeleton in virtually all higher eukaryotic cells (for a review, see ref. 1). The keratins are a complex group of about 20 different proteins that comprise the IFs in epithelial cells (for a review, see ref. 2). They can be further subdivided into two distinct sequence classes, type I and type II (3, 4). At least one member of each of these two keratin classes is always expressed at all times, suggesting the importance of each of these types of sequences in filament assembly (2, 5-7).

Although the two keratin types share only 25-30% amino acid homology with one another, the individual members of a single class can be very closely related, as judged by positive hybrid translation (5, 8) and by sequence analyses (4, 9-12). Despite their similarities, many of the keratins of a single type are encoded by separate mRNAs (5, 13-15). Whether multiple mRNAs might arise from a single keratin gene, however, has not been examined in detail.

The level of expression of IF proteins in different cells and tissues varies considerably. Even within a single IF family, such as the keratins, the amounts of these proteins in different epithelia can be dramatically different. In basal epidermal cells, almost 30% of the total protein synthesized is keratin (16). The 50-kDa type I and the 56-kDa type II keratin are major keratins present in these cells. Other pairs of type I and type II keratins are expressed in other epithelia, but they are usually less abundant (2, 4-6).

In this paper, we present the complete nucleotide sequence of the 50-kDa type I human keratin gene expressed in cultured human epidermal cells. An analysis of this gene has enabled us to explore the possibility of multiple initiation or polyadenylylation signals in a keratin gene. A comparison of the 5' regulatory sequences of the keratin gene (this paper) with those of the vimentin gene (17) has revealed a possible explanation for the different levels of expression of these two genes in different tissues.

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

Abbreviations: bp, base pair(s); IF, intermediate filament.

MATERIALS AND METHODS

The 50-kDa type I keratin gene was isolated and characterized as described (18). To obtain the complete sequence of this gene, we first subcloned into plasmid pUC8 two restriction endonuclease fragments that hybridized with a cloned 50-kDa keratin cDNA, KB-2 (3). The 5' end of the gene was sequenced by the method of Maxam and Gilbert (19). For the remainder of the gene, we applied the M13-dideoxy strategy (20) and the shotgun cloning method of Anderson (21) as described (18). For 100% of the coding sequences and for 90% of the intron sequences, multiple and frequently opposite strands were sequenced.

RESULTS

Isolation, Identification, and Complete Nucleotide Sequence of the Human 50-kDa Keratin Gene. Using a 1380-base-pair (bp) cDNA probe complementary to the 50-kDa keratin mRNA (3), we isolated the human gene encoding this keratin. Hybridization studies showed that this is the only human gene bearing a 3' noncoding sequence identical to that of the 50-kDa keratin cDNA (18). To elucidate the complete structure of the 50-kDa keratin gene, we determined the entire nucleotide sequence of the gene and its 5' and 3' flanking regions. An outline of the structure of the gene is illustrated in Fig. 1. The DNA sequence corresponding to the complete mRNA for the 50-kDa keratin is shown in Fig. 2 along with the complete predicted amino acid sequence for the protein. The positions of the introns were determined by comparing the sequence of the gene with that of the previously published cDNA sequence (9). Intron positions are noted by triangles, and the complete sequences of these introns are given in Fig. 3.

FIG. 1. A schematic diagram of the human 50-kDa keratin gene. An outline of the structure of the 50-kDa keratin gene was drawn to scale. The eight exons are represented as boxes and are identified by roman numerals. The introns are represented by the thin connecting lines. kb, Kilobase.

The Transcription and Translation Initiation Sites of the Human 50-kDa Keratin Gene. At a position 85 nucleotides 5' upstream from the putative initiator ATG codon is found the sequence T-A-T-A-A-A (for a review, see ref. 24). Fifty nucleotides downstream from the TATA box, the sequence C-T-T-C-T-G is found. This is a known consensus sequence that frequently appears downstream from the cap site of eukaryotic mRNAs (25).

To determine the precise transcription initiation site, we hybridized human epidermal mRNA with a radiolabeled genomic fragment encompassing 251 nucleotides 5' upstream and 172 nucleotides 3' downstream from the putative site. When the hybrid was subsequently treated with endonuclease S1 to digest the unprotected DNA, a mRNA-protected DNA fragment of exactly 172 nucleotides was generated. DNA sequence analysis of this fragment indicated that the mRNA protection extended exactly to the A-C sequence appearing 26 nucleotides downstream from the TATA box (Fig. 4). The size of the 5' untranslated region is 60 nucleotides, in good agreement with estimates previously obtained for electrophoretic migration of the 50-kDa keratin mRNA (13). The first ATG 3' downstream from the TATA box is homologous to the consensus translation start sequence, A/G-X-X-A-U-G-G (27). The high serine content fits well with the general richness of serine residues at the amino termini of epidermal keratins (28).

The complete gene sequence encodes a polypeptide of 472 amino acid residues (Fig. 2). A molecular mass of 51,591 Da is predicted, which is in good agreement with the size of 50 kDa originally estimated on the basis of NaDodSO4/polyacrylamide gel electrophoresis (16). The predicted amino acid composition matches very well with that determined by chemical means for the 50-kDa keratin (9).

Transcription Polyadenylylation Signals. Only a single polyadenylylation site A-A-T-A-C-A was found at the 3' end of the gene. This site appeared 127 nucleotides (residues 4596-4601, Fig. 2) 3' downstream from the translational stop co-

[...]servation of intron positions in these genes provides a strong indication that the two genes had a common origin. Although intron position is highly conserved among IF genes, both the sizes and the sequences of the introns have diverged considerably. The 5' and 3' noncoding sequences of the transcriptional domain of the genes are also highly divergent.

In addition to the divergence in noncoding sequences of the two genes, the sequences 5' upstream from the genes were also very different. In the keratin gene, three enhancer-like sequences were discovered in the region 5' upstream from the TATA box. In contrast, only a single sequence sharing identity with only six of the eight enhancer core residues was found in the vimentin gene (17). This sequence is located very close to the TATA box (6 bp 5' upstream) and is on the noncoding strand.

Unusual Features Within the Keratin Introns? When the introns of the 50-kDa keratin gene were analyzed for unusual sequences or structural features, few interesting regions were found (Fig. 3). There are no segments in which a long stretch of stable intrastrand secondary structure can be formed. The longest stretch of perfect dyad symmetry in any of the introns consists of only nine potential base pairs. Perfect palindromic sequences are also short (<10 nucleotides). In part, the low degree of potential secondary structure in the keratin introns may be attributed to the small size of the introns.
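The search for perfect palindromic sequences in the introns can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the paper does not say how the search was performed, "palindrome" is taken here to mean a contiguous stretch that equals its own reverse complement (dyad symmetry with no spacer), and the function names and the toy sequence are hypothetical.

```python
from typing import Tuple

# Watson-Crick pairing for unambiguous DNA bases; any other symbol (e.g. N)
# is treated as pairing with nothing.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def longest_palindrome(seq: str) -> Tuple[int, str]:
    """Return (0-based start, subsequence) of the longest perfect palindrome.

    Assumption: a palindrome is a contiguous stretch equal to its own
    reverse complement. Such a stretch always has even length, so the
    search expands outward from every gap between adjacent bases.
    If no palindrome exists, (0, "") is returned.
    """
    seq = seq.upper()
    best_start, best_len = 0, 0
    for center in range(1, len(seq)):
        left, right = center - 1, center
        # Extend while the outer bases are complementary to each other.
        while (left >= 0 and right < len(seq)
               and COMPLEMENT.get(seq[left]) == seq[right]):
            left -= 1
            right += 1
        length = right - left - 1
        if length > best_len:
            best_start, best_len = left + 1, length
    return best_start, seq[best_start:best_start + best_len]


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from the keratin gene.
    start, hit = longest_palindrome("CCGAATTCAT")
    print(start, hit, len(hit))
```

Applied to each intron sequence in turn, a scan of this kind would report the longest such stretch, which can then be compared with the <10-nucleotide figure quoted above. Dyad symmetries separated by a spacer (potential hairpin loops) would need a separate search and are not covered by this sketch.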