Polyserase-I, a Human Polyprotease with the Ability to Generate Independent Serine Protease Domains from a Single Translation Product
Santiago Cal, Víctor Quesada, Cecilia Garabaya, and Carlos López-Otín*

Departamento de Bioquímica y Biología Molecular, Facultad de Medicina, Instituto Universitario de Oncología, Universidad de Oviedo, 33006-Oviedo, Spain

Communicated by Pamela J. Bjorkman, California Institute of Technology, Pasadena, CA, June 4, 2003 (received for review February 21, 2003)

We have identified and cloned a human liver cDNA encoding an unusual mosaic polyprotein, called polyserase-I (polyserine protease-I). This protein exhibits a complex domain organization including a type II transmembrane motif, a low-density lipoprotein receptor A module, and three tandem serine protease domains. This unusual modular architecture is also present in the sequences predicted for mouse and rat polyserase-I. Human polyserase-I gene maps to 19p13, and its last exon overlaps with that corresponding to the 3′ UTR of the gene encoding translocase of mitochondrial inner membrane 13. Northern blot analysis showed the presence of a major polyserase-I transcript of 5.4 kb in human fetal and adult tissues and in tumor cell lines. Analysis of processing mechanisms of polyserase-I revealed that it is synthesized as a membrane-associated polyprotein that is further processed to generate three independent serine protease units. Two of these domains are proteolytically active against synthetic peptides commonly used for assaying serine proteases. These proteolytic activities of the polyserase-I units are blocked by serine protease inhibitors. We show an example of generation of separate serine protease domains from a single translation product in human tissues and illustrate an additional mechanism for expanding the complexity of the human degradome, the entire protease complement of human cells and tissues.

Abbreviations: HA, hemagglutinin; LDLR, low-density lipoprotein receptor; TTSP, type II transmembrane serine protease; ACE, angiotensin-converting enzyme; AMC, 7-amino-4-methylcoumarin.

Data deposition: The sequences reported in this paper have been deposited in the GenBank database (accession nos. AJ488946 and AJ488947).

*To whom correspondence should be addressed. E-mail: [email protected].

www.pnas.org/cgi/doi/10.1073/pnas.1633392100 PNAS | August 5, 2003 | vol. 100 | no. 16 | 9185–9190

Proteases comprise a growing group of structurally and functionally diverse proteins that share a common property: the ability to hydrolyze peptide bonds (1). To date, >500 different human proteases and protease homologs have been identified (2, 3). Functional studies involving this large group of enzymes were originally aimed at analyzing their role in nonspecific reactions associated with protein catabolism. However, over recent years, it has been widely recognized that proteolysis represents a fundamental mechanism for regulating many events on which cell life and death depend. Thus, proteases through their ability to perform highly selective and limited cleavage of specific substrates may control the appropriate intracellular or extracellular localization of multiple proteins, the ectodomain shedding from membrane-bound proteins, or the activation and inactivation of cytokines, growth factors, and peptide hormones (2, 3). These proteolytic processing events influence essential biological processes including cell-cycle progression, morphogenesis, tissue remodeling, cell proliferation, cell migration, angiogenesis, and apoptosis (4–8). Furthermore, alterations in the structure, function, and regulation of proteases underlie several human diseases such as cancer, arthritis, cardiovascular disorders, and neurodegenerative diseases (9–13).

The increasing complexity of human proteolytic systems has made necessary the introduction of global concepts and approaches to try to clarify the multiple questions that have arisen in the protease field. Thus, we have coined the term degradome to define the complete set of proteases that are produced at a specific moment or circumstance by a cell, tissue, or organism (2). Recently, and as part of our studies focused on trying to get a complete view of the human degradome, we observed the presence of three putative serine protease genes closely linked at a region of the short arm of chromosome 19. In this work, we report the finding that these three predicted independent genes are in fact part of a unique gene encoding a human polyprotease that we have tentatively called polyserase-I. We also show that polyserase-I undergoes a series of proteolytic processing events to yield three independent serine protease domains. This is an example of generation of independent serine protease domains in human tissues from a single translation product and illustrates an additional strategy for increasing the complexity of the human degradome.

Materials and Methods

Molecular Cloning. A computer search of human genome databases allowed the identification of three regions in the human DNA contig NT011245 (chromosome 19p13), which after conceptual translation, showed the presence of structural hallmarks for serine proteases. Then, a PCR-based approach to clone cDNAs for these proteases was designed. The strategy first involved the use of specific oligonucleotides derived from the central putative protease gene present in the cluster. The sequences of the designed primers Polys2f, Polys2f-nd, Polys2r, and Polys2r-nd are indicated in Table 1, which is published as supporting information on the PNAS web site, www.pnas.org. The PCR was performed in a GeneAmp 2400 PCR system (Perkin–Elmer Life Sciences) for 40 cycles of denaturation (94°C, 20 s), annealing (64°C, 15 s), and extension (68°C, 60 s). The amplified PCR product (439 bp) from a human liver cDNA library was cloned and sequenced. The 5′ and 3′ ends of the cloned cDNA were extended by successive rounds of rapid amplification of cDNA ends (RACE) by using RNA from human liver and the Marathon cDNA amplification kit (CLONTECH). Sequence analysis of the RACE-extended cDNA clones led us to conclude that the two additional protease genes present at the 5′ and 3′ ends of the central protease gene of the NT011245 DNA contig were in fact part of a single gene encoding a protein with three protease-like domains. Once the start and stop codons of this large gene were located, the full-length cDNA was obtained by PCR using the primers polysATG and polysEND (Table 1). PCR conditions were as above, but with 180 s of extension.

Northern Blot Analysis. Nylon filters containing poly(A)+ RNA of diverse human tissues were prehybridized at 42°C for 3 h in 50% formamide, 5× standard saline phosphate/EDTA (0.18 M NaCl/10 mM phosphate, pH 7.4/1 mM EDTA), 10× Denhardt's solution, 2% SDS, and 100 µg/ml denatured herring sperm DNA. Hybridization was performed with a 0.5-kb radiolabeled probe amplified from the cDNA clone with oligonucleotides northF and northR (Table 1). After hybridization for 20 h under the same conditions used for prehybridization, filters were washed with 0.1× SSC, 0.1% SDS for 2 h at 50°C, and autoradiographed.

Fig. 1. Structure of polyserase-I. TM, transmembrane domain; LDLR, LDLR A module; serase, serine protease domain.
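
The prehybridization buffer in the Northern blot protocol is given as 5× standard saline phosphate/EDTA, with a composition quoted in parentheses. As a quick arithmetic aid, the sketch below scales those values to the working strength. It assumes that the parenthetical composition (0.18 M NaCl, 10 mM phosphate, 1 mM EDTA) defines the 1× buffer; the dictionary and function names are illustrative and do not come from the paper.

```python
# Assumed 1x composition, taken from the parenthetical in the text (molar).
SSPE_1X = {"NaCl": 0.18, "phosphate": 0.010, "EDTA": 0.001}


def scale_buffer(composition, fold):
    """Return each component's concentration (M) at `fold` times 1x strength."""
    return {name: conc * fold for name, conc in composition.items()}


# Working strength used for prehybridization and hybridization (5x).
working = scale_buffer(SSPE_1X, 5)
for name, conc in working.items():
    print(f"{name}: {conc * 1000:.0f} mM")
```

Under that assumption the working buffer contains about 0.9 M NaCl, 50 mM phosphate, and 5 mM EDTA.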
Production and Purification of Recombinant Proteins. Two cDNA constructs encoding the first or the second serine protease domains of the cloned cDNA were generated by PCR amplification using the oligonucleotides mod1f, mod1r, mod2f, and mod2r (Table 1). The PCR amplifications were performed for 25 cycles of denaturation (95°C, 15 s), annealing (59°C, 10 s), and extension (68°C, 50 s). The PCR products were cloned in the SmaI site of the pGEX-3X expression vector (Amersham Pharmacia Biosciences). The resulting vectors, pGEX-3X-serase-1 and pGEX-3X-serase-2, were transformed into BL21(DE3)pLysE Escherichia coli cells, and expression was induced by isopropyl β-D-thiogalactoside. The recombinant proteins were purified on a glutathione-Sepharose 4B column and used for enzymatic assays.

Protease Assays. The putative enzymatic activity of the recombinant GST-serase-1 and GST-serase-2 proteins was analyzed by using the synthetic fluorescent substrates N-t-Boc-Gln-Ala-Arg-7-amino-4-methylcoumarin (AMC), N-t-Boc-Gln-Gly-Arg-AMC, N-t-Boc-Ala-Pro-Ala-AMC, and N-t-Boc-Ala-Phe-Lys-AMC. Routine assays were performed at 37°C at a substrate concentration of 100 µM in an assay buffer containing 50 mM Tris·HCl (pH 8.0), 20 mM NaCl, and 2.5% Me2SO. The fluorometric measurements were made in an MPF-44A Perkin–Elmer spectrofluorometer (λex = 360 nm, λem = 460 nm).

... identification of a DNA contig containing three regions putatively encoding new serine proteases. However, detailed sequence analysis of these three regions revealed that they were very close to each other, suggesting that they could be part of a single gene containing different protease modules. To explore this possibility, we performed PCR amplifications with specific oligonucleotides derived from the central protease module, followed by successive rounds of 5′ and 3′ rapid amplifications of cDNA ends. This strategy led us to generate a cDNA of 3,180 bp, which after cloning and sequencing, revealed the presence of a single RNA transcript containing the coding information for the three predicted serine protease domains (Fig. 1). Computer analysis of this cDNA sequence demonstrated that it codes for a protein of 1,059 aa, with a calculated molecular weight of 114,020. The proposed AUG initiation codon is located in a context (GCCAUGG) that matches perfectly the Kozak consensus sequence for translation initiation (16) and is immediately preceded by an in-frame UGA stop codon, strongly suggesting that this AUG corresponds to the true initiation site for this protein. Domain analysis using the SMART tool and the TMHMM (transmembrane helices Markov model) program indicated that the protein possesses a type II transmembrane segment and a low-density