The PA Domain: a Protease-Associated Domain
Protein Science (2000), 9:1930–1934. Cambridge University Press. Printed in the USA. Copyright © 2000 The Protein Society

PIERS MAHON¹ and ALEX BATEMAN²
¹ Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, United Kingdom
² The Sanger Centre, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom

(Received December 7, 1999; Final Revision July 20, 2000; Accepted August 3, 2000)

Abstract

We have identified a similarity between the apical domain of the human transferrin receptor and several other protein families. This domain is found associated with two different families of peptidases. Therefore, we term it the PA domain for protease-associated domain. The PA domain is found inserted within a loop of the peptidase domain of family M8/M33 zinc peptidases. The PA domain is also found in a vacuolar sorting receptor and a ring finger protein of unknown function that may be a cell surface receptor. The PA domain may mediate substrate determination of peptidases or form protein–protein interactions.

Keywords: aminopeptidase Y; cell wall-associated serine protease; transferrin receptor; vacuolar receptor

Reprint requests to: Alex Bateman, The Sanger Centre, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom; e-mail: [email protected].

Trafficking of soluble proteins through subcellular compartments is crucial to eukaryotic life. This process depends on integral membrane proteins that act as receptors. These bind lumenal proteins in one subcellular location. After transport of the receptor/lumenal protein complex to the target organelle, the lumenal proteins are released. While studying the structure of a plant-specific vacuolar receptor, BP-80, we found distant homology in one domain to the mammalian transferrin receptor, involved in the endocytosis of transferrin. The occurrence of this domain in such diverse trafficking receptors was striking, and led us to further characterize the domain. The publication of the three-dimensional structure of the ectodomain of the transferrin receptor (Lawrence et al., 1999) allowed us to relate the sequence of these domains to a known structure.

We used the common structural domain between BP-80 and the human transferrin receptor (residues 184–384 of P02786) as a query using PSI-BLAST (Altschul et al., 1997) with an E-value threshold of 0.01. The search converged after eight iterations. Using the BP-80 domain (residues 27–182 of P93484) as a query converges to essentially the same set of proteins. We have called this domain the PA domain for protease-associated domain. Based on the known structure of the transferrin receptor, the PA domain is 170–210 amino acids long and has a β-sandwich structure with two peripheral helices (see Fig. 1A). The PSI-BLAST searches reveal that the PA domain occurs in four distinct families of proteins as described below.

The plant vacuolar receptors homologous to BP-80 (Kirsch et al., 1994; Paris et al., 1997) were found in the second round of searching with an E-value of 0.002. They have a large lumenal region that contains a PA domain at the N-terminus and three EGF domains preceding the C-terminal transmembrane helix and the short, tyrosine-targeting motif-containing, cytosolic region. These proteins bind the NPIR motif of soluble proteins destined for the lytic vacuole (Kirsch et al., 1994, 1996). The protein is found in the prevacuolar compartments of plant cells (Sanderfoot et al., 1998). There are at least 10 of these receptors in Arabidopsis thaliana, and they are conserved in both monocot and dicot plants, suggesting an important role for this family.

We found that C-RZF (Tranque et al., 1995) contains an N-terminal PA domain, separated by a transmembrane helix predicted with Tmpred (Hofmann & Stoffel, 1993), from a previously recognized C-terminal ring finger domain. Members of this family were found in round 5 of the PSI-BLAST search with an E-value of 0.005. Members of the RZF family are found in fungi, plants, and metazoa. Every member has a signal peptide predicted by SignalP (Nielsen et al., 1999). The PA domain from this family is more closely related to that from the plant vacuolar receptors, suggesting that the plant-specific vacuolar receptors may have evolved from this family. We predict the PA domain to be extracellular and the ring finger intracellular. This conflicts with Tranque et al. (1995), who suggest that C-RZF is a soluble nuclear protein. However, the differential sedimentation method used in that paper to support nuclear localization does not distinguish nuclear and membrane locations. As we predict C-RZF to be an integral membrane protein, not soluble, Triton X-114 phase partitioning would resolve this conflict (Brusca & Radolf, 1994), as our structure predicts C-RZF to enrich in the Triton detergent phase, in contrast to Tranque et al.'s transcription factor hypothesis, which would enrich in the aqueous phase.

The structure of the transferrin receptor ectodomain has recently been published (Lawrence et al., 1999) and shows that the receptor contains three structural domains. The first domain is an inactive protease related to known M8/M33 amino- and carboxypeptidases. Inserted within a loop in the protease domain is the apical domain. The third all-helical C-terminal domain is involved in receptor dimerization (Lawrence et al., 1999). The apical domain is equivalent to the PA domain (see Fig. 1B,C) and allows us to define the domain boundaries of the PA domain. Rawlings and Barrett have examined the domain structure of the related glutamate carboxypeptidase/prostate-specific membrane antigen protein (PSM) and call the PA domain domain D (Rawlings & Barrett, 1997). PSM has two known splice variants: one is membrane bound, whereas the other is cytosolic (Su et al., 1995). This is the only clear example of the PA domain occurring in the cytosol. The cytosolic form may predispose cells to localized folate deficiency, a possible factor in prostate cancer progression (Heston, 1997).

The PA domain is found in the Pyrolysin family of subtilases (Siezen & Leunissen, 1997), which includes bacterial endopeptidases such as C5a protease, involved in immune response evasion, and plant subtilases such as cucumisin that are involved in plant pathogen defense and development. Members of this family were found in the first iteration of PSI-BLAST with an E-value of 0.0001. The PA domain is equivalent to the I domain of Siezen (Siezen, 1999) and vr13 of Siezen and Leunissen (1997). Removal of the PA domain in the Lactococcus lactis cell-envelope proteinase changed the caseinolytic specificity of the enzyme (Bruinenberg et al., 1994). They suggest that this region is involved in substrate specificity.

In addition to the above four families, the PA domain was found in a protein from Deinococcus radiodurans (DRA0325) where it is N-terminal to a region homologous to the Peptide-N4-(N-acetyl-β-D-glucosaminyl) asparagine Amidase F (PNGase F). The function of PNGase F is to cleave the β-aspartoglucosylamine bond of N-linked glycans. The PA domain is found in isolation in the Drosophila protein CG9849 and in association with a glycosyl hydrolase of family 47 in CG5682. The function of these Drosophila proteins awaits experimental determination.

We propose that the PA domain is a novel protein–protein interaction domain for the following reasons: (1) The linkage via the PA domain of the two trafficking receptors BP-80 and the transferrin receptor suggests this domain is responsible for their conservation of function, i.e., binding soluble proteins. (2) The proposed model for transferrin binding to its receptor (Lawrence et al., 1999) uses a large patch of the PA domain for the binding. (3) In subtilases, the PA domain has been shown to be a determinant of protease specificity, but its presence is not essential for catalytic activity (Bruinenberg et al., 1994).

tilases and in trafficking receptors would predict the following results: (1) Removal of a PA domain from an active protease should

Fig. 1. A: The structure of the PA domain. The structure was drawn using MOLSCRIPT (Kraulis, 1991) on Protein Data Bank file 1CX8. Secondary structures are shown in ribbon representation with those aligned in (B) shaded black and marked with designations from Lawrence et al. (1999). The dotted line represents the apical traverse that is found to be variable in length. B: A multiple alignment of the core conserved regions of representative PA domains. We constructed a multiple alignment of PA domains with CLUSTALW followed by manual adjustment. Although we know the structure of the whole domain, only the regions that can be confidently aligned are shown. The first column gives the SWISS-PROT or TrEMBL name. The TrEMBL names are followed by a five-letter SWISS-PROT species designation. The names are followed by the start and end points of the alignment in the whole sequence. Conserved residues in the alignment are highlighted using the CLUSTALX color scheme from the JALVIEW program (M. Clamp, pers. obs.) in default mode. The secondary structure is marked with an arrow for β-strands and a cylinder for α-helix; the designations for secondary structures are taken from Lawrence et al. (1999). Numbers in brackets indicate the length of insertions not shown in the alignment. C: A schematic representation of the domain organization of PA domain containing proteins. The SWISS-PROT or SP-TrEMBL accession number for a representative of each domain organization is shown in brackets. The species distribution of each group is also shown. The boundaries of the domains in this figure may differ from those of the structural domain. (Figure continues on next page.)
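The query regions above are quoted as residue ranges within a database entry (for example, residues 184–384 of P02786), and the Fig. 1B caption describes sequence names followed by start and end points. As an illustration only, the following Python sketch shows one way such coordinates could be handled. The "NAME/start-end" label layout, the helper names parse_region, extract_region and region_length, and the toy sequence are assumptions made for this example; they are not taken from the paper or from any of the programs it cites.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    """A named sequence region with 1-based, inclusive coordinates."""
    name: str
    start: int
    end: int


# Assumed (hypothetical) label layout: "NAME/start-end".
# The paper does not state the exact format used in its alignment.
_LABEL = re.compile(r"^(?P<name>[^/\s]+)/(?P<start>\d+)-(?P<end>\d+)$")


def parse_region(label: str) -> Region:
    """Split a label such as 'P02786/184-384' into name, start and end."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized region label: {label!r}")
    start, end = int(match["start"]), int(match["end"])
    if start < 1 or end < start:
        raise ValueError(f"invalid coordinates in {label!r}")
    return Region(match["name"], start, end)


def region_length(region: Region) -> int:
    """Number of residues covered by an inclusive range."""
    return region.end - region.start + 1


def extract_region(sequence: str, region: Region) -> str:
    """Return the residues of `sequence` covered by `region`."""
    if region.end > len(sequence):
        raise ValueError("region extends past the end of the sequence")
    # Convert 1-based inclusive coordinates to a 0-based half-open slice.
    return sequence[region.start - 1:region.end]


if __name__ == "__main__":
    query = parse_region("P02786/184-384")
    print(query.name, region_length(query))  # prints: P02786 201

    # Toy sequence for demonstration only; not a real protein.
    toy = "MKTAYIAKQR"
    print(extract_region(toy, Region("toy", 3, 6)))  # prints: TAYI
```

With inclusive coordinates, the transferrin receptor query region spans 384 − 184 + 1 = 201 residues, which falls inside the 170–210 residue length quoted for the PA domain.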