Protein Science (1996), 5:1001-1013. Cambridge University Press. Printed in the USA. Copyright © 1996 The Protein Society

Derivation of 3D coordinate templates for searching structural databases: Application to Ser-His-Asp catalytic triads in the serine proteinases and lipases

ANDREW C. WALLACE, ROMAN A. LASKOWSKI, AND JANET M. THORNTON

Biomolecular Structure and Modelling Unit, Department of Biochemistry and Molecular Biology, University College, Gower Street, London WC1E 6BT, England

(RECEIVED December 14, 1995; ACCEPTED March 13, 1996)

Abstract

It is well established that sequence templates (e.g., PROSITE) and databases are powerful tools for identifying biological function and tertiary structure for an unknown protein sequence. Here we describe a method for automatically deriving 3D templates from the protein structures deposited in the Brookhaven Protein Data Bank. As an example, we describe a template derived for the Ser-His-Asp catalytic triad found in the serine proteases and triacylglycerol lipases. We find that the resultant template provides a highly selective tool for automatically differentiating between catalytic and noncatalytic Ser-His-Asp associations. When applied to nonproteolytic proteins, the template picks out two “non-esterase” catalytic triads that may be of biological relevance. This suggests that the development of databases of 3D templates, such as those that currently exist for protein sequence templates, will help identify the functions of new protein structures as they are determined and pinpoint their functionally important regions.

Keywords: catalytic triad; cyclophilin; 3D structure; lipases; serine proteinases

Reprint requests to: Janet M. Thornton, Biomolecular Structure and Modelling Unit, Department of Biochemistry and Molecular Biology, University College, Gower Street, London WC1E 6BT, England; e-mail [email protected].

The use of protein sequence motifs and templates as a tool for the identification of biological function and prediction of tertiary structure is already well established (see reviews by Taylor, 1988; Hodgman, 1989; Taylor & Jones, 1991). These templates are in essence a 1D protein sequence signature that is identified by the analysis of information in known protein structures and in data from sequence alignments and pattern-matching techniques. The information is summarized in databases such as PROSITE (Bairoch & Bucher, 1994) and PRINTS (Attwood et al., 1994) which, along with automatic sequence alignment algorithms, enable swift assessment of an unknown protein sequence.

There has been detailed analysis of the 3D topologies of metal-binding sites, both in proteins and in small molecules (for reviews see Glusker, 1991; Jernigan et al., 1994). However, there is not a database of 3D templates of functionally important units in proteins, analogous to the sequence templates of PROSITE. As the number of known protein structures increases, so the need for a 3D equivalent of PROSITE grows with it, especially for identifying likely functions of proteins whose biological role is unknown and, equally usefully, for locating the functional regions and residues involved.

A 3D template can provide a quantitative description of the relative dispositions of, for example, the key residues in an enzyme active site based solely on the coordinates. The template can then be used to scan a database of known protein structures to identify putative catalytic centers.

Here we demonstrate the power of such search templates and the novel information they can provide. We take as our example the Ser-His-Asp catalytic triad of the serine proteases and lipases, which is one of the best known and most intensively studied of all functional mechanisms. An added advantage of using this triad is that there are many examples in the protein structures deposited in the Brookhaven Protein Data Bank (PDB) (Bernstein et al., 1977). The triads are present in several protein families, and the template provides a means of quantifying the differences in the conformational geometry across the different structural and functional families.

In the Ser-His-Asp catalytic triad, the three residues, which occur far apart in the amino acid sequence of the enzyme, come together in a specific conformation in the active site to perform the hydrolytic cleavage of the appropriate bond in the substrate. This triad was first identified in the serine proteinases (Blow et al., 1969; Wright et al., 1969), which cleave peptides at the amide bond. This is a ubiquitous group of proteolytic enzymes responsible for a range of physiological responses, such as the onset of blood clotting (Mann, 1987) and digestion (Blow, 1976). They also play a major role in the tissue destruction associated with arthritis, pancreatitis, and pulmonary emphysema. Each enzyme is highly specific for its own peptide substrate and this specificity is governed by the substrate residue that fits into the P’ subsite, or specificity pocket, immediately adjacent to the scissile bond. Perona and Craik (1995) have recently published a comprehensive review of the structural basis of substrate specificity in serine proteinases.

Barth et al. (1993) have performed an extensive steric comparison of the active site residues in the serine proteases. They analyzed the differences in the relative conformations of the Ser-His-Asp residues by performing RMS fits of all against all occurrences and, on the basis of the differences and similarities, were able to classify the serine proteases according to the chymotrypsin and subtilisin families. As a result of their analyses, they were able to identify an additional serine, which is present, and highly conserved, in the active site of the serine proteases, and suggested it to be part of a “catalytic tetrad.” They also found several examples of Ser-His-Asp triads in nonproteolytic proteins and in approximately similar conformations to that of the serine proteases (Barth et al., 1994).

Artymiuk et al. (1994) have used a graph-theoretic approach for the identification of 3D patterns of amino acid side chains in protein structures. For example, they constructed a search template from the side-chain atoms of the Ser 195-His 57-Asp 102 catalytic triad of chymotrypsin and, depending on the allowed interatomic distance tolerances, different numbers of catalytic triads were identified from their data set. They also identified an unusual triad from the pro-enzymes chymotrypsinogen and trypsinogen, which does not exist in the active form.

A different structural comparison of the serine proteases, using a less specific technique, has been performed by Fischer et al. (1994). Their method, derived from geometric hashing methods used in computer vision research, treats all Cα atoms in a protein as points in space and compares proteins purely on the geometric relationships between these points. It can detect recurring substructural 3D motifs, and was able to identify the structural similarities of the active sites of the trypsin-like and subtilisin-like serine proteases based solely on the similarities of the Cα geometries of their constituent residues.

Apart from the serine proteases, the Ser-His-Asp catalytic triad also occurs in the triacylglycerol lipases, which are respon- [...]

[...] its conformation. Our method differs substantially from previous methods (e.g., Barth et al., 1993; Artymiuk et al., 1994; Fischer et al., 1994) in that a simple template, specific to the Ser-His-Asp catalytic triad, is derived. This template allows such triads to be identified uniquely in other structures and quantifies the differences in the triads from related proteins. The application of this procedure to other systems is discussed.

We used the derived template to search for similar triads in other proteins, including non-enzymes, to see how often they occur outside the serine proteinases and lipases. We found two proteins with the correct triad conformation and these are discussed. Our study demonstrates that 3D templates derived from the PDB can provide interesting suggestions concerning the function of a novel protein once its structure has been determined.

Results

The data sets

Two data sets were used, both extracted from the January 1995 release of the PDB. The first comprised the serine proteinases and lipases used in deriving the Ser-His-Asp 3D templates. The serine proteinases, lipases, and related enzymes were identified on the basis of their Enzyme Classification (E.C.) number (Bielka et al., 1992), which characterizes an enzyme’s function in terms of the reaction it catalyzes, the substrate on which it operates, and any associated co-factors. To ensure the most complete data set, the sequence of each structure in the PDB was cross-referenced against the SWISS-PROT database (Bairoch & Boeckmann, 1994) and the relevant E.C. numbers identified. The resultant data set consisted of 192 serine proteinases, 4 serine-type carboxypeptidases, and 9 triacylglycerol lipases. Some of the enzymes have more than one chain, and therefore more than one catalytic triad. Table 1 lists the data set in terms of all the individual chains: 205 serine proteinases, 7 serine-type carboxypeptidases, and 13 lipases. The chains are grouped into four main fold groups, numbered 1-4, according to their overall structure. This structural classification was achieved using the program SSAP (Orengo et al., 1993), which computes a similarity score (SSAP score) between two proteins; the higher the score, which ranges from 0 to 100, the more similar the overall structures.