
Spatial Encoding Scheme for Protein Supporting Structure Discovery

Conference Paper · June 2009 DOI: 10.1109/BIBE.2009.51 · Source: DBLP


All content following this page was uploaded by Yu-Feng Huang on 29 May 2014.

2009 Ninth IEEE International Conference on Bioinformatics and Bioengineering


Yu-Feng Huang
Department of Computer Science and Information Engineering
National Taiwan University
Taipei, Taiwan 106, Republic of China
[email protected]

Chia-Jui Yang/Yi-Wei Yang/Chien-Kang Huang*
Department of Engineering Science and Ocean Engineering
National Taiwan University
Taipei, Taiwan 106, Republic of China
{r95525053, r94525054, ckhuang}@ntu.edu.tw
(*corresponding author)

Abstract-Protein function is highly correlated with three-dimensional conformation, because a protein must adopt a suitable shape to interact with other proteins, ligands, substrates, inhibitors, etc. Binding sites and protein function have been studied to understand the mechanism of protein activity, and protein stability and flexibility play different roles in protein function. In this work, we propose a framework to discover protein supporting structures, which employs the signatures of local structures and mines the rigid regions from the same or similar proteins. Through experiments and discussion, we heuristically determined the most effective encoding method. We then applied this encoding method to discover protein supporting structures by identifying stable regions. Our results reveal that supporting structures can be discovered in selected enzyme families. Moreover, the reasons for the existence of a supporting structure vary between protein families.

Keywords-spatial encoding scheme; neighborhood residues sphere; protein supporting structure discovery

I. INTRODUCTION

Recognizing the similarity of a protein pair is a fundamental issue in modern molecular biology. A protein structure comparison algorithm identifies the maximum number of equivalent Cα (alpha carbon) atoms upon which the three-dimensional structures of the protein pair can be optimally aligned. As protein structure comparison is an NP-hard problem [1], most scientists propose heuristic approaches to approximate the optimal solution. Therefore, in order to detect fold similarity and the functional or evolutionary relationship between proteins, most global structure comparison algorithms use heuristics in the initial alignment phase to quickly sieve out good candidates and then refine these candidate solutions by iterative dynamic programming to obtain optimized solutions [2]. However, with the increasing number of protein structures, applying a general protein structure comparison tool to a one-against-all search of the whole Protein Data Bank (PDB) [3] is a great challenge.

Indexing techniques that handle the whole protein structure dataset to reduce comparison time have been investigated [4-8]. With the fast growth of protein structures in PDB (56,217 structures as of March 3, 2009), protein structure database search becomes more and more critical for protein structure analysis [9]. Even now, comparing against the whole PDB takes more than 30 minutes via the VAST [10] online search service (http://structure.ncbi.nlm.nih.gov/Structure/VAST/vast.shtml) or about 3 to 6 minutes via EBI-SSM [11] with multiple computing nodes (http://www.ebi.ac.uk/msd-srv/ssm/). Like web page search with Google, protein structure database search aims to report which protein structures are similar to the query structure (discovery) and how similar they are (ranking).

Based on functional annotations such as Gene Ontology (GO) [12] and ENZYME [13], proteins can be grouped by their main biochemical reactions or functionality. The GO project provides three structured controlled vocabularies to describe gene products in terms of their associated biological process, cellular component, and molecular function. ENZYME provides a hierarchical classification that clusters proteins by the biochemical reactions they catalyze and labels these proteins with Enzyme Commission (EC) numbers. An EC number is a hierarchical set of four numbers, written like an IP address: the first number refers to the enzyme class; the second refers to the type of bond or functional group acted on; the last two refer to specific details about reactions and substrates. These function annotations give us good material for studying the relationship between protein structures and protein functions.

Jones' work reveals that the tertiary structure of a protein family is much more conserved than the sequences of the proteins within the family [14]. As protein function is highly correlated with protein structure, local structure has become a focus of protein function analysis [15, 16]. Previous research pointed out that protein function is associated with local structure. Binkowski et al. applied a computational approach to locate pockets or voids of protein structures, including the surface templates of CASTp [17, 18]. They also provide the online service pvSOAR [19] to identify similar surface regions in three-dimensional protein structures. Porter et al. manually curated functional site residues of enzymes as the template library of the Catalytic Site Atlas (CSA) [20]; furthermore, they applied PSI-BLAST to identify functional residues based on the collected dataset and thereby expand the template library semi-automatically. This evidence shows that protein function is highly correlated with protein structure, especially the local structure at the protein surface.
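The four-level EC hierarchy described above can be made concrete with a small parser. The following is only an illustrative sketch: the names `ECNumber`, `parse_ec`, and `shared_levels` are hypothetical and not part of the paper, and the code assumes fully specified numeric EC numbers (no `-` placeholders).

```python
from typing import NamedTuple


class ECNumber(NamedTuple):
    """Four hierarchical levels of an EC number (field names are illustrative)."""
    enzyme_class: int   # first level, e.g. 1 = oxidoreductases, 3 = hydrolases
    second: int         # type of bond or functional group acted on
    third: int          # reaction and substrate detail
    fourth: int         # reaction and substrate detail


def parse_ec(ec: str) -> ECNumber:
    """Split a dotted EC number such as '1.14.13.39' into its four levels."""
    parts = ec.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated levels, got {ec!r}")
    return ECNumber(*(int(p) for p in parts))


def shared_levels(a: str, b: str) -> int:
    """Number of leading levels two EC numbers have in common (0 to 4)."""
    n = 0
    for x, y in zip(parse_ec(a), parse_ec(b)):
        if x != y:
            break
        n += 1
    return n


print(parse_ec("1.14.13.39"))
print(shared_levels("3.4.22.28", "3.4.21.5"))
```

Two chains belong to the same enzyme family in the sense used here when all four levels agree, that is, when `shared_levels` returns 4.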

978-0-7695-3656-9/09 $25.00 © 2009 IEEE. DOI 10.1109/BIBE.2009.51

From the historical lock-and-key model to the modern concept of induced fit, the connection between protein function and protein flexibility has been established. In order to achieve multiple functions, a protein remains flexible so that it can interact with different substrates or ligands. On the other hand, in order to support the protein functional site, there exist rigid regions close to the functional residues that form a supporting structure. Scheeff et al. found evidence of structural evolution in the kinase-like protein superfamily, including a supporting structure that stabilizes the binding pocket [21]. Chatani et al. reviewed the enzyme bovine pancreatic ribonuclease A (RNase A) and concluded that residues play different roles in forming the stable structure, the catalytic site, or the substrate binding sites [22]. Structure conservation has also been reported in the literature, for example in the cytochromes P450 (CYP450) [23, 24]. In addition, protein stability is an issue that accompanies protein flexibility, because both are required for protein function.

In this article, we propose a framework to discover the protein supporting structures of an enzyme family by encoding the 3D positions of amino acids in a local structure into one-dimensional structural signatures. Different encoding schemes were analyzed to find the most appropriate digest for fast similar-structure search. A supporting structure is composed of conserved structural signatures, that is, the structural signatures that are most common among the proteins. The connections between supporting structure, protein function, protein stability, and protein flexibility can be observed from proteins in the same enzyme family. Moreover, the mechanism of protein activity could be inferred from the discovered supporting structures.

II. MATERIALS AND METHOD

In this article, we discover protein supporting structures by identifying rigid structures among proteins grouped by enzyme classification, one of the hierarchical functional classifications. The material for this framework is the set of protein structures collected from PDB according to enzyme classification, described in detail in the following subsections. Next, we present our framework for discovering the supporting structure of proteins in an enzyme family. As shown in Figure 1, the overall framework consists of three major components: (1) local structure construction; (2) 1D structural signature encoding; (3) signature clustering and supporting structure discovery.

Figure 1. Flowchart of supporting structure discovery. [Flowchart boxes: A Set of Protein Structures; Neighborhood Residues Sphere Recognition; 1D Structural Signature Encoding; Signature Clustering & Supporting Structure Discovery.]

A. Dataset

According to the ENZYME data bank, we first collect proteins labeled with EC numbers from the SWISS-PROT database [25]. For each protein, structure information can be traced to PDB via the structure annotation in the database cross-references of the SWISS-PROT data. Therefore, the network between EC number, SWISS-PROT ID and PDB ID can be constructed via cross-reference information across multiple databases. In this article, we focus on enzyme families of EC class 1, oxidoreductases, and EC class 3, hydrolases, because these two classes are the majority among all six EC classes. For each family, we select high quality structures with resolution better than 2.0 Å, and only enzyme families with more than three protein chains are used to discover protein supporting structures.

B. Local Structure Construction

We now define the local structure representation for a protein three-dimensional structure. Our original idea comes from the neighbor string (NSr) developed by Jonassen et al. [26]. This string encodes all residues in the structure that are within a distance of d Å from residue r (d = 10 by default), including r itself, listed from the N-terminal to the C-terminal. NSr was originally used to mine structure motifs in PDB. The authors used NSr to represent a structure motif and used the support k of structure occurrences to decide which NSr is a significant structure motif. In addition, NSr is represented as a regular expression that encodes gap information. In this article, we redefine NSr as the neighborhood residues sphere (NRS), which also includes structure coordinate information; therefore, the NRS contains local structure information together with its sequence. Thus, the NRS has compact spatial conformation and gapped sequence information. As shown in Figure 2, if G is the center and the radius is 10 Å, the residues within the gray part form the neighborhood within 10 Å of the central residue.

Figure 2. Neighborhood residues sphere in two-dimensional (2D) sketch. The gray part is the area within 10 Å radius surrounding a central residue G.

C. Spatial Geometry Encoding for Protein Local Structure

To discover the relationship between protein function and protein local structure, local structures must be compared to determine their similarity, but determining structure similarity pairwise by geometric hashing is an exhaustive task on a large-scale dataset. In order to reduce comparison time, we transform the three-dimensional spatial information of a local structure into a one-dimensional structural signature, a bit stream of binary values. As shown in Figure 3, the coordinate system is determined by five consecutive residues: the central residue (Rc) and its four nearest neighbor residues (Ra, Rb and Rd, Re) form a coordinate system with three axes, X-axis, Y-axis, and Z-axis. The X-axis, Vector-1, is the vector sum of RcRa and RcRb. Vector-2 is the vector sum of RcRe and RcRd. Vector-3, the Y-axis, is the cross product of Vector-1 and Vector-2. Furthermore, the Z-axis, Vector-4, is the cross product of the X-axis and the Y-axis. Using this coordinate transformation, we compare cube and sphere models to study the effect of the spatial encoding scheme: (1) 2 Å cube bin, (2) 2.5 Å default layer with 1 Å buffer layer, (3) 1 Å default layer without buffer layer, and (4) 1 Å default layer with 1 Å buffer layer.

Figure 3. Coordinate system construction. Each NRS should have at least 5 amino acids, and the coordinate system is constructed from the central residue Rc, the vector sum of RcRa and RcRb, and the vector sum of RcRe and RcRd. [Axis labels: X-axis (Vector-1), Vector-2, Y-axis (Vector-3), Z-axis (Vector-4).]

In Figure 4, (a) is a diagram of the cube model with 2 Å buckets that encode the three-dimensional space of the NRS, and (b) is a diagram of the sphere model with 1 Å default layer and 1 Å buffer layer; the default layer is drawn as a solid line and the buffer layer as a dashed line. The buffer layer is designed to provide fault tolerance when a point is located near or on the boundary of a default layer bucket. Finally, we have 1000 bits for the 2 Å cube bin encoding scheme (125 * 8 octants = 1000 buckets), 64 bits for the 2.5 Å default layer with 1 Å buffer layer encoding scheme ((4 + 4) * 8 octants = 64 buckets), 80 bits for the 1 Å default layer without buffer layer encoding scheme (10 * 8 octants = 80 buckets), and 160 bits for the 1 Å default layer with 1 Å buffer layer encoding scheme ((10 + 10) * 8 octants = 160 buckets). If a Cα atom is in a bucket, the corresponding bit is set to 1; otherwise it is set to 0.

Figure 4. Encoding scheme of neighborhood residues sphere. (a) Cube model of the 2 Å cube bin. (b) Sphere model of the 1 Å default layer with 1 Å buffer layer. [Panel (b) labels: layers 0Å-1Å, 0.5Å-1.5Å, 1Å-2Å, 1.5Å-2.5Å with example signature 1010 1001 0001 0100.]

D. Signature Clustering and Supporting Structure Discovery

For signature clustering, we simply group spheres whose structural signatures match exactly. Exact matching is sufficient because the proposed encoding scheme already tolerates coordinate rotation and bucket boundary effects. In order to reduce the comparison time of discovering similar NRSs, each protein structure is decomposed into one-dimensional signatures that represent the three-dimensional space of its local structures. The coverage of each signature is measured by SigCov, the coverage ratio defined in (1).

$$\mathrm{Sig}_{\mathrm{Cov}} = \frac{\text{\# of distinct protein chains with this signature}}{\text{\# of protein chains in this EC}} \tag{1}$$
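The local frame and the sphere model of Section II.C (1 Å default layers with 1 Å buffer layers) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical; Ra, Rb are taken to be the two residues preceding Rc in sequence and Rd, Re the two following it; buffer layers are assumed to be shells shifted by half a layer (0.5 to 1.5 Å, 1.5 to 2.5 Å, and so on); the bit layout (default layers first, then buffer layers, eight octant bits per layer) is an arbitrary choice; and the central residue itself is skipped because its octant is undefined.

```python
import math


def sub(p, q):
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])


def add(p, q):
    return (p[0] + q[0], p[1] + q[1], p[2] + q[2])


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def unit(v):
    n = math.sqrt(dot(v, v))
    return (v[0] / n, v[1] / n, v[2] / n)


def local_frame(ra, rb, rc, rd, re):
    """Axes built from five consecutive C-alpha positions, following Figure 3."""
    x = add(sub(ra, rc), sub(rb, rc))    # Vector-1 (X-axis): RcRa + RcRb
    v2 = add(sub(re, rc), sub(rd, rc))   # Vector-2: RcRe + RcRd
    y = cross(x, v2)                     # Vector-3 (Y-axis)
    z = cross(x, y)                      # Vector-4 (Z-axis)
    return unit(x), unit(y), unit(z)


def encode_nrs(ca, i, radius=10.0, layer=1.0, buffered=True):
    """Bit-string signature of the NRS centred on residue i of a C-alpha trace."""
    if i < 2 or i + 2 >= len(ca):
        raise ValueError("the central residue needs two neighbours on each side")
    center = ca[i]
    frame = local_frame(ca[i - 2], ca[i - 1], center, ca[i + 1], ca[i + 2])
    n_layers = int(radius / layer)
    bits = [0] * (n_layers * 8 * (2 if buffered else 1))
    for j, p in enumerate(ca):
        v = sub(p, center)
        d = math.sqrt(dot(v, v))
        if j == i or d > radius:
            continue                                   # outside the sphere
        lx, ly, lz = (dot(v, axis) for axis in frame)  # local coordinates
        octant = (lx >= 0) + 2 * (ly >= 0) + 4 * (lz >= 0)
        k = min(int(d / layer), n_layers - 1)          # default layer index
        bits[k * 8 + octant] = 1
        if buffered and d >= layer / 2:
            b = min(int(d / layer - 0.5), n_layers - 1)  # buffer layer index
            bits[(n_layers + b) * 8 + octant] = 1
    return "".join(map(str, bits))


# Toy C-alpha trace (made-up coordinates in angstroms), centred on residue 2.
trace = [(2, 0, 0), (1, 0, 0), (0, 0, 0), (0, 1, 0), (0, 2, 0), (1.2, 0.3, -0.4)]
signature = encode_nrs(trace, 2)
print(len(signature), signature.count("1"))
```

Because the axes are derived from the residues themselves, rigidly rotating the whole trace leaves the signature unchanged, which is the rotation tolerance that exact signature matching relies on. A frame whose Vector-1 and Vector-2 are parallel is degenerate and is not handled in this sketch.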

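The exact-match grouping and the coverage ratio in (1) amount to counting, for each signature, the distinct chains in which it occurs. A minimal sketch, assuming each NRS has already been reduced to a `(chain_id, signature)` pair; the function name, chain identifiers, and signatures below are made up for illustration:

```python
from collections import defaultdict


def signature_coverage(records, n_chains):
    """Map each signature to Sig_Cov, the fraction of chains that contain it.

    records  -- iterable of (chain_id, signature) pairs, one per NRS
    n_chains -- number of protein chains in the EC family
    """
    chains_by_signature = defaultdict(set)
    for chain_id, signature in records:
        # Exact signature match is the clustering criterion; using a set means
        # repeated occurrences within one chain are counted once.
        chains_by_signature[signature].add(chain_id)
    return {sig: len(chains) / n_chains
            for sig, chains in chains_by_signature.items()}


records = [
    ("chain_1", "1010"), ("chain_2", "1010"), ("chain_3", "1010"),
    ("chain_1", "0110"), ("chain_1", "0110"), ("chain_2", "0110"),
]
coverage = signature_coverage(records, n_chains=3)
print(coverage)
```

In this toy input the signature `1010` occurs in all three chains, while `0110` occurs in two of them even though it appears three times in total.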
Further, the supporting structure is seeded with the Top-1 conserved signature and then extended with less conserved Top-K signatures from the candidate signature pool that satisfy the threshold. In each run, a candidate signature is selected as part of the supporting structure if the decrease in coverage is acceptable. The threshold for keeping the coverage of the supporting structure is determined by SSCov, defined in (2).
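One possible reading of this selection loop is sketched below. It assumes that SSCov of a candidate is its coverage divided by the coverage of the Top-1 signature, that candidates are visited in order of decreasing coverage, and that selection stops at the first candidate that fails the cutoff; the 75% default mirrors the value reported in the results section. The function name and the toy coverages are hypothetical.

```python
def select_supporting_signatures(coverage, threshold=0.75):
    """Return signatures kept in the supporting structure, most conserved first.

    coverage  -- dict mapping signature -> Sig_Cov
    threshold -- a signature is kept while its SS_Cov is higher than this value
    """
    ranked = sorted(coverage.items(), key=lambda item: item[1], reverse=True)
    if not ranked:
        return []
    top1_cov = ranked[0][1]
    selected = []
    for signature, cov in ranked:
        ss_cov = cov / top1_cov      # relative coverage with respect to Top-1
        if ss_cov <= threshold:
            break                    # coverage has dropped too far
        selected.append(signature)
    return selected


toy = {"a": 1.0, "b": 0.8, "c": 0.7, "d": 0.9}
print(select_supporting_signatures(toy))
```

With these toy coverages the signatures `a`, `d`, and `b` are kept and `c` is rejected, since 0.7 / 1.0 is not higher than 0.75.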

$$\mathrm{SS}_{\mathrm{Cov}} = \frac{\mathrm{Sig}_{\mathrm{Cov}} \text{ of Top-}K\text{ signature}}{\mathrm{Sig}_{\mathrm{Cov}} \text{ of Top-1 signature}} \tag{2}$$

The supporting structure here acts as an assistant that supports the flexible regions, allowing them to remain flexible while interacting with ligands or substrates.

Figure 5. Performance of spatial encoding scheme.

Figure 6(a). Enzyme Class 1: Oxidoreductases.

III. RESULTS

In this section, we report how the encoding scheme affects the similarity of local structures that share the same signature. Furthermore, we select several cases from enzyme families to illustrate how the supporting structure plays a role in interacting with substrates or ligands.

A. Performance of Structural Signature Encoding

Because comparing local structures with currently available structure comparison approaches on a huge dataset is a time-consuming task, our idea is to transform the 3D spatial information into 1D fixed-length bit words to reduce comparison time. In order to evaluate the quality of the signature encoding scheme, we define the GH-score, which measures the matching rate between NRSs, in (3).

$$\text{GH-score} = \frac{\text{\# of matched residues}}{\mathrm{MinimumSize}(\mathrm{NRS}_1, \mathrm{NRS}_2)} \tag{3}$$
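As a reading aid for (3), the sketch below computes the GH-score of one NRS pair and then applies the all-pairs threshold test and the SigQ ratio that the paper defines next. It assumes the number of matched residues for each pair is already available from some pairwise comparison, which the sketch does not implement, and that "passing" a threshold means being greater than or equal to it. All function names and numbers are illustrative.

```python
from itertools import combinations


def gh_score(n_matched, size_a, size_b):
    """Matched residues divided by the size of the smaller NRS."""
    return n_matched / min(size_a, size_b)


def is_qualified(nrs_sizes, matched, threshold=0.75):
    """True if every pair of NRSs sharing one signature passes the threshold.

    nrs_sizes -- residue counts of the NRSs that share the signature
    matched   -- dict {(i, j): matched residue count} for every pair i < j
    """
    return all(
        gh_score(matched[(i, j)], nrs_sizes[i], nrs_sizes[j]) >= threshold
        for i, j in combinations(range(len(nrs_sizes)), 2)
    )


def sig_q(qualified_flags):
    """Fraction of candidate signatures that are qualified."""
    flags = list(qualified_flags)
    return sum(flags) / len(flags) if flags else 0.0


sizes = [10, 12, 11]
matched = {(0, 1): 9, (0, 2): 8, (1, 2): 10}
print(is_qualified(sizes, matched, threshold=0.75))
print(is_qualified(sizes, matched, threshold=0.85))
```

In this toy case the three pair scores are 0.9, 0.8, and about 0.91, so the signature qualifies at a threshold of 0.75 but not at 0.85.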

GH-score is the ratio of the number of matched residues to the size of the smaller of the two NRSs, and it is used to evaluate the comparison quality of paired NRSs. A qualified signature is defined as a signature for which all pairs of NRS comparisons pass the specified GH-score threshold. Hence, we define SigQ in (4) to evaluate the quality of the signatures of each enzyme family.

$$\mathrm{Sig}_{Q} = \frac{\text{\# of qualified signatures}}{\text{\# of candidate signatures}} \tag{4}$$

SigQ is the number of qualified signatures over the number of candidate signatures, where a candidate signature is a signature with more than three NRSs. As shown in Figure 5, we use SigQ to measure the four encoding schemes, with 0.75 as the minimum GH-score. In the experimental results, we find that the encoding scheme of 1 Å default layer with 1 Å buffer layer has better quality than the others. The cube model fails because of signature sparseness. Therefore, the encoding scheme of 1 Å default layer with 1 Å buffer layer was chosen to encode protein local structure for protein supporting structure discovery. In Figure 6, the X-axis is the selected enzyme families and the Y-axis is SigQ at GH-score thresholds of 75%, 80%, 85%, 90%, and 95%, respectively. As shown in Figure 6, while the GH-score threshold ranges from 75% to 90%, SigQ is above 95% on average. This means that the designed structural signature effectively clusters similar local structures.

Figure 6(b). Enzyme Class 3: Hydrolases.

Figure 6. Quality evaluation of structure signature design of selected enzyme families.

B. Supporting Structure Discovery

The idea of discovering the protein supporting structures of a protein family comes from the stability of the protein functional site. In order to bind similar but different ligands or substrates, a protein structure maintains some level of flexibility, accompanied by a highly stable region that supports the functional site. If SSCov is higher than 75%, we include the signature in the supporting structure. According to our observations, the discovered supporting structure is usually located near the functional site. Based on these supporting structures, we can further find low-stability regions that interact with substrates or ligands.

C. Case Study

Enzyme classification is one of the functional annotation classifications; it groups proteins by their biochemical reactions and functions. Proteins of an enzyme family come from different species and have similar function. In this section, we introduce proteins with discovered supporting structures from selected enzyme families. TABLE I lists the selected enzyme families. For each enzyme family, we select representative structures to illustrate the location of the discovered supporting structure, ligands, and substrates. Ligands and substrates are shown in CPK model and supporting structures, colored pink, are shown in wireframe model in all figures rendered by JMol (http://jmol.sourceforge.net/). In Figure 8 and Figure 9, the left column shows the position of the discovered supporting structure, colored in hotpink, and the ligands or substrates; the center column shows the positions of ligands/substrates and functional key residues, colored in blue; and the right column shows protein flexibility by temperature factor (Debye-Waller factor, or B factor).

TABLE I. SELECTED REPRESENTATIVE STRUCTURES WITH DISCOVERED SUPPORTING STRUCTURE.

EC          UniProt ID   Species             PDB ID
1.14.13.39  NOS3_HUMAN   Homo sapiens        1m9m:A
1.14.13.39  NOS1_RAT     Rattus norvegicus   2g6i:A
1.14.13.39  NOS3_BOVIN   Bos taurus          1d0c:A
3.4.22.28   POLG_HAVLA   Human virus         2h9h:A
3.4.22.28   POLG_HAVMB   Human virus         2cxv:A
3.4.22.28   POLG_HAVHM   Human virus         2h6m:A

Figure 7. Multiple structure alignment of protein structures of 1d0c:A, 1m9m:A, and 2g6i:A by EBI-SSM.

EC 1.14.13.39 is the enzyme family of nitric-oxide synthase (NOS). NOS catalyzes the formation of NO from oxygen and arginine; it is a very complex enzyme containing several cofactors and a heme group that is part of the catalytic site. All NOSs require NADPH and O2 as co-substrates in the reaction. According to the observation of three representative protein structures aligned by EBI-SSM in Figure 7, there exist disordered or highly flexible regions in the protein structures.
In Figure 8, 1m9m:A, 2g6i:A, and 1d0c:A are the selected representative protein structures, each of which contains the discovered supporting structure. This supporting structure contains two highly stable regions close to .

EC 3.4.22.28 is the enzyme family of picornain 3C. In EC 3.4.22.28, 2cxv:A, 2h6m:A, and 2h9h:A are selected as representative structures, and there appears to be a hinge point between the β-hairpin and the discovered supporting structure, as shown in Figure 9. This means that, based on our observation, the β-hairpin structure is more flexible than the supporting structure. Yin et al. [27] pointed out that the β-hairpin structure of anti-parallel β-sheets has a flexible conformation.

IV. DISCUSSION

As is well known, comparing protein structures pairwise in a huge dataset is a time-consuming task; therefore, our solution is to encode spatial information into binary signatures as an alternative that reduces comparison time. The critical issue in encoding protein structure with a one-dimensional structural signature is to keep the structure information without too much information loss. The sphere model digests the spatial atom coordinates more precisely than the cube model. According to our experiments and our evaluation of the relationship between signature and local structure similarity, the buffer layer provides tolerance when residues fall on the boundary of the default layer. In the sphere model, bucket size is another important issue in describing the distribution of atoms in three-dimensional space. In addition, the quality of the structural signature is promising: we found that SigQ is 90% when the GH-score is higher than 90%.

Our proposed framework can help discover supporting structures for selected enzyme families. After the rigid structure is identified, we can observe and try to infer the relationship between functional site, supporting structure, and structure flexibility. According to our observations, the supporting structure can be discovered in any location of the protein structure, depending on the mechanism of protein activity. The stability and flexibility of a protein affect its binding affinity, which also reflects the mechanism of the protein's biochemical reaction. In the case studies, we found that the supporting structure can be located in the tong heads to clamp substrates, at the bottom of the functional site to support substrates, or near a hinge point to support regions with low stability.

V. CONCLUSIONS

This article presented a method to discover supporting structures by detecting rigid structures among proteins of the same function. Conformational flexibility allows a protein to interact with different ligands, substrates, or metal ions. The regions with high stability can sustain the ligands or the substrates; therefore, rigid structure identification is an important step in understanding how a protein works. Furthermore, the regions with high stability are a good basis for observing the link between functional site and protein local structure. Protein function therefore requires both structure flexibility and structure stability. In our observations, we found that a protein structure should maintain a degree of flexibility to interact with ligands or substrates of different sizes; there are stable regions that clamp or support ligands or substrates and flexible regions that accompany the stable regions. Further investigation of the discovered supporting structures depicted how the stable regions and the flexible regions work together. Even with a discovered supporting structure, it remains difficult to infer the relationship between protein function and protein local structure directly.

REFERENCES

[1] A. Godzik, “The structural alignment between two proteins: is there a unique answer?,” Protein Sci, vol. 5, (no. 7), pp. 1325-38, Jul 1996.
[2] O. Carugo and S. Pongor, “Recent progress in protein 3D structure comparison,” Curr Protein Pept Sci, vol. 3, (no. 4), pp. 441-9, Aug 2002.
[3] H.M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T.N. Bhat, H. Weissig, I.N. Shindyalov, and P.E. Bourne, “The Protein Data Bank,” Nucleic Acids Res, vol. 28, (no. 1), pp. 235-42, Jan 1 2000.
[4] A. Bhattacharya, T. Can, T. Kahveci, A.K. Singh, and Y.F. Wang, “Progress: simultaneous searching of protein databases by sequence and structure,” Pac Symp Biocomput, pp. 264-75, 2004.
[5] O. Camoglu, T. Kahveci, and A.K. Singh, “Towards index-based similarity search for protein structure databases,” Proc IEEE Comput Soc Bioinform Conf, vol. 2, pp. 148-58, 2003.
[6] O. Camoglu, T. Kahveci, and A.K. Singh, “PSI: indexing protein structures for fast similarity search,” Bioinformatics, vol. 19 Suppl 1, pp. i81-3, 2003.
[7] O. Camoglu, T. Kahveci, and A.K. Singh, “Index-based similarity search for protein structure databases,” J Bioinform Comput Biol, vol. 2, (no. 1), pp. 99-126, Mar 2004.
[8] M. Comin, C. Guerra, and G. Zanotti, “PROuST: a comparison method of three-dimensional structures of proteins using indexing techniques,” J Comput Biol, vol. 11, (no. 6), pp. 1061-72, 2004.
[9] L. Holm and C. Sander, “Searching protein structure databases has come of age,” Proteins, vol. 19, (no. 3), pp. 165-73, Jul 1994.
[10] J.F. Gibrat, T. Madej, and S.H. Bryant, “Surprising similarities in structure comparison,” Curr Opin Struct Biol, vol. 6, (no. 3), pp. 377-85, Jun 1996.
[11] E. Krissinel and K. Henrick, “Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions,” Acta Crystallogr D Biol Crystallogr, vol. 60, (no. Pt 12 Pt 1), pp. 2256-68, Dec 2004.
[12] M. Ashburner, C.A. Ball, J.A. Blake, D. Botstein, H. Butler, J.M. Cherry, A.P. Davis, K. Dolinski, S.S. Dwight, J.T. Eppig, M.A. Harris, D.P. Hill, L. Issel-Tarver, A. Kasarskis, S. Lewis, J.C. Matese, J.E. Richardson, M. Ringwald, G.M. Rubin, and G. Sherlock, “Gene ontology: tool for the unification of biology. The Gene Ontology Consortium,” Nat Genet, vol. 25, (no. 1), pp. 25-9, May 2000.
[13] A. Bairoch, “The ENZYME data bank,” Nucleic Acids Res, vol. 21, (no. 13), pp. 3155-6, Jul 1 1993.
[14] D.T. Jones, “Protein structure prediction in genomics,” Brief Bioinform, vol. 2, (no. 2), pp. 111-25, May 2001.
[15] T.A. Binkowski, L. Adamian, and J. Liang, “Inferring functional relationships of proteins from local sequence and spatial surface patterns,” J Mol Biol, vol. 332, (no. 2), pp. 505-26, Sep 12 2003.
[16] R.J. Najmanovich, J.W. Torrance, and J.M. Thornton, “Prediction of protein function from structure: insights from methods for the detection of local structural similarities,” Biotechniques, vol. 38, (no. 6), pp. 847, 849, 851, Jun 2005.
[17] T.A. Binkowski, S. Naghibzadeh, and J. Liang, “CASTp: Computed Atlas of Surface Topography of proteins,” Nucleic Acids Res, vol. 31, (no. 13), pp. 3352-5, Jul 1 2003.
[18] J. Dundas, Z. Ouyang, J. Tseng, A. Binkowski, Y. Turpaz, and J. Liang, “CASTp: computed atlas of surface topography of proteins with structural and topographical mapping of functionally annotated residues,” Nucleic Acids Res, vol. 34, (no. Web Server issue), pp. W116-8, Jul 1 2006.
[19] T.A. Binkowski, P. Freeman, and J. Liang, “pvSOAR: detecting similar surface patterns of pocket and void surfaces of amino acid residues on proteins,” Nucleic Acids Res, vol. 32, (no. Web Server issue), pp. W555-8, Jul 1 2004.
[20] C.T. Porter, G.J. Bartlett, and J.M. Thornton, “The Catalytic Site Atlas: a resource of catalytic sites and residues identified in enzymes using structural data,” Nucleic Acids Res, vol. 32, (no. Database issue), pp. D129-33, Jan 1 2004.
[21] E.D. Scheeff and P.E. Bourne, “Structural evolution of the protein kinase-like superfamily,” PLoS Comput Biol, vol. 1, (no. 5), pp. e49, Oct 2005.
[22] E. Chatani and R. Hayashi, “Functional and structural roles of constituent amino acid residues of bovine pancreatic ribonuclease A,” J Biosci Bioeng, vol. 92, (no. 2), pp. 98-107, 2001.
[23] J. Mestres, “Structure conservation in cytochromes P450,” Proteins, vol. 58, (no. 3), pp. 596-609, Feb 15 2005.
[24] C.R. Otey, J.J. Silberg, C.A. Voigt, J.B. Endelman, G. Bandara, and F.H. Arnold, “Functional evolution and structural conservation in chimeric cytochromes p450: calibrating a structure-guided approach,” Chem Biol, vol. 11, (no. 3), pp. 309-18, Mar 2004.
[25] The UniProt Consortium, “The Universal Protein Resource (UniProt),” Nucl. Acids Res., vol. 36, (no. suppl_1), pp. D190-195, Jan 11 2008.
[26] I. Jonassen, I. Eidhammer, and W.R. Taylor, “Discovery of local packing motifs in protein structures,” Proteins, vol. 34, (no. 2), pp. 206-19, Feb 1 1999.
[27] J. Yin, E.M. Bergmann, M.M. Cherney, M.S. Lall, R.P. Jain, J.C. Vederas, and M.N. James, “Dual modes of modification of hepatitis A virus 3C by a serine-derived beta-lactone: selective crystallization and formation of a functional in the ,” J Mol Biol, vol. 354, (no. 4), pp. 854-71, Dec 9 2005.


Figure 8. Selected representative structures of EC 1.14.13.39.

Figure 9. Selected representative structures of EC 3.4.22.28.

