2Bdz Lichtarge Lab 2006
Pages 1–7
2bdz
Evolutionary trace report by report maker
April 16, 2010

CONTENTS
1 Introduction
2 Chain 2bdzA
  2.1 P84346 overview
  2.2 Multiple sequence alignment for 2bdzA
  2.3 Residue ranking in 2bdzA
  2.4 Top ranking residues in 2bdzA and their position on the structure
    2.4.1 Clustering of residues at 25% coverage.
    2.4.2 Overlap with known functional surfaces at 25% coverage.
    2.4.3 Possible novel functional surfaces at 25% coverage.
3 Notes on using trace results
  3.1 Coverage
  3.2 Known substitutions
  3.3 Surface
  3.4 Number of contacts
  3.5 Annotation
  3.6 Mutation suggestions
4 Appendix
  4.1 File formats
  4.2 Color schemes used
  4.3 Credits
    4.3.1 Alistat
    4.3.2 CE
    4.3.3 DSSP
    4.3.4 HSSP
    4.3.5 LaTex
    4.3.6 Muscle
    4.3.7 Pymol
  4.4 Note about ET Viewer
  4.5 Citing this work
  4.6 About report maker
  4.7 Attachments

1 INTRODUCTION
From the original Protein Data Bank entry (PDB id 2bdz):
Title: Mexicain from jacaratia mexicana
Compound: Mol id: 1; molecule: mexicain; chain: a, b, c, d; ec: 3.4.22.-
Organism, scientific name: Jacaratia Mexicana;
2bdz contains a single unique chain 2bdzA (212 residues long) and its homologues 2bdzD, 2bdzC, and 2bdzB.

2 CHAIN 2BDZA

2.1 P84346 overview
From SwissProt, id P84346, 99% identical to 2bdzA:
Description: Mexicain (EC 3.4.22.-).
Organism, scientific name: Jacaratia mexicana (Wild papaya) (Pileus mexicanus).
Taxonomy: Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta; Spermatophyta; Magnoliophyta; eudicotyledons; core eudicotyledons; rosids; eurosids II; Brassicales; Caricaceae; Jacaratia.
Function: Cysteine protease.
Subcellular location: Secreted.
Tissue specificity: Expressed in latex.
Similarity: Belongs to the peptidase C1 family.
About: This Swiss-Prot entry is copyright. It is produced through a collaboration between the Swiss Institute of Bioinformatics and the EMBL outstation - the European Bioinformatics Institute. There are no restrictions on its use as long as its content is in no way modified and this statement is not removed.

2.2 Multiple sequence alignment for 2bdzA
For the chain 2bdzA, the alignment 2bdzA.msf (attached) with 1129 sequences was used. The alignment was downloaded from the HSSP database, and fragments shorter than 75% of the query as well as duplicate sequences were removed. It can be found in the attachment to this report, under the name of 2bdzA.msf. Its statistics, from the alistat program, are the following:

Format:                    MSF
Number of sequences:       1129
Total number of residues:  226142
Smallest:                  123
Largest:                   212
Average length:            200.3
Alignment length:          212
Average identity:          43%
Most related pair:         99%
Most unrelated pair:       14%
Most distant seq:          38%

Furthermore, <1% of residues show as conserved in this alignment.
The alignment consists of 48% eukaryotic (7% vertebrata, 6% arthropoda, 17% plantae), and 2% viral sequences. (Descriptions of some sequences were not readily available.) The file containing the sequence descriptions can be found in the attachment, under the name 2bdzA.descr.

2.3 Residue ranking in 2bdzA
The 2bdzA sequence is shown in Figs. 1–2, with each residue colored according to its estimated importance. The full listing of residues in 2bdzA can be found in the file called 2bdzA.ranks_sorted in the attachment.

Fig. 1. Residues 1-106 in 2bdzA colored by their relative importance. (See Appendix, Fig. 7, for the coloring scheme.)

Fig. 2. Residues 107-212 in 2bdzA colored by their relative importance. (See Appendix, Fig. 7, for the coloring scheme.)

2.4 Top ranking residues in 2bdzA and their position on the structure
In the following we consider residues ranking among the top 25% of residues in the protein. Figure 3 shows residues in 2bdzA colored by their importance: bright red and yellow indicate more conserved/important residues (see Appendix for the coloring scheme). A Pymol script for producing this figure can be found in the attachment.

Fig. 3. Residues in 2bdzA, colored by their relative importance. Clockwise: front, back, top and bottom views.

2.4.1 Clustering of residues at 25% coverage. Fig. 4 shows the top 25% of all residues, this time colored according to the clusters they belong to. The clusters in Fig. 4 are composed of the residues listed in Table 1.

Fig. 4. Residues in 2bdzA, colored according to the cluster they belong to: red, followed by blue and yellow are the largest clusters (see Appendix for the coloring scheme). Clockwise: front, back, top and bottom views. The corresponding Pymol script is attached.

Table 1.
cluster color  size  member residues
red            42    17,19,22,23,25,26,27,28,29,35,48,49,50,51,52,53,55,56,
                     62,63,65,66,71,74,79,86,87,88,95,141,144,147,159,161,
                     174,175,176,177,178,181,182,185
blue           10    6,7,8,129,164,165,166,167,170,171
Table 1. Clusters of top ranking residues in 2bdzA.

2.4.2 Overlap with known functional surfaces at 25% coverage. The name of the ligand is composed of the source PDB identifier and the heteroatom name used in that file.

E64 binding site. Table 2 lists the top 25% of residues at the interface with 2bdzAE64501 (e64). The following table (Table 3) suggests possible disruptive replacements for these residues (see Section 3.6).

Table 2.
res  type  subst's(%)                          cvg   noc/bb  dist(Å)  antn
22   C     C(97).(1)PYAFG                      0.01  1/1     4.59     S-S
66   G     G(97)A(1).CSTERD                    0.03  34/34   2.65
19   Q     Q(97).(1)NHLGPCXEM                  0.04  9/0     2.92
159  H     H(97).(2)TAGFDP                     0.04  28/13   3.29
26   W     W(91)Y(5).(1)SGFAPDTCL              0.07  9/3     3.49
25   C     C(94)S(2).(1)VG(1)DYQLT             0.09  28/10   2.33
65   G     G(94)L.WQNRTKCAYSDEF                0.10  30/30   3.07
23   G     G(91)W.(1)N(1)S(1)LA(1)YKREHMITDVC  0.19  18/18   3.08
158  D     N(46)T(2)D(45)Y.(2)S(1)AHRGIVFL     0.25  29/29   3.17
Table 2. The top 25% of residues in 2bdzA at the interface with E64. (Field names: res: residue number in the PDB entry; type: amino acid type; substs: substitutions seen in the alignment, with the percentage of each type in the bracket; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given in the bracket; dist: distance of closest approach to the ligand.)

Table 3.
res  type  disruptive mutations
22   C     (K)(R)(E)(Q)
66   G     (R)(K)(FWH)(E)
19   Q     (Y)(H)(FTW)(S)
159  H     (E)(Q)(K)(MD)
26   W     (K)(E)(Q)(R)
25   C     (R)(K)(E)(H)
65   G     (R)(K)(E)(H)
23   G     (R)(K)(E)(H)
158  D     (R)(H)(K)(FW)
Table 3. List of disruptive mutations for the top 25% of residues in 2bdzA, that are at the interface with E64.

Figure 5 shows residues in 2bdzA colored by their importance, at the interface with 2bdzAE64501.

Fig. 5. Residues in 2bdzA, at the interface with E64, colored by their relative importance. The ligand (E64) is colored green. Atoms further than 30Å away from the geometric center of the ligand, as well as on the line of sight to the ligand were removed. (See Appendix for the coloring scheme for the protein chain 2bdzA.)

2.4.3 Possible novel functional surfaces at 25% coverage. One group of residues is conserved on the 2bdzA surface, away from (or substantially larger than) other functional sites and interfaces recognizable in PDB entry 2bdz. It is shown in Fig. 6. The right panel shows (in blue) the rest of the larger cluster this surface belongs to.

Fig. 6. A possible active surface on the chain 2bdzA. The larger cluster it belongs to is shown in blue.

The residues belonging to this surface "patch" are listed in Table 4, while Table 5 suggests possible disruptive replacements for these residues (see Section 3.6).

Table 4.
res  type  substitutions(%)                        cvg   antn
22   C     C(97).(1)PYAFG                          0.01  S-S
88   Y     Y(95)F(2)GHMSVRDQ                       0.01
137  R     145.98                                  0.01
144  Y     Y(97)F.PRLV                             0.02
66   G     G(97)A(1).CSTERD                        0.03
19   Q     Q(97).(1)NHLGPCXEM                      0.04
159  H     H(97).(2)TAGFDP                         0.04
175  N     N(97).(2)KHTSGX                         0.05
63   C     C(97)K.RHGSDNELYIV                      0.06  S-S
147  G     G(96)CV.(1)QDTEKARPI                    0.06
26   W     W(91)Y(5).(1)SGFAPDTCL                  0.07
35   E     E(96).(1)RQSVMGDKAT                     0.07
25   C     C(94)S(2).(1)VG(1)DYQLT                 0.09
28   F     F(96)Q.(1)LRIVWSPHMT                    0.09
177  W     W(94)V.(2)MY(1)LHSKFXREC                0.09
65   G     G(94)L.WQNRTKCAYSDEF                    0.10
176  S     S(94)T(1)G.(2)WKHFEQRIX                 0.11
182  G     G(96)I.(3)DAKEXRL                       0.12
7    W     W(93).(3)Y(1)LISFRH                     0.13
87   P     P(90)G(1)K(1)S(1)ETRA(1)QVIHYL          0.14
181  W     W(91)I.(3)F(3)Y(1)CVSQGXH               0.14
6    D     D(93).(3)N(2)EAQG                       0.15
50   E     E(83)Q(2)K(1)P(4)V(3)T(1)LA(1)SYGI.RDM  0.16
161  V     V(89)L(1)I(3)M(2).(1)APDR               0.16
164  V     V(88)I(6)AET(1)                         0.18

Table 5. continued
res  type  disruptive mutations
159  H     (E)(Q)(K)(MD)
175  N     (Y)(FW)(H)(E)
63   C     (R)(KE)(H)(FW)
147  G     (R)(H)(KE)(FW)
26   W     (K)(E)(Q)(R)
35   E     (H)(FW)(Y)(R)
25   C     (R)(K)(E)(H)
28   F     (E)(K)(T)(D)
177  W     (E)(K)(TD)(Q)
65   G     (R)(K)(E)(H)
176  S     (R)(K)(FWH)(M)
182  G     (R)(H)(KE)(FW)
7    W     (E)(K)(Q)(TD)
87   P     (Y)(R)(H)(T)
181  W     (KE)(Q)(D)(R)
6    D     (R)(H)(FW)(Y)
50   E     (H)(FW)(Y)(R)
161  V     (Y)(R)(KE)(H)
164  V     (R)(KY)(E)(H)
23   G     (R)(K)(E)(H)
174  K     (Y)(FW)(T)(VCAHG)
8    R     (TD)(Y)(E)(CG)
17   K     (Y)(T)(FW)(CG)
158  D     (R)(H)(K)(FW)
Table 5.
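The substitution and mutation columns in Tables 2 to 5 use a compact notation. According to the Table 2 caption, a substitution entry is a run of residue symbols, each optionally followed by its percentage in brackets. A disruptive-mutation entry is a run of bracketed groups of residue letters. The following is a minimal Python sketch of how such entries might be unpacked for further analysis. It is not part of report maker: the function names `parse_substitutions` and `parse_mutation_groups` are hypothetical, the `.` symbol is passed through unchanged because its meaning is not defined in this part of the report, and no meaning is assumed for the order of the bracketed groups.

```python
import re

# One token: a residue letter (or '.') optionally followed by "(percent)".
_SUBST_TOKEN = re.compile(r"([A-Z.])(?:\((\d+)\))?")


def parse_substitutions(field):
    """Return [(symbol, percent_or_None), ...] for a 'substitutions' entry.

    Hypothetical helper: symbols without a bracketed percentage get None.
    """
    field = "".join(field.split())  # drop whitespace left over from line wrapping
    result = []
    pos = 0
    while pos < len(field):
        match = _SUBST_TOKEN.match(field, pos)
        if match is None:
            raise ValueError(f"unexpected text at position {pos}: {field[pos:]!r}")
        symbol, percent = match.groups()
        result.append((symbol, int(percent) if percent else None))
        pos = match.end()
    return result


def parse_mutation_groups(field):
    """Return the bracketed groups of a 'disruptive mutations' entry as lists.

    Hypothetical helper: groups are returned in the order they are printed.
    """
    field = "".join(field.split())
    groups = re.findall(r"\(([A-Z]+)\)", field)
    # Rebuilding the field from the groups catches stray text between brackets.
    if "".join(f"({g})" for g in groups) != field:
        raise ValueError(f"malformed mutation field: {field!r}")
    return [list(group) for group in groups]


if __name__ == "__main__":
    # Residue 22 (Table 2) and residue 66 (Table 3) as examples.
    print(parse_substitutions("C(97).(1)PYA FG"))
    print(parse_mutation_groups("(R)(K)(FWH)(E)"))
```

Whitespace is stripped before parsing, so an entry that was wrapped over several lines in the printed table can be passed in as it appears.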