2Phd Lichtarge Lab 2006
2phd
Evolutionary trace report by report maker
January 27, 2010

CONTENTS

1 Introduction
2 Chain 2phdD
  2.1 Q67FT0 overview
  2.2 Multiple sequence alignment for 2phdD
  2.3 Residue ranking in 2phdD
  2.4 Top ranking residues in 2phdD and their position on the structure
    2.4.1 Clustering of residues at 25% coverage.
    2.4.2 Overlap with known functional surfaces at 25% coverage.
    2.4.3 Possible novel functional surfaces at 25% coverage.
3 Notes on using trace results
  3.1 Coverage
  3.2 Known substitutions
  3.3 Surface
  3.4 Number of contacts
  3.5 Annotation
  3.6 Mutation suggestions
4 Appendix
  4.1 File formats
  4.2 Color schemes used
  4.3 Credits
    4.3.1 Alistat
    4.3.2 CE
    4.3.3 DSSP
    4.3.4 HSSP
    4.3.5 LaTex
    4.3.6 Muscle
    4.3.7 Pymol
  4.4 Note about ET Viewer
  4.5 Citing this work
  4.6 About report maker
  4.7 Attachments

1 INTRODUCTION

From the original Protein Data Bank entry (PDB id 2phd):
Title: Crystal structure determination of a salicylate 1,2-dioxygenase from pseudaminobacter salicylatoxidans
Compound: Mol id: 1; molecule: gentisate 1,2-dioxygenase; chain: a, b, c, d; ec: 1.13.11.4; engineered: yes
Organism, scientific name: Pseudaminobacter Salicylatoxidans;

2phd contains a single unique chain 2phdD (352 residues long) and its homologues 2phdA, 2phdC, and 2phdB.

2 CHAIN 2PHDD

2.1 Q67FT0 overview

From SwissProt, id Q67FT0, 94% identical to 2phdD:
Description: Gentisate 1,2-dioxygenase.
Organism, scientific name: Pseudaminobacter salicylatoxidans.
Taxonomy: Bacteria; Proteobacteria; Alphaproteobacteria; Rhizobiales; Phyllobacteriaceae; Pseudaminobacter.

2.2 Multiple sequence alignment for 2phdD

For the chain 2phdD, the alignment 2phdD.msf (attached) with 102 sequences was used. The alignment was downloaded from the HSSP database, and fragments shorter than 75% of the query as well as duplicate sequences were removed. It can be found in the attachment to this report, under the name of 2phdD.msf. Its statistics, from the alistat program, are the following:

Format:                    MSF
Number of sequences:       102
Total number of residues:  33659
Smallest:                  114
Largest:                   352
Average length:            330.0
Alignment length:          352
Average identity:          38%
Most related pair:         99%
Most unrelated pair:       11%
Most distant seq:          32%

Furthermore, <1% of residues show as conserved in this alignment.

The alignment consists of <1% eukaryotic (<1% fungi), 23% prokaryotic, and <1% archaean sequences. (Descriptions of some sequences were not readily available.) The file containing the sequence descriptions can be found in the attachment, under the name 2phdD.descr.

2.3 Residue ranking in 2phdD

The 2phdD sequence is shown in Figs. 1–2, with each residue colored according to its estimated importance. The full listing of residues in 2phdD can be found in the file called 2phdD.ranks sorted in the attachment.

Fig. 1. Residues 15-190 in 2phdD colored by their relative importance. (See Appendix, Fig. 12, for the coloring scheme.)

Fig. 2. Residues 191-366 in 2phdD colored by their relative importance. (See Appendix, Fig. 12, for the coloring scheme.)

2.4 Top ranking residues in 2phdD and their position on the structure

In the following we consider residues ranking among the top 25% of residues in the protein. Figure 3 shows residues in 2phdD colored by their importance: bright red and yellow indicate more conserved/important residues (see Appendix for the coloring scheme). A Pymol script for producing this figure can be found in the attachment.

Fig. 3. Residues in 2phdD, colored by their relative importance. Clockwise: front, back, top and bottom views.

2.4.1 Clustering of residues at 25% coverage. Fig. 4 shows the top 25% of all residues, this time colored according to the clusters they belong to. The clusters in Fig. 4 are composed of the residues listed in Table 1.

Table 1.
cluster color  size  member residues
red            76    59,72,81,83,84,85,88,90,91,100,103,104,106,108,109,112,
                     113,114,116,117,119,120,121,123,124,125,126,127,128,129,
                     131,132,134,135,137,139,141,143,146,149,150,151,152,153,
                     154,155,157,159,160,161,162,164,169,170,172,173,174,175,
                     176,177,178,179,188,228,229,235,268,269,272,305,308,325,
                     345,346,350,352
blue           4     50,298,330,332
yellow         3     37,38,39

Table 1. Clusters of top ranking residues in 2phdD.

Fig. 4. Residues in 2phdD, colored according to the cluster they belong to: red, followed by blue and yellow are the largest clusters (see Appendix for the coloring scheme). Clockwise: front, back, top and bottom views. The corresponding Pymol script is attached.

2.4.2 Overlap with known functional surfaces at 25% coverage. The name of the ligand is composed of the source PDB identifier and the heteroatom name used in that file.

Interface with 2phdB. Table 2 lists the top 25% of residues at the interface with 2phdB. The following table (Table 3) suggests possible disruptive replacements for these residues (see Section 3.6).

Table 2.
res  type  subst's (%)                    cvg   noc/bb  dist (Å)  antn
121  H     H(99)N                         0.03  9/0     4.04      site
176  L     L(99).                         0.03  2/0     4.39
177  D     D(99).                         0.03  6/0     3.67
59   W     W(98).(1)                      0.06  30/0    3.21
352  P     P(93)V(1)A(1)D.(1)             0.07  2/2     4.70
84   R     R(98).F                        0.09  10/5    3.96
91   P     P(96).GTD                      0.09  7/2     3.50
83   R     R(96)G(1).T                    0.10  80/22   2.88
39   W     W(95).(3)F                     0.11  92/10   3.03
50   P     P(94)E.(2)VG                   0.11  39/12   3.04
38   L     L(89)Q(4)G(1).(3)              0.12  40/15   2.93
104  W     Y(72)W(13)F(2)N(3)L.(1)V(1)SA  0.13  9/0     4.37
332  W     Y(5)W(80)F(4)N(1)H(2).(1)D(1)  0.13  18/0    3.81
123  Q     P(16)Q(52)A(20)M(5)S(1)VT      0.14  17/3    3.81
188  F     Y(15)F(79)SV.RD                0.15  107/19  3.31
357  L     L(89)MA.(1)F(3)V(1)I           0.15  27/6    3.52
88   L     L(88)F(6)M(1).VP               0.16  4/3     3.87
178  I     L(33)I(56)V(5)HA(1).           0.16  14/0    3.13
85   A     V(74)A(14)N(5).QHKT            0.17  31/28   2.78
157  W     G(26)W(63)N(1)M(5)FS           0.18  20/0    3.56
37   P     P(77)A(13)G(4).(3)             0.19  12/10   3.63
81   G     A(66)G(22)T(4).(2)PDF          0.21  1/1     4.95
298  V     S(14)T(60)V(17)N(3)A.(1)       0.22  11/2    3.70
124  N     N(26)S(51)F(8)A(4)C(1)G(2)HET  0.25  3/0     4.09

Table 2. The top 25% of residues in 2phdD at the interface with 2phdB. (Field names: res: residue number in the PDB entry; type: amino acid type; substs: substitutions seen in the alignment; with the percentage of each type in the bracket; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given in the bracket; dist: distance of closest approach to the ligand.)

Table 3.
res  type  disruptive mutations
121  H     (E)(T)(MD)(SVQCAG)
176  L     (YR)(TH)(SCG)(KE)
177  D     (R)(FWH)(VCAG)(KY)
59   W     (KE)(TQD)(SNCG)(R)
352  P     (R)(Y)(H)(K)
84   R     (TD)(SECG)(VLAPI)(Y)
91   P     (R)(Y)(H)(K)
83   R     (D)(ELPI)(FTYVMAW)(S)
39   W     (E)(K)(TD)(Q)
50   P     (R)(Y)(H)(TK)
38   L     (Y)(R)(H)(T)
104  W     (K)(E)(Q)(D)
332  W     (K)(E)(TQ)(D)
123  Q     (Y)(H)(FW)(T)
188  F     (K)(E)(Q)(D)
357  L     (YR)(TH)(K)(E)
88   L     (YR)(T)(H)(KE)
178  I     (R)(Y)(T)(EH)

Fig. 5. Residues in 2phdD, at the interface with 2phdB, colored by their relative importance. 2phdB is shown in backbone representation (See Appendix for the coloring scheme for the protein chain 2phdD.)

Table 4.
res  type  subst's (%)  cvg   noc/bb  dist (Å)  antn
119  H     H(100)       0.01  5/0     2.02      site
160  H     H(100)       0.01  5/0     2.16      site
121  H     H(99)N       0.03  5/0     2.13      site
127  R     R(99)Q       0.03  1/0     4.73

Table 4. The top 25% of residues in 2phdD at the interface with Fe(III) ion. (Field names: res: residue number in the PDB entry; type: amino acid type; substs: substitutions seen in the alignment; with the percentage of each type in the bracket; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given in the bracket; dist: distance of closest approach to the ligand.
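For readers who want to process the interface tables programmatically, the subst's and noc/bb fields defined in the field names can be read roughly as follows. This is only an illustrative sketch and not part of report maker: the function names are hypothetical, and it assumes that a symbol printed without a bracket has no percentage reported and that "." is an alignment symbol handled like any residue letter.

```python
import re
from typing import Dict, Optional, Tuple

# One token: a residue letter or ".", optionally followed by "(NN)".
_TOKEN = re.compile(r"([A-Z.])(?:\((\d+)\))?")


def parse_substitutions(field: str) -> Dict[str, Optional[int]]:
    """Map each symbol of a subst's field to its bracketed percentage.

    Symbols printed without a bracket map to None (assumption: the
    report gives no percentage for them).
    """
    compact = field.replace(" ", "")  # wrapped table cells may contain spaces
    result: Dict[str, Optional[int]] = {}
    pos = 0
    while pos < len(compact):
        match = _TOKEN.match(compact, pos)
        if match is None:
            raise ValueError(f"unexpected text at offset {pos}: {compact[pos:]!r}")
        symbol, percent = match.groups()
        result[symbol] = int(percent) if percent is not None else None
        pos = match.end()
    return result


def parse_contacts(field: str) -> Tuple[int, int]:
    """Split a noc/bb field such as '80/22' into (contacts, backbone contacts)."""
    total, backbone = field.split("/")
    return int(total), int(backbone)
```

For example, parse_substitutions("H(99)N") yields a mapping with 99 for H and None for N, and parse_contacts("80/22") yields the pair (80, 22), matching the entries listed for residues 121 and 83.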