3I7a Lichtarge Lab 2006
3i7a Evolutionary trace report by report_maker
April 13, 2010

CONTENTS

1 Introduction
2 Chain 3i7aA
  2.1 Q8EHG7 overview
  2.2 Multiple sequence alignment for 3i7aA
  2.3 Residue ranking in 3i7aA
  2.4 Top ranking residues in 3i7aA and their position on the structure
    2.4.1 Clustering of residues at 25% coverage.
    2.4.2 Possible novel functional surfaces at 25% coverage.
3 Notes on using trace results
  3.1 Coverage
  3.2 Known substitutions
  3.3 Surface
  3.4 Number of contacts
  3.5 Annotation
  3.6 Mutation suggestions
4 Appendix
  4.1 File formats
  4.2 Color schemes used
  4.3 Credits
    4.3.1 Alistat
    4.3.2 CE
    4.3.3 DSSP
    4.3.4 HSSP
    4.3.5 LaTex
    4.3.6 Muscle
    4.3.7 Pymol
  4.4 Note about ET Viewer
  4.5 Citing this work
  4.6 About report_maker
  4.7 Attachments

1 INTRODUCTION

From the original Protein Data Bank entry (PDB id 3i7a):
Title: Crystal structure of putative metal-dependent phosphohydrolase (yp_926882.1) from shewanella amazonensis sb2b at 2.06 a resolution
Compound: Mol id: 1; molecule: putative metal-dependent phosphohydrolase; chain: a; synonym: putative signal transduction protein; engineered: yes
Organism, scientific name: Shewanella Amazonensis Sb2b;

3i7a contains a single unique chain 3i7aA (280 residues long).

2 CHAIN 3I7AA

2.1 Q8EHG7 overview

From SwissProt, id Q8EHG7, 79% identical to 3i7aA:
Description: Hypothetical protein SO1254.
Organism, scientific name: Shewanella oneidensis.
Taxonomy: Bacteria; Proteobacteria; Gammaproteobacteria; Alteromonadales; Shewanellaceae; Shewanella.

2.2 Multiple sequence alignment for 3i7aA

For the chain 3i7aA, the alignment 3i7aA.msf (attached) with 43 sequences was used. The alignment was downloaded from the HSSP database, and fragments shorter than 75% of the query as well as duplicate sequences were removed. It can be found in the attachment to this report, under the name of 3i7aA.msf.
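The filtering step described above (dropping fragments shorter than 75% of the query, and duplicate sequences) can be sketched as follows. This is only an illustration under stated assumptions, not the procedure actually used to prepare 3i7aA.msf: the function names are hypothetical, sequence length is taken to be the ungapped length, duplicates are taken to be sequences with identical ungapped residues, and the gap symbols are assumed to be '.', '-' and '~'.

```python
from typing import Dict

GAP_CHARS = ".-~"  # assumed gap symbols; the report does not list them


def ungapped(seq: str) -> str:
    """Return the sequence upper-cased, with gap characters removed."""
    return "".join(c for c in seq.upper() if c not in GAP_CHARS)


def filter_alignment(query_name: str,
                     aligned: Dict[str, str],
                     min_fraction: float = 0.75) -> Dict[str, str]:
    """Keep the query plus every sequence that is long enough and not a duplicate.

    A sequence is dropped when its ungapped length is below
    min_fraction * (ungapped query length), or when its ungapped residues
    are identical to those of a sequence that was already kept.
    """
    query_residues = ungapped(aligned[query_name])
    kept = {query_name: aligned[query_name]}
    seen = {query_residues}
    for name, seq in aligned.items():
        if name == query_name:
            continue
        residues = ungapped(seq)
        if len(residues) < min_fraction * len(query_residues):
            continue  # fragment: too short relative to the query
        if residues in seen:
            continue  # duplicate of a sequence already kept
        seen.add(residues)
        kept[name] = seq
    return kept


if __name__ == "__main__":
    toy = {
        "query": "ACDEFGHK",
        "fragment": "ACD.....",   # 3 of 8 residues, below the 75% cutoff
        "duplicate": "ACDEFGHK",  # same residues as the query
        "homolog": "ACDEFG.K",    # 7 of 8 residues, kept
    }
    print(list(filter_alignment("query", toy)))
```

For comparison, the smallest sequence reported for this alignment (249 residues) is about 89% of the 280-residue query, which is consistent with a 75% length cutoff.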
The alignment statistics, from the alistat program, are the following:

Format:                    MSF
Number of sequences:       43
Total number of residues:  11667
Smallest:                  249
Largest:                   280
Average length:            271.3
Alignment length:          280
Average identity:          44%
Most related pair:         99%
Most unrelated pair:       26%
Most distant seq:          34%

Furthermore, 5% of residues show as conserved in this alignment.

The alignment consists of 6% prokaryotic sequences. (Descriptions of some sequences were not readily available.) The file containing the sequence descriptions can be found in the attachment, under the name 3i7aA.descr.

2.3 Residue ranking in 3i7aA

The 3i7aA sequence is shown in Figs. 1–2, with each residue colored according to its estimated importance. The full listing of residues in 3i7aA can be found in the file called 3i7aA.ranks_sorted in the attachment.

Fig. 1. Residues 1-140 in 3i7aA colored by their relative importance. (See Appendix, Fig.6, for the coloring scheme.)

Fig. 2. Residues 141-280 in 3i7aA colored by their relative importance. (See Appendix, Fig.6, for the coloring scheme.)

2.4 Top ranking residues in 3i7aA and their position on the structure

In the following we consider residues ranking among the top 25% of residues in the protein. Figure 3 shows residues in 3i7aA colored by their importance: bright red and yellow indicate more conserved/important residues (see Appendix for the coloring scheme). A Pymol script for producing this figure can be found in the attachment.

Fig. 3. Residues in 3i7aA, colored by their relative importance. Clockwise: front, back, top and bottom views.

2.4.1 Clustering of residues at 25% coverage. Fig. 4 shows the top 25% of all residues, this time colored according to the clusters they belong to. The clusters in Fig.4 are composed of the residues listed in Table 1.

Fig. 4. Residues in 3i7aA, colored according to the cluster they belong to: red, followed by blue and yellow are the largest clusters (see Appendix for the coloring scheme). Clockwise: front, back, top and bottom views. The corresponding Pymol script is attached.

Table 1.
cluster color   size   member residues
red             69     15,18,20,22,23,24,25,26,27,29,32,33,36,50,53,55,56,58,
                       59,61,63,65,66,70,75,81,84,85,86,89,90,93,95,96,98,100,
                       101,103,105,118,121,125,147,148,150,151,152,154,155,158,
                       159,161,162,163,164,166,167,168,180,184,189,192,195,196,
                       199,201,208,224,226
Table 1. Clusters of top ranking residues in 3i7aA.

2.4.2 Possible novel functional surfaces at 25% coverage. One group of residues is conserved on the 3i7aA surface, away from (or substantially larger than) other functional sites and interfaces recognizable in PDB entry 3i7a. It is shown in Fig. 5. The right panel shows (in blue) the rest of the larger cluster this surface belongs to.

Fig. 5. A possible active surface on the chain 3i7aA. The larger cluster it belongs to is shown in blue.

The residues belonging to this surface "patch" are listed in Table 2, while Table 3 suggests possible disruptive replacements for these residues (see Section 3.6).

Table 2.
res   type   substitutions(%)                   cvg
53    D      D(100)                             0.06
65    N      N(100)                             0.06
84    R      R(100)                             0.06
86    G      G(100)                             0.06
101   Q      Q(100)                             0.06
103   F      F(100)                             0.06
118   W      W(97)V(2)                          0.06
151   L      L(100)                             0.06
161   L      L(97)A(2)                          0.06
180   L      L(100)                             0.06
199   W      W(100)                             0.06
201   F      F(100)                             0.06
168   E      E(95)D(4)                          0.07
226   D      D(97)V(2)                          0.07
100   E      E(97)Q(2)                          0.08
59    R      R(97)G(2)                          0.09
66    S      S(97)N(2)                          0.09
224   Y      Y(95)D(2)L(2)                      0.09
22    L      L(97).(2)                          0.10
23    P      P(97).(2)                          0.10
25    L      L(97).(2)                          0.10
26    P      P(97).(2)                          0.10
85    I      L(46)I(53)                         0.11
105   S      A(46)S(51)P(2)                     0.11
58    A      A(95)V(2)L(2)                      0.12
24    T      T(93).(2)G(2)S(2)                  0.14
147   D      D(95)E(4)                          0.14
20    L      L(95).(2)I(2)                      0.15
18    D      D(93).(2)N(2)G(2)                  0.16
55    A      A(93)S(6)                          0.17
166   E      Y(41)E(51)R(2)I(2)K(2)             0.17
184   V      I(60)V(37)M(2)                     0.17
27    E      E(93).(2)A(4)                      0.18
61    I      I(90)V(4)L(4)                      0.19
15    L      I(58).(2)L(39)                     0.20
70    S      R(46)S(37)G(13)C(2)                0.21
167   A      A(88)V(6)I(4)                      0.21
75    A      I(27)V(32)A(39)                    0.23
95    T      T(72)I(16)M(9)V(2)                 0.23
148   T      Q(41)T(46)S(4)E(4)V(2)             0.23
56    I      L(58)I(37)V(2)M(2)                 0.24
89    Q      Y(39)Q(48)L(2)K(2)R(2)M(2)F(2)     0.24
163   V      I(46)V(53)                         0.25
208   V      V(83)L(11)I(4)                     0.25
Table 2. Residues forming surface "patch" in 3i7aA.

Table 3.
res   type   disruptive mutations
53    D      (R)(FWH)(KYVCAG)(TQM)
65    N      (Y)(FTWH)(SEVCARG)(MD)
84    R      (TD)(SYEVCLAPIG)(FMW)(N)
86    G      (KER)(FQMWHD)(NYLPI)(SVA)
101   Q      (Y)(FTWH)(SVCAG)(D)
103   F      (KE)(TQD)(SNCRG)(M)
118   W      (KE)(QD)(TR)(N)
151   L      (YR)(TH)(SKECG)(FQWD)
161   L      (YR)(H)(TKE)(SQCDG)
180   L      (YR)(TH)(SKECG)(FQWD)
199   W      (KE)(TQD)(SNCRG)(M)
201   F      (KE)(TQD)(SNCRG)(M)
168   E      (FWH)(R)(YVCAG)(T)
226   D      (R)(H)(FKYW)(QCG)
100   E      (FWH)(Y)(VCAG)(TR)
59    R      (D)(E)(TYLPI)(SFVMAW)
66    S      (R)(FKWH)(YM)(EQ)
224   Y      (K)(R)(Q)(M)
22    L      (YR)(TH)(SCG)(KE)
23    P      (YR)(TH)(SCG)(KE)
25    L      (YR)(TH)(SCG)(KE)
26    P      (YR)(TH)(SCG)(KE)
85    I      (YR)(TH)(SKECG)(FQWD)
(continued)

3.2 Known substitutions

One of the table columns is "substitutions": the other amino acid types seen at the same position in the alignment. These amino acid types may be interchangeable at that position in the protein, so if one wants to affect the protein by a point mutation, they should be avoided. For example, if the substitutions are "RVK" and the original protein has an R at that position, it is advisable to try anything but R, V or K. Conversely, when looking for substitutions which will not affect the protein, one may try replacing R with K, or (perhaps more surprisingly) with V. The percentage of times the substitution appears in the alignment is given in the immediately following bracket. No percentage is given in the cases when it is smaller than 1%. This is meant to be a rough guide: due to rounding errors these percentages often do not add up to 100%.

3.3 Surface

To detect candidates for novel functional interfaces, first we look for residues that are solvent accessible (according to the DSSP program) by at least 10 Å², which is roughly the area needed for one water molecule to come into contact with the residue.
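To make the conventions of Section 3.2 concrete, the sketch below parses a "substitutions" entry such as those in Table 2 and lists the amino acid types that were not observed at that position. It is a minimal illustration only: the function names are hypothetical, the '.' symbol is assumed to denote a gap, and the "unobserved types" set merely implements the "anything but the observed types" rule of thumb, not the ranked suggestions of Table 3.

```python
import re
from typing import Dict, Optional, Set

AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")

# One residue symbol (or '.'), optionally followed by a percentage in brackets.
_TOKEN = re.compile(r"([A-Z.])(?:\((\d+)\))?")


def parse_substitutions(entry: str) -> Dict[str, Optional[int]]:
    """Map each symbol in an entry like 'E(95)D(4)' to its percentage.

    Symbols listed without a bracket (frequency below 1%) map to None.
    """
    parsed: Dict[str, Optional[int]] = {}
    for symbol, percent in _TOKEN.findall(entry.replace(" ", "")):
        parsed[symbol] = int(percent) if percent else None
    return parsed


def unobserved_types(entry: str) -> Set[str]:
    """Amino acid types never seen at this position in the alignment."""
    observed = set(parse_substitutions(entry)) - {"."}  # '.' assumed to be a gap
    return AMINO_ACIDS - observed


if __name__ == "__main__":
    entry = "Y(41)E(51)R(2)I(2)K(2)"  # residue 166 in Table 2
    parsed = parse_substitutions(entry)
    print(parsed)
    print(sum(p for p in parsed.values() if p is not None))
    print(sorted(unobserved_types("RVK")))
```

For residue 166 the listed percentages sum to 98 rather than 100, which illustrates the rounding caveat mentioned in Section 3.2.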