1Iu9 Lichtarge Lab 2006
1iu9
Evolutionary trace report by report maker
July 1, 2010

CONTENTS
1 Introduction
2 Chain 1iu9A
  2.1 O58403 overview
  2.2 Multiple sequence alignment for 1iu9A
  2.3 Residue ranking in 1iu9A
  2.4 Top ranking residues in 1iu9A and their position on the structure
    2.4.1 Clustering of residues at 25% coverage.
    2.4.2 Possible novel functional surfaces at 25% coverage.
3 Notes on using trace results
  3.1 Coverage
  3.2 Known substitutions
  3.3 Surface
  3.4 Number of contacts
  3.5 Annotation
  3.6 Mutation suggestions
4 Appendix
  4.1 File formats
  4.2 Color schemes used
  4.3 Credits
    4.3.1 Alistat
    4.3.2 CE
    4.3.3 DSSP
    4.3.4 HSSP
    4.3.5 LaTex
    4.3.6 Muscle
    4.3.7 Pymol
  4.4 Note about ET Viewer
  4.5 Citing this work
  4.6 About report maker
  4.7 Attachments

1 INTRODUCTION
From the original Protein Data Bank entry (PDB id 1iu9):
Title: Crystal structure of the c-terminal domain of aspartate racemase from pyrococcus horikoshii ot3
Compound: Mol id: 1; molecule: aspartate racemase; chain: a; fragment: c-terminal domain; ec: 5.1.1.13; engineered: yes
Organism, scientific name: Pyrococcus Horikoshii;
1iu9 contains a single unique chain 1iu9A (111 residues long).

2 CHAIN 1IU9A

2.1 O58403 overview
From SwissProt, id O58403, 100% identical to 1iu9A:
Description: 228aa long hypothetical aspartate racemase.
Organism, scientific name: Pyrococcus horikoshii.
Taxonomy: Archaea; Euryarchaeota; Thermococci; Thermococcales; Thermococcaceae; Pyrococcus.

2.2 Multiple sequence alignment for 1iu9A
For the chain 1iu9A, the alignment 1iu9A.msf (attached) with 187 sequences was used. The alignment was downloaded from the HSSP database, and fragments shorter than 75% of the query as well as duplicate sequences were removed. It can be found in the attachment to this report, under the name of 1iu9A.msf. Its statistics, from the alistat program, are the following:

Format:                    MSF
Number of sequences:       187
Total number of residues:  20223
Smallest:                  86
Largest:                   111
Average length:            108.1
Alignment length:          111
Average identity:          32%
Most related pair:         99%
Most unrelated pair:       7%
Most distant seq:          31%

Furthermore, <1% of residues show as conserved in this alignment.
The alignment consists of 17% prokaryotic, and 6% archaean sequences. (Descriptions of some sequences were not readily available.) The file containing the sequence descriptions can be found in the attachment, under the name 1iu9A.descr.

2.3 Residue ranking in 1iu9A
The 1iu9A sequence is shown in Fig. 1, with each residue colored according to its estimated importance. The full listing of residues in 1iu9A can be found in the file called 1iu9A.ranks_sorted in the attachment.

Fig. 1. Residues 103-213 in 1iu9A colored by their relative importance. (See Appendix, Fig.5, for the coloring scheme.)

2.4 Top ranking residues in 1iu9A and their position on the structure
In the following we consider residues ranking among the top 25% of residues in the protein. Figure 2 shows residues in 1iu9A colored by their importance: bright red and yellow indicate more conserved/important residues (see Appendix for the coloring scheme). A Pymol script for producing this figure can be found in the attachment.

Fig. 2. Residues in 1iu9A, colored by their relative importance. Clockwise: front, back, top and bottom views.

2.4.1 Clustering of residues at 25% coverage.
Fig. 3 shows the top 25% of all residues, this time colored according to the clusters they belong to. The clusters in Fig.3 are composed of the residues listed in Table 1.

Fig. 3. Residues in 1iu9A, colored according to the cluster they belong to: red, followed by blue and yellow are the largest clusters (see Appendix for the coloring scheme). Clockwise: front, back, top and bottom views. The corresponding Pymol script is attached.

Table 1.
cluster color   size   member residues
red             26     120,121,122,123,124,126,127,128,132,133,147,159,160,163,
                       164,166,191,192,193,194,195,196,197,198,200,212
Table 1. Clusters of top ranking residues in 1iu9A.

2.4.2 Possible novel functional surfaces at 25% coverage.
One group of residues is conserved on the 1iu9A surface, away from (or substantially larger than) other functional sites and interfaces recognizable in PDB entry 1iu9. It is shown in Fig. 4. The right panel shows (in blue) the rest of the larger cluster this surface belongs to. The residues belonging to this surface "patch" are listed in Table 2, while Table 3 suggests possible disruptive replacements for these residues (see Section 3.6).

Fig. 4. A possible active surface on the chain 1iu9A. The larger cluster it belongs to is shown in blue.

Table 2.
res   type   substitutions(%)                                   cvg
194   C      C(98).S                                            0.01
195   T      T(98).S(1)                                         0.02
196   E      E(96).D(2)H                                        0.03
124   T      T(93)AS(4)NE                                       0.04
127   T      T(93)L(1)CS(3)NA(1)                                0.05
133   Y      Y(93)I(1)F(5)L                                     0.06
212   D      D(85).(12)EN(1)                                    0.07
193   G      A(6)G(91).(1)                                      0.08
166   G      G(87)TD(1)VQ(1)HE(1)K(2)N(2)L.A(1)                 0.11
147   P      P(86)SL(3)IT(2)HG(2)F(1)EKVAQ                      0.14
126   G      G(35)F(24)Y(19)NA(12)Q(4)T(1)P(1)I                 0.15
164   K      K(44)C(34)Q(1)S(5)A(3)V(4)GI(2)N(1)T(1)E.H         0.16
160   Y      Y(70)F(14)I(2)L(4).(1)C(1)SVHR(2)EK                0.17
163   V      V(26)L(56)K(3)I(10)AY(2).Q                         0.18
132   V      V(14)F(49)KI(8)M(3)L(19)W(1)A(1)Y(1)T              0.19
198   S      S(20)A(9)P(38).C(3)G(13)T(5)M(4)N(1)REH(1)         0.21
200   V      L(55).F(1)I(12)A(16)V(10)DG(1)SYT                  0.24
128   I      V(8)M(52)I(12)R(1)QL(13)K(2)TY(5)N(1)              0.25
Table 2. Residues forming surface "patch" in 1iu9A.

Table 3.
res   type   disruptive mutations
194   C      (KR)(E)(FMWH)(Q)
195   T      (KR)(FMWH)(Q)(LPI)
196   E      (FW)(VCAHG)(R)(Y)
124   T      (R)(K)(H)(FW)
127   T      (R)(K)(H)(FW)
133   Y      (K)(QR)(E)(M)
212   D      (R)(FWH)(YVCAG)(TK)
193   G      (KER)(HD)(Q)(FMW)
166   G      (R)(KE)(H)(FW)
147   P      (YR)(H)(T)(KE)
126   G      (R)(KE)(H)(D)
164   K      (Y)(FW)(T)(H)
160   Y      (K)(Q)(M)(E)
163   V      (Y)(R)(E)(K)
132   V      (R)(KE)(Y)(D)
198   S      (R)(K)(H)(FW)
200   V      (R)(K)(E)(Y)
128   I      (Y)(R)(H)(T)
Table 3. Disruptive mutations for the surface patch in 1iu9A.

3 NOTES ON USING TRACE RESULTS

3.1 Coverage
Trace results are commonly expressed in terms of coverage: the residue is important if its "coverage" is small - that is, if it belongs to some small top percentage of residues [100% is all of the residues in a chain], according to trace. The ET results are presented in the form of a table, usually limited to the top 25% of residues (or to some nearby percentage), sorted by the strength of the presumed evolutionary pressure. (I.e., the smaller the coverage, the stronger the pressure on the residue.) Starting from the top of that list, mutating a couple of residues should affect the protein somehow, with the exact effects to be determined experimentally.

3.2 Known substitutions
One of the table columns is "substitutions" - other amino acid types seen at the same position in the alignment. These amino acid types may be interchangeable at that position in the protein, so if one wants to affect the protein by a point mutation, they should be avoided. For example, if the substitutions are "RVK" and the original protein has an R at that position, it is advisable to try anything but RVK. Conversely, when looking for substitutions which will not affect the protein, one may try replacing R with K, or (perhaps more surprisingly) with V. The percentage of times the substitution appears in the alignment is given in the immediately following bracket. No percentage is given in the cases when it is smaller than 1%. This is meant to be a rough guide - due to rounding errors these percentages often do not add up to 100%.
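The "substitutions" strings of Table 2 follow the convention just described, so they can be unpacked mechanically. The following is a minimal Python sketch and not part of report maker: the function names `parse_substitutions` and `untried_types` are hypothetical, and treating "." as an alignment gap is an assumption about the notation rather than something stated in this section.

```python
import re

# One token = a single symbol (amino acid letter, or "." which we ASSUME
# marks a gap), optionally followed by its percentage in parentheses.
_TOKEN = re.compile(r"([A-Z.])(?:\((\d+)\))?")

def parse_substitutions(field):
    """Map each symbol to its percentage; None means 'seen, but below 1%'."""
    compact = "".join(field.split())  # tolerate line-wrapped table cells
    return {sym: (int(pct) if pct else None)
            for sym, pct in _TOKEN.findall(compact)}

def untried_types(field, alphabet="ACDEFGHIKLMNPQRSTVWY"):
    """Amino acid types never seen at this position in the alignment.

    Following Section 3.2, these are the candidates for a mutation that is
    meant to affect the protein (hypothetical helper, illustration only).
    """
    seen = parse_substitutions(field)
    return [aa for aa in alphabet if aa not in seen]

if __name__ == "__main__":
    print(parse_substitutions("D(85).(12)EN(1)"))
    print(untried_types("RVK"))
```

For the Table 2 entry of residue 212, `parse_substitutions("D(85).(12)EN(1)")` gives `{'D': 85, '.': 12, 'E': None, 'N': 1}`. For the "RVK" example in the text, `untried_types("RVK")` lists the 17 standard amino acid types other than R, V and K.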
3.3 Surface
To detect candidates for novel functional interfaces, first we look for residues that are solvent accessible (according to the DSSP program) by at least 10 Å², which is roughly the area needed for one water molecule to come into contact with the residue. Furthermore, we require that these residues form a "cluster" of residues which have a neighbor within 5 Å of any of their heavy atoms.
Note, however, that if our picture of protein evolution is correct, the neighboring residues which are not surface accessible might be equally important in maintaining the interaction specificity - they should not be automatically dropped from consideration when choosing the set for mutagenesis.
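The two surface criteria of Section 3.3 (accessibility of at least 10 Å², and a neighbor within 5 Å between heavy atoms) can be illustrated with a short Python sketch. This is not the report maker implementation: the function name `surface_clusters`, the input layout, and the choice of single-linkage grouping with singletons discarded are all assumptions made for illustration. Accessibility values would come from DSSP output, which is not parsed here.

```python
import math

def surface_clusters(residues, min_area=10.0, max_dist=5.0):
    """Group solvent-exposed residues into clusters of spatial neighbors.

    residues: {res_id: (accessible_area_A2, [(x, y, z), ...heavy atoms])}
    Hypothetical input layout; thresholds default to the values quoted
    in Section 3.3.
    """
    exposed = {r for r, (area, _) in residues.items() if area >= min_area}

    def close(a, b):
        # Neighbors if ANY pair of heavy atoms lies within max_dist.
        return any(math.dist(p, q) <= max_dist
                   for p in residues[a][1] for q in residues[b][1])

    clusters, unvisited = [], set(exposed)
    while unvisited:
        seed = unvisited.pop()
        cluster, frontier = {seed}, [seed]
        while frontier:  # flood-fill over the neighbor relation
            current = frontier.pop()
            linked = {r for r in unvisited if close(current, r)}
            unvisited -= linked
            cluster |= linked
            frontier.extend(linked)
        if len(cluster) > 1:  # assumption: a lone residue is not a cluster
            clusters.append(sorted(cluster))
    return sorted(clusters, key=len, reverse=True)

if __name__ == "__main__":
    toy = {
        1: (20.0, [(0.0, 0.0, 0.0)]),
        2: (15.0, [(3.0, 0.0, 0.0)]),
        3: (5.0, [(1.0, 0.0, 0.0)]),    # buried: excluded by the area cut
        4: (30.0, [(20.0, 0.0, 0.0)]),  # exposed but isolated
    }
    print(surface_clusters(toy))
```

In the toy input only residues 1 and 2 are both exposed and within 5 Å of each other, so the result is `[[1, 2]]`. As the note above cautions, a buried residue such as 3 is excluded by this filter but may still deserve consideration for mutagenesis.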