1gw9 Evolutionary trace report by report maker, January 27, 2010

Lichtarge lab 2006

CONTENTS
1 Introduction
2 Chain 1gw9A
  2.1 P24300 overview
  2.2 Multiple sequence alignment for 1gw9A
  2.3 Residue ranking in 1gw9A
  2.4 Top ranking residues in 1gw9A and their position on the structure
    2.4.1 Clustering of residues at 25% coverage.
    2.4.2 Overlap with known functional surfaces at 25% coverage.
    2.4.3 Possible novel functional surfaces at 25% coverage.
3 Notes on using trace results
  3.1 Coverage
  3.2 Known substitutions
  3.3 Surface
  3.4 Number of contacts
  3.5 Annotation
  3.6 Mutation suggestions
4 Appendix
  4.1 File formats
  4.2 Color schemes used
  4.3 Credits
    4.3.1 Alistat
    4.3.2 CE
    4.3.3 DSSP
    4.3.4 HSSP
    4.3.5 LaTex
    4.3.6 Muscle
    4.3.7 Pymol
  4.4 Note about ET Viewer
  4.5 Citing this work
  4.6 About report maker
  4.7 Attachments

1 INTRODUCTION
From the original Protein Data Bank entry (PDB id 1gw9):
Title: Tri-iodide derivative of xylose isomerase from Streptomyces rubiginosus
Compound: Mol id: 1; molecule: xylose isomerase; chain: a; ec: 5.3.1.5
Organism, scientific name: Streptomyces Rubiginosus
1gw9 contains a single unique chain 1gw9A (385 residues long).

2 CHAIN 1GW9A
2.1 P24300 overview
From SwissProt, id P24300, 96% identical to 1gw9A:
Description: Xylose isomerase (EC 5.3.1.5).
Organism, scientific name: Streptomyces rubiginosus.
Taxonomy: [...]; [...]; Actinobacteridae; Actinomycetales; Streptomycineae; [...]; Streptomyces.
Function: Involved in D-xylose catabolism.
Catalytic activity: D-xylose = D-xylulose.
Cofactor: Binds 2 magnesium ions per subunit (By similarity).
Subunit: Homotetramer.
Subcellular location: Cytoplasmic.
Similarity: Belongs to the xylose isomerase family.
Caution: According to the crystallographic study residue 40 could be Gln.
About: This Swiss-Prot entry is copyright. It is produced through a collaboration between the Swiss Institute of Bioinformatics and the EMBL outstation - the European Bioinformatics Institute. There are no restrictions on its use as long as its content is in no way modified and this statement is not removed.

2.2 Multiple sequence alignment for 1gw9A
For the chain 1gw9A, the alignment 1gw9A.msf (attached) with 27 sequences was used. The alignment was assembled through a combination of BLAST searching on the UniProt database and alignment using the Muscle program. It can be found in the attachment to this report, under the name of 1gw9A.msf. Its statistics, from the alistat program, are the following:

Format: MSF
Number of sequences: 27
Total number of residues: 10273
Smallest: 371
Largest: 385
Average length: 380.5
Alignment length: 385
Average identity: 44%
Most related pair: 99%
Most unrelated pair: 21%
Most distant seq: 67%

Furthermore, 9% of residues show as conserved in this alignment. The alignment consists of 11% eukaryotic (11% plantae), and 85% prokaryotic sequences. (Descriptions of some sequences were not readily available.) The file containing the sequence descriptions can be found in the attachment, under the name 1gw9A.descr.

2.3 Residue ranking in 1gw9A
The 1gw9A sequence is shown in Figs. 1–2, with each residue colored according to its estimated importance. The full listing of residues in 1gw9A can be found in the file called 1gw9A.ranks sorted in the attachment.

Fig. 1. Residues 2-193 in 1gw9A colored by their relative importance. (See Appendix, Fig.29, for the coloring scheme.)

Fig. 2. Residues 194-386 in 1gw9A colored by their relative importance. (See Appendix, Fig.29, for the coloring scheme.)

2.4 Top ranking residues in 1gw9A and their position on the structure
In the following we consider residues ranking among the top 25% of residues in the protein. Figure 3 shows residues in 1gw9A colored by their importance: bright red and yellow indicate more conserved/important residues (see Appendix for the coloring scheme). A Pymol script for producing this figure can be found in the attachment.

Fig. 3. Residues in 1gw9A, colored by their relative importance. Clockwise: front, back, top and bottom views.

2.4.1 Clustering of residues at 25% coverage. Fig. 4 shows the top 25% of all residues, this time colored according to the clusters they belong to. The clusters in Fig. 4 are composed of the residues listed in Table 1.

Fig. 4. Residues in 1gw9A, colored according to the cluster they belong to: red, followed by blue and yellow are the largest clusters (see Appendix for the coloring scheme). Clockwise: front, back, top and bottom views. The corresponding Pymol script is attached.

Table 1.
cluster color  size  member residues
red            79    3,5,10,11,14,16,17,18,46,47,
                     52,53,54,55,57,87,88,90,94,
                     102,105,111,134,135,137,138,
                     139,140,141,142,179,181,182,
                     183,184,186,187,190,191,193,
                     195,197,202,213,215,217,220,
                     221,223,224,226,235,241,243,
                     245,246,247,249,253,254,255,
                     257,260,261,282,283,285,286,
                     287,289,298,300,303,306,307,
                     308,311,313,317
blue           7     22,24,25,26,27,30,292
yellow         5     124,168,172,173,174

Table 1. Clusters of top ranking residues in 1gw9A.
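The selection of "top 25%" residues used throughout Section 2.4 can be illustrated with a minimal Python sketch. It assumes that each residue carries a numeric rank and that a lower rank means a more important residue; the report's own definition of coverage is given in Section 3.1 and may treat ties differently. The function name `top_coverage` and the toy ranks are hypothetical and are not taken from 1gw9A.ranks.

```python
from math import ceil


def top_coverage(ranks, coverage=0.25):
    """Return residue numbers in the top `coverage` fraction, best rank first.

    `ranks` maps residue number -> rank (assumption: lower rank = more important).
    """
    ordered = sorted(ranks, key=lambda res: ranks[res])
    n_keep = ceil(coverage * len(ordered))
    return ordered[:n_keep]


# Toy data only: eight residues with made-up ranks.
toy_ranks = {10: 1, 11: 5, 12: 2, 13: 8, 14: 3, 15: 7, 16: 4, 17: 6}
selected = top_coverage(toy_ranks, coverage=0.25)
print(selected)
```

With real data, the same call at `coverage=0.25` would give the residue set that the tables in Section 2.4.2 are filtered against.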

2.4.2 Overlap with known functional surfaces at 25% coverage. The name of the ligand is composed of the source PDB identifier and the heteroatom name used in that file.

Iodide ion binding site. Table 2 lists the top 25% of residues at the interface with 1gw9AIOD1400 (iodide ion). The following table (Table 3) suggests possible disruptive replacements for these residues (see Section 3.6).

Table 2.
res  type  subst's(%)          cvg   noc/bb  dist(Å)  antn
137  W     W(100)              0.09  1/1     4.83     site
141  E     E(100)              0.09  3/0     4.13
138  G     G(92) P(7)          0.18  2/2     3.71

Table 2. The top 25% of residues in 1gw9A at the interface with iodide ion. (Field names: res: residue number in the PDB entry; type: amino acid type; substs: substitutions seen in the alignment, with the percentage of each type in brackets; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given after the slash; dist: distance of closest approach to the ligand.)

Table 3.
res  type  disruptive mutations
137  W     (KE)(TQD)(SNCRG)(M)
141  E     (FWH)(YVCARG)(T)(SNKLPI)
138  G     (R)(KE)(H)(FYQWD)

Table 3. List of disruptive mutations for the top 25% of residues in 1gw9A, that are at the interface with iodide ion.

Fig. 5. Residues in 1gw9A, at the interface with iodide ion, colored by their relative importance. The ligand (iodide ion) is colored green. Atoms further than 30 Å away from the geometric center of the ligand, as well as on the line of sight to the ligand were removed. (See Appendix for the coloring scheme for the protein chain 1gw9A.)

Figure 5 shows residues in 1gw9A colored by their importance, at the interface with 1gw9AIOD1400.

Iodide ion binding site. Table 4 lists the top 25% of residues at the interface with 1gw9AIOD1410 (iodide ion). The following table (Table 5) suggests possible disruptive replacements for these residues (see Section 3.6).

Table 4.
res  type  subst's(%)          cvg   noc/bb  dist(Å)  antn
16   W     W(100)              0.09  4/0     4.04
54   H     H(100)              0.09  3/0     4.34     site
94   F     F(96) H(3)          0.15  2/0     3.88

Table 4. The top 25% of residues in 1gw9A at the interface with iodide ion. (Field names as in Table 2.)

Table 5.
res  type  disruptive mutations
16   W     (KE)(TQD)(SNCRG)(M)
54   H     (E)(TQMD)(SNKVCLAPIG)(YR)
94   F     (E)(K)(TQD)(SNCG)

Table 5. List of disruptive mutations for the top 25% of residues in 1gw9A, that are at the interface with iodide ion.

Fig. 6. Residues in 1gw9A, at the interface with iodide ion, colored by their relative importance. The ligand (iodide ion) is colored green. Atoms further than 30 Å away from the geometric center of the ligand, as well as on the line of sight to the ligand were removed. (See Appendix for the coloring scheme for the protein chain 1gw9A.)

Figure 6 shows residues in 1gw9A colored by their importance, at the interface with 1gw9AIOD1410.

Iodide ion binding site. Table 6 lists the top 25% of residues at the interface with 1gw9AIOD1405 (iodide ion). The following table (Table 7) suggests possible disruptive replacements for these residues (see Section 3.6).

Table 6.
res  type  subst's(%)          cvg   noc/bb  dist(Å)
24   D     D(100)              0.09  5/1     3.72
25   P     P(88) Q(11)         0.21  4/1     3.02
254  Y     Y(25) W(62) F(11)   0.25  4/0     4.29

Table 6. The top 25% of residues in 1gw9A at the interface with iodide ion. (Field names as in Table 2.)

Table 7.
res  type  disruptive mutations
24   D     (R)(FWH)(KYVCAG)(TQM)
25   P     (Y)(THR)(SCG)(FEW)
254  Y     (K)(Q)(E)(M)

Table 7. List of disruptive mutations for the top 25% of residues in 1gw9A, that are at the interface with iodide ion.

Fig. 7. Residues in 1gw9A, at the interface with iodide ion, colored by their relative importance. The ligand (iodide ion) is colored green. Atoms further than 30 Å away from the geometric center of the ligand, as well as on the line of sight to the ligand were removed. (See Appendix for the coloring scheme for the protein chain 1gw9A.)

Figure 7 shows residues in 1gw9A colored by their importance, at the interface with 1gw9AIOD1405.

Beta-l-xylopyranose binding site. Table 8 lists the top 25% of residues at the interface with 1gw9ALXC1390 (beta-l-xylopyranose). The following table (Table 9) suggests possible disruptive replacements for these residues (see Section 3.6).

Table 8.
res  type  subst's(%)          cvg   noc/bb  dist(Å)  antn
16   W     W(100)              0.09  12/0    4.14
54   H     H(100)              0.09  17/0    2.78     site
137  W     W(100)              0.09  48/0    3.44     site
181  E     E(100)              0.09  20/0    2.41     site
215  N     N(100)              0.09  1/0     4.58
217  E     E(100)              0.09  6/0     3.24     site
220  H     H(100)              0.09  4/0     3.46     site
245  D     D(100)              0.09  6/0     3.10     site
255  D     D(100)              0.09  2/0     4.84     site
257  D     D(100)              0.09  1/0     4.88
287  D     D(100)              0.09  15/0    2.88
90   T     T(96) G(3)          0.15  4/0     3.80
94   F     F(96) H(3)          0.15  8/0     3.70
135  V     V(96) C(3)          0.15  2/0     4.22
289  K     K(88) H(11)         0.17  1/0     4.22

Table 8. The top 25% of residues in 1gw9A at the interface with beta-l-xylopyranose. (Field names as in Table 2.)

Table 9.
res  type  disruptive mutations
16   W     (KE)(TQD)(SNCRG)(M)
54   H     (E)(TQMD)(SNKVCLAPIG)(YR)
137  W     (KE)(TQD)(SNCRG)(M)
181  E     (FWH)(YVCARG)(T)(SNKLPI)
215  N     (Y)(FTWH)(SEVCARG)(MD)
217  E     (FWH)(YVCARG)(T)(SNKLPI)
220  H     (E)(TQMD)(SNKVCLAPIG)(YR)
245  D     (R)(FWH)(KYVCAG)(TQM)
255  D     (R)(FWH)(KYVCAG)(TQM)
257  D     (R)(FWH)(KYVCAG)(TQM)
287  D     (R)(FWH)(KYVCAG)(TQM)
90   T     (KR)(FQMWH)(E)(NLPI)
94   F     (E)(K)(TQD)(SNCG)
135  V     (KER)(Y)(QHD)(N)
289  K     (TY)(SFVCAWG)(D)(E)

Table 9. List of disruptive mutations for the top 25% of residues in 1gw9A, that are at the interface with beta-l-xylopyranose.

Fig. 8. Residues in 1gw9A, at the interface with beta-l-xylopyranose, colored by their relative importance. The ligand (beta-l-xylopyranose) is colored green. Atoms further than 30 Å away from the geometric center of the ligand, as well as on the line of sight to the ligand were removed. (See Appendix for the coloring scheme for the protein chain 1gw9A.)

Figure 8 shows residues in 1gw9A colored by their importance, at the interface with 1gw9ALXC1390.

Iodide ion binding site. Table 10 lists the top 25% of residues at the interface with 1gw9AIOD1403 (iodide ion). The following table (Table 11) suggests possible disruptive replacements for these residues (see Section 3.6).

Table 10.
res  type  subst's(%)          cvg   noc/bb  dist(Å)
135  V     V(96) C(3)          0.15  2/2     4.44
134  Y     Y(92) T(3) F(3)     0.24  2/0     3.73

Table 10. The top 25% of residues in 1gw9A at the interface with iodide ion. (Field names as in Table 2.)

Table 11.
res  type  disruptive mutations
135  V     (KER)(Y)(QHD)(N)
134  Y     (K)(Q)(M)(E)

Table 11. List of disruptive mutations for the top 25% of residues in 1gw9A, that are at the interface with iodide ion.

Fig. 9. Residues in 1gw9A, at the interface with iodide ion, colored by their relative importance. The ligand (iodide ion) is colored green. Atoms further than 30 Å away from the geometric center of the ligand, as well as on the line of sight to the ligand were removed. (See Appendix for the coloring scheme for the protein chain 1gw9A.)

Figure 9 shows residues in 1gw9A colored by their importance, at the interface with 1gw9AIOD1403.

Iodide ion binding site. Table 12 lists the top 25% of residues at the interface with 1gw9AIOD1402 (iodide ion). The following table (Table 13) suggests possible disruptive replacements for these residues (see Section 3.6).

Table 12.
res  type  subst's(%)          cvg   noc/bb  dist(Å)  antn
137  W     W(100)              0.09  5/4     3.10     site
90   T     T(96) G(3)          0.15  1/0     4.29
135  V     V(96) C(3)          0.15  2/2     4.21
138  G     G(92) P(7)          0.18  1/1     4.74

Table 12. The top 25% of residues in 1gw9A at the interface with iodide ion. (Field names as in Table 2.)

Table 13.
res  type  disruptive mutations
137  W     (KE)(TQD)(SNCRG)(M)
90   T     (KR)(FQMWH)(E)(NLPI)
135  V     (KER)(Y)(QHD)(N)
138  G     (R)(KE)(H)(FYQWD)

Table 13. List of disruptive mutations for the top 25% of residues in 1gw9A, that are at the interface with iodide ion.

Fig. 10. Residues in 1gw9A, at the interface with iodide ion, colored by their relative importance. The ligand (iodide ion) is colored green. Atoms further than 30 Å away from the geometric center of the ligand, as well as on the line of sight to the ligand were removed. (See Appendix for the coloring scheme for the protein chain 1gw9A.)

Figure 10 shows residues in 1gw9A colored by their importance, at the interface with 1gw9AIOD1402.

Iodide ion binding site. Table 14 lists the top 25% of residues at the interface with 1gw9AIOD1407 (iodide ion). The following table (Table 15) suggests possible disruptive replacements for these residues (see Section 3.6).

Table 14.
res  type  subst's(%)          cvg   noc/bb  dist(Å)
241  L     L(92) F(7)          0.09  5/4     3.10
235  A     A(96) V(3)          0.14  1/1     4.39

Table 14. The top 25% of residues in 1gw9A at the interface with iodide ion. (Field names as in Table 2.)

Table 15.
res  type  disruptive mutations
241  L     (R)(TY)(KE)(SCHG)
235  A     (KYER)(QHD)(N)(FTMW)

Table 15. List of disruptive mutations for the top 25% of residues in 1gw9A, that are at the interface with iodide ion.

Fig. 11. Residues in 1gw9A, at the interface with iodide ion, colored by their relative importance. The ligand (iodide ion) is colored green. Atoms further than 30 Å away from the geometric center of the ligand, as well as on the line of sight to the ligand were removed. (See Appendix for the coloring scheme for the protein chain 1gw9A.)

Figure 11 shows residues in 1gw9A colored by their importance, at the interface with 1gw9AIOD1407.

Interface with 1gw9A3. Table 16 lists the top 25% of residues at the interface with 1gw9A3. The following table (Table 17) suggests possible disruptive replacements for these residues (see Section 3.6).

Table 16.
res  type  subst's(%)          cvg   noc/bb  dist(Å)
105  T     T(100)              0.09  5/4     3.58
168  Y     Y(100)              0.09  20/6    3.49
184  P     P(100)              0.09  1/0     4.24
230  H     H(100)              0.09  33/6    3.60
190  D     D(37) H(62)         0.11  31/5    3.09
191  I     I(37) Q(62)         0.11  1/1     4.89
195  T     T(37) D(62)         0.14  18/0    3.15
197  G     G(37) A(62)         0.14  37/37   3.09
223  M     M(37) L(62)         0.14  1/1     4.94
226  L     L(37) H(62)         0.14  11/1    4.01
224  A     A(74) S(25)         0.18  17/13   3.33
202  F     F(96) L(3)          0.19  18/6    3.73
97   P     P(88) G(3) R(7)     0.20  10/7    3.74
111  V     V(92) I(7)          0.21  9/4     4.19
221  E     E(33) A(62) D(3)    0.22  2/0     4.70
350  L     L(81) E(11) I(7)    0.23  45/16   3.03
193  L     L(25) D(62) F(11)   0.25  5/2     3.86

Table 16. The top 25% of residues in 1gw9A at the interface with 1gw9A3. (Field names: res: residue number in the PDB entry; type: amino acid type; substs: substitutions seen in the alignment, with the percentage of each type in brackets; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given after the slash; dist: distance of closest approach to the ligand.)
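The field conventions described in the caption can be made concrete with a short Python sketch that splits a substitution field and a noc/bb field into numbers. The helper names `parse_substitutions` and `parse_contacts` are hypothetical, and reading noc/bb as "total contacts / backbone contacts" is an interpretation of the caption rather than something the report's file formats guarantee.

```python
import re

# One substitution entry looks like "P(88)": amino acid letter, percentage in brackets.
SUBST_RE = re.compile(r"([A-Z])\((\d+)\)")


def parse_substitutions(field):
    """'P(88) G(3) R(7)' -> {'P': 88, 'G': 3, 'R': 7} (percentages as integers)."""
    return {aa: int(pct) for aa, pct in SUBST_RE.findall(field)}


def parse_contacts(field):
    """'10/7' -> (10, 7): number of contacts, then backbone contacts (assumed order)."""
    total, backbone = field.split("/")
    return int(total), int(backbone)


# Row for residue 97 as printed in Table 16.
subs = parse_substitutions("P(88) G(3) R(7)")
noc, bb = parse_contacts("10/7")
print(subs, noc, bb, sum(subs.values()))
```

Note that the printed percentages need not add up to 100 (here they sum to 98), presumably because the report truncates or rounds them.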

Table 17.
res  type  disruptive mutations
105  T     (KR)(FQMWH)(NELPI)(D)
168  Y     (K)(QM)(NEVLAPIR)(D)
184  P     (YR)(TH)(SKECG)(FQWD)
230  H     (E)(TQMD)(SNKVCLAPIG)(YR)
190  D     (R)(FKVCAWG)(TYQMH)(SNLPI)
191  I     (Y)(THR)(SCG)(FEW)
195  T     (R)(K)(FWH)(QM)
197  G     (KER)(QHD)(FYMW)(N)
223  M     (Y)(TH)(R)(SCG)
226  L     (TYR)(E)(SKCG)(QHD)
224  A     (KR)(YE)(QH)(D)
202  F     (KE)(T)(QDR)(SCG)
97   P     (Y)(R)(TEH)(K)
111  V     (YR)(KE)(H)(QD)
221  E     (H)(FW)(R)(Y)
350  L     (YR)(H)(T)(CG)
193  L     (R)(Y)(T)(KH)

Table 17. List of disruptive mutations for the top 25% of residues in 1gw9A, that are at the interface with 1gw9A3.

Fig. 12. Residues in 1gw9A, at the interface with 1gw9A3, colored by their relative importance. 1gw9A3 is shown in backbone representation. (See Appendix for the coloring scheme for the protein chain 1gw9A.)

Figure 12 shows residues in 1gw9A colored by their importance, at the interface with 1gw9A3.

Calcium ion binding site. Table 18 lists the top 25% of residues at the interface with 1gw9ACA1387 (calcium ion). The following table (Table 19) suggests possible disruptive replacements for these residues (see Section 3.6).

Table 18.
res  type  subst's(%)          cvg   noc/bb  dist(Å)  antn
181  E     E(100)              0.09  4/0     2.12     site
215  N     N(100)              0.09  1/0     4.90
217  E     E(100)              0.09  5/0     1.96     site
220  H     H(100)              0.09  3/0     4.07     site
245  D     D(100)              0.09  4/0     2.14     site
287  D     D(100)              0.09  4/0     2.07

Table 18. The top 25% of residues in 1gw9A at the interface with calcium ion. (Field names as in Table 2.)

Table 19.
res  type  disruptive mutations
181  E     (FWH)(YVCARG)(T)(SNKLPI)
215  N     (Y)(FTWH)(SEVCARG)(MD)
217  E     (FWH)(YVCARG)(T)(SNKLPI)
220  H     (E)(TQMD)(SNKVCLAPIG)(YR)
245  D     (R)(FWH)(KYVCAG)(TQM)
287  D     (R)(FWH)(KYVCAG)(TQM)

Table 19. List of disruptive mutations for the top 25% of residues in 1gw9A, that are at the interface with calcium ion.

Fig. 13. Residues in 1gw9A, at the interface with calcium ion, colored by their relative importance. The ligand (calcium ion) is colored green. Atoms further than 30 Å away from the geometric center of the ligand, as well as on the line of sight to the ligand were removed. (See Appendix for the coloring scheme for the protein chain 1gw9A.)
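The ligand-figure captions mention that atoms further than 30 Å from the geometric center of the ligand were removed. A minimal Python sketch of that distance filter follows; it is not the script used to produce the figures, the function names are hypothetical, the coordinates are toy values, and the additional line-of-sight removal described in the captions is not modeled.

```python
import math


def geometric_center(coords):
    """Unweighted mean of a list of (x, y, z) coordinates."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))


def within_cutoff(atoms, ligand_coords, cutoff=30.0):
    """Keep atoms no further than `cutoff` (in Å) from the ligand's geometric center."""
    center = geometric_center(ligand_coords)
    return [a for a in atoms if math.dist(a, center) <= cutoff]


# Toy coordinates only (not taken from 1gw9).
ligand = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
protein_atoms = [(1.0, 0.0, 0.0), (31.0, 0.0, 0.0), (32.0, 0.0, 0.0)]
kept = within_cutoff(protein_atoms, ligand)
print(kept)
```

For a single-atom ligand such as an iodide or calcium ion, the geometric center is simply the position of that atom.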

Figure 13 shows residues in 1gw9A colored by their importance, at the interface with 1gw9ACA1387.

Iodide ion binding site. Table 20 lists the top 25% of residues at the interface with 1gw9AIOD1391 (iodide ion). The following table (Table 21) suggests possible disruptive replacements for these residues (see Section 3.6).

Table 20.
res  type  subst's(%)          cvg   noc/bb  dist(Å)
10   R     R(96) K(3)          0.14  5/3     4.26
87   P     P(33) L(62) R(3)    0.23  3/0     4.25

Table 20. The top 25% of residues in 1gw9A at the interface with iodide ion. (Field names as in Table 2.)

Table 21.
res  type  disruptive mutations
10   R     (T)(YD)(SVCAG)(FELWPI)
87   P     (Y)(T)(R)(H)

Table 21. List of disruptive mutations for the top 25% of residues in 1gw9A, that are at the interface with iodide ion.

Fig. 14. Residues in 1gw9A, at the interface with iodide ion, colored by their relative importance. The ligand (iodide ion) is colored green. Atoms further than 30 Å away from the geometric center of the ligand, as well as on the line of sight to the ligand were removed. (See Appendix for the coloring scheme for the protein chain 1gw9A.)

Figure 14 shows residues in 1gw9A colored by their importance, at the interface with 1gw9AIOD1391.

Iodide ion binding site. Table 22 lists the top 25% of residues at the interface with 1gw9AIOD1394 (iodide ion). The following table (Table 23) suggests possible disruptive replacements for these residues (see Section 3.6).

Table 22.
res  type  subst's(%)          cvg   noc/bb  dist(Å)
282  G     G(100)              0.09  1/1     4.71
283  P     P(37) G(62)         0.14  3/1     3.94
87   P     P(33) L(62) R(3)    0.23  1/0     4.54

Table 22. The top 25% of residues in 1gw9A at the interface with iodide ion. (Field names as in Table 2.)

Table 23.
res  type  disruptive mutations
282  G     (KER)(FQMWHD)(NYLPI)(SVA)
283  P     (R)(Y)(H)(KE)
87   P     (Y)(T)(R)(H)

Table 23. List of disruptive mutations for the top 25% of residues in 1gw9A, that are at the interface with iodide ion.

Fig. 15. Residues in 1gw9A, at the interface with iodide ion, colored by their relative importance. The ligand (iodide ion) is colored green. Atoms further than 30 Å away from the geometric center of the ligand, as well as on the line of sight to the ligand were removed. (See Appendix for the coloring scheme for the protein chain 1gw9A.)

Figure 15 shows residues in 1gw9A colored by their importance, at the interface with 1gw9AIOD1394.

Iodide ion binding site. Table 24 lists the top 25% of residues at the interface with 1gw9AIOD1413 (iodide ion). The following table (Table 25) suggests possible disruptive replacements for these residues (see Section 3.6).

Table 24.
res  type  subst's(%)          cvg   noc/bb  dist(Å)
193  L     L(25) D(62) F(11)   0.25  6/3     3.15

Table 24. The top 25% of residues in 1gw9A at the interface with iodide ion. (Field names as in Table 2.)

Table 25.
res  type  disruptive mutations
193  L     (R)(Y)(T)(KH)

Table 25. List of disruptive mutations for the top 25% of residues in 1gw9A, that are at the interface with iodide ion.

Fig. 16. Residues in 1gw9A, at the interface with iodide ion, colored by their relative importance. The ligand (iodide ion) is colored green. Atoms further than 30 Å away from the geometric center of the ligand, as well as on the line of sight to the ligand were removed. (See Appendix for the coloring scheme for the protein chain 1gw9A.)

Figure 16 shows residues in 1gw9A colored by their importance, at the interface with 1gw9AIOD1413.

Iodide ion binding site. Table 26 lists the top 25% of residues at the interface with 1gw9AIOD1417 (iodide ion). The following table (Table 27) suggests possible disruptive replacements for these residues (see Section 3.6).

Table 26.
res  type  subst's(%)          cvg   noc/bb  dist(Å)
230  H     H(100)              0.09  2/0     4.08

Table 26. The top 25% of residues in 1gw9A at the interface with iodide ion. (Field names as in Table 2.)

Table 27.
res  type  disruptive mutations
230  H     (E)(TQMD)(SNKVCLAPIG)(YR)

Table 27. List of disruptive mutations for the top 25% of residues in 1gw9A, that are at the interface with iodide ion.

Fig. 17. Residues in 1gw9A, at the interface with iodide ion, colored by their relative importance. The ligand (iodide ion) is colored green. Atoms further than 30 Å away from the geometric center of the ligand, as well as on the line of sight to the ligand were removed. (See Appendix for the coloring scheme for the protein chain 1gw9A.)

Figure 17 shows residues in 1gw9A colored by their importance, at the interface with 1gw9AIOD1417.

Iodide ion binding site. Table 28 lists the top 25% of residues at the interface with 1gw9AIOD1393 (iodide ion). The following table (Table 29) suggests possible disruptive replacements for these residues (see Section 3.6).

Table 28.
res  type  subst's(%)          cvg   noc/bb  dist(Å)
282  G     G(100)              0.09  4/4     4.08
10   R     R(96) K(3)          0.14  2/0     3.91
283  P     P(37) G(62)         0.14  5/2     3.90

Table 28. The top 25% of residues in 1gw9A at the interface with iodide ion. (Field names as in Table 2.)

Table 29.
res  type  disruptive mutations
282  G     (KER)(FQMWHD)(NYLPI)(SVA)
10   R     (T)(YD)(SVCAG)(FELWPI)
283  P     (R)(Y)(H)(KE)

Table 29. List of disruptive mutations for the top 25% of residues in 1gw9A, that are at the interface with iodide ion.

Fig. 18. Residues in 1gw9A, at the interface with iodide ion, colored by their relative importance. The ligand (iodide ion) is colored green. Atoms further than 30 Å away from the geometric center of the ligand, as well as on the line of sight to the ligand were removed. (See Appendix for the coloring scheme for the protein chain 1gw9A.)

Figure 18 shows residues in 1gw9A colored by their importance, at the interface with 1gw9AIOD1393.

Interface with 1gw9A1. Table 30 lists the top 25% of residues at the interface with 1gw9A1. The following table (Table 31) suggests possible disruptive replacements for these residues (see Section 3.6).

Table 30.
res  type  subst's(%)          cvg   noc/bb  dist(Å)
313  L     L(37) G(62)         0.14  1/0     4.85
30   T     T(92) V(7)          0.18  1/1     4.93
308  R     R(33) D(62) S(3)    0.20  32/0    2.96
261  G     G(37) P(51) L(11)   0.22  5/5     3.36
298  G     G(33) D(62) A(3)    0.23  3/3     4.17
3    Y     Y(81) L(3) F(14)    0.24  8/0     3.89
253  K     K(25) G(62) R(11)   0.25  2/0     4.40

Table 30. The top 25% of residues in 1gw9A at the interface with 1gw9A1. (Field names as in Table 2.)

Table 31.
res  type  disruptive mutations
313  L     (R)(Y)(H)(KE)
30   T     (KR)(QH)(FEMW)(N)
308  R     (TY)(D)(FVCLAWPIG)(EM)
261  G     (R)(KE)(H)(Y)
298  G     (R)(K)(H)(E)
3    Y     (K)(Q)(R)(E)
253  K     (Y)(FW)(T)(S)

Table 31. List of disruptive mutations for the top 25% of residues in 1gw9A, that are at the interface with 1gw9A1.

Fig. 19. Residues in 1gw9A, at the interface with 1gw9A1, colored by their relative importance. 1gw9A1 is shown in backbone representation. (See Appendix for the coloring scheme for the protein chain 1gw9A.)

Figure 19 shows residues in 1gw9A colored by their importance, at the interface with 1gw9A1.

Interface with 1gw9A2. Table 32 lists the top 25% of residues at the interface with 1gw9A2. The following table (Table 33) suggests possible disruptive replacements for these residues (see Section 3.6).

Table 32.
res  type  subst's(%)          cvg   noc/bb  dist(Å)  antn
24   D     D(100)              0.09  17/0    2.82
26   F     F(100)              0.09  88/31   3.00     site
27   G     G(100)              0.09  22/22   3.19
137  W     W(100)              0.09  16/0    3.75     site
140  R     R(100)              0.09  44/0    2.82
183  K     K(100)              0.09  4/0     3.58
186  E     E(100)              0.09  40/14   3.64
187  P     P(100)              0.09  26/11   3.55
255  D     D(100)              0.09  2/0     4.25     site
292  R     R(100)              0.09  47/7    3.12
190  D     D(37) H(62)         0.11  5/1     3.42
94   F     F(96) H(3)          0.15  27/11   3.21
289  K     K(88) H(11)         0.17  2/0     4.43
30   T     T(92) V(7)          0.18  13/3    3.90
97   P     P(88) G(3) R(7)     0.20  23/5    3.44
25   P     P(88) Q(11)         0.21  18/6    3.81
253  K     K(25) G(62) R(11)   0.25  43/15   2.93
254  Y     Y(25) W(62) F(11)   0.25  74/10   3.33

Table 32. The top 25% of residues in 1gw9A at the interface with 1gw9A2. (Field names as in Table 2.)

Table 33.
res  type  disruptive mutations
24   D     (R)(FWH)(KYVCAG)(TQM)
26   F     (KE)(TQD)(SNCRG)(M)
27   G     (KER)(FQMWHD)(NYLPI)(SVA)
137  W     (KE)(TQD)(SNCRG)(M)
140  R     (TD)(SYEVCLAPIG)(FMW)(N)
183  K     (Y)(FTW)(SVCAG)(HD)
186  E     (FWH)(YVCARG)(T)(SNKLPI)
187  P     (YR)(TH)(SKECG)(FQWD)
255  D     (R)(FWH)(KYVCAG)(TQM)
292  R     (TD)(SYEVCLAPIG)(FMW)(N)
190  D     (R)(FKVCAWG)(TYQMH)(SNLPI)
94   F     (E)(K)(TQD)(SNCG)
289  K     (TY)(SFVCAWG)(D)(E)
30   T     (KR)(QH)(FEMW)(N)
97   P     (Y)(R)(TEH)(K)
25   P     (Y)(THR)(SCG)(FEW)
253  K     (Y)(FW)(T)(S)
254  Y     (K)(Q)(E)(M)

Table 33. List of disruptive mutations for the top 25% of residues in 1gw9A, that are at the interface with 1gw9A2.
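The "disruptive mutations" entries in tables such as Table 33 are printed as consecutive bracketed groups of one-letter amino acid codes. A small Python sketch for splitting such a field into its groups is given below. The helper name `parse_mutation_groups` is hypothetical, and the sketch deliberately attaches no meaning to the order of the groups, which is explained in Section 3.6 of the report.

```python
import re


def parse_mutation_groups(field):
    """'(KE)(TQD)(SNCRG)(M)' -> ['KE', 'TQD', 'SNCRG', 'M']."""
    return re.findall(r"\(([A-Z]+)\)", field)


# Entry for residue 137 (type W) as printed in Table 33.
groups = parse_mutation_groups("(KE)(TQD)(SNCRG)(M)")
suggested = set("".join(groups))

print(groups)
# Simple sanity check: the native residue type should not be suggested as a replacement.
print("W" in suggested)
```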

13 Table 35. continued res type disruptive mutations 254 Y (K)(Q)(E)(M)

Table 35. List of disruptive mutations for the top 25% of residues in 1gw9A, that are at the interface with iodide ion.

Fig. 20. Residues in 1gw9A, at the interface with 1gw9A2, colored by their relative importance. 1gw9A2 is shown in backbone representation (See Appendix for the coloring scheme for the protein chain 1gw9A.)

Figure 20 shows residues in 1gw9A colored by their importance, at the interface with 1gw9A2.

Iodide ion binding site. Table 34 lists the top 25% of residues at the interface with 1gw9AIOD1406 (iodide ion). The following table (Table 35) suggests possible disruptive replacements for these residues (see Section 3.6).

Table 34.
res   type  subst's (%)          cvg    noc/bb  dist (Å)
186   E     E(100)               0.09   2/0     3.34
25    P     P(88) Q(11)          0.21   2/0     3.48
254   Y     Y(25) W(62) F(11)    0.25   4/0     3.67

Table 34. The top 25% of residues in 1gw9A at the interface with iodide ion. (Field names: res: residue number in the PDB entry; type: amino acid type; substs: substitutions seen in the alignment, with the percentage of each type in the bracket; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given in the bracket; dist: distance of closest approach to the ligand.)

Table 35.
res   type  disruptive mutations
186   E     (FWH)(YVCARG)(T)(SNKLPI)
25    P     (Y)(THR)(SCG)(FEW)

Fig. 21. Residues in 1gw9A, at the interface with iodide ion, colored by their relative importance. The ligand (iodide ion) is colored green. Atoms further than 30Å away from the geometric center of the ligand, as well as on the line of sight to the ligand were removed. (See Appendix for the coloring scheme for the protein chain 1gw9A.)

Figure 21 shows residues in 1gw9A colored by their importance, at the interface with 1gw9AIOD1406.

Iodide ion binding site. Table 36 lists the top 25% of residues at the interface with 1gw9AIOD1392 (iodide ion). The following table (Table 37) suggests possible disruptive replacements for these residues (see Section 3.6).

Table 36.
res   type  subst's (%)          cvg    noc/bb  dist (Å)
10    R     R(96) K(3)           0.14   5/2     3.89
283   P     P(37) G(62)          0.14   5/2     3.98
87    P     P(33) L(62) R(3)     0.23   2/0     4.27

Table 36. The top 25% of residues in 1gw9A at the interface with iodide ion. (Field names: res: residue number in the PDB entry; type: amino acid type; substs: substitutions seen in the alignment, with the percentage of each type in the bracket; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given in the bracket; dist: distance of closest approach to the ligand.)

Table 37.
res   type  disruptive mutations
10    R     (T)(YD)(SVCAG)(FELWPI)
283   P     (R)(Y)(H)(KE)
87    P     (Y)(T)(R)(H)

Table 37. List of disruptive mutations for the top 25% of residues in 1gw9A, that are at the interface with iodide ion.

Fig. 22. Residues in 1gw9A, at the interface with iodide ion, colored by their relative importance. The ligand (iodide ion) is colored green. Atoms further than 30Å away from the geometric center of the ligand, as well as on the line of sight to the ligand were removed. (See Appendix for the coloring scheme for the protein chain 1gw9A.)

Figure 22 shows residues in 1gw9A colored by their importance, at the interface with 1gw9AIOD1392.

Iodide ion binding site. Table 38 lists the top 25% of residues at the interface with 1gw9AIOD1414 (iodide ion). The following table (Table 39) suggests possible disruptive replacements for these residues (see Section 3.6).

Table 38.
res   type  subst's (%)          cvg    noc/bb  dist (Å)
141   E     E(100)               0.09   4/2     3.84
191   I     I(37) Q(62)          0.11   1/0     3.82
138   G     G(92) P(7)           0.18   4/4     3.28
193   L     L(25) D(62) F(11)    0.25   5/2     4.45

Table 38. The top 25% of residues in 1gw9A at the interface with iodide ion. (Field names: res: residue number in the PDB entry; type: amino acid type; substs: substitutions seen in the alignment, with the percentage of each type in the bracket; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given in the bracket; dist: distance of closest approach to the ligand.)

Table 39.
res   type  disruptive mutations
141   E     (FWH)(YVCARG)(T)(SNKLPI)
191   I     (Y)(THR)(SCG)(FEW)
138   G     (R)(KE)(H)(FYQWD)
193   L     (R)(Y)(T)(KH)

Table 39. List of disruptive mutations for the top 25% of residues in 1gw9A, that are at the interface with iodide ion.

Fig. 23. Residues in 1gw9A, at the interface with iodide ion, colored by their relative importance. The ligand (iodide ion) is colored green. Atoms further than 30Å away from the geometric center of the ligand, as well as on the line of sight to the ligand were removed. (See Appendix for the coloring scheme for the protein chain 1gw9A.)

Figure 23 shows residues in 1gw9A colored by their importance, at the interface with 1gw9AIOD1414.

Calcium ion binding site. Table 40 lists the top 25% of residues at the interface with 1gw9ACA1388 (calcium ion). The following table (Table 41) suggests possible disruptive replacements for these residues (see Section 3.6).

Table 40.
res   type  subst's (%)   cvg    noc/bb  dist (Å)  antn
183   K     K(100)        0.09   2/0     4.79
217   E     E(100)        0.09   4/0     2.03      site
220   H     H(100)        0.09   5/0     2.64      site
247   N     N(100)        0.09   1/0     3.91
255   D     D(100)        0.09   4/0     2.04      site
257   D     D(100)        0.09   4/0     2.28
287   D     D(100)        0.09   1/0     4.74

Table 40. The top 25% of residues in 1gw9A at the interface with calcium ion. (Field names: res: residue number in the PDB entry; type: amino acid type; substs: substitutions seen in the alignment, with the percentage of each type in the bracket; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given in the bracket; dist: distance of closest approach to the ligand.)

Table 41.
res   type  disruptive mutations
183   K     (Y)(FTW)(SVCAG)(HD)
217   E     (FWH)(YVCARG)(T)(SNKLPI)
220   H     (E)(TQMD)(SNKVCLAPIG)(YR)
247   N     (Y)(FTWH)(SEVCARG)(MD)
255   D     (R)(FWH)(KYVCAG)(TQM)
257   D     (R)(FWH)(KYVCAG)(TQM)
287   D     (R)(FWH)(KYVCAG)(TQM)

Table 41. List of disruptive mutations for the top 25% of residues in 1gw9A, that are at the interface with calcium ion.

Fig. 24. Residues in 1gw9A, at the interface with calcium ion, colored by their relative importance. The ligand (calcium ion) is colored green. Atoms further than 30Å away from the geometric center of the ligand, as well as on the line of sight to the ligand were removed. (See Appendix for the coloring scheme for the protein chain 1gw9A.)

Figure 24 shows residues in 1gw9A colored by their importance, at the interface with 1gw9ACA1388.

Iodide ion binding site. Table 42 lists the top 25% of residues at the interface with 1gw9AIOD1401 (iodide ion). The following table (Table 43) suggests possible disruptive replacements for these residues (see Section 3.6).

Table 42.
res   type  subst's (%)   cvg    noc/bb  dist (Å)  antn
137   W     W(100)        0.09   4/4     4.08      site
138   G     G(92) P(7)    0.18   2/2     3.92

Table 42. The top 25% of residues in 1gw9A at the interface with iodide ion. (Field names: res: residue number in the PDB entry; type: amino acid type; substs: substitutions seen in the alignment, with the percentage of each type in the bracket; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given in the bracket; dist: distance of closest approach to the ligand.)

Table 43.
res   type  disruptive mutations
137   W     (KE)(TQD)(SNCRG)(M)
138   G     (R)(KE)(H)(FYQWD)

Table 43. List of disruptive mutations for the top 25% of residues in 1gw9A, that are at the interface with iodide ion.

Fig. 25. Residues in 1gw9A, at the interface with iodide ion, colored by their relative importance. The ligand (iodide ion) is colored green. Atoms further than 30Å away from the geometric center of the ligand, as well as on the line of sight to the ligand were removed. (See Appendix for the coloring scheme for the protein chain 1gw9A.)

Figure 25 shows residues in 1gw9A colored by their importance, at the interface with 1gw9AIOD1401.

Iodide ion binding site. Table 44 lists the top 25% of residues at the interface with 1gw9AIOD1404 (iodide ion). The following table (Table 45) suggests possible disruptive replacements for these residues (see Section 3.6).

Table 44.
res   type  subst's (%)   cvg    noc/bb  dist (Å)
105   T     T(100)        0.09   1/1     4.95
141   E     E(100)        0.09   4/0     2.68
138   G     G(92) P(7)    0.18   1/1     4.51

Table 44. The top 25% of residues in 1gw9A at the interface with iodide ion. (Field names: res: residue number in the PDB entry; type: amino acid type; substs: substitutions seen in the alignment, with the percentage of each type in the bracket; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given in the bracket; dist: distance of closest approach to the ligand.)

Fig. 26. Residues in 1gw9A, at the interface with iodide ion, colored by their relative importance. The ligand (iodide ion) is colored green. Atoms further than 30Å away from the geometric center of the ligand, as well as on the line of sight to the ligand were removed. (See Appendix for the coloring scheme for the protein chain 1gw9A.)

Table 45.
res   type  disruptive mutations
105   T     (KR)(FQMWH)(NELPI)(D)
141   E     (FWH)(YVCARG)(T)(SNKLPI)
138   G     (R)(KE)(H)(FYQWD)

Table 45. List of disruptive mutations for the top 25% of residues in 1gw9A, that are at the interface with iodide ion.
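The "top 25%" cutoff in these captions is a threshold on the cvg column. Section 4.1 describes cvg as the share of residues on the structure that have the same rho or a smaller one, which would explain why several fully conserved residues share the smallest cvg value in the tables. The Python sketch below illustrates that definition only; the function names and the rho values are invented for the example and are not read from the attached ranks file.

```python
def coverage(rhos):
    """For each rho, return the fraction of residues whose rho is <= that value."""
    n = len(rhos)
    if n == 0:
        return []
    return [sum(1 for other in rhos if other <= r) / n for r in rhos]


def top_fraction(rhos, cutoff=0.25):
    """Indices of residues whose coverage does not exceed the cutoff."""
    return [i for i, c in enumerate(coverage(rhos)) if c <= cutoff]


# Hypothetical rho scores for an eight-residue chain (smaller = less variable).
example_rhos = [1.0, 1.0, 3.2, 2.1, 4.0, 1.7, 2.9, 3.6]
example_cvg = coverage(example_rhos)
selected = top_fraction(example_rhos)
```

Residues with tied rho receive the same coverage under this definition, so a cutoff can admit or exclude them only as a group.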

Figure 26 shows residues in 1gw9A colored by their importance, at the interface with 1gw9AIOD1404.

2.4.3 Possible novel functional surfaces at 25% coverage.
One group of residues is conserved on the 1gw9A surface, away from (or substantially larger than) other functional sites and interfaces recognizable in PDB entry 1gw9. It is shown in Fig. 27. The right panel shows (in blue) the rest of the larger cluster this surface belongs to. The residues belonging to this surface "patch" are listed in Table 46, while Table 47 suggests possible disruptive replacements for these residues (see Section 3.6).

Fig. 27. A possible active surface on the chain 1gw9A. The larger cluster it belongs to is shown in blue.

Table 46.
res   type  substitutions(%)     cvg
5     P     P(37)G(62)           0.14
311   L     L(37)A(62)           0.14
313   L     L(37)G(62)           0.14
300   W     W(37)F(55)L(7)       0.16
317   A     A(96)V(3)            0.16
47    G     G(85)N(14)           0.19
308   R     R(33)D(62)S(3)       0.20
46    L     L(88)I(11)           0.21
261   G     G(37)P(51)L(11)      0.22
298   G     G(33)D(62)A(3)       0.23
3     Y     Y(81)L(3)F(14)       0.24

Table 46. Residues forming surface "patch" in 1gw9A.

Table 47.
res   type  disruptive mutations
5     P     (R)(Y)(H)(KE)
311   L     (YR)(H)(TKE)(SQCDG)
313   L     (R)(Y)(H)(KE)
300   W     (KE)(T)(QD)(R)
317   A     (KYER)(QHD)(N)(FTMW)
47    G     (ER)(FKWH)(YMD)(Q)
308   R     (TY)(D)(FVCLAWPIG)(EM)
46    L     (YR)(TH)(SKECG)(FQWD)
261   G     (R)(KE)(H)(Y)
298   G     (R)(K)(H)(E)
3     Y     (K)(Q)(R)(E)

Table 47. Disruptive mutations for the surface patch in 1gw9A.
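Each entry in the "disruptive mutations" column is a sequence of bracketed groups, which Section 3.6 orders from most to least different from the original residue. The Python sketch below splits such an entry into its groups and, purely as an illustration, counts how many of the Section 3.6 property classes separate two residue types. The function names and the counting rule are assumptions made for this example; the count ignores the substitutions seen in the alignment, so it is not the report's own scoring and should not be expected to reproduce the tables.

```python
import re

# Property classes as listed in Section 3.6 of this report.
PROPERTY_CLASSES = {
    "small": "AVGSTC",
    "medium": "LPNQDEMIK",
    "large": "WFYHR",
    "hydrophobic": "LPVAMWFI",
    "polar": "GTCY",
    "positive": "KHR",
    "negative": "DE",
    "aromatic": "WFYH",
    "long_aliphatic": "EKRQM",
    "oh_group": "SDETY",
    "nh2_group": "NQRK",
}


def parse_groups(field):
    """Split '(R)(Y)(H)(KE)' into ['R', 'Y', 'H', 'KE'], first group listed first."""
    groups = re.findall(r"\(([A-Z]+)\)", field)
    if "".join("(" + g + ")" for g in groups) != field.replace(" ", ""):
        raise ValueError("unexpected disruptive-mutation field: %r" % field)
    return groups


def properties(aa):
    """Names of the Section 3.6 classes that contain this one-letter code."""
    return {name for name, members in PROPERTY_CLASSES.items() if aa in members}


def property_distance(original, candidate):
    """Illustrative only: number of classes that contain exactly one of the two residues."""
    return len(properties(original) ^ properties(candidate))


# Row "5 P" of Table 47.
groups = parse_groups("(R)(Y)(H)(KE)")
distances = {aa: property_distance("P", aa) for g in groups for aa in g}
```

A class-membership count of this kind is a convenient way to compare candidates, but any real choice of mutation should rest on the tabulated suggestions and on experiment.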

Another group of surface residues is shown in Fig. 28. The residues belonging to this surface "patch" are listed in Table 48, while Table 49 suggests possible disruptive replacements for these residues (see Section 3.6).

Fig. 28. Another possible active surface on the chain 1gw9A.

Table 48.
res   type  substitutions(%)     cvg
168   Y     Y(100)               0.09
173   G     G(100)               0.09
174   Y     Y(37)F(62)           0.14
124   D     D(59)E(40)           0.18
172   Q     Q(33)I(62)K(3)       0.20

Table 48. Residues forming surface ”patch” in 1gw9A.
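The substitutions column packs residue types and their percentages into one string. Section 3.2 notes that a type seen in fewer than 1% of sequences carries no percentage and that the percentages often do not sum to 100 because of rounding. The Python sketch below parses such a string under those two rules; the function name is made up for this example.

```python
import re


def parse_substitutions(field):
    """Map each one-letter residue type to its percentage, or None if none is given."""
    pairs = re.findall(r"([A-Z])(?:\((\d+)\))?", field)
    return {aa: (int(pct) if pct else None) for aa, pct in pairs}


# Row "172 Q" of Table 48.
subs = parse_substitutions("Q(33)I(62)K(3)")
total = sum(pct for pct in subs.values() if pct is not None)

# Types listed without a percentage (below 1% per Section 3.2) map to None.
bare = parse_substitutions("RVK")
```

The keys of the returned mapping are the residue types that Section 3.2 advises avoiding when a disruptive point mutation is the goal.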

Table 49.
res   type  disruptive mutations
168   Y     (K)(QM)(NEVLAPIR)(D)
173   G     (KER)(FQMWHD)(NYLPI)(SVA)
174   Y     (K)(Q)(EM)(NR)
124   D     (R)(FWH)(YVCAG)(K)
172   Q     (Y)(T)(FWH)(SCG)

Table 49. Disruptive mutations for the surface patch in 1gw9A.

3 NOTES ON USING TRACE RESULTS

3.1 Coverage
Trace results are commonly expressed in terms of coverage: the residue is important if its "coverage" is small - that is if it belongs to some small top percentage of residues [100% is all of the residues in a chain], according to trace. The ET results are presented in the form of a table, usually limited to the top 25% of residues (or to some nearby percentage), sorted by the strength of the presumed evolutionary pressure. (I.e., the smaller the coverage, the stronger the pressure on the residue.) Starting from the top of that list, mutating a couple of residues should affect the protein somehow, with the exact effects to be determined experimentally.

3.2 Known substitutions
One of the table columns is "substitutions" - other amino acid types seen at the same position in the alignment. These amino acid types may be interchangeable at that position in the protein, so if one wants to affect the protein by a point mutation, they should be avoided. For example if the substitutions are "RVK" and the original protein has an R at that position, it is advisable to try anything but RVK. Conversely, when looking for substitutions which will not affect the protein, one may try replacing R with K, or (perhaps more surprisingly) with V. The percentage of times the substitution appears in the alignment is given in the immediately following bracket. No percentage is given in the cases when it is smaller than 1%. This is meant to be a rough guide - due to rounding errors these percentages often do not add up to 100%.

3.3 Surface
To detect candidates for novel functional interfaces, first we look for residues that are solvent accessible (according to DSSP program) by at least 10Å², which is roughly the area needed for one water molecule to come in the contact with the residue. Furthermore, we require that these residues form a "cluster" of residues which have a neighbor within 5Å from any of their heavy atoms.
Note, however, that, if our picture of protein evolution is correct, the neighboring residues which are not surface accessible might be equally important in maintaining the interaction specificity - they should not be automatically dropped from consideration when choosing the set for mutagenesis. (Especially if they form a cluster with the surface residues.)

3.4 Number of contacts
Another column worth noting is denoted "noc/bb"; it tells the number of contacts heavy atoms of the residue in question make across the interface, as well as how many of them are realized through the backbone atoms (if all or most contacts are through the backbone, mutation presumably won't have strong impact). Two heavy atoms are considered to be "in contact" if their centers are closer than 5Å.

3.5 Annotation
If the residue annotation is available (either from the pdb file or from other sources), another column, with the header "annotation" appears. Annotations carried over from PDB are the following: site (indicating existence of related site record in PDB), S-S (disulfide bond forming residue), hb (hydrogen bond forming residue), jb (james bond forming residue), and sb (for salt bridge forming residue).

3.6 Mutation suggestions
Mutation suggestions are completely heuristic and based on complementarity with the substitutions found in the alignment. Note that they are meant to be disruptive to the interaction of the protein with its ligand. The attempt is made to complement the following properties: small [AVGSTC], medium [LPNQDEMIK], large [WFYHR], hydrophobic [LPVAMWFI], polar [GTCY]; positively [KHR], or negatively [DE] charged, aromatic [WFYH], long aliphatic chain [EKRQM], OH-group possession [SDETY], and NH2 group possession [NQRK]. The suggestions are listed according to how different they appear to be from the original amino acid, and they are grouped in round brackets if they appear equally disruptive. From left to right, each bracketed group of amino acid types resembles more strongly the original (i.e. is, presumably, less disruptive). These suggestions are tentative - they might prove disruptive to the fold rather than to the interaction. Many researchers will choose, however, the straightforward alanine mutations, especially in the beginning stages of their investigation.

4 APPENDIX

4.1 File formats
Files with extension "ranks sorted" are the actual trace results. The fields in the table in this file:
• alignment# number of the position in the alignment
• residue# residue number in the PDB file
• type amino acid type
• rank rank of the position according to older version of ET
• variability has two subfields:
  1. number of different amino acids appearing in this column of the alignment
  2. their type
• rho ET score - the smaller this value, the lesser variability of this position across the branches of the tree (and, presumably, the greater the importance for the protein)
• cvg coverage - percentage of the residues on the structure which have this rho or smaller
• gaps percentage of gaps in this column

4.2 Color schemes used
The following color scheme is used in figures with residues colored by cluster size: black is a single-residue cluster; clusters composed of more than one residue are colored according to this hierarchy (ordered by descending size): red, blue, yellow, green, purple, azure, turquoise, brown, coral, magenta, LightSalmon, SkyBlue, violet, gold,

bisque, LightSlateBlue, orchid, RosyBrown, MediumAquamarine, DarkOliveGreen, CornflowerBlue, grey55, burlywood, LimeGreen, tan, DarkOrange, DeepPink, maroon, BlanchedAlmond.
The colors used to distinguish the residues by the estimated evolutionary pressure they experience can be seen in Fig. 29.

Fig. 29. Coloring scheme used to color residues by their relative importance. (Color bar labels: COVERAGE 100%, 50%, 30%, 5%; RELATIVE IMPORTANCE.)

4.3 Credits

4.3.1 Alistat alistat reads a multiple sequence alignment from the file and shows a number of simple statistics about it. These statistics include the format, the number of sequences, the total number of residues, the average and range of the sequence lengths, and the alignment length (e.g. including gap characters). Also shown are some percent identities. A percent pairwise alignment identity is defined as (idents / MIN(len1, len2)) where idents is the number of exact identities and len1, len2 are the unaligned lengths of the two sequences. The "average percent identity", "most related pair", and "most unrelated pair" of the alignment are the average, maximum, and minimum of all (N)(N-1)/2 pairs, respectively. The "most distant seq" is calculated by finding the maximum pairwise identity (best relative) for all N sequences, then finding the minimum of these N numbers (hence, the most outlying sequence). alistat is copyrighted by HHMI/Washington University School of Medicine, 1992-2001, and freely distributed under the GNU General Public License.

4.3.2 CE To map ligand binding sites from different source structures, report maker uses the CE program: http://cl.sdsc.edu/. Shindyalov IN, Bourne PE (1998) "Protein structure alignment by incremental combinatorial extension (CE) of the optimal path". Protein Engineering 11(9) 739-747.

4.3.3 DSSP In this work a residue is considered solvent accessible if the DSSP program finds it exposed to water by at least 10Å², which is roughly the area needed for one water molecule to come in the contact with the residue. DSSP is copyrighted by W. Kabsch, C. Sander and MPI-MF, 1983, 1985, 1988, 1994 1995, CMBI version by [email protected] November 18,2002, http://www.cmbi.kun.nl/gv/dssp/descrip.html.

4.3.4 HSSP Whenever available, report maker uses HSSP alignment as a starting point for the analysis (sequences shorter than 75% of the query are taken out, however); R. Schneider, A. de Daruvar, and C. Sander. "The HSSP database of protein structure-sequence alignments." Nucleic Acids Res., 25:226–230, 1997. http://swift.cmbi.kun.nl/swift/hssp/

4.3.5 LaTex The text for this report was processed using LaTeX; Leslie Lamport, "LaTeX: A Document Preparation System," Addison-Wesley, Reading, Mass. (1986).

4.3.6 Muscle When making alignments "from scratch", report maker uses Muscle alignment program: Edgar, Robert C. (2004), "MUSCLE: multiple sequence alignment with high accuracy and high throughput." Nucleic Acids Research 32(5), 1792-97. http://www.drive5.com/muscle/

4.3.7 Pymol The figures in this report were produced using Pymol. The scripts can be found in the attachment. Pymol is an open-source application copyrighted by DeLano Scientific LLC (2005). For more information about Pymol see http://pymol.sourceforge.net/. (Note for Windows users: the attached package needs to be unzipped for Pymol to read the scripts and launch the viewer.)

4.4 Note about ET Viewer
Dan Morgan from the Lichtarge lab has developed a visualization tool specifically for viewing trace results. If you are interested, please visit:
http://mammoth.bcm.tmc.edu/traceview/
The viewer is self-unpacking and self-installing. Input files to be used with ETV (extension .etvx) can be found in the attachment to the main report.

4.5 Citing this work
The method used to rank residues and make predictions in this report can be found in Mihalek, I., I. Reš, O. Lichtarge. (2004). "A Family of Evolution-Entropy Hybrid Methods for Ranking of Protein Residues by Importance" J. Mol. Bio. 336: 1265-82. For the original version of ET see O. Lichtarge, H. Bourne and F. Cohen (1996). "An Evolutionary Trace Method Defines Binding Surfaces Common to Protein Families" J. Mol. Bio. 257: 342-358.
report maker itself is described in Mihalek I., I. Reš and O. Lichtarge (2006). "Evolutionary Trace Report Maker: a new type of service for comparative analysis of proteins." Bioinformatics 22:1656-7.

4.6 About report maker
report maker was written in 2006 by Ivana Mihalek. The 1D ranking visualization program was written by Ivica Reš. report maker is copyrighted by Lichtarge Lab, Baylor College of Medicine, Houston.

4.7 Attachments
The following files should accompany this report:
• 1gw9A.complex.pdb - coordinates of 1gw9A with all of its interacting partners
• 1gw9A.etvx - ET viewer input file for 1gw9A

• 1gw9A.cluster report.summary - Cluster report summary for 1gw9A
• 1gw9A.ranks - Ranks file in sequence order for 1gw9A
• 1gw9A.clusters - Cluster descriptions for 1gw9A
• 1gw9A.msf - the multiple sequence alignment used for the chain 1gw9A
• 1gw9A.descr - description of sequences used in 1gw9A msf
• 1gw9A.ranks sorted - full listing of residues and their ranking for 1gw9A
• 1gw9A.1gw9AIOD1400.if.pml - Pymol script for Figure 5
• 1gw9A.cbcvg - used by other 1gw9A-related pymol scripts
• 1gw9A.1gw9AIOD1410.if.pml - Pymol script for Figure 6
• 1gw9A.1gw9AIOD1405.if.pml - Pymol script for Figure 7
• 1gw9A.1gw9ALXC1390.if.pml - Pymol script for Figure 8
• 1gw9A.1gw9AIOD1403.if.pml - Pymol script for Figure 9
• 1gw9A.1gw9AIOD1402.if.pml - Pymol script for Figure 10
• 1gw9A.1gw9AIOD1407.if.pml - Pymol script for Figure 11
• 1gw9A.1gw9A3.if.pml - Pymol script for Figure 12
• 1gw9A.1gw9ACA1387.if.pml - Pymol script for Figure 13
• 1gw9A.1gw9AIOD1391.if.pml - Pymol script for Figure 14
• 1gw9A.1gw9AIOD1394.if.pml - Pymol script for Figure 15
• 1gw9A.1gw9AIOD1413.if.pml - Pymol script for Figure 16
• 1gw9A.1gw9AIOD1417.if.pml - Pymol script for Figure 17
• 1gw9A.1gw9AIOD1393.if.pml - Pymol script for Figure 18
• 1gw9A.1gw9A1.if.pml - Pymol script for Figure 19
• 1gw9A.1gw9A2.if.pml - Pymol script for Figure 20
• 1gw9A.1gw9AIOD1406.if.pml - Pymol script for Figure 21
• 1gw9A.1gw9AIOD1392.if.pml - Pymol script for Figure 22
• 1gw9A.1gw9AIOD1414.if.pml - Pymol script for Figure 23
• 1gw9A.1gw9ACA1388.if.pml - Pymol script for Figure 24
• 1gw9A.1gw9AIOD1401.if.pml - Pymol script for Figure 25
• 1gw9A.1gw9AIOD1404.if.pml - Pymol script for Figure 26
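For readers who want to check the attached alignment themselves, the pairwise identity described in Section 4.3.1, idents / MIN(len1, len2), can be sketched in a few lines of Python. This is an illustration of the stated formula and not alistat's code: the gap characters, the scaling to percent, and the toy rows are assumptions, and reading the .msf file is left out.

```python
from itertools import combinations

GAP_CHARS = "-.~"  # assumption: characters treated as gaps in the aligned rows


def ungapped_length(row):
    """Length of the sequence with gap characters removed."""
    return sum(1 for c in row if c not in GAP_CHARS)


def percent_identity(row_a, row_b):
    """100 * idents / min(len1, len2) for two rows of the same alignment."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    idents = sum(1 for a, b in zip(row_a, row_b) if a == b and a not in GAP_CHARS)
    shorter = min(ungapped_length(row_a), ungapped_length(row_b))
    if shorter == 0:
        raise ValueError("a row contains only gaps")
    return 100.0 * idents / shorter


def pairwise_summary(rows):
    """Average, maximum and minimum identity over all N(N-1)/2 pairs."""
    values = [percent_identity(a, b) for a, b in combinations(rows, 2)]
    if not values:
        raise ValueError("need at least two rows")
    return sum(values) / len(values), max(values), min(values)


# Toy alignment rows, invented for the example.
toy = ["ACDEF", "ACE-F", "AC---"]
average, most_related, most_unrelated = pairwise_summary(toy)
```

Dividing by the shorter unaligned length means a short fragment that matches perfectly scores 100%, which is worth keeping in mind when reading the "most related pair" figure.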
