Pages 1–17 3a1j Evolutionary trace report by report maker June 8, 2009

CONTENTS
1 Introduction
2 Chain 3a1jA
  2.1 Q99638 overview
  2.2 Multiple sequence alignment for 3a1jA
  2.3 Residue ranking in 3a1jA
  2.4 Top ranking residues in 3a1jA and their position on the structure
    2.4.1 Clustering of residues at 25% coverage.
    2.4.2 Overlap with known functional surfaces at 25% coverage.
    2.4.3 Possible novel functional surfaces at 25% coverage.
3 Chain 3a1jC
  3.1 Q5R7X9 overview
  3.2 Multiple sequence alignment for 3a1jC
  3.3 Residue ranking in 3a1jC
  3.4 Top ranking residues in 3a1jC and their position on the structure
    3.4.1 Clustering of residues at 25% coverage.
    3.4.2 Overlap with known functional surfaces at 25% coverage.
    3.4.3 Possible novel functional surfaces at 25% coverage.
4 Chain 3a1jB
  4.1 O60921 overview
  4.2 Multiple sequence alignment for 3a1jB
  4.3 Residue ranking in 3a1jB
  4.4 Top ranking residues in 3a1jB and their position on the structure
    4.4.1 Clustering of residues at 25% coverage.
    4.4.2 Overlap with known functional surfaces at 25% coverage.
    4.4.3 Possible novel functional surfaces at 25% coverage.
5 Notes on using trace results
  5.1 Coverage
  5.2 Known substitutions
  5.3 Surface
  5.4 Number of contacts
  5.5 Annotation
  5.6 Mutation suggestions
6 Appendix
  6.1 File formats
  6.2 Color schemes used
  6.3 Credits
    6.3.1 Alistat
    6.3.2 CE
    6.3.3 DSSP
    6.3.4 HSSP
    6.3.5 LaTeX
    6.3.6 Muscle
    6.3.7 Pymol
  6.4 Note about ET Viewer
  6.5 Citing this work
  6.6 About report maker
  6.7 Attachments

1 INTRODUCTION
From the original Data Bank entry (PDB id 3a1j):
Title: Crystal structure of the human rad9--rad1 complex
Compound: Mol id: 1; molecule: cell cycle checkpoint control protein ; chain: a; fragment: n-terminal domain, residues 1-266; synonym: hrad9, dna repair exonuclease rad9 homolog a; ec: 3.1.11.2; engineered: yes; mol id: 2; molecule: checkpoint protein hus1; chain: b; synonym: hhus1; engineered: yes; mol id: 3; molecule: cell cycle checkpoint protein rad1; chain: c; fragment: residues 13-275; synonym: hrad1, dna repair exonuclease rad1 homolog, rad1-like dna damage checkpoint protein; ec: 3.1.11.2; engineered: yes
Organism, scientific name: Homo Sapiens;

Lichtarge lab 2006

Fig. 1. Residues 1-132 in 3a1jA colored by their relative importance. (See Appendix, Fig.25, for the coloring scheme.)

Fig. 2. Residues 133-266 in 3a1jA colored by their relative importance. (See Appendix, Fig.25, for the coloring scheme.)

3a1j contains unique chains 3a1jA (265 residues), 3a1jC (263 residues), and 3a1jB (269 residues).

2 CHAIN 3A1JA
2.1 Q99638 overview
From SwissProt, id Q99638, 93% identical to 3a1jA:
Description: Cell cycle checkpoint control protein (RAD9 homolog A) (S. pombe).
Organism, scientific name: Homo sapiens (Human).
Taxonomy: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Primates; Catarrhini; Hominidae; Homo.

2.2 Multiple sequence alignment for 3a1jA
For the chain 3a1jA, the alignment 3a1jA.msf (attached) with 10 sequences was used. The alignment was assembled through a combination of BLAST searching on the UniProt database and alignment using the Muscle program. It can be found in the attachment to this report, under the name of 3a1jA.msf. Its statistics, from the alistat program, are the following:

Format: MSF
Number of sequences: 10
Total number of residues: 2599
Smallest: 248
Largest: 265
Average length: 259.9
Alignment length: 265
Average identity: 30%
Most related pair: 81%
Most unrelated pair: 18%
Most distant seq: 25%

Furthermore, 3% of residues show as conserved in this alignment. The alignment consists of 90% eukaryotic (30% vertebrata, 20% arthropoda, 20% fungi) sequences. (Descriptions of some sequences were not readily available.) The file containing the sequence descriptions can be found in the attachment, under the name 3a1jA.descr.

2.3 Residue ranking in 3a1jA
The 3a1jA sequence is shown in Figs. 1–2, with each residue colored according to its estimated importance. The full listing of residues in 3a1jA can be found in the file called 3a1jA.ranks sorted in the attachment.

2.4 Top ranking residues in 3a1jA and their position on the structure
In the following we consider residues ranking among the top 25% of residues in the protein. Figure 3 shows residues in 3a1jA colored by their importance: bright red and yellow indicate more conserved/important residues (see Appendix for the coloring scheme). A Pymol script for producing this figure can be found in the attachment.

Fig. 3. Residues in 3a1jA, colored by their relative importance. Clockwise: front, back, top and bottom views.

2.4.1 Clustering of residues at 25% coverage. Fig. 4 shows the top 25% of all residues, this time colored according to clusters they belong to. The clusters in Fig.4 are composed of the residues listed in Table 1.

Table 1.
cluster color  size  member residues
red            51    12,15,16,20,21,22,23,24,26,30,36,39,41,42,43,46,47,53,
                     57,58,61,78,81,84,132,150,159,164,165,180,195,196,216,
                     217,218,220,221,222,223,224,228,229,230,244,246,260,261,
                     262,263,264,265
blue           9     7,95,97,112,114,115,117,120,121

Table 1. Clusters of top ranking residues in 3a1jA.

Fig. 4. Residues in 3a1jA, colored according to the cluster they belong to: red, followed by blue and yellow are the largest clusters (see Appendix for the coloring scheme). Clockwise: front, back, top and bottom views. The corresponding Pymol script is attached.

2.4.2 Overlap with known functional surfaces at 25% coverage. The name of the ligand is composed of the source PDB identifier and the heteroatom name used in that file.

Interface with 3a1jB. Table 2 lists the top 25% of residues at the interface with 3a1jB. The following table (Table 3) suggests possible disruptive replacements for these residues (see Section 5.6).

Table 2.
res  type  subst's (%)        cvg   noc/bb  dist (Å)
159  F     F(80) L(10) M(10)  0.07  5/5     3.16
195  T     T(69) S(29)        0.16  30/10   3.12
196  E     T(29) E(59) Q(10)  0.16  35/30   2.81

Table 2. The top 25% of residues in 3a1jA at the interface with 3a1jB. (Field names: res: residue number in the PDB entry; type: amino acid type; subst's: substitutions seen in the alignment, with the percentage of each type in the bracket; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given after the slash; dist: distance of closest approach to the ligand.)

Table 3.
res  type  disruptive mutations
159  F     (T)(KE)(DR)(QCG)
195  T     (KR)(FQMWH)(NELPI)(D)
196  E     (FWH)(Y)(R)(VA)

Table 3. List of disruptive mutations for the top 25% of residues in 3a1jA, that are at the interface with 3a1jB.

Fig. 5. Residues in 3a1jA, at the interface with 3a1jB, colored by their relative importance. 3a1jB is shown in backbone representation (See Appendix for the coloring scheme for the protein chain 3a1jA.)

Figure 5 shows residues in 3a1jA colored by their importance, at the interface with 3a1jB.

Sucrose binding site. Table 4 lists the top 25% of residues at the interface with 3a1jASUC6001 (sucrose). The following table (Table 5) suggests possible disruptive replacements for these residues (see Section 5.6).

Table 4.
res  type  subst's (%)              cvg   noc/bb  dist (Å)  antn
132  L     L(59) I(20) Q(10) M(10)  0.21  1/1     4.33
39   R     T(29) K(20) R(50)        0.23  16/0    3.64      site

Table 4. The top 25% of residues in 3a1jA at the interface with sucrose. (Field names: res: residue number in the PDB entry; type: amino acid type; subst's: substitutions seen in the alignment, with the percentage of each type in the bracket; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given after the slash; dist: distance of closest approach to the ligand.)
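The field layout described in the caption can be read mechanically. The following is a minimal sketch, assuming each table row has been flattened onto one whitespace-separated line in the order res, type, substitutions, cvg, noc/bb, dist, with an optional trailing annotation; the names `InterfaceRow` and `parse_row` are hypothetical and are not part of the report maker output.

```python
import re
from dataclasses import dataclass

# A substitution token looks like "T(29)"; "." marks a gap in the alignment.
SUBST = re.compile(r"([A-Z.])\((\d+)\)")


@dataclass
class InterfaceRow:
    res: int             # residue number in the PDB entry
    aa: str              # amino acid type
    substitutions: dict  # amino acid type -> percentage in the alignment
    cvg: float           # coverage
    noc: int             # number of contacts with the ligand
    bb: int              # contacts realized through backbone atoms
    dist: float          # distance of closest approach, in Angstrom
    antn: str = ""       # optional annotation column


def parse_row(line):
    """Parse one flattened interface-table row (assumed layout, see lead-in)."""
    tokens = line.split()
    res, aa = int(tokens[0]), tokens[1]
    substitutions, rest = {}, []
    for tok in tokens[2:]:
        match = SUBST.fullmatch(tok)
        if match:
            substitutions[match.group(1)] = int(match.group(2))
        else:
            rest.append(tok)
    cvg, contacts, dist = rest[:3]
    noc, bb = contacts.split("/")
    return InterfaceRow(res, aa, substitutions, float(cvg),
                        int(noc), int(bb), float(dist), " ".join(rest[3:]))


# Row for residue 39 of Table 4, written on a single line.
row = parse_row("39 R T(29) K(20) R(50) 0.23 16/0 3.64 site")
print(row)
```

Keeping the substitutions as a mapping makes it easy to check, for instance, that the percentages in a row roughly add up to 100.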

Table 5.
res  type  disruptive mutations
132  L     (Y)(R)(TH)(SCG)
39   R     (D)(TY)(FEVLAWPI)(CG)

Table 5. List of disruptive mutations for the top 25% of residues in 3a1jA, that are at the interface with sucrose.

Fig. 6. Residues in 3a1jA, at the interface with sucrose, colored by their relative importance. The ligand (sucrose) is colored green. Atoms further than 30 Å away from the geometric center of the ligand, as well as on the line of sight to the ligand were removed. (See Appendix for the coloring scheme for the protein chain 3a1jA.)

Figure 6 shows residues in 3a1jA colored by their importance, at the interface with 3a1jASUC6001.

Interface with 3a1jC. Table 6 lists the top 25% of residues at the interface with 3a1jC. The following table (Table 7) suggests possible disruptive replacements for these residues (see Section 5.6).

Table 6.
res  type  subst's (%)        cvg   noc/bb  dist (Å)
78   K     K(100)             0.03  3/0     3.93
120  K     K(80) R(20)        0.09  28/9    3.26
117  G     G(69) D(20) E(10)  0.10  10/10   3.33
121  T     T(80) I(10) E(10)  0.20  38/32   2.77

Table 6. The top 25% of residues in 3a1jA at the interface with 3a1jC. (Field names: res: residue number in the PDB entry; type: amino acid type; subst's: substitutions seen in the alignment, with the percentage of each type in the bracket; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given after the slash; dist: distance of closest approach to the ligand.)

Fig. 7. Residues in 3a1jA, at the interface with 3a1jC, colored by their relative importance. 3a1jC is shown in backbone representation (See Appendix for the coloring scheme for the protein chain 3a1jA.)

Table 7.
res  type  disruptive mutations
78   K     (Y)(FTW)(SVCAG)(HD)
120  K     (Y)(T)(FW)(SVCAG)
117  G     (R)(FKWH)(Y)(Q)
121  T     (R)(H)(K)(FW)

Table 7. List of disruptive mutations for the top 25% of residues in 3a1jA, that are at the interface with 3a1jC.

Figure 7 shows residues in 3a1jA colored by their importance, at the interface with 3a1jC.

2.4.3 Possible novel functional surfaces at 25% coverage. One group of residues is conserved on the 3a1jA surface, away from (or substantially larger than) other functional sites and interfaces recognizable in PDB entry 3a1j. It is shown in Fig. 8. The residues belonging to this surface "patch" are listed in Table 8, while Table 9 suggests possible disruptive replacements for these residues (see Section 5.6).

Fig. 8. A possible active surface on the chain 3a1jA.

Table 8.
res  type  substitutions(%)        cvg
120  K     K(80)R(20)              0.09
114  C     C(80)M(10)Y(10)         0.10
117  G     G(69)D(20)E(10)         0.10
115  K     K(69)N(10)R(20)         0.14
112  L     F(50)L(50)              0.18
95   E     E(69)Q(10)D(10)V(10)    0.19
121  T     T(80)I(10)E(10)         0.20
7    G     N(29).(20)G(50)         0.23
94   V     V(59)I(29)F(10)         0.25

Table 8. Residues forming surface ”patch” in 3a1jA.

Table 9.
res  type  disruptive mutations
120  K     (Y)(T)(FW)(SVCAG)
114  C     (K)(R)(E)(Q)
117  G     (R)(FKWH)(Y)(Q)
115  K     (Y)(T)(FW)(SVCAG)
112  L     (R)(TY)(KE)(SCHG)
95   E     (H)(FW)(Y)(R)
121  T     (R)(H)(K)(FW)
7    G     (R)(E)(K)(FWH)
94   V     (KE)(R)(Y)(QD)

Table 9. Disruptive mutations for the surface patch in 3a1jA.

Another group of surface residues is shown in Fig.9. The right panel shows (in blue) the rest of the larger cluster this surface belongs to.

Fig. 9. Another possible active surface on the chain 3a1jA. The larger cluster it belongs to is shown in blue.

The residues belonging to this surface "patch" are listed in Table 10, while Table 11 suggests possible disruptive replacements for these residues (see Section 5.6).

Table 10.
res  type  substitutions(%)        cvg   antn
21   S     S(100)                  0.03
42   N     N(100)                  0.03
58   F     F(100)                  0.03
78   K     K(100)                  0.03
165  E     E(100)                  0.03
244  G     G(100)                  0.03
246  P     P(100)                  0.03
264  T     T(100)                  0.03
217  F     L(20)F(80)              0.04
46   S     S(90)T(10)              0.05
57   F     F(90)M(10)              0.05
221  E     E(90)D(10)              0.05
47   A     G(20)A(69)K(10)         0.06
15   K     R(40)K(59)              0.07
22   R     R(69)K(29)              0.07
159  F     F(80)L(10)M(10)         0.07
30   E     E(80)D(20)              0.08
220  K     K(90)R(10)              0.08
263  A     A(80)G(10)S(10)         0.09
180  S     S(59)N(40)              0.11
228  F     L(20)F(69)Y(10)         0.11
41   V     L(20)I(20)V(59)         0.12
222  F     F(59)L(40)              0.12
218  C     S(29)C(59)N(10)         0.13
23   I     I(80)V(20)              0.14
265  L     V(20)M(20)L(59)         0.14
26   E     A(20)E(59)D(20)         0.15
150  R     K(50)S(10)R(40)         0.15
195  T     T(69)S(29)              0.16
196  E     T(29)E(59)Q(10)         0.16
216  T     T(69)I(10)S(20)         0.17
230  E     E(69)S(10)D(20)         0.17
12   V     D(20)V(59)Q(10)I(10)    0.18
261  V     I(59)V(29)A(10)         0.18
43   S     S(69)N(10).(10)A(10)    0.19
81   L     L(69)F(10)Q(20)         0.21
132  L     L(59)I(20)Q(10)M(10)    0.21
223  R     R(50)K(40)L(10)         0.21
164  A     E(69)D(10).(10)A(10)    0.22
24   G     D(20)G(50)S(29)         0.23
39   R     T(29)K(20)R(50)         0.23  site
61   Y     F(20)Y(69)C(10)         0.24
45   R     R(69)N(10)K(10)Q(10)    0.25

Table 10. Residues forming surface "patch" in 3a1jA.

Table 11.
res  type  disruptive mutations
21   S     (KR)(FQMWH)(NYELPI)(D)
42   N     (Y)(FTWH)(SEVCARG)(MD)
58   F     (KE)(TQD)(SNCRG)(M)
78   K     (Y)(FTW)(SVCAG)(HD)
165  E     (FWH)(YVCARG)(T)(SNKLPI)
244  G     (KER)(FQMWHD)(NYLPI)(SVA)
246  P     (YR)(TH)(SKECG)(FQWD)
264  T     (KR)(FQMWH)(NELPI)(D)
217  F     (KE)(T)(QDR)(SCG)
46   S     (KR)(FQMWH)(NELPI)(Y)
57   F     (TKE)(D)(SQCRG)(N)
221  E     (FWH)(R)(YVCAG)(T)
47   A     (Y)(E)(R)(H)
15   K     (Y)(T)(FW)(SVCAG)
22   R     (T)(YD)(SVCAG)(FELWPI)
159  F     (T)(KE)(DR)(QCG)
30   E     (FWH)(R)(YVCAG)(T)
220  K     (Y)(T)(FW)(SVCAG)
263  A     (KR)(E)(Y)(QH)
180  S     (R)(FKWH)(YM)(EQ)
228  F     (K)(E)(Q)(R)
41   V     (YR)(KE)(H)(QD)
222  F     (KE)(T)(QDR)(SCG)
218  C     (R)(KE)(FWH)(M)
23   I     (YR)(H)(TKE)(SQCDG)
265  L     (Y)(R)(H)(T)
26   E     (H)(FW)(R)(Y)
150  R     (TYD)(FEVCLAWPIG)(S)(M)
195  T     (KR)(FQMWH)(NELPI)(D)
196  E     (FWH)(Y)(R)(VA)
216  T     (R)(K)(H)(FQW)
230  E     (FWH)(R)(Y)(VCAG)
12   V     (Y)(R)(H)(K)
261  V     (YR)(KE)(H)(QD)
43   S     (R)(K)(H)(FW)
81   L     (Y)(T)(R)(H)
132  L     (Y)(R)(TH)(SCG)
223  R     (T)(Y)(D)(S)
164  A     (R)(Y)(KH)(Q)
24   G     (R)(K)(FWH)(EQM)
39   R     (D)(TY)(FEVLAWPI)(CG)
61   Y     (K)(Q)(M)(E)
45   R     (T)(Y)(D)(SVCAG)

Table 11. Disruptive mutations for the surface patch in 3a1jA.

Fig. 10. Residues 13-143 in 3a1jC colored by their relative importance. (See Appendix, Fig.25, for the coloring scheme.)

Fig. 11. Residues 144-275 in 3a1jC colored by their relative importance. (See Appendix, Fig.25, for the coloring scheme.)

3 CHAIN 3A1JC
3.1 Q5R7X9 overview
From SwissProt, id Q5R7X9, 97% identical to 3a1jC:
Description: Hypothetical protein DKFZp459H1127.
Organism, scientific name: Pongo pygmaeus (Orangutan).
Taxonomy: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Primates; Catarrhini; Hominidae; Pongo.

3.2 Multiple sequence alignment for 3a1jC
For the chain 3a1jC, the alignment 3a1jC.msf (attached) with 10 sequences was used. The alignment was assembled through a combination of BLAST searching on the UniProt database and alignment using the Muscle program. It can be found in the attachment to this report, under the name of 3a1jC.msf. Its statistics, from the alistat program, are the following:

Format: MSF
Number of sequences: 10
Total number of residues: 2600
Smallest: 253
Largest: 263
Average length: 260.0
Alignment length: 263
Average identity: 34%
Most related pair: 90%
Most unrelated pair: 20%
Most distant seq: 30%

Furthermore, 4% of residues show as conserved in this alignment. The alignment consists of 90% eukaryotic (20% vertebrata, 10% arthropoda, 40% fungi) sequences. (Descriptions of some sequences were not readily available.) The file containing the sequence descriptions can be found in the attachment, under the name 3a1jC.descr.

3.3 Residue ranking in 3a1jC
The 3a1jC sequence is shown in Figs. 10–11, with each residue colored according to its estimated importance. The full listing of residues in 3a1jC can be found in the file called 3a1jC.ranks sorted in the attachment.

3.4 Top ranking residues in 3a1jC and their position on the structure
In the following we consider residues ranking among the top 25% of residues in the protein. Figure 12 shows residues in 3a1jC colored by their importance: bright red and yellow indicate more conserved/important residues (see Appendix for the coloring scheme). A Pymol script for producing this figure can be found in the attachment.

3.4.1 Clustering of residues at 25% coverage. Fig. 13 shows the top 25% of all residues, this time colored according to clusters they belong to. The clusters in Fig.13 are composed of the residues listed in Table 12.

Table 12.
cluster color  size  member residues
red            42    34,36,44,48,50,51,52,54,60,61,64,144,145,147,155,156,
                     157,158,160,163,173,177,188,192,194,211,213,223,228,240,
                     241,244,246,249,251,252,253,254,266,268,269,273
blue           12    17,82,89,93,95,97,112,118,130,132,134,136
yellow         5     19,27,70,108,109
green          2     169,170
purple         2     125,127

Table 12. Clusters of top ranking residues in 3a1jC.

Fig. 12. Residues in 3a1jC, colored by their relative importance. Clockwise: front, back, top and bottom views.

Fig. 13. Residues in 3a1jC, colored according to the cluster they belong to: red, followed by blue and yellow are the largest clusters (see Appendix for the coloring scheme). Clockwise: front, back, top and bottom views. The corresponding Pymol script is attached.

3.4.2 Overlap with known functional surfaces at 25% coverage. The name of the ligand is composed of the source PDB identifier and the heteroatom name used in that file.

Interface with 3a1jB. Table 13 lists the top 25% of residues at the interface with 3a1jB. The following table (Table 14) suggests possible disruptive replacements for these residues (see Section 5.6).

Table 13.
res  type  subst's (%)        cvg   noc/bb  dist (Å)
95   I     I(90) L(10)        0.10  42/23   3.34
127  G     G(90) D(10)        0.10  6/6     3.63
134  I     I(40) L(40) F(20)  0.13  10/6    4.10
97   G     G(69) E(20) T(10)  0.16  8/8     3.79
130  T     T(80) A(10) I(10)  0.21  38/15   2.76
125  E     E(80) D(20)        0.22  2/0     4.40
132  C     C(69) G(20) A(10)  0.24  16/11   3.36

Table 13. The top 25% of residues in 3a1jC at the interface with 3a1jB. (Field names: res: residue number in the PDB entry; type: amino acid type; subst's: substitutions seen in the alignment, with the percentage of each type in the bracket; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given after the slash; dist: distance of closest approach to the ligand.)

Table 14.
res  type  disruptive mutations
95   I     (YR)(TH)(SKECG)(FQWD)
127  G     (R)(K)(FWH)(EQM)
134  I     (R)(Y)(T)(KE)
97   G     (R)(K)(FWH)(EQ)
130  T     (R)(K)(H)(Q)
125  E     (FWH)(R)(YVCAG)(T)
132  C     (KER)(QHD)(FMW)(Y)

Table 14. List of disruptive mutations for the top 25% of residues in 3a1jC, that are at the interface with 3a1jB.

Fig. 14. Residues in 3a1jC, at the interface with 3a1jB, colored by their relative importance. 3a1jB is shown in backbone representation (See Appendix for the coloring scheme for the protein chain 3a1jC.)

Figure 14 shows residues in 3a1jC colored by their importance, at the interface with 3a1jB.

Interface with 3a1jA. Table 15 lists the top 25% of residues at the interface with 3a1jA. The following table (Table 16) suggests possible disruptive replacements for these residues (see Section 5.6).

Table 15.
res  type  subst's (%)  cvg   noc/bb  dist (Å)
169  E     E(80) D(20)  0.07  41/5    3.09

Table 15. The top 25% of residues in 3a1jC at the interface with 3a1jA. (Field names: res: residue number in the PDB entry; type: amino acid type; subst's: substitutions seen in the alignment, with the percentage of each type in the bracket; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given after the slash; dist: distance of closest approach to the ligand.)

Table 16.
res  type  disruptive mutations
169  E     (FWH)(R)(YVCAG)(T)

Table 16. List of disruptive mutations for the top 25% of residues in 3a1jC, that are at the interface with 3a1jA.

Fig. 15. Residues in 3a1jC, at the interface with 3a1jA, colored by their relative importance. 3a1jA is shown in backbone representation (See Appendix for the coloring scheme for the protein chain 3a1jC.)

Figure 15 shows residues in 3a1jC colored by their importance, at the interface with 3a1jA.

3.4.3 Possible novel functional surfaces at 25% coverage. One group of residues is conserved on the 3a1jC surface, away from (or substantially larger than) other functional sites and interfaces recognizable in PDB entry 3a1j. It is shown in Fig. 16. The residues belonging to this surface "patch" are listed in Table 17, while Table 18 suggests possible disruptive replacements for these residues (see Section 5.6).

Table 17.
res  type  substitutions(%)  cvg
70   F     F(100)            0.05
93   L     L(100)            0.05
112  Y     Y(100)            0.05
118  P     P(100)            0.05
continued in next column

Table 17. continued
res  type  substitutions(%)   cvg
116  G     G(80)F(20)         0.07
95   I     I(90)L(10)         0.10
136  T     T(90)S(10)         0.11
108  L     L(40)C(40)V(20)    0.13
134  I     I(40)L(40)F(20)    0.13
97   G     G(69)E(20)T(10)    0.16
27   L     L(80)F(10)I(10)    0.19
82   F     F(80)M(10)L(10)    0.21
130  T     T(80)A(10)I(10)    0.21
17   L     L(29)F(50)M(20)    0.22
89   L     L(69)I(20)V(10)    0.24
109  R     R(40)T(29)K(29)    0.24
132  C     C(69)G(20)A(10)    0.24

Table 17. Residues forming surface "patch" in 3a1jC.

Fig. 16. A possible active surface on the chain 3a1jC.

Table 18.
res  type  disruptive mutations
70   F     (KE)(TQD)(SNCRG)(M)
93   L     (YR)(TH)(SKECG)(FQWD)
112  Y     (K)(QM)(NEVLAPIR)(D)
118  P     (YR)(TH)(SKECG)(FQWD)
116  G     (KE)(R)(QD)(M)
95   I     (YR)(TH)(SKECG)(FQWD)
136  T     (KR)(FQMWH)(NELPI)(D)
108  L     (R)(Y)(H)(KE)
134  I     (R)(Y)(T)(KE)
97   G     (R)(K)(FWH)(EQ)
27   L     (R)(Y)(T)(KE)
82   F     (T)(KE)(DR)(QCG)
130  T     (R)(K)(H)(Q)
17   L     (YR)(T)(H)(KECG)
89   L     (YR)(H)(T)(KE)
109  R     (D)(TY)(FEVLAWPI)(CG)
132  C     (KER)(QHD)(FMW)(Y)

Table 18. Disruptive mutations for the surface patch in 3a1jC.

Another group of surface residues is shown in Fig.17. The right panel shows (in blue) the rest of the larger cluster this surface belongs to.

Fig. 17. Another possible active surface on the chain 3a1jC. The larger cluster it belongs to is shown in blue.

The residues belonging to this surface "patch" are listed in Table 19, while Table 20 suggests possible disruptive replacements for these residues (see Section 5.6).

Table 19.
res  type  substitutions(%)   cvg
36   F     F(100)             0.05
64   F     F(100)             0.05
194  G     G(100)             0.05
240  K     K(100)             0.05
249  G     G(100)             0.05
251  L     L(100)             0.05
169  E     E(80)D(20)         0.07
241  V     V(80)L(20)         0.07
266  F     F(80)Y(20)         0.07
211  E     E(90)M(10)         0.10
244  R     R(90)K(10)         0.10
254  Q     Q(90)H(10)         0.10
50   K     K(50)R(50)         0.11
60   Q     Q(90)L(10)         0.11
273  P     P(90).(10)         0.11
192  T     T(59)G(29)C(10)    0.12
269  Y     Y(29)F(69)         0.12
144  D     D(90)E(10)         0.14
145  F     F(50)Y(10)I(40)    0.16
253  L     L(69)C(20)I(10)    0.16
268  E     E(50)Q(10)D(40)    0.16
155  K     K(69)V(10)T(20)    0.18
157  I     I(69)F(10)L(20)    0.18
177  L     L(69)F(10)V(20)    0.18
54   E     E(69)D(29)         0.19
147  F     F(80)Q(10)R(10)    0.19
173  T     T(80)S(10)N(10)    0.19
51   V     V(29)I(29)F(40)    0.21
213  F     F(80)V(10)A(10)    0.21
52   T     T(59)S(20)A(20)    0.22
170  L     L(69)F(20)I(10)    0.24
228  L     L(40)I(59)         0.25

Table 19. Residues forming surface "patch" in 3a1jC.

Table 20.
res  type  disruptive mutations
36   F     (KE)(TQD)(SNCRG)(M)
64   F     (KE)(TQD)(SNCRG)(M)
194  G     (KER)(FQMWHD)(NYLPI)(SVA)
240  K     (Y)(FTW)(SVCAG)(HD)
249  G     (KER)(FQMWHD)(NYLPI)(SVA)
251  L     (YR)(TH)(SKECG)(FQWD)
169  E     (FWH)(R)(YVCAG)(T)
241  V     (YR)(KE)(H)(QD)
266  F     (K)(E)(Q)(D)
211  E     (H)(FYW)(CRG)(TVA)
244  R     (T)(YD)(SVCAG)(FELWPI)
254  Q     (TY)(SFVCAWG)(HD)(E)
50   K     (Y)(T)(FW)(SVCAG)
60   Q     (Y)(TH)(FW)(SCG)
273  P     (YR)(TH)(SCG)(KE)
192  T     (KR)(FQMWH)(E)(LPI)
269  Y     (K)(Q)(EM)(NR)
144  D     (R)(FWH)(YVCAG)(K)
145  F     (K)(E)(Q)(R)
253  L     (R)(Y)(H)(TKE)
268  E     (FWH)(Y)(VCAG)(R)
155  K     (Y)(FW)(H)(T)
157  I     (R)(Y)(T)(KE)
177  L     (R)(Y)(T)(KE)
54   E     (FWH)(R)(YVCAG)(T)
147  F     (E)(T)(D)(K)
173  T     (R)(K)(FWH)(M)
51   V     (KE)(R)(Y)(QD)
213  F     (KE)(D)(Q)(R)
52   T     (KR)(QH)(FMW)(E)
170  L     (R)(Y)(T)(KE)
228  L     (YR)(TH)(SKECG)(FQWD)

Table 20. Disruptive mutations for the surface patch in 3a1jC.

4 CHAIN 3A1JB
4.1 O60921 overview
From SwissProt, id O60921, 87% identical to 3a1jB:
Description: Hus1+-like protein (HUS1 checkpoint protein) (S. pombe) (HUS1 protein).
Organism, scientific name: Homo sapiens (Human).
Taxonomy: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Primates; Catarrhini; Hominidae; Homo.

4.2 Multiple sequence alignment for 3a1jB
For the chain 3a1jB, the alignment 3a1jB.msf (attached) with 11 sequences was used. The alignment was assembled through a combination of BLAST searching on the UniProt database and alignment using the Muscle program. It can be found in the attachment to this report, under the name of 3a1jB.msf. Its statistics, from the alistat program, are the following:

Format: MSF
Number of sequences: 11
Total number of residues: 2915
Smallest: 260
Largest: 269
Average length: 265.0
Alignment length: 269
Average identity: 37%
Most related pair: 96%
Most unrelated pair: 17%
Most distant seq: 30%

Furthermore, 3% of residues show as conserved in this alignment. The alignment consists of 90% eukaryotic (45% vertebrata, 18% arthropoda, 27% fungi) sequences. (Descriptions of some sequences were not readily available.) The file containing the sequence descriptions can be found in the attachment, under the name 3a1jB.descr.

4.3 Residue ranking in 3a1jB
The 3a1jB sequence is shown in Figs. 18–19, with each residue colored according to its estimated importance. The full listing of residues in 3a1jB can be found in the file called 3a1jB.ranks sorted in the attachment.

4.4 Top ranking residues in 3a1jB and their position on the structure
In the following we consider residues ranking among the top 25% of residues in the protein. Figure 20 shows residues in 3a1jB colored by their importance: bright red and yellow indicate more conserved/important residues (see Appendix for the coloring scheme). A Pymol script for producing this figure can be found in the attachment.

Fig. 18. Residues 0-135 in 3a1jB colored by their relative importance. (See Appendix, Fig.25, for the coloring scheme.)

Fig. 19. Residues 136-280 in 3a1jB colored by their relative importance. (See Appendix, Fig.25, for the coloring scheme.)

Fig. 20. Residues in 3a1jB, colored by their relative importance. Clockwise: front, back, top and bottom views.

4.4.1 Clustering of residues at 25% coverage. Fig. 21 shows the top 25% of all residues, this time colored according to clusters they belong to. The clusters in Fig.21 are composed of the residues listed in Table 21.

Fig. 21. Residues in 3a1jB, colored according to the cluster they belong to: red, followed by blue and yellow are the largest clusters (see Appendix for the coloring scheme). Clockwise: front, back, top and bottom views. The corresponding Pymol script is attached.

Table 21.
cluster color  size  member residues
red            36    1,2,3,4,7,16,30,41,55,64,67,70,71,77,79,81,82,88,90,91,
                     92,105,113,129,131,137,139,265,269,271,272,273,274,275,
                     277,278
blue           23    155,157,160,161,163,167,183,184,185,188,192,203,205,208,
                     211,233,235,237,241,251,253,254,255
yellow         2     111,134

Table 21. Clusters of top ranking residues in 3a1jB.
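The 25% coverage cutoff used to select the residues in these clusters can be illustrated with a short sketch. It assumes that each residue carries a numeric rank score in which a lower value means stronger evolutionary pressure, and it ignores ties; the residue numbers and scores below are invented for illustration and are not taken from the attached ranks files, whose format is not described in this part of the report.

```python
def coverage(scores):
    """Return residue -> coverage, i.e. the fraction of residues in the chain
    that rank at least as high as that residue (lower score ranks higher)."""
    ordered = sorted(scores, key=scores.get)
    total = len(ordered)
    return {res: (position + 1) / total for position, res in enumerate(ordered)}


def top_residues(scores, cutoff=0.25):
    """Residues whose coverage does not exceed the cutoff, in numeric order."""
    return sorted(res for res, cvg in coverage(scores).items() if cvg <= cutoff)


# Hypothetical scores for an eight-residue chain (illustration only).
example_scores = {101: 1.0, 102: 3.5, 103: 1.2, 104: 2.8,
                  105: 4.1, 106: 2.0, 107: 3.9, 108: 2.2}
selected = top_residues(example_scores)
print(selected)
```

With eight residues, a cutoff of 0.25 keeps the two best-ranked positions; the least important residue always ends up with coverage 1.0, matching the convention that 100% is all of the residues in a chain.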

4.4.2 Overlap with known functional surfaces at 25% coverage. The name of the ligand is composed of the source PDB identifier and the heteroatom name used in that file.

Interface with 3a1jC. Table 22 lists the top 25% of residues at the interface with 3a1jC. The following table (Table 23) suggests possible disruptive replacements for these residues (see Section 5.6).

Table 22.
res  type  subst's (%)       cvg   noc/bb  dist (Å)
205  F     W(27) F(63) Y(9)  0.12  15/2    3.50
175  N     S(18) N(72) P(9)  0.15  12/3    3.37
199  V     A(9) V(81) S(9)   0.16  27/11   3.20
203  T     T(63) S(36)       0.16  35/16   2.76

Table 22. The top 25% of residues in 3a1jB at the interface with 3a1jC. (Field names: res: residue number in the PDB entry; type: amino acid type; subst's: substitutions seen in the alignment, with the percentage of each type in the bracket; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given after the slash; dist: distance of closest approach to the ligand.)

Table 23.
res  type  disruptive mutations
205  F     (K)(E)(Q)(D)
175  N     (Y)(H)(FW)(R)
199  V     (KR)(YE)(H)(Q)
203  T     (KR)(FQMWH)(NELPI)(D)

Table 23. List of disruptive mutations for the top 25% of residues in 3a1jB, that are at the interface with 3a1jC.

Fig. 22. Residues in 3a1jB, at the interface with 3a1jC, colored by their relative importance. 3a1jC is shown in backbone representation (See Appendix for the coloring scheme for the protein chain 3a1jB.)

Figure 22 shows residues in 3a1jB colored by their importance, at the interface with 3a1jC.

Interface with 3a1jA. Table 24 lists the top 25% of residues at the interface with 3a1jA. The following table (Table 25) suggests possible disruptive replacements for these residues (see Section 5.6).

Table 24.
res  type  subst's (%)      cvg   noc/bb  dist (Å)
90   R     K(27) R(72)      0.04  23/1    2.71
91   A     A(90) V(9)       0.09  2/1     4.93
131  H     H(90) Q(9)       0.09  35/10   3.12
129  V     V(81) L(9) I(9)  0.22  28/12   3.40
134  P     P(81) L(9) N(9)  0.24  19/0    3.60

Table 24. The top 25% of residues in 3a1jB at the interface with 3a1jA. (Field names: res: residue number in the PDB entry; type: amino acid type; subst's: substitutions seen in the alignment, with the percentage of each type in the bracket; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given after the slash; dist: distance of closest approach to the ligand.)

Table 25.
res  type  disruptive mutations
90   R     (T)(YD)(SVCAG)(FELWPI)
91   A     (KYER)(QHD)(N)(FTMW)
131  H     (TE)(D)(SVMCAG)(QLPI)
129  V     (YR)(KE)(H)(QD)
134  P     (Y)(R)(TH)(SCG)

Table 25. List of disruptive mutations for the top 25% of residues in 3a1jB, that are at the interface with 3a1jA.

Fig. 24. A possible active surface on the chain 3a1jB.

Fig. 23. Residues in 3a1jB, at the interface with 3a1jA, colored by their relative importance. 3a1jA is shown in backbone representation. (See Appendix for the coloring scheme for the protein chain 3a1jB.)

Figure 23 shows residues in 3a1jB colored by their importance, at the interface with 3a1jA.

4.4.3 Possible novel functional surfaces at 25% coverage. One group of residues is conserved on the 3a1jB surface, away from (or substantially larger than) other functional sites and interfaces recognizable in PDB entry 3a1j. It is shown in Fig. 24. The residues belonging to this surface "patch" are listed in Table 26, while Table 27 suggests possible disruptive replacements for these residues (see Section 5.6).

Table 26. Residues forming surface "patch" in 3a1jB.
res   type  substitutions(%)      cvg
3     F     F(100)                0.03
55    W     W(100)                0.03
64    F     F(100)                0.03
105   L     L(100)                0.03
111   P     P(100)                0.03
149   P     P(100)                0.03
161   P     P(100)                0.03
208   L     L(100)                0.03
90    R     K(27)R(72)            0.04
30    C     C(90)L(9)             0.06
160   L     L(90)M(9)             0.06
185   N     N(90)T(9)             0.06
188   G     G(90)D(9)             0.06
1     X     M(90)X(9)             0.08
4     R     K(9)R(90)             0.08
277   P     S(9)P(90)             0.08
131   H     H(90)Q(9)             0.09
82    E     E(81)T(9)G(9)         0.10
16    F     L(18)F(81)            0.12
71    G     .(27)G(63)A(9)        0.12
205   F     W(27)F(63)Y(9)        0.12
241   L     L(72)F(27)            0.13
278   A     T(9)A(72)G(18)        0.13
88    L     F(18)L(81)            0.14
271   S     V(36)S(54)F(9)        0.14
139   P     S(18)P(72)K(9)        0.15
70    E     Q(9)E(81)D(9)         0.16
203   T     T(63)S(36)            0.16
137   V     V(81)I(18)            0.18
183   E     S(27)E(54)Y(9)Q(9)    0.18
251   A     I(27)A(54)M(9)L(9)    0.18
265   L     V(27)L(54)F(9)I(9)    0.18
273   Q     T(27)Q(54)N(9)H(9)    0.18
67    F     Y(72)F(27)            0.19
77    N     N(81)E(9)D(9)         0.19
163   L     L(81)G(9)W(9)         0.19
167   K     R(45)K(54)            0.20
41    F     F(81)M(9)L(9)         0.22
113   L     L(81)I(9)W(9)         0.22
129   V     V(81)L(9)I(9)         0.22
269   D     D(81)H(9)P(9)         0.22
272   L     L(81)M(9)V(9)         0.22
92    L     L(81)V(9)A(9)         0.23
274   Y     Y(72)F(18)S(9)        0.23
2     K     R(54)K(45)            0.24
134   P     P(81)L(9)N(9)         0.24
275   F     Y(18)F(63)I(18)       0.24
135   I     V(81)I(18)            0.25

Table 27. Disruptive mutations for the surface patch in 3a1jB.
res   type  disruptive mutations
3     F     (KE)(TQD)(SNCRG)(M)
55    W     (KE)(TQD)(SNCRG)(M)
64    F     (KE)(TQD)(SNCRG)(M)
105   L     (YR)(TH)(SKECG)(FQWD)
111   P     (YR)(TH)(SKECG)(FQWD)
149   P     (YR)(TH)(SKECG)(FQWD)
161   P     (YR)(TH)(SKECG)(FQWD)
208   L     (YR)(TH)(SKECG)(FQWD)
90    R     (T)(YD)(SVCAG)(FELWPI)
30    C     (R)(KE)(H)(FYQWD)
160   L     (Y)(R)(TH)(SCG)
185   N     (FYWH)(R)(E)(TVMA)
188   G     (R)(K)(FWH)(EQM)
1     X     (Y)(R)(TKEH)(FWD)
4     R     (T)(YD)(SVCAG)(FELWPI)
277   P     (R)(Y)(H)(K)
131   H     (TE)(D)(SVMCAG)(QLPI)
82    E     (FWH)(R)(Y)(K)
16    F     (KE)(T)(QDR)(SCG)
71    G     (KER)(HD)(Q)(FMW)
205   F     (K)(E)(Q)(D)
241   L     (R)(TY)(KE)(SCHG)
278   A     (KR)(E)(QH)(Y)
88    L     (R)(TY)(KE)(SCHG)
271   S     (K)(R)(Q)(E)
139   P     (Y)(R)(H)(T)
70    E     (FWH)(Y)(VCAG)(R)
203   T     (KR)(FQMWH)(NELPI)(D)
137   V     (YR)(KE)(H)(QD)
183   E     (FWH)(R)(VA)(YCG)
251   A     (Y)(R)(KEH)(D)
265   L     (R)(Y)(T)(KEH)
273   Q     (Y)(FW)(TH)(VA)
67    F     (K)(E)(Q)(D)
77    N     (Y)(FWH)(T)(R)
163   L     (R)(Y)(KE)(T)
167   K     (Y)(T)(FW)(SVCAG)
41    F     (T)(KE)(DR)(QCG)
113   L     (R)(Y)(T)(KE)
129   V     (YR)(KE)(H)(QD)
269   D     (R)(FKCWHG)(Y)(VA)
272   L     (Y)(R)(H)(T)
92    L     (YR)(H)(KE)(T)
274   Y     (K)(Q)(M)(ER)
2     K     (Y)(T)(FW)(SVCAG)
134   P     (Y)(R)(TH)(SCG)
275   F     (K)(E)(Q)(R)
135   I     (YR)(H)(TKE)(SQCDG)

5 NOTES ON USING TRACE RESULTS

5.1 Coverage
Trace results are commonly expressed in terms of coverage: the residue is important if its "coverage" is small, that is, if it belongs to some small top percentage of residues (100% is all of the residues in a chain), according to trace. The ET results are presented in the form of a table, usually limited to the top 25% of residues (or to some nearby percentage), sorted by the strength of the presumed evolutionary pressure. (I.e., the smaller the coverage, the stronger the pressure on the residue.) Starting from the top of that list, mutating a couple of residues should affect the protein somehow, with the exact effects to be determined experimentally.

5.2 Known substitutions
One of the table columns is "substitutions": other amino acid types seen at the same position in the alignment. These amino acid types may be interchangeable at that position in the protein, so if one wants to affect the protein by a point mutation, they should be avoided. For example, if the substitutions are "RVK" and the original protein has an R at that position, it is advisable to try anything but RVK. Conversely, when looking for substitutions which will not affect the protein, one may try replacing R with K, or (perhaps more surprisingly) with V. The percentage of times the substitution appears in the alignment is given in the immediately following bracket. No percentage is given in the cases when it is smaller than 1%. This is meant to be a rough guide: due to rounding errors these percentages often do not add up to 100%.
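As an illustration of how the "substitutions" column might be read programmatically, the following minimal Python sketch parses an entry such as K(27)R(72) and lists the amino acid types never observed at that position. The function names parse_substitutions and untried_types are hypothetical and are not part of any distributed trace tool; the sketch assumes one-letter amino acid codes, with "." and "X" treated as non-standard symbols that are simply ignored when listing unobserved types.

```python
import re

# The 20 standard amino acid one-letter codes.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def parse_substitutions(field):
    """Parse an entry such as 'K(27)R(72)' into {type: percent}.

    A type listed without a bracket (frequency below 1%) maps to None.
    """
    observed = {}
    for aa, pct in re.findall(r"([A-Za-z.])(?:\((\d+)\))?", field):
        observed[aa] = int(pct) if pct else None
    return observed


def untried_types(field):
    """Standard amino acid types never seen at this alignment position."""
    seen = set(parse_substitutions(field))
    return sorted(AMINO_ACIDS - seen)


if __name__ == "__main__":
    # Residue 90 of 3a1jB in Table 26.
    print(parse_substitutions("K(27)R(72)"))
    # The "RVK" example from the text: anything but R, V or K.
    print("".join(untried_types("RVK")))
```

The unobserved types are only a starting set for a disruptive point mutation; the ordering by presumed disruptiveness described in Section 5.6 is not reproduced here.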

5.3 Surface
To detect candidates for novel functional interfaces, first we look for residues that are solvent accessible (according to the DSSP program) by at least 10 Å², which is roughly the area needed for one water molecule to come in contact with the residue. Furthermore, we require that these residues form a "cluster" of residues which have a neighbor within 5 Å from any of their heavy atoms.

Note, however, that, if our picture of protein evolution is correct, the neighboring residues which are not surface accessible might be equally important in maintaining the interaction specificity; they should not be automatically dropped from consideration when choosing the set for mutagenesis. (Especially if they form a cluster with the surface residues.)
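The two criteria above (accessibility of at least 10 Å² and a neighbor within 5 Å of any heavy atom) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the procedure used to produce the report: the function names are hypothetical, accessibility values are assumed to be already available per residue (for example from DSSP output), and clusters are formed by simple single-linkage grouping.

```python
import math
from itertools import combinations


def surface_residues(accessibility, min_area=10.0):
    """Residue ids whose solvent accessible area is at least min_area (in A^2)."""
    return sorted(r for r, area in accessibility.items() if area >= min_area)


def residues_touch(atoms_a, atoms_b, cutoff=5.0):
    """True if any pair of heavy atoms lies within cutoff (in A)."""
    return any(math.dist(p, q) <= cutoff for p in atoms_a for q in atoms_b)


def cluster_residues(residues, cutoff=5.0):
    """Group residues into clusters connected by heavy-atom proximity.

    residues maps a residue id to a list of (x, y, z) heavy-atom coordinates.
    Returns a list of sets of residue ids, largest cluster first.
    """
    parent = {r: r for r in residues}

    def find(r):
        # Follow parent links to the cluster representative.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    for a, b in combinations(residues, 2):
        if residues_touch(residues[a], residues[b], cutoff):
            parent[find(a)] = find(b)

    clusters = {}
    for r in residues:
        clusters.setdefault(find(r), set()).add(r)
    return sorted(clusters.values(), key=len, reverse=True)


if __name__ == "__main__":
    # Made-up coordinates, one heavy atom per residue, for illustration only.
    toy = {3: [(0, 0, 0)], 55: [(4, 0, 0)], 64: [(8, 0, 0)], 105: [(30, 0, 0)]}
    print(cluster_residues(toy))
```

Single-residue groups are returned as well; a caller interested only in patches would drop them, since the text requires each residue to have at least one neighbor.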

5.4 Number of contacts
Another column worth noting is denoted "noc/bb"; it tells the number of contacts heavy atoms of the residue in question make across the interface, as well as how many of them are realized through the backbone atoms (if all or most contacts are through the backbone, mutation presumably won't have strong impact). Two heavy atoms are considered to be "in contact" if their centers are closer than 5 Å.

5.5 Annotation
If the residue annotation is available (either from the pdb file or from other sources), another column, with the header "annotation", appears. Annotations carried over from PDB are the following: site (indicating existence of related site record in PDB), S-S (disulfide bond forming residue), hb (hydrogen bond forming residue), jb (james bond forming residue), and sb (for salt bridge forming residue).

5.6 Mutation suggestions
Mutation suggestions are completely heuristic and based on complementarity with the substitutions found in the alignment. Note that they are meant to be disruptive to the interaction of the protein with its ligand. The attempt is made to complement the following properties: small [AVGSTC], medium [LPNQDEMIK], large [WFYHR], hydrophobic [LPVAMWFI], polar [GTCY]; positively [KHR], or negatively [DE] charged, aromatic [WFYH], long aliphatic chain [EKRQM], OH-group possession [SDETY], and NH2 group possession [NQRK]. The suggestions are listed according to how different they appear to be from the original amino acid, and they are grouped in round brackets if they appear equally disruptive. From left to right, each bracketed group of amino acid types resembles more strongly the original (i.e. is, presumably, less disruptive). These suggestions are tentative: they might prove disruptive to the fold rather than to the interaction. Many researchers will choose, however, the straightforward alanine mutations, especially in the beginning stages of their investigation.

6 APPENDIX

6.1 File formats
Files with extension "ranks_sorted" are the actual trace results. The fields in the table in this file:
• alignment# number of the position in the alignment
• residue# residue number in the PDB file
• type amino acid type
• rank rank of the position according to older version of ET
• variability has two subfields:
  1. number of different amino acids appearing in this column of the alignment
  2. their type
• rho ET score - the smaller this value, the lesser variability of this position across the branches of the tree (and, presumably, the greater the importance for the protein)
• cvg coverage - percentage of the residues on the structure which have this rho or smaller
• gaps percentage of gaps in this column

6.2 Color schemes used
The following color scheme is used in figures with residues colored by cluster size: black is a single-residue cluster; clusters composed of more than one residue are colored according to this hierarchy (ordered by descending size): red, blue, yellow, green, purple, azure, turquoise, brown, coral, magenta, LightSalmon, SkyBlue, violet, gold, bisque, LightSlateBlue, orchid, RosyBrown, MediumAquamarine, DarkOliveGreen, CornflowerBlue, grey55, burlywood, LimeGreen, tan, DarkOrange, DeepPink, maroon, BlanchedAlmond.
The colors used to distinguish the residues by the estimated evolutionary pressure they experience can be seen in Fig. 25.

[Figure: color bar labeled COVERAGE (100%, 50%, 30%, 5%) and RELATIVE IMPORTANCE]
Fig. 25. Coloring scheme used to color residues by their relative importance.

6.3 Credits
6.3.1 Alistat alistat reads a multiple sequence alignment from the file and shows a number of simple statistics about it. These statistics include the format, the number of sequences, the total number of residues, the average and range of the sequence lengths, and the alignment length (e.g. including gap characters). Also shown are some percent identities. A percent pairwise alignment identity is defined as (idents / MIN(len1, len2)) where idents is the number of exact identities and len1, len2 are the unaligned lengths of the two sequences. The "average percent identity", "most related pair", and "most unrelated pair" of the alignment are the average, maximum, and minimum of all (N)(N-1)/2 pairs, respectively. The "most distant seq" is calculated by finding the maximum pairwise identity (best relative) for all N sequences, then finding the minimum of these N numbers (hence, the most outlying sequence). alistat is copyrighted

by HHMI/Washington University School of Medicine, 1992-2001, and freely distributed under the GNU General Public License.

6.3.2 CE To map ligand binding sites from different source structures, report_maker uses the CE program: http://cl.sdsc.edu/. Shindyalov IN, Bourne PE (1998) "Protein structure alignment by incremental combinatorial extension (CE) of the optimal path". Protein Engineering 11(9) 739-747.

6.3.3 DSSP In this work a residue is considered solvent accessible if the DSSP program finds it exposed to water by at least 10 Å², which is roughly the area needed for one water molecule to come in contact with the residue. DSSP is copyrighted by W. Kabsch, C. Sander and MPI-MF, 1983, 1985, 1988, 1994, 1995, CMBI version November 18, 2002, http://www.cmbi.kun.nl/gv/dssp/descrip.html.

6.3.4 HSSP Whenever available, report_maker uses HSSP alignment as a starting point for the analysis (sequences shorter than 75% of the query are taken out, however); R. Schneider, A. de Daruvar, and C. Sander. "The HSSP database of protein structure-sequence alignments." Nucleic Acids Res., 25:226–230, 1997. http://swift.cmbi.kun.nl/swift/hssp/

6.3.5 LaTex The text for this report was processed using LaTeX; Leslie Lamport, "LaTeX: A Document Preparation System", Addison-Wesley, Reading, Mass. (1986).

6.3.6 Muscle When making alignments "from scratch", report_maker uses the Muscle alignment program: Edgar, Robert C. (2004), "MUSCLE: multiple sequence alignment with high accuracy and high throughput." Nucleic Acids Research 32(5), 1792-97. http://www.drive5.com/muscle/

6.3.7 Pymol The figures in this report were produced using Pymol. The scripts can be found in the attachment. Pymol is an open-source application copyrighted by DeLano Scientific LLC (2005). For more information about Pymol see http://pymol.sourceforge.net/. (Note for Windows users: the attached package needs to be unzipped for Pymol to read the scripts and launch the viewer.)

6.4 Note about ET Viewer
Dan Morgan from the Lichtarge lab has developed a visualization tool specifically for viewing trace results. If you are interested, please visit:
http://mammoth.bcm.tmc.edu/traceview/
The viewer is self-unpacking and self-installing. Input files to be used with ETV (extension .etvx) can be found in the attachment to the main report.

6.5 Citing this work
The method used to rank residues and make predictions in this report can be found in Mihalek, I., I. Reš, O. Lichtarge. (2004). "A Family of Evolution-Entropy Hybrid Methods for Ranking of Protein Residues by Importance" J. Mol. Bio. 336: 1265-82. For the original version of ET see O. Lichtarge, H. Bourne and F. Cohen (1996). "An Evolutionary Trace Method Defines Binding Surfaces Common to Protein Families" J. Mol. Bio. 257: 342-358.
report_maker itself is described in Mihalek I., I. Reš and O. Lichtarge (2006). "Evolutionary Trace Report_Maker: a new type of service for comparative analysis of proteins." Bioinformatics 22:1656-7.

6.6 About report_maker
report_maker was written in 2006 by Ivana Mihalek. The 1D ranking visualization program was written by Ivica Reš. report_maker is copyrighted by Lichtarge Lab, Baylor College of Medicine, Houston.

6.7 Attachments
The following files should accompany this report:
• 3a1jA.complex.pdb - coordinates of 3a1jA with all of its interacting partners
• 3a1jA.etvx - ET viewer input file for 3a1jA
• 3a1jA.cluster_report.summary - Cluster report summary for 3a1jA
• 3a1jA.ranks - Ranks file in sequence order for 3a1jA
• 3a1jA.clusters - Cluster descriptions for 3a1jA
• 3a1jA.msf - the multiple sequence alignment used for the chain 3a1jA
• 3a1jA.descr - description of sequences used in 3a1jA msf
• 3a1jA.ranks_sorted - full listing of residues and their ranking for 3a1jA
• 3a1jA.3a1jB.if.pml - Pymol script for Figure 5
• 3a1jA.cbcvg - used by other 3a1jA-related Pymol scripts
• 3a1jA.3a1jASUC6001.if.pml - Pymol script for Figure 6
• 3a1jA.3a1jC.if.pml - Pymol script for Figure 7
• 3a1jC.complex.pdb - coordinates of 3a1jC with all of its interacting partners
• 3a1jC.etvx - ET viewer input file for 3a1jC
• 3a1jC.cluster_report.summary - Cluster report summary for 3a1jC
• 3a1jC.ranks - Ranks file in sequence order for 3a1jC
• 3a1jC.clusters - Cluster descriptions for 3a1jC
• 3a1jC.msf - the multiple sequence alignment used for the chain 3a1jC
• 3a1jC.descr - description of sequences used in 3a1jC msf
• 3a1jC.ranks_sorted - full listing of residues and their ranking for 3a1jC
• 3a1jC.3a1jB.if.pml - Pymol script for Figure 14
• 3a1jC.cbcvg - used by other 3a1jC-related Pymol scripts
• 3a1jC.3a1jA.if.pml - Pymol script for Figure 15
• 3a1jB.complex.pdb - coordinates of 3a1jB with all of its interacting partners
• 3a1jB.etvx - ET viewer input file for 3a1jB
• 3a1jB.cluster_report.summary - Cluster report summary for 3a1jB
• 3a1jB.ranks - Ranks file in sequence order for 3a1jB
• 3a1jB.clusters - Cluster descriptions for 3a1jB
• 3a1jB.msf - the multiple sequence alignment used for the chain 3a1jB
• 3a1jB.descr - description of sequences used in 3a1jB msf
• 3a1jB.ranks_sorted - full listing of residues and their ranking for 3a1jB
• 3a1jB.3a1jC.if.pml - Pymol script for Figure 22
• 3a1jB.cbcvg - used by other 3a1jB-related Pymol scripts
• 3a1jB.3a1jA.if.pml - Pymol script for Figure 23
