1Wba Lichtarge Lab 2006
1wba
Evolutionary trace report by report maker
June 5, 2010

CONTENTS
1 Introduction
2 Chain 1wbaA
  2.1 P15465 overview
  2.2 Multiple sequence alignment for 1wbaA
  2.3 Residue ranking in 1wbaA
  2.4 Top ranking residues in 1wbaA and their position on the structure
    2.4.1 Clustering of residues at 25% coverage.
    2.4.2 Possible novel functional surfaces at 25% coverage.
3 Notes on using trace results
  3.1 Coverage
  3.2 Known substitutions
  3.3 Surface
  3.4 Number of contacts
  3.5 Annotation
  3.6 Mutation suggestions
4 Appendix
  4.1 File formats
  4.2 Color schemes used
  4.3 Credits
    4.3.1 Alistat
    4.3.2 CE
    4.3.3 DSSP
    4.3.4 HSSP
    4.3.5 LaTex
    4.3.6 Muscle
    4.3.7 Pymol
  4.4 Note about ET Viewer
  4.5 Citing this work
  4.6 About report maker
  4.7 Attachments

1 INTRODUCTION
From the original Protein Data Bank entry (PDB id 1wba):
Title: Winged bean albumin 1
Compound: Mol id: 1; molecule: winged bean albumin 1; chain: a
Organism, scientific name: Psophocarpus Tetragonolobus;
1wba contains a single unique chain 1wbaA (171 residues long).

2 CHAIN 1WBAA
2.1 P15465 overview
From SwissProt, id P15465, 100% identical to 1wbaA:
Description: Albumin-1 (WBA-1).
Organism, scientific name: Psophocarpus tetragonolobus (Goa bean) (Asparagus bean).
Taxonomy: Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta; Spermatophyta; Magnoliophyta; eudicotyledons; core eudicotyledons; rosids; eurosids I; Fabales; Fabaceae; Papilionoideae; Phaseoleae; Psophocarpus.
Function: 2S seed storage protein.
Similarity: Belongs to the leguminous Kunitz-type inhibitor family.
About: This Swiss-Prot entry is copyright. It is produced through a collaboration between the Swiss Institute of Bioinformatics and the EMBL outstation - the European Bioinformatics Institute. There are no restrictions on its use as long as its content is in no way modified and this statement is not removed.

2.2 Multiple sequence alignment for 1wbaA
For the chain 1wbaA, the alignment 1wbaA.msf (attached) with 21 sequences was used. The alignment was downloaded from the HSSP database, and fragments shorter than 75% of the query as well as duplicate sequences were removed. It can be found in the attachment to this report, under the name of 1wbaA.msf. Its statistics, from the alistat program, are the following:

Format:                    MSF
Number of sequences:       21
Total number of residues:  3441
Smallest:                  159
Largest:                   171
Average length:            163.9
Alignment length:          171
Average identity:          45%
Most related pair:         99%
Most unrelated pair:       21%
Most distant seq:          32%

Furthermore, 7% of residues show as conserved in this alignment.
The alignment consists of 80% eukaryotic (80% plantae) sequences. (Descriptions of some sequences were not readily available.) The file containing the sequence descriptions can be found in the attachment, under the name 1wbaA.descr.

2.3 Residue ranking in 1wbaA
The 1wbaA sequence is shown in Fig. 1, with each residue colored according to its estimated importance. The full listing of residues in 1wbaA can be found in the file called 1wbaA.ranks sorted in the attachment.

Fig. 1. Residues 2-172 in 1wbaA colored by their relative importance. (See Appendix, Fig.5, for the coloring scheme.)

2.4 Top ranking residues in 1wbaA and their position on the structure
In the following we consider residues ranking among the top 25% of residues in the protein. Figure 2 shows residues in 1wbaA colored by their importance: bright red and yellow indicate more conserved/important residues (see Appendix for the coloring scheme). A Pymol script for producing this figure can be found in the attachment.

Fig. 2. Residues in 1wbaA, colored by their relative importance. Clockwise: front, back, top and bottom views.

2.4.1 Clustering of residues at 25% coverage. Fig. 3 shows the top 25% of all residues, this time colored according to the clusters they belong to. The clusters in Fig. 3 are composed of the residues listed in Table 1.

Fig. 3. Residues in 1wbaA, colored according to the cluster they belong to: red, followed by blue and yellow are the largest clusters (see Appendix for the coloring scheme). Clockwise: front, back, top and bottom views. The corresponding Pymol script is attached.

Table 1. Clusters of top ranking residues in 1wbaA.

cluster color  size  member residues
red            42    5,7,10,15,16,17,18,19,20,22
                     28,29,38,43,44,45,46,47,49
                     59,72,82,92,94,95,105,118
                     130,131,132,134,135,141,143
                     144,145,158,164,165,167,169
                     171

2.4.2 Possible novel functional surfaces at 25% coverage. One group of residues is conserved on the 1wbaA surface, away from (or substantially larger than) other functional sites and interfaces recognizable in PDB entry 1wba. It is shown in Fig. 4. The right panel shows (in blue) the rest of the larger cluster this surface belongs to.

Fig. 4. A possible active surface on the chain 1wbaA. The larger cluster it belongs to is shown in blue.

The residues belonging to this surface "patch" are listed in Table 2, while Table 3 suggests possible disruptive replacements for these residues (see Section 3.6).

Table 2. Residues forming surface "patch" in 1wbaA.

res  type  substitutions(%)      cvg   antn
7    D     D(100)                0.07
19   Y     Y(100)                0.07
130  Y     Y(100)                0.07
131  K     K(100)                0.07
132  L     L(100)                0.07
135  C     C(100)                0.07  S-S
165  L     L(100)                0.07
17   G     G(95)A(4)             0.09
49   S     S(95)V(4)             0.09
92   W     W(95)F(4)             0.09
20   T     T(4)Y(95)             0.12
118  F     F(95)L(4)             0.12
164  P     P(95)T(4)             0.12
144  I     I(90)V(9)             0.14
171  R     R(4)K(90)Q(4)         0.15
10   G     G(90)D(9)             0.17
158  V     V(85)L(14)            0.18
16   R     R(4)G(85)D(4)N(4)     0.19
18   K     K(4)T(85)N(4)E(4)     0.19
141  C     C(85)D(14)            0.21  S-S
167  V     V(85)L(14)            0.21
134  F     F(61)Y(38)            0.22
15   N     N(90)A(4)S(4)         0.23
28   A     A(9)G(76)S(4)K(9)     0.24
22   V     V(23)L(71)I(4)        0.25
143  N     N(9)D(76)Y(14)        0.25

Table 3. Disruptive mutations for the surface patch in 1wbaA.

res  type  disruptive mutations
7    D     (R)(FWH)(KYVCAG)(TQM)
19   Y     (K)(QM)(NEVLAPIR)(D)
130  Y     (K)(QM)(NEVLAPIR)(D)
131  K     (Y)(FTW)(SVCAG)(HD)
132  L     (YR)(TH)(SKECG)(FQWD)
135  C     (KER)(FQMWHD)(NYLPI)(SVA)
165  L     (YR)(TH)(SKECG)(FQWD)
17   G     (KER)(QHD)(FYMW)(N)
49   S     (KR)(QH)(FYEMW)(N)
92   W     (KE)(TQD)(SNCRG)(M)
20   T     (K)(R)(QM)(FNELWPHI)
118  F     (KE)(T)(QDR)(SCG)
164  P     (R)(YH)(K)(E)
144  I     (YR)(H)(TKE)(SQCDG)
171  R     (T)(Y)(D)(SVCAG)
10   G     (R)(K)(FWH)(EQM)
158  V     (YR)(KE)(H)(QD)
16   R     (TY)(D)(FEVAW)(SCLPIG)
18   K     (Y)(FW)(T)(VAH)
141  C     (R)(K)(FWH)(EQM)
167  V     (YR)(KE)(H)(QD)
134  F     (K)(E)(Q)(D)
15   N     (Y)(H)(FW)(R)
28   A     (Y)(ER)(K)(H)
22   V     (YR)(KE)(H)(QD)
143  N     (YH)(FW)(R)(TVA)

3 NOTES ON USING TRACE RESULTS
3.1 Coverage
Trace results are commonly expressed in terms of coverage: the residue is important if its "coverage" is small, that is, if it belongs to some small top percentage of residues [100% is all of the residues in a chain], according to trace. The ET results are presented in the form of a table, usually limited to the top 25% of residues (or to some nearby percentage), sorted by the strength of the presumed evolutionary pressure. (I.e., the smaller the coverage, the stronger the pressure on the residue.) Starting from the top of that list, mutating a couple of residues should affect the protein somehow, with the exact effects to be determined experimentally.

3.2 Known substitutions
One of the table columns is "substitutions": other amino acid types seen at the same position in the alignment.
These amino acid types may be interchangeable at that position in the protein, so if one wants to affect the protein by a point mutation, they should be avoided. For example, if the substitutions are "RVK" and the original protein has an R at that position, it is advisable to try anything but RVK. Conversely, when looking for substitutions which will not affect the protein, one may try replacing R with K, or (perhaps more surprisingly) with V. The percentage of times the substitution appears in the alignment is given in the immediately following bracket. No percentage is given in the cases when it is smaller than 1%.

[Figure residue: color scale relating coverage (100%, 50%, 30%, 5%) to relative importance.]
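The reading of the "substitutions" column described in Section 3.2 can be sketched in a few lines of Python. This is only an illustrative sketch, not part of report maker: the function names (parse_substitutions, unobserved_types, tolerated_types) are hypothetical, and the sketch assumes the column uses one-letter amino acid codes, each optionally followed by a bracketed integer percentage, as in the examples "RVK" and "N(90)A(4)S(4)".

```python
import re

# The 20 standard amino acid types, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def parse_substitutions(column):
    """Parse a string such as 'N(90)A(4)S(4)' into {'N': 90, 'A': 4, 'S': 4}.

    A letter without a bracket (frequency below 1%) is stored as None.
    """
    observed = {}
    for aa, pct in re.findall(r"([A-Z])(?:\((\d+)\))?", column):
        observed[aa] = int(pct) if pct else None
    return observed


def unobserved_types(column):
    """Amino acid types never seen at this alignment position.

    These are the starting candidates when the goal is to affect the protein.
    """
    seen = parse_substitutions(column)
    return [aa for aa in AMINO_ACIDS if aa not in seen]


def tolerated_types(column, wild_type):
    """Types seen at this position other than the wild type.

    These are the candidates when the goal is NOT to affect the protein.
    """
    seen = parse_substitutions(column)
    return [aa for aa in seen if aa != wild_type]
```

For the example in the text, tolerated_types("RVK", "R") gives V and K, while unobserved_types("RVK") gives the remaining 17 types. Note that this only filters by what was observed in the alignment; the disruptive mutations suggested in Table 3 are chosen by the report's own criteria (see Section 3.6), which this sketch does not attempt to reproduce.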