3Ksc Lichtarge Lab 2006
3ksc
Evolutionary trace report by report maker
May 12, 2010

CONTENTS

1 Introduction
2 Chain 3kscE
  2.1 Q41676 overview
  2.2 Multiple sequence alignment for 3kscE
  2.3 Residue ranking in 3kscE
  2.4 Top ranking residues in 3kscE and their position on the structure
    2.4.1 Clustering of residues at 25% coverage.
    2.4.2 Overlap with known functional surfaces at 25% coverage.
    2.4.3 Possible novel functional surfaces at 25% coverage.
3 Notes on using trace results
  3.1 Coverage
  3.2 Known substitutions
  3.3 Surface
  3.4 Number of contacts
  3.5 Annotation
  3.6 Mutation suggestions
4 Appendix
  4.1 File formats
  4.2 Color schemes used
  4.3 Credits
    4.3.1 Alistat
    4.3.2 CE
    4.3.3 DSSP
    4.3.4 HSSP
    4.3.5 LaTex
    4.3.6 Muscle
    4.3.7 Pymol
  4.4 Note about ET Viewer
  4.5 Citing this work
  4.6 About report maker
  4.7 Attachments

1 INTRODUCTION

From the original Protein Data Bank entry (PDB id 3ksc):
Title: Crystal structure of pea prolegumin, an 11s seed globulin from pisum sativum l.
Compound: Mol id: 1; molecule: lega class; chain: a, b, c, d, e, f; synonym: prolegumin; engineered: yes
Organism, scientific name: Pisum Sativum;

3ksc contains a single unique chain 3kscE (383 residues long) and its homologues 3kscA, 3kscF, 3kscD, 3kscC, and 3kscB.

2 CHAIN 3KSCE

2.1 Q41676 overview

From SwissProt, id Q41676, 78% identical to 3kscE:
Description: Legumin A precursor.
Organism, scientific name: Vicia narbonensis (Narbonne vetch).
Taxonomy: Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta; Spermatophyta; Magnoliophyta; eudicotyledons; core eudicotyledons; rosids; eurosids I; Fabales; Fabaceae; Papilionoideae; Vicieae; Vicia.

2.2 Multiple sequence alignment for 3kscE

For the chain 3kscE, the alignment 3kscE.msf (attached) with 218 sequences was used. The alignment was downloaded from the HSSP database, and fragments shorter than 75% of the query as well as duplicate sequences were removed. It can be found in the attachment to this report, under the name of 3kscE.msf. Its statistics, from the alistat program, are the following:

Format:                    MSF
Number of sequences:       218
Total number of residues:  64427
Smallest:                  49
Largest:                   383
Average length:            295.5
Alignment length:          383
Average identity:          43%
Most related pair:         99%
Most unrelated pair:       0%
Most distant seq:          48%

Furthermore, <1% of residues show as conserved in this alignment.

The alignment consists of 60% eukaryotic (60% plantae) sequences. (Descriptions of some sequences were not readily available.) The file containing the sequence descriptions can be found in the attachment, under the name 3kscE.descr.

2.3 Residue ranking in 3kscE

The 3kscE sequence is shown in Figs. 1–2, with each residue colored according to its estimated importance. The full listing of residues in 3kscE can be found in the file called 3kscE.ranks sorted in the attachment.

Fig. 1. Residues 7-219 in 3kscE colored by their relative importance. (See Appendix, Fig. 18, for the coloring scheme.)

Fig. 2. Residues 220-488 in 3kscE colored by their relative importance. (See Appendix, Fig. 18, for the coloring scheme.)

2.4 Top ranking residues in 3kscE and their position on the structure

In the following we consider residues ranking among top 25% of residues in the protein. Figure 3 shows residues in 3kscE colored by their importance: bright red and yellow indicate more conserved/important residues (see Appendix for the coloring scheme). A Pymol script for producing this figure can be found in the attachment.

Fig. 3. Residues in 3kscE, colored by their relative importance. Clockwise: front, back, top and bottom views.

2.4.1 Clustering of residues at 25% coverage. Fig. 4 shows the top 25% of all residues, this time colored according to clusters they belong to. The clusters in Fig. 4 are composed of the residues listed in Table 1.

Fig. 4. Residues in 3kscE, colored according to the cluster they belong to: red, followed by blue and yellow are the largest clusters (see Appendix for the coloring scheme). Clockwise: front, back, top and bottom views. The corresponding Pymol script is attached.

Table 1.
cluster color  size  member residues
red            32    10,43,45,46,67,71,117,118,145,325,326,346,350,351,353,358,
                     359,360,373,374,375,376,378,379,380,381,416,417,435,436,
                     437,441
blue           19    20,32,34,50,332,333,337,339,340,341,363,365,368,369,384,
                     410,411,425,432
yellow         7     84,85,86,108,109,110,318
green          7     15,393,394,395,402,403,404
purple         6     53,57,58,77,131,132
azure          5     65,151,152,153,154
turquoise      3     26,27,29
brown          2     448,451
coral          2     457,460
magenta        2     61,162
LightSalmon    2     80,129
SkyBlue        2     472,476
Table 1. Clusters of top ranking residues in 3kscE.

2.4.2 Overlap with known functional surfaces at 25% coverage. The name of the ligand is composed of the source PDB identifier and the heteroatom name used in that file.

Interface with 3kscB. Table 2 lists the top 25% of residues at the interface with 3kscB. The following table (Table 3) suggests possible disruptive replacements for these residues (see Section 3.6).

Table 2.
res  type  subst's (%)                         cvg   noc/bb  dist (Å)  antn
75   G     G(78) .(21) P                       0.07  1/1     4.69
337  P     P(68) R(11) S .(13) E(2) TAQLK      0.08  9/6     4.01
341  R     S(7) R(68) .(13) Y(4) V(1) LIFKHTC  0.14  40/0    2.98      site
332  P     A(66) G P(13) .(13) S(3) T(1) FR    0.18  2/0     3.96
131  Y     Y(77) D .(15) F(2) H(3) T           0.20  1/0     4.63
Table 2. The top 25% of residues in 3kscE at the interface with 3kscB. (Field names: res: residue number in the PDB entry; type: amino acid type; substs: substitutions seen in the alignment; with the percentage of each type in the bracket; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given in the bracket; dist: distance of closest approach to the ligand.)

Table 3.
res  type  disruptive mutations
75   G     (R)(KE)(H)(FWD)
337  P     (Y)(R)(H)(T)
341  R     (D)(E)(T)(Y)
332  P     (R)(Y)(H)(KE)
131  Y     (K)(Q)(M)(R)
Table 3. List of disruptive mutations for the top 25% of residues in 3kscE, that are at the interface with 3kscB.

Figure 5 shows residues in 3kscE colored by their importance, at the interface with 3kscB.

Fig. 5. Residues in 3kscE, at the interface with 3kscB, colored by their relative importance. 3kscB is shown in backbone representation (See Appendix for the coloring scheme for the protein chain 3kscE.)

Interface with 3kscD. Table 4 lists the top 25% of residues at the interface with 3kscD. The following table (Table 5) suggests possible disruptive replacements for these residues (see Section 3.6).

Table 4.
res  type  subst's (%)                         cvg   noc/bb  dist (Å)  antn
373  P     P(87) .(12) A                       0.00  16/2    3.66
416  P     P(86) .(13)                         0.01  3/2     3.88
10   C     C(73) .(25) WY                      0.02  48/20   2.92      S-S
399  G     G(85) .(12) ERC                     0.02  37/37   3.32
375  Y     W(67) F(1) Y(17) .(12) G            0.03  66/0    2.64
417  Q     Q(84) .(13) RT                      0.03  77/12   2.65
457  P     P(84) R .(14) S                     0.03  37/12   3.58
460  V     V(83) S .(14) L(1) G                0.03  30/7    3.99
380  N     N(25) H(60) .(12) KSR               0.04  21/4    3.40
395  V     V(82) L .(12) A(2) IRC              0.04  31/1    3.39
437  T     T(82) .(13) N(2) SP                 0.04  16/7    2.56
448  G     H(3) G(77) D(3) Q(1) .(14) I        0.05  1/1     4.48
451  S     Q(3) S(77) .(18) RV                 0.05  1/1     4.86
393  Q     R(7) Q(75) Y .(12) E(3) P           0.06  66/8    2.75
43   C     C(78) .(19) LH                      0.07  4/3     3.74      S-S
378  N     N(79) A(2) .(12) S(1) D(1) KTQ      0.07  12/10   3.31
379  A     A(77) C(2) S(5) .(12) G(1) D        0.09  11/10   3.81
476  K     K(75) R(5) .(16) TMD                0.13  26/0    2.78
394  V     I(19) V(64) A .(12) NMS             0.17  2/2     4.78
404  D     D(66) S(2) N(15) .(12) H Q(1) EK    0.23  1/0     4.96
Table 4. The top 25% of residues in 3kscE at the interface with 3kscD. (Field names: res: residue number in the PDB entry; type: amino acid type; substs: substitutions seen in the alignment; with the percentage of each type in the bracket; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given in the bracket; dist: distance of closest approach to the ligand.)

Table 5.
res  type  disruptive mutations
373  P     (Y)(R)(H)(T)
416  P     (YR)(TH)(SCG)(KE)
10   C     (K)(E)(R)(Q)
399  G     (FKEWR)(H)(MD)(Q)
375  Y     (K)(Q)(M)(E)
417  Q     (Y)(FW)(H)(T)
457  P     (Y)(R)(TH)(ECG)
460  V     (R)(K)(E)(Y)
380  N     (Y)(T)(FW)(EVCAG)
395  V     (Y)(E)(R)(K)
437  T     (R)(K)(H)(FW)
448  G     (R)(KE)(FWH)(M)
451  S     (KR)(FWH)(Y)(M)

Table 5.