2Blf Lichtarge Lab 2006
2blf
Evolutionary trace report by report maker
August 29, 2009

CONTENTS

1 Introduction
2 Chain 2blfA
  2.1 Q9LA16 overview
  2.2 Multiple sequence alignment for 2blfA
  2.3 Residue ranking in 2blfA
  2.4 Top ranking residues in 2blfA and their position on the structure
    2.4.1 Clustering of residues at 25% coverage.
    2.4.2 Overlap with known functional surfaces at 25% coverage.
    2.4.3 Possible novel functional surfaces at 25% coverage.
3 Chain 2blfB
  3.1 Q9LA15 overview
  3.2 Multiple sequence alignment for 2blfB
  3.3 Residue ranking in 2blfB
  3.4 Top ranking residues in 2blfB and their position on the structure
    3.4.1 Clustering of residues at 25% coverage.
    3.4.2 Overlap with known functional surfaces at 25% coverage.
    3.4.3 Possible novel functional surfaces at 25% coverage.
4 Notes on using trace results
  4.1 Coverage
  4.2 Known substitutions
  4.3 Surface
  4.4 Number of contacts
  4.5 Annotation
  4.6 Mutation suggestions
5 Appendix
  5.1 File formats
  5.2 Color schemes used
  5.3 Credits
    5.3.1 Alistat
    5.3.2 CE
    5.3.3 DSSP
    5.3.4 HSSP
    5.3.5 LaTex
    5.3.6 Muscle
    5.3.7 Pymol
  5.4 Note about ET Viewer
  5.5 Citing this work
  5.6 About report maker
  5.7 Attachments

1 INTRODUCTION

From the original Protein Data Bank entry (PDB id 2blf):
Title: Sulfite dehydrogenase from starkeya novella
Compound: Mol id: 1; molecule: sulfite:cytochrome c oxidoreductase subunit a; chain: a; synonym: sora; engineered: yes; mol id: 2; molecule: sulfite:cytochrome c oxidoreductase subunit b; chain: b; synonym: sorb; engineered: yes
Organism, scientific name: Thiobacillus Novellus.
2blf contains unique chains 2blfA (373 residues) and 2blfB (81 residues).

2 CHAIN 2BLFA

2.1 Q9LA16 overview

From SwissProt, id Q9LA16, 100% identical to 2blfA:
Description: Sulfite:cytochrome c oxidoreductase subunit A.
Organism, scientific name: Thiobacillus novellus.
Taxonomy: Bacteria; Proteobacteria; Alphaproteobacteria; Rhizobiales; Hyphomicrobiaceae; Starkeya.

2.2 Multiple sequence alignment for 2blfA

For the chain 2blfA, the alignment 2blfA.msf (attached) with 226 sequences was used. The alignment was downloaded from the HSSP database, and fragments shorter than 75% of the query as well as duplicate sequences were removed. It can be found in the attachment to this report, under the name of 2blfA.msf. Its statistics, from the alistat program, are the following:

Format:                    MSF
Number of sequences:       226
Total number of residues:  78267
Smallest:                  280
Largest:                   373
Average length:            346.3
Alignment length:          373
Average identity:          30%
Most related pair:         98%
Most unrelated pair:       16%
Most distant seq:          31%

Furthermore, <1% of residues show as conserved in this alignment.

The alignment consists of 8% eukaryotic (3% vertebrata, <1% arthropoda, 2% fungi, 1% plantae), 9% prokaryotic, and <1% archaean sequences. (Descriptions of some sequences were not readily available.) The file containing the sequence descriptions can be found in the attachment, under the name 2blfA.descr.

2.3 Residue ranking in 2blfA

The 2blfA sequence is shown in Figs. 1–2, with each residue colored according to its estimated importance. The full listing of residues in 2blfA can be found in the file called 2blfA.ranks_sorted in the attachment.

Fig. 1. Residues 1-186 in 2blfA colored by their relative importance. (See Appendix, Fig. 18, for the coloring scheme.)

Fig. 2. Residues 187-373 in 2blfA colored by their relative importance. (See Appendix, Fig. 18, for the coloring scheme.)

2.4 Top ranking residues in 2blfA and their position on the structure

In the following we consider residues ranking among the top 25% of residues in the protein. Figure 3 shows residues in 2blfA colored by their importance: bright red and yellow indicate more conserved/important residues (see Appendix for the coloring scheme). A Pymol script for producing this figure can be found in the attachment.

Fig. 3. Residues in 2blfA, colored by their relative importance. Clockwise: front, back, top and bottom views.

2.4.1 Clustering of residues at 25% coverage. Fig. 4 shows the top 25% of all residues, this time colored according to the clusters they belong to. The clusters in Fig. 4 are composed of the residues listed in Table 1.

Fig. 4. Residues in 2blfA, colored according to the cluster they belong to: red, followed by blue and yellow are the largest clusters (see Appendix for the coloring scheme). Clockwise: front, back, top and bottom views. The corresponding Pymol script is attached.

Table 1.
cluster color  size  member residues
red            93    47,48,53,54,55,56,57,70,72,76,78,89,103,104,105,106,107,
                     108,109,118,120,121,125,127,128,129,131,133,134,136,140,
                     143,144,156,158,170,184,185,186,188,189,190,193,194,197,
                     198,200,202,205,206,207,209,210,211,212,213,214,215,216,
                     217,220,231,232,235,236,237,265,267,285,286,287,288,289,
                     290,297,300,302,304,307,310,321,324,339,340,341,343,346,
                     349,350,355,356,359,360
Table 1. Clusters of top ranking residues in 2blfA.

2.4.2 Overlap with known functional surfaces at 25% coverage. The name of the ligand is composed of the source PDB identifier and the heteroatom name used in that file.

Interface with 2blfB. Table 2 lists the top 25% of residues at the interface with 2blfB. The following table (Table 3) suggests possible disruptive replacements for these residues (see Section 4.6).

Table 2.
res  type  subst’s (%)                                                cvg   noc/bb  dist (Å)  antn
55   R     V(2)R(92).L(2)AGTIE                                        0.04  1/0     4.61      site
57   H     H(72)N(15)TD(8).CRYF                                       0.05  3/3     4.22
118  G     .(12)G(74)T(3)P(2)Q(1)S(3)IEK                              0.09  12/12   2.96
56   Y     S(4)N(44)W(7)DT(1)H(15)Y(11)V(1)I(7).A(1)MCQ               0.20  2/2     3.95      site
231  W     F(14)Y(9).(4)I(5)W(30)K(7)E(12)V(3)TN(1)Q(2)A(3)G(1)D(2)M  0.23  6/0     4.06
120  Q     P(13)G(7)S(3)Q(46).(10)MK(1)A(2)T(1)E(3)N(4)D(3)L          0.24  34/1    3.11
Table 2. The top 25% of residues in 2blfA at the interface with 2blfB. (Field names: res: residue number in the PDB entry; type: amino acid type; substs: substitutions seen in the alignment, with the percentage of each type in the bracket; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given in the bracket; dist: distance of closest approach to the ligand.)

Table 3.
res  type  disruptive mutations
55   R     (Y)(D)(T)(E)
57   H     (E)(M)(Q)(D)
118  G     (R)(H)(FW)(K)
56   Y     (K)(QR)(M)(E)
231  W     (K)(E)(T)(R)
120  Q     (Y)(H)(FW)(T)
Table 3. List of disruptive mutations for the top 25% of residues in 2blfA, that are at the interface with 2blfB.

Fig. 5. Residues in 2blfA, at the interface with 2blfB, colored by their relative importance. 2blfB is shown in backbone representation. (See Appendix for the coloring scheme for the protein chain 2blfA.)

Figure 5 shows residues in 2blfA colored by their importance, at the interface with 2blfB.

MSS binding site. Table 4 lists the top 25% of residues at the interface with 2blfAMSS1374 (mss). The following table (Table 5) suggests possible disruptive replacements for these residues (see Section 4.6).

Table 4.
res  type  subst’s (%)                      cvg   noc/bb  dist (Å)  antn
202  R     R(96)K(2)V                       0.02  19/0    2.72
104  C     C(96)V(2)QG.                     0.03  15/8    2.79      site

Table 4. continued
res  type  subst’s (%)                      cvg   noc/bb  dist (Å)  antn
           L(2)AGTIE
106  G     G(79)S(7)N(3)A(8)HD.             0.04  1/1     4.96
210  G     G(86)A(11)S(2)                   0.04  30/30   3.22      site
57   H     H(72)N(15)TD(8).CRYF             0.05  30/7    2.90
236  Y     Y(91)N(3)W(3).S                  0.06  11/0    3.34      site
197  N     H(68)N(22)Q(7)R(1)Y              0.09  18/6    2.78      site
216  H     W(68)Y(5)M(4)RH(11)K(5)C(2)ATSQ  0.09  2/1     4.65
105  S     S(31)A(43)G(12)V(7)D(2)T(1)I.    0.12  10/6    2.74      site
156  G     G(75)A(3)DS(15)N(2)FC.QE         0.12  2/2     4.57
53   F     F(63)Y(34).LAV                   0.13  28/10   2.69      site
211  T     V(9)G(6)T(18)S(9)M(6)            0.14  34/26   3.11      site
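
The substitution column in the tables above packs each alignment position into one string, such as R(96)K(2)V, where a bracket gives the percentage of that residue type. For readers who want to process these cells, here is a minimal Python sketch of one way to split them. The function names parse_substitutions and listed_percent are hypothetical and are not part of report maker. This excerpt does not define the "." symbol or the letters that carry no bracket, so the sketch keeps them with a percentage of None rather than guessing a value.

```python
import re

# One token: an uppercase letter or "." with an optional "(percent)" suffix.
TOKEN = re.compile(r"([A-Z.])(?:\((\d+)\))?")
# A whole cell is zero or more such tokens and nothing else.
CELL = re.compile(r"(?:[A-Z.](?:\(\d+\))?)*")


def parse_substitutions(cell):
    """Split a substitution cell into (symbol, percent) pairs.

    percent is an int when the report gives one in brackets, else None.
    Whitespace left over from column wrapping is ignored.
    """
    compact = "".join(cell.split())
    if not CELL.fullmatch(compact):
        raise ValueError(f"unexpected characters in {cell!r}")
    return [(symbol, int(pct) if pct else None)
            for symbol, pct in TOKEN.findall(compact)]


def listed_percent(cell):
    """Sum only the percentages that are explicitly given in the cell."""
    return sum(pct for _, pct in parse_substitutions(cell) if pct is not None)
```

For example, parse_substitutions("R(96) K(2)V") returns [('R', 96), ('K', 2), ('V', None)], and listed_percent("R(96)K(2)V") returns 98. Validating the whole cell before tokenizing means a cell truncated by extraction, such as "R(96", raises an error instead of being silently misread.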