2I52 Lichtarge Lab 2006

2i52
Evolutionary trace report by report maker
February 25, 2010
Pages 1–11

CONTENTS

1 Introduction
2 Chain 2i52F
  2.1 Q6L2J9 overview
  2.2 Multiple sequence alignment for 2i52F
  2.3 Residue ranking in 2i52F
  2.4 Top ranking residues in 2i52F and their position on the structure
    2.4.1 Clustering of residues at 25% coverage.
    2.4.2 Overlap with known functional surfaces at 25% coverage.
3 Notes on using trace results
  3.1 Coverage
  3.2 Known substitutions
  3.3 Surface
  3.4 Number of contacts
  3.5 Annotation
  3.6 Mutation suggestions
4 Appendix
  4.1 File formats
  4.2 Color schemes used
  4.3 Credits
    4.3.1 Alistat
    4.3.2 CE
    4.3.3 DSSP
    4.3.4 HSSP
    4.3.5 LaTeX
    4.3.6 Muscle
    4.3.7 Pymol
  4.4 Note about ET Viewer
  4.5 Citing this work
  4.6 About report maker
  4.7 Attachments

1 INTRODUCTION

From the original Protein Data Bank entry (PDB id 2i52):
Title: Crystal structure of protein pto0218 from picrophilus torridus, pfam duf372
Compound: Mol id: 1; molecule: hypothetical protein; chain: a, b, c, d, e, f; engineered: yes
Organism, scientific name: Picrophilus Torridus;

2i52 contains a single unique chain 2i52F (119 residues long) and its homologues 2i52A, 2i52D, 2i52C, 2i52E, and 2i52B.

2 CHAIN 2I52F

2.1 Q6L2J9 overview

From SwissProt, id Q6L2J9, 97% identical to 2i52F:
Description: Hypothetical protein.
Organism, scientific name: Picrophilus torridus.
Taxonomy: Archaea; Euryarchaeota; Thermoplasmata; Thermoplasmatales; Picrophilaceae; Picrophilus.

2.2 Multiple sequence alignment for 2i52F

For the chain 2i52F, the alignment 2i52F.msf (attached) with 30 sequences was used. The alignment was downloaded from the HSSP database, and fragments shorter than 75% of the query as well as duplicate sequences were removed. It can be found in the attachment to this report, under the name of 2i52F.msf. Its statistics, from the alistat program, are the following:

Format:                    MSF
Number of sequences:       30
Total number of residues:  1932
Smallest:                  56
Largest:                   119
Average length:            64.4
Alignment length:          119
Average identity:          54%
Most related pair:         97%
Most unrelated pair:       31%
Most distant seq:          55%

Furthermore, 7% of residues show as conserved in this alignment.

The alignment consists of 10% prokaryotic, and 40% archaean sequences. (Descriptions of some sequences were not readily available.) The file containing the sequence descriptions can be found in the attachment, under the name 2i52F.descr.

2.3 Residue ranking in 2i52F

The 2i52F sequence is shown in Fig. 1, with each residue colored according to its estimated importance. The full listing of residues in 2i52F can be found in the file called 2i52F.ranks_sorted in the attachment.

Fig. 1. Residues 0-120 in 2i52F colored by their relative importance. (See Appendix, Fig. 15, for the coloring scheme.)

2.4 Top ranking residues in 2i52F and their position on the structure

In the following we consider residues ranking among the top 25% of residues in the protein. Figure 2 shows residues in 2i52F colored by their importance: bright red and yellow indicate more conserved/important residues (see Appendix for the coloring scheme). A Pymol script for producing this figure can be found in the attachment.

Fig. 2. Residues in 2i52F, colored by their relative importance. Clockwise: front, back, top and bottom views.

2.4.1 Clustering of residues at 25% coverage.

Fig. 3 shows the top 25% of all residues, this time colored according to the clusters they belong to. The clusters in Fig. 3 are composed of the residues listed in Table 1.

Fig. 3. Residues in 2i52F, colored according to the cluster they belong to: red, followed by blue and yellow are the largest clusters (see Appendix for the coloring scheme). Clockwise: front, back, top and bottom views. The corresponding Pymol script is attached.

Table 1.
cluster color  size  member residues
red            30    12,14,15,17,19,20,21,22,23,24,25,26,27,28,29,30,31,32,
                     34,36,45,46,48,49,52,56,57,59,62,64
Table 1. Clusters of top ranking residues in 2i52F.

2.4.2 Overlap with known functional surfaces at 25% coverage.

The name of the ligand is composed of the source PDB identifier and the heteroatom name used in that file.

Interface with 2i52B. Table 2 lists the top 25% of residues at the interface with 2i52B. The following table (Table 3) suggests possible disruptive replacements for these residues (see Section 3.6).

Table 2.
res  type  subst's (%)        cvg   noc/bb  dist (Å)
64   I     V(93) I(6)         0.08  2/2     4.64
46   E     E(90) A(10)        0.13  13/0    2.90
62   V     I(43) V(46) A(10)  0.25  1/1     4.74
Table 2. The top 25% of residues in 2i52F at the interface with 2i52B. (Field names: res: residue number in the PDB entry; type: amino acid type; subst's: substitutions seen in the alignment, with the percentage of each type in the bracket; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given in the bracket; dist: distance of closest approach to the ligand.)

Table 3.
res  type  disruptive mutations
64   I     (YR)(H)(TKE)(SQCDG)
46   E     (H)(FYWR)(CG)(TKVA)
62   V     (YR)(KE)(H)(QD)
Table 3. List of disruptive mutations for the top 25% of residues in 2i52F that are at the interface with 2i52B.

Figure 4 shows residues in 2i52F colored by their importance, at the interface with 2i52B.

Fig. 4. Residues in 2i52F, at the interface with 2i52B, colored by their relative importance. 2i52B is shown in backbone representation. (See Appendix for the coloring scheme for the protein chain 2i52F.)

Glycerol binding site. By analogy with 2i52C – 2i52GOL902 interface. Table 4 lists the top 25% of residues at the interface with 2i52GOL902 (glycerol). The following table (Table 5) suggests possible disruptive replacements for these residues (see Section 3.6).

Table 4.
res  type  subst's (%)        cvg   noc/bb  dist (Å)
56   Q     Q(100)             0.08  7/0     3.02
23   I     I(96) V(3)         0.12  2/0     4.13
27   A     A(83) T(10) S(6)   0.23  1/0     3.89
Table 4. The top 25% of residues in 2i52F at the interface with glycerol. (Field names: res: residue number in the PDB entry; type: amino acid type; subst's: substitutions seen in the alignment, with the percentage of each type in the bracket; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given in the bracket; dist: distance of closest approach to the ligand.)

Table 5.
res  type  disruptive mutations
56   Q     (Y)(FTWH)(SVCAG)(D)
23   I     (YR)(H)(TKE)(SQCDG)
27   A     (KR)(E)(Y)(QH)
Table 5. List of disruptive mutations for the top 25% of residues in 2i52F that are at the interface with glycerol.

Figure 5 shows residues in 2i52F colored by their importance, at the interface with 2i52GOL902.

Fig. 5. Residues in 2i52F, at the interface with glycerol, colored by their relative importance. The ligand (glycerol) is colored green. Atoms further than 30Å away from the geometric center of the ligand, as well as on the line of sight to the ligand, were removed. (See Appendix for the coloring scheme for the protein chain 2i52F.)

Chloride ion binding site. By analogy with 2i52B – 2i52CL813 interface. Table 6 lists the top 25% of residues at the interface with 2i52CL813 (chloride ion). The following table (Table 7) suggests possible disruptive replacements for these residues (see Section 3.6).

Table 6.
res  type  subst's (%)                   cvg   noc/bb  dist (Å)
14   I     R(76) A(10) K(3) I(3) P(6)    0.20  4/2     3.46
12   T     T(73) S(23) N(3)              0.21  5/2     3.87
Table 6. The top 25% of residues in 2i52F at the interface with chloride ion. (Field names: res: residue number in the PDB entry; type: amino acid type; subst's: substitutions seen in the alignment, with the percentage of each type in the bracket; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given in the bracket; dist: distance of closest approach to the ligand.)

Table 7.
res  type  disruptive mutations
14   I     (Y)(T)(R)(H)
12   T     (R)(K)(FWH)(M)
Table 7. List of disruptive mutations for the top 25% of residues in 2i52F that are at the interface with chloride ion.

Table 8. continued
res  type  subst's (%)  cvg   noc/bb  dist (Å)
           W(6)
Table 8. The top 25% of residues in 2i52F at the interface with calcium ion. (Field names: res: residue number in the PDB entry; type: amino acid type; subst's: substitutions seen in the alignment, with the percentage of each type in the bracket; noc/bb: number of contacts with the ligand, with the number of contacts realized through backbone atoms given in the bracket; dist: distance of closest approach to the ligand.)

Table 9.
res  type  disruptive mutations
30   H     (E)(TQMD)(SNKVCLAPIG)(YR)
31   Q     (Y)(FTWH)(SVCAG)(D)
34   G     (KER)(FQMWHD)(NYLPI)(SVA)
29   F     (K)(E)(Q)(D)
32   Y     (K)(Q)(E)(M)
Table 9. List of disruptive mutations for the top 25% of residues in 2i52F that are at the interface with calcium ion.
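
The "subst's" and "cvg" fields described in the table captions can be read programmatically. The following is a minimal Python sketch, not part of report maker: the function names parse_substitutions and summarize are hypothetical, the rows are transcribed by hand from Table 2, and it assumes that cvg is a fractional coverage value to be compared against the 25% cutoff. The listed percentages need not sum to exactly 100, presumably because of rounding or truncation in the report.

```python
import re

# Rows transcribed by hand from Table 2 (interface with 2i52B):
# (residue number, residue type in 2i52F, substitutions, coverage)
TABLE_2 = [
    (64, "I", "V(93) I(6)", 0.08),
    (46, "E", "E(90) A(10)", 0.13),
    (62, "V", "I(43) V(46) A(10)", 0.25),
]


def parse_substitutions(text):
    """Parse 'V(93) I(6)' into {'V': 93, 'I': 6} (percent of sequences)."""
    return {aa: int(pct) for aa, pct in re.findall(r"([A-Z])\((\d+)\)", text)}


def summarize(rows, max_coverage=0.25):
    """Keep rows within the coverage cutoff (assumed meaning of cvg) and
    report the most frequent amino acid type seen in the alignment."""
    summary = []
    for res, res_type, subs, cvg in rows:
        if cvg > max_coverage:
            continue
        freqs = parse_substitutions(subs)
        dominant = max(freqs, key=freqs.get)
        summary.append((res, res_type, dominant, sum(freqs.values())))
    return summary


for res, res_type, dominant, total in summarize(TABLE_2):
    print(f"residue {res}: {res_type} in 2i52F, {dominant} most common "
          f"in the alignment, listed percentages sum to {total}%")
```

Run on the Table 2 rows, this shows for example that residue 64 is I in 2i52F while V is the most frequent type in the alignment, which is the kind of detail the disruptive mutation lists in Table 3 are meant to complement.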

