Analysis of the Structural Consensus of the Zinc Coordination Centers of Metalloprotein Structures ⁎ Kirti Patel, Anil Kumar, Susheel Durani
Department of Chemistry, Indian Institute of Technology Bombay, Mumbai-400076, India

Abstract

In a recent sequence-analysis study it was concluded that up to 10% of the human proteome could comprise zinc proteins, quite varied in their functional spread. The native structures of only a few of these proteins are actually established. The elucidation of the rest of the sequences, not just of the human but also of other actively investigated genomes, may benefit from knowledge of the structural consensus of the zinc-binding centers of the currently known zinc proteins. Nearly four hundred X-ray and NMR structures in the database of zinc–protein structures available as of April 2007 were investigated for geometry and conformation in the zinc-binding centers, separately for the structural and catalytic proteins and individually for the zinc centers coordinated to three and four amino-acid ligands. Enhanced cysteine involvement, in agreement with the observation in the human proteome, has been detected, in contrast with previous reports. Deviations from ideal coordination geometries are detected, possible underlying reasons are investigated, and correlations of geometry and conformation in zinc-coordination centers with protein function are established, providing possible benchmarks for putative zinc-binding patterns in the burgeoning genome data.

Keywords: Metalloprotein; Zinc–protein; Zinc binding pattern; Zinc coordination; Structural zinc-center; Catalytic zinc-center

1. Introduction

The second most abundant transition element in biological systems after iron, zinc is essential for life. With a closed d-shell, unlike iron and copper, zinc(II) finds biological relevance as a redox-stable element in metalloprotein structures. Interacting mainly with amino-acid side chains and occasionally with non-protein ligands in mostly tetra-, penta-, and hexa-coordinate complexes, zinc contributes both structural and catalytic roles in metalloprotein structures [1,2]. Progressive studies of zinc proteins have established their involvement in diverse cellular processes, such as transcription, translation, and metabolism [3–8]. Based on homology-based searches of the human genome, Andreini et al. [9] concluded that ∼2800, i.e., ∼10%, of the proteins could be zinc proteins, encompassing all six International Union of Biochemistry recognized classes of enzymes [10]; in fact, zinc is the only metal so widespread in its distribution. The zinc-finger type transcription regulators, comprising more than 4% of the human proteome [9], call upon zinc as a tetra-coordinating element; involvements of zinc as a catalytic element, on the other hand, call upon its electrophilic character, in stabilizing anionic intermediates, and its dynamic coordination geometry, in facilitating bond rearrangements. In protein engineering and design applications, zinc could be an aid in the control of both conformation and function [11]. The currently known zinc centers could provide the structural consensus for characterization of the newer zinc-binding sequences, as well as serve as guidelines for protein engineering and design applications.

Reviews of the geometry and stereochemistry of zinc-coordination centers have appeared from time to time [12–21].
However, the structural database of zinc proteins has grown at least fivefold since the last survey [18]. Specifically, explosive growth of the NMR-characterized zinc–protein structures has been witnessed in addition to the X-ray characterized structures [22]. The combined resource of about 400 structures will possibly better capture the consensus in geometry and conformation in the newly uncovered zinc-binding patterns of diverse genomes. We have examined the zinc centers available in the Protein Data Bank [22] as of April 2007 for details of structure and conformation and summarize the results in this report.

Table 1
PDB codes of 228 structural zinc proteins, 102 NMR (Panel a) and 126 X-ray determined (Panel b), and 154 catalytic zinc proteins, 7 NMR (Panel c) and 147 X-ray determined (Panel d), investigated in this study

Panel a: the dataset of 102 structural zinc proteins (NMR structures)
1A1T 1FRE 1M36 1PFT 1U2N 1VYX 1XPA 1ZFO 2AYJ 2FNF
1AJY 1G25 1M9O 1PXE 1U6P 1WIG 1XRZ 1ZGW 2AZH 2GAT
1BBO 1G47 1MM2 1PYC 1U7J 1WII 1XWH 1ZR9 2BL6 2GHF
1BOR 1IBI 1N0Z 1R9P 1U85 1WJ0 1Y0J 1ZU1 2C6A 2HGH
1DSQ 1IML 1NCS 1RXR 1UW0 1WJ2 1Y8F 1ZW8 2CON 2I5O
1DSV 1IYM 1NEE 1SRK 1UW2 1WO3 1YUI 2A20 2CQE 2IDA
1DX8 1J2O 1NJ3 1T3K 1VA1 1WPK 1YWS 2A23 2DB6 2IHX
1DXW 1JJD 1NYP 1TF3 1VA2 1WWD 1Z60 2A51 2F8B 2JM3
1E4U 1JN7 1OVX 1TM6 1VA3 1XF7 1ZE9 2AF2 2FEJ 2JNE
1EF4 1LPV 1P7M 1TOT 1VD4 1XJH 1ZFD 2AQC 2FK4 2ODX
1EXK 1LV3

Panel b: the dataset of 126 structural zinc proteins (X-ray structures)
1A2P 1NW2 1R1H 1U5K 1YW4 2AU3 2CYE 2FU5 2H6L 2OGW
1A7W 1OAO 1R2Z 1VJ0 1Z05 2AYD 2DGE 2FYG 2HCV 2OH3
1BTK 1OJ7 1R5Y 1W4R 1Z83 2AZ4 2DS5 2G84 2HJN 2OIK
1DVP 1ONW 1RUT 1WWR 1Z84 2B3J 2ESL 2GC0 2I13 2OMH
1ENR 1OQJ 1RYQ 1WY2 1ZED 2B4Y 2EV6 2GFE 2I9W 2OSO
1GL4 1P3J 1SW1 1X0T 1ZKP 2B8T 2F5N 2GFO 2IMR 2OU3
1H3N 1P5D 1T0B 1XC8 1ZSW 2B9D 2F5P 2GLZ 2IQJ 2OWA
1INN 1P6O 1T64 1XM8 1ZZM 2BOQ 2FE3 2GNR 2J21 2OX0
1J30 1PZW 1T9H 1XTM 2A1K 2C1D 2FE8 2GPY 2J6A 2P57
1JW9 1Q08 1TDZ 1XV2 2AP1 2C2U 2FEA 2GVI 2J9U 2PEB
1KWG 1Q1A 1TJL 1XWY 2APO 2CIH 2FPR 2GWG 2O6D 2UVL
1L1T 1Q74 1TQX 1YC5 2AQ2 2CJS 2FQP 2GWN 2O8J 4KMB
1MA3 1Q9U 1U0A 1YQD 2AS9 2CS7

Panel c: the dataset of 7 catalytic zinc proteins (NMR structures)
1DXW 1FFW 1XXE 1XYD 2AF2 2HDP 2JMO

Panel d: the dataset of 147 catalytic zinc proteins (X-ray structures)
1A4M 1GVF 1KO3 1P5X 1TAZ 1XOV 1Z1L 2BO0 2FR5 2HPO
1AMP 1H2B 1KQ9 1Q2O 1TBF 1XP2 1Z5R 2BZ1 2G3F 2HSI
1ATL 1HET 1LML 1Q7L 1TQW 1XRT 1ZDE 2C1I 2G64 2I0O
1C7K 1HKK 1LR5 1QH5 1U2W 1XRU 1ZKL 2CC0 2G7N 2IW0
1C8Y 1HP1 1LU0 1QIP 1UDV 1XVX 1ZY7 2CDC 2GFO 2J9A
1CA1 1IA9 1M65 1QWR 1UUF 1Y0Y 1ZZ1 2CFU 2GIV 2JG6
1D8W 1IE0 1M7J 1QWY 1V4P 1Y2K 2A21 2CKI 2GMW 2NX9
1ED9 1IM5 1MFM 1R55 1VHH 1Y7W 2A7M 2CKL 2GSO 2NXF
1EU3 1ITU 1MXG 1R5T 1VYK 1Y93 2A97 2CTB 2GU1 2O1Q
1EVL 1J79 1NTO 1S2Z 1W5Q 1Y9A 2AFW 2DDF 2H6F 2OB3
1F4T 1JD0 1OHT 1S4B 1WCZ 1YB0 2AIO 2DVT 2HBV 2OOT
1FRP 1JIW 1OI0 1SG0 1WN5 1YIX 2BC2 2F4M 2HC9 2ORW
1FUA 1JJT 1OJR 1SG6 1X8H 1YLK 2BGX 2FOU 2HEK 2OVY
1GKP 1K7H 1ONW 1SR9 1XEM 1YM3 2BIB 2FPQ 2HF1 8TLN
1GUD 1K9Z 1OYW 1T0A 1XOC 1YT3 2BNM

2. Materials and methods

The Protein Data Bank (PDB) [22] was queried for the keyword "ZN" with a sequence identity cut-off of ≤50%, cross-checked on the PISCES¹ server of Dunbrack [23], furnishing a total of 1321 structures. Restricting to high-resolution X-ray structures (≤2.0 Å resolution, <45 average B-factor, and <0.25 R-value) but including all NMR structures gave 382 PDB entries, corresponding to 228 structural and 154 catalytic zinc proteins. Among the structural proteins, 102 entries were NMR structures and 126 were X-ray structures, while among the 154 catalytic proteins, 7 entries were NMR structures and 147 were X-ray structures. Classification of the protein structures as structural or catalytic was made on the basis of the available descriptions in the specific PDB entries. The NMR structures were considered only for conformation in the zinc-coordination centers, while the X-ray structures were considered for both geometry and conformation. The overall dataset includes apo-enzymes, enzyme–inhibitor complexes, and mutants. The PDB codes of the proteins analyzed, their classification, and the technique of structure determination are summarized in Table 1.

[…] coordination, 31% in C5 coordination, and 11% in C6 coordination. The C4 coordination sites are always tetrahedral. The majority of the C5 coordination sites are trigonal bipyramidal, while the C6 coordination sites are always octahedral. Presumably electrostatic and steric interactions among the ligands define the coordination geometries. Deviations from ideal geometry occur for some of the following reasons:
(1) A ligand extraneously H-bonded, often in the secondary coordination sphere and sometimes with solvent.
(2) Bidentate coordination of Asp (D) and Glu (E) carboxylates via both oxygens; among 400 carboxyl ligands, 67 instances of bidentate arrangement were observed.
(3) Simultaneous ligation of backbone N and CO to Zn; this was observed in 3 cases.
(4) Multi-Zn sites involving Asp, Glu, or His (H) as bridging ligand between Zn atoms.

Fig. 1. Ideal geometries for 4-, 5-, and 6-coordinated zinc. Ax = axial, Eq = equatorial.

The structural Zn sites are generally tetrahedral, with all four ligands from the protein. In the catalytic Zn sites, water or inhibitor usually occupies the fourth ligation site. In structural zinc proteins, conservation of the length of the spacer between the amino-acid residues ligated with zinc is observed. The conservation is less stringent among the catalytic zinc sites, possibly due to the requirement of conformational flexibility for catalytic function. Short and long spacers generally link the protein ligands in both structural and catalytic sites. The shorter spacers, generally 2 to 7 residues, occur between the first and
second zinc ligand in both the structural and catalytic sites, and between the third and fourth zinc ligand in the structural proteins. A comparatively longer spacer of >20 residues occurs between the second and third zinc ligand in both the structural and catalytic sites (Tables S1–S4), similar to the observations of […]

The analysis, performed with in-house programs², involved searching the PDB coordinates for the amino-acid residues within a 3.0 Å cut-off of the zinc atom, calculating the bond lengths and angles of the atoms coordinated with zinc, and evaluating the sequence positions of the amino acids coordinated with zinc. The angle of the imidazole planes of coordinating histidines with the metal ion was also determined with an in-house program.
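The in-house programs are not reproduced in the text. As a rough illustration only, the three steps just described (selecting atoms within the 3.0 Å cut-off, computing zinc–ligand bond lengths and ligand–zinc–ligand angles, and reading off sequence positions) might be sketched in Python as follows. The function names, the atom tuple layout, and the example coordinates are all hypothetical; the example site is an idealized tetrahedron built for the purpose, not a PDB entry. The spacer is assumed here to be the count of residues lying between consecutive ligating residues.

```python
import math
from itertools import combinations

CUTOFF = 3.0  # Å; the zinc-ligand distance cut-off quoted in the text


def angle_deg(a, centre, b):
    """Return the angle a-centre-b in degrees."""
    v1 = [p - c for p, c in zip(a, centre)]
    v2 = [p - c for p, c in zip(b, centre)]
    cos_t = sum(x * y for x, y in zip(v1, v2)) / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp against rounding error before taking the arccosine.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def analyse_site(zinc, atoms):
    """atoms: list of (res_name, res_seq, atom_name, (x, y, z)) tuples (assumed layout)."""
    # Step 1: keep only atoms within the cut-off of the zinc atom.
    ligands = [a for a in atoms if math.dist(zinc, a[3]) <= CUTOFF]
    # Step 2: zinc-ligand bond lengths and all ligand-zinc-ligand angles.
    bonds = {a[:3]: round(math.dist(zinc, a[3]), 2) for a in ligands}
    angles = [round(angle_deg(a[3], zinc, b[3]), 1) for a, b in combinations(ligands, 2)]
    # Step 3: sequence positions of ligating residues and the spacers between them.
    # A set is used so a bidentate residue is counted once.
    positions = sorted({a[1] for a in ligands})
    spacers = [q - p - 1 for p, q in zip(positions, positions[1:])]
    return bonds, angles, spacers


# Hypothetical Cys2His2 site: four donor atoms 2.3 Å from zinc along
# tetrahedral directions, plus one distant atom that should be ignored.
s = 2.3 / math.sqrt(3)
zinc = (0.0, 0.0, 0.0)
atoms = [
    ("CYS", 5, "SG", (s, s, s)),
    ("CYS", 8, "SG", (s, -s, -s)),
    ("HIS", 30, "NE2", (-s, s, -s)),
    ("HIS", 35, "NE2", (-s, -s, s)),
    ("ALA", 50, "CB", (6.0, 0.0, 0.0)),
]
bonds, angles, spacers = analyse_site(zinc, atoms)
print(bonds)
print(angles)
print(spacers)
```

In this constructed example the four donor atoms are retained, every angle is close to the ideal tetrahedral value of about 109.5°, and the spacers come out as 2, 21, and 4 residues. Comparing computed angles against such ideal values is one simple way to quantify deviations from ideal geometry in real coordinates.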