<<

Subscriber access provided by - Access paid by the | UCSF Library Article Deamination of 6-Aminodeoxyfutalosine in Menaquinone Biosynthesis by Distantly Related Alissa Marie Goble, Rafael Toro, Xu Li, Argentina Ornelas, Hao Fan, Subramaniam Eswaramoorthy, Yury V. Patskovsky, Brandan Hillerich, Ronald D. Seidel, Andrej Sali, Brian K. Shoichet, Steven C. Almo, Subramanyam Swaminathan, Martin E. Tanner, and Frank Michael Raushel Biochemistry, Just Accepted Manuscript • DOI: 10.1021/bi400750a • Publication Date (Web): 23 Aug 2013 Downloaded from http://pubs.acs.org on August 28, 2013

Just Accepted

“Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

Biochemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties. Page 1 of 50 Biochemistry

1 2 3 4 Deamination of 6Aminodeoxyfutalosine in 5 6 7 8 Menaquinone Biosynthesis by Distantly Related 9 10 11 Enzymes 12 13 14 15 16 17 18 Alissa M. Goble §, Rafael Toro ø, Xu Li ∂, Argentina Ornelas §, Hao 19 20 ∆ϕη ‡ ø 21 Fan , Subramaniam Eswaramoorthy , Yury Patskovsky , Brandan 22 23 Hillerich ø, Ron Seidel ø, Andrej Sali ∆η, Brian K. Shoichet ϕη, Steven C. 24 25 ø ‡ ∂ 26 Almo , Subramanyam Swaminathan , Martin E. Tanner , 27 §* 28 and Frank M. Raushel 29 30 31 32 33 34 35 §Department of Chemistry, P.O. Box 30012, Texas A&M University, College Station, TX 36 37 ϕ η 38 778433012; Department of Pharmaceutical Chemistry, and California Institute for 39 40 Quantitative Biosciences, University of California, San Francisco, 1700 Fourth Street, San 41 Francisco, California 94158; ‡Biology Department, Brookhaven National Laboratory, P.O. Box 42 43 5000 Upton, New York 119735000; ∂Department of Chemistry, University of British 44 45 Columbia, 2036 Main Mall, Vancouver, British Columbia V6T 1Z1, Canada; øDepartment of 46 47 Biochemistry, Einstein College of Medicine, Bronx, NY. 48 49 50 51 52 53 54 55 56 57 58 1 59 60 ACS Paragon Plus Environment Biochemistry Page 2 of 50

1 2 3 ABSTRACT 4 5 6 Proteins of unknown function belonging to cog1816 and cog0402 were characterized. 7 8 Sav2595 from Steptomyces avermitilis MA4680, Acel0264 from Acidothermus cellulolyticus 9 10 11 11B, Nis0429 from Nitratiruptor sp. SB1552 and Dr0824 from Deinococcus radiodurans R1 12 13 were cloned, purified, and their profiles determined. These enzymes were previously 14 15 16 incorrectly annotated as adenosine deaminases or chlorohydrolases. It was shown here that 17 18 these enzymes actually deaminate 6aminodeoxyfutalosine. The deamination of 6 19 20 aminodeoxyfutalosine is part of an alternative menaquinone biosynthetic pathway that involves 21 22 23 the formation of futalosine. 6Aminodeoxyfutalosine is deaminated by these enzymes with 24 5 1 1 25 catalytic efficiencies greater than 10 M s , Km values of 0.9 to 6.0 M and kcat values of 1.2 to 26 27 8.6 s 1. Adenosine, 2’deoxyadenosine, thiomethyladenosine, and Sadenosylhomocysteine are 28 29 30 deaminated at least an order of magnitude slower than 6aminodeoxyfutalosine. The crystal 31 32 structure of Nis0429 was determined and the substrate, 6aminodeoxyfutalosine, was positioned 33 34 in the , based on the presence of adventitiously bound benzoic acid. In this model 35 36 37 Ser145 interacts with the carboxylate moiety of the substrate. The structure of Dr0824 was 38 39 also determined, but a collapsed active site pocket prevented docking of substrates. A 40 41 42 computational model of Sav2595 was built based on the crystal structure of adenosine 43 44 deaminase and substrates were docked. The model predicted a conserved arginine after βstrand 45 46 1 to be partially responsible for the substrate specificity of Sav2595. 47 48 49 50 51 52 53 54 55 56 57 58 2 59 60 ACS Paragon Plus Environment Page 3 of 50 Biochemistry

1 2 3 Menaquinone can be biosynthesized through either the shikimate pathway or the 4 5 6 futalosine pathway. In Escherichia coli , the wellstudied shikimate pathway utilizes seven 7 8 enzymes (menA to menG ) to convert chorismate into menaquinone (1). Recently, menaquinone 9 10 11 producing organisms were identified that lack menF , menD , menC , menE , and menB orthologs 12 13 (2, 3). These organisms, including Streptomyces coelicolor A3(2), Helicobacter pylori , 14 15 Campylobacter jejuni , and Acidothermus cellulolyticus , lack men gene homologs and were 16 17 18 suspected of employing a different pathway for the biosynthesis of menaquinone (1, 4). Using a 19 20 combination of bioinformatics and gene disruption experiments, the mqn genes involved in the 21 22 futalosine pathway were identified (Figure 1 ) (1). Mqn genes are found scattered throughout 23 24 25 the genomes, with the exception of A. cellulolyticus , which has two clusters of menaquinone 26 27 related genes (4). The smaller cluster contains MqnC and MqnB orthologs and the other cluster 28 29 is made up of several genes, including MqnA , another MqnC and an 30 31 32 ortholog (Acel0264) (Figure 2 ). 33 34 Adenosine deaminase is a member of the superfamily (AHS) of 35 36 37 enzymes: a superfamily first identified by Holm and Sander based on the similarities in the 38 39 threedimensional structures of phosphotriesterase, adenosine deaminase and (5). The 40 41 common fold within the AHS is a distorted (β/α) 8barrel with conserved metal binding residues 42 43 44 at the Cterminal ends of βstrands 1, 4, 5, 6, and 8. Enzymes in the AHS have either a 45 46 mononuclear or binuclear metal center that functions to activate a water molecule for 47 48 nucleophilic attack on amino acids, sugars, nucleic acids and organophosphate esters (6). The 49 50 51 AHS is composed of 24 clusters of orthologous groups (COG). Enzymes that are known to 52 53 deaminate aromatic bases are found in cog1001, cog0402 and cog1816. Members of cog1001 54 55 include the prototypical deaminase (ADE) and N6methyladenine deaminase (7). 56 57 58 3 59 60 ACS Paragon Plus Environment Biochemistry Page 4 of 50

1 2 3 Enzymes that deaminate guanine, cytosine, Sadenosylhomocysteine, thiomethyl adenosine, N 4 5 6 formiminoLglutamate, and 8oxoguanine are found in cog0402 (8). According to annotations 7 8 reported by NCBI, bacterial enzymes in cog1816 function in the deamination of adenosine. The 9 10 11 functional diversity of cog1816 has recently been expanded to include enzymes that deaminate 12 13 adenine and cytokinins (9, 10 ). The mechanism, structure and biological function of adenosine 14 15 deaminase enzymes from several organisms have been investigated (11-23 ). A sequence 16 17 70 18 similarity network for cog1816 is illustrated in Figure 3 at a BLAST Evalue of 1 x 10 , which 19 20 corresponds to a sequence identity within each group of ~30% (24 ). In general, members of a 21 22 group share a common function, and the substrate profiles differ between groups. Members of 23 24 25 group 5 of cog1816 are prototypical adenosine deaminases, including the adenosine deaminase 26 27 from E. coli K12. Group 3 of cog1816 includes adenine deaminases with a single divalent 28 29 cation and enzymes that catalyze the deamination of cytokinins. 30 31 32 The initial targets for this investigation were the functional annotation of Acel0264 from 33 34 A. cellulolyticus and Sav2595 from Streptomyces avermitilis MA4680. These enzymes belong 35 36 37 to groups 11 and 13 of cog1816, respectively. The substrate profile for these two enzymes has 38 39 been established and a threedimensional model has been proposed for how the best substrate, 40 41 6aminodeoxyfutalosine (AFL), binds in the active site. Based on the absence of men genes in 42 43 44 the genomes of Deinococcus radiodurans R1 and Nitratiruptor sp. SB1552, two distantly 45 46 related enzymes from cog0402, Dr0824 from D. radiodurans R1 and Nis0429 from N. sp. 47 48 SB1552, were identified as 6aminodeoxyfutalosine deaminases. The three dimensional 49 50 51 structures of Dr0824 and Nis0429 were determined by Xray crystallography. A sequence 52 53 similarity network of cog0402, which includes , cytosine deaminase and 54 55 other deaminases, is shown in Figure 4 . 56 57 58 4 59 60 ACS Paragon Plus Environment Page 5 of 50 Biochemistry

1 2 3 MATERIALS and METHODS 4 5 6 Materials. All chemicals were purchased from SigmaAldrich, unless otherwise stated. 7 8 6Aminodeoxyfutalosine (AFL) was synthesized as previously described (25 ). 1(6Amino9H 9 10 11 purin9yl)1deoxyNethylβDribofuranuronamide (NECA) was purchased from Ascent 12 13 Scientific. 14 15 Cloning and Purification of Sav2595 from Streptomyces avermitilis MA-4680 and 16 17 18 Dr0824 from Deinococcus radiodurans R1 . The genes for Sav2595 and Dr0824 were cloned 19 20 from the genomic DNA of Streptomyces avermitilis MA4680 and Deinococcus radiodurans 21 22 R1, respectively. The PCR products were amplified using the primer pairs 5’ 23 24 25 AGAGCCATGGTGACCGAGCACTTCGACGC 3’ and 5’ 26 27 AGAGAAGCTTCAGGGCGCGAGCCATGCC3’, and pair 5’ 28 29 GGGAATTACCATATGCGCTTTTCTGCCGTCAGCCG 3’ and 5’ 30 31 32 CCCAAGCTTTCATCACAAGTCGCGGCTCAACTCCCAACG 3’, respectively. The 33 34 restriction sites for Nco I and Hin dIII for Sav2595, and Nde I and Hin dIII for Dr0824 were 35 36 37 inserted into the forward and reverse primers, respectively. The PCR products were purified 38 39 with a PCR cleanup system (Qiagen), digested with their respective restriction enzymes and 40 41 ligated into a pET30a(+) vector, which was previously digested with the same two restriction 42 43 44 enzymes. The cloned gene fragments were sequenced to verify fidelity of the PCR 45 46 amplification. Dr0824 contains a Cterminal polyhistidine tag, whereas Sav2595 has an N 47 48 terminal polyhistidine tag. 49 50 51 Sav2595pET30a(+) and Dr0824pET30a(+) constructs were transformed into 52 53 BL21(DE3) cells (Novagen). A single colony was used to inoculate a 5 mL overnight culture of 54 55 LB medium containing 50 g/mL kanamycin. Each overnight culture was used to inoculate 1.0 56 57 58 5 59 60 ACS Paragon Plus Environment Biochemistry Page 6 of 50

1 2 3 4 L of LB medium containing 50 g/mL kanamycin. One liter cultures were grown at 37 °C, 5 6 supplemented with 1.0 mM ZnCl 2 and induced with 0.5 mM isopropyl Dthiogalactopyranoside 7 8 (IPTG) when an OD 600 of 0.6 was reached. Cultures of Dr0824 were induced with 1.0 mM 9 10 11 IPTG. At the time of induction, the temperature was lowered to 20 °C and allowed to shake for 12 13 18 hours before the cells were harvested by centrifugation at 8,000 rpm for 10 minutes. The 14 15 16 cells were resuspended in 20 mM HEPES buffer, pH 7.5, containing 20 mM imidazole, 120 17 18 mM ammonium sulfate, 10% glycerol, 0.1 mg/mL phenylmethanesulfonyl fluoride (PMSF) and 19 20 0.5 mg/mL deoxyribonuclease I from bovine pancreas. Cells were lysed by sonication and the 21 22 23 soluble protein was separated from the cell debris by centrifugation. Soluble protein from 24 25 cultures of Sav2595 was loaded onto a HisTrap column (GE Healthcare) and eluted with a 26 27 linear gradient of a solution containing 20 mM HEPES, pH 7.5, imidazole, 500 mM imidazole, 28 29 30 and 500 mM ammonium sulfate. Soluble protein from cultures of Dr0824 was partitioned by 31 32 ammonium sulfate precipitation at 30, 50 and 70% saturation. The precipitated protein (3050% 33 34 ammonium sulfate saturation) was resuspended in 50 mM HEPES buffer (pH 7.7) and applied 35 36 37 to a High Load 26/60 Superdex 200 preparative grade gel filtration column (GE Healthcare). 38 39 Fractions containing protein were pooled and passed through an anion exchange column. 40 41 42 Cloning and Purification of Acel0264 from Acidothermus cellulolyticus . The gene 43 44 for Acel0264 was amplified from Acidothermus cellulolyticus Strain 11B genomic DNA using 45 46 5’TTAAGAAGGAGATATACCATGGTGATGACACCCCACGATCCCGTC 3’ as the 47 48 49 forward primer and 5’ 50 51 GATTGGAAGTAGAGGTTCTCTGCCAGCGACTCTCCCGCCGCTG 3’ as the reverse 52 53 primer. PCR was performed using KOD Hot Start DNA Polymerse (Novagen). The amplified 54 55 56 57 58 6 59 60 ACS Paragon Plus Environment Page 7 of 50 Biochemistry

1 2 3 fragment was cloned into the Cterminal TEV cleavable Cterminal StrepII6xHistag 4 5 6 containing vector, CHS23, by ligationindependent cloning (26 ). 7 8 The gene for Acel0264 inserted in a CHS23 vector was transformed into Rosetta2 (DE3) 9 10 11 cells (Novagen). A single colony was used to inoculate a 5 mL overnight culture of LB medium 12 13 containing 50 g/mL kanamycin and 25 g/mL chloramphenicol. Each overnight culture was 14 15 used to inoculate 1.0 L of LB medium containing 50 g/mL kanamycin and 25 g/mL 16 17 18 chloramphenicol. One liter cultures were grown at 37 °C. At an OD 600 of 0.6, 1.0 mM ZnCl 2 19 20 was added, followed by induction with 50 M IPTG. At the time of induction, the temperature 21 22 23 was lowered to 20 °C and allowed to shake for 18 hours before the cells were harvested by 24 25 centrifugation at 8,000 rpm for 10 minutes. The cells were resuspended in 20 mM HEPES 26 27 buffer, pH 6.5, with 120 mM ammonium sulfate and 20 mM imidazole. Cells were lysed by 28 29 30 sonication in the presence of 0.1 mg/mL PMSF and 0.5 mg/mL deoxyribonuclease I from 31 32 bovine pancreas. Soluble protein was separated from the cell debris by centrifugation. The 33 34 protein was then loaded onto a HisTrap HP column (GE Healthcare) and eluted with a gradient 35 36 37 of buffer containing 20 mM HEPES, 500 mM imidazole and 500 mM ammonium sulfate, pH 38 39 6.5. 40 41 42 Cloning and Purification of Nis0429 from Nitratiruptor sp . SB155-2. The gene 43 44 encoding Nis0429 was chemically synthesized and cloned into the plasmid pUC57 (GenScript). 45 46 Expression constructs were generated by PCR amplification using the 5’ 47 48 49 TTAAGAAGGAGATATACCATGCGTATCATTAAGCCGTTCGCC3’ as the forward 50 51 primer and the 5’ GATTGGAAGTAGAGGTTCTCTGCTTCGCGGACGTGTTCTTCGCC3’ 52 53 as the reverse primer. PCR was performed using KOD Hot Start DNA Polymerase (Novagen). 54 55 56 The conditions were: 2 minutes at 95 °C, followed by 40 cycles of 30 seconds at 95 °C, 30 57 58 7 59 60 ACS Paragon Plus Environment Biochemistry Page 8 of 50

1 2 3 seconds at 66 °C, and 30 seconds at 72 °C. The amplified fragment was cloned into the C 4 5 6 terminal TEV cleavable Cterminal StrepII6xHistag containing vector, CHS30, by ligation 7 8 independent cloning (26 ). 9 10 11 Expression vectors were transformed into E. coli BL21(DE3) containing the pRIL 12 13 plasmid (Stratagene) and used to inoculate a 25 mL M9 minimal media culture containing 25 14 15 g/mL kanamycin and 34 g/mL chloramphenicol. The culture was allowed to grow overnight 16 17 18 at 37 °C in a shaking incubator. A 10 mL aliquot of the overnight culture was used to inoculate 19 20 2 L of M9 SeMET highyield medium (Shanghai Medicilon) supplemented with 150 mM 2, 2’ 21 22 bipyridyl, 1.0 mM ZnCl 2, and 1.0 mM MnCl 2. The culture was placed in a LEX48 airlift 23 24 25 fermenter and incubated at 37 °C until the OD600 reached 1.2. Cells were induced by addition 26 27 of 0.5 mM IPTG and allowed to grow for 16 hours at 22 °C overnight. The culture was 28 29 harvested and pelleted by centrifugation. 30 31 32 Cells were resuspended in lysis buffer containing 20 mM HEPES, pH 7.5, 500 mM 33 34 NaCl, 20 mM imidazole, and 10% glycerol and lysed by sonication. Lysates were clarified by 35 36 37 centrifugation at 35,000g for 30 minutes. Proteins were purified on an AKTAxpress FPLC (GE 38 39 Healthcare). Clarified lysates were loaded onto a 5 mL StrepTactin column (IBA), washed 40 41 with 5 column volumes of lysis buffer, and eluted with 20 mM HEPES, pH 7.5, 500 mM NaCl, 42 43 44 20 mM imidazole, 10% glycerol, and 2.5 mM desthiobiotin. The eluent was loaded onto a 1 45 46 mL Histrap HP column (GE Healthcare), washed with 10 column volumes of lysis buffer, and 47 48 eluted with 20 mM HEPES, pH 7.5, 500 mM NaCl, 500 mM imidazole, and 10% glycerol. The 49 50 51 purified sample was applied to a HiLoad S200 16/60 HR gel filtration column which was 52 53 equilibrated in 20 mM HEPES, pH7.5, 150 mM NaCl, 10% glycerol, and 5 mM DTT. Peak 54 55 fractions were collected and protein was analyzed by SDSPAGE. Samples were concentrated 56 57 58 8 59 60 ACS Paragon Plus Environment Page 9 of 50 Biochemistry

1 2 3 to 18 mg/mL using Amicon Ultra centrifugal filters (Millipore), snap frozen in liquid nitrogen, 4 5 6 and stored at 80 °C. 7 8 Initial Activity Screens . Sav2595, Acel0264, Dr0824, and Nis0429 (5 M) were 9 10 11 incubated for 16 hours with AFL, adenine, adenosine, 2’deoxyadenosine, 3’deoxyadenosine, 12 13 5’deoxyadenosine, 2’5’dideoxyadenosine, AMP, ADP, ATP, Sadenosylhomocysteine, S 14 15 adenosyl methionine, N6methyl2’deoxyadenosine, 5’thiomethyladenosine, 16 17 18 formaminopyrimidineadenine, and NECA. The substrate concentration was 80 M. 19 20 Enzymatic activity was monitored by changes in absorbance between 240300 nm with a 21 22 SPECTRAmax Plus spectrophotometer (Molecular Devices). 23 24 25 Measurement of Kinetic Constants . Assays were conducted with substrate 26 27 concentrations of 4200 M. The deamination of AFL, NECA, adenosine, AMP, 2’ 28 29 deoxyadenosine, 3’deoxyadenosine, 5’deoxyadenosine, SAH and 5’methylthioadenosine was 30 31 32 monitored by following the decrease in absorbance at 263 nm. Differential extinction 33 34 coefficients were calculated by subtracting the extinction coefficient of the from that of 35 36 37 the substrate for AFL, adenosine, NECA, 5’methylthioadenosine, 5’deoxyadenosine, SAH and 38 1 1 39 AMP (ε 263 = 6900 M cm ). 40 41 Metal Analysis . Metal content of the proteins was determined by ICPMS (27 ). Protein 42 43 44 samples for ICPMS were digested with HNO 3 by refluxing for ~45 minutes to prevent protein 45 46 precipitation during the measurement. Protein concentration was adjusted to ~1.0 M with 1% 47 48 49 (v/v) HNO 3. 50 51 Data Analysis . Sequence alignments were created using Clustal (28 ). Steady state 52 53 kinetic data were analyzed using Softmax Pro, version 5.4 (Molecular Devices). Kinetic 54 55 56 parameters were determined by fitting the data to equation 1 using the nonlinear leastsquares 57 58 9 59 60 ACS Paragon Plus Environment Biochemistry Page 10 of 50

1 2 3 fitting program in SigmaPlot 11.0 (Systat Software Inc.). In this equation, A is the substrate 4 5 6 concentration, Km is the Michaelis constant, v is the velocity of the reaction and kcat is the 7 8 turnover number. 9 10 11 v / Et = kcat A / (Km + A) (1) 12 13 14 15 Homology Modeling and Ligand Docking . The Xray structure of adenosine 16 17 deaminase from Bos taurus with a transitionstate analogue bound in the active site (PDB ID: 18 19 20 1KRM, 26% and 25% sequence identity to Acel0264 and Sav2595, respectively) was used as 21 22 the template to build homology models for Acel0264 and Sav2595. The sequence alignment 23 24 25 was computed with Multiple Sequence Comparison by LogExpectation (MUSCLE) (29 ) 26 27 followed by generation of 500 homology models using the standard “automodel” class in 28 29 MODELLER 9v12 (30 ). The model with the best DOPE (31 ) score was selected and this initial 30 31 32 model was refined with the “loopmodel” class in MODELLER and the side chain prediction 33 34 protocol in PLOP (32 ). After refinement, the two homology models of Acel0264 and Sav2595 35 36 were targeted by docking screens using DOCK 3.6 (33, 34 ) with a virtual library (35-37 ) that 37 38 39 contains NECA, AFL, as well as 57680 different highenergy intermediates (HEI) of 6440 40 41 KEGG (Kyoto Encyclopedia of Genes and Genomes) molecules (38, 39 ). In docking screens, a 42 43 computed pose was selected if the O – moiety of the HEI portion of the molecule was found 44 45 46 within 4 Å of the metal ion in the active site. The selected molecules were ranked by an energy 47 48 function consisting of proteinligand van der Waals interactions, proteinligand electrostatic 49 50 51 interactions, and a correction for ligand desolvation. The top 500 ranked molecules were 52 53 inspected visually to ensure the compatibility of the pose with the amidohydrolase reaction 54 55 mechanism. 56 57 58 10 59 60 ACS Paragon Plus Environment Page 11 of 50 Biochemistry

1 2 3 Mutation of Sav2595 and Nis0429 . Singlesite mutations of Sav2595 were constructed 4 5 6 using the standard QuikChange PCR protocol according to the manufacturer’s instructions. The 7 8 mutants, R87A and R87M, were expressed and purified using the protocol established for wild 9 10 11 type Sav2595. A singlesite mutation was made to Nis0429 through standard QuikChange PCR 12 13 protocols to create the S145A mutant. 14 15 Structure Determination of Dr0824 . Crystals were grown by mixing 1 L protein with 16 17 18 1 L reservoir solution via the sitting drop vapor diffusion method. Protein was in HEPES 19 20 buffer at pH 7.5 at a concentration of approximately 14 mg/mL. Reservoir solution contained 21 22 30% PEG MME500, 0.1 M BisTris, pH 6.5, and 50 mM calcium chloride. Harvested crystals 23 24 25 were mounted in cryoloops and immediately flashcooled in liquid nitrogen. Diffraction from 26 27 these crystals was consistent with the monoclinic space group C2 and data extending to 1.78 Å 28 29 resolution collected on the X12C beamline of the National Synchrotron Light Source (NSLS), 30 31 32 Brookhaven National Laboratory. Data were processed, integrated, and scaled using HKL2000 33 34 (40 ); data collection parameters are given in Table 1. The structure was determined by single 35 36 37 anomalous dispersion (SAD) using SeMet substituted crystals. SeMet crystals were obtained 38 39 under the same conditions that yielded native crystals. Se atoms were located using SOLVE 40 41 and the phases were refined with SHARP (41, 42 ). The initial protein model was built by an 42 43 44 automated procedure using ARP/wARP software (43 ). The model was completed manually 45 46 using ‘O’ and water molecules added based on the Fourier difference map (44 ). The structure 47 48 was refined with CNS 1.1 and the refinement statistics are presented in Table 1 (45 ). The final 49 50 51 model contains a zinc ion in addition to protein atoms and 170 water molecules (PDB ID: 52 53 2IMR). 54 55 56 57 58 11 59 60 ACS Paragon Plus Environment Biochemistry Page 12 of 50

1 2 3 Structure Determination of Nis0429 . SeMet derivatized protein was crystallized by 4 5 6 the sittingdrop vapor diffusion method, as described below. 0.5 L of the protein solution (18 7 8 mg/mL) was mixed with an equal volume of a precipitant solution and equilibrated at room 9 10 11 temperature (294 K) against the same precipitant solution in clear tapesealed 96well 12 13 INTELLIplates. Crystallization was performed using a TECAN crystallization robot. A single 14 15 crystal appeared approximately 11 months after setting up crystallization drops in 0.1 M 16 17 18 MES/NaOH, pH 6.0, 1 M potassium sodium tartrate. This crystal was flashcooled in liquid 19 20 nitrogen prior to data collection. 21 22 Two complete Xray diffraction data sets were collected from the same crystal at 100 K 23 24 25 on the beamline X29A at NSLS using wavelengths of 1.075 Å (the resolution cutoff was 1.35 26 27 Å) and 0.979 Å (the resolution cutoff was 1.5 Å), respectively. All data were processed and 28 29 scaled with HKL2000 (40 ). The crystal structure was determined by seleniumSAD applying 30 31 32 anomalous Xray diffraction data (collected at a wavelength of 0.979 Å) for phasing and 33 34 SHELXD software (the CCP4 program package suite (46, 47 ). The structure was further 35 36 37 refined using a higher resolution “native” data (resolution 1.35 Å) and the program REFMAC 38 39 (48 ). The model was rebuilt and fixed manually using COOT visualization and refinement 40 41 software (49 ). The data collection and refinement statistics are listed in Table 1. The 42 43 44 coordinates and structure factors were deposited to the Protein Data Bank (PDB ID: 3V7P). 45 46 Structure Determination of the S145A mutant of Nis0429 . The purified S145A 47 48 protein was concentrated to about 38 mg/mL and crystallized as described above for the wild 49 50 51 type Nis0429 using a PHOENIX crystallization platform and four different MCSG 52 53 crystallization screens. Protein crystals appeared in two to three weeks. The crystals were 54 55 directly flash frozen in liquid nitrogen. The 1.08 Å resolution Xray diffraction data were 56 57 58 12 59 60 ACS Paragon Plus Environment Page 13 of 50 Biochemistry

1 2 3 collected on the beamline X29A (NSLS) at a wavelength of 1.075 Å and a temperature of 100 K. 4 5 6 The data were processed and scaled with HKL2000 ( 40 ). The crystal of the S145S mutant of 7 8 Nis0429 was essentially isomorphous to that of the wildtype protein (PDB ID: 3V7P), and thus 9 10 11 the coordinates of the latter were used to build and subsequently refine the structure of the 12 13 mutant using the program REFMAC ( 48 ) and a COOT visualization and refinement software 14 15 (49 ). The final refinement statistics altogether with data collection parameters are listed in Table 16 17 18 1. The coordinates and structure factors were deposited in the Protein Data Bank (PDB ID: 19 20 4M51). 21 22 23 24 25 RESULTS 26 27 Protein Isolation . The enzymes studied in this investigation belong to groups 11 and 13 28 29 of cog1816 and groups 7 and 41 of cog0402 (Figures 3 and 4). Sav2595, Acel0264, Dr0824, 30 31 32 and Nis0429 were expressed in E. coli and purified to homogeneity. Despite supplementing the 33 34 growth medium with 1.0 mM ZnCl 2, the metal content of all proteins was low. Sav2595 35 36 2+ 37 contained an average of 0.1 equivalent of Zn per subunit. Acel0264 contained an average of 38 2+ 2+ 39 0.4 equivalent of Zn and 0.1 equivalent of Ni per monomer. Both proteins were incubated 40 41 overnight with ZnCl 2, but no increase in catalytic activity was observed after incubation. 42 43 2+ 2+ 44 Dr0824 contained 0.8 equivalents of Mn . Nis0429 contained 0.5 equivalents of Zn and 0.1 45 46 equivalent each of Fe 2+ and Mn 2+ . 47 48 Substrate Specificity . The substrate profiles of Sav2595, Acel0264, Dr0824, and 49 50 51 Nis0429 were initially determined by monitoring changes in absorbance between 240300 nm 52 53 using a small library of modified adenosine derivatives. The best substrate for these four 54 55 enzymes is AFL. Less efficient substrates are NECA, adenosine, 5’methylthioadenosine, S 56 57 58 13 59 60 ACS Paragon Plus Environment Biochemistry Page 14 of 50

1 2 3 adenosylhomocysteine, 2’deoxyadenosine, 3’deoxyadenosine, 5’deoxyadenosine and AMP. 4 5 6 The kinetic constants are presented in Table 2. The values of kcat /Km with AFL for Sav2595, 7 8 Acel0264, Dr0824 and Nis0429 are greater than 10 5 M1 s1. 9 10 11 Mutational Analysis . Singlesite mutants of Sav2595 and Nis0429 were constructed to 12 13 help identify residues that may interact with the side chain carboxylate of AFL. The R87A and 14 15 R87M variants of Sav2595 were expressed in E. coli and subsequently purified. The S145A 16 17 18 mutant of Nis0429 was constructed, expressed in E. coli, and purified. The mutant proteins 19 20 were assayed, and the kinetic constants for the deamination of AFL and adenosine were 21 22 determined. The results for the Sav2595 mutants are presented in Table 3. The Nis0429 23 24 1 5 1 1 25 S145A mutant has a kcat of 4.5 s , a KM of 12 M and a kcat /KM of 3.8 x 10 M s . 26 27 Structure of Nis0429 . The crystal structure of Nis0429 (PDB ID: 3V7P) was 28 29 determined at 1.35 Å resolution, revealing a slightly distorted TIMbarrel domain with a metal 30 31 32 and a much smaller βbarrel domain positioned between two short αhelices. The 33 34 latter is comprised of both Nterminal (residues 154) and Cterminal (starting around residue 35 36 37 354) sequences. The likely biological assembly of Nis0429 is a C2symmetrical homodimer 38 39 (Figure 5), in which both domains are involved in formation of the dimer interface and 40 41 maintenance of the active site architecture. Besides protein atoms and water molecules, the 42 43 44 refined structure contains a few tightly bound ligands (listed in Table 1). The metal is 45 46 coordinated to His61, His62, His206 and water 608, which was modeled and refined as Fe 2+ 47 48 on the basis of coordination geometry and crystallographic refinement ( Figure 6). However, the 49 50 51 possibility exists that there is a mixture of metals in the active site, although refinement 52 53 parameters and coordination geometry are more consistent with the presence of iron and thus the 54 55 metal was modeled and refined as such. The difference between the metal analysis of the 56 57 58 14 59 60 ACS Paragon Plus Environment Page 15 of 50 Biochemistry

1 2 3 purified protein and the metal in the active site may be related to the crystallization conditions. 4 5 6 In addition to two small unknown ligands (refined as formic acid and glycine) inside the 7 8 active site, strong electron density has revealed the presence of a larger molecule that was 9 10 11 modeled and refined as benzoic acid (Figure 6). Based on the location of this particular ligand 12 13 and using the coordinates of previously reported structures of adenosine deaminases, one AFL 14 15 molecule was manually modeled into the active site of Nis0429. In the crystal structure, the 16 17 18 carboxylic acid moiety of benzoic acid is coordinated by the side chain hydroxyl of Ser145 19 20 from βstrand 3 and a water molecule. The same interactions can be mimicked by the 5’side 21 22 chain of AFL (Figure 7). The analogous residue position in all members of group 7 from 23 24 25 cog0402 is either serine or threonine. Comparison of Nis0429 (PDB ID: 3V7P) with structures 26 27 of adenosine deaminases revealed striking similarities between their highly conserved (>90%) 28 29 metal and nucleoside binding sites (magenta, Figure 8). By contrast, the hydrophobic portion 30 31 32 of the binding site is represented by mainly nonpolar residues specific for Nis0429 and its 33 34 closest homologues (yellow, Figure 8). 35 36 37 Structure of S145 Mutant of Nis0429 . The crystals of wildtype Nis0429 and the 38 39 S145A mutant were essentially isomorphous ( Table 1 ) and their structures are very similar with 40 41 one important difference: the orientation of the benzoic acid ligand bound inside the active site 42 43 44 next to a residue 145 ( Figure 9 ). In the wildtype protein the carboxyl moiety of the ligand is 45 46 hydrogenbonded to the Ser145 sidechain hydroxyl group and a Gly144 amide nitrogen, 47 48 whereas in the mutant structure the same group is hydrogenbonded to the side chain from Arg 49 50 51 89. Another interesting feature of the Ser145 mutant is the presence of the metal ion (refined as 52 53 Fe 2+ ) in the active site. The high resolution structure revealed a slightly distorted tetrahedral 54 55 metal ion coordination with varied distances between bonded atoms. The distances are as 56 57 58 15 59 60 ACS Paragon Plus Environment Biochemistry Page 16 of 50

1 2 3 follows: 1.63 Å (Fe water A962); 1.96 Å (Fe His61); 2.12 Å (Fe His63) and 2.33 Å (Fe – 4 5 6 His206). The side chain of Asp306 is positioned in two alternate conformations, one of which 7 8 may support a weak coordination bond to the iron at a distance of 2.52 Å. 9 10 11 Structure of Dr0824 . The crystal structure of Dr0824 was determined by the SAD 12 13 method and refined to 1.78 Å resolution ( Table 1). Dr0824 consists of two domains, a distorted 14 15 TIMbarrel domain and a small domain with a βsheet flanked by αhelices on both sides. The 16 17 18 small domain is made up of both the Nterminal (34 through 91) and Cterminal (399 through 19 20 413) residues. Residues 1 – 33, 47 – 49, and 395 – 396 have poor or undefined electron density. 21 22 Missing residues 47 – 49 and 395 – 396 are located at the interface of the intramolecular 23 24 25 domains. 26 27 A metal ion is located at the Cterminal side of the βsheet of the distorted TIMbarrel 28 29 domain, which is the catalytic domain. The metal ion is coordinated with His97 and His99 of 30 31 32 βstrand 1, His238 of βstrand 6, and a water molecule (O554). Water molecule O554 also 33 34 forms a hydrogen bond with OD1 of Asp352 and is expected to be the nucleophilic center. 35 36 37 Both N and Cterminal sides of the distorted βbarrel of this catalytic domain are covered by α 38 39 helices and the active site is closed from outside interactions. 40 41 The crystallographic data support the conclusion that the biological assembly for this 42 43 44 protein is a monomer rather than a homodimer. Excluding the unique Nterminal sequence 45 46 fragment (~50 residues), the Dr0824 structure shows a high degree of similarity to Nis0429 47 48 (Figure 10 ) with RMSD between Cα atoms along the aligned length (residues 54274) of about 49 50 51 2.1 Å. The only large difference between these two structures is the conformation of the two 52 53 segments near the active site pocket (residues 102128 and 253283, colored blue in the 2IMR 54 55 56 57 58 16 59 60 ACS Paragon Plus Environment Page 17 of 50 Biochemistry

1 2 3 structure in Figure 10 . The observed differences may be the result of crystal packing, loops 4 5 6 flexibility, or the presence of ligands bound in the substratebinding site of Nis0429. 7 8 Docking Poses of AFL in Sav2595 and Acel0264 . In docking screens, the chiral 9 10 11 tetrahedral intermediate for the deamination reaction that was generated by the attack of 12 13 hydroxide on the re face of the adenine moiety of AFL was ranked 51 and 30 among the 57,680 14 15 highenergy intermediates (HEI) in the virtual substrate library for Acel0264 and Sav2595, 16 17 18 respectively. In the modeled active site of Sav2595 from cog1816, the metal ion is coordinated 19 20 by three histidine residues (His32, His34, and His215) and one aspartate residue (Asp296). 21 22 This catalytic machinery is conserved in Dr0824 and Nis0429 from cog0402. The modeled 23 24 25 active site of Sav2595 shows the two hydroxyl groups from the ribose moiety of AFL forming 26 27 hydrogen bonds with the side chain carboxylate group from Asp154. This interaction is also 28 29 identified in the docking of AFL to Nis0429 (Glu141). The N3 nitrogen from the adenine 30 31 32 moiety of AFL forms a hydrogen bond with the backbone amine group from Gly188, but this 33 34 interaction is replaced by a histidine residue in Nis0429 (His179). The AFL specificity of 35 36 37 Sav2595 can be explained by the presence of an arginine residue (Arg87) that interacts with the 38 39 carboxylate group in the 3ketobenzoic acid moiety of AFL ( Figure 11). In Nis0429, Ser145 40 41 is predicted to hydrogen bond to the 3ketobenzoic acid moiety (Figure 8). 42 43 44 45 46 DISCUSSION 47 48 Utilization of Bioinformatics, Structural Biology, and Molecular Enzymology to 49 50 51 Determine Substrate Specificity . A group of bacterial deaminases from cog1816 and cog0402 52 53 within the amidohydrolase superfamily of enzymes has been characterized. These enzymes are 54 55 currently incorrectly annotated as adenosine deaminases and chlorohydrolases. The most likely 56 57 58 17 59 60 ACS Paragon Plus Environment Biochemistry Page 18 of 50

1 2 3 physiological substrate is 6aminodeoxyfutalosine (AFL). AFL is an intermediate in the 4 5 6 biosynthesis of menaquinone in certain bacteria that utilize the futalosine pathway (4). Further 7 8 evidence supporting a role in the futalosine pathway is the location of Acel0264 in a cluster of 9 10 11 genes related to the biosynthesis of menaquinone ( Figure 2). The gene for Acel0264 is flanked 12 13 by the genes for Acel0261 and Acel0263. These two genes are orthologs to MqnA and MqnC , 14 15 respectively. Other members of group 11 from cog1816 have only UbiE , a methyltransferase in 16 17 18 menaquinone biosynthesis, and cog-FixC , a geranylgeranyl reductase, as recurring neighbors. 19 20 Members of group 13 of cog1816 and group s 7 and 41 of cog0402 show no genomic hints 21 22 related to menaquinone biosynthesis, except for the lack of men gene homologs and the 23 24 25 presence of mqn homologs. 26 27 Sequence Comparison and Structural Analysis . The four enzymes in group 11 of 28 29 cog1816 share sequence identity to one another of greater than 50%, but are less than 30% 30 31 32 identical to the wellcharacterized adenosine deaminase from E. coli (locus tag: b1623) from 33 34 group 5 of cog1816. The same degree of similarity is observed between members of group 13. 35 36 37 All of the residues that coordinate the divalent metal ion in the active site are conserved among 38 39 the group 5, 11 and 13 enzymes, which include the HxH motif following βstrand 1, the 40 41 histidine at the Cterminus of βstrand 5 and the aspartate at the Cterminus of βstrand 8. 42 43 44 Another mutually conserved aspartate following βstrand 8 is responsible for hydrogen bonding 45 46 to N7 of the adenine ring. Also conserved among these deaminases from cog1816 is the 47 48 glutamate from the HxxE motif at the Cterminus of βstrand 5. A functionally important and 49 50 51 conserved histidine at the Cterminus of βstrand 6 forms a hydrogen bond with the putative 52 53 hydrolytic water and may participate in proton transfer reactions. 54 55 56 57 58 18 59 60 ACS Paragon Plus Environment Page 19 of 50 Biochemistry

1 2 3 Unlike other members of cog1816, members of group 11 and 13 have a conserved 4 5 6 arginine following βstrand 1. Using adenosine deaminase (PDB ID: 1KRM) as a structural 7 8 template, a model was constructed in an attempt to understand the substrate specificity of 9 10 11 Acel0264 and Sav2595. An arginine residue was identified that could possibly recognize the 12 13 carboxylate group of the 5’side chain of AFL. The predicted location of Arg87 in Sav2595 is 14 15 on a turn between two αhelices after βstrand 1. This arginine was mutated to alanine or 16 17 18 methionine in Sav2595, resulting in slower rates for the deamination of AFL, with essentially 19 20 no effect on K M (Table 3). The R87A and R87M mutants could be saturated with adenosine, 21 22 whereas the wildtype was not. From these results, we propose that Arg87 is likely 23 24 25 near the substrate binding site (Figure 11) because changing this residue leads to the loss of 26 27 substrate specificity. The remainder of the large substrate binding pocket is made up of 28 29 hydrophobic residues and an aspartate or glutamate to interact with the 2’hydroxyl group. 30 31 32 Members of group 7 of cog0402 share a sequence identity of 40% or more within the 33 34 group, but much less than 30% with atrazine chlorohydrolase, and less than 25% compared to 35 36 37 guanine deaminases from group 2. Dr0824 has a sequence identity of 25% or less with 38 39 members of group 7 of cog0402, and has very low sequence similarity to wellcharacterized 40 41 members of cog0402 and groups 11 and 13 of cog1816. Active site residues involved in metal 42 43 44 binding are conserved among group 7 and 41 members, including an HxH motif after βstrand 45 46 1, another histidine after βstrand 5 and an aspartate after βstrand 8. A histidine after βstrand 47 48 6 is associated with binding a hydrolytic water molecule. A catalytically essential glutamate is 49 50 51 conserved after βstrand 5. Additional shared substrate recognition elements include a 52 53 glutamate after βstrand 1 involved in binding of the 3’hydroxyl group of the ribose moiety and 54 55 56 57 58 19 59 60 ACS Paragon Plus Environment Biochemistry Page 20 of 50

1 2 3 a histidine after βstrand 4 responsible for hydrogen bonding to N3 of the purine ring system of 4 5 6 a nucleoside substrate. 7 8 The active site structure of Nis0429 (PDB ID: 3V7P) is shown in Figure 6 and the 9 10 11 complete structure is shown in Figure 7 with AFL manually docked in the active site. The 12 13 placement of AFL in the active site of Nis0429 was guided by the position of benzoic acid in 14 15 the structure of this protein and the binding of adenosine in the active site of adenosine 16 17 18 deaminases. This binding mode predicts an interaction with Ser145 after βstrand 3 to bind the 19 20 side chain of AFL (Figure 8 ). A serine or threonine residue is conserved in this position in all 21 22 members of group 7 of cog0402, but not in group 41 or groups 11 and 13 of cog1816. It is 23 24 25 noteworthy that the S145A mutation of Nis0429 has resulted in more than a 10fold higher KM 26 27 value than that of the wildtype enzyme. These data support the idea that Ser145 is directly 28 29 involved in binding of AFL. 30 31 32 We have demonstrated that Acel0264, Sav2595, Nis0429 and Dr0824 are not adenosine 33 34 deaminases or chlorohydrolases, as currently annotated. Based on sequence similarity and 35 36 37 conservation of putative essential residues, all known members of groups 11 and 13 of cog1816 38 39 and groups 7 and 41 of cog0402 are probable 6aminodeoxyfutalosine deaminases. It is quite 40 41 likely that these enzymes evolved from distinct deaminases, which may explain the variation in 42 43 44 observed enzyme architecture. On the basis of our data, we propose a list of proteins believed 45 46 to be 6aminodeoxyfutalosine deaminase enzymes (Table 4 ). The presence of an AFL 47 48 deaminase in certain organisms is surprising to some degree because Campylobacter jejuni has 49 50 51 been reported to have a methylthioadenosine nucleosidase (MTAN) that will hydrolyze AFL 52 53 directly to the MqnB product ( Figure 1 ) but will not hydrolyze futalosine (25 ). C. jejuni 54 55 MTAN is the only close relative for futalosine (MqnB) activity in the genome. It is 56 57 58 20 59 60 ACS Paragon Plus Environment Page 21 of 50 Biochemistry

1 2 3 unclear why organisms would have an AFL deaminase, if they do not also have a futalosine 4 5 6 hydrolase. Perhaps the AFL deaminase gene is a remnant of the original futalosine pathway in 7 8 these organisms. 9 10 11 Comparison to Adenosine Deaminase from cog1816 . The prototypical adeonosine 12 13 deaminase from E. coli (locus tag: b1623) is also a member of the amidohydrolase superfamily 14 15 and cog1816. This enzyme has a mononuclear metal center and will deaminate adenosine, N6 16 17 18 methyladenosine, and 2’deoxyadenosine as substrates with comparable rate constants. With 19 20 the high conservation of metalbinding and catalytically essential residues between Acel0264, 21 22 Sav2595, and b1623, it is very likely that these enzymes will employ the same mechanism for 23 24 25 deamination. Dr0824 and Nis0429 have a slightly modified nucleoside binding site compared 26 27 to b1623, but the structures and the biochemical data clearly support the notion that they 28 29 function through an activated water molecule to facilitate the deamination reaction. 30 31 32 Acidothermus celluloyticus 11B has a protein from cog0402 (Acel1042) of unknown 33 34 function, and no other proteins in cog1816 or cog1001, and thus adenosine or adenine 35 36 37 deaminases are not likely to be found in this organism. The genome of Streptomyces avermitilis 38 39 MA4680 contains five proteins from cog1816 and four proteins from cog0402. Of the five 40 41 proteins from cog1816, Sav5577 is a confirmed adenine deaminase belonging to group 3, 42 43 44 Sav1165 and Sav3358 are believed to be adenosine deaminases from group 1, and Sav4906 45 46 from group 17 has unknown function. The genes from cog0402 include an isoxanthopterin 47 48 deaminase (Sav2017) and Sav4899, Sav6652, and Sav3494 of unknown function. Deinococcus 49 50 51 radiodurans R1 has an adenine deaminase from cog1001 (DrA0270) and a guanine deaminase 52 53 from cog0402 (DrA0180) in addition to Dr0824. Nis0429 is the sole protein in Nitratiruptor sp. 54 55 SB1552 from cog0402, cog1816 or cog1001. 56 57 58 21 59 60 ACS Paragon Plus Environment Biochemistry Page 22 of 50

1 2 3 Strategy for Annotation and Characterization . Acidothermus celluloyticus 11B, 4 5 6 Streptomyces avermitilis MA4680, Nitratiruptor sp. SB1552, and Deinococcus radiodurans 7 8 R1 were identified as lacking the men genes necessary for menaquinone biosynthesis. The 9 10 11 genes involved in an alternative menaquinone biosynthetic pathway were identified in clusters 12 13 in the Acidothermus celluloyticus 11B genome and we have confirmed that Acel0264 was 14 15 capable of deaminating 6aminodeoxyfutalosine. The genomes of Nitratiruptor sp. SB1552, 16 17 18 Streptomyces avermitilis MA4680, and Deinococcus radiodurans R1 lacked the men genes 19 20 associated with the shikimate pathway, but they do possess the mqn genes of the futalosine 21 22 pathway. A sequence similarity network was constructed and used to identify additional 23 24 25 proteins with similar sequences to Acel0264, Sav2595, Dr0824, and Nis0429. Substrate 26 27 profiles for these proteins were determined using a small focused library and a model for 28 29 substrate binding was proposed. The high resolution crystal structure of Nis0429 with smaller 30 31 32 ligands in the active site helped us to predict the binding mode of AFL as a substrate. 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 22 59 60 ACS Paragon Plus Environment Page 23 of 50 Biochemistry

1 2 3 ASSOCIATED CONTENT 4 5 6 AUTHOR INFORMATION 7 8 Corresponding Author 9 10 11 (F.M.R.) Address: Department of Chemistry, P.O. Box 30012, Texas A&M University, College 12 13 Station, TX 778433012. Tel: (979) 8453373. Email: [email protected]. 14 15 16 17 18 FUNDING 19 20 This work was supported in part by the Robert A. Welch Foundation (A840) and the 21 22 NIH (GM 71790). This work was made possible by the Center for Synchrotron Biosciences 23 24 25 grant, P30EB009998, from the National Institute of Biomedical Imaging and Bioengineering 26 27 (NIBIB). Use of the National Synchrotron Light Source, Brookhaven National Laboratory, was 28 29 supported by the U.S. Department of Energy, Office of Science, Office of Basic Energy 30 31 32 Sciences, under Contract No. DEAC0298CH10886. 33 34 35 36 37 ABBREVIATIONS 38 39 ADE, adenine deaminase; ADD, adenosine deaminase; cog, cluster of orthologous 40 41 groups; IPTG, isopropylβgalactoside; DTT, dithiothreitol; AHS, amidohydrolase superfamily; 42 43 44 ICPMS, inductively coupled plasma mass spectrometry; NECA, 1(6amino9Hpurin9yl)1 45 46 deoxyNethylβDribofuranuronamide; SAH, Sadenosylhomocysteine; AFL, 6 47 48 aminodeoxyfutalosine. 49 50 51 52 53 54 55 56 57 58 23 59 60 ACS Paragon Plus Environment Biochemistry Page 24 of 50

1 2 3 REFERENCES 4 5 6 1. Hiratsuka, T., Furihata, K., Ishikawa, J., Yamashita, H., Itoh, N., Seto, H., and Dairi, T. (2008) 7 8 An alternative menaquinone biosynthetic pathway operating in microorganisms, Science 321 , 9 10 11 16701673. 12 13 2. Bentley, S. D., Chater, K. F., CerdenoTarraga, A. M., Challis, G. L., Thomson, N. R., James, 14 15 K. D., Harris, D. E., Quail, M. A., Kieser, H., Harper, D., Bateman, A., Brown, S., Chandra, G., 16 17 18 Chen, C. W., Collins, M., Cronin, A., Fraser, A., Goble, A., Hidalgo, J., Hornsby, T., Howarth, 19 20 S., Huang, C. H., Kieser, T., Larke, L., Murphy, L., Oliver, K., O'Neil, S., Rabbinowitsch, E., 21 22 Rajandream, M. A., Rutherford, K., Rutter, S., Seeger, K., Saunders, D., Sharp, S., Squares, R., 23 24 25 Squares, S., Taylor, K., Warren, T., Wietzorrek, A., Woodward, J., Barrell, B. G., Parkhill, J., 26 27 and Hopwood, D. A. (2002) Complete genome sequence of the model actinomycete 28 29 Streptomyces coelicolor A3(2), Nature 417 , 141147. 30 31 32 3. Borodina, I., Krabben, P., and Nielsen, J. (2005) Genomescale analysis of Streptomyces 33 34 coelicolor A3(2) metabolism, Genome research 15 , 820829. 35 36 37 4. Arakawa, C., Kuratsu, M., Furihata, K., Hiratsuka, T., Itoh, N., Seto, H., and Dairi, T. (2011) 38 39 Diversity of the early step of the futalosine pathway, Antimicrobial agents and chemotherapy 40 41 55 , 913916. 42 43 44 5. Holm, L., and Sander, C. (1997) An evolutionary treasure: unification of a broad set of 45 46 related to urease, Proteins 28 , 7282. 47 48 6. Seibert, C. M., and Raushel, F. M. (2005) Structural and catalytic diversity within the 49 50 51 amidohydrolase superfamily, Biochemistry 44 , 63836391. 52 53 54 55 56 57 58 24 59 60 ACS Paragon Plus Environment Page 25 of 50 Biochemistry

1 2 3 7. Kamat, S. S., Fan, H., Sauder, J. M., Burley, S. K., Shoichet, B. K., Sali, A., and Raushel, F. M. 4 5 6 (2011) Enzymatic deamination of the epigenetic base N6methyladenine, Journal of the 7 8 American Chemical Society 133 , 20802083. 9 10 11 8. Hall, R. S., Agarwal, R., Hitchcock, D., Sauder, J. M., Burley, S. K., Swaminathan, S., and 12 13 Raushel, F. M. (2010) Discovery and structure determination of the orphan enzyme 14 15 isoxanthopterin deaminase, Biochemistry 49 , 43744382. 16 17 18 9. Goble, A. M., Zhang, Z., Sauder, J. M., Burley, S. K., Swaminathan, S., and Raushel, F. M. 19 20 (2011) Pa0148 from Pseudomonas aeruginosa catalyzes the deamination of adenine, 21 22 Biochemistry 50 , 65896597. 23 24 25 10. Goble, A. M., Fan, H., Sali, A., and Raushel, F. M. (2011) Discovery of a cytokinin deaminase, 26 27 ACS chemical biology 6, 10361040. 28 29 11. Wang, Z., and Quiocho, F. A. (1998) Complexes of adenosine deaminase with two potent 30 31 32 inhibitors: Xray structures in four independent molecules at pH of maximum activity, 33 34 Biochemistry 37 , 83148324. 35 36 37 12. Ho, M. C., Cassera, M. B., Madrid, D. C., Ting, L. M., Tyler, P. C., Kim, K., Almo, S. C., and 38 39 Schramm, V. L. (2009) Structural and metabolic specificity of methylthiocoformycin for 40 41 malarial adenosine deaminases, Biochemistry 48 , 96189626. 42 43 44 13. Baer, H. P., Drummond, G. I., and Gillis, J. (1968) Studies on the specificity and mechanism of 45 46 action of adenosine deaminase, Arch Biochem Biophys 123 , 172178. 47 48 14. Kinoshita, T., Nakanishi, I., Terasaka, T., Kuno, M., Seki, N., Warizaya, M., Matsumura, H., 49 50 51 Inoue, T., Takano, K., Adachi, H., Mori, Y., and Fujii, T. (2005) Structural basis of compound 52 53 recognition by adenosine deaminase, Biochemistry 44 , 1056210569. 54 55 56 57 58 25 59 60 ACS Paragon Plus Environment Biochemistry Page 26 of 50

1 2 3 15. Kinoshita, T., Tada, T., and Nakanishi, I. (2008) Conformational change of adenosine 4 5 6 deaminase during ligandexchange in a crystal, Biochem Biophys Res Commun 373 , 5357. 7 8 16. Koch, A. L., and Vallee, G. (1959) The properties of adenosine deaminase and adenosine 9 10 11 nucleoside phosphorylase in extracts of Escherichia coli, J Biol Chem 234 , 12131218. 12 13 17. Lin, J., Westler, W. M., Cleland, W. W., Markley, J. L., and Frey, P. A. (1998) Fractionation 14 15 factors and activation energies for exchange of the low barrier hydrogen bonding proton in 16 17 18 peptidyl trifluoromethyl ketone complexes of chymotrypsin, Proc Natl Acad Sci U S A 95 , 19 20 1466414668. 21 22 18. Gleeson, M. P., Burton, N. A., and Hillier, I. H. (2003) Prediction of the potency of inhibitors of 23 24 25 adenosine deaminase by QM/MM calculations, Chem Commun (Camb) , 21802181. 26 27 19. Sadat Hayatshahi, S. H., Abdolmaleki, P., Ghiasi, M., and Safarian, S. (2007) QSARs and 28 29 activity predicting models for competitive inhibitors of adenosine deaminase, FEBS Lett 581 , 30 31 32 506514. 33 34 20. Wilson, D. K., Rudolph, F. B., and Quiocho, F. A. (1991) Atomic structure of adenosine 35 36 37 deaminase complexed with a transitionstate analog: understanding catalysis and 38 39 immunodeficiency mutations, Science 252 , 12781284. 40 41 21. Larson, E. T., Deng, W., Krumm, B. E., Napuli, A., Mueller, N., Van Voorhis, W. C., Buckner, 42 43 44 F. S., Fan, E., Lauricella, A., DeTitta, G., Luft, J., Zucker, F., Hol, W. G., Verlinde, C. L., and 45 46 Merritt, E. A. (2008) Structures of substrate and inhibitorbound adenosine deaminase from a 47 48 human malaria parasite show a dramatic conformational change and shed light on drug 49 50 51 selectivity, J Mol Biol 381 , 975988. 52 53 54 55 56 57 58 26 59 60 ACS Paragon Plus Environment Page 27 of 50 Biochemistry

1 2 3 22. Weiss, P. M., Cook, P. F., Hermes, J. D., and Cleland, W. W. (1987) Evidence from nitrogen15 4 5 6 and solvent deuterium isotope effects on the chemical mechanism of adenosine deaminase, 7 8 Biochemistry 26 , 73787384. 9 10 11 23. Sideraki, V., Mohamedali, K. A., Wilson, D. K., Chang, Z., Kellems, R. E., Quiocho, F. A., and 12 13 Rudolph, F. B. (1996) Probing the functional role of two conserved active site aspartates in 14 15 mouse adenosine deaminase, Biochemistry 35 , 78627872. 16 17 18 24. Atkinson, H. J., Morris, J. H., Ferrin, T. E., and Babbitt, P. C. (2009) Using sequence similarity 19 20 networks for visualization of relationships across diverse protein superfamilies, PLoS One 4, 21 22 e4345. 23 24 25 25. Li, X., Apel, D., Gaynor, E. C., and Tanner, M. E. (2011) 5'methylthioadenosine nucleosidase 26 27 is implicated in playing a key role in a modified futalosine pathway for menaquinone 28 29 biosynthesis in Campylobacter jejuni, J Biol Chem 286 , 1939219398. 30 31 32 26. Aslanidis, C., and de Jong, P. J. (1990) Ligationindependent cloning of PCR products (LIC 33 34 PCR), Nucleic acids research 18 , 60696074. 35 36 37 27. Hall, R. S., Xiang, D. F., Xu, C., and Raushel, F. M. (2007) NAcetylDglucosamine6 38 39 phosphate deacetylase: substrate activation via a single divalent metal ion, Biochemistry 46 , 40 41 79427952. 42 43 44 28. Waterhouse, A. M., Procter, J. B., Martin, D. M., Clamp, M., and Barton, G. J. (2009) Jalview 45 46 Version 2a multiple sequence alignment editor and analysis workbench, Bioinformatics 25 , 47 48 11891191. 49 50 51 29. Edgar, R. C. (2004) MUSCLE: multiple sequence alignment with high accuracy and high 52 53 throughput, Nucleic Acids Res 32 , 17921797. 54 55 56 57 58 27 59 60 ACS Paragon Plus Environment Biochemistry Page 28 of 50

1 2 3 30. Sali, A., and Blundell, T. L. (1993) Comparative Protein Modeling by Satisfaction of Spatial 4 5 6 Restraints, J Mol Biol 234 , 779815. 7 8 31. Shen, M. Y., and Sali, A. (2006) Statistical potential for assessment and prediction of protein 9 10 11 structures, Protein Science 15 , 25072524. 12 13 32. Sherman, W., Day, T., Jacobson, M. P., Friesner, R. A., and Farid, R. (2006) Novel procedure 14 15 for modeling ligand/receptor induced fit effects, J Med Chem 49 , 534553. 16 17 18 33. Lorber, D. M., and Shoichet, B. K. (2005) Hierarchical docking of databases of multiple ligand 19 20 conformations, Current Topics in Medicinal Chemistry 5, 739749. 21 22 34. Mysinger, M. M., and Shoichet, B. K. (2010) Rapid ContextDependent Ligand Desolvation in 23 24 25 Molecular Docking, J Chem Inf Model 50 , 15611573. 26 27 35. Hermann, J. C., Ghanem, E., Li, Y. C., Raushel, F. M., Irwin, J. J., and Shoichet, B. K. (2006) 28 29 Predicting substrates by docking highenergy intermediates to enzyme structures, J Am Chem 30 31 32 Soc 128 , 1588215891. 33 34 36. Hermann, J. C., MartiArbona, R., Fedorov, A. A., Fedorov, E., Almo, S. C., Shoichet, B. K., 35 36 37 and Raushel, F. M. (2007) Structurebased activity prediction for an enzyme of unknown 38 39 function, Nature 448 , 775779. 40 41 37. Fan, H., Hitchcock, D., Seidel, R. D., Hillerich, B., Lin, H., Almo, S. C., Sali, A., Shoichet, B. 42 43 44 K., and Raushel, F. M. (2013) Assignment of pterin deaminase activity to an enzyme of 45 46 unknown function guided by homology modeling and docking, J Am Chem Soc 135 , 795803. 47 48 38. Kanehisa, M., and Goto, S. (2000) KEGG: Kyoto Encyclopedia of Genes and Genomes, 49 50 51 Nucleic Acids Res 28 , 2730. 52 53 54 55 56 57 58 28 59 60 ACS Paragon Plus Environment Page 29 of 50 Biochemistry

1 2 3 39. Kanehisa, M., Goto, S., Hattori, M., AokiKinoshita, K. F., Itoh, M., Kawashima, S., Katayama, 4 5 6 T., Araki, M., and Hirakawa, M. (2006) From genomics to chemical genomics: new 7 8 developments in KEGG, Nucleic Acids Res 34 , D354D357. 9 10 11 40. Otwinowski, Z., and Minor, W. (1997) Processing of Xray diffraction data collected in 12 13 oscillation mode, Methods Enzymol. 276 , 307326. 14 15 41. DeLaFortelle, E., and Bricogne, G. (1997) Maximumlikelihood heavy atom parameter 16 17 18 refinement in the MIR and MAD methods., Methods Enzymol. 276 , 472493. 19 20 42. Terwilliger, T. C., and Berendzen, J. (1997) Automated structure solution for MIR and MAD, 21 22 Acta Cryst. D55 , 849861. 23 24 25 43. Perrakis, A., Morris, R., and Lamzin, V. S. (1999) Automated protein model building combined 26 27 with iterative structure refinement, Nature Struct. Biol. 6, 458463. 28 29 44. Jones, T. A., Zou, J.Y., Cowan, S. W., and Kjeldgaard, M. (1991) Improved methods in 30 31 32 building protein models in electron density map and the location of errors in these models, Acta 33 34 Crystallogr. A47 , 110 119. 35 36 37 45. Brunger, A. T., Adams, P. D., Clore, G. M., Delano, W. L., Gros, P., GrosseKunstleve, R. W., 38 39 Jiang, J. S., Kuszwewski, J., Nilges, M., Pannu, N. S., Read, R. J., Rice, L. M., Somonsom, T., 40 41 and Warren, G. L. (1998) Crystallography & NMR system: a new software suite for 42 43 44 macromolecular structure determination, Acta Crystallogr. D54 , 905921. 45 46 46. Winn, M. D., Ballard, C. C., Cowtan, K. D., Dodson, E. J., Emsley, P., Evans, P. R., Keegan, R. 47 48 M., Krissinel, E. B., Leslie, A. G., McCoy, A., McNicholas, S. J., Murshudov, G. N., Pannu, N. 49 50 51 S., Potterton, E. A., Powell, H. R., Read, R. J., Vagin, A., and Wilson, K. S. (2011) Overview of 52 53 the CCP4 suite and current developments, Acta crystallographica. Section D, Biological 54 55 crystallography 67 , 235242. 56 57 58 29 59 60 ACS Paragon Plus Environment Biochemistry Page 30 of 50

1 2 3 47. Collaborative Computational Project Number 4. (1994) The CCP4 suite: programs for protein 4 5 6 crystallography, Acta Cryst. D50 , 760763. 7 8 48. Murshudov, G. N., Vagin, A. A., and Dodson, E. J. (1997) Refinement of macromolecular 9 10 11 structures by the maximumlikelihood method, Acta crystallographica. Section D, Biological 12 13 crystallography 53 , 240255. 14 15 49. Emsley, P., and Cowtan, K. (2004) Coot: modelbuilding tools for molecular graphics, Acta 16 17 18 crystallographica. Section D, Biological crystallography 60 , 21262132. 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 30 59 60 ACS Paragon Plus Environment Page 31 of 50 Biochemistry

1 2 3 4 5 6 Table 1. Data Collection and Refinement Statistics (data for high resolution shells) 7 PDB identifier 2IMR 3V7P 4M51 8 Space group C 1 2 1 C 1 2 1 C 1 2 1 9 Unit cell dimension (Å) 10 11 a 113.83 91.515 92.163 12 b 71.30 75.169 75.788 13 c 51.34 75.646 76.291 14 Cell angles (degrees) 15 alpha 90.00 90.00 90.00 16 beta 115.20 120.72 120.99 17 18 gamma 90.00 90.00 90.00 19 Molecules per ASU 1 1 1 20 Solvent content 40.24 47.01 47.03 21 Matthew's Coefficient 2.06 2.32 2.33 22 benzoate benzoate 23 tartrate sulfate 24 Ligands Zn 2+ 25 unknown HEPES 2+ 2+ 26 Fe Fe 27 NSLS X29A NSLS X12C NSLS X29A 28 Xray source 29 Wavelength 0.9795 1.075 1.075 30 molecular 31 Method of structure solution SAD SAD 32 replacement 33 29.301.78 70.001.35 33.911.08 Resolution 34 (1.841.78) (1.381.35) (1.101.08) 35 Resolution/refinement 29.301.78 50.001.35 33.911.08 36 37 Completeness (%) 95.3 (64.7) 99.6 (94.3) 95.1 (91.1) 38 I/sigma (I) 19.20 (3.30) 9.00 (1.30) 10.20 (2.50) 39 Rsym 0.048 (0.364) 0.051 (0.730) 0.061 (0.640) 40 Rwork (R free ) 20.1 (23.0) 12.8 (16.1) 11.8 (13.5) 41 Rfree reflections (%) 1321 (5.0%) 2889 (3.0%) 5505 (3.0%) 42 Average B factor 24.11 24.77 16.48 43 44 RMSD 45 bonds lengths 0.005 0.012 0.012 46 bond angles 1.310 1.389 1.614 47 Number of solvent molecules 170 226 609 48 Ramachandran plot statistics 49 50 Most favored regions 97.1% 96.9% 96.4% 51 Allowed regions 2.6% 2.9% 3.4% 52 Outliers 0.27% (His301) 0.25% (His257) 0.25%(His257) 53 54 55 56 57 58 31 59 60 ACS Paragon Plus Environment Biochemistry Page 32 of 50

1 2 Table 2. Kinetic Constants for Sav2595, Acel0264, Dr0824, and Nis0429 a 3 4 Sav2595 Acel0264 Dr0824 Nis0429

5 kcat Km kcat /Km kcat Km kcat /Km kcat Km kcat /Km kcat Km kcat /Km 1 1 1 1 1 1 1 1 1 1 1 1 6 (s ) (M) (M s ) (s ) (M) (M s ) (s ) (M) (M s ) (s ) (M) (M s ) 7 AFL 3.8 4.8 7.9 x 10 5 6.8 6.0 1.1 x 10 6 8.6 1.8 4.8 x 10 6 1.2 0.9 1.3 x 10 6 8 NECA 34 0.007 61 110 1.0 x 10 4 110 9 3 4 3 10 5’methylthioadenosine 0.40 94 4.3 x 10 28 3.4 x 10 0.31 110 2.8 x 10 11 adenosine 2.2 x 10 4 20 1.8 x 10 4 11 12 Sadenosylhomocysteine 310 24 7.0 x 10 3 79 13 5’deoxyadenosine 2.9 x 10 3 47 7.4 x 10 4 0.017 42 410 14 15 3’deoxyadenosine 240 44 240 6 3 4 16 2’deoxyadenosine 1.1 x 10 51 0.72 37 1.9 x 10 5 17 AMP 32 45 21 3 18 a 19 Standard errors on the kinetic constants are less than 15%. Assays were conducted at pH 7.5 and at 30 °C. Sav2595 is from group 13 of 20 cog1816; Acel0264 is from group 11 of cog1816; Dr0824 is from group 41 of cog0402; and Nis0429 is from group 7 of cog0402. 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 32 43 44 45 46 ACS Paragon Plus Environment 47 48 49 50 51 52 53 54 55 56 57 58 59 60 Page 33 of 50 Biochemistry

1 2 3 a 4 Table 3. Kinetic Constants for Sav2595 mutants 5 6 AFL adenosine 7 kcat Km kcat /Km kcat Km kcat /Km 1 1 1 1 1 1 8 (s ) (M s ) (s ) (M s ) 9 (M) (M) 10 WT 3.8 4.8 8 x 10 5 2.2 x 10 4 11 4 4 12 R87A 0.094 3.4 2.7 x 10 0.106 8.3 1.3 x 10 13 4 4 14 R87M 0.42 8 5.3 x 10 0.41 10 4.1 x 10 15 aStandard errors on the kinetic constants are less than 15%. Assays 16 were conducted at pH 7.5 and at 30 °C. 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 33 59 60 ACS Paragon Plus Environment Biochemistry Page 34 of 50

1 2 3 4 5 6 Table 4. 6Aminodeoxyfutalosine deaminase enzymes 7 Organism Locus Tag GI cog 8 Acidothermus cellulolyticus 11B Acel_0264 117927473 1816 9 Arcobacter butzleri RM018 Abu_0489 157736750 0402 10 11 Campylobacter concisus 13826 CCC13826_0371 157164865 0402 12 Campylobacter curvus 525.92 CCV52592_0362 154174508 0402 13 Campylobacter fetus subsp. fetus 8240 CFF8240_1504 118475030 0402 14 Campylobacter hominis ATCC CHAB381_1212 154147820 0402 15 BAA_381 16 Campylobacter jejuni NCTC 11168 Cj0067 15791459 0402 17 18 Campylobacter jejuni RM1221 CJE0064 57237077 0402 19 Campylobacter jejuni subsp. 81176 CJJ81176_0105 121612374 0402 20 Campylobacter jejuni subsp. doylei JJD26997_0077 153952082 0402 21 269.97 22 Campylobacter jejuni subsp. jejuni 23 C8J_0060 157414381 0402 24 81116 25 Campylobacter lari RM1200 Cla_0229 222823270 0402 26 Deinococcus deserti VCD115 Deide_06659 226355533 0402 27 Deinococcus geothermalis DSM 11300 Dgeo_0576 94984683 0402 28 Deinococcus radiodurans R1 Dr0824 15805850 0402 29 Frankia alni ANC14a FRAAL1027 111220494 1816 30 31 Frankia sp. Ccl3 Francci3_0534 86739247 1816 32 Frankia sp. EAN1pec Franean1_6098 158317840 1816 33 Frankia sp. EAN1pec Franean1_2196 158314030 1816 34 Helicobacter hepaticus ATCC 51449 HH1764 32267263 0402 35 Nautilia profundicola AmH NAMH_1355 224373376 0402 36 37 Nitratiruptor sp. SB1552 NIS_0429 152990178 0402 38 Nocardioides sp. JS614 Noca_1355 119715591 1816 39 Salinospora arenicola CNS205 Sare_1766 159037391 1816 40 Salinospora tropica CNB440 Strop_1779 145594323 1816 41 Streptomyces avermitilis MA4680 Sav_2595 29829137 1816 42 Streptomyces coelicolor A3(2) Sco5662 21224013 1816 43 Streptomyces griseus subsp. griseus 44 SGR_1841 182435634 1816 45 NBRC 13359 46 Sulfurimonas denitrificans DSM 1251 Suden_0686 78776885 0402 47 Sulfurovum sp. NBC371 SUN_1037 152992630 0402 48 Wolinella succinogenes DSM 1740 WS1477 34557817 0402 49 50 51 52 53 54 55 56 57 58 34 59 60 ACS Paragon Plus Environment Page 35 of 50 Biochemistry

1 2 3 FIGURE LEGENDS 4 5 6 Figure 1 : Biosynthetic pathways for the assembly of menaquinone. The deamination of AFL is 7 8 shown in red. The pathway for menaquinone biosynthesis in E. coli is shown in blue. 9 10 11 12 13 Figure 2 : Genomic context for Acel0264. UbiD (Acel0255) is believed to catalyze a 14 15 decarboxylation reaction in the late stages of menaquinone biosynthesis. UbiA (Acel0256) 16 17 functions as a 4hydroxybenzoate polyprenyltransferase. UbiX (Acel0257) is a 3octaprenyl4 18 19 20 hydroxybenzoate carboxy. Acel0261 and Acel0263 are MqnA and MqnC homologs, 21 22 respectively. Acel0264 catalyzes the deamination of 6aminodeoxyfutalosine to futalosine and 23 24 25 has been characterized in this work. UbiE (Acel0265) is a methyltransferase involved in 26 27 menaquinone biosynthesis. cog-FixC (Acel0266) serves as a geranylgeranyl reductase in the 28 29 biosynthesis of menaquinone. Genes depicted in grey have unknown function or function in an 30 31 32 unrelated pathway. 33 34 35 36 Figure 3 : Sequence similarity network created using Cytoscape ( www.cytoscape.org ) of 37 38 39 cog1816 from the amidohydrolase superfamily. Each node in the network represents a single 40 41 sequence, and each edge (depicted as lines) represents the pairwise connection between two 42 43 70 44 sequences at a BLAST Evalue of better than 1 × 10 . Lengths of edges are not significant, 45 46 except for tightly clustered groups, which are more closely related than sequences with only a 47 48 few connections. Group 5 contains the adenosine deaminase from E. coli . Additional groups 49 50 51 containing adenosine deaminase enzymes are shown in green. Group 3 contains adenine 52 53 deaminase enzymes, with the exception of the blue outliers, which are cytokinin deaminases. In 54 55 56 57 58 35 59 60 ACS Paragon Plus Environment Biochemistry Page 36 of 50

1 2 3 groups 11 and 13, the sequences depicted in pink contain all of the essential residues for the 4 5 6 deamination of AFL identified in this investigation. 7 8 9 10 11 Figure 4 : Sequence similarity network of cog0402. Groups 7 and 41 (pink) contain genes with 12 13 all the essential residues to be AFL deaminases. Other groups with known function include 14 15 group 1 (5’modified adenosine deaminases), group 2 (guanine deaminases), group 4 16 17 18 (isoxanthopterin deaminases or 8oxoguanine deaminases), group 6 (cytosine deaminases), 19 20 group 8 (NformiminoLglutamate deiminases) and group 10 (pterin deaminase). Groups with 21 22 no known function are shown in grey. 23 24 25 26 27 Figure 5: Cartoon model of Nis0429 C2symmetrical homodimer (based on the structure 28 29 3V7P). Each monomer consists of two domains, a TIMbarrel domain (helices are red, beta 30 31 32 sheets are yellow and loops are green) and a smaller domain 2 (a betabarrel flanked with two 33 34 short alphahelices, all are blue colored). Iron and benzoic acid ligands are shown as spheres. 35 36 37 Both domains contribute to interactions between amino acid residues along the dimer interface. 38 39 40 41 Figure 6: The 2F F electron density map (1 σ cutoff) is shown around amino acid residues 42 o c 43 44 involved in metal coordination (red) inside of the active site of Nis0429 (PDB ID: 3V7P) and 45 46 around small unknown ligands (green). The residues are shown as stick models, and iron is a 47 48 49 yellow sphere. The unknown ligands were modeled and refined as benzoic acid, glycine, and 50 51 formate. They occupy the suspected substratebinding site of Nis0429. The overall protein 52 53 structure in the background is represented by a cartoon model. 54 55 56 57 58 36 59 60 ACS Paragon Plus Environment Page 37 of 50 Biochemistry

1 2 3 Figure 7: The cartoon model of Nis0429 with a bound substrate (6aminodeoxyfutalosine, 4 5 6 AFL, the stick model). The substrate was modeled using the location of a few smaller ligands 7 8 detected inside of the substratebinding site in the structure 3V7P (the surface of these small 9 10 11 ligands is marked by yellow dots). The iron atom is drawn as a sphere. 12 13 14 15 Figure 8: The amino acid residues in the active site of Nis0429 (PDB ID: 3V7P) are shown 16 17 18 with a modeled AFL presented as sticks. The residues that are conserved among “typical” 19 20 adenosine deaminases have carbon atoms colored in magenta. The residues of the proposed 21 22 hydrophobic binding site are unique to Nis0429 and its homologues and have carbon atoms 23 24 25 drawn in yellow. The carbon atoms of AFL are colored in cyan. The iron atom is shown as a 26 27 grey sphere. The protein in the background is presented by a cartoon model. 28 29 30 31 32 Figure 9 : Superimposition of wild type Nis0429 (PDB ID: 3V7P) and the S145A mutant (PDB 33 34 ID: 4M51). For the wild type enzyme the carbon atoms are cyan, oxygen atoms are red and 35 36 37 nitrogen atoms are blue. For the S145A mutant the carbon atoms are green with selected 38 39 residues drawn as sticks. Iron in drawn as a brown sphere. The 2FoFc electron density maps 40 41 (sigma cutoff 1.0) are shown around the benzoic acid ligands and interacting residues. 42 43 44 45 46 Figure 10: Superposition of two structures, 3V7P (yellow) and 2IMR (green and blue), is 47 48 presented as a cartoon model. Two sequence segments (residues 102128 and 253283) in the 49 50 51 structure 2IMR are blue. Except for the small domain 2, these are the only parts in the Dr0824 52 53 structure that are significantly different form their counterparts in the Nis0429 structure. AFL 54 55 56 57 58 37 59 60 ACS Paragon Plus Environment Biochemistry Page 38 of 50

1 2 3 (spheres) cannot be fitted inside of the active site of the structure 2IMR due to multiple clashes 4 5 6 with residues residing on segments colored in blue. 7 8 9 10 11 Figure 11: Model of Sav2595 with AFL docked in the active site. The HEI of AFL is shown in 12 13 blue. The hydrophobic pocket that accommodates the AFL side chain and substrate recognizing 14 15 arginine are shown in yellow. Residues in purple are conserved among adenosine deaminases. 16 17 18 The zinc ion is shown as a grey sphere. 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 38 59 60 ACS Paragon Plus Environment Page 39 of 50 Biochemistry

1 2 3 4 5 ? 6 7 8 9 10 11 12 MTAN 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 Figure 1 55 56 57 58 39 59 60 ACS Paragon Plus Environment Biochemistry Page 40 of 50

1 2 3 4 5 6 Acel_0255 Acel_0256 Acel_0257 Acel_0261 Acel_0263 Acel_0264 Acel_0265 Acel0266 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 Figure 2 55 56 57 58 40 59 60 ACS Paragon Plus Environment Page 41 of 50 Biochemistry

1 2 3 4 5 6 1 2 3 4 5 6 7 8 9 10 11 12 13 14 7 8 9 10 11 12 13 14 15 16 17 18 15 16 17 18 19 20 21 22 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 Figure 3 57 58 41 59 60 ACS Paragon Plus Environment Biochemistry Page 42 of 50

1 2 3 4 5 1 2 3 4 5 6 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 32 33 34 35 36 37 38 39 40 41 42 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 Figure 4 56 57 58 42 59 60 ACS Paragon Plus Environment Page 43 of 50 Biochemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 Figure 5 50 51 52 53 54 55 56 57 58 43 59 60 ACS Paragon Plus Environment Biochemistry Page 44 of 50

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 Figure 6 52 53 54 55 56 57 58 44 59 60 ACS Paragon Plus Environment Page 45 of 50 Biochemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 Figure 7 53 54 55 56 57 58 45 59 60 ACS Paragon Plus Environment Biochemistry Page 46 of 50

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 Figure 8 54 55 56 57 58 46 59 60 ACS Paragon Plus Environment Page 47 of 50 Biochemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 Figure 9 52 53 54 55 56 57 58 47 59 60 ACS Paragon Plus Environment Biochemistry Page 48 of 50

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 Figure 10 50 51 52 53 54 55 56 57 58 48 59 60 ACS Paragon Plus Environment Page 49 of 50 Biochemistry

1 2 3 4 5 I76 6 L80 D296 7 Y79 E218 8 9 H239 10 V83 11 H34 H32 12 13 14 15 H215 16 17 AFL 18 R87 19 20 21 S119 22 T116 D154 23 24 25 26 27 28 29 30 31 32 33 34 Figure 11 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 49 59 60 ACS Paragon Plus Environment Biochemistry Page 50 of 50

1 2 3 Table of Contents Graphic 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 50 59 60 ACS Paragon Plus Environment