Exploring the Chemistry and Evolution of the Isomerases
Sergio Martínez Cuesta^a, Syed Asad Rahman^a, and Janet M. Thornton^a,1

^a European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom

Edited by Gregory A. Petsko, Weill Cornell Medical College, New York, NY, and approved January 12, 2016 (received for review May 14, 2015)

Isomerization reactions are fundamental in biology, and isomers usually differ in their biological role and pharmacological effects. In this study, we have cataloged the isomerization reactions known to occur in biology using a combination of manual and computational approaches. This method provides a robust basis for comparison and clustering of the reactions into classes. Comparing our results with the Enzyme Commission (EC) classification, the standard approach to represent enzyme function on the basis of the overall chemistry of the catalyzed reaction, expands our understanding of the biochemistry of isomerization. The grouping of reactions involving stereoisomerism is straightforward with two distinct types (racemases/epimerases and cis-trans isomerases), but reactions entailing structural isomerism are diverse and challenging to classify using a hierarchical approach. This study provides an overview of which isomerases occur in nature, how we should describe and classify them, and their diversity.

isomerases | enzyme reaction | EC-BLAST | reaction similarity | EC classification

Significance

Biologists are now challenged with the functional interpretation of vast amounts of sequencing data derived from genomics initiatives. Among all known proteins, the function of enzymes is probably the most investigated and best described at the molecular level. Together with enzymes changing the redox state of substrates and transferring chemical groups between molecules, isomerases catalyze interconversion of isomers, molecules sharing the same atomic composition but different arrangements of chemical groups. This study presents a way of describing isomerases that will give biochemists a method to search and utilize reaction data in a more knowledge-based manner. It captures our current knowledge, characterizing the chemistry of isomerization in biology, and will contribute to improving the annotation of sequences derived from genomes.

Author contributions: S.M.C., S.A.R., and J.M.T. designed research; S.M.C. performed research; S.A.R. contributed new reagents/analytic tools; S.M.C. and J.M.T. analyzed data; and S.M.C. and J.M.T. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

Freely available online through the PNAS open access option.

^1 To whom correspondence should be addressed. Email: [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1509494113/-/DCSupplemental.

1796–1801 | PNAS | February 16, 2016 | vol. 113 | no. 7 www.pnas.org/cgi/doi/10.1073/pnas.1509494113

The 3D structure and function of biomolecules are intimately linked. One of the most outstanding attributes of enzymes is their ability to recognize similar molecules, such as isomers, selectively. For example, glutamate racemase catalyzes the interconversion between the isomers L-glutamate and D-glutamate, with the first being one of the 20 amino acids used to build proteins, whereas the second is an essential component of bacterial cell walls (1). Isomers of the same drug are often distinguished; for example, the tragic story of thalidomide unveiled how subtle changes in the spatial arrangement of atoms can have drastic consequences in their biological effect (2). The isomerases, which catalyze these interconversions, are involved in the central metabolism of most living organisms and have important applications in organic synthesis, biotechnology, and drug discovery (3–5). In comparison to other classes, isomerases are a small class involving unimolecular reactions, which are easy to analyze manually. The study of the biological mechanisms of isomerases provided fundamental insights into the electrostatic principles of enzyme catalysis (6) and helped to reveal the connection between host–parasite interactions and cancer (7). The challenges of automatically detecting stereoisomerization in reactions also make their chemistry technically interesting (8–11).

A standard description of the biological function of genes and proteins is essential to interpret and report the outcome of sequencing initiatives. Scientists have traditionally developed elaborate classification systems to group functions in a hierarchical manner. Among the existing approaches, enzyme function is probably the best described at the molecular level, due to the long-standing effort of a team of enzymologists from the Enzyme Commission (EC) of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (NC-IUBMB) to classify enzyme function systematically. The EC classification is the most widely used system and uses four-digit identifiers known as the EC numbers describing different levels of the overall chemistry being catalyzed by an enzyme. For instance, alanine racemase is an isomerase (EC 5) catalyzing the racemization (EC 5.1) of the amino acid (EC 5.1.1) Ala (EC 5.1.1.1). This identifier serves as a bridge between biochemical data and genomic sequences allowing the assignment of enzymatic activity to genes and proteins in the functional annotation of genomes. Isomerases represent one of the six EC classes and are subdivided into six subclasses, 17 sub-subclasses, and 245 EC numbers corresponding to around 300 biochemical reactions (Fig. 1A).

Although the catalytic mechanisms of isomerases have already been partially investigated (3, 12, 13), with the flood of new data, an integrated overview of the chemistry of isomerization in biology is timely. This study combines manual examination of the chemistry and structures of isomerases with recent developments in the automatic search and comparison of reactions. Results obtained using our de novo reaction-based clustering approach were compared with the EC classification.

Results

Unlike other EC classes, the overall chemistry of isomerases is diverse, especially at the subclass level (Fig. 1A). Some isomerases change stereochemistry [racemases and epimerases (EC 5.1) and cis-trans isomerases (EC 5.2)]; the rest catalyze major structural rearrangements and mirror the chemistry of other EC primary classes but act intramolecularly [intramolecular oxidoreductases (EC 5.3) evoke oxidoreductases (EC 1), intramolecular transferases (EC 5.4) are designated from transferases (EC 2), and intramolecular lyases (EC 5.5) are designated from lyases (EC 4)]. Finally, other isomerases (EC 5.99) refer to isomerases that do not fit any of the above and exhibit even greater diversity. Only three subclasses, EC 5.1, EC 5.3, and EC 5.4, are further divided into sub-subclasses depending on different attributes of the reaction: type of substrate, bond change, reaction center, and chemical group transferred (Fig. 1A). An approach using a combination of manual analysis informed by an automatic comparison and clustering of reactions was contrasted to the EC nomenclature to suggest key determinants involved in the classification of isomerization in biology (SI Appendix, Fig. S1A).

Fig. 1. Analysis of the EC classification of isomerases. (A) Distribution of isomerases in six subclasses, with the type of isomerism highlighted. Different attributes of the reaction are used to divide subclasses into sub-subclasses. (B) Distribution of isomerase reactions by bond changes. The symbol “↔” indicates change of bond order.

Isomerase Reaction Data. At the time of writing, the NC-IUBMB (the body that oversees enzyme nomenclature) listed 5,385 active four-digit EC numbers in the classification, 245 of which correspond to isomerase EC numbers. The EC assigns an EC number to an enzyme and, based on experimental evidence, identifies its “dominant” reaction, even though the enzyme might be promiscuous and able to catalyze many different reactions. Biological databases, such as the Kyoto Encyclopedia of Genes and Genomes (KEGG; which is very widely used), rely on this so-called “IUBMB reaction,” which is chosen by the KEGG as the representative reaction for the group of reactions associated with the same EC number. Only the 219 isomerase EC numbers with chemical structures available for all reactants and balanced IUBMB reactions were used in this analysis (Materials and Methods). This dataset represents the most complete compilation of isomerase chemistry existing in nature that is known today (SI

For instance, all reactions in EC 5.1 are only C(R/S), except the conversion of L-phenylalanine into D-phenylalanine, where L-phenylalanine racemase (EC 5.1.1.11) catalyzes the cleavage and formation of two O–P bonds and two O–H bonds from ATP and water molecules. EC 5.2 is mainly C(E/Z), and C–H and C–C ↔ C=C are rare. The rest of the subclasses involve a more complex combination of bond changes and reaction centers. Despite being rare, 12 bond changes (40% of the total) and 468 reaction centers (79% of the total) are distinctive of one subclass (SI Appendix, Table S1). For example, the O–O bond in ring systems is only broken by EC 5.3 enzymes present in the arachidonic acid metabolism: prostaglandin synthases D, E, and I (EC 5.3.99.2, EC 5.3.99.3, and EC 5.3.99.4) and thromboxane-A synthase (EC 5.3.99.5). These enzymes catalyze the opening of epidioxy bridges in prostaglandins. On the other hand, abundant bond changes, such as C(R/S), are often present in multiple subclasses.

Isomerase Reactants. All isomerase reactions, as defined in the KEGG (15), are reversible, with both substrates and products equally designated as reactants. Most reactions are unimolecular (a single substrate leads to a single product); the only exception is the interconversion catalyzed by L-phenylalanine racemase (discussed above).
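Two properties stated in the text lend themselves to a simple programmatic check: isomers share the same atomic composition, and nearly all isomerase reactions are unimolecular. The Python sketch below illustrates that idea only and is not the procedure used in the study. The function names (`parse_formula`, `is_balanced`, `is_unimolecular`) are hypothetical, and the parser assumes plain molecular formulas with no brackets, charges, or hydrates.

```python
import re
from collections import Counter

# Assumed input format: element symbols with optional integer counts, e.g. "C3H7NO2".
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")
_FORMULA = re.compile(r"(?:[A-Z][a-z]?\d*)+")


def parse_formula(formula: str) -> Counter:
    """Return the atom counts of a simple molecular formula."""
    if not _FORMULA.fullmatch(formula):
        raise ValueError(f"unsupported formula: {formula!r}")
    counts = Counter()
    for element, number in _TOKEN.findall(formula):
        counts[element] += int(number) if number else 1
    return counts


def is_balanced(substrates: list[str], products: list[str]) -> bool:
    """True if both sides of the reaction have the same total atomic composition."""
    left = sum((parse_formula(f) for f in substrates), Counter())
    right = sum((parse_formula(f) for f in products), Counter())
    return left == right


def is_unimolecular(substrates: list[str], products: list[str]) -> bool:
    """True if a single substrate leads to a single product."""
    return len(substrates) == 1 and len(products) == 1


# Alanine racemase (EC 5.1.1.1): L-alanine <=> D-alanine, both C3H7NO2.
l_ala, d_ala = ["C3H7NO2"], ["C3H7NO2"]
print(is_balanced(l_ala, d_ala), is_unimolecular(l_ala, d_ala))

# Illustrative non-isomer pair (alanine vs. lactic acid, C3H6O3): not balanced.
print(is_balanced(["C3H7NO2"], ["C3H6O3"]))
```

A composition check of this kind cannot tell stereoisomers apart from structural isomers. Doing so requires full structures, which is why the analysis in the text works with bond changes and reaction centers rather than formulas.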