The MDR Superfamily
Cell. Mol. Life Sci. 65 (2008) 3879 – 3894
Cellular and Molecular Life Sciences
1420-682X/08/243879-16
DOI 10.1007/s00018-008-8587-z
Birkhäuser Verlag, Basel, 2008

B. Persson a,b,*, J. Hedlund a and H. Jörnvall c

a IFM Bioinformatics, Linköping University, 581 83 Linköping (Sweden), Fax: +4613137568, e-mail: [email protected]
b Dept of Cell and Molecular Biology, Karolinska Institutet, 171 77 Stockholm (Sweden)
c Dept of Medical Biochemistry and Biophysics, Karolinska Institutet, 171 77 Stockholm (Sweden)
* Corresponding author.

Online First 14 November 2008

Abstract. The MDR superfamily with ~350-residue subunits contains the classical liver alcohol dehydrogenase (ADH), quinone reductase, leukotriene B4 dehydrogenase and many more forms. ADH is a dimeric zinc metalloprotein and occurs as five different classes in humans, resulting from gene duplications during vertebrate evolution, the first one traced to ~500 MYA (million years ago) from an ancestral formaldehyde dehydrogenase line. Like many duplications at that time, it correlates with enzymogenesis of new activities, contributing to conditions for emergence of vertebrate land life from osseous fish. The speed of changes correlates with function, as do differential evolutionary patterns in separate segments. Subsequent recognitions now define at least 40 human MDR members in the Uniprot database (corresponding to 25 genes when excluding close homologues), and in all species at least 10888 entries. Overall, variability is large but, as for many dehydrogenases, subdivided into constant and variable forms, corresponding to household and emerging enzyme activities, respectively. This review covers basic facts and describes eight large MDR families and nine smaller families. Combined, they have specific substrates in metabolic pathways, some with wide substrate specificity, and several with little known functions.

Keywords. Dehydrogenases, reductases, enzyme superfamily, evolution, bioinformatics.

Introduction

The superfamily of MDR (medium-chain dehydrogenase/reductase) proteins is formed by zinc-dependent ADHs (alcohol dehydrogenases), quinone reductases, leukotriene B4 dehydrogenase, and many more families. The family has grown considerably in recent years, and as of October 2007 it had close to 11000 members in the Uniprot database [1]. Disregarding species variations, there is still considerable multiplicity, with at least 25 members in humans, excluding close homologues. Compared to an investigation five years ago [2], the total number of known MDR members in all species has increased about 10-fold. Roughly half of the MDR proteins can be grouped into large families with hundreds of members, while about 1000 forms belong to small clusters with 10 or fewer members each. Thus, the MDR superfamily is now showing a spread resembling the complexity of the SDR (short-chain dehydrogenase/reductase) superfamily [3]. This is a new characteristic of MDR, not detectable until now, when abundant data from large-scale genome projects exist.

The first-characterised member was the class I type of mammalian ADH, for which the primary structure was reported as early as 1970 [4]. Subsequently detected MDR members included sorbitol DH [5], a crystallin, metabolic enzymes and a synaptic protein [6]. The latter report also coined the term MDR [6] to distinguish this protein family from that of SDR with its typically smaller (~250 residue) subunits and no metal dependence [7]. This split between two protein types, giving rise to a basic multiplicity of many activities, was seen already in the 1970s when the first data on Drosophila ADH showed subunit patterns separate from those of the Zn-dependent ADHs [8].
In parallel with these family distinctions deduced from whole-subunit similarities, the distribution of domain similarities, and the ancestral nucleotide binding units discerned from three-dimensional data [9–11] led to a number of repeated sequence similarities, also seen early [12] within this whole class of DHs/reductases/kinases in general.

The MDR proteins typically consist of two domains, where the C-terminal domain is coenzyme-binding with the ubiquitous Rossmann fold [10] of an often six-stranded parallel β-sheet sandwiched between α-helices on each side. The N-terminal domain is substrate binding with a core of antiparallel β-strands and surface-positioned α-helices, showing distant homology with the GroES structure [13]. The domains are separated by a cleft containing a deep pocket which accommodates the cofactor and the active site.

In this report, we review the different MDR families, observe common characteristics of the members and emphasise evolutionary aspects of this superfamily.

Divergence within the MDR superfamily

Presently, there are 10888 members of the MDR superfamily, as judged from Uniprot entries, which contain the Pfam [14] domains PF00107 (zinc-binding DH) and PF08240 (alcohol DH GroES-like domain). A number of MDR members are multi-proteins, and some Uniprot entries contain multiple proteins. Therefore, we decided to focus only on the MDR units for the subsequent comparisons. From the 10888 MDR members, we could extract 9756 MDR proteins, excluding partial sequences shorter than 250 amino acid residues. In order to get an overview of the relationships within this large superfamily, we have clustered and counted the proteins at different identity levels; i.e. at each level the MDR proteins are redundancy-reduced so that all sequences have pairwise identities below the threshold of that level (Table 1).

Table 1. MDR members in Uniprot as of October 2007 in total and at various redundancy-reduction levels.

Level    MDR domains
Total    9756
99%      7982
90%      6593
60%      3005
30%      476

The numbers represent MDR domains.

We then notice a substantial decrease in number already at the 99% identity level, where about 20% of all characterised MDRs are eliminated, representing closely related forms or duplicates. At the 90% level, we find about two-thirds of all members. At the 30% identity level, we find 476 members, corresponding to about 5% of the total number. Here we will in most cases find only one representative for each MDR family, and we can therefore estimate the total number of MDR families to be around 500. Of these families, we find that only 8 have 200 or more members, and therefore are the most widespread of the MDR proteins. Together they represent 3165 proteins in the database, corresponding to about a third of all MDR forms characterised. The majority of sequence clusters (334 families) presently have 10 or fewer members.

MDR families

The MDR superfamily can be divided into separate families, based upon sequence similarities. For each of the 8 families corresponding to the most widespread forms, we have derived a hidden Markov model (HMM, [15]) which can be used for subsequent database searches in order to find further members. We have created additional HMMs for some MDR families of special interest, for instance those with representatives from the human species. The number of currently detected members for each of these MDR families is shown in Table 2. The largest MDR families are currently the ADH and CAD (cinnamyl alcohol dehydrogenase) families. However, the description in this review reflects the current situation, and as future genome projects and environmental sequencing projects (e.g. [16]) will contribute to an immense increase of structural information, the number of MDR forms and their family sizes are also expected to increase. It should be noted that the functional importance of each MDR family is not correlated with the size of the family. If anything, it might even be the opposite, since the strictest substrate specificity is often associated with housekeeping enzymes of low multiplicity (cf. Table 3, below). Functionally, higher vertebrates have at least 11 separate MDR activity types (Fig. 4, below).

Knowledge of multiplicities within the MDR superfamily is steadily increasing (Fig. 1). In 1983, ADH and YADH (now TADH) were the only MDR families in the databases. In the latter half of the 1980s, the PDH (polyol DH) family became noticeable, followed by the first members of the TDH (threonine DH), QOR (quinone oxidoreductase), VAT (vesicle amine transport), ACR (acyl-CoA

Table 2. Number of members of the MDR families discussed as detected in the Uniprot sections Swissprot and TrEMBL with the corresponding HMMs at the stringent E-value level of 1e-100 or better.

Family   Members   Swissprot   TrEMBL
ADH      931       99          832
CAD      520       35          485
LTD      427       15          413
TADH     330       40          290
YHDH     295       1           294
BPDH     229       3           226
PDH      218       18          200
TDH      215       49          166
BurkDH   67        0           67
MCAS     58        1           57
MECR     49        13          36
VAT1     39        6           33
QOR      28        8           20
ACR      25        4           21
DOIAD    16        7           6

Highly widespread MDR families

At least eight MDR families are large, each with presently more than 200 members, and in total 3165 known proteins, thus corresponding to about one-third of the MDR superfamily members.

1. The ADH family

Based upon the present multiple sequence alignment and dendrogram of all MDR forms, ADH constitutes a natural family, encompassing animal ADH, plant ADH and FADH (glutathione-dependent formaldehyde dehydrogenase, also known as class III ADH). The ADH family as an entity is also supported by our HMM analyses. The ancestral form appears to be FADH, which is present in all kingdoms of life, and is the sole form present in non-vertebrate animals [18]. The dendrogram in Figure 2 may appear to suggest that classes II and V/VI emerged early, but the data for these classes are few. Disregarding the little-known forms, class I ADH from fish branches off first.
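
As a quick arithmetic check on the proportions quoted in the divergence discussion, the following minimal Python sketch recomputes them from the counts in Table 1 and from the member counts of the eight largest families in Table 2. The function and variable names are illustrative assumptions and not part of the original analysis; dividing the summed family sizes by the Table 1 total simply follows the text's own "about one-third" comparison.

```python
# Counts copied from Table 1 (MDR domains remaining at each identity threshold).
TABLE1 = {"Total": 9756, "99%": 7982, "90%": 6593, "60%": 3005, "30%": 476}

# Member counts of the eight largest families, copied from Table 2.
LARGE_FAMILIES = {
    "ADH": 931, "CAD": 520, "LTD": 427, "TADH": 330,
    "YHDH": 295, "BPDH": 229, "PDH": 218, "TDH": 215,
}


def retained_fractions(table):
    """Return the fraction of all MDR domains retained at each identity level."""
    total = table["Total"]
    return {level: count / total
            for level, count in table.items() if level != "Total"}


def large_family_share(families, total):
    """Return (summed member count, share of the total) for the given families."""
    members = sum(families.values())
    return members, members / total


if __name__ == "__main__":
    for level, frac in retained_fractions(TABLE1).items():
        print(f"{level:>4}: {frac:.1%} retained, {1 - frac:.1%} removed")
    members, share = large_family_share(LARGE_FAMILIES, TABLE1["Total"])
    print(f"eight largest families: {members} proteins ({share:.1%})")
```

Run as a script, this gives roughly 18% of domains removed at the 99% level, about 68% retained at 90% and about 5% retained at 30%, and the eight family counts sum to 3165 (about 32% of 9756), in line with the rounded figures in the text.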