Structure and Evolution of Linalool Synthase
Total Page:16
File Type:pdf, Size:1020Kb
Structure and Evolution of Linalool Synthase Leland Cseke, Natalia Dudareva,1 and Eran Pichersky Department of Biology, University of Michigan Plant terpene synthases constitute a group of evolutionarily related enzymes. Within this group, however, enzymes that employ two different catalytic mechanisms, and their associated unique domains, are known. We investigated the structure of the gene encoding linalool synthase (LIS), an enzyme that uses geranyl pyrophosphate as a substrate and catalyzes the formation of linalool, an acyclic monoterpene found in the ¯oral scents of many plants. Although LIS employs one catalytic mechanism (exempli®ed by limonene synthase [LMS]), it has sequence motifs indicative of both LMS-type synthases and the terpene synthases employing a different mechanism (exempli®ed by copalyl diphosphate synthase [CPS]). Here, we report that LIS genes analyzed from several species encode proteins that have overall 40%±96% identity to each other and have 11 introns in identical positions. Only the region encoding roughly the last half of the LIS gene (exons 9±12) has a gene structure similar to that of the LMS-type genes. On the other hand, in the ®rst part of the LIS gene (exons 1±8), LIS gene structure is essentially identical to that found in the ®rst half of the gene encoding CPS. In addition, the level of similarity in the coding information of this region between the LIS and CPS genes is also signi®cant, whereas the second half of the LIS protein is most similar to LMS-type synthases. Thus, LIS appears to be a composite gene which might have evolved from a recombination event between two different types of terpene synthases. The combined evolutionary mechanisms of duplication followed by divergence and/or ``domain swapping'' may explain the extraordinarily large diversity of proteins found in the plant terpene synthase family. Introduction Terpenes are among the most widespread and alyzes the formation of limonene, a monoterpenoid de- chemically diverse groups of compounds produced by fense compound. So far, LMS (a protein with approxi- plants. They act in a large number of roles, e.g., hor- mately 600 amino acids) is one of the few terpene syn- mones, mediators of polysaccharide synthesis, photo- thases whose gene has been characterized for diverse synthetic pigments, electron carriers, and membrane species. The deduced amino acid sequences of LMSs components, among other essential functions within the from various angiosperms are 40%±65% identical to plant (Chappell 1995; McGarvey and Croteau 1995). each other, but share only ;28% amino acid sequence They also mediate interactions between plants and their identity with LMS from the gymnosperm Abies grandis environment: some serve as toxic defense compounds (Bohlmann, Steel, and Croteau 1997). Signi®cant levels to ward off herbivores or fungal attacks (Johnson and of interspeci®c sequence identity are also seen with co- Croteau 1987; Lewinsohn, Gijzen, and Croteau 1992; palyl diphosphate synthase (CPS, formerly ent-kaurene Nebeker, Hodges, and Blance 1993), while others are synthase A). CPS (a protein with approximately 800 res- volatile compounds released from the plants to attract idues), together with the similarly sized ent-kaurene syn- pollinators (Dobson 1993; Knudsen and Tollsten 1993). thase (KS, formerly ent-kaurene synthase B), is respon- Terpenoid compounds (also called isoprenoids) are sible for the production of ent-kaurene (the precursor of made from the condensation of one molecule of dime- gibberellin-type plant hormones) from GGPP (Yama- thylallyl pyrophosphate (DMAPP) and one or more mol- guchi et al. 1996). The deduced amino acid sequences ecules of isopentyl pyrophosphate (IPP), to give geranyl of CPS from A. thaliana, tomato, pea, pumpkin, and diphosphate (GPP), farnesyl diphosphate (FPP), or ger- corn species are only 45%±50% identical (Sun and Ka- anylgeranyl diphosphate (GGPP) (Gershenzon and Cro- miya 1994, 1997; Bensen et al. 1995). teau 1993). Practically all other terpenes are derived Certain sequence motifs shared by all plant terpene from these three precursors. The enzymes that use GPP, synthases suggest that they share a common evolution- FPP, or GGPP as substrates are termed terpene synthases ary origin (McGarvey and Croteau 1995; Bohlmann, or, when the product formed is a cyclic compound, ter- Meyer-Gauen, and Croteau 1998). Interestingly, the lev- pene cyclases. The mechanisms of action and of the evo- el of sequence identity between ``same'' synthases (i.e., lution of these enzymes are of particular interest for an enzymes with the same substrate and same product) in understanding of the many plant processes in which ter- different species can be lower than that between two penes are involved. different synthases. This is clearly shown in the phylo- Research on terpenes synthesis has focused on the genetic analysis of the terpene synthase protein family cyclases, such as limonene synthase (LMS), which cat- carried out by Croteau and co-workers (Bohlmann, Meyer-Gauen, and Croteau 1998), in which it was found 1 Present address: Horticulture Department, Purdue University. that the same terpene synthases (e.g., LMSs) from dif- ferent plant species belonged in two different subfami- Key words: domain swapping, evolution, ¯oral scent, gene struc- ture, isoprenoids, plants, terpene synthases, secondary metabolism. lies, and each of these subfamilies also contained en- Address for correspondence and reprints: Eran Pichersky, De- zymes whose substrates and products were quite differ- partment of Biology, University of Michigan, Ann Arbor, Michigan ent. 48109-1048. E-mail: [email protected]. Gene structure also points to the common ancestry Mol. Biol. Evol. 15(11):1491±1498. 1998 of many terpene synthases. LMS, 5EAS (5-epi-aristo- q 1998 by the Society for Molecular Biology and Evolution. ISSN: 0737-4038 lochene synthase), VES (vetispiradiene synthase), and 1491 1492 Cseke et al. CAS (casbene synthase) have introns in identical posi- cies (Kaiser 1991; Knudsen, Tollsten, and Bergstrom tions (Facchini and Chappell 1992; Mau and West 1994; 1993). A cDNA encoding linalool synthase (LIS1, for- Back and Chappell 1995; Yuba et al. 1996). However, merly LIS) was previously isolated and characterized CPS gene structure differs substantially from that of oth- from a C. breweri ¯oral cDNA library (Dudareva et al. er terpene cyclases (Sun and Kamiya 1994; Bensen et 1996). As part of our investigation to understand how al. 1995). It is noteworthy that the deduced amino acid the LIS gene promoter in C. breweri is differentially sequences of CPSs share only 20%±31% identity with regulated in several ¯ower parts during ¯oral develop- other terpene synthases, including KS (;26%±31% with ment, we isolated the LIS gene from two linalool-scent- KS, whose complete gene structure is not currently ed Onagraceae species, C. breweri and Oenothera ari- known) (Sun and Kamiya 1994; Yamaguchi et al. 1996, zonica, and from a nonscented one, Clarkia concinna, 1998). Their substantial sequence divergence from other which has nonetheless been shown to express its LIS terpene cyclases has resulted in the placement of the gene at a low level in the stigma (Pichersky et al. 1994; CPS proteins in a separate subfamily in the classi®cation Dudareva et al. 1996). In comparing the structures of scheme proposed by Bohlmann, Steele, and Croteau these LIS genes (and that of Arabidopsis thaliana, which (1997) and Bohlmann, Meyer-Gauen, and Croteau was independently obtained) with the structures of other (1998). terpene synthase genes, we discovered that the structure Although they are all evolutionarily related, mech- of LIS indicates that it is a composite gene. The results anistically the plant terpene synthases seem to fall into of these comparisons, and their implications for the evo- two main catergories. One class, exempli®ed by LMS, lution of the terpene gene family, are presented here. employs a mechanism involving the ionization of the substrate allylic diphosphate to initiate the reaction (Vo- Materials and Methods gel et al. 1996). A second class, exempli®ed by CPS, Construction of Partial Genomic Libraries from C. employs a mechanism involving substrate double-bond breweri, C. concinna, and O. arizonica protonation (Vogel et al. 1996). An important feature of the ®rst class of terpene synthases is the DDXXD motif DNAs were digested with EcoRI or EcoRV and found in the C-terminal halves of these proteins and separated on 0.7% agarose gels. Fragments in the ap- shown to be involved in metal cofactor binding and ca- propriate range (as previously determined by Southern talysis (Ashby and Edwards 1990; Tarshis et al. 1994; blot analysis with an LIS1 probe) were cut out and elut- Starks et al. 1997; Lesburg et al. 1997). CPS does not ed from the gels. In the case of EcoRV-generated frag- contain such a motif, but instead has an aspartate-rich ments, EcoRI/NotI adaptors (Pharmacia Biotech) were motif in its N-terminal half that is also believed to be ligated to the blunt ends of these fragments. Partial ge- involved in metal cofactor binding (Sun and Kamiya nomic libraries were then constructed and screened us- 1994). Thus, the classi®cation based on levels of amino ing the Lambda ZAP II/EcoRI/CIAP Cloning Kit (Stra- acid sequence similarities among various terpene syn- tagene) according to manufacturer's instructions. Li- thases, which places CPS in a separate subfamily, is braries were screened with LIS1 as a probe under high- consistent with its distinct mechanism. stringency conditions (658C hybridization temperature) However, some terpene synthases, such as abieta- for the C. breweri library and under low-stringency con- diene synthase (ABS) and linalool synthase (LIS), are ditions (588C hybridization temperature) for O. arizon- not easily classi®ed based on amino acid sequence com- ica library, with all other conditions being identical to parisons alone. This is so because they have higher se- the ones described in Dudareva et al. (1996). quence similarity to CPS in their N-terminal parts but Isolation of C.