Proc. Natl. Acad. Sci. USA Vol. 93, pp. 9033-9038, August 1996 Evolution

Duplication and functional divergence in the chalcone synthase gene family of Asteraceae: Evolution with substrate change and catalytic simplification
(anthocyanin/flavonoid genetics/gene phylogeny/secondary metabolism/stilbene synthase)

YRJÖ HELARIUTTA*†‡, MIKA KOTILAINEN*, PAULA ELOMAA*, NISSE KALKKINEN*, KÅRE BREMER§, TEEMU H. TEERI*, AND VICTOR A. ALBERT‡

*Institute of Biotechnology, University of Helsinki, P.O. Box 45, FIN-00014 Helsinki, Finland; ‡The New York Botanical Garden, Bronx, NY 10458-5126; and §Department of Systematic Botany, Uppsala University, Villavägen 6, S-752 36 Uppsala, Sweden

Communicated by Michael T. Clegg, University of California, Riverside, CA, May 20, 1996 (received for review November 29, 1995)

ABSTRACT Plant-specific polyketide synthase genes constitute a gene superfamily, including universal chalcone synthase [CHS; malonyl-CoA:4-coumaroyl-CoA malonyltransferase (cyclizing) (EC 2.3.1.74)] genes, sporadically distributed stilbene synthase (SS) genes, and atypical, as-yet-uncharacterized CHS-like genes. We have recently isolated from Gerbera hybrida (Asteraceae) an unusual CHS-like gene, GCHS2, which codes for an enzyme with structural and enzymatic properties as well as ontogenetic distribution distinct from both CHS and SS. Here, we show that the GCHS2-like function is encoded in the Gerbera genome by a family of at least three transcriptionally active genes. Conservation within the GCHS2 family was exploited with selective PCR to study the occurrence of GCHS2-like genes in other Asteraceae. Parsimony analysis of the amplified sequences together with CHS-like genes isolated from other taxa of angiosperm subclass Asteridae suggests that GCHS2 has evolved from CHS via a gene duplication event that occurred before the diversification of the Asteraceae. Enzyme activity analysis of proteins produced in vitro indicates that the GCHS2 reaction is a non-SS variant of the CHS reaction, with both different substrate specificity (to benzoyl-CoA) and a truncated catalytic profile. Together with the recent results of Durbin et al. [Durbin, M. L., Learn, G. H., Jr., Huttley, G. A. & Clegg, M. T. (1995) Proc. Natl. Acad. Sci. USA 92, 3338-3342], our study confirms a gene duplication-based model that explains how various related functions have arisen from CHS during plant evolution.

Plant-specific polyketide synthase genes constitute a gene superfamily. Genes encoding chalcone synthase [CHS; malonyl-CoA:4-coumaroyl-CoA malonyltransferase (cyclizing) (EC 2.3.1.74)] and flavonoids, their corresponding reaction products, seem to be universally distributed in plants (1-3). CHS genes have been isolated from a wide taxonomic spectrum from nonflowering seed plants to dicots and monocots (4, 5). Stilbene synthase (SS) genes coding for enzymes with a related activity to CHS have been isolated from species that accumulate stilbene phytoalexin (5-8). In addition to SS genes, other structurally unusual CHS-like genes or gene products have been reported (9-12). Recently, Tropf et al. (13) have provided evidence that SS genes have evolved from CHS genes via independent gene duplication events several times during seed plant evolution. Durbin et al. (14) have demonstrated an analogous mechanism leading to the evolution of structurally unusual genes in the genus Ipomoea.

We have recently isolated a structurally unusual CHS-like gene, GCHS2, from Gerbera hybrida (Asteraceae; ref. 15). GCHS2 is ~70% identical to typical CHS genes and the related SS genes at the level of deduced amino acid sequence. Its expression pattern at both organ and cellular levels is not correlated with anthocyanin pigmentation, for which CHS provides the first committed biosynthetic step. Furthermore, the catalytic properties of the corresponding enzyme differ from CHS and SS, although the GCHS2 catalytic reaction and its role in vivo are not yet completely understood.

In this study we show that the GCHS2-like genes in Asteraceae constitute a gene family, whose corresponding amino acid sequences share some consensus residues. Phylogenetic parsimony analysis of (i) the GCHS2 nucleotide sequence, (ii) further GCHS2-like genes screened from a Gerbera cDNA library, (iii) gene fragments amplified from various Asteraceae using GCHS2 family-specific primers, and (iv) CHS superfamily genes isolated from other angiosperms of subclass Asteridae indicates that GCHS2 probably evolved from CHS via a single gene duplication event that occurred before the diversification of Asteraceae. Structural and functional variation observed among the members of the GCHS2 gene family suggests that subsequent diversification has also taken place. A comparison of the catalytic properties of GCHS2 to parsley CHS shows that both substrate specificity and progressivity of catalytic reaction steps have changed during GCHS2 evolution.

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.
Abbreviations: CHS, chalcone synthase; SS, stilbene synthase; 2-ME, 2-mercaptoethanol.
Data deposition: The sequences reported in this paper have been deposited in the GenBank data base (accession nos. X91339-X91345).
†To whom reprint requests should be sent at present address: Department of Biology, New York University, New York, NY 10003.
Downloaded by guest on October 1, 2021

MATERIALS AND METHODS

Plant Material. G. hybrida is a hybrid of two species (G. jamesonii and G. viridifolia) belonging to the tribe Mutisieae (Asteraceae subfamily Cichorioideae; ref. 16). We chose for analysis Leibnitzia (a closely related genus) and Onoseris (a more distantly related genus) from Mutisieae, Taraxacum (tribe Lactuceae) from Cichorioideae, and Dahlia (tribe Heliantheae) from subfamily Asteroideae (16). Mature plants of G. hybrida var. Regina (obtained from Terra Nigra, De Kwakel, The Netherlands) and seedlings of Leibnitzia anandria and Onoseris sagittatis were grown under standard greenhouse conditions. Leaf material of Dahlia sp. and Taraxacum sp. (collected from gardens in Helsinki) was also used as a source of DNA.

Isolation of GCHS17 and GCHS26 from a Genomic λ Library. Nuclear DNA from Gerbera leaves was prepared by the method of Jofuku and Goldberg (17). A genomic library was constructed with LambdaGEM-11 vector (Promega) and was screened using GCHS1-3 cDNA clones as probes (15, 18). Short fragments (181 bp) from the clones were amplified using primers designed from the conserved region of the CHS genes (15), and their sequences were used for classification and designation of the subcloning strategy. Both strands of the GCHS17 and GCHS26 clones were determined using the deletion strategy both manually and with an automated sequence determination system (ALF; Pharmacia; ref. 18). Sequence alignments were done using the CLUSTAL program of the PC/gene package (IntelliGenetics).

Amplification of GCHS2-Like Sequences from Dahlia, Leibnitzia, Onoseris, and Taraxacum. Partially degenerate (inosine containing) primers including restriction enzyme sites were used to amplify fragments from various asteraceous species. 5'-TGCACTCGAGTGA(A/C/G/T)AA(A/G)ACAGC(A/C/G/T)ATAAA(A/G)AA-3' and 5'-ACTGGGATCCACC(A/C/G/T)GGGTG(A/C/G/T)ACCATCCA(A/G)AA-3' corresponding to peptides C(D/E)KTAIKK and FWMVHPGG were used for specific amplification of GCHS2-like genes. For amplification, plant DNA was extracted by the method of Dellaporta et al. (19) and purified in isopycnic CsCl gradients (18). PCR was performed using a "touch-down" strategy: 10 times (94°C 75 s; 50°C 5 min adding -1°C per cycle, slope +22°C, 1°C per 10 s; 72°C 5 min) followed by 31 times (94°C 75 s; 53°C 2 min; 72°C 5 min). To verify that the amplification products did not contain any chimeric artifacts due to recombination among related gene family members (20), a second, independent amplification was performed. After PCR, fragments were separated from primers by gel electrophoresis, purified from agarose, and digested with the corresponding restriction enzyme pair for cloning into plasmid pOK12 (21) for sequence analysis of both strands.

Phylogenetic Analysis of CHS-Like Gene Sequences. Nucleotide sequences representing 20 sequences (corresponding to the amino acid sequence from R11 to S389 in GCHS1) were aligned (using CLUSTAL) based on their deduced amino acid sequences. The aligned sequences (available on request) were subjected to cladistic parsimony analysis using the program PAUP (22). Heuristic search options were random addition of sequences with 100 replicates and subsequent tree-bisection-reconnection branch swapping to generate multiple equally parsimonious trees. Branch support was estimated using parsimony jackknifing with 10,000 replicates (23). The trees were oriented by treating CHS superfamily sequences from taxa of the Asterid I and Asterid II clades of Chase et al. (24) as monophyletic, respectively.

Expression of GCHS26 in Escherichia coli and Analysis of Corresponding Enzymatic Activity. Plasmids pHTT402, pHTT406 (15), and pHTT409 express GCHS2, parsley CHS, and GCHS26 genes from the vector pKKtac (25) in E. coli. In pHTT409, the intron in the genomic clone of GCHS26 was removed with the help of an oligonucleotide spanning the putative joint site of the exons. In these expression constructs the initiation of translation takes place at the plant genes' ATG. The vector without an insert served as the control. For enzyme production, E. coli DH5α cells (26) harboring the expression constructs were grown to OD600 = 0.8-1.0, induced for 1.5 h with 1 mM isopropyl β-D-thiogalactoside at 28°C, pelleted, and stored at -70°C. The enzymatic reaction was analyzed exactly as described in ref. 15. In certain experiments, 2-mercaptoethanol (2-ME) was added to the reaction to inhibit its progressivity.

Production, Purification, and UV Spectrum Analysis of Benzoyl-CoA-Derived Products. The ethyl acetate extracts from a large-scale enzymatic reaction [in 6 × 1 ml of 50 mM Hepes-KOH (pH 7)/1 mM EDTA/100 µM benzoyl-CoA/32 µM malonyl-CoA (2 × 1 ml: 800 nM [14C]malonyl-CoA)/200 µg of dialyzed E. coli protein extract] were evaporated in a vacuum centrifuge and redissolved in 10% methanol in water. This material was subjected to reverse-phase HPLC on a 0.4 × 10 cm LiChrospher 100 RP-18 (5 µm) column (Merck) connected to a Beckman 126 System Gold gradient HPLC pump. Chromatography was performed with a flow of 1 ml/min using a linear gradient of methanol (0-60% in 40 min) in H2O. The eluent was monitored at 220 nm using a Waters 990+ diode array detector. For spectral information, data from 200 to 400 nm were collected at a resolution of 1.4 nm. Furthermore, the radioactivity of each fraction was measured by scintillation counter, and the peak fractions were analyzed by TLC.

RESULTS

Isolation and Characterization of GCHS17 and GCHS26 Clones. Among 3.4 million plaque-forming units screened, eight λ clones hybridizing to GCHS1-3 cDNA probes were isolated. According to the classification based on the amplification of a 181-bp fragment from a conserved region (15), the clones were deduced to represent four different sequences. Two novel genes, GCHS17 and GCHS26, having a continuous reading frame (except for an intron) showed similarity to the GCHS2 gene. The exon/intron boundaries as well as the start and stop codons of the reading frames were deduced based on the general similarity of CHS enzymes at the amino acid sequence level and by comparison to the GCHS2 cDNA sequence. GCHS17 is a truncated clone missing approximately the first 50 codons. It has a 28-bp first exon, a 1638-bp intron, and a 1016-bp second exon. GCHS26 harbors the entire reading frame: a 196-bp first exon, a 483-bp intron, and a 1016-bp second exon.

We compared the deduced amino acid sequences of GCHS17 and GCHS26 to each other and to the other asteraceous CHS superfamily sequences. The sequences form two subgroups: usual CHS-like sequences (with 88-93% intragroup identity) and GCHS2-like sequences (83-84% intragroup identity). The identity between the two groups is 73-77%. Next, we compared these sequences to a CHS consensus sequence that contains the 260 residues identical in nine functionally verified sequences of a wide evolutionary spectrum (15). GCHS2 deviates at 49, GCHS26 at 47, and the truncated GCHS17 at 42 positions. At 18 comparable sites, all GCHS2-like sequences deviate in an identical way from the CHS consensus. At an additional six sites, they deviate in concert from the consensus of GCHS1, CHS from Dendranthema grandiflora, and GCHS3. For comparison, GCHS1 and the CHS from D. grandiflora deviate from the CHS consensus at five positions, whereas GCHS3 deviates at 10 sites. The sequence analysis suggests that, in the Gerbera genome, GCHS2-like genes form a family, at least in the sense that they share some diagnostic characteristics in the primary structure of the corresponding enzymes.

Isolation and Characterization of Gene Fragments from Other Asteraceae Species Using Primers Designed for the GCHS2 Gene Family. To study the distribution of GCHS2-like genes in the Asteraceae, we designed specific primers for PCR amplification based on the GCHS2 diagnostic sites (Fig. 1). Amplification products of expected size were obtained from Leibnitzia, Onoseris, Taraxacum, Dahlia, and Gerbera (for control). In Fig. 1, deduced amino acid sequences corresponding to the amplification products of the four species are shown. Each fragment has a reading frame without stop codons, and the degree of deviation from the CHS consensus is of the same order of magnitude as that of the GCHS2 family of Gerbera, suggesting that the fragments analyzed represent coding regions of CHS superfamily genes. Leibnitzia LACHS1 and Onoseris OSCHS1 sequences share the clear majority of features common to the GCHS2 family. LACHS1 follows the GCHS2 consensus in 13 of 16 comparable sites and OSCHS1 follows this consensus in 11 of 16 comparable sites.

The Taraxacum and Dahlia sequences also share some residues diagnostic for GCHS2-like sequences. TXCHS1 shares four of seven comparable positions with the GCHS2 consensus, DHCHS1 shares three of seven comparable positions, and DHCHS2 shares three of nine comparable positions.

Furthermore, the Taraxacum and Dahlia sequences often deviate from the CHS consensus in the same positions as the GCHS2 family of Gerbera, even if the derived residue is not identical. In contrast, there are several sites that are shared by Taraxacum and/or Dahlia clones and the CHS consensus that deviate from the GCHS2 consensus.

FIG. 1. Alignment of deduced amino acid sequences for CHS superfamily genes of Asteraceae. Three selected regions are shown, corresponding to the sequence ranges of PCR-amplified genes (the forward and reverse primer sites are denoted with hyphens). Positions of final residues per block are marked for each polypeptide. cons., strict consensus of CHS sequences based on 9 sequences from a wide taxonomic spectrum (15). The amino acid residues deviating from this consensus are shown with a black background. Consensus within the GCHS2 family of Gerbera that represents deviation from CHS is indicated by * when compared with all CHS sequences and + against CHS genes of Asteraceae only. Note that in complete sequences (not shown) two asterisked sites lie between the upper two blocks and three others lie after the last block. MUMCHS, CHS of D. grandiflora (27). LACHS1, OSCHS1, TXCHS1, DHCHS1, and DHCHS2 are the amplified fragments of Leibnitzia, Onoseris, Taraxacum, and Dahlia, respectively. These partial sequences vary in size because of cloning aspects; in Taraxacum, a sequence prematurely ending at a BamHI site was isolated, whereas in Dahlia, two BamHI fragments were isolated separately.

Phylogenetic Analysis of Asteraceae and Asterid CHS Superfamily Genes. To study the phylogenetic relationships of the GCHS2 gene family, we performed a parsimony analysis of 19 CHS-like sequences at the nucleotide level (Fig. 2). As recent publications (13, 14) have shown general overviews of CHS superfamily phylogeny, we included only selected CHS-like sequences available for angiosperm subclass Asteridae. These included CHS for Apiaceae, Asteraceae, Convolvulaceae, Scrophulariaceae, and Solanaceae. Based on prior phylogenetic results from the plastid gene rbcL (24), CHS-like sequences from the first two families were expected to appear as one monophyletic branch of Asteridae, whereas the latter three were expected to occur in another. Genes selected included two unusual UV-inducible CHS genes of Petunia (CHSB and CHSG; ref. 9), the CHSB-related Ipomoea sequences of (14), and the GCHS2-like genes of Gerbera and other Asteraceae.

The parsimony analysis yielded 2 equally most-parsimonious trees of 2720 steps with a consistency index of 0.50 and a retention index of 0.52 (31). One of these trees is shown in Fig. 2. The alternative tree differs only in the placement of Leibnitzia and Onoseris GCHS2-like sequences relative to those of Gerbera (Fig. 2). In every case, GCHS2-like sequences are derived from within CHS as sister-group to CHS3. Jackknife branch support values (23) suggest uncontradicted support for the clade comprising all Asteraceae CHS-like sequences and for each of the two major clades containing GCHS2-like sequences (Fig. 2).

GCHS2-like genes appear to be monophyletic in all most-parsimonious trees, but support for this association is weak (Fig. 2). Nevertheless, it is likely that GCHS2 has emerged from CHS as the result of a single gene duplication event, with subsequent differentiation during the evolution of the Asteraceae. This duplication event must have occurred prior to diversification of the Asteraceae because all GCHS2-like genes (of tribes Mutisieae, Lactuceae, and Heliantheae) would be derived from the CHS3 and CHS1 lineages (Fig. 2), both of which include Gerbera (Mutisieae). Phylogenetic analyses of morphological and other molecular data indicate that tribe Mutisieae is the basal-most lineage of cichorioid-asteroid Asteraceae (~23,000 species), with only subfamily Barnadesioideae (92 species) more primitive in the family (16).

Comparison of the Enzymatic Activities of GCHS2 and GCHS26 with Parsley CHS. To investigate the catalytic properties of GCHS26 and compare them with those of the previously analyzed GCHS2 and parsley CHS (15, 32), we cloned the cDNA into the expression vector pKKtac and produced the enzymes in E. coli. A parsley CHS cDNA was used as a reference for CHS function. As a control for the E. coli background, the vector with no insert was also used.

Fig. 3A shows the products of the three enzymes formed with 4-coumaroyl-CoA as a substrate. The initial product of a typical CHS reaction is chalcone, but in vitro most of it is converted nonenzymatically to naringenin in the course of the reactions, and this is the main product observed with parsley CHS in the chromatogram. In contrast, GCHS2 and GCHS26 reveal no formation of naringenin, but produce a faint (and blurred) signal (P0.08; ref. 15) near the start. The radioactivity at the front detected with low-pH extractions probably represents malonic acid liberated from malonyl-CoA by thioesterases in the extracts (15).

Since in our previous study (15) we found that GCHS2 is able to use benzoyl-CoA as a substrate (leading to the accumulation of a product, P0.51) we tested the activity of GCHS26 similarly. Fig. 3B shows that GCHS26 produces two signals
(P0.51 and P0.9) specific for benzoyl-CoA. This indicates that both GCHS2 and GCHS26 differ from typical CHS by their substrate specificity; both enzymes are strikingly inactive with the natural substrate of CHS, 4-coumaroyl-CoA, but both are able to use benzoyl-CoA instead. Furthermore, neither enzyme is able to catalyze the formation of products that are soluble in the organic phase at high pH, which contrasts with parsley CHS (and the two CHS enzymes of Gerbera with both 4-coumaroyl-CoA and benzoyl-CoA as substrates; ref. 15).

FIG. 2. Phylogenetic relationships of GCHS2-like genes within the context of the CHS gene superfamily of angiosperm subclass Asteridae. The tree shown is one of two most-parsimonious topologies differing only in the reversed placements of OSCHS1 (from Onoseris) and LACHS1 (from Leibnitzia). The tree was oriented by treating the two major clades of Asteridae (24) as monophyletic groups. The same orientation was also indicated by midpoint rooting (28). Branch lengths are proportional to numbers of hypothesized nucleotide changes under the accelerated transformation character optimization (29, 30). The scale bar indicates 50 nucleotide changes. Numbers at nodes are parsimony jackknife support values; values close to or greater than 63% may indicate nodes set off by uncontradicted synapomorphies, whereas values between 50% and 63% indicate nodes with some robustness to extra steps (23). Support values under 50% are not shown. All Asteraceae CHS-like sequences form a group supported by 67% of jackknife replicates. GCHS2-like sequences are shown to derive monophyletically from CHS1 and CHS3, but support for this relationship is weak. It is therefore possible that all GCHS2-like genes trace to a single duplication event that took place before the diversification of Asteraceae; DHCHS1 from Dahlia (of the derived subfamily Asteroideae) is embedded within an asteraceous CHS-like clade otherwise dominated by Gerbera and other taxa of tribe Mutisieae, which occupies a primitive position in Asteraceae phylogeny (16). Terminal taxa as labeled: CHS (Antirrhinum: Scrophulariaceae); CHSB (Petunia: Solanaceae); CHSA (Ipomoea: Convolvulaceae); CHSB (Ipomoea: Convolvulaceae); CHSG (Petunia: Solanaceae); CHSJ (Petunia: Solanaceae); CHSA (Petunia: Solanaceae); CHS1 (Lycopersicon: Solanaceae); CHS (Petroselinum: Apiaceae); GCHS1 (Gerbera: Asteraceae); MUMCHS (Dendranthema: Asteraceae); GCHS3 (Gerbera: Asteraceae); TXCHS1 (Taraxacum: Asteraceae); DHCHS1 (Dahlia: Asteraceae); OSCHS1 (Onoseris: Asteraceae); LACHS1 (Leibnitzia: Asteraceae); GCHS26 (Gerbera: Asteraceae); GCHS2 (Gerbera: Asteraceae); GCHS17 (Gerbera: Asteraceae).

To characterize further the product specificity of each enzyme with benzoyl-CoA we purified the major products by HPLC and analyzed their identity based on mobility in TLC (Fig. 3C) and UV absorption spectrum (Fig. 3D). Parsley CHS produces four products (P0.51a, P0.51b, P0.67, P0.82), one of which (P0.51a) is produced by both GCHS2 and GCHS26 and one of which (P0.82) is produced also by GCHS26, but not significantly by GCHS2. Based on this analysis, the two products soluble at high pH are P0.51b and P0.67.

To understand the relationship of the compound shared between GCHS2 and GCHS26 (P0.51a) to the other reaction products, we studied the effect of 2-ME, a known inhibitor of the progressivity of the CHS reaction (33). With parsley CHS, 2-ME decreases the formation of the normal end-product naringenin (probably by disturbing a cysteine residue at the enzyme's active site; ref. 34) and increases the formation of a byproduct, bis-noryangonin, which is a reaction intermediate (33). As a control, we first tested the effect of 2-ME on the reaction with 4-coumaroyl-CoA. As expected, treatment with 100 mM concentration blocked naringenin formation and led to the production of a high mobility signal likely ascribable to liberated malonic acid (Fig. 3E). With benzoyl-CoA as substrate, increasing the amount of 2-ME in the reaction led to reduced formation of the high-pH soluble compounds P0.51b and P0.67, whereas the accumulation of P0.51a (not soluble in the organic phase at high pH) remained high (Fig. 3F). As with 4-coumaroyl-CoA, the high mobility signal became stronger. This signal (and no others) increased even in a second control reaction where malonyl-CoA was the only substrate (Fig. 3F). In conclusion, the effect of 2-ME on the parsley CHS reaction with benzoyl-CoA strongly suggests that P0.51a represents an intermediate of the normal CHS reaction. Indeed, NMR structural analysis of the P0.51 compound indicates the absence of the second aromatic ring typical for the CHS reaction, and therefore incomplete cyclization in the GCHS2 reaction (I. Kilpeläinen, personal communication). This catalytic truncation with substrate change implies a functional deviation from CHS consistent with the virtually unaltered floral pigmentation of Gerbera plants transgenic for an antisense-GCHS2 Agrobacterium construct (35).

DISCUSSION

The GCHS2 gene of G. hybrida has been shown to be a novel member of the CHS gene superfamily, encoding an enzyme
with structural and enzymological properties as well as ontogenetic distribution distinct from CHS and SS (15). Here, we describe the evolutionary and functional relationships of GCHS2 to other genes of the superfamily. The Gerbera genome harbors a family of GCHS2-like genes with at least three members, and amplified fragments of GCHS2-like genes have been obtained from other species of Asteraceae.

FIG. 3. Chromatographic analysis of in vitro CHS reaction products. (A) TLC analysis with 4-coumaroyl-CoA and malonyl-CoA as substrates. 2, GCHS2; 26, GCHS26; P, parsley CHS expressed in E. coli; V, E. coli control (vector); c, 4-coumaroyl-CoA; m, malonyl-CoA substrates of the reactions. The pH at extraction is labeled lo (pH 4) and hi (pH 8.8). (B) TLC analysis with benzoyl-CoA and malonyl-CoA as substrates. b, benzoyl-CoA. (C) TLC analysis of the radioactive fractions purified by HPLC run adjacent to the nonpurified reactions. 2/0.51, 26/0.51, 0.82, P/0.51, 0.51b, 0.67, 0.82, unknown products of GCHS2, GCHS26, and parsley CHS reactions, respectively. (D) UV spectrum of the fractions analyzed in C. (E) TLC analysis of the effect of 100 mM concentration of 2-ME on the parsley CHS reaction. (-) absence/(+) presence of 2-ME. (F) TLC analysis of the effect of increasing concentration of 2-ME on the parsley CHS reaction. 0, 12, 60, and 300 mM concentrations of 2-ME. S, F: start and front of the chromatograms. Nar, position of naringenin; 0.08, 0.51, position of unknown products.

Based on the phylogenetic hypothesis presented (Fig. 2), two strongly-supported
lineages of GCHS2-like genes have emerged from CHS by gene duplication and functional divergence during the evolution of Asteraceae. Although the relationship is only weakly supported, maximum parsimony indicates a monophyletic origin for both of these lineages, suggesting that a single gene duplication gave rise to all GCHS2-like genes before Asteraceae diversified.

Together with the recent results of Tropf et al. (13) and Durbin et al. (14), our study confirms a gene duplication based model that explains how various related functions have arisen from CHS during plant evolution. Based on some common motifs in primary structure and the similar sequential reaction mechanism, it has been suggested that CHS itself shares a common origin with fatty acid synthases of primary metabolism (36, 37). In the reaction of SS, which uses identical malonyl-CoA and 4-coumaroyl-CoA substrates, only the final cyclization step of the reaction is modified (37). In the GCHS2 reaction, as studied here, both the substrate specificity and the progressivity of the reaction have been changed. Altered substrate specificity in a CHS-like enzyme has been reported for acridone synthase, an enzyme in the alkaloid biosynthetic pathway using N-methylanthraniloyl-CoA as substrate (11, 12).

Nevertheless, the truncation of the CHS reaction in GCHS2 catalysis is a novel feature. It suggests that the initially relatively complex CHS reaction has been simplified along the protein evolutionary process, and that this simplification must have been selectively advantageous to have been retained. If the catalytic steps of the CHS reaction (and those of related fatty acid synthases) evolved stepwise over time, then the truncated reaction of GCHS2 may be considered a reversal to a more primitive condition in enzymatic evolution. Truncation of the CHS reaction leading to the accumulation of novel metabolites of secondary metabolism has been hypothesized previously in the context of p-hydroxyphenylbutan-2-one biosynthesis in raspberry (39) and the biosynthetic origin of bis-noryangonins (33). Our present results with the GCHS2 reaction support these untested suppositions and imply that catalytic simplification (i.e., evolutionary reversal), like substrate change, may be a theme recurring in CHS superfamily evolution.

The GCHS2-like sequences from Taraxacum (tribe Lactuceae, subfamily Cichorioideae) and Dahlia (tribe Heliantheae, subfamily Asteroideae) form a well-supported lineage apart from those of the three species of the tribe Mutisieae (subfamily Cichorioideae), which indicates a further divergence in the GCHS2 family along with the diversification of Asteraceae. Additionally, the three GCHS2-like genes of Gerbera clearly differ from one another. GCHS17 and GCHS26 are generally expressed at a lower level than GCHS2, and the strong expression in floral organs typical for GCHS2 is lacking (data not shown). Furthermore, in the comparison of the catalytic properties of GCHS2 and GCHS26, a slightly different product specificity was observed (Fig. 3B). In the future, in combination with biochemical investigation of the function of GCHS2, these studies on the molecular evolution of GCHS2 will lead to a greater understanding of the role, biological significance, and diversity of GCHS2 as a novel enzyme of secondary metabolism in the Asteraceae.

We thank Hans V. Hansen for the Leibnitzia and Onoseris seed material; Neil Courtney-Gutterson for the CHS sequence of D. grandiflora; and James S. Farris, Jaakko Hyvönen, Ilkka Kukkonen, Barbara Meurer-Grimes, Joachim Schröder, Lena Struwe, and Risto Väinölä for valuable discussions or assistance. Eija Holma, Marja Huovila, Päivi Laamanen, and Keijo Virta are acknowledged for their excellent technical assistance. This work was partially funded by the Academy of Finland, The Swedish Natural Science Research Council, and the Lewis B. and Dorothy Cullman Foundation.

1. Swain, T. (1986) in Plant Flavonoids in Biology and Medicine: Biochemical, Pharmacological and Structure-Activity Relationships, eds. Cody, V., Middleton, E., Jr., & Harborne, J. B. (Liss, New York), pp. 1-14.
2. Stafford, H. (1991) Plant Physiol. 96, 680-685.
3. Koes, R. E., Quattrocchio, F. & Mol, J. N. M. (1994) BioEssays 16, 123-132.
4. Niesbach-Klösgen, U., Barzen, E., Bernhardt, J., Rohde, W., Schwarz-Sommer, Z., Reif, H. J., Wienand, U. & Saedler, H. (1987) J. Mol. Evol. 26, 213-225.
5. Fliegmann, J., Schröder, G., Schanz, S., Britsch, L. & Schröder, J. (1992) Plant Mol. Biol. 18, 489-503.
6. Schröder, G., Brown, J. W. S. & Schröder, J. (1988) Eur. J. Biochem. 172, 161-169.
7. Melchior, F. & Kindl, H. (1990) FEBS Lett. 268, 17-20.
8. Sparvoli, F., Martin, C., Scienza, A., Gavazzi, G. & Tonelli, C. (1994) Plant Mol. Biol. 24, 743-755.
9. Koes, R. E., Spelt, C. E., van den Elzen, P. J. M. & Mol, J. N. M. (1989) Gene 81, 245-257.
10. Shen, J. B. & Hsu, F. (1992) Mol. Gen. Genet. 234, 379-389.
11. Baumert, A., Maier, W., Groeger, D. & Deutzmann, R. (1994) Z. Naturforsch. 49, 26-32.
12. Junghans, K. T., Kneusel, R. E., Baumert, A., Maier, W., Gröger, D. & Matern, U. (1995) Plant Mol. Biol. 27, 681-692.
13. Tropf, S., Lanz, T., Rensing, S. A., Schröder, J. & Schröder, G. (1994) J. Mol. Evol. 38, 610-618.
14. Durbin, M. L., Learn, G. H., Jr., Huttley, G. A. & Clegg, M. T. (1995) Proc. Natl. Acad. Sci. USA 92, 3338-3342.
15. Helariutta, Y., Elomaa, P., Kotilainen, M., Griesbach, R. J., Schröder, J. & Teeri, T. H. (1995) Plant Mol. Biol. 28, 47-60.
16. Bremer, K. (1994) Asteraceae: Cladistics and Classification (Timber, Portland, OR).
17. Jofuku, D. & Goldberg, R. B. (1988) in Plant Molecular Biology: A Practical Approach, ed. Shaw, C. H. (IRL, Oxford), pp. 37-66.
18. Sambrook, J., Fritsch, E. F. & Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Lab. Press, Plainview, NY), 2nd Ed.
19. Dellaporta, S. L., Wood, J. & Hicks, J. B. (1983) Plant Mol. Biol. Rep. 1, 19-21.
20. Saiki, R. K., Gelfand, D. H., Stoffel, S., Scharf, S. J., Higuchi, R., Horn, G. T., Mullis, K. B. & Erlich, H. A. (1988) Science 239, 487-491.
21. Vieira, J. & Messing, J. (1991) Gene 100, 189-194.
22. Swofford, D. L. (1993) PAUP: Phylogenetic Analysis Using Parsimony (Illinois Nat. Hist. Survey, Champaign), Version 3.1.
23. Farris, J. S., Albert, V. A., Källersjö, M., Lipscomb, D. & Kluge, A. G. (1996) Cladistics, in press.
24. Chase, M. W., Soltis, D. E., Olmstead, R. G., Morgan, D., Les, D. H., et al. (1993) Ann. Mo. Bot. Gard. 80, 528-580.
25. Takkinen, K., Laukkanen, M.-L., Sizmann, D., Alfthan, K., Immonen, T., Vanne, L., Kaartinen, M., Knowles, J. K. C. & Teeri, T. T. (1991) Protein Eng. 4, 837-841.
26. Hanahan, D. (1983) J. Mol. Biol. 166, 557-580.
27. Courtney-Gutterson, N., Napoli, C., Lemieux, C., Morgan, A., Firoozabady, A. & Robinson, K. E. P. (1994) Bio/Technology 12, 268-271.
28. Farris, J. S. (1972) Am. Nat. 106, 645-668.
29. Farris, J. S. (1970) Syst. Zool. 19, 83-92.
30. Swofford, D. L. & Maddison, W. P. (1987) Math. Biosci. 87, 199-229.
31. Farris, J. S. (1989) Cladistics 5, 417-419.
32. Schüz, R., Heller, W. & Hahlbrock, K. (1983) J. Biol. Chem. 258, 6730-6734.
33. Kreuzaler, F. & Hahlbrock, K. (1975) Arch. Biochem. Biophys. 169, 84-90.
34. Lanz, T., Schröder, G. & Schröder, J. (1990) Planta 181, 169-175.
35. Elomaa, P., Helariutta, Y., Kotilainen, M. & Teeri, T. H. (1996) Mol. Breeding 2, 41-50.
36. Kauppinen, S., Siggaard-Andersen, M. & von Wettstein-Knowles, P. (1988) Carlsberg Res. Commun. 53, 357-370.
37. Verwoert, I. I. G. S., Verbree, E. C., Van der Linden, K. H., Nijkamp, H. J. J. & Stuitje, A. R. (1992) J. Bacteriol. 174, 2851-2857.
38. Kindl, H. (1985) in Biosynthesis and Biodegradation of Wood Components, ed. Higuchi, T. (Academic, New York), pp. 349-377.
39. Borejsza-Wysocki, W. & Hrazdina, G. (1994) Phytochemistry 35, 623-628.