Molecular Phylogenetics and Evolution 44 (2007) 26–41 www.elsevier.com/locate/ympev

Patterns of gene duplication and functional diversification during the evolution of the AP1/SQUA subfamily of MADS-box genes

Hongyan Shan a,b,1, Ning Zhang a,b,1, Cuijing Liu a,b,1, Guixia Xu a,b, Jian Zhang a,b, Zhiduan Chen a,*, Hongzhi Kong a,*

a State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, The Chinese Academy of Sciences, Xiangshan, Beijing 100093, People’s Republic of China
b Graduate University of the Chinese Academy of Sciences, Beijing 100039, People’s Republic of China

Received 19 June 2006; revised 7 February 2007; accepted 19 February 2007
Available online 25 February 2007

Abstract

Members of the AP1/SQUA subfamily of plant MADS-box genes play broad roles in the regulation of reproductive meristems, the specification of sepal and petal identities, and the development of leaves and fruits. It has been shown that AP1/SQUA-like genes are angiosperm-specific and have experienced several major duplication events. However, the evolutionary history of this subfamily is still uncertain. Here, we report the isolation of 14 new AP1/SQUA-like genes from seven early-diverging eudicots and the identification of 11 previously uncharacterized ESTs and genomic sequences from public databases. Sequence comparisons of these and other published sequences reveal a conserved C-terminal region, the FUL motif, in addition to the known euAP1/paleoAP1 motif, in AP1/SQUA-like proteins. Phylogenetic analyses further suggest that there are three major lineages (euAP1, euFUL, and AGL79) in core eudicots, likely resulting from two closely spaced duplication events that predated the divergence of core eudicots. Among the three lineages, euFUL is structurally very similar to FUL-like genes from early-diverging eudicots and basal angiosperms, whereas euAP1 might have originally been generated through a 1-bp deletion in exon 8 of an ancestral euFUL- or FUL-like gene. Because euFUL- and FUL-like genes usually have broad expression patterns, we speculate that AP1/SQUA-like genes initially had broad functions. Based on these observations, the evolutionary fates of duplicate genes and the contributions of the frameshift mutation and alternative splicing to functional diversity are discussed. © 2007 Elsevier Inc. All rights reserved.

Keywords: MADS-box gene; The AP1/SQUA subfamily; Evolution; Gene duplication; Alternative splicing; Frameshift mutation; Subfunctionalization

* Corresponding authors. Fax: +86 10 62590843. E-mail addresses: [email protected] (Z. Chen), [email protected] (H. Kong).
1 These authors contributed equally to this work.
1055-7903/$ - see front matter © 2007 Elsevier Inc. All rights reserved. doi:10.1016/j.ympev.2007.02.016

1. Introduction

Flowers are reproductive structures that characterize angiosperms (Endress, 1994). Molecular and genetic studies of model species, such as Arabidopsis thaliana (Brassicaceae), Antirrhinum majus (Plantaginaceae), and Petunia hybrida (Solanaceae), have indicated that the process of flower formation is controlled by a complex regulatory network of numerous genes and pathways (Soltis et al., 2002; Theissen, 2001; Theissen et al., 1996). Many of the genes in this network encode transcription factors, which can bind to the regulatory region of other genes and activate or repress their expression (de Folter et al., 2005; Kaufmann et al., 2005). Among the regulators involved in floral development, the best understood are the MIKCc-type MADS-box genes, which encode proteins with a conserved K-box domain, a less-conserved intervening region, and a variable C terminus, in addition to the most conserved MADS-box domain (Becker et al., 2003; De Bodt et al., 2003b; Kaufmann et al., 2005).

It has been proposed that floral MADS-box genes can be grouped into several major groups, or subfamilies, which are the results of several gene duplications (Becker and Theissen, 2003; Martinez-Castilla and Alvarez-Buylla, 2003; Nam et al., 2003; Purugganan et al., 1995). Gene duplication, followed by divergence in coding sequence and expression pattern, thus plays key roles in the innovation and diversification of the regulatory network that determines floral morphology. For example, the earliest gene duplication events that occurred within the AP3/PI, AG/STK and SEP subfamilies of MADS-box genes were believed to coincide with the emergence of flowering plants (Aoki et al., 2004; Kim et al., 2004; Kramer et al., 1998, 2004; Stellari et al., 2004; Zahn et al., 2005a, 2006). In angiosperms, key duplication events seem to have also occurred within the AP1/SQUA, AP3, AG, and AGL2/3/4 lineages before the diversification of core eudicots (Aoki et al., 2004; Kim et al., 2004; Kramer et al., 1998, 2004, 2006; Litt and Irish, 2003; Zahn et al., 2005a, 2006). The core eudicots are one of the most successful angiosperm groups, comprising the Gunnerales, Berberidopsidales, Saxifragales, Santalales, Caryophyllales, rosids (incl. Arabidopsis, Medicago, and Populus, etc.), and asterids (incl. Antirrhinum, Petunia, and Gerbera, etc.) (Soltis and Soltis, 2004).

Among the floral MADS-box genes studied to date, members of the AP1/SQUA subfamily are of particular interest to us because they play very important roles in the development of inflorescences and flowers. In Arabidopsis, there are four AP1/SQUA-like genes, i.e., APETALA1 (AP1), CAULIFLOWER (CAL), FRUITFULL (FUL), and AGAMOUS-LIKE 79 (AGL79) (De Bodt et al., 2003a; Parenicova et al., 2003). AP1 was identified as both a floral organ identity gene and a floral meristem identity gene, as strong ap1 mutations cause the conversion of sepals into bract-like structures and the concomitant formation of additional flowers in the axil of each bract, and weak ap1 mutations result in defects in sepals and petals (Bowman et al., 1993; Irish and Sussex, 1990; Mandel et al., 1992). The closest Arabidopsis relative of AP1 is CAL, which was generated through a gene duplication that occurred within the Brassicaceae and acts partially redundantly with AP1 in controlling the formation of floral meristems (Bowman et al., 1993; Kempin et al., 1995; Lawton-Rauh et al., 1999). The cal single mutants do not have obvious abnormalities, but the cal mutation causes an enhancement of the repeated branching pattern seen in the ap1 floral meristem (Bowman et al., 1993; Kempin et al., 1995). The FUL gene encodes a protein with a C terminus quite different from those of AP1 and CAL (Mandel and Yanofsky, 1995). FUL also plays a role in the control of the floral meristem; in the ap1 cal ful triple mutants, all floral meristem characters are lost and no flower is formed (Ferrandiz et al., 2000a). In contrast to CAL, FUL is also expressed in young siliques and growing leaves, suggestive of its roles in fruit and leaf development (Gu et al., 1998; Mandel and Yanofsky, 1995). The function of AGL79 is not known, although its transcripts can be detected in roots (De Bodt et al., 2003a; Parenicova et al., 2003).

AP1/SQUA-like genes have also been identified from other angiosperm species. In general, these genes resemble one of the four Arabidopsis paralogs in sequence features, expression patterns or gene functions. For example, in another model species, Antirrhinum majus, the SQUA gene is important for the specification of floral meristems (Huijser et al., 1992), and DEFH28 plays a dual role during the development of both inflorescences and carpels (Muller et al., 2001). However, this situation seems to be restricted to core eudicots because genes isolated from basal angiosperms (incl. Amborellaceae, Nymphaeaceae, Austrobaileyales, monocots, Ceratophyllaceae, magnoliids, and Chloranthaceae; see Soltis and Soltis, 2004) have broad expression patterns. For example, the Magnolia grandiflora (Magnoliaceae) Ma.gr.AP1 and Eupomatia bennetii (Eupomatiaceae) Eu.be.AP1 genes are both highly expressed in bracts (or calyptra in Eupomatia), tepals, stamens, carpels and leaves (Kim et al., 2005a), and the Nuphar advena (Nymphaeaceae) Nu.ad.AP1 gene is strongly expressed in carpels and leaves, with a small amount of transcripts detected in inner tepals and stamens (Kim et al., 2005b). In rice (Oryza sativa; Poaceae), the OsMADS18 gene is widely expressed in roots, leaves, inflorescences, and all floral organ primordia (Fornara et al., 2004; Masiero et al., 2002).

In addition to the obvious differences in expression patterns, the evolutionary history of AP1/SQUA-like genes also seems to be complicated. Several recent studies indicate that the AP1/SQUA subfamily has experienced frequent gene duplications and the acquisition of novel sequence structures, making it difficult to understand the history of this subfamily (Becker and Theissen, 2003; Johansen et al., 2002; Litt and Irish, 2003; Vandenbussche et al., 2003a). In most previous studies, the resolution and reliability of phylogenetic results are likely affected by taxon limitation and long-branch attraction. For instance, in the most complete study to date, Litt and Irish (2003) reported that FUL-like genes from monocots (mainly grasses) were resolved as two successively branching clades, and the Arabidopsis AtFL gene, which is virtually the same as AGL79, was clustered with FUL-like genes from early-diverging eudicots (mainly and Buxaceae). This pattern suggests that two ancient duplications occurred at an early stage of angiosperm evolution, one before the split of monocots from dicots, and the other before the diversification of eudicots. However, as pointed out by the authors, this pattern could also be an artifact of an uneven substitution rate or biased G–C content. It has been shown that uneven substitution rates and biased G–C content are factors that may skew the results of phylogenetic analyses, especially when the taxa selected are strongly biased (Kong et al., 2004; Leebens-Mack et al., 2005; Zahn et al., 2005a, 2006). This suggests that the evolution of AP1/SQUA-like genes needs to be further examined by including sequences from additional taxa.
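The compositional bias mentioned above can be screened for before any tree is built. The following is a minimal sketch, not part of the original study: the function names, the toy sequences, and the 0.10 tolerance are illustrative assumptions chosen only to show the idea.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of a sequence."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence has no unambiguous bases")
    return sum(base in "GC" for base in counted) / len(counted)


def flag_gc_outliers(seqs: dict[str, str], tolerance: float = 0.10) -> list[str]:
    """Return the names whose GC content differs from the mean by more than `tolerance`.

    The tolerance is an arbitrary, hypothetical threshold for illustration.
    """
    values = {name: gc_content(seq) for name, seq in seqs.items()}
    mean = sum(values.values()) / len(values)
    return sorted(name for name, gc in values.items() if abs(gc - mean) > tolerance)


# Toy sequences (hypothetical, not real AP1/SQUA-like genes).
toy = {
    "taxonA": "ATGGCGCCGGCGGGC",  # GC-rich
    "taxonB": "ATGAAATTAGCATTA",  # GC-poor
    "taxonC": "ATGGCATTAGCGAAA",  # intermediate
}
outliers = flag_gc_outliers(toy)
```

Sequences flagged in this way are not necessarily misplaced in a tree, but they are the ones for which a grouping might reflect shared base composition rather than shared ancestry.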

In this study, we have isolated 14 new AP1/SQUA-like genes from seven species that represent the three major clades of early-diverging eudicots. In addition, by searching the publicly available databases, we identified 11 uncharacterized ESTs and genomic sequences. Phylogenetic analyses of these and other published sequences indicate that two duplication events, which gave rise to the three major, core eudicot-specific gene lineages (i.e., euAP1, euFUL, and AGL79), both occurred before the radiation of core eudicots but after the divergence of the Buxaceae. Furthermore, we found that FUL-like genes from basal angiosperms and early-diverging eudicots share the highest similarity with members of the euFUL lineage in both sequence features and expression patterns. This suggests that, in core eudicots, the functional counterparts of the FUL-like genes from basal angiosperms and early-diverging eudicots are euFULs, rather than euAP1s or AGL79s. This also suggests that both the euAP1 and AGL79 lineages were derived from an ancestral euFUL- or FUL-like gene through two independent duplication events, followed by differential changes in sequence structure, expression pattern and gene function.

2. Materials and methods

2.1. Plant materials and gene cloning

Seven early-diverging eudicots were sampled in this study, i.e., Buxus sempervirens and Pachysandra terminalis from the Buxaceae, Platanus × acerifolia (Platanaceae) and Nelumbo nucifera (Nelumbonaceae) from the Proteales, and Euptelea pleiospermum (Eupteleaceae), Sinofranchetia chinensis and Decaisnea insignis (both Lardizabalaceae) from the Ranunculales. Floral buds and leaf materials of B. sempervirens (N. Zhang 2005001), P. terminalis (N. Zhang 2005002), P. × acerifolia (G. Xu 2005003) and N. nucifera (H. Kong 2005004) were obtained from plants growing in the Botanical Garden of Institute of Botany, the Chinese Academy of Sciences, whereas those of E. pleiospermum (C. Liu 2005005) were from the Beijing Botanical Garden. Materials of S. chinensis (W. Wang SX040) and D. insignis (W. Wang SX200513) were collected from plants growing on the Taibai Mountains, Shaanxi Province, China.

Total RNA of each species was extracted from young inflorescences and floral buds with Plant RNA Reagent (Invitrogen, Carlsbad, CA, USA). Poly(A)+ mRNA was then purified from total RNA using Oligotex mRNA Mini Kit (Qiagen, Hilden, Germany). First strand cDNA was synthesized by SuperScript III Reverse Transcriptase (Invitrogen) with the Poly(T) primer PT (5'-CCGGATCCTCTAGAGCGGCCGC(T)17-3'). Amplification of AP1/SQUA-like genes was performed by the following procedure. First, PCR amplification was carried out with the AP1/SQUA gene-specific degenerate primer AP1F1 (5'-CTSAAGAARGCKCAYGAGATC-3') and the adapter primer PTAP (5'-CCGGATCCTCTAGAGCGGCCGC-3'). The amplification program was started at 94 °C for 4 min, followed by 10 cycles of "94 °C for 30 s, 48 °C for 30 s, and 72 °C for 1 min", and 25 cycles of "94 °C for 30 s, 52 °C for 30 s, and 72 °C for 1 min". The program was terminated at 72 °C for 10 min. Then, a second AP1/SQUA gene-specific degenerate primer, AP1F2 (5'-GAGATCTCYGTSCTCTGYGAYG-3'), and PTAP were used to amplify the PCR products obtained in the first step. PCR conditions were the same as those used in the first step except that the cycle number was 35 and the annealing temperature was set to 52 °C. For the EUplFL2 gene, only the first step was performed, with the forward primer AP1F3 (5'-CGRCARCTGACSTTCTSCAARCG-3') and PTAP. Amplified fragments, if over 750 bp in length, were then cloned into pGEM-T Easy Vector (Promega, Madison, WI, USA). For each species, more than 50 positive clones were sequenced or digested with restriction enzyme(s), or both. At least three independent clones were sequenced for every identified locus. Complete cDNA sequences of SIchFL2, EUplFL1, and EUplFL2 were obtained through 5' RACE (rapid amplification of cDNA ends).

To determine the mechanism by which two different transcripts of the S. chinensis SIchFL2 gene were generated, genomic DNA of SIchFL2 was amplified by using the forward primer AP1F4 (5'-CGCCATCCTTCCTACTATCAC-3') in combination with the reverse primer AP1R1 (5'-CGCATATTTGTCTATGCATG-3'). According to the exon/intron structure of the AP1/SQUA-like genes, the binding sites for these two primers were presumably located in exons 7 and 8, respectively, and thus the amplified fragments should cover the 3' portion of exon 7, the entire intron 7, and the 5' portion of exon 8. The PCR program used for this amplification included an initial denaturation at 94 °C for 4 min, followed by 35 cycles of "94 °C for 30 s, 52 °C for 30 s, and 72 °C for 2 min", and a final extension at 72 °C for 10 min. The amplified fragments were then cloned and sequenced as described above.

2.2. Data retrieval and sequence alignment

In addition to the new genes obtained in this study, AP1/SQUA-like genes were also retrieved from previously published studies and other publicly available databases using BLAST searches. During the BLAST searches, multiple AP1/SQUA-like genes from different lineages were used as queries. Sequences from the same species were regarded as possible alleles if they were >95% identical at the DNA level (Zhang et al., 2001); only the single species (from each ) for which the best data were available was included. Other alleles, as well as the sequences with poor quality, were excluded from further analyses. To keep the computational burden manageable, only one best-studied species from the same genus was included, and thus the number of AP1/SQUA-like genes was reduced from ca. 200 to 135. Two AGL6-like and two SEP-like genes were also included, because previous phylogenetic analyses have indicated that both the AGL6 and SEP subfamilies are the closest relatives of the AP1/SQUA subfamily (Becker and Theissen, 2003; Litt and Irish, 2003; Nam et al., 2004; Zahn et al., 2005a).

Protein sequences were first aligned with CLUSTALX 1.83 (Thompson et al., 1997) and then adjusted manually using Genedoc (Nicholas and Nicholas, 1997). Since some parts of the C-terminal region are too divergent to be aligned, we produced a preliminary tree on the basis of the conserved M-, I-, and K-domain regions, and then adjusted the order of the sequences according to their phylogenetic placements. When closely related sequences were placed together, additional conserved regions became apparent. For this reason, after manual adjustments in the I and most C regions, we generated a second tree using all confidently aligned regions, and rearranged the order of sequences to align the remaining alignable regions. Following this procedure, we finally obtained a relatively reliable alignment for AP1/SQUA-like genes. A DNA version of this alignment was also generated with the help of the publicly available software aa2dna (http://www.bio.psu.edu/People/Faculty/Nei/Lab/software.htm).

2.3. Phylogenetic estimates

Phylogenetic analyses were conducted for both protein and DNA alignments. However, not all residues/sites were used because a few regions in the alignments are too divergent. More importantly, because a frameshift mutation may have happened at the 3' end of an ancestral euFUL- or FUL-like gene to generate the euAP1 ancestor, the C-terminal ends of euAP1 proteins are no longer homologous to those of euFUL or FUL-like proteins (Litt and Irish, 2003; Vandenbussche et al., 2003a). During phylogenetic estimates, this non-homologous region must be excluded. In addition, because most of the previous phylogenetic analyses were based on the conserved M-, I-, and K-domain regions, we generated two matrices (one for proteins and the other for DNAs) that include these regions. In this study, they were named the "AP1_P_MIK" and "AP1_D_MIK" matrices, respectively. Furthermore, since the inclusion of relatively conserved C-terminal residues can greatly improve the resolution and reliability of phylogenetic analyses (Zahn et al., 2005a, 2006), we generated two more matrices, called "AP1_P_good" and "AP1_D_good", to include the conserved MIK regions and the C-terminal residues with quality scores higher than 12. The quality score for each residue was estimated in CLUSTALX 1.83 (Thompson et al., 1997).

Phylogenetic analyses for each matrix were carried out using maximum parsimony (MP) and maximum likelihood (ML) methods in PAUP* version 4.0b10 (Swofford, 2002) and PHYML version 2.4 (Guindon and Gascuel, 2003), respectively. For MP analyses, 1000 replicates of random stepwise addition with tree bisection-reconnection (TBR) branch swapping were performed by using heuristic searches, with all most parsimonious trees saved at each replicate (MulTree on). Support for each branch was assessed using bootstrap analyses with 250 bootstrap replicates, each with 50 stepwise additions and TBR branch swapping. For the ML analyses of protein matrices in PHYML, the default JTT model was used, with the proportion of invariable sites and the gamma distribution parameter optimized automatically and with a BIONJ tree applied as a starting point. For the ML analyses of DNA matrices, the most appropriate model, GTR+I+Γ, was estimated by running MODELTEST version 3.06 (Posada and Crandall, 1998). Then, in PHYML, we specified all parameters needed and started to search for the best tree. To determine the reliability of PHYML, we also performed a similar ML analysis for the "AP1_D_good" matrix in PAUP. Our results indicate that the trees generated by the two methods are virtually identical, suggesting that PHYML works as well as PAUP for our data but with much higher speed. For this reason, all ML estimates in this study, including bootstrap analyses (1000 replicates for DNA matrices and 500 replicates for protein matrices), were performed with PHYML.

2.4. Expression studies

To detect the expression of AP1/SQUA-like genes from early-diverging eudicots, RT-PCR was performed with RNAs from young and developing leaves as well as female and male floral buds of B. sempervirens and E. pleiospermum. In addition, to ensure the reliability of the experimental system, we examined the expression patterns of the four Arabidopsis AP1/SQUA-like genes in developing roots, leaves, inflorescence stems, flowers, and siliques. The methods used to isolate total RNA, to purify mRNA, and to synthesize the first strand cDNA were the same as those used for gene cloning (see Section 2.1). Then, gene-specific primers were used to detect gene expression, with the ACTIN gene used as a control. The PCR amplification program included an initial denaturation at 94 °C for 3 min, 30 cycles of "94 °C for 30 s, 50–54 °C for 30 s and 72 °C for 30 s", and a final extension at 72 °C for 10 min. PCR products were fractionated in a 1.2% agarose gel and photographed.

3. Results

3.1. Sequences of AP1/SQUA homologs

Fourteen new AP1/SQUA-like genes (GenBank Accession Nos. DQ656553–DQ656566) were isolated in this study, i.e., BUseFL1, BUseFL2, and BUseFL3 from Buxus sempervirens, PAteFL1 and PAteFL3 from Pachysandra terminalis, PLacFL1 and PLacFL2 from Platanus × acerifolia, NEnuFL1 from Nelumbo nucifera, EUplFL1 and EUplFL2 from Euptelea pleiospermum, DEinFL1 and DEinFL2 from Decaisnea insignis, and SIchFL1 and SIchFL2 from Sinofranchetia chinensis. Among these genes, SIchFL2 seems to have two different transcripts: the first is represented by the majority (19/26 or 73%) of the sequenced clones and encodes a typical FUL-like protein, while the second (7/26 or 27% of sequenced clones) contains a 13-bp deletion in the C-terminal region and thus encodes a FUL-like protein but with a euAP1-type C-terminal region. A comparison of these transcripts and their corresponding genomic DNA suggests that the two different transcripts were actually derived from an alternative splicing event: the 12th and 13th sites (A and G) in exon 8 have been erroneously recognized as the splice acceptor site (AG) at the end of intron 7 (Fig. 1).

Our BLAST searches found numerous AP1/SQUA-like genes, most of which have been published or characterized. In addition, we noticed that a number of published sequences contain ambiguous sites in coding regions. Some DNA sequences differ from closely related sequences by one to a few nucleotides in ways that affect the reading frame. For example, the NCBI database has two different versions of the tomato TDR4 gene: X60757 (Pnueli et al., 1991) and AY098732 (Busi et al., 2003). At the DNA and protein levels, these two sequences are 98% and 86% identical to each other, respectively. A close inspection of the two sequences indicated that their M, I, and K regions are 99% identical. Within the C-terminal region, the two sequences are also nearly identical except for a few 1-bp indels in the DNA sequences that resulted in striking differences in the inferred protein sequences. This suggests that the two sequences probably represent the same gene, but with some sequencing errors in one of them. Additional information from the tomato EST database indicates that while AY098732 agrees with many ESTs, X60757 does not match any EST, suggesting that the latter sequence has sequencing errors. LeFUL1 (AY306155; Litt and Irish, 2003) is also a synonym of TDR4, because it is identical to TDR4 at both DNA and protein sequence levels but with incomplete 5' and 3' ends.

In addition to the published sequences, our BLAST searches found many ESTs, unigenes and genomic sequences that correspond to AP1/SQUA-like genes. To facilitate description and, most importantly, to better understand the evolution of the AP1/SQUA subfamily, we have designated names for them according to their phylogenetic positions and sequence structures. They are: PEamFL1 (FGP unigene No. 4-209453) from Persea americana (Lauraceae), AQfoFL1 (TC14597) from Aquilegia formosa × A. pubescens (Ranunculaceae), LYesFL1 (BT013126) and LYesFUL2 (TC162618) from Lycopersicon esculentum (Solanaceae), MEtrFL1 (TC104631) from Medicago truncatula (Fabaceae), GLmaFUL (TC208789) from Glycine max (Fabaceae), MEcrFUL (TC6568) from Mesembryanthemum crystallinum (Aizoaceae), GOraFUL (TC29051) from Gossypium raimondii (Malvaceae), and POtrFL1 (TC22128), POtrFL2 (TC24080) and POtrFUL (TC39414) from Populus trichocarpa (Salicaceae). The latter three genes are from the newly sequenced poplar genome and thus have predicted exon/intron information.

Alignment of the amino acid sequences of AP1/SQUA-like genes indicated that overall the M- and K-domains are highly conserved, with many positions nearly invariant (data not shown). Within the variable I region, substantial variation exists, but the total number of residues remains nearly unchanged. Within the divergent C-terminal region, there are two short, relatively conserved regions; one of them is the previously
described paleoAP1/euAP1 motif (Litt and Irish, 2003; Vandenbussche et al., 2003a) and the other we have named the FUL motif because it is present in all members of this subfamily (Fig. 2). Both the FUL and paleoAP1/euAP1 motifs contain many hydrophobic and polar residues, and correspond in position to the AG I and AG II motifs of the AG proteins, the PI-derived and AP3 motifs of the AP3 proteins, and the SEP I and SEP II motifs of the SEP proteins (Kramer et al., 1998, 2004; Zahn et al., 2005a). In contrast to the FUL motif, however, the paleoAP1 and euAP1 motifs never coexist: the second C-terminal motif of an AP1/SQUA protein is either paleoAP1 or euAP1. The euAP1 motif seems to be core eudicot-specific and not found in proteins from early-diverging eudicots or basal angiosperms, while the paleoAP1 motif is present in all major angiosperm lineages.

Fig. 1. Alternative splicing of the Sinofranchetia chinensis SIchFL2 gene. (a) Sequence of the normal transcript, which codes for a typical FUL-like protein, with the paleoAP1 motif at the 3' end. (b) The corresponding genomic sequence, with the 3' end of exon 7, the 5' and 3' ends of intron 7, and the 5' end of exon 8. (c) Sequence of the shorter transcript. Because the 12th and 13th nucleotides in exon 8, "A" and "G", are erroneously recognized as the end of intron 7, this transcript has lost the first 13 sites of exon 8 and thus codes for a modified FUL-like protein with the euAP1 motif.

Fig. 2. Alignment of the C-terminal region of AP1/SQUA-like proteins. Three highly conserved motifs are boxed. The FUL motif is present in all AP1/SQUA-like proteins, whereas the paleoAP1 and euAP1 motifs are lineage specific. In core eudicots, members of the euAP1 lineage always possess the euAP1 motif, while those of the euFUL and AGL79 lineages generally contain the paleoAP1 motif. Genes from basal angiosperms and early-diverging eudicots all code for FUL-like proteins and have the paleoAP1 motif. The arrow indicates the position of intron 7. Note that due to frameshift mutation, the 3' ends of euAP1 proteins (shaded) are no longer homologous to those of euFUL, AGL79, or FUL-like proteins.

3.2. Two very close duplications before the diversification of core eudicots

The eight different phylogenetic analyses of the AP1/SQUA-like sequences (four matrices, each analyzed with MP and ML) give essentially identical results. However, trees based on the matrices with more characters tend to have greater resolution and internal support. In all our analyses, the relationships among AP1/SQUA-like genes are largely consistent with the angiosperm phylogeny (Fig. 3). The single gene from the basalmost angiosperm Nuphar advena always holds the basalmost position, followed by genes from magnoliids, monocots, Chloranthaceae, early-diverging eudicots, and core eudicots. In early-diverging eudicots, genes from each of the Buxaceae, Proteales, and Ranunculales form separate well-supported clades, although the relationships among these clades are still uncertain. Within core eudicots, three distinct gene lineages are evident, each of which contains genes from rosids, asterids, Caryophyllales, and Saxifragales, suggesting that the two gene duplication events giving rise to these three clades predated the diversification of core eudicots. Note that the euAP1 and euFUL lineages in our study are essentially the same as those identified in a previous study (Litt and Irish, 2003), whereas the third lineage corresponds to the previously designated "core eudicot FUL-like" clade (Litt and Irish, 2003), plus the Arabidopsis AGL79 gene. However, because the core eudicot euFUL genes, rather than AGL79-like genes, show the highest similarity to the FUL-like genes from early-diverging eudicots and basal angiosperms in sequence structure (see below), we prefer to designate this core eudicot-specific group as the AGL79 clade, rather than the FUL-like clade.

Fig. 3. Maximum-likelihood tree of 135 AP1/SQUA, 2 SEP, and 2 AGL6 genes. This tree was based on the analysis of the nucleotide matrix "AP1_D_good", with bootstrap supports higher than 50% indicated for each node. Two arrows point to the two major gene duplication events prior to the diversification of core eudicots, which might have happened very closely in time because trees with euAP1, euFUL, and AGL79 resolved as the sister of the other two lineages are not significantly different from each other, with -Ln likelihood = 41413.53840, 41414.49846, and 41415.47746, respectively. In monocots, the two major, successive duplication events are highlighted by stars. Small-scale duplication events are indicated with dots.

It is worth noting that although the three major lineages of the core eudicot AP1/SQUA-like genes are moderately supported, the relationships among them are still uncertain (Fig. 3). In all our analyses, euAP1 is resolved as the sister to euFUL and AGL79, but with bootstrap values lower than 50%. In the MP analyses of the "AP1_D_good" matrix, we found three most parsimonious trees of 10198 steps, all of which support the close affinity between euFUL and AGL79. However, trees with AGL79 constrained to be sister of euAP1 and euFUL, and those with euFUL forced to be sister of euAP1 and AGL79, are only two and four steps longer than the most parsimonious trees, respectively, suggesting that the alternative hypotheses could not be rejected. Similarly, in the ML analyses, trees with euAP1, euFUL or AGL79 resolved as the sister of the other two lineages are not significantly different from each other. All this suggests that the two duplication events giving rise to the three major core eudicot-specific gene lineages may have occurred very closely in time.

3.3. Other duplications

In addition to the aforementioned duplication events, our analyses reveal many recent duplication events that are specific to a particular organismal lineage (one or a few closely related species) (Fig. 3). For example, the co-existence of AP1 and CAL was only documented in the Brassicaceae, and the duplication event that created the Petunia PFG and FBP26 genes might have happened within the Solanaceae. The Populus trichocarpa genome contains two euAP1-like genes (PTAP1_1 and PTAP1_2) and two AGL79-like genes (POtrFL1 and POtrFL2) that were the results of two independent within-family or even within-genus duplication events. A similar situation was also found for paralogous genes from Crocus sativus (Iridaceae), Elaeis guineensis (Arecaceae), Peperomia caperata (Piperaceae), Euptelea pleiospermum, Decaisnea insignis, Sinofranchetia chinensis, Ranunculus bulbosus (Ranunculaceae), Papaver somniferum (Papaveraceae), Platanus × acerifolia, Buxus sempervirens, Pachysandra terminalis, Phytolacca americana (Phytolaccaceae), Eucalyptus globulus (Myrtaceae), and Helianthus annuus (Asteraceae). The wide distribution of recently duplicated genes in many plant groups suggests that the evolution of the AP1/SQUA subfamily is a rather dynamic process, with new genes frequently recruited into the genome. However, it is still not clear whether these recently duplicated copies will stay in the genome permanently, or will eventually become pseudogenes.

Our analyses also revealed two relatively major duplication events in monocots (Fig. 3). The first event may have occurred before the split of Asparagales and commelinids (including the grasses), because both descendant lineages (i.e., the OsMADS14/15 and OsMADS18 clades) have genes from the orchids Dendrobium thyrsiflorum and Phalaenopsis amabilis (Orchidaceae), the palm E. guineensis, the spiderwort Tradescantia virginiana (Commelinaceae), and grasses. The second event, which gave rise to the OsMADS14 and OsMADS15 lineages, seems to have occurred more recently, because both clades only contain genes from grass species. The observation that genes from E. guineensis and T. virginiana hold positions outside the OsMADS14 and OsMADS15 clades further suggests that the duplication postdated the divergence of the Poales from other commelinids.

3.4. Conservation and divergence in sequence characteristics among clades

Sequence comparisons showed that although proteins of AP1/SQUA-like genes are highly conserved in the M- and K-domain regions, considerable variation exists in the I and C-terminal regions. Interestingly, the C-terminal motifs are very informative and can act as important markers by which the core eudicot genes belonging to the three major clades are distinguished (Fig. 2). For example, the euFUL proteins usually possess a typical FUL motif and a paleoAP1 motif, whereas the euAP1 proteins always contain a typical FUL motif and a euAP1 (rather than paleoAP1) motif. The AGL79 proteins are similar to euFUL-like proteins in having both FUL and paleoAP1 motifs, but the FUL motif in these proteins is no longer typical and contains additional residues. The fact that AP1/SQUA-like proteins from early-diverging eudicots and basal angiosperms all possess a typical FUL motif and a paleoAP1 motif suggests that the euFUL genes, rather than the euAP1 or AGL79 genes, may be structurally and functionally more similar to FUL-like genes from basal lineages. The relationships among the three core eudicot-specific gene lineages, which are (euAP1,(euFUL, AGL79)), further imply that the ancestral euAP1 and AGL79 genes may have been generated from a euFUL- or FUL-like gene through two independent duplication events.

The conservation and divergence of AP1/SQUA-like genes are also evident in the exon/intron structures, because genes from distantly related taxa and lineages all possess eight exons and seven introns (data not shown). In addition, we found that although the length of each intron varies considerably among genes, the size of each exon remains largely unchanged even when distantly related homologs are compared. This is particularly true for exons 1–6, which are extremely conserved in both length and sequence characteristics. However, for exons 7 and 8, which code for only three amino acids of the K domain and the whole C terminus, noteworthy variations exist. For instance, the length of exon 7 varies from 104 (in Arabidopsis FUL) to 164 bp (in rice OsMADS15), whereas that of exon 8 varies from 97 (in poplar PTAP1_1) to 127 bp (in rice OsMADS15). Moreover, due to frequent insertions and deletions, the alignment within exons 7 and 8 has been relatively difficult unless closely related genes are compared. Nevertheless, we noticed that both exons contain highly conserved fractions and thus are homologous for genes from different lineages.
The fact that members of the euAP1 clade can be aligned very well with all other AP1/SQUA-like genes when a single gap was introduced into exon 8, probably after the fourth codon (Fig. 2), further supports the idea that the ancestral euAP1 gene was created through a frameshift mutation that happened at a position prior to the paleoAP1 motif of a euFUL- or FUL-like gene.

Fig. 4. Expression pattern of AP1/SQUA-like genes revealed by the RT-PCR method. (a) Arabidopsis thaliana: R, roots; L, leaves; I, inflorescence stems; F, floral buds; and S, siliques. (b) Buxus sempervirens: L, leaves; M, male flowers; F, female flowers. (c) Euptelea pleiospermum: L, leaves; M, male flowers; F, female flowers. ACTIN is used as a positive control.

3.5. Expression patterns

Our RT-PCR results indicate that the five genes from the two early-diverging eudicots, i.e., BUseFL1, 2 and 3 from B. sempervirens, and EUplFL1 and 2 from E. pleiospermum, are mostly expressed in both flowers and leaves (Fig. 4), and that paralogous genes from the same species tend to display similar but slightly different expression patterns. For example, BUseFL1 and BUseFL2 are mainly expressed in leaves and male flowers, respectively, while BUseFL3 is highly expressed in all tissues examined (Fig. 4b). In E. pleiospermum, EUplFL1 is only expressed in leaves, whereas weak expression of EUplFL2 was detected in both leaves and male flowers (Fig. 4c). However, because the primers used in the analyses do not necessarily work equally well for all genes, our results may not reflect the real endogenous transcript levels. Nevertheless, the fact that our RT-PCR results for the four Arabidopsis genes (Fig. 4a) are in good agreement with the published RT-PCR, mRNA in situ hybridization and microarray results (De Bodt et al., 2003b; de Folter et al., 2005; Parenicova et al., 2003; Zhang et al., 2005) strongly suggests that the expression patterns indicated in our study are relatively reliable. In particular, the differences in the expression level of the same gene in different tissues may roughly reflect the differences in the roles that the gene plays in different tissues.

To obtain additional clues to the evolution of AP1/SQUA-like genes, we have mapped the expression pattern of each gene (if available) onto the phylogenetic tree (Fig. 5, Table S1 in Supplementary Materials). Our results indicate that, while variation does exist, most genes are expressed in both vegetative and reproductive organs, suggesting that AP1/SQUA-like genes generally play roles in both vegetative and reproductive development, as is known for the Arabidopsis FUL, Petunia FBP29, and rice OsMADS18 genes. For many genes, the highest level of transcripts was detected in inflorescences (and florets in grasses), suggesting that they are especially important for the formation of inflorescence or floral meristems. In particular, in tissues from surveyed species, AP1/SQUA-like transcripts were detected in nearly all sampled inflorescence and floral meristems, as well as three quarters of all sepals, petals, and fruits, two-thirds of all carpels, half of all leaves, and one-third of all stamens, but were rarely detected in roots. In contrast to the general pattern, transcripts of the euAP1 lineage members were never detected in roots or leaves, and only very rarely detected in stamens, suggesting that a shift in expression pattern may have occurred after the divergence of the euAP1 clade from the euFUL/AGL79 clade. By contrast, FUL-like genes from basal angiosperms and basal eudicots, as well as members of the core eudicot-specific euFUL and AGL79 lineages, are generally expressed in both vegetative and reproductive organs, implying that the ancestor of the AP1/SQUA subfamily may have had broad functions.
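The single-gap argument of Section 3.4 can be illustrated with a toy example. The sketch below is only an illustration under stated assumptions: the 27-nt "ancestral" sequence is invented (it is not a real exon 8 sequence), and the helper names `translate` and `delete_base` are hypothetical. It deletes one nucleotide immediately after the fourth codon, shows that the first four residues are unchanged while every downstream residue differs, and then shows that reintroducing a single gap restores a perfect nucleotide alignment.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(seq):
    """Translate in frame 0, ignoring any trailing partial codon."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def delete_base(seq, index):
    """Return seq with the nucleotide at the 0-based position `index` removed."""
    return seq[:index] + seq[index + 1:]


# Invented nine-codon stretch (hypothetical; NOT taken from any AP1/SQUA-like gene).
ancestral = "ATGGCTGAACCGCTGCAAGGTCATCCA"
# Delete the first nucleotide after the fourth codon (0-based index 12).
derived = delete_base(ancestral, 12)

print(translate(ancestral))  # MAEPLQGHP
print(translate(derived))    # MAEPCKVI (same first four residues, new C terminus)

# Reintroducing one gap at the deletion site realigns the nucleotides exactly.
gapped = derived[:12] + "-" + derived[12:]
mismatches = sum(a != b for a, b in zip(ancestral, gapped) if b != "-")
print(mismatches)  # 0
```

The point of the toy case is that the two sequences look unrelated at the protein level downstream of the deletion, yet differ by a single gap at the nucleotide level, which is the kind of signal the alignment in Fig. 2 relies on.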

Fig. 5. Summary of the expression pattern for AP1/SQUA-like genes in a phylogenetic context. Genes with known expression in roots (R), leaves (L), inflorescence and floral meristems (M), sepals (S), petals (P), stamens (St), carpels (C), and fruits (F) are indicated. Black and grey bars denote high- and low-level expression, respectively, whereas open boxes mean no expression.
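A summary of the kind shown in Fig. 5 can be tallied per organ once the expression calls are coded numerically. The following is a minimal sketch, not the procedure used in this study: the gene names and scores are invented placeholders, the coding (2 = high, 1 = low, 0 = not detected, None = not surveyed) is an assumption that mirrors the black, grey, and open symbols of the figure, and `detection_fraction` is a hypothetical helper.

```python
ORGANS = ["R", "L", "M", "S", "P", "St", "C", "F"]

# Invented placeholder data; 2 = high, 1 = low, 0 = not detected, None = not surveyed.
toy_table = {
    "geneA": {"R": 0, "L": 2, "M": 2, "S": 1, "P": 1, "St": 0, "C": 2, "F": 2},
    "geneB": {"R": 0, "L": 0, "M": 2, "S": 2, "P": 2, "St": 0, "C": 0, "F": None},
    "geneC": {"R": None, "L": 1, "M": 2, "S": 0, "P": 0, "St": 1, "C": 1, "F": 1},
}


def detection_fraction(table):
    """For each organ, return the fraction of surveyed genes with any detected expression."""
    fractions = {}
    for organ in ORGANS:
        scores = [row[organ] for row in table.values() if row[organ] is not None]
        # Organs that were never surveyed have no defined fraction.
        fractions[organ] = sum(s > 0 for s in scores) / len(scores) if scores else None
    return fractions


fractions = detection_fraction(toy_table)
for organ in ORGANS:
    print(organ, fractions[organ])
```

Excluding unsurveyed organs from the denominator matters here, because expression data are available for different sets of tissues in different species.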

4. Discussion

4.1. An improved phylogeny of the AP1/SQUA subfamily

Phylogenetic relationships among AP1/SQUA-like genes have been reported in several studies (Becker and Theissen, 2003; Johansen et al., 2002; Litt and Irish, 2003). However, all those studies revealed some strange phenomena that could not be easily interpreted based on our current knowledge of angiosperm evolution. Coincidentally, in those studies, the taxa sampled were quite unbalanced, with the vast majority of sequences from either grasses or core eudicots; sequences from phylogenetically important taxa, such as basalmost angiosperms, non-grass monocots, and early-diverging eudicots, were usually missing. This led us to suspect that the unreasonable placements of some AP1/SQUA-like genes were possibly caused by long-branch attraction. In this study, by adding new AP1/SQUA-like genes from phylogenetically important taxa and conducting extensive MP and ML analyses on both protein and DNA sequences, we were able to clarify the relationships among AP1/SQUA-like genes. Genes from monocots no longer form successively branching clades. Instead, they are clustered into a strongly supported monophyletic group, suggesting that all monocot genes were actually derived from a single ancestor, and that the pre-monocot gene duplication proposed in previous studies was not supported. Similarly, in core eudicots, we observed three distinct lineages (euAP1, euFUL and AGL79) that were probably generated through two very close duplication events before the diversification of core eudicots but after the split of the Buxaceae and core eudicots. Because our results are in good accordance with the currently understood relationships of the angiosperm phylogeny, it is very likely that the effects of long-branch attraction have been reduced in our analyses. The addition of phylogenetically important taxa improved the resolution of phylogenetic analyses.

4.2. Duplications within the AP1/SQUA subfamily coincide with the occurrence of plant groups

The major duplication events revealed in the AP1/SQUA subfamily also coincide with the duplication events documented in the AP3, AG and SEP subfamilies (Aoki et al., 2004; Kim et al., 2004; Kramer et al., 1998, 2004, 2006; Zahn et al., 2005a, 2006). In particular, since members of all these gene lineages play critical roles in the development of inflorescence and floral structures, it has been suggested that the pre-core eudicot duplication events in these subfamilies have contributed to the elaboration of core eudicot flowers and the diversification of core eudicots (Irish, 2003; Kramer and Hall, 2005; Zahn et al., 2005a). Core eudicots have three advantageous floral characteristics (or “key innovations”): the whorled arrangement of floral organs, the fixation of organ number in each whorl, and the differentiation of perianth into sepals and petals. Among these, the third requires the functions of AP1/SQUA-, AP3-, and SEP-like genes (Bowman et al., 1993, 1989; Ditta et al., 2004; Ferrandiz et al., 2000a; Flanagan and Ma, 1994; Honma and Goto, 2001; Irish and Sussex, 1990; Mandel and Yanofsky, 1998; Pelaz et al., 2000, 2001a,b; Savidge et al., 1995). Many core eudicot AP1/SQUA- and SEP-like genes also function as meristem identity genes and play key roles in the maintenance of inflorescence and floral meristems (Ditta et al., 2004; Ferrandiz et al., 2000a,b; Huijser et al., 1992; Immink et al., 1999; Muller et al., 2001; Pelaz et al., 2000; Vandenbussche et al., 2003b), suggesting that the occurrence of the duplication events in each of the AP1/SQUA and SEP lineages may have contributed to the diversification and elaboration of the inflorescence and floral structures of core eudicots.

It has been proposed that the pre-core eudicot duplications within each of the AP1/SQUA, AP3, AG, and SEP subfamilies can be explained by a single polyploidy event or several independent duplication events (Zahn et al., 2005a). Indeed, on the basis of genomic sequence analysis, several studies have claimed the identification of a pre-core eudicot, genome-wide duplication event in angiosperm evolution (Blanc et al., 2003; Vision et al., 2000). However, this hypothesis was not supported in other studies (Blanc and Wolfe, 2004; Bowers et al., 2003; De Bodt et al., 2005), suggesting that the polyploidy hypothesis may not be valid. In addition, unlike AP3- and AG-like genes, which each have experienced only one major gene duplication event before the diversification of core eudicots, both AP1/SQUA- and SEP-like genes have experienced two gene duplication events at that point (Zahn et al., 2005a). This suggests that independent gene duplication events must have happened, because the three core eudicot-specific lineages within each of the AP1/SQUA and SEP subfamilies could not be produced through a single genome-wide duplication event.

4.3. Conservation and divergence in the structure and function of duplicate genes

It is becoming clear that, after gene duplication, one or both duplicate genes can accumulate considerable mutations within the coding or regulatory regions (Force et al., 1999; Lynch and Force, 2000; Moore and Purugganan, 2005). Consequently, the duplicate genes, if not silenced (nonfunctionalization), may either acquire novel functions (neofunctionalization), or perform part of the original functions (subfunctionalization). In plants, several studies have found evidence of neofunctionalization and subfunctionalization, as well as overlapping redundancy, for floral MADS-box genes (Duarte et al., 2006; Kramer et al., 2004; Pelaz et al., 2000; Pinyopich et al., 2003; Zahn et al., 2006, 2005b). In this study, we noticed that, among the three core eudicot gene lineages, euFUL proteins share some plesiomorphic features with FUL-like proteins from basal angiosperms and early-diverging eudicots, whereas the euAP1 and AGL79 lineages each acquired novel sequence features (Fig. 2). In particular,
due to the translational frameshift mutation, the C-terminal regions of the euAP1 proteins are no longer homologous to those of the euFUL and AGL79 proteins. Since the C terminus of euAP1 proteins contains a farnesylation motif and is critical for the formation of higher-order complexes with the protein products of other floral MADS-box genes such as SEP, AP3 and PI (Yalovsky et al., 2000), it is very likely that euFUL and AGL79 proteins are unable to interact with euAP1 partners. Similarly, compared with euFUL proteins, members of the AGL79 lineage contain a slightly different FUL motif in the C-terminal regions. These differences, although seemingly trivial compared with those observed in the euAP1 lineage, may have also influenced the ability of these proteins to interact with their potential partners.

Changes among the euFUL, euAP1, and AGL79 lineages in regulatory regions can be deduced from the observed differences in expression pattern. As mentioned before, AP1/SQUA-like transcripts were detected in half of all surveyed leaves and a third of all stamens. In contrast, transcripts of euAP1 clade members were seldom detected in either leaves or stamens. Since a broader expression pattern for AP1/SQUA-like transcripts is generally seen in surveyed early-diverging eudicots and basal angiosperms, we suggest that the narrower pattern of expression seen in some euFUL and AGL79 clade members and all euAP1 clade members is a derived state. As this narrower expression pattern typifies all surveyed euAP1 genes, this suggests that a change in promoter regulation or function occurred in the latest common ancestor of this clade and has been retained by all clade members. However, the observation that expression patterns for members of the AGL79 or euFUL clades are not complementary to those of members of the euAP1 clade is not consistent with a hypothesis of promoter subfunctionalization following the divergence of euAP1 from the AGL79 or euFUL clade. Further detailed analyses of the conservation and divergence of the regulatory elements in AP1/SQUA-like genes are needed to test possible evolutionary mechanisms.

Unlike the duplications that resulted in the euFUL, euAP1 and AGL79 lineages, all other duplication events within the AP1/SQUA subfamily seem to have produced structurally very similar paralogs. In addition, since the evolutionary history is relatively short and no frameshift mutation was detected, the paralogous genes from the same species usually have relatively high sequence similarity in coding regions. For example, the two poplar AGL79 genes, POtrFL1 and 2, are 86% and 82% identical at the DNA and protein sequence levels, respectively, and have exactly the same exon/intron structure. However, compared with orthologous genes, the differences between paralogous genes are usually large. This could be explained by the fact that the functions of many MADS-box genes are dosage dependent: over-expression of them usually causes obvious phenotypes (Yu and Goh, 2000). In other words, after gene duplication, one or both paralogs have to accumulate considerable mutations in coding or regulatory regions; otherwise, the higher-than-normal concentration of protein products will be harmful to the plant. This may be why many recent duplicate genes, such as the Arabidopsis CAL and AP1 genes, have similar but distinct sequence structures and expression patterns. In fact, a previous study has suggested that CAL evolves at a faster rate than AP1 (Lawton-Rauh et al., 1999).

In line with the differences in coding and regulatory regions, paralogous genes from the same species usually have partially redundant functions, as is evidenced in Arabidopsis (Bowman et al., 1993; Ferrandiz et al., 2000a; Gu et al., 1998; Kempin et al., 1995; Mandel and Yanofsky, 1995) and rice (Fornara et al., 2004; Lim et al., 2000; Masiero et al., 2002; Moon et al., 1999). Similarly, in Betula pendula, the three AP1/SQUA-like genes, BpMADS3, 4, and 5, are all active during the development of male and female inflorescences (Elo et al., 2001). Ectopic expression of any one of the three genes in tobacco resulted in extremely early flowering, suggesting that the three genes act redundantly as floral meristem identity genes. However, the differences in expression patterns suggest that they may have slightly different functions: the expression of BpMADS3 and 5 is mainly restricted to inflorescences, whereas BpMADS4 is also expressed in roots and shoots (Elo et al., 2001) (Table S1 in Supplementary Materials). Note that BpMADS3, 4, and 5 appear to be the orthologs of the Arabidopsis AP1/CAL, AGL79, and FUL genes, respectively (Fig. 3).

4.4. Contribution of frameshift mutation and alternative splicing

It has been suggested that the C-terminal region of the euAP1 lineage members was generated through a frameshift mutation that happened in a paleoAP1 ancestral gene at a position upstream of the paleoAP1 motif (Litt and Irish, 2003; Vandenbussche et al., 2003a). However, since the coding sequences preceding the terminal motifs of the euAP1 and paleoAP1 genes are too divergent to be aligned confidently, previous studies were not able to determine the exact position of the frameshift mutation. In this study, by conducting careful alignment and comparing the exon/intron structure, we found that the frameshift mutation likely occurred in exon 8, probably after the fourth codon (Fig. 2). Considering that sequences of the euAP1 and FUL/euFUL genes can be aligned reasonably well when a single 1-bp gap is introduced into the euAP1 genes, we hypothesize that the ancestral euAP1 gene was originally created through a deletion of a single nucleotide in exon 8 of a euFUL- or FUL-like gene.

In addition to frameshift mutations, alternative splicing has been reported in several MADS-box genes, such as the Arabidopsis AG (Cheng et al., 2003), ABS (Nesi et al., 2002) and FCA (Macknight et al., 2002), Nymphaea NymAP3 (Stellari et al., 2004), Eupomatia Eu.be.AP3 (Kim et al., 2005a), Eucalyptus EgAP2 (Kyozuka et al., 1997), and Lolium LtMADS2 (Gocal et al., 2001) genes. In general, alternatively spliced sequences are believed to play negative roles by inhibiting the function of the original gene: if the alternative transcripts are not translated, they will reduce the concentration of “normal” transcripts; if the alternative transcripts can be translated, their protein products will not only dilute the concentration of “normal” proteins, but also compete with “normal” proteins in the interaction with other proteins. Occasionally, however, alternative splicing can enhance the function of the gene, because different transcripts have the potential to function as different genes. In the fruit fly (Drosophila melanogaster), it has been shown that the doublesex (dsx) gene can be spliced differentially in male and female individuals: exons 1, 2, 3 and 4 are used in females, and exons 1, 2, 3, 5, and 6 are used in males (see Graur and Li, 2000).

In this study, we have shown that the Sinofranchetia chinensis SIchFL2 gene has two different transcripts: one codes for a typical FUL-like protein, whereas the other codes for a modified FUL-like protein, with a euAP1 motif at its C-terminal end. In other words, the SIchFL2 gene has the potential to function as both a FUL-like gene and an AP1-like gene. This is interesting, because real AP1-like genes are known to be core eudicot-specific. Sinofranchetia chinensis is an early-diverging eudicot species, and thus did not experience the gene duplication event that generated the euAP1 gene lineage. Thus, the finding of euAP1-like transcripts in S. chinensis suggests that even in basal eudicots, there is a need for euAP1-like proteins, in addition to the more abundant FUL-like proteins. Before the origin and diversification of core eudicots, euAP1-like proteins may have been generated through alternative splicing by chance, to remedy the weakness of FUL-like proteins. Then, in the ancestor of core eudicots, a real euAP1-like gene was created through gene duplication and retained in the genome. This new gene, together with other novel members in the AP3, AG and SEP subfamilies, may have led to the generation of a more elaborate regulatory network for floral development, which in turn eventually resulted in the origin and rapid diversification of core eudicots.

4.5. AP1/SQUA-like genes and A function

For a relatively long time, AP1/SQUA-like genes have been regarded as potential A-function genes (Berbel et al., 2001; Coen and Meyerowitz, 1991; Hardenack et al., 1994; Huijser et al., 1992; Irish and Sussex, 1990; Mandel et al., 1992; Shchennikova et al., 2004). However, several recent studies suggest that most AP1/SQUA-like genes are not floral organ identity genes at all, and that the Arabidopsis AP1 gene is likely the only MADS-box gene that specifies sepal and petal identities. In this study, we show that most sampled AP1/SQUA-like genes are expressed in both vegetative and reproductive organs, with the highest expression in inflorescence and floral meristems (Fig. 5). This suggests that the functions of AP1/SQUA-like genes are not restricted to sepals or petals; rather, they may play key roles in both vegetative and reproductive development, especially in the formation of inflorescence and floral meristems. The fact that AP1 and CAL are derived from a gene duplication that occurred within the Brassicaceae further suggests that AP1 may have acquired class A function very recently.

Now that most AP1/SQUA-like genes are known not to be A-function genes, we may ask which genes confer class A functions. In Arabidopsis, in addition to AP1, AP2, a member of the AP2 transcription factor family, has been shown to play key roles during the development of sepals and petals. Mutation of AP2 usually causes the conversion of sepals into carpels or leaves, and the conversion of petals into staminoid petals or their complete loss (Bowman et al., 1991, 1989; Kunst et al., 1989). Moreover, AP2 can repress the expression of AG in the first and second whorls of the flower, suggesting that some AP2-like genes may be the real A-function genes (Drews et al., 1991). In fact, two Antirrhinum AP2-like genes, LIPLESS1 and 2 (LIP1/2), have also been shown to be critical for the identity of the organs in the first two whorls (Keck et al., 2003). In the lip1 lip2 double mutant, sepals are transformed into bracts and petals are either lost or reduced in size. Unlike AP2, however, LIP genes are not required to repress the expression of C-function genes in the first two whorls, suggesting that they are not typical A-function genes. Additional studies are needed to identify genes that perform class A functions.

Acknowledgments

We thank Yi Ren and Xiaohui Zhang for help with collection of bud materials, Zhenyu Li for identification of plant species, and Hong Ma, Laura Zahn, Zhenguo Lin, Zheng Meng and two anonymous reviewers for critical reading of the manuscript and for valuable comments. This work was supported by the National Natural Science Foundation of China (Grant Nos. 30530090, 30470116, and 30121003) and an IBCAS grant for scientific frontiers.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.ympev.2007.02.016.

References

Aoki, S., Uehara, K., Imafuku, M., Hasebe, M., Ito, M., 2004. Phylogeny and divergence of basal angiosperm inferred from APETALA3- and PISTILLATA-like MADS-box genes. Journal of Plant Research 117, 229–244.
Becker, A., Saedler, H., Theissen, G., 2003. Distinct MADS-box gene expression patterns in the reproductive cones of the gymnosperm Gnetum gnemon. Development Genes and Evolution 213, 567–572.
Becker, A., Theissen, G., 2003. The major clades of MADS-box genes and their role in the development and evolution of flowering plants. Molecular Phylogenetics and Evolution 29, 464–489.
Berbel, A., Navarro, C., Ferrandiz, C., Canas, L.A., Madueno, F., Beltran, J.-P., 2001. Analysis of PEAM4, the pea AP1 functional homologue, supports a model for AP1-like genes controlling both floral meristem and floral organ identity in different plant species. The Plant Journal 25, 441–451.

Blanc, G., Hokamp, K., Wolfe, K.H., 2003. A recent polyploidy superimposed on older large-scale duplications in the Arabidopsis genome. Genome Research 13, 137–144.
Blanc, G., Wolfe, K.H., 2004. Widespread paleopolyploidy in model plant species inferred from age distributions of duplicate genes. The Plant Cell 16, 1667–1678.
Bowers, J.E., Chapman, B.A., Rong, J., Paterson, A.H., 2003. Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422, 433–438.
Bowman, J.L., Alvarez, J., Weigel, D., Meyerowitz, E.M., Smyth, D.R., 1993. Control of flower development in Arabidopsis thaliana by APETALA1 and interacting genes. Development 119, 721–734.
Bowman, J.L., Drews, G.N., Meyerowitz, E.M., 1991. Expression of the Arabidopsis floral homeotic gene AGAMOUS is restricted to specific cell types late in flower development. The Plant Cell 3, 749–758.
Bowman, J.L., Smyth, D.R., Meyerowitz, E.M., 1989. Genes directing flower development in Arabidopsis. The Plant Cell 1, 37–52.
Busi, M.V., Bustamante, C., D’Angelo, C., Hidalgo-Cuevas, M., Boggio, S.B., Valle, E.M., Zabaleta, E., 2003. MADS-box genes expressed during tomato seed and fruit development. Plant Molecular Biology 52, 801–815.
Cheng, Y., Kato, N., Wang, W., Li, J., Chen, X., 2003. Two RNA binding proteins, HEN4 and HUA1, act in the processing of AGAMOUS pre-mRNA in Arabidopsis thaliana. Developmental Cell 4, 53–66.
Coen, E.S., Meyerowitz, E.M., 1991. The war of whorls: genetic interactions controlling flower development. Nature 353, 31–37.
De Bodt, S., Maere, S., Van de Peer, Y., 2005. Genome duplication and the origin of angiosperms. Trends in Ecology and Evolution 20, 591–597.
De Bodt, S., Raes, J., Florquin, K., Rombauts, S., Rouze, P., Theissen, G., Van De Peer, Y., 2003a. Genomewide structural annotation and evolutionary analysis of the Type I MADS-Box genes in plants. Journal of Molecular Evolution 56, 573–586.
De Bodt, S., Raes, J., Van de Peer, Y., Theissen, G., 2003b. And then there were many: MADS goes genomic. Trends in Plant Science 8, 475–483.
de Folter, S., Immink, R.G., Kieffer, M., Parenicova, L., Henz, S.R., Weigel, D., Busscher, M., Kooiker, M., Colombo, L., Kater, M.M., Davies, B., Angenent, G.C., 2005. Comprehensive interaction map of the Arabidopsis MADS Box transcription factors. The Plant Cell 17, 1424–1433.
Ditta, G., Pinyopich, A., Robles, P., Pelaz, S., Yanofsky, M., 2004. The SEP4 gene of Arabidopsis thaliana functions in floral organ and meristem identity. Current Biology 14, 1935–1940.
Drews, G.N., Bowman, J.L., Meyerowitz, E.M., 1991. Negative regulation of the Arabidopsis homeotic gene AGAMOUS by the APETALA2 product. Cell 65, 991–1002.
Duarte, J.M., Cui, L., Wall, P.K., Zhang, Q., Zhang, X., Leebens-Mack, J., Ma, H., Altman, N., dePamphilis, C.W., 2006. Expression pattern shifts following duplication indicative of subfunctionalization and neofunctionalization in regulatory genes of Arabidopsis. Molecular Biology and Evolution 23, 469–478.
Elo, A., Lemmetyinen, J., Turunen, M.L., Tikka, L., Sopanen, T., 2001. Three MADS-box genes similar to APETALA1 and FRUITFULL from silver birch (Betula pendula). Physiologia Plantarum 112, 95–103.
Endress, P.K., 1994. Diversity and Evolutionary Biology of Tropical Flowers. Cambridge University Press, Cambridge.
Ferrandiz, C., Gu, Q., Martienssen, R., Yanofsky, M.F., 2000a. Redundant regulation of meristem identity and plant architecture by FRUITFULL, APETALA1 and CAULIFLOWER. Development 127, 725–734.
Ferrandiz, C., Liljegren, S.J., Yanofsky, M.F., 2000b. Negative regulation of the SHATTERPROOF genes by FRUITFULL during Arabidopsis fruit development. Science 289, 436–438.
Flanagan, C.A., Ma, H., 1994. Spatially and temporally regulated expression of the MADS-box gene AGL2 in wild-type and mutant Arabidopsis flowers. Plant Molecular Biology 26, 581–595.
Force, A., Lynch, M., Pickett, F.B., Amores, A., Yan, Y.L., Postlethwait, J., 1999. Preservation of duplicate genes by complementary, degenerative mutations. Genetics 151, 1531–1545.
Fornara, F., Parenicova, L., Falasca, G., Pelucchi, N., Masiero, S., Ciannamea, S., Lopez-Dee, Z., Altamura, M.M., Colombo, L., Kater, M.M., 2004. Functional characterization of OsMADS18, a member of the AP1/SQUA subfamily of MADS box genes. Plant Physiology 135, 2207–2219.
Gocal, G.F.W., King, R.W., Blundell, C.A., Schwartz, O.M., Andersen, C.H., Weigel, D., 2001. Evolution of floral meristem identity genes: analysis of Lolium temulentum genes related to APETALA1 and LEAFY in Arabidopsis. Plant Physiology 125, 1788–1801.
Graur, D., Li, W., 2000. Fundamentals of Molecular Evolution. Sinauer Associates, Sunderland.
Gu, Q., Ferrandiz, C., Yanofsky, M.F., Martienssen, R., 1998. The FRUITFULL MADS-box gene mediates cell differentiation during Arabidopsis fruit development. Development 125, 1509–1517.
Guindon, S., Gascuel, O., 2003. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Systematic Biology 52, 696–704.
Hardenack, S., Ye, D., Saedler, H., Grant, S., 1994. Comparison of MADS box gene expression in developing male and female flowers of the dioecious plant white campion. The Plant Cell 6, 1775–1787.
Honma, T., Goto, K., 2001. Complexes of MADS-box proteins are sufficient to convert leaves into floral organs. Nature 409, 525–529.
Huijser, P., Klein, J., Lonnig, W.E., Meijer, H., Saedler, H., Sommer, H., 1992. Bracteomania, an inflorescence anomaly, is caused by the loss of function of the MADS-box gene squamosa in Antirrhinum majus. The EMBO Journal 11, 1239–1249.
Immink, R.G., Hannapel, D.J., Ferrario, S., Busscher, M., Franken, J., Lookeren Campagne, M.M., Angenent, G.C., 1999. A Petunia MADS box gene involved in the transition from vegetative to reproductive development. Development 126, 5117–5126.
Irish, V.F., 2003. The evolution of floral homeotic gene function. Bioessays 25, 637–646.
Irish, V.F., Sussex, I.M., 1990. Function of the apetala-1 gene during Arabidopsis floral development. The Plant Cell 2, 741–753.
Johansen, B., Pedersen, L.B., Skipper, M., Frederiksen, S., 2002. MADS-box gene evolution-structure and transcription patterns. Molecular Phylogenetics and Evolution 23, 458–480.
Kaufmann, K., Melzer, S., Theissen, G., 2005. MIKC-type MADS-domain proteins: structural modularity, protein interactions and network evolution in land plants. Gene 347, 183–198.
Keck, E., McSteen, P., Carpenter, R., Coen, E., 2003. Separation of genetic functions controlling organ identity in flowers. The EMBO Journal 22, 1058–1066.
Kempin, S.A., Savidge, B., Yanofsky, M.F., 1995. Molecular basis of the cauliflower phenotype in Arabidopsis. Science 267, 522–525.
Kim, S., Koh, J., Ma, H., Hu, Y., Endress, P.K., Hauser, B.A., Buzgo, M., Soltis, P.S., Soltis, D.E., 2005a. Sequence and expression studies of A-, B-, and E-class MADS-box homologues in Eupomatia (Eupomatiaceae): support for the bracteate origin of the calyptra. International Journal of Plant Sciences 166, 185–198.
Kim, S., Koh, J., Yoo, M.-J., Kong, H., Hu, Y., Ma, H., Soltis, P.S., Soltis, D.E., 2005b. Expression of floral MADS-box genes in basal angiosperm: implication for the evolution of floral regulators. The Plant Journal 43, 724–744.
Kim, S., Yoo, M.-J., Albert, V.A., Farris, J.S., Soltis, P.S., Soltis, D.E., 2004. Phylogeny and diversification of B-function MADS-box genes in angiosperms: evolutionary and functional implication of a 260-million-year-old duplication. American Journal of Botany 91, 2102–2118.
Kong, H., Leebens-Mack, J., Ni, W., DePamphilis, C.W., Ma, H., 2004. Highly heterogeneous rates of evolution in the SKP1 gene family in plants and animals: functional and evolutionary implications. Molecular Biology and Evolution 21, 117–128.
Kramer, E.M., Dorit, R.L., Irish, V.F., 1998. Molecular evolution of genes controlling petal and stamen development: duplication and

divergence within the APETALA3 and PISTILLATA MADS-box Nam, J., dePamphilis, C.W., Ma, H., Nei, M., 2003. Antiquity and gene lineages. Genetics 149, 765–783. evolution of the MADS-box gene family controlling flower develop- Kramer, E.M., Hall, J.C., 2005. Evolutionary dynamics of genes ment in plants. Molecular Biology and Evolution 20, 1435–1447. controlling floral development. Current Opinion in Plant Biology 8, Nam, J., Kim, J., Lee, S., An, G., Ma, H., Nei, M., 2004. Type I MADS- 13–18. box genes have experienced faster birth-and-death evolution than type Kramer, E.M., Jaramillo, M.A., Di Stilio, V.S., 2004. Patterns of gene II MADS-box genes in angiosperms. Proceedings of the National duplication and functional evolution during the diversification of the Academy of Sciences USA 101, 1910–1915. AGAMOUS subfamily of MADS box genes in angiosperms. Genetics Nesi, N., Debeaujon, I., Jond, C., Stewart, A.J., Jenkins, G.I., Caboche, 166, 1011–1023. M., Lepiniec, L., 2002. The TRANSPARENT TESTA16 locus

Kramer, E.M., Su, H.J., Wu, C.C., Hu, J.M., 2006. A simplified encodes the ARABIDOPSIS BSISTER MADS domain protein and is explanation for the frameshift mutation that created a novel C- required for proper development and pigmentation of the seed coat. terminal motif in the APETALA3 gene lineage. BMC Evolutionary The Plant Cell 14, 2463–2479. Biology 6, 30. Nicholas, K.B., Nicholas, H.B., 1997. Genedoc: A Tool for Editing and Kunst, L., Klenz, J.E., Martinez-Zapater, J.M., Haughn, G.W., 1989. AP2 Annotating Multiple Sequence Alignments. Distributed by the author. gene determines the identity of perianth organs in flowers of Parenicova, L., de Folter, S., Kieffer, M., Horner, D.S., Favalli, C., Arabidopsis thaliana. The Plant Cell 1, 1195–1208. Busscher, J., Cook, H.E., Ingram, R.M., Kater, M.M., Davies, B., Kyozuka, J., Harcourt, R., Peacock, W.J., Dennis, E.S., 1997. Eucalyptus Angenent, G.C., Colombo, L., 2003. Molecular and phylogenetic has functional equivalents of the Arabidopsis AP1 gene. Plant analyses of the complete MADS-box transcription factor family in Molecular Biology 35, 573–584. Arabidopsis: new openings to the MADS world. The Plant Cell 15, Lawton-Rauh, A.L., Buckler, E.S. t., Purugganan, M.D., 1999. Patterns 1538–1551. of molecular evolution among paralogous floral homeotic genes. Pelaz, S., Ditta, G.S., Baumann, E., Wisman, E., Yanofsky, M.F., 2000. B Molecular Biology and Evolution 16, 1037–1045. and C floral organ identity functions require SEPALLATA MADS- Leebens-Mack, J., Raubeson, L.A., Cui, L., Kuehl, J.V., Fourcade, M.H., box genes. Nature 405, 200–203. Chumley, T.W., Boore, J.L., Jansen, R.K., dePamphilis, C.W., 2005. Pelaz, S., Gustafson-Brown, C., Kohalmi, S.E., Crosby, W.L., Yanofsky, Identifying the basal angiosperm node in chloroplast genome phylog- M.F., 2001a. APETALA1 and SEPALLATA3 interact to promote enies: Sampling one’s way out of the Felsenstein zone. Molecular flower development. The Plant Journal 26, 385–394. 
Biology and Evolution 22, 1948–1963. Pelaz, S., Tapia-Lopez, R., Alvarez-Buylla, E.R., Yanofsky, M.F., 2001b. Lim, J., Moon, Y.H., An, G., Jang, S.K., 2000. Two rice MADS domain Conversion of leaves into petals in Arabidopsis. Current Biology 11, proteins interact with OsMADS1. Plant Molecular Biology 44, 182–184. 513–527. Pinyopich, A., Ditta, G.S., Savidge, B., Liljegren, S.J., Baumann, E., Litt, A., Irish, V.F., 2003. Duplication and diversification in the Wisman, E., Yanofsky, M.F., 2003. Assessing the redundancy of APETALA1/FRUITFULL floral homeotic gene lineage: implications MADS-box genes during carpel and ovule development. Nature 424, for the evolution of floral development. Genetics 165, 821–833. 85–88. Lynch, M., Force, A., 2000. The probability of duplicate gene preservation Pnueli, L., Abu-Abeid, M., Zamir, D., Nacken, W., Schwarz-Sommer, Z., by subfunctionalization. Genetics 154, 459–473. Lifschitz, E., 1991. The MADS box gene family in tomato: temporal Macknight, R., Duroux, M., Laurie, R., Dijkwel, P.P., Simpson, G.G., expression during floral development, conserved secondary structures Dean, C., 2002. Functional significance of the alternative transcript and homology with homeotic genes from Antirrhinum and Arabidopsis. processing of the Arabidopsis floral promoter FCA. The Plant Cell 14, The Plant Journal 1, 255–266. 877–888. Posada, D., Crandall, K.A., 1998. MODELTEST: testing the model of Mandel, M.A., Gustafson-Brown, C., Savidge, B., Yanofsky, M.F., 1992. DNA substitution. Bioinformatics 14, 817. Molecular characterization of the Arabidopsis floral homeotic gene Purugganan, M.D., Rounsley, S.D., Schmidt, R.J., Yanofsky, M.F., 1995. APETALA1. Nature 360, 273–277. Molecular evolution of flower development: diversification of the plant Mandel, M.A., Yanofsky, M.F., 1995. The Arabidopsis AGL8 MADS MADS-box regulatory gene family. Genetics 140, 345–356. 
box gene is expressed in inflorescence meristems and is negatively Savidge, B., Rounsley, S.D., Yanofsky, M.F., 1995. Temporal relationship regulated by APETALA1. The Plant Cell 7, 1763–1771. between the transcription of two Arabidopsis MADS box genes and the Mandel, M.A., Yanofsky, M.F., 1998. The Arabidopsis AGL9 MADS-box floral organ identity genes. The Plant Cell 7, 721–733. gene is expressed in young flower primordia. Sexual Plant Reproduc- Shchennikova, A.V., Shulga, O.A., Skryabin, K.G., Angenent, G.C., tion 11, 22–28. 2004. Identification and characterization of four Chrysanthemum Martinez-Castilla, L.P., Alvarez-Buylla, E.R., 2003. Adaptive evolution in MADS-Box genes, belonging to the APETALA1/FRUITFULL and the Arabidopsis MADS-box gene family inferred from its complete SEPALLATA3 subfamilies. Plant Physiology 134, 1632–1641. resolved phylogeny. Proceedings of the National Academy of Sciences Soltis, D.E., Soltis, P.S., Albert, V.A., Oppenheimer, D.G., dePamphilis, USA 100, 13407–13412. C.W., Ma, H., Frohlich, M.W., Theissen, G., 2002. Missing links: the Masiero, S., Imbriano, C., Ravasio, F., Favaro, R., Pelucchi, N., Gorla, genetic architecture of flower and floral diversification. Trends in Plant M.S., Mantovani, R., Colombo, L., Kater, M.M., 2002. Ternary Science 7, 22–31. complex formation between MADS-box transcription factors and the Soltis, P.S., Soltis, D.E., 2004. The origin and diversification of histone fold protein NF-YB. Journal of Biological Chemistry 277, angiosperms. American Journal of Botany 91, 1614–1626. 26429–26435. Stellari, G.M., Jaramillo, M.A., Kramer, E.M., 2004. Evolution of the Moon, Y.H., Kang, H.G., Jung, J.Y., Jeon, J.S., Sung, S.K., An, G., 1999. APETALA3 and PISTILLATA lineages of MADS-box-containing Determination of the motif responsible for interaction between the rice genes in the basal angiosperms. Molecular Biology and Evolution 21, APETALA1/AGAMOUS-LIKE9 family proteins using a yeast two- 506–519. hybrid system. 
Plant Physiology 120, 1193–1204. Swofford, D.L., 2002. PAUP*: Phylogenetic Analysis Using Parsimony Moore, R.C., Purugganan, M.D., 2005. The evolutionary dynamics of (*and Other Methods). Version 4.0b10. Sinauer Associates, Sunder- plant duplicate genes. Current Opinion in Plant Biology 8, 122–128. land, MA. Muller, B.M., Saedler, H., Zachgo, S., 2001. The MADS-box gene Theissen, G., 2001. Development of floral organ identity: Stories from the DEFH28 from Antirrhinum is involved in the regulation of floral MADS house. Current Opinion in Plant Biology 4, 75–85. meristem identity and fruit development. The Plant Journal 28, Theissen, G., Kim, J.T., Saedler, H., 1996. Classification and phylogeny of 169–179. the MADS-box multigene family suggest defined roles of MADS-box H. Shan et al. / Molecular Phylogenetics and Evolution 44 (2007) 26–41 41

gene subfamilies in the morphological evolution of eukaryotes. Journal Yu, H., Goh, C.J., 2000. Identification and characterization of three of Molecular Evolution 43, 484–516. orchid MADS-box genes of the AP1/AGL9 subfamily during floral Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F., Higgins, transition. Plant Physiology 123, 1325–1336. D.G., 1997. The CLUSTAL_X windows interface: flexible strategies Zahn, L.M., Kong, H., Leebens-Mack, J., Kim, S., Soltis, P.S., Landherr, for multiple sequence alignment aided by quality analysis tools. L.L., Soltis, D.E., dePamphilis, C.W., Ma, H., 2005a. The evolution of Nucleic Acids Research 25, 4876–4882. the SEPALLATA subfamily of MADS-box genes: a pre-angiosperm Vandenbussche, M., Theissen, G., Van de Peer, Y., Gerats, T., 2003a. origin with multiple duplications throughout angiosperm history. Structural diversification and neo-functionalization during floral Genetics 169, 2209–2223. MADS-box gene evolution by C-terminal frameshift mutations. Zahn, L.M., Leebens-Mack, J., Arrington, J.M., Hu, Y., Landherr, L.L., Nucleic Acids Research 31, 4401–4409. dePamphilis, C.W., Becker, A., Theissen, G., Ma, H., 2006. Conser- Vandenbussche, M., Zethof, J., Souer, E., Koes, R., Tornielli, G.B., vation and divergence in the AGAMOUS subfamily of MADS-box Pezzotti, M., Ferrario, S., Angenent, G.C., Gerats, T., 2003b. genes: Evidence of independent sub- and neofunctionalization events. Toward the analysis of the Petunia MADS Box gene family by Evolution and Development 8, 30–45. reverse and forward transposon insertion mutagenesis approaches: Zahn, L.M., Leebens-Mack, J., dePamphilis, C.W., Ma, H., Theissen, G., B, C, and D floral organ identity functions require SEPALLA 2005b. To B or not to B a flower: the role of DEFCIENS and TA-like MADS box genes in Petunia. The Plant Cell 15, GLOBOSA orthologs in the evolution of the angiosperms. Journal of 2680–2693. Heredity 96, 225–240. Vision, T.J., Brown, D.G., Tanksley, S.D., 2000. 
The origins of genomic Zhang, L., Gaut, B.S., Vision, T.J., 2001. Gene duplication and evolution. duplications in Arabidopsis. Science 290, 2114–2117. Science 293, 1551. Yalovsky, S., Rodriguez-Concepcion, M., Bracha, K., Toledo-Ortiz, Zhang, X., Feng, B., Zhang, Q., Zhang, D., Altman, N., Ma, H., 2005. G., Gruissem, W., 2000. Prenylation of the floral transcription Genome-wide expression profiling and identification of gene activities factor APETALA1 modulates its function. The Plant Cell 12, during early flower development in Arabidopsis. Plant Molecular 1257–1266. Biology 58, 401–919.