Genetics: Published Articles Ahead of Print, published on July 2, 2006 as 10.1534/genetics.106.057125 Preston and Kellogg, Evolution of AP1/FUL-like genes

1Reconstructing the Evolutionary History of Paralogous APETALA1/FRUITFULL-like

Genes in Grasses ()

Jill C. Preston* and Elizabeth A. Kellogg*

*Department of Biology

University of Missouri – Saint Louis

Saint Louis, MO 63121

USA

1 Sequence data from this article have been deposited with the EMBL/GenBank Data Libraries under accession nos. DQ792945-DQ792980.

1 Preston and Kellogg, Evolution of AP1/FUL-like genes

Running title: Evolution of AP1/FUL-like genes

Key words: MADS-box, gene tree, gene expression, gene duplication, APETALA1, FRUITFULL, grasses

Author for correspondence: Jill Preston

Address: Department of Biology, R223

University of Missouri – Saint Louis

One University Boulevard

Saint Louis, MO 63121

Phone: +1-314-516-7997

Fax: +1-314-516-6233

Email: [email protected]

2 Preston and Kellogg, Evolution of AP1/FUL-like genes

ABSTRACT

Gene duplication is an important mechanism for the generation of evolutionary novelty.

Paralogous genes that are not silenced may evolve new functions (neo-functionalization) that will alter the developmental outcome of pre-existing genetic pathways, partition ancestral functions (sub-functionalization) into divergent developmental modules, or may function redundantly. Functional divergence can occur by changes in the spatio-temporal patterns of gene expression and/or by changes in the activities of their protein products. We reconstructed the evolutionary history of two paralogous monocot MADS-box transcription factors, FUL1 and

FUL2, and determined the evolution of sequence and gene expression in grass AP1/FUL-like genes. Monocot AP1/FUL-like genes duplicated at the base of Poaceae and codon substitutions occurred under relaxed selection mostly along the branch leading to FUL2. Following the duplication, FUL1 was apparently lost from early-diverging taxa, a pattern consistent with major changes in grass floral morphology. Overlapping gene expression patterns in and spikelets indicate that FUL1 and FUL2 probably share some redundant functions, but that FUL2 may have become temporally restricted under partial sub-functionalization to particular stages of floret development. These data have allowed us to reconstruct the history of AP1/FUL-like genes in

Poaceae and to hypothesize a role for this gene duplication in the evolution of the grass spikelet.

3 Preston and Kellogg, Evolution of AP1/FUL-like genes

INTRODUCTION

Gene duplications have been implicated in the origin of evolutionary novelty through creation of paralogous genes that may be co-opted for new or altered functions (OHNO 1970;

FORCE et al. 1999; HUGHES 1999). Classic models of gene duplications generally predict the silencing and loss of one descendant gene by accumulation of deleterious mutations (non- functionalization) (NEI and ROYCHOUDHURY 1973; LI 1980), or acquisition of a novel function by one descendent gene following periods of positive Darwinian selection (OHNO

1970; HUGHES 1994). Changes in the nucleotide sequence of a duplicated gene can either take place in the coding region (OHNO 1970; WALSH 1995) or in associated cis-regulatory elements

(LI and NOLL 1994; DOEBLEY and LUKENS 1998). Recent authors have suggested other possible fates for duplicated genes, including maintenance of the ancestral function by both copies (redundancy), or partitioning of the ancestral role between copies (sub-functionalization) following a period of relaxed selection (FORCE et al. 1999; LYNCH et al. 2001). Despite non- functionalization being a more likely evolutionary outcome, the existence of ancient multigene families is evidence that duplicated genes can be maintained in the genome for long periods of time (LYNCH and FORCE 2000; DE BODT et al. 2003). Since duplication and subsequent diversification of genes have been implicated in the evolution of morphological diversity (LI and

NOLL 1994; FORCE et al., 1999; HUGHES 1999; GARCIA-FERNANDEZ 2005), studies aimed at understanding the fate of duplicated genes may be key to establishing a causal link between molecular evolution and the evolution of novel form.

4 Preston and Kellogg, Evolution of AP1/FUL-like genes

ABCDE model in Arabidopsis: Morphological novelty occurs through the creation or modification of developmental pathways (CARROLL et al. 2001). These pathways involve multiple interacting proteins, many of which are transcription factors. In , some of the best- studied transcription factors are members of the MIKC-type MADS-box family

(PURUGGANAN et al. 1995; BECKER and THEISSEN 2003; KOFUJI et al. 2003;

MARTINEZ-CASTILLA and ALVAREZ-BUYLLA 2004; ROBLES and PELAZ 2005). These proteins share a highly-conserved DNA binding domain (the MADS domain) at or near the N- terminus, a less-conserved Intervening (I) domain, a Keratin-like (K) domain, and finally a highly variable C-terminus (RIECHMANN et al. 1996); the primary sequence of the latter often contains motifs specific to a particular sub-type of MADS box proteins (e.g. LAMB and IRISH

2003). Although MADS-box proteins function at many developmental stages, they are best characterized for their role in inflorescence and floral development. According to the classic

ABC model of floral development in Arabidopsis, floral homeotic genes function combinatorially to determine the number and identity of each of the four concentric whorls

(BOWMAN et al. 1989; COEN and MEYEROWITZ 1991; WEIGEL and MEYEROWITZ

1994). A-class genes (APETALA1 (AP1) and APETALA2 (AP2)) specify the identity of sepals

(MANDEL et al. 1992; JOFUKU et al. 1994), A- and B-class (PISTILLATA (PI) and

APETALA3 (AP3)) (KRIZEK and MEYEROWITZ, 1996), petals, B- and C-class (AGAMOUS

(AG)) (YANOFSKY et al. 1990), stamens, and C-class carpels (BOWMAN et al. 1989;

BOWMAN et al. 1991). The products of Arabidopsis E class genes (SEPALLATA1/2/3

(SEP1/2/3), are thought to further participate by forming multi-protein complexes with ABC proteins to specify petals, stamens, and carpels (PELAZ et al. 2000; PELAZ et al. 2001;

5 Preston and Kellogg, Evolution of AP1/FUL-like genes

HONMA and GOTO 2001), although this has yet to be demonstrated in planta. Except for AP2, all these genes encode MADS-box transcription factors.

B and C functions are largely conserved among homologous genes across the angiosperms (AMBROSE et al. 2000; MA and DEPAMPHILIS 2000; WHIPPLE et al. 2004; but see KRAMER et al. 2003; LAMB and IRISH 2003), but similar expression analyses and reverse genetics data on the AP1 gene family (e.g. HUIJSER et al. 1992; ROSIN et al. 2003) show that A function (specification of sepal and petal identity) may be derived in the angiosperms, evolving some time during the diversification of the core eudicots (THEISSEN et al. 2000; LITT and IRISH 2003). Arabidopsis has three AP1-like genes, AP1, CAULIFLOWER

(CAL) and FRUITFULL (FUL), all three of which act redundantly in the transition from vegetative to inflorescence development by specifying the identity of the floral meristems that give rise to flowers (KEMPIN et al. 1995; MANDEL and YANOFSKY 1995; FERRANDIZ et al. 2000) (Figure 1A). Unlike CAL, AP1 and FUL also act non-redundantly, with FUL being involved in the proper development of both fruit and leaves (GU et al. 1998; LILJEGREN et al.

2004). Sequence analyses of MADS-box genes spanning both angiosperms and gymnosperms show that all members of the AP1/FUL, AGL6 and SEP gene subfamilies, except for the core eudicot AP1 clade, share a conserved hydrophobic motif in the C-terminal domain (LITT and

IRISH 2003; ZAHN et al. 2005). This observation suggests that the ancestral AP1/FUL gene may have been functionally closer to FUL than AP1, with a conserved role in meristem identity and possibly fruit development within the monocots and basal angiosperms (LITT and IRISH

2003).

Mutations in both CAL and AP1 of Arabidopsis result in the proliferation of inflorescence meristems in positions that would normally develop flowers. Unlike ap1/cal double mutants,

6 Preston and Kellogg, Evolution of AP1/FUL-like genes ap1/cal/ful triple mutants never produce flowers, but rather continue to reiterate the development of leafy shoots (FERRANDIZ et al. 2000). Similar mutations in the AP1 homologue

SQUAMOSA of Antirrhinum result in replacement of flowers with shoots subtended by bracts

(CARPENTER et al. 1995). As in ap1 cal double mutants, squa mutants occasionally produce flowers, but these are often misshapen and are subtended by prophyll-like structures (HUIJSER et al. 1992). In wild-type plants, interactions between AP1/FUL and other genes (see MULLER et al. 2001; YU et al. 2004) suppress the development of non-floral meristems, allowing subsequent development of flowers.

ABCDE-like genes in grasses: The grass family (Poaceae) comprises an estimated 10,000 species, diverging from two successive sister families, Ecdeicoleaceae and Joinvilleaceae, approximately 55-70 million years ago (KELLOGG 2001; BREMER 2002). Most species can be placed into two large clades, BEP and PACCAD, each representing two major radiations around

40-50 mya (BREMER 2002) (Figure 2). Grasses are morphologically unique in having highly modified flowers, each developing within a larger floral structure termed the grass spikelet

(CHENG et al. 1983; CLIFFORD 1987; IKEDA et al. 2004) (Figure 1C). Each spikelet is generally composed of one to two basal bracts (glumes) enclosing one or more flowers, each of which is subtended by two more bracts, the palea and lemma. Spikelet literally means “little spike”, referring to its similarity to an indeterminate branching inflorescence, developing within the larger inflorescence. Depending on the species, spikelet meristems will produce one to several floral meristems; these will then develop flowers. Grass flowers can be bisexual, unisexual or reduced. Bisexual flowers contain an outer whorl of lodicules (modified perianth), an inner whorl of stamens, and a central gynoecium. Unlike lodicules, stamens and carpels, the

7 Preston and Kellogg, Evolution of AP1/FUL-like genes lemmae, paleae, glumes, and the spikelet as a whole, have no obvious counterpart within flowering plants (Figure 1).

The role of ABCDE gene homologues in grass spikelet development is starting to be elucidated (reviewed in BOMMERT et al. 2005; WHIPPLE and SCHMIDT 2006). Genetic studies in rice and maize suggest both functional conservation and diversification of related genes between Arabidopsis and grasses. Rice has two AG-like genes (OsMADS58 and

OsMADS3), one of which is involved in stamen identity (OsMADS58) and the other in floral meristem and carpel identity (OsMADS3) (YAMAGUCHI et al. 2006). Likewise in both maize and rice, AP3-like orthologues Silky1 and SUPERWOMAN1 respectively, play roles in stamen and lodicule identity (AMBROSE et al. 2000; NAGASAWA et al. 2003) and PI-like proteins function in stamen identity (WHIPPLE et al. 2004). In contrast to B- and C-class related genes, the rice SEPALLATA-like gene LHS1 is involved in specifying the identity of floral organs unique to grasses. Mutant lhs1 plants have leafy lemma and palea (JEON et al. 2000; PRASAD et al. 2001), and expression analyses indicate a conservation of this role across grasses

(MALCOMBER and KELLOGG 2004). In Arabidopsis, AP1/FUL-like genes specify floral meristem identity and outer whorl organogenesis (KEMPIN et al. 1995; MANDEL and

YANOFSKY 1995; FERRANDIZ et al. 2000), but their function remains largely unknown in grasses (but see TREVASKIS et al. 2003; YAN et al. 2003; PETERSEN et al. 2004). It is therefore of great interest to determine the role of grass AP1/FUL-like homologues in the novel developmental pathway of the spikelet. A crucial step in this endeavor is to reconstruct the evolutionary history of these genes within grasses.

8 Preston and Kellogg, Evolution of AP1/FUL-like genes

AP1/FUL-like genes in grasses: Phylogenetic studies that include grass AP1/FUL genes have shown that diploid members of Poaceae possess three copies derived from two duplication events (LITT and IRISH 2003). The first duplication apparently occurred around the base of the monocots, giving rise to the FUL3 clade. The other occurred somewhat later, at some time during the divergence of the Commelinid clade (Arecales + Commelinales + +

Zingiberales [GRAHAM et al. 2006]), giving rise to the FUL1 and FUL2 clades (LITT and

IRISH 2003). Studies of expression of the latter two genes within spikelets of maize (Zea mays), barley (Hordeum vulgare), ryegrass (Lolium temulentum), and rice (Oryza sativa) have shown heterogeneous spatio-temporal patterns, both between orthologues in different taxa (MENA et al.

1995; SCHMITZ et al. 2000; GOCAL et al. 2001; PELUCCHI 2002) and paralogues in the same taxa (PELUCCHI 2002). In addition, reports of differences in the timing of expression within meristems (SCHMITZ et al. 2000), and the absence of FUL2 RNA in leaves (MENA et al. 1995;

SCHMITZ et al. 2000; GOCAL et al. 2001), suggest a possible diversification of roles following the origin of FUL1 and FUL2. Up-regulation of FUL1 gene expression in shoot apical meristems and leaves of barley (SCHMITZ et al. 2000; DANYLUK et al. 2003; TREVASKIS et al. 2003;

VON ZITZEWITZ et al. 2005), ryegrass (PETERSEN et al. 2004; PETERSEN et al. 2006) and wheat (Triticum monococcum and T. aestivum) (DANYLUK et al. 2003; TREVAKIS et al.

2003; YAN et al. 2003) has been correlated with the transition to flowering, indicating a role for these genes in competency to flower. It has also been suggested that grass AP1/FUL genes may play a general role in floral meristem identity, and/or a more specific role in spikelet organ identity (GOCAL et al. 2001; COOPER et al. 2003).

Few studies have focused on the evolution of AP1/FUL genes in families outside those of the model species of core eudicots. One exception is the study of AP1/FUL genes within

9 Preston and Kellogg, Evolution of AP1/FUL-like genes a few grass species (e.g., MENA et al. 1995; GOCAL et al. 2001; PELUCCHI et al. 2002; YAN et al. 2003). We would like to broaden our understanding of AP1/FUL gene diversification in grasses by concentrating on four questions. Firstly, did the duplication event that gave rise to

FUL1 and FUL2 occur before or after the origin of grasses? By determining the timing of this event we can both assess whether gene duplication is correlated with the origin of the grass spikelet, and establish patterns of molecular evolution before and after the duplication event.

Secondly, do patterns of codon evolution following the duplication event suggest different fates of either paralogue? To test for evidence of neo-functionalization under positive Darwinian selection, we use codon models specifying different selective pressures along branches immediately following the duplication event, and determine bias in codon substitutions across the phylogenetic tree. Thirdly, are patterns of gene expression across grasses consistent with a role for AP1/FUL genes in spikelet development? If both genes have a general role in spikelet development their expression would be expected in glumes and florets across a wide range of grasses. Finally, are patterns of gene expression across grasses consistent with functional diversification of the genes? Organ-specific patterns of expression in taxa containing genes similar to the ancestral gene (before the duplication event) and taxa containing the descendent genes (after the duplication event) can illuminate changes in expression consistent with redundancy, sub-functionalization or neo-functionalization.

METHODS

10 Preston and Kellogg, Evolution of AP1/FUL-like genes

Study species: We included sequence data on 22 species of Poaceae in this study, of which 17 are newly sampled here. For 14 of these we also obtained gene expression data. Two species of

Poaceae represent early diverging lineages within the family and the rest span the two major remaining lineages. Within-group choices were based on both availability and, more importantly, variation in spikelet morphology, particularly in number and position of fertile and reduced florets. We also generated new sequence data on three non-grass Poales, two of which were included in the gene expression analysis. Finally, we included seven monocots outside the

Poales, two of which are newly sampled here. For three of these we obtained gene expression data. Seeds were obtained either from the United States Department of Agriculture (USDA) or from pre-existing seed stocks, and plants were grown at the University of Missouri – St. Louis or the Missouri Botanical Garden. The GENBANK accession numbers for each AP1/FUL-like sequence included in this study are listed in Table 1.

Isolation and sequencing of AP1/FUL genes: Total RNA was extracted from inflorescence tissue using RNAwiz solution (Ambion Inc., Austin, TX) according to the manufacturers’ instructions. A pool of cDNA including all MADS box genes was generated from the extracted

RNA using Superscript One-Step RT-PCR with Platinum Taq (Invitrogen, Carlsbad, CA) as per the manufacturers’ instructions, using a degenerate primer designed to bind at the 5’ end of the

MADS box (MADS1F; 5’-ATG GGT MGS GGS AAG GTG GAG CTG AAG CGG-3’)

(MALCOMBER and KELLOGG 2004) and a polyT primer with adaptor (5’-CCG GAT CCT

CTA GAG CGG CCG CTT TTT TTT TTT TTT TTT-3’). Each reaction was run for 30 cycles with an annealing temperature of 52ºC on an MJ Research PTC-200 thermocycler (GMI Inc.,

Ramsey, MN). AP1/FUL homologues from grasses and non-grass Poales were PCR amplified

11 Preston and Kellogg, Evolution of AP1/FUL-like genes from the pooled MADS-box cDNAs using a forward primer designed to bind at the 3’ end of the

MADS box (ZAP4F; 5’-ATC TCC GTS CTC TGY GA-3’) and a reverse primer at the 3’ end of the C-terminal domain (ZAP1oneR; 5’-GAR GKK GCT CAG CAT CCA T-3’). Because of problems with the 3’ primer, in some species we used a primer at the 5’ end of the C-terminal domain (ZAP1fourR; 5’-AGR RYC TTG TTC TCC TCC TG-3’), producing a slightly truncated

PCR product. AP1/FUL homologues from monocots outside Poales were amplified using the 5’

MADS box primer MADS1F and one of the primers shown in supplemental Table 1. Each reaction was run for 30 cycles with an annealing temperature of 55ºC. PCR products were gel purified through a QIAquick spin column (Qiagen Inc., Valencia, CA) and sub-cloned into the pGEM-T easy vector (Promega Corp., Madison, WI).

Plasmid DNA for ten clones per ligation was isolated through a QIAprep Spin Miniprep

Column (Qiagen Inc., Valencia, CA), and sequenced using the Big Dye 3.1 terminator cycle sequencing protocol (Applied Biosystems, Foster City, CA) with the plasmid primers T7 and

SP6. Sequencing reactions were analyzed on an ABI-377 automated DNA sequencer (Applied

Biosystems, Foster City, CA). Double-stranded sequences were aligned and edited in SeqManII

(DNASTAR Inc., Madison), and base callings with Phred scores (EWING et al. 1998) below 20 were re-sequenced.

Genes were named according to the clade to which they belong (FUL1 or FUL2), plus the first letters of the genus and species name respectively. Thus the FUL1 gene from

Chasmanthium latifolium is ClFUL1, from Eleusine indica, EiFUL1, etc.

Phylogenetic Analyses: Nucleotide sequences were translated into amino acids and manually aligned using MacClade 4 (MADDISON and MADDISON 2003) (Supplemental Figure 1) with

12 Preston and Kellogg, Evolution of AP1/FUL-like genes amino acid sequences derived from previously characterized GENBANK sequences. After alignment, nucleotide sequences were analyzed with PAUP version 4.0b10 (SWOFFORD 2001) and MrBayes 3.0 (HUELSENBECK and RONQUIST 2001). A maximum parsimony (MP) heuristic search was conducted using 1000 random addition sequences with TBR branch swapping and gaps treated as missing data. Full heuristic bootstrap analyses (FELSENSTEIN

1985) were conducted using 500 replicates. Parameters for the maximum likelihood (ML) and

Bayesian analyses were estimated using MrModelTest 2.2 (NYLANDER 2004). The ML analysis was performed with 10 random addition sequences and Bayesian phylogenetic estimates were run for 3 x 106 generations on a Beowulf cluster (University of Missouri – St. Louis, MO).

Both ML and Bayesian analyses implemented the GTR + I + G model of evolution based on the results from MrModelTest 2.2.

A Shimodaira-Hasegawa (SH) test (GOLDMAN et al. 2000) was undertaken to test whether we could statistically reject alternative hypotheses for the timing of the AP1/FUL duplication. A priori constraint trees were created in MacClade. Each topology differed in the placement of Joinvillea (Joinvilleaceae) and Xyris (Xyridaceae) FUL sequences, and the basal grasses Streptochaeta (Anomochlooideae) and Pharus (Pharidoideae) FUL2 sequences as follows: (1) Joinvillea JaFUL sister to all FUL1 sequences, (2) Joinvillea JaFUL sister to all

FUL2 sequences, (3) Streptochaeta SaFUL2 sister to both FUL1 and FUL2 clades, (4)

Streptochaeta SaFUL2 and Pharus PsFUL2 sister to both FUL1 and FUL2 clades, (5) Xyris

XsFUL and Joinvillea sister to all FUL1 sequences, and (6) Xyris XsFUL and Joinvillea JaFUL sister to all FUL2 sequences. Each constraint tree was tested against the unconstrained ML phylogeny.

13 Preston and Kellogg, Evolution of AP1/FUL-like genes

Maximum parsimony ancestral state reconstructions were estimated in MacClade 4

(MADDISON and MADDISON 2003) after supplementation with previously reported sequence and expression data for Crocus (Iridaceae) (TSAFTARIS et al. 2004) and Dendrobium

(Orchidaceae) (YU and GOH 2000).

Southern blot hybridization: Total DNA was extracted from 250 mg of tissue from

Hordeum vulgare, Sorghum bicolor, Pharus sp., and Streptochaeta angustifolia, using a modified protocol of Doyle and Doyle (1987). Approximately 10 µg of total DNA was digested with the enzymes BamHI, EcoRI or HindIII, and digested DNA was run on a 0.8% agarose gel overnight. Digested DNA was blotted onto a nylon membrane (Amersham Biosciences,

Piscataway, NJ) and hybridized at moderate stringency (16 h at 60°C) with a 32P-dCTP labeled

FUL2 gene-specific probe spanning the KC domains and the 3’-UTR. To remove non-specific binding of the probes each membrane was washed twice with 2 X SSC/1% SDS and twice with 1

X SSC/1% SDS at 65°C, following the protocol of Laurie et al. (1993).

Tests for positive selection: If natural selection has driven multiple adaptive changes in amino acids in a protein, theory predicts that the number of mutations leading to amino acid changes

(non-synonymous mutations, or dN) should be significantly larger than the number of mutations at synonymous sites (dS) (MIYATA and YASUNAGA 1980; LI 1997). Conversely, if selection consistently weeds out amino acid changes, then dN should be significantly smaller than dS.

Thus the nature of selection can be assessed by the ratio of dN/dS; for convenience dN/dS is commonly abbreviated by the symbol ω. If the ratio is much greater than 1, then adaptive

(positive) selection is inferred; if less than 1, then negative selection, and if the ratio is

14 Preston and Kellogg, Evolution of AP1/FUL-like genes approximately equal to 1, then the sequence is inferred to be evolving neutrally. To test for evidence of positive selection following the AP1/FUL duplication, the best-supported tree obtained from the larger phylogenetic analysis was pruned in MacClade (as indicated in Figure

3) and evaluated using the program codeml from the PAML package, version 3.14 (YANG

1997). The pruned tree only included sequences that spanned all four MIKC domains as follows:

CylFUL, XsFUL, OsMADS14, TaMADS11, HvMADS5, LtMADS1, AsFUL1, EiFUL1, ClFUL1,

ZmMADS4, ZmMADS15, OsMADS15, HvMADS8, LtMADS2, AsFUL2, EiFUL2, ZmMADS3,

ZAP1, SbMADS2, SvFUL2, PsFUL2 and SaFUL2. PAML calculates the likelihood of the data given the tree under a set of increasingly complex models of evolution. The likelihood values are then tested statistically to determine whether the complex models fit the data significantly better than the simple ones, and thus if there is significant evidence of positive selection. We tested three models all of which assume that the entire coding region is evolving the same way over the entire tree. These models were M0, the one-ratio test assuming equal κ (ratio of transition to transversion rates), codon usage, and ω (ratio of non-synonymous to synonymous substitutions or dN/dS) over all sites and branches; M1a, a model of nearly neutral evolution (0 <

ω1 < 1) averaged over all sites and branches; and M2a, a discrete model assuming positive selection (ω1 > 1) averaged over all branches and sites (ANISMOV et al. 2002). We would infer positive selection if M2a proved to be significantly better than either M0 or M1a.

It is unlikely that all residues in a protein, and therefore all sites in a coding sequence, are subject to the same selection pressures. We therefore tested three models that allow the evolutionary rate to vary among sites, but keep the rate the same for a given site over all branches in the tree. These differ in their assumptions about the distribution of sites and in the possibility of positive selection: M3 (discrete distribution), M7 (ß distribution), and M8 (ß

15 Preston and Kellogg, Evolution of AP1/FUL-like genes distribution and positive selection) (BIELAWSKI and YANG 2003). We would infer positive selection if M8 proved to be significantly better than either M3 or M7.

The ML-based method can produce different estimates of ω at a given codon site, depending on the input ω value, due to multiple local maxima of the likelihood function

(SUZUKI and NEI 2001; WONG et al. 2004). For this reason, we re-ran models 2, 3, 7 and 8 using 0.4, 3.14, and 4 as the input ω values. Only the results with the highest log-likelihood [InL] values are presented.

We wished to test explicitly for increased positive selection along all branches subsequent to the duplication event (two-ratio), and increased followed by decreased positive selection after the duplication event (three-ratio) (YANG 1998). To do this we used models that could incorporate heterogeneity over both sites and branches, by specifying two ω ratios. Model

A assumes 0 < ω0 < 1 (estimated) and ω1 = 1, while model B determines both ω0 and ω1 as free parameters to be estimated from the data. In each case ω1 was specified as the branch immediately following the duplication event and leading to clade FUL1 or FUL2 respectively

(BIELAWSKI and YANG 2003).

Nested models were compared against a χ2 distribution of trees using a likelihood ratio test (LRT). M0 was compared with M3, and M7 with M8 to test for uniformity of parameters over different sites. M1a was compared with M2a to examine whether ϖ = 1 or >1. To test for increased ω following the duplication event, M0 was compared with the two-ratio model, and the two- with the three-ratio model. Finally, to incorporate heterogeneity of parameters over both branches and sites, M1a and null model A with ω2 fixed at 1 were compared with model A, and null model B with only two site categories was compared with model B.

16 Preston and Kellogg, Evolution of AP1/FUL-like genes

To identify amino acid sites distinguishing clade FUL1 from FUL2, codon transitions were mapped onto the AP1/FUL gene tree in MacClade 4 (MADDISON and MADDISON

2003). The binomial test (Pr = pk) (SOKAL and ROHLF 2001) was used to determine if amino acid replacements were significantly more prevalent within FUL1 or FUL2 gene lineages prior to divergence of the two main grass lineages (i.e. the rice-containing BEP clade and the maize- containing PACCAD clade [Figure 2]). Pr is the probability that the difference between replacement substitutions in the two lineages is at least as extreme as observed, p is the probability of change, and k is the number of codon synapomorphies discriminating either clade.

The parameter p was set at 0.5 and thus assumed an equal time available for change of either gene following the duplication event.

Reverse transcriptase (RT)-PCR characterization of expression: Total RNA was extracted separately from stems, leaves, outer tepals, inner tepals, stamens and carpels of non-grass monocots, and roots, stems, leaves, glumes, and florets of grasses as described above. For each species and each organ, RNA was extracted at least twice. In order to detect even low abundance AP1/FUL-like gene transcripts, RT-PCR used a nested PCR approach. We preferred this approach because of its reliability in determining presence or absence of transcripts, even though it prevents assessment of their relative abundance. RNA was first reverse transcribed into

MADS-box specific cDNA using the primer MADS1F (at the translation start site) and polyT- anchor in a One-Step RT-PCR as described above. PCR products were then diluted 1:100 and 1

µl template was used in the nested PCR. For the non-grass monocot species forward or reverse gene-specific primers (Supplemental Table 1) were designed according to published sequences, and used in combination with the reverse primer corresponding to the C-terminus (ZAP1oneR)

17 Preston and Kellogg, Evolution of AP1/FUL-like genes or the forward primer corresponding to the translation start (MADS1F) respectively to determine gene expression within each organ type. To identify paralogous copies from each grass subfamily, forward primers (Supplemental Table 1) were designed based on aligned sequences used in the phylogenetic analysis. Each forward primer was used in combination with

ZAP1oneR, which spans the conserved hydrophobic motif of the C-terminal domain, to amplify copy- and cDNA-specific products. As a positive control, the constitutively expressed actin gene was amplified using the primers ACT-59F (5’-AGG CTG GTT TCG CTG GGG ATG ATG-3’) and ACTIN-764R (5’-GGA CCT CGG GGC ACC TGA ACC TCT-3’). Each reaction was run for 30 cycles with an annealing temperature of 52-55°C. Two independent experiments were carried out for each organ type. When no expression was detected, each experiment was repeated two more times using the same RNA.

To verify results from the nested RT-PCR experiment and to assess approximately relative levels of transcripts, semi-quantitative RT-PCR was carried out for Oryza, Sorghum,

Chasmanthium and Eleusine. Gene-specific primers were designed in the reverse orientation and used in combination with forward primers described above (Supplemental Table 1). To control for DNA contamination, amplified products spanned intron-exon boundaries within the C- terminus and 3’-UTR. For each RNA sample, actin was amplified for 25 and 30 cycles with an annealing temperature of 55˚C as below. Bands were resolved on agarose gels and their relative intensities determined by densitometric analysis in ImageJ (ABRAMOFF et al. 2004). PCR conditions for each target gene were one ‘hot start’ cycle at 94˚C for 5 min, followed by 25-35 cycles of 94˚C for 30 s, 55-58˚C for 30 s and 72˚C for 1 min, followed by a final extension step at 72˚C for 10 min. Different numbers of cycles were carried out to confirm linearity of each amplification (HE et al. 1995; FREEMAN et al. 1999). Target bands were quantified in ImageJ

18 Preston and Kellogg, Evolution of AP1/FUL-like genes

(ABRAMOFF et al. 2004) and standardized against actin. Unlike the nested RT-PCR approach, semi-quantitative PCR is likely to miss low abundance RNAs.

RESULTS

Phylogenetic analyses of AP1/FUL genes: We generated AP1/FUL sequences for 22 monocot taxa. Of these newly generated sequences, most spanned the entire coding region of the gene minus the first 100-150 bp of the MADS-box domain. In the case of eight sequences, sequence coverage was less either because the reverse primer ZAP1fourR was used to give a truncated product ending at the 5’ end of the C-terminus or a portion of the sequence had Phred scores below 20: EeFUL1 (C domain only), PcFUL2 (K-C domains), LvFUL2 (K-C domains), ClFUL2

(M-K domains), JaFUL (M-K domains), AgsFUL (M-K), SbFUL1 (M-K) and LisFUL (M-I).

Nucleotide sequence divergence between the IKC domains of paralogous genes approximated

23%, with the C domain showing the highest pairwise divergence of around 33%. Distinct

AP1/FUL alleles sharing >95% nucleotide identity were sequenced for AsFUL1, BdFUL2,

EiFUL1, PgFUL1 and TmFUL2. All loci for which C-terminal sequence was obtained had the

FUL-motif (L/MPPWML/V) (LITT and IRISH 2003). No premature stop codons were found, suggesting that all genes sequenced could be functional. Only one FUL2-like sequence was retrieved for Streptochaeta and for Pharus; despite multiple attempts at PCR and sequencing of multiple clones, we did not detect a FUL1-like sequence in these species.

19 Preston and Kellogg, Evolution of AP1/FUL-like genes

ML analyses gave one tree with a likelihood score of 14,118 (Figure 3). Tree topologies from each of the MP, ML and Bayesian analyses showed no significant differences, except for the arrangement of the non-grass Poales genes from Xyris, Cyperus, and Joinvillea. Each tree was largely similar to those found in previous molecular phylogenetic analyses (GRASS

PHYLOGENY WORKING GROUP [GPWG], 2001), except for the placement of genes from three ehrhartoid species (Oryza, and Ehrharta). Phylogenetic analyses strongly support the existence of three distinct grass AP1/FUL clades, FUL1, FUL2 and FUL3. Placement of

Joinvillea, Xyris and Cyperus genes as sister to the grass FUL1 and FUL2 clades indicates that the duplication event that gave rise to FUL1 and FUL2 occurred near the base of Poaceae. This relationship is supported by both the MP bootstrap (84%) and Bayesian analyses (99%).

Placement of the Streptochaeta sequence within the FUL2 clade is also strongly supported by both bootstrap (99%) and Bayesian (100%) analyses, indicating that the duplication event occurred prior to the divergence of the most basally placed grasses (Figure 3).

The SH test rejects, at the P<0.05 level, all topologies that explicitly test for alternative origins of the duplication event (data not shown). In agreement with the non-parametric bootstrap and Bayesian posterior probabilities, these data strongly support the hypothesis that the

AP1/FUL duplication event occurred prior to the common ancestor of extant grasses (Figure 3).

AP1/FUL gene copy number: PCR identified only one FUL sequence in Streptochaeta and

Pharus. To test whether we had failed to amplify a second sequence, we undertook Southern blot hybridization of probes spanning the KC domains and 3’-UTR of FUL2 from each species, as well as from two diploid grasses representing the two major grass radiations, Sorghum

(Panicoideae) and Hordeum (Pooideae). The latter two possess two and three similar copies of

20 Preston and Kellogg, Evolution of AP1/FUL-like genes

AP1/FUL-like genes respectively (data not shown), consistent with reports from other studies on

Hordeum AP1/FUL-like genes demonstrating two copies of HvMADS5 (FUL1) within the barley genome (SCHMITZ et al. 2000; VON ZITZEWITZ et al. 2005). In contrast to these findings, only one band was identified on the blots of Pharus and Streptochaeta (data not shown), consistent with our PCR results. This supports the hypothesis of two independent losses of FUL1 within early-diverging lineages of grasses. Banding patterns for Sorghum, Hordeum and

Streptochaeta were confirmed using gene specific probes hybridized under high stringency conditions (data not shown).

Tests of selection: Data from the pruned AP1/FUL tree could not reject a nearly neutral model

(M1a) (2∆l = 0.00) of evolution, indicating no evidence of positive selection averaged across the entire tree (Table 3). However, sites are not evolving homogeneously across the coding region of the gene. The more complex discrete model (M3), which allows two or more unrestricted site classes, was significantly better at explaining the data ((2∆l = 199.26, P < 0.01) than the one- ratio (M0) model. Codons fall into different classes with ω values ranging from 0.01531 (near purifying selection) to 0.89122 (relaxed selection) (Table 2).

There is no significant signature of positive selection on sites following the duplication event, indicating that any non-synonymous substitutions along these branches are probably the result of relaxed selection following functional redundancy. A model allowing an additional site class under positive selection (M8) has a likelihood score not significantly different than the model restricting all ω values to ≤ 1 (M7) (2∆l = -0.9) (Table 2), meaning that positive selection is not necessary to explain the data. Similarly, under the branch models (Two-ratios and Three- ratios) that specify a different evolutionary rate on branches following the duplication event,

21 Preston and Kellogg, Evolution of AP1/FUL-like genes likelihood scores are not significantly better than the most simple branch model specifying a single ω value for all branches (M0) (2∆l = 0.34 and 0.1). Indeed, despite strong evidence that different codons are experiencing different selection pressures, branch-site models (A and B) specifying branch FUL1 or FUL2 as the foreground branch were not significantly better at explaining the data than the nested site models (M1a [2∆l = 0.00 and 0.98] and null model A

[2∆l = 0.00 and 0.00], and null model B [2∆l = 0.00 and 0.00] respectively).

Multiple codon substitutions within the FUL2 lineage prior to diversification of the

BEP/PACCAD clades: Twenty-one codon substitutions were found in the FUL2 lineage prior to diversification of the BEP and PACCAD clades (8 before the divergence of SaFUL2, 7 before divergence of PsFUL2, and 6 before divergence of the ehrhartoids), in contrast to zero codon substitutions in the corresponding branch leading to the FUL1 lineage (Figure 4; first number below each branch). Six of these 21 substitutions are unique to the entire tree (not shown). Of the eight substitutions that occurred before the divergence of Streptochaeta, four are conserved for the entire FUL2 clade (Figure 4; boxed number above branch): one each within the MADS-box

(M) (#26), Intervening (I) (#79), Keratin-like (K) (#145), and C-terminal (C) (#175) domains

(Figure 5). Of the seven codon substitutions occurring after the Streptochaeta divergence and before that of Pharus, five (#47, 88, 102, 109, 159) are conserved. Of the six substitutions that occurred after the divergence of Pharus, four (#182, 233, 240, 266) of these changes are conserved respectively (Figures 4, 5). Thus early in grass diversification, the FUL2 genes accumulated 21 non-synonymous mutations of which 13 have remained unchanged for approximately 50 million years. These codon changes are not biased to any particular functional domain within the gene.

22 Preston and Kellogg, Evolution of AP1/FUL-like genes

The trend of non-synonymous substitutions within the FUL2 lineage immediately following AP1/FUL-like gene duplication is highly significant, and is consistent with a hypothesis of neo-functionalization in the FUL2 clade. MP reconstructions of ancestral AP1/FUL amino acid sequences estimate that no codon changes occurred within the FUL1 lineage prior to

BEP-PACCAD clade diversification (Figure 4). According to the binomial test, the probability that all thirteen conserved codon substitutions occurred by chance only in the FUL2 lineage prior to BEP-PACCAD diversification (Pr) is 0.0001. We tested whether the difference simply reflects an increase in the rate of evolution in the FUL2 clade by comparing the number of non- synonymous substitutions after the common ancestor of Zea and Oryza for FUL1 and FUL2.

The average number is slightly higher in the FUL1 clade, but the difference between the two clades is not significant. We conclude that the elevated rate of amino acid substitution in FUL2 reflects a change in the selection pressure on the gene soon after its origin by duplication.

Analysis of AP1/FUL gene expression in vegetative and floral organs: Expression patterns of

AP1/FUL genes in non-grass monocots varied across the five taxa sampled (Figure 6).

Expression patterns in Lilium (Liliaceae) (not included in phylogenetic analysis due to short sequence) and Tradescantia (Commelinaceae) are identical, with AP1/FUL transcripts being detected in all vegetative (stem and leaves) and floral organs (outer and inner tepals, stamens, and carpel) sampled. In contrast, in Agapanthus (Agapanthaceae) (not included in phylogenetic analysis due to short sequence) expression is not detected in vegetative tissues but is found in each of the four floral whorls. Interestingly no transcript is detected in floral bracts of

Agapanthus or Cyperus (Cyperaceae), despite multiple independent RT-PCR experiments, indicating that AsFUL and CpFUL expression are both restricted to flowers. Expression of

23 Preston and Kellogg, Evolution of AP1/FUL-like genes

XsFUL in Xyris (Xyridaceae) is the most divergent, with transcripts being detected in both vegetative (stem) and reproductive organs (carpels), but absent from tepals. The common expression of AP1/FUL genes in carpels is consistent with the hypothesis that monocot AP1/FUL genes may be functionally more similar to Arabidopsis FRUITFULL than APETALA1.

Expression patterns of AP1/FUL genes in grasses are generally conserved throughout vegetative tissues and glumes, but show variation in florets (Figure 7). Gene expression in glumes is completely conserved, with both genes being detected in glumes of all taxa examined.

In Streptochaeta angustifolia, a species that lacks a true spikelet (Figure 1D), SaFUL2 is expressed in bracts one to five, six, and seven to twelve, the latter surrounding the stamens and carpel. Within vegetative tissues FUL1 and FUL2 transcripts are detected in roots, stems, and leaves of most taxa sampled. However, in Streptochaeta (Anomochlooideae) SaFUL2 was never detected in leaves, in Chasmanthium (Centothecoideae) ClFUL1/ClFUL2 were never detected in roots, and in maize (Zea mays: Panicoideae) ZmMADS15/ZmMADS4/ZmMADS3/ZAP1 were never detected in roots or stems and ZmMADS3/ZAP1 were never detected in leaves.

Nested RT-PCR gives a clear picture of presence/absence of transcripts because the initial step of creating a pool of MADS box RNAs makes it more likely that rare transcripts will be detected. However, nested RT-PCR provides a misleading picture of relative amounts of transcripts of different loci. Conversely, semi-quantitative RT-PCR provides good information on relative transcript abundance, but is considerably less sensitive than the nested approach and so will miss low abundance RNAs. In contrast to the nested RT-PCR results (Figure 7), semi- quantitative RT-PCR was unable to detect Sorghum SbFUL1/SbFUL2, Chasmanthium ClFUL2,

Eleusine EiFUL2 and Oryza OsMADS15 expression in leaves (Figure 8). Follow-up nested RT-

PCR on these samples verified that RNA of these genes is present, but at levels too low to detect

24 Preston and Kellogg, Evolution of AP1/FUL-like genes without first limiting the RNA pool to MADS box transcripts. Conversely, all transcripts that were detected by the semi-quantitative approach were also detected by nested RT-PCR. The semi-quantitative results suggest FUL2 transcription in leaves is generally much lower than transcription of FUL1.

All grass taxa included in this study bear both fertile and reduced florets, except for

Streptochaeta, which has a single fertile floret surrounded by bracts (Figure 1D), and Lithachne

(Bambusoideae), Pharus (Pharoideae), and Zea, all of which have reduced staminate or pistillate florets borne singly (Lithachne, Pharus) or in pairs (Zea) within the spikelet. With the exception of Streptochaeta, all fertile florets can be considered homologous. In contrast, reduced florets vary from sterile lemmas (e.g. Oryza) to florets lacking fully developed reproductive organs (e.g.

Zea). Reduced florets may be borne below (Oryza, Phalaris, Chasmanthium, Sorghum, Setaria,

Zea) or above (Chasmanthium, Eleusine, Eragrostis) the fertile ones, or in a separate spikelet entirely (Pharus, Lithachne, Hordeum). The reduced floret may be underdeveloped (Phalaris,

Hordeum, Triticum, Avena, Lolium, Chasmanthium distal), staminate (Pharus, Lithachne,

Setaria, Zea tassel), pistillate (Pharus, Lithachne, Eleusine, Eragrostis, Zea ear), or reduced to a sterile lemma (Oryza, Chasmanthium proximal, Sorghum). The mechanisms that cause florets to become variously reduced are unknown, and are likely to be different in different taxa. In addition, reduced florets have evolved multiple times within the grasses. The morphological and developmental differences among reduced florets, combined with their independent appearance in evolutionary time, all suggest that reduced florets are non-homologous; they are comparable only in that they lack one or more floral organs.

Although gene expression varied among species, and among florets within spikelets, we could find no clear pattern of gene expression that was consistent with floral fertility, position, or

25 Preston and Kellogg, Evolution of AP1/FUL-like genes number, or taxonomic relatedness (Figures 7, 8). In reduced florets Zea FUL1 transcripts

(ZmMADS4 and ZmMADS15) are detected, whereas Pharus PsFUL2 and Zea ZAP1/ZmMADS3 transcripts are never detected. In fertile florets FUL1 transcripts are detected in all taxa sampled, whereas FUL2 transcripts are never detected in fertile florets of Streptochaeta or Chasmanthium despite repeated attempts. Furthermore, semi-quantitative RT-PCR was unable to detect Eleusine

EiFUL2 and Oryza OsMADS15 in fertile florets (Figure 8). This suggests an overall lower level of FUL2 gene expression in florets of these taxa compared to expression of FUL1.

Reconstruction of ancestral AP1/FUL expression before and after gene duplication: MP reconstructions of ancestral AP1/FUL expression patterns estimates that expression in all four floral whorls is ancestral for all monocot clades included in this study. Expression in leaves is equivocal at the base of the monocots, but is estimated as present in the ancestor of the commelinid clade. This hypothesis suggests that expression in leaves was lost twice within the

FUL2 clade, once in Streptochaeta (SaFUL2) and once in Zea (ZmMADS3 and ZAP1).

Expression in floral bracts is optimized as absent at the base of monocots, but present around the time of gene duplication at the base of grasses. Similarly, AP1/FUL expression in fertile grass florets is inferred as ancestral for grasses, suggesting that gene expression was again lost in the

FUL2 clade, within the lineages leading to Streptochaeta (Anomochlooideae) and Chasmanthium

(Centothecoideae). AP1/FUL gene expression in reduced florets was not optimized on the phylogeny because reduced florets are non-homologous across grasses, and thus comparisons are not meaningful.

26 Preston and Kellogg, Evolution of AP1/FUL-like genes

DISCUSSION

Gene duplications may lead to novel forms by partitioning of ancestral functions (sub- functionalization) that can then be integrated into divergent developmental modules (FORCE et al. 1999; LYNCH et al. 2001) or by evolution of novel functions (neo-functionalization) that can alter the developmental outcome of pre-existing genetic pathways (OHNO 1970; HUGHES

1994). Changes in the spatio-temporal patterns of gene expression and/or interactions of their protein products are two ways in which functional divergence can occur (DOEBLEY and

LUKENS 1998; BECKER and THEISSEN 2003). We have reconstructed the evolutionary history of two paralogous monocot MADS-box transcription factors, FUL1 and FUL2, as a framework to determine patterns of sequence and gene expression evolution that may be implicated in evolution and/or diversification of the grass spikelet.

AP1/FUL genes duplicated prior to evolution of the grass spikelet: Phylogenetic analyses of monocot AP1/FUL genes support the hypothesis that the gene duplication event giving rise to

FUL1 and FUL2 occurred before the common ancestor of extant grasses. Support values from bootstrap and Bayesian analyses, and the SH test suggest duplication after the divergence of

Joinvilleaceae, one of the sister families of Poaceae, but before the split of Anomochlooideae, the earliest diverging grass lineage. These data suggest that the FUL1 and FUL2 paralogues have been maintained in the genome of Poaceae for at least 55-65 million years (BREMER 2002), and thus implicate selection for retention of both gene activities throughout grass evolution

(MONSON 2003). Other studies have also found genes duplicated at the base of grasses (e.g.

Adh1/Adh2 [GAUT et al. 1999]), often associated with changes in expression and/or function.

27 Preston and Kellogg, Evolution of AP1/FUL-like genes

Interestingly, only a single AP1/FUL copy was found in Streptochaeta and Pharus.

Streptochaeta is unique in that it lacks a true spikelet, but has a flower-bearing structure surrounded by multiple bracts (JUDZIEWICZ and SODERSTROM 1997) (Figure 1D). In contrast, Pharus has unisexual single-flowered spikelets that are conventional in structure except for the absence of lodicules in all mature florets of most species (JUDZIEWICZ et al. 1999). The single copy of AP1/FUL in Streptochaeta could either be due to the duplication occurring after the split of Anomochlooideae from the rest of the grasses or, as supported in our phylogenetic analyses, by a loss or rapid divergence of FUL1 following the gene duplication event. Loss or rapid divergence of FUL1 is also supported for Pharus by both the SH test and phylogenetic reconstructions, suggesting that FUL1 may have been under relaxed or positive selection independently on at least two occasions. Taken together, these findings suggest a complex history of duplication and divergence and/or loss of AP1/FUL genes at the base of Poaceae, a pattern that correlates with major changes in grass floral morphology.

Evidence for a period of relaxed selection followed by stabilizing selection on the branch leading to the FUL2 clade: Codon-based models of evolution suggest that selection on monocot

AP1/FUL genes is not homogeneous, but rather varies across sites and lineages. This finding is consistent with the modular structure of MIKC MADS-box genes, where the N-terminal MADS- domain is under extreme purifying selection and the C-terminal domain evolves rapidly under a combination of relaxed and positive selection (BECKER and THEISSEN 2003). Comparison of site models testing for evidence of nearly neutral versus positive selection did not allow rejection of the nearly neutral model. This result does not preclude adaptive amino acid replacements, as the models testing for ω > 1.0 may simply lack power to detect a limited number of isolated

28 Preston and Kellogg, Evolution of AP1/FUL-like genes adaptive events (WONG et al. 2004). Rather, this result suggests that, on average, codon sites are evolving under a combination of stabilizing and relaxed selection.

The inferred pattern of codon substitutions suggests that FUL1 and FUL2 genes may have been subject to quite different selective pressures. Our data provide no evidence for positive selection along branches immediately following the gene duplication, but do suggest that sites with elevated non-synonymous substitutions could have evolved under relaxed selection. Twenty-one and zero codon substitutions occurred within the FUL2 and FUL1 gene lineages respectively prior to BEP/PACCAD diversification (Figure 4). A quarter of these codon substitutions change the biochemical class of the amino acid. The fact that thirteen of these codon substitutions have been maintained, at least since the estimated split of rice and maize 50 mya (GAUT 2002), suggests that these sites may have been preserved under purifying selection.

This pattern is inconsistent with that expected under sub-functionalization. According to the sub- functionalization model, degenerative mutations have no fitness value and are thus not conserved over long periods of time (FORCE et al. 1999; LYNCH et al. 2001). In addition, sub- functionalization in modular genes generally predicts degeneration within different domains of either duplicate gene (FORCE et al. 2004).

Complex AP1/FUL gene expression patterns are consistent with a combination of incomplete sub-functionalization and redundancy following gene duplication: Recent studies on vernalization (JENSEN et al. 2005; YAN et al. 2003; FU et al. 2005) and photoperiod

(DANYLUK et al. 2003) responses of cereal cultivars have implicated FUL1 as a possible integrator of grass flowering time pathways. This putative function is supported by the early flowering phenotypes of rice (JEON et al. 2000) and maize (DANILEVSKAYA 2005)

29 Preston and Kellogg, Evolution of AP1/FUL-like genes ectopically expressing FUL1. In addition, when the AP1/FUL gene from Dendrobium was expressed in Arabidopsis, the resulting plants exhibited an early flowering phenotype (YU and

GOH 2000). In contrast to other studies (MENA et al. 1995; KYOZUKA et al. 2000; GOCAL et al. 2001; but see PETERSEN et al. 2004) we found expression of FUL2 in leaves of both seedlings (data not shown) and mature plants. Since quantitative changes in FUL1 leaf expression have been positively correlated with competency to flower (YAN et al. 2003), the expression of both FUL1 and FUL2 in leaves may suggest redundancy of function between these genes with respect to flowering time. Furthermore, the general expression of AP1/FUL genes of non-grass monocots may suggest a conserved role for these genes in flowering time across monocots.

Compared to flowering time, the role of AP1/FUL-like genes in grass spikelet development is less well known. RNA interference mutants of VRN-1 (FUL1) in Triticum aestivum (LOUKOIANOV et al. 2005) and ZmMADS3 (FUL2) in Zea mays (HEUER et al.

2001) exhibit normal spikelet development and morphology. Nonetheless both genes are consistently expressed in the spikelet in all grasses investigated to date, suggesting that the genes have a functional role there. Due to the lack of mutant phenotypes, determination of function has required extrapolation from expression analysis; these types of analyses have been carried out to varying degrees for rice (MOON et al. 1999; KYOZUKA et al. 2000; PELUCCHI et al. 2002), ryegrass (GOCAL et al. 2001), wheat (MURAI et al. 2003), barley (SCHMITZ et al. 2000), sorghum (GRECO et al. 1997), and maize (MENA et al. 1995). We have found conserved expression of both FUL1 and FUL2 in glumes, confirming and extending published results from rice and ryegrass, in which FUL1 and FUL2 gene transcripts have both been found in glume

30 Preston and Kellogg, Evolution of AP1/FUL-like genes primordia (KYOZUKA et al. 2000; GOCAL et al. 2001). The consistent expression pattern suggests a redundant role for these genes in glume identity.

As predicted based on previous studies (see above), we found expression patterns in grass florets to be more diverse than in glumes. However, we found no clear pattern of gene expression that was consistent with floral fertility, position, or number, or taxonomic relatedness. Prior to the gene duplication event (i.e. in non-grass monocots), AP1/FUL genes are consistently expressed in carpels, indicating a possible ancestral role in carpel development. In grasses, the ancestral state of expression for both genes was inferred as present in fertile florets, but no analysis could be done for reduced florets since they are non-homologous across taxa. FUL2 transcripts were detected in florets of most, but not all grass taxa. Within Pooideae, expression patterns for ryegrass, barley, and wheat matched those previously reported (GOCAL et al. 2001;

SCHMIDZ et al. 2000; MURAI et al. 2003), i.e. both genes were expressed in fertile florets, and our new data on TmFUL2 were identical to that of wheat AP1 (WAP1). In Ehrhartoideae, expression patterns of rice also confirmed those previously reported in fertile florets (MOON et al. 1999; PELUCCHI et al. 2002), but no expression of either gene was detected in sterile lemmas. Finally, in maize, gene expression did not completely correspond to a previous report

(MENA et al. 1995), which found ZAP1 expression in male florets using Northern blot analysis.

We were unable to detect it in these florets using RT-PCR. This discrepancy may reflect different developmental stages used in the two studies.

Variation in gene expression patterns within and between grasses indicates a combination of incomplete sub-functionalization and redundancy following gene duplication. Further studies are needed to determine the biochemical function and developmental role of FUL1 and FUL2, and how these are partitioned between paralogues across grasses.

31 Preston and Kellogg, Evolution of AP1/FUL-like genes

This study illustrates the complex history of AP1/FUL-like genes in grasses and their relatives, and has generated testable hypotheses regarding the role of these genes in spikelet evolution and flowering time. We have shown that, immediately following duplication at the base of the grasses, non-synonymous changes occurred solely within the FUL2 lineage prior to

BEP/PACCAD diversification. Expression analyses across grasses further demonstrate restriction of FUL2 transcription within spikelets of unrelated taxa: Streptochaeta, Pharus, Oryza,

Chasmanthium, Sorghum and Zea. We hypothesize that heterologous gene expression found in this and other studies indicates a combination of incomplete sub-functionalization and redundancy for FUL1 and FUL2 genes, possibly in specifying floral organ identity. Finally, we suggest that AP1/FUL genes play a general role in regulating flowering time in monocots, and that the FUL1 and FUL2 genes of grasses may be redundant regarding their role in vernalization response in cereals. To further elucidate these roles we are continuing to investigate the spatio- temporal pattern of FUL1/FUL2 expression across Poaceae and their relatives.

32 Preston and Kellogg, Evolution of AP1/FUL-like genes

ACKNOWLEDGEMENTS

We would like to thank the Missouri Botanical Garden and the United States Department of

Agriculture for access to plant materials, Jessica Kossuth for help with RT-PCR, and Mark

Beilstein, Andrew Doust, Iván Jiménez, Simon Malcomber, Robert Marquis, Robert Schmidt,

Peter Stevens, and Patrick Sweeney for helpful comments. This work was supported by National

Science Foundation grant DBI-0110189 to E. A. K.

33 Preston and Kellogg, Evolution of AP1/FUL-like genes

LITERATURE CITED

ABRAMOFF, M. D., P. J. MAGELHAES and S. J. RAM, 2004 Image processing with ImageJ.

Biophotonics Int. 11: 36-42.

AMBROSE, B. A., D. R. LERNER, P. CICERI, C. M. PADILLA, M. F. YANOFSKY and R. J.

SCHMIDT, 2000 Molecular and genetic analyses of the Silky1 gene reveal conservation in

floral organ specification between eudicots and monocots. Mol. Cell 5: 569-579.

ANGIOSPERM PHYLOGENY GROUP, 2003 An update of the Angiosperm Phylogeny Group

classification for the orders and families of flowering plants: APG II. Bot. J. Linn. Soc 141:

399-436.

ANISMOVA, M., J. P. BIELAWSKI and Z. YANG, 2002 Accuracy and power of Bayes

prediction of amino acid sites under positive selection. Mol. Biol. Evol. 19: 950-958.

BECKER, A. and G. THEISSEN, 2003 The major clades of MADS-box genes and their role in

the development and evolution of flowering plants. Mol. Phylogenet. Evol. 29: 464-489.

BIELAWSKI, J. P. and Z. YANG, 2003 Maximum likelihood methods for detecting adaptive

evolution after gene duplication. J. Struct. Funct. Genomics. 3: 201-212.

BOMMERT, P., N. SATOH-NAGASAWA, D. JACKSON and H.-Y. HIRANO, 2005 Genetics

and evolution of inflorescence and flower development in grasses. Plant Cell Physiol. 46:

69-78.

BOWMAN, J. L., G. N. DREWS and E. M. MEYEROWITZ, 1991 Expression of the

Arabidopsis floral homeotic gene AGAMOUS is restricted to specific cell types late in

flower development. Plant Cell 3: 749-758.

34 Preston and Kellogg, Evolution of AP1/FUL-like genes

BOWMAN, J. L., D. R. SMYTH and E. M. MEYEROWITZ, 1989 Genes directing flower

development in Arabidopsis. Plant Cell 1: 37-52.

BREMER, K., 2002 Gondwanan evolution of the grass alliance of families (Poales). Evolution

56: 1374-1387.

CARPENTER, R., L. COPSEY, C. VINCENT, S. DOYLE, R. MAGRATH and E. COEN, 1995

Control of flower development and phyllotaxy by meristem identity genes in Antirrhinum.

Plant Cell 7: 2001-2011.

CHENG, P. C., R. I. GREYSON and D. B. WALDEN, 1983 Organ initiation and the

development of unisexual flowers in the tassel and ear of Zea Mays. Am. J. Bot. 70: 450-

462.

CLIFFORD, H. T., 1961 Floral evolution in the family Gramineae. Evolution 15: 455-460.

CLIFFORD, H. T., 1987 Spikelet and floral morphology, pp. 21-30 in Grass Systematics and

Evolution, edited by T. R. SODERSTROM, K. W. HILU, C.S. CAMPBELL, and M. E.

BARKWORTH. Smithsonian Institute Press, Washington DC.

COEN, E. S. and E. M. MEYEROWITZ, 1991 The war of the whorls: genetic interactions

controlling flower development. Nature 353: 31-37.

COOPER, B., J. D. CLARKE, P. BUDWORTH, J. KREPS, D. HUTCHISON, S. PARK, S.

GUIMIL, M. DUNN, P. LUGINBUHL, C. ELLERO, S. A. GOFF and J. GLAZEBROOK,

2003 A network of rice genes associated with stress response and seed development. Proc.

Nat. Acad. Sci. USA 100: 4945-4950.

DANILEVSKAYA, O., D. SELINGER, P. HERMON, E. ANANIEV, D. BICKEL, and J.

PECCOUD. 2005. Identification of two maize MADS box transcription factors that

promote floral transition. 47th Annual Maize Genetics Conference: Abstract T14, p. 27.

35 Preston and Kellogg, Evolution of AP1/FUL-like genes

DANYLUK, J., N. A. KANE, G. BRETON, A. E. LIMIN, D. B. FOWLER and F. SARHAN,

2003 TaVRT-1, a putative transcription factor associated with vegetative to reproductive

transition in cereals. Plant Physiol. 132: 1849-1860.

DE BODT, S., J. RAES, Y. VAN DE PEER and G. THEISSEN, 2003 And then there were so

many: MADS goes genomic. Trends Plant Sci. 8: 475-483.

DOEBLEY, J., and L. LUKENS, 1998 Transcriptional regulators and the evolution of plant

form. Plant Cell 10: 1075-1082.

DOYLE, J. J., and J. L. DOYLE, 1987 A rapid DNA isolation procedure for small quantities of

fresh leaf material. Phytochem. Bull. 19: 11-15.

EWING, B., L. HILLIER, M. C. WENDL and P. GREEN, 1998 Base-calling of automated

sequencer traces using phred. I. accuracy assessment. Genome Res. 8: 175-185.

FELSENSTEIN, J., 1985 Confidence limits on phylogenies: an approach using the bootstrap.

Evolution 39: 783-791.

FERRÁNDIZ, C., Q. GU, R. MARTIENSSEN and M. F. YANOFSKY, 2000 Redundant

regulation of meristem identity and plant architecture by FRUITFULL, APETALA1 and

CAULIFLOWER. Development 127: 725-734.

FORCE, A. G., W. A. CRESKO and F. B. PICKETT, 2004 Information accretion, gene

duplication, and the mechanisms of genetic module parcellation, pp. 315-337 in Modularity

in Development and Evolution, edited by G. SCHLOSSER and G. P. WAGNER. University

of Chicago Press, Chicago.

FORCE, A., M. LYNCH, F. B. PICKETT, A. AMORES, Y. L. YAN and J. POSTLETHWAIT,

1999 Preservation of duplicate genes by complementary, degenerative mutations. Genetics

151: 1531-1545.

36 Preston and Kellogg, Evolution of AP1/FUL-like genes

FREEMAN, W. M., S. J. WALKER and K. E. VRANA, 1999 Quantitative RT-PCR: pitfalls and

potential. BioTechniques 26: 112-125.

FU, D., P. SZUCS, L. YAN, M. HELGUERA, J. S. SKINNER, J. VON ZITZEWITZ, P. M.

HAYES and J. DUBCOVSKY, 2005 Large deletions within the first intron in VRN-1 are

associated with spring growth habit in barley and wheat. Mol. Genet. Genomics 273: 54-65.

GARCIA-FERNANDEZ, J., 2005 The genesis and evolution of homeobox gene clusters. Nat.

Rev. Genet. 6: 881-892.

GAUT, B. S., A. S. PEEK, B. R. MORTON and M. T. CLEGG, 1999 Patterns of genetic

diversification within the Adh gene family in the grasses (Poaceae). Mol. Biol. Evol. 16:

1086-1097.

GAUT, B. S., 2002 Evolutionary dynamics of grass genomes. New Phytol. 154: 15-28.

GOCAL, G. F. W., R. W. KING, C. A. BLUNDELL, O. M. SCHWARTZ, C. H. ANDERSEN

and D. WEIGEL, 2001 Evolution of floral meristem identity genes. Analyses of Lolium

temulentum genes related to APETALA1 and LEAFY of Arabidopsis. Plant Physiol. 125:

1788-1801.

GOLDMAN, N., J. P. ANDERSON and A. G. RODRIGO, 2000 Likelihood-based tests of

topology in phylogenetics. Syst. Biol. 49: 652-670.

GRAHAM, S. W., J. M. ZGURSKI, M. A. MCPHERSON, D. M. CHERNIAWSKY, J. M.

SAARELA, E. S. C. HORNE, S. Y. SMITH, W. A. WONG, H. E. O’BRIEN, V. L.

BIRON, J. C. PIRES, R. G. OLMSTEAD, M. W. CHASE and H. S. RAI, 2006 Robust

inference of monocot deep phylogeny using an expanded multigene plastid data set, pp. – in

Monocots: Comparative Biology and Evolution, 2 vols, edited by J. T. COLUMBUS, E. A.

37 Preston and Kellogg, Evolution of AP1/FUL-like genes

FRIAR, C. W. HAMILTON, J. M. PORTER, L. M. PRINCE and M. G. SIMPSON. Rancho

Santa Ana Botanical Garden, Claremont, CA.

GRASS PHYLOGENY WORKING GROUP (GPWG), 2001 Phylogeny and subfamilial

classification of the grasses (Poaceae). Ann. Mo. Bot. Gard. 88: 373-457.

GRECO, R., L. STAGI, L. COLOMBO, G. C. ANGENENT, M. SARI-GORLA and M. E. PE,

1997 MADS box genes expressed in developing inflorescences of rice and sorghum. Mol.

Gen. Genet. 253: 615-623.

GU, Q., C. FERRANDIZ, M. F. YANOFSKY and R. MARTIENSSEN, 1998 The FRUITFULL

MADS-box gene mediates cell differentiation during Arabidopsis fruit development.

Development 125: 1509-1517.

HE, M., E. LIU and K. CONWAY, 1995 Semi-quantitative reverse transcriptase-polymerase

chain reaction for the detection of low abundance transcripts in human cells, pp. 289-299 in

Reverse transcriptase PCR, edited by J. W. LARRICK and P. D. SIEBERT. Ellis Harwood,

Hertfordshire, UK.

HEUER, S., S. HANSEN, J. BANTIN, R. BRETTSCHNEIDER, E. KRANZ, H. LORZ and T.

DRESSELHAUS, 2001 The maize MADS box gene ZmMADS3 affects node number and

spikelet development and is co-expressed with ZmMADS1 during flower development in

egg cells, and early embryogenesis. Plant Physiol. 127: 33-45.

HONMA, T. and K. GOTO, 2001 Complexes of MADS-box proteins are sufficient to convert

leaves into floral organs. Nature 409: 525-529.

HUELSENBECK, J. P. and F. RONQUIST, 2001 MrBayes: Bayesian inference of phylogenetic

trees. Bioinformatics 17: 754-755.

38 Preston and Kellogg, Evolution of AP1/FUL-like genes

HUGHES, A. L., 1994 The evolution of functionally novel proteins after gene duplication. Proc.

R. Soc. Lond. B Biol. Sci. 256: 119-124.

HUGHES, A. L., 1999 Adaptive Evolution of Genes and Genomes. Oxford University Press,

New York.

HUIJSER, P., J. KLEIN, W. E. LONNIG, H. SAEDLER and H. SOMMER, 1992 Bracteomania,

an inflorescence anomaly, is caused by the loss of function of the MADS-box gene

squamosa in Antirrhinum majus. EMBO J. 11: 1239-1249.

IKEDA, K., H. SUNOHARA and Y. NAGATO, 2004 Developmental course of inflorescence

and spikelet in rice. Breed. Sci. 54: 147-156.

IRISH, E. E., 1998 Grass spikelets: a thorny problem. Bioessays 20: 789-793.

JENSEN, L. B., J. R. ANDERSEN, U. FREI, Y. XING, C. TAYLOR, P. B. HOLM and T.

LUBBERSTEDT, 2005 QTL mapping of vernalization response in perennial ryegrass

(Lolium perenne L.) reveals co-location with an orthologue of wheat VRN1. Theor. Appl.

Genet. 110: 527-536.

JEON, J. S., S. LEE, K. H. JUNG, W. S. YANG, G. H. YI, B. G. OH and G. H. AN, 2000

Production of transgenic rice plants showing reduced heading date and plant height by

ectopic expression of rice MADS-box genes. Mol. Breeding 6: 581-592.

JOFUKU, K. D., B. G. W. DEN BOER, M. VAN MONTAGU and J. K. OKAMURA, 1994

Control of Arabidopsis flower and seed development by the homeotic gene APETALA2.

Plant Cell 6: 1211-1225.

JUDZIEWICZ, E. J., L. G. CLARK, X. LONDOÑO and M. J. STERN, 1999 American

Bamboos. Smithsonian Institution Press, WA.

39 Preston and Kellogg, Evolution of AP1/FUL-like genes

JUDZIEWICZ, E. J. and T. R. SODERSTROM, 1997 Morphological, anatomical, and

taxonomic studies in Anomochloa and Streptochaeta (Poaceae: Bambusoideae).

Smithsonian Contr. Bot. 68: 1-52.

KECK, E., P. MCSTEEN, R. CARPENTER and E. COEN, 2003 Separation of genetic functions

controlling organ identity in flowers. EMBO J. 22: 1058-1066.

KELLOGG, E. A., 2001 Evolutionary history of grasses. Plant Physiol. 125: 1198-1205.

KEMPIN, S., B. SAVIDGE and M. F. YANOFSKY, 1995 Molecular basis of the cauliflower

phenotype in Arabidopsis. Science 267: 522-525.

KOFUJI, R., N. SUMIKAWA, M. YAMASAKI, K. KONDO, K. UEDA, M. ITO and M.

HASEBE, 2003 Evolution and divergence of MADS-box gene family based on genome

wide expression analyses. Mol. Biol. Evol. 20: 1963-1977.

KRAMER, E. M., V. S. DI STILIO and P. M. SCHLUTER, 2003 Complex patterns of gene

duplications in the APETALA3 and PISTILLATA lineages of the Ranunculaceae. Int. J.

Plant Sci. 164: 1-11.

KRIZEK, B. A. and E. M. MEYEROWITZ, 1996 The Arabidopsis homeotic gene APETALA3

and PISTILLATA are sufficient to provide the B class organ identity function. Development

122: 11-22.

KYOZUKA, J., T. KOBAYASHI, M. MORITA and K. SHIMAMOTO, 2000 Spatially and

temporally regulated expression of rice MADS box genes with similarity to Arabidopsis

class A, B and C genes. Plant Cell Physiol. 41: 710-718.

LAMB, R. S. and V. F. IRISH, 2003 Functional divergence within the

APETALA3/PISTILLATA floral homeotic gene lineages. Proc. Natl. Acad. Sci. USA 100:

6558-6563.

40 Preston and Kellogg, Evolution of AP1/FUL-like genes

LAURIE, D. A., N. PRATCHETT, K. M. DEVOS, I. J. LEITCH and M. D. GALE, 1993 The

distribution of RFLP markers on chromosome 2 (2H) of barley in relation to the physical

and genetic location of 5S rDNA. Theor. Appl. Genet. 87: 177-183.

LI, W.-H., 1980 Rate of gene silencing at duplicate loci: a theoretical study and interpretation of

data from tetraploid fishes. Genetics 95: 237-258.

LI, W.-H., 1997 Molecular evolution. Sinauer, Sunderland, MA.

LI, X. and M. NOLL, 1994 Evolution of distinct developmental functions of the Drosophila

genes by acquisition of different cis-regulatory regions. Nature 367: 83-87.

LILJEGREN, S. J., A. H. ROEDER, S. A. KEMPIN, K. GREMSKI, L. OSTERGAARD, S.

GUIMIL, D. K. REYES and M. F. YANOFSKY, 2004 Control of fruit patterning in

Arabidopsis by INDEHISCENT. Cell 116: 843-853.

LITT, A. and V. F. IRISH, 2003 Duplication and diversification in the

APETALA1/FRUITFULL floral homeotic gene lineage: implications for the evolution of

floral development. Genetics 165: 821-833.

LOUKOIANOV, A., L. YAN, A. BLECHL, A. SANCHEZ and J. DUBCOVSKY, 2005

Regulation of VRN-1 vernalization genes in normal and transgenic polyploid wheat. Plant

Phys. 138: 2364-2373.

LYNCH, M. and A. FORCE, 2000 The probability of duplicate gene preservation by

subfunctionalization. Genetics 154: 459-473.

LYNCH, M., M. O’HELY, B. WALSH and A. FORCE, 2001 The probability of preservation of

a newly arisen gene duplicate. Genetics 159: 1789-1804.

MA, H. and C. DEPAMPHILIS, 2000 The ABCs of flower evolution. Cell 101: 5-8.

41 Preston and Kellogg, Evolution of AP1/FUL-like genes

MADDISON, D. R. and W. P. MADDISON, 2003 MacClade: analysis of phylogeny and

character evolution. Sinauer Associates, Sunderland, MA.

MALCOMBER, S. T. and E. A. KELLOGG, 2004 Heterogeneous expression patterns and

separate roles of the SEPALLATA gene LEAFY HULL STERILE1 in grasses. Plant Cell 16:

1692-1706.

MANDEL, M. A., C. GUSTAFSON-BROWN, B. SAVIDGE and M. F. YANOFSKY, 1992

Molecular characterization of the Arabidopsis floral homeotic gene APETALA1. Nature

360: 273-277.

MANDEL, M. A. and M. F. YANOFSKY, 1995 A gene triggering flower formation in

Arabidopsis flowers. Genes Dev. 15: 3355-3364.

MARTINEZ-CASTILLA, L. P. and E. R. ALVAREZ-BUYLLA, 2003 Adaptive evolution in the

Arabidopsis MADS-box gene family inferred from its complete resolved phylogeny. Proc.

Natl. Acad. Sci. USA 100: 13407-13412.

MASIERO, S., L. M.-A. LI, I. WILL, U. HARTMANN, H. SAEDLER, P. HUIJSER, Z.

SCHWARZ-SOMMER and H. SOMMER, 2004 INCOMPOSITA: a MADS-box gene

controlling prophyll development and floral meristem identity in Antirrhinum. Development

131: 5981-5990.

MENA, M., M. A. MANDEL, D. R. LERNER, M. F. YANOFSKY and R. J. SCHMIDT, 1995

A characterization of the MADS-box gene family in maize. Plant J. 8: 845-854.

MIYATA, T. and T. YASUNAGA, 1980 Molecular evolution of mRNA: a method for

estimating evolutionary rates of synonymous and amino acid substitutions from homologous

nucleotide sequences and its application. J. Mol. Evol. 16: 23-36.

42 Preston and Kellogg, Evolution of AP1/FUL-like genes

MONSON, R. K., 2003 Gene duplication, neofunctionalization, and the evolution of C4

photosynthesis. Int. J. Plant Sci. 164: S43-S54.

MOON, Y.-H. H.-G, KANG, J.-Y JUNG, J.-S. JOEN, S.-K. SUNG and G. AN, 1999

Determination of the motif responsible for interaction between the rice

APETALA1/AGAMOUS-LIKE9 family proteins using a yeast two-hybrid system. Plant

Physiol. 120: 1193-1204.

MULLER, B. M., H. SAEDLER and S. ZACHGO, 2001 The MADS-box gene DEFH28 from

Antirrhinum is involved in the regulation of floral meristem identity and fruit development.

Plant J. 28: 169-179.

MURAI, K., M. MIYAMAE, H. KATO, S. TAKUMI and Y. OGIHARA, 2003 WAP1, a wheat

APETALA1 homolog, plays a central role in the phase transition from vegetative to

reproductive growth. Plant Cell Physiol. 44: 1255-1265.

NAGASAWA, N., M. MIYOSHI, Y. SANO, H. SATOH, H.-Y. HIRANO, H. SAKAI and Y.

NAGATO, 2003 SUPERWOMAN1 and DROOPING LEAF genes control floral organ

identity in rice. Development 130: 705-718.

NEI, M. and A. K. ROYCHOUDHURY, 1973 Probability of fixation of nonfunctional genes at

duplicate loci. Am. Nat. 107: 362-372.

NYLANDER, J. A. A., 2004 MrModeltest v2. Program distributed by the author. Evolutionary

Biology Centre, Uppsala University.

OHNO, S., 1970 Evolution by Gene Duplication. Springer-Verlag, New York.

PELAZ, S., G. S. DITTA, E. BAUMANN, E. WISMAN and M. F. YANOFSKY, 2000 B and C

floral organ identity functions require SEPALLATA MADS-box genes. Nature 405: 200-

203.

43 Preston and Kellogg, Evolution of AP1/FUL-like genes

PELAZ, S., C. GUSTAFSON-BROWN, S. E. KOHALMI, W. L. CROSBY and M. F.

YANOFSKY, 2001 APETALA1 and SEPALLATA1 interact to promote flower

development. Plant J. 26: 385-394.

PELUCCHI, N., F. FORNARA, C. FAVALLI, S. MASIERO, C. LAGO, M. E. PÈ, L.

COLOMBO and M. M. KATER, 2002 Comparative analysis of rice MADS-box genes

expressed during flower development. Sex. Plant Reprod. 15: 113-122.

PETERSEN, K., T. DIDION, C. H. ANDERSEN and K. K. NIELSEN, 2004 MADS-box genes

from perennial ryegrass differentially expressed during transition from vegetative to

reproductive growth. J. Plant Physiol. 161: 439-447.

PETERSEN, K., E. KOLMOS, M. FOLLING, K. SALCHERT, M. STORGAARD, C. S.

JENSEN, T. DIDION and K. K. NIELSEN, 2006 Two MADS-box genes from perennial

ryegrass are regulated by vernalization and involved in the floral transition. Physiol. Plant.

126: 268-278.

PRASAD, K., P. SRIRAM, C. S. KUMAR, K. KUSHALAPPA and U. VIJAYRAGHAVAN,

2001 Ectopic expression of rice OsMADS1 reveals a role in specifying the lemma and palea,

grass floral organs analogous to sepals. Dev. Genes Evol. 211: 281-290.

PURUGGANAN, M. D., S. D. ROUNSLEY, R. J. SCHMIDT and M. F. YANOFSKY, 1995

Molecular evolution of flower development: diversification of the plant MADS-box

regulatory gene family. Genetics 140: 345-356.

RIECHMANN, J. L., B. A. KRIZEK and E. M. MEYEROWITZ, 1996 Dimerization specificity

of Arabidopsis MADS domain homeotic proteins APETALA1, APETALA3, PISTILLATA,

and AGAMOUS. Proc. Natl. Acad. Sci. USA 93: 4793-4798.

44 Preston and Kellogg, Evolution of AP1/FUL-like genes

RIECHMANN, J. L. and E. M. MEYEROWITZ, 1997 Determination of floral organ identity by

Arabidopsis MADS domain homeotic proteins AP1, AP3, PI, and AG is independent of

their DNA-binding specificity. Mol. Biol. Cell 8: 1243-1259.

ROBLES, P. and S. PELAZ, 2005 Flower and fruit development in Arabidopsis thaliana. Int. J.

Dev. Biol. 49: 633-643.

ROSIN, F. M., J. K. HART, H. VAN ONCKELEN and D. J. HANNAPEL, 2003 Suppression of

a vegetative MADS box gene of potato activates axillary meristem development. Plant.

Physiol. 131: 1613-1622.

SCHMITZ, J., R. FRANZEN, T. H. NGYUEN, F. GARCIA-MAROTO, C. POZZI, F.

SALAMINI and W. ROHDE, 2000 Cloning, mapping and expression analysis of barley

MADS-box genes. Plant Mol. Biol. 42: 899-913.

SOKAL, R. R. and F. J. ROLF, 2001 Biometry - Third Edition. W. H. Freeman and Company,

NY.

SUZUKI, Y. and M. NEI, 2001 Reliabilities of parsimony-based and likelihood-based methods

for detecting positive selection at single amino acid sites. Mol. Biol. Evol. 18: 2179-2185.

SWOFFORD, D. L., 2001 PAUP*: phylogenetic analysis using parsimony (and other methods).

Sinauer Associates, Sunderland, MA.

THEISSEN, G., A. BECKER, A. DI ROSA, A. KANNO, J. T. KIM, T. MÜNSTER, K.-U.

WINTER and H. SAEDLER, 2000 A short history of MADS-box genes in plants. Plant

Mol. Biol. 42: 115-149.

TREVASKIS, B., D. J. BAGNALL, M. H. ELLIS, W. J. PEACOCK and E.S. DENNIS, 2003

MADS box genes control vernalization-induced flowering in cereals. Proc. Natl. Acad. Sci.

USA 100: 13099-13104.

45 Preston and Kellogg, Evolution of AP1/FUL-like genes

TSAFTARIS, A. S., K. PASENTSIS, I. ILIOPOULOS and A. N. POLIDOROS, 2004 Isolation

of three homologous AP1-like MADS-box genes in crocus (Crocus sativus L.) and

characterization of their expression. Plant Sci. 166: 1235-1243.

VON ZITZEWITZ, J., P. SZUCS, J. DUBCOVSKY, L. YAN, E. FRANCIA, N. PECCHIONI,

A. CASA, T. H. CHEN, P. M. HAYES and J. S. SKINNER, 2005 Molecular and structural

characterization of barley vernalization genes. Plant Mol. Biol. 59: 449-467.

WALSH, J. B., 1995 How often do duplicated genes evolve new functions? Genetics 139: 421-

428.

WEIGEL, D. and E. M. MEYEROWITZ, 1994 The ABCs of floral homeotic genes. Cell 78:

203-209.

WHIPPLE, C. J., P. CICERI, C. M. PADILLA, B. A. AMBROSE, S. L. BANDONG and R. J.

SCHMIDT, 2004 Conservation of B-class floral homeotic gene function between maize and

Arabidopsis. Development 131: 6083-6091.

WHIPPLE, C. J. and R. J. SCHMIDT. 2006. Genetics of grass flower development. Adv. Bot.

Res. In press.

WONG, W. S. W., Z. YANG, N. GOLDMAN and R. NIELSEN, 2004 Accuracy and power of

statistical methods for detecting adaptive evolution in protein coding sequences and for

identifying positively selected sites. Genetics 168: 1041-1051.

YAMAGUCHI, T., D. Y. LEE, A. MIYAO, H. HIROCHIKA, G. AN and H.-Y. HIRANO, 2006

Functional diversification of the two C-class MADS box genes OSMADS3 and OSMADS58

in Oryza sativa. Plant Cell 18: 15-28.

46 Preston and Kellogg, Evolution of AP1/FUL-like genes

YAN, L., A. LOUKOIANOV, G. TRANQUILLI, M. HELGUERA, T. FAHIMA and J.

DUBCOVSKY, 2003 Positional cloning of the wheat vernalization gene VRN1. Proc. Natl.

Acad. Sci. USA 100: 6263-6268.

YANG, Z., 1997 PAML: a program package for phylogenetic analysis by maximum likelihood.

CABIOS 13: 555-556.

YANG, Z., 1998 Likelihood ratio tests for detecting positive selection and application to primate

lysozyme evolution. Mol. Biol. Evol. 15: 568-573.

YANOFSKY, M. F., H. MA, J. L. BOWMAN, G. N. DREWS, K. A. FELMANN and E. M.

MEYEROWITZ, 1990 The protein encoded by the Arabidopsis homeotic gene agamous

resembles transcription factors. Nature 346: 35-39.

YU, H., and C. J. GOH, 2000 Identification and characterization of three orchid MADS-box

genes of the AP1/AGL9 subfamily during floral transition. Plant Physiol. 123: 1325-1336.

YU, H., T. ITO, F. WELLMER and E. M. MEYEROWITZ, 2004 Repression of AGAMOUS-

LIKE 24 is a crucial step in promoting flower development. Nat. Genet. 36: 157-161.

ZAHN, L. M., H. KONG, J. H. LEEBENS-MACK, S. KIM, P. S. SOLTIS, L. L. LANDHERR,

D. E. SOLTIS, C. W. DEPAMPHILIS and H. MA, 2005 The evolution of the SEPALLATA

subfamily of MADS-box genes: a preangiosperm origin with multiple duplications

throughout angiosperm history. Genetics 169: 2209-2223.

47 Preston and Kellogg, Evolution of AP1/FUL-like genes

Table 1. AP1/FUL homologues included in this study

Taxon Family/Subfamily Gene* Accession No.

Dendrobium grex Orchidaceae/-- DOMADS2 AF198175

Crocus sativus Iridaceae/-- Kcap1b AY337929

Lilium lancifolium Liliaceae/-- LisFUL DQ792945

Allium sp. Alliaceae/-- AlFL AY306138

Agapanthus africanus Agapanthaceae/-- AgsFUL DQ792946

Elaeis guineensis Arecaceae/-- PAS1 AF411840

Tradescantia virginiana Commelinaceae/-- TvFL1 AY306190

Cyperus longus Cyperaceae/Cyperoideae CylFUL DQ792947

Xyris sp. Xyridaceae/Xyridoideae XsFUL DQ792948

Joinvillea ascendens Joinvilleaceae/-- JaFUL DQ792949

Streptochaeta angustifolia Poaceae/Anomochlooideae SaFUL2 DQ792950

Pharus sp. Poaceae/Pharoideae PsFUL2 DQ792951

Dendrocalamus latiflorus Poaceae/Bambusoideae DlMADS1 AY395714

Lithachne humilis Poaceae/Bambusoideae LhFUL2 DQ792952

Oryza sativa Poaceae/Ehrhartoideae OsMADS14 AF058697

OsMADS15 AF058698

OsMADS18 AF091458

Ehrharta erecta Poaceae/Ehrhartoideae EeFUL1 DQ792953

EeFUL2 DQ792954

Leersia virginica Poaceae/Ehrhartoideae LvFUL1 DQ792955

LvFUL2 DQ792956

Brachypodium distachyon Poaceae/Pooideae BdFUL1 DQ792957

BdFUL2a DQ792958

48 Preston and Kellogg, Evolution of AP1/FUL-like genes

BdFUL2b DQ792959

Triticum aestivum Poaceae/Pooideae TaMADS11 AB007504

Triticum monococcum Poaceae/Pooideae TmFUL2a DQ792960

TmFUL2b DQ792961

Hordeum vulgare Poaceae/Pooideae HvMADS3 AJ249143

HvMADS5 AJ249144

HvMADS8 AJ249146

Lolium temulentum Poaceae/Pooideae LtMADS1 AF035378

LtMADS2 AF035379

Phalaris canariensis Poaceae/Pooideae PcFUL1 DQ792962

PcFUL2 DQ792963

Avena strigosa Poaceae/Pooideae AstFUL1 DQ792964

Avena sativa Poaceae/Pooideae AsFUL1a DQ792965

AsFUL1b DQ792966

AsFUL2 DQ792967

Eleusine indica Poaceae/Chloridoideae EiFUL1a DQ792968

EiFUL1b DQ792969

EiFUL2 DQ792970

Eragrostis pilosa Poaceae/ Chloridoideae EpFUL1 DQ792971

EpFUL2 DQ792972

Chasmanthium latifolium Poaceae/Centothecoideae ClFUL1 DQ792973

ClFUL2 DQ792974

Setaria viridis Poaceae/Panicoideae SvFUL1 DQ792975

Setaria italica Poaceae/Panicoideae SiFUL1 DQ792976

SiFUL2 DQ792977

49 Preston and Kellogg, Evolution of AP1/FUL-like genes

Sorghum bicolor Poaceae/Panicoideae SbFUL1 DQ792978

Poaceae/Panicoideae SbMADS2 U32110

Zea mays Poaceae/Panicoideae ZmMADS4 AJ430641

ZmMADS15 AJ430632

ZmMADS3 AF112150

ZAP1 L46400

Pennisetum glaucum Poaceae/Panicoideae PgFUL1a DQ792979

PgFUL1b DQ792980

Note-Voucher information for each sequence is included in Genbank. Newly generated sequences are indicated in bold.

*Gene names of new sequences correspond to the species’ initials and the putative phylogenetic position: prior to the duplication event (FUL), following the duplication event in the OsMADS14 clade (FUL1), or following the duplication event in the OsMADS15 clade (FUL2). Distinct alleles are designated a or b.

50 Preston and Kellogg, Evolution of AP1/FUL-like genes

Table 2. Codon model parameter estimates for AP1/FUL paralogues.

Model p l Estimates of parameters Positively selected sites

M0: one-ratio 1 -3,714.44 ϖ = 0.1181 None

Branch-specific models

Two-ratios 2 -3,714.27 ϖ0 = 0.1167, ϖ1 = 0.1525 None

Three-ratios 3 -3,714.22 ϖ0 = 0.1223, ϖ1 = 0.1527 None

ϖ2 = 0.1147

Site-specific models

M1a: near neutral 2 -3,635.68 p0 = 0.87366, (p1 = 0.12634) Not allowed

M2a: selection 4 -3,635.68 p0 = 0.87368, p1 = 0.12632 None

(p2 = 0.00000), ϖ2 = ∞

M3: discrete (K=3) 3 -3,614.81 p0 = 0.52591, (p1 = 0.39457), None

p2 = 0.07953, ϖ0 = 0.01531

ϖ1 = 0.18259, ϖ2 = 0.89122

M7: beta 2 -3,618.86 p = 0.32839, q = 1.74796 Not allowed

M8: beta & w>1 4 -3,619.31 p0 = 0.99707, p = 0.32866 167V (P = 0.86)

q = 1.75003, (p1 = 0.00293) 207T (P = 0.60)

ϖ = 118.19 208T (P = 0.73)

215L (P = 0.68)

Branch-site models

Model A: FUL1 4 -3,635.68 p0 = 0.87366, p1 = 0.12634, None

(= null Model A) (p2a + p2b = 0.000)

ϖ0 = 0.0729

ϖ2a = 1.00000, ϖ2b = 1.00000

Model B: FUL1 5 -3,626.84 p0 = 0.80202, p1 = 0.19798, None

(= null Model B) (p2 + p3 = 0.00), ϖ0 = 0.05027

ϖ1 = 0.52984, ϖ2 = ∞

51 Preston and Kellogg, Evolution of AP1/FUL-like genes

Model A: FUL2 4 -3,635.19 p0 = 0.77157, p1 = 0.11071 45L (P = 0.69)

(= null Model A) (p2a + p2b = 0.11773) 50T (P = 0.62)

ϖ0 = 0.0714, ϖ2a = 1.00000 145Q (P = 0.61)

ϖ2b = 1.00000 206D (P = 0.51)

Model B: FUL2 5 -3,626.84 p0 = 0.80202, p1 = 0.19798,

(= null Model B) (p2 + p3 = 0.00), ϖ0 = 0.05027

ϖ1 = 0.52984, ϖ2 = ∞

Note- Parameter estimates in bold indicate positive selection; those in parentheses are not free parameters (p). Putative sites under positive selection are based on the coding region of rice

OsMADS14. Estimates of κ ranged between 1.74 and 1.9.

52 Preston and Kellogg, Evolution of AP1/FUL-like genes

FIGURE LEGENDS

Figure 1. Simplified comparison between flowers of grasses, typical non-grass monocots, and the core eudicot Arabidopsis. (1A) Arabidopsis flower and associated inflorescence branches. (1B)

Generalized monocot flower with inflorescence branches. Each branch is subtended by a bract and gives rise to a prophyll (leaf-like structure). (1C) Generalized grass spikelet composed of a single floret, within which is a single flower. (1D) Modified spikelet or spikelet-equivalent of

Streptochaeta, composed of several bracts and a terminal flower. Gray structures are homologous between diagrams. Black structures are non-homologous between diagrams. BRA, bract; PRO, prophyll; GLU, glume; LEM, lemma; PAL, palea; TEP, tepal; SEP, sepal; PET, petal; LOD, lodicule; STA, stamen; GYN, gynoecium. Arrowheads indicate indeterminate growth.

Figure 2: Simplified grass phylogeny showing subfamilial relationships. Lowercase names are grass subfamilies. Uppercase names are sister families to grasses. Names in parentheses are examples of genera within subfamilies. BEP, Bambusoideae-Ehrhartoideae-Pooideae clade;

PACCAD, Panicoideae-Arundinoideae-Centothecoideae-Chloridoideae-Aristidoideae-

Danthoniodeae clade. Based on GPWG (2001).

Figure 3. Maximum likelihood tree using GTR + I + G model of evolution. Bayesian posterior probabilities >90 shown above the branches, MP non-parametric bootstrap values >70 shown below branches. ● = inferred duplication event. ❂ = sequences included in the PAML analysis.

A = foreground branch 1 in branch and branch-site models. B = foreground branch 2 in branch and branch-site models. Arrows indicate origin of the grass spikelet. Shaded boxes indicate taxa lacking a spikelet.

53 Preston and Kellogg, Evolution of AP1/FUL-like genes

Figure 4. Inferred history of codon substitutions within grass FUL1 and FUL2 clades based on maximum parsimony ancestral reconstructions. Numbers below the branches indicate total number of non-synonymous substitutions and number of unique non-synonymous substitutions respectively. Branches in bold represent the period of time between gene duplication and diversification of the BEP/PACCAD clades; one branch in the FUL1 lineage is equal to three branches in the FUL2 lineage since Streptochaeta and Pharus have lost their FUL1 copy.

Numbers in boxes above these branches denote the number of unreversed non-synonymous substitutions.

Figure 5. Aligned AP1/FUL codon sequences from Xyris (XsFUL), Joinvillea (JaFUL), Oryza

(OsMADS14 [FUL1] and OsMADS15 [FUL2]), Zea (ZmMADS15 [FUL1] and ZAP1 [FUL2]),

Pharus (PsFUL2) and Streptochaeta (SaFUL2). Black arrowheads indicate conserved codons distinguishing all members of the FUL1 and FUL2 clades in the larger dataset. Gray arrowheads indicate codon changes that are not conserved in all subsequent clades. All changes occurred along the branch leading to the FUL2 clade, and six of the conserved changes resulted in a new class of amino acid.

Figure 6. Non-grass monocot gene-specific RT-PCR products amplified from RNA of stems

(ST), leaves (LF), floral bracts (FB), fertile florets (FF), outer tepals (T1), inner tepals (T2), stamens (S), and carpels (C). RNA was extracted from early through mid-stage buds. Actin was used as a positive control.

54 Preston and Kellogg, Evolution of AP1/FUL-like genes

Figure 7. Grass gene-specific RT-PCR products amplified from RNA of roots (RT), stems (ST), leaves (LF), glumes (GL) or bracts (I-V, VI, X-XII; numbered sequentially from outer to inner), fertile florets (FF), and reduced florets (RF). The type of reduced floret and its location within the spikelet is indicated below the RF lane: sterile florets at the base (BS), staminate florets at the base (BST), pistillate florets at the apex (AP), staminate florets from male inflorescence (S), sterile florets in different spikelets from fertile florets (SS), and not applicable (NA). Dashed lines represent missing data. RNA was extracted from a variety of developmental stages except for florets, where RNA was extracted only at mid to late stages of development. Actin was used as a positive control.

Figure 8. Semi-quantitative RT-PCR of FUL1 and FUL2 amplified from RNA of stems (ST), leaves (LF), glumes (GL), fertile florets (FF), and reduced florets (RF). Histograms show relative expression profiles for FUL1 and FUL2 of Sorghum (A), Chasmanthium (B), Eleusine (C), and

Oryza (D) standardized against actin expression. Gel photos below histograms show FUL1 (top),

FUL2 (middle) and actin (bottom) bands used for quantification.

55