Mol Genet Genomics (2015) 290:1435–1446 DOI 10.1007/s00438-015-1004-z

ORIGINAL PAPER

Genome‑wide characterization and analysis of F‑box protein‑encoding genes in the Malus domestica genome

Hao‑Ran Cui · Zheng‑Rong Zhang · Wei Lv · Jia‑Ning Xu · Xiao‑Yun Wang

Received: 24 October 2014 / Accepted: 29 January 2015 / Published online: 18 February 2015 © Springer-Verlag Berlin Heidelberg 2015

Abstract

The F-box protein family is a large family that is characterized by conserved F-box domains of approximately 40–50 amino acids in the N-terminus. F-box proteins participate in diverse cellular processes, such as development of floral organs, signal transduction and response to stress, primarily as a component of the Skp1-cullin-F-box (SCF) complex. In this study, using a global search of the apple genome, 517 F-box protein-encoding genes (F-box genes for short) were identified and further subdivided into 12 groups according to their known functional domains, which suggests the different potential functions or processes that they are involved in. Among these domains, the galactose oxidase domain was analyzed for the first time in plants, and this domain was present with or without the Kelch domain. The F-box genes were distributed in all 17 apple chromosomes with various densities and tended to form gene clusters. Spatial expression profile analysis revealed that F-box genes have organ-specific expression and are widely expressed in all organs. Proteins that contained the galactose oxidase domain were highly expressed in leaves, flowers and seeds. From a fruit ripening expression profile, 166 F-box genes were identified. The expression of most of these genes changed little during maturation, but five of them increased significantly. qRT-PCR analysis of F-box genes that encode proteins with stress-related domains revealed that these genes were up- or down-regulated, which suggests that F-box genes are involved in the abiotic stress response. The results of this study help to elucidate the functions of F-box proteins, especially in Rosaceae plants.

Keywords Apple · F-box · Bioinformatics · Expression pattern · Stress

Communicated by S. Hohmann.

Electronic supplementary material The online version of this article (doi:10.1007/s00438-015-1004-z) contains supplementary material, which is available to authorized users.

H.‑R. Cui · Z.‑R. Zhang · W. Lv · J.‑N. Xu · X.‑Y. Wang (*)
College of Life Science, State Key Laboratory of Crop Biology, Shandong Agricultural University, Taian 271018, Shandong, People’s Republic of China
e-mail: [email protected]

Introduction

The F-box motif is a conserved sequence that is found widely in proteins from animals and plants. This sequence consists of approximately 40–50 amino acids, which usually occur in the N-terminal end of the proteins. Proteins with one or more F-box motifs are called F-box proteins. In animal species, the numbers of predicted F-box protein-encoding genes (F-box genes for short) are generally less than a hundred; for example, Drosophila, mice and humans have only 27, 74 and 69 F-box genes, respectively (Kipreos and Pagano 2000; Jin et al. 2004; Jia et al. 2013). However, many more F-box genes have been predicted in plant species, for example, 694 F-box genes in Arabidopsis and 687 and 359 in rice and maize, respectively (Gagne et al. 2002; Jain et al. 2007; Jia et al. 2013). The large numbers of F-box genes imply crucial roles and diverse functions, especially in the plant kingdom. In plants, F-box proteins bind with other proteins such as cullin1 and Skp1 to form an SCF complex (Skp1-cullin-F-box complex), which participates in the ubiquitin/26S proteasome pathway. The ubiquitin/26S proteasome pathway is a major posttranslational regulatory process that allows cells to degrade redundant proteins through a reaction cascade that requires three enzymes, an E1 (ubiquitin-activating enzyme), an E2 (ubiquitin-conjugating enzyme), and an E3 (ubiquitin ligase). Once ubiquitin is activated by E1, it is transferred to E3 under the catalysis of E2 and binds to substrate proteins through E3 (Ciechanover 1998). The SCF complex is one of the most well-known E3s, in which the cullin1, Rbx1, and Skp1 subunits comprise the core ligase activity, while F-box proteins interact with Skp1 proteins via the F-box domain and bind to substrate proteins through variable protein interaction domains (Xu et al. 2002).

Since the initial discovery of the F-box protein, cyclin F, in humans, numerous F-box proteins have been identified in eukaryotes. In plants, many F-box proteins are involved in a variety of different functions, including hormone signaling, defense response, the development of floral organs and flowering control, photomorphogenesis, circadian rhythm, and self-incompatibility. The protein UFO (Unusual Floral Organs) was the first F-box protein identified in plants, and it plays roles in the floral meristem and floral organ development (Levin and Meyerowitz 1995; Chae et al. 2008). Another set of F-box proteins, including FKF1, ZTL/LKP1 and LKP2, has been found to be involved in the regulation of the circadian clock, flowering time, and photomorphogenesis in Arabidopsis (Kiyosue and Wada 2000; Mas et al. 2003; Somers et al. 2004; Baudry et al. 2010; Takase et al. 2011). A nuclear F-box protein, EID1 (Empfindlicher im Dunkelroten Licht 1), which contains a leucine zipper motif, functions as a negative regulator in phytochrome A-specific light signaling by targeting phytochrome A for ubiquitin-dependent proteolysis (Marrocco et al. 2006). The F-box protein MAX2 is involved in photomorphogenesis, including inflorescence architecture and senescence, as a positive regulator. In addition, many F-box proteins have been demonstrated to take part in hormone perception and signaling processes. Several F-box proteins have been found to function as auxin receptors, such as TIR1, AFB1, AFB2 and AFB3, while the F-box proteins AtGID2 and AtSLEEPY1 are involved in gibberellin signaling by directing the SCF complex (McGinnis et al. 2003; Dill et al. 2004; Dharmasiri et al. 2005; Kepinski and Leyser 2005; Ariizumi et al. 2011). The F-box proteins EBF1 and EBF2 are related to the regulation of plant ethylene hormone signaling (Potuschak et al. 2003). Another F-box protein, COI1, functions as a receptor for jasmonic acid by assembling SCF complexes (Thines et al. 2007; Yan et al. 2009).

In apple (Malus domestica), only a few F-box protein functions have been discovered. Two homologous apple F-box proteins, EBF1 and FBCP1, are likely to participate in the ethylene signaling that is involved in the fruit ripening process (Han et al. 2008; Tacken et al. 2012). F-box proteins in apple have also been found to participate in the pollen determinants of self-incompatibility (Okada et al. 2013). The expression of the apple protein TLP7 can enhance stress tolerance in E. coli cells (Du et al. 2014). However, a genome-wide analysis of apple F-box genes has not yet been reported.

The completion of the apple (Malus domestica) genome map offers the possibility of investigating the F-box gene family in this species (Velasco et al. 2010). In this study, we performed a genome-wide search of F-box protein-encoding genes in the apple genome and analyzed their chromosomal distributions, functional domains and expression patterns in different organs and processes. A special domain called galactose oxidase was noticed for the first time as a functional domain in F-box proteins and was designated as a distinct group in classification. Transcription profile analysis demonstrated that F-box genes were expressed in all organs, and individual F-box genes showed organ-specific expression, which suggests that they have different functions. We also discovered several F-box genes that respond to chilling and fruit ripening. To the best of our knowledge, the present report is the first genome-wide report of the apple F-box family, and our results could provide useful information for further studies of the apple F-box gene family.

Materials and methods

Identification and classification of F‑box genes in apple

The genome and proteome sequences of Arabidopsis, maize and rice were downloaded from the NCBI ftp website. Two strategies were used. For the first method, BLASTp was performed by the Bioedit program with an E value cutoff of 0.001 to search the apple proteome using F-box proteins in maize and rice as queries. For the second method, Hmmer3.0 software was downloaded from the HMMER website (http://hmmer.janelia.org/) to perform a global search of the apple proteome. The Hidden Markov Model (HMM) profile of the F-box domain that was used in this investigation was downloaded from the pfam website (http://pfam.xfam.org/). All of the proteins that were obtained by the two strategies were submitted to the Interpro Database to confirm the presence of F-box domains. Gene sequences were obtained using their corresponding numbers in the apple genome with a Bioperl script (http://www.rosaceae.org/species/malus/malus_x_domestica/genome_v1.0p). The F-box protein sequences that were obtained above were further characterized by their conserved functional domains by submitting them to the Interpro Database, and they were manually classified into different groups according to their functional domains.

Sequence logo generation and sequence alignment

The apple F-box domain sequences were submitted to the MEME website (http://meme.nbcr.net/meme/cgi-bin/meme.cgi) to generate sequence logos. The optimum width of each motif was limited to 6–50 amino acids, and other parameters were set to their defaults. Multiple sequence alignments of the F-box regions of the 517 F-box genes that were identified above were performed using Clustal X2, which was downloaded from www.clustal.org, with default parameters (Larkin et al. 2007).

Chromosome distribution in the apple genome

Location data were downloaded from the genome annotation database of the Genome Database for Rosaceae (the GDR database for short, http://www.rosaceae.org/species/malus/malus_x_domestica/genome_v1.0). Genes that were localized to unassembled genomic sequence scaffolds were excluded. The MapDraw program was used to map the genes to the corresponding chromosomes (Liu and Meng 2003). Homologous genes that were located near one another were considered to be tandem duplicates.

Expression pattern of F‑box genes in apple

To study the expression pattern of F-box genes in different organs and their expression pattern during fruit ripening, series matrix data from the expression profiles GSE24523 and GSE42873 were downloaded from NCBI GEO datasets. For GSE24523, the BLASTp program was used to search the database (Apple Genome V1.0p Contigs) to obtain corresponding unigene IDs for each gene using the identified F-box proteins as queries. For GSE42873, the gene names were consistent with the names in the GDR database. Expression data for the identified genes were extracted from the two datasets using their unigene IDs and gene names, respectively, with a Visual Basic (Version 6.0) script. Clustering was performed by Cluster 3.0 (http://rana.lbl.gov/EisenSoftware.htm) using hierarchical clustering with average linkage, and a heatmap and clustering tree were constructed and viewed with Java Treeview (http://sourceforge.net/projects/jtreeview/?source=typ_redirect).

Sample preparation and total RNA extraction

For investigation of the expression under chilling stress, spring apple shoots of five individuals from 3-year-old cultivar M26 were treated at 4 °C for 1 and 6 h, while room temperature treatment was used as a control. The leaves were collected, placed into liquid nitrogen and stored at −70 °C until further use. With respect to the samples used for organ-specific expression, different organs were collected and also stored at −70 °C. The total RNA was isolated from the leaves using the CTAB procedure (Gasic et al. 2004). RNA concentrations and A260/A280 ratios were determined using a NanoDrop Spectrometer (ND-1000 Spectrophotometer, Peqlab). The integrity of the RNA samples was examined with an Agilent 2100 Bioanalyzer (RNA Nano Chip, Agilent, Santa Clara, CA, USA). Qualified RNA was used for cDNA synthesis and quantitative real-time PCR.

Quantitative real‑time PCR (qRT‑PCR) analysis

cDNA fragments were synthesized from total RNA using the TransScript™ One-step gDNA Removal and cDNA Synthesis SuperMix (TransGen Biotech, Beijing, China). The primers for amplifying specific genes were designed based on target gene sequences using the software Beacon Designer 8.10. The primer sequences are listed in Table S3. qRT-PCR was performed with a Stratagene Mx3000P thermocycler (Agilent) in a final volume of 20 μl that contained 0.8 μl cDNA, 10 μl 2× SYBR premix Ex Taq™ (Takara, Shiga, Japan) and 0.8 μl (10 μM) primers. The thermal cycling conditions were as follows: 44 cycles of 95 °C denaturation for 15 s, 55 °C annealing for 30 s and 72 °C extension for 15 s. The apple actin gene was used as an internal control.

Results

Genome‑wide identification of F‑box protein‑encoding genes in apple

To identify genes that encode F-box proteins in the apple genome, two strategies were used. In the first strategy, all of the predicted sequences of F-box proteins from Arabidopsis, maize and rice were used as queries to search the apple peptide database using BLASTp (Gagne et al. 2002; Jain et al. 2007; Jia et al. 2013). The E value in the BLAST program estimates the number of matches of equal or better score that are expected by chance, so smaller values indicate more reliable hits. To obtain credible results, the E value cutoff was set to 0.001, as in similar investigations (Li et al. 2011; Jia et al. 2013). Using this approach, approximately 3,000 sequences were identified. Based on the characteristics of the sequences, a Hidden Markov Model (HMM) profile of the F-box domain was established by the pfam website (http://pfam.xfam.org/). For the second strategy, the HMM profile of the F-box domain was used as a query to search the apple peptide database using hmmer 3.0. Using this strategy, 571 proteins that potentially contained the F-box domain were obtained. To further verify the F-box domains that were present in these proteins, all of the sequences that were identified by the above two strategies were submitted to the Interpro Database (http://www.ebi.ac.uk/Interpro/scan.html) for checking. All of the proteins that were obtained from the Interpro Database were further manually checked to remove redundant items. Finally, 517 F-box proteins and their encoding genes were obtained and are listed in Tables S1 and S2. The entries were named according to the chromosomal distributions of these genes.

Identification of functional domains in apple F‑box proteins and classification of F‑box genes

In addition to the F-box domain in the N-terminal end, F-box proteins contained diverse functional domains in their C-terminal region, and most of these domains were involved in protein–protein interactions (Xu et al. 2009). The Interpro Database was used to identify conserved functional domains in apple F-box proteins, and all of the F-box proteins that were identified were classified into 12 groups, as shown in Fig. 1. There were 152 apple F-box proteins that contained only the F-box domain without other known functional domains; this group, the FBX group, was the largest. The functional domains that were identified were FBA (an F-box associated domain), FBD (a domain often occurring in F-box proteins), TUB (the Tubby domain), WD40 (a conserved domain with approximately 40 amino acids ending with Trp and Asp), LRR (a Leu-rich repeat domain), Kelch-type domain, galactose oxidase domain, DUF (domain of unknown function) and zinc-finger domain. The groups that contained F-box proteins with FBD or FBA domains were named after the domain. Other groups were designated FB (F-box) with additional capital letters from the first or second letter of the functional domain (for example, FBT for proteins that contain a TUB domain and FBU for proteins that contain a DUF domain). The F-box proteins that contain both the Kelch domain and the galactose oxidase domain were classified into a separate group called FBKG, which distinguishes them from FBK or FBG proteins, which have only a Kelch domain or only a galactose oxidase domain. Other functional domains were also found in the F-box proteins, such as ATPase, glycoside hydrolase, tubulin, small GTPase superfamily and DnaJ domains. However, the number of proteins with each of these functional domains was less than 5. Therefore, these F-box proteins were classified into another group called FBO (F-box proteins that contain other domains). Schematic diagrams of typical members from each group are shown in Fig. 2.

Fig. 1 Numbers of F-box genes in different groups in the apple genome

Fig. 2 Schematic diagram of typical F-box protein family members from each group, showing the functional domains in their primary structure. The conserved domains are indicated by different shapes. The domain abbreviations are the following: Tubby Tubby domain, Zinc-finger Zinc-finger domain, WD40 WD40 repeat, FBA F-box associated domain, FBD FBD domain, Leu-rich repeat leucine-rich repeat, Kelch Kelch repeats, Galactose oxidase galactose oxidase domain, and DUF domain of unknown function

The galactose oxidase domain was chosen for the first time as a functional domain of the F-box proteins in this study, which differs from previous studies of F-box proteins in other plant species. There were 36 F-box proteins that contained the galactose oxidase domain in the apple genome, and they were classified into the FBKG and FBG groups according to whether or not they also had a Kelch domain.

To analyze the characteristics of the F-box domain, the sequences of the F-box domain from the above 517 genes in the apple genome were submitted to the MEME website (http://meme.nbcr.net/meme/cgi-bin/meme.cgi), with the optimum width of each motif limited to 6–50 amino acids and other parameters set to the default. A sequence logo was generated (Fig. 3). Within each stack, the height of a letter reflects how frequently that residue appears at the position. A conserved F-box motif of 40 amino acids in the apple genome was identified, in which the Leu, Pro, Val and Trp residues at positions 4, 5, 27, and 31 were highly conserved, as indicated by filled triangles in Fig. 3. In addition, the Leu (Phe), Leu (Val), Pro, Cys (Ser), Lys (Arg), and Phe (Leu) residues at positions 13, 16, 17, 28, 29, and 40, respectively, were conserved. The residues in parentheses are alternatives that appear with lower likelihood; they are indicated by empty triangles in Fig. 3.

Fig. 3 Sequence logos of F-box domains that were generated from all of the F-box proteins in apple. The x axis represents the relative positions of the F-box motifs. The y axis represents the information content as measured in bits

To further ascertain the conserved amino acid composition of the apple F-box proteins that are characterized above, a multiple sequence alignment was performed (Fig. S1). The results from both methods were consistent, which indicates that these residues could play important roles in protein recognition and interaction.

Chromosomal distributions of apple F‑box protein‑encoding genes

In this study, an attempt was made to map all of the 517 identified apple F-box genes to apple chromosomes. As a result, 492 F-box genes were distributed in all 17 of the apple chromosomes, while 25 could not be mapped. These genes could be localized to unassembled genomic sequence scaffolds (Li et al. 2011). As shown in Fig. 4, the number of F-box genes in different chromosomes was variable. The chromosome with the highest density of F-box genes was chromosome 11, which contained 52 F-box genes. In contrast, chromosome 6 contained only 7 F-box genes, which was the smallest number of F-box genes in a chromosome. Within the chromosomes, the apple F-box genes tended to form gene clusters. Genes within gene clusters that showed high similarity (higher than 70 %), e.g., MdFBX272 and MdFBX273 (which share a 99 % similarity), were defined as tandem gene duplications. The results indicated that tandem gene duplication occurred in apple F-box genes. In general, chromosomes that contained more F-box genes contained more duplication pairs, with the exception of chromosome 16. Based on these results, we hypothesized that the tandem duplication of genes was one reason for the expansion of the F-box genes. In addition to tandem gene duplication, a genome-wide duplication event also occurred in apple, which led to a larger chromosome number in apple than that in other Rosaceae plants. Genome-wide duplication also shaped the genome of domesticated apple and formed several intragenomic homologous regions (Velasco et al. 2010). Large numbers of apple F-box genes were located in homologous regions in different chromosomes (Fig. 4), which suggests that genome duplication was another important way for F-box genes to expand in the apple genome.

Fig. 4 Distribution of the F-box genes in 17 apple chromosomes. The numbers to the left of each chromosome represent a megabase between two F-box genes. Tandem duplications are indicated by light green (color figure online)

Apple F‑box protein‑encoding genes present organ‑specific expressions

To further analyze the organ-specific expression of apple F-box genes, an expression dataset, GSE42873, from the GEO database (from National Center for Biotechnology Information) was used to search the expression levels of the F-box genes identified above. All 517 F-box genes could be found in the apple expression profile. These F-box genes were further divided into seven groups according to their specific expression in different organs; these groups were named group I, IIA, IIB, IIIA, IIIB, IVA and IVB (Fig. 5a). In group I, all 15 genes shared a low expression level in the entire plant. Compared with other groups, the gene expression level in this group was relatively low in the stems, leaves, flowers and fruits. There were 85 genes that were classified into group IIA because they showed a higher expression level in the seeds, seedlings and roots compared with other organs. Among them, 8 genes were expressed mainly in the seeds and seedlings, including MdFBX1, MdFBX454, MdFBX356, MdFBX170, MdFBX87, MdFBX263, MdFBX262 and MdFBX432. In group IIB, 30 genes were clustered together due to their high expression in the seeds or seedlings and low expression in other organs. A total of 61 F-box genes were mainly expressed in the seeds and 38 mainly in the seedlings. The expression levels of 29 genes increased during seedling growth, which indicates their potential function in seed germination. Groups IIIA and IIIB shared the common characteristic of low expression in seeds and seedlings and relatively high expression in other organs. For the genes in group IIIA, the expression appeared to be low in all of the organs. In group IIIB, the expression of most of the F-box genes showed a slight increase during the period of fruit ripening, which indicates that these genes play roles in fruit maturation. Among them, 19 F-box genes were expressed highly in the flower, which indicates that these genes could be involved in processes that are related to floral organ development or flowering time. F-box genes in both group IVA and group IVB showed low expression levels in seeds. In group IVA, the expression of most genes increased during seed germination, while in group IVB, most of the genes were expressed highly in roots and stems. There were 23 F-box genes with high expression in leaves and 16 genes with high expression in fruits in group IVA.

Fig. 5 Organ-specific expression pattern of F-box genes. a Organ-specific expression of F-box genes in the entire apple genome. b Organ-specific expression of F-box genes that encode proteins with the galactose oxidase domain

The qRT-PCR technique was also used to verify the organ-specific expression of apple F-box genes. Ten genes were selected from each group to examine their expression in different organs using gene-specific primers. The results are shown in Fig. S2. The expression tendency of most F-box genes in different organs was consistent with the data in GSE42873.

The spatial expression pattern of genes encoding F-box proteins with a galactose oxidase domain was further analyzed because this domain was chosen for the first time as a functional domain in this investigation. All 36 genes were classified into three groups according to their expression similarity in different organs, i.e., groups GI, GIIA and GIIB (Fig. 5b). Group GI contained 25 members that had low expression mainly in seeds and seedlings. In contrast with group GI, the F-box genes in groups GIIA and GIIB shared high expression in seeds and seedlings. A total of 8 genes that belonged to group GI had high expression in flowers and leaves, while 5 genes in group GIIB showed high expression in roots. These results suggest that different F-box proteins with galactose oxidase domains were expressed in different organs and could be related to multiple functions.

F‑box genes involved in fruit ripening and the abiotic stress response

It has been demonstrated that F-box genes function mainly by directing the SCF complex through the ubiquitin/26S proteasome pathway. In apple, the F-box proteins EBF1 and FBCP1 have been identified to be involved in maturation by assembling the SCF complex (Han et al. 2008; Tacken et al. 2012). To detect other candidate F-box genes that are involved in fruit ripening, the transcriptome profile during fruit ripening (GSE24523 from GEO database) was analyzed. This transcriptome profile came from two apple cultivars (Honeycrisp and Cripps Pink) at different stages of ripening. Using corresponding contigs in the GDR unigene database, we searched for the 517 identified F-box genes in GSE24523, and 166 genes were found in the transcriptome profile for fruit ripening. The 166 genes were classified into four groups: RIA, RIB, RIIA, and RIIB. Gene expression in groups RIA and RIB increased slightly during ripening, while gene expression in groups RIIA and RIIB decreased (Fig. 6). The results imply that some F-box genes could be involved in the apple fruit ripening process.

Fig. 6 Expression pattern of F-box genes in the apple genome during fruit ripening

Several F-box genes have also been found to play roles in abiotic stress (Stone and Callis 2007; Yan et al. 2011; Maldonado-Calderón et al. 2012; Chen et al. 2014). To study the response of apple F-box genes to cold stress, the expression of ten F-box genes that encode proteins with stress-related domains was measured by qRT-PCR, including six members that contain a tubby domain, one that contains an ARM domain, one that contains a peroxidase domain and two heat shock proteins that contain a DnaJ domain. After 1 and 6 h of chilling treatment at 4 °C, the expression of seven F-box genes increased, and among them were four genes that encoded F-box proteins with a tubby domain. The other three genes encoded F-box proteins with the ARM fold domain, peroxidase domain and DnaJ domain. On the other hand, the expression of three genes was down-regulated, specifically, two genes that encoded F-box proteins with a tubby domain and one that had a DnaJ domain (Fig. 7). These results indicate potential functions for F-box genes in the chilling defense process.
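
The up- or down-regulation calls above rest on comparing a target gene with the actin internal control in chilled and room-temperature samples. The text does not state how relative expression was calculated, so the sketch below assumes the widely used 2^-ΔΔCt calculation purely for illustration; the Ct values, the two-fold threshold and the function names are hypothetical and are not taken from the study.

```python
# Sketch of relative quantification with the 2^-ddCt calculation.
# Assumption: the text states only that actin served as the internal control;
# the calculation, Ct values and threshold below are illustrative.

def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene in treated versus control samples."""
    # Normalize the target gene to the reference gene in each condition.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Compare the normalized values between conditions.
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)


def classify(fold_change, threshold=2.0):
    """Label a gene 'up', 'down' or 'unchanged' using a hypothetical fold threshold."""
    if fold_change >= threshold:
        return "up"
    if fold_change <= 1.0 / threshold:
        return "down"
    return "unchanged"


# Hypothetical Ct values: (target, actin) after chilling and at room temperature.
fold = relative_expression(22.0, 18.0, 25.0, 18.0)
print(fold, classify(fold))
```

A lower Ct after normalization means more transcript, which is why the exponent is negated. The threshold is a design choice; a real analysis would also use replicate measurements and a statistical test.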

Discussion

As a large family, F-box proteins are widely present in eukaryotes. The numbers of the majority of F-box genes in animals were less than a hundred, although there were 326 F-box genes predicted in C. elegans that had the maxi- mum number known in animals (Kipreos and Pagano 2000; Jin et al. 2004; Jia et al. 2013). In plants, larger quanti- ties of F-box genes have been predicted than that in ani- mals. For example, in Arabidopsis, 694 F-box genes were predicted, with 687 predicted in rice and 359 predicted in maize. Abundant F-box genes could play important roles in eukaryotes, especially in plant species. The apple genome sequencing was completed in 2010, which provided the possibility of studying the characteristics of F-box genes

1 3 1444 Mol Genet Genomics (2015) 290:1435–1446

Fig. 7 Relative gene expression of some F-box genes during chilling treatment

on the apple genome scale (Velasco et al. 2010). In this rice and maize, respectively (Gagne et al. 2002; Jain et al. study, 517 genes that encoded F-box proteins were iden- 2007; Jia et al. 2013). Similarly, 34 F-box proteins that tified, which was an amount that was consistent with the contain the LRR domain were identified in apple, and they order of magnitude of F-box gene quantities in other plants. were classified as the FBL group (F-box proteins with LRR Zhang and his colleagues predicted 458 apple F-box genes domain) in our investigation. One member in this group, in Apple Gene Function and Gene Family Database (http:// MdFBX96, had an additional ARM domain, while the www.applegene.org/). They used the NCBI Conserved ARM domains were demonstrated to have functions in anti- Domain Database to predict F-box genes in the apple stress (Phillips et al. 2012). In this study, the expression of genome, which is a different method from our investigation the gene MdFBX96 was found to be induced by cold stress, (Zhang et al. 2013). Among these genes, 359 F-box genes which suggests a potential function of MdFBX96 in the were matched with our results, and more than a hundred stress response. genes were different. In our investigation, two strategies Another functional domain, WD40, contains repeated were used to scan the apple genome for F-box genes, and motifs of approximately 40 amino acids, which include a each potential F-box gene obtained was verified by submit- Trp-Asp (W-D) dipeptide at its C-terminus. This domain ting it to the Interpro Database for further characterization has been reported to have diverse functions (Li and Rob- by the functional domains. erts 2001). We verified six members that contain a WD40 The functional domains that are located in F-box pro- domain in the apple F-box family, while there were five teins have been found to be diverse. In a previous study, and two in maize and rice, respectively. 
Tubby proteins the F-box associated (FBA) domain is a motif that has 60 are thought to be related to obesity in animals (Kleyn amino acids in the C-terminus, which marks target proteins et al. 1996). In plants, a few tubby proteins are reported to for ubiquitination and degradation. In apple, F-box proteins be involved in anti-stress (Wardhan et al. 2012; Bao et al. with FBA domains were the most abundant, numbering up 2014; Du et al. 2014). In this investigation, the expression to 152 in total, while in rice and maize, only 4 and 17 mem- of three of the four apple genes that encode F-box proteins bers, respectively, were found (Gagne et al. 2002; Jain et al. that contain the tubby domain was up-regulated by cold 2007; Jia et al. 2013). The FBD domain is usually located treatment, which suggests that some F-box proteins with in plant F-box proteins. Although its precise function is the tubby domain could play important roles in stress. unknown, it is thought to be associated with biochemical Several Kelch repeats can associate to form a beta-pro- processes in the nucleus (Doerks et al. 2002). The LRR peller that has diverse functions, and the Kelch domain was domain occurs widely in proteins that range from viruses to also found in the apple F-box proteins. In the apple F-box eukaryotes and can provide a structural framework for the family, there were 41 proteins that contain a Kelch domain, formation of protein–protein interactions (Kobe and Kajava and 29 of them also had a galactose oxidase domain. The 2001). F-box proteins that contain the LRR domain were galactose oxidase domain contained a single copper ion, predicted to be 160, 61 and 61 members, in Arabidopsis, catalyzing the stereospecific oxidation of primary alcohols

into their corresponding aldehydes (Ito et al. 1991). Pathway analysis using the KEGG website (http://www.kegg.jp/kegg/pathway.html) reveals that galactose oxidase can transform D-galactose into D-galactonate, and the latter enters the pentose phosphate pathway via a series of steps. In total, 36 apple F-box proteins contained the galactose oxidase domain (with or without a Kelch domain), whereas this domain has not been reported in Arabidopsis, rice, or maize F-box proteins. Conversely, some functional domains found in the F-box proteins of other plants were not detected in apple F-box proteins, such as LysM (Lysin motif domain), DEXDc (DEAD-like helicases superfamily) and SEL1 (Sel1-like repeats) (Xu et al. 2009).

Previous studies showed that F-box proteins have different functions in different organs, such as the development of floral organs, flowering control, defense response, photomorphogenesis, circadian rhythm, and self-incompatibility (Kiyosue and Wada 2000; Mas et al. 2003; Somers et al. 2004; Baudry et al. 2010; Takase et al. 2011). In this study, 19 F-box genes in group IIIB were observed to have high expression in flowers, and these genes could be related to the development of apple floral organs or to flowering control. More than twenty genes in group IVA showed high expression in apple leaves, which implies that they have potential roles in light sensitivity, such as photomorphogenesis (Somers et al. 2004). Chilling treatment also caused some F-box genes to be up- or down-regulated, which suggests that these genes could have potential functions in the stress response.

This study reported the genome-scale analysis of F-box genes in apple. Several candidate F-box proteins were identified that could be involved in physiological processes such as flower development, seed germination, fruit ripening and the cold stress response. These results could provide useful information for further studies on the functions of F-box proteins in plant growth and development.

Acknowledgments This work was supported by Special Research Fund of Public Welfare of China Agricultural Ministry (201303093).

Conflict of interest The authors declare that they have no conflict of interest.

References

Ariizumi T, Lawrence PK, Steber CM (2011) The role of two f-box proteins, SLEEPY1 and SNEEZY, in Arabidopsis gibberellin signaling. Plant Physiol 155:765–775
Bao Y, Song WM, Jin YL, Jiang CM, Yang Y, Li B, Huang WJ, Liu H, Zhang HX (2014) Characterization of Arabidopsis Tubby-like proteins and redundant function of AtTLP3 and AtTLP9 in plant response to ABA and osmotic stress. Plant Mol Biol 86:471–483
Baudry A, Ito S, Song YH, Strait AA, Kiba T, Lu S, Henriques R, Pruneda-Paz JL, Chua NH, Tobin EM, Kay SA, Imaizumi T (2010) F-box proteins FKF1 and LKP2 act in concert with ZEITLUPE to control Arabidopsis clock progression. Plant Cell 22:606–622
Chae E, Tan QK, Hill TA, Irish VF (2008) An Arabidopsis F-box protein acts as a transcriptional co-factor to regulate floral development. Development 135:1235–1245
Chen R, Guo W, Yin Y, Gong ZH (2014) A novel F-box protein CaF-box is involved in responses to plant hormones and abiotic stress in pepper (Capsicum annuum L.). Int J Mol Sci 15:2413–2430
Ciechanover A (1998) The ubiquitin-proteasome pathway: on protein death and cell life. EMBO J 17:7151–7160
Dharmasiri N, Dharmasiri S, Weijers D, Lechner E, Yamada M, Hobbie L, Ehrismann JS, Jurgens G, Estelle M (2005) Plant development is regulated by a family of auxin receptor F box proteins. Dev Cell 9:109–119
Dill A, Thomas SG, Hu J, Steber CM, Sun TP (2004) The Arabidopsis F-box protein SLEEPY1 targets gibberellin signaling repressors for gibberellin-induced degradation. Plant Cell 16:1392–1405
Doerks T, Copley RR, Schultz J, Ponting CP, Bork P (2002) Systematic identification of novel families associated with nuclear functions. Genome Res 12:47–56
Du F, Xu JN, Zhan CY, Yu ZB, Wang XY (2014) An obesity-like gene MdTLP7 from apple (Malus x domestica) enhances abiotic stress tolerance. Biochem Biophys Res Commun 445:394–397
Gagne JM, Downes BP, Shiu SH, Durski AM, Vierstra RD (2002) The F-box subunit of the SCF E3 complex is encoded by a diverse superfamily of genes in Arabidopsis. Proc Natl Acad Sci USA 99:11519–11524
Gasic K, Hernandez A, Korban S (2004) RNA extraction from different apple tissues rich in polyphenols and polysaccharides for cDNA library construction. Plant Mol Biol Rep 22:437–438
Han SE, Seo YS, Heo S, Kim D, Sung SK, Kim WT (2008) Structure and expression of MdFBCP1, encoding an F-box-containing protein 1, during Fuji apple (Malus domestica Borkh.) fruit ripening. Plant Cell Rep 27:1291–1301
Ito N, Phillips SE, Stevens C, Ogel ZB, McPherson MJ, Keen JN, Yadav KD, Knowles PF (1991) Novel thioether bond revealed by a 1.7 Å crystal structure of galactose oxidase. Nature 350:87–90
Jain M, Nijhawan A, Arora R, Agarwal P, Ray S, Sharma P, Kapoor S, Tyagi AK, Khurana JP (2007) F-box proteins in rice. Genome-wide analysis, classification, temporal and spatial gene expression during panicle and seed development, and regulation by light and abiotic stress. Plant Physiol 143:1467–1483
Jia F, Wu B, Li H, Huang J, Zheng C (2013) Genome-wide identification and characterisation of F-box family in maize. Mol Genet Genomics 288:559–577
Jin J, Cardozo T, Lovering RC, Elledge SJ, Pagano M, Harper JW (2004) Systematic analysis and nomenclature of mammalian F-box proteins. Genes Dev 18:2573–2580
Kepinski S, Leyser O (2005) The Arabidopsis F-box protein TIR1 is an auxin receptor. Nature 435:446–451
Kipreos ET, Pagano M (2000) The F-box protein family. Genome Biol 1:Reviews 3002
Kiyosue T, Wada M (2000) LKP1 (LOV kelch protein 1): a factor involved in the regulation of flowering time in Arabidopsis. Plant J 23:807–815
Kleyn PW, Fan W, Kovats SG, Lee JJ, Pulido JC, Wu Y, Berkemeier LR, Misumi DJ, Holmgren L, Charlat O, Woolf EA, Tayber O, Brody T, Shu P, Hawkins F, Kennedy B, Baldini L, Ebeling C, Alperin GD, Deeds J, Lakey ND, Culpepper J, Chen H, Glucksmann-Kuis MA, Carlson GA, Duyk GM, Moore KJ (1996) Identification and characterization of the mouse obesity gene tubby: a member of a novel gene family. Cell 85:281–290
Kobe B, Kajava AV (2001) The leucine-rich repeat as a protein recognition motif. Curr Opin Struct Biol 11:725–732

Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG (2007) Clustal W and Clustal X version 2.0. Bioinformatics 23:2947–2948
Levin JZ, Meyerowitz EM (1995) UFO: an Arabidopsis gene involved in both floral meristem and floral organ development. Plant Cell 7:529–548
Li D, Roberts R (2001) WD-repeat proteins: structure characteristics, biological function, and their involvement in human diseases. Cell Mol Life Sci 58:2085–2097
Li Y, Wu B, Yu Y, Yang G, Wu C, Zheng C (2011) Genome-wide analysis of the RING finger gene family in apple. Mol Genet Genomics 286:81–94
Liu RH, Meng JL (2003) MapDraw: a Microsoft Excel macro for drawing genetic linkage maps based on given genetic linkage data. Yi Chuan 25:317–321
Maldonado-Calderón MT, Sepúlveda-García E, Rocha-Sosa M (2012) Characterization of novel F-box proteins in plants induced by biotic and abiotic stress. Plant Sci 185–186:208–217
Marrocco K, Zhou Y, Bury E, Dieterle M, Funk M, Genschik P, Krenz M, Stolpe T, Kretsch T (2006) Functional analysis of EID1, an F-box protein involved in phytochrome A-dependent light signal transduction. Plant J 45:423–438
Mas P, Kim WY, Somers DE, Kay SA (2003) Targeted degradation of TOC1 by ZTL modulates circadian function in Arabidopsis thaliana. Nature 426:567–570
McGinnis KM, Thomas SG, Soule JD, Strader LC, Zale JM, Sun TP, Steber CM (2003) The Arabidopsis SLEEPY1 gene encodes a putative F-box subunit of an SCF E3 ubiquitin ligase. Plant Cell 15:1120–1130
Okada K, Moriya S, Haji T, Abe K (2013) Isolation and characterization of multiple F-box genes linked to the S9- and S10-RNase in apple (Malus x domestica Borkh.). Plant Reprod 26:101–111
Phillips SM, Dubery IA, van Heerden H (2012) Molecular characterization of an elicitor-responsive gene (GhARM) from cotton (Gossypium hirsutum). Mol Biol Rep 39:8513–8523
Potuschak T, Lechner E, Parmentier Y, Yanagisawa S, Grava S, Koncz C, Genschik P (2003) EIN3-dependent regulation of plant ethylene hormone signaling by two Arabidopsis F-box proteins: EBF1 and EBF2. Cell 115:679–689
Somers DE, Kim WY, Geng R (2004) The F-box protein ZEITLUPE confers dosage-dependent control on the circadian clock, photomorphogenesis, and flowering time. Plant Cell 16:769–782
Stone SL, Callis J (2007) Ubiquitin ligases mediate growth and development by promoting protein death. Curr Opin Plant Biol 10:624–632
Tacken EJ, Ireland HS, Wang YY, Putterill J, Schaffer RJ (2012) Apple EIN3 BINDING F-box 1 inhibits the activity of three apple EIN3-like transcription factors. AoB Plants 2012:pls034
Takase T, Nishiyama Y, Tanihigashi H, Ogura Y, Miyazaki Y, Yamada Y, Kiyosue T (2011) LOV KELCH PROTEIN2 and ZEITLUPE repress Arabidopsis photoperiodic flowering under non-inductive conditions, dependent on FLAVIN-BINDING KELCH REPEAT F-BOX1. Plant J 67:608–621
Thines B, Katsir L, Melotto M, Niu Y, Mandaokar A, Liu G, Nomura K, He SY, Howe GA, Browse J (2007) JAZ repressor proteins are targets of the SCF(COI1) complex during jasmonate signalling. Nature 448:661–665
Velasco R, Zharkikh A, Affourtit J, Dhingra A, Cestaro A, Kalyanaraman A, Fontana P, Bhatnagar SK, Troggio M, Pruss D, Salvi S, Pindo M, Baldi P, Castelletti S, Cavaiuolo M, Coppola G, Costa F, Cova V, Dal Ri A, Goremykin V, Komjanc M, Longhi S, Magnago P, Malacarne G, Malnoy M, Micheletti D, Moretto M, Perazzolli M, Si-Ammour A, Vezzulli S, Zini E, Eldredge G, Fitzgerald LM, Gutin N, Lanchbury J, Macalma T, Mitchell JT, Reid J, Wardell B, Kodira C, Chen Z, Desany B, Niazi F, Palmer M, Koepke T, Jiwan D, Schaeffer S, Krishnan V, Wu C, Chu VT, King ST, Vick J, Tao Q, Mraz A, Stormo A, Stormo K, Bogden R, Ederle D, Stella A, Vecchietti A, Kater MM, Masiero S, Lasserre P, Lespinasse Y, Allan AC, Bus V, Chagne D, Crowhurst RN, Gleave AP, Lavezzo E, Fawcett JA, Proost S, Rouze P, Sterck L, Toppo S, Lazzari B, Hellens RP, Durel CE, Gutin A, Bumgarner RE, Gardiner SE, Skolnick M, Egholm M, Van de Peer Y, Salamini F, Viola R (2010) The genome of the domesticated apple (Malus x domestica Borkh.). Nat Genet 42:833–839
Wardhan V, Jahan K, Gupta S, Chennareddy S, Datta A, Chakraborty S, Chakraborty N (2012) Overexpression of CaTLP1, a putative transcription factor in chickpea (Cicer arietinum L.), promotes stress tolerance. Plant Mol Biol 79:479–493
Xu L, Liu F, Lechner E, Genschik P, Crosby WL, Ma H, Peng W, Huang D, Xie D (2002) The SCF(COI1) ubiquitin-ligase complexes are required for jasmonate response in Arabidopsis. Plant Cell 14:1919–1935
Xu G, Ma H, Nei M, Kong H (2009) Evolution of F-box genes in plants: different modes of sequence divergence and their relationships with functional diversification. Proc Natl Acad Sci USA 106:835–840
Yan J, Zhang C, Gu M, Bai Z, Zhang W, Qi T, Cheng Z, Peng W, Luo H, Nan F, Wang Z, Xie D (2009) The Arabidopsis CORONATINE INSENSITIVE1 protein is a jasmonate receptor. Plant Cell 21:2220–2236
Yan YS, Chen XY, Yang K, Sun ZX, Fu YP, Zhang YM, Fang RX (2011) Overexpression of an F-box protein gene reduces abiotic stress tolerance and promotes root growth in rice. Mol Plant 4:190–197
Zhang S, Chen G, Liu Y, Chen H, Yang G, Yuan X, Jiang Z, Shu H (2013) Apple gene function and gene family database: an integrated bioinformatics database for apple research. Plant Growth Regul 70:199–206
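
The gene counts compared in the Discussion can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not part of the original analysis: the function and variable names are hypothetical, the totals are the ones stated in the text, and it assumes that each of the 359 matched genes corresponds one-to-one between the two gene sets.

```python
def unmatched(total: int, matched: int) -> int:
    """Return how many items in a set of size `total` are not among the `matched` ones."""
    if matched > total:
        raise ValueError("matched count cannot exceed the total")
    return total - matched


# Counts stated in the Discussion (the only values taken from the paper).
THIS_STUDY_FBOX = 517        # F-box genes identified in this study
ZHANG_2013_FBOX = 458        # F-box genes predicted by Zhang et al. (2013)
SHARED_FBOX = 359            # genes matched between the two sets

KELCH_TOTAL = 41             # F-box proteins with a Kelch domain
KELCH_WITH_GALOX = 29        # Kelch proteins that also carry galactose oxidase
GALOX_TOTAL = 36             # F-box proteins with a galactose oxidase domain

# Genes found by only one of the two approaches (assumes one-to-one matching).
only_this_study = unmatched(THIS_STUDY_FBOX, SHARED_FBOX)
only_zhang_2013 = unmatched(ZHANG_2013_FBOX, SHARED_FBOX)

# Split of the Kelch and galactose oxidase groups.
kelch_without_galox = unmatched(KELCH_TOTAL, KELCH_WITH_GALOX)
galox_without_kelch = unmatched(GALOX_TOTAL, KELCH_WITH_GALOX)

print(f"Only in this study: {only_this_study}")
print(f"Only in Zhang et al. (2013): {only_zhang_2013}")
print(f"Kelch without galactose oxidase: {kelch_without_galox}")
print(f"Galactose oxidase without Kelch: {galox_without_kelch}")
```

Under that matching assumption, 158 genes are specific to this study and 99 to the database prediction, which is consistent with the statement that more than a hundred genes differed; likewise, 7 of the 36 galactose oxidase proteins would lack a Kelch domain.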
