The F-Box Subunit of the SCF E3 Complex Is Encoded by a Diverse Superfamily of Genes in Arabidopsis
Total Page:16
File Type:pdf, Size:1020Kb
The F-box subunit of the SCF E3 complex is encoded by a diverse superfamily of genes in Arabidopsis Jennifer M. Gagne*, Brian P. Downes*, Shin-Han Shiu*†, Adam M. Durski*, and Richard D. Vierstra*‡ *Cellular and Molecular Biology Program and the Department of Horticulture, 1575 Linden Drive, University of Wisconsin, Madison, WI 53706; and †MIPS͞Institute for Bioinformatics, GSF-National Research Center for Environment and Health, 85764 Neuherberg, Germany Communicated by Eldon H. Newcomb, University of Wisconsin, Madison, WI, June 6, 2002 (received for review April 23, 2002) The covalent attachment of ubiquitin is an important determinant chain of Ubs is attached, typically using lysine-48 within the Ub for selective protein degradation by the 26S proteasome in plants moieties as the acceptor site for polymerization. and animals. The specificity of ubiquitination is often controlled by Currently, five types of E3s have been identified that differ ubiquitin-protein ligases (or E3s), which facilitate the transfer of according to their subunit organization and͞or mechanism of Ub ubiquitin to appropriate targets. One ligase type, the SCF E3s are transfer (1, 3, 4). One important E3 type is the SCF complex, composed of four proteins, cullin1͞Cdc53, Rbx1͞Roc1͞Hrt1, Skp1, which in yeast (Saccharomyces cerevisiae) is composed of four and an F-box protein. The F-box protein, which identifies the primary subunits—cullin1͞Cdc53, Rbx1͞Roc1͞Hrt1, Skp1, and targets, binds to the Skp1 component of the complex through a an F-box protein (3, 4). The cullin1, Rbx1, and Skp1 subunits degenerate N-terminal Ϸ60-aa motif called the F-box. Using pub- appear to form the core ligase activity, with Rbx1 recruiting the lished F-boxes as queries, we have identified 694 potential F-box E2 bearing an activated Ub. F-box proteins perform the crucial genes in Arabidopsis, making this gene superfamily one of the role of delivering appropriate targets to the complex. These largest currently known in plants. Most of the encoded proteins proteins all contain a degenerate Ϸ60-aa N-terminal motif called contain interaction domains C-terminal to the F-box that presum- the F-box, which is used to interact with the rest of the SCF ably participate in substrate recognition. The F-box proteins can be complex by binding to the Skp subunit. The C-terminal portions classified via a phylogenetic approach into five major families, of F-box proteins typically contain a variable protein-interaction which can be further organized into multiple subfamilies. Se- domain that binds the target and thus confers specificity to the quence diversity within the subfamilies suggests that many F-box SCF complex. In yeast and animals, a number of F-box proteins proteins have distinct functions and͞or substrates. Representa- are present, easily classified by the nature of this interaction tives of all of the major families interact in yeast two-hybrid domain (3, 4). In yeast, for example, the F-box protein Grr1p experiments with members of the Arabidopsis Skp family support- contains C-terminal leucine-rich repeats (LRRs) that recruit the ing their classification as F-box proteins. For some, a limited phosphorylated forms of the cyclins Cln1p and Cln2p to the preference for Skps was observed, suggesting that a hierarchical SCFGrr1 complex, whereas the Cdc4 F-box protein contains organization of SCF complexes exists defined by distinct Skp͞F-box WD-40 repeats that recruit the phosphorylated form of the protein pairs. Collectively, the data shows that Arabidopsis has cyclin-dependent kinase inhibitor Sic1 to the SCFcdc4 complex. exploited the SCF complex and the ubiquitin͞26S proteasome The existence of multiple F-box proteins, each with unique pathway as a major route for cellular regulation and that a diverse specificity or specificities would allow SCFs to ubiquitinate a array of SCF targets is likely present in plants. diverse array of substrates. Recent studies in Arabidopsis thaliana indicate that F-box rotein degradation is an important posttranscriptional reg- proteins and SCF complexes play critical roles in various aspects ulatory process that allows cells to respond rapidly to intra- of plant growth and development (2). ASK1, one of the 19 Skp P proteins in Arabidopsis (5), is involved in male gametogenesis cellular signals and changing environmental conditions by ad- and floral organ identity (6, 7). Arabidopsis F-box (FBX) genes justing the levels of key proteins. One major proteolytic route in have been implicated in auxin (TIR1) and jasmonate signaling eukaryotes involves the ubiquitin (Ub)͞26S proteasome pathway (COI1) (8, 9), floral homeosis (UFO) (10), flowering time and (1, 2). Here, proteins destined for degradation become modified circadian clocks (ZTL͞FKF͞LKP2) (11–13), leaf senescence and by the covalent attachment of multiple Ubs. The ubiquitinated lateral shoot branching (ORE1͞MAX2) (14, 15), and photomor- substrates are then recognized by the 26S proteasome and phogenesis (EID1) (16). Cursory searches of the Arabidopsis degraded while the Ub moieties are recycled. In both yeast and genome detected many other FBX loci, suggesting that they animals, the Ub͞26S proteasome pathway is responsible for comprise a large gene family (17, 18). For example, Andrade et removing most abnormal polypeptides and many short-lived cell al. (19) identified 48 genes encoding proteins with Kelch repeats regulators, which in turn control numerous processes, including following a degenerate F-box motif. To further determine the the cell cycle, signal transduction, transcription, stress responses, complexity of the F-box protein family in plants, we conducted and defense (1). Recent studies infer similar roles for the an exhaustive search for FBX-related loci in the Arabidopsis pathway in plants. For example, mutations in specific compo- ͞ genome. This search identified 694 potential FBX genes, making nents of the Arabidopsis Ub 26S proteasome pathway block this superfamily one of the largest found so far in plants. The size embryogenesis, hormonal responses, floral homeosis, photo- and diversity of this collection suggest that SCF E3s impact most morphogenesis, circadian rhythms, senescence, and pathogen aspects of plant cell biology presumably by recognizing and invasion (2). Although many of the components of this pathway ubiquitinating a wide array of protein targets. have been described, the mechanisms for target recognition remain poorly defined. Materials and Methods Ub conjugation is achieved through an ATP-dependent reac- Prediction of F-Box Proteins. All F-box proteins published as of tion cascade involving the sequential action of three enzymes, 01͞17͞01 were used as queries in multiple BLAST searches (20) E1, E2s, and E3s (1). As the final enzyme in the cascade, the E3s or Ub-protein ligases are responsible for recognizing the sub- strate and facilitating Ub transfer, which results in the formation Abbreviations: AGI, Arabidopsis Genome Initiative; LRR, leucine-rich repeat; Ub, ubiquitin; of an isopeptide bond between a lysl -amino group of the target Y2H, yeast two-hybrid. PLANT BIOLOGY and the C-terminal glycine carboxyl group of the Ub. Often a ‡To whom reprint requests should be addressed. E-mail: [email protected]. www.pnas.org͞cgi͞doi͞10.1073͞pnas.162339999 PNAS ͉ August 20, 2002 ͉ vol. 99 ͉ no. 17 ͉ 11519–11524 Downloaded by guest on September 24, 2021 against the Arabidopsis genome (annotated 01͞17͞2001). Se- quences predicted to encode F-boxes by the SMART and PFAM databases (http:͞͞smart.embl-heidelberg.de) were then used in BLAST searches against the complete, nonredundant, re- annotated Arabidopsis genome (released 02͞17͞01). One set of BLAST searches used only the F-box motifs and had an E value cutoff of 10, whereas the other set used the full-length proteins with a cutoff of 1e-10. Of the 1,849 nonredundant sequences retrieved, 650 were annotated by SMART͞PFAM as containing an F-box motif. Fifteen of the 650 loci were re-annotated as two F-box genes and another locus as three. Hand analysis led to the identification of 27 additional loci. Four of those loci are missing in the current annotation and have been named using the Arabidopsis Genome Initiative (AGI) numbers of the genes that bracket the locus. The additional 44 genes and the revised annotations are described in Table 2, which is published as supporting information on the PNAS web site, www.pnas.org. Four genes (At1g51290, At5g02700, At3g28410, and At3g49030) appear to contain two F-boxes; in these cases, the domain with a better match to the consensus F-box motif was used. Alignments, Phylogenetic Analysis, and EST Identification. Intron͞ exon organizations were based on the Arabidopsis genome annotation, our re-annotation data, and analysis of ESTs. F-box sequences were retrieved and aligned using CLUSTALX PC V.1.81 (21). An unrooted phylogenetic tree of the alignment was created by MEGA V.2.1 (22), using the p-distance method with gaps treated by pairwise deletion and a 1,000 bootstrap replicate. Alignments and trees of selected groups were generated by CLUSTALX MAC V.1.6b (21). Percent identities and similarities were calculated using MACBOXSHADE V.2.11 (Institute of Animal Fig. 1. Phylogenetic tree of the F-box protein superfamily from Arabidopsis. Health, Pirbright, U.K.). Additional domains were predicted The 60-aa F-box motifs from the 694 potential F-box proteins were aligned by using SMART and PFAM databases, BLAST searches, and sequence CLUSTALX. The alignment then was used to generate an unrooted phylogenetic alignments. Chromosomal maps were generated using the tree with MEGA 2.1, using the p-distance method and a bootstrap value of 1,000. Genome Pixelizer Tcl͞Tk script [www.atgc.org͞GenomePixel- Individual members of the tree are color-coded by (A) the nature of the ͞ ͞ domain(s) C-terminal to the F-box or (B) the number of introns within the izer (released 02 15 2002)]. The predicted mRNA sequences respective genes.