trends in plant science, Reviews
May 2000, Vol. 5, No. 5. 1360-1385/00/$ – see front matter © 2000 Elsevier Science Ltd. All rights reserved. PII: S1360-1385(00)01600-9

The WRKY superfamily of plant transcription factors

Thomas Eulgem, Paul J. Rushton, Silke Robatzek and Imre E. Somssich

The WRKY proteins are a superfamily of transcription factors with up to 100 representatives in Arabidopsis. Family members appear to be involved in the regulation of various physiological programs that are unique to plants, including pathogen defense, senescence and trichome development. In spite of the strong conservation of their DNA-binding domain, the overall structures of WRKY proteins are highly divergent and can be categorized into distinct groups, which might reflect their different functions.

One of the apparent fundamental principles of biological evolution is that the progression from ancient to advanced life forms is inseparably connected to an increase in regulatory capacity. Genome-sequencing efforts have provided evidence for a positive correlation between the proportion of genes involved in information processing and the complexity of organisms. More than 20% of the genes within the sequence available for the Arabidopsis thaliana genome appear to encode proteins that play a role in signal transduction or transcription [1], whereas only 12% of the genome of the single-celled yeast Saccharomyces cerevisiae contains genes of this type [2].

This increase in biological complexity coincides with the appearance or expansion of specific groups of regulator genes. One example is the nuclear-receptor-gene family, which is completely absent in yeast but highly represented in metazoan organisms [3]. The evolution of nuclear receptors is believed to be a key event in the development of intercellular communication, a prerequisite for the multicellularity of metazoans [4]. Similarly, the establishment of a complex animal body plan was driven by the amplification and divergence of ancestral homeobox genes, thereby generating a sophisticated regulatory system of functionally interconnected transcriptional regulators [5].

To meet their disparate biological requirements, plants and animals have evolved unique regulatory mechanisms. This was partly achieved by combining functional domains from pre-existing factors to build new regulators, as exemplified by the MADS-box factors, which play a central role in determining floral and organ identity in plants [6]. In addition, completely new factors have arisen and we focus here on the potential biological roles of WRKY (pronounced ‘worky’) proteins, a large family of transcriptional regulators that has to date only been found in plants. The abundance of information provided by the Arabidopsis sequencing projects is an ideal basis for comparative analysis of this superfamily within one plant species. Although their precise regulatory functions are largely unknown, the fact that these factors appear to be specific to plants, with probably up to 100 members in Arabidopsis, suggests that they play an important role during plant evolution.

Biochemical properties of WRKY proteins

The first WRKY cDNAs were cloned from sweet potato (Ipomoea batatas; SPF1), wild oat (Avena fatua; ABF1,2), parsley (Petroselinum crispum; PcWRKY1,2,3) and Arabidopsis (ZAP1), based on the ability of the encoded proteins to bind specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box [7–10]. It has been suggested that the cognate binding site for SPF1 is different from other WRKY proteins. However, the oligonucleotide used to isolate SPF1 does have a W box in the flanking sequence [7].

The name of the WRKY family is derived from the most prominent feature of these proteins, the WRKY domain, a 60 amino acid region that is highly conserved amongst family members. The emerging picture is that these proteins are regulatory transcription factors with a binding preference for the W box, but with the potential to differentially regulate the expression of a variety of target genes. Consistent with a role as transcription factors, PcWRKY1 and WIZZ (from tobacco) have been shown to be targeted to the nucleus [11,12].

The WRKY domain and the W box

The WRKY domain is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif [8] (Fig. 1). Because of the clear binding preference of all characterized WRKY proteins for the same DNA motif, it has been assumed that the WRKY domain, as their only conserved structural feature, constitutes a DNA-binding domain. Indeed, it has recently been shown that an isolated WRKY domain has sequence-specific DNA-binding activity [12]. The divalent metal chelators 1,10-o-phenanthroline and EDTA abolish in vitro DNA binding, which is taken as strong support for a zinc-finger structure within the WRKY domain [8,10,11]. However, it has not yet been proven that zinc is actually complexed in the WRKY domain. In addition, nothing is known about the function of the WRKYGQK heptapeptide stretch, the hallmark of this superfamily.

All known WRKY proteins contain either one or two WRKY domains. They can be classified on the basis of both the number of WRKY domains and the features of their zinc-finger-like motif. WRKY proteins with two WRKY domains belong to group I, whereas most proteins with one WRKY domain belong to group II (Fig. 2). Generally, the WRKY domains of group I and group II members have the same type of finger motif, whose pattern of potential zinc ligands (C–X₄₋₅–C–X₂₂₋₂₃–H–X₁–H; Fig. 1) is unique among all described zinc-finger-like motifs [13]. The single finger motif of a small subset of WRKY proteins is distinct from that of group I and II members. Instead of a C₂–H₂ pattern, their WRKY domains contain a C₂–HC motif (C–X₇–C–X₂₃–H–X₁–C; Fig. 1). Owing to this distinction, they were recently assigned to the newly defined group III. Nevertheless, experimental evidence has shown that members of all three groups bind sequence specifically to various W box elements (R.S. Cormack et al., unpublished).

The two WRKY domains of group I members appear to be functionally distinct. As has been shown for SPF1, ZAP1 and PcWRKY1, sequence-specific binding to their target DNA sequences is mediated mainly by the C-terminal WRKY domain [7,10,12]. The function of the N-terminal WRKY domain remains unclear. Because protein regions outside of the C-terminal WRKY domain contribute to the overall strength of DNA

Group I
WRKY1   TLFDIVNDGYRWRKYGQKSVKGSPYPRSYYRCSSPG...CPVKKHVERSSHDTKLLITTYEGKHDHDMP
WRKY2   SDVDILDDGYRWRKYGQKVVKGNPNPRSYYKCTAPG...CTVRKHVERASHDLKSVITTYEGKHNHDVP
WRKY3   SEVDLLDDGYRWRKYGQKVVKGNPYPRSYYKCTTPD...CGVRKHVERAATDPKAVVTTYEGKHNHDVP
WRKY4   SEVDLLDDGYRWRKYGQKVVKGNPYPRSYYKCTTPG...CGVRKHVERAATDPKAVVTTYEGKHNHDLP
WRKY20  SEVDILDDGYRWRKYGQKVVRGNPNPRSYYKCTAHG...CPVRKHVERASHDPKAVITTYEGKHDHDVP
WRKY25  SDIDVLIDGFRWRKYGQKVVKGNTNPRSYYKCTFQG...CGVKKQVERSAADERAVLTTYEGRHNHDIP
WRKY26  SDIDILDDGYRWRKYGQKVVKGNPNPRSYYKCTFTG...CFVRKHVERAFQDPKSVITTYEGKHKHQIP
WRKY32  GDVGICGDGYRWRKYGQKMVKGNPHPRNYYRCTSAG...CPVRKHIETAVENTKAVIITYKGVHNHDMP
WRKY33  SDIDILDDGYRWRKYGQKVVKGNPNPRSYYKCTTIG...CPVRKHVERASHDMRAVITTYEGKHNHDVP
WRKY34  SDIDILDDGYRWRKYGQKVVKGNPNPRSYYKCTANG...CTVTKHVERASDDFKSVLTTYIGKHTHVVP
WRKY44  VESDSLEDGFRWRKYGQKVVGGNAYPRSYYRCTSAN...CRARKHVERASDDPRAFITTYEGKHNHHLL
WRKY45  SQVDILDDGYRWRKYGQKAVKNNPFPRSYYKCTEEG...CRVKKQVQRQWGDEGVVVTTYQGVHTHAVD
WRKY58  SEVDLLDDGYRWRKYGQKVVKGNPHPRSYYKCTTPN...CTVRKHVERASTDAKAVITTYEGKHNHDVP
WRKY10  SDEDNPNDGYRWRKYGQKVVKGNPNPRSYFKCTNIE...CRVKKHVERGADNIKLVVTTYDGIHNHPSP

Group II
(a)
WRKY18  DTSLTVKDGFQWRKYGQKVTRDNPSPRAYFRCSFAPS..CPVKKKVQRSAEDPSLLVATYEGTHNHLGP
WRKY40        KDGYQWRKYGQKVTRDNPSPRAYFKCACAPS..CSVKKKVQRSVEDQSVLVATYEGEHNHPMP
WRKY60  VSSLTVKDGYQWRKYGQKITRDNPSPRAYFRCSFSPS..CLVKKKVQRSAEDPSFLVATYEGTHNHTGP
(b)
WRKY6   SEAPMISDGCQWRKYGQKMAKGNPCPRAYYRCTMATG..CPVRKQVQRCAEDRSILITTYEGNHNHPLP
WRKY9   CETATMNDGCQWRKYGQKTAKGNPCPRAYYRCTVAPG..CPVRKQVQRCLEDMSILITTYEGTHNHPLP
WRKY31  SEAAMISDGCQWRKYGQKMAKGNPCPRAYYRCTMAGG..CPVRKQVQRCAEDRSILITTYEGNHNHPLP
WRKY36  CEDPSINDGCQWRKYGQKTAKTNPLPRAYYRCSMSSN..CPVRKQVQRCGEETSAFMTTYEGNHDHPLP
WRKY42  SEAPMLSDGCQWRKYGQKMAKGNPCPRAYYRCTMAVG..CPVRKQVQRCAEDRTILITTYEGNHNHPLP
WRKY47  HKQHEVNDGCQWRKYGQKMAKGNPCPRAYYRCTMAVG..CPVRKQVQRCAEDTTILTTTYEGNHNHPLP
WRKY61        NDGCQWRKYGQKIAKGNPCPRAYYRCTIAAS..CPVRKQVQRCSEDMSILISTYEGTHNHPLP
(c)
WRKY8   TEVDHLEDGYRWRKYGQKAVKNSPYPRSYYRCTTQK...CNVKKRVERSYQDPTVVITTYESQHNHPIP
WRKY12  SDVDVLDDGYKWRKYGQKVVKNSLHPRSYYRCTHNN...CRVKKRVERLSEDCRMVITTYEGRHNHIPS
WRKY13  SEVDVLDDGYRWRKYGXKVVKNTQHPRSYYRCTQDK...CRVKKRVERLADDPRMVITTYEGRHLHSPS
WRKY23  SEVDHLEDGYRWRKYGQKAVKNSPFPRSYYRCTTAS...CNVKKRVERSFRDPSTVVTTYEGQHTHISP
WRKY24  SDDDVLDDGYRWRKYGQKSVKHNAHPRSYYRCTYHT...CNVKKQVQRLAKDPNVVVTTYEGVHNHPCE
WRKY28  SEVDHLEDGYRWRKYGQKAVKNSPYPRSYYRCTTQK...CNVKKRVERSFQDPTVVITTYEGQHNHPIP
WRKY43  SDADILDDGYRWRKYGQKSVKNSLYPRSYYRCTQHM...CNVKKQVQRLSKETSIVETTYEGIHNHPCE
WRKY48  KSIDNLDDGYRWRKYGQKAVKNSPYPRSYYRCTTVG...CGVKKRVERSSDDPSIVMTTYEGQHTHPFP
WRKY49  NSNGMCDDGYKWRKYGQKSIKNSPNPRSYYKCTNPI...CNAKKQVERSIDESNTYIITYEGFHFHYTY
WRKY50  SEVEVLDDGFKWRKYGKKMVKNSPHPRNYYKCSVDG...CPVKKRVERDRDDPSFVITTYEGSHNHSSM
WRKY51     DVMDDGFKWRKYGKKSVKNNINKRNYYKCSSEG...CSVKKRVERDGDDAAYVITTYEGVHNHESL
WRKY56  SDDDVLDDGYRWRKYGQKSVKNNAHPRSYYRCTYHT...CNVKKQVQRLAKDPNVVVTTYEGVHNHPCE
WRKY57  SDVDNLEDGYRWRKYGQKAVKNSPFPRSYYRCTNSR...CTVKKRVERSSDDPSIVITTYEGQHCHQTI
WRKY59  DEKVALDDGYKWRKYGKKPITGSPFPRHYHKCSSPD...CNVKKKIERDTNNPDYILTTYEGRHNHPSP
(d)
WRKY7   KMADIPSDEFSWRKYGQKPIKGSPHPRGYYKCSSVRG..CPARKHVERALDDAMMLIVTYEGDHNHALV
WRKY11  KIADIPPDEYSWRKYGQKPIKGSPHPRGYYKCSTFRG..CPARKHVERALDDPAMLIVTYEGEHRHNQS