TAL Effectors Are Remote Controls for Gene Activation Heidi Scholze and Jens Boch
Total Page:16
File Type:pdf, Size:1020Kb
Available online at www.sciencedirect.com TAL effectors are remote controls for gene activation Heidi Scholze and Jens Boch TAL (transcription activator-like) effectors constitute a novel typically 34 amino acids (aa) long, but also repeat types of class of DNA-binding proteins with predictable specificity. They 30–42 aa can be found ([2], Figure 2). The last repeat is are employed by Gram-negative plant-pathogenic bacteria of only a half repeat. Repeat-to-repeat variations are limited the genus Xanthomonas which translocate a cocktail of to a few aa positions including two hypervariable residues different effector proteins via a type III secretion system (T3SS) at positions 12 and 13 per repeat. Typically, TALs differ into plant cells where they serve as virulence determinants. in their number of repeats while most contain 15.5–19.5 Inside the plant cell, TALs localize to the nucleus, bind to target repeats [2]. The repeat domain determines the specificity promoters, and induce expression of plant genes. DNA-binding of TALs which is mediated by selective DNA binding specificity of TALs is determined by a central domain of tandem [7,8]. TAL repeats constitute a novel type of DNA- repeats. Each repeat confers recognition of one base pair (bp) binding domain [7] distinct from classical zinc finger, in the DNA. Rearrangement of repeat modules allows design of helix–turn–helix, and leucine zipper motifs. proteins with desired DNA-binding specificities. Here, we summarize how TAL specificity is encoded, first structural data The recent understanding of the DNA-recognition speci- and first data on site-specific TAL nucleases. ficity of TALs [9,10] has brought TAL studies to a new level, in regard to both functional studies and applied Address science. In this review we summarize the DNA-recog- Department of Genetics, Institute of Biology, Martin-Luther-University nition code of TALs, novel structural insights, and pro- Halle-Wittenberg, 06099 Halle (Saale), Germany gress in biotechnological applications. As TAL virulence Corresponding author: Boch, Jens ([email protected]) targets in plants and plant resistance against TALs were recently reviewed [2,3,5,6] these aspects will not be discussed here. Current Opinion in Microbiology 2011, 14:47–53 This review comes from a themed issue on TALs function as transcription factors Host-microbe interactions: bacteria TAL frequency Edited by Brett Finlay and Ulla Bonas Proteins with similarity to TALs have only been ident- ified in the plant-pathogenic bacteria Xanthomonas spp. Available online 5th January 2011 and Ralstonia solanacearum. The presence, number, and 1369-5274/$ – see front matter importance of TALs vary in different strains of Xantho- # 2010 Elsevier Ltd. All rights reserved. monas. Whereas some Xanthomonas pathovars carry none or only few TAL genes, others contain up to 28 TAL genes DOI 10.1016/j.mib.2010.12.001 (including pseudogenes) per strain (Table 1). A few TALs have been shown to be essential for virulence, but the contribution of others is minor or not detectable. Introduction It is unclear whether TALs with no effect are first, Plant-pathogenic Xanthomonas bacteria translocate a nonfunctional, but possibly serve as recombinatory reser- cocktail of 30–40 different effector proteins via a type voir, second, only functional in specific host plants, or III secretion system (T3SS) into plant cells to manipulate third, fulfill a task in virulence that has not been detected eukaryotic cellular pathways [1]. Members of the TAL so far. (transcription activator-like) effector family include important virulence factors which function as transcrip- Many R. solanacearum strains carry homologs of Xantho- tion factors for target plant genes [2–6]. This is believed monas TAL genes (Table 1;[11,12]). However, the N- to support bacterial proliferation, but the role of most terminal and C-terminal regions of R. solanacearum TALs TALs in virulence is still unknown. show low similarity to Xanthomonas TALs, and the repeats differ mainly in their C-terminal part (Figure 2). Because The N-terminal and C-terminal regions are highly con- of their similar overall repeat-architecture, it is possible served in TALs. The N-terminal region contains the that R. solanacearum TALs bind to nucleic acids, too. secretion and translocation signal for the T3SS, whereas the C-terminal region contains nuclear localization signals DNA-recognition specificity of TALs and a transcriptional activation domain, both of which are TAL-specific effects are due to specific sequences loca- important for protein activity ([2], Figure 1). The most lized in promoter regions of induced plant genes prominent feature of TALs is their central domain that [7,8,13,14,15]. Characterization of different AvrBs3- consists of tandem nearly identical repeats. Each repeat is induced genes in pepper revealed the presence of a www.sciencedirect.com Current Opinion in Microbiology 2011, 14:47–53 48 Host-microbe interactions: bacteria Figure 1 (a) 0 Repeat domain NLS AD NC T 1 12/13 34 LTPEQVVAIASHDGGKQALETVQRLLPVLCQAHG (b) Repeat type NI HD NG IG NK NN NS DNA specificity A C TTG G A A C G T Current Opinion in Microbiology TAL effector-DNA-binding code. (a) TALs contain nuclear localization signals (NLS, yellow) and an activation domain (AD, green) to function as transcriptional activators. A central tandem repeat domain (red) confers DNA-binding specificity. Each 34 amino acid (aa)-repeat binds to one base pair (bp) with specificity being determined by aa 12 and 13. One sample repeat is shown. Numbers refer to aa positions within the repeat. Repeat zero is indicated which specifies for T. (b) Repeat types have specificity for one or several DNA bp. Only bases of the DNA leading strand are shown. common promoter element, termed UPA (upregulated by rearranged to generate artificial TALs with novel and AvrBs3) box [7,8,16], which was essential for AvrBs3 predictable DNA-binding specificities [9,22]. recognition and whose function was sensitive to mutations [7,8,17,18]. TAL repeat structure PSIPRED predictions revealed a strong similarity of TAL The length of the UPA box roughly matches the number repeats with tetratrico peptide repeat (TPR) proteins and of repeats in AvrBs3, leading to a model to explain pentatrico peptide repeat (PPR) proteins [23,24]. TPRs AvrBs3 DNA-recognition specificity [9,10]. Accord- are 34 aa long, contain two alpha helices, and mediate ingly, one TAL repeat binds to one DNA base pair (bp), protein–protein interactions in prokaryotes and eukaryotes and the specificity of individual repeats is encoded in (e.g. 79 TPR proteins in Arabidopsis thaliana, 260 TPR the hypervariable residues at positions 12 and 13 in each proteins in humans) [25]. TPR proteins contain 3–16 repeat (Figure 2;[9,10]; also termed RVD, repeat- degenerated tandem repeats and are involved in diverse variable diresidue). Some repeat types are specific for a cellular processes [25]. In contrast, PPR proteins carry 2–26 particular DNA bp whereas others recognize more than either degenerated tandem 35 aa-repeats or tandem alter- one bp ([9,10]; Figure 1). Repeat positions other than nating 31 aa-repeats, 36 aa-repeats, and 35 aa-repeats the hypervariable residues do not exhibit a significant [26,27]. PPR proteins mediate RNA binding and are effect on repeat specificity [19]. The target DNA especially abundant in higher plants (e.g. >450 PPR elements of TALs have been termed differently, that proteins in A. thaliana, six PPR proteins in humans) [26]. is, UPA box [7], UPT box (upregulated by TAL, [20]) As no PPR-3D structures are known TAL repeats were or simply TAL box [9]. We have suggested to name modeled onto TPR protein structures [23,24]. In such this element TAL box (e.g. AvrBs3 box, Hax3 box; models, the specificity-determining aa residues 12 and 13 [9]), because the DNA sequence controls recognition of each TAL repeat are located next to another, oriented to and binding of the TAL, but not necessarily upregula- the inner face of the superhelix [23,24] which fits well to a tion of gene expression. TAL boxes all start with a ‘T’ possible TAL–DNA interaction (Figure 2). Typically, indicating that the N-terminal region preceding the first protein repeat units (e.g. TPR and PPR) are highly degen- repeat (termed repeat ‘zero’) contributes to DNA bind- erate and the consensus structure only forms a scaffold to ing ([9,21]; Figure 2). Furthermore, it was shown on present functional residues for interactions [25,28]. In the basis of artificial TALs containing 1.5–16.5 repeats contrast, TAL repeats show an extraordinarily high degree that a minimum of 6.5 repeats is necessary to induce of repeat-to-repeat identity. This probably generates a transcription and 10.5 repeats are sufficient for full highly symmetric framework to position the two speci- induction [9]. ficity-determining hypervariable residues to its highly symmetrical binding partner, the DNA double helix. The simple and modular way of how TALs bind to DNA has far-reaching consequences. DNA-binding specifici- Nuclear magnetic resonance (NMR) structural data of a ties of TALs can be predicted and repeat modules can be 1.5 TAL repeat polypeptide showed that the TAL repeat Current Opinion in Microbiology 2011, 14:47–53 www.sciencedirect.com TAL effector-DNA binding Scholze and Boch 49 folds into a helix–turn–helix–turn structure reminiscent as well as endogenes in plants [9,19,22]. The position of of TPRs (Figure 2;[23]). The hypervariable residues a TAL box defines the TAL-dependent transcriptional were located at the end of the first helix. The two alpha start site at approximately 40–60 bp downstream of the helices of one repeat are arranged in antiparallel orien- box [7,17,18,20].