
Structure, Vol. 8, 1157–1166, November, 2000, ©2000 Elsevier Science Ltd. All rights reserved. PII S0969-2126(00)00527-X

Solution Structures of Two CCHC Zinc Fingers from the FOG Family Protein U-Shaped that Mediate Protein–Protein Interactions

Chu Kong Liew, Kasper Kowalski, Archa H. Fox, Anthea Newton, Belinda K. Sharpe, Merlin Crossley, and Joel P. Mackay*
Department of Biochemistry
University of Sydney
Sydney NSW 2006
Australia

* To whom correspondence should be addressed (e-mail: j.mackay@biochem.usyd.edu.au).
Key words: Zinc fingers; structure; transcription; U-shaped; GATA-1

Summary

Background: Zinc finger domains have traditionally been regarded as sequence-specific DNA binding motifs. However, recent evidence indicates that many zinc fingers mediate specific protein–protein interactions. For instance, several zinc fingers from FOG family proteins have been shown to interact with the N-terminal zinc finger of GATA-1.

Results: We have used NMR spectroscopy to determine the first structures of two FOG family zinc fingers that are involved in protein–protein interactions: fingers 1 and 9 from U-shaped. These fingers resemble classical TFIIIA-like zinc fingers, with the exception of an unusual extended portion of the polypeptide backbone prior to the fourth zinc ligand. [15N,1H]-HSQC titrations have been used to define the GATA binding surface of USH-F1, and comparison with other FOG family proteins indicates that the recognition mechanism is conserved across species. The surface of FOG-type fingers that interacts with GATA-1 overlaps substantially with the surface through which classical fingers typically recognize DNA. This suggests that these fingers could not contact both GATA and DNA simultaneously. In addition, results from NMR, gel filtration, and sedimentation equilibrium experiments suggest that the interactions are of moderate affinity.

Conclusions: Our results demonstrate unequivocally that zinc fingers comprising the classical ββα fold are capable of mediating specific contacts between proteins. The existence of this alternative function has implications for the prediction of protein function from sequence data and for the evolution of protein function.

Introduction

Zinc fingers (ZnFs) are among the most common of all protein domains. In the recently sequenced Caenorhabditis elegans genome, more than 1000 zinc binding domains were identified [1]. These domains can be divided into a number of classes based on the pattern of zinc-chelating residues and on the topology of the three-dimensional fold that they form [2]. The classical (TFIIIA-like or CCHH) and GATA-type ZnFs are two of the best-characterized classes; these domains exhibit conserved C-X2–5-C-X12-H-X2–5-H and C-X2-C-X17-C-X2-C sequences, respectively. Many proteins contain more than one ZnF; for example, the Xenopus protein Xfin has 37 ZnFs, which comprise more than 90% of the protein sequence [3]. In a number of cases, both classical and GATA-type ZnF domains have been demonstrated to act as sequence-specific DNA binding modules. Thus, a paradigm has been established whereby a DNA binding function is often ascribed to newly discovered ZnF proteins, even in the absence of direct experimental evidence. However, it has recently become apparent that both GATA-type and classical ZnF domains can also serve as protein–protein recognition motifs [4, 5]. This realization has substantial implications for the analysis of novel ZnF-containing proteins as well as for the mechanisms through which ZnF transcription factors modulate gene expression.

The erythroid transcription factor GATA-1 [6] contains two adjacent ZnF domains, which have both been reported to play roles in DNA binding and protein recognition. The more C-terminal of the two fingers (C-finger or CF) is sufficient for high-affinity DNA binding to the cognate (A/T)GATA(A/G) site, and has also been shown to interact with the ZnF domain of Sp1 family transcription factors [7]. The N-terminal ZnF (NF) is also involved in stabilizing GATA-1:DNA interactions [8] and may contact DNA at certain double-GATA sites [9, 10]. Notably, however, the NF has been shown to interact with ZnFs from transcriptional cofactors from the FOG [11–13] and Sp1 families [7, 14]. FOG-1 has nine ZnF domains, and it has been shown that the NF of GATA-1 can contact four of these, namely F1, F5, F6, and F9 [15]. Strikingly, each of these four ZnFs has an unusual CCHC arrangement of zinc-chelating residues (Figure 1). An analogous situation exists in Drosophila, where the N-finger of dGATA-a/Pannier [16, 17] interacts with several of the CCHC fingers in the FOG-family protein U-shaped (USH). These CCHC domains all carry the conserved residues characteristic of classical zinc fingers, with the notable exception of the final His→Cys substitution (Figure 1), and the question therefore arises as to whether this substitution has any structural or functional significance.

The interaction between the GATA-1 NF and FOG is essential for GATA-1 activity. Elimination of the interaction through the introduction of specific mutations in NF abolishes the ability of GATA-1 to drive normal erythroid differentiation, and activity can be rescued with compensatory FOG mutations that restore the interaction [11].

However, aside from this work, and despite the appearance of a substantial number of reports implicating either TFIIIA- or GATA-like ZnF domains in protein–protein interactions, there is little direct physical data available detailing such interactions. Here we describe the three-dimensional solution structure of two CCHC ZnF domains from U-shaped (USH-F1 and USH-F9). Both of these ZnFs can bind GATA-1, and these structures constitute the first examples of a functionally distinct subclass of ZnF domains. We show that the structures have considerable similarity to classical TFIIIA-like (CCHH) fingers but share a novel helix termination motif that is likely to be necessary for the interaction of these fingers with GATA-1 NF. A [15N,1H]-HSQC titration strategy has been used to construct a map of the recognition interface of the USH-F1:NF complex, and a comparison of these results with our recent mutagenic analysis of the murine FOG protein [15] reveals that this binding interface is conserved across species.
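The ligand-spacing patterns quoted in the Introduction can be written as regular expressions, which may help when screening sequences for candidate fingers. The following is a minimal sketch, not the authors' method. The CCHC pattern is an assumption obtained by replacing the final His of the classical consensus with Cys, and the test sequences are synthetic placeholders built by a hypothetical helper (`synthetic_finger`), not real protein sequences.

```python
import re

# Spacing patterns quoted in the text; "." stands for any residue X.
PATTERNS = {
    "classical CCHH": re.compile(r"C.{2,5}C.{12}H.{2,5}H"),
    "GATA-type CCCC": re.compile(r"C.{2}C.{17}C.{2}C"),
    # Assumption: classical spacing with the final His replaced by Cys.
    "FOG-type CCHC": re.compile(r"C.{2,5}C.{12}H.{2,5}C"),
}


def classify(seq):
    """Return the names of all patterns found anywhere in seq."""
    return [name for name, rx in PATTERNS.items() if rx.search(seq)]


def synthetic_finger(ligands, spacers, filler="A"):
    """Build a placeholder sequence: four ligands separated by filler runs.

    Hypothetical helper for illustration only; the filler residue must not
    be C or H, otherwise it could act as a spurious ligand.
    """
    parts = [ligands[0]]
    for ligand, gap in zip(ligands[1:], spacers):
        parts.append(filler * gap)
        parts.append(ligand)
    return "".join(parts)


if __name__ == "__main__":
    # Spacing C-X2-C-X12-H-X4-C matches the numbering C11, C14, H27, C32.
    print(classify(synthetic_finger("CCHC", (2, 12, 4))))
    print(classify(synthetic_finger("CCHH", (2, 12, 3))))
    print(classify(synthetic_finger("CCCC", (2, 17, 2))))
```

A pattern match of this kind only indicates a plausible ligand arrangement; as the text argues, it says nothing about whether the finger binds DNA or protein.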

Figure 1. Distributions and Sequences of FOG Family CCHC Zinc Fingers
(a) Schematic diagram of the distribution of ZnF domains in murine FOG-1, human FOG-2, and U-shaped. CCHC fingers are shaded, while CCHH fingers are unshaded. Fingers marked with a star exhibit good binding to the N-finger of GATA-1.
(b) Sequence alignments of CCHC fingers from mFOG-1 [13], hFOG-2, Xenopus laevis FOG [47], and USH. The numbering system used in the text is shown above, and the numbering from the native proteins is also indicated beside the sequences. Residues in italics are non-native and are leftover from the thrombin recognition site. Residues in bold are most likely involved in zinc ligation, and the gray bars indicate conserved residues that appear to be required for GATA-1 binding activity. Fingers that are capable of binding GATA-1 are labeled as interactors.

Results

NMR Structure Determination

Initially, attempts were made to study three of the murine FOG-1 fingers that interact with GATA-1 (FOG-F1, -F6, and -F9). However, one- and two-dimensional 1H NMR spectra of these domains were of low quality [18], and sedimentation equilibrium experiments confirmed that homoaggregation was occurring. Efforts were therefore focused on CCHC domains from the Drosophila FOG family protein, U-shaped. We have previously shown that U-shaped fingers 1 (USH-F1) and 9 (USH-F9) can also bind GATA-1 both in vitro and in cellular assays [15]. The ability of these Drosophila domains to bind the murine GATA-1 NF is not surprising given the very high homology of the latter to the NF of Drosophila GATA-a (approximately 95%).

Recombinant 36 residue peptides corresponding to USH-F1 and USH-F9 (Figure 1) generated high-quality NMR spectra when refolded in the presence of Zn(II). The same numbering (1–36) is used for each domain and can be referenced back to the wild-type protein by using Figure 1. 1H and 15N resonance assignments were achieved by using a standard combination of two-dimensional homonuclear and three-dimensional 15N-separated NMR experiments. Following the iterative, manual introduction of ambiguous NOEs with DYANA [19], a set of 1000 structures was calculated for each domain. No zinc ion was included, and the zinc coordination site was not restrained during these calculations. On the basis of these structures, it was evident that the three cysteine thiol groups (C11, C14, and C32) and the Nε2 atom of H27 comprised the zinc coordination sphere. This is consistent with UV-vis spectrophotometric studies [18] and with the crosspeak pattern observed for the H27 side chain in a [15N,1H]-HSQC of USH-F1 optimized for long-range (2J and 3J) couplings [20]. These structures were used as input for calculations in CNS [21]. Table 1 summarizes the experimental constraints used in this final round of structure calculations. No hydrogen bonding constraints were used in the calculations.

In both cases, no NOE violations greater than 0.3 Å or dihedral angle violations greater than 5° were observed, and the 20 lowest-energy structures were used to represent the solution conformation of each domain (Figure 2a). The structures both display good covalent geometry and good nonbonded contacts (Table 1). Analysis of the structures with PROCHECK-NMR [22] revealed that, for well-ordered residues (φ angle order parameter > 0.6), 97.5% (for USH-F1; 99% for USH-F9) of backbone φ/ψ pairs lie within the most favored or additionally allowed regions of the Ramachandran plot.

The Three-Dimensional Structures of USH-F1 and USH-F9

As shown in Figure 2b, both structures comprise a short β hairpin (residues 10–11 and 16–17) followed by an α helix (residues 21–30). The strands of the hairpin are connected by a type I β turn, and a type IV turn connects the hairpin and the α helix. Hydrogen bonds are observed throughout the helix of each domain as well as across the hairpin. In particular, the backbone carbonyl group of C11 is able to form hydrogen bonds with the amide protons of both residues 15 and 16. Residues 11, 18, 24, 27, and 32 form the hydrophobic core of each finger. An overlay of the mean structures of USH-F1 and USH-F9 that uses backbone atoms of well-ordered residues (Cα, C, N) yields a root-mean-square deviation (rmsd) of 1.2 Å and underlines the substantial structural homology that exists despite the relatively low sequence homology (approximately 32% identity). Obviously, the structures are dictated by a small number of key residues such as the zinc ligands, suggesting that the zinc finger motif is a stable scaffold that may potentially be adapted for different functions.

Figure 2. Solution Structure of the USH-F1 and USH-F9 Zinc Fingers
(a) Stereoviews of the best 20 structures of USH-F1 and USH-F9. Structures are superimposed for best fit over backbone atoms of residues 9–32 (note that residues 1–7 and 36 are not displayed, for clarity). The zinc-chelating side chains are shown in orange, and the zinc atom is shown in gray. The side chains of residues 18, 24, 30, and 31 are shown in blue.
(b) Ribbon diagrams of the lowest energy structures of USH-F1 and USH-F9 display the secondary structure recognized by the program MOLMOL [45].

Comparison with Classical Zinc Fingers

Both structures show considerable similarity to structures of the classical CCHH fingers. For example, USH-F9 overlays with the structure of a zinc finger from the protein ZFY [23] with an rmsd of 1.7 Å (Figure 3). One difference, however, can be seen at the C-terminal end of the α helix. In classical fingers, the helix extends up to or beyond the final zinc-ligating histidine residue. In contrast, the two residues immediately preceding C32 in both USH-F1 and USH-F9 exist in extended conformations (e.g., φ30 = –150° and φ31 = –109° for USH-F9). The backbone conformations of both domains are well-defined at this point (S2(φ) > 0.99 for both 30 and 31 in both fingers), and this result demonstrates that this conformation is not an artifact brought about by a low density of constraints. In part, this conformation is probably required to maintain the tetrahedral coordination of the zinc ion. Interestingly, however, residues 30 and 31 have both been implicated in binding to the N-terminal zinc finger of GATA-1 (reference [12] and see below). It is possible that the unusual backbone conformation positions the two aromatic side chains in such a way as to allow specific contacts with the N-finger. We have previously shown [18] that a mutant of FOG-F1 in which histidine replaces the final cysteine is able to fold but is unable to interact with GATA-1. This result further suggests that the protein conformation in this region is important for GATA binding activity.
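The φ angle order parameter used above to select well-ordered residues can be illustrated with a short calculation. This sketch assumes the commonly used circular definition, in which S is the length of the mean unit vector of a dihedral angle across the ensemble; the text does not spell out the formula, and the ensemble values below are hypothetical, not taken from the deposited structures.

```python
import math


def angular_order_parameter(angles_deg):
    """Circular order parameter S for one dihedral angle over an ensemble.

    S is 1 when every structure has the same angle and approaches 0 when
    the angles are spread evenly around the circle.
    """
    n = len(angles_deg)
    if n == 0:
        raise ValueError("need at least one angle")
    cos_sum = sum(math.cos(math.radians(a)) for a in angles_deg)
    sin_sum = sum(math.sin(math.radians(a)) for a in angles_deg)
    return math.hypot(cos_sum, sin_sum) / n


def is_well_ordered(angles_deg, cutoff=0.6):
    """Apply the cutoff quoted in the text (order parameter > 0.6)."""
    return angular_order_parameter(angles_deg) > cutoff


if __name__ == "__main__":
    tight = [-150.0, -148.0, -152.0, -151.0]  # hypothetical phi values
    loose = [0.0, 90.0, 180.0, 270.0]         # hypothetical, fully disordered
    print(angular_order_parameter(tight), is_well_ordered(tight))
    print(angular_order_parameter(loose), is_well_ordered(loose))
```

Working with unit vectors rather than averaging the angles directly avoids the wraparound problem at ±180°: an ensemble split between –179° and 179° is correctly scored as tightly clustered.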

Table 1. Structural Statistics for USH-F1 and USH-F9

                                               USH-F1               USH-F9
Experimental Restraints
  Meaningful intraresidue distances            75                   89
  Sequential distances                         94                   165
  Medium-range distances (|i − j| < 5)         77                   163
  Long-range distances (|i − j| ≥ 5)           78                   89
  Dihedral angles                              32 (27 φ, 5 χ1)      29 (22 φ, 7 χ1)
Mean Rmsd from Idealized Covalent Geometry (a)
  Bonds (Å)                                    0.00207 ± 0.00013    0.00092 ± 0.00005
  Angles (°)                                   0.398 ± 0.026        0.277 ± 0.009
  Impropers (°)                                0.196 ± 0.019        0.108 ± 0.005
Mean Rmsd from Experimental Restraints
  NOE (Å)                                      0.026 ± 0.032        0.0011 ± 0.0009
  Dihedrals (°)                                0.084 ± 0.017        0.0271 ± 0.012
Mean CNS Energies (kJ mol⁻¹)
  Ebond                                        2.34 ± 0.27          0.50 ± 0.05
  Eangle                                       17.5 ± 1.2           10.00 ± 0.19
  Edihe (b)                                    0.12 ± 0.05          0.0123 ± 0.012
  ENOE (b)                                     8.52 ± 1.1           3.2 ± 0.5
  Evdw                                         6.7 ± 0.8            3.4 ± 0.4
  Eimpr                                        1.8 ± 0.4            0.56 ± 0.05
  Etot                                         37.0 ± 2.3           17.7 ± 0.6
Atomic rms Differences (Å) (c)
  Backbone residues (9–35) (d)                 0.63 ± 0.14          0.25 ± 0.07
  All heavy atoms (residues 9–33)              1.12 ± 0.19          0.81 ± 0.12

(a) Idealized geometry is defined by the CHARMM force field as implemented within CNS.
(b) The final values of the square-well NOE and dihedral angle potentials were calculated with force constants of 50 kcal mol⁻¹ Å⁻² and 200 kcal mol⁻¹ Å⁻², respectively.
(c) Rms differences are given as the average rms difference against the mean coordinate structure.
(d) Backbone includes Cα, C, N, and O atoms.
All energies, violations, and rms differences are given as the mean ± standard deviation.

Figure 3. Overlays of USH-F9 and the Sixth Finger from ZFY
USH-F9 is shown in blue, and the sixth finger from ZFY [23] is shown in red. The overlays are related by a 90° rotation about a horizontal axis in the plane of the page. The differences in side chain conformation in this region can be clearly seen.

The GATA Binding Face of USH-F1

We first confirmed that the N-finger of GATA-1 can interact with USH-F1 both in vivo and in vitro and that the structural integrity of the USH finger is required for the interaction to take place (Figure 4). Mutation of the final cysteine of USH-F1 to alanine (C231A) clearly abrogates its interaction with NF. These results, in combination with the studies detailed below, support the notion that the interactions between GATA and FOG-type fingers are specific in nature.

Figure 4. The GATA-1:USH-F1 Interaction Requires Correctly Folded USH-F1
(a) HF7c yeast growth after 48 hr incubation at 29°C on the indicated minimal media. Each streak contains yeast harboring pGBT9.GATA-1 N-finger (NF of GATA-1 fused to the DNA binding domain of GAL4) and either pGAD10.USH-F1 (USH-F1 fused to the activation domain of GAL4), pGAD10.USH-F1 (C32A), or pGAD10.FOG-F6 (finger 6 from FOG). Growth on the (Leu− Trp−) medium indicates that all the yeast strains contain the NF and FOG/USH finger–encoding plasmids, which also contain the Leu and Trp genes, respectively. Growth in the additional absence of histidine indicates that a His reporter gene has been activated by physical association of the two fingers and consequent reconstitution of Gal4 activity.
(b) GST-pulldown assays. Lane 1 contains 10% of the input in vitro translated 35S-labeled GATA-1. Lane 2 contains GST-coated glutathione-Sepharose beads, lane 3 contains beads coated with GST-USH-F1, and lane 4 contains GST-USH-F1(C231A). Each sample was incubated with 35S-labeled GATA-1 and, after extensive washing, the fusion protein–coated beads were boiled in loading buffer and subjected to electrophoresis, after which retained GATA-1 was visualized by phosphorimaging. A Coomassie-stained SDS-PAGE is shown below to indicate the relative levels of each fusion protein that were bound to the beads during the pulldown assay. Molecular-weight markers are shown (labeled in kDa) on the right.

We next carried out a [15N,1H]-HSQC titration by adding unlabeled GATA-1 NF(200–243) to 15N-labeled USH-F1 (Figure 5a). The broadening and disappearance of a subset of resonances, in concert with the appearance of a similar number of new signals, indicated that the binding kinetics correspond to intermediate exchange on the chemical-shift timescale. Once solution conditions had been optimized for spectral quality, triple resonance NMR experiments were used to assign backbone HN, N, Cα, and C′ signals from [15N,13C]-USH-F1 in both the absence and the presence of NF.

In order to summarize the chemical-shift changes that occurred during the titration, we calculated a mean chemical-shift change index. For each atom type (HN, N, etc.) the observed chemical-shift change was divided by the maximum change seen for that atom type, giving a value between 0 and 1. These values were averaged for the HN, N, Cα, and C′ atoms of each residue (Figure 5b), and Figure 6a shows (in yellow) the residues that experience changes greater than 0.25 mapped onto a space-filling model of USH-F1 (F18, P21, T23, L24, H27, and Y31). The side chains of five of these residues (F18, P21, T23, H27, and Y31) lie on one contiguous surface of the domain, while L24 extends largely onto the opposite face. It is notable that the side chain methyl carbons of L24 (which extend prominently on the opposite face) do not undergo substantial chemical-shift changes upon complex formation. This suggests that the changes observed for the backbone atoms of L24 most likely reflect small conformational changes in this region brought about by the involvement of the surrounding residues T23 and H27 in the binding of NF. Furthermore, [13C,1H]-HSQC titration data revealed that the γCH3 group of I16 shifts upfield by 0.25 ppm, and the δCH protons of Y30 and Y31 shift by > 0.1 ppm (orange in Figure 6a). These results indicate that these side chains are likely to contact the NF directly. Hα, Cα, and C′ chemical-shift index plots for USH-F1 [24, 25] indicate that there are no major changes in the regular secondary-structure content of this domain upon NF binding.

Figure 5. Chemical-Shift Mapping of the USH-F1:NF(200–243) Interaction
(a) [15N,1H]-HSQC spectra of USH-F1 (pH 6.4, 280 K) in both the absence (black) and presence (red) of NF(200–243) (1.2 molar equivalents). Arrows are used to show residues that underwent substantial shifts in this titration, and residues indicated in outline text had a mean chemical-shift change index (MCSCI) of > 0.25 (see text) when we took into account HN, N, Cα, and C′ chemical shifts. An asterisk indicates signals that could not be assigned unambiguously.
(b) Plot of MCSCI against residue number for the USH-F1:NF titration. A dotted line indicates the lower limit for residues defined as having undergone substantial changes in MCSCI.

We next sought to confirm the specificity of the NF:USH interaction by using a mutational strategy. Two of the residues (I16 and Y31) that lie on the USH-F1 surface on which most of the chemical-shift changes were observed were separately mutated to alanine, and the NF binding ability of these point mutants was assessed in yeast two-hybrid assays. Likewise, E25 and Q28, two residues that lie on the same face as the L24 side chain, were mutated to alanine. The I16A and Y31A mutants exhibited negligible binding to NF in comparison to wild-type USH-F1 (Figure 7), while both E25A and Q28A retained strong NF binding activity. The combination of NMR and mutational data suggests that F18 forms the center of the NF binding face of USH-F1, which also includes residues I16, T23, H27, Y30, and Y31. All of these residues are conserved between USH-F1 and USH-F9, and it is therefore very likely that these two domains interact with NF in the same manner.

Comparison of these results with the sequences of other FOG family members (Figure 1) reveals that the GATA binding surface appears to be conserved across phyla. This is suggestive of an ancient mechanism for mediating protein–protein contacts. Further, an alanine scan of selected residues in FOG-F1 [15] revealed a number of amino acids that were important for maintaining the interaction with GATA-1, and Figure 6b illustrates these residues mapped onto the structure of USH-F1. Most of these residues are conserved in USH-F1 and USH-F9 and are not conserved in CCHC fingers that are unable to interact with GATA-1 (Figure 1). The contact surface revealed by the mutational study is clearly consistent with that deduced from our NMR data, and a consensus GATA-1 binding surface, derived from consideration of both NMR and mutagenic data, is shown in Figure 6c. The complementarity of the two experimental approaches is notable. For instance, residues Y18 and H27 of FOG-F1 were not mutated because of their essential roles in maintaining the folded conformation of the finger. In contrast, the [15N,1H]-HSQC titration data show substantial chemical-shift changes for these two residues, and these results indicate the residues' proximity to the GATA-1 N-finger. Conversely, no signal was observed in [15N,1H]-HSQC experiments for S19, so a chemical-shift change index could not be calculated. In the FOG-F1 mutational analysis, however, this residue proved essential for GATA-1 binding.

It is also interesting that the interaction face identified in all of these experiments overlaps considerably with the DNA binding face of the related classical zinc fingers (see, for example, reference [26]). Given that the location of the DNA binding face of CCHH fingers is highly conserved, the results of the current study indicate either (a) that these FOG family fingers serve solely as protein recognition modules or (b) that the binding of GATA family proteins and of DNA by these fingers are mutually exclusive events. Given that these FOG-type fingers are not located in tandem arrays of fingers and that there is no experimental evidence that these domains bind DNA, we favor a view that the primary function of USH-F1 and USH-F9 is to bind dGATA-a.

The USH Binding Surface of GATA-1 NF

We next used the same methodology to explore the surface through which the GATA-1 N-finger contacts USH-F1. Titration of USH-F1 into 15N-NF revealed that a distinct subset of residues underwent significant shifts (i.e., residues 203, 204, 205, 207, 208, 210, 220, 221, 223, 235, and 238 of NF shifted by > 0.1 ppm in the 1H dimension and/or > 0.5 ppm in the 15N dimension). Figure 8a shows these residues (violet/cyan) mapped onto the solution structure of the GATA-1 N-finger [27]. It can be seen that they lie largely on one face of the domain and are consistent with those identified in our mutagenic study of the interaction of NF and finger 6 of murine FOG-1 [12] (Figure 8b). This protein contact surface of the NF lies away from its supposed DNA-contact site (shown in orange), and this suggests that it should be capable of interacting with both USH/FOG fingers and DNA simultaneously [27].

Knowledge of the molecular details of the GATA-FOG interaction has illuminated the physical basis for a recently recognized form of familial dyserythropoietic anemia, a disease that is now attributed to a V205M mutation in GATA-1 [28]. Residue V205 is implicated in FOG binding by both NMR and mutagenic experiments [11, 12]. It is likely that a methionine residue at position 205 cannot be accommodated at the GATA-FOG interface and that this results in disruption of the interaction and consequently in dyserythropoietic anemia.

Weak Interactions in Transcription

The intermediate-to-slow exchange kinetics observed in the USH-F1/NF HSQC titration suggest that the binding affinity between these fingers is relatively low. Consistent with this, the half-lives for the NF(200–243):USH-F1 and NF(200–243):USH-F9 dimers are approximately 5 ms [unpublished observation] and approximately 15 ms, respectively. Judging from these data, it is clear that the association constants are likely to be less than approximately 10⁶ M⁻¹.

Sedimentation equilibrium data were collected on a 1:1 mixture of NF (M = 5191 Da) and USH-F1 (M = 4201 Da); number-average molecular weight estimates (Mave,z ≈ 7000 Da) were consistent with a relatively low-affinity interaction. Modeling of the data with the 2COMPSIM program [29] revealed that the data were best represented by an association constant of approximately 4 × 10⁴ M⁻¹. The magnitude of this association constant is consistent with our observation that the interaction cannot be detected by size exclusion chromatography, where a coinjection of NF(200–243) and USH-F1 results in separate elution of the two domains. Furthermore, the interaction between GATA-1 and FOG-1 has not been detected in electrophoretic mobility shift assays (M. C., unpublished observation). In both of the above assays, the protein concentrations are such that very little dimer would be present (< approximately 5%).

Despite this apparently low affinity, these interactions are clearly detectable in both yeast two-hybrid and GST-pulldown assays [12, 13, 15]. The fact that interactions of this magnitude can be observed in cellular assays raises the possibility that relatively weak protein complexes that can be quickly assembled and disassembled may facilitate rapid transcriptional responses.

Biological Implications

Classical zinc finger domains (C-X2–5-C-X12-H-X2–5-H) have long been thought to act primarily as DNA binding motifs. A number of recent studies have suggested, however, that some ZnFs, including the related CCHC fingers from FOG family transcriptional cofactors, may serve to interact with other proteins. We have used NMR spectroscopy to solve the solution structures of two CCHC zinc finger domains from the transcription factor U-shaped and to define the key residues involved in mediating the interactions that these domains make with the N-terminal finger of GATA-1. The structures resemble the classical TFIIIA-like fingers but exhibit an unusual helix termination motif at the C terminus. It appears that the residues in this motif, among others, are positioned to contribute to the interaction with GATA-1. Our results in combination with previous studies demonstrate that these interactions, although relatively weak, are highly specific and mediated by sets of conserved residues.

The results described here highlight the versatility of the basic ββα ZnF fold as a structural scaffold. It appears that these domains may act as DNA recognition motifs (e.g., TFIIIA CCHH fingers), protein recognition motifs (e.g., FOG and USH CCHC fingers), or perhaps both (e.g., the CCHH fingers from Sp1 and YY1 [7, 30]). The role of CCHC-type fingers as protein binding domains is also interesting in light of the observation that a number of proteins (e.g., Krüppel [31], Snail [32], Xfin [3], and REST [33]) carry a single CCHC finger that is separated from other ZnFs by many amino acid residues. It is possible that these isolated fingers are involved in recruiting protein partners. It is also interesting to speculate that, because only a single finger is required for protein binding activity, protein binding may represent the ancestral function of some ZnF domains, with nucleic acid binding evolving later through duplication events.

These observations also cast a new light on the presumed function of many ZnF proteins. It is now clear that newly discovered ZnF proteins should not always be presumed to possess nucleic-acid binding activity. This may be particularly important in the analysis of data from genome sequencing projects. In addition, these interactions provide a direct mechanism through which DNA-bound transcription factors may communicate to influence gene expression. Thus, for example, ZnF-mediated transcription factor complexes may bridge promoters and distant enhancer elements.
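The affinity estimates in the Weak Interactions section can be made concrete with a little arithmetic. The sketch below assumes simple 1:1 binding (A + B ⇌ AB) and first-order dissociation; the function names and the example concentrations are illustrative assumptions, since the text does not give the concentrations used in the chromatography or mobility shift assays.

```python
import math


def fraction_bound(ka, total_a, total_b):
    """Fraction of the limiting partner present as the 1:1 complex.

    ka is the association constant in M^-1; concentrations are in M.
    Solves [AB]^2 - (A_t + B_t + Kd)[AB] + A_t*B_t = 0 for the physical root.
    """
    kd = 1.0 / ka
    s = total_a + total_b + kd
    complex_conc = (s - math.sqrt(s * s - 4.0 * total_a * total_b)) / 2.0
    return complex_conc / min(total_a, total_b)


def off_rate_from_half_life(t_half_s):
    """First-order dissociation rate constant (s^-1) from a half-life (s)."""
    return math.log(2.0) / t_half_s


if __name__ == "__main__":
    ka = 4e4  # M^-1, the sedimentation equilibrium estimate quoted above
    # Hypothetical concentrations: dilute assay versus NMR-like sample.
    print(fraction_bound(ka, 1e-6, 1e-6))   # roughly 4% complex at 1 uM each
    print(fraction_bound(ka, 5e-4, 5e-4))   # about 80% complex at 0.5 mM each
    print(off_rate_from_half_life(5e-3))    # about 139 s^-1 for a 5 ms half-life
```

Under these assumptions, an association constant of 4 × 10⁴ M⁻¹ leaves only a few percent of the protein in complex at micromolar concentrations, in line with the statement that very little dimer would be present in dilute assays, while most of the protein is bound at the sub-millimolar concentrations typical of NMR samples.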

Experimental Procedures

Protein Production

Constructs encoding residues 202–235 and 1113–1146 of U-shaped were subcloned into the E. coli expression vector pGEX-2T (Pharmacia); this process created C-terminal fusions with glutathione-S-transferase (GST), as described previously [34]. These constructs were expressed in the host strain BL21 (DE3), grown in Luria broth, and purified as described previously [27]. Uniformly 15N-labeled peptides were prepared according to the protocol of Cai et al. [35]. 15N-labeled ammonium chloride (Isotec) served as the sole nitrogen source. The N-finger of GATA-1 was prepared as described previously [27]. Uniform labeling was confirmed using electrospray mass spectrometry. Note that as a consequence of the engineered thrombin cleavage site in the vector, both of the U-shaped domains have an additional N-terminal GS sequence (residues 1 and 2 in Figure 1).

NMR Sample Preparation

Samples of USH-F1, USH-F9, and NF(200–243) were prepared for NMR spectroscopy by first dissolving the lyophilized peptides in a solution containing tris-carboxyethylphosphine (TCEP, 2 mM), ZnSO4 (1.5 molar equivalents), and D2O (5%), giving solutions with pH values in the range 2–3, and with concentrations of less than 0.4 mM in peptide. Using 10 mM and 0.1 M NaOH, we carefully raised the pH to approximately 5.5 (5.0 for USH-F9). In the case of USH-F1 and NF(200–243), the resulting solutions were concentrated in a vacuum centrifuge. The final concentrations of NMR samples were approximately 0.6 mM (USH-F1), approximately 1–2 mM (NF[200–243]), and approximately 0.4 mM (USH-F9). Note that this process required some care. If the domains were concentrated to more than the stated extent, line broadening was observed in the 1H NMR spectrum.

Figure 6. The GATA-Interacting Surface of FOG Family CCHC Zinc Fingers
(a) Space-filling diagram of USH-F1 illustrates the residues (in yellow) that exhibited MCSCI values of > 0.25. I16 and Y30 are indicated in orange, as significant changes in the chemical shifts of side chain atoms are observed for these residues upon complex formation.
(b) Residues of FOG-F1 required for GATA-1 binding, as they were deduced from alanine scanning mutagenesis [15]. These residues (in yellow) have been mapped onto the structure of USH-F1.
(c) Consensus GATA-1 binding face (shown in yellow), inferred by combining NMR titration and mutagenesis data. The residues constituting the binding surface are mapped onto the structure of USH-F1. In both (a) and (b), USH-F1 is shown in the same orientation as in Figure 2.
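The mean chemical-shift change index (MCSCI) used in Figures 5 and 6 follows directly from its description in the text: each change is divided by the largest change seen for that atom type, and the scaled values are averaged per residue. The sketch below is one possible reading of that description, not the authors' script. The input layout, the handling of residues with missing atom types (averaging over whatever is available), and the example numbers are all assumptions.

```python
def mcsci(shift_changes, threshold=0.25):
    """Compute a mean chemical-shift change index per residue.

    shift_changes maps residue number -> {atom type: shift change in ppm}.
    Returns (index by residue, sorted list of residues above threshold).
    """
    # Largest absolute change observed for each atom type (HN, N, ...).
    max_by_atom = {}
    for per_atom in shift_changes.values():
        for atom, delta in per_atom.items():
            max_by_atom[atom] = max(max_by_atom.get(atom, 0.0), abs(delta))

    index = {}
    for residue, per_atom in shift_changes.items():
        # Scale each change to the 0-1 range for its atom type, then average.
        scaled = [
            abs(delta) / max_by_atom[atom]
            for atom, delta in per_atom.items()
            if max_by_atom[atom] > 0.0
        ]
        if scaled:
            index[residue] = sum(scaled) / len(scaled)

    flagged = sorted(r for r, value in index.items() if value > threshold)
    return index, flagged


if __name__ == "__main__":
    # Hypothetical shift changes in ppm, for illustration only.
    example = {
        16: {"HN": 0.02, "N": 0.10},
        18: {"HN": 0.20, "N": 1.00},
        25: {"HN": 0.01, "N": 0.05},
    }
    index, flagged = mcsci(example)
    print(index)
    print(flagged)
```

Scaling by the per-atom-type maximum puts 1H, 15N, and 13C changes on a common footing before averaging, which is why a single cutoff such as 0.25 can be applied across nuclei with very different chemical-shift ranges.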

Figure 7. The GATA-1:USH-F1 Interaction Occurs on a Specific Surface of USH-F1
The growth of HF7c yeast (incubation for 48 hr at 29°C) on the indicated minimal media is shown on both plates. Each streak contains yeast harboring the pGBT9.GATA-1 N-finger and either pGAD10.USH-F1 or one of the four indicated mutants. The negative control wedge contained yeast harboring the empty pGBT9 plasmid together with pGAD10.USH-F1(E26A). See the legend of Figure 4 for details.

NMR Spectroscopy
NMR experiments were carried out either on Bruker DRX600 or DRX500 NMR spectrometers equipped with 5 mm triple resonance gradient probes. NMR spectra used for structure determination were acquired at 298 K (USH-F1) or 293 K (USH-F9). Water suppression was generally achieved by using pulsed field gradients. The following homonuclear 2D experiments were

Figure 8. The USH/FOG Binding Face of the GATA-1 N Finger
Ribbon diagrams of the GATA-1 NF illustrate in violet (a) the amino acids that underwent significant changes in chemical shift during the titration of 15N-NF with USH-F1 and (b) the amino acids that were required to maintain the NF–FOG interaction in a mutagenic analysis of the GATA-1 NF [12]. V205 is highlighted in cyan; individuals with a V205M mutation can suffer familial dyserythropoietic anaemia. In addition, the predicted DNA binding face of NF (based on the structure of the homologous GATA-1 C-finger bound to DNA [48]) is shown in orange.

recorded on the unlabeled samples: TOCSY [36] (τm = 70 ms), DQFCOSY [37], and NOESY [38] (τm = 50 and 250 ms for USH-F1, and τm = 50, 200, and 250 ms for USH-F9). 2D HSQC-TOCSY [39] (τm = 70 ms), 3D HNHA [40], and 3D TOCSY-HSQC [41] experiments were used to assign the [15N,1H]-HSQC spectra of the 15N labeled peptides, while HNHA and HNHB [42] spectra were used to derive 3JHNα and 3JNβ coupling constants, respectively. Spectral widths were typically 12 ppm for 1H and 30 ppm for 15N. Spectra were processed as described previously [27]. For the detection of two- and three-bond scalar couplings involving the histidine side chain nitrogens, a 2D [15N,1H]-HSQC was recorded in which the dephasing delay 1/(4J) was set to 11 ms [20] and the 15N carrier frequency and spectral width were 201 ppm and 6000 Hz, respectively.

Structure Determination
NOESY crosspeaks were integrated using a combination of manual and lineshape integration in XEASY [43] and converted to upper distance limits by using the CALIBA module of the DYANA [44] program. These constraints, together with scalar coupling constants determined from HNHA data, were used to generate a set of allowable dihedral angles in the GRIDSEARCH module, also within DYANA. These distance and dihedral angle constraints were used to calculate 1000 structures from random starting conformers in DYANA. The conformer with the lowest target function in each case was used as input for restrained molecular dynamics–simulated annealing calculations in CNS [21], with randomized initial atomic velocities for each run. A single zinc atom was built into the starting structures, with equilibrium Zn–S and Zn–N bond lengths set to 2.3 and 2.0 Å, respectively, and restrained with force constants of 250 kcal mol-1 Å-2. Bond angles defining the zinc coordination site were set to 109° (S-Zn-S and S-Zn-N) and 108° (Cβ-S-Zn) and were constrained with a force constant of 50 kcal mol-1 rad-2. No significant differences in the structures were observed when the force constant was varied in the range of 10–50 kcal mol-1 rad-2. The standard parameter and topology files protein-allhdg.param and protein-allhdg.top were used to restrain the protein geometry, and the annealing protocol followed the default values in the input script anneal.inp, provided with the program. No hydrogen bonding constraints were used in the calculations. A total of 500 structures was calculated for each domain, and the 20 with the lowest overall energies were used to represent the solution structures of USH-F1 and USH-F9. The structures were examined with the programs MOLMOL [45] and PROCHECK-NMR [22].

USH-F1:NF(200–243) Binding Experiments
A sample of 15N-labeled USH-F1 (0.5 mM) was titrated with a solution of NF(200–243) in increments of 0.2 molar equivalents, and [15N,1H]-HSQC spectra were recorded after each addition. NF(200–243) was added until no significant changes were observed in the USH-F1 spectrum (approximately 1 molar equivalent). Titrations were carried out at several different temperatures between 278 and 298 K in order to maximize spectral quality. A reverse titration was also carried out in which USH-F1 was added stepwise to a solution of 15N-labeled NF(200–243). In all cases, solution pHs were monitored during the titrations; no changes were observed.

Sedimentation Equilibrium
Sedimentation equilibrium experiments were carried out on a Beckman XL-A analytical ultracentrifuge. Mixtures of NF and USH-F1 (1:1 molar ratio) were made up to give loading concentrations between 20 and 100 μM. Data were recorded at 298 K using two-channel centerpieces and an An-60ti rotor; the speeds used were 30,000 and 42,000 rpm. Circular dichroism spectra of the individual peptides were recorded prior to running the sedimentation experiments in order to confirm that the peptides were properly folded. Data were recorded as absorbance versus radius scans at 0.001 cm intervals and as the sum of ten scans. Scans were collected at 3 hr intervals and compared to determine when the samples had reached sedimentation and chemical equilibrium. Weight-average molecular weights were estimated by fitting the data to an ideal single-species model in NONLIN [46], and then an estimate of the association constant was obtained by comparing the data to multiple simulated data sets generated in the program 2COMPSIM [29]. It had been previously established that both USH-F1 [18] and NF(200–243) [34] are strictly monomeric in solution. The partial specific volumes for USH-F1 and NF were determined from the amino acid sequences. In these calculations, we took into account the temperature and the presence of a single zinc ion per ZnF domain (a density of 7.14 g ml-1 was assumed for the Zn2+ ion).

Size Exclusion Chromatography
Samples of NF(200–243) and USH-F1 were made up as for NMR spectroscopy. These samples were applied either separately or premixed (at an approximately 1:1 molar ratio) onto a Superdex Peptide HR 10/30 size exclusion column (Amersham Pharmacia Biotech) equilibrated in a buffer containing Tris (20 mM) and NaCl (250 mM, pH 8.1). Protein elution was monitored at 280 nm.

GST-Pulldowns and Yeast Two-Hybrid Assays
Mutants were constructed by the standard overlap PCR method, and GST-pulldowns were carried out as previously described [12]. For yeast two-hybrid assays, competent HF7c yeast cells were transformed simultaneously with both the appropriate pGBT9.GATA-1 and either pGAD10.USH-F1 or pGAD10.FOG-F6 finger constructs (Clontech Two-Hybrid Matchmaker system protocol), and the transformants were selected on Leu– Trp– minimal media plates after growth at 29°C. Transformants were then patched onto His– Leu– Trp– plates and monitored for growth for up to three days.

Acknowledgments
This work was supported by the Australian Research Council through a grant to M. C. and J. P. M., who is an Australian Research Fellow. C. L., K. K., B. S., A. N., and A. F. hold Australian Postgraduate Awards. We thank Robert Czolij for expert technical assistance in protein purification. We thank Bruker Australia, Ltd and Peter Barron in particular for access to their DRX500 NMR spectrometer.

Received: July 11, 2000
Revised: September 21, 2000
Accepted: September 22, 2000

References
1. Clarke, N.D., and Berg, J.M. (1998). Zinc fingers in Caenorhabditis elegans: finding families and probing pathways. Science 282, 2018–2022.
2. Schwabe, J.W.R., and Klug, A. (1994). Zinc mining for protein domains. Nature Struct. Biol. 1, 345–349.
3. Ruiz i Altaba, A., Perry-O’Keefe, H., and Melton, D.A. (1987). Xfin: an embryonic gene encoding a multifingered protein in Xenopus. EMBO J. 6, 3065–3070.
4. Hata, A., Seoane, J., Lagna, G., Montalvo, E., Hemmati-Brivanlou, A., and Massague, J. (2000). OAZ uses distinct DNA- and protein-binding zinc fingers in separate BMP-Smad and Olf signaling pathways. Cell 100, 229–240.
5. Mackay, J.P., and Crossley, M. (1998). Zinc fingers are sticking together. Trends Biochem. Sci. 265, 1–4.
6. Weiss, M.J., and Orkin, S.H. (1995). GATA transcription factors: key regulators of hematopoiesis. Exp. Hematol. 23, 99–107.
7. Merika, M., and Orkin, S. (1995). Functional synergy and physical interactions of the erythroid transcription factor GATA-1 with the Kruppel family proteins Sp1 and EKLF. Mol. Cell. Biol. 15, 2437–2447.
8. Martin, D.I., and Orkin, S.H. (1990). Transcriptional activation and DNA binding by the erythroid factor GF-1/NF-E1/Eryf 1. Genes Dev. 4, 1886–1898.
9. Trainor, C.D., Omichinski, J.G., Vandergon, T.L., Gronenborn, A.M., Clore, G.M., and Felsenfeld, G. (1996). A palindromic regulatory site within vertebrate GATA-1 promoters requires both zinc fingers of the GATA-1 DNA-binding domain for high-affinity interaction. Mol. Cell. Biol. 16, 2238–2247.
10. Yang, H.Y., and Evans, T. (1992). Distinct roles for the two cGATA-1 finger domains. Mol. Cell. Biol. 12, 4562–4570.
11. Crispino, J., Lodish, M.B., Mackay, J.P., and Orkin, S.H. (1999). Altered specificity mutants establish that direct association of FOG with transcription factor GATA-1 is critical for erythroid differentiation. Mol. Cell 3, 219–228.
12. Fox, A., Kowalski, K., Mackay, J.P., and Crossley, M. (1998). The GATA-FOG complex: key residues specific to the GATA-1 N-finger are recognised by FOG. J. Biol. Chem. 273, 33595–33603.
13. Tsang, A., Visvader, J.E., Turner, C.A., Fujiwara, Y., Yu, C., Weiss, M.J., et al. (1997). FOG, a multitype zinc finger protein, acts as a cofactor for transcription factor GATA-1 in erythroid and megakaryocytic differentiation. Cell 90, 109–119.

14. Merika, M., and Orkin, S.H. (1993). DNA-binding specificity of GATA family transcription factors. Mol. Cell. Biol. 13, 3999–4010.
15. Fox, A., Kowalski, K., Mackay, J.P., and Crossley, M. (1999). Transcriptional cofactors of the FOG-family interact with GATA-proteins by means of multiple zinc fingers. EMBO J. 18, 2812–2822.
16. Haenlin, M., Cubadda, Y., Blondeau, F., Heitzler, P., Lutz, Y., Simpson, P., et al. (1997). Transcriptional activity of pannier is regulated negatively by heterodimerization of the GATA DNA-binding domain with a cofactor encoded by the u-shaped gene of Drosophila. Genes Dev. 11, 3096–3108.
17. Ramain, P., Heitzler, P., Haenlin, M., and Simpson, P. (1993). pannier, a negative regulator of achaete and scute in Drosophila, encodes a zinc finger protein with homology to the vertebrate transcription factor GATA-1. Development 119, 1277–1291.
18. Matthews, J.M., Kowalski, K., Liew, C.L., Sharpe, B.K., Fox, A.H., Crossley, M., et al. (2000). A class of zinc fingers involved in protein-protein interactions: biophysical characterization of CCHC fingers from FOG and U-shaped. Eur. J. Biochem. 267, 1030–1038.
19. Güntert, P., Mumenthaler, C., and Wüthrich, K. (1997). Torsion angle dynamics for NMR structure calculation with the new program DYANA. J. Mol. Biol. 273, 283–298.
20. Pelton, J.G., Torchia, D.A., Meadow, N.D., and Roseman, S. (1993). Tautomeric states of the active-site histidines of phosphorylated and unphosphorylated IIIGlc, a signal-transducing protein from Escherichia coli, using two-dimensional heteronuclear NMR techniques. Prot. Sci. 2, 543–558.
21. Brunger, A.T., Adams, P.D., Clore, G.M., DeLano, W.L., Gros, P., Grosse-Kunstleve, R.W., et al. (1998). Crystallography & NMR system: A new software suite for macromolecular structure determination. Acta Crystallogr. D. Biol. Crystallogr. 54, 905–921.
22. Laskowski, R.A., Rullmann, J.A., MacArthur, M.W., Kaptein, R., and Thornton, J.M. (1996). AQUA and PROCHECK-NMR: programs for checking the quality of protein structures solved by NMR. J. Biomol. NMR 8, 477–486.
23. Kochoyan, M., Havel, T.F., Nguyen, D.T., Dahl, C.E., Keutmann, H.T., and Weiss, M.A. (1991). Alternating zinc fingers in the human male associated protein ZFY: 2D NMR structure of an even finger and implications for “jumping-linker” DNA recognition. Biochemistry 30, 3371–3386.
24. Wishart, D.S., and Sykes, B.D. (1994). The 13C chemical-shift index: a simple method for the identification of protein secondary structure using 13C chemical-shift data. J. Biomol. NMR 4, 171–180.
25. Wishart, D.S., Sykes, B.D., and Richards, F.M. (1992). The chemical shift index: a fast and simple method for the assignment of protein secondary structure through NMR spectroscopy. Biochemistry 31, 1647–1651.
26. Wuttke, D.S., Foster, M.P., Case, D.A., Gottesfeld, J.M., and Wright, P.E. (1997). Solution structure of the first three zinc fingers of TFIIIA bound to the cognate DNA sequence: determinants of affinity and sequence specificity. J. Mol. Biol. 273, 183–206.
27. Kowalski, K., Czolij, R., King, G.F., Crossley, M., and Mackay, J.P. (1999). The solution structure of the N-terminal zinc finger of GATA-1 reveals a specific binding face for the transcriptional cofactor FOG. J. Biomol. NMR 13, 249–261.
28. Nichols, K.E., Crispino, J.D., Poncz, M., White, J.G., Orkin, S.H., Maris, J.M., et al. (2000). Familial dyserythropoietic anaemia and thrombocytopenia due to an inherited mutation in GATA-1. Nat. Genet. 24, 266–270.
29. Minton, A.P. (1997). Alternative strategies for the characterization of associations in multicomponent solutions via measurement of sedimentation equilibrium. Prog. Colloid Polym. Sci. 107, 11–19.
30. Lee, J.S., Galvin, K.M., and Shi, Y. (1993). Evidence for physical interaction between the zinc-finger transcription factors YY1 and Sp1. Proc. Natl. Acad. Sci. USA 90, 6145–6149.
31. Rosenberg, U.B., Preiss, A., Seifert, E., Jackle, H., and Knipple, D.C. (1985). Production of phenocopies by Kruppel antisense RNA injection into Drosophila embryos. Nature 313, 703–706.
32. Boulay, J.L., Dennefeld, C., and Alberga, A. (1987). The Drosophila developmental gene snail encodes a protein with nucleic acid binding fingers. Nature 330, 395–398.
33. Tapia-Ramirez, J., Eggen, B.J., Peral-Rubio, M.J., Toledo-Aral, J.J., and Mandel, G. (1997). A single zinc finger motif in the silencing factor REST represses the neural-specific type II sodium channel promoter. Proc. Natl. Acad. Sci. USA 94, 1177–1182.
34. Mackay, J.P., Kowalski, K., Czolij, R., King, G.F., and Crossley, M. (1998). Involvement of the N-finger in the homodimerization of GATA-1. J. Biol. Chem. 273, 30560–30567.
35. Cai, M., Huang, Y., Sakaguchi, K., Clore, G.M., Gronenborn, A.M., and Craigie, R. (1998). An efficient and cost-effective isotope labeling protocol for proteins expressed in Escherichia coli. J. Biomol. NMR 11, 97–102.
36. Bax, A., and Davis, D.G. (1985). MLEV-17 based two-dimensional homonuclear magnetisation transfer spectroscopy. J. Magn. Reson. 65, 355–360.
37. Rance, M., Sorensen, O.W., Bodenhausen, G., Wagner, G., Ernst, R.R., and Wüthrich, K. (1983). Improved spectral resolution in COSY 1H NMR spectra of proteins via double quantum filtering. Biochem. Biophys. Res. Commun. 117, 479–485.
38. Kumar, A., Ernst, R.R., and Wüthrich, K. (1980). A two-dimensional nuclear Overhauser enhancement (2D nOe) experiment for the elucidation of complete proton-proton cross-relaxation networks in biological macromolecules. Biochem. Biophys. Res. Commun. 95, 1–6.
39. Norwood, T.J., Boyd, J., Heritage, J.E., Soffe, N., and Campbell, I.D. (1990). Comparison of techniques for 1H-detected heteronuclear 1H-15N spectroscopy. J. Magn. Reson. 87, 488–501.
40. Vuister, G., and Bax, A. (1993). Quantitative J correlation: a new approach for measuring homonuclear three-bond J(HNHα) coupling constants in 15N-enriched proteins. J. Amer. Chem. Soc. 115, 7772–7777.
41. Marion, D., Driscoll, P.C., Kay, L.E., Wingfield, P.T., Bax, A., Gronenborn, A.M., et al. (1989). Overcoming the overlap problem in the assignment of 1H NMR spectra of larger proteins by use of three-dimensional heteronuclear 1H-15N Hartmann-Hahn-multiple quantum coherence and nuclear Overhauser-multiple quantum coherence spectroscopy: application to interleukin 1β. Biochemistry 28, 6150–6156.
42. Dux, P., Whitehead, B., Boelens, R., Kaptein, R., and Vuister, G.W. (1997). Measurement of 15N-1H coupling constants in uniformly 15N-labeled proteins: application to the photoactive yellow protein. J. Biomol. NMR 10, 301–306.
43. Bartels, C., Xia, T.-H., Billeter, P., Güntert, P., and Wüthrich, K. (1995). The program XEASY for computer-supported NMR spectral analysis of biological macromolecules. J. Biomol. NMR 6, 1–10.
44. Güntert, P., Mumenthaler, C., and Wüthrich, K. (1997). Torsion angle dynamics for NMR structure calculation with the new program DYANA. J. Mol. Biol. 273, 283–298.
45. Koradi, R., Billeter, M., and Wüthrich, K. (1996). MOLMOL: a program for display and analysis of macromolecular structures. J. Mol. Graph. 14, 51–55, 29–32.
46. Johnson, M.L., Correia, J.J., Yphantis, D.A., and Halvorson, H.R. (1981). Analysis of data from the analytical ultracentrifuge by nonlinear least-squares techniques. Biophys. J. 36, 575–588.
47. Deconinck, A.E., Mead, P.E., Tevosian, S.G., Crispino, J.D., Katz, S.G., Zon, L.I., et al. (2000). FOG acts as a repressor of red blood cell development in Xenopus. Development 127, 2031–2040.
48. Omichinski, J.G., Clore, G.M., Schaad, O., Felsenfeld, G., Trainor, C., Appella, E., et al. (1993). NMR structure of a specific DNA complex of Zn-containing DNA binding domain of GATA-1. Science 261, 438–446.

ID Codes
The atomic coordinates and experimental restraint files for the ensemble of 20 USH-F1 and USH-F9 structures have been deposited in the Protein Data Bank (PDB accession codes 1fv1 and 1fu9, respectively).
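
Illustrative note on the single-species sedimentation model
The ideal single-species analysis mentioned under Sedimentation Equilibrium can be illustrated with a minimal sketch. It is not the NONLIN algorithm: NONLIN performs a nonlinear least-squares fit, whereas the code below uses a simpler log-linear regression, which is only adequate for noise-free data with no baseline offset. It relies on the standard relation A(r) = A0 exp[M(1 - vbar*rho)w^2(r^2 - r0^2)/(2RT)]. The rotor speed (30,000 rpm), temperature (298 K), and 0.001 cm radial spacing are taken from the text; the molecular mass, partial specific volume, solvent density, radial window, and all function names are hypothetical values chosen purely for illustration.

```python
import math

R_GAS = 8.314e7  # gas constant in erg mol^-1 K^-1 (CGS units, so radii are in cm)


def omega_from_rpm(rpm):
    """Convert rotor speed in rpm to angular velocity in rad/s."""
    return rpm * 2.0 * math.pi / 60.0


def simulate_absorbance(radii_cm, mass, vbar, rho, rpm, temp_k, a0, r0_cm):
    """Ideal single-species equilibrium profile A(r) for a given buoyant mass."""
    sigma = mass * (1.0 - vbar * rho) * omega_from_rpm(rpm) ** 2 / (2.0 * R_GAS * temp_k)
    return [a0 * math.exp(sigma * (r * r - r0_cm * r0_cm)) for r in radii_cm]


def fit_mass(radii_cm, absorbances, vbar, rho, rpm, temp_k):
    """Estimate the molecular mass from the slope of ln(A) versus r^2."""
    xs = [r * r for r in radii_cm]
    ys = [math.log(a) for a in absorbances]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx  # equals M(1 - vbar*rho)w^2 / (2RT)
    return 2.0 * R_GAS * temp_k * slope / ((1.0 - vbar * rho) * omega_from_rpm(rpm) ** 2)


# Hypothetical inputs (assumptions, not values reported in the paper).
ASSUMED_MASS = 9000.0   # g/mol
ASSUMED_VBAR = 0.72     # mL/g
ASSUMED_RHO = 1.00      # g/mL

# Radial window of 6.90 to 7.10 cm sampled every 0.001 cm (spacing from the text).
radii = [6.90 + 0.001 * i for i in range(201)]
profile = simulate_absorbance(radii, ASSUMED_MASS, ASSUMED_VBAR, ASSUMED_RHO,
                              30000.0, 298.0, 0.2, radii[0])
fitted_mass = fit_mass(radii, profile, ASSUMED_VBAR, ASSUMED_RHO, 30000.0, 298.0)
print("fitted molecular mass (g/mol):", round(fitted_mass, 3))
```

With real absorbance scans, noise and a nonzero baseline make the logarithmic transform unreliable, which is one reason a nonlinear fit of the kind performed by NONLIN is generally preferred.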