Proc. Natl. Acad. Sci. USA
Vol. 89, pp. 9759-9763, October 1992
Biochemistry

Sequence-specific recognition of DNA by zinc-finger peptides derived from the transcription factor Sp1
(mobility shift/primer extension/GC box/binding isotherm/competition assay)

RICHARD W. KRIWACKI†, STEVE C. SCHULTZ‡§, THOMAS A. STEITZ†, AND JOHN P. CARADONNA†¶

Departments of †Chemistry and ‡Molecular Biophysics and Biochemistry and the Howard Hughes Medical Institute, Yale University, P.O. Box 6666, New Haven, CT 06511

Contributed by Thomas A. Steitz, July 6, 1992

ABSTRACT

We have overexpressed and purified two peptide fragments of Sp1 that contain the three "zinc-finger" domains necessary for specific Sp1 DNA binding. These peptides assume a stable, folded conformation in solution in the presence of Zn2+ as shown by DNA binding assays and NMR spectroscopy. Mobility-shift assays demonstrate that the Sp1 peptides recognize a number of different Sp1 DNA binding sites (GC boxes, with the core sequence GGGCGG). The dissociation constant for a 92-amino acid peptide binding to the GGGGCGGGGC sequence (Kd ≈ 10 nM) and the relative affinities for several other DNA sequences definitively demonstrate Sp1-like binding properties. The thermodynamic binding site for Sp1-Zn92 has been mapped using the primer-extension/mobility-shift assay, revealing that the 5' portion of the GC box DNA sequence (GGG GCG) contributes more strongly to the total binding energy than the 3' portion (GGGC). These findings are interpreted in the context of the Sp1 amino acid sequence in comparison with the structurally characterized Zif-268/DNA complex. A model is proposed that offers a structural explanation for the ability of Sp1 to recognize a diverse array of DNA sequences in terms of the individual (and different) DNA binding properties of each of the three zinc-finger domains.

The "zinc-finger" motif (1, 2) has been identified within the amino acid sequences of a large number of DNA binding proteins, including a wide variety of transcription factors in which this motif mediates sequence-specific recognition of DNA. Transcription factor Sp1 (3), which contains three contiguous Cys2His2 zinc-finger domains, binds to some, but not all, DNA sequences that contain the asymmetric GGGCGG hexanucleotide core (GC box) within a large number of cellular and viral promoters. Early studies demonstrated that Sp1-responsive promoters typically contain multiple binding sites of different DNA sequence and Sp1 binding affinity (4-6). A survey of high-affinity Sp1 DNA binding sites (Kd ≈ 10^-9 M) reveals that a large number of substitutions within the generalized Sp1 GC box sequence are tolerated while high binding affinity is maintained (7, 8). These observations raise questions about the molecular basis of the Sp1/DNA recognition process and the role played by multiple Sp1 binding sites of different affinity with regard to the enhancement of transcriptional activation. We are interested in understanding the molecular details of DNA recognition by Sp1 and with this aim have developed a system for studying the underlying intermolecular interactions at the structural level.

In this report we describe the overexpression, purification, and characterization of the DNA binding properties of two peptides, 121 and 92 amino acids in length, that contain the three zinc-fingers of Sp1. We have employed various techniques based on gel electrophoresis to demonstrate that these Sp1 peptides recognize the same DNA sequences as wild-type Sp1. Considering the three zinc-fingers as individual DNA binding domains and, analogously, considering the Sp1 binding site to be composed of three individual subsites [as first proposed for Zif-268 by Nardelli et al. (9)], our electrophoresis studies reveal that the binding affinities associated with the three pairwise zinc-finger/DNA subsite interactions are not equal, with the affinity of the second and third fingers binding to their respective DNA subsites within the 5' portion of the GC box significantly greater than that for the first Sp1 finger binding to its DNA subsite within the 3' portion of the GC box. These findings are shown to be consistent with (i) the high conservation of the GGG GCG motif at the 5' end of GC box sequences and (ii) the relatively low conservation within the 3' end of these sequences. We compare the Sp1 peptide sequence with that of Zif-268 (10) and propose a model for the binding of Sp1 to the consensus GC box.

MATERIALS AND METHODS

Expression and Purification of Sp1 Peptides. Several zinc-finger-containing Sp1 peptides of different length were overexpressed during the course of studies designed to determine the minimal peptide length required for Sp1-like DNA binding; we present here results obtained with two of these peptides. Standard cloning techniques (11-14) were used to construct expression vectors that code for residues 521-639 and residues 533-623 of Sp1 (pSp1-Zn121 and pSp1-Zn92, respectively). The amino acid sequences for Sp1-Zn121 and Sp1-Zn92 are given in Table 1. These peptides were overexpressed in Escherichia coli strain BL21(DE3) (13) using standard protocols (13, 14).

Table 1. Amino acid sequences for Sp1-Zn121 and Sp1-Zn92 (Sp1-Zn92 is in bold type, between brackets) in the N-terminal to C-terminal direction

  Sp1-Zn92 residues   Sequence
                      MKDSEGRGSGDPG[(M)KK
  4-33                KQHICHIQGCGKVYGKTSHLRAHLRWHTGE
  34-63               RPFMCTWSYCGKRFTRSDELQRHKRTHTGE
  64-91               KKFAC--PECPKRFMRSDHLSKHIKTHQNK
  92                  K]GGPGVALSVGTLPLDS

Key residues discussed in the text are underlined, and the Cys2His2 residues are in large, bold type. Numbering is for Sp1-Zn92.

Purification procedures were performed at 4°C unless otherwise noted. E. coli cells (50 g) in which either Sp1-Zn121 or Sp1-Zn92 was overexpressed were resuspended and lysed in buffer L [50 mM Hepes, pH 8.0/50 mM NaCl/10 mM dithiothreitol (DTT)/1 mM EDTA/1 mM phenylmethylsulfonyl fluoride]. Solid urea was added to the supernatant to 5 M. The peptides were isolated using cation-exchange chromatography (S Sepharose and Mono S, Pharmacia) with buffers composed of 5 M urea/50 mM Hepes, pH 8.0/10 mM DTT/1 mM EDTA using an NaCl gradient. Pooled fractions were reduced at 70°C for 30 min (100 mM DTT) and purified by reverse-phase HPLC (C4 resin, Vydac, Hesperia, CA) using solutions composed of H2O/0.1% trifluoroacetic acid (TFA) and CH3CN/0.1% TFA. The peptides were lyophilized and stored in an argon atmosphere (22°C). Isolated yield was ≈10 mg per 1 g of wet cells.

Gel Mobility-Shift, DNA Binding Assay. See refs. 15 and 16.

Six-site oligonucleotide. The plasmid pRVS35' (ref. 17; M. Biggin, personal communication) was digested with Pst I and Xho I to generate an ≈80-base-pair (bp) fragment containing three tandem repeats of the GC box; the fragment was gel purified and 3' end-labeled. Binding reaction mixtures contained the 3' 32P-end-labeled DNA fragment (≈100 pM), 10 mM Tris (pH 8.0), 50 mM NaCl, 100 μM ZnSO4, and 200 ng of poly(dA/dT) per μl (Pharmacia). Sp1-Zn121 was added last, followed by equilibration (22°C, 20 min) and electrophoresis in 6% polyacrylamide gels (75 polyacrylamide:1 bisacrylamide) preequilibrated in 1:2 Tris/borate. Additional binding reaction mixtures included fixed Sp1-Zn121 concentration (≈4 μM) and increasing concentrations of two unlabeled competitor oligonucleotides: Mt IIa-1/2 (Mt) and CAP-30-1/2 (CAP) (Table 2).

Single-site oligonucleotides. The G-rich strands for the 20-bp single GC box oligonucleotide series were 5' end-labeled and annealed with their complementary strands. Binding reactions (as above, including 0.5% Nonidet-P40) with these 20-mers (4-6 nM) were performed with (i) increasing Sp1-Zn92 concentration and (ii) fixed Sp1-Zn92 concentration and increasing concentrations of unlabeled 20-mers. Additional binding reaction mixtures included 10 mM EDTA to remove Zn2+. Samples were then applied to 10% polyacrylamide gels (75:1) and electrophoresed as above. The dried gels were quantitated using a Betascope 603 blot analyzer (Betagen, Waltham, MA).

The fraction of labeled DNA bound by Sp1-Zn92 (θb) was calculated using the equation

$$\theta_b = \frac{I_b}{I_b + I_f},$$

where I_b is the intensity of the Sp1-Zn92-bound DNA band and I_f is the intensity of the free DNA band. The values of θb for protein titrations with the oligonucleotides metallothionein [GC box within the human Mt IIa (6)], human immunodeficiency virus long terminal repeat [HIV-LTR, site 3 (7)], and simian virus 40 [SV40 early promoter, site 3 (6)] (Table 2) without competitors were fit (KaleidaGraph program, Abelbeck Software) using the binding isotherm equation

$$\theta_b = \frac{A - (A^2 - B)^{1/2}}{2 K_a [{}^*\mathrm{DNA}]_t},$$

where

$$A = K_a\left([{}^*\mathrm{DNA}]_t + [\mathrm{Sp1\text{-}Zn92}]_t\right) + 1 \quad \text{and} \quad B = 4 K_a^2 [{}^*\mathrm{DNA}]_t [\mathrm{Sp1\text{-}Zn92}]_t$$

(18). The values of θb for experiments in which Sp1-Zn92 concentration was fixed and the concentration of competitor oligonucleotides was varied were analyzed using the equation

$${}^*\theta_b = \frac{A - (A^2 - B)^{1/2}}{2 [{}^*\mathrm{DNA}]_t},$$

where

$$A = K_a^{-1} + r [\mathrm{cDNA}]_t + [\mathrm{Sp1\text{-}Zn92}]_t + [{}^*\mathrm{DNA}]_t, \quad B = 4 [{}^*\mathrm{DNA}]_t [\mathrm{Sp1\text{-}Zn92}]_t, \quad \text{and} \quad r = K_a'/K_a$$

(18). In these equations Ka and Ka' are the equilibrium constants for Sp1-Zn92 binding to the 32P-labeled and unlabeled competitor oligonucleotides, [*DNA]t and [cDNA]t are the total concentrations of 32P-labeled and unlabeled oligonucleotides, and [Sp1-Zn92]t is the total concentration of Sp1-Zn92 peptide.

Primer-Extension/Mobility-Shift Assay. The primer-extension/mobility-shift assay was performed as reported by Liu-Johnson et al. (19) and Gartenberg et al. (20) with only minor modifications. The oligonucleotides (Table 2) contain a single GC box flanked at either end by 17 bp from the dhfr promoter, site I (7). The Sp1-Zn92 bound and free oligonucleotides were separated in 10% polyacrylamide gels (75:1). Isolated DNA fractions from bound and free bands were separated by electrophoresis in 20% polyacrylamide (20:1)/7 M urea gels. Dried gels were quantitated as before.

Table 2. Oligonucleotides used in this work

  Mobility-shift assays
    Single-site
      Mt IIa-1/2        GCCGGGGCGGGGCTTCTGCA
      HIV-LTR III-1/2   GCCGAGGCGTGGCTTCTGCA
      SV40 III-1/2      GCCTGGGCGGAGTTTCTGCA
    Single-site mutants
      Mt-T3             GCCGGTGCGGGGCTTCTGCA
      Mt-T4             GCCGGGTCGGGGCTTCTGCA
    Nonspecific site
      CAP-30-1/2        CAATTAATGTGAGTTAGCTCACTCATTAGG

  Primer-extension/mobility-shift assay
      DHFR-44a          CGCGGCGGGCCTTGGTGGGGGCGGGGCCTAAGCTGCGCAAGTGG
      DHFR-12ap         <------ extension ------ GACGCGTTCACC
      DHFR-12bp         CGCGGCGGGCCT ------ extension ------>
      DHFR-44b          GCGCCGCCCGGAACCACCCCCGCCCCGGATTCGACGCGTTCACC

Top strands are listed 5' → 3', bottom strands are listed 3' → 5', and complementary strands are not listed.

RESULTS

DNA Binding Properties of Sp1 Peptides. Initial mobility-shift DNA binding experiments using the 32P-labeled six-site oligonucleotide showed a ladder of up to six Sp1-Zn121-bound bands, indicating that Sp1-Zn121 binds specifically to the six GC box sites within the DNA fragment. The specificity of binding was further challenged by competition experiments in which comparable amounts of two unlabeled oligonucleotides, (i) a 20-mer containing a single GC box (Mt) or (ii) a 30-mer lacking a GC box (CAP), were added to the binding reactions (Fig. 1). The fraction of labeled six-site oligonucleotide bound at a fixed concentration of Sp1-Zn121 in these experiments is reduced by Mt and is unaffected by CAP. The Zn2+ dependence of DNA binding was established by mobility-shift DNA binding experiments using labeled Mt in the presence and absence of Zn2+; binding is observed in the presence of Zn2+ and is eliminated in the presence of 10 mM EDTA (data not shown). Additionally, preliminary NMR studies show that Sp1-Zn92, in the presence of a stoichiometric amount of Zn2+, assumes a stable, folded conformation and, further, that this conformation is abolished in the absence of Zn2+ (data not shown). Mobility-shift experiments performed in the absence of competitor DNA [poly(dA/dT)] indicate that Kd ≈ 10 nM. This value compares well with results reported by Harrington et al. (21) for native Sp1 (Kd 1-3 nM), demonstrating that Sp1-Zn92 contains all those residues essential for tight binding.

[Figure 1: gel image. Panel a, competition with Mt; panel b, competition with CAP. Lanes are ordered by competitor DNA concentration; bands are labeled wells, Sp1-Zn121-bound DNA (number of molecules bound, 2-6), and free DNA.]

FIG. 1. Results of mobility-shift assay using 32P-labeled six-site oligonucleotide in the presence of (a) unlabeled Mt competitor DNA and (b) unlabeled CAP DNA.

Variation in DNA affinity among members of the tight binding class was surveyed in a series of DNA binding experiments using three duplex oligonucleotides containing different GC-box binding sites (Mt, HIV, and SV40) nested within identical flanking sequences (Table 2). The relative Ka values for Sp1-Zn92 binding to each of the three sites were obtained from mobility-shift competition studies using labeled and unlabeled oligonucleotides (Fig. 2). These experiments were performed for all nine combinations of pairs of labeled and unlabeled 20-mers; the relative values for Ka (Table 3) show that the affinity of Sp1-Zn92 for the Mt and HIV sites is comparable and that the affinity for the SV40 site is about 3- to 5-fold lower. Additionally, the ability of Sp1-Zn92 to discriminate between the Mt IIa site and two single-base mutants of this site was determined in competition mobility-shift assays using labeled Mt and unlabeled Mt, Mt-T3, and Mt-T4 (Table 2). The Mt-T3 and Mt-T4 variants involve G [conserved among high-affinity Sp1 binding sites (7, 22)] to T nucleotide changes at positions 3 and 4 of the GGG GCG GGGC Mt IIa sequence. As expected, Ka is significantly reduced for the mutant sites: 10-fold and 5-fold for Mt-T3 and Mt-T4, respectively (Table 3).

[Figure 2: plot of the fraction of labeled Mt bound versus [unlabeled competitor DNA], 10^-9 M to 10^-5 M.]

FIG. 2. Competition mobility-shift assay using 32P-labeled Mt in competition with unlabeled Mt, HIV, and SV40 DNA. The experimental data are given as discrete points (Mt, HIV, SV40, and CAP, each with its own symbol) and the theoretical data are given as continuous curves [Mt (solid curve), HIV (dashed curve), and SV40 (dotted curve)].

Table 3. Relative Ka values determined using mobility-shift assay in the presence of oligonucleotide competition

                 Unlabeled competitor site, relative Ka value
  Labeled site   Mt     HIV    SV40   Mt-T3   Mt-T4
  Mt             1.0    0.8    0.2    0.1     0.2
  HIV            1.0    1.0    0.2
  SV40           3.4    2.4    1.0

Oligonucleotide Length Required for Maximal Binding. We have used the primer-extension/mobility-shift assay (19, 20) to determine the limits of the thermodynamic DNA binding site for Sp1-Zn92 using 44-bp oligonucleotides containing a single high-affinity GC box derived from the mouse dhfr promoter (Table 2). This experiment yields the dependence of relative Ka values on DNA length as an oligonucleotide is lengthened in one-base increments through the binding site to the terminus of the template strand. A plot of relative Ka values as a function of oligonucleotide length for extension through the GC box in the two directions (Fig. 3) reveals that, at the 5' end of the GC box, five or six residues beyond the generalized "consensus" binding site are required to achieve full binding affinity. For the 3' end of the binding site a different result is observed; only the GGG GCG portion of the binding site is required for moderate affinity, and the affinity increases as bases are added through the GGGC portion of the site. This result does not depend on the fraction of DNA bound by Sp1-Zn92 as controlled by variation of the peptide concentration within the range ≈200 nM-2.00 μM (data not shown).

[Figure 3: plots of relative Ka versus position within the 44-mer sequence. Panel a covers the full template; panel b expands the GC box region.]

FIG. 3. Primer-extension/mobility-shift assay giving the dependence of relative Ka values on oligonucleotide length. Two separate experiments are represented: extension from annealed oligonucleotides DHFR-44a/DHFR-12ap in the 3' → 5' direction (with respect to the G-rich strand of the GC box) (●) and extension from annealed oligonucleotides DHFR-44b/DHFR-12bp in the 5' → 3' direction (○). (a) Extension over the entire length of the template 44-mers. (b) Expansion of the region in a contained within the dashed box. The GC box sequence in b positions the binding site along the x axis.

DISCUSSION

The Sp1 Peptides Mimic Wild-Type Sp1 DNA Binding Properties. We have overexpressed and purified two peptide fragments of Sp1 that contain the three zinc-finger domains necessary for specific Sp1 DNA binding in quantities sufficient for structural studies. Mobility-shift experiments show that both Sp1 peptides bind with specificity and in a zinc-dependent manner to DNA sequences containing GC boxes. In the case of Sp1-Zn121, the mobility-shift assay with the six-site fragment reveals from one to six shifted bands corresponding to the binding of one to six peptide molecules per DNA fragment. This band pattern is altered or eliminated by competitor DNA containing a single GC box but not by nonspecific DNA. Under the experimental binding conditions, there is no evidence for cooperative binding of Sp1-Zn121 to adjacent GC box sequences. The lack of cooperativity was also reported for the binding of intact Sp1 with the 21-bp repeat elements of the SV40 promoter that contains three tandem copies of the GC box (6). Using single-site oligonucleotides, binding to Sp1-Zn92 is reduced or eliminated by GC-box-containing competitor oligonucleotides but not by nonspecific competitor DNA. These results clearly show that for both Sp1 peptides only oligonucleotides containing an Sp1 binding site are effective competitors and directly demonstrate specificity for the GGG GCG GGGC binding site.

To further demonstrate the fidelity of DNA sequence recognition for our peptides, we have examined the binding of Sp1-Zn92 to a series of wild-type "high-affinity" GC boxes from the human Mt IIa, HIV long terminal repeat, and SV40 promoters, as well as binding to two mutants of the Mt IIa site. The affinity of Sp1-Zn92 for the three high-affinity sites can be ranked, according to our results, in the order Mt ≈ HIV > SV40 (Table 3). The series of sites studied here exemplifies the characteristic DNA sequence diversity among Sp1 binding sites; examination of "high"- to "low-affinity" Sp1 binding sites (Table 4) shows that a number of single and multiple substitutions are tolerated within the 10-bp site, yielding the consensus sequence G(T)GG GCG GG(A)G(A)C(T) (7, 22). The HIV and SV40 sequences represent those sites with the greatest number of substitutions with respect to the Mt IIa G-rich site. The fact that these substitutions are tolerated in our experiments strongly suggests that the mode of DNA recognition for the Sp1-Zn92 fragment mimics that for wild-type Sp1.

Table 4. Sp1 DNA binding sites

  High affinity sites:
    GGG GCG GGGC    HSV IE-3 (V), DHFR (I,III), MT-IIA, CH-TK INTRON
    TGG GCG GGGC    HSV IE-3 (III,IV)
    TGG GCG GAGT    SV40 (III,V)
    GGG GCG GAGC    DHFR (II,IV)
    GGG GAG TGGC    HIV-LTR (I)
    TGG GCG GGAC    HIV-LTR (II)
    GAG GCG TGGC    HIV-LTR (III)
  Medium affinity sites:
    GGG GCG GGG0    HSV IE-3 (I)
    GGG GCG GGG0    HSV IE-3 (II)
    TGG GCG GGGT    HSV TK (II)
    TGG GCG GAAC    SV40 (II)
    GGG GCG GGAT    SV40 (IV)
    GGG GCG GGAC    SV40 (VI)
  Low affinity sites:
    GGG GCG GAGA    SV40 (I)
    GGG GCG GQGQ    HSV-TK (I)

See refs. 7 and 8.

The third and fourth positions within the GC box are conserved as G residues among high-affinity Sp1 binding sites (Table 4; ref. 22), and we expected that mutation of these positions to T residues within the Mt IIa sequence would diminish Sp1-Zn92 binding affinity. The competition experiments with these mutants indicate that Sp1-Zn92 has lower affinity for the mutant sites and that the decreases in relative Ka values (Table 3) correspond to decreases in ΔG of binding of 1.5 and 1.1 kcal/mol (1 kcal = 4.18 kJ) (for Mt-T3 and Mt-T4, respectively). In the context of the intermolecular interactions that exist between the peptide and DNA, these differences could correspond to the loss of one or two hydrogen bonds between Sp1-Zn92 and the highly conserved G residues that are found in the wild-type sequence. By showing that binding affinity is reduced for Sp1-Zn92 with DNA sequences that are not Sp1 binding sites, these results further demonstrate that the mode of DNA binding for Sp1-Zn92 and Sp1 is very similar.

Delineation of the GC Box Binding Site Within the dhfr Promoter Sequence. The primer-extension/mobility-shift experiments provide direct evidence that Sp1-Zn92 recognizes the dhfr promoter, site I GC box and suggest that the 5' portion of the binding site contributes more significantly to the binding energy of this protein/DNA interaction than the 3' portion. The results show that for extension in the 3' → 5' direction (Fig. 3, closed symbols) the complete binding site plus three bases is required before appreciable binding to Sp1-Zn92 (relative Ka > 0.2) is observed, whereas for extension in the 5' → 3' direction (Fig. 3, open symbols), appreciable binding is observed after the addition of the GGG GCG G portion of the binding site. Alternatively, Sp1-Zn92 may bind to one of two other potential binding sites generated by shifting the 10-base GC box "frame" either one or two bases in the 5' direction (with respect to the G-rich strand), giving rise to the sequences GGG GGC GGGG and TGG GGG CGGG, respectively. These alternate sites, however, diverge at several key positions from the consensus sequence G(T)GG GCG GG(A)G(A)C(T) (mismatches indicated above) and do not provide properly spaced recognition elements for Sp1 binding and, therefore, are ruled out on this basis.

The requirement at the 5' end of the GC box for several nucleotides beyond the end of the consensus sequence for maximal binding affinity may be due to (i) the existence of non-sequence-specific peptide/DNA interactions or (ii) the effect of changing local DNA structure on peptide binding affinity as the DNA is lengthened in the 5' direction. The designation of possible peptide/DNA contacts outside the consensus DNA binding site as non-sequence-specific in nature is supported by a statistical analysis of all known Sp1 binding sites (22), which fails to show significant sequence preference outside the decanucleotide consensus site. Our current experiments, however, do not allow us to distinguish between or confirm these hypotheses.

Comparison of Sp1 and Zif-268 Binding to DNA. The high-resolution crystal structure of the Zif-268/DNA complex (10) provides exquisite insight into the recognition of DNA by the Cys2His2 class of zinc-finger proteins. The DNA binding sites for Zif-268 (GcG GGG GcG = A B A) (23) and for Sp1 (GGG GcG GGGc = B A B) are closely related and can be thought of as rearrangements of one another (A B A versus B A B). An analysis of the Sp1 sequence, considering those residues involved in DNA binding in the Zif-268/DNA structure, reveals striking similarities between Zif and Sp1 that parallel those within the DNA binding sites (Fig. 4). The first and third fingers of Zif contact the third and first GcG triplets, respectively (5' → 3' direction), within the Zif DNA binding site, and a pair of Arg residues emanating from the α-helix of each finger is responsible for sequence-specific recognition of guanine (Arg-18 and Arg-24 of finger 1; Arg-74 and Arg-80 of finger 3). The second finger of Zif contacts the central GGG triplet within the binding site, with these contacts mediated by Arg-46 and His-49 within the α-helix of this finger. The similarities between the Zif and Sp1 amino acid sequences (Fig. 4) suggest the following: (i) finger 3 of Sp1 may contact the first GGG triplet of a GC box through interactions mediated by Arg-77 and His-80 (Sp1-Zn92 numbering system; in analogy with finger 2 of Zif), (ii) finger 2 of Sp1 may contact the central GcG triplet through Arg-49 and Arg-55 (in analogy with fingers 1 and 3 of Zif), and (iii) finger 1 of Sp1 departs from the Zif model with Lys-19 in the place of Arg-46 (Zif, finger 2) while His-22 mimics His-49 (Zif, finger 2). The amino acids for the three zinc domains of Sp1 can be symbolized as follows: (i) Sp1 finger 3, RH- for GGG recognition, (ii) Sp1 finger 2, R-R for GcG recognition, and (iii) Sp1 finger 1, KH- for GGGc recognition.

[Figure 4: panel a aligns the recognition residues of Zif fingers 3, 2, and 1 over the site 5'-GcG GGG GcG-3' and those of Sp1 fingers 3, 2, and 1 over the site 5'-GGG GcG GGGc-3'; panel b groups the individual fingers by target DNA sequence.]

FIG. 4. Similarities between the Sp1 and Zif peptide sequences and binding site DNA sequences. (a) Sequence similarities in the context of the amino acid sequences (C-terminal to N-terminal direction). (b) Individual fingers grouped according to target DNA sequences. In a, amino acid residues that are bold and underlined are either involved in DNA recognition in the Zif/DNA structure or proposed to be involved in DNA recognition by Sp1. The His2 residues of the Cys2His2 motif are given in large, bold type. The data for Zif are based on the three-dimensional structure (10) and the data for Sp1 are based on the arguments developed in the text.

A Model for Sp1/DNA Binding. The comparison between Zif and Sp1 binding can be further elaborated considering known Sp1 binding sites (Table 4). The Zif crystal structure indicates that the fingers with R-R recognition elements (fingers 1 and 3) contact only the G residues of the GcG triplet while the finger with the RH- element (finger 2) contacts only the second and third G residues of the GGG triplet. If, in analogy to Zif, Sp1 recognizes DNA through RH- (finger 3), R-R (finger 2), and KH- (finger 1) elements that interact with the GGG (triplet 1), GcG (triplet 2), and GGGc (triplet 3) subunits of the DNA binding site, respectively, then the specific bases contacted in the Zif binding site should also be contacted in the Sp1 binding site and therefore should be highly conserved among Sp1 binding sites. This requirement holds true with (i) positions 2 and 3 of triplet 1 conserved as G in all but one high-affinity Sp1 binding site (Table 4) and (ii) positions 1 and 3 within triplet 2 conserved as G in all Sp1 binding sites. The third subunit within the Sp1 binding site tolerates a variety of substitutions; this fact suggests that a novel recognition process exists here with respect to the Zif system and is consistent with the divergence of the third Sp1 recognition element (finger 1, KH-) from the Zif model. These suggestions are supported by the experiments with the Mt mutants since the altered sites (GGG to GGT for Mt-T3 and GcG to TcG for Mt-T4) correspond to key positions in the above model and, correspondingly, DNA binding affinity is decreased with these mutants.

Our model provides an explanation for the asymmetry within the plot of Ka versus DNA length (Fig. 3). The Zif/DNA structure (10) reveals two types of sequence-specific contacts between the zinc-fingers and DNA: (i) bidentate hydrogen bonds between the guanidinium group of Arg residues and guanine bases and (ii) a single hydrogen bond between the imidazole ring of a His residue and guanine. We envision similar contacts between Sp1 fingers 3 and 2 and DNA due to the peptide sequence and DNA binding site similarities pointed out above (Fig. 4). Sp1 finger 1 differs from the Zif system and may contact its DNA subsite through hydrogen bonds arising from Lys-19 and His-22. The nature of the side-chain/DNA interactions and the greater conformational flexibility of KH- versus RH- or R-R contacts with DNA may partly account for the sequence diversity that is tolerated within the 3' portion of the Sp1 binding site. This analysis parallels the results of Fig. 3, with the 5'-GGG GCG portion of the site vital for binding (mediated by RH- and R-R contacts) and the 3'-GGGC portion (mediated by KH- contacts) not vital. This analysis is also consistent with the finding that a peptide containing only fingers 2 and 3 of Sp1 binds strongly to the Mt GC box (unpublished results) through putative contacts with the 5'-GGG GCG motif.

Although this analysis vastly simplifies the nature of the interaction between Sp1-Zn92 (and Sp1) and GC box DNA sites and ignores the multitude of other contacts that must exist between the two, it does account for the striking asymmetry of binding affinity across the binding site. In addition, this model offers a structural basis for the observed ability of Sp1, in contrast to Zif, to accommodate considerable sequence variation within its cognate DNA binding site. Additional molecular biological and structural studies designed to challenge this hypothesis are necessary.

We thank Dr. Donald M. Crothers for critical comments on the manuscript, Drs. Jason D. Kahn and Kevin M. Weeks for advice on DNA binding experiments, Mr. Paul Raccuia for technical assistance during protein purification, and Dr. Mark Biggin for providing the GC-box-containing plasmid. This work was supported in part by a Camille and Henry Dreyfus Grant for Distinguished New Faculty in Chemistry (to J.P.C.), Predoctoral Biophysical Fellowship 5 T32 GM08293-04 (to R.W.K.) from the National Institutes of Health, and the New Graduate Assistance Program in Chemistry, Grant P200A00228-91 (to R.W.K.).

Abbreviations: SV40, simian virus 40; HIV, human immunodeficiency virus; Mt, metallothionein.
§Present address: Department of Chemistry and Biochemistry, University of Colorado, Boulder, CO 80309.
¶To whom reprint requests should be addressed.
The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

1. Miller, J., McLachlan, A. D. & Klug, A. (1985) EMBO J. 4, 1609-1614.
2. Brown, R. S., Sander, C. & Argos, P. (1985) FEBS Lett. 186, 271-274.
3. Kadonaga, J. T., Carner, K. R., Masiarz, F. R. & Tjian, R. (1987) Cell 51, 1079-1090.
4. Dynan, W. S. & Tjian, R. (1983) Cell 32, 669-680.
5. Gidoni, D., Dynan, W. S. & Tjian, R. (1984) Nature (London) 312, 409-413.
6. Gidoni, D., Kadonaga, J. T., Barrera-Saldana, H., Takahashi, K., Chambon, P. & Tjian, R. (1985) Science 230, 511-517.
7. Kadonaga, J. T., Jones, K. & Tjian, R. (1986) Trends Biochem. Sci. 11, 20-23.
8. Jones, K. A., Kadonaga, J. T., Luciw, P. A. & Tjian, R. (1986) Science 232, 755-759.
9. Nardelli, J., Gibson, T. J., Vesque, C. & Charnay, P. (1991) Nature (London) 349, 175-178.
10. Pavletich, N. P. & Pabo, C. O. (1991) Science 252, 809-817.
11. Yanisch-Perron, C. J., Vieira, J. & Messing, J. (1985) Gene 33, 103-115.
12. Maniatis, T., Fritsch, E. F. & Sambrook, J. (1982) Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Lab., Cold Spring Harbor, NY).
13. Studier, F. W. & Moffatt, B. A. (1986) J. Mol. Biol. 189, 113-130.
14. Studier, F. W., Rosenberg, A. H., Dunn, J. J. & Dubendorff, J. W. (1990) Methods Enzymol. 185, 60-89.
15. Fried, M. G. & Crothers, D. M. (1981) Nucleic Acids Res. 9, 6505-6525.
16. Revzin, A., Ceglarek, J. A. & Garner, M. M. (1986) Anal. Biochem. 153, 172-177.
17. Courey, A. J. & Tjian, R. (1988) Cell 55, 887-898.
18. Lin, S. & Riggs, A. D. (1972) J. Mol. Biol. 72, 671-690.
19. Liu-Johnson, H. N., Gartenberg, M. R. & Crothers, D. M. (1986) Cell 47, 995-1005.
20. Gartenberg, M. R., Ampe, C., Steitz, T. A. & Crothers, D. M. (1990) Proc. Natl. Acad. Sci. USA 87, 6034-6038.
21. Harrington, M. A., Jones, P. A., Imagawa, M. & Karin, M. (1988) Proc. Natl. Acad. Sci. USA 85, 2066-2070.
22. Bucher, P. (1990) J. Mol. Biol. 212, 563-578.
23. Christy, B. A., Lau, L. F. & Nathans, D. (1988) Proc. Natl. Acad. Sci. USA 85, 7857-7861.
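
The band-intensity ratio and the binding isotherm used in Materials and Methods for titrations without competitor can be sketched numerically as follows. This is a minimal illustration, assuming a simple 1:1 peptide/DNA equilibrium solved as a quadratic for the complex concentration; the function names and the titration values (Ka = 1e8 per M, 5 nM labeled DNA) are hypothetical and are not the authors' fitting procedure, which used the KaleidaGraph program.

```python
import math


def fraction_bound_from_bands(i_bound, i_free):
    """theta_b = Ib / (Ib + If), from the bound and free band intensities."""
    return i_bound / (i_bound + i_free)


def theta_direct(ka, dna_total, peptide_total):
    """Fraction of labeled DNA bound for a 1:1 peptide/DNA complex.

    ka is the association constant in 1/M; both totals are in M.
    The complex concentration is the smaller root of the mass-action quadratic.
    """
    a = ka * (dna_total + peptide_total) + 1.0
    b = 4.0 * ka ** 2 * dna_total * peptide_total
    return (a - math.sqrt(a * a - b)) / (2.0 * ka * dna_total)


# Hypothetical titration: Ka = 1e8 1/M (Kd = 10 nM), 5 nM labeled 20-mer.
for peptide_nm in (1, 10, 100, 1000):
    theta = theta_direct(1e8, 5e-9, peptide_nm * 1e-9)
    print(f"{peptide_nm:5d} nM peptide -> theta_b = {theta:.3f}")
```

Fitting measured theta_b values against `theta_direct` with Ka as the only free parameter would be the analogue of the curve fit described in the text.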
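
The competition analysis in Materials and Methods can be illustrated in the same way. The sketch below assumes the unlabeled competitor is in sufficient excess that its free concentration is close to its total concentration, which is the condition under which the quadratic form with r = Ka'/Ka follows from mass action; that assumption, the function name, and all numerical values are illustrative and are not taken from the paper.

```python
import math


def theta_competition(ka, r, labeled_total, competitor_total, peptide_total):
    """Fraction of labeled DNA bound in the presence of unlabeled competitor.

    ka: association constant for the labeled site (1/M).
    r:  Ka'/Ka, the competitor affinity relative to the labeled site.
    All concentrations are totals in M.
    """
    a = 1.0 / ka + r * competitor_total + peptide_total + labeled_total
    b = 4.0 * labeled_total * peptide_total
    return (a - math.sqrt(a * a - b)) / (2.0 * labeled_total)


# Hypothetical competition curves for competitors of decreasing relative affinity.
for r in (1.0, 0.2, 0.1):
    curve = [
        theta_competition(1e8, r, 5e-9, competitor, 1e-8)
        for competitor in (0.0, 1e-8, 1e-7, 1e-6, 1e-5)
    ]
    print(f"r = {r}: " + ", ".join(f"{value:.3f}" for value in curve))
```

A weaker competitor (smaller r) shifts the curve toward higher competitor concentration, which is how relative Ka values of the kind listed in Table 3 can be read from such data.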
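
The frame-shift argument in the Discussion (the dhfr GC box versus the two alternate 10-base frames) can be checked mechanically against the consensus G(T)GG GCG GG(A)G(A)C(T). The sketch below is illustrative only: the three decamers are the ones quoted in the text, while the list encoding of the consensus and the function name are assumptions made for this example.

```python
# Allowed bases at each of the 10 positions of G(T)GG GCG GG(A)G(A)C(T).
CONSENSUS = ["GT", "G", "G", "G", "C", "G", "G", "GA", "GA", "CT"]


def mismatch_positions(site):
    """Return the 1-based positions at which a decamer departs from the consensus."""
    if len(site) != len(CONSENSUS):
        raise ValueError("site must be 10 bases long")
    return [
        position
        for position, (base, allowed) in enumerate(zip(site, CONSENSUS), start=1)
        if base not in allowed
    ]


frames = {
    "dhfr site I GC box": "GGGGCGGGGC",
    "shifted one base 5'": "GGGGGCGGGG",
    "shifted two bases 5'": "TGGGGGCGGG",
}
for label, site in frames.items():
    print(f"{label:22s} {site}  mismatches at {mismatch_positions(site)}")
```

Both shifted frames miss the consensus at three positions, including the central GCG triplet, whereas the assigned frame matches at every position.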