
Proc. Natl. Acad. Sci. USA Vol. 93, pp. 8863-8867, August 1996 Biochemistry

The [(G/C)3NN]n motif: A common DNA repeat that excludes nucleosomes
(electron microscopy/histones/nucleotide triplets)
YUH-HWA WANG AND JACK D. GRIFFITH*
Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599-7295
Communicated by Clyde A. Hutchinson III, University of North Carolina, Chapel Hill, NC, May 14, 1996 (received for review January 16, 1996)

ABSTRACT Nucleosomes, the basic structural elements of chromosomes, consist of 146 bp of DNA coiled around an octamer of histone proteins, and their presence can strongly influence gene expression. Considerations of the anisotropic flexibility of nucleotide triplets containing 3 cytosines or guanines suggested that a [5'(G/C)3NN3'] motif might resist wrapping around a histone octamer. To test this, DNAs were constructed containing a 5'-CCGNN-3' pentanucleotide repeat with the N positions varied. In vitro nucleosome reconstitution and electron microscopy showed that a plasmid with 48 contiguous CCGNN repeats strongly excluded nucleosomes in the repeat region. Competitive reconstitution gel retardation experiments using DNA fragments containing 12, 24, or 48 CCGNN repeats showed that the propensity to exclude nucleosomes increased with the length of the repeat. Analysis showed that a 268-bp DNA containing a (CCGNN)48 block is 4.9 ± 0.6-fold less efficient in nucleosome assembly than a similar length pUC19 fragment and ~78-fold less efficient than a similar length (CTG)n sequence, based on results from previous studies. Computer searches against the GenBank database for matches with a [(G/C)3NN]48 sequence revealed numerous examples that frequently were present in the control regions of "TATA-less" genes, including the human ETS-2 and human dihydrofolate reductase genes. In both cases the (G/C)3NN repeat, present in the promoter region, co-maps with loci previously shown to be nuclease hypersensitive sites.

Abbreviations: EM, electron microscopy; DHFR, dihydrofolate reductase.
*To whom reprint requests should be addressed.
†The simple repeat sequences described here are frequently denoted by the first three letters, for example CCG or CTG. This abbreviation indicates a duplex repeat of the form CCGCGG in which both antiparallel strands are oriented 5' to 3'.
The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

The eukaryotic chromosome is formed by a progressive condensation of the DNA, beginning with its assembly into a string of beads or nucleosomes (1). Each nucleosome consists of 146 bp of DNA wrapped about a histone protein octamer, and their presence near genetic control elements can strongly influence gene expression (reviewed in refs. 2 and 3). Placement of a nucleosome directly over a TATA box normally maintains genetic repression, whereas assembly nearby may aid in the entry of activator proteins and thus facilitate transcription (4-7). In general, the absence of nucleosomes in promoter regions is correlated with transcriptional potency.

Numerous factors can lead to the specific assembly or exclusion of nucleosomes along DNA. These factors include the primary sequence of the DNA itself, and the energetics of different DNA sequences for nucleosome formation have been examined. Repeated tracts of 4-6 adenines in phase with the helix produce sequence-directed bends, and bent DNA preferentially assembles into nucleosomes (8, 9). The 5S RNA gene from several species contains a strong nucleosome positioning element (10-13) in the form of repeats of 5'(G/C)3NN(A/T)3NN3'. Shrader and Crothers (14, 15) proposed that in this repeating motif, adenines or thymines are preferred at sites of minor-groove compression and guanines or cytosines are favored at sites of major-groove compression. Thus, in their model, by spacing these anisotropically flexible wedges and alternating them, the wrapping of DNA around the histone core would be energetically favored (14, 15). The strongest natural nucleosome positioning element yet identified was recently discovered in this laboratory. Myotonic dystrophy is one of several human genetic diseases characterized by expansions of repeating nucleotide triplets, in this case CTG triplet repeats, located in the 3' untranslated region of the myotonic dystrophy protein kinase gene (reviewed in ref. 16). We showed that DNA containing 130 CTG repeats generates a nucleosome positioning signal 9 times stronger than 1 copy of the 5S RNA gene, and it was suggested that generation of arrays of hyperstable nucleosomes over long CTG repeats can alter the local chromatin structure and hence the expression of the myotonic dystrophy protein kinase gene (17-19). The reason why DNA containing long CTG repeats would assemble into unusually stable nucleosomes is, as yet, unclear.

Sequences also exist that inhibit nucleosome formation, and in vivo such sequences, were they to occur in promoter or enhancer regions, could play an extremely important role by maintaining access to the DNA. DNA-RNA hybrids (20), left-handed Z-DNA (21), and DNA containing poly(dA.dT) runs have all been shown to exclude nucleosome formation in vitro (22-25). Recently, Iyer and Struhl (25) found that poly(dA.dT) tracts in the yeast his3 promoter stimulate Gcn4-activated transcription in vivo due to nucleosome exclusion and increased accessibility of Gcn4-binding sites. Consideration of the sequence motif in the 5S RNA gene (14, 15) led us to suggest that DNA containing long repeats in the form of (G/C)3NN(G/C)3NN should exclude nucleosomes for the following reason. This motif differs from the 5S RNA gene element (G/C)3NN(A/T)3NN in that each time minor groove compression is required, the DNA presents the histone octamer with a wedge that favors bending into the major groove. Thus, we would predict that DNA containing a nucleosome-sized tract of, for example, repeating CCGNN pentanucleotides, which are a member of the (G/C)3NN(G/C)3NN motif family, would energetically resist nucleosome formation. The reason for particular interest in the CCG triplet is that five different fragile sites in the human genome have been found to be associated with expansions of CCG triplets (26), and fragile sites have been suggested to result from an altered, less well organized chromatin structure (27).

To examine this hypothesis, we constructed three pGEM3zf(+)-based recombinant plasmids containing 12, 24, and 48 contiguous CCGNN repeats with the N positions rich in adenines or thymines. Using in vitro reconstitution and electron microscopy (EM) and a competitive reconstitution gel retardation assay, we demonstrate here that the longer
(CCGNN)n repeat blocks strongly exclude nucleosome formation. Computer searches against the GenBank database revealed that the (G/C)3NN motif commonly occurs in the human genome, with two examples co-mapping to sites known to be nucleosome free.

MATERIALS AND METHODS

DNA and Proteins. Two complementary oligonucleotides (5'-CCGTACCGATCCGAACCGGACCGCTCCGAGCCGTCCCGTGCCGCACCGGCCCGTTCCGAT-3' and 5'-ACGGATCGGAACGGGCCGGTGCGGCACGGGACGGCTCGGAGCGGCTCGGTTCGGATCGGT-3') were synthesized, annealed to generate a duplex with four nucleotide cohesive ends, and ligated head-to-tail. Monomers, dimers, and tetramers of (CCGNN)12 were purified by gel electrophoresis and cloned into the pGEM3zf(+) vector. The sequences of the inserts in the recombinant plasmids p(CCGNN)n were verified by direct DNA sequencing. HeLa cell histone octamers were isolated by centrifugation of purified HeLa cell chromatin through sucrose gradients containing 0.6 M NaCl, followed by release of the core histones with 2 M NaCl as described (9).

In Vitro Nucleosome Reconstitution and EM. Closed circular p(CCGNN)48 or pGEM3zf(+) plasmid DNA was incubated in 5 mM MgCl2 at 55°C for 30 min prior to being mixed with purified HeLa cell histone octamers in a buffer containing 2 M NaCl. The salt was slowly lowered in increments of 0.1 M to a final concentration of 0.6 M by adding a solution of 20 mM Hepes, 1 mM EDTA (pH 7.5) (5 min for each step at room temperature) to form stable nucleosomes. The nucleosome-assembled DNA was fixed with 0.6% glutaraldehyde (vol/vol) for 10 min at room temperature, chromatographed over 1 ml of Sephadex G-50, then treated with EcoO109I and AatII restriction endonucleases. Only the EcoO109I end of the molecules was filled in with biotinylated dCTP using the large fragment of DNA polymerase I. The DNA was incubated with streptavidin (150 μg/ml) and then chromatographed over 2-ml columns of BIO-GEL A-5m (Bio-Rad) to remove the excess streptavidin. The fractions containing DNA-protein complexes were mixed with a buffer containing 2 mM spermidine, applied to glow-charged carbon-coated grids, and washed with a sequential water/ethanol series. The samples were then air-dried and rotary shadowcast with tungsten (28). Samples were examined in a Philips CM12 electron microscope and micrographs were taken on 35 mm film. Measurement of the position of nucleosomes along DNA was accomplished by projecting images of molecules on the film onto a Summagraphics digitizing tablet coupled to a Macintosh computer programmed with software developed in this laboratory.

Competitive Nucleosome Reconstitution and Gel Retardation Assays. The 261-bp fragment containing the (CCGNN)12 repeat was obtained by PCR amplification of a segment of p(CCGNN)12 plasmid using primers from nucleotide 5 to nucleotide 23 and from nucleotide 182 to nucleotide 205 [based on the nucleotide numbering in pGEM3zf(+)]; the 255-bp fragment containing the (CCGNN)24 repeat was obtained by PCR amplification of a segment of p(CCGNN)24 plasmid using primers from nucleotide 5 to nucleotide 23 and from nucleotide 116 to nucleotide 139 [based on the nucleotide numbering in pGEM3zf(+)]; the 268-bp fragment containing the (CCGNN)48 repeat was obtained by EcoRI and XbaI digestion of p(CCGNN)48; and the 262-bp pUC19 fragment was obtained by PCR amplification of pUC19 plasmid using primers from nucleotide 239 to nucleotide 263 and from nucleotide 477 to nucleotide 500 (based on the nucleotide numbering in pUC19). DNA fragments were purified by 5% polyacrylamide gel electrophoresis and labeled with T4 DNA kinase (New England Biolabs) in the presence of [γ-32P]ATP (Amersham). Fifty nanograms of labeled DNA was mixed with 10 μg of unlabeled calf thymus DNA and 2.5 μg of histone octamers in a solution containing 2 M NaCl, 100 μg/ml BSA, and 0.1% Nonidet P-40 (Sigma). Nucleosome assembly and electrophoresis were carried out as described in ref. 18. Briefly, nucleosome assembly was accomplished by slowly lowering the salt in increments of 0.1 M to a final concentration of 0.1 M using a solution of 20 mM Hepes, 1 mM EDTA (pH 7.5) (5 min for each step at room temperature). The assembly mixtures were then directly electrophoresed on 5% polyacrylamide gels at 150 V for 4 hr at room temperature to separate free DNA from the nucleosome-assembled DNA. The amounts of DNA in each band were determined by PhosphorImager (Molecular Dynamics) scanning.

Sequence Homology Search. A GenBank database search was done by using MACVECTOR (release 4.1.4) to access the ENTREZ:SEQUENCES database (release 6.0), which was provided by the National Center for Biotechnology Information (Bethesda, MD).

RESULTS

The (CCGNN)n Repeat Generates a Strong Nucleosome Exclusion Signal. To examine the hypothesis that long tracts of (G/C)3NN repeats will exclude nucleosomes, three pGEM3zf(+)-based recombinant plasmids, p(CCGNN)12, p(CCGNN)24, and p(CCGNN)48, were constructed containing, respectively, 12, 24, and 48 tandem copies of the CCGNN repeat with the N positions varied and rich in adenines and thymines. Two independent methods of analysis were employed. In the first, EM was used together with in vitro nucleosome reconstitution onto supertwisted p(CCGNN)48 plasmid DNA. Histone core octamers (from HeLa cells) were mixed with the plasmid DNA in 2 M NaCl, then step-wise diluted to lower the salt and assemble nucleosomes. The DNA was cleaved after reconstitution and one end was labeled with a biotin-streptavidin tag to mark it for EM mapping. This placed the CCGNN block in p(CCGNN)48 between 27 and 34% from the tagged end. The histone concentration was adjusted to yield 1-2 nucleosomes per DNA. At the electron microscope, each DNA sequentially encountered with a single nucleosome was photographed and the position of the nucleosome relative to the tagged end measured. Molecules with more than one nucleosome were excluded to avoid errors resulting from the progressive compaction of DNAs with multiple nucleosomes. By measuring the position of the single nucleosome in 100 different molecules, a map was generated showing the frequency of nucleosome assembly over the DNA (Fig. 1). Strong nucleosome exclusion was seen over the (CCGNN)48 insert, with only 2% of all nucleosomes mapped located between 25 and 35 map units, whereas in other segments of this length 10% or more of the DNA contained a nucleosome. In contrast, the parent pGEM3zf(+) vector showed no regions of strong nucleosome exclusion.

Competitive nucleosome reconstitution assays provide a quantitative measure of the strength of a DNA segment for nucleosome assembly (14, 15). Here, DNAs ~260 bp in length, containing (CCGNN)n blocks of 12, 24, or 48 repeats (Fig. 2A), were labeled with 32P and mixed with an excess of cold nonspecific DNA and a limiting amount of HeLa cell histone octamers in high salt (see Materials and Methods). Following the step-wise dilution of the salt, the samples were electrophoresed on polyacrylamide gels (Fig. 2B) and the amount of nucleosome-assembled and nucleosome-free DNA was measured using a PhosphorImager. This analysis showed that the 268-bp DNA containing the (CCGNN)48 repeat is 4.9 ± 0.6-fold less efficient in nucleosome assembly than the same sized pUC19 fragment (Fig. 2 and Table 1). Combining these results with those from previous studies using the same method to analyze the affinity of DNAs containing repeating CTG triplets for nucleosome formation (18), we estimated that the (CCGNN)48
repeat block was ~78-fold less efficient than a similar length block of repeating CTG triplets. As the length of the CCGNN repeat sequence increased from 12 to 48 repeats within an ~260-bp segment, the ability of the DNA to exclude nucleosomes increased in proportion to the length of the repeat block (Fig. 2C). Indeed, a 261-bp DNA containing 12 tandem CCGNN repeats assembled nucleosomes 94% as efficiently as a 262-bp pUC19 fragment. The free energies for the fragment containing the (CCGNN)48 repeat block and the pUC19 DNA were calculated (Table 1). The difference in free energy between the (CCGNN)48 fragment and the pUC19 DNA is 937 ± 79 cal/mol and ~2600 cal/mol relative to the (CTG)75 repeating element (18). We concluded that the (CCGNN)n sequence provides a very strong nucleosome-exclusion element.

[Figure 1: histograms (A) and (B); horizontal axis, map units.]
FIG. 1. Distribution of nucleosomes assembled on closed, circular (A) p(CCGNN)48 plasmid and (B) pGEM3zf(+) vector. Nucleosome assembly and EM analysis were performed as described (see Materials and Methods and ref. 17). At least 100 DNA molecules containing single nucleosomes were photographed, the position of each nucleosome from the streptavidin-labeled end (filled circle) was measured, and a histogram (with DNA length broken into 20 slices and each percentage of length equivalent to 1 map unit) showing the location of the nucleosomes along the DNA was generated. The position of the (CCGNN)48 sequence in p(CCGNN)48 is indicated.

[Figure 2: (A) diagram of fragments 1-4 (262, 261, 255, and 268 bp) and the duplex sequence of the (CCGNN)12 unit; (B) gel lanes 1-4 with "complex" and "free DNA" bands; (C) plot against (CCGNN)n repeat number.]
FIG. 2. Competitive nucleosome reconstitution with DNAs containing various CCGNN repeats. (A) Diagram showing the location of the CCGNN repeat sequences within the DNA fragments and the nucleotide sequence of the (CCGNN)12 motif. (B) Autoradiogram of a competitive nucleosome reconstitution experiment. Lane 1, the 262-bp pUC19 fragment; lanes 2-4, DNA fragments containing 12, 24, and 48 CCGNN repeats, respectively. DNA preparation, reconstitution of the fragments with histones, and gel electrophoresis were as described (see Materials and Methods and ref. 18). (C) Dependence of nucleosome assembly on the length of the repeat block. Each DNA was reconstituted in three separate but identical experiments, and the fraction of DNA in the nucleosome-assembled and nucleosome-free DNA bands was measured by a PhosphorImager.

Table 1. Competitive nucleosome reconstitution of the (CCGNN)n-containing DNAs

DNA          Complex DNA/free DNA*    Free energy, cal/mol†
pUC19        100                      0
(CCGNN)12    94.3 ± 3.5               34 ± 22
(CCGNN)24    67.0 ± 4.4               238 ± 38
(CCGNN)48    20.7 ± 2.9               937 ± 79

*The ratio of DNA bound by histones to free DNA for each sample was determined by measuring the amount of DNA in each radioactive band by a PhosphorImager. The ratios for CCGNN-containing DNA were normalized against that of the pUC19 DNA fragment. The ratio for the pUC19 DNA fragment was assigned a value of 100.
†The free energy was calculated according to the equation ΔG (CCGNN-containing fragment) = RT ln (ratio of DNA in complex to free DNA for pUC19 fragment) - RT ln (ratio of DNA in complex to free DNA for the CCGNN-containing fragment). The free energy for the pUC19 DNA was defined as zero. The values are derived from three separate experiments.

The [(G/C)3NN]n Motif: A DNA Element Common to Control Regions in Eukaryotic Genes. Based on the findings described above, a computer search was carried out using the GenBank database to examine the prevalence of the general [(G/C)3NN] motif. Searches against 240 bp of a [(G/C)3NN]48 continuously repeating sequence revealed many matches, and 75 examples showed ≥85% sequence match over 200 bp of this motif (Table 2). Many of these were present in or near the control regions for eukaryotic genes. Of the 75 examples with the greatest number of sequence matches, 31 genes were found to contain the (G/C)3NN motif in the 5' region upstream of the coding sequences. Such regions would be loci where nucleosome exclusion would be expected to provide favored access to the DNA for sequence-specific DNA-binding proteins. Furthermore, at least 20 of these 31 genes lack a TATA box in the promoter region and are classified as "TATA-less" genes. In two of these genes, the 5' regions have been mapped for nuclease hypersensitive sites (Fig. 3). The -530 nt to -300 nt upstream control region of the human dihydrofolate reductase (DHFR) gene contains the [(G/C)3NN]48 motif (with 87% sequence match), and this overlaps with a hypersensitive region mapped in vivo (29). Furthermore, in the gene for the human ETS-2 nuclear phosphoprotein, there is a region upstream of the gene between nucleotides -195 and +45 containing the [(G/C)3NN]48 motif (with 86% sequence match). The promoter region of this gene maps between nucleotides -159 and +141 and studies of chromatin structure around the promoter using S1 nuclease revealed a hypersensitive region from nucleotides -150 to -50 (30). This 100-bp hypersensitive region exactly co-maps with the region containing the [(G/C)3NN]48 motif. These results suggest that the ability of the [(G/C)3NN]48 sequence to exclude nucleosomes may play a central role in the regulation of certain genes in vivo.

[Figure 3: maps of (A) the human dihydrofolate reductase gene and (B) the human ETS-2 gene.]
FIG. 3. Location of the [(G/C)3NN]48 sequences in the human DHFR gene (A) and the human ETS-2 gene (B) related to the nuclease hypersensitive sites mapped in vivo. The [(G/C)3NN]48 sequences are indicated by black bars, the nuclease hypersensitive sites are indicated by bars with internal arrows, and each is labeled with numbers determined by using the cap site (shown by a right angle arrow) as +1. The positions of the nuclease hypersensitive sites are derived from in vivo nucleosome mapping studies of others (29, 30).

Table 2.
Examples of genes containing regions showing ≥85% sequence matches to the [(G/C)3NN]48 sequence taken from a GenBank search

Location / Gene or encoded protein                         Sequence matches*

5' end (upstream of coding sequences)
  Human glutamate dehydrogenase                            222/235
  Human arginosuccinate lyase                              213/230
  Human insulin receptor                                   214/237
  Wheat α-amylase                                          208/233
  Chicken hsp90                                            205/231
  Human DHFR                                               205/228
  Pig nuclear factor 1                                     201/225
  Rat insulin-like growth factor binding protein           202/232
  Rabbit metallothionine                                   195/229
  Mouse neu protooncogene                                  222/236
  Human hematopoietic cell-specific protein                215/235
  Human fibronectin                                        205/233
  Rat phosphorylase kinase catalytic subunit               199/230
  Human vitronectin protein                                196/229
  Mouse S16 ribosomal protein                              202/232
  Rat neu oncogene                                         214/231
  Triticum aestivum histone H2B                            204/224
  Human gastrin releasing peptide                          202/228
  Human ETS-2 oncogene                                     204/231
  Rat nucleolin                                            199/230
Coding region
  Chicken c-fos protooncogene                              208/231
  Chicken tropoelastin                                     206/224
  Human collagen-like protein                              201/217
  Herpes simplex 1 UL 18                                   204/230
  Epstein-Barr virus nuclear antigen 3C                    204/234
  Human thymidine kinase                                   199/233
  Chicken ubiquitin                                        202/231
  Human translocation fusion protein (E2A-Prl)             202/232
  Human p120                                               199/233
  Gallus gallus c-ski oncoprotein                          202/235
3' end (downstream of coding sequences)
  Human cytochrome P-450                                   207/232
  Mouse histone H2A.                                       209/235
  Human lamin B2                                           206/232
  Chicken p53 oncoprotein                                  193/215
  Human α1 collagen type 1                                 204/232

At least 75 genes in the GenBank were found to have ≥85% sequence matches to [(G/C)3NN]48.
*The first number indicates the number of bases showing sequence matches with [(G/C)3NN]48 and the second number refers to the length of this sequence.

DISCUSSION

In this study we have examined the ability of repeating tracts of CCGNN pentanucleotides in duplex DNA to exclude nucleosomes. EM revealed that a plasmid containing 48 continuous CCGNN repeats showed strong inhibition of nucleosome assembly over the repeat region. Competitive assembly and gel retardation experiments confirmed this observation, demonstrated that the degree of exclusion increases with the length of the repeat block, and showed that a tract of 48 CCGNN repeats is 78-fold less likely to assemble into nucleosomes than a similar length DNA containing CTG repeats. Computer searches against the GenBank database revealed at least 75 genes containing regions of ≥85% sequence match to a repeating (G/C)3NN element over a length of 240 bp. Among these, the (G/C)3NN-like sequences of the human DHFR and ETS-2 genes mapped to loci previously shown to be nuclease hypersensitive sites near the promoters of these genes.

In the motif (G/C)3NN(A/T)3NN, anisotropically flexible wedges that preferentially bend into the major groove [(G/C)3] and minor groove [(A/T)3] are spaced with the helical repeat (14, 15). Here we proposed that a motif, [(G/C)3NN(G/C)3NN]n, would energetically resist assembly into nucleosomes because each time compression into the minor groove was required, the DNA would prefer to bend into the major groove; this was shown experimentally for one member of this motif, (CCGNN)n. The triplet CCG is one of four possible variations of (G/C)3. Initial inspection of the (G/C)3 triplets in the long blocks of ≥85% sequence match found in the GenBank search (Table 2) did not show preferential usage of any one triplet; nonetheless, a more detailed analysis might show otherwise. In principle, a repeating motif in the form of [(A/T)3NN(A/T)3NN]n should also resist nucleosome formation, and poly(dA.dT), which is one member of this family, does so (22-25). However, the sequence (A4T4CG)n, which is also a member of the [(A/T)3NN]n motif and is highly bent (31), might act counter to nucleosome exclusion. Thus, continued experimental examination of the ability of these sequence arrangements to exclude nucleosomes is needed.

The discovery of long runs of sequence match to the [(G/C)3NN]n motif in the 5' control regions of the DHFR and ETS-2 genes (29, 30) and in segments shown in vivo to be nucleosome free provides a compelling link between the presence of the [(G/C)3NN]n motif and nucleosome exclusion in vivo. It also argues that the relatively small free energy differences measured in high salt in vitro translate to significant biological differences in the cell. The DHFR, ETS-2, and several other genes listed in Table 2 are genes regulated by promoters lacking a TATA box. The "TATA-less" family includes many housekeeping genes, several oncogenes, and genes encoding growth factors and transcription factors (32). For genes containing a TATA box, the binding of TFIID may preclude the subsequent formation of nucleosomes over that segment (33), keeping the gene receptive to factors that would initiate transcription. For TATA-less genes, some other mechanism may be required to maintain transcriptional potency. Here the presence of the (G/C)3NN motif may provide this mechanism, a circumstance similar to the function of the poly(dA.dT) sequences in the his3 promoter (25).

The choice of the CCG triplet for the basis of the repeating element CCGNN was based on knowledge that this triplet undergoes very large expansions (200 to >2000 repeats) in the region upstream of the FMR-1 gene in patients with fragile-X syndrome (reviewed in ref. 34). To date, long tracts of repeating CCG triplets have been found in five rare folate-sensitive fragile sites in the human genome (26). These fragile sites are loci that stain poorly with protein stains, are disorganized as seen by light microscopy, and show increased propensity for DNA breakage. Such properties would be consistent with failure of the DNA to assemble into stable nucleosomes. Preliminary studies with DNA containing repeating CCG triplets derived from patients with fragile-X syndrome (unpublished work) are consistent with this interpretation.

We wish to thank Ms. Lora Cavallo for excellent technical assistance in these studies and Mr. Robert Cavallo for aid in the synthesis of the initial set of repeating DNAs. This work was supported by Grants GM31819 and GM42342 from the National Institutes of Health and Grant NP-583 from the American Cancer Society.

1. Griffith, J. D. (1975) Science 187, 1202-1203.
2. Thoma, F. (1992) Biochim. Biophys. Acta 1130, 1-19.
3. Wolffe, A. P. (1994) Cell 77, 13-16.
4. Workman, J. L. & Kingston, R. E. (1992) Science 258, 1780-1784.
5. Schild, C., Claret, F.-X., Wahli, W. & Wolffe, A. P. (1993) EMBO J. 12, 423-433.
6. McPherson, C. E., Shim, E.-Y., Friedman, D. S. & Zaret, K. S. (1993) Cell 75, 387-398.
7. Lee, .-Y. & Archer, T. K. (1994) Mol. Cell. Biol. 14, 32-41.
8. Trifonov, E. N. (1985) CRC Crit. Rev. Biochem. 19, 89-106.
9. Hsieh, C.-H. & Griffith, J. D. (1988) Cell 52, 535-544.
10. Simpson, R. T. & Stafford, D. W. (1983) Proc. Natl. Acad. Sci. USA 80, 51-55.
11. Rhodes, D. (1985) EMBO J. 4, 3473-3482.
12. Ramsay, N. (1986) J. Mol. Biol. 189, 179-188.
13. Gottesfeld, J. M. (1987) Mol. Cell. Biol. 7, 1612-1622.
14. Shrader, T. E. & Crothers, D. M. (1989) Proc. Natl. Acad. Sci. USA 86, 7418-7422.
15. Shrader, T. E. & Crothers, D. M. (1990) J. Mol. Biol. 216, 69-84.
16. Timchenko, L., Monckton, D. G. & Caskey, C. T. (1995) Semin. Cell Biol. 6, 13-19.
17. Wang, Y.-H., Amirhaeri, S., Kang, S., Wells, R. D. & Griffith, J. D. (1994) Science 265, 669-671.
18. Wang, Y.-H. & Griffith, J. D. (1995) Genomics 25, 570-573.
19. Otten, A. D. & Tapscott, S. J. (1995) Proc. Natl. Acad. Sci. USA 92, 5465-5469.
20. Dunn, K. & Griffith, J. D. (1979) Nucleic Acids Res. 8, 555-566.
21. Nickol, J., Behe, M. & Felsenfeld, G. (1982) Proc. Natl. Acad. Sci. USA 79, 1771-1775.
22. Simpson, R. T. & Kunzler, P. (1979) Nucleic Acids Res. 6, 1387-1415.
23. Kunkel, G. R. & Martinson, H. G. (1981) Nucleic Acids Res. 9, 6869-6888.
24. Prunell, A. (1982) EMBO J. 1, 173-179.
25. Iyer, V. & Struhl, K. (1995) EMBO J. 14, 2570-2579.
26. Sutherland, G. R. & Richards, R. I. (1995) Curr. Opin. Genet. Dev. 5, 323-327.
27. Chaudhuri, J. P. (1972) Chromosome Today 3, 147-151.
28. Griffith, J. D. & Christiansen, G. (1978) Annu. Rev. Biophys. Bioeng. 7, 19-35.
29. Shimada, T., Inokuchi, K. & Nienhuis, A. W. (1986) J. Biol. Chem. 261, 1445-1452.
30. Mavrothalassitis, G. J., Watson, D. K. & Papas, T. S. (1990) Oncogene 5, 1337-1342.
31. Hagerman, P. J. (1986) Nature (London) 321, 449-450.
32. Azizkhan, J. C., Jensen, D. E., Pierce, A. J. & Wade, M. (1993) Crit. Rev. Eukaryotic Gene Expression 3, 229-254.
33. Workman, J. L. & Roeder, R. G. (1987) Cell 51, 613-622.
34. Nelson, D. L. (1995) Semin. Cell Biol. 6, 5-11.

Downloaded by guest on September 26, 2021
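
The free energies in Table 1 follow from the normalized complex/free ratios through the equation given in the table footnote, and the fold differences quoted in the text follow from the free energies by the inverse relation. The short Python sketch below reworks that arithmetic. It is an illustration under stated assumptions, not the authors' analysis code: the temperature is not given in the paper, so 298.15 K is assumed here on the basis that assembly was done at room temperature, and the function names are hypothetical.

```python
import math

# Gas constant in cal/(mol*K), since Table 1 reports free energies in cal/mol.
R_CAL = 1.987

# ASSUMPTION: the paper does not state the temperature used for RT;
# 298.15 K (room temperature) is an illustrative choice.
ASSUMED_TEMPERATURE_K = 298.15


def free_energy(ratio_fragment, ratio_reference=100.0,
                temperature_k=ASSUMED_TEMPERATURE_K):
    """Free energy (cal/mol) of a fragment relative to the reference DNA.

    Implements the Table 1 footnote:
    dG = RT ln(ratio_reference) - RT ln(ratio_fragment),
    where each ratio is (DNA in complex)/(free DNA), normalized so that
    the pUC19 reference fragment has a value of 100.
    """
    return R_CAL * temperature_k * math.log(ratio_reference / ratio_fragment)


def fold_difference(delta_g, temperature_k=ASSUMED_TEMPERATURE_K):
    """Fold reduction in assembly efficiency implied by dG (cal/mol)."""
    return math.exp(delta_g / (R_CAL * temperature_k))


# Normalized complex/free ratios taken from Table 1.
TABLE_1_RATIOS = {
    "pUC19": 100.0,
    "(CCGNN)12": 94.3,
    "(CCGNN)24": 67.0,
    "(CCGNN)48": 20.7,
}

if __name__ == "__main__":
    for name, ratio in TABLE_1_RATIOS.items():
        print(f"{name:10s} dG = {free_energy(ratio):7.1f} cal/mol")
    # Fold differences implied by the free energies quoted in the text.
    print(f"937 cal/mol  -> {fold_difference(937.0):.1f}-fold")
    print(f"2600 cal/mol -> {fold_difference(2600.0):.1f}-fold")
```

With the assumed temperature, the computed free energies fall within a few cal/mol of the Table 1 values, and the two fold differences come out close to the 4.9-fold and ~78-fold figures given in the text. The small residual differences depend on the temperature chosen and on how the three experiments were averaged.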