Multiple Genes Encode Nuclear Factor 1-Like Proteins That Bind To

Proc. Nati. Acad. Sci. USA Vol. 85, pp. 8%3-8967, December 1988 Biochemistry Multiple genes encode nuclear factor 1-like proteins that bind to the promoter for 3-hydroxy-3-methylglutaryl-coenzyme A reductase (transcription factors/DNA-binding proteins/regulation of cholesterol metabolism) GREGORIO GIL*t, JEFFREY R. SMITH*, JOSEPH L. GOLDSTEIN*, CLIVE A. SLAUGHTERS, KIM ORTHIt, MICHAEL S. BROWN*, AND TIMOTHY F. OSBORNE* *Departments of Molecular Genetics and Internal Medicine and *Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, 5323 Hary Hines Boulevard, Dallas, TX 75235 Contributed by Michael S. Brown, August 31, 1988 ABSTRACT DNA-binding proteins of the nuclear factor 1 62,000. Santoro et al. (3) isolated three cDNAs for NF1/CTF (NF1) family recognize sequences containing TGG. Two of proteins from human HeLa cells. All of the mRNAs were these proteins, termed reductase promoter factor (RPF) pro- derived from a single gene by alternative splicing and differed teins A and B, bind to the promoter for hamster 3- by the presence or absence ofan insertion in the middle ofthe hydroxy-3-methylglutaryl-coenzyme A reductase, a negatively protein and the presence of various COOH termini. All three regulated enzyme in cholesterol biosynthesis. In the current proteins, when produced in Escherichia coli, bound to the study, we determined the sequences of peptides derived from NF1/CTF recognition sequence, and at least two of them hamster RPF proteins A and B and used this information to stimulated replication of adenovirus in vitro. isolate a cDNA, designated pNF1/Redl, that encodes RPF Another member of the NF1/CTF family is the TGGCA- protein B. The peptide sequence of RPF protein A, the other binding protein, which was described in chicken liver (8) and reductase-related protein, suggests that it is the hamster believed to have a molecular weight in the 30,000 range. The equivalent of NF1/L, which was previously cloned from rat cDNA for this protein, designated pNF1/L, was isolated liver. We also isolated a hamster cDNA for an additional from rat liver (9). It encodes a protein of 505 amino acids, member of the NF1 family, designated NF1/X. Thus, the which was proteolyzed during its isolation. The first 175 hamster genome contains at least three genes for NFl-like amino acids of the rat TGGCA-binding protein show 98% proteins. It is likely to contain a fourth gene, corresponding to identity with the NH2 terminus of human NF1/CTF, and the NF1/CTF, which was previously cloned from the human. The remaining portion shows an identity of "50% (3, 9). NH2-terminal sequences of all four NFl-like proteins The 5' flanking region of the reductase gene contains the (NFl/Redl, NF1/L, NF1/X, and NF1/CTF), which are vir- information necessary for transcription ofthe gene as well as tually identical, contain the DNA-binding domain that recog- for feedback repression by sterols. This region contains eight nizes TGG. Functional diversity may arise from differences in sequences that bind nuclear proteins as revealed by DNase I the COOH-terminal sequences. We hypothesize that the protection assays (10). We have purified (5) a protein doublet COOH-terminal domain interacts with adjacent DNA-binding of 33 and 35 kDa, designated reductase promoter factor proteins, thereby stabilizing the binding of a particular NF1- (RPF) that produces six of the eight footprints. The two like protein to a particular promoter. This protein-protein proteins were active as monomers, and both recognized all interaction confers specificity to a class of proteins whose six footprinted sequences. The only sequence shared by all DNA-recognition sequence is widespread in the genome. Ste- six footprinted regions is the trinucleotide TGG. Methyla- rols may repress transcription of the reductase gene by tion-interference analysis showed that both of the adjacent disrupting this protein-protein interaction. guanosines in this sequence made contact with RPF (5). The HMG-CoA reductase sequence that binds RPF with highest affinity (designated footprint 2B) contains the in- Among the factors that regulate transcription of eukaryotic verted repeat TGGN7CCA (5). The other five RPF-binding genes is a family ofproteins known as nuclear factor 1 (NF1), sites contain only half-sites. We speculated that RPF be- CCAAT-box-binding transcription factor (CTF), or CCAAT- longed to the NF1 family, and we demonstrated by gel box-binding protein (1-4). These proteins are related in that retardation assays that RPF bound with high affinity to an they all bind to double-stranded sequences containing TGG oligonucleotide corresponding to the NFl-binding site in the and its complement, CCA making contact with the adjacent adenovirus origin of replication (5). The question remains as guanosines (4). NFl-like pNroteins bind to the promoters of to whether NF1/CTF and RPF are products ofthe same gene many genes, including the gene for 3-hydroxy-3-methylglu- with differences in behavior attributable to alternative splic- taryl-coenzyme A reductase (HMG-CoA reductase), the ing, or whether multiple genes for this protein family exist. rate-determining enzyme in cholesterol synthesis (5). In the current study, we obtained partial amino acid NF1 binds to a TGG-containing sequence in adenovirus sequences for the two RPF proteins and used this information and thereby stimulates its replication (6, 7). The same protein to isolate cDNAs§ for two related but genetically distinct (designated NF1/CTF) binds to the CCAAT box, making NFl-like proteins. The evidence suggests that the hamster contact with TGG on the opposite strand and activating transcription ofthe human a-globin gene (1). NF1/CTF binds Abbreviations: HMG-CoA, 3-hydroxy-3-methylglutaryl-coenzyme with highest affinity to sequences that contain the inverted A; NF1, nuclear factor 1; PCR, polymerase chain reaction; RPF, repeat TGGN7CCA. It also binds to "half-sites" that contain reductase promoter factor; CTF, CCAAT-box-binding transcription only one copy of TGG. Purified NF1/CTF consists of a factor. family of proteins in the molecular weight range of 55,000- tPresent address: Department of Biochemistry, University of Mas- sachusetts Medical School, Worcester, MA 01605. §The sequences reported in this paper are being deposited in the The publication costs of this article were defrayed in part by page charge EMBL/GenBank data base (IntelliGenetics, Mountain View, CA, payment. This article must therefore be hereby marked "advertisement" and Eur. Mol. Biol. Lab., Heidelberg) (accession nos. J04122 and in accordance with 18 U.S.C. §1734 solely to indicate this fact. J04123). 8963 Downloaded by guest on September 24, 2021 8964 Biochemistry: Gil et al. Proc. NatL. Acad. Sci. USA 85 (1988) genome contains at least three genes of the NF1/CTF family 1 2 Peptide Ainio Acid Sequence and that at least two of these proteins bind to the HMG-CoA reductase promoter. Al *FAYTVFNLQAG *-43 £2 K P P C C V L S N P D Q I MATERIALS AND METHODS £3 QADXVVR LD LV V I LFK G I P ? LV 4 *Q PE NGI LGF Q D ? FV General Methods. Standard molecular biology techniques AS V S Q T P I A A G T G P N F S L S D L E ? S ? Y were used (11). cDNA clones were sequenced by the dideoxy IT- chain-termination method (12) with either the M13 universal x sequencing primer or specific oligonucleotides after subclon- .0A E 31 *IAYTVFNLQA ing into bacteriophage M13 vectors. Sequencing reactions 32 *A D K V V R L D L V V I L F K .0. B3 L D L F L A Y Y V Q E Q D S G Q ? G S P were performed with the Klenow fragment of E. coli DNA * -31 polymerase I (12) or a modified bacteriophage T7 DNA 34 P I T QG T GV N F P I G E I P S Q P polymerase (13). Total cellular RNA was isolated by guani- dinium thiocyanate extraction followed by centrifugation FIG. 1. Sequence of tryptic peptides from RPF. (Left) RPF was through a cesium chloride cushion (14). Blot hybridization of separated into two proteins (A and B) by electrophoresis on a 35-cm RNA was performed as described (15) with single-stranded 7% polyacrylamide gel containing NaDodSO4 (lane 1). This gel gave 32P-labeled probes (16). wide separation of two marker proteins, ovalbumin (43,000) and Amino Acid Sequence of RPF Peptides. Approximately 300 carbonic anhydrase (31,000) (lane 2). Proteins A and B were pmol of purified RPF (5) was subjected to NaDodSO4/ transferred to nitrocellulose, stained, cut from the filters, and polyacrylamide gel electrophoresis and transferred to nitro- separately digested with trypsin. The peptides were separated by cellulose sheets for solid-phase tryptic digestion (17). Pep- HPLC and sequenced. (Right) Peptides Al-A5 and B1-B4 were derived from proteins A and B, respectively. A single sequencer run tides were separated by reverse-phase HPLC on a Brownlee was performed on each peptide. An asterisk denotes that more than RP300 (2.1 x 1000 mm) C8 column in 0.1% trifluoroacetic acid one amino acid was obtained at the indicated position. Amino acid with a gradient of 0-50%o (vol/vol) acetonitrile for 120 min at signals ranged from 0.5 to 63 pmol. 50 ttl/min. Peaks were collected manually on 1-cm Whatman GF/C discs. Cysteines were reduced and alkylated (18). RESULTS Peptides were sequenced on an Applied Biosystems (Foster City, CA) model 470A sequencer (Fig. 1). RPF was isolated as a protein doublet from hamster liver nuclei cDNA Library. A double-stranded cDNA library from as described (5). The two major components, A and B, were Syrian hamster liver was synthesized from 10 ,ug of poly(A)+ separated on 35-cm NaDodSO4/polyacrylamide gels (Fig.

Multiple Genes Encode Nuclear Factor 1-Like Proteins That Bind To

ENCODE Regions Identification of Higher-Order Functional Domains in the Human

Developing and Implementing an Institute-Wide Data Sharing Policy Stephanie OM Dyke and Tim JP Hubbard*

Estimating the Number of Unseen Variants in the Human Genome

The ENCODE Project

The ENCODE (Encyclopedia of S DNA Elements) Project

What Is a Gene, Post-ENCODE? History and Updated Definition

A User's Guide to the Encyclopedia of DNA Elements (ENCODE)

Current Advances of Epigenetics in Periodontology from ENCODE Project

DNA Copy Number Analysis of Fresh and Formalin-Fixed Specimens By

ENCODE Phase 2: Participants and Projects

National Human Genome Research Institute EA Feingold, PJ G

Snyder Produc&On Group Summary for ENCODE 3