DNA·RNA triple helix formation can function as a cis-acting regulatory mechanism at the human β-globin locus

Zhuo Zhou(a), Keith E. Giles(a,b,c), and Gary Felsenfeld(a,1)

(a) Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD 20892; (b) University of Alabama at Birmingham Stem Cell Institute, University of Alabama at Birmingham, Birmingham, AL 35294; and (c) Department of Biochemistry and Molecular Genetics, University of Alabama at Birmingham, Birmingham, AL 35294

Contributed by Gary Felsenfeld, February 4, 2019 (sent for review January 4, 2019; reviewed by James Douglas Engel and Sergei M. Mirkin)

We have identified regulatory mechanisms in which an RNA transcript forms a DNA duplex·RNA triple helix with a gene or one of its regulatory elements, suggesting potential auto-regulatory mechanisms in vivo. We describe an interaction at the human β-globin locus, in which an RNA segment embedded in the second intron of the β-globin gene forms a DNA·RNA triplex with the HS2 sequence within the β-globin locus control region, a major regulator of globin expression. We show in human K562 cells that the triplex is stable in vivo. Its formation causes displacement from HS2 of major transcription factors and RNA Polymerase II, and consequently in loss of factors and polymerase that bind to the human ε- and γ-globin promoters, which are activated by HS2 in K562 cells. This results in reduced expression of these genes. These effects are observed when a small length of triplex-forming RNA is introduced into cells, or when a full-length intron-containing human β-globin transcript is expressed. Related results are obtained in human umbilical cord blood-derived erythroid progenitor-2 cells, in which β-globin expression is similarly affected by triplex formation. These results suggest a model in which RNAs conforming to the strict sequence rules for DNA·RNA triplex formation may participate in feedback regulation of genes in cis.

DNA·RNA triplex | human globin genes | FAU gene | DNA·RNA triplex RNA-mediated gene expression

Significance

RNA containing only pyrimidines, transcribed from genomic DNA, can form DNA·RNA triple helices with other sites in the genome. We characterize an intronic sequence in the human β-globin gene that forms a triplex with an upstream regulatory site in the β-globin locus, causing displacement of transcription factors and down regulation of expression of the genes in the locus. Consequently, overexpression of a full-length β-globin transcript containing the intron results in decreased expression of members of the β-globin gene cluster. This is a cis-regulatory feedback mechanism in which overexpression of the β-globin gene could result in signals to down-regulate that expression. Balance in the expression of the many genes associated with hemoglobin biosynthesis is essential for erythroid cell function.

Author contributions: Z.Z., K.E.G., and G.F. designed research; Z.Z. performed research; Z.Z. and G.F. analyzed data; and Z.Z. and G.F. wrote the paper.
Reviewers: J.D.E., University of Michigan; and S.M.M., Tufts University.
The authors declare no conflict of interest.
Published under the PNAS license.
(1) To whom correspondence should be addressed. Email: [email protected].
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1900107116/-/DCSupplemental.
Published online March 13, 2019.
6130–6139 | PNAS | March 26, 2019 | vol. 116 | no. 13 | www.pnas.org/cgi/doi/10.1073/pnas.1900107116

The fact that both DNA and RNA can form triple-stranded structures has been known for a long time (1, 2), and the rules governing the formation of such structures have been well established (3, 4). Triplexes in which all three strands are RNA were the first to be observed (1) in vitro. Triplexes in which all three strands are DNA (H-DNA) form under superhelical stress (5, 6), and short runs of all-RNA triplex structures have been shown to protect a long-noncoding RNA (lncRNA) from degradation or misfolding (7–12). Triple-strand structures can also be formed between a DNA duplex and single-stranded RNA (ssRNA), but the sequence constraints are greater than with structures in which all three strands are DNA (13–17). As shown in SI Appendix, Fig. S1A, a typical DNA·RNA triplex requires that the RNA bases be pyrimidines, with U favored over C energetically (4). An rU base forms a triplex with an A·T DNA base pair; an rC base, typically protonated, forms triplex with a G·C DNA base pair. Such DNA·RNA complexes are among the most stable triplex polynucleotide structures (15, 17). The RNA polypyrimidine strand binds in a direction antiparallel to the DNA polypyrimidine strand. One consequence of this strand polarity is that, except when the sequences are palindromic, a transcript cannot form a DNA·RNA triplex with its own template. Thus, source and target sequences will generally be distinct.

Taking these constraints into account, we searched the human genome for candidate sequences that might be either a DNA target for DNA·RNA triplex formation, or the source of an RNA capable of binding to such DNA sequences. As a demonstration of the criteria necessary to establish the presence of a triplex structure, we first describe and characterize triplex formation at the FAU gene in human erythroid K562 cells. FAU encodes a protein that is a fusion containing fubi, a ubiquitin-like protein, and ribosomal protein S30. Although fubi function is unknown, posttranslational processing produces S30, a component of the 40S ribosome. We used this system to refine methods necessary to detect triplex formation and to distinguish it from R-loop formation, a potential source of confusion.

We then applied these methods to search for other examples of DNA·RNA triplexes and identified an interaction between an RNA sequence present within an intron of the human adult β-globin gene and an upstream regulatory element within hypersensitive site 2 (HS2) of the β-globin locus control region (LCR). The effect of this interaction is to displace transcription factors from the regulatory site and affect expression of members of the β-globin family. This system represents a feedback mechanism in which a transcript could affect its own expression by forming a triple-strand structure at a nearby regulatory element.

Results

In Vivo Triplex-Forming RNA in K562 Cells: The FAU Gene as a Source and Target. The methods we employed for detecting DNA·RNA triple-stranded structure formation in vitro are shown in SI Appendix, Fig. S1 B–D, using a previously published and well-characterized triplex-forming sequence (3). RNA and individual DNA strands are labeled for electrophoresis studies, and sensitivity of complexes to RNases distinguishes triplex from heteroduplex when the sequence is palindromic.

Using these guidelines for formation of DNA·RNA triplexes (3, 4) (SI Appendix, Fig. S1), we first searched the Cold Spring Harbor Laboratory long RNA-sequencing database in K562 cells for RNAs (>200-nt long) containing at least 20-nt-long stretches of polypyrimidines. Because we wanted to explore the potential role of such triplexes in gene regulation, we were interested in RNAs transcribed from nonrepetitive DNA near putative gene promoter regions. Such a configuration should favor interactions between RNA and its DNA target. We initially focused on FAU (FAU ubiquitin-like and ribosomal protein S30 fusion), a proapoptotic regulatory gene that is expressed in K562 cells and down-regulated in human breast, prostate, and ovarian cancers (18–21). Our search showed that one of the more abundant RNAs satisfying the criteria for triplex formation corresponded in sequence to FAU antisense transcript (Fig. 1A). This RNA contains a U-rich sequence, (UCU)6, which could in principle be targeted back to the FAU gene as part of a canonical triplex. However, because it happens to be a palindromic sequence it could also form an R-loop in which the RNA partially displaces one of the DNA strands, and forms a heteroduplex with the other, while still maintaining a complex with three strands. We chose this gene as a way to develop methods for demonstrating triplex formation and eliminating heteroduplex formation as an explanation for our results.

To distinguish between these possibilities in vitro, FAU-triplex gel-shift assays were performed with fluorescent tagged oligonucleotides, as shown in Fig. 1B. We observed that Cy5-labeled FAU triplex-forming RNA (FAU-tfRNA) forms a complex with its Cy3-labeled DNA templates at pH 6.5. This complex is resistant to RNase H cleavage but is degraded by RNase A, as has been shown to be consistent with formation of a DNA·RNA triplex (22) (SI Appendix, Fig. S1). A third possible outcome, involving complete displacement of one of the two DNA strands, is ruled out by the electrophoretic results (Fig. 1B) (RNase H). Complex formation is dependent on the RNA sequence, as Cy5-labeled PolyU-RNA is unable to produce a band shift with the FAU-DNA duplex in our triplex gel-shift assay (SI Appendix, Fig. S2). The results indicate that triplexes rather than R-loops are formed in vitro between FAU-tfRNA and its targeting DNA duplex, similarly to what is observed in the “model” triplex in SI Appendix, Fig. S1.

To further exclude the possibility of an R-loop structure, we also subjected these FAU double-stranded (dsDNA)/ssRNA complexes to dimethyl sulfate (DMS) footprinting to determine the accessibility of guanines in the DNA duplex at the N7 position.
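The sequence criteria used in this section (an RNA longer than 200 nt that contains a polypyrimidine stretch of at least 20 nt, with rU recognizing A·T and rC recognizing G·C) can be illustrated with a short scan. The following Python sketch is not the authors' search pipeline; the function names are hypothetical, the thresholds are taken from the text, and "palindromic" is assumed here to mean that the pyrimidine run reads the same in both directions on a single strand.

```python
import re


def polypyrimidine_runs(rna, min_run=20):
    """Return (start, end, run) for every maximal C/U run of at least min_run nt.

    DNA-style input is accepted: T is read as U. Coordinates are 0-based
    and end-exclusive.
    """
    seq = rna.upper().replace("T", "U")
    pattern = re.compile(r"[CU]{%d,}" % min_run)
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(seq)]


def is_candidate_rna(rna, min_length=200, min_run=20):
    """True if the RNA is longer than min_length nt and has a qualifying run."""
    return len(rna) > min_length and bool(polypyrimidine_runs(rna, min_run))


def purine_strand_target(run):
    """Purine strand (5'->3') of a DNA duplex that the RNA run could bind.

    rU recognizes the A of an A.T pair and rC the G of a G.C pair. The RNA
    lies antiparallel to the DNA pyrimidine strand, hence parallel to the
    purine strand, so the sequence is not reversed.
    """
    seq = run.upper().replace("T", "U")
    return seq.translate(str.maketrans("UC", "AG"))


def is_mirror_repeat(run):
    """Assumed reading of 'palindromic': the run equals its own reverse."""
    seq = run.upper().replace("T", "U")
    return seq == seq[::-1]
```

For example, `is_mirror_repeat("UCU" * 6)` holds for the (UCU)6 motif. That motif alone is 18 nt, so in this sketch it would pass the 20-nt filter only if adjacent bases in the transcript are also pyrimidines.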
In triple-stranded structures in which all three strands are DNA, it has been shown that formation of dC·dG·dC

[Figure 1 panel labels as extracted. Panel A sequence: CCTGTGTCTTCTTCTTCTTCTTCTCTCCTGTTTGG. Panels B and C lane labels: DMS (−, +, +); RNase A; RNase H; Proteinase K; 32P-FAU DNA duplex (+, +, +); FAU RNA (−, −, +). Band labels: DNA/RNA hybrid, DNA (+), DNA duplex, triplex, RNA. Schematic: DNA·RNA triplex (DNA, RNA), 5’-GTGGCCAAACAGGAGAAGAAG AAGAAGAAGACAGGTCGGGC-3’.]
Fig.