molecules

Review

Structure and Function of Multimeric G-Quadruplexes

Sofia Kolesnikova 1,2 and Edward A. Curtis 1,*

1 The Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, 166 10 Prague, Czech Republic
2 Department of Biochemistry and Microbiology, University of Chemistry and Technology, 166 28 Prague, Czech Republic
* Correspondence: [email protected]

Received: 13 August 2019; Accepted: 22 August 2019; Published: 24 August 2019

Abstract: G-quadruplexes are noncanonical nucleic acid structures formed from stacked guanine tetrads. They are frequently used as building blocks and functional elements in fields such as synthetic biology and are also thought to play widespread biological roles. G-quadruplexes are often studied as monomers, but can also form a variety of higher-order structures. This increases the structural and functional diversity of G-quadruplexes, and recent evidence suggests that it could also be biologically important. In this review, we describe the types of multimeric topologies adopted by G-quadruplexes and highlight what is known about their sequence requirements. We also summarize the limited information available about potential biological roles of multimeric G-quadruplexes and suggest new approaches that could facilitate future studies of these structures.

Keywords: G-quadruplex; dimer; tetramer; multimer; oligomer; ; ; R-loop; DNA:RNA hybrid

Molecules 2019, 24, 3074; doi:10.3390/molecules24173074 www.mdpi.com/journal/molecules

1. Introduction

The B-form double helix is the best-known nucleic acid structure, but it is not the only one. Other examples include double helices with geometries that differ from that of classical B-form DNA, such as Z-DNA [1], triple helices [2], and even four-stranded structures in which canonical A-T and C-G base pairs are absent [3,4]. Among these noncanonical folds, the G-quadruplex (GQ) has received the most attention. This is a four-stranded structure typically stabilized by stacked guanine tetrads connected by short loops [3,5,6]. Because of their high stability, structural versatility, and functional diversity, GQs have been widely used as building blocks and functional elements in fields such as synthetic biology [7,8]. Sequences with the potential to form GQs are also abundant in the genomes of higher eukaryotes [9,10], and recent studies using GQ-specific antibodies indicate that these structures can form in the context of cells [11,12]. GQs are thought to play a variety of biological roles. These include the regulation of transcription, translation, DNA replication, and RNA localization [13–15].

Most biological studies to date have focused on monomeric GQs. However, GQs can also adopt a variety of multimeric forms. These include relatively small structures such as dimers. They also include larger ones like G-wires, which can contain hundreds of GQ monomers. Although the ability of GQs to multimerize has long been recognized, the possibility that such higher-order structures form in the context of cells has received less attention. From our perspective, however, two observations suggest that this possibility should be considered. First, the cellular concentrations of GQs are surprisingly high, especially in higher eukaryotes. For example, current estimates suggest that human cells contain at least 716,000 DNA GQs [16] in a volume (for a HeLa cell nucleus) of 0.22 pL [17]. This corresponds to a cellular GQ concentration of approximately 6 µM. Not all of these GQs will be present at the same time in the cell cycle, and not all will be capable of forming dimers. On the other hand, this concentration is orders of magnitude higher than that required for efficient GQ multimerization. For example, in a recent study we identified GQs with dissociation constants of dimer formation as low as 35 nM [18]. Second, the hypothesis that GQs form multimeric structures in cells is also compelling when the myriad evolutionary and functional advantages of this mechanism are considered [19–22]. These include the ability to regulate biochemical function on the basis of concentration, to detect ligands with enhanced sensitivity by cooperative binding, and to modulate activity in a rapid and reversible manner by the exchange of dimerization partners (Figure 1). Inspired by these considerations, in this review we describe the types of multimeric topologies adopted by GQs and review what is known about their sequence requirements. We also summarize the limited information currently available about the potential biological roles of multimeric GQs in cells and suggest new approaches that could facilitate future studies of these structures, especially in the context of cells.
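The concentration argument can be checked with a short calculation. The sketch below is illustrative only: it assumes that all 716,000 GQs are folded at once and evenly distributed in the 0.22 pL nucleus, and it treats them as a single self-associating species obeying a simple homodimer equilibrium (2M ⇌ D, with Kd = [M]²/[D]). Neither assumption is made in the cited studies, and the function names are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def molar_concentration(copies: float, volume_litres: float) -> float:
    """Convert a copy number in a given volume to mol/L."""
    return copies / AVOGADRO / volume_litres


def dimer_fraction(total_molar: float, kd_molar: float) -> float:
    """Fraction of strands found in dimers for 2M <-> D, Kd = [M]^2 / [D].

    Assumption (not from the source): one species, ideal homodimerization.
    Mass balance: total = [M] + 2[D] = [M] + 2[M]^2 / Kd, solved for [M].
    """
    monomer = kd_molar * (math.sqrt(1.0 + 8.0 * total_molar / kd_molar) - 1.0) / 4.0
    return 1.0 - monomer / total_molar


# 716,000 GQs in a 0.22 pL nucleus (values quoted in the text)
gq_total = molar_concentration(716_000, 0.22e-12)
print(f"GQ concentration: {gq_total * 1e6:.1f} uM")

# Dimer occupancy at the lowest reported Kd of 35 nM
print(f"Fraction in dimers: {dimer_fraction(gq_total, 35e-9):.2f}")
```

Under these assumptions the arithmetic gives a value in the low micromolar range, the same order of magnitude as the figure quoted above, and a Kd of 35 nM would place the large majority of such strands in dimers. In a real nucleus the sequences are heterogeneous and not all folded simultaneously, so this is an upper-bound style estimate rather than a prediction.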

Figure 1. Regulation of biochemical function by multimerization. (a) Concentration-based control of biochemical function. In this scheme, monomers are biochemically active and dimers are not. At concentrations below the dissociation constant for dimer formation, most of the population is monomeric and in the active state, while at concentrations above the dissociation constant, most of the population is dimeric and in the inactive state. Green rectangles = nucleic acid or protein monomers. (b) Enhanced sensitivity to ligand concentration by cooperative binding. In this scheme, ligand binding is independent when nucleic acid or protein binding sites are monomeric but cooperative when they are linked by multimerization. This leads to all-or-none binding and enhanced sensitivity to ligand concentration. Green rectangles = nucleic acid or protein monomers; black circles = ligands. (c) Modulation of biochemical activity by the exchange of dimerization partners. In this scheme, the gene transcribed by RNA polymerase is determined by the DNA-binding specificity of its dimerization partner. Green ovals = RNA polymerase; green and purple rectangles = transcription factors; blue lines = DNA.
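The enhanced sensitivity described in panel (b) can be illustrated with the Hill equation, a standard phenomenological model of cooperative binding. The model and the Hill coefficients used below are assumptions introduced for illustration and are not taken from the works cited here. The sketch compares how large a change in ligand concentration is needed to move from 10% to 90% occupancy for independent (n = 1) and cooperative (n > 1) sites.

```python
def fraction_bound(ligand: float, k_half: float, hill_n: float = 1.0) -> float:
    """Hill equation: fraction of sites occupied at a given ligand concentration."""
    return ligand**hill_n / (k_half**hill_n + ligand**hill_n)


def fold_change_10_to_90(hill_n: float) -> float:
    """Fold increase in ligand needed to go from 10% to 90% occupancy.

    Setting the Hill equation to 0.1 and 0.9 gives L^n = K^n / 9 and
    L^n = 9 K^n, so the ratio of the two concentrations is 81^(1/n).
    """
    return 81.0 ** (1.0 / hill_n)


# Hypothetical Hill coefficients: 1 = independent sites, >1 = cooperative
for n in (1.0, 2.0, 4.0):
    print(f"n = {n:.0f}: {fold_change_10_to_90(n):.0f}-fold change in ligand")
```

With independent sites an 81-fold change in ligand concentration is needed to switch most of the population, whereas the window narrows as cooperativity increases, which is the sense in which multimerization sharpens the response.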

2. Definition of Multimeric G-Quadruplexes and Modes of Multimerization

A multimer is an aggregate of molecules consisting of multiple monomers. Nucleic acids typically multimerize (hybridize) through duplex formation via the Watson–Crick base pairing of two complementary strands. A double helix provides DNA with structural stability, determined by hydrogen bonding and base-stacking interactions, and facilitates replication [23,24], while double-stranded RNA facilitates genetic interference [25]. Duplex formation is the most common mechanism of multimerization, but not the only one. Multimerization can also occur using a distinct hydrogen-bonding pattern called Hoogsteen base pairing. This enables formation of a G-tetrad, the building block of a unique type of nucleic acid structure called a G-quadruplex (GQ). To form a G-tetrad, four guanines associate via eight hydrogen bonds from both the Watson–Crick and Hoogsteen faces of the base (Figure 2).

Figure 2. Chemical structure of a GQ (G-quadruplex) tetrad.

G-tetrads stack on top of one another giving rise to a GQ (Figure 3). The types of sequences that can adopt GQ structures in vitro have been extensively characterized. To form a GQ, a sequence typically needs to contain segments of at least two guanines separated by mixed-sequence loop nucleotides. The most widely used models allow loops of one to seven nucleotides [9,10] but in some cases loops can be longer [26–28]. In addition, it is becoming increasingly clear that GQs can accommodate bulges [29], and noncanonical tetrads have also been observed in high-resolution structures [30–36]. Even more complicated topologies are seen in the Spinach, Mango, and Class V GTP aptamers, in which the four clusters of guanines that form the tetrads in the GQ are far apart in the primary sequence [37–39]. Guanines in the G-tetrad can in principle come from one, two, three, or four guanine-rich (G-rich) strands. GQs formed from only one strand are typically defined as intramolecular (unimolecular) (Figure 3A–B), although, as discussed below, multiple GQs on a single strand can also interact to form higher-order structures. GQs that contain more than one strand are termed intermolecular (multimolecular or multimeric) and can be classified according to the number of strands as bimolecular (dimeric) (Figure 3C), trimolecular (trimeric) (Figure 3D), or tetramolecular (tetrameric) (Figure 3E). Multimeric GQs formed from identical strands are called homomultimeric, whereas those composed of nonidentical strands are called heteromultimeric.
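The sequence model described above (runs of at least two guanines separated by loops of one to seven nucleotides) is often expressed as a pattern search. The following is a minimal sketch under stated assumptions: it requires four G-runs on one strand, as needed for an intramolecular GQ, and it lets loops contain any base. The pattern and the function name are illustrative and do not reproduce any specific published tool; bulges, noncanonical tetrads, and long loops are deliberately not handled.

```python
import re

# Four runs of >= 2 guanines separated by three loops of 1-7 nucleotides.
# Loops may contain any base (including G), so the pattern is permissive.
GQ_MOTIF = re.compile(r"G{2,}(?:[ACGTU]{1,7}G{2,}){3}")


def has_putative_gq(sequence: str) -> bool:
    """Return True if the strand contains a candidate intramolecular GQ motif."""
    return GQ_MOTIF.search(sequence.upper()) is not None


# Hypothetical test strands
for strand in ("GGTTAGGTTAGGTTAGG", "GGGGTTTTGGGG", "TGGGGT"):
    print(strand, has_putative_gq(strand))
```

A match only indicates that a strand satisfies the simple model. Strands that fail the test, such as those with fewer than four G-runs, may still form GQs by assembling with other strands.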

Figure 3. Formation of GQs from different numbers of strands. (a) Intramolecular (unimolecular) GQ with antiparallel strands. (b) Intramolecular (unimolecular) GQ with parallel strands. (c) Bimolecular (dimeric) GQ. (d) Trimolecular (trimeric) GQ. (e) Tetramolecular (tetrameric) GQ. Note that each of these structures can in principle contain all parallel strands, all antiparallel strands, or a mix of parallel and antiparallel strands.


While the number of molecules in a GQ is the standard way to classify multimers, another approach is to consider the structure of the multimerization interface. GQs utilize two primary modes of multimerization. In the first mode, interfaces are formed by tetrads from two different GQs stacked on top of one another. When intramolecular (unimolecular) GQs multimerize using this mechanism, individual nucleic acid strands first assemble into monomeric GQs, such that all four guanines in each G-tetrad come from the same nucleic acid strand. Monomeric GQ subunits then stack on top of one another via π–π stacking interactions of terminal interfaces to form higher-order GQ structures (Figure 4A). Intermolecular (multimeric) GQs can also multimerize in this way if all of the guanines in one of the tetrads at the interface come from one GQ and all of the guanines in the other come from a different GQ (Figure 4A). GQ subunits can stack in three different orientations: 5′ to 3′ (head-to-tail) [40], 5′ to 5′ (head-to-head) [41], and 3′ to 3′ (tail-to-tail) [42]. Experimental and molecular dynamics data suggest that 5′ to 5′ stacking is the most common orientation for GQ structures due to a favorable stacking geometry [41,43]. The propensity of subunits to stack is also influenced by the topology of intermolecular GQ subunits. Stacking is favorable for parallel GQs (i.e., GQs with all strands oriented in the same direction and with propeller loops on the sides of the tetrads). In antiparallel GQ structures, which have strands in opposite orientations, lateral and diagonal loops are positioned above and below the GQ axis, which can impede stacking interactions of terminal G-tetrads [44]. Stacking interactions can also facilitate formation of higher-order structures from tandem GQ subunits folded on a single strand. Such a structure was proposed as one of the models of the human telomere overhang [45,46]. In addition to playing important roles in multimerization, stacking interactions are the most important mode of ligand binding to GQs. End-stacking GQ ligands typically bind to terminal GQ tetrads, but can also intercalate between tandem GQs to stabilize the multimeric structure [47–52].

Figure 4. Types of interfaces in multimeric GQs. (a) First mode of multimerization. Interfaces are formed between tetrads which stack on top of one another in a 5′ to 5′, 3′ to 3′, or 5′ to 3′ arrangement. (b) Second mode of multimerization. Interfaces are formed within tetrads made up of guanines from multiple DNA strands. (c) Structure combining these two modes of multimerization.

In the second mode of multimerization, guanines from two or more nucleic acid strands hydrogen bond to form tetrads, so that interfaces occur within rather than between tetrads. These intermolecular G-tetrads then stack to form GQs of various lengths (Figure 4B), sometimes using slipped strands [53]. Various G-rich oligonucleotides multimerize via this mode, and a large body of literature has investigated their formation [5,6]. As early as 1988, Sen and Gilbert demonstrated that oligonucleotides containing motifs of four, five, or six contiguous guanines fold into tetrameric structures [54]. One year later, two different groups proposed the formation of dimeric GQs from two telomere-derived G-rich sequences [55,56]. Sen and Gilbert [57] observed that short oligonucleotides with three consecutive guanines at the 3′ end assembled into four-, eight-, and twelve-stranded GQ structures. Formation of these structures starts with the self-assembly of slipped tetramers, in which strands are not perfectly aligned but contain two overhanging guanines at the 3′ end. Tetrameric subunits associate with one another via hydrogen bonding between these extra guanines.

Higher-order GQs which combine these two modes of multimerization have also been described [18,42,58–61]. An illustrative example is a GQ structure assembled from eight d(TGGGGT) strands (Figure 4C). The structural subunit is a tetramolecular GQ consisting of four perfectly aligned strands, each in an identical 5′–3′ orientation (second mode of multimerization). Two structural subunits then stack at their 5′ interfaces, forming an octamer (first mode of multimerization) [58,59].

Higher-order GQ structures are typically initially characterized using low-resolution techniques such as circular dichroism, dimethylsulfate footprinting, native PAGE, mass spectrometry, and analytical ultracentrifugation. These methods can be used to establish that the sequence forms a GQ, identify the guanines in tetrads, and determine the number of strands in the structure. Higher-order GQs can be visualized in greater detail using NMR and X-ray crystallography. Examples of high-resolution structures which utilize the first mode of multimerization include 5′–5′ stacked dimers with canonical [41,62,63] or extended [64] tetrads at the interface. Structures which use the second mode of multimerization include interlocked dimers which occur in the promoters and introns of oncogenes [65–67]. A structure which combines these two modes of multimerization is that of a parallel-stranded tetrameric GQ, formed from d(TGGGGT) strands [58,59]. Additional examples are discussed in [5,50,68] and elsewhere.

3. Sequence Requirements of Multimeric G-Quadruplexes

The propensity of a GQ sequence to fold into a particular multimeric structure in vitro depends on a number of factors, including the type and concentration of cation in the buffer, the presence of molecular crowding agents, the nucleic acid concentration, the sequence length, and the sequence composition [5,68–70]. Under conditions approximating the intracellular environment, the main factor controlling higher-order structure formation is the sequence. In the context of GQ formation, a large number of sequences, ranging from short strands with only one G-run to up to 20,000 nucleotide long sequences with roughly 3300 G-runs, have been investigated [58,59,71]. The key sequence features affecting multimerization are discussed below and summarized in Figure 5.

Figure 5. Sequence requirements of multimeric GQs. (a) Example of a canonical GQ. (b) Variant containing two rather than four G-runs. Such a sequence can form a multimeric GQ but not a monomeric one. (c) Variant containing G-runs of two rather than three nucleotides. In some cases, such sequences form multimeric rather than monomeric structures. (d) Variant containing mutations in tetrads. Such mutations can induce formation of dimeric and tetrameric structures. (e) Variant containing overhanging nucleotides. Such variants typically cannot stack via the first mode of multimerization, but the ability to interact via the second mode of multimerization is unaffected. (f) Variant containing extended loops. Longer loops favor formation of antiparallel rather than parallel GQs, and such loops can interfere with the ability of GQs to stack via the first mode of multimerization.


The minimum requirement for a sequence to fold into an intramolecular GQ is typically four runs of at least two guanine bases separated by loops ranging from one to seven nucleotides (Figure 5A). Sequences falling into this category often either fold into monomeric GQs that do not interact with one another, or form multimeric structures by the first mode of multimerization (stacking of monomeric subunits). Structure prediction is nevertheless complicated by reports of sequences containing four stretches of guanines which form intertwined dimers using the second mode of multimerization [65–67,72]. Sequences with fewer than four G-runs can typically only adopt GQ structures by interacting to form multimeric structures using the second mode of multimerization (Figure 5B). Sequences with two or three G-runs separated by short loops usually combine both types of multimerization and readily assemble into structures with eight, twelve, or even more strands [57]. At the extreme end of this continuum lie the G-wires [73], long linear ladder-like structures formed from numerous slipped GQ tetrameric subunits (Figure 6). G-wires are longer than any other higher-order GQs, with maximum lengths depending on the method of preparation as well as the sequence. For example, a DNA sample of d[G4T2G4] formed linear G-wires ranging from 7 to 100 nm in length. A 100 nm G-wire was calculated to contain 75 GQ blocks composed of 140 full and 20 half strands [74]. Positioning of the GQ subunits within the structure can be controlled by adding GC bases to the terminal ends, which enables the formation of G:C:G:C tetrads linking the subunits [75].
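The minimum sequence requirement stated above (four runs of at least two guanines separated by loops of one to seven nucleotides) can be written as a simple pattern screen. The following is a minimal sketch, not a structure predictor: the function names are hypothetical, loops are allowed to contain guanines, and a match only indicates that a sequence satisfies the stated rule, not that it folds into a monomeric rather than a multimeric GQ.

```python
import re

# Assumed reading of the rule in the text: four runs of >= 2 guanines
# separated by three loops of 1-7 nucleotides. Loops are matched lazily and
# may themselves contain G, so this is a permissive screen only.
GQ_MOTIF = re.compile(r"G{2,}(?:[ACGTU]{1,7}?G{2,}){3}")


def count_g_runs(seq: str, min_len: int = 2) -> int:
    """Count maximal runs of at least min_len consecutive guanines."""
    return len(re.findall(r"G{%d,}" % min_len, seq.upper()))


def has_intramolecular_gq_motif(seq: str) -> bool:
    """True if seq contains four G-runs separated by loops of 1-7 nucleotides."""
    return GQ_MOTIF.search(seq.upper()) is not None


# Reference sequence with four G-runs versus the two-run G-wire former d[G4T2G4].
print(has_intramolecular_gq_motif("GGGTGGGAAGGGTGGGA"), count_g_runs("GGGTGGGAAGGGTGGGA"))
print(has_intramolecular_gq_motif("GGGGTTGGGG"), count_g_runs("GGGGTTGGGG"))
```

Under this screen, a sequence with only two G-runs such as d[G4T2G4] fails the intramolecular test, which is consistent with the statement that such sequences can adopt GQ structures only by multimerizing.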

Figure 6. Structure of a G-wire.
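The strand count quoted for a 100 nm d[G4T2G4] G-wire (75 GQ blocks, 140 full and 20 half strands) can be checked with simple guanine bookkeeping. This sketch rests on assumptions that are not stated in the text: each GQ block is taken to consist of four G4 runs (four stacked tetrads), a full strand is taken to contribute two G4 runs, and a half strand one.

```python
# Guanine bookkeeping for the 100 nm d[G4T2G4] G-wire described in the text.
# Assumptions (not from the source): a GQ block is built from four G4 runs,
# a full strand supplies two G4 runs, and a "half strand" supplies one.
G_PER_RUN = 4
RUNS_PER_BLOCK = 4


def guanines_from_strands(full: int, half: int) -> int:
    """Total guanines supplied by full strands (two runs) and half strands (one run)."""
    return full * 2 * G_PER_RUN + half * 1 * G_PER_RUN


def guanines_from_blocks(blocks: int) -> int:
    """Total guanines required to build the given number of GQ blocks."""
    return blocks * RUNS_PER_BLOCK * G_PER_RUN


strand_total = guanines_from_strands(full=140, half=20)  # 1120 + 80
block_total = guanines_from_blocks(blocks=75)            # 75 * 16
print(strand_total, block_total, strand_total == block_total)  # 1200 1200 True
```

Under these assumptions the two counts agree, so the reported numbers of strands and blocks are mutually consistent.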

In addition to the number of G-runs in the sequence, multimerization also depends on the length of the G-run (Figure 5C). An instructive example is a study of structural transitions caused by truncations of guanine tracts in the telomere-derived DNA strand d(G4T4G4) [76]. The original sequence and its 5′ truncated analog d(G3T4G4) formed stable dimeric structures that did not undergo conversion into higher-order structures. In contrast, the 3′ truncated sequences d(G3T4G3) and d(G4T4G3) formed a mixture of dimeric, trimeric, and tetrameric GQs in a concentration-dependent manner. This structural polymorphism can be explained by varying stabilities of the dimeric structures formed by the reference and the various truncated sequences. The reference d(G4T4G4)2 dimer contains sixteen guanines which form four G-tetrads, and is therefore more stable than the three-tetrad dimers formed by truncated sequences. A bimolecular structure formed from two d(G3T4G4) strands is special in that, in contrast to the antiparallel dimers formed by d(G3T4G3) and d(G4T4G3), it contains three parallel strands. This specific strand orientation stabilizes the d(G3T4G4)2 dimer and thereby prevents it from rearranging into higher-order structures.

A third feature of G-runs that can affect multimerization is the presence of bulges (interruptions of G-runs by non-G nucleotides) (Figure 5D). Despite bulges being tolerated by various GQs, they can greatly influence structural stability. The extent to which GQ stability is reduced depends on a number of factors, such as the location of the bulges within the GQ sequence, the context of the sequence, and the overall GQ topology [29]. The introduction of bulges does not necessarily result in structural changes, and in some cases they do not prevent monomeric GQs from forming. On the other hand, GQs with reduced stability are prone to structural reorganizations, including higher-order structure formation. Bulged nucleotides as drivers of multimerization were systematically investigated in a recent study that analyzed dimerization and tetramerization as a function of guanine substitutions in G-runs [18]. The reference sequence used in this study, d(G3TG3AAG3TG3A), folds into a parallel three-tetrad GQ, with nucleotides 2, 6, 11, and 15 forming the central tetrad (Figure 7A). Despite containing an exposed 5′ tetrad, only monomers were observed when versions of this sequence which contained either a 5′ hydroxyl or a 5′ phosphate group were analyzed on native gels. Dimerization and tetramerization were induced by introducing substitutions of guanines for other nucleotides at certain positions in the central tetrad. Bulged G-runs prevented these sequences from forming stable intramolecular GQs by themselves and increased their propensity to form multimeric structures. Sequences containing substitutions at position 2, 6, or both were proposed to form intertwined dimers with a core of three canonical G-tetrads at the 3′ end (Figure 7B). Sequences with substitutions at positions 11, 15, or both were instead proposed to form intertwined dimers containing a core of three canonical G-tetrads at the 5′ end (Figure 7C). Such sequences also formed tetramers which, in accord with the most favorable mode of stacking, were proposed to consist of two dimers stacked in a 5′ to 5′ orientation (Figure 7C). Intertwined dimers formed from sequences with other possible combinations of substitutions (2 and 11, 2 and 15, 6 and 11, 6 and 15) would only contain two stacked G-tetrads [18]. This structural arrangement cannot effectively stabilize the structures, which probably explains why these sequences did not also multimerize.
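The numbering of the central tetrad (positions 2, 6, 11, and 15) follows directly from the positions of the four G-runs in the reference sequence. The sketch below illustrates this under one assumption: that in a parallel-stranded GQ the k-th guanine of every run lies in the same tetrad layer. The function names are hypothetical, and the code expects a sequence with at least one G-run.

```python
import re

REFERENCE = "GGGTGGGAAGGGTGGGA"  # reference sequence given in the text


def g_runs(seq: str, min_len: int = 3):
    """Return 1-based inclusive (start, end) coordinates of each G-run."""
    return [(m.start() + 1, m.end()) for m in re.finditer(r"G{%d,}" % min_len, seq)]


def tetrad_layers(seq: str):
    """Group the k-th guanine of every run into one layer.

    Assumes a parallel-stranded fold in which all runs point the same way.
    """
    runs = g_runs(seq)
    depth = min(end - start + 1 for start, end in runs)
    return [[start + k for start, _ in runs] for k in range(depth)]


layers = tetrad_layers(REFERENCE)
print(g_runs(REFERENCE))  # [(1, 3), (5, 7), (10, 12), (14, 16)]
print(layers[1])          # [2, 6, 11, 15]
```

The middle layer reproduces the central tetrad positions used in the substitution study, so positions 2 and 6 sit in the two 5′ G-runs and positions 11 and 15 in the two 3′ G-runs.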

Figure 7. Mutations in tetrads can induce GQ multimerization. (a) Secondary structure of a GQ we are studying in our group with the sequence GGGTGGGAAGGGTGGGA. We previously generated a library containing all possible mutations in the central tetrad in this structure (at positions 2, 6, 11, and 15) and tested these variants for a series of biochemical activities associated with GQs [18,77–79]. (b) Proposed secondary structure of dimers formed by variants containing mutations at positions 2, 6, or both in the central tetrad of the reference GQ. The 5′ part of this structure, which contains the mutated nucleotides, is represented by a wavy black line. (c) Proposed secondary structure of tetramers formed by variants containing mutations at positions 11, 15, or both in the central tetrad of the reference GQ. The 3′ part of this structure, which contains the mutated nucleotides, is represented by a wavy black line.


GQ sequences can be designed to begin and/or end with G-runs or they can contain overhanging non-G nucleotides at the 5′ and/or 3′ ends (Figure 5E). Flanking nucleotides often inhibit stacking interactions by sterically hindering formation of interfaces by terminal tetrads [57,80]. In some cases even the introduction of a 5′ phosphate group is sufficient to inhibit 5′–5′ stacking [73]. This knowledge can be used when probing multimerization mechanisms: for multimers stacked in a 5′ to 5′ orientation, addition of flanking nucleotides at the 5′ end should interfere with the stacking interaction, while addition of flanking nucleotides to the 3′ end should not [18]. Nevertheless, some studies report the stacking of GQ subunits which contain overhanging nucleotides. In these structures, flanking nucleotides either radiate out from the GQ axis or become a part of various unusual structures such as thymine triads [81], guanine–uridine octads [82] or guanine–cytidine octads [60].

The role of loop nucleotides in multimeric GQ assembly has received relatively little attention from researchers, and most studies investigating multimeric GQs have not analyzed the effects of loop length and sequence composition. Despite this, examples in which loop nucleotides affect GQ multimerization have been described. Loop length can indirectly influence multimerization by affecting GQ topology (Figure 5F). Sequences bearing at least one single-nucleotide loop tend to adopt a parallel-stranded orientation, whereas longer loops increase the likelihood that sequences will fold into mixed-strand or antiparallel GQs [83]. As discussed above, this can affect the ability of GQs to interact by the first mode of multimerization: the propeller loops of parallel-stranded GQs do not generally affect their ability to form multimers through stacking on their terminal tetrads, whereas the lateral and diagonal loops of antiparallel GQs sometimes do.

Loop sequence can also play an important role in GQ structure formation. One mechanism by which this can impact the global structure of the GQ is by formation of noncanonical extended G-tetrads. For example, d(GGA)4 strands assemble into dimers by stacking on a guanine–adenine heptad interface formed by two monomeric subunits. Three loop adenines provide hydrogen bond donors and acceptors in the heptad plane and hence play an indispensable role in dimerization [64]. Moreover, several studies have shown that mutating nucleotides not thought to directly interact with tetrads can lead to major changes in the GQ structure. For example, changing the loops in a human telomere-derived sequence from TTA to AAA abolishes GQ formation [84]. Recent findings in our lab also show that mutations in loops can affect the propensity of sequences to form multimeric GQs [77,78].

In summary, these studies indicate that the main sequence features affecting multimerization are the number and length of the G-runs, the sequence composition of the G-runs, the presence of flanking nucleotides at the 5′ and 3′ ends of the GQ, and the length and sequence of loop nucleotides (Figure 5). Despite much being understood about the sequence requirements of multimeric GQs, it is generally not possible to predict higher-order GQ topology from sequence. Instead, experimental data from low-resolution methods such as circular dichroism, dimethylsulfate footprinting, native PAGE, mass spectrometry, and analytical ultracentrifugation are used in combination with high-resolution techniques such as NMR and X-ray crystallography.

4. Potential Biological Roles of Multimeric G-Quadruplexes at the Tips of Telomeres

The first discussions of biologically relevant higher-order GQ structures emerged as the length and sequence composition of telomeres were being characterized [85]. In the majority of eukaryotes, the nucleotide sequence of telomeres consists of a G-rich strand running 5′ to 3′ toward the end of the chromosome and a complementary C-rich strand. For most of their length, telomeres are double-stranded, except for a single-stranded G-rich 3′ overhang. While the telomeric sequence is similar in diverse species, the length of the overhang is not. For example, in Tetrahymena the overhang typically consists of tandem repeats of the sequence d(TTGGGG) and is 14 to 21 nucleotides long [86]. On the other hand, in humans the sequence of the overhang is d(TTAGGG) and it ranges in length from 125 to 275 nucleotides [87,88].

While the composition and length of the telomeric overhang have been characterized in many species, its secondary structure is still a matter of debate. The first studies of short telomere-derived synthetic oligonucleotides reported that these sequences folded into several topologically different guanine-rich structures [54–56,89]. Subsequent studies showed that these correspond to GQ structures, although several different topologies have been reported [6,90,91]. Examples include parallel-stranded tetrameric and antiparallel dimeric GQs assembled from short telomeric sequences bearing one and two G-runs, respectively, as well as intramolecular monomeric GQs formed by sequences containing four G-runs.

Because the human telomeric overhang contains multiple repeats of segments with four G-runs, it can potentially fold into a higher-order structure containing a series of monomeric GQs [92]. A single-stranded DNA with several monomeric GQs can adopt at least two different structural arrangements (Figure 8). The first one is termed "beads-on-a-string", and assumes that monomeric GQ units do not interact with one another [93]. A second model proposes that GQ subunits associate with one another through stacking interactions between the terminal tetrads of consecutive GQs to form high-order structures [45,46]. One such arrangement of stacked monomers has recently been described in the context of nontelomeric oligonucleotides containing multiple GQs [94,95]. Despite evidence for both models, a structure in which GQs stack on one another appears to be most likely [92]. Structures consistent with this model have been observed by atomic force and electron microscopy [71,96,97], and methods such as FRET have provided additional evidence for interactions between neighboring GQ subunits [97]. Molecular dynamics simulations also support the idea that telomeric GQ monomers can stack to form higher-order structures [45,46]. On the basis of comparisons between calculated and experimentally measured sedimentation coefficients, it has been proposed that a structure consisting of hybrid rather than all-parallel GQs is most likely [98]. Despite these advances, a high-resolution structure of the telomeric overhang has not yet been reported.

Figure 8. Possible structures of telomeric GQs. (a) Beads-on-a-string model in which telomeric GQs do not interact. (b) Model in which telomeric GQs stack on one another to form higher-order structures. Blue circles represent GQ structures.

In vivo evidence for the presence of GQs at telomeres came from studies reporting the binding of GQ-specific antibodies to the tips of telomeres [11,12,99]. These experiments also showed that telomerase co-localizes with GQ antibodies at chromosomal ends. Telomerase catalyzes telomere elongation and is directly involved in GQ unfolding [100]. Although these studies indicate that telomeric DNA adopts a GQ structure in vivo, they do not distinguish between monomeric and multimeric GQs. While antibodies specific for multimeric GQs have not yet been developed, several small molecules specific for multimeric GQs have recently been reported [51,101–103]. One of these compounds, IZNP-1 (triaryl-substituted imidazole derivative), binds and stabilizes multimeric GQs by intercalating into the cavity between two stacked GQ monomers. IZNP-1-treated cells contain a greater number of BG4 foci (human GQ structure-specific antibody foci) at telomeres than untreated cells. They also exhibit signs of DNA damage and telomere shortening, which can lead to cell cycle arrest, apoptosis, and senescence. Taken together, these findings suggest that multimeric GQs can form at telomeres under certain conditions and contribute to telomere dysfunction [51].


5. Potential Biological Roles of Multimeric G-Quadruplexes in Promoters

Visualization of IZNP-1-induced BG4 foci confirmed that telomeric ends have the ability to form multimeric GQs. Surprisingly, however, the percentage of fluorescent foci at telomeres was relatively low (38%), and the majority of antibodies localized to other genomic regions. One implication of this observation is the possibility that multimeric GQs exist outside telomeric DNA. If so, multimeric GQs might have roles beyond chromosomal capping.

A role for monomeric GQ structures in transcriptional regulation was proposed following the discovery that putative GQ sequences are highly enriched in nuclease hypersensitive regions within promoters [13]. In addition, more than 40% of human genes contain at least one GQ motif in their promoter. While formation of monomeric intramolecular GQs has been investigated for a number of these genes including c-MYC, KRAS, and BCL2 [104–107], only a few examples of genes likely to be regulated by multimeric GQs in their promoters have been identified so far. The most well-characterized example is an unusual tetrad:heptad:heptad:tetrad (T:H:H:T) dimer, generated from a d(GGA) repeat sequence. An oligonucleotide containing four such repeats folds into an intramolecular structure in which a guanine tetrad and a guanine–adenine heptad are stacked on top of one another. The NMR structure indicates that two monomeric subunits stack 5′ to 5′ on the heptad plane, resulting in a dimeric T:H:H:T structure [64] (Figure 9). A similar structure was adopted by an oligonucleotide containing eight copies of the repeat [108]. The d(GGA) repeat sequence occurs in twelve nearly perfect tandem repeats in the c-MYB promoter. In theory, this sequence can give rise to three independent T:H building subunits (with four tandem repeats per building block), which could then stack in different combinations to form dimers. A study investigating this possibility proposed that a T:H:H:T motif formed by stacking interactions between the first and the third building blocks functions as a negative regulator of c-MYB promoter activity. The authors also identified a transcription factor, MAZ, which binds to the T:H:H:T structure and could play a role in the regulation of c-MYB expression [109,110]. The c-MYB promoter is not the only genomic region rich in GGA repeats, and similar motifs have been identified in the regulatory regions of other genes such as NCAM [111], SPARC [112], KRAS [113], and CCNB1IP1 [114]. Although it is tempting to assume that the transcription of genes with multiple d(GGA) repeats in the promoter is regulated by T:H:H:T structures, the results presented in [109] and [110] do not completely rule out the possibility that inhibition of transcription is mediated by monomeric forms of these GQs. Testing the effects of antibodies or small molecules specific for the T:H:H:T structure on the extent of transcriptional inhibition could provide further insight into the structural organization of the d(GGA) repeat region in this promoter.
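The statement that twelve d(GGA) repeats yield three T:H building blocks that can stack "in different combinations" can be enumerated explicitly. The sketch below simply lists the unordered pairs of blocks; it assumes that blocks are numbered 1 to 3 along the promoter and that a dimer involves exactly two blocks, and it says nothing about which pairing is favored in cells.

```python
from itertools import combinations

GGA_REPEATS = 12        # tandem d(GGA) repeats in the c-MYB promoter (from the text)
REPEATS_PER_BLOCK = 4   # repeats per T:H building block (from the text)

# Blocks are numbered along the promoter; the numbering is illustrative.
blocks = list(range(1, GGA_REPEATS // REPEATS_PER_BLOCK + 1))
dimers = list(combinations(blocks, 2))  # unordered pairs of distinct blocks

print(blocks)  # [1, 2, 3]
print(dimers)  # [(1, 2), (1, 3), (2, 3)]
```

Of the three possible pairings, the study described above proposed that the dimer formed by the first and third blocks acts as the negative regulator.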

Figure 9. Structure of a dimeric GQ formed by the sequence (GGA)8. See [64] and [107] for more information about this structure.

Another multimeric GQ structure that could potentially regulate gene expression occurs in the promoter of hTERT, the gene encoding the reverse transcriptase of telomerase. The core promoter of hTERT contains twelve consecutive G-runs and can therefore form multiple GQs. One proposed structural model for this region consists of two stacked GQs separated by a 26 nucleotide loop which contains another four G-runs and adopts a hairpin structure [115]. An alternative model suggests that all of the G-runs in the sequence participate in GQ folding and that the final structure consists of three parallel GQs tightly stacked on top of one another [40,116]. Although the available evidence does not indicate which model is correct, it does support the idea that a G-rich structure in this promoter regulates hTERT expression [115–118].

The hypothesis that multimeric GQs in promoters regulate gene expression is also supported by a study investigating the regulatory sequences of muscle-specific genes [119]. The promoter regions of these genes contain a high frequency of G-runs that form both homodimeric and heterodimeric structures in vitro. One example is the ITGA7 promoter region, which contains two G-rich sequences, each capable of forming an intramolecular monomeric GQ, separated from one another by 85 nucleotides. These two G-rich sequences formed a heterodimeric structure when analyzed on native gels. This structure is hypothesized to play a role in the regulation of mouse myogenic gene expression. Consistent with this hypothesis, the myogenic determination protein MyoD binds the heterodimeric structure but not a monomeric GQ formed by one of the heterodimer-forming sequences. MyoD exhibited a similar binding preference for a homodimeric GQ formed by a G-rich region in the promoter of the sMtCK gene. Although this G-rich sequence occurs only once in the sMtCK promoter region and therefore cannot assemble into a homodimer in vivo, several G-rich clusters are present in adjacent genomic regions. It would therefore be of interest to determine whether a heterodimeric GQ structure can form in this region and is bound by MyoD [119].
Taken together, these data indicate that multimeric GQ structures can form in the promoter regions of several genes and are consistent with the idea that such structures can regulate gene expression. Many additional promoters contain multiple G-runs. For example, genes containing more than eight G-tracts include TRIM13 [67], c-MYC [105,106], and c-KIT [120,121]. In addition, several viral promoters containing multiple sequences with the potential to form GQs have recently been reported [122,123]. Although these are candidates for genes regulated by multimeric GQs, our current understanding of GQ folding precludes reliable prediction of higher-order GQ structure from sequence. As discussed above, the development of antibodies or small molecules that selectively stabilize or destabilize specific multimeric GQ structures should greatly facilitate analysis of their potential biological roles in promoters.

6. Potential Biological Roles of RNA–DNA Hybrid G-Quadruplexes

In addition to the multimeric DNA GQs discussed so far, multimeric GQs can also form from combinations of DNA and RNA strands [124]. These structures are called DNA:RNA hybrid GQs. They can be readily assembled in vitro, and also form during transcription when guanines in the nontemplate DNA strands of genes interact with guanines in RNA molecules encoded by these genes using the second mode of multimerization (Figure 10). DNA:RNA hybrid GQs inhibit transcription, presumably by impeding the progress of RNA polymerase along the DNA template. From the perspective of gene regulation, this is a compelling mechanism because it can act as a negative feedback loop to turn transcription off when mRNA levels are sufficiently high (Figure 1A).

The first hint that DNA:RNA hybrid GQs act as regulatory elements came from analysis of a region in the human mitochondrial genome called conserved sequence block II (CSB II) [125]. A G-rich element in CSB II induces transcriptional termination and promotes formation of an unusually stable structure in the template DNA. The stability of this structure suggested that it was not a conventional R-loop, in which an RNA transcript and the DNA template encoding it are held together by Watson–Crick base pairing. Instead, it was proposed that this structure was a GQ containing both DNA and RNA strands. Consistent with this hypothesis, a structure was detected by native PAGE that required for its formation the presence of both a DNA oligonucleotide corresponding to the G-rich strand of the template and an RNA oligonucleotide corresponding to the G-rich portion of the RNA transcript. Formation of this structure was inhibited when guanine bases in the DNA strand were replaced by 7-deazaguanine or adenosine, both of which normally prevent GQ formation. The circular dichroism spectrum of this structure was consistent with that of a GQ, and it was resistant to ribonucleases A and H, both of which degrade RNA strands in canonical DNA:RNA duplexes. Additional characterization using DMS fingerprinting, which can distinguish guanines in tetrads from those in single or double-stranded DNA due to their reduced sensitivity to DMS modification, and Zn-TTAPc-mediated photocleavage, which preferentially cleaves GQs, provided additional evidence that the structure is a DNA:RNA hybrid GQ [126].

Similar methods were used to show that a DNA:RNA hybrid GQ also forms in the human NRAS gene [127]. As was the case for the CSB II motif, the structure in the NRAS gene impeded T7 RNA polymerase during in vitro transcription reactions. It also inhibited expression of luciferase from plasmids transfected into cultured human cells. Only two stretches of guanines in the nontemplate strand of the gene are needed to form the DNA:RNA hybrid GQ, with the other two provided by the RNA transcript encoded by the template strand. DNA:RNA hybrid GQs containing two rather than three tetrads can also form, although they are less stable than those containing three tetrads [128]. Bioinformatic studies indicate that sequences with the potential to form DNA:RNA hybrid GQs occur in 97% of human genes [129]. These motifs are enriched in and near promoters, especially downstream of the transcription start site on the nontemplate strand, and this pattern of enrichment is evolutionarily conserved in mammals [129]. Taken together, these studies support the idea that DNA:RNA hybrid GQs represent an important regulatory element in higher eukaryotes.
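The observation that only two G-tracts on the nontemplate strand are needed for a hybrid GQ, compared with four for an intramolecular DNA GQ, can be expressed as a simple classification rule. The Python sketch below is a hypothetical illustration: the function name, the category labels, and the thresholds are assumptions, and the actual search criteria used in the bioinformatic study [129] are not given in this review. The `min_g` parameter can be lowered to 2 to include the less stable two-tetrad hybrid GQs.

```python
import re


def classify_nontemplate(seq: str, min_g: int = 3) -> str:
    """Classify a nontemplate-strand sequence by its number of G-tracts.

    Assumed rule of thumb (not a published algorithm):
      >= 4 G-tracts: DNA alone could supply all four strands of a GQ
      2-3 G-tracts:  a GQ would need G-tracts from the RNA transcript
      < 2 G-tracts:  no GQ expected
    """
    tracts = len(re.findall(r"G{%d,}" % min_g, seq.upper()))
    if tracts >= 4:
        return "dna_or_hybrid"
    if tracts >= 2:
        return "hybrid_only"
    return "none"


if __name__ == "__main__":
    print(classify_nontemplate("GGGGA" * 4))       # dna_or_hybrid
    print(classify_nontemplate("GGGAGGGT"))        # hybrid_only
    print(classify_nontemplate("GGAGG", min_g=2))  # hybrid_only
    print(classify_nontemplate("GGAGG"))           # none
```

A rule of this kind only flags sequences with the potential to form a hybrid GQ; it says nothing about whether the structure forms during transcription.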

Figure 10. Model for the regulation of transcription by DNA:RNA hybrid GQs, formed between the noncoding strands of G-rich genes and RNA molecules transcribed from these genes. The newly synthesized RNA transcript is shown in purple. See [130] for more information about this model.

Additional studies have explored the mechanism of formation of DNA:RNA hybrid GQs during transcription. Zhang and colleagues developed a method to distinguish R-loops from DNA:RNA hybrid GQs on the basis of differences in the sensitivities of these structures to ribonucleases such as RNase A [130]. This was used to show that conventional R-loops form prior to DNA:RNA hybrid GQs during in vitro transcription of a template containing the CSB II motif [130]. Formation of DNA:RNA hybrid GQs was also inhibited by the presence of a C-rich oligonucleotide complementary to the G-rich region of both the nontemplate strand and the nascent RNA transcript. Based on these experiments, it was proposed that a canonical R-loop is formed in the first round of transcription of the CSB II motif. After the RNA strand in the R-loop is displaced by RNA polymerase in the next round of transcription, it forms a DNA:RNA hybrid GQ with the nontemplate strand and prevents further transcription (Figure 10). Real-time FRET studies indicate that R-loops also stabilize DNA:RNA hybrid GQs [131], probably because they prevent the nontemplate strand from forming a duplex with the template strand. The mechanical strength of DNA:RNA hybrid GQs has also been investigated using optical tweezers [132]. These experiments used a template containing the sequence (GGGGA)4, which occurs downstream of the transcription start site on the nontemplate strand in several hundred human genes [132]. Transcription reactions were performed in the absence of CTP, which caused the polymerase to stall 15 nucleotides downstream of the GQ at the position of the first G in the template. Stalled complexes were tethered between beads and stretched to determine their mechanical stabilities. This showed that DNA:RNA hybrid GQs are more stable and form more readily on the nontemplate strand than DNA GQs. 
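The stalling strategy used in the optical tweezers experiments follows from base pairing: the RNA transcript has the same sequence as the nontemplate strand (with U in place of T), so in the absence of CTP the polymerase stops at the first C in the nontemplate strand, which is the first G in the template. The Python sketch below locates that position relative to a (GGGGA)4 motif. The function name and the flanking sequence are hypothetical and were chosen only so that the stall site falls 15 nucleotides downstream of the motif; they are not the sequences used in [132].

```python
from typing import Optional

MOTIF = "GGGGA" * 4  # (GGGGA)4 on the nontemplate strand


def stall_offset(nontemplate: str, motif: str = MOTIF) -> Optional[int]:
    """Return the 1-based distance from the end of the motif to the first C.

    The first C downstream of the motif on the nontemplate strand is where
    a polymerase lacking CTP would stall. Returns None if the motif or a
    downstream C is absent. Any C upstream of the motif is ignored here
    (simplifying assumption).
    """
    seq = nontemplate.upper()
    start = seq.find(motif)
    if start == -1:
        return None
    motif_end = start + len(motif)  # index of first nucleotide after motif
    first_c = seq.find("C", motif_end)
    if first_c == -1:
        return None
    return first_c - motif_end + 1


if __name__ == "__main__":
    # Hypothetical test sequence: 14 C-free nucleotides, then a C.
    toy = "ATA" + MOTIF + "ATTAGATTAGATTA" + "CTTT"
    print(stall_offset(toy))  # 15
```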
Recent results from our laboratory raise the possibility that formation of DNA:RNA hybrid GQs can be regulated by biologically important small molecules [78]. These experiments used a mutant GQ previously shown by our group to bind GTP [79,133] and to form multimers [18]. When folded in the absence of GTP, this sequence forms multimeric GQs as well as monomers that are probably unfolded. When folded in the presence of physiological concentrations of GTP, however, the monomeric form of the GQ is stabilized and formation of multimers is suppressed [78]. NMR studies indicate that the GTP ligand is incorporated into a tetrad, and together with other data support a model in which GTP is incorporated into a cavity in the central tetrad of the monomer created by a G to A mutation. Hundreds of examples of sequences with the potential to form GTP-dependent structures were identified in the human genome, some of which were evolutionarily conserved in primates. Ongoing experiments in our group are investigating possible links between GTP-dependent formation of DNA:RNA hybrid GQs and transcriptional regulation.

7. Conclusions

GQs can form a variety of multimeric structures, ranging in size from dimers to G-wires. Most contain interfaces formed by either stacked tetrads in adjacent GQ monomers (first mode of multimerization) or guanines from two or more nucleic acid strands that hydrogen bond to form a G-tetrad (second mode of multimerization). Some progress has been made in understanding the sequence requirements of multimeric GQs. For example, overhanging nucleotides often inhibit multimerization by interfering with the stacking of tetrads, while mutations in tetrads can promote multimerization by destabilizing the monomeric form of the structure. Despite this, the existence of unusual folds such as intertwined dimers formed from sequences containing four stretches of guanines highlights our inability to predict the structures of multimeric GQs from primary sequence.

The propensity of GQs to multimerize in vitro, the high concentrations of GQs in eukaryotic cells, and the advantages of multimerization as a regulatory mechanism raise the possibility that GQ multimerization could be biologically important. Although this hypothesis is only starting to be explored, several lines of evidence suggest that higher-order GQ structures form in both telomeres and promoters. Furthermore, recent studies demonstrate that DNA:RNA hybrid GQs formed between the nontemplate strands of genes and the RNA transcripts they encode inhibit transcription, and suggest that this could be an important regulatory mechanism in higher eukaryotes. We anticipate that future progress in defining the biological roles of multimeric GQs will require the development of tools analogous to those used to characterize monomeric GQs. Examples include bioinformatic models that can identify sequences with the potential to form multimeric GQs in sequenced genomes, antibodies and small molecules specific for different types of multimeric GQs, techniques to visualize multimeric GQs in cells, and high-throughput methods to map multimeric GQs in genomic DNA (including those containing both DNA and RNA strands). The development and application of such tools have the potential to provide a wealth of new information about multimeric GQs, especially with respect to their potential biological roles.

Author Contributions: Conceptualization, E.A.C.; Writing—Original Draft Preparation, S.K. and E.A.C.; Writing—Review & Editing, S.K. and E.A.C.; Visualization, S.K. and E.A.C.; Supervision, E.A.C.; Project Administration, E.A.C.; Funding Acquisition, E.A.C.

Funding: This work was supported by an IOCB start-up grant awarded to E.A.C. and “Chemical biology for drugging undruggable targets (ChemBioDrug)” No. CZ.02.1.01/0.0/0.0/16_019/0000729 from the European Regional Development Fund (OP RDE).

Acknowledgments: We thank Václav Veverka and colleagues at the IOCB for helpful discussions.

Conflicts of Interest: The authors declare no conflict of interest.

References

1. Rich, A. DNA comes in many forms. Gene 1993, 135, 99–109.
2. Hu, Y.; Cecconello, A.; Idili, A.; Ricci, F.; Willner, I. Triplex DNA nanostructures: From basic properties to applications. Angew. Chem. Int. Ed. Engl. 2017, 56, 15210–15233.
3. Davis, J.T. G-quartets 40 years later: From 5′-GMP to molecular biology and supramolecular chemistry. Angew. Chem. Int. Ed. Engl. 2004, 43, 668–698.
4. Day, H.A.; Pavlou, P.; Waller, Z.A. I-Motif DNA: Structure, stability and targeting with ligands. Bioorg. Med. Chem. 2014, 22, 4407–4418.
5. Burge, S.; Parkinson, G.N.; Hazel, P.; Todd, A.K.; Neidle, S. Quadruplex DNA: Sequence, topology and structure. Nucleic Acids Res. 2006, 34, 5402–5415.
6. Phan, A.T. Human telomeric G-quadruplex: Structures of DNA and RNA sequences. FEBS J. 2010, 277, 1107–1117.
7. Yatsunyk, L.A.; Mendoza, O.; Mergny, J.L. “Nano-oddities”: Unusual nucleic acid assemblies for DNA-based nanostructures and nanodevices. Acc. Chem. Res. 2014, 47, 1836–1844.
8. Mergny, J.L.; Sen, D. DNA quadruple helices in nanotechnology. Chem. Rev. 2019, 119, 6290–6325.
9. Huppert, J.L.; Balasubramanian, S. Prevalence of quadruplexes in the human genome. Nucleic Acids Res. 2005, 33, 2908–2916.
10. Todd, A.K.; Johnston, M.; Neidle, S. Highly prevalent putative quadruplex sequence motifs in human DNA. Nucleic Acids Res. 2005, 33, 2901–2907.
11. Lam, E.Y.N.; Beraldi, D.; Tannahill, D.; Balasubramanian, S. G-quadruplex structures are stable and detectable in human genomic DNA. Nat. Commun. 2013, 4, 1796.
12. Biffi, G.; Tannahill, D.; McCafferty, J.; Balasubramanian, S. Quantitative visualization of DNA G-quadruplex structures in human cells. Nat. Chem. 2013, 5, 182–186.
13. Huppert, J.L.; Balasubramanian, S. G-quadruplexes in promoters throughout the human genome. Nucleic Acids Res. 2007, 35, 406–413.
14. Kendrick, S.; Hurley, L.H. The role of G-quadruplex/i-motif secondary structures as cis-acting regulatory elements. Pure Appl. Chem. 2010, 82, 1609–1621.
15. Rhodes, D.; Lipps, H.J. G-quadruplexes and their regulatory roles in biology. Nucleic Acids Res. 2015, 43, 8627–8637.
16. Chambers, V.S.; Marsico, G.; Boutell, J.M.; Di Antonio, M.; Smith, G.P.; Balasubramanian, S. High-throughput sequencing of DNA G-quadruplex structures in the human genome. Nat. Biotechnol. 2015, 33, 877–881.
17. Fujioka, A.; Terai, K.; Itoh, R.E.; Aoki, K.; Nakamura, T.; Kuroda, S.; Nishida, E.; Matsuda, M. Dynamics of the Ras/ERK MAPK cascade as monitored by fluorescent probes. J. Biol. Chem. 2006, 281, 8917–8926.
18. Kolesnikova, S.; Hubálek, M.; Bednárová, L.; Cvačka, J.; Curtis, E.A. Multimerization rules for G-quadruplexes. Nucleic Acids Res. 2017, 45, 8684–8696.
19. Traut, T.W. Dissociation of enzyme oligomers: A mechanism for allosteric regulation. Crit. Rev. Biochem. Mol. Biol. 1994, 29, 125–163.
20. Goodsell, D.S.; Olson, A.J. Structural symmetry and protein function. Annu. Rev. Biophys. Biomol. Struct. 2000, 29, 105–153.
21. Amoutzias, G.D.; Robertson, D.L.; Van de Peer, Y.; Oliver, S.G. Choose your partners: Dimerization in eukaryotic transcription factors. Trends Biochem. Sci. 2008, 33, 220–229.
22. Matthews, J.M.; Sunde, M. Dimers, oligomers, everywhere. Adv. Exp. Med. Biol. 2012, 747, 1–18.
23. Watson, J.D.; Crick, F.H.C. Molecular structure of nucleic acids: A structure for deoxyribose nucleic acid. Nature 1953, 171, 737–738.
24. Yakovchuk, P.; Protozanova, E.; Frank-Kamenetskii, M.D. Base-stacking and base-pairing contributions into thermal stability of the DNA double helix. Nucleic Acids Res. 2006, 34, 564–574.
25. Fire, A.; Xu, S.; Montgomery, M.K.; Kostas, S.A.; Driver, S.E.; Mello, C.C. Potent and specific genetic interference by double-stranded RNA in Caenorhabditis elegans. Nature 1998, 391, 806–811.
26. Smirnov, I.; Shafer, R.H. Effect of loop sequence and size on DNA aptamer stability. Biochemistry 2000, 39, 1462–1468.
27. Bourdoncle, A.; Estevez Torres, A.; Gosse, C.; Lacroix, L.; Vekhoff, P.; Le Saux, T.; Jullien, L.; Mergny, J.L. Quadruplex-based molecular beacons as tunable DNA probes. J. Am. Chem. Soc. 2006, 128, 11094–11105.
28. Guedin, A.; Gros, J.; Alberti, P.; Mergny, J.L. How long is too long? Effects of loop size on G-quadruplex stability. Nucleic Acids Res. 2010, 38, 7858–7868.
29. Mukundan, V.T.; Phan, A.T. Bulges in G-quadruplexes: Broadening the definition of G-quadruplex-forming sequences. J. Am. Chem. Soc. 2013, 135, 5017–5028.
30. Cheong, C.; Moore, P.B. Solution structure of an unusually stable RNA tetraplex containing G- and U-quartet structures. Biochemistry 1992, 31, 8406–8414.
31. Patel, P.K.; Hosur, R.V. NMR observation of T-tetrads in a parallel stranded DNA quadruplex formed by Saccharomyces cerevisiae telomere repeats. Nucleic Acids Res. 1999, 27, 2457–2464.
32. Patel, P.K.; Bhavesh, N.S.; Hosur, R.V. NMR observation of a novel C-tetrad in the structure of the SV40 repeat sequence GGGCGG. Biochem. Biophys. Res. Commun. 2000, 270, 967–971.
33. Zhang, N.; Gorin, A.; Majumdar, A.; Kettani, A.; Chernichenko, N.; Skripkin, E.; Patel, D.J. Dimeric DNA quadruplex containing major groove-aligned A-T-A-T and G-C-G-C tetrads stabilized by inter-subunit Watson-Crick A-T and G-C pairs. J. Mol. Biol. 2001, 312, 1073–1088.
34. Pan, B.; Xiong, Y.; Shi, K.; Deng, J.; Sundaralingam, M. Crystal structure of an RNA purine-rich tetraplex containing adenine tetrads: Implications for specific binding in RNA tetraplexes. Structure 2003, 11, 815–823.
35. Kimura, T.; Xu, Y.; Komiyama, M. Human telomeric RNA r(UAGGGU) sequence forms parallel tetraplex structure with U-quartet. Nucleic Acids Symp. Ser. (Oxf.) 2009, 53, 239–240.
36. Virgilio, A.; Esposito, V.; Citarella, G.; Mayol, L.; Galeone, A. Structural investigations on the anti-HIV G-quadruplex-forming oligonucleotide TGGGAG and its analogues: Evidence for the presence of an A-tetrad. ChemBioChem 2012, 13, 2219–2224.
37. Huang, H.; Suslov, N.B.; Li, N.S.; Shelke, S.A.; Evans, M.E.; Koldobskaya, Y.; Rice, P.A.; Piccirilli, J.A. A G-quadruplex-containing RNA activates fluorescence in a GFP-like fluorophore. Nat. Chem. Biol. 2014, 10, 686–691.
38. Nasiri, A.H.; Wurm, J.P.; Immer, C.; Weickhmann, A.K.; Wöhnert, J. An intermolecular G-quadruplex as the basis for GTP recognition in the class V-GTP aptamer. RNA 2016, 22, 1750–1759.
39. Trachman, R.J., III; Demeshkina, N.A.; Lau, M.W.L.; Panchapakesan, S.S.S.; Jeng, S.C.Y.; Unrau, P.J.; Ferré-D’Amaré, A.R. Structural basis for high-affinity fluorophore binding and activation by RNA mango. Nat. Chem. Biol. 2017, 13, 807–813.
40. Chaires, J.B.; Trent, J.O.; Gray, R.D.; Dean, W.L.; Buscaglia, R.; Thomas, S.D.; Miller, D.M. An improved model for the hTERT promoter quadruplex. PLoS ONE 2014, 9, e115580.
41. Do, N.Q.; Lim, K.W.; Teo, M.H.; Heddi, B.; Phan, A.T. Stacking of G-quadruplexes: NMR structure of a G-rich oligonucleotide with potential anti-HIV and anticancer activity. Nucleic Acids Res. 2011, 39, 9448–9457.
42. Kato, Y.; Ohyama, T.; Mita, H.; Yamamoto, Y. Dynamics and thermodynamics of dimerization of parallel G-quadruplexed DNA formed from d(TTAGn) (n = 3–5). J. Am. Chem. Soc. 2005, 127, 9980–9981.
43. Lech, C.J.; Heddi, B.; Phan, A.T. Guanine base stacking in G-quadruplex nucleic acids. Nucleic Acids Res. 2013, 41, 2034–2046.
44. Smargiasso, N.; Rosu, F.; Hsia, W.; Colson, P.; Baker, E.S.; Bowers, M.T.; De Pauw, E.; Gabelica, V. G-Quadruplex DNA assemblies: Loop length, cation identity, and multimer formation. J. Am. Chem. Soc. 2008, 130, 10208–10216.
45. Haider, S.M.; Parkinson, G.N.; Neidle, S. Molecular dynamics and principal components analysis of human telomeric quadruplex multimers. Biophys. J. 2008, 95, 296–311.
46. Petraccone, L.; Trent, J.O.; Chaires, J.B. The tail of the telomere. J. Am. Chem. Soc. 2008, 130, 16530–16532.
47. Monchaud, D.; Teulade-Fichou, M.P. A hitchhiker’s guide to G-quadruplex ligands. Org. Biomol. Chem. 2008, 6, 627–636.
48. Bai, L.P.; Hagihara, M.; Jiang, Z.H.; Nakatani, K. Ligand binding to tandem G quadruplexes from human telomeric DNA. ChemBioChem 2008, 9, 2583–2587.
49. Haider, S.M.; Neidle, S.; Parkinson, G.N. A structural analysis of G-quadruplex/ligand interactions. Biochimie 2011, 93, 1239–1251.
50. Zhang, S.; Wu, Y.; Zhang, W. G-quadruplex structures and their interaction diversity with ligands. ChemMedChem 2014, 9, 899–911.
51. Hu, M.H.; Chen, S.B.; Wang, B.; Ou, T.M.; Gu, L.Q.; Tan, J.H.; Huang, Z.S. Specific targeting of telomeric multimeric G-quadruplexes by a new triaryl-substituted imidazole. Nucleic Acids Res. 2017, 45, 1606–1618.
52. Funke, A.; Karg, B.; Dickerhoff, J.; Balke, D.; Müller, S.; Weisz, K. Ligand-induced dimerization of a truncated parallel MYC G-Quadruplex. ChemBioChem 2018, 19, 505–512.
53. Krishnan-Ghosh, Y.; Liu, D.; Balasubramanian, S. Formation of an interlocked quadruplex dimer by d(GGGT). J. Am. Chem. Soc. 2004, 126, 11009–11016.
54. Sen, D.; Gilbert, W. Formation of parallel four-stranded complexes by guanine-rich motifs in DNA and its implications for meiosis. Nature 1988, 334, 65–70.
55. Sundquist, W.I.; Klug, A. Telomeric DNA dimerizes by formation of guanine tetrads between hairpin loops. Nature 1989, 342, 825–829.
56. Williamson, J.R.; Raghuraman, M.K.; Cech, T.R. Monovalent cation-induced structure of telomeric DNA: The G-quartet model. Cell 1989, 59, 871–880.
57. Sen, D.; Gilbert, W. Novel DNA superstructures formed by telomere-like oligomers. Biochemistry 1992, 31, 65–70.
58. Laughlan, G.; Murchie, A.I.; Norman, D.G.; Moore, M.H.; Moody, P.C.; Lilley, D.M.; Luisi, B. The high-resolution crystal structure of a parallel-stranded guanine tetraplex. Science 1994, 265, 520–524.
59. Phillips, K.; Dauter, Z.; Murchie, A.I.; Lilley, D.M.; Luisi, B. The crystal structure of a parallel-stranded guanine tetraplex at 0.95 Å resolution. J. Mol. Biol. 1997, 273, 171–182.
60. Borbone, N.; Amato, J.; Oliviero, G.; D’Atri, V.; Gabelica, V.; De Pauw, E.; Piccialli, G.; Mayol, L. d(CGGTGGT) forms an octameric parallel G-quadruplex via stacking of unusual G(:C):G(:C):G(:C):G(:C) octads. Nucleic Acids Res. 2011, 39, 7848–7857.
61. D’Atri, V.; Borbone, N.; Amato, J.; Gabelica, V.; D’Errico, S.; Piccialli, G.; Mayol, L.; Oliviero, G. DNA-based nanostructures: The effect of the base sequence on octamer formation from d(XGGYGGT) tetramolecular G-quadruplexes. Biochimie 2014, 99, 119–128.
62. Kuryavyi, V.; Cahoon, L.A.; Seifert, H.S.; Patel, D.J. RecA-binding pilE G4 sequence essential for pilin antigenic variation forms monomeric and 5′ end-stacked dimeric parallel G-quadruplexes. Structure 2012, 20, 2090–2102.
63. Adrian, M.; Ang, D.J.; Lech, C.J.; Heddi, B.; Nicolas, A.; Phan, A.T. Structure and conformational dynamics of a stacked dimeric G-quadruplex formed by the human CEB1 minisatellite. J. Am. Chem. Soc. 2014, 136, 6297–6305.
64. Matsugami, A.; Ouhashi, K.; Kanagawa, M.; Liu, H.; Kanagawa, S.; Uesugi, S.; Katahira, M. An intramolecular quadruplex of (GGA)(4) triplet repeat DNA with a G:G:G:G tetrad and a G(:A):G(:A):G(:A):G heptad, and its dimeric interaction. J. Mol. Biol. 2001, 313, 255–269.
65. Kuryavyi, V.; Phan, A.T.; Patel, D.J. Solution structures of all parallel-stranded monomeric and dimeric G-quadruplex scaffolds of the human c-kit2 promoter. Nucleic Acids Res. 2010, 38, 6757–6773.
66. Trajkovski, M.; da Silva, M.W.; Plavec, J. Unique structural features of interconverting monomeric and dimeric G-quadruplexes adopted by a sequence from the intron of the N-myc gene. J. Am. Chem. Soc. 2012, 134, 4132–4141.
67. Wei, D.; Todd, A.K.; Zloh, M.; Gunaratnam, M.; Parkinson, G.N.; Neidle, S. Crystal structure of a promoter sequence in the B-raf gene reveals an intertwined dimer quadruplex. J. Am. Chem. Soc. 2013, 135, 19319–19329.
68. Chen, Y.; Yang, D. Sequence, stability, and structure of G-quadruplexes and their interactions with drugs. Curr. Protoc. Nucleic Acid Chem. 2012, 50, 1–17.
69. Yaku, H.; Murashima, T.; Tateishi-Karimata, H.; Nakano, S.; Miyoshi, D.; Sugimoto, N. Study on effects of molecular crowding on G-quadruplex-ligand binding and ligand-mediated telomerase inhibition. Methods 2013, 64, 19–27.
70. Bhattacharyya, D.; Mirihana Arachchilage, G.; Basu, S. Metal cations in G-quadruplex folding and stability. Front. Chem. 2016, 4, 38.
71. Kar, A.; Jones, N.; Arat, N.Ö.; Fishel, R.; Griffith, J.D. Long repeating (TTAGGG)n single-stranded DNA self-condenses into compact beaded filaments stabilized by G-quadruplex formation. J. Biol. Chem. 2018, 293, 9473–9485.
72. Podbevsek, P.; Plavec, J. KRAS promoter oligonucleotide with decoy activity dimerizes into a unique topology consisting of two G-quadruplex units. Nucleic Acids Res. 2016, 44, 917–925.
73. Marsh, T.C.; Henderson, E. G-wires: Self-assembly of a telomeric oligonucleotide, d(GGGGTTGGGG), into large superstructures. Biochemistry 1994, 33, 10718–10724.
74. Bose, K.; Lech, C.J.; Heddi, B.; Phan, A.T. High-resolution AFM structure of DNA G-wires in aqueous solution. Nat. Commun. 2018, 9, 1959.
75. Hessari, N.M.; Spindler, L.; Troha, T.; Lam, W.C.; Drevensek-Olenik, I.; da Silva, M.W. Programmed self-assembly of a quadruplex DNA nanowire. Chemistry 2014, 20, 3626–3630.
76. Guo, X.; Liu, S.; Yu, Z. Bimolecular quadruplexes and their transitions to higher-order molecular structures detected by ESI-FTICR-MS. J. Am. Soc. Mass Spectrom. 2007, 18, 1467–1476.
77. Majerová, T.; Streckerová, T.; Bednárová, L.; Curtis, E.A. Sequence requirements of intrinsically fluorescent G-quadruplexes. Biochemistry 2018, 57, 4052–4062.
78. Kolesnikova, S.; Srb, P.; Vrzal, L.; Lawrence, M.; Veverka, V.; Curtis, E.A. GTP-dependent formation of multimeric G-quadruplexes. ACS Chem. Biol. in press.
79. Švehlová, K.; Lawrence, M.S.; Bednárová, L.; Curtis, E.A. Altered biochemical specificity of G-quadruplexes with mutated tetrads. Nucleic Acids Res. 2016, 44, 10789–10803.
80. Lu, M.; Guo, Q.; Kallenbach, N.R. Structure and stability of sodium and potassium complexes of dT4G4 and dT4G4T. Biochemistry 1992, 31, 2455–2459.
81. Creze, C.; Rinaldi, B.; Haser, R.; Bouvet, P.; Gouet, P. Structure of a d(TGGGGT) quadruplex crystallized in the presence of Li+ ions. Acta Crystallogr. D Biol. Crystallogr. 2007, 63, 682–688.
82. Deng, J.; Xiong, Y.; Sundaralingam, M. X-ray analysis of an RNA tetraplex (UGGGGU)(4) with divalent Sr(2+) ions at subatomic resolution (0.61 Å). Proc. Nat. Acad. Sci. USA 2001, 98, 13665–13670.
83. Hazel, P.; Huppert, J.; Balasubramanian, S.; Neidle, S. Loop-length-dependent folding of G-quadruplexes. J. Am. Chem. Soc. 2004, 126, 16405–16415.
84. Risitano, A.; Fox, K.R. Stability of intramolecular DNA quadruplexes: Comparison with DNA duplexes. Biochemistry 2003, 42, 6507–6513.
85. Blackburn, E.H.; Gall, J.G. A tandemly repeated sequence at the termini of the extrachromosomal ribosomal RNA genes in Tetrahymena. J. Mol. Biol. 1978, 120, 33–53.
86. Jacob, N.K.; Skopp, R.; Price, C.M. G-overhang dynamics at Tetrahymena telomeres. EMBO J. 2001, 20, 4299–4308.
87. Moyzis, R.K.; Buckingham, J.M.; Cram, L.S.; Dani, M.; Deaven, L.L.; Jones, M.D.; Meyne, J.; Ratliff, R.L.; Wu, J.R. A highly conserved repetitive DNA sequence, (TTAGGG)n, present at the telomeres of human chromosomes. Proc. Nat. Acad. Sci. USA 1988, 85, 6622–6626.
88. Wright, W.E.; Tesmer, V.M.; Huffman, K.E.; Levene, S.D.; Shay, J.W. Normal human chromosomes have long G-rich telomeric overhangs at one end. Genes Dev. 1997, 11, 2801–2809.
89. Henderson, E.; Hardin, C.C.; Walk, S.K.; Tinoco, I., Jr.; Blackburn, E.H. Telomeric DNA oligonucleotides form novel intramolecular structures containing guanine-guanine base pairs. Cell 1987, 51, 899–908.
90. Neidle, S.; Parkinson, G.N. The structure of telomeric DNA. Curr. Opin. Struct. Biol. 2003, 13, 275–283.
91. Li, J.; Correia, J.J.; Wang, L.; Trent, J.O.; Chaires, J.B. Not so crystal clear: The structure of the human telomere G-quadruplex in solution differs from that present in a crystal. Nucleic Acids Res. 2005, 33, 4649–4659.
92. Petraccone, L. Higher-order quadruplex structures. Top. Curr. Chem. 2013, 330, 23–46.
93. Yu, H.Q.; Miyoshi, D.; Sugimoto, N. Characterization of structure and stability of long telomeric DNA G-quadruplexes. J. Am. Chem. Soc. 2006, 128, 15461–15468.
94. Kankia, B. Tetrahelical monomolecular architecture of DNA: A new building block for nanotechnology. J. Phys. Chem. B 2014, 118, 6134–6140.
95. Kankia, B. Monomolecular tetrahelix of polyguanine with a strictly defined folding pattern. Sci. Rep. 2018, 8, 10115.
96. Xu, Y.; Komiyama, M. The structural studies of human telomeric DNA using AFM. Nucleic Acids Symp. Ser. (Oxf.) 2007, 51, 241–242.
97. Xu, X.; Ishizuka, T.; Kurabayashi, K.; Komiyama, M. Consecutive formation of G-quadruplexes in human telomeric-overhang DNA: A protective capping structure for telomere ends. Angew. Chem. Int. Ed. 2009, 48, 7833–7836.
98. Petraccone, L.; Spink, C.; Trent, J.O.; Garbett, N.C.; Mekmaysy, C.S.; Giancola, C.; Chaires, J.B. Structure and stability of higher-order human telomeric quadruplexes. J. Am. Chem. Soc. 2011, 133, 20951–20961.
99. Schaffitzel, C.; Berger, I.; Postberg, J.; Hanes, J.; Lipps, H.J.; Pluckthun, A. In vitro generated antibodies specific for telomeric guanine-quadruplex DNA react with Stylonychia lemnae macronuclei. Proc. Nat. Acad. Sci. USA 2001, 98, 8572–8577.
100. Moye, A.L.; Porter, K.C.; Cohen, S.B.; Phan, T.; Zyner, K.G.; Sasaki, N.; Lovrecz, G.O.; Beck, J.L.; Bryan, T.M. Telomeric G-quadruplexes are a substrate and site of localization for human telomerase. Nat. Commun. 2015, 6, 7643.
101. Zhao, C.; Wu, L.; Ren, J.; Xu, Y.; Qu, X. Targeting human telomeric higher-order DNA: Dimeric G-quadruplex units serve as preferred binding site. J. Am. Chem. Soc. 2013, 135, 18786–18789.
102. Huang, X.X.; Zhu, L.N.; Wu, B.; Huo, Y.F.; Duan, N.N.; Kong, D.M. Two cationic porphyrin isomers showing different multimeric G-quadruplex recognition specificity against monomeric G-quadruplexes. Nucleic Acids Res. 2014, 42, 8719–8731.
103. Zhang, Q.; Liu, Y.C.; Kong, D.M.; Guo, D.S. Tetraphenylethene derivatives with different numbers of positively charged side arms have different multimeric G-quadruplex recognition specificity. Chemistry 2015, 21, 13253–13260.
104. Dai, J.; Dexheimer, T.S.; Chen, D.; Carver, M.; Ambrus, A.; Jones, R.A.; Yang, D. An intramolecular G-quadruplex structure with mixed parallel/antiparallel G-strands formed in the human BCL-2 promoter region in solution. J. Am. Chem. Soc. 2006, 128, 1096–1098.
105. González, V.; Hurley, L.H. The c-MYC NHE III1: Function and regulation. Annu. Rev. Pharmacol. Toxicol. 2008, 50, 111–129.
106. Mathad, R.I.; Hatzakis, E.; Dai, J.; Yang, D. c-MYC promoter G-quadruplex formed at the 5′-end of NHE III1 element: Insights into biological relevance and parallel-stranded G-quadruplex stability. Nucleic Acids Res. 2011, 39, 9023–9033.
107. Kerkour, A.; Marquevielle, J.; Ivashchenko, S.; Yatsunyk, L.A.; Mergny, J.L.; Salgado, G.F. High-resolution three-dimensional NMR structure of the KRAS proto-oncogene promoter reveals key features of a G-quadruplex involved in transcriptional regulation. J. Biol. Chem. 2017, 292, 8082–8091. [CrossRef]

108. Matsugami, A.; Okuizumi, T.; Uesugi, S.; Katahira, M. Intramolecular higher order packing of parallel quadruplexes comprising a G:G:G:G tetrad and a G(:A):G(:A):G(:A):G heptad of GGA triplet repeat DNA. J. Biol. Chem. 2003, 278, 28147–28153. [CrossRef]

109. Palumbo, S.L.; Memmott, R.M.; Uribe, D.J.; Krotova-Khan, Y.; Hurley, J.H.; Ebbinghaus, S.W. A novel G-quadruplex-forming GGA repeat region in the c-myb promoter is a critical regulator of promoter activity. Nucleic Acids Res. 2008, 36, 1755–1769. [CrossRef]

110. Broxson, C.; Beckett, J.; Tornaletti, S. Transcription arrest by a G-quadruplex forming-trinucleotide repeat sequence from the human c-myb gene. Biochemistry 2011, 50, 4162–4172. [CrossRef]

111. Hirsch, M.R.; Gaugler, L.; Deagostini-Bazin, H.; Bally-Cuif, L.; Goridis, C. Identification of positive and negative regulatory elements governing cell-type-specific expression of the neural cell adhesion molecule gene. Mol. Cell. Biol. 1990, 10, 1959–1968. [CrossRef]

112. Chamboredon, S.; Briggs, J.; Vial, E.; Hurault, J.; Galvagni, F.; Oliviero, S.; Bos, T.; Castellazzi, M. v-Jun downregulates the SPARC target gene by binding to the proximal promoter indirectly through Sp1/3. Oncogene 2003, 22, 4047–4061. [CrossRef]

113. Morgan, R.K.; Batra, H.; Gaerig, V.C.; Hockings, J.; Brooks, T.A. Identification and characterization of a new G-quadruplex forming region within the kRAS promoter as a transcriptional regulator. Biochim. Biophys. Acta 2016, 1859, 235–245. [CrossRef]

114. Islam, I.; Baba, Y.; Witarto, A.B.; Yoshida, W. G-quadruplex-forming GGA repeat region functions as a negative regulator of the Ccnb1ip1 enhancer. Biosci. Biotechnol. Biochem. 2019, 7, 1–6. [CrossRef]

115. Palumbo, S.L.; Ebbinghaus, S.W.; Hurley, L.H. Formation of a unique end-to-end stacked pair of G-quadruplexes in the hTERT core promoter with implications for inhibition of telomerase by G-quadruplex-interactive ligands. J. Am. Chem. Soc. 2009, 131, 10878–10891. [CrossRef]

116. Kang, H.J.; Cui, Y.; Yin, H.; Scheid, A.; Hendricks, W.P.D.; Schmidt, J.; Sekulic, A.; Kong, D.; Trent, J.M.; Gokhale, V.; et al. A pharmacological chaperone molecule induces cancer cell death by restoring tertiary DNA structures in mutant hTERT promoters. J. Am. Chem. Soc. 2016, 138, 13673–13692. [CrossRef]

117. Micheli, E.; Martufi, M.; Cacchione, S.; De Santis, P.; Savino, M. Self-organization of G-quadruplex structures in the hTERT core promoter stabilized by polyaminic side chain perylene derivatives. Biophys. Chem. 2010, 153, 43–53. [CrossRef]

118. Saha, D.; Singh, A.; Hussain, T.; Srivastava, V.; Sengupta, S.; Kar, A.; Dhapola, P.; Dhople, V.; Ummanni, R.; Chowdhury, S. Epigenetic suppression of human telomerase (hTERT) is mediated by the metastasis suppressor NME2 in a G-quadruplex-dependent fashion. J. Biol. Chem. 2017, 292, 15205–15215. [CrossRef]

119. Yafe, A.; Etzioni, S.; Weisman-Shomer, P.; Fry, M. Formation and properties of hairpin and tetraplex structures of guanine-rich regulatory sequences of muscle-specific genes. Nucleic Acids Res. 2005, 33, 2887–2900. [CrossRef]

120. Rigo, R.; Sissi, C. Characterization of G4-G4 crosstalk in the c-KIT promoter region. Biochemistry 2017, 56, 4309–4312. [CrossRef]

121. Ducani, C.; Bernardinelli, G.; Högberg, B.; Keppler, B.K.; Terenzi, A. Interplay of three G-quadruplex units in the KIT promoter. J. Am. Chem. Soc. 2019, 141, 10205–10213. [CrossRef]

122. Lavezzo, E.; Berselli, M.; Frasson, I.; Perrone, R.; Palù, G.; Brazzale, A.R.; Richter, S.N.; Toppo, S. G-quadruplex forming sequences in the genome of all known viruses: A comprehensive guide. PLoS Comput. Biol. 2018, 14, e1006675. [CrossRef]

123. Frasson, I.; Nadai, M.; Richter, S.N. Conserved G-quadruplexes regulate the immediate early promoters of human Alphaherpesviruses. Molecules 2019, 24, 2375. [CrossRef]

124. Xu, Y.; Kimura, T.; Komiyama, M. Human telomere RNA and DNA form an intermolecular G-quadruplex. Nucleic Acids Symp. Ser. (Oxf.) 2008, 52, 169–170. [CrossRef]

125. Wanrooij, P.H.; Uhler, J.P.; Shi, Y.; Westerlund, F.; Falkenberg, M.; Gustafsson, C.M. A hybrid G-quadruplex structure formed between RNA and DNA explains the extraordinary stability of the mitochondrial R-loop. Nucleic Acids Res. 2012, 40, 10334–10344. [CrossRef]

126. Zheng, K.W.; Wu, R.Y.; He, Y.D.; Xiao, S.; Zhang, J.Y.; Liu, J.Q.; Hao, Y.H.; Tan, Z. A competitive formation of DNA:RNA hybrid G-quadruplex is responsible to the mitochondrial transcription termination at the DNA replication priming site. Nucleic Acids Res. 2014, 42, 10832–10844. [CrossRef]

127. Zheng, K.W.; Xiao, S.; Liu, J.Q.; Zhang, J.Y.; Hao, Y.H.; Tan, Z. Co-transcriptional formation of DNA:RNA hybrid G-quadruplex and potential function as constitutional cis element for transcription control. Nucleic Acids Res. 2013, 41, 5533–5541. [CrossRef]

128. Xiao, S.; Zhang, J.Y.; Wu, J.; Wu, R.Y.; Xia, Y.; Zheng, K.W.; Hao, Y.H.; Zhou, X.; Tan, Z. Formation of DNA:RNA hybrid G-quadruplexes of two G-quartet layers in transcription: Expansion of the prevalence and diversity of G-quadruplexes in genomes. Angew. Chem. Int. Ed. Engl. 2014, 53, 13110–13114. [CrossRef]

129. Xiao, S.; Zhang, J.Y.; Zheng, K.W.; Hao, Y.H.; Tan, Z. Bioinformatic analysis reveals an evolutional selection for DNA:RNA hybrid G-quadruplex structures as putative transcription regulatory elements in warm-blooded animals. Nucleic Acids Res. 2013, 41, 10379–10390. [CrossRef]

130. Zhang, J.Y.; Zheng, K.W.; Xiao, S.; Hao, Y.H.; Tan, Z. Mechanism and manipulation of DNA:RNA hybrid G-quadruplex formation in transcription of G-rich DNA. J. Am. Chem. Soc. 2014, 136, 1381–1390. [CrossRef]

131. Zhao, Y.; Zhang, J.Y.; Zhang, Z.Y.; Tong, T.J.; Hao, Y.H.; Tan, Z. Real-time detection reveals responsive cotranscriptional formation of persistent intramolecular DNA and intermolecular DNA:RNA hybrid G-quadruplexes stabilized by R-loop. Anal. Chem. 2017, 89, 6036–6042. [CrossRef]

132. Shrestha, P.; Xiao, S.; Dhakal, S.; Tan, Z.; Mao, H. Nascent RNA transcripts facilitate the formation of G-quadruplexes. Nucleic Acids Res. 2014, 42, 7236–7246. [CrossRef]

133. Curtis, E.A.; Liu, D.R. Discovery of widespread GTP-binding motifs in genomic RNA and DNA. Chem. Biol. 2013, 20, 521–532. [CrossRef]

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).