A structure-based mechanism for HEXIM displacement from 7SK
Vincent Pham Harvard University Michael Gao Harvard University Jennifer Meagher University of Michigan Janet Smith University of Michigan https://orcid.org/0000-0002-0664-9228 Victoria D'Souza ( [email protected] ) Harvard University
Article
Keywords: Transcriptional Elongation, Conformational Plasticity, Remodeling Site, Local Destabilization, Competitive Regulation
Posted Date: February 2nd, 2021
DOI: https://doi.org/10.21203/rs.3.rs-198962/v1
License: This work is licensed under a Creative Commons Attribution 4.0 International License.
Vincent V. Pham1, Michael Gao1, Jennifer L. Meagher2, Janet L. Smith2,3 & Victoria M. D’Souza1*
1Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, 02138, USA. 2Life Sciences Institute, University of Michigan, Ann Arbor, MI, 48109, USA. 3Department of Biological Chemistry, University of Michigan, Ann Arbor, MI, 48109, USA. *Correspondence and requests for materials should be addressed to V.M.D (email: [email protected]).

Productive transcriptional elongation of many cellular and viral mRNAs requires
transcriptional factors to extract pTEFb from the 7SK snRNP by modulating the
association between the HEXIM protein and the 7SK snRNA. Here we report the
structure of the HEXIM arginine rich motif in complex with the apical stemloop-1 of
7SK (7SK-SL1apical) and detail how the HIV transcriptional regulator Tat from various
subtypes overcomes the structural constraints required to displace HEXIM. While the
majority of the interactions that 7SK makes with HEXIM and with Tat are similar, critical
differences exist that guide function. First, the conformational plasticity of 7SK
enables the formation of three different base pair configurations at a critical
remodeling site, which allows for the modulation required for HEXIM binding and its
subsequent displacement by Tat. Furthermore, the specific sequence variations
observed in various Tat subtypes all converge on remodeling 7SK at this region.
Second, we show that HEXIM primes its own displacement by causing specific local
destabilization upon binding, a feature that is then exploited by Tat to bind 7SK
more efficiently. Overall, our study details the molecular environment presented by
HEXIM and uncovers a destabilization-driven displacement strategy that increases the
conformational sampling of 7SK-snRNP, which may allow diverse transcriptional
factors to competitively regulate pTEFb.
Transcription of all class II genes is a highly regulated process within cells. Shortly
after promoter clearance, RNA Polymerase II is inhibited by the negative elongation
factors1-5. Release from this stalled state requires all components to be phosphorylated by
the positive elongation factor pTEFb, a heterodimeric complex consisting of Cyclin T1
and the cyclin-dependent kinase Cdk96-11. However, most of the pTEFb is kept
catalytically inactive in the nucleus by the 7SK small nuclear ribonucleoprotein (7SK
snRNP) through its interactions with the HEXIM adapter protein and the 5’ stemloop-1
of the 7SK RNA12-18 (7SK-SL1; Fig. 1a). Thus, productive transcriptional elongation of
many genes requires transcriptional factors to extract pTEFb from the 7SK snRNP, a
process that involves manipulating the interaction between HEXIM and 7SK. This
association between 7SK and HEXIM tightly controls the balance between active and
inactive pTEFb, and dysregulation of this interaction can have serious biological
consequences including cardiac hypertrophy and breast and pancreatic cancers19-21.
Furthermore, as many viruses rely on the host transcriptional machinery to produce
mRNA and genomes, they have also evolved mechanisms to capture pTEFb22-24. One such
unique case is the human immunodeficiency virus (HIV), which utilizes the viral Tat
protein to extract pTEFb by binding to the same region of 7SK as HEXIM and directly
displacing it24. Structural insights into the consequence of HEXIM binding to 7SK and
how positive transcriptional factors like Tat compete with it are therefore important for
understanding HEXIM’s potency as a critical negative regulator.
To date, two HEXIM proteins have been identified that are capable of carrying out
the same function and both bind 7SK with identical regions of their Arginine Rich Motifs
(ARMs) (residues 151-159 in HEXIM1 and 89-97 in HEXIM2)24-27. Although HEXIM binds
7SK as a dimer, only one ARM directly contacts 7SK by engaging the apical region of
stemloop-1 (G26 to C85, 7SK-SL1apical; Fig. 1a)28-33. Both in vitro and in vivo studies have
shown that this represents the sole interaction between the two molecules that must be
modulated in order to release pTEFb27-29,31,32.
Our previous work showed that 7SK-SL1apical is enriched in arginine sandwich
motifs (ASMs)34. ASMs are defined by two nucleotides that stack in a manner that allows
for intercalation of arginine guanidinium moieties between the aromatic rings of the
bases34-37 (Fig. 2a). While a bulge pyrimidine forms the cap by engaging in a triple base
interaction with an n + 2 base pair in the stem, a Watson-Crick base paired nucleotide
preceding the bulge forms the base of the interaction. In the free 7SK-SL1apical, three such
bulges fold into preformed arginine sandwich motifs (ASM1, ASM2 and ASM4) poised for
arginine guanidinium moieties to dock into them. A fourth bulge folds into a pseudo
configuration (pseudo-ASM3) where U40 is able to form a triple base interaction with the
A43-U66 base pair to form the cap, but the base of the sandwich is sequestered in a reverse
Hoogsteen interaction, excluding it from use as a classical ASM. Our work also showed
that HIV-1 Tat NL4-3 (TatNL4-3) uses its arginine-rich motif to intercalate arginines not only
into the three preformed ASMs, but also to remodel the pseudo-ASM into a classical
ASM34. This structural remodeling of pseudo-ASM3 is a key mechanism through which
Tat displaces HEXIM.
However, without the structure of the HEXIM:7SK-SL1apical interaction, it is
currently unclear what structural constraints Tat would need to overcome in order to
access pTEFb. Furthermore, while the Tat ARM is highly conserved, sequence variations
exist in different strains that allow for HEXIM displacement. For example, the ARM of
Tat Finland (TatFin; KR52KHRRR) differs from HEXIM (KK151KHRRR) by only a single
amino acid and would lack one of the ASM interactions from the previously described
Tat NL4-3 strain (KR52RQRRR). Additionally, while Tat subtype G (TatG; KR52R53HRRR)
has the same number of arginines as TatNL4-3, the critical linker sequence connecting
ASM3/ASM4 and ASM1/ASM2 interactions is the same as HEXIM. In this study, we present
the structures of the 7SK-SL1apical in complex with the HEXIM, TatFin, and TatG ARMs.
Despite sequence variations, the structures show deep major groove intercalations of all
ARMs, albeit with differential interactions with pseudo-ASM3 and ASM4. Furthermore,
we show that HEXIM causes local destabilization of ASM4, enhancing Tat’s affinity for
7SK. These studies thus uncover a feature in which HEXIM facilitates its own
displacement by increasing conformational sampling, which may be a more general
mechanism of pTEFb capture.
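The sequence relationships described in this paragraph can be checked mechanically. The following is a minimal sketch, assuming the seven-residue ARM cores are exactly the strings quoted in the text with the position labels (52, 53, 151) removed; the dictionary keys and function names are illustrative and are not part of the study.

```python
# ARM core sequences as quoted in the text, with residue-number labels removed.
# The labels used as dictionary keys are illustrative names only.
ARMS = {
    "HEXIM": "KKKHRRR",
    "TatNL4-3": "KRRQRRR",
    "TatFin": "KRKHRRR",
    "TatG": "KRRHRRR",
}


def differences(a, b):
    """Return the 1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


def arginine_count(seq):
    """Count arginine (R) residues in a one-letter-code sequence."""
    return seq.count("R")


if __name__ == "__main__":
    for name, seq in ARMS.items():
        print(f"{name:9s} {seq}  arginines={arginine_count(seq)}")
    print("TatFin vs HEXIM differ at:", differences(ARMS["TatFin"], ARMS["HEXIM"]))
    print("TatG vs TatNL4-3 differ at:", differences(ARMS["TatG"], ARMS["TatNL4-3"]))
```

Under these assumptions, TatFin and HEXIM differ at a single position (R52 versus K151), while TatG and TatNL4-3 carry the same number of arginines and differ only at the spacer position.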

Results
Comparative binding affinities of HEXIM and Tat to 7SK.
As a first step toward identifying the atomic determinants of 7SK recognition by
HEXIM, we performed binding studies using isothermal titration calorimetry (ITC).
Studies have shown that the first 108 nucleotides comprising stemloop-1 of 7SK (G1-C108;
7SK-SL1Full) are required for binding full length HEXIM as a dimer both in vivo and in
vitro24,27,32. ITC traces of full-length HEXIM1 with 7SK-SL1Full corroborate these studies (N
= 2.1 ± 0.2, Kd = 209 ± 34 nM; Fig. 1b). Additionally, ITC traces of full-length HEXIM1 to
7SK-SL1Full-AGU and 7SK-SL1apical (G26 to C85), both containing an AGU triloop engineered
to prevent low levels of dimerization and thus aid in increasing quality of NMR spectra,
also reveal dimeric HEXIM binding to both constructs with binding affinities comparable
to that of the full-length HEXIM1:7SK-SL1Full complex (N = 2 ± 0.1, Kd = 200 ± 20 nM; N = 1.8 ±
0.03, Kd = 206 ± 58 nM; and N = 2.1 ± 0.2, Kd = 209 ± 34 nM, respectively; Fig. 1c, d),
indicating that the loop does not play a significant role in HEXIM1 binding and
confirming that the HEXIM:7SK interaction is confined to the stem of the apical region of
7SK-SL1.
Previous studies have also shown that while HEXIM binds 7SK as a dimer, the
ARM of only one HEXIM monomer is responsible for interacting with 7SK both in vivo
and in vitro17,32,33. ITC traces of the N-terminal ARM residues (R146QLGKKKHRRR156;
HEXIMN-ARM) to 7SK-SL1apical with the AGU triloop confirm these findings as the binding
affinities of the full-length dimeric HEXIM1:7SK-SL1apical and HEXIMN-ARM:7SK-SL1apical
complexes are similar (N = 1.8 ± 0.03, Kd = 206 ± 58 nM and N = 1 ± 0.1, Kd = 238 ±
35 nM, respectively; Fig. 1d, e). NMR studies comparing these two complexes show that
binding of either full-length HEXIM or the N-ARM gives rise to the same nuclear
Overhauser effect (NOE) shifts in 7SK-SL1apical, indicating that the HEXIMN-ARM:7SK-SL1apical
interaction represents the biologically relevant mode of HEXIM binding to 7SK
(Supplementary Fig. 1).
Finally, our previous work showed that the TatNL4-3 (KR52RQRRR) ARM represents
the interaction domain between Tat and 7SK-SL1apical and has an approximately two-fold
increased affinity over the HEXIMN-ARM. ITC traces show that Tat Subtype G’s ARM
(KR52RHRRR), which also has two N-terminal arginines, binds 7SK-SL1apical with a Kd of
91 ± 32 nM (N = 1.1 ± 0.1; Fig. 1f), which is an approximately 2.6-fold increased binding
affinity over HEXIMN-ARM (Supplementary Table 1). On the other hand, Tat Finland’s
ARM (KR52KHRRR), despite having an additional N-terminal arginine compared to the
HEXIMN-ARM (R52 and K151, respectively), does not have a statistically significant increase
in binding affinity (Kd of 198 ± 16 nM; Fig. 1g). Overall, these results highlight the need
for understanding the HEXIM-bound 7SK; while the increased TatG affinity would allow
for HEXIM displacement, it is unclear how TatFin can achieve the same biological output.
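The fold differences quoted here follow directly from the ratios of the reported dissociation constants. The sketch below reproduces that arithmetic and converts each Kd into a binding free energy using ΔG = RT ln(Kd). The temperature of 298.15 K is an assumption, since this section does not state the ITC temperature, and the function names are illustrative.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1

# Dissociation constants (nM) for ARM peptides binding 7SK-SL1apical, as reported in the text.
KD_NM = {"HEXIM_N-ARM": 238.0, "TatG": 91.0, "TatFin": 198.0}


def fold_change(kd_reference_nM, kd_nM):
    """Affinity gain relative to a reference: ratio of reference Kd to the Kd of interest."""
    return kd_reference_nM / kd_nM


def delta_g_kcal(kd_nM, temp_K=298.15):
    """Binding free energy from Kd (1 M standard state). temp_K is an assumed value."""
    return R_KCAL * temp_K * math.log(kd_nM * 1e-9)


if __name__ == "__main__":
    ref = KD_NM["HEXIM_N-ARM"]
    for name, kd in KD_NM.items():
        print(f"{name:12s} Kd={kd:6.1f} nM  fold vs HEXIM={fold_change(ref, kd):.2f}  "
              f"dG={delta_g_kcal(kd):.2f} kcal/mol")
```

With these inputs the TatG ratio comes out at about 2.6, matching the text, while the TatFin ratio is about 1.2, a difference that is small relative to the reported uncertainties.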

Preformed configurations of ASM1 and ASM2 provide a common mode of interaction
with C-terminal residues.
To understand how the HEXIMN-ARM and the various Tat ARMs interact with
7SK-SL1apical, we utilized a combination of small-angle X-ray scattering (SAXS) and NMR. All
reconstructed ab initio SAXS envelopes showed no major global changes between
peptide-bound and free 7SK-SL1apical (Fig. 2b, c). Numerous intermolecular NOEs place
both HEXIM and Tat arginine rich motifs into the major groove of the RNA and allow us
to define their interactions with all ASM regions. Base pairs in the lower part of the
stemloop below the G79-U32 base pair, as well as the CAGUG pentaloop, do not give any
intermolecular NOEs, indicating that the interactions are contained within a single turn
of the helix (Fig. 2d-f).
In the free 7SK-SL1apical, ASM1 and ASM2 are placed in a tandem orientation and
upon titration of the various ARMs, all expected NOEs for such configurations are
retained. Unlike a typical ASM where the nucleotide following the bulge is in a
canonical Watson-Crick base pair, in ASM1, the residue A77 is configured into an A34-A77
pair. An NOE from the A77 H8 proton to the H1′ of C75 positions this residue under the C75
cap (Supplementary Fig. 2). This confirms a planar orientation of C75 with the C33-G78 base
pair and configures A77 in such a way that it is perfectly positioned to interact with the
guanidinium moieties of R156 in HEXIMN-ARM and R57 in TatFin and TatG, which intercalate
between C75 and G74 in a manner identical to canonical ASMs (Supplementary Fig. 2-5).
Similarly, in ASM2, the C71+ base also retains its protonation at the N3 position as
evidenced by a downfield shift of the N4 amino protons (Supplementary Fig. 6). The
guanidinium moieties of R155 in HEXIMN-ARM and R56 of TatFin and TatG interact with G73
by intercalating between the C71+ cap and G70 base of the motif (Supplementary Fig. 3-5).
Additionally, intermolecular NOEs from the aromatic protons of the C75 and C71+ caps and
the G74 and G70 bases of ASM1 and ASM2 to the Hγ and the Hδ protons confirm that
consecutive arginines R156 and R155 interact in a ladder-like configuration with the
tandem preformed motifs ASM1 and ASM2, respectively (Fig. 3a and Supplementary Fig.
4c). Such NOEs are also observed in both the TatFin and the TatG-bound complexes,
confirming the similar placement of the C-terminal R57 and R56 into the tandem ASM1
and ASM2, respectively (Fig. 3a and Supplementary Fig. 5a, b, d, e). Taken together, the
structures reveal a common mode of interaction between the non-varying C-terminal
arginines and the tandem ASMs.

Rearrangement of pseudo-ASM3 allows for HEXIM N-terminal interactions.
In the free 7SK-SL1apical, pseudo-ASM3 and ASM4 adopt a pseudo-symmetrical
architecture where the two motifs are spatially opposed. Upon HEXIMN-ARM binding, the
pseudo-ASM3 maintains its U40:A43-U66 triple base interaction although the base of the
sandwich, A39, rearranges from a reverse Hoogsteen interaction with U68 into a cis
Hoogsteen/sugar interaction, giving rise to an alternate pseudo configuration (Fig. 3b, f).
This is evidenced both by NOEs from the U68 imino proton to the A39 amino protons and
NOEs from the U68 H2′ and H3′ protons to the A39 H8 proton (Supplementary Fig. 6b).
This frees up the U68 imino proton to engage the backbone carbonyl of K152 while
simultaneously bringing the N1 proton acceptor of A39 into the major groove to hydrogen
bond with the side chain Hε protons of K151 (Fig. 3b). Thus, both K151 and K152 can enter
deep into the major groove by remodeling the pseudo-ASM3.
The amino side chain of K151 is within hydrogen-bonding distance of the A39 N1
nitrogen as evidenced by NOEs from the K151 Hγ and Hβ protons to the C37 H6 and H5
protons, respectively, and from the K151 Hε protons to the C38 H6 and H5 protons
(Supplementary Fig. 4f). Additionally, NOEs between the K152 Hβ protons and the U68
H5 proton, the K152 Hδ protons and the C67 and U66 H5 protons, and the K152 Hε protons
and the C67 H5 and H6 protons position the amino side chain of K152 within hydrogen-
bonding distance of the C67 backbone (Fig. 3b and Supplementary Fig. 4e, f). This gives
rise to a forked configuration of the two lysines, orienting the side chain amino groups
towards the phosphate backbones on opposite ends of the groove.
Unlike the other three ASMs, where the NOEs clearly define a single predominant
structural configuration as described above, multiple dynamic states exist for ASM4 (see
below). In the most abundant form, the preformed nature found in the free state is
retained as evidenced by a direct imino to imino connectivity between U44 and U63, along
with maintenance of the G46-C62 Watson-Crick base pair (Supplementary Fig. 6c). In fact,
this interaction is stabilized by K150, which displays NOEs between the Hε protons and
the U63 and the U40 H5 protons, positioning the amino side chain within hydrogen
bonding distance of the O4 atoms of both U63 and U40 (Supplementary Fig. 4d). Additional
intermolecular interactions of the U40 H5 proton and the U63 H5 and H1′ protons
with the K150 Hδ, Hγ, and Hβ protons place K150 directly under the U63 cap of ASM4
(Supplementary Fig. 4d, e). Taken together, these data show that despite the lack of
arginines, the lysine-rich N-terminus of HEXIMN-ARM is able to be accommodated by 7SK:
the Watson-Crick face of A39 turns from the minor into the major groove to interact with
K151 and K152, which then positions K150 to interact with the oxygen-rich
environment of the U63 and U40 caps.

Conformational plasticity of the ASM3/ASM4 region provides differential modes of
interaction with N-terminal and spacer residues.
Our previous study showed that TatNL4-3 displaces HEXIM by remodeling the
pseudo-ASM3 into a canonical ASM3 to allow for arginine intercalation34. Furthermore,
an additional arginine docks into the preformed ASM4. While the mechanism of
remodeling pseudo-ASM3 is conserved upon binding of both TatFin and TatG ARMs (Fig.
3c, e, g, h; Supplementary Fig. 6), both the drivers of the conformational switch and the
engagement of the ASM4 vary depending on differences in amino acid sequences.
While TatFin has two major differences from TatNL4-3 (K53 to R53 and spacer H54 to
Q54, respectively), it only differs in a single amino acid from HEXIM (R52 and K151,
respectively). As in TatNL4-3, R52 is responsible for remodeling pseudo-ASM3 (Fig. 3c;
Supplementary Fig. 5a). However, while R53 in TatNL4-3 flips over R52 and engages ASM4,
the equivalent K53 stays in the spacer region between the ASM1/ASM2 and ASM3/ASM4
regions in a manner similar to HEXIM as evidenced by NOEs between the K53 Hβ
protons and the U68 H5 proton, the K53 Hδ protons and the C67 and U66 H5 protons, and
the K53 Hε protons and the C67 H5 and H6 protons, which position the amino side chain
of K53 within hydrogen-bonding distance of the C67 backbone (Fig. 3j; Supplementary
Fig. 5a).
As with HEXIM, ASM4 remains unoccupied upon binding TatFin and the structure
shows that the K51 amino side chain is positioned to hydrogen-bond with the U63 ribose
ring in a stabilizing interaction (Fig. 3c). This is evidenced by NOEs between the K51 Hδ
protons and the U63 H5 and H1′ protons and between the K51 Hε protons and the U63 2′ hydroxyl
proton (Supplementary Fig. 5c). Furthermore, the N-terminal K50 exits near the apical
loop: NOEs observed between the K50 Hδ and Hε protons and the C38 and C37 H5
and H1′ protons position the amino side chain of K50 toward the C38 phosphate backbone
(Supplementary Fig. 5a).
Finally, in evaluating the structural consequences of the spacer substitution, we
see that H54 and R55 in TatFin remain near ASM1 and ASM2, similar to what is found in
HEXIM. This is evidenced by NOEs between the H54 (H153 in HEXIM) Hβ protons and
the C35, C36, and C37 H5 protons, placing H54 near ASM2, whereas the R55 (R154 in HEXIM)
Hδ protons display NOEs with the A34 H1′ proton and the C33 H1′, H5, and H6 protons,
positioning this spacer residue near ASM1 (Fig. 3i, j; Supplementary Fig. 4, 5). This is in
contrast with the binding mode of TatNL4-3, in which the intercalation of R53 into ASM4
drags both the Q54 and R55 spacer residues towards the apical ASMs.
The importance of the histidine H54 spacer is even more evident in the TatG strain,
where it represents the only difference from TatNL4-3. This single difference changes
the identity of the arginine that remodels pseudo-ASM3. In this ARM, the positioning of
H54 near ASM2 precludes R53 from reaching ASM4 to accomplish the inverse
intercalation seen in TatNL4-3 (Fig. 3d, e, k and Supplementary Fig. 5d, e). The interactions
with the apical ASMs thus occur in a ladder-like manner where R53 intercalates into the
remodeled ASM3 whereas R52 intercalates into ASM4 (Fig. 3d, e and Supplementary Fig.
5a, d, e). K51 makes the final stabilizing interaction with NOEs seen between the Hε
protons and the U63 2′ hydroxyl proton, indicating a hydrogen-bonding interaction
between the K51 amino side chain and the U63 ribose ring (Fig. 3d and Supplementary
Fig. 5f). Taken together, these studies show that arginine sandwich motifs provide mini
domains that arginine rich motifs of proteins can differentially interact with in order to
achieve deep major groove binding into the stem of 7SK-SL1apical.

HEXIM allows for increased conformational sampling of apical ASMs.
While titration of all four arginine rich motifs stabilizes the majority of 7SK-SL1apical
into one predominant configuration, the HEXIM ARM is an outlier wherein binding
causes ASM4 to become destabilized and exhibit multiple conformations. In such
conformations, the NOEs between the imino protons of U63 and U44 disappear, indicating
the disruption of the U63:U44-A65 triple and loss of ASM4 (Supplementary Fig. 6c). The
destabilization of this region is also indicated by the line-broadening of K150, which
interacts with U63 in the folded configuration (Supplementary Fig. 4e).
The destabilization of 7SK-SL1apical only by HEXIM is further evident when
comparing the thermodynamic profiles of Tat and HEXIM. The binding of the TatFin
and TatG strains is enthalpically driven (ΔH = -7.5 ± 0.2 and -8.9 ± 2.2 kcal mol-1,
respectively; Fig. 4a) with a modest entropic contribution (-TΔS = -1.7 ± 0.3 and -2 ± 0.8
kcal mol-1, respectively; Fig. 4a). On the other hand, HEXIM binding is entropically
enhanced by approximately 2.5-fold over both Tat strains (-TΔS = -4.6 ± 0.8 kcal mol-1, ΔH
= -4.4 ± 0.7 kcal mol-1; Fig. 4a).
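The thermodynamic comparison above rests on the relation ΔG = ΔH + (-TΔS). As a minimal sketch, the snippet below sums the reported enthalpic and entropic terms and compares the entropic term of HEXIM with that of each Tat strain. Only the central values quoted in the text are used and the reported uncertainties are ignored; the dictionary layout and function names are illustrative.

```python
# Reported ITC terms in kcal/mol: (delta_H, minus_T_delta_S). Central values only.
PROFILES = {
    "HEXIM": (-4.4, -4.6),
    "TatFin": (-7.5, -1.7),
    "TatG": (-8.9, -2.0),
}


def delta_g(delta_h, minus_t_delta_s):
    """Gibbs free energy of binding: dG = dH + (-T*dS)."""
    return delta_h + minus_t_delta_s


def entropic_fraction(delta_h, minus_t_delta_s):
    """Share of the total binding free energy contributed by the entropic term."""
    return minus_t_delta_s / delta_g(delta_h, minus_t_delta_s)


if __name__ == "__main__":
    hexim_entropy = PROFILES["HEXIM"][1]
    for name, (dh, mtds) in PROFILES.items():
        print(f"{name:7s} dG={delta_g(dh, mtds):6.1f} kcal/mol  "
              f"entropic fraction={entropic_fraction(dh, mtds):.2f}  "
              f"HEXIM/-TdS ratio={hexim_entropy / mtds:.2f}")
```

With these central values the entropic term accounts for roughly half of the HEXIM binding free energy but less than a fifth for either Tat strain, and the HEXIM entropic term is about 2.3 to 2.7 times larger, in line with the approximately 2.5-fold figure in the text.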
To evaluate the implication of HEXIM’s ability to locally destabilize ASM4 in the
context of its displacement required for transcriptional regulation, we compared TatFin
and TatG binding to 7SK both free and in the presence of HEXIM. Due to the modest
differences in binding energetics between the different ARMs, competition experiments
using ITC were not tractable. As seen in Fig. 4b, a 1:1 titration of either HEXIM or Tat into
7SK monitored by NMR shows the presence of both bound and free forms of 7SK. However, upon
titration of TatFin and TatG into the HEXIM:7SK complex, we not only observe completion
of complex formation but also a total displacement of HEXIM in both cases. This is
especially striking given that the binding affinities of TatFin and HEXIM for free 7SK are
equivalent. Taken together, these data indicate that Tat has a greater affinity for a HEXIM-
bound 7SK complex. Finally, ITC data of full-length HEXIM bound to full-length 7SK
snRNA (N = 1.9 ± 0.1; Fig. 4b) show that an entropy-driven interaction is maintained, and
in fact, is even more pronounced (-TΔS = -6.4 ± 1.3 kcal mol-1, ΔH = -2.6 ± 1 kcal mol-1; Fig.
4a), suggesting that HEXIM binding may globally increase the conformational space
sampled by the 7SK-snRNP complex. These studies suggest that destabilization by
HEXIM may play an important role in how transcription factors access 7SK for pTEFb
capture.

Discussion
The 7SK snRNP represents a central biomolecule that a wide range of
transcriptional factors need to interact with to access pTEFb to control transcriptional
elongation. In particular, pTEFb extraction by HIV Tat from this complex requires
manipulating the interaction between the 7SK snRNA and the HEXIM adapter protein.
In this study, we solved the structures of the RNA binding domains of HEXIM and Tat
bound to 7SK and gained several insights into their functional significance, including the
malleability of 7SK, the local destabilization by HEXIM, and the specific sequence
variations of Tat.
The structures show that both HEXIM and Tat regulate pTEFb by directly binding
the stem of 7SK-SL1apical through intercalation of arginine rich motifs into an entire helical
turn of the major groove. This is unusual as RNA major grooves are deep and narrow,
making them generally inaccessible for protein binding. The architecture of the four
sandwich motifs in 7SK allows for transcriptional regulators to differentially utilize their
ARMs. The tandem preformed ASMs, ASM1 and ASM2, remain unchanged from their
free configuration upon encountering C-terminal arginines of Tat and HEXIM. On the
other hand, the apical pseudo-symmetrical ASMs, pseudo-ASM3 and ASM4, reconfigure
depending on their binding partners. The structures show that the ASM3 region can adopt
at least three different base pair interactions: a reverse Hoogsteen in the free state, a cis
Hoogsteen/sugar interaction upon HEXIM binding, and a Watson-Crick base pair upon
Tat binding. The cis Hoogsteen/sugar interaction is especially significant because it
allows HEXIM to enter the major groove despite the lack of arginines in the N-terminus.
Similarly, while ASM4 retains its preformed configuration found in the free state upon
Tat binding, it can be destabilized in the presence of HEXIM and adopt multiple flexible
states. Taken together, these studies show that 7SK is adaptable in its ASM architecture,
which can be modulated upon encountering different transcription factors.
Comparative analyses of HEXIM and Tat provide insights into how both positive
and negative regulators can manipulate 7SK to carry out their transcriptional roles. Our
studies implicate HEXIM as potentially having a dual structural role. On the one hand, it
can bind with high affinity to the apical portion of 7SK-stemloop 1 to sequester pTEFb
and on the other hand, it simultaneously causes local destabilization of this region,
enhancing the binding affinities of positive regulators for pTEFb capture. In comparison
to Tat, the thermodynamic profile and solution-state characteristics of HEXIM binding
show an entropy-driven mode of interaction that is attributed in particular to the
destabilization of U63 in the ASM4 region. Indeed, mutational studies have shown that
deletion of U63 significantly reduces HEXIM binding27,33. This expansion in the dynamic
state of 7SK surrounding the ASM4 region is also supported both by in vivo SHAPE
analysis, where U63 becomes ultra-reactive upon HEXIM:pTEFb binding, and by structural
and molecular dynamics modeling34,38-40. Furthermore, we show that Tat capitalizes on
this increased dynamic state, binding with greater affinity to the HEXIM-bound complex
than to free 7SK. While the use of a HEXIM-displacement mechanism for pTEFb capture
by binding to 7SK-SL1 has yet to be discovered for cellular factors, the destabilization-
driven preparation of 7SK snRNP for pTEFb extraction may be a general feature exploited
by specialized transcriptional factors.
Comparative analysis of HEXIM and Tat also sheds light on the sequence
requirements of ARMs for pTEFb regulation. While N-terminal lysines of HEXIM allow
for destabilization of ASM4, the anchoring required to enter the major groove can only be
provided by the stacking of C-terminal arginines within ASM1 and ASM2. Indeed, the
importance of these C-terminal arginines for HEXIM binding is supported by their nearly
complete conservation across metazoan species41. On the other hand, the equivalent
arginines in HIV-1 Tat occur as a consecutive pair only in approximately 50% of reported
strains, albeit with the strong requirement of at least one arginine. The structures show
that these variations may be possible due to the anchoring provided by the arginines that
intercalate into the apical ASMs. Nevertheless, when two arginines are present in Tat, the
interactions with the tandem ASMs mirror HEXIM.
Furthermore, differences in N-terminal and spacer ARM residues orchestrate the
structural modulations of the apical ASMs. In order to accommodate the continuation of
the HEXIM ARM chain from the ASM1/ASM2 to the ASM3/ASM4 region required for U63
destabilization, K152 induces the reconfiguration of pseudo-ASM3 from a reverse
Hoogsteen to a cis Hoogsteen/sugar interaction. In all variations of N-terminal Tat
sequences studied, binding is concomitant with the rearrangement of pseudo-ASM3 into
a canonical ASM3 through the intercalation of an arginine.
The structures also provide insights into specific sequence variations that occur in
the highly conserved Tat ARM to displace HEXIM. When two arginines are available in
the N-terminal residues, both are involved in arginine sandwich interactions, providing
a 2-fold increase in affinity; however, either R52 (TatNL4-3) or R53 (TatG) can act as the
remodeler. This can be explained by the presence of either a glutamine or histidine spacer,
respectively, which is the only amino acid difference between the two strains. As
glutamine (75%) and histidine (15%) make up the majority of the sequence variation in
this spacer, the structures show that these two spacer residues drive the differential
positioning of the arginine remodeler. In the TatFin strain, which has a histidine spacer, it
is the R52 that acts as a remodeler. In this case, the R53K substitution provides the
stabilizing interactions to reposition the single R52 arginine near pseudo-ASM3.
Furthermore, it is also interesting to compare the mode of binding of TatFin to
that of HEXIM. First, the single residue difference (R52 vs K151) provides TatFin with the
additional ASM intercalation required for displacement. Thus, Tat has evolved specific
sequence variations that allow for the reconfiguration of pseudo-ASM3, even in cases
where there is only a single variation from HEXIM. Second, despite both ARMs having
lysines positioned near ASM4, only HEXIM leads to local destabilization. Our studies
therefore present HEXIM as an example of a negative regulator that primes its own
displacement by locally destabilizing 7SK (Fig. 4c). Overall, these studies have broader
implications for 7SK-snRNP-mediated regulation. Given that the destabilization-driven
displacement is a robust mechanism, it is possible that other yet to be identified cellular
and viral transcriptional regulators recruit pTEFb through direct intercalation of ARMs
into 7SK-SL1apical. Furthermore, as a destabilized state of 7SK snRNP is what is presented
to all transcriptional regulators, the mechanisms necessary to extract pTEFb may
converge on capitalizing on this conformational heterogeneity.

Methods
RNA sample preparation: RNA samples used for biophysical experiments were
synthesized by in vitro transcription using T7 RNA polymerase with either plasmid DNA
or with synthetic DNA templates containing 2′-O-methylated nucleotides (Integrated DNA
Technologies), the T7 promoter, and the desired sequences. Plasmid DNAs for
7SK-SL1Full-WT and 7SK-SL1Full-AGU, containing the T7 promoter, insert, and SmaI sequence,
were cloned by Genscript between the EcoRI and BamHI restriction sites of a pUC19
vector. Plasmid DNA was prepared for in vitro transcription from a 5 mL overnight culture
of NEB 5α Competent E. coli (C29871) transformed with the plasmid using Qiaprep Spin
Miniprep Kit (Qiagen 27104). 10 µL of purified DNA was combined with 25 µL of
2′-O-methylated reverse primer at 100 µM (5′-mGmGAGCGGTGAGGGAGGAAG-3′ where
m indicates 2′ O-methylated nucleotide), 2 µL of forward primer at 100 µM
(5′-GACAAGCCCGTCAGGG-3′), 2.44 mL of water, and two tubes of EconoTaq PLUS 2X
Master Mix (Lucigen 30035-2). The 5 mL mixture was then aliquoted into 50 µL
increments in a 96-well PCR plate and the template for in vitro transcription reactions
was amplified using the following PCR protocol: 95 °C for 5 minutes, 34 cycles of (95 °C
for 30 seconds, 50 °C for 1 minute, and 68 °C for 90 seconds), and 68 °C for 5 minutes.
After PCR amplification, reactions were pooled into a 5 mL volume in a 50 mL Falcon tube
and 0.5 mL of 3 M sodium acetate, pH 5 and 32 mL of 100% ethanol were added to the
mixture and chilled at -80 °C for at least 30 minutes before spinning down at 9000 x g for
10 minutes at 4 °C. The ethanol was decanted and the pellet was left to dry overnight
before in vitro transcription use. Template preparation for 7SK-SL1apical using
2′-O-methylated reverse primers in order to suppress the heterogeneity at the 3′ end of the
389 transcripts involved combining 15 mL of both forward (5′-
390 TAATACGACTCACTATAGGGATCTGTCACCCCATTGATCGCCAGTGGCTGATCT
391 GGCTGGCTAGGCGGGTCCC-3′) and reverse (5′-mGmGGACCCGCCTAGCCAG
392 CCAGATCAGCCACTGGCGATCAATGGGGTGACAGATCCCTATAGTGAGTCGTAT
393 TA-3′ where m indicates 2′ O-methylated nucleotide) primers at 1 mM stock solution with
394 470 mL of water. The mixture was heated at 95 °C for five minutes and cooled at room
395 temperature for 30 minutes before assembling the in vitro transcription reaction. Samples
396 were either unlabeled, or residue-specifically labeled with 13C/15N- or 2H (Cambridge
397 Isotope Laboratories, Inc.). After transcription, RNA samples were heat denatured and
398 purified by using urea-denaturing polyacrylamide gels.
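The annealing step for the 7SK-SL1apical template only yields a duplex if the two oligonucleotides are reverse complements of each other. The sketch below is a minimal check of that, using the sequences as printed above with the 2′-O-methyl marks omitted. The 17-nt promoter boundary (`T7_PROMOTER`) and the helper names are assumptions made for illustration and are not taken from the protocol.

```python
# Sketch: sanity-check the synthetic 7SK-SL1apical template oligos.
# Sequences are copied from the Methods text; "m" (2'-O-methyl) marks are dropped.

# Assumption: minimal T7 promoter, with transcription starting at the next base.
T7_PROMOTER = "TAATACGACTCACTATA"

FORWARD = (
    "TAATACGACTCACTATAGGGATCTGTCACCCCATTGATCGCCAGTGGCTGATCT"
    "GGCTGGCTAGGCGGGTCCC"
)
REVERSE = (
    "GGGACCCGCCTAGCCAG"
    "CCAGATCAGCCACTGGCGATCAATGGGGTGACAGATCCCTATAGTGAGTCGTAT"
    "TA"
)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def expected_transcript(template: str, promoter: str = T7_PROMOTER) -> str:
    """Return the RNA expected from the coding strand after the promoter."""
    if not template.startswith(promoter):
        raise ValueError("template does not start with the assumed promoter")
    return template[len(promoter):].replace("T", "U")


if __name__ == "__main__":
    print("oligos are reverse complements:", reverse_complement(FORWARD) == REVERSE)
    rna = expected_transcript(FORWARD)
    print(len(rna), "nt:", rna)
```

The same check can be run on any other template pair by swapping in its sequences; a mismatch would point to a transcription error in the oligo order rather than to a problem with the annealing conditions.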
HEXIM ARM and Tat ARM peptide preparation: Unlabeled HEXIMN-ARM
(GISYGRQLGKKKHRRRAHQ), TatFin ARM (GISYGRKKRKHRRRAHQ), and TatG ARM
(GISYGRKKRRHRRRAHQ) peptides were purchased from Tufts University Core
Facility at a 0.1 mmol scale. Tat adapters were placed around the HEXIMN-ARM sequence
to prevent non-physiological aggregation in solution-state NMR studies. HEXIMN-ARM
peptides containing selectively 13C/15N-labeled residues, underlined,
(GISYGRQLGKKKHRRRAHQ and GISYGRQLGKKKHRRRAHQ) were purchased from
New England Peptide.
Full-length HEXIM1 preparation: Synthetic DNA encoding HEXIM1 (2-359) was cloned
into a bacterial pMCSG7 expression vector42 encoding an N-terminal tobacco etch
virus (TEV) protease-cleavable His6 tag and was expressed in E. coli BL21 AI cells in an
overnight culture at 20 °C. Cells were lysed by sonication in buffer containing 50 mM Tris
pH 8.0, 500 mM NaCl, 0.1% β-mercaptoethanol, 50 mM (NH4)2SO4 and the protease inhibitors
aprotinin and leupeptin. His6-HEXIM1 was purified from the cleared cell lysate using
Ni-NTA resin (Qiagen) and the His6 tag was cleaved with TEV protease. HEXIM1 was then
run over a second Ni-NTA column, followed by anion exchange on a 5 mL HiTrap Q HP
column (Cytiva) and gel filtration on a Superdex 200 16/60 column (Cytiva) in a final
buffer containing 25 mM HEPES pH 7.5, 200 mM NaCl, 5% glycerol, and 1 mM TCEP. HEXIM1
was flash frozen in liquid nitrogen and stored at −80 °C.
Isothermal titration calorimetry: Binding constants for the interactions of 7SK-SL1apical
with the HEXIMN-ARM and the TatFin and TatG ARMs, and of full-length HEXIM1 with
7SK-SL1apical, 7SK-SL1Full-WT, and 7SK-SL1Full-AGU, were measured using an ITC-200
microcalorimeter (MicroCal). 68 µM HEXIMN-ARM peptide was titrated into 5 µM
solutions of 7SK-SL1apical in 10 mM sodium phosphate, 70 mM NaCl, 0.1 mM EDTA, pH
5.2 at 25 °C. Titrations of Tat ARMs into 7SK-SL1apical were performed in the same
buffer conditions as the HEXIMN-ARM titration, although the Tat ARM concentration was
2.5 µM and the 7SK-SL1apical concentration was 45 µM. Titrations with full-length
HEXIM1 were done at 100 µM HEXIM1 into 3 µM of either 7SK-SL1apical, 7SK-SL1Full-WT,
or 7SK-SL1Full-AGU in a buffer of 25 mM HEPES pH 7.5, 200 mM NaCl, 5% glycerol, and
1 mM TCEP. Titration curves were analyzed using ORIGIN (OriginLab) and all
thermodynamic parameters are reported with n=3 experiments.
Small angle X-ray scattering: SAXS data for the 7SK-SL1apical:HEXIMN-ARM,
7SK-SL1apical:TatFin ARM, and 7SK-SL1apical:TatG ARM complexes were obtained at the SIBYLS
beamline of the Advanced Light Source at Lawrence Berkeley National Laboratory. Measurements
were performed in a buffer containing 10 mM sodium phosphate, 70 mM NaCl, 0.1 mM EDTA,
pH 5.2, and the background scattering was subtracted from the sample scattering to
obtain the scattering intensity from the solute molecules. Data from three different
concentrations (50, 75, and 100 µM) were compared with scattering intensities at
q = 0 Å⁻¹ [I(0)], as determined by Guinier analysis, to detect possible interparticle
interactions. Data were analyzed using ScÅtter software, and the ab initio envelope
structures were reconstructed using DAMMIF/DAMMIN software.
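The Guinier step mentioned above rests on the low-angle approximation ln I(q) ≈ ln I(0) − q²Rg²/3, so a straight-line fit of ln I against q² gives both I(0) and the radius of gyration. The following is a minimal sketch of that fit on synthetic, noise-free curves. The Rg value, q grid, and intensity scale are hypothetical and are not data from this study, and the actual analysis was done in ScÅtter.

```python
import math


def guinier_fit(q, intensity):
    """Fit ln I = ln I0 - (Rg^2 / 3) q^2 by least squares; return (Rg, I0)."""
    x = [qi * qi for qi in q]
    y = [math.log(i) for i in intensity]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    if slope >= 0:
        raise ValueError("non-negative Guinier slope; Rg is undefined")
    return math.sqrt(-3.0 * slope), math.exp(intercept)


def synthetic_curve(q, rg, i0):
    """Ideal Guinier-region intensities for a particle with the given Rg."""
    return [i0 * math.exp(-((qi * rg) ** 2) / 3.0) for qi in q]


# Hypothetical values chosen only for illustration.
RG_TRUE = 20.0                                      # radius of gyration in angstrom
Q_VALUES = [0.005 + 0.002 * k for k in range(25)]   # 1/angstrom, q_max * Rg < 1.3
CONCENTRATIONS_UM = [50.0, 75.0, 100.0]             # the three concentrations used

results = {}
for conc in CONCENTRATIONS_UM:
    # An ideal, non-interacting sample has I(0) proportional to concentration.
    curve = synthetic_curve(Q_VALUES, RG_TRUE, i0=0.01 * conc)
    rg, i0 = guinier_fit(Q_VALUES, curve)
    results[conc] = (rg, i0 / conc)

if __name__ == "__main__":
    for conc, (rg, i0_per_conc) in results.items():
        print(f"{conc:5.0f} uM  Rg = {rg:.2f} A  I(0)/c = {i0_per_conc:.4f}")
```

With real data, a concentration-normalized I(0) or an Rg that drifts across the three concentrations would be the sign of interparticle interactions that the comparison is designed to detect.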
NMR data acquisition, resonance assignment and structural calculations: For NMR
experiments, the Tat ARM/HEXIMN-ARM:7SK-SL1apical complexes were dissolved in a buffer
containing 10 mM potassium phosphate, 70 mM NaCl, and 0.1 mM EDTA, pH 5.2,
whereas the full-length HEXIM1:7SK-SL1apical complex was in a buffer with 25 mM HEPES
pH 7.5, 200 mM NaCl, 5% 2H-glycerol, and 1 mM TCEP. All NMR experiments were
acquired using Bruker 700 or 800 MHz instruments equipped with cryogenic probes.
Spectra for observing non-exchangeable protons were collected at 298 K in 99.96% D2O,
whereas those for exchangeable protons were collected at 283 K and 298 K in 10% D2O. For
NOESY experiments, mixing times were set to 200 ms. To help unambiguously assign the
intermolecular NOEs of the HEXIMN-ARM with 7SK-SL1apical, we used both specifically
protonated GA, AC, and GU samples of 7SK-SL1apical and two HEXIMN-ARM peptides
synthesized with different combinations of 13C/15N-labeled amino acids. Samples of
the 7SK-SL1apical:HEXIMN-ARM complex were prepared at 1:0.9 equivalents, whereas the
7SK-SL1apical:TatFin ARM, 7SK-SL1apical:TatG ARM, and full-length HEXIM1:7SK-SL1apical
complexes were prepared at 1:0.3 equivalents to avoid non-specific binding of the
peptides to the RNA or aggregation of the proteins. Assignments for non-exchangeable
1H, 13C, and 15N signals of 7SK-SL1apical in complex with HEXIMN-ARM and Tat ARMs were
obtained by analyzing two-dimensional 1H-1H NOESY spectra recorded with non-labeled
samples, and two-dimensional 13C-HMQC and 15N-HSQC and three-dimensional
13C-edited HMQC-NOESY spectra for labeled samples.
Initial structural models were generated using manually assigned restraints in
CYANA, where upper-limit distance restraints of 2.7 Å, 3.3 Å, and 5.0 Å were employed
for direct NOE cross-peaks of strong, medium and weak intensities, respectively43.
However, for cross-peaks associated with the intra-residue H8/6 to H2′ and H3′ pairs,
upper distance limits of 4.2 Å and 3.2 Å were employed for NOEs of medium and strong
intensity, respectively. To prevent the generation of structures with collapsed major
grooves, cross-helix P–P distance restraints (with 20% weighting coefficient) were
employed for A-form helical segments. Standard torsion angle restraints were used for
regions of A-helical geometry, allowing for ±50° deviations from ideality (α = −62°,
β = 180°, γ = 48°, δ = 83°, ε = −152°, ζ = −73°)44. Standard hydrogen bonding restraints
with approximately linear NH–N and NH–O bond distances of 1.85 ± 0.05 Å and N–N and
N–O bond distances of 3.00 ± 0.05 Å, and two lower-limit restraints per base pair (G–C base
pairs: G-C4 to C-C6 ≥ 8.3 Å and G-N9 to C-H6 ≥ 10.75 Å; A–U base pairs: A-C4 to U-C6 ≥
8.3 Å and A-N9 to U-H6 ≥ 10.75 Å), were employed in order to weakly enforce base-pair
planarity (20% weighting coefficient).
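The restraint rules described above amount to two small lookup tables plus an angular window around each ideal A-form torsion. The sketch below encodes them in plain Python. It is not CYANA input syntax, the function names are hypothetical, and the fallback to the general 5.0 Å limit for weak intra-residue base-to-sugar cross-peaks is an assumption, since the text specifies only the strong and medium cases.

```python
# Upper-limit distances (angstrom) for direct NOE cross-peaks, by intensity.
NOE_UPPER_LIMIT = {"strong": 2.7, "medium": 3.3, "weak": 5.0}

# Relaxed limits for intra-residue H8/H6 to H2'/H3' cross-peaks.
INTRA_BASE_TO_SUGAR_LIMIT = {"strong": 3.2, "medium": 4.2}

# Ideal A-form backbone torsions (degrees) and the allowed deviation.
A_FORM_TORSIONS = {
    "alpha": -62.0, "beta": 180.0, "gamma": 48.0,
    "delta": 83.0, "epsilon": -152.0, "zeta": -73.0,
}
TORSION_TOLERANCE = 50.0


def noe_upper_limit(intensity: str, intra_base_to_sugar: bool = False) -> float:
    """Return the upper distance limit for a cross-peak of the given intensity."""
    if intra_base_to_sugar and intensity in INTRA_BASE_TO_SUGAR_LIMIT:
        return INTRA_BASE_TO_SUGAR_LIMIT[intensity]
    # Assumption: anything not covered by the special case uses the general table.
    return NOE_UPPER_LIMIT[intensity]


def wrap_angle(angle: float) -> float:
    """Map an angle in degrees onto the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def torsion_window(name: str) -> tuple:
    """Return (lower, upper) bounds for a torsion; the window may cross 180 degrees."""
    ideal = A_FORM_TORSIONS[name]
    return (wrap_angle(ideal - TORSION_TOLERANCE),
            wrap_angle(ideal + TORSION_TOLERANCE))


if __name__ == "__main__":
    print(noe_upper_limit("medium"), noe_upper_limit("medium", intra_base_to_sugar=True))
    for torsion in A_FORM_TORSIONS:
        print(torsion, torsion_window(torsion))
```

Keeping the special case in its own table makes it easy to see that the intra-residue limits are looser than the general ones, which avoids over-constraining cross-peaks whose intensity is affected by the fixed base-to-sugar geometry.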
The CYANA structure with the lowest target function was used as the initial
model for structure calculations in Xplor-NIH to incorporate electrostatic constraints.
First, structures were calculated using annealing from 2000°C to 25°C in steps of 12.5°C.
Standard energy potential terms for bonds, angles, torsion angles, van der Waals
interactions and interatomic repulsions were included. The statistical backbone H-bond
potential was utilized for protein residues. Energy potentials for NOEs, hydrogen bonds,
and planarity were incorporated with restraints derived from NMR data. All restraints
used in CYANA were included with the exception of phosphate-phosphate distances.
The structures were sorted by energy using bond, angle, dihedral, and NOE energy
potential terms, and the ten percent of the structures with the lowest sort energy were
further minimized with SAXS terms to incorporate orientation restraints. For this step,
minimization ran from 1500°C to 25°C in steps of 12.5°C. The lowest-energy ten percent
of these were deposited in the RCSB databank.
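The two cooling ramps and the "lowest ten percent" selections described above can be written out as a short sketch. This is plain Python rather than an Xplor-NIH script, the function names are hypothetical, and the example energies are invented purely to show the selection.

```python
def annealing_schedule(start: float, end: float, step: float) -> list:
    """Return every temperature visited when cooling from start to end."""
    n_steps = round((start - end) / step)
    if abs(start - n_steps * step - end) > 1e-9:
        raise ValueError("temperature range is not a whole number of steps")
    return [start - k * step for k in range(n_steps + 1)]


def lowest_fraction(energies: list, fraction: float = 0.10) -> list:
    """Return indices of the lowest-energy structures, best first."""
    keep = max(1, int(len(energies) * fraction))
    order = sorted(range(len(energies)), key=energies.__getitem__)
    return order[:keep]


# Ramps as stated in the text (values taken as written there).
first_ramp = annealing_schedule(2000.0, 25.0, 12.5)
saxs_ramp = annealing_schedule(1500.0, 25.0, 12.5)

# Hypothetical sort energies for 20 structures, for illustration only.
example_energies = [310.0, 295.5, 402.1, 288.0, 350.3, 299.9, 305.2, 420.0, 330.1, 315.7,
                    301.4, 298.2, 377.7, 360.6, 340.0, 325.5, 312.3, 308.8, 345.9, 390.2]
selected = lowest_fraction(example_energies)

if __name__ == "__main__":
    print(len(first_ramp), "temperatures in the first ramp")
    print(len(saxs_ramp), "temperatures in the SAXS refinement ramp")
    print("selected structure indices:", selected)
```

Under these settings the first ramp takes 158 cooling steps and the SAXS refinement ramp takes 118, and two successive ten-percent cuts keep one structure in a hundred of those initially calculated.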
Figure 1. Characterization of HEXIM and Tat binding to 7SK:
(a) Cartoon representation of the HEXIM dimer and pTEFb heterodimer binding to the
7SK snRNP (top). Upon introduction of Tat, HEXIM is displaced from the snRNP (below).
Not depicted are MEPCE and LARP7. Representative ITC data for: full-length HEXIM1
bound to 7SK-SL1Full (G1-C108) with a wild-type loop (b) or an AGU triloop (c) show that
the loop does not play a significant role in dimeric HEXIM binding; full-length HEXIM1
shows a similar binding affinity to both 7SK-SL1Full (c) and 7SK-SL1apical (G26-C85) (d); the
full-length HEXIM dimer (d) and HEXIMN-ARM (e) bind with similar affinities and
stoichiometry to 7SK-SL1apical, indicating that the HEXIMN-ARM:7SK-SL1apical complex
represents the minimal binding interaction. Representative ITC traces of the TatFin (f) and
TatG (g) ARMs into 7SK-SL1apical show an approximately 1.2 and 2.6-fold increased
binding affinity compared to HEXIMN-ARM. All reported values are for n=3 replicates.
Values shown in the panels: (b) 7SK-SL1Full(native):HEXIM, N = 2.1 ± 0.2, Kd = 209 ± 34 nM;
(c) 7SK-SL1Full(AGU):HEXIM, N = 2 ± 0.1, Kd = 200 ± 20 nM;
(d) 7SK-SL1apical(AGU):HEXIM, N = 1.8 ± 0.03, Kd = 206 ± 58 nM;
(e) 7SK-SL1apical(AGU):HEXIM N-ARM, N = 1 ± 0.1, Kd = 238 ± 35 nM;
(f) 7SK-SL1apical(AGU):Tat Finland ARM, N = 1 ± 0.03, Kd = 189 ± 16 nM;
(g) 7SK-SL1apical(AGU):Tat Subtype G ARM, N = 1.1 ± 0.1, Kd = 91 ± 35 nM.
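The fold changes quoted in the Figure 1 caption, and the corresponding binding free energies, follow directly from the Kd values shown in the figure panels (238, 189, and 91 nM for the HEXIM N-ARM, TatFin ARM, and TatG ARM). The sketch below works through that arithmetic using ΔG = RT ln Kd with a 1 M standard state and the 25 °C titration temperature. The function names are hypothetical, and the output is an illustration rather than the values reported in the supplementary ITC table.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3   # gas constant in kcal/(mol K)
TEMPERATURE_K = 298.15           # 25 degrees C, the ITC temperature

# Dissociation constants (nM) as shown in the Figure 1 panels.
KD_NM = {
    "HEXIM N-ARM": 238.0,
    "TatFin ARM": 189.0,
    "TatG ARM": 91.0,
}


def binding_free_energy(kd_molar: float, temperature_k: float = TEMPERATURE_K) -> float:
    """Return delta G of binding in kcal/mol for a Kd given in mol/L."""
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_molar)


def fold_tighter(kd_reference: float, kd: float) -> float:
    """Return how many times tighter a ligand binds than the reference."""
    return kd_reference / kd


if __name__ == "__main__":
    reference = KD_NM["HEXIM N-ARM"]
    for name, kd_nm in KD_NM.items():
        delta_g = binding_free_energy(kd_nm * 1e-9)
        print(f"{name:12s} Kd = {kd_nm:5.0f} nM  "
              f"dG = {delta_g:6.2f} kcal/mol  "
              f"fold vs HEXIM N-ARM = {fold_tighter(reference, kd_nm):.2f}")
```

Because ΔG depends on the logarithm of Kd, the 2.6-fold difference between the TatG ARM and the HEXIM N-ARM corresponds to only a little over half a kcal/mol, which is why the enthalpy and entropy breakdown is more informative than the affinities alone.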
Figure 2. 7SK-SL1apical in complex with HEXIMN-ARM, TatFin, and TatG ARMs.
(a) Cartoon depicting an arginine sandwich motif (ASM). (b) Secondary structure of free
7SK-SL1apical with a modified AGU triloop. The base and cap residues forming ASM1, ASM2,
pseudo-ASM3, and ASM4 are colored in orange, green, magenta, and blue, respectively. The
dashed arcs represent the triple-base interactions from the bulge to the stem, giving rise
to the caps of the sandwiches. (c) Overlay of reconstructed ab initio SAXS envelopes of
free 7SK-SL1apical (gray) and bound to HEXIMN-ARM (orange), TatFin (blue), and TatG (red)
ARMs, demonstrating the lack of global rearrangement of the RNA. Representative NMR
structures of 7SK-SL1apical bound to (d) HEXIMN-ARM, (e) TatFin, and (f) TatG show
engagement with all ASMs.
Figure 3. Details of intermolecular interactions between HEXIM, TatFin, and TatG ARMs
with 7SK-SL1apical and the rearrangement of the U68-A39 base pair.
C-terminal arginines of all ARMs dock into ASM1 and ASM2 with identical tertiary
structures; the representative ARM of TatFin is shown in (a). K150 and K151 in HEXIM (b),
R52 and K51 in TatFin (c), and K51, R52 and R53 in TatG (d,e) interact with the apical ASMs.
The U68-A39 base pair rearranges into a cis-Hoogsteen/sugar interaction upon HEXIM
binding (f), while TatFin (g) and TatG (h) both remodel ASM3 by rearranging the U68-A39
base pair into a Watson-Crick interaction. (i-k) Spacer residues between the ASM1/ASM2
and ASM3/ASM4 regions are positioned near ASM1 and ASM2. In the case of TatFin (j), K53
also acts as a spacer residue in order to allow for the remodeling of ASM3 by R52.
ARM sequences shown in the figure: HEXIMN-ARM, KK151KHRRR; TatFin, KR52KHRRR; TatG, KR52RHRRR.
Figure 4. Comparative thermodynamic analyses and competition experiments between
Tat and HEXIM:
(a) Comparison of enthalpic and entropic contributions (in kcal/mole) between TatG, TatFin,
and HEXIMN-ARM in complex with 7SK-SL1apical, and full-length HEXIM in complex with 7SK
snRNA. Entropy values were calculated using a T value of 298 K. The reversal in the
entropic and enthalpic contribution for Tat ARM compared to HEXIM is evident, with
HEXIM having an entropically driven binding profile. (b) Representative ITC data for
full-length HEXIM bound to 7SK snRNP demonstrating expected stoichiometry and specific
binding. (c) NMR competition titration analysis showing binding of 7SK by Tat
concomitant with the total displacement of HEXIM. Data are shown for the A39 (left) and
A65 (right) H2-C2 correlations. The increase in Tat affinity for the HEXIM-bound complex
is evident from the lack of free-RNA populations for the A65 resonance in the competition
experiment compared to binding to free 7SK.
Table 1. NMR restraints and structure statistics for HEXIM, TatFin, and TatG ARMs in
complex with 7SK-SL1apical

                                              HEXIM           TatFin          TatG
NMR distance and dihedral constraints
Distance restraints
  Total NOE                                   560             571             570
  Intraresidue                                229             229             229
  Inter-residue                               331             342             341
    Sequential (|i – j| = 1)                  152             152             152
    Nonsequential (|i – j| > 1)               179             190             189
  Hydrogen bonds                              152             157             163
Total dihedral-angle restraints               376             376             376
  Base pair                                   99              99              99
  Sugar pucker                                112             112             112
  Backbone                                    127             127             127
  Based on A-form geometry
Structure statistics
Violations (mean ± s.d.)                      323 ± 14        342 ± 14        348 ± 17
  Distance constraints (Å)                    0.28 ± 0.007    0.28 ± 0.01     0.26 ± 0.007
  Dihedral-angle constraints (°)              0.22 ± 0.05     0.16 ± 0.05     0.31 ± 0.08
  Max. dihedral-angle violation (°)           6.12 ± 1.01     7.89 ± 2.10     14.6 ± 1.96
  Max. distance-constraint violation (Å)      1.69 ± 0.13     1.62 ± 0.16     2.15 ± 0.28
Deviations from idealized geometry
  Bond lengths (Å)                            0.006           0.006           0.006
  Bond angles (°)                             1.04 ± 0.009    1.06 ± 0.02     1.06 ± 0.005
  Impropers (°)                               0.68 ± 0.02     0.79 ± 0.06     0.94 ± 0.03
Average pairwise r.m.s. deviation (Å)a
  NOE-restrained RNA and protein residuesb    0.44 ± 0.08     0.54 ± 0.06     0.38 ± 0.05

aPairwise r.m.s. deviation was calculated among ten refined structures.
bThese are residues 24:87 for the RNA and 150:157 for the peptides.
Supplementary Figure 1: 7SK-SL1apical construct design and comparison of 7SK-SL1apical
binding to full-length HEXIM1 and the HEXIM N-ARM.
(a) Two-dimensional 1H-15N HSQC spectra for 7SK-SL1apical with an AGU triloop (top) and
a native loop (bottom) showing that all major motifs in the apical stem are unaffected by
the change in the loop. (b) Two-dimensional 1H-13C HMQC spectra of GU-labeled
7SK-SL1apical binding to 0.3 equivalents of full-length HEXIM1 (black) and 0.9 equivalents of
HEXIM N-ARM (magenta). While there are slight differences in both carbon and proton
chemical shifts due to the salt concentration of the buffers, the HEXIM N-ARM makes the
same interactions with 7SK-SL1apical as the full-length HEXIM1.
Supplementary Figure 2: Non-exchangeable RNA proton assignments of the
HEXIM-N-ARM:7SK-SL1apical complex.
Portion of two-dimensional 1H–1H NOESY spectra of the HEXIM-N-ARM:7SK-SL1apical
interaction using AC (top), GU (middle), and GA (lower) protonated samples with the
other nucleotides being deuterated.
Supplementary Figure 3: Assignments of amino acid backbone and sidechain protons
of the HEXIM N-ARM:7SK-SL1apical interaction.
HSQC spectra of the two selectively 13C/15N-labeled HEXIM N-ARMs in complex with
7SK-SL1apical. Bolded residues in the sequences are the amino acids that have been
selectively labeled. All backbone amide protons and the side chain amino protons were
unambiguously assigned with the help of this selective labeling strategy.
Peptide numbering shown above both panels: GISYGR146Q147L148G149K150K151K152H153R154R155R156AHQ.
Supplementary Figure 4: Assignments of the interactions between the HEXIM N-ARM
and 7SK-SL1apical:
Secondary structure of the 7SK-SL1apical showing connectivities to the HEXIMN-ARM.
NOEs between: (a) the A77 H8 proton and R156 guanidinium protons indicate that
R156 docks into the preformed ASM1; (b) G74 and G73 with the guanidinium protons of
R156 and R155, respectively, and (c) the R156 and R155 Hδ protons and H5 protons of the
C75 and C71+ caps support entry of these arginines into ASM1 and ASM2; (d) the K150 Hε
protons and the U63 and U40 caps position K150 within hydrogen-bonding distance of the O4
of both residues. (e) K150 interacts with both the U63 and U40 caps of pseudo-ASM3 and ASM4,
whereas interactions between the U66 H1′ protons and the K152 Hδ protons position
K152 as a spacer residue. (f) Interactions of K152 and K151 with the C67 and C38 H5
protons, respectively, and NOEs between the R154 Hβ proton and C33 determine the
relative position of this spacer residue. (g) Positioning of the H153 Hβ protons with the
H6 protons of C35 and C36.
Supplementary Figure 5: Interactions between 7SK-SL1apical and TatFin and TatG RBDs:
Secondary structure of the 7SK-SL1apical showing connectivities to the RBDs. NOEs
between: (a) R57, R56, and R52 Hδ protons of TatFin and the C75, C71+, and U40 caps position
them into ASM1, ASM2, and remodeled ASM3; K53 Hε and C67 and U66, and R55 Hδ and
C33 position these spacer residues in between the lower and apical ASMs; (b) K51 Hε and
U63 and U66, and K50 Hε and C38 and C37 make up the final interactions as the peptide exits
the groove; H54 Hβ and the C37 and C36 H5 provide information for the final spacer
residue of TatFin; (c) U63 2′ hydroxyl and K51 Hε indicate a hydrogen bonding
interaction; (d) R56, R52 Hδ of TatG and the C71+ and U63 caps show interaction with ASM2
and ASM1; (e) R57, R52 Hδ and the C75 and U40 caps show interaction with ASM1 and the
formation of ASM3 by R52; (f) U63 2′ hydroxyl proton and K51 Hε indicate that K51
hydrogen bonds with the U63 ribose ring.
Supplementary Figure 6: Characterization of the A39-U68 base pair:
Portions of the 1H–1H 2D NOESY spectra. (a) The C71+ amino protons are downshifted
due to their participation in a triple-base interaction. (b) NOEs between the U68 imino
proton and the A39 amino protons, and between the U68 H2′ and H3′ protons and the A39 H8
proton, confirm that the A39-U68 base pair is in a cis Hoogsteen/sugar interaction. (c)
Portion of the 1H–15N 2D HSQC spectrum for 15N,13C-labeled 7SK-SL1apical supports the
assignments of the U40 and U63 cap imino protons.
Supplementary Table 1. ITC-derived binding parameters:

Acknowledgements
This work was supported by HHMI grant 55108516 (to V.M.D’S.) and NIH grant U54
AI50470 (formerly U54 GM103297) (to V.M.D’S. and J.L.S.). SAXS data were collected at
the Advanced Light Source (ALS), SIBYLS beamline on behalf of US DOE-BER, through
the Integrated Diffraction Analysis Technologies (IDAT) program. Additional support
comes from the NIGMS project ALS-ENABLE (P30 GM124169) and a High-End
Instrumentation Grant S10OD018483.

Accession codes
Atomic coordinates have been deposited in the Protein Data Bank under accession codes
PDB XXXX (7SK-SL1apical:HEXIMN-ARM), PDB XXX (7SK-SL1apical:TatFin), and PDB XXX
(7SK-SL1apical:TatG). Chemical shifts have been deposited in the Biological Magnetic
Resonance Data Bank under accession codes XXX (7SK-SL1apical:HEXIMN-ARM), XXXX
(7SK-SL1apical:TatFin), and XXXX (7SK-SL1apical:TatG).

Author contributions
V.V.P., M.G., and V.M.D’S. conceived and designed the experiments. V.V.P., M.G. and
J.L.M. purified the samples and V.V.P. performed the NMR, ITC, and SAXS experiments.
V.V.P., M.G., and V.M.D’S. performed the structural analyses, interpreted the data, and
wrote the manuscript. V.V.P., M.G., J.L.M., J.L.S. and V.M.D’S. analyzed data and edited
the manuscript.

645 Competing financial interests
646 The authors declare no competing financial interests. 647 References
648 1 Wada, T. et al. DSIF, a novel transcription elongation factor that regulates RNA 649 polymerase II processivity, is composed of human Spt4 and Spt5 homologs. 650 Genes Dev 12, 343-356, doi:10.1101/gad.12.3.343 (1998). 651 2 Yamaguchi, Y. et al. NELF, a multisubunit complex containing RD, cooperates 652 with DSIF to repress RNA polymerase II elongation. Cell 97, 41-51, 653 doi:10.1016/s0092-8674(00)80713-8 (1999). 654 3 Missra, A. & Gilmour, D. S. Interactions between DSIF (DRB sensitivity inducing 655 factor), NELF (negative elongation factor), and the Drosophila RNA polymerase 656 II transcription elongation complex. Proc Natl Acad Sci U S A 107, 11301-11306, 657 doi:10.1073/pnas.1000681107 (2010). 658 4 Renner, D. B., Yamaguchi, Y., Wada, T., Handa, H. & Price, D. H. A highly 659 purified RNA polymerase II elongation control system. J Biol Chem 276, 42601- 660 42609, doi:10.1074/jbc.M104967200 (2001). 661 5 Ehara, H. et al. Structure of the complete elongation complex of RNA polymerase 662 II with basal factors. Science 357, 921-924, doi:10.1126/science.aan8552 (2017). 663 6 Marshall, N. F. & Price, D. H. Purification of P-TEFb, a transcription factor 664 required for the transition into productive elongation. J Biol Chem 270, 12335- 665 12338, doi:10.1074/jbc.270.21.12335 (1995). 666 7 Peng, J., Zhu, Y., Milton, J. T. & Price, D. H. Identification of multiple cyclin 667 subunits of human P-TEFb. Genes Dev 12, 755-762, doi:10.1101/gad.12.5.755 668 (1998). 669 8 Kim, J. B. & Sharp, P. A. Positive transcription elongation factor B 670 phosphorylates hSPT5 and RNA polymerase II carboxyl-terminal domain 671 independently of cyclin-dependent kinase-activating kinase. J Biol Chem 276, 672 12317-12323, doi:10.1074/jbc.M010908200 (2001). 673 9 Fujinaga, K. et al. Dynamics of human immunodeficiency virus transcription: P- 674 TEFb phosphorylates RD and dissociates negative effectors from the 675 transactivation response element. 
Mol Cell Biol 24, 787-795, 676 doi:10.1128/mcb.24.2.787-795.2004 (2004). 677 10 Vos, S. M. et al. Structure of activated transcription complex Pol II-DSIF-PAF- 678 SPT6. Nature 560, 607-612, doi:10.1038/s41586-018-0440-4 (2018). 679 11 Yamada, T. et al. P-TEFb-mediated phosphorylation of hSpt5 C-terminal repeats 680 is critical for processive transcription elongation. Mol Cell 21, 227-237, 681 doi:10.1016/j.molcel.2005.11.024 (2006). 682 12 Yang, Z., Zhu, Q., Luo, K. & Zhou, Q. The 7SK small nuclear RNA inhibits the 683 CDK9/cyclin T1 kinase to control transcription. Nature 414, 317-322, 684 doi:10.1038/35104575 (2001). 685 13 Nguyen, V. T., Kiss, T., Michels, A. A. & Bensaude, O. 7SK small nuclear RNA 686 binds to and inhibits the activity of CDK9/cyclin T complexes. Nature 414, 322- 687 325, doi:10.1038/35104581 (2001). 688 14 Michels, A. A. et al. MAQ1 and 7SK RNA interact with CDK9/cyclin T complexes 689 in a transcription-dependent manner. Mol Cell Biol 23, 4859-4869, 690 doi:10.1128/mcb.23.14.4859-4869.2003 (2003). 691 15 Yik, J. H. et al. Inhibition of P-TEFb (CDK9/Cyclin T) kinase and RNA 692 polymerase II transcription by the coordinated actions of HEXIM1 and 7SK 693 snRNA. Mol Cell 12, 971-982, doi:10.1016/s1097-2765(03)00388-5 (2003). 694 16 Chen, R., Yang, Z. & Zhou, Q. Phosphorylated positive transcription elongation 695 factor b (P-TEFb) is tagged for inhibition through association with 7SK snRNA. J 696 Biol Chem 279, 4153-4160, doi:10.1074/jbc.M310044200 (2004). 697 17 Michels, A. A. et al. Binding of the 7SK snRNA turns the HEXIM1 protein into a 698 P-TEFb (CDK9/cyclin T) inhibitor. EMBO J 23, 2608-2619, 699 doi:10.1038/sj.emboj.7600275 (2004). 700 18 Kobbi, L. et al. An evolutionary conserved Hexim1 peptide binds to the Cdk9 701 catalytic site to inhibit P-TEFb. Proc Natl Acad Sci U S A 113, 12721-12726, 702 doi:10.1073/pnas.1612331113 (2016). 703 19 Sano, M. et al. 
Activation and function of cyclin T-Cdk9 (positive transcription elongation factor-b) in cardiac muscle-cell hypertrophy. Nat Med 8, 1310-1317, doi:10.1038/nm778 (2002).
20 Kretz, A. L. et al. CDK9 is a prognostic marker and therapeutic target in pancreatic cancer. Tumour Biol 39, 1010428317694304, doi:10.1177/1010428317694304 (2017).
21 Schlafstein, A. J. et al. CDK9 Expression Shows Role as a Potential Prognostic Biomarker in Breast Cancer Patients Who Fail to Achieve Pathologic Complete Response after Neoadjuvant Chemotherapy. Int J Breast Cancer 2018, 6945129, doi:10.1155/2018/6945129 (2018).
22 Cho, W. K., Jang, M. K., Huang, K., Pise-Masison, C. A. & Brady, J. N. Human T-lymphotropic virus type 1 Tax protein complexes with P-TEFb and competes for Brd4 and 7SK snRNP/HEXIM1 binding. J Virol 84, 12801-12809, doi:10.1128/JVI.00943-10 (2010).
23 Zhou, M. et al. Tax interacts with P-TEFb in a novel manner to stimulate human T-lymphotropic virus type 1 transcription. J Virol 80, 4781-4791, doi:10.1128/JVI.80.10.4781-4791.2006 (2006).
24 Muniz, L., Egloff, S., Ughy, B., Jady, B. E. & Kiss, T. Controlling cellular P-TEFb activity by the HIV-1 transcriptional transactivator Tat. PLoS Pathog 6, e1001152, doi:10.1371/journal.ppat.1001152 (2010).
25 Byers, S. A., Price, J. P., Cooper, J. J., Li, Q. & Price, D. H. HEXIM2, a HEXIM1-related protein, regulates positive transcription elongation factor b through association with 7SK. J Biol Chem 280, 16360-16367, doi:10.1074/jbc.M500424200 (2005).
26 Yik, J. H., Chen, R., Pezda, A. C. & Zhou, Q. Compensatory contributions of HEXIM1 and HEXIM2 in maintaining the balance of active and inactive positive transcription elongation factor b complexes for control of transcription. J Biol Chem 280, 16368-16376, doi:10.1074/jbc.M500912200 (2005).
27 Egloff, S., Van Herreweghe, E. & Kiss, T. Regulation of polymerase II transcription by 7SK snRNA: two distinct RNA elements direct P-TEFb and HEXIM1 binding. Mol Cell Biol 26, 630-642, doi:10.1128/MCB.26.2.630-642.2006 (2006).
28 Li, Q. et al. Analysis of the large inactive P-TEFb complex indicates that it contains one 7SK molecule, a dimer of HEXIM1 or HEXIM2, and two P-TEFb molecules containing Cdk9 phosphorylated at threonine 186. J Biol Chem 280, 28819-28826, doi:10.1074/jbc.M502712200 (2005).
29 Dulac, C. et al. Transcription-dependent association of multiple positive transcription elongation factor units to a HEXIM multimer. J Biol Chem 280, 30619-30629, doi:10.1074/jbc.M502471200 (2005).
30 Schonichen, A. et al. A flexible bipartite coiled coil structure is required for the interaction of Hexim1 with the P-TEFB subunit cyclin T1. Biochemistry 49, 3083-3091, doi:10.1021/bi902072f (2010).
31 Blazek, D., Barboric, M., Kohoutek, J., Oven, I. & Peterlin, B. M. Oligomerization of HEXIM1 via 7SK snRNA and coiled-coil region directs the inhibition of P-TEFb. Nucleic Acids Res 33, 7000-7010, doi:10.1093/nar/gki997 (2005).
32 Martinez-Zapien, D. et al. Intermolecular recognition of the non-coding RNA 7SK and HEXIM protein in perspective. Biochimie 117, 63-71, doi:10.1016/j.biochi.2015.03.020 (2015).
33 Lebars, I. et al. HEXIM1 targets a repeated GAUC motif in the riboregulator of transcription 7SK and promotes base pair rearrangements. Nucleic Acids Res 38, 7749-7763, doi:10.1093/nar/gkq660 (2010).
34 Pham, V. V. et al. HIV-1 Tat interactions with cellular 7SK and viral TAR RNAs identifies dual structural mimicry. Nat Commun 9, 4266, doi:10.1038/s41467-018-06591-6 (2018).
35 Chen, L. & Frankel, A. D. An RNA-binding peptide from bovine immunodeficiency virus Tat protein recognizes an unusual RNA structure. Biochemistry 33, 2708-2715, doi:10.1021/bi00175a046 (1994).
36 Puglisi, J. D., Tan, R., Calnan, B. J., Frankel, A. D. & Williamson, J. R. Conformation of the TAR RNA-arginine complex by NMR spectroscopy. Science 257, 76-80, doi:10.1126/science.1621097 (1992).
37 Puglisi, J. D., Chen, L., Blanchard, S. & Frankel, A. D. Solution structure of a bovine immunodeficiency virus Tat-TAR peptide-RNA complex. Science 270, 1200-1203, doi:10.1126/science.270.5239.1200 (1995).
38 Martinez-Zapien, D. et al. The crystal structure of the 5' functional domain of the transcription riboregulator 7SK. Nucleic Acids Res 45, 3568-3579, doi:10.1093/nar/gkw1351 (2017).
39 Bourbigot, S. et al. Solution structure of the 5'-terminal hairpin of the 7SK small nuclear RNA. RNA 22, 1844-1858, doi:10.1261/rna.056523.116 (2016).
40 Brogie, J. E. & Price, D. H. Reconstitution of a functional 7SK snRNP. Nucleic Acids Res 45, 6864-6880, doi:10.1093/nar/gkx262 (2017).
41 Marz, M. et al. Evolution of 7SK RNA and its protein partners in metazoa. Mol Biol Evol 26, 2821-2830, doi:10.1093/molbev/msp198 (2009).
42 Stols, L. et al. A new vector for high-throughput, ligation-independent cloning encoding a tobacco etch virus protease cleavage site. Protein Expr Purif 25, 8-15, doi:10.1006/prep.2001.1603 (2002).
43 Guntert, P., Mumenthaler, C. & Wuthrich, K. Torsion angle dynamics for NMR structure calculation with the new program DYANA. J Mol Biol 273, 283-298, doi:10.1006/jmbi.1997.1284 (1997).
44 Tolbert, B. S. et al. Major groove width variations in RNA structures determined by NMR and impact of 13C residual chemical shift anisotropy and 1H-13C residual dipolar coupling on refinement. J Biomol NMR 47, 205-219, doi:10.1007/s10858-010-9424-x (2010).
Figures
Figure 1
Characterization of HEXIM and Tat binding to 7SK: (a) Cartoon representation of the HEXIM dimer and pTEFb heterodimer binding to the 7SK snRNP (top). Upon introduction of Tat, HEXIM is displaced from the snRNP (below). Not depicted are MEPCE and LARP7. Representative ITC data for: full-length HEXIM1 bound to 7SK-SL1Full (G1-C108) with a wild-type loop (b) or an AGU tri loop (c) show that the loop does not play a significant role in dimeric HEXIM binding; full-length HEXIM1 shows a similar binding affinity to both 7SK-SL1Full (c) and 7SK-SL1apical (G26-C85) (d); the full-length HEXIM dimer (d) and HEXIMN-ARM (e) bind with similar affinities and stoichiometry to 7SK-SL1apical, indicating that the HEXIMN-ARM:7SK-SL1apical complex represents the minimal binding interaction. Representative ITC traces of the TatFin (f) and TatG (g) ARMs into 7SK-SL1apical show approximately 1.2- and 2.6-fold increased binding affinity, respectively, compared to HEXIMN-ARM. All reported values are for n=3 replicates.
Figure 2
7SK-SL1apical in complex with HEXIMN-ARM, TatFin, and TatG ARMs. (a) Cartoon depicting an arginine sandwich motif. (b) Secondary structure of free 7SK-SL1apical with a modified AGU tri loop. The base and cap residues forming ASM1, ASM2, pseudo-ASM3, and ASM4 are colored in orange, green, magenta, and blue, respectively. The dashed arcs represent the triple-base interactions from the bulge to the stem, giving rise to the caps of the sandwiches. (c) Overlay of reconstructed ab initio SAXS envelopes of free 7SK-SL1apical (gray) and bound to HEXIMN-ARM (orange), TatFin (blue), and TatG (red) ARMs, demonstrating the lack of global rearrangement of the RNA. Representative NMR structures of 7SK-SL1apical bound to (d) HEXIMN-ARM, (e) TatFin, and (f) TatG show engagement with all ASMs.
Figure 3
Details of intermolecular interactions between HEXIM, TatFin, and TatG ARMs with 7SK-SL1apical and the rearrangement of the U68-A39 base pair. C-terminal arginines of all ARMs dock into ASM1 and ASM2 with identical tertiary structures; representative ARM of TatFin is shown in (a). K150 and K151 in HEXIM (b), R52 and K51 in TatFin (c), and K51, R52, and R53 in TatG (d,e) interact with the apical ASMs. The U68-A39 base pair rearranges into a cis-Hoogsteen/sugar interaction upon HEXIM binding (f), while TatFin (g) and TatG (h) both remodel ASM3 by rearranging the U68-A39 base pair into a Watson-Crick interaction. (i-k) Spacer residues between the ASM1/ASM2 and ASM3/ASM4 regions are positioned near ASM1 and ASM2. In the case of TatFin (j), K53 also acts as a spacer residue in order to allow for the remodeling of ASM3 by R52.
Figure 4
Comparative thermodynamic analyses and competition experiments between Tat and HEXIM: (a) Comparison of enthalpic and entropic contributions among TatG, TatFin, and HEXIMN-ARM in complex with 7SK-SL1apical, and full-length HEXIM in complex with 7SK snRNA. Entropy values were calculated using a T value of 298 K. The reversal in the entropic and enthalpic contributions for the Tat ARMs compared to HEXIM is evident, with HEXIM having an entropically driven binding profile. (b) Representative ITC data for full-length HEXIM bound to 7SK snRNP demonstrating expected stoichiometry and specific binding. (c) NMR competition titration analysis showing binding of 7SK by Tat concomitant with the total displacement of HEXIM. Data are shown for the A39 (left) and A65 (right) H2-C2 correlations. The increase in Tat affinity for the HEXIM-bound complex is evident from the lack of free-RNA populations for the A65 resonance in the competition experiment compared to binding to free 7SK.
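The enthalpic and entropic terms compared in panel (a) are related by the standard expressions ΔG = RT ln(Kd) and ΔG = ΔH − TΔS. The following is a minimal sketch of that decomposition at T = 298 K, assuming a simple 1:1 binding model. The function names and the numerical inputs (a Kd of 100 nM and a ΔH of −2.0 kcal/mol) are hypothetical illustrations, not values or code from this study.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def binding_thermodynamics(kd_molar, delta_h_kcal, temperature_k=298.0):
    """Split the binding free energy into enthalpic and entropic terms.

    Assumes a 1:1 binding model with Kd in mol/L and dH in kcal/mol.
    """
    # dG = RT ln(Kd); negative for any Kd below 1 M.
    delta_g = R_KCAL * temperature_k * math.log(kd_molar)
    # Rearranged from dG = dH - T*dS.
    minus_t_delta_s = delta_g - delta_h_kcal
    return {"dG": delta_g, "dH": delta_h_kcal, "minus_TdS": minus_t_delta_s}


def dominant_term(terms):
    """Report which term contributes more favorably (more negatively) to dG."""
    return "enthalpy" if terms["dH"] < terms["minus_TdS"] else "entropy"


# Hypothetical inputs chosen only to illustrate the calculation.
example = binding_thermodynamics(kd_molar=100e-9, delta_h_kcal=-2.0)
driver = dominant_term(example)
print(example, driver)
```

With these illustrative inputs the −TΔS term is more negative than ΔH, which is the pattern the caption describes as an entropically driven profile; swapping in a strongly negative ΔH would reverse the classification.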