1004 Biophysical Journal Volume 105 August 2013 1004–1017

The Arginine-Rich RNA-Binding Motif of HIV-1 Rev Is Intrinsically Disordered and Folds upon RRE Binding

Fabio Casu,† Brendan M. Duggan,‡ and Mirko Hennig†* †Department of Biochemistry and Molecular Biology, Medical University of South Carolina, Charleston, South Carolina; and ‡Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California at San Diego, La Jolla, California

ABSTRACT Arginine-rich motifs (ARMs) capable of binding diverse RNA structures play critical roles in , trans- lation, RNA trafficking, and RNA packaging. The regulatory HIV-1 protein Rev is essential for viral replication and belongs to the ARM family of RNA-binding proteins. During the early stages of the HIV-1 life cycle, incompletely spliced and full-length viral mRNAs are very inefficiently recognized by the splicing machinery of the host cell and are subject to degradation in the . These transcripts harbor the Rev Response Element (RRE), which orchestrates the interaction with the Rev ARM and the successive Rev-dependent mRNA export pathway. Based on established criteria for predicting intrinsic disorder, such as hydropathy, combined with significant net charge, the very basic primary sequences of ARMs are expected to adopt coil-like structures. Thus, we initiated this study to investigate the conformational changes of the Rev ARM associated with RNA binding. We used multidimensional NMR and circular dichroism spectroscopy to monitor the observed structural transi- tions, and described the conformational landscapes using statistical ensemble and molecular-dynamics simulations. The com- bined spectroscopic and simulated results imply that the Rev ARM is intrinsically disordered not only as an isolated peptide but also when it is embedded into an oligomerization-deficient Rev mutant. RRE recognition triggers a crucial coil-to-helix transition employing an induced-fit mechanism.

INTRODUCTION RNA-binding proteins are fundamentally important for including oligomeric interactions among Rev monomers and diverse cellular and viral regulatory processes. One wide- interactions with cellular proteins of the nuclear import and spread RNA-binding domain is the arginine-rich motif export machineries. These adaptive recognition processes (ARM), which is present in various proteins, such as ribo- require an adequate degree of conformational flexibility. somal proteins, transcriptional antitermination N proteins The Rev ARM comprises 17 amino acid residues, of bacteriophages l and P22, and the HIV-1 regulatory including a total of 10 arginines (Rev34-50), and in complex proteins Tat and Rev (1–5). A preponderance of arginine with its cognate RNA, RRE StemIIB, it adopts an a-helical residues seems to be the only common feature among these conformation that fits deeply into the RRE StemIIB widened apparently simple and relatively short sequences (10–20 major groove (14). Four arginine residues (R35, R38, R39, amino acid residues in length). While maintaining the abil- and R44), a threonine (T34), and an asparagine (N40) ity to interact with their cognate targets with high affinity convey specific, high-affinity binding to RRE StemIIB and specificity (6,7), they have been shown to be surpris- (4,14–16). However, Rev ARM has also been shown to ingly dynamic and can adopt different conformations, adopt an extended conformation upon binding to anti-Rev such as a-helical, b-hairpin, or extended conformations, RNA aptamers (15,17). Moreover, interactions with selected after binding their specific RNA targets. DNA motifs do not require the T34 or N40 side chains (18), Rev is an ARM-containing RNA-binding protein and a and various arginine residues play key roles in complexes critical regulator of the HIV-1 replication cycle (8–10). Rev with distinctly folded RNA targets, including the primary interacts specifically with a highly structured intronic and secondary binding sites of the RRE (8). Overall, the element hosted within the Env gene known as the Rev variation observed in Rev ARM complex structures suggests Responsive Element (RRE), which is part of all intron-con- that the conformation of Rev ARM is induced by and depen- taining viral messenger (mRNAs). Upon recruitment dent upon its specific nucleic acid binding target. Such an of proteins that belong to the cellular nuclear-export machin- induced-fit or adaptive-binding model has been successfully ery, Crm1 and the GTPase , Rev induces export of applied to describe several protein-protein and protein- unspliced and singly spliced mRNA to the cytoplasm for sub- nucleic acid (DNA-RNA) interactions (19–21). sequent of virion assembly proteins (11–13). To In contrast, early investigations employing circular coordinate these events, Rev must engage in transient pro- dichroism (CD) and NMR suggested that the Rev ARM tein-RNA interactions as well as protein-protein interactions, is predominantly helical in solution (4,22), supporting a conformational selection (selected-fit or population-shift) model (23–27). In this scheme, RRE binding would induce Submitted March 20, 2013, and accepted for publication July 2, 2013. a shift in the conformational ensemble, resulting in prefer- *Correspondence: [email protected] ential selection and stabilization of a preformed, helical, Editor: Kathleen Hall. Ó 2013 by the Biophysical Society 0006-3495/13/08/1004/14 $2.00 http://dx.doi.org/10.1016/j.bpj.2013.07.022 Rev’s Arginine-Rich Motif Is Intrinsically Disordered 1005

RRE-binding competent conformation of the Rev ARM. GB1-derived vector as an N-terminal fusion with a His6 tag, a GB1 domain, However, both of these studies optimized conditions to and a TEV-protease cleavage site (32). Plasmids coding for (His6)-tagged increase helicity, either by modifying the peptide’s termini Rev in T7 expression vector (pSG003) were previously described (33). Cultures of E. coli BL21-CodonPlus (DE3)-RIPL cells expressing (His6)- or by working at low pH. tagged Rev wild-type and V16D/I55N mutant were grown from a single A recently reported x-ray structure of Rev complexed colony in 500 mL 15N- or 15N- and 13C-enriched M9 minimal media at with an engineered monoclonal Fab (23) and another of 37 C overnight. Cells were grown to OD600 of 0.8 at 37 CinLBorM9 media with 100 mg/mL ampicillin. Isopropyl-b-D-thiogalactopyranoside an oligomerization-deficient mutant that forms a homo- dimer (28,29) both show that Rev’s 65 N-terminal residues (IPTG) was added to 1 mM to induce expression for 4 h at 37 C, and har- vested cells were resuspended in denaturing lysis buffer (50 mM Na HPO / adopt an antiparallel helix-turn-helix architecture, and that 2 4 NaH2PO4 pH 7.4, 500 mM KCl, 10 mM imidazole, 1 mM dithiothreitol the ARM is helical in these oligomeric complexes. Pivotal [DTT] and 8 M urea). A urea denaturation and on-column refolding studies employing single-molecule fluorescence have un- protocol was used to remove nonspecific nucleic acid contamination as raveled the oligomerization pathway of Rev onto its cognate previously described (34), and the resulting sample was free of RNA RNA target sequence. The initial RRE binding and succes- contamination as determined by the 260/280 nm absorbance ratio (values of ~0.6) (34). Eluted fractions were analyzed by SDS-PAGE and pooled, sive oligomerization occur in a stepwise manner whereby a and in the case of the Rev ARM peptide fusion protein, TEV protease nucleating Rev monomer binds to the high-affinity binding was added and incubated at room temperature overnight. The resulting site StemIIB followed by highly cooperative binding of peptide was purified by RP-HPLC using a Jupiter Proteo column (Phenom- additional Rev monomers to form oligomeric assemblies enex, Torrance, CA) and a linear gradient of acetonitrile in the presence of stabilized by protein-protein and protein-RNA interactions 0.1% trifluoroacetic acid (TFA). Collected fractions were pooled and lyophilized. (30). Dimeric Rev/Rev complexes do not constitute the For uniform 15N labeling, the constructs were expressed in M9 minimal 15 15 basic building block for oligomeric Rev/RRE complex as- medium supplemented with NH4Cl[ N, 99%] (Cambridge Isotope Lab- sembly. Because Rev populates a monomeric state only at oratories, Andover, MA) as the sole nitrogen source. For uniform 15N,13C labeling, the peptide was expressed in M9 minimal medium supplemented submicromolar concentrations (31), RRE-binding-compe- 15 15 13 13 tent Rev monomers continue to evade structural character- with NH4Cl[ N, 99%] and C-glucose[ C, 99%] (Cambridge Isotope Laboratories). ization by x-ray crystallography or spectroscopic methods. In this study, we employed heteronuclear, multidimen- sional NMR and CD spectroscopy as well as molecular- RNA transcription and purification dynamics (MD) calculations to investigate whether the RRE StemIIB-containing plasmids were purified with the E.Z.N.A.Plasmid wild-type Rev ARM peptide adopts a helical, RRE-binding Giga Kit (Omega Bio-Tek, Norcross, GA), linearized with the appropriate competent conformation in aqueous solution under phy- 30 restriction endonuclease, phenol/chloroform extracted, and ethanol siological conditions. Furthermore, we examined the precipitated. DNA template-directed in vitro transcriptions using T7 RNA conformation of the ARM region in the context of an oligo- polymerase were performed in transcription buffer (40 mM Tris, pH 8.0, merization-deficient Rev variant, the double-mutant 1 mM spermidine, 10 mM DTT, 0.01% Triton X-100) containing 0.05 mg/mL plasmid template, 30 mM MgCl2,1mM T7 RNA polymerase, V16D/I55N, which carries mutations in the known oligo- and 6.5 mM each of ATP, CTP, GTP, and UTP). Transcription reactions merization interfaces and permits investigation of a mostly were incubated at 37C for 4 hr. monomeric state of Rev protein at concentrations suitable Transcripts were purified as previously described (35). Briefly, full- for NMR studies (30). For all of the peptide studies, we length transcripts were separated from protein, unincorporated nucleotides, employed a previously described Rev ARM construct that and abortive transcripts by anion exchange chromatography using a HiTrapQ HP column (GE Healthcare). The RNA was exchanged into buffer features the wild-type sequence with four amino acids containing 50 mM Na2HPO4/NaH2PO4 pH 7.4, 150 mM KCl, 1 mM EDTA, (G-4AMA) appended to the N-terminus of Rev34-50 followed 1 mM DTT, 0.02% NaN3). Complex spectra were acquired after addition of by Aþ1AAAR at the C-terminus (8). To obtain data for a concentrated RRE StemIIB solution to Rev ARM peptide samples (final well-defined helical conformation of the Rev ARM, we molar ratio: 1.1 (RNA) to 1.0 (peptide)). used two preparations: 1), a complex of the Rev ARM pep- tide with RRE StemIIB in physiological buffer; and 2), the CD spectroscopy Rev ARM peptide in 50% 2,2,2-trifluoroethanol (TFE)/50% physiological buffer. The TFE preparation provided a heli- CD measurements of purified Rev ARM peptide, wild-type Rev, and V16D/ cal reference sample without introducing additional signals. I55N Rev mutant protein were recorded on an AVIV 400 spectropolarime- ter (AVIV Biomedical, Lakewood, NJ) at 25C between 198 and 260 nm, with a 0.1 cm path-length cell, a wavelength increment of 1 nm, an aver- MATERIALS AND METHODS aging time of 5 s, and an equilibration time of 5 min. The baseline was cor- rected by subtracting the spectra of the respective buffers collected under Expression and purification of Rev ARM peptide identical conditions. Spectra from three scans were averaged. CD spectra m and full-length Rev constructs of 50 M Rev ARM peptide samples in buffer (20 mM Na2HPO4/NaH2PO4 pH 7.4, 150 mM KCl) were acquired in the presence of 0–50% (v/v) TFE, We cloned and overexpressed the HIV-1 Rev ARM peptide as a cleavable after preincubation at 25C for 1 h. CD spectra of 10 mM wild-type Rev and GB1 fusion in Escherichia coli. The HIV-1 ARM peptide sequence V16D/I55N Rev mutant protein samples were collected in the same buffer GAMATRQARRNRRRRWRERQRAAAAR (native amino acids are in the absence of TFE. The helical content was estimated using the ratio of underlined (8)) was expressed in E. coli strain BL21/DE3 from a p(H) the mean residue ellipticities (q) at 222 and 208 nm.

Biophysical Journal 105(4) 1004–1017 1006 Casu et al.

NMR spectroscopy: sample preparation interresidue NOE peaks were carried out using the CANDID (49) algorithm implemented in CYANA 2.1 (L.A. Systems, Tokyo, Japan) (50). For the The lyophilized peptide samples were dissolved in a buffer containing structure calculation of the Rev ARM peptide in 50% TFE, we evaluated 50 mM Na2HPO4/NaH2PO4, pH 7.4, 150 mM KCl, 1 mM EDTA, 1 mM a total of 236 intramolecular NOE distance constraints, including 53 m DTT, 0.02% NaN3 in 500 L of 90% H2O/10% D2O or 50% TFE/40% medium-range NOEs (see Fig. S2 in the Supporting Material). Additionally, buffer/10% D2O to a final peptide concentration of 1 mM. Aliquots of we derived 48 torsion angle constraints from the chemical-shift data using wild-type Rev and V16D/I55N Rev mutant protein were buffer exchanged TALOSþ predictions, and used them in the calculations (46). against Rev NMR buffer (50 mM Na2HPO4/NaH2PO4, pH 7.4, 500 mM KCl, 1 mM EDTA, 1 mM DTT, 0.02% NaN3) in 500 mL of 90% H2O/ m 10% D2O to a final protein concentration of ~100–200 M. Generation of conformational ensemble

We generated explicit ensemble descriptions of the Rev ARM peptide using NMR data collection and processing the flexible-meccano (FM) algorithm (51). The algorithm accounts for the heteropolymeric nature of the primary sequence and uses amino-acid spe- NMR experiments were recorded on Bruker Avance III 600 MHz and cific f,j-torsion angle sampling along with side-chain volumes to compute Avance II 700 and 800 MHz spectrometers (Bruker AG, Karlsruhe, Ger- the conformational ensemble. The ensemble comprised 100,000 con- many). The Avance 700 and 800 are equipped with a 5 mm 1H[13C/15N] formers, and simulated 1D(HN,N) and 3J(HN,Ha) coupling constants were triple-resonance probe featuring cryogenically cooled preamplifiers and averaged over all ensemble members. The Karplus coefficients (3J(Q) ¼ radiofrequency coils on 1H and 13C channels and the z-axis gradient, and A cos2QþBcosQþC) employed for back calculations were A ¼ 7.9, the 600 MHz instrument features a conventional 5 mm 1H[13C/15N] triple- B ¼1.05, and C ¼ 0.65 Hz (52). resonance probe. All NMR experiments were carried out at either 283 or 298 K. 1H chemical shifts were externally referenced to 4,4-dimethyl-4- silapentane-1-sulfonic acid (DSS), with heteronuclear13C and 15N chemical shifts referenced indirectly according to the 1X/1H ratio via DSS. MD simulations The NMR data were processed using nmrPipe (36) and analyzed with MD simulations were generated using the Amber 11 suite of programs (53) CcpNmr analysis software (37). and the ff99SB force field (54) with ILDN side-chain torsion modifications (55). All simulations used the particle mesh Ewald procedure (56) under periodic boundary conditions. A time step of 5 fs was used for all simulations, Residual dipolar coupling measurements and the SHAKE algorithm (57) was used to constrain the length of bonds to For residual dipolar coupling (RDC) measurements, 6% stretched neutral hydrogen atoms. Langevin dynamics (58) with a collision frequency of 1 polyacrylamide gels were cast using the apparatus described by Chou 1ps was used to maintain temperature. All-atom representations of the et al. (38) and prepared from a stock solution of 36% w/v acrylamide molecules were solvated in truncated octahedral boxes of TIP3P water and 0.94% w/v N,N-methylenebisacrylamide (molar ratio of acrylamide/ (59), and charge was neutralized by the addition of 10 chloride ions. bisacrylamide of 83:1), 0.08% ammonium persulfate, and 0.7% Initial coordinates for the Rev ARM peptide in the unfolding simulation tetramethylethylenediamine. were taken from Protein Data Bank (PDB) entry 1ETF (14). The N-terminal Alkyl-PEG bicelles for RDC measurements were prepared from a 15% Asp residue was replaced with the sequence Gly-Ala-Met-Ala using UCSF bicelle stock solution (150 mL C12E5, 550 mL NMR buffer, and 300 mL Chimera (60). These four N-terminal residues were added in a parallel b-strand conformation. Coordinates for the missing hydrogen atoms were D2O) according to the protocol described by Gebel and Shortle (39). A solution of n-hexanol was titrated into the C12E5 solution in microliter calculated using UCSF Chimera. additions up to a molar ratio C12E5/n-hexanol of 0.96. Bicelles were added To generate multiple unstructured initial coordinates for the folding b to the peptide sample in NMR buffer to make a 6% bicelle sample contain- simulations, we built an extended peptide in the antiparallel -strand ing 1 mM Rev ARM peptide. conformation using UCSF Chimera and subjected it to 50 steps of steep- est-descent energy minimization, followed by 50 steps of conjugate- gradient minimization. The equilibrated extended peptide was heated to NMR assignments and structure calculation 1500 K and held at this temperature for a 500 ps in vacuo trajectory. The trajectory was clustered, and representative structures from each of Unique backbone and side-chain assignments were made by combining the 10 clusters were solvated using different-sized truncated octahedral three-dimensional (3D) HNCO, HNCA, HN(CO)CA, CBCA(CO)NH, boxes to obtain a similar number of water molecules for each folding HNCACB, HBHA(CO)NH, and H(CCCO)NH (40); 15N-separated 3D simulation (Table S1). nuclear Overhauser effect spectroscopy-heteronuclear single quantum The solvated systems were energy minimized in two stages. First, the coherence (NOESY-HSQC) (41); HSQC-NOESY-HSQC (42); and 3D solvent was minimized with 500 steps of steepest-descent energy minimi- HNHA (43,44) experiments. Analysis of a simultaneous 13C/15N-separated zation followed by 500 steps of conjugate-gradient minimization with 3D NOESY-HSQC (45) spectrum together with secondary chemical-shift- positional restraints holding the peptide fixed. The second stage minimized derived torsion restraints (46) yielded extensive restraints for structure the entire system with 1000 steps of steepest descent followed by 1500 elucidation. Isotropic and anisotropic one-bond 1J(HN,N) and 1D(HN,N) steps of conjugate-gradient minimization. Next, the systems were equili- coupling constants were obtained from gradient-enhanced IPAP-HSQC brated, again in two stages. Initially, the peptide was held fixed while experiments employing band-selective 1Haliphatic decoupling pulses (47). the solvent was equilibrated over 20 ps at 1 atm and 283 K. In the second Vicinal 3J(HN,Ha) coupling constants were determined using quantitative stage, we equilibrated the entire system over 100 ps by removing the HNHA experiments. To compensate for finite 1Ha spin flips, apparent restraints on the peptide. Production runs of 50 ns were performed in the 3J(HN,Ha) couplings were multiplied by a uniform factor of 1.05 (44). NPT ensemble, and coordinates were saved every 5 fs. Trajectories of Steady-state, heteronuclear [1HN] 15N NOEs were measured according to each system were extended to 50 ns with a different random number standard procedures (48). Heteronuclear NOE values are reported as the seed on each restart (61). ratio of peak intensities in paired interleaved spectra collected with and Trajectories were analyzed using the Amber ptraj module. Root mean- without the initial proton saturation period (3 s) during the 5 s recycle delay. square deviations (RMSDs) were calculated using the backbone C, Ca, Distance restraints were obtained from NOEs in simultaneous, 3D 13C- and N atoms of all residues unless specified otherwise. Secondary structure and 15N-resolved NOESY experiments, and subsequent assignments of was determined using the DSSP (Define Secondary Structure of Proteins)

Biophysical Journal 105(4) 1004–1017 Rev’s Arginine-Rich Motif Is Intrinsically Disordered 1007 algorithm. Clustering used the average linkage algorithm. The dihedral 208 and 222 nm and the ellipticity progressively increased, angle Q was defined by the HN, N, Ca, and Ha atoms, to be consistent suggesting an increase in helical content. The ratio [q222]/ with the NMR measurements, and converted to coupling constants using [q ], a measure of helical content, increased from 0.42 values of 7.90, 1.05, and 0.65 in the Karplus relationship. 208 in the absence of TFE to a maximum of 0.82 in the pres- ence of 50% (v/v) TFE (Fig. 1 B). Accordingly, we chose RESULTS to use 50% TFE as a sample condition to provide a refer- CD spectroscopy ence helical conformation of the Rev ARM peptide. A comparison of the CD spectra collected for wild-type CD spectroscopy has been used extensively to study Rev, Rev and V16D/I55N Rev mutant protein confirmed signifi- and was the first method we applied. CD spectra of the cant a-helical content for oligomeric wild-type Rev. In Rev ARM peptide in physiological buffer exhibited a contrast, the oligomerization-deficient Rev double-mutant minimum around 200 nm, typical of an unfolded state V16D/I55N adopted a mostly random coil conformation (Fig. 1 A). Upon titration with TFE, minima appeared at (Fig. 1 C).

Peptide 1HN,15N-HSQC spectra To obtain residue-specific information regarding the Rev ARM peptide, we recorded 1HN,15N-HSQC spectra (Fig. 2). In aqueous solution, the peptide shows little dispersion of the resonances (Fig. 2 B), indicative of an unfolded peptide and in accordance with the CD data. In contrast, the 1HN,15N-HSQC spectra of the same construct in 50% TFE (Fig. 2 C) and in complex with RRE StemIIB (Fig. 2 D) show much greater dispersion. Overall, the TFE- and RRE StemIIB-induced, weighted amide chemical shift changes (Dd(1HN,15N)) are similar in magnitude, averaging 0.31 and 0.34 ppm, respectively. Interestingly, the majority of the Dd(1HN,15N) values induced by 50% TFE and RRE-StemIIB binding resemble each other in direction and magnitude, suggesting that the confor- mations in the presence of 50% TFE and RRE-StemIIB are similar. The RRE-StemIIB-induced Dd(1HN,15N) values of R35 and R44 differ considerably from those induced by 50% TFE; however, this difference is likely due to the base-specific contacts these residues make with opposite sides of the RRE-StemIIB major groove (see Fig. S1)(14).

Wild-type and mutant Rev 1HN,15N-correlation spectra NMR investigations of the 116-residue wild-type Rev in solution are notorious for limited protein solubility and uncontrolled oligomerization. Therefore, sequential assign- ments outside the ARM remain unavailable. Fewer than 20% of all backbone amides, all indicative of a disordered protein region, are observable in 1HN,15N-transverse relax- 1 N 15 FIGURE 1 TFE-dependent CD of the Rev ARM peptide and CD of Rev ation optimized spectroscopy ( H , N-TROSY) spectra protein variants. (A) Mean residue molar ellipticity q as a function of wave- (62) recorded at 283 K (Fig. 3 A). However, because the length (198–260 nm) for Rev ARM in increasing concentrations of TFE V16D/I55N mutations are known to block oligomerization cosolvent. The double minima at 208 and 222 nm indicate an a-helical sec- of Rev on the RRE (30), the introduction of these two ondary structure in TFE, and the TFE-free spectrum exhibiting a minimum mutations into the protein oligomerization domains at ~200 nm is indicative of a more random coil conformation. (B)[q ]/ 222 allows for detection of many additional, albeit poorly [q208] ratio as a function of TFE concentration. (C) Mean residue molar ellipticity q as a function of wavelength (198–260 nm) for wild-type Rev dispersed, resonances located in the N-terminal domain of and V16D/I55N mutant protein in buffer, pH 7.4. Rev (Fig. 3 C). Employing conventional through-bond

Biophysical Journal 105(4) 1004–1017 1008 Casu et al.

FIGURE 2 Resonance dispersion of Rev ARM in 1H,15N-HSQC spectra. (A) Primary sequence of the Rev ARM peptide construct investigated.

Nonnative N-terminal residues include G-4 through A-1, and the C-terminal extension consists of Aþ1 1 15 through Rþ5.(B) H, N-HSQC of Rev ARM pep- tide in NMR buffer (50 mM Na2HPO4/NaH2PO4 pH 7.4, 150 mM KCl, 1 mM EDTA, 1 mM DTT, 1 15 0.02% NaN3). (C) H, N-HSQC of the Rev ARM peptide in 50% TFE/40% NMR buffer/ 1 15 10% D2O. (D) H, N-HSQC of the Rev ARM peptide complexed with RRE StemIIB in NMR buffer. Insets show the side-chain W45 indole and the T34 backbone amide correlations. All 1H,15N-HSQC experiments were acquired at 800 MHz and 283 K.

assignment strategies, we were able to sequentially amide chemical shifts in the V16D/I55N Rev mutant assign >70% of the backbone amide resonances in the construct and those in the Rev ARM peptide strongly sug- V16D/I55N Rev mutant construct. Backbone amide assign- gest that the adopted confirmation is context independent ments obtained for the V16D/I55N Rev mutant could be and disordered (Fig. 3 B). transferred to the wild-type spectrum, where they mapped exclusively to the extreme C-terminus (Fig. 3, A and C). Secondary chemical shifts Although sizable portions of the oligomerization interfaces centered around helices 1 and 2 are broadened beyond To obtain additional insights into the conformational prefer- detectable limits, even in the V16D/I55N Rev mutant ences of the Rev ARM peptide and the V16D/I55N Rev (Fig. 3 D), unambiguous assignments include the N-termi- mutant, we analyzed the 13Ca and 13Cb secondary chemical nal part of the ARM, spanning residues T34 through N40 shifts. The difference between 13Ca and 13Cb secondary (Fig. 3 B). The observed similarities between the backbone chemical shifts for the Rev ARM peptide in aqueous

Biophysical Journal 105(4) 1004–1017 Rev’s Arginine-Rich Motif Is Intrinsically Disordered 1009

FIGURE 3 NMR analysis of wild-type Rev and V16D/I55N mutant protein and secondary chemical shifts of Rev V16D/I55N. (A) An 800 MHz 1H,15N-TROSY spectrum of wild-type Rev in NMR buffer containing 500 mM KCl, pH 7.4, recorded at 283 K. Assignments for observable residues are provided. (B) Inset showing an overlay of 1H,15N-HSQC spectrum of full-length V16D/I55N Rev (single black contour) and Rev ARM peptide (gray contours) in NMR buffer recorded at 283 K. Arrows and labels indicate the position of assigned residues in the ARM region. (C) An 800 MHz 1H,15N-HSQC spectrum of full-length V16D/I55N Rev in NMR buffer containing 500 mM KCl, pH 7.4, recorded at 283 K. (D) Combined secondary chemical shifts, Dd(13Ca,b), of full-length V16D/I55N Rev in NMR buffer, pH 7.4, are plotted against the residue number. The helix-turn-helix motif and functionally impor- tant ARM/NLS and NES are shown along with the full-length Rev primary sequence. Arrows indicate the position of the two mutations in the Rev oligo- merization domains (V16D and I55N). Boxed C-terminal residues are assigned in panel A. solution at pH 7.4 averages 0.8 ppm over all residues, ARM peptide scenario, this indicates that residues T34- consistent with a random coil-like conformation (Fig. 4 N40 do not adopt a stably folded a-helical conformation A). In contrast, in the presence of 50% TFE (Fig. 4 B)or in the context of the mostly monomeric V16D/I55N protein RRE StemIIB (Fig. 4 C), the Rev ARM peptide exhibits dif- variant. ferences in 13Ca and 13Cb secondary chemical shifts aver- aging 2.8 and 3.7 ppm, respectively, consistent with an 3J(HN,Ha) coupling constants a-helical conformation. Most of the residues between T34 and R50 have combined 13Ca and 13Cb secondary chemical Fig. 5 A summarizes the 3J(HN,Ha) coupling constants for shifts between 3 and 4 ppm, characteristic of a fully formed Rev ARM peptide in aqueous solution at pH 7.4, in 50% a-helix (63). Secondary chemical shifts of the oligomeriza- (v/v) TFE, and in the RRE-bound state. One can interpret tion-deficient V16D/I55N Rev mutant confirm that the the vicinal coupling data by considering the mean C-terminus of Rev is disordered. Furthermore, the observed 3J(HN,Ha) values predicted for a-helical and b-strand con- deviations from random coil for 13Ca and 13Cb chemical formations (64). In helical regions, the 3J(HN,Ha) coupling shifts of residues located in the ARM region (T34 and constants average 5.2 Hz, in b-sheet like conformations Q36–N40) within the V16D/I55N Rev mutant (Fig. 3 D) they average 8.5 Hz, and in random coil-like regions, inter- are virtually identical to the corresponding secondary chem- mediate values ranging from 7.1 to 7.7 Hz are found. For ical shifts in the ARM peptide context. In similarity to the the Rev ARM peptide in aqueous solution at pH 7.4, we

Biophysical Journal 105(4) 1004–1017 1010 Casu et al.

FIGURE 5 3J(HN,Ha) coupling constants and [1HN]15N-heteronuclear FIGURE 4 Secondary structure of Rev ARM as indicated by 13C chem- NOE analysis. (A) Vicinal 3J(HN,Ha) coupling constants for Rev ARM pep- ical shift. Combined secondary chemical shifts, Dd(13Ca,b), are plotted as tide in NMR buffer (black squares) and 50% TFE (gray triangles) were the difference of experimental 13Ca and 13Cb chemical-shift deviation from determined from quantitative HNHA experiments. 3J(HN,Ha) coupling random coil chemical shifts. Dd(13Ca,b) is plotted against residue number values for Rev ARM peptide in complex with RRE StemIIB (black circles) for (A) Rev ARM in NMR buffer, pH 7.4; (B) Rev ARM in 50% TFE; and were taken from Battiste (65). Error bars indicate the estimated error in the (C) Rev ARM in complex with StemIIB (14,65). Carbon chemical shifts for three-bond HNHA coupling. (B)[1HN]15N-hetNOEs of Rev ARM peptide Rev ARM peptide in complex with RRE StemIIB were taken from Battiste in NMR buffer, pH 7.4 (black squares), in 50% TFE (gray triangles), and in

(65). Native Rev ARM34-50 residues are shown in bold one-letter amino complex with RRE StemIIB (black circles). Error bars indicate the SD in acid abbreviations, and nonnative residues are indicated by an asterisk. the measured values. Native Rev ARM34-50 residues are shown in bold Values above the dashed line at 2.8 ppm are indicative of an a-helical one-letter amino acid abbreviations, and nonnative residues are indicated conformation. by an asterisk.

Structure calculations observed intermediate 3J(HN,Ha) couplings, with an average of 6.0 Hz for residues T34–R50. In comparison, 1H, 15N, and 13C chemical-shift assignments of Rev ARM in the J couplings of the Rev ARM peptide in 50% TFE are the absence and presence of 50% TFE were nearly com- smaller, with an average value of 4.7 Hz for residues plete, and assignments have been deposited in the Bio- T34–R50, indicative of the presence of a-helix. Similar MagResBank (accession numbers 18851 and 18852, values were reported by Battiste (65) for the a-helical respectively). conformation of the Rev ARM peptide in complex with NOE-derived distance restraints and chemical-shift- RRE StemIIB (with an average value of 4.4 Hz for residues derived backbone torsion restraints were used to calculate T34–R50). structural ensembles of the Rev ARM peptide in aqueous solution at pH 7.4 and in 50% TFE. In the case of the phys- iological sample, the simulated annealing procedure Backbone dynamics converged poorly and no single conformation could be Analysis of [1HN],15N heteronuclear NOEs (hetNOEs) defined that satisfied our experimental chemical shift and conveniently identifies fast-timescale protein backbone NOE data. In contrast, the data derived from the 50% TFE dynamics on a subnanosecond timescale (66). Fig. 5 B sample produced a well-defined helical ensemble. None of shows that the [1HN],15N hetNOE values for Rev ARM in the final ensemble of 20 energy-minimized NMR structures aqueous solution at pH 7.4 average 0.47 for residues T34– shown in Fig. 6 A violated NOE distances by more than R50, consistent with a disordered peptide. In contrast, the 0.5 A˚ . No torsion angle restraint was violated by more hetNOEs of the Rev ARM peptide in complex with RRE than 5. The vast majority of residues (99.8%) were found StemIIB average 0.83 for residues T34–R50, and in 50% in the most favored regions of the Ramachandran plot. TFE average 0.77 for residues T34–R50. These higher The root mean-square deviation (RMSD) of the structure hetNOE values indicate higher backbone rigidity and are ensemble compared with the mean structure was 0.87 5 typical of a well-ordered secondary structure. 0.35 A˚ for backbone and 1.69 5 0.32 A˚ for all heavy atoms.

Biophysical Journal 105(4) 1004–1017 Rev’s Arginine-Rich Motif Is Intrinsically Disordered 1011

conformation adopted in 20% TFE of a succinylated R42A point mutant (22), characterized by backbone RMSDs of 0.9 A˚ and 0.7 A˚ , respectively (Fig. S3, A and B). The atomic coordinates and experimental NMR restraints (PDB identi- fier 2M1A) for the Rev ARM structure in 50% TFE have been deposited in the PDB (http://www.rcsb.org/).

Residual dipolar couplings To investigate whether residual conformational preferences exist in aqueous solution at pH 7.4, we oriented the Rev FIGURE 6 NMR solution structure of Rev ARM in 50% TFE. (A) ARM peptide in stretched acrylamide gels (67) and polye- Ensemble of 20 CYANA conformers representing the solution structure thyleneglycol (PEG)/hexanol (68) mixtures to measure 1 N of the Rev ARM peptide in the presence of 50% (v/v) TFE calculated using D(H ,N) RDCs. Although the absolute level of alignment NOE-derived distance-restraints and TALOSþ chemical-shift-derived tor- differed between the two anisotropic environments, very sion-angle constraints. (B) Ribbon representation of the mean structure of similar general distributions of 1D(HN,N) emerged (Fig. 7, Rev ARM in 50% TFE calculated from the ensemble using MolMol (89). A and C). All but two residues at the very C-terminus of Rev ARM were found to have negative RDCs, with the A summary of the distance restraints and structural statistics largest values in the central residues. RDC values generally is given in Table S1 and Fig. S2. The Rev ARM peptide in followed a bell-shaped distribution over the sequence, with 50% TFE features an a-helical conformation (Fig. 6 B) that small couplings being measured at the N- and C-terminal closely resembles the StemIIB-bound state (14) and the extremities. Such distributions have been observed and

FIGURE 7 1D(HN,N) RDC analysis. (A) Experimental 1D(HN,N) values obtained in liquid crystalline media composed of n-dodecyl-penta(ethylene gly- col) (C12E5) and n-hexanol versus PALES-predicted 1D(HN,N) RDCs calculated assuming steric alignment of Rev ARM peptide coordinates taken from PDB entry 1ETF (14). (B) Correlation between PALES-predicted 1D(HN,N) RDCs for the a-helical Rev ARM peptide conformation and the experimental 1D(HN,N) values obtained in C12E5/n-hexanol. The dashed line is the linear regression (slope of 2.581). (C) Experimental 1D(HN,N) values obtained in C12E5/n-hexanol, stretched PAGE gels, and RDCs predicted from 100,000 FM-generated conformers. (D) Correlation between FM-predicted 1D(HN,N) RDCs and the experimental 1D(HN,N) values obtained in anisotropic C12E5/n-hexanol mixture. The dashed line is the linear regression (slope of 1.266).

Biophysical Journal 105(4) 1004–1017 1012 Casu et al. rationalized as random coil RDCs and are a direct conse- tion of the trajectory confirms the helical conformation of quence of the conformational properties embedded into the Rev ARM peptide. The unfolding simulation suggests the primary sequence (69–71). that once the core of the peptide is formed, its helical char- To rule out the presence of residual a-helical conforma- acter is quite stable over a 50 ns time span (Fig. 8 B). tional preferences, we used the PALES software (72)to Since the folded Rev ARM peptide was quite stable in our predict RDC data based on a steric mode of molecular align- unfolding simulation, we decided to test whether the peptide ment and the a-helical coordinates of the Rev ARM peptide would fold into a stable a-helix from random coil coordi- observed in the complex with RRE StemIIB (PDB identifier nates. To minimize the influence of the starting conforma- 1ETF). A comparison of experimental 1D(HN,N) values tion, we generated 10 50-ns-long trajectories, each starting obtained in PEG/hexanol mixtures with those predicted from different random coil coordinates. To produce starting for the a-helical conformation of the Rev ARM peptide coordinates, an extended structure of the Rev ARM peptide showed no obvious correlation (Fig. 7 A), as characterized was built, subjected to an in vacuo high-temperature simula- by a Pearson’s correlation coefficient of 0.493 (Fig. 7 B). tion, and then clustered (Fig. S4). To maximize sampling of Thus within detectable limits, the experimental 1D(HN,N) conformational space, the temperature and length of the in values are inconsistent with the presence of an a-helical vacuo simulation were increased until visualization of the conformation. trajectory revealed that the peptide had contracted and Explicitly built conformational ensembles based on expanded multiple times, and clustering had produced clus- knowledge-based torsion-angle statistics have proved to be ters that were visited multiple times over the course of the useful for characterizing intrinsically disordered proteins trajectory with no single cluster dominating. Clustering of (IDPs) (73). RDC predictions based on the statistical FM the trajectory into 10 clusters provided representative struc- (51) approach accurately reproduced the experimental tures that were used as the starting coordinates for folding 1D(HN,N) values obtained in stretched acrylamide gels and simulations (Fig. S5). Inspection of the secondary structure PEG/hexanol mixtures (Fig. 7 C). The correlation between over the course of the folding trajectories shows that these experimental PEG/hexanol and FM-predicted 1D(HN,N) simulations failed to produce any significant a-helical values has a Pearson’s coefficient of 0.842, suggesting that content (Fig. 9). in aqueous solution at pH 7.4, the Rev ARM peptide is intrin- To determine whether the simulations were consistent sically disordered (Fig. 7 D). It should be noted that we with the experimental NMR data, we measured the dihedral corrected variations in the absolute levels of alignment by angle f for all residues over the course of all 10 folding sim- applying optimal scaling factors. Collectively, our RDC anal- ulations and converted the values to mean coupling con- ysis demonstrates that the Rev ARM peptide in aqueous stants using a Karplus relation (52). The 3J(HN,Ha) values solution at pH 7.4 populates an ensemble of rapidly intercon- derived from the simulations averaged ~6.5 Hz, which is verting conformations that are mostly unstructured. typical of a disordered peptide and consistent with the experimental values obtained in aqueous solution at pH 7.4 (Fig. S6). To examine the sampling of conformational MD simulations space, we combined all 10 folding simulations and grouped To further probe the lack of experimental evidence for a he- the combined data into 10 clusters (Fig. S7). If conforma- lical conformation of the Rev ARM peptide in aqueous so- tional sampling was limited, then each cluster would corre- lution at pH 7.4, we generated 11 different 50 ns all-atom spond to one trajectory. If conformational sampling was explicit-solvent MD simulations. First, an unfolding simula- extensive, then each cluster would be sampled from each tion was started from the a-helical coordinates of the Rev trajectory. We found that seven of 10 clusters grouped ARM peptide observed in the NMR solution structure snapshots from multiple trajectories, and that during all with RRE StemIIB (PDB identifier 1ETF). The raw but one simulation the peptide transitioned to conformations RMSD from the starting coordinates plotted over the course sampled by other folding trajectories. The folding simula- of the simulation is large and continues to increase, but tions appear to have sampled the available conformational calculating the RMSD after superimposing the peptide on space extensively, are consistent with the 3J(HN,Ha) just the native residues (T34–R50) shows that the RMSD coupling constants obtained from the aqueous NMR exper- increases in two steps in the first 3 ns of the simulation iments in which the peptide was disordered, and fail to show and then remains fairly constant around 2 A˚ (Fig. 8 A). any consistent secondary structure formation, suggesting This indicates that after equilibrating over the first 3 ns, that on the 50 ns timescale, the Rev ARM peptide does the majority of the motion in the unfolding simulation not transition to an a-helical conformation. occurs in the termini. An examination of the secondary structure over the course of the unfolding simulation shows DISCUSSION that a structural transition occurs around 3 ns and that the native residues in the core of the peptide remain predomi- To date, NMR solution structures of the Rev ARM peptide in nantly a-helical throughout the simulation. Visual inspec- complex with its high-affinity recognition site, RRE StemIIB

Biophysical Journal 105(4) 1004–1017 Rev’s Arginine-Rich Motif Is Intrinsically Disordered 1013

FIGURE 8 Unfolding simulation. (A)RMSD from starting coordinates over the course of the unfolding simulation. The upper gray line was calculated after superimposing the peptide on the backbone atoms of all residues, and the lower black line was calculated after superimposing it on the native residues (T34–R50). (B) Secondary structure of the Rev ARM peptide over the course of the unfolding simulation. Secondary structure was determined with the DSSP algorithm imple- mented in the Amber ptraj module. Secondary structure is colored as follows: coil (gray), b-strand

(blue), turn (green), a-helix (red), and 310 helix (yellow).

(14), bound to an RNA aptamer (15), and of the N-terminally obtained in 50% TFE with those obtained in aqueous solution succinylated R42A point mutant of Rev ARM at pH 4.7 in the at pH 7.4, we found no evidence supportive of an a-helical presence of 20% TFE (22) have been determined. Crystal character of wild-type Rev ARM in aqueous solution at pH structures for the N-terminal helix-turn-helix motif of a 7.4. The strong tendency to self-associate has thus far Rev double mutant (28) and one of a complex with mono- thwarted attempts to obtain structural information about clonal Fab fragments (29) have been solved. Collectively, the full-length, wild-type Rev protein on the atomic level. these structures demonstrate the helical of Rev ARM upon Mutation of Val-16 and Ile-55 to Asp-16 and Asn-55 partially RRE binding and homodimerization, and under conditions alleviated uncontrolled oligomerization and permitted unam- with TFE as a cosolvent. In contrast, when it is bound to a biguous assignments for >70% of the backbone residues. 35-mer RNA aptamer, Rev ARM adopts an extended confor- Consistent with previous studies demonstrating the disor- mation (17). To characterize the conformational state of free dered nature of the C-terminus of the Rev L60R variant in Rev ARM, we first confirmed the previously observed stabi- complex with a 42 nt StemIIB hairpin (8,74), our secondary lization of a-helical structure by TFE, thereby establishing a chemical shifts reveal intrinsic disorder for the C-terminus of reference helical preparation for comparison studies. Using the wild-type and the entire oligomerization-deficient V16D/ heteronuclear 3D NMR spectroscopy, we confirmed that in I55N Rev variant, including the assignable portion of the 50% TFE, the structure of our Rev ARM peptide was very ARM in the absence of the RRE. similar to the succinylated R42Avariant structure determined To probe for low populations of a-helical ARM peptide in 20% TFE, and to the conformation found when the peptide conformations, we measured RDCs in two different align- is bound to RRE StemIIB. When we compared the NMR data ment media. In a recent study, experimental RDCs were

Biophysical Journal 105(4) 1004–1017 1014 Casu et al.

FIGURE 9 Folding simulations. Secondary structure of the Rev ARM peptide over the course of the 10 folding simulations. Secondary structure was deter- mined with the DSSP algorithm implemented in the Amber ptraj module. Secondary structure is colored as follows: coil (gray), b-strand (blue), turn (green), a-helix (red), and 310 helix (yellow).

Biophysical Journal 105(4) 1004–1017 Rev’s Arginine-Rich Motif Is Intrinsically Disordered 1015 reproduced with the use of statistical models derived from mined by gel-shift experiments from 40 to 35 nM. Combi- backbone conformations sampled in coil-like structures of natorial screening of arginine-rich peptide libraries the PDB (73). We found that the FM statistical coil model, identified that a glutamine (corresponding to Asn-40 of which describes an ensemble of unstructured, interconvert- the Rev ARM) within a stretch of polyarginine specifically ing structures, accurately reproduced our experimental recognizes StemIIB (79). Further optimization flanked the RDC data, whereas an a-helical conformation did not. glutamine at positions 6, 2, and þ2 with nonarginine Additionally, we generated >500 ns of MD simulations. residues featuring aspartic acid, glutamic acid, and alanine Typically, coil-to-helix transitions are thought to be highly at high frequencies. Since positions 6, 2, and þ2 are cooperative (75), with helix nucleation times estimated to located opposite of the glutamine within a putative be 20–70 ns (76), yet in 10 different 50 ns folding simula- a-helical conformation, it is unlikely that those nonargi- tions of the Rev ARM peptide, not one coil-to-helix transi- nine residues are specifically contacting the RRE major tion was observed. groove. Instead, charge neutralization and stabilization of The results from our comprehensive CD, NMR, and MD the a-helix in general appear to be responsible for the up studies of the Rev ARM peptide under physiological condi- to 50-fold higher affinity for StemIIB of these polyargi- tions demonstrate that its conformation is most accurately nine-derived peptides compared with the wild-type Rev represented by the FM statistical coil model, which is ARM (80). most likely a consequence of the abundance of positively charged arginine residues. Previous CD studies suggested CONCLUSIONS that the unmodified Rev ARM (T34–R50) adopts a random coil-like structure, and that the apparent helical content of The combination of low hydrophobicity and large net ARM peptides depends strongly on terminal modifications charge under physiological conditions (pIT34-R50 ¼ 12.6) because it was highest for a construct featuring a succiny- predicts intrinsic disorder for the Rev ARM peptide lated N-terminus, a C-terminal extension of four alanines (81,82). Indeed, our results confirm that the 17-amino- and one arginine residue, and an amidated C terminus (4). acid, arginine-rich RNA-binding motif of Rev located Similarly, qualitative homonuclear NMR studies suggesting within helix2 of the N-terminal helix-loop-helix motif con- a helical conformation for the Rev ARM peptide free in so- stitutes an intrinsically disordered sequence element within lution were performed with an N-succinylated-carboxamide the isolated peptide and also in the context of the full-length, R42A point mutant and were carried out at pH 4.7 and 4 C, monomeric Rev protein. Our CD, NMR, and MD investiga- conditions that were selected because they increased the tion of the Rev ARM peptide in aqueous solution at physi- structured nature of the peptide (22). ological pH reveals a statistical ensemble of interconverting In line with previous investigations that revealed concen- structures lacking detectable a-helical subpopulations, sug- tration-dependent secondary structure formation for wild- gesting that RRE StemIIB recognition by the Rev ARM type Rev and assigned a molten-globular state to the Rev occurs via an induced-fit (21) or coupled binding-folding monomer (77), our CD and NMR investigation of the olig- mechanism (83,84). Our findings outline a principle that omerization-deficient V16D/I55N Rev construct provides might constitute a general strategy for the ubiquitous class evidence that the ARM region constitutes an intrinsically of arginine-rich RNA-binding proteins to recognize their disordered element in the absence of binding partners that diverse and often induced RNA-bound structures (85) while can induce its a-helical conformation. The V16D/I55N performing dual functional roles as nuclear localization Rev variant is predominantly monomeric, even at concentra- signals (86–88). tions required for detailed NMR investigations, which facil- itated the confirmation of the unstructured nature of Rev monomers with site-specific resolution. SUPPORTING MATERIAL Because nucleation of Rev assembly at the high-affinity Seven figures, one table, and supporting references are available at http:// StemIIB site involves a Rev monomer (10,30,78), we sug- www.biophysj.org/biophysj/supplemental/S0006-3495(13)00807-2. gest that this initial interaction of the Rev ARM with RRE StemIIB proceeds via an induced-fit mechanism rather than We thank members of the Hennig laboratory for stimulating discussions and comments on the manuscript, and acknowledge the support of the Hollings by conformational selection. Upon encountering the un- Marine Laboratory NMR Facility for this work. structured ARM, StemIIB provides specific interactions This work was supported by the National Institutes of Health (grant that neutralize the positive charge of the numerous argi- AI081640 to M.H.). nines and enable formation of the a-helical conformation. A combination of mutagenesis and CD studies previously established a correlation between specific StemIIB binding REFERENCES affinity and a-helical content (4). The R A variant showed 42 1. Hemmerich, P., S. Bosbach, ., U. Krawinkel. 1997. Human ribosomal increased helical content compared with the wild-type protein L7 binds RNA with an a-helical arginine-rich and lysine-rich peptide, which in turn lowered the specific Kd as deter- domain. Eur. J. Biochem. 245:549–556.

Biophysical Journal 105(4) 1004–1017 1016 Casu et al.

2. Lazinski, D., E. Grzadzielska, and A. Das. 1989. Sequence-specific 26. Tsai, C. J., B. Ma, and R. Nussinov. 1999. Folding and binding cas- recognition of RNA hairpins by bacteriophage antiterminators requires cades: shifts in energy landscapes. Proc. Natl. Acad. Sci. USA. a conserved arginine-rich motif. Cell. 59:207–218. 96:9970–9972. 3. Su, L., J. T. Radek, ., M. A. Weiss. 1997. RNA recognition by a 27. Tsai, C. J., B. Ma, ., R. Nussinov. 2001. Structured disorder and bent a-helix regulates transcriptional antitermination in phage l. conformational selection. Proteins. 44:418–427. Biochemistry. 36:12722–12732. 28. Daugherty, M. D., B. Liu, and A. D. Frankel. 2010. Structural basis for 4. Tan, R., L. Chen, ., A. D. Frankel. 1993. RNA recognition by an iso- cooperative RNA binding and export complex assembly by HIV Rev. lated a helix. Cell. 73:1031–1040. Nat. Struct. Mol. Biol. 17:1337–1342. 5. Calnan, B. J., S. Biancalana, ., A. D. Frankel. 1991. Analysis of argi- 29. DiMattia, M. A., N. R. Watts, ., J. M. Grimes. 2010. Implications of nine-rich peptides from the HIV Tat protein reveals unusual features of the HIV-1 Rev dimer structure at 3.2 A resolution for multimeric bind- RNA-protein recognition. Genes Dev. 5:201–210. ing to the Rev response element. Proc. Natl. Acad. Sci. USA. 107:5810– 6. Tan, R., and A. D. Frankel. 1995. Structural variety of arginine-rich 5814. RNA-binding peptides. Proc. Natl. Acad. Sci. USA. 92:5282–5286. 30. Pond, S. J., W. K. Ridgeway, ., D. P. Millar. 2009. HIV-1 Rev protein 7. Smith, C. A., V. Calabro, and A. D. Frankel. 2000. An RNA-binding assembles on viral RNA one molecule at a time. Proc. Natl. Acad. Sci. chameleon. Mol. Cell. 6:1067–1076. USA. 106:1404–1408. 8. Daugherty, M. D., I. D’Orso, and A. D. Frankel. 2008. A solution to 31. Cole, J. L., J. D. Gehman, ., L. C. Kuo. 1993. Solution oligomer- limited genomic capacity: using adaptable binding surfaces to ization of the protein of HIV-1: implications for function. assemble the functional HIV Rev oligomer on RNA. Mol. Cell. Biochemistry. 32:11769–11775. 31:824–834. 32. Sugase, K., M. A. Landes, ., M. Martinez-Yamout. 2008. Overexpres- 9. Malim, M. H., and B. R. Cullen. 1991. HIV-1 structural gene expres- sion of post-translationally modified peptides in Escherichia coli by co- sion requires the binding of multiple Rev monomers to the viral expression with modifying enzymes. Protein Expr. Purif. 57:108–115. RRE: implications for HIV-1 latency. Cell. 65:241–248. 33. Jain, C., and J. G. Belasco. 1996. A structural model for the HIV-1 Rev- 10. Mann, D. A., I. Mikae´lian, ., J. Karn. 1994. A molecular rheostat. RRE complex deduced from altered-specificity rev variants isolated by Co-operative rev binding to stem I of the rev-response element modu- a rapid genetic strategy. Cell. 87:115–125. lates human immunodeficiency virus type-1 late gene expression. J. Mol. Biol. 241:193–207. 34. Marenchino, M., D. W. Armbruster, and M. Hennig. 2009. Rapid and efficient purification of RNA-binding proteins: application to HIV-1 11. Hope, T. J. 1999. The ins and outs of HIV Rev. Arch. Biochem. Biophys. Rev. Protein Expr. Purif. 63:112–119. 365:186–191. . 35. Scott, L. G., and M. Hennig. 2008. RNA structure determination by 12. Malim, M. H., J. Hauber, , B. R. Cullen. 1989. The HIV-1 rev trans- NMR. Methods Mol. Biol. 452:29–61. activator acts through a structured target sequence to activate nuclear export of unspliced viral mRNA. Nature. 338:254–257. 36. Delaglio, F., S. Grzesiek, ., A. Bax. 1995. NMRPipe: a multidimen- sional spectral processing system based on UNIX pipes. J. Biomol. 13. Pollard, V. W., and M. H. Malim. 1998. The HIV-1 Rev protein. Annu. NMR. 6:277–293. Rev. Microbiol. 52:491–532. 37. Vranken, W. F., W. Boucher, ., E. D. Laue. 2005. The CCPN data 14. Battiste, J. L., H. Mao, ., J. R. Williamson. 1996. a Helix-RNA major model for NMR spectroscopy: development of a software pipeline. groove recognition in an HIV-1 rev peptide-RRE RNA complex. Proteins. 59:687–696. Science. 273:1547–1551. . 15. Ye, X., A. Gorin, ., D. J. Patel. 1996. Deep penetration of an a-helix 38. Chou, J. J., S. Gaemers, , A. Bax. 2001. A simple apparatus for into a widened RNA major groove in the HIV-1 rev peptide-RNA generating stretched polyacrylamide gels, yielding uniform alignment aptamer complex. Nat. Struct. Biol. 3:1026–1033. of proteins and detergent micelles. J. Biomol. NMR. 21:377–382. 16. Xu, W., and A. D. Ellington. 1996. Anti-peptide aptamers recognize 39. Gebel, E. B., and D. Shortle. 2007. Characterization of denatured pro- amino acid sequence and bind a protein epitope. Proc. Natl. Acad. teins using residual dipolar couplings. Methods Mol. Biol. 350:39–48. Sci. USA. 93:7475–7480. 40. Sattler, M., J. Schleucher, and C. Griesinger. 1999. Heteronuclear 17. Ye, X., A. Gorin, ., D. J. Patel. 1999. RNA architecture dictates the multidimensional NMR experiments for the structure determination conformations of a bound peptide. Chem. Biol. 6:657–669. of proteins in solution employing pulsed field gradients. Prog. Nucl. Magn. Reson. Spectrosc. 34:93–158. 18. Landt, S. G., A. Ramirez, ., A. D. Frankel. 2005. A simple motif for protein recognition in DNA secondary structures. J. Mol. Biol. 41. Talluri, S., and G. Wagner. 1996. An optimized 3D NOESY-HSQC. 351:982–994. J. Magn. Reson. B. 112:200–205. 19. Bui, J. M., and J. A. McCammon. 2006. Protein complex formation by 42. Grzesiek, S., P. Wingfield, ., A. Bax. 1995. Four-dimensional 15N- acetylcholinesterase and the neurotoxin fasciculin-2 appears to involve separated NOESYof slowly tumbling perdeuterated 15N-enriched pro- an induced-fit mechanism. Proc. Natl. Acad. Sci. USA. 103:15451– teins. Application to HIV-1 Nef. J. Am. Chem. Soc. 117:9594–9595. 15456. 43. Kuboniwa, H., S. Grzesiek, ., A. Bax. 1994. Measurement of HN-H a 20. Levy, Y., J. N. Onuchic, and P. G. Wolynes. 2007. Fly-casting in pro- J couplings in calcium-free calmodulin using new 2D and 3D water- tein-DNA binding: frustration between protein folding and electro- flip-back methods. J. Biomol. NMR. 4:871–878. statics facilitates target recognition. J. Am. Chem. Soc. 129:738–739. 44. Vuister, G. W., and A. Bax. 1993. Quantitative J correlation: a new 21. Williamson, J. R. 2000. Induced fit in RNA-protein recognition. Nat. approach for measuring homonuclear three-bond J(HNH.a) coupling Struct. Biol. 7:834–837. constants in 15N-enriched proteins. J. Am. Chem. Soc. 115:7772–7777. 22. Scanlon, M. J., D. P. Fairlie, ., M. L. West. 1995. NMR solution struc- 45. Pascal, S. M., D. R. Muhandiram, ., L. E. Kay. 1994. Simultaneous ture of the RNA-binding peptide from human immunodeficiency virus acquisition of 15N- and 13C-edited NOE spectra of proteins dissolved (type 1) Rev. Biochemistry. 34:8242–8249. in H2O. J. Magn. Reson. B. 103:197–201. . 23. Kumar, S., B. Ma, , R. Nussinov. 2000. Folding and binding cas- 46. Shen, Y., F. Delaglio, ., A. Bax. 2009. TALOSþ: a hybrid method for cades: dynamic landscapes and population shifts. Protein Sci. 9:10–19. predicting protein backbone torsion angles from NMR chemical shifts. 24. Ma, B., S. Kumar, ., R. Nussinov. 1999. Folding funnels and binding J. Biomol. NMR. 44:213–223. mechanisms. Protein Eng. 12:713–720. 47. Yao, L., J. Ying, and A. Bax. 2009. Improved accuracy of 15N-1H sca- 25. Tsai, C. J., S. Kumar, ., R. Nussinov. 1999. Folding funnels, binding lar and residual dipolar couplings from gradient-enhanced IPAP-HSQC funnels, and protein function. Protein Sci. 8:1181–1190. experiments on protonated proteins. J. Biomol. NMR. 43:161–170.

Biophysical Journal 105(4) 1004–1017 Rev’s Arginine-Rich Motif Is Intrinsically Disordered 1017

48. Farrow, N. A., R. Muhandiram, ., L. E. Kay. 1994. Backbone 69. Louhivuori, M., K. Pa¨a¨kko¨nen, ., A. Annila. 2003. On the origin of dynamics of a free and phosphopeptide-complexed Src homology 2 residual dipolar couplings from denatured proteins. J. Am. Chem. domain studied by 15N NMR relaxation. Biochemistry. 33:5984–6003. Soc. 125:15647–15650. 49. Herrmann, T., P. Gu¨ntert, and K. Wu¨thrich. 2002. Protein NMR struc- 70. Obolensky, O. I., K. Schlepckow, ., A. V. Solov’yov. 2007. Theoret- ture determination with automated NOE assignment using the new ical framework for NMR residual dipolar couplings in unfolded pro- software CANDID and the torsion angle dynamics algorithm DYANA. teins. J. Biomol. NMR. 39:1–16. J. Mol. Biol. 319:209–227. 71. Bernado´, P., L. Blanchard, ., M. Blackledge. 2005. A structural model 50. Gu¨ntert, P. 2004. Automated NMR structure calculation with CYANA. for unfolded proteins from residual dipolar couplings and small-angle Methods Mol. Biol. 278:353–378. x-ray scattering. Proc. Natl. Acad. Sci. USA. 102:17002–17007. 51. Ozenne, V., F. Bauer, ., M. Blackledge. 2012. Flexible-meccano: a tool for the generation of explicit ensemble descriptions of intrinsically 72. Zweckstetter, M., and A. Bax. 2000. Prediction of Sterically Induced disordered proteins and their associated experimental observables. Alignment in a Dilute Liquid Crystalline Phase: Aid to Protein Struc- Bioinformatics. 28:1463–1470. ture Determination by NMR. J. Am. Chem. Soc. 122:3791–3792. 52. Schmidt, J. M., M. Blu¨mel, .,H.Ru¨terjans. 1999. Self-consistent 3J 73. Jensen, M. R., P. R. Markwick, ., M. Blackledge. 2009. Quantitative coupling analysis for the joint calibration of Karplus coefficients and determination of the conformational properties of partially folded evaluation of torsion angles. J. Biomol. NMR. 14:1–12. and intrinsically disordered proteins using NMR dipolar couplings. Structure. 17:1169–1185. 53. Case, D. A., T. A. Darden, ., P. A. Kollman. 2010. AMBER 11. Uni- versity of California, San Francisco. 74. Daugherty, M. D., D. S. Booth, ., A. D. Frankel. 2010. HIV Rev 54. Hornak, V., R. Abel, ., C. Simmerling. 2006. Comparison of multiple response element (RRE) directs assembly of the Rev homooligomer Amber force fields and development of improved protein backbone into discrete asymmetric complexes. Proc. Natl. Acad. Sci. USA. parameters. Proteins. 65:712–725. 107:12481–12486. 55. Lindorff-Larsen, K., S. Piana, ., D. E. Shaw. 2010. Improved side- 75. Lifson, S., and A. Roig. 1961. On the theory of helix–coil transition in chain torsion potentials for the Amber ff99SB protein force field. polypeptides. J. Chem. Phys. 34:1963. Proteins. 78:1950–1958. 76. De Sancho, D., and R. B. Best. 2011. What is the time scale for a-helix 56. Darden, T. A., D. M. York, and L. G. Pedersen. 1993. Particle mesh nucleation? J. Am. Chem. Soc. 133:6809–6816. , Ewald: an N log(N) method for Ewald sums in large systems. . J. Chem. Phys. 98:10089. 77. Surendran, R., P. Herman, , J. Ching Lee. 2004. HIV Rev self-assem- bly is linked to a molten-globule to compact structural transition. 57. Ryckaert, J.-P., G. Ciccotti, and H. J. C. Berendsen. 1977. Numerical Biophys. Chem. 108:101–119. integration of the cartesian equations of motion of a system with constraints: molecular dynamics of n-alkanes. J. Comput. Phys. 78. Jain, C., and J. G. Belasco. 2001. Structural model for the cooperative 23:327–341. assembly of HIV-1 Rev multimers on the RRE as deduced from anal- ysis of assembly-defective mutants. Mol. Cell. 7:603–614. 58. Adelman, S. A., and J. D. Doll. 1976. Generalized Langevin equation approach for atom/solid-surface scattering: general formulation for 79. Tan, R., and A. D. Frankel. 1998. A novel glutamine-RNA interaction classical scattering off harmonic solids. J. Chem. Phys. 64:2375. identified by screening libraries in mammalian cells. Proc. Natl. Acad. 59. Jorgensen, W. L., J. Chandrasekhar, ., M. L. Klein. 1983. Comparison Sci. USA. 95:4247–4252. of simple potential functions for simulating liquid water. J. Chem. 80. Sugaya, M., N. Nishino, ., K. Harada. 2008. Amino acid requirement Phys. 79:926. for the high affinity binding of a selected arginine-rich peptide with the 60. Pettersen, E. F., T. D. Goddard, ., T. E. Ferrin. 2004. UCSF HIV Rev-response element RNA. J. Pept. Sci. 14:924–935. Chimera—a visualization system for exploratory research and analysis. 81. Uversky, V. N., J. R. Gillespie, and A. L. Fink. 2000. Why are ‘‘natively J. Comput. Chem. 25:1605–1612. unfolded’’ proteins unstructured under physiologic conditions? 61. Cerutti, D. S., R. Duke, ., T. P. Lybrand. 2008. Vulnerability in pop- Proteins. 41:415–427. ular molecular dynamics packages concerning Langevin and Andersen dynamics. J. Chem. Theory Comput. 4:1669–1680. 82. Uversky, V. N. 2002. What does it mean to be natively unfolded? Eur. J. Biochem. 269:2–12. 62. Pervushin, K., R. Riek, .,K.Wu¨thrich. 1997. Attenuated T2 relaxa- tion by mutual cancellation of dipole-dipole coupling and chemical 83. Oldfield, C. J., Y. Cheng, ., A. K. Dunker. 2005. Coupled folding and shift anisotropy indicates an avenue to NMR structures of very large binding with a-helix-forming molecular recognition elements. biological macromolecules in solution. Proc. Natl. Acad. Sci. USA. Biochemistry. 44:12454–12470. 94:12366–12371. 84. Dyson, H. J., and P. E. Wright. 2002. Coupling of folding and binding 63. Spera, S., and A. Bax. 1991. Empirical correlation between protein for unstructured proteins. Curr. Opin. Struct. Biol. 12:54–60. backbone conformation and C.a. and C.b. 13C nuclear magnetic reso- 85. Weiss, M. A., and N. Narayana. 1998. RNA recognition by arginine- nance chemical shifts. J. Am. Chem. Soc. 113:5490–5492. rich peptide motifs. Biopolymers. 48:167–180. 64. Smith, L. J., K. A. Bolin, ., C. M. Dobson. 1996. Analysis of main b chain torsion angles in proteins: prediction of NMR coupling constants 86. Palmeri, D., and M. H. Malim. 1999. Importin can mediate the for native and random coil conformations. J. Mol. Biol. 255:494–506. nuclear import of an arginine-rich nuclear localization signal in the absence of importin a. Mol. Cell. Biol. 19:1218–1225. 65. Battiste, J. L. 1996. Structure determination of an HIV-1 RRE RNA- Rev peptide complex by NMR spectroscopy. PhD thesis. Massachu- 87. Truant, R., and B. R. Cullen. 1999. The arginine-rich domains present setts Institute of Technology, Cambridge, MA. in human immunodeficiency virus type 1 Tat and Rev function as direct b 66. Abragam, A. 1961. The Principles of Nuclear Magnetism. Clarendon importin -dependent nuclear localization signals. Mol. Cell. Biol. Press, Oxford, UK. 19:1210–1217. 67. Sass, H. J., G. Musco, ., S. Grzesiek. 2000. Solution NMR of proteins 88. Henderson, B. R., and P. Percipalle. 1997. Interactions between HIV within polyacrylamide gels: diffusional properties and residual align- Rev and nuclear import and export factors: the Rev nuclear localisation ment by mechanical stress or embedding of oriented purple mem- signal mediates specific binding to human importin-b. J. Mol. Biol. branes. J. Biomol. NMR. 18:303–309. 274:693–707. 68. Ruckert, M., and G. Otting. 2000. Alignment of biological macromol- 89. Koradi, R., M. Billeter, and K. Wuthrich. 1996. MOLMOL: a program ecules in novel nonionic liquid crystalline media for NMR experi- for display and analysis of macromolecular structures. J. Mol. Graph. ments. J. Am. Chem. Soc. 122:7793–7797. 14:51–55, 29–32.

Biophysical Journal 105(4) 1004–1017