SCOTT BASKERVILLE AND ANDREW D. ELLINGTON RNA STRUCTURE Describing the elephant The three-dimensional structures of RNAs are notoriously difficult to determine. Functional comparisons of variant molecules and cross-linking experiments are providing new information for structural modeling.

The old fable of the blind men trying to describe the in vitro selection is the Rev-binding element (RBE) of shape of an elephant seems aptly to describe the present HIV-1. RBE is a 30-base sequence of the HIV-1 genome state of RNA structural analysis. Until recently, the crys- that can fold into a 'stem-internal-loop-stem' secondary tal structure of only one large RNA, tRNA, was known, structure and that interacts with the Rev , and and other non-crystallographic approaches to RNA regulates the splicing and transport of viral mRNAs. structure determination had not been particularly fruit- From a population of partially randomized sequences, ful. Low-resolution electron micrographs are at best able Bartel and Szostak [3] selected RBE variants that could to identify the global structure of the ribosome, but can- bind the Rev protein, whereas Giver et al. [4] selected not provide three-dimensional structural details. The binding variants from a population of completely random nuclear magnetic resonance (NMR) spectra of most sequences. These experiments revealed numerous se- RNAs are so complex that this method of structure quence co-variations that were used by Leclerc et al. [5] determination is limited to molecules of molecular mass 10 000 daltons or less. But new methods now provide at least partial insights into RNA structures. Like the blind men, each method has its own perspective, but taken together they may describe something akin to reality. Here we compare the approaches that have been used to tackle three RNA molecules of different degrees of com- plexity - the Rev-binding element (RBE) of the human immunodeficiency virus HIV-1, the bacterial enzyme ribonuclease P and the hammerhead .

One theme of these methods is the use of numerous 'soft' distance constraints to delimit possible molecular models. For example, Cedergren and co-workers [1,2] found that phylogenetic data based on sequence compar- isons can be used to establish low-resolution RNA struc- tural hypotheses that could subsequently be evaluated and refined by simulations of the energetics of the structure. But although some RNA structures, such as the tRNA anticodon loop, can be thoroughly evaluated by a struc- ture prediction program such as MC-SYM [1,2] because there is a wealth of available phylogenetic data for such RNAs, other natural RNAs are not so amenable.

To circumvent this deficiency, in vitro selection of func- tional RNA molecules can be used to establish an artifi- cial, as opposed to a natural, phylogeny of RNA sequences. In this approach, molecules with partially or completely randomized sequences are repeti- tively selected for function, and the sequences of the sur- vivors analyzed. Some nucleotide positions will seem to be 'conserved' and, hence, can be inferred to be impor- tant for function; some positions will vary randomly and are probably 'neutral' to function; finally, some positions Fig. 1. Structure of the Rev-binding element (RBE) of HIV-1. (a) will co-vary non-randomly, just as in a natural phylogeny, Sequence and secondary structure of wild-type RBE. Dotted lines and thus may reveal structural interactions. indicate non-canonical pairings. (b) The three-dimensional model of this sequence. The green and red residues are the sites of two different sets of genetic co-variations that aided in deter- Because of a dearth of natural phylogenetic variation, the mining the structure; phosphates are in blue. Two representations RNA that has probably been most extensively studied by are shown, rotated by 900 relative to one another.

120 © Current Biology 1995, Vol 5 No 2 DISPATCH 121

Fig. 2. The catalytic core of RNAse P docked with its substrate tRNA. (a) Predicted secondary structure of Escherichia coli RNase P. (b) Three-dimensional model of this sequence. The tRNA substrate is shown by a grey line; the scissile phosphate bond is a yellow ball. Each region of the molecule is represented by a particular color in both the secondary and tertiary structural models. (Graphics courtesy Mike Harris and Norman Pace.) to drive the MC-SYM modeling program and to con- one another, the photoactivatable cross-linking agent struct theoretical structures for the RBE. Twelve RBE azidophenacyl was attached to the 5' end of circularly models initially derived from the in vitro genetic data con- permuted RNase P RNAs. Irradiation resulted in the verged on a single three-dimensional structure following formation of covalently closed lariat structures in which energy minimization (Fig. 1). Interestingly, when the the cross-link site could be mapped by primer-extension solution structure of the RBE was subsequently revealed analysis [9]. Multiple intramolecular cross-links from by NMR analysis [6], it was found to confirm most of multiple circularly permuted RNAs were then used to the major features predicted by the computer model. establish a scaffold of distance constraints that, when satisfied, restricted possible three-dimensional structural In contrast to the RBE, which is no more than a com- models. The more cross-links that were formed and plex secondary structural element, ribonuclease P RNA evaluated, the fewer the structural models that were is part of a large ribonucleoprotein complex that cleaves allowed. Eventually, a family of closely related structures the 5' ends of precursor tRNA molecules. At high con- was obtained that was most consistent with the available centrations of monovalent ions, RNase P RNA can act data (Fig. 2) [10]. Although this is clearly an extremely on its own to catalyze strand scission and, hence, prob- low-resolution structure (standard deviations are of the ably provides much of the structure-forming potential for order of 10 A) it is nonetheless extremely exciting, as the the enzyme complex as a whole. The size of the RNA method can be quickly and efficiently applied to virtually component (around 400 bases), however, makes its struc- any large RNA. tural determination a daunting task. To simplify the problem, phylogenetic co-variation analysis was first used Finally, there has been great interest in establishing the to establish a solid secondary structure [7]. But although three-dimensional structure of the hammerhead ribo- phylogenetic analysis had been able previously to identify zyme. Although this RNA can hydrolyze a single specific tertiary structural contacts and to model the three- phosphodiester bond just like the behemoth RNase P, dimensional structure of another large RNA, the group I unlike RNase P it is extremely small (minimal versions self-splicing intron [8], in this case it proved insufficient of the catalytic core are less than 20 bases long). It is to launch RNase P into the third dimension. therefore amenable to the same type of constrained model building that has proved successful for the RBE. The necessary additional information was provided by The secondary structure of the hammerhead is a simple cross-linking experiments. To determine which second- three-helix junction, so determining its tertiary structure ary structural elements should be positioned adjacent to will to some extent consist of determining the distances 122 Current Biology 1995, Vol 5 No 2

Fig. 3. Hammerhead ribozyme structures. (a) Structure based on FRET. Stems I, II and III are all indicated; the site of the single deoxy- ribose moiety is indicated in green. An arrow marks the position where the strands would normally be cleaved. The green sphere at the 5' terminus of stem I and the red sphere at the 3' terminus of stem II represent the fluorescent dyes used in the FRET study. The transfer of energy between these fluorophores is strongly dependent on' distance. (Graphic courtesy Eric Westhof.) (b) Crystal structure. Stems I, II and III are again indicated; the DNA strand is indicated in green. The position where the strands would normally be cleaved is indi- cated. The FRET structure and the crystal structure are rotated in these representations by roughly 180° relative to one another about a vertical axis. (Graphic courtesy David McKay.)

and angles between three relatively rigid rods. Unfortu- molecules containing junctions formed by long rigid nately, there is little phylogenetic variation in the ham- helices are aligned by a short electrical pulse, and the merhead core structure, and even selection in vitro has decay of their alignment is then measured by examining failed to produce significant additional variation [11]. the polarization of light passing through the solution as a Instead, two quite different physical techniques have function of time. As with FRET, the rate of decay is been used to orient the helices. highly dependent on the orientation of the helices. To carry out the birefringence experiments, Amiri and First, Tuschl et al. [12] synthesized a series of RNAs Hagerman [13] first generated a series of hammerhead derivatized at specific positions with two fluorophores, in which two of the three arms of the hammer- rhodamine and fluorescein. Because the spectra of these head were extended by 70 base pairs. Decay measure- dyes partially overlap, excitation of fluorescein results in ments for each of the the pair-wise extended structures the transfer of absorbed energy to rhodamine, causing it indicated that the hammerhead helices are co-planar and to fluoresce. The degree of energy transfer, and thus the almost equidistant from one another. strength of rhodamine fluorescence, is highly dependent on the separation of the two dyes. Thus, by moving the To determine which of these studies gives a more 'donor' and 'acceptor' fluorophores between the various complete version of the elephant, it is helpful to know RNA helices and measuring the resultant fluorescence what a sighted man would say, even a nearsighted man. resonance energy transfer (FRET), Tuschl et al. [12] were We can find out by comparing the FRET and birefrin- able to triangulate the positions of the hammerhead arms gence models with the newly published crystal structure relative to one another. The helix orientations were used of the hammerhead ribozyme (Fig. 3b) [14]. The elec- as a starting point for molecular modeling, and the over- tron density maps reveal stems I and II to be almost all structure of the hammerhead was predicted to be a Y- adjacent to one another, with stem III forming the shaped junction in which stems I and II are adjacent to base of a Y, much as in the FRET structure, but dis- one another in the branch of the Y, and stem III forms similar from the TEB structure (although the helices do the base of the Y (Fig. 3a). seem to be roughly co-planar). In addition, of course, the crystal structure reveals new and surprising details of In a second study, Amiri and Hagerman [13] inferred how the hammerhead is constructed. The joining angles between the the helices by measuring the rotational regions between stems I and II, and stems II and III, re- diffusion of the hammerhead in solution using transient spectively, extend stem III with three additional base electrical birefringence (TEB). In this technique, RNA pairings. Two of these are non-canonical A:G pairs, DISPATCH 123 whereas the remaining formally single-stranded resi- 2. Gautheret D, Major F,Cedergren F: Modeling the three-dimensional structure of RNA using discrete nucleotide conformational sets. dues, a CUGA sequence, form a 'uridine turn' similar I Mol Biol 1993, 229:1049-1064. to that found in the anticodon and pseudouridine loops 3. Bartel DP, Zapp ML, Green MR, Szostak JW: HIV-1 Rev regulation of tRNA. involves recognition of non-Watson-Crick base pairs in viral RNA. Cell 1991, 67:529-536. 4. Giver L, Bartel D, Zapp M, Pawul A, Green M, Ellington AD: Selec- Although the crystal structure of the hammerhead repre- tive optimization of the Rev-binding element of HIV-1. Nuc Acids sents a great step forward in understanding both RNA Res 1993, 21:5509-5516. 5. Leclerc F, Cedergren R, Ellington AD: A three-dimensional model of structure and catalysis, there are probably interesting the Rev-binding element of HIV-1 derived from analyses of things still to find for researchers operating at any level of aptamers. Nature Struct Biol 1994, 1:293-300. resolution. One caveat that should be considered is that 6. Battiste JL, Tan R, Frankel AD, Williamson JR: Binding of an HIV Rev peptide to Rev responsive element RNA induces for- only the TEB experiments examined a catalytically active mation of purine-purine base pairs. Biochemistry 1994, 33: species. The hammerhead crystal structure was generated 2741-2747. 7. Pace NR, Brown JW: Evolutionary perspective on the structure and from a molecule in which DNA (green in Fig. 3b) function of ribonuclease P, a ribozyme. J Bacteriol, in press. spanned one strand of stems I and III, including the 8. Michel F, Westhof E: Modelling of the three-dimensional architec- cleavage site, whereas the FRET model was built from an ture of the group I catalytic introns based on comparative sequence analysis. J Mol Biol 1990, 216:585-610. RNA that contained a single deoxycytidine residue at the 9. Burgin AB, Pace NR: Mapping the active site of ribonuclease P RNA cleavage site. Although hammerhead ribozymes can be using a substrate containing a photoaffinity agent. EMBO J 1990, extensively substituted with DNA and still remain func- 9:4111-4118. 10. Harris ME, Nolan JM, Malhotra A, Brown JW, Harvey SC, Pace NR: tional, a 2' hydroxyl at the cleavage site is essential for Use of photoaffinity crosslinking and molecular modeling to catalysis (and perhaps for an active conformation as well). analyze the global architecture of ribonuclease P RNA. EMBO In addition, the only metal ion that was precisely located 1994, 13:3953-3963. 11. Nakamaye KL, Eckstein F: AUA-cleaving hammerhead ribozymes: in the hammerhead, between the G:A pairings in the attempted selection for improved cleavage. Biochemistry 1993, 33: extended stem II, was distal from the site of catalysis, 1271-1277. even though the hammerhead is known to be a metallo- 12. Tuschl T, Gohlke C, Jovin TM, Westhof E, Eckstein F: A three- dimensional model for the hammerhead ribozyme based on fluor- enzyme. Regardless of these details, though, these exper- escence measurements. Science 1994, 266:785-789. iments all provide new glances at RNA structure and 13. Amiri KMA, Hagerman PJ:Global conformation of a self-cleaving function. Taken together, they reveal that the best vision hammerhead RNA. Biochemistry 1994, 33:13172-13177. 14. Pley HW, Flaherty KM, McKay DB: Three-dimensional structure of any researcher can hope to have is insight. a hammerhead ribozyme. Nature 1994, 372:68-74.

References Scott Baskerville and Andrew D. Ellington, Department 1. Major F,Turcotte M, Gautheret D, Lapalme G, Fillion E, Cedergren R: The combination of symbolic and numerical computation for three- of Chemistry, Indiana University, Bloomington, Indiana dimensional modeling of RNA. Science 1991, 253:1255-1260. 47405, USA.

THE JUNE 1995 ISSUE (VOL. 5, NO. 3) OF CURRENT OPINION IN

will include the following reviews, edited by Eric Westhof and Dinshaw Patel, on Nucleic Acids:

New crystal structures of nucleic acids (DNA and RNA) and their complexes by M. Sundaralingam Telomere structure and function by D. Rhodes Nucleic acid electrostatics by B. Honig Nucleic acid thermodynamics by K. Breslauer Backbone modifications in oligonucleotides and protein-nucleic acid systems by A. De Mesmaeker Folding of RNA by A. Pyle Catalysis by small RNAs by F Eckstein Nucleic acid solution structures, hydration and opening byJ. Leroy