Theoretical Principles of In Vitro Selection Using Combinatorial Nucleic Acid Libraries
UNIT 9.1

Contributed by Barry Vant-Hull, Larry Gold, and Dominic A. Zichi
Combinatorial Methods in Nucleic Acid Chemistry
Current Protocols in Nucleic Acid Chemistry (2000) 9.1.1-9.1.16
Copyright © 2000 by John Wiley & Sons, Inc.

Over the past decade, a new paradigm for drug discovery (Gold, 1995) and biological research (Gold et al., 1995) has been developed from technologies that integrate combinatorial chemistry with rounds of selection and amplification, a technique that is called in vitro selection. Systematic Evolution of Ligands by EXponential enrichment, or SELEX (Ellington and Szostak, 1990; Tuerk and Gold, 1990), is a flexible and extremely successful form of this technology that uses combinatorial libraries of oligonucleotides containing regions of randomized sequence as potential ligands. Oligonucleotide libraries (containing randomized regions) provide, after selection, compounds that bind tightly to the intended target. The process of in vitro selection was called SELEX by Tuerk and Gold (1990), while the selected compounds were called aptamers by Ellington and Szostak (1990). SELEX and in vitro selection (from oligonucleotide libraries) are identical. The selected and amplified binding site (SAAB) technology (Blackwell and Weintraub, 1990) is a specialized form of SELEX directed toward finding naturally occurring sequences that bind proteins in vivo; however, the number of unique sequences used for SAAB analysis is usually much smaller than that used in most SELEX experiments, since the size of the binding area is usually well defined and thus the number of mutagenized nucleotides is small.

SELEX and other adaptive molecular evolution techniques, such as phage display (Cwirla et al., 1990; Scott and Smith, 1990; Kay, 1994; Winter et al., 1994), gain much of their power from their ability to isolate individual molecules from vast molecular pools without resorting to the cumbersome deconvolution or tagging methods commonly used in combinatorial chemistry schemes. Rather, these methods utilize iterative rounds consisting of ligand selection from combinatorial libraries followed by amplification of the selected ligands to form new libraries enriched for the particular function of interest, e.g., affinity binding or catalytic function. Such techniques enable quite rapid searches of enormous libraries (typically greater than 10^15 potential ligands in the case of SELEX). SELEX has been used to discover high-affinity ligands to a wide variety of different molecular targets, including nucleic acid–binding proteins, non–nucleic acid–binding proteins, peptides, and small organic molecules (reviewed in Klug and Famulok, 1994; Gold, 1995; Gold et al., 1995). This unit presents a theoretical overview of in vitro affinity selection using SELEX technology.

A schematic representation of the SELEX process is shown in Figure 9.1.1 and may be used to describe SELEX performed with libraries of RNA, RNA derivatives, or DNA. For the purposes of developing a mathematical model, the SELEX process for affinity binding may be summarized in four steps: (1) generation of a library of potential ligands, (2) binding of the library to the target molecule, (3) partitioning of the bound ligands from the unbound ligands, and (4) amplification of the partitioned ligands to generate a new, enriched library, leading again to step 1. Repeated application of steps 2 to 4 results in an enriched pool composed of the sequences of interest. For selection of single-stranded DNA (UNIT 9.2), the two strands of the PCR-amplified pool of dsDNA must be denatured, and one of the strands isolated before binding with the target. For selection of RNA and RNA derivatives (UNIT 9.3), the PCR-amplified pool of dsDNA must be transcribed to form a pool of RNA before binding with the target. The partitioned RNA must then be reverse-transcribed into DNA before PCR amplification. For the present analysis, these enzymatic transformations, namely reverse transcription (RT), PCR, and transcription, are all assumed to be perfect, meaning that they do not affect the relative concentrations of the ligands. However, see Mathieu-Daudé et al. (1996) regarding imperfect amplification due to concentration differences, and Sun et al. (1996) for mathematical modeling of the amplification process taking stochastic effects into account.

Figure 9.1.1 Schematic representation of the SELEX in vitro selection methodology. The initial random pool is derived from synthesized DNA oligonucleotides that are used directly for DNA SELEX or converted to double-stranded templates for transcription for RNA SELEX. Once the initial pool is created, the steps for a round of affinity binding SELEX are presented in the squares: (1) pool generation, (2) incubation with target, (3) partitioning, and (4) enzymatic amplification.

SELEX is a very forgiving technology. High-affinity ligands to nearly any desired target may be found even when the selection conditions (protein and RNA concentrations, for example) are far from optimal. However, great savings in time and material, or perhaps even success with difficult targets, may be achieved by working at the optimal conditions. Determining what these conditions are demands a deeper understanding of the mechanisms of SELEX. We present such a theoretical model here. We first describe the characteristics of a ligand library, composed of oligonucleotide sequences, that are relevant to the model. We then describe the equilibrium selection model central to SELEX, incorporating these library characteristics. A demonstration of the model applied to experimental data is then presented. Analytical expressions for the optimal nucleic acid and protein concentrations are derived, these being two parameters easily varied during SELEX experiments. However, the formulas for optimal concentrations unfortunately depend on parameters that cannot easily be determined experimentally. We therefore introduce a new parameter, the signal-to-noise ratio, which allows the determination of near-optimal conditions based only on parameters that are easily determined experimentally.

NUCLEIC ACID LIBRARIES

In vitro selection is performed with nucleic acid libraries containing vast numbers of unique molecules, typically ∼10^15 sequences. Such large libraries are desirable in order to saturate the sequence space of longer randomized regions. This is a useful goal, as SELEX is often directed toward non–nucleic acid–binding proteins that are unlikely to have sites with high affinity and specificity to short sequence regions. Even with known nucleic acid–binding proteins, longer randomized regions may provide a larger contact surface, often making it possible to find sequences that bind with higher affinity than the wild-type binding sequences. The only practical limitation on library size is imposed by the volumes of material manipulated experimentally; 10^15 random sequences are easily synthesized and readily processed. Each sequence in the library is composed of a random region of variable length sandwiched between two regions of fixed sequence used for primer binding sites during enzymatic processing. The length of the random region varies considerably among selection experiments. For affinity binding, most studies use between 20 and 60 nucleotides (Gold et al., 1995), while researchers performing catalytic selections typically use much larger random regions, the largest comprising >200 nucleotides (Hager et al., 1996; Breaker, 1997). The motivation for the difference in sequence length is that binding interactions with proteins and small molecules may require smaller molecular arrangements than those needed to carry out enzymatic activity. It is commonly believed that typical catalytic oligonucleotides have multiple secondary structural domains that may be required for activity, but this hypothesis still awaits rigorous proof.

A basic tenet of in vitro selection experiments is that the selected function of oligonucleotide molecules is conferred through their three-dimensional structures. These structures, usually supported by stacking interactions between adjacent base pairs, are a consequence of the individual sequences. The identification of conserved primary structural units (residues) and secondary structural units (e.g., helices and loops) from those sequences sharing a selected function allows one to define a motif required for the function. Once a motif is defined, it is easy to compute its frequency of occurrence in the initial pool. For example, SELEX-isolated sequences that bind with high affinity to the E. coli rho factor, displayed in Figure

tion, the actual frequency of high-affinity ligands is likely to be different from that calculated from the consensus motif, since this motif is usually defined from a sampling of relatively few sequences and is thus underdetermined. We show below that the frequency of selected motifs within the initial pool plays a central role in the progress of SELEX experiments.

For a continuous motif of length m, increasing the random region by n bases results in an n-fold increase in the representation of that motif in the random sequence library. However, calculating the frequency of occurrence of a particular motif within the original sequence pool is certainly an upper limit on the number of