"Construction of Recombinant DNA Libraries". In: Current Protocols In
Total Page:16
File Type:pdf, Size:1020Kb
CHAPTER 5 Construction of Recombinant DNA Libraries INTRODUCTION onstruction of recombinant DNA molecules by simply ligating vector DNA and a Cfragment of interest is a straightforward process, discussed in Chapters 1 and 3. Special problems arise, however, when the fragment of interest represents only a very small fraction of the total target DNA. This is the case in two commonly encountered situations: isolation of single copy genes from a complex genome and isolation of rare cDNA clones derived from a complex mRNA population. This chapter describes tech- niques to generate recombinant DNA libraries which contain complete representation of genomic or cDNA sequences. The major differences between genomic and cDNA libraries are discussed in Section I (Overview of Recombinant DNA libraries). Insert DNA molecules are a necessary prerequisite for the construction of any DNA library and several approaches for producing insert DNA are discussed in Section II (Preparation of Insert DNA from Genomic DNA) and Section III (Preparation of Insert DNA from Messenger RNA). Finally, approaches to producing complete libraries are described in Section IV (Production of Genomic DNA and cDNA Libraries). The production of large libraries is a complicated process and all of the necessary steps are described here. However, a number of library-making kits are now commercially available which make this process much more feasible. These kits have the additional advantage that they utilize different cloning vectors that include proprietary features, which facilitate the use of library clones. The central issue in library production is best presented in the context of considering the production of a genomic or cDNA library from a mammalian cell. The DNA of higher organisms is remarkably complex: a mammalian haploid genome contains approximately 3 × 109 base pairs. A particular 3000-bp fragment of interest thus comprises only 1 part in 106 of a preparation of genomic DNA. Similarly, a particularly rare mRNA species may comprise only 1 part in 105 or 106 of total poly(A) containing RNA, a ratio that is usually unaffected by the process of copying the RNA into cDNA. Clearly, the main problem in generating a useful recombinant DNA library from either genomic DNA or cDNA is the creation of the huge population of clones necessary to ensure that the library contains at least one version of every sequence of interest. The solutions to this problem are basically similar for genomic and cDNA libraries. As diagrammed in Fig. 5.0.1, the genomic DNA or cDNA are first prepared for insertion into the chosen vector. The vector and target DNA are then ligated together and introduced into E. coli by either packaging into phage λ heads in vitro or by direct transformation. In some aspects, however, strategies for isolation of individual genomic or cDNA clones can be quite different. The particular problems of creating these two different types of libraries will be discussed in detail separately (UNITS 5.1 & 5.2). Over the past several years, techniques for producing improved cDNA libraries have been optimized. These techniques will be described in the near future in an update to UNIT 5.8A. Most cDNA library screening procedures involve positive identification of cDNA clones with either antibodies or hybridizing nucleic acid probes. Subtracted cDNA libraries provide a method for identifying mRNAs (as cDNAs) that are expressed in one cell but not another. cDNAs are synthesized from the cell expressing the desired mRNA and all of the sequences expressed in a cell not expressing the RNA are removed by hybridization Construction of and selection. The remaining sequences are cloned into a bacteriophage or plasmid vector Recombinant DNA Libraries Contributed by J.G. Seidman 5.0.1 Current Protocols in Molecular Biology (2003) 5.0.1-5.0.3 Copyright © 2004 by John Wiley & Sons, Inc. Supplement 68 TISSUE VECTORS (choose a vector) mRNA DNA cDNA partially or completely digested DNA methylated, double-stranded size fractionation linkered DNA clonable DNA ligate DNA and vector introduce ligated vector/target DNA into E. coli (in vitro packaging or transformation) titer and characterize library screen for amplify for desired clone long-term storage Figure 5.0.1 Steps involved in the construction of cDNA or genomic DNA libraries. to produce a subtracted cDNA library. Estimation of the number of clones that must be screened is difficult, as it will vary with the cell type and the gene to be identified. Some cDNA libraries such as SAGE libraries and subtracted libraries do not capture full-length or near-full-length cDNAs. Construction of these libraries is described in Chapter 25, where they are used for assessing RNA expression. The E. coli vectors described in this chapter are limited with regard to the size of insert DNA that can be accommodated (∼20 kb for lambda and ∼40 kb for cosmid vectors). The ability to clone much larger fragments of DNA, however, has become an essential requirement for many genome analysis projects. Yeast artificial chromosome (YAC) vectors, maintained in yeast hosts, typically carry inserts ranging from 0.3 to 1.2 Mb of genomic DNA. Both the size and complexity of YAC libraries pose special considerations for production, screening, and analysis, and these concerns are addressed in UNITS 6.9 & 6.10 in the following chapter. Two important general points pertain to both genomic and cDNA libraries. First, it is essential that both the vector DNA and target DNA used to create the library are not contaminated by exogenous sequences detectable by the probes that will be used to isolate the clones of interest. There are obvious, potentially disastrous effects of contaminated target DNA—for example, by only 1 part in 105 of a plasmid containing the cDNA sequences to be used as a probe. See “Going for the gene,” The Boston Globe Magazine, Introduction 5.0.2 Supplement 68 Current Protocols in Molecular Biology Aug. 2, 1987, for an account of such a mistake. Common sense dictates care and use of absolutely clean and, where possible, disposable materials throughout. Second, libraries can be stored indefinitely. Over the past 5 to 10 years, large numbers of genomic libraries have been made from a variety of organisms and an even larger number of cDNA libraries have been produced from an astonishing number of tissues and cell lines. Many useful libraries, including examples of human and other mammalian genomic or cDNA libraries, have been made over the years. In some cases, journals (e.g., Cell, Science, Proceedings of the National Academy of Sciences U.S.A., and the publications of the American Society for Microbiology) require that libraries and individual clones discussed in their pages be freely available to other investigators. Both stock and custom-made libraries are also available from a variety of commercial sources. The ability to amplify these libraries without significantly altering the distribution of clones within the library allows investigators to utilize a library many times. Protocols for amplifying libraries are described in UNITS 5.10 & 5.11. J.G. Seidman Construction of Recombinant DNA Libraries 5.0.3 Current Protocols in Molecular Biology Supplement 64 OVERVIEW OF RECOMBINANT SECTION I DNA LIBRARIES The units in this section present strategies for generating genomic DNA and cDNA libraries (UNITS 5.1 & 5.2, respectively). Production of recombinant DNA libraries can be a very laborious procedure. We strongly recommend the purchase of genomic or cDNA libraries when possible; otherwise, we recommend purchasing reagent kits for producing libraries. These kits can save considerable time and effort. Genomic DNA Libraries UNIT 5.1 Genomic DNA libraries are almost always This equation can be used to define a useful screened by hybridization using a radioactive rule-of-thumb by calculating the probability of nucleic acid probe. Since this approach is es- isolating a fragment of interest as a function of sentially independent of a particular vector or (I/G). In general, to have a 99% chance of type of target DNA, the main problem faced isolating a desired sequence, the number of when considering creation of a genomic DNA clones screened should be such that the total library is simply generating a large enough number of base pairs present in the clones number of recombinant DNA clones. The basic screened (I × N) represents a 4.6-fold excess strategies used to address this problem have over the total number of base pairs in the included both minimizing the number of clones genome (G) (Seed et al., 1982). necessary by incorporating large fragments of When the desired fragment can be purified, genomic DNA, and maximizing cloning effi- the size of the library can be reduced. The ciency by using vectors based on bacteriophage library size can then be estimated by λ. This unit will discuss the appropriate numeri- ~ cal considerations for both ordinary genomic N = 3 × 1/p DNA libraries and subgenomic DNA libraries, and will then describe a limited number of where p = the probability of isolating a particu- appropriate vectors. lar fragment = 1/total number of fragments in the pool. REPRESENTATION AND It is important to note that this simple analy- RANDOMNESS sis assumes that the cloned DNA segments The size of a library of completely random randomly represent the sequences present in the fragments of genomic DNA necessary to en- genome. This assumption is true only if the sure representation of a particular sequence of target DNA was cleaved completely randomly interest is dictated by the size of the cloned prior to insertion into the vector. In the strictest fragments and the size of the genome. The sense, this level of randomness can be ap- likelihood that a sequence of interest is present proached only by the relatively inconvenient in such a random library can be estimated by means of shearing the target DNA.