US 2014O1961.76A1 (19) United States (12) Patent Application Publication (10) Pub. No.: US 2014/0196176 A1 Heintz et al. (43) Pub. Date: Jul. 10, 2014

(54) METHOD FOR ISOLATING CELL-TYPE (52) U.S. Cl. SPECIFICMRNAS CPC ...... CI2N 15/1041 (2013.01); C12N 15/1006 (2013.01); C12O 1/6827 (2013.01): CI2N (71) Applicants:Nathaniel Heintz, Pelham Manor, NY 15/82 (2013.01); C12N 15/8222 (2013.01) (US); Tito A. Serafini, San Mateo, CA USPC ...... 800/287: 530/358; 435/419,536/25.4; (US); Andrew W. Shyjan, San Carlos, 435/6.11:506/9; 435/6. 12:435/320.1; 800/298 CA (US) (72) Inventors: Nathaniel Heintz, Pelham Manor, NY (57) ABSTRACT (US); Tito A. Serafini, San Mateo, CA (US); Andrew W. Shyjan, San Carlos, The 1nVent1On provides methods for isolating cell-type spe CA (US) cific mRNAs by selectively isolating ribosomes or that bind mRNA in a cell type specific manner, and, thereby, (21) Appl. No.: 13/930,864 the mRNA hound to the ribosomes or proteins that bind mRNA. Ribosomes, which are riboprotein complexes, bind (22) Filed: Jun. 28, 2013 mRNA that is being actively translated in cells. According to the methods of the invention, cells are engineered to express Related U.S. Application Data a molecularly tagged ribosomal or protein that binds (63) Continuation of application No. 13/104.316, filed on mRNA by introducing into the cell a nucleic acid comprising May 10, 2011, now Pat. No. 8,513,485, which is a a nucleotide sequence encoding a ribosomal protein or pro continuation of application No. 10/494.248, filed on tein that binds mRNA fused to a nucleotide sequence encod Aug. 16, 2004, now Pat. No. 7,985,553. ing a peptide tag. The tagged ribosome or mRNA binding protein can then be isolated, along with the mRNA bound to Publication Classification the tagged ribosome or mRNA binding protein, and the mRNA isolated and further used for expression analysis. (51) Int. Cl. The methods of the invention facilitate the analysis and quan CI2N IS/10 (2006.01) tification of gene expression in the selected cell type present CI2N 15/82 (2006.01) within a heterogeneous cell mixture, without the need to CI2O I/68 (2006.01) isolate the cells of that cell type as a preliminary step. Patent Application Publication Jul. 10, 2014 Sheet 1 of 2 US 2014/O1961.76 A1

Figure 1

1 2 3 4 5 Patent Application Publication Jul. 10, 2014 Sheet 2 of 2 US 2014/O1961.76 A1

Figure 2

?iaeae US 2014/O 1961.76 A1 Jul. 10, 2014

METHOD FOR ISOLATING CELL-TYPE or other preparation containing the tagged protein that binds SPECIFICMRNAS mRNA being analyzed). In a preferred embodiment, the poly Some preparation is a membrane-associated polysome prepa CROSS-REFERENCE ration. Specifically, the peptide tag may be an epitope that is recognized by an antibody that does not specifically bind any 0001. This application is a continuation of application Ser. epitope expressed in a cell or ribosome/polysome fraction No. 13/104,316, filed May 10, 2011, which is a continuation from an unengineered cell. As defined herein, specific bind of application Ser. No. 10/494.248, filed on Aug. 16, 2004, ing is not competed away by addition of non-specific pro now U.S. Pat. No. 7,985,553, which is a national stage of teins, e.g., bovine serum albumen (BSA). The tagged riboso Application No. PCT/US02/34645, filed on Oct. 29, 2002, mal protein or mRNA binding protein is then expressed which claims priority to provisional Application No. 60/340, selectively in a cell population of interest (for example, by 689, filed on Oct. 29, 2001, each of which is incorporated by operably linking the nucleotide sequence encoding the tagged reference in its entirety. ribosomal protein or mRNA binding protein to a cell-type 1. TECHNICAL FIELD specific promoter and/or other transcriptional element). In a preferred embodiment, the tagged ribosomal protein or 0002 The present invention relates to methods for isolat mRNA binding protein is overexpressed. ing cell-type specific mRNAS by isolating ribosomes in a 0005 Monosomes or polysomes (which are, respectively, cell-type specific manner. According to the methods of the single or multiple ribosomes in a complex with a single invention, ribosomes or proteins that bind mRNA of the mRNA) or other mRNA-containing complex are isolated selected cell type are molecularly tagged and isolated, and the selectively from the cell population of interest through the use mRNA bound to the ribosomes or proteins that bind mRNA is of the tagged ribosomal protein subunit or other mRNA bind then isolated and analyzed. The methods of the invention ing protein. As used herein, isolated means that the ribosomes facilitate the analysis and quantification of gene expression in are separated from other cell components, specifically that the the selected cell type present within a heterogeneous cell ribosomes are substantially free of untagged ribosomes and mixture, without the need to isolate the cells of that cell type of RNA (particularly mRNA) not bound by ribosomes or as a preliminary step. mRNA binding protein. In particular, the composition is 50%, 2. BACKGROUND OF THE INVENTION 60%, 70%, 80%, 90%, 95% or 99% tagged ribosome or mRNA binding protein and associated mRNA. The mRNA 0003. An important paradigm in the development of new species that are bound to the cell-type specific ribosomes or diagnostics and therapies for human diseases and disorders is mRNA binding protein are then isolated, and can Subse the characterization of the gene expression of defined cell quently be profiled and quantified, to analyze gene expression types. The cellular complexity of many tissues (such as the in the cell. In a specific embodiment, because nascent nervous system), however, poses a challenge for those seek polypeptides are attached to isolated monosomes and poly ing to characterize gene expression at this level. The enor Somes, the methods of the invention can also be used to isolate mous heterogeneity of a tissue Such as the nervous system newly synthesized polypeptides from a cell type of interest (thousands of neuronal cell types, with non-neuronal cells (e.g., for proteomic applications), for example, using antibod outnumbering neuronal cells by an order of magnitude) is a ies that specifically recognize an epitope on a specific barrier to the identification and analysis of gene transcripts polypeptide being synthesized by the cell. present in individual cell types. One way to overcome this 0006. In preferred embodiments, the invention provides barrier is to tag gene transcripts directly or indirectly, i.e., transformed organisms (including animals, plants, fungi and mRNA, present in a particular cell type, in Such a manner as bacteria), e.g., a transgenic animal Such as a transgenic to allow facile isolation of the gene transcripts without the mouse, that expresses one or more tagged ribosomal protein need to isolate the individual cells of that cell type as a (s) or mRNA binding protein(s) within a chosen cell type. The preliminary step. We describe such a technology here. invention also provides cultured cells that express one or more tagged ribosomal proteins or mRNA binding proteins. 3. SUMMARY OF THE INVENTION Cell-type specific expression is achieved by driving the 0004. The invention provides methods for isolating cell expression of the tagged ribosomal protein using the endog type specific mRNAs by selectively isolating ribosomes or enous promoter of a particular gene, wherein the expression proteins that bind mRNA in a cell type specific manner, and, of the gene is a defining characteristic of the chosen cell type thereby, the mRNA bound to the ribosomes or proteins that (i.e., the promoter causes gene expression specifically in the bind mRNA. Ribosomes, which are riboprotein complexes, chosen cell type). Thus, “cell-type' refers to a population of bind mRNA that is being actively translated in cells. Accord cells characterized by the expression of a particular gene. In a ing to the methods of the invention, cells are engineered to preferred embodiment, a collection of transgenic mice express a molecularly tagged ribosomal protein or protein expressing tagged ribosomal proteins within a set of chosen that binds mRNA by introducing into the cell a nucleic acid cell types is assembled. Additionally, since the level of comprising a nucleotide sequence encoding a ribosomal pro expression of the tagged ribosomal protein or mRNA binding tein or proteins that bind mRNA fused to a nucleotide protein within a cell may be important in the efficiency of the sequence encoding a peptide tag. The peptide tag can be any isolation procedure, in certain embodiments of the invention, non-ribosomal protein peptide or non-mRNA binding protein a binary system can be used, in which the endogenous pro peptide that is specifically bound by a reagent that either does moter drives expression of a protein that then activates a not recognize a component of the cell fraction from which the second expression construct. This second expression con tagged ribosomes or proteins that bind mRNA are to be iso structuses a strong promoter to drive expression of the tagged lated, for example, from a whole cell lysate or post-mitochon ribosomal protein or mRNA binding protein at higher levels drial fraction (or any other ribosome or polysome preparation than is possible using the endogenous promoter itself. US 2014/O 1961.76 A1 Jul. 10, 2014

0007. In specific embodiments, the invention provides does not recognize a component, other than the peptide tag, of molecularly tagged ribosomes, preferably bound to mRNA, the cell fraction from which the tagged ribosomes or mRNA that are bound to an affinity reagent for the molecular tag. In binding proteins are to be isolated, for example, from a whole more specific embodiments, the molecularly tagged ribo cell lysate or post-mitochondrial fraction (or any other ribo Somes are bound to an affinity reagent which is bound to a Some or polysome preparation or preparation containing Solid Support. In other particular embodiments, the invention mRNA binding protein bound to mRNA being analyzed). For provides molecularly tagged ribosomal proteins and mRNA example, the peptide tag may be an epitope that is recognized binding proteins of the invention (and the ribosomes, riboso by an antibody that does not specifically bind any epitope mal-mRNA complexes, and mRNA binding protein-mRNA expressed in a cell or ribosome/polysome fraction (or other complexes containing them); nucleic acids comprising nucle fraction) from an unengineered cell. As defined herein, spe otide sequences encoding a molecularly tagged ribosomal cific binding is not competed away by addition of non-spe protein or mRNA binding protein of the invention; vectors cific proteins, e.g., bovine serum albumen (BSA). and host cells comprising these nucleic acids and tagged 0013 The tagged ribosomal protein or mRNA binding proteins and ribosomes of the inventions. protein is then expressed selectively in a cell population of 0008. The methods of the invention are advantageous interest (for example, by operably linking the nucleotide because they permit the isolation of gene transcripts, or sequence encoding the tagged ribosomal or mRNA binding mRNA, present in a particular cell type, as defined by the protein to a cell-type specific promoter, enhancer and/or other common expression of a given gene, in Such a manner as to transcriptional element). The fused nucleotide sequences allow their facile isolation without the need to isolate the may be under the control of a transcriptional element (e.g., individual cells of that cell type as a preliminary step. promoter or enhancer) that activates transcription specifically 0009. Additionally, in specific embodiments, the methods in the cell type of choice (for example, transcriptional regu of the invention may be used to isolate other organelles or latory elements that control expression of the gene, the Subcellular structures by molecularly tagging proteins inte expression of which characterizes the cell type of choice, gral to those organelles or structures. In a particular embodi termed herein the “characterizing gene'). In a preferred ment, the methods of the invention are used to isolate cell embodiment, the tagged ribosomal or mRNA binding protein specific mRNAs for secreted, membrane bound and lysomal is overexpressed. Cell-specific polysomes (or other fraction proteins by isolating tagged membrane bound ribosomes. containing the tagged mRNA binding protein) containing the tag are purified, exploiting affinity of a purification reagent 4. DESCRIPTION OF THE FIGURES (e.g., an antibody or other biological compound that binds the tag) for the tag. The purification reagent can then be isolated 0010 FIG. 1. Polysomes from cells transfected with plas itself or be bound to another structure, e.g., a bead, that can be mids expressing tagged versions of ribosomal proteins S6 isolated from other components in the cell, and bound mRNA (lane2, in duplicate), L32 (lane 4, in duplicate), and L37 (lane is isolated from purified polysomes for Subsequent gene 5, in duplicate) contain proteins that are reactive to the anti expression analysis. streptag II antibodies. These proteins correspond to the pre dicted molecular weights of the S6 (34 kDa), L32 (52 kDa), 0014 5.1. Molecular Tagging of Ribosomes and mRNA and L37 (9 kDa) ribosomal proteins. The S6 and L37 proteins Binding Proteins appear to be more abundantly represented in the polysomal 0015 The invention provides methods for isolating cell fraction compared to the L32 protein. Tagged S20 (lane 3, in type specific mRNA using molecularly tagged ribosomal pro duplicate) does not appear to be present in the polysomal teins that become incorporated into the ribosomes of a par ticular cell type or molecularly tagged mRNA binding fraction. Polysomes from untransfected cells (lane 1, in dupli proteins that are expressed in a particular cell type of interest. cate) do not display any immunoreactive material. Specifically, ribosomes and mRNA binding proteins can be 0011 FIG. 2. Ribosomal RNA is present (arrow) in mate molecularly tagged by expressing in the cell type of interest a rial immunoprecipitated from tagged S6 (lane 2) transfec ribosomal fusion protein or mRNA binding protein fusion tants. Such RNA is also present at low levels in material from protein containing all or a portion of a ribosomal or mRNA tagged L37 transfectants (lane 3). Such RNA is not present in binding protein (preferably, the portion has the biological material from untransfected cells (lane 1). activity of the native ribosomal protein or mRNA binding protein, i.e., can function in an intact ribosome to carry out 5. DETAILED DESCRIPTION OF THE translation or binds mRNA) fused to (for example, through a INVENTION peptide bond) a protein or peptide tag that is not a ribosomal 0012. The invention provides methods for isolating cell protein or mRNA binding protein orportion thereof, or, pref type specific mRNAs by selectively isolating ribosomes, or erably, found in the organism in which the tagged protein is other proteins that bind mRNA, in a cell type specific manner, being expressed. Such expression can be carried out by intro and, thereby, the mRNA bound to the ribosomes or mRNA ducing into cells, or into an entire organism, a nucleic acid binding proteins. Ribosomes, which are riboprotein com encoding the molecularly tagged ribosomal protein or mRNA plexes, bind mRNA that is being actively translated in cells. binding protein, under the control of transcriptional regula According to the methods of the invention, preferably, cells tory elements that direct expression in the cell type of choice, are engineered to express a molecularly tagged ribosomal or putting the expression of the ribosomal or mRNA fusion protein or mRNA binding protein by introducing into the cell protein under the control of an endogenous promoter by a nucleic acid comprising a nucleotide sequence encoding a homologous recombination or in a bacterial artificial chro ribosomal protein or mRNA binding protein fused to a nucle mosome (“BAC). otide sequence encoding a peptide tag. The peptide tag can be 0016. The invention further provides methods for isolating any peptide that is not from a ribosomal protein or mRNA cell-type specific mRNA by tagging proteins that bind to binding protein and that is specifically boundby a reagent that mRNA, preferably actively translated mRNA. In a preferred US 2014/O 1961.76 A1 Jul. 10, 2014

embodiment, the protein that binds mRNA is not poly A ribosomal protein or mRNA binding protein, but are unlikely binding protein. In another embodiment, the protein that to inhibit or interfere with function of the tagged ribosomal binds mRNA is a CAP binding protein or a processing factor protein or mRNA binding protein. The tag may be of any that binds the 3' untranslated region of the mRNA. In certain length that permits binding to the corresponding binding other embodiments, the ribosome or mRNA binding protein reagent, but does not interfere with the tagged proteins bind is molecularly tagged by engineering the ribosome or mRNA ing to the mRNA. In a preferred embodiment, the tag is about binding protein to bind a small molecule, e.g., a peptide, that 8, 10, 12, 15, 18 or 20 amino acids, is less than 15, 20, 25, 30, is not significantly bound by the unengineered ribosome or 40 or 50 amino acids, but may be 100, 150, 200, 300, 400 or mRNA binding protein. 500 or more amino acids in length. The tag may be bound 0017. The nucleic acid encoding the ribosomal protein or specifically by a reagent that does not bind any component of other mRNA binding protein fused to the peptide tag can be (1) the cell of interest; or (2) a polysomal preparation of generated by routine genetic engineering methods in which a interest; or (3) whatever cellular fraction of interest is being nucleotide sequence encoding the amino acid sequence for contacted by the reagent that binds the tag. Molecular tags the peptide tag sequence is engineered in frame with the may include, by way of example, and not by limitation, pro nucleotide sequence encoding a ribosomal protein or mRNA tein Afragments; myc epitopes (Evan et al., Mol. Cell. Biol. binding protein. This can be accomplished by any method 5(12):3610-3616); Btag (Wang et al., 1996, Gene 169(1): known in the art, for example, via oligonucleotide-mediated 53-58; and polyhistidine tracts (Bomhorst et al., 2000, Puri site-directed mutagenesis or polymerase chain reaction and fication of proteins using polyhistidine affinity tags, Methods other routine protocols of molecular biology (see, e.g., Sam Enzymol 326:245-54). Other preferred tags include, but are brook et al., 2001, Molecular Cloning, A Laboratory Manual, not limited to: Third Edition, Cold Spring Harbor Laboratory Press, N.Y.: 0022 (1) a portion of the influenza virus hemagglutinin and Ausubel et al., 1989, Current Protocols in Molecular protein (Tyr-Pro-Tyr-Asp-Val-Pro-Asp-Tyr-Ala; SEQ ID Biology, Green Publishing Associates and Wiley Inter NO: 1). The reagent used for purification is a monoclonal science, N.Y., both of which are hereby incorporated by ref antibody recognizing the tagged protein (12CA5) (Wilson I erence in their entireties). In certain embodiments, the A, Niman H L. Houghten RA, Cherenson AR, Connolly M method of Walles-Granberg et al. (Biochim Biophys. Acta, L. Lerner RA. The structure of an antigenic determinant in a 2001, 1544(1-2): 378-385, which is incorporated herein by protein. Cell. 1984 July; 37(3):767-78). reference in its entirety) is used. 0023 (2) a portion of the human c-myc gene (Glu-Gln 0018. The nucleotide sequence encoding the peptide tag is Lys-Leu-Ile-Ser-Glu-Glu-Asp-Leu, SEQ ID NO: 2). The reagent used for purification is a monoclonal antibody recog preferably inserted in frame such that the tag is placed at the nizing the tagged protein (9E10) (Evan G. I. Lewis G. K. N- or C-terminus of the ribosomal protein, since these por Ramsay G, Bishop J. M. Isolation of monoclonal antibodies tions of proteins are often accessible to detection or purifica specific for human c-myc proto-oncogene product. Mol Cell tion reagents. The peptide tag, however, may be inserted into Biol. 1985 December; 5(12):3610-6). any portion of the ribosomal protein such that when the pro 0024 (3) a portion of the bluetongue virus VP7 protein tein is incorporated into an intact ribosome, the insertion of (Gln-Tyr-Pro-Ala-Leu-Thr: SEQ ID NO: 3). The reagent the tag does not prevent ribosomal function and the tag is used for purification is a monoclonal antibody recognizing accessible in the intact ribosome to the purification reagent to the tagged protein (D11 and/or F10) (Wang L. F.Yu. M. White be used in the isolation. If a mRNA binding protein is used, J. R. Eaton B. T. BTag: a novel six-residue epitope tag for the tag may be inserted into any portion of the protein Such Surveillance and purification of recombinant proteins. Gene. that the protein binds mRNA and the tag is accessible to the 1996 Feb. 22; 169(1):53-8) purification reagent. 0025 (4) a FLAG peptide (e.g., Asp-Tyr-Lys-Asp-Asp 0019 Encoded peptide tags can be any non-ribosomal Asp-Asp-Lys; SEQID NO: 4). The reagent used for purifi protein (or non-mRNA binding) peptide or protein (orportion cation are monoclonal antibodies recognizing the tagged pro thereof) that is not present and/or accessible in the cell of tein (e.g., M1 and/or M2) (Sigma) (Hopp et al., U.S. Pat. No. interest (or the cell fraction from which the tagged ribosomes 4.703,004, entitled “Synthesis of protein with an identifica or mRNA binding protein are to be affinity isolated) for which tion peptide' issued Oct. 27, 1987: Brizzard BL, Chubet RG, there exists an affinity reagent that recognizes the peptide and Vizard D. L. Immunoaffinity purification of FLAG epitope that is accessible to Solution (and thereby, the peptide tag) in tagged bacterial alkaline phosphatase using a novel mono the intact ribosomes or mRNA binding protein bound to clonal antibody and peptide elution. Biotechniques. 1994 mRNA. April; 16(4):730-5: Knappik A. Pluckthun A. An improved 0020 Molecular tagging with epitopes (“epitope tag affinity tag based on the FLAG peptide for the detection and ging”) is well known in the art (reviewed in Fritze C E. purification of recombinant antibody fragments. Biotech Anderson T. R. Epitope tagging: general method for tracking niques. 1994 October; 17(4):754-761) 0026 (5) a Strep-tag peptide (e.g., Ala-Trp-Arg-His-Pro recombinant proteins. Methods Enzymol. 2000; 327:3-16: Gln-Phe-Gly-Gly: SEQ ID NO: 5). In a preferred embodi Jarvik J. W. Telmer CA. Epitope tagging. Annu Rev Genet. ment, a strep-tag peptide is used. The reagent used for puri 1998: 32:601-18). An epitope tag can be any peptide protein fication is one of several optimized versions of streptavidin that is not normally present and/or accessible in the cell of that recognizes the tagged protein (IBA GmbH) (Skerra et al., interest (or other cells that will be contacted with the reagent U.S. Pat. No. 5,506,121, entitled Fusion peptides with bind that binds the tag) for which there exists an antibody that ing activity for streptavidin, issued Apr. 9, 1996; Skerra A. recognizes the protein, and that is accessible to Solution in the Schmidt T G. Applications of a peptide ligand for streptavi intact ribosomes or mRNA binding protein-mRNA com din: the Strep-tag. Biomol Eng. 1999 Dec. 31; 16(1-4):79-86: plexes. Skerra A, Schmidt TG. Use of the Strep-Tag and streptavidin 0021 Peptide tags can include those for which methods/ for detection and purification of recombinant proteins. Meth reagents exist that allow facile identification of the tagged ods Enzymol. 2000: 326:271-304). US 2014/O 1961.76 A1 Jul. 10, 2014

0027. Any ribosomal protein or mRNA binding protein gation rates. Native codons may be exchanged for codons of can be molecularly tagged for use in the methods of the highly expressed in the host cells. For instance, the invention, as described in this section, provided that when the nucleic acid molecule can be optimized for expression of the ribosomal protein is molecularly tagged and incorporated encoded protein in bacterial cells (e.g., E. coli), yeast (e.g., into a ribosome, the ribosome can bind mRNA and, prefer Pichia), insect cells (e.g., Drosophila), or mammaliancells or ably, translate the mRNA into protein, or, when the mRNA animals (e.g., human, sheep, bovine or mouse cells or ani binding protein is molecularly tagged, it can bind mRNA. In mals). addition, the tag of the tagged ribosomal protein or mRNA 0031 Restriction enzyme sites critical for gene synthesis binding protein must be accessible to the purification reagent, and DNA manipulation can be preserved or destroyed to so that the reagent can be used to purify the intact ribosomes facilitate nucleic acid and vector construction and expression or mRNA binding protein-mRNA complexes. Preferably, the of the encoded protein. In constructing the synthetic nucleic ribosomal protein or mRNA binding protein to be tagged is acids of the invention, it may be desirable to avoid sequences from the same species as the cell that is to express the molecu that may cause gene silencing. The codon optimized larly tagged protein. sequence is synthesized and assembled, and inserted into an 0028 Nucleic acids encoding the molecularly tagged ribo appropriate expression vector using conventional techniques somal proteins and mRNA binding proteins of the invention well known to those of skill in the art. may be produced using routine genetic engineering methods 0032. In a preferred embodiment, a synthetic nucleic acid and cloning and expression vectors that are well known in the encoding a molecularly tagged ribosomal protein comprises art. Nucleic acids encoding the ribosomal protein or mRNA at least one codon Substitution in which non-preferred or less binding protein to be molecularly tagged may be obtained preferred codon in the natural gene encoding the protein has using any method known in the art. The sequences for many been replaced by a preferred codon encoding the same amino ribosomal and mRNA binding proteins are known (see Table acid. The relative frequency of use for each codon can vary 2 in Section 5.2 below providing GenBank accession num significantly between species, although certain codons are bers for many human and murine ribosomal proteins). infrequently used across species (Zhang et al., 1991, Low Nucleic acids may be obtained, for example, by PCR using usage codons in Escherichia coli, yeast, fruit fly, and pri oligonucleotide primers based upon the published sequences. mates. Gene, 105:61-72). For instance in humans the pre Other related ribosomal and mRNA binding proteins (for ferred codons are: Ala (GCC); Arg (CGC); Asn (AAC); Asp example from other species) may be obtained by low, medium (GAC): Cys (TGC); Gln (CAG); Gly (GGC); His (CAC): Ile or high Stringency hybridization of appropriate nucleic acid (ATC); Leu (CTG); Lys (AAG); Pro(CCC); Phe (TTC); Ser libraries using the ribosomal or mRNA binding protein in (AGC); Thr (ACC); Tyr (TAC); and Val (GTG). Less pre hand as a probe. The nucleic acids encoding the desired ferred codons are: Gly (GGG): Ile (ATT); Leu (CTC): Ser ribosomal or mRNA binding protein may then be incorpo (TCC); Val (GTC); and Arg (AGG). All codons that do not fit rated into a nucleic acid vector either appropriate for addi the description of preferred codons or less preferred codons tional molecular manipulations and/or for incorporation and are non-preferred codons. expression in the host cells of interest. The nucleotide 0033. In general, the degree of preference of a particular sequences encoding the peptide tag may likewise be obtained codon is indicated by the prevalence of the codon in highly using methods well known in the art. For example, if the tag expressed genes. Codon preference for highly expressed is fairly short, a nucleic acid encoding the tag and appropriate human genes areas indicated in Table 1. For example, ATC for generating a fusion protein with the ribosomal or mRNA represents 77% of the Ile codons in highly expressed mam binding protein may be constructed using oligonucleotides to malian genes and is the preferred Ile codon; ATT represents form the double stranded nucleic acid encoding the peptide 18% of the Ile codons in highly expressed mammalian genes tag. The synthetic nucleic acid may then be cloned and used and is the less preferred Ile codon. The sequence ATA' for generating fusion proteins with ribosomal proteins or represents only 5% of the Ile codons in highly expressed mRNA binding proteins. human genes and is a non-preferred Ile codon. Replacing a 0029. In certain embodiments, a nucleic acid molecule codon with another codon that is more prevalent in highly encoding a molecularly tagged ribosomal protein is intended expressed human genes will generally increase expression of for a particular expression system, in which the codon fre the gene in mammalian cells. Accordingly, the invention quencies reflect the tRNA frequencies of the host cell or includes replacing a less preferred codon with a preferred organism in which the protein is expressed. Codon optimiza codon as well as replacing a non-preferred codon with a tionallows for maximum protein expression by increasing the preferred or less preferred codon. translational efficiency of a gene of interest. Codon optimi 0034. In a particularly preferred embodiment, the nucleic Zation is a standard component of custom gene design, and acid has been optimized for expression of the encoded protein may be obtained from commercial service providers (e.g., in human or mammalian cells or organisms. Aptagen, Inc., Herndon, Va.; Integrated DNA Technologies, Skokie, Ill.). TABLE 1 0030 The nucleic acid encoding a molecularly tagged ribosomal protein may be a synthetic nucleic acid in which Codon Frequency (Percentage) in highly expressed human genes the codons have been optimized for increased expression in Ala GC 53 the host cell in which it is produced. The degeneracy of the 17 genetic code permits variations of the nucleotide sequence, 13 while still producing a polypeptide having the identical 17 37 amino acid sequence as the polypeptide encoded by the native 7 DNA sequence. The frequency of individual synonymous 6 codons for amino acids varies widely from genome to 21 genome among eukaryotes and prokaryotes. The overall AG 10 expression levels of individual genes may be regulated by 18 differences in codon choice, which modulates peptide elon US 2014/O 1961.76 A1 Jul. 10, 2014

TABLE 1-continued protein should bind mRNA, and the peptide tag should be accessible to the corresponding isolation reagent. Accord Codon Frequency (Percentage) in highly expressed human genes ingly, selection of an appropriate ribosomal protein for tag ASn AA 78 ging can be based upon accessibility to affinity reagents such 22 as antibodies against N- and C-terminior other portions of the Asp GA 75 proteins in intact ribosomes (Syu WJ, Kahan L. Both ends of 25 Escherichia coli ribosomal protein S13 are immunochemi Leu CT 26 cally accessible in situ. J Protein Chem. 1992 June; 11(3): 5 3 225-30; reviewed in Syu WJ, Kahan B, Kahan L. Detecting 58 immunocomplex formation in Sucrose gradients by enzyme TT 2 immunoassay: application in determining epitope accessibil 6 ity on ribosomes. Anal Biochem. 1991 July; 196(1):174-7). Lys AA 18 However, accessibility does not imply that once tagged, the 82 Pro CC 48 ribosomal protein will function appropriately. One assay of 19 proper function of a tagged variant is the determination, via 16 immunohistochemistry, that the tagged protein displays 17 expected subcellular localization when expressed in cultured Phe TT 8O cells. The determination that the tag appears in a preparation 2O 68 of polysomes isolated from transfected cells is an indication 32 that ribosomal function is not greatly perturbed by the incor Gln CA 12 poration of the tagged protein into the organelle. See e.g., 88 Rosorius et al., 2000, Human Ribosomal Protein L5 Contains Glu GA 25 Defined Nuclear Localization and Export Signals, J. Biol. 75 Chem. 275(16): 12061-12068, and Russo et al., 1997, Differ Gly GG 50 12 ent Domains Cooperate to Target the Human Ribosomal L7a 14 Protein to the Nucleus and to the Nucleoli, J. Biol. Chem. 24 272(8): 5229-5235, both of which are hereby incorporated by His CA 79 reference in their entireties. 21 Ile AT 77 0038 More thorough evaluations of any possible pertur C 18 bation of ribosomal function involves comparisons of cellular 5 physiology in transfected and untransfected cells. For Ser TC 28 example, comparisons of relative protein or mRNA abun e 13 dances in transfected and untransfected cells would be such 5 9 measures of cellular physiology. An appropriate ribosomal AG 34 protein will be one which, when tagged, is incorporated into 10 ribosomes, allows those ribosomes to function without Thr AC 57 C 14 unduly affecting cellular physiology, and which has the tag 14 positioned so as to be accessible to affinity purification 15 reagents. Tyr TA 74 26 0039. The methods of Herfurth et al. (1995, Determination Wal GT 25 of peptide regions exposed at the Surface of the bacterial C 7 ribosome with antibodies against synthetic peptides. Biol 5 Chem Hoppe Seyler 376(2): 81–90; which is hereby incorpo & 64 rated by reference in its entirety) may be use to determine before tagging which parts of particular ribosomal proteins 0035. In particular embodiments, the invention provides are accessible in the intact ribosome. fusion proteins (including isolated or purified fusion proteins) 0040. Once accessibility is determined, one can determine containing all or a functional portion of a ribosomal protein or whether ribosomes containing the tagged riboprotein are mRNA binding protein and a peptide tag, as described above, functional using routine assays well known in the art. Analo as well as intact ribosomes and complexes of mRNA and gous tests for accessibility of the tagintagged mRNA binding mRNA binding protein (including isolated and purified intact proteins and formation and function of the mRNA binding ribosomes and complexes). The invention further provides protein-mRNA complex will be apparent to the skilled artisan nucleic acids comprising nucleotide sequences encoding the for identifying and designing appropriate tagged mRNA ribosomal and mRNA binding protein fusions with peptide binding proteins for use in the present invention. tags of the invention, vectors containing these nucleic acids, and host cells containing nucleic acids encoding the riboso 0041 Ribosomal proteins or protein subunits or mRNA mal and mRNA binding protein fusion proteins of the inven binding proteins suitable for use in the methods of the inven tion. tion are preferably of the same species as the host cell to be 0036 5.2. Selection of Ribosomal Protein for Tagging transformed, but in certain embodiments, may be of a differ 0037. Any ribosomal protein or mRNA binding protein ent species. may be molecularly tagged for use in the methods of the 0042 Ribosomal proteins or protein subunits suitable for invention. The ribosome containing the tagged protein should use in the methods of the invention include, but are not limited bind mRNA and, preferably, also translate the mRNA into to mouse and human ribosomal proteins in Tables 2 and 3. In protein, and the peptide tag in the intact ribosome should be Tables 2 and 3, the GenBank accession number is followed by accessible to the corresponding isolation reagent. Likewise, if a description of the ribosomal protein as it appears in Gen an mRNA binding protein is used, the tagged mRNA binding Bank:

US 2014/O 1961.76 A1 Jul. 10, 2014

TABLE 3-continued

Human Ribosomal Proteins NM 000982 - ribosomal protein L21 (gene or ) (RPL21), mRNA gi4506610|refNM 000982.1||4506610) NM 000981 - ribosomal protein L19 (RPL19), mRNA gi4506608 |refNM 000981.1||4506608)

0043 All of the sequences in Tables 2 and 3 are incorpo Oxford, pp. 280-285; incorporated herein by reference in its rated by reference in their entirety. entirety) to isolate the membrane-associated polysomes. 0044. In preferred embodiments, the tagged ribosomal 0052. Other appropriate cell lysates or fractions may be proteins are S6 or L37 ribosomal proteins, more preferably obtained using routine biochemical methods. tagged with a Strep Tag peptide tag, most preferably with the 0053 Specific polysomes can also be isolated using affin peptide tag at the C-terminus. In another preferred embodi ity separation techniques targeting nascent polypeptides or ment, the mRNA binding protein is not polyA binding pro endogenous or tagged mRNA-binding proteins using art tein. known methods e.g., using the methods of Lynch, 1987. 0045. 5.3. Isolation of Ribosomes Meth. Enzymol. 152: 248-253, and Brooks and Rigby, 2000, 0046 Various methods exist to isolate ribosomes, particu Nucleic Acids Res. 28(10): e49. larly polysomes, from cultured cells and tissues from trans 0054. In certain embodiments, polysomes are not isolated formed organisms (see, e.g., Bommer et al., 1997, Isolation from the post-mitochondrial Supernatant or even from a cell and characterization of eukaryotic polysomes, in Subcellular or tissue lysate before being subject to affinity purification. Fractionation, Graham and Rickwood (eds.), IRL Press, 0055. Once the cell lysate or fraction is obtained, the Oxford, pp. 280-285; incorporated herein by reference in its tagged ribosomes may be isolated using routine methods entirety). Preferably, the isolation method employed has the from untagged ribosomes and other cell components, prefer following characteristics: ably isolated from RNA, most preferably isolated from 0047 (1) Translation arresting compounds, such as emet mRNA, that is not bound to molecularly tagged ribosomes or ine or cycloheximide, are added to arrest translation, if pos tagged mRNA binding protein, using affinity reagents that sible, as a pre-treatment even before homogenization. This bind the tag specifically. prevents ribosome run-off and keeps the ribosome-mRNA complex stable, i.e., the ribosome remains bound to the 0056. In a preferred embodiment, the ribosomes are iso mRNA. lated from transfected cells by Scraping them into homogeni zation buffer (50 mM sucrose, 200 mMammonium chloride, 0048 (2) RNase inhibitors such as SUPERase InTM 7 mM magnesium acetate, 1 mM dithiothreitol, and 20 mM RNase Inhibitor (Ambion, Austin,Tex.) are added to buffers Tris-HCl, pH 7.6). The cells are then lysed by the addition of to maintain the integrity of the mRNA. the detergent, NP-40 (Nonidet P40, CALBIOCHEM-NOVA 0049 (3) After tissue or cell homogenization, total poly BIOCHEM Corporation, San Diego, Calif.) to a concentra Somes are isolated by preparing a post-mitochondrial Super tion of 0.5% followed by five strokes in a glass dounce tissue natant in the presence of at least a high concentration salt homogenizer. Unlysed cells, nuclei and mitochondria are pel buffer, e.g., 100-150 mM KC1. leted by centrifugation at 10,000xg for 10 minutes, at 4°C. 0050 (4) Detergent is also added to the post-mitochon The Supernatant is removed and layered over a two-step dis drial Supernatant to release membrane-associated polysomes continuous gradient of 1.8 M and 1.0M sucrose in 100 mM from endoplasmic reticulum membranes; total polysomes are ammonium chloride, 5 mM magnesium acetate, 1 mM dithio usually collected by centrifugation through a Sucrose cush threitol. 20 mM Tris-HCl (pH 7.6). The gradient is centri 1O. fuged for 18 hours at 98,000xg at 4°C. 0051. In certain embodiments, a variation of the above 0057 Following centrifugation, the supernatants are described general method is used to isolate membrane-asso ciated polysomes from a total pool of polysomes. This allows removed, and the polysome pellet is resuspended in 100 mM one to focus on the mRNA species encoding secreted or ammonium chloride, 5 mM magnesium chloride, 1 mMDTT transmembrane proteins, which are often targets of choice for and 20 mM Tris-HCl (pH 7.6). drug discovery. Various methods may be used to isolate mem 0.058 An equal volume of 2x denaturing protein electro brane-associated polysomes from cultured cells and tissue, phoresis sample buffer is added to the polysome sample. e.g., methods that employ differential centrifugation (Hall C, Solubilized polysomal proteins are fractionated by electro Lim L. Developmental changes in the composition of poly phoresis through a SDS containing 4-20% gradient polyacry adenylated RNA isolated from free and membrane-bound lamide gel, and transferred to a nitrocellulose filter. polyribosomes of the rat forebrain, analysed by translation in 0059. The isolation of tagged polysomes directly from vitro. Biochem J. 1981 Apr. 15; 196(1):327-36), rate-Zonal crude or post-mitochondrial Supernatants (adjusted appropri centrifugation (Rademacher and Steele, 1986, Isolation of ately with NaCl and detergent) is also envisioned. In certain undegraded free and membrane-bound polysomal mRNA embodiments, molecular tagging is achieved through the from rat brain, J. Neurochem. 47(3):953-957), isopycnic cen introduction of amino acids into a ribosomal protein-encod trifugation (Mechler, 1987, Isolation of messenger RNA from ing gene Such that the amino acids form a polypeptide region membrane-bound polysomes, Methods Enzymol. 152: 241 (i.e., a tag) that is capable of acting as a receptor or ligand for 248), and differential extraction (Bommer et al., 1997, Isola an affinity separation. tion and characterization of eukaryotic polysomes, in Subcel 0060. Because nascent polypeptides are attached to iso lular Fractionation, Graham and Rickwood (eds.), IRL Press, lated monosomes and polysomes, the methods of the inven US 2014/O 1961.76 A1 Jul. 10, 2014 tion can also be used to isolate newly synthesized polypep tion, Some or all of the regulatory sequences may be incor tides from a cell type of interest (e.g., for proteomic porated into nucleic acids of the invention (including trans applications). genes) to regulate the expression of tagged ribosomal protein 0061 Tagged polysomes that contain specific mRNAs or mRNA binding protein coding sequences. In certain (see infra) are isolated using antibodies that recognize spe embodiments, a gene that is not constitutively expressed, (i.e., cific nascent, encoded polypeptide chains (for review see exhibits some spatial or temporal restriction in its expression Lynch DC. Use of antibodies to obtain specific polysomes. pattern) is used as a source of a regulatory sequence. In other Methods Enzymol. 1987: 152:248-53; Schutz G, Kieval S, embodiments, a gene that is constitutively expressed is used Groner B, Sippel A E, Kurtz D. Feigelson P. Isolation of as a source of a regulatory sequence, for example, when the specific messenger RNA by adsorption of polysomes to nucleic acids of the invention are expressed in cultured cells. matrix-bound antibody. Nucleic Acids Res. 1977 January; 0069. In certain embodiments, the expression of tagged 4(1): 71-84; and Shapiro S. Z. Young JR. An immunochemical ribosomal protein or mRNA binding protein coding method for mRNA purification. Application to messenger sequences is regulated by a non-ribosomal regulatory RNA encoding trypanosome variable Surface antigen. J Biol. sequence. Such a sequence may include, but not be limited to, Chem. 1981 Feb. 25: 256(4): 1495-8). Particular mRNA spe parts of a ribosomal regulatory sequence (but does not include cies as low in abundance as 0.01-0.05% of total mRNA have the entire ribosomal regulatory sequence), but Such sequence been purified to near homogeneity via this approach. effects a different expression pattern than the ribosomal regu 0062 Affinity methods that can be used to isolate or purify latory sequence. tagged ribosomes or other mRNA binding proteins taking 0070 Preferably, the regulatory sequence is derived form advantage of the affinity of a reagent for the peptide tag are a human or mouse gene associated with an adrenergic or well known in the art including chromatography, Solid phase noradrenergic neurotransmitter pathway, e.g., one of the chromatography and precipitation, matrices, precipitation, genes listed in Table 4; a cholinergic neurotransmitter path etc. way, e.g., one of the genes listed in Table 5: a dopaminergic 0063. In specific embodiments, the invention provides neurotransmitter pathway, e.g., one of the genes listed in molecularly tagged ribosomes, preferably bound to mRNA, Table 6: a GABAergic neurotransmitter pathway, e.g., one of that are bound to an affinity reagent for the molecular tag. In the genes listed in Table 7: a glutaminergic neurotransmitter more specific embodiments, the molecularly tagged ribo pathway, e.g., one of the genes listed in Table 8; a glycinergic Somes are bound to an affinity reagent that is bound, prefer neurotransmitter pathway, e.g., one of the genes listed in ably covalently, to a solid surface, such as a chromatography Table 9; a histaminergic neurotransmitter pathway, e.g., one resin, e.g., agarose, Sepharose, and the like. of the genes listed in Table 10; a neuropeptidergic neurotrans 0064 5.4. Isolation of mRNA from Purified Polysomes mitter pathway, e.g., one of the genes listed in Table 11; a 0065. Once the tagged ribosome or mRNA binding pro serotonergic neurotransmitter pathway, e.g., one of the genes tein has been isolated, the associated mRNA complexed with listed in Table 12; a nucleotide receptor, e.g., one of the genes the ribosome or mRNA binding protein may be isolated using listed in Table 13; an , e.g., one of the genes listed methods well known in the art. For example, elution of in Table 14; markers of undifferentiated or not fully differen mRNA is accomplished by addition of EDTA to buffers, tiated cells, preferably nerve cells, e.g., one of the genes listed which disrupts polysomes and allows isolation of bound in Table 15; the Sonic hedgehog signaling pathway, e.g., one mRNA for analysis (Schutz, et al. (1977), Nucl. Acids Res. of the genes in Table 16; calcium binding, e.g., one of the 4:71-84: Kraus and Rosenberg (1982), Proc. Natl. Acad. Sci. genes listed in Table 17; or a neurotrophic factor receptor, USA 79:4015-4019). In addition, isolated polysomes (at e.g., one of the genes listed in Table 18. tached or detached from isolation matrix) can be directly 0071. The ion channel encoded by or associated with the input into RNA isolation procedures using reagents such as gene selected as the source of the regulatory sequence is Tri-reagent (Sigma) or Triazol (Sigma). In particular embodi preferably involved in generating and modulating ion flux ments, poly A" mRNA is preferentially isolated by virtue of across the plasma membrane of neurons, including, but not its hybridization of oligodT cellulose. Methods of mRNA limited to Voltage-sensitive and/or cation-sensitive channels, isolation are described, for example, in Sambrook et al., 2001, e.g., a calcium, Sodium or . Molecular Cloning, A Laboratory Manual. Third Edition, 0072. In Tables 4-18 that follow, the common names of Cold Spring Harbor Laboratory Press, N.Y.; and Ausubel et genes are listed, as well as their GeneCards identifiers (Reb al., 1989, Current Protocols in Molecular Biology, Green han et al., 1997, GeneCards: encyclopedia for genes, proteins Publishing Associates and Wiley Interscience, N.Y., both of and diseases, Weizmann Institute of Science, Bioinformatics which are hereby incorporated by reference in their entireties. Unit and Genome Center (Rehovot, Israel). GenBank acces 0066 5.5. Regulatory Sequences for Expression of sion numbers, UniGene accession numbers, and Mouse Tagged Ribosomes Genome Informatics (MGI). Database accession numbers 0067. According to the methods of the invention, the where available are also listed. GenBank is the NIH genetic tagged ribosomes are selectively expressed in a particular sequence database, an annotated collection of all publicly chosen cell type. Such expression is achieved by driving the available DNA sequences (Benson et al., 2000, Nucleic Acids expression of the tagged ribosomal protein or mRNA binding Res. 28(1): 15-18). The GenBank accession number is a protein using regulatory sequences from a gene expressed in unique identifier for a sequence record. An accession number the chosen cell type. applies to the complete record and is usually a combination of 0068. The population of cells comprises a discernable a letter(s) and numbers, such as a single letter followed by five group of cells sharing a common characteristic. Because of its digits (e.g., U12345), or two letters followed by six digits selective expression, the population of cells may be charac (e.g., AF 123456). terized or recognized based on its positive expression of the 0073. Accession numbers do not change, even if informa characterizing gene. According to the methods of the inven tion in the record is changed at the author's request. An US 2014/O 1961.76 A1 Jul. 10, 2014 17 original accession number might become secondary to a TABLE 5-continued newer accession number, if the authors make a new Submis sion that combines previous sequences, or if for Some reason MGI Database a new Submission Supercedes an earlier record. GenBank and or UniGene Accession 0074 UniGene (Schuler et al., 1996. A gene map of the Gene Accession Number Number , Science 274(5287):540-6) is an experimen CHRM4 human: X15265, M16405 MGI: 88.399 tal system for automatically partitioning GenBank sequences (Muscarinic Ach M4) receO into a non-redundant set of gene-oriented clusters for cow, CHRMS human: AFO26263, M80333 human, mouse, rat, and Zebrafish. Within UniGene, expressed (Muscarinic Ach M5) rat: NM O17362 sequence tags (ESTs) and full-length mRNA sequences are receO mouse: AI327507 organized into clusters that each represent a unique known or CHRNA1 human: YOO762, XO2502, MGI: 87885 putative gene. Each UniGene cluster contains related infor (nicotinic alpha1) S77094 mation Such as the tissue types in which the gene has been R NA2 human: U62431, Y16281 MGI: 87886 expressed and map location. Sequences are annotated with (nicotinic alpha2) mapping and expression information and cross-referenced to R human: NM 000743, U62432, other resources. Consequently, the collection may be used as (nicotinic alpha3) M37981, M86383, Y08418 a resource for gene discovery. receO 0075. The Mouse Genome Informatics (MGI) Database CHRNA4 human: U62433, L35901, MGI: 87888 (Jackson Laboratory, Bar Harbor, Me.) contains information (nicotinic alpha4) YO8421, X89745, X87629 on mouse genetic markers, mRNA and genomic sequence receO CHRNAS human: U62434, YO8419, MGI: 87889 information, phenotypes, comparative mapping data, experi (nicotinic alphaS) M83712 mental mapping data, and graphical displays for genetic, receO physical, and cytogenetic maps. CHRNA7 human: X70297, YO8420, MGI: 99779 (nicotinic alpha,7) Z23141, U40583, U62436, TABLE 4 receO L25827, AFO36903 CHRNB1 human: X14830 MGI: 878.90 MGI Database (nicotinic Beta 1) GenBank and or UniGene Accession Gene Accession Number Number human: U62437, X53179, MGI: 87891 inic Beta 2) YO8415, AJOO1935 ADRB1 (adrenergic human: JO3019 MGI: 87937 beta 1) human: YO8417, X67513, ADRB2 (adrenergic human: M15169 MGI: 87938 inic Beta 3) U62438, RIKEN BB284174 beta 2) ADRB3 (adrenergic human: NM 000025, X70811, MGI: 87939 human: U48861, U62439, MGI: 87892 beta 3) X72861, M29932, X70812, S53291, X70812 (nicotinic Beta 4) YO8416, X68275 ADRA1A (adrenergic human: D25235, UO2569, alpha1a) AFO13261, L31774, UO3866 CHRNG nicotinic human: XO1715, M11811 MGI: 87895 guinea pig. AF108016 gamma immature ADRA1B (adrenergic human: U03865, L31773 MGI: 104774 muscle receptor alpha1b) CHRNEnicotinic human: X66403 ADRA1C (adrenergic human: UO8994 epsilon receptor mouse: NM 009603 alpha 1 c) mouse: NM 013461 CHRNDnicotinic human: X55019 MGI: 878.93 ADRA1D (adrenergic human: M76446, U03864, MGI: 106673 delta receptor alpha 1 d) L31772, D29952, S70782 ADRA2A (adrenergic human: M18415, M23533 MGI: 87934 alpha 2A) ADRA2B (adrenergic human: M34041, AF005900 MGI: 87935 TABLE 6 alpha 2B) ADRA2C (adrenergic human: JO3853, D13538, MGI: 87936 MG Database alpha 2C) U72648 GenBank and or UniGene Accession SLC6A2 human: X91117, M65105, MGI: 1270850 Gene Accession Number Number Norepinephrine AB022846, AFO61198 transporter (NET) h (tyrosine human: M17589 MG : 98735 hydroxylase) at (dopamine human: NM 001044 MG : 94862 transporter) TABLE 5 opamine human UniGene: X58987, MG : 99578 receptor 1 S58541, X55760, X55758 MGI Database opamine human UniGene: X51362, MG : 94924 GenBank and or UniGene Accession receptor 2 M29066, AFO50737, Gene Accession Number Number S62137, X51645, M3.0625, S69899 CHRM1 human: X15263, M35128 MGI: 88396 opamine human UniGene: U25441, MG : 94925 (Muscarinic Ach M1) Y00508, X52068 receptor 3 U32499 receptor opamine human UniGene: L12398, MG : 94926 CHRM2 human: M16404, ABO41391, receptor 4 S76942 (Muscarinic Ach M2) X15264 opamine human UniGene: M67439, MG : 94927 receptor mouse:AF264049 receptor 5 M67439, X584.54 CHRM3 human: U29589, AB041395, bh dopamine human UniGene: X13255 MG : 94864 (Muscarinic Ach M3) X15266 beta hydroxylase receptor mouse: AF264.050

US 2014/O 1961.76 A1 Jul. 10, 2014

TABLE 8-continued TABLE 9

MGI Database GenBank and or UniGene Accession MGI Database Gene Accession Number Number GenBank and for UniGene Accession GRM7 human: NM 000844, X94552 Gene Accession Number Number mGluR7 mouse: RIKEN BB357072 REI, Glycine human: X52009 MGI: 95.747 GRM8 human: NM 000845, U95025, receptors mGluR8 AJ236921, AJ236922, AC000099 alpha 1 type III mouse: U17252 mGluR8 GLRA1 RD: human: AF009014 MGI: 95813 Glycine human: X52008, AFO53495 MGI: 95.748 glu ionotropic receptors delta alpha 2 excitatory human: UO3505, UO1824, Z32517, MGI: 101931 amino acid D85884 GLRA2 transporter2 Glycine human: AFO17724, U93917, glutamate aspartate receptors AFO18157 transporter II alpha 3 mouse:AF214575 glutamate GLRA3 transporter GLT1 Glycine no human glutamate receptors mouse:X75850, X75851, transporter SLC1A2 alpha 4 X75852, X75853 glial high GLRA4 glutamateaffinity glycine human: U33267, AFO94754, MGI: 95751 transporter receptor AFO947SS EAAC1 human: UO8989, UO3506, UO6469 MGI: 105083 beta neural SLC1A1 neuronal GLRB epithelia high affinity glutamate transporter EEAT1 human: D26443, AF070609, MGI: 999.17 TABLE 10 SLC1A3 L19158, UO3504, Z31713 glial high MGI Database affinity GenBank and or UniGene Accession glutamate Gene Accession Number Number transporter Histamine human: Z34897, D284.81, MGI: 107619 EAAT4 human: U18244, AC004659 MGI: 1096331 Hi-receptor 1 X76786, ABO41380, D14436 neural SLC1A6 AFO26261 s s high affinity Histamine human: M64799, AB023486, MGI: 108482 aspartate? H2-receptor 2 ABO41384 glutamate Histamine human: NM 007 232 transporter H3-receptor 3 mouse: MM31751

TABLE 11

MGI Database GenBank and for UniGene Accession Gene Accession Number Number

orexin OX-A human: AF041240 MGI: 1202306 hypocretin 1 Orexin B Orexin receptor OX1R human: AF041243 HCRTR1 Orexin receptor OX2R human: AF041245 HCRTR2 leptinR-long human: U66497, U43168, U59263, MGI: 104993 Leptin receptor long form U66495, U52913, U66496, U52914, U52912, U50748, AKOO1042 MCH human: M57703, S63697 melanin concentrating hormone PMCH US 2014/O 1961.76 A1 Jul. 10, 2014 20

TABLE 1 1-continued

MGI Database GenBank and for UniGene Accession Gene Accession Number Number

human: GDB: 138780 MGI: 96929 MC3 receptor mouse: MMS7183 melanocortin 3 receptor human: S77415, LO8603, MC4 receptor NM 005912 melanocortin 4 receptor MCSR human: L27080, Z25470, UO8353 MGI: 9942O MC5 receptor melanocortin 5 receptor prepro-CRF human: V00571 corticotropin-releasing factor rat: XO3036, M54987 precursor CRH corticotropin releasing hormone CRHR1 human: L23332, X72304, L23333, MG : 88498 CRH/CRF receptor 1 AFO39523, U16273 CRF R2 human: U34587, AFO 19381, MG : 89431.2 CRH/CRF receptor 2 AFO11406, AC004976, AC004976 human: X58022, S60697 MG : 88497 CRF binding protein Urocortin human: AFO386.33 MG : 1276.123 POMC human: VO1510, M38297, JOO292, MG : 97742 Pro-opiomelanocortin M286.36 CART human: U20325, U16826 MG : 13S1330 cocaine and amphetamine regulated transcript human: KO1911, M15789, MG : 97374 Neuropeptide Y M14298, AC004485 epro NPY human: M88461, M84755, MG : 104963 PYY1 receptor NM OOO909 europeptide Y1 receptor PY2R human: U42766, USO 146, U32500, MG : 108418 PYY2 receptor U36269, U42389, U76254, europeptide Y2 receptor NM 000910 PYY4 receptor human: Z66526, U35232, U42387 MG : 105374 py4R Neuropeptide Y4 receptor (mouse) NPYYS receptor human: U94320, U56079, U66275 MG : 108082 Npy5R Neuropeptide Y5 receptor mouse: MM10685 (mouse) NPYY6 receptor human: D86519, U59431, U67780 MG : 1098590 Npy6r Neuropeptide Y receptor (mouse) CCK human: NM 000729, L00354 MG : 88.297 Cholecystokinin CCKa receptor human: L19315, D85606, L13605 MG : 99.478 CCKAR cholecystokinin receptor U23430 CCKb receptor human: D13305, LO4473, LO8112, MG : 99.479 CCKBR cholecystokinin receptor LO7746, L10822, D21219, S70057, AFO74029 AGRP human: NM 001138, U88063, MG : 892O13 agouti related peptide U894.85 Galanin human: M77140, L11144 MG : 95637 GALP Galanin like peptide See, Jureus et al., 2000, Endocrinology 141 (7): 2703-06. GalR1 receptor human: NM 001480, U53511, MGI: 1096364 GALNR1 L34339, U23854 galanin receptor1 GalR2 receptor human: AF040630, AFO80586, MGI: 1337.018 GALNR2 AFO42782 galanin receptor2 GalR3 receptor human: AF073799, Z97630, MGI: 13290O3 GALNR3 AFO67733 Gar3 galanin receptor3 UTS2 human: Z98884, AF104118 MGI: 1346329 prepro-urotensin II GPR14 human: AI263.529 Urotensin receptor mouse:AI385474 US 2014/O 1961.76 A1 Jul. 10, 2014 21

TABLE 1 1-continued

MG Database GenBank and for UniGene Accession Gene Accession Number Number

SST human: J00306 MG : 98.326 Somatostatin SSTR1 human: M81829 MG : 98327 Somatostatin receptor SSt1 SSTR2 human: AF 184174 M81830 MG : 98.328 Somatostatin receptor SSt2 AF184174 SSTR3 human: M96738, Z82188 MG : 98.329 Somatostatin receptor SSt3 SSTR4 human: L14856, L07833, D16826, MG : 105372 Somatostatin receptor SSta. ALO49651 SSTRS human: D16827, L14865, MG : 894282 somatostatin receptorsstS ALO31713 GPR7 8: U22491 MG : 891989 G protein-coupled receptor 7 opioid-Somatostatin-like receptor GPR8 8: U224.92 G protein-coupled receptor 8 opioid-Somatostatin-like receptor PENK (pre Pro Enkephalin) human: V00510, JOO123 MGI: 1046.29 PDYN (Pre pro Dynorphin) human: KO2268, ALO34562, MGI: 97535 XOO176 PRM1 human: L25119, L29301, U12569, MGI: 97441 opiate receptor AL132774 PRK1 human: U11053, L37362, U17298 MGI: 97439 opiate receptor PRD1 human: UO7882, U10504, MGI: 974.38 delta opiate receptor ALOO9181 OPRL1 human: X77130, U30185 MGI: 97.440 ORL1 opioid receptor-like receptor R1 human: NM 018727, BE466577 anilloid receptor subtype 1 mouse: BE623398, RL-1 human: NM O15930 MGI: 1341836 anilloid receptor-like protein 1 rat: ABO40873 R1L1 mouse: NM O11706 anilloid receptor type 1 like protein 1 VRL1 vanilloid receptor-like protein 1 VR-OAC human: ACOO7834 vanilloid receptor-related osmotically activated channel CNR1 human: U73304, X81120, X81120, MGI: 104615 cannaboid receptors CB1 X54.937, X81121 EDN1 human: JO5008,YOO749, S56805, MGI: 95.283 endothelin 1 ET1 Z98050, M25380 GHRH human: L00137, AL031659, MGI: 95709 growth hormone releasing LOO137 hormone GHRHR human: AFO29342, U34195, growth hormone releasing mouse: NM 010285 hormone receptor PNOC human: X97370, U48263, X97367 MGI: 1053O8 nociceptin orphanin FQ/nocistatin NPFF human: AFOO5271 neuropeptide FF precursor mouse: RIKEN BB365815 neuropeptide FF receptor human: AF257210, NM 0.04885, neuropeptide AF receptor AF119815 G-protein coupled receptor HLWAR77 G-protein coupled receptor NPGPR GRP human: K02054, S67384, S73265, MGI : 958.33 gastrin releasing peptide M12512 preprogastrin-releasing peptide GRPR human: M73481, US7365 MGI : 95.836 gastrin releasing peptide receptor BB2 human: M21551 neuromedin B mouse:AI3273.79 US 2014/O 1961.76 A1 Jul. 10, 2014 22

TABLE 1 1-continued

MGI Database GenBank and for UniGene Accession Gene Accession Number Number

NMBR human: M73482 MGI: 1100S25 neuromedin B receptor BB1 BRS3 human: Z97632, L08893, X76498 bombesin like receptor subtype-3 mouse:ABO1028O uterine bombesin receptor GCGPROglucagon human: JO4040, XO3991, VO1515 MGI: 95674 GLP-1 GLP-2 human: UO3469, L20316 MGI: 99.572 glucagon receptor human: AL035690, UO1104, MGI: 99.571 GLP1 receptor UO1157, L23503, UO1156, U10O37 human: AF 105367 GLP2 receptor mouse:AF1662.65 VIP human: M36634, M54930, MGI: 98.933 vasoactive intestinal peptide M14623, M33027, M11554, LOO158, M36612 SCT mouse: NM 011328, X73580 Secretin PPYR1 human: Z66526, U35232, U42387 MGI: 105374 pancreatic polypeptide receptor 1 OXT human: M25650, M11186, pre pro Oxytocin XO3173 mouse: NM 011025, M88355 OXTR human: X64878 MGI: 109147 OTR oxytocin receptor AVP human: M25647, XO3172, MGI: 88121 Preprovasopressin M11166, AFO31476, X62890, X62891 AVPR1A human: U19906, L25615, S73899, V1a receptor AFO3.0625, AF101725 vasopressin receptorla mouse: NM O16847 human: D31833, L37112, V1b receptor AFO30512, AF101726 vasopressin receptor1b mouse: NM 011924 AVPR2 human: Z11687, UO4357, L22206, MGI: 88.123 V2 receptor U52112, AFO30626, AF032388, vasopressin receptor2 AF101727, AF101728 NTS human: NM 006183, U91618 proneurotensin proneuromedin N mouse: MM642O1 Neurotensin tridecapeptide plus neuromedin N NTSR1 human: X7007O MGI: 97386 Neurotensin receptor NT1 NTSR2 human: Y10148 Neurotensin receptor NT2 mouse: NM 008747 SORT1 human: X98248, L10377 MGI: 133801S sortilin 1 neurotensin receptor 3 BDKRB1 human: U12512, U48231, U22346, MGI: 88144 Bradykinin receptor 1 AJ238.044, AF117819 BDKRB2 human: X69680, S45489, S56772, MGI: 102845 Bradykinin receptor B2 M88714, X86164, X86163, X8616S GNRH1 human: XO1059, M12578, X15215 MGI: 95789 GnRH gonadotrophin releasing hormone GNRH2 human: AFO36329 GnRH gonadotrophin releasing hormone GNRHR human: NM 000406, LO7949, MGI: 95790 GnRH S60587, LO3380, S77472, Z81148, gonadotrophin releasing hormone U196O2 receptor CALCB human: XO2404, XO4861 calcitonin-related polypeptide, beta CALCA human: M26095, X00356, MGI: 88249 calcitonin calcitonin-related XO3662, M64486, M12667, polypeptide, alpha X02330, X15943 CALCR human: LOOS87 MGI: 101.9SO calcitonin receptor US 2014/O 1961.76 A1 Jul. 10, 2014 23

TABLE 1 1-continued

MGI Database GenBank and for UniGene Accession Gene Accession Number Number TAC1 (also called tac2) human: X54469, U37529, MGI: 98474 neurokinin A ACOO4140 TAC3 human: NM 013251 neurokinin B rat: NM O17053 TACR2 human: M75105, M57414, neurokinina (Subk) receptor M60284 TACR1 human: M84425, M74290, MGI: 98475 achykinin receptor NK2 (Sub P M81797, M76675, X65177, and K) M84426 TACR3 human: M89473 X65172 achykinin receptor NK3 (Sub P and K) neuromedin K ADCYAP1 human: X60435 MGI: 105094 PACAP NPPA human: M54951, XO1470, MGI: 97.367 atrial naturietic peptide (ANP) ALO21155, M30262, KO2043, CSO KO2O44 atrial natriuretic factor (ANF) CSO pronatriodilatin precursor prepronatriodilatin NPPB human: M25296, ALO21155, atrial naturietic peptide (BNP) M31776 CSO mouse: NM 008726 NPR1 human: X15357, AB010491 MGI: 973.71 naturietic peptide receptor 1 NPR2 human: L13436, AJOO5282, MGI: 973.72 naturietic peptide receptor 2 ABOOS647 NPR3 human: M59305, AF025998, MGI: 97373 naturietic peptide receptor 3 X52282 VIPR1 human: NM 004624, L13288, MGI: 109272 WPAC1 X75299, X77777, L20295, VIP receptor 1 U11087 VIPR2 human: X95097, L36566, Y18423, MGI: 107166 VIP receptor 2 L40764, AF027390 PACAP receptor

TABLE 12 TABLE 12-continued

MGI MGI Database Database GenBank and or UniGene Accession GenBank and or UniGene Accession Gene Accession Number Number Gene Accession Number Number SHT1A human: M83181, AB041403, MGI: serotonin receptor 1A M28269X13556 96273 Rain receptor 6 human: LA1147, AF007141 In receptor 2A " X57830 S. 5HT7 human: U68488, U68487, L21195, 5HT3 human: AJO05205, D49394, S82612, MGI: serotonin receptor 7 X98193 serotonin receptor 3 AJO05205, AJOO3079, AJO05205, 96.282 mouse: MM8053 AJOO3O80, AJOO3078 Sert human UniGene: L05568 MGI: SHT1B human: M81590, M81590, D10995, MGI: Serotonin transporter 96.285 SHT1Db M8318O, LO9732, M75128, 96274 TPRH human UniGene: AFO57280, MGI: serotonin receptor 1B AB041370, AB041377, ALO49595 TPH (Tph) X52836, L29306 98.796 5HT1D alpha human: AL049576 MGI: tryptophan serotonin receptor 1D 96276 hydroxylase 5HT1E human: NM 000865, M91467, serotonin receptor 1E M92826, Z11166 SHT2B human: NM 000867, X77307, MGI: TABLE 13 serotonin receptor 2B Z36748 109323 5HT2C human: NM 000868, U49516, MGI: MGI Database serotonin receptor 2C M81778, X80763, AF208053 96.281 GenBank and/or UniGene Accession SHT4 human: Y10437, YO8756, YO9586, Gene Accession Number Number serotonin receptor 4 Y13584, Y12505, Y12506, Y12507, P2RX1 human: U45448, X83688, MGI: 1098235 (has 5 subtypes AJO 11371, AJ243213 P2x1 receptor AFO78925, AF020498 isoforms) purinergic receptor P2X, SHTSA human: X81411 MGI: ligand-gated ion channel serotonin receptor 5A 96.283 P2RX3 human: YO7683 SHtSB rat: L10073 purinergic receptor P2X, mouse: RIKEN BB459124, serotonin receptor 5B ligand-gated ion channel, 3 RIKEN BB452419 US 2014/O 1961.76 A1 Jul. 10, 2014 24

TABLE 13-continued TABLE 13-continued

MGI Database MGI Database GenBank and for UniGene Accession GenBank and or UniGene Accession Gene Accession Number Number Gene Accession Number Number P2RX4 human: U83993, YO7684, MGI: 1338859 P2RY1 human: Z49205 MGI: 1 OSO49 purinergic receptor P2X, U87270, AFOOO234 purinergicG-protein coupledreceptor 1 P2Y, igand-gated ion channel, 4 P2RY2 human: UO7225 S74902 P2RX5 human: AF 168787, purinergic receptor P2Y, rat: US 6839 purinergic receptor P2X, AFO16709, U49395, G-protein coupled, 2 igand-gated ion channel, 5 U49396, AF168787 P2RY4 pyrimidinergic human: X91852, rat: AFO7.0573 receptor P2Y, G-protein X96597, U40223 P2RXL1 human UniGene: MGI: 13371.13 coupled, 4 purinergic receptor P2X- ABOO2O58 P2RY6 human: X97058, U52464, ike 1, orphan receptor pyrimidinergic receptor AFOO7892, AFOO7891, P2RX6 P2Y, G-protein coupled, 6 AF007893 P2RY11 human:AFO30335 P2RX7 human: YO9561, Y12851 MGI: 13399.57 purinergic receptor P2Y, purinergic receptor P2X, G-protein coupled, 11 igand-gated ion channel, 7

TABLE 1.4

GenBank and for UniGene MGI Database Gene Accession Number Accession Number

SCN1A human: X65362 MGI: 98.246 , voltage-gated, type I, alpha SCN1B human: L16242, L10338, U12194, MGI: 98.247 sodium channel, voltage-gated, NM 001037 type I, beta SCN2B human: AF049498, AF049497, MGI: 106921 sodium channel, voltage-gated, AF007783 type II, beta SCNSA human: M77235 Sodium channel, voltage-gated, type V, alpha SCN2A1 MGI: 98.248 Sodium channel, voltage-gated, type II, alpha 1 SCN2A2 human: M94055, X65361, M91803 Sodium channel, voltage-gated, type II, alpha 2 SCN3A human: ABO37777, AJ251507 MGI: 98249 Sodium channel, voltage-gated, type III, alpha SCN4A human: M81758, LO1983, L04236, MGI: 982SO Sodium channel, voltage-gated, U24.693 type IV, alpha SCN6A human: M91556 Sodium channel, voltage-gated, type VII or VI SCN8a. human: AF225988, AB027567 MGI: 103.169 SCN8A sodium channel, voltage-gated, type VIII SCN9A human: X82835, RIKEN BB468679 Sodium channel, voltage-gated, mouse: MM4O146 type IX, alpha SCN10A human: NM 006514, AF117907 Sodium channel, voltage-gated, type X, SCN11A human: AF 1886.79 MGI: 1345149 Sodium channel, voltage-gated, type XI, alpha SCN12A human: NM 014139 Sodium channel, voltage-gated, type XII, alpha SCNN1A human: X76180, Z92978, L29007, MGI: 101782 Sodium channel, nonvoltage- U81961, U81961, U81961, U81961, gated 1 alpha U81961 SCN4B Sodium channel, voltage-gated, type IV, beta US 2014/O 1961.76 A1 Jul. 10, 2014 25

TABLE 14-continued

GenBank and for UniGene MGI Database Gene Accession Number Accession Number SCNN1B human: X87159, L36593, Sodium channel, nonvoltage- AJOO5383, ACOO2300, U16023 gated 1, beta SCNNID human: U38254 Sodium channel, nonvoltage gated 1, delta SCNN1G human: X87160, L36592, U35630 MGI: 104.695 Sodium channel, nonvoltage gated 1, gamma CLCN1 human: Z25884, Z25587, M97820, MGI: 88417 1, skeletal Z25753 muscle CLCN2 human: AF026004 MGI: 105061 chloride channel 2 CLCN3 human: X78520, AL117599, MGI: 103555 chloride channel 3 AFO29346 CIC3 CLCN4 human: ABO19432X771.97 MGI: 104567 chloride channel 4 CLCNS human: X91906, X81836 MGI: 994.86 chloride channel 5 CLCN6 human: D28475, X83378, MGI: 1347049 chloride channel 6 ALO21155, X994.73, X99474, X96391, AL021155, AL021155, X994.75, ALO21155 CLCN7 human: AL031600, U88844, MGI: 1347048 chloride channel 7 Z67743, AJOO1910 CLIC1 human: X87689, AJO12008, chloride intracellular channel 1 X87689, U93205, AF129756 CLIC2 human: NM 001289 chloride intracellular channel 2 CLIC3 human: AF102166 chloride intracellular channel 3 CLICS human: AW816405 chloride intracellular channel 5 CLCNKB human: Z30644, S80315, U93879 chloride channel Kb CLCNKA human: Z30643, U93878 MGI: 13290.26 chloride channel Ka CLCA1 human: AFO394.00, AFO394O1 MGI: 1316732 chloride channel, calcium activated, family member 1 CLCA2 human: ABO268.33 chloride channel, calcium activated, family member 2 CLCA3 human: NM 004921 chloride channel, calcium activated, family member 3 CLCA4 human: AKO00072 chloride channel, calcium activated, family member 4 KCNA1 kv1.1 human: L02750 MGI: 96.654 potassium voltage-gate channel, Snaker-relate Subfamily, member KCNA2 human: Hs.248.139, LO2752 MGI: 96659 potassium voltage-gate mouse: MM5693O channel, Snaker-relate Subfamily, member 2 KCNA3 human: M85217, L23499, M38217, MGI:96660 potassium voltage-gate M55515 channel, Snaker-relate Subfamily, member 3 KCNA4 human: M55514, M60450, L02751 MGI: 96661 otassium voltage-gate nannel, Snaker-relate Subfamily, member 4 otassium voltage-gate nannel, Snaker-relate Subfamily, member 4-like US 2014/O 1961.76 A1 Jul. 10, 2014 26

TABLE 14-continued

GenBank and for UniGene MGI Database Gene Accession Number Accession Number KCNAS human: Hs.150208, M55513, MGI: 96.662 potassium voltage-gate M83254, M60451, M55513 channel, -related mouse: MM1241 Subfamily, member 5 KCNA6 human: X17622 MGI: 96663 potassium voltage-gate channel, shaker-related Subfamily, member 6 KCNA7 MGI: 96664 potassium voltage-gate channel, shaker-related Subfamily, member 7 KCNA10 human: U96110 potassium voltage-gate channel, shaker-related Subfamily, member 10 KCNB human: L02840, L02840, X68302, MGI: 96.666 potassium voltage-gate AFO26OOS channel, Shab-relate Subfamily, member KCNB2 human: Hs. 121498, U69962 potassium voltage-gate mouse: MM154372 channel, Shab-relate Subfamily, member 2 KCNC human: LOO621, S56770 MGI: 96.667 potassium voltage-gate channel, Shaw-relate Subfamily, member 1 KCNC2 MGI: 96668 potassium voltage-gate channel, Shaw-relate Subfamily, member 2 KCNC3 human: AFOSS989 MGI: 96.669 potassium voltage-gate channel, Shaw-relate Subfamily, member 3 KCNC4 human: M64676 MGI: 96670 potassium voltage-gate channel, Shaw-relate Subfamily, member 4 KCND human: AJOO5898, AF166003 MGI: 96671 potassium voltage-gate channel, Shal-related family, member 1 KCND2 human: AB028967, AJO10969, potassium voltage-gate ACOO4888 channel, Shal-related subfamily, member 2 KCND3 human: AF 120491, AF04.8713, potassium voltage-gate AFO48712, ALO495.57 channel, Shal-related subfamily, member 3 KCNE mouse: NM 0084.24 potassium voltage-gate

KCNE1L, human: AJO12743, NM 012282

KCNE human: AF302095 member 2 KCNE human: NM 005472, potassium voltage-gate rat: AJ271742 channel, Isk-related family, mouse: MM18733 member 3 KCNE4 mouse: MM.24386 US 2014/O 1961.76 A1 Jul. 10, 2014 27

TABLE 14-continued

GenBank and for UniGene MGI Database Gene Accession Number Accession Number

KCNF human: AFO33.382 potassium voltage-gate channel, Subfamily F, member 1 KCNG human: AFO33383, AL050404 potassium voltage-gate channel, Subfamily G, member 1 KCNG2 human: NM 012283 potassium voltage-gate channel, Subfamily G, member 2 KCNH human: AJOO1366, AF078741, potassium voltage-gate AFO78742 channel, Subfamily H (eag mouse: NM 01 0600 related), member 1 KCNH2 human: UO4270, AJO 10538, MGI: 1341722 potassium voltage-gate ABOO9071, AFO52728 channel, Subfamily H (eag related), member 2 KCNH3 human: AB022696, AB033108, potassium voltage-gate HS.64064 channel, Subfamily H (eag mouse: NM 01.0601, MM100209 related), member 3 KCNH4 human: ABO22698 potassium voltage-gate rat: BEC2 channel, Subfamily H (eag related), member 4 KCNHS human: Hs.27043 potassium voltage-gate mouse: MM44465 channel, Subfamily H (eag related), member 5 KCN1 human: UO3884, U12541, U12542, potassium inwardly-rectifying U12543 channel, Subfamily J, member 1 rait: NM 017023 KCNV2 human: U16861, U12507, U24055, MGI: 104744 potassium inwardly-rectifying AFO11904, U22413, AF021139 channel, Subfamily J, member 2 KCNJ3 human: USO964 U391.96 potassium inwardly-rectifying mouse: NM 008426 channel, Subfamily J, member 3 KCN4 human: Hs.32505, UO7364, Z97056, MGI: 104743 potassium inwardly-rectifying U24056, Z97056 channel, Subfamily J, member 4 mouse: MM104760 KCNJS human: NM OOO890 MGI: 1047SS potassium inwardly-rectifying channel, Subfamily J, member 5 KCNJ6 human: Hs. 11173, U52153, D87327, potassium inwardly-rectifying L784.80, S78685, AJOO1894 channel, Subfamily J, member 6 mouse: NM 01.0606, MM4276 rat: NM O13.192 KCNJ8 human: D50315, D50312 MGI: 1100508 potassium inwardly-rectifying channel, Subfamily J, member 8 KCNJ9 human: US 2152 MGI: 108.007 potassium inwardly-rectifying channel, Subfamily J, member 9 KCNJ 10 human: Hs.66727, U52155, U73192, MGI: 1194.504 potassium inwardly-rectifying U731.93 channel, Subfamily J, member O KCNJ11 human: Hs.248141, D50582 MGI: 107SO1 potassium inwardly-rectifying mouse: MM4722 channel, Subfamily J, member 1 KCNJ12 human: AFOO5214, L36069 MGI: 10849S potassium inwardly-rectifying channel, Subfamily J, member 2 KCNJ13 human: AJO07557, ABO13889, potassium inwardly-rectifying AFO61118, AJOO6128, AFO82182 channel, Subfamily J, member rat: ABO34241, ABO13890, 3 ABO34242 guinea pig: AF2007 14 US 2014/O 1961.76 A1 Jul. 10, 2014 28

TABLE 14-continued

GenBank and for UniGene MGI Database Gene Accession Number Accession Number

KCNJ14 human: HS.278677 potassium inwardly-rectifying mouse: Kir2.4, MM68170 channel, Subfamily J, member 4 KCNJ15 human: Hs. 17287, U73191, D87291, potassium inwardly-rectifying Y10745 channel, Subfamily J, member mouse: AJO12368, kir4.2, MM44238 5 KCN16 human: NM 018658, Kirs.1 potassium inwardly-rectifying mouse:ABO161.97 channel, Subfamily J, m ember 1 KCNK1 human: U76996, U33632, U90065 MGI: 109322 potassium channel, Subfamily K, member 1 (TWIK-1) KCNK2 human: AF004711, RIKEN potassium channel, Subfamily BB116O2S K, member 2 (TREK-1) KCNK3 human: AFOO6823 MGI: 1100509 potassium channel, Subfamily K, member 3 (TASK) KCNK4 human: AF247042, AL117564 potassium inwardly-rectifying mouse: NM OO8431 channel, Subfamily K, member 4 KCNKS human: NM 003740, AKOO1897 potassium channel, Subfamily mouse:AF259395 K, member 5 (TASK-2) KCNK6 human: AKO22344 potassium channel, Subfamily K, member 6 (TWIK-2) KCNK7 human: NM 005714 MGI: 1341841 potassium channel, subfamily mouse: MM23O2O K, member 7 KCNK8 mouse: NM O10609 potassium channel, Subfamily K, member 8 human: AF212829 potassium channel, Subfamily guinea pig: AF212828

KCNK human: AF279890 potassium channel, Subfamily K, member 10 (TREK2) human: NM 002248, U69883 potassium intermediate small conduc ance calcium-activated Cl8ll , Subfamily N, member 1 KCNN2 mouse: MM63515 potassium intermediate small conduc ance calcium-activated Cl8ll , Subfamily member 2 (hsk2) KCNN4 human: Hs.10082, AF022797, MGI: 1277957 potassium intermediate small AFO33021, AFOOO972, AF022150 ance calcium activated mouse: MM9911 , Subfamily N, member 4 human: U89364, AF000571, MGI: 108083 potassium voltage-gate AFO51426, AJOO6345, ABO15163, , KQT-like subfamily, ABO15163, AJOO6345 lele KCNQ2 human:Y15065, D82346, MGI: 1309SO3 potassium voltage-gate AFO33348, AFO74247, AF110020 , KQT-like subfamily, member 2 KCNQ3 human: NM 004519, AFO33347, MGI: 1336181 potassium voltage-gate AFO71491 , KQT-like subfamily, member 3 KCNQ4 human: Hs.241376, AF105202, potassium voltage-gate AF105216 , KQT-like subfamily, mouse:AF24.9747 member 4 KCNQ5 human: NM 019842 potassium voltage-gate , KQT-like subfamily, member 5 US 2014/O 1961.76 A1 Jul. 10, 2014 29

TABLE 14-continued

GenBank and for UniGene MGI Database Gene Accession Number Accession Number

KCNS human: AF043473 potassium voltage-gate mouse: NM OO8435 channel, delayed-rectifier, Subfamily S, member 1 KCNS2 mouse: NM 008436 potassium voltage-gate channel, delayed-rectifier, Subfamily S, member 2 KCNS3 human: AF043472 potassium voltage-gate channel, delayed-rectifier, Subfamily S, member 3 KCNAB1 L39833, U33428, L47665, X83127, MGI: 10915S potassium voltage-gate U16953 channel, shaker-related Subfamily, beta member 1 KCNAB2 human: U33429, AFO44253, potassium voltage-gate AFO29749 channel, shaker-related mouse: NM O10598 Subfamily, beta member 2 KCNAB3 human: NM 004732 MGI: 1336208 potassium voltage-gate mouse: MMS7241 channel, shaker-related Subfamily, beta member 3 human: Hs.248143, U53143 potassium inwardly-rectifying channel, Subfamily J, inhibitor 1 human: U11058, U13913, U11717, MGI: 99923 potassium large conductance U23767, AFO25999 calcium-activated channel, Subfamily M, alpha member 1 kcnma3 mouse: NM 008.432 potassium large conductance calcium-activated channel, rat: NM 019273

human: AF2097.47 mouse: NM OO5832

human: APOOO365 calcium-activated channel, Subfamily M, beta member 3 ike KCNMB3 human: NM O14407, AF214561 potassium large conductance calcium-activated channel KCNMB4 human: AJ271372, AF207992, potassium large conductance RIKEN BB329438, RIKEN calcium-activated channel, Sub BB265233 M, beta 4 HCN1 MGI: 1096392 hyperpolarization activated cyclic nucleotide-gated potassium channel 1 Cav1.1 c.111 CACNA1S human: L33798, U30707 MGI: 88.294 , voltage dependent, L type, alpha 1S Subunit Cav1.2 c.112 CACNA1C human: Z34815, L29536, Z34822, calcium channel, voltage L29534, L04569, Z34817, Z34809, dependent, L type, alpha 1C Z34813, Z34814, Z34820, Z34810, Subunit Z34811, L29529, Z34819, Z74996, Z34812, Z34816, AJ224873, Z34818, Z34821, AF070589, Z26308, M92269 Cav1.3 c.113 CACNA1D human: M83566, M76558, D43747, MGI: 88293 calcium channel, voltage AFO55575 dependent, L type, alpha 1D Subunit US 2014/O 1961.76 A1 Jul. 10, 2014 30

TABLE 14-continued

GenBank and for UniGene MGI Database Gene Accession Number Accession Number Cav1.4 c.114 CACNA1F human: AJ224874, AF235097, MGI: 1859639 calcium channel, voltage- AJOO6216, AFO67227, U93305 ependent, L type, alpha 1F Subuni Cav2.1 c.12.1 CACNA1AP/Q human: U79666, AF004883, MGI: 109482 type calcium channel, voltage- AFOO4884, X99897, ABO35727, ependent, P, Q type, alpha 1A U79663, U79665, U79664, Subuni U79667, U79668, AF100774 Caw3.2 C1 2.2 CACNA1B human: M94172, M941.73, U76666 MGI: 88.296 calcium channel, voltage ependent, L type, alpha 1B Subuni Caw3 C1 2.3 CACNA1E human: L29385, L29384, L27745 MGI: 106217 calcium channel, voltage ependent, alpha 1E. Subunit Cav3.1 c.131 CACNA1G human: ABO12043, AF190860, MGI: 1201678 calcium channel, voltage- AF126966, AF227746, AF227744, ependent, alpha 1G subunit AF134985, AF227745, AF227747, AF126965, AF227749, AF134986, AF227748, AF227751, AF227750, AB032949, AF029228 Cav3.2 C13.2 CACNA1H human: AF073931, AF051946, calcium channel, voltage- AFO7O604 dependent, alpha 1H subunit Cav3.3 c.133 CACNA1I human: AF 142567, ALO22319, calcium channel, voltage- AF211189, AB032.946 dependent, alpha 1I subunit

TABLE 1.5 TABLE 1.8

MGI Database GenBank and for UniGene MGI Database GenBank and for UniGene Accession Gene Accession Number Accession Number Gene Accession Number Number NTRK2 (TrkB) human: U12140, X75958, S76473, MGI: 97384 S76474 NES (nestin) no human MGI: 101784 GFRA1 human: NM 0.05264, AFO38420, MGI: 110O842 scip human: L26494 MGI: 101896 (GFRalpha 1) AFO38421, U97144, AFO42080, U95847, AF058999 GFRA2 human: U97145, AFOO2700, U93703 MGI: 1195462 (GFRalpha 2) GFRA3 human: AFOS1767 MGI: 12O1403 TABLE 16 (GFRalpha 3) Trika human: M23102, XO3541, XO42O1, MGI: 97383 Neurotrophin XO6704, X62947, M23102, X62947, GenBank and or UniGene MGI Database receptor M23102, ABO19488, M12128 Gene Accession Number Accession Number Trkc human: U05012, UO5O12, S76475, MGI: 97385 Neurotrophin AJ224.521, S76476, AFO52184 receptor Shh (Sonic Hedgehog) human: L38518 MGI: 98.297 Ret human: S80552 MGI: 97902 Neurotrophic Smoothened Shh human: U84401, AF114821 MGI: 108075 factor receptor receptor Patched Shh. human: NM 000264 0076 All of the sequences identified by the sequence data binding protein rat: AFO79162 base identifiers in Tables 4-18 are hereby incorporated by reference in their entireties. 0077. In yet another aspect of the invention, a promoter TABLE 17 directs tissue-specific expression of the tagged ribosomal protein or mRNA binding protein sequence to which it is GenBank and or UniGene MGI Database operably linked. For example, expression of the tagged ribo Gene Accession Number Accession Number Somal protein or mRNA binding protein coding sequences CALB1 (calbindin human: XO6661, M19879, MGI: 88248 may be controlled by any tissue-specific promoter/enhancer d28 K) element known in the art. Promoters that may be used to CALB2 (calretinin) human: NM OO1740, MGI: 101914 control expression include, but are not limited to, the follow X56667, X56668 PVALB (parvalbumin) human: X63578, X63070, MGI: 97821 ing animal transcriptional control regions that exhibit tissue Z82184, X52695, Z82184 specificity and that have been utilized in transgenic animals: elastase I gene control region, which is active in pancreatic acinar cells (Swift et al., 1984, Cell 38:639-646; Ornitz et al., US 2014/O 1961.76 A1 Jul. 10, 2014

1986, Cold Spring Harbor Symp. Quant. Biol. 50:399-409; incorporated by reference in its entirety). Using Such meth MacDonald, 1987, Hepatology 7:425-515); enolase pro ods, independently generated transgenic lines may express moter, which is active in brain regions, including the striatum, the nucleic acids encoding the molecularly tagged ribosomal cerebellum, CA1 region of the hippocampus, or deep layers proteins or mRNA binding proteins in a unique pattern, even of cerebral neocortex (Chen et al., 1998, Molecular Pharma though all incorporate identical regulatory elements. cology 54(3): 495-503); insulin gene control region, which is 0081 5.6. Introduction of Vectors into Host Cells active in pancreatic beta cells (Hanahan, 1985, Nature 315: I0082 In one aspect of the invention, a vector containing 115-22); immunoglobulin gene control region, which is the nucleic acid encoding the tagged ribosomal protein or active in lymphoid cells (Grosschedlet al., 1984, Cell 38:647 tagged mRNA binding protein and regulatory sequences 58: Adames et al., 1985, Nature 318:533-38: Alexander et al., (preferably characterizing gene regulatory sequences) can be 1987, Mol. Cell. Biol. 7:1436–44); mouse mammary tumor introduced transiently or stably into the genome of a host cell virus control region, which is active intesticular, breast, lym or be maintained episomally. In another aspect of the inven phoid and mast cells (Leder et al., 1986, Cell 45:485-95): tion, the vector can be transiently transfected wherein it is not albumin gene control region, which is active in liver (Pinkert integrated, but is maintained as an episome. et al., 1987, Genes and Devel. 1:268-76); alpha-fetoprotein 0083. The terms “host cell and “recombinant host cell gene control region which is active in liver (Krumlaufetal. are used interchangeably herein. It is understood that Such 1985, Mol. Cell. Biol. 5:1639–48; Hammer et al., 1987, Sci terms refer not only to the particular subject cell but to the ence 235:53-58); alpha 1-antitrypsin gene control region, progeny or potential progeny of Such a cell. Because certain which is active in the liver (Kelsey et al., 1987, Genes and modifications may occur in Succeeding generations due to Devel. 1:161-71); 13-globin gene control region, which is either mutation or environmental influences, such progeny active in myeloid cells (Mogram et al., 1985, Nature 315:338 may not, in fact, be identical to the parent cell, but are still 40; Kollias et al., 1986, Cell 46:89-94); myelin basic protein included within the scope of the term as used herein. gene control region, which is active in oligodendrocyte cells I0084. A host cell can be any prokaryotic (e.g., bacterium in the brain (Readhead et al., 1987, Cell 48:703-12); myosin Such as E. coli) or eukaryotic cell (e.g., a cell from a yeast, light chain-2 gene control region, which is active in skeletal plant, insect (e.g., Drosophila), amphibian, amniote, or mam muscle (Sani, 1985, Nature 314:283-86); and gonadotropic mal, to name but a few), preferably a vertebrate cell, more releasing hormone gene control region which is active in the preferably a mammalian cell, and most preferably, a mouse hypothalamus (Mason et al., 1986, Science 234:1372–78). cell. In certain embodiments, the host cell is a human cell, 0078. In other embodiments, the gene sequence from either a cultured cell, or in certain embodiments, an immor which the regulatory sequence derives is protein kinase C, talized cultured cell or primary human cell. In specific gamma (GenBank Accession Number: Z15114 (human); embodiments, the host cells are human embryonic stem cells, MGI Database Accession Number: MGI:97597); fos (Uni or other human stem cells (or murine stem cells, including Gene No. MM5043 (mouse)): TH-elastin; Pax7 (Mansouri, embryonic stem cells), tumor cells or cancer cells (particu 1998, The role of Pax3 and Pax7 in development and cancer, larly circulating cancer cells such as those resulting from Crit. Rev. Oncog.9(2):141-9); Eph receptor (Mellitzer et al., leukemias and other blood system cancers). Host cells 2000, Control of cell behaviour by signalling through Eph intended to be part of the invention include ones that comprise receptors and ephrins; Curt Opin. Neurobiol. 10(3):400-08; nucleic acids encoding one or more tagged ribosomal or Suda et al., 2000, Hematopoiesis and angiogenesis, Int. J. tagged mRNA binding proteins and, optionally, operably Hematol. 71(2):99-107; Wilkinson, 2000, Eph receptors and associated with characterizing gene sequences that have been ephrins: regulators of guidance and assembly, Int. Rev. Cytol. engineered to be present within the host cell (e.g., as part of a 196:177-244; Nakamoto, 2000, Eph receptors and ephrins, vector). The invention encompasses genetically engineered Int. J. Biochem. Cell Biol. 32(1): 7-12: Tallguist et al., 1999, host cells that contain any of the foregoing tagged ribosomal Growth factor signaling pathways in vascular development, protein or tagged mRNA binding protein coding sequences, Oncogene 18(55):7917-32); islet-1 (Bang et al., 1996, Regu optionally operatively associated with a regulatory element lation of vertebrate neural cell fate by transcription factors, (preferably from a characterizing gene, as described above) Curr. Opin. Neurobiol. 6(1):25-32: Ericson et al., 1995, Sonic that directs the expression of the coding sequences in the host hedgehog: a common signal for Ventral patterning along the cell. Both cDNA and genomic sequences can be cloned and rostrocaudal axis of the neural tube, J. Dev. Biol. 39(5):809 expressed. In a preferred aspect, the host cell is recombination 16; 0-actin; thy-1 (Caroni, 1997, Overexpression of growth deficient, i.e., Rec, and used for BAC recombination. In associated proteins in the neurons of adult transgenic mice, J. specific embodiments the host cell may contain more than Neurosci. Methods 71(1):3-9). one type of ribosomal or mRNA binding protein fusion, 0079. Nucleic acids of the invention may include all or a where the fusion of the different ribosomal and mRNA bind portion of the upstream regulatory sequences of the selected ing proteins is to the same or different peptide tags. gene. The characterizing gene regulatory sequences prefer I0085. A vector containing a nucleotide sequence of the ably direct expression of the tagged ribosomal protein or invention can be introduced into the desired host cell by mRNA binding protein sequences in Substantially the same methods known in the art, e.g., transfection, transformation, pattern as the endogenous characterizing gene within trans transduction, electroporation, infection, microinjection, cell genic organism, or tissue derived therefrom. fusion, DEAE dextran, calcium phosphate precipitation, lipo 0080. In certain embodiments, the nucleic acids encoding somes, LIPOFECTINTM (source), lysosome fusion, synthetic the molecularly tagged ribosomal proteins or mRNA binding cationic lipids, use of a gene gun or a DNA vector transporter, proteins may be selectively expressed in random but distinct Such that the nucleotide sequence is transmitted to offspring subsets of cells, as described in Feng et al. (2000, Imaging in the line. For various techniques for transformation or trans neuronal Subsets in transgenic mice expressing multiple spec fection of mammalian cells, see Keown et al., 1990, Methods tral variants of GFP. Neuron 28(0:41-51, which is hereby Enzymol. 185: 527-37; Sambrook et al., 2001, Molecular US 2014/O 1961.76 A1 Jul. 10, 2014 32

Cloning, A Laboratory Manual. Third Edition, Cold Spring expressed. See, e.g., Hogan et al 1986, Manipulating the Harbor Laboratory Press, N.Y. Mouse Embryo, Cold Spring Harbor Laboratory Press, New 0.086. In certain embodiments, the vector is introduced York, N.Y. into a cultured cell. In other embodiments, the vector is intro 0091 Viral methods of inserting nucleic acids are known duced into a proliferating cell (or population of cells), e.g., a in the art. tumor cell, a stem cell, a blood cell, a bone marrow cell, a cell 0092. For stable transfection of cultured mammaliancells, derived from a tissue biopsy, etc. only a small fraction of cells may integrate the foreign DNA 0087 Particularly preferred embodiments of the invention into their genome. The efficiency of integration depends upon encompass methods of introduction of the vector containing the vector and transfection technique used. In order to iden the nucleic acid of the invention, using pronuclearinjection of tify and select integrants, a gene that encodes a selectable a nucleic acid construct of the invention into the mononucleus marker (e.g., for resistance to antibiotics) is generally intro of a mouse embryo and infection with a viral vector compris duced into the host cells along with a nucleotide sequence of ing the construct. Methods of pronuclearinjection into mouse the invention. Preferred selectable markers include those embryos are well-known in the art and described in Hogan et which confer resistance to drugs, such as G418, hygromycin al. 1986, Manipulating the Mouse Embryo, Cold Spring Har and methotrexate. Cells stably transfected with the intro bor Laboratory Press, New York, N.Y. and Wagner et al., U.S. duced nucleic acid can be identified by drug selection (e.g., Pat. No. 4,873,191, issued Oct. 10, 1989, herein incorporated cells that have incorporated the selectable marker gene will by reference in their entireties. survive, while the other cells die). Such methods are particu 0088. In preferred embodiments, a vector containing the larly useful in methods involving homologous recombination nucleic acid of the invention is introduced into any genetic in mammalian cells (e.g., in murine ES cells) prior to intro material which ultimately forms a part of the nucleus of the ducing the recombinant cells into mouse embryos to generate Zygote of the animal to be made transgenic, including the chimeras. Zygote nucleus. In one embodiment, the nucleic acid of the 0093. A number of selection systems may be used to select invention can be introduced in the nucleus of a primordial transformed host cells. In particular, the vector may contain germ cell which is diploid, e.g., a spermatogonium or oogo certain detectable or selectable markers. Other methods of nium. The primordial germ cell is then allowed to mature to a selection include but are not limited to selecting for another gamete which is then united with another gameteor source of marker Such as: the herpes simplex virus thymidine kinase a haploid set of to form a Zygote. In another (Wigler et al., 1977, Cell 11: 223), hypoxanthine-guanine embodiment, the vector containing the nucleic acid of the phosphoribosyltransferase (Szybalska and Szybalski, 1962, invention is introduced in the nucleus of one of the gametes, Proc. Natl. Acad. Sci. USA 48: 2026), and adenine phospho e.g., a mature sperm, egg or polar body, which forms a part of ribosyltransferase (Lowy et al., 1980, Cell 22:817) genes can the Zygote. In preferred embodiments, the vector containing be employed in tha-, hgprt- or aprt-cells, respectively. Also, the nucleic acid of the invention is introduced in either the antimetabolite resistance can be used as the basis of selection male or female pronucleus of the Zygote. More preferably, it for the following genes: dhfr, which confers resistance to is introduced in either the male or the female pronucleus as methotrexate (Wigler et al., 1980, Natl. Acad. Sci. USA 77: Soon as possible after the sperm enters the egg. In other 3567: O'Hare et al., 1981, Proc. Natl. Acad. Sci. USA 78: words, right after the formation of the male pronucleus when 1527); gpt, which confers resistance to mycophenolic acid the pronuclei are clearly defined and are well separated, each (Mulligan and Berg, 1981, Proc. Natl. Acad. Sci. USA 78: being located near the Zygote membrane. 2072); neo, which confers resistance to the aminoglycoside 0089. In a most preferred embodiment, the vector contain G-418 (Colberre-Garapin et al., 1981, J. Mol. Biol. 150: 1); ing the nucleic acid of the invention is added to the male DNA and hygro, which confers resistance to hygromycin (Santerre complement, or a DNA complement other than the DNA et al., 1984, Gene 30: 147). complement of the female pronucleus, of the Zygote prior to 0094 5.7. Methods of Producing Transformed Organisms its being processed by the ovum nucleus or the Zygote female 0.095 The nucleic acid of the invention may integrate into pronucleus. In an alternate embodiment, the vector contain the genome of the founder organism (or an oocyte or embryo ing the nucleic acid of the invention could be added to the that gives rise to the founder organism), preferably by random nucleus of the sperm after it has been induced to undergo integration. If random, the integration preferably does not decondensation. Additionally, the vector containing the trans knock out, e.g., insert into, an endogenous gene(s) such that gene may be mixed with sperm and then the mixture injected the endogenous gene is not expressed or is mis-expressed. into the cytoplasm of an unfertilized egg. Perry et al., 1999, 0096. In other embodiments, the nucleic acid of the inven Science 284:1180-1183. Alternatively, the vector may be tion may integrate by a directed method, e.g., by directed injected into the vas deferens of a male mouse and the male homologous recombination ("knock-in”), Chappel, U.S. Pat. mouse mated with normal estrus females. Huguet et al., 2000, No. 5.272,071; and PCT publication No. WO91/06667, pub Mol. Reprod. Dev. 56:243-247. lished May 16, 1991; U.S. Pat. No. 5,464,764; Capecchi et al., 0090 Preferably, the nucleic acid of the invention is intro issued Nov. 7, 1995: U.S. Pat. No. 5,627,059, Capecchi et al. duced using any technique so long as it is not destructive to issued, May 6, 1997: U.S. Pat. No.5.487,992, Capecchi et al., the cell, nuclear membrane or other existing cellular or issued Jan. 30, 1996). Preferably, when homologous recom genetic structures. The nucleic acid of the invention is pref bination is used, it does not knock out or replace the hosts erentially inserted into the nucleic genetic material by micro endogenous copy of the characterizing gene (or characteriz injection. Microinjection of cells and cellular structures is ing gene ortholog). known and is used in the art. Also known in the art are 0097 Methods for generating cells having targeted gene methods of transplanting the embryo or Zygote into a modifications through homologous recombination are known pseudopregnant female where the embryo is developed to in the art. The construct will comprise at least a portion of the term and the nucleic acid of the invention is integrated and characterizing gene with a desired genetic modification, e.g., US 2014/O 1961.76 A1 Jul. 10, 2014 insertion of the nucleotide sequence coding for the tagged 0102. In a preferred embodiment, a transgenic animal of ribosomal protein and will include regions of homology to the the invention is created by introducing a nucleic acid of the target locus, i.e., the endogenous copy of the characterizing invention, encoding the characterizing gene regulatory gene in the host's genome. DNA constructs for random inte sequences operably linked to nucleotide sequences encoding gration need not include regions of homology to mediate a tagged ribosomal protein, into the male pronuclei of a recombination. Markers can be included for performing posi fertilized oocyte, e.g., by microinjection or retroviral infec tive and negative selection for insertion of the nucleic acid of tion, and allowing the oocyte to develop in a pseudopregnant the invention. female foster animal. Methods for generating transgenic ani 0098. To create a homologous recombinant organism, a mals via embryo manipulation and microinjection, particu homologous recombination vector is prepared in which the larly animals such as mice, have become conventional in the nucleotide sequence encoding the tagged ribosomal protein is art and are described, for example, in U.S. Pat. Nos. 4.736, flanked at its 5' and 3' ends by characterizing gene sequences 866 and 4,870,009, U.S. Pat. No. 4,873,191, in Hogan, to allow for homologous recombination to occur between the Manipulating the Mouse Embryo, (Cold Spring Harbor exogenous gene carried by the vector and the endogenous Laboratory Press, Cold Spring Harbor, N.Y., 1986) and in characterizing gene in an embryonic stem cell. The additional Wakayama et al., 1999, Proc. Natl. Acad. Sci. USA, flanking nucleic acid sequences are of Sufficient length for 96:14984-89. Similar methods are used for production of Successful homologous recombination with the endogenous other transgenic animals. characterizing gene. Typically, several kilobases of flanking 0103) A transgenic founder animal can be identified based DNA (both at the 5' and 3' ends) are included in the vector. upon the presence of the nucleic acid of the invention in its Methods for constructing homologous recombination vectors genome and/or expression of mRNA encoding the nucleic and homologous recombinant animals are described further acid of the invention in tissues or cells of the animals. A in Thomas and Capecchi, 1987, Cell 51:503: Bradley, 1991, transgenic founder animal can then be used to breed addi Curr. Opin. Bio/Technol. 2: 823-29; and PCT Publication tional animals carrying the nucleic acid of the invention as Nos. WO 90/11354, WO 91/01140, WO 92/0968, and WO described Supra. Moreover, transgenic animals carrying the 93FO4169. nucleic acid of the invention can further be bred to other 0099. A transgenic animal is a non-human animal, prefer transgenic animals carrying other nucleic acids of the inven ably a mammal, more preferably a rodent Such as a rat or tion. mouse, in which one or more of the cells of the animal 0104. In another embodiment, the nucleic acid of the includes a nucleic acid of the invention, i.e., has a non-endog invention is inserted into the genome of an embryonic stem enous (i.e., heterologous) nucleic acid sequence present as an (ES) cell, followed by injection of the modified ES cell into a extrachromosomal element in a portion of its cell or stably blastocyst-stage embryo that Subsequently develops to matu integrated into its germ line DNA (i.e., in the genomic rity and serves as the founder animal for a line of transgenic sequence of most or all of its cells). Other examples of trans animals. genic animals include non-human primates, sheep, dogs, 0105. In another embodiment, a vector bearing a nucleic cows, goats, chickens, amphibians, etc. The invention also acid of the invention is introduced into ES cells (e.g., by includes transgenic plants and fungi (including yeast). Unless electroporation) and cells in which the introduced gene has otherwise indicated, it will be assumed that a transgenic ani homologously recombined with the endogenous gene are mal comprises stable changes to the germline sequence. Het selected. See, e.g., Liet al., 1992, Cell 69:915. For embryonic erologous nucleic acid is introduced into the germline of such stem (ES) cells, an ES cell line may be employed, or embry a transgenic animal by genetic manipulation of, for example, onic cells may be obtained freshly from a host, e.g. mouse, rat, embryos or embryonic stem cells of the host animal. guinea pig, etc. 0100. As discussed above, transformed organisms of the 0106. After transformation, ES cells are grown on an invention, e.g., transgenic animals, are preferably generated appropriate feeder layer, e.g., a fibroblast-feeder layer, in an by random integration of a vector containing a nucleic acid of appropriate medium and in the presence of appropriate the invention into the genome of the organism, for example, growth factors, such as leukemia inhibiting factory (LIF). by pronuclear injection in an animal Zygote as described Cells that contain the construct may be detected by employing above. Other methods involve introducing the vector into a selective medium. Transformed ES cells may then be used cultured embryonic cells, for example ES cells, and then to produce transgenic animals via embryo manipulation and introducing the transformed cells into animal blastocysts, blastocyst injection. (See, e.g., U.S. Pat. Nos. 5,387.742, thereby generating a "chimeras' or "chimeric animals', in 4.736,866 and 5.565,186 for methods of making transgenic which only a subset of cells have the altered genome. Chime animals.) ras are primarily used for breeding purposes in order to gen 0107 Stable expression of the construct is preferred. For erate the desired transgenic animal. Animals having a het example, ES cells that stably express a nucleotide sequence erozygous alteration are generated by breeding of chimeras. encoding a tagged ribosomal protein may be engineered. Male and female heterozygotes are typically bred to generate Rather than using vectors that contain viral origins of repli homozygous animals. cation, ES host cells can be transformed with DNA, e.g., a 0101. A homologously recombinant organism may plasmid, controlled by appropriate expression control ele include, but is not limited to, a recombinant animal. Such as a ments (e.g., promoter, enhancer, sequences, transcription ter non-human animal, preferably a mammal, more preferably a minators, polyadenylation sites, etc.), and a selectable mouse, in which an endogenous gene has been altered by marker. Following the introduction of the foreign DNA, engi homologous recombination between the endogenous gene neered ES cells may be allowed to grow for 1-2 days in an and an exogenous DNA molecule introduced into a cell of the enriched media, and then are Switched to a selective media. animal, e.g., an embryonic cell of the animal, prior to devel The selectable marker in the recombinant plasmid confers opment of the animal. resistance to the selection and allows cells to stably integrate US 2014/O 1961.76 A1 Jul. 10, 2014 34 the plasmid into their chromosomes and expanded into cell date, sex, ear tag number, Source of mother and father, geno lines. This method may advantageously be used to engineer type, dates mated and generation. ES cell lines that express a nucleotide sequence encoding a 0113 More specifically, founder animals heterozygous tagged ribosomal protein. for the nucleic acid of the invention may be mated to generate 0108. The selected ES cells are then injected into a blas a homozygous line as follows: A heterozygous founder ani tocyst of an animal (e.g., a mouse) to form aggregation chi mal, designated as the P generation, is mated with an off meras. See, e.g., Bradley, 1987, in Teratocarcinomas and spring from a mating with a non-transgenic mouse, desig Embryonic Stem Cells: A Practical Approach, Robertson, ed., nated as the F1 generation, transgenic mouse of the opposite IRL, Oxford, 113-52. Blastocysts are obtained from 4 to 6 sex which is heterozygous for the nucleic acid of the invention week old superovulated females. The ES cells are trypsinized, (backcross). Based on classical genetics, one fourth of the and the modified cells are injected into the blastocoel of the results of this backcross are homozygous for the nucleic acid blastocyst. After injection, the blastocysts are implanted into of the invention. In a preferred embodiment, transgenic the uterine horns of Suitable pseudopregnant female foster founders are individually backcrossed to an inbred or outbred animal. Alternatively, the ES cells may be incorporated into a strain of choice. Different founders should not be inter morula to form a morula aggregate which is then implanted crossed, since different expression patterns may result from into a Suitable pseudopregnant female foster animal. Females separate nucleic acid integration events. are then allowed to go to term and the resulting litters 0114. The determination of whether a transgenic mouse is screened for mutant cells having the construct. homozygous or heterozygous for the nucleic acid of the 0109 The chimeric animals are screened for the presence invention is as follows: of the modified gene. By providing for a different phenotype 0.115. An offspring of the above described breeding cross of the blastocyst and the ES cells, chimeric progeny can be is mated to a normal control non-transgenic animal. The readily detected. Males and female chimeras having the offspring of this second mating are analyzed for the presence modification are mated to produce homozygous progeny. of the nucleic acid of the invention by the methods described Only chimeras with transformed germline cells will generate below. If all offspring of this cross test positive for the nucleic homozygous progeny. If the gene alterations cause lethality at acid of the invention, the mouse in question is homozygous Some point in development, tissues or organs can be main for the nucleic acid of the invention. If, on the other hand, tained as allergenic or congenic grafts or transplants, or in in some of the offspring test positive for the nucleic acid of the vitro culture. invention and others test negative, the mouse in question is heterozygous for the nucleic acid of the invention. 0110 Progeny harboring homologously recombined or 0116. An alternative method for distinguishing between a integrated DNA in their germline cells can be used to breed transgenic animal which is heterozygous and one which is animals in which all cells of the animal contain the homolo homozygous for the nucleic acid of the invention is to mea gously recombined DNA by germline transmission of the sure the intensity with radioactive probes following Southern nucleic acid of the invention. blot analysis of the DNA of the animal. Animals homozygous 0111 Clones of the non-human transgenic animals for the nucleic acid of the invention would be expected to described herein can also be produced according to the meth produce higher intensity signals from probes specific for the ods described in Wilmut et al., 1997, Nature 385: 810-13 and nucleic acid of the invention than would heterozygote trans PCT Publication NOS. WO 97/O7668 and WO 97/O7669. genic animals. 0112. Once the transgenic mice are generated they may be 0117. In a preferred embodiment, the transgenic mice are bred and maintained using methods well known in the art. By so highly inbred to be genetically identical except for sexual way of example, the mice may be housed in an environmen differences. The homozygotes are tested using backcross and tally controlled facility maintained on a 10 hour dark: 14 hour intercross analysis to ensure homozygosity. Homozygous light cycle. Mice are mated when they are sexually mature (6 lines for each integration site in founders with multiple inte to 8 weeks old). In certain embodiments, the transgenic grations are also established. Brother/sister matings for 20 or founders or chimeras are mated to an unmodified animal (i.e., more generations define an inbred Strain. In another preferred an animal having no cells containing the nucleic acid of the embodiment, the transgenic lines are maintained as hemizy invention). In a preferred embodiment, the transgenic founder gotes. or chimera is mated to C57BL/6 mice (Jackson Laboratories). 0118. In an alternative embodiment, individual geneti In a specific embodiment where the nucleic acid of the inven cally altered mouse strains are also cryopreserved rather than tion is introduced into ES cells and a chimeric mouse is propagated. Methods for freezing embryos for maintenance generated, the chimera is mated to 129/Sv mice, which have of founder animals and transgenic lines are known in the art. the same genotype as the embryonic stem cells. Protocols for Gestational day 2.5 embryos are isolated and cryopreserved Successful creation and breeding of transgenic mice are in Straws and stored in liquid nitrogen. The first straw and the known in the art (Manipulating the Mouse Embryo. A Labo last straw are subsequently thawed and transferred to foster ratory Manual, 2nd edition. B. Hogan, Beddington, R., Cos females to demonstrate viability of the line with the assump tantini, F. and Lacy, E., eds. 1994. Cold Spring Harbor Labo tion that all embryos frozen between the first straw and the ratory Press: Plainview, N.Y.). Preferably, a founder male is last straw will behave similarly. If viable progeny are not mated with two females and a founder female is mated with observed a second embryo transfer will be performed. Meth one male. Preferably two females are rotated through a male's ods for reconstituting frozen embryos and bringing the cage every 1-2 weeks. Pregnant females are housed 1 or 2 per embryos to term are known in the art. cage. Preferably, pups are ear tagged, genotyped, and weaned 0119 The nucleic acid encoding the molecularly tagged at 21 days. Males and females are housed separately. Prefer ribosomal protein or mRNA binding protein may be intro ably log sheets are kept for any mated animal, by example and duced into the genome of a founder plant (or embryo that not limitation, information should include pedigree, birth gives rise to the founder plant) using methods well known in US 2014/O 1961.76 A1 Jul. 10, 2014

the art (Newell, 2000, Plant transformation technology. 0.125 Preferably, the tagged ribosomal protein encoding Developments and applications, Mol. Biotechnol. sequence is inserted into the characterizing gene sequences 16(1):5365; Kumar and Fladung, 2001, Controlling trans using 5' direct fusion without the use of an IRES, i.e., such gene integration in plants, Trends in Plant Science 6 (4): that the tagged ribosomal protein encoding sequence(s) is 155-159). The nucleic acid encoding the molecularly tagged fused directly inframe to the nucleotide sequence encoding at ribosomal protein or mRNA binding protein may be intro least the first codon of the characterizing gene coding duced into the genome of bacteria and yeast using methods sequence and even the first two, four, five, six, eight, ten or described in Ausubel et al., 1989, Current Protocols in twelve codons. In other embodiments, the tagged ribosomal Molecular Biology, Green Publishing Associates and Wiley protein encoding sequence is inserted into the 3' UTR of the Interscience, N.Y., Chapters 1 and 13, respectively). characterizing gene and has its own IRES. In yet another 0120 5.7.1. Homologous Recombination in Bacterial specific embodiment, the tagged ribosomal protein encoding Artificial Chromosomes sequence is inserted into the 5' UTR of the characterizing 0121 The invention provides transformed organisms, e.g., gene with an IRES controlling the expression of the tagged transgenic mice, that express a tagged ribosomal protein ribosomal protein encoding sequence. within a chosen cell type (see infra) In preferred embodi I0126. In a preferred aspect of the invention, the molecu ments, BAC-mediated recombination (Yang, et al., 1997, Nat. larly tagged ribosomal protein encoding sequence is intro Biotechnol. 15(9):859-865) is used to create the transformed duced into a BAC containing characterizing gene regulatory organism. Such expression is achieved by using the endog sequences by the methods of Heintz et al. WO 98/59060 and enous regulatory sequences of a particular gene, wherein the Heintz et al., WO 01/05962, both of which are incorporated expression of gene is a defining characteristic of the chosen herein by reference in their entireties. The molecularly tagged cell type (as also described in PCT/US02/04765, entitled sequence is introduced by performing selective homologous “Collections of Transgenic Animal Lines (Living Library) recombination on a particular nucleotide sequence contained by Serafini, published as WO 02/064749 on Aug. 22, 2002, in a recombination deficient host cell, i.e., a cell that cannot which is incorporated by reference herein in its entirety). In independently Support homologous recombination, e.g., another preferred embodiment, a collection of transgenic RecA-. The method preferably employs a recombination cas mice expressing tagged ribosomal proteins within a set of sette that contains a nucleic acid containing the molecular-tag chosen cell types is assembled, as described infra. coding sequence that selectively integrates into a specific site 0122 Vectors used in the methods of the invention prefer in the characterizing gene by virtue of sequences homologous ably can accommodate, and in certain embodiments com to the characterizing gene flanking the molecular-tag gene prise, large pieces of heterologous DNA Such as genomic coding sequences on the shuttle vector when the recombina sequences. Such vectors can contain an entire genomic locus, tion deficient host cell is induced to Support homologous or at least Sufficient sequence to confer endogenous regula recombination (for example by providing a functional RecA tory expression pattern and to insulate the expression of cod gene on the shuttle vector used to introduce the recombination ing sequences from the effect of regulatory sequences Sur cassette). rounding the site of integration of the nucleic acid of the I0127. In a preferred aspect, the particular nucleotide invention in the genome to mimic better wildtype expression. sequence that has been selected to undergo homologous When entire genomic loci or significant portions thereof are recombination is contained in an independent origin based used, few, if any, site-specific expression problems of a cloning vector introduced into or contained within the host nucleic acid of the invention are encountered, unlike inser cell, and neither the independent origin based cloning vector tions of nucleic acids into Smaller sequences. In a preferred alone, nor the independent origin based cloning vector in embodiment, the vector is a BAC containing genomic combination with the host cell, can independently Support sequences into which a selected sequence encoding a molecu homologous recombination (e.g., is RecA-). Preferably, the lar tag, e.g., an epitope tag, has been inserted by directed independent origin based cloning vector is a BAC or a bacte homologous recombination in bacteria, e.g., by the methods riophage-derived artificial (BBPAC) and the of Heintz WO98/59060; Heintzet al., WO 01/05962; Yanget host cell is a host bacterium, preferably E. coli. al., 1997, Nature Biotechnol. 15: 859-865; Yang et al., 1999, Nature Genetics 22:327-35; which are incorporated herein I0128. In another preferred aspect, sufficient characteriz ing gene sequences flank the tagged ribosomal protein encod by reference in their entireties. ing sequence to accomplish homologous recombination and 0123. Using such methods, a BAC can be modified target the insertion of the molecularly tagged ribosomal pro directly in a recombination-deficient E. coli host strain by tein coding sequences to a particular location in the charac homologous recombination. terizing gene. The tagged ribosomal protein coding sequence 0.124. In a preferred embodiment, homologous recombi and the homologous characterizing gene sequences are pref nation in bacteria is used for target-directed insertion of a erably present on a shuttle vector containing appropriate sequence encoding a molecularly tagged ribosomal protein selectable markers and the RecA gene, optionally with a into the genomic DNA encoding Sufficient regulatory temperature sensitive origin of replication (see Heintz et al. sequences (termed "characterizing gene sequences”) to pro WO 98/59060 and Heintz et al., WO 01/05962 such that the mote expression of the tagged ribosomal protein in the endog shuttle vector only replicates at the permissive temperature enous expression pattern of the characterizing gene, which and can be diluted out of the host cell population at the sequences have been inserted into the BAC. The BAC com non-permissive temperature. When the shuttle vector is intro prising the molecularly tagged ribosomal protein sequence duced into the host cell containing the BAC, the RecA gene is under the regulation of this characterizing gene sequence is expressed and recombination of the homologous shuttle vec then recovered and introduced into the genome of a potential tor and BAC sequences can occur, thus targeting the tagged founder organism for a line of transformed organisms. ribosomal protein encoding sequence (along with the shuttle US 2014/O 1961.76 A1 Jul. 10, 2014 36 vector sequences and flanking characterizing gene 0.135. In other embodiments, the nucleic acid of the inven sequences) to the characterizing gene sequences in the BAC. tion is inserted into another vector developed for the cloning 0129. The BACs can be selected and screened for integra of large segments of mammalian DNA, such as a cosmid or tion of the molecularly tagged ribosomal protein coding bacteriophage P1 (Sternberg et al., 1990, Proc. Natl. Acad. sequences into the selected site in the characterizing gene Sci. USA87: 103-07). The approximate maximum insert size sequences using methods well known in the art (e.g., methods is 30-35 kb for cosmids and 100 kb for bacteriophage P1. described in Section 5, infra, and in Heintz et al., WO 0.136. In another embodiment, the nucleic acid of the 98/59060 entitled “Methods of preforming (sic) homologous invention is inserted into a P-1 derived artificial chromosome recombination based modification of nucleic acids in recom (PAC) (Mejia et al., 1997, Retrofitting vectors for Escherichia bination deficient cells and use of the modified nucleic acid coli-based artificial chromosomes (PACs and BACs) with products thereof.” and Heintz et al., WO 01/05962, entitled markers for transfection studies, Genome Res. 7(2): 179-86). “Conditional homologous recombination of large genomic The maximum insert size is 300kb. vector inserts'). Optionally, the shuttle vector sequences not I0137 5.8. Methods of Screening for Expression of containing the molecularly tagged ribosomal protein coding Nucleic Acids of the Invention sequences (including the RecA gene and any selectable mark 0.138 Potential founder organisms for a line of trans ers) can be removed from the BAC by resolution as described formed organisms can be screened for expression of the in Section 5 and in Heintz et al. WO 98/59060 and Heintz et tagged ribosomal protein gene sequence in the population of al., WO 01/05962. cells characterized by expression of the endogenous charac 0130. If the shuttle vector contains a negative selectable terizing gene. marker, cells can be selected for loss of the shuttle vector 0.139 Transformed organisms that exhibit appropriate sequences. In an alternative embodiment, the functional expression (e.g., detectable expression having Substantially RecA gene is provided on a second vector and removed after the same expression pattern as the endogenous characterizing recombination, e.g., by dilution of the vector or by any gene in a corresponding non-transgenic organism oranatomi method known in the art. The exact method used to introduce cal region thereof, i.e., detectable expression in at least 80%, the tagged ribosomal protein encoding sequence and to 90% or, preferably, 95% of the cells shown to express the remove (or not) the RecA (or other appropriate recombination endogenous gene by in situ hybridization) are selected as enzyme) will depend upon the nature of the BAC library used lines of transformed organisms. (for example, the selectable markers present on the BAC 0140. In a preferred embodiment, immunohistochemistry vectors) and such modifications are within the skill in the art. using an antibody specific for the epitope tag is used to detect 0131 Once the BAC containing the characterizing gene expression of the tagged ribosomal fusion protein product. regulatory sequences and molecularly tagged ribosomal pro 0141 5.9. Expression of a Tagged Ribosomal Protein in a tein coding sequences in the desired configuration is identi Population of Cells fied, it can be isolated from the host E. coli cells using routine 0142. The nucleic acid of the invention containing the methods and used to make transformed organisms as nucleotide sequence encoding the tagged ribosomal protein described infra). can be expressed in the cell type of interest using methods 0132 BACs to be used in the methods of the invention are well known in the art for recombinant gene expression. The selected and/or screened using the methods described Supra. choice of which method to use to express a DNA sequence 0.133 Alternatively, the BAC can also be engineered or encoding a tagged ribosomal protein in a chosen population modified by “E-T cloning,” as described by Muyrers et al. of cells depends upon the population. (1999, Nucleic Acids Res. 27(6): 1555-57, incorporated 0143. In certain embodiments, the chosen population of herein by reference in its entirety). Using these methods, cells is a particular population of cells in culture that have specific DNA may be engineered into a BAC independently been transfected with the construct encoding the tagged ribo of the presence of suitable restriction sites. This method is Somal protein, the expression construct is chosen to allow based on homologous recombination mediated by the recE efficient and high-level expression in the type of cells present and recT proteins (“ET-cloning) (Zhang et al., 1998, Nat. in culture, with the mRNA of the transfected population being Genet. 2002): 123-28; incorporated herein by reference in its isolated according to the methods described herein. entirety). Homologous recombination can be performed 0144. This mode of the invention would be particularly between a PCR fragment flanked by short homology arms and useful if one wanted to study global gene expression changes an endogenous intact recipient Such as a BAC. Using this in cultured cells in response to the expression of a particular method, homologous recombination is not limited by the gene product, co-expressed with a tagged ribosomal Subunit disposition of restriction endonuclease cleavage sites or the to allow isolation of mRNA from co-expressing cells. size of the target DNA. A BAC can be modified in its host 0145. In another embodiment, the expression construct strain using a plasmid, e.g., pBAD-CBY, in which recE and can be contained within a viral vector or virus, which is recThave been replaced by their respective functional coun introduced into the desired host cell as described above. This terparts of phage lambda (Muyrers et al., 1999, Nucleic Acids embodiment permits study of mRNA populations from trans Res. 27(6): 1555-57). Preferably, a BAC is modified by duced or infected cells, in vitro or in vivo. recombination with a PCR product containing homology 0146 In another embodiment, expression of the tagged arms ranging from 27-60 bp. In a specific embodiment, ribosomal protein is driven in populations of cells by the homology arms are 50 bp in length. characterizing gene regulatory element. 0134. In another embodiment, a nucleic acid of the inven 0147 In another embodiment, the gene sequences encod tion is inserted into a yeast artificial chromosome (YAC) ing the characterizing gene regulatory element and the tagged (Burke et al., 1987 Science 236: 806-12; and Peterson et al., ribosomal protein is introduced by homologous recombina 1997, Trends Genet. 13: 61). tion. US 2014/O 1961.76 A1 Jul. 10, 2014 37

0148. In another embodiment, homologous recombina tion into its genome using methods routine in the art, for tion is used to introduce only the epitope tag gene coding example, the methods described in Section 5.7., supra. A Sequences. construct is a recombinant nucleic acid, generally recombi 0149 Methods for selecting for cells containing and nant DNA, generated for the purpose of the expression of a expressing the nucleotide sequences encoding the fusion pro specific nucleotide sequence(s), or is to be used in the con teins of the invention are well known in the art. For example, struction of other recombinant nucleotide sequences. in eukaryotic cells, the nucleotide sequence encoding the 0.155. A transgenic construct of the invention includes at fusion protein is associated with (for example, present on the least the coding region for a peptide tag fused to the coding same vectoras) a selectable marker Such as dhfr. Cells having region for a ribosomal protein, operably linked to all or a the dhfr selectable marker are resistant to the drug methotr portion of the regulatory sequences, e.g. a promoter and/or exate. Increasing levels of methotrexate can also lead to enhancer, of the characterizing gene. The transgenic con amplification of the selectable marker (and, concomitantly, struct optionally includes enhancer sequences and coding and the sequence encoding the fusion protein of the invention). other non-coding sequences (including intron and 5' and 3' Once the selectable marker sequences have integrated into the untranslated sequences) from the characterizing gene Such host cell chromosome, the selectable marker sequences (and that the tagged ribosomal fusion protein gene is expressed in the sequences encoding the fusion protein of the invention) the same Subset of cells as the characterizing gene. The tagged will be maintained by the host cells even in the absence of ribosomal fusion protein gene coding sequences and the char selection (e.g., in the absence of methotrexate when the acterizing gene regulatory sequences are operably linked, selectable marker is dhfr). meaning that they are connected in Such away so as to permit 0150 5.10. Nucleic Acid Constructs expression of the tagged ribosomal protein gene when the 0151. The invention provides vectors and lines of organ appropriate molecules (e.g., transcriptional activator pro isms that contain a nucleic acid construct, e.g., a transgene, teins) are bound to the characterizing gene regulatory that comprises the coding sequence for a peptide tag-riboso sequences. Preferably the linkage is covalent, most preferably mal fusion protein or peptide tag-mRNA binding protein by a nucleotide bond. The promoter region is of sufficient fusion protein under the control of a regulatory sequences for length to promote transcription, as described in Alberts et al. a “characterizing gene.” The regulatory sequence is e.g., an (1989) in Molecular Biology of the Cell, 2d Ed. (Garland endogenous promoter of a characterizing gene. This charac Publishing, Inc.). terizing gene is endogenous to a host cell or host organism (or 0156. In one aspect of the invention, the regulatory is an ortholog of an endogenous gene) and is expressed in a sequence is the promoter of a characterizing gene. Other particular select population of cells of the organism. Expres promoters that direct tissue-specific expression of the coding sion of the nucleic acid construct is such that the nucleic acid sequences to which they are operably linked are also contem construct has substantially the same expression pattern as the plated in the invention. In specific embodiments, a promoter endogenous characterizing gene. from one gene and other regulatory sequences (such as 0152. A transgene is a nucleotide sequence that has been enhancers) from other genes are combined to achieve a par or is designed to be incorporated into a cell, particularly a ticular temporal and spatial expression pattern of the tagged mammalian cell, that in turn becomes or is incorporated into ribosomal protein gene. a living animal Such that the nucleic acid containing the 0157 Methods that are well known to those skilled in the nucleotide sequence is expressed (i.e., the mammalian cell is art can be used to construct vectors containing tagged ribo transformed with the transgene). Somal protein gene coding sequences operatively associated 0153. The characterizing gene sequence is preferably with the appropriate transcriptional and translational control endogenous to the transformed organism, or is an ortholog of signals of the characterizing gene. These methods include, for an endogenous gene, e.g., the human ortholog of a gene example, in vitro recombinant DNA techniques and in vivo endogenous to the animal to be made transgenic. A nucleic genetic recombination. See, for example, the techniques acid construct comprising the tagged ribosomal protein and described in Sambrook et al., 2001, Molecular Cloning. A optionally, the characterizing gene sequence may be present Laboratory Manual. Third Edition, Cold Spring Harbor as an extrachromosomal element in some or all of the cells of Laboratory Press, N.Y.; and Ausubel et al., 1989, Current a transformed organisms such as a transgenic animal or, pref Protocols in Molecular Biology, Green Publishing Associates erably, stably integrated into some or all of the cells, more and Wiley Interscience, N.Y., both of which are hereby incor preferably into the germ line DNA of the animal (i.e., such porated by reference in their entireties. that the nucleic acid construct is transmitted to all or some of 0158. The tagged ribosomal protein gene coding the animals progeny), thereby directing expression of an sequences may be incorporated into some or all of the char encoded gene product (i.e., the tagged ribosomal protein gene acterizing gene sequences such that the tagged ribosomal product) in one or more cell types or tissues of the trans protein gene is expressed in Substantially same expression formed organism. Unless otherwise indicated, it will be pattern as the endogenous characterizing gene in the trans assumed that a transformed organism, e.g., a transgenic ani formed organism, or at least in an anatomical region or tissue mal, comprises stable changes to the chromosomes of germ of the organisms (by way of example, in the brain, spinal line cells. In a preferred embodiment, the nucleic acid con chord, heart, skin, bones, head, limbs, blood, muscle, periph struct is present in the genome at a site other than where the eral nervous system, etc. of an animal) containing the popu endogenous characterizing gene is located. In other embodi lation of cells to be marked by expression of the tagged ments, the nucleic acid construct is incorporated into the ribosomal protein gene coding sequences. By “substantially genome of the organism at the site of the endogenous char the same expression pattern' is meant that the tagged riboso acterizing gene, for example, by homologous recombination. mal protein gene coding sequences are expressed in at least 0154) In certain embodiments, transformed organisms are 80%, 85%, 90%, 95%, and preferably 100% of the cells created by introducing a nucleic acid construct of the inven shown to express the endogenous characterizing gene by in US 2014/O 1961.76 A1 Jul. 10, 2014 situ hybridization. Because detection of the tagged ribosomal embodiment, a tagged ribosomal protein gene is inserted into protein gene expression product may be more sensitive than a separate cistron in the 5' region of the characterizing gene in situ hybridization detection of the endogenous character genomic sequence and has an independent IRES sequence. izing gene messenger RNA, more cells may be detected to 0164. In certain embodiments, an IRES is operably linked express the tagged ribosomal protein gene product in the to the tagged ribosomal protein gene coding sequence to transformed organism than are detected to express the endog direct translation of the tagged ribosomal protein gene. The enous characterizing gene by in situ hybridization or any IRES permits the creation of polycistronic mRNAs from other method known in the art for in situ detection of gene which several proteins can be synthesized under the control of expression. an endogenous transcriptional regulatory sequence. Such a 0159 For example, the nucleotide sequences encoding the construct is advantageous because it allows marker proteins tagged ribosomal protein gene protein product may replace to be produced in the same cells that express the endogenous the characterizing gene coding sequences in a genomic clone gene (Heintz, 2000, Hum. Mol. Genet. 9(6): 937-43; Heintzet of the characterizing gene, leaving the characterizing gene al., WO 98/59060; Heintz et al., WO 01/05962; which are regulatory non-coding sequences. In other embodiments, the incorporated herein by reference in their entireties). tagged ribosomal protein gene coding sequences (either 0.165 Shuttle vectors containing an IRES, such as the genomic or cDNA sequences) replace all or a portion of the pLD53 shuttle vector (see Heintz et al., WO 01/05962), may characterizing gene coding sequence and the nucleotide be used to insert the tagged ribosomal protein gene sequence sequence only contains the upstream and downstream char into the characterizing gene. The IRES in the plD53 shuttle acterizing gene regulatory sequences. vector is derived from EMCV (encephalomyocarditis virus) 0160. In a preferred embodiment, the tagged ribosomal (Jackson et al., 1990, Trends Biochem Sci. 15(12):477-83: protein gene coding sequences are inserted into or replace and Jang et al., 1988, J. Virol. 62(8):2636-43, both of which transcribed coding or non-coding sequences of the genomic are hereby incorporated by reference). The common characterizing gene sequences, for example, into or replacing sequence between the first and second IRES sites in the a region of an exon or of the 3'UTR of the characterizing gene shuttle vector is shown below. This common sequence also genomic sequence. Preferably, the tagged ribosomal protein matches plRES (Clontech) from 1158-1710. gene coding sequences are not inserted into or replace regu latory sequences of the genomic characterizing gene sequences. Preferably, the tagged ribosomal protein gene (SEQ ID NO : 6) coding sequences are also not inserted into or replace char TAACGTTACTGGCCGAAGCCGCTTGGAATAAGGCCGGTGTGCGTTTGTC acterizing gene intron sequences. TATATGTTATTTTCCACCATATTGCCGTCTTTTGGCAATGTGAGGGCCC 0161 In a preferred embodiment, the tagged ribosomal protein gene coding sequence is inserted into or replaces a GGAAACCTGGCCCTGTCTTCTTGACGAGCATTCCTAGGGGTCTTTCCCC portion of the 3' untranslated region (UTR) of the character izing gene genomic sequence. In another preferred embodi TCTCGCCAAAGGAATGCAAGGTCTGTTGAATGTCGTGAAGGAAGCAGTT ment, the coding sequence of the characterizing gene is CCTCTGGAAGCTTCTTGAAGACAAACAACGTCTGTAGCGACCCTTTGCA mutated or disrupted to abolish characterizing gene expres sion from the nucleic acid construct without affecting the GGCAGCGGAACCCCCCACCTGGCGACAGGTGCCTCTGCGGCCAAAAGCC expression of the tagged ribosomal protein gene. In certain ACGTGTATAAGATACACCTGCAAAGGCGGCACAACCCCAGTGCCACGTT embodiments, the tagged ribosomal protein gene coding sequence has its own internal ribosome entry site (IRES). For GTGAGTTGGATAGTTGTGGAAAGAGTCAAATGGCTCTCCTAAGCGTATT descriptions of IRESes, see, e.g., Jackson et al., 1990, Trends CAACAAGGGGCTGAAGGATGCCCAGAAGGTACTCCATTGTATGGGATCT Biochem Sci. 15(12):477-83;Janget al., 1988, J. Virol. 62(8): 2636-43; Jang et al., 1990, Enzyme 44(1-4):292-309; and GATCTGGGGCCTCGGTGCACATGCTTTACATGTGTTTAGTCGAGGTTAA Martinez-Salas, 1999, Curr. Opin. Biotechnol. 10(5):458-64. AAAAACGTCTAGGCCCCCCGAACCACGGGGACGTGGTTTTCCTTTGAAA 0162. In another embodiment, the tagged ribosomal pro tein gene is inserted at the 3' end of the characterizing gene AACACCATGATA coding sequence. In a specific embodiment, the tagged ribo Somal protein coding sequences are introduced at the 3' end of (0166 In a specific embodiment, the EMCV IRES is used the characterizing gene coding sequence Such that the nucle to direct independent translation of the tagged ribosomal otide sequence encodes a fusion of the characterizing gene protein gene coding sequences (Gorski and Jones, 1999, and the tagged ribosomal protein gene sequences. Nucleic Acids Research 27(9):2059-61). 0163 Preferably, the tagged ribosomal protein gene cod (0167. In another embodiment, more than one IRES site is ing sequences are inserted using 5' direct fusion wherein the present in a nucleic acid of the invention to direct translation tagged ribosomal protein gene coding sequences are inserted of more than one coding sequence. However, in this case, in-frame adjacent to the initial ATG sequence (or adjacent the each IRES sequence must be a different sequence. nucleotide sequence encoding the first two, three, four, five, 0.168. In certain embodiments where a tagged ribosomal six, seven or eight amino acids of the characterizing gene protein gene is expressed conditionally, the tagged ribosomal protein product) of the characterizing gene. So that translation protein gene coding sequence is embedded in the genomic of the inserted sequence produces a fusion protein of the first sequence of the characterizing gene and is inactive unless methionine (or first few amino acids) derived from the char acted on by a transactivator or recombinase, whereby expres acterizing gene sequence fused to the tagged ribosomal pro sion of the tagged ribosomal protein gene can then be driven tein gene protein. In this embodiment, the characterizing gene by the characterizing gene regulatory sequences. coding sequence 3' of the tagged ribosomal protein gene 0169. In other embodiments the tagged ribosomal protein coding sequences are not expressed. In yet another specific gene is expressed conditionally, through the activity of a gene US 2014/O 1961.76 A1 Jul. 10, 2014 39 that is an activator or Suppressor of gene expression. In this 0.174 Preferably, the nucleic acid of the invention com case, the gene encodes a transactivator, e.g., tetR, or a recom prises all or a significant portion of the genomic characteriz binase, e.g., FLP, whose expression is regulated by the char ing gene, preferably, at least all or a significant portion of the acterizing gene regulatory sequences. The tagged ribosomal 5' regulatory sequences of the characterizing gene, most pref protein gene is linked to a conditional element, e.g., the tet erably, Sufficient sequence 5' of the characterizing gene cod promoter, or is flanked by recombinase sites, e.g., FRT sites, ing sequence to direct expression of the tagged ribosomal and may be located any where within the genome. In Such a protein gene coding sequences in the same expression pattern system, expression of the transactivator gene, as regulated by (temporal and/or spatial) as the endogenous counterpart of the characterizing gene regulatory sequences, activates the the characterizing gene. In certain embodiments, the nucleic expression of the tagged ribosomal protein gene. acid of the invention comprises one exon, two exons, all but 0170 In certain embodiments, exogenous translational one exon, or all but two exons, of the characterizing gene. control signals, including, for example, the ATG initiation 0.175 Nucleic acids comprising the characterizing gene codon, can be provided by the characterizing gene or some sequences and tagged ribosomal protein gene coding other heterologous gene. The initiation codon must be in sequences can be obtained from any available source. In most phase with the reading frame of the desired coding sequence cases, all or a portion of the characterizing gene sequences of the tagged ribosomal protein gene to ensure translation of and/or the tagged ribosomal protein gene coding sequences the entire insert. These exogenous translational control sig are known, for example, in publicly available databases Such nals and initiation codons can be of a variety of origins, both as GenBank, UniGene and the Mouse Genome Informatic natural and synthetic. The efficiency of expression may be (MGI) Database to name just a few, or in private subscription enhanced by the inclusion of appropriate transcription databases. With a portion of the sequence in hand, hybridiza enhancer elements, transcription terminators, etc. (see Bittner tion probes (for filter hybridization or PCR amplification) can et al., 1987, Methods in Enzymol. 153: 516-44). be designed using highly routine methods in the art to identify 0171 The construct can also comprise one or more select clones containing the appropriate sequences (preferred meth able markers that enable identification and/or selection of ods for identifying appropriate BACs are discussed in Section recombinant vectors. The selectable marker may be the 5.7.1, supra) for example in a library or other source of tagged ribosomal protein gene product itself oran additional nucleic acid. If the sequence of the gene of interest from one selectable marker not necessarily tied to the expression of the species is known and the counterpart gene from another spe characterizing gene. cies is desired, it is routine in the art to design probes based 0172. In a specific embodiment, a nucleic acid of the upon the known sequence. The probes hybridize to nucleic invention is expressed conditionally, using any type of induc acids from the species from which the sequence is desired, for ible or repressible system available for conditional expression example, hybridization to nucleic acids from genomic or of genes known in the art, e.g., a system inducible or repress DNA libraries from the species of interest. ible by tetracycline (“tet system); interferon; estrogen, 0176 By way of example and not limitation, genomic ecdysone, or other steroid inducible system; Lac operator, clones can be identified by probing a genomic DNA library progesterone antagonist RU486, or rapamycin (FK506). For under appropriate hybridization conditions, e.g., high Strin example, a conditionally expressible nucleic acid of the gency conditions, low stringency conditions or moderate invention can be created in which the coding region for the stringency conditions, depending on the relatedness of the tagged ribosomal protein gene (and, optionally also the char probe to the genomic DNA being probed. For example, if the acterizing gene) is operably linked to a genetic Switch, Such probe and the genomic DNA are from the same species, then that expression of the tagged ribosomal protein gene can be high Stringency hybridization conditions may be used; how further regulated. One example of this type of switch is a ever, if the probe and the genomic DNA are from different tetracycline-based Switch (see infra). In a specific embodi species, then low stringency hybridization conditions may be ment, the tagged ribosomal protein gene product is the con used. High, low and moderate stringency conditions are all ditional enhancer or Suppressor which, upon expression, well known in the art. enhances or Suppresses expression of a selectable or detect 0177 Procedures for low stringency hybridization are as able marker present either in the nucleic acid of the invention follows (see also Shilo and Weinberg, 1981, Proc. Natl. Acad. or elsewhere in the genome of the transformed organism. Sci. USA 78:6789-6792): Filters containing DNA are pre 0173 A conditionally expressible nucleic acid of the treated for 6 hours at 40° C. in a solution containing 35% invention can be site-specifically inserted into an untranslated formamide, 5xSSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, region (UTR) of genomic DNA of the characterizing gene, 0.1% PVP, 0.1% Ficoll, 1% BSA, and 500 g/ml denatured e.g., the 3' UTR or the 5' region, so that expression of the salmon sperm DNA. Hybridizations are carried out in the nucleic acid via the conditional expression system is induced same solution with the following modifications: 0.02% PVP. or abolished by administration of the inducing or repressing 0.02% Ficoll, 0.2% BSA, 100 g/ml salmon sperm DNA, Substance, e.g., administration of tetracycline or doxycycline, 10% (wt/vol) dextran sulfate, and 5-20x10 cpm P-labeled ecdysone, estrogen, etc., without interfering with the normal probe is used. Filters are incubated in hybridization mixture profile of gene expression (see, e.g., Bond et al., 2000, Sci for 18-20hours at 40°C., and thenwashed for 1.5 hours at 55° ence 289; 1942-46; incorporated herein by reference in its C. in a solution containing 2xSSC, 25 mM Tris-HCl (pH 7.4), entirety). In the case of a binary system, the detectable or 5 mM EDTA, and 0.1% SDS. The wash solution is replaced selectable marker operably linked to the conditional expres with fresh solution and incubated an additional 1.5 hours at sion elements is present in the nucleic acid of the invention, 60° C. Filters are blotted dry and exposed for autoradiogra but outside the characterizing gene coding sequences and not phy. If necessary, filters are washed for a third time at 65-68° operably linked to characterizing gene regulatory sequences C. and reexposed to film. or, alternatively, on another site in the genome of the trans 0.178 Procedures for high stringency hybridizations areas formed organism. follows: Prehybridization offilters containing DNA is carried US 2014/O 1961.76 A1 Jul. 10, 2014 40 out for 8 hours to overnight at 65° C. in buffer composed of sequence start site. Such information can then be used to map 6xSSC, 50 mM Tris-HCl (pH 7.5), 1 mM EDTA, 0.02% PVP. the characterizing gene coding sequence start site within the 0.02% Ficoll, 0.02% BSA, and 500 ug/ml denatured salmon clone. Alternatively, the tagged ribosomal protein gene sperm DNA. Filters are hybridized for 48 hours at 65° C. in sequences (or any other heterologous sequences) can be tar prehybridization mixture containing 100 ug/ml denatured geted to the 5' end of the characterizing gene coding sequence salmon sperm DNA and 5-20 35x10 cpm of P-labeled by directed homologous recombination (for example as probe. Washing of filters is done at 37° C. for 1 hour in a described in Section 5.7) in such a way that a restriction site solution containing 2xSSC, 0.01% PVP 0.01% Ficoll, and unique or at least rare in the characterizing gene clone 0.01% BSA. This is followed by a washin 0.1xSSC at 50° C. sequence is introduced. The position of the integrated tagged for 45 minutes before autoradiography. ribosomal protein gene coding sequences (and, thus, the 5' 0179 Moderate stringency conditions for hybridization end of the characterizing gene coding sequence) can be are as follows: Filters containing DNA are pretreated for 6 mapped by restriction endonuclease digestion and mapping. hours at 55°C. in a solution containing 6xSSC, 5xDenhardt's The clone may also be mapped using internally generated solution, 0.5% SDS, and 100 ug/ml denatured salmon sperm fingerprint data and/or by an alternative mapping protocol DNA. Hybridizations are carried out in the same solution and based upon the presence of restriction sites and the T7 and 5-20x10 CPM P-labeled probe is used. Filters are incu SP6 promoters in the BAC vector, as described in Section bated in the hybridization mixture for 18-20 hours at 55° C. 5.7.1, supra. and then washed twice for 30 minutes at 60° C. in a solution 0182. In certain embodiments, the tagged ribosomal pro containing 1 xSSC and 0.1% SDS. tein gene coding sequences are to be inserted in a site in the 0180. With respect to the characterizing gene, all or a characterizing gene sequences other than the 5' start site of the portion of the genomic sequence is preferred, particularly, the characterizing gene coding sequences, for example, in the 3' sequences 5' of the coding sequence that contain the regula most translated or untranslated regions. In these embodi tory sequences. A preferred method for identifying BACs ments, the clones containing the characterizing gene should containing appropriate and Sufficient characterizing gene be mapped to insure the clone contains the site for insertion in sequences to direct the expression of the tagged ribosomal as well as Sufficient sequence 5' of the characterizing gene protein gene coding sequences in Substantially the same coding sequences library to contain the regulatory sequences expression pattern as the endogenous characterizing gene is necessary to direct expression of the tagged ribosomal protein described in Section 5.7.1, supra. gene sequences in the same expression pattern as the endog 0181 Briefly, the characterizing gene genomic sequences enous characterizing gene. are preferably in a vector that can accommodate significant 0183) Once such an appropriate vector containing the lengths of sequence (for example, 10 kb's of sequence). Such characterizing gene sequences, the tagged ribosomal protein as cosmids, YACs, and, preferably, BACs, and encompass at gene can be incorporated into the characterizing gene least 50, 70,80, 100,120, 150,200,250 or 300kb of sequence sequence by any method known in the art for manipulating that comprises all or a portion of the characterizing gene DNA. In a preferred embodiment, homologous recombina sequence. The larger the vector insert, the more likely it is to tion in bacteria is used for target-directed insertion of the identify a vector that contains the characterizing gene tagged ribosomal protein gene sequence into the genomic sequences of interest. Vectors identified as containing char DNA encoding the characterizing gene and Sufficient regula acterizing gene sequences can then be screened for those that tory sequences to promote expression of the characterizing are most likely to contain Sufficient regulatory sequences gene in its endogenous expression pattern, which character from the characterizing gene to direct expression of the izing gene sequences have been inserted into a BAC (see tagged ribosomal protein gene coding sequences in Substan Section 5.7.1, Supra). The BAC comprising the tagged ribo tially the same patternas the endogenous characterizing gene. Somal protein gene and characterizing gene sequences is then In general, it is preferred to have a vector containing the entire introduced into the genome of a potential founder organism genomic sequence for the characterizing gene. However, in for generating a line of transformed organisms, using meth certain cases, the entire genomic sequence cannot be accom ods well known in the art, e.g., those methods described in modated by a single vector or Such a clone is not available. In Section 5.7. Supra. Such transformed organisms are then these instances (or when it is not known whether the clone screened for expression of the tagged ribosomal protein gene contains the entire genomic sequence), preferably the vector coding sequences that mimics the expression of the endog contains the characterizing gene sequence with the start, i.e., enous characterizing gene. Several different constructs con the most 5' end, of the coding sequence in the approximate taining nucleic acids of the invention may be introduced into middle of the vector insert containing the genomic sequences several potential founder organisms and the resulting trans and/or has at least 20 kb, 30 kb, 40 kb, 50 kb, 60 kb, 80 kb or formed organisms are then screened for the best, (e.g., highest 100 kb of genomic sequence on either side of the start of the level) and most accurate (best mimicking expression of the characterizing gene coding sequence. This can be determined endogenous characterizing gene) expression of the tagged by any method known in the art, for example, but not by way ribosomal protein gene coding sequences. of limitation, by sequencing, restriction mapping, PCR 0.184 The nucleic acid construct can be used to transform amplification assays, etc. In certain cases, the clones used a host or recipient cell or organism using well known meth may be from a library that has been characterized (e.g., by ods, e.g., those described in Section 5.6, Supra. Transforma sequencing and/or restriction mapping) and the clones iden tion can be either a permanent or transient genetic change, tified can be analyzed, for example, by restriction enzyme preferably a permanent genetic change, induced in a cell digestion and compared to database information available for following incorporation of new DNA (i.e., DNA exogenous the library. In this way, the clone of interest can be identified to the cell). Where the cell is a mammalian cell, a permanent and used to query publicly available databases for existing genetic change is generally achieved by introduction of the contigs correlated with the characterizing gene coding DNA into the genome of the cell. In one aspect of the inven US 2014/O 1961.76 A1 Jul. 10, 2014

tion, a vector is used for stable integration of the nucleic acid 0193 In a preferred embodiment, the control elements of construct into the genome of the cell. Vectors include plas the tetracycline-resistance operon of E. coli is used as an mids, retroviruses and other animal viruses, BACs,YACs, and inducible or repressible transactivator or transcriptional regu the like. lation system (“tet system’’) for conditional expression of the 0185 5.11. Expression Using a Binary System detectable or selectable marker. A tetracycline-controlled 0186. Since the level of expression of the tagged riboso transactivator can require either the presence or absence of mal protein within a cell may be important in the efficiency of the antibiotic tetracycline, or one of its derivatives, e.g., doxy the isolation procedure, in certain embodiments of the inven cycline (dox), for binding to the tet operator of the tet system, tion, a binary system can be used, in which the endogenous and thus for the activation of the tet system promoter (Ptet). promoter drives expression of a protein that then activates a Such an inducible or repressible tet system is preferably used second expression construct. This second expression con in a mammalian cell. structuses a strong promoter to drive expression of the tagged 0194 In a specific embodiment, a tetracycline-repressed ribosomal protein at higher levels than is possible using the regulatable system (TrRS) is used (Agha-Mohammadi and endogenous promoter itself. Lotze, 2000, J. Clin. Invest. 105(9): 1177-83; incorporated herein by reference in its entirety). This system exploits the 0187. In certain embodiments, a particular population specificity of the tet repressor (tetR) for the tet operator specific gene drives expression of a molecular Switch (e.g., a sequence (tetO), the sensitivity of tetR to tetracycline, and the recombinase, a transactivator) in a population-specific man activity of the potent herpes simplex virus transactivator ner. This Switch then activates high-level expression though a (VP16) in eukaryotic cells. The TrRS uses a conditionally second regulatory element regulating expression of the active chimeric tetracycline-repressed transactivator (tTA) tagged ribosomal protein. created by fusing the COOH-terminal 127 amino acids of 0188 For example, the molecularly tagged ribosomal pro vision protein 16 (VP16) to the COOH terminus of the tetR tein coding sequence may be expressed conditionally, protein (which may be the tagged ribosomal protein gene). In through the activity of a molecular Switch gene which is an the absence of tetracycline, the tetR moiety oftTA binds with activator or Suppressor of gene expression. In this case, the high affinity and specificity to a tetracycline-regulated pro second gene encodes a transactivator, e.g., tetR, a recombi moter (tRP), a regulatory region comprising seven repeats of nase, or FLP, whose expression is regulated by the character tetO placed upstream of a minimal human cytomegalovirus izing gene regulatory sequences. The gene encoding the (CMV) promoter or B-actin promoter B-actin is preferable for molecularly tagged ribosomal protein is linked to a condi neural expression). Once bound to the tRP, the VP16 moiety tional element, e.g., the tet promoter, or is flanked by recom oftTA transactivates the detectable or selectable marker gene binase sites, e.g., FRT sites, and may be located any where by promoting assembly of a transcriptional initiation com within the genome. In Such a system, expression of the plex. However, binding of tetracycline to tetR leads to a molecular Switch gene, as regulated by the characterizing conformational change intetRaccompanied with loss of tetR gene regulatory sequences, activates the expression of the affinity for tetO, allowing expression of the tagged ribosomal molecular tag. protein gene to be silenced by administering tetracycline. 0189 5.12. Conditional Transcriptional Regulation Sys Activity can be regulated over a range of orders of magnitude tems in response to tetracycline. 0190. In certain embodiments, the tagged ribosomal pro 0.195. In another specific embodiment, a tetracycline-in tein gene can be expressed conditionally by operably linking duced regulatable system is used to regulate expression of a at least the coding region for the tagged ribosomal protein detectable or selectable marker, e.g., the tetracycline transac gene to all or a portion of the regulatory sequences from the tivator (tTA) element of Gossen and Bujard (1992, Proc. Natl. characterizing gene, and then operably linking the tagged Acad. Sci. USA 89: 5547-51; incorporated herein by refer ribosomal protein gene coding sequences and characterizing ence in its entirety). gene sequences to an inducible or repressible transcriptional 0196. In another specific embodiment, the improved tTA regulation system. system of Shockett et al. (1995, Proc. Natl. Acad. Sci. USA 0191 Transactivators in these inducible or repressible 92: 6522-26, incorporated herein by reference in its entirety) transcriptional regulation systems are designed to interact is used to drive expression of the marker. This improved tTA specifically with sequences engineered into the vector. Such system places the tTA gene under control of the inducible systems include those regulated by tetracycline (“tet sys promoter to which tTA binds, making expression oftTA itself tems'), interferon, estrogen, ecdysone, Lac operator, proges inducible and autoregulatory. terone antagonist RU486, and rapamycin (FK506) with tet 0.197 In another embodiment, a reverse tetracycline-con systems being particularly preferred (see, e.g., Gingrich and trolled transactivator, e.g., rtTA2S-M2, is used. rtTA2S-M2 Roder, 1998, Annu Rev. Neurosci. 21:377-405; incorporated transactivator has reduced basal activity in the absence doxy herein by reference in its entirety). These drugs or hormones cycline, increased stability in eukaryotic cells, and increased (or their analogs) act on modular transactivators composed of doxycycline sensitivity (Urlinger et al., 2000, Proc. Natl. natural or mutant ligand binding domains and intrinsic or Acad. Sci. USA 97(14): 7963-68; incorporated herein by extrinsic DNA binding and transcriptional activation reference in its entirety). domains. In certain embodiments, expression of the detect 0.198. In another embodiment, the tet-repressible system able or selectable marker can be regulated by varying the described by Wells et al. (1999, Transgenic Res. 8(5):371-81: concentration of the drug or hormone in medium in vitro or in incorporated herein by reference in its entirety) is used. In one the diet of the transformed organism in vivo. aspect of the embodiment, a single plasmid Tet-repressible 0.192 The inducible or repressible genetic system can system is used. Preferably, a “mammalianized TetR gene, restrict the expression of the detectable or selectable marker rather than a wild-type TetR gene (tetR) is used (Wells et al., either temporally, spatially, or both temporally and spatially. 1999, Transgenic Res. 8(5): 371-81). US 2014/O 1961.76 A1 Jul. 10, 2014 42

0199. In other embodiments, expression of the tagged molecularly tagged ribosomal protein coding sequence by ribosomal protein gene is regulated by using a recombinase ribosomes in the population of cells characterized by expres system that is used to turn on or off tagged ribosomal protein sion of the endogenous characterizing gene. gene expression by recombination in the appropriate region 0206 Transformed organisms that exhibit appropriate of the genome in which the marker gene is inserted. Such a expression (e.g., detectable expression having Substantially recombinase system, in which a gene that encodes a recom the same expression pattern as the endogenous characterizing binase can be used to turn on or off expression of the tagged gene in a corresponding non-transformed organism or ana ribosomal protein gene (for review of temporal genetic tomical region thereof, i.e., detectable expression in at least Switches and “tissue scissors' using recombinases, see Hen 80%, 90% or, preferably, 95% of the cells shown to express nighausen and Furth, 1999, Nature Biotechnol. 17: 1062-63). the endogenous gene by in situ hybridization) are selected as Exclusive recombination in a selected cell type may be medi lines of transformed organisms. ated by use of a site-specific recombinase such as Cre, FLP 0207. In a preferred embodiment, immunohistochemistry wild type (wt), FLP-L or FLPe. Recombination may be using an antibody specific for the molecular tag or a marker effected by any art-known method, e.g., the method of Doet activated or repressed thereby is used to detect expression of schman et al. (1987, Nature 330: 576-78; incorporated herein by reference in its entirety); the method of Thomas et al., the molecular tag. (1986, Cell 44:419-28; incorporated herein by reference in its (0208 5.14. Profiling of mRNA Species entirety); the Cre-loxP recombination system (Sternberg and 0209. Once isolated, the mRNA bound by the tagged ribo Hamilton, 1981, J. Mol. Biol. 150: 467-86; Lakso et al., 1992, somal proteins or mRNA binding proteins of the invention Proc. Natl. Acad. Sci. USA 89: 6232-36; which are incorpo can be analyzed by any method known in the art. In one aspect rated herein by reference in their entireties); the FLP recom of the invention, the gene expression profile of cells express binase system of Saccharomyces cerevisiae (O’Gorman et al., ing the tagged ribosomal proteins or mRNA binding proteins 1991, Science 251: 1351-55); the Cre-loxP-tetracycline con is analyzed using any number of methods known in the art, for trol switch (Gossen and Bujard, 1992, Proc. Natl. Acad. Sci. example but not by way of limitation, by isolating the mRNA USA89: 5547-51); and ligand-regulated recombinase system and constructing cDNA libraries or by labeling the RNA for (Kellendonket al., 1999, J. Mol. Biol. 285: 175-82; incorpo gene expression analysis. rated herein by reference in its entirety). Preferably, the 0210. In a preferred embodiment, poly-A" RNA (mRNA) recombinase is highly active, e.g., the Cre-loxP or the FLPe is isolated from the tagged ribosomal proteins or mRNA system, and has enhanced thermostability (Rodriguez et al., binding proteins of the invention, and converted to cDNA 2000, Nature Genetics 25: 139-40; incorporated herein by through a reverse transcription reaction primed by a first reference in its entirety). primer that comprises an oligo-dT sequence. The first primer 0200. In certain embodiments, a recombinase system can is contacted with the poly-A" RNA under conditions that be linked to a second inducible or repressible transcriptional allow the oligo-dT site to hybridize to the first selected regulation system. For example, a cell-specific Cre-loXP sequence (i.e., the poly-A sequence). Alternatively, the first mediated recombination system (Gossen and Bujard, 1992, primer comprises a sequence that is the reverse complement Proc. Natl. Acad. Sci. USA 89: 5547-51) can be linked to a of a specific selected sequence (for example, a sequence cell-specific tetracycline-dependent time switch detailed characteristic of a family of mRNAs). above (Ewald et al., 1996, Science 273: 1384-1386; Furth et 0211. The first primer is then used to prime synthesis of a al. Proc. Natl. Acad. Sci. U.S.A. 91:9302-06 (1994); St-Onge first-strand cDNA by reverse transcription of the source et al., 1996, Nucleic Acids Research 24(19): 3875-77; which single-stranded nucleic acid. When the Source nucleic acid is are incorporated herein by reference in their entireties). mRNA, a RNA-dependent DNA polymerase activity is 0201 In one embodiment, an altered cre gene with required to convert the primer-source mRNA hybrid to a enhanced expression in mammalian cells is used (Gorski and first-strand cDNA-source mRNA hybrid. A reverse tran Jones, 1999, Nucleic Acids Research 27(9): 2059-61; incor scriptase can be used to catalyze RNA-dependent DNA poly porated herein by reference in its entirety). merase activity. 0202 In a specific embodiment, the ligand-regulated 0212 Reverse transcriptase is found in all retroviruses and recombinase system of Kellendonket al. (1999, J. Mol. Biol. is commonly derived from Moloney murine leukemia virus 285: 175-82; incorporated herein by reference in its entirety) (M-MLV-RT), avian myeloblastosis virus (AMV-RT), bovine can be used. In this system, the ligand-binding domain (LBD) leukemia virus (BLV-RT), Rous sarcoma virus (RSV-RT), of a receptor, e.g., the progesterone or estrogen receptor, is and human immunodeficiency virus (HIV-RT); enzymes fused to the Cre recombinase to increase specificity of the from these sources are commercially available (e.g., Life recombinase. Technologies-Gibco BRL, Rockville, Md. Roche Molecular 0203) 5.13. Methods of Screening for Expression of Biochemicals, Indianapolis, Ind.: PanVera, Madison Wis.). Molecularly Tagged Ribosomal Protein Coding Sequences 0213. A single reverse transcriptase or a combination of 0204. In preferred embodiments, the invention provides a two or more reverse transcriptases (e.g., M-MLV-RT and collection of lines of transformed organisms that contain a AMV-RT) can be used to catalyze reverse transcription and selected Subset of cells or cell population expressing molecu first-strand cDNA synthesis. Such reverse transcriptases are larly-tagged ribosomes. The collection comprises at least two used to convert a primer-single-stranded nucleic acid individual lines, preferably at least five individual lines. Each (mRNA) hybrid to a first-strand cDNA-primer-single individual line is selected for the collection based on the stranded nucleic acid hybrid in the presence of additional identity of the subset of cells in which the molecularly tagged reagents that include, but are not limited to: dNTPs; monova ribosomes are expressed. lent and divalent cations, e.g., KC1, MgCl2, Sulfhydryl 0205 Potential founder organisms for a line of trans reagents, e.g., dithiothreitol (DTT); and buffering agents, formed organisms can be screened for expression of the e.g., Tris-Cl. US 2014/O 1961.76 A1 Jul. 10, 2014

0214. As described below (second-strand cDNA synthe C. for 30-40 minusing SUPERSCRIPT IITM as the source of sis), the catalytic activities required to convert a first-strand reverse transcriptase/DNA polymerase. cDNA-single-stranded nucleic acid hybrid to ds cloNA are an 0220. The transcribed first-strand cDNA may be isolated RNase H activity and a DNA-dependent DNA polymerase from the source RNA to which it is hybridized by any of wide activity. Most reverse transcriptases, such as the ones variety of established methods. For example, the isolation described above (i.e., M-MLV-RT, AMV-RT, BLV-RT, RSV method may involve treating the RNA with a nuclease such as RT, and HIV-RT) also catalyze each of these activities. There RNase H, a denaturant such as heat or an alkali, etc., and/or fore, in certain embodiments, the reverse transcriptase separating the strands by electrophoresis. The second strand employed for first-strand cDNA synthesis remains in the of cDNA can be synthesized using methods well known in the reaction mixture where it can also serve to catalyze second art, for example using reverse transcriptase which primes strand cDNA synthesis. Alternatively, a variety of proteins from the hairpin loop structure that forms at the 3' end of the that catalyze one or two of these activities can be added to the first strand of cDNA. cDNA synthesis reaction. Such proteins may be added 0221 Gene expression in cells treated and not treated with together during a single reaction step, or added sequentially a compound of interest or in cells from animals treated or during two or more Substeps. untreated with a particular treatment, e.g., pharmaceutical or 0215 Preferably a reverse transcriptase lacking RNase H Surgical treatment, may be compared. In addition, mRNA activity is used, in particular when long transcripts are bound by the tagged ribosomal proteins or mRNA binding desired. For example, M-MLV reverse transcriptase lacking proteins may also be analyzed, for example by northern blot RNase H activity (Kotewicz et al., U.S. Pat. No. 5,405,776, analysis, PCR, RNase protection, etc., for the presence of issued Apr. 11, 1995; commercially available as SUPER mRNAS encoding certain protein products and for changes in SCRIPT IITM (Life Technologies-Gibco BRL) can be used to the presence or levels of these mRNAs depending on the catalyze both RNA-dependent DNA polymerase activity and treatment of the cells. In specific embodiments, the mRNA is DNA-dependent DNA polymerase activity. In a preferred isolated from different populations of cells or from popula embodiment, SUPERSCRIPT IITM (Life Technologies tions of cells exposed to different stimuli. Gibco BRL) is used as a source of DNA polymerase activity. 0222. In another aspect, mRNA bound by the tagged ribo This DNA polymerase can be used to synthesize a comple Somal proteins or mRNA binding proteins may be used to mentary DNA strand from single-stranded RNA, DNA, or an produce a cDNA library and, in fact, a collection of such cell RNA:DNA hybrid. SUPERSCRIPT IITM is genetically engi type specific cDNA libraries may be generated from different neered by the introduction of point mutations that greatly populations of isolated cells. Such cDNA libraries are useful reduce its RNase H activity but preserve full DNA poly to analyze gene expression, isolate and identify cell type merase activity. The structural modification of the enzyme specific genes, splice variants and non-coding RNAS. In therefore eliminates almost all degradation of RNA mol another aspect, such cell-type specific libraries prepared from ecules during first-strand cDNA synthesis. mRNA bound by, and isolated from, the tagged ribosomal 0216. In certain embodiments, the reverse transcriptase is proteins or mRNA binding proteins from treated and inactivated after first-strand synthesis. The reverse tran untreated transgenic animals of the invention or from trans Scriptase may be rendered inactive using any convenient pro genic animals of the invention having and not having a dis tocol. The transcriptase may be irreversibly or reversibly ease state can be used, for example in Subtractive hybridiza rendered inactive. Where the transcriptase is reversibly ren tion procedures, to identify genes expressed at higher or lower dered inactive, the transcriptase is physically or chemically levels in response to a particular treatment or in a disease state altered so as to no longer be able to catalyze RNA-dependent as compared to untreated transgenic animals. The mRNA DNA polymerase activity. isolated from the tagged ribosomal proteins or mRNA bind ing proteins may also be analyzed using particular microar 0217. The reverse transcriptase may be irreversibly inac rays generated and analyzed by methods well known in the tivated by any convenient means. In certain embodiments, the art. Gene expression analysis using microarray technology is reverse transcriptase is heatinactivated. The reaction mixture well known in the art. Methods for making microarrays are is subjected to heating to a temperature Sufficient to inactivate taught, for example, in U.S. Pat. No. 5,700,637 by Southern, the reverse transcriptase prior to commencement of the tran U.S. Pat. No. 5,510,270 by Fodor et al. and PCT publication Scription step. In these embodiments, the temperature of the WO99/35293 by Albrecht et al., which are incorporated by reaction mixture, and therefore the reverse transcriptase reference in their entireties. By probing a microarray with present therein, is typically raised to 55° C. to 70° C. for 5 to various populations of mRNAS, transcribed genes in certain 60 minutes, preferably to about 65° C. for 15 minutes. In a cell populations can be identified. Moreover, the pattern of preferred embodiment, the transcriptase is inactivated by gene expression in different cell types of cell states may be adding 1M KOH to the reaction mixture, preferably to make readily compared. a final concentration of 50 mMKOH in the reaction mixture, 0223) Data from such analyses may be used to generate a and by incubating at 65° C. for 15 mM prior to commence database of gene expression analysis for different populations ment of the transcription step. This step ensures that contami of cells in the animal or in particular tissues or anatomical nating non-poly-ARNA is removed from the sample, making regions, for example, in the brain. Using Such a database the Subsequent tailing reaction more efficient. together with bioinformatics tools, such as hierarchical and 0218. Alternatively, reverse transcriptase may irreversibly non-hierarchical clustering analysis and principal compo inactivated by introducing a reagent into the reaction mixture nents analysis, cells are “fingerprinted for particular indica that chemically alters the protein so that it no longer has tions from healthy and disease-model animals or tissues. RNA-dependent DNA polymerase activity. 0224. In yet another embodiment, specific cells or cell 0219. In a preferred embodiment, the reverse transcription populations that express a potential a molecularly tagged reaction to synthesize the first-strand cDNA proceeds at 42 ribosomal protein or mRNA binding protein are isolated from US 2014/O 1961.76 A1 Jul. 10, 2014 44 the collection and analyzed for specific protein-protein inter - Continued actions or an entire protein profile using proteomics methods L37 known in the art, for example, chromatography, mass spec (SEO ID NO: 9) troscopy, 2D gel analysis, etc. 5' oligo. GGAATTCCCGGCGACATGGCTAAACGCACCAAGAAGG 0225. Other types of assays may be used to analyze the cell (SEQ ID NO: 10) population expressing the molecularly tagged ribosomal pro 3' oligo. GCGGCCGCTCTGGTCTTTCAGTTCCTTCAGTCTTCTGAT tein or mRNA binding protein, either in vivo, in explanted or S2O sectioned tissue or in the isolated cells, for example, to moni (SEQ ID NO: 11) tor the response of the cells to a certain treatment or candidate 5' oligo. GGAATTCGCGCGCAACAGCCATGGCTTTTAAGGATAC compound or to compare the response of the animals, tissue or cells to expression of the target or inhibitor thereof, with (SEQ ID NO: 12) animals, tissue or cells from animals not expressing the target 3' oligo. GCGGCCGCTAGCATCTGCAATGGTGACTTCCACCTCAAC or inhibitor thereof. The cells may be monitored, for example, L32 but not by way of limitation, for changes in electrophysiol (SEQ ID NO: 13) ogy, physiology (for example, changes in physiological 5' oligo. GGAATTCGGCATCATGGCTGCCCTTCGGCCTCTGGTG parameters of cells. Such as intracellular or extracellular cal (SEQ ID NO: 14) cium or other ion concentration, change in pH, change in the 3' oligo. GCGGCCGCTTTCATTCTCTTCGCTGCGTAGCCTGGC presence or amount of second messengers, cell morphology, cell viability, indicators of apoptosis, Secretion of Secreted 0231. Mouse brain cDNA (Clontech) was used as the tem factors, cell replication, contact inhibition, etc.), morphology, plate for a polymerase chain reaction (PCR). 50 mL PCR etc. aliquots were prepared for each set of primer pairs. Each 0226. In particular embodiments, the isolated mRNA is reaction consisted of 40 mL PCR-grade water, 5 mL 10x used to probe a comprehensive expression library (see, e.g., Advantage 2 PCR Buffer (Clontech), 1 mL mouse brain Serafini et al., U.S. Pat. No. 6,110,711, issued Aug. 29, 2000, cDNA template, 1 mL each 5' and 3' oligonucleotide primer which is incorporated by reference herein). The library may (10 mM), 1 mL dNTP mix (10 mM each dATP, dCTP, dTTP, be normalized and presented in a high density array. Because and dGTP), and 1 mL 50x Advantage 2 Polymerase Mix. approximately one tenth of the mRNA species in a typical 0232. The PCR reaction was carried out under the follow somatic cell constitute 50% to 65% of the mRNA present, the ing conditions: cDNA library may be normalized using reassociation-kinet 0233 1.95° C. for 1 minute ics based methods. (See Soares, 1997, Curr. Opin. Biotech 0234 2.30 cycles of 95°C. for 15 seconds and 68° C. for nol. 8:542-546). 1 minute 0227. In a particular embodiment, a subpopulation of cells 0235 10 mL of each reaction was analyzed by electro expressing a molecularly tagged ribosomal protein or mRNA phoresis through a 1.2% agarose gel in TAE. The remainder binding protein is identified and/or gene expression analyzed of the reaction was purified using a QIAGEN QUICKSPIN using the methods of Serafini et al., WO 99/29877 entitled PCR reaction purification kit following the manufacturers “Methods for defining cell types,” which is hereby incorpo protocol. rated by reference in its entirety. 0236 Purified DNA was digested with EcoRI and NotI followed by electrophoresis through a 1.2% agarose gel, iso 6.EXAMPLE1 lation of the DNA fragment, and extraction of the DNA from the gel using a QIAGEN QUICKSPIN Gel isolation kit fol Tagging of Ribosomal Proteins lowing the manufacturer's protocol. 0228 6.1. Isolation of Ribosomal Protein-Encoding 0237 Each cDNA fragment was ligated to pcDNA3.1+, cDNAS which had been digested with EcoRI and NotI. Ligated DNA 0229. This example demonstrates the successful introduc was used to transform chemically competent DH5a bacteria. tion of a Strep-tag into ribosomal Subunit protein-encoding Transformed bacteria were plated onto LB plates containing cDNAS. 100 mg/mL amplicillin. For each ligation, 3 amplicillin resis 0230. Oligonucleotides complementary to the sequence of tant colonies were picked, grown in 5 mL LB cultures con ribosomal subunit proteins, S6, and L37 were designed to taining 100 mg/mL amplicillin. permit PCR amplification of the cDNAs from reverse tran 0238. The cultures were incubated for 16 hours on a shak scribed mRNA. EcoRI and Not restrictions sites were incor ing platform at 37° C. Plasmid DNA was isolated from the porated into the 5' terminal ends of the 5' and 3' specific cultures using a QIAGEN miniprep kit following the manu oligonucleotides to facility the subcloning of the amplified facturer's protocol. Plasmid DNA was digested with PmeI cDNAs into the expression vector pcDNA3.1+. The sequence and analyzed on a 1.2% agarose gel to identify plasmids that of the oligonucleotide sets were as follows: contain the cDNA insert. 0239 6.2. Addition of Strep-Tag to the Ribosomal Subunit Proteins S6 (SEO ID NO: 7) 0240. The amino acid sequence Trp-Ser-His-Pro-Gln 5' oligo. GGAATTCATTCAAGATGAAGCTGAACATCTCCTTCCC Phe-Glu-Lys (SEQ ID NO: 17) represents Strep-tag II, a peptide that is able to bind with high affinity to the protein (SEQ ID NO: 8) Streptavidin. Proteins that contain the Strep-tag II can be 3' oligo. GCGGCCGCTTTTCTGACTGGATTCAGACTTAGAAGTAGAA identified and isolated through affinity to Streptavidin. Strep GCT tag II was added to each of the ribosomal subunit proteins, S6, S20, L32, and L37, at the C-terminus of the protein. Two complementary oligonucleotide adaptors were designed that US 2014/O 1961.76 A1 Jul. 10, 2014

encode for Strep-tag II. These complementary oligonucle in 100 mMammonium chloride, 5 mM magnesium chloride, otide adaptors, when hybridized to form a double stranded 1 mM DTT and 20 mM Tris-HCl (pH 7.6). DNA, can be ligated in-frame to the ribosomal subunit 0248. An equal volume of 2x denaturing protein electro cDNAs in the vector pcDNA3.1+. phoresis sample buffer was added to each of the polysome 0241 The sequences of the Strep-tag II oligonucleotides samples. Solubilized polysomal proteins were fractionated Were by electrophoresis through a SDS containing 4-20% gradient polyacrylamide gel, and transferred to a nitrocellulose filter. The filter was quenched for 1 hour in PBS containing 5% dry Upper strand oligonucleotide (SEQ ID NO: 15) milk followed by incubation with rabbit antisera specific for 5 GGCCGCAGCGCTTGGAGCCACCCGCAGTTCGAAAAATAA 3 the Strep-tag II amino acid sequence epitope Trp-Ser-His Pro-Gln-Phe-Glu-Lys (SEQ ID NO: 17). The filters were Bottom strand oligonucleotide rinsed three times in PBS for 20 minutes each, followed by a (SEQ ID NO: 16) one hour incubation with a goat anti-rabbit antisera that had s' TCGATTATTTTTCGAACTGCGGGTGGCTCCAAGCGCTGC 3' been conjugated to horse radish peroxidase (HRP), in PBS 0242 Each of the plasmids containing the ribosomal sub containing 10% dry milk. The filters were then washed for unit protein-encoding cDNAs was digested with Non and three times in PBS. The HRP was detected by incubating the XhoI. The upper strand and bottom strand oligonucleotides filter in 20 mL of PBS, containing 4-chlornaphtol and hydro were mixed in equal molar ratios, heated to 70° C., and gen peroxidase. allowed to cool to room temperature. The hybridized oligo 0249. As seen in FIG. 1, polysomes from cells transfected nucleotides were then ligated to the Nod and XhoI digested with plasmids expressing tagged versions of ribosomal pro plasmids. The ligation reactions were transformed into com teins S6 (lane 2, in duplicate), L32 (lane 4, in duplicate, not petent DH5a bacteria and plated onto LB plates supple easily seen in the reproduction), and L37 (lane 5, in duplicate) mented with 100 mg/mL amplicillin. contain proteins that are reactive to the anti-streptag II anti 0243 For each ligation, five ampicillin resistant colonies bodies. These proteins correspond to the predicted molecular were picked into 5 mL LB cultures containing 100 mg/mL weights of the S6 (34 kDa), L32 (52 kDa), and L37 (9 kDa). ampicillin. The cultures were grown at 37°C. for 16 hours. The S6 and L37 proteins appear to be more abundantly rep Plasmid DNA was harvested, cut with PmeI, and analyzed by electrophoresis through 5% non denaturing polyacrylamide resented in the polysomal fraction compared to the L32 pro gels. Untagged ribosomal Subunit protein encoding cDNAS tein, which is difficult to visualize in the figure but is present were also digested with Pmel, and run side by side with the upon close inspection of the original filter. Tagged S20 (lane tagged versions to identify the cDNAs that contained the 3, in duplicate) does not appear to be present in the polysomal strep-tagII sequence. All tagged cDNAS were then sequenced fraction. Polysomes from untransfected cells (lane 1, in dupli to confirm the sequence of each cDNA. cate) do not display any immunoreactive material. 0250) 7.2. Polysome Immunoprecipitation 7. EXAMPLE 2 (0251 HEK 293 cells were transfected with plasmid con structs expressing tagged ribosomal proteins S6 or L37 and Isolation and Immunoprecipitation of PolySomes homogenized as above. Unlysed cells, nuclei, and mitochon dria were removed by centrifugation at 10,000xg for 10 min 0244 7.1. Polysome Isolation utes. 5 micrograms of an anti-streptag rabbit polyclonal anti 0245 Plasmid constructs expressing tagged ribosomal proteins were transfected into Human Embryonic Kidney sera was added to the supernatant and incubated at 4°C. for 72 (HEK293) cells using the transfection reagent FuGENE 6 hours. 100 microliters of a protein A Sepharose slurry was (Roche Applied Science) following the manufacturer's pro then added and incubation continued at 4°C. for one hour. cedures. Briefly, for each transfection, 100 mL of serum free The sepharose beads were pelleted by centrifugation at medium (DMEM) was placed into a sterile tube, followed by 1,000xg for 5 minutes. The supernatant was removed, and the the addition of three mL of Fugene 6 and 1 mg of plasmid pellet was resuspended in 10 mLS of fresh homogenization DNA. The Fugene 6/DNA mixture was allowed to incubate at buffer. This procedure was repeated three times. room temperature for 15 minutes before being added to a 60 0252 RNA was harvested from the protein A Sepharose mm plate of HEK293 cells grown in DMEM supplemented pellet using an RNA isolation kit (Ambion). Briefly, the pel with 10% fetal calf serum, glutamine, and antibiotics. lets were solubilized in 600 microliters of homogenization 0246 Three days after transfection, the cells were har buffer, followed by the addition of 600 microliters of 64% vested by scraping into homogenization buffer (50 mM EtOH. This mixture was applied to the spin column provided sucrose, 200 mM ammonium chloride, 7 mM magnesium by the kit, followed by centrifugation at 10,000xg for 1 acetate, 1 mM dithiothreitol, and 20 mM Tris-HCl, pH 7.6). minute. The column was sequentially washed in the two wash The cells were lysed by the addition of the detergent, NP-40, buffers provided with the kit. RNA bound to the column was to a concentration of 0.5% followed by five strokes in a glass released by the addition of elution buffer heated to 95°C. dounce tissue homogenizer. Unlysed cells, nuclei and mito RNA was visualized by electrophoresis through an ethidium chondria were pelleted by centrifugation at 10,000xg for 10 bromide containing agarose gel. minutes, at 4°C. The Supernatant was carefully removed and 0253) As seen in FIG.2, ribosomal RNA is present (arrow) layered over a two-step discontinuous gradient of 1.8 Mand in material immunoprecipitated from tagged S6 (lane 2) 1.OM sucrose in 100 mM ammonium chloride, 5 mM mag transfectants. Such RNA is also present at low levels in mate nesium acetate, 1 mM dithiothreitol. 20 mM Tris-HCl (pH rial from tagged L37 transfectants (lane 3: not easily seen in 7.6). The gradient was centrifuged for 18 hours at 98,000xg at reproduction). Such RNA is not present in material from 40 C. untransfected cells (lane 1). 0247. Following centrifugation, the supernatants were 0254 All references cited herein are incorporated herein carefully removed, and the polysome pellet was resuspended by reference in their entireties and for all purposes to the same US 2014/O 1961.76 A1 Jul. 10, 2014 46 extent as if each individual publication, patent or patent appli 0256 Many modifications and variations of this invention cation was specifically and individually indicated to be incor can be made without departing from its spirit and scope, as porated by reference in its entirety for all purposes. will be apparent to those skilled in the art. The specific 0255. The citation of any publication is for its disclosure embodiments described herein are offered by way of example prior to the filing date and should not be construed as an only, and the invention is to be limited only by the terms of the admission that the present invention is not entitled to antedate appended claims along with the full scope of equivalents to such publication by virtue of prior invention. which such claims are entitled.

SEQUENCE LISTING

<16 Os NUMBER OF SEO ID NOS : 17

<21 Os SEQ ID NO 1 &211s LENGTH: 9 212s. TYPE: PRT <213> ORGANISM: Influenza virus

<4 OOs SEQUENCE: 1 Tyr Pro Tyr Asp Val Pro Asp Tyr Ala 1. 5

<21 Os SEQ ID NO 2 &211s LENGTH: 10 212s. TYPE: PRT <213> ORGANISM: Homo sapiens <4 OOs SEQUENCE: 2 Glu Glin Llys Lieu. Ile Ser Glu Glu Asp Lieu. 1. 5 1O

<21 Os SEQ ID NO 3 &211s LENGTH: 6 212s. TYPE: PRT <213> ORGANISM: Bluetongue virus <4 OOs SEQUENCE: 3 Gln Tyr Pro Ala Lieu. Thr 1. 5

<21 Os SEQ ID NO 4 &211s LENGTH: 8 212s. TYPE: PRT <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic peptide

<4 OOs SEQUENCE: 4 Asp Tyr Lys Asp Asp Asp Asp Llys 1. 5

<21 Os SEQ ID NO 5 &211s LENGTH: 9 212s. TYPE: PRT <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic peptide

<4 OOs SEQUENCE: 5 Ala Trp Arg His Pro Glin Phe Gly Gly 1. 5

<21 Os SEQ ID NO 6 &211s LENGTH: 551 US 2014/O 1961.76 A1 Jul. 10, 2014 47

- Continued

&212s. TYPE: DNA <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic polynucleotide

<4 OOs, SEQUENCE: 6 taacgttact ggc.cgaagcc gcttggaata aggc.cggtgt gcgtttgtct atatgttatt 6 O titccaccata ttgcc.gtc.tt ttggcaatgt gagggc.ccgg aaacctggcc ctdtctt citt 12 O gacgagcatt cct agggg.tc titt CC cct ct cqccaaagga atgcaaggt C tittgaatgt 18O cgtgaaggaa gcagttcCtc taa.gct tc ttgaaga caa acaacgtctg. tagcgaccct 24 O ttgcaggcag C9gaac cc cc cacctggcga Caggtgcctic ticggcc aaa agccacgtgt 3OO ataagataca Cctgcaaagg cqgcacalacc C cagtgccac gttgtgagtt ggatagttgt 360 ggaaagagtic aaatggct ct cctaagcgta ttcaacaagg ggctgaagga tigcc.ca.gaag 42O gtactic catt gtatgggatc tdatctgggg cct cqgtgca catgctttac atgtgtttag 48O tcgaggittaa aaaaacgt.ct aggc.ccc.ccg aaccacgggg acgtggttitt CCtttgaaaa 54 O acac catgat a 551

<210s, SEQ ID NO 7 &211s LENGTH: 37 &212s. TYPE: DNA <213> ORGANISM: Artificial Sequence & 22 O FEATURE; <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic prlmer

<4 OO > SEQUENCE: 7 ggaatt catt caagatgaag ctgaacat ct cott.ccc 37

<210s, SEQ ID NO 8 &211s LENGTH: 43 &212s. TYPE: DNA <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic prlmer

<4 OOs, SEQUENCE: 8 tcgaagatga agattcagaci ttaggtoagt cttitt.cgc.cg gcg 43

<210s, SEQ ID NO 9 &211s LENGTH: 37 &212s. TYPE: DNA <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic prlmer

<4 OOs, SEQUENCE: 9 ggaatt CCC9 gcgacatggc taaacgcacc aagaagg 37

<210s, SEQ ID NO 10 &211s LENGTH: 39 &212s. TYPE: DNA <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic prlmer US 2014/O 1961.76 A1 Jul. 10, 2014 48

- Continued

<4 OOs, SEQUENCE: 10 tagt cittctg act tcc ttga ctittctggtc. tcgc.cggcg 39

<210s, SEQ ID NO 11 &211s LENGTH: 37 &212s. TYPE: DNA <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<4 OOs, SEQUENCE: 11 ggaatticgcg cgcaa.cagcc atggcttitta aggatac 37

<210s, SEQ ID NO 12 &211s LENGTH: 39 &212s. TYPE: DNA <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<4 OOs, SEQUENCE: 12

Caactic cacc titcagtggta acgt.ctacga t cqc.cggcg 39

<210s, SEQ ID NO 13 &211s LENGTH: 37 & 212 TYPE DNA <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<4 OOs, SEQUENCE: 13 ggaatticggc at catggctg. CCCtt cqgcc tictggtg 37

<210s, SEQ ID NO 14 &211s LENGTH: 36 &212s. TYPE: DNA <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<4 OOs, SEQUENCE: 14 cggit Cogatg cgt.cgcttct Ctt actitt.cg ccggcg 36

<210s, SEQ ID NO 15 &211s LENGTH: 39 &212s. TYPE: DNA <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<4 OOs, SEQUENCE: 15 ggcc.gcagcg Cttggagc.ca ccc.gcagttc gaaaaataa 39

<210s, SEQ ID NO 16 &211s LENGTH: 39 &212s. TYPE: DNA <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic US 2014/O 1961.76 A1 Jul. 10, 2014 49

- Continued primer

<4 OOs, SEQUENCE: 16 tcgattattt ttcgaactgc gggtggct Co. aag.cgctgc 39

<210s, SEQ ID NO 17 &211s LENGTH: 8 212. TYPE: PRT <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic peptide

<4 OOs, SEQUENCE: 17 Trp Ser His Pro Glin Phe Glu Lys 1. 5

What is claimed is: causes expression of said ribosomal fusion protein in a chosen 1. An isolated ribosome-reagent complex comprising: cell type, and wherein said ribosomal fusion protein is incor (a) an intact ribosome comprising a ribosomal fusion pro porated in an intact ribosome that translates and/or binds tein, wherein said ribosomal fusion protein comprises a mRNA. ribosomal protein, or fragment thereof, fused to a pep 14. The transgenic plant of claim 13, wherein said peptide tide tag: tag is not a ribosomal protein. (b) an mRNA bound to said intact ribosome; and 15. The transgenic plant of claim 13, wherein said riboso (c) a reagent bound to said peptide tag. mal protein is S6, S15, S18, L10a, L32, or L37. 2. The isolated complex of claim 1, wherein said reagent is 16. The transgenic plant of claim 13, wherein said peptide bound to a solid support. tag is placed at the N- or C-terminus of said ribosomal protein. 3. The isolated complex of claim 1, wherein said peptide 17. The transgenic plant of claim 13, wherein said peptide tag is not a ribosomal protein. tag does not inhibit or interfere with a function of said ribo 4. The isolated complex of claim 1, wherein said ribosomal Somal protein. protein is S6, S15, S18, L10a, L32, or L37. 18. A transgenic plant comprising a transgene comprising 5. The isolated complex of claim 1, wherein said peptide a nucleotide sequence encoding a mRNA binding fusion pro tag is placed at the N- or C-terminus of said ribosomal protein. tein, wherein said mRNA binding fusion protein comprises a 6. The isolated complex of claim 1, wherein said peptide mRNA binding protein, or fragment thereof, fused to a pep tag does not inhibit or interfere with a function of said ribo tide tag, wherein said nucleotide sequence is operably linked Somal protein. to a plant endogenous promoter, wherein said plant endog 7. An isolated mRNA binding protein-mRNA-reagent enous promoter causes expression of said mRNA binding complex comprising: fusion protein in a chosen cell type. (a) a mRNA binding fusion protein comprises a mRNA 19. The transgenic plant of claim 18, wherein said peptide binding protein, or fragment thereof, fused to a peptide tag is not a mRNA binding protein. tag, 20. The transgenic plant of claim 18, wherein said mRNA (b) a mRNA bound to said mRNA binding fusion protein; binding fusion protein is not a polyA binding protein. and 21. The transgenic plant of claim 18, wherein said peptide (c) a reagent bound to said peptide tag. tag is placed at the N- or C-terminus of said mRNA binding 8. The isolated complex of claim 7, wherein said reagent is protein. bound to a solid Support. 22. The transgenic plant of claim 18, wherein said peptide 9. The isolated complex of claim 7, wherein said peptide tag does not inhibit or interfere with a function of said mRNA tag is not a mRNA binding protein. binding protein. 10. The isolated complex of claim 7, wherein said mRNA binding fusion protein is not a polyA binding protein. 23. The transgenic plant of claim 13, wherein said riboso 11. The isolated complex of claim 7, wherein said peptide mal fusion protein is not translated in a cell type that is not tag is placed at the N- or C-terminus of said mRNA binding chosen. protein. 24. The transgenic plant of claim 13, wherein said chosen 12. The isolated complex of claim 7, wherein said peptide cell type comprises neural cells. tag does not inhibit or interfere with a function of said mRNA 25. The transgenic plant of claim 24, wherein said neural binding protein. cells comprise neuronal cells. 13. A transgenic plant comprising a transgene comprising 26. The transgenic plant of claim 13, wherein said peptide a nucleotide sequence encoding a ribosomal fusion protein, tag is streptavidin. wherein said ribosomal fusion protein comprises a ribosomal 27. The transgenic plant of claim 13, wherein said peptide protein, or fragment thereof, fused to a peptide tag, wherein tag comprises 200 or more amino acids. said nucleotide sequence is operably linked to a plant endog 28. The transgenic plant of claim 13, wherein said plant enous promoter, wherein said plant endogenous promoter endogenous promoter is from a characterizing gene. US 2014/O 1961.76 A1 Jul. 10, 2014 50

29. The transgenic plant of claim 28, wherein said expres 48. The method of claim 47, wherein said endogenous sion of said nucleotide sequence is substantially similar to regulatory sequence is at least or about 100 kilobases in expression of said characterizing gene. length. 30. The transgenic plant of claim 28, wherein said expres 49. The method of claim 47, wherein said endogenous sion of said nucleotide sequence is in at least 80% of cells regulatory sequence is at least or about 200 kilobases in shown to express said characterizing gene in said mammal. length. 31. The transgenic plant of claim 13, wherein said plant 50. The method of claim 47, wherein said endogenous endogenous promoter is part of an endogenous regulatory regulatory sequence comprises a transcription regulatory ele Sequence. 32. The transgenic plant of claim 31, wherein said endog ment. enous regulatory sequence is at least or about 100 kilobases in 51. The method of claim 50, wherein said transcription length. regulatory element comprises a transcriptional enhancer 33. The transgenic plant of claim 31, wherein said endog sequence, insulator sequence, or a combination thereof. enous regulatory sequence is at least or about 200 kilobases in 52. The method of claim 38, wherein said nucleotide length. sequence is operably linked to a regulatory sequence associ 34. The transgenic plant of claim 31, wherein said endog ated with or part of a bacterial artificial chromosome (BAC). enous regulatory sequence comprises a transcription regula 53. The method of claim 38, further comprising contacting tory element. said cell with an agent capable of arresting translation. 35. The transgenic plant of claim 34, wherein said tran 54. The method of claim 38, further comprising determin Scription regulatory element comprises a transcriptional ing a gene expression profile for said cell. enhancer sequence, insulator sequence, or a combination thereof. 55. The method of claim38, further comprising identifying 36. The transgenic plant of claim 13, wherein said nucle said actively translated mRNA. otide sequence is operably linked to a regulatory sequence 56. The method of claim 38, further comprising quantify associated with or part of a bacterial artificial chromosome ing said actively translated mRNA of said cell. (BAC). 57. A transgene comprising a nucleotide sequence encod 37. A cell of said transgenic plant of claim 13 comprising a ing a ribosomal fusion protein, wherein said ribosomal fusion nucleotide sequence encoding a ribosomal fusion protein, protein comprises a ribosomal protein, or fragment thereof, wherein said ribosomal fusion protein comprises a ribosomal fused to a peptide tag, wherein said nucleotide sequence is protein, or fragment thereof, fused to a peptide tag, wherein operably linked to plant endogenous promoter, wherein said said nucleotide sequence is operably linked to a plant endog plant endogenous promoter causes expression of said riboso enous promoter, wherein said plant endogenous promoter mal fusion protein in a chosen cell type, and wherein said causes expression of said ribosomal fusion protein said cell, ribosomal fusion protein is incorporated in an intact ribosome and wherein said ribosomal fusion protein is incorporated in that translates and/or binds mRNA. an intact ribosome that translates and/or binds mRNA. 58. The transgene of claim 57, wherein said ribosomal 38. A method of isolating an actively translated mRNA protein is S6, S15, S18, L10a, L32, or L37. from said transgenic plant of claim 13, comprising: 59. The transgene of claim 57, wherein said peptide tag is (a) contacting a cell isolated from said transgenic plant or placed at the N- or C-terminus of said ribosomal protein. a lysate of said cell that comprises said ribosomal fusion 60. A transgene comprising a nucleotide sequence encod protein with a reagent that binds to said peptide tag: ing a mRNA binding fusion protein, wherein said mRNA (b) isolating said ribosomal fusion protein containing said binding fusion protein comprises a mRNA binding protein, or peptide tag; and fragment thereof, fused to a peptide tag, wherein said nucle (c) isolating said actively translated mRNA from said ribo otide sequence is operably linked to a plant endogenous pro SOS. moter, wherein said plant endogenous promoter causes 39. The method of claim 38, wherein said reagent is bound expression of said mRNA binding fusion protein in a chosen to a solid Support. cell type. 40. The method of claim 38, wherein said peptide tag is 61. The transgene of claim 60, wherein said peptide tag is streptavidin and said reagent specifically binds Streptavidin. placed at the N- or C-terminus of said mRNA binding protein. 41. The method of claim 38, wherein said chosen cell type 62. The transgene of claim 57, wherein said plant endog is a neural cell. enous promoter is part of an endogenous regulatory sequence 42. The method of claim 41, wherein said neural cells comprise neuronal cells. that is at least or about 100 kilobases in length. 43. The method of claim 38, wherein said peptide tag 63. The transgene of claim 57, wherein said plant endog comprises 200 or more amino acids. enous promoter is part of an endogenous regulatory sequence 44. The method of claim 38, wherein said plant endog that is at least or about 200 kilobases in length. enous promoter is from a characterizing gene. 64. The transgene of claim 57, wherein said plant endog 45. The method of claim 44, wherein said expression of enous promoter is part of an endogenous regulatory sequence said nucleotide sequence is Substantially similar to expres that comprises a transcription regulatory element. sion of said characterizing gene. 65. The transgene of claim 64, wherein said transcription 46. The method of claim 44, wherein said expression of regulatory element comprises a transcriptional enhancer said nucleotide sequence is in at least 80% of cells shown to sequence, insulator sequence, or a combination thereof. express said characterizing gene in said mammal. 66. The transgene of claim 57, wherein said nucleotide 47. The method of claim 38, wherein said plant endog sequence is operably linked to a regulatory sequence associ enous promoteris part of an endogenous regulatory sequence. ated with or part of a bacterial artificial chromosome (BAC). US 2014/O 1961.76 A1 Jul. 10, 2014 51

67. A method of making said transgenic plant of claim 13, comprising introducing into a founder plant or embryo a transgene comprising a nucleotide sequence encoding a ribo Somal fusion protein. k k k k k