USO05766941A Patent 19 11 Patent Number: 5,766,941 Cormier et al. 45) Date of Patent: Jun. 16, 1998

54 RECOMBINANT DNA VECTORS CAPABLE Inouye. S. et al., "Cloning and sequence analysis of cDNA OF EXPRESSINGAPOAEQUORIN for the luminescent protein ." Proc. Natl. Acad. Sci. USA, vol. 82, pp. 3154-3158. (May 1985). (75) Inventors: Milton J. Cormier, Bogart. Ga.; Tsuji, F.I. et al., "Site-specific mutagenesis of the calci Douglas Prasher. East Falmouth, Mass. um-binding photoprotein aequorin.” Proc. Natl. Acad. Sci. USA, vol. 83, pp. 8107-8111. (Nov. 1986). 73) Assignee: Research Charbonneau, H. et al., "Amino Acid Sequence of the Foundation, Inc., Athens. Ga. Calcium-Dependent Photoprotein Aequorin ." Biochemis try, vol. 24, No. 24. pp. 6762-6771. (Nov. 19, 1985). (21) Appl. No.: 487,779 Shimomura, O. et al., “Resistivity to denaturation of the apoprotein of aequorin and reconstitution of the luminescent 22 Filed: Jun. 7, 1995 photoprotein from the partially denatured apoprotein." Bio chem. J. vol. 199, pp. 825-828. (Dec. 1981). Related U.S. Application Data Prendergast, F.G. et al., "Chemical and Physical Properties 63 Continuation of Ser. No. 346,379, Nov. 29, 1994, which is of Aequorin and the Green Fluorescent Protein Isolated from a continuation of Ser. No. 960,195, Oct. 9, 1992, Pat. No. Aequorea forskalea." American Chemical Society, vol. 17. 5,422,266, which is a continuation of Ser, No. 569,362, Aug. No. 17. pp. 3449-3453. (Aug. 1978). 13, 1990, abandoned, which is a continuation of Ser. No. Shimomura. O. et al., "Chemical Nature of 165,422, Feb. 29, 1988, abandoned, which is a continuation of Ser. No. 942,273, Dec. 15, 1986, abandoned, which is a Systems in Coelenterates." Proc. Nat. Acad. Sci. USA. vol. continuation-in-part of Ser. No. 687.903, Dec. 31, 1984, 72, No. 4, pp. 1546-1549, (Apr. 1975). abandoned. Muesing. M. et al. "High-level expression in of calcium-binding domains of an embryonic sea urchin (51) Int. Cl. ... A61K 14/435 protein." Gene, vol. 1108, pp. 155-164. (Nov. 1984). 52 U.S. Cl...... 530/324; 530/350; 530/855 Ohno. S. et al. "Evolutionary origin of a calcium-dependent 58) Field of Search ...... 435/69.1, 172.3, protease by fusion of genes for a thiol protease and a 435/252.3-252.35, 320.1; 536/23.2, 23.5. calcium-binding protein?," Nature, vol. 312. No. 6, pp. 24.3; 530/324, 350, 858 566-570. (Dec. 1984). Kuwano, R. et al., “Molecular cloning and the complete 56 References Cited nucleotide sequence of cDNA to mRNA for S-100 protein of rat brain." Nucleic Acids Research, vol. 12. No. 19. pp. U.S. PATENT DOCUMENTS 7455-7456, (1984). 4,011,219 3/1977 Nishii et al...... 544/237 Putkey, J.A. et al., “Chicken Calmodulin Genes." The Jour 4,104,029 8/1978 Maier, Jr...... 435/.4 nal of Biological Chemistry, vol. 258. No. 19, pp. 4, 160,016 7/1979 Ullman ...... 436/537 11864-11870, (Oct. 10, 1983). 4,181,650 1/1980 Maier et al. . ... 530/303 Inouye. S. et al., "Structural similarities between the devel 4,220,450 9/1980 Maggio ...... 436/537 opment-specific protein S from a Gram-negative bacterium, 4.225,485 9/1980 Buckler et al. .. 530/391.5 4,331.808 5/1982 Buckler et al...... 544/234 Myxococcus xanthus, and calmodulin." Proc. Natl. Acad. 4,334,069 6/1982 Buckler et al...... 544/237 Sci. USA. vol. 80, pp. 6829-6833. (Nov. 1983). 4,380,580 4/1983 Boguslaski et al. . ... 435/5 Garfinkel, L.L. et al., "Cloning and Characterization of 4,396,579 8/1983 Schroeder et al. ... 422/52 cDNA Sequences Corresponding to Myosin Light Chains 1. 4,478,817 10/1984 Campbell et al...... 424/7.1 2, and 3, Troponin-C. Troponin-T, o-Tropomyosin, and 4,581,335 4/1986 Baldwin ...... 435/1723 oc-Actin." The Journal of Biological Chemistry, vol. 257. 4,604,364 8/1986 Kosak ...... 436/501 No. 18, pp. 11078-11086, (Sep. 25, 1982). 4,614,712 9/1986 Baldwin et al. . ... 435/7.92 Suggs, S.V. et al. "Use of synthetic oligonucleotides as 4,665,022 5/1987 Schaefer et al. 435/72 hybridization probes: Isolation of cloned cDNA sequences 4,740,468 4/1988 Weng et al. . ... 435/7.91 4,968,613 11/1990 Matsuda ...... 435/1723 for human B-microglobulin." Proc. Natl. Acad. Sci. USA. 5,162,227 11/1992 Cormier ...... 435/252.33 vol. 78, No. 11, pp. 6613-6617, (Nov. 1981). Sanger. F. et al. "Nucleotide Sequence of Bacteriophage FOREIGN PATENT DOCUMENTS DNA." J. Mol, Biol. vol. 162. pp. 729-773. (1982). 1461877 1/1977 United Kingdom. 2095 830 10/1982 United Kingdom. 81 01883 9/1981 WIPO (List continued on next page.) 87 03304 4/1987 WTPO : Primary Examiner-James Martinell 89 09393 5/1989 WIPO Attorney; Agent, or Firm-Jones & Askew, LLP OTHER PUBLICATIONS 57 ABSTRACT Prasher, D.C. et al. "Cloning and Expression of the cDNA Coding for Aequorin, a Bioluminescent Calcium-Binding A gene which codes for the protein apoaequorin is disclosed Protein." Biochemical and Biophysical Research Communi along with recombinant DNA vectors containing this gene. cations, vol. 126, No. 3, pp. 1259-1268, (Feb. 15, 1985). Homogeneous peptides having the bioluminescence proper Inouye, S. et al., "Expression of Apoaequorin Complemen ties of natural, mixed apoaeqorin are also disclosed. tary DNA in Escherichia coli." , vol. 25. No. 26, pp. 8425-8429, (Dec. 30, 1986). 1 Claim, 4 Drawing Sheets 5,766,941 Page 2

OTHER PUBLICATIONS Pazzagli. G.F. et al., "Homogeneous Luminescent Immu noassay for Progesterone: A study on the Antibody-En Chen, R. "Complete amino acid sequence and glycosylation hanced Chemiluminescence." Luminescent Assays: Per sites of glycoprotein gp71A of Friend murine leukemia spectives in Endocrinology and Clinical Chemistry, Raven virus." Proc. Natl. Acad, Sci., USA. vol. 79, pp. 5788-5792. Press, pp. 191-200, (1982). (Oct. 1982), Kim, J.B. et al., "Measurement of Plasma Estradiol-17B by Patel, A. et al., "Calcium-Sensitive Photoproteins as Biolu Solid-Phase Chemiluminescence Immunoassay." Clinical minescent Labels in Immunoassay." Analytical Applications Chemistry, vol. 28, No. 5, pp. 1120-1124. (1982). of Bioluminescence and Chemiluminescence. pp. 273-276. Kohen, F. et al. "An Immunoassay for Plasma Cortisol (1984). Based on Chemiluminescence." Steroids. pp., pp. 421-437. Cummings, R.D. et al. "A Mouse Lymphoma Line (May 21, 1980). Resistant to the Leukoagglutinating Lectin from Phaseolus Kohen, F. et al. "An Assay Procedure for Plasma Progest vulgaris Is Deficient in UDP-GlcNAc:o-D-mannoside erone Based on Antibody-Enhanced Chemiluminescence." B1.6 N-Acetylglucosaminyltransferase.” The Journal of FEBS Letters, vol. 104, No. 1. pp. 201-205. (Aug. 1979). Biological Chemistry, vol. 257, No. 22, pp. 13421-13427, Jaye, M. et al. "Isolation of a human anti-haemophilic (Nov. 25, 1982). factor DX cDNA clone using a unique 52-base synthetic Stephenson. D.G. et al., "Studies on the Luminescent oligonucleotide probe deduced from the amino acid Response of the Ca'-Activated Photoprotein, Obelin." sequence of bovine factor DX." Nucleic Acids Research. vol. Biochimica et Biophysica Acta, vol. 678, pp. 65-75, (1981). 11, No. 8, pp. 2325-2335. (1983). Cormier, M.J. "Renilla and Aequorin Bioluminescence.” Hsu. S. "Immunoperoxidase Techniques Using the Avidin Bioluminescence and Chemiluminescence, pp. 225-233. Biotin System.” Department of Pathology and Laboratory (1985). Medicine, University of Texas Health Science Center, pp. Cormier, M.J., “Mechanism of energy conversion and trans 467-476. (1985) unknown publication. fer in bioluminescence." Chemical Abstract, vol. 93, No. 5. Matsuda, G. et al. "The Primary Structure of L-1 Light p. 545. No. 4201H, (Aug. 4, 1980), Chain of Chicken Fast Skeletal Muscle Myosin and Its DeLucaM.A., "Bioluminescence and Chemiluminescence." Genetic Implication." FEBS Letters, vol. 126, No. 1, pp. Methods in Enzymology, vol. LVII, pp. 238,240, 242, 243, 111-113, (Apr. 1981). and 590 (1978). Pierce. M. et al. "The Localization of Galactosyltrans Hart, R.C. et al., “Mechanism of the -Catalyzed ferases in Polyacrylamide Gels by a Coupled Enzyme Bioluminescent Oxidation of Coelenterate-Type ." Assay." Analytical Biochemistry, vol. 102, pp. 441-449, Biochemical and Biophysical Research Communications, (1980). vol. 81. No. 3. pp.980-986. (Apr. 14, 1978). Smith. D.F. et al., "Rabbit Antibodies against the Human Matthews, J.C. et al., "Purification and Properties of Renilla Milk Sialyloligosaccharide Alditol of LS-Tetrasaccharide a reniformis ." Biochemistry, vol. 16, No. 1. pp. (NeuAco.2-3Ga181-3GlcNAcB1-4Glco)." Archives of 85-91, (1977). Biochemistry and Biophysics. vol. 241. No. 1. pp. 298-303, Hart, R.C. et al., "Renilla renifomuis Bioluminescence: (Aug. 15, 1985). Luciferase-Catalyzed Production of Nonradiating Excited Lorenz, W.W. et al. "Isolation and Expression of a cDNA Encoding Renilla reniformis luciferase." Proc. Natl. Acad. States from Luciferin Analogues and Elucidation of the Sci. USA. vol. 88, pp. 4438-4442. (May 1991). Excited State Species Involved in Energy Transfer to Renilla Blinks, J.R. et al. "Practical Aspects of the Use of Aequorin Green Fluorescent Protein." Biochemistry, vol. 18, No. 11. as a Calcium Indicator: Assay, Preparation, Microinjection, pp. 2204-2210. (1979). and Interpretation of Signals." Methods of Enzymology, vol. Hori. K. et al. "Structure of native Renilla reniformis 57, pp. 292–328, (1978). luciferin." Proc. Natl. Acad. Sci. USA. vol. 74, No. 10, pp. Miyata, T. et al., “Plasminogens Tochigi II and Nagoya: Two 4285-4287, (Oct. 1977). Additional Molecular Defects with Ala-600 Thr Replace Anderson, J.M. et al., "Lumisomes: A Bioluminescent Par ment Found in Plasmin Light Chain Variants." J. Biochem. ticle Isolated from the Sea Pansy Renilla reniformis," vol. 96, No. 2. pp. 277-285. (Aug. 1984). Chemiluminescence and Bioluminescence, Plenum Publish Maniatis, T. et al., "Molecular Cloning: A Laboratory ing Corporation, pp. 387-392, (1973). Manual. .” Cold Spring Harbor Laboratory, pp. 224-246 and Cormier, M.J. "Comparative Biochemistry of Animal Sys 416-427. (1982). tems." Bioluminescence In Action, Academic Press, pp. Lehninger. A. Biochemistry. Worth Publishers, New York, 75-108. (1978). p. 718. (1970). Cormier, M.J. et al., "Evidence for Similar biochemical Cormier, M.J., et al., "Bioluminescence: Recent Advances." Requirements for Bioluminescence among the Coelenter Annual Review of Biochemistry, vol. 44, pp. 255-271, ates." Journal of Cellular Physiology, vol. 81, No. 2, pp. (1975). 291-297. (Apr. 1973). Prasher, D.C. et al., “Sequence Comparisons of cDNAs Patel, A. et al., "A new chemiluminescent label for use in Encoding for Aequorin Isotypes." Biochemistry 26: 1326 immunoassay." Biochemical Society Transactions, vol. 10, (1987). pp. 224-225. (1982). Cormier, M.J. et al., "Cloning of the Apoaequorin cDNA and Oi, V.T. et al., "Fluorescent Phycobiliprotein Conjugates for Its Expression in E. coli." Report to Roche Diagnostics, Analyses of Cells and Molecules." The Journal of Cell (Oct. 26, 1984). , vol. 93, pp. 981-986. (Jun. 1982). Lehninger, Biochemistry, Woth Publishers, New York, pp. Simpson, J.S.A. et al., "A stable chemiluminescent-labelled 69-71 (1970). antibody for immunological assays." Nature, vol. 279. No. Erwin et al. Biochemistry, vol. 22, p. 4856 (1983). 14, pp. 646-647. (Jun. 1979). Wallace et al. Nucleic Acids Res... vol. 6, p. 3543 (1979). U.S. Patent Jun. 16, 1998 Sheet 1 of 4 5,766,941

FIGURE 1 U.S. Patent Jun. 16, 1998 Sheet 2 of 4 5,766,941

Sad

Il pullH pullH

My OE

N Vs N L s IHUD Say U.S. Patent Jun. 16, 1998 Sheet 3 of 4 5,766,941

?-or-le-Dubnb.) NIonbg vodvillard

S & 3 St 3 se S S g s

s

8 &s S. & R S 9 s se S (o-o-01-10 Dillonb.) NIOOOgrod 3MIrv

5,766,941 1. 2 RECOMBINANT ONA WECTORS CAPABLE relationship is such that the amino acid sequence of the OF EXPRESSINGAPOAEQUORN protein is reflected in the nucleotide sequence of the DNA. There are one or more trinucleotide sequence groups spe This application is a Continuation of Ser. No. 08/346, cifically related to each of the twenty amino acids most 5 commonly occuring in proteins. The specific relationship 379, filed Nov. 29, 1994, which is a Continuation of Ser. No. between each given trinucleotide sequence and its corre O7/960,195 filed Nov. O9, 1992, now U.S. Pat. No. 5,422, sponding amino acid constitutes the genetic code. The 266, which is a Continuation of Ser. No. 071569,362 filed genetic code is believed to be the same or similar for all Aug. 13, 1990, now abandoned, which is a Continuation of living organisms. As a consequence, the amino acid Ser. No. 07/165.422 filed Feb. 29, 1988, now abandoned, sequence of every protein or peptide is reflected by a which is a Continuation of Ser. No. 06/942,273, filed Dec. corresponding nucleotide sequence, according to a well 15, 1986, now abandoned, which is a Continuation-In-Part understood relationship. Furthermore, this sequence of of Ser. No. 06/687.903 filed Dec. 31, 1984, now abandoned. nucleotides can, in principle, be translated by any living BACKGROUND OF THE INVENTION organism. 5 1. Field of the Invention TABLE 1. This invention relates to the field of genetic engineering and more particularly to the insertion of genes for the protein GENETIC CODE apoaequorin into recombinant DNA vectors and to the Phenylalanine (Phe) TTK Histidine (His) CAK production of apoaequorin in recipient strains of microor Leucine (Leu) XTY Glutamine (Gln) CAU ganisms. Isoleucine (le) AM Asparagine (Asn) AAK Methionine (Met) ATG Lysine (Lys) AA 2. Description of the Background Waline (Wal) GTL Aspartic acid (Asp) GAK Apoaequorin is a single polypeptide chain protein which Serine (Ser) QRS Glutamic acid (Glu) GAJ Proline (Pro) CCL Cysteine (Cys) TGK can be isolated from the luminous Aequorea Vic Threonine (Thr) ACL Tryptophan (Try) TGG toria. When this protein contains one molecule of coelenter 25. Alanine (Ala) GCL Arginine (Arg) WGZ. ate luciferin bound non-covalently to it, it is known as Tyrosine (Tyr) TAK Glycine (Gly) GGL aequorin. Aequorin is oxidized in the presence of calcium Termination signal TAJ ions to produce visible light. Once light is produced, the Termination signal TGA spent protein (apoaequorin) can be purified from the oxi Key: dized luciferin and subsequently recharged using natural or 30 Each 3-letter triplet represents a trinucleotide of DNA having a 5' end on the synthetic luciferin under appropriate conditions. The addi left and a 3' end on the right. The letters stand for the purine or pyrimidine bases forming the nucleotide sequence. tion of calcium ions to the recharged aequorin will again A = adenine result in the production of light. Apoaequorin can therefore G = guanine be used in various chemical and biochemical assays as a marker. 35 Natural apoaequorin is not a single compound but rather represents a mixture of molecular species. When pure natu ral aequorin, representing that of many thousands of indi X = T or C if Y is A or G vidual Aequorea, is subjected to electrophoresis (O. Gabriel, X = C if is C or T Methods Enzymol. (1971) 22:565-578) in alkaline buffers Y= A, G, C, or T if X is C under non-denaturing conditions, including 0.1 mM EDTA Y = A or G if X is T in all buffers, at least six distinct bands of blue luminescence W is C or A if Z is C or T W = Cif2 is C or T are visible when the gel (0.5 cmx10 cm) is immersed in Zs A, G, C, or T if Wis G 0.1M CaCl. This observation agrees with that of J. R. Z = A or G if Wis A Blinks and G. C. Harres (Fed. Proc. (1975) 34:474) who 45 QR = TC if S is A, G, C, or T observed as many as twelve luminescent bands after the QR = AG if S is T or C isoelectric focusing of a similar extract. Blinks and Harres S = A, G, C, or T if QR is TC observed more species because isoelectric focusing is S = T or C if QR is AG capable of higher resolution than is electrophoresis. The trinucleotides of Table 1. termed codons. are pre However, none of the bands was ever isolated as a pure 50 sented as DNA trinucleotides, as they exist in the genetic peptide. material of a living organism. Expression of these codons in Furthermore, it is difficult to produce sufficient aequorin protein synthesis requires intermediate formation of mes or apoaequorin from jellyfish or other natural sources to senger RNA (mRNA), as described more fully, infra. The provide the amounts necessary for use in bioluminescence mRNA codons have the same sequences as the DNA codons assays. Accordingly, an improved means for producing 55 of Table 1, except that uracil is found in place of thymine. apoaequorin in sufficient quantities for commercial utiliza Complementary trinucleotide DNA sequences having oppo tion is greatly needed. site strand polarity are functionally equivalent to the codons Recently developed techniques have made it possible to of Table 1, as is understood in the art. An important and well employ microorganisms, capable of rapid and abundant known feature of the genetic code is its redundancy, growth, for the synthesis of commercially useful proteins whereby, for most of the amino acids used to make proteins. and peptides, regardless of their source in nature. These more than one coding nucleotide triplet may be employed. techniques make it possible to genetically endow a suitable Therefore, a number of different nucleotide sequences may microorganism with the ability to synthesize a protein, or code for a given amino acid sequence. Such nucleotide peptide normally made by another organism. The technique sequences are considered functionally equivalent since they makes use of a fundamental relationship which exists in all 65 can result in the production of the same amino acid sequence living organisms between the genetic material, usually in all organisms, although certain strains may translate some DNA, and the proteins synthesized by the organism. This sequences more efficiently than they do others. 5,766,941 3 4 Occasionally, a methylated variant of a purine or pyrimidine Messenger RNA functions in the process of converting may be found in a given nucleotide sequence. Such methy the nucleotide sequence information of DNA into the amino lations do not affect the coding relationship in any way. acid sequence structure of a protein. In the first step of this It its basic outline, a method of endowing a microorgan process, termed transcription, a local segment of DNA ism with the ability to synthesize a new protein involves having a nucleotide sequence which specifies a protein to be three general steps: (1) isolation and purification (or chemi made, is copied into RNA, RNA is a polynucleotide similar cal synthesis) of the specific gene or nucleotide sequence containing the genetically coded information for the amino to DNA except that ribose is substituted for deoxyribose and acid sequence of the desired protein, (2) recombination of uracil is used in place of thymine. The nucleotide bases in the isolated nucleotide sequence with an appropriate vector, RNA are capable of entering into the same kind of base typically the DNA of a bacteriophage of plasmid, and (3) pairing relationships that are known to exist between the transfer of the vector to the appropriate microorganism and complementary strands of DNA. A and U (T) are selection of a strain of the recipient microorganism contain complementary, and G and Care complementary. The RNA ing the desired genetic information. transcript of a DNA nucleotide sequence will be comple A fundamental difficulty encountered in attempts to mentary to the copied sequence. Such RNA is termed exploit commercially the above-described process lies in the 15 messenger RNA (mRNA) because of its status as interme first step, the isolation and purification of the desired specific diary between the genetic apparatus of the cell and its genetic information. DNA exists in all living cells in the protein synthesizingd apparatus. Generally, the only mRNA form of extremely high molecular weight chains of nucle sequences present in the cell at any given time are those otides. A cell may contain more than 10,000 structural genes. which correspond to proteins being actively synthesized at coding for the amino acid sequences of over 10,000 specific that time. Therefore, a differentiated cell whose function is proteins, each gene having a sequence many hundreds of devoted primarily to the synthesis of a single protein will nucleotides in length. For the most part, four different contain primarily the RNA species corresponding to that nucleotide bases make up all the existing sequences. These protein. In those instances where it is feasible, the isolation are adenine (A). guanine (G), cytosine (C), and thymine (T). and purification of the appropriate nucleotide sequence The long sequences comprising the structural genes of 25 coding for a given protein can be accomplished by taking specific proteins are consequently very similar in overall advantage of the specialized synthesis of such protein in chemical composition and physical properties. The separa differentiated cells. tion of one such sequence from the plethora of other A major disadvantage of the foregoing procedure is that it sequences present in isolated DNA cannot ordinarily be is applicable only in the ralatively rare instances where cells accomplished by conventional physical and chemical pre can be found engaged in synthesizing primarily a single parative methods. protein. The majority of proteins of commercial intest are Two general methods have been used in the prior art to not synthesized in such a specialized way, The desired accomplish step (1) in the above-described general proce proteins may be one of a hundred or so different proteins dure. The first method is sometimes referred to as the being produced by the cells of a tissue or organism at a given shotgun technique. The DNA of an organism is fragmented 35 into segments generally longer than the desired nucleotide time. Nevertheless, the mRNA isolation technique is useful sequence. Step (1) of the above-described process is essen since the set of RNA species present in the cell usually tially by-passed. The DNA fragments are immediately represents only a fraction of the total sequences existing in recombined with the desired vector, without prior purifica the DNA, and thus provides an initial purification. tion of specific sequences. Optionally, a crude fractionation 40 In a more recent development, U.S. Pat. No. 4363.877 step may be interposed. The selection techniques of micro provides a process whereby nucleotide sequences can be bial are relied upon to select, from among all the isolated and purified even when present at a frequency as possibilities, a strain of microorganism containing the low as 2% of a heterogeneous population of mRNA sequences. Furthermore, the method may be combined with desired genetic information. The shotgun procedure suffers known methods of fractionating mRNA to isolate and purify from two major disadvantages. More importantly, the pro 45 sequences present in even lower frequency in the total RNA cedure can result in the transfer of hundreds of unknown population as initially isolated. The method is generally genes into recipient microorganisms, so that during the applicable to mRNA species extracted from virtually any experiment, new strains are created, having unknown organism and therefore provides a powerful basic tool for genetic capabilities. Therefore, the use of such a procedure the ultimate production of proteins of commercial and could create a hazard for laboratory workers and for the 50 environment. A second disadvantage of the shotgun method research interest, in useful quantities. is that it is extremely inefficient for the production of the The process takes advantage of certain structural features desired strain, and is dependent upon the use of a selection of mRNA and DNA, and makes use of certain enzyme technique having sufficient resolution to compensate for the catalyzed reactions. The nature of these reactions and struc lack of fractionation in the first step. However, methods of 55 tural details as they are understood in the prior art are overcoming these disadvantages exist, as will become described herein and are further detailed in the patent. The apparent in later sections of this application. symbols and abbreviations used herein are set forth in the The second general method takes advantage of the fact following table. that the total genetic information in a cell is seldom, if ever, expressed at any given time. In particular, the differentiated TABLE 2 tissues of higher organisms may be synthesizing only a DNA-deoxyribonucleic acid A-Adenine minor portion of the proteins which the organism is capable RNA-ribonucleic acid T-Thymine cDNA-complementary DNA G-Guanine of making at any one time. In extreme cases, such cells may (enzymatically synthesized C-Cytosine be synthesizing predominantly one protein. In such extreme from an mRNA sequence) U-Uracil cases, it has been possible to isolate the nucleotide sequence 65 mRNA-messenger RNA Tris-2-Amino-2- coding for the protein in question by isolating the corre dATP-deoxyadenosine triphosphate hydroxymethyl sponding messenger RNA from the appropriate cells. 5,766,941 S 6 association of the oligodeoxy-nucleotide primer near the 3' TABLE 2-continued end of mRNA followed by stepwise addition of the appro dGTP-deoxyguanosine triphosphate 1,3-propanediol priate deoxynucleotides, as determined by base-pairing rela dCTP-deoxycytidine triphosphate EDTA-ethylene tionships with the mRNA nucleotide sequence, to the 3' end TCA-Trichloroacetic acid diamine tetra of the growing chain. The product molecule may be dTT-thymidine acetic acid described as a hairpin structure in which the original RNA triphosphate ATP-adenosine is paired by hydrogen bonding with a complementary strand triphosphate of DNApartly folded back upon itself at one end. The DNA and RNA strands are not covalently joined to each other. In its native configuration, DNA exists in the form of 10 Reverse transcriptase is also capable of catalyzing a similar paired linear polynucleotide strands. The complementary reaction using a single-stranded DNA template. in which base pairing relationships described above exist between the case the resulting product is a double-stranded DNA hairpin paired strands such that each nucleotide base of one strand having a loop of single-stranded. DNA joining one set of exists opposite its complement on the other strand. The ends. See Aviv and Leder, Proc. Natl. Acad. Sci. USA (1972) entire sequence of one strand is mirrored by a complemen 15 69:1408 and Efstratiadis, Kafatos. Maxam, and Maniatis, tary sequence on the other strand. If the strands are separate, Cell (1976) 7:279. it is possible to synthesize a new partner strand. starting Restriction endonucleases are capable of hydro from the appropriate precursor monomers. The sequence of lyzing phosphodiester bonds in DNA, thereby creating a addition of the monomers starting from one end is deter break in the continuity of the DNA strand. If the DNA is in mined by, and complementary to, the sequence of the the form of a closed loop, the loop is converted to a linear original intact polynucleotide strand, which thus serves as a structure, The principal feature of a restriction enzyme is template for the synthesis of this complementary partner. that its hydrolytic action is exerted only at a point where a The synthesis of mRNA corresponding to a specific nucle specific nucleotide sequence occurs. Such a sequence is otide sequence of DNA is understood to follow the same termed the restriction site for the restriction endonuclease. basic principle. Therefore a specific mRNA molecule will 25 Restriction endonuclease from a variety of sources have have a sequence complementary to one strand of DNA and been isolated and characterized in terms of the nucleotide identical to the sequence of the opposite DNA strand, in the sequence of their restriction sites. When acting on double region transcribed. Enzymatic mechanisms exist within liv stranded DNA, some restriction endonucleases hydrolyze ing cells which permit the selective transcription of a the phosphodiester bonds on both strands at the same point, particular DNA segment containing the nucleotide sequence 30 producing blunt ends. Others catalyze hydrolysis of bonds for a particular protein. Consequently, isolating the mRNA separated by a few nucleotides from each other, producing which contains the nucleotide sequence coding for the free single-stranded regions at each end of the cleaved amino acid sequence of a particular protein is equivalent to molecule. Such single-stranded ends are self the isolation of the same sequence, or gene, from the DNA complementary, hence cohesive. and may be used to rejoin itself. If the mRNA is retranscribed to form DNA comple 35 the hydrolyzed DNA. Since any DNA susceptible to cleav mentary thereto (cDNA), the exact DNA sequence is thereby age by such an enzyme must contain the same recognition reconstituted and can by appropriate techniques, be inserted site, the same cohesive ends will be produced, so that it is into the genetic material of another organism. The two possible to join heterogeneous sequences of DNA which complementary versions of a given sequence are therefore have been treated with restriction endonuclease to other inter-convertible and functionally equivalent to each other. sequences similarly treated. See Roberts. Crit. Rey, Bio The nucleotide subunits of DNA and RNA are linked chem. (1976) 4:123. together by phosphodiester bonds between the 5' position of It has been observed that restriction sites for a given one nucleotide sugar and the 3' position of its next neighbor. enzyme are relatively rare and are nonuniformly distributed. Reiteration of such linkages produces a linear polynucle Whether a specific restriction site exists within a given otide which has polarity in the sense that one end can be 45 segment is a matter which must be empirically determined. distinguished from the other. The 3' end may have a free However, there is a large and growing number of restriction 3'-hydroxyl, or the hydroxyl may be substituted with a endonucleases, isolated from a variety of sources with varied phosphate or a more complex structure. The same is true of site specificity, so that there is a reasonable probability that the 5' end. In eucaryotic organisms, i.e., those having a a given segment of a thousand nucleotides will contain one defined nucleus and mitotic apparatus, the synthesis of 50 functional mRNA usually includes the addition of polyade or more restriction sites. nylic acid to the 3' end of the mRNA. Messenger RNA can For general background see Watson, J. D. The Molecular therefore be separated from other classes of RNA isolated Biology of the Gene, 3d Ed. Benjamin, Menlo Park, Calif., from an eucaryotic organism by column chromatography on (1976); Davidson. J. N. The Biochemistry of the Nucleic cellulose to which is attached polythymidylic acid. See Aviv 55 Acids, 8th Ed. Revised by Adams, R. L. P. Burdon, R. H., and Leder, Proc. Natl. Acad. Sci. USA, (1972) 69:1408. Campbell. A. M. and Smellie, R. M. S. Academic Press, Other chromatorgraphic methods, exploiting the base New York. (1976) and Hayes, W. The Genetics of Bacteria pairing affinity of poly A for chromatographic packing and Their Wiruses, Studies in Basic Genetics and Molecular materials, containing oligo dT. poly U, or combinations of Biology, 2d Ed., Blackwell Scientific Pub., Oxford (1968). poly T and poly U, for example, poly U-Sepharose, are SUMMARY OF THE INVENTION likewise suitable. Reverse transcriptase catalyzes the synthesis of DNA Accordingly, it is an object of this invention to provide a complementary to an RNA template strand in the presence microorganism capable of providing useful quantities of of the RNA template, a primer which may be any comple apoaequorin. mentary oligo or polynucleotide having a 3'-hydroxyl, and 65 It is a further object of this invention to provide a the four deoxynucleotide triphosphates. dATP, dGTP, dCTP. recombinant DNA vector capable of being inserted into a and dTTP. The reaction is initiated by the non-covalent microorganism and expressing apoaequorin. 5,766,941 7 8 It is still another object of this invention to provide a DNA GCL (ACL or QRS GAK. GAJ, or TGKlo GAJ segment of defined structure that can be produced syntheti XTY GAJ or AAJ AAJ or WGZTAK, GCL or cally or isolated from natural sources and that can be used QRSls AAJAAKoo (CAJ or GAJo ATM or CCLo in the production of the desired recombinant DNA vectors. ACLo XTYo ATMs WGZo ATM or XTYo, It is yet another object of this invention to provide a TGGos GGLo GAKo GCL XTY TTK1 GAK peptide that can be produced synthetically in a laboratory or ATMs ATM or GTL) GAK AAJs GAK CAJo by microorganism that will mimic the activity of natural AAK2 GGL122 GCL 12 ATM12 ACL or QRS.12s apoaequorin. XTY QRS O GAK 127 GAJ12s TGG2 AAJiao GCL13 TAK132 ACL133 AAJ QRS or GCL 135 GCL16 GGL137 These and other objects of the invention as will herein ATMs ATM CAJo ACL or QRS QRS GAJ after become more readily apparent have been accomplished O GAJ or GAKTGK4s GAJ146 GAJ47 ACL14s TTKs by providing a homogeneous pepetide selected from (1) WGZso GTLs TGKs GAKsa ATMs GAKs GAJ156 compounds of QRS or AAKs. GGL1ss CAJ1soXTYo GAK16GTL162 (a) a first formula GAK163 GAJ164 ATG16s ACL166 WGZ167 CAJ16s CAK169 WKLTPDF N N P K W G R H K HMF NFL DV 15 XTYo GGL171 TTK2 TGG 173 TAK174 ACL-17s ATG176 NHN GKISL DEMW YKASDIVIN NL GA GAK77 CCL17s GCL79 TGKlso GAJ1s AAJ182 XTYs TPE Q A K R H K DAV EAFF G G A GM KYG V TAK, GGLuss GGL1ss GCL187 GTLss CCL1so ETD WPAYE G WKKLATD ELEKYAKN wherein QIT LIRI W G D ALFD I I DKD QN GA IT A is deoxyadenyl, L S E W KAY T K S A G II QT S E E CE ETF R V CD I DES GQL DVD E M T R Q H L G F W YT 20 G is deoxyguanyl. MD PACE KLY G GAWP-COOH C is deoxycytosyl, T is deoxythymidyl. wherein A is alanine. C is cysteine, D is aspartate. E is glutamate, F is phenylalanine. G is glycine, H is J is A or G; histidine. I is isoleucine, K is lysine, L is leucine, M is K is T or C: methionine. N is asparagine. P is proline. Q is 25 L is A, T, C, or G; glutamine. R is arginine. S is serine, T is threonine. V M is A, C, or T. is valine. W is tryptonphan, and Y is tyrosine, X is T or C. if the succeeding Y is A or G, and C if the (b) a second formula in which P is replaced by S. N is succeeding Y is C or T: replaced by D. K is replaced by R. Ko is replaced by Y is A. G. C. or T. if the preceding X is C, and A or G if R. E. is replaced by G. As is replaced by D. Ds is the preceding X is T. replaced by E. As is replaced by E. Kiss is replaced by W is C or A. if the succeeding Z is G or A. and C if the R.T. is replaced by S. Do is replaced by C or E, Qos succeeding Z is C or T; is replaced by K. K. is replaced by R. As is replaced Z is A. G. C. or T. if the preceding Wis C. and A or G if by S. Qo is replaced by E. Io is replaced by P. Io 35 the preceding W is A; is replaced by L. It is replaced by W.T.s is replaced QR is TC. if the succeeding S is A. G. C. or T. and AG if by S. S. is replaced by D. Sas is replaced by A.T the succeeding S is T or C; is replaced by S. E is replaced by D, or S is S is A. G. C. or T, if the preceding QR is TC. and T or C replace by N in said first formula wherein subscript if the preceding QR is AG; and numbers refer to the amino acid position numbered subscript numerals refer to the amino acid position in from the amino terminal of said first formula, apoaequorin for which the nucleotide sequence corre (c) a third formula in which from 1 to 15 amino acids are sponds according to the genetic code, the amino acid absent from either the amino terminal, the carboxy positions being numbered from the amino end, or a terminal, or both terminals of said first formular or said nucleotide sequence coding a peptide previously second formula, or 45 mentioned, are also provided for use in carrying out (d) a fourth formula in which from 1 to 10 additional preferred aspects of the invention relating to the pro amino acids are attached sequentially to the amino duction of such pepetides by the techniques of genetic terminal. carboxy terminal, or both terminals of said engineering. first formula or said second formula and BRIEF DESCRIPTION OF THE FIGURES (2) salts of compounds having said formulas, wherein said 50 peptide is capable of binding coelenterate luciferin and The accompanying figures and drawings are provided to emitting light in the presence of Ca". demonstrate the results obtained in the specific examples DNA molecules, recombinant DNA vectors, and modified which illustrate the invention but are not considered to be microorganisms comprising a nucleotide sequence GTL. limiting thereof. AAJ XTY ACL (CCL or QRS GAKTTK (AAK or 55 FIG. 1 is a photograph showing an autoradiographic GAKls AAK CCL (AAJ or WGZ TGG ATM analysis of in vitro translated proteins using poly(A)RNA GGL WGZs CAKAAJ, CAKs ATGTTKAAK isolated from Aequorea jellyfish. The translation was per TTK22 XTY2 GAK GTL.2s AAK CAK27 AAK28 formed in the absence (lane 1) of presence (lane 3) of GGL AAJ or WGZ). ATM QRS XTY GAK, Aequorea poly(A)RNA. The anti-aequorin immunoprecipi GAJATG GTL, TAK38AAJ GCLo QRS GAK tated proteins from the two reactions were applied to lanes ATM GTL. ATMAAKAAK XTY GGL GCLo 2 and 4. respectively. On the right are marked the positions ACLs CCL52 GAJ3 CAJs GCLss AAJss WGZs, CAKss of the protein molecular weight standards phosphorylase b. AAJo GAK GCL GTL (GAJ or GGL), GCL or BSA, ovalbumin, carbonic anhydrase, SBT1 and lysozyme. GAKls TTKs TTK GGL6, GGL6 GCL GGLo The position of native aequorin is also indicated. ATG, AAJ TAK GGL, GTLs GAJ GAK or 65 FIG. 2 is a restriction map of a gene isolated from an GAJIs TGG. CCLso GCL or GAJls TAKs ATMs Aqueous victoria jellyfish that contains a DNA sequence GAJs GGLss TGGss AAJs AAJ or WGZiss XTYsg coding for apoaequorin. 5,766,941 9 10 FIG. 3 is a graph of time- and oxygen-dependent formu Tris, pH 7.5 and 100 mM KCL. The elution positions of lation of Ca'-dependent photoprotein activity in paEQ1 various molecular weight markers are indicated. extracts. Conditions used: (a) In curves 1 and 2, 0.5 ml aliquotes of the active fractions were made 2 mM in DESCRIPTION OF THE PREFERRED (3-mercaptoethanol and 0.1 mM in coelenterate luciferin and EMBODIMENTS incubated at 40° for the times indicated. At appropriate time The present inventor has obtained for the first time intervals. 5 ul aliquotes were removed and assayed for recombinant DNA vectors capable of expressing the protein photoprotein activity. (b) In curve 2. dissolved Olevels apoaequorin in a microorganism and has additionally iden were reduced by bubbling with Ar gas and the mixture tified for the first time the amino acid sequence of exposed to oxygen at the time indicated. (c) In curve 3, 10 apoaequorin, thereby providing access to homogeneous native apoaequorin was used in the incubation mixture in apoaequorin. Using this information a variety of recombi place of the pAEQ1 extract. nant DNA vectors capable of providing homogeneous FIG. 4 is a graph of a gel filtration profile of the Ca"- apoaequorin in reasonable quantities are obtained. Addi dependent photoprotein activity generated from pAEQ1 tional recombinant DNA vectors can be produced using extracts. Partially purified apoaequorin activity from pAEQ1 5 standard techniques of recombinant DNA technology. A extracts were used to generate Ca"-dependent photoprotein transformant expressing apoaequorin has also been pro activity as described in FIG. 3. This photoprotein fraction duced as an example of this technology. (50 ul) was then placed on a G-75-40 superfine column (30.7 The amino acid sequence of a typical molecule of apoae ml bed volume) equilibrated with 10 mM EDTA, 15 mM quorin is shown in Table 3.

5,766,941 13 14 Since there is a known and definite correspondence between amino acids in a peptide and the DNA sequence that TABLE 4-continued codes for the peptide, the DNA sequence of a DNA or RNA Nucleotide sequence of one strand of apoaequorin DNA. The numbers molecule coding for apoaequorin (or any of the modified refer to the amino acid sequence and corresponding DNA codon peptides later discussed) can.readily be derived from this sequence beginning at the amino terminus of the protein. The DNA amino acid sequence, and such a sequence of nucleotides is sequence corresponds to the mRNA sequence except that U replaces T shown in Table 4. in the mRNA ACL or GAK, GAJ, GAJ XTY (GA or TABLE 4 QRS or TGK) AAJ O 100 Nucleotide sequence of one strand of apoaequorin DNA. The numbers Lys or Arg Tyr Ala or Ser Lys As refer to the amino acid sequence and corresponding DNA codon AA or TAK GCL or QRS AAJ AAK sequence beginning at the amino terminus of the protein. The DNA WGZ) sequence corresponds to the mRNA sequence except that U replaces T OS in the mRNA. Gin or Giu le or Pro r Leu le 5 CAJ or ATM or CCL ACL XTY ATM 5 GAJ O Wall Lys Leu Thr Pro or Arg le or Leu Try Gly Asp Ser WG2. ATM or XIY TGG GGL GAK GT AA XY ACL CCL or 115 QRS Ala Leu Pie Asp le O GCL XY TITK GAK AM Asp Phe Asin or Asp Asn Po 12O GAK TIK AAK or GAKAAK CCL Ile or Wall Asp ys Asp Gn 15 ATM or GAK AA GAK CA Lys or Try le Gly Arg GTL 125 Arg Asia Gly Ala le Thr or AA or TGG ATM GGL WGZ Ser WGZ) 25 AAK GGL GCL ATM ACL 20 or QRS His Lys His Met Phe 30 CAK AA CAK ATG TTK Leu Ser or Asp Gla Try Lys 25 XTY QRS or GAK GAJ TGG AAJ Asn Phe Leu Asp Wal 135 AAK TIK XTY GAK GL 30 Ala Tyr Thir Lys Ser or 30 Ala As His As Gly Lys or GCL TAK ACL AA QRS or Arg GCL AAR CAK AAK GCL AA or 140 WGZ) Ala Gly le le Gn 35 35 GCL GGL ATM ATM CA e Ser Leu Asp Gu 145 ATM QRS XTY GAK GA Thr or Ser Ser Gu Glu or Asp) Cys 40 ACL or QRS GAU GAJ or GAK TGK Met Wa Tyr Lys Ala QRS) ATG GTL TAK AA GCL 150 45 Gl Gu Thr Phe Arg Ser Asp e Wall Ile 40 GAJ GA ACL K WGZ. QRS GAK ATM GTL ATM 55 50 Wai Cys Asp le Asp Asn Asn Leu Gly Ala GTL TGK GAK ATM GAK AAK AAK XY GGL GC 160 55 Gl Ser or Asn Gly Gn Leu Thr Po Gu. Gn Ala 45 GA QRS or AAK GGL CA XY ACL CC GA CA GCL 165 60 Asp Wall Asp Glu Met Lys Arg His Lys Ap GAK GTTL GAK GA ATG AAJ WGZ. CAK AA GAK O 65 Thr Arg G His Leu Ala Wa Ghu or Gly Ala or Asp Phe 50 ACL WGZ. CA CAR XIY GCL GTL GAJ or GGL) (GCL or GAK TTK 175 70 Gly Phe Try Tyr Thr Phe Gly Gly Ala Gly GG TTK TGG TAK ACL TTK GG GGL GC GGL 18O 75 Met Asp Pro Ala Cys Met Lys Tyr Gly Wall 55 ATG GAK CCL GCL GK ATG AA TAK GGL GTL 185 80 Gu Lys Leu Tyr Gly Gu Thr Asp or Glu Try Pro GA AA XY TAK GGL GA ACL GAK or GAJ TGG CCL 89 85 Gly Ala Wa Pro-COOH Ala or Glu Tyr le Glu Gly GGL GCL GL CCL GCL or TAK AM GAJ GG GAJ 90 Since the DNA sequence of the gene has been fully Try Lys Lys or Arg Leu Ala GG AA AAJ or WGZ XTY GCL identified, it is possible to produce a DNA gene entirely by 95 synthetic chemistry, after which the gene can be inserted into Thr or Ser Asp, Glu, Gu Leu Glu or 65 any of the many available DNA vectors using known or Cys) Lys techniques of recombinant DNA technology. Thus the present invention can be carried out using reagents, plas 5,766,941 15 16 mids. and microorganism which are freely available and in cium ions. Examples of this process are described later in the public domain at the time of filing of this patent detail. If light is emitted, the replacement is immaterial, and application. the molecule being tested is equivalent to those of Table 3. For example, nucleotide sequences greater than 100 bases Peptides in which more than one replacement has taken long can be readily synthesized on an Applied Biosystems place can readily be tested in the same manner. Model 380A DNA Synthesizer as evidenced by commercial DNA molecules that code for such peptides can readily be advertising of the same (e.g., Genetic Engineering News. determined from the list of codons in Table 1 and are November/December 1984, p. 3). Such obigonucleotides likewise contemplated as being equivalent to the DNA can readily be spliced using, among others, the techniques sequence of Table 4. In fact, since there is a fixed relation described later in this application to produce any nucleotide 10 ship between DNA codons and amino acids in a peptide, any sequence described herein. discussion in this application of a replacement or other Furthermore, automated equipment is also available that change in a peptide is equally applicable to the correspond makes direct synthesis of any of the peptides disclosed ing DNA sequence or to the DNA molecule. recombinant herein readily available. In the same issue of Genetic Engi vector, or transformed microorganism in which the sequence neering News mentioned above, a commercially available 15 is located (and vice versa). automated peptide synthesizer having a coupling efficiency In addition to the specific nucleotides listed in Table 4. exceeding 99% is advertised (page 34). Such equipment DNA (or corresponding RNA) molecules of the invention provides ready access to the peptides of the invention, either can have additional nucleotides preceeding or following by direct synthesis or by synthesis of a series of fragments those that are specifically listed. For example, poly Acan be that can be coupled using other known techniques. 20 added to the 3'-terminal. short (e.g., fewer than 20 nucle In addition to the specific peptide sequences shown in otides) sequence can be added to either terminal to provide Table 3 other peptides based on these sequences and rep a terminal sequence corresponding to a restriction endonu resenting minor variations thereof will have the biological clease site, stop codons can follow the peptide sequence to activity of apoaequorin. For example, up to 15 amino acids terminate transcription, and the like. Additionally, DNA 25 molecules containing a promoter region or other control can be absent from either or both terminals of the sequence region upstream from the gene can be produced. All DNA given without losing luciferin and calcium binding ability. molecules containing the sequences of the invention will be Likewise. up to 10 additional amino acids can be present at useful for at least one purpose since all can minimally, be either or both terminals. These variations are possible fragmented to produce oligonucleotide probes and be used because the luciferin and calcium binding sites involve the amino acids in the middle of the given sequences. For 30 in the isolation of additional DNA from biological sources. example, the luciferin binding site appears to involve amino Peptides of the invention can be prepared for the first time acids 40-100. Since the terminals are relatively unimportant as homogeneous preparations, either by direct synthesis or for biological activity, the identity of added amino acids is by using a cloned gene as described herein. By “homoge likewise unimportant and can be any of the amino acids neous" is meant, when referring to a peptide or DNA mentioned herein. 35 sequence. that the primary molecular structure (i.e., the sequence of amino acids or nucleotides) of substantially all Experimental data is available to verify that added amino molecules present in the composition under consideration is acids at the amine terminal do not have a significant effect identical. The term "substantially" as used in the preceeding on bioluminescence. Nevertheless, preferred compounds are sentence preferably means at least 95% by weight, more those which more closely approach the specific formulas preferably at least 99% by weight, and most preferably at given with 10 or fewer, more preferably 5 or fewer, absent least 99.8% by weight. The presence of fragments derived amino acids being preferred for either terminal and 7 or from entire molecules of the homogeneous peptide or DNA fewer, more preferably 4 or fewer, additional amino acids sequence, if present in no more than 5% by weight, prefer being preferred for either terminal. ably 1% by weight, and more preferably 0.2% by weight, is Within the central portion of the molecule, replacement of 45 not to be considered in determining homogenity since the amino acids is more restricted in order that biological term "homogeneous" relates to the presence of entire activity can be maintained. However, all of the points of moleucles (and fragments thereof) have a single defined microheterogenity shown in Table 3 or Table 4 represent structure as opposed to mixtures (such as those that occur in biologically functional replacements and any combination natural apoaequorin) in which several molecules of similar of the indicated replacements will represent a functional 50 molecular weight are present but which differ in their molecule. In both Tables, the main line (Table 3) or first primary molecular structure. The term "isolated" as used entry (Table 4) represents the more prevelant amino acid or herein refers to pure peptide. DNA, or RNA separated from nucleotide for that location and is preferred. other peptides, DNAs, or RNAs, respectively, and being In addition minor variations of the previously mentioned found in the presence of (if anything) only a solvent, buffer, peotides and DNA molecules are also contemplated as being 55 ion or other component normally present in a biochemical equivalent to those peptides and DNA molecules that are set solution of the same. “Isolated" does not encompass either forth in more detail, as will be appreciated by those skilled natural materials in their native state or natural materials that in the art. For example. it is reasonable to expect that an have been separated into components (e.g. in an acylamide isolated replacement of a leucine with an isoleucine or gel) but not obtained either as pure substances or as solu valine, an aspartate with a glutamate, a threonine with a tions. The term "pure" as used herein preferably has the serine, or a similar replacement of an amino acid with a same numerical limits as "substantially" immediately above. structurally related amino acid will not have a major effect The phrase "replaced by" or "replacement" as used herein on the biological activity of the resulting molecule, espe does not necessarily refer to any action that must take place cially if the replacement does not involve an amino acid at but to the peptide that exists when an indicated "replace a binding site. Whether a change results in a functioning 65 ment" amino acid is present in the same position as the peptide can readily be determined by incubating the result amino acid indicated to be present in a different formula ing peptide with a luciferin followed by contact with cal (e.g., when serine is present at position 5 instead of proline). 5,766,941 17 18 Salts of any of the peptides described herein will naturally because these enzymes are capable of hydrolytic cleavage of occur when such peptides are present in (or isolated from) the RNA nucleotide sequence. A suitable method for inhib aqueous solutions of various pHs. All salts of peptides iting RNase during extraction from cells involves the use of having the indicated biological activity are considered to be 4M guanidium thiocyanate and 1M mercaptoethanol during within the scope of the present invention, Examples include the cell disruption step. In addition, a low temperature and alkali, alkaline earth, and other metal salts of carboxylic acid a pH near 5.0 are helpful in further reducing RNase degra residues, acid addition salts (e.g., HCl) of amino residues. dation of the isolated RNA. and Zwitter ions formed by reactions between carboxylic Generally, mRNA is prepared essentially free of contami acid and amino residues within the same molecule. nating protein, DNA polysaccharides and lipids. Standard The invention has specifically contemplated each and 10 methods are well known in the art for accomplishing such every possible variation of peptide or nucleotide that could purification. RNA thus isolated contains non-messenger as be made by selecting combinations based on the possible well as messenger RNA. A convenient method for separat amino acid and codon choices listed in Table 3 and Table 4. ing the mRNA of eucaryotes is chromatography on columns and all such variations are to be considered as being spe of oligo-dT cellulose, or other oligonucleotide-substituted cifically disclosed. 15 column material such as poly U-Sepharose, taking advan In a preferred embodiment of the invention, genetic tage of the hydrogen bonding specificity conferred by the information encoded as mRNA is obtained from Aequorea presence of polyadenylic acid on the 3' end of eucaruotic jellyfish and used in the construction of a DNA gene, which mRNA. is in turn used to produce a peptide of the invention. The next step in most methods is the formation of DNA It is preferred to use a cell extract from the light emitting commplementary to the isolated heterogeneous sequences of organs of an Aequoria jellyfish as a source of mRNA, mRNA. The enzyme of choice for this reaction is reverse although a whole body cell extract may be used. Typically, transcriptase, although in principle any enzyme capable of a jellyfish or parts thereof is cut into small pieces (minced) forming a faithful complementary DNA copy of the mRNA and the pieces are ground to provide an initial crude cell template could be used. The reaction may be carried out suspension. The cell suspension is sonicated or otherwise 25 under conditions described in the prior art, using mRNA as treated to disrupt cell membranes so that a crude cell extract a template and a mixture of the four deoxynucleoside is obtained. Known techniques of biochemistry (e.g., pref triphosphates. dATP, dGTP, dCTP and dTTP, as precursors erential precipitation of proteins) can be used for initial for the DNA strand. It is convenient to provide that one of purification if desired. The crude cell extract, or a partially the deoxynucleoside triphosphates be labeled with a radio purified RNA portion therefrom, is then treated to further 30 isotope, for example p in the alpha position, in order to separate the RNA. For example, crude cell extract can be monitor the course of the reaction. to provide a tag for layered on top of a 5 ml cushion of 5.7M CsCl, 10 mM recovering the product after separation procedures such as Tris-HCl, pH 7.5, 1 mM EDTA in a 1 in.x3% in, nitrocel chromatography and electrophoresis, and for the purpose of lulose tube and centrifuged in an SW27 rotor (Beckman making quantitative estimates of recovery. See Efstratiadis, Instruments Corp., Fullerton, Calif.) at 27,000 rpm for 16 hrs 35 A. et al, Supra. at 15° C. After centrifugation, the tube contents are The cDNA transcripts produced by the reverse tran decanted, the tube is drained, and the bottom y2 cm con scriptase reaction are somewhat heterogeneous with respect taining the clear RNA pellet is cut off with a razorblade. The to sequences at the 5' end and the 3' end due to variations in pellets are transferred to a flask and dissolved in 20 ml 10 the initiation and termination points of individual tran mM Tris-HCl, pH 7.5, 1 mM EDTA, 5% sarcosyl and 5% scripts, relative to the mRNA template. The variability at the phenol. The solution is then made 0.1M in NaCl and shaken 5' end is thought to be due to the fact that the oligo-dT primer with 40 ml of a 1:1 phenol:chloroform mixture. RNA is used to initiate synthesis is capable of binding at a variety of precipitated from the aqueous phase with ethanol in the loci along the polyadenylated region of the mRNA. Synthe presence of 0.2M Na-acetate pH 5.5 and collected by sis of the cDNA transcript begins at an indeterminate point centrifugation. Any other method of isolating RNA from a 45 in the poly-A region, and variable length of poly-A region is cellular source may be used instead of this method. transcribed depending on the inital binding site of the Various forms of RNA may be employed such as poly oligo-dT primer. It is possible to avoid this indeterminacy by adenylated, crude or partially purified messenger RNA. the use of a primer containing, in addition to an oligo-dt which may be heterogeneous in sequence and in molecular tract, one or two nucleotides of the RNA sequence itself, size. The selectivity of the RNA isolation procedure is 50 thereby producing a primer which will have a preferred and enhanced by any method which results in an enrichment of defined binding site for initiating the transcription reaction. the desired mRNA in the heterodisperse population of The indeterminacy at the 3'-end of the cDNA transcript is mRNA isolated. Any such prepurification method may be due to a variety of factors affecting the reverse transcriptase employed in preparing a gene of the present invention. 55 reaction, and to the possiblity of partial degradation of the provided that the method does not introduce endonucleolytic RNA template. The isolation of specific cDNA transcripts of cleavage of the mRNA. maximal length is greatly facilitated if conditions for the Prepurification to enrich for desired mRNA sequences reverse transcriptase reaction are chosen which not only may also be carried out using conventional methods for favor full length synthesis but also repress the synthesis of fractionating RNA, after its isolation from the cell. Any small DNA chains. Preferred reaction conditions for avian technique which does not result in degradation of the RNA myeloblastosis virus reverse transcriptase are given in the may be employed. The techniques of preparative sedimen examples section of U.S. Pat. No. 4363.377 and are herein tation in a sucrose gradient and gel electrophoresis are incorporated by reference. The specific parameters which especially suitable. may be varied to provide maximal production of long-chain The mRNA must be isolated from the source cells under 65 DNA transcripts of high fidelity are reaction temperature, conditions which preclude degradation of the mRNA. The salt concentration. amount of enzyme, concentration of action of RNase enzymes is particularly to be avoided primer relative to template, and reaction time. 5,766,941 19 20 The conditions of temperature and salt concentration are cleavage may be used as probes capable of identifying the chosen so as to optimize specific base-pairing between the synthesis of the desired protein in an appropriate in vitro oligo-dT primer and the polyadenylated portion of the RNA protein synthesizing system. Alternatively, the mRNA may template. Under properly chosen conditions, the primer will be purified by affinity chromatography. Other techniques be able to bind at the polyadenylated region of the RNA 5 which may be suggested to those skilled in the art will be template, but non-specific initiation due to primer binding at appropriate for this purpose. other locations on the template, such as short, A-rich The number of restriction enzymes suitable for use sequences, will be substantially prevented. The effects of depends upon whether single-stranded or double-stranded temperature and salt are interdependent. Higher tempera cDNA is used. The preferred enzymes are those capable of tures and low salt concentrations decrease the stability of 10 acting on single-stranded DNA, which is the immediate specific base-pairing interactions. The reaction time is kept reaction product of mRNA reverse transcription. The num as short as possible, in order to prevent non-specific initia ber of restriction enzymes now known to be capable of tions and to minimize the opportunity for degradation. acting on single-stranded DNA is limited. The enzymes Reaction times are interrelated with temperature, lower Hael II. HhaI and Hincf)I are presently known to be suitable. temperatures requiring longer reaction times. At 42° C. In addition. the enzyme Mbo I may act on single-stranded reactions ranging from 1 min. to 10 minutes are suitable. The 15 DNA. Where further study reveals that other restriction primer should be present in 50 to 500-fold molar excess over enzymes can act on single-stranded DNA, such other the RNA template and the enzyme should be present in enzymes may appropriately be included in the list of pre similar molar excess over the RNA template. The use of ferred enzymes. Additional suitable enzymes include those excess enzyme and primer enhances Initiation and cDNA specified for double-stranded cDNA. Such enzymes are not chain growth so that long-chain cDNA transcripts are pro preferred since additional reactions are required in order to duced efficiently within the confines of the sort incubation produce double-stranded cDNA, providing increased oppor times. tunities for the loss of longer sequences and for other losses In many cases, it will be possible to further purify the due to incomplete recovery. The use of double-stranded cDNA using single-stranded cDNA sequences transcribed 25 cDNA presents the additional technical disadvantages that from mRNA. However, as discussed below, there may be subsequent sequence analysis is more complex and labori instances in which the desired restriction enzyme is one ous. For these reasons, single-stranded cDNA is preferred, which acts only on double-stranded DNA. In these cases, the but the use of double-stranded DNA is feasible. In fact, the cDNA prepared as described above may be used as a present invention was initially reduced to practice using template for the synthesis of double-stranded DNA, using a double-stranded cDNA. DNA polymerase such as reverse transcriptase and a The cDNA prepared for restriction endonuclease treat nuclease capable of hydrolyzing single-stranded DNA. ment may be radioactively labeled so that it may be detected Methods for preparing double-stranded DNA in this manner after subsequent separation steps. A preferred technique is to have been described in the prior art. See. for example. incorporate a radioactive label such as 'p in the alpha Ullrich, A., Shine. J., Chirgwin, J. Pictet. R. Tischer. E. 35 position of one of the four deoxynucleoside triphosphate Rutter, W.J. and Goodman, H. M. Science (1977) 196:1313. precursors. Highest activity is obtained when the concen If desired, the cDNA can be purified further by the process tration of radioactive precursor is high relative to the con of U.S. Pat. No. 4363.877, although this is not essential. In centration of the non-radioactive form. However, the total this method, heterogeneous cDNA, prepared by transcrip concentration of any deoxynucleoside triphosphate should tion of heterogeneous mRNA sequences, is treated with one 40 be greater than 30 M. in order to maximize the length of or two restriction endonucleases. The choice of endonu cDNA obtained in the reverse transcriptase reaction. See clease to be used depends in the first instance upon a prior Efstratiadis. A. Maniatis. T. Kafatos, F. C., Jeffrey, A., and determination that recognition sites for the enzyme exist in Vournakis, J. N. Cell, (1975) 4:367. For the purpose of the sequence of the cDNA to be isolated. The method determining the nucleotide sequence of cDNA, the 5' ends depends upon the existence of two such sites. If the sites are 45 may be conveniently labeled with 'pin a reaction catalyzed identical, a single enzyme will be sufficient. The desired by the enzyme polynucleotide kinase. See Maxam, A. M. sequence will be cleaved at both sites, eliminating size and Gilbert. W., Proc. Natl. Acad. Sci. USA (1977) 74:560. heterogeneity as far as the desired cDNA sequence is Fragments which have been produced by the action of a concerned and creating a population of molecules, termed restriction enzyme or combination of two restriction fragments, containing the desired sequence and homoge 50 enzymes may be separated from each other and from het neous in length. If the restriction sites are different, two erodisperse sequences lacking recognition sites by any enzymes will be required in order to produce the desired appropriate technique capable of separating polynucleotides homogeneous length fragments. on the basis of differences in length. Such methods include The choice of restriction enzyme(s) capable of producing a variety of electrophoretic techniques and sedimentation an optimal length nucleotide sequence fragment coding for 55 techniques using an ultracentrifuge. Gel electrophoresis is all or part of the desired protein must be made empirically. preferred because it provides the best resolution on the basis If the amino acid sequence of the desired protein is known, of polynucleotide length. In addition, the method readily it is possible to compare the nucleotide sequence of uniform permits quantitative recovery of separated materials. Con length nucleotide fragments produced by restriction endo venient gel electrophoresis methods have been described by nuclease cleavage with the amino acid sequence for which Dingman, C. W., and Peacock, A. C. Biochemistry (1968) it codes, using the known relationship of the genetic code 7:659, and by Maniatis, T. Jeffrey, A. and van de Sande. H. common to all forms of life. A complete amino acid Biochemistry (1975) 14:3787. sequence for the desired protein is not necessary, however, Prior to restriction endonuclease treatment, cDNA tran since a reasonably accurate identification may be made on scripts obtained from most sources will be found to be the basis of a partial sequence. Where the amino acid 65 heterodisperse in length. By the action of a properly chosen sequence of the desired protein is now known, the uniform restriction endonuclease, or pair of endonucleases. poly length polynucleotides produced by restriction endonuclease nucleotide chains containing the desired sequence will be 5,766,941 21 22 cleaved at the respective restriction sites to yield polynucle unwanted result is prevented by treatment of the homoge otide fragments of uniform length. Upon gel electrophoresis. neous length cDNA fragment of desired sequence with an these will be observed to form a distinct band. Depending on agent capable of removing the 5'-terminal phosphate groups the presence or absence of restriction sites on other on the cDNA prior to cleavage of the homogeneous cDNA sequences, other discrete bands may be formed as well, with a restriction endonuclease. The enzyme alkaline phos which will most likely be of different length than that of the phatase is preferred. The 5'-terminal phosphate groups are a desired sequence. Therefore, as a consequence of restriction structural prerequisite for the subsequent joining action of endonuclease action, the gel electrophoresis pattern will DNA ligase used for reconstituting the cleaved sub-frag reveal the appearance of one or more discrete bands, while ments. Therefore, ends which lack a 5'-terminal phosphate the remainder of the cDNA will continue to be heterodis cannot be covalently joined. The DNA sub-fragments can perse. In the case where the desired cDNA sequence com 10 only be joined at the ends containing a 5'-phosphate gener prises the major polynucleotide species present, the electro ated by the restriction endonuclease cleavage performed on phoresis pattern will reveal that most of the cDNA is present the isolated DNA fragment. in the discrete band. The majority of cDNA transcripts, under the conditions Although it is unlikely that two different sequences will described above. are derived from the mRNA region con be cleaved by restriction enzymes to yield fragments of 15 taining the 5'-end of the mRNA template by specifically essentially similar length, a method for determining the priming on the same template with a fragment obtained by purity of the defined length fragments is desirable. Sequence restriction endonuclease cleavage. In this way, the above analysis of the electrophoresis band may be used to detect described method may be used to obtain not only fragments impurities representing 10% or more of the material in the of specific nucleotide sequence related to a desired protein, band. A method for detecting lower levels of impurities has but also the entire nucleotide sequence coding for the protein been developed founded upon the same general principles of interest. Double-stranded, chemically synthesized oligo applied in the initial isolation method. The method requires nucleotide linkers, containing the recognition sequence for a that the desired nucleotide sequence fragment contain a restriction endonuclease, may be attached to the ends of the recognition site for a restriction endonuclease not employed isolated cDNA, to facilitate subsequent enzymatic removal in the initial isolation. Treatment of polynucleotide material, 25 of the gene portion from the vector DNA. See Scheller et al., eluted from a gel electrophoresis band, with a restriction Science (1977) 196:177. The vector DNA is converted from endonuclease capable of acting internally upon the desired a continuous loop to a linear form by treatment with an sequence will result in cleavage of the desired sequence into appropriate restriction endonuclease. The ends thereby two sub-fragments, most probably of unequal length. These formed are treated with alkaline phosphatase to remove sub-fragments upon electrophoresis will form two discrete 30 5'-phosphate end groups so that the vector DNA may not bands at positions corresponding to their respective lengths, reform a continuous loop in a DNA ligase reaction without the sum of which will equal the length of the polynucleotide first incorporating a segment of the apoaequorin DNA. The prior to cleavage. Contaminants in the original band that are cDNA, with attached linker oligonucleotides. and the treated not susceptible to the restriction enzyme may be expected to vector DNA are mixed together with a DNA ligase enzyme, migrate to the original position. Contaminants containing 35 to join the cDNA to the vector DNA, forming a continuous one or more recognition sites for the enzyme may be loop of recombinant vector DNA. having the cDNA incor expected to yield two or more sub-fragments. Since the porated therein. Where a plasmid vector is used, usually the distribution of recognition sites is believed to be essentially closed loop will be the only form able to transform a random, the probability that a contaminant will also yield bacterium. Transformation, as is understood in the art and sub-fragments of the same size as those of the fragment of used herein, is the term used to denote the process whereby desired sequence is extremely low. The amount of material a microorganism incorporates extracellular DNA into its present in any band of radioactively labeled polynucleotide own genetic constitution. Plasmid DNA in the form of a can be determined by quantitative measurement of the closed loop may be so incorporated under appropriate envi amount of radioactivity present in each band, or by any other ronmental conditions. The incorporated closed loop plasmid appropriate method. A quantitative measure of the purity of 45 undergoes replication in the transformed cell, and the rep the fragments of desired sequence can be obtained by licated copies are distributed to progeny cells when cell comparing the relative amounts of material present in those division occurs. As a result, a new cell line is established bands representing sub-fragments of the desired sequence containing the plasmid and carrying the genetic determi with the total amount of material. nants thereof. Transformation by a plasmid in this manner, where the plasmid genes are maintained in the cell line by Following the foregoing separation or any other technique plasmid replication, occurs at high frequency when the that isolates the desired gene, the sequence may be recon transforming plasmid DNA is in closed loop form, and does stituted. The enzyme DNA ligase, which catalyzes the not or rarely occurs if linear plasmid DNA is used. Once a end-to-end joining of DNA fragments, may be employed for recombinant vector has been made, transformation of a this purpose. The gel electrophoresis bands representing the 55 sub-fragments of the desired sequence may be separately suitable microorganism is a straightforward process, and eluted and combined in the presence of DNA ligase, under novel microorganism strains containing the apoaequorin the appropriate conditions. See Sgaramella, V. Van de gene may readily be isolated, using appropriate selection Sande.J. H., and Khorana, H. G., Proc. Natl. Acad. Sci. USA techniques, as understood in the art. (1970) 67:1468. Where the sequences to be joined are not In summary, genetic information can be obtained from blunt-ended, the ligase obtained from E. coli may be used, Aequorea jellyfish, converted into cDNA, inserted into a Modrich, P. and Lehman, I. R., J. Biol. Chem. (1970) vector, used to transform a host microorganism, and 245:3626. expressed as apoaequorin in the following manner: The efficiency of reconstituting the original sequence 1. Isolate poly(A)RNA from Aequorea jellyfish. from sub-fragments produced by restriction endonuclease 65 2. Synthesize in vitro single-stranded cDNA and then treatment will be greatly enhanced by the use of a method double-stranded cDNA using reverse transcriptase. for preventing reconstitution in improper sequence. This 3. Digest the single-stranded region with S1 nuclease. 5,766,941 23 24 4. Size-fractionate the double-stranded cDNA by gel engineering have recently undergone explosive growth and filtration. development. Many recent U.S. patents disclose plasmids. 5. Tail the cDNA using terminal transferase and dCTP. genetically engineering microorganisms, and methods of 6. Digest pFBR322 with Pst1 and then tail the linear DNA conducting genetic engineering which can be used in the with terminal transferase and dGTP. 5 practice of the present invention. For example, U.S. Pat. No. 7. Anneal the dC-tailed cDNA fragment and dG-tailed 4.273.875 discloses a plasmid and a process of isolating the pBR322. same. U.S. Pat. No. 4304.803 discloses a process for 8. Transform E. coli SK1592. Select for tetracycline producing bacteria by genetic engineering in which a hybrid resistant colonies. plasmid is constructed and used to transform a bacterial host. O U.S. Pat. No. 4419,450 discloses a plasmid useful as a 9. Screen the transformants for ampicillin sensitivity. The cloning vehicle in recombinant DNA work. U.S. Pat. No. tet" amp colonies contain recombinant plasmids. Store 4.362,867 discloses recombinant cDNA construction meth them at -80° C. ods and hybrid nucleotides produced thereby which are 10. Label an oligonucleotide mixed probe (using a useful in cloning processes. U.S. Pat. No. 4,403,036 dis sequence deduced from the determined amino acid 5 closes genetic reagents for generating plasmids containing sequence) with radioactivity. multiple copies of DNA segments. U.S. Pat. No. 4363.877 11. Grow the members of the Aequorea cDNA bank on discloses recombinant DNA transfer vectors. U.S. Pat. No. mitocellulose filters. Lyse the colonies and fix the DNA 4,356.270 discloses a recombinant DNA cloning vehicle and to the filters. is a particularly useful disclosure for those with limited 12. Hybridize the P-labelled oligonucleotide mixture to experience in the area of genetic engineering since it defines the nitrocellulose filters. The 'P-probe will hybridize many of the terms used in genetic engineering and the basic to plasmid DNA from those E. coli recombinants which processes used therein. U.S. Pat. No. 4,336,336 discloses a contain the aequorin cDNA sequence. fused gene and a method of making the same. U.S. Pat. No. 13. Wash excess P-probe from the filters. 4,349.629 discloses plasmid vectors and the production and 25 use thereof. U.S. Pat. No. 4.332.901 discloses a cloning 14. Expose X-ray film to the filters. vector useful in recombinant DNA. Although some of these 15. Prepare plasmid DNA from the recombinants identi patents are directed to the production of a particular gene fied in the Aequorea cDNA bank. product that is not within the scope of the present invention. 16. Hybridize the P-labelled oligonucleotide to the the procedures described therein can easily be modified to plasmid DNA (Southern blot) to confirm the hybrid the practice of the invention described in this specification ization. by those skilled in the art of genetic engineering. 17. Demonstrate that these recombinants contain the All of these patents as well as all other patents and other aequorin DNA sequence by preparing extracts in publications cited in this disclosure are indicative of the EDTA-containing buffers, pH7.2. Charge the expressed level of skill of those skilled in the art to which this invention apoprotein by adding coelenterate luciferin and 35 pertains and are all herein individually incorporated by B-mercaptoethanol and incubating at 4°C. overnight. A reference. flash of blue light is emitted upon the addition of Ca The implications of the present invention are significant in from samples that express aequorin apoprotein. that unlimited supplies of apoaequorin will become avail Although the sequence of steps set forth above, when used able for use in the development of luminescent immunoas in combination with the knowledge of those skilled in the art says or in any other type of assay utilizing aequorin as a of genetic engineering and the previously stated guidelines, marker. Methods of using apoaequorin in a bioluminiscent will readily enable isolation of the desired gene and its use assay are disclosed in Ser. No. 541.405, filed Oct. 13, 1983, in recombinant DNA vectors now that sufficient information and commonly assigned, which is herein incorporated by is provided to locate the gene, other methods which lead to reference. Transferring the apoaequorin cDNA which has the same result-are also known and may be used in the 45 been isolated to other-expression vectors will produce con preparation of recombinant DNA vectors of this invention. structs which improve the expression of the apoaequorin Expression of apoaequorin can be enhanced by including polypeptide in E. coli or express apoaequorin in other hosts. multiple copies of the apoaequorin gene in a transformed Furthermore, by using the apoaequorin cDNA or a fragment host, by selecting a vector known to reproduce in the host. thereof as a hybridization probe. structurally related genes thereby producing large quantities of protein from exoge found in other bioluminescent coelenterates and other organ neous inserted DNA (such as puC8, ptacl2, or plN-III isms such as squid (Mollusca), fish (Pisces), and Crustacea ompa1.2. or 3), or by any other known means of enhancing can be easily cloned. These genes include those that code for peptide expression. the of Renilla, Stylatula, Ptilosarcus, In all cases, apoaequorin will be expressed when the DNA Cavernularia, and Acanthoptilum in addition to those that sequence is functionally inserted into the vector. By "func 55 code for the photoproteins found in the Hydrozoan Obelia tionally inserted" is meant in proper reading frame and and the ctenophores Mnemiopsis and Beroe. orientation, as is well understood by those skilled in the art. Particularly contemplated is the isolation of genes from Typically, an apoaequorin gene will be inserted downstream these and related organisms that express photoproteins using from a promoter and will be followed by a stop codon. oligonucleotide probes based on the principal and variant although production as a hybrid protein followed by cleav nucleotide sequences disclosed herein. Such probes can be age may be used, if desired. considerably shorter than the entire sequence but should be In addition to the above general procedures which can be at least 10, preferably at least 14, nucleotides in length. used for preparing recombinant DNA molecules and trans Longer oligonucleotides are also useful, up to the full length formed unicellular organisms in accordance with the prac of the gene. Both RNA and DNA probes can be used. tices of this invention, other known techniques and modi 65 In use, the probes are typically labelled in a detectable fications thereof can be used in carrying out the practice of manner (e.g., with 32p, H. biotin, or avidin) and are the invention. In particular, techniques relating to genetic incubated with single-stranded DNA or RNA from the 5,766,941 25 26 organism in which a gene is being sought. Hybridization is (1978) 57:292–328) except that Sephadex G-75 (superfine) detected by means of the label after single-stranded and is used in the second gel filtration step. The purification of double-stranded (hybridized) DNA (or DNA/RNA) have aequorin took place as follows: been separated (typically using nitrocellulose paper). 1. Collection of Aequorea in Friday Harbor, Washington, Hybridization techniques suitable for use with oligonucle 5 and removal of circumoral tissue (photocytes). otides are well known. 2. Extraction of proteins from photocytes via hypotonic Although probes are normally used with a detectable lable lysis in EDTA. that allows easy identification, unlabeled oligonucleotides 3. Ammonium sulfate fractionation of photocyte extract are also useful, both as precursors of labeled probes and for (0-75%). use in methods that provide for direct detection of double O 4. Centrifugation of (NH4)2SO precipitate; storage at stranded DNA (or DNA/RNA). Accordingly, the term "oli -70° C., during and after shipment from Friday Harbor, gonucleotide probe" refers to both labeled and unlabeled Washington. forms. Particularly preferred are oligonucleotides obtained from 5. Gel filtration on Sephadex G-50 (fine). the region coding for amino acids 40 through 110 of the 15 6. Ion-exchange on QAE Sephadex with pH-step and salt peptide sequences described herein, since these are the gradient elution. amino acids involved in binding to luciferin. 7. Gel filtration on Sephadex G-75 (superfine). Coelenterate luciferin is found in and binds to photopro 8. Ion-exchange on DEAE-Sephadex with pH-step and teins from all the organisms listed in Table 5. and it is salt gradient elution. contemplated that oligonucleotides as described herein will 9. Lyophilization (in EDTA) of pure aequorin and storage be useful as probes in isolating photoprotein genes from all at -80° C. Steps 1-4 were performed at Friday Harbor. these species. Except for collection and removal of circumoral tissue, all steps are done at 0°-4°C. The final product from TABLE 5 Step 4 was stored on dry ice in 250 ml centrifuge 25 bottles. The material was shipped in this form. Distribution of Coelenterate-Type Luciferin The purification of aequorin and green fluorescent protein 1. Cnidaria (coelenterates) (GFP) was done in Athens, Georgia (Steps 5-9). All steps A. Anthozoa were performed at 0-4 C. Aequorin-containing fractions Renilla (three sp) Stylatula were stored at -80° C. between steps; aequorin seems to be Ptilosarcus 30 stabile to freezing and thawing irrespective of protein con Cavernularia centration. Acanthoptilum Step 5: Gel filtration on Sephadex G-50 (fine). Column B. Hydrozoa Aequorea dimensions: 5.8 cmx97 cm; 2.563 ml. The column was run Obelia in 10 mM EDTA. pH 5.5 (the disodium salt was used to 2. 35 prepare EDTA solutions) at a flow rate of 75 ml/hour. The Mnemiopsis GFP and aeqorin eluted together on this column. 65-75% of Beroe the aequorin activity was pooled for subsequent purification. 3. Mollusca Watasenia (squid) Side fractions were also pooled and stored for later purifi 4. Pisces cation. Aequorin yield in this step varied from 50% to 80%; Neoscopelus microchir' 65–75% yields were usually achieved. The capacity of the Diaphus column was approximately 1000 mg (Bradford) in 75 ml; 5 Crustacea A. Decapods (shrimp) generally smaller volumes were loaded whenever possible. Acanthephyra eximia Step 6: Ion-exchange on QAE Sephadex. Column dimen Acanthephyra purpurea sions: 5 cm diameter. 5 grams of dry Sephadex were used in Oplophorus spinosus' 45 Heterocarpus grimaldi this step; the column bed volume changed during Heterocarpus laevigatus' chromatography, depending on the ionic strength and com Systellaspis cristata position of the buffer. Systellaspis debilis Generally the pooled material from 6 to 10 initial G-50 B. Mysidacea (opossum shrimp) Gnathophausia ingens steps was run on this column. Overall yield was improved by 50 doing this. as was efficiency. This step was performed Structure identical to (I) based on chemical and physical data on the extracted exactly as described by Blinks et al. (1978). After the . All others are based on luciferin-luciferase cross reactions as well column was loaded, the GFP was selectively eluted with a as on kinetic and bioluminescence emission spectra comparisons. Additional evidence that the luciferin is identical to () is devived from pH-step (5 mM Na Ac, 5 mM EDTA, pH 4.75). Aequorin chemical and physical data on the isolated emitte which has been shown to was then eluted in a linear NaCl gradient in 10 mM EDTA, be identical to (II). 55 pH 5.5 (500 ml total volume). The GFP was made 10 mM in Tris and the pH raised to 8.0 for storage at -80 until The invention now being generally described, it will be further purification. The aequorin pool was concentrated via more readily understood by reference to the following ultrafiltration (Amicon YM-10 membrane) in preparation for examples which are included for purposes of illustration the next step. Aequorin yield: 80%. only and are note intended to limit the invention unless so Step 7: Gel filtration on Sephadex G-75 (superfine). stated. Column dimensions: 2.8 cmx150 cm; 924 ml. The column EXAMPLE 1. was run in 10 mM EDTA, pH 5.5 at 10 ml/hour. Aequorin yield: 60-80%. Purification of Natural Aequorin Step 8: Ion-exchange on DEAE-Sephadex. The pooled Aequorin was purified according to the method of Blinks 65 aequorin from step 7 was run directly onto this column, et al. (J. R. Blinks. P. H. Mattingly, B. R. Jewell, M. van which was run exactly as the QAE Sephadex column. The Leeuwen, G. C. Harrer, and D. G. Allen, Methods Enzymol. aequorin yield was generally 75-80%. This step is unnec 5,766,941 27 28 essary with most aequorin preps. The material from step 7 is national Biotechnologies. Inc. and used according to con usually pure. according to SDS-PAGE in 12% acrylamide. ditions described by the supplier. RNasin and reverse tran Step 9: Aequorin was lyophilized with >95% recovery scriptase were obtained from Biotech and Life Sciences. provided that some EDTA was present. Recoveries varied respectively. Terminal transferase was purchased from PL from 0% to 95% in the absence of EDTA (see Blinks et al., Biochemicals. Coelecterate luciferin was synthesized as 1978). described Hori, K., Anderson, J. M., Ward, W. W. and EXAMPLE 2 Cormier. M. J. Biochemistry, 14, 2371-2376 (1975); Hori. Sequencing Methodology Applied in the Sequence K., Charbonneau, H., Hart, R. C., and Cormier, M.J., Proc. Determination of Aequorin Natl. Acad. Sci., USA. 74, 4285-4287 (1977): Inouye. S. Amino acid sequence analysis was performed using auto O Sugiura, H. Kakoi, H. Hasizuma, K. Goto, T., and Iio, H., mated Edman Degradation (Edman and Begg. 1967). The Chem. Lett. 141-144 (1975) and stored as a lyophilized sequence analysis of relatively large amounts of protein or powder until needed. peptide (10 nmol or more) was carried out using a Model RNA. Isolation and In Vitro Translation 890 B Beckman sequencer (Duke University) updated as Aequorea victoria jellyfish were collected at the Univer described by Brown et al. (1980) and employing a 0.55M 15 sity of Washington Marine Biology Laboratory at Friday Quadrol program with polybrene (Tarr et al., 1978). Two Harbor. Washington. The circumoral rings were cut from the peptides. M3 and M5. which were small or appeared to wash circumrerence of the jellyfish and immediately frozen in a out of the cup with the Quadrol method, were sequenced dry ice/methanol bath. The tissue was kept at -700° C. until using a program adapted for dimethylallylamine buffer and needed. polybrene as suggested by Klapper et al. (1978). Phenylth RNA was isolated according to the method of Kim et al. iohydantion (PTH-) derivatives of amino acids were iden Kim, Y-J., Shuman. J. Sette, K. and Przybyla. A. J. Cell. tified using reverse phase HPLC chromatography on a Biol. (1983) 96:393-400 and poly(A)--RNA was prepared DuPont Zorbay ODS column essentially as described by using a previously described technique Aviv and Leder, Hunkapiller and Hood (1978). Peptides which were avail PNAS 69 (1972) 1408-1412. able at the 2-10 nmol level were sequenced on a Model 890 25 Poly(A)RNA (1 pig) and poly(A)RNA (20 Jug) were C Beckman sequencer (University of Washington) using a translated using the rabbit reticulocyte in vitro translation program for use with 0.1M Quadrol (Brauer et al.. 1975) and system (Pelham and Jackson, Eur: J. Biochem. (1976) 247; polybrene. PTH-amino acids were identified using the W. C. Merrick in Methods in Enz. 101(c) (1983) 606–615). reverse phase HPLC system described by Ericsson et al. The lysate was stripped of its endogeneous mRNA with (1977). An applied Biosystems Model 470 A gas phase 30 micrococcal nuclease. Each translation (62 pil total volume) sequencer (University of Washington) (Hunkapiller et al. was incubated 90 min at 25° C. in the presence of S 1983) was used for sequence analysis when there was less methionine (38 uCi). Two ul of each translation were than 1.5 nmol of peptide available. PTH amino acids from removed for analysis by electrophoresis. Apoaequorin was the gas phase instrument were identified using an IBM immunoprecipitated by adding antiaequorin (2 ul) and Staph Cyano column as described by Hunkapiller and Hood 35 aureus cells to 50 ul of each translation mixture. After (1983). several washings the antibody - apoaequorin complex was dissociated by heating in the presence of SDS. The trans References lated products were analyzed on a SDS polyacrylamide Edman. P. and Begg, G., Eur: J. Biochem... 1, 80-91 (1967). (13%) gel. Following electrophoresis the gel was stained Brown, A. G., Cornelius. T. U. Mole, J. E. Lynn, J. D., with Coomassie R-250 to identify the protein standards and Tidwell, W. A., and Bennett J. C. Anal. Biochem... 102. then the gel was impregnated with PPO in DMSO. Fluo 35-38 (1980). rography was performed at -70° C. Terr, G. E. Beechner, J. F., Bell, M., and McKean, D. J. Recombinant DNA Procedures Anal. Biochem... 84, 622-627 (1978). Double-stranded cDNA was synthesized from total Klapper. D. G., Wilde. C. E., III and Capra, J. D., Anal. 45 Aequorea poly(A)RNA as described by Wickens et al. Biochen., 85, 126-131 (1978). (Wickens. M. P. Buell. G. N., and Schimke. R. T. J. Biol. Hunkapiller. M. W., and Hood, L. E. Biochemistry. 17. Chem. (1978) 253:2483-2495). After addition of homopoly 2124-2133 (1978). meric dC tails, double-stranded cDNA was annealed to Brauer, A. W., Margolies. M. N., and Haber. E., dG-tailed Pst 1-cut p3R322 (Villa-Komaroff, L., Biochemistry, 13, 3029-3035 (1975). 50 Efstradiadis, A. Broome, S. Lomedico, P. Tizard, R. Ericsson, L. H. Wade. R. D. Gagnon, J., McDonald, R. R. Naber, S. P., Chick W. L., and Gilbert, PNAS (1978) and Walsh, K. A. in Solid Phase Methods in Protein 75:3727-3731) and used to transform E. coli strain SK1592. Sequence Analysis (Previero. A. and ColeHi-Previero, M. Tetracycline-resistant, amplicillin-sensitive colonies were A. eds.) pp. 137-142. Elsevier/North Holland, Amster transferred to and frozen in microtiter dishes at -700° C. dam (1977). 55 The Aequorea cDNA library was screened for the Hunkapiller, M. W., Hewick, R. M. Dreyer, W.J., and Hood, aequorin cDNA using a synthetic oligonucleotide mixture. L. E. Methods Enzymol. 91.399–413 (1983). The oligonucleotide mixture was supplied by Charles Cantor Hunkapiller, M.W. and Hood, L. E. Methods Enzymol.91. and Carlos Argarana (Columbia University). Following their 486-493 (1983). purification by polyacrylamide electrophoresis Maniatis, T. and Efstratidis. A., Meth. in Enz. (1980) 65:299-305) and EXAMPLE 3 the 17-mers were radioactively labelled using polynucle Cloning and Expression of cDNA Coding for otide kinase and Y-P-ATP Maxam, A. M. and Gilbert. W. Homogeneous Apoaequorin Meth. in Enz. (1980) 65:499-559). The unincorporated P was removed by DEAE-cellulose ion-exchange chromatog Materials and Methods 65 raphy. Restriction enzymes were purchased from Bethesda The Aequorea cDNA bank was screened in the following Research Laboratories. New England Bio Labs and Inter manner: The E. coli recombinants were transferred from 5,766,941 29 30 frozen cultures to nitrocellulose filters (7x11cm) placed on rea RNA (lane 2) or in the presence of Aequorea poly(A) Luria agar plates. The colonies were grown 12 hours at 37 RNA (data not shown), C. and lysed and then the DNA was fixed as Taub and The primary translation products immuno-precipitated Thompson Taub, F. and Thompson, E. B. Anal. Biochem. with the anti-aequorin migrated on the SDS-PAGE gel with (1982) 126:222-230) described for using Whatman 541 s an apparent molecular weight (23.400 daltons, lane 4) paper. The filters were baked under vacuum for 2 hours after slightly greater than that for native aequorin isolated from they had been air-dried. Aequorea (22,800 daltons, indicated in FIG. 1). This data, The filters were incubated at 55° C. for 12-20 hours in 3 and the data shown in FIG. 4. are consistent with the ml/filter of a prehybridization solution (10XNET, 0.1% SDS, presence of a presequence of approximately seven additional 3XDenhardt's) after first wetting them in 1XSSC. The solu amino acids at the amino terminal of the primary translation tion was poured from the hybridization bag and replaced O product. with 1 ml/filter of the hybridization solution (10xNET, 0.1% The proteins immunoprecipitated from the poly(A)RNA SDS. 3xDenhardt's. 1x10 cpm 'P-labelled 17-mers per translation migrated as a doublet or even a triplet (lane 4, filter). The hybridization was carried out for 24 hours at 37° FIG. 1) if one studies the original autoradiogram. This result C. after which the filters were washed four times in 10xSSC can be interpreted in two ways. Firstly, multiple apoaequorin at 4 C. for 10 min. The filters were air dried and then 15 genes may exist in Aequorea victoria and their respective wrapped in plastic wrap. Kodak XAR-5 film was exposed to preproteins differin molecular weight due to various lengths the filters at -70° C. using a DuPont Cronex intensifying of their presequences. Aequorin isozymes Blinks, J. R. and Screen. Harrer, G. C. Fed. Proc. (1975) 34:474 may be indicative Growth and Extraction Procedures for E. coli of such a multi-gene family. Secondly, the Aequorea victoria E. coli SK1592 containing pAEQ1 - paRQ6 were grown population at Friday Harbor mya consist of several species overnight in 25 ml of Luria broth at 37° C. The cells were of Aequorea. centrifuged and then resuspended in 5 ml of 10% sucrose, 50 The Aequorea cDNA library used contained 6000 recom mM Tris pH 8. The cells were lysed with the addition of the binants having inserts greater than 450 bp. Of 25 random following: 7.2 pil of 0.1M phenylmethylsulfonylfluoride, recombinants screened, none had inserts less than 500 bp 312 of 0.2M EDTA, 10 mg lysozyme, and 10 ul of 10 25 and two were larger than 3 kbp. mg/ml RNase A. After 45 min on ice, the mixture was The Aequorea cDNA bank was screened with the follow centrifuged at 43,500xg for one hour. The supernatent was ing mixed synthetic oligonucleotide probe: saved. A. A A Purification and Assay of Aequorin 30 3. ACCAT TGGTACC GG-5 Aequorin was extracted and purified by the method of C T G Blinks et al. Blinks, J. R., Wier, W. G., Hess, P. and C Predergast, F. G. Prog. Biophys. Molec. Biol (1982) 40:1-114). Aequorin, or photoprotein activity, was mea sured by injecting 5ul of the sample into 0.5 ml of 0.1M The DNA sequences of these oligonucleotides were deter CaC, 0.1M Tris, pH 8.0 and simultaneously measuring peak 35 mined by an examination of the complete amino acid light intensity and total photons. The design of photometers sequence of apoaequorin. These oligonucleotides are for making such measurements and calibrating the instru complementary to the mRNA which codes for the peptide ment for absolute photon yields have been previously Trp'Tyr.ThrMet Asp.Pro' in the carboxy terminus described Anderson, J. M. Faini. G. J., and Wampler, J. E., region of the aequorin polypeptide. The 17-mers were Methods in Enz. (1978) 57:529-559; Charbonneau, H., and 40 'P-labelled and hybridized to plasmid DNA from the Cormier, M. J. J. Biol. Chen. (1979) 254:769-780). Aequorea cDNA library as described in Methods. Partial Purification of Apoaequorin Activity in pa:Q1 Six transformants were identified which contained plas Extracts mids having inserts that hybridized to the synthetic oligo The expressed apoaequorin was partially purified by nucleotides. The restriction map of the plasmid containing passage of 23 ml of a pAEQ1 extract over a 42 ml bed 45 the largest Pst I insert, pAEQ1, is shown in FIG. 2. No volume of Whatman DE-22 equilibrated in 1 mM EDTA, 1.5 hybridization of the synthetic oligonucleotides would occur mM Tris, pH 7.5. An 800 ml NaCl gradient (O-1M) was if paO1 was digested with BamHl. Upon examination of applied and the active apoaequorin eluted at 0.3MNaCl. The the 17-mers DNA sequence, the hybridization probe does peak fractions were pooled and dialyzed against 0.5M KC1, contain a BamH1 recognition sequence (GGATCC). Hence. 50 the BamH1 site in paRQ1 could be used to identify the 10 mM EDTA, and 15 mM Tris, pH 7.5 for the experiments 3'-region of the apoaequorin coding sequence. The recom described in FIG. 3. binant plasmid paO1 does indeed contain the apoaequorin Results and Discussion cDNA as demonstrated by its expression in E. coli, as In Vitro Translation of Aequorea poly(A)RNA described below. Approximately 1.6 pig poly(A)RNA was isolated from 55 Expression of Apoaequorin in E. coli each gram of frozen jellyfish tissue. The results of the in In order to find out whether any of these six transformants vitrotranslation of the Aequorea poly(A)RNA are shown in were expressing biologically active apoaequorin, extracts of FIG. 1. The translation products which reacted with anti each of these, as well as the host strain, were prepared as aequorin are shown in lane 4. The S counts immunopre described in Methods. To 0.5 m of each extract was added cipitated represented 0.3% of the total acid-precipitable B-mercaptoethanol (2 mM) and coelenterate luciferin (0.1 counts in the translation which implies that the apoaequorin mM) and the mixture allowed to incubate at 4 for 20 hours. mRNA represents approximately 0.3% of the total poly(A" This mixture was then assayed for Ca"-dependent photo )mRNA populations. This relative abundance agrees well protein activity as described in Methods. Caf-dependent with the fraction of total protein (0.5%) which corresponds luminescence was observed in extracts prepared from the to aequorin in a crude extract of circumoral rings from 65 recombinant pAEQ1, but no such luminescence was Aequorea. No proteins were immunoprecipitated when the observed in extracts of the host strain or in extracts derived in vitro translation was performed in the absence of Aequo from any of the other transformants. The inserts in 5,766,941 31 32 pAEQ1-6 cross hybridized suggesting that they contain To 0.5 ml of each extract was added mercaptoethanol (2 homologous DNA sequences. However, if the cDNA inserts mM) and coelenterate luciferin (0.015 mM). The mixture in paQ2-6 were not of sufficient length or oriented was incubated at 4 for 20 hours. A 5 Jul sample was improperly within the plasmid, apoaequorin activity in those removed, injected into 0.5 ml of 0.1 mM Ca", and peak extracts would not be expected, 5 The kinetics of formation of photoprotein activity from light intensity measured. extracts of paQ1 is similar to that observed with native. mixed apoaequorin as shown in FIG. 3. Requirements for The invention now being fully described, it will be the formation of photoprotein activity in this extract is also apparent to one of ordinary skill in the art that many changes identical to that observed when authentic apoaequorin is and modifications can be made thereto without departing used. As FIG. 3 shows, dissolved O is required. from the spirit or scope of the invention as set forth herein. Furthermore, the elimination of either 3-mercaptoethanol or coelenterate luciferin from the reaction mixture results in Zero production of Ca"-dependent photoprotein activity. What is claimed is: 1. A homogeneous peptide selected from Injection of the active commponent into Ca"-free buffers 5 produced no luminescence. The subsequent addition of Ca" resulted in a luminescence flash. To further characterize the active component in extracts of (1) compounds of pAEQ1, this recombinant plasmid was subjected to chro (a) a first formula: matography over DE-22 as described in Methods. The apoaequorin activity eluted at about 0.3M salt which is WKLTPDFN N P K W G R H K HMF NFL W NHN G. K S L DEMW KAS DIWN NL CA similar to that observed for authentic apoaequorin. The T PE Q A K R H K DAV EAFF G G A GM KY GW active fractions were then incubated in the presence of E TD WPAYE GW KKLA TD ELEKYAKN coelenterate luciferin, B-mercaptoethanol and oxygen to QIT L I R I, W G D ALF DI ID. K. D. QN GAIT generate photoprotein activity as described in FIG. 3. This 25 L S E W KAY T K S A G II QT S E E CEETF R mixture was then subjected to gel filtration. As FIG. 4 VC DI DES G QLD WIDE M T R Q H L GF WYT shows. the photoprotein activity generated from the partially MD PACE KLYG GAWP-COOH purified component in paRQ1 extracts eluted from the wherin A is alanine, C is cysteine, D is aspartate. E column with an M. of 20.600 as compared to a value of is glutamate, F is phenylalanine, G is glycine. H is 19,600 for native aequorin. Similar results were observed during in vitro translation experiments (FIG. 1). From the histidine. I is isoleucine, K is lysine, L is leucine. M data of FIG. 4, one may also conclude that the luciferin is methionine, N is asparagine, P is proline, Q is becomes tightly associated with the active component in glutamine, R is arginine. S is serine. T is threonine, pAEQ1 extracts under the charging conditions used. V is valine. W is tryptophan, and Y is tyrosine. The pooled photoprotein fraction from FIG. 4 produces a 35 (b) a second formula in which P is replaced by S. N. luminescence flash upon the addition of Ca". The kinetics is replaced by D, K is replaced by R. Ko is of this flash was indistinguishable from the kinetics of the replaced by R. Es is replaced by G. As is replaced Ca'-dependent aequorin reaction. Other recombinant plas by D. Ds is replaced by E. As is replaced by E. Kss mids did not express a light-emitting protein when present in is replaced by R.T. is replaced by S.D. is replaced transformants, as is shown in Table 6 below. 40 The above data show that the cDNA inserted into pAEQ1 by C or E. Eos is replaced by K. K. is replaced by represents the full-length cDNA coding for apoaequorin. R. As is replaced by S. QoI is replaced by E. I.1.2 The data also show that this cDNA is being expressed in is replaced by P. It is replaced by L. It is replaced pAEQ1 and that the protein product is indistinguishable in by V. Ts is replaced by S. S. is replaced by D, its biological properties from that of native, mixed apoae 45 Ss is replaced by A. T is replaced by S E14 is quorin. The level of expression was estimated to be about replaced by D, and Ss is replaced by N in said first 0.01% of the total soluble protein. formula wherein subscript numbers refer to the amino acid position numbered from the amino ter TABLE 6 minal of said first formula, 50 Recharging of Apoaequorin in Extracts of (c) a third formula in which from 1 to 15 amino acids Apoacalorin cDNA Clones are absent from either the amino terminal, the car Peak Light Intensity boxy terminal, or both terminals of said first formula (hy sec) in Extracts or said second formula, or 55 (d) a fourth formula in which from 1 to 10 additional Clone --Cat -Ca amino acids are attached sequentially to the amino pAEQ1 5 x 0. O terminal, carboxy terminal, or both terminals of said pAEQ1 O O first formula or said second formula and PAEQ3 O O pAEQ4 O O (2) salts of compounds having said formulas, wherein said pAEQ5 O O pAEQ6 O O peptide is capable of binding coelenterate luciferin and SK 1592 (Host Strain) O emitting light in the presence of Ca".

ck sk. x: s: s: