Novel Restriction Endonucleases, DNA Encoding These Endonucleases and Methods for Identifying New Endonucleases with the Same Or Varied Specificity
Total Page:16
File Type:pdf, Size:1020Kb
(19) TZZ _T (11) EP 2 565 266 A1 (12) EUROPEAN PATENT APPLICATION (43) Date of publication: (51) Int Cl.: 06.03.2013 Bulletin 2013/10 C12N 9/22 (2006.01) C12N 15/55 (2006.01) (21) Application number: 12075064.1 (22) Date of filing: 03.08.2006 (84) Designated Contracting States: • Roberts, Richard J. AT BE BG CH CY CZ DE DK EE ES FI FR GB GR Wenham, MA 01984 (US) HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR (74) Representative: Griffin, Philippa Jane et al Mathys & Squire LLP (30) Priority: 04.08.2005 US 705504 P 120 Holborn London (62) Document number(s) of the earlier application(s) in EC1N 2SQ (GB) accordance with Art. 76 EPC: 06849810.4 / 1 919 937 Remarks: This application was filed on 28-06-2012 as a (71) Applicant: New England Biolabs, Inc. divisional application to the application mentioned Ipswich, MA 01938 (US) under INID code 62. (72) Inventors: • Morgan, Richard D. Middleton, MA 01949 (US) (54) Novel restriction endonucleases, DNA encoding these endonucleases and methods for identifying new endonucleases with the same or varied specificity (57) Inter alia, a DNA encoding a restriction endonu- clease comprising a protein sequence having at least 90% sequence identity with SEQ ID NO: 104 is disclosed. EP 2 565 266 A1 Printed by Jouve, 75001 PARIS (FR) EP 2 565 266 A1 Description BACKGROUND 5 [0001] Restriction endonucleases are enzymes that occur naturally in certain unicellular microbes—mainly bacteria and archaea—and that function to protect those organisms from infections by viruses and other parasitic DNA elements. These enzymes bind to specific sequences of nucleotides (’recognition sequence’) in double-stranded DNA molecules (dsDNA) and cleave the DNA, usually within or close to the recognition sequence, disrupting the DNA and triggering its destruction. Restriction endonucleases commonly occur with one or more companion enzymes termed modification 10 DNA methyltransferases. DNA methyltransferases bind to the same sequences in dsDNA as the restriction endonucle- ases they accompany, but instead of cleaving the DNA, they alter it by the addition of a methyl group to one of the bases within the sequence. This modification (’methylation’) prevents the restriction endonuclease from binding to that site thereafter, rendering the site resistant to cleavage. Methyltransferases function as cellular antidotes to the restriction endonucleases they accompany, protecting the cell’s own DNA from destruction by its restriction endonucleases. To- 15 gether, a restriction endonuclease and its companion modification methyltransferase(s) form a restriction-modification (R-M) system, an enzymatic partnership that accomplishes for microbes what the immune system accomplishes, in some respects, for multicellular organisms. [0002] A large and varied class of restriction endonucleases has been classified as ’Type II’ class of restriction endo- nucleases. These enzymes cleave DNA at defined positions, and when purified can be used to cut DNA molecules into 20 precise fragments for gene cloning and analysis. [0003] New Type II restriction endonucleases can be discovered by a number of methods. The traditional approach to screening for restriction endonucleases, pioneered by Roberts et al. and others in the early to mid 1970’s (e.g. Smith, H.O. and Wilcox, K.W., J. Mol. Biol. 51: 379-391 (1970); Kelly, T.J. Jr. and Smith, H.O., J. Mol. Biol. 51: 393-409, (1970); Middleton, J.H. et al., J. Virol. 10:42-50 (1972); and Roberts, R.J. et al., J. Mol. Biol. 91:121-123, (1975)), was to grow 25 small cultures of individual strains, prepare cell extracts and then test the crude cell extracts for their ability to produce specific fragments on small DNA molecules (see Schildkraut, I.S., "Screening for and Characterizing Restriction Endo- nucleases", in Genetic Engineering, Principles and Methods, Vol. 6, pp. 117-140, Plenum Press, NY, NY (1984)). Using this approach, about 12,000 strains have been screened worldwide to yield the current harvest of almost 3,600 restriction endonucleases (Roberts, R.J. et al., Nucl. Acids. Res. 33: D230-D232 (2005)). Roughly, one in four of all strains examined, 30 using a biochemical approach, show the presence of a Type II restriction enzyme. [0004] An in silico screening technique to identify restriction-modification systems has also been described and has been successfully used to identify novel restriction endonucleases (US- 2004-0137576-A1). This method relies on iden- tifying new methylases by their consensus sequences. Methylases have much more conservation of amino acid se- quence, because they all must bind the methyl donor cofactor S- adenosyl methionine (SAM) and bind the nucleotide to 35 be methylated, either an adenine or a cytosine base, and then perform the methyl transfer chemistry. Although there are several classes of methyltransferases, there are many sequenced examples of methylases and these have well conserved motifs that can be used to identify a protein sequence in a database as a methylase. In this method, identifying restriction endonucleases relies on testing any or all open reading frame (ORF) protein sequences located near the identified methylases. 40 [0005] Since the various Type II restriction enzymes appear to perform similar biological roles and share the biochem- istry of causing dsDNA breaks, it might be thought that they would closely resemble one another in amino acid sequence. Experience shows this not to be true, however. Surprisingly, far from sharing significant amino acid similarity with one another, most enzymes appear unique, with their amino acid sequences resembling neither other restriction enzymes nor any other known proteins. Thus the Type II restriction endonucleases seem either to have arisen independently of 45 each other during evolution or to be evolving very rapidly thereby losing apparent sequence similarity, so that today’s enzymes represent a heterogeneous collection rather than one or a few distinct families. [0006] Restriction endonucleases are biochemically diverse in their function: some act as homodimers, some as monomers, others as heterodimers. Some bind symmetric sequences, others asymmetric sequences; some bind con- tinuous sequences, others discontinuous sequences; some bind unique sequences, others multiple sequences. Some 50 are accompanied by a single methyltransferase, others by two, and yet others by none at all. When two methyltransferases are present, sometimes they are separate proteins; at other times they are fused. The orders and orientations of restriction and modification genes vary, with all possible organizations occurring. Given this great diversity among restriction en- donucleases, it is perhaps not surprising that it has not been possible to form consensus sequences that can be used for in silico searches that are able to identify Type II restriction endonucleases, as has been successfully done for DNA 55 methyltransferases. Thus there is no general common amino acid sequence motif (s) that can be used to identify restriction endonucleases from translated raw DNA sequence ab initio. [0007] Although restriction endonucleases lack conserved sequence motifs and generally have highly diverged DNA and amino acid sequences, some restriction endonucleases are in fact related to one another and, though they may 2 EP 2 565 266 A1 diverge in function, these endonucleases or families of related endonucleases share significant sequence similarity with one another. The key to unlocking these families of endonucleases is to obtain the sequence of one of the members of the related enzymes; from this sequence the other members of the family can be identified. With the advent of whole genome sequencing, many prokaryotic DNA sequences, and from the DNA sequence many amino acid sequences, 5 have become available. Thus there are many amino acid sequences in the database with no known function. This pool of sequences undoubtedly contains numerous restriction endonucleases. The problem is how to identify which genes encode restriction endonucleases, and then how to characterize the function of these genes. SUMMARY 10 [0008] In an embodiment of the invention, a DNA segment encoding a restriction endonuclease and the corresponding amino acid sequence is described where the DNA has at least 90% sequence identity with a DNA sequence selected from SEQ ID NOS: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 15 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, and 187 or the amino acid sequence has at least 90% sequence identity with an amino acid sequence selected from SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 20 142, 144, 146, 148, 150, 152, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186 and 188. [0009] In an additional embodiment of the invention, an identified restriction endonuclease is provided which has an amino acid sequence identified by an Expectation value of less than or equal to e-02 in a BLAST search when an amino acid sequence having at least 90% sequence identity to the amino acid sequences listed above is used for searching the database. In the context of the embodiments of the invention, "an" is not intended to be limited to "one." 25 [0010] In an additional embodiment of the invention, a method is provided for identifying a restriction endonuclease, that includes: (a) selecting one or more probes having at least 90% sequence identity to a sequence selected from SEQ ID NOS: 1-153 and 156-188; (b) comparing the one or more probes with a database of sequences by a sequence similarity analysis to identify a sequence match; and (c) identifying the restriction endonuclease from the sequence match.