On the Possibilities and Limitations of Rational Protein Design to Expand the Specificity of Restriction Enzymes: a Case Study Employing EcoRV as the Target
Protein Engineering vol.13 no.4 pp.275–281, 2000
© Oxford University Press

Thomas Lanio, Albert Jeltsch and Alfred Pingoud1

Institut für Biochemie, FB 08, Justus-Liebig-Universität, Heinrich-Buff-Ring 58, D-35392 Giessen, Germany.
1To whom correspondence should be addressed
E-mail: [email protected]

The restriction endonuclease EcoRV has been characterized in structural and functional terms in great detail. Based on this detailed information we employed a structure-guided approach to engineer variants of EcoRV that should be able to discriminate between differently flanked EcoRV recognition sites. In crystal structures of EcoRV complexed with d(CGGGATATCCC)2 and d(AAAGATATCTT)2, Lys104 and Ala181 closely approach the two base pairs flanking the GATATC recognition site and thus were proposed to be a reasonable starting point for the rational extension of site specificity in EcoRV [Horton,N.C. and Perona,J.J. (1998) J. Biol. Chem., 273, 21721–21729]. To test this proposal, several single (K104R, A181E, A181K) and double mutants of EcoRV (K104R/A181E, K104R/A181K) were generated. A detailed characterization of all variants examined shows that only the substitution of Ala181 by Glu leads to a considerably altered selectivity with both oligodeoxynucleotide and macromolecular DNA substrates, but not the predicted one, as these variants prefer cleavage of a TA flanked site over all other sites, under all conditions tested. The substitution of Lys104 by Arg, in contrast, which appeared to be very promising on the basis of the crystallographic analysis, does not lead to variants which differ very much from the EcoRV wild-type enzyme with respect to the flanking sequence preferences. The K104R/A181E and K104R/A181K double mutants show nearly the same preferences as the A181E and A181K single mutants. We conclude that even for the very well characterized restriction enzyme EcoRV, properties that determine specificity and selectivity are difficult to model on the basis of the available structural information.

Keywords: DNA recognition/EcoRV/protein engineering/rational protein design/site-directed mutagenesis/specificity

Introduction

Type II restriction endonucleases belong to the most important enzymes in molecular biology and molecular medicine. These homodimeric enzymes recognize and cleave DNA in short palindromic sequences comprising 4–8 base pairs with very high accuracy. Independent of the sequence context, the canonical site is cleaved several orders of magnitude faster than all other sites, including sites which differ at only one base pair from the canonical recognition site (reviews: Roberts and Halford, 1993; Pingoud and Jeltsch, 1997). Today, more than 2000 restriction endonucleases with ~200 different specificities are known, among them only a few with longer than a 6 base pair (bp) recognition site (Roberts and Macelis, 1999). The biological function of restriction enzymes is to cleave phage DNA invading a bacterium and thereby to protect the bacterial cell from infection. As palindromic sites comprising 8 bp statistically occur only every 65 536 bp and phage genomes usually are short (10^4–10^5 bp), a restriction enzyme with an 8 or even 10 bp recognition site could not fulfil this role efficiently. Restriction endonucleases having a recognition sequence of ≥8 bp, therefore, in general do not provide a selective advantage for the bacterial host and in consequence to date only 3.5% of all possible 8 bp cutters are available from natural sources (Roberts and Macelis, 1999) and most likely not many more will be found by screening bacterial strains. On the other hand, rare cutting enzymes could be particularly useful for the manipulation of large DNA fragments, e.g. human genes, which are often larger than 5 kb. A possible way to overcome the shortage of naturally occurring ‘rare cutters’ is the engineering of existing restriction enzymes to recognize a more extended sequence than their natural recognition sequence.

The restriction endonuclease EcoRV (recognition site: GATATC) is one of the best characterized type II restriction enzymes and, therefore, within this class of enzymes is an ideal target for a rational protein design. Besides detailed biochemical analysis (review: Pingoud and Jeltsch, 1997), a wealth of structural information is available for this enzyme including the structure of the free enzyme (Winkler et al., 1993), a structure of the enzyme bound non-specifically to DNA (Winkler et al., 1993), different structures of specific EcoRV substrate (Kostrewa and Winkler, 1995; Perona and Martin, 1997; Horton and Perona, 1998a) and EcoRV product complexes (Kostrewa and Winkler, 1995; Horton and Perona, 1998b) as well as structures of EcoRV variants bound to specific substrates (Horton and Perona, 1998b) and of EcoRV bound to modified substrates (Martin et al., 1999). All these structural studies as well as biochemical studies using chemically modified substrates (Thorogood et al., 1996) and EcoRV variants (Wenz et al., 1996) show that EcoRV in addition to contacts with its recognition sequence interacts also with base pairs upstream and downstream of its recognition site. These additional contacts may explain the site preferences of EcoRV observed under certain conditions (Taylor and Halford, 1992; Lanio et al., 1998; Schöttler et al., 1998). Hence it seems to be possible to create new protein–DNA contacts to the base pairs flanking the GATATC recognition site which could enable the nuclease to recognize an extended site comprising up to 10 bp. Crystallographic analyses of EcoRV with short oligodeoxynucleotides differing in the base pairs adjacent to the recognition site, suggest a promising starting point for a rational protein engineering project. Two amino acid residues, Lys104 and Ala181, form water-mediated contacts to the neighboring base pairs upstream (Ala181) and downstream (Lys104) of the recognition site (Horton and Perona, 1998a). In the complex with d(CGGGATATCCC)2, Lys104 interacts through water molecules with the exocyclic N-4 amino group of the flanking cytosines on the 3′-side of the recognition sequence. These contacts are not seen with the d(AAAGATATCTT)2 substrate, presumably because they are prevented by steric exclusion of water molecules due to the presence of the 5-methyl group of thymine (Horton and Perona, 1998a). These water-mediated contacts to a base next to the recognition sequence could in principle be replaced by a specific hydrogen bond, if the lysine is replaced by the slightly larger arginine. Changing a water-mediated contact to a specific hydrogen bond should enable the resulting variant to discriminate against substrates which could not provide that additional contact. In the case of Lys104 a change to arginine has been suggested in order to create a direct contact on the 3′-side with the O-4 of a flanking thymine or the O-6 of a flanking guanine (Horton and Perona, 1998a). On the 5′-side, the side chain of Ala181 points towards the base pair flanking the recognition site. This residue has been investigated by site-directed mutagenesis recently and some of the variants generated were shown to display an extended specificity towards differently flanked sites, such as A181E, A181F, A181I and A181K (Schöttler et al., 1998). Therefore, combination of variants at positions 104 and 181 appeared to be very promising for the design of an 8 or 10 bp cutter, as discussed by Horton and Perona (1998a) and illustrated in Figure 1.
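The frequency argument made earlier in the Introduction (an 8 bp site occurring on average once every 65 536 bp, set against phage genomes of 10^4–10^5 bp) can be checked with a short calculation. The following is a minimal sketch, assuming that the four bases are equally frequent and independent of each other; the function names are hypothetical and are not taken from the paper.

```python
def mean_spacing(site_length):
    """Average distance in bp between occurrences of one specific site.

    Assumption: the four bases are equally likely and independent,
    so one defined site of length n occurs with probability 1 / 4**n.
    """
    return 4 ** site_length


def expected_sites(genome_length, site_length):
    """Expected number of occurrences of one specific site in a linear genome."""
    # Number of positions at which a site of this length can start.
    positions = max(genome_length - site_length + 1, 0)
    return positions / mean_spacing(site_length)


# Compare 6, 8 and 10 bp sites for genomes of 10^4 and 10^5 bp.
for n in (6, 8, 10):
    print(n, mean_spacing(n),
          round(expected_sites(10**4, n), 3),
          round(expected_sites(10**5, n), 3))
```

Under these assumptions a 6 bp site is expected many times in a genome of 10^5 bp, whereas an 8 bp site is expected fewer than twice, which is the quantitative basis for the statement that long recognition sites would not protect the host efficiently.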
As shown in Table I, combinations of different substitutions at positions 104 and 181 should result in several EcoRV variants with an extended specificity towards differently flanked substrates (Horton and Perona, 1998a). For example, combination of the Lys104 to Arg with the Ala181 to Glu substitution should result in a double mutant with a preference for CGATATCG. We have produced and characterized all the variants given in Table I. While some of the variants show preferences that are different from the wild-type enzyme, none of them displays the postulated preferences. Furthermore, the combination of amino acid substitutions does not lead to synergistic effects.

Table I. Predicted preferences of single and double mutants of EcoRV with amino acid substitutions at positions 104 and 181 according to Horton and Perona (1998a)

Variant        Possible contact             Predicted preferences
K104R          3′ O4 T or 3′ O6 G           NGATATC(G/T)
A181E          5′ N4 C                      CGATATCN
A181K          5′ N7 A and 5′ N7/O6 G       (A/G)GATATCN
K104R/A181E    5′ N4 C and 3′ O6 G          CGATATCG
K104R/A181K    5′ N7 A and 3′ O4 T          AGATATCT

Fig. 1. Models of possible interactions of EcoRV mutants with flanking DNA. These models were constructed by Horton and Perona (1998a) using subunit II of the dGC cocrystal structure as the starting point. Mutations in the protein were introduced and torsion angles varied systematically. Optimal least-squares superpositions of alternative base pairs were carried out using atoms in the glycosidic bonds and the glycosidic torsion angles were adjusted to match those in the dGC structure. (A) The K104R/A181E mutant interacting with flanking CG base pair. Here a 3′-G (GUA10) replaces the 3′-C visualized in the dGC structure. CYT9 is the 3′-base of the target site GATATC. Dotted black lines indicate modeled hydrogen bonds or nearest approach distances with the distance between the two electronegative atoms indicated in Å. The specific recognition of the outer base pair of the target site by Gly182 and Gly184 is also shown. (B) The K104R/A181K mutant interacting with a flanking TA base pair, where a 3′-T (THY10) replaces the 3′-C in the dGC structure. (C) The K104R/A181K mutant interacting with a flanking GC base pair. Here the flanking pair is as visualized in the dGC structure with a 3′-C nucleotide adjacent to the target site (reproduced with permission of the authors).
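The notation used in Table I (N for any base, alternatives written as (G/T)) can be turned into a simple sequence check. The following is a minimal sketch, assuming that each predicted preference is read as an 8 bp pattern on one strand; the dictionary and function names are hypothetical. It encodes only the predictions of Horton and Perona (1998a), not the experimentally observed preferences, which according to the text differ from the predicted ones.

```python
import re

# Predicted preferences copied from Table I (Horton and Perona, 1998a).
PREDICTED = {
    "K104R": "NGATATC(G/T)",
    "A181E": "CGATATCN",
    "A181K": "(A/G)GATATCN",
    "K104R/A181E": "CGATATCG",
    "K104R/A181K": "AGATATCT",
}


def pattern_to_regex(pattern):
    """Translate Table I notation into a compiled regular expression."""
    # "(G/T)" becomes the character class "[GT]".
    regex = re.sub(
        r"\(([ACGT](?:/[ACGT])*)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        pattern,
    )
    # "N" stands for any of the four bases.
    regex = regex.replace("N", "[ACGT]")
    return re.compile(regex)


def variants_predicted_to_prefer(site):
    """Return the variants whose predicted pattern matches the 8 bp site."""
    return [
        variant
        for variant, pattern in PREDICTED.items()
        if pattern_to_regex(pattern).fullmatch(site)
    ]


print(variants_predicted_to_prefer("CGATATCG"))
print(variants_predicted_to_prefer("AGATATCT"))
```

For the CGATATCG site mentioned in the text, the sketch lists the K104R/A181E double mutant together with the two single mutants whose less restrictive patterns also match it.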