View metadata, citation and similar papers at core.ac.uk brought to you by CORE

provided by Elsevier - Publisher Connector

Research Article 1563

Crystal structure of the DpnM DNA from the DpnII restriction system of Streptococcus pneumoniae bound to S-adenosylmethionine Phidung H Tran, Z Richard Korszun†, Susana Cerritelli, Sylvia S Springhorn and Sanford A Lacks*

Background: (Mtases) catalyze the transfer of methyl Address: Department of Biology, Brookhaven groups from S-adenosylmethionine (AdoMet) to a variety of small molecular and National Laboratory, Upton, NY 11973, USA. macromolecular substrates. These contain a characteristic α/β *Corresponding author. structural fold. Four groups of DNA Mtases have been defined and E-mail: [email protected] representative structures have been determined for three groups. DpnM is a DNA Mtase that acts on adenine N6 in the sequence GATC; the †This paper is dedicated to the memory of Z Richard represents group α DNA Mtases, for which no structures are known. Korszun who died on 31st March 1997. Key words: adenine N6 , DNA base Results: The structure of DpnM in complex with AdoMet was determined at eversion, methyltransferase evolution, multiwavelength 1.80 Å resolution. The protein comprises a consensus Mtase fold with a helical anomalous diffraction, protein fold conservation cluster insert. DpnM binds AdoMet in a similar manner to most other Mtases Received: 7 September 1998 and the enzyme contains a hollow that can accommodate DNA. The helical Revisions requested: 25 September 1998 cluster supports a shelf within the hollow that may recognize the target Revisions received: 12 October 1998 sequence. Modeling studies indicate a potential site for binding the target Accepted: 13 October 1998 adenine, everted from the DNA helix. Comparison of the DpnM structure and α Structure 15 December 1998, 6:1563–1575 sequences of group DNA Mtases indicates that the group is a genetically http://biomednet.com/elecref/0969212600601563 related family. Structural comparisons show DpnM to be most similar to a small- Mtase and then to macromolecular Mtases, although several © Current Biology Ltd ISSN 0969-2126 dehydrogenases show greater similarity than one DNA Mtase.

Conclusions: DpnM, and by extension the DpnM family or group α Mtases, contains the consensus fold and AdoMet-binding motifs found in most Mtases. Structural considerations suggest that macromolecular Mtases evolved from small-molecule Mtases, with different groups of DNA Mtases evolving independently. Mtases may have evolved from dehydrogenases. Comparison of these enzymes indicates that in protein evolution, the structural fold is most highly conserved, then function and lastly sequence.

Introduction All C5mC Mtases, including those from eukaryotes, systems are widespread among prokary- show considerable sequence homology [6,7]. The struc- otes. They contain an activity, the role of ture of a bacterial C5mC Mtase, M.HhaI, in complex which is to cleave incoming bacteriophage DNA at a specific with cognate DNA, was determined crystallographically sequence, thereby restricting viral growth. In general, these [8,9]. A striking observation was the eversion of the enzyme systems also contain a methyltransferase (Mtase) to target base from the double-helical DNA ; this modify that specific sequence in the host ’s DNA, eversion was also observed for another C5mC Mtase, thereby protecting it from the endonuclease. Depending on M.HaeIII [10]. Such base ‘flipping’ appears to be a the subunit composition of the enzymes, the systems have common mechanism for enzymatic attack on DNA been classified into several types (reviewed in [1]). So far, [11,12]. Also remarkable was a comparison of the M.HhaI structures have only been solved for representatives of type structure to that of catechol-O-Mtase (COMT) [13], II systems, in which the endonuclease and Mtase functions which revealed a common fold containing a seven- are carried on separate proteins. In all cases the Mtases trans- stranded β sheet and six α helices [13,14]. This charac- fer methyl groups from S-adenosylmethionine (AdoMet) to teristic Mtase fold has been observed in two additional either the C5 or N4 positions of or the N6 position DNA Mtases [15,16] and also in [17], protein [18] of adenine in DNA, to give C5mC, N4mC or N6mAd, and RNA [19,20] Mtases, for which structures have been respectively. Eukaryotes also contain C5mC [2], and determined. Comparison of an N6mAd Mtase, M.TaqI N6mAd Mtases are found in protozoa [3,4] and algae [5]. [15], to M.HhaI showed that as well as containing this 1564 Structure 1998, Vol 6 No 12

common fold, putative functional motifs were arranged Results and discussion in a similar manner within the fold [14]. In a bold Structure determination hypothesis that proposed a common structural fold for all The structure of DpnM in complex with AdoMet was DNA Mtases and assigned motifs determined at 1.8 Å resolution using multiwavelength within the fold, N4mC and N6mAd Mtases were divided anomalous diffraction (MAD) [33] of a Hg derivative at among three groups [21]. This grouping depended on 100°K. The crystals belong to the orthorhombic space the relative positions of conserved motifs and the puta- group P212121, with a unit cell of a = 56.6 Å, b = 68.1 Å, tive DNA target recognition domain (TRD) in the c = 85.2 Å, and one 32.9 kDa monomer per asymmetric protein chain: group γ corresponds to M.TaqI and other unit. Earlier Patterson mapping (not shown) revealed a N6mAd Mtases; group β consists of both N4mC and single Hg atom per monomer. Data were collected on two N6mAd Mtases; and group α contains, with one excep- frozen crystals, at several wavelengths related to the absorp- tion, N6mAd Mtases. The structure of M.PvuII, an tion edge for Hg: crystal I at the inflection (λ2), peak (λ3) N4mC Mtase in group β, was recently determined and and a remote shorter wavelength (λ4); crystal II at the peak generally fulfilled the prediction for this group [16]. (λ5) and a longer (bottom) wavelength (λ1). Statistics for these data are listed in Table 1. For MAD phasing, the data Cells of Streptococcus pneumoniae contain either the DpnI or from crystal I at different wavelengths were treated like the DpnII restriction system. These type II systems are data from multiple isomorphous replacements with inclu- encoded by interchangeable genetic cassettes located at a sion of anomalous scattering (MIRAS) [34]. The crystal II particular position in the chromosome [22]. DpnI is an bottom wavelength data were taken as ‘native’. Data from unusual restriction endonuclease that only cleaves DNA crystal II, λ5, were used for structural refinement of the methylated at GATC sites [23]. The DpnII system recog- DpnM–AdoMet complex to a resolution of 1.80 Å. The nizes the unmethylated sequence GATC and differs from refinement statistics are summarized in Table 2. most restriction systems in that it includes two indepen- dent Mtases, DpnM and DpnA [24], as do its closely DpnM is a polypeptide of 284 residues and related homologs MboI [25] and LlaI [26]. contains a total of 2273 nonhydrogen protein atoms. The final model contained 2088 nonhydrogen protein atoms In addition to its homologs in the MboI and LlaI systems, representing residues 10–200, 206–259 and 272–284. The DpnM is homologous to the Dam Mtases of N-terminal nine residues and parts of two loops extending [27] and its T4 bacteriophage [28], neither of which is part into the hollow in which we propose that DNA is bound of a restriction system. In E. coli and related , the are not visible in the electron-density map. These loops Dam Mtase acts to enhance mutation avoidance by a DNA mismatch correction system that, without the Table 1 methylation feature, is present in nearly all living cells Crystallographic MAD data statistics. [29]. These Mtases show overall sequence similarity to a larger family of N6mAd Mtases that methylate at GATC Parameter λ1 λ2 λ3 λ4 or similar sequences. This DpnM family corresponds essentially to the α group of N6mAd Mtases [21]. There- Wavelength (Å) 1.011 1.008 0.997 0.968 fore, determination of the DpnM structure would com- Energy (eV) 12,252 12,284 12,428 12,794 Resolution range (Å) 30–1.78 50–2.00 50–2.00 50–2.00 plete our structural knowledge of the major groups of Completeness (%) 99.7 88.7 87.8 88.8 DNA Mtases, which should lead to a better understanding Rmerge* 0.04 0.05 0.05 0.05 of their enzymatic mechanisms and evolution. † 35.0 19.2 19.5 19.4 Observed reflections 122,783 60,538 62,747 61,611 In the DpnII system, both DpnM and DpnA methylate Unique reflections 31,622 20,423 20,246 20,703 the N6 position of adenine in GATC [24]. However, Anomalous pairs 28,889 17,472 17,630 17,630 DpnM methylates only double-stranded DNA, and Highest resolution DpnA preferentially methylates single-stranded DNA shell (Å) 1.81–1.78 2.03–2.00 2.03–2.00 2.03–2.00 [30]. The DpnII system also contains an endonuclease, Completeness (%)‡ 99.9 72.9 74.4 72.6 ‡ DpnB, which cleaves unmethylated, double-stranded Rmerge* 0.37 0.25 0.27 0.24 DNA [31]. As the three DpnII system proteins are very †‡ 4.1 4.0 4.1 4.2 ‡ different in primary structure [22], it should be interest- Unique reflections 1630 819 841 841 ing to see how they recognize the same GATC sequence. The crystals were in space group P212121 with unit-cell dimensions We began this approach by investigating the structure of of a = 56.6 Å, b = 68.1 Å, c = 85.2 Å. The Mr of the subunit is the DpnM Mtase, a monomeric protein of 32,903 Da, for 32,900 and there were four subunits in the unit cell; the solvent content was 48%. *R = Σ|I–|/ΣI, where I is the observed which crystals were obtained [32]. We report here the merge intensity and is the averaged intensity from multiple observations. crystallographic analysis of the structure of DpnM in †The averaged ratio of to root mean square deviation. ‡Values for complex with AdoMet. the highest resolution shell. Research Article M.DpnII DNA adenine methyltransferase Tran et al. 1565

Table 2 The secondary structure elements of DpnM are portrayed diagrammatically in Figure 1d. In the α/β domain, the first Refinement statistics. six β strands are parallel and the seventh, which returns to Wavelength (Å) 1.008 the sheet between β strands 5 and 6, is antiparallel. Resolution range (Å) 20–1.80 (1.84–1.80) Except for strand 7, α helices precede the β strands and Completeness (%) 99.9 (100.0) are oriented more or less parallel to them. This configura-

Rmerge* 0.05 (0.22) tion is identical to the characteristic Mtase fold that was † 33.4 (8.7) found in M.HhaI [8] and COMT [13]. This domain is con- Observed reflections 207,177 (13527) served in other Mtase structures [10,15–20], as shown in Unique reflections 30,274 (1979) Figure 2. The Mtases appear to have evolved from a R ‡ 23.8 factor single α/β domain; the insertion of additional domains or R § 28.4 free secondary structure elements gives rise to the properties Nonhydrogen protein atoms 2088 of individual enzymes. In Figure 2a, the consensus fold is AdoMet atoms 28 α Number of water 178 shown to begin with 1 and continues with alternate β α Rms deviation from ideal strands and helices, with the exception of the anti- β β bond lengths (Å) 0.01 parallel 7 strand that immediately follows 6. This fold is bond angles (o) 1.3 an extension of the Rossmann fold [35], with the addition dihedrals (o) 22.8 of α1, α4 and β7. DpnM appears to be constructed by the improper (o) 1.1 simple insertion of a four-helix cluster between α3 and β3 Ramachandran plot of the Mtase fold. most favored regions (%) 89.1 additionally allowed (%) 10.0 Figure 2b portrays the structures of various Mtases as result- generously allowed (%) 0.9 ing from inserts into the common fold. This representation disallowed (%) 0.0 is consistent with a common genetic origin for nearly all Values in parentheses are for the highest resolution shell. Mtases. (, in which AdoMet partici- Σ < > Σ < > *Rmerge = |I– I |/ I, where I is the observed intensity and I is the pates in a reductive methylation of vitamin B12, lacks this averaged intensity from multiple observations. †Averaged ratio of fold [36].) Differences among the Mtases result from one or to root mean square of deviation. ‡R = Σ|F –F |/Σ|F |. §The R factor o c o free more inserts, which presumably arose in individual genes was calculated using a subset (10%) of the reflections not used in the refinement. depending on the substrate specificity of the encoded enzyme. The simplest insert, consisting of only two helices at the N terminus, is found in COMT, which methylates a are apparently disordered in the absence of DNA. In addi- small molecule, catechol. The other Mtases, which act on tion, 11 sidechains (which correspond to hydrophilic macromolecules, have more elaborate and/or multiple surface residues 10, 30, 34, 107, 138, 157, 170, 212, 248, inserts. The DpnM structure is very similar to that of 259 and 284) were not defined in the electron-density map COMT, which suggests its evolution from a small-molecule and are represented as alanine residues in the model. The Mtase, perhaps a close relative of COMT acting on adenine. model also contains 27 nonhydrogen atoms corresponding to AdoMet, 178 water molecules and the Hg atom, which In the case of M.PvuII [16], however, inserts cannot was located next to Cys70. account for the derivation of its fold from a single Mtase gene. But, M.PvuII shows evidence of Mtase gene dupli- Overall architecture cation [37], and once duplication occurs genetic mecha- A Cα tracing of the protein and a ribbon model of DpnM nisms readily lead to triplication. Triplication of a complexed with AdoMet are shown in Figures 1a and b. precursor gene, in concert with deletion of terminal seg- In the view shown, the protein appears in the form of a ments and simple insertions, could explain the origin of letter ‘C’. The protein contains a large domain (residues M.PvuII and group β Mtases in general. Circular permuta- 10–76 and 169–284) and a small domain (residues 77–168). tion was previously proposed to explain the origin of this The large domain, which corresponds to the upper two group from a group γ ancestor [16,21]; however, there is no thirds of the C, is an α/β domain that consists of a seven- known genetic basis for circular permutation of a single stranded β sheet decorated on either face by α helices. gene copy. In summary, insertions of substrate specificity This domain contains a Rossmann fold [13,35] and binds determinants at various places in a common ancestral AdoMet (as shown in Figure 1 and detailed below). The Mtase can account for the different groups of DNA small domain, which corresponds to the lower third of the Mtases, as well as for the variety of Mtases in general. C, is composed of an α-helical cluster. The opening in the C corresponds to a hollow where we propose that DNA is Comparison of DpnM to known protein structures bound, with its axis perpendicular to the plane of the pro- The DpnM structure was compared to structures in the jection (as modeled in Figure 1c). Protein Data Bank at Brookhaven National Laboratory 1566 Structure 1998, Vol 6 No 12

Figure 1

The structure of DpnM in complex with (a) (b) 231 AdoMet. The AdoMet and DNA 250 250 254 substrate are shown as stick models with 230 281 244 206 220 C terminus atoms shown in standard colors: C, black; N, C terminus 280 235 216 blue; O, red; P, magenta; S, yellow. (a) Tracing 183 242 of the Cα chain of DpnM with residue 190 200 200 190180 240 32 178 276 numbers offset above right. The width of the 195 210 259 trace and the size of labels decrease with 40 30 39 42 176 distance from the observer. (b) Ribbon 57 63 diagram of DpnM in complex with AdoMet. 60 55 49 25 The mainchain is colored from the N terminus 170 50 20 169 (blue) to the C terminus (green). Numbers indicate residues at the start of structural N terminus elements. Labels decrease in size with 70 140 10 distance from the observer. (c) The DpnM N terminus 10 130 114 110 complex with AdoMet modeled with bound 77 128 DNA substrate; labeling and color scheme as 78 102 160 120 in (b). (d) Schematic diagram of the secondary 156 100 structure elements of DpnM. Rectangles 80 150 94 represent α helices and block arrows 92 represent β strands. Yellow elements correspond to the consensus Mtase fold; red 90 elements correspond to the helical cluster in DpnM. Dashed lines represent unstructured VIII (c)231 (d) parts of loops. Colored segments indicate 254 6 258 regions of observed contacts with AdoMet 250 C 254 206 281 244 VII (green) and proposed contacts with the target 280 7276 C terminus IX 235 216 adenine (blue); other proposed contacts with 249 J244 183 242 the DNA are shown in red. Orange bars VI 190 200

235 5 241 designated by Roman numerals show the 32 178 276 195 V 39 42 259 location of conserved sequence motifs. 230 I 216 176 IV (Figures 1a–c were prepared using the 57 63 190 4 194 program MOLSCRIPT [64].) 55 49 25 31 A 25 N X 169 182 41 39 1 I N terminus H 53 B 48 10 178 III 114 62 57 2 77 128 II 78 102 76 C 65 127 156 78 170 3 175 94 DF

92 168 G 156 91 114 94 E 101 Structure

using the program DALI [38]. The 12 highest scoring prototype for the various Mtases acting on macromole- matches are listed in Table 3 and include all the Mtases in cules, each of which may have originated directly from a the Data Bank at the time of comparison. The closest progenitor like COMT. Intercomparison of various Mtases match is with COMT [13], followed by M.TaqI [15], and dehydrogenases, also shown in Table 3, is consistent glycine-N-methyltransferase (GNMT) [17], and VP39, the with the proposal that macromolecular Mtases, including vaccinia protein that methylates RNA [19]. A third DNA the several classes acting on DNA, evolved independently Mtase, M.HhaI [9], is last on the list. The elements of the from small-molecule Mtases. Mtase fold, which are present in all of these Mtases, accounts for the high scores. Most of the other high The similarity with GNMT is particularly interesting. scoring proteins are dehydrogenases that contain the Ross- Although this protein contains the consensus fold, it does mann fold, which accommodates adenosine-containing not bind AdoMet in the manner of the other Mtases coenzymes [35]. (described below). Unlike the other Mtases, GNMT acts as a tetramer and two subunits participate in binding an The high structural homology with COMT supports the AdoMet molecule [17]. GNMT may have originated from idea that an Mtase acting on a small molecule may be the a monomeric Mtase that evolved to a multimeric form and Research Article M.DpnII DNA adenine methyltransferase Tran et al. 1567

Figure 2

(a)

F E D N DpnM N Consensus fold

G C B A C α3 α2 α1 C

3 2 1 4 5 6 β3 β2 β1 β4 β5 β6 7 β7

H I J α4 α5 α6

(b) N A B COMT N 1 A GNMT

E D C C E DC B C

3 2 1 4 5 6 4 3 2 5 6 10 7 11 F G H I F G H 8 7 9 N H M. Taq I 4 3 B A 5 CheR ( ) C 3 2 1 4 5 6 8 I G F E D C B A N 7 9 C C D E 6 2 1 7 8 9 10

J K C M. Pvu II VP39 32 B1 B A Z F D C B A N C 3 3 2 1 4 5 6 8 1 7 3 2 1 4 5 7 I 8 N C D E H E F 6 9 G 51 52

ErmAM N C M. Hha II F G B A 8 Z G 16 F 15 14 12 1 13 H B A N 8 4 3 2 5 6 7 9 I C 3 2 1 4 5 6 10 7 9 11 C D E C D E Structure

Comparison of the structural fold of DpnM with other Mtases. Circles represent additional elements not shown. Elements are numbered represent α helices and triangles represent β strands; the N and C according to references in the text, with the exception that the termini are indicated. (a) DpnM topology showing the consensus fold numbering of α helices was converted to letters and the original (colored black) and the DpnM-specific insert (green). (b) Insertions numbering of M.HhaI [8] was modified in accordance with (green) into the consensus fold (black) for various Mtases. Absent subsequent publications [14,16]. consensus elements are indicated by red connectors. Parentheses altered its functional elements in the process. Thus, the the Mtases discussed, GNMT retains the fold without its Mtases present an interesting paradigm relating the evolu- functions, all the other Mtases conserve functions of the tion of structure and function. It appears that the overall elements making up the fold, and only closely related fold of a protein is most highly conserved, then the func- groups, like the DpnM family described below, show sig- tion of its elements, and lastly its amino acid sequence. Of nificant sequence identity. The tendency for a structural 1568 Structure 1998, Vol 6 No 12

Table 3

Comparison of DpnM to structures in the Protein Data Bank.

Z score* with test protein

Protein PDB code DpnM 1VID 2ADM 1GDH 1HMY

Catechol-O-methyltransferase (COMT) 1VID 8.2 – 10.9 5.9 7.8 Adenine-N6-DNA-methyltransferase (M.TaqI) 2ADM 8.1 10.9 – 6.9 7.8 Glycine-N-methyltransferase (GNMT) 1XVA 8.1 13.2 10.4 5.7 7.4 mRNA methyltransferase (VP39) 1VPT 6.1 9.2 8.4 5.1 4.8 D-Glycerate dehydrogenase 1GDH 5.6 5.9 6.9 – 3.9 NAD-dependent formate dehydrogenase 2NAD 5.4 6.5 6.6 27.8 5.3 D-3-Phosphoglycerate dehydrogenase 1PSD 5.1 6.1 6.6 32.7 4.6 17-β-Hydroxysteroid dehydrogenase 1FDS 5.1 9.0 6.0 7.1 6.5 Carbonyl reductase 1CYD 4.9 7.3 6.5 7.8 7.3 Aspartate carbamoyltransferase 8ATC 4.8 6.5 2.8 11.8 4.2 Trimethylamine dehydrogenase 2TMD 4.8 5.0 3.9 5.8 4.6 Cytosine-C5-DNA-methyltransferase (M.HhaI) 1HMY 4.8 7.8 7.8 3.9 –

*Z scores were calculated using the program DALI, version 2 [38]. fold to persist longer than sequence similarity can allow [28], despite recognition of GATC, are less similar to recognition of function even among proteins no longer DpnM and form a cluster with M.EcoRV [44], M.CviBI sharing significant sequence identity. This observation [45] and the pair M.StsIb and M.FokIb, which recognize has been made with other sets of proteins (e.g. amido- GATATC, GANTC and GGATG, respectively. The [39]) and may permit recognition of function in proteins M.StsI and M.FokI carry two Mtase activities proteins for which only the structure is known. each of which methylates the nonpalindromic recognition sequence on one strand [46,47]. The C-terminal half DpnM family motifs (M.StsIb or M.FokIb) methylates GGATG, and the N-ter- From protein sequence comparisons, DpnM appears to be minal half (M.StsIa or M.FokIa) methylates CATCC. The related to a family of Mtases that recognize sequences con- cluster most distant from DpnM in the family consists of taining GAT and methylate the A residue [27,28,40,21]. the latter two Mtases and M.CviAII [48] and M.NlaIII These sequences are aligned in Figure 3 together with a [49], which recognize CATG. The Cvi Mtases are linear display of the secondary structure determined for encoded by viruses that infect Chlorella, a eukaryote. DpnM. It is apparent that nearly all of the sequence inser- Eukaryotes may also have separated from bacteria over a tions or deletions in members of the family occur within billion years ago, but the plethora of restriction systems the loops between structural elements. found in the Chlorella viruses [50] suggests that these large viruses with of ~300 kb originated as Also related to the Mtases shown in Figure 3, is a group of bacterial parasites. Mtases that recognize DNA sequences containing CAT. The dendrogram in Figure 4 shows that the latter group is Examination of the sequence alignment of the DpnM less closely related to DpnM. Although group α Mtases family Mtases (Figure 3) shows that many residues are were categorized by the order of their common Mtase conserved throughout the sequence. Given the long motifs [21], it appears that except for the N4mC Mtase period of separate evolution of most of the species repre- M.MvaI, group α is a genetically related group of N6mA sented, the conserved residues probably all have func- Mtases, recognizable by sequence similarity, which we call tional significance. With respect to the DpnM secondary the DpnM family. structure, these residues fall into three classes: hydro- phobic residues within structural elements, both α helices Most closely related to DpnM in sequence are its and β strands, which bind these elements together into homologs from the LlaI and MboI systems, which like either the Mtase fold or the helical cluster; residues at the DpnII contain two separate Mtases recognizing GATC. ends of structural elements or within loops between them, Also closely related are two putative Mtases deduced from of which many correspond to motifs previously found in DNA sequence data, M.PgiI [41] and M.MjaI [42]. Mtases and are involved in binding AdoMet and catalyz- M.MjaI is classified in archaea, so either the DpnM family ing the methyl transfer; and residues within a long struc- is very ancient (archaea and eubacteria reputedly sepa- tured loop (from residue 128–155 in DpnM) in which rated over a billion years ago) or lateral exchange occurred almost half the residues are conserved. This structured more recently between archaea and eubacteria. The loop is exclusive to the DpnM family and presumably cor- stand-alone Dam Mtases from E. coli [43] and phage T4 responds to a DNA target recognition domain (TRD). In Research Article M.DpnII DNA adenine methyltransferase Tran et al. 1569

Figure 3

Sequence alignment of the DpnM family of Mtases. The secondary alignment: uppercase, conserved in every instance; lowercase, structure of DpnM is indicated above the sequences, with rectangles conserved in all but one or two sequences; X and x, hydrophobic representing α helices and block arrows representing β strands; residues. Boldface letters in the sequence indicate residues similar to unstructured regions are shown by dashes. Conserved residues and those in DpnM. See text for references. consensus motifs (labeled with Roman numerals) are shown below the addition to this rigid loop, two unstructured loops in the assigned the correct function. Motif III, however, was C-terminal third of the protein (indicated by dashes in the misidentified; it was positioned in what is actually the structure depicted in Figure 3) contain many polar middle of helix D, which is part of the helical cluster residues and probably interact with the DNA substrate domain of DpnM. The functional motif III, in which an after its initial binding. Asp177 carboxyl forms a hydrogen bond with adenine N6 of AdoMet, is located past the helical cluster. With respect Of the motifs that are implicated in the binding of to the Mtase fold, motif X corresponds to the loop preced- AdoMet and methyl transfer (detailed below), the predic- ing α1 and motifs I to IV correspond mainly to the loops tions made by Malone et al. [21] for group α Mtases were following β1 to β4, respectively. Motifs V to IX are rela- remarkably (but not entirely) correct, and we have tively weakly conserved in the DpnM family; they appear retained their nomenclature. Motifs X, I, II and IV were to maintain the framework of the fold. 1570 Structure 1998, Vol 6 No 12

Figure 4 Mtases. This supports the proposal of an independent origin of the four DNA Mtase groups from a small mol- ecule Mtase progenitor. Two other Mtases for which struc- Spn DpnM GATC tures were reported, the protein Mtase CheR that regulates chemotaxis in Salmonella typhimurium [18] and the RNA Lla II GATC Mtase VP39 of vaccinia virus [19], also share the consensus Pgi putative ? fold and bind AdoMet in a similar manner. However, the glycine Mtase (GNMT) shares the fold but binds AdoMet Mbo I GATC differently [17]. This protein acts as a tetramer, and AdoMet binding is shared by two adjacent monomers [17], Mja putative ? which may explain the different binding pattern.

Eco Dam GATC The availability of the DpnM structure should benefit Cvi BI GANTC mutational analysis of enzyme function by distinguishing local effects from broader configurational changes. Eco RV GATATC However, in the absence of such information, experimen- tal mutation of residues in EcoDam [51], T4Dam [52] and Eco T4Dam GATC M.EcoRV [53] supports a role for motif IV in AdoMet binding as well as in . Alteration of a Sts Ib GGATG residue in T4Dam, corresponding to Pro195 of DpnM, Fok Ib GGATG raised the Km for AdoMet 20-fold. Similarly, mutation of Asp181 of EcoDam, which corresponds to Asp194 of Sts Ia CATCC DpnM and is shown here to bond directly to the methion- ine amino group of AdoMet, completely prevented Fok Ia CATCC binding of AdoMet. AdoMet binding by M.EcoRV was shown to require Lys16, Glu37, Phe39 and Asp58, which Cvi AII CATG correspond to Lys21, Glu41, Phe43 and Asp62 of DpnM — all of which are shown here to bind to AdoMet Nla III CATG (Table 4). By demonstrating that Asp78 was not essential for binding, these investigators reported [53], as we do, that motif III had been misidentified. The crystal struc- 10% ture of DpnM shows that the AdoMet-binding residue is divergence Structure Asp177 in the true motif III.

Dendrogram showing relationships within the DpnM family of Mtases. DNA binding Stem lengths indicate percentage divergence of amino acid identity. Modeling of B-form DNA with the DpnM–AdoMet struc- (The figure was created using the program CLUSTAL W [65].) ture showed that the closest approach of the DNA methy- lation target to the methyl donor in AdoMet would occur if Cofactor binding the DNA fit into the hollow of the C-shaped DpnM The AdoMet cofactor is bound in a pocket on the surface protein with the DNA helical axis parallel to the groove of DpnM, with numerous hydrogen bonds linking it to the (Figure 1c). Such positioning of the DNA was observed for protein (as indicated in Table 4 and Figure 5). The aro- HhaI [9] and HaeIII [10] and also hypothesized for M.TaqI matic sidechains of Phe43 and, to a lesser extent, Phe63, [15] and M.PvuII [16]. In this position, the base pairs of Phe178 and Phe214 provide a hydrophobic environment the GATC recognition sequence approach a well struc- for the adenine ring. The conserved residues Gly45 and tured protein loop consisting of residues 130–150, which Gly47 allow a close approach of the cofactor methionine intrudes into the hollow. Although this sequence is highly sidechain. This pattern of AdoMet binding is the same as conserved in the DpnM family (Figure 3), it is not part of seen for other Mtases with the same fold (Table 4). the Mtase fold but rather a feature of the α-helical domain inserted into that fold; it presumably represents a TRD The Mtases listed in Table 4, which include representa- [54,21]. We propose that this loop forms a rigid shelf that tives of all four groups of DNA Mtases, all bind AdoMet in initiates contact with the cognate DNA. Several conserved a similar manner. Most bonds are formed between the residues, such as Asn130, Arg134, Asn136 and Asn142, cofactor and residues in motifs I, II and III, but some could potentially form hydrogen bonds with DNA bases. bonds are also made by motifs X and IV. Interestingly, the The basic residues Arg147 and Lys149 might interact with participation of motif IV makes DpnM more similar to phosphate groups. Supporting such a docking function, COMT, a small molecule Mtase, than to the other DNA mutants of M.EcoRV altered in residues corresponding to Research Article M.DpnII DNA adenine methyltransferase Tran et al. 1571

Table 4

Interactions between AdoMet and amino acid residues in DpnM and other Mtases*.

Atoms in other Mtases

AdoMet atom Distance (Å) Atom in DpnM M.TaqI † M.PvuII † M.HhaI † COMT†

N1 3.01 N Phe178 (III) N Phe90 (III) N Asp34 (III) N Ile61 (III) OG Ser119 (III) N3 3.45 N Phe63 (II) N Ile72 (II) N Trp41 (II) N6 3.17 OD2 Asp177 (III) OD2 Asp89 (III) OD1 Asp34 (III) OD1 Asp60 (III) OG Ser119 (III) N6 3.53 H2O616 OE Gln120 (III) N7 3.33 H2O616 O2′ 2.74 OD1 Asp62 (II) OE1 Glu71 (II) OE1 Glu294 (II) OE1 Glu40 (II) OE1 Glu90 (II) ′ O2 2.94 H2O444 NE1 Trp44 (II) H2O444– ND2 Asn92 (II) O3′ 2.63 OD2 Asp62 (II) OE2 Glu71 (II) OE2 Glu294 (II) OE2 Glu40 (II) OE2 Glu90 (II) ′ O3 3.06 NE1 Trp17 (X) ND1 His246 (X) H2O333– ND2 Asn304 (X) OX1 3.09 N Lys21 (X) N Thr23 (X) N Ser276 (I) N Gly23 (I) OX1 3.53 N Ala48 (I) OG Thr23 (X) OG Ser276 (I) OG Ser305 (X) OG Ser72 (I) O 2.84 N Gly46 (I) OE1 Glu22 (X) OG Ser305 (X) N Val42 (X) O 2.81 H2O402– N Leu21 (I) 2.89‡ OE2 Glu41 (I) NH3+ 2.69 OD2 Asp194 (IV) OD1 Asn105 (IV) OD2 Asp141 (IV) 3+ NH 3.05 H2O902– OE1 Glu45 (I) H2O340– OG Ser72 (I) 2.60‡ O Phe43 (I) OE1 Glu119 (VI) 2.80‡ OE2 Glu41 (I) 3+ NH 3.37 H2O402– H2O338– O Gly66 (I) 2.97‡ N Leu49 (I) N Gly78 (IV)

*The designation of motifs is given in parentheses. †References: M.TaqI [15], M.PvuII [16], M.HhaI [9], COMT [13]. ‡Via water bridge.

Arg134 and Asn136 were defective in DNA binding [53]. residue (Pro144 in DpnM) to serine allowed methylation In the T4Dam Mtase, changing the conserved proline of non-GATC sites [54]. Subsequent to docking, the two

Figure 5

The of DpnM showing the binding of AdoMet and the proposed binding mode of the target adenine in DNA. Atoms are shown in standard colors: C, black; N, blue; O, red (except water O, white); P, purple; S, yellow. Tyr197 The bonds are color coded: AdoMet, orange; adenylate residue of DNA, deep blue; protein, green. Dashed lines indicate hydrogen bonds. Ade For Leu49, Phe43, Phe63 and Phe178, only mainchain atoms are shown; for Asp194 and Pro195 Trp17, only terminal parts of the sidechain are indicated. Not all contacts are shown. (The Lys21 figure was prepared using the program Asp194 MOLSCRIPT [64].) AdoMet

Trp17

Phe178

Asp177 Phe43 Ala48 Phe63

Asp62 Gly46 Glu41 Leu49 Structure 1572 Structure 1998, Vol 6 No 12

Table 5

Structural interactions between DpnM atoms mediated by hydrogen bonds.

First atom Distance Second atom

SE* Motif Residue Atom (Å) Atom Residue Motif SE*

X Lys21 NZ 3.14† OD1 Asp194 IV 4 X Lys21 NZ 2.80† OD2 Asp194 IV 4 X Lys21 NZ 2.93 OH Tyr197 IV I Asn37 ND2 3.18 O Thr187 X I Asn37 OD1 3.14–H2O452–3.46 OD1 Asp76 C 1 I Tyr39 OH 2.49 OE1 Glu41 I 1 1 I Glu41 OE1 3,18 N Phe50 B I Gly45 O 2.90 OH Tyr71 C E Tyr96 OH 2.76 ND1 His88 D TRD Arg134 NE 2.85 OD1 Asn142 TRD TRD Asn136 ND2 2.89 OE1 Gln140 TRD TRD Asn142 ND2 2.85 N Asn136 TRD III Asp177 OD1 2.94 NH2 Arg221 V I III Asp177 OD2 2.88 NH1 Arg221 V I IV Tyr197 OH 2.60 OE1 Glu276 IX 7 I V Asp218 OD1 3.06 NH2 Arg221 V I I V Asp218 OD2 2.83 OG Ser215 V I V Asp218 OD2 2.83 N Ser215 V I V Gln219 O 2.78–H2O437–2.75 OH Tyr250 VII J I V Gln219 OE1 3.08–H2O437 I V Gln219 NE2 2.77–H2O424–2.87 OH Tyr209 IV Tyr197 OH 3.42–H2O461–2.86 OG Ser239 VI 5 H2O461–2.94 OE1 Glu276 IX 7 IV Tyr197 OH 2.60 OE1 Glu276 IX 7 5 VI Ser239 OG 2.83 OH Tyr250 VII J 5 VI Asn240 O 3.04 N Ile198 IV 5 VI Asn240 ND2 3.16 OE1 Gln219 I 5 VI Ser241 OG 2.71 O Ile198 IV J VII Glu247 OE1 3.03 OH Tyr257 VIII 6 6 VIII Asn254 ND2 2.75 OH Tyr282 IX 7 6 VIII Asn254 ND2 2.97 OE1 Glu283 IX 6 VIII His256 ND1 3.16 O Ile278 IX 7 7 IX Asn281 O 2.92 N Val236 VI 5 7 IX Asn281 ND2 2.76 O Ala234 VI 5 7 IX Asn281 ND2 2.82 OG Ser230 VI I

*SE (structural element); letters refer to α helices and numbers refer to β strands. †The ionic pair Lys21 and Asp194 form a salt bridge. unstructured and presumably flexible loops (residues residues in motif IV, which is considered to be part of the 201–205 and 260–273), which are rich in serine, threonine, catalytic site in all DNA Mtases [21], Ade can bind to and residues that are potential hydrogen- residue Lys21 of motif X, which together with Gly19 is bond donors, could anchor the DNA. These loops are not highly conserved in the DpnM family. The absence of a highly conserved in sequence, but they might stabilize the side chain in Gly19 allows the bulky adenine ring to fit into complex through nonspecific interactions with the phos- the receptor site. Figure 1c models the DNA substrate phate–sugar backbone of the DNA. with the everted Ade bound at this site. The everted DNA backbone is modeled on the observed extension in M.HhaI After docking of the DNA substrate the adenine target [9]. Indirect experimental evidence of Ade eversion was would be too far from the active site to undergo methyla- reported for DpnM family member M.EcoRV [55]. tion, unless the adenine residue is everted from the DNA as found for the target cytosine residue in M.HhaI and Catalytic site M.HaeIII [9,10]. Such eversion was also hypothesized for As hypothetically modeled in Figure 5, Ade is juxtaposed target bases in M.TaqI [15] and M.PvuII [16], and in the to Tyr197 with the planes of the aromatic rings nearly par- latter case a receptor site for the target cytosine was pro- allel and separated by ~3.8 Å. All DNA amino-Mtases posed. A putative receptor site can be similarly discerned have either or at this position in in the DpnM structure for the target adenine (Ade), with the catalytic motif IV (Asp-Pro-Pro-Tyr, Asn-Pro-Pro-Tyr, bonding as indicated in Figure 5. As well as binding to Asn-Pro-Pro-Phe or Ser-Pro-Pro-Tyr) [56], and these Research Article M.DpnII DNA adenine methyltransferase Tran et al. 1573

aromatic residues presumably bind the target bases simi- motifs together to form the cofactor-binding and catalytic larly. Hydrogen bonds to ring atoms N1 and N7 from the sites; and intramotif bonds that give structure to the Lys21 ε-amino and Tyr197 imido groups would orient the motifs. In this regard, note the linkage of Arg134, Asn136 Ade ring. The catalysis of methyl transfer directly to and Asn142, which are three highly conserved residues in adenine N6, shown to occur for M.EcoRI [57], is presum- the putative DNA TRD. ably mediated by the negative polarization of N6 by hydro- gen bonding to Asp194 and Pro195, as previously proposed Biological implications on the basis of sequence comparisons [21]. In the configu- Methyltransferases (Mtases) catalyze the transfer of ration shown, Ade N6 lies only 1.7 Å from the of the methyl groups from S-adenosylmethionine (AdoMet) to that will be transferred to it. Thus, we may a variety of small molecular and macromolecular sub- have modeled the transition state for catalysis. strates. DNA Mtases have been classified into four groups of which three (α, β and γ) can methylate Although the effects of amino acid substitution on catalysis adenine. Structural information is available for three of in DpnM have not been investigated, such changes have these groups. DpnM is an enzyme from the restriction- been examined in its homologs, EcoDam, T4Dam and modification system DpnII of Streptococcus pneumoniae M.EcoRV. In EcoDam, changes in the Asp-Pro-Pro portion and a member of the α group of DNA adenine Mtases. of motif IV abolished Mtase activity, although changing Until now, no structural information was available for the second proline residue to glycine or glutamate still this class of Mtase, and the crystal structure determina- allowed binding of AdoMet, which was detected by its tion of DpnM reported here fills a gap in our structural cross-linking to the protein on UV irradiation [51]. In knowledge. Structural comparisons of DpnM to other T4Dam, alteration of the first proline residue in this motif proteins in the data base showed it to be most closely to alanine or threonine reduced kcat by a factor of two and related to a small-molecule Mtase, catechol-O-methyl- four, respectively, but the Km for AdoMet was reduced by (COMT). The structure also showed sim- factors of five and 20 [52]. The aspartate residue in this ilarities to other macromolecular Mtases, but some motif makes a hydrogen bond to AdoMet and the effect of dehydrogenases were closer in structure to DpnM than its substitution on cofactor binding, as observed for was a DNA cytosine methylating enzyme. EcoDam, is understandable. With respect to the effect of altering the first proline residue in the motif on AdoMet The structural similarity between Mtases and dehydro- binding, as seen in both EcoDam and T4Dam, this could genases may illuminate the origin of Mtases as a class. be explained if the proline residue at this position has a Although Mtases perform invaluable biological func- role in directing the mainchain away from blocking the tions, in a general sense they are dispensable. For AdoMet site. In EcoRV, the alteration of the aspartate and example, entire eukaryotic phyla lack DNA cytosine tyrosine residues in motif IV inactivated the Mtase, as methylation [58]. The structurally related dehydroge- expected, but changes in M.EcoRV residues Asp211, nases, however, carry out essential metabolic functions. Ser229, Trp231 and Tyr258 also interfered with catalysis, Therefore, it is likely that Mtases evolved from dehydro- without affecting AdoMet or DNA binding [53]. These genases that used adenosine-containing compounds as residues correspond to Asp218, Ser239, Ser241 and Arg262 cofactors, by adaptation of the cofactor- to in DpnM. The last residue falls in a disordered part of the accommodate AdoMet. DpnM protein. Although the first two of these residues are conserved throughout the DpnM family, it is not clear The structure of DpnM in complex with AdoMet pro- from the protein structure how they could play a direct vides additional evidence for an ancestral Mtase fold in role in catalysis. They do, however, link parts of the Mtase proteins that retain Mtase function but no significant fold together, and the mutations might affect the overall sequence similarity. Comparisons of the structure conformation of the protein. suggest that the progenitor Mtase acted on a small mol- ecule, and that various Mtases acting on DNA, RNA or Structural fold interactions proteins evolved independently through the insertion of Table 5 lists some interactions between residues in DpnM additional sequence into the fold between its elements of that appear to contribute to its structural integrity rather secondary structure. similarities between than participating directly in cofactor or substrate binding the three groups of DNA adenine Mtases indicate their or catalysis. These residues are conserved for the most possible origin from an Mtase acting on free adenine. part in the DpnM family (Figure 3), with the exception of Cytosine C5 Mtases may have originated from an those only providing mainchain contacts. Most of the enzyme acting on free cytosine. Combinations of gene interactions are direct, but in a few instances they are duplication and insertion can account for the variety of mediated by water molecules. Three types of interactions structures evolved from a single progenitor. In the case of are evident: those that join together α helical or β strand DpnM, an inserted helical cluster may serve as a plat- elements of the Mtase fold; those that bring different form for docking the cognate DNA. DpnM contains a 1574 Structure 1998, Vol 6 No 12

hollow in which the DNA substrate can fit; nevertheless, initial model was built using the program O [62]. The structure was it appears from the location of the methyl donor that the refined by X-PLOR [63] alternated with rounds of model building. The current model has an R factor of 23.8% and an Rfree of 28.4%. Refine- target adenine is everted, as found for other enzymes ment statistics are summarized in Table 2. acting on DNA bases. Although DpnM methylates only GATC, related proteins recognize slightly divergent Accession numbers DNA sequences, presumably by minor modifications in The atomic coordinates for the DpnM–AdoMet complex have been a target recognition domain. Enzymes methylating deposited in the Brookhaven Protein Data Bank with accession code 2DPM. adenine in quite different recognition sequences and those methylating cytosine fall into different families. Acknowledgements Functional motifs situated adjacent to structural ele- We dedicate this paper to the memory of Z Richard Korszun, whose tragic ments of the Mtase fold retain a high degree of sequence death occurred while this work was in progress. Maria Bewley kindly pro- similarity only among members of a family. Despite vided expert guidance for completing the work. We are grateful to Robert M Sweet and Malcolm S Capel for advice and assistance at the NSLS lacking sequence similarity with more distantly related beamlines, to Stephan L Ginell for advice on freezing crystals, to Joel Mtases, these motifs still retain similar functions in co- L Sussman for helpful suggestions in analyzing the results and to Stephen W White for helping initiate this study. This research was conducted at factor binding or catalysis. One exception is glycine Brookhaven National Laboratory under the auspices of the US Department N-methyltransferase, which retains the fold but has of Energy Office of Biological and Environmental Research. It was also sup- altered functions. It appears that throughout evolution ported by US Public Health Service grant GM29721. the overall fold of a protein is most highly conserved, then the function of its elements and finally the sequence. References 1. Wilson, G.G. (1991). Organization of restriction-modification systems. Nucleic Acids Res. 19, 2539-2566. Materials and methods 2. Hotchkiss, R.D. (1948). The quantitative separation of purines, Protein preparation and crystallization pyrimidines, and nucleotides by paper chromatography. J. Biol. Chem. 175 DpnM was produced in an E. coli expression system and purified by , 315-332. 3. Gorovsky, M.A., Hattman, S. & Pleger, G.L. (1973). (6N)methyl the modification of a previously described method [30]. The protein adenine in the nuclear DNA of a eucaryote, Tetrahymena pyriformis. J. was crystallized by lowering the concentration of NaCl [32]. Details of Cell Biol. 56, 697-701. the purification and crystallization procedures will be published else- 4. Cummings, D.J., Tait, A. & Goddard, J.M. (1974). Methylated bases in where (SAL and SSS, unpublished results). The crystals used here DNA from Paramecium aurelia. Biochim. Biophys. Acta 374, 1-11. were grown at 4°C by vapor diffusion in hanging drops containing 5. Van Etten, J.L., Schuster, A.M., Girton, L., Burbank, D.E., Swinton, D. 15 µg (0.45 nmol) of DpnM protein, 0.9 nmol AdoMet, 100 mM & Hattman, S. (1985). DNA methylation of viruses infecting a HEPES buffer, pH 7.0, 250 mM NaCl and 5% (w/v) polyethylene glycol eukaryotic Chlorella-like green alga. Nucleic Acids Res. 13, (PEG) 3350 in a volume of 6.0 µl, with 1.0 ml of 50 mM HEPES, 3471-3478. 125 mM NaCl and 2.5% PEG 3350 in the reservoir. 6. Som, S., Bhagwat, A.S. & Friedman, S. (1987). Nucleotide sequence and expression of the gene encoding the EcoRII modification enzyme. Nucleic Acids Res. 14, 313-332. Mercury derivatization and freezing of crystals 7. Kumar, S., et al., & Wilson, G.G. (1994). The DNA (cytosine-5) Crystals, 0.2 to 0.3 mm in each dimension, were transferred into 90 µl methyltransferases. Nucleic Acids Res. 22, 1-10. of a cryoprotective solution containing 10 µM AdoMet, 0.1 M HEPES, 8. Cheng, X., Kumar, S., Posfai, J., Pflugrath, J.W. & Roberts, R.J. (1993). pH 7.0, 0.5 M NaCl and 25% PEG 3350. Mercuric acetate was Crystal structure of the HhaI DNA methyltransferase complexed with added to 0.5 mM. After soaking for two days at 4°C, the crystals were S-adenosyl-L-methionine. Cell 74, 299-307. frozen in propane. The crystals were isomorphous in the presence and 9. Klimasauskas, S., Kumar, S., Roberts, R.J. & Cheng, X. (1994). HhaI absence of Hg. methyltransferase flips its target base out of the DNA helix. Cell 76, 357-369. 10. Reinisch, K.M., Chen, L., Verdine, G.L. & Lipscomb, W.N. (1995). The Data collection crystal structure of HaeIII methyltransferase covalently complexed to Single crystals, kept frozen by means of an Oxford nitrogen cold DNA: an extrahelical cytosine and rearranged base pairing. Cell 82, stream, were used to collect data at multiple wavelengths at each of 143-153. two beamline stations, X12B and X12C, at the National Synchrotron 11. Vassylyev, D.G., et al., & Morikawa, K. (1995). Atomic model of a Light Source (NSLS) of Brookhaven National Laboratory. For the MAD pyrimidine dimer excision repair enzyme complexed with a DNA analysis, as indicated in Table 1, data taken at the longer wavelength at substrate: structural basis for damaged DNA recognition. Cell 83, X12B were used as the preinflection or bottom wavelength (λ1) and 773-782. data at three wavelengths obtained at X12C corresponded to the 12. Slupphaug, G., Mol, C.D., Kavli, B., Arvai, A.S., Krokan, H.E. & Tainer, J.A. (1996). A nucleotide-flipping mechanism from the structure of inflection (λ2), peak (λ3) and remote (λ4) wavelengths with respect to λ human uracil-DNA glycosylase bound to DNA. Nature 384, 87-92. Hg absorption. Data from X12B at a shorter wavelength ( 5) were 13. Vidgren, J., Svensson, L.A. & Liljas, A. (1994). Crystal structure of used for structure refinement. Data were processed using the HKL catechol O-methyltransferase. Nature 368, 354-358. suite of programs [59]. 14. O’Gara, M., McCloy, K., Malone, T. & Cheng, X. (1995). Structure- based sequence alignment of three AdoMet-dependent Phasing, model building and refinement methyltransferases. Gene 157, 135-138. The MIRAS method was used to derive phases for MAD data sets by 15. Labahn, J., et al., & Saenger, W. (1994). Three-dimensional structure treating scattering differences at various wavelengths, which result of the adenine-specific DNA methyltransferase M.TaqI in complex with from anomalous diffraction, similarly to isomorphous differences due to the cofactor S-adenosylmethionine. Proc. Natl Acad. Sci. USA 91, 10957-10961. addition of a heavy atom [34]. Data taken at λ1 served as the ‘native’ 16. Gong, W., O’Gara, M., Blumenthal, R.M. & Cheng, X. (1997). set. The PHASES suite of programs [60] was used to obtain initial Structure of PvuII DNA-(cytosine N4) methyltransferase, an example of λ phases. Phasing power calculated from anomalous differences at 2, domain permutation and protein fold assignment. Nucleic Acids Res. λ3 and λ4 and isomorphous differences at λ2 and λ4 ranged from 25, 2702-2715. 0.87 to 2.14. After solvent flattening [61], the overall figure of merit 17. Fu, Z., et al., & Takusagawa, F. (1996). Crystal structure of glycine N- was 0.874. Interpretable electron-density maps were obtained, and an methyltransferase from rat liver. 35, 11985-11993. Research Article M.DpnII DNA adenine methyltransferase Tran et al. 1575

18. Djordjevic, S. & Stock, A.M. (1997). Crystal structure of the chemotaxis 42. Bult, C.J., et al., & Venter, C. (1996). Complete sequence of receptor methyltransferase CheR suggest a conserved the methanogenic archaeon, Methanococcus jannaschii. Science for binding S-adenosylmethionine. Structure 5, 545-558. 273, 1058-1073. 19. Hodel, A.E., Gershon, P.D., Shi, X. & Quiocho, F.A. (1996). The 43. Brooks, J.E., Blumenthal, R.M. & Gingeras, T.R. (1983). The isolation 1.85 Å structure of Vaccinia protein VP39: a bifunctional enzyme that and characterization of the Escherichia coli DNA adenine methylase participates in the modification of both mRNA ends. Cell 85, 247-256. (dam) gene. Nucleic Acids Res. 11, 837-851. 20. Yu, L., et al., & Fesik, S.W. (1997). Solution structure of an rRNA 44. Bourgueleret, L., Schwarzstein, M., Tsugita, A. & Zabeau, M. (1984). methyltransferase (ErmAM) that confers macrolide-lincosamide- Characterization of the genes coding for the EcoRV restriction and streptogramin antibiotic resistance. Nat. Struct. Biol. 4, 483-489. modification system of Escherichia coli. Nucleic Acids Res. 12, 21. Malone, T., Blumenthal, R.M. & Cheng, X. (1995). Structure-guided 3659-3676. analysis reveals nine sequence motifs conserved among DNA 45. Kan, T.N., Li, L. & Chandrasegaran, S. (1992). Cloning, sequencing, aminomethyltransferases, and suggests a catalytic mechanism for overproduction, and purification of M.Cvi BI (GANTC) these enzymes. J. Mol. Biol. 253, 618-632. methyltransferase from Chlorella virus NC-1A. Gene 121, 1-7. 22. Lacks, S.A., Mannarelli, B.M., Springhorn, S.S. & Greenberg, B. 46. Looney, M.C., et al., & Wilson, G.G. (1989). Nucleotide sequence of (1986). Genetic basis of the complementary DpnI and DpnII the Fok I restriction-modification system: separate strand-specificity restriction systems of S. pneumoniae: an intercellular cassette domains in the methyltransferase. Gene 80, 193-208. mechanism. Cell 46, 993-1000. 47. Kita, K., Suisha, M., Kotani, H., Yanase, H. & Kato, N. (1992). Cloning 23. Lacks, S. & Greenberg, B. (1977). Complementary specificity of and sequence analysis of the StsI restriction-modification gene: restriction of Diplococcus pneumoniae with respect to presence of homology to Fok I restriction-modification enzymes. DNA methylation. J. Mol. Biol. 114, 153-168. Nucleic Acids Res. 20, 4167-4172. 24. de la Campa, A.G., Kale, P., Springhorn, S.S. & Lacks, S.A. (1987). 48. Zhang, Y., Nelson, M., Nietfeldt, J.W., Burbank, D.E. & Van Etten, J.L. Proteins encoded by the DpnII restriction gene cassette: two (1992). Characterization of Chlorella virus PBCV-1 CviAII restriction methylases and an endonuclease. J. Mol. Biol. 196, 457-469. and modification system. Nucleic Acids Res. 20, 5351-5356. 25. Ueno, T., Ito, H., Kimizuka, F., Kotani, H. & Nakajima, K. (1993). Gene 49. Labbe, D., Holtke, H.J. & Lau, P.C. (1990). Cloning and structure and expression of the MboI restriction-modification system. characterization of two tandemly arranged DNA methyltransferase Nucleic Acids Res. 21, 2309-2313. genes of Neisseria lactamica: an adenine-specific M.NlaIII and a 26. Moineau, S., Walker, S.A., Vedamuthu, E.R. & Vandenbergh, P.A. cytosine-type methylase. Mol. Gen. Genet. 224, 101-110. (1995). Cloning and sequencing of the LlaII restriction/modification 50. Zhang, Y., et al., & Van Etten, L. (1998). Chlorella virus NY-2A genes from Lactococcus lactis and relatedness of this system to the encodes at least 12 DNA endonuclease/methyltransferase genes. Streptococcus pneumoniae DpnII system. Appl. Environ. Microbiol. Virology 240, 366-375. 61, 2193-2202. 51. Guyot, J.-B., Grassi, J., Hahn, U. & Guschlbauer, W. (1993). The role 27. Mannarelli, B.M., Balganesh, T.S., Greenberg, B., Springhorn, S.S. & of the preserved sequences of Dam methylase. Nucleic Acids Res. Lacks, S.A. (1985). Nucleotide sequence of the DpnII DNA methylase 21, 3183-3190. gene of Streptococcus pneumoniae and its relationship to the dam 52. Kossykh, V.G., Schlagman, S.L. & Hattman, S. (1993). Conserved gene of E. coli. Proc. Natl Acad. Sci. USA 82, 4468-4472. sequence motif DPPY in region IV of the phage T4 Dam DNA-[N6- 28. Hattman, S., Wilkinson, J., Swinton, D., Schlagman, S., MacDonald, adenine]-methyltransferase is important for S-adenosyl-L-methionine P.M. & Mosig, G. (1985). Common evolutionary origin of the phage T4 binding. Nucleic Acids Res. 21, 3563-3566. dam and host Escherichia coli DNA-adenine methyltransferase genes. 53. Roth, M., Helm-Kruse, S., Friedrich, T. & Jeltsch, A. (1998). Functional J. Bacteriol. 164, 932-937. roles of conserved amino acid residues in DNA methyltransferases 29. Claverys, J.P. & Lacks, S.A. (1986). Heteroduplex deoxyribonucleic investigated by site directed mutagenesis of the EcoRV adenine-N6- acid base mismatch repair in bacteria. Microbiol. Rev. 50, 133-165. methyltransferase. J. Biol. Chem. 273, 17333-17342. 30. Cerritelli, S., Springhorn, S.S. & Lacks, S.A. (1989). DpnA, a 54. Miner, Z., Schlagman, S.L. & Hattman, S. (1989). Single amino acid methylase for single-strand DNA in the DpnII restriction system, and changes that alter the DNA sequence specificity of the DNA-[N6- its biological function. Proc. Natl Acad. Sci. USA 86, 9223-9227. adenine] methyltransferase (Dam) of bacteriophage T4. Nucleic Acids 31. Vovis, G.F. & Lacks, S. (1977). Complementary action of restriction Res. 17, 8149-8157. enzymes Endo R.DpnI and Endo R.DpnII on bacteriophage f1 DNA. 55. Cal, S. & Connolly, B.A. (1997). DNA distortion and base flipping by J. Mol. Biol. 115, 525-538. the EcoRV DNA methytransferase. J. Biol. Chem. 272, 490-496. 32. Cerritelli, S., White, S.W. & Lacks, S.A. (1989). Crystallization of the 56. Smith, H.O., Annau, T.M. & Chandrasegaran, S. (1990). Finding DpnM methylase from the DpnII restriction system of Streptococcus sequence motifs in groups of functionally related proteins. Proc. Natl pneumoniae. J. Mol. Biol. 207, 841-842. Acad. Sci. USA 87, 826-830. 33. Hendrickson, W.A. & Ogata, C.M. (1997). Phase determination from 57. Pogolotti, A.L., Ono, A., Subramaniam, R. & Santi, D.V. (1988). On the multiwavelength anomalous diffraction measurements. Methods mechanism of DNA-adenine methylase. J. Biol. Chem. 263, Enzymol. 276, 494-523. 7461-7464. 34. Ramakrishnan, V. & Biou, V. (1997). Treatment of multiwavelength 58. Bird, A.P. & Taggart, M.H. (1980). Variable patterns of total DNA and anomalous diffraction data as a special case of multiple isomorphous rDNA methylation in animals. Nucleic Acids Res. 8, 1485-1497. replacement. Methods Enzymol. 276, 538-557. 59. Otwinowski, Z. & Minor, W. (1997). Processing of X-ray diffraction 35. Rossmann, M.G., Moras, D. & Olsen, K.W. (1974). Chemical and data collected in oscillation mode. Methods Enzymol. 276, 307-326. biological evolution of a nucleotide-binding domain. Nature 250, 194-199. 60. Furey, W. & Swaminathan, S. (1997). PHASES-95: a program 36. Dixon, M.M., Huang, S., Matthews, R.G. & Ludwig, M. (1996). The package for the processing and analysis of diffraction data from structure of the C-terminal domain of methionine synthase: presenting macromolecules. Methods Enzymol. 277, 590-620. S-adenosylmethionine for reductive methylation of B12. Structure 4, 61. Wang, B.C. (1985). Resolution of phase ambiguity in macromolecular 1263-1275. crystallography. Methods Enzymol. 115, 90-112. 37. Adams, G.M. & Blumenthal, R.M. (1997). The PvuII DNA-(cytosine- 62. Jones, T.A., Zou, J.Y., Cowan, S.W. & Kjeldgaard, M. (1991). Improved N4) methyltransferase comprises two trypsin-defined domains, each methods for building protein models in electron density maps and the of which binds a molecule of S-adenosyl-L-methionine. Biochemistry location of errors in these models. Acta Cryst. A 47, 110-119. 36, 8284-8292. 63. Brünger, A.T. (1992). X-PLOR, Version 3.1. Yale University Press, 38. Holm, L. & Sander, C. (1993). Protein structure comparison by New Haven, CT. alignment of distance matrices. J. Mol. Biol. 233, 123-138. 64. Kraulis, P.J. (1991). MOLSCRIPT: a program to produce both detailed 39. Holm, L. & Sander, C. (1997). An evolutionary treasure: unification of a and schematic plots of protein structures. J. Appl. Cryst. 24, 946-950. broad set of amidohydrolases related to urease. Proteins 28, 72-82. 65. Thompson, J.D., Higgins, D.G. & Gibson, T.J. (1994). CLUSTAL W: 40. Lauster, R., Kriebardis, A. & Guschlbauer, W. (1987). The GATATC- improving the sensitivity of progressive multiple sequence alignment modification enzyme EcoRV is closely related to the GATC- through sequence weighting, position specific gap penalties and recognizing methyltransferases DpnII and dam from E. coli and phage weight matrix choice. Nucleic Acids Res. 22, 4673-4680. T4. Gene 220, 167-178. 41. Banas, J.A., Ferretti, J.J. & Progulske-Fox, A. (1991). Identification and sequence analysis of a methylase gene in Porphyromonas gingivalis. Nucleic Acids Res. 19, 4189-4192.