
Oncogene (2002) 21, 9022 – 9032 © 2002 Nature Publishing Group All rights reserved 0950 – 9232/02 $25.00 www.nature.com/onc

Structure and function of nucleases in DNA repair: shape, grip and blade of the DNA scissors

Tatsuya Nishino1 and Kosuke Morikawa*,1

1Department of Structural Biology, Biomolecular Engineering Research Institute (BERI), 6-2-3 Furuedai, Suita, Osaka 565-0874, Japan

DNA nucleases catalyze the cleavage of phosphodiester bonds. These enzymes play crucial roles in various DNA repair processes, which involve DNA replication, base excision repair, nucleotide excision repair, mismatch repair, and double strand break repair. In recent years, new nucleases involved in various DNA repair processes have been reported, including the Mus81 : Mms4 (Eme1) complex, which functions during the meiotic phase, and the Artemis : DNA-PK complex, which processes a V(D)J recombination intermediate. Defects of these nucleases cause genetic instability or severe immunodeficiency. Thus, structural biology on various nuclease actions is essential for the elucidation of the molecular mechanism of complex DNA repair machinery. Three-dimensional structural information of nucleases is also rapidly accumulating, thus providing important insights into the molecular architectures, as well as the DNA recognition and cleavage mechanisms. This review focuses on the three-dimensional structure-function relationships of nucleases crucial for DNA repair processes.
Oncogene (2002) 21, 9022 – 9032. doi:10.1038/sj.onc.1206135

Keywords: DNA repair; nuclease; metal-dependent cleavage; protein-DNA interaction; structure-function relationships

*Correspondence: K Morikawa; E-mail: [email protected]

Introduction

Quality control of genetic material is a function conserved in all living organisms. DNA suffers from many environmental stresses, including attacks by radiation, UV light, and carcinogens, which modify the DNA. In addition, there are intrinsic errors and unusual structures, which are formed during replication or recombination, and they must be corrected by the various repair protein machineries to avoid alterations of the base sequences or entanglement of the DNA. These DNA repair proteins may function independently, but in many cases, they form complexes to perform more efficient repair reactions. In the repair complexes, nucleases play important roles in eliminating the damaged or mismatched nucleotides. They also recognize the replication or recombination intermediates to facilitate the following reaction steps through the cleavage of DNA strands (Table 1).

Nucleases can be regarded as molecular scissors, which cleave phosphodiester bonds between the sugars and the phosphate moieties of DNA. They contain conserved minimal motifs, which usually consist of acidic and basic residues forming the active site. These active site residues coordinate catalytically essential divalent cations, such as magnesium, calcium, manganese or zinc, as a cofactor. However, the requirements for actual cleavage, such as the types and the numbers of metals, are very complicated and are not common among the nucleases. It appears that the major role of the metals is to stabilize intermediates, thereby facilitating the phosphoryl transfer reactions. Cleavage reactions occur either at the end or within DNA, and thus DNA nucleases are categorized as exonucleases and endonucleases, respectively (Figure 1). Exonucleases can be further classified as 5' end processing or 3' end processing enzymes, according to their polarity of consecutive cleavage.

This review describes the three-dimensional (3D) structural views of the actions of various nucleases involved in many DNA repair pathways. The rapidly accumulating genomic, biochemical and structural data have allowed us to classify various nucleases into folding families. In general, the nucleases involved in DNA repair recognize the damaged moiety through the remarkably large deformation of DNA duplexes, and thus in terms of their DNA recognition mode, they apparently differ from the sequence-specific endonucleases, such as the restriction enzymes. The active sites of DNA repair nucleases have some similarity with other nucleases, including the metal-coordinating residues; however, they also display pronounced diversity.

Structure and function of DNA repair nuclease T Nishino and K Morikawa 9023

Table 1 Nucleases involved in DNA repair
/ Bacteriophage Mammals

Replication PolI, II PolB, D Pol δ, ε, γ Pol δ, ε, γ DnaQ Okazaki fragment processing RNaseH RNaseHII RNaseH RNaseH FEN1 FEN1 Dna2 Replication fork cleavage Hef Mus81 Mus81 (+Mms4[Eme1])a,b Wrn

Base excision repair EndoV APN1 Abasic site processing EndoIV HAP1[APE,APEX]b ExoIII Mismatch repair MutH

Nucleotide excision repair 5' processing UvrC(+UvrB)a Rad1(+Rad10)a XPF(+ERCC1)a 3' processing UvrC Rad2 XPG Short patch repair Vsr

Double strand break repair End processing RecB(+RecCD)a Dna2 SbcD(+SbcC)a Mre11(+)a Mre11(+Rad50)a Mre11(+Rad50)a RecJ ExoVII[RecE]b ExoI[SbcB]b Artemis(+DNA-PK)a resolvase RuvC Cce1[Ydc2]b RusA Hjc T4 endoVII T7 endoI

aProteins in parenthesis form a complex. bProteins in brackets are a homolog or alternative name of the protein

Figure 1 Schematic diagram of the nuclease activity. The two strands of DNA are schematically drawn. The cleavage made by the nuclease is represented by arrowhead

Nucleases in various categories of DNA repair

Replication

DNA polymerase synthesizes a new strand of DNA, the sequence of which is complementary to the template DNA. Most DNA polymerases are composed of two different enzymes, a polymerase and an exonuclease, encoded within the same polypeptide, but sometimes they are formed by different subunits. The exonuclease degrades the misincorporated DNA strand in the 3' to 5' direction (Figure 2) (reviewed in Shevelev and Hubscher, 2002). Defects in these proofreading nucleases result in lethal or strong mutator phenotypes in E. coli (Fijalkowska and Schaaper, 1996) and in yeast (Morrison et al., 1993), and cause cancer in mice (Goldsby et al., 2001).

The removal of Okazaki fragments is another important process in replication. This DNA : RNA hybrid is required to initiate DNA polymerization, but once the replication starts, it is rapidly degraded. Most of the RNA primers are eliminated by RNaseH, ubiquitously present in all living organisms. RNaseH produces nicks in the RNA region of Okazaki fragments (Figure 2). In eukaryotes and in archaea, FEN1 endonucleases also participate in the removal of Okazaki fragments (reviewed in Lieber, 1997). FEN1 is a multi-functional enzyme. In addition to the 5' to 3' exonuclease activity to remove the Okazaki fragments, the enzyme can also generate an incision at the junction point of a 5' flap DNA structure. This latter activity is required to eliminate non-homologous tails, including those in recombination intermediates.

The replication process is stalled by various modes of DNA damage. Upon the halt of fork progression, the DNA polymerase and other protein complexes abandon the replication fork. The remaining fork must be processed by various fork-specific protein machineries. The most notable protein among them is Mus81, which was recently found as a new fork/junction specific endonuclease (Boddy et al., 2000, 2001; Interthal and Heyer, 2000; Kaliraman et al., 2001; Mullen et al., 2001). Genetic and biochemical analyses have revealed that this endonuclease is completely conserved in eukaryotes, while its homolog has been found in archaea. The loss of Mus81 in yeast causes UV or methylation damage sensitivity (Interthal and Heyer, 2000) and defects in sporulation (Mullen et al., 2001).
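
The 3' to 5' proofreading activity described earlier in this section can be illustrated with a toy model. The sketch below is only a conceptual illustration, not a biochemical simulation: it treats the primer and the template as strings, assumes simple Watson-Crick pairing, and removes mispaired nucleotides one at a time from the 3' end of the primer, as an exonuclease of that polarity would. The function names `is_paired` and `proofread_3_to_5` and the example sequences are hypothetical.

```python
# Toy model of 3'->5' proofreading: mispaired nucleotides are removed
# one at a time from the 3' end of the primer strand.
# All names and sequences here are illustrative assumptions.
from typing import Tuple

WATSON_CRICK = {"A": "T", "T": "A", "G": "C", "C": "G"}


def is_paired(template_base: str, primer_base: str) -> bool:
    """Return True if the two bases form a Watson-Crick pair."""
    return WATSON_CRICK.get(template_base) == primer_base


def proofread_3_to_5(template: str, primer: str) -> Tuple[str, int]:
    """Trim mispaired nucleotides from the 3' end of the primer.

    `template` is written 3'->5' and `primer` 5'->3', so that position i
    of the primer faces position i of the template.
    Returns the trimmed primer and the number of nucleotides removed.
    """
    if len(primer) > len(template):
        raise ValueError("primer cannot be longer than template")
    removed = 0
    # Only the terminal 3' nucleotide is accessible to an exonuclease,
    # so trimming stops at the first correctly paired terminus.
    while primer and not is_paired(template[len(primer) - 1], primer[-1]):
        primer = primer[:-1]
        removed += 1
    return primer, removed


if __name__ == "__main__":
    template = "TACGGTCA"  # read 3'->5'
    primer = "ATGCCG"      # read 5'->3'; the terminal G faces a T (mismatch)
    print(proofread_3_to_5(template, primer))
```

In this model an internal mismatch is left untouched, because only the terminal nucleotide is exposed to the exonuclease; this mirrors the distinction between end-processing exonucleases and internally cleaving endonucleases drawn in the Introduction.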



Figure 2 Nuclease associated DNA repair pathways. The substrates are drawn schematically and the arrowheads denote nuclease cleavage. RNA regions are drawn as bold lines

Base excision repair

Abasic sites within DNA duplexes are frequently produced by the actions of various DNA glycosylases involved in the base excision repair pathway, in addition to the spontaneous hydrolysis of bases. These apurinic or apyrimidinic (AP) sites are removed by AP endonucleases, which cleave the phosphodiester bond next to an abasic site (Figure 2) (reviewed in Mol et al., 2000a). E. coli cells contain two AP endonucleases: endonuclease IV (endoIV) and exonuclease III (exoIII). Interestingly, these two enzymes show no sequence similarity to each other, although their AP endonuclease activities are quite similar. In eukaryotes, there seems to be a single, major AP endonuclease working in each organism. APN1, the yeast homolog of E. coli endoIV, shows sequence and catalytic activity similarity to endoIV. The absence of APN1 results in enhanced sensitivity to oxidative damage and alkylating agents (Ramotar et al., 1991). Mammalian organisms, including humans, bear Ape1, which shares sequence similarity with E. coli exoIII but lacks the intrinsic 3' to 5' exonuclease activity. In addition to the AP endonuclease activity, Ape1 also plays a major role in sensing the redox state of the cell (Xanthoudakis et al., 1992). The loss of Ape1 generates embryonic lethality in mice (Wilson and Thompson, 1997).

Mismatch repair

In prokaryotes, mismatch repair is conducted mainly by the MutSLH proteins, while the Vsr protein is responsible for mismatches in certain sequences (reviewed in Modrich and Lahue, 1996; Yang, 2000; Tsutakawa and Morikawa, 2001). In the MutSLH system, the MutS protein recognizes and binds mismatched base moieties of DNA. MutL mediates the interaction between the MutS and MutH proteins. MutH recognizes a hemimethylated GATC sequence, and cleaves next to the G of the non-methylated strand (Figure 2). The cleavage activity of MutH is enhanced by the MutL protein, although its mechanism remains unclear. Vsr is a mismatch-specific endonuclease involved in very short patch repair, and recognizes a TG mismatch at the specific sequence CT(A/T)GG, where the mismatch occurs at the second position, upon spontaneous deamination. Vsr makes an incision next to the mismatched base. In both cases, after the nick has been introduced, these sites are degraded by the RecJ, ExoVII, or ExoI nuclease and are resynthesized by the DNA polymerase.

Nucleotide excision repair

Nucleotide excision repair (NER) is primarily used to process DNA damage that is not repaired by base excision repair. These forms of damage include those generated by UV radiation and the large adducts produced by various chemicals. In the NER pathway, a short stretch of DNA containing the damaged nucleotide is removed. During this process, two incisions, on the 5' side and the 3' side, are made by two different nuclease reactions (Figure 2) (reviewed in Petit and Sancar, 1999; Prakash and Prakash, 2000; de Boer and Hoeijmakers, 2000). In bacteria, this dual incision is performed by the UvrB-UvrC complex. In budding yeast, Rad2 and the Rad1-Rad10 complex make the 3' and 5' incisions, respectively. The same process in mammalian cells is conducted by their homologs, XPG and XPF-ERCC1, respectively. Deletions or mutations introduced into these nucleases cause sensitivity to UV damage, and result in cancer formation. In addition, abnormalities of these proteins cause defects in neural development.

Double strand break repair

Double strand breaks are generated by the accidental halt of fork progression during replication or by ionizing radiation and strand incision chemicals. They are also generated as an intermediate state during meiosis and V(D)J recombination. These double strand breaks are repaired through the two main pathways of non-homologous end joining and homologous recombination. In either case, the ends of the double strand breaks must be processed to initiate the repair reaction (Figure 2). Mre11 is a multi-functional nuclease involved in the processing of the DNA ends or hairpin structures (reviewed in D'Amours and Jackson, 2002). While Mre11 itself exhibits a ssDNA exonuclease activity, its complex with Rad50 processes double strand break ends. Moreover, in the presence of ATP, Rad50 activates the cleavage activity of Mre11. Mutations introduced into Mre11 cause an ataxia-telangiectasia-like disorder (Stewart et al., 1999).

V(D)J recombination involves a reaction process in which hairpin DNAs are opened, and subsequently, both ends are connected. Recently, the Artemis : DNA-PK complex was shown to participate in this opening reaction (Ma et al., 2002). Although Artemis alone possesses a ssDNA exonuclease activity, its complex formation with DNA-PK allows the processing of the double strand break ends to open the hairpin structure. Defects in each protein cause severe immunodeficiency (Blunt et al., 1995; Kirchgessner et al., 1995; Moshous et al., 2001).

In homologous recombination, two homologous DNA strands are paired and are connected by D-loop structures or Holliday junction intermediates. In bacteria, the RuvC protein cleaves the Holliday junction at two symmetrical sites near the junction center to resolve the junction into two dsDNAs (Figure 2). Similar junction resolving enzymes have also been found in other bacteria, bacteriophages, and archaea (reviewed in Sharples, 2001). In eukaryotes, FEN1, XPF-ERCC1, and Mus81 are known to cleave the D-loop structure, while Cce1/Ydc2 processes Holliday junctions in mitochondria.

Structural classification of DNA repair nucleases

The primary sequences of nucleases are often poorly conserved, except for the motifs related to catalytic sites. The functional and biochemical properties of many nucleases have been studied extensively. However, in some cases, it is very difficult to identify the actual functional targets of the nucleases, because of their broad substrate specificity. Nevertheless, many candidates for nucleases are available from various genome sequences, and their functional properties can be inferred by sequence comparisons with other well-studied nucleases. For instance, Koonin and his associates have successfully classified nucleases and phosphoesterases into several families, based on extensive database analyses of the primary sequences (Aravind and Koonin, 1998a,b; Aravind et al., 1999). This classification has also revealed the relationships between nucleases and identified several new nuclease families.

In addition to the classifications of primary sequences, 3D structural data have been rapidly accumulating with respect to the proteins involved in DNA repair, including nucleases. Most of their structures were solved in the DNA-free states, although a number of them were determined in complex with cofactors and/or DNA (Table 2). The classification of nucleases in terms of their 3D structures provides more defined properties, since it is accepted that the 3D structures are much less diverged and more closely related to the functions than the primary sequences. As a matter of fact, in the type II restriction endonucleases, all of the structures share the common core motif, which includes the active sites, and thus could be grouped into a single folding family, despite their primary sequence diversity (reviewed in Pingoud and Jeltsch, 2001). In the following section, we describe each of the folding families of the DNA repair nucleases, with classifications based on the SCOP database (Figure 3 and Table 3) (Murzin et al., 1995).

RNaseH-like fold

The RNaseH-like fold, which is one of the most ubiquitous architectures in the protein world, has been found in RuvC, RNaseH, integrase, transposase, and proofreading exonucleases (Figure 3a). The core structure contains a five-stranded β-sheet flanked by several α-helices. The strand order is 32145, with strand 2 anti-parallel to the others. The active site residues, which are constituted according to the DDE motif, are located on one side of the sheet. These three (or sometimes four) acidic residues coordinate the metals, which are essential for the catalytic reaction. For instance, the crystal structures of RNaseHI exhibit one (Katayanagi et al., 1993) or two (Goedken and Marqusee, 2001) metals bound to the active site. Similarly, the active site of the proofreading subunit of DNA polymerase III coordinates two metals (Hamdan et al., 2002). The cocrystal structure with TMP revealed that the phosphate moiety is directly coordinated between the two metals, as it mimics the product DNA. A similar structure is also observed in the DNA complexes of the Klenow fragment of polymerase I (Beese and Steitz, 1991) and the RB69 DNA polymerase (Shamoo and Steitz, 1999).
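
The strand-order notation used for these folds can be unpacked with a small sketch. The snippet below rests on an assumption about how the notation is read: the digits give the spatial arrangement of the β-strands across the sheet, numbered by their order in the polypeptide chain, and the strands listed as anti-parallel run opposite to the rest. The helper `describe_sheet` is hypothetical and only restates the RNaseH-like topology given above (order 32145, strand 2 anti-parallel).

```python
# Sketch: unpack a beta-sheet strand-order string such as "32145".
# Assumption: digits list the strands in their spatial order across the
# sheet, numbered by their position in the polypeptide chain.
from typing import List, Set, Tuple


def describe_sheet(order: str, antiparallel: Set[int]) -> Tuple[str, List[str]]:
    """Return an arrow diagram and the relation between neighbouring strands."""
    strands = [int(ch) for ch in order]
    # '^' = same direction as the majority, 'v' = opposite direction.
    arrows = " ".join(
        f"{s}{'v' if s in antiparallel else '^'}" for s in strands
    )
    neighbours = []
    for left, right in zip(strands, strands[1:]):
        same = (left in antiparallel) == (right in antiparallel)
        relation = "parallel" if same else "anti-parallel"
        neighbours.append(f"{left}-{right}: {relation}")
    return arrows, neighbours


if __name__ == "__main__":
    # RNaseH-like fold as described in the text.
    diagram, pairs = describe_sheet("32145", {2})
    print(diagram)
    for pair in pairs:
        print(pair)
```

Read this way, the RNaseH-like sheet is mixed: strand 2 pairs in an anti-parallel fashion with both of its neighbours (strands 3 and 1), whereas strands 1, 4 and 5 pair in parallel.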

Table 2 Structural analysis of DNA repair proteins
Free/partial +cofactor +DNA

Replication coupled repair proteins
Replicational polymerase (+proofreading domain)a ***
Repair polymerase ***
Error prone/free polymerase ***
PCNA **
FEN1a **

Damage reversal
Photolyase **
Ada/Ogt **
MutT **

Base excision repair
Aag Glycosylase ***
AlkA ***
MutM/Fpg/EndoVIII ***
MutY/Ogg/EndoIII ***
UDG ***
TAG ***
ExoIII/Ape1a ***
EndoIVa ***

Mismatch repair
MutS ***
MutL/Pms2 *
MutHa *
Vsra ***

Nucleotide excision repair
UvrB **

Double strand break repair
Ku70-80 – – *
Mre11a *
Rad50 **
Xrcc4-Ligase IV *

Homologous recombination
RecA/Rad51 **
RecG ***
RecJa **
RuvA * – *
RuvB **
Holliday junction resolvase (RuvC etc.)a **

aContain nuclease activity

Resolvase-like fold

This fold has been found in γδ resolvase, 5'–3' exonucleases, and FEN1 (Figure 3b). It is similar to the RNaseH-like fold, with a five-stranded β-sheet. However, it possesses a different strand order, which is defined as 21345 with strand 5 anti-parallel to the others. FEN1 possesses two acidic clusters formed by four or three conserved aspartate/glutamate residues. These clusters each coordinate a metal, and are separated by 5 Å from each other (Hwang et al., 1998; Hosfield et al., 1998).

Restriction endonuclease-like fold

The structures of restriction endonucleases revealed that their catalytic domains share a common fold architecture (Figure 3c). The core fold comprises a five-stranded β-sheet flanked by several α-helices. The strand order is 12345, with strand 2, and in some cases, strand 5, anti-parallel to the others. A conserved PDXn(D/E)XK sequence is located on one side of the β-sheet, and is involved in the formation of the catalytic centers in most restriction endonucleases. Similar sequences are also found in several DNA repair nucleases, such as MutH, Hjc, and T7 endoI, which are categorized into essentially the same folding family. The Vsr endonuclease also shares a similar fold, whereas the (D/E)XK sequence is replaced by FXH, where histidine participates in catalysis (Tsutakawa et al., 1999b). The active sites in endonucleases with the restriction endonuclease-like fold coordinate up to three metals depending upon the enzyme.

RecJ-like fold

This fold was recently identified by the determination of the RecJ nuclease structure (Yamagata et al., 2002) (Figure 3d). Previous sequence analyses have shown that this family includes RecJ and the phosphoesterases, which contain conserved phosphoesterase motifs (Aravind and Koonin, 1998a). The structure revealed a novel fold, which consists of a five-stranded parallel β-sheet flanked by six α-helices. The strand order of the β-sheet is 21345. On one side of the β-sheet, four phosphoesterase motifs form a cluster, which contains five invariant aspartates and two conserved histidines. The structure of the crystal, grown in the presence of 100 mM MnCl2, exhibits a strong metal peak coordinating three of the aspartates and one of the histidines. These residues, which constitute part of the active site, are likely to participate in the cleavage reaction.

Metallo-dependent phosphatase fold

Mre11 and several phosphatases, including the purple acid phosphatase and the ser/thr phosphatases, share this fold (Figure 3e). The core structure contains two β-sheets, which are sandwiched by α-helices to form a four-layered structure. The primary sequence of this family contains the conserved phosphoesterase motifs usually constituted by six histidines, three aspartates, and an asparagine, which form a cluster on one side of the β-sheet. The cocrystal structure of Mre11 with Mn and dAMP shows two manganese ions bound to the active site, and these two metals are simultaneously coordinated to the phosphate moiety, thus mimicking the product-bound state (Hopfner et al., 2001). The active sites of the ser/thr phosphatases bind two metals (zinc and iron) with a similar coordination scheme (Griffith et al., 1995).

DNaseI-like fold

This fold is found in DNaseI, ExoIII, and Ape1 (Figure 3f). It is also observed in some phosphatases, such as inositol 5-phosphatase. These nucleases share a four-layered structure containing an α/β sandwich, as found in the metallo-dependent phosphatases, although the β-sheet topology and the environments around the active sites are different. The active site is located on one side of the β-sheet, which assembles several conserved acidic residues. The crystal structures of DNaseI (Suck et al., 1988) and ExoIII (Mol et al., 1995) revealed a single metal ion bound to the active site. On the other hand, one (Gorman et al., 1997) or two (Beernink et al., 2001) metals were observed in the free form of Ape1. The Ape1-DNA complex structure revealed one metal, coordinated with the acidic residues and the cleaved phosphate in the active site (Mol et al., 2000b).

Figure 3 Folding patterns of DNA repair nucleases. The core folding is drawn schematically. The yellow arrows indicate the core β-sheet, where the strand orders are numbered on the top. α-helices are shown as blue cylinders. The positions of the bound metals are marked by black circles. Representative repair nuclease of the folding is written in parenthesis

Table 3 Structural classification of DNA repair nucleases
RNaseH-like fold: RNaseH, RuvC, ExoI, proofreading (exonuclease domain)
Resolvase-like fold: FEN1
Restriction endonuclease fold: MutH, Vsr, T7 endoI, Hjc
RecJ fold: RecJ
Metallophosphatase fold: Mre11
DNaseI fold: ExoIII/Ape1
TIM β/α barrel fold: EndoIV
His-Me finger nuclease fold: T4 endoVII

TIM β/α barrel fold

The TIM barrel was first observed in triosephosphate isomerase, and is now known to be the most ubiquitous fold adopted by various enzymes with diverse functions (Farber and Petsko, 1990) (Figure 3g). It forms the α8/β8 barrel structure, where a barrel-like parallel β-sheet is surrounded by eight α-helices. In this fold, the key residues for the enzymatic activity are usually located on the C-terminal side of the barrel. The structure of E. coli endoIV was the first DNA repair enzyme structure with the TIM barrel (Hosfield et al., 1999). The active site contains a cluster of three zinc ions coordinated by histidines and aspartates. The endoIV-DNA complex structure revealed how these zinc ions coordinate the cleaved AP site.

His-Me finger endonuclease fold

T4 endonuclease VII (T4 endoVII) and several other nucleases, such as Serratia nuclease and the I-PpoI intein, contain this folding motif (Figure 3h). It is usually embedded as a constituent of larger architectures. The core fold is a β-hairpin flanked by two helices. Within the hairpin, several histidines and acidic residues form a cluster and coordinate a catalytically important divalent metal. In the case of T4 endoVII, a single metal ion is coordinated to aspartate, glutamate, and asparagine (Raaijmakers et al., 1999). The I-PpoI-DNA complex structure revealed that a histidine lies within hydrogen-bonding distance of the scissile phosphate group in the metal-containing active site (Galburt et al., 1999).

DNA recognition by DNA repair nuclease

The binding modes of DNA nucleases are roughly divided into two categories, corresponding to non-specific and specific associations. Both modes are important for efficient and accurate recognition between enzymes and DNA. Non-specific DNA binding allows enzymes to scan for target sequences or damage by a rapid diffusion process along the DNA. Once the nuclease finds its proper target, specific interactions are made to dock the active site residues correctly to the chemical groups within the DNA for cleavage. These two binding modes have been visualized within the crystal structures of the type II restriction endonucleases (reviewed in Pingoud and Jeltsch, 2001). In the cases of EcoRV, BamHI, and PvuII, the non-specific binding involves a weak association, which is contributed by an electrostatic interaction between the minimum surface area of the protein and the DNA, and the overall shape of the DNA remains in the canonical B-form, without serious deformations. By contrast, in the specific complex, the DNA is buried within the deep cleft of the protein in a sequence-specific manner, accompanied by the remarkable deformation of the DNA duplex, which is required for the cleavage by the enzyme.

This scheme can be generally applied to DNA repair nucleases as well. The nuclease surfaces are rich in basic residues, which form positive surfaces competent for electrostatic interactions with DNA. Some nucleases, such as MutH or Vsr, which both share the restriction endonuclease fold, possess partial competences for sequence-specific recognition, just like restriction endonucleases. However, most DNA repair nucleases recognize certain mismatches, forms of damage, or particular backbone structures of DNA. Therefore, they require additional and unique binding mechanisms for specific interactions with DNA. Although the information available for DNA repair nuclease-DNA complexes is limited, they can still provide considerable insights into such recognition mechanisms.

Base flipping out

Base flipping out has been observed in many DNA glycosylases and methyltransferases (reviewed in Roberts and Cheng, 1998; Vassylyev and Morikawa, 1997; Mol et al., 2000a; Parikh et al., 2000). The flipping-out of a base is defined as the local conformational change of a DNA duplex, where a base is swung out from inside of the helix into an extrahelical position and is usually inserted into the binding pocket of the protein. The space created by this process of base pair disruption is occupied by protein atoms, which are often involved in catalytic reactions. This mechanism is observed in the two crystal structures of the AP endonucleases, Ape1 and EndoIV (Figure 4a,b), which were both complexed with DNA duplexes containing an AP site in the middle. These two structures showed a similar base flipping, but different fitting modes, between the DNA and the proteins.

In the Ape1-DNA complex, the abasic nucleotide was flipped out into the enzyme pocket (Mol et al., 2000b) (Figure 4a). The gap was filled on the minor groove side by two methionines (Met270, Met271) and on the major groove side by arginine (Arg177). These insertions generate a sharp kink of the DNA duplex at the abasic site. The comparison of the free form with the complex revealed a small difference, suggesting that the surface of the enzyme contains a preformed pocket to be filled by the flipped out base. Thus, it is likely that Ape1 searches its target by scanning for a possible base flipping site. Once Ape1 finds the target, the base flips out into the enzyme pocket, and the remaining gap is occupied by the inserted arginine to stabilize the protein-DNA complex. Biochemical experiments confirmed the role of this arginine, which when mutated to alanine, resulted in elevated enzyme turnover (Mol et al., 2000b).

In the endoIV-DNA complex, an abasic site is similarly flipped out into the protein pocket (Hosfield et al., 1999) (Figure 4b). However, the conformation of the DNA duplex is drastically different from that of Ape1-DNA. The orphan base opposite the abasic site also occupies an extrahelical position. Consequently, the DNA duplex is sharply bent (90°) at the abasic site. The gap made by both flipped out nucleotides is filled by arginine (Arg37), tyrosine (Tyr72), and leucine (Leu73) inserted from the minor groove. In contrast to the preformed pocket of Ape1, the recognition loops of endoIV undergo a drastic conformational change upon DNA binding. The residues involved in base flipping are located in this loop. It is likely that endoIV scans the DNA duplex on the minor groove side by this DNA recognition loop. Once the enzyme finds the target, it inserts all of the DNA-penetrating residues, and flips the two bases into extrahelical positions.

Insertion of aromatic side chains

Another important factor in the recognition between repair enzymes and DNA is the insertion of aromatic amino acids into DNA duplexes. This is different from the insertion of side chains that fill up the gap created by base flipping. A representative case was observed in the Vsr-DNA complex (Tsutakawa et al., 1999a) (Figure 4c). Vsr recognizes a TG wobble mismatch located in a five base pair long recognition sequence. In the close vicinity of the mismatch, Vsr intercalates three conserved aromatic amino acids (Phe67, Trp68, Trp86) from the major groove. In addition to the inserted helix from the minor groove, this insertion expands the space between the TG mismatch and the adjacent base pair, while the base pair itself is not disrupted. A similar insertion of aromatic residues was observed in the MutS-DNA complex, where the aromatic side chain of a conserved phenylalanine was inserted next to a mismatched or gapped base pair (Lamers et al., 2000; Obmolova et al., 2000).

The exonuclease domain of DNA polymerase uses aromatic residues for the correct positioning of the nucleotides (Figure 4d). In the editing complex of RB69 DNA polymerase with its substrate DNA, two single-stranded nucleotides are located in the groove of the exonuclease domain (Shamoo and Steitz, 1999).


Figure 4 DNA recognition by DNA repair nuclease. Ribbon diagram of the DNA repair nuclease. (Left panel) Overall structure of the protein-DNA complex. (Right panel) Close-up view of the boxed region. Proteins are represented by a yellow ribbon diagram, and the side chains involved in DNA recognition are displayed and numbered with a stick model. The bound DNA is shown as a white stick model, and the flipped out nucleotides are colored red. The observed metals are shown as spheres. Blue, zinc; light blue, manganese; red, magnesium; gray, calcium. (a) endoIV (b) Ape1 (c) Vsr (d) RB69 polymerase exonuclease domain

One of the nucleotides is held by forming a hydrogen bond with the side chain of Arg260. Another nucleotide, whose backbone is cleaved, is located more deeply within the exonuclease pocket, and is segregated from the remaining region by the insertion of two aromatic side chains (Phe123 and Phe221) to separate the two nucleotides. These create a wall, and thus the base is correctly positioned within the active site pocket.

In vivo and in vitro experiments, measuring the UV sensitivity and probing with potassium permanganate, have demonstrated that in E. coli RuvC, the aromatic side chain of Phe69 plays a crucial role in the specific recognition of the Holliday junction (Yoshikawa et al., 2001). Phe69 lies in the protruding loop and directs its side chain into the catalytic cleft, which accommodates one of the DNA duplexes. A similar residue is also present in the yeast structural homolog, Ydc2, whereas it is absent in another yeast homolog, Cce1. Consequently, a detailed structural view of the recognition mechanism between RuvC and the junction DNA will require the structure of the complex to be solved directly.

Active site environments of DNA repair nucleases

All nucleases cleave the same phosphodiester bond, to leave 5'-phosphate and 3'-OH groups at the produced segments. Similar reactions are conducted by phosphatases and ribozymes, although their catalytic mechanisms have not been clarified yet. The general feature of this enzymatic scheme is that the attacking water is activated by a general base in the nuclease active center, which usually bears a metal cofactor. This activation is performed by protein side chains or divalent metals. The activated water is converted to a hydroxide, which attacks the phosphate, thus forming the transition state intermediate. There are two modes for this nucleophilic substitution: associative and dissociative. The associative mechanism involves the formation of a pentacovalent intermediate with a hydroxide, followed by the release of a leaving group. In this mechanism, a general base is required to generate the hydroxide, and a general acid is needed to stabilize the leaving group. The dissociative mechanism, on the other hand, does not require this general acid and general base, and it forms a metaphosphate intermediate, which requires more stabilization of the transition intermediate. Many nucleases are assumed to follow the associative mechanism, while alkaline phosphatase uses the dissociative mechanism.

A large number of nucleases utilize metal cofactors for the hydrolytic reaction. They are proposed to play any one or a combination of the following roles (Figure 5) (Jencks, 1969): (1) positioning the substrate and/or the attacking nucleophile; (2) enhancing the electrophilicity of the phosphate at the scissile bond; (3) activating the nucleophile; (4) neutralizing the negative charge in the transition state; (5) facilitating the departure of the leaving group. To fulfill these roles, various metals are recruited to the nuclease active sites. While the utilized metal may differ, depending upon the nuclease, magnesium or manganese is the most common metal for catalysis, and in


Figure 5 Schematic diagram of cleavage by DNA repair nucleases. X, Y, Z-H denote general base, Lewis acid, and general acid, respectively. Numbers in circles indicate reaction steps where the metal cofactors may be involved

rare cases, zinc is used. The magnesium ion appears to be transiently recruited to the active sites, whereas zinc and manganese are more tightly bound to the catalytic centers.

EndoIV contains three zincs, which are coordinated by five histidines, two glutamates, and two aspartates, in addition to two water molecules (Figure 6a). These metals are so tightly coordinated to the enzyme that even EDTA cannot chelate them (Levin et al., 1991). Two of the three zinc atoms are likely to be involved in generating the attacking nucleophile, in cooperation with the carboxyl side chain of Glu261. Furthermore, all three of the metals coordinate the phosphate moiety after cleavage (Hosfield et al., 1999).

Mre11 coordinates two manganese ions through five histidines, two aspartates, and one asparagine (Figure 6b) (Hopfner et al., 2001). The two manganese ions directly coordinate the phosphate moiety of the dAMP. When magnesium is substituted for manganese, it occupies only one of the two metal binding sites, and the nuclease is inactive. This indicates that both metals are required for the nuclease activity.

As for the nucleases that require magnesium cations for catalysis, the number of metals and their positions in the active sites are more ambiguous. They are coordinated with protein atoms in a more transient manner. This relatively weak binding, and the fact that the electron number of the magnesium cation is comparable to that of a water molecule, make the clear identification of the metal positions more difficult. In addition, the number of bound metals may change, depending upon the crystallization conditions. The free form structure of the Ape1 crystal, obtained under acidic conditions, revealed a single bound metal (samarium) (Gorman et al., 1997), whereas the crystal obtained at a neutral pH contained two metals (lead) in the active site (Beernink et al., 2001). These Ape1 data indicate that the metals occupy multiple sites, which are affected by the protonation of the acidic residues. It appears that two metals are required for catalysis, since Ape1 is only active at a neutral pH. However, the actual numbers and the role of each metal cannot be clarified at the moment, because the structure of the Ape1-DNA complex was obtained under acidic conditions, and only one manganese ion is bound to the product DNA cleaved at the abasic site (Figure 6c) (Mol et al., 2000b). Similar ambiguity with respect to the number of metals was reported for RNaseHI: one magnesium ion (Katayanagi et al., 1993) versus two manganese ions (Goedken and Marqusee, 2001). In the Vsr-DNA complex structure, two magnesium ions are clearly observed in the active site, with one of the metals holding both the 5' phosphate and 3' OH groups (Figure 6d) (Tsutakawa et al., 1999a). Two metals are also found in the exonuclease domain of polymerases (calcium) (Figure 6e) (Shamoo and Steitz, 1999) and in T7 endoI (manganese) (Hadden et al., 2002), although the two sites are not equivalent between the two enzymes, and one of the two metals shows partial occupancy.

Future perspectives

With the rapid accumulation of metal information, various catalytic mechanisms have been proposed, including the classical two metal binding mechanism (Beese and Steitz, 1991). However, it appears to us that the actual numbers and positions of the metals involved in catalysis vary too broadly from enzyme to enzyme for their hydrolytic mechanisms to be described by a unified catalytic scheme. More detailed structural information, hopefully combined with biochemical data, is essential to obtain clear insights into the metal dependent nuclease mechanisms. Meanwhile, the large diversity in nuclease architectures suggests that they can specifically recognize DNA substrates by virtue of the large variety of surface properties, which were adopted through selection over an extremely long period. In particular, the nucleases involved in DNA repair have acquired a special damage recognition system. At present, much of the structural information is based on that of prokaryotic and archaeal proteins. Eukaryotic nucleases obviously hold more complicated structures and properties, because they must bear -


specific regulatory mechanisms involving protein–protein and protein-DNA interactions. Further 3D structural characterizations of the eukaryotic DNA repair nucleases should provide additional variations or conserved architectures of protein folding, while structural analyses of their complexes with DNA substrates will clarify the recognition mechanisms.

Figure 6 Active site of DNA repair nuclease. Close-up view of the nuclease active site, shown in a stereo diagram. Residues involved in the nuclease activity and the metal coordination are drawn in stick models. The coloring scheme is the same as in Figure 4. (a) endoIV (b) Mre11 (c) Ape1 (d) Vsr (e) RB69 polymerase exonuclease domain

Acknowledgements

We regret that the limit of space may not have allowed us to cite all works in the field. We thank Kayoko Komori for critical reading of the manuscript and helpful comments. T Nishino is a research fellow of the Japan Society for the Promotion of Science. This research was partly supported by NEDO (New Energy and Industrial Technology Development Organization).

References

Aravind L and Koonin EV. (1998a). Trends Biochem. Sci., 23, 17–19.
Aravind L and Koonin EV. (1998b). Nucleic Acids Res., 26, 3746–3752.
Aravind L, Walker DR and Koonin EV. (1999). Nucleic Acids Res., 27, 1223–1242.
Beernink PT, Segelke BW, Hadi MZ, Erzberger JP, Wilson III DM and Rupp B. (2001). J. Mol. Biol., 307, 1023–1034.
Beese LS and Steitz TA. (1991). EMBO J., 10, 25–33.
Blunt T, Finnie NJ, Taccioli GE, Smith GC, Demengeot J, Gottlieb TM, Mizuta R, Varghese AJ, Alt FW and Jeggo PA. (1995). Cell, 80, 813–823.
Boddy MN, Gaillard PH, McDonald WH, Shanahan P, Yates III JR and Russell P. (2001). Cell, 107, 537–548.
Boddy MN, Lopez-Girona A, Shanahan P, Interthal H, Heyer WD and Russell P. (2000). Mol. Cell. Biol., 20, 8758–8766.
D’Amours D and Jackson SP. (2002). Nat. Rev. Mol. Cell. Biol., 3, 317–327.
de Boer J and Hoeijmakers JH. (2000). Carcinogenesis, 21, 453–460.
Farber GK and Petsko GA. (1990). Trends Biochem. Sci., 15, 228–234.
Fijalkowska IJ and Schaaper RM. (1996). Proc. Natl. Acad. Sci. USA, 93, 2856–2861.
Galburt EA, Chevalier B, Tang W, Jurica MS, Flick KE, Monnat Jr RJ and Stoddard BL. (1999). Nat. Struct. Biol., 6, 1096–1099.
Goedken ER and Marqusee S. (2001). J. Biol. Chem., 276, 7266–7271.
Goldsby RE, Lawrence NA, Hays LE, Olmsted EA, Chen X, Singh M and Preston BD. (2001). Nat. Med., 7, 638–639.
Gorman MA, Morera S, Rothwell DG, de La Fortelle E, Mol CD, Tainer JA, Hickson ID and Freemont PS. (1997). EMBO J., 16, 6548–6558.
Griffith JP, Kim JL, Kim EE, Sintchak MD, Thomson JA, Fitzgibbon MJ, Fleming MA, Caron PR, Hsiao K and Navia MA. (1995). Cell, 82, 507–522.
Hadden JM, Declais AC, Phillips SE and Lilley DM. (2002). EMBO J., 21, 3505–3515.
Hamdan S, Carr PD, Brown SE, Ollis DL and Dixon NE. (2002). Structure (Camb), 10, 535–546.
Hopfner KP, Karcher A, Craig L, Woo TT, Carney JP and Tainer JA. (2001). Cell, 105, 473–485.
Hosfield DJ, Mol CD, Shen B and Tainer JA. (1998). Cell, 95, 135–146.
Hosfield DJ, Guan Y, Haas BJ, Cunningham RP and Tainer JA. (1999). Cell, 98, 397–408.
Hwang KY, Baek K, Kim HY and Cho Y. (1998). Nat. Struct. Biol., 5, 707–713.
Interthal H and Heyer WD. (2000). Mol. Gen. Genet., 263, 812–827.
Jencks WP. (1969). Catalysis in Chemistry and Enzymology. New York: McGraw Hill, pp 111–115.
Kaliraman V, Mullen JR, Fricke WM, Bastin-Shanower SA and Brill SJ. (2001). Dev., 15, 2730–2740.
Katayanagi K, Okumura M and Morikawa K. (1993). Proteins, 17, 337–346.
Kirchgessner CU, Patil CK, Evans JW, Cuomo CA, Fried LM, Carter T, Oettinger MA and Brown JM. (1995). Science, 267, 1178–1183.
Lamers MH, Perrakis A, Enzlin JH, Winterwerp HH, de Wind N and Sixma TK. (2000). Nature, 407, 711–717.
Levin JD, Shapiro R and Demple B. (1991). J. Biol. Chem., 266, 22893–22898.
Lieber MR. (1997). Bioessays, 19, 233–240.
Ma Y, Pannicke U, Schwarz K and Lieber MR. (2002). Cell, 108, 781–794.
Modrich P and Lahue R. (1996). Annu. Rev. Biochem., 65, 101–133.
Mol CD, Hosfield DJ and Tainer JA. (2000a). Mutat. Res., 460, 211–229.
Mol CD, Izumi T, Mitra S and Tainer JA. (2000b). Nature, 403, 451–456.
Mol CD, Kuo CF, Thayer MM, Cunningham RP and Tainer JA. (1995). Nature, 374, 381–386.
Morrison A, Johnson AL, Johnston LH and Sugino A. (1993). EMBO J., 12, 1467–1473.
Moshous D, Callebaut I, de Chasseval R, Corneo B, Cavazzana-Calvo M, Le Deist F, Tezcan I, Sanal O, Bertrand Y, Philippe N, Fischer A and de Villartay JP. (2001). Cell, 105, 177–186.
Mullen JR, Kaliraman V, Ibrahim SS and Brill SJ. (2001). , 157, 103–118.
Murzin AG, Brenner SE, Hubbard T and Chothia C. (1995). J. Mol. Biol., 247, 536–540.
Obmolova G, Ban C, Hsieh P and Yang W. (2000). Nature, 407, 703–710.
Parikh SS, Putnam CD and Tainer JA. (2000). Mutat. Res., 460, 183–199.
Petit C and Sancar A. (1999). Biochimie, 81, 15–25.
Pingoud A and Jeltsch A. (2001). Nucleic Acids Res., 29, 3705–3727.
Prakash S and Prakash L. (2000). Mutat. Res., 451, 13–24.
Raaijmakers H, Vix O, Toro I, Golz S, Kemper B and Suck D. (1999). EMBO J., 18, 1447–1458.
Ramotar D, Popoff SC, Gralla EB and Demple B. (1991). Mol. Cell. Biol., 11, 4537–4544.
Roberts RJ and Cheng X. (1998). Annu. Rev. Biochem., 67, 181–198.
Shamoo Y and Steitz TA. (1999). Cell, 99, 155–166.
Sharples GJ. (2001). Mol. Microbiol., 39, 823–834.
Shevelev IV and Hubscher U. (2002). Nat. Rev. Mol. Cell. Biol., 3, 364–376.
Stewart GS, Maser RS, Stankovic T, Bressan DA, Kaplan MI, Jaspers NG, Raams A, Byrd PJ, Petrini JH and Taylor AM. (1999). Cell, 99, 577–587.
Suck D, Lahm A and Oefner C. (1988). Nature, 332, 464–468.
Tsutakawa SE, Jingami H and Morikawa K. (1999a). Cell, 99, 615–623.
Tsutakawa SE, Muto T, Kawate T, Jingami H, Kunishima N, Ariyoshi M, Kohda D, Nakagawa M and Morikawa K. (1999b). Mol. Cell, 3, 621–628.
Tsutakawa SE and Morikawa K. (2001). Nucleic Acids Res., 19, 3775–3783.
Vassylyev D and Morikawa K. (1997). Curr. Opin. Struct. Biol., 7, 103–109.
Wilson III DM and Thompson LH. (1997). Proc. Natl. Acad. Sci. USA, 94, 12754–12757.
Xanthoudakis S, Miao G, Wang F, Pan YC and Curran T. (1992). EMBO J., 11, 3323–3335.
Yamagata A, Kakuta Y, Masui R and Fukuyama K. (2002). Proc. Natl. Acad. Sci. USA, 99, 5908–5912.
Yang W. (2000). Mutat. Res., 460, 245–256.
Yoshikawa M, Iwasaki H and Shinagawa H. (2001). J. Biol. Chem., 276, 10432–10436.
