DNA Repair 94 (2020) 102903


DNA Repair


Structural biology of DNA abasic site protection by SRAP proteins

Katherine M. Amidon (a), Brandt F. Eichman (a,b,*)

(a) Department of Biological Sciences, Vanderbilt University, Nashville, TN, 37232 USA
(b) Department of , Vanderbilt University School of Medicine, Nashville, TN, 37232 USA

ARTICLE INFO

Keywords: DNA-protein crosslink; Abasic site; Thiazolidine; DNA lyase; HMCES; SRAP

ABSTRACT

Abasic (AP) sites are one of the most frequently occurring types of DNA damage. They lead to DNA strand breaks and interstrand DNA crosslinks, and they block transcription and replication. Mutagenicity of AP sites arises from translesion synthesis (TLS) by error-prone bypass polymerases. Recently, a new cellular response to AP sites was discovered, in which the protein HMCES (5-hydroxymethylcytosine (5hmC) binding, embryonic stem cell-specific) forms a stable, covalent DNA-protein crosslink (DPC) to AP sites at stalled replication forks. The stability of the HMCES-DPC prevents strand cleavage by endonucleases and mutagenic bypass by TLS polymerases. Crosslinking is carried out by a unique SRAP (SOS Response Associated Peptidase) domain conserved across all domains of life. Here, we review the collection of recently reported SRAP crystal structures from human HMCES and E. coli YedK, which provide a unified basis for SRAP specificity and a putative chemical mechanism of AP site crosslinking. We discuss the structural and chemical basis for the stability of the SRAP DPC and how it differs from covalent protein-DNA intermediates in DNA lyase catalysis of strand scission.

1. Introduction

1.1. AP site formation and outcomes

Apurinic/apyrimidinic (AP, or abasic) sites are one of the most common forms of DNA damage, occurring at rates of 10,000–30,000 per cell per day [1–3]. AP sites are generated directly by reactive oxygen species or ionizing radiation, and as intermediates during base excision repair of aberrant nucleotides. Frequent exposure of DNA to environmental toxins, UV radiation, and reactive cellular metabolites generates chemically modified nucleobases [4–6]. AP sites arise from either spontaneous or DNA glycosylase-catalyzed hydrolysis of the N-glycosidic bond that links the modified base to the deoxyribose [7,8] (Fig. 1A). DNA glycosylases remove a large number of alkylated, oxidized, and deaminated bases as the first step of the base excision repair (BER) pathway and are thus primarily responsible for AP sites [9,10].

AP sites are unstable and reactive, and can lead to DNA strand breaks, interstrand DNA crosslinks (ICLs), and DNA-protein crosslinks (DPCs) [11–13]. In solution, an AP site in DNA is at equilibrium between a ring-closed 2′-deoxy-D-erythro-pentofuranose and a ring-opened aldehyde (Fig. 1A). The AP site exists primarily in the cyclic furanose form as a mixture of α- and β-hemiacetals, with approximately 1% of the sugar in the ring-opened aldehyde form [14,15]. This electrophilic aldehyde is susceptible to base-catalyzed β-elimination of the 3′ phosphoryl group, generating a single-strand break [16] (Fig. 1B). AP sites can also react with exocyclic groups of nucleobases on the complementary strand to generate ICLs [12,17] (Fig. 1C), and with primary amines in proteins to generate DPCs [11] (Fig. 1D). In addition to their reactivity, AP sites lead to stalled replication forks by inhibiting replicative polymerases [18] (Fig. 1E). Moreover, replication forks that encounter an AP site on the template strand can lead to a double-strand break (DSB) (Fig. 1E). For a recent review of biological outcomes of AP sites, we direct the reader to an accompanying review in this special issue [19].

1.2. AP site repair and tolerance

Despite the fact that AP sites form more readily in single-stranded DNA (ssDNA) [7,20], AP site repair occurs within the context of double-stranded DNA (dsDNA) as part of BER [8,21,22]. In the second step of

Abbreviations: AP, Apurinic/apyrimidinic; ssDNA, single-stranded DNA; dsDNA, double-stranded DNA; DSB, double-strand break; DPC, DNA-protein crosslink; ICL, interstrand DNA crosslink; THF, tetrahydrofuran; 5′-dRP, 5′-deoxyribose phosphate; BER, base excision repair; NER, nucleotide excision repair; PARP, poly(ADP-ribose) polymerase; PCNA, proliferating cell nuclear antigen; PDB, Protein Data Bank; PIP, PCNA-interacting protein; TLS, translesion synthesis; HMCES, 5-hydroxymethylcytosine embryonic stem cells specific; 5hmC, 5-hydroxymethylcytosine; pol β, DNA polymerase β; SRAP, SOS response associated peptidase; NHEJ, non-homologous end joining

* Corresponding author. Present Address: Department of Biological Sciences, Vanderbilt University, Nashville, TN, 37235, USA.
E-mail address: [email protected] (B.F. Eichman).

https://doi.org/10.1016/j.dnarep.2020.102903
Received 3 June 2020; Received in revised form 24 June 2020; Accepted 25 June 2020
Available online 29 June 2020
1568-7864/ © 2020 Elsevier B.V. All rights reserved.

Fig. 1. Consequences of abasic sites. A. AP sites arise from enzymatic and spontaneous hydrolysis of the N-glycosidic bond and exist in either furanose or aldehyde forms. B. Base-catalyzed β-elimination of an AP site generates a strand break. C,D. Formation of an ICL (C) and a DPC (D) by nucleophilic attack of AP site C1ʹ by primary amines in DNA or proteins. E. Consequences of AP sites in the template strand during DNA replication. F. Incision of DNA by AP endonuclease and DNA lyases.
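The furanose/aldehyde equilibrium in panel A can be put into numbers. The sketch below converts the roughly 1% ring-opened fraction quoted in the Introduction into an equilibrium constant and a standard free energy. It treats the α- and β-hemiacetals as one "closed" state and assumes a temperature of 310.15 K; both are simplifications made here for illustration and the function names are hypothetical.

```python
import math

R_KJ = 8.314462618e-3  # molar gas constant in kJ/(mol*K)


def equilibrium_constant(fraction_open: float) -> float:
    """Return K = [aldehyde]/[furanose] for a two-state equilibrium."""
    if not 0.0 < fraction_open < 1.0:
        raise ValueError("fraction_open must lie strictly between 0 and 1")
    return fraction_open / (1.0 - fraction_open)


def standard_free_energy(k: float, temperature_k: float = 310.15) -> float:
    """Return dG = -RT ln K in kJ/mol (310.15 K is an assumed temperature)."""
    return -R_KJ * temperature_k * math.log(k)


k_open = equilibrium_constant(0.01)      # ~1% aldehyde, as quoted in the text
dg_open = standard_free_energy(k_open)   # positive: ring opening is uphill
print(f"K = {k_open:.4f}, dG = {dg_open:.1f} kJ/mol")
```

Under these assumptions ring opening is uphill by roughly 12 kJ/mol, which is consistent with the aldehyde being a minor but accessible species.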

BER, AP endonuclease I (APE1) incises DNA 5′ to the AP nucleotide to generate a 3′-OH and 5′-deoxyribose phosphate (5′-dRP) residue (Fig. 1F) [23]. The 3′-OH serves as a substrate for long-patch repair DNA synthesis by DNA pol δ or short-patch repair synthesis by DNA polymerase β (DNA pol β). DNA pol β also contains 5′-dRP lyase activity that cleaves the 3′ side of the AP site (Fig. 1F) [24].

AP sites that occur in ssDNA, such as those encountered during replication, are not removed by BER. APE1 has much weaker activity for AP sites in ssDNA than in dsDNA [25,26], and indeed, APE1 incision at an AP site in ssDNA would generate a strand break. Until recently, the only known fate of AP sites during DNA replication was lesion bypass by low-fidelity translesion synthesis (TLS) polymerases [18,27]. This error-prone damage-tolerance pathway allows replication to continue at the expense of potentially introducing mutations [28,29].

An error-free pathway for repair of replication-associated AP sites was discovered recently that depends on the protein HMCES [5-hydroxymethylcytosine (5hmC) binding, embryonic stem cell-specific (originally C3Orf37)] [30]. Cells lacking HMCES exhibit elevated levels and delayed repair of AP sites, as well as increased double-strand breaks and mutation frequency from TLS. HMCES is recruited to replication forks via a direct interaction with proliferating cell nuclear antigen (PCNA), the processivity factor for replicative polymerases and a hub for replication-associated processes [31]. Importantly, HMCES forms a covalent DPC with AP sites present in ssDNA, but not dsDNA [30]. The HMCES DPC is highly stable and persists in cells [30,32]. Subsequent repair of the DPC is unknown, although the protein is eventually degraded via the proteasome [30]. The current model is that the highly stable HMCES DPC protects AP sites from cleavage and mutagenic TLS polymerases [30,33].

2. Function of the SRAP domain in AP site repair

HMCES cellular function and AP-site crosslinking activity depend on a highly conserved SRAP (SOS Response Associated Peptidase) domain, which constitutes the majority of the protein. The SRAP domain is conserved across all domains of life, with a SRAP-containing protein found in most bacteria and eukaryotes, some archaea, as well as viruses and bacteriophages [30,34]. HMCES is the only SRAP-containing protein present in humans. HMCES and other eukaryotic SRAP proteins have a C-terminal disordered region containing at least one non-canonical PCNA-interacting protein (PIP) box that mediates the direct interaction with PCNA [30], whereas the SRAP domain constitutes the entirety of the bacterial proteins [34]. The bacterial SRAP proteins do not appear to have an interacting motif (QL(S/D)LF) for the DNA polymerase III β-subunit (β-clamp) [35], the prokaryotic ortholog of PCNA [36], and their potential to interact with the sliding β-clamp is unknown. Crosslinking to AP sites in ssDNA has been demonstrated for the SRAP domain of human HMCES and the E. coli ortholog YedK [30].

SRAP was named for the proximity of the gene to other prokaryotic DNA repair (SOS response) genes, and for its highly conserved triad of cysteine, glutamate, and histidine residues (Cys2, Glu127, and His210 in HMCES) reminiscent of cysteine proteases [34]. Cys2 is invariant and required for HMCES function in cells and for the crosslinking activity of HMCES and YedK [30]. AP site crosslinking by Cys2 depends on removal of the N-terminal methionine by an aminopeptidase activity that exposes Cys2 at the N-terminus [30,34,37]. Cleavage of Met1 by HMCES autoproteolysis has been reported [37], although all organisms contain essential methionyl aminopeptidases that are capable of removing this residue. Crosslinking activity also depends on natural AP sites, as a tetrahydrofuran (THF) analog lacking the hydroxyl group at C1′ does not react with SRAP [30].

The native DPC formed between SRAP and AP sites is a unique DNA repair mechanism. Typically, proteins covalently conjugated to DNA occur either as deleterious lesions [38–40] or as a catalytic intermediate in DNA strand cleavage (lyase) reactions [41–43]. For example, the AP lyase activity of a number of DNA repair proteins depends on formation of a transient Schiff base intermediate between protein amino and DNA carbonyl groups [44,45]. In contrast, SRAP DPCs are highly stable: they persist on the order of hours in cells and days in vitro at physiological temperature, and are resistant to boiling for up to 10 min [30,32]. Consequently, the SRAP DPC blocks nuclease activity by APE1, even when SRAP has been proteolyzed to leave a DNA-peptide crosslink [30,32]. Thus, the


high degree of stability of the SRAP DPC is likely important for shielding AP sites from cleavage and explains the HMCES-dependent reduction of spontaneous DNA strand breaks in cells [30].

3. Structural basis for SRAP interaction with DNA

Over the past 14 years, a number of SRAP protein structures have been deposited in the Protein Data Bank (PDB) (Table 1). The first of these were unpublished entries of bacterial SRAP proteins, including E. coli YedK, from structural genomics groups. More recently, structures of the HMCES-SRAP domain (residues 1–270 and lacking the 84-residue C-terminal tail) appeared (Fig. 2). Despite the handful of structures and the high conservation of this protein domain, the function of SRAP was unknown at their time of deposition. In 2019, 12 crystal structures from HMCES-SRAP and YedK were published, 11 of which were in complex with DNA (Table 1). This wealth of new structural information provided insights into the mechanism of these fascinating proteins, as well as a more detailed understanding of their interaction with DNA.

Table 1. SRAP protein crystal structures.

Protein | DNA (a) | PDB ID | Reference
H. sapiens HMCES SRAP domain | None | 5KO9 | [48]
H. sapiens HMCES SRAP domain | Non-covalent overhang, 5′CCAGACGTTG3′ / 3′GGTCTG5′ | 6OEA | [48]
H. sapiens HMCES SRAP domain | Non-covalent overhang, 5′CCAGACGTT3′ / 3′GGTCTG5′ | 6OEB | [48]
H. sapiens HMCES SRAP domain | Covalent overhang, 5′CCAGACGT(AP)3′ / 3′GGTCTG5′ | 6OE7 | [48]
H. sapiens HMCES SRAP domain | Non-covalent palindromic, 5′CAACGTTGTTTT3′ / 3′TTTTGTTGCAAC5′ | 6OOV | [49]
E. coli YedK | Covalent ssDNA, 5′GTC(AP)GGA3′ | 6NUA | [32]
E. coli YedK | Non-covalent ssDNA, 5′GTC(C3)GGA3′ | 6NUH | [32]
E. coli YedK | Covalent ssDNA, 5′AAA(AP)AA3′ | 6KCQ | [47]
E. coli YedK | Covalent ssDNA, 5′TTC(AP)3′ | 6KIJ | [47]
E. coli YedK | Covalent ssDNA, 5′CGGT(AP)3′ | 6KBX | [47]
E. coli YedK | Non-covalent ssDNA, 5′GGT(THF)GATTC3′ | 6KBZ | [47]
E. coli YedK | Non-covalent ssDNA, 5′GGTCGATTC3′ | 6KBS | [47]
E. coli YedK | None | 6KBU | [47]
E. coli YedK | None | 2ICU | SECSG, RSGI (b)
Bordetella bronchiseptica Q7WLM8 | None | 1ZN6 | NESG (c)
Bordetella bronchiseptica BB2244 | None | 2BDV | NESG (c)
Agrobacterium tumefaciens Atu5096 | None | 2AEG | NESG (c)
Bacteroides thetaiotaomicron BT_1218 | None | 2F20 | NESG (c)

(a) AP, abasic site; C3, C3-spacer; THF, tetrahydrofuran.
(b) SECSG, Southeast Collaboratory for Structural Genomics; RSGI, RIKEN Structural Genomics/Proteomics Initiative.
(c) NESG, Northeast Structural Genomics Consortium.

3.1. SRAP architecture

The HMCES SRAP domain and YedK, which share 29% sequence identity and 43% similarity, are highly similar in structure, with an RMSD of 1.2 Å for all Cα atoms (PDB IDs 5KO9 and 2ICU). The only noticeable differences between the HMCES and YedK structures are in the loop regions, which have weaker sequence similarity and in HMCES are noticeably longer. HMCES residues 149–159, found only in sequences from higher eukaryotes, are disordered in all available HMCES-SRAP structures. According to the SCOP (Structural Classification of Proteins) database [46], SRAP domains possess a unique BB1717-like fold containing a bifurcated β-sheet surrounded by several α-helices. The β-barrel core forms a positively charged DNA binding channel, with the Cys2 active site in the middle (Fig. 2A). Not surprisingly, this channel is the most highly conserved region of the SRAP domain [32,47]. The channel cradles the phosphoribose backbone in all available SRAP-DNA structures. Two highly conserved surface-exposed arginine residues, Arg98 and Arg212 in HMCES (Arg77 and Arg162 in YedK), interact with phosphates of the DNA backbone and are required for DNA binding [30,32,47,48].

Despite the differences in the DNA ligands used in the HMCES and YedK structures (Table 1), the observed conformations of DNA and the protein-DNA interactions were remarkably similar, even between non-covalent and DPC complexes, indicating that SRAP interaction with DNA is likely not sequence-dependent. Several of the YedK-DNA structures (PDB IDs 6NUA, 6NUH, 6KBS, 6KBZ, 6KCQ) contained a continuous ssDNA molecule bound across the entire channel, which showed in detail how the DNA backbone is highly kinked and twisted at the position of the active site (Fig. 2A) [32,47]. The HMCES-SRAP structures (PDB IDs 6OEA, 6OEB, and 6OE7) contained duplex DNA with a 3′-ssDNA overhang tail that bound to one side of the channel up to the active site (Fig. 2B) [48]. The duplex portion of a symmetry-related molecule bound to the other side, creating a semi-continuous strand that overlays with the continuous ssDNA in the YedK structures (Fig. 2B,C). A more recent HMCES-SRAP structure (PDB ID 6OOV) was crystallized with a palindromic DNA containing 3′ overhangs on each end. The overhangs link two symmetry-related protomers by binding to one side of their positive channels in the same manner as the previous HMCES structures, but neither protomer contains DNA on the other side of the channel [49].

Importantly, all of the structures showed that a conserved “wedge” motif protrudes into one side of the positive channel and disrupts nucleobase stacking, precluding a complementary second strand from pairing immediately adjacent to the 5′ side of the AP site (Fig. 2A,B). This structural feature, together with the sharp kink in the DNA at the active site, explains SRAP’s preference for ssDNA [30,32,47,48]. Comparison of DNA-bound and free forms of HMCES-SRAP and YedK shows that there is very little change in the SRAP DNA binding site upon binding DNA (Cα RMSD of 1.16 Å for YedK and 1.00 Å for HMCES-SRAP), indicating that restructuring of the DNA in this particular way is a key feature of the protein. Furthermore, structures of YedK bound non-covalently to DNA containing either a full nucleotide or different types of abasic sites showed almost no difference in the DNA conformation, despite their different crystal packing arrangements (Fig. 2D,E).

3.2. SRAP accommodates 3′-junction structures

In contrast to the DNA 5′ to the AP site, which is kinked by the wedge motif, all of the structures revealed that the DNA on the other side of the active site adopts a B-form conformation, and that duplex DNA can be accommodated immediately adjacent to the 3′ side of the AP site [32,47,48]. Such a dsDNA-ssDNA junction would be formed by a stalled replicative polymerase during DNA synthesis. Four structures explicitly showed dsDNA bound to this side of the positively charged channel via crystal lattice interactions. Two HMCES-SRAP structures (PDB IDs 6OE7, 6OEB) contained the dsDNA portion of a symmetry-related molecule bound to the channel such that the blunt end of the duplex stacked against a highly conserved protein surface immediately adjacent to the active site, which we refer to as the DNA “shelf” (Fig. 2F). Similarly, in two YedK structures crystallized with ssDNA (PDB IDs 6KBS, 6KBZ), the DNA from one complex partially hybridizes to the


Fig. 2. SRAP-DNA structure. A,B. Electrostatic surface potential of (A) YedK-DPC (PDB ID 6NUA) and (B) HMCES-SRAP DPC (PDB ID 6OE7) structures. The AP sites are green, and the symmetry-related DNA molecule in HMCES is shown in black and grey. The white asterisk denotes the position of the active site. C. Superposition of YedK-DPC (PDB ID 6NUA) and HMCES-SRAP DPC (PDB ID 6OE7). The DPC is marked with an asterisk. The ends of the DNA in the HMCES structure have been removed for clarity. D,E. Superposition of YedK bound non-covalently to ssDNA containing (D) a cytosine or an abasic site analog, and (E) two different abasic site analogs. The position of the nucleotide at the active site is marked with an asterisk. F,G. Similarity of dsDNA bound to (F) HMCES-SRAP (PDB ID 6OEB) and (G) YedK (PDB ID 6KBS). The asymmetric unit is colored blue or orange, and symmetry related DNA is shown in black and grey. The red triangle marks the position of the active site. In the schematic at the bottom, base pairs are denoted by open circles. H. Superposition of the two structures in panels F and G. DNA is colored blue/cyan in HMCES and orange/gold in YedK.
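The superpositions in this figure, and the Cα RMSD values quoted in the text (for example 1.2 Å between the HMCES SRAP domain and YedK), rest on the root-mean-square deviation between paired atoms. The sketch below shows only that final calculation. It assumes the two coordinate sets are already superposed and paired one-to-one, and the coordinates are made-up values for illustration, not taken from any PDB entry.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(a: Sequence[Coord], b: Sequence[Coord]) -> float:
    """Root-mean-square deviation between two equally sized, paired atom sets.

    No fitting is performed here: the inputs are assumed to be superposed.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("coordinate sets must be non-empty and equal in size")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(a, b):
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(a))


# Hypothetical C-alpha positions (in angstroms) for three residues.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
shifted = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]

value = rmsd(reference, shifted)  # every atom displaced by 1 angstrom
print(f"RMSD = {value:.2f} A")
```

In practice a least-squares superposition step precedes this calculation; it is omitted here to keep the example short.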

DNA from a symmetry-related molecule, effectively establishing a dsDNA-ssDNA junction on the 3′ side of the AP site (Fig. 2G). The kinking of the continuous ssDNA along the channel allows for the nested 3′-end of the duplex to stack against the DNA shelf. The structures of the dsDNA and nested 3′-ends in HMCES and YedK are remarkably similar (Fig. 2H). An additional YedK-ssDNA structure (PDB ID 6NUA) did not contain dsDNA, but the ssDNA 3′ to the AP site adopted a B-form conformation such that duplex DNA could be easily modeled to create a dsDNA-ssDNA junction with a nested 3′-end [32]. The structure of this modeled 3′-junction is virtually identical to both the human and bacterial SRAP structures that contain duplex from different crystal packing interactions, indicating that this interaction is an inherent property of SRAP proteins. Interestingly, ssDNA 3′ to the AP site in two additional YedK-DPC structures (PDB IDs 6KIJ, 6KBX) and in a non-covalent YedK-DNA complex (PDB ID 6NUH) either did not exhibit sufficient density to be modeled or showed high B-factors, suggesting that ssDNA 3′ to the AP site is relatively mobile in the absence of a base-paired strand to stabilize that region of DNA. The preferential binding to 3′-junctions was verified biochemically in both HMCES and YedK [32,47,48].

4. The unique SRAP DNA-protein crosslink

The molecular basis for the stability of the SRAP DPC was elucidated by a 1.6-Å crystal structure of YedK crosslinked to AP-DNA (PDB ID 6NUA), which revealed a thiazolidine linkage between the ring-opened AP deoxyribose and the α-amino and sulfhydryl groups of the N-terminal Cys2 residue [32] (Fig. 3A). This specific DPC linkage was also observed in subsequent high-resolution (1.2 Å) YedK-DPC structures [47] and in a 2.2-Å DPC structure of HMCES-SRAP (PDB ID 6OE7, Fig. 3B), which revealed the conserved mode of SRAP crosslinking across domains of life [48]. Thiazolidine adducts between proteins and formaldehyde have been identified [50,51], but to our knowledge a thiazolidine linkage between protein and DNA has not been described previously.

4.1. The SRAP active site

The residues responsible for DPC formation lie at the center of the DNA binding channel and include the invariant Cys2 and a highly conserved glutamate (Glu127 in HMCES; Glu105 in YedK), histidine (His210 in HMCES; His160 in YedK), and asparagine (Asn96 in HMCES; Asn75 in YedK). These residues are critical for crosslinking, as substitution to alanine either abrogates or significantly reduces activity [30,32,47]. Their interactions with the AP site are conserved between HMCES and YedK structures (Fig. 3A,B). In the absence of DNA, the thiol side chain of Cys2 points down into the protein, with the N-terminal amine position varying (HMCES, PDB ID 5KO9; YedK, PDB ID 6KBU). In one structure (YedK, PDB ID 2ICU) Cys2 was not observed in the crystallographic data, suggesting a degree of disorder potentially due to the presence of an N-terminal tag. When either HMCES or YedK is bound to DNA containing a natural base (PDB IDs 6OEA, 6OEB, 6OOV, 6KBS), Cys2 is observed in multiple positions and the nucleobase is facing away from the active site [47,48]. In non-covalent complexes of YedK with an AP site analog, THF or C3-spacer (PDB IDs


Fig. 3. SRAP active site and mechanism of crosslinking. A,B. Atomic details of the thiazolidine DPC in (A) YedK (PDB ID 6NUA) and (B) HMCES-SRAP (PDB ID 6OE7). DNA is greyscale with AP site dark grey, and SRAP is colored by amino acid. Interatomic distances (Å) are labeled, with hydrogen bonds indicated by dark dashes and close contacts with light dashes. C. Proposed catalytic mechanism of crosslinking.
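The mechanism in panel C has the Cys2 α-amino group attack first and the thiolate second, so the protonation state of each group matters. As a rough illustration, the Henderson-Hasselbalch relation gives the fraction of a titratable group that is deprotonated at a given pH. The pKa values and the pH below are hypothetical numbers chosen only to show the direction of the effect described in the text (a hydrophobic pocket lowering the amine pKa and raising the thiol pKa); they are not measured values for any SRAP protein.

```python
def fraction_deprotonated(pka: float, ph: float) -> float:
    """Henderson-Hasselbalch: fraction of a group in its deprotonated form."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


PH = 7.4  # assumed physiological pH

# Hypothetical pKa values, for illustration only.
amine_bulk, amine_pocket = 8.0, 7.0   # alpha-amino group: pKa lowered in pocket
thiol_bulk, thiol_pocket = 8.5, 9.5   # Cys side chain: pKa raised in pocket

amine_bulk_frac = fraction_deprotonated(amine_bulk, PH)
amine_pocket_frac = fraction_deprotonated(amine_pocket, PH)
thiol_bulk_frac = fraction_deprotonated(thiol_bulk, PH)
thiol_pocket_frac = fraction_deprotonated(thiol_pocket, PH)

print(f"neutral amine: {amine_bulk_frac:.2f} -> {amine_pocket_frac:.2f}")
print(f"thiolate:      {thiol_bulk_frac:.3f} -> {thiol_pocket_frac:.3f}")
```

With these illustrative numbers the neutral, nucleophilic amine becomes more populated while the thiolate becomes less populated, which is the qualitative trend that would favor the amine as the initiating nucleophile.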

6KBZ, 6NUH), Cys2 is rotated 180° relative to its buried position in the unbound structures, restructured into a pre-catalytic, crosslinkable position [32,47]. Notably, the backbone of the THF structure is twisted by 90° such that the THF is positioned towards the Cys2, similar to the DNA conformation seen in the DPC structures.

The histidine side chain hydrogen bonds to the O4′ hydroxyl of the ring-opened AP site in the DPC structure but not in the non-covalent THF structure. The structure of YedK noncovalently bound to DNA with a cytosine in the active site position shows the nucleotide rotated such that His160 now forms a hydrogen bond with O3′ as opposed to the O4′ in the DPC structure [47]. The structure of HMCES-SRAP bound to palindromic DNA containing 3′ overhangs (PDB ID 6OOV) shows the 3′-OH of each overhang hydrogen bonded to His210 [49]. In the various DPC structures, the glutamate side chain is in close contact with the thiazolidine ring, at distances ranging from 2.6 to 4.2 Å. In the YedK DPC structures, a second glutamate conformer is observed within hydrogen bonding distance of the phosphate 3′ to the AP site, and is thus likely protonated to avoid repulsion with the negative backbone [32,47]. On the other side of the thiazolidine crosslink, the asparagine hydrogen bonds with the carbonyl oxygen and the backbone amide nitrogen of Cys2.

4.2. Catalytic mechanism of SRAP DPC formation

Based on the SRAP-DNA structures and previous work on thiazolidine chemistry [52–55], we propose the following mechanism for SRAP DPC formation at AP sites. Thiazolidine formation would proceed by nucleophilic attack of the AP site C1′ carbon by the N-terminal Cys2 α-amino group to form a Schiff base intermediate, followed by a second nucleophilic attack of the imino carbon by the Cys2 thiolate side chain (Figs. 3C and 4A). Because of the low abundance of the aldehyde form of an AP site [14], it is possible that SRAP first catalyzes AP site ring opening, although the alternative in which SRAP traps a spontaneously formed aldehyde is also possible. Based on their positions in the crystal structures, Glu127 and His210 could catalyze ring opening by providing a general base to deprotonate the O1′ hydroxyl and a general acid to protonate O4′, both of which are necessary for aldehyde formation. The glutamate would then facilitate Schiff base formation by acid-base cycling to deprotonate Cys2 α-NH2 and possibly protonate the AP site O1′. Glu127 would also be positioned to deprotonate the Cys2 sulfhydryl to create the thiolate nucleophile required to complete formation of the thiazolidine (Fig. 3C). A cycle of Glu127 ionization is supported by the two observed conformations: an ionized conformer contacting the thiazolidine nitrogen and a protonated conformer contacting the phosphate 3′ to the AP site [32]. The hydrophobic residues around the active site would raise the pKa of the glutamate, increasing the likelihood that it acts as both a proton donor and acceptor in the acid-base cycle. Finally, based on its position in the structures, the highly conserved asparagine likely positions the N-terminal Cys2 for nucleophilic attack.

4.3. Thiazolidine stability

The Cys2 active site is surrounded by a pocket of strongly conserved hydrophobic residues. The solvent inaccessibility of this conserved DNA binding pocket likely provides an optimal environment for crosslink formation and helps to shield it from hydrolysis. However, shielding is not the sole basis for crosslink stability, as the peptide-DNA crosslink that remains after proteolysis of the DPC is also persistent [32]. Indeed, the chemical nature of the thiazolidine linkage is inherently stable and its formation proceeds efficiently under acidic conditions, as revealed by reactions between free cysteine and various aldehydes [52,53,56,57]. The thiazolidine ring is stable up to at least 3 M HCl, and reversal back to aldehyde and free cysteine is only observed under basic conditions greater than 1 M NaOH [54]. This stability has also been exploited in conjugation reactions to generate peptide-peptide linkages or to attach site-specific labels [58,59].

The stability of the thiazolidine ring explains how SRAP protects AP sites from spontaneous strand breakage. Both AP sites and their Schiff base conjugates to proteins are susceptible to β-elimination, leading to cleavage of the DNA backbone (Figs. 1B and 4). Thiazolidines exist in equilibrium with the Schiff base, but the ring-closed thiazolidine is favored over the ring-opened iminium ion by 5 orders of magnitude [52,53,55]. Thus, the closed thiazolidine ring draws the equilibrium


Fig. 4. Comparison of DPCs formed by SRAP and AP lyases. A. Wild-type SRAP forms a thiazolidine DPC via a Schiff base intermediate. B. The Schiff base formed by a SRAP C2A mutant is prone to β-elimination and can be reduced to a stable DPC via borohydride treatment. C. Catalysis of DNA lyase activity by bifunctional DNA glycosylases.
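The contrast between panels A and C can be illustrated with a simple population estimate. The text states that the ring-closed thiazolidine is favored over the ring-opened iminium ion by about 5 orders of magnitude. The sketch below turns that ratio into the fraction of adduct present as the elimination-prone Schiff base, and compares it with an adduct that cannot ring-close. It assumes rapid equilibration and an elimination rate proportional to the iminium population, which is a simplification introduced here and not a rate law taken from the source.

```python
def open_fraction(closed_to_open_ratio: float) -> float:
    """Equilibrium fraction of adduct in the ring-opened (iminium) form."""
    if closed_to_open_ratio < 0:
        raise ValueError("ratio must be non-negative")
    return 1.0 / (1.0 + closed_to_open_ratio)


# Thiazolidine favored ~10^5-fold over the iminium ion (value from the text).
srap_like = open_fraction(1e5)

# A lyase-type adduct has no second nucleophile, so no ring-closed state.
lyase_like = open_fraction(0.0)

# Under the proportional-rate assumption, this is the fold reduction in
# beta-elimination conferred by ring closure.
fold_protection = lyase_like / srap_like

print(f"iminium fraction with ring closure: {srap_like:.1e}")
print(f"approximate fold protection: {fold_protection:.0f}")
```

On these assumptions only about one adduct in 100,000 is in the cleavable form at any moment, which is one way to picture the thiazolidine acting as a sink for the AP site.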

away from Schiff base degradation (Fig. 4A). Consistent with this premise, removal or substitution of the thiolate by a YedK C2A or C2S mutant not only abrogated crosslink formation but also led to cleavage of the DNA to form a product consistent with β-elimination at the AP site (Fig. 4B) [32]. Evidence for the Schiff base intermediate in SRAP was provided by borohydride trapping [60] of YedK C2A and C2S DPC intermediates (Fig. 4B), which dramatically reduced DNA strand cleavage [32]. Formation of a Schiff base also indicates that the N-terminal amine, not the Cys2 thiolate, initiates crosslink formation. Indeed, the hydrophobic environment of this residue would raise the pKa of the thiol and lower the pKa of the amine, favoring the amine as the nucleophile.

A C2S mutant could theoretically form an oxazolidine ring by the same mechanism as thiazolidine formation by Cys2. To our knowledge, however, there is no evidence of a SRAP C2S variant in nature. We did not observe any DPC formation in our biochemical experiments with a C2S mutant. An oxazolidine is less stable than a thiazolidine [55,61], likely because sulfur is 0.4 Å larger than oxygen and can form longer bonds that could aid in the stability of the thiazolidine [62,63]. Moreover, the side chain of cysteine has a lower pKa than that of serine, making cysteine a better nucleophile. Thus, an oxazolidine is unlikely to form, and if it did form, it would be unable to shift the equilibrium away from the competing β-elimination reaction.

4.4. Comparison to DPC formation in DNA lyases

The N-terminal cysteine in SRAP is unique among DNA repair proteins. A number of repair enzymes interact with AP sites via Schiff base intermediates, but typically do so to catalyze strand scission [64–67]. Bifunctional DNA glycosylases contain AP lyase activity that cleaves the DNA backbone 3′ to the AP site [68]. Pol β and Ku antigen contain 5′-deoxyribophosphate (5′-dRP) lyase activities important for single-strand break processing during BER and non-homologous end joining (NHEJ), respectively [24,69,70]. PARP1 and PARP2 have been shown to exhibit AP lyase and 5′-dRP lyase activity in vitro [71,72]. Similar to the SRAP crosslinking mechanism, DNA lyases use an amine nucleophile (either an internal lysine or an N-terminal proline or valine) to form a Schiff base intermediate that can be trapped by borohydride reduction [44,60,73]. Lyases lack a second nucleophile to stabilize the DPC, and instead the Schiff base increases the basicity of C2′ to promote catalysis of β-elimination and cleavage of the DNA (Fig. 4C). Thus, the main distinction between AP lyases and SRAP is the ability of the SRAP crosslinking nucleophile to carry out a second nucleophilic attack of the imino C1′ carbon to form a stable linkage, thereby acting as an AP-site sink to inhibit the competing elimination reaction. Wild-type SRAP does not lead to strand cleavage, but rather prevents it.

Interestingly, some AP lyases form stable DPCs with oxidized AP sites in mammalian cells, exhibiting half-lives of 15–60 min before proteasome-dependent resolution [40,74,75]. These AP lyase DPCs were associated with an increase in double-strand breaks in cells, in contrast to the HMCES-dependent reduction in double-strand breaks [30,69]. Thus, whereas HMCES DPCs play a protective role, the oxidative AP lyase DPCs are thought to be toxic byproducts of DNA repair proteins trapped in unproductive complexes [40]. In contrast, PARP1 has been shown to form a DPC with AP sites by reduction of the Schiff base via an intrinsic redox activity of the protein, similar to the reaction in borohydride trapping experiments. Like HMCES, the PARP1 DPC is proposed to protect AP sites and recruit repair proteins [76].

5. Concluding remarks

This review has focused primarily on the mechanics of SRAP crosslinking to AP sites. However, in addition to its role in the replication stress response [30,77], HMCES is implicated in a number of other DNA-related transactions across species, including viral replication [78], aging [79], alternative end-joining during B-cell class switch recombination [49], and long interspersed element-1 (LINE-1) retrotransposition [80–82]. Moreover, HMCES expression is significantly downregulated in both B-cell lymphoma and multidrug-resistant osteosarcoma cells [83,84]. HMCES was named for its identification in a screen for readers of modified bases in embryonic stem cells [85],

although this name is likely a misnomer as HMCES is expressed in all cell lineages [86] and its binding to oxidatively modified bases has not been reproduced. Moreover, global DNA methylation patterns are not significantly altered in HMCES knockouts [30,49], and the catalytic SRAP domain is conserved in prokaryotes, which do not utilize 5hmC [87].

An important open question relates to the fate of the HMCES DPC in cells. HMCES is ultimately targeted for ubiquitin-dependent destruction by the proteasome [30]. Is the resulting peptide-DNA crosslink repaired by nucleotide excision or some other repair pathway? In HMCES, several lysine residues located in the C-terminal tail and on the surface of the protein opposite the DNA binding channel have been found to be ubiquitylated via mass spectrometry [88,89]. The majority of these lysines are highly conserved in eukaryotic SRAP proteins. These lysines and other predicted phosphorylation and SUMOylation sites are mutated in some human tumors [88,90]. Other cancer associated mutations appear to be destabilizing mutations as they reside in the protein interior and likely result in improper folding and/or subsequent degradation. HMCES regulation and the fate of SRAP-AP DPCs in cells represent the next frontier in understanding this important class of protein.

Declaration of Competing Interest

There are no conflicts of interest to declare.

Acknowledgement

Research on abasic site repair in the Eichman lab is funded by National Institutes of Health (USA) grants R35GM136401 and R01GM131071. K.M.A. is supported by National Institutes of Health (USA) grant T32GM008320.

References

[1] J. Nakamura, V.E. Walker, P.B. Upton, S.-Y. Chiang, Y.W. Kow, J.A. Swenberg, Highly sensitive apurinic/apyrimidinic site assay can detect spontaneous and chemically induced depurination under physiological conditions, Cancer Res. 58 (1998) 222–225.
[2] J. Nakamura, J.A. Swenberg, Endogenous apurinic/apyrimidinic sites in genomic DNA of mammalian tissues, Cancer Res. 59 (1999) 2522–2526.
[3] T. Lindahl, Instability and decay of the primary structure of DNA, Nature 362 (1993) 709–715.
[4] D.E. Barnes, T. Lindahl, Repair and genetic consequences of endogenous DNA base damage in mammalian cells, Annu. Rev. Genet. 38 (2004) 445–476.
[5] J. Cadet, J.R. Wagner, DNA base damage by reactive oxygen species, oxidizing agents, and UV radiation, Cold Spring Harb. Perspect. Biol. (2013) 5.
[6] R. De Bont, N. van Larebeke, Endogenous DNA damage in humans: a review of quantitative data, Mutagenesis 19 (2004) 169–185.
[7] T. Lindahl, B. Nyberg, Rate of depurination of native deoxyribonucleic acid, Biochemistry 11 (1972) 3610–3618.
[8] H.E. Krokan, M. Bjørås, Base excision repair, Cold Spring Harb. Perspect. Biol. 5 (2013) a012583.
[9] E.A. Mullins, A.A. Rodriguez, N.P. Bradley, B.F. Eichman, Emerging roles of DNA glycosylases and the base excision repair pathway, Trends Biochem. Sci. 44 (2019) 765–781.
[10] K. Hitomi, S. Iwai, J.A. Tainer, The intricate structural chemistry of base excision repair machinery: implications for DNA damage recognition, removal, and repair, DNA Repair 6 (2007) 410–428.
[11] J. Nakamura, M. Nakamura, DNA-protein crosslink formation by endogenous aldehydes and AP sites, DNA Repair 88 (2020) 102806.
[12] Z. Yang, N.E. Price, K.M. Johnson, Y. Wang, K.S. Gates, Interstrand cross-links arising from strand breaks at true abasic sites in duplex DNA, Nucleic Acids Res. 45 (2017) 6275–6283.
[13] T. Lindahl, A. Andersson, Rate of chain breakage at apurinic sites in double-stranded deoxyribonucleic acid, Biochemistry 11 (1972) 3618–3623.
[14] W.G. Overend, 533. Deoxy-sugars. Part XIII. Some observations on the Feulgen nucleal reaction, J. Chem. Soc. (Resumed) (1950) 2769–2774.
[15] M. Manoharan, S.C. Ransom, A. Mazumder, J.A. Gerlt, J.A. Wilde, J.A. Withka, P.H. Bolton, The characterization of abasic sites in DNA heteroduplexes by site specific labeling with carbon-13, J. Am. Chem. Soc. 110 (1988) 1620–1622.
[16] J. Lhomme, J.F. Constant, M. Demeunynck, Abasic DNA structure, reactivity, and recognition, Biopolym. Origi. Res. Biomol. 52 (1999) 65–83.
[17] N.E. Price, K.M. Johnson, J. Wang, M.I. Fekry, Y. Wang, K.S. Gates, Interstrand DNA–DNA cross-link formation between adenine residues and abasic sites in duplex DNA, J. Am. Chem. Soc. 136 (2014) 3483–3490.
[18] L. Haracska, I. Unk, R.E. Johnson, E. Johansson, P.M. Burgers, S. Prakash, L. Prakash, Roles of yeast DNA polymerases δ and ζ and of Rev1 in the bypass of abasic sites, Genes Dev. 15 (2001) 945–954.
[19] P.S. Thompson, D. Cortez, New insights into abasic site repair and tolerance, DNA Repair (Amst) 90 (2020) 102866.
[20] B. Kavli, M. Otterlei, G. Slupphaug, H.E. Krokan, Uracil in DNA—general mutagen, but normal intermediate in acquired immunity, DNA Repair 6 (2007) 505–516.
[21] D.K. Srivastava, B.J.V. Berg, R. Prasad, J.T. Molina, W.A. Beard, A.E. Tomkinson, S.H. Wilson, Mammalian abasic site base excision repair. Identification of the reaction sequence and rate-determining steps, J. Biol. Chem. 273 (1998) 21203–21209.
[22] H.E. Krokan, R. Standal, G. Slupphaug, DNA glycosylases in the base excision repair of DNA, Biochem. J. 325 (1997) 1–16.
[23] A. Spiering, W. Deutsch, Drosophila apurinic/apyrimidinic DNA endonucleases. Characterization of mechanism of action and demonstration of a novel type of enzyme activity, J. Biol. Chem. 261 (1986) 3222–3228.
[24] S.L. Allinson, I.I. Dianova, G.L. Dianov, DNA polymerase β is the major dRP lyase involved in repair of oxidative base lesions in DNA by mammalian cell extracts, EMBO J. 20 (2001) 6919–6926.
[25] J.P. Erzberger, D. Barsky, O.D. Schärer, M.E. Colvin, D.M. Wilson III, Elements in abasic site recognition by the major human and Escherichia coli apurinic/apyrimidinic endonucleases, Nucleic Acids Res. 26 (1998) 2771–2778.
[26] D.M. Wilson, M. Takeshita, A.P. Grollman, B. Demple, Incision activity of human apurinic endonuclease (Ape) at abasic site analogs in DNA, J. Biol. Chem. 270 (1995) 16002–16007.
[27] R.M. Schaaper, T.A. Kunkel, L.A. Loeb, Infidelity of DNA synthesis associated with bypass of apurinic sites, Proc. Natl. Acad. Sci. 80 (1983) 487–491.
[28] P.L. Andersen, F. Xu, W. Xiao, Eukaryotic DNA damage tolerance and translesion synthesis through covalent modifications of PCNA, Cell Res. 18 (2008) 162.
[29] K.T. Powers, M.T. Washington, Eukaryotic translesion synthesis: choosing the right tool for the job, DNA Repair 71 (2018) 127–134.
[30] K.N. Mohni, S.R. Wessel, R. Zhao, A.C. Wojciechowski, J.W. Luzwick, H. Layden, B.F. Eichman, P.S. Thompson, K.P. Mehta, D. Cortez, HMCES maintains genome integrity by shielding abasic sites in single-strand DNA, Cell 176 (2019) 144–153 e113.
[31] E.M. Boehm, M.S. Gildenberg, M.T. Washington, The many roles of PCNA in eukaryotic DNA replication, Enzymes 39 (2016) 231–254.
[32] P.S. Thompson, K.M. Amidon, K.N. Mohni, D. Cortez, B.F. Eichman, Protection of abasic sites during DNA replication by a stable thiazolidine protein-DNA cross-link, Nat. Struct. Mol. Biol. 26 (2019) 613–618.
[33] K.P.M. Mehta, C.A. Lovejoy, R. Zhao, D.R. Heintzman, D. Cortez, HMCES maintains replication fork progression and prevents double-strand breaks in response to APOBEC deamination and abasic site formation, Cell Rep. 31 (2020) 107705.
[34] L. Aravind, S. Anand, L.M. Iyer, Novel autoproteolytic and DNA-damage sensing components in the bacterial SOS response and oxidized methylcytosine-induced eukaryotic DNA demethylation systems, Biol. Direct 8 (2013) 20.
[35] J.M. Duzen, G.C. Walker, M.D. Sutton, Identification of specific amino acid residues in the E. coli beta processivity clamp involved in interactions with DNA polymerase III, UmuD and UmuD′, DNA Repair (Amst) 3 (2004) 301–312.
[36] G. Tomer, N.B. Reuven, Z. Livneh, The beta subunit sliding DNA clamp is responsible for unassisted mutagenic translesion replication by DNA polymerase III holoenzyme, Proc. Natl. Acad. Sci. U. S. A. 95 (1998) 14106–14111.
[37] S.-M. Kweon, B. Zhu, Y. Chen, L. Aravind, S.-Y. Xu, D.E. Feldman, Erasure of tet-oxidized 5-Methylcytosine by a SRAP nuclease, Cell Rep. 21 (2017) 482–494.
[38] J.M. Covey, C. Jaxel, K.W. Kohn, Y. Pommier, Protein-linked DNA strand breaks induced in mammalian cells by camptothecin, an inhibitor of topoisomerase I, Cancer Res. 49 (1989) 5016–5022.
[39] H. Ide, T. Nakano, A.M.H. Salem, M.I. Shoulkamy, DNA-protein cross-links: formidable challenges to maintaining genome integrity, DNA Repair (Amst) 71 (2018) 190–197.
[40] J.L. Quiñones, U. Thapar, S.H. Wilson, D.A. Ramsden, B. Demple, Oxidative DNA-protein crosslinks formed in mammalian cells by abasic site lyases involved in DNA repair, DNA Repair 87 (2020) 102773.
[41] D.O. Zharkov, R.A. Rieger, C.R. Iden, A.P. Grollman, NH2-terminal proline acts as a nucleophile in the glycosylase/AP-lyase reaction catalyzed by Escherichia coli formamidopyrimidine-DNA glycosylase (Fpg) protein, J. Biol. Chem. 272 (1997) 5335–5341.
[42] R. Gilboa, D.O. Zharkov, G. Golan, A.S. Fernandes, S.E. Gerchman, E. Matz, J.H. Kycia, A.P. Grollman, G. Shoham, Structure of formamidopyrimidine-DNA glycosylase covalently complexed to DNA, J. Biol. Chem. 277 (2002) 19811–19816.
[43] Y. Pommier, P. Pourquier, Y. Urasaki, J. Wu, G.S. Laco, Topoisomerase I inhibitors: selectivity and cellular resistance, Drug Resist. Updates 2 (1999) 307–318.
[44] A.K. McCullough, A. Sanchez, M. Dodson, P. Marapaka, J.-S. Taylor, R.S. Lloyd, The reaction mechanism of DNA glycosylase/AP lyases at abasic sites, Biochemistry 40 (2001) 561–568.
[45] E.H. Cordes, W.P. Jencks, On the mechanism of Schiff base formation and hydrolysis, J. Am. Chem. Soc. 84 (1962) 832–837.
[46] A. Andreeva, E. Kulesha, J. Gough, A.G. Murzin, The SCOP database in 2020: expanded classification of representative family and superfamily domains of known protein structures, Nucleic Acids Res. 48 (2019) D376–D382.
[47] N. Wang, H. Bao, L. Chen, Y. Liu, Y. Li, B. Wu, H. Huang, Molecular basis of abasic site sensing in single-stranded DNA by the SRAP domain of E. coli yedK, Nucleic Acids Res. 47 (2019) 10388–10399.
[48] L. Halabelian, M. Ravichandran, Y. Li, H. Zeng, A. Rao, L. Aravind, C.H. Arrowsmith, Structural basis of HMCES interactions with abasic DNA and
multivalent substrate recognition, Nat. Struct. Mol. Biol. 26 (2019) 607–612.
[49] V. Shukla, L. Halabelian, S. Balagere, D. Samaniego-Castruita, D.E. Feldman, C.H. Arrowsmith, A. Rao, L. Aravind, HMCES functions in the alternative end-joining pathway of the DNA DSB repair during class switch recombination in B cells, Mol. Cell 77 (2019) 384–394.
[50] K.A. Higgins, D. Giedroc, Insights into protein allostery in the CsoR/RcnR family of transcriptional repressors, Chem. Lett. 43 (2014) 20–25.
[51] B. Metz, G.F. Kersten, P. Hoogerhout, H.F. Brugghe, H.A. Timmermans, A. de Jong, H. Meiring, J. ten Hove, W.E. Hennink, D.J. Crommelin, et al., Identification of formaldehyde-induced modifications in proteins: reactions with model peptides, J. Biol. Chem. 279 (2004) 6235–6243.
[52] S. Ratner, H.T. Clarke, The action of formaldehyde upon cysteine, J. Am. Chem. Soc. 59 (1937) 200–206.
[53] R.G. Kallen, Mechanism of reactions involving Schiff base intermediates. Thiazolidine formation from L-cysteine and formaldehyde, J. Am. Chem. Soc. 93 (1971) 6236–6248.
[54] J. Pesek, J. Frost, Decomposition of thiazolidines in acidic and basic solution: spectroscopic evidence for Schiff base intermediates, Tetrahedron 31 (1975) 907–913.
[55] M. Canle, A. Lawley, E.C. McManus, R.A.M. O’Ferrall, Rate and equilibrium constants for oxazolidine and thiazolidine ring-opening reactions, Pure Appl. Chem. 68 (1996) 813–818.
[56] M.P. Schubert, Compounds of thiol acids with aldehydes, J. Biol. Chem. 114 (1936) 341–350.
[57] A.H. Cook, I.M. Heilbron, H.T. Clarke, J.R. Johnson, R. Robinson (Eds.), Chemistry of Penicillin, Princeton University Press, 1949, pp. 921–972.
[58] I.E. Gentle, D.P. De Souza, M. Baca, Direct production of proteins with N-terminal cysteine for site-specific conjugation, Bioconjug. Chem. 15 (2004) 658–663.
[59] L. Zhang, J.P. Tam, Thiazolidine formation as a general and site-specific conjugation method for synthetic peptides and proteins, Anal. Biochem. 233 (1996) 87–93.
[60] J.H. Billman, A.C. Diesing, Reduction of Schiff bases with sodium borohydride, J. Org. Chem. 22 (1957) 1068–1070.
[61] G. Just, B.Y. Chung, S. Kim, G. Rosebery, P. Rossy, Reactions of oxygen and sulphur anions with oxazolidine and thiazolidine derivatives of 2-mesyloxymethylglyceraldehyde acetonide, Can. J. Chem. 54 (1976) 2089–2093.
[62] Av. Bondi, Van der Waals volumes and radii, J. Phys. Chem. 68 (1964) 441–451.
[63] H. Eramian, Y.-H. Tian, Z. Fox, H.Z. Beneberu, M. Kertesz, On the anisotropy of van der Waals atomic radii of O, S, Se, F, Cl, Br, and I, J. Phys. Chem. A 117 (2013) 14184–14190.
[64] C.E. Piersen, R. Prasad, S.H. Wilson, R.S. Lloyd, Evidence for an imino intermediate in the DNA polymerase β deoxyribose phosphate excision reaction, J. Biol. Chem. 271 (1996) 17811–17815.
[65] V. Bailly, W. Verly, T. O’Connor, J. Laval, Mechanism of DNA strand nicking at apurinic/apyrimidinic sites by Escherichia coli [formamidopyrimidine] DNA glycosylase, Biochem. J. 262 (1989) 581–589.
[66] B. Sun, K.A. Latham, M. Dodson, R.S. Lloyd, Studies on the catalytic mechanism of five DNA glycosylases probing for enzyme-DNA IMINO intermediates, J. Biol. Chem. 270 (1995) 19501–19508.
[67] G. Golan, D.O. Zharkov, A.P. Grollman, M. Dodson, A.K. McCullough, R.S. Lloyd, G. Shoham, Structure of T4 pyrimidine dimer glycosylase in a reduced imine covalent complex with abasic site-containing DNA, J. Mol. Biol. 362 (2006) 241–258.
[68] A.A. Purmal, L.E. Rabow, G.W. Lampman, R.P. Cunningham, Y.W. Kow, A common mechanism of action for the N-glycosylase activity of DNA N-glycosylase/AP lyases from E. coli and T4, Mutat. Res. 364 (1996) 193–207.
[69] S.A. Roberts, N. Strande, M.D. Burkhalter, C. Strom, J.M. Havener, P. Hasty, D.A. Ramsden, Ku is a 5′-dRP/AP lyase that excises nucleotide damage near broken ends, Nature 464 (2010) 1214–1217.
[70] N. Strande, S.A. Roberts, S. Oh, E.A. Hendrickson, D.A. Ramsden, Specificity of the dRP/AP lyase of Ku promotes nonhomologous end joining (NHEJ) fidelity at damaged ends, J. Biol. Chem. 287 (2012) 13686–13693.
[71] M.M. Kutuzov, S.N. Khodyreva, E.S. Ilina, M.V. Sukhanova, J.C. Amé, O.I. Lavrik, Interaction of PARP-2 with AP site containing DNA, Biochimie 112 (2015) 10–19.
[72] S. Khodyreva, R. Prasad, E. Ilina, M. Sukhanova, M. Kutuzov, Y. Liu, E. Hou, S. Wilson, O. Lavrik, Apurinic/apyrimidinic (AP) site recognition by the 5′-dRP/AP lyase in poly (ADP-ribose) polymerase-1 (PARP-1), Proc. Natl. Acad. Sci. 107 (2010) 22090–22095.
[73] A. Prakash, B.E. Eckenroth, A.M. Averill, K. Imamura, S.S. Wallace, S. Doublié, Structural investigation of a viral ortholog of human NEIL2/3 DNA glycosylases, DNA Repair (Amst) 12 (2013) 1062–1071.
[74] J.L. Quiñones, U. Thapar, K. Yu, Q. Fang, R.W. Sobol, B. Demple, Enzyme mechanism-based, oxidative DNA–protein cross-links formed with DNA polymerase β in vivo, Proc. Natl. Acad. Sci. 112 (2015) 8602–8607.
[75] M.S. DeMott, E. Beyret, D. Wong, B.C. Bales, J.-T. Hwang, M.M. Greenberg, B. Demple, Covalent trapping of human DNA polymerase β by the oxidative DNA lesion 2-deoxyribonolactone, J. Biol. Chem. 277 (2002) 7637–7640.
[76] R. Prasad, J.K. Horton, S.H. Wilson, Requirements for PARP-1 covalent crosslinking to DNA (PARP-1 DPC), DNA Repair (Amst) 89 (2020) 102824.
[77] M. Srivastava, D. Su, H. Zhang, Z. Chen, M. Tang, L. Nie, J. Chen, HMCES safeguards replication from oxidative stress and ensures error-free repair, EMBO Rep. 21 (2020) e49123.
[78] O.V. Viktorovskaya, T.M. Greco, I.M. Cristea, S.R. Thompson, Identification of RNA binding proteins associated with dengue virus RNA in infected cells reveals temporally distinct host factor requirements, PLoS Negl. Trop. Dis. 10 (2016) e0004921.
[79] M. Horikoshi, F.R. Day, M. Akiyama, M. Hirata, Y. Kamatani, K. Matsuda, K. Ishigaki, M. Kanai, H. Wright, C.A. Toro, Elucidating the genetic architecture of reproductive ageing in the Japanese population, Nat. Commun. 9 (2018) 1977.
[80] M.S. Taylor, J. LaCava, P. Mita, K.R. Molloy, C.R.L. Huang, D. Li, E.M. Adney, H. Jiang, K.H. Burns, B.T. Chait, Affinity proteomics reveals human host factors implicated in discrete stages of LINE-1 retrotransposition, Cell 155 (2013) 1034–1048.
[81] M.S. Taylor, I. Altukhov, K.R. Molloy, P. Mita, H. Jiang, E.M. Adney, A. Wudzinska, S. Badri, D. Ischenko, G. Eng, Dissection of affinity captured LINE-1 macromolecular complexes, Elife 7 (2018) e30094.
[82] T. Miyoshi, T. Makino, J.V. Moran, Poly (ADP-Ribose) polymerase 2 recruits replication protein A to sites of LINE-1 integration to facilitate retrotransposition, Mol. Cell 75 (2019) 1286–1298 e1212.
[83] Y. Wang, L. Zeng, C. Liang, R. Zan, W. Ji, Z. Zhang, Y. Wei, S. Tu, Y. Dong, Integrated analysis of transcriptome-wide m6A methylome of osteosarcoma stem cells enriched by chemotherapy, Epigenomics 11 (2019) 1693–1715.
[84] H.-X. Gao, A. Nuerlan, G. Abulajiang, W.-L. Cui, J. Xue, W. Sang, S.-J. Li, J. Niu, Z.-P. Ma, W. Zhang, Quantitative proteomics analysis of differentially expressed proteins in activated B-cell-like diffuse large B-cell lymphoma using quantitative proteomics, Pathol.-Res. Pract. 215 (2019) 152528.
[85] C.G. Spruijt, F. Gnerlich, A.H. Smits, T. Pfaffeneder, P.W. Jansen, C. Bauer, M. Münzel, M. Wagner, M. Müller, F. Khan, Dynamic readers for 5-(hydroxy)methylcytosine and its oxidized derivatives, Cell 152 (2013) 1146–1159.
[86] L. Fagerberg, B.M. Hallström, P. Oksvold, C. Kampf, D. Djureinovic, J. Odeberg, M. Habuka, S. Tahmasebpoor, A. Danielsson, K. Edlund, Analysis of the human tissue-specific expression by genome-wide integration of transcriptomics and antibody-based proteomics, Mol. Cell. Proteom. 13 (2014) 397–406.
[87] G.P. Pfeifer, P.E. Szabó, J. Song, Protein interactions at oxidized 5-Methylcytosine bases, J. Mol. Biol. 432 (2019) 1718–1730.
[88] P.V. Hornbeck, I. Chabra, J.M. Kornhauser, E. Skrzypek, B. Zhang, PhosphoSite: a bioinformatics resource dedicated to physiological protein phosphorylation, Proteomics 4 (2004) 1551–1561.
[89] R. Oughtred, C. Stark, B.-J. Breitkreutz, J. Rust, L. Boucher, C. Chang, N. Kolas, L. O’Donnell, G. Leung, R. McAdam, The BioGRID interaction database: 2019 update, Nucleic Acids Res. 47 (2019) D529–D541.
[90] J.N. Weinstein, E.A. Collisson, G.B. Mills, K.R.M. Shaw, B.A. Ozenberger, K. Ellrott, I. Shmulevich, C. Sander, J.M. Stuart, C.G.A.R. Network, The cancer genome atlas pan-cancer analysis project, Nat. Genet. 45 (2013) 1113.