DNA Ligase C and PrimPolC participate in base excision repair in mycobacteria
Article (Accepted Version)
Plocinski, Przemyslaw, Brissett, Nigel C, Bianchi, Julie, Brzostek, Anna, Korycka-Machała, Małgorzata, Dziembowski, Andrzej, Dziadek, Jaroslaw and Doherty, Aidan J (2017) DNA Ligase C and Prim-PolC participate in base excision repair in mycobacteria. Nature Communications, 8 (1). a1251 1-12. ISSN 2041-1723
This version is available from Sussex Research Online: http://sro.sussex.ac.uk/id/eprint/70317/
This document is made available in accordance with publisher policies and may differ from the published version or from the version of record. If you wish to cite this item you are advised to consult the publisher’s version. Please see the URL above for details on accessing the published version.
Copyright and reuse: Sussex Research Online is a digital repository of the research output of the University.
Copyright and all moral rights to the version of the paper presented here belong to the individual author(s) and/or other copyright owners. To the extent reasonable and practicable, the material made available in SRO has been checked for eligibility before being made available.
Copies of full text items generally can be reproduced, displayed or performed and given to third parties in any format or medium for personal research or study, educational, or not-for-profit purposes without prior permission or charge, provided that the authors, title and full bibliographic details are credited, a hyperlink and/or URL is given for the original metadata page and the content is not changed in any way.
http://sro.sussex.ac.uk 1
2 DNA Ligase C and Prim-PolC participate
3 in base excision repair in mycobacteria
4
5 Przemysáaw PáociĔski1,2§, Nigel C. Brissett1§, Julie Bianchi1,4, Anna Brzostek2,
6 Maágorzata Korycka-Machaáa2 , Andrzej Dziembowski3, Jarosáaw Dziadek2 and
7 Aidan J. Doherty1*
8
9 1 Genome Damage and Stability Centre, School of Life Sciences, University of 10 Sussex, Brighton BN1 9RQ, UK.
11 2 Institute of Medical Biology, Polish Academy of Sciences, Lodowa 106, 93-232 12 Lodz, Poland.
13 3 Institute of Biochemistry and Biophysics, Polish Academy of Sciences, PawiĔskiego 14 5A, 02-106 Warsaw, Poland.
15 4 Present address: Department of Oncology-Pathology, Cancer Center Karolinska, 16 Karolinska Institutet, R8:04, Karolinska Universitetssjukhuset Solna 171 76 17 Stockholm, Sweden.
18 § These authors contributed equally to this work
19
20 * Corresponding author: Aidan Doherty, email: [email protected]
21
22 Keywords: LigC, LigD, BER, primase, polymerase, repair, ligase, PrimPol, AEP, 23 NHEJ, SSB, RNA, mycobacteria
24 25
26 Abstract
27 Prokaryotic Ligase D is a conserved DNA repair apparatus processing DNA double-
28 strand breaks in stationary phase. An orthologous Ligase C (LigC) complex also co-
29 exists in many bacterial species but its function is unknown. Here, we show that the
30 LigC complex interacts with core BER enzymes in vivo and demonstrate that
31 together these factors constitute an excision repair apparatus capable of repairing
32 damaged bases and abasic sites. The polymerase component, which contains a
33 conserved C-terminal structural loop, preferentially binds to and fills-in short gapped
34 DNA intermediates with RNA and LigC ligates the resulting nicks to complete repair.
35 Components of the LigC complex, like LigD, are expressed upon entry into stationary
36 phase and cells lacking either of these pathways exhibit increased sensitivity to
37 oxidising genotoxins. Together, these findings establish that the LigC complex is
38 directly involved in an excision repair pathway(s) that repairs DNA damage with
39 ribonucleotides during stationary phase.
40
1
41 Introduction
42 In bacteria, canonical primer synthesis during DNA replication is carried out by
43 enzymes from the DnaG superfamily 1,2. In contrast, priming of replication in archaea
44 and eukaryotes is performed by members of the archaeo-eukaryotic primase (AEP)
45 superfamily 3,4. However, AEPs are also widely distributed in most bacterial species
46 4, where they have evolved to fulfill divergent roles and have recently been
47 reclassified as a family of polymerases called primase-polymerases (Prim-Pols) to
48 better reflect their evolutionary origins and more diverse roles in DNA metabolism 3,4.
49 The best characterized bacterial AEP is Prim-PolD (PolDom) that forms part of a
50 multifunctional non homologous end-joining (NHEJ) DNA break repair complex
51 called Ligase D (LigD). In mycobacterial LigD, an AEP is fused to phosphoesterase
52 and ATP-dependent DNA ligase domains that, together with the Ku repair factor,
53 coordinate the sequential synapsis, processing and repair of double-strand breaks
54 (DSBs) in stationary phase 5-9. However, in many other species these domains are
55 encoded by separate operonically-associated genes 5,10.
56 Many bacterial species, including Actinobacteria, encode multiple ATP-dependent
57 DNA ligases and Prim-Pols similar to those found in mycobacterial LigD. Exemplary
58 classes of Prim-Pols and DNA ligases in selected gram-positive bacteria are shown
59 in Supplementary Fig. 1. Mycobacterium smegmatis encodes four distinct primase-
60 polymerases. Although it is known that the Prim-PolD subunit of LigD is involved in
61 the NHEJ repair complex, the roles of the other stand-alone Prim-Pols remain
62 unknown. One orthologue, MSMEG_6301 (Prim-PolC / LigC Pol / PolD) is encoded
63 in the genomic proximity of two DNA ligase genes (LigC1: msmeg_6302 and LigC2:
64 msmeg_6304). Although LigDs role in NHEJ-mediated repair is firmly established
65 7,9,11, the pathways in which these other ligases and Prim-Pols operate in remains 2
66 unclear. Due to their similarities with the NHEJ complex, it was proposed that DNA
67 ligase C and operonically-associated Prim-PolC (LigC-Pol or PolD1) function as a
68 back-up complex for the LigD pathway 12,13, although this has not been proven.
69 While the basic biochemical characteristics of Prim-PolC and PolD2, another closely
70 related AEP, were partially described in a previous study 12, Prim-PolC failed to act
71 as a back-up of Prim-PolD during DSB repair. Another study reported that LigC
72 was responsible for ligation of ~20% of DSBs in a LigD mutant strain containing a
73 mutation disrupting the ligase function but possessing functional primase and
74 exonuclease domains that can still participate in the NHEJ 14.
75
76 Here, we sought to elucidate the role(s) of LigC, and its associated polymerase
77 (Prim-PolC), in mycobacterial DNA metabolism. We show that components of the
78 LigC complex act together to preferentially fill in short DNA gaps with ribonucleotides
79 (rNTPs), followed by sealing of the resulting nicks. We identify that LigC operon-
80 encoded proteins associate with components of the base excision repair (BER)
81 apparatus in vivo and, together with the excision repair machinery, form repair
82 complexes that remove damaged DNA and abasic sites and fill in resulting gaps with
83 rNTPs. Structural studies reveal that a conserved C-terminal element of Prim-PolC
84 forms an extended surface surrounding the active site, which likely supports gap
85 recognition and synthesis. Components of the LigC complex are expressed upon
86 entry into stationary phase, when the intracellular pool of dNTPs is naturally
87 depleted, necessitating the preferential incorporation of rNTPs after lesion removal.
88 Abrogation of the LigC complex, results in an increased sensitivity of mycobacterial
89 cell to oxidative damaging agents. LigD-deficient cells are also sensitive to oxidative
90 damaging agents, uncovering an unexpected duel role in both excision and DSB
3
91 repair. Together, these findings establish that the LigC complex operates in excision
92 repair pathways that specifically mends lesions and single-strand breaks in
93 stationary phase.
94
95
4
96 Results
97 DNA ligase C complex interacts with BER enzymes in vivo
98 Although LigD-associated Prim-Pols and ligases facilitate NHEJ repair of DSBs in
99 bacteria, the functions of closely related orthologues remains unclear. To investigate
100 the cellular roles of Prim-PolC, and its operonically-associated DNA ligases (LigC1
101 and LigC2), we first sought to define their physical associations with other proteins /
102 complexes in mycobacterial cells. We employed an eGFP-tag (enhanced green
103 fluorescent protein) facilitated co-purification approach, in which each of the proteins
104 (Prim-PolC, LigC1 and LigC2) were fused to eGFP to serve as baits for the
105 purification and identification of cellular partners. Purified complexes were cross-
106 linked with a reversible cross-linker, 3,3'-dithiobis (sulfosuccinimidyl propionate)
107 (DSSP), washed and subjected to mass spectrometry (MS) analysis 15. Complexes
108 were purified from cells grown to late stationary phase. As expected, tagged LigC
109 and Prim-PolC were among most abundant hits in the MS samples, supporting the
110 specificity of the purification process. A list of polypeptides that co-purified with each
111 bait, known to be association with DNA repair processes, are shown in Figure 1a. A
112 full list of all co-purified proteins is provided in Supplementary Data 1. Significantly,
113 all three bait proteins co-purified with multiple repair enzymes, including the major
114 DNA glycosylases and nucleases involved in base excision repair (BER) repair
115 pathways. Reverse experiments using mycobacterial eGFP-tagged endonuclease IV
116 as bait in M. bovis BCG resulted in co-purification of LigC, confirming that these two
117 proteins form a complex in vivo (Supplementary Data 2). Together, these in vivo
118 studies reveal that Prim-PolC and LigC interact with BER components, suggesting
119 that they function primarily in the repair of damaged bases, abasic sites and single-
120 strand breaks (SSBs). 5
121
122 DNA ligase C complex interacts with BER enzymes in vitro
123 To validate the interactions of LigC1 and Prim-PolC with components of the BER
124 machinery identified in our pull-down studies, we expressed and purified
125 recombinant forms of each of these BER enzymes (Supplementary Fig. 2). Taking
126 advantage of Prim-PolC and LigC1-specific antibodies, a slot blotting-based
127 methodology was employed to authenticate the interactions between the identified
128 proteins. We also used this approach to address if Prim-PolC or LigC1 interacted
129 with Ku, to determine if they also function in NHEJ repair. Mono-functional DNA
130 glycosylase (MPG) was purified alongside bifunctional glycosylases (FPG and Nth),
131 that possess abasic site lyase activity, as well as several of the key end-processing
132 nucleases including: exodeoxyribonuclease VIIB (ExoVIIB), endonuclease IV
133 (EndoIV) and both exonuclease III paralogues (ExoIII, XthA) from M. smegmatis.
134 PolD2, the Prim-Pol with closest similarity to Prim-PolC, as well as the major BER
135 repair polymerase PolA, were also purified for these in vitro interaction studies.
136 Recombinant proteins were spotted onto nitrocellulose membrane and incubated
137 with either Prim-PolC or LigC1, respectively, followed by extensive washing and
138 subsequent detection by western blotting using Prim-PolC / LigC1-specific
139 antibodies. We observed significant signals with many of the BER proteins incubated
140 with LigC1 (Figure 1b). MPG and FPG (MutM) glycosylases exhibited the strongest
141 interactions with LigC1. LigC1 also interacted with EndoIV, ExoIII and XthA end-
142 processing enzymes and both Prim-PolC and PolD2. A summary of the major
143 interactions identified in these studies is shown in Figure 1c. In contrast, Prim-PolC
144 only interacts with LigC1 but not with either Ku or ExoVIIB, which therefore served as
145 control proteins for these experiments. Together, these data establish that LigC
6
146 appears to act as a key scaffolding protein, akin to ligase III in eukaryotes 16,17,
147 required for sequential recruitment of BER repair enzymes and therefore potentially
148 coordinating the processing and repair of lesions and SSBs in vivo 18-20. To further
149 validate the interactions with the glycosylase, we performed pull-down assays of
150 LigC1 with the two potentially interacting glycosylases, FPG and MPG and observed
151 a strong association of tag-free LigC1 with FPG coated His-trap beads and a weaker
152 interaction with MPG (Supplementary Fig. 3).
153
154 Previously, comparative sequence database mining analysis of bacterial operons
155 uncovered a genetic link between LigD and Ku, which led to the identification of the
156 NHEJ pathway in prokaryotes 21,22. Similar in silico comparative operon analysis of
157 many mycobacterial genomes revealed that the LigC and Prim-PolC genes are co-
158 transcribed with the major base excision repair bifunctional glycosylase, FPG, in
159 operons present in M. avium, M. intracellulare and Mycobacterium JS623, a close
160 relative of M. smegmatis. Notably, these operons are also located within close
161 genomic proximity of other DNA excision repair enzymes (PolA and UvrB;
162 Supplementary Fig. 3). Together, these findings further strengthen a functional
163 connection between LigC-related proteins and BER processes in mycobacteria.
164
165 Prim-PolC preferentially binds to short gapped DNA
166 Following validation of the association of Prim-PolC and LigC1 with components of
167 the BER machinery, we proceeded to biochemically characterize the capacity of
168 these enzymes to process DNA repair intermediates that arise during the processing
169 of BER substrates, e.g. abasic sites. We first performed electrophoretic mobility shift 7
170 assays (EMSAs) to determine the optimal DNA substrates for binding and
171 processing by Prim-PolC. The most common BER intermediates are, short DNA
172 gaps containing a phosphate moiety on the 5 terminus, that arise from base excision
173 and subsequent abasic site removal. We therefore tested Prim-PolCs capacity to
174 bind to an array of phosphorylated and non-phosphorylated DNA substrates,
175 containing various gap lengths. Notably, Prim-PolCs binding affinity was significantly
176 stimulated by the presence of a 5-phosphate group within a gap and it bound most
177 avidly to short gaps of 1-3 nucleotides (Figure 2a; Supplementary Fig. 4a). In
178 contrast with its NHEJ-specific orthologue, Prim-PolC bound much more weakly to
179 DNA substrates containing overhangs (Figure 2b).
180
181 NHEJ Prim-Pols bind strongly to 5 phosphorylated DNA overhangs 23 and can
182 directly induce break synapsis 23-26. To determine if Prim-PolC also possesses end-
183 synapsis activity, we employed a microhomology-mediated end-joining (MMEJ)
184 assay that utilises substrates that mimic DSBs to measure this activity in vitro 27. As
185 expected, Prim-PolD efficiently promoted break synapsis and subsequent in-trans
186 extension of MMEJ intermediates (Figures 2c & d). In contrast, Prim-PolC did not
187 exhibit significant MMEJ activity, indicating that DSBs are not preferred substrates
188 for this enzyme. This identifies another significant difference in substrate specificities
189 between these two classes of Prim-Pols.
190
191 To cross-validate the observed DNA substrate binding affinities, we carried out
192 additional DNA binding studies using polyA-tailed DNA substrates coated onto
193 magnetic particles. Beads pre-coated with different DNA substrates were incubated
8
194 with whole cell mycobacterial lysates, washed extensively and bound preys were
195 identified by mass spectrometry. As expected, LigD and Ku were both found to co-
196 purify on DSB-like substrates with free ends. However, neither Ku or LigD bound to
197 this substrate when the terminal nucleotides were blocked with S-bond. Notably,
198 Prim-PolC and LigD, but not Ku, co-purified on a substrate containing a single
199 nucleotide gap, suggesting that both may contribute to the repair of such
200 intermediates in vivo (Supplementary Fig. 4b, Supplementary Data 3).
201
202 LigC complex repairs short gapped DNA intermediates
203 To establish the functional importance of Prim-PolCs preferential gap binding
204 activity, we next performed primer-template extension assays using gapped
205 substrates (1,2,3 or 5 nt gap) in the presence of intracellular levels of divalent
206 cations. Consistent with the DNA binding results, Prim-PolC most efficiently
207 processed short DNA gaps (Figure 3a), with its polymerization activity significantly
208 stimulated by the presence of a phosphate group at the 5 terminus of the gap
209 (Figure 3a). In common with Prim-PolD, Prim-PolC also preferentially inserted rNTPs
210 during gap-filling synthesis (Figure 3a). This intrinsic preference for NTPs over
211 dNTPs was quantified using a nucleotide competition assay (Supplementary Fig. 5a)
212 23. This assay revealed that its nucleotide preference factor (F) was ~190 fold in
213 favour of rNTP incorporation, which compares to an F value of ~70 fold for Mt Prim-
214 PolD (PolDom) 23.
215
216 The implication of this finding is that gap filling synthesis by Prim-PolC would
217 preferentially place RNA, or at least a single rNTP, on the 3 side of the nick that 9
218 must then be ligated to DNA on the opposite 5 side. To test this prediction, we
219 evaluated the ligation specificity of LigC1 using nicked DNA substrates containing
220 either DNA or RNA on the 3 side and a phosphorylated DNA strand the 5 side.
221 LigC1 preferentially ligated hybrids of RNA and DNA within the repaired gap, rather
222 than more typical DNA:DNA gaps (Supplementary Fig. 5b), consistent with the
223 previously reported ligation specificity of LigD 10,14. As repair synthesis by Prim-PolC
224 would result in the production of such hybrid RNA:DNA nicks in vitro, we next tested
225 whether the two enzymes co-operatively repair short DNA gaps. Together, Prim-
226 PolC and LigC1 efficiently filled in and ligated short (1-3nt) gaps with RNA but were
227 much less efficient at repairing longer DNA gaps (Figure 3b), consistent with the
228 reduced DNA binding and extension activities of Prim-PolC observed previously on
229 these substrates (Figure 3a and Supplementary Fig. 4a).
230
231 Reconstituting a LigC-dependent BER repair pathway in vitro
232 To establish if the LigC apparatus operates as part of a functional BER complex, we
233 sought to reconstitute a complete excision repair pathway in vitro using Prim-PolC,
234 LigC1 and the interacting BER proteins. We employed oligonucleotide substrates
235 containing either deoxyuracil (dU) or 8-oxoGuanine (8-oxoG) incorporated in a
236 central position of the labelled DNA strand to enable us to follow the consecutive
237 enzymatic processing steps of BER (Figure 4a). dU-containing substrates were
238 initially treated with UDG glycosylase and the resulting abasic site substrate was
239 then tested with the same enzyme combinations also assayed on 8-oxoG containing
240 substrates. We observed that FPG removed the oxidatively damaged DNA base and
241 processed the resulting abasic site with its lyase activity (Figure 4b, c). The 3 end of
242 the gap was subsequently processed by either one of the 3-phosphatase and / or 3- 10
243 5 exonuclease activities of EndoIV, ExoIII or XthA. The resulting DNA gap was
244 subsequently filled-in with RNA by Prim-PolC and, finally, the resulting nick was
245 ligated by LigC1. We observed that EndoIV was the most beneficial nuclease for
246 ensuring proper repair in vitro as it lacked strong 3-5 exonuclease activity.
247 Together, these findings establish that BER end-processing enzymes, shown to
248 interact with LigC1 in vivo, facilitate LigC complex-mediated repair of lesions arising
249 from abasic site formation in vitro.
250
251 Structure of a mycobacterial LigC-associated Prim-Pol
252 Although Prim-PolC shares many common features with its NHEJ orthologue, such
253 as 5 phosphate recognition and gap filling synthesis, this enzyme is likely to have
254 evolved distinct structural features to facilitate its bespoke role in excision repair. To
255 define molecular differences between these distinct Prim-Pols, we elucidated the
256 crystal structure of Ms Prim-PolC (Table 1) and compared it to a NHEJ-specific Prim-
257 Pol. Prim-PolC retains the same overall fold of the mycobacterial Prim-PolD
258 (PolDom) structure23-26, which can be superposed with an RMSD of 1.712 Å (over
259 270 aligned positions, Figure 5a; Supplementary Fig. 6a). Shared structural features
260 include the phosphate recognition pocket containing the conserved basic residues
261 (K23, K35 & N20) that facilitate this key DNA interaction. Prim-PolC also retains
262 Loops 1, which is critically required to orient the template / 3 overhanging strand,
263 and Loop 2 that accepts and positions the incoming primer strand from the adjacent
264 side of the break and also facilitates the binding / activation of the second metal ion
265 in the active site 23-26. These loops are critical for Prim-PolDs engagement with the
266 break termini, promoting end-synapsis / MMEJ and are required for activation of the
267 catalytic centre to allow DNA synthesis to commence25,26. 11
268
269 Despite these similarities, Prim-PolC orthologues possess a strictly conserved C-
270 terminal extension whose function has not been ascribed (Figure 5b). In the Prim-
271 PolC structure, we observe that this additional C-terminal region (aa 294-336)
272 consists of a conserved alpha helix, (residues G299-R311), which lies across alpha
273 helix 4 after which the peptide chain becomes an unstructured loop (Figure 5a,b).
274 Notably, this structural element, termed Loop 3 (residues G313-P333), is positioned
275 between the two prominent surface loops (1 and 2) that together form a continuous
276 surface on one side of the active site pocket (Figures 5c, Supplementary Fig. 6b).
277 Apical Loop 1 residues, normally involved in template strand orientation 23-26, are not
278 conserved and are instead repurposed for locking Loop 3 into position
279 (Supplementary Fig. 6b & 7). Residue P94 packs against a pocket formed of
280 residues Y321, P322, K323 and M324 from Loop 3. This also results in Loop 1
281 adopting a position more proximal to the active site. Loop 2 residue W219 interacts
282 with Loop 3 residues P316, Y317, P333 and K335 via a mixture of hydrophobic and
283 polar interactions (Supplementary Fig. 6b & 7). Again, this has the effect of locking
284 Loop 3 into its current orientation, and also reorients Loop 2 into a more distal
285 position to the active centre, compared to the existing Prim-PolD MMEJ structure25.
286 A consequence of Loop 2s interaction with Loop 3 is the fixing of Loop 2 into its
287 current orientation. The effects of limiting the flexibility of Loop 2, and therefore the
288 conserved arginine 224 (equivalent to R220 in Mt Prim-PolD), may have a significant
289 effect on Prim-PolCs catalytic activity.
290
291 Superposition of Prim-PolC onto the structure of a DNA-bound Prim-PolD MMEJ
292 complex revealed that the 5 phosphate fits into the conserved binding pocket and 12
293 the dsDNA is also well positioned into the structure (Figure 5d and Supplementary
294 Fig. 6c & 6d). The first three bases of the single-stranded template strand / 3
295 overhang are within contacting distance of Loop 1. However, the 3 terminal bases of
296 this strand clash with Loop 3 in the Prim-PolC structure (Figure 5d, Supplementary
297 Fig. 6c). This clash suggests that a possible key role of the insertion is to
298 accommodate a continuous template strand that is reoriented towards the active site
299 thus positioning gapped substrates, as opposed to NHEJ intermediates, in a more
300 optimal position. This repositioning ensures, together with Loops 1 and 2, that the
301 primer strand is correctly oriented into the active site in preparation for gap-filling
302 synthesis although this prediction remains to be proven.
303
304 LigC or LigD mutants are sensitive to oxidative DNA damage
305 Collectively, our interaction and in vitro reconstitution studies strongly implicate Prim-
306 PolC and LigC, and their associated factors, in the repair of lesions and SSBs
307 processed via abasic site intermediates. As such lesions are formed in large
308 quantities during all stages of the cell cycle, we next sought to determine if this repair
309 pathway functions throughout the cell cycle or, like LigD-dependent repair, is specific
310 to a particular stage. To address this, we examined the expression profiles of Prim-
311 PolC and LigC1 at different phases of the growth cycle (early, mid, stationary) using
312 specific antibodies. This revealed that the levels of both proteins peaks during early
313 stationary phase (Figure 6a), indicating that this pathway preferentially operates as
314 cells begin to enter stationary phase.
315
13
316 To evaluate their roles in excision repair processes in vivo, we deleted prim-polC (
317 prim-polC) and ligC1, ligC2 ( ligC1, C2) in M. smegmatis. To test whether deletion
318 of ligD and prim-polC genes have a cumulative effect or the proteins can fully
319 compensate for one anothers activities, we generated single ( LigD) and double (
320 ligD, prim-polC) deletion mutants. We also created a polA mutant complemented
321 with truncated PolA, lacking functional DNA polymerase domain but maintaining its
322 essential 3 exo activity28, ( polA :: PolA1-755 aa) along with prim-polC and ligD
323 mutants in a polA deficient background ( prim-polC, polA :: polA1-755 aa; prim-
324 polC, ligD, polA :: polA1-755 aa). Cells were grown to stationary phase before
325 exposing them to genotoxic agents, to assess the oxidative damage sensitivity of
326 each strain. We tested several oxidizing agents and observed decreased viabilities
327 for prim-polC and ligC1,C2 strains exposed to oxidizing agents (Figure 6b &
328 Supplementary Fig. 8), particularly cumene hydroperoxide and tert-butyl
329 hydroperoxide. We observed that the kinetics of damage sensitivity induced by
330 organic hydroperoxides compared to hydrogen peroxide differs over time. Peroxide
331 produced an initial burst of bacterial killing that then remained at similar levels after
332 longer incubation times, whereas organic hydroperoxides continued to induce
333 increased damage sensitivity over longer time periods (Supplementary Fig. 8),
334 probably reflecting their stability in cultures.
335
336 Prim-PolC and LigC1 were previously reported to contribute to DNA repair during
337 stationary phase in M. smegmatis 12,14. Loss of Prim-Pol ( prim-polC) or ligase (
338 ligC1, C2) components of the LigC complex caused a similar level of impaired
339 survival when treated with either of these oxidizing agents (Figure 6b &
14
340 Supplementary Fig. 8). As expected, PolA deficient strains were the most sensitive
341 to these genotoxins. Cumene hydroperoxide is among the best known inducers of
342 PolA expression in mycobacteria, suggesting that it is one of the key enzymes
343 necessary for the repair of DNA lesions caused by organic hydroperoxides.
344 Unexpectedly, LigD-deficient strains were even more sensitive to these oxidizing
345 agents and simultaneous loss of LigD and Prim-PolC ( ligD, prim-polC) had an
346 additive impact on cell survival (Figure 6b and Supplementary Fig. 8), indicating that
347 both pathways are important for repairing oxidative lesions in stationary phase.
348
15
349 Discussion
350 Prior to this study, it was proposed that the LigC complex functions as an alternative
351 back-up pathway 8,14 required for repairing DSBs in stationary phase, following the
352 loss of the Ligase D complex 6,29. However, this hypothesis has not been
353 experimentally proven and therefore the cellular function(s) of this pathway remained
354 unclear. A major issue hampering the assignment of a biological role for the LigC
355 complex, distinct from LigD, has been the almost indistinguishable biochemical
356 activities shared between these closely related repair complexes. These include an
357 affinity towards 5-P moieties on their respective DNA substrates and a preference to
358 incorporate short patches of ribonucleotides to fill in gapped intermediates, followed
359 by requisite resealing by a specific RNA / DNA ligase activity 24,25. These similarities
360 also extend to the structures of Prim-PolC and D, which are highly superposable and
361 share similar prominent surface loops (Loop 1 & 2) and a 5 phosphate binding
362 pocket that together dictate the preferential substrate binding specificity of Prim-PolD
363 for DNA overhangs and break-annealed gaps 23-26.
364
365 Despite these similarities, this study reveals that Prim-PolC possesses a number of
366 distinguishing biochemical and structural features that enable us to define its
367 preferred DNA substrates. Prim-PolC preferentially binds to and fills-in short DNA
368 gaps. Unlike Prim-PolD, it shows weak preference for overhangs and is unable to
369 mediate break synapsis / synthesis. It also contains a conserved C-terminal
370 structural element (Loop 3), which is absent from Prim-PolD orthologues. This loop is
371 positioned between loops 1 & 2, where it forms a continuous surface that is probably
372 involved in directing the template strand towards the active site to optimally position,
373 with Loops 1 and 2, the 3 hydroxyl moiety of the primer strand into the active site in
16
374 preparation for extension. The absence of Loop 3 in Prim-PolD orthologues, leaves a
375 space between the remaining loops (1 and 2), which allows these NHEJ
376 polymerases to promote break annealing / MMEJ and processing of subsequently
377 annealed DSBs 23-26. Validation of this proposed role for Loop 3 awaits the
378 elucidation of additional structures of Prim-PolC bound to DNA gap intermediates.
379
380 In addition to these molecular analyses, our in vivo studies establish that the LigC
381 complex is a central nexus for a distinctive excision repair pathway required to
382 remove and replace damaged or modified bases in mycobacterial genomes during
383 stationary phase (Figure 6c). BER reconstitution studies in vitro demonstrated that
384 LigC-dependent excision repair complexes can repair oxidatively damaged bases
385 and probably other lesions, e.g. methylated / nitrosative damaged bases and SSBs,
386 that are all processed via an abasic site intermediate. A key feature of this repair
387 pathway is the preferential incorporation of ribonuclotides, consistent with their
388 abundance in stationary phase. We observed the most efficient repair in vitro when
389 endonuclease IV was employed as the gap processing enzyme, dephosphorylating
390 the 3-phosphate end of the lesion resulting from ȕ,Ȗ-elimination of the abasic site,
391 catalyzed by FPG. EndoIV lacks a strong 3-5 exonuclease activity, present in
392 alternative end-processing enzymes (ExoIII and XthA), and therefore this appears to
393 limit the gap resection to an optimal size for LigC-mediated repair. Consistent with
394 this, it was previously reported that EndoIV, rather than ExoIII, is responsible for vast
395 majority of BER gaps processing in mycobacteria, in contrast to eubacteria where
396 ExoIII plays a more dominant role 30.
397 In mycobacteria, the expression of the LigC complex peaks during early stationary
398 phase. This coincides with a significant reduction in intercellular pools of dNTPs in 17
399 the absence of replication 31,32, probably explaining the necessity to incorporate
400 ribonucleotides during DNA repair processes in stationary phase. Another notable
401 feature of the LigC pathway is the specific incorporation of short ribonucleotide
402 patches that are likely resistant to RNase H1 activities 33-35. Apart from Prim-PolC /
403 D, DinB2 also incorporates ribonucleotides into DNA during stationary phase repair
404 processes in mycobacteria 36. However, this Y-family polymerase has a tendency to
405 incorporate longer RNA patches and the resulting DNA:RNA hybrids would be
406 sensitive to RNaseH1 digestion. In gap filling assays, Prim-PolC efficiently filled in
407 short gaps (1-3nt) but failed to process longer gaps (5nt). Notably, stretches of >4
408 ribonucleotides incorporated into dsDNA are known to exhibit high instability due to
409 their sensitivity to RNase H1 digestion in vivo 33-35. It is therefore likely that short
410 RNA patches incorporated by Prim-Pols during BER and NHEJ repair processing are
411 later recognized and replaced by PolA, rather than RNase H, which undertakes a
412 similar role during RNA primer removal, once intracellular pools of dNTPs are
413 restored 37,38.
414 Prim-Pols have relatively low fidelity during synthesis and therefore another reason
415 to favor the incorporation of ribonucleotides during repair is to mark these regions
416 of repair synthesis for subsequent replacement in a post-repair manner by more
417 accurate polymerases 5,10, reminiscent of RNA primer removal during replication.
418 PolA removes RNA primers and is also known to participate in several DNA repair
419 pathways in bacteria 37,39-41. PolA-driven DNA synthesis occurs slowly but accurately
420 and it has proofreading and bidirectional nuclease activities beneficial for correcting
421 errors introduced by Prim-Pols 39,42. Notably, in this study we also detected a weak
422 association between PolA and the LigC complex, suggesting that this replicase may
18
423 remove and replace RNA inserted by this pathway, and likely others e.g. NHEJ, with
424 more accurately synthesized DNA.
425
426 Both prim-polC and ligC1,C2 strains were similarly sensitive to oxidizing agents,
427 consistent with their proposed joint roles in a distinctive excision repair pathway in
428 mycobacteria. Surprisingly, a LigD deficient strain was even more sensitive to these
429 agents, although this could potentially be attributed to a deficit in DSB repair.
430 However, this sensitivity was evident even with low levels of peroxide (~5mM),
431 reported to produce predominantly SSBs and < 0.1% DSBs 43, thus uncovering an
432 unexpected and additional role for LigD in excision repair. Consistent with this
433 proposal, co-deletion of LigD and Prim-PolC had an additive impact on cell survival
434 in the presence of these genotoxins, also indicating that they are not epistatic in their
435 function. Consistent with the mutant phenotypes, both LigD and Prim-PolC co-
436 purified on a 5-phosphorylated single nucleotide gap DNA substrate in pull-down
437 assays. This substrate mimics a naturally occurring BER DNA intermediate that
438 would arise from the enzymatic processing of an oxidative lesion by FPG and
439 EndoIV. Together, these findings establish that both LigC and LigD-dependent
440 complexes contribute significantly to the repair of lesions produced by oxidative
441 damage during stationary phase in mycobacteria. Notably, Ory et al. reported
442 recently that Bacillus subtilis LigD possesses dRP-ase activity in vitro supporting this
443 proposed role in excision repair44. Further studies are required to define the exact
444 nature of the oxidative lesions that these Prim-Pol pathways process in vivo and
445 uncover the overlap and cross-talk that occurs between these, and other, stationary
446 phase DNA repair pathways in mycobacteria.
19
447
448
449
20
450
451 Methods
452 Bacterial strains and growth conditions
453 Laboratory stock of M. smegmatis mc2 155 and its derivatives were cultured in 7H9
454 liquid broth or 7H10 solid medium (Beckton Dickinson) with ADC supplement
455 (Albumin-Dextrose-Catalase) and hygromycin B (50ȝg ml-1) or kanamycin (30ȝg ml-
456 1), if antibiotic selection was required. E. coli DH5Į strain (Invitrogen) was used for
457 cloning purposes and OrigamiTM 2(DE3)pLysS (Novagen) for protein purification,
458 respectively. E. coli strains were cultured in standard Luria broth medium or terrific
459 broth (Formedium), where large amounts of protein production was required with
460 addition of kanamycin (50ȝg ml-1) for selection.
461
462 Protein complex purification and mass spectrometry analysis
463 Genes encoding Prim-PolC, LigC1 and LigC2 were PCR amplified and sub-cloned
464 into pKW08-gfp vector using a sequence and ligation independent protocol to
465 produce C-terminal eGFP fusion proteins used as baits for protein complex
466 purification 15. Purification and data analysis was performed essentially as described
467 previously 15. Briefly, mycobacterial cells were collected by centrifugation (15 min,
468 4800 × g, 4°C) and resuspended in a cold sonication buffer containing 50 mM Tris
469 (pH 8.0), 100 mM NaCl, 1 mM dithiotreitol, 2 mM phenylmethylsulfonyl fluoride, 25
470 U/ml benzonase, 0.5% Triton X-100 and protease inhibitor coctail. The cells were
471 lysed by sonication and lysates were pre-cleared by centrifugation (20 min, 4800 × g,
472 4°C). Lysates were then mixed with 50 ȝls of the anti-GFP Sepharose beads
473 (Chromotek), incubated for 2 h at 4°C to capture protein complexes, washed 2 times
21
474 with 10 ml of wash buffer [10 mM Tris (pH 8.0), 150 mM NaCl and 0,1% Triton X-
475 100], followed by 2 washes with TEV buffer [10 mM Tris (pH 8.0), 150 mM NaCl, 0.5
476 mM EDTA and 1 mM DTT] and digested with TEV protease (Thermo Fisher
477 Scientific) to elute protein complexes.
478 For DNA-protein pull-downs, each DNA substrate at a final concentration of 50nM
479 was annealed to 1mg of the poly-dT magnetic particles (NEB) in buffer containing 20
480 mM Tris-HCl, pH 8.0, 10 mM MgCl2, 0.1 M KCl, for three hours at room temperaturĊ
481 and washed twice with the wash buffer [50mM HEPES pH 7.5, 150mM NaCl, 0.5%
482 TritonX, 1mM EDTA]. Whole cell lysates were prepared as described above,
483 however, in 50mM HEPES pH 7.5 as buffering agent and without the addition of
484 nucleases. The pre-cleared lysates were filtered through 0.22µm filters and
485 incubated with the DNA coated beads for three hours at 4°C. For the last 10
486 minutes 2µls of RNaseA (10mg/ml) were added to the 1ml reaction. Beads were
487 collected on the magnetic stand and washed 5 times with the wash buffer. DNA
488 bound protein complexes were then eluted by DNAseI digestion (NEB) for 1hour at
489 37°C in dedicated buffer.
490 The resultant protein mixture was precipitated with pyrogallol redmolybdate (PRM;
491 0.05 mM pyrogallol red, 0.16 mM sodium molybdate, 1 mM sodium oxalate, 50 mM
492 succinic acid, pH 2.5), collected by centrifugation and subjected to a standard trypsin
493 digestion. The resulting peptide mixtures were applied to a nano-HPLC RP-18
494 column (Waters) using an acetonitryle gradient in the presence of trifluoroacetic acid.
495 The column was directly linked to the ion source and the Orbitrap (Thermo Scientific)
496 was operated in a data-dependent mode.
22
497 The MaxQuant (v1.3.0.5) computational proteomics platform was used to process
498 raw MS files. Integrated Andromeda search engine and apropriate M. smegmatis or
499 M. bovis BCG protein databases were used to search against the fragmentation
500 spectra. Carbamidomethylation of cysteines, N-terminal acetylation and oxidations
501 were set as possible peptide modifications. One percent false discovery rate was
502 applied to all protein and peptide identifications. Human derived contaminants and
503 random protein identifications were excluded from the results files. Proteins
504 considered as valid identifications were identified by at least two peptides.
505
506 Purification of multiple DNA repair proteins
507 Genes encoding Prim-PolC, PolD2, LigC1, LigC2, MPG, FPG, NTH, EndoIV, ExoIII,
508 XthA, XseVIIB and PolA were PCR amplified using primers flanked with restriction
509 digestion sites required for in frame cloning into pET28 vector (Supplementary Table
510 1). All the proteins were designed to contain a N-terminal His-tag, except for FPG,
511 which was rendered inactive unless tagged at its C terminus. Proteins were purified
512 according to routine laboratory procedures using ÄKTA purifier and compatible
513 columns purchased from GE healthcare. Briefly, pre-cleared cell lysates obtained
514 after overexpression of recombinant proteins in E. coli Origami B pLysS strain were
515 loaded onto Ni Sepharose (Qiagen) column in Tris buffers [50mM Tris pH8.0],
516 washed extensively and eluted in a gradient of imidazole. Proteins were next loaded
517 onto an ion exchange column (SP or Q (GE Healthcare), depending on the
518 isoelectric point (pI) ) eluted in a gradient of NaCl and further purified on preparative
519 gel filtration columns (S200 or S75 (GE Healthcare), depending on the molecular
520 mass of purified protein). Quality of proteins after each and every purification step
23
521 was checked using sodium dodecyl sulfate (SDS) polyacrylamide gel electrophoresis
522 (PAGE).
523
524 EMSA, Polymerization and DNA repair assays
525 Assays were carried out essentially as described previously 10,25. Briefly, EMSAs
526 were employed to determine optimal substrates for Prim-PolC activity and were set
527 up in 50 mM Tris-HCl (pH 7.5), 0.1 mg ml-1 of BSA (NEB Biolabs), 1 mM DTT, 5%
528 glycerol, with indicated 30nM 6-carboxyfluorescein (6-FAM) labelled DNA substrates
529 and different concentrations of Prim-PolC in a 15 ȝl reaction volume. Incubation
530 mixtures were kept on ice for 30 minutes and were resolved by native gel
531 electrophoresis on a 5% polyacrylamide gel (80:1 (w/w) acrylamide / bisacrylamide).
532 DNA extension reaction mixture contained 50mM Tris-HCl (pH 7.5), 5mM MgCl2,
533 100ȝM MnCl2, 30nM 6-FAM labelled DNA substrate, 250ȝM NTPs and indicated
534 DNA repair proteins, in a total volume of 20ȝl. Reactions were further supplemented
535 with 1mM ATP, 1mM DTT and 0.1 mg ml-1 bovine serum albumin (BSA) for DNA
536 repair reconstitution reactions. After a set incubation time at 37°C, reactions were
537 terminated by adding stop buffer solution (95% (v/v) formamide, 0.09% (w/v)
538 bromophenol blue, 20mM EDTA and 600nM unlabelled oligonucleotide, identical to
539 labelled strand). Resulting DNA extension or repair products were resolved for 2
540 hours at constant wattage of 20W, on TBE-buffered 15% polyacrylamide gels
541 containing 8M urea. Detection of fluorescently labelled oligonucleotide products was
542 carried out using Fujifilm FLA-5100 fluorescent image scanner and EMSA data was
543 quantified using Quantity One (Bio-Rad).
544
24
545 Antibody production, purification and western blotting
546 Antibodies were generated against M. smegmatis proteins (Ku, LigC1 or Prim-PolC)
547 using a 28 day rabbit immunisation program (Eurogentec), using ~1 mg full-length
548 recombinant protein (2 applications), with serum collected at different time points.
549 Recombinant proteins (5 µg / lane) were resolved by 10% SDS-PAGE and proteins
550 transfer onto polyvinylidene fluoride (PVDF) membrane, which was subsequently
551 blocked with Tris-buffered saline and Tween 20 (TBST) and 5% (w/v) non-fat dried
552 milk. Membrane bands containing the recombinant protein were excised, washed
553 twice in TBST and incubated overnight at 4°C with 500 µl of serum. Serum was next
554 removed and membranes washed (x4) with 500 µl of TBST. Purified
555 immunoglobulins bound to the membranes were eluted with 150 µl of glycine buffer
556 (500 mM NaCl, 50 mM glycine, 0.5% Tween 20, 0.1% BSA, pH 2-3) and the blot
557 vortex pulsed before adding 25 µl of 1M Tris pH 8.8.
558 For western blotting analysis, the primary antibodies (anti- Ku, Prim-PolC and LigC1)
559 were used at a dilution of 1/1000 and the secondary antibody (HRP-conjugated Anti-
560 Rabbit IgG; Abcam ab6721) at a dilution of 1:5000 dilution. Uncropped versions of all
561 Western blots and gels can be found in Supplementary Fig. 9.
562
563
564 Far-western analysis
565 This method was used to test for protein-protein interactions in vitro. Recombinant
566 proteins were first slot-blotted using a slot blot manifold (GE Healthcare) onto a
567 methanol-wetted PVDF membrane according to the manufacturers instructions.
25
568 Typically, a fixed concentration of recombinant protein (50 ng for the probed protein
569 and 3 ȝg for the possible interactors) was blotted onto the membrane. The
570 membrane was first blocked with Tris-buffered saline (TBS: 280 mM NaCl, 20 mM
571 Tris) containing 0.05 % (v/v) Tween 20 and 5 % (w/v) non-fat dried milk. For far-
572 Western analysis, the membrane was then incubated with blocking buffer containing
573 5 ȝg ml-1 of the candidate interacting recombinant protein. The membrane was then
574 washed, probed with primary (1/1000 dilution) and secondary (1/5000 dilution)
575 antibodies, and subjected to chemiluminescent detection. The primary antibody used
576 was specific for the candidate interacting recombinant protein, and
577 chemiluminescence detected if interaction occurred between the two proteins.
578
579 Pull-down assays
580 Pull-down assays were used to confirm chosen protein-protein interactions.
581 Equimolar amounts of all proteins were used at 5 nM per pull-down. Recombinant
582 FPG-6xHis or 6xHis-MPG bait proteins were first bound to nickel-nitrilotriacetic acid
583 (NTA) coupled resin in a Tris-based buffer [50mM, pH8.0] containing 150 mM NaCl,
584 for 1 hour at 4°C. The protein-coated beads were washed and interacted with tag-
585 free LigC (6xHis-LigC subjected to a standard thrombin digestion) or BSA, for further
586 2 hours. The beads were next washed extensively and protein-complexes were
587 eluted with buffer containing 300 nM imidazole. The protein complexes were
588 visualized by SDS-PAGE followed by Coomassie staining.
589
590 Crystallisation and X-ray structure determination
26
591 Crystals of the apo Prim-PolC were grown at 285 K by vapour diffusion as sitting
592 drops. The protein was screened at a concentration of 8 mg ml-1 and 0.5 ȝL of
593 protein solution was mixed with 0.5 ȝL of crystallisation buffer (0.05 M Tris HCl (pH
594 8.0), 0.2 M ammonium chloride, 0.01M calcium chloride, 30% (w/v) PEG 4000). Prior
595 to data collection, crystals were soaked in mother liquor containing 20% ethylene
596 glycol. 0.9795 Å X-ray diffraction data was collected at 100 K using a synchrotron
597 source at station I02 Diamond Light Source, Didcot, UK. The diffraction data were
598 processed with xia2 45 with additional processing by programs from the CCP4 suite
599 (Collaborative Computational Project, Number 4, 1994). The statistics for data
600 processing are summarized in Table 1. Initial phases were obtained by molecular
601 replacement with PHASER 46 using Prim-PolD (2IRY) as a search model 23. Iterative
602 cycles of model building and refinement were performed using Coot 47 and Phenix 48.
603 A final refined model at 1.84 Å resolution, with an Rfactor of 18.58% and Rfree of
604 20.92% was obtained. 99.1% of residues are in preferred regions with the remainder
605 (0.9%) in allowed regions according to Ramachandran statistics. Structural images
606 were prepared with CCP4mg 49. The structure of apo Prim-PolC is deposited in the
607 Protein Data Bank under accession code 5OP0.
608
609 Construction of mutant strains lacking DNA repair genes Targeted gene
610 replacement strategy was employed to generate unmarked ǻprim-polC, ǻligC1,
611 ǻligC2, ǻligD, and ǻpolA::polA1-755aa mutants and strains with simultaneous
612 removal of multiple of these genes as previously published 50,51. Briefly, short
613 fragments corresponding to 3 and 5 end of each genes, flanked with surrounding
614 genomic region of about 1 kb at each side (primers listed in Supplementary Table 1),
615 were ligated together out of frame and sub-cloned into a suicidal p2NIL delivery
27
616 vector, merged with crossing over event selection cassette from pGOAL17. Multistep
617 screening led to double crossing over strains selection and resulted in gene
618 inactivation, which was confirmed by Southern blotting and PCR, for each
619 recombination event.
620
621 Phenotypic analysis of M. smegmatis strains
622 M. smegmatis MC2 155 wild type (ATCC 700084) and mutant strains were grown to
623 late stationary phase in 7H9 liquid media, with supplements, typically 4 days after
624 reaching optical density (OD600nm) of 3.0. The cells were harvested and suspended
625 in PBS containing Tween 80 (0.05% (v/v)), to prevent clumping, at OD600 of 0.1.
626 They were next treated with 10mM cumene hydroperoxide (from 1M stock in 50%
627 ethanol), 50mM tert-butyl hydroperoxide (from 5M stock in water), or 5mM hydrogen
628 peroxide (from 0.5M stock in water), respectively, for various amount of time with
629 incubation at 37°C. Untreated cells suspension was used to control overall viability of
630 each strain. At indicated time points 10ȝls of each cell suspension was transferred
631 onto 7H10 solid agar and plates were incubated for 72 hours before read out of cell
632 survival was recorded.
633
634 Data Availability. All data are provided in full in the results section and the
635 supplementary information accompanying this paper. The crystal structure has been
636 deposited in the Protein Data Bank and can be accessed using accession code
637 5OP0.
638
28
639 References
640 1. Bouche, J.P., Zechel, K. & Kornberg, A. dnaG gene product, a rifampicin- 641 resistant RNA polymerase, initiates the conversion of a single-stranded 642 coliphage DNA to its duplex replicative form. Journal of Biological Chemistry 643 250, 5995-6001 (1975).
644 2. Keck, J.L., Roche, D.D., Lynch, A.S. & Berger, J.M. Structure of the RNA 645 polymerase domain of E. coli primase. Science 287, 2482-6 (2000).
646 3. Frick, D.N. & Richardson, C.C. DNA primases. Annual Review of 647 Biochemistry 70, 39-80 (2001).
648 4. Guilliam, T.A., Keen, B.A., Brissett, N.C. & Doherty, A.J. Primase- 649 polymerases are a functionally diverse superfamily of replication and repair 650 enzymes. Nucleic Acids Research 43, 6651-64 (2015).
651 5. Pitcher, R.S. et al. NHEJ protects mycobacteria in stationary phase against 652 the harmful effects of desiccation. DNA Repair (Amst) 6, 1271-6 (2007).
653 6. Weller, G.R. et al. Identification of a DNA nonhomologous end-joining 654 complex in bacteria. Science 297, 1686-9 (2002).
655 7. Della, M. et al. Mycobacterial Ku and ligase proteins constitute a two- 656 component NHEJ repair machine. Science 306, 683-5 (2004).
657 8. Gong, C. et al. Mechanism of nonhomologous end-joining in mycobacteria: a 658 low-fidelity repair system driven by Ku, ligase D and ligase C. Nature 659 Structural & Molecular Biology 12, 304-12 (2005).
660 9. Pitcher, R.S., Brissett, N.C. & Doherty, A.J. Nonhomologous end-joining in 661 bacteria: a microbial perspective. Annual Review of Microbiology 61, 259-82 662 (2007).
663 10. Bartlett, E.J., Brissett, N.C. & Doherty, A.J. Ribonucleolytic resection is 664 required for repair of strand displaced nonhomologous end-joining 665 intermediates. Proceedings of the National Academy of Sciences USA 110, 666 E1984-91 (2013).
667 11. Bowater, R. & Doherty, A.J. Making ends meet: repairing breaks in bacterial 668 DNA by non-homologous end-joining. PLoS Genetics 2, e8 (2006).
669 12. Zhu, H., Bhattarai, H., Yan, H.G., Shuman, S. & Glickman, M.S. 670 Characterization of Mycobacterium smegmatis PolD2 and PolD1 as 671 RNA/DNA Polymerases Homologous to the POL Domain of Bacterial DNA 672 Ligase D. Biochemistry 51, 10147-10158 (2012).
673 13. Matthews, L.A. & Simmons, L.A. Bacterial Nonhomologous End Joining 674 Requires Teamwork. Journal of Bacteriology 196, 3363-3365 (2014).
29
675 14. Bhattarai, H., Gupta, R. & Glickman, M.S. DNA ligase C1 mediates the LigD- 676 independent nonhomologous end-joining pathway of Mycobacterium 677 smegmatis. Journal of Bacteriology 196, 3366-76 (2014).
678 15. Plocinski, P. et al. Identification of protein partners in mycobacteria using a 679 single-step affinity purification method. PLoS One 9, e91380 (2014).
680 16. Caldecott, K.W., Tucker, J.D., Stanker, L.H. & Thompson, L.H. 681 Characterization of the XRCC1-DNA ligase III complex in vitro and its 682 absence from mutant hamster cells. Nucleic Acids Research 23, 4836-43 683 (1995).
684 17. Leppard, J.B., Dong, Z., Mackey, Z.B. & Tomkinson, A.E. Physical and 685 functional interaction between DNA ligase IIIalpha and poly(ADP-Ribose) 686 polymerase 1 in DNA single-strand break repair. Molecular and Cellular 687 Biology 23, 5919-27 (2003).
688 18. De, A. & Campbell, C. A novel interaction between DNA ligase III and DNA 689 polymerase gamma plays an essential role in mitochondrial DNA stability. 690 Biochemical Journal 402, 175-86 (2007).
691 19. Tomkinson, A.E. & Sallmyr, A. Structure and function of the DNA ligases 692 encoded by the mammalian LIG3 gene. Gene 531, 150-7 (2013).
693 20. Fan, J. & Wilson, D.M., 3rd. Protein-protein interactions and posttranslational 694 modifications in mammalian base excision repair. Free Radical Biology and 695 Medicine 38, 1121-38 (2005).
696 21. Doherty, A.J., Jackson, S.P. & Weller, G.R. Identification of bacterial 697 homologues of the Ku DNA repair proteins. FEBS Letters 500, 186-8 (2001).
698 22. Aravind, L. & Koonin, E.V. Prokaryotic homologs of the eukaryotic DNA-end- 699 binding protein Ku, novel domains in the Ku protein and prediction of a 700 prokaryotic double-strand break repair system. Genome Research 11, 1365- 701 74 (2001).
702 23. Pitcher, R.S. et al. Structure and function of a mycobacterial NHEJ DNA 703 repair polymerase. Journal of Molecular Biology 366, 391-405 (2007).
704 24. Brissett, N.C. et al. Structure of a Preternary Complex Involving a Prokaryotic 705 NHEJ DNA Polymerase. Molecular Cell 41, 221-231 (2011).
706 25. Brissett, N.C. et al. Molecular Basis for DNA Double-Strand Break Annealing 707 and Primer Extension by an NHEJ DNA Polymerase. Cell Reports 5, 1108- 708 1120 (2013).
709 26. Brissett, N.C. et al. Structure of a NHEJ polymerase-mediated DNA synaptic 710 complex. Science 318, 456-459 (2007).
30
711 27. Kent, T., Chandramouly, G., Mcdevitt, S. M., Ozdemir, A. Y. & Pomerantz, R. 712 T. . Mechanism of microhomology-mediated end-joining promoted by human 713 DNA polymerase ș. Nature Structural & Molecular Biology 22, 230-237 714 (2015).
715 28. Gordhan, B.G., Andersen, S.J., De Meyer, A.R. & Mizrahi, V. Construction by 716 homologous recombination and phenotypic characterization of a DNA 717 polymerase domain polA mutant of Mycobacterium smegmatis. Gene 178, 718 125-30 (1996).
719 29. Wilson, T.E., Topper, L.M. & Palmbos, P.L. Non-homologous end-joining: 720 bacteria join the chromosome breakdance. Trends in Biochemical Sciences 721 28, 62-6 (2003).
722 30. Puri, R.V., Singh, N., Gupta, R.K. & Tyagi, A.K. Endonuclease IV Is the major 723 apurinic/apyrimidinic endonuclease in Mycobacterium tuberculosis and is 724 important for protection against oxidative damage. PLoS One 8, e71535 725 (2013).
726 31. Buckstein, M.H., He, J. & Rubin, H. Characterization of nucleotide pools as a 727 function of physiological state in Escherichia coli. Journal of Bacteriology 190, 728 718-26 (2008).
729 32. Jacewicz, A. & Shuman, S. Biochemical Characterization of Mycobacterium 730 smegmatis RnhC (MSMEG_4305), a Bifunctional Enzyme Composed of 731 Autonomous N-Terminal Type I RNase H and C-Terminal Acid Phosphatase 732 Domains. Journal of Bacteriology 197, 2489-98 (2015).
733 33. Minias, A.E. et al. RNase HI Is Essential for Survival of Mycobacterium 734 smegmatis. PLoS One 10, e0126260 (2015).
735 34. Nowotny, M., Gaidamakov, S.A., Crouch, R.J. & Yang, W. Crystal structures 736 of RNase H bound to an RNA/DNA hybrid: substrate specificity and metal- 737 dependent catalysis. Cell 121, 1005-16 (2005).
738 35. Jacewicz, A. & Shuman, S. Biochemical Characterization of Mycobacterium 739 smegmatis RnhC (MSMEG_4305), a Bifunctional Enzyme Composed of 740 Autonomous N-Terminal Type I RNase H and C-Terminal Acid Phosphatase 741 Domains. J Bacteriol 197, 2489-98 (2015).
742 36. Ordonez, H., Uson, M.L. & Shuman, S. Characterization of three 743 mycobacterial DinB (DNA polymerase IV) paralogs highlights DinB2 as 744 naturally adept at ribonucleotide incorporation. Nucleic Acids Research 42, 745 11056-70 (2014).
746 37. Ogawa, T. & Okazaki, T. Function of RNase H in DNA replication revealed by 747 RNase H defective mutants of Escherichia coli. Molecular Genetics and 748 Genomics 193, 231-7 (1984).
31
749 38. Minias, A.E., Brzostek, A.M., Minias, P. & Dziadek, J. The deletion of rnhB in 750 Mycobacterium smegmatis does not affect the level of RNase HII substrates 751 or influence genome stability. PLoS One 10, e0115521 (2015).
752 39. Heyneker, H.L. & Klenow, H. Involvement of Escherichia coli DNA 753 polymerase-I-associated 5' in equilibrium 3' exonuclease in excision-repair of 754 UV-damaged DNA. Basic Life Sciences 5A, 219-23 (1975).
755 40. Friedberg, E.C. The eureka enzyme: the discovery of DNA polymerase. 756 Nature Reviews Molecular Cell Biology 7, 143-7 (2006).
757 41. Tait, R.C., Harris, A.L. & Smith, D.W. DNA repair in Escherichia coli mutants 758 deficient in DNA polymerases I, II and-or 3. Proceedings of the National 759 Academy of Sciences USA 71, 675-9 (1974).
760 42. Bailly, V. & Verly, W.G. The excision of AP sites by the 3'-5' exonuclease 761 activity of the Klenow fragment of Escherichia coli DNA polymerase I. FEBS 762 Letters 178, 223-7 (1984).
763 43. Ismail, I.H., Nyström, S., Nygren, J. & Hammarsten, O. Activation of ataxia 764 telangiectasia mutated by DNA strand break-inducing agents correlates 765 closely with the number of DNA double strand breaks. Journal of Biological 766 Chemistry 280, 4649-55 (2005).
767 44. de Ory, A. et al. Identification of a conserved 5'-dRP lyase activity in bacterial 768 DNA repair ligase D and its potential role in base excision repair. Nucleic 769 Acids Research 44, 1833-44 (2016).
770 45. Winter, G. xia2: an expert system for macromolecular crystallography data 771 reduction. Journal of Applied Crystallography 43, 186-190 (2010).
772 46. Mccoy, A.J. et al. Phaser crystallographic software. Journal of Applied 773 Crystallography 40, 658-674 (2007).
774 47. Emsley, P., Lohkamp, B., Scott, W.G. & Cowtan, K. Features and 775 development of Coot. Acta Crystallographica Section D: Biological 776 Crystallography 66, 486-501 (2010).
777 48. Adams, P.D. et al. PHENIX: a comprehensive Python-based system for 778 macromolecular structure solution. Acta Crystallographica Section D: 779 Biological Crystallography 66, 213-21 (2010).
780 49. McNicholas, S., Potterton, E., Wilson, K.S. & Noble, M.E. Presenting your 781 structures: the CCP4mg molecular-graphics software. Acta Crystallographica 782 Section D: Biological Crystallography 67, 386-94 (2011).
783 50. Parish, T. & Stoker, N.G. Use of a flexible cassette method to generate a 784 double unmarked Mycobacterium tuberculosis tlyA plcABC mutant by gene 785 replacement. Microbiology 146 ( Pt 8), 1969-75 (2000).
32
786 51. Plocinski, P. et al. Mycobacterium tuberculosis CwsA interacts with CrgA and 787 Wag31, and the CrgA-CwsA complex is involved in peptidoglycan synthesis 788 and cell shape determination. Journal of Bacteriology 194, 6398-409 (2012).
789
790
791
792
33
793 Accession codes. Atomic coordinates and structure factors have been deposited in 794 the Protein Data Bank under accession code 5OP0.
795
796 Acknowledgements
797 We declare that none of the authors have a financial interest related to this work. We
798 thank Dr M. Roe for assistance with X-ray data collection. AJDs laboratory was
799 supported by grants from Biotechnology and Biological Sciences Research Council
800 (BB/F013795/1, BB/J018643/1 and BB/M004236/1). PP was supported by a mobility
801 grant from the Polish Ministry of Science and Higher Education (1073/MOB/2013/0:
802 Mobilnosc Plus). Funding for open access charge: Research Councils UK (RCUK).
803
804 Author contributions. A.J.D. designed the project, directed the experimental work,
805 performed database analysis and wrote the manuscript. P.P. and N.B. contributed to
806 the project design, performed and designed experiments, performed computational
807 analyses and co-wrote the manuscript. P.P. performed mass spectrometry analysis.
808 N.B. performed X-ray data processing, model building and analysis. J.B. generated
809 LigC and Prim-PolC expression plasmids, purified proteins, measured protein
810 expression levels in cells and performed the initial biochemical and phenotype
811 studies. A.B. and M.K-M. constructed mutant strains in M. smegmatis and assisted
812 P.P. with CFU experiments. A.D. and J.D. helped with study design and directed
813 parts of the experimental work (mass spectrometry analysis of protein-complexes
814 and generation of mutant strains and phenotypic analysis, respectively).
815
816 Competing interests: The authors declare no competing financial interests. 34
817
818
35
819 Figure Legends
820
821 Figure 1. Interactions between Prim-PolC, LigC proteins and base excision
822 repair elements. (a) A table showing the DNA repair-associated preys that co-
823 purified in an eGFP-facilitated affinity purification experiment using LigC and Prim-
824 PolC as baits. (b) Base excision repair enzymes were purified as recombinant
825 proteins and interactions with Prim-PolC and LigC1 were confirmed by slot-blot
826 analysis, where positive interactions are marked with red, possible weak
827 associations with green and negative interactions with black font, respectively. (c)
828 Verified interactions are summarized in a schematic diagram showing that LigC is
829 the major scaffolding protein involved in multiple protein complex formation.
830
831 Figure 2. DNA binding activity of Prim-PolC. (a) Prim-PolC prefers to bind to 5
832 phosphorylated gaps. In EMSAs, mixtures of a single nucleotide gap containing
833 substrate (30nM), with or without phosphorylation of the 5' end of the lesion, and 0-
834 10µM Prim-PolC were incubated for 20 minutes and resolved on a native
835 polyacrylamide gel. A filled triangle indicates unshifted probe, whereas the black
836 arrow probably indicates a binary complex of Prim-PolC and DNA. The grey arrow
837 indicates multiple protein monomers bound to the probe. Quantification of the EMSA
838 data is presented. For each Prim-PolC concentration the percentage of DNA bound
839 (in relation to the total DNA) was calculated and compared for EMSAs containing
840 Prim-PolC binding a 1 nucleotide gap with or without 5-phosphorylation.
841 (b) Prim-PolC prefers to bind to gaps rather than overhangs unlike Prim-PolD. This
842 EMSA uses 30nM of 5'-fluorescein labelled 36-mer containing a 5' phosphorylated
843 single nucleotide gap or 30nM of 5'-fluorescein labelled 36-mer containing a single- 36
844 stranded 3 overhang given by phosphorylation of the 5'-end of the gap and no
845 primer strand. The triangles and arrow are as in (a). Using the same conditions as
846 (a), this time comparing Prim-PolC to Mtu Prim-PolD (25, 50, 75 & 100 nM).
847 Quantification of the EMSA data is presented. For each enzyme concentration the
848 percentage of DNA bound (in relation to the total DNA) was calculated and
849 compared for EMSAs containing Prim-PolC or Prim-PolD binding a 5-
850 phosphorylated 1 nucleotide gap or a 3-overhang with a 5-phosphorylated
851 downstream strand. (c) Prim-PolC is not an NHEJ polymerase. A schematic of 5'-
852 fluorescein-labelled substrates used in a MMEJ activity assay. The pss- number
853 refers to the number of bases at the end of the 3overhang that can base pair with
854 itself. A primer extension assay, 30nM of the denoted substrate extended by 300nM
855 enzyme (PPC-Prim-PolC, PPD-Mtu Prim-PolD) in the presence of 250 µM dNTPs
856 mix for 30 min at 37°C. The products are resolved on a 15% denaturing gel. (d) The
857 same reaction products from (c), after treatment with Proteinase K and run on a 12%
858 non-denaturing gel.
859
860 Figure 3. Short gap-filling activity of Prim-PolC in complex with LigC1. (a) Prim-
861 PolC prefers to insert rNTPs into short gaps. In primer extension assays, 30nM of 5'-
862 fluorescein labelled 36-mer containing a nick (n), a single-stranded gap of various
863 length (indicated as numbers), with or without phosphorylation of the 5'-end or
864 lacking the 5'-end strand thus forming an overhang (ov), was extended by Prim-PolC
865 (300nM) in the presence of a 250 µM dNTPs or rNTPs mix for 5 min at 37°C. (b) The
866 preferred products from Prim-PolC gap-filling are ligatable. For gap filling and
867 ligation, LigC1 (300nM) was added to the primer extension reactions (same as a)
868 and were incubated for an extra 45 min at 37°C to produce a repair product (P).
37
869
870 Figure 4. Reconstitution of a Prim-PolC - LigC1-dependent BER repair complex
871 in vitro. (a) Schematic representation of BER repair reactions showing repair
872 intermediates generated by subsequent DNA repair activities of glycosylases, end-
873 processing enzymes, gap-filling and ligation. (b & c) In vitro repair of deoxyuracil- (b)
874 or 8-oxoguanine- (c) containing DNA substrates. 30nM of 5'-fluorescein labelled ds
875 36-mer containing a central lesion was processed by uracil glycosylase or directly by
876 recombinant mycobacterial FPG (for 8-oxoguanine), to produce a resulting abasic
877 site intermediate processed into a 3',5'- biphosphorylated gap by the abasic site
878 lyase activity of FPG. Reactions were carried out with addition of a 250 µM dNTPs or
879 rNTPs mix for 30 min at 37°C. Gap ends were next processed, including removal of
880 the 3 phosphate, by addition of various amounts of either XthA, ExoIII or EndoIV for
881 5 min at 37°C. Subsequently, a mixture of Prim-PolC and LigC1, 300nM of each,
882 was added to provide gap filling and ligation activities for 45 min at 37°C.
883
884 Figure 5. Crystal structure of Prim-PolC. (a) A ribbon diagram representation of
885 the crystal structure of Prim-PolC (PDBID: 5OP0). The previously characterized
886 NHEJ polymerase core is coloured sky blue (see also Figure S6a), with the
887 conserved loop structures from this family coloured dark blue and cyan for Loops 1
888 and 2, respectively. The catalytic aspartate side-chains are represented with the
889 carboxylic oxygens shown in red. Side-chains involved in forming the 5 phosphate
890 binding pocket are depicted in dark blue. The Prim-PolC specific C-terminal
891 extension is shown in pink, comprising α8 and Loop 3. (b) Schematic representation
892 comparing the primary structures of Prim-PolD and Prim-PolC. Conservation of the
38
893 C-terminal extension of Prim-PolC amongst selected organisms is highlighted by a
894 sequence alignment of this region (c) Ribbon diagram representation of the Prim-
895 PolC crystal structure highlighting the molecular contacts formed between the
896 conserved loops (1 and 2) and the additional Loop 3. The loops are further
897 represented as translucent solvent accessible surfaces colour matched to the
898 specific loop element. (d) Superposition of DNA from the NHEJ Prim-PolD MMEJ
899 complex (PDBID: 4MKY) onto the Prim-PolC structure. The path of the templating
900 DNA from the in-trans structure clashes with Loop 3.
901
902 Figure 6. Expression of Prim-PolC and LigC1 in vivo and phenotypes
903 associated with their deletion. (a) Expression profiling using western blotting.
904 Specific antibodies risen against LigC1, Prim-PolC and Ku proteins were used to
905 probe for corresponding proteins in whole cell lysates obtained from M. smegmatis
906 cells grown to various growth phases including, exponential (E), early stationary
907 (E/S) and stationary (S) phase. (b) A plot of colony forming units (CFU) data showing
908 the survival of wild type (MC 155) or mutant cells, lacking Prim-Pols and / or ligases,
909 treated with 10mM cumene hydroperoxide. Cells were plated at 1, 15, 30 or 60 min
910 after treatment and the percentage survival is calculated against untreated control
911 cells, treated with diluted buffer instead of peroxide. Data are representative of the
912 mean of five individual experiments and error bars show the standard deviation. (c)
913 A model of the repair of a damaged base by the LigC-dependent excision repair
914 complex. The damage is initially recognised by a specific mono or bifunctional
915 glycosylases, whose activity results in the formation of an abasic site intermediate.
916 LigC is recruited to the lesion early to coordinate later repair steps and mediate
917 multi-protein repair complex formation. Short gaps produced by processing of the
39
918 lesion by the BER machinery are then filled in with RNA by Prim-PolC and the
919 resulting nick is ligated by LigC. RNA is probably later excised and replaced with
920 DNA catalyzed by PolA and the nicked DNA intermediate sealed by LigA.
921
40
922 Table 1: Data collection and refinement statistics (molecular replacement)
923
Msm Prim-PolC
Data collection
Space group P1
Cell dimensions
a, b, c (Å) 51.55, 56.43, 64.45
α, β, γ (°) 97.17, 100.2, 90.64
Resolution (Å) 44.77 (1.84) *
Rsym or Rmerge 0.09 (0.60)
Ι / σΙ 6.8 (1.7)
Completeness (%) 96.9 (95.4)
Redundancy 2.6 (2.6)
Refinement
Resolution (Å) 44.77 (1.84)
No. reflections 59594 (5894)
Rwork / Rfree 0.1858 / 0.2092
No. atoms 5824
Protein 5254
Water 570
B-factors
Protein 28.81
Water 37.11
R.M.S. deviations
Bond lengths (Å) 0.005
Bond angles (°) 0.743
924 *From 1 crystal. *Values in parentheses are for highest resolution shell.
925
926
927
41
a! c! Prim-PolC! PolD2! Pept M.W. Intensity Intensity Intensity Protein Identifier! No.! [kDa]! LigC1! LigC2! PrimPolC! PolA! MS_6302 LigC1! 16 39,8 3,26E+09 Primases / ! MS_0001 DNA Pol III, beta subunit (dnaN) [2.7.7.7] 14 41,3 9,56E+08 4,98E+08 5,70E+09 MS_2731 ATPase involved in DNA repair 13 48,5 3,61E+08 3,56E+08 2,10E+09 Polymerases! MS_3839 DNA polymerase I [2.7.7.7] 31 101 2,78E+08 4,08E+08 5,06E+09 Nucleases! MS_6896 Single-stranded DNA-binding protein 4 17,4 2,48E+08 3,43E+08 2,11E+09 MS_5156 Base excision DNA repair protein 3 20,9 3,85E+07 1,44E+07 4,46E+08 MS_1383 Endonuclease IV 4 26,6 2,18E+07 6,14E+08 DNA ! MS_2399 Uracil-DNA glycosylase (ung) [3.2.2.-] 2 26,6 1,46E+07 1,00E+08 EndoIV! Glycosylases! MS_3816 Excinuclease ABC, B subunit (uvrB) 14 80,8 1,33E+07 8,44E+07 1,16E+09 MS_5423 Transcription-repair coupling factor (mfd) 9 131 3,63E+06 4,17E+06 4,00E+08 LigC1! MS_3883 5ʼ-3 exonuclease 3 33,9 2,71E+07 NTH! MS_5440 Deoxyribonuclease 4 33,4 2,59E+07 4,07E+08 ExoIII! MS_6304 LigC2! 4 39,3 2,90E+08 MS_6285 DNA Pol III gamma/tau subunit [2.7.7.7] 5 69,9 4,22E+08 FPG! MS_6301 Prim-PolC! 15 41,6 1,13E+10 XthA! MS_2362 DNA ligase (ligA) [6.5.1.2] 16 76,3 1,69E+08 1,43E+08 2,35E+09 MPG! b! + Interactor + Anti- LigC1! + Interactor + Anti- Prim-PolC! !
XseB BSA PolD2! Ku Nth ! XseB BSA PolD2! Ku Nth !
EndoIV Prim-PolC ! LigC1 FPG! EndoIV LigC1 ! Prim-PolC FPG!
XthA ExoIII LigC1! MPG PolA ! XthA ExoIII Prim-PolC! MPG PolA ! - Interactor + Anti- LigC1! - Interactor ! + Anti- Prim-PolC!
Ku Nth ! XseB BSA PolD2! Ku Nth ! XseB BSA PolD2!
EndoIV LigC1 ! EndoIV Prim-PolC ! LigC1 FPG! Prim-PolC FPG!
ExoIIIA ExoIII LigC1! MPG PolA ! ExoIIIA ExoIII Prim-PolC! MPG PolA ! a 1nt Gap 1nt Gap/5ʼ-℗ '!!" '*+",-." &!" Prim-PolC Prim-PolC '*+",-./)012" %!" $!" #!" &'$#()*+,#-./# !" !(!'#)" !(!#)" !('" !(#)" !()" '" )" '!" 012340)5!#-µ6/# b '!!" 1nt Gap/5ʼ-℗ 3ʼ Ov/5ʼ-℗ 445"'*+",-./)012" &!" 446"'*+",-./)012" Ø Prim-PolC Prim-PolD Ø Prim-PolC Prim-PolD %!" 445"78"9:/)012" 446"78"9:/)012" $!" #!" &'$#()*+,#-./# !" !(!#)" !(!)" !(!3)" !('" 01)782+#-µ6/# c d pss-0 pss-4 pss-6 pss-0 pss-4 pss-6 26nt 5ʼ $%$!#3ʼ pss-0 3ʼ 5ʼ Ø Ø Ø Ø Ø Ø 14nt
50 nt 26nt 5ʼ !!""#3ʼ pss-4 3ʼ 5ʼ 14nt 50 nt
26nt 5ʼ !!!"""#3ʼ pss-6 3ʼ 5ʼ 14nt a 0-5 nt %#$$ "#$ !"#$ %#$ %#$ "#$ rNTPs rNTPs/5ʼ-℗ dNTPs dNTPs/5ʼ-℗ Ø n 1 2 3 5 ov n 1 2 3 5 Ø n 1 2 3 5 ov n 1 2 3 5 +4 +3 +2 +1 0 b
rNTPs rNTPs/5ʼ-℗ dNTPs dNTPs/5ʼ-℗ Ø n 1 2 3 5 ov n 1 2 3 5 n 1 2 3 5 ov n 1 2 3 5
P
+3 +2 +1 0 a! dU! !" !" +UNG! Abasic site 3ʼ-℗, 5ʼ-℗-gap 3ʼ-OH, 5ʼ-℗-gap Fill-in product Ligated DNA 3’ 5’ 5’ DNA Lesion! !" !" !" !" !" !" !" !" !" !" +FPG! +XthA! +Prim-PolC! +LigC! 8OxoG! /ExoIII! !" !" +FPG! /EndoIV! b! LigC + PPolC! LigC + PPolC! LigC + PPolC!
XthA! ExoIII! EndoIV! rNTP! 5ʼ 5ʼ
!" 5ʼ 5ʼ 5ʼ !!" " c! 5ʼ LigC + PPolC! LigC + PPolC! LigC + PPolC! XthA! ExoIII! EndoIV! rNTP! 5ʼ 5ʼ
!" 5ʼ 5ʼ 5ʼ !!" " 5ʼ a! Loop 3 b! Loop 1 5’ PB L1 L2 C Prim-PolD N-! -C! !8! L3! Prim-PolC 5’ PO4 binding
R63 Loop 2 !8! Loop 3! K23
N20 Msmeg 6301 SVQQSLGPLLDLVAADE-ERGLGDLPYPPNYPKMPGEPPRVQPSKK P65 Mt Rv3730c DVAQSIAPLLDLAAADE-ERGLGDMPYPPNYPKMPGEPKRVQPSRD N Mav 0362 DVAQSIGPLLRMAEADE-ERGLGDMPYPPNYPKMPGEPKRVQPSRD Rop 51690 DVAQSLDVLLEMAERDT-ANGLGDLPYPPNYPKMPGEPKRVQPSRD Gbro 0416 KHRKGIERLLEMVQADD-ANGLGDMPYPPSYPKMPGEPPRVQPSKK K35 Srot 2335 --PGSLDKLLE------QDTPLLPYPPQYPKMPDEPKRVQPSKD Sav 1696 DHAFSLEALLELADRDEHDHGLGDLPYPPEYPKMPGEPKRVQPSRA .: ** . :****.***** ** *****: c! Loop 3 DNA clash 8! d! ! Loop 3 Loop 1 !8!
Loop 2 Loop 2 R63 Loop 1 K23
P65 N20 N
K35 a! b! 100! mc2 155!
E E/S S 10! Δprim-polC ΔligC1,C2
anti-Ku ! 1! ΔligD anti-Prim-PolC ΔligD,prim-polC C 0.1! ΔpolA %Survival anti-LigC1 ΔpolA,prim-polC 0.01! ΔpolA,ligD,prim-polC
0.001! 0! 1! 2! 3! Time of exposure [hr]! c! !"#$%&'
Damage detection & base removal !"#$ Recruitment of repair enzymes
Processing by FPG! PolA & LigA ? Ap 5ʼ 3ʼ
%"#$ ExoIII/ EndoIV! LigC!
Gap filling Abasic site & ligation Prim-PolC! removal P 5ʼ P P
End processing !"##$%&%'()*+,-).$%,/0,123(,45,678,#*2&%*3,"3%9,2',(:23,3("9+0,
!"#$%"&'(&%)'*+"',"%,-"-.#+/'+*',"+.%#/'%0,"%&+/'1+/&."(1.&,
Amplicon Forward primer 5'->3' Reverse primer 5'->3' Nth GGAATTCCATATGAGTGCGGGTGCCGCCCGG GCCCAAGCTTCACAAACCGGCCAGCGCCA FPG (SLIC) TTAACTTTAAGAAGGAGATATACCATGCCTGAGC TGGTGGTGCTCGAGTGCGGCCGCAAGCTGACCCGAG TTCCCGAGGT GCACCCGCGGAC MPG GGAATTCCATATGAGCGTCGACCTGCTGAC CGCGGATCCTCAGTCACTGCTGCCGGGCGC ExoIII GGAATTCCATATGGCTTCCCCGCAGACTCT CGCGGATCCTCATGCGCTGAACTCGACCG XthA GGAATTCCATATGCGTGTGGCCACCTGGAA CGCGGATCCTCAGTTCAGTTCGACGAGCACG EndoIV GGAATTCCATATGCTCATTGGCTCGCATGTAG CGCGGATCCCTAACCCGCGTGTTCGCGCA XseB GGAATTCCATATGAAGCCCATTAGTGAACTGG GCCGCTCGAGGTCCTCGTCGGCCTCGCGG PolA GGAATTCCATATGAGCCCCGCCAAGACCGC GCCCAAGCTTCAGTGAGCCGCGGCGTCCCA PolD1 GGAATTCCATATGACAACCATCGCAGTTGC CGCGGCCGCTCAGCGGCGCCACTGTGCGCCCG Prim-PolC GGAATTCCATATGTGATGGCCAGTGCGGCAACC GCCCAAGCTTTCACTGCTTGCGGTTGCCCTGCT LigC1 GGAATTCCATATGGACTTGCCGGTGCAGCC GCCCAAGCTTTCACTGTTCCTCCAGCACGTCGT Prim-PolC GGAGGATCCGAACGATGGCCAGTGCGGCAACCG CCTTAAGCTTCTGCTTGCGGTTGCCCTGCT GFP (SLIC) AA LigC1 GFP TCGCTACTCTCATCGTGGAATCCTGACAGGATCG TGTACAGCGAGGTGATGTCGGCGGCCTTAAGCTTCT (SLIC) CCAGGGAGGATCCGAACGATGGACTTGCCGGTG GTTCCTCCAGCACGTCGT CAGCC '
!"#$%"&'(&%)'*+"'2%/%"-.#+/'+*'2%/%'"%,3-1%$%/.',3-&$#)&'
Amplicon Forward primer 5'->3' Reverse primer 5'->3' !PolD2 (MSMEG_0597) LF AACTGCAGCGACATCGCCGGACGACATC CAAGCTTGAGGATCATCGGCCTGCCG RF GCTGATCGAGATCGCCCGC GAAGATCTAGGCGCCATCACGGTCGG !Prim-PolC (MSMEG_6301) LF TGGAGACCCGCACCCTGATG GGCGAGCGGTTGTTCATCG RF GTCCGGGTTGGTGAACCGG TCATACGACATCCCCGGCGG !LigC2 (MSMEG_6304) LF AGACCAGACAGCCCGAGATCGC TCATACGACATCCCCGGCGG RF CGACGTTCCACCTGACGCC CGTGGGCCTGACCGATCAG !LigC1 (MSMEG_6302),LigC2 AGACCAGACAGCCCGAGATCGC CGAGGCCGTCGACAACAAACG LF RF CGACGTTCCACCTGACGCC CGTGGGCCTGACCGATCAG !LigC1,LigC2,Prim-PolC LF TGGAGACCCGCACCCTGATG GGCGAGCGGTTGTTCATCG RF CGATCTGCTGGCACTCGGC CGTGGGCCTGACCGATCAG ' M. smegmatis! LigD! LigC!
0 1 2 3 5 6 6.99!
LigD! KU! Prim-PolC! LigC1!LigC2!
M. tuberculosis! LigD! LigC!
0 1 2 3 4 4.41 [Mbp]!
M. sp. JSP623! LigD! LigC!
0 1 2 3 4 5 6 7 7.2! pMYCSM02 pMYCSM03!
S. Coelicolor! LigD! LigC! LigD!LigD!
0 1 2 5 6 7 8 8.67!
ATP-Ligase! AEP / Prim-Pol! LigD! Ku! LigD_N!
Supplementary Figure 1. Diversity and genomic localization of Prim-Pol and ATP-dependent DNA ligase genes in Actinobacteria. Approximate distribution of genes (depicted as arrows) encoding Prim-Pols (blue), ATP- dependent DNA ligases (green) and non-homologous end joining elements: multidomain LigD (red), Ku protein (orange) and LigD associated phosphoesterase (yellow); are shown on respective actinobacterial genomes and mobile elements (black nodes). The three separate components of the
LigD complex in Streptomyces are denoted.
kDa! !1 2 3 4 5 6 7 8 9 10 11 12! 250!
150!
100!
75! 50! 37! 25!
20!
15! Supplementary Figure 2. Recombinant proteins purified for this study.
Approximately 2µg of each purified recombinant protein was loaded on a 4-
12% gradient SDS-polyacrylamide gel and resolved in 1 x TGS buffer under routine conditions. Gel was stained with colloidal Coomassie stain. Proteins loaded as follows:! 1. Protein molecular weight marker, 2. LigC1 39,5kDa
+HIS, 3. PrimPolC 39,4kDa +HIS, 4. PolD1 47,6kDa +HIS, 5. PolA 99,9kDa
+HIS, 6. FPG 31,6kDa +C-HIS, 7. MPG 21,5kDa +HIS, 8. Nth 28,8kDa +HIS,
9. Endo4 26,6kDa +HIS, 10. XthA (MSMEG_0829) 29,1kDa +HIS, 11. ExoIII
(MSMEG_1656) 30,1kDa +HIS, 12. XseB 7,8kDa.
a! M. intracellulare! ! !"#$% &'()% *!+%$,!% -.+% ! Mycobacterium sp. JSP623! !"#$% &'()% ! *!+%$,!% M. avium!
! !"#$% &'()% *!+% -.+% b! strong interaction weak/no interaction
[kDa]
50 40 LigC no-tag LigC no-tag
30 FPG-HIS MPG-HIS LigC digestion with thrombin 20 !""""#$""""%"""""&"" !""""#$""""%"""""&"" [kDa]
50 40 LigC-HIS LigC no-tag no interaction no interaction
BSA BSA
FPG-HIS MPG-HIS
!""""#$""""%"""""&"" !""""#$""""%"""""&""
!'()*+""""#$',)-"./0)12/""""%'-*3/4"""""&'5(16)7"899:;"<:<+*=)(5" Supplementary Figure 3. Operonic and functional association of
bifunctional glycosylase FPG with Prim-PolC and LigC homologues in
mycobacteria.
(a) For chosen mycobacteria, genomic regions encoding FPG co-transcribed
with either individual or both Prim-PolC and LigC orthologues, in the vicinity of
base excision repair genes polA and uvrB, are presented. (b) Pull-down assay confirmation of the stable protein-protein interaction between recombinant tag- free LigC (prey protein) and FPG-6xHis (bait protein) from M. smegmatis, with respective controls.
! a
Non-phosphorylated! 5ʼ-℗ gap ! n! 1! 2! 3! 5! 10! ov! ss! n! 1! 2! 3! 5! 10! ov! ss!
b RecD SRAP """"""""""""""""! $$$$$$$$$$$$$$$$$$$!
Prim-PolC LigD RecBCD PolA En4 SRAP #! """"""""""""""""! $$$$$$$$$$$$$$$$$$$!
PolA LigD RecBCD """"""""""""""""! $$$$$$$$$$$$$$$$$$$! En4 Ku
- DNA end blocked with S-bond Supplementary Figure 4. Substrate specificities of Prim-PolC and LigC1.
(a) An EMSA showing the optimal substrate binding specificity of Prim-PolC.
Reaction mixtures composed of 30nM of substrate, with a nick or single stranded gap of various lengths, with or without phosphorylation of the 5' end of the lesion and 300nM Prim-PolC were incubated on ice for 20 minutes and resolved on a 5% native polyacrylamide gel. A filled triangle indicates the unshifted probe, whereas the arrow indicates a discreet bound complex of
PrimPolC and DNA. (b) A schematic representation of proteins isolated from mycobacterial extracts that bound to a variety of different DNA substrates coated onto magnetic beads (black filled circles). Proteins were isolated by pulling-down these beads, washing them extensively and subsequently identified by mass spectrometry analysis.
a!
X=T! X=G! X=C! X=A!
Ø! A! B! Ø! A! B! Ø! A! B! Ø! A! B!
P+rNTP! P+dNTP! P!
dA+A! dC+C! dG+G! dT+U!
!"#$ b! X! )*# )*# !"%+$)*# %&'($ !"#$ //$(01## !"!# !"$%# !"&%# !"'%# !"(%# # !"! !"% )'&*$$!+$$,-./01$ !"!# !"#$ !"$%# !"#$ !"&%# !"#$ !"'%# /# !"#$ 2# /# !"(%# !"#$ !"!)*# ,&-.# !"#$ !"%)*# ,&-.# !"#$ ,&-.# /# !"%+$)*# Supplementary Figure 5.
(a) Competition assay for deoxy- versus ribo- nucleotides. The ratio of deoxy : ribo is 100:1 with 50nM : 500pM in A and 5nM : 50pM in B. The schematic of the substrate is depicted below the gel. The templating base is defined with an X and the complementary deoxy : ribo mix is used as the cofactor in a gap- filling reaction. The activity was assayed as described in the methods with the following exceptions; 30°C incubation for 30 minutes with the products resolved on a 25% denaturing gel. The preference factor, F, is given by F = P
+ rNMP / P + dNMP as determined by densitometry. F was determined for a minimum of 5 experiments at two concentrations and averaged for the four base pairings. (b) LigC1 ligation assays. Ligation reactions were carried out in the Tris-pH 7.5 based buffer containing 300nM of LigC1, 0.1mg/ml BSA, 1mM
DTT with addition of 1mM ATP, 100µM manganese and 5mM magnesium as cofactors for 45 min at 37°C or up to 2 hours for LigD substrate (D:R+1bl).
Substrates used for ligation are depicted as a cartoon on the right side of panel b. Phosphorylation of DNA ends is marked with a red circle and green and blue diamonds represent ribo- and deoxyribonucleotides, respectively.
a Loop 3 b Loop 1 8! C ! !8! C P334 K324
Loop 2 W220 Y318 P317 M325 P323 W219
N Y322 S95 P94 R224 F93 K221
c Loop 3 Loop 3 Loop 1 d Loop 1 C C !8! !8!
Loop 2 Loop 2 R63 R63
K23 K23
P65 N20 P65 N20 N N
K35 K35 Supplementary Figure S6. Structural differences between Prim-PolC and Prim-PolD. (a) Superposition of Msm Prim-PolC (sky blue) and Mtu Prim-
PolD (lemon, PDBID: 3PKY). Loop 1 and Loop 2 are shown in dark blue and cyan, respectively. The loops occupy differing positions, reinforcing the idea that the functions of the conserved structural elements have developed to stabilize the Loop 3 element (pink). (b) Loop 1 and Loop2 residues involved in stabilising Loop 3 are highlighted along with the conserved residues of Loop
2. (c) Superposed DNA as in Figure 5D. In this instance, the full path of the template DNA is shown with the DNA (red) that clashes with Loop 3 in translucent form. The conserved residues involved in phosphate binding and
DNA interaction are highlighted. (d) As C, but the superposed DNA (red), UTP
(green) and catalytic metal ions (magenta) are from the Mtu Prim-PolD pre- ternary structure (PDBID: 3PKY). This representation illustrates the catalytic core is structurally conserved.
!" Supplementary Figure S7. Stereo view of electron density for Loop 3 of
Prim-PolC. A stick representation of the C-terminal residues G313-P333 that make up Loop 3 (pink). Density from a weighted 2Fo-Fc map scaled at 1.1σ is also depicted in grey. Loop 1 and Loop 2 residues are coloured blue and cyan, respectively.
5mM H2O2! 50mM tert-Butyl hydroperoxide! 1.00E+08! 1.00E+08!
1.00E+07! 1.00E+07! mc2 155 1.00E+06! !primPolC 1.00E+06! !"#$%& !"#$%& !ligC1C2 1.00E+05! 1.00E+05! !ligD 1.00E+04! !ligD,primPolC 1.00E+04! !polA 1.00E+03! 1.00E+03! 0! 1! 2! 3! 0! 1! 2! 3! time of exposure [h]! time of exposure [h]!
Supplementary Figure 8.
Plots of CFU data for phenotypes of mutants lacking Prim-Pols and ATP- dependent DNA ligases treated with inorganic and organic hydroperoxides:
5mM H2O2 – hydrogen peroxide and 50mM tert-butyl hydroperoxide. Cells were plated at 1, 15, 30 or 60 min after treatment.
! Supplementary Figure 9. Uncropped versions of the western blots and gels used in this study are shown in the panels below.!
Figure 1b
+ Interactor + Anti- LigC1 + Interactor + Anti- Prim-PolC
XseB BSA PolD2 Ku Nth XseB BSA PolD2 Ku Nth
EndoIV Prim-PolC LigC1 FPG EndoIV LigC1 Prim-PolC FPG
XthA ExoIII LigC1 MPG PolA XthA ExoIII Prim-PolC MPG PolA - Interactor + Anti- LigC1 - Interactor ! + Anti- Prim-PolC
Ku Nth XseB BSA PolD2 Ku Nth XseB BSA PolD2
EndoIV LigC1 EndoIV Prim-PolC LigC1 FPG Prim-PolC FPG
ExoIIIA ExoIII LigC1 MPG PolA ExoIIIA ExoIII Prim-PolC MPG PolA Figure 2a Figure 2b
1nt Gap/5ʼ-℗ 3ʼ Ov/5ʼ-℗ 1nt Gap 1nt Gap/5ʼ-℗ Prim-PolC Prim-PolC Ø Prim-PolC Prim-PolD Ø Prim-PolC Prim-PolD
Figure 2c Figure 2d
pss-0 pss-4 pss-6 pss-0 pss-4 pss-6
Ø Ø Ø Ø Ø Ø
50 nt
50 nt Figure 3a
rNTPs rNTPs/5ʼ-℗ dNTPs dNTPs/5ʼ-℗ Ø n 1 2 3 5 ov n 1 2 3 5 Ø n 1 2 3 5 ov n 1 2 3 5 +4 +3 +2 +1 0
Figure 3b
rNTPs rNTPs/5ʼ-℗ dNTPs dNTPs/5ʼ-℗ Ø n 1 2 3 5 ov n 1 2 3 5 n 1 2 3 5 ov n 1 2 3 5
P
+3 +2 +1 0 Figure 4b
Figure 4c Figure 6a
E E/S S! Anti-Ku!
! Anti-Prim-PolC!
C! Anti-LigC1! Supplementary Figure 2
kDa! !1 2 3 4 5 6 7 8 9 10 11 12! 250!
150!
100!
75! 50! 37! 25!
20!
15! Supplementary Figure 3b
strong interaction weak/no interaction
[kDa]
50 40 LigC no-tag LigC no-tag
30 FPG-HIS MPG-HIS LigC digestion with thrombin 20 "!!!!#$!!!!%!!!!!&!! "!!!!#$!!!!%!!!!!&!! [kDa]
50 40 LigC-HIS LigC no-tag no interaction no interaction
BSA BSA
FPG-HIS MPG-HIS
"!!!!#$!!!!%!!!!!&!! "!!!!#$!!!!%!!!!!&!!
L-load FT-flow through W-wash6 E-elution 300mM imidazole! Supplementary Figure 4a
Non-phosphorylated! 5ʼ-℗ gap ! n! 1! 2! 3! 5! 10! ov! ss! n! 1! 2! 3! 5! 10! ov! ss! Supplementary Figure 5a
X=T X=G X=C X=A
Ø A B Ø A B Ø A B Ø A B
P+rNTP P+dNTP P
dA+A! dC+C! dG+G! dT+U!
Supplementary Figure 5b ! ,- ,-! '()*+,-! "#$%! ..+012!! '('! '(+)! '(3)! '(4)! '(0)! '(' '() &$#'!!()!!*+,-./!
.! /! .!
.!