Supporting Information

Mayanagi et al. 10.1073/pnas.1010933108

SI Material and Methods

Purification of the Pyrococcus furiosus family B polymerases (PfuPolB) and Pfu Proliferating Cell Nuclear Antigen (PfuPCNA). Escherichia coli JM109 containing the plasmid for PfuPolB overproduction was grown at 37 °C with shaking in 1 L of Luria–Bertani medium containing 50 mg ampicillin. When the culture reached an optical density at 600 nm of 0.6, IPTG was added to a final concentration of 1 mM, and the culture was further incubated for 14 h. The cells were harvested and disrupted by sonication in buffer A, containing 50 mM Tris–HCl, pH 7.6, 2 mM EDTA, 2.4 mM PMSF and 0.2% Tween 20, and the cell debris was removed by centrifugation at 24,000 × g for 15 min at 4 °C. The supernatant was incubated at 80 °C for 15 min to denature and precipitate most of the E. coli proteins. After centrifugation, the heat-treated supernatant was dialyzed against buffer A containing 10% glycerol, and then was subjected to anion-exchange chromatography (RESOURCE Q, GE Healthcare), which was developed with a linear gradient of 0–500 mM NaCl, by using a high-pressure liquid chromatography apparatus (ÄKTA Explorer 10S; GE Healthcare). The fractions containing the PfuPolB eluted at around 100 mM NaCl and were subjected to cation-exchange chromatography (RESOURCE S, GE Healthcare) on a column equilibrated with the same buffer. The PfuPolB protein was eluted with a linear gradient of 150–200 mM NaCl.

To produce purified PfuPCNA, E. coli BL21-CodonPlus (DE3)-RIL cells containing the plasmid for PfuPCNA overproduction were grown in 1 L of Luria–Bertani medium containing 50 mg ampicillin to an optical density at 600 nm of 0.3 at 37 °C. IPTG was then added to the culture at a final concentration of 0.2 mM, and growth was continued for 2 h. Cells were harvested, suspended in 25 mL of buffer B (50 mM Tris-HCl, pH 8.5; 0.1 mM EDTA; 2 mM β-mercaptoethanol; 0.1 M NaCl; 10% glycerol), and the cell extract was prepared as described above. The E. coli proteins were partially removed by a two-step heating program, involving an initial incubation at 75 °C for 15 min. The supernatant after centrifugation was incubated at 80 °C for 10 min, and then centrifuged. To the second supernatant, polyethyleneimine and NaCl were added to 0.2% (wt/vol) and 0.58 M, respectively, and the mixture was stirred for 30 min at 4 °C. The proteins in the supernatant (30 mL) were precipitated by adding 16.8 g of ammonium sulfate (80% saturation). The precipitate was dissolved in buffer C (50 mM Tris-HCl, pH 8.0; 0.1 mM EDTA; 10% glycerol; and 0.5 mM dithiothreitol) and dialyzed against the same buffer. The dialysate was applied to an anion-exchange column (HiTrap Q, 5 mL; GE Healthcare), and the chromatography was developed with a 50-mL linear gradient of 0 to 1.0 M NaCl in buffer C, at a flow rate of 2 mL/min. The PfuPCNA was eluted at a salt concentration of 0.5 M. The purified PfuPolB and PfuPCNA were stored at 4 °C after dialysis against buffer C. The protein concentrations were determined by measuring the absorbance at 280 nm. The theoretical molar extinction coefficients of these molecules were calculated based on the numbers of tryptophan and tyrosine residues.

Synthetic DNAs Used for Sample Preparation. The primed-template DNAs (priDNAs) were produced by annealing four pairs of synthetic DNAs, as follows: template (AGCTACCATGCCTGCACGAATTAAGCAATTCGTAATCATGGTCATAGCT) primed with primer (AGCTATGACCATGATTACGAATTGC) for pri25/49, template (GCCTGCACGAATTAAGCAATTCGTAATCATGGTCATAGCT) primed with primer (AGCTATGACCATGATTACGAATTGCTTAAT) for pri30/40, template (GCCTGCACGAATTAAGCAATTCGTAATCATGGTCATAGCT) primed with primer (AGCTATGACCATGATTACGAATTGC) for pri25/40, and template (GCCTGCACGAATTAAGCAATTCGTAATCATGGTCATAGCT) primed with primer (AGCTATGACCATGATTACG) for pri19/40.

Structure Analysis and Fitting of the Crystal Structures. No filter was applied to the individual 18,897 images of the PfuPolB-PfuPCNA-DNA complex, each boxed in a 56 × 56 pixel square (3.1 Å/pixel), prior to image analysis. The initial 3D model was obtained using the common-line method. Subsequent iterative alignment and 3D reconstruction were performed using the REFINE routine in EMAN (1). The total number of particles included in the final reconstruction was 16,679. The average number of particles per class average was 60. The threshold used for volume calculation was 2.25, which corresponds to 210 kD (1.3 g/mL). Only a spherical mask (24-pixel radius, which corresponds to 74 Å) was applied for the final reconstruction, and no low-pass filter was applied to the final map. The resolution of the final map was estimated by means of the Fourier shell correlation (FSC) method, using the 0.5 FSC criterion. The visualization of the electron microscopy (EM) map and the fitting of the crystal structures into the map were performed with the Chimera software (2). Initially, the PfuPCNA crystal structure was manually placed in the hexagonal ring region of the map, so that the so-called C side of PCNA faced the PolB. Then the structure was rotated around the axis of the ring, and the interdomain connecting loops (IDCLs) were roughly adjusted to the flat edges of the hexagonal ring region. Finally, the best fit was searched using the “Fit Model in Map” tool in Chimera. The PfuPolB crystal structure was also manually docked to the PolB region covering the PCNA, so that the side with the DNA binding pockets faced the DNA rod density and PCNA. The crystal structure was then further fitted to the EM map considering the oval shell-like shape of the EM map and the crystal structure. Notably, the flexible C-terminal loop with the PCNA-interacting protein (PIP)-binding motif was found in the vicinity of the binding pocket of PCNA, suggesting that the model is placed properly, close to the best fit. The final model was obtained with the Fit Model in Map tool. No flexible fitting was used, and both the PCNA and PolB crystal structures were fitted as rigid bodies.

Sequence Alignments of Amino Acids Around the Second Contact Site. Sequence alignments of the representative clamp proteins around the second contact site (E171 of PfuPCNA) were constructed. The amino acid sequences of the clamp proteins in the Protein Data Bank (PDB) (1isqA, Pyrococcus furiosus PCNA; 1rwzA, Archaeoglobus fulgidus PCNA; 2ntiABC, Sulfolobus solfataricus PCNA; 1ok7A, Escherichia coli polymerase β subunit; 1vpkA, Thermotoga maritima polymerase β subunit; 2avtA, Streptococcus pyogenes polymerase β subunit; 1b77A, bacteriophage RB69 gp4 protein; 1plqA, Saccharomyces cerevisiae PCNA; 2z0lA, human herpes virus 4 BMRF1 protein; 2zvvA, Arabidopsis thaliana PCNA1; 2zvmA, Homo sapiens PCNA; 3a1jABC, Homo sapiens Rad9A-Hus1-Rad1 complex) were used for a BLAST search against the UniProt database, and the detected close relatives (>20% sequence identity, and >80% mutual coverage with query) were aligned with ClustalW (3–5).
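The primer and template sequences listed under Synthetic DNAs Used for Sample Preparation can be checked for consistency with a short script. The following is a minimal sketch, not part of the original protocol: it assumes each primer anneals flush to the 3′ end of its template, and the helper names (`reverse_complement`, `overhang`) are illustrative. The sequences are copied from the text.

```python
# Illustrative consistency check; helper names are hypothetical, sequences are from the text.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def overhang(template: str, primer: str) -> int:
    """Length (nt) of single-stranded template left after annealing the primer.

    Assumes the primer pairs flush with the template 3' end; raises ValueError otherwise.
    """
    if not reverse_complement(template).startswith(primer):
        raise ValueError("primer is not complementary to the template 3' end")
    return len(template) - len(primer)


TEMPLATE_40 = "GCCTGCACGAATTAAGCAATTCGTAATCATGGTCATAGCT"
TEMPLATE_49 = "AGCTACCAT" + TEMPLATE_40

PRIMED_TEMPLATES = {
    "pri25/49": (TEMPLATE_49, "AGCTATGACCATGATTACGAATTGC"),
    "pri30/40": (TEMPLATE_40, "AGCTATGACCATGATTACGAATTGCTTAAT"),
    "pri25/40": (TEMPLATE_40, "AGCTATGACCATGATTACGAATTGC"),
    "pri19/40": (TEMPLATE_40, "AGCTATGACCATGATTACG"),
}

# Single-stranded template length for each primed-template DNA.
OVERHANGS = {name: overhang(t, p) for name, (t, p) in PRIMED_TEMPLATES.items()}
```

Under this assumption, the names follow a primer-length/template-length pattern, and the single-stranded region is the difference between the two numbers.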

Mayanagi et al. www.pnas.org/cgi/doi/10.1073/pnas.1010933108 1of7

The sequence alignments of the representative archaeal PolBs around the second contact site (379RRLR of PfuPolB) were constructed by the same method. The amino acid sequences of the archaeal DNA polymerase B homologues in the PDB (3a2fA, Pyrococcus furiosus; 1tgoA, Thermococcus gorgonarius; 1s5jA, Sulfolobus solfataricus; 1d5aA, Desulfurococcus sp.) were used for a BLAST search against the UniProt database.

Single Particle Analysis of the Mutant PolB (R379E)-PCNA-DNA Complex. The mutant PfuPolB-PCNA-DNA complex was prepared by the same method as that for the wild-type complex, using R379E mutant PfuPolB and pri30/40. EM images of the mutant complex were recorded with a pixel size of 5.1 Å/pixel. The total number of boxed images used for single particle analyses was 13,447. The 2D class averages were obtained using the refine2d tool of EMAN, assuming 100 classes. The obtained class averages clearly exhibited structural heterogeneity of the complex: although some of the class averages exhibited the closed conformation, very similar to the wild-type complex (Fig. S6A), many class averages exhibited open conformations, in which the PolB is raised because of the disruption of the second contact (Fig. S6B).

Construction of the Atomic Model of the DNA Polymerase B-PCNA-DNA Complex. The in-house program SEARCHCMP was used for constructing the atomic models of the PfuPolB-PfuPCNA-DNA complex. This program searched for pairs of superposable subunits, which might be proteins, nucleic acids, peptides, carbohydrate polymers, or small molecules, from two PDB-formatted coordinate files, and combined the two coordinates by superposing the pair of subunits. This program then scored the multiple superposed structures with an internal scoring function, and output the defined number of models according to the score. In the scoring of models, the coordinate space was divided into a lattice of 2.6-Å intervals. A cell in the lattice was assigned to a subunit and was referred to as an occupied cell if an atom of the subunit was found within the cell. If the cell was also occupied by any other subunits, then it was counted as a clash cell. If an occupied cell was in contact with another occupied cell of another subunit, then it was defined as a contact cell. The score S was obtained with the function $S = F_1 (G_1 - G_2)(N_1 - N_2)/N_1$, where $G_1$ and $G_2$ are the numbers of occupied and clash cells, respectively, $N_1$ is the number of subunits in the complex, and $N_2$ is the number of subunits to which clash cells were assigned. $F_1 = 0$ if there was a subunit that had no contact with other subunits (otherwise $F_1 = 1$). SEARCHCMP was written in C language and is available from the authors upon request.

The atomic models of the PfuPolB-PfuPCNA-DNA complex were constructed from the crystal structures listed in Table S1, as explained below step by step (Fig. S7).
Step 1: 3bep and 3a2f were assembled into stp1.
Step 2: 2po5 and stp1 were assembled into stp2.
Step 3: 2o4i and stp2 were assembled into stp3. The nuclease B-subunit (2o4iB) was deleted.
Step 4: 2vwj and stp3 were assembled into stp4.
Step 5: 1isq was trimerized into stp5 according to the BIOMT matrix.
Step 6: stp4 and stp5 were assembled into stp6. The stp6 model showed that the DNA from 2po5 (yellow in stp6), in which the 3′ terminus of the nascent strand is in the exonuclease active site, made a smaller angle (∼22°) relative to the DNA clamped by the PCNA (red in stp6). On the other hand, the angle was larger (∼43°) for the DNA from 2vwj (magenta in stp6), in which the nascent terminus was in the DNA polymerase active site.
Step 7: DNA polymerase B in stp6 was divided into the C-terminal PIP box (residues 760–770) and the other part (residues 1–759). PCNA, DNA, and PIP box were extracted from stp6 (stp7).
Step 8: The 20bp dsDNA and stp7 were assembled into stp8.
Step 9: DNA polymerase from 3a2f and the DNA from 2vwj were extracted from stp6 and assembled into stp9e.
Step 10: stp8 and stp9e were assembled into stp10e.
Step 11: The hairpin ssDNA of 2vwj was divided into the dsDNA of the template (nucleotides 0–13 of 2vwjB) and nascent (nucleotides 14–27 of 2vwjB) strands. The DNA polymerase-DNA in stp10e was moved relative to the PCNA-PIP box, to bring the residues 759 (C terminus of DNA polymerase without PIP box) and 760 (N terminus of PIP box), and each strand of the dsDNAs from 2vwj and stp8, closer together (stp11e).
Step 12: Then, the DNA polymerase and PIP box, the template DNA strands from 2vwj and stp8, and the nascent DNA strands from 2vwj and stp8 were connected into single chains, respectively. This model (stp12e) was used as the initial model for the editing mode complex.
Step 13: The polymerase mode model was also constructed. The DNA polymerase from 3a2f and the DNA from 2po5 were extracted from stp6 and compiled into stp13p.
Step 14: stp8 and stp13p were assembled into stp14p.
Step 15: The DNA polymerase-DNA in stp14p was moved relative to the PCNA-PIP box to bring the residues 759 (C terminus of DNA polymerase without PIP box) and 760 (N terminus of PIP box), and each strand of the dsDNAs from 2po5 and stp8, closer together (stp15p).
Step 16: Then, the DNA polymerase and PIP box, the template DNA strands from 2po5 and stp8, and the nascent DNA strands from 2po5 and stp8 were connected, respectively. This model (stp16p) was used as the model for the polymerizing mode complex.

Model Refinement with Energy Minimization. The initial model of the editing mode was further refined (Fig. S7 C–E). The DNA sequences in the stp12e and stp16p models were replaced with that used for the EM sample; namely, the pri19/40 DNA mentioned above.

Energy minimization of the models was performed by using the SANDER module of the AMBER 9 program suite (6). The AMBER03 force field was adopted. The simulation was executed in vacuo by using an implicit solvent system (the Born solvent model), with a cutoff distance of 12 Å (7). Each model was energy minimized in 5,000 steps.
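The lattice-based score described for SEARCHCMP, S = F1 (G1 − G2)(N1 − N2)/N1, can be illustrated with a minimal Python sketch. This is not the authors' C implementation. It assumes that G1 counts distinct occupied cells, that "contact" means adjacency within the 26 neighbouring cells (the same cell is not counted), and that coordinates are given in Å; the function names are hypothetical.

```python
from itertools import product
from math import floor

CELL = 2.6  # lattice spacing in Å, as stated in the text
# Assumption: "in contact" means one of the 26 neighbouring cells.
NEIGHBOURS = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]


def cells_of(atoms):
    """Set of lattice cells that contain at least one atom (x, y, z in Å)."""
    return {tuple(floor(c / CELL) for c in xyz) for xyz in atoms}


def score(subunits):
    """Compute S for a dict mapping subunit name -> list of (x, y, z) atoms."""
    occupied = {name: cells_of(atoms) for name, atoms in subunits.items()}

    # Map every occupied cell to the set of subunits found in it.
    owners = {}
    for name, cells in occupied.items():
        for cell in cells:
            owners.setdefault(cell, set()).add(name)

    clash = {cell: names for cell, names in owners.items() if len(names) > 1}
    g1 = len(owners)                                        # occupied cells
    g2 = len(clash)                                         # clash cells
    n1 = len(subunits)                                      # subunits in the complex
    n2 = len(set().union(*clash.values())) if clash else 0  # subunits with clashes

    def has_contact(name):
        # True if any cell of this subunit neighbours a cell of another subunit.
        for i, j, k in occupied[name]:
            for di, dj, dk in NEIGHBOURS:
                if owners.get((i + di, j + dj, k + dk), set()) - {name}:
                    return True
        return False

    f1 = 1 if all(has_contact(name) for name in subunits) else 0
    return f1 * (g1 - g2) * (n1 - n2) / n1
```

With these assumptions, a model scores higher when it fills more cells without overlap, drops toward zero as more subunits are involved in clashes, and is zero whenever any subunit is isolated from the rest.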

1. Ludtke SJ, Baldwin PR, Chiu W (1999) EMAN: Semiautomated software for high-resolution single-particle reconstructions. J Struct Biol 128:82–97.
2. Pettersen EF, et al. (2004) UCSF Chimera: A visualization system for exploratory research and analysis. J Comput Chem 25:1605–1612.
3. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215:403–410.
4. Bairoch A, Boeckmann B, Ferro S, Gasteiger E (2004) Swiss-Prot: Juggling between evolution and stability. Brief Bioinform 5:39–55.
5. Thompson JD, Higgins DG, Gibson TJ (1994) CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22:4673–4680.
6. Duan Y, et al. (2003) A point-charge force field for molecular mechanics simulations of proteins based on condensed-phase quantum mechanical calculations. J Comput Chem 24:1999–2012.
7. Tsui V, Case DA (2000) Theory and applications of the generalized Born solvation model in macromolecular simulations. Biopolymers 56:275–291.

Fig. S1. Sample preparation and the resolution of the structure analysis of the PfuPolB-PCNA-DNA complex. (A) Gel filtration chromatography of the reconstituted PfuPolB-PCNA-DNA complex. Positions of the molecular mass standard markers are indicated on the top of the profiles. (B) The Fourier shell correlation (FSC) curve of the final 3D map. The resolution of the final map was estimated to be 19 Å, using the 0.5 FSC criterion.
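The 0.5 FSC criterion reads the resolution from the spatial frequency at which the FSC curve first falls below 0.5. The following minimal Python sketch shows the idea under stated assumptions: the curve is sampled at increasing spatial frequencies (in 1/Å), the crossing point is found by linear interpolation between neighbouring samples, and the function name `fsc_resolution` is hypothetical. It is not the procedure implemented in EMAN.

```python
def fsc_resolution(freqs, fsc, threshold=0.5):
    """Resolution (Å) at the first crossing of the FSC curve below `threshold`.

    freqs: spatial frequencies in 1/Å, strictly increasing.
    fsc:   FSC values at those frequencies.
    Returns None if the curve never drops below the threshold.
    """
    for i in range(1, len(fsc)):
        if fsc[i - 1] >= threshold > fsc[i]:
            # Linear interpolation between the two samples that bracket the crossing.
            frac = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            f_cross = freqs[i - 1] + frac * (freqs[i] - freqs[i - 1])
            return 1.0 / f_cross
    return None
```

For example, a curve that crosses 0.5 at a spatial frequency of about 0.053 1/Å corresponds to a resolution of about 19 Å.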

Fig. S2. Fitting of the PfuPCNA crystal structure. (A) The PfuPCNA crystal structure (cyan ribbon) was fitted to the hexagonal ring region of the EM map. The crystal structure of Sulfolobus solfataricus (Sso) PCNA (purple ribbon) with the defined structures of the IDCL was also docked to examine the fitting. (B) Another local minimum of the fitting, obtained by an approximately 60° rotation of the crystal structure. Notably, the fitting in A, where the IDCL is placed at the flat edge of the hexagonal ring, exhibited a higher correlation for both the Pfu and Sso PCNA crystal structures than that in B.

Fig. S3. The 2D class averages of the SA-labeled complexes. (A) The 2D class averages (Upper), with the corresponding reprojections (Lower) of the final 3D structure of the PfuPolB-PCNA-DNA complex with the SA label at the dsDNA terminus. The side length of the individual images is 24.5 nm. (B) The 2D class averages (Upper), with the corresponding reprojections (Lower) of the final 3D structure of the complex with the SA label at the ssDNA terminus. The side length of the individual images is 20.4 nm.

Fig. S4. Sequence alignment of the representative clamp proteins around the second contact sites. The sites equivalent to E171 of PfuPCNA (asterisk) are conserved among the clamp proteins from a wide variety of organisms. Residues conserved in more than 80% of the sequences with equivalent amino acids (D = E, N = Q, A = G, S = T, I = L = V, and F = Y) are highlighted. Amino acid sequences are shown only for the b12-b13 β-hairpin (arrows) region. The proteins are identified with domains and species names of organisms, protein names, and PDB codes (with chain ID). Species names are abbreviated as follows: Pfu, Pyrococcus furiosus; Afu, Archaeoglobus fulgidus; Sso, Sulfolobus solfataricus; Eco, Escherichia coli; Tma, Thermotoga maritima; Spy, Streptococcus pyogenes; Sce, Saccharomyces cerevisiae; Ath, Arabidopsis thaliana; Hsa, Homo sapiens. GN and GP bacteria are Gram-negative and Gram-positive bacteria, respectively. DNA pol3b is DNA polymerase III β-subunit. HHV4 stands for human herpes virus 4.
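The highlighting rule in Fig. S4 (conservation in more than 80% of the sequences, with D = E, N = Q, A = G, S = T, I = L = V, and F = Y treated as equivalent) can be sketched in a few lines of Python. This is an illustrative reading of the caption, not the authors' script: it assumes gaps are ignored when counting residues but still count toward the number of sequences, and the function name and toy alignment are hypothetical.

```python
from collections import Counter

# Equivalence groups taken from the figure caption.
EQUIVALENT = [set("DE"), set("NQ"), set("AG"), set("ST"), set("ILV"), set("FY")]
GROUP_OF = {aa: "".join(sorted(group)) for group in EQUIVALENT for aa in group}


def conserved_columns(alignment, cutoff=0.8):
    """Return one boolean per alignment column: True if the most common
    residue class is found in more than `cutoff` of the sequences."""
    n_seqs = len(alignment)
    flags = []
    for column in zip(*alignment):
        # Collapse equivalent residues into one class; skip gap characters.
        counts = Counter(GROUP_OF.get(aa, aa) for aa in column if aa != "-")
        top = counts.most_common(1)[0][1] if counts else 0
        flags.append(top / n_seqs > cutoff)
    return flags


# Toy alignment (hypothetical sequences, for illustration only).
TOY_ALIGNMENT = ["DIK", "EVK", "DLR", "ELR", "DAK", "EIK"]
TOY_FLAGS = conserved_columns(TOY_ALIGNMENT)
```

In the toy alignment, the first column is fully conserved once D and E are merged, the second passes because five of six residues fall in the I = L = V class, and the third fails because K and R are not listed as equivalent.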

Fig. S5. Sequence alignment of the archaeal PolBs around the arginine cluster. The regions of the archaeal PolBs equivalent to the arginine cluster of PfuPolB (asterisks) contain basic residues (shown in boldface). The proteins are identified with domains and species names of organisms, protein names, and PDB codes (with chain ID). The names of species are abbreviated as follows: Pfu, Pyrococcus furiosus; Tgo, Thermococcus gorgonarius; Tli, Thermococcus litoralis; Mvo, Methanococcus voltae; Afu, Archaeoglobus fulgidus; Sso, Sulfolobus solfataricus; Dsp, Desulfurococcus sp.; Ape, Aeropyrum pernix.

Fig. S6. The 2D class averages of the R379E mutant PolB-PCNA-DNA complex. The images are aligned so that the PolB and PCNA parts are in the upper and lower areas, respectively. The side length of the individual images is 18.4 nm. (A) The 2D class averages showing the closed conformation, similar to the wild-type complex. Numbers of images used for averaging were 118, 63, and 105 (from left). (B) The 2D class averages exhibiting the open conformation, due to the release of the second contact. Numbers of images used for averaging were 53, 231, 226, 210, 275, and 137 (from left). (C) PCNA-dependent exonuclease activities affected by mutations at the arginine cluster (379RRLR). The exonuclease activities of the single (R379E), double (R379/382E), and triple (R379/380/382E) PolB mutants were measured, using a linearized pGEM-T plasmid substrate, and were compared with that of the wild type. The size marker DNA (New England Biolabs) was loaded on the left. The gel images obtained by an image analyzer (Typhoon Trio+) are shown in a contrast different from that of Fig. 4A, to highlight the cleaved product bands (indicated by an arrow).

Fig. S7. Modeling process of the ternary complex. The atomic model of the PfuPolB-PfuPCNA-DNA complex was constructed from the known structures in the PDB. (A) The known crystal structures were assembled to prepare an initial compiled model (steps 1 to 6). (B) Models of the editing and the polymerizing modes were constructed by extracting and compiling the appropriate parts of stp6 (steps 7 to 16). (C) The initial model (stp12e) was fit into the EM density. (D) Subunits were manually fit into the density, and the DNA sequences in the stp12e and stp16p models were replaced with that used for the EM sample. (E) The model was further refined through energy minimization.

Table S1. PDB entries used for modeling

Code   Structure description                       Source
3bep   DNA polymerase III β subunit-DNA complex    Escherichia coli
1isq   PCNA                                        Pyrococcus furiosus
3a2f   DNA polymerase-PCNA complex                 Pyrococcus furiosus
2o4i   3′ repair exonuclease 1-DNA complex         Mus musculus
2vwj   DNA polymerase TGO-DNA complex              Thermococcus gorgonarius
2p5o   DNA polymerase GP43-DNA complex             Bacteriophage RB69
-      21bps B-DNA                                 Modeled
