Copyright by Young-Sam Lee 2010

The Dissertation Committee for Young-Sam Lee Certifies that this is the approved version of the following dissertation:

Structural and Functional Studies of the Human Mitochondrial DNA Polymerase

Committee:

Whitney Yin, Supervisor

Ian Molineux

Kenneth Johnson

Tanya Paull

Jon Robertus

Structural and Functional Studies of the Human Mitochondrial DNA Polymerase

by

Young-Sam Lee, B.S., M.S.

Dissertation Presented to the Faculty of the Graduate School of The University of Texas at Austin in Partial Fulfillment of the Requirements for the Degree of

Doctor of Philosophy

The University of Texas at Austin August, 2010

Dedication

For my wife, In-Sook Jung.

Acknowledgements

I would like to thank Dr. Whitney Yin for giving me the chance to work in her lab and for mentoring me through my graduate program. Not only her scientific insights but also the warmth that she showed me and my family encouraged me to pursue my Ph.D. degree in a foreign country. I would also like to thank “a guru of ” Dr. Ian Molineux and “a guru of kinetics” Dr. Kenneth Johnson. Without their critical advice, I would not have accomplished my publication. I hope to become a respected expert in my research field like them. I also remember the friendship and generosity of many current and former Yin lab members: Hey-Ryung Chang, Qingchao “Eric” Meng, Xu Yang, Jeff Knight, Dr. Michio Matsunaga, Dr. He “River” Quan, Taewung Lee, Xin “Ella” Wang, Jamila Momand, and Max Shay. Most of all, I thank my parents for their endless love and support, and my wife, In-Sook Jung, and my son, Jason Seung-Hyeon Lee, who always stood by me with patience during my graduate career. I will continue to try my best to make them proud of me.

Structural and Functional Studies of the Human Mitochondrial DNA Polymerase

Publication No.______

Young-Sam Lee, Ph.D. The University of Texas at Austin, 2010

Supervisor: Y. Whitney Yin

The human mitochondrial DNA polymerase (Pol γ) catalyzes mitochondrial DNA synthesis and is thus essential for the integrity of the organelle. Mutations of Pol γ have been implicated in more than 150 human diseases. Inhibition of Pol γ by anti-HIV drugs is a major source of drug toxicity. To illustrate the structural basis for mtDNA replication and facilitate rational design of antiviral drugs, I have determined the crystal structure of the human Pol γ holoenzyme. The structure reveals a heterotrimeric architecture of the Pol γ holoenzyme, with a monomeric catalytic subunit, Pol γA, and a dimeric factor, Pol γB. While the polymerase and exonuclease domains in Pol γA show high structural homology with the other members of the DNA Pol I family, the spacer between the two functional domains has a unique fold and constitutes the subunit interface. The structure suggests a novel mechanism for Pol γ’s high processivity of DNA replication. Furthermore, the structure reveals dissimilarity in the active sites between Pol γ and HIV RT, thereby indicating an exploitable space for the design of less toxic anti-HIV drugs.

Interestingly, the structure shows an asymmetric subunit interaction; that is, one monomer of dimeric Pol γB primarily participates in interactions with Pol γA. To understand the roles of each Pol γB monomer, I generated a monomeric human Pol γB variant by disrupting the dimeric interface of the subunit. Comparative studies of this variant and dimeric wild-type Pol γB reveal that each monomer in the dimeric Pol γB makes a distinct contribution to processivity: one monomer (proximal to Pol γA) increases DNA binding affinity, whereas the other monomer (distal to Pol γA) enhances the rate of polymerization.

The Pol γ holoenzyme structure also gives a rationale for establishing the genotype-phenotype relationship of many disease-implicated mutations, especially those located outside of the conserved pol or exo domains. Using the structure as a guide, I characterized a substitution of Pol γA residue R232, which is located at the subunit interface but far from either active site. Kinetic analyses reveal that the mutation has no effect on intrinsic Pol γA activity, but causes functional defects in the holoenzyme, including decreased polymerase activity and increased exonuclease activity, as well as reduced discrimination between mismatched and corrected . The results provide a molecular rationale for the mitochondrial diseases mediated by the Pol γA-R232 substitution.

Table of Contents

List of Figures

List of Tables

Chapter 1: Introduction
  Mitochondrial DNA polymerase (Pol γ)
    Enzyme activity of Pol γ
    Accessory subunit Pol γB as a processivity factor for DNA replication
  Pol γ-mutation related mitochondrial disease
  Mitochondrial toxicity by nucleoside analogue reverse transcriptase inhibitors
  Replication Mechanism of human mitochondrial DNA
  Significance of studies in the dissertation

Chapter 2: Cloning, over-expression and purification of the human Pol γ subunits
  Cloning human Pol γ subunits
  Generating a recombinant baculovirus for Pol γA expression in insect cell system
  Overexpression of human Pol γA in Sf9 insect cells
  Human Pol γB overexpression in bacteria
  Purification of Pol γ subunits
  Discussion
    Initial efforts to express active Pol γA in bacteria
    Optimizing overexpression and purification procedure for Pol γA expressed from insect cells

Chapter 3: Crystal structure of the human Pol γ holoenzyme and its functional implications
  Introduction
  Materials and Methods
    Construction of Pol γA and Pol γB
    Polymerization assay
    Limited
    Crystallography
  Results and Discussion
    Structure of the catalytic subunit
    Holoenzyme formation and subunit interface
    Processivity of the holoenzyme
    Distinct mode of substrate binding
    Pol γ mutations and human diseases
    Structural dissimilarities between human Pol γ and HIV reverse transcriptase provides exploitable space for drug design

Chapter 4: Dissecting the processivity function of the dimeric human Pol γB
  Introduction
  Materials and Methods
    Cloning, expression and protein purification
    Analytical ultracentrifugation
    Steady-state Polymerization assay
    Pre-steady state kinetics
    Analytical gel filtration
  Results
    Construction and Preparation of Pol γB variants
    Oligomerization of Pol γB variants
    Effects of Pol γB dimerization on processive DNA synthesis
    Pre-steady State Kinetics analysis of Pol γB variants
    Effect of Pol γA on Pol γB dimerization
    DNA-dependent subunit interaction
  Discussion
    Contribution factors to processivity
    Distinct roles of each Pol γB monomer in processivity
    Pol γA promotes Pol γB dimerization

Chapter 5: Characterization of R232 of human mitochondrial DNA polymerase: The residue linked to mitochondrial disease controls balance between polymerization and activities
  Introduction
  Materials and Methods
    Cloning, expression, and protein purification
    Polymerization kinetics assays
    Pre-steady state exonuclease assay
  Results
    Pol γA variants construction and mutation
    Interaction between Pol γA R232 variants and the accessory subunit
    DNA synthesis activity by Pol γA R232 variants
    Effect of Pol γA R232 mutation on processivity
    Effects of the R232 mutation on exonuclease activity
    Mismatch removal by Pol γA and holoenzyme
    Hydrolysis of a correct base-pair
  Discussion
    Substitutions of Pol γA R232 alter both polymerase and exonuclease activities of holoenzyme
    Pol γA R232 senses the conformation of primer-template for selective exo activity
    R232 and primer strand transfer
    Pol γA R232 substitution and human diseases

Summary

References

Vita

List of Figures

Figure 2.1 Scheme of cloning human Pol γA into pBacPAK9 shuttle vector for baculovirus generation
Figure 2.2 Evaluation of plaque-purified baculovirus for Pol γ overexpression in insect cells
Figure 2.3 Overexpression of human Pol γ subunits
Figure 2.4 Purification of human Pol γ subunits
Figure 2.5 Attempt to obtain soluble Pol γA in bacteria
Figure 2.6 In-gel activity assay to detect polymerization activity of Pol γA
Figure 2.7 Attempt to isolate functional domains of Pol γA after trypsin digestion

Figure 3.1 Polymerization activities of Pol γA (ΔN-exo-) and Pol γB (ΔI4)
Figure 3.2 Structure of human Pol γ holoenzyme. (A) Structure of Pol γA
Figure 3.3 Formation of Pol γ holoenzyme analyzed by gel filtration chromatography
Figure 3.4 The major Pol γ subunit interfaces
Figure 3.5 DNA activity assay of Pol γA-L-helix mutant
Figure 3.6 Superposition of Pol γA and T7 DNAP shows their similarity in the active sites
Figure 3.7 Comparison of a modeled Pol γ-DNA complex with that of the T7 DNAP-DNA complex
Figure 3.8 Mapping Pol γB deletion on the Pol γ holoenzyme structure
Figure 3.9 Structure of Pol γB-ΔI4 dimer
Figure 3.10 Pol γA charge distribution
Figure 3.11 Pol γA mutational analysis
Figure 3.12 Structural differences between human Pol γ and HIV reverse transcriptase

Figure 4.1 Structural and bioinformatic basis for construction of Pol γB variants
Figure 4.2 Variant Pol γB oligomeric states
Figure 4.3 Steady-state DNA polymerization assays
Figure 4.4 Time-dependent product formation in pre-steady-state assays
Figure 4.5 Effects of Pol γA and DNA on Pol γB dimerization
Figure 4.6 Subunit interaction between Pol γA and ΔI4-D129K

Figure 5.1 Interaction between Pol γA R232-Pol γB E394 in the Pol γ holoenzyme
Figure 5.2 Steady-state DNA synthesis activities of Pol γA R232 mutants
Figure 5.3 DNA polymerase activities of holoenzymes containing Pol γB E394 variants
Figure 5.4 Pre-steady-state single incorporation
Figure 5.5 Exonuclease activities of Pol γA R232 variants
Figure 5.6 Exonuclease analyses
Figure 5.7 The signaling pathway in the modeled Pol γ holoenzyme-DNA complex

List of Tables

Table 3.1 Statistics of data analysis and structural refinement
Table 3.2 Classification of disease-related Pol γA substitutions

Table 4.1 Dissociation constants measured by analytical ultracentrifugation
Table 4.2 Pre-steady state kinetic parameters for Pol γB variants

Table 5.1 Primer set for the mutagenesis of human Pol γ subunits
Table 5.2 Polymerization activities of Pol γA mutants
Table 5.3 Global fitting analysis for exonuclease activity of Pol γA variants
Table 5.4 Exonuclease activity of Pol γA variants
Table 5.5 Selective exonuclease activities of wild-type and mutant

Chapter 1: Introduction

MITOCHONDRIAL DNA POLYMERASE (POL γ)

DNA replication is an essential process for maintaining genetic integrity. In eukaryotic cells, mitochondria are unique subcellular organelles that contain their own DNA (mitochondrial DNA, mtDNA), distinct from nuclear chromosomal DNA. Mitochondria are the powerhouses of the cell, generating ATP through oxidative phosphorylation (OXPHOS). Because essential protein components for OXPHOS are encoded in mitochondrial as well as chromosomal DNA, the integrity of mtDNA is critical for cellular energy supply and activities. Replication of mtDNA is performed by several nuclear-encoded proteins that form the mitochondrial nucleoid (1). Among them, Pol γ is the sole protein that carries out DNA polymerization for mtDNA replication. Animal Pol γ consists of two subunits, a 140-kDa catalytic subunit, Pol γA, and a 55-kDa accessory subunit, Pol γB.

Enzyme activity of Pol γ

Pol γ is a high-fidelity DNA polymerase with 5′→3′ polymerization activity for DNA synthesis as well as 3′→5′ exonucleolytic activity for proofreading; the catalytic residues for both reactions are in the catalytic subunit, Pol γA. Although Pol γA by itself exhibits a salt-sensitive polymerization activity, forming a complex with Pol γB creates a Pol γ holoenzyme that is more salt resistant over a range of 0~200 mM salt (2). The rate of polymerization also increases from 9 s⁻¹ to 45 s⁻¹ when Pol γA forms a complex with Pol γB (3). Interestingly, the enzyme has a broad substrate spectrum, using not only DNA but also RNA as a template; it is therefore thought to have reverse transcriptase activity (4). Although the specificity constant for incorporation of dNTPs in RNA-dependent DNA polymerization by Pol γ is 25~100-fold lower than that in DNA-dependent DNA polymerization (5), the reverse transcriptase activity may be important to Pol γ, because mtDNA is known to contain ribonucleotides (6), estimated at no fewer than 30 per molecule of rat liver mtDNA (7).

The exonuclease activity of Pol γ contributes to the fidelity of mtDNA replication by hydrolyzing nucleotides misincorporated during DNA synthesis. The conserved negatively charged residues in the catalytic core of the exo domain (D198 and E200) are known to be essential for the activity (8). By kinetic measurements, exonuclease-deficient (E200A) Pol γ makes one error per 280,000 base pairs copied (8), and the exonuclease activity increases the fidelity by about 200-fold (9).
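As a rough illustration of what these two numbers imply together, the sketch below multiplies the error frequency of the exonuclease-deficient enzyme by the proofreading gain. It assumes that the two contributions combine multiplicatively and that the 16.6 kbp size quoted for human mtDNA in this chapter applies; the function names are illustrative and do not come from the cited studies.

```python
def bases_per_error(exo_minus_bases_per_error: int, proofreading_fold: int) -> int:
    """Bases copied per error with proofreading, assuming a multiplicative model."""
    return exo_minus_bases_per_error * proofreading_fold


def expected_errors_per_copy(genome_bp: int, bases_per_err: int) -> float:
    """Expected number of errors when a genome of genome_bp is copied once."""
    return genome_bp / bases_per_err


if __name__ == "__main__":
    # Values taken from the text: 1 error per 280,000 bp (exo-), ~200-fold gain.
    overall = bases_per_error(280_000, 200)
    # 16,600 bp is the approximate size of human mtDNA quoted in this chapter.
    per_copy = expected_errors_per_copy(16_600, overall)
    print(f"about 1 error per {overall:.1e} bp copied")
    print(f"about {per_copy:.1e} expected errors per mtDNA copy")
```

Under these assumptions the proofreading enzyme would make roughly one error per 5.6 × 10⁷ base pairs, which is far less than one error per mtDNA copy.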

Pol γ also has 5′-deoxyribose phosphate (dRP) lyase activity that is important for DNA repair (10). The enzyme removes 5′-dRP groups generated by the activity of glycosylases and abasic endonucleases during the base excision repair process. However, because the enzyme-dRP intermediate is very stable, Pol γ appears to have a much slower turnover rate than Pol β, which is specialized for the repair process, suggesting that Pol γ might serve the repair function at low frequency (10). Some enzymes that lack 3′→5′ exonuclease activity (exo-), such as or 3′→5′ exo- E. coli DNA polymerase I, can extend the 3′-terminus of blunt-ended DNA by one nucleotide, using exclusively dAMP, in a non-template-dependent manner (11). Exo- Pol γ variants also appear to carry out non-templated nucleotide addition, especially with dATP, although the reaction efficiency is low (Y.-S. Lee and Y.W. Yin, unpublished observation). The terminal transferase activity of Pol γ might not be relevant in vivo because of the exonuclease activity of the wild-type enzyme, but it might be interesting to study Pol γ in relation to other types of template-independent nucleotide addition, such as the incorporation of nucleotides opposite abasic sites by many translesion synthesis polymerases (12).

Based on comparison of amino acid sequences with other DNA polymerases, Pol γA contains the common motifs of a polymerase (pol) and an exonuclease (exo) domain; these are shared among the DNA Pol I family (or family A polymerases), including E. coli DNA Pol I and T7 DNA polymerase (gp5) (13). The N-terminal exo domain and the C-terminal pol domain are linked by a spacer that has no sequence homology with other members of the family (14). Biochemical studies showed that deletion or mutation of some residues in the spacer affects polymerization and/or Pol γB interaction (15). In T7 DNA polymerase, this region is involved in interacting with its processivity factor, thioredoxin, and in expanding the interaction area with DNA (16).

Accessory subunit Pol γB as a processivity factor for DNA replication

Many replicative DNA polymerases require a processivity factor for efficient DNA replication because their catalytic subunits alone can incorporate only a few nucleotides per DNA binding event. This is likely due to their rapid dissociation from template DNA (17). Although Pol γA displays high processivity (~100 nt synthesized per binding event) relative to other catalytic subunits of DNA polymerases (18), Pol γB serves as a processivity factor for efficient mtDNA replication with Pol γA. The processivity mechanism of Pol γB is distinct from that of other processivity factors. Mammalian Pol γB is structurally homologous to class II aminoacyl-tRNA synthetases (19,20), and bears no resemblance to other processivity factors, which are either toroidal, like PCNA for eukaryotic DNA polymerases δ and ε (21), or small globular proteins, such as thioredoxin for T7 DNA polymerase (16). Functionally as well, Pol γB not only increases the DNA binding affinity of the replicase (2), but also enhances the polymerization rate of the Pol γ holoenzyme (3); this is unique among processivity factors, which usually only increase affinity for DNA.
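To make the two routes to higher processivity concrete, the following sketch uses a common simplification in which processivity is approximated as the ratio of the polymerization rate to the rate of dissociation from DNA. This model, the back-calculated dissociation rate, and the 10-fold affinity gain are assumptions for illustration only; just the 9 s⁻¹ and 45 s⁻¹ rates and the ~100 nt figure come from the text, and they may not have been measured under identical conditions.

```python
def processivity(k_pol: float, k_off: float) -> float:
    """Approximate nucleotides added per binding event as k_pol / k_off."""
    if k_off <= 0:
        raise ValueError("k_off must be positive")
    return k_pol / k_off


# Rates quoted in the text (s^-1).
K_POL_SUBUNIT = 9.0       # Pol γA alone
K_POL_HOLOENZYME = 45.0   # Pol γA in complex with Pol γB

# Assumption: back-calculate k_off for Pol γA alone from ~100 nt per binding event.
k_off_subunit = K_POL_SUBUNIT / 100.0

# Hypothetical fold decrease in k_off on holoenzyme formation (not a measured value).
HYPOTHETICAL_AFFINITY_GAIN = 10.0

if __name__ == "__main__":
    base = processivity(K_POL_SUBUNIT, k_off_subunit)
    rate_only = processivity(K_POL_HOLOENZYME, k_off_subunit)
    both = processivity(K_POL_HOLOENZYME, k_off_subunit / HYPOTHETICAL_AFFINITY_GAIN)
    print(f"Pol γA alone: ~{base:.0f} nt")
    print(f"faster polymerization only: ~{rate_only:.0f} nt")
    print(f"faster polymerization and tighter binding: ~{both:.0f} nt")
```

In this simplified picture, a faster polymerization rate and slower dissociation each raise processivity independently, which is why a factor that improves both is notable.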

Although it has no known enzymatic activity by itself, Pol γB binds nonspecifically to double-stranded DNA (dsDNA) longer than 45 bp (22). This dsDNA-binding property is not necessary for stimulation of the polymerization rate by the accessory subunit, but it is required for DNA replication at the mtDNA replication fork in coordination with the mtDNA helicase (23).

Pol γB shows different oligomeric states across species. The Drosophila ortholog is proposed to be a monomer that forms a heterodimeric AB complex in the holoenzyme (24); however, Xenopus and mammalian Pol γB are dimeric (19,25) and form a heterotrimeric holoenzyme with an AB2 structure (26). Interestingly, human Pol γA forms an active holoenzyme even with mouse or Xenopus Pol γB (25), suggesting that the subunit interface involved in holoenzyme formation is highly conserved in these species.

POL γ-MUTATION RELATED MITOCHONDRIAL DISEASE

Maintenance of mtDNA is critical to sustain mitochondrial function. Because Pol γ is the sole DNA polymerase in mitochondria, mutations in Pol γ can impair mtDNA replication, with further deleterious effects on cellular activities. Animal and clinical studies show that mutations in the catalytic subunit Pol γA are associated with a number of diverse mitochondrial disorders, including progressive external ophthalmoplegia (PEO; a disorder characterized by slow paralysis of the external eye muscles and exercise intolerance), Alpers’ syndrome (a fatal childhood disease leading to brain and liver failure), sensory ataxia, neuropathy (27), and reduced lifespan (28).

To correlate human diseases with mutations in Pol γ, several Pol γ mutants have been characterized to explain the disease mechanism at the molecular level. For example, the Y955C substitution in Pol γA is associated with autosomal dominant PEO (29). The substitution reduces polymerization activity and also lowers discrimination against incorporation of the oxidatively damaged substrate 8-oxo-dGTP and against synthesis on a damaged DNA template containing 8-oxo-dG (30). Y955 in the pol domain is highly conserved among members of the DNA Pol I family.

Although interpreting the functional defects of mutations affecting the active sites, such as Y955C, can be aided by analyzing the corresponding sites in T7 DNA polymerase (31), understanding the effect of mutations located outside of the active sites is more difficult because of a lack of structural information on Pol γ. For example, the variants A467T and W748S are commonly found in patients with mitochondrial disorders; they are located in the Pol γ-unique spacer region between the pol and exo domains. The A467T substitution is the most common Pol γ mutation found in human populations, and is associated with autosomal recessive PEO and Alpers’ syndrome (32). The efficiency of nucleotide addition by the A467T mutant Pol γA is reduced more than 20-fold, and its DNA binding affinity is reduced 2-fold, relative to the wild-type enzyme (33). Furthermore, the mutation weakens the subunit interaction and reduces processivity (33). The W748S mutation frequently occurs in cis with an E1143G substitution and is associated with several muscular and neurological diseases (32,34). Biochemical studies demonstrated that the E1143G mutant alone has no functional defect, whereas the W748S mutation by itself reduces DNA binding affinity but still permits holoenzyme formation (35). Interestingly, the E1143G mutation in cis partially rescued the functional defect of the W748S substitution, suggesting that W748S is responsible for the human disease related to the W748S-E1143G substitution (35).

MITOCHONDRIAL TOXICITY BY NUCLEOSIDE ANALOGUE REVERSE TRANSCRIPTASE INHIBITORS

Nucleoside analogue reverse transcriptase inhibitors (NRTIs) are one of the major classes of antiviral drugs for the treatment of HIV infection (36). These drugs include AZT (3′-azidothymidine), the first compound approved by the United States Food and Drug Administration for the treatment of HIV infection. AZT functions as a DNA chain terminator. When incorporated into the viral genome by HIV reverse transcriptase (HIV-RT), it terminates viral DNA replication (37). However, clinical and biological studies have shown that long-term treatment with NRTIs causes mitochondrial toxicities, including cardiac dysfunction, hepatic failure, skeletal myopathy, and lactic acidosis, along with defective mtDNA replication and mtDNA depletion (38). The toxicities are associated with the inhibitory effect of NRTIs on mtDNA synthesis by human Pol γ (39). Interestingly, the enzyme’s susceptibility to NRTIs is much higher than that of any nuclear DNA polymerase (40). The toxic effects of the drugs correlate with the kinetics of their incorporation by Pol γ. In other words, drugs showing higher clinical toxicity (ddC, d4T, and ddA) display a higher rate of incorporation and a lower rate of excision by Pol γ. As a result, they are practically indistinguishable from the regular substrate nucleotides (dNTPs). Interestingly, drug toxicity is not always correlated with efficacy; for example, AZT and carbovir are relatively effective but exhibit much lower toxicity (41). This suggests that there are differences between the human adverse target (Pol γ) and the viral target protein (HIV-RT), and that there is an exploitable space for more selective antiviral drug design. Comparative structural studies of these two proteins will provide this valuable information at atomic resolution. However, while many structures of HIV-RT complexed with NRTIs have been determined, the structure of human Pol γ was not available when I started this study. The lack of such knowledge severely hinders efforts at rational drug design.

REPLICATION MECHANISM OF HUMAN MITOCHONDRIAL DNA

Human mtDNA is a double-stranded circular molecule of 16.6 kbp encoding 13 polypeptides involved in oxidative phosphorylation, two ribosomal RNAs, and 22 transfer RNAs (42). In contrast with the strict regulation of nuclear DNA copy number, each mitochondrion is estimated to contain 2~20 copies of mtDNA (43), and the overall amount of mtDNA in each cell varies depending on the tissue or cell type and on its level of energy demand (44). How mtDNA is replicated by a is still being debated. Two mechanisms of mtDNA synthesis have been proposed: a conventional synchronous mode, in which leading- and lagging-strand synthesis occur simultaneously (45), and an asymmetric displacement mode, in which synthesis of one strand (H strand) of mtDNA initiates from the OH origin, followed by synthesis of the other strand (L strand) from a second origin, OL, located 11 kbp away from OH, after the newly synthesized H-strand DNA has crossed OL (46). Recently, along with the conventional synchronous model, a RITOLS (RNA incorporation throughout the lagging strand) intermediate was also proposed, based on the finding of extensive RNA/DNA hybrids in replicating mtDNA (47).

In order to replicate chromosomal DNA, DNA polymerases coordinate with other replisome machineries including , single-stranded binding proteins, and (48). As for the replisome for , human mtDNA are thought to be composed of at least three proteins: Pol γ for DNA replication, mtDNA helicase for unwinding the DNA, and mitochondrial single-stranded DNA binding proteins (mtSSBs) for wrapping the unwound single-stranded mtDNA during replication (49).

The mitochondrial DNA helicase (named “TWINKLE”) catalyzes ATP-dependent duplex DNA unwinding with 5′ to 3′ directionality (50). The enzyme has high sequence similarity to T7 gene 4 protein (gp4), which has both DNA helicase and primase activity (51). However, the mtDNA helicase lacks the conserved cysteine residues critical for primase activity (52) and possesses only helicase activity. Electron microscopy and solution studies reveal that the mtDNA helicase forms a hexameric or heptameric ring-shaped structure like the hexameric T7 gp4 (53). The helicase is required for replicating a duplex DNA template, because Pol γ alone cannot melt dsDNA. However, the helicase alone cannot unwind long stretches of dsDNA efficiently (49). This stimulatory effect implies an interaction between Pol γ and the mtDNA helicase. Because the helicase, like Pol γ, is indispensable for mtDNA replication, it is not surprising that mutations of the mtDNA helicase are also associated with mtDNA instability and mitochondrial dysfunction (51).

Mitochondrial SSB protein (mtSSB) is a 13~15 kDa polypeptide with sequence similarity to the N-terminal two thirds of E. coli SSB (54). Like its E. coli homolog, mtSSB is a tetramer in solution and binds to DNA cooperatively (55). The protein prevents re-annealing of the displaced DNA template during replication, and thereby stimulates the catalytic activity of the mtDNA helicase as well as that of Pol γ (49). Interestingly, the stimulatory effect on helicase activity cannot be obtained by substituting E. coli SSB, which may indicate a direct interaction between mtSSB and the mtDNA helicase (50). Little is known about disease-related mtSSB mutations; however, in yeast, knockout of the homologous gene (RIM1) causes loss of mitochondrial DNA (56), suggesting that mtSSB is an essential component of mtDNA replication.

By contrast, which polypeptide provides primase activity in mitochondria is still unclear. Alternatively, in vitro studies show that transcripts made by mitochondrial RNA polymerase (POLRMT) can be used as RNA primers at both OH (57) and OL (58) to initiate DNA replication.

SIGNIFICANCE OF STUDIES IN THE DISSERTATION

The goal of this study is to provide a structural understanding of the high processivity of Pol γ and of the mechanism by which the subunits cooperatively regulate enzyme activities. I embarked on structural and functional studies of Pol γ replication to provide a molecular explanation of the disease-related Pol γ mutations and the basis for anti-HIV inhibitor toxicity as an aid to rational antiviral drug design. It has been a long wait for the atomic structure of the human Pol γ holoenzyme described in this dissertation. Based on the detailed architecture of the protein, with supporting biochemical results also described in the dissertation, I will explain the mechanism of processive DNA polymerization by Pol γ and give a structure-based interpretation of mitochondrial diseases associated with Pol γ mutations. In addition, I will provide a structural analysis of the dissimilarity of the active sites of human Pol γ and HIV-RT, which could be an initial foundation for designing novel NRTIs with improved specificity.

Chapter 2: Cloning, over-expression and purification of the human Pol γ subunits

CLONING HUMAN POL γ SUBUNITS

Full-length cDNA clones of the human Pol γ subunits were gifts from Dr. Kenneth Johnson. The cloning scheme for Pol γA is summarized in Figure 2.1. In detail, the Pol γA gene was amplified by PCR using one primer containing an EcoR I site (5′-GAGGAATTCGATGGTCCCCGCGTCCGACCCC-3′) and a methionine codon immediately before the codon for the 30th residue (valine) of the gene, and another containing the codon for the 1239th residue (proline) followed by a Not I site (5′-GGATGCGGCCGCTGGTCCAGGCTGGCTTCGTTTTTC-3′). The amplified products were initially inserted into the pET22b(+) vector (Novagen) between the EcoR I and Not I sites to add a hexa-histidine tag at the carboxyl-terminal end of the protein to aid purification. To simplify the subsequent cloning steps, I introduced an additional Bgl II site into the original pET22b(+) immediately after the stop codon of the His-tagged protein. Then, the final open reading frame (ORF) of the Pol γA gene, including the additional His-tag coding sequence at the 3′ end (corresponding amino acids: AAALEHHHHHH), was transferred between the EcoR I and Bgl II sites of the pBacPAK9 shuttle vector (Clontech) to generate baculoviruses containing the Pol γA ORF in their genomic DNA. To delete the 10 glutamine residues encoded by the CAG repeats (ΔQ10 construct) in the N-terminal region of Pol γA, I carried out PCR-based mutagenesis using 5′-GCGGCAGCAGCAGCCTCAGCAGCCGCAAGTG-3′ and 5′-GCTGAGGCTGCTGCTGCCGCCGCCGCTGCCC-3′ as internal primers deleting the CAG repeats.
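The design of the two Pol γA amplification primers quoted above can be checked with a short script. This is only a sketch: the recognition sequences for EcoR I (GAATTC) and Not I (GCGGCCGC) are the standard ones, while the helper names and the assumption that the gene-specific part of the reverse primer begins right after the Not I site are illustrative.

```python
SITES = {"EcoRI": "GAATTC", "NotI": "GCGGCCGC"}

# Primer sequences as quoted in the text (5' to 3').
FWD = "GAGGAATTCGATGGTCCCCGCGTCCGACCCC"
REV = "GGATGCGGCCGCTGGTCCAGGCTGGCTTCGTTTTTC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_site(primer: str, site: str) -> int:
    """Return the 0-based position of a recognition site, or -1 if absent."""
    return primer.find(site)


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    eco = find_site(FWD, SITES["EcoRI"])
    atg = FWD.find("ATG")
    print(f"forward primer: EcoRI at {eco}, ATG at {atg}, next codon {FWD[atg + 3:atg + 6]}")

    noti = find_site(REV, SITES["NotI"])
    # Assumption: everything after the NotI site anneals to the gene.
    coding = reverse_complement(REV[noti + len(SITES['NotI']):])
    print(f"reverse primer: NotI at {noti}, coding-strand 3' end {coding}")
```

In the forward primer the start codon is followed by GTC (valine), and the coding-strand equivalent of the reverse primer ends in CCA (proline), consistent with the residues named in the text.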

In the PCR products for Pol γB cloning, an Nde I site is located immediately upstream of the codon for the 26th residue (aspartate) at the 5′ end, and a Not I site immediately after the codon for the last residue (Val485) at the 3′ end. The amplified products were cloned into the pET22b(+) vector (Novagen) between the Nde I and Not I sites to add a 6×His tag (amino acids AAALEHHHHHH) at the carboxyl terminus of Pol γB.

Figure 2.1 Scheme of cloning human Pol γA gene into pBacPAK9 shuttle vector for baculovirus generation. pET22b(+)* is a plasmid modified from the original pET22b(+) by generating a Bgl II site (AGATCC → AGATCT) after the stop codon for the His-tagged protein.

GENERATING A RECOMBINANT BACULOVIRUS FOR POL γA EXPRESSION IN INSECT CELL SYSTEM

Recombinant baculoviruses containing the human Pol γA gene were generated using the protocol of the Clontech BacPAK baculovirus expression system with minor modifications. Individual plaque-purified baculoviruses were tested for their Pol γA expression potency by infecting 1.7×10⁶ Sf9 insect cells in 2 ml of SF900 medium supplemented with 5% FBS in a 6-well plate (~85% confluence). At 5 days post-infection, the medium (passage one virus stock; P1) in each well was transferred to a separate tube and stored at 4 °C after removing debris, and the infected cells were treated with 200 μl of 1× SDS sample buffer (62.5 mM Tris-HCl, pH 6.8, 2% SDS, 10% glycerol, 50 mM DTT) followed by boiling for 5 min. Pol γA expression in the cells was visualized on an SDS-polyacrylamide gel (a representative result is shown in Figure 2.2).

The medium from the cells with the highest Pol γA expression was used as a seed for amplifying baculoviruses with high Pol γA-expression potency. About 400 μl of passage one virus stock was mixed with 13 ml of 5% FBS SF900 medium and then added to 20×10⁶ Sf9 cells plated in a 150 cm² flask the day before. After 1 hr of incubation, a further 12 ml of the 5% FBS SF900 medium was added to the flask (for a total volume of about 25 ml). The cells were then incubated for about 5 days until most cells were lysed by the amplified virus. The medium (passage two stock; P2) was then cleared by centrifugation at 1,000 × g for 5 min and stored at 4 °C.

OVEREXPRESSION OF HUMAN POL γA IN SF9 INSECT CELLS

In order to obtain a high viral titer for efficient protein induction, the passage two baculoviruses were re-amplified by inoculating 1 ml of the P2 stock into 12 ml of 5% FBS SF900 medium containing a monolayer of 20×10⁶ insect cells in a 150 cm² flask. After a further 13 ml of 5% FBS SF900 medium was added at 1 hr post-inoculation, the cells were cultured for 3 days. Meanwhile, healthy insect cells were seeded in a 500 ml suspension culture at 2~3 × 10⁵ cells/ml and incubated for 3 days. Pol γA expression was induced by transferring all materials (media containing amplified baculoviruses and cells) from the 150 cm² flask into the suspension culture, and the cells were cultured for an additional 3 days (72 hr post-infection) before harvesting. The protease inhibitors E-64 and pepstatin A (1 μg/ml each) were added to the culture immediately before the virus-containing materials. All of the preceding cell culture procedures were performed at 27 °C.

Figure 2.2 Evaluation of plaque-purified baculovirus for Pol γ overexpression in insect cells. Highly expressed Pol γA (arrowhead) in cells infected with recombinant virus from plaque #13 (also #15 and #16) was visualized by SDS-PAGE.

HUMAN POL γB OVEREXPRESSION IN BACTERIA

C-terminally His-tagged Pol γB was expressed in E. coli Rosetta (DE3) (Novagen) in LB. The bacteria containing the Pol γB construct were cultured at 37 °C to a cell density of 0.6 at A600, and the cultures were then cooled by shaking at 4 °C for 10 min. Protein expression was induced with 0.4 mM IPTG, and the culture was subsequently incubated at 25 °C for 6 hr before harvesting.

Figure 2.3 Overexpression of human Pol γ subunits. (left panel) Pol γA expression (arrowhead) in insect cells 3 days after baculovirus infection. (right panel) Pol γB expression (arrowhead) in E. coli 6 hr after IPTG induction.

PURIFICATION OF POL γ SUBUNITS

Proteins were purified as described by Yakubovskaya et al. (26) with modifications, notably to the lysis conditions of insect cells for Pol γA. Instead of sonication, the insect cells expressing Pol γA were thawed and resuspended in 6 volumes of a detergent-based lysis buffer (0.32 M sucrose, 10 mM HEPES (pH 7.5 at 25 °C), 0.5% (v/v) NP-40, 3 mM CaCl2, 2 mM MgAc·4H2O, 0.1 mM EDTA (pH 8.0), 5 mM β-mercaptoethanol, 1 μg/ml pepstatin, 1 μg/ml E-64, 5 μg/ml leupeptin, 0.2 mM PMSF) for 1 hr at 4 °C. The lysate was centrifuged at 1,500 × g for 15 min to remove the insect cell nuclei. The supernatant was adjusted to 0.5 M KCl by slowly adding 1/5 supernatant volume of 3 M KCl buffer (3 M KCl, 20 mM HEPES (pH 8.0), 1 mM EDTA (pH 8.0), 5% glycerol, 5 mM β-mercaptoethanol with the protease inhibitors mentioned previously). After stirring for an additional 10 min, the sample was centrifuged at 31,000 × g for 30 min to remove insoluble debris.
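The salt adjustment above is a simple mixing calculation. The following minimal sketch checks that adding 1/5 volume of 3 M KCl buffer gives 0.5 M KCl; the function name and the example volume are illustrative, and it assumes the cleared lysate contributes no KCl of its own (none is listed in the lysis buffer).

```python
def mixed_concentration(conc_a, vol_a, conc_b, vol_b):
    """Concentration after mixing two solutions; volumes share one unit."""
    return (conc_a * vol_a + conc_b * vol_b) / (vol_a + vol_b)

# Hypothetical supernatant volume; the final molarity does not depend on it.
supernatant_ml = 60.0
kcl_buffer_ml = supernatant_ml / 5  # "1/5 supernatant volume"

# Assumption: the lysate supernatant itself contains no KCl.
final_kcl_molar = mixed_concentration(0.0, supernatant_ml, 3.0, kcl_buffer_ml)
print(f"Final KCl: {final_kcl_molar:.2f} M")
```

Because the added volume is a fixed fraction of the supernatant, the same final concentration is obtained at any scale.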

Bacteria expressing Pol γB were thawed and resuspended in 6 volumes of Pol γB lysis buffer (20 mM HEPES (pH 8.0 at 25 °C), 0.3 M KCl, 5% glycerol, 1 mM EDTA, 5 mM β-mercaptoethanol with 0.1% Triton X-100 and protease inhibitors). After sonication, the cell lysate was centrifuged at 31,000 × g for 30 min, and 10% poly(ethyleneimine) (PEI) (w/v; pH 8.0) was slowly added to the supernatant to a final concentration of 0.1% to precipitate nucleic acids. After incubation for 15 min, the suspension was centrifuged at 31,000 × g for 30 min.

The final supernatant was adjusted to 5 mM imidazole and 5 mM MgCl2 and applied to a Ni-NTA affinity column (Qiagen) equilibrated with Ni-NTA wash buffer (20 mM HEPES (pH 7.5), 0.2 M KCl, 20 mM imidazole (pH 8.0) and 5% glycerol plus protease inhibitors). After washing with 10 column volumes of the wash buffer, most proteins bound to the column were eluted with Ni-NTA elution buffer (20 mM HEPES (pH 7.5), 0.1 M KCl, 200 mM imidazole (pH 8.0) and 5% glycerol plus protease inhibitors).

Peak fractions containing Pol γA or Pol γB from the Ni-NTA column were diluted to reduce the salt concentration for the following ion-exchange chromatography step by slowly adding 110% fraction volume of the dilution buffer (20 mM HEPES (pH 7.5), 5% glycerol, 1 mM EDTA), and were then applied to a 1 ml Source S cation-exchange column equilibrated with a low salt buffer (20 mM HEPES (pH 7.5), 60 mM KCl, 1 mM EDTA (pH 8.0), 5% glycerol, 5 mM β-mercaptoethanol, protease inhibitors) at a flow rate of 1 ml/min. The column was washed with 7 column volumes of the low salt buffer and eluted with a gradient of 0-70% of a high salt buffer (20 mM HEPES (pH 7.5), 700 mM KCl, 1 mM EDTA (pH 8.0), 5% glycerol, 5 mM β-mercaptoethanol, protease inhibitors) over 15 column volumes. The eluted proteins were visualized on an SDS-polyacrylamide gel; fractions containing Pol γ subunits were pooled and applied to a HiLoad 16/60 Superdex 200 size-exclusion column (GE Healthcare) equilibrated with a gel-filtration buffer (20 mM HEPES (pH 7.5), 140 mM KCl, 1 mM EDTA (pH 8.0), 5% glycerol, 5 mM β-mercaptoethanol, 1 mM PMSF) at a flow rate of 1 ml/min. Once the fractions containing Pol γ subunits were identified, they were pooled and concentrated with a Vivaspin 15 concentrator (10 kDa molecular weight cut-off (MWCO), Sartorius AG). All protein purification procedures were performed at 4 °C.
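The salt concentrations implied by the dilution and gradient steps can be estimated with a short calculation. This is a rough sketch, assuming the pooled Ni-NTA fractions have the KCl and imidazole concentrations of the elution buffer (0.1 M and 200 mM) and that the gradient mixes the low and high salt buffers linearly; the function names are illustrative.

```python
def dilute(conc, sample_vol, diluent_vol, diluent_conc=0.0):
    """Concentration after adding diluent to a sample (same volume units)."""
    total = sample_vol + diluent_vol
    return (conc * sample_vol + diluent_conc * diluent_vol) / total

def gradient_concentration(percent_b, conc_a, conc_b):
    """Concentration at a given %B when buffers A and B are mixed linearly."""
    frac_b = percent_b / 100.0
    return (1.0 - frac_b) * conc_a + frac_b * conc_b

# Adding 110% fraction volume of dilution buffer (no KCl, no imidazole).
kcl_after_dilution_mM = dilute(100.0, 1.0, 1.1)
imidazole_after_dilution_mM = dilute(200.0, 1.0, 1.1)

# KCl at the end of the 0-70% gradient (60 mM low salt, 700 mM high salt).
kcl_gradient_end_mM = gradient_concentration(70, 60.0, 700.0)

print(f"KCl after dilution: {kcl_after_dilution_mM:.1f} mM")
print(f"Imidazole after dilution: {imidazole_after_dilution_mM:.1f} mM")
print(f"KCl at 70% B: {kcl_gradient_end_mM:.0f} mM")
```

Under these assumptions the diluted sample is at roughly 48 mM KCl, below the 60 mM of the column equilibration buffer, and the gradient ends near 0.5 M KCl.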


Figure 2.4 Purification of human Pol γ subunits. C-terminally 6×His-tagged Pol γA (left panel) or Pol γB (right panel) was purified on a Ni-NTA column followed by a Source S cation-exchange column.

DISCUSSION

Initial efforts to express active Pol γA in bacteria

Insect cell expression has been widely used to obtain functionally active proteins that originate from eukaryotes, require post-translational modification, or fail to express in active form in bacteria. Active Pol γA has also been obtained from this eukaryotic expression system. However, the system is more costly and more time-consuming than bacterial expression. Although expression of intact Pol γA in bacteria has so far been unsuccessful (26), the bacterial expression system is still worth exploring.

The main problem with Pol γA expression in bacteria has been the formation of insoluble inclusion bodies after translation. After a number of trials, I found conditions that yield a relatively large portion of soluble Pol γA: first, using the pMALC2H10T vector (a gift from Dr. John Tesmer), which encodes an N-terminal 40-kDa maltose binding protein (MBP) followed by a TEV cleavage site; second, inducing protein expression at a lower temperature (27 °C) than the normal bacterial growth temperature (37 °C) (Figure 2.5). However, even after soluble N-terminally MBP-tagged Pol γA was purified from bacteria, the protein showed no polymerization activity at all. In addition, the N-terminal MBP could not be cleaved off by TEV protease. The lack of polymerization activity might be due to misfolding of the protein or to the absence of unknown post-translational modifications that are essential for activity.

Figure 2.5 Attempt to obtain soluble Pol γA in bacteria. MBP tagging at the N-terminus of Pol γA (arrowhead) helps to express the protein in soluble form, although no enzymatic activity could be detected.

I also tried to identify a “mini” Pol γA, the minimal fragment of Pol γA that retains polymerization activity. Pol γA has a distinct C-terminal polymerase (pol) domain identified by sequence similarity with other DNA polymerases. Crystal structures of other polymerases in the Pol I family (Klenow fragment and T7 DNA Pol) revealed that the C-terminal pol domain and the N-terminal exo domain are structurally independent of each other, so that the N-terminal and C-terminal fragments derived from limited proteolytic digestion possess exo and pol activity, respectively (59). However, in an “in-gel” assay, which monitors polymerization after denaturation-then-renaturation of electrophoresed protein as described for T7 DNA polymerase (59), neither the proteolytic fragments of Pol γA nor the full-length protein purified from insect cells (whose activity was tested in solution) showed polymerization activity (Figure 2.6). The failure to detect Pol γA activity in this assay might be due to inappropriate renaturation conditions, that is, the procedure developed for T7 DNA polymerase may not be adaptable to Pol γA.

Figure 2.6 In-gel activity assay to detect polymerization activity of Pol γA after denaturation-then-renaturation. After electrophoresis of 5 μg of protein (intact Pol γA, digested Pol γA or T7 DNA polymerase) in an SDS-polyacrylamide gel containing calf-thymus DNA, followed by a renaturation step, the gel was incubated in buffer containing dNTPs with [α-³²P]dATP and Mg²⁺ to allow the polymerization reaction (59).

In addition, I tried to separate the individual proteolytic fragments on an analytical gel-filtration column (Superdex 200 PC 3.2/30 in the SMART system, Pharmacia). Interestingly, the majority of the proteolytic fragments could not be separated on the column, even when it was equilibrated under protein-denaturing conditions (5 M urea) (Figure 2.7A). Furthermore, trypsin-treated Pol γA co-eluted with Pol γB from the column, indicating that nicked Pol γA is still able to form a holoenzyme (Figure 2.7B). The gel-filtration results imply that the sequence-predicted functional domains, exo and pol, might interact with each other rather than fold independently, or that the spacer between the exo and pol domains connects them not only in sequence but also structurally.

Figure 2.7 Attempt to isolate functional domains of Pol γA after trypsin digestion. (A) After the digestion, Pol γA fragments were applied to an analytical gel-filtration column equilibrated with 5 M urea buffer. The majority of the fragments still did not separate under the protein-denaturing condition. (B) After Pol γA digestion, fragments mixed with Pol γB were applied to the column. Pol γB (arrowhead) co-eluted with nicked Pol γA.

Optimizing overexpression and purification procedure for Pol γA expressed from insect cells

Adding protease inhibitors to prevent proteolysis of over-expressed recombinant Pol γA in insect cells

The baculovirus-based insect cell expression system is widely used to obtain high yields of recombinant eukaryotic proteins. However, the yield of certain proteins is significantly reduced by intracellular degradation in the expression host (60). Proteolyzed recombinant proteins cause heterogeneity even after purification, which can affect not only the functional activity of the protein but also crystallization and/or diffraction quality in X-ray crystallography. The undesirable proteolysis might be due to the action of virus-encoded proteases and/or endogenous insect cell proteases. Several means have been developed to suppress it. A baculoviral DNA (Autographa californica nuclear polyhedrosis virus; AcNPV) lacking a non-essential cysteine protease gene (BacVector-3000, Novagen; (61)) is available to prevent proteolysis by viral gene products. Alternatively, supplementing the culture medium with cell-permeable protease inhibitors such as E-64 (cysteine protease inhibitor) or pepstatin A (aspartic protease inhibitor) successfully inhibits protease activity in insect cells (60). For human Pol γA expression, E-64 and pepstatin A (1 μg/ml each) were added to the culture medium prior to inoculation with baculoviruses, and intact Pol γA was consistently overexpressed in insect cells, although the necessity of the protease inhibitors has not been tested.

Expanding multiplicity of infection (MOI) of baculoviruses for increasing gene delivery efficiency into insect cells

Baculovirus is a rod-shaped double-stranded DNA virus that is used as a gene carrier for ectopic protein expression in insect cells, because it efficiently infects only a few insect species and the recombinant baculovirus vector is considered to have no pathogenic effects on mammalian cells (62). Based on the Poisson distribution of infection and biochemical studies, an MOI of 10 would be necessary to infect most cells synchronously in the primary infection (63,64). Generally, the passage two (P2) viral stock is generated in a volume of < 50 ml with a titer of about 10⁸ pfu (plaque-forming units)/ml (BacPAK Baculovirus expression system manual by Clontech), and at an MOI of 10, about 100 ml of the stock is required to infect 10⁹ cells in a 500 ml culture (2 × 10⁶ cells/ml). In order to maintain a high MOI at the primary infection while conserving the P2 stock, I re-amplified the P2 virus to generate a passage three (P3) equivalent by inoculating 1 ml of the P2 stock onto monolayer cells (20×10⁶ insect cells on a 150 cm² flask), and then transferred the material containing amplified virus into exponentially growing suspension cultures for large-scale recombinant Pol γA production. The P3 virus was generated with the same number of insect cells as the P2 virus, so a viral titer equal to or higher than that of the P2 stock (10⁸ pfu/ml) is expected in 25 ml (the volume of medium in the monolayer culture). Thus, starting with 1 ml of the P2 viral stock, an MOI of 2 or greater can be used for recombinant Pol γA expression, which predicts that more than 85% of cells are infected in the primary infection (fraction of cells infected = 1 − exp(−MOI) (63)). Still, the optimal MOI and the cell harvest time post-infection remain to be quantified for Pol γA production, because a high MOI is toxic to cells and re-amplification may generate mutant viruses, which would cause heterogeneity of the recombinant protein.
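The MOI estimates in this section follow from the titer, the inoculum volume and the cell number, together with the Poisson relation quoted above. A minimal sketch, using the nominal values from the text (10⁸ pfu/ml, 10⁹ cells, 25 ml of re-amplified virus); the function names are illustrative and the titer of the re-amplified stock is the assumed value, not a measurement:

```python
import math

def moi(titer_pfu_per_ml, inoculum_ml, n_cells):
    """Multiplicity of infection: infectious particles per cell."""
    return titer_pfu_per_ml * inoculum_ml / n_cells

def inoculum_for_moi(target_moi, titer_pfu_per_ml, n_cells):
    """Volume of virus stock (ml) needed to reach a target MOI."""
    return target_moi * n_cells / titer_pfu_per_ml

def fraction_infected(moi_value):
    """Poisson estimate: fraction of cells receiving at least one virus."""
    return 1.0 - math.exp(-moi_value)

TITER = 1e8    # pfu/ml, nominal titer assumed in the text
N_CELLS = 1e9  # cells in the 500 ml suspension culture

p2_volume_ml = inoculum_for_moi(10, TITER, N_CELLS)
p3_moi = moi(TITER, 25.0, N_CELLS)

print(f"P2 stock needed for MOI 10: {p2_volume_ml:.0f} ml")
print(f"MOI from 25 ml of re-amplified virus: {p3_moi:.1f}")
print(f"Fraction infected at MOI 2: {fraction_infected(2):.3f}")
```

With these numbers, 25 ml of re-amplified virus at the assumed titer corresponds to an MOI of about 2.5, consistent with the "MOI of 2 or greater" estimate.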

Plaque-purification of baculovirus for isolating one having higher Pol γA-expression potency

In order to generate baculovirus containing the Pol γA gene, the BacPAK baculovirus expression system (Clontech) relies on cellular recombination between a co-transfected baculoviral genome (Bsu36 I-digested BacPAK6 DNA) and a shuttle plasmid DNA containing the Pol γA gene. Because the recombined DNA products encapsulated inside viruses can become heterogeneous during recombination, resulting in different levels of recombinant protein expression, pure viral clones should be isolated separately and screened for expression potency. After isolating individual viruses by a plaque-forming assay, I evaluated the recombinant viruses by detecting synthesis of Pol γA in insect cells. Individual viral clones show different degrees of expression potency as well as infectious titer; it is therefore important to select a highly potent clone to increase the yield of recombinant protein from a large-scale culture.

Deleting poly-glutamine sequence in the N-terminal Pol γA for increasing purification yield of Pol γA

In the N-terminal region of wild-type human Pol γA, there are 13 glutamine residues in a row, encoded by 10 CAG codon repeats followed by CAA and two CAGs. Although shortened CAG repeats were reported to be associated with male infertility (65,66), this remains controversial (67), and the role of the glutamine tract in Pol γ function remains to be studied. In addition, in vitro and in vivo studies showed that deleting the CAG repeat has no effect on polymerization activity or on mitochondrial function (68). In terms of protein folding, polyglutamine expansion is linked to protein aggregation (69), suggesting that the glutamine tract in Pol γ is likely to be disordered, which would affect crystallization of the protein. Empirically, the Pol γA-ΔQ10 mutant (lacking 10 of the 13 glutamine residues in the tract) gave a higher purification yield than wild-type Pol γA, even though the two recombinant proteins did not differ in activity, and only the mutant holoenzyme could be crystallized. These results indicate that deletion of the polyglutamine tract improves the homogeneity of the recombinant protein and reduces the degree of disorder in the N-terminal region of the protein.
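The codon arithmetic for the glutamine tract can be checked directly. The sketch below is illustrative only: it translates the repeat described above (10 CAG, one CAA, two CAG) using a two-entry codon table and counts how many glutamines remain in the ΔQ10 construct; the variable names are hypothetical.

```python
# Both CAG and CAA encode glutamine (Q) in the standard genetic code.
GLN_CODONS = {"CAG": "Q", "CAA": "Q"}

# Repeat as described in the text: 10 x CAG, then CAA, then 2 x CAG.
tract_codons = ["CAG"] * 10 + ["CAA"] + ["CAG"] * 2
tract_peptide = "".join(GLN_CODONS[codon] for codon in tract_codons)

# The dQ10 construct removes 10 of these residues.
remaining_after_dq10 = len(tract_peptide) - 10

print(tract_peptide, len(tract_peptide), remaining_after_dq10)
```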

Optimizing lysis condition of Pol γA overexpressed insect cells to increase purification yield of the protein.

Cell lysis is the first step in recombinant protein purification and one of the key steps in obtaining a maximum yield of protein. For the lysis of insect cells overexpressing Pol γA, other groups have used sonication or Dounce homogenization followed by high salt treatment (26,49). However, both cell disruption methods appear to cause unavoidable leakage of chromosomal DNA from the nuclei into the cell lysate, where it is likely to aggregate with the overexpressed Pol γA, a DNA binding protein, and thereby reduce the purification yield. In order to prevent DNA leakage from nuclei, that is, to keep the nuclear membrane intact, I used a detergent-based cell lysis method with a non-ionic detergent (0.5% NP-40) under iso-osmotic conditions (0.32 M sucrose). This insect cell lysis condition is adapted from a typical cell fractionation procedure for obtaining cytoplasmic and nuclear proteins separately by disrupting only plasma membranes but not nuclear membranes (originally from http://www.pitt.edu/~lazo/Cytoplasmic Nuclear Protein Extraction.html). After cell resuspension in the lysis buffer, nuclei containing chromosomal DNA were easily removed by low-speed centrifugation (1,500 × g for 15 min), so that Pol γA was separated from the DNA at the outset and protected from non-specific aggregation. Sonication generates heat, which can damage proteins, and Dounce homogenization, which depends on the physical force applied by the investigator, may yield cell lysate of inconsistent quality. In comparison, the detergent-based method does not need mechanical force to disrupt cells and can therefore be expected to give consistent results without reducing protein yield at the cell lysis step. The insect cell lysis method used for Pol γA may be applicable to the purification of other DNA binding proteins expressed in eukaryotic cells.

Chapter 3: Crystal structure of the human Pol γ holoenzyme and its functional implications

(The research in this chapter was originally published in Cell; Young-Sam Lee, W. Dexter Kennedy, Y. Whitney Yin., (2009) Cell, 139:312-324 © Elsevier)

INTRODUCTION

The mitochondrial DNA polymerase, Pol γ, is solely responsible for DNA replication and repair in mitochondria. Mitochondrial DNA (mtDNA) encodes a subset of proteins of the oxidative phosphorylation pathway, which is critical for supplying energy. Consequently, Pol γ is indispensable not only for maintaining the genetic integrity of mtDNA but also for cell viability. In particular, many mutations in human Pol γ have been linked to neurological or muscular diseases such as progressive external ophthalmoplegia (PEO) and Alpers’ syndrome. Moreover, human Pol γ is known to be an adverse target of nucleoside reverse transcriptase inhibitors (NRTIs), which are designed to target HIV reverse transcriptase. The high susceptibility of Pol γ to inhibition by these drugs causes mitochondrial dysfunction due to the deletion/depletion of mtDNA.

In this chapter, I describe a crystal structure of the human Pol γ holoenzyme. The holoenzyme is composed of one catalytic subunit, Pol γA, and a dimeric accessory subunit, Pol γB. The structure provides a molecular mechanism for processive DNA polymerization through the interaction between the two subunits. In addition, the structural information on human Pol γ can reveal the molecular basis of human diseases related to Pol γ mutations and provide a rationale for the design of novel NRTIs with reduced mitochondrial toxicity.

MATERIALS AND METHODS

Construction of Pol γA and Pol γB.

The exonuclease-deficient Pol γA(ΔN-exo−), which carries the D198A/E200A substitutions, also lacks the predicted mitochondrial localization sequence (residues 1-29) and ten of the 13 sequential glutamines (residues 43-52) that are unique to the human enzyme. The Pol γA(ΔN-exo−) gene was cloned into a baculovirus genome using the shuttle vector pBacPak9 (Clontech), and the protein, containing a C-terminal His-tag, was expressed in Sf9 insect cells. The mutants L549N, L552N and K553A were constructed on the plasmid pET22b(+) containing the Pol γA(ΔN-exo−) gene using the QuikChange site-directed mutagenesis kit (Stratagene). Pol γA-ΔL was constructed by substituting residues 542-557 with the dipeptide Gly-Gly. The mutants were cloned into the baculovirus genome, and the corresponding proteins were expressed and purified as for Pol γA(ΔN-exo−). The deletion mutant of the accessory protein Pol γB (Pol γB-ΔI4) was constructed by substituting residues 147-179 with the dipeptide Gly-Gly (19). The Pol γB-ΔI4 gene was cloned into vector pET22b(+), and the C-terminally His-tagged protein was expressed in E. coli Rosetta (DE3) (Novagen).

Selenomethionine-substituted Pol γA was produced by a variation of a standard procedure (70). Briefly, the growth medium (5% FBS in SF-900 SFM; Invitrogen) was exchanged 12 hr post-infection for a methionine-deficient medium containing 5% dialyzed FBS; L-(+)-Se-Met (Acros) was added to a final concentration of 50 mg/L 19 hr post-infection. Proteins were purified by sequential application to Ni-NTA, SOURCE S and Superdex 200 columns (26). The holoenzyme was formed by combining Pol γA and Pol γB monomer at a 1:2 molar ratio, and purified by gel filtration using Superdex 200 (Figure 3.3).

Polymerization assay

The control Pol γA protein contains a single change (E200A) in the exo active site but retains full polymerization activity (9), and is designated Pol γA-wt*; the control Pol γB is wild-type. Pol γA(ΔN-exo−) and Pol γB-ΔI4 are the mutant proteins used for structure determination. The substrate was single-stranded M13mp18 DNA annealed to a 26 nt primer (5’ GGA TTA TTT ACA TTG GCA GAT TCA CC 3’). The reaction mixture contained 50 nM Pol γA, 100 nM Pol γB, and 50 nM primer/template DNA in 20 μL reaction buffer (10 mM HEPES (pH 7.5), 80 mM KCl, 12.5 mM NaCl, 50 μg/ml BSA and 3 mM β-mercaptoethanol) and was pre-incubated at 37 °C for 5 min. Reactions were initiated by the addition of MgCl2 (10 mM) and dNTPs (50 μM each of dGTP, dATP and dTTP, and 5 μM [α-³²P]dCTP) and incubated at 37 °C for 10 min. The reactions were stopped by adding 1% SDS, 20 mM EDTA and 0.1 mg/ml proteinase K, and incubating at 42 °C for 30 min. The reaction mixtures were then applied to Micro Bio-Spin 6 columns (Bio-Rad) to remove free nucleotides. Samples were heat-denatured at 95 °C for 5 min in gel loading buffer (70% formamide, 1× TBE, 100 mM EDTA) and analyzed on a 6% polyacrylamide / 7 M urea gel. The products of the reactions were visualized by autoradiography.
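The molar amounts in each reaction follow directly from the stated concentrations and the 20 μL volume. A minimal sketch, assuming the 20 μL refers to the pre-incubation mixture and ignoring the small volume added with MgCl2 and dNTPs; the helper name is illustrative.

```python
def picomoles(conc_nM, volume_uL):
    """Amount in pmol: nM x uL gives fmol, so divide by 1000."""
    return conc_nM * volume_uL / 1000.0

REACTION_VOLUME_UL = 20.0

pol_ga_pmol = picomoles(50.0, REACTION_VOLUME_UL)   # catalytic subunit
pol_gb_pmol = picomoles(100.0, REACTION_VOLUME_UL)  # accessory subunit (monomer)
dna_pmol = picomoles(50.0, REACTION_VOLUME_UL)      # primer/template

print(f"Pol gamma A: {pol_ga_pmol:.1f} pmol")
print(f"Pol gamma B: {pol_gb_pmol:.1f} pmol")
print(f"Primer/template: {dna_pmol:.1f} pmol")
print(f"B:A molar ratio: {pol_gb_pmol / pol_ga_pmol:.0f}:1")
```

The 2:1 ratio of Pol γB monomer to Pol γA matches the stoichiometry used to form the holoenzyme, and the enzyme is equimolar with the primer/template.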

Limited proteolysis

Experiments were conducted with 20 μg of purified Pol γA, Pol γB-ΔI4, or holoenzyme in 20 μl of reaction buffer (20 mM HEPES, pH 7.5, 140 mM KCl, 1 mM EDTA, 5 mM β-mercaptoethanol) with or without an equimolar amount of 25/45-mer DNA, to which trypsin (0.1 μg) was added. Samples were incubated on ice for 3 min and treated with an equal volume of 2× SDS sample buffer (125 mM Tris-HCl, pH 6.8, 4% SDS, 20% glycerol, 100 mM DTT) to stop the digestion reaction. 4.5 μg of each reaction product was separated by SDS-PAGE. Western blot analyses were performed using anti-His antibody to aid identification of proteolytic fragments.

Crystallography

Crystals of the Pol γ holoenzyme were grown by the hanging-drop method at 20 °C, using 2-3 mg/mL of the protein complex against a well solution containing 5.5-7% PEG 8000, 100 mM NaH2PO4, and 100 mM ACES (pH 7.0). Osmium derivatives were prepared by soaking crystals in mother liquor containing 3 mM K2OsO4 for 7 hr. Prior to freezing in liquid nitrogen, crystals were transferred into solutions with stepwise increasing concentrations of glycerol up to 20%. Crystals of Pol γB-ΔI4 were grown by the hanging-drop method at 4 °C using 10-15 mg/mL protein and a well solution containing 100 mM Tris-HCl, pH 7-7.5, 100 mM KCl, 6-8% PEG 8000, and 30% glycerol. The crystals were directly flash-frozen in liquid nitrogen.

Single-wavelength anomalous diffraction (SAD) data for the Se and Os derivatives were collected at Advanced Photon Source ID-19. All data sets were processed using the program HKL (71). Se atoms (30 out of 34 total) and Os atoms (8) were located by the anomalous difference Fourier method using phases obtained from molecular replacement with apo Pol γB as a search model. Initial phases (with a figure-of-merit of 0.56) were calculated using combined phases from Se-SAD, Os-SAD and molecular replacement using the program CNS (72). Density modification using Solomon in the CCP4 suite (73) was applied to the initial phases; the procedure and B-factor sharpening drastically improved the quality of electron density maps.

The diffraction data of the holoenzyme were initially indexed in the hexagonal space group P3₂21 containing one complex per asymmetric unit (asu). Although the electron density map was readily interpretable, the resulting atomic structure could only be refined to a high Rfree (~49%). After careful examination, we reassigned the diffraction data to space group P3₂ with two copies per asu. In the new space group, diffraction intensity analysis indicated that the diffraction data were partially twinned, with a twinning operator (h, -h-k, -l) and a twinning factor of 0.46. Refinement was subsequently carried out in space group P3₂, utilizing the detwinning procedure in the program CNS.
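For orientation, the textbook algebra behind detwinning of a two-domain merohedral twin can be sketched as follows. This is not a description of the CNS implementation used here, which is not detailed in the text; the function names and example intensities are hypothetical. Each observed intensity is modeled as a weighted sum of the true intensities of a reflection and its twin-related mate, with twin fraction α.

```python
def twin(i_true_1, i_true_2, alpha):
    """Observed intensities of a twin-related pair for twin fraction alpha."""
    obs_1 = (1.0 - alpha) * i_true_1 + alpha * i_true_2
    obs_2 = alpha * i_true_1 + (1.0 - alpha) * i_true_2
    return obs_1, obs_2

def detwin(i_obs_1, i_obs_2, alpha):
    """Invert the twinning equations; undefined for a perfect twin (alpha = 0.5)."""
    denom = 1.0 - 2.0 * alpha
    if abs(denom) < 1e-6:
        raise ValueError("cannot detwin algebraically when alpha is 0.5")
    true_1 = ((1.0 - alpha) * i_obs_1 - alpha * i_obs_2) / denom
    true_2 = ((1.0 - alpha) * i_obs_2 - alpha * i_obs_1) / denom
    return true_1, true_2

ALPHA = 0.46  # twinning factor reported in the text

# Hypothetical true intensities for a reflection and its twin mate.
obs_1, obs_2 = twin(100.0, 40.0, ALPHA)
recovered_1, recovered_2 = detwin(obs_1, obs_2, ALPHA)
print(obs_1, obs_2, recovered_1, recovered_2)
```

With α = 0.46 the denominator 1 − 2α is only 0.08, so measurement errors are strongly amplified by this direct inversion, which is one reason twinned data are often handled within refinement rather than by detwinning the intensities beforehand.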

Pol γB-ΔI4 was crystallized, and its structure was determined by molecular replacement using wild-type human Pol γB as a search model with the program AMoRe (74) and refined with Refmac (75).

RESULTS AND DISCUSSION

Structure of the catalytic subunit

Successful crystallization of the Pol γ holoenzyme required altered forms of both subunits. An exonuclease-deficient mutant of Pol γA was crystallized with the deletion mutant Pol γB-ΔI4, which lacks a four-helix bundle at the dimer interface (19,26). The DNA polymerase activity of the enzymes used for crystallization is comparable to the activity of the exo− holoenzyme containing wild-type Pol γB (Figure 3.1) (3,26). The structure of Pol γ was determined to 3.2 Å resolution. Phases were calculated by combining those from single-wavelength anomalous diffraction, using selenomethionine-substituted Pol γA and osmium derivatives of the holoenzyme, with those from molecular replacement using human Pol γB (20) as a search model. Density modification applied to the initial combined phases significantly improved the quality of electron density maps (Figure 3.2B); the structure was refined to an Rfactor of 28.4% and an Rfree of 30.3%. Statistics for data collection and refinement are shown in Table 3.1.


Figure 3.1 Polymerization activities of Pol γA(ΔN-exo−) and Pol γB-ΔI4. Pol γA-wt* (as a positive control) is the E200A exo mutant that possesses wild-type polymerase activity.


Table 3.1 Statistics of data analysis and structural refinement

                                   Pol γ holoenzyme                                                              Pol γB-ΔI4
                                   Native                   Se-Met derivative         K2OsO4 soaked             Native
Data collection
  Resolution (Å)                   50.0 - 3.2               50.0 - 4.0                50.0 - 4.0                50.0 - 3.3
  Wavelength (Å)                   0.979                    0.979                     1.140                     1.140
  Space group                      P3₂                      P3₂                       P3₂                       P4₁22
  Cell dimensions
    a, b, c (Å)                    138.4, 138.4, 226.4      138.94, 138.94, 227.35    139.25, 139.25, 227.70    64.42, 64.42, 260.64
    α, β, γ (°)                    90.0, 90.0, 120.0        90.0, 90.0, 120.0         90.0, 90.0, 120.0         90.0, 90.0, 90.0
  Unique reflections               76667                    41728                     41530                     11917
  Completeness (%) b               100 (100)                96.3 (82.4)               94.8 (87.9)               99.2 (94.7)
  Redundancy                       5.7 (5.7)                3.8 (3.7)                 3.9 (3.7)                 8.2 (3.8)
  Rlinear a                        0.09 (1.00)              0.072 (0.658)             0.070 (0.622)             0.167 (0.511)
  I/σI                             21.5 (2.3)               19.5 (3.0)                23.3 (1.8)                5.8 (2.0)
  SAD FOM / density modification
    on combined phases             0.72                     0.35                      0.40
Refinement
  Rwork (%) c                      28.4                                                                         25.7
  Rfree (%) d                      30.3                                                                         29.4
  No. amino acids                  1850                                                                         358
  RMS deviations from ideal values
    Bond (Å)                       0.0108                                                                       0.0090
    Angle (°)                      1.57                                                                         1.71

a $R_{\mathrm{linear}} = \sum |I_i - \langle I \rangle| / \sum I_i$, where $I_i$ is the $i$th measurement and $\langle I \rangle$ is the weighted mean of all measurements of $I$.
b Values in parentheses are for the highest resolution shell.
c $R_{\mathrm{work}} = \sum_{hkl} \left| |F_{\mathrm{obs}}(hkl)| - |F_{\mathrm{calc}}(hkl)| \right| / \sum_{hkl} |F_{\mathrm{obs}}(hkl)|$ for reflections in the working data set.
d Rfree is the same as Rwork for 5% of data randomly omitted from refinement.
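The crystallographic R-factor reported in Table 3.1 can be illustrated with a few lines of code. This is a minimal sketch of the standard definition, using made-up structure factor amplitudes; the function name and values are hypothetical and are not taken from the refinement programs used in this work.

```python
def r_factor(f_obs, f_calc):
    """Standard R-factor: sum of | |Fobs| - |Fcalc| | over sum of |Fobs|."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator

# Hypothetical amplitudes for three reflections.
example_f_obs = [100.0, 50.0, 25.0]
example_f_calc = [90.0, 55.0, 25.0]
example_r = r_factor(example_f_obs, example_f_calc)
print(f"R = {example_r:.3f}")
```

Rwork applies this sum to the reflections used in refinement, while Rfree applies the same formula to the small set of reflections (5% here) withheld from refinement.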

The catalytic subunit Pol γA contains domains for exonuclease (exo) and polymerase (pol) activities separated by a linker, or spacer. Pol γA adopts the canonical polymerase ‘right-hand’ configuration, with ‘fingers’, ‘palm’ and ‘thumb’ subdomains that bind template DNA and the substrate nucleoside triphosphate and catalyze phosphodiester bond formation (Figure 3.2A). The conserved aspartic acids D890 and D1135 are located in the palm at positions consistent with their known roles in catalysis.

Although the overall fold of Pol γA confirms its classification as a member of the Pol I family, many features of Pol γA are clearly absent in the other enzymes. Most obviously, Pol γA possesses a large spacer domain (~400 residues) between the exo and pol domains. In light of the atomic structure, it was necessary to modify the earlier sequence homology-based domain definition of Pol γA, because a portion of the originally assigned spacer is actually the thumb subdomain (Figure 3.2).

The spacer domain is spatially far from the exo and pol domains and connects to them only through the long helices of the thumb subdomain. The spacer has two obvious subdomains, a globular IP subdomain (Intrinsic Processivity, residues 475-510 and 571-785) and an extended AID subdomain (Accessory Interacting Determinant, residues 511-570) that reaches more than 50 Å away from the main body of Pol γA (Figure 3.2). We will show that the IP subdomain explains the intrinsic processivity of Pol γA, and that the AID subdomain forms an important interface with Pol γB that is essential for the increased processivity of the holoenzyme. A homology search against structures in the Protein Data Bank yields a Z-score of ~0.2, suggesting that the spacer domain has a novel fold (76).


Figure 3.2 Structure of human Pol γ holoenzyme. (A) Structure of Pol γA. The pol domain shows a canonical ‘right-hand’ configuration with thumb (green), palm (red) and fingers (blue) subdomains, and the exo domain (grey). The spacer domain (orange) presents a unique structure and is divided into two subdomains. Domains are shown in a linear form where the N-terminal domain contains residues 1-170; exo: 171-440; spacer: 476-785; pol: 441-475 and 786-1239. All figures are made with PyMOL. (B) A portion of the experimental map from SAD phasing and density modification. (C and D) Structure of the heterotrimeric Pol γ holoenzyme containing one catalytic subunit Pol γA (orange) and the proximal (green) and distal (blue) monomers of Pol γB. Pol γA primarily interacts with the proximal monomer of the dimeric Pol γB.

Holoenzyme formation and subunit interface

In agreement with solution studies (Figure 3.3), the crystal form of Pol γ is a heterotrimer containing one catalytic Pol γA subunit (135 kDa) and a dimeric Pol γB-ΔI4 (2 × 50 kDa), with a subunit contact area of ~3500 Å². The deleted helical bundle in Pol γB is distant from the subunit interface and is not involved in the subunit interaction with Pol γA (Figure 3.8).

Figure 3.3 Formation of Pol γ holoenzyme analyzed by gel filtration chromatography. Holoenzyme was formed by mixing Pol γA and Pol γB-ΔI4 monomer at a 1:2 molar ratio. Elution profiles of purified Pol γA (135 kDa; red curve) and Pol γB-ΔI4 dimer (100 kDa; blue curve) on a Superdex 200 gel filtration column are superimposed on that of the holoenzyme (235 kDa; purple curve).
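The masses quoted for the gel filtration analysis, and the protein amounts needed for a 1:2 molar mixture, follow from the subunit molecular weights. A minimal sketch using the nominal masses given in the caption (135 kDa for Pol γA, 50 kDa per Pol γB-ΔI4 monomer); the function names and the 1 mg example are illustrative.

```python
POL_GA_KDA = 135.0  # catalytic subunit, nominal mass from the text
POL_GB_KDA = 50.0   # accessory subunit monomer, nominal mass from the text

def complex_mass_kda(n_a=1, n_b=2):
    """Mass of a complex with n_a Pol gamma A and n_b Pol gamma B monomers."""
    return n_a * POL_GA_KDA + n_b * POL_GB_KDA

def pol_gb_mass_for(pol_ga_mg, b_per_a=2):
    """Mass of Pol gamma B (mg) giving b_per_a monomers per Pol gamma A."""
    # mg divided by kDa gives a quantity proportional to moles.
    relative_moles_a = pol_ga_mg / POL_GA_KDA
    return relative_moles_a * b_per_a * POL_GB_KDA

heterotrimer_kda = complex_mass_kda(1, 2)
pol_gb_mg = pol_gb_mass_for(1.0)

print(f"Heterotrimer: {heterotrimer_kda:.0f} kDa")
print(f"Pol gamma B needed per 1 mg Pol gamma A: {pol_gb_mg:.2f} mg")
```

On a mass basis, a 1:2 molar ratio therefore corresponds to roughly 0.74 mg of Pol γB-ΔI4 per mg of Pol γA.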

The trimeric holoenzyme shows unequal subunit interactions: Pol γA primarily interacts with only one monomer of the Pol γB dimer (Figure 3.2C, D). An electron cryo-microscopic analysis of Pol γ at ~17 Å resolution came to a similar conclusion (77), although the subunit interface, modeled onto the crystal structure of Pol γB, appears different from that found in the crystal structure of the holoenzyme. The asymmetrical interactions of Pol γA with the proximal monomer of the Pol γB dimer suggest that a monomeric accessory subunit could be fully functional, which is indeed the case for Drosophila Pol γ (24).

An asymmetric heterotrimer raises the question of whether the human holoenzyme could also be an A2B2 tetramer, which could position two polymerases at a replication fork, a necessary requirement for coupling leading and lagging strand DNA synthesis. We thus modeled a tetrameric enzyme, adding a second Pol γA to the heterotrimer by following the symmetry operator constraining the Pol γB dimer. The modeled tetramer reveals steric clashes between the AID subdomains of the two Pol γA molecules that preclude formation of an actual A2B2 tetramer. Although this result could be used as support for the displacement model of mtDNA replication (78), it should be noted that a trimeric holoenzyme does not rule out other mechanisms for positioning two polymerases at a replication fork (7).

The only contact regions between Pol γA and the distal Pol γB monomer are a salt bridge (2.8 Å) between R232 of Pol γA and E394 of Pol γB (Figure 3.4E), and a weak van der Waals contact (5.3 Å) between Pol γA Q540 and Pol γB R122 (Figure 3.4G). An R232G substitution, together with T251I and P587L, has been reported in a child with PEO and hepatic failure (79). Healthy siblings of this patient carried T251I and P587L, suggesting that the R232G substitution is associated with disease. This clinical case suggests that either R232 is critical for Pol γA activity or that the contact between Pol γA and the distal Pol γB monomer is important for human holoenzyme function.

In contrast to its limited interaction with the distal monomer, Pol γA makes extensive interactions with the proximal Pol γB monomer (Figure 3.4B). Examination of the subunit interface shows two major areas of hydrophilic interactions: between Pol γB R264, K373 and D459 and the Pol γA thumb domain area (E454-D469, and R579) (Figure 3.4A), and between Pol γB (D253 and D277) and Pol γA (K1198, R1208 and R1209). In addition, hydrophobic interactions occur between a Pol γB hydrophobic core (V398-L406, V441-L455) in the C-terminal region and the Pol γA AID subdomain L-helix (V543-L558) (Figure 3.4C). The Pol γA AID subdomain causes the steric clash in the modeled A2B2 tetramer; in the absence of stabilizing hydrophobic forces for a second Pol γA monomer, the holoenzyme is therefore heterotrimeric. In turn, the modeling suggests that the hydrophobic interface is dominant in subunit interaction.

To test this idea, we made four L-helix mutants: L549N and L552N, which reduce hydrophobicity with only minimal structural alteration, a complete deletion (ΔL), and K553A (Figure 3.4C-D). The latter change nullifies the electrostatic interaction between Pol γA K553 and E404 of Pol γB. In the absence of Pol γB, all mutants exhibited activities comparable to wild-type Pol γA, demonstrating that the alterations do not disrupt the active site (Figure 3.5). The L-helix is therefore not directly involved in DNA-binding, in agreement with our structural observation that the entire AID subdomain, which includes the L-helix, is connected to the body of Pol γA by flexible linkers and would likely be disordered in the absence of Pol γB.

However, the addition of Pol γB reveals a very different outcome. In 90 mM salt, the Pol γA ΔL deletion not only severely lowers the stimulation by Pol γB, but it also reduces the length of product DNA (Figure 3.5). In contrast, the L-helix missense mutants had little effect. At higher ionic strength (190 mM salt), wild-type Pol γA is inactive but retains considerable activity when complexed to Pol γB. However, holoenzymes containing Pol γA L549N, L552N, or the ΔL mutant are completely inactive in high salt. This salt-dependent reduction of activity strongly suggests that the mutations disrupt the hydrophobic interactions between Pol γA and Pol γB. Interestingly, holoenzyme containing K553A appeared as active as wild-type; simply disrupting one electrostatic interaction between Pol γA and Pol γB therefore has only a minor effect when the hydrophobic interactions in this region are preserved. These data provide strong support for our conclusion that hydrophobic interactions between the Pol γA L-helix and the C-terminal domain of Pol γB are the dominant attractive forces that stabilize the AID subdomain so that it can support processive DNA synthesis by the holoenzyme.

Although there is low overall sequence similarity in the Pol γA L-helix, residues involved in the hydrophobic interaction with Pol γB are conserved and are predicted to be α-helical in the mouse, Drosophila and Xenopus proteins (Figure 3.4D). The hydrophobic residues on the respective Pol γB proteins are also conserved. Conservation of interacting residues in both subunits suggests that all these mitochondrial holoenzymes likely possess a common subunit interface that involves an AID subdomain. Interestingly, the spacer containing AID is not only missing in the non-processive DNA polymerases, it is also largely absent in fungal Pol γA. Perhaps not coincidentally, these enzymes also lack a Pol γB-type processivity factor. It seems likely that the ancestor of human Pol γA first acquired a spacer domain, which then allowed a Pol γB-like protein to interact, and the interacting domains subsequently co-evolved to increase the processivity of synthesis.


Figure 3.4 The major Pol γ subunit interfaces. Panels A-C: Pol γA-Pol γB proximal monomer interactions; the distal monomer is omitted for clarity. (A) Charge-charge interactions between the thumb domain of Pol γA and the C-terminal domain of Pol γB. (B) L-shaped support between Pol γA and the proximal monomer of Pol γB. (C) Hydrophobic interactions between the L-helix of Pol γA and a hydrophobic core of Pol γB. Mutated residues L549, L552 and K553 are shown. (D) Sequence alignments of residues involved in hydrophobic interactions between Pol γA and Pol γB. Panels E-G: Pol γA-Pol γB distal monomer interactions; the proximal monomer is omitted for clarity. (E) The salt bridge (2.8 Å) between Pol γA R232 and the distal Pol γB E394. (F) Pol γA-Pol γB distal monomer. (G) The weak van der Waals interaction (5.3 Å) between Pol γA and the distal Pol γB monomer.


Figure 3.5 DNA synthesis activity assay of Pol γA L-helix mutants. (A) DNA synthesis activities of Pol γA mutants ΔL, L549N, L552N, and K553A were assayed without or with Pol γB at different ionic strengths. Lo and Hi denote 90 mM and 190 mM NaCl, respectively. Denatured products were separated by electrophoresis on an acrylamide gel. (B) The purity of the wild-type and mutant Pol γA proteins is shown after SDS-PAGE.

Processivity of the holoenzyme

The ability to catalyze processive synthesis is essential for replisomal complexes. However, most replicases have little processivity by themselves, generally synthesizing 15 nt or fewer per primer-binding event. Pol γA is somewhat exceptional in that it can synthesize ≥100 nt (3,18). However, when bound to their accessory proteins all replicases exhibit high processivity, synthesizing thousands of nucleotides without dissociation (80,81).

To begin to understand the mechanism of Pol γ processivity, we modeled a Pol γ-DNA complex by docking the primer-template DNA from the T7 DNAP-DNA complex (82) onto the Pol γ holoenzyme after superimposing the active site domains of the two polymerases. Despite strong circumstantial evidence for a bacterial origin of mitochondria, the catalytic subunit Pol γA is more closely related to bacteriophage T7 DNAP than to any bacterial replicase. The two active site domains show high similarity and superimpose with an rmsd of 2.3 Å (Figure 3.6), which enables modeling an enzyme-DNA complex with confidence.

Figure 3.6 Superposition of Pol γA and T7 DNAP shows their similarity in the active sites.

The docked DNA is cradled by a positively charged channel formed by the thumb, palm, and fingers of Pol γA (Figure 3.7, 3.10); this pol domain makes contact with ~10 bp of template DNA that includes the primer terminus. In the Pol γ holoenzyme, the hydrophobic interaction between Pol γB and the L-helix of the AID subdomain exposes a surface on Pol γA containing a high density of positively charged residues (496KQKKAKKVKK505, termed the K-tract; Figure 3.7B, 3.10). The K-tract interacts with the negatively charged phosphodiester backbone of DNA upstream of that bound in the pol domain, thus increasing the contact of holoenzyme with DNA to ~25 bp (Figure 3.10). The modeled complex reveals no direct contact between Pol γB and primer-template DNA, suggesting that the increased DNA-binding affinity conferred on the holoenzyme by Pol γB is mediated entirely through Pol γA. In support of this conclusion, weakening the hydrophobic interaction between the AID subdomain and Pol γB, which likely causes additional flexibility of the K-tract, reduces activity and processivity of holoenzyme (Figure 3.5). This model now provides a structural basis for the known increased affinity of holoenzyme, relative to Pol γA, for DNA.

Further evidence supporting the model comes from limited proteolysis of the holoenzyme with and without primer-template DNA. Comparison of protease digestion patterns suggests that several regions of Pol γ are protected by DNA (Figure 3.7A). Because both Pol γA and Pol γB are His-tagged at their C-termini, Western blot analyses using anti-His antibody aided identification of proteolytic fragments. Digestion of the catalytic subunit Pol γA generated three major C-terminal fragments with apparent molecular masses of 105 kDa, 77 kDa, and 56 kDa, and a minor fragment of 84 kDa. The 77 kDa C-terminal fragment is absent when Pol γB is present, suggesting that the region around residue 560, corresponding to the AID subdomain L-helix of Pol γA, is involved in subunit interactions. When holoenzyme is bound to DNA, the intensity of the 84 kDa fragment is significantly reduced, suggesting that it contains a DNA-binding site. The DNA-protected region lies near residue 500, corresponding to the K-tract, which is in good agreement with the modeled Pol γ-DNA complex. Other differentially protease-sensitive bands are apparent in Figure 3.7A, but the protected regions cannot be unequivocally identified.

Although Pol γA and T7 DNAP’s catalytic subunit, gene 5 protein (gp5), have similar active site domains (Figure 3.6, 3.7), simply referring to Pol γA as a T7-like DNAP is only partially correct. Pol γA is more processive than T7 gp5 in the absence of an accessory protein (18,83), and the corresponding processivity factors are different in structure and function. E. coli thioredoxin, the processivity factor for gp5, is a 104 aa monomeric protein whereas the dimeric Pol γB contains 970 aa. Functionally, the two accessory proteins use different mechanisms for processivity enhancement: thioredoxin increases T7 DNAP affinity for primer/template DNA by decreasing koff without affecting the rate of nucleotide incorporation, thereby effectively prolonging the time of each binding event (84). In contrast, Pol γB does not alter koff but accelerates the polymerization rate, thereby increasing the number of nucleotides incorporated per binding event (3).
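The two routes to higher processivity described above can be illustrated numerically. The following is a minimal sketch that assumes the common simplification that the average number of nucleotides incorporated per binding event is approximately the polymerization rate divided by the dissociation rate (kpol/koff). The rate constants are hypothetical values chosen for illustration only; they are not measurements for T7 DNAP or Pol γ, and the function name is likewise an assumption.

```python
def nucleotides_per_binding_event(k_pol: float, k_off: float) -> float:
    """Approximate processivity as k_pol / k_off (both in s^-1).

    This ratio is a simplifying assumption, not a fit to experimental data.
    """
    if k_pol < 0 or k_off <= 0:
        raise ValueError("k_pol must be >= 0 and k_off must be > 0")
    return k_pol / k_off


# Hypothetical catalytic subunit alone (illustrative numbers only).
alone = nucleotides_per_binding_event(k_pol=20.0, k_off=0.5)

# Thioredoxin-like mechanism: k_off decreases, k_pol is unchanged.
slower_release = nucleotides_per_binding_event(k_pol=20.0, k_off=0.125)

# Pol γB-like mechanism as described above: k_pol increases, k_off is unchanged.
faster_synthesis = nucleotides_per_binding_event(k_pol=80.0, k_off=0.5)

print(alone, slower_release, faster_synthesis)
```

With these illustrative numbers, a four-fold decrease in koff and a four-fold increase in kpol give the same gain in processivity, although only the first prolongs the lifetime of the enzyme-DNA complex.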

The model suggests that the relatively high processivity of Pol γΑ in the absence of Pol γB can be attributed to the IP subdomain of the spacer, which provides a binding site for the upstream primer-template DNA duplex (Figure 3.7B). An IP subdomain is not found in T7 gp5 (Figure 3.7C), precluding significant DNA synthesis in the absence of thioredoxin. The modeled DNA-holoenzyme complex can also fully explain the remarkable ability of Pol γΒ to increase polymerase and decrease exonuclease activity simultaneously. Binding of Pol γΒ to Pol γΑ causes the primer terminus to be preferentially bound in the pol rather than the exo site, probably because less DNA bending is required.


Figure 3.7 Comparison of a modeled Pol γ-DNA complex with that of the T7 DNAP-DNA complex. (A) Limited proteolysis of Pol γ visualized after SDS-PAGE by Western blot or by Coomassie blue staining. (B) Modeled Pol γ-DNA complex containing Pol γA (shown in ribbons), Pol γB (grey CPK) and a docked DNA (blue ribbons) shows that IP and AID subdomains enhance DNA-binding. Mutations and the region protected by DNA from proteolytic digestion (black arrow) are indicated. (C) Crystal structure of T7 DNAP-DNA complex containing gp5 (ribbons), thioredoxin (grey CPK) and a primer-template DNA (blue ribbons).

Furthermore, Pol γB may function beyond processivity enhancement and play a role in replisome assembly. Despite not contacting DNA directly in the modeled Pol γ-DNA complex, Pol γB is able to bind DNA. This activity, however, appears important only for DNA synthesis on duplex templates. Changing the positively charged residues 363RKK365 and, separately, 328RK329 to alanines abolishes DNA-binding activity (23). 363RKK365 is part of the I7 loop (residues 356-369) (19) that contains several positively charged residues; the corresponding region in threonyl-tRNA synthetase (structurally homologous to Pol γB) is involved in tRNA binding. Deletion of I7 abolishes Pol γB DNA-binding (19). Nonetheless, mutant Pol γB-containing holoenzyme retains normal activity in copying single-stranded DNA (23), but is defective in replicating duplex DNA in the presence of SSB and helicase. These data are consistent with the substitutions lying distant from the primer-template channel (Figure 3.8).

Figure 3.8 Mapping Pol γB deletion mutants on the Pol γ holoenzyme structure. Deleted regions I5 (lacking residues 216-231), I6 (lacking residues 324-332), and I7 (lacking residues 356-369) are colored red and distinguished by subscripts denoting proximal and distal monomers. I7d is disordered in the structure. The deleted region of I4 (lacking residues 146-180) is in black. Locations of mutants K463A/H467A, S271A as well as 363RKK365 and 328RK329 are indicated by the wild type residues in CPK.

Structurally, the I7 region in both monomers is disordered in the apo Pol γB structure, but that in the proximal monomer becomes ordered in the holoenzyme structure. To assess whether the distal monomer remains disordered because of the asymmetrical interaction between the Pol γB dimer and Pol γA or because of the loss of the four-helical bundle in the Pol γB-ΔI4 mutant, we crystallized Pol γB-ΔI4 and determined its structure to 3.3 Å resolution. Deletion of the helical bundle changed the crystal packing from that of the wild-type, but the structure still remains a perfect dimer, formed by two monomers related by a 2-fold crystallographic axis (Figure 3.9). Aside from the deleted helical bundle, the structure of Pol γB-ΔI4 is essentially identical to that of the wild-type protein. As in the wild-type protein, the two I7 regions are disordered in Pol γB-ΔI4, indicating that the differential folding of the Pol γB loops in the holoenzyme is not a function of the deletion. Most likely, the ordering of I7 in holoenzyme is a direct consequence of the Pol γB proximal monomer interacting with Pol γA.

Figure 3.9 Structure of Pol γB-ΔI4 dimer (blue) superimposed on that of wild type Pol γB (purple); the four-helical bundle, which comprises about 40% of the Pol γB dimer interface, is deleted in Pol γB-ΔI4.

Although we do not have sufficient data to model the replisome, the electrostatic surface potential of Pol γA is informative. As expected, the putative DNA-binding channel is lined with positively charged residues, but the opposite surface of the protein presents a large negatively charged region near the exo domain, and the tip of the AID subdomain also contains four sequential glutamates (535EEEE538, E-tract; Figure 3.10). The human mitochondrial helicase, Twinkle, has a highly positively charged C-terminal region that could contact one of these regions. If the interaction is through the negatively charged E-tract in the replisome, Twinkle would be positioned close to Pol γB residues 363RKK365 and 328RK329, which are important in Twinkle-dependent replication of duplex DNA.

Figure 3.10 Pol γA charge distribution. The electrostatic surface potential of Pol γA is shown in two views with positively charged regions highlighted in blue and negatively charged regions in red. A primer-template DNA duplex is modeled using black sticks.
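The charge character of the K-tract and E-tract can be made concrete by counting charged side chains. The following is a minimal sketch under a deliberately crude assumption: K and R count as +1, D and E as -1, and all other residues (including histidine) as neutral. The sequences and residue numbers are those quoted in the text; the function and variable names are hypothetical.

```python
POSITIVE = set("KR")  # assumed fully protonated
NEGATIVE = set("DE")  # assumed fully deprotonated


def net_charge(sequence: str) -> int:
    """Crude net charge of a peptide: +1 per K/R, -1 per D/E, 0 otherwise."""
    return sum((aa in POSITIVE) - (aa in NEGATIVE) for aa in sequence.upper())


# (first residue, last residue, sequence) as given in the text.
TRACTS = {
    "K-tract": (496, 505, "KQKKAKKVKK"),
    "E-tract": (535, 538, "EEEE"),
}

for name, (start, end, seq) in TRACTS.items():
    # The quoted numbering should span exactly as many residues as the sequence.
    assert end - start + 1 == len(seq)
    print(f"{name} ({start}-{end}): net charge {net_charge(seq):+d} over {len(seq)} residues")
```

Under this simple counting rule the K-tract carries seven positive charges in ten residues, while the E-tract is uniformly negative, consistent with the opposite roles proposed for the two surfaces.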

Distinct mode of substrate binding

Polymerases are classic examples of enzymes using the induced-fit mechanism to achieve substrate specificity. The apo form of most DNA Pol I family members adopts an ‘open’ conformation, where catalytically important residues on the fingers domain lie some distance away from the palm active site residues. After DNA-binding, the fingers domain undergoes structural changes to the ‘closed’ conformation, enabling the enzyme to align the important residues for catalysis and substrate selection. The desolvation effect generated by the conformational changes further enhances substrate specificity (85,86). This mechanism is utilized by all high-fidelity polymerases and is apparently absent in error-prone polymerases.

Interestingly, the fingers domain of apo Pol γA directly abuts the position where the primer terminus will be bound, and apo Pol γ thus presents a partially ‘closed’ conformation. Nevertheless, the catalytically important residues on the fingers domain are still too far away for catalysis; a rotational conformational change is still necessary after DNA-binding to position the catalytic residues correctly. The configuration of the Pol γA active site suggests that the conformational change in the fingers domain is coaxial with the duplex DNA, whereas it is perpendicular in other DNAPs.

Apo Pol γA further differs from other DNAPs in the active site by containing a small subdomain (residues 1050-1095) that partially blocks the DNA-binding channel. This type of subdomain has not been described before in other DNAPs but the apo form of phage N4 virion miniRNAP contains a similar subdomain that rotates out of the channel following DNA-binding (87). If Pol γ indeed undergoes different conformational changes than other DNAPs when binding DNA, it may then use different mechanisms to ensure replication fidelity. These differences could reflect the high susceptibility of Pol γ to NRTIs.

Pol γ mutations and human diseases

The critical functions of Pol γ in mtDNA synthesis may, in part, rationalize the diversity and progressive effects of Pol γ mutations in degenerative human disorders. Many severe human diseases have been correlated with mutations affecting Pol γ (http://tools.niehs.nih.gov/polg/), and several mutant proteins have been characterized. Using the Pol γ structure we can now begin to rationalize the effects of some of these substitutions. Previous mutational analyses using the T7 DNAP structure as a model have successfully explained mutations predicted to affect the active site (31), and as we have shown here the active sites of the two proteins are homologous. However, mutations located in the spacer region, as originally defined by sequence alignment with E. coli Pol I, show diverse biochemical behaviors. The structure of Pol γ now enables us to use a structure-based domain definition and distinct subdomain functions to re-analyze the effects of these mutations.

We divided all reported disease-associated Pol γ mutations into three classes (Table 3.2). Class I contains active site mutations that all result in reduced catalytic activity; class II includes substitutions located in the putative DNA-binding channel, thus reducing DNA-binding affinity directly. Class III contains subunit interface substitutions that disrupt the interaction between Pol γA and Pol γB and thus reduce processivity. It should be noted that, although clinically discovered, not all substitutions shown in Table 3.2 have been shown biochemically to adversely affect enzyme activity.

A cluster of substitutions found in PEO patients (29,88), R943H/C, Y955C, and A957P/S, falls into class I. A modeled Pol γ-DNA complex with an incoming dNTP (Figure 3.12C) suggests that R943 may form a charged interaction with the triphosphate moiety. Substitution of the positively charged arginine should drastically reduce affinity for incoming nucleotides. Y955 abuts the templating base, in a position that is consistent with its known multi-functional roles in other Pol I family enzymes, including primer-template alignment and substrate selection (89). A957 is adjacent to a critical glycine (G958); the equivalent residue in T7 RNAP (another Pol I family member) serves as a fulcrum during enzyme translocation and coordinates substrate binding (90). Substitutions of A957 with bulkier residues likely interfere with both enzyme translocation and binding to an incoming dNTP. In general, the predicted consequences of all these mutations are a decreased affinity for dNTPs, increased error rates and/or reduced catalysis, in good agreement with solution studies (31). Mutations giving rise to these defective proteins should tend to confer a dominant phenotype, likely because the mutant enzymes compete effectively with the wild-type enzyme for binding to the template DNA and cause error-prone DNA synthesis.

Due to the multiple functions of the spacer, the phenotypes of what have traditionally been called spacer mutants vary with the spatial location of the substitution. Our structure suggests that the IP and AID subdomains of the spacer use distinct means to increase processivity: the IP subdomain functions independently of Pol γB, whereas AID acts only through its interaction with Pol γB. Accordingly, spacer mutations segregate into classes II and III, but in general both are likely to reduce affinity for DNA. A large number of class II mutations are arginine substitutions that are distributed along the modeled primer-template DNA binding channel. Substitution of positively charged arginine with neutral residues will decrease DNA binding and polymerase activity. Thus class II mutants tend to be recessive, as the mutant Pol γA is ineffective in competing with the wild-type enzyme for template DNA.

The class II mutation W748S is commonly associated with autosomal recessive ataxia and Alpers’ syndrome (34). W748 is located in the IP subdomain (Figure 3.7B and 3.11), away from the subunit interface, and is likely important in maintaining the local structure of the IP subdomain that contacts the downstream single-stranded DNA. W748 forms a stacking interaction with F750 and H733 in the local structure. Destabilizing this stack by the W748S substitution will undermine the enzyme’s interaction with template DNA, leading to lower polymerase activity. This interpretation is consistent with the biochemical observations of low DNA polymerase activity and processivity and a severe DNA-binding defect, but normal holoenzyme formation (35).

The most common substitution among all Pol γ mutations, A467T, is representative of class III mutants and is associated with a wide range of mitochondrial disorders (91). In biochemical studies, the A467T mutant, which was thought to affect the spacer domain, was unexpectedly found to have reduced template binding, and the mutant-containing holoenzyme has lower processivity (33,92), suggesting that the A467T mutant Pol γA has both reduced polymerase activity and reduced subunit interaction. This observation can now be explained because the substitution actually lies in the thumb domain of Pol γA (Figure 3.7A and 3.11), which is well known to interact with template DNA. Although Pol γB interacts with the thumb containing A467, this residue faces away from the interaction surface. However, A467T may interrupt the local hydrophobic environment formed by L466 and L602, causing a slight spatial shift of the thumb domain that interferes with the interaction between the subunits.

The only Pol γB substitution that has been reported to be associated with disease is G451E, which was found in a single PEO patient with multiple mtDNA deletions (93). G451E is a class III mutant, as G451 is located near the interface formed between the C-terminal domain of the proximal Pol γB monomer and the AID subdomain of Pol γA (Figure 3.11). The G451E substitution may cause a steric clash with T556 on the AID subdomain; perhaps more importantly, it may disrupt the hydrophobic interaction that is essential for subunit interaction. This structural analysis agrees with the biochemical characterization of G451E-substituted Pol γB, which revealed a compromised subunit interaction and incomplete stimulation of catalytic subunit activity (93).

From their structural and biochemical properties, mutations in class III are expected to be autosomal recessive. A defective subunit interface leads to reduced polymerase processivity and DNA-binding, defects that can be at least partially compensated in heterozygotes by the presence of wild-type enzyme. In addition, all class III mutations identified to date are located in the subunit hydrophilic interface that plays a secondary role in subunit interaction.

Figure 3.11 Pol γA mutational analysis. i) Pol γA mutants in the DNA-binding channel (class II) are highlighted in blue. ii) A split open view of the subunit interface showing class III mutants (orange) affecting Pol γA and Pol γB (subscripts denote the proximal and distal monomer) interaction.

Table 3.2 Classification of disease-related Pol γA substitutions

Class   Location              Substitutions                                  Predicted defects
I       Active site           R943H/C, Y955C, A957S/P, R964C                 Reduced catalysis, increased error rate,
                                                                             defective translocation
II      DNA-binding channel   H277C, S305R, Q308H, R309H/L, R374X, G380D,    Reduced DNA-binding
                              L428P, S433C, Q497H, G517V, K561M, R562Q,
                              H569Q, R574W, R579W, P587L, P589L, R597W,
                              R617C, G763R, A767D, W748S, R807P/C, N846S,
                              T849H, R853Q/W, V855A, A862T, R953C, F961S,
                              L966R, S1095R, R1096C/H, R1138C, K1191N/R
III     Subunit interface     R232G, L463F, A467T, N468D, D1196N;            Reduced subunit affinity, decreased
                              Pol γB: G451E                                  DNA affinity
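The classification can also be expressed as a simple lookup. The following is a minimal sketch that encodes only a handful of the substitutions discussed in the text, together with the inheritance tendency the text predicts for each class. The data layout and function name are hypothetical, and the predicted tendencies are structural expectations rather than clinical rules.

```python
# Class descriptions follow Table 3.2; the inheritance tendencies are those
# predicted in the text for each class.
CLASSES = {
    "I": {"location": "active site", "predicted_inheritance": "dominant"},
    "II": {"location": "DNA-binding channel", "predicted_inheritance": "recessive"},
    "III": {"location": "subunit interface", "predicted_inheritance": "recessive"},
}

# Illustrative subset only; keys are (subunit, substitution) with subunit "A" or "B".
SUBSTITUTION_CLASS = {
    ("A", "R943H"): "I",
    ("A", "Y955C"): "I",
    ("A", "A957P"): "I",
    ("A", "W748S"): "II",
    ("A", "R232G"): "III",
    ("A", "A467T"): "III",
    ("B", "G451E"): "III",
}


def describe(subunit: str, substitution: str) -> str:
    """Return a one-line summary for a substitution in the illustrative subset."""
    key = (subunit, substitution)
    if key not in SUBSTITUTION_CLASS:
        return f"Pol gamma {subunit} {substitution}: not in this illustrative subset"
    cls = SUBSTITUTION_CLASS[key]
    info = CLASSES[cls]
    return (
        f"Pol gamma {subunit} {substitution}: class {cls} ({info['location']}), "
        f"predicted to tend toward {info['predicted_inheritance']} inheritance"
    )


print(describe("A", "W748S"))
print(describe("B", "G451E"))
```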

Structural dissimilarities between human Pol γ and HIV reverse transcriptase provide exploitable space for drug design

Anti-viral NRTIs present a unique opportunity for drug design, because both the target HIV RT and the adverse target human Pol γ are known. Although it has long been suspected that the two enzymes are dissimilar, for the first time we can make detailed structural comparisons of the human and viral polymerases and exploit differences in a rational design of antiviral drugs with higher selectivity.

There are several structural differences between human Pol γ and HIV RT that may be utilized in designing selective inhibitors. The distinct subunit interactions of the two enzymes result in substrate DNA being bound in the active site of Pol γ at an angle of 45° to that in HIV RT (Figure 3.12A, B). More importantly, while the catalytic aspartates of the HIV RT p66 subunit and Pol γA have a similar spatial arrangement, the incoming nucleotide-binding sites formed between the palm and fingers subdomains are structurally distinct. This portion of the fingers subdomain is α-helical in human Pol γ but β-sheet in HIV RT (Figure 3.12C, D).

Both human Pol γ and HIV RT utilize electrostatic interactions of positively charged residues on the fingers domain to bind the negatively charged triphosphate of an incoming dNTP. However, their interactions with the nucleoside moiety are different. In Pol γA, the nucleoside binding site is likely bounded by E895, Y951, Y955 and Q1102; in HIV RT it is bounded by R72, F77, Y115 and Q151 (Figure 3.12C, D) (94). Not only is the charge of one residue reversed (E895 in Pol γA vs. R72 in RT) but the positions of Y951 and Y115 are also altered. A highly conserved bulky residue (a Y or F) in members of the DNA Pol I family is known to play a major role in discriminating against incorporation of ddNMP. Pol γA Y951 is located on the O-helix of the fingers domain, and has been shown to be responsible for the lack of discrimination between dNTPs and ddNTPs. Y115 of the p66 subunit of HIV RT has been shown to be important in the discrimination of 3´-OH residues (95), and may be the equivalent residue to Pol γA Y951. However, Y115 in HIV RT p66 lies on a loop behind the ribose moiety of the incoming dNTP (Figure 3.12C, D). The different angles at which Y951 and Y115 approach the sugar moiety of an incoming dNTP suggest that they differentially shape the active site and, further, that HIV RT and Pol γ may interact differently with nucleoside analogues. The HIV RT residue in the equivalent spatial position to Pol γA Y951 is R72, which actually functions in pyrophosphorolysis rather than discrimination against the 2’-OH of ribose (96). The differences between the two enzymes suggest that it may therefore be possible to design small molecules that exploit these structural and functional dissimilarities.

The modeled complex of Pol γ and DNA also illustrates how the fingers domain may undergo conformational changes in order to accommodate the primer/template DNA duplex, changes that can be contrasted with those in the thumb domain of HIV RT (94,97). The differences may influence how the two enzymes maintain their different degrees of substrate specificity and their different responses to nucleoside inhibitors. However, a clearer picture of how correct and incorrect or analog nucleotides are differentially accommodated within the active sites of HIV RT and Pol γA must await co-crystal structures of the latter enzyme with DNA and dNTPs.


Figure 3.12 Structural differences between human Pol γ and HIV reverse transcriptase. Panel A and B: overall structures of the two enzymes illustrate differences in the interaction between the catalytic and accessory subunits, and the modes of DNA binding. The active site of Pol γA, comprised of an α-helical fingers domain (C), differs significantly from that of HIV RT (D), where the incoming dNTP binding site is comprised of β-sheet fingers.

Chapter 4: Dissecting the processivity function of the dimeric human Pol γB

(The research in this chapter was originally published in Journal of Biological Chemistry; Young-Sam Lee, Sujin Lee, Borries Demeler, Ian J. Molineux, Kenneth A. Johnson, and Y. Whitney Yin, (2010) Journal of Biological Chemistry, 285:1490-1499 © the American Society for Biochemistry and Molecular Biology.)

INTRODUCTION

Most DNA polymerases are composed of a catalytic subunit and an accessory subunit. Because a catalytic subunit by itself cannot remain bound to DNA long enough to replicate it efficiently, it associates with an accessory subunit for processive DNA synthesis. The processivity factor plays an essential role by significantly increasing enzyme-DNA affinity and/or decreasing the DNA dissociation rate; as a consequence, the holoenzyme completes many more rounds of nucleotide incorporation per binding event. The catalytic subunits of all DNA polymerases present a recognizable polymerase core structure, whereas the processivity factors show various architectures. The β-clamp of bacterial DNA polymerases II and III and PCNA of eukaryotic DNA polymerases δ and ε form toroidal oligomeric complexes that encircle duplex DNA. Thioredoxin, the processivity factor for T7 DNA polymerase, is a small monomeric protein, which not only contacts DNA directly, but also enhances the interaction of the catalytic subunit with DNA.

Pol γB, the accessory subunit of the mitochondrial DNA polymerase, has a unique mode of processivity enhancement: it increases the rate of polymerization as well as DNA-binding affinity. The human Pol γ holoenzyme structure described in the previous chapter suggests that the processivity function of the dimeric Pol γB comes from its exclusive interaction with the AID subdomain of Pol γA, which induces a conformation of the positively charged “K-tract” in the AID that binds DNA and directs the 3’-end of the primer into the polymerase active site. Interestingly, the structure reveals that only one monomer of Pol γB (the proximal monomer) mainly participates in the interaction with Pol γA, whereas the other (the distal monomer) makes limited contacts (Figure 3.4). In this chapter, I detail the results of experiments that dissect the role of each Pol γB monomer in conferring processivity: the Pol γB monomer proximal to Pol γA in the holoenzyme strengthens the interaction with DNA, and the distal Pol γB monomer accelerates the reaction rate.

MATERIALS AND METHODS

Cloning, expression and protein purification

All mutants and wild type Pol γB were cloned into pET22b(+) and the C-terminal His-tagged constructs were expressed in E. coli Rosetta (DE3) (Novagen) at 37°C in LB. Protein expression was induced with 0.4 mM IPTG when the cell density reached an A600 of 0.6, and the culture was subsequently incubated at a reduced temperature of 25°C for 6 hours before harvesting. The deletion mutant ΔI4 was constructed as previously described (19). Other mutant Pol γBs were constructed using the following oligonucleotides (mutation sites in bold) as primers for QuikChange (Stratagene) site-directed mutagenesis:

D129K: 5’-GCAGGTATTCCCGGTGAAAGCCCTCCACCACAAACC and 5’-GGTTTGTGGTGGAGGGCTTTCACCGGGAATACCTGC
R107E: 5’-CCTTGGGCGTAGAGTTGGAAAAGAACCTGGCCGCAG and 5’-CTGCGGCCAGGTTCTTTTCCAACTCTACGCCCAAGG
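Primer pairs for this type of site-directed mutagenesis are normally designed as exact reverse complements of one another, which offers a quick transcription check on the sequences listed above. The following is a minimal sketch of such a check; the helper names are hypothetical, and the primer sequences are copied from the text.

```python
# Translation table mapping each DNA base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_complementary_pair(forward: str, reverse: str) -> bool:
    """True if the two primers are exact reverse complements of each other."""
    return reverse_complement(forward) == reverse.upper()


# Primer pairs as listed in the text (both written 5' to 3').
PRIMER_PAIRS = {
    "D129K": (
        "GCAGGTATTCCCGGTGAAAGCCCTCCACCACAAACC",
        "GGTTTGTGGTGGAGGGCTTTCACCGGGAATACCTGC",
    ),
    "R107E": (
        "CCTTGGGCGTAGAGTTGGAAAAGAACCTGGCCGCAG",
        "CTGCGGCCAGGTTCTTTTCCAACTCTACGCCCAAGG",
    ),
}

for name, (fwd, rev) in PRIMER_PAIRS.items():
    print(name, len(fwd), "nt, complementary:", is_complementary_pair(fwd, rev))
```

A check of this kind only confirms internal consistency of the pair; it does not verify that the primers match the Pol γB coding sequence.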

The C-terminal His-tagged, exonuclease-deficient catalytic subunit Pol γA was constructed by substituting alanine for E200 and by deleting the mitochondrial localization sequence (residues 1-29). The exo⁻ Pol γA gene was transferred into the baculovirus genome using the shuttle vector pBacPak9 (Clontech) and expressed in infected Sf9 insect cells. Proteins were purified by sequential application to Ni-NTA, SOURCE S and Superdex 200 columns (26).

Analytical ultracentrifugation

All experiments were performed using a Beckman Optima XL-I. Data were analyzed with the program UltraScan v9.9 (98), making appropriate hydrodynamic corrections for the buffers used (99). The partial specific volumes of Pol γB proteins, estimated from the protein sequence (100), were 0.734, 0.735, 0.736, and 0.736 cm³/g for ΔI4, ΔI4-D129K, D129K and wt Pol γB, respectively. All samples were analyzed in 50 mM NaCl and 25 mM sodium phosphate buffer (pH 7.4).

Sedimentation velocity experiments were conducted at 40,000 rpm and 20°C for Pol γB wild-type, D129K, ΔI4 and ΔI4-D129K at equal loading concentrations (1.7 μM). Scans were taken at 230 nm in intensity mode. All data, with time-invariant noise subtracted, were initially analyzed by the 2-dimensional spectrum method (101), and further refined with the genetic algorithm (102). Statistics were subjected to Monte Carlo analysis (103). Sedimentation coefficient distributions were calculated by the method of van Holde and Weischet as previously described (104).

Sedimentation equilibrium experiments were conducted at 4°C for Pol γB wild-type, mutant D129K, ΔI4, and ΔI4-D129K. Two sets of loading concentrations were prepared for each protein: 3.5 μM, 5.8 μM and 8.1 μM for scanning at 280 nm, and 0.8 μM, 1.3 μM and 1.8 μM for 230 nm. Samples were centrifuged to equilibrium at 15,000, 18,700, 22,500, 26,200 or 30,000 rpm, and scanned simultaneously at 230 and 280 nm. The resulting 30 scans were globally fitted to multiple models as described (105). The extinction coefficient at 280 nm was determined to be 71,940 AU mol⁻¹ cm⁻¹ from the amino acid composition (106). The extinction coefficient at 230 nm was estimated to be 323,340 AU mol⁻¹ cm⁻¹ by globally fitting wavelength scans from each concentration to sums of Gaussian terms (107). The most appropriate model was chosen on the basis of the minimum residual and the best statistics.
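The choice of two concentration series can be illustrated with a Beer-Lambert estimate of the loading absorbance at each wavelength. The sketch below is only an illustration: it treats the quoted extinction coefficients as molar values (per M per cm) and assumes a 1 cm optical path, which is a placeholder and not a value stated in the text.

```python
# Beer-Lambert estimate of loading absorbance: A = epsilon * c * l
# Extinction coefficients are the values quoted in the text, treated here
# as molar extinction coefficients (per M per cm); this is an assumption.
EPSILON = {280: 71_940, 230: 323_340}

# Loading concentrations (micromolar) used for each scanning wavelength
LOADING_UM = {280: [3.5, 5.8, 8.1], 230: [0.8, 1.3, 1.8]}


def absorbance(conc_um, epsilon, path_cm=1.0):
    """Return the absorbance of a sample given in micromolar.

    path_cm is an assumed optical path length, not a value from the text.
    """
    return epsilon * conc_um * 1e-6 * path_cm


for wavelength, concs in LOADING_UM.items():
    values = [round(absorbance(c, EPSILON[wavelength]), 2) for c in concs]
    print(f"{wavelength} nm: {values}")
```

Under these assumptions the two series give nearly the same absorbance range at their respective wavelengths, which is consistent with the lower concentrations having been chosen for the more strongly absorbing 230 nm scans.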

Steady-state Polymerization assay

Polymerization assays used single-stranded M13mp18 DNA annealed to a 26-nt primer (5’-GGATTATTTACATTGGCAGATTCACC). Reactions contained 80 nM Pol γA, 200 nM Pol γB (or variant), and 50 nM primer/template DNA in 20 μL of 10 mM HEPES (pH 7.5), 80 mM KCl, 12.5 mM NaCl, 50 μg/ml BSA and 3 mM β-mercaptoethanol. The holoenzyme titration experiment used Pol γA/Pol γB concentrations of 40 nM/100 nM, 80 nM/200 nM, and 160 nM/400 nM. After pre-incubation at 37°C for 5 min, 500 nM poly(dA-dT)•poly(dA-dT) was added as ‘trap’ DNA. Reactions were then initiated by the addition of MgCl2 (10 mM) and dNTPs (50 μM each of dGTP, dATP and dTTP, 5 μM dCTP and 0.1 μM [α-32P]dCTP), and incubated at 37°C for 10 min. Reactions were stopped by the addition of 1% SDS, 20 mM EDTA and 0.1 mg/ml proteinase K, followed by incubation at 42°C for 30 min. After the reaction mixtures were applied to Micro Bio-Spin 6 columns (BioRad) to remove free nucleotides, the samples were heat-denatured at 95°C for 5 min in gel loading buffer (70% formamide, 1x TBE, 100 mM EDTA) and analyzed on a 6% polyacrylamide / 7 M urea gel. Reaction products were visualized by autoradiography.

Pre-steady state kinetics

A 25/45-mer primer-template was prepared by annealing equimolar amounts of 5’-[32P]-labeled primer (5’-TCCTCGCAGCCGTCCAACCAACTCA) and template (5’-GGACGGCATTGGATCGAGGTTGAGTTGGTTGGACGGCTGCGAGGA) by heating at 95°C for 5 min and then slowly cooling to 20°C in 10 mM Tris-HCl (pH 8.0 at 25°C) and 50 mM NaCl. Single-nucleotide incorporation assays were performed using an RQF-3 Rapid Chemical Quench Flow instrument (KinTek Co.), where one syringe contained the Pol γ-DNA complex (140 nM Pol γA, 600 nM Pol γB (or variant), 400 nM 25/45-mer DNA, 20 mM HEPES (pH 7.5 at 25°C), 100 mM NaCl), and the other syringe contained the nucleotide-magnesium mix (100 μM dATP, 20 mM MgCl2, 20 mM HEPES (pH 7.5 at 25°C), 100 mM NaCl). The reaction was initiated by rapidly mixing equal volumes from each syringe at 37°C for 5, 10, 20, 30, 40, 60, 80, 100, 250 or 500 msec, or 1, 2.5 or 5 sec, and quenched with 0.5 M EDTA. Quenched reaction samples were applied to a 15% polyacrylamide / 7 M urea gel. The 26-mer DNA product was visualized by autoradiography and quantified with QuantityOne software (BioRad). The time dependence of product formation was fit to the burst equation:

$$[\text{26-mer product}] = A\left(1 - e^{-k_{pol}\,t}\right) + k_{ss}\,t$$
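The shape of the burst equation can be sketched numerically at the quench times listed above. This is a minimal illustration, not the fitting procedure used for the data; the parameter values are hypothetical round numbers chosen only to show the fast exponential phase followed by the linear steady-state phase.

```python
import math


def burst_product(t, amplitude, k_pol, k_ss):
    """Product (nM) at time t (s): A(1 - exp(-k_pol t)) + k_ss t."""
    return amplitude * (1.0 - math.exp(-k_pol * t)) + k_ss * t


# Quench times from the text, converted to seconds
TIMES_S = [0.005, 0.01, 0.02, 0.03, 0.04, 0.06, 0.08, 0.1, 0.25, 0.5, 1, 2.5, 5]

# Hypothetical illustrative parameters (not fitted values):
# amplitude in nM, burst rate in 1/s, steady-state rate in nM/s
A, K_POL, K_SS = 50.0, 30.0, 3.0

curve = [(t, burst_product(t, A, K_POL, K_SS)) for t in TIMES_S]
for t, product in curve:
    print(f"{t:6.3f} s  {product:7.2f} nM")
```

With these placeholder values the exponential term is essentially complete within the first few hundred milliseconds, after which product accumulates linearly at the rate k_ss.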

Analytical gel filtration

Each Pol γB variant (2 μM monomer) was analyzed alone, with 1 μM Pol γA, or with 1 μM Pol γA and 3 μM 25/30-mer (5’-GCATCTACGACCAACTCATACACCT/3’-AAAGGAGGTGTATGAGTTGGTCGTAGATGC) primer/template DNA on a Superdex 200 10/300 GL column. Samples (300 μL) were applied to the column in 20 mM HEPES (pH 7.5 at 25°C), 140 mM KCl, 1 mM EDTA (pH 8.0), 5 mM β-mercaptoethanol and eluted at a flow rate of 0.65 ml/min. Eluates were monitored at A280 and A260, and proteins were visualized by Coomassie staining after SDS-PAGE.

RESULTS

Construction and Preparation of Pol γB variants

In contrast to the monomeric Drosophila Pol γB, the human protein is a homodimer. To investigate the function of each human Pol γB monomer, we constructed a monomeric Pol γB, expecting to detect differences in activity between Pol γA alone and its complex with a monomeric Pol γB (heterodimeric AB holoenzyme) or with the dimeric Pol γB (heterotrimeric AB2 holoenzyme). Guided by bioinformatic, structural and prior biochemical analyses, we identified two regions that contribute to Pol γB dimerization. In comparison to the monomeric Drosophila Pol γB, the human protein has two insertions that are located in the dimer interface (Figure 4.1B and 4.1D). Insertion I contains residues 165-201, which form part of the four-helical bundle (147-180) formed with the same region from the other monomer. The region has been termed I4, and a mutant lacking it is termed ΔI4 (Figure 4.1B) (19).

Insertion II contains residues R107-V119 and H133-A146. This region in human Pol γB harbors cross-dimer hydrogen bonds formed by D129-R107, H77-D198 and H133-E233, each of which is duplicated by the 2-fold symmetry axis relating the two Pol γB monomers (Figure 4.1C). The D129-R107 salt-bridges should be particularly strong, as they include both H-bonding and charge-charge interactions.

Hypothesizing that both regions are necessary for dimerization, we constructed four human Pol γB mutants in which the regions are disrupted either individually or jointly. ΔI4 removes the four-helical bundle by replacing residues 147-179 with a Gly-Gly dipeptide. D129K converts the D129-R107 electrostatic attraction to repulsion by substituting Lys for Asp at position 129. ΔI4-D129K combines the I4 deletion and the D129K substitution. Lastly, anticipating altered activity from mutant ΔI4-D129K, we constructed ΔI4-D129K/R107E, in which two substitutions, D129K and R107E, were added to the ΔI4 construct. These substitutions replace the wild-type D129-R107 pair with a new salt-bridge, K129-E107. All proteins were purified to high homogeneity (Figure 4.2A).

Figure 4.1 Structural and bioinformatic basis for construction of Pol γB variants. (A) Structure of the trimeric human Pol γ holoenzyme, showing that Pol γA forms extensive interactions with the proximal Pol γB monomer but limited contacts with the distal monomer. Alignment of human, mouse and Drosophila Pol γB reveals two inserted regions (D). One forms the four-helical bundle I4 (B), the second forms cross-dimer H-bonds (C). Both regions are important for dimerization of mammalian Pol γB.

Oligomerization of Pol γB variants

Dimerization of Pol γB mutant proteins was first evaluated by analytical ultracentrifugation (AUC). We performed sedimentation velocity experiments under identical conditions using Pol γB wild-type, D129K, ΔI4 and ΔI4-D129K proteins. Both Pol γB wild-type and mutant D129K have the same weight-average sedimentation coefficient of 5.37, indicating that they have identical oligomeric states (Figure 4.2B). ΔI4 has a weight-average sedimentation coefficient of 3.74 and shows a typical monomer-dimer equilibrium pattern that is consistent with a weak dimer. Mutant ΔI4-D129K gave a weight-average sedimentation coefficient of 3.28, consistent with it being completely monomeric under these conditions.

To obtain quantitative measurements of dimer formation and dissociation, we analyzed the proteins by sedimentation equilibrium centrifugation. Mutant ΔI4 best fit a reversible monomer-dimer equilibrium model with a dissociation constant of 16.6 μM, comparable to the previously reported value of 7 μM (26). All other Pol γB variants were best fit by a single-species model, as only a very low level of other species was detected. Wild-type Pol γB was calculated to have a molecular weight of 114.1 kDa and mutant D129K of 99.5 kDa; both values are consistent with the proteins being dimers of a 52.5 kDa protein (by sequence). The mutation D129K therefore appears to have little effect on dimer formation. Conversely, the molecular weight of ΔI4-D129K was estimated to be 50 kDa, consistent with it being monomeric. In order to estimate the Kd boundary values for these variants, the missing species (monomer for the wild-type and mutant D129K, and dimer for ΔI4-D129K) were assumed to be below the detection limit (A230 < 0.05 AU). Data for all Pol γB variants are summarized in Table 4.1.

The oligomeric states of Pol γB and its variants were independently confirmed by gel filtration chromatography, by comparing their elution volumes with their calculated molecular weights. All proteins were analyzed at 2 μM, the minimum concentration dictated by the system’s UV detection limit (50 mAU at 280 nm). Wild-type Pol γB elutes with an apparent molecular weight of 100 kDa (Figure 4.2C), consistent with it being a dimer. ΔI4 elutes as a broadened peak, suggestive of a mixture of 50 and 100 kDa species that correspond to monomers and dimers, and ΔI4-D129K behaves as a 50 kDa monomer. However, when ΔI4-D129K bears the additional R107E substitution, it chromatographs as a dimer. Thus, the D129K-R107E combination, which restores the salt-bridge interaction between the two Pol γB monomers, also restores the ability to form a dimer, one that may be even stronger than that of ΔI4. This result therefore clearly demonstrates the importance of the salt-bridge between residues 129 and 107 in Pol γB dimer formation.

These analyses suggest that alteration of either dimer-stabilization region alone is insufficient to dissociate dimeric Pol γB completely under the conditions of our analyses, but together they abolish all significant intermolecular interactions. The Kd for dimerization of Pol γB ΔI4-D129K is more than 2000 times higher than that of the wild-type, and the mutant protein can therefore be considered monomeric under our conditions of analysis. Because the construction of Pol γB ΔI4-D129K was predicated on the structure of the dimeric human protein and on sequence alignment differences with Drosophila Pol γB, these data also explain why the latter protein is a monomer.


Figure 4.2 Variant Pol γB oligomeric states. (A) Purified Pol γB proteins (1 μg) analyzed on an SDS-PAGE gel and stained with Coomassie Blue R-250. (B) Superimposed van Holde-Weischet integral distribution plots of wild-type (filled circles), D129K (open triangles), ΔI4 (open circles) and ΔI4-D129K (filled squares). (C) Superimposed chromatograms of Pol γB variants (2 μM) analyzed on a Superdex 200 10/300 GL column: wt Pol γB (thick black), ΔI4 (grey), ΔI4-D129K (dotted), ΔI4-D129K/R107E (thin black).

Table 4.1 Dissociation constants measured by analytical ultracentrifugation.

| Pol γB protein | Kd (μM) a | Molecular weight (kDa) b | Oligomeric state |
|---|---|---|---|
| Wild-type | <0.1 | 92.9 (89.5, 94.7) | Dimer |
| D129K | <0.1 | 91.1 (90.8, 92.0) | Dimer |
| ΔI4 | 16.6 | 90.9 (90.7, 91.3); 78.1 (77.8, 79.1) | Monomer/Dimer mixture |
| ΔI4-D129K | >200 | 43.8 (43.6, 44.1) | Monomer |
| ΔI4-D129K/R107E | nd c | | Monomer/Dimer mixture |

Note: a, measured by AUC using the sedimentation equilibrium method. b, based on genetic algorithm - Monte Carlo analysis of sedimentation velocity data; values in parentheses are 95% confidence intervals. c, not determined.

Effects of Pol γB dimerization on processive DNA synthesis

On primed M13 DNA, most products synthesized by Pol γA alone are less than 100 nt long but, when Pol γA forms a holoenzyme with wild-type Pol γB, they increase in length several-fold and become more abundant (Figure 4.3A). Similar results are seen when Pol γA complexes with Pol γB ΔI4, suggesting that the latter is fully competent to stimulate DNA synthesis in this system. Removal of the four-helical bundle, which weakens formation of the Pol γB dimer, therefore has little effect. These results are comparable to previous studies comparing wild-type and ∆I4 Pol γB (26). In contrast, the monomeric Pol γB ΔI4-D129K is much less effective; although a small increase in total products was observed relative to Pol γA alone, there is little increase in product length (Figure 4.3A). These data suggest that loss of the distal Pol γB monomer diminishes holoenzyme processivity. In agreement with this conclusion, when Pol γB ΔI4-D129K/R107E is used, in which the salt-bridge and the capacity to dimerize are restored, the resulting holoenzyme exhibits activity comparable to the wild-type or ∆I4-containing enzyme.

To test whether the deficiency of the monomeric Pol γB is due to an intrinsic lack of activity or to a weakened interaction between subunits, holoenzyme containing monomeric Pol γB was titrated in a polymerization assay. Activities comparable to the wild-type holoenzyme were observed at a 4-fold higher concentration (160 nM) (Figure 4.3B). ΔI4-D129K at this concentration remains a monomer, and we show below that the corresponding holoenzyme is an AB heterodimer. This indicates that loss of function in ΔI4-D129K is likely caused either by impaired subunit interactions or by impaired interactions of the holoenzyme with DNA.


Figure 4.3 Steady-state DNA polymerization assays. Pol γA with or without Pol γB (wild-type or a variant) were analyzed using M13mp18 DNA annealed to a 26-nt primer. (A) Reaction products were visualized on a polyacrylamide denaturing gel. Reactions contained 80 nM Pol γA, 200 nM Pol γB or a variant, 50 nM primer/template DNA and 10-fold excess of “trap” DNA. (B) Reactions were performed with Pol γA either alone at 40, 80 and 160 nM, or in the presence of ΔI4-D129K at concentrations of Pol γA/ΔI4-D129K 40 nM/100 nM, 80 nM/200 nM, or 160 nM/400 nM. Pol γA/wild-type Pol γB 40 nM/100 nM served as the control.

Pre-steady State Kinetics analysis of Pol γB variants

To gain better mechanistic insight into the effect of Pol γB oligomerization on the polymerization reaction, we used pre-steady state kinetics to examine single nucleotide incorporation onto a 25-mer primer annealed to a 45-mer template (25/45-mer). Identical experiments were carried out with 70 nM Pol γA, either with or without 300 nM Pol γB wild-type, ∆I4, ∆I4-D129K, or ∆I4-D129K/R107E, and 200 nM 25/45-mer DNA substrate. A pre-equilibrated polymerase-DNA complex was rapidly mixed with 50 μM dATP and 10 mM MgCl2. Formation of the 26-mer product was plotted against time (Figure 4.4), and the data were then fitted to the burst equation:

$$[\text{26-mer product}] = A\left(1 - e^{-k_{pol}\,t}\right) + k_{ss}\,t$$

where A, the burst amplitude, reflects the amount of productive protein-DNA complex that can be turned over in the first cycle of the reaction, and kpol, the burst rate, denotes the fast polymerization rate in the first cycle of the reaction. We should note that kpol is not the initial slope, or first derivative, of the curve, because

$$\frac{d[\text{product}]}{dt} \approx A \cdot k_{pol} \quad \text{when } t \to 0,$$

before steady state conditions apply (the much smaller kss term is neglected here). The initial slope is thus essentially the product of the two parameters A and kpol. Finally, kss, the steady state turnover rate, is the slope of the linear steady-state phase of the reaction. Other parameters are computed from the primary experimental data. The off-rate,

$$k_{off} = k_{ss} / A,$$

reflects the frequency at which the polymerase dissociates from its template, and

$$\text{processivity} = k_{pol} / k_{off}$$

is the number of nucleotides incorporated before dissociation. The kinetic parameters for Pol γB, both wild-type and variants, are summarized in Table 4.2.
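The two derived parameters can be recomputed from the fitted values listed in Table 4.2. The sketch below is only an illustration of the arithmetic; the optional rounding step is an assumption introduced here, because the tabulated processivities appear to be reproduced when koff is first rounded to two decimal places, whereas the unrounded ratio gives somewhat lower numbers (about 565 nt for the wild-type holoenzyme).

```python
# Fitted parameters from Table 4.2: (A in nM, k_pol in 1/s, k_ss in nM/s)
TABLE_4_2 = {
    "Pol gammaA alone": (25.40, 13.38, 10.13),
    "wild-type Pol gammaB": (54.83, 30.99, 3.01),
    "dI4": (52.00, 31.56, 3.36),
    "dI4-D129K": (45.40, 14.07, 5.58),
    "dI4-D129K/R107E": (52.09, 29.32, 2.62),
}


def derived(amplitude, k_pol, k_ss, koff_decimals=None):
    """Return (k_off, processivity) with k_off = k_ss / A.

    koff_decimals optionally rounds k_off before dividing; this rounding
    is an assumption made to mirror the precision shown in the table.
    """
    k_off = k_ss / amplitude
    if koff_decimals is not None:
        k_off = round(k_off, koff_decimals)
    return k_off, k_pol / k_off


for name, params in TABLE_4_2.items():
    k_off, proc = derived(*params)
    _, proc_rounded = derived(*params, koff_decimals=2)
    print(f"{name}: k_off = {k_off:.3f} 1/s, "
          f"processivity = {proc:.0f} nt (unrounded), {proc_rounded:.0f} nt (rounded k_off)")
```

Either way of computing the ratio leaves the ranking of the variants unchanged: the three dimer-forming proteins cluster together, well above the monomeric ΔI4-D129K and Pol γA alone.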

In the presence of wild-type Pol γB, the burst amplitude of Pol γA increases from 25 nM to 55 nM, showing that Pol γB increases the formation of a productive protein-DNA complex 2-fold; the burst rate increases from 13 s⁻¹ to 31 s⁻¹, indicating that Pol γB also accelerates the polymerization rate (Table 4.2). In addition, kss is reduced 3-fold, indicating a lower steady-state turnover rate. The lower the value of kss, the less likely the polymerase is to dissociate from its template. This means that the polymerase can catalyze more rounds of nucleotide incorporation, and thereby becomes more processive, before it dissociates from the template. The combination of an increased polymerization rate and a reduced koff, due to the presence of a wild-type dimeric Pol γB in the holoenzyme, results in an increase in processivity from 33 nt for Pol γA alone to 620 nt (Table 4.2), an approximately 20-fold enhancement. Pre-steady state data (Table 4.2) obtained with Pol γB-ΔI4 and Pol γB-ΔI4-D129K/R107E show that these proteins have properties similar to the wild-type.

The monomeric Pol γB ΔI4-D129K confers different kinetic properties from a dimeric Pol γB. Although Pol γB ΔI4-D129K increases the amplitude of the reaction to nearly the same level as the wild-type protein (from 25 to 45 nM), it is unable to accelerate the burst rate (Table 4.2). Accordingly, the monomeric Pol γB increases processivity only from 33 nt to 117 nt, a mere ~3.5-fold. To test whether the slightly reduced amplitude of ΔI4-D129K, which is about 90% of wild-type, was caused by a reduced interaction with Pol γA, we repeated the experiment at double the ΔI4-D129K concentration. The amplitudes are the same for ΔI4-D129K at 300 nM and 600 nM (45.4 nM and 45.5 nM, respectively), suggesting that the reduction is not due to a reduced interaction with Pol γA, but rather that a monomeric Pol γB is slightly inferior to a dimer in stimulating formation of a productive polymerase-DNA complex. Importantly, no significant change in burst rate is observed (14.1 s⁻¹ and 16.3 s⁻¹ at 300 nM and 600 nM, respectively) compared to Pol γA alone (13.4 s⁻¹), showing that monomeric Pol γB has little or no ability to accelerate the rate of synthesis by Pol γA.


Figure 4.4 Time-dependent product formation in pre-steady-state assays. The 26-mer products were quantified from reactions of Pol γA without or with Pol γB variants and plotted against time; without Pol γB (open circles), wt Pol γB (filled circles), ΔI4 (filled squares), ΔI4-D129K (filled triangles), and ΔI4-D129K/R107E (open diamonds). Shorter time points are shown as an inset. Reactions contained 70 nM Pol γA, ±300 nM Pol γB wild-type or a variant, 200 nM 25/45-mer DNA, 10 mM MgCl2 and 50 μM dATP.

Table 4.2 Pre-steady state kinetic parameters for Pol γB variants

| Parameter | Pol γA alone | Wild-type Pol γB | Pol γB ΔI4 | Pol γB ΔI4-D129K | Pol γB ΔI4-D129K/R107E |
|---|---|---|---|---|---|
| Amplitude A (nM) | 25.40 ± 0.70 | 54.83 ± 0.70 | 52.00 ± 0.80 | 45.40 ± 1.06 | 52.09 ± 0.39 |
| Burst rate kpol (s⁻¹) | 13.38 ± 0.91 | 30.99 ± 1.24 | 31.56 ± 1.52 | 14.07 ± 0.81 | 29.32 ± 0.68 |
| Steady state rate kss (nM•s⁻¹) | 10.13 ± 0.26 | 3.01 ± 0.30 | 3.36 ± 0.34 | 5.58 ± 0.40 | 2.62 ± 0.16 |
| koff a (s⁻¹) | 0.40 | 0.05 | 0.06 | 0.12 | 0.05 |
| Processivity b (nt) | 33 | 620 | 526 | 117 | 586 |

a. Calculated as koff = kss/A. b. Processivity was calculated as kpol/koff. The standard deviations are residual errors from least-squares model fitting.

Effect of Pol γA on Pol γB dimerization

All the activity assays for holoenzyme containing Pol γB-ΔI4 were conducted at concentrations far below the measured Kd for the dimer. Pol γB ΔI4 would therefore be expected to be almost completely monomeric in these reactions, yet it functions as effectively as the dimeric wild-type Pol γB and distinctly more effectively than Pol γB ΔI4-D129K, which is clearly a monomer. The apparent discrepancy in the properties of Pol γB ΔI4 suggests that the oligomeric state of the protein may be affected by its association with Pol γA, either in the form of holoenzyme or in a holoenzyme-DNA complex.

We used analytical gel filtration chromatography to reveal the oligomeric state of Pol γB variants. Experiments were carried out using 2 μM Pol γB wild-type, ΔI4, ΔI4-D129K or ΔI4-D129K/R107E, either in the absence or presence of Pol γA (1 μM). These concentrations were expected to allow detection of changes in the dimer-monomer equilibrium of ΔI4, because they are near its Kd of 7-17 μM (this work and (26)), but far from the Kd for ΔI4-D129K and wild-type Pol γB (this work), so that the ΔI4-D129K protein is essentially entirely monomeric and the wild-type dimeric.
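The expected partition of ΔI4 between monomer and dimer at the 2 μM loading concentration can be estimated from a simple monomer-dimer equilibrium. The sketch below is a rough illustration that assumes Kd = [M]²/[D] with the total protein counted in monomer units; the exact convention used in fitting the AUC data is not restated here, so the numbers should be read as order-of-magnitude estimates.

```python
import math


def monomer_dimer(total_um, kd_um):
    """Return (free monomer, dimer) concentrations in micromolar.

    Assumes Kd = [M]^2 / [D] and total = [M] + 2[D] (monomer units),
    which gives the quadratic 2[M]^2/Kd + [M] - total = 0.
    """
    monomer = kd_um * (math.sqrt(1.0 + 8.0 * total_um / kd_um) - 1.0) / 4.0
    dimer = (total_um - monomer) / 2.0
    return monomer, dimer


# Kd values for dI4 quoted in the text (previous report and this work)
for kd in (7.0, 16.6):
    m, d = monomer_dimer(2.0, kd)
    print(f"Kd = {kd} uM: monomer {m:.2f} uM, dimer {d:.2f} uM")
```

Under this convention only a few tenths of a micromolar of ΔI4 dimer is expected at 2 μM total protein, so a shift of the equilibrium on addition of Pol γA should be readily detectable.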

Wild-type Pol γB (a predicted 52.5 kDa monomer) alone elutes as a ~100 kDa molecule, and as a single ~220 kDa species when mixed with Pol γA (135 kDa) (Figure 4.5A), consistent with complete formation of the trimeric AB2 holoenzyme. In contrast, Pol γB ΔI4-D129K (50 kDa) elutes in the position expected for a 50 kDa monomer. When Pol γB ΔI4-D129K is mixed with Pol γA and chromatographed, two individual peaks, whose apparent molecular weights correspond to each protein alone, are observed. There was no evidence for the formation of any complex, indicating that the monomeric Pol γB ΔI4-D129K does not interact with Pol γA at this concentration. These conclusions were confirmed by analysis of the column eluate by SDS-PAGE (Figure 4.5C). No Pol γB ΔI4-D129K could be detected in column fractions containing Pol γA. Only when the concentrations of Pol γA and Pol γB ΔI4-D129K were both raised to 4-5 μM could any subunit interaction be detected. At this concentration, ΔI4-D129K is still monomeric and binds to Pol γA to form an AB heterodimer (Figure 4.6A).

At a concentration of 2 µM, Pol γB ΔI4 is a mixture of monomer and dimer (~50 kDa and ~100 kDa) species (Figure 4.2C). The concentration of Pol γB ΔI4 dimer can be estimated from the Kd to be 0.2-0.3 µM. In the presence of Pol γA, a protein species appears with an apparent molecular weight of ~200 kDa; this is larger than either Pol γA or the Pol γB dimer. Because ΔI4 is a mixture of monomers and dimers, this apparent complex could be an AB heterodimer or an AB2 heterotrimer. As shown above, the monomeric ΔI4-D129K does not bind to Pol γA under these conditions, and so an AB heterodimer would not be expected. The new peak, therefore, most likely contains a mixture of AB2 heterotrimeric holoenzyme and free Pol γA, the latter because of the substoichiometric amounts of ΔI4 dimer. This conclusion is supported both by the fact that the new peak contains both Pol γA and Pol γB and by the presence of substantial amounts of uncomplexed Pol γB ΔI4 (Figure 4.5A and 4.5C).

These observations suggest that Pol γA has higher affinity for the Pol γB dimer than for the monomer. Pol γA would thus preferentially associate with a dimeric protein in the monomer-dimer mixture, and in doing so would bias the monomer-dimer equilibrium towards dimer formation. However, these gel filtration experiments do not explain how Pol γB ΔI4-D129K can increase the polymerization amplitude of Pol γA in pre-steady state kinetic analysis. Because monomeric ΔI4-D129K is unable to interact with Pol γA at a concentration of 2 μM, and yet stimulates Pol γA activity at a lower concentration, we considered the possibility that formation of a Pol γA-Pol γB holoenzyme may be affected by primer-template DNA.

DNA-dependent subunit interaction

In order to examine the effect of primer-template DNA on the interaction between Pol γA and Pol γB, we repeated the analytical gel filtration analyses in the presence of 3 μM primer-template DNA. In contrast to the DNA-free experiments, the presence of primer-template DNA promotes complete formation of holoenzyme for both Pol γB ΔI4 (Figure 4.5B) and ΔI4-D129K/R107E (not shown). The estimated molecular weight of the complex is 282 kDa, similar to that of the wild-type complex (290 kDa), indicating in both cases the formation of trimeric AB2 holoenzyme (calculated molecular weights of 235 and 240 kDa, respectively) complexed to the 16 kDa primer-template DNA.

The most dramatic changes occurred with the holoenzyme containing monomeric Pol γB. As described above, the monomeric Pol γB ΔI4-D129K is unable to bind to Pol γA at a concentration of 2 μM. However, a single peak of ~239 kDa apparent molecular weight was observed in the presence of DNA, and the A260:A280 ratio indicated that the peak contained DNA (Figure 4.6B). Analysis of column fractions by SDS-PAGE shows the presence of both Pol γA and Pol γB (Figure 4.5D). The relative amounts of Pol γB in the holoenzyme were estimated by densitometry scans of the bands corresponding to Pol γA and Pol γB on the SDS gels. The Pol γB:Pol γA ratio for ΔI4-D129K is about one-half that of the wild-type Pol γB-containing holoenzyme, providing strong evidence that ΔI4-D129K forms an AB heterodimer. [Note that the apparent molecular weights of the holoenzyme-DNA complexes are systematically overestimated, presumably due to the elongated DNA.] Increased holoenzyme formation in the presence of a primer-template fully explains the ability of the monomeric Pol γB ∆I4-D129K to stimulate Pol γA in the pre-steady state polymerization reaction.


Figure 4.5 Effects of Pol γA and DNA on Pol γB dimerization. (A) Superimposed analytical gel filtration elution profiles of 1 μM Pol γA in the presence of 2 μM wt Pol γB (black), ΔI4 (blue) or ΔI4-D129K (red). The protein contents of peak fractions were visualized on SDS-PAGE gels for wt Pol γB, ΔI4, and ΔI4-D129K (C). (B) The same elution profiles as in (A) except that 3 μM 25/30-mer duplex DNA was included. The contents of peak fractions were analyzed on SDS gels for Pol γA+DNA with Pol γB wild-type, or ΔI4-D129K (D). Densitometry profiles of the gels are shown on the right (fraction 11 for B1 and fraction 12 for bottom panel, respectively).

Figure 4.6 Subunit interaction between Pol γA and ΔI4-D129K. (A) Elution profile (absorbance at 280 nm) of 4.3 μM Pol γA with 4.7 μM ΔI4-D129K in 2.75 ml gel filtration buffer (20 mM HEPES (pH 7.5 at 25 °C), 140 mM KCl, 1 mM EDTA, 5 mM β-mercaptoethanol) from the HiLoad 16/60 Superdex 200 prep-grade column (GE Healthcare). The SDS-polyacrylamide gel stained with Coomassie Blue shows co-elution of Pol γA and ΔI4-D129K in the peak fraction, and the density of each band indicates a 1:1 molar ratio of the two proteins. (B) Elution profiles showing the absorbance at 280 nm and 260 nm for Pol γ with or without DNA. The A260/A280 ratio (~0.6 in the absence of DNA and ~1.1 when mixed with DNA) shows the co-elution of DNA with Pol γA and ΔI4-D129K.

DISCUSSION

The mitochondrial DNA polymerase accessory subunit Pol γB is structurally and functionally different from other accessory proteins. This divergence of the mitochondrial DNA replicase from prokaryotic and eukaryotic enzymes has stimulated considerable interest in the evolution and structure-function relationships of the mitochondrial replication system. In contrast to the monomeric protein in lower eukaryotes, mammalian Pol γBs dimerize to become a larger protein, thereby raising the question of whether the extra Pol γB monomer provides any additional function relating to the processivity of DNA synthesis.

Factors contributing to processivity

Processivity is defined as the length of DNA synthesized per enzyme binding event. It is the distance that a polymerase travels before dissociating from the template, and can therefore be expressed by two parameters, d = vt, functionally the same as kpol/koff, where v (or kpol) is the rate of the single nucleotide incorporation reaction and t (or 1/koff) is the duration of the enzyme-DNA interaction per binding event. An accessory factor can increase the processivity of a holoenzyme either by accelerating the rate of polymerization or by prolonging the enzyme-DNA interaction. Several accessory proteins, such as the ring-shaped accessory proteins for DNA Pol II and III superfamily members and thioredoxin for T3 and T7 DNA polymerase, increase protein-DNA affinity. These processivity factors prolong the duration of holoenzyme binding to DNA, but have no effect on the rate of catalysis (e.g., see reference (84)). In contrast, human Pol γB both strengthens the binding of holoenzyme to DNA and simultaneously accelerates the rate of polymerization.

Interestingly, the catalytic subunit Pol γA is more processive than other polymerases, as evidenced by its ability to synthesize DNA up to ~100 nt (3), in comparison to the 1-15 nt of other enzymes (80,81). On the basis of the crystal structure, this high level of intrinsic processivity was attributed to a subdomain (IP) of the spacer domain that is not found in other DNA polymerases (Chapter 3; (108)). However, the rate of synthesis by Pol γA is low, and forming a holoenzyme with Pol γB provides a significant rate enhancement.

Some estimates of DNA polymerase processivity have been obtained by direct visualization of product length following synthesis on a long single-stranded template in steady-state reactions where multiple cycles of nucleotide incorporation occur. Other estimates have used the ratio of the polymerization rate kpol to the off-rate koff, which can be obtained from pre-steady state kinetics experiments. This method breaks processivity into two simple parameters, enabling a more detailed mechanistic dissection of processivity.

Distinct roles of each Pol γB monomer in processivity

It is conceivable that a single mode of processivity enhancement, i.e., strengthening the affinity of polymerase for DNA, can increase processivity usefully only to a certain level. For example, T7 DNA polymerase and E. coli Pol III holoenzyme exhibit a comparable processivity, despite the great difference in the nature of their accessory subunits (109). In addition, DNA polymerases must retain some ability to dissociate from a template, and thus, if additional stimulation of processivity is needed, another mechanism may be necessary.

We have shown here that while the proximal monomer of the human Pol γB dimer is solely responsible for increasing the holoenzyme’s affinity for DNA, the distal monomer is essential for the polymerization rate enhancement. These results suggest that the monomeric Pol γB in Drosophila and perhaps other lower multicellular eukaryotic organisms should have only the former activity, while the additional Pol γB monomer of mammals confers a new mode of processivity enhancement.

One reason for the lack of rate enhancement by a monomeric Pol γB could be that an AB holoenzyme has lower affinity for dNTP. At low dNTP concentrations, the rate of synthesis by a heterodimeric AB holoenzyme would then be slower than that of a heterotrimeric AB2. However, the pre-steady state experiments we report were performed at a dNTP concentration (50 µM) high enough to compensate for any theoretical increase in the Kd of the AB holoenzyme. The Kd values of wild-type Pol γA and the Pol γ AB2 holoenzyme for dNTP are 4.7 μM and 0.9 μM, respectively (3,9), and it is difficult to imagine how the Kd for an AB holoenzyme could be outside that range. The reduced rate of synthesis by the AB holoenzyme is therefore most likely due to other reasons.
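The argument that 50 µM dNTP is close to saturating can be made concrete with a simple estimate. The sketch below assumes that the observed incorporation rate depends hyperbolically on dNTP concentration, rate = kpol[dNTP]/(Kd + [dNTP]); this functional form is an assumption introduced here for illustration and was not fitted to the data in this chapter.

```python
def saturation(dntp_um, kd_um):
    """Fraction of the maximal rate for a hyperbolic dependence on [dNTP]."""
    return dntp_um / (kd_um + dntp_um)


DNTP_UM = 50.0  # dATP concentration after mixing, as stated in the text

# dNTP Kd values quoted in the text for the two reference enzymes
KD_UM = {"Pol gammaA alone": 4.7, "AB2 holoenzyme": 0.9}

for name, kd in KD_UM.items():
    print(f"{name}: {saturation(DNTP_UM, kd):.1%} of maximal rate")
```

If the Kd of the AB holoenzyme lies between these two values, the enzyme would be expected to operate at more than 90% of its maximal rate at 50 µM dATP, which is far too small an effect to account for the roughly 2-fold difference in burst rate between the AB and AB2 holoenzymes.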

Enhancement in the rate of DNA synthesis occurs only when Pol γB is a dimer. The Pol γB distal monomer contacts the catalytic subunit at the exonuclease (exo) domain, in the vicinity of the DNA-binding channel but some distance away from the polymerization (pol) active site. It is therefore improbable that the distal monomer directly affects the pol active site conformation. Instead, its enhancement of polymerization rate is achieved either through other protein elements of Pol γA or by optimizing alignment of the template DNA. A modeled human Pol γ-DNA complex suggests that binding of dimeric Pol γB preferentially positions the primer terminus in the pol active site, thereby facilitating an in-line nucleophilic attack of the primer 3´-OH on the incoming dNTP. If this model is correct, a dimeric Pol γB would provide a more rigid scaffold for the primer-template than could be provided by a monomer. The latter would be less effective in restricting movement of the DNA. This idea also rationalizes why the small accessory subunit thioredoxin, and the toroidal sliding clamps that form flexible interactions with the catalytic subunit, lack the ability to accelerate the rate of synthesis.

Pol γA promotes Pol γB dimerization

Mammalian Pol γB has an unusually large dimer interface of about 4000 Å², more than 2 times the size of an average protein-protein interface (1600 ± 400 Å²) (110). We identified two regions that are critical for human Pol γB dimerization. Deletion of I4 in one region removes nearly half the surface contact area. However, the remaining contact area of Pol γB ΔI4 is sufficiently large to support dimer formation at moderate concentrations. The estimated binding energy remaining for dimerization of ΔI4, using a conversion factor of 25 cal/Å², is about 50 kcal/mol. A substitution in the second region breaks two salt-bridges, removing at least 8 kcal/mol of binding energy. This value is probably an underestimate for the Pol γB D129K substitution, because the change introduces a repulsive interaction in place of the attractive interaction at the dimer interface. Nevertheless, D129K is not sufficient by itself to force Pol γB into a monomer, and both it and ΔI4 are necessary in combination. In Drosophila Pol γB, the corresponding I4 region is partly missing, as are the residues that can make a salt-bridge. We conclude that the lack of these dimerization regions results in Drosophila Pol γB being a monomer. By the same token, Pol γB from mosquito or C. elegans is also predicted to be a monomer. These monomeric accessory proteins are further predicted to be able to enhance the polymerase-DNA interaction but to lack the ability to accelerate the rate of polymerization.
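As a quick arithmetic check of the ΔI4 estimate given above, the following sketch converts buried interface area into binding energy using the 25 cal/Å² factor quoted in the text. The function name is hypothetical, and treating "nearly half" of the 4000 Å² interface as exactly half is an illustrative assumption rather than a measured value.

```python
# Sketch only: area-to-energy conversion using the factor quoted in the text.
CAL_PER_A2 = 25.0             # cal per Å^2 (from the text)
FULL_INTERFACE_A2 = 4000.0    # Pol γB dimer interface in Å^2 (from the text)
AVERAGE_INTERFACE_A2 = 1600.0 # average protein-protein interface in Å^2 (from the text)


def interface_energy_kcal(area_a2, cal_per_a2=CAL_PER_A2):
    """Return the estimated binding energy (kcal/mol) for a buried area in Å^2."""
    return area_a2 * cal_per_a2 / 1000.0


# Assumption: "nearly half" of the contact area lost in ΔI4 is taken as exactly half.
remaining_area = FULL_INTERFACE_A2 / 2.0
delta_i4_energy = interface_energy_kcal(remaining_area)
size_ratio = FULL_INTERFACE_A2 / AVERAGE_INTERFACE_A2

print(f"ΔI4 remaining binding energy: {delta_i4_energy:.0f} kcal/mol")
print(f"Interface size relative to average: {size_ratio:.1f}x")
```

Under these assumptions the remaining interface corresponds to about 50 kcal/mol, and the full interface is 2.5 times the average size, in line with the figures stated in the text.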

Dimerization of Pol γB is also affected by the presence of Pol γA. Perhaps due to the additional interactions with the distal Pol γB monomer, Pol γA preferentially binds to the dimer to form the more stable trimeric AB2, and in doing so, shifts the Pol γB dimer-monomer equilibrium towards dimer formation. Pol γA may also directly strengthen the Pol γB dimer. The proximal Pol γB monomer becomes sandwiched between the distal monomer and Pol γA in the holoenzyme (Figure 4.1A); by interacting with the distal Pol γB monomer, Pol γA also reinforces its interaction with the proximal monomer. Some clinical symptoms associated with the Pol γA R232 mutations (79,111) may be a consequence of this lack of reinforcement. Consequently, the patients may have less efficient mtDNA synthesis.

Regardless of the oligomeric state of Pol γB, the presence of a primer/template DNA further stabilizes the interactions between subunits in holoenzyme. The effect is unlikely to be caused by Pol γB and Pol γA interacting with DNA independently, because Pol γB does not bind to DNA of this length (22). Rather, the interaction is most likely mediated by Pol γA, whose biphasic AID subdomain simultaneously binds both to the upstream DNA via a positively charged surface and to the Pol γB proximal monomer via a hydrophobic surface (108). The observation that both monomeric and dimeric Pol γBs show DNA-dependent holoenzyme stabilization is consistent with the proximal monomer being solely responsible for DNA-dependent subunit interaction. Combining results from our biophysical and biochemical assays, we find that ΔI4 has a Kd value of 16.6 μM, but at 70 nM shows activity comparable to that of the dimeric wild type. At this concentration the amount of dimer should be ~1 nM, and we can then estimate that dimerization of Pol γB ΔI4 is increased by at least 70-fold by Pol γA and a primer/template DNA. The various interactions between Pol γA, Pol γB, and DNA can be summarized in the scheme:

The catalytic subunit Pol γA alters the monomer-dimer equilibrium of Pol γB in favor of dimer formation by selectively interacting with, and thus sequestering, dimeric Pol γB, driving more monomer into the dimer state (indicated by thicker arrows). Primer-template DNA also enhances the interaction between Pol γA and Pol γB, which may then allow formation of a ternary complex to be independent of the oligomeric state of Pol γB. A consequence of this may be that certain mutations that destabilize the Pol γB dimer may have less severe clinical consequences than those that destabilize the Pol γA-Pol γB interface. The AB heterodimer preserves some capacity for processive DNA synthesis, albeit without the rate enhancement, whereas the lack of any interaction between the A and B subunits precludes all processivity enhancement. An interesting question is whether AB-DNA can be converted to the AB2-DNA species by binding an additional monomeric Pol γB. With Pol γB ΔI4-D129K (and perhaps Drosophila Pol γB), we imagine that it could, but at much higher concentrations than we have examined.

However, the reverse reaction is more difficult to predict, because the affinities of Pol γB for itself (i.e., forming a homodimer) and for Pol γA (forming a heterodimer) are comparable ((3,26) and this work). It may then be possible for AB2-DNA to dissociate to (AB-DNA)+B, (A-DNA)+B2, or simply to A+B2+DNA.

We present here a rare example of two identical proteins that perform different functions. The species-dependent oligomeric states raise thought-provoking questions about the evolutionary pathways leading to Pol γB. Pol γB shows obvious similarity to Class II aminoacyl-tRNA synthetases. Although modern Class II aaRSs are dimeric, the primordial enzymes are thought to be monomeric, which matches their much simpler primordial stem-loop structured tRNA. If Pol γBs indeed evolved from Class II aaRSs, their ancestors may be the primordial monomeric aaRSs, as reflected by the monomeric Pol γB of lower eukaryotes. Subsequently, both aaRSs and the mammalian Pol γB independently became dimers in order to perform more sophisticated functions. Alternatively, Pol γB might have evolved after aaRSs had become dimers; in this scenario lower eukaryote Pol γBs subsequently lost a monomer, while their mammalian counterparts remained unchanged. Dimerization of aaRSs not only accommodates the larger modern tRNA, but also potentially allows more regulation of activity. Dimerization of Pol γB, as we have shown in this work, may also enable an additional mechanism of processivity enhancement to the mitochondrial DNA Pol γA.

Chapter 5: Characterization of R232 mutant of human mitochondrial DNA polymerase: The residue linked to mitochondrial disease controls balance between polymerization and proofreading activities.

(The research in this chapter was originally published in Journal of Biological Chemistry; Young-Sam Lee, Ian J. Molineux, and Y. Whitney Yin, (2010) Journal of Biological Chemistry, doi: 10.1074/jbc.M110.122283 © the American Society for Biochemistry and Molecular Biology.)

INTRODUCTION

Mitochondrial DNA (mtDNA) codes for a subset of the proteins required for ATP generation via oxidative phosphorylation, and its integrity is maintained solely by Pol γ. ATP is the “energy currency” for cellular activities, and impairment of ATP generation threatens viability. Consequently, functional defects of human Pol γ cause depletion of mtDNA and manifest as a wide range of clinical disorders, especially neurological and muscular dysfunction.

About 150 disease-associated mutations have been identified in human Pol γ. Recently, several patients with clinical symptoms caused by mitochondrial dysfunction were reported to carry mutations at the Pol γA R232 residue (R232G or R232H), which is distant from both active sites. In detail, a patient who died at six months of age was found to carry the R232G substitution on one chromosomal copy of Pol γA and the double substitution T251I and P587L on the other (79). The child was diagnosed with progressive generalized hypotonia and presented severe hypomyelinating peripheral neuropathy. Her mtDNA levels were only 3-5% of normal, and it was suggested that R232G may have a partially dominant-negative effect over a second allele expressing an abnormal protein, because patients homozygous for T251I and P587L present relatively mild disease phenotypes (79). In a separate case, a five-month-old stopped growing and presented muscular hypotonia. At 4 years of age, he developed myoclonus, which progressed to epilepsy. Pathological examinations revealed that he was heterozygous for Pol γA: R232H in trans to W748S (and the SNP E1143G), and his mtDNA content was 12% of normal (111). In addition, a six-month-old patient with Leigh’s syndrome harbored Pol γA R232H in trans to G848S. The patient had low levels of mtDNA, and Pol γ synthesis activity in fibroblasts was not processive on an M13 DNA template (112). Finally, three of four siblings exhibited various degrees of reduced dexterity, sensory redistribution and muscle wasting. Two of the three (the third was not tested) were found to carry Pol γA R232H in trans to G737R. The healthy fourth child was wild-type for Pol γA (113). These data strongly suggest that substitutions of Pol γA R232 are intimately associated with clinical disorders.

The human Pol γ holoenzyme structure provides insight into the genotype-phenotype relationship of disease-associated substitutions. Many mutations located in the spacer of the catalytic subunit of Pol γ were hard to rationalize until the Pol γ holoenzyme structure became available (108). Interestingly, even though Pol γA primarily interacts with the proximal monomer of Pol γB, residue R232 forms an electrostatic interaction with E394 of the distal Pol γB monomer. In the previous chapter, I showed that each monomer of Pol γB has its own distinct role in processivity enhancement: the proximal monomer is responsible for increasing DNA binding affinity and the distal one for increasing the polymerization rate (114). Based on this finding, I hypothesize that the mitochondrial dysfunction caused by the R232 mutation results from disruption of the subunit interaction between Pol γA and the distal Pol γB through R232-E394.

In this chapter, I present the results of a biochemical analysis aimed at better understanding the function of R232: R232 substitutions have no effect on independent Pol γA activities, but show major defects in the Pol γA-Pol γB holoenzyme, including decreased polymerase and increased exonuclease activities, the latter with decreased selectivity for mismatches. This study provides a molecular basis for the disease caused by the substitution of R232 in the catalytic subunit of the human Pol γ.

MATERIALS AND METHODS

Cloning, expression, and protein purification

Wild-type Pol γB, cloned into pET22b(+), was modified to incorporate the substitutions E394A, E394R, E449A, and E449R. Wild-type Pol γA and the exo- mutant E200A (114), both cloned into pET22b(+), were modified to incorporate substitutions of R232. Site-specific mutagenesis was performed using the QuikChange (Stratagene) kit with the following primer pairs (Table 5.1) where the mutation sites are shown in bold case. Pol γ subunit variants were overexpressed and purified as described in the “Materials and Methods” section, Chapter 4.

Table 5.1 Primer set for the mutagenesis of human Pol γ subunits

Primers           Sequence
Pol γA R232G      5’GGCTGGTGGAAGAGGGTTACTCTTGGACC
                  5’GGTCCAAGAGTAACCCTCTTCCACCAGCC
Pol γA R232H      5’GGCTGGTGGAAGAGCATTACTCTTGGACCAG
                  5’CTGGTCCAAGAGTAATGCTCTTCCACCAGCC
Pol γA R232E      5’GGCTGGTGGAAGAGGAATACTCTTGGACCAG
                  5’CTGGTCCAAGAGTATTCCTCTTCCACCAGCC
Pol γB E394A      5’GAGGCCCCACATTGGCACTAAGACAGGTTTG
                  5’CAAACCTGTCTTAGTGCCAATGTGGGGCCTC
Pol γB E394R      5’GAGGCCCCACATTGCGACTAAGACAGGTTTG
                  5’CAAACCTGTCTTAGTCGCAATGTGGGGCCTC
Pol γB E449A      5’CTGAAACTACTTTGGCGAATGGATTAATACATC
                  5’GATGTATTAATCCATTCGCCAAAGTAGTTTCAG
Pol γB E449R      5’CTGAAACTACTTTGCGGAATGGATTAATACATC
                  5’GATGTATTAATCCATTCCGCAAAGTAGTTTCAG

Polymerization kinetics assays

Steady and pre-steady state polymerization assays were carried out as described in the “Materials and Methods” section, Chapter 4.

Pre-steady state exonuclease assay

Kinetic experiments using wild-type (exo+) Pol γA and its R232G derivative also employed the RQF-3 instrument. Reactions were performed at 37°C in buffer (20 mM HEPES pH 7.5, 100 mM NaCl), and were initiated by the addition of 10 mM MgCl2 to a pre-incubated Pol γ-DNA complex (100 nM Pol γA, with or without 500 nM Pol γB, and 75 nM substrate DNA). At various times (10 ms - 15 sec), reactions were quenched by rapid mixing with 0.5 M EDTA. The labeled DNA strands were resolved on a denaturing 15% polyacrylamide / 7 M urea gel, visualized by phosphorimaging and quantified using QuantityOne (Bio-Rad).

RESULTS

Pol γA variants construction and mutation

To further our understanding of two Pol γA R232 mutants implicated in human diseases, we made Gly and His substitutions, yielding R232G and R232H. In addition, because R232 forms a charge-charge interaction with E394 of the distal Pol γB monomer, we also constructed R232E. This mutant should display a repulsive interaction with Pol γB E394. If the charge interaction between the subunits is important for the enzymatic activity of holoenzyme, R232E should be the most defective of the R232 mutants. Further, to reveal any function of R232 distinct from its making a salt-bridge with the distal Pol γB monomer, we constructed Pol γB E394A and E394R. The former substitution abrogates the salt-bridge, whereas E394R introduces a charge repulsion with Pol γA R232. If formation of a salt-bridge with Pol γB E394 is the only function of Pol γA R232, holoenzyme containing Pol γB E394R should have properties similar to that containing Pol γA R232E.

In order to assess DNA synthesis by Pol γA variants without the complications of nucleolytic activity, Pol γA R232G, R232H and R232E were also constructed on an exonuclease-deficient enzyme where the catalytic residue E200 was changed to alanine, yielding Pol γA R232G exo-, R232H exo- and R232E exo-. All proteins were purified to an estimated ≥ 95% homogeneity after SDS-PAGE (Figure 5.1B).

Interaction between Pol γA R232 variants and the accessory subunit

We first considered whether the reduced mtDNA synthesis observed in patients is caused by R232 variants weakening subunit interactions in the holoenzyme. To assess the impact of disruption of the salt-bridge between Pol γA R232 and Pol γB E394, we analyzed the stability of holoenzyme formed by R232G or R232H variants with wild-type Pol γB using analytical gel-filtration chromatography.

When 1 μM wild-type Pol γA (MW 135 kDa) is mixed with 2 μM Pol γB (monomer MW 50 kDa) and chromatographed, a single peak elutes that corresponds to an ~220 kDa species (Figure 5.1C), suggesting that at this concentration the two subunits associate completely. An identical profile is obtained with Pol γA variants R232G (Figure 5.1C), R232H and R232E (data not shown). Peak fractions corresponding to the 220 kDa species were analyzed on SDS gels, showing that the complex contains both Pol γA R232G and Pol γB (Figure 5.1D). The intensity ratio of the two bands is 160:150, consistent with formation of a heterotrimeric Pol γAB2. Peak heights of holoenzymes containing wild-type Pol γA or any of the R232 variants are identical, suggesting that the variants form holoenzyme normally at this concentration.

Figure 5.1 Structure of the heterotrimeric Pol γ holoenzyme with Pol γA (ribbon representation) and dimeric Pol γB (space filling). The proximal and distal monomers of Pol γB are colored light green and light grey, respectively. (A) The salt-bridge between Pol γA R232 and Pol γB E394 constitutes the major interaction of Pol γA with the Pol γB distal monomer in the apo-enzyme. Residues Pol γA R232 (blue), Pol γB E394 and E449 (red) are illustrated. (B) Purified Pol γA wild-type and variants (top) and Pol γB variant proteins (bottom) (1 μg per lane) analyzed on Coomassie-stained SDS-polyacrylamide gels. (C) Superimposed elution profiles of analytical gel-filtration chromatography using 1 μM Pol γA R232G (thick black) or wild-type (grey) in the presence of 2 μM wt Pol γB. The profiles of uncomplexed Pol γA (1 μM) or Pol γB (2 μM) are shown in thin solid or dashed lines, respectively. (D) Proteins in the peak fractions from the Pol γA R232G-Pol γB elution were visualized following SDS-PAGE.

DNA synthesis activity by Pol γA R232 variants

Because R232 variants showed no differences in holoenzyme formation, it was necessary to confirm that these mutations are indeed the cause of clinical disorders. We next analyzed the activities of Pol γA R232 exo- variants (R232G, R232H or R232E) on primed M13 DNA. All variants synthesized about equal amounts and similar lengths of DNA as wild-type Pol γA (Figure 5.2A), suggesting that the polymerization domain of Pol γA is unaffected by the substitutions. However, holoenzymes containing Pol γA variants showed altered DNA synthesis activities. Relative to wild-type, holoenzyme containing Pol γA R232H shows a reduced level of DNA synthesis, albeit greater than that of the corresponding mutant Pol γA alone, suggesting that processivity enhancement by Pol γB is defective. More strikingly, holoenzymes containing Pol γA R232G or R232E are completely non-responsive to Pol γB stimulation. The yields and lengths of DNA synthesized by the mutant Pol γA in the presence of Pol γB are identical to those synthesized by the catalytic subunit alone (Figure 5.2A, Figure 5.3B). This result was unexpected because both mutants form holoenzyme with Pol γB. To test whether the lack of sensitivity to the accessory subunit is due to the lower concentration used in this assay than in gel-filtration (Figure 5.1C), we increased protein concentrations up to 1 µM. No effect of Pol γB was observed (Figure 5.2B and 5.2C), again showing that holoenzyme containing Pol γA R232G or R232E cannot synthesize DNA processively.

We showed previously that processivity enhancement by human Pol γB is conferred by both monomers in separate ways: the proximal Pol γB monomer enhances DNA-binding, while the distal monomer, which harbors the Pol γA R232 binding partner E394, accelerates the polymerization rate (114). The reduced activity of holoenzyme containing Pol γA R232H can, in theory, be rationalized by minor local distortions due to the loss of the salt-bridge between R232 of Pol γA and E394 of Pol γB. However, the synthesis defects of holoenzymes containing Pol γA R232G or R232E are far more severe, even more so than that of wild-type Pol γA complexed to a monomeric Pol γB, ΔI4-D129K (114), which shows a reduced but clear stimulation (Figure 5.2D). Interestingly, Pol γA R232G behaves differently from the wild type in their respective heterodimeric holoenzymes: Pol γA R232G is much less active than the wild-type enzyme (Figure 5.2C). Evidently, fulfilling the maximum potential of the normal heterotrimeric holoenzyme requires both Pol γA R232 and the distal Pol γB monomer, and requires more than merely the local interaction between E394 of that monomer and Pol γA R232.

The importance of the salt-bridge between Pol γA R232 and Pol γB E394 in the apo-enzyme structure was independently tested by altering the latter protein. In experiments comparable to that shown in Figure 5.2, wild-type Pol γA was combined with Pol γB mutants E394A, E394R, E449A, and E449R. Substitutions of E449 serve as a negative control: the residue lies close to E394 but, from the atomic structure of Pol γ holoenzyme, is not expected to interact with Pol γA (Figure 5.1A). In reactions on primed M13 DNA, holoenzyme containing Pol γB E394A, which abrogates the salt-bridge with Pol γA R232, showed near wild-type activity (Figure 5.3A). This result clearly demonstrates that the salt-bridge between Pol γA and the distal Pol γB monomer is not essential for processive DNA synthesis. The Pol γB E394R substitution should generate a repulsive force with Pol γA R232, and although this form of holoenzyme is less active than the wild type, the mutant accessory protein still confers a substantial enhancement of activity to Pol γA. Importantly, when a charge reversal combination of holoenzyme is made using Pol γA R232E and Pol γB E394R, there is only a small enhancement of activity, relative to Pol γA R232E alone (Figure 5.3B), and at a level far below wild-type holoenzyme. Pol γB E394R should clash with Pol γA R232 to a degree comparable to that of Pol γA R232E with Pol γB E394; the observation that Pol γB E394R does not mirror the defect of Pol γA R232E provides more support for the idea that in the presence of DNA, the interaction between subunits is more extensive than a salt-bridge.

Figure 5.2 Steady-state DNA synthesis activities of Pol γA R232 mutants. Assays were performed on primed M13 DNA in the presence of ~10-fold molar excess of ‘trap’ DNA. Products of Pol γA variants, with or without Pol γB, are visualized after separation on a denaturing polyacrylamide gel. (A) 0.05 µM Pol γA variant, 0.1 µM Pol γB, 0.05 µM M13 DNA; (B) 0.05 µM Pol γA R232G and increasing concentrations of Pol γB. Pol γA R232G (C) or wild-type (D) in the presence of dimeric wild-type or monomeric Pol γB ΔI4-D129K.


Figure 5.3 DNA polymerase activities of holoenzymes containing Pol γB E394 variants. Products of wild-type Pol γA (A) or Pol γA R232E (B) paired with a Pol γB variant are visualized after separation on a denaturing polyacrylamide gel. Reactions contained 0.05 µM Pol γA, wild-type or R232E, ± 0.1 µM Pol γB, wild-type or variant, 0.05 µM M13 DNA and 500 nM ‘trap’ DNA.

Effect of Pol γA R232 mutation on processivity

Pre-steady state kinetic analyses were performed to provide a mechanistic explanation for the defects of the Pol γA R232 variants. Polymerization activities of Pol γA exo- forms of R232H and R232G were measured by monitoring the incorporation of a single nucleotide to a 25-nt primer annealed to a 45-nt template (Figure 5.4A). The time-dependent primer extension to 26-nt product was plotted (Figure 5.4B) and data were fitted to the burst equation:

$$[\text{product}_{26\text{-nt}}] = A\,(1 - e^{-k_{pol}\,t}) + k_{ss}\,t \qquad \text{(Eq. 1)}$$

where A is the burst amplitude, kpol the burst rate of polymerization, and kss the steady-state rate. Other important parameters, koff and processivity, which reflect, respectively, the time Pol γ remains associated with the template and the length of DNA synthesized per binding event, can be derived from the above parameters. The data are summarized in Table 5.2 for the wild type and R232 variants, both with and without Pol γB. In the absence of Pol γB, Pol γA R232G and R232H have amplitudes (A 22.1 and 25.6 nM) and burst rates (kpol 10.5 and 12.1 s-1) similar to wild-type (A 27.9 nM, kpol 11.4 s-1). These values are consistent with our steady-state measurements on M13 DNA and confirm that substitutions of R232 do not substantively alter the intrinsic polymerase activity. In the presence of wild-type Pol γB, Pol γA R232H and R232G both show increases in amplitudes to near the wild-type level, indicating that the accessory protein increases DNA-binding almost normally. However, the Pol γA R232H-containing holoenzyme is defective, exhibiting only 60% of the wild-type burst rate and an increased steady-state rate kss that reflects a higher rate of dissociation. Consequently, Pol γA R232H-containing holoenzyme exhibits only 43% of wild-type processivity.
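The relationship between the fitted burst parameters and processivity can be illustrated with a short sketch. It evaluates the burst equation (Eq. 1) and the processivity formula given in the footnote to Table 5.2 (kpol•A/kss), using the holoenzyme values listed in that table. The function and variable names are hypothetical, and this is a re-calculation from the published fitted parameters, not the fitting procedure itself.

```python
import math


def burst(t, amplitude, k_pol, k_ss):
    """Eq. 1: product (nM) at time t (s) = A(1 - exp(-k_pol*t)) + k_ss*t."""
    return amplitude * (1.0 - math.exp(-k_pol * t)) + k_ss * t


def processivity(amplitude, k_pol, k_ss):
    """Table 5.2, footnote b: processivity (nt) = k_pol * A / k_ss."""
    return k_pol * amplitude / k_ss


# Holoenzyme parameters from Table 5.2: (A in nM, k_pol in s^-1, k_ss in nM s^-1)
HOLOENZYMES = {
    "wt": (50.0, 29.7, 3.0),
    "R232H": (49.3, 17.2, 3.9),
    "R232G": (45.8, 7.7, 5.6),
}

proc = {name: processivity(*params) for name, params in HOLOENZYMES.items()}
relative = {name: value / proc["wt"] for name, value in proc.items()}

for name in HOLOENZYMES:
    print(f"{name}: processivity {proc[name]:.0f} nt, "
          f"{100 * relative[name]:.0f}% of wild type")
```

With these inputs the R232H holoenzyme comes out at roughly 44% of wild-type processivity, in line with the figure quoted above, and the R232G holoenzyme at roughly one-eighth.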

Remarkably, the burst rate of the R232G-containing holoenzyme fails to increase from that of the catalytic subunit alone; it actually decreases to give a nearly 4-fold reduction from wild-type holoenzyme. Further, the steady-state or dissociation rate of Pol γA R232G-containing holoenzyme is almost twice that of wild-type; the two defects result in an 8-fold drop in processivity. Thus, although Pol γA R232G has only a minor defect in DNA-binding, and no significant defect in DNA synthesis, the presence of the wild-type Pol γB accessory factor in the mutant holoenzyme drastically reduces polymerization activity. Note that the burst rate is for incorporation of a single nucleotide. A small reduction in burst rate is more apparent during synthesis of longer products because the amount of an N-nucleotide product is related to the burst rate by the Nth-power. Thus, the lack of processivity enhancement in Pol γA R232G-containing holoenzyme (Figure 5.2) is the combination of increased DNA binding but a severely reduced reaction rate.

Similar experiments were performed using wild-type Pol γA and mutant Pol γB (Figure 5.4C); they also support observations made in steady-state reactions with primed M13 DNA (Figure 5.3A). The Pol γB E394A mutant holoenzyme gives an amplitude and burst rate indistinguishable from wild-type (Table 5.2). However, holoenzyme containing Pol γB E394R, which may alter the position of Pol γA R232 by charge repulsion, results in a slightly reduced amplitude and a burst rate 80% of wild-type. Together, the data support the idea that the salt-bridge between Pol γA R232 and Pol γB E394 simply maintains the correct spatial conformation of Pol γA R232 in the holoenzyme. Importantly, the changes in kinetic parameters of holoenzyme containing Pol γB E394R do not match those of holoenzyme containing Pol γA R232G, confirming the idea that substitutions of Pol γA R232 affect enzyme activity beyond that expected for a simple salt-bridge interaction between subunits.

Figure 5.4 Pre-steady-state single nucleotide incorporation. (A) The 25/45 nt duplex DNA substrate. (B) The 26-mer product was quantified from reactions using Pol γA wild-type or R232 variants in the presence or absence of Pol γB. (C) Same as (B), but with Pol γB wild-type or E394 variants in the presence of wild-type Pol γA. Short time points are shown as an inset. Reactions contain 200 nM 25/45 nt DNA and 70 nM Pol γA, with or without 300 nM Pol γB.

Table 5.2 Polymerization activities of Pol γA mutants

Pol γA (exo-)   Pol γB   Amplitude (A)a (nM)   Burst rate, kpola (s-1)   Steady state rate, kssa (nM•s-1)   Processivityb (nt)
wt              -        27.9 (± 0.8)          11.4 (± 0.8)              10.3 (± 0.2)                       31
R232H           -        25.6 (± 0.7)          12.1 (± 0.8)              9.5 (± 0.3)                        32
R232G           -        22.1 (± 1.1)          10.5 (± 1.4)              9.9 (± 0.4)                        23
wt              wt       50.0 (± 0.8)          29.7 (± 1.5)              3.0 (± 0.8)                        495
R232H           wt       49.3 (± 0.6)          17.2 (± 0.5)              3.9 (± 0.2)                        217
R232G           wt       45.8 (± 1.0)          7.7 (± 0.4)               5.6 (± 0.4)                        63
wt              E394A    49.6 (± 0.7)          29.4 (± 0.5)              3.4 (± 0.3)                        428
wt              E394R    45.1 (± 0.5)          23.2 (± 0.7)              3.9 (± 0.4)                        268

a. Numbers in parentheses are the root-mean-square standard deviations for data fitting.

b. Processivity is calculated by kpol•A/kss

Effects of the R232 mutation on exonuclease activity

Spatially, R232 is located on the periphery of the 3’→5’ exonuclease domain (consisting of residues 171 - 440), 45 Å away from the polymerase domain. How do substitutions of R232 abolish the stimulation normally conferred by the accessory protein Pol γB and, in the case of Pol γA R232G, make the holoenzyme a less efficient enzyme than the catalytic subunit alone? Examination of the DNA synthesized by holoenzyme containing Pol γA R232G reveals that shorter products predominate (Figure 5.2). This could be due to lower processivity but it could also reflect increased degradation of newly synthesized DNA. This last idea led us to examine the exonuclease activity of Pol γA variants.

Exonuclease activity was tested using a partial duplex containing mismatched nucleotides (Figure 5.5A), which mimics a product of erroneous synthesis generated during DNA synthesis. This substrate allows us to examine rates of excision of primers containing from four to one mismatched nucleotides, and determine the rate of excision for each. Furthermore, in the same reaction the rate of excision of correctly base-paired primers can also be determined. This substrate therefore enables examination of consecutive primer hydrolytic reactions at single nucleotide resolution (Figure 5.5). The amount of each hydrolytic product was monitored over time and resolved by electrophoresis on a denaturing polyacrylamide gel (Figure 5.5B-E). Each band corresponding to a defined primer length was quantified and is presented graphically in Figure 5.5F-I. The time-dependent hydrolytic product formation was fitted (Figure 5.5F-I, line) using the KinTek Global Kinetic Explorer program (KinTek Co.) with the following model reactions:

$$\text{Enzyme}\cdot\text{DNA}_{25\,\text{nt}} \rightleftharpoons \text{Enzyme}\cdot\text{DNA}_{24\,\text{nt}} \rightleftharpoons \cdots \rightleftharpoons \text{Enzyme}\cdot\text{DNA}_{16\,\text{nt}}$$

$$\text{Enzyme}\cdot\text{DNA}_{i} \rightleftharpoons \text{Enzyme} + \text{DNA}_{i} \qquad (i = 25 \text{ to } 16\ \text{nt})$$
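To make the sequential excision model concrete, the sketch below integrates a simplified version of it forward in time. It is not the KinTek global fit: as labeled assumptions, hydrolysis steps are treated as irreversible, released DNA does not rebind, all DNA starts enzyme-bound at 25 nt, and the chain stops at 19 nt. The rate constants are the values read from Table 5.3 for wild-type Pol γA alone; the function name and the simple Euler integration are illustrative choices.

```python
# Simplified forward simulation of sequential 3'->5' excision (illustrative only).
# Assumptions: irreversible hydrolysis, no rebinding of released DNA,
# all DNA initially bound as the 25-nt primer, chain truncated at 19 nt.


def simulate_excision(step_rates, k_off, t_end=1.0, dt=1e-4):
    """Euler-integrate E.DNA_n -> E.DNA_(n-1) -> ... with E.DNA_i -> E + DNA_i.

    step_rates: hydrolysis rate constants (s^-1), one per excision step.
    k_off: dissociation rate constant (s^-1), the same for every species.
    Returns (bound, free): fractions per primer length, longest primer first.
    """
    n = len(step_rates) + 1
    bound = [0.0] * n
    free = [0.0] * n
    bound[0] = 1.0
    for _ in range(int(round(t_end / dt))):
        nxt = bound[:]
        for i, b in enumerate(bound):
            k_hyd = step_rates[i] if i < n - 1 else 0.0
            nxt[i] -= (k_hyd + k_off) * b * dt   # loss by excision and release
            free[i] += k_off * b * dt            # released DNA of this length
            if i < n - 1:
                nxt[i + 1] += k_hyd * b * dt     # one nucleotide shorter
        bound = nxt
    return bound, free


# Wild-type Pol γA alone, Table 5.3 (s^-1): steps 25->24 through 20->19
WT_STEPS = [9.0, 24.1, 24.0, 9.2, 1.6, 1.2]
WT_K_OFF = 7.5
LENGTHS = list(range(25, 18, -1))  # 25 nt down to 19 nt

bound_wt, free_wt = simulate_excision(WT_STEPS, WT_K_OFF, t_end=1.0)
for length, b, f in zip(LENGTHS, bound_wt, free_wt):
    print(f"{length} nt: bound {b:.3f}, released {f:.3f}")
```

Swapping in the other columns of Table 5.3 gives a rough feel for how slower dissociation in the holoenzyme lets more primers reach the shorter lengths before release; quantitative conclusions still require the full reversible model used for the fit.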

Figure 5.5 Exonuclease activities of Pol γA R232 variants. (A) The partial duplex substrate. Products of each excision event with the catalytic subunit (open symbols) or holoenzyme (filled symbols) were visualized (B-E) and quantified (F-I; symbols show data, lines show global fitting results). All reactions contained 75 nM DNA substrate, 100 nM wild-type Pol γA (circles) or R232G (triangles), ± 500 nM Pol γB.

Table 5.3 Global fitting analysis for exonuclease activity of Pol γA variants

Reactions (rates in s-1)             Pol γA wt   Pol γA R232G   Pol γA wt and Pol γB wt   Pol γA R232G and Pol γB wt
DNA hydrolysis
  25 nt → 24 nt                      9.0         7.6            6.1                       10.8
  24 nt → 23 nt                      24.1        23.2           10.6                      16.6
  23 nt → 22 nt                      24.0        23.1           9.3                       14.3
  22 nt → 21 nt                      9.2         8.2            2.1                       5.1
  21 nt → 20 nt                      1.6         1.4            0.18                      0.60
  20 nt → 19 nta                     1.2         1.1            0.15                      0.30
Enzyme⋅DNA → Enzyme + DNAb           7.5         6.6            1.3                       2.1

a. The hydrolysis rate of each step generating products shorter than 19 nt was assumed to be the same as that of this reaction.
b. The reverse reaction rate was fixed to 10 sec-1 during the fitting.

For all enzymes, the rate for excision of primers containing more mismatched nucleotides is greater than for those with fewer; the primer with one mismatch is excised the slowest (Table 5.3). This is in agreement with earlier pre-steady state analyses (9), and FRET analyses (115). However, the rate of hydrolysis of a perfectly base-paired primer (yielding products <21 nt) is strikingly slower than that of any mismatched primer (Figure 5.5F-I). Thus, when the products of mismatch removal are analyzed over longer times, the reaction is clearly biphasic, with a sharp initial rise phase that plateaus at a maximum amplitude. The plateau is followed by a gradual decline (Figure 5.6A) because the curve represents two combined reactions: fast hydrolysis of the mismatched primer (21-24 nt) and slower degradation of the correctly base-paired primer (< 21 nt) generated after mismatch removal. The second phase would be absent if the exonuclease activity of Pol γ were directed exclusively towards a mismatched 3´-end.

As our primary interest here is in the comparison of wild-type and mutant enzyme, we took advantage of the drastic rate difference for excision of mismatch and correctly paired primers, and simplified the analysis by pooling data for primers containing one or more mismatches and, separately, for primers that are fully duplexed. The concentration of products resulting from hydrolysis of the original 25-nt primer can then be expressed as the sum of both products:

$$[25\,\text{nt}]_0 - [25\,\text{nt}]_t = \sum(21\text{-}24\,\text{nt})_t + \sum(<21\,\text{nt})_t \qquad \text{(Eq. 2)}$$

where [25nt]0 and [25nt]t are concentrations of the 25-nt primer at times 0 and t (Figure 5.6A inset). Σ(21-24nt)t and Σ(<21nt)t are the respective sums of nucleolytic products of primers originally containing a mismatched or a base-paired 3´-end. The two processes are sequential, which allows us to simplify the kinetic analysis by fitting the formation of all mismatched primers Σ(21-24nt)t at short times (≤ 1 sec, Figure 5.6A) to:

$$\sum(21\text{-}24\,\text{nt})_t = A_m\,[1 - \exp(-k_{exo}^{m}\,t)] \qquad \text{(Eq. 3)}$$

Data for hydrolysis of duplex DNAs shorter than 21 nt (Figure 5.6B) were then fitted to:

$$\sum(<21\,\text{nt})_t = A_c\,[1 - \exp(-k_{exo}^{c}\,t)] \qquad \text{(Eq. 4)}$$

where $A_m$ and $A_c$ are the respective amplitudes for mismatched and correctly paired primer-templates; they represent the amount of primer/template bound to polymerase subject to excision during the first cycle of reaction. $k_{exo}^{m}$ and $k_{exo}^{c}$ represent excision rates of mismatched and correctly paired primers.

Mismatch removal by Pol γA and holoenzyme

Wild-type Pol γA exhibits significant nucleolytic activity for mismatched DNA: with an amplitude of 40 nM, substrate is excised at a rate of 20.7 s-1 (Figure 5.6A, Table 5.4). Pol γA R232G alone displays mismatch removal activity comparable to wild-type, suggesting the substitution has little impact on the integrity and function of the exonuclease active site. However, in the presence of Pol γB, significant differences become apparent. The amplitude of wild-type holoenzyme increases, indicating an increased affinity for a mismatched DNA substrate, and the burst rate is reduced. Using Eq. 3, and estimating that hydrolysis of the four mismatched nucleotides in the 45-nt/25-nt substrate takes ~0.2 sec at the rate of 20.7 s-1, we calculate that 48.2 nM mismatched nucleotide is excised in 0.2 sec by holoenzyme, whereas 39.2 nM is excised by Pol γA alone. The higher amplitude of the holoenzyme-catalyzed reaction compensates for the slower excision burst rate, and the overall efficiency of mismatch excision by holoenzyme becomes ~20% higher than that of Pol γA alone. Holoenzyme containing Pol γA R232G increases the amplitude; however, the burst rate remains essentially constant (Table 5.4), resulting in an overall mismatch excision activity greater than wild-type. Again using Eq. 3, the Pol γA R232G holoenzyme excises 57 nM of mismatched primer in 0.2 sec, 30% higher than wild-type holoenzyme and a 250% increase from Pol γA alone (Table 5.5).
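The Eq. 3 calculation for Pol γA alone can be reproduced approximately from the two values quoted in the text. The sketch below uses the rounded amplitude of 40 nM and the rate of 20.7 s-1; because the fitted amplitude in Table 5.4 is not rounded, the result is expected to land near, rather than exactly on, the 39.2 nM reported. The function name is hypothetical.

```python
import math


def burst_amount(amplitude_nM, k_exo, t):
    """Eq. 3 / Eq. 4 form: product (nM) = A * [1 - exp(-k_exo * t)]."""
    return amplitude_nM * (1.0 - math.exp(-k_exo * t))


# Wild-type Pol γA alone, using the rounded values quoted in the text.
wt_pol_a = burst_amount(40.0, 20.7, 0.2)

# Ratio of the reported 0.2-sec excision amounts (holoenzyme vs. Pol γA alone).
holo_vs_alone = 48.2 / 39.2

print(f"Pol γA alone, 0.2 s: {wt_pol_a:.1f} nM")
print(f"Holoenzyme / Pol γA alone: {holo_vs_alone:.2f}")
```

With the rounded inputs the first value is about 39.4 nM, and the ratio of about 1.2 corresponds to the ~20% increase in mismatch excision attributed to the holoenzyme.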

Hydrolysis of a correct base-pair

In contrast to the increased amplitude for mismatched DNA excision, on perfect duplex DNA both the amplitude and burst rate of wild-type holoenzyme are lower than those of Pol γA alone (Figure 5.6B, Table 5.4). For example, using Eq. 4, hydrolysis of a correctly paired primer by holoenzyme is 0.8 nM in 0.2 sec, only 38% that of Pol γA. The differential change in amplitude and burst rate on correct and incorrect primers enhances the nucleolytic specificity of holoenzyme towards mismatched DNA. The accessory protein Pol γB therefore not only increases the processivity and rate of polymerization but also simultaneously stimulates excision of incorrectly incorporated nucleotides and reduces unnecessary excision of correctly incorporated nucleotides. By performing both functions, Pol γB makes a significant contribution to the fidelity of DNA synthesis. In stark contrast, holoenzyme containing Pol γA R232G exhibits both increased amplitude and rate of excision of correctly incorporated nucleotides relative to wild-type (Table 5.4), giving the mutant holoenzyme a 3-fold higher hydrolytic activity on duplex DNA than wild type. These observations suggest that the mutant holoenzyme is less capable of distinguishing the conformations of mismatched and correctly paired primer-templates and, therefore, correctly incorporated nucleotides are excessively and unnecessarily hydrolyzed. Taken together, holoenzyme containing Pol γA R232G has elevated exonucleolytic activities for both correct and incorrect DNA. Nonetheless, because of increased removal of the correct DNA, the selective nucleolytic activity is reduced. Consequently, wild-type holoenzyme is 60-fold more active on mismatches than on correctly paired DNA, whereas the R232G holoenzyme is only 23-fold more active (Table 5.5).
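The duplex hydrolysis and selectivity figures quoted above can be reproduced with a short sketch. It evaluates Eq. 4 with the wild-type parameters from Table 5.4 and then forms the ratio defined in footnote b of Table 5.5 from the tabulated amounts; the function and variable names are illustrative assumptions, not part of the original analysis.

```python
import math

def excised(amplitude_nM, k_exo_per_s, t_s):
    """Burst equation (Eq. 3 / Eq. 4): A * (1 - exp(-k_exo * t))."""
    return amplitude_nM * (1.0 - math.exp(-k_exo_per_s * t_s))

def selectivity(mismatch_nM, correct_nM):
    """Selectivity = [mismatch excised] / [correct primer excised] (Table 5.5)."""
    return mismatch_nM / correct_nM

# Duplex (correctly paired) primer excised in 0.2 sec, parameters from Table 5.4.
duplex_pol_gamma_a = excised(53.8, 0.2, 0.2)   # Pol gammaA alone
duplex_holo_wt = excised(41.2, 0.1, 0.2)       # wild-type holoenzyme
duplex_ratio = duplex_holo_wt / duplex_pol_gamma_a

# Selectivity from the amounts listed in Table 5.5.
sel_holo_wt = selectivity(48.2, 0.8)
sel_holo_r232g = selectivity(57.0, 2.4)

print(duplex_pol_gamma_a, duplex_holo_wt, duplex_ratio)
print(sel_holo_wt, sel_holo_r232g)
```

The ratio of the two duplex values comes out close to the 38% quoted in the text; small differences are expected because the tabulated rates are rounded to one decimal place.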


Figure 5.6 Exonuclease analyses. Products of all mismatches (21~24-mer) (A) and duplex DNA (< 21-mer) (B) fitted to their respective burst equations (Eq. 3 and Eq. 4).

Table 5.4 Exonuclease activity of Pol γA variants

Enzyme (Pol γA) | Enzyme (Pol γB) | Mismatch excision: amplitude A_m (nM)^a | Mismatch excision: average exo rate k_exo^m (s⁻¹)^a | Duplex excision: amplitude A_c (nM)^a | Duplex excision: average exo rate k_exo^c (s⁻¹)^a
--- | --- | --- | --- | --- | ---
Pol γA wt | - | 40.0 (± 2.4) | 20.7 (± 3.4) | 53.8 (± 2.2) | 0.2 (± 0.01)
Pol γA R232G | - | 39.2 (± 1.9) | 16.8 (± 2.2) | 51.2 (± 2.1) | 0.2 (± 0.01)
Pol γA wt | Pol γB wt | 60.4 (± 1.7) | 8.0 (± 0.5) | 41.2 (± 1.6) | 0.1 (± 0.00)
Pol γA R232G | Pol γB wt | 57.1 (± 2.8) | 16.2 (± 1.2) | 61.8 (± 1.5) | 0.2 (± 0.01)

a. Numbers in parentheses are the root-mean-square standard deviations for data fitting.

Table 5.5 Selective exonuclease activities of wild-type and mutant enzymes^a

Enzyme | Mismatches excised (nM) | Correct primer excised (nM) | Selectivity^b for mismatches
--- | --- | --- | ---
Pol γA-wt | 39.2 | 2.1 | 18.6
Pol γA holoenzyme-wt | 48.2 | 0.8 | 60.2
Pol γA-R232G | 39.2 | 2.0 | 19.6
Pol γA-R232G holoenzyme | 57.0 | 2.4 | 23.7

a. Calculated for 0.2 sec duration.
b. Selectivity = [mismatch excised]/[correct primer excised]

DISCUSSION

Among disease-associated Pol γA mutations, those located in the polymerase (pol) or exonuclease (exo) active sites have been easiest to rationalize. The recognizable motifs and the crystal structure of the homologous T7 DNAP bound to DNA aided the interpretation of biochemical defects due to active site mutations. Such analyses have provided explanations for some severe disorders, e.g., active site mutations in pol have been linked to autosomal recessive PEO (arPEO), and mutations in exo to pathological premature aging (28,31). However, correlating a clinical phenotype with mutations outside the active sites was more difficult until a crystal structure of human Pol γ became known (108). Several patients displaying severe clinical neurological and muscular disorders have been found to carry the Pol γA R232 substitutions R232G and R232H. Patients presented various degrees of mtDNA copy number reduction or deletions of mtDNA, both of which are indicative of defective mtDNA maintenance. Using an atomic structure of Pol γ holoenzyme to guide our biochemical investigation, we have described critical properties of Pol γA R232 that provide a molecular explanation for the role of R232 substitutions in human disease. Like the catalytic subunits of most DNA replicases, Pol γA alone has low processivity. Not until Pol γA associates with its accessory subunit Pol γB does the resulting holoenzyme exhibit high processivity. Pol γA R232 forms the only strong interaction with the distal Pol γB monomer in the holoenzyme, a salt-bridge with Pol γB E394. While our biochemical investigation confirms that the interaction contributes to Pol γ activity, the salt-bridge between the residues seen in the apo-enzyme is not absolutely required. However, Pol γA R232 is essential for mediating Pol γB functions, explaining conservation of the residue across diverse Pol γA proteins, even in species where the holoenzyme contains only monomeric Pol γB.

Substitutions of Pol γA R232 alter both polymerase and exonuclease activities of holoenzyme

Substitutions of Pol γA R232 cause no detectable defects in the activities of the catalytic subunit, suggesting that R232 does not directly participate in either pol or exo activities. Surprisingly, Pol γA mutants R232G or R232E show major defects only in holoenzyme, where, despite an apparently normal association of subunits, the effects of Pol γB are completely abrogated. The R232H mutant behaves similarly, albeit to a lesser degree. We conclude that Pol γA R232 constitutes an essential part of the communication pathway with Pol γB. Signal transduction is completely disrupted by the R232G or R232E substitutions; the mutant holoenzymes display markedly reduced processivity. Concurrently, they display altered exonuclease activity. The combined changes of pol and exo activities result in Pol γA R232 mutant-containing holoenzymes having a major defect in DNA synthesis. Holoenzyme containing Pol γA R232H retains 43% of wild-type activity, and that containing R232G only 13%. We showed previously that each monomer of the Pol γB dimer has a distinct role in processivity: the proximal monomer increases affinity for DNA whereas the distal monomer increases polymerization rate (114). Holoenzymes containing R232H or R232G show no reaction rate acceleration, or even a deceleration, relative to Pol γA alone, indicating that Pol γA R232 is mainly responsible for transmitting a signal from the distal Pol γB monomer. Nonetheless, the mutant holoenzymes exhibit normal DNA-binding affinity, suggesting that the function of the Pol γB proximal monomer is unaffected by substitutions of R232. The Pol γA K-tract, a lysine-rich region (496KQKKAKKVKK505) of the AID subdomain ((108), Figure 5.7), most likely enhances DNA binding in holoenzyme. The distinct functions of the two Pol γB monomers, strengthening DNA binding and enhancing the polymerization rate, may thus be independently mediated by the K-tract and R232 of Pol γA.

Pol γA R232 senses the conformation of primer-template for selective exo activity

Pol γB was previously reported to reduce the exonuclease activity of holoenzyme relative to Pol γA alone, which could potentially lower its proofreading ability (9,23). This seemingly detrimental activity of a processivity factor is, however, a combination of excision of both a mismatched and a properly base-paired primer. By modifying the methods of reference (9), we were able to analyze mismatch and correct nucleotide excision in the same reaction. Although Pol γA has an intrinsic ability to distinguish mismatched from correctly matched primer-template, Pol γB further enhances this selectivity. By increasing the affinity and reaction rate of a mismatched primer in the exo site, and of a correctly paired primer in the pol site, Pol γB facilitates the DNA geometry-dependent activity of the catalytic subunit, thereby making a positive contribution to the fidelity of DNA synthesis. As a result, the nucleolytic rate of Pol γ holoenzyme on a correctly paired primer is reduced to ~1% of that on a mismatch (Table 5.5). The exonuclease activity of holoenzyme containing Pol γA R232G is elevated for both correctly and incorrectly paired primers. However, the nucleolytic activity for the correct primer increases more than for mismatched primers, leading to a reduced ability to discriminate. Consequently, the mutant holoenzyme more frequently positions a correctly base-paired primer in the exo site, resulting in excessive degradation. This suggests that R232 comprises part of the sensing system that directs a primer terminus to pol or exo as appropriate.

The pol and exo sites are 45 Å apart, similar to those in other DNA polymerases. How is a primer strand transferred between the two sites? Crystal structures of two DNAP editing complexes with the primer terminus in exo show that both maintain their respective apo-enzyme conformations (116-118); primer bound in the exo site may therefore be the low energy state. The low energy state for DNA bound to Pol γA may also lie in exo, but for holoenzyme it is biased towards the pol site. Binding of dNTPs should also promote switching to pol, but when polymerization stalls, the primer strand falls back into the exo site for editing. A mismatch has the highest probability of being edited immediately after its incorporation (9). The important comparison is therefore between hydrolysis of a single mismatch and perfectly duplexed DNA. Interestingly, mismatch removal occurs at a steadily decreasing rate as the number of mismatches falls from four nucleotides to one. The rate reduction may reflect a requirement for an increasing number of base-pairs to be unwound prior to excision (119,120). The rate of hydrolysis of a single mismatch is slowest among mismatches, suggesting that the lowest energy barrier for a primer to partition between the exo and pol sites is for a single mismatch, in agreement with the fluorescence studies of Millar and co-workers (115). This suggests that the two sites are delicately balanced and can be inter-converted by a subtle structural change in polymerase conformation. A low energy barrier would allow rapid switching between synthesis and editing modes so that the polymerase can achieve rapid replication with high fidelity.

R232 and primer strand transfer

We suggest the following model for why substitutions of Pol γA R232 have such a detrimental effect on holoenzyme activity when they confer only minor changes to the activities of the catalytic subunit alone. Substitution of Pol γA R232 changes both pol and exo activities, suggesting that the arginine residue directly interacts with DNA. In the current absence of atomic structures of Pol γ-DNA complexes, we constructed a model of a Pol γ-DNA complex by docking primer-template DNA onto the structure of the apo-holoenzyme (Figure 5.7). Two Pol γA DNA binding regions, the AID subdomain and R232, function only in the holoenzyme. Both regions are essential for positioning the primer terminus in the pol or exo site, and their functions are mutually dependent. The AID subdomain harbors a positively charged lysine-rich K-tract that could contact DNA (108). The opposite face of the K-tract forms the major subunit interaction with the Pol γB proximal monomer, but deleting the interacting residues in Pol γA has no effect on activity. In the absence of Pol γB the AID subdomain is simply too flexible to allow the K-tract to function. Furthermore, in the absence of a K-tract interaction to clamp the template-primer, R232 cannot form a stable interaction with DNA. Thus, in the absence of Pol γB, neither AID nor R232 alone can contribute to Pol γA function, which explains why R232 substitutions only affect holoenzyme activity. When a template-primer is docked onto the apo-enzyme structure, positioning the 3´-end of the primer in the pol site, neither R232 nor the AID subdomain directly contacts DNA. We suggest that upon binding to DNA the Pol γA AID subdomain undergoes a rotational movement that aligns the K-tract with the phosphodiester backbone. Because of the AID subdomain’s strong interaction with Pol γB, rotation of the AID subdomain should induce a concerted rotation of the dimeric Pol γB in the direction that brings the distal monomer close to the R232-loop region. Consequently, the local subunit contact surface area between Pol γA and the distal Pol γB monomer is extended beyond the single salt-bridge of R232-E394 in the apo-enzyme.
In addition, the R232-loop is brought closer to the DNA (Figure 5.7). Pol γA R232 may then form a bipartite charge-charge interaction with both the negatively charged backbone of the primer-template and Pol γB E394, and may thus coordinate with the latter in positioning the primer-template in the pol site. In the holoenzyme, while the K-tract firmly binds the duplex DNA upstream of the primer terminus, Pol γA R232 could act as a pivot between the pol and exo active sites by biasing a duplex primer towards pol. This conformational change model explains why R232 only functions in the holoenzyme. Eliminating the positive charge in Pol γA R232G disrupts both the salt-bridge with Pol γB and its interaction with DNA. This explains why Pol γA R232G is completely unresponsive to Pol γB-mediated stimulation of polymerase and suppression of exonucleolytic reaction rates. Similarly, Pol γA R232E introduces charge repulsion to both DNA and Pol γB E394, supporting the proposed role for R232. Presumably, the histidine substitution in R232H retains enough positive charge character in its local environment to allow a small response to the presence of Pol γB. Loss of R232 function abolishes the balancing force to switch the primer strand back to the pol site, resulting in excessive nucleolytic activity. The Pol γB E394R substitution, which leads to a partially defective holoenzyme (Figure 5.3; Table 5.2), should widen its separation from Pol γA R232, indicating that for optimal function the proper local geometry of R232 and E394 is important. This geometry must be maintained with Pol γB E394A, as holoenzyme containing this mutant is fully functional, but the defects of the charge-reversal combination of Pol γA R232E:Pol γB E394R holoenzyme indicate that a simple interaction is insufficient and Pol γA R232 is of critical importance. Pol γA R232 provides the balancing force to switch the primer strand from its likely low energy state in exo back to pol. Loss of Pol γA R232 therefore prolongs the residence time of the primer in the exo site, resulting in excessive nucleolytic activity.

Figure 5.7 The signaling pathway in the modeled Pol γ holoenzyme-DNA complex. The 5’-end of the template and the 3’-end of the primer are indicated. (A) Pol γA undergoes conformational changes upon binding to DNA at the K-tract that results in concerted changes in Pol γB. The R232-E394 region moves closer to the DNA binding channel (in the direction indicated by the arrow). In this conformation, Pol γA R232 forms a bipartite charge interaction with both the DNA backbone and Pol γB E394 that facilitates positioning the primer terminus in pol. (B) The mutant Pol γA G232 cannot make either interaction, and is thus unable to transduce the signal from Pol γB. A more detailed discussion is given in the text.

Pol γA R232 substitution and human diseases

Pol γA R232 has an essential role in both processive DNA synthesis and proofreading, providing a molecular explanation for the association of Pol γA R232 substitutions with human diseases. From the estimated reduction in synthetic capacity (2.5-fold for R232H and 7.7-fold for R232G) and the increase in exonucleolytic activity (3-fold), we predict that the overall mtDNA content for patients carrying Pol γA R232H or R232G should be about 13% or 4%, respectively, of normal. These predictions are in excellent agreement with clinical findings, where the patients who carried Pol γA R232H and R232G had 12% and 3-5%, respectively, of the normal mtDNA content. However, patients carrying Pol γA R232 substitutions are heterozygous for POLG, so the effects of the substitution are complicated by the presence of the other copy. Though recessive to wild-type, an R232 mutant holoenzyme may compete effectively with Pol γA variants for template DNA, reducing their capacity for synthesis. The disease manifestations suggest that the substitutions T251I/P587L, G737R, and G848S that accompany R232G/H should also cause Pol γA to be defective. Even if the intrinsic effects of these substitutions on Pol γ activity are mild, they will be enhanced in the presence of an R232 substitution. Inducing expression of a dominant negative mutant Pol γA in human cells quickly depleted mtDNA (121), an observation likely caused by the same mechanism.
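The predicted mtDNA contents quoted above (about 13% and 4% of normal) can be reproduced if the loss of synthetic capacity and the gain in exonucleolytic activity are assumed to combine multiplicatively. That assumption is our reading of the estimate and is not spelled out in the text; the function name below is hypothetical.

```python
def predicted_mtdna_fraction(synthesis_fold_reduction, exo_fold_increase):
    """Fraction of normal mtDNA content, assuming the two effects multiply."""
    return 1.0 / (synthesis_fold_reduction * exo_fold_increase)

# Fold changes quoted in the text: 2.5-fold (R232H) and 7.7-fold (R232G)
# reductions in synthesis, and a 3-fold increase in exonucleolytic activity.
r232h_fraction = predicted_mtdna_fraction(2.5, 3.0)
r232g_fraction = predicted_mtdna_fraction(7.7, 3.0)

print(f"R232H: {100 * r232h_fraction:.0f}% of normal")
print(f"R232G: {100 * r232g_fraction:.0f}% of normal")
```

This is only a consistency check on the quoted percentages; it does not model the heterozygous state discussed in the same paragraph.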

Summary

The crystal structure of human Pol γ holoenzyme reveals a specific subunit interaction between a monomeric catalytic subunit, Pol γA, which belongs to the DNA Pol I polymerase family, and a dimeric accessory subunit, Pol γB, which confers processivity on the holoenzyme. Although Pol γA adopts the canonical polymerase “right-hand” configuration with “fingers”, “palm”, and “thumb” subdomains, a spacer domain in Pol γA connects the N-terminal exonuclease domain and the C-terminal polymerase domain through the long helices of the thumb subdomain. The unique fold of the spacer domain, which is missing or diminished in other members of the Pol I family, provides a hydrophobic interface for subunit interaction with the processivity factor Pol γB through the L-helix in the AID subdomain, and contributes to the intrinsic processivity of the catalytic subunit through the IP subdomain. The Pol γ-DNA complex model shows that the hydrophobic interaction between Pol γB and the L-helix in Pol γA exposes a positively charged surface on Pol γA (the “K-tract”) to the DNA binding channel, thus increasing the contact of holoenzyme with DNA. In the heterotrimeric Pol γ holoenzyme structure, Pol γA interacts primarily with the proximal monomer of Pol γB through the thumb subdomain and the L-helix in the AID subdomain of Pol γA, but makes only limited contacts with the distal Pol γB monomer. Kinetic analyses with an engineered monomeric human Pol γB reveal the role of each Pol γB monomer in processivity: the proximal monomer strengthens the DNA binding affinity of the holoenzyme, whereas the distal monomer accelerates the rate of nucleotide incorporation. In addition, the subunit interaction of Pol γ holoenzyme was stabilized in the presence of primer-template DNA.

The human Pol γ structure provides a molecular basis for mitochondrial diseases caused by mutations in Pol γA. These mutations can be classified into three groups: class I, containing mutations in the active site; class II, including substitutions located in the DNA binding channel; and class III, for subunit interface mutations. In particular, mutations in the spacer region (amino acids 440 to 830 of Pol γA, defined by sequence alignment with other DNA polymerases) segregate into classes II and III, and the effects of these mutations can now be rationalized based on the Pol γ structure. For example, the most common substitution among Pol γ variants, A467T (class III), reduces DNA binding affinity and lowers processivity. The substitution also weakens subunit interaction. Mutation of the A467 residue, located in the thumb subdomain, disrupts the local structure of the thumb, which interrupts interaction with DNA and Pol γB. W748, located in the IP subdomain, participates in stacking interactions with F750 and H733, and is likely important for the IP structure. Thus, the effects of the W748S mutation (class II), lower DNA polymerization activity but normal holoenzyme formation, are expected to result from disruption of the local structure of the IP subdomain. In addition, the holoenzyme crystal structure provides a clue to understanding the disease-related Pol γA R232 substitutions, which had not been characterized at the molecular level. Among the limited interactions between Pol γA and the distal Pol γB monomer, Pol γA R232 and Pol γB E394 make the only strong interaction. Biochemical analyses with Pol γA carrying R232 substitutions reveal that the deleterious effects of the substitution, decreased polymerization activity and reduced discrimination between correct and incorrect base pairs in exonuclease activity, are observed only in holoenzyme with Pol γB. Consequently, the R232 residue of human Pol γA is critical for relaying the regulatory activity conferred by Pol γB, both to enhance processivity in DNA polymerization and to facilitate selective editing of mis-incorporated nucleotides.

Lastly, the Pol γ-DNA complex model provides an initial template to characterize the differences between the active sites of Pol γ and HIV RT, which sets a foundation for understanding the mechanisms of antiviral drug toxicity as well as for developing drugs with less toxicity. Continued research to determine co-crystal structures of Pol γ with DNA and incoming nucleotides will elucidate a more precise catalytic mechanism of Pol γ as well as provide a clearer template for designing selective inhibitors.

References

1. Chen, X. J., and Butow, R. A. (2005) Nature Reviews Genetics 6, 815-825
2. Lim, S. E., Longley, M. J., and Copeland, W. C. (1999) Journal of Biological Chemistry 274, 38197-38203
3. Johnson, A. A., Tsai, Y. C., Graves, S. W., and Johnson, K. A. (2000) Biochemistry 39, 1702-1708
4. Ito, K., Arens, M., and Green, M. (1975) Journal of Virology 15, 1507-1510
5. Lee, H. R., and Johnson, K. A. (2007) Journal of Biological Chemistry 282, 31982-31989
6. Grossman, L. I., Watson, R., and Vinograd, J. (1973) Proceedings of the National Academy of Sciences of the United States of America 70, 3339-3343
7. Yang, M. Y., Bowmaker, M., Reyes, A., Vergani, L., Angeli, P., Gringeri, E., Jacobs, H. T., and Holt, I. J. (2002) Cell 111, 495-505
8. Johnson, A. A., and Johnson, K. A. (2001) Journal of Biological Chemistry 276, 38090-38096
9. Johnson, A. A., and Johnson, K. A. (2001) Journal of Biological Chemistry 276, 38097-38107
10. Longley, M. J., Prasad, R., Srivastava, D. K., Wilson, S. H., and Copeland, W. C. (1998) Proceedings of the National Academy of Sciences of the United States of America 95, 12244-12248
11. Clark, J. M. (1988) Nucleic Acids Research 16, 9677-9686
12. Auerbach, P. A., and Demple, B. (2010) Mutagenesis 25, 63-69
13. Ito, J., and Braithwaite, D. K. (1990) Nucleic Acids Research 18, 6716-6716
14. Kaguni, L. S. (2004) Annual Review of Biochemistry 73, 293-320
15. Luo, N. G., and Kaguni, L. S. (2005) Journal of Biological Chemistry 280, 2491-2497
16. Doublie, S., Tabor, S., Long, A. M., Richardson, C. C., and Ellenberger, T. (1998) Nature 391, 251-258
17. Hamdan, S. M., and Richardson, C. C. (2009) Annual Review of Biochemistry 78, 205-243
18. Graves, S. W., Johnson, A. A., and Johnson, K. A. (1998) Biochemistry 37, 6050-6058
19. Carrodeguas, J. A., Theis, K., Bogenhagen, D. F., and Kisker, C. (2001) Molecular Cell 7, 43-54
20. Fan, L., Kim, S., Farr, C. L., Schaefer, K. T., Randolph, K. M., Tainer, J. A., and Kaguni, L. S. (2006) Journal of Molecular Biology 358, 1229-1243
21. Kong, X. P., Onrust, R., O'Donnell, M., and Kuriyan, J. (1992) Cell 69, 425-437
22. Carrodeguas, J. A., Pinz, K. G., and Bogenhagen, D. F. (2002) Journal of Biological Chemistry 277, 50008-50014

23. Farge, G., Pham, X. H., Holmlund, T., Khorostov, I., and Falkenberg, M. (2007) Nucleic Acids Research 35, 902-911
24. Wernette, C. M., and Kaguni, L. S. (1986) Journal of Biological Chemistry 261, 4764-4770
25. Carrodeguas, J. A., Kobayashi, R., Lim, S. E., Copeland, W. C., and Bogenhagen, D. F. (1999) Molecular and Cellular Biology 19, 4039-4046
26. Yakubovskaya, E., Chen, Z. X., Carrodeguas, J. A., Kisker, C., and Bogenhagen, D. F. (2006) Journal of Biological Chemistry 281, 374-382
27. Longley, M. J., Graziewicz, M. A., Bienstock, R. J., and Copeland, W. C. (2005) Gene 354, 125-131
28. Trifunovic, A., Wredenberg, A., Falkenberg, M., Spelbrink, J. N., Rovio, A. T., Bruder, C. E., Bohlooly-Y, M., Gidlof, S., Oldfors, A., Wibom, R., Tornell, J., Jacobs, H. T., and Larsson, N. G. (2004) Nature 429, 417-423
29. Van Goethem, G., Dermaut, B., Lofgren, A., Martin, J. J., and Van Broeckhoven, C. (2001) Nature Genetics 28, 211-212
30. Graziewicz, M. A., Bienstock, R. J., and Copeland, W. C. (2007) Human Molecular Genetics 16, 2729-2739
31. Graziewicz, M. A., Longley, M. J., Bienstock, R. J., Zeviani, M., and Copeland, W. C. (2004) Nature Structural & Molecular Biology 11, 770-776
32. Van Goethem, G., Luoma, P., Rantamaki, M., Al Memar, A., Kaakkola, S., Hackman, P., Krahe, R., Lofgren, A., Martin, J. J., De Jonghe, P., Suomalainen, A., Udd, B., and Van Broeckhoven, C. (2004) Neurology 63, 1251-1257
33. Chan, S. S. L., Longley, M. J., and Copeland, W. C. (2005) Journal of Biological Chemistry 280, 31341-31346
34. Hakonen, A. H., Heiskanen, S., Juvonen, V., Lappalainen, I., Luoma, P. T., Rantamaki, M., Van Goethem, G., Lofgren, A., Hackman, P., Paetau, A., Kaakkola, S., Majamaa, K., Varilo, T., Udd, B., Kaaiainen, H., Bindoff, L. A., and Suomalainen, A. (2005) American Journal of Human Genetics 77, 430-441
35. Chan, S. S. L., Longley, M. J., and Copeland, W. C. (2006) Human Molecular Genetics 15, 3473-3483
36. Cihlar, T., and Ray, A. S. (2010) Antiviral Research 85, 39-58
37. Furman, P. A., Fyfe, J. A., Stclair, M. H., Weinhold, K., Rideout, J. L., Freeman, G. A., Lehrman, S. N., Bolognesi, D. P., Broder, S., Mitsuya, H., and Barry, D. W. (1986) Proceedings of the National Academy of Sciences of the United States of America 83, 8333-8337
38. Lewis, W., and Dalakas, M. C. (1995) Nature Medicine 1, 417-422
39. Lewis, W., Day, B. J., and Copeland, W. C. (2003) Nature Reviews Drug Discovery 2, 812-822
40. Kakuda, T. N. (2000) Clinical Therapeutics 22, 685-708
41. Lee, H., Hanes, J., and Johnson, K. A. (2003) Biochemistry 42, 14711-14719
42. Anderson, S., Bankier, A. T., Barrell, B. G., Debruijn, M. H. L., Coulson, A. R., Drouin, J., Eperon, I. C., Nierlich, D. P., Roe, B. A., Sanger, F., Schreier, P. H., Smith, A. J. H., Staden, R., and Young, I. G. (1981) Nature 290, 457-465
43. Wiesner, R. J., Ruegg, J. C., and Morano, I. (1992) Biochemical and Biophysical Research Communications 183, 553-559
44. Spelbrink, J. N. (2010) IUBMB Life 62, 19-32
45. Holt, I. J., Lorimer, H. E., and Jacobs, H. T. (2000) Cell 100, 515-524
46. Brown, T. A., Cecconi, C., Tkachuk, A. N., Bustamante, C., and Clayton, D. A. (2005) Genes & Development 19, 2466-2476
47. Yasukawa, T., Reyes, A., Cluett, T. J., Yang, M. Y., Bowmaker, M., Jacobs, H. T., and Holt, I. J. (2006) EMBO Journal 25, 5358-5371
48. Langston, L. D., and O'Donnell, M. (2006) Molecular Cell 23, 155-160
49. Korhonen, J. A., Pham, X. H., Pellegrini, M., and Falkenberg, M. (2004) EMBO Journal 23, 2423-2429
50. Korhonen, J. A., Gaspari, M., and Falkenberg, M. (2003) Journal of Biological Chemistry 278, 48627-48632
51. Spelbrink, J. N., Li, F. Y., Tiranti, V., Nikali, K., Yuan, Q. P., Tariq, M., Wanrooij, S., Garrido, N., Comi, G., Morandi, L., Santoro, L., Toscano, A., Fabrizi, G. M., Somer, H., Croxen, R., Beeson, D., Poulton, L., Suomalainen, A., Jacobs, H. T., Zeviani, M., and Larsson, C. (2001) Nature Genetics 28, 223-231
52. Kusakabe, T., and Richardson, C. C. (1996) Journal of Biological Chemistry 271, 19563-19570
53. Ziebarth, T. D., Gonzalez-Soltero, R., Makowska-Grzyska, M. M., Nunez-Ramirez, R., Carazo, J. M., and Kaguni, L. S. (2010) Journal of Biological Chemistry 285, 14639-14647
54. Tiranti, V., Rocchi, M., Didonato, S., and Zeviani, M. (1993) Gene 126, 219-225
55. Curth, U., Urbanke, C., Greipel, J., Gerberding, H., Tiranti, V., and Zeviani, M. (1994) European Journal of Biochemistry 221, 435-443
56. Vandyck, E., Foury, F., Stillman, B., and Brill, S. J. (1992) EMBO Journal 11, 3421-3430
57. Xu, B. J., and Clayton, D. A. (1996) EMBO Journal 15, 3135-3143
58. Wanrooij, S., Fuste, J. M., Farge, G., Shi, Y. H., Gustafsson, C. M., and Falkenberg, M. (2008) Proceedings of the National Academy of Sciences of the United States of America 105, 11122-11127
59. Yang, X. M., and Richardson, C. C. (1996) Journal of Biological Chemistry 271, 24207-24212
60. Gotoh, T., Miyazaki, Y., Sato, W., Kikuchi, K. I., and Bentley, W. E. (2001) Journal of Bioscience and Bioengineering 92, 248-255
61. Monsma, S. A., and Scott, M. (1997) FASEB Journal 11, 3173
62. O'Reilly, D. R., Miller, L. K., and Luckow, V. A. (1994) Baculovirus expression vectors: a laboratory manual, Oxford University Press, New York
63. Murhammer, D. W. (2007) Baculovirus and insect cell expression protocols, 2nd Ed., Methods in Molecular Biology Vol. 388, Humana Press, Totowa, N.J.
64. Hu, Y. C., and Bentley, W. E. (2000) Chemical Engineering Science 55, 3991-4008
65. Rovio, A. T., Marchington, D. R., Donat, S., Schuppe, H. C., Abe, J., Fritsche, E., Elliott, D. J., Laippala, P., Ahola, A. L., McNay, D., Harrison, R. F., Hughes, B., Barrett, T., Bailey, D. M. D., Mehmet, D., Jequier, A. M., Hargreave, T. B., Kao, S. H., Cummins, J. M., Barton, D. E., Cooke, H. J., Wei, Y. H., Wichmann, L., Poulton, J., and Jacobs, H. T. (2001) Nature Genetics 29, 261-262
66. Jensen, M., Leffers, H., Petersen, J. H., Andersen, A. N., Jorgensen, N., Carlsen, E., Jensen, T. K., Skakkebaek, N. E., and Rajpert-De Meyts, E. (2004) Human Reproduction 19, 65-70
67. Hudson, G., and Chinnery, P. F. (2006) Human Molecular Genetics 15, R244-R252
68. Spelbrink, J. N., Toivonen, J. M., Hakkaart, G. A. J., Kurkela, J. M., Cooper, H. M., Lehtinen, S. K., Lecrenier, N., Back, J. W., Speijer, D., Foury, F., and Jacobs, H. T. (2000) Journal of Biological Chemistry 275, 24818-24828
69. Williams, A. J., and Paulson, H. L. (2008) Trends in Neurosciences 31, 521-528
70. Bellizzi, J. J., Widom, J., Kemp, C. W., and Clardy, J. (1999) Structure with Folding & Design 7, R263-R267
71. Otwinowski, Z., and Minor, W. (1997) in Macromolecular Crystallography, Pt A Vol. 276, pp. 307-326
72. Brunger, A. T., Adams, P. D., Clore, G. M., DeLano, W. L., Gros, P., Grosse-Kunstleve, R. W., Jiang, J. S., Kuszewski, J., Nilges, M., Pannu, N. S., Read, R. J., Rice, L. M., Simonson, T., and Warren, G. L. (1998) Acta Crystallographica Section D-Biological Crystallography 54, 905-921
73. Collaborative Computational Project, Number 4 (1994) Acta Crystallographica Section D-Biological Crystallography 50, 760-763
74. Navaza, J. (2001) Acta Crystallographica Section D-Biological Crystallography 57, 1367-1372
75. Winn, M. D., Murshudov, G. N., and Papiz, M. Z. (2003) in Macromolecular Crystallography, Pt D Vol. 374, pp. 300-321
76. Holm, L., and Sander, C. (1996) Science 273, 595-602
77. Yakubovskaya, E., Lukin, M., Chen, Z. X., Berriman, J., Wall, J. S., Kobayashi, R., Kisker, C., and Bogenhagen, D. F. (2007) EMBO Journal 26, 4283-4291
78. Clayton, D. A. (1982) Cell 28, 693-705
79. Ferrari, G., Lamantea, E., Donati, A., Filosto, M., Briem, E., Carrara, F., Parini, R., Simonati, A., Santer, R., and Zeviani, M. (2005) Brain 128, 723-731
80. McHenry, C., and Kornberg, A. (1977) Journal of Biological Chemistry 252, 6478-6484
81. Hori, K., Mark, D. F., and Richardson, C. C. (1979) Journal of Biological Chemistry 254, 1591-1597
82. Brieba, L. G., Eichman, B. F., Kokoska, R. J., Doublie, S., Kunkel, T. A., and Ellenberger, T. (2004) EMBO Journal 23, 3452-3461
83. Tabor, S., Huber, H. E., and Richardson, C. C. (1987) Journal of Biological Chemistry 262, 16212-16223

84. Huber, H. E., Tabor, S., and Richardson, C. C. (1987) Journal of Biological Chemistry 262, 16224-16232
85. Fersht, A. (1985) Enzyme structure and mechanism, 2nd Ed., W.H. Freeman, New York
86. Petruska, J., Sowers, L. C., and Goodman, M. F. (1986) Proceedings of the National Academy of Sciences of the United States of America 83, 1559-1562
87. Gleghorn, M. L., Davydova, E. K., Rothman-Denes, L. B., and Murakami, K. S. (2008) Molecular Cell 32, 707-717
88. Lamantea, E., Tiranti, V., Bordoni, A., Toscano, A., Bono, F., Servidei, S., Papadimitriou, A., Spelbrink, H., Silvestri, L., Casari, G., Comi, G. P., and Zeviani, M. (2002) Annals of Neurology 52, 211-219
89. Joyce, C. M., and Benkovic, S. J. (2004) Biochemistry 43, 14317-14324
90. Yin, Y. W., and Steitz, T. A. (2004) Cell 116, 393-404
91. Nguyen, K. V., Ostergaard, E., Ravn, S. H., Balslev, T., Danielsen, E. R., Vardag, A., McKiernan, P. J., Gray, G., and Naviaux, R. K. (2005) Neurology 65, 1493-1495
92. Luoma, P. T., Luo, N. G., Loscher, W. N., Farr, C. L., Horvath, R., Wanschitz, J., Kiechl, S., Kaguni, L. S., and Suomalainen, A. (2005) Human Molecular Genetics 14, 1907-1920
93. Longley, M. J., Clark, S., Man, C. Y. W., Hudson, G., Durham, S. E., Taylor, R. W., Nightingale, S., Turnbull, D. M., Copeland, W. C., and Chinnery, P. F. (2006) American Journal of Human Genetics 78, 1026-1034
94. Huang, H. F., Chopra, R., Verdine, G. L., and Harrison, S. C. (1998) Science 282, 1669-1675
95. Klarmann, G. J., Eisenhauer, B. M., Zhang, Y., Gotte, M., Pata, J. D., Chatterjee, D. K., Hecht, S. M., and Le Grice, S. F. J. (2007) Biochemistry 46, 2118-2126
96. Sarafianos, S. G., Pandey, V. N., Kaushik, N., and Modak, M. J. (1995) Journal of Biological Chemistry 270, 19729-19735
97. Ding, J. P., Das, K., Hsiou, Y., Sarafianos, S. G., Clark, A. D., Jacobo-Molina, A., Tantillo, C., Hughes, S. H., and Arnold, E. (1998) Journal of Molecular Biology 284, 1095-1111
98. Demeler, B. (2009) An integrated data analysis software package for sedimentation experiments.
99. Laue, T. M., Shah, B. D., Ridgeway, T. M., and Pelletier, S. L. (1992) Computer-aided Interpretation of Analytical Sedimentation Data for Proteins
100. Durchschlag, H. (1986) Specific Volumes of Biological Macromolecules and Some Other Molecules of Biological Interest (Hinz, H.-J., Ed.), Springer-Verlag, New York
101. Brookes, E., Cao, W., and Demeler, B. (2010) Eur Biophys J 39, 405-414
102. Brookes, E., Cao, W., and Demeler, B. (2007) in GECCO Proceedings Vol. 978-1-59593-697-4/07/0007, ACM, New York
103. Demeler, B., and Brookes, E. (2008) Colloid and Polymer Science 286, 129-137
104. Demeler, B., and van Holde, K. E. (2004) Analytical Biochemistry 335, 279-288
105. Johnson, M. L., Correia, J. J., Yphantis, D. A., and Halvorson, H. R. (1981) Biophysical Journal 36, 575-588
106. Gill, S. C., and Vonhippel, P. H. (1989) Analytical Biochemistry 182, 319-326
107. Yan, L., Ge, H., Li, H., Lieber, S. C., Natividad, F., Resuello, R. R. G., Kim, S. J., Akeju, S., Sun, A., Loo, K., Peppas, A. P., Rossi, F., Lewandowski, E. D., Thomas, A. P., Vatner, S. F., and Vatner, D. E. (2004) Journal of Molecular and Cellular Cardiology 37, 921-929
108. Lee, Y. S., Kennedy, W. D., and Yin, Y. W. (2009) Cell 139, 312-324
109. Lee, J. B., Hite, R. K., Hamdan, S. M., Xie, X. S., Richardson, C. C., and van Oijen, A. M. (2006) Nature 439, 621-624
110. Lo Conte, L., Chothia, C., and Janin, J. (1999) Journal of Molecular Biology 285, 2177-2198
111. Kollberg, G., Moslemi, A. R., Darin, N., Nennesmo, I., Bjamadottir, I., Uvebrant, P., Holme, E., Melberg, A., Tulinius, M., and Oldfors, A. (2006) Journal of Neuropathology and Experimental Neurology 65, 758-768
112. Taanman, J. W., Daras, M., Albrecht, J., Davie, C. A., Mallam, E. A., Muddle, J. R., Weatherall, M., Warner, T. T., Schapira, A. H. V., and Ginsberg, L. (2009) Neuromuscular Disorders 19, 151-154
113. Harrower, T., Stewart, J. D., Hudson, G., Houlden, H., Warner, G., O'Donovan, D. G., Findlay, L. J., Taylor, R. W., De Silva, R., and Chinnery, P. F. (2008) Archives of Neurology 65, 133-136
114. Lee, Y. S., Lee, S., Demeler, B., Molineux, I. J., Johnson, K. A., and Yin, Y. W. (2010) Journal of Biological Chemistry 285, 1490-1499
115. Bailey, M. F., van der Schans, E. J. C., and Millar, D. P. (2004) Journal of Molecular Biology 336, 673-693
116. Beese, L. S., and Steitz, T. A. (1991) EMBO Journal 10, 25-33
117. Beese, L. S., Derbyshire, V., and Steitz, T. A. (1993) Science 260, 352-355
118. Shamoo, Y., and Steitz, T. A. (1999) Cell 99, 155-166
119. Gupta, A. P., Benkovic, P. A., and Benkovic, S. J. (1984) Nucleic Acids Research 12, 5897-5911
120. Freemont, P. S., Friedman, J. M., Beese, L. S., Sanderson, M. R., and Steitz, T. A. (1988) Proceedings of the National Academy of Sciences of the United States of America 85, 8924-8928
121. Jazayeri, M., Andreyev, A., Will, Y., Ward, M., Anderson, C. M., and Clevenger, W. (2003) Journal of Biological Chemistry 278, 9823-9830

Vita

Young-Sam Lee was born in Seoul, Korea. He entered Seoul National University, Seoul, Korea, in 1993 and was awarded a Bachelor of Science degree in Chemistry in Feb. 1997, followed by a Master of Science degree in Chemistry (Biochemistry subdivision) in Feb. 1999. After 26 months of military duty, he resumed his research career at the Samsung Biomedical Research Institute in Oct. 2001. He entered the University of Texas at Austin as a graduate student in Aug. 2004 and joined Dr. Whitney Yin's lab.

This dissertation was typed by Young-Sam Lee.
