BIOCHEMICAL STUDIES OF DNA POLYMERASE THETA

A Dissertation Submitted to the Temple University Graduate Board

In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY

by Ahmet Y Ozdemir May 2019

Examining Committee Members:

Richard T Pomerantz, PhD, Advisory Chair, Fels Institute for Cancer Research and Molecular Biology & Medical Genetics and Molecular Biochemistry

Xavier Graña-Amat, PhD, Fels Institute for Cancer Research and Molecular Biology & Medical Genetics and Molecular Biochemistry

Tomasz Skorski, MD, PhD, DSc, Fels Institute for Cancer Research and Molecular Biology & Microbiology and Immunology

Italo Tempera, PhD, Fels Institute for Cancer Research and Molecular Biology & Microbiology and Immunology

Alexander Mazin, PhD, External Member, Biochemistry & Molecular Biology, Drexel University

© Copyright 2019

by

Ahmet Y Ozdemir

All Rights Reserved


ABSTRACT

POLQ is a unique multifunctional replication and repair gene that encodes a multidomain protein with an N-terminal superfamily 2 helicase and a C-terminal A-family polymerase. Although the function of the polymerase domain has been investigated, little is understood regarding the helicase domain. Multiple studies have reported that polymerase θ-helicase (Polθ-helicase) is unable to unwind DNA. However, it exhibits ATPase activity that is stimulated by single-stranded DNA, which presents a biochemical conundrum. In contrast to previous reports, we demonstrate that Polθ-helicase (residues 1–894) efficiently unwinds DNA with 3'–5' polarity, including DNA with 3' or 5' overhangs, blunt-ended DNA, and replication forks. Polθ-helicase also efficiently unwinds RNA-DNA hybrids and exhibits a preference for unwinding the lagging strand at replication forks, similar to the related HELQ helicase. Finally, we find that Polθ-helicase can facilitate strand displacement synthesis by Polθ-polymerase, suggesting a plausible function for the helicase domain. Taken together, these findings indicate nucleic acid unwinding as a relevant activity for Pol theta in replication repair.

DNA polymerase theta is a unique polymerase-helicase fusion protein that promotes microhomology-mediated end-joining of DNA double-strand breaks. How full-length human DNA polymerase theta performs microhomology-mediated end-joining and is regulated by the helicase and disordered central domain remains unknown. We find that the helicase upregulates DNA polymerase theta microhomology-mediated end-joining activity in an ATPase-independent manner. Using single-particle microscopy, we find that DNA polymerase theta forms large multimeric complexes that promote DNA accumulation and end-joining. We further find that the disordered central domain regulates DNA polymerase theta multimerization and governs its DNA substrate requirements for end-joining. In summary, these studies identify major regulatory functions for the helicase and central domains in DNA end-joining and the structural organization of DNA polymerase theta.


I dedicate my dissertation work

to my wife Çiğdem Satgun,

to my son Erva Muhammed,

to my daughters Liya Meryem,

and Mina Nehir.


ACKNOWLEDGMENTS

First of all, I would like to acknowledge my advisor Richard Pomerantz. He was always supportive. He gave me the opportunity to work independently and helped me to develop as a scientist.

Second, I would like to thank past and present members of the Pomerantz Lab. We always share our expertise with each other. My special thanks go to Samuel Black, who taught me protein purification from yeast.

Third, I would like to acknowledge my committee members. They helped me see the full picture of my research.

Moreover, I would like to thank past and present members of the Fels Institute at all levels. Fels is a nice, friendly environment that I will never forget.

My special acknowledgement goes to my previous supervisor, Dianne Langford. She supported my decision to pursue a PhD, and she was always there whenever I wanted to talk to her.

Finally, I would like to acknowledge all the support from my family, friends and relatives. My wife and kids waited for my graduation with patience. They helped me to relax during stressful times. My parents always supported my decision to pursue a PhD. I had many friends who always kept me on target for graduation.


TABLE OF CONTENTS

Page

ABSTRACT ...... iii

DEDICATION ...... v

ACKNOWLEDGMENTS ...... vi

LIST OF FIGURES ...... xi

LIST OF ABBREVIATIONS ...... xiii

CHAPTER

1. INTRODUCTION ...... 1

Double-Strand Break Repair Pathways ...... 1

Homologous Recombination (HR) ...... 3

Classical Non-Homologous End-Joining (C-NHEJ) ...... 6

Single-Strand Annealing (SSA) ...... 8

Microhomology-Mediated End-Joining (MMEJ) ...... 10

MMEJ/Alt-EJ in Genome Instability and Cancer ...... 12

Genome Instability ...... 12

Pol Theta and Cancer ...... 13

Pol Theta as an Anti-Cancer Drug Target ...... 14

Structure and Function of Human Pol Theta ...... 15

Background of DNA Polymerases and Helicases ...... 15

History of Pol Theta Studies ...... 17


Structure of Human Pol Theta ...... 18

Pol Theta Polymerase Domain ...... 19

Pol Theta Helicase Domain ...... 19

Pol Theta Central Domain ...... 20

Function of Human Pol Theta ...... 22

MMEJ/Alt-EJ ...... 22

Other Functions of Pol Theta ...... 23

2. MATERIALS AND METHODS ...... 24

Proteins ...... 24

Pol Theta Helicase and Pol Theta Helicase K121M Purification ..24

RPA Purification ...... 25

Nucleic Acid Unwinding ...... 26

Sequence Alignment ...... 27

Superposition of Pol Theta Helicase and Hel308 Structures ...... 27

Pol Theta Strand Displacement Synthesis ...... 28

Strand Exchange ...... 28

MMEJ ...... 29

Primer Extension ...... 29

ATPase Assay ...... 30

EMSA ...... 30

Fluorescence Anisotropy ...... 31

Scanning Force Microscopy ...... 31


DNA and RNA ...... 32

Materials for Full Length Pol Theta Purification from Yeast ...... 34

3. POL THETA HELICASE UNWINDS DNA ...... 38

Introduction ...... 38

Results ...... 41

Pol Theta Helicase Unwinds DNA in an ATP and dATP Dependent Manner ...... 41

Pol Theta Helicase Preferentially Unwinds DNA with 3’ Overhangs ...... 43

Pol Theta Helicase Efficiently Unwinds Substrates Modeled After Stalled Replication Forks ...... 48

Pol Theta Helicase Promotes Strand Displacement Synthesis by Pol Theta Polymerase ...... 50

Discussion ...... 51

4. PURIFICATION OF FULL LENGTH HUMAN DNA POLYMERASE THETA AND VARIANTS ...... 59

Introduction ...... 59

Expression of Full Length Pol Theta in Yeast ...... 60

Expression Vector ...... 60

Cloning ...... 62

Growth of Yeast Cells and Induction of Protein Expression ...... 66

Protein Purification ...... 68

Preparation of Yeast Cell Extract ...... 68

Purification of Recombinant 3xFlag-Polθ and Variants by Binding to Anti-Flag Resin ...... 69

Results and Conclusions ...... 72

5. STRUCTURE FUNCTION STUDIES OF FULL-LENGTH POL THETA ..75

Introduction ...... 75

Results ...... 77

Polymerase-Helicase Tethering is Essential for MMEJ ...... 77

Pol Theta Exclusively Performs MMEJ of Long ssDNA ...... 79

Pol Theta Multimers Promote DNA Accumulation and End-Joining ...... 84

Discussion ...... 86

BIBLIOGRAPHY ...... 91

APPENDICES

A. SUPPLEMENTARY FIGURES ...... 118

B. YEAST CODON OPTIMIZED SEQUENCES ...... 119

C. POL THETA ALIGNMENTS ...... 128

D. EXTENDED DATA FIGURES ...... 137


LIST OF FIGURES

Figure Page

1-1. Overview of Double-Strand Break Repair Pathways………..……..………….2

1-2. Major Double-Strand Break Repair Pathways……………………..………….4

1-3. Minor Double-Strand Break Repair Pathways………………….……..……....9

1-4. Human DNA Polymerase Theta………………………………………....……18

1-5. Highly Disordered Central Domain of Human DNA Polymerase Theta….….21

3-1. Polθ-Helicase Unwinds DNA in an ATP- and dATP-Dependent Manner………………..…………………………………….…………..…….39

3-2. Polθ-Helicase Preferentially Unwinds DNA with 3’ Overhangs..……...…….44

3-3. Polθ-Helicase Unwinds RNA-DNA Hybrids……………………..…..……....47

3-4. Polθ-Helicase Unwinding Activity at Replication Forks………………..……49

3-5. Sequence and Structural Comparison of Polθ-Helicase and HELQ/Hel308 Enzymes……………………………...……………………….……………….54

3-6. Models of Polθ-Helicase Activity During Replication and Repair………...... 57

4-1. Maps of Yeast Expression Plasmids…………………………………..……....61

4-2. Yeast Codon Optimized Sequence of 3xFlag-Polθ……………………………63

4-3. Overview of Full-Length Pol Theta Purification Method….………………….67

4-4. Activity of Full-Length Pol Theta Purification…..………………..…….…….73

5-1. Polymerase-Helicase Tethering is Essential for MMEJ……………..….…...... 78

5-2. Pol Theta Does Not Promote MMEJ of Short ssDNA…………..……….……80

5-3. Pol Theta Exclusively Performs MMEJ of Long ssDNA…….……..………...82

5-4. Pol Theta Forms Large DNA-Dependent Multimeric Complexes That Promote DNA Accumulation and Induce Liquid Demixing of DNA………………….85

5-5. Functional Models of Wild-Type Pol Theta and Pol Theta Variants…….……87

LIST OF ABBREVIATIONS

53BP1 Tumor suppressor p53-binding protein 1

Alt-EJ Alternative end-joining

AmpR Ampicillin resistance gene

ATP Adenosine triphosphate

ATM Ataxia telangiectasia mutated

ATCC American type culture collection

BER Base excision repair

BLM Bloom syndrome RecQ like helicase

bp Base pair

BRCA Breast cancer early onset gene

BRCA1 Breast cancer early onset gene 1

BRCA2 Breast cancer early onset gene 2

Ca Carbon alpha

cDNA Complementary DNA

chaos1 Chromosome aberration occurring spontaneously 1

C-NHEJ Classical non-homologous end-joining

CPS Cycles per second

CtIP CtBP interacting protein

dATP Deoxyadenosine triphosphate

D-loop Displacement loop

dNTP Deoxyribonucleotide triphosphate

DNA Deoxyribonucleic acid

Dna2 DNA replication helicase/nuclease 2

DNA-PKcs DNA-dependent protein kinase catalytic subunit

DSB Double-strand break

eGFP Enhanced green fluorescent protein

ERCC1 Excision repair cross-complementation group 1

ExoI Exonuclease I

FDA Food and Drug Administration

G4 G-quadruplex

HelQ Hel308: POLQ-like helicase

Hel308 HelQ: POLQ-like helicase

HR Homologous recombination

HU Hydroxyurea

Indels Insertions and deletions

IR Ionizing radiation

Ku Ku70/Ku80 heterodimer

Ku70 XRCC6: X-ray repair cross complementing protein 6

Ku70/Ku80 Ku70/Ku80 heterodimer

Ku80 XRCC5: X-ray repair cross complementing protein 5

Lig4 DNA Ligase IV

MCM2-7 Minichromosome maintenance complex component 2-7

MEFs Mouse embryonic fibroblasts


MMEJ Microhomology-mediated end-joining

Mre11 Meiotic recombination 11 homolog 1

MRN MRE11, RAD50, NBS1 complex

mus308 Mutagen-sensitive 308

Nbs1 Nijmegen breakage syndrome 1

PALB2 Partner and localizer of BRCA2

PARP Poly(ADP-ribose) polymerase

PARP1 Poly(ADP-ribose) polymerase 1

PCNA Proliferating cell nuclear antigen

PDB Protein data bank

PIF1 Petite integration frequency 1 5’-to-3’ DNA helicase

Pol alpha DNA polymerase alpha: Polα

Pol delta DNA polymerase delta: Polδ

Pol epsilon DNA polymerase epsilon: Polε

Pol gamma DNA polymerase gamma: Polγ

Pol lambda DNA polymerase lambda: Polλ

Pol mu DNA polymerase mu: Polµ

Pol theta DNA polymerase theta: Polθ

Pol zeta DNA polymerase zeta: Polζ

PolQ Human DNA polymerase theta gene

Polq Mouse DNA polymerase theta gene

pssDNA Partially single-stranded DNA

Rad50 RAD50 double strand break repair protein

Rad51 RAD51 recombinase

Rad52 Recombination protein RAD52

Rag Recombination activating

RECQ SF2 helicase family

RECQ4 RECQ like helicase 4

RECQL5 RECQ like helicase 5

RMSD Root-mean-square deviation

RNA Ribonucleic acid

RPA Replication protein A

RPM Revolutions per minute

SC-TRP Synthetic complete yeast media lacking tryptophan

SF1 Super family 1

SF2 Super family 2

Ski2 Ski2 like RNA helicase

Spo11 Spo11, initiator of meiotic double strand breaks

SSA Single-strand annealing

ssDNA Single-stranded DNA

T-DNA Transfer DNA

TLS Translesion synthesis

TRP Tryptophan

TRP1 Tryptophan gene


UV Ultraviolet

UvrD UvrD, SF1 DNA helicase

VDJ Variable (V), diversity (D), and joining (J) gene segment recombination

WRN Werner syndrome RECQ like helicase

XLF XRCC4-like factor

XPF Xeroderma pigmentosum, complementation group F

XRCC4 X-ray repair cross complementing 4


CHAPTER 1

INTRODUCTION

Double-Strand Break Repair Pathways

DNA encodes the genetic information in all organisms and is located in the nucleus and mitochondria of eukaryotic cells. Because DNA harbors the genetic information, its integrity and protection are essential for cell division and survival. DNA is a double-stranded helical molecule composed of two anti-parallel strands(1). Both single-strand breaks/gaps and double-strand breaks (DSBs) can occur in the genome(2)(3)(4)(5). Single-strand breaks/gaps are not lethal since they are relatively easy to repair by using the complementary strand as a template for DNA repair synthesis(3)(6). In contrast, DSBs are potentially lethal since the physical continuity of the DNA molecule is lost and deletion or addition of sequence information, or even DNA translocations, can occur(4)(8)(9)(10)(11)(12)(13). Thus, DSBs are detrimental to cells if not accurately or efficiently repaired(12)(14).

An overview of DSBs and their sources is illustrated in Fig. 1-1. DSBs can be caused by endogenous sources, for example reactive oxygen species, endogenous nucleases such as topoisomerase II, and programmed genome breaks(14)(13)(15)(16)(17). The types of programmed DSBs include those induced by nucleases during meiosis (i.e. Spo11) and VDJ recombination (Rag endonuclease) in B and T cells(18)(19). Exogenous sources can also induce DSB formation(20)(14)(15). For example, exogenous sources include mustard gas, gamma radiation, and genotoxic chemotherapy agents(15). Genotoxic agents such as platinum-based drugs (e.g., cisplatin, carboplatin) and ionizing radiation (IR) are widely used to treat cancer due to their ability to induce DSBs that are lethal to highly proliferative cells such as cancer cells(21)(22)(23)(6)(5). However, these therapies are also toxic to other replicating cells in patients and therefore cause major toxic side effects(24)(25).

[Fig. 1-1 schematic: endogenous sources of DSBs (reactive oxygen species; nucleases such as topoisomerase II; programmed genome breaks) and exogenous sources (ionizing radiation, chemotherapy, CRISPR). DSBs are repaired without resection by C-NHEJ, or following resection by HR, Alt-EJ, or SSA.]

Figure 1-1. Overview of Double-Strand Break Repair Pathways. Endogenous and exogenous sources of double-strand breaks and their potential repair pathways are described.

Multiple DSB repair pathways exist in mammalian cells and these can be generalized as error-free or error-prone DSB repair processes(26)(12). Below, the currently known mammalian DSB repair pathways are described.

Homologous Recombination

Homologous Recombination (HR) is considered the only highly accurate DSB repair pathway (Fig. 1-2) and is important during DNA replication(27)(10)(28)(16)(29). HR is accurate because it uses a homologous DNA template (i.e. sister chromatid) to repair DSBs, which allows lost sequence information to be restored during DNA repair synthesis by a high-fidelity DNA polymerase, such as DNA polymerase delta (Pol delta) (Fig. 1-2)(27)(30)(31). Thus, HR is specifically active in S and G2 cell cycle phases when the sister chromatid is available(31)(27). HR is also induced during meiosis where it promotes genetic diversity in gametes by copying sequence information between homologous chromosomes(32)(33). HR is an essential pathway(34)(33). Thus, major defects in this pathway result in embryonic lethality(35). Due to its high fidelity, HR is also important for maintaining genome integrity by preventing mutations and indels during the repair of DSBs in S/G2 phases(4)(31)(10)(28)(6).

[Fig. 1-2 schematic: repair of DSBs by HR during S/G2 phase (MRN, CtIP, RPA, Rad51, BRCA2, D-loop formation, Pol δ, sister chromatid template) and by C-NHEJ (Ku70/80, DNA-PKcs, 53BP1, X-family polymerases, XRCC4-Ligase IV-XLF).]

Figure 1-2. Major Double-Strand Break Repair Pathways. Homologous recombination (HR) and classical non-homologous end-joining (C-NHEJ) are described.

Breast cancer early onset 1 and 2 (BRCA1 and BRCA2), which greatly increase breast cancer risk for women when mutated, were discovered around the same time that HR was found to be an important DSB repair pathway in mammalian cells(16)(27)(36)(29). Subsequently, BRCA1 and BRCA2 were found to be integral for HR (Fig. 1-2)(16). Harmful heritable mutations in BRCA1 and BRCA2 are now known to strongly predispose women to both breast and ovarian carcinomas, with the latter originating in the fallopian tube(37)(38)(16)(29).

The initial phase of HR involves 5’-3’ exonuclease resection of the DSB end, which specifically occurs during S and G2 cell cycle phases(26)(39)(4)(40)(41). The initiating events of DSB repair are critical for determining cellular DSB repair pathway choice(12)(8)(14)(26). For example, Ku proteins involved in non-homologous end-joining (NHEJ) have a high affinity for DSBs but the MRN complex (Mre11-Rad50-Nbs1), in particular Mre11, competes with Ku proteins (Fig. 1-2)(42)(43)(44)(45)(46). MRN is a major DNA damage response factor that also binds to DSB ends and is involved in both HR and NHEJ, the two major DSB repair pathways(47)(48)(49)(50)(51). 53BP1 (p53 Binding Protein 1) is one of the first proteins that binds to the ends of DSBs(52). Although 53BP1 suppresses DNA resection, it is itself suppressed during S and G2 phases by BRCA1, which spatially excludes 53BP1 from the proximity of DSBs(8)(16)(29). BRCA1 colocalizes with the MRN complex and CtIP (CtBP Interacting Protein) during resection(16)(29)(53)(54). MRN in conjunction with CtIP initiates DNA resection of DSBs by promoting nicks near DSB ends(31)(44)(41)(53). This allows MRN-CtIP to excise one of the strands in a 3’-5’ manner towards the break(55). Other more processive nucleases such as Dna2 and Exonuclease I are then recruited along with DNA helicases (e.g., Bloom’s helicase) to perform an extended 5’-3’ resection step, which results in a long 3’ ssDNA overhang(55)(44)(56)(57). BRCA1 promotes the resection process by directly interacting with the resection factor CtIP and countering 53BP1 activity(16)(29)(54). Thus, BRCA1 is an essential HR factor(58).

The first protein thought to bind to the single-stranded DNA (ssDNA) is replication protein A (RPA) (Fig. 1-2)(31)(59). BRCA1 is thought to facilitate recruitment of BRCA2 via formation of a complex with PALB2 (i.e. BRCA1-PALB2-BRCA2), which along with other recombination mediator proteins enables loading of Rad51 recombinase onto ssDNA coated by RPA (Fig. 1-2)(4)(16)(29)(60)(61). Rad51 is a highly conserved Walker A/B containing recombinase that promotes homology search and strand invasion into the sister chromatid. The sister chromatid serves as a template for DNA repair synthesis, primarily by Pol delta in conjunction with PCNA (Proliferating Cell Nuclear Antigen), which confers processivity onto Pol delta and other Pols (Fig. 1-2)(15)(62)(63)(64)(65)(66). Rad51 is an essential protein(65)(67); accordingly, Rad51 knockout mice are embryonic lethal(68)(69). This demonstrates that HR is an essential pathway for mammals.

In contrast to HR, which is highly accurate, there are multiple pathways of DSB repair that are highly error-prone. These mutagenic DSB repair pathways are described below.

Classical Non-Homologous End-Joining

Classical Non-Homologous End-Joining (C-NHEJ) is a major DSB repair pathway in eukaryotic cells, especially during G0/G1 phases of the cell cycle (Fig. 1-2)(70). Although C-NHEJ is functional throughout the cell cycle, this pathway is suppressed during S and G2 cell cycle phases by BRCA1 (Fig. 1-2)(16). C-NHEJ can result in small insertions and/or deletions (indels) as well as chromosomal translocations and is therefore considered error-prone(6)(71)(72)(73). Although C-NHEJ can cause mutations, overall it is thought to be important for preventing genome instability(6)(73)(74).

The Ku70/Ku80 heterodimer initiates C-NHEJ(42). Specifically, Ku70/Ku80 binds the ends of DSBs with high affinity and protects them from resection, which would otherwise lead to HR(42)(43). Translocation of Ku70/80 along DNA enables their accumulation at the ends of DSBs(42). Ku70/80 DSB binding is the commitment step for NHEJ DSB repair and is a distinct cellular pathway choice since other DSB repair processes require DNA resection(42)(4)(73)(75). Accumulation of Ku70/80 enables recruitment of all downstream NHEJ factors such as DNA-PKcs (DNA-dependent protein kinase catalytic subunit), the XRCC4 (X-ray Repair Cross-Complementing protein 4)-DNA Ligase IV-XLF (XRCC4-Like Factor) complex, and X-family DNA polymerases (Pols) lambda and mu (Fig. 1-2)(42)(76)(77)(78)(79)(80). Specifically, Ku70/80 activates DNA-PKcs, which autophosphorylates and phosphorylates other proteins (Fig. 1-2)(42)(81). Pols mu and lambda are recruited during C-NHEJ and enable DNA extension and gap filling at C-NHEJ mediated DNA synapses (Fig. 1-2)(17)(62)(70)(79). XRCC4 is an important auxiliary factor for DNA Ligase IV and enhances its activity (Fig. 1-2)(72)(78). XRCC4 knockout mice are embryonic lethal, demonstrating the essential function of C-NHEJ(82)(83)(84). For example, C-NHEJ is essential for lymphocyte development(85)(86)(87). C-NHEJ deficient humans have developmental delay and immunodeficiency(88)(86). DNA Ligase IV ligates the ends of DSBs and therefore acts at later stages of C-NHEJ (Fig. 1-2)(70). XLF also stimulates the ligation step(78). Although C-NHEJ repair can be accurate in many cases, indels or translocations can also occur due to the limited processing of DNA ends(72)(89).

Single-Strand Annealing

Single-Strand Annealing (SSA) utilizes DNA homology regions approximately 8-20 base pairs (bp) in length to join ends of extensively resected DNA during S and G2 cell cycle phases (Fig. 1-3)(9)(80)(4). Hence, SSA causes large deletions and DNA rearrangements which may lead to tumorigenesis(4)(9)(80)(20). 5’-3’ DNA resection is an initiating step for SSA (Fig. 1-3)(4)(8)(9)(80). The Mre11-Rad50-Nbs1 (MRN) complex and CtIP (CtBP Interacting Protein) initiate resection (Fig. 1-3)(4)(28)(90). Exonuclease I, BLM helicase, and Dna2 helicase/nuclease promote further extension of the resection process along with RPA(9)(28)(57)(90). RPA (Replication protein A) is a major DNA repair and replication factor that binds to single-stranded DNA (ssDNA), prevents secondary ssDNA structures, and interacts with many replication and repair factors(4)(90)(59). Rad52 is essential for SSA and mediates homology search and annealing of ssDNA overhangs(4)(9)(62)(91). Following annealing, the ERCC1 (Excision Repair Cross Complementation Group 1)-XPF (Xeroderma Pigmentosum, Complementation Group F) complex removes flanking unannealed non-complementary ssDNA(9)(20). DNA Ligase I ligates the paired DNA ends(4)(9). The DNA polymerase(s) involved in SSA are poorly understood; however, recent studies in yeast suggest a role for Pol delta(9)(92). Overall, SSA is considered to be highly error-prone since it requires pairing of DNA ends through relatively long homology regions(6)(62)(73). This pathway therefore typically occurs between repetitive DNA sequences and as a result induces large deletions(4)(9)(20).

[Fig. 1-3 schematic: during S/G2 phase, DSBs are resected by MRN and CtIP. SSA proceeds via Rad52, pairing at large homology, the ERCC1-XPF complex, and DNA Ligase I. Alt-EJ/MMEJ proceeds via PARP1 and Pol θ, pairing at microhomology, and DNA Ligase I or III.]

Figure 1-3. Minor Double-Strand Break Repair Pathways. Single-strand annealing (SSA) and alternative end-joining (Alt-EJ) or microhomology-mediated end-joining (MMEJ) are described.

Microhomology-Mediated End-Joining

Microhomology-Mediated End-Joining (MMEJ), also referred to as Alternative End-Joining (Alt-EJ), is another highly error-prone DSB repair pathway which frequently uses microhomology to pair 3’ ssDNA overhangs (Fig. 1-3)(6)(9)(62)(80). Given that this pathway has not been fully elucidated, it remains unclear whether there are multiple subsets of this process, or whether it is performed by a specific set of protein factors and mechanisms(73)(62). MMEJ events are rare; therefore, this pathway has been more challenging to study(89)(62)(28). More than two decades ago, the first evidence of MMEJ/alt-EJ was observed in yeast and mammalian cells deficient in C-NHEJ(93)(94)(77).

As mentioned above, MMEJ/alt-EJ uses microhomology regions (i.e. ~2-8 bp) to pair opposing ends of resected DNA during S and G2 phases(73). MMEJ/alt-EJ causes larger indels than C-NHEJ and also promotes chromosomal translocations(73)(95)(96)(97)(59). Similar to SSA, 5’-3’ DNA resection is required for MMEJ/alt-EJ (Fig. 1-3)(6)(26)(62)(9)(98). The MRN complex and CtIP initiate the resection step (Fig. 1-3)(73)(51). Evidence also suggests that Dna2 may promote further resection during this pathway(90)(99). Although MMEJ/alt-EJ is thought to share the same initial resection mechanism as homologous recombination (HR), whether the full resection process is the same for these two pathways remains unclear(28)(100)(101). For example, this pathway is thought to occur independently of BRCA1(102). Recent studies demonstrate that ssDNA overhangs of at least 45-70 nucleotides (nt) can support MMEJ/alt-EJ in mammalian cells(98). DNA polymerase theta (Pol theta) is a unique A-family polymerase that is essential for MMEJ/alt-EJ (Fig. 1-3) and is further described below(95). Poly(ADP-ribose) polymerase 1 (PARP1) also promotes MMEJ/alt-EJ, and is thought to be required for the recruitment of Pol theta (Fig. 1-3)(95). Pol theta mediates microhomology search, ssDNA overhang pairing, and overhang extension via its DNA synthesis activity(96)(95). DNA Ligase III or DNA Ligase I seals the DNA ends (Fig. 1-3)(95)(9). Evidence also suggests that unpaired non-complementary ssDNA flanking the minimally paired ends is cleaved by an endonuclease(9), similar to SSA.

Deletions caused by MMEJ/alt-EJ are typically larger than 5-10 bp, and can be greater than 50 bp, whereas C-NHEJ generates relatively small deletions (i.e. <30 bp)(95). MMEJ/alt-EJ also promotes relatively large insertions (~1-10 bp) in 10-20% of repair products, whereas insertions due to C-NHEJ are typically very small (1 bp) and occur with very low frequency(95). Microhomology lengths utilized by MMEJ/alt-EJ are typically 2-8 bp, whereas C-NHEJ is not dependent on this base-pairing mechanism(95)(96). Evidence suggests that microhomology is not necessarily essential for MMEJ/alt-EJ but may promote this process(73). MMEJ mechanisms and their importance in genome instability and cancer are discussed in detail below.


MMEJ/Alt-EJ in Genome Instability and Cancer

Genome Instability

The genome is transferred to daughter cells in an accurate manner during mitotic cell division(103). This process is therefore essential for the propagation of species(103). The genome encodes proteins as well as non-coding RNAs and has many regulatory sequences and functions(104)(105)(106). Variation of the genome between individuals is also important for fitness and may therefore enable adaptation to different environments(107). Endogenous and exogenous agents may cause breaks and sequence changes in the genome and alter the epigenome through chromatin and chemical (e.g., methylation) changes in DNA(108)(109). Changes in the genome and epigenome may also occur during DNA replication and DNA repair(110)(111). Since proper genome maintenance and repair are essential for the existence of life, failures in genome maintenance can lead to diseases such as cancer(112).

Genome instability can result from many insults such as mutations, chromosomal translocations, chromosomal loss or duplications, DNA rearrangements, and loss of heterozygosity(6). Genome instability is not always associated with a disease but it usually enhances tumorigenesis and aging(112)(113). For example, chromothripsis and kataegis are extreme forms of genome instability observed in cancer cells(114). Hypermutation is another form of genome instability observed in many cancers(115)(116).

DNA repetitive elements represent more than 50% of the human genome(104), and these repetitive elements are prone to instability due to DNA rearrangements(114). Fragile sites are also prone to instability due to frequent gaps and breaks at these loci(117). Non-B DNA motifs such as G-quadruplexes (G4) are also associated with replication stress and mutations(118)(119). Telomeres, which consist of G-quadruplexes and repetitive sequences, are also prone to instability and require multiple layers of protection(119)(120).

Pol Theta and Cancer

DNA helicases and polymerases are important factors required for DNA replication and repair, and mutations in the genes encoding them can cause genome instability and tumorigenesis(121)(122)(123). For example, mutations in genes encoding RECQ-type DNA helicases such as BLM, WRN and RECQ4 cause genome instability phenotypes and human cancer predisposition syndromes(124)(125)(126)(127)(128). High-fidelity replicative Pols, such as the B-family polymerases delta, epsilon and alpha, are essential for accurate genome duplication(123). Several types of error-prone Pols which specialize in DNA repair, such as Y-family polymerases, are also necessary for promoting cellular tolerance to DNA damage(123)(129). For example, Y-family Pols evolved to promote replication past bulky lesions which otherwise block replicative Pols(123)(129).

Pol theta is a unique helicase-polymerase fusion protein involved in MMEJ/alt-EJ in higher eukaryotes(130)(131)(95). Pol theta is expressed at low levels in most tissues excluding lymphoid tissues and testis(132)(95). In contrast, it is highly upregulated in many human cancers including ovarian, breast, lung, stomach and colon cancers(132)(39)(133)(134). Pol theta is the most highly expressed DNA polymerase in breast cancers and high levels of the polymerase are associated with a poor clinical outcome for breast cancer patients(133). High levels of Pol theta expression also correlate with a poor prognosis in non-small cell lung cancer(134).

Upregulation of Pol theta in human cells is thought to inhibit the accurate DSB repair pathway HR(39). Consistent with this interplay, knockdown of Pol theta in human cells increases HR activity(39). Furthermore, knockdown of Pol theta in HR-deficient cells causes cell death, which demonstrates a synthetic lethal relationship between Polq and HR(39). This synthetic lethal relationship was also demonstrated in mice(39). Hence, although Pol theta is not essential for viability in normal cells, it is necessary for the proliferation of HR-deficient cells and mice and therefore is thought to compensate for the loss of HR by promoting MMEJ/alt-EJ(135)(136)(39).

Pol Theta as an Anti-Cancer Drug Target

Given that knockdown of Pol theta in HR-deficient cells causes cell death, and that Polq is not essential for the development or viability of mice, Pol theta is considered a promising drug target in HR-deficient cancers, such as those that form in the breast and ovary(137). HR-deficient tumors, such as those with BRCA1/2 mutations, are known to be highly sensitive to PARP inhibitors, which promote replication-dependent DSBs and suppress the DNA repair pathways base excision repair (BER) and MMEJ/alt-EJ(39)(138)(139)(140)(141). Thus, PARP inhibitors such as olaparib and rucaparib have been FDA approved to treat patients with BRCA-deficient breast and ovarian cancers(141). Since BRCA-deficient breast and ovarian cancers are highly sensitive to PARP inhibitors, there is an ongoing effort to find biomarkers of HR-deficient tumors(22). Because cellular resistance to PARP inhibitors has become a major problem in the clinic, there is also an ongoing effort to identify and develop alternative drug targets, such as Pol theta, for precision medicine in BRCA-deficient cancers(140)(142).

Although Pol theta is not essential for the survival of wild-type mice, Pol theta knockout mice exhibit increased levels of micronuclei in red blood cells, which is attributed to defects in DNA repair(136)(135). Pol theta knockout cells also exhibit decreased hypermutation in B cells(143). Despite these minor effects, Pol theta null mice appear to develop and grow normally(136). Thus, drug inhibitors of Pol theta as anti-cancer therapeutics are expected to have few or no side effects for patients.

Structure And Function Of Human Pol Theta

POLQ is a unique gene in higher eukaryotes that encodes a helicase-polymerase fusion protein involved in MMEJ/alt-EJ(95). Below, I review the general functions of Pols and helicases in DNA replication and repair, then describe in detail the structure and function of Pol theta.

Background of DNA Polymerases and Helicases

As mentioned above, DNA replication and repair require the activities of specific enzymes such as DNA polymerases and DNA helicases(123)(121). Humans encode 17 DNA polymerases classified in the following 4 families: A, B, X and Y(123)(144). The replicative polymerases involved in nuclear genome duplication are within the B family of polymerases (Pols delta, epsilon, alpha)(123)(144). Pol delta is also involved in HR repair of replication forks and DSBs(31)(145). A-family Pol gamma, which is related to Pol theta, is essential for replication of the mitochondrial genome(144)(146). The remaining Pols from the A, X and Y families are involved in various DNA repair pathways(123)(144). For example, as mentioned above, Pol theta is essential for MMEJ/alt-EJ and promotes translesion synthesis (TLS)(95)(144)(147)(96)(59). Y family Pols and the unique B family Pol zeta are primarily involved in promoting TLS, whereas X family Pols such as Pol mu and lambda promote C-NHEJ(123)(144)(148). Pol nu, which is an A-family member and is therefore also related to Pol theta, appears to be important for DNA repair during meiosis(149). Because DNA repair Pols such as Pol theta and X and Y family members lack proofreading activity, these enzymes exhibit relatively high rates of misincorporation and are therefore considered error-prone(123)(144).

Helicases are enzymes that unwind DNA and RNA and translocate along DNA and RNA using the energy of ATP hydrolysis(150)(122). Thus, helicases are involved in many DNA and RNA metabolic processes such as replication, repair and transcription(121). There are many types of helicases, which are classified into 6 superfamilies(121). Superfamily (SF) 1 and 2 enzymes have been widely studied and are typically involved in DNA repair(151). Prototypical SF1 and SF2 helicases that have been widely studied include the UvrD and RecQ helicases encoded by bacteria(152)(153). Several SF1 and SF2 family members exist in mammalian cells and include PIF1 (SF1 member) and a variety of RECQ-type helicases (SF2 members) including BLM and WRN helicases(121)(154)(155). SF1 and SF2 helicases share a similar catalytic core but exhibit different functions in RNA and/or DNA metabolism(121)(122)(156). Superfamily 3-6 members, such as MCM2-7 helicase, primarily form hexameric ring structures and are involved in genome duplication(121)(122).

DNA helicases cooperate with DNA polymerases during replication and repair in all organisms(123)(122). Although helicases and polymerases are almost exclusively encoded by separate genes in eukaryotic cells, POLQ is a unique gene in higher eukaryotes because it encodes a helicase-polymerase fusion protein in which the two enzymatic domains are separated by a disordered central domain (Fig. 1-4)(144)(130)(137). Other helicase-polymerase fusion proteins exist in bacteria and archaea and are generally involved in replication initiation(157).

History of Pol Theta Studies

In a Drosophila screen, the mus308 gene was discovered due to its importance in promoting cellular resistance to DNA cross-linking agents, which lead to replication fork arrest and DSBs(158)(159)(160). Characterization of the Mus308 protein showed that it is a helicase-polymerase fusion enzyme(161). In a mouse screen, the chaos1 (chromosome aberration occurring spontaneously 1) mutation was found to promote increased micronuclei in red blood cells and was later mapped to the Polq gene locus(135)(136). At the same time, the Wood lab cloned the human POLQ gene and full-length Pol theta was purified(130). These early studies showed that the relatively conserved SF2 helicase domain exhibits ATPase activity, and that the A-family polymerase domain is active in primer extension, similar to other Pols(130). Later, it was found that Pol theta and ATM are semi-synthetic lethal, indicating that Pol theta functions somewhat independently of the major ATM kinase DNA damage response factor(136). Genetic studies further showed that Drosophila mus308 acts independently of HR and C-NHEJ and is responsible for MMEJ/alt-EJ repair of DSBs(162).

Structure of Human Pol Theta

Human Pol theta consists of 2590 amino acids and is composed of three subdomains: a C-terminal A-family polymerase domain, an N-terminal SF2 helicase domain, and a disordered central domain of unknown function (Fig. 1-4)(95). The polymerase and helicase domains are highly conserved, whereas the central domain is much less conserved among metazoans(95)(137).

Figure 1-4. Human DNA Polymerase Theta. Schematic of Pol theta consisting of three domains: helicase (Polθ-helicase), central (Polθ-central) and polymerase (Polθ-polymerase).

Pol Theta Polymerase Domain

The A-family polymerase domain of human Pol theta consists of 799 amino acids (residues 1792-2590) and is structurally organized like most other Pols, containing palm, thumb and fingers subdomains, an arrangement often described as a partially closed right hand (Fig. 1-4)(161)(137). The Pol theta polymerase domain is highly similar in structure and sequence to DNA polymerase I enzymes from bacteria, but is a unique A-family polymerase because it can perform 3’ terminal activity and translesion synthesis(95)(163)(147)(137). These unique activities for this A-family member appear to be attributed to unique insertion motifs in the polymerase domain referred to as loops 1, 2 and 3(95). For example, insertion loop 2 was shown to be important for Pol theta MMEJ activity and ssDNA extension activity in vitro(95)(96)(163). Pol theta also includes a 3’-5’ exonuclease-like domain that is inactive due to an acquired mutation in the exonuclease active site(137). A similar but functional exonuclease is present in bacterial Pol I enzymes, such as Klenow fragment of E. coli(164).

Pol Theta Helicase Domain

The SF2 helicase domain of human Pol theta consists of 894 amino acids (residues 1-894) (Fig. 1-4)(95)(137)(165). It is similar in structure and sequence to HelQ/Hel308 helicases from eukaryotes and archaea(95)(165). The helicase domain has five subdomains: two core helicase domains, which are required for helicase activity and conserved across SF2 helicases, and three additional domains(165). The winged helix domain is commonly found in DNA binding proteins, but its function in Pol theta is unclear(165). The ratchet helix domain in helicases is suggested to ensure directional transport of the DNA substrate(165). The third domain has a helix-hairpin-helix motif(165). Polq-helicase is also somewhat similar in sequence to RECQ-like helicases, which function in HR and replication repair(95). For example, it shares 18% sequence identity with RECQL5(95). As mentioned above, Polq-helicase is known to exhibit ATPase activity that is stimulated by ssDNA(130). However, although most SF2 helicases exhibit DNA unwinding activity, multiple studies previously failed to identify this function for Polq-helicase(130)(165). Our studies described below will present a detailed analysis of Polq-helicase DNA unwinding activity. Recently, an ATP-independent ssDNA annealing function for Polq-helicase has been demonstrated(59). The isolated enzyme was also shown to promote dissociation of RPA-ssDNA complexes in an ATP-dependent manner, and cellular studies suggest that this ATPase function contributes to MMEJ/alt-EJ(59).

Pol Theta Central Domain

The central domain of human Pol theta is considered to be disordered since it mostly lacks identifiable secondary structural motifs (Fig. 1-4) (Fig. 1-5). It has been suggested to contain two RAD51 binding sites(39)(59). RAD51 is the essential recombinase in HR(69)(29). The same studies suggest Polq-helicase, which also contains a putative RAD51 binding site, may act as an anti-recombinase and thus potentially suppress HR in favor of MMEJ/alt-EJ(39). The central domain consists of 897 amino acids (residues 895-1791). As mentioned above, the central domain is the least conserved domain of Pol theta in terms of length and sequence(161). Although the function of the central domain has remained a mystery, our studies presented below provide new insight into how this enigmatic domain contributes to Polq MMEJ activity.

Figure 1-5. Highly Disordered Central Domain of Human DNA Polymerase Theta. Analysis of the human Pol theta central domain (Polθ-cen), consisting of amino acids 895 to 1791, using the Protein Disorder Prediction System (PrDOS). Plot shows disorder probability for each residue. Residue 1 of the plot corresponds to residue 895 of Pol theta. Ishida, T and Kinoshita, K, PrDOS: prediction of disordered protein regions from amino acid sequence, Nucleic Acids Res, 35, Web Server issue, 2007.
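The domain boundaries quoted in this section can be checked for internal consistency with a few lines of arithmetic. The sketch below is illustrative only: the dictionary and function names are hypothetical, and the boundaries are simply the residue ranges stated in the text (helicase 1-894, central 895-1791, polymerase 1792-2590).

```python
# Domain boundaries of human Pol theta as stated in the text (1-based, inclusive).
DOMAINS = {
    "helicase": (1, 894),
    "central": (895, 1791),
    "polymerase": (1792, 2590),
}
FULL_LENGTH = 2590  # total residues stated for human Pol theta


def domain_lengths(domains):
    """Return the number of residues in each domain."""
    return {name: end - start + 1 for name, (start, end) in domains.items()}


def is_contiguous(domains, full_length):
    """Check that the domains tile residues 1..full_length without gaps or overlaps."""
    spans = sorted(domains.values())
    if spans[0][0] != 1 or spans[-1][1] != full_length:
        return False
    return all(prev[1] + 1 == nxt[0] for prev, nxt in zip(spans, spans[1:]))


def plot_index_to_residue(index, domain_start=895):
    """Map a 1-based position in the central-domain plot to full-length numbering."""
    return domain_start + index - 1


lengths = domain_lengths(DOMAINS)
print(lengths)
print(sum(lengths.values()) == FULL_LENGTH, is_contiguous(DOMAINS, FULL_LENGTH))
print(plot_index_to_residue(1))
```

The three lengths computed this way (894, 897 and 799 residues) match the values given in the domain descriptions and sum to the stated full length.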

Function of Human Pol Theta

MMEJ/Alt-EJ

The first studies of alt-EJ were performed in yeast and mammalian cells by the Jasin, Jackson and Roth laboratories in 1996 and 1998. First, the Jasin and Jackson labs showed that Ku-independent DSB repair exists and that this pathway is separate from HR(93)(94). Second, the Roth lab claimed that alternative pathway(s) to C-NHEJ exist by showing that microhomology-dependent end joining is present in Ku86- and XRCC4-deficient cells(77). Later studies showed that alt-EJ is a Ku- and Lig4-independent form of DSB repair and that this pathway in mammalian cells is promoted by the following factors: MRN, CtIP, Ligase 3/1, PARP1 and Pol theta(95)(137)(166)(62)(5). Notably, Pol theta-dependent MMEJ/alt-EJ is absent in yeast, but appears to be present in all multicellular organisms including plants(62)(95)(167)(161).

The first evidence for Pol theta activity in MMEJ/alt-EJ was observed from genetic studies of Drosophila Mus308(162). In the absence of Pol theta, alternative end-joining is impaired(162), and analysis of break junctions suggests that Pol theta promotes alternative end-joining with long microhomologies and complex insertion events(162). Subsequent studies in C. elegans demonstrated that Pol theta-dependent DSB repair suppresses large deletions at highly stable G-quadruplex (G4) DNA structures during replication, and that Pol theta-dependent repair utilizes microhomology to pair opposing 3’ ssDNA overhangs generated at DSBs(118). The first mammalian study of Pol theta was performed by the Wood lab, which originally cloned POLQ. They found that Pol theta-dependent double-strand break repair is present in bone marrow stromal cells and mouse embryonic fibroblasts (MEFs) and that POLQ confers resistance to ionizing radiation (IR) and topoisomerase inhibitors(97). Later, the Sfeir lab showed that Pol theta promotes telomere fusions by MMEJ/alt-EJ in mammalian cells, and that these repair junctions in telomeres and other regions often contained insertions which were dependent on Pol theta expression(168). The mechanism by which the Polq-polymerase domain promotes MMEJ/alt-EJ was demonstrated in vitro by the Pomerantz lab(96).

Other Functions of Pol Theta

Translesion synthesis. Pol theta is a unique A-family polymerase with low fidelity and has the ability to perform TLS opposite abasic sites and thymine glycol lesions(169)(170)(171)(166)(172)(173)(174). Although Pol theta exhibits proficient TLS activity, it remains unclear whether this function of the enzyme is important for cellular tolerance to DNA damaging agents compared to its role in MMEJ/alt-EJ. For example, the possibility exists that Pol theta TLS activity may be useful for bypassing lesions along DSB overhangs caused by IR and therefore contribute to its function in MMEJ/alt-EJ. A few studies suggested that Pol theta contributes to somatic hypermutation and class switch recombination, but these findings have yet to be confirmed; the importance and actual function of Pol theta in these pathways therefore remain unclear(175)(176)(172).


CHAPTER 2

MATERIALS AND METHODS

Proteins

Pol Theta Helicase And Pol Theta Helicase K121M Purification

pE-SUMOstar vector (LifeSensors) containing the Polθ-helicase cDNA (aa 1–894) was transformed into Rosetta2(DE3)/pLysS E. coli cells (Stratagene). Freshly grown colonies were picked from a plate and resuspended in 20 ml LB broth. 1 ml of resuspended cells was added to 1 liter of autoinduction medium (1X Terrific Broth (USB Corporation), 0.5% w/v glycerol, 0.05% w/v dextrose, 0.2% w/v alpha-lactose, 100 µg/ml ampicillin and 34 µg/ml chloramphenicol) in a 2.8-liter Fernbach flask. The flasks were shaken at 20 °C for 60 h. Six liters of culture were grown and the resulting E. coli pellets were stored at −80 °C. Frozen pellets were thawed on ice and resuspended in buffer containing 50 mM HEPES, pH 8.0, 500 mM NaCl, 10% (v/v) glycerol, 10 mM imidazole, pH 8.0, 1.5% (v/v) Igepal CA-630 (Sigma), 5 mM 2-β-mercaptoethanol (BME), 10 mM PMSF, and 1 tablet of Complete EDTA-free protease inhibitor cocktail (Roche) per 50 ml, at a volume of 5 ml of buffer per gram of cell pellet. The resuspended cells were sonicated on ice with constant stirring then centrifuged at 27,000g. The clarified cell lysate was loaded onto a 5-ml HisTrap FF Crude column (GE Lifesciences) and washed with buffer A (50 mM HEPES, pH 8.0, 450 mM NaCl, 10% (v/v) glycerol, 10 mM imidazole, pH 8.0, 5 mM BME and 0.005% v/v Igepal CA-630). Bound protein was then eluted with a gradient from buffer A to buffer B (50 mM HEPES, pH 8.0, 450 mM NaCl, 10% (v/v) glycerol, 0.005% (v/v) Igepal CA-630, 5 mM BME and 250 mM imidazole, pH 8.0). Fractions containing Polθ-helicase were pooled, mixed with 25 units of SUMOStar protease (LifeSensors, #4110), and dialyzed against buffer C (50 mM HEPES, pH 8.0, 450 mM NaCl, 10% (v/v) glycerol, 5 mM DTT and 0.005% v/v Igepal CA-630) overnight at 4 °C. The protein mixture was then loaded onto a 5-ml HisTrap HP column and washed with buffer C. Cleaved Polθ-helicase was separated from uncleaved protein and the protease by applying a gradient to buffer B. Fractions containing cleaved Polθ-helicase were concentrated and stored in aliquots at −80 °C. All steps of the purification process were performed at 4 °C.
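The resuspension step scales with pellet mass (5 ml of buffer per gram of cell pellet, one protease inhibitor tablet per 50 ml). A minimal sketch of that scaling is given below. The function name, the example pellet mass, and the choice to round the tablet count up to a whole tablet are assumptions made for illustration and are not stated in the protocol.

```python
import math


def lysis_buffer_plan(pellet_mass_g, ml_per_gram=5.0, ml_per_tablet=50.0):
    """Return (buffer volume in ml, number of inhibitor tablets) for a cell pellet.

    ml_per_gram defaults to the 5 ml/g ratio quoted for Pol theta helicase.
    Rounding tablets up to a whole number is an assumption, not part of the protocol.
    """
    if pellet_mass_g <= 0:
        raise ValueError("pellet mass must be positive")
    volume_ml = pellet_mass_g * ml_per_gram
    tablets = math.ceil(volume_ml / ml_per_tablet)
    return volume_ml, tablets


# Hypothetical 30 g pellet; the actual pellet masses are not reported.
print(lysis_buffer_plan(30))
```

The same helper covers other ratios by changing `ml_per_gram`, for example `lysis_buffer_plan(30, ml_per_gram=10)`.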

RPA Purification

Hexahistidine-tagged RPA expression vector was transformed into Rosetta2(DE3)/pLysS E. coli cells (Stratagene). Freshly grown colonies were inoculated into 50 ml of LB with 50 µg/ml ampicillin and 34 µg/ml chloramphenicol and incubated overnight at 37 °C with agitation. The preculture was then diluted 100-fold into 6 liters of LB with 50 µg/ml ampicillin and 34 µg/ml chloramphenicol and incubated at 37 °C with agitation until O.D. at 600 nm reached 0.6. The culture was cooled to 16 °C then protein expression was induced with 1 mM IPTG at 16 °C for 16–18 h. Cells were harvested by centrifugation for 15 min at 5,000 × g. Cell pellets were frozen and stored at −80 °C. The frozen cell paste corresponding to 6 liters of starter culture was thawed on ice and resuspended in buffer containing 40 mM Tris-HCl, pH 7.5, 500 mM NaCl, 10% (v/v) glycerol, 10 mM imidazole, pH 8, 5 mM 2-β-mercaptoethanol, 10 mM PMSF, and one tablet of Complete EDTA-free protease inhibitor cocktail (Roche) per 50 ml, at a volume of 10 ml of buffer per gram of cell pellet. The resuspended cells were sonicated on ice with constant stirring and then centrifuged at 27,000g. The clarified cell lysate was loaded onto a 5-ml HisTrap FF Crude column (GE Lifesciences) and washed with buffer A (20 mM Tris-HCl, pH 7.5, 250 mM NaCl, 10% (v/v) glycerol, 10 mM imidazole, pH 8, and 5 mM BME). Bound fractions were then eluted with a gradient to 100% of elution buffer B (20 mM Tris-HCl, pH 7.5, 250 mM NaCl, 500 mM imidazole, pH 8.0, 10% glycerol and 5 mM BME). Fractions containing trimeric RPA were pooled and dialyzed against buffer C (20 mM Tris-HCl, pH 7.5, 50 mM NaCl, 10% (v/v) glycerol and 5 mM BME) overnight at 4 °C. Next, the protein was loaded onto a 5-ml HiTrap Q HP column (GE Lifesciences), washed with buffer C, then eluted with a gradient to 100% buffer D (20 mM Tris-HCl, pH 7.5, 500 mM NaCl, 5 mM BME, 10% glycerol). Fractions were resolved and analyzed in a 4–15% SDS–PAGE gel (BioRad). Pure RPA fractions containing equimolar amounts of each subunit were pooled and dialyzed against 2 liters of storage buffer (20 mM Tris-HCl, pH 7.5, 250 mM NaCl, 5 mM BME, 10% glycerol) overnight at 4 °C, then stored in aliquots at −80 °C.

Nucleic Acid Unwinding

5 nM 5’-32P radiolabeled DNA, RNA-DNA or RNA templates were preincubated at room temp in buffer (25 mM Hepes-NaOH, pH 7.0, 2 mM DTT, 0.01% NP-40, 40 mM KCl, 5% glycerol, 1 mM MgCl2) then mixed with the indicated amounts of Polθ-helicase for 5 min. This was followed by the addition of 2 mM ATP and 200 nM ssDNA trap and reactions were incubated for the indicated times at 30 °C in a total volume of 20 µl. Reactions were terminated by the addition of 4 µl of non-denaturing stop buffer (0.2 M Tris-HCl, pH 7.5, 10 mg/ml proteinase K, 100 mM EDTA, and 0.5% SDS) then resolved in non-denaturing 12% polyacrylamide gels and visualized by phosphorimager (Fujifilm FLA 7000) or autoradiography. For RPA stimulation experiments, the indicated amounts of RPA were pre-incubated with DNA for 5 min, then the indicated amounts of Polθ-helicase were added for an additional 5 min. Reactions were then initiated as above. Unwinding experiments utilizing substrates with a 23 bp duplex were incubated at 37 °C.
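The figures in Chapter 3 report unwinding as a percentage per lane. The text does not state how this value was computed, so the following is a hedged sketch of one common convention: the radiolabeled ssDNA product band divided by the sum of product and duplex substrate bands, with an optional correction for the fraction of product in a no-enzyme lane. The function name, the correction, and the example intensities are assumptions for illustration, not the method or data of this work.

```python
def percent_unwinding(product, substrate, blank_fraction=0.0):
    """Percent of labeled duplex converted to ssDNA product in one gel lane.

    product, substrate: band intensities in arbitrary phosphorimager units.
    blank_fraction: product fraction in the no-enzyme lane (assumed correction).
    """
    total = product + substrate
    if total <= 0:
        raise ValueError("lane has no signal")
    if not 0.0 <= blank_fraction < 1.0:
        raise ValueError("blank_fraction must be in [0, 1)")
    fraction = product / total
    corrected = (fraction - blank_fraction) / (1.0 - blank_fraction)
    return 100.0 * max(0.0, corrected)


# Hypothetical band intensities for an enzyme titration (not data from this work).
lanes = [(0, 5000), (2500, 2500), (4000, 1000)]
for product, substrate in lanes:
    print(round(percent_unwinding(product, substrate), 1))
```

Normalizing within each lane makes the value insensitive to lane-to-lane loading differences, which is the usual reason for this choice.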

Sequence Alignment

The indicated amino acid sequences of the helicase domain of Homo sapiens Pol theta and other indicated SF2/Ski2 family helicases were aligned using Clustal Omega (http://www.ebi.ac.uk/Tools/msa/clustalo/; European Bioinformatics Institute) with default settings. Locations and numbers of β-sheets and α-helices are indicated for the Polθ-helicase domain based on previous structural analysis.

Superposition of Pol Theta Helicase and Hel308 Structures

The carbon alpha (Ca) bound form of Hel308 (PDB code 2p6r) was used as reference onto which the Polθ-helicase domain (PDB code 5aga) was superimposed using Swiss-PDBViewer. Using the least-squares fitting option, 1,432 matching atoms were found to superimpose with an RMSD of 1.55 Å. Images were generated with PyMOL software.
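For reference, the RMSD reported for a superposition is the square root of the mean squared distance between matched atom pairs. The sketch below computes that quantity for coordinates that have already been superimposed; it does not perform the least-squares fit itself, which was done in Swiss-PDBViewer. The function name and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized lists of (x, y, z) points.

    Assumes the two coordinate sets are already superimposed and paired atom by atom.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example: three atoms, each displaced by exactly 1 Å along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
mobile = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
print(rmsd(reference, mobile))
```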


Pol Theta Strand Displacement Synthesis

10 nM 5’-32P radiolabeled DNA pre-incubated at room temp in 25 mM Tris-HCl, pH 8.8, 1 mM DTT, 0.01% NP-40, 0.1 mg/ml BSA, 10% glycerol, 10 mM MgCl2 was mixed with or without 50 nM Polθ-helicase. Next, 2 mM ATP and 20 µM dNTPs were added along with 400 nM of unlabeled ssDNA trap for 30 min at 30 °C. Polθ-polymerase was added for an additional 20 min in a total volume of 20 µl. Reactions were terminated by the addition of 20 µl of 2x denaturing stop buffer (90% formamide and 50 mM EDTA) then resolved in denaturing urea polyacrylamide gels and visualized by phosphorimager (Fujifilm FLA 7000).

Strand Exchange

5 nM 5’-32P radiolabeled DNA (AO5/AO6) was preincubated at room temp in buffer (25 mM Hepes-NaOH, pH 7.0, 2 mM DTT, 0.01% NP-40, 40 mM KCl, 5% glycerol, 1 mM MgCl2) then mixed with the indicated amounts of Polθ-helicase and 5 nM AO7 for 10 min. This was followed by the addition of 2 mM ATP for a further 30 min at 37 °C in a total volume of 20 µl. Reactions were terminated by the addition of 4 µl of non-denaturing stop buffer (0.2 M Tris-HCl, pH 7.5, 10 mg/ml proteinase K, 100 mM EDTA, and 0.5% SDS). DNA was resolved in a non-denaturing 10% polyacrylamide gel and visualized by phosphorimager (Fujifilm FLA 7000).


MMEJ

10 nM 5’-32P radiolabeled DNA preincubated at room temp in buffer (25 mM Tris-HCl pH 8.8, 1 mM DTT, 0.01% NP-40, 0.1 mg/ml BSA, 10% glycerol, 10 mM MgCl2) was mixed with Pol theta for 5 min; this was followed by the addition of 1 mM ATP and 20 µM dNTPs and incubation for 45 min or other times as indicated at 37 °C in a total volume of 20 µl. For analysis in non-denaturing gels, reactions were terminated by the addition of 5 µl of non-denaturing stop buffer (0.2 M Tris-HCl, pH 7.5, 10 mg/ml proteinase K, 100 mM EDTA, and 0.5% SDS). DNA was resolved in non-denaturing 12% polyacrylamide gels and analyzed by phosphorimager (Fujifilm FLA 7000). For time course experiments, a 20 µl sample of the reaction was removed at the specified time points after the addition of Pol theta and transferred into tubes containing 5 µl of non-denaturing stop buffer.

Primer Extension

10 nM 5’-32P radiolabeled DNA preincubated at room temp in buffer (25 mM Tris-HCl, pH 8.8, 1 mM DTT, 0.01% NP-40, 0.1 mg/ml BSA, 10% glycerol, 10 mM MgCl2) was mixed with Pol theta for 5 min; this was followed by the addition of 2 mM ATP and 20µM dNTPs and incubation for 30 min at 37°C in a total volume of 20 µl. For analysis in denaturing gels, reactions were terminated by the addition of 20 µl of 2x denaturing stop buffer (90% formamide and 50 mM EDTA). DNA was resolved in denaturing 15% polyacrylamide gels and analyzed by phosphorimager (Fujifilm FLA 7000).


ATPase Assay

The indicated amounts of proteins were incubated with 10 µM ATP, 2 µCi of (γ-32P) ATP and 100 nM ssDNA (29 nt poly-dT) in 5 µl of buffer (50 mM Tris-HCl, pH 7.5, 10 mM MgCl2, 5 mM DTT, 0.1 mg/ml BSA, and 10% v/v glycerol) at room temp for 20 min. The reaction mixture was then spotted onto a PEI-cellulose TLC plate, which was developed in a buffer containing 1 M acetic acid and 0.25 M LiCl for 1.5 h. Plates were dried, then visualized by phosphorimager.

EMSA

10 nM 5’-32P radiolabeled DNA preincubated at room temp in buffer (25 mM Tris-HCl, pH 8.8, 1 mM DTT, 0.01% NP-40, 0.1 mg/ml BSA, 10% glycerol, 10 mM MgCl2) was mixed with Pol theta for 5 min; this was followed by the addition of 1 mM ATP and incubation for 30 min at 30 °C in a total volume of 20 µl. For analysis in non-denaturing agarose gels, reactions were crosslinked by the addition of 2 µl of 2% glutaraldehyde for 15 min at room temp. DNA was resolved in thin non-denaturing 0.8% agarose gels. The wells of the gel were cleaned before the pre-run and the run, and the gel was pre-run for 30 min. After the DNA was resolved, the agarose gel was placed between Amersham Hybond-N+ membranes and paper towels and pressed with a heavy weight for 30 min. The agarose gel together with the Amersham Hybond-N+ membranes was analyzed by phosphorimager (Fujifilm FLA 7000).


Fluorescence Anisotropy

Reactions were performed at room temp in 25 mM Tris-HCl, pH 8.8, 1 mM DTT, 0.01% NP-40, 10 mM MgCl2, 10% glycerol, 0.1 mg/ml BSA, 1 mM ATP for 30 min. Reactions contained 10 nM FAM-conjugated ssDNA (RP316FAM5), and the indicated amounts of enzyme. A Clariostar (BMG Labtech) plate reader was used to measure fluorescence anisotropy. All experiments were performed in triplicate, normalized, and plotted with ± s.d.
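As a point of reference, fluorescence anisotropy is conventionally defined from the parallel and perpendicular emission intensities as r = (I∥ − G·I⊥)/(I∥ + 2·G·I⊥), where G is an instrument correction factor. The sketch below applies that standard definition and summarizes triplicates as mean ± s.d. The function names and example values are hypothetical, the use of the sample standard deviation is an assumption, and the normalization step mentioned above is not reproduced because its details are not given.

```python
import statistics


def anisotropy(i_parallel, i_perpendicular, g_factor=1.0):
    """Standard fluorescence anisotropy from polarized emission intensities."""
    denominator = i_parallel + 2.0 * g_factor * i_perpendicular
    if denominator <= 0:
        raise ValueError("total intensity must be positive")
    return (i_parallel - g_factor * i_perpendicular) / denominator


def mean_and_sd(replicates):
    """Mean and sample standard deviation (n - 1) of replicate measurements."""
    return statistics.mean(replicates), statistics.stdev(replicates)


# Hypothetical triplicate anisotropy readings at one enzyme concentration.
mean, sd = mean_and_sd([0.10, 0.12, 0.14])
print(round(mean, 3), round(sd, 3))
```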

Scanning Force Microscopy

Proteins (10 nM) ± ssDNA (20 nM) were incubated in reaction buffer containing 25 mM Tris-HCl pH 8.8, 30 mM NaCl, 10 mM MgCl2, 2 mM ATP, 0.01% NP-40 and 1 mM DTT. Incubations were carried out at 37 °C for 30 min and samples were deposited onto freshly cleaved mica. After 15 sec the mica surface was washed with Milli-Q water and dried with a stream of filtered air. Images were obtained on a NanoScope IV SFM (Digital Instruments; Santa Barbara, CA) operating in tapping mode in air with a type J scanner. Silicon Nanotips were from AppNano (Santa Clara, CA). Images were collected at 3 µm × 3 µm and flattened to remove background slope using Nanoscope software. The size of proteins was measured from NanoScope images imported into IMAGE SXM 1.89 (National Institutes of Health IMAGE version modified by Steve Barrett, Surface Science Research Centre, Univ. of Liverpool, Liverpool, U.K.). Statistical analysis was done using QtiPlot (Version 0.9.8.9 svn 2288) and LibreOffice (Version: 5.1.6.2). The volumes of protein monomers, measured from SFM images of proteins deposited in 10 mM Tris-HCl pH 8.8/100 mM KCl, were 100 nm3 for Polθ-hel and Polθ-pol, 210 nm3 for Polθ∆cen and 320 nm3 for Pol theta and PolθK121M. These values (± 50%) were used to normalize the volume of observed complexes. Quantification is presented as % of total proteins present in a given oligomeric state.
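The volume normalization described above can be sketched as follows. The monomer volumes are those stated in the text; everything else is an assumption made for illustration: the dictionary keys, the rule of assigning each complex to the nearest whole number of monomers (consistent with, but not dictated by, the ± 50% tolerance), and the reading of "% of total proteins" as weighting each complex by the number of monomers it contains. The example volumes are invented.

```python
import math
from collections import Counter

# Monomer volumes (nm^3) stated in the text; the keys are shorthand labels.
MONOMER_VOLUME_NM3 = {
    "helicase": 100.0,     # Pol theta helicase domain
    "polymerase": 100.0,   # Pol theta polymerase domain
    "delta_cen": 210.0,    # Pol theta lacking the central domain
    "full_length": 320.0,  # full-length Pol theta
    "K121M": 320.0,        # full-length K121M mutant
}


def oligomeric_state(volume_nm3, protein):
    """Assign a complex to the nearest whole number of monomers (assumed rule)."""
    ratio = volume_nm3 / MONOMER_VOLUME_NM3[protein]
    return max(1, math.floor(ratio + 0.5))


def percent_protein_by_state(volumes_nm3, protein):
    """Percent of total monomers found in each oligomeric state (assumed weighting)."""
    monomers_in_state = Counter()
    for volume in volumes_nm3:
        n = oligomeric_state(volume, protein)
        monomers_in_state[n] += n
    total = sum(monomers_in_state.values())
    return {n: 100.0 * count / total for n, count in sorted(monomers_in_state.items())}


# Hypothetical measured volumes for the helicase domain (not data from this work).
print(percent_protein_by_state([95.0, 110.0, 205.0, 390.0], "helicase"))
```

Weighting by monomer count rather than by complex count gives larger assemblies proportionally more weight; if the intended quantity is the percent of complexes, the `+= n` would become `+= 1`.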

DNA And RNA

Templates are as follows.

Figure 3-1c,e,f,h and Figure 3-2b, RP469D/RP470D.

Figure 3-1d and Figure 3-2c, RP469D/RP484.

Figure 3-2a, AO8/AO1, AO8/AO10.

Figure 3-2g, AO8/AO1.

Figure 3-2d, RP469D/RP469DC.

Figure 3-2e, RP470D/RP485.

Figure 3-3a, RP469R/RP470D.

Figure 3-3b, RP469R/484.

Figure 3-3c, RP469R/RP469DC.

Figure 3-3e, RP469R/RP470R.

Figure 3-3d, RP469D/RP470R.

Figure 3-4a, Fork A: RP470D/AO12/RP485, Fork B: RP470D/RP485/AO13, Fork C: RP470D/AO12/RP485/AO13, Fork D: RP470D/RP485.

Figure 3-4b, Leading strand fork: RP470D/AO12/AO18, Lagging strand fork: RP470D/AO17/AO19.

Figure 3-4c, RP470D/AO12/RP485.

Figure A-1a, RP469D/RP470D.

Figure A-1b, RP420/RP420C.

Figure A-1c, AO5/AO6/AO7.

DNA and RNA were 32P-5’-radiolabeled with T4 polynucleotide kinase (NEB) and [γ-32P] ATP (Perkin Elmer). Substrates were annealed by mixing short and long strands at a ratio of 1:2, then boiling and slow cooling to room temp.

DNA and RNA oligonucleotides (Integrated DNA Technologies) are as follows (5’-3’):

RP469D, CTGTCCTGCATGATG;

RP469R, CUGUCCUGCAUGAUG;

RP469DC, CATCATGCAGGACAG;

RP470D, CATCATGCAGGACAGTCGGATCGCAGTCAG;

RP470R, CAUCAUGCAGGACAGUCGGAUCGCAGUCAG;

RP484, TCGGATCGCAGTCAGCATCATGCAGGACAG;

AO8, CATGCTGTCTAGAGACTATCGAT;

AO1, ATCGATAGTCTCTAGACAGCATGTCCTAGCAAGCCAGAATTCGGCAGCGT;

AO10, TCCTAGCAAGCCAGAATTCGGCAGCGTATCGATAGTCTCTAGACAGCATG;

RP485, TTTTCGCCTTTTGCTCTGTCCTGCATGATG;

AO12, CTGACTGCGATCCGA;

AO13, AGCAAAAGGCGAAAA;

AO17, TGCATCTAGAGGGCTCTGTCCTGCATGATG;

AO18, TGCATTCGAATTACTCTGTCCTGCATGATG;

AO19, AGCCCTCTAGATGCA;

NH1, ATTGCGGTCTCAAGC;

NH2, AGCAAAAGGCGAAAA;

RP420, TTCAGAATGTGCCAGTAGATTTTGAAATCAA;

RP420C, TTGATTTCAAAATCTACTGGCACATTCTGAA;

AO5, ACTATCATTCAGTCATGTAACCTAGTCAATCTGCGAGCTCGAATTCACTGGAGTGACCTC;

AO6, ATTGACTAGGTTACATGACTGAATGATAGT;

AO7, GAGGTCACTCCAGTGAATTCGAGCTCGCAGCCCCTCTAGGTTACATGACTGAATGATAGT.
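The geometry of each annealed substrate follows from where the short strand base-pairs on the long strand. The sketch below illustrates this with oligonucleotides from the list above; it is a simple exact-match check, not a thermodynamic prediction, and the function names are hypothetical. It reports the number of unpaired nucleotides on the 5’ and 3’ sides of the duplex on the long strand.

```python
# Watson-Crick complement table; U is treated as pairing with A so RNA strands can be used.
COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement(seq):
    """Return the DNA reverse complement of a DNA or RNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def duplex_layout(short, long):
    """Locate where `short` pairs on `long`.

    Returns (5' flank length, 3' flank length) on the long strand, or None
    if the short strand is not exactly complementary to any part of it.
    """
    target = reverse_complement(short)
    long_dna = long.upper().replace("U", "T")
    position = long_dna.find(target)
    if position < 0:
        return None
    return position, len(long_dna) - position - len(target)


RP469D = "CTGTCCTGCATGATG"
RP469R = "CUGUCCUGCAUGAUG"
RP470D = "CATCATGCAGGACAGTCGGATCGCAGTCAG"
RP470R = "CAUCAUGCAGGACAGUCGGAUCGCAGUCAG"
RP484 = "TCGGATCGCAGTCAGCATCATGCAGGACAG"

print(duplex_layout(RP469D, RP470D))  # flanks on RP470D
print(duplex_layout(RP469D, RP484))   # flanks on RP484
print(duplex_layout(RP469R, RP470R))  # RNA-RNA pair
```

With these sequences, RP469D/RP470D gives a 15 bp duplex with a 15 nt 3’ overhang, and RP469D/RP484 gives a 15 bp duplex with a 15 nt 5’ overhang, consistent with the substrates described for Figure 3-1c and Figure 3-1d.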

Materials For Full Length Pol Theta Purification From Yeast

3xFLAG Peptide (Sigma-Aldrich, cat.no. F4799)

Agar (Fisher Bioreagents, cat.no. BP1423)

Amber glass jugs (Thermo Scientific, cat.no. 145-4000)

ANTI-FLAG M2 Resin (Sigma-Aldrich, cat.no. A2220)

Arginine (Sigma-Aldrich, cat.no. A5006)


ATP disodium salt trihydrate (Fisher Bioreagents, cat.no. BP413)

Benzamidine hydrochloride (Sigma-Aldrich, cat.no. B6506)

Bottletop filters, 1000 mL, 0.22 µm PES filter, sterile (Cell Treat Scientific Products, cat.no. 229718)

Bradford Dye

Centrifuge bottles polycarbonate with cap assemblies (Beckman Coulter, cat.no. 355618)

Disposable 10 mL polypropylene columns (Thermo Scientific, cat.no. 29924)

DNaseI (RNase-free) (NEB, cat.no. M0303)

DTT (Bioworld, cat.no. 40400120-5)

EDTA (Oakwood Chemical, cat.no. 238173)

EZ-Plate beads (Sunrise Science Products, cat.no. 3001-000)

Galactose (Sigma-Aldrich, cat.no. G8270)

Glutamic Acid (Sigma, G1251)

Glycerol (Macron Fine Chemicals, cat.no.5092-16)

Glycine (Dot Scientific Inc, DSG36050)

HEPES (Oakwood Chemical, cat.no. 047861)

Igepal CA-630 (Sigma-Aldrich, I3021)

Lithium acetate dihydrate (Sigma-Aldrich, cat.no. L6883)

LoBind tubes 5 mL (Eppendorf, cat.no. 0030108302)

LoBind tubes 50 mL (Eppendorf, cat.no. 0030122240)

Magnesium chloride hexahydrate (Fisher Chemicals, cat.no. M35-500)

Milli-Q sterile water

Mini-protean TGX gels, 7.5 % (BioRad, cat.no. 456-1024)

PEG 3350 (Sigma-Aldrich, cat.no. P3640)

Peptone (Dot Scientific Inc, cat.no. DSP20240)

PMSF: Phenylmethylsulfonyl fluoride (Dot Scientific Inc, cat.no. DSP20270)

Raffinose (Sigma-Aldrich, cat.no. R0250)

Salmon sperm DNA (Sigma-Aldrich, cat.no. D1626)

SigmaFast protease inhibitor cocktail tablets, EDTA-free (Sigma Life Science, cat.no. S8830)

Sodium Chloride (Dot Scientific Inc, DSS23020)

Sorbitol (Sigma Life Science, cat.no. S1876)

Spectra/Por Standard RC dialysis tubing (Spectrum Labs, cat.no. 132678)

TRIS base (Dot Scientific Inc, DST60040)

Yeast extract (Sigma-Aldrich, cat.no. Y1625)

Yeast nitrogen base without amino acids (Dot Scientific Inc, cat.no. DSY20040)

Yeast synthetic drop-out medium (SC-TRP powder, Sunrise Science Products, cat.no. 1305-030)

CHAPTER 3

POL THETA HELICASE UNWINDS DNA

Introduction

POLQ is a unique gene in higher eukaryotes that encodes an N-terminal superfamily 2 (SF2) helicase and a C-terminal A-family polymerase with a large central domain that lacks any known enzymatic domain (Fig. 3-1a)(95)(137)(62). Understanding the biochemical activities and cellular functions of Polθ has become a priority because it has been found to be essential for the error-prone double-strand break (DSB) repair pathway known as microhomology-mediated end-joining (MMEJ) or alternative end-joining (alt-EJ)(62)(168)(149)(97)(162)(96)(21)(118)(98). Remarkably, Pol theta expression has also been shown to be important for the proliferation of cells deficient in the homologous recombination (HR) pathway, such as those with mutations in BRCA1 or BRCA2(168)(39). Recent studies additionally demonstrate that Pol theta is responsible for random DNA integration into the genomes of mammalian cells, and for T-DNA integration into plant genomes(177)(167)(178). In addition to these functions, Pol theta was shown to be essential for DSB repair in zebrafish embryos and is involved in replication timing and potentially replication fork repair(39)(11)(179). Thus, the recent expansion of Pol theta studies has revealed multiple essential and important functions for this enigmatic protein in DNA repair and cancer proliferation.

Although multiple studies have begun to elucidate the functions of the polymerase domain (Polθ-polymerase), very little is understood about the helicase domain (Polθ-helicase), which is an SF2 helicase member (Fig. 3-1a). For example, a seminal report investigating Pol theta activities found that the helicase exhibits ATPase activity as predicted from its conserved helicase motifs (i.e. nucleotide binding, ssDNA binding, and core helicase motifs; Fig. 3-1a)(130). However, although Polθ-helicase exhibits robust ATPase activity, the study failed to identify any DNA unwinding activity by the enzyme. Consistent with this, a more recent study reported that Polθ-helicase is unable to unwind DNA(165). Interestingly, Ceccaldi, et al. (39) reported that Polθ-helicase interacts with RAD51 via specific binding motifs and exhibits anti-recombinase activity due to its ability to counter RAD51 activity. Despite these initial findings, the biochemical and cellular functions of Polθ-helicase have yet to be fully elucidated.

Figure 3-1. Polθ-helicase unwinds DNA in an ATP- and dATP-dependent manner. A, schematic of Pol theta. B, denaturing SDS gel of purified Polθ-helicase. C, schematic of unwinding assay (left). Non-denaturing gel showing Polθ-helicase unwinding of the indicated DNA substrate with a 3' ssDNA overhang (right). D, non-denaturing gel showing Polθ-helicase unwinding of the indicated DNA substrate with a 5' ssDNA overhang. E and F, non-denaturing gels showing Polθ-helicase DNA unwinding in the presence of the indicated nucleotides. % unwinding is indicated. G, SDS gel of purified Polθ-helicase K121M (left). Non-denaturing gel showing the lack of Polθ-helicase K121M DNA unwinding (right). *, 32P.

Because Polθ-helicase is most closely related to Hel308/HELQ-type and RecQ-type helicases, it likely shares activities with these widely studied groups of motor proteins(95)(180). For example, many RecQ helicases exhibit both DNA unwinding and annealing activities(181). Because these mechanisms can compete with one another, they can also mask each other in biochemical assays. For example, in recent studies we found that Polθ-helicase exhibits DNA annealing activity, similar to RecQ-type helicases(59). Specifically, Polθ-helicase promotes ssDNA annealing in an ATP-independent manner in the absence of the ssDNA binding protein RPA(59). However, when RPA is pre-bound to ssDNA, Polθ-helicase requires ATP hydrolysis to promote ssDNA annealing(59). These studies link the ATP-dependent annealing activity of the helicase to alt-EJ by showing that it counteracts RPA to promote end-joining(59) (reviewed in Ref.(182)).

Because Polθ-helicase promotes DNA annealing, we envisaged that this activity likely opposes its unwinding function, which would explain why DNA unwinding by the helicase has been difficult to detect. Indeed, here we demonstrate that when ssDNA annealing is masked, Polθ-helicase efficiently unwinds several different types of DNA substrates with 3’-5’ polarity, including replication forks, blunt-ended DNA, and DNA with 3’ or 5’ overhangs. We further demonstrate that Polθ-helicase efficiently unwinds RNA-DNA hybrids and preferentially displaces the lagging strand from model replication forks, similar to the related HELQ/Hel308 helicase. These findings suggest that Polθ-helicase DNA unwinding contributes to the many activities of Pol theta in genome maintenance, and highlight a new activity for this enigmatic multi-functional enzyme.

Results

Pol Theta Helicase Unwinds DNA in an ATP and dATP Dependent Manner

Considering that Polθ-helicase exhibits annealing activity like related RecQ-type helicases(59), it can conceivably rewind DNA after unwinding it, which would prevent detection of its unwinding function. We therefore developed an assay that would mask the annealing activity immediately following DNA unwinding by the helicase. Polθ-helicase (residues 1-894) was expressed and purified from E. coli using an N-terminal tandem hexahistidine-SUMO tag, which was subsequently cleaved (Fig. 3-1b)(59). The purified helicase was incubated with a radio-labeled DNA substrate containing a 3’ ssDNA overhang, referred to as partial ssDNA (pssDNA), in standard buffer conditions in the presence of MgCl2 (Fig. 3-1c). Next, the ATPase activity of the helicase was initiated by adding ATP along with excess ssDNA trap that is identical to the short strand within the pssDNA substrate. Here, if the helicase unwinds the DNA duplex, then the excess unlabeled ssDNA trap will preferentially anneal to the complementary long strand within the pssDNA substrate, so that the displaced strand remains single-stranded. Consistent with this, we detected helicase-dependent unwinding in the presence of the ssDNA trap (Fig. 3-1c), and show that excess sequence-specific ssDNA trap is essential for detection of Polθ-helicase unwinding (Fig. A-1a: Appendix A). To our knowledge, these data are the first to document Polθ-helicase unwinding.

Next, we utilized the optimized unwinding assay to further characterize the enzyme’s unwinding activity on various substrates. Unexpectedly, we observed that the helicase is able to unwind substrates containing 3’ and 5’ overhangs with similar efficiency (compare Fig. 3-1c and Fig. 3-1d). Although related SF2 enzymes such as HELQ, also known as Hel308, translocate along ssDNA with a 3’-5’ polarity(180), the data presented thus far fail to reveal a particular polarity for Polθ-helicase. Nevertheless, we proceeded to determine which nucleotide cofactors support the enzyme’s unwinding activity. The results show that the helicase exclusively utilizes nucleotides containing adenine, but more efficiently unwinds DNA in the presence of ATP compared to deoxyadenosine triphosphate (dATP) (Fig. 3-1e). We further find that Polθ-helicase is unable to unwind DNA in the presence of the non-hydrolyzable ATP analog AMP-PNP, which demonstrates that the enzyme harnesses the energy of ATP hydrolysis to unwind DNA as expected (Fig. 3-1f). Lastly, we demonstrate that Polθ-helicase possessing a mutation of a highly conserved lysine (K121M) within the Walker A motif, known to be essential for ATP binding, fails to unwind DNA as expected (Fig. 3-1g). Taken together, the data presented in Figure 3-1 clearly show that Polθ-helicase exhibits robust DNA unwinding activity that depends on hydrolysis of ATP or dATP.
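The nucleotide screen in Figure 3-1e can be summarized numerically. The following is a minimal Python sketch that uses the % unwinding values printed beneath the gel. Pairing each value with a nucleotide in left-to-right lane order is an assumption made here, and the 50% cutoff is a hypothetical threshold chosen only for illustration; neither is part of the original analysis.

```python
# % unwinding values read from Figure 3-1e (20 nM Polθ-helicase).
# Assumption: values are paired with nucleotides in left-to-right lane order.
unwinding_by_nucleotide = {
    "ATP": 92.6, "GTP": 9.3, "CTP": 6.7, "UTP": 5.8,
    "dATP": 75.0, "dGTP": 6.4, "dCTP": 6.8, "dTTP": 5.0,
}

# Hypothetical cutoff, used only to separate supporting from non-supporting cofactors.
THRESHOLD_PERCENT = 50.0

# Keep cofactors at or above the cutoff, ordered from most to least efficient.
supporting = sorted(
    (name for name, pct in unwinding_by_nucleotide.items() if pct >= THRESHOLD_PERCENT),
    key=lambda name: unwinding_by_nucleotide[name],
    reverse=True,
)
print(supporting)
```

Under these assumptions only the two adenine nucleotides pass the cutoff, with ATP ahead of dATP, in line with the description above.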

Pol Theta Helicase Preferentially Unwinds DNA with 3’ Overhangs

Although Polθ-helicase demonstrated a similar ability to unwind DNA containing a 3’ or 5’ ssDNA overhang, it would be unprecedented for such an enzyme to actively translocate along ssDNA in both directions. Thus, an alternative interpretation of the data presented in Figure 3-1c and 3-1d is that Polθ-helicase actively translocates along ssDNA with a single polarity, but is capable of initiating unwinding at blunt or 3’ recessed ends.

To further investigate the enzyme’s ATP-dependent directional movement, we assayed unwinding on 3’ and 5’ overhang substrates that contain longer DNA duplexes in order to increase the energy barrier to unwinding (Fig. 3-2a). For example, the substrates used in

Figure 3-1 include a duplex region 15 base pairs (bp) in length, whereas the substrates used in the current figure contain 23 bp of double-strand DNA. Importantly, the 23 bp duplex sequence on the 3’ and 5’ overhang substrates is identical to prevent differences in melting temperature, and thus the amount of energy required for unwinding. The results demonstrate that Polθ-helicase unwinds the 3’ overhang substrate, but not the 5’ overhang substrate, indicating a 3’-5’ polarity, similar to related HELQ/Hel308 (Fig. 3-2a)(180).

Figure 3-2. Polθ-helicase preferentially unwinds DNA with 3' overhangs. A, non-denaturing gels showing Polθ-helicase unwinding of DNA substrates containing 3' (left) and 5' (right) ssDNA overhangs. B–E, non-denaturing gels showing time courses of Polθ-helicase unwinding of the indicated DNA substrates. F, plot showing rate of Polθ-helicase unwinding of the DNA substrates from B–E. Data represent mean ± S.D., n = 3. G, non-denaturing gel showing RPA stimulation of Polθ-helicase unwinding (left). Bar chart showing % unwinding by indicated proteins (right). n = 3 ± S.D. ***, p < 0.001, Student’s unpaired t test. % unwinding is indicated. *, 32P.

The rate of unwinding by the helicase was next examined on multiple substrates to identify any preference for a particular substrate. We utilized identical conditions to assay the enzyme on pssDNA containing 3’ or 5’ overhangs, duplex DNA, and a replication fork (Fig. 3-2b–e). Here again, we employed substrates with the same double-strand DNA sequence and thus identical melting temperature. The results show that although the helicase unwinds each substrate under identical conditions, it exhibits the highest rate of unwinding on pssDNA harboring a 3’ overhang, which is consistent with 3’-5’ directional movement along ssDNA (Fig. 3-2f). We presume the enzyme unwinds the replication fork at a slower rate due to a second enzyme molecule acting on the 5’ overhang that can conceivably impede helicase translocation on the 3’ overhang. Taken together, the results presented thus far in Figure 3-2 demonstrate that Polθ-helicase preferentially unwinds DNA containing 3’ overhangs, but is also capable of unwinding double-strand DNA, DNA with 5’ overhangs, and replication forks. We note that although the enzyme can unwind blunt-ended DNA substrates, it fails to do so on longer substrates even at relatively high concentrations (Fig. A-1b: Appendix A). This suggests that multiple Polθ-helicase molecules are unable to act cooperatively to unwind long substrates, as indicated for SF1-type helicase UvrD(183). Because many helicases function with and are stimulated by the ssDNA binding protein RPA, we assessed whether RPA promotes Polθ-helicase unwinding activity in Figure 3-2g. Here, we determined the efficiency of unwinding the 23 bp duplex substrate by relatively low amounts of either Polθ-helicase, RPA, or both proteins combined. The results show that the addition of both proteins results in synergistic activity, which is indicated by a significantly higher yield of unwound DNA (Fig. 3-2g). Future studies will be required to determine whether RPA stimulation of Polθ-helicase occurs by a specific protein-protein interaction.
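The substrate comparison drawn from the time courses in Figure 3-2b–e can be illustrated with a short calculation. The sketch below is a hedged example rather than the analysis used to generate Figure 3-2f: it takes the % unwinding values printed beneath each gel, assumes the six values correspond in order to the 0, 2, 5, 10, 20 and 30 min time points, and uses the average rate over the first 5 min as a simple stand-in for an initial rate. The function name `early_rate` and the 5 min window are illustrative choices.

```python
# Time points (min) and % unwinding values read from Figure 3-2b-e.
# Assumption: the six values under each gel map onto the six time points in order.
time_min = [0, 2, 5, 10, 20, 30]
percent_unwound = {
    "B": [0, 7, 34.6, 71.2, 87, 89.6],
    "C": [0, 0.8, 18, 67.1, 85.5, 88],
    "D": [0, 0.5, 3, 12.5, 50.5, 77.5],
    "E": [0, 5.3, 15.9, 28.4, 43.3, 51.7],
}

def early_rate(times, values, window_min=5):
    """Average rate (% per min) from time zero to the chosen time point."""
    i = times.index(window_min)
    return (values[i] - values[0]) / (times[i] - times[0])

rates = {panel: early_rate(time_min, vals) for panel, vals in percent_unwound.items()}
fastest = max(rates, key=rates.get)

# Report panels from fastest to slowest early unwinding.
for panel, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"panel {panel}: {rate:.2f} % per min over the first 5 min")
```

With these inputs panel B gives the highest early rate, which matches the statement that the 3’ overhang substrate is unwound fastest if panel B is taken to be that substrate.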

Several lines of evidence indicate the involvement of RNA-DNA structures in contributing to both genome instability and DNA repair. For example, R-loops have long been associated with replicative stress and genome instability, whereas more recent work indicates that RNA-DNA hybrids can also promote DNA repair by mechanisms that remain to be elucidated(184)(185)(186)(187)(188). Considering the importance of RNA-DNA structures in DNA repair and genome instability, we proceeded to examine whether Polθ-helicase unwinds RNA-DNA duplexes with similar efficiency. Indeed, using identical substrate sequences, our results show that RNA-DNA substrates are also efficiently unwound by Polθ-helicase (Fig. 3-3a, b). Here again, the enzyme more rapidly unwinds the substrate containing a 3’ overhang (Fig. 3-3a). We note that the helicase shows substantially lower efficiency of unwinding a blunt-ended RNA-DNA duplex (Fig. 3-3c). This is consistent with inefficient unwinding of a blunt-ended DNA-DNA duplex (see Fig. 3-2d). Failure of Polθ-helicase to unwind an RNA-DNA substrate containing a 3’ RNA overhang indicates that this enzyme exclusively translocates along ssDNA (Fig. 3-3d). The helicase also fails to unwind an RNA-RNA substrate, which further demonstrates its inability to translocate along RNA (Fig. 3-3e).

Figure 3-3. Polθ-helicase unwinds RNA-DNA hybrids. A–E, non-denaturing gels showing a time course of Polθ-helicase unwinding of the indicated RNA-DNA and RNA-RNA substrates. Gray line, RNA oligonucleotide. Black line, DNA oligonucleotide. % unwinding is indicated. *, 32P.

Pol Theta Helicase Efficiently Unwinds Substrates Modeled After Stalled Replication Forks

A previous report demonstrates that mammalian Pol theta acts in response to replication stress and promotes replication fork progression or fork stability(39). For example, Pol theta was shown to form cellular foci in response to ultraviolet light and confer cellular resistance to hydroxyurea (HU) treatment(39). Furthermore, Pol theta was demonstrated to promote replication fork progression in the absence of exogenous DNA damaging agents, and cells deficient in Pol theta exhibit a prolonged S phase delay and a significant increase in stalled or collapsed forks following HU treatment(39). Thus, although Pol theta has an essential role in alt-EJ, additional lines of evidence suggest it might exhibit separate functions in response to replicative stress, such as replication fork restart(39).

We further examined Polθ-helicase activity on different types of replication forks to provide insight into its potential functions during replication. Time courses of Polθ-helicase unwinding were performed on replication forks containing leading or lagging strands, leading and lagging strands, or a fork lacking leading and lagging strands (Fig. 3-4a). The results clearly show that the helicase preferentially unwinds the fork containing the lagging strand but lacking the leading strand (Fork B; Fig. 3-4a). These data suggest a possible function for Polθ-helicase in replication fork repair. For example, following arrest of the leading strand polymerase, such as due to an encounter with a lesion, the replicative helicase is known to continue to unwind the fork, resulting in a large leading strand gap. In contrast, the lagging strand polymerase can continue to act on its respective template, generating Okazaki fragments(189). Hence, fork collapse is often modeled as Fork B, which specifically lacks a leading strand. We next investigated whether Polθ-helicase more efficiently unwinds the lagging strand, which is common among Hel308-type enzymes. Indeed, similar to HELQ/Hel308 activity, we find that Polθ-helicase preferentially unwinds the lagging strand at a replication fork, which further supports a potential role in response to replication stress like HELQ/Hel308 (Fig. 3-4b).

Figure 3-4. Polθ-helicase unwinding activity at replication forks. A, non-denaturing gel showing a time course of Polθ-helicase unwinding of the indicated DNA replication fork substrates. B, non-denaturing gel showing a time course of Polθ-helicase unwinding of the indicated DNA replication fork substrates containing leading (left) or lagging (middle) strands. % unwinding is indicated. Plot of % unwinding data from left and middle panels (right). C, denaturing gel showing Polθ-polymerase leading strand extension in the presence (lane 3) and absence (lane 2) of Polθ-helicase. % run-off product is indicated. *, 32P.

Pol Theta Helicase Promotes Strand Displacement Synthesis by Pol Theta Polymerase

A unique feature of POLQ is that it is the only known gene in multicellular organisms to encode both a helicase and a polymerase. Helicase-polymerase fusion proteins are more common in bacteria, archaea, and viruses, where they are involved in replication and repair(157). A conceivable function for Polθ-helicase unwinding activity is to facilitate strand displacement synthesis by the Polθ-polymerase domain. For example, although some polymerases exhibit proficient strand displacement activity, which enables DNA unwinding downstream of the 3’ primer terminus during replication, many polymerases such as those involved in chromosomal replication require the unwinding activity of auxiliary helicases to perform replication of double-strand DNA. We tested whether Polθ-polymerase exhibits strand displacement activity on a replication fork containing a leading strand in Figure 3-4c. The results show that the polymerase possesses limited strand displacement activity in the presence of all four dNTPs and ATP, as indicated by its inability to fully extend the leading strand primer (Fig. 3-4c, lane 2). Given that Polθ-helicase exhibits 3’-5’ polarity, we evaluated whether it promotes strand displacement activity by the polymerase domain. Indeed, addition of the helicase under identical conditions with ATP facilitates Polθ-polymerase primer extension through the downstream DNA duplex, as indicated by a 4-fold increase in run-off product (Fig. 3-4c, lane 3). Hence, these data suggest a plausible function for the helicase domain in facilitating Polθ-polymerase strand displacement synthesis during replication repair.
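The fold increase quoted above follows directly from the % run-off values printed beneath the gel in Figure 3-4c (5.7% without and 22.3% with Polθ-helicase). The short Python sketch below simply reproduces that arithmetic; the variable names are illustrative, and the lane assignments are taken from the figure legend.

```python
# % run-off product read from Figure 3-4c.
runoff_polymerase_only = 5.7   # lane 2: Polθ-polymerase alone
runoff_with_helicase = 22.3    # lane 3: Polθ-polymerase plus Polθ-helicase

# Ratio of run-off product with versus without the helicase.
fold_increase = runoff_with_helicase / runoff_polymerase_only
print(f"fold increase in run-off product: {fold_increase:.1f}")
```

The ratio is just under four, which is reported in the text as a 4-fold increase.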

Discussion

Pol theta has multiple documented activities in DNA replication and repair, including alt-EJ, replication repair, translesion synthesis, and replication initiation(168)(97)(162)(96)(118)(39)(179)(169)(170). Although the activities and cellular functions of Polθ-polymerase have been investigated, little is understood regarding the enzymatic activities of Polθ-helicase. For example, although studies have shown that the helicase exhibits ATPase activity that is stimulated by ssDNA, multiple reports failed to detect a DNA unwinding function, which is common among DNA helicases sharing sequence homology to Pol theta, such as HELQ/Hel308 and RecQ-type helicases(39)(190)(165)(180)(191)(121)(192)(193). In this report, we demonstrate that Polθ-helicase exhibits robust DNA unwinding activity with 3’-5’ directionality. The helicase preferentially unwinds DNA substrates containing 3’ ssDNA overhangs, but additionally unwinds substrates with 5’ overhangs, blunt-ended DNA, RNA-DNA hybrids and replication forks. Because Polθ-helicase also performs ssDNA annealing(59), this function counters its unwinding, which likely explains why Pol theta unwinding has been difficult to detect in previous studies(190)(165).

Several lines of evidence have supported a role for Pol theta in replication fork repair. For example, a recent report demonstrated that suppression of Pol theta significantly slows the velocity of replication forks even in the absence of exogenous DNA damaging agents(39). This report also shows that knockdown of Pol theta expression impairs fork progression and halts cells in S-phase following HU treatment(39). These data indicate that Pol theta promotes either replication fork stability or replication fork repair. Considering that several SF2 helicases, such as HELQ/Hel308 and RecQ subclasses, are involved in replication fork repair, it is not unreasonable to assume a similar function for Polθ-helicase, which is closest in relation to HELQ/Hel308(180)(191)(194)(195)(196). For example, prior studies showed that mammalian HELQ/Hel308 is recruited to stalled replication forks and is involved in repairing interstrand crosslinks which arrest replication forks(191)(195)(196). Similar to HELQ/Hel308, Polθ-helicase unwinds DNA with 3’-5’ polarity and exhibits a preference for unwinding substrates modeled after collapsed replication forks, such as those lacking a leading strand(191)(193). We also find that Polθ-helicase is unable to unwind long substrates and thus exhibits non-processive unwinding activity like HELQ/Hel308(180)(193). Hence, our studies confirm similar biochemical activities between Polθ-helicase and HELQ/Hel308, which suggests these enzymes perform similar replication repair functions.

Structural and sequence comparisons between Polθ-helicase and Hel308-type enzymes provide further evidence for shared mechanisms of helicase activity (Fig. 3-5). For example, superposition of Polθ-helicase and the co-crystal structure of Hel308 in complex with partially unwound DNA reveals a similar orientation of the β-hairpin motif, previously shown to act as a wedge to facilitate duplex unwinding by Hel308 (Fig. 3-5b)(152). Although the sequence of this motif is not closely conserved between Polθ-helicase and HELQ/Hel308-type enzymes (Fig. 3-5a), superposition of Polθ-helicase and Hel308 suggests the slightly smaller β-hairpin in Pol theta facilitates DNA duplex separation by a similar mechanism (Fig. 3-5b). Another interesting structural similarity between these enzymes is the previously reported auto-inhibitory helix-loop-helix domain 5, which contains a highly conserved Arg-Ala-Arg (RAR) motif (Fig. 3-5c)(193)(152). For instance, a prior report demonstrated that domain 5 within Hel308 suppresses its unwinding activity(193). Specifically, deletion of this domain or mutation of a conserved arginine (R662) in this region, which was shown to interact with extruded ssDNA in the co-crystal structure of Hel308 in complex with partially unwound DNA, resulted in a dramatic increase in helicase activity(193)(152). These types of helicase autoinhibitory domains found in both SF1 and SF2 members may be modulated by interacting proteins or specific nucleic acid structures(193). Thus, Polθ-helicase activity may be substantially stimulated by protein or DNA interactions that change the orientation of the autoinhibitory domain. We speculate that the structurally and sequence conserved domain 5 in Polθ-helicase exhibits an autoinhibitory mechanism like Hel308 (Fig. 3-5c).

Fig. 3-5A, aligned sequences (name, first residue, sequence, last residue):

HsPolq 414 AFHHAGLTFEERDIIEGAFRQGLIRVLAATSTLSSGVNLPARRVIIRTPI----FGGRPLDILTYKQMVGRAGRKGVDTVGESILICK 497
DmMus308 545 AFHHAGLTTEERDIIEASFKAGALKVLVATSTLSSGVNLPARRVLIRSPL----FGGKQMSSLTYRQMIGRAGRMGKDTLGESILICN 628
HsHel308 641 AYHHSGLTSDERKLLEEAYSTGVLCLFTCTSTLAAGVNLPARRVILRAPY----VAKEFLKRNQYKQMIGRAGRAGIDTIGESILILQ 724
AfHel308 301 AFHHAGLLNGQRRVVEDAFRRGNIKVVVATPTLAAGVNLPARRVIVRSLYRFDGYS-KRIKVSEYKQMAGRAGRPGMDERGEAIIIVG 387
Pa0592 298 AFHHAGLGRDERVLVEDNFRKGLIKVVVATPTLSAGINTPAFRVIIRDTWRYSEFGMERIPVLEVQQMMGRAGRPRYDEVGEAIIVST 385
yMTR4 473 GIHHSGLLPILKEVIEILFQEGFLKVLFATETFSIGLNMPAKTVVFTSVRKWDGQQFRWVSGGEYIQMSGRAGRRGLDDRGIVIMMID 560
hMtr4 453 GIHHGGLLPILKETIEILFSEGLIKALFATETFAMGINMPARTVLFTNARKFDGKDFRWISSGEYIQMSGRAGRRGMDDRGIVILMVD 540
ySKI2 693 AVHHGGLLPIVKELIEILFSKGFIKVLFATETFAMGLNLPTRTVIFSSIRKHDGNGLRELTPGEFTQMAGRAGRRGLDSTGTVIVMAY 780
hSKI2 633 GVHHSGILPILKEIVEMLFSRGLVKVLFATETFAMGVNMPARTVVFDSMRKHDGSTFRDLLPGEYVQMAGRAGRRGLDPTGTVILLCK 720

Figure 3-5. Sequence and structural comparison of Polθ-helicase and HELQ/Hel308 enzymes. A, sequence alignment of motifs IVa–VI of Polθ/Hel308/Ski2-related SF2 helicases. β-hairpin, motifs IV–VI, and secondary structures are indicated. *, identical residues; colon (:), residues sharing very similar properties; period (.), residues sharing some properties; red, small and hydrophobic; blue, acidic; magenta, basic; green, hydroxyl, sulfhydryl, amine. Sequences were aligned using Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/) default settings. B, superposition of Polθ-helicase (green) (PDB ID: 5AGA) and Hel308 in complex with DNA (blue) (PDB ID: 2P6R) highlighting the β-hairpin motif. C, superposition of Polθ-helicase (green) (PDB ID: 5AGA) and Hel308 (blue) (PDB ID: 2VA8) highlighting the conserved Arg-Ala-Arg (RAR) motif in domain 5. Conserved arginines are represented as sticks.

Despite the similar unwinding activities between Pol theta and HELQ/Hel308, DNA unwinding is countered by the annealing function of Polθ-helicase. For instance, detection of DNA unwinding by Polθ-helicase requires masking the opposing annealing activity by addition of excess ssDNA trap. In contrast, HELQ/Hel308 has been shown to unwind DNA in the absence of a ssDNA trap and therefore likely does not exhibit strong annealing activity like Polθ-helicase(191). Interestingly, other SF2 helicases such as those from the RecQ subclass exhibit ssDNA annealing; however, in many cases this activity is suppressed by ATP(181). In contrast, the respective annealing activities of Pol theta and RECQL5 helicases are not suppressed by ATP, and these enzymes share ~18% sequence homology(95)(59). RECQL5 also unwinds DNA with low processivity like Polθ-helicase(197). Despite these similarities, we were unable to detect strand exchange activity by Polθ-helicase, which RECQL5 has been shown to exhibit (Fig. A-1c: Appendix A). Another similar function between Polθ-helicase and RECQL5 is their ability to interact with RAD51 and counteract its activity. For example, both Polθ-helicase and RECQL5 promote dissociation of RAD51-mediated D-loops in vitro, and these enzymes suppress homologous recombination in cells(39)(198). Taken together, Polθ-helicase shares similar characteristics with RECQL5 and HELQ/Hel308.

Because Pol theta is known to promote the proliferation of BRCA-deficient cancer cells and is considered a promising oncology drug target, it will be important to determine whether the unwinding function of the helicase domain contributes to cancer cell survival(168)(39). For example, although the helicase domain was recently shown to promote alt-EJ via annealing and counteracting RPA, its unwinding activity may also contribute to this pathway(59). In particular, Polθ-helicase unwinding may enable microhomology annealing or strand displacement synthesis by Polθ-polymerase during alt-EJ (Fig. 3-6a). Since Polθ-polymerase exhibits poor strand displacement synthesis, DNA unwinding ahead of the polymerase would be a plausible function for the helicase during alt-EJ (Fig. 3-6a). Considering that RNA-DNA hybrids have recently been shown to form at DNA breaks, Polθ-helicase dissociation of these structures may also contribute to DSB repair(188).

Importantly, it remains unclear whether alt-EJ independent roles for Polθ-helicase exist and enable the proliferation of BRCA-deficient cells. For example, although a previous report suggested Polθ-helicase suppresses HR via its RAD51 interaction motif(39), a more recent study was unable to confirm this mechanism in cells(59). Because

Polθ-helicase exhibits robust unwinding of replication forks, it can conceivably play a compensatory role in BRCA-deficient cells during replication repair. For instance, lagging strand unwinding can contribute to fork reversal and replication restart (Fig. 3-6b, left).


Alternatively, the helicase can potentially promote replication by facilitating strand displacement synthesis by the polymerase domain (Fig. 3-6b, right). This activity could conceivably aid in replication restart by extending nascent primers. Future studies will be required to further characterize the molecular basis of Polθ-helicase unwinding and determine whether this activity contributes to the many functions of Pol theta in replication and repair.


CHAPTER 4

PURIFICATION OF FULL-LENGTH HUMAN DNA POLYMERASE THETA AND VARIANTS

Introduction

Full-length human DNA polymerase theta (Pol theta) has a molecular weight of 290 kDa and includes a helicase, polymerase and disordered central domain as described above. Pol theta is one of 17 DNA polymerases in human cells and its C-terminal polymerase domain is classified as an A-family polymerase based on its amino-acid sequence(123).

The first report of full-length human Pol theta purification was from the Wood lab in 2003(130). In this report, recombinant human Pol theta was purified from insect cells using a baculovirus system. However, only very small quantities of the full-length protein were isolated, which are insufficient for most biochemical studies. In recent studies, similarly large (i.e. 371 kDa Pol epsilon holoenzyme) recombinant Pol proteins were expressed and purified from yeast in relatively large quantities(199)(200)(201)(202). We therefore utilized the yeast S. cerevisiae to express and purify relatively large quantities of full-length and mutant versions of recombinant human Pol theta. To our knowledge, these are the first studies to express and purify recombinant full-length human Pol theta from yeast. These methods will enable biochemical and structural characterization of this unique and large polymerase-helicase fusion protein that is essential for the proliferation of BRCA-deficient cancer cells(39). These methods will also aid in identifying and developing drug-like inhibitors of human Pol theta as cancer therapeutics.

Expression of Full-Length Pol Theta in Yeast

Expression Vector

Our yeast expression system utilizes the well characterized galactose-inducible GAL 1-10 promoter(203). The GAL 1-10 promoter is a strong bidirectional promoter for yeast expression and has been widely used(204)(205)(206). We generated a 3xFlag-Polθ yeast expression plasmid from a 2-µm-based vector that carries the TRP auxotrophic marker using standard molecular cloning techniques (further described below). 2-µm is a eukaryotic origin of replication which enables cell-cycle dependent replication of the plasmid as an episomal (non-chromosomal) entity(207)(208).

The TRP auxotrophic marker located in the plasmid enables selection of yeast cells containing the plasmid as a result of their ability to proliferate in media lacking tryptophan (TRP)(209). For example, yeast cells harboring the 3xFlag-Polθ expression plasmid following cellular transformation are able to maintain growth in media lacking tryptophan(209). We generated the following plasmids for yeast expression of wild-type (WT) human Pol theta and mutant versions of the protein, which are described below: 3xFlag-Polθ, 3xFlag-PolθK121M, 3xFlag-Polθ∆cen and 3xFlag-GFP-Polθ (Figure 4-1).

Figure 4-1. Maps of yeast expression plasmids. a, map of 3xFlag-Polθ. b, map of 3xFlag-Polθ∆cen. c, map of 3xFlag-PolθK121M. d, map of 3xFlag-GFP-Polθ. Plasmid elements are shown using SnapGene software.

Cloning

A yeast codon optimized human POLQ gene DNA cassette was synthesized and inserted into plasmid pRS424-GAL_J-TRP1(210) using standard molecular cloning techniques (Fig. 4-2). pRS424 contains the 2-µm yeast replication origin, an ampicillin resistance gene (AmpR) under its own promoter, a high copy number bacterial replication origin (pBR322), the bidirectional GAL 1-10 promoter with two multiple cloning sites, and the tryptophan (TRP1) gene under control of the TRP1 promoter (Fig. 4-1)(211). The codon optimized WT full-length POLQ synthetic gene or mutant variations of this gene were cloned into the EcoRI and XhoI sites immediately downstream from the bidirectional GAL 1-10 promoter (Fig. 4-2, Appendix B). This insert site enables galactose-inducible expression from the GAL 1-10 promoter. Specifically, a 7,845 bp synthetic DNA fragment (Figure 4-2) that encodes a 3xFlag tag and adjacent in-frame yeast codon optimized full-length wild-type Pol theta, with flanking EcoRI and XhoI sites, was ligated into the cloning site of the bidirectional GAL 1-10 promoter (Figure 4-1a). The plasmid carrying WT full-length human POLQ was named 3xFlag-Polθ.
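The in-frame arrangement of the 3xFlag tag and the Pol theta coding sequence can be checked by translating the start of the cassette shown in Figure 4-2. The following Python sketch is illustrative only: it translates the first 90 nucleotides of the printed sequence with the standard genetic code and locates the first Pol theta residues (MNLLRR). The final length calculation assumes a 2,590-residue Pol theta (as in Fig. 3-1a), a single stop codon, and that the stated fragment length excludes the flanking restriction sites; none of these assumptions is stated in the text.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(dna):
    """Translate a DNA coding sequence in frame 1 using the standard genetic code."""
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

# First 90 nt of the 3xFlag-Polθ cassette as printed in Figure 4-2.
first_90_nt = (
    "ATGTCTGACTACAAGGACCATGACGGTGACTACAAGGACCACGACATAGACTACAAAGAC"
    "GATGATGATAAGATGAATTTGTTGAGAAGA"
)

protein = translate(first_90_nt)
tag_length = protein.index("MNLLRR")  # residues preceding the Pol theta start
print(protein)
print("residues before Pol theta start:", tag_length)

# Assumptions: 2,590-residue Pol theta, one stop codon, flanking sites not counted.
POL_THETA_RESIDUES = 2590
expected_bp = (tag_length + POL_THETA_RESIDUES + 1) * 3
print("expected coding length (bp):", expected_bp)
```

Under these assumptions the expected coding length equals the 7,845 bp quoted for the synthetic fragment, which is consistent with the tag and Pol theta being read as a single open reading frame.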

To perform structure-function analysis of Polθ and elucidate particular domain functions, the following mutant constructs were generated. Lysine 121 within the original 3xFlag-Polθ yeast expression plasmid was changed to a methionine (Met) using standard site-directed mutagenesis protocols in order to express and purify 3xFlag-PolθK121M, which is known to be defective in ATP binding and ATPase activity (Figure 4-1c)(98)(97). In order to probe the function of the central domain, the DNA fragment corresponding to amino acids 895-1791 was removed from the original 3xFlag-Polθ yeast expression


Yeast Codon Optimized Sequence of 3xFlag-Polθ

1 M S D Y K D H D G D Y K D H D I D Y K D D D D K M N L L R R 1 ATGTCTGACTACAAGGACCATGACGGTGACTACAAGGACCACGACATAGACTACAAAGACGATGATGATAAGATGAATTTGTTGAGAAGA 31 S G K R R R S E S G S D S F S G S G G D S S A S P Q F L S G 91 TCTGGTAAAAGAAGAAGATCAGAATCTGGTTCAGATTCTTTTTCAGGTTCTGGTGGTGACTCTTCTGCTTCTCCACAATTTTTGTCAGGT 61 S V L S P P P G L G R C L K A A A A G E C K P T V P D Y E R 181 TCTGTTTTATCTCCACCACCAGGTTTGGGTAGATGTTTAAAAGCTGCAGCTGCTGGTGAATGTAAACCAACTGTTCCAGATTACGAAAGA 91 D K L L L A N W G L P K A V L E K Y H S F G V K K M F E W Q 271 GATAAGTTGTTGTTGGCAAATTGGGGTTTGCCAAAGGCTGTTTTAGAAAAGTACCATTCTTTCGGTGTTAAGAAAATGTTCGAATGGCAA 121 A E C L L L G Q V L E G K N L V Y S A P T S A G K T L V A E 361 GCTGAATGTTTATTGTTAGGTCAAGTTTTGGAAGGTAAAAATTTGGTTTACTCTGCACCAACTTCAGCTGGTAAAACATTAGTTGCTGAA 151 L L I L K R V L E M R K K A L F I L P F V S V A K E K K Y Y 451 TTGTTGATCTTGAAGAGAGTTTTAGAAATGAGAAAGAAAGCTTTGTTTATTTTACCATTCGTTTCTGTTGCTAAGGAAAAGAAATACTAC 181 L Q S L F Q E V G I K V D G Y M G S T S P S R H F S S L D I 541 TTGCAATCATTATTTCAAGAAGTTGGTATTAAAGTTGATGGTTACATGGGTTCAACTTCTCCATCAAGACATTTTTCTTCATTGGATATC 211 A V C T I E R A N G L I N R L I E E N K M D L L G M V V V D 631 GCAGTTTGTACAATTGAAAGAGCTAATGGTTTGATTAATAGATTGATCGAAGAAAATAAGATGGATTTGTTGGGTATGGTCGTTGTTGAT 241 E L H M L G D S H R G Y L L E L L L T K I C Y I T R K S A S 721 GAATTGCATATGTTGGGTGACTCTCATAGAGGTTATTTGTTAGAATTGTTGTTGACTAAGATTTGTTACATTACAAGAAAATCTGCTTCA 271 C Q A D L A S S L S N A V Q I V G M S A T L P N L E L V A S 811 TGTCAAGCAGATTTGGCTTCTTCATTGTCTAACGCAGTTCAAATCGTTGGCATGTCAGCTACTTTACCAAATTTGGAATTAGTTGCATCT 301 W L N A E L Y H T D F R P V P L L E S V K V G N S I Y D S S 901 TGGTTGAACGCTGAATTGTACCATACAGATTTCAGACCAGTTCCATTGTTGGAATCTGTTAAGGTTGGTAATTCAATCTATGATTCTTCA 331 M K L V R E F E P M L Q V K G D E D H V V S L C Y E T I C D 991 ATGAAGTTGGTTAGAGAATTTGAACCAATGTTGCAAGTTAAAGGTGACGAAGATCATGTTGTTTCTTTATGTTACGAAACTATCTGTGAT 361 N H S V L L F C P S K K W C E K L A D I I A R E F Y N L H H 1081 
AATCATTCTGTTTTATTGTTTTGTCCATCTAAGAAATGGTGTGAAAAGTTGGCAGATATCATCGCTAGAGAATTTTACAATTTGCATCAT 391 Q A E G L V K P S E C P P V I L E Q K E L L E V M D Q L R R 1171 CAAGCTGAAGGTTTAGTTAAACCATCTGAATGTCCACCAGTTATTTTAGAACAAAAGGAATTGTTGGAAGTTATGGACCAATTGAGAAGA 421 L P S G L D S V L Q K T V P W G V A F H H A G L T F E E R D 1261 TTACCATCTGGTTTGGATTCAGTTTTACAAAAGACTGTTCCATGGGGTGTTGCATTTCATCATGCTGGTTTGACATTTGAAGAAAGAGAT 451 I I E G A F R Q G L I R V L A A T S T L S S G V N L P A R R 1351 ATCATCGAAGGTGCTTTTAGACAAGGTTTGATTAGAGTTTTAGCTGCAACTTCTACATTGTCTTCAGGTGTTAATTTGCCAGCTAGAAGA 481 V I I R T P I F G G R P L D I L T Y K Q M V G R A G R K G V 1441 GTTATTATCAGAACTCCAATCTTCGGTGGTAGACCATTGGATATCTTGACATACAAGCAAATGGTTGGTAGAGCTGGTAGAAAAGGTGTT 511 D T V G E S I L I C K N S E K S K G I A L L Q G S L K P V R 1531 GATACTGTTGGTGAATCTATCTTGATCTGTAAAAATTCTGAAAAATCAAAGGGTATCGCTTTATTGCAAGGTTCTTTGAAACCAGTTAGA 541 S C L Q R R E G E E V T G S M I R A I L E I I V G G V A S T 1621 TCATGTTTACAAAGAAGAGAAGGTGAAGAAGTTACTGGTTCAATGATCAGAGCAATCTTGGAAATCATCGTTGGTGGTGTTGCTTCTACA 571 S Q D M H T Y A A C T F L A A S M K E G K Q G I Q R N Q E S 1711 TCACAAGATATGCATACTTATGCTGCATGTACATTTTTAGCTGCATCTATGAAGGAAGGTAAACAAGGTATTCAAAGAAACCAAGAATCA 601 V Q L G A I E A C V M W L L E N E F I Q S T E A S D G T E G 1801 GTTCAATTGGGTGCAATTGAAGCTTGTGTTATGTGGTTGTTGGAAAACGAGTTTATTCAATCTACTGAAGCTTCAGATGGTACAGAGGGT 631 K V Y H P T H L G S A T L S S S L S P A D T L D I F A D L Q 1891 AAAGTTTACCATCCAACTCATTTGGGTTCTGCAACATTGTCTTCATCTTTATCACCAGCTGATACATTGGATATCTTCGCAGATTTGCAA 661 R A M K G F V L E N D L H I L Y L V T P M F E D W T T I D W 1981 AGAGCTATGAAGGGTTTCGTTTTGGAAAACGATTTGCATATCTTGTATTTGGTTACTCCAATGTTCGAAGATTGGACTACAATCGATTGG 691 Y R F F C L W E K L P T S M K R V A E L V G V E E G F L A R 2071 TACAGATTTTTCTGTTTGTGGGAAAAGTTGCCAACATCTATGAAGAGAGTTGCTGAATTGGTTGGTGTTGAAGAAGGTTTCTTGGCAAGA 721 C V K G K V V A R T E R Q H R Q M A I H K R F F T S L V L L 2161 
TGTGTTAAGGGTAAAGTTGTTGCTAGAACTGAAAGACAACATAGACAAATGGCTATTCATAAGAGATTTTTCACATCTTTGGTTTTGTTG 751 D L I S E V P L R E I N Q K Y G C N R G Q I Q S L Q Q S A A 2251 GATTTGATCTCAGAAGTTCCATTGAGAGAAATTAATCAAAAGTATGGTTGTAATAGAGGTCAAATTCAATCTTTGCAACAATCAGCTGCA 781 V Y A G M I T V F S N R L G W H N M E L L L S Q F Q K R L T 2341 GTTTACGCTGGTATGATCACTGTTTTCTCTAACAGATTGGGTTGGCATAACATGGAATTGTTGTTGTCACAATTCCAAAAGAGATTGACA 811 F G I Q R E L C D L V R V S L L N A Q R A R V L Y A S G F H 2431 TTCGGTATTCAAAGAGAATTGTGTGATTTGGTTAGAGTTTCTTTGTTGAACGCACAAAGAGCTAGAGTTTTGTATGCATCAGGTTTTCAT 841 T V A D L A R A N I V E V E V I L K N A V P F K S A R K A V 2521 ACTGTTGCTGATTTGGCAAGAGCTAACATCGTTGAAGTTGAAGTTATTTTGAAAAATGCAGTTCCTTTTAAATCTGCAAGAAAAGCTGTT 871 D E E E E A V E E R R N M R T I W V T G R K G L T E R E A A 2611 GATGAAGAAGAAGAAGCTGTTGAAGAAAGAAGAAACATGAGAACTATCTGGGTTACAGGTAGAAAAGGTTTGACAGAAAGAGAAGCTGCA


901 A L I V E E A R M I L Q Q D L V E M G V Q W N P C A L L H S 2701 GCTTTGATCGTTGAAGAAGCAAGAATGATCTTGCAACAAGATTTGGTTGAAATGGGTGTTCAATGGAATCCATGTGCTTTGTTACATTCA 931 S T C S L T H S E S E V K E H T F I S Q T K S S Y K K L T S 2791 TCTACTTGTTCTTTGACACATTCTGAATCAGAAGTCAAGGAACATACTTTTATTTCACAAACAAAATCATCTTACAAGAAATTGACTTCT 961 K N K S N T I F S D S Y I K H S P N I V Q D L N K S R E H T 2881 AAAAATAAGTCAAATACAATTTTCTCTGATTCATACATCAAGCATTCTCCAAACATTGTTCAAGATTTGAATAAGTCAAGAGAACATACT 991 S S F N C N F Q N G N Q E H Q T C S I F R A R K R A S L D I 2971 TCATCTTTTAATTGTAATTTTCAAAATGGTAATCAAGAACATCAAACTTGTTCCATTTTTAGAGCCAGAAAGAGAGCATCCTTAGATATT 1021 N K E K P G A S Q N E G K T S D K K V V Q T F S Q K T K K A 3061 AACAAAGAAAAGCCAGGTGCATCCCAAAACGAAGGTAAAACATCCGATAAAAAGGTTGTCCAAACATTCTCCCAAAAGACCAAAAAGGCT 1051 P L N F N S E K M S R S F R S W K R R K H L K R S R D S S P 3151 CCATTAAATTTCAATTCTGAAAAGATGTCTAGATCTTTTAGATCATGGAAGAGAAGAAAGCATTTGAAGAGATCTAGAGATTCTTCACCA 1081 L K D S G A C R I H L Q G Q T L S N P S L C E D P F T L D E 3241 TTAAAGGATTCAGGTGCATGTAGAATACATTTGCAAGGTCAAACTTTGTCTAATCCATCATTATGTGAAGATCCTTTTACTTTGGATGAA 1111 K K T E F R N S G P F A K N V S L S G K E K D N K T S F P L 3331 AAGAAAACTGAGTTTAGAAATTCTGGTCCATTCGCTAAAAATGTTTCTTTGTCTGGTAAAGAAAAAGATAATAAGACTTCTTTTCCATTG 1141 Q I K Q N C S W N I T L T N D N F V E H I V T G S Q S K N V 3421 CAAATTAAACAAAATTGTTCATGGAACATCACTTTGACAAACGATAACTTCGTTGAACATATCGTTACAGGTTCTCAATCTAAAAATGTT 1171 T C Q A T S V V S E K G R G V A V E A E K I N E V L I Q N G 3511 ACATGTCAAGCTACTTCTGTTGTTTCAGAAAAAGGTAGAGGTGTTGCTGTTGAAGCAGAAAAGATTAATGAAGTTTTGATTCAAAACGGT 1201 S K N Q N V Y M K H H D I H P I N Q Y L R K Q S H E Q T S T 3601 TCTAAAAATCAAAACGTTTACATGAAACATCATGATATTCATCCAATTAATCAATACTTAAGAAAGCAATCTCATGAACAAACTTCAACA 1231 I T K Q K N I I E R Q M P C E A V S S Y I N R D S N V T I N 3691 ATCACTAAGCAAAAGAATATCATCGAAAGACAAATGCCATGTGAAGCAGTTTCTTCATACATCAACAGAGATTCTAACGTTACTATTAAT 1261 C E R I K L N T E E N K P S H F Q A L G D D I S R T V I P S 3781 
TGTGAAAGAATTAAATTGAACACAGAAGAAAATAAGCCATCTCATTTCCAAGCTTTAGGTGACGATATTTCAAGAACTGTTATTCCATCT 1291 E V L P S A G A F S K S E G Q H E N F L N I S R L Q E K T G 3871 GAAGTTTTGCCATCAGCTGGTGCATTTTCTAAATCAGAAGGTCAACATGAAAATTTCTTGAACATTTCTAGATTGCAAGAAAAGACTGGT 1321 T Y T T N K T K N N H V S D L G L V L C D F E D S F Y L D T 3961 ACTTACACTACAAATAAGACTAAAAATAACCATGTTTCAGATTTGGGTTTGGTTTTGTGTGATTTCGAAGATTCTTTCTACTTAGATACA 1351 Q S E K I I Q Q M A T E N A K L G A K D T N L A A G I M Q K 4051 CAATCAGAAAAGATTATCCAACAAATGGCTACTGAAAATGCAAAATTGGGTGCTAAAGATACAAATTTGGCTGCAGGTATCATGCAAAAA 1381 S L V Q Q N S M N S F Q K E C H I P F P A E Q H P L G A T K 4141 TCTTTGGTTCAACAAAATTCTATGAACTCTTTCCAAAAGGAATGTCATATCCCATTCCCAGCAGAACAACATCCATTAGGTGCTACTAAG 1411 I D H L D L K T V G T M K Q S S D S H G V D I L T P E S P I 4231 ATCGATCATTTGGATTTGAAGACAGTTGGTACTATGAAGCAATCTTCAGATTCTCATGGTGTTGATATTTTGACACCAGAATCTCCAATC 1441 F H S P I L L E E N G L F L K K N E V S V T D S Q L N S F L 4321 TTCCATTCACCAATCTTGTTAGAAGAAAACGGTTTGTTTTTGAAGAAAAATGAAGTTTCTGTTACTGATTCACAATTAAATTCATTTTTG 1471 Q G Y Q T Q E T V K P V I L L I P Q K R T P T G V E G E C L 4411 CAAGGTTATCAAACACAAGAAACTGTTAAGCCAGTTATTTTGTTGATCCCACAAAAGAGAACACCAACTGGTGTTGAGGGTGAATGTTTG 1501 P V P E T S L N M S D S L L F D S F S D D Y L V K E Q L P D 4501 CCAGTTCCAGAAACTTCTTTGAACATGTCTGATTCATTGTTATTTGATTCTTTTTCAGATGATTACTTGGTTAAGGAACAATTGCCAGAT 1531 M Q M K E P L P S E V T S N H F S D S L C L Q E D L I K K S 4591 ATGCAAATGAAAGAACCATTGCCATCAGAAGTTACATCTAACCATTTCTCTGATTCATTGTGTTTACAAGAAGATTTGATTAAGAAATCT 1561 N V N E N Q D T H Q Q L T C S N D E S I I F S E M D S V Q M 4681 AACGTTAACGAAAATCAAGATACACATCAACAATTGACTTGTTCAAACGATGAATCTATCATTTTCTCTGAAATGGATTCTGTTCAAATG 1591 V E A L D N V D I F P V Q E K N H T V V S P R A L E L S D P 4771 GTTGAAGCATTGGATAACGTTGATATCTTCCCAGTTCAAGAAAAGAATCATACAGTTGTTTCTCCAAGAGCTTTGGAATTATCAGATCCA 1621 V L D E H H Q G D Q D G G D Q D E R A E K S K L T G T R Q N 4861 
GTTTTGGATGAACATCATCAAGGTGACCAAGATGGTGGTGACCAAGATGAAAGAGCAGAAAAATCTAAGTTGACAGGTACTAGACAAAAC 1651 H S F I W S G A S F D L S P G L Q R I L D K V S S P L E N E 4951 CATTCTTTTATTTGGTCTGGTGCTTCATTTGATTTGTCTCCAGGTTTACAAAGAATTTTGGATAAAGTTTCTTCACCATTGGAAAATGAA 1681 K L K S M T I N F S S L N R K N T E L N E E Q E V I S N L E 5041 AAGTTAAAATCAATGACTATTAATTTTTCTTCATTGAATAGAAAGAATACAGAATTAAATGAAGAACAAGAAGTTATTTCTAATTTGGAA 1711 T K Q V Q G I S F S S N N E V K S K I E M L E N N A N H D E 5131 ACTAAGCAAGTTCAAGGTATTTCTTTTTCTTCAAACAACGAAGTTAAATCAAAGATCGAAATGTTAGAAAATAATGCAAATCATGATGAA 1741 T S S L L P R K E S N I V D D N G L I P P T P I P T S A S K 5221 ACATCTTCATTGTTGCCAAGAAAGGAATCTAACATCGTTGATGATAATGGTTTGATTCCACCAACACCAATTCCAACTTCTGCTTCAAAG 1771 L T F P G I L E T P V N P W K T N N V L Q P G E S Y L F G S 5311 TTGACATTCCCAGGTATTTTAGAAACTCCAGTTAACCCATGGAAGACAAACAACGTTTTGCAACCTGGTGAATCTTATTTGTTCGGTTCT


1801 P S D I K N H D L S P G S R N G F K D N S P I S D T S F S L 5401 CCATCAGATATTAAGAATCATGATTTGTCTCCAGGTTCAAGAAACGGTTTTAAAGATAACTCTCCAATCTCAGATACTTCTTTTTCATTG 1831 Q L S Q D G L Q L T P A S S S S E S L S I I D V A S D Q N L 5491 CAATTGTCTCAAGATGGTTTGCAATTAACACCAGCATCTTCATCTTCAGAATCTTTGTCAATCATCGATGTTGCTTCTGATCAAAATTTG 1861 F Q T F I K E W R C K K R F S I S L A C E K I R S L T S S K 5581 TTCCAAACTTTTATTAAGGAATGGAGATGTAAGAAAAGATTTTCTATCTCATTGGCTTGTGAAAAGATTAGATCTTTAACATCTTCAAAG 1891 T A T I G S R F K Q A S S P Q E I P I R D D G F P I K G C D 5671 ACAGCAACTATTGGTTCTAGATTCAAACAAGCTTCTTCACCACAAGAAATTCCAATTAGAGATGATGGTTTCCCAATTAAAGGTTGTGAT 1921 D T L V V G L A V C W G G R D A Y Y F S L Q K E Q K H S E I 5761 GATACTTTGGTTGTTGGTTTGGCAGTTTGTTGGGGTGGTAGAGATGCTTACTACTTCTCTTTGCAAAAGGAACAAAAGCATTCAGAAATC 1951 S A S L V P P S L D P S L T L K D R M W Y L Q S C L R K E S 5851 TCTGCTTCATTGGTTCCACCATCATTGGATCCATCTTTGACATTGAAGGATAGAATGTGGTATTTGCAATCTTGTTTGAGAAAGGAATCT 1981 D K E C S V V I Y D F I Q S Y K I L L L S C G I S L E Q S Y 5941 GATAAGGAATGTTCAGTTGTTATATATGATTTCATCCAATCATACAAGATTTTGTTATTGTCATGTGGTATTTCTTTGGAACAATCATAC 2011 E D P K V A C W L L D P D S Q E P T L H S I V T S F L P H E 6031 GAAGATCCAAAAGTTGCTTGTTGGTTATTGGATCCAGATTCTCAAGAACCAACATTGCATTCTATCGTTACTTCATTTTTACCACATGAA 2041 L P L L E G M E T S Q G I Q S L G L N A G S E H S G R Y R A 6121 TTGCCATTGTTGGAAGGCATGGAAACTTCTCAAGGTATTCAATCATTGGGTTTAAATGCAGGTTCTGAACATTCAGGTAGATACAGAGCT 2071 S V E S I L I F N S M N Q L N S L L Q K E N L Q D V F R K V 6211 TCTGTTGAATCAATCTTGATTTTCAATTCTATGAATCAATTAAATTCATTGTTGCAAAAGGAAAATTTGCAAGATGTTTTTAGAAAGGTT 2101 E M P S Q Y C L A L L E L N G I G F S T A E C E S Q K H I M 6301 GAAATGCCATCTCAATACTGTTTGGCATTATTGGAATTGAACGGTATCGGTTTTTCTACAGCTGAATGTGAATCACAAAAGCATATCATG 2131 Q A K L D A I E T Q A Y Q L A G H S F S F T S S D D I A E V 6391 CAAGCAAAGTTGGATGCTATTGAAACTCAAGCATACCAATTGGCTGGTCACTCTTTTTCTTTTACTTCTTCTGATGATATCGCTGAAGTT 2161 L F L E L K L P P N R E M K N Q G S K K T L G S T R R G I D 
6481 TTGTTTTTAGAATTGAAGTTACCACCAAACAGAGAAATGAAGAACCAAGGTTCTAAGAAAACTTTGGGTTCAACTAGAAGAGGTATCGAT 2191 N G R K L R L G R Q F S T S K D V L N K L K A L H P L P G L 6571 AACGGTAGAAAGTTGAGATTGGGTAGACAATTTTCTACTTCAAAGGATGTTTTGAATAAGTTGAAGGCATTGCATCCATTACCAGGTTTG 2221 I L E W R R I T N A I T K V V F P L Q R E K C L N P F L G M 6661 ATCTTAGAATGGAGAAGAATCACAAACGCTATCACTAAGGTTGTTTTCCCATTGCAAAGAGAAAAGTGTTTGAACCCATTTTTGGGTATG 2251 E R I Y P V S Q S H T A T G R I T F T E P N I Q N V P R D F 6751 GAAAGAATCTATCCAGTTTCTCAATCACATACAGCAACTGGTAGAATCACTTTTACTGAACCAAACATCCAAAACGTTCCAAGAGATTTC 2281 E I K M P T L V G E S P P S Q A V G K G L L P M G R G K Y K 6841 GAAATTAAAATGCCAACATTGGTTGGTGAATCTCCACCATCACAAGCTGTTGGTAAAGGTTTATTGCCAATGGGTAGAGGTAAATACAAG 2311 K G F S V N P R C Q A Q M E E R A A D R G M P F S I S M R H 6931 AAAGGTTTTTCTGTTAATCCAAGATGTCAAGCACAAATGGAAGAAAGAGCTGCAGATAGAGGTATGCCATTTTCTATCTCAATGAGACAT 2341 A F V P F P G G S I L A A D Y S Q L E L R I L A H L S H D R 7021 GCTTTCGTTCCATTTCCAGGTGGTTCTATTTTGGCTGCAGATTACTCACAATTGGAATTGAGAATTTTGGCACATTTGTCTCATGATAGA 2371 R L I Q V L N T G A D V F R S I A A E W K M I E P E S V G D 7111 AGATTGATCCAAGTTTTGAACACTGGTGCTGATGTTTTTAGATCTATTGCTGCAGAATGGAAAATGATTGAACCAGAATCAGTTGGTGAC 2401 D L R Q Q A K Q I C Y G I I Y G M G A K S L G E Q M G I K E 7201 GATTTGAGACAACAAGCAAAGCAAATCTGTTACGGTATCATCTATGGTATGGGTGCTAAATCTTTGGGTGAACAAATGGGTATTAAAGAA 2431 N D A A C Y I D S F K S R Y T G I N Q F M T E T V K N C K R 7291 AACGATGCTGCATGTTACATCGATTCTTTTAAATCAAGATACACTGGTATTAATCAATTCATGACAGAAACTGTTAAAAATTGTAAAAGA 2461 D G F V Q T I L G R R R Y L P G I K D N N P Y R K A H A E R 7381 GATGGTTTTGTTCAAACAATCTTGGGTAGAAGAAGATATTTGCCAGGTATTAAAGATAACAACCCATACAGAAAAGCTCATGCAGAAAGA 2491 Q A I N T I V Q G S A A D I V K I A T V N I Q K Q L E T F H 7471 CAAGCAATTAATACTATCGTTCAAGGTTCTGCTGCAGATATCGTTAAGATCGCTACAGTTAACATTCAAAAACAATTGGAAACATTCCAT 2521 S T F K S H G H R E G M L Q S D Q T G L S R K R K L Q G M F 7561 
TCTACTTTTAAATCACATGGTCATAGAGAAGGCATGTTACAATCTGATCAAACTGGTTTGTCAAGAAAGAGAAAGTTGCAAGGCATGTTT 2551 C P I R G G F F I L Q L H D E L L Y E V A E E D V V Q V A Q 7651 TGTCCAATCAGAGGTGGTTTCTTTATCTTGCAATTGCATGATGAATTATTGTATGAAGTTGCAGAAGAAGATGTTGTTCAAGTTGCTCAA 2581 I V K N E M E S A V K L S V K L K V K V K I G A S W G E L K 7741 ATCGTTAAAAATGAAATGGAATCTGCTGTTAAATTATCAGTTAAGTTGAAAGTAAAAGTAAAGATTGGTGCCTCCTGGGGTGAATTGAAA 2611 D F D V * 7831 GATTTTGATGTCTAA

Figure 4-2. Yeast codon optimized sequence of 3xFlag-Polθ. Corresponding amino acids are shown using Show Translation software: https://www.bioinformatics.org/sms/show_trans.html
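The pairing of codons and amino acids in Figure 4-2 can be reproduced with a few lines of code. The sketch below is an illustrative stand-in, not the Show Translation tool itself: it builds the standard genetic code table and translates the first 30 and last 15 nucleotides of the sequence shown in Figure 4-2. The function name is an assumption made for this example.

```python
# Minimal translator for a coding sequence using the standard genetic code.
from itertools import product

BASES = "TCAG"
# Amino acids in the conventional TCAG codon order (TTT, TTC, TTA, TTG, TCT, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence; '*' marks a stop codon."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


first_30_nt = "ATGTCTGACTACAAGGACCATGACGGTGAC"  # start of the sequence in Figure 4-2
last_15_nt = "GATTTTGATGTCTAA"                  # end of the sequence in Figure 4-2
print(translate(first_30_nt))  # MSDYKDHDGD
print(translate(last_15_nt))   # DFDV*
```

The same arithmetic explains the numbering in the figure: 7,845 nt corresponds to 2,615 codons, that is, 2,614 amino acids followed by one stop codon.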

plasmid and a DNA fragment encoding a 15 aa (Gly-Gly-Gly-Gly-Ser)3 flexible linker was inserted. This flexible linker was used successfully in other proteins in previous studies(212)(213)(214)(215)(216). This plasmid, which lacks the coding sequence for the central domain, is named 3xFlag-Polθ∆cen (Figure 4-1b). In order to visualize purified human Pol theta using confocal microscopy, the enhanced green fluorescent protein (eGFP) coding sequence was inserted into the original 3xFlag-Polθ yeast expression plasmid between the 3xFlag tag and the N-terminal coding sequence of human Pol theta. This plasmid, which includes the N-terminal eGFP, is named 3xFlag-GFP-Polθ (Figure 4-1d). Yeast codon optimized sequences of 3xFlag-Polθ∆cen, 3xFlag-PolθK121M and 3xFlag-GFP-Polθ are listed in Appendix B. During codon optimization 60-70% of codons were changed (data not shown). The yeast codon optimized sequence for WT 3xFlag-Polθ is listed in Figure 4-2 as an example.
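The fraction of codons changed during codon optimization can be computed by comparing the native and optimized coding sequences codon by codon. The sketch below shows one way to do this. The function name and the two 15 nt toy sequences are hypothetical (they are not the native or optimized POLQ sequences) and serve only to illustrate the calculation.

```python
def fraction_codons_changed(native: str, optimized: str) -> float:
    """Fraction of codon positions at which two in-frame sequences differ."""
    if len(native) != len(optimized) or len(native) % 3:
        raise ValueError("sequences must be in frame and of equal length")
    starts = range(0, len(native), 3)
    changed = sum(native[i:i + 3].upper() != optimized[i:i + 3].upper()
                  for i in starts)
    return changed / len(starts)


# Toy example: both sequences encode M-S-D-Y-K, but three of five codons differ.
toy_native = "ATGAGCGATTATAAA"
toy_optimized = "ATGTCTGACTACAAA"
print(f"{fraction_codons_changed(toy_native, toy_optimized):.0%}")  # 60%
```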

Growth of Yeast Cells and Induction of Protein Expression

Here, I describe detailed methods to grow yeast cells (S. cerevisiae) and induce expression of recombinant human WT 3xFlag-Polθ and variants for purification purposes (Figure 4-3a). The 3xFlag-Polθ yeast expression plasmid (or Polθ mutant expression vectors) was transformed into yeast strain LSY0269 (a leu2 trp1 ura3-52 prb1-1122 pep4-3 prc1-407 GAL+) or BJ5465(217) (American Type Culture Collection (ATCC) cat. no. 208289) (a ura3-52 trp1 leu2-delta1 his3-delta200 pep4::HIS3 prb1-delta1.6R can1 GAL+) using standard methods as previously described(218). Briefly, 300 ng or 1,000 ng of plasmid DNA was mixed with PEG 3350, LiAc, and single-stranded carrier DNA (salmon sperm DNA), then incubated with chemically competent S. cerevisiae cells at 42° C for 1

Figure 4-3. Overview of full-length Pol theta purification. a, Growth phase: transformation of the 3xFlag plasmid into yeast cells, growth on SC-TRP at 30° C for 3 days, expansion to a 16 L SC-TRP culture, induction with galactose for 5 hours at 30° C, centrifugation of the yeast cells, and storage of the biomass at -80° C. b, Crushing phase: the cell pellet is washed with lysis buffer, crushed in a Freezer/Mill at -196° C, and stored at -80° C. c, Purification and storage phases: the crushed yeast is lysed and ultracentrifuged twice, rotated overnight with Flag resin at 4° C, washed, incubated for 2 hours with 3xFlag peptide at 4° C, eluted, and dialyzed overnight at 4° C.

hr. Cells were spread onto yeast selection media (SC-TRP, consisting of yeast synthetic drop-out medium lacking tryptophan, yeast nitrogen base without amino acids and 2% raffinose) with agar in petri dishes using EZ-Plate beads (Sunrise Products, cat. no. 3001-000), then incubated at 30° C for 3 days. Multiple individual yeast colonies were picked and grown in liquid yeast selection media lacking tryptophan (SC-TRP) with 2% raffinose at 30° C at 250 RPM. 45 mL of these starter cultures was expanded to 16 L by dilution in liquid yeast SC-TRP media, then further grown overnight at 30° C at 250 RPM until the cellular optical density at 600 nm (OD600) reached 0.6-0.8. In some cases, starter cultures were expanded to 8 L and grown to an OD600 of 1. The 8 L culture was then expanded to 16 L by adding 8 L of complete yeast media (composed of yeast extract, peptone and 2% raffinose), and the culture was grown until an OD600 of ~1. Expression of 3xFlag-Polθ was induced by the addition of 2% galactose for 5 hr at 30° C at 250 RPM. Cells were harvested by centrifugation, then the biomass was stored at -80° C. The growth phase is illustrated in Fig. 4-3a.
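The scale-up arithmetic for the growth phase can be sketched as follows. This is an illustration only: the function names are hypothetical, and the calculation assumes that the sugar percentages are weight per volume (w/v), which the text does not state explicitly.

```python
def dilution_factor(starter_ml: float, final_l: float) -> float:
    """Fold dilution when a starter culture is expanded to the final volume."""
    return final_l * 1000.0 / starter_ml


def sugar_mass_g(percent_w_v: float, volume_l: float) -> float:
    """Grams of sugar for a given % (w/v) in a given volume; 1% w/v = 10 g/L.

    ASSUMPTION: percentages are w/v; the protocol text does not specify this.
    """
    return percent_w_v * 10.0 * volume_l


# 45 mL of starter culture expanded to 16 L, then induced with 2% galactose.
print(round(dilution_factor(45, 16), 1))  # 355.6
print(sugar_mass_g(2, 16))                # 320.0
```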

Protein Purification

Preparation of Yeast Cell Extract

The following describes detailed methods for preparing yeast cell extract in which human recombinant 3xFlag-Polθ (or variants thereof) was expressed as described above (Fig. 4-3b). The yeast cellular biomass (i.e. cell pellet) was washed twice with 50 mM HEPES pH 8.0 and 1 M sorbitol, then subsequently washed with lysis buffer (50 mM HEPES pH 8.0, 1 M NaCl, 10% glycerol, 0.1% Igepal CA630 (Nonidet P-40), 1 mM EDTA, 1 mM DTT, 50 mM Arginine, 50 mM Glutamic acid, 1 mM PMSF, 0.5 mM benzamidine, and SigmaFAST Protease Inhibitor Cocktail Tablet, EDTA-Free). In some cases, 1 unit per mL of DNase I (RNase-free) (NEB, cat. no. M0303) was added to the lysis buffer to prevent possible genomic DNA contamination. Cells washed with lysis buffer were first pelleted by centrifugation at 3,000 g for 15 min at 4° C, then lysed in liquid nitrogen at about -196° C using a freezer/mill (SPEX Sample Prep 6875) as follows. First, 50 mL tubes containing cell pellet were flash frozen in liquid nitrogen, and the frozen cell pellet was transferred to a large grinding vial by cracking the disposable 50 mL tubes. Two cell pellets (max 15 mL volume per pellet) were crushed per large grinding vial for efficient grinding.

The following Freezer/Mill program was used for grinding/lysing the yeast cell pellet. After an initial 3 min pre-cooling stage, 15 cycles at a rate of 12 CPS (cycles per second) were used. Each cycle consisted of 3 min of crushing and 2 min of cooling without crushing. Finally, the frozen lysed cell powder was transferred to 50 mL tubes on dry ice with a cold spatula and stored at -80° C until the next purification step. The crushing phase is illustrated in Fig. 4-3b.
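For planning purposes, the duration of the Freezer/Mill program above can be tallied with a small script. The sketch below is an illustration under stated assumptions: the class and method names are hypothetical, the final cooling interval is counted as part of the run, and the stroke count simply multiplies the 12 CPS rate by the total crushing time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MillProgram:
    """Grinding program parameters as given in the protocol text."""
    precool_min: int = 3   # initial pre-cooling stage
    cycles: int = 15       # number of crush/cool cycles
    crush_min: int = 3     # crushing time per cycle
    cool_min: int = 2      # cooling time per cycle (no crushing)
    rate_cps: int = 12     # impactor rate, cycles per second

    def crushing_minutes(self) -> int:
        return self.cycles * self.crush_min

    def total_minutes(self) -> int:
        # ASSUMPTION: the cooling interval after the last cycle is included.
        return self.precool_min + self.cycles * (self.crush_min + self.cool_min)

    def impactor_cycles(self) -> int:
        return self.rate_cps * 60 * self.crushing_minutes()


program = MillProgram()
print(program.crushing_minutes())  # 45
print(program.total_minutes())     # 78
print(program.impactor_cycles())   # 32400
```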

Purification of Recombinant 3xFlag-Polθ and Variants by Binding to Anti-Flag Resin

After cell lysis and storage of lysed cells, the frozen cell powder was thawed on ice for approximately 3 hr, then resuspended in an equal volume of lysis buffer (described above). The resuspended cell powder was homogenized by mixing with a spatula and by pipetting, transferred into transparent polycarbonate ultracentrifuge tubes (Beckman Coulter, cat. no. 355618), then centrifuged at 92,000 g at 4° C for 30 min using a Beckman Ti70 fixed-angle rotor. The clarified supernatant containing cellular protein extracts was isolated and centrifuged a second time at 256,000 g at 4° C for 1 hr. The clear middle layer of the supernatant, which contains the cellular protein extract, was collected using a 20 mL syringe with a metal tip and transferred into two LoBind 50 mL tubes (Eppendorf, cat. no. 0030122240).
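Because centrifuges are often programmed in rpm rather than g, the forces quoted above may need to be converted. The sketch below uses the standard relation RCF = 1.118 × 10⁻⁵ × r (cm) × rpm². The rotor radius used here is an assumed example value and is not stated in the text; the correct radius should be taken from the rotor manual.

```python
import math

RCF_CONSTANT = 1.118e-5  # standard constant for radius in cm and speed in rpm


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) at a given speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) required to reach a given RCF at a given radius."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


# ASSUMPTION: example maximum radius in cm; check the rotor manual before use.
ASSUMED_RADIUS_CM = 9.19

for target_g in (92_000, 256_000):
    rpm = rpm_from_rcf(target_g, ASSUMED_RADIUS_CM)
    print(f"{target_g:>7} g -> about {rpm:,.0f} rpm")
```

The same relation shows why the 1,000 rpm resin-pelleting spins described in the following steps cannot be expressed in g without knowing the radius of the rotor that was used.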

4 mL of ANTI-FLAG M2 resin (Sigma, cat. no. A2220) was pre-washed with 10 mL of 0.1 M Glycine-HCl pH 3.5, then washed with 15 mL of TBS pH 7.4 (50 mM Tris-HCl, 150 mM NaCl), followed by two more wash steps with 15 mL of lysis buffer each (described above). Wash steps were performed by mixing the resin with the indicated wash buffer, then removing the wash buffer in the supernatant after centrifugation of the mixture at 1,000 rpm at 4° C for 5 min. 3xFLAG peptide (Sigma, cat. no. F4799) at 5 µg/mL and the 4 mL of equilibrated ANTI-FLAG M2 resin (50% slurry, prepared following the washing protocol described above) were mixed together in two LoBind 50 mL tubes. The 50 mL tubes containing the supernatant from ultracentrifugation, 3xFLAG peptide and ANTI-FLAG M2 resin were incubated overnight at 4° C with slow 360 degree rotation to allow binding of the protein to the resin.

ANTI-FLAG M2 resin bound to recombinant 3xFlag-Polθ protein (or variants thereof) was pelleted by centrifugation of the 50 mL LoBind tubes at 1,000 rpm at 4° C for 5 min, and the supernatant containing the unbound cellular protein extract was collected using a 4.5 mL transfer pipette (Thomas Scientific, cat. no. 1227W91). The ANTI-FLAG M2 resin containing recombinant 3xFlag-Polθ (or variants thereof) was washed twice with 10 mL of lysis buffer, then subsequently washed with 10 mL of wash buffer A (50 mM HEPES pH 8.0, 1 M NaCl, 10% glycerol, 0.1% Igepal CA630 (Nonidet P-40), 1 mM DTT, 50 mM Arginine, 50 mM Glutamic acid, 1 mM PMSF, 0.5 mM benzamidine, 10 mM MgCl2 and 1 mM ATP). The resin was then incubated in 15 mL of wash buffer A at 4° C for 15 min in 50 mL LoBind tubes, then pelleted by centrifugation at 1,000 rpm at 4° C for 5 min. The resin was then washed twice with 10 mL of wash buffer B (50 mM HEPES pH 8.0, 1 M NaCl, 10% glycerol, 0.1% Igepal CA630 (Nonidet P-40), 1 mM DTT, 50 mM Arginine, 50 mM Glutamic acid, 1 mM PMSF, 0.5 mM benzamidine).

Next, the ANTI-FLAG M2 resin containing recombinant 3xFlag-Polθ (or 3xFlag-Polθ variants) was resuspended in ~3 mL of elution buffer (50 mM HEPES pH 8.0, 1 M NaCl, 10% glycerol, 0.1% Igepal CA630 (Nonidet P-40), 1 mM DTT, 50 mM Arginine, 50 mM Glutamic acid) containing 500 µg/mL 3xFLAG peptide (Sigma, cat. no. F4799), then incubated for 2 hr at 4° C in one LoBind 5 mL tube (Eppendorf, cat. no. 0030108302) with gentle 360 degree rotation. A disposable 10 mL polypropylene column (Thermo Scientific, cat. no. 29924) was pre-washed with 15 mL of TBS buffer pH 7.4 (50 mM Tris-HCl, 150 mM NaCl), then equilibrated with 15 mL of elution buffer lacking 3xFLAG peptide. Finally, the incubated resin mixture was added to the column and the eluate containing pure recombinant 3xFlag-Polθ (or 3xFlag-Polθ variants) and 3xFLAG peptide was collected by gravity flow in several 0.5 mL fractions. The resin was then washed once with 5 mL of elution buffer and 10 fractions of 0.5 mL each were collected in 1.5 mL Eppendorf tubes. The resin was then washed once more with 10 mL of elution buffer and 10 fractions of 1 mL each were collected in 1.5 mL Eppendorf tubes.

The elution fractions were analyzed in SDS denaturing polyacrylamide gels (7.5% Mini-PROTEAN TGX gels with 10-well combs, Bio-Rad, cat. no. 456-1024) to identify fractions containing 3xFlag-Polθ (or 3xFlag-Polθ variants). Elution fractions containing 3xFlag-Polθ (~292 kDa) or one of the following variants, 3xFlag-PolθK121M (~292 kDa), 3xFlag-Polθ∆cen (~183 kDa) or 3xFlag-GFP-Polθ (~320 kDa), identified using standard protein size markers, were pooled together at 4° C, then dialyzed against 2 L of dialysis buffer (50 mM HEPES pH 8.5, 250 mM NaCl, 10% glycerol, 0.1% Igepal CA630, 1 mM DTT, 7.5 mM ATP) in Spectra/Por Standard RC dialysis tubing (Spectrum Labs, cat. no. 132678) with a 12-14 kDa cut-off overnight at 4° C. Pure recombinant 3xFlag-Polθ (or 3xFlag-Polθ variants) was flash frozen in liquid nitrogen and stored in small (5-100 µL) aliquots at -80° C. The purification and storage phase is illustrated in Fig. 4-3c. This protocol resulted in an average yield of 35-100 µg of purified protein.
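The lysis, wash and elution buffers described in this chapter share a common base and differ in only a few components. As a hedged illustration, the sketch below records the compositions given in the text as simple dictionaries and reports what changes from one step to the next. The variable and function names are hypothetical, and the amount of protease inhibitor cocktail is left unspecified because the text does not state it.

```python
# Components shared by the lysis, wash A, wash B and elution buffers (from the text).
BASE = {
    "HEPES pH 8.0": "50 mM",
    "NaCl": "1 M",
    "glycerol": "10%",
    "Igepal CA630 (Nonidet P-40)": "0.1%",
    "DTT": "1 mM",
    "arginine": "50 mM",
    "glutamic acid": "50 mM",
}
PMSF_BENZAMIDINE = {"PMSF": "1 mM", "benzamidine": "0.5 mM"}

BUFFERS = {
    "lysis": {**BASE, **PMSF_BENZAMIDINE, "EDTA": "1 mM",
              "SigmaFAST protease inhibitor cocktail (EDTA-free)": "amount not stated"},
    "wash A": {**BASE, **PMSF_BENZAMIDINE, "MgCl2": "10 mM", "ATP": "1 mM"},
    "wash B": {**BASE, **PMSF_BENZAMIDINE},
    "elution": {**BASE, "3xFLAG peptide": "500 µg/mL"},
}


def buffer_diff(first: str, second: str) -> tuple[list[str], list[str]]:
    """Return (components added, components removed) going from first to second."""
    a, b = set(BUFFERS[first]), set(BUFFERS[second])
    return sorted(b - a), sorted(a - b)


for step in (("lysis", "wash A"), ("wash A", "wash B"), ("wash B", "elution")):
    added, removed = buffer_diff(*step)
    print(f"{step[0]} -> {step[1]}: added {added}, removed {removed}")
```

Laying the recipes out this way makes the logic of the wash series easier to see: wash buffer A adds MgCl2 and ATP, wash buffer B removes them again, and the elution buffer drops PMSF and benzamidine while adding the competing 3xFLAG peptide.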

Results and Conclusions

We were able to purify relatively large amounts of full-length Pol theta and variants from yeast (S. cerevisiae), and the purified proteins are highly active in biochemical assays (Figure 4-4). SDS gel images of purified full-length Pol theta and variants are shown in Figure 4-4. In order to test the activity of full-length Pol theta and variants, we used a standard DNA polymerase primer extension assay in comparison with the isolated Polθ-polymerase

Figure 4-4. Activity of full-length Pol theta purifications. a, SDS gel of purified human 3xFlag-Polθ. b, Primer-template extension by purified human 3xFlag-Polθ. c, SDS gel of purified human 3xFlag-Polθ∆cen. d, Primer-template extension by purified human 3xFlag-Polθ∆cen. e, SDS gel of purified human 3xFlag-PolθK121M. f, Primer-template extension by purified human 3xFlag-PolθK121M. g, SDS gel of purified human 3xFlag-GFP-Polθ. h, Primer-template extension by purified human 3xFlag-GFP-Polθ. i, Confocal image of 3xFlag-GFP-Polθ in the presence of RP334Cy3 DNA. Scale bar, 5 μm. White arrow: 3xFlag-GFP-Polθ. Blue arrow: RP334Cy3 DNA.

domain purified from E. coli as described(96). The GFP-Polθ concentration was limiting, and further concentration of this protein proved problematic; therefore, primer extension by this particular variant is less efficient than by the other Pol theta proteins (Figure 4-4h). Finally, in order to visualize GFP-Polθ and confirm its ability to fluoresce and co-localize with ssDNA, a confocal image was taken in the presence of ssDNA (courtesy of Samuel Black, Pomerantz lab) (Figure 4-4i). Further biochemical and structural analyses of WT Pol theta and Pol theta variants purified from yeast are described below.

To our knowledge, this study is the first to demonstrate purification of relatively large quantities of full-length human Polθ from yeast. Polθ∆cen gives the highest yield, likely due to higher expression as a result of its smaller size (183 kDa). Furthermore, it appears that 16 L of yeast culture is necessary to yield sufficient protein for robust biochemical studies. Future studies are needed to optimize methods for concentrating the protein further for structural biology applications, and for more effective removal of the 3xFLAG peptide, which appears to either stick to dialysis tubing or not diffuse efficiently through this membrane.
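To relate the mass yields reported in this chapter to molar amounts, the approximate molecular masses of the constructs can be used. The sketch below is illustrative only: the function name is hypothetical, and applying the overall 35-100 µg yield range to each construct is an assumption, since per-construct yields are not given in the text.

```python
# Approximate molecular masses (kDa) of the purified constructs, as stated in the text.
MASS_KDA = {
    "3xFlag-Polθ": 292,
    "3xFlag-PolθK121M": 292,
    "3xFlag-Polθ∆cen": 183,
    "3xFlag-GFP-Polθ": 320,
}


def picomoles(mass_ug: float, mass_kda: float) -> float:
    """Convert a protein mass in µg to pmol (µg / kDa gives nmol)."""
    return mass_ug / mass_kda * 1000.0


# ASSUMPTION: the reported 35-100 µg range is applied to every construct.
for name, kda in MASS_KDA.items():
    low, high = picomoles(35, kda), picomoles(100, kda)
    print(f"{name}: {low:.0f}-{high:.0f} pmol")
```

For equal mass yields, the smaller Polθ∆cen construct corresponds to more molecules than the full-length protein.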


CHAPTER 5

STRUCTURE FUNCTION STUDIES OF FULL-LENGTH POL THETA

Introduction

As discussed above, DNA polymerase theta (Pol theta) is a unique polymerase-helicase fusion protein that is essential for the mutagenic double-strand break (DSB) repair pathway called microhomology-mediated end-joining (MMEJ) or alternative end-joining(95)(96)(168)(97)(118)(162). MMEJ functions during S and G2 cell cycle phases and therefore acts on 3’ single-strand DNA (ssDNA) overhangs generated by 5’-3’ exonuclease resection of DSBs, similar to homologous recombination (HR)(28)(98). The ability of Pol theta to act on 3’ ssDNA overhangs to promote MMEJ during replication explains how this specialized polymerase enables the proliferation of HR-deficient cells(118)(98). Indeed, HR-deficient cells depend on Pol theta activity for their survival(168)(39). Thus, Pol theta is considered a promising drug target in BRCA1/2-deficient cancer cells(168)(39)(140).

As noted above, Pol theta consists of a superfamily 2 helicase (Polθ-hel), a disordered central domain (Polθ-cen), and an A-family polymerase (Polθ-pol) (Appendix C)(95)(137)(130). Polθ-hel is related to HELQ/Hel308 helicases, whereas Polθ-pol is related to bacterial Pol I enzymes but contains an inactive 3’-5’ exonuclease domain(95)(137)(130)(219)(220)(165). The polymerase and helicase include unstructured motifs, and loop 2 in Polθ-pol promotes its ssDNA extension and MMEJ activities (Appendix C)(96)(165)(163). Polθ-pol can promote MMEJ of partial ssDNA (pssDNA) substrates by pairing the 3’ ssDNA overhangs via microhomology, then extending the minimally paired ends(96). Polθ-pol can also perform MMEJ of very short ssDNA (≤12 nt), which is fully contained within its active site and is therefore not a relevant substrate for studying Pol theta MMEJ (Appendix D: Fig. D-2c)(219)(221). For example, cellular studies demonstrate that Pol theta-dependent MMEJ occurs on 3’ ssDNA overhangs that are ≥45-70 nt in length, indicating that extensive DNA resection promotes this pathway (Appendix D: Fig. D-2e)(98).
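The pairing of two 3’ overhangs via terminal microhomology, as described above, can be illustrated with a small sequence check. The sketch below is a simplified model and not a description of the enzyme mechanism: it only asks how many nucleotides at the 3’ termini of two overhangs (both written 5’ to 3’) could form a fully base-paired antiparallel duplex. The function names and the toy overhang sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def terminal_microhomology(overhang_a: str, overhang_b: str, max_len: int = 12) -> int:
    """Longest k for which the last k nt of both 3' termini can fully base-pair.

    Simplification: internal or mismatched pairing is not considered.
    """
    a, b = overhang_a.upper(), overhang_b.upper()
    best = 0
    for k in range(1, min(len(a), len(b), max_len) + 1):
        if a[-k:] == reverse_complement(b[-k:]):
            best = k
    return best


# Toy overhang ending in a self-complementary 6 nt motif; two copies can pair.
toy_overhang = "TTTTTTTTTTCCCGGG"
print(terminal_microhomology(toy_overhang, toy_overhang))  # 6
```

A motif that equals its own reverse complement allows two identical overhangs to pair with each other, which is convenient for model substrates.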

The specific functions of Polθ-hel and Polθ-cen in MMEJ remain to be elucidated. Recent studies from our laboratory demonstrated that Polθ-hel exhibits ATP-dependent DNA unwinding activity; however, whether this function is important for Pol theta MMEJ remains unclear. Other collaborative studies from our laboratory indicate that Polθ-hel ATPase activity overcomes RPA inhibition of MMEJ by displacing RPA from ssDNA(59). However, mutation of the conserved Walker A residue K121, which is essential for ATP binding, does not abolish MMEJ in multiple studies, which suggests a non-essential function for Polθ-hel ATPase activity in MMEJ(97)(98)(59). Thus, the function of Polθ-hel ATPase activity in MMEJ remains elusive. Since the overall architecture of Pol theta is conserved, polymerase-helicase attachment via the disordered Polθ-cen is likely germane to MMEJ(137)(130). However, this domain is not highly conserved, thus it may primarily serve as a linker (Fig. 1-4, Appendix C). Below, we perform structure-function analysis of full-length recombinant human Pol theta to elucidate the functions of Polθ-hel and Polθ-cen in MMEJ and how they affect the structural organization of Pol theta.


Results

Polymerase-helicase tethering is essential for MMEJ

To probe the function of Polθ-hel in MMEJ, the Polθ∆cen mutant protein was expressed and purified from S. cerevisiae using an N-terminal 3xFlag-tag as described above (Fig. 5-1a, b). This Pol theta mutant has a 15 aa (GGGGS)3 linker in place of Polθ-cen (Fig. 5-1a). The active concentration of Pol theta (and Pol theta mutants) was determined by specific activity on primer-templates compared to Polθ-pol (Appendix D: Fig. D-1). Polθ∆cen and Polθ-pol exhibit identical primer extension activities, indicating the helicase does not affect this function (Fig. 5-1c). Polθ∆cen exhibits ATPase activity as expected (Fig. 5-1d). Polθ∆cen is stored in buffer with 7.5 mM ATP. Thus, detection of Polθ∆cen hydrolysis of 32P-γ-ATP, which is present at substantially lower levels than cold ATP, requires prolonged incubation (Fig. 5-1d).

Because long 3’ ssDNA overhangs support MMEJ in cellular studies, we tested whether Polθ∆cen can perform MMEJ of ssDNA as a model substrate (Fig. 5-1e, f). Here, Polθ∆cen was incubated with 5’ Cy3-conjugated ssDNA, 26 nt in length and containing 6 bp of sequence microhomology (5’-CCCGGG-3’), along with dNTPs and MgCl2 in Tris-HCl buffer for 32 min at 37 °C. Reactions were terminated by the addition of EDTA and proteinase K, which degrades protein, and DNA was resolved in non-denaturing polyacrylamide gels (Fig. 5-1f). All MMEJ reactions include 2 mM ATP unless otherwise indicated. XmaI addition following MMEJ confirms that the high molecular weight products are generated by end-joining (Fig. 5-1g). For example, XmaI specifically digests the

[Figure 5-1: image not reproduced. Recoverable panel content: domain schematic of Polθ∆cen (SF2 helicase domain, 15 aa GS-linker, A-family Pol domain); SDS gel of Polθ∆cen; primer extension (denaturing gel); ATP hydrolysis (% Pi vs. hr); MMEJ of 26 nt ssDNA (non-denaturing gels; % yield vs. min); titrations of Polθ-hel (0-40 nM) and Polθ-pol (0-200 nM).]

Figure 5-1. Polymerase-helicase tethering is essential for MMEJ.

a, Schematic of Polθ∆cen construct. b, Schematic (left) and SDS gel (right) of Polθ∆cen protein. c, Denaturing gel showing primer extension by Polθ-pol and Polθ∆cen. d, TLC plate showing ATP hydrolysis by Polθ∆cen. Plot of ATP hydrolysis. n = 3 ± s.d. (right). e, Schematic of MMEJ assay. f, Schematic of a model for ssDNAx products. g, Non-denaturing gel showing a time course of MMEJ of 26 nt ssDNA by Polθ∆cen (left). Plot of Polθ∆cen time course products. h, Schematic of MMEJ assay with Polθ-hel and Polθ-pol. i, j, Non-denaturing gels showing ssDNAx products generated by Polθ-pol with and without Polθ-hel at indicated concentrations.

CCCGGG double-strand DNA sequence which is generated by MMEJ. Polθ∆cen generates both MMEJ products and some ssDNA extension byproducts, which appear to be due to intrastrand pairing and subsequent extension, also known as “snapback replication” (Fig. 5-1h, i) (Appendix D: Fig. D-3). In contrast to the results with Polθ∆cen, Polθ-pol fails to perform MMEJ on the same ssDNA substrate (Fig. 5-2f), demonstrating that the helicase domain facilitates MMEJ. The addition of Polθ-hel in trans is unable to significantly stimulate Polθ-pol MMEJ (Fig. 5-1j, k, l), and controls show that purified Polθ-hel binds ssDNA and exhibits ATPase activity (Appendix D: Fig. D-2a). Hence, our data demonstrate that Polθ-hel stimulates Polθ-pol MMEJ in cis, likely by altering the conformation of Polθ-pol and/or ssDNA, and suggest that the major function of Polθ-cen is to tether Polθ-hel to Polθ-pol.

Pol theta exclusively performs MMEJ of long ssDNA

As described above, Pol theta was purified from S. cerevisiae using an N-terminal 3xFlag-tag (Fig. 5-2a, b). Pol theta exhibits normal primer extension activity (Fig. 5-2c)

[Figure 5-2: image not reproduced. Recoverable panel content: domain schematic of 3xFlag-Polθ (SF2 helicase domain, Polθ-hel; central domain, Polθ-cen, with RAD51 binding sites; A-family Pol domain, Polθ-pol; 2,590 aa); SDS gel; primer extension (denaturing gel); ATP hydrolysis (% Pi vs. min); MMEJ assay on 26 nt ssDNA; model of Polθ-cen inhibition; fluorescence anisotropy vs. log protein concentration (nM) for FAM-labeled ssDNA, annotated with values of 2.5 nM and 44.7 nM.]

Figure 5-2. Pol theta does not promote MMEJ of short ssDNA.

a, Schematic of Pol theta. b, SDS gel of purified human Pol theta. c, Denaturing gel showing primer extension by Pol theta and Polθ-pol. d, TLC plate showing ATP hydrolysis by Pol theta. Plot of ATP hydrolysis. n = 3 ± s.d. (right). e, Schematic of MMEJ assay. f, Non-denaturing gel showing MMEJ of 26 nt ssDNA by Pol theta and Polθ-pol (left) and Polθ∆cen and Polθ-pol (right). g, Schematic of a model for Polθ-cen inhibition of MMEJ but not primer extension. h, Schematic of ssDNA binding assay (top). Plot of fluorescence anisotropy comparing Pol theta and Polθ-pol ssDNA binding (bottom).

and demonstrates ATPase activity as expected (Fig. 5-2d). Notably, Pol theta remains active for several hours under these conditions with ssDNA, 10 mM MgCl2, ATP, 10% glycerol, Tris-HCl buffer and 0.01% NP-40 (Appendix D: Fig. D-2a).

Unexpectedly, Pol theta failed to perform MMEJ or ssDNAx on short (≤26 nt) ssDNA (Fig. 5-2e, f). Yet, Polθ∆cen is fully functional on the same substrate (Fig. 5-1f). This indicates that Polθ-cen exerts a specific autoinhibitory function, suppressing Pol theta activity on ssDNA but not on primer-templates (Fig. 5-2g). For example, Pol theta fails to extend short (≤26 nt) ssDNA (Fig. 5-2f), yet it binds tightly (KD = 2.5 nM) to ssDNA of similar length (29 nt; Fig. 5-2h) and is functional on primer-templates of similar length (Fig. 5-2c).
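The practical meaning of these dissociation constants can be illustrated with a simple calculation. The sketch below is only an illustration, assuming a one-site binding model in which the free protein concentration approximates the total protein concentration; it is not the fitting procedure used for Fig. 5-2h. The value of 44.7 nM is taken from the annotation in Fig. 5-2h and is assumed here to correspond to Polθ-pol, and the function name `fraction_bound` is hypothetical.

```python
def fraction_bound(protein_nM: float, kd_nM: float) -> float:
    """Fraction of ssDNA bound under a one-site model: [P] / (KD + [P]).

    Assumes free protein is approximately equal to total protein
    (ligand depletion is ignored), which is an illustrative simplification.
    """
    if protein_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and KD must be > 0")
    return protein_nM / (kd_nM + protein_nM)


KD_POL_THETA_NM = 2.5    # KD quoted in the text for full-length Pol theta
KD_POL_DOMAIN_NM = 44.7  # value annotated in Fig. 5-2h; assumed to be Pol theta-pol

for conc_nM in (1.0, 2.5, 10.0, 44.7, 100.0):
    full_length = fraction_bound(conc_nM, KD_POL_THETA_NM)
    pol_domain = fraction_bound(conc_nM, KD_POL_DOMAIN_NM)
    print(f"{conc_nM:6.1f} nM: Pol theta {full_length:.2f}, Pol theta-pol {pol_domain:.2f}")
```

Under these assumptions, at 10 nM protein roughly 80% of the ssDNA would be bound by Pol theta, compared with under 20% for Polθ-pol.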

Initial studies indicated that short (≤15 nt) 3’ overhangs are required for MMEJ (96), whereas recent cellular studies demonstrate that ≥45-70 nt 3’ overhangs support this pathway (98). We therefore examined Pol theta MMEJ of long ssDNA (70 and 100 nt), which more closely models cellular substrates (Fig. 5-3a, b, c). Pol theta (3 nM) was incubated with 10 nM 5’ radiolabeled ssDNA in the presence of 20 µM deoxyribonucleotides (dNTPs), 25 mM NaCl, 0.1 µg/ml BSA and 10 mM MgCl2 in Tris-HCl buffer at 37 °C. Reactions were terminated by the addition of EDTA and proteinase K, and DNA was resolved in non-denaturing polyacrylamide gels as above. Pol theta performs efficient MMEJ of 70 and 100 nt ssDNA containing 6 base pairs (bp) of 3’ terminal microhomology (5’-CCCGGG-3’) (Fig. 5-3). In contrast, Polθ-pol is deficient in MMEJ activity and predominantly performs ssDNA extension (ssDNAx) via “snap-back” replication, in which the polymerase utilizes intrastrand pairing to extend the 3’ terminus.

[Figure 5-3: image not reproduced. Recoverable panel content: MMEJ assay schematic; non-denaturing gels of MMEJ of 70 nt ssDNA (% MMEJ under lanes: 4, 33, 53 and 4, 53, 58); time courses (0-32 min) of MMEJ of 100 nt ssDNA with plots of % MMEJ vs. min for Polθ (% MMEJ under lanes: 2.7, 24, 51, 71, 77), Polθ∆cen (43, 57, 62, 63, 68) and PolθK121M (28, 46, 62, 74, 78, 80); SDS gel of PolθK121M.]

Figure 5-3. Pol theta exclusively performs MMEJ of long ssDNA.

a, Schematic of MMEJ assay. b, Non-denaturing gel showing MMEJ of 70 nt ssDNA by Pol theta, Polθ∆cen and Polθ-pol. c, Non-denaturing gel showing MMEJ of 70 nt ssDNA by Pol theta, PolθK121M and Polθ-pol. d, Non-denaturing gel of time course of MMEJ of long ssDNA by Pol theta (left). Plot of Pol theta time course products. n = 3 ± s.d. (right). e, Non-denaturing gel of time course of MMEJ of long ssDNA by Polθ∆cen (left). Plot of Polθ∆cen time course products. n = 3 ± s.d. (right). f, Non-denaturing gel of time course of MMEJ of long ssDNA by PolθK121M (left). Plot of PolθK121M time course products. n = 3 ± s.d. (right). g, SDS gel of purified human PolθK121M.

Polθ-pol ssDNAx was characterized in previous studies and is shown in Appendix D: Fig. D-3 (96)(163). MMEJ products accumulate slowly (≥2 min), indicating a rate-limiting interstrand pairing step (Fig. 5-3d). These data demonstrate the unexpected finding that Pol theta is fully functional on long ssDNA but not short ssDNA. Considering that Pol theta exhibits a significantly lower KD on ssDNA compared to Polθ-pol, the helicase domain strongly contributes to Pol theta ssDNA binding (Fig. 5-2h). This suggests that the helicase binds upstream from the polymerase domain, since the polymerase is required for interstrand pairing and extension of the minimally paired overhangs. For example, because Polθ-pol is capable of promoting MMEJ of 12 nt ssDNA, and Polθ∆cen exhibits identical activity on this short substrate (Appendix D: Fig. D-2c), this functionally places the polymerase at the 3’ terminus of ssDNA and demonstrates that the helicase does not provide an advantage on short substrates. Together, these data support a model of Polθ-hel binding upstream from Polθ-pol (Fig. 5-5).

To probe Polθ-hel ATPase function in MMEJ, we purified PolθK121M, which contains a methionine substitution for the conserved Walker A residue K121 that is essential for ATP binding (Fig. 5-3f, g). PolθK121M performs MMEJ identically to wild-type (WT) Pol theta (Fig. 5-3b, c), demonstrating that Pol theta ATPase activity is dispensable for MMEJ in our purified system.
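The gels in Fig. 5-3 are annotated with % MMEJ values. The quantification procedure is not restated in this chapter; the sketch below assumes one common convention, in which the MMEJ product band is expressed as a percentage of the total signal in the lane (unreacted ssDNA + ssDNAx + MMEJ products). The function name and the band intensities are hypothetical and are not data from this study.

```python
def percent_mmej(substrate: float, ssdnax: float, mmej: float) -> float:
    """Return the MMEJ band as a percentage of total lane signal.

    Assumed convention: total = unreacted substrate + ssDNAx + MMEJ products.
    Inputs are background-subtracted band intensities in arbitrary units.
    """
    total = substrate + ssdnax + mmej
    if total <= 0:
        raise ValueError("total lane signal must be positive")
    return 100.0 * mmej / total


# Hypothetical band intensities (arbitrary units) for two lanes.
lanes = {
    "early time point": (900.0, 50.0, 50.0),
    "late time point": (200.0, 50.0, 750.0),
}

for label, (substrate, ssdnax, mmej) in lanes.items():
    print(f"{label}: {percent_mmej(substrate, ssdnax, mmej):.1f} % MMEJ")
```

Normalizing to total lane signal makes lanes comparable even when loading differs slightly, which is why this convention is assumed here.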

Pol theta multimers promote DNA accumulation and end-joining

X-ray crystallography demonstrates that Polθ-hel and Polθ-pol form homotetramers and dimers, respectively (219)(165). Thus, we hypothesized that Pol theta acts as large multimers. An electrophoretic mobility shift assay (EMSA), in which Pol theta was incubated with radiolabeled ssDNA and subsequently cross-linked with glutaraldehyde, indicates that Pol theta forms large multimeric complexes with ssDNA, since these complexes fail to enter the agarose gel due to their large molecular weight (Fig. 5-4a).

ssDNA-dependent Pol theta multimerization was further investigated via scanning force microscopy (SFM) using the MMEJ buffer conditions lacking dNTPs and BSA (Fig. 5-4b). Here, 62% of Pol theta appears as tetramers or larger with ssDNA, whereas in the absence of ssDNA 18% of Pol theta appears as large multimeric complexes (Fig. 5-4c). Interestingly, Polθ∆cen forms large multimers (larger than tetramers) even in the absence of ssDNA, demonstrating that Polθ-cen suppresses Pol theta multimerization when DNA is not present (Fig. 5-4e). These data not only demonstrate that Pol theta forms multimeric complexes with ssDNA, but also indicate the ability of Pol theta multimers to promote the accumulation of DNA, which can conceivably facilitate DNA synapsis and end-joining.
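The percentages quoted above summarize distributions of oligomeric states assigned from SFM volume measurements. A minimal sketch of that summary step is given below, assuming each imaged particle has already been assigned an integer oligomeric state; the function name and the example particle list are hypothetical and are not data from this study.

```python
def percent_at_or_above(states: list[int], threshold: int = 4) -> float:
    """Percentage of particles whose oligomeric state is >= threshold.

    `states` holds one integer per imaged particle (1 = monomer,
    2 = dimer, 4 = tetramer, ...). The default threshold of 4 matches
    the "tetramers or larger" grouping used in the text.
    """
    if not states:
        raise ValueError("no particles supplied")
    hits = sum(1 for state in states if state >= threshold)
    return 100.0 * hits / len(states)


# Invented example: ten particles with assigned oligomeric states.
particles = [1, 1, 1, 2, 2, 4, 4, 5, 8, 10]
print(f"{percent_at_or_above(particles):.0f}% tetramers or larger")
```

The same function can be applied separately to particle lists collected with and without ssDNA to compare the two conditions.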

To unequivocally determine if Pol theta can promote the accumulation of DNA by forming multimeric complexes, the protein was imaged in the presence of long double-strand DNA, which is easily detectable by SFM. Indeed, SFM demonstrates that large Pol theta multimeric complexes promote the accumulation of long (1.8 kb) double-strand DNA (Fig. 5-4h). Overall, our data indicate that Pol theta multimers are the active form of this enzyme and suggest that Pol theta multimeric complexes facilitate DNA synapsis and end-joining in cells.

[Figure 5-4: image not reproduced. Recoverable panel content: EMSA of Polθ-ssDNA complexes retained in the gel well (70 nt ssDNA, + ATP; Polθ at 0, 5, 10, 20, 40); SFM images of Polθ with and without ssDNA; bar charts of % proteins vs. oligomeric state (1 to >10) for Polθ (−ssDNA, n = 280; +ssDNA, n = 383), PolθK121M (−ssDNA, n = 92; +ssDNA, n = 176), Polθ∆cen, Polθ-pol and Polθ-hel; SFM images of DNA, Polθ, and Polθ + DNA (scale bar = 250 nm); titration of Polθ-pol MMEJ of 26 nt ssDNA at 50, 200 and 500 nM ssDNA (% MMEJ).]

Figure 5-4. Pol theta forms large DNA-dependent multimeric complexes and induces liquid demixing of DNA.

a, Schematic of ssDNA binding assay (top). EMSA showing Pol theta ssDNA binding (bottom). b, SFM images of Pol theta with (right; n = 1,072) and without (left; n = 453) ssDNA. White arrow indicates Pol theta monomer. Grey arrow indicates Pol theta multimers. Blue particles represent small buffer components. Scale bar = 250 nm. c-g, Bar charts showing oligomeric states of indicated proteins with and without 26 nt ssDNA determined by SFM volume measurements. h, SFM images of DNA (left first), Pol theta (left second), and Pol theta-DNA complexes (right). Scale bar = 250 nm. i, Non-denaturing gel showing a titration of MMEJ of 26 nt ssDNA by Polθ-pol (left). Bar graph of Polθ-pol MMEJ products (right).

Discussion

Our biochemical studies demonstrate that Polθ-hel significantly upregulates Pol theta MMEJ of long ssDNA, and this stimulation is most effective on ssDNA ≥70 nt (Fig. 5-5a, top), which in turn suppresses Pol theta ssDNAx via intrastrand pairing (Fig. 5-5a, bottom). In contrast, Polθ-pol almost exclusively promotes ssDNAx byproducts on long ssDNA (Fig. 5-5c), although it is capable of efficient MMEJ on very short (≤12 nt) ssDNA that is fully contained within its active site and is therefore not a relevant substrate (Fig. 5-5d)(221). Remarkably, Polθ-hel stimulation of Polθ-pol MMEJ requires its attachment to the polymerase, but does not depend on Polθ-cen (Fig. 5-5b, f). Thus, Polθ-cen may primarily

[Figure 5-5: image not reproduced. Schematic models (panels a-h) of Pol theta, Polθ∆cen and Polθ-pol on ssDNA and primer-templates, contrasting interstrand pairing with intrastrand pairing and indicating active, inhibited and inactive states.]

Figure 5-5. Functional models of wild-type Pol theta and Pol theta variants.

a, Model of Pol theta activity on long (≥70 nt) ssDNA. Polθ-hel promotes Polθ-pol MMEJ in cis in an ATPase-independent manner by suppressing Polθ-pol intrastrand pairing. b, Polθ∆cen exhibits identical activity to Pol theta on long (≥70 nt) ssDNA where Polθ-hel suppresses Polθ-pol intrastrand pairing in favor of MMEJ. c, Polθ-pol primarily performs ssDNAx on ssDNA ≥26 nt due to the absence of the helicase. d, Polθ-pol promotes MMEJ of short (12 nt) ssDNA. e, Polθ-cen suppresses Pol theta activity on short (12 nt) ssDNA (top), but not on short primer-templates (bottom). f, Polθ∆cen promotes MMEJ and ssDNAx on intermediate length (26 nt) ssDNA. g, Polθ∆cen promotes MMEJ of short (12 nt) ssDNA in an identical manner to Polθ-pol. h, Polθ∆cen extends short primer-templates in an identical manner to Pol theta.

act as a flexible linker, but could also be involved in intra- and intermolecular interactions.

Polθ-hel stimulates Polθ-pol end-joining independently of its ATPase function (Appendix D: Fig. D-2g). This suggests that Polθ-hel upregulates Polθ-pol MMEJ by altering its conformation and/or by affecting the trajectory of ssDNA. Because Polθ-hel stimulation of Polθ-pol MMEJ is directly correlated to ssDNA length, our data support a model whereby Polθ-hel ssDNA binding upstream from Polθ-pol suppresses ssDNAx in favor of MMEJ (Fig. 5-5a, b). Consistent with this model, Polθ-hel significantly contributes to Pol theta ssDNA binding, and Polθ-pol performs MMEJ of short 12 nt ssDNA in the absence of the helicase, which functionally places Polθ-pol at the 3’ terminus (Fig. 5-5d). Furthermore, Polθ∆cen and Pol theta exhibit identical MMEJ activity on 12 nt ssDNA (data not shown). Thus, polymerase-helicase tethering only provides an advantage on long ssDNA where Polθ-hel can bind upstream and suppress ssDNAx in favor of MMEJ (Fig. 5-5a, b, f).

Unexpectedly, we found that Pol theta is unable to perform MMEJ or ssDNAx on short (≤26 nt) ssDNA (Fig. 5-5e, top); however, it binds tightly to substrates of similar size and is fully active on short primer-templates (Fig. 5-5e, bottom). In contrast, Polθ∆cen is fully functional on short substrates (Fig. 5-5f, g), and behaves identically to WT Pol theta on long ssDNA (Fig. 5-5b) and primer-templates (Fig. 5-5h). These data reveal a specific autoinhibitory function for Polθ-cen on short ssDNA (Fig. 5-5e, top), which can conceivably be involved in regulating Pol theta substrate selection. We suspect that Polθ-cen specifically suppresses Pol theta MMEJ and ssDNAx on short ssDNA by altering the conformation of a disordered motif called loop 2 within Polθ-pol that is required for its activities on ssDNA, but not on primer-templates (Fig. 5-5e, top)(96)(219)(163). Because Pol theta is active on longer ssDNA (≥45 nt), the autoinhibitory function of Polθ-cen is evidently relieved on these substrates, presumably due to a conformational change upon Pol theta binding to long ssDNA (Fig. 5-5a). Notably, Polθ-cen is negatively charged and therefore may be repelled by phosphate groups on longer ssDNA. High-resolution structures of full-length Pol theta on various ssDNA substrates will be needed to fully elucidate how Polθ-cen and Polθ-hel regulate its structural organization on ssDNA.

Single-particle imaging demonstrates that although Pol theta behaves mostly as monomers and dimers in the absence of DNA, replacement of Polθ-cen with a short linker enables multimerization of the enzyme, even without DNA. This suggests that Polθ-cen suppresses Pol theta multimerization by masking protein-protein interactions. Yet, Pol theta incubation with ssDNA or large double-strand DNA triggers its multimerization, indicating that DNA binding enables a conformational change in Polθ-cen that facilitates Pol theta multimerization. ssDNA substantially increases Polθ-hel oligomerization, suggesting that Polθ-hel contributes to DNA-dependent Pol theta multimerization (Fig. 5-4g). Hence, Polθ-hel and Polθ-cen appear to differentially regulate Pol theta oligomerization.

In summary, our data reveal the molecular basis of MMEJ by full-length human Pol theta and identify major regulatory functions for the helicase and central domains with regard to both Pol theta structure and function. Polymerase-helicase tethering is responsible for the most distinctive characteristics of Pol theta. For example, this molecular architecture is essential for stimulating MMEJ of long ssDNA, which models resected DSBs. Hence, our findings likely explain the selective pressure for this unique polymerase-helicase fusion protein at the molecular level. Finally, the ability of Pol theta multimers to promote the accumulation of DNA greater than 1 kb in length may explain how this multifunctional enzyme facilitates chromosome translocations that are commonly observed in cancer cells.

BIBLIOGRAPHY

1. Watson JD, Crick FHC. Molecular structure of nucleic acids: A Structure for

deoxyribose nucleic acid. In: 50 Years of DNA. 2016.

2. Friedberg EC. DNA damage and repair. Nature. 2003;421.

3. Caldecott KW. DNA single-strand break repair. Exp Cell Res. 2014;

4. Ceccaldi R, Rondinelli B, D’Andrea AD. Repair Pathway Choices and

Consequences at the Double-Strand Break. Trends Cell Biol. 2016;26(1):52–64.

5. Seol J-H, Shim EY, Lee SE. Microhomology-mediated end joining: Good, bad and

ugly. Mutat Res. 2018 May;809:81–7.

6. Aguilera A, García-Muse T. Causes of Genome Instability. Annu Rev Genet.

2013;47(1):1–32.

7. Valerie K, Povirk LF. Regulation and mechanisms of mammalian double-strand

break repair. Oncogene. 2003;

8. Aparicio T, Baer R, Gautier J. DNA double-strand break repair pathway choice

and cancer. DNA Repair (Amst). 2014;19:169–75.

9. Bhargava R, Onyango DO, Stark JM. Regulation of Single-Strand Annealing and

its Role in Genome Maintenance. Trends Genet. 2016;32(9):566–75.

10. Chapman JR, Taylor MRG, Boulton SJ. Playing the End Game: DNA Double-

Strand Break Repair Pathway Choice. Molecular Cell. 2012.

11. Thyme SB, Schier AF. Polq-Mediated End Joining Is Essential for Surviving DNA Double-Strand Breaks during Early Zebrafish Development. Cell Rep. 2016;15:707–14.

12. Kakarougkas A, Jeggo PA. DNA DSB repair pathway choice: an orchestrated

handover mechanism. J Radiol. 2014;87.

13. Lieber MR, Wilson TE. Nonhomologous DNA End Joining (NHEJ). Cell.

2010;142(3):451–5.

14. Shrivastav M, De Haro LP, Nickoloff JA. Regulation of DNA double-strand break repair pathway choice. Cell Res. 2008;18(1):134–47.

15. Mehta A, Haber JE. Sources of DNA Double-Strand Breaks and Models of Rec.

Cold Spring Harb Perspect Biol. 2014;6:1–19.

16. Prakash R, Zhang Y, Feng W, Jasin M. Homologous Recombination and Human

Health. Perspect Biol. 2015;1–29.

17. Ramsden DA. Polymerases in Nonhomologous End Joining: Building a Bridge

over Broken Chromosomes. Antioxid Redox Signal. 2011;14(12).

18. Keeney S. Spo11 and the Formation of DNA Double-Strand Breaks in Meiosis.

Genome Dyn Stab. 2008;2:81–123.

19. Helmink BA, Sleckman BP. The Response to and Repair of RAG-Mediated DNA

Double Stranded Breaks. Annu Rev Immunol. 2012;30:175–202.

20. Kass EM, Jasin M. Collaboration and competition between DNA double-strand

break repair pathways. FEBS Lett. 2010;584(17):3703–8.

21. van Schendel R, van Heteren J, Welten R, Tijsterman M. Genomic Scars

Generated by Polymerase Theta Reveal the Versatile Mechanism of Alternative

End-Joining. 2016;12(10).

22. Hoppe MM, Sundar R, Tan DSP, Jeyasekharan AD. Biomarkers for Homologous

Recombination Deficiency in Cancer. JNCI Natl Cancer Inst. 2018;110(7):704–13.

23. Baskar R, Dai J, Wenlong N, Yeo R, Yeoh W, Giron MC. Biological response of

cancer cells to radiation treatment. Front Mol Biosci. 2014;1.

24. Pearce A, Haas M, Viney R, Pearson S-A, Haywood P, Brown C, et al. Incidence

and severity of self-reported chemotherapy side effects in routine care: A

prospective cohort study. PLoS One. 2017;

25. Plenderleith IH. Treating the Treatment: Toxicity of Cancer Chemotherapy. Clin

Pract. 1990;36:1827–30.

26. Symington LS, Gautier J. Double-Strand Break End Resection and Repair Pathway

Choice. Annu Rev Genet. 2011;

27. Ellen Moynahan M, Jasin M. Mitotic homologous recombination maintains

genomic stability and suppresses tumorigenesis. Nat Rev Mol Cell Biol. 2010;11.

28. Truong LN, Li Y, Shi LZ, Yi-Hwa Hwang P, He J, Wang H, et al. Microhomology-mediated End Joining and Homologous Recombination share the initial end resection step to repair DNA double-strand breaks in mammalian cells. Proc Natl Acad Sci USA. 2013;

29. Chen C-C, Feng W, Lim PX, Kass EM, Jasin M. Homology-Directed Repair and the Role of BRCA1, BRCA2, and Related Proteins in Genome Integrity and Cancer. Annu Rev Cancer Biol. 2018;

30. Wyman C, Ristic D, Kanaar R. Homologous recombination-mediated double-

strand break repair. DNA Repair. 2004.

31. Mazón G, Mimitou EP, Symington LS. SnapShot: Homologous Recombination in

DNA Double-Strand Break Repair. Cell. 2010;142.

32. Krejci L, Altmannova V, Spirek M, Zhao X. Homologous recombination and its

regulation. Nucleic Acids Res. 2012;40(13):5795–818.

33. Kim KP, Mirkin E V. So similar yet so different: The two ends of a double strand

break. Mutation Research - Fundamental and Molecular Mechanisms of

Mutagenesis 2018 p. 70–80.

34. Li X, Heyer W-D. Homologous recombination in DNA repair and DNA damage

tolerance. Cell Res. 2008;18(1):99–113.

35. Helleday T. Homologous recombination in cancer development, treatment and

development of drug resistance. Carcinogenesis. 2010;31(6):955–60.

36. King TA, Li W, Brogi E, Yee CJ, Gemignani ML, Olvera N, et al. Heterogenic

Loss of the Wild-Type BRCA Allele in Human Breast Tumorigenesis. Ann Surg

Oncol. 2007 Sep 10;14(9):2510–8.

37. Kuchenbaecker KB, Hopper JL, Barnes DR, Phillips KA, Mooij TM, Roos-Blom

MJ, et al. Risks of breast, ovarian, and contralateral breast cancer for BRCA1 and

BRCA2 mutation carriers. JAMA - J Am Med Assoc. 2017;317(23):2402–16.

38. Perets R, Wyant GA, Muto KW, Bijron JG, Poole BB, Chin KT, et al.

Transformation of the Fallopian Tube Secretory Epithelium Leads to High-grade

Serous Ovarian Cancer in Brca;Tp53;Pten Models. Cancer Cell. 2013;24(6):751–

65.

39. Ceccaldi R, Liu JC, Amunugama R, Hajdu I, Primack B, Petalcorin MIR, et al.

Homologous-recombination-deficient tumours are dependent on PolQ-mediated

repair. Nature. 2014;518.

40. Cruz-García A, López-Saavedra A, Huertas P. BRCA1 Accelerates CtIP-Mediated DNA-End Resection. Cell Rep. 2014;9(2):451–9.

41. Anand R, Ranjha L, Cannavo E, Cejka P. Phosphorylated CtIP Functions as a Co-factor of the MRE11-RAD50-NBS1 Endonuclease in DNA End Resection. Mol Cell. 2016;64(5):940–50.

42. Fell VL, Schild-Poulter C. The Ku heterodimer: Function in DNA repair and

beyond. Mutat Res. 2015;763:15–29.

43. Sun J, Lee K-J, Davis AJ, Chen DJ. Human Ku70/80 Protein Blocks Exonuclease

1-mediated DNA Resection in the Presence of Human Mre11 or Mre11/Rad50

Protein Complex. J Biol Chem. 2011;287(7):4936–45.

44. Langerak P, Mejia-Ramirez E, Limbo O, Russell P. Release of Ku and MRN from

DNA Ends by Mre11 Nuclease Activity and Ctp1 Is Required for Homologous

Recombination Repair of Double-Strand Breaks. PLoS Genet. 2011;7(9).

45. Myler LR, Gallardo IF, Soniat MM, Kim Y, Paull TT, Finkelstein IJ. Single-Molecule Imaging Reveals How Mre11-Rad50-Nbs1 Initiates DNA Break Repair. Mol Cell. 2017;67:891–8.

46. Chanut P, Britton S, Coates J, Jackson SP, Calsou P. Coordinated nuclease

activities counteract Ku at single-ended DNA double-strand breaks. Nat Commun.

2017;8:15917.

47. Lamarche BJ, Orazio NI, Weitzman MD. The MRN complex in double-strand

break repair and telomere maintenance. FEBS Lett. 2010;584(17):3682–95.

48. Quennet V, Beucher A, Barton O, Takeda S, Löbrich M. CtIP and MRN promote non-homologous end-joining of etoposide-induced DNA double-strand breaks in G1. Nucleic Acids Res. 2011;39(6):2144–52.

49. Runge KW, Li Y. A curious new role for MRN in Schizosaccharomyces pombe

non-homologous end-joining. 2018 Apr 10;(2):359–64.

50. Czornak K, Chughtai S, Chrzanowska KH. Mystery of DNA repair: the role of the

MRN complex and ATM kinase in DNA damage repair. 2008 Dec;49(4):383–96.

51. Williams GJ, Lees-Miller SP, Tainer JA. Mre11–Rad50–Nbs1 conformations and

the control of sensing, signaling, and effector responses at DNA double-strand

breaks. DNA Repair (Amst). 2010 Dec 10;9(12):1299–306.

52. Panier S, Boulton SJ. Double-strand break repair: 53BP1 comes into focus. Nat Rev Mol Cell Biol. 2013;15(7).

53. Cruz-García A, López-Saavedra A, Huertas P. BRCA1 Accelerates CtIP-Mediated DNA-End Resection. Cell Rep. 2014;9:451–9.

54. Isono M, Niimi A, Oike T, Hagiwara Y, Sato H, Sekine R, et al. BRCA1 Directs

the Repair Pathway to Homologous Recombination by Promoting 53BP1

Dephosphorylation. Cell Rep. 2017;18(2):520–32.

55. Garcia V, Phelps SEL, Gray S, Neale MJ. Bidirectional resection of DNA double-

strand breaks by Mre11 and Exo1. Nature. 2011 Nov 16;479:241–4.

56. Daley JM, Jimenez-Sainz J, Wang W, Nguyen KA, Jensen RB, Sung P.

Enhancement of BLM-DNA2-Mediated Long-Range DNA End Resection by

CtIP. Cell Rep. 2017;21:324–32.

57. Sturzenegger A, Burdova K, Kanagaraj R, Levikova M, Pinto C, Cejka P, et al.

DNA2 Cooperates with the WRN and BLM RecQ Helicases to Mediate Long-

range DNA End Resection in Human Cells *. J Biol Chem. 2014;289(39):27314–

26.

58. Venkitaraman AR. Functions of BRCA1 and BRCA2 in the biological response to

DNA damage. J Cell Sci. 2001;114:3591–8.

59. Mateos-Gomez PA, Kent T, Deng SK, Mcdevitt S, Kashkina E, Hoang TM, et al.

The helicase domain of Polθ counteracts RPA to promote alt-NHEJ. Nat Struct

Mol Biol. 2017;24(12):1116–23.

60. Sy SMH, Huen MSY, Chen J, Livingston DM. PALB2 is an integral component of

the BRCA complex required for homologous recombination repair. Proc Natl

Acad Sci USA. 2009;28:7155–60.

61. Buisson R, Dion-Côté A-M, Coulombe Y, Launay H, Cai H, Stasiak AZ, et al.

Cooperation of breast cancer proteins PALB2 and piccolo BRCA2 in stimulating

homologous recombination. Nat Struct Mol Biol. 2010;17(10):1247–54.

62. Sfeir A, Symington LS. Microhomology-Mediated End Joining: A Back-up

Survival Mechanism or Dedicated Pathway? Trends Biochem Sci.

2015;40(11):701–14.

63. Tombline G, Fishel R. Biochemical Characterization of the Human RAD51

Protein. J Biol Chem. 2002;277(17):14417–25.

64. Wiese C, Hinz JM, Tebbs RS, Nham PB, Urbin SS, Collins DW, et al. Disparate

requirements for the Walker A and B ATPase motifs of human RAD51D in

homologous recombination. Nucleic Acids Res. 2006;34(9):2833–43.

65. Kelso AA, Goodson SD, Temesvari LA, Sehorn MG. Data on Rad51 amino acid

sequences from higher and lower eukaryotic model organisms and parasites. Data

Br. 2017;10:364–8.

66. Chen J, Villanueva N, Rould MA, Morrical SW. Insights into the mechanism of

Rad51 recombinase from the structure and properties of a filament interface

mutant.

67. Morrison C, Shinohara A, Sonoda E, Yamaguchi-Iwai Y, Takata M, Weichselbaum RR, et al. The Essential Functions of Human Rad51 Are Independent of ATP Hydrolysis. Mol Cell Biol. 1999;19.

68. Tsuzuki T, Fujii Y, Sakumi K, Tominaga Y, Nakao K, Sekiguchi M, et al.

Targeted disruption of the Rad51 gene leads to lethality in embryonic mice. Proc

Natl Acad Sci. 1996 Jun 25;93(13):6236–40.

69. Lim D-S, Hasty P. A Mutation in Mouse rad51 Results in an Early Embryonic

Lethal That Is Suppressed by a Mutation in p53. Mol Cell Biol. 1996;16(12):7133–

43.

70. Lieber MR, Wilson TE. SnapShot: Nonhomologous DNA End Joining (NHEJ).

71. Davis AJ, Chen DJ. DNA double strand break repair via non-homologous end-

joining. Transl Cancer Res. 2013;

72. Lieber MR, Gu J, Lu H, Shimazaki N, Tsai AG. Nonhomologous DNA end joining

(NHEJ) and chromosomal translocations in humans. Subcell Biochem.

2014;50:279–96.

73. Deriano L, Roth DB. Modernizing the Nonhomologous End-Joining Repertoire:

Alternative and Classical NHEJ Share the Stage. Annu Rev Genet.

2013;47(1):433–55.

74. Bétermier M, Bertrand P, Lopez BS. Is Non-Homologous End-Joining Really an Inherently Error-Prone Process? PLoS Genet. 2014;10(1).

75. Xie A, Kwok A, Scully R. Role of mammalian Mre11 in classical and alternative

nonhomologous end joining. Nat Struct Mol Biol. 2009;16(8):814–8.

76. Davis AJ, Chen BPCC, Chen DJ. DNA-PK: A dynamic enzyme in a versatile DSB

repair pathway. DNA Repair (Amst). 2014;17:21–9.

77. Kabotyanski EB, Gomelsky L, Han J-O, Stamato TD, Roth DB. Double-strand

break repair in Ku86-and XRCC4-deficient cells. Nucleic Acids Res.

1998;26(23):5333–42.

78. Yano K-I, Morotomi-Yano K, Akiyama H. Cernunnos/XLF: A new player in DNA

double-strand break repair. Int J Biochem Cell Biol. 2009;41:1237–40.

79. Yamtich J, Sweasy JB. DNA polymerase Family X: Function, structure, and

cellular roles. Biochim Biophys Acta. 2010;1804:1136–50.

80. Chang HHY, Pannunzio NR, Adachi N, Lieber MR. Non-homologous DNA end

joining and alternative pathways to double-strand break repair. Nat Rev Mol Cell

Biol. 2017;18(8):495–506.

81. Blackford AN, Jackson SP. ATM, ATR, and DNA-PK: The Trinity at the Heart of

the DNA Damage Response. Molecular Cell Jun, 2017 p. 801–17.

82. Gao Y, Sun Y, Frank KM, Dikkes P, Fujiwara Y, Seidl KJ, et al. A Critical Role

for DNA End-Joining Proteins in Both Lymphogenesis and Neurogenesis. Cell.

1998;95:891–902.

83. Frank KM, Sharpless NE, Gao Y, Sekiguchi JM, Ferguson DO, Zhu C, et al. DNA

Ligase IV Deficiency in Mice Leads to Defective Neurogenesis and Embryonic

Lethality via the p53 Pathway. 2000.

84. Gao Y, Ferguson DO, Xie W, Manis JP, Sekiguchi JA, Frank KM, et al. Interplay

of p53 and DNA-repair protein XRCC4 in tumorigenesis, genomic stability and

development. Nature. 2000;404(6780):897–900. 100

85. Rooney S, Chaudhuri J, Alt FW. The role of the non-homologous end-joining

pathway in lymphocyte development. Immunol Rev. 2004;200:115–31.

86. Lim DS, Vogel H, Willerford DM, Sands AT, Driscoll O’, Cerosaletti M, et al.

NHEJ Deficiency and Disease. Vol. 20, Mol. Cell. Biol. 2001.

87. Sharpless NE, Ferguson DO, O’Hagan RC, Castrillon DH, Lee C, Farazi PA, et al.

Impaired Nonhomologous End-Joining Provokes Soft Tissue Sarcomas Harboring

Chromosomal Translocations, Amplifications, and Deletions. 2001.

88. O’driscoll M, Cerosaletti KM, Girard P-M, Dai Y, Stumm M, Kysela B, et al.

DNA Ligase IV Mutations Identified in Patients Exhibiting Developmental Delay

and Immunodeficiency. Mol Cell. 2001;8:1175–85.

89. Ghezraoui H, Piganeau M, Renouf B, Renaud J-B, Sallmyr A, Ruis B, et al.

Chromosomal Translocations in Human Cells Are Generated by Canonical

Nonhomologous End-Joining. Mol Cell. 2014;55:829–42.

90. Deng SK, Gibb B, Justino De Almeida M, Greene EC, Symington LS. RPA

antagonizes microhomology-mediated repair of DNA double-strand breaks. Nat

Struct Mol Biol. 2014;21(4).

91. Chandramouly G, McDevitt S, Sullivan K, Kent T, Luz A, Glickman JF, et al.

Small-Molecule Disruption of RAD52 Rings as a Mechanism for Precision

Medicine in BRCA-Deficient Cancers. Chem Biol. 2015;22(11):1491–504.

92. Miller A. Polymerase Delta is Required for the Removal of Short Non-

homologous DNA Flaps During Single-Strand Annealing Repair. 2018. 101

93. Liang F, Jasin M. Ku80-deficient Cells Exhibit Excess Degradation of

Extrachromosomal DNA. J Biol Chem. 1996;271(24):14405–11.

94. Boulton SJ, Jackson SP. Saccharomyces cerevisiae Ku7O potentiates illegitimate

DNA double-strand break repair and serves as a barrier to error-prone DNA repair

pathways. EMBO J. 1996;15(18):5093–103.

95. Black SJ, Kashkina E, Kent T, Pomerantz RT. DNA polymerase θ: A unique

multifunctional end-joining machine. Genes (Basel). 2016;7(9).

96. Kent T, Chandramouly G, Mcdevitt SM, Ozdemir AY, Pomerantz RT. Mechanism

of microhomology-mediated end-joining promoted by human DNA polymerase θ.

Nat Struct Mol Biol. 2015;22(3):230–7.

97. Yousefzadeh MJ, Wyatt DW, Takata K, Mu Y, Hensley SC, Tomida J, et al.

Mechanism of Suppression of Chromosomal Instability by DNA Polymerase

POLQ. PLoS Genet. 2014;10(10).

98. Wyatt DW, Feng W, Conlin MP, Wood RD, Gupta GP, Ramsden Correspondence

DA. Essential Roles for Polymerase θ-Mediated End Joining in the Repair of

Chromosome Breaks. Mol Cell. 2016;63:662–73.

99. Howard SM, Yanez DA, Stark JM. DNA Damage Response Factors from Diverse

Pathways, Including DNA Crosslink Repair, Mediate Alter-native End Joining.

PLoS Genet. 2015;11(1):1004943.

100. Biehs R, Steinlage M, Barton O, Shibata A, Jeggo PA, Lö Brich Correspondence

M, et al. DNA Double-Strand Break Resection Occurs during Non-homologous 102

End Joining in G1 but Is Distinct from Resection during Homologous

Recombination. Mol Cell. 2017;65(4):671–84.

101. Löbrich M, Jeggo P. A Process of Resection-Dependent Nonhomologous End

Joining Involving the Goddess Artemis. Trends Biochem Sci. 2017;42(9):690–

701.

102. Yun MH, Hiom K. CtIP-BRCA1 modulates the choice of DNA double-strand

break repair pathway throughout the cell cycle. Nature. 2009;459(7245):460–3.

103. Yanagida M. The Role of Model Organisms in the History of Mitosis Research.

Cold Spring Harb Perspect Biol. 2014;

104. Buttler A, Gavazov K, Peringer A, Siehoff S, Mariotte P, Wettstein JB, et al. An

integrated encyclopedia of DNA elements in the human genome. Agrar Schweiz.

2012;(7–8):346–53.

105. Prak ETL, Jr HHK. Mobile Elements and the Human Genome. Nat Rev Genet.

2000;1(November).

106. Kellis M, Wold B, Snyder MP, Bernstein BE, Kundaje A, Marinov GK, et al.

Defining functional DNA elements in the human genome. Proc Natl Acad Sci

USA. 2014;111(17):6131–8.

107. Orr AH. The population genetics of adaptation: the adaptation of dna sequences.

Evolution (N Y). 2002;56(7):1317–30.

108. Baccarelli A, Bollati V. Epigenetics and environmental chemicals. Curr Opin

Pediatr. 2009;21(2):243–51. 103

109. Friedberg à EC, McDaniel LD, Schultz RA, Werb Z, Evan G. The role of

endogenous and exogenous DNA damage and mutagenesis. Curr Opin Genet Dev.

2004;14:5–10.

110. Budhavarapu VN, Chavez M, Tyler JK. How is epigenetic information maintained

through DNA replication? Epigenetics Chromatin. 2013;6(32).

111. Dabin J, Fortuny A, Polo SE. Epigenome maintenance in response to DNA

damage. Mol Cell. 2016;62(5):712–27.

112. Hanahan D, Weinberg RA. Hallmarks of Cancer: The Next Generation. Cell.

2011;144:646–74.

113. Ló Pez-Otín C, Blasco MA, Partridge L, Serrano M, Kroemer G. The Hallmarks of

Aging. Cell. 2013;153:1194–217.

114. Kass EM, Moynahan ME, Jasin M. When Genome Maintenance Goes Badly

Awry. Mol Cell. 2016;62(5):777–87.

115. Roberts SA, Gordenin DA. Hypermutation in human cancer genomes: footprints

and mechanisms. Nat Rev Cancer. 2014;14(12):786–800.

116. Campbell BB, Light N, Fabrizio D, Zatzman M, Fuligni F, de Borja R, et al.

Comprehensive Analysis of Hypermutation in Human Cancer. Cell.

2017;171(5):1042–1056.e10.

117. Franchitto A. Genome Instability at Common Fragile Sites: Searching for the

Cause of Their Instability. Biomed Res Int. 2013;2013.

104

118. Koole W, Van Schendel R, Karambelas AE, Van Heteren JT, Okihara KL,

Tijsterman M. A Polymerase Theta-dependent repair pathway suppresses

extensive genomic instability at endogenous G4 DNA sites. Nat Commun. 2014;5.

119. Rhodes D, Lipps HJ. G-quadruplexes and their regulatory roles in biology. Nucleic

Acids Res. 2015;43(18):8627–37.

120. Hande M. DNA repair factors and telomere-chromosome integrity in mammalian

cells Request reprints from M. Cytogenet Genome Res. 2004;104:116–22.

121. Fairman-Williams ME, Guenther U-P, Jankowsky E. SF1 and SF2 helicases:

family matters. Curr Opin Struct Biol. 2010;20(3):313–24.

122. Singleton MR, Dillingham MS, Wigley DB. Structure and Mechanism of

Helicases and Nucleic Acid Translocases. Annu Rev Biochem. 2007;76:23–50.

123. Jain R, Aggarwal AK, Rechkoblit O. Eukaryotic DNA polymerases. Curr Opin

Struct Biol. 2018;53:77–87.

124. Bernstein KA, Gangloff S, Rothstein R. The RecQ DNA Helicases in DNA

Repair. Annu Rev Genet. 2010;

125. Hickson ID. RecQ helicases: caretakers of the genome. Nat Rev Cancer. 2003;3.

126. Chu WK, Hickson ID. RecQ helicases: Multifunctional genome caretakers. Nature

Reviews Cancer 2009.

127. Brosh RM, Datta A. New Insights Into DNA Helicases as Druggable Targets for

Cancer Therapy. Cancer Ther Front Mol Biosci. 2018;5.

105

128. Mo D, Zhao Y, Balajee AS. Human RecQL4 helicase plays multifaceted roles in

the genomic stability of normal and cancer cells. Cancer Lett. 2018;413:1–10.

129. Sale JE, Lehmann AR, Woodgate R. Y-family DNA polymerases and their role in

tolerance of cellular DNA damage. Nat Rev Mol Cell Biol. 2012;13(3):141–52.

130. Seki M, Marini F, Wood RD. POLQ (Pol theta), a DNA polymerase and DNA-

dependent ATPase in human cells. Nucleic Acids Res. 2003 Nov;31(21):6117–26.

131. Malaby AW, Martin SK, Wood RD, Doublié S. Expression and Structural

Analyses of Human DNA Polymerase θ (POLQ). Methods Enzymol.

2017;592(802):103–21.

132. Kawamura K, Bahar R, Seimiya M, Chiyo M, Wada A, Okada S, et al. DNA

polymerase θ is preferentially expressed in lymphoid tissues and upregulated in

human cancers. Int J Cancer. 2004;109(1):9–16.

133. Lemée F, Bergoglio V, Fernandez-Vidal A, Machado-Silva A, Pillaire M-J, Bieth

A, et al. DNA polymerase θ up-regulation is associated with poor survival in breast

cancer, perturbs DNA replication, and promotes genetic instability. Proc Natl Acad

Sci USA. 2010;107(30):13390–5.

134. Allera-Moreau C, Rouquette I, Lepage B, Oumouhou N, Walschaerts M, Leconte

E, et al. DNA replication stress response involving PLK1, CDC6, POLQ, RAD51

and CLASPIN upregulation prognoses the outcome of early/mid-stage non-small

cell lung cancer patients. Oncogenesis. 2012;1(10):e30-10.

135. Shima N, Hartford SA, Duffy T, Wilson LA, Schimenti KJ, Schimenti JC. 106

Phenotype-Based Identification of Mouse Chromosome Instability Mutants. 2003.

136. Shima N, Munroe RJ, Schimenti JC. The Mouse Genomic Instability Mutation

chaos1 Is an Allele of Polq That Exhibits Genetic Interaction with Atm. Mol Cell

Biol. 2004;24(23):10381–9.

137. Wood RD, Doublié S. Mini review DNA polymerase (POLQ), double-strand break

repair, and cancer. DNA Repair (Amst). 2016;44:22–32.

138. Helleday T. The underlying mechanism for the PARP and BRCA synthetic

lethality: Clearing up the misunderstandings. Mol Oncol. 2011;5(4):387–93.

139. Brown JS, O’Carrigan B, Jackson SP, Yap TA. Targeting DNA repair in cancer:

Beyond PARP inhibitors. Vol. 7, Cancer Discovery. 2017.

140. Higgins GS, Boulton SJ. Beyond PARP—POL Theta as an anticancer target. Vol.

359, Science. 2018.

141. Brown JS, O’carrigan B, Jackson SP, Yap TA, O’Carrigan B, Jackson SP, et al.

Targeting DNA Repair in Cancer: Beyond PARP Inhibitors. Cancer Discov.

2017;7(1):20–37.

142. Villanueva MT. DNA repair: A new tool to target DNA repair. Nat Rev Cancer.

2015;15(3):136–136.

143. Zan H, Shima N, Xu Z, Al-Qahtani A, Iii AJE, Zhong Y, et al. The translesion

DNA polymerase theta plays a dominant role in immunoglobulin gene somatic

hypermutation. EMBO J. 2005;24:3757–69.

107

144. Lange SS, Takata KI, Wood RD. DNA polymerases and cancer. Nature Reviews

Cancer 2011.

145. Maloisel L, Fabre F, Gangloff S. DNA Polymerase delta Is Preferentially

Recruited during Homologous Recombination To Promote Heteroduplex DNA

Extension. Mol Cell Biol. 2008;28(4):1373–82.

146. Copeland WC, Longley MJ. DNA polymerase gamma in mitochondrial DNA

replication and repair. TheScientificWorldJournal. 2003.

147. Kent T, Rusanov TD, Hoang TM, Velema WA, Krueger AT, Copeland WC, et al.

DNA polymerase θ specializes in incorporating synthetic expanded-size (xDNA)

nucleotides. Nucleic Acids Res. 2016;44(19):9381–92.

148. Vaisman A, Woodgate R. Translesion DNA Polymerases , Eukaryotic.

2004;4:247–50.

149. Takata K-I, Reh S, Yousefzadeh MJ, Zelazowski MJ, Bhetawal S, Trono D, et al.

Analysis of DNA polymerase ν function in meiotic recombination,

immunoglobulin class-switching, and DNA damage tolerance. PLoS Genet. 2017;

150. Byrd AK, Raney KD. Superfamily 2 helicases. 2013.

151. Lohman TM, Tomko EJ, Wu CG. Non-hexameric DNA helicases and

translocases: Mechanisms and regulation. Vol. 9, Nature Reviews Molecular Cell

Biology. 2008. p. 391–401.

152. Büttner K, Nehring S, Hopfner K-P. Structural basis for DNA duplex separation

by a superfamily-2 helicase. Nat Struct Mol Biol. 2007;14(7):647–52. 108

153. Manthei KA, Hill MC, Burke JE, Butcher SE, Keck JL, Designed JLK, et al.

Structural mechanisms of DNA binding and unwinding in bacterial RecQ

helicases. Proc Natl Acad Sci USA. 2014;

154. Singh DK, Ghosh AK, Croteau DL, Bohr VA. RecQ helicases in DNA double

strand break repair and telomere maintenance. Mutation Research - Fundamental

and Molecular Mechanisms of Mutagenesis 2012 p. 15–24.

155. Hanada K, Hickson ID. Molecular genetics of RecQ helicase disorders. Cellular

and Molecular Life Sciences 2007.

156. Beyer DC, Ghoneim MK, Spies M. Structure and mechanisms of SF2 DNA

helicases. Adv Exp Med Biol. 2013;

157. Guilliam TA, Keen BA, Brissett NC, Doherty AJ. -polymerases are a

functionally diverse superfamily of replication and repair enzymes. Nucleic Acids

Res. 2015;43(14):6651–64.

158. Leonhardt EA, Henderson DS, Rinehart JE, Boyd JB. Characterization of the

mus308 Gene in Drosophila melanogaster. 1993.

159. Harris P V, Mazina OM, Leonhardt EA, Case RB, Boyd JB, Burtis KC. Molecular

Cloning of Drosophila mus308, a Gene Involved in DNA Cross-Link Repair with

Homology to Prokaryotic DNA Polymerase I Genes. Vol. 16, MOLECULAR

AND CELLULAR BIOLOGY. 1996.

160. Boyd JB, Sakaguchi’ K, Harris P V. mus308 Mutants of Drosophila Exhibit

Hypersensitivity to DNA Cross-Linking Agents and Are Defective in a 109

Deoxyribonuclease. 1990.

161. Yousefzadeh MJ, Wood RD. DNA polymerase POLQ and cellular defense against

DNA damage. DNA Repair (Amst). 2013;12:1–9.

162. Chan SH, Yu AM, Mcvey M. Dual Roles for DNA Polymerase Theta in

Alternative End-Joining Repair of Double-Strand Breaks in Drosophila. PLoS

Genet. 2010;6(7):1001005.

163. Kent T, Mateos-Gomez PA, Sfeir A, Pomerantz RT. Polymerase θ is a robust

terminal transferase that oscillates between three different mechanisms during end-

joining. Elife. 2016;5(JUN2016):1–25.

164. Derbyshire V, Grindley NDF, Joyce’ CM. The 3’-5’ exonuclease of DNA

polymerase I of Escherichia coli: contribution of each amino acid at the active site

to the reaction. Vol. 10, The EMBO Journal. 1991.

165. Newman JA, Cooper CDO, Aitkenhead H, Gileadi Correspondence O. Structure of

the Helicase Domain of DNA Polymerase Theta Reveals a Possible Role in the

Microhomology-Mediated End-Joining Pathway. Structure. 2015;23.

166. Beagan K, Mcvey M. Linking DNA polymerase theta structure and function in

health and disease. Cell Mol Life Sci. 2016;73.

167. Van Kregten M, De Pater S, Romeijn R, Van Schendel R, Hooykaas PJJJ,

Tijsterman M. T-DNA integration in plants results from polymerase-θ-mediated

DNA repair. Nat Plants. 2016;2.

168. Mateos-Gomez PA, Gong F, Nair N, Miller KM, Lazzerini-Denchi E, Sfeir A. 110

Mammalian polymerase θ promotes alternative NHEJ and suppresses

recombination. Nature. 2015 Feb 12;518(7538):254–7.

169. Hogg M, Seki M, Wood RD, Doublié S, Wallace SS. Lesion Bypass Activity of

DNA Polymerase θ (POLQ) Is an Intrinsic Property of the Pol Domain and

Depends on Unique Sequence Inserts. J Mol Biol. 2011;405:642–52.

170. Yoon J-H, Choudhury JR, Park J, Prakash S, Prakash L. A Role for DNA

Polymerase in Promoting Replication through Oxidative DNA Lesion, Thymine

Glycol, in Human Cells. J Biol Chem. 2014;289(19):13177–85.

171. Seki M, Wood RD. DNA polymerase theta (POLQ) can extend from mismatches

and from bases opposite a (6-4) photoproduct. DNA Repair (Amst). 2008;7:119–

27.

172. Arana ME, Seki M, Wood RD, Rogozin IB, Kunkel TA. Low-fidelity DNA

synthesis by human DNA polymerase theta. Nucleic Acids Res.

2008;36(11):3847–56.

173. Prasad R, Longley MJ, Sharief FS, Hou EW, Copeland WC, Wilson SH. Human

DNA polymerase theta possesses 5’-dRP lyase activity and functions in single-

nucleotide base excision repair in vitro. Nucleic Acids Res. 2009;37(6):1868–77.

174. Goodman MF, Woodgate R. Translesion DNA polymerases. Cold Spring Harb

Perspect Biol. 2013;

175. Masuda K, Ouchida R, Takeuchi A, Saito T, Koseki H, Kawamura K, et al. DNA

polymerase theta contributes to the generation of C/G mutations during somatic 111

hypermutation of Ig genes. Proc Natl Acad Sci USA. 2005;102(39):13986–91.

176. Masuda K, Ouchida R, Hikida M, Kurosaki T, Yokoi M, Masutani C, et al. DNA

Polymerases and Function in the Same Genetic Pathway to Generate Mutations at

A/T during Somatic Hypermutation of Ig Genes. J Biol Chem.

2007;282(24):17387–94.

177. Zelensky AN, Schimmel J, Kool H, Kanaar R, Tijsterman M. Inactivation of Pol θ

and C-NHEJ eliminates off-target integration of exogenous DNA. Nat Commun.

2017;8(1):1–7.

178. Saito S, Maeda R, Adachi N. Dual loss of human POLQ and LIG4 abolishes

random integration. Nat Commun. 2017;8(May):1–10.

179. Fernandez-Vidal A, Guitton-Sert L, Cadoret J-C, Drac M, Schwob E, Baldacci G,

et al. A role for DNA polymerase y in the timing of DNA replication. Nat

Commun. 2014;5.

180. Marini F, Wood RD. A Human DNA Helicase Homologous to the DNA Cross-

link Sensitivity Protein Mus308*. J Biol Chem. 2001;277(10):8716–23.

181. Khadka P, Croteau DL, Bohr VA. RECQL5 has unique strand annealing properties

relative to the other human RecQ helicase proteins. DNA Repair (Amst).

2016;37:53–66.

182. Campbell JL, Li H. Polθ helicase: drive or reverse. Nat Struct Mol Biol.

2017;24(12):1007–8.

183. Runyon GT, Lohman TM. Escherichia coli Helicase I1 (UvrD) Protein Can 112

Completely Unwind Fully Duplex Linear and Nicked Circular DNA*. Vol. 264.

1989.

184. Sollier J, Stork CT, García-Rubio ML, Paulsen RD, Aguilera AS, Cimprich KA.

Transcription-Coupled Nucleotide Excision Repair Factors Promote R-Loop-

Induced Genome Instability. Mol Cell. 2014;56:777–85.

185. Keskin H, Shen Y, Huang F, Patel M, Yang T, Ashley K, et al. Transcript-RNA-

templated DNA recombination and repair. Nature. 2014;515.

186. Aguilera AS, García-Muse T. R Loops: From Transcription Byproducts to Threats

to Genome Stability. Mol Cell. 2012;46:115–24.

187. Plosky BS. The Good and Bad of RNA:DNA Hybrids in Double-Strand Break

Repair. Mol Cell. 2016;64:643–4.

188. Ohle C, Tesorero R, Schermann G, Dobrev N, Sinning I, Fischer T. Transient

RNA-DNA Hybrids Are Required for Efficient Double-Strand Break Repair. Cell.

2016;167(4):1001–13.

189. Pages V, Fuchs R. Uncoupling of leading- and lagging-strand DNA replication

during lesion bypass in vivo. Science (80- ). 2003;300:1300–3.

190. Seki M, Marini F, Wood RD. POLQ (Pol theta), a DNA polymerase and DNA-

dependent ATPase in human cells. Nucleic Acids Res. 2003 Nov;31(21):6117–26.

191. Tafel AA, Wu L, Mchugh PJ. Human HEL308 Localizes to Damaged Replication

Forks and Unwinds Lagging Strand Structures. J Biol Chem. 2011;286(18):15832–

40. 113

192. Zhang L, Xu T, Maeder C, Bud L-O, Shanks J, Nix J, et al. Structural evidence for

consecutive Hel308-like modules in the spliceosomal ATPase Brr2. Nat Struct

Mol Biol. 2009;16(7):731–9.

193. Richards JD, Johnson KA, Liu H, Mcrobbie A-M, Mcmahon S, Oke M, et al.

Structure of the DNA Repair Helicase Hel308 Reveals DNA Binding and

Autoinhibitory Domains *. J Biol Chem. 2008;283(8):5118–26.

194. Croteau DL, Popuri V, Opresko PL, Bohr VA. Human RecQ Helicases in DNA

Repair, Recombination, and Replication. Annu Rev Biochem. 2014;83:519–52.

195. Takata K-I, Reh S, Tomida J, Person MD, Wood RD. Human DNA helicase

HELQ participates in DNA interstrand crosslink tolerance with ATR and RAD51

paralogs. Nat Commun. 2013;

196. Adelman CA, Lolo RL, Birkbak NJ, Murina O, Matsuzaki K, Horejsi Z, et al.

HELQ promotes RAD51 paralogue-dependent repair to avert germ cell loss and

tumorigenesis. Nature. 2013;502.

197. Popuri V, Huang J, Ramamoorthy M, Tadokoro T, Croteau DL, Bohr VA.

RECQL5 plays co-operative and complementary roles with WRN syndrome

helicase. Nucleic Acids Res. 2013;41(2):881–99.

198. Hu Y, Raynard S, Sehorn MG, Lu X, Bussen W, Zheng L, et al. RECQL5/Recql5

helicase regulates homologous recombination and suppresses tumor formation via

disruption of Rad51 presynaptic filaments. Genes Dev. 2007;21(23):3073–84.

199. Georgescu RE, Langston L, Yao NY, Yurieva O, Zhang D, Finkelstein J, et al. 114

Mechanism of asymmetric polymerase assembly at the eukaryotic replication fork.

Nat Struct Mol Biol. 2014;21(8):664–70.

200. Langston LD, Zhang D, Yurieva O, Georgescu RE, Finkelstein J, Yao NY, et al.

CMG helicase and DNA polymerase e form a functional 15-subunit holoenzyme

for eukaryotic leading-strand DNA replication. Proc Natl Acad Sci USA.

2014;111(43):15390–5.

201. Goswami P, Abid Ali F, Douglas ME, Locke J, Purkiss A, Janska A, et al.

Structure of DNA-CMG-Pol epsilon elucidates the roles of the non-catalytic

polymerase modules in the eukaryotic replisome. Nat Commun. 2018;9(1):5061.

202. Chilkova O, Jonsson B-H, Johansson E. The Quaternary Structure of DNA

Polymerase epsilon from Saccharomyces cerevisiae. J Biol Chem.

2003;278(16):14082–6.

203. West RW, Chen SM, Putz H, Butler G, Banerjee M. GAL1-GAL10 divergent

promoter region of Saccharomyces cerevisiae contains negative control elements

in addition to functionally separate and possibly overlapping upstream activating

sequences. Genes Dev. 1987;1(10):1118–31.

204. Choi E-S, Sohn J-H, Rhee S-K. Optimization of the expression system using

galactose-inducible promoter for the production of anticoagulant hirudin in

Saccharomyces cerevisiae. Appl Microbiol Biotechnol. 1994;42:587–94.

205. Sudbery PE. The expression of recombinant proteins in yeasts.

206. Han JY, Seo SH, Song JM, Lee H, Choi ES. High-level recombinant production of 115

squalene using selected Saccharomyces cerevisiae strains. J Ind Microbiol

Biotechnol. 2018;45(4):239–51.

207. Kojo H, Greenberg BD, Sugino A. Yeast 2-,um plasmid DNA replication in vitro:

Origin and direction. Proc Natl Acad Sci USA. 1981;78(12):7261–5.

208. Chan K-M, Liu Y-T, Ma C-H, Jayaram M, Soumitra S⇑. The 2 micron plasmid of

Saccharomyces cerevisiae: A miniaturized selfish genome with optimized

functional competence. Plasmid. 2013;70:2–17.

209. Pronk JT. Auxotrophic Yeast Strains in Fundamental and Applied Research. Appl

Environ Microbiol. 2002;68(5):2095–100.

210. Mumberg D, Mulier R, Funk M. Regulatable promoters of Saccharomyces

cerevisiae: comparison of transcriptional activity and their use for heterologous

expression. Vol. 22, Nucleic Acids Research. 1994.

211. Gary SL, Burgers PMJ. Identification of the fifth subunit of Saccharomyces

cerevisiae Replication Factor C. Nucleic Acids Res. 1995;23(24):4986–91.

212. Chen X, Zaro JL, Shen W-C. Fusion protein linkers: Property, design and

functionality ☆. Adv Drug Deliv Rev. 2013;65:1357–69.

213. Bai Y, Shen W-C. Improving the Oral Efficacy of Recombinant Granulocyte

Colony-Stimulating Factor and Transferrin Fusion Protein by Spacer Optimization.

Pharm Res. 2006;23(9).

214. Huston JS, Levinson D, Mudgett-Huntert M, Tai M-S, Novotn9t J, Margolies MN,

116

et al. Protein engineering of antibody binding sites: Recovery of specific activity in

an anti-digoxin single-chain Fv analogue produced in Escherichia coli. Proc Natl

Acad Sci USA. 1988;85:5879–83.

215. Hu W, Li F, Yang X, Li Z, Xia H, Li G, et al. A flexible peptide linker enhances

the immunoreactivity of two copies HBsAg preS1 (21-47) fusion protein. J

Biotechnol. 2004;107:83–90.

216. Hu L, Li Z, Cheng J, Rao Q, Gong W, Liu M, et al. Crystal Structure of TET2-

DNA Complex: Insight into TET-Mediated 5mC Oxidation. Cell. 2013;155:1545–

55.

217. Jones EW. Tackling the protease problem in Saccharomyces cerevisiae. Methods

Enzymol. 1991;194:428–53.

218. Gietz RD, Schiestl RH. Frozen competent yeast cells that can be transformed with

high efficiency using the LiAc/SS carrier DNA/PEG method. Nat Protoc.

2007;2(1):1–4.

219. Zahn KE, Averill AM, Aller P, Wood RD, Doublié S. Human DNA polymerase θ

grasps the primer terminus to mediate DNA repair. Nat Struct Mol Biol. 2015;

220. Ozdemir AY, Rusanov T, Kent T, Siddique LA, Pomerantz RT. Polymerase θ-

helicase efficiently unwinds DNA and RNA-DNA hybrids. J Biol Chem.

2018;293(14):5259–69.

221. He P, Yang W. Template and primer requirements for DNA Pol θ-mediated end

joining. Proc Natl Acad Sci USA. 2018;115(30):7747–52. 117

APPENDIX A

SUPPLEMENTARY FIGURES

Fig A-1. Polθ-helicase controls

A. Schematic of unwinding assay (left). Non-denaturing gel showing Polθ-helicase unwinding of the indicated DNA substrate in the presence of increasing amounts of unlabeled ssDNA trap (left gel). Non-denaturing gel showing Polθ-helicase unwinding of the indicated DNA substrate in the presence of unlabeled sequence-specific or non-homologous (NH) ssDNA trap (right gel). B. Non-denaturing gel showing lack of Polθ-helicase unwinding of the indicated DNA substrate. % unwinding is indicated. C. Non-denaturing gel showing lack of Polθ-helicase strand exchange or reversal of the indicated fork substrate. *, 32P.


APPENDIX B

YEAST CODON OPTIMIZED SEQUENCES

Yeast Codon Optimized Sequence of 3xFlag-Polθ∆cen


Yeast Codon Optimized Sequence of 3xFlag-PolθK121M


Yeast Codon Optimized Sequence of 3xFlag-GFP-Polθ


APPENDIX C

POL THETA ALIGNMENTS


Multiple sequence alignment of full length DNA Polymerase theta (Polθ) and selected homologues. Multiple sequence alignment was performed using Clustal Omega.

* = identical residues; : = residues sharing very similar properties; . = residues sharing some properties. Red, small and hydrophobic; Blue, acidic; Magenta, basic; Green, hydroxyl, sulfhydryl, amine. Motifs and subdomains are labeled in black; the central domain (residues 895–1791) is shown in orange.
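The color legend above can be expressed as a small lookup. The sketch below is illustrative only: the legend names the four classes but not their members, so the residue groupings used here are an assumption based on the color scheme commonly shown with Clustal Omega web output, and the names `COLOR_CLASSES` and `residue_color` are hypothetical.

```python
# Illustrative sketch; not part of the original analysis.
# ASSUMPTION: residue membership of each class follows the color scheme
# commonly displayed with Clustal Omega output. The legend in this appendix
# names the classes only.
COLOR_CLASSES = {
    "red": "AVFPMILW",     # small and hydrophobic
    "blue": "DE",          # acidic
    "magenta": "RK",       # basic
    "green": "STYHCNGQ",   # hydroxyl, sulfhydryl, amine
}


def residue_color(residue: str) -> str:
    """Return the legend color for a one-letter amino acid code.

    Gaps and any other symbols fall through to "uncolored".
    """
    code = residue.upper()
    if len(code) != 1:
        raise ValueError("expected a single one-letter residue code")
    for color, members in COLOR_CLASSES.items():
        if code in members:
            return color
    return "uncolored"
```

Under these assumed groupings, the K121M substitution named in Appendix B replaces a residue in the basic class (`residue_color("K")`) with one in the small and hydrophobic class (`residue_color("M")`).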


APPENDIX D

EXTENDED DATA FIGURES

[Figure D-1 image not reproduced. Panels: a, Polθ; b, PolθK121M; c, Polθ∆cen. Each panel shows a denaturing gel of primer extension on a 5'-labeled (*) primer-template; the primer and run-off product positions are marked.]

Fig. D-1 Determination of specific activities of WT Pol theta and Pol theta variants. a-c, Denaturing gels showing primer extension by the indicated proteins. Polθ-pol primer extension at the indicated concentrations was used as an activity standard. Increasing amounts of WT Pol theta, PolθK121M and Polθ∆cen were assayed and compared to Polθ-pol under identical conditions. * and ** represent specific activities equivalent to 5 nM and 10 nM of Polθ-pol, respectively.


[Figure D-2 image not reproduced. Labels recoverable from the panels: a, ATPase time courses for Polθ-hel and Polθ (32P-γ-ATP and 32Pi positions marked; % ATP hydrolysis indicated). b, MMEJ by Polθ on a 70 nt ssDNA with a 3'-terminal CCCGGG, with a 50-fold excess of cold SJB103 ssDNA; MMEJ and MMEJ intermediate positions marked. c, MMEJ time course (0–45 min) by Polθ-pol and Polθ∆cen on the 12 nt ssDNA 5'-GGTTAGCCCGGG-3', with a plot of % MMEJ versus time. d, Polθ-pol reactions on 12 nt ssDNA with microhomology (5'-GGTTAGCCCGGG-3') or without microhomology (5'-TCGTCGGAGTCT-3'); MMEJ and ssDNAx positions marked. e, MMEJ on 50 nt and 98 nt ssDNA, with bar charts of % MMEJ. f, Primer extension on a primer-template (denaturing gel; primer and run-off positions marked). g, MMEJ by WT Polθ on a 100 nt ssDNA with a 3'-terminal CCCGGG in the presence of AMP-PNP or ATP (0–32 min; % MMEJ indicated). ssDNA sequence shown in the figure: 5'-ATCTGCCAACTTTTCACACCATTTCGCAGATGGACAAAACAATAAAACAGAA-3'.]


Fig. D-2 Control assays for WT Pol theta and Pol theta variants. a, Thin layer chromatography plate image showing a time course of Pol theta and Polθ-hel ATPase activities in the presence of ssDNA. Lanes 8-10 demonstrate substantial 32P-γ-ATP hydrolysis over 29 hr. b, Pol theta MMEJ processivity assay. A 50-fold excess of the indicated unlabeled ssDNA was added after 2 min (right). Conversion of the intermediate MMEJ products to full-length MMEJ products following addition of the excess cold ssDNA demonstrates processive MMEJ after the initial extension step. c, Non-denaturing gel showing a time course of MMEJ by the indicated proteins on the indicated 12 nt ssDNA with microhomology (left). Plot of MMEJ time courses (right). d, Non-denaturing gel of Polθ-pol MMEJ reactions performed on 12 nt ssDNA with (left) and without (right) microhomology. e, Non-denaturing gels showing MMEJ by the indicated proteins on the indicated ssDNA (left). Bar charts showing % MMEJ. n = 3, ± s.d. (right). f, Denaturing gel showing primer extension by the indicated proteins. g, Non-denaturing gel showing MMEJ by Pol theta on the indicated 100 nt ssDNA in the presence of AMP-PNP and ATP. Reactions were terminated at the indicated times.


[Figure D-3 image not reproduced. Panel content recovered from the figure:]

a, Sequencing Polθ-pol ssDNAx products. 26 nt ssDNA is incubated with Polθ-pol, dNTPs and ATP; Polθ-pol is heat inactivated; ssDNA is purified; a 5' P-ssDNA handle (blue) is ligated to the 3' terminus; ssDNA is purified and the ligated ssDNA is PCR amplified; PCR products are cloned into TOPO-TA cloning vector and sequenced.

b, ssDNA sequence (5'-3')          ssDNAx
CACTGTGAGCTTAGGGTTAGCCCGGG         CTAGCAGTGA
CACTGTGAGCTTAGGGTTAGCCCGGG         TCA
CACTGTGAGCTTAGGGTTAGCCCGGG         TGA
CACTGTGAGCTTAGGGTTAGCCCGGG         T
CACTGTGAGCTTAGGGTTAGCCCGGG         T
CACTGTGAGCTTAGGGTTAGCCCGGG         T
CACTGTGAGCTTAGGGTTAGCCCGGG         T

c, Model schematic: Polθ-pol extension of the ssDNA in a first and a second extension step, generating ssDNAx (1-10 nt).

Fig. D-3 Sequencing of Polθ-pol ssDNA extension products. a, Schematic of method used for sequencing Polθ-pol ssDNAx products. b, Representative sequences of Polθ-pol ssDNAx products. Initial ssDNA substrate sequence (black). Nucleotides transferred to the 3' terminus of ssDNA by Polθ-pol (red; ssDNAx). c, Model of Polθ-pol ssDNAx via intrastrand pairing and extension. The model predicts how the first sequence in panel b is generated by Polθ-pol. Because Polθ-pol is highly promiscuous, various intrastrand non-canonical base-pairing options may lead to the observed ssDNAx events in panel b. Polθ-pol is also implicated in non-templated ssDNA extension, which may result in some of the observed ssDNAx events in panel b.
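As a rough illustration of how ssDNAx events such as those in panel b could be read out of sequencing results, the sketch below strips the known 26 nt substrate from each read and reports the nucleotides added at the 3' terminus. This is a minimal sketch and not the analysis used in this work: the function name `ssdnax_extension` is hypothetical, the reads are rebuilt from the substrate and ssDNAx sequences in panel b, and real reads would first need the ligated handle and vector sequence trimmed.

```python
# Illustrative sketch only; not the pipeline used in the dissertation.
# Substrate and extension sequences are taken from Fig. D-3, panel b.
SUBSTRATE = "CACTGTGAGCTTAGGGTTAGCCCGGG"  # 26 nt initial ssDNA substrate


def ssdnax_extension(read: str, substrate: str = SUBSTRATE) -> str:
    """Return the nucleotides added 3' of the substrate in a read.

    ASSUMPTION: the read has already been trimmed so that it starts at the
    5' end of the substrate and ends at the last transferred nucleotide.
    """
    if not read.startswith(substrate):
        raise ValueError("read does not begin with the substrate sequence")
    return read[len(substrate):]


# Reads rebuilt from panel b: substrate followed by each listed ssDNAx.
panel_b_extensions = ("CTAGCAGTGA", "TCA", "TGA", "T", "T", "T", "T")
reads = [SUBSTRATE + ext for ext in panel_b_extensions]

extensions = [ssdnax_extension(read) for read in reads]
lengths = [len(ext) for ext in extensions]
```

For the panel b sequences, the extension lengths fall between 1 and 10 nt, which matches the range given in the panel c schematic.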
