Chapter 30:

DNA Replication, Repair, and Recombination
1. DNA Replication: An Overview
2. Enzymes of Replication
3. Prokaryotic Replication
4. Eukaryotic Replication
5. Repair of DNA
6. Recombination and Mobile Genetic Elements
7. DNA Methylation and Trinucleotide Repeat Expansion

DNA Replication

• DNA double strand -> template for duplication (replication)
• Chemically similar to transcription
• As complex as translation, but the enzymes are present in only a few copies per cell

• Extremely accurate: 10^-10 mistakes/base
• Extremely regulated: only once per cell division

Action of DNA polymerase

• Template
• dNTPs
• 5’ -> 3’ direction
• Semi-conservative

Replication of DNA

Unwinding of dsDNA: - Rate in E. coli: 1000nt/sec - 100rev/sec (10bp/turn)
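The two rates on this slide are the same quantity in different units. A minimal sketch of the conversion, using only the slide values (the function and constant names are illustrative, not from the source):

```python
# Figures quoted on the slide for the E. coli replication fork.
FORK_RATE_NT_PER_S = 1000   # nucleotides unwound per second
BP_PER_HELICAL_TURN = 10    # base pairs per helical turn (slide value)

def unwinding_rev_per_s(rate_nt_per_s: float, bp_per_turn: float) -> float:
    """Helical turns that must be unwound each second at one fork."""
    return rate_nt_per_s / bp_per_turn

print(unwinding_rev_per_s(FORK_RATE_NT_PER_S, BP_PER_HELICAL_TURN))
```

With the slide values this gives the 100 rev/sec stated above.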

Negative supercoils are introduced by DNA gyrase (type II topoisomerase, ATP-dependent)

E. coli theta replication

• Autoradiogram
• Branch point called “replication fork”
• Unidirectional / bidirectional
• Prokaryotes and bacteriophages have only one replication origin

Unidirectional vs. bidirectional θ replication

[3H]thymidine pulse-labelling

Semidiscontinuous DNA replication

Okazaki fragments: discontinuous!
1000-2000 nt in prokaryotes
100-200 nt in eukaryotes
Joined by DNA ligase

Replication eye in Drosophila melanogaster DNA

Priming of DNA synthesis by short RNA segments

E. coli: RNA polymerase, rifampicin-sensitive

Removal of RNA primers

2. Enzymes of replication

DNA replication requires (in order of appearance):
1. DNA topoisomerase
2. Helicase
3. ssDNA binding proteins
4. RNA primer synthesis (primase)
5. DNA polymerase
6. Enzyme to remove RNA primers
7. DNA ligase to link Okazaki fragments

E. coli DNA polymerase I in complex with a dsDNA

Arthur Kornberg, 1957

DNA Polymerase I
5’->3’ synthesis
Processive, 20 nt
Recognizes dNTP based on base pairing
Right-hand structure

Editing activity: 3’->5’ exonuclease
5’->3’ exonuclease
Fidelity 10^-7
Klenow fragment: lacks 5’->3’ exo, lacks N-terminus

Nick translation as catalyzed by Pol I

Used to radiolabel DNA probes for Southern/Northern blots
DNase I, [α-32P]dNTP

Pol I functions to repair DNA

E. coli Pol I mutants are viable but sensitive to UV and chemical mutagens

The essential physiological function of the Pol I 5’->3’ exonuclease is to excise RNA primers, i.e. its role in replication

DNA Polymerase III

Pol III is the replicase of E. coli
Holoenzyme consists of more than 10 subunits
β subunit confers processivity of >5000 nt
β subunits form a ring-like sliding clamp with 80 Å diameter (sliding clamp / β clamp)

Properties of E. coli DNA Polymerases

Components of E. coli DNA Polymerase III Holoenzyme

β subunit of E. coli Pol III holoenzyme

Unwinding of DNA

3 proteins required to advance the replication fork:
Helicase DnaB: hexameric, ATP-dependent, 5’->3’, AAA+, strand separation
Rep helicase: dimer, ATP-dependent
ssDNA binding protein (SSB): prevents re-annealing, tetramer

Unwinding and Binding Proteins of E. coli DNA Replication

Active, rolling mechanism for DNA unwinding by Rep helicase

DNA ligase

Ligating single strand nicks between Okazaki fragments

E. coli: NAD-dependent
T4 phage: ATP-dependent, blunt end ligation

Primase

Synthesis of RNA primers for Okazaki fragments: 5’->3’
In vitro 11 nt ±1

Prokaryotic Replication

Bacteriophages Coliphages: M13, φX174

M13: 6408 nt ssDNA (+), circular
Replication -> RF (replicative form)
Leading strand synthesis

φX174 Replication

5386 nt ssDNA, circular
Replication more complex than M13
Requires a primosome
Paradigm for lagging strand synthesis
6-step process:
a. coating
b. primosome assembly
c. migration
d. priming
e. Pol III extension
f. Pol I removes primers
g. ligation, supercoiling

Micrograph of a primosome

Proteins of the Primosome

The rolling circle mode of DNA replication

a. Specific cut at + strand
b. Extension of + strand
c. Tandem-linked + strands
d. Separation by endonuclease
e. Packaging

Rolling circle = sigma (σ) replication

φX174 (+) strand replication by the looped rolling circle mode

φX174 (+) strand synthesis as model for leading strand replication
1. Cut by A protein
2. Pol III extension
3. Cut + ligation

The replication of E. coli DNA

Bidirectional, theta replication
Leading and lagging strand synthesis occur on a common 900 kD multisubunit particle: the replisome -> loop of lagging strand

Initiation: at oriC, 245 bp segment

The replication of E. coli DNA

A model for DNA replication initiation at oriC

oriC, 245 bp segment
Contains 5 DnaA boxes
Melting: probed with P1 (Penicillium citrinum) endonuclease, specific for ssDNA
Prepriming complex (DnaB·DnaC)6

Initiation of DNA replication is strictly regulated

Only 1 replication/cell cycle
Doubling time 20 min - 10 h
1000 nt/sec, 4.6 x 10^6 bp -> 40 min/replication -> multiforked chromosomes
Sequestration of hemimethylated oriC

Electron micrograph of an intact and supercoiled E. coli chromosome attached to two fragments of the cell membrane

Schematic diagram of the clamp loading cycle
β clamp responsible for high processivity of Pol III
Must be “loaded” onto DNA by a clamp loader
ATP-dependent, AAA+

Termination of replication

Large 350 kb region in the E. coli genome
Flanked by 7 nonpalindromic, nearly identical termination sites (Ter)
A counterclockwise replication fork passes through TerG, F, B, and C but stops at TerA
Analogous for the other direction
Ter sites act as valves
Ter action requires binding of Tus protein

Without Ter, collision of replication forks terminates replication

Fidelity of Replication

Complexity of replication (>20 proteins) is important for high fidelity: T4 phage reversion rates of 10^-8 - 10^-10
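To put these error rates in perspective, they can be multiplied by the E. coli genome size quoted earlier in these notes (4.6 x 10^6 bp). This is a rough sketch only: it treats the rates as per-base error probabilities per replication, and the names are illustrative.

```python
GENOME_BP = 4.6e6  # E. coli genome size quoted in these notes

def expected_errors(error_rate_per_base: float, genome_bp: float = GENOME_BP) -> float:
    """Expected number of misincorporations per genome copy."""
    return error_rate_per_base * genome_bp

# 1e-7: Pol I fidelity from the notes; 1e-8 and 1e-10: range quoted above.
for rate in (1e-7, 1e-8, 1e-10):
    print(f"{rate:g} errors/base -> {expected_errors(rate):.5f} errors per genome copy")
```

At 10^-10 mistakes/base this gives far less than one error per copied genome. At the 10^-7 fidelity of the polymerase alone it gives roughly one error in every second copy.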

High accuracy due to:
1. Balanced dNTP levels
2. Polymerase reaction itself, base pairing
3. 3’->5’ exonuclease of Pol I and Pol III
4. Repair systems -> see later

Why only 5’->3’ synthesis?

3’->5’ extension would require retention of the 5’ triphosphate on the growing strand
This would be lost upon editing!

Eukaryotic Replication

Remarkable degree of similarity to prok. replication But linear chromosomes -> ends ?

Cell cycle regulation: a cycle can last 8 h to > 100 days
Most variation in G1 phase / G0 phase
Irreversible decision to proliferate is made in G1
Checkpoints
Controlled by cyclins and cyclin-dependent kinases (Cdks)

Best understood from yeast (budding, fission)

The eukaryotic cell cycle

Eukaryotic cells contain many polymerases
6 families:
A: E. coli Pol I, Pol γ (mitochondrial)
B: E. coli Pol II, Pol α, Pol δ
C: E. coli Pol III
D, X, Y

Pol δ: unlimited processivity when in complex with PCNA (proliferating cell nuclear antigen; an antigen in systemic lupus erythematosus), which has β clamp function

Properties of Some Animal DNA Polymerases

Structure of PCNA

Eukaryotic chromosomes consist of numerous replicons

Multiple replication origins, every 3-300 kb
Replication rate 50 nt/sec, 20x slower than E. coli
But 60x more DNA
With a single origin, replication would require 1 month
Clusters of 20-80 adjacent replicons
Not activated simultaneously, but the cell must ensure they initiate only once

Assembly of the initiator complex in 2 stages
To prevent multiple rounds of initiation:
Assembly of pre-RC in G1 phase (licensed)
Activation at S phase
Origin can “fire” only once
Origin = ARS (autonomously replicating sequences)
Re-replication prevented by Cdks and geminin
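The replication-time estimates quoted in these notes (about 40 min for E. coli, about 1 month for a 60x larger genome replicated from one origin) can be checked with a short calculation. The sketch below assumes bidirectional forks moving at constant speed and, in the last example, evenly spaced origins that all fire at once. That is a simplification, since the notes state that replicons are not activated simultaneously. The function name and the figure of 1000 origins are illustrative only.

```python
def replication_time_s(genome_bp: float, fork_rate_nt_per_s: float,
                       n_origins: int = 1, bidirectional: bool = True) -> float:
    """Time to copy a genome when all forks run in parallel at constant speed."""
    forks = n_origins * (2 if bidirectional else 1)
    return genome_bp / (forks * fork_rate_nt_per_s)

ECOLI_BP = 4.6e6           # genome size quoted in the notes
EUK_BP = 60 * ECOLI_BP     # "60x more DNA"

ecoli_min = replication_time_s(ECOLI_BP, 1000) / 60
euk_days = replication_time_s(EUK_BP, 50) / 86400
euk_min_1000_origins = replication_time_s(EUK_BP, 50, n_origins=1000) / 60

print(f"E. coli, one origin: {ecoli_min:.0f} min")
print(f"60x genome at 50 nt/s, one origin: {euk_days:.0f} days")
print(f"Same genome, 1000 origins (hypothetical): {euk_min_1000_origins:.0f} min")
```

The first two results reproduce the approximately 40 min and approximately 1 month quoted in the notes. The third shows why multiple origins are needed.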

ORC, origin recognition complex: hexamer, Orc1-Orc6 (DnaA analog)
MCM, minichromosome maintenance function

Removal of RNA primers

2 enzymes: RNase H1, removes most of the RNA leaving a single 5’ ribonucleotide (H, hybrid)

Flap endonuclease-1 (FEN1) removes the remaining single 5’ ribonucleotide

Mitochondrial DNA is replicated in D-loops

15kb circular genome Leading strand synthesis precedes lagging strand

Leading strand forms displacement loop (D-loop)

Retroviruses: RNA-containing eukaryotic viruses, e.g. HIV
Replicate from RNA genome
Copy RNA into DNA by reverse transcriptase (RT)

Similar to Pol I, 5’->3’ synthesis of DNA from RNA template, primed by host tRNA
RNA is degraded by RNase H
ssDNA directs dsDNA synthesis
dsDNA integration into host genome

RT: important tool for cDNA synthesis, oligo-dT primed

Reverse transcriptase

Structure of HIV-1 reverse transcriptase

RT inhibitors

Telomeres and telomerase

How are the ends of linear chromosomes replicated ?

Problem: no priming at 5’ of lagging strand possible without shortening of the chromosome upon every replication

Telomere sequence: unusual, G-rich, 3’ overhang (20-200 bp)

Specialized enzyme: telomerase adds G-rich repeats without a DNA template; it is a ribonucleoprotein whose RNA acts as the template

Synthesis of telomeric DNA by Tetrahymena telomerase

Telomeres must be capped

Without telomerase, chromosome would shorten 50-100nt with every cell division

Exposed telomeric ssDNA must be protected by capping with proteins, e.g. Pot1

Telomere length correlates with aging

Primary cells in culture die after 20-60 divisions

Such somatic cells have no telomerase activity -> telomeres shorten with every division
Telomerase is active only in germ cells

Analysis of fibroblasts from donors of different ages:
No correlation of donor age with number of doublings in culture
But correlation of telomere length with number of doublings
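The figures above (50-100 nt lost per division, 20-60 divisions before primary cells die) can be combined into a rough bound on the telomeric DNA lost over a culture lifetime. This is a minimal sketch that assumes a constant loss per division. The names are illustrative.

```python
# Values quoted in the notes above.
LOSS_PER_DIVISION_NT = (50, 100)   # nt lost per division without telomerase
DIVISIONS_IN_CULTURE = (20, 60)    # divisions before primary cells die

def cumulative_loss_nt(divisions: int, loss_per_division_nt: int) -> int:
    """Total telomeric DNA lost, assuming a constant loss per division."""
    return divisions * loss_per_division_nt

low = cumulative_loss_nt(DIVISIONS_IN_CULTURE[0], LOSS_PER_DIVISION_NT[0])
high = cumulative_loss_nt(DIVISIONS_IN_CULTURE[1], LOSS_PER_DIVISION_NT[1])
print(f"Cumulative loss: {low}-{high} nt per chromosome end")
```

Under these assumptions a chromosome end loses between 1000 and 6000 nt before the culture stops dividing.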

Progeria: premature aging disease; patients have short telomeres

Cancer cells have active telomerase

Why do somatic cells down regulate telomerase ? Senescence may be a mechanism to protect from cancer

All immortal cells express telomerase

Telomeric DNA can dimerize via G-quartets

Telomeres form T-loops

Repair of DNA

DNA is not inert

UV radiation, ionizing radiation, toxic chemicals, oxidative metabolism can harm DNA

Spontaneous hydrolysis of 10,000 glycosidic bonds in every cell every day

Human genome: 130 genes dedicated to DNA repair
Chemically similar in E. coli

Chemical damage of DNA

Oxidation
Hydrolysis
Methylation

Direct reversal of damage

Pyrimidine dimers are split by photolyase:

UV (200-300 nm) promotes formation of a cyclobutyl ring between adjacent thymines -> intrastrand thymine dimer

DNA photolyase

Photoreactivating enzyme: absorbed light energy is transferred to FADH-
Electron is used to split the thymine dimer

Base flipping: often used to repair damaged DNA

Excision repair

Cells have two types of excision repair:
1. Nucleotide excision repair, NER: repairs bulky lesions
2. Base excision repair, BER: repairs nonbulky lesions involving a single base

Nucleotide excision repair (NER)

Found in all cells Activated by helix distortion Major defense in humans (cigarette smoke, carcinogens)

16 subunits in humans, 3 in E. coli: UvrA, UvrB, UvrC

UvrABC endonuclease

1. Cleavage
2. Displacement, UvrD
3. Repair, Pol I

NER diseases

Xeroderma pigmentosum:
Skin cells cannot repair UV damage
Individuals extremely sensitive to sunlight
Skin tumor risk 2000-fold elevated
Cultured skin cells are defective in repairing thymine dimers
Cell fusion experiments: 8 complementation groups

Cockayne syndrome:
Light sensitivity and neurological defects
Demyelination -> oxidative damage in neurons

Base excision repair

Single base repair:

1. DNA glycosylase -> apurinic or apyrimidinic (AP) site (abasic site)

2. Deoxyribose residue cleaved by AP endonuclease

3. Exonuclease

4. Gap filled by polymerase and sealed by ligase

Uracil in DNA would be highly mutagenic

Why use thymine in DNA and uracil in RNA?

Cytosine deaminates to uracil

If U were a normal DNA base, there would be no way to tell whether a G-U mismatch was originally:
G-C (C deaminated to U), or
A-U

Since T is the normal base in DNA, every U is due to a deaminated C

Mismatch repair

Replicational mispairing is repaired by mismatch repair (MMR)

Defects result in hereditary nonpolyposis colorectal cancer (HNPCC)

Must distinguish between the correct and the wrong base
In E. coli, possible due to hemimethylation
3 proteins: MutS, MutL, MutH

Mismatch repair in E. coli

1. MutS binds mismatch as dimer
2. MutS-DNA recruits MutL
3. MutS-MutL scans DNA for hemimethylated GATC, recruits MutH
4. Cleavage of non-methylated strand
5. Strand separation by UvrD
6. Exonuclease
7. Fill-in by Pol III
8. Ligation

The SOS response

Upon heavy DNA damage, E. coli stops growing and induces a DNA repair system, the SOS system

SOS operon, recA, uvrA, uvrB repressed by LexA

RecA is an ssDNA binding protein; upon ssDNA binding it induces cleavage of LexA -> releases repression of SOS operon

Regulation of the SOS response in E. coli

SOS repair is error prone

If the replisome encounters a DNA lesion: stalling, release of Pol III core, collapse of replication fork

To resume: either SOS repair or recombination repair

Recombination repair: circumvents lesion and uses homologous recombination to restore damaged site (->later)

In SOS repair, Pol III is replaced by a bypass DNA polymerase, Pol IV or Pol V
Error-prone polymerases -> SOS response is mutagenic
-> Adaptation to difficult situation by generating diversity

Double-strand break repair

Ionizing radiation and free radicals can induce double-strand breaks (DSBs) in DNA
Also induced by some cellular processes, e.g. V(D)J recombination

2 ways to repair DSBs:
1. Recombination repair -> later
2. Nonhomologous end-joining (NHEJ): involves the DNA end binding protein Ku

Nonhomologous end-joining (NHEJ)

Identification of carcinogens

Many forms of cancers are caused by exposure to certain chemical agents, carcinogens (man-made or natural)

Ames test assay for carcinogenicity

Salmonella typhimurium his-: incubate with chemical -> rate of reversion to his+ correlates with mutagenicity of tested chemical

The Ames test for mutagenesis

Filter disc containing Substance:

1. Zone: lethal
2. Zone: mutagenic
3. Zone: spontaneous reversion

Recombination and mobile genetic elements

Pairs of allelic genes may exchange chromosomal location via homologous recombination

Homologous recombination: Exchange of homologous segments between two DNA molecules

Bacteria (haploid) exchange DNA via conjugation (mating) or transduction (viral)

The Holliday model of homologous recombination

1. ssDNA nick
2. Strand invasion
3. Branch migration
4. Holliday intermediate, Chi structure
5. Resolution

Homologous recombination between two circular DNA duplexes

Results either in two circles of the original sizes or in a single composite circle

Homologous recombination in E. coli is catalyzed by RecA

recA- mutants have a 10^4-fold lower rate of recombination
RecA catalyzes ATP-dependent strand exchange
Binds DNA with 6.2 RecA monomers/turn

Electron microscopy–based image of an E. coli RecA–dsDNA–ATP filament

Model for RecA-mediated pairing and strand exchange

RecA-catalyzed assimilation of a single-stranded circle

Requires:
- free end (nick)
- homology at 5’

Hypothetical model for the RecA-mediated strand exchange reaction

Rad51 is the eukaryotic homologue of RecA

RecBCD initiates recombination by making single-strand nicks
Products of the SOS operon
Unwinds dsDNA and acts as exonuclease up to a Chi sequence, GCTGGTGG
Chi occurs every 5 kb
Chi sites have an elevated rate of recombination
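The spacing of Chi sites quoted above can be compared with what a random sequence would give. The sketch below locates Chi motifs on one strand and computes the random-sequence expectation. It assumes all four bases are equally frequent, which is a simplification for any real genome. The function names and the test sequence are illustrative.

```python
CHI = "GCTGGTGG"  # Chi sequence from the notes

def find_sites(seq: str, motif: str = CHI) -> list[int]:
    """0-based start positions of every (possibly overlapping) motif occurrence."""
    seq = seq.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits

def random_spacing_bp(motif_len: int) -> int:
    """Mean spacing of a fixed motif in random sequence with equal base frequencies."""
    return 4 ** motif_len

print(find_sites("AAGCTGGTGGTTGCTGGTGG"))   # toy sequence with two Chi sites
print(random_spacing_bp(len(CHI)))          # random expectation for an 8-mer
print(random_spacing_bp(len(CHI)) / 5000)   # enrichment relative to "every 5 kb"
```

Under the equal-frequency assumption an 8-mer is expected once per 65,536 bp, so a spacing of 5 kb corresponds to roughly 13-fold overrepresentation.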

Requires free ds ends:
Transformation
Conjugation, transduction
Replication fork collapse

RuvABC mediates branch migration and the resolution of the Holliday junction
Branch migration is ATP-dependent, unidirectional
Mediated by SOS-induced proteins:
RuvB: ATP-dependent pump, hexamer, AAA+
RuvA: binds Holliday junction, homotetramer
RuvC: endonuclease

Recombination repair
Transformation, transduction and conjugation are rare events requiring recombination

Frequent is collapse of the replication fork, 10 times per eukaryotic cell cycle -> recombination repair
1. Replication arrest at lesion
2. Fork regression, chicken foot
3. Fill by Pol I
4. Reverse branch migration (RecG)
5. Replication restart
Note: lesion is not repaired

Recombination repair of a single-strand nick
Replication fork encounters ss nick:
1. Collapse
2. RecBCD + RecA invasion
3. Branch migration, RuvAB
4. Resolution, RuvC
-> nick has become 5’ end of Okazaki fragment

Recombination repair reconstitutes double-strand breaks

Homologous end-joining as alternative to NHEJ
2 Holliday junctions as intermediates

1. Resection of DS ends
2. DNA synthesis and ligation
3. Resolution of 2 Holliday junctions

Transposition and site-specific recombination
1950, Barbara McClintock: varied pigmentation in maize
Due to the action of variable genetic elements, i.e. non-Mendelian inheritance
20 years later, evidence for mobile genetic elements in E. coli
Transposable elements (transposons) occur in prokaryotes and eukaryotes
Each transposon encodes a transposase that catalyzes illegitimate recombination, so called because it requires no homology between donor and acceptor
Transposition is mutagenic and dangerous, hence tightly regulated: 10^-5 to 10^-7 events per cell division

Prokaryotic transposons

3 types:
1. Simplest: insertion sequences, IS elements
<2000 bp, transposase, flanked by short inverted repeats, flanked by direct repeats at insertion site
E. coli: 8 copies of IS1, 5 copies of IS2

Properties of Some Insertion Elements

Transposons (2)
2. More complex: carry additional genes, e.g. antibiotic resistance
Example: Tn3, 4957 bp
a. transposase, TnpA
b. resolvase, TnpR
c. beta-lactamase, ampicillin resistance

Transposons (3)
3. Composite transposons: gene-containing central region flanked by IS-like modules that have the same or inverted orientation

Generation of direct repeats of the target sequence by transposon insertion

Two modes for transposition

1. Direct or simple transposition -> transposon moves from position A to position B

2. Replicative transposition -> transposon remains + new copy at position B

Direct transposition of Tn5 by a cut and paste mechanism

1. Transposase binding
2. Dimerization
3. Synaptic complex
4. Target capture
5. Integration

Replicative transposition

A cointegrate

Model for transposition via cointegrate

1. Pair of staggered ss cuts
2. Ligation of both ends at integration site forms replication fork
3. Replication forms cointegrate
4. Site-specific recombination: cointegrate resolved

γδ Resolvase-catalyzed site-specific recombination

Via double-strand DNA cleavage

Replicative transposons are responsible for much genetic remodeling in prokaryotes
Transposons induce rearrangements in host genome:
a) Inversion of genomic segment
b) Deletion of genomic segment
Mediate transfer of genetic material between species

Phase variation is mediated by site-specific recombination

Salmonella typhimurium makes 2 antigenically distinct versions of flagellin, H1 and H2
Only one of the two is expressed
Switch every 1000 cell generations (phase variation); may help evade host immune response

H2 is linked to rh1, which encodes a repressor of H1
Expression of the H2-rh1 unit is controlled by a 995 bp segment that contains:
1. Promoter for H2-rh1
2. hin gene coding for Hin DNA invertase
3. Two closely related 26 bp sites, hixL and hixR

Mechanism of phase variation in Salmonella

Cre-mediated site-specific recombination

Many bacteriophages have two modes to propagate:
1. Lytic, lysis of cells
2. Lysogenic, integration into host genome

Examples:
Bacteriophage lambda, λ
P1 bacteriophage, Cre recombinase

The circularization of linear bacteriophage P1 DNA

34 bp loxP site, palindromic except for central 8 bp

Mechanism of Cre–loxP site-specific recombination

Via 3’-phosphoTyr intermediate

Structure of the Cre tetramer complexed with loxP DNA

Most transposition in eukaryotes involves RNA intermediates
3% of the human genome consists of transposons
Many are fossils, i.e. sequences mutated to be inactive
Many resemble retroviruses in sequence

Retroposons
Transposition via RNA intermediate (transcription)
dsDNA via reverse transcriptase (cDNA)
Random integration by integrase

Retroviral genome flanked by LTRs, long terminal repeats (250-600 bp)
3 polyproteins:
gag (viral core)
pol (reverse transcriptase)
env (viral envelope)

Organization of retroviruses and the Ty1 retrotransposon

Non-viral retroposons

Vertebrates contain retroposons that lack LTRs
Non-viral retroposons, e.g. LINEs, long interspersed nuclear elements, 1-7 kb long
Contain 2 ORFs:
ORF1, similar to gag
ORF2, similar to pol

In humans, LINEs account for 20% of the genome!

DNA methylation and trinucleotide repeat expansion

Species-specific methylation of A and C residues in DNA to:
N6-methyladenine (m6A)
N4-methylcytosine (m4C)
5-methylcytosine (m5C)

DNA methylation

Bacterial DNA is methylated at own restriction site

E. coli:
Dam methyltransferase (Dam MTase): A in GATC
Dcm MTase: both C in CC(A/T)GG at position 5
Both sites are palindromic; roles in mismatch repair and oriC

Methyl groups project into the major groove of B-DNA, interact with DNA-binding proteins

The MTase reaction occurs via a covalent intermediate in which the target base is flipped out

Methylation uses SAM (S-adenosylmethionine) as methyl donor, via a Cys thiolate attack; uses base flipping
Inhibited by 5-fluorocytosine

Base flipping

DNA methylation in eukaryotes functions in gene regulation

5-methylcytosine is the only methylated base in most eukaryotes
Modification is largely in CG dinucleotides (CpG)
CG is present at 1/5 of statistical expectation
Upstream regions of many genes have CpG islands

DNA methylation in eukaryotes

Experimental assessment:
Comparative Southern blot of DNA cut with HpaII (cleaves CCGG, but not C-m5C-GG) and MspI (cleaves both)
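The logic of the HpaII/MspI comparison can be sketched on a toy sequence. In this illustration the letter "M" stands for m5C, a notation chosen here and not taken from the source. Both enzymes are modeled as cutting after the first C of their site (C^CGG) on the top strand only. The function name and the example sequence are hypothetical.

```python
import re

def digest(seq: str, enzyme: str) -> list[str]:
    """Top-strand fragments after digestion; 'M' denotes 5-methylcytosine."""
    if enzyme == "HpaII":
        pattern = r"C(?=CGG)"       # cuts only unmethylated CCGG
    elif enzyme == "MspI":
        pattern = r"C(?=[CM]GG)"    # cuts CCGG and C-m5C-GG
    else:
        raise ValueError(f"unknown enzyme: {enzyme}")
    fragments, last = [], 0
    for match in re.finditer(pattern, seq):
        cut = match.end()           # cut between the first C and the rest of the site
        fragments.append(seq[last:cut])
        last = cut
    fragments.append(seq[last:])
    return fragments

toy = "AATTCCGGTTAACMGGAATT"        # one unmethylated and one methylated site
print("HpaII:", digest(toy, "HpaII"))
print("MspI: ", digest(toy, "MspI"))
```

A site that MspI cuts but HpaII leaves intact shows up as a fragment-length difference between the two lanes, which is what the comparative blot reads out.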

Identification of m5C residues through bisulfite sequencing:
- DNA is reacted with bisulfite (HSO3-), which deaminates C to U, but not m5C
- Followed by PCR amplification: copies U to T and m5C to C
- Sequence and compare to untreated DNA

DNA methylation in eukaryotes (2)
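The bisulfite sequencing readout described above can be sketched as a string transformation. As an assumption made for this illustration, "M" marks m5C in the true template. The function names are hypothetical, and only one strand is modeled.

```python
def untreated_read(template: str) -> str:
    """Ordinary sequencing cannot distinguish m5C from C: both read as C."""
    return template.replace("M", "C")

def bisulfite_read(template: str) -> str:
    """After bisulfite and PCR: C -> U -> T, while m5C is protected and reads as C."""
    return template.replace("C", "T").replace("M", "C")

def methylated_positions(untreated: str, treated: str) -> list[int]:
    """Positions that read C both before and after treatment were methylated."""
    return [i for i, (u, t) in enumerate(zip(untreated, treated))
            if u == "C" and t == "C"]

template = "ACMGTCCG"                       # toy template, 'M' = m5C
before = untreated_read(template)
after = bisulfite_read(template)
print(before, after, methylated_positions(before, after))
```

The order of the two replacements in `bisulfite_read` matters: unmethylated C must be converted first, otherwise the protected positions would be converted as well.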

Methylation switches off eukaryotic gene expression, particularly when it occurs in the promoter region
For example, globin genes are less methylated in erythroid cells

Recognized by methyl-CpG binding domain (MBD) proteins
May also affect chromatin packaging

DNA methylation in eukaryotes is self-perpetuating

Maintenance of methylation after replication -> inherited,

Epigenetic inheritance: Non-Mendelian inherited information

By DNMT1, which has a preference for hemimethylated sites
DNMT1 null mice die early in embryonic development

Methylation is dynamic

Pattern of DNA methylation varies in early embryological development:

Methylation levels are high in gametes (sperm, ova) but nearly eliminated at the blastocyst stage
Methylation then rises again until the gastrula stage, when it reaches the level found in adults and then remains constant
Exception: germ line cells remain unmethylated

Pattern of expression differs in embryonic and somatic cells
=> Explains high failure rate of cloning experiments: few survivors, early death, abnormalities, large size

Genomic imprinting results from differential DNA methylation

Difference in maternal and paternal inheritance:
Mare x male donkey -> mule
Female donkey x stallion -> hinny
Both are sterile

[Figure: mule and hinny]

Maternal and paternal genes are differentially expressed = genomic imprinting, only in mammals
No embryo from transplant of two male or two female pronuclei

DNA methylation is associated with cancer

Most prevalent mutation is m5C to T, which can convert proto-oncogenes to oncogenes or inactivate tumor suppressors

Several neurological diseases are associated with trinucleotide repeat expansion
Fragile X syndrome: mental retardation, long narrow face
1 in 4500 males, 1 in 9000 females
Activated by passage through a female
Affects FMR1 gene, which contains (CGG)n, n = 6-60, in its 5’ region
n can increase from 60 to 200 = premutation
Can then expand upon transmission to a daughter to >200 = full mutation
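The repeat-number thresholds above lend themselves to a small classifier. The sketch below counts the longest uninterrupted CGG run in a sequence and assigns it to the categories from the notes. The notes give 60 as both the top of the normal range and the start of the premutation range, so treating exactly 60 as normal is an assumption made here. The function names are illustrative.

```python
import re

def longest_repeat_run(seq: str, unit: str = "CGG") -> int:
    """Number of repeat units in the longest uninterrupted tandem run."""
    runs = re.findall(f"(?:{unit})+", seq.upper())
    return max((len(run) // len(unit) for run in runs), default=0)

def classify_fmr1(n_repeats: int) -> str:
    """Categories from the notes: 6-60 normal, 60-200 premutation, >200 full mutation."""
    if n_repeats > 200:
        return "full mutation"
    if n_repeats > 60:          # boundary at 60 treated as normal (assumption)
        return "premutation"
    if n_repeats >= 6:
        return "normal"
    return "below the range given in the notes"

toy = "AT" + "CGG" * 7 + "TT"   # toy sequence with 7 CGG repeats
n = longest_repeat_run(toy)
print(n, classify_fmr1(n))
```

Passing `unit="CAG"` to `longest_repeat_run` counts CAG repeats in the same way.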

Expansion arises through slippage during replication

FMR1 is unmethylated in normal individuals
But is methylated when premutation is maternally transmitted

Other important trinucleotide repeat diseases

Huntington’s disease (HD): 1 in 10,000, onset at age of approx. 40, 18-year course, fatal
Protein huntingtin contains (CAG)n repeats (polyGln)
Normal 11-34 repeats, disease 37-86
Repeat length is unstable, changes in >80% of meiotic transmissions
Number of repeats inversely correlates with age of onset
PolyGln aggregates as β sheets
Neurons contain inclusions

The loop-out mechanism for the alteration of the number of consecutive triplet repeats in DNA through its replication