David Peters Baran Group Meeting DNA Chemistry 06/02/17 Contents The Basics… Topics Included: DNA building blocks, DNA structures, chemical stability/succeptibility, NH O O NH2 O 2 relevant enzymatic processes, DNA modifications/bioconjugation, Levaraging DNA: Me base pairing, chemical diversity, and stereochemical environment N N HN N N HN HN Topics Breiefly Discussed: oligonucleotide drugs, chemical etiology, non-natural DNA N N H N N N O N O N O N H 2 H H H H Topics Not Included: RNA, nucleoside chemistry (O’Hara, 2012; Gianatassio, 2013), synthesis, oligonucleotide synthesis, sequencing, nucleobase synthesis, uracil prebiotic chemsitry Purines Pyrimidines

Selected History H2N H2N 1869 - Friedrich Miescher identifies “nuclein” from white blood cells N N O N N 1919 - Phoebus Levene identifies components as base, nucleoside and nucleotide NH2 1935 - Klein and Thannhauser isolate single from degraded DNA P HO N N O O N N 1937 - First X-ray diffraction showing DNA as a regular structure N N O O 1944 - Avery-MacLeod-McCarty experiment published – DNA as ‘transforming principle’ O 1950 - Chargaff published on species specificity and G/C and A/T ratios (Chargaff’s Rule) N N 1952 - Hershy-Chase experiment confirms DNA’s role as genetic material H HO OH HO OH 1952 - Franklin X-ray photographs are taken (A- and B-DNA) Nucleobase Nucleoside Nucleotide 1953 - Watson and Crick elucidated structure of DNA (adenosine) (adenosine monophosphate) 1977 - Frederick Sanger develops dideoxy method of sequencing (guanidine) (di-, tri-) (AMP, ADP, ATP) 1985 - PCR is invented (cytidine) (thymidine) 1994 - First therapy is approved by FDA (uridine) 2003 - Human Genome project declared complete H2N H2N H2N Fun Facts!! N N N N - A person has roughly 3x109 bases pairs! N O N N N - All of your DNA would go across the solar system…twice O P O HO N N O O N N - ~ 3% of your DNA is gene encoding O O - ~ 8% of your DNA is viral DNA O O P O OH - You’re DNA is 99.9% similar to everyone elses O HO - There is a Potato Genome Project HO Cyclic Adenosine Deoxynucleoside - DNA is a flame retardant (phosphoric acid and ammonia) Deoxynucleotide Monophosphate (cAMP) - You shouldn’t combine dinosaur and frog DNA

O O O O 3’ O 5’ WHAT DOES DNA STAND FOR?! HO P O P O P P OH O O O O HO HO D______OH OH OH OH O O O O OH O OH N______O A______N N N N O N Me N O N HN OH - 2’-deoxy is important for stability N N HO HO O O - first found in the center of cells NH N N NH 2 H N 2 D-ribosfuranose 2’-deoxy-D-ribofuranose - pka is about 1.5 2 H Oligonucelotide David Peters Baran Group Meeting DNA Chemistry 06/02/17

Watson-Crick Pairing - 99% of DNA base pairs Structural Parameters - N-N/N-O distances: 0.28 to 0.30 nm H H A-type DNA B-type DNA Z-type DNA N N H O Me N O H N Rotation Sense left left right N N H N N N H N Sug Sug Helix Diameter 25.5 Å 23.7 Å 17.4 Å N N N N O Sug N H O Sug Rise/Base 2.3 Å 3.32 Å 3.8 Å A-T H G-C Rotation/Base 33.6º 35.9º -30º Hoogsteen Pairing - 1% of DNA base pairing Bases/Turn ~11 ~10 12 - disclosed 10 years later Base Pair Tilt +19º -1.2º -9º H - relevant in G-quadruplex H H H N N N O H N Propeller Twist +18º +16º ~0º H O Me N H N N H N Helix Axis Location major groove through base pairs minor groove N H N N N Sugar Conformation C3’-endo C2’-endo N C:2’-endo, G:2’-exo N N O Sug Sug O Sug Sug Glycosyl-Base Bond anti anti anti (C), syn (G) A-T G-C+ Garret, Grisham, Biochemistry, 4th Ed., 2010. st Conformational Considerations Neidle, Principles of Structure, 1 Ed., 2008. NH H N NH2 2 NH2 2 N N NH HN N N N O O N N O N O O N O N O O O O

O O O O anti (B-DNA) syn (Z-DNA) Sugar Puckers HO Base HO Base HO Base O O O

C2’-endo C3’-endo C2’-exo

Propeller Twist Inclination

G A-DNA B-DNA Z-DNA - results in greater - observed when dehydrated - majority of natural DNA - first in artificial GC repeats A T base-base overlap A - natural in bacterial - Forced by m5C endospores and some viral - Implicated in gene regulation capsids Baran Group Meeting David Peters DNA Chemistry 06/02/17 Other Structures Solvation - must be 30% w/w water to maintain B-DNA H Sug - many spheres of solvation – 3 defined in some sequences Sug N N N N N H - solvation ordered in minor groove (H-bonding bases) N N - “spine of hydration” (left; blue = 1st sphere, yellow = 2nd) N O H N O - solvation less orded in major groove – disperse negaitve charge H N H H along phosphates +n H M H N H - ester oxygens don’t usually participate in H-bonding O N - ring oxygens occasionally participate in H-bonding H O N - A•T promotes minor groove H-bonding – entropic driving force N N for binding J. Mol. Biol. 2007, 365. H N N N N N Sug Base Hydration/Water Networks Sug H Bent B-DNA Major Groove H H G-Quadruplexes - A•T propellor H H H H H O O O - discovered in 1962 by X-ray diffraction twist 5-7º greater O O - important structure in aptamer technology than G•C H O H O H H H H H - thought to play important role in gene regulation - properly spaced H N H O H H H - G-rich telomeres form these – imporant in cancer biology Poly-A’s cuase Me O N N - can be from 1,2,3, or 4 strands (examples above) kink/bend in helix H N N N NH - propensity to form/type dependent on sequence and metal - implication in NH N ions available seq. recognition H N N N N N N N N Sugar N O Holliday Junction Sugar Sugar H O O - 4 stranded, 4 armed structure H H - intermediate in chromosomal recombination and O O H H bouble-strand break repair O H H H O - symmetrical sequences can slide at juntion H H O - unsemmetrica sequenced cannot – uses in Minor Groove H nanotechnology/DNA Origami Metal Binding in Solvation O - Na+ and K+ lay in minor groove replacing O P O waters (exact position disputed) H2O + O - exactly locating metals is difficult – Tl helps H2O OH2 O - Mg2+ plays a major role in solvation of major Mg2+ Base groove – serves as counterion to posphates H2O OH2 O 2+ H2O - Mg exists as hexahydrate – associated O P O DNA’s Happy Place: Stability waters H-bond S O - overall stability is a combination of solvent, pH, and ionic Niedle, 2008. (see above) strength DNA vs. RNA - base pair H-bonding contributes very little (-3 kcal/mol/bp) N S O NH2 O - equally as many H-bonds available w/ water in ssDNA OMe - RNA is more hydrolytically labile - hydrol. results in 2’ or 3’ phosphate - pi-stacking contributes majority of stablilization (-16 to -51 dTPT3 dNaM O N OH HN kcal/mol/bp) Base - Thymine in DNA; Uracil in RNA st 2nd NH3 - deamination of C is U (that’s a - colder = more stable 1 O O N O N order order H H problem…) - pH around neutral is best (pH<11 = denatured) OH O - Denatured ≠ Decomposition P O OH -13 - base pairing and larger structure is the Rate: 7 x 10 /sec same - T dependent on above cond. (and G/C content) O m dsDNA ssDNA Baran Group Meeting David Peters DNA Chemistry 06/02/17 DNA’s Unhappy Place: Degradation O Radicals Chem. Rev. 1998, 1109. O O General OH - radical reactions generally result in cleavage of glycosidic bond HN N - can cause phosphodiester cleavage, abasic sites OH - main degradation is depurination (pH 4) - electron rich nucleobases are most sensitive (G) N - faster under acidic conditions and higher heat O H H2N N OH - fenton chemistry most implicated in biological systems Sug - abasic sites strand cleave faster under basic cond. O P O – H2O [O] [H] - thermal denaturation is sequence dependent - H-atom abstraction O - most radicals can participate - chemical denaturants: ureas, DMSO, formamide, + 2+ 5+ 2+ CHO (after depurination) - Cu , Fe , Mn -oxo, [Ru(bipy)3] /h!, ROOH O O propylene glycol H - denatured DNA is more succeptible O N NH O 2+ 3+ HN HN - pH 1 = phosphodiester cleavage OH Fe + HOOH Fe + HO + OH O O O P O 3+ 2+ H N N N H N N NH O Fe + HOOH Fe + HOO + H 2 2 Metals Sug 2+ 2+ + + Sug Rule of Thumb: If it’s not Mg , Ca , Na , or K its going to do something bad. 8-oxo-gaunine FAPy-G Hud, Nucleic Acid-Metal ion Interactions, 2009. Photochemistry Chem. Rev. 1998, 1109. Interstrand Binding - well documented intrastrand pyrimidine O sugar/base 2 1 dimers (254 nm) S S* O2 chemistry H3N Cl H3N Cl H3N Pt Pt Pt G Cell - guanine oxidation at 266 nm G H3N Cl H3N OH H3N Death - photosensitizers (S) lead to a variety of reactivity S+ + O – - cisplatnin targets G,G repeats 2 HO O + - binds to N7 of G O O Me G - large conformational change and extrahelical bases Me Me OH HN O - signals for apoptosis HN NH N 5’-H abstr. Intrastrand Binding O N N Sug O N N O - Metal ion invades helix or binds during denaturation Sug MeO - flips bases (syn) or makes helix “breathe” Sug Sug Me MeO Duocarmycin - Pt2+, Ag+, Hg2+ CPD (6,4)-photoproduct

Stabilizing Other Conformations MeO N O Alkylation H - metal binding can stabilize non-B-DNA structures - any strong alkyl electrophile (most carcinogens are alkylators) N - can inhibit recognition and/or make succeptible to other chemical modification 3+ 2+ - act mainly at phosphate, guanine-N7, adenine-N3 O - [Co(NH3)6] and Zn polyamine complexes stabilize Z-DNA w/ phosphate - generally leads depurination binding - mutation results from mispairing or unfixed sites replicating NH - Pt2+, Ag+, Hg2+ all stabilize quadruplexes and triplexes by bridging non- - Duocarmycin recognizes 5’-AAAAA sequences and alkylating A-N3 O Me cannonical baise pairing - dimethylsulfate, nitrogen mustards, acrylates, most things in the lab O OMe Depurination - nucleotides are naturally succeptible to this at low pH Intercalating Agents - usually flat, cationic, fused - Metals have high affinity for N7 of purine bases NH2 O OH O aromatic tricycles - Metal binding retards depurination at low pH & accelerates it at high pH OH - lengthening (~3.4 Å) and II - ex: Pt (dien) depurination of G pH<4 (slower) pH>4 (faster) OH unwinding (~26º) of helix - Pt2+, Ag+, Zn2+, Co2+, Ni2+, Cu2+, Pd2+ N Me - interactions in major/minor H2N NH groove increase affinity Miscelaneous Ph Br OMe O OH O 2 II II II - sequence insensitive - Pb , Zn , Cd hydroxides cleave phosphodiesters - easy entry = dynamic DNA 2+ Me O - Pt facilitates deamination of C ( C→T) ethidium doxorubicin OH - increases chemical liability & - bromide (Adriamycin) - OsO4 and MnO4 oxidize 5,6 bond of pyrimidines Me modulates gene regualtion David Peters Baran Group Meeting DNA Chemistry 06/02/17 ENZYMATIC PROCESSES Transcription Polymerases Extending 3’ Extending Strand Strand O O - RNA polymerases selects for NTPs For mRNA: Base Template Strand Base O O over dNTPs - template strand is “anti-sense” - fidelity is controlled by KINETICS - coding strand/RNA transcript are “sense” - RNA plays many roles (stay tuned!) B H O O O P Nucleases O O O O O Base - break nucleic acid polymers - PPi O O Base O P O P O P O Base - DNase (left) is selective for DNA, and H H O RNase (not pictured) for RNA O O 2+ O O O 5’ OH H B - some have Zn sites Free 3’-OH O O O - exonucleases cleave at ends of chain 2+ H Mg2+ OH Mg P - endo nucleases cleave mid-chain Mg2+ - Nucleic acids duplexes are antiparallel - RNase is EVERYWHERE! O O Base Polymerase - synthesized strand grows from 5’ to 3’ - restriction nucleases (bacterial) used - many polymerases in each organism O extensively in molecular biology - DNA and RNA polymerase acts similarly - usually guided or sequence specific O - play roles in gene regulation and Replication repair - complex process utilizing many - fidelity is controlled by KINETICS DNA Repair protiens and complexes - editing domains/machinery also contribute - chemically dangerous process - many DNA polymerases - various roles Base Excission Repair - error rate: 1/107 bases - major repair pathway for many types of damage/mispairs glycosylase polymerase; - can also involve invasive polymerase ligase - repairs: 8-oxo-G, Fapy-G, 3-Me-A, AP nuclease; 7-Me-G, xanthine, hypoxyxanthine, lyase U (T-derived), etc. - implicated in some cancers OMe O - Methyl guanosine methyl transferase N N MGMT N HN - common damage from smoking - stoichiometric in MGMT – extremely H N N N N 2 H2N N energy expensive Sug Sug

O O O O O Me Me Me Me Me photolyase 2x N Biochemistry 2004, 14317. HN NH h" HN NH 1 2 3 4 – O N E + DNA E•DNA E•DNA•dNTP E*•DNA•dNTP E*•DNA+1•PPi O N N O FADH O N N O MTHF Sug Sug Sug Sug Sug - Correct Pair: step 3 is rate limiting 5 - Incorrect Pair: step 4 is rate limiting - humans lack photolyase; rely on nucleotide excission repair Chem. Rev. 2003, 2203. E•DNA+1 E•DNA+1•PPi - catalytic in cofactors Baran Group Meeting David Peters DNA Chemistry 06/02/17 Enzymatic Labeling Hermanson, Bioconjugate Techniques, 3rd Ed., 2013. Biotin Labeling

Random Primed Labeling PCR Labeling O O NH2 N2 3’ 5’ O O HN NH HN NH N denature/ NaNO N denature/ 2 HN NH anneal primer anneal primer HCl; NaOH ssDNA N 3’ 5’ N S S H2N N polymerase/ O N O N labeled dNTPs H H HN S p-aminobenzoyl biocytin O N repeat/ End Labeled DNA H denature S H O O N O Terminal Transferase HN dsDNA O HN Labeling O NH h 365 nm Me - intercalaton ! O Me N N O HN TRIS DNA based N O O O O pH 7.4 - pairing Labeled ssDNA O N DNA disruptive O P O P O P O O O H O O O O O O - also: dual Nick Translation Labeling O N O 32 psoralen-PEG3-biotin - RPL can be used for P labeling HN H HO S - control labeled [dNTP] for RPL and NTL TdT (enzyme) HN N 2+ - no site selectivity in RPL and NTL ssDNA, Co H H H S O N N N N N - PCR labeling possible from RNA (MLLV H O S O S O S O reverse transcriptase) O N N O2N O P O N N - end labeling doesn’t affect hybridization H sunlamp H H DNA O HN 3’ O NO2 NO2 O2N HN pH 7.0 HN HN N Chemcial Labeling HO NH Bisulfite Conjugation N3 N N NH2 NH2 Nu Nu “photobiotin” NaHSO - amine and hydrazide HN 3 HN Nu HN HN nucleophiles common - water competes Post Incorporation Modifications O N O N SO3 O N SO3 O N - disrupts pairing (3% NH2 labeling ideal) Chem. Eur. J. 2015, 16091. N O O O CuBr, N3R N Bromine Activation O P O P O P O O N polymerase Na(ascorbate) N O O O O O O O N H N N N N HN NBS HN NH2R HN - G: C8 Br NHR 50 ºC - C: C5 S N pH 9.6 N N HO O H2N N 0 ºC H2N N H2N N - A: hard O Bioconjugate Chem. 2012, 1382. N N - one pot bisfunctionalization R R Carbodiimide 5’ Phosphate Conjuagtion N N N N N NH O O reacts via: EDC, NH R N N O P O 2 O P O phosphor- 3 5’ O 0.1 M imidazole 5’ NHR imidazolide CuSO4, THPTA, pH 6.0 Na(ascorbate) David Peters DNA Chemistry Baran Group Meeting

Biosensors (base pairing) DNA-Templated Asymmetric Catalysis (stereochemical) Idealistically: fast, highly specific detection of analytes at very low concentrations - achiral metal catalyst associated/bound to double helix - mostly use undefined DNA sequences (st-DNA) - reaction scope extremely limited currently F - scale limitations Arseniyadis, Org. Biomol. Chem. 2017, ASAP. Intercalative Binding Groove Binding Hoeschst Ligand R R = 3,5-OMe-Ph n = 2 or 3 MeN N N Terpyridine or 1-naphthyl n N N N N NH Me H N N M Q M N HN F Q N N General ACS Appl. Mater. Interfaces 2013, 11741. N N N - Utilizes high specificity of base pairing - gold nanorods (GNR) = generalized Me N N - oligos synthesized for specific purpose - oligo with 1 modification = easier NH - FRET or fluorophore/quencher pair - hairpin = higher selectivity Covalent Binding Solid-Support R - microarray bound = parallel detection - can discriminate SNP (3x higher LOD) O - recycling 10x Ligandless - point-of-care diagnostics - 0.3 nM LOD (fluorescence) M - flow application - utilizes ion binding - metals, DNA, RNA, protiens - 10 nM LOD (colorimetric) HN N N M - scaled to 1 mmol G-quadruplexes M O - SiO & cellulose PPh 2 Deoxyribozymes (diversity) HN 2 M

- no known natural DNAzymes; lots of ribozymes known N M M - nearly all catalyze bioorganic reactions Sug - in vitro selection necessary; cannot predict 1º→2º→Function - libraries synthesied; hits amplified Reaction Examples: 14 3 ACIE 2005, 3230. - Selection = 10 tested simultaneously; Cominatorial = 10 O - “coverage” decreases with length BUT longer sequences yield better hits O Cu(II), st-DNA, N O Cu(II), [DNA], 4,4’-bipy N - Rate accelerations beyond “effective concentration” effects (105 to 1010) N 1,10-phen. Ph R variable constant MOPS (pH 6.5) MOPS (pH 6.5) constant (DNA) (DNA) Reactions catalyzed: 5ºC 5ºC (RNA) - RNA cleavage R st-DNA: 98% ee - DNA cleavage (oxidative) R = p-OMePh; 73% ee Chem. Comm. 2006, 635. GC oligo: 86% ee H R = Ph; 37% ee GACT oligo: 78% ee N i. PCR O NH - RNA ligation (3’ or 2’-branched) N ii. streptavidin H - DNA ligation repeat O Cu(II), st-DNA, O - R = Ar, alk. H - DNA posphorylation 4,4’-bipy N N + NuH N - Nu = CH(CO2Me)2, indole O NH iii. cleavage - DNA capping (adenylation) R R N MOPS (pH 6.5) CH2(CN)CO2R, MeNO2 H - DNA depurination NMe 5ºC NMe - 62 to 91% ee H - Nucleopeptide linking ACIE 2007, 9308. N O NH + 1 N O O R 1 H H Cu(II), st-DNA, R = H, Me N O 2 2 O NH Base SO2R terpyridine R = Ph, p-Tol, 2-Pyr Chem. Sci. 2013, 2013. N discard H 2 16-84% ee 1 SO2R Gen. R N2 MOPS (pH 6.5) O OH Base 5ºC O A G O 8-17 Deoxyribozyme O P Other Reactions: T Cu(II), st-DNA, R = Me, i-Pr, t-Bu - metal ions needed for activity O 4,4’-bipy O - Pd/Ir allylic amination A C 2+ O 22 to 70% ee G G C - Pb dependent RNA cleavage Selectfluor F - chiral sulfoxide formation C A G A - Zn2+ and Mg2+ = cyclic phosphate - epoxide resolution G CO R MES (pH 5.5) CO2R C 2 - L-DNA = other antipode C G - pH dependent Silverman, Chem. Commun. 2008, 3467. 5ºC Synlett. 2007, 1153. David Peters Baran Group Meeting DNA Chemistry 06/02/17

Oligonucleotide Therapeutics (base pairing) Oligonucleotide Therapeutic MOAs - Utilizes base recognition specificity to target specific 1. Aptamers - Challenges: entry/uptake, RNase resistance, toxicity, afinity, high specificity - specificity for single protein - Area for chemical inovation to overcome these challenges - modulate activity/binding of target Protein 2. Anti miRNA RNase H O Base O Base O Base O Base - decoy of mRNA targeted by O O O O native miRNA - UPREGULATION 1 3. RISC Recruiting 6 O O F S O O O O O NH - ds-oligo mimics siRNA P P P P - recuits RISC machinery to mRNA O O 2 O O Base O Base O O Base O Base cleave target mRNA X 5 O O O O 4. Splice Switching 3 4 - blocks splicing sites durring miRNA O F O O O NH pre-mRNA modification 5. miRNA Mimic pre-mRNA RISC 2’-Fluoro DNA Phosphorothioate Locked Nucleic Acid N3’-P5’- - blocks translation by forming improved uptake stops phosphotases Tm increases phosphoramidate duplex (miRNA-like) nuclease stability 6. Gapmers O Base O Base - DNA capped with stable RISC O O variants (see left) at 3’ and 5’ O Base - DNA/RNA duplex recuits RNase H causing cleavage R= CH3 or MeO O O O OR OMe P P O O Key Points On This Topic O P O Base O O - also known as “ASOs” or “antisense drugs” Base O Base O O O - claimed dianophore and pharmacophore separation – has not been true - biggest issue to address has been toxicity: specificity and ADME - complexity has risen as a result of poor efficacy O O OR O Stereochemsitry Methylphosphonate 2’-O-alkyl-RNA Cyclohexene - phosphate changes introduce stereocenter - stereorandom > stereopure (only one) reduces toxicity Nucleic Acid - Sp more resistant to nuclease - selective orders may be beneficial very popular n-1 O - Rp higher binding afinity - # of isomers = 2 O O NH 18-mer = 217 = 131072 isomers!! Base O Base Base Connectivity Conjugation N PAST 3’GalNAc O N O O P O NH 5’PUFA Me2N P O O O Oligo PUFA O O N O Base Base H O N HO OH Base O CURRENT O O N HO Oligo GalNAc Good Reviews/Perspectives: NHAc O splice switching 3 splice switching Morpholino O Watts, Nature Biotech. 2017, 238. Tricyclo DNA Phosphoramidate Peptide Nucleic Acid Ludnin, Human Gene Therapy 2015, 475. Bruice, Chem. Rev. 2012, 1284. David Peters Baran Group Meeting DNA Chemistry 06/02/17

DNA Origami (base pairing) Related Topics Worthy of Their Own Meeting: - Nucleic Acid Chemical Etiology (concept, design, synthesis) - Oligonucleotides - Nucleobases

“Keys”

Nature 2006, 297. Nature 2009, 73.

Chemical Etiology Asks the qustion: Why are nucleic acids the they are and not otherwise? Experimental Set-Up: 1. Use chemical reasoning to propose alternatives to natural nucleic acid components 2. Synthesize those alternatives 3. Study their properties with respect to nucleic acid functions as we know them 4. Compare the to natural nucelic acids (NA) Assumptions Made: - NA’s originated from a “combinatorial” process w/ respect to functional and informational storage abilities - selection took place in the domain of sugar based oligonucleotides - reasoned component constitutions must be similar to those of natural NAs - the reasoned components were available - must contain a nucleobase (or derivative) and have capacity for pairing in any fashion

A8•T8 A12•T12

10ºC unit

Tm estimated Eschenmoser, Science 1999, 2118. not investigated