Study of DNA G-quadruplex structures by Nuclear Magnetic Resonance (NMR)

Abdelaziz Kerkour

To cite this version:

Abdelaziz Kerkour. Study of DNA G-quadruplex structures by Nuclear Magnetic Resonance (NMR). Biomolecules [q-bio.BM]. Université de Bordeaux, 2014. English. NNT: 2014BORD0292. tel-01281325

HAL Id: tel-01281325 https://tel.archives-ouvertes.fr/tel-01281325 Submitted on 2 Mar 2016

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

THESIS PRESENTED

TO OBTAIN THE DEGREE OF

DOCTOR OF THE UNIVERSITÉ DE BORDEAUX

DOCTORAL SCHOOL: Sciences de la Vie et de la Santé

Specialty: Chemistry-Biology Interface

Presented by

Abdelaziz KERKOUR

Study of DNA G-quadruplex structures by Nuclear Magnetic Resonance (NMR)

Supervised by: Dr. Gilmar SALGADO

Defended on 15 December 2014

Jury members

Dr. Jean-Jacques Toulmé, INSERM U869, Université de Bordeaux, President

Dr. Serge Bouaziz, CNRS UMR 8015, Université de Paris 5, Reviewer

Dr. Janez Plavec, University of Ljubljana, Slovenia, Reviewer

Dr. Patrizia Alberti, MNHN, Paris, Examiner

Dr. Gilmar Salgado, INSERM U869, Université de Bordeaux, Thesis supervisor


CONTENTS

LIST OF FIGURES
ABBREVIATIONS
ACKNOWLEDGEMENTS
Extended summary
A. GENERAL INTRODUCTION
A.1 DNA structures
A.1.1 DNA components
A.1.2 DNA structural polymorphism
A.1.2.1 Duplex DNA
A.1.2.2 Triplex DNA
A.1.2.3 Guanine quadruplex (G4) structures
A.2 G4 structures
A.2.1 G-quartets
A.2.2 Structural polymorphism of G4
A.2.2.1 Number of strands
A.2.2.2 Orientation of strands
A.2.2.3 Conformation of loops
A.2.2.4 Nature of cations
A.2.2.5 Intra-molecular G4 motifs
A.2.2.6 Bulges in G4
A.3 Biological functions of G4
A.3.1 G4 in silico evidence
A.3.2 Examples of G4 localization in genomes
A.3.2.1 Promoters of genes
A.3.2.2 Minisatellites
A.3.2.3 Immunoglobulin class switching regions
A.3.2.4 RNA quadruplex
A.3.2.5
A.3.2.5.1 G4 structures of human telomeric DNA
A.4 G4 ligands
A.4.1 π-π stacking interaction with G-quartets
A.4.1.1 In situ protonated ligands
A.4.1.2 Aromatic N-methylated ligands
A.4.1.3 Neutral and negatively-charged macrocyclic ligands
A.4.1.4 Metallo-organic ligands
A.4.2 Ligands of grooves and loops
A.5 Goals of the doctoral research
A.5.1 Binding studies of 2,4,6-triarylpyridine family ligand with Human Telomeric Repeat G4
A.5.2 Characterization of a tetra-molecular G4 model d[TG4T]4 and its interaction with specific G4 ligands inside cells
A.5.3 NMR study of KRAS G4
B. Interaction of 2,4,6-triarylpyridine family ligand with the Human Telomeric Repeat G-quadruplex motif
B.1 Introduction
B.2 Materials and Methods
B.2.1 Biophysical characterization of G4
B.2.1.1 Synthesis and purification
B.2.1.2 Ligand preparation
B.2.1.3 CD and CD melting
B.2.2 NMR spectroscopy
B.2.3 Assignment and structure calculation
B.3 Results and discussions
B.3.1 Characterization of 22AG G-quadruplex formation by CD and NMR spectroscopy
B.3.2 Characterization of TAP compound
B.3.3 TAP binds to and stabilizes 22AG
B.3.4 NMR analysis of the TAP-22AG complex and binding site determination
B.3.5 Assignment of resonances belonging to the bound 22AG
B.3.6 Assignment of resonances belonging to the bound TAP
B.3.7 Intermolecular NOEs detection
B.3.8 Structure calculation
C. Study of G4 and ligand interaction inside living cell by NMR spectroscopy
C.1 Introduction
C.1.1 DNA vectorization methods
1) Microinjection
2) Electroporation
3) Lipofection
4) Biolistic method
C.1.2 In-cell NMR experiments
C.2 Materials and Methods
C.2.1 Oligonucleotides and ligands preparation
C.2.2 15N, 13C uniformly labeled oligonucleotides preparation
C.2.2.1 Chemical synthesis
A) Detritylation
B) Coupling
C) Capping
D) Oxidation
E) Deprotection
C.2.2.2 Enzymatic synthesis of uniformly 15N, 13C labeled oligonucleotides
C.2.2.3 Electrophoresis on denaturing polyacrylamide gel
C.2.2.4 Purification of the 15N, 13C uniformly labeled TG4T
C.2.3 G4 micro-injection of Xenopus laevis oocytes
C.2.3.1 Xenopus laevis oocytes micro-injection
C.2.3.2 NMR sample preparation
C.2.3.3 Cleared lysates of oocytes preparation
C.2.4 G4 electroporation in human cells
C.2.4.1 Cell culture manipulation
C.2.4.2 G4 transfection by electroporation (EP)
C.2.4.3 Flow cytometry
C.2.4.4 Confocal microscopy
C.2.4.5 In-cell NMR sample preparation for human cells
C.2.4.6 In-cell NMR lysates preparation for human cells
C.2.4.7 In-cell NMR experiment
C.3 Results and discussions
C.3.1 Synthesis of 13C, 15N labeled TG4T
C.3.1.1 Yield of TG4T synthesis
C.3.1.2 Yield of TG4T purification
C.3.1.3 Quality of produced 13C, 15N labeled TG4T
C.3.2 G4 DNA and ligand interaction study by in-cell NMR
C.3.2.1 Characterization of d[(TG4T)]4 inside Xenopus oocytes by NMR spectroscopy
C.3.2.2 Assessment of d[(TG4T)]4 leakage from Xenopus oocytes
C.3.2.3 Stability and behavior of d[(TG4T)]4 inside Xenopus oocytes
C.3.2.4 Favored forms of d[(TG4T)]4 inside Xenopus oocytes
C.3.2.5 Titration of d[(TG4T)]4 with ligands in vitro
A) Titration with Distamycin A
B) Titration with DMSB
C) Titration with 360A
C.3.2.6 Ligand binding to d[(TG4T)]4 inside Xenopus oocytes
A) Binding with Distamycin A
B) Binding with 360A
C) Binding with DMSB
C.3.3 G4 DNA in human cells
C.3.3.1 Optimization of electroporation conditions
C.3.3.2 G4 uptake efficiencies
C.3.3.3 Cell recovery evaluation
C.3.3.4 G4 distribution inside human cells
C.3.3.5 In cell NMR experiments
D. NMR study of KRAS G4
D.1 Introduction
D.2 Materials and methods
D.2.1 Oligonucleotides
D.2.2 Ligands preparation
D.2.3 CD and CD melting
D.2.4 NMR spectroscopy
D.3 Results and discussions
D.3.1 Screening of K-ras G4 forming sequences for NMR study
D.3.2 CD spectroscopy analysis of Kras 21R and 22RT with G4 ligands
D.3.3 Stabilization of 21R and 22RT in the presence of different ligand
D.3.4 NMR titration of 22RT with Braco19
E. Conclusions and perspectives
REFERENCE


LIST OF FIGURES

Figure 1: Nucleic acids components.
Figure 2: Duplex DNA structures.
Figure 3: Triplex DNA structure.
Figure 4: G4 DNA structure.
Figure 5: G4 polymorphism.
Figure 6: G4 in gene's promoters.
Figure 7: The six hallmarks of cancer shown with the associated G4 found in the regions of genes
Figure 8: Telomeric G4 characterization.
Figure 9: Human telomeric G4 polymorphism.
Figure 10: Chemical structures of G4 ligands generated by in situ protonation strategy.
Figure 11: Models of G4-in situ protonated ligand complexes.
Figure 12: Schematic presentation of the chemical structures of G4 ligands generated by N-methylation strategy.
Figure 13: Models of G4-N-methylated ligand complexes.
Figure 14: Schematic presentation of the chemical structures of G4 neutral and negatively-charged macrocyclic ligands.
Figure 15: Schematic presentation of the chemical structures of some G4 metallo-organic ligands.
Figure 16: Model of G4-ligand complex showing groove interaction mode.
Figure 17: Human telomeric repeat motif 22AG and TAP ligand.
Figure 18: Polymorphism of human telomeric sequence motif (22AG) characterized by CD.
Figure 19: Proton NMR spectrum of 22AG at 500 µM annealed in 90 mM Na+ solution.
Figure 20: Polymorphism of human telomeric motif formed by the sequence 22AG characterized by NMR.
Figure 21: Assignments of the compound TAP in both deuterated methanol (d6-MetOD) and Na-phosphate buffer.
Figure 22: Interaction and stabilization of 22AG human telomeric G4 with TAP.
Figure 23: Chemical shift mapping of 22AG in the presence of 2.5 molar equivalents of TAP compound.
Figure 24: 300 ms 2D 1H-1H NOESY spectrum of 22AG alone (black) superimposed with 300 ms 2D NOESY spectrum of TAP:22AG complex at 2.5:1 stoichiometry (red-green contours).
Figure 25: H6/H8-H1' assignment and sequential walking.
Figure 26: 2.5:1 TAP:22AG complex assignment.
Figure 27: Portions of a 120 ms 2D TOCSY (blue) superimposed with a 300 ms 2D NOESY spectrum (gold) of TAP:22AG complex at 2.5:1 stoichiometry.
Figure 28: Portion of a 2D COSY spectra (green) superimposed with 300 ms 2D NOESY spectrum (gold) of TAP:22AG complex at 2.5:1 stoichiometry.
Figure 29: 1H-13C HSQC spectra of TAP:22AG complex at 2.5:1 stoichiometry.
Figure 30: 1H-13C HSQC spectra of TAP:22AG complex at 2.5:1 stoichiometry.
Figure 31: 1H-15N HSQC spectrum of TAP:22AG complex at 2.5:1 stoichiometry.
Figure 32: Portion of 1H-1H NOESY spectrum of TAP:22AG complex at 2.5:1 stoichiometry (green) superimposed with NOESY spectrum of 22AG alone (red).
Figure 33: Solution structure of the G4 formed by 22AG in Na+ and in the presence of 2.5 molar equivalents of TAP
Figure 34: Region of hot spots derived from chemical shift perturbations mapping (green surface).
Figure 35: Interactions between 22AG and TAP observed by unrestrained docking.
Figure 36: Possible interactions between 22AG and TAP in the site 1 comprised between lateral loops and the major groove.
Figure 37: Possible interactions between 22AG and TAP in the site 2 comprised between diagonal and the minor groove.
Figure 38: Chemical synthesis cycle for preparation of oligonucleotides by phosphoramidite method.
Figure 39: Enzymatic synthesis strategy of 13C, 15N labeled TG4T.
Figure 40: The Xenopus laevis oocytes microinjection procedure and the in-cell NMR sample preparation.
Figure 41: Cell sub-populations based on FSC versus SSC
Figure 42: Analysis of enzymatic synthesis and alkaline cleavage products by electrophoresis on denaturing polyacrylamide gel at 20 % (cf C.2.2.3).
Figure 43: HPLC profile of the reactional mixture obtained from enzymatic synthesis of 13C, 15N labeled TG4T after alkaline cleavage (cf C.2.2.4).
Figure 44: NMR spectra of 13C, 15N labeled TG4T issued from enzymatic synthesis after alkaline cleavage, anion exchange HPLC purification and desalting.
Figure 45: Time evolution of 1H, 15N SOFAST-HMQC spectrum of d[(TG4T)]4 inside Xenopus oocytes.
Figure 46: Time spectral evolution 1H, 15N SOFAST-HMQC of buffer surrounding Xenopus oocytes micro-injected by d[(TG4T)]4.
Figure 47: Series of 1H, 15N SOFAST-HMQC spectra of d[(TG4T)]4 in different buffer conditions
Figure 48: Probing different ligands interaction with d[(TG4T)]4 G4 by 1D 1H and 1H, 15N SOFAST-HMQC in vitro.
Figure 49: Probing interaction of d[(TG4T)]4 G4 with distamycin A by 1H, 15N SOFAST-HMQC inside Xenopus oocytes.
Figure 50: Probing interaction between d[(TG4T)]4 and 360A with 1H, 15N SOFAST-HMQC in vitro and inside Xenopus oocytes.
Figure 51: 1H, 15N SOFAST-HMQC spectrum of d[(TG4T)]4 with 360A from the cytoplasm of ~200 oocytes after heating at 80 °C (light blue), overlaid on spectra of in vitro titration of d[(TG4T)]4.
Figure 52: 1H, 15N SOFAST-HMQC spectrum of d[(TG4T)]4 with DMSB from the cytoplasm of ~200 oocytes after heating at 80 °C (light blue), overlaid on spectra of in vitro titration of d[(TG4T)]4.
Figure 53: Cell loss after electroporation for three different cell lines HeLa, A2780 and HEK293T.
Figure 54: Delivery of d[(TG4T)]4 formed by TG4T-cy3 and unlabeled TG4T at a ratio of 1:3 in 1) HeLa cells, 2) A2780 and 3) HEK293T by EP.
Figure 55: Viability test of attached and non attached cell populations 1.
Figure 56: Intracellular distribution of d[(TG4T)]4 G4 formed by TG4T-cy3 and unlabeled TG4T at a ratio of 1:3, delivered by electroporation in A) HeLa cells, B) HEK293T and C) A2780.
Figure 57: 1H, 15N SOFAST-HMQC spectrum of d[(TG4T)]4 from the cytoplasm of HeLa cells after heating at 80 °C (light green), overlaid on spectra of in vitro d[(TG4T)]4 G4 (red) and with 2 molar equivalent of 360A (magenta).
Figure 58: Screening of Kras G4 candidates using 1D 1H NMR
Figure 59: CD titration of Kras 21R and 22RT with 360A.
Figure 60: CD titration of Kras 21R and 22RT with Braco19.
Figure 61: CD titration of Kras 21R and 22RT with PhenDC3.
Figure 62: 21R and 22RT stabilization by various ligands
Figure 63: 1D 1H NMR titration of 22RT with Braco19 compound


ABBREVIATIONS

1D: One-dimensional
2D: Two-dimensional
UV: Ultraviolet
CD: Circular dichroism
NMR: Nuclear magnetic resonance
DNA: Deoxyribonucleic acid
RNA: Ribonucleic acid
NOESY: Nuclear Overhauser effect spectroscopy
TOCSY: Total correlation spectroscopy
COSY: Correlation spectroscopy
HSQC: Heteronuclear single quantum correlation
HMQC: Heteronuclear multiple quantum correlation
HMBC: Heteronuclear multiple bond correlation
WaterLOGSY: Water-ligand observed via gradient spectroscopy
STD: Saturation transfer difference
ppm: parts per million
FRET: Fluorescence resonance energy transfer


ACKNOWLEDGEMENTS

First of all, I would like to thank all the members of my thesis committee for agreeing to evaluate my work. I also thank my supervisor, Dr. Gilmar Salgado, for the opportunity to pursue a PhD. I thank Dr. Jean-Louis Mergny for welcoming me into his lab. Thanks also to Dr. Samir Amrane for his advice and help. Finally, I would like to thank all those who offered their time and support over the last three years.

I dedicate this work to my family.


Extended summary

G-quadruplexes (G4) are non-canonical nucleic acid structures formed by guanine-rich sequences in the presence of monovalent cations such as K+ and Na+. Their stability comes mainly from the stacking of tetrads made up of four guanines (G-tetrads), which are held together by a network of hydrogen bonds within and across the tetrads. Computational and experimental analyses of the human genome have revealed a large number of sequences able to form G4 structures, located mainly in telomeres and in the promoter regions of various oncogenes. Although G4 have been known for decades, interest in the field grew sharply in the late 1990s, when an inhibitory effect on telomerase activity was demonstrated. Telomerase is a reverse transcriptase involved in the immortalization of most cancer cells. The enzyme is up-regulated in 85% of cancer cells, and this deregulation makes cancer cells immortal. Inhibiting telomerase would therefore be a suitable approach for anti-cancer therapy. Another approach would be to target the substrate of telomerase, the 3' end of the single-stranded telomeric DNA, which is able to form G4. To catalyze the elongation of the telomeric sequence by addition of d(T2AG3) units, telomerase requires single-stranded DNA. G4 formation therefore hinders telomerase activity and prevents elongation of the single-stranded DNA. G4 structures can also be recognized by small molecules that stabilize them, most often through π-π stacking interactions. Even though the existence of G4 in vivo is not completely elucidated, the biological effects of G4-selective ligands must be taken into consideration.

Indeed, the development of ligands that stabilize G4, or induce their formation at the single-stranded 3' end that serves as the telomerase substrate, can prevent telomere elongation, which could be promising for cancer treatment. Given the biological relevance of G4 structures and of their interaction with specific ligands, the aim of my thesis work is the characterization of G4 DNA structures by NMR spectroscopy at different levels. Three different topics are studied in this manuscript:

1) Interaction of a G4-specific ligand of the 2,4,6-triarylpyridine family with the human telomeric G4:

The first project concerns the interaction between the human telomeric motif formed by the sequence 22AG and a ligand synthesized in our laboratory, the 2,4,6-triarylpyridine compound (TAP), which was the best candidate from a family of ligands that showed promising G4-stabilizing effects. On the one hand, it shows high specificity and selectivity for G4 structures under different buffer conditions (Na+ and K+) compared with the canonical forms of DNA; on the other hand, it confers high stability on the human telomeric G4, as shown by FRET melting experiments. The main objectives of this project are to find out how TAP binds to the G-quadruplex formed by the 22AG sequence and to study the chemical properties that allow it to stabilize the G4 structure with such large thermal stabilization values (> 30 °C). Using CD and 1D NMR spectroscopy, I performed titrations under different conditions to confirm that TAP binds 22AG with moderate affinity. Our observations lead us to conclude that the ligand appears to impose slight modifications on the structure of 22AG. This explanation is further supported by ligand aggregation, which inhibits proper binding between the ligand and 22AG. The CD melting results showed that TAP stabilizes the G4, as expected from the earlier FRET melting results. Using 2D NMR spectroscopy, I assigned all the atoms of the G4 with the help of the published results for 22AG alone. I also calculated the structure of the free form with more NOEs than in the published structure, which allows better validation of the model, and then extrapolated the assignments of the free form to assign the complex. Chemical shift perturbation mapping allowed us to identify the potential binding region, located between the lateral loop formed by T17, T18 and A19 and the tetrad formed by guanines 4, 8, 16 and 20.

Unfortunately, no intermolecular NOE could be identified unambiguously to provide decisive evidence for the interaction: peak overlap in the aliphatic region makes it impossible to distinguish ligand-ligand NOEs from ligand-22AG NOEs. A few new peaks are observed in the aromatic region, but we could not assign them unambiguously. We believe they could be intermolecular NOEs, but even so they are not sufficient to locate the interaction region. However, docking studies showed that two different interaction sites are possible: the most favored one corresponds to the region determined by the NMR approach described above, and the other is located between the diagonal loop formed by T11, T12 and A13 and the minor groove of the G4. In this work we clearly showed that TAP binds to the G4 at the grooves and loops with low binding affinity. Extending the 4-aryl moiety of TAP appears to be a good strategy for improving the affinity of the molecule for G4, because the side chains can more easily establish interactions with the grooves of the G-quadruplex. To study the structural features of the interactions in more detail, we will probably carry out crystallization trials on the TAP-22AG pair in order to confirm the interaction mode and to investigate experimentally the existence of the second binding site. In the future, other NMR experiments such as STD and WaterLOGSY, which were unsuccessful under our experimental conditions, must be optimized to obtain the solution structure of the G4 formed by 22AG. Other G4 sequences such as c-myc and K-ras will also be tested to confirm the structural features observed with 22AG.

2) Characterization of the tetramolecular G4 d[TG4T]4 and its interaction with specific ligands inside living cells.

In the second part of our project, we characterized a tetramolecular G4 formed by d[TG4T]4 inside living cells. We clearly showed that the G4 structure is maintained in cells and can be detected by NMR spectroscopy, which is known not to be a very sensitive method. In addition, the d[TG4T]4 G4 appears to be resistant to many enzymatic activities and can be considered a long-lived species in cells. The differences observed between the in vivo and in vitro results raise new, relevant questions that must be addressed in future experiments on G4-ligand interaction. The ligand bound to the oligonucleotide seems to enhance the binding of certain biomolecules, and/or the complex precipitates inside the cell, which makes it invisible on the NMR time scale. Our results, together with previous NMR studies of the human telomeric sequence d(AG3(T2AG3)3), give us fresh clues on how to proceed in studying oligonucleotides, and more particularly G4 structures, inside living cells with ligands. The results of the ligand interaction studies inside Xenopus oocytes show that, despite the toxicity these ligands may have at high concentrations, the oocytes remained intact with one of them (360A) for periods longer than 24 hours. Moreover, this ligand has a very interesting effect on the G4 both in vitro and in vivo. 360A can be used to test other G4 formed by the human telomeric motif, c-myc and K-ras. The optimization I carried out for the G4 experiments in human cells encourages us to investigate in more detail the possibility of detecting G4 NMR signals in a medium as complex as that found in human cells.

The electroporation protocol is properly optimized and delivers a good quantity of G4 inside the cells, particularly when compared with the quantities obtained with proteins (5-12 µM). The d[TG4T]4 oligonucleotide appears to interact with cellular partners inside human cells, and its binding to these large molecules in a very crowded intracellular environment prevents proper observation of the signals by solution NMR. With this in mind, we started a collaboration with Dr Antoine Loquet of CBMN (Institut de Chimie et Biologie des Membranes et des Nano-objets) in Bordeaux in order to perform solid-state NMR and obtain more information on the folding and interaction of the G4 inside cells. In the shorter term, we think that shorter, more compact alternative sequences may give visible signals, and that longer and probably more relevant sequences than TG4T, such as c-myc or the human telomeric motif, can also be used to test G4 folding and ligand binding inside mammalian cells by NMR spectroscopy.

3) Investigation of a new G4 DNA topology formed in the promoter of the human K-ras gene.

Besides these two projects, we screened several G4-forming sequences located in the NHE region of the Kras promoter. Among them, two sequences (21R and 22RT) gave good spectral quality, as shown by 1D NMR. In addition, their interaction with G4-specific ligands (360A, PhenDC3 and Braco19) confers strong stabilization (Tm > 30 °). The NMR titration of 22RT with Braco19 revealed interesting behavior of the G4 in complex with the ligand: an intermediate is observed in the duplex region before the final stabilization of 22RT and Braco19 in a 1:1 complex. In the future, we intend to solve the structure of the G4 formed by 22RT by NMR spectroscopy, followed by determination of the structure of the complex formed between the G4 and the ligand Braco19. These investigations will allow us to better understand how ligands bind to G4 structures and how the different G4 scaffolds can be used for the development of new molecules.


General Introduction

A. GENERAL INTRODUCTION

A.1 DNA structures

A.1.1 DNA components

Nucleic acids consist of a succession of elementary units called nucleotides, linked by phosphodiester bonds. Nucleotides can be generated by the phosphorylation of nucleosides. Each nucleotide therefore consists of a phosphate group and a nucleoside, itself formed by a nucleobase covalently linked to a pentose through a glycosidic bond (Fig. 1A). Four primary nucleobases appear in DNA: adenine (A), cytosine (C), guanine (G) and thymine (T); thymine is replaced by uracil (U) in RNA (Fig. 1B). The pentose is 2-deoxy-D-ribose in DNA and D-ribose in RNA.

Figure 1: Nucleic acids components.

(A) The chemical structures of a nucleotide and a nucleoside. Nucleobase in green, pentose in black and phosphate group in blue. (B) Schematic structures of the nucleobases constituting DNA and RNA, with conventional numbering of atoms. The H-bond sites involved in Watson-Crick and Hoogsteen base pairing are shown on the guanine nucleobase (donor and acceptor sites marked by blue and red asterisks, respectively).

A.1.2 DNA structural polymorphism

A.1.2.1 Duplex DNA

In 1953, James Watson and Francis Crick described the first double-helix structure of DNA (1,2). DNA is usually represented as a right-handed double helix formed by two anti-parallel strands held together by complementary base pairing, with two hydrogen bonds between A and T and three between C and G (Fig. 2A). The helical structure of DNA is stabilized by π-π stacking interactions between consecutive bases. This form, called the B form (Fig. 2B), is the canonical conformation predominantly found in cells. The sugar pucker is in the C2'-endo/C3'-exo conformation and the glycosidic bond connecting the base to the sugar adopts an anti conformation. Nevertheless, the B form is not the only conformation that duplex DNA can adopt. Another right-handed duplex form, the A form (Fig. 2B), is observed under conditions of low humidity and differs from the B form by its sugar pucker, which is in the C3'-endo conformation. Besides these two forms, a left-handed conformation called the Z form has also been described (3). This structure is favored at high salt concentrations and requires alternating purine-pyrimidine dinucleotides, especially C and G. At pyrimidines, the sugar pucker is C2'-endo/C3'-exo and the glycosidic bond adopts an anti conformation. At purines, by contrast, the sugar pucker is C3'-endo/C2'-exo and the glycosidic bond adopts a syn conformation. The change to the syn position (Fig. 2C) flips the bases over within the helix and gives the backbone of left-handed DNA a zig-zag appearance (Fig. 2B).

A.1.2.2 Triplex DNA

In 1975, G. Felsenfeld et al. observed a structure formed by a triple helix (4). It is formed by the insertion of a third strand into the major groove of duplex DNA (Fig. 3B). Formation of this structure requires an asymmetric double helix in which one strand contains mostly pyrimidine bases while its complementary strand consists mainly of purine bases. The purine-rich strand is recognized by the third strand through non-canonical base pairing, and the hydrogen bonds established in this way account for the formation of the triplex. This non-canonical base pairing was characterized by Karst Hoogsteen using X-ray diffraction and is called "Hoogsteen" base pairing (5,6) (Fig. 3A).

A.1.2.3 Guanine quadruplex (G4) structures

Besides these DNA structures, another non-canonical base pairing can be established between guanines within guanine-rich DNA sequences, each guanine taking part in four hydrogen bonds that involve its "Watson-Crick" and "Hoogsteen" edges. The association of four guanines through a network of eight hydrogen bonds generates a planar structure called a G-quartet (7). The stacking of several G-quartets forms a G4 (8) (Fig. 4). These structures, the main subject of this work, are described in more detail in the rest of the manuscript. G4 structures are of great interest because of their structural diversity and their potential use as therapeutic targets (9-12).


Figure 2: Duplex DNA structures.

(A) Canonical base pairing of nucleotides within the double helix of DNA. (B) Models of A, B and Z DNA. The B and A forms are right-handed helices. The B form has one helical turn every 10 base pairs (3.4 nm), that is 0.34 nm per base pair. The A form is more compact, with 11 base pairs per turn. The Z form is left-handed with a zig-zag backbone. (C) Syn and anti conformations of the glycosidic bond angle of guanines, as in B and Z DNA, and C2'-endo/C3'-exo and C2'-exo/C3'-endo sugar puckers of B and Z DNA. Images generated using UCSF Chimera (13).
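As a quick consistency check on the helical parameters quoted in the caption of Figure 2, the rise per base pair and the average twist per base pair follow directly from the number of base pairs per turn. The twist values are derived here and are not stated in the text; the symbols h (rise), P (pitch), n (base pairs per turn) and Ω (twist) are introduced only for this illustration.

```latex
\begin{align*}
h_{\mathrm{B}} &= \frac{P_{\mathrm{B}}}{n_{\mathrm{B}}} = \frac{3.4\ \mathrm{nm}}{10} = 0.34\ \mathrm{nm\ per\ base\ pair},\\
\Omega_{\mathrm{B}} &= \frac{360^\circ}{n_{\mathrm{B}}} = \frac{360^\circ}{10} = 36^\circ,\\
\Omega_{\mathrm{A}} &= \frac{360^\circ}{n_{\mathrm{A}}} = \frac{360^\circ}{11} \approx 32.7^\circ .
\end{align*}
```

The smaller twist per base pair of the A form simply reflects its larger number of base pairs per turn.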


Figure 3: Triplex DNA structure.

(A) Non-canonical base pairing of nucleotides within a triple helix of DNA. (B) Model of triplex DNA adapted from the structure pdb 149D obtained by NMR (14). Images generated using UCSF Chimera (13).

Figure 4: G4 DNA structure.

(A) Schematic structure of guanine with the Hoogsteen and Watson-Crick faces; the atoms marked with asterisks are involved in the H bonds. (B) Representation of a G-quartet formed by the association of four guanines. (C) Model of a G4 formed by the stacking of three G-quartets, adapted from the structure of the human telomeric sequence AG3(T2AG3)3 (PDB code 143D) obtained by NMR (15). Images generated using UCSF Chimera (13).


A.2 G4 structures

A.2.1 G-quartets

The planar structures formed by the association of four guanines are called G-quartets (Fig. 4B). The guanines are linked by two networks of H-bonds. The first is established between the carbonyl oxygen in position 6 (O6) and the imino hydrogen in position 1 (N1-H); the second is established between the nitrogen in position 7 (N7) and a hydrogen of the amino group in position 2. G-quartets are stabilized by the presence of monovalent cations, mainly K+ and, to a lesser degree, Na+ (16) and NH4+ (17). Other monovalent and divalent cations such as Li+, Rb+, Sr2+, Ca2+ and Pb2+ can also be incorporated between G-quartets and stabilize their formation (17,18). The cations interact with the O6 carbonyl oxygen atoms of the four guanines during the stacking of G-quartets (19). The sugar pucker of the guanines forming a G-quartet is usually in a C2'-endo/C3'-exo conformation. However, the glycosidic bond angle can adopt two different orientations, syn and anti. The syn/anti conformations of the guanines within a G-quartet define the dimension of the groove between two adjacent guanines. G4 present three types of grooves: wide, narrow or medium (20). The combination of these different conformations gives rise to the structural polymorphism specific to G4, which can contain two or more G-quartets, can be arranged in intermolecular or intra-molecular structures, and can differ in the orientation of their strands.

A.2.2 Structural polymorphism of G4

The stacked G-quartets linked by the phosphate backbone constitute the invariant core of all G4 structures. Even when they are not connected by a phosphate backbone, guanosines are able to form a helical structure by self-assembly and stacking of G-quartets (21). A strong cooperation between three key factors, namely H-bonding, dipole interactions and π-π stacking interactions, therefore determines the structural diversity of G4. Different parameters have been described to explain this diversity, mainly the number of strands, their orientations, the conformation of the loops and the nature of the cations.

A.2.2.1 Number of strands

According to the number of DNA molecules involved in the formation of a G4, two main categories can be described. Intermolecular G4 include three subcategories: tetra-molecular G4 formed by the association of four strands, bimolecular G4 formed by two strands and, more recently, a G4 formed with three strands by Zhou et al. (22). The intra-molecular or monomolecular G4 is formed by one strand containing several blocks of guanines able to fold back and form a G4 structure (Fig. 5B).

A.2.2.2 Orientation of strands

The orientation of each strand within the G4 is defined by the 5'-3' directionality of the phosphate backbone. Depending on the orientation of the aromatic base relative to the sugar, the guanines can adopt two different conformations within the G-quartet, syn and anti (Fig. 5A). When all the guanines have the same conformation, the strands possess the same polarity and the structure is called parallel. However, if the guanines present different glycosidic bond angle (GBA) conformations, 16 conformations are in theory possible for each G-quartet, and these govern the orientation of the strands (23). Accordingly, 16 strand orientations are possible, grouped in 4 categories: (a) a parallel conformation containing 4 strands oriented in the same 5'-3' direction, (b) a hybrid (3+1) conformation presenting 3 strands in the same direction and the 4th strand in the opposite direction, (c) an anti-parallel conformation with 2 adjacent strands oriented in the same direction and the other two in the opposite direction, and (d) another anti-parallel conformation where each pair of adjacent strands is oriented in opposite directions (Fig. 5B).
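The figure of 16 strand orientations and their grouping into four categories can be checked by brute-force enumeration. The following is a minimal sketch, assuming each of the four strands is simply labeled 'U' (up) or 'D' (down) around the quartet; the function name `classify` and the category labels are illustrative and do not come from reference (23).

```python
from collections import Counter
from itertools import product


def classify(strands):
    """Classify one cyclic arrangement of four strand directions ('U' or 'D')."""
    ups = strands.count("U")
    if ups in (0, 4):
        return "parallel"
    if ups in (1, 3):
        return "hybrid (3+1)"
    # Two up and two down: test whether every neighbour around the quartet differs.
    alternating = all(strands[i] != strands[(i + 1) % 4] for i in range(4))
    if alternating:
        return "anti-parallel, adjacent strands opposite"
    return "anti-parallel, two adjacent strands alike"


# Enumerate the 2**4 = 16 possible direction assignments and count each category.
counts = Counter(classify(s) for s in product("UD", repeat=4))
for category, n in sorted(counts.items()):
    print(f"{category}: {n}")
print("total:", sum(counts.values()))
```

Under these assumptions the 16 arrangements split into 2 parallel, 8 hybrid (3+1), 4 anti-parallel with two adjacent strands alike and 2 anti-parallel with alternating strands.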

A.2.2.3 Conformation of loops

The intra-molecular and bimolecular structures generally present 3 and 2 loops, respectively. The loops are the sequences located between the blocks of guanines, and they link the G-quartets. They present different conformations: (a) diagonal loops connecting two opposite anti-parallel strands, (b) lateral or edgewise loops connecting two adjacent anti-parallel strands, (c) propeller or chain-reversal loops connecting two adjacent parallel strands and (d) V-shaped or snap-back loops connecting the corners of the G-quartet core (Fig. 5C). The loops have an important impact on the conformational polymorphism and the stability of G4. The conformation of the loops depends markedly on the length of their sequences. In fact, loops containing one or several nucleotides are able to link more than two G-quartets (24). Short loops are usually external and favor a stable parallel conformation, whereas longer loops give rise to anti-parallel conformations with reduced G4 stability (25-28). The nucleotide type within the loops also plays a role in G4 stability: the presence of adenine markedly reduces stability compared with pyrimidines (C and T) (29).

A.2.2.4 Nature of cations

G4 interact with cations in two different manners: a specific interaction with the central ions inserted between G-quartets, and an interaction with external ions which contribute to the partial screening of the negative charges of the phosphate backbone. The internal interaction was first evidenced by crystallography (30,31), where dehydrated cations are inserted between the carbonyl groups of the upper and lower G-quartets. Recently, Sket et al. demonstrated using NMR spectroscopy that the smaller protons (1H) are able to exchange from one ammonium ion to another within the d(G3T4G4)2 dimeric G4, while the larger ammonium ions cannot move through the central G-quartets (32). Depending on the cation type, different G4 structures have been identified and resolved by both NMR and crystallography, most notably for the telomeric G4, which can adopt several conformations (Fig. 9).

A.2.2.5 Intra-molecular G4 motifs

The formation of an intra-molecular G4 requires a DNA sequence containing at least 4 blocks of guanines, with more than 2 guanines in each block, separated by a variable number of nucleotides. This involves the formation of three single-stranded loops in the G4 structure, in which the G-tracts form continuous columns supporting the G-tetrad core while the linkers form loops connecting the corners of the G-tetrad core. The number of guanines in each block and the size of the loops are critical for the stability of the structure. Measurements of thermodynamic stability as a function of loop length and of the proximity of guanines within G4 led to the formulation of a consensus sequence capable of forming an intra-molecular G4, which is used in the search for putative quadruplex sequences (PQS) in the genome: $G_{3-5}N_{L1}G_{3-5}N_{L2}G_{3-5}N_{L3}G_{3-5}$ (25). To date, most of the resolved G4 structures comply with this consensus. Nevertheless, several examples have been shown to deviate from it.
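To make the consensus concrete, the sketch below splits a candidate sequence into its four G-tracts and three loops using a regular expression. It is only an illustration: the function names are hypothetical, and the loop length bounds (1 to 7 nucleotides by default) are an assumption, since the consensus as quoted here leaves L1, L2 and L3 unspecified.

```python
import re


def consensus_pattern(min_g=3, max_g=5, min_loop=1, max_loop=7):
    """Build a regex for G-tract, loop, G-tract, loop, G-tract, loop, G-tract.

    The loop bounds are illustrative assumptions, not values from the text.
    """
    tract = f"(G{{{min_g},{max_g}}})"
    # Lazy loops, so that each loop stops at the first following G-tract.
    loop = f"([ACGT]{{{min_loop},{max_loop}}}?)"
    return re.compile(tract + loop + tract + loop + tract + loop + tract)


def split_motif(seq, **bounds):
    """Return the G-tracts and loops of the first match, or None if no match."""
    match = consensus_pattern(**bounds).search(seq.upper())
    if match is None:
        return None
    groups = match.groups()
    return {"tracts": list(groups[0::2]), "loops": list(groups[1::2])}


# Human telomeric 22-mer d(AG3(T2AG3)3) written out in full.
print(split_motif("AGGGTTAGGGTTAGGGTTAGGG"))
```

A sequence whose guanine blocks are shorter than three residues, or that has fewer than four blocks, returns `None`, which mirrors the way such sequences fall outside the consensus even though some of them are known to fold into G4.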

A.2.2.6 Bulges in G4

As mentioned above, exceptions to the consensus sequence have been reported by NMR and crystallography, such as the sequence Pu24 from the human c-myc promoter (33) and c-kit87up from the human c-kit promoter (34,35). These two sequences show a discontinuous arrangement of guanines in one column of the G-tetrad core, despite the presence of four G-tracts each having at least three contiguous guanines. Bulges, which are projections of bases out of the G-tetrad core, have also been observed. Recently, Phan et al. showed that many different bulges can exist in G4 structures (36). They vary in sequence, size, position and number within the G4. Expanding the description of G4 motifs to include them should help to identify more potential G4-forming sequences.


Figure 5: G4 polymorphism.

(A) Conformations of the guanine base relative to the sugar: anti (right) and syn (left). The average H8-H1' distances corresponding to each conformation are shown in red. (B) Schematic representation of different G4 conformations according to molecularity (tetra-, bi- and uni- or intra-molecular) and strand orientation (parallel, anti-parallel and hybrid (3+1)) (36). (C) Schematic representation of different loop conformations within G4 (lateral, propeller, diagonal, V-shaped and snap-back loops) according to (37).


A.3 Biological functions of G4

A.3.1 G4 in silico evidence

The ability of certain guanine-rich sequences to spontaneously form intramolecular G4 structures under conditions mimicking the physiological environment suggests that they exist in vivo. Besides these experimental observations, different in silico approaches have been used to locate sequences capable of forming G4 within the genomes of different species. Notably, the Quadparser algorithm (38) identifies sequences matching the motif $d(G_{3+}N_{1-7}G_{3+}N_{1-7}G_{3+}N_{1-7}G_{3+})$. This program is commonly used to identify putative quadruplex sequences (PQS) capable of forming G4 in the human genome. Using Quadparser, more than 370,000 sequences were detected. More recently, another tool representing an extension of Quadparser was developed (39). Together with the Quadparser predictive algorithm, this data repository contains a large database of predicted G4 (QuadDB) in the human and other genomes. When the number of successive guanines was limited to 5 nucleotides, the result obtained was nearly identical to that of Quadparser (40). The loop length is limited to 7 nucleotides in order to reduce the complexity of data processing, which suggests that such sequences are even more widespread than predicted, as stable G4 can be formed with longer loops (41). The potential for G4 DNA formation (G4P) is based on a more qualitative description: a hit is a sequence showing at least four blocks of three guanines in a window of more than 100 nucleotides, with a sliding interval of 20 nucleotides. The G4P is defined as the percentage of hits in the total number of analyzed windows. More recent analyses showed that the density of PQS is higher in promoters, close to the transcription start sites (TSS). The PQS density and the G4P are correlated with the GC content of the genome. However, it is important to note that they are not correlated with the enrichment of CpG sites, which requires the alternation of G and C bases rather than an asymmetry with G bases on one strand and C bases on the other.
Furthermore, the density of PQS is also correlated with the presence of DNase I hypersensitive sites (DHS). These sites are characteristic of regions undergoing a transition through the opening of B-DNA and the conformational changes in promoters required for the activation of transcription. G4 structures may act as cis-regulatory elements (CREs) for 40 % of human genes (42). In addition, a correlation between G4P and gene function has been highlighted. Several functions, mainly G-protein coupled receptors, sensory perception, nucleosome assembly, nucleic acid binding, the ubiquitin cycle, adhesion and cell division, are characterized by a low G4P. Conversely, a high G4P is associated with functions such as development, immune response, activity of transcription factors, cell signaling, muscle contraction, growth factors and cytokines. It is also interesting to note that tumor suppressor genes are characterized by a low G4P, as opposed to proto-oncogenes, which have a high G4P (43). These studies suggest that G4 could play dynamic roles within the cell. Besides studies on the human genome, bioinformatics identified 854 PQS in Saccharomyces cerevisiae using an algorithm with a sliding window of reasonable size (< 50 nucleotides) (44,45). Nevertheless, these predictive tools are limited. The precise prediction of structure and stability for a new sequence remains inaccurate (46). The experimental data available describe only the folding of the single strand; they do not take into consideration the presence of the complementary strand, which favors the formation of the DNA duplex. G4 formation requires the denaturation of the DNA duplex, which occurs during replication, transcription or recombination (47). Only telomeric overhangs may form G4 structures without the need to dissociate a complementary strand. Moreover, most experimental observations are acquired under in vitro conditions. Therefore, they do not take into account the interactions of G4 with biological partners, mainly proteins such as helicases (48) and nucleases such as topoisomerase I (49,50) and topoisomerase II (51). Even though limited, the in silico predictions performed on different genomes, namely the human (38), HIV (52) and Saccharomyces cerevisiae (45) genomes, provide evidence that G4 sequences are not randomly located in genomes but are specifically distributed in particular regions.
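The two descriptors discussed in this section, the Quadparser-type motif and the G4P, can be sketched in a few lines. The code below is a simplified illustration and not the published implementations: it scans a single strand only, counts non-overlapping matches, and assumes a window of exactly 100 nucleotides with a step of 20 nucleotides for the G4P. The names `count_pqs` and `g4_potential` are hypothetical.

```python
import re

# Quadparser-type motif: four runs of at least 3 G separated by loops of 1-7 nt.
PQS = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")


def count_pqs(seq):
    """Count non-overlapping motif matches on the given strand only."""
    return sum(1 for _ in PQS.finditer(seq.upper()))


def g4_potential(seq, window=100, step=20, min_runs=4):
    """Percentage of sliding windows holding at least min_runs GGG blocks.

    window and step are assumed values; sequences shorter than one window
    are analyzed as a single window.
    """
    seq = seq.upper()
    hits = total = 0
    for start in range(0, max(len(seq) - window, 0) + 1, step):
        chunk = seq[start:start + window]
        total += 1
        # re.findall returns non-overlapping GGG blocks within the window.
        if len(re.findall("GGG", chunk)) >= min_runs:
            hits += 1
    return 100.0 * hits / total


# Toy example: four telomeric repeats followed by an A-rich stretch.
toy = "GGGTTA" * 4 + "A" * 176
print(count_pqs(toy), g4_potential(toy))
```

A genome-wide analysis would also need to scan the complementary strand (C-rich runs) and to decide how overlapping motifs are merged, two points on which the published tools make specific choices that are not reproduced here.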

A.3.2 Examples of G4 localization in genomes

A.3.2.1 Promoters of genes

The promoter is the region of a gene located upstream of the transcription start site. Promoters are often characterized by a higher guanine content and by sensitivity to DNase I, suggesting that B-DNA can be easily opened in these regions. According to in silico analysis, more than 40 % of human genes contain at least one G4 motif within their promoters (42). After the telomeres, these regions are the most studied for their potential to form G4. Proto-oncogenes seem to be particularly enriched in PQS, contrary to tumor suppressor genes (43). The formation of intramolecular G4 has been studied in vitro for different sequences of these promoters, such as c-myc (33,53,54) (Fig. 6A), c-kit (55) (Fig. 6C) and bcl2 (56,57) (Fig. 6B) in the human genome. Furthermore, in the HIV genome, we recently resolved by NMR the topology of a sequence called HIV-PRO1 belonging to the HIV promoter (52). Other proto-oncogenes, such as Kras (58) and VEGF (59), are less studied. These proto-oncogenes are involved in different processes, namely self-sufficiency in growth signals, insensitivity to anti-growth signals, sustained angiogenesis, evasion of apoptosis, tissue invasion and metastasis, and limitless replicative potential (Fig. 7) (60).

Figure 6: G4 in gene promoters.

Models of different G4 structures formed in gene promoter regions, resolved by NMR. Each model is labeled with its PDB ID at the bottom. (A) Structures formed in the c-myc promoter. 1XAV: parallel G4 formed in K+ (53). 2A5P: parallel G4 formed in K+ with all anti guanines and a snapback 3'-end syn guanine (33). 2LBY: parallel G4 formed in K+ (61). (B) Structures formed in the bcl2 promoter. 2F8U: hybrid G4 formed in K+ containing two lateral loops and one side loop (57). (C) Structures formed in the c-kit promoter. 2KYO: monomeric parallel G4 formed in K+ (62). 2KYP: dimeric parallel G4 formed in K+ (62). 2O3M: monomeric G4 formed in K+ containing two single-residue double-chain-reversal loops, a two-residue loop and a five-residue stem-loop (55). 2KQG, 2KQH: monomeric parallel propeller-type G4 formed in K+ (63). Images generated using UCSF Chimera (13).

A.3.2.2 Minisatellites

Minisatellites are DNA segments located mainly near the ends of chromosomes that consist of repeated sequences of 5 to 100 nucleotides. They are known for their hypermutability and genomic instability (64). It has been shown that their general function is to provide binding sites for recombination proteins in eukaryotes (65). Certain minisatellites contain guanine-rich repeats that are able to form G4 structures. Amrane et al. solved the G4 structure formed in K+ by the conserved 26-nucleotide guanine-rich fragment of the CEB25 motif, which forms a monomeric propeller-type parallel-stranded G4 (66). Another structure, a dimeric G4 formed by the CEB1 minisatellite, was recently resolved (67). This G4 formation is in agreement with the instability observed in these genomic regions.


Figure 7: The six hallmarks of cancer shown with the associated G4 found in the promoter regions of genes. Figure adapted from (60).

A.3.2.3 Immunoglobulin class switching regions

The genes coding for the immunoglobulin heavy chain undergo a biological mechanism called class switch recombination (CSR), which brings the constant-region portion close to the variable regions rearranged during B-lymphocyte differentiation. These regions consist of guanine-rich degenerate repeats with a total length that varies between 1 and 10 kb. The ability of guanine-rich motifs to form intermolecular G4 structures was shown for the first time in the CSR regions. The authors also proposed a potential role for such structures at the telomeres, particularly during meiosis (68).


A.3.2.4 RNA quadruplex

G4 formation by RNA sequences containing several blocks of guanines is possible in both coding and non-coding RNA. It has also been shown that some G4 RNA possess remarkable stability (69-71). In addition, the formation of G4 RNA seems all the more relevant since RNA is single-stranded. Such structures could play key roles in RNA regulation and translation. G4 RNA can also be related to genomic stability and disease. The human fragile X syndrome is caused by the loss of the ability of the FMRP protein to bind RNA or by the repression of the FMR1 gene encoding the protein. In vitro selection of RNA bound by FMRP showed that the protein binds to G4-prone sequences (72). Certain mRNAs coding for proteins involved in neuronal and synaptic development can also form G4 structures (72,73). Bioinformatics studies demonstrated that the 5' and 3' untranslated regions (UTRs) of mRNA are G-rich. It has been proposed that guanine-rich UTRs can fold into G4, which may play a role in the regulation of translation and the degradation of RNA (70,74,75). Recent studies reported that telomeric repeat-containing RNA (TERRA) folds into a parallel G4 conformation; additionally, the parallel RNA G4 tends to associate and form higher-order structures (76,77). TERRA could play important roles in regulating telomerase and in chromatin remodeling during cell development (78).

A.3.2.5 Telomeres

Telomeres are nucleoprotein complexes located at the end of each chromosome in eukaryotic cells (79). They play key roles in the protection of genetic material, especially against double-strand breaks and the erosion due to cell division, and also protect it from degradation by nucleases and from recombination (80-82). In addition to this role in protecting the genome, telomeres may also have a key role in anchoring to the nuclear membrane, in chromosome segregation during mitosis and meiosis, as well as in suppressing the expression of genes in sub-telomeric regions (83,84). It has been shown that the ends of chromosomes consist of tandem repeats rich in guanines (85). The first telomeric sequence was described in the ciliate Tetrahymena thermophila, in which it consists of repeats of the motif d(GGGGTT) (86,87), while in mammalian cells the repeat unit is d(T2AG3). Human telomeric DNA consists of approximately 2–10 kb of double-stranded sequence and a single-stranded 3' overhang of the G-rich strand, which is approximately 50–100 deoxynucleotides long (88,89). The telomeres of the yeast Saccharomyces cerevisiae are characterized by the repetition of the motif d(TG2-3(TG)1-6) (90). The presence of repetitive guanine-rich motifs forming the overhang at the 3' extremity of telomeres suggests that G4 formation is a common feature shared by most telomeric sequences (91-95). These repeats form a complex with proteins having a wide range of functions. This DNA–protein complex is referred to as the telomere (96-98). When the progressive erosion of telomeres over the generations is not compensated by an elongation mechanism, the proliferative capacity of the cells is limited, which prevents tumorigenesis. Indeed, in 85 % of tumors, the unlimited proliferation of tumor cells depends on the reactivation of a telomere elongation mechanism, which usually occurs through the over-expression of telomerase (99,100). This makes telomerase a privileged target for the treatment of cancer (10). The inhibitory effect in vitro of a G4 formed in the telomerase substrate on telomerase activity has led to the consideration of indirect telomerase inhibition strategies based on altering the structure of the substrate. Stabilizing a human telomeric G4 with a G4 ligand in vitro inhibits the elongation of a telomeric oligonucleotide by telomerase (101-103). These experiments greatly stimulated interest in G4 as therapeutic targets and in the development of small molecules able to stabilize G4 specifically. Antibodies that bind specifically to G4 have also been developed: these antibodies are able to detect G4 formation in the genomic DNA and RNA of human cells (104-106) (Fig. 8A,8B). These observations, in agreement with the experimental data obtained in vitro, represent the most direct evidence for the existence of G4 in vivo. These studies also suggest a possible role of G4 in the protection of telomeres. An alternative structure of the telomere is the T-loop (telomeric loop), arising from the invasion of the telomeric duplex by the 3' end of the single-stranded overhang (107) (Fig. 8C,8D). The T-loop would hide the telomeric end through association with telomeric proteins and could contribute to the protection of telomere function. In vitro, T-loop formation is mediated by the protein TRF2, which is a component of the shelterin complex. TRF2 binds directly to telomeric DNA (108,109).

A.3.2.5.1 G4 structures of human telomeric DNA

The 3' end of human telomeres ends with a single-stranded G-rich DNA overhang (88). Therefore, the DNA of human telomeres can easily fold into a quadruplex structure. However, this single strand is capped by proteins such as hPOT1 (110,111). Moreover, the human telomerase enzyme, formed of the catalytic domain hTERT and the RNA template recognition domain hTR, caps the 3' end in cancer cells (85). Since the demonstration that telomeres could be promising targets for cancer treatment, the structural features of telomeric G4 have attracted great interest. Several high-resolution structures have been resolved by both NMR (Fig. 9) and X-ray crystallography. An anti-parallel G4 structure with a basket-like fold containing one diagonal loop and two lateral loops was obtained by NMR for the truncated human telomeric d(AG3(T2AG3)3) sequence in sodium solution and published in 1993 (15) (Fig. 9: 143D). Recently, a new anti-parallel (2+2) structure resolved in sodium conditions revealed a core arrangement distinct from the known topology (Fig. 9: 2MBJ). Unlike the low polymorphism observed in sodium conditions, a set of polymorphic structures was observed under potassium-rich conditions mimicking the physiological environment. Several structures were obtained from very similar sequences. Two different hybrid (3+1) G4 forms were observed with the sequences d(TAG3(T2AG3)3) and d(TAG3(T2AG3)3T2) in potassium solution. Both structures contain the (3+1) G-tetrad core with one double-chain-reversal and two edgewise loops, but differ in the successive order of loop arrangements within the G4 scaffold (112) (Fig. 9: 2JSK, 2JSL, 2JSQ and 2JSM). The sequences d(A3G3(T2AG3)3A2) and d(T2AG3(T2AG3)3T2) form hybrid structures differing only in the arrangement of loops, strand orientations and capping structures (95,113) (Fig. 9: 2JPZ, 2HY9). Another structure was resolved for the sequence d(T2G3(T2AG3)3A), presenting a (3+1) strand fold topology with two edgewise loops and one double-chain-reversal loop (94) (Fig. 9: 2GKU). The substitution of guanosines with 8-bromoguanosines at appropriate positions gave cleaner NMR spectra and led to the resolution of a new structure, a mixed parallel/anti-parallel G4, under conditions mimicking the physiological environment (100 mM KCl and 3 mM NaCl) (114) (Fig. 9: 2E4I). An anti-parallel structure very different from those already known was identified by NMR in a potassium-rich solution. It contains only two G-quartets and has a diagonal loop of five bases and two lateral loops (115) (Fig. 9: 2KF7, 2KF8). In addition to these folds observed in sodium or potassium conditions, an all-parallel propeller-type G4 structure was obtained under molecular crowding conditions (116) (Fig. 9: 2LD8). More generally, the high structural diversity of the human telomeric motif in both potassium and sodium conditions clearly shows that a particular structure depends not only on external factors but also crucially on its sequence context, especially that of the 5' and 3' ends.
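Because the sequences above differ only by a few flanking nucleotides, it can help to write them out in full when comparing them. The helper below is a small illustrative sketch (the name `expand` is hypothetical) that expands the condensed notation, in which a number repeats the preceding base or parenthesized unit. It assumes the outer d( ) wrapper has been removed and does not handle ranges such as TG2-3.

```python
def expand(shorthand):
    """Expand condensed notation such as 'AG3(T2AG3)3' into a 5'->3' sequence."""

    def parse(i):
        out = []
        while i < len(shorthand) and shorthand[i] != ")":
            if shorthand[i] == "(":
                unit, i = parse(i + 1)
                i += 1  # skip the closing parenthesis
            else:
                unit, i = shorthand[i], i + 1
            # Read an optional repeat count following the base or the group.
            j = i
            while j < len(shorthand) and shorthand[j].isdigit():
                j += 1
            count = int(shorthand[i:j]) if j > i else 1
            out.append(unit * count)
            i = j
        return "".join(out), i

    sequence, _ = parse(0)
    return sequence


for name in ("AG3(T2AG3)3", "TAG3(T2AG3)3", "T2G3(T2AG3)3A"):
    full = expand(name)
    print(f"{name:>15}  {full}  ({len(full)} nt)")
```

Written this way, the 22-nucleotide sequence studied in sodium and the 23- and 24-nucleotide sequences studied in potassium are seen to share the same GGG(TTAGGG)3 core and to differ only at their 5' and 3' ends.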


Figure 8: Telomeric G4 characterization.

(A) Visualization of RNA G4 in the cytoplasm of human cells: indirect immunofluorescence microscopy showing detection of RNA G4 by the G4 structure-specific antibody BG4 in the cytoplasm of HeLa cells. Figure adapted from (105). (B) Visualization of DNA G4 in the nuclei of human cells (U2OS, osteosarcoma) using BG4. Figure adapted from (106). (C) Schematic representation of the T-loop according to (107). (D) Image of a T-loop formed by the telomeric DNA of HeLa cells, isolated by size fractionation following psoralen/UV treatment of nuclei, deproteinization and restriction cleavage. The image, obtained by electron microscopy, is taken from (107).


Figure 9: Human telomeric G4 polymorphism.

Models of different structures adopted by the human telomeric sequence, resolved by NMR. Each model is labeled with its PDB ID at the bottom. 2JSK, 2JSL, 2JSQ and 2JSM: hybrid (3+1) G4 in K+ containing two edgewise loops and one double-chain-reversal loop; two forms are observed from natural and mutated sequences (15BrG) (112). 143D: anti-parallel G4 in Na+; a unique basket-type fold was observed with three stacked tetrads (15). 2KF8, 2KF7: a two-G-tetrad basket-type G4 formed by a natural and a mutated sequence (7BrG) (115). 2GKU: hybrid (3+1) G4 in K+ (94). 2HY9: hybrid G4 containing lateral and double-chain-reversal loops (113). 2JPZ: hybrid G4 (95). 2MBJ: anti-parallel (2+2) G4 in Na+ (117). 2E4I: mixed parallel/anti-parallel G4 under conditions mimicking the physiological environment (100 mM K+, 3 mM Na+), obtained with a mutated sequence containing several BrG substitutions (114). 2LD8: propeller-type all-parallel-stranded G4 under molecular crowding conditions (116). Images generated using UCSF Chimera (13).


A.4 G4 ligands

As previously mentioned, G4 are highly polymorphic and are likely to form in different key genomic regions, mainly in telomeres and gene promoters. Even if the existence of G4 in vivo is not absolutely demonstrated, the biological effects of selective G4 ligands have to be taken into consideration. As mentioned above, telomerase is up-regulated in 85 % of cancer cells, and this perturbation allows cancer cells to become immortal (99,118). Consequently, inhibiting telomerase would be a suitable approach for anti-cancer therapy (10,119-121). Different inhibitors that block the active site of telomerase have emerged (122). An alternative approach would be to target the substrate of telomerase, which is the 3' single-stranded telomeric DNA overhang. To catalyze the elongation of the telomeric sequence by adding d(T2AG3) units, telomerase requires single-stranded DNA. Hence, the development of G4 ligands that stabilize or induce G4 formation from the 3' single-stranded overhang may prevent telomerase from catalyzing the elongation of telomeres, which could be promising for cancer treatment (123). G4 stabilization by ligands often results from π-π stacking and electrostatic interactions (124). The development of G4 ligands has mainly been based on polycyclic planar small compounds with positively charged substituents; these features lead the compound to stack on top of the planar G-tetrads, as reported by several structural studies of G4-ligand complexes (125,126). To be more selective, the interacting aromatic surface of the ligand has to be larger than that of DNA duplex binders, in order to optimize coverage of the G-quartet and to disfavor binding to duplex DNA. Positively charged ligands are able to interact more strongly with the negatively charged phosphate backbone of DNA, which contributes to stabilization.
Hence, to develop a G4 ligand, it is not enough to investigate scaffolds which interact strongly with G4 structures. The main criterion for a molecule to be considered as a G4 ligand candidate is a high selectivity for G4 structures compared with other DNA forms, particularly the duplex. Besides the targeting of the planar G-quartet, other structural features of G4 have to be taken into account, particularly grooves and loops. The channel bearing monovalent cations such as K+, Na+ and NH4+, which stabilize the G4 structure, may also be involved in the interaction. In short, G4 ligands can be classified into different groups according to their modes of interaction: π or end stacking on the terminal G-quartet of the G4, hypothetical intercalation of ligands between G-quartets, interactions with grooves, interactions with loops, or a combination of the last two modes. Among these interaction types, the intercalation of ligands between G-quartets seems unlikely, since the unstacking of two G-quartets and the displacement of a cation coordinated between them is energetically unfavorable (11).


A.4.1 π-π stacking interaction with G-quartets

Most G4 ligands target the invariant part of the structure, the G-quartet, and possess a large aromatic surface favorable to interaction with the large hydrophobic surface constituted by the G-quartet. The binding is mediated by the π orbitals of the aromatic groups belonging to the ligand and to the G-quartet. However, the design of such molecules with large planar surfaces containing several aromatic rings impairs their solubility in aqueous conditions. To improve the solubility of ligands, positive charges may be added to the molecule. As a consequence, the stabilization of G4 is based on the π stacking interaction between the aromatic surface of the ligand and the planar surface of the G-quartet, reinforced by electrostatic interactions between the positive charges of the ligand and the negative charges of the G4. Different strategies have been employed to design such ligands, mainly the quaternization of amine side chains via in situ protonation, aromatic N-methylation, the generation of neutral macrocyclic ligands and the chelation of metal cations such as nickel and manganese (124).

A.4.1.1 In situ protonated ligands

As mentioned above, G4 ligands must combine a large flat aromatic system prone to π stacking on the G-quartet platform with reasonable water solubility. A classical approach is to introduce side chains that are protonated at neutral pH, usually by adding amine groups around an aromatic core. The resulting molecules are water-soluble, with charges distant from the aromatic core. Several G4 ligands were generated following this strategy, such as Braco-19, quindoline, MMQ1, PIPER and L2H2-6M(2)OTD, a telomestatin derivative generated by adding protonated side chains to telomestatin (Fig. 10). Several structures showing the interaction between these ligands and G4 have been resolved by X-ray crystallography or NMR. Braco-19 shows an interaction governed by hydrophobic π stacking between the flat aromatic core of the ligand and two guanine residues of the accessible G-tetrad. Electrostatic interactions between the three protonable side chains of the ligand and the G4 grooves were also highlighted (127). The technique most widely used to assess the affinity of G4 ligands for their targets is the FRET melting assay developed by Jean-Louis Mergny in 2000 (128). In this test, the binding affinity is correlated with the increase in melting temperature (Tm) of a fluorescently labeled G4. Braco-19 confers a stabilization of 27 °C on the human telomeric quadruplex in the FRET melting test. It exhibits a fair selectivity (31-fold) for G4 over duplexes, as demonstrated by SPR, and also displays a strong inhibitory effect on the telomerase enzyme (10,128-130). MMQ1 (meta-quinacridine n-propylamine) is a pentacyclic quinacridine with a crescent shape likely to maximize the overlap with the guanines of the accessible G-quartets. The NMR structure of the complex formed between MMQ1 and the tetramolecular G4 d(TTAGGGT)4 shows ligand binding on the terminal G-quartets of the G4 with a stoichiometry of 2:1. The interaction is mediated by π stacking of the aromatic surfaces and by electrostatic interactions between the side chains of the ligand and the loops of the G4 (Fig. 11) (131). PIPER, a perylene diimide, is also a G4 ligand: NMR structural studies have demonstrated that PIPER can interact with a tetramolecular G4 by π stacking on the 3' terminal G-quartet. Interestingly, it can also be sandwiched between two G-quartets of different G4, inducing dimerization of the d[TTAGGG]4 G4. Moreover, PIPER has a strong inhibitory effect on the telomerase enzyme (132). NMR studies of the interaction between the c-myc G4 and the quindoline compound indicate that quindoline binding increases the melting temperature of the c-myc G4 by more than 15 °C at a stoichiometry of 2:1, as shown by 1D NMR melting. The NMR structure of the complex demonstrates that two quindoline molecules bind the c-myc G4 by π-π stacking, generating a complex in which one quindoline covers the top guanine tetrad and the other covers the bottom guanine tetrad (Fig. 11) (133,134). Recently, Chung et al. solved by NMR the structure of the complex formed between an intramolecular (3+1) human telomeric G4 and a telomestatin derivative (L2H2-6M(2)OTD). The ligand was observed to interact with the G4 by π stacking and electrostatic interactions (Fig. 11) (135).
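The stabilization values quoted for the melting assays are differences in melting temperature (ΔTm) between the G4 with and without ligand. The sketch below illustrates one simple way of estimating such a value, assuming that Tm is taken as the temperature at which the min-max normalized signal reaches 0.5. The curves are synthetic sigmoids generated for the example, not experimental data, and the function names are hypothetical; published protocols may define Tm differently.

```python
import math


def melting_temperature(temps, signal):
    """Temperature at which the normalized signal crosses 0.5 (linear interpolation)."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        raise ValueError("flat signal: no melting transition")
    norm = [(s - lo) / (hi - lo) for s in signal]
    for i in range(1, len(norm)):
        a, b = norm[i - 1], norm[i]
        if a != b and (a - 0.5) * (b - 0.5) <= 0:
            frac = (0.5 - a) / (b - a)
            return temps[i - 1] + frac * (temps[i] - temps[i - 1])
    raise ValueError("signal never crosses half of its range")


def synthetic_curve(temps, tm, width=2.0):
    """Idealized two-state unfolding curve, used here only as test input."""
    return [1.0 / (1.0 + math.exp(-(t - tm) / width)) for t in temps]


temps = list(range(20, 96))                   # 20 to 95 degrees C, 1 degree steps
control = synthetic_curve(temps, tm=55.0)     # labeled G4 alone (made-up Tm)
with_ligand = synthetic_curve(temps, tm=70.0) # G4 plus ligand (made-up Tm)

delta_tm = melting_temperature(temps, with_ligand) - melting_temperature(temps, control)
print(f"delta Tm = {delta_tm:.1f} degrees C")
```

Real melting curves are noisy and may not reach clean plateaus, so in practice the baselines and the transition are usually fitted rather than read from a single crossing point.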

Figure 10: Chemical structures of G4 ligands generated by in situ protonation strategy.

BRACO-19 is a 3,6,9-trisubstituted acridine. PIPER is an N,N'-disubstituted perylene diimide. MMQ1 is a meta-quinacridine n-propylamine. L2H2-6M(2)OTD is a telomestatin derivative.


Figure 11: Models of G4-in situ protonated ligand complexes.

Each model is identified by its PDB ID at the bottom. 2JWQ: structure of the complex formed between d(TTAGGGT)4 and the MMQ1 ligand (131). 2L7V: structure of the complex formed between the c-myc promoter sequence TGAG3TG3TAG3TG3TA2 and the Quindoline compound (133). 2MB3: structure of the complex formed between the human telomeric sequence T2G3T2AG3T2AG3T2AG3A and the telomestatin derivative L2H2-6M(2)OTD (135). Images generated using UCSF Chimera (13).

A.4.1.2 Aromatic N-methylated ligands

Another way to introduce positive charges is to N-methylate the aromatic part of the ligand. This strategy confers good water solubility to the molecule without the need to add cationic side chains, and increases the π stacking ability of the ligand by reducing the electron density of the aromatic part. The best-known G4 ligands developed by this approach are TMPyP4, 360A, Phen-DC3 and RHPS4 (Fig. 12).

TMPyP4 is a tetra-cationic porphyrin that inhibits telomerase and possesses a high affinity for G4 DNA, with a stabilization of 17°C as shown by FRET melting. It is also able to regulate the expression of oncogenes such as c-myc and K-ras (136,137). Different structural studies were performed to describe the interaction mode of TMPyP4 with G4. The NMR structure of TMPyP4 complexed with the c-myc G4 showed stacking of the porphyrin onto the external G-quartet (Fig. 13) (33). Surprisingly, the X-ray structure of the complex formed between TMPyP4 and a telomeric G4 showed external stacking onto TTA loops without any direct contact with G-quartets (138). The pentacyclic acridine RHPS4 was also identified as a G4 ligand; this molecule interacts with the parallel telomeric G4 d(TTAGGGT)4 by π stacking with a stoichiometry of 2:1 (Fig. 13) (126).

Besides these ligands, an important family of compounds based on the PDC (bisquinolinium pyridodicarboxamide) scaffold was also developed (139,140). The compound 360A is the best known; it confers a stabilization of 23°C to the intermolecular telomeric G4. It also has a very high affinity for the telomeric G4 d(AGGG(T2AG3)3), and its affinity for quadruplex DNA is 20 times higher than for duplex DNA (141,142). To improve the coverage of the G-quartet by PDC compounds, the central pyridine of PDC was replaced by a phenanthroline, generating the PhenDC family of compounds, whose selectivity is higher than that of the PDC ligands (143). Recently, Chung et al. solved by NMR the structure of the complex formed between Phen-DC3 and an intramolecular G4 derived from the c-myc promoter. The structural study revealed that Phen-DC3 interacts with the quadruplex through extensive π stacking with the guanine bases of the top G-tetrad (Fig. 13) (144).

Figure 12: Schematic presentation of the chemical structures of G4 ligands generated by N-methylation strategy.

Figure 13: Models of G4-N-methylated ligand complexes.

Each model is identified by its PDB ID at the bottom. 1NZM: structure of the complex formed between d(TTAGGGT)4 and the RHPS4 ligand (126). 2A5R: structure of the complex formed between the c-myc promoter sequence TGAG3TG2IGAG3TG4A2G2 and the TMPyP4 compound (33). 2MGN: structure of the complex formed between the c-myc promoter sequence TGAG3TG2TGAG3TG3GA2G2 and the Phen-DC3 compound (144). Images generated using UCSF Chimera (13).


A.4.1.3 Neutral and negatively-charged macrocyclic ligands

Surprisingly, certain ligands possess a good affinity for G4 although they are not positively charged. The interaction of these compounds is thought to be driven by coverage of the G-quartet by the ligand surface and by π stacking interactions. The most famous ligand of this limited group is telomestatin, a natural compound extracted from Streptomyces annulatus (145). Telomestatin stabilizes the human telomeric G4, shows a high selectivity for G4 structures (146) and strongly inhibits telomerase activity. Other analogues have also been developed, such as HXDV, which confers a stabilization close to 20°C to G4 structures (147). Even more surprising is the observation that negatively charged compounds may have a good affinity for G-quadruplexes. N-methyl mesoporphyrin IX (NMM) has a strong affinity for some quadruplex conformations despite being negatively charged (Fig. 14).

Figure 14: Schematic presentation of the chemical structures of G4 neutral and negatively-charged macrocyclic ligands.

Telomestatin, HXDV and NMM as representatives.

A.4.1.4 Metallo-organic ligands

The design of metallo-organic molecules has also been proposed for the development of G4 ligands. This approach is based on the insertion of a central metal cation which can align with the ionic channel of the G-quadruplex. The cationic nature of these molecules improves the association of the ligand with the negatively charged G4 DNA, and also allows the π stacking interaction between the two partners to be optimized (148). For instance, Ni-salphen showed an important affinity for telomeric G4, conferring a stabilization higher than 30°C (149), and Mn-porphyrin displays a very high affinity for the telomeric G4 (Fig. 15) (150).


Figure 15: Schematic presentation of the chemical structures of some G4 metallo-organic ligands.

A.4.2 Ligands of grooves and loops

In comparison with duplex DNA, G4 DNA is characterized by the presence of 4 grooves and a large variety of loops. Ligands that recognize grooves and loops can therefore be highly selective for G4 DNA. Distamycin A (Fig. 16A) is the best-known ligand of this family. NMR and thermodynamic studies demonstrate that distamycin A is able to interact with the grooves of the intermolecular G4 d[TGGGGT]4 with a stoichiometry of 4:1. Its crescent shape facilitates insertion into the grooves through the 4 hydrogen bonds established with guanines (Fig. 16B) (151,152). Recently, Gai et al. have shown the ability of the DMSB cyanine dye (2,2'-diethyl-9-methyl selenacarbocyanine bromide) to interact with the parallel-stranded G4 d[TGGGGT]4 as a monomer and as a dimer at different sites. The dimer binding is a dual-site simultaneous binding mode, which differs from the usual modes (such as end-stacking or pure groove-embedding). This kind of binding mode involves the recognition of two unique structural features of G4: a terminal G-tetrad and a groove (153).

Figure 16: Model of G4-ligand complex showing groove interaction mode.

A) Schematic presentation of the chemical structure of Distamycin A. B) PDB model 2JT7: structure of the complex formed between d(TGGGGT)4 and Distamycin A. Image generated using UCSF Chimera (13).


A.5 Goals of the doctoral research The goal of my thesis studies is to characterize DNA G4 by NMR spectroscopy at different levels. Three different topics are investigated in this manuscript:

1) Binding studies of 2,4,6-triarylpyridine family ligand with Human Telomeric Repeat G4;

2) Characterization of a tetra-molecular G4 model d[TG4T]4 and its interaction with specific G4 ligands inside cells

3) Investigation of a new DNA G4 scaffold formed in the human K-ras gene promoter

A.5.1 Binding studies of 2,4,6-triarylpyridine family ligand with Human Telomeric Repeat G4

Previous studies in our laboratory demonstrated that a ligand of the 2,4,6-triarylpyridine (TAP) family shows a high specificity and selectivity for G4 structures in different buffer conditions (Na+ and K+). It also confers a high stability to the human telomeric G4, as assessed by FRET melting experiments (154). The studies detailed in this manuscript aim to characterize the interaction between the TAP ligand and the human telomeric G4 formed by the sequence d(AG3(T2AG3)3) in Na+ conditions. The interaction between the two partners is investigated using CD and 1D NMR spectroscopy. 2D NMR has been used to assign unambiguously all 1H resonances in the complex and to explore the binding site. A model describing the interaction mode between the two partners was generated. This model could be used as a tool to improve G4 ligands, with the aim of developing molecules with higher affinity that better stabilize G4.

A.5.2 Characterization of a tetra-molecular G4 model d[TG4T]4 and its interaction with specific G4 ligands inside cells

Recent advances have enabled the application of NMR to study biomolecules inside living cells. The second project of this manuscript aims to probe, by NMR spectroscopy, the structural properties of a tetra-molecular G4 model, d[TG4T]4, inside living eukaryotic cells (Xenopus laevis and HeLa cells). 1H-15N SOFAST HMQC spectra were recorded inside Xenopus laevis and compared to those observed in vitro. Cell lysates were used to probe the structural properties in HeLa cells because no signal was observed inside intact cells. Furthermore, an approach to optimize in-cell NMR in human cells is also analyzed. The second objective of this part is to probe the interaction of d[TG4T]4 with three G4-specific ligands presenting different modes of interaction: 360A, distamycin A and the DMSB cyanine compound. For each ligand, the pattern of imino peaks was compared with the one observed in vitro. The in-cell NMR strategy for human cells can be extended to other G4 structures, and more biologically relevant G4 structures can also be investigated to probe their interaction with G4 ligands.

A.5.3 NMR study of KRAS G4

Previous studies showed that the human K-ras promoter contains a motif that is able to form a G4 structure (155). In this manuscript, different sequences were screened by NMR to select good candidates for high resolution structure determination. Among them, two sequences (kras-21R and kras-22RT) were used to characterize the G4 formed in the human K-ras promoter. In collaboration with Dr. Liliya Yatsunyk, a Visiting Professor from Swarthmore College, USA, these two sequences were characterized by CD spectroscopy. The stabilization of the G4 structures formed by these sequences upon interaction with three different ligands (360A, PhenDC3 and Braco19) was also investigated. The NMR study of the two sequences showed that kras-22RT presents a spectral profile suitable for determining the G4 topology. A 1D titration of kras-22RT with Braco19 was also performed and revealed an interesting behavior of the K-ras G4, with the formation of intermediate species upon addition of Braco19.


Part 1 Interaction of 2,4,6-triarylpyridine ligand with the Human Telomeric Repeat G-quadruplex motif

B. Interaction of 2,4,6-triarylpyridine family ligand with the Human Telomeric Repeat G-quadruplex motif

B.1 Introduction

As described in the previous sections, G4s are non-canonical nucleic acid structures formed by guanine-rich sequences in the presence of monovalent cations such as K+ and Na+. In the past few years, computational and experimental studies on the human genome have highlighted a large number of sequences capable of forming G4 structures. Among those, the human telomeric repeat is probably the most studied one (over 1970 hits in Web of Science). In this work we have used a human telomeric sequence, 22AG (Fig. 17B), capable of forming an antiparallel G4 in the presence of Na+ cations (Fig. 17A) (15).

Human telomeres consist of repeats of the T2AG3 sequence at each end of chromosomes (156). During cell division, the enzyme telomerase compensates for the normal shortening of telomeres by adding T2AG3 units (157,158). The high activity of telomerase confers a quasi-infinite proliferation capacity to cancer cells in approximately 85 % of human tumors (99,159,160). Targeting the human telomeric sequence with G4 ligands, in order to inhibit its interaction with telomerase or to interfere with telomeric functions (161,162), has been described as a logical strategy to develop anti-cancer drugs (12,101,163-165), and is also a main objective of this project.

The diversity of G4 structures and ligands described in the literature over the past years has unveiled different binding modes and affinities between G4 structures and the ligands tested (127,166). The development of G4 ligands has mainly been based on small polycyclic planar compounds with positively charged substituents. These features lead the compound to stack onto planar G-tetrads, as reported by several structural studies of G4-ligand complexes (125,126). Briefly, ligands can be grouped into broad families based on their interaction mode. End-stacking ligands comprise the largest collection of G4 ligands tested so far, notably telomestatin (135,145), the bisquinolinium compound Phen-DC3 (144), porphyrins such as NMM (N-methylmesoporphyrin IX) (167) and TMPyP4 (33), and quinazoline derivatives (168). Hypothetical G4-specific intercalating ligands are scarce, since they must insert between G-tetrads, which requires a high energy cost to unstack or strongly bend the two tetrads and displace the cation (11). Interaction with grooves and/or loops of G4 has been reported for Distamycin A (151,152,169) and carbocyanine compounds (153,170,171). Probing the differences between these binding modes can be important for rational drug design (172). To date, only a few structures of G4-ligand complexes have been obtained, mainly from NMR (33,135,144) and crystallography (127,138,166). For that reason, we investigated the interaction of the G4 formed by the human telomeric sequence with a new G4 compound belonging to the 2,4,6-triarylpyridine (TAP) family (Fig. 17C), which demonstrates interesting structural-stabilization properties and a good affinity for telomeric G4 (154,173,174).

Figure 17: Human telomeric repeat motif 22AG and TAP ligand.

A) The folding topology adopted by 22AG in Na+ solution. B) 22AG is a 22-mer found in the human telomeric sequence that adopts a unique fold in Na+ solution. C) Schematic structure of the 2,4,6-triarylpyridine compound with the numbering used in this manuscript.

Here, we characterize the interaction of TAP with the 22AG human telomeric G4 in sodium conditions by biophysical techniques, mainly CD and NMR. The 2,4,6-triarylpyridine family ligands were synthesized in our group by Nicole Smith (154), and FRET tests with different derivatives identified this compound as the one giving the best ∆Tm: it stabilizes the human telomeric G4 with a ∆Tm of 31.6 ± 1.8°C. The stabilization of 22AG upon binding of TAP was first confirmed by CD melting. Next, we used 1H NMR to titrate TAP into 22AG and map the chemical shift perturbations in order to localize possible hot-spot regions that could indicate the binding site(s). The titration was also useful to determine the optimal conditions for preparing the NMR samples for high resolution 2D NMR experiments, in order to determine the structure of the oligonucleotide alone and in complex with TAP. NOE-based distance restraints were applied in simulated annealing calculations to obtain the final structure of the complex. Finally, molecular dynamics simulation was used to optimize the 22AG-TAP complex. We believe that our results are pertinent and could be used to improve the development of a new generation of G4 ligands, more specifically against the human telomeric repeat and diseases associated with dysfunctions of this motif.

B.2 Materials and Methods

B.2.1 Biophysical characterization of G4

B.2.1.1 Synthesis and purification The oligonucleotides used in this work were purchased from both Eurogentec (Belgium) and Integrated DNA Technologies (USA). They were synthesized at 200 nmol and 1 µmol scales, then purified by reverse phase HPLC. The sequences were supplied lyophilized. After solubilization in distilled water, oligonucleotides were desalted by gel filtration using G 25 sephadex illustra NAP-5 columns purchased from GE Healthcare (United Kingdom). This last step was important for the elimination of small but important impurities such as acetate, ethanol, ammonium and other small molecules adsorbed to the oligonucleotide and observed by NMR.

The oligonucleotide concentration was determined by absorption spectroscopy, performed on an Uvikon XS spectrometer using quartz cuvettes (Hellma, France) with a path length l = 10 mm, based on the absorbance value at 260 nm and the molar extinction coefficient published previously (167,175) or provided by the manufacturer. The Nanodrop ND-1000 was also employed to determine the concentrations of the oligonucleotides quickly and economically during the enzymatic synthesis. The Beer-Lambert law was applied to calculate the concentrations:

$$A(\lambda) = \varepsilon(\lambda) \cdot l \cdot C$$

where $A(\lambda)$ is the absorbance at the wavelength $\lambda$, $\varepsilon(\lambda)$ is the molar extinction coefficient of the absorbing molecule (L.mol-1.cm-1), $l$ is the length of the optical path through the solution (cm) and $C$ is the molar concentration of the absorbing molecule (mol.L-1). The oligonucleotide purity was assessed using 1D 1H NMR.
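As an illustration of the Beer-Lambert calculation, the following minimal Python sketch converts an absorbance reading at 260 nm into a molar concentration. The function name, the absorbance reading and the extinction coefficient are illustrative assumptions and are not values taken from this work; only the 10 mm (1 cm) path length comes from the text.

```python
def concentration_from_absorbance(absorbance, epsilon, path_length_cm=1.0):
    """Return the molar concentration (mol/L) from A = epsilon * l * C.

    epsilon is in L.mol-1.cm-1 and path_length_cm in cm.
    """
    if epsilon <= 0 or path_length_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return absorbance / (epsilon * path_length_cm)


# Assumed, illustrative numbers (not measured values from this thesis)
EPSILON_260 = 228_500.0   # L.mol-1.cm-1, placeholder extinction coefficient
A260 = 0.457              # hypothetical absorbance reading in a 1 cm cuvette

conc = concentration_from_absorbance(A260, EPSILON_260, path_length_cm=1.0)
print(f"Concentration: {conc * 1e6:.2f} uM")
```

If the sample was diluted before the measurement, the result has to be multiplied by the dilution factor to recover the stock concentration.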


Afterwards, the stock solutions of oligonucleotides were stored at -20°C in bidistilled water to avoid the introduction of unwanted ions, which can influence the stability and the formation kinetics of certain structures.

The human telomeric sequence 22AG (AG3(T2AG3)3) was prepared in sodium or potassium NMR buffer (20 mM Na2HPO4/NaH2PO4, 70 mM NaCl, pH 6.5, 10% D2O or 20 mM K2HPO4/KH2PO4, 70 mM KCl, pH 6.5, 10% D2O). After dissolution in NMR buffer, the oligonucleotides were annealed several times by heating for 5 min at 95 °C and chilling on ice. After the last annealing cycle, they were kept at 4°C for 24 h before use.

B.2.1.2 Ligand preparation

The 2,4,6-triarylpyridine (TAP) was synthesized in the Mergny group (U869, ARNA laboratory, Bordeaux) by Nicole Smith, a former postdoc in the laboratory (154). It was also purchased from Greenpharma (France). Contrary to the one synthesized by Nicole Smith, the purchased molecule is not positively charged on the pyrrolidine moiety and is not in the salt form. The purity of the compound was confirmed by 1D 1H NMR. Stock solutions of the compound were prepared at concentrations ranging from 2 to 50 mM according to the requirements of each NMR experiment, e.g., 1D titration, WaterLOGSY, NOESY, etc.

B.2.1.3 CD and CD melting

Circular dichroism (CD) refers to the differential absorption of left and right circularly polarized light. It provides important information about the conformation of DNA structures. The CD signatures are influenced by π-π stacking between guanines of stacked G-quartets (176). Depending on the glycosidic bond angles (GBAs) of two stacked G-quartets, the CD spectrum has a specific shape: Syn/Syn or Anti/Anti steps give rise to a positive peak at 260 nm, while Syn/Anti or Anti/Syn steps give rise to a positive peak at 295 nm. All parallel G4s have a CD signature with a positive band at 260 nm and a negative band at 240 nm. However, anti-parallel G4s show positive bands at 240 and 295 nm in addition to a negative band at 260 nm. A mix of parallel/anti-parallel conformations can also be observed, as shown for the telomeric sequence. CD melting studies can also be used to assess G4 stabilization by ligands, provided that the ligand is not degraded by increasing temperature and does not itself absorb at the monitored wavelength. The difference between melting curves in the absence and in the presence of ligand indicates the stabilization potential of the ligand (∆Tm values) (177,178).

CD experiments were performed on a JASCO J-815 spectrometer using Spectra Manager software. The sample, containing 3-5 µM of oligonucleotide, was dissolved in a volume of 500 µl of G4-folding buffer and transferred into a 10 mm path-length quartz cuvette. The ligand was added from the stock solutions (see Ligand preparation). The CD spectra were measured between 220 and 320 nm. A scan speed of 100 nm/min and a response time of 2 s were used for each experiment, with five scans collected and averaged. The melting experiments were conducted by monitoring the ellipticity at 295 nm while increasing the temperature from 20 to 95°C at a heating rate of 1°C/min.
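The ∆Tm read from two melting curves can be sketched as follows. This is a minimal Python illustration, not the procedure used in this work: it assumes that Tm is taken as the half-transition point of the normalized ellipticity at 295 nm, that the extreme values of the curve serve as folded and unfolded baselines, and that the ellipticity values below are hypothetical.

```python
def melting_temperature(temps_c, signal):
    """Estimate Tm (°C) as the half-transition point of a melting curve.

    Assumption: the lowest and highest signal values are used as the
    unfolded and folded baselines (no sloping-baseline correction).
    """
    lo, hi = min(signal), max(signal)
    if hi == lo:
        raise ValueError("flat curve: no transition to analyse")
    frac = [(s - lo) / (hi - lo) for s in signal]  # normalised to 0..1
    points = list(zip(temps_c, frac))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 != f1 and (f0 - 0.5) * (f1 - 0.5) <= 0:
            # linear interpolation between the two bracketing points
            return t0 + (0.5 - f0) * (t1 - t0) / (f1 - f0)
    raise ValueError("signal never crosses the half-transition level")


# Hypothetical ellipticity values at 295 nm (arbitrary units), illustration only
temps = [20, 30, 40, 50, 60, 70, 80, 90]
dna_alone = [10, 10, 9, 6, 2, 0, 0, 0]
dna_with_ligand = [10, 10, 10, 10, 9, 6, 2, 0]

delta_tm = melting_temperature(temps, dna_with_ligand) - melting_temperature(temps, dna_alone)
print(f"Delta Tm = {delta_tm:.1f} °C")
```

Real melting curves are sampled much more densely (1°C/min here), and a baseline-corrected or derivative-based Tm may be preferable when the transitions are broad.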

B.2.2 NMR spectroscopy

NMR spectra were recorded on Bruker Avance 700 and 800 MHz instruments equipped with cryogenically cooled probes; experiments were performed at 20 °C. For solution NMR, standard 3 or 5 mm NMR tubes were used. The samples were prepared in potassium or sodium NMR buffer depending on the G4 folding conditions: 22AG was dissolved in sodium NMR buffer (20 mM Na2HPO4/NaH2PO4, 70 mM NaCl, pH 6.5, 10 % D2O) or in potassium NMR buffer (20 mM K2HPO4/KH2PO4, 70 mM KCl, pH 6.5, 10 % D2O). The concentrations prepared for each sequence ranged between 100 µM and 3 mM depending on the experiment requirements. For the study of the interaction with the ligand, the 1H 1D NMR titration of 22AG with the TAP ligand was performed by adding aliquots of TAP directly from a 50 mM stock solution (dissolved in DMSO) to the NMR sample, with molar ratios of ligand to DNA up to 5 molar equivalents. Most of the 1H 1D spectra were recorded using the excitation sculpting sequence (179), based on a double pulsed field gradient spin-echo, which selectively removes the water resonance without affecting other resonances, including those in fast exchange with water.

The 1H-1H NOESY spectra were recorded in 90% H2O/10% D2O with mixing times of 50, 200 and 300 ms, both in the absence and in the presence of the ligand. The pulse program used for the 1H-1H NOESY experiments was noesyesgpphzs (180). The dipsi2esgpph pulse program (181) was used for the 1H-1H TOCSY experiment, which was recorded with a mixing time of 120 ms, both in the absence and in the presence of the ligand. The 1H-1H COSY experiment was recorded using the cosydfesgpph pulse program (182). All the pulse programs employed use excitation sculpting with gradients for water suppression (183). Typically, 2048 × 512 complex points were acquired in the F2 and F1 dimensions, respectively, in the 2D homonuclear experiments. 128 scans were accumulated for each 1H 1D titration experiment and for each 1H-1H 2D NOESY, TOCSY or COSY experiment. The homonuclear 2D experiments were recorded using the States-TPPI mode for quadrature detection. A set of three natural abundance heteronuclear 1H-13C HSQC spectra was also recorded for three different chemical shift regions using the hsqcphpr pulse program. The coupling constants used were 200, 165 and 145 Hz for the aromatic, sugar and aliphatic regions, respectively. 256 scans were accumulated for each HSQC experiment. The NMR data were processed using Topspin versions 2.1 and 3.2 and analyzed in both Sparky (T. D. Goddard and D. G. Kneller, SPARKY 3, University of California, San Francisco) and CCPNMR (184).
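For the 1D titration described above, the volume of ligand stock to add for a given number of molar equivalents follows from simple arithmetic. The Python sketch below illustrates it; the 50 mM stock concentration comes from the text, whereas the DNA concentration (0.5 mM), the sample volume (500 µL) and the function names are assumptions chosen for the example.

```python
def cumulative_ligand_volume_ul(equivalents, dna_conc_mM, sample_volume_ul,
                                stock_conc_mM=50.0):
    """Total stock volume (uL) needed to reach a ligand:DNA molar ratio.

    mM * uL gives nmol, and nmol / mM gives uL.
    """
    dna_nmol = dna_conc_mM * sample_volume_ul
    return equivalents * dna_nmol / stock_conc_mM


def diluted_dna_conc_mM(dna_conc_mM, sample_volume_ul, added_volume_ul):
    """DNA concentration after the added ligand volume dilutes the sample."""
    return dna_conc_mM * sample_volume_ul / (sample_volume_ul + added_volume_ul)


# Assumed sample: 0.5 mM DNA in 500 uL (illustrative values only)
DNA_MM, VOLUME_UL = 0.5, 500.0

previous = 0.0
for eq in (0.5, 1, 2, 3, 4, 5):
    total = cumulative_ligand_volume_ul(eq, DNA_MM, VOLUME_UL)
    print(f"{eq} eq: add {total - previous:.2f} uL (cumulative {total:.2f} uL)")
    previous = total

print(f"DNA after 5 eq: {diluted_dna_conc_mM(DNA_MM, VOLUME_UL, previous):.3f} mM")
```

With these assumed numbers, the cumulative addition at 5 equivalents corresponds to about 5 % of the sample volume, which is worth keeping in mind when interpreting chemical shift changes, since DMSO itself can perturb the spectrum.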

B.2.3 Assignment and structure calculation

The cross-peak assignments and volume integrations were carried out using the CCPNMR 2.2 and Sparky software, respectively. The assignment protocol was applied as detailed previously (185). NOESY, TOCSY and DQF-COSY experiments were used to assign all the protons, namely imino, aromatic, sugar and methyl protons. Heteronuclear 1H-13C/15N experiments were used to assign protonated carbons and nitrogens. NOE buildup curves for NOESY spectra at mixing times of 50, 200 and 300 ms were generated using as reference the averaged value of the six intensities of the NOEs observed between the H6 and methyl protons of thymines, a distance which was fixed to 2.99 Å. The NOE restraints were classified as follows: strong (1.8-3 Å), medium (2.5-4.5 Å) and weak (4.0-6.5 Å). Hydrogen-bond restraints, defined as 1.9-2.1 Å and 2.9-3.1 Å for the H1-O6 and H21-N7 bonds respectively, were applied only to the guanine bases. The anti- and syn-conformations were defined by the glycosidic torsion angles χ determined from the intra-nucleotide H1'-H8 distances. They were restrained to the ranges -130 ± 40° and 60 ± 40°, respectively. The planarity restraints applied to the guanine bases of the tetrads were set to 20 kcal/mol/Å2. The distance restraints were generated manually from the NOESY experiments in D2O at a mixing time of 200 ms for non-exchangeable protons, and from the NOESY experiment in H2O at the same mixing time for exchangeable protons. The integration of NOE volumes, the calibration of distances with a relaxation matrix spin diffusion correction, and the setting of lower and upper bounds were done with ARIA2.3 (186). The automated NOE assignment procedure of ARIA was avoided to limit errors leading to badly calibrated distances. Instead, a manual NOE assignment procedure was employed, in which peaks were inspected individually.

The structure calculation was performed as described previously (135,187). The structures of 22AG in the presence of the ligand were calculated in three steps. First, 8 iterations of calculation were carried out, in which 200 initial structures were calculated using ARIA2.3/CNS1.2 (188,189) with a mixed Cartesian and torsion angle dynamics simulated annealing protocol composed of four stages: a) an initial high-temperature torsion angle simulated annealing stage, 50000 steps at 10000 K with 27 fs per step; b) a torsion angle dynamics cooling stage, 10000 steps from 10000 to 2000 K; c) a Cartesian dynamics cooling stage, 10000 steps from 2000 to 1000 K; and finally d) another Cartesian dynamics cooling stage of 20000 steps from 1000 to 50 K with 3 fs per step. For all bond, angle and improper dihedral energy terms of the force field, the standard CNS dna-rna-allatom topology and parameter files were used with uniform energy constants. For distance and hydrogen-bond restraints, an energy constant of 10 kcal mol-1 Å-2 was applied during the initial stage of the dynamics and increased to 50 kcal mol-1 Å-2 for the remaining stages. For dihedral restraints, the energy constants applied were 5, 25 and 200 kcal mol-1 Å-2 for phases a, b, and (c and d), respectively, while an energy constant of 25 kcal mol-1 Å-2 was applied for planarity restraints. Distance restraints, together with G-tetrad hydrogen-bonding distance restraints, glycosidic angle restraints and planarity restraints, were used in this calculation step. At this stage, the intermolecular NOE restraints of the complex and the intramolecular NOE restraints of the compound were not included.
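The conversion of NOE intensities into distance classes described in this section can be illustrated with the isolated spin-pair approximation, in which the NOE intensity scales as r^-6. The Python sketch below is a simplified illustration and not the ARIA2.3 procedure (which also applies a relaxation matrix spin diffusion correction). The 2.99 Å reference distance and the three distance ranges come from the text; the function names, the intensity values and the rule used to pick a class (the calibrated distance is compared with the upper bound of each class, since the published ranges overlap) are assumptions.

```python
R_REF = 2.99  # reference distance in Å, as fixed in the text

# Lower and upper bounds (Å) of the three restraint classes given in the text
BOUNDS = {"strong": (1.8, 3.0), "medium": (2.5, 4.5), "weak": (4.0, 6.5)}


def noe_distance(intensity, ref_intensity, r_ref=R_REF):
    """Calibrated distance from r = r_ref * (I_ref / I)^(1/6)."""
    if intensity <= 0 or ref_intensity <= 0:
        raise ValueError("intensities must be positive")
    return r_ref * (ref_intensity / intensity) ** (1.0 / 6.0)


def classify(distance):
    """Assign a restraint class from the calibrated distance.

    Assumption: the class is chosen from the upper bounds only.
    """
    if distance <= BOUNDS["strong"][1]:
        return "strong"
    if distance <= BOUNDS["medium"][1]:
        return "medium"
    return "weak"


# Hypothetical peak volumes (arbitrary units), illustration only
reference_volume = 6400.0
for volume in (6400.0, 800.0, 100.0):
    r = noe_distance(volume, reference_volume)
    low, high = BOUNDS[classify(r)]
    print(f"volume {volume:7.1f} -> r = {r:.2f} Å -> {classify(r)} ({low}-{high} Å)")
```

Because of the sixth-root dependence, a 64-fold drop in intensity only doubles the calibrated distance, which is why broad distance classes rather than exact distances are used as restraints.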

The second step consisted of the NMR refinement of the 20 best structures in explicit solvent, performed using the SANDER module of Amber 12 (University of California, San Francisco, http://ambermd.org). The calculation was performed with the AMBER force field FF12SB, which contains the AMBER force field for nucleic acids and the Barcelona changes (190). Two Na+ ions were included between the G-tetrads and 21 others were added to counter the negative charge of the DNA. The 20 structures were solvated in a truncated octahedral box of TIP3P water molecules (191). First, the structures were energy minimized in 2 stages using harmonic position restraints of 25 kcal/mol/Å2. In the first stage, 1000 steps of minimization were performed while holding the solute fixed and minimizing only the water box and ions, with 500 steps of steepest descent followed by 500 steps of conjugate gradient minimization. In the second stage, 2500 steps were performed for the entire system, with 1000 steps of steepest descent followed by 1500 steps of conjugate gradient minimization. Afterwards, 20 ps of simulated annealing were performed. The system was heated from 0 to 300 K over 5 ps at constant volume while maintaining the position restraints at 25 kcal/mol/Å2. Then, a cooling step to 100 K over 13 ps was performed, during which the time constant for heat bath coupling was varied in the range of 0.05 to 0.5 ps. A final cooling stage was performed with much tighter coupling (0.1 to 0.05 ps) to bring the system to 0 K. The weight of the distance restraints was increased gradually during this simulated annealing, going from 0.1 to 1 in the first 3 ps, and kept at 1 for the rest of the annealing procedure.
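The annealing schedule of this refinement step can be summarized as two simple functions of time. The Python sketch below is only an illustration under stated assumptions: all durations are read as picoseconds (5 ps of heating and 13 ps of cooling, leaving 2 ps of the 20 ps for the final cooling to 0 K), and the temperature and the restraint weight are assumed to vary linearly within each segment. It does not reproduce the actual SANDER input used in this work.

```python
TOTAL_PS = 20.0  # total length of the simulated annealing, from the text


def target_temperature(t_ps):
    """Target temperature (K) at time t_ps, assuming linear ramps."""
    if t_ps < 0 or t_ps > TOTAL_PS:
        raise ValueError("time outside the 0-20 ps annealing window")
    if t_ps <= 5.0:                      # heating: 0 -> 300 K in 5 ps
        return 300.0 * t_ps / 5.0
    if t_ps <= 18.0:                     # cooling: 300 -> 100 K in 13 ps
        return 300.0 - 200.0 * (t_ps - 5.0) / 13.0
    return 100.0 - 100.0 * (t_ps - 18.0) / 2.0   # final cooling: 100 -> 0 K


def restraint_weight(t_ps):
    """Relative weight of the distance restraints at time t_ps."""
    if t_ps < 0 or t_ps > TOTAL_PS:
        raise ValueError("time outside the 0-20 ps annealing window")
    if t_ps >= 3.0:                      # full weight after the first 3 ps
        return 1.0
    return 0.1 + 0.9 * t_ps / 3.0        # assumed linear ramp 0.1 -> 1


for t in (0.0, 2.5, 5.0, 18.0, 20.0):
    print(f"t = {t:4.1f} ps  T = {target_temperature(t):5.1f} K  "
          f"weight = {restraint_weight(t):.2f}")
```

Writing the schedule out this way makes it easy to check that the three temperature segments add up to the stated 20 ps.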


The third step consisted of an unrestrained docking of the ligand onto the averaged structure of the 10 best generated G4 structures. The TAP ligand was parameterized using the Antechamber module of the AMBER package, which is based on the general AMBER force field (GAFF) (192), and the HIC-Up server (193): the partial charges of the individual atoms were obtained, and the bond, angle and dihedral parameters were determined. The docking was performed using AutoDock Vina through UCSF Chimera.

B.3 Results and discussions

B.3.1 Characterization of 22AG G-quadruplex formation by CD and NMR spectroscopy

As reported in section B.2.1.3, the CD signature depends on the glycosidic bond angles (GBAs) of two stacked G-quartets: Syn/Syn or Anti/Anti steps give rise to a positive peak at 260 nm, while Syn/Anti or Anti/Syn steps give rise to a positive peak at 295 nm. All parallel G4s have a CD signature with a positive band at 260 nm and a negative band at 240 nm. However, anti-parallel G4 structures show positive bands at 240 and 295 nm in addition to a negative band at 260 nm. In sodium solution, 22AG displays a positive peak around 295 nm, a negative one at 260 nm and another positive one around 245 nm, which are characteristic of an anti-parallel conformation. In potassium solution, however, 22AG presents a strong maximum at 290 nm, a positive shoulder at 260-270 nm and a small negative band at 235 nm, suggesting the contribution of mixed parallel and anti-parallel structures (2+2 and 3+1 folds), as previously observed for very similar sequences. In our case, the CD spectrum of 22AG annealed in 90 mM sodium in the absence of the TAP compound is shown in Fig. 18; we observe a profile corresponding to an anti-parallel G4. The profile observed for 22AG annealed in 90 mM potassium agrees well with the related data in the literature (93,112,194-197).
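The band positions quoted above for parallel, anti-parallel and mixed topologies can be turned into a simple rule of thumb. The Python sketch below is a deliberately crude illustration based only on the signs of the ellipticity near 240, 260 and 295 nm; the function name, the decision rules and the example values are assumptions, and such a classification cannot replace inspection of the full spectrum or a structural study.

```python
def classify_cd_signature(e240, e260, e295):
    """Suggest a G4 topology from the sign of the ellipticity at three wavelengths.

    e240, e260, e295: ellipticity near 240, 260 and 295 nm (any consistent unit).
    """
    if e295 > 0 and e260 < 0:
        # positive 295 nm band with a negative 260 nm band
        return "anti-parallel"
    if e260 > 0 and e240 < 0 and e295 <= 0:
        # positive 260 nm band, negative 240 nm band, no 295 nm band
        return "parallel"
    if e295 > 0 and e260 > 0:
        # positive 295 nm band with a positive 260 nm shoulder
        return "mixed parallel/anti-parallel"
    return "undetermined"


# Hypothetical ellipticity values (mdeg), illustration only
print(classify_cd_signature(e240=1.0, e260=-2.0, e295=3.0))   # sodium-like profile
print(classify_cd_signature(e240=-0.5, e260=1.0, e295=4.0))   # potassium-like profile
print(classify_cd_signature(e240=-1.0, e260=3.0, e295=0.0))   # all-parallel profile
```

In practice the exact positions of the minima and maxima shift by a few nanometres between sequences and buffers (for example 235 versus 240 nm), so the wavelengths would need to be chosen from the measured spectrum.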

The conformational behavior of 22AG as a function of the metal ion was also investigated by NMR, which is a powerful tool for studying the structure, kinetics and dynamics of biological macromolecules. NMR has allowed the determination of atomic resolution structures of G4s as well as of their intermolecular interactions with small organic drugs. Structural studies of G-quadruplexes by NMR have been abundantly covered in several reviews (185,198). To conduct such NMR studies, highly concentrated samples are often employed. After the unambiguous identification of the main G4 peaks, such as the imino, H8, H1' and H6 peaks, by 1D and 2D NMR experiments, notably 1H-15N HMBC, TOCSY and COSY experiments, together with point mutations in the sequence, a complete assignment of all the imino, amino, aromatic, sugar and methyl protons necessary to calculate a high resolution structure can be established. Afterwards, NMR restraints can be derived in order to obtain a set of interatomic distances, hydrogen bonds, dihedral angles and planarity restraints for the G-quartet core. Altogether, these restraints generated experimentally from NMR spectra are included in restrained molecular dynamics to search for the most favorable G4 conformation and obtain the final 22AG structure derived from the NMR restraints.

Figure 18: Polymorphism of human telomeric sequence motif (22AG) characterized by CD.

CD spectra of 3 µM 22AG. When annealed in 90 mM sodium solution, the CD profile displays two positive bands at 295 and 245 nm and a negative band at 260 nm (black line), indicating a unique anti-parallel conformation. In potassium solution, the CD profile displays a strong positive band at 290 nm and a small negative band at 235 nm in addition to a broad shoulder around 260-270 nm; this signature suggests the presence of mixed parallel and anti-parallel structures.

Briefly, the formation of G-quartets within G-quadruplex structures gives rise to characteristic imino proton signals in the downfield region between 10 and 12 ppm (199). The observation of sharp peaks within this chemical shift range confirms the formation of G-tetrads. Amino protons, however, are broader and span the region from 6 to 10 ppm, which makes them more difficult to identify. Aromatic protons usually exhibit chemical shifts between 6 and 8.5 ppm, while sugar protons are found between 1 and 6.5 ppm. In the upfield region we also observe the methyl protons of thymines within the range 0 to 2.5 ppm (Fig. 19). Moreover, the existence of more than one G4 conformation formed by a single guanine-rich sequence can be detected immediately by NMR if the number of observed guanine imino peaks exceeds the number of guanines participating in a single species, as observed with the Tetrahymena telomeric G4 (200). In our work we started by acquiring 1D 1H NMR spectra (Fig. 20). The spectra make the polymorphism of 22AG in KCl-based solutions evident. 22AG in 90 mM sodium solution at 20°C shows 12 well-resolved imino proton peaks (Fig. 20B), corresponding to the 12 imino protons of the 12 guanine bases and indicating a single G4 structure. In contrast, more than 24 peaks were observed in potassium (Fig. 20A), which shows that several conformations are present. The structure of 22AG in sodium solution was previously solved (15), and the NMR studies showed a unique fold (Fig. 20B). Moreover, the NMR signals of 22AG are easy to identify. FRET studies performed in our lab confirmed its stabilization by TAP, and we believe the development of such compounds is of interest for cancer applications. For these reasons, the binding mechanism of TAP with 22AG is of great interest.
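The chemical shift windows and the imino-counting rule described above can be summarized in a short Python sketch. The window boundaries are the approximate ranges quoted in the text; the function names are illustrative and not part of the original analysis. Because the windows overlap, a given shift can map to more than one proton family.

```python
# Approximate 1H chemical shift windows (ppm) quoted in the text.
# Windows overlap, so one shift can belong to several proton families.
SHIFT_WINDOWS = {
    "imino": (10.0, 12.0),
    "amino": (6.0, 10.0),
    "aromatic": (6.0, 8.5),
    "sugar": (1.0, 6.5),
    "methyl": (0.0, 2.5),
}


def candidate_families(shift_ppm):
    """Return every proton family whose window contains the given shift."""
    return [
        name
        for name, (low, high) in SHIFT_WINDOWS.items()
        if low <= shift_ppm <= high
    ]


def suggests_multiple_folds(n_imino_peaks, n_tetrad_guanines):
    """More imino peaks than tetrad guanines hints at several conformations."""
    return n_imino_peaks > n_tetrad_guanines


print(candidate_families(11.2))         # a shift inside the imino window
print(suggests_multiple_folds(12, 12))  # Na+ case: 12 peaks for 12 guanines
print(suggests_multiple_folds(25, 12))  # K+ case: more than 24 peaks
```

Such a check only flags possible polymorphism; it does not say how many species are present or what their folds are.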

Figure 19: Proton NMR spectrum of 22AG at 500 µM annealed in 90 mM Na+ solution.

Imino, amino, aromatic, sugar and methyl protons are highlighted. The H2O signal around 4.7 ppm, suppressed by excitation sculpting, is shown, as well as the TMS (tetramethylsilane) signal used as reference at 0 ppm. Unless otherwise specified, all NMR experiments were performed at 20°C.


Figure 20: Polymorphism of human telomeric motif formed by the sequence 22AG characterized by NMR.

1D 1H NMR spectra showing the imino region of unlabeled 22AG annealed in 90 mM K+ (red) and in 90 mM Na+ (black).

B.3.2 Characterization of TAP compound

Our group previously reported the ability of a series of novel 2,4,6-triarylpyridine compounds to stabilize different G4 sequences. The stabilization of three G4-forming sequences (human telomeric G4, k-ras G4 and c-kit G4) by several compounds of this family, together with their selectivity and cytotoxicity, was assessed by different biophysical and biological methods (154). The structural features of these molecules were optimized to interact with G4 grooves and loops. The 4-aryl substituent on the central pyridine ring was varied with different groups, and the terminal amino group was modified by adding pyrrolidine or piperazine (Figure 17 C). These modifications increased the stabilization conferred by the compounds upon binding to G4, as shown by the increased Tm. The compound containing a thio-methyl-phenyl group on the central pyridine ring and a pyrrolidine group at the terminal amino position (Figure 17 C) showed good stabilization of the human telomeric G4, with a ∆Tm of 31.6 ± 1.8°C in 100 mM Na+ as shown by FRET melting assay. Among all the tested compounds, it also showed the highest selectivity for G4 over duplex DNA, as well as a significant cytotoxicity index for HeLa cells, which may indicate an anti-proliferative effect. Together, these characteristics make this compound a promising G4 ligand (154). For that reason, we selected this compound to probe the interaction with the human telomeric G4 formed by the sequence AG3(T2AG3)3. Furthermore, structural studies of the complex between G4 and TAP could be used to improve the scaffold of this family of ligands and guide the selection of side groups. Two main strategies were adopted to obtain TAP ligands. In the first, the ligands were produced in house by a postdoctoral fellow and were mainly used for the biophysical and cytotoxicity tests. In the second, a larger quantity of the best ligand was acquired from Greenpharma (France) for structural purposes, since NMR studies can consume tens of milligrams of compound. The in-house strategy had advantages and drawbacks. Its main advantage was the capacity to modify the ligand at will in the laboratory, but it yielded much smaller quantities. In addition, the purification was very cumbersome, with extended purification cycles and low to moderate yields.

The purchased molecule overcame this obstacle but, unlike the molecule synthesized in house, the batches produced by the supplier were not in the imino salt form. The newly synthesized, uncharged form may explain the reduced ability of TAP to stabilize the 22AG G4 (∆Tm of 7°C) compared with the original ligand (∆Tm of 31.6°C). Nevertheless, we could not wait for large-scale production of the charged form of the ligand, and we proceeded with the structure calculations of the 22AG-TAP complex. The synthesized TAP showed good solubility in both methanol and the Na-phosphate buffer used in this work. To better characterize the compound, two concentrated preparations (~5 mM) were made, one in deuterated methanol (MetOD) and one in Na-phosphate buffer. The protons of TAP in MetOD were unambiguously assigned according to (154) (Fig. 21A). In addition, we repeated the experiments at different ratios of MetOD to phosphate buffer to follow the individual peaks from 100% MetOD to buffer alone, where they change or collapse. The behavior of the ligand in Na-phosphate buffer differs dramatically from that observed by NMR in MetOD. The assignments in Na-phosphate buffer are shown in Figure 21B. Some peaks, such as those of the protons linked to carbons 5 and 5' as well as most of the pyrrolidine protons (D/D': H3 and H4), have merged into broad peaks, probably because the ligand molecules stack and organize with each other to minimize the exposure of their hydrophobic chains to the solvent.


Figure 21: Assignments of the compound TAP in both deuterated methanol (d6-MetOD) and Na-phosphate buffer.

A) Assignments in MetOD using the numbering shown in Fig. 17. B) Assignments in Na-phosphate buffer using the numbering shown in Fig. 17. s: singlet, d: doublet, t: triplet, m: multiplet.

B.3.3 TAP binds to and stabilizes 22AG

Although more than ten intra-molecular human telomeric G4 structures have been solved so far by NMR spectroscopy, only a few drug/human telomeric G4 complexes have been determined (135). CD spectroscopy was first used to investigate changes in the conformation of 22AG in the presence of TAP. The anti-parallel quadruplex displays a specific CD profile in sodium conditions (Fig. 22A). To test the effect of TAP on the structure of 22AG, a CD titration from 0 to 5 TAP/22AG molar ratio was performed by adding the ligand to 22AG before and after annealing. Addition of TAP to 22AG decreases the intensity of the CD signal at 245, 260 and 295 nm (Fig. 22A), and the small shifts observed in the CD spectra suggest that TAP binding does not induce a major structural rearrangement of the 22AG conformation. The moderate extent of the changes at the 295, 260 and 245 nm bands may indicate that, as expected, the ligand does not bind as an intercalating agent but rather binds to a solvent-exposed surface of the G4. The binding modes that require consideration are therefore end-stacking on the outer quartets and groove or loop binding.

In fact, such a modest change in the CD signal does not necessarily indicate an absence of binding between TAP and 22AG, as was shown previously with several G4 (201). The structural perturbation of 22AG by TAP is therefore minimal and may reflect small changes in the relative orientation of the tetrads. To confirm the stabilizing effect of TAP on 22AG, we monitored the interaction by CD melting at a 2:1 molar ratio between the two partners (Fig. 22B). The CD melting profile without the compound shows a melting temperature (Tm) of ~45°C; the presence of the compound increases the Tm to nearly 52°C, conferring ~7°C of stabilization to 22AG. The limited capacity of TAP to alter the 22AG structure, shown by CD spectroscopy at increasing concentrations of TAP, is consistent with the small stabilization observed by CD melting. Together, the small shifts observed by CD spectroscopy and the modest thermal stabilization suggest little rearrangement of the G4 structure in the presence of TAP.
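As an illustration of how a Tm and a ∆Tm can be read from melting curves such as those described above, the following Python sketch estimates Tm as the temperature at which the normalized signal crosses its half-transition point. This midpoint definition is a common simplification and is assumed here; the curves are synthetic two-state sigmoids centered on the reported Tm values, not the measured CD data, and all function names are hypothetical.

```python
import math


def normalize(signal):
    """Scale a melting curve to the 0-1 range (1 = folded, 0 = unfolded)."""
    low, high = min(signal), max(signal)
    return [(value - low) / (high - low) for value in signal]


def melting_temperature(temps, signal):
    """Estimate Tm as the temperature where the normalized signal crosses 0.5."""
    frac = normalize(signal)
    for i in range(len(temps) - 1):
        f0, f1 = frac[i], frac[i + 1]
        if f0 != f1 and (f0 - 0.5) * (f1 - 0.5) <= 0:
            # Linear interpolation between the two bracketing points.
            return temps[i] + (0.5 - f0) * (temps[i + 1] - temps[i]) / (f1 - f0)
    raise ValueError("the curve never crosses the half-transition point")


def synthetic_curve(temps, tm, width=3.0):
    """Hypothetical two-state sigmoid standing in for the CD signal at 295 nm."""
    return [1.0 / (1.0 + math.exp((t - tm) / width)) for t in temps]


temps = list(range(20, 91))              # 20 to 90 °C in 1 °C steps
free = synthetic_curve(temps, tm=45.0)   # placeholder for 22AG alone
bound = synthetic_curve(temps, tm=52.0)  # placeholder for 22AG with TAP

tm_free = melting_temperature(temps, free)
tm_bound = melting_temperature(temps, bound)
delta_tm = tm_bound - tm_free
print(f"Tm free = {tm_free:.1f}, Tm bound = {tm_bound:.1f}, dTm = {delta_tm:.1f}")
```

With real data, baseline correction or a fit to a two-state model is usually preferable to simple min-max normalization, which is sensitive to sloping baselines.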


Figure 22: Interaction and stabilization of 22AG human telomeric G4 with TAP.

A) CD titration spectra of 5 µM 22AG annealed alone in 90 mM Na+ buffer, followed by successive additions of the 2,4,6-triarylpyridine compound from 0 to 2.5 molar equivalents; data were collected at 20°C. B) CD melting of 5 µM 22AG annealed alone in 90 mM Na+ buffer (black line) and in the presence of 2.5 molar equivalents of the 2,4,6-triarylpyridine compound (red line). Melting was monitored at 295 nm. The difference in melting temperature (∆Tm) is around 7°C.


B.3.4 NMR analysis of the TAP-22AG complex and binding site determination

As described previously, targeting G4s with small compounds has attracted growing attention in recent years for potential cancer-associated applications, namely tumor detection and treatment (202-205). Binding of ligands to G-quadruplexes can be readily assessed by solution NMR. Nevertheless, only a few NMR structures of G4-ligand complexes have been solved. Obtaining well-defined complexes remains challenging, mainly because most ligands aggregate in water-based buffers. NMR nonetheless remains a powerful technique that allows a detailed analysis of the binding properties and structural features of G4-ligand complexes that is not possible otherwise. Different NMR experiments can be used to study G4-ligand interactions, such as 1D and 2D titrations, selective NOE, STD and WaterLOGSY. It is worth mentioning that neither STD nor WaterLOGSY was successful for the study of the G4-TAP interaction.

The NMR method of choice, i.e. the 1D 1H titration, allows us to follow the chemical shifts of distinct proton families in the presence of increasing ligand concentrations. In addition, changes upon ligand addition can be followed by 2D NMR experiments such as NOESY and HSQC. When exchange occurs on the fast to intermediate NMR time scale (µs to ms), ligand binding to G4 can broaden the peaks involved in the interaction. To characterize the interaction between TAP and 22AG, a 1D 1H NMR titration was performed, aiming both to confirm the conformational features of 22AG in interaction with TAP observed by CD spectroscopy and to determine the binding "hot spots" between 22AG and the compound. As shown before, 22AG in 90 mM sodium solution at 20°C shows 12 well-resolved imino peaks, corresponding to the 12 imino protons of the guanine bases and indicating a single G4 structure (Fig. 20B). The interaction of 22AG with the TAP ligand was studied by titrating 22AG with increasing concentrations of the compound, from 0 to 5 molar equivalents. Adding the compound to 22AG gradually changes the chemical shifts, particularly in the upfield region of the G4 protons. Up to a TAP/G4 ratio of ~2, the titration shows progressive and selective changes (Fig. 23A), and at ratio 2 it induces a dramatic line broadening (Fig. 23A). The chemical shift variations were measured for all the bases at 293 K to explore the binding site of the compound on 22AG (Fig. 23B). Among the protons that were unambiguously assigned (section B.3.5), those that experienced the largest chemical shift perturbations are the H1' (anomeric), H3' and H4' protons (Fig. 23B). These results were somewhat expected, since these sugar protons are more exposed towards the exterior of 22AG. As observed by CD spectroscopy, the guanines did not change their relative conformation significantly, which is corroborated by the mild shifts observed for the imino protons upon titration. The chemical shift changes are clearly much larger in the region formed by the bottom loop of the G4, comprising T18 and A19, together with T6 and A7, the latter lying in front of A19. This region is located at the bottom of the G4, under the third tetrad, and appears to favor the interaction of TAP through a loop/groove binding mode. The H1' protons of thymine 18 (T18) and adenine 19 (A19), the H2' protons of T6, T17 and G16, and the H5' proton of G16 show the largest chemical shift changes compared with the other proton resonances (Fig. 23B). Furthermore, the broadening and disappearance of some resonances, such as H1-G16, H1-G4, H8-A7, H8-A19 and H6-T18, suggest that TAP binding to the G4 is in intermediate chemical exchange on the NMR time scale. In terms of bases, we could identify a ligand binding "hot spot" in the region of T18 and A19. The anomeric proton of A7, which directly faces A19 on the opposite side of the major groove, seems to be part of this apparently preferred binding region, which encompasses a short loop located opposite to where the 3' and 5' strand ends join together to participate in the formation of the third tetrad. Furthermore, the shape of the TAP compound suggests that it can easily reach its binding site on 22AG by extending its side chains, which are arranged around the 4-aryl core (Fig. 17C) that is capable of establishing an inter-molecular π-stacking interaction. In addition, the binding mode of TAP with 22AG was previously investigated by molecular dynamics, and it allowed for stronger interactions (174).
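The chemical shift perturbation mapping described above can be sketched in a few lines of Python. The shift values below are invented for illustration and are not the measured data; the cutoff of one standard deviation above the mean is an assumed convention, not necessarily the criterion used in this work, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def shift_perturbations(free_ppm, bound_ppm):
    """Absolute 1H chemical shift change for every proton assigned in both states."""
    return {
        proton: abs(bound_ppm[proton] - free_ppm[proton])
        for proton in free_ppm
        if proton in bound_ppm
    }


def hot_spots(perturbations, n_sigma=1.0):
    """Protons whose perturbation exceeds mean + n_sigma * standard deviation."""
    values = list(perturbations.values())
    cutoff = mean(values) + n_sigma * pstdev(values)
    return sorted(p for p, delta in perturbations.items() if delta > cutoff)


# Invented shifts (ppm) for illustration only; these are NOT the measured values.
free = {
    "T18-H1'": 6.10, "A19-H1'": 6.20, "G3-H1": 11.50,
    "G9-H1": 11.80, "G15-H8": 7.40, "G21-H8": 7.60,
}
bound = {
    "T18-H1'": 6.25, "A19-H1'": 6.32, "G3-H1": 11.51,
    "G9-H1": 11.79, "G15-H8": 7.41, "G21-H8": 7.60,
}

csp = shift_perturbations(free, bound)
print(hot_spots(csp))
```

Resonances that broaden beyond detection in the complex drop out of such a table, so line broadening has to be tracked separately, as is done in the text.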


Figure 23: Chemical shift mapping of 22AG in the presence of 2.5 molar equivalents of TAP compound.

A) Imino and aromatic proton regions of the titration spectra of 500 µM 22AG with different concentrations of 2,4,6-triarylpyridine in 200 µl of sodium phosphate buffer containing 20 mM Na2HPO4/NaH2PO4 and 70 mM NaCl at pH 6.9. The assignment of the protons and the 22AG/ligand ratios are shown above the spectra. B) Chemical shift perturbation of the 22AG protons between the free form and the complex formed with 2,4,6-triarylpyridine at 20°C. The base numbers of 22AG are shown at the bottom of the figure and the different protons are represented by different colors shown above the figure.


B.3.5 Assignment of resonances belonging to the bound 22AG

The assignment of resonances in nucleic acids, including G4, usually follows the classical sequence-specific assignment strategy described in detail elsewhere (198). Several NMR experiments are required for this step. 1H-1H NOESY (206) spectra were acquired at different mixing times (50, 200 and 300 ms); this experiment is crucial for assignments and is based on through-space (dipolar) coupling (Fig. 24). NOESY depends on the distance between the interacting spins: peaks are observed when the distance is shorter than 5-6 Å. Sequential assignments are derived from a sequential walk along the NOE connectivities between the H8/H6 aromatic proton and the H1' sugar proton of the same residue and the H8/H6 of the following residue (Fig. 25); the sequential walk can also be traced from the 5' and 3' sugar protons. Through-bond DQF-COSY (207) and TOCSY (208) experiments, which are based on scalar coupling, can also be used to confirm the assignment of the sugar protons, namely H3', H4', H2'/H2" and H5'/H5". Heteronuclear experiments such as 13C-1H HSQC also help to distinguish aromatic from sugar protons within the same residue according to their couplings with different carbons. It is preferable to use labeled DNA for the heteronuclear experiments, but they can also be run at natural abundance if the DNA is not isotopically labeled (the natural abundances of 13C and 15N are ~1% and 0.37%, respectively). In our case we used uniformly labeled (13C and 15N) 22AG to help with the assignments. To simplify the assignments, we first characterized 22AG alone and only then in the presence of TAP. The assignment was based on the combination of through-space and through-bond experiments, with the help of already published work (15).

Once the H8/H6-H1' assignment pathway had been traced from the A1 residue at the 5' end of 22AG to the G22 residue at its 3' end (Figure 26A), the next step was to determine the correlations between the imino/amino protons of guanines and the H8/H6/H2 base protons, as shown in Fig. 26B. Subsequently, NOESY and TOCSY spectra were used to assign most of the sugar protons for all the residues (Figure 27A, 27B). A COSY experiment was used to identify the H2'/H2" and H5'/H5" sugar protons (Fig. 28), but some H2" and H5" peaks were absent, as reported before (209). To remove ambiguities and verify all the steps, three 1H-13C HSQC spectra were recorded at natural abundance for different regions: aromatic protons (Fig. 29A), sugar protons H1', H3' and H4' (Fig. 29B), and sugar protons H2'/H2" and H5'/H5" (Figure 30). In addition, all the protonated carbons were assigned using the 1H-13C HSQC spectra. A 1H-15N HSQC spectrum was also recorded using labeled 22AG to assign the nitrogen atoms (Fig. 31). Following this strategy, we were able to assign all the atoms of the bound G4 with the exception of the unprotonated heavy atoms.
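To make the H8/H6-H1' sequential walk concrete, the Python sketch below lists the idealized chain of NOE connectivities from A1 to G22 for the 22AG sequence AG3(T2AG3)3. It only enumerates the expected pattern; in a folded G4 some steps can be weak or absent (for example around Syn guanines or in loops), and the function names are illustrative.

```python
# 22AG sequence, AG3(T2AG3)3, written 5' to 3'.
SEQUENCE = "AGGGTTAGGGTTAGGGTTAGGG"


def base_proton(base):
    """Purines (A, G) carry H8; the pyrimidine T carries H6."""
    return "H8" if base in "AG" else "H6"


def sequential_walk(sequence):
    """List the H8/H6-H1' NOE connectivities expected along the strand."""
    steps = []
    for i, base in enumerate(sequence):
        residue = f"{base}{i + 1}"
        # Intra-residue cross-peak: aromatic proton to its own H1'.
        steps.append((f"{residue} {base_proton(base)}", f"{residue} H1'"))
        if i + 1 < len(sequence):
            nxt = sequence[i + 1]
            # Sequential cross-peak: H1'(i) to the aromatic proton of residue i+1.
            steps.append((f"{residue} H1'", f"{nxt}{i + 2} {base_proton(nxt)}"))
    return steps


walk = sequential_walk(SEQUENCE)
for first, second in walk[:4]:
    print(f"{first}  <->  {second}")
```

For a 22-residue strand this gives 22 intra-residue and 21 sequential connectivities, which is the pathway traced in the expanded NOESY region.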


Figure 24: 300 ms 2D 1H-1H NOESY spectrum of 22AG alone (black) superimposed with 300 ms 2D NOESY spectrum of TAP:22AG complex at 2.5:1 stoichiometry (red-green contours).

Figure 25: H6/H8-H1' assignment and sequential walking. Correlation between H8 and H1' of the two residues A1 and G2 of the 22AG sequence.


Figure 26: 2.5:1 TAP:22AG complex assignment.

A) The expanded H8/H6-H1' 300 ms 2D-NOESY spectrum of the 2.5:1 TAP:22AG complex, the complete sequential walk assignment is shown. B) The expanded 300 ms 2D-NOESY spectrum of the TAP:22AG complex at 2.5:1 stoichiometry of the H1-H8/H2 region.


Figure 27: Portions of a 120 ms 2D TOCSY (Blue) superimposed with a 300 ms 2D NOESY spectrum (gold) of TAP:22AG complex at 2.5:1 stoichiometry.

A) Portion showing the correlations between H2'/H2"-H1' and H2'/H2"-H1'. B) Portion showing the correlations between H3'-H1'.


Figure 28: Portion of a 2D COSY spectrum (green) superimposed with a 300 ms 2D NOESY spectrum (gold) of the TAP:22AG complex at 2.5:1 stoichiometry.

Portion showing the correlations between H2'/H2"-H1'.


Figure 29: 1H-13C HSQC spectra of TAP:22AG complex at 2.5:1 stoichiometry.

A) Spectrum showing the C8/C6-H8/H6 correlations in the aromatic region. B) Spectrum showing the C1'-H1', C3'-H3' and C4'-H4' correlations in the sugar region.


Figure 30: 1H-13C HSQC spectra of TAP:22AG complex at 2.5:1 stoichiometry.

A) Spectrum showing the C2'-H2'/H2" correlations in the sugar region. B) Spectrum showing the C5'-H5'/H5" correlations.


Figure 31: 1H-15N HSQC spectrum of TAP:22AG complex at 2.5:1 stoichiometry.

Spectrum showing the N1-H1 correlations of the guanines involved in the formation of G-tetrads.

B.3.6 Assignment of resonances belonging to the bound TAP

The protons of the TAP ligand are named as shown in Fig. 17C. The assignments of the TAP compound alone in Na-phosphate buffer are shown in Fig. 21. The aliphatic intra-molecular NOE correlations appeared quite broad and poorly resolved in the NOESY spectra, so these peaks were assigned tentatively and treated as ambiguous restraints for the structure calculation (Fig. 32). The aromatic peaks are not distinguishable in the NOESY spectrum. To obtain better resolved ligand peaks, different strategies were employed, as previously reported (133). We tested temperatures from 277 to 308 K and salt conditions from 10 to 100 mM Na+, and several pH conditions were also screened. Unfortunately, the ligand peaks remained broad, suggesting the formation of aggregates at high concentration in the presence of G4.


Figure 32: Portion of the 1H-1H NOESY spectrum of the TAP:22AG complex at 2.5:1 stoichiometry (green) superimposed with the NOESY spectrum of 22AG alone (red).

Spectrum showing the intra-molecular correlations of the aliphatic protons of the TAP ligand.

B.3.7 Intermolecular NOEs detection

The detection of NOEs resulting from the interaction between G4 and ligand is crucial to determine the binding mode. In several NMR studies of G4-ligand interactions, the mapping of chemical shift perturbations upon titration of the ligand into G4, together with intermolecular NOEs, has been used to explore the binding mode and determine the structure of the complex (133,135,153). Comparing the NOESY spectra of 22AG alone and in the presence of TAP, no intermolecular NOEs between the G4 and TAP could be identified unambiguously. We therefore decided to calculate the structure of the G4 under TAP-binding conditions, using NMR-derived restraints obtained in the presence of TAP, and subsequently to perform unrestrained docking to generate the complex.


B.3.8 Structure calculation

Different programs, based on several force fields, can be used for G4 structure calculation. Two main programs for NMR structure calculation are XPLOR-NIH (210) and ARIA2.3-CNS2.1 (186,188,189). These programs use the potential energy parameters of the CHARMM force field improved for nucleic acids (211-213). The Sander module of the Amber package, which contains several force fields, can also be used to perform NMR structure calculation. The force field used for G4 structure calculation is FF12SB, which contains improved potential energy parameters for nucleic acids and the Barcelona force field changes (190). A combination of different force fields can also be applied (187). In the different procedures used for the structure calculation of DNA G4, the most critical parameters are the energy weight of the NOE restraints, the cooling procedure of the simulated annealing protocol, and the length of the molecular dynamics simulation (198,211). The structure calculation can be run in vacuum or in implicit solvent, with or without electrostatic contributions. The final NMR refinement can be performed in explicit solvent, where the system is solvated in the presence of counter-ions to neutralize the charges and cations are introduced between the G-tetrads for a more realistic representation of the G4.

The detailed protocol is described in part B.2.3. Briefly, the structure was calculated in two steps: a first step based on the chemical shift investigation using ARIA/CNS, and a second step of refinement using the AMBER package. Different types of restraints are used during the structure calculation of G4: inter-proton distance restraints derived from NOEs, hydrogen bond restraints, dihedral angle restraints and planarity restraints. Distance restraints were derived from NOEs between protons in close spatial proximity (< 5 Å). The intensity of a NOESY cross-peak is inversely proportional to the sixth power of the distance between the protons undergoing cross-relaxation (214). Under certain circumstances, indirect cross-relaxation via another proton, called spin diffusion, can alter the NOE intensity during the mixing time. To address this problem, we generated NOE build-up curves from several experiments at different mixing times (50, 200 and 300 ms) (185). The reference used to calibrate the assigned peaks was determined by averaging the six intensities between the H5 (methyl) and H6 protons of the thymines, which corresponds to a distance of 2.99 Å. To determine the distance (Dab) between two protons a and b with an NOE cross-peak intensity (Iab), calibrated against a known inter-proton separation (Dref) with a measured intensity (Iref), the following equation is used:

$D_{ab} = D_{ref} \left( I_{ref} / I_{ab} \right)^{1/6}$

The connectivities between the imino and the H8/H6 protons (Fig. 26B) were used to identify the formation of the three G-tetrads: Tetrad 1 (G2-G10-G14-G22), Tetrad 2 in the middle (G3-G9-G15-G21) and Tetrad 3 (G4-G8-G16-G20). The relative intensities of the intra-residue H8-H1' cross-peaks allowed us to determine the Syn/Anti conformation of the guanine bases. We observed strong intra-residue H8-H1' cross-peaks for guanines G10, G22, G3, G15, G8 and G20, indicating a Syn conformation for these guanines. In contrast, weak intra-residue H8-H1' cross-peaks were observed for guanines G2, G14, G9, G16, G4 and G21, indicating an Anti conformation (Fig. 26A). Hydrogen bond and artificial planarity restraints kept the G-quartets planar during the first step of the calculation. The planarity restraints were removed during the second step. The characteristics of the five best structures generated (Fig. 33) are summarized in Table 1. The average of the five best structures was used for the docking study with TAP.
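The distance calibration described above follows directly from the inverse sixth-power dependence of the NOE intensity on distance. A minimal Python sketch, assuming the 2.99 Å reference distance quoted in the text and using made-up cross-peak volumes (the numbers and the function name are illustrative, not taken from the actual spectra):

```python
D_REF = 2.99  # reference distance in angstroms quoted in the text


def noe_distance(i_ab, i_ref, d_ref=D_REF):
    """Distance from the r^-6 dependence: D_ab = D_ref * (I_ref / I_ab) ** (1/6)."""
    if i_ab <= 0 or i_ref <= 0:
        raise ValueError("cross-peak intensities must be positive")
    return d_ref * (i_ref / i_ab) ** (1.0 / 6.0)


# Hypothetical reference: average of six made-up thymine cross-peak volumes.
reference_volumes = [1.02e6, 0.97e6, 1.05e6, 0.99e6, 1.01e6, 0.96e6]
i_ref = sum(reference_volumes) / len(reference_volumes)

for label, volume in [("strong", i_ref), ("medium", i_ref / 4), ("weak", i_ref / 16)]:
    print(f"{label:>6}: {noe_distance(volume, i_ref):.2f} angstroms")
```

Because of the sixth root, a 16-fold drop in intensity corresponds to less than a 1.6-fold increase in distance, which is why NOE-derived distances are usually applied as bounded restraints rather than exact values.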

Table 1: NMR restraints and structural statistics for the 22AG G4 in the presence of TAP

NMR restraints                        Statistics
Total restraints                      413
  Intra-residual                      275
  Inter-residual                      89
  Medium-range                        20
  Long-range                          29
H-bond distance restraints            24
Torsion angles                        12

Structural statistics
NOE violations
  Number (>0.2 Å)                     2.2
  Maximum violations                  0.32
Torsion violations
  Number (>5°)                        0
  Maximum violations                  4.14
H-bond violations
  Number (>0.1 Å)                     1
  Maximum violations                  0.15
Global coordinate precision
  Backbone average                    1.00 ± 0.14
  All heavy atoms                     1.58 ± 0.13
RMSD from ideal geometry
  Bond lengths (Å)                    0.01 ± 0.00
  Bond angles (°)                     2.53 ± 0.05
Average AMBER energy (kcal/mol)       -2131 ± 3


Figure 33: Solution structure of the G4 formed by 22AG in Na+ and in the presence of 2.5 molar equivalents of TAP.

Ensemble of five superimposed refined structures.

According to the chemical shift perturbation mapping (CSPM) obtained by titrating 22AG with TAP (Fig. 23), it was possible to identify a hot-spot region located between the lateral loop formed by the bases T17, T18 and A19, which faces the base A7 in the opposite lateral loop, and the third tetrad formed by guanines 4, 8, 16 and 20 (Fig. 34). To better understand the binding, we performed unrestrained docking between 22AG and TAP using the AutoDock Vina tool of UCSF Chimera. Different docking tests were performed and the ten most energetically favorable ligand poses were selected for analysis. The docking results distinguish two different binding sites on the surface of 22AG (Fig. 35). The first site, present in the seven lowest-energy poses, partially agrees with the NMR CSPM data. This site is located in the region between the lateral loop containing A19 and T18 and the third tetrad: one side chain projects into the loop, and the other interacts with the major groove of the G4. Looking at the binding points in more detail, two electrostatic interactions are possible between the amide groups of TAP and the phosphate backbone of guanine 4 and thymine 6 (Fig. 36). Interestingly, another site, located between the diagonal loop formed by T11 and T12 and the minor groove of the G4, is found for three of the ten lowest-energy poses. The interactions that can be established with 22AG at this site are H-bonds between the carbonyl group of TAP and the exchangeable amino protons of guanine 4 or guanine 9. An electrostatic interaction with the phosphate backbone of T12 is also possible (Fig. 37).


The interactions observed in the docking suggest that TAP can bind the grooves and/or loops of the G4 by establishing electrostatic interactions with the phosphate backbone and H-bonds with exchangeable amino protons, thereby better stabilizing the G4. This behavior is partly in agreement with the observation that no saturation plateau is reached during the titration experiment. The TAP compound probably does not have a very high affinity for 22AG, but the existence of a second binding site allows a better stabilization of 22AG. The generated models help us to better understand the behavior of TAP family compounds. Groove binders should be able to discriminate between grooves of different sizes (minor, medium and large). Consequently, they should be able to discriminate between different G4 structures, such as the telomeric basket-type structure, which has only minor and major grooves, and the propeller-type structure, which has only medium grooves. Most G4-ligand complexes solved so far show π-π stacking with intramolecular G4s. The few showing an alternative binding mode involve a parallel-stranded inter-molecular G4, as with Distamycin A (169) (127,215). This molecule is a groove binder that interacts with the medium grooves of [TG4T]4. According to these observations, TAP-like molecules might also prefer to bind medium grooves, which are unfortunately not present in the 22AG basket-type structure studied here. TAP-like molecules might therefore have a better affinity for the propeller-type telomeric structure. π stacking is the most favorable binding mode for intra-molecular G4, as shown for a telomestatin derivative with the human telomeric G4 (135) and PhenDC3 with the c-myc G4 (144). To improve the binding affinity of TAP ligands for intramolecular G4, it would be worthwhile to enlarge the scaffold while conserving the side chains, which can easily interact with the grooves of the G4. Enlarging the 4-aryl core by adding rings should also increase the specificity for G4 over duplex DNA. Larger surfaces allowing π stacking, along with the side chains that interact with grooves, should also increase the affinity of TAP for the G4.


Figure 34: Region of Hot spots derived from chemical shift perturbations mapping (green surface).

Figure 35: Interactions between 22AG and TAP observed by unrestrained docking.

TAP in orange; loops formed by T11, T12, A13 and T17, T18, A19 in green.


Figure 36: Possible interactions between 22AG and TAP at site 1, located between the lateral loops and the major groove.

Figure 37: Possible interactions between 22AG and TAP at site 2, located between the diagonal loop and the minor groove.


Part 2 Study of G4 and ligand interactions inside living cell by NMR spectroscopy

C. Study of G4 and ligand interaction inside living cell by NMR spectroscopy

C.1 Introduction

Since G4 formation by G-rich sequences was first highlighted (68), several studies have been performed under various conditions more or less close to the physiological environment. At present, NMR structures and dynamics of G4 are investigated mainly under in vitro conditions. The media used for in vitro studies are usually prepared to confer good solubility and stability on the molecules under study, or even to facilitate spectral acquisition. Most in vitro buffers neglect the effects of viscosity, molecular crowding and interactions with other macromolecules, as well as the influence of the concentration of ions and other small molecules on the conformation and dynamics of the structure. Besides the usual in vitro conditions such as sodium or potassium phosphate and lithium cacodylate buffers (216-218), different molecular crowding mimics have also been employed, for example organic solvents, ammonium, bovine serum albumin (219), high concentrations of PEG (116,220-222) or hen egg white (223). Previous studies have shown that adding such crowding agents to solutions containing G4 changes several properties, namely kinetics, thermodynamics, thermal and pressure stability, binding to compounds and conformational rearrangement of structures (224-230). Several biophysical and structural techniques have been used to investigate the folding and structural features of G4 in crowded environments, mainly CD, native PAGE and NMR spectroscopy (222,229). Even though the characteristics of these artificial crowding mimics make them good tools for approximating in vivo conditions, the inside of the cell is too complex to be mimicked perfectly. The intracellular environment of eukaryotic cells is highly crowded and its macromolecular features are hard to define.
The concentration of total proteins inside cytoplasm of eukaryotic cells is in the range 200–300 g/l, whereas that of RNA is in the range 75–150 g/l (231). The DNA is located in the nucleus within chromosomes and represents approximately 10 µg for 10 millions of cells. Recent advances in the NMR field allowed to develop an innovative way to study biomolecules with atomic detail in a physiological environment compared to in vitro studies (232-235). Performing NMR experiments inside living cells is relatively new and challenging, but it can provide information in more realistic conditions. In-cell NMR studies require isotopically labeled molecules in order to be detected in the cytoplasm or inside the nucleus environment. The use

71

of isotopes is essential to have a clean and "invisible" background from the cellular molecules. Recently, several studies of isotopically labeled DNA and RNA were carried out inside the eukaryotic environment of Xenopus laevis oocytes (235,236). In addition to the fact that only a few number of studies probing G4 inside living cells were done so far, all of them have been performed inside Xenopus laevis oocytes (222,227,229,237). During the last years, in-cell NMR have been largely used to investigate several biological and structural features of proteins such as post-translational modifications, phosphorylation, acetylation and glycosylation (238-241). In addition protein-protein and protein-ligand interactions were also studied inside living cells, including human cells (233,242-245). Moreover, protein structures were also resolved in cell conditions (246,247). The study of G4 inside living cells is very challenging and can extend our understanding on how G4 behave in physiological environment. Previous studies have attempted to probe the general fold of G4 inside cells, and despite the poor resolution, the results inferred that the in cell spectra contained important differences when compared with the in vitro spectra (229,237). To our best knowledge, G4 have never been investigated inside mammalian cells by NMR.

As previously reported, adapting in-cell NMR to nucleic acids, including G4, poses specific challenges (236). Several critical parameters have to be evaluated and optimized to observe G4 inside living cells:

A) In-cell NMR experiments require 13C, 15N labeled DNA, and a relatively high concentration of it has to be delivered inside the cells. Consequently, a large quantity of 13C, 15N labeled DNA has to be synthesized, and an efficient delivery technique has to be employed.

B) The cells observed by in-cell NMR have to remain viable during the NMR measurements, which take around 24 h.

C) Leakage of DNA from the cells should be avoided during the NMR experiments.

Probing G4-ligand interactions inside living cells has tremendous potential, given the interest in developing ligands with better affinity and selectivity. Although these tasks are very challenging and reach the limits of feasibility, at least by NMR, they are important and deserve to be analyzed and developed in further detail. In addition to the weak NMR signal usually observed under in-cell conditions, ligand binding to G4 can broaden the peaks involved in the interaction, markedly decreasing their intensity. The biological characteristics of the ligand also have to be investigated, notably its toxicity towards the cell type used for the in-cell NMR experiments.

For our studies, the Tetrahymena tetra-molecular telomeric G4 formed by d[(TG4T)]4 (87) was selected as a model to evaluate and optimize the in-cell conditions. Two main features explain this selection: the absence of polymorphism for this structure (248,249), and the relatively slow kinetics of strand association and dissociation, which can easily be followed by different techniques. Moreover, the d[(TG4T)]4 G4 is well characterized in vitro by different methods such as CD, UV (250-255), mass spectrometry (256), NMR (257) and X-ray crystallography (31), in the presence of different counter ions (Na+, K+ and NH4+), of crowding agents such as PEG (221,258) and a cationic comb-type copolymer (259), and of ligands such as 360A (260), Distamycin A (152,169,261) and the DMSB cyanine compound (153).

C.1.1 DNA vectorization methods

Depending on the cell type, different delivery techniques are available to introduce DNA into cells. The delivery methods can be classified into two groups: i) direct transfer by physical methods such as microinjection and electroporation, the two methods used for DNA delivery in this work, and ii) transfer via carrier methods such as lipofection and the biolistic method. We did not have time to test these additional methods, which are briefly described below.

1) Microinjection

Microinjection was the first technique used to deliver molecules inside living cells for eukaryotic in-cell NMR (234). This technique requires large cells such as Xenopus laevis oocytes, whose intracellular volume is on the order of 1 µl. NMR samples are prepared by injecting about 200-250 oocytes (235). Microinjection presents several advantages compared to the other techniques and was our first choice. The main advantages are the tolerance for injecting high concentrations of oligonucleotides, a relatively high degree of reproducibility and the possibility of automating sample preparation. However, microinjection cannot deliver the oligonucleotide to a large number of cells, and it requires a skilled operator to keep the fragile oocytes intact inside the NMR tube. NMR sample preparation by microinjection into small eukaryotic cells is experimentally not feasible, because the required number of manipulated cells is very high, often exceeding 10-20 million cells. To overcome this obstacle, other delivery methods, also developed for structural biology purposes over the last ten years, have to be employed.

2) Electroporation

DNA transfection by electroporation is probably the simplest and most efficient method for delivering proteins and oligonucleotides into prokaryotic and eukaryotic cells. Protocols and electroporation conditions for DNA delivery have been improved for a large range of cells over many years, making it the preferred method for DNA transfection across different fields of research. Electroporation temporarily increases the permeability of the plasma membrane by applying high-voltage electrical pulses to cells suspended in the presence of DNA. The electrical pulse creates a potential difference across the membrane and its charged components and induces temporary pores in the membrane, allowing DNA to enter the cell (262). The process follows five successive steps (263):
(1) Induction step: the applied field increases the trans-membrane potential, giving local defects. When a threshold of 200 mV is reached, a mechanical stress is generated.
(2) Expansion step: defects are maintained as long as the field persists at an overcritical value.
(3) Stabilization step: when the field intensity decreases below the threshold value, the membrane organization recovers to a state that remains permeable to small molecules.
(4) Resealing step: membrane impermeability is re-established; this is a first-order process on a scale of seconds to minutes.
(5) Memory effect: modifications of the membrane properties persist on a longer time scale exceeding hours, but normal cell behavior is finally recovered.
Electroporation transfects DNA fragments with high efficiency in a wide range of cell lines. In contrast to lipofection, it does not require reagents that are cytotoxic to the cells. The cell mortality induced by the high voltage is the only criterion that has to be optimized for electroporation applications. Together, these considerations make electroporation a good method for in-cell NMR sample preparation. Several devices and optimized protocols for a wide range of cells have been developed in laboratories worldwide, and industrial companies continue their efforts to improve the method.

3) Lipofection

Lipofection delivers DNA into cells using liposomes prepared from appropriate lipids. Cationic lipids are mixed with neutral lipids to form unilamellar liposome vesicles bearing a net positive charge. Negatively charged DNA spontaneously forms a complex with these vesicles, which are taken up by endocytosis (264). Lipofection can transfect a wide range of cell types (mainly adherent cells) and permits successful delivery of DNA of all sizes. Several commercial kits are available for transfecting DNA by lipofection.

Despite these advantages, this technique has several drawbacks, namely its low efficiency in most primary cells and suspension cell lines, linked to its dependence on endocytotic activity, its cytotoxicity and its dependence on cell division. Different strategies have been developed to generate lipofection systems, such as the lipid-based supramolecular assemblies (LONs) developed in Bartelemy's lab in Bordeaux (265,266).

4) Biolistic method

This technique is based on coating DNA onto the surface of microparticles such as gold or tungsten. To transfer DNA into cells, the particles are accelerated to high velocity by a driving force such as a high-voltage discharge between two electrodes or gas pressure (267). The biolistic method is fast and simple and enables transfection of a wide range of cells. Like lipofection, it can deliver DNA of all sizes, but it does not depend on cell division. However, the mortality is very high, so large numbers of cells are needed; it is also expensive and requires special hardware.

C.1.2 In-cell NMR experiments

In-cell NMR spectroscopy is an emerging bioanalytical technique that can provide atomic-level information on the structure, folding and interactions of biomolecules inside cells. Two-dimensional correlation experiments involving NMR-sensitive nuclei such as 1H, 13C and 15N are robust methods widely used to characterize the structure, dynamics and interactions of biomolecules in vitro. The 1H-15N/1H-13C HSQC (Heteronuclear Single Quantum Coherence) is one of the most informative heteronuclear 2D NMR experiments and can be employed as a 2D correlation experiment for in-cell NMR spectroscopy (268,269). Nevertheless, the applicability of in-cell NMR depends on three parameters: 1) the delivery of a sufficient amount of molecules inside the cells; the delivered quantity has to be high enough for the NMR signal to be detected, usually in the range of 5-20 µM; 2) the conservation of cell viability during the NMR measurements; and 3) the absence of cellular damage, which is often revealed by leakage from the cellular interior into the buffer. To satisfy these conditions, an efficient delivery strategy that induces low cell mortality has to be applied. In addition, the cellular manipulations, especially after insertion of the macromolecule under study, must be as short as possible in order to avoid degradation inside the cells.

From the NMR point of view, one experiment has been used more than any other in the field of in-cell NMR. The band-selective optimized flip-angle short-transient HMQC (SOFAST-HMQC) experiment (270,271) has contributed greatly to the development of the field, mainly by reducing the NMR acquisition time to less than 100 ms, so that full spectra can be acquired within minutes to a few hours. This feature enables the observation of a few micromolar quantities inside living cells (233). The SOFAST-HMQC pulse sequence is optimized for very short interscan delays, which provide high sensitivity for heteronuclear H-X correlation experiments. The HMQC experiment also has the advantage of using fewer radio-frequency pulses, which reduces the signal loss due to B1-field inhomogeneities and pulse imperfections mainly observed in NMR cryoprobes. Furthermore, optimized flip angles in HMQC experiments allow the repetition rate of the experiment to be increased without losing much S/N per unit time (272). The use of band-selective 1H pulses reduces the effective longitudinal relaxation times (T1) of the observed 1H spins. Moreover, the use of only two band-selective 1H pulses in SOFAST-HMQC leads to minimal perturbation of the undetected proton spins and provides higher enhancement factors than observed with other longitudinal-relaxation-optimized pulse schemes (271).

C.2 Materials and Methods

C.2.1 Oligonucleotides and ligands preparation

The unlabeled and cyanine3-labeled TG4T oligonucleotides used in this work were purchased from Eurogentec (Belgium) and Integrated DNA Technologies (USA). They were synthesized on the 200 nmol and 1 µmol scales, then purified by reverse-phase HPLC. The sequences were supplied lyophilized. After solubilization in bidistilled water, both unlabeled and labeled oligonucleotides were desalted by gel filtration on Sephadex G-25 illustra NAP-5 columns purchased from GE Healthcare (United Kingdom), in the same way as for 22AG (see Material and Methods B.2). All the sequences were prepared in potassium NMR buffer (20 mM K2HPO4/KH2PO4, 70 mM KCl, pH 6.5, 10% D2O), then annealed several times by heating for 5 min at 95 °C and chilling on ice. After the last annealing cycle, they were stored for 24 h before use. Distamycin A was purchased from Chemper (Italy), the DMSB compound was purchased from MP Biomedicals (USA), and 360A was kindly provided by the group of Marie-Paule Teulade-Fichou at the Curie Institute in Paris. All the compounds were supplied as powders. Distamycin A was dissolved in water, whereas both 360A and DMSB were dissolved in deuterated D6-DMSO. All the stock solutions of the compounds were prepared at 30 mM.

C.2.2 15N, 13C uniformly labeled oligonucleotides preparation

We used two different strategies to synthesize oligonucleotides: the chemical and the enzymatic procedures.

C.2.2.1 Chemical synthesis

The chemical synthesis was performed by the group of Dr. Marc Escudier, Laboratoire de Synthèse et Physico-Chimie de Molécules d'Intérêt Biologique (SPCMIB), at the University of Toulouse. The automated chemical strategy is widely used. The synthesis is carried out on a solid support called controlled pore glass (CPG), on which oligonucleotides are elongated from the 3' to the 5' end. The nucleosides are introduced in the form of phosphoramidites (273) whose 5'-OH is protected by a DMT (dimethoxytrityl) group and whose phosphorus is in the form of an N,N-diisopropyl-cyanoethylphosphoramidite. The NH2 groups of the bases are protected by benzoyl groups for adenine and cytosine or isobutyryl for guanine, whereas thymine is not protected. The synthesis is performed by an automated synthesizer in five successive steps: detritylation, coupling, capping, oxidation and deprotection (Fig. 38).

A) Detritylation

The detritylation reaction takes place at the beginning of the first coupling cycle and, in each subsequent cycle, on the 5' end of the last nucleotide introduced. In the presence of 2 % dichloroacetic acid, the primary alcohol at the 5' end is deprotected, and the released dimethoxytritylium dichloroacetate is eliminated by an acetonitrile flush.

B) Coupling

Following detritylation, the deprotected 5'-hydroxyl group is ready to react with the next base, which is added in the form of a nucleoside phosphoramidite. The reaction is performed with a classical activator; tetrazole is often employed. The activator protonates the diisopropylamino group of the nucleoside phosphoramidite, which is thereby substituted by tetrazole. The resulting species rapidly reacts with the free 5′-hydroxyl group of the support-bound chain, forming a phosphite triester.


Figure 38: Chemical synthesis cycle for preparation of oligonucleotides by phosphoramidite method.

The synthesis consists of five successive steps: detritylation, coupling, capping, oxidation and finally a deprotection in which the linker that attaches the 3′-end of the oligonucleotide to the solid support is cleaved by treatment with concentrated ammonium hydroxide.

C) Capping

Depending on the yield of the coupling step, a fraction of the support-bound oligonucleotides retains a free 5'-OH. If these unreacted chains react in later cycles, they generate a complex mixture of truncated oligonucleotides that is difficult to separate from the desired sequence. To prevent this, a capping step blocks the unreacted 5′-hydroxyl groups: an acetic anhydride solution acetylates them, which renders them inert to subsequent reactions.


D) Oxidation

During this step, the unstable phosphite triester formed in the coupling step is converted to a stable phosphate triester by iodine oxidation in the presence of water and pyridine.

E) Deprotection

After elongation, the phosphate groups of the oligonucleotides are protected with a cyanoethyl group, whereas the nucleobases are protected with benzoyl or isobutyryl groups. At the end of the synthesis, the linker that attaches the 3′-end of the oligonucleotide to the solid support is cleaved by treatment with concentrated ammonium hydroxide.

C.2.2.2 Enzymatic synthesis of uniformly 15N, 13C labeled oligonucleotides

The strategy of the enzymatic synthesis is presented in Fig. 39. The DNA synthesis was performed from a DNA/RNA template-primer as previously reported (274,275). To synthesize the TG4T sequence, a 22-mer template-primer DNA sequence was designed such that the last 16 bases form a stem-loop, while the remaining part at the 5' end is complementary to the TG4T sequence; an RNA base was also included at the 3' end to allow the desired sequence to be separated from the template-primer. The sequence TG4T-primer 1 (ACCCCATGTAACGTACGTTACrA) was ordered from IDT (USA); it was supplied lyophilized and used as delivered. The uniformly 15N, 13C labeled dNTPs were purchased from Cambridge Isotope Laboratories (USA). We screened different conditions for the DNA polymerization and determined that the synthesis yields continue to improve until 18 h after the beginning of polymerization. An excess of enzyme above 4 u/µl during the polymerization does not necessarily give better yields. First, the template-primer sequence TG4T-primer 1 was solubilized in bidistilled water to a final concentration of 800 µM. Afterwards, several mixtures of 500 µl were prepared. The optimized conditions used for the DNA polymerization are described in Table 2 for a final volume of 500 µl. The enzymatic reaction was catalyzed for 18 h by MMLV reverse transcriptase ordered from Biolabs (USA) in the presence of the 1X buffer supplied with the enzyme. The resulting products were treated with 400 mM KOH for 5 h at 37 °C in order to hydrolyze the RNA base and cleave the template-primer from the TG4T sequence. Finally, the polymerization yields and the product quality were analyzed by electrophoresis on a 20 % denaturing polyacrylamide gel, followed by a single staining step using Stains-All (Sigma Aldrich, France).
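The design of the template-primer can be checked with a short script. The following is a minimal sketch, not part of the original protocol: the helper name `reverse_complement` is hypothetical, and the 3'-terminal ribonucleotide rA is written as a plain `A` for base-pairing purposes. It verifies that the 5' overhang left outside the 16-base stem-loop is the reverse complement of TG4T, so that extension of the 3' end along this overhang yields TGGGGT.

```python
# Illustrative check of the template-primer design (hypothetical helper names).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


TARGET = "TGGGGT"  # TG4T, the sequence to be synthesized
# TG4T-primer 1 as given in the text; the last residue is the ribonucleotide rA,
# treated here as a plain A (assumption made only for this pairing check).
TEMPLATE_PRIMER = "ACCCCATGTAACGTACGTTAC" + "A"
STEM_LOOP_LENGTH = 16  # "the last 16 bases form a stem-loop"

overhang = TEMPLATE_PRIMER[:-STEM_LOOP_LENGTH]   # single-stranded 5' template part
stem_loop = TEMPLATE_PRIMER[-STEM_LOOP_LENGTH:]  # self-priming hairpin

# The polymerase copies the overhang, so the extension product is its reverse complement.
extension_product = reverse_complement(overhang)

if __name__ == "__main__":
    print("template-primer length:", len(TEMPLATE_PRIMER))
    print("5' overhang:", overhang)
    print("stem-loop:", stem_loop)
    print("extension product:", extension_product)
```

In this sketch the extension product equals the target sequence, which is consistent with the design described in the text.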


Figure 39: Enzymatic synthesis strategy of 13C, 15N labeled TG4T.

The sequence presented in the left part of the figure corresponds to the template-primer, designed to form a hairpin ending with a ribonucleotide. After the polymerization reaction (18 h at 37 °C), which extends the sequence by adding T and G nucleotides, the product was cleaved under alkaline conditions by treatment with concentrated potassium hydroxide for 5 h at 37 °C.

C.2.2.3 Electrophoresis on denaturing polyacrylamide gel

The polymerization products were analyzed by electrophoresis on a 20 % polyacrylamide gel (acrylamide:bisacrylamide 19:1) under denaturing conditions (7 M urea, 1X TBE buffer) using 1X TBE as the migration buffer. The gel was polymerized by addition of 0.1% TEMED (N,N,N',N'-tetramethylethylenediamine) and 0.1% ammonium persulfate (APS). The gel was pre-run at 10 watts/cm2 before loading the samples in order to reach a temperature of 50-60 °C. Subsequently, the samples were diluted in denaturing loading buffer containing 95% deionized formamide. A migration dye containing the colored markers xylene cyanol and methylene blue was also prepared. The samples were then denatured for 3 min at 95 °C. A ladder containing a mix of TG4T and TG4T-primer1 (see table 1 for sequences) was used. After migration, the gel was stained with Stains-All for 20 min, then destained in distilled water for 24 h.

C.2.2.4 Purification of the 15N, 13C uniformly labeled TG4T

After the alkaline cleavage of the polymerization reaction, a mixture of different species (cleaved TG4T-primer1, synthesized TG4T, and remaining dNTPs and enzyme) was obtained. To purify TG4T from this mixture, we performed an HPLC purification, a liquid chromatography technique that separates the components of a mixture and allows each of them to be identified and quantified. We used an Ultimate 3000 system from Thermo Fisher Scientific equipped with a pump with a degassing system, an autosampler, a thermo-regulated column compartment, a UV detector and an automatic fraction collector. As normal practice, we filtered and degassed all the buffers and sample solutions. To purify TG4T and separate the components of the mixture, a semi-preparative anion-exchange DNAPac PA100 column (9 x 250 mm) was used. The column can separate oligonucleotides differing in length by a single base. In addition, the DNAPac PA100 column can be operated under denaturing conditions, including high temperature (up to 90 °C) and/or high-pH eluents (up to pH 12.5), which can be used to eliminate hydrogen bonding and thus resolve problematic sequences such as self-complementary sequences or poly-G stretches. The separation is based on the electric charge of the substrates: because DNA is negatively charged, the stationary phase of the DNAPac PA100, a microbead resin carrying cationic groups, retains the oligonucleotides according to their increasing charge.

First, different calibration and elution programs were created in the Chromeleon software that controls the HPLC system. The column was then cleaned with a high concentration of KCl at 1.5 ml/min, a flow rate that was kept constant in all the following steps; this cleaning removes the impurities remaining from the previous purification. A second cleaning step with pure water was also performed. A calibration step under the same conditions as the purification is necessary to establish the baseline and to prepare the column for the separation. Afterwards, 100 µl of the mixture was injected with the autosampler, and the elution gradient detailed in Table 3 was applied to separate the components. Detection was performed at 260 nm, and the samples were collected between 8 and 15 min of the elution. The fractions corresponding to each peak were pooled and the pH was adjusted to 7 with a 1 M KPi (K2HPO4/KH2PO4) buffer solution. The samples were then frozen in liquid nitrogen and lyophilized at -92 °C. The lyophilized products were desalted by gel filtration on Sephadex G-25 illustra NAP-25 columns, then the samples were evaporated in a speed-vac from Savant / Thermo Finnigan with a cryopumping system at high temperature (60-80 °C). The samples prepared in this way were solubilized in the NMR buffer containing 20 mM KPi and 70 mM KCl, then analyzed by 1D 1H NMR and analytical HPLC.


Table 2: Conditions of the polymerization reaction for the synthesis of uniformly 15N, 13C labeled TG4T, for a final volume of 500 µl.

| Solution | Volume | Final concentration |
|---|---|---|
| MMLV buffer 10X (500 mM Tris-HCl, 750 mM KCl, 30 mM MgCl2, 100 mM DTT, pH 8.3) | 50 µl | 1X (50 mM Tris-HCl, 75 mM KCl, 3 mM MgCl2, 10 mM DTT, pH 8.3) |
| Bidistilled H2O | 332 µl | - |
| TG4T template primer | 100 µl | 160 µM |
| 15N, 13C dGTP 100 mM | 4 µl | 800 µM |
| dTTP 100 mM | 4 µl | 800 µM |
| MMLV reverse transcriptase 200 u/µl | 10 µl | 4 u/µl |
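The final concentrations in Table 2 follow from the usual dilution relation C_final = C_stock x V_added / V_total. The short sketch below recomputes them as a consistency check; the function and variable names are illustrative only, and the template-primer stock concentration of 800 µM is the one stated in the text.

```python
# Consistency check of Table 2 using C_final = C_stock * V_added / V_total.
TOTAL_VOLUME_UL = 500.0


def final_concentration(stock: float, volume_ul: float,
                        total_ul: float = TOTAL_VOLUME_UL) -> float:
    """Final concentration, in the same unit as the stock concentration."""
    return stock * volume_ul / total_ul


# (name, stock concentration, unit, volume added in µl)
COMPONENTS = [
    ("MMLV buffer", 10.0, "X", 50.0),
    ("TG4T template primer", 800.0, "µM", 100.0),
    ("dGTP", 100.0, "mM", 4.0),
    ("dTTP", 100.0, "mM", 4.0),
    ("MMLV reverse transcriptase", 200.0, "u/µl", 10.0),
]
WATER_UL = 332.0

# All volumes, including water, should add up to the final reaction volume.
total_added_ul = WATER_UL + sum(volume for _, _, _, volume in COMPONENTS)

if __name__ == "__main__":
    for name, stock, unit, volume in COMPONENTS:
        print(f"{name}: {final_concentration(stock, volume):g} {unit}")
    print("total volume (µl):", total_added_ul)
```

Note that the dNTP results come out in mM (0.8 mM, that is 800 µM as listed in the table).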

Table 3: Multi-step gradient program used in the purification of the 15N, 13C uniformly labeled TG4T. The flow rate is constant: 1.5 ml/min at room temperature (25 °C). Curve: 5 corresponds to a linear curve, 4 corresponds to a slightly convex curve. Buffer A: 100 % bidistilled H2O, Buffer B: 0.2 M KOH, Buffer C: 2 M KCl.

| Elution time (min) | Buffer A % | Buffer B % | Buffer C % | Curve |
|---|---|---|---|---|
| 0 | 82.5 | 12.5 | 5 | 5 |
| 2 | 72.5 | 12.5 | 15 | 5 |
| 20 | 57.5 | 12.5 | 30 | 4 |
| 23 | 42.5 | 12.5 | 45 | 4 |
| 25 | 82.5 | 12.5 | 5 | 5 |
| 30 | 82.5 | 12.5 | 5 | 5 |
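To make the gradient easier to read, the buffer percentages of Table 3 can be converted into the KOH and KCl concentrations actually reaching the column at each programmed time point. This is a minimal sketch assuming ideal mixing of the three buffers; the names used are illustrative. Only the programmed points are computed, since the curve-4 segments are not linear and are therefore not interpolated here.

```python
# Eluent composition at each programmed point of the Table 3 gradient,
# assuming ideal mixing of Buffer A (water), B (0.2 M KOH) and C (2 M KCl).
KOH_STOCK_M = 0.2
KCL_STOCK_M = 2.0

# (time in min, %A, %B, %C)
GRADIENT = [
    (0, 82.5, 12.5, 5.0),
    (2, 72.5, 12.5, 15.0),
    (20, 57.5, 12.5, 30.0),
    (23, 42.5, 12.5, 45.0),
    (25, 82.5, 12.5, 5.0),
    (30, 82.5, 12.5, 5.0),
]


def eluent_concentrations(percent_b: float, percent_c: float) -> tuple[float, float]:
    """Return (KOH, KCl) molar concentrations for given buffer percentages."""
    return KOH_STOCK_M * percent_b / 100.0, KCL_STOCK_M * percent_c / 100.0


if __name__ == "__main__":
    for time_min, a, b, c in GRADIENT:
        koh, kcl = eluent_concentrations(b, c)
        print(f"t = {time_min:>2} min: KOH {koh * 1000:.0f} mM, KCl {kcl:.2f} M")
```

Under these assumptions the KOH concentration stays constant at 25 mM while KCl rises from 0.1 M to 0.9 M before the column is returned to the starting conditions.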

C.2.3 G4 micro-injection of Xenopus laevis oocytes

C.2.3.1 Xenopus laevis oocytes micro-injection

The stage VI Xenopus laevis oocytes were provided by Dr. Christian Casenave from the "Laboratoire de microbiologie fondamentale et pathogénicité UMR-CNRS 5234, Université de Bordeaux". First, the oocytes were sorted to discard those presenting defects, such as bumps, or surrounded by a thick layer of follicles, which may impair the micro-injection. The cells were kept in MBS buffer supplemented with 5 % ficoll. To perform the micro-injection, a 20 µl sample of 15N, 13C labeled TG4T at 4 mM was prepared in potassium NMR buffer. The sample was heated for 5 min at 95 °C, then cooled down to 4 °C, and the procedure was repeated three more times. Five to six glass capillary needles were pre-filled with mineral oil, ready to be mounted on the Nanoliter 2000 injector equipped with an oil-driven injection device. The sample was aspirated into the needle up to the maximum oil level, then each oocyte was immobilized, visually aligned with the tip of the needle and injected under a binocular microscope. As previously reported (235), the oocytes were pierced smoothly, with as little friction as possible, by the injection needle at the equatorial plane separating the animal and vegetal hemispheres. A volume of 50 nl of 15N, 13C labeled TG4T was delivered to each oocyte. After the sample was deposited, the needle was withdrawn carefully to avoid enlarging the incision. Subsequently, all the manipulated cells were inspected for the extent of the incision, and the damaged oocytes (usually ~10%) were discarded.
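The injection parameters give a rough idea of the amount of labeled DNA per oocyte. The sketch below is an order-of-magnitude estimate only, under stated assumptions: the intracellular volume is taken as 1 µl (the order of magnitude quoted in the introduction), the injected sample is assumed to mix uniformly with it, and no leakage or degradation is considered. All names are illustrative.

```python
# Order-of-magnitude estimate of the DNA delivered by micro-injection.
STOCK_MM = 4.0               # concentration of the injected TG4T sample (mM, strand)
INJECTED_NL = 50.0           # volume delivered to each oocyte (nl)
OOCYTE_VOLUME_NL = 1000.0    # assumed intracellular volume, ~1 µl
OOCYTES_PER_SAMPLE = 200     # approximate number of oocytes in the NMR tube

# mM x nl = pmol (1e-3 mol/l x 1e-9 l = 1e-12 mol)
amount_per_oocyte_pmol = STOCK_MM * INJECTED_NL

# Uniform dilution of the injected volume into the oocyte volume (assumption).
intracellular_mm = STOCK_MM * INJECTED_NL / (OOCYTE_VOLUME_NL + INJECTED_NL)

total_amount_nmol = amount_per_oocyte_pmol * OOCYTES_PER_SAMPLE / 1000.0

if __name__ == "__main__":
    print(f"amount per oocyte: {amount_per_oocyte_pmol:.0f} pmol")
    print(f"estimated intracellular concentration: {intracellular_mm * 1000:.0f} µM")
    print(f"total amount in the NMR sample: {total_amount_nmol:.0f} nmol")
```

These values refer to single strands; the concentration of tetra-molecular G4 would be four times lower if all strands were assembled.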

C.2.3.2 NMR sample preparation

The selected cells were allowed to recover for 1 hour at 18 °C and the unhealthy cells were removed. The medium was removed and replaced 3 times with MBS buffer containing 5 % ficoll to wash the oocytes and remove any leaked sample from the surrounding buffer. Approximately 200 oocytes were added individually to a 5 mm Shigemi tube filled with MBS buffer, 5 % ficoll and 10 % D2O. The cells were allowed to sediment by gravity, giving a volume of ~300 µl occupied by the oocytes (Fig. 40).

C.2.3.3 Cleared lysates of oocytes preparation

As previously reported (236), after each in-cell experiment the oocytes were transferred to a petri dish and washed abundantly with MBS buffer, then crushed mechanically using a pipette tip. The insoluble fraction was removed by centrifugation at 20000 g for 20 min. The supernatant was transferred into an Eppendorf tube and heated to 95 °C for 10 min. Precipitated proteins and cellular debris were removed by centrifugation at 20000 g for 10 min. The clear supernatant was used for NMR measurements.


Figure 40: The Xenopus laevis oocytes microinjection procedure and the in-cell NMR sample preparation.

C.2.4 G4 electroporation in human cells

C.2.4.1 Cell culture manipulation

Three human cell lines were used in this work: A2780 cells were purchased from Sigma Aldrich (Catalogue N° 93112519); HeLa cells were provided by Stéphanie Durrieu from Dr. Teichmann's group (ARNA laboratory, INSERM U869, Bordeaux); and HEK293 cells were provided by Dr. Marie-Line Andreola, from the "laboratoire de microbiologie fondamentale et pathogénicité UMR-CNRS 5234, Bordeaux university". After thawing at 37 °C in a water bath, the cells were transferred to their corresponding fresh complete media (Table 4) and kept at 37 °C under 5 % CO2 in the incubator. After several sub-cultures of each cell line, the cells were grown to 70-80% confluence under a 5 % CO2 atmosphere and harvested for a new passage by treatment with 0.05% Trypsin-EDTA for 10 min at 37 °C. Finally, the cell morphology was checked visually with an inverted phase contrast microscope after each sub-culturing step.

Table 4: The culture media composition for the cell lines used in this work

| Cell line | Cell type | Culture media |
|---|---|---|
| HeLa | Human cervical carcinoma | DMEM, high glucose, pyruvate, 10 % FBS, 1 % Penicillin/Streptomycin (10000 U/ml) |
| HEK293 | Human embryonic kidney | DMEM, high glucose, Glutamax, pyruvate, 10 % FBS, 1 % Penicillin/Streptomycin (10000 U/ml) |
| A2780 | Human ovarian carcinoma | RPMI 1640 Medium, 10 % FBS, 1 % Penicillin/Streptomycin (10000 U/ml) |


C.2.4.2 G4 transfection by electroporation (EP)

The day before the experiment, 1 to 4 Petri dishes (150 x 25 mm) were each seeded with 5 x 10^6 cells. The cells were treated at 70-80 % confluence, which corresponds to around 10 x 10^6 cultured cells per dish. For the small-scale experiments (EP optimization, confocal microscopy, flow cytometry, viability and recovery tests), 1 Petri dish of cultured cells was typically sufficient, whereas the in-cell NMR experiments required at least 4 Petri dishes. Depending on the experiment, different samples were prepared: a) a mix of 75 % unlabeled TG4T and 25 % fluorescent TG4T (Cy3-TG4T) was prepared at two different concentrations, 20 and 40 µM, for the flow cytometry and confocal microscopy experiments; b) several samples of 15N, 13C labeled TG4T quadruplex at 1 mM were prepared for the in-cell NMR experiments. The samples were prepared by heating for 5 min at 95 °C and cooling for 24 h at 4 °C. After each EP cycle, the cells were placed in pre-warmed complete media at 37 °C in the 5 % CO2 incubator: 1 ml of cell suspension was added to each well of a 6-well plate for cell recovery in the small-scale experiments, and 5 ml to each 15 cm Petri dish for the in-cell NMR experiments.

Before each experiment, the cells were washed twice with pre-warmed PBS, then harvested by treatment with 0.05 % Trypsin-EDTA for 10 minutes at 37 °C, using 2 ml of 0.05 % Trypsin-EDTA for each 15 cm Petri dish and 300 µl for each well of the 6-well plate. To recover the cells, 8 ml of complete media were added to each Petri dish, and 800 µl to each well of the 6-well plate, to neutralize the Trypsin-EDTA. The cells were then sedimented by centrifugation for 5 min at 300 g and washed again with pre-warmed PBS. After each preparation, 10 µl of the cell suspension was taken to count the cells and to perform the viability test by adding 10 µl of 0.4% Trypan blue. For each experiment, the cell density was adjusted to 2-3 x 10^6 cells/ml by adding the required volume of pre-warmed PBS, in order to reduce the mortality rate. The cells were aliquoted into Eppendorf tubes at 2-3 x 10^6 cells/ml, then sedimented by centrifugation for 5 min at 300 g, and the supernatant was discarded. The oligonucleotide samples were diluted in complete media to a concentration of 20 or 40 µM for the small-scale experiments and 1 mM for the in-cell NMR experiments. The diluted samples were added to each aliquot of cells, which were then electroporated one after another in Biorad EP cuvettes with a 2 mm gap, using 100 µl of sample per cuvette. As controls, and to check the EP pulses, cells were electroporated in complete media without the oligonucleotides. The Gene Pulser Xcell electroporation system was used to perform the EP. Different Biorad optimized pulse programs were used for each cell line (Table 5).

Table 5: Summary of the pulse programs used for EP and the cell densities for the three cell lines.

Cell line   EP protocol 1                             EP protocol 2                             Cell density
HeLa        Exponential decay: 160 V, 500 µF, ∞ Ω     Exponential decay: 200 V, 500 µF, ∞ Ω     2-3 million
A2780       Exponential decay: 160 V, 500 µF, ∞ Ω     Exponential decay: 200 V, 500 µF, ∞ Ω     2-3 million
HEK293T     Square wave: 25 ms, 120 V                 Square wave: 25 ms, 160 V                 2-3 million
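The cell counting and density adjustment that precede electroporation reduce to a few lines of arithmetic. The sketch below assumes a standard hemocytometer count (chamber factor of 10^4 per large square), which is not stated in the protocol; the 1:1 Trypan blue dilution (10 µl cells plus 10 µl dye) and the 2-3 x 10^6 cells/ml target come from the text. All function names are hypothetical.

```python
def viability_percent(live, dead):
    """Percentage of cells excluding Trypan blue among all counted cells."""
    total = live + dead
    if total == 0:
        raise ValueError("no cells counted")
    return 100.0 * live / total


def cells_per_ml(mean_live_per_square, dilution_factor=2.0, chamber_factor=1e4):
    """Cell density from a hemocytometer count (assumed counting device).

    dilution_factor=2 reflects the 1:1 mix of cell suspension and Trypan blue.
    """
    return mean_live_per_square * dilution_factor * chamber_factor


def pbs_to_add_ml(current_density, current_volume_ml, target_density=2.5e6):
    """Volume of PBS (ml) needed to dilute the suspension to the target density."""
    if current_density <= target_density:
        return 0.0  # already at or below the target, nothing to add
    final_volume = current_density * current_volume_ml / target_density
    return final_volume - current_volume_ml


density = cells_per_ml(250)          # example count, not a measured value
print(f"density: {density:.2e} cells/ml")
print(f"PBS to add to 1 ml: {pbs_to_add_ml(density, 1.0):.2f} ml")
```

The default target of 2.5 x 10^6 cells/ml is simply the midpoint of the range given in Table 5.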

Immediately after each EP, 1 ml of pre-warmed complete media was added to the EP cuvette and gently mixed. The content of each cuvette was then immediately transferred to the prepared Petri dish or into individual wells of the 6-well plate: 3 to 4 EP samples per dish for the in-cell NMR experiments, and only one sample per well for the small-scale experiments. Afterwards, the Petri dishes or plates were incubated for several hours at 37 °C in the CO2 incubator for cell recovery. Every 2 hours during the recovery phase, the cell morphology and the recovery efficiency were checked by taking pictures with a phase contrast microscope (Nikon TS100-F, Japan). After 8 hours of recovery, the culture media containing non-attached cells was removed. The adherent cells were washed three times with pre-warmed PBS, then treated with 0.05 % Trypsin-EDTA for 10 min at 37 °C in the CO2 incubator. Subsequently, the trypsinized cells were recovered in complete media to neutralize the Trypsin-EDTA, then sedimented by centrifugation for 5 minutes at 300 g. The cell pellets were washed twice with pre-warmed PBS and prepared for flow cytometry or in-cell NMR experiments. For confocal microscopy, the cells were cultured on cover-slips of 18 mm diameter pre-coated with poly-lysine.

C.2.4.3 Flow cytometry

Flow cytometry is a technique that simultaneously measures and then analyzes the physical characteristics of cells. When cells pass through the laser intercept, they scatter laser light. The scattered and fluorescent light is collected. The light scattering depends on the physical properties of the cells, namely size and relative granularity. Two different measurements are provided by flow cytometry: a) forward-scattered light (FSC), which is proportional to cell-surface area or size, and b) side-scattered light (SSC), which is proportional to cell granularity or internal complexity. The correlation between FSC and SSC allows cell types to be differentiated in a heterogeneous cell population (Fig. 41).

Figure 41: Cell sub-populations based on FSC versus SSC.

The flow cytometry experiments were done at the flow cytometry platform of the University of Bordeaux (CNRS-UMR 5164). The samples electroporated with d[(TG4T)]4 formed by unlabeled and Cy3-labeled TG4T at a ratio of 3 to 1 were cultured in 6-well plates for 7 hours. The cells were washed twice with pre-warmed PBS, then harvested with 0.05 % Trypsin-EDTA for 10 min at 37 °C in the CO2 incubator. The trypsinized cells were diluted into an excess of complete media, sedimented by centrifugation for 5 min at 300 g and then washed twice more with pre-warmed PBS. The final cell pellets were resuspended in 200 µl of PBS and kept at 4 °C. A control sample of cells without fluorescently labeled TG4T was prepared to determine the native autofluorescence level of the cells. Two different EP programs were applied to each cell line (Table 5), with two different concentrations, to assess simultaneously the EP efficiency of each EP pulse program, the effect of EP on the cell morphology and the effect of increasing the concentration. To assess cell viability after the EP procedure, we used 4',6-diamidino-2-phenylindole (DAPI), a fluorescent molecule that binds strongly to A-T rich regions in DNA. The test is simple and reports on the integrity of the cells through the uptake of DAPI into their nuclei. To this end, 5 µl of 1 mM DAPI was added to each EP sample. The flow cytometry experiments were done on a BD LSRFortessa (BD Biosciences, USA), using the phycoerythrin fluorescence channel for cyanine 3 detection (PE channel, 570 nm) and the DAPI channel for the viability tests (DAPI channel, 460 nm). Data acquisition and analysis were performed using the FACSDiva software (BD Biosciences, USA).
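The role of the autofluorescence control can be illustrated with a minimal gating sketch. It assumes, as one common convention and not necessarily the gating applied in FACSDiva for this work, that a cell is scored as Cy3-positive when its PE-channel intensity exceeds the 99th percentile of the unlabeled control. The data below are made-up numbers used only to exercise the functions.

```python
def percentile(values, q):
    """Percentile (q in 0..100) of a non-empty sequence, linear interpolation."""
    data = sorted(values)
    if not data:
        raise ValueError("empty sequence")
    pos = (len(data) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def positive_fraction(sample, control, q=99.0):
    """Fraction of sample events brighter than the control's q-th percentile."""
    if not sample:
        raise ValueError("empty sample")
    threshold = percentile(control, q)
    positives = sum(1 for intensity in sample if intensity > threshold)
    return positives / len(sample)


# Hypothetical intensities (arbitrary units), for illustration only.
control_events = list(range(1, 101))
sample_events = [50, 99, 150, 200]
print(f"Cy3-positive fraction: {positive_fraction(sample_events, control_events):.2f}")
```

The same function could be applied to the DAPI channel to estimate the fraction of DAPI-stained cells in the viability test.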

C.2.4.4 Confocal microscopy

The confocal imaging experiments were done in the Royou group at the Institut de biochimie et génétique cellulaires (IBGC), CNRS-UMR 5095. Before the cell culture for the confocal microscopy experiments, cover-slips of 18 mm diameter were coated with poly-D-Lysine. Each cover-slip was pre-incubated for 3 hours in 1 M HCl under agitation, then washed abundantly with sterile water. After removal of the excess water by aspiration, the cover-slips were incubated for 1 hour in a solution containing 0.2 mg/ml of poly-D-Lysine; the poly-D-Lysine was then removed by aspiration and the cover-slips were sterilized with ethanol. After electroporation, the cells were left to recover directly on the cover-slips for 7 hours. The culture media containing non-attached cells was removed, and the cells were washed three times with pre-warmed PBS. Afterwards, the cells were fixed with a 1 % para-formaldehyde solution for 20 min at 37 °C. The cover-slips were then washed 3 times with pre-warmed PBS. The images were taken using a Leica confocal microscope (Wetzlar, Germany), with the 63x objective and an argon laser operating at 520 nm for the detection of the cyanine fluorescence. The images were acquired with the LASAF software (Wetzlar, Germany) and processed using the Fiji-ImageJ software (276).

C.2.4.5 In-cell NMR sample preparation for human cells

Several samples of 4 mM 15N, 13C TG4T quadruplex were prepared in the standard potassium G4 folding buffer (20 mM K2HPO4/KH2PO4, 70 mM KCl, pH 6.5). After annealing (heating for 5 min at 95 °C, then cooling down to 4 °C overnight), the samples were diluted in complete media to a final G4 concentration of 1 mM. 4 Petri dishes at 80 % confluence were prepared (containing 40-50 x 10^6 cells); they were washed with pre-warmed PBS, then harvested with 0.05 % Trypsin-EDTA for 10 min at 37 °C in the CO2 incubator. After neutralization with complete media, the cells were aliquoted in Eppendorf tubes, then sedimented by centrifugation for 5 min at 300 g. Each aliquot was mixed with 100 µl of the G4 sample and finally electroporated. The EP samples were recovered in Petri dishes with pre-warmed complete media (4 samples per Petri dish) and incubated for 7 h at 37 °C in the CO2 incubator for cell recovery. Afterwards, the culture media was removed and the cells were washed twice with PBS, then trypsinized for 10 min at 37 °C in the CO2 incubator. The cells were recovered in an excess of complete media, then sedimented by centrifugation for 5 min at 300 g. The pellet was washed 3 times with PBS, then resuspended in 300 µl of complete media with 10 % D2O. The cells were gently transferred to a 5 mm Shigemi NMR tube and gently sedimented by spinning the NMR tube in a manual centrifuge.
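The dilution of the annealed 4 mM stock to the 1 mM working concentration follows from C1V1 = C2V2. The sketch below computes the volumes for one 100 µl electroporation sample; the function name is illustrative, and the calculation ignores any volume contributed by the cell pellet.

```python
def dilution_volumes(stock_mM, final_mM, final_volume_ul):
    """Return (stock volume, diluent volume) in µl for a simple dilution."""
    if stock_mM <= 0 or final_mM <= 0:
        raise ValueError("concentrations must be positive")
    if final_mM > stock_mM:
        raise ValueError("cannot dilute to a higher concentration")
    stock_ul = final_mM * final_volume_ul / stock_mM
    return stock_ul, final_volume_ul - stock_ul


stock_ul, media_ul = dilution_volumes(stock_mM=4.0, final_mM=1.0, final_volume_ul=100.0)
print(f"{stock_ul:g} ul of 4 mM G4 stock + {media_ul:g} ul of complete media")
```

For four cuvettes per Petri dish, the volumes scale linearly with the number of samples.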

C.2.4.6 In-cell NMR lysates preparation for human cells

After each in-cell NMR experiment, the cells were resuspended in the remaining supernatant, then gently transferred into a 1.5 ml Eppendorf tube. The cells were then sedimented by centrifugation for 5 min at 300 g. The supernatant was kept as a control to be analyzed by NMR, and 200 µl of complete media was added to the cells. The cells were resuspended in the new supernatant, then lysed mechanically by crushing with a pipette tip. The lysates were cleared by centrifugation at 20000 g for 30 min. The clear supernatant was used for NMR measurements.

C.2.4.7 In-cell NMR experiment

All in-cell NMR spectra were recorded on a Bruker Avance 800 MHz spectrometer with a cryogenically cooled probe. Unless otherwise stated, all experiments were performed at 16 °C. The pulse program used for each experiment was the 2D 1H, 15N SOFAST-HMQC (band-Selective Optimized Flip-Angle Short-Transient heteronuclear multiple quantum coherence) (271). Experiments were recorded with an interscan delay of 30 ms, and spectra were set with repetition rates of 200-240 ms. We used selective 1H pulses centered at 11.2 ppm: a polychromatic PC9-2.200 shaped pulse of 2400 µs for the variable flip angle 1H excitation, and a REBURP.1000 180° refocusing shaped pulse of 1600 µs. To decouple 15N nuclei during the acquisition, a WALTZ-16 sequence was applied. We usually acquired 1024×32 complex points in the F2 and F1 dimensions, respectively. The sweep width for the proton dimension was fixed to 6-8 ppm with the O1P value set at 11.2 ppm, whereas for the heteroatom (15N) dimension it was fixed to 4-20 ppm with the O2P value centered at 143 ppm. The SOFAST-HMQC spectra were processed using a cosine-squared apodization function in the time domain.
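The acquisition parameters above can be related to each other with a short calculation. The sketch below converts sweep widths from ppm to Hz and estimates the direct-dimension acquisition time as the number of complex points divided by the sweep width. It assumes a nominal 1H frequency of exactly 800 MHz and an approximate 15N/1H frequency ratio of 0.10137; both are stated assumptions, not values taken from the spectrometer.

```python
GAMMA_RATIO_15N_TO_1H = 0.10137  # approximate |gamma(15N) / gamma(1H)|


def larmor_mhz_15n(proton_mhz):
    """Approximate 15N Larmor frequency (MHz) for a given 1H frequency."""
    return proton_mhz * GAMMA_RATIO_15N_TO_1H


def ppm_to_hz(width_ppm, larmor_mhz):
    """Convert a spectral width in ppm to Hz (1 ppm = Larmor frequency in MHz)."""
    return width_ppm * larmor_mhz


def acquisition_time_s(complex_points, sweep_width_hz):
    """Acquisition time for simultaneous quadrature detection."""
    return complex_points / sweep_width_hz


for width_ppm in (6.0, 8.0):
    sw_hz = ppm_to_hz(width_ppm, 800.0)
    aq_ms = 1000.0 * acquisition_time_s(1024, sw_hz)
    print(f"1H sweep width {width_ppm:g} ppm = {sw_hz:.0f} Hz, acquisition time {aq_ms:.0f} ms")

print(f"15N sweep width of 20 ppm = {ppm_to_hz(20.0, larmor_mhz_15n(800.0)):.0f} Hz")
```

Adding the 30 ms interscan delay to these acquisition times gives values of the same order as the quoted repetition rates, although the exact timing also depends on pulse and gradient durations that are not modeled here.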

C.3 Results and discussions

C.3.1 Synthesis of 13C, 15N labeled TG4T

A large amount of uniformly 13C, 15N labeled oligonucleotides is required for in-cell NMR experiments. To provide such a quantity, an efficient synthesis method has to be employed. When comparing chemical and enzymatic synthesis methods, several parameters have to be considered:

• The chemical procedure requires phosphoramidites, which are relatively expensive compared to the dNTPs used in the enzymatic strategy.
• The enzymatic strategy has the advantage of being simple, requiring neither automated procedures nor expensive material.
• The synthesis yield is generally higher with the enzymatic procedure (60-70 %) than with solid-phase synthesis (40-50 %).
• The purification is easier for oligonucleotides synthesized using the chemical strategy than for those obtained from enzymatic synthesis (the presence of the DMT group increases the hydrophobicity of the oligonucleotides and facilitates their purification by reverse-phase HPLC).

C.3.1.1 Yield of TG4T synthesis

Compared to the chemical protocol, which requires 13C, 15N labeled phosphoramidites, the enzymatic strategy requires 13C, 15N labeled dNTPs (deoxynucleoside triphosphates), which degrade faster and need additional precautions. 13C, 15N labeled dNTPs were added to a 22-mer DNA sequence that serves as a primer-template, in a molar ratio of 5:1. After 18 h of a polymerization reaction catalyzed by the DNA/RNA-dependent reverse transcriptase from the Moloney murine leukemia virus (MMLV), 10 µl of the reaction mixture was kept to evaluate the synthesis yield. The rest of the mixture was treated with a potassium hydroxide solution for 5 h to cleave the target TG4T sequence from the template-primer DNA. The synthesis and cleavage products were analyzed by electrophoresis on a 20 % denaturing polyacrylamide gel (Fig. 42). After the polymerization reaction, a band of 28 bases appeared just above the band of 22 bases (wells 4 and 5, Fig. 42); the intensity of this new band is comparable to that of the starting template-primer sequence (wells 2 and 3, Fig. 42). The remaining band of 22 bases is very weak compared to its starting intensity. The polymerization yield, ~90 %, was higher than expected. After cleavage, two bands are observed (wells 6 and 7, Fig. 42). The cleavage yield seems rather high, close to 100 %: only the bands corresponding to the template-primer and TG4T are detected, at 22 and 6 bases respectively. According to this analysis, the enzymatic synthesis should provide a quantity of 13C, 15N labeled TG4T that represents more than 90 % of the starting template-primer material.
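The yield estimate from the gel can be expressed as the product of the individual step yields. The following sketch uses the approximate values read from the gel (~90 % polymerization, ~100 % cleavage) and also checks the expected length of the extended band; the function names are illustrative only.

```python
def overall_yield(*step_yields):
    """Multiply fractional step yields (each between 0 and 1)."""
    result = 1.0
    for step in step_yields:
        if not 0.0 <= step <= 1.0:
            raise ValueError("each yield must be a fraction between 0 and 1")
        result *= step
    return result


def extended_length(template_primer_nt, product_nt):
    """Length of the template-primer after extension by the product sequence."""
    return template_primer_nt + product_nt


print(f"extended band: {extended_length(22, 6)} bases")
print(f"synthesis yield before purification: {100 * overall_yield(0.90, 1.00):.0f} %")
```

A purification yield could be appended as a third factor in the same way, which makes clear why a low-yield purification step dominates the final amount of product.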

Figure 42: Analysis of enzymatic synthesis and alkaline cleavage products by electrophoresis on a 20 % denaturing polyacrylamide gel (cf. C.2.2.3).

Columns: 1: DNA ladder with two bands, the first composed of 22 bases, corresponding to the primer-template used in the synthesis, and the second composed of 6 bases, corresponding to unlabeled TG4T ordered from Eurogentec and used as a reference. 2, 3: Template-primer sequence. 4, 5: Synthesis products obtained after the polymerization reaction catalyzed by the MMLV enzyme; a band of 28 bases corresponding to the extended template-primer sequence appeared. 6, 7: Cleavage products after treatment with concentrated potassium hydroxide solution, where a band of 6 bases corresponding to labeled TG4T appeared.

C.3.1.2 Yield of TG4T purification

This step represents the bottleneck of the enzymatic synthesis protocol. The most difficult step in TG4T production is to separate the 13C, 15N labeled TG4T product from the synthesis reagents and the primer-template sequence. Different techniques can be used to purify oligonucleotides, mainly electrophoresis on denaturing polyacrylamide gel followed by electroelution and alcohol precipitation. Several tests were done using this strategy, but the purification yields were very low (< 30 %). The limiting step in this strategy is the alcohol precipitation, which is not appropriate for small DNA fragments (< 10 bases). Using this strategy, the quantity of 13C, 15N labeled TG4T obtained would not be enough to perform in-cell NMR experiments. Moreover, this technique is time-consuming and requires running several gels to obtain one NMR sample. For these reasons, we decided to use anion exchange HPLC purification to separate 13C, 15N labeled TG4T from the rest of the synthesis products. This technique is also frequently used to purify DNA oligonucleotides. The process separates analytes according to their charge, using an anion exchange resin coated with positively charged groups. The DNA binds to the resin and displaces the counter-ion according to the strength of its negative charge. The purification was done using a semi-preparative DNA-Pac PA100 anion exchange column (9 x 250 mm). A salt gradient from 100 to 600 mM KCl was applied to separate the components of the mixture obtained from the synthesis after alkaline cleavage. The purification was performed in alkaline conditions (pH 11.5) to prevent hydrogen bonding. The HPLC profile obtained from the purification shows that the reaction mixture contains several components. Four species are well separated. The peaks around 5 min after injection correspond to the excess of dGTP/dTTP. The peak around 14 min represents unsuccessful products of synthesis or cleavage. The two main species are represented by the peak around 12 min, which corresponds to the template-primer sequence after cleavage, and the peak around 10 min, which represents 13C, 15N labeled TG4T (Fig. 43). The HPLC profile confirms what was observed in the PAGE profile: the yield of synthesis is higher than 90 %.

Figure 43: HPLC profile of the reaction mixture obtained from enzymatic synthesis of 13C, 15N labeled TG4T after alkaline cleavage (cf. C.2.2.4).

Four major peaks are observed. 1: around 5 min, corresponding to excess of dNTP used for the synthesis. 2: around 10 min, corresponding to TG4T product. 3: around 12 min, corresponding to template-primer used for the synthesis. 4: around 14 min, corresponding to unsuccessful products of synthesis or cleavage.

C.3.1.3 Quality of the produced 13C, 15N labeled TG4T

After purification of a large quantity of synthesis mixtures, the TG4T fractions were collected and pooled into a single sample. The pH of the TG4T sample was adjusted to around 7.5 using a concentrated potassium phosphate buffer (1 ml at 1 M). The sample was subsequently lyophilized, then desalted using a G25 gel filtration column (NAP-5 from GE Healthcare). The concentration of the sample was checked by UV absorbance. 15 mg of TG4T were synthesized using this strategy, a quantity large enough to perform several NMR experiments both inside Xenopus oocytes and in human cells. To recapitulate, the enzymatic strategy allows us to obtain a final synthesis yield above 80 % at a moderate cost. Using the anion exchange HPLC process, we managed to improve the purification yield and greatly increase the quantity of 13C, 15N labeled TG4T produced. The relative yields and costs of the enzymatic and chemical synthesis strategies are summarized in Table 6. After desalting, the final product was lyophilized and dissolved in 500 µl of NMR buffer (20 mM K2HPO4/KH2PO4, 70 mM KCl, pH 6.5, 10 % D2O). Afterwards, the sample was annealed by heating for 5 min at 95 °C, then cooled down for 24 h at 4 °C. The purity of the 13C, 15N labeled TG4T was assessed by 1D 1H NMR (Fig. 44A). The final product has a high purity level, and the spectrum shows four well resolved peaks characteristic of the imino protons of four distinct guanines, confirming that four tetrads are formed within the tetramolecular G4 d[(TG4T)]4. Several 15N, 1H SOFAST-HMQC spectra were also recorded in vitro as a control for the in-cell NMR experiments. Four well resolved and separated peaks are also observed in the SOFAST-HMQC spectrum (Fig. 44B). Now that the labeled DNA has been successfully prepared, an efficient delivery method needs to be optimized to introduce a maximum number of molecules inside cells. I would like to emphasize that part of the 13C, 15N labeled TG4T was purchased from Silantes (Mering, Germany), which is unfortunately supplied with residual Na+ ions. This allows us to compare the price of the three different methods for obtaining isotopically labeled d(TG4T).

Table 6: Relative yields and costs of different strategies used for the synthesis of 13C, 15N labeled oligonucleotides.

Synthesis type        Synthesis yield   Cost for 1 mg (Euros)   Synthesis time
Chemical synthesis    30-40 %           3k                      1 week
Silantes purchase     variable          2.5k                    Several weeks
Enzymatic synthesis   60 %              0.3k                    1 week
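The cost figures of Table 6 can be turned into a quick comparison for a given amount of material. The sketch below uses only the per-milligram costs listed in the table and, as an example, the 15 mg produced in this work; the dictionary keys and function names are illustrative.

```python
# Cost per mg in euros, taken from Table 6.
COST_PER_MG_EUR = {
    "chemical synthesis": 3000.0,
    "Silantes purchase": 2500.0,
    "enzymatic synthesis": 300.0,
}


def total_cost(method, mass_mg):
    """Cost in euros of obtaining mass_mg of labeled oligonucleotide."""
    return COST_PER_MG_EUR[method] * mass_mg


def cost_ratio(method_a, method_b):
    """How many times more expensive method_a is than method_b, per mg."""
    return COST_PER_MG_EUR[method_a] / COST_PER_MG_EUR[method_b]


for method in COST_PER_MG_EUR:
    print(f"{method}: {total_cost(method, 15.0):,.0f} euros for 15 mg")
print(f"chemical / enzymatic cost ratio: {cost_ratio('chemical synthesis', 'enzymatic synthesis'):g}")
```

This comparison covers reagent cost only and does not account for the different purification yields or synthesis times of the three routes.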

Comparing the different methods that allow us to obtain 15N, 13C labeled TG4T: a sample of 20 µl at 3-4 mM strand concentration is required for the micro-injection of 200 Xenopus oocytes, whereas the amount needed for one sample of human cells is five times higher. Taking all these parameters into account, 10 mg of phosphoramidites is enough to prepare one NMR sample, while 10 mg of dNTPs is enough to prepare more than 5 NMR samples. Moreover, phosphoramidites are five times more expensive than dNTPs. Therefore, the enzymatic strategy provides a synthesis yield ten times higher than that of chemical synthesis. Another critical parameter is the purification yield and quality. The chemical procedure has the advantage in purification, which is more robust and simpler, mainly because the presence of DMT groups favors attachment of the oligonucleotides to the reverse-phase bead surface. This feature significantly improves the separation of the desired oligonucleotides from the synthesis reactants and by-products. Unfortunately, the purification of enzymatic products cannot benefit from it. The classical purification method is electrophoresis on denaturing polyacrylamide gel (PAGE); however, this technique only gives yields lower than 30 %. To increase the purification yield, other methods have to be optimized, such as anion exchange HPLC purification. To recapitulate, the enzymatic strategy provides a synthesis yield several times better than that obtained with the chemical strategy, at a much lower cost. The enzymatic synthesis is therefore the preferred method for the synthesis of labeled oligonucleotides, even if its purification yields are not comparable to those obtained in chemical synthesis.

C.3.2 G4 DNA and ligand interaction study by in-cell NMR

As previously reported, different techniques are used to deliver DNA inside living cells. In this work, two delivery techniques were employed to introduce DNA inside cells, according to the cell type: micro-injection was used to introduce d[(TG4T)]4 inside Xenopus oocytes, and electroporation was used for human cells. Two main sections of this manuscript are dedicated to the studies of the d[(TG4T)]4 G4 carried out inside Xenopus oocytes and in human cells, respectively. In the first section, we discuss the characterization of the d[(TG4T)]4 G4 micro-injected inside Xenopus oocytes, as well as its interaction with ligands. In the second section, the delivery of the d[(TG4T)]4 G4 by electroporation into human cells is evaluated and an approach to optimizing the in-cell NMR conditions is analyzed.

Figure 44: NMR spectra of 13C, 15N labeled TG4T obtained from enzymatic synthesis after alkaline cleavage, anion exchange HPLC purification and desalting.

A) 1D 1H NMR spectrum of a 4 mM sample of the final product of the enzymatic synthesis of 13C, 15N labeled TG4T in K+ solution. Four well resolved imino peaks are visible between 11 and 12 ppm after 16 acquired scans, confirming that the d[(TG4T)]4 G4 containing four stacked tetrads is formed. The final product shows a high level of purity. B) 2D 1H, 15N SOFAST-HMQC NMR spectrum of the same sample as in A). Four well resolved peaks correlating the imino protons with their corresponding nitrogens are visible, confirming that the d[(TG4T)]4 G4 containing four stacked tetrads is formed. Spectra were recorded at 20 °C on the 800 MHz spectrometer.

C.3.2.1 Characterization of d[(TG4T)]4 inside Xenopus oocytes by NMR spectroscopy

13C, 15N isotopically labeled d[(TG4T)]4 obtained from enzymatic synthesis was dissolved in 20 mM potassium phosphate buffer, pH ~6.5, and prepared for micro-injection at 4 mM strand concentration. An additional sample of 13C, 15N labeled d[(TG4T)]4 was prepared at the same concentration in pure water. Using the stock solution prepared in potassium phosphate buffer, four sets of G4 samples were folded by the common annealing process of heating for 5 min at 95 °C and cooling down at 4 °C. The cycle was repeated three times. Fresh Xenopus oocytes prepared according to published protocols were kindly provided by Dr. Eric Boué-Grabot from the Institute of Neurodegenerative Diseases, Bordeaux (235,277). 50 nl of a 1 mM stock solution of 13C, 15N labeled d[(TG4T)]4 was manually micro-injected into each of the 220 oocytes (further details are available in the material and methods chapter) at approximately 22 °C. Damaged cells were identified under the dissecting microscope, and those showing even minor imperfections such as cytoplasmic expansions were discarded. Four micro-injections were performed for different experiments with increasing NMR acquisition times of 8, 16, 26 and 32 hours. The micro-injected oocytes were carefully placed in a 5 mm Shigemi NMR tube prefilled with MBS supplemented with 20 % Ficoll and 5 % D2O. The higher density of Ficoll is important to prevent the oocytes at the bottom from being crushed over time. The NMR experiments were performed at 16 °C. As described before, micro-injection of isotopically enriched proteins and nucleic acids has been successfully employed in previous in-cell studies (235,236,277-280) with excellent results. To identify the G4 formation of d[(TG4T)]4 in vitro and inside oocytes, we recorded two-dimensional 1H, 15N SOFAST-HMQC spectra correlating the imino protons of the guanines in each tetrad of the d[(TG4T)]4 G4 with their corresponding nitrogens. After each micro-injection, the oocytes were left to recover for 1 hour. A first NMR experiment of 4 h was recorded to observe the imino folding pattern of the d[(TG4T)]4 G4 inside Xenopus oocytes. In the first spectrum, the folding pattern derived from the in-cell experiment (Fig. 45A) shows four unique peaks and resembles a unique fold with four tetrads in a parallel arrangement, similar to what is commonly found under KCl conditions in vitro. The four peaks observed were previously assigned and are well characterized elsewhere (153,248,281). Further experiments were subsequently recorded using other fresh samples, which allowed us to follow the modifications of the G4 inside the oocytes over periods of 16, 26 and 32 hours (Fig. 45 B, C and D). The procedure allows us to test for any possible leakage from the oocytes and to analyze the sample evolution and G4 stability over time inside the oocytes, in order to find the optimal conditions. No significant evolution of the line broadening was observed during the first ~8 h, although the line widths remain larger than those observed in vitro, 22-40 Hz versus 16-18 Hz (ω1- ω2) respectively. There are no other visible peaks in the NMR spectra that could indicate different folded species, but as time progresses (Fig. 45 B, C and D) new peaks become visible, despite the resemblance of the overall resonance pattern. The new peaks are initially more pronounced for the G2 and G5 tetrads, and they may represent a new conformer with both ends in conformational exchange, that is, a new species in equilibrium with the original one, which appears more stable. They may also represent conformers with specific interactions with cellular macromolecules, or derive from a slow-exchange equilibrium between a mix of G4 species loaded with K+ and another ion such as Na+. Towards the end of the experiment, in Fig. 45 C, the newly formed peaks are more prominent but still less intense than the four major peaks initially observed. Moreover, G2 seems to correspond to three individual peaks instead of a single one. In the bottom panel (Fig. 45 D), the spectrum shows that G2 and G3 have shifted by around 0.1 and 0.05 ppm respectively, and additional peaks appear near G4 and G5.
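The line widths quoted above can be related to an apparent transverse relaxation time. The sketch below assumes Lorentzian lines, for which the full width at half height Δν and the apparent relaxation time are linked by T2* = 1/(πΔν); it is only an order-of-magnitude illustration, since the in-cell line widths also contain inhomogeneous contributions.

```python
import math


def t2_star_ms(linewidth_hz):
    """Apparent T2* (ms) from a Lorentzian full width at half height (Hz)."""
    if linewidth_hz <= 0:
        raise ValueError("line width must be positive")
    return 1000.0 / (math.pi * linewidth_hz)


# Line width ranges reported in the text (Hz).
for label, widths in (("in vitro", (16.0, 18.0)), ("in oocytes", (22.0, 40.0))):
    longest, shortest = t2_star_ms(widths[0]), t2_star_ms(widths[1])
    print(f"{label}: T2* between {shortest:.1f} and {longest:.1f} ms")
```

Under this assumption, the broader in-cell lines correspond to apparent relaxation times roughly half as long as those measured in vitro.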

C.3.2.2 Assessment of d[(TG4T)]4 leakage from Xenopus oocytes

The buffer surrounding the oocytes inside the NMR tube was recovered after each in-cell experiment. Spectra were then recorded, under the same conditions, for each buffer in which the oocytes had been resuspended. This control allows us to check that the recorded signal really stems from the in-cell observation and is not due to leakage of d[(TG4T)]4 out of the oocytes (Fig. 46 A, B and C). All controls revealed no apparent leakage for periods up to 24 h: no NMR signal was observed, and all the registered peaks correspond to noise (Fig. 46 A, B and C). Between 24 and 36 h some leakage was visible (Fig. 46 C, see red arrow), and after 36 h there were several damaged cells and the "leakage" into the surrounding buffer was substantial.

C.3.2.3 Stability and behavior of d[(TG4T)]4 inside Xenopus oocytes

According to the leakage assessment, we established a ~24 h period as a safe window for spectral acquisition, within which we could operate without influence from cellular leakage. Previous NMR (197,282,283) and crystallographic (284) experiments have shown that oligonucleotides able to form G4 can adopt a variety of conformers, depending on parameters such as the sequence and the ionic species stabilizing the tetrads. The typical physiological concentration of monovalent ions, with optimal affinity for G4, inside X. laevis oocytes is in the order of 21 mM and 90 mM for Na+ and K+ to 400 g/L (285). With that in mind, one could expect a more complex spectral pattern for the imino region, reflecting the existence of multiple hybrid conformations in a rather complex cell interior "buffer". The 2D spectra that we present in this study are well resolved and all four tetrads of d[(TG4T)]4 can be individually distinguished. Additionally, no multiple structural conformers are apparent in the spectra acquired in the first 16 h of acquisition (Fig. 45 B).

The high spectral resolution and the lack of extensive line broadening indicate the formation of fast tumbling species on the NMR time scale (μs to ms) (286). Towards the end of the "safe" spectral acquisition window, which we set to approximately 24 h, other conformers start to appear that were observed neither under in vitro conditions with K+ or Na+, nor in mixtures of both ions (Fig. 47). Under in vitro conditions, once the tetrads are formed they remain stable and unchanged (as observed by NMR) for several weeks. This leads us to think that only after a period of ~24 h do we start to observe substantial binding or degradation of the d[(TG4T)]4 exogenously introduced into the cells. After that period, the modifications of d[(TG4T)]4 observed inside cells in the absence of ligand seem to originate from other macromolecules, such as enzymes, and co-solutes naturally occurring inside oocytes, which with time bind to the G4 and change the chemical shift environment felt by the guanines, with additional peak broadening. Nevertheless, those changes are not severe enough to perturb the general fold of d[(TG4T)]4, which further attests to the robustness of this structure for tests inside living cells. Further studies may be necessary to probe whether the inherent properties of the crowded environment are more important than the specificity of the constituents that make the cell interior so unique, and whether they induce the small folding modification observed after the 16-20 h period spent inside Xenopus laevis oocytes.

C.3.2.4 Favored forms of d[(TG4T)]4 inside Xenopus oocytes

To characterize the favored form of d[(TG4T)]4 inside cells, a 1 mM d[(TG4T)]4 sample containing residual Na+ from the chemical synthesis was prepared in pure water. At a low concentration of Na+ under in vitro conditions, the spectra indicate the formation of a mixture of conformers (Fig. 47A). When the same sample is micro-injected inside the cells, it shows a unique spectral pattern characteristic of a single species, revealing that the oligonucleotide exchanged the Na+ ion and refolded inside the cells into a structure identical to the one found both in vitro and in cells after pre-incubation under K+ conditions (Fig. 47C, in red). This information is relevant for the study of other species that are more stable in vitro in the presence of Na+ than in the presence of K+, such as the human telomeric repeat (22AG). These observations lead us to wonder whether all in vitro studies should preferably be performed under K+ conditions for other G4 as well. Other in-cell studies using different G4 species, such as c-myc and 22AG, will be necessary to confirm the predominance of K+ over Na+ as the major stabilizing ion within the tetrads. Further studies are also necessary to probe the importance of the environment in rational drug design.

Figure 45: Time evolution of the 1H, 15N SOFAST-HMQC spectrum of d[(TG4T)]4 inside Xenopus oocytes.

Spectra obtained after a period of 8 h A), 16 h B), 26 h C) and 32 h D) from the assembly of 220 oocytes in the NMR tube.

Figure 46: Time evolution of the 1H, 15N SOFAST-HMQC spectrum of the buffer surrounding Xenopus oocytes micro-injected with d[(TG4T)]4.

Control allowing to detect possible leakage during in-cell NMR acquisition : A) After 8 h, B) After 16 h and C) After 24 h.

Figure 47: Series of 1H, 15N SOFAST-HMQC spectra of d[(TG4T)]4 in different buffer conditions.

A) The oligonucleotide was resuspended in water as received from the supplier. B) We included 20 mM of KCl.

C) d[(TG4T)]4 inside oocytes after micro-injection of both preparations into two different experimental sets of oocytes; in green: sample B), in red: sample A). The presence of residual Na+ from the solid-phase synthesis is apparent in the spectrum in A), which shows multiple imino peaks originating from folded tetrads in the same conformation but in the presence of different cations. In part B) we can observe the spectrum of the sample reconstituted in buffer A (containing KCl) and treated with two cycles of warming (95 °C) and cooling (4 °C) in order to facilitate G4 formation. We can now observe four major resonances with similar intensities, each arising from a tetrad group of four magnetically equivalent imino protons (G2-G5), and nearby low-intensity ones from a second conformer. C) Despite the different conformational states of d[(TG4T)]4 before the micro-injections, only the KCl species seems visible inside Xenopus oocytes. Approximately 1.5-1.7 x 10^-8 moles of oligonucleotide were used in each experiment.

As described in previous chapters, G4 have been studied by the international scientific community because of their enormous potential as therapeutic anticancer targets, largely due to their localization in key regions of the genome such as the ends of telomeres or gene promoter regions. Recently, numerous small molecules have been designed with the aim of targeting these structures (see G4 ligands, part A.4). Most of these ligands were developed to target the telomeric G4 and to prevent the elongation of telomeric DNA by telomerase. Different techniques have been used to study the binding of G4 ligands; NMR, as a powerful tool, has revealed interesting structural and dynamic features of such binding. However, all NMR investigations were done in vitro. We therefore decided to probe the effects of ligand binding on d[(TG4T)]4, a simple G4 model that was characterized inside Xenopus oocytes as reported in part C.3.2.1 of this manuscript. Three ligands were selected to analyze the principal interaction features: 1) The ligand 2,6-N,N'-methyl-quinolinio-3-yl)-pyridine dicarboxamide-methyl, commonly called 360A, which is known to be a strong G4 ligand, binding with high affinity to G4 with little or negligible binding to other DNA conformations (141,260,287,288). 2) The ligand Distamycin A, which was shown to be a groove binder of the d[(TG4T)]4 G4 with a stoichiometry of 4 : 1; NMR and ITC were used to characterize in detail the structural features and the thermodynamics of the binding, respectively (151,152). 3) The ligand 2,2'-diethyl-9-methylselenacarbocyanine bromide, called DMSB, a cyanine dye that shows good selectivity for G4. It has been demonstrated that DMSB can simultaneously occupy both the 5'-end external G-tetrad and the corresponding groove of the d[(TG4T)]4 G4 (153). 1D or 2D NMR titrations of d[(TG4T)]4 with these three ligands were performed in vitro to observe their effects on the imino pattern of the d[(TG4T)]4 G4. The overall objective was to find a ligand that shifted the imino peaks of the G4 without severely broadening the spectra.

C.3.2.5 Titration of d[(TG4T)]4 with ligands in vitro

Based on previously reported binding studies of the three ligands, different ligand : DNA ratios were selected for the in vitro titrations of d[(TG4T)]4.

A) Titration with Distamycin A

The addition of Distamycin A to the quadruplex d[(TG4T)]4 induced gradual changes in chemical shift and a broadening of the DNA imino proton resonances in the 1H NMR spectra; perturbations of the 15N chemical shifts were also observed (Fig. 48A). The titration was completed at a ligand : DNA ratio of 4 : 1. The four strands remained magnetically equivalent throughout the titration, since no splitting of resonances was observed at any stage and a single set of imino proton signals was present throughout. The chemical shift changes were nevertheless larger for the 5' and 3' end tetrads (G2, G5). The binding of Distamycin A to d[(TG4T)]4 does not affect the folding pattern of the G4: the stabilized structure appears unchanged compared with the free G4.

B) Titration with DMSB

Compared with Distamycin A, the gradual chemical shift changes caused by the addition of DMSB to d[(TG4T)]4 are smaller, whereas the broadening is more pronounced (Fig. 48B). Perturbations are observed up to a ligand : DNA ratio of 5 : 1. The disappearance of the G5 imino peak suggests that binding may involve the 3' end, through π-stacking of the ligand on the 3'-end G-tetrad. As with Distamycin A, the folding pattern of the G4 is conserved in the presence of DMSB.

C) Titration with 360A

With the third ligand, 360A, we observed a pronounced effect on the imino pattern of d[(TG4T)]4, in particular a reorganization of the G2 and G3 tetrads, whose relative peak intensities decrease and whose resonances shift upfield. We also observe the appearance of a pattern of amino proton signals at ~10.2 ppm, indicating that the ligand stabilizes amino groups that are normally exposed and in fast exchange with bulk water. Under in vitro conditions all four tetrads appear to be preserved, although the ligand induces substantial chemical shift differences (Fig. 48C).
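As an illustration of how the chemical shift mapping used in these titrations can be quantified, the following minimal Python sketch combines the 1H and 15N shift changes of each imino signal into a single perturbation value. The 15N scaling factor (0.15), the function name and all shift values are hypothetical placeholders chosen for illustration; they are not data or parameters from this work.

```python
import math


def combined_csp(free, bound, n_scale=0.15):
    """Combined 1H/15N chemical shift perturbation (ppm) for one signal.

    free and bound are (1H ppm, 15N ppm) tuples. n_scale down-weights the
    15N dimension; 0.15 is an assumed, illustrative value.
    """
    delta_h = bound[0] - free[0]
    delta_n = bound[1] - free[1]
    return math.sqrt(delta_h ** 2 + (n_scale * delta_n) ** 2)


# Hypothetical imino shifts (1H ppm, 15N ppm); NOT measured values.
free_shifts = {"G2": (11.60, 145.0), "G3": (11.20, 144.2),
               "G4": (11.00, 143.8), "G5": (11.30, 144.5)}
bound_shifts = {"G2": (11.45, 144.0), "G3": (11.18, 144.1),
                "G4": (10.98, 143.7), "G5": (11.15, 143.9)}

# One perturbation value per guanine, then rank from most to least perturbed.
csp = {g: combined_csp(free_shifts[g], bound_shifts[g]) for g in free_shifts}
ranked = sorted(csp, key=csp.get, reverse=True)

for guanine in ranked:
    print(f"{guanine}: {csp[guanine]:.3f} ppm")
```

With these placeholder values the two terminal tetrads rank first, which is the kind of readout that allows tetrads to be compared at each titration point.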

Figure 48: Probing the interaction of different ligands with d[(TG4T)]4 G4 by 1D 1H and 1H, 15N SOFAST-HMQC in vitro.

A) Chemical shift mapping of d[(TG4T)]4 imino signals by 1H, 15N SOFAST-HMQC for G4 alone (red) and after incubation with 1 (green) and 4 (magenta) molar equivalents of Distamycin A. B) 1H 1D titration of d[(TG4T)]4 with 1, 2, 3 and 5 molar equivalents of DMSB. C) Chemical shift mapping of d[(TG4T)]4 imino signals by 1H, 15N SOFAST-HMQC for G4 alone (blue) and after incubation with 1 (red) and 2.5 (black) molar equivalents of 360A.

C.3.2.6 Ligand binding to d[(TG4T)]4 inside Xenopus oocytes

To maximize the chances of visualizing the effect of ligands inside living cells, specific selection criteria must be applied. Good solubility and low toxicity are essential. The ligands should also have high selectivity and affinity for DNA G4, because inside cells substantial competition with other binding partners has to be considered. The line broadening due to slower tumbling and the loss of signal intensity observed in vitro can be more severe inside cells and can prevent signals from being detected. The selected ligands must therefore preserve signal intensities well in vitro; this does not guarantee that intensities are preserved in vivo, since crowded conditions usually cause a marked decrease of the signals observed in 2D spectra. To be selected for in-cell NMR experiments, a ligand must also induce perceptible changes in vitro. The effects observed in vitro for the three selected ligands make them suitable for studying interactions with d[(TG4T)]4 inside living cells. Such studies may open new directions for understanding ligand-G4 binding inside cells and for improving the affinity and specificity of ligands in drug design. For each of the three ligands, two experiments were performed on two separate batches of about 200 oocytes each. In the first experiment, the G4 was micro-injected into ~200 oocytes, after which the ligand was added to the NMR tube and left to diffuse freely during an equilibration period of 1 h. In the second experiment, the ligand and the G4 were incubated in vitro before the mixture was micro-injected into the oocytes. After each in-cell experiment, a control spectrum of the surrounding buffer was recorded to identify any leakage from the oocytes.

A) Binding with Distamycin A

After incubation of cells micro-injected with 1H, 15N d[(TG4T)]4 G4 in the presence of 100 µM distamycin A, the imino peaks showed chemical shift changes (Fig. 49A) comparable to those observed in vitro with 2 molar equivalents of distamycin A (Fig. 48A). However, at this concentration of distamycin A, substantial leakage from the oocytes was observed a few hours after the start of the experiment (Fig. 49C). This leakage can be explained by damage to the oocytes after addition of the ligand: distamycin A appears to be toxic to cells, and to oocytes in particular. Nevertheless, the chemical shift changes observed in vivo should correspond to the binding of distamycin A inside the cell, because the surrounding buffer, measured on its own, shows larger chemical shift changes. Admittedly, a component arising from the leaked (in vitro) material contributes to the overall signal, but it is minor compared with the signal attributed to the in-cell interaction. In contrast, when the Distamycin A-d[(TG4T)]4 complex was pre-formed at a ratio of 2 : 1 before micro-injection into the oocytes, an unusual pattern was observed inside the cells. It consists of 12 peaks, which probably represent different conformations of the G4. The presence of the ligand clearly induced the formation of a new imino pattern, which we are not able to characterize at present. One hypothesis is that interactions with other cellular partners capable of binding the G4, such as cytoplasmic proteins, are favored or enhanced by the forms stabilized through interaction with distamycin A. The leakage into the buffer in this experiment was much smaller than in the first case. The folding pattern of the G4 observed in the leaked buffer resembles the one detected in vitro with 2 molar equivalents of distamycin A. The unusual pattern of d[(TG4T)]4 in the presence of Distamycin A could also be attributed to an organized form of the d[(TG4T)]4-distamycin A complex in the buffer used for the in-cell experiments (MBS + 20% Ficoll), which creates a viscous environment; a crowded medium can induce conformational changes in the presence of ligands. However, a control performed at a distamycin A : d[(TG4T)]4 ratio of 2 : 1 in the in-cell buffer showed a signature similar to that observed in vitro with 2 molar equivalents of Distamycin A without Ficoll. The pattern observed in vivo is therefore independent of the crowding buffer. G4 oligomerization, although less probable, could also be responsible for the appearance of the new pattern. Together, these observations suggest that distamycin A stabilizes d[(TG4T)]4 G4 in a way that favors its interaction with cellular partners. Despite its good solubility in water, the toxicity of distamycin A towards oocytes considerably limits its use as a candidate for in-cell NMR experiments.

B) Binding with 360A

The second candidate was 360A. Unlike distamycin A, 360A is poorly soluble in water at high concentration (above 1 mM). For the first experiment, the ligand was therefore transferred from a DMSO stock solution and dissolved in buffer, and cells freshly micro-injected with d[(TG4T)]4 were incubated with a final concentration of 100 µM 360A. For the second experiment, the sample was prepared from a 50 mM stock solution of the ligand in deuterated dimethyl sulfoxide (DMSO) and pre-incubated with the G4 before micro-injection. Important differences can be seen between the in vitro and in-cell experiments for this ligand. After incubation of the cells with the ligand (Fig. 50A), or with the alternative co-injection method, the spectra show a dramatic change after approximately seven hours of spectral acquisition, demonstrating that an interaction occurs between the G4 and the ligand. We believe that the spectra show the result of a direct interaction between a G4-specific ligand and its target within a complex intracellular environment, and we were able to visualize its effect on the imino pattern. Although the peak intensities are severely diminished and a correct interpretation is rather difficult, the pattern of correlation peaks differs markedly from that observed in vivo without ligand and from that observed in vitro with the ligand. It is nevertheless encouraging to observe such dramatic changes in living cells. As reported in part C.3.2.5.1, the ligand induces in vitro a remarkable perturbation of the imino pattern by reorganizing the G2 and G3 tetrads and stabilizing amino protons, which are generally in fast exchange with bulk water. In contrast, the in-cell results indicate two distinct behaviors. In the first experiment, the spectra lack any tetrad signature (Fig. 50A). Surprisingly, the only visible peaks, around 10.2 ppm, correspond to the characteristic amino signals often seen in association with tetrads. When an excess of ligand is pre-incubated with d[(TG4T)]4 before micro-injection into the oocytes, the modifications of the G4 structure occur in vitro, and the peak pattern observed for the complex once inside the cells (Fig. 50B) is to some extent similar to that observed in vitro (Fig. 48C). There are small differences that are difficult to evaluate: the peaks are broader, and we see newly formed peaks that were observed neither in the experiments without ligand inside the oocytes nor in vitro with equivalent amounts of ligand (Fig. 50). Unfortunately, longer acquisition periods cannot be used to improve the signal, because they lead to sample degradation under in vivo conditions (Fig. 45).

A possible explanation is that the ligand stabilizes the complex, so that other biomolecules such as enzymes interact better with it. We cannot exclude d[(TG4T)]4 dimerization or intracellular precipitation as the origin of the new peaks or of the peak broadening in the complex formed with 360A. The test for leakage was again performed, and the results were negative, as seen previously. After a period of ~20 h, the signals from both experiments vanish completely. We decided to probe whether the oligonucleotide-ligand complex had been degraded or whether it was no longer visible because of its interaction with large molecules. To this end, we homogenized the oocytes, separated the pellet from the cytoplasm of both samples by centrifugation, and measured the spectra under identical conditions. Under these conditions, we did not observe any peaks for either sample preparation with ligand. However, when the samples are heated at 80 °C to denature proteins that may be bound to the d[(TG4T)]4-360A complex, relatively strong peaks reappear in the spectra (Fig. 51), indicating that at least part of the complex was neither digested by cellular enzymes while the oocytes were intact nor destroyed by heating at 80 °C. The peak pattern is somewhat different from that found in vitro, but this time we can clearly see what look like two distinct tetrads and an additional broad peak around 10.34 ppm that may also arise from a tetrad.

C) Binding with DMSB

In addition to distamycin A and 360A, a G4-specific ligand from the cyanine family was probed with d[(TG4T)]4 inside living oocytes. DMSB showed an interesting effect on the imino pattern of d[(TG4T)]4 G4 in vitro (Fig. 48B). The DMSB stock solution was prepared in DMSO at a concentration of 80 mM. After incubation of cells micro-injected with d[(TG4T)]4 in the presence of 1 mM DMSB, a dramatic change in the imino profile was observed. Two broad peaks are detected at high field, suggesting some stabilization of d[(TG4T)]4 G4, in line with what was observed in vitro. The disappearance of the two peaks corresponding to G2 and G5 from the overall signature can be explained by binding of DMSB, probably through end-stacking on the two terminal tetrads at the 5' and 3' ends, respectively. This binding could also reorganize the central tetrads (G3 and G4), which would account for the broadening of the two remaining peaks and their relatively large chemical shift changes (more than 0.3 ppm) (Fig. 52A).
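The ligand additions described here follow the usual dilution relation C1·V1 = C2·V2. The minimal sketch below estimates the volume of the 80 mM DMSB stock needed to reach 1 mM, and the resulting DMSO content. The final sample volume of 500 µL is a hypothetical value used only for illustration (the oocyte sample volume is not stated in this section), and the function names are assumptions.

```python
def stock_volume_ul(stock_mM, final_mM, final_volume_ul):
    """Volume of stock (µL) giving final_mM in final_volume_ul (C1*V1 = C2*V2)."""
    if stock_mM <= 0 or final_mM > stock_mM:
        raise ValueError("stock must be positive and more concentrated than the target")
    return final_mM * final_volume_ul / stock_mM


def solvent_percent(stock_ul, final_volume_ul):
    """Percentage (v/v) of stock solvent, here DMSO, in the final sample."""
    return 100.0 * stock_ul / final_volume_ul


SAMPLE_VOLUME_UL = 500.0  # hypothetical sample volume, not taken from the text

# Stock and final concentrations of DMSB as stated in the text (80 mM, 1 mM).
dmsb_ul = stock_volume_ul(stock_mM=80.0, final_mM=1.0, final_volume_ul=SAMPLE_VOLUME_UL)
dmso_pct = solvent_percent(dmsb_ul, SAMPLE_VOLUME_UL)

print(f"DMSB stock to add: {dmsb_ul:.2f} µL ({dmso_pct:.2f} % DMSO v/v)")
```

Tracking the DMSO fraction alongside the ligand concentration is useful because the organic solvent itself may affect oocyte integrity.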

In contrast, when DMSB was co-micro-injected with d[(TG4T)]4 at a ratio of 2 : 1, the imino pattern was conserved, although with substantial chemical shift changes (Fig. 52B) comparable to those observed in vitro (Fig. 48B). After eight hours of acquisition, substantial leakage of G4 into the surrounding buffer was observed, together with apparent damage to the oocytes. The buffer measurement shows four weak peaks with dramatic shifts, completely different from those observed in vitro (Fig. 52C). This pattern may be related to changes occurring in the in-cell buffer. DMSB therefore appears to be toxic to oocytes.

Figure 49: Probing the interaction of d[(TG4T)]4 G4 with distamycin A by 1H, 15N SOFAST-HMQC inside Xenopus oocytes.

A) Ligand incubated and freely diffused from the NMR tube buffer into the interior of oocytes previously micro-injected with d[(TG4T)]4. B) In-cell spectrum obtained after incubation of 2 molar equivalents of distamycin A with d[(TG4T)]4 for 4 h (room temperature) prior to co-micro-injection into 220 Xenopus oocytes. C) Spectrum of the buffer surrounding the oocytes, showing substantial leakage.

Figure 50: Probing the interaction between d[(TG4T)]4 and 360A by 1H, 15N SOFAST-HMQC in vitro and inside Xenopus oocytes.

A) Ligand incubated and freely diffused from the NMR tube buffer into the interior of oocytes previously micro-injected with d[(TG4T)]4. B) In-cell spectrum obtained after incubation of 2.5 molar equivalents of 360A with d[(TG4T)]4 for 4 h (room temperature) prior to co-micro-injection into 220 Xenopus oocytes. C) Chemical shift mapping of d[(TG4T)]4 imino signals by 1H, 15N SOFAST-HMQC for G4 alone (blue) and after incubation with 1 (red) and 2.5 (black) molar equivalents of ligand relative to d[(TG4T)]4. D) Spectra A) and B) overlaid with the black spectrum from C). The spectra clearly show important differences between in vitro and in-cell conditions.

Figure 51: 1H, 15N SOFAST-HMQC spectrum of d[(TG4T)]4 with 360A from the cytoplasm of ~200 oocytes after heating at 80 °C (light blue), overlaid on spectra of the in vitro titration of d[(TG4T)]4 with 360A: without ligand (blue), with 1 (red) and 2.5 (black) molar equivalents of 360A.

Figure 52: 1H, 15N SOFAST-HMQC spectrum of d[(TG4T)]4 with DMSB from the cytoplasm of ~200 oocytes after heating at 80 °C (light blue), overlaid on spectra of in vitro titration of d[(TG4T)]4.

A) Ligand incubated and freely diffused from the NMR tube buffer into the interior of oocytes previously micro-injected with d[(TG4T)]4. B) In-cell spectrum obtained after incubation of 2 molar equivalents of DMSB with d[(TG4T)]4 for 4 h (room temperature) prior to co-micro-injection into 220 Xenopus oocytes. C) Spectrum of the buffer surrounding the oocytes, showing substantial leakage.

C.3.3 G4 DNA in human cells

In previous sections, we discussed how d[(TG4T)]4 behaves inside Xenopus laevis oocytes and the effects of its interaction with different G4-specific ligands. In this section, we describe an approach for applying in-cell NMR to the study of G4 in human cells. Recent studies have shown that proteins can be investigated in human cells (233,243-245). Adapting such studies to G4, however, requires the optimization of several parameters: 1) the delivery method must introduce a large amount of molecules into the cells; 2) the cells must remain viable during the course of the in-cell NMR measurements; and 3) no leakage of G4 out of the cells should occur. Different delivery methods can be used to transfect DNA into cells (see C.1.1), each with its own advantages and drawbacks. Among them, electroporation appears to be the most advantageous because of its ease of use. Moreover, it does not require any chemical modification of the G4, and it works without treating the cells with potentially harmful toxins that generally lower cell viability. By comparison, the kits used for other methods (lipofection, biolistic delivery, etc.) are relatively expensive, which limits their suitability for in-cell NMR applications in mammalian cells. The main drawback of electroporation is the cell mortality it induces, which must be minimized while maintaining high transfection yields.

C.3.3.1 Optimization of electroporation conditions

Studies of in-cell NMR in mammalian cells have described different transduction procedures for delivering proteins into mammalian cells (243-245), and electroporation is a suitable method for this purpose. Here, we describe an optimization approach to evaluate the delivery of G4 into mammalian cells. To establish efficient electroporation (EP) conditions, several parameters must be optimized, namely the EP pulse program, the number of electroporated cells and the recovery conditions. Three human cell lines were used for the optimization: human epithelioid cervix carcinoma cells (HeLa), human ovarian carcinoma cells (A2780) and human embryonic kidney cells (HEK293T). To evaluate the DNA transduction efficiency as well as the cell mortality induced by different electroporation pulse programs, two optimized protocols from the electroporator provider (Bio-Rad) were tested for each cell line. The pulse programs used for electroporation are summarized in Table 5 (see C.2.4.2). The cell density was fixed in the range of 2-3 million cells per reaction, as recommended by the provider. For each cell line, two samples of unlabeled d[(TG4T)]4 were prepared in complete culture medium at 1 mM quadruplex concentration, and two controls were prepared without DNA. After centrifugation, 2-3 million cells were resuspended in each sample and electroporated using the corresponding protocol of Table 5 (see C.2.4.2). After each round of EP, the treated cells were transferred into different recovery media and counted after 30 min of recovery. Cell loss was monitored by counting viable and dead cells with a Trypan blue staining protocol using an automatic cell counter. Cell loss increases with voltage: from 10 to 40 % for HeLa cells and from 20 to 45 % for A2780 cells when the voltage is raised from 160 to 200 V, whereas no change in cell loss was observed for HEK293T cells when the voltage was raised from 120 to 160 V. Cell mortality is, however, similar in the presence and in the absence of DNA (Fig. 53). Cell loss is therefore attributed to cell lysis during the EP procedure, independently of DNA penetration into the cells. Overall, low cell mortality was recorded during cell recovery. The effect of an increased number of pulses was also assessed: a higher number of electroporation pulses significantly increases the percentage of dead cells in the EP mixtures. It is therefore necessary to evaluate whether the increased cell death is accompanied by enhanced G4 delivery.
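The cell loss values discussed above are simple ratios of Trypan blue counts. A minimal sketch of the calculation, using invented cell counts purely for illustration (they are not the counts behind Fig. 53, and the function names are assumptions), could look as follows:

```python
def percent_dead(viable, dead):
    """Percentage of dead cells from viable and dead (Trypan blue positive) counts."""
    total = viable + dead
    if total <= 0:
        raise ValueError("at least one cell must be counted")
    return 100.0 * dead / total


def mortality_difference(with_dna, without_dna):
    """Difference in percent dead cells between a DNA sample and its control.

    Each argument is a (viable, dead) tuple of counts.
    """
    return percent_dead(*with_dna) - percent_dead(*without_dna)


# Invented counts (viable, dead) after 30 min of recovery; illustration only.
sample_with_dna = (1_800_000, 200_000)
control_without_dna = (1_790_000, 210_000)

print(f"with DNA:    {percent_dead(*sample_with_dna):.1f} % dead")
print(f"without DNA: {percent_dead(*control_without_dna):.1f} % dead")
print(f"difference:  {mortality_difference(sample_with_dna, control_without_dna):+.1f} points")
```

A difference close to zero between a sample and its DNA-free control corresponds to the situation described in the text, where mortality is attributed to the pulse itself rather than to DNA uptake.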

C.3.3.2 G4 uptake efficiencies

To evaluate the uptake of G4 into cells, two different concentrations (50 and 100 µM) of d[(TG4T)]4 labeled with the cyanine 3 fluorophore (Cy3) were delivered by electroporation using the two optimized pulse programs of Table 5 (see C.2.4.2). Samples were prepared by mixing unlabeled d(TG4T) with d(TG4T)-Cy3 at a ratio of 3:1, to remain as close as possible to the conditions used in NMR experiments. Uptake was measured by flow cytometry (see C.2.4.3). Cy3 was detected on the PE channel at 570 nm to measure uptake efficiency, and DAPI on the DAPI channel at 460 nm to measure cell death. For each electroporation pulse program, a negative control electroporated with unlabeled TG4T was prepared as the mock for the flow cytometry measurements. G4 uptake and cell death were evaluated by comparison with these negative controls. The fluorescence level of the mock, which corresponds to the autofluorescence background, is taken as the zero reference. Very promising delivery efficiencies were observed for all the cell lines tested (Fig. 54). Increasing the DNA concentration used for electroporation markedly enhances delivery efficiency, whereas increasing the voltage has no effect on the G4 uptake level. The cell loss rates recorded by flow cytometry using DAPI agree with those obtained with the Trypan blue protocol. There is therefore no need to increase the voltage to improve delivery by electroporation, and the pulse program with the lower voltage was selected for each cell line.
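Using the mock sample as the zero reference amounts to setting a fluorescence gate from the autofluorescence distribution and counting the cells above it. The sketch below illustrates one possible way to do this; the 99th-percentile gate, the function names and the intensity values are assumptions made for illustration and do not describe the gating actually applied to the data of Fig. 54.

```python
def gate_threshold(mock_intensities, percentile=99):
    """Fluorescence threshold below which `percentile` % of mock cells fall.

    percentile is an integer; 99 is an assumed, commonly used choice.
    """
    ordered = sorted(mock_intensities)
    index = min(len(ordered) - 1, (percentile * len(ordered)) // 100)
    return ordered[index]


def percent_positive(sample_intensities, threshold):
    """Percentage of cells whose Cy3 intensity exceeds the mock-derived gate."""
    positives = sum(1 for value in sample_intensities if value > threshold)
    return 100.0 * positives / len(sample_intensities)


# Invented per-cell intensities (arbitrary units); illustration only.
mock = list(range(100))             # autofluorescence of unlabeled control
sample = [50, 99, 150, 200, 400]    # cells electroporated with Cy3-labeled G4

threshold = gate_threshold(mock)
print(f"gate: {threshold}, Cy3-positive cells: {percent_positive(sample, threshold):.0f} %")
```

In practice the gate would be defined in the flow cytometry software on many thousands of events; the sketch only makes the comparison with the mock explicit.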

C.3.3.3 Cell recovery evaluation

To prepare high-quality in-cell NMR samples, viable cells that have received G4 must be separated from dead cells. After electroporation, which is carried out on adherent cells with 1 mM unlabeled DNA, an excess of culture medium is added to optimize cell recovery. Viable cells reattach to the surface, and this criterion can be exploited to select them and separate them from dead cells. During the recovery experiment, two populations are generated: 1) cells that did not attach and remained floating in the supernatant of the culture medium, and 2) cells that reattached and fully regained their individual morphology. The reattachment of electroporated HeLa cells was followed over 7 h of recovery, with cells observed under the microscope after 2, 5 and 7 h. The last observation was carried out after removing the culture medium and washing the cells three times with 1X PBS. A control of non-electroporated cells was also prepared to assess the morphology of attached cells; it too was observed after 7 h of recovery. After 7 h of recovery, HeLa cells had correctly recovered their initial morphology, and approximately 30 % of the starting cell population had been lost. A period of 7 h should therefore be allowed after electroporation of DNA to ensure correct recovery of viable cells. Dead cells can then be removed by aspiration and repeated washing. In addition to the morphological evaluation, flow cytometry was used to assess the viability of attached and non-attached cells, with DAPI used to reveal cell mortality. These measurements showed that attached cells had a low proportion of dead cells (< 5 %), whereas cells that did not attach during the recovery period had a much higher proportion (> 40 %) (Fig. 55).

C.3.3.4 G4 distribution inside human cells

In this part, we aimed to determine the localization of G4 after electroporation. Three samples of 50 µM preformed d[(TG4T)]4 were prepared with a 1 : 4 ratio of d(TG4T)-Cy3 to unlabeled d(TG4T). The samples were electroporated into each cell line (HeLa, HEK293T and A2780), and the cells were then allowed to attach for 7 h on microscopy coverslips of 18 mm diameter. The coverslips were subsequently washed abundantly with 1X PBS to remove all dead cells and fluorescent label from the medium. The cells were finally fixed with 1 % paraformaldehyde, and images were acquired by confocal microscopy. The distribution of G4 inside cells was predominantly cytoplasmic in all cell lines. The highest intracellular G4 levels were detected in HeLa and HEK293T cells, whereas a very low level was detected in A2780 cells (Fig. 56). Among the three cell lines, HeLa cells showed the lowest mortality due to electroporation and the highest level of G4 delivered into the cells. They also fully regained their original morphology after electroporation. Taken together, these observations indicate that HeLa cells have the best profile for in-cell NMR experiments.
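Because d[(TG4T)]4 is assembled from four strands, mixing labeled and unlabeled strands yields quadruplexes carrying between zero and four Cy3 labels. The following minimal sketch gives the expected binomial distribution, assuming that strands associate randomly and that the Cy3 label does not bias assembly (neither assumption was tested here). The labeled fraction of 0.25 corresponds to a 1:3 labeled : unlabeled mixture and can be changed to match other mixing ratios; the function name is hypothetical.

```python
from math import comb


def labeled_strand_distribution(labeled_fraction, strands=4):
    """Probability of 0..strands labeled strands per quadruplex (binomial model).

    Assumes random, unbiased strand association; this is a modelling
    assumption, not a measured property of the Cy3-labeled strand.
    """
    p = labeled_fraction
    return [comb(strands, k) * p ** k * (1 - p) ** (strands - k)
            for k in range(strands + 1)]


# 1 labeled strand for every 3 unlabeled strands -> labeled fraction 0.25.
distribution = labeled_strand_distribution(0.25)
fraction_fluorescent = 1.0 - distribution[0]

for k, probability in enumerate(distribution):
    print(f"{k} Cy3 strand(s): {probability:.3f}")
print(f"quadruplexes carrying at least one Cy3: {fraction_fluorescent:.3f}")
```

Under this model a substantial share of the quadruplexes carries no fluorophore at all, which is worth keeping in mind when fluorescence intensities are compared with the total amount of delivered G4.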

C.3.3.5 In cell NMR experiments

So far, our optimization of the in-cell experiment has shown that electroporation is a good method for delivering large amounts of G4 into human cells, mainly HeLa and HEK293T. Using unlabeled and fluorescently labeled DNA, we first tested conditions for delivering the largest possible quantities of DNA into cells while taking the associated toxicity into account. However, in-cell NMR is a low-sensitivity method that requires very large amounts of delivered DNA for signals to be detected. Two questions therefore remain: does 13C, 15N-labeled DNA behave like unlabeled DNA with respect to delivery by electroporation, and are the delivered amounts sufficient to detect NMR signals inside cells?

For in-cell NMR measurements, 1 mM 13C, 15N d[(TG4T)]4 was delivered into HeLa cells by electroporation with an exponential decay pulse (160 V, 500 µF and ∞ Ω). Several electroporations of 2-3 million cells were performed to obtain the in-cell NMR sample. The cells were allowed to recover for 7 h in complete culture medium and then washed abundantly with 1X PBS to remove dead cells. After trypsinization for 10 min at 37 °C in a CO2 incubator, the cells were washed again with 1X PBS and resuspended in 500 µl of complete culture medium + 10 % D2O in a 5 mm Shigemi tube. Three long 1H, 15N SOFAST-HMQC experiments were recorded over more than 36 hours. Surprisingly, no NMR signal was observed. Several explanations can be envisaged. The most obvious is that the amount of d[(TG4T)]4 delivered into the cells is not sufficient to give NMR signals. Another possibility is that the DNA is degraded inside cells by intracellular enzymes such as nucleases. Finally, the slow tumbling induced by the interaction of G4 with cellular partners (large complexes such as helicases or intracellular chaperone proteins) may cause very fast transverse relaxation, resulting in the broadening and disappearance of signals. To investigate these hypotheses, the electroporated cells were crushed mechanically with a pipette tip. The lysates were cleared by centrifugation, and the clear supernatant was used for NMR measurements. After the supernatant had been heated at 80 °C (to denature the proteins) and cooled to room temperature, the NMR experiment was repeated. The signals reappeared after 4 hours of acquisition (Fig. 57). The reappearance of signals after heating indicates that slow tumbling is the most likely explanation for the disappearance of the signals during the in-cell measurements. An approximate determination of the d[(TG4T)]4 present in the lysates, by comparison with an in vitro acquisition performed under the same conditions, indicates that the concentration of delivered d[(TG4T)]4 is close to 7 µM. Taking the G4 concentration as 7 µM, the cell lysates were used to examine the interaction profile between d[(TG4T)]4 delivered into cells and the ligand 360A. The pattern observed is very interesting: large chemical shift changes were observed after addition of 2 equivalents of 360A, while the imino pattern remained conserved (Fig. 57), except for the peak corresponding to G2, which disappeared. The imino profile is very similar to that observed in lysates of Xenopus oocytes after heating at 80 °C (Fig. 51). However, the addition of more 360A to the lysates induces the disappearance of the signals.
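The concentration estimate by comparison with an in vitro reference can be sketched as a ratio of peak intensities normalized by the number of scans. The example below assumes that both spectra were recorded with identical acquisition and processing parameters, that scans are summed rather than averaged, and that intensity is proportional to concentration. All numerical values and the function name are invented for illustration; they are not the measurements behind the value reported in the text.

```python
def estimate_concentration_uM(ref_conc_uM, ref_intensity, ref_scans,
                              sample_intensity, sample_scans):
    """Estimate a sample concentration from a reference of known concentration.

    Intensities are divided by the number of scans (summed-scan assumption)
    and then compared, assuming intensity scales linearly with concentration.
    """
    if min(ref_intensity, ref_scans, sample_scans) <= 0:
        raise ValueError("reference intensity and scan counts must be positive")
    ref_per_scan = ref_intensity / ref_scans
    sample_per_scan = sample_intensity / sample_scans
    return ref_conc_uM * sample_per_scan / ref_per_scan


# Invented example: a 100 µM in vitro reference and a lysate spectrum
# recorded with eight times more scans.
lysate_uM = estimate_concentration_uM(ref_conc_uM=100.0,
                                      ref_intensity=1000.0, ref_scans=128,
                                      sample_intensity=560.0, sample_scans=1024)
print(f"estimated lysate concentration: {lysate_uM:.1f} µM")
```

Such an estimate is only approximate, since line broadening in lysates lowers peak heights; integrating peak volumes instead of reading heights would be a more robust choice.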

Figure 53: Cell loss after electroporation for three different cell lines HeLa, A2780 and HEK293T.

Control 1: cells electroporated in the absence of DNA using an exponential decay pulse program: 160 V, 500 µF, ∞ Ω (1). EP pulse 1: cells electroporated in the presence of 1 mM d[(TG4T)]4 G4 DNA using pulse (1). Control 2: cells electroporated in the absence of DNA using an exponential decay pulse program: 200 V, 500 µF, ∞ Ω (2). EP pulse 2: cells electroporated in the presence of 1 mM d[(TG4T)]4 G4 DNA using pulse (2). EP pulse 1(2): cells electroporated in the presence of 1 mM d[(TG4T)]4 G4 DNA using pulse (1) repeated twice. EP pulse 2(2): cells electroporated in the presence of 1 mM d[(TG4T)]4 G4 DNA using pulse (2) repeated twice.

Figure 54: Delivery of d[(TG4T)]4 formed by TG4T-cy3 and unlabeled TG4T at a ratio of 1:3 in 1) HeLa cells, 2) A2780 and 3) HEK293T by EP.

1) HeLa cells: 1A) Flow cytometry analysis of the mock sample using two different EP pulse programs. 1: Exponential decay pulse: 160 V, 500 µF, ∞ Ω. 2: Exponential decay pulse: 200 V, 500 µF, ∞ Ω. 1B) Cells electroporated with 50 µM of TG4T-cy3 using the two protocols. 1C) Cells electroporated with 100 µM of TG4T-cy3 using the two protocols.

2) A2780: 2A) Flow cytometry analysis of the mock sample using two different EP pulse programs. 1: Exponential decay pulse: 160 V, 500 µF, ∞ Ω. 2: Exponential decay pulse: 200 V, 500 µF, ∞ Ω. 2B) Cells electroporated with 50 µM of TG4T-cy3 using the two protocols. 2C) Cells electroporated with 100 µM of TG4T-cy3 using the two protocols.

3) HEK293T: 3A) Flow cytometry analysis of the mock sample using two different EP pulse programs. 1: Square wave pulse: 120 V, 25 ms. 2: Square wave pulse: 160 V, 25 ms. 3B) Cells electroporated with 20 µM of TG4T-cy3 using the two protocols. 3C) Cells electroporated with 40 µM of TG4T-cy3 using the two protocols.

Analysis was performed by detection of the cy3 label at 570 nm using the phycoerythrin fluorescence channel (PE channel: 570 nm). FSC: forward scatter, SSC: side scatter, CFI: Cy3 fluorescence intensity.

Figure 55: Viability test of attached and non attached cell populations 1.

A) Flow cytometry analysis of attached cell populations (top) and non-attached cell populations (bottom). A high proportion of dead cells was present among the non-attached cells (red population, bottom), and a very low proportion among the attached cells (top).

Figure 56: Intracellular distribution of d[(TG4T)]4 G4 formed by TG4T-cy3 and unlabeled TG4T at a ratio of 1:3, delivered by electroporation into A) HeLa cells, B) HEK293T and C) A2780.

Confocal imaging of intracellular TG4T-cy3 shows a cytoplasmic distribution of G4 inside cells. High fluorescence levels are found in both HeLa and HEK293T cells, in contrast to a very low level in A2780 cells.

Figure 57: 1H, 15N SOFAST-HMQC spectrum of d[(TG4T)]4 from the cytoplasm of HeLa cells after heating at 80 °C (light green), overlaid on spectra of in vitro d[(TG4T)]4 G4 (red) and with 2 molar equivalents of 360A (magenta).

Part 3 NMR study of KRAS G4

D. NMR study of KRAS G4

D.1 Introduction

As reported in part A.3.1, recent bioinformatic analyses have shown that putative G4-forming sequences occur with high frequency in genomic regions upstream of the transcription start site and within gene promoter regions. This observation suggests that G4 could be involved in the regulation of transcription (40,42,43,289). Among the G4s found in promoters, that of the c-myc proto-oncogene promoter is the most studied. Several G4 structures formed by different fragments of this promoter have been solved by NMR and crystallography (see part A.3.2.1). Other, less studied proto-oncogenes such as c-kit (55) and bcl2 (56,57) also contain G4 structures that were recently solved by NMR. Kras is another cancer-related proto-oncogene; it contains a G-rich motif called 32R, potentially able to form G4s (Table 7), located in the NHE (Nuclease Hypersensitive Element) (58,155,290,291). Previous studies showed that mutant alleles of Kras are involved in the pathogenesis of different cancers, but the structures adopted by this motif have never been determined by NMR or crystallography. Furthermore, recent studies identified a new motif of the Kras G-rich promoter, called 21R, that is able to form a parallel G4 (Table 7) (292).

Taking into account these results, obtained mainly by biophysical methods such as UV and CD spectroscopy, we set out to study the structural characteristics of these G4s in more detail using NMR spectroscopy. The aim of this study of the Kras G-quadruplex is to provide a preliminary understanding of the structural features of these G4s and to investigate their possible interaction with, and stabilization by, different G4 ligands. We tested several sequences from the NHE of Kras by 1D NMR to screen for candidates giving good-quality NMR spectra in terms of imino proton peak dispersion and the existence of a single conformer. Two of these sequences met the criteria, and their stability was investigated by CD melting. Different ligands were also screened against these sequences by CD titration, and their ability to stabilize the G4 was assessed by CD melting. Finally, a 1D NMR titration was performed with a ligand selected from this set.

D.2 Materials and methods

D.2.1 Oligonucleotides

The unlabeled DNA oligonucleotides used in this work were purchased from Eurogentec (Belgium) and Integrated DNA Technologies (USA). They were synthesized on the 200 nmol and 1 µmol scales, purified by reverse-phase HPLC and supplied lyophilized. The 5% 15N, 13C site-specifically labeled Kras-22RT oligonucleotides used in this study were synthesized by Brune Vialet (research assistant) in our laboratory (INSERM U869, Bordeaux) on an automated Expedite 8909 DNA synthesizer, following the phosphoramidite strategy at the 1 µmol scale on a 1000 Å primer support (Link Technologies SynBase CPG). All the standard phosphoramidites (dABz, dT, dGiBu, dCAc), reagents and solvents used during the synthesis were purchased from Glen Research. The dGiBu phosphoramidite (U-13C10, 98%; U-15N5, 98%; CP 95%) was purchased from Cambridge Isotope Laboratories. After synthesis, the oligonucleotides were cleaved from the support and the nucleobases were deprotected with ammonium hydroxide at 55 °C for 16 h. Finally, the oligonucleotides were lyophilized. For oligonucleotide desalting and concentration measurement, see B.2.1.1.

All the sequences used for NMR were prepared in potassium NMR buffer (20 mM K2HPO4/KH2PO4, 70 mM KCl, pH 6.5, 10% D2O), and those used for other methods in KPi buffer (20 mM K2HPO4/KH2PO4, 70 mM KCl, pH 6.5). After dissolution in buffer, the oligonucleotides were annealed several times by heating for 5 min at 95 °C and chilling on ice. After the last annealing cycle, they were stored in the refrigerator for 24 h before use. Table 7 summarizes the characteristics of the sequences used in this work.

D.2.2 Ligands preparation

360A, Braco19 and PhenDC3 were kindly provided by the group of Marie-Paule Teulade-Fichou at the Curie Institute in Paris. All the compounds were supplied as powders and dissolved in deuterated DMSO (DMSO-d6). The purity of the compounds was confirmed by 1D 1H NMR. Stock solutions were prepared at concentrations ranging from 2 to 30 mM, depending on the type of experiment (UV, CD or NMR).

D.2.3 CD and CD melting

CD experiments were performed on a JASCO J-815 spectrometer using Spectra Manager software. Samples containing 3-5 µM oligonucleotide in 500 µl of G4-folding buffer were transferred into a 10 mm quartz cuvette. The ligand was added from the stock solutions (see D.2.2). CD spectra were recorded between 220 and 320 nm, with a scan speed of 100 nm/min and a response time of 2 s; five scans were collected and averaged for each experiment.


The melting experiments were conducted by monitoring the ellipticity at 263 nm while increasing the temperature from 10 to 105 °C at a heating rate of 1 °C/min.
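As an illustration of how a melting temperature can be read from such a curve, the following Python sketch normalizes the ellipticity between its folded and unfolded plateaus and interpolates the temperature at which half of the signal is lost. It is only a simplified sketch under stated assumptions (a two-state transition with flat baselines) and is not the procedure implemented in Spectra Manager; the function names and the synthetic curves are hypothetical and serve only to exercise the code.

```python
import math

def estimate_tm(temps, signal):
    """Return the temperature at which the normalized signal crosses 0.5.

    Assumes a monotonic, two-state unfolding curve: the first point is taken
    as fully folded and the last point as fully unfolded (flat baselines).
    """
    folded, unfolded = signal[0], signal[-1]
    frac = [(s - unfolded) / (folded - unfolded) for s in signal]  # 1 -> 0
    for i in range(1, len(frac)):
        hi, lo = frac[i - 1], frac[i]
        if hi >= 0.5 >= lo and hi != lo:
            # Linear interpolation between the two bracketing points.
            t = (hi - 0.5) / (hi - lo)
            return temps[i - 1] + t * (temps[i] - temps[i - 1])
    raise ValueError("signal never crosses the half-transition level")

def synthetic_melt(temps, tm, width=4.0, amplitude=20.0):
    """Hypothetical sigmoidal melting curve, used only as test input."""
    return [amplitude / (1.0 + math.exp((t - tm) / width)) for t in temps]

temps = [10 + i for i in range(96)]      # 10 to 105 °C in 1 °C steps
free = synthetic_melt(temps, tm=49.0)    # G4 alone (illustrative Tm)
bound = synthetic_melt(temps, tm=69.0)   # G4 + ligand (illustrative Tm)

# Stabilization is the difference between the two melting temperatures.
delta_tm = estimate_tm(temps, bound) - estimate_tm(temps, free)
```

With real data, sloping baselines or incomplete unfolding at the highest temperature would make this simple midpoint estimate unreliable, and a baseline-corrected fit would be preferable.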

Table 7: List of the oligonucleotides used in this work. Guanines which can be involved in the establishment of tetrads are colored in red. 5% 15N, 13C site-specifically labeled guanines of Kras-22RT are marked by asterisks (*).

Name              Length   ɛ (mol-1.L.cm-1)   Sequence (5'-3')
Kras-32R          32 mer   352000             5'-(GCGGTGTGGGAAGAGGGAAGAGGGGGAGGCAG)-3'
Kras-21R          21 mer   222300             5'-(AGGGCGGTGTGGGAAGAGGGA)-3'
Kras-21RT         21 mer   221100             5'-(AGGGCGGTGTGGGAATAGGGA)-3'
Kras-22R          22 mer   234300             5'-(AGGGCGGTGTGGGAAGAGGGAA)-3'
Kras-22RT         22 mer   233100             5'-(AGGGCGGTGTGGGAATAGGGAA)-3'
Kras-S1(2)        20 mer   211000             5'-(AGGGCGGTGTGGGAAGAGGG)-3'
Kras-S2(3)        20 mer   211000             5'-(CGGTGTGGGAAGAGGGAAGAGG)-3'
Kras-S3(4)        25 mer   276000             5'-(TGTGGGAAGAGGGAAGAGGGTGAGG)-3'
Kras-22RT (G2)    22 mer   233100             5'-(A*GGGCGGTGTGGGAATAGGGAA)-3'
Kras-22RT (G3)    22 mer   233100             5'-(AG*GGCGGTGTGGGAATAGGGAA)-3'
Kras-22RT (G4)    22 mer   233100             5'-(AGG*GCGGTGTGGGAATAGGGAA)-3'
Kras-22RT (G6)    22 mer   233100             5'-(AGGGC*GGTGTGGGAATAGGGAA)-3'
Kras-22RT (G7)    22 mer   233100             5'-(AGGGCG*GTGTGGGAATAGGGAA)-3'
Kras-22RT (G9)    22 mer   233100             5'-(AGGGCGGT*GTGGGAATAGGGAA)-3'
Kras-22RT (G11)   22 mer   233100             5'-(AGGGCGGTGT*GGGAATAGGGAA)-3'
Kras-22RT (G12)   22 mer   233100             5'-(AGGGCGGTGTG*GGAATAGGGAA)-3'
Kras-22RT (G13)   22 mer   233100             5'-(AGGGCGGTGTGG*GAATAGGGAA)-3'
Kras-22RT (G18)   22 mer   233100             5'-(AGGGCGGTGTGGGAATA*GGGAA)-3'
Kras-22RT (G19)   22 mer   233100             5'-(AGGGCGGTGTGGGAATAG*GGAA)-3'
Kras-22RT (G20)   22 mer   233100             5'-(AGGGCGGTGTGGGAATAGG*GAA)-3'
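The extinction coefficients listed in Table 7 allow a strand concentration to be derived from an absorbance reading through the Beer-Lambert law, c = A / (ɛ·l). The short Python sketch below illustrates this arithmetic only; it is not the protocol of B.2.1.1. The assumptions are that the absorbance is read at the wavelength for which ɛ was calculated (conventionally 260 nm for oligonucleotides), that the path length is 1 cm by default, and that the function name, the `dilution` parameter and the absorbance value are hypothetical.

```python
# Molar extinction coefficients (L mol^-1 cm^-1) copied from Table 7.
EPSILON = {
    "Kras-32R": 352000,
    "Kras-21R": 222300,
    "Kras-22RT": 233100,
}

def strand_concentration_uM(absorbance, name, path_cm=1.0, dilution=1.0):
    """Beer-Lambert law: c = A / (epsilon * l), returned in micromolar.

    `dilution` is the factor by which the stock was diluted before the
    reading (hypothetical parameter; 1.0 means the sample was read neat).
    """
    molar = absorbance / (EPSILON[name] * path_cm)
    return molar * dilution * 1e6

# Illustrative reading: an absorbance of 0.70 for Kras-22RT in a 1 cm cell.
conc = strand_concentration_uM(0.70, "Kras-22RT")
```

For this illustrative reading the result is close to 3 µM, which is the order of magnitude used for the CD samples described in D.2.3.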

D.2.4 NMR spectroscopy

NMR spectra were recorded on Bruker Avance 700 and 800 MHz instruments equipped with a cryogenically cooled probe. Experiments were performed at 20 °C. For solution NMR, standard 3 or 5 mm NMR tubes were used. The samples were prepared in potassium NMR buffer (20 mM K2HPO4/KH2PO4, 70 mM KCl, pH 6.5, 10% D2O). The concentration of each sequence was between 1 and 5 mM, depending on the experiment requirements. To study the interaction with the ligand, a 1D 1H NMR titration of Kras-22RT with Braco19 was performed by adding aliquots of Braco19 directly from a 30 mM stock solution (in DMSO) to the NMR sample, up to a ligand-to-DNA molar ratio of 2. Most of the 1D 1H spectra were recorded using the hs11echo pulse sequence (179), based on a double pulsed field gradient spin-echo, which selectively removes the water resonance without affecting other resonances, including those in fast exchange with water. To correlate the imino and H8 protons of the same guanine through bonds via the 13C5 carbon, a 1H-13C HMBC experiment was performed at natural abundance (293).
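The volume of ligand stock needed to reach a given molar ratio follows directly from the amount of DNA in the tube. The Python sketch below shows this bookkeeping, assuming the 30 mM stock mentioned above and, for illustration, the 500 µM sample in 200 µl reported for the titration of Figure 63. The function name and the list of ratios are hypothetical, and the dilution of the DNA by the added volume is deliberately neglected.

```python
def aliquot_volumes_ul(dna_uM, sample_ul, stock_mM, equivalents):
    """Cumulative stock volume (µl) needed to reach each ligand:DNA ratio.

    Dilution of the DNA by the added ligand is ignored, which is only
    reasonable while the added volume stays small relative to the sample.
    """
    dna_nmol = dna_uM * sample_ul * 1e-3   # µM x µl = pmol, converted to nmol
    return [eq * dna_nmol / stock_mM for eq in equivalents]  # nmol / mM = µl

# Illustrative values: 500 µM DNA in 200 µl, 30 mM ligand stock.
volumes = aliquot_volumes_ul(500, 200, 30, [0.5, 1.0, 1.5, 2.0])
```

Under these assumptions one molar equivalent corresponds to roughly 3.3 µl of stock, so reaching 2 equivalents adds about 3% to the sample volume (and the same fraction of DMSO).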

D.3 Results and discussions

D.3.1 Screening of Kras G4-forming sequences for NMR study

Several approaches have been developed in the literature to favor a single major G4 conformation for high-resolution structure determination. Modifying the flanking sequences at the ends, or the loop sequences, can improve G-quadruplex folding and spectral quality, as observed in studies of the highly polymorphic human telomeric sequence (93-95,112). Guanines from G-quartets can also be modified, commonly through G to T, G to I (inosine) or G to brG (8-bromo-guanine) substitutions, in order to favor a single conformation, as shown previously with the c-myc G-quadruplex (33,53). Experimental conditions such as pH, salt concentration and temperature, or sample preparation steps such as the annealing procedure, can also help to favor a single conformation, as observed with the c-kit G-quadruplex (55,63). To screen sequences for NMR characterization, 1D NMR spectra of several sequences derived from the Kras 32R sequence were acquired. All the selected sequences are between 20 and 25 nucleotides long (see Table 7). The sequences 21RT and 22RT carry a guanine 16 to thymine mutation compared to their natural equivalents 21R and 22R (Table 7). Among the tested sequences, Kras S1, S2 and S3 showed a broad envelope with some resolved peaks between 10 and 12 ppm, indicating the presence of multiple G4 structures (Fig. 58). The sequences 22R and 21RT showed more resolved peaks, indicating the presence of a single major conformation, but also many merged peaks that are difficult to distinguish from one another (Fig. 58). In contrast, the sequences 21R and 22RT have a clean profile with well-resolved imino peaks (Fig. 58). They were therefore selected as candidates for ligand screening and for further CD and NMR investigations. Moreover, 22RT shows a better resolved pattern than 21R in both the imino and aromatic regions. Importantly, the G16 to T mutation does not change the general fold of the G4 compared to the wild type: all the resolved peaks keep their positions, and only the merged ones show better resolution.
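The relationship between the wild-type and mutated sequences can be checked with a few lines of Python. The sketch below applies the G16 to T substitution to the 22R sequence of Table 7 and lists the runs of consecutive guanines that remain. It is only sequence bookkeeping: the helper names are hypothetical, and listing G-tracts says nothing about which guanines actually take part in the tetrads, which has to be established experimentally.

```python
import re

KRAS_22R = "AGGGCGGTGTGGGAAGAGGGAA"   # wild-type sequence from Table 7

def substitute(seq, position, new_base):
    """Replace the base at a 1-based position (5'->3' numbering)."""
    i = position - 1
    return seq[:i] + new_base + seq[i + 1:]

def g_tracts(seq, min_len=2):
    """Return (1-based start, length) for every run of >= min_len guanines."""
    pattern = r"G{%d,}" % min_len
    return [(m.start() + 1, len(m.group())) for m in re.finditer(pattern, seq)]

# G16 -> T gives the 22RT sequence listed in Table 7.
kras_22rt = substitute(KRAS_22R, 16, "T")
tracts = g_tracts(kras_22rt)
```

Applied to 22RT, this lists three GGG runs (starting at positions 2, 11 and 18) and one GG run (starting at position 6); the isolated G9 is not reported with the default `min_len` of 2.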

Figure 58: Screening of Kras G4 candidates using 1D 1H NMR.

Only the imino (10-12 ppm) and aromatic (7-8.5 ppm) regions are shown. Sequence names are shown above the spectra.

D.3.2 CD spectroscopy analysis of Kras 21R and 22RT with G4 ligands

The CD signatures of the G4s formed by 21R and 22RT, with a positive band at 263 nm and a negative band at 243 nm, suggest that both adopt a parallel conformation (Fig. 59, 60 and 61).

Three different ligands (360A, Braco19 and PhenDC3) were then tested with the G4s formed by 21R and 22RT. Titration of 21R with 360A induces a small increase in the CD signal intensity and a small shift of the maximum toward 262 nm (Fig. 59A). The addition of 360A to 22RT, in turn, leads to a decrease of the signal at 263 nm and an increase of the signal at 243 and 290 nm; the bands at 263 and 243 nm shift toward shorter wavelengths (Fig. 59B). Titration of 21R with Braco19 leads to an increase in signal intensity up to 1 equivalent. The signal intensity then decreases, first slowly and then more markedly, as up to 5 equivalents are added. The signal at 299 nm increases steadily at the same time, and the band at 263 nm shifts to 262 nm (Fig. 60A). The titration of 22RT with Braco19 gives almost identical results to those obtained with 21R (Fig. 60B). For both 21R and 22RT, PhenDC3 induces an increase of the signal at 263 nm and a decrease of the signal at 243 nm up to 2 equivalents; the trend reverses when more than 2 equivalents are added (Fig. 61A, 61B).

The three ligands tested induce similar, small changes in the CD signatures of both 21R and 22RT, which may reflect small changes in the relative orientation of the tetrads. These changes may be related to stabilization of the G4.

Figure 59: CD titration of Kras 21R and 22RT with 360A.

A) CD titration spectra of 3 µM Kras 21R annealed alone in 90 mM K+ buffer, followed by successive additions of 360A from 0 to 5 molar equivalents. B) CD titration spectra of 3 µM Kras 22RT annealed alone in 90 mM K+ buffer, followed by successive additions of 360A from 0 to 5 molar equivalents. Data were collected at 20 °C.


Figure 60: CD titration of Kras 21R and 22RT with Braco19.

A) CD titration spectra of 3 µM Kras 21R annealed alone in 90 mM K+ buffer, followed by successive additions of Braco19 from 0 to 5 molar equivalents. B) CD titration spectra of 3 µM Kras 22RT annealed alone in 90 mM K+ buffer, followed by successive additions of Braco19 from 0 to 5 molar equivalents. Data were collected at 20 °C.


Figure 61: CD titration of Kras 21R and 22RT with PhenDC3.

A) CD titration spectra of 3 µM Kras 21R annealed alone in 90 mM K+ buffer, followed by successive additions of PhenDC3 from 0 to 5 molar equivalents. B) CD titration spectra of 3 µM Kras 22RT annealed alone in 90 mM K+ buffer, followed by successive additions of PhenDC3 from 0 to 5 molar equivalents. Data were collected at 20 °C.

D.3.3 Stabilization of 21R and 22RT in the presence of different ligands

To compare the stabilization effects of the ligands on the Kras G4s, melting studies were performed on both sequences (21R and 22RT) with the three ligands (360A, Braco19 and PhenDC3). The CD melting of both Kras G4s was monitored at 263 nm, which corresponds to the signature of the parallel G4. All the melting experiments were done with 2 molar equivalents of ligand. The melting temperature (Tm) obtained for 21R annealed in 90 mM K+ is around 49 °C (Fig. 62A). The stabilization (∆Tm) was more than 40 °C in the presence of 2 molar equivalents of PhenDC3, whereas Braco19 conferred a stabilization of 20 °C and 360A more than 35 °C (Fig. 62A). 22RT alone was more stable than 21R, with a Tm of 54 °C (Fig. 62B), and it was stabilized more strongly by the three ligands: PhenDC3 and 360A conferred a ∆Tm higher than 40 °C, and Braco19 induced a ∆Tm of around 25 °C (Fig. 62B).

According to the titration and melting studies, all the tested ligands behave similarly towards both 21R and 22RT. NMR spectroscopy is the ideal tool to obtain more structural information. In this study, we performed preliminary NMR investigations of the 22RT-Braco19 pair.

D.3.4 NMR titration of 22RT with Braco19

Braco19 provided good thermal stabilization of 22RT (∆Tm ~ 30 °C). To study the interaction between 22RT and Braco19, a titration of 22RT with increasing concentrations of the compound, from 0 to 1.5 molar equivalents, was performed. Addition of the compound gradually changes the chemical shifts up to a 1:1 ratio between 22RT and Braco19, without disrupting the tetrads, as shown in Fig. 63. Above 1 molar equivalent of Braco19, no further chemical shift perturbations were observed, suggesting a 1:1 stoichiometry for the complex formed between 22RT and Braco19. We also observe the formation of an intermediate species in the transition from 0.75 to 1 molar equivalent, which probably indicates that the complex passes through a stable intermediate phase of rearrangement before being finally stabilized as a 1:1 complex.

The sequence 22RT appears to form a well-resolved G4 that should be amenable to study by multidimensional NMR spectroscopy. We did not have time to progress further in determining the structure, but preliminary experiments were performed to establish the general scaffold of the G4 formed by 22RT. The interaction of Braco19 with 22RT also opens a new line of investigation that is worth discussing.


Figure 62: 21R and 22RT stabilization by various ligands.

A) CD melting of 3 µM 21R annealed alone in 90 mM K+ buffer (black line), and in the presence of 2 molar equivalents of 360A (red line), Braco19 (blue line) and PhenDC3 (green line). B) CD melting of 3 µM 22RT annealed alone in 90 mM K+ buffer (black line), and in the presence of 2 molar equivalents of 360A (red line), Braco19 (blue line) and PhenDC3 (green line). Melting was monitored at 263 nm.


Figure 63: 1D 1H NMR titration of 22RT with Braco19.

Titration spectra of the imino proton region of 500 µM 22RT with different concentrations of Braco19 in 200 µl of potassium phosphate buffer containing 20 mM K2HPO4/KH2PO4 and 70 mM KCl at pH 6.9. Experiments were performed at 20 °C.


Conclusions and perspectives


E. Conclusions and perspectives

In this dissertation, I have developed three different topics in the field of G-quadruplexes and their interaction with small ligands of potential therapeutic interest. The first concerns the binding between the human telomeric G4 formed by the sequence 22AG and a small ligand synthesized in the laboratory, the 2,4,6-triarylpyridine (TAP) compound, which was the best candidate from a family of ligands that showed good G4-stabilizing characteristics. On the one hand, it shows high specificity and selectivity for G4 structures over canonical forms of DNA in different buffer conditions (Na+ and K+); on the other hand, it confers high stability on the human telomeric G4, as assessed by FRET melting experiments. In this work, I wanted to know how TAP binds to the 22AG G4 and to study the chemical properties that enable TAP to stabilize the G4 structure by as much as 30 °C, as observed by FRET melting. Using CD and 1D NMR spectroscopy, I performed titrations in different conditions, which confirmed that TAP binds 22AG with medium-to-low affinity. Our observations lead us to conclude that the ligand imposes only mild modifications on the structure of 22AG. This explanation is reinforced by the aggregation of the ligand, which hinders proper matching between the ligand and 22AG. The CD melting results showed that TAP stabilizes the G4, as expected from the previous FRET melting results. Using 2D NMR spectroscopy, I assigned all the resonances of the G-quadruplex on the basis of published results for 22AG alone, calculated the structure of the free form with more NOEs than in the published structure, which better validates the model, and then extrapolated the assignments of the free form to assign the complex. Chemical shift perturbation mapping allowed us to determine the potential TAP binding region, located between the lateral loop formed by T17, T18 and A19 and the tetrad formed by G4, G8, G16 and G20. Unfortunately, no intermolecular NOEs could be identified unambiguously to provide decisive NMR evidence for the binding: peak overlap in the aliphatic region makes it impossible to distinguish ligand-ligand from ligand-22AG NOEs. A few new peaks are observed in the aromatic region, but we could not assign them unambiguously; they could be intermolecular NOEs, but even so they are not sufficient to localize the region of interaction. However, the docking studies showed that two different binding sites are possible: the most favored one corresponds to the region determined by NMR described above, and the other is located between the diagonal loop formed by T11, T12 and A13 and the minor groove of the quadruplex. Overall, this work showed that TAP binds to the G4 at grooves and loops with low affinity. Enlarging the 4-aryl core seems to be a good strategy to improve the affinity of TAP for G4s, since the side chains could then more easily establish interactions with the G-quadruplex grooves. To study the structural features of the interaction in more detail, we plan to attempt crystallization trials of the TAP-22AG complex to confirm the interaction mode and to investigate experimentally the existence of the second binding site. In the future, other NMR experiments such as STD and WaterLOGSY, which failed for the TAP interaction in our conditions, will have to be optimized to obtain the structure of the complex. Other G4 sequences, such as the c-myc and K-ras G4s, will also be tested to confirm the structural features observed with 22AG.

In the second part of our project, we characterized the tetramolecular G4 formed by d[TG4T]4 inside living cells. We showed that G4 structures are formed within cells and are visible by NMR spectroscopy, which is not known to be a very sensitive method. The G4 formed by d[TG4T]4 seems to be resistant to numerous enzymatic activities and can be considered a long-lived species within cells. The differences found between the in vivo and in vitro results raise new, pertinent questions about G4-ligand interactions that must be addressed in future experiments. The ligand bound to the oligonucleotide seems to enhance the binding of certain biomolecules and/or to precipitate the complex inside the cell, which makes it invisible on the NMR time scale. Taken together, our findings and those of previous NMR studies of the human telomeric sequence d(AG3(T2AG3)3) give us fresh clues on how to study oligonucleotides, and more specifically G4 structures, inside living cells in the presence of ligands. The results of the binding investigations with ligands inside Xenopus oocytes show that, despite the toxicity that these ligands may have at high concentrations, the oocytes remained intact for nearly 24 h with one of them, 360A. Moreover, this ligand has a very interesting effect on the G4 both in vitro and in vivo. 360A can be used to test other G4s, such as the human telomeric motif and the c-myc and K-ras G4 structures. The optimization that I carried out for the in-cell experiments in human cells encourages us to investigate in more detail the possibility of detecting NMR signals from G4s in a medium as complex as the interior of human cells. The electroporation protocol is now well optimized and delivers a good amount of G4 into cells, especially compared with the amounts obtained with proteins (5-12 µM) (243-245). The oligonucleotide d[TG4T]4 seems to interact with cellular partners inside human cells, and its binding to these large molecules in a very crowded intracellular environment prevents proper observation of its signals by liquid-state NMR. With that in mind, we started a collaboration with Dr. Antoine Loquet from CBMN (Institut de Chimie et Biologie des Membranes et des Nano-objets) in Bordeaux in order to perform solid-state NMR and obtain more information about the folding and binding of G4s inside cells. This represents a possible avenue for future experiments. In the shorter term, we think that alternative shorter and more compact sequences may provide visible signals, and that longer and probably more relevant sequences than TG4T, such as c-myc or the human telomeric motif, can also be used to test G4 folding and ligand binding inside mammalian cells using NMR spectroscopy.

Besides these two projects, we screened several sequences from the NHE region of the Kras promoter that are able to form G4s. Among them, two sequences (21R and 22RT) gave good-quality spectra, as shown by 1D NMR spectroscopy. In addition, their interaction with specific G4 ligands (360A, PhenDC3 and Braco19) confers high stabilization (∆Tm > 30 °C). The NMR titration of 22RT with Braco19 revealed an interesting behavior of the G4 in complex with the ligand: an intermediate is observed in the duplex region before the final stabilization of 22RT and Braco19 in a 1:1 complex. In the future, we intend to solve by NMR spectroscopy the structure of the G4 formed by 22RT, followed by that of its complex with Braco19. Such investigations will allow us to better understand how ligands bind to G4 structures and how these scaffolds can be used for new drug development.


REFERENCES

1. Watson, J.D. and Crick, F.H. (1953) Molecular structure of nucleic acids; a structure for deoxyribose nucleic acid. Nature, 171, 737-738.
2. Watson, J.D. and Crick, F.H. (1953) Genetical implications of the structure of deoxyribonucleic acid. Nature, 171, 964-967.
3. Wang, A.H., Quigley, G.J., Kolpak, F.J., Crawford, J.L., van Boom, J.H., van der Marel, G. and Rich, A. (1979) Molecular structure of a left-handed double helical DNA fragment at atomic resolution. Nature, 282, 680-686.
4. Felsenfeld, G. and Rich, A. (1957) Studies on the formation of two- and three-stranded polyribonucleotides. Biochimica et biophysica acta, 26, 457-468.
5. Hoogsteen, K. (1959) The structure of crystals containing a hydrogen-bonded complex of 1-methylthymine and 9-methyladenine. Acta Crystallographica, 12, 822-823.
6. Hoogsteen, K. (1963) The crystal and molecular structure of a hydrogen-bonded complex between 1-methylthymine and 9-methyladenine. Acta Crystallographica, 16, 907-916.
7. Gellert, M., Lipsett, M.N. and Davies, D.R. (1962) Helix formation by guanylic acid. Proceedings of the National Academy of Sciences of the United States of America, 48, 2013-2018.
8. Arnott, S., Chandrasekaran, R. and Marttila, C.M. (1974) Structures for polyinosinic acid and polyguanylic acid. The Biochemical journal, 141, 537-543.
9. Oganesian, L. and Bryan, T.M. (2007) Physiological relevance of telomeric G-quadruplex formation: a potential drug target. BioEssays : news and reviews in molecular, cellular and developmental biology, 29, 155-165.
10. De Cian, A., Lacroix, L., Douarre, C., Temime-Smaali, N., Trentesaux, C., Riou, J.F. and Mergny, J.L. (2008) Targeting telomeres and telomerase. Biochimie, 90, 131-155.
11. Ou, T.M., Lu, Y.J., Tan, J.H., Huang, Z.S., Wong, K.Y. and Gu, L.Q. (2008) G-quadruplexes: targets in anticancer drug design. ChemMedChem, 3, 690-713.
12. Duchler, M. (2012) G-quadruplexes: targets and tools in anticancer drug design. Journal of drug targeting, 20, 389-400.
13. Pettersen, E.F., Goddard, T.D., Huang, C.C., Couch, G.S., Greenblatt, D.M., Meng, E.C. and Ferrin, T.E. (2004) UCSF Chimera--a visualization system for exploratory research and analysis. Journal of computational chemistry, 25, 1605-1612.
14. Radhakrishnan, I. and Patel, D.J. (1994) Solution structure of a pyrimidine.purine.pyrimidine DNA triplex containing T.AT, C+.GC and G.TA triples. Structure, 2, 17-32.
15. Wang, Y. and Patel, D.J. (1993) Solution structure of the human telomeric repeat d[AG3(T2AG3)3] G-tetraplex. Structure, 1, 263-282.
16. Mergny, J.L., Phan, A.T. and Lacroix, L. (1998) Following G-quartet formation by UV-spectroscopy. FEBS letters, 435, 74-78.
17. Wong, A. and Wu, G. (2003) Selective binding of monovalent cations to the stacking G-quartet structure formed by guanosine 5'-monophosphate: a solid-state NMR study. Journal of the American Chemical Society, 125, 13895-13905.
18. Wlodarczyk, A., Grzybowski, P., Patkowski, A. and Dobek, A. (2005) Effect of ions on the polymorphism, effective charge, and stability of human telomeric DNA. Photon correlation spectroscopy and circular dichroism studies. The journal of physical chemistry. B, 109, 3594-3605.
19. Pinnavaia, T.J., Marshall, C.L., Mettler, C.M., Fisk, C.L., Miles, H.T. and Becker, E.D. (1978) Alkali metal ion specificity in the solution ordering of a nucleotide, 5'-guanosine monophosphate. Journal of the American Chemical Society, 100, 3625-3627.


20. Smith, F.W. and Feigon, J. (1992) Quadruplex structure of Oxytricha telomeric DNA oligonucleotides. Nature, 356, 164-168.
21. Wong, A., Ida, R., Spindler, L. and Wu, G. (2005) Disodium guanosine 5'-monophosphate self-associates into nanoscale cylinders at pH 8: a combined diffusion NMR spectroscopy and dynamic light scattering study. Journal of the American Chemical Society, 127, 6990-6998.
22. Zhou, J., Amrane, S., Korkut, D.N., Bourdoncle, A., He, H.Z., Ma, D.L. and Mergny, J.L. (2013) Combination of i-motif and G-quadruplex structures within the same strand: formation and application. Angewandte Chemie, 52, 7742-7746.
23. Webba da Silva, M. (2007) Geometric formalism for DNA quadruplex folding. Chemistry, 13, 9738-9745.
24. Crnugelj, M., Sket, P. and Plavec, J. (2003) Small change in a G-rich sequence, a dramatic change in topology: new dimeric G-quadruplex folding motif with unique loop orientations. Journal of the American Chemical Society, 125, 7866-7871.
25. Guedin, A., Gros, J., Alberti, P. and Mergny, J.L. (2010) How long is too long? Effects of loop size on G-quadruplex stability. Nucleic acids research, 38, 7858-7868.
26. Guedin, A., Alberti, P. and Mergny, J.L. (2009) Stability of intramolecular quadruplexes: sequence effects in the central loop. Nucleic acids research, 37, 5559-5567.
27. Rachwal, P.A., Brown, T. and Fox, K.R. (2007) Sequence effects of single base loops in intramolecular quadruplex DNA. FEBS letters, 581, 1657-1660.
28. Rachwal, P.A., Findlow, I.S., Werner, J.M., Brown, T. and Fox, K.R. (2007) Intramolecular DNA quadruplexes with different arrangements of short and long loops. Nucleic acids research, 35, 4214-4222.
29. Guedin, A., De Cian, A., Gros, J., Lacroix, L. and Mergny, J.L. (2008) Sequence effects in single-base loops for quadruplexes. Biochimie, 90, 686-696.
30. Laughlan, G., Murchie, A.I., Norman, D.G., Moore, M.H., Moody, P.C., Lilley, D.M. and Luisi, B. (1994) The high-resolution crystal structure of a parallel-stranded guanine tetraplex. Science, 265, 520-524.
31. Phillips, K., Dauter, Z., Murchie, A.I., Lilley, D.M. and Luisi, B. (1997) The crystal structure of a parallel-stranded guanine tetraplex at 0.95 A resolution. Journal of molecular biology, 273, 171-182.
32. Sket, P., Kozminski, W. and Plavec, J. (2012) Is There Any Proton Exchange Between Ammonium Ions localized Within the d(G3T4G4)2 Quadruplex? Acta chimica Slovenica, 59, 473-477.
33. Phan, A.T., Kuryavyi, V., Gaw, H.Y. and Patel, D.J. (2005) Small-molecule interaction with a five-guanine-tract G-quadruplex structure from the human MYC promoter. Nature chemical biology, 1, 167-173.
34. Todd, A.K., Haider, S.M., Parkinson, G.N. and Neidle, S. (2007) Sequence occurrence and structural uniqueness of a G-quadruplex in the human c-kit promoter. Nucleic acids research, 35, 5799-5808.
35. Wei, D., Parkinson, G.N., Reszka, A.P. and Neidle, S. (2012) Crystal structure of a c-kit promoter quadruplex reveals the structural role of metal ions and water molecules in maintaining loop conformation. Nucleic acids research, 40, 4691-4700.
36. Mukundan, V.T. and Phan, A.T. (2013) Bulges in G-quadruplexes: broadening the definition of G-quadruplex-forming sequences. Journal of the American Chemical Society, 135, 5017-5028.
37. Yang, Q., Xiang, J., Yang, S., Li, Q., Zhou, Q., Guan, A., Zhang, X., Zhang, H., Tang, Y. and Xu, G. (2010) Verification of specific G-quadruplex structure by using a novel cyanine dye supramolecular assembly: II. The binding characterization with specific intramolecular G-quadruplex and the recognizing mechanism. Nucleic acids research, 38, 1022-1033.
38. Huppert, J.L. and Balasubramanian, S. (2005) Prevalence of quadruplexes in the human genome. Nucleic acids research, 33, 2908-2916.
39. Wong, H.M., Stegle, O., Rodgers, S. and Huppert, J.L. (2010) A toolbox for predicting g-quadruplex formation and stability. Journal of nucleic acids, 2010.
40. Todd, A.K., Johnston, M. and Neidle, S. (2005) Highly prevalent putative quadruplex sequence motifs in human DNA. Nucleic acids research, 33, 2901-2907.
41. Bourdoncle, A., Estevez Torres, A., Gosse, C., Lacroix, L., Vekhoff, P., Le Saux, T., Jullien, L. and Mergny, J.L. (2006) Quadruplex-based molecular beacons as tunable DNA probes. Journal of the American Chemical Society, 128, 11094-11105.
42. Huppert, J.L. and Balasubramanian, S. (2007) G-quadruplexes in promoters throughout the human genome. Nucleic acids research, 35, 406-413.
43. Eddy, J. and Maizels, N. (2006) Gene function correlates with potential for G4 DNA formation in the human genome. Nucleic acids research, 34, 3887-3896.
44. Hershman, S.G., Chen, Q., Lee, J.Y., Kozak, M.L., Yue, P., Wang, L.S. and Johnson, F.B. (2008) Genomic distribution and functional analyses of potential G-quadruplex-forming sequences in Saccharomyces cerevisiae. Nucleic acids research, 36, 144-156.
45. Capra, J.A., Paeschke, K., Singh, M. and Zakian, V.A. (2010) G-quadruplex DNA sequences are evolutionarily conserved and associated with distinct genomic features in Saccharomyces cerevisiae. PLoS computational biology, 6, e1000861.
46. Stegle, O., Payet, L., Mergny, J.L., MacKay, D.J. and Leon, J.H. (2009) Predicting and understanding the stability of G-quadruplexes. Bioinformatics, 25, i374-382.
47. Phan, A.T. and Mergny, J.L. (2002) Human telomeric DNA: G-quadruplex, i-motif and Watson-Crick double helix. Nucleic acids research, 30, 4618-4625.
48. Huber, M.D., Duquette, M.L., Shiels, J.C. and Maizels, N. (2006) A conserved G4 DNA binding domain in RecQ family helicases. Journal of molecular biology, 358, 1071-1080.
49. Marchand, C., Pourquier, P., Laco, G.S., Jing, N. and Pommier, Y. (2002) Interaction of human nuclear topoisomerase I with guanosine quartet-forming and guanosine-rich single-stranded DNA and RNA oligonucleotides. The Journal of biological chemistry, 277, 8906-8911.
50. Arimondo, P.B., Riou, J.F., Mergny, J.L., Tazi, J., Sun, J.S., Garestier, T. and Helene, C. (2000) Interaction of human DNA topoisomerase I with G-quartet structures. Nucleic acids research, 28, 4832-4838.
51. Chung, I.K., Mehta, V.B., Spitzner, J.R. and Muller, M.T. (1992) Eukaryotic topoisomerase II cleavage of parallel stranded DNA tetraplexes. Nucleic acids research, 20, 1973-1977.
52. Amrane, S., Kerkour, A., Bedrat, A., Vialet, B., Andreola, M.L. and Mergny, J.L. (2014) Topology of a DNA G-quadruplex structure formed in the HIV-1 promoter: a potential target for anti-HIV drug development. Journal of the American Chemical Society, 136, 5249-5252.
53. Ambrus, A., Chen, D., Dai, J., Jones, R.A. and Yang, D. (2005) Solution structure of the biologically relevant G-quadruplex element in the human c-MYC promoter. Implications for G-quadruplex stabilization. Biochemistry, 44, 2048-2058.
54. Phan, A.T., Modi, Y.S. and Patel, D.J. (2004) Propeller-type parallel-stranded G-quadruplexes in the human c-myc promoter. Journal of the American Chemical Society, 126, 8710-8716.


55. Phan, A.T., Kuryavyi, V., Burge, S., Neidle, S. and Patel, D.J. (2007) Structure of an unprecedented G-quadruplex scaffold in the human c-kit promoter. Journal of the American Chemical Society, 129, 4386-4392. 56. Dai, J., Dexheimer, T.S., Chen, D., Carver, M., Ambrus, A., Jones, R.A. and Yang, D. (2006) An intramolecular G-quadruplex structure with mixed parallel/antiparallel G- strands formed in the human BCL-2 promoter region in solution. Journal of the American Chemical Society, 128, 1096-1098. 57. Dai, J., Chen, D., Jones, R.A., Hurley, L.H. and Yang, D. (2006) NMR solution structure of the major G-quadruplex structure formed in the human BCL2 promoter region. Nucleic acids research, 34, 5133-5144. 58. Cogoi, S. and Xodo, L.E. (2006) G-quadruplex formation within the promoter of the KRAS proto-oncogene and its effect on transcription. Nucleic acids research, 34, 2536-2549. 59. Sun, D., Guo, K., Rusche, J.J. and Hurley, L.H. (2005) Facilitation of a structural transition in the polypurine/polypyrimidine tract within the proximal promoter region of the human VEGF gene by the presence of potassium and G-quadruplex-interactive agents. Nucleic acids research, 33, 6070-6080. 60. Brooks, T.A., Kendrick, S. and Hurley, L. (2010) Making sense of G-quadruplex and i-motif functions in oncogene promoters. The FEBS journal, 277, 3459-3469. 61. Mathad, R.I., Hatzakis, E., Dai, J. and Yang, D. (2011) c-MYC promoter G- quadruplex formed at the 5'-end of NHE III1 element: insights into biological relevance and parallel-stranded G-quadruplex stability. Nucleic acids research, 39, 9023-9033. 62. Kuryavyi, V., Phan, A.T. and Patel, D.J. (2010) Solution structures of all parallel- stranded monomeric and dimeric G-quadruplex scaffolds of the human c-kit2 promoter. Nucleic acids research, 38, 6757-6773. 63. Hsu, S.T., Varnai, P., Bugaut, A., Reszka, A.P., Neidle, S. and Balasubramanian, S. 
(2009) A G-rich sequence within the c-kit oncogene promoter forms a parallel G-quadruplex having asymmetric G-tetrad dynamics. Journal of the American Chemical Society, 131, 13399-13409. 64. Jeffreys, A.J., Barber, R., Bois, P., Buard, J., Dubrova, Y.E., Grant, G., Hollies, C.R., May, C.A., Neumann, R., Panayi, M. et al. (1999) Human minisatellites, repeat DNA instability and meiotic recombination. Electrophoresis, 20, 1665-1675. 65. Singh, L. (1995) Biological significance of minisatellites. Electrophoresis, 16, 1586-1595. 66. Amrane, S., Adrian, M., Heddi, B., Serero, A., Nicolas, A., Mergny, J.L. and Phan, A.T. (2012) Formation of pearl-necklace monomorphic G-quadruplexes in the human CEB25 minisatellite. Journal of the American Chemical Society, 134, 5807-5816. 67. Adrian, M., Ang, D.J., Lech, C.J., Heddi, B., Nicolas, A. and Phan, A.T. (2014) Structure and conformational dynamics of a stacked dimeric G-quadruplex formed by the human CEB1 minisatellite. Journal of the American Chemical Society, 136, 6297-6305. 68. Sen, D. and Gilbert, W. (1988) Formation of parallel four-stranded complexes by guanine-rich motifs in DNA and its implications for meiosis. Nature, 334, 364-366. 69. Cheong, C. and Moore, P.B. (1992) Solution structure of an unusually stable RNA tetraplex containing G- and U-quartet structures. Biochemistry, 31, 8406-8414. 70. Kumari, S., Bugaut, A., Huppert, J.L. and Balasubramanian, S. (2007) An RNA G-quadruplex in the 5' UTR of the NRAS proto-oncogene modulates translation. Nature chemical biology, 3, 218-221.


71. Sacca, B., Lacroix, L. and Mergny, J.L. (2005) The effect of chemical modifications on the thermal stability of different G-quadruplex-forming oligonucleotides. Nucleic acids research, 33, 1182-1192. 72. Darnell, J.C., Jensen, K.B., Jin, P., Brown, V., Warren, S.T. and Darnell, R.B. (2001) Fragile X mental retardation protein targets G quartet mRNAs important for neuronal function. Cell, 107, 489-499. 73. Brown, V., Jin, P., Ceman, S., Darnell, J.C., O'Donnell, W.T., Tenenbaum, S.A., Jin, X., Feng, Y., Wilkinson, K.D., Keene, J.D. et al. (2001) Microarray identification of FMRP-associated brain mRNAs and altered mRNA translational profiles in fragile X syndrome. Cell, 107, 477-487. 74. Kikin, O., Zappala, Z., D'Antonio, L. and Bagga, P.S. (2008) GRSDB2 and GRS_UTRdb: databases of quadruplex forming G-rich sequences in pre-mRNAs and mRNAs. Nucleic acids research, 36, D141-148. 75. Huppert, J.L., Bugaut, A., Kumari, S. and Balasubramanian, S. (2008) G- quadruplexes: the beginning and end of UTRs. Nucleic acids research, 36, 6260-6268. 76. Martadinata, H. and Phan, A.T. (2009) Structure of propeller-type parallel-stranded RNA G-quadruplexes, formed by human telomeric RNA sequences in K+ solution. Journal of the American Chemical Society, 131, 2570-2578. 77. Collie, G.W., Haider, S.M., Neidle, S. and Parkinson, G.N. (2010) A crystallographic and modelling study of a human telomeric RNA (TERRA) quadruplex. Nucleic acids research, 38, 5569-5580. 78. Luke, B. and Lingner, J. (2009) TERRA: telomeric repeat-containing RNA. The EMBO journal, 28, 2503-2510. 79. O'Sullivan, R.J. and Karlseder, J. (2010) Telomeres: protecting chromosomes against genome instability. Nature reviews. Molecular cell biology, 11, 171-181. 80. Olovnikov, A.M. (1973) A theory of marginotomy. The incomplete copying of template margin in enzymic synthesis of polynucleotides and biological significance of the phenomenon. Journal of theoretical biology, 41, 181-190. 81. Harley, C.B., Futcher, A.B. 
and Greider, C.W. (1990) Telomeres shorten during ageing of human fibroblasts. Nature, 345, 458-460. 82. Harley, C.B. (1991) Telomere loss: mitotic clock or genetic time bomb? Mutation research, 256, 271-282. 83. Pandita, T.K., Hunt, C.R., Sharma, G.G. and Yang, Q. (2007) Regulation of telomere movement by telomere chromatin structure. Cellular and molecular life sciences : CMLS, 64, 131-138. 84. Tham, W.H. and Zakian, V.A. (2002) Transcriptional silencing at Saccharomyces telomeres: implications for other organisms. Oncogene, 21, 512-521. 85. Autexier, C. and Lue, N.F. (2006) The structure and function of telomerase reverse transcriptase. Annual review of biochemistry, 75, 493-517. 86. Blackburn, E.H. and Gall, J.G. (1978) A tandemly repeated sequence at the termini of the extrachromosomal ribosomal RNA genes in Tetrahymena. Journal of molecular biology, 120, 33-53. 87. Jacob, N.K., Skopp, R. and Price, C.M. (2001) G-overhang dynamics at Tetrahymena telomeres. The EMBO journal, 20, 4299-4308. 88. McElligott, R. and Wellinger, R.J. (1997) The terminal DNA structure of mammalian chromosomes. The EMBO journal, 16, 3705-3714. 89. Chai, W., Shay, J.W. and Wright, W.E. (2005) Human telomeres maintain their overhang length at senescence. Molecular and cellular biology, 25, 2158-2168. 90. Wellinger, R.J., Wolf, A.J. and Zakian, V.A. (1993) Saccharomyces telomeres acquire single-strand TG1-3 tails late in S phase. Cell, 72, 51-60.


91. Schultze, P., Smith, F.W. and Feigon, J. (1994) Refined solution structure of the dimeric quadruplex formed from the Oxytricha telomeric oligonucleotide d(GGGGTTTTGGGG). Structure, 2, 221-233. 92. Wang, Y. and Patel, D.J. (1993) Solution structure of a parallel-stranded G-quadruplex DNA. Journal of molecular biology, 234, 1171-1183. 93. Ambrus, A., Chen, D., Dai, J., Bialis, T., Jones, R.A. and Yang, D. (2006) Human telomeric sequence forms a hybrid-type intramolecular G-quadruplex structure with mixed parallel/antiparallel strands in potassium solution. Nucleic acids research, 34, 2723-2735. 94. Luu, K.N., Phan, A.T., Kuryavyi, V., Lacroix, L. and Patel, D.J. (2006) Structure of the human telomere in K+ solution: an intramolecular (3 + 1) G-quadruplex scaffold. Journal of the American Chemical Society, 128, 9963-9970. 95. Dai, J., Carver, M., Punchihewa, C., Jones, R.A. and Yang, D. (2007) Structure of the Hybrid-2 type intramolecular human telomeric G-quadruplex in K+ solution: insights into structure polymorphism of the human telomeric sequence. Nucleic acids research, 35, 4927-4940. 96. Moyzis, R.K., Buckingham, J.M., Cram, L.S., Dani, M., Deaven, L.L., Jones, M.D., Meyne, J., Ratliff, R.L. and Wu, J.R. (1988) A highly conserved repetitive DNA sequence, (TTAGGG)n, present at the telomeres of human chromosomes. Proceedings of the National Academy of Sciences of the United States of America, 85, 6622-6626. 97. Wright, W.E., Tesmer, V.M., Huffman, K.E., Levene, S.D. and Shay, J.W. (1997) Normal human chromosomes have long G-rich telomeric overhangs at one end. Genes & development, 11, 2801-2809. 98. Blasco, M.A. (2005) Telomeres and human disease: ageing, cancer and beyond. Nature reviews. Genetics, 6, 611-622. 99. Kim, N.W., Piatyszek, M.A., Prowse, K.R., Harley, C.B., West, M.D., Ho, P.L., Coviello, G.M., Wright, W.E., Weinrich, S.L. and Shay, J.W. (1994) Specific association of human telomerase activity with immortal cells and cancer. 
Science, 266, 2011-2015. 100. Palm, W. and de Lange, T. (2008) How shelterin protects mammalian telomeres. Annual review of genetics, 42, 301-334. 101. Sun, D., Thompson, B., Cathers, B.E., Salazar, M., Kerwin, S.M., Trent, J.O., Jenkins, T.C., Neidle, S. and Hurley, L.H. (1997) Inhibition of human telomerase by a G-quadruplex-interactive compound. Journal of medicinal chemistry, 40, 2113-2116. 102. Wang, Q., Liu, J.Q., Chen, Z., Zheng, K.W., Chen, C.Y., Hao, Y.H. and Tan, Z. (2011) G-quadruplex formation at the 3' end of telomere DNA inhibits its extension by telomerase, polymerase and unwinding by helicase. Nucleic acids research, 39, 6229-6237. 103. Zahler, A.M., Williamson, J.R., Cech, T.R. and Prescott, D.M. (1991) Inhibition of telomerase by G-quartet DNA structures. Nature, 350, 718-720. 104. Paeschke, K., Simonsson, T., Postberg, J., Rhodes, D. and Lipps, H.J. (2005) Telomere end-binding proteins control the formation of G-quadruplex DNA structures in vivo. Nature structural & molecular biology, 12, 847-854. 105. Biffi, G., Di Antonio, M., Tannahill, D. and Balasubramanian, S. (2014) Visualization and selective chemical targeting of RNA G-quadruplex structures in the cytoplasm of human cells. Nature chemistry, 6, 75-80. 106. Biffi, G., Tannahill, D., McCafferty, J. and Balasubramanian, S. (2013) Quantitative visualization of DNA G-quadruplex structures in human cells. Nature chemistry, 5, 182-186.


107. Griffith, J.D., Comeau, L., Rosenfield, S., Stansel, R.M., Bianchi, A., Moss, H. and de Lange, T. (1999) Mammalian telomeres end in a large duplex loop. Cell, 97, 503-514. 108. Stansel, R.M., de Lange, T. and Griffith, J.D. (2001) T-loop assembly in vitro involves binding of TRF2 near the 3' telomeric overhang. The EMBO journal, 20, 5532-5540. 109. Amiard, S., Doudeau, M., Pinte, S., Poulet, A., Lenain, C., Faivre-Moskalenko, C., Angelov, D., Hug, N., Vindigni, A., Bouvet, P. et al. (2007) A topological mechanism for TRF2-enhanced strand invasion. Nature structural & molecular biology, 14, 147- 154. 110. Zaug, A.J., Podell, E.R. and Cech, T.R. (2005) Human POT1 disrupts telomeric G- quadruplexes allowing telomerase extension in vitro. Proceedings of the National Academy of Sciences of the United States of America, 102, 10864-10869. 111. Possemato, R., Timmons, J.C., Bauerlein, E.L., Wada, N., Baldwin, A., Masutomi, K. and Hahn, W.C. (2008) Suppression of hPOT1 in diploid human cells results in an hTERT-dependent alteration of telomere length dynamics. Molecular cancer research : MCR, 6, 1582-1593. 112. Phan, A.T., Kuryavyi, V., Luu, K.N. and Patel, D.J. (2007) Structure of two intramolecular G-quadruplexes formed by natural human telomere sequences in K+ solution. Nucleic acids research, 35, 6517-6525. 113. Dai, J., Punchihewa, C., Ambrus, A., Chen, D., Jones, R.A. and Yang, D. (2007) Structure of the intramolecular human telomeric G-quadruplex in potassium solution: a novel adenine triple formation. Nucleic acids research, 35, 2440-2450. 114. Matsugami, A., Xu, Y., Noguchi, Y., Sugiyama, H. and Katahira, M. (2007) Structure of a human telomeric DNA sequence stabilized by 8-bromoguanosine substitutions, as determined by NMR in a K+ solution. The FEBS journal, 274, 3545-3556. 115. Lim, K.W., Amrane, S., Bouaziz, S., Xu, W., Mu, Y., Patel, D.J., Luu, K.N. and Phan, A.T. 
(2009) Structure of the human telomere in K+ solution: a stable basket-type G-quadruplex with only two G-tetrad layers. Journal of the American Chemical Society, 131, 4301-4309. 116. Heddi, B. and Phan, A.T. (2011) Structure of human telomeric DNA in crowded solution. Journal of the American Chemical Society, 133, 9824-9833. 117. Lim, K.W., Ng, V.C., Martin-Pintado, N., Heddi, B. and Phan, A.T. (2013) Structure of the human telomere in Na+ solution: an antiparallel (2+2) G-quadruplex scaffold reveals additional diversity. Nucleic acids research, 41, 10556-10562. 118. Hanahan, D. and Weinberg, R.A. (2011) Hallmarks of cancer: the next generation. Cell, 144, 646-674. 119. Harley, C.B. (2008) Telomerase and cancer therapeutics. Nature reviews. Cancer, 8, 167-179. 120. Rezler, E.M., Bearss, D.J. and Hurley, L.H. (2002) Telomeres and telomerases as drug targets. Current opinion in pharmacology, 2, 415-423. 121. Shay, J.W. and Wright, W.E. (2006) Telomerase therapeutics for cancer: challenges and new directions. Nature reviews. Drug discovery, 5, 577-584. 122. Pascolo, E., Wenz, C., Lingner, J., Hauel, N., Priepke, H., Kauffmann, I., Garin-Chesa, P., Rettig, W.J., Damm, K. and Schnapp, A. (2002) Mechanism of human telomerase inhibition by BIBR1532, a synthetic, non-nucleosidic drug candidate. The Journal of biological chemistry, 277, 15566-15572. 123. Neidle, S. (2010) Human telomeric G-quadruplex: the current status of telomeric G-quadruplexes as therapeutic targets in human cancer. The FEBS journal, 277, 1118-1125. 124. Monchaud, D. and Teulade-Fichou, M.P. (2008) A hitchhiker's guide to G-quadruplex ligands. Organic & biomolecular chemistry, 6, 627-636.


125. Artese, A., Costa, G., Ortuso, F., Parrotta, L. and Alcaro, S. (2013) Identification of new natural DNA G-quadruplex binders selected by a structure-based virtual screening approach. Molecules, 18, 12051-12070. 126. Gavathiotis, E., Heald, R.A., Stevens, M.F. and Searle, M.S. (2003) Drug recognition and stabilisation of the parallel-stranded DNA quadruplex d(TTAGGGT)4 containing the human telomeric repeat. Journal of molecular biology, 334, 25-36. 127. Campbell, N.H., Parkinson, G.N., Reszka, A.P. and Neidle, S. (2008) Structural basis of DNA quadruplex recognition by an acridine drug. Journal of the American Chemical Society, 130, 6722-6724. 128. Mergny, J.L. and Maurizot, J.C. (2001) Fluorescence resonance energy transfer as a probe for G-quartet formation by a telomeric repeat. Chembiochem : a European journal of chemical biology, 2, 124-132. 129. De Cian, A., Guittat, L., Kaiser, M., Sacca, B., Amrane, S., Bourdoncle, A., Alberti, P., Teulade-Fichou, M.P., Lacroix, L. and Mergny, J.L. (2007) Fluorescence-based melting assays for studying quadruplex ligands. Methods, 42, 183-195. 130. White, E.W., Tanious, F., Ismail, M.A., Reszka, A.P., Neidle, S., Boykin, D.W. and Wilson, W.D. (2007) Structure-specific recognition of quadruplex DNA by organic cations: influence of shape, substituents and charge. Biophysical chemistry, 126, 140- 153. 131. Hounsou, C., Guittat, L., Monchaud, D., Jourdan, M., Saettel, N., Mergny, J.L. and Teulade-Fichou, M.P. (2007) G-quadruplex recognition by quinacridines: a SAR, NMR, and biological study. ChemMedChem, 2, 655-666. 132. Fedoroff, O.Y., Salazar, M., Han, H., Chemeris, V.V., Kerwin, S.M. and Hurley, L.H. (1998) NMR-Based model of a telomerase-inhibiting compound bound to G- quadruplex DNA. Biochemistry, 37, 12367-12374. 133. Dai, J., Carver, M., Hurley, L.H. and Yang, D. (2011) Solution structure of a 2:1 quindoline-c-MYC G-quadruplex: insights into G-quadruplex-interactive small molecule drug design. 
Journal of the American Chemical Society, 133, 17673-17680. 134. Shin-ya, K. (2004) [Telomerase inhibitor, telomestatin, a specific mechanism to interact with telomere structure]. Nihon rinsho. Japanese journal of clinical medicine, 62, 1277-1282. 135. Chung, W.J., Heddi, B., Tera, M., Iida, K., Nagasawa, K. and Phan, A.T. (2013) Solution structure of an intramolecular (3 + 1) human telomeric G-quadruplex bound to a telomestatin derivative. Journal of the American Chemical Society, 135, 13495-13501. 136. Seenisamy, J., Rezler, E.M., Powell, T.J., Tye, D., Gokhale, V., Joshi, C.S., Siddiqui-Jain, A. and Hurley, L.H. (2004) The dynamic character of the G-quadruplex element in the c-MYC promoter and modification by TMPyP4. Journal of the American Chemical Society, 126, 8702-8709. 137. Siddiqui-Jain, A., Grand, C.L., Bearss, D.J. and Hurley, L.H. (2002) Direct evidence for a G-quadruplex in a promoter region and its targeting with a small molecule to repress c-MYC transcription. Proceedings of the National Academy of Sciences of the United States of America, 99, 11593-11598. 138. Parkinson, G.N., Ghosh, R. and Neidle, S. (2007) Structural basis for binding of porphyrin to human telomeres. Biochemistry, 46, 2390-2397. 139. Lemarteleur, T., Gomez, D., Paterski, R., Mandine, E., Mailliet, P. and Riou, J.F. (2004) Stabilization of the c-myc gene promoter quadruplex by specific ligands' inhibitors of telomerase. Biochemical and biophysical research communications, 323, 802-808.


140. Pennarun, G., Granotier, C., Gauthier, L.R., Gomez, D., Hoffschir, F., Mandine, E., Riou, J.F., Mergny, J.L., Mailliet, P. and Boussin, F.D. (2005) Apoptosis related to telomere instability and cell cycle alterations in human glioma cells treated by new highly selective G-quadruplex ligands. Oncogene, 24, 2917-2928. 141. Granotier, C., Pennarun, G., Riou, L., Hoffschir, F., Gauthier, L.R., De Cian, A., Gomez, D., Mandine, E., Riou, J.F., Mergny, J.L. et al. (2005) Preferential binding of a G-quadruplex ligand to human chromosome ends. Nucleic acids research, 33, 4182- 4190. 142. Mergny, J.L., Lacroix, L., Teulade-Fichou, M.P., Hounsou, C., Guittat, L., Hoarau, M., Arimondo, P.B., Vigneron, J.P., Lehn, J.M., Riou, J.F. et al. (2001) Telomerase inhibitors based on quadruplex ligands selected by a fluorescence assay. Proceedings of the National Academy of Sciences of the United States of America, 98, 3062-3067. 143. De Cian, A., Delemos, E., Mergny, J.L., Teulade-Fichou, M.P. and Monchaud, D. (2007) Highly efficient G-quadruplex recognition by bisquinolinium compounds. Journal of the American Chemical Society, 129, 1856-1857. 144. Chung, W.J., Heddi, B., Hamon, F., Teulade-Fichou, M.P. and Phan, A.T. (2014) Solution structure of a G-quadruplex bound to the bisquinolinium compound Phen- DC(3). Angewandte Chemie, 53, 999-1002. 145. Shin-ya, K., Wierzba, K., Matsuo, K., Ohtani, T., Yamada, Y., Furihata, K., Hayakawa, Y. and Seto, H. (2001) Telomestatin, a novel telomerase inhibitor from Streptomyces anulatus. Journal of the American Chemical Society, 123, 1262-1263. 146. Gomez, D., Paterski, R., Lemarteleur, T., Shin-Ya, K., Mergny, J.L. and Riou, J.F. (2004) Interaction of telomestatin with the telomeric single-strand overhang. The Journal of biological chemistry, 279, 41487-41494. 147. Minhas, G.S., Pilch, D.S., Kerrigan, J.E., LaVoie, E.J. and Rice, J.E. (2006) Synthesis and G-quadruplex stabilizing properties of a series of oxazole-containing macrocycles. 
Bioorganic & medicinal chemistry letters, 16, 3891-3895. 148. Monchaud, D., Yang, P., Lacroix, L., Teulade-Fichou, M.P. and Mergny, J.L. (2008) A metal-mediated conformational switch controls G-quadruplex binding affinity. Angewandte Chemie, 47, 4858-4861. 149. Reed, J.E., Arnal, A.A., Neidle, S. and Vilar, R. (2006) Stabilization of G-quadruplex DNA and inhibition of telomerase activity by square-planar nickel(II) complexes. Journal of the American Chemical Society, 128, 5992-5993. 150. Dixon, I.M., Lopez, F., Tejera, A.M., Esteve, J.P., Blasco, M.A., Pratviel, G. and Meunier, B. (2007) A G-quadruplex ligand with 10000-fold selectivity over duplex DNA. Journal of the American Chemical Society, 129, 1502-1503. 151. Martino, L., Virno, A., Pagano, B., Virgilio, A., Di Micco, S., Galeone, A., Giancola, C., Bifulco, G., Mayol, L. and Randazzo, A. (2007) Structural and thermodynamic studies of the interaction of distamycin A with the parallel quadruplex structure [d(TGGGGT)]4. Journal of the American Chemical Society, 129, 16048-16056. 152. Cocco, M.J., Hanakahi, L.A., Huber, M.D. and Maizels, N. (2003) Specific interactions of distamycin with G-quadruplex DNA. Nucleic acids research, 31, 2944-2951. 153. Gai, W., Yang, Q., Xiang, J., Jiang, W., Li, Q., Sun, H., Guan, A., Shang, Q., Zhang, H. and Tang, Y. (2013) A dual-site simultaneous binding mode in the interaction between parallel-stranded G-quadruplex [d(TGGGGT)]4 and cyanine dye 2,2'-diethyl-9-methyl-selenacarbocyanine bromide. Nucleic acids research, 41, 2709-2722. 154. Smith, N.M., Labrunie, G., Corry, B., Tran, P.L., Norret, M., Djavaheri-Mergny, M., Raston, C.L. and Mergny, J.L. (2011) Unraveling the relationship between structure


and stabilization of triarylpyridines as G-quadruplex binding ligands. Organic & biomolecular chemistry, 9, 6154-6162. 155. Cogoi, S., Paramasivam, M., Filichev, V., Geci, I., Pedersen, E.B. and Xodo, L.E. (2009) Identification of a new G-quadruplex motif in the KRAS promoter and design of pyrene-modified G4-decoys with antiproliferative activity in pancreatic cancer cells. Journal of medicinal chemistry, 52, 564-568. 156. Rhodes, D. and Giraldo, R. (1995) Telomere structure and function. Current opinion in structural biology, 5, 311-322. 157. Greider, C.W. and Blackburn, E.H. (1985) Identification of a specific telomere terminal transferase activity in Tetrahymena extracts. Cell, 43, 405-413. 158. Zheng, Y.L., Zhang, F., Sun, B., Du, J., Sun, C., Yuan, J., Wang, Y., Tao, L., Kota, K., Liu, X. et al. (2014) Telomerase enzymatic component hTERT shortens long telomeres in human cells. Cell cycle, 13, 1765-1776. 159. Blackburn, E.H., Greider, C.W. and Szostak, J.W. (2006) Telomeres and telomerase: the path from maize, Tetrahymena and yeast to human cancer and aging. Nat Med, 12, 1133-1138. 160. Stewart, S.A. and Weinberg, R.A. (2006) Telomeres: cancer to human aging. Annual review of cell and developmental biology, 22, 531-557. 161. Feng, J., Funk, W.D., Wang, S.S., Weinrich, S.L., Avilion, A.A., Chiu, C.P., Adams, R.R., Chang, E., Allsopp, R.C., Yu, J. et al. (1995) The RNA component of human telomerase. Science, 269, 1236-1241. 162. Ruden, M. and Puri, N. (2013) Novel anticancer therapeutics targeting telomerase. Cancer treatment reviews, 39, 444-456. 163. Agrawal, A., Dang, S. and Gabrani, R. (2012) Recent patents on anti-telomerase cancer therapy. Recent Pat Anticancer Drug Discov, 7, 102-117. 164. Casagrande, V., Salvati, E., Alvino, A., Bianco, A., Ciammaichella, A., D'Angelo, C., Ginnari-Satriani, L., Serrilli, A.M., Iachettini, S., Leonetti, C. et al. 
(2011) N-cyclic bay-substituted perylene G-quadruplex ligands have selective antiproliferative effects on cancer cells and induce telomere damage. Journal of medicinal chemistry, 54, 1140-1156. 165. Xu, Y. (2011) Chemistry in human telomere biology: structure, function and targeting of telomere DNA/RNA. Chemical Society reviews, 40, 2719-2740. 166. Parkinson, G.N., Cuenca, F. and Neidle, S. (2008) Topology conservation and loop flexibility in quadruplex-drug recognition: crystal structures of inter- and intramolecular telomeric DNA quadruplex-drug complexes. Journal of molecular biology, 381, 1145-1156. 167. Nicoludis, J.M., Barrett, S.P., Mergny, J.L. and Yatsunyk, L.A. (2012) Interaction of human telomeric DNA with N-methyl mesoporphyrin IX. Nucleic acids research, 40, 5432-5447. 168. Li, Z., Tan, J.H., He, J.H., Long, Y., Ou, T.M., Li, D., Gu, L.Q. and Huang, Z.S. (2012) Disubstituted quinazoline derivatives as a new type of highly selective ligands for telomeric G-quadruplex DNA. Eur J Med Chem, 47, 299-311. 169. Pagano, B., Fotticchia, I., De Tito, S., Mattia, C.A., Mayol, L., Novellino, E., Randazzo, A. and Giancola, C. (2010) Selective Binding of Distamycin A Derivative to G-Quadruplex Structure [d(TGGGGT)](4). Journal of nucleic acids, 2010. 170. Kerwin, S.M., Sun, D., Kern, J.T., Rangan, A. and Thomas, P.W. (2001) G-quadruplex DNA binding by a series of carbocyanine dyes. Bioorganic & medicinal chemistry letters, 11, 2411-2414.


171. Campbell, N.H., Patel, M., Tofa, A.B., Ghosh, R., Parkinson, G.N. and Neidle, S. (2009) Selectivity in ligand recognition of G-quadruplex loops. Biochemistry, 48, 1675-1680. 172. Zhang, S., Wu, Y. and Zhang, W. (2014) G-quadruplex structures and their interaction diversity with ligands. ChemMedChem, 9, 899-911. 173. Smith, N.M., Corry, B., Swaminathan Iyer, K., Norret, M. and Raston, C.L. (2009) A microfluidic platform to synthesize a G-quadruplex binding ligand. Lab Chip, 9, 2021- 2025. 174. Corry, B. and Smith, N.M. (2012) The role of thermodynamics and kinetics in ligand binding to G-quadruplex DNA. Chemical communications, 48, 8958-8960. 175. Cantor, C.R., Warshaw, M.M. and Shapiro, H. (1970) Oligonucleotide interactions. 3. Circular dichroism studies of the conformation of deoxyoligonucleotides. Biopolymers, 9, 1059-1077. 176. Masiero, S., Trotta, R., Pieraccini, S., De Tito, S., Perone, R., Randazzo, A. and Spada, G.P. (2010) A non-empirical chromophoric interpretation of CD spectra of DNA G-quadruplex structures. Organic & biomolecular chemistry, 8, 2683-2692. 177. Jain, A.K., Reddy, V.V., Paul, A., K, M. and Bhattacharya, S. (2009) Synthesis and evaluation of a novel class of G-quadruplex-stabilizing small molecules based on the 1,3-phenylene-bis(piperazinyl benzimidazole) system. Biochemistry, 48, 10693- 10704. 178. Dash, J., Shirude, P.S., Hsu, S.T. and Balasubramanian, S. (2008) Diarylethynyl amides that recognize the parallel conformation of genomic promoter DNA G- quadruplexes. Journal of the American Chemical Society, 130, 15950-15956. 179. Nguyen, B.D., Meng, X., Donovan, K.J. and Shaka, A.J. (2007) SOGGY: solvent- optimized double gradient spectroscopy for water suppression. A comparison with some existing techniques. Journal of magnetic resonance, 184, 263-274. 180. Thrippleton, M.J. and Keeler, J. (2003) Elimination of zero-quantum interference in two-dimensional NMR spectra. Angewandte Chemie, 42, 3938-3941. 181. Shaka, A.J., Lee, C.J. 
and Pines, A. (1988) Iterative schemes for bilinear operators; application to spin decoupling. Journal of Magnetic Resonance (1969), 77, 274-293. 182. Howe, P.W. (2014) Rapid pulsing artefacts in pulsed-field gradient double-quantum filtered COSY spectra. Magnetic resonance in chemistry : MRC, 52, 329-332. 183. Hwang, T.L. and Shaka, A.J. (1995) Water Suppression That Works. Excitation Sculpting Using Arbitrary Wave-Forms and Pulsed-Field Gradients. Journal of Magnetic Resonance, Series A, 112, 275-279. 184. Vranken, W.F., Boucher, W., Stevens, T.J., Fogh, R.H., Pajon, A., Llinas, M., Ulrich, E.L., Markley, J.L., Ionides, J. and Laue, E.D. (2005) The CCPN data model for NMR spectroscopy: development of a software pipeline. Proteins, 59, 687-696. 185. Webba da Silva, M. (2007) NMR methods for studying quadruplex nucleic acids. Methods, 43, 264-277. 186. Rieping, W., Habeck, M., Bardiaux, B., Bernard, A., Malliavin, T.E. and Nilges, M. (2007) ARIA2: automated NOE assignment and data integration in NMR structure calculation. Bioinformatics, 23, 381-382. 187. Duszczyk, M.M., Wutz, A., Rybin, V. and Sattler, M. (2011) The Xist RNA A-repeat comprises a novel AUCG tetraloop fold and a platform for multimerization. Rna, 17, 1973-1982. 188. Brunger, A.T., Adams, P.D., Clore, G.M., DeLano, W.L., Gros, P., Grosse-Kunstleve, R.W., Jiang, J.S., Kuszewski, J., Nilges, M., Pannu, N.S. et al. (1998) Crystallography & NMR system: A new software suite for macromolecular structure determination. Acta crystallographica. Section D, Biological crystallography, 54, 905-921.


189. Brunger, A.T. (2007) Version 1.2 of the Crystallography and NMR system. Nature protocols, 2, 2728-2733. 190. Perez, A., Marchan, I., Svozil, D., Sponer, J., Cheatham, T.E., 3rd, Laughton, C.A. and Orozco, M. (2007) Refinement of the AMBER force field for nucleic acids: improving the description of alpha/gamma conformers. Biophysical journal, 92, 3817- 3829. 191. Jorgensen, W.L., Chandrasekhar, J., Madura, J.D., Impey, R.W. and Klein, M.L. (1983) Comparison of simple potential functions for simulating liquid water. The Journal of Chemical Physics, 79, 926-935. 192. Wang, J., Wolf, R.M., Caldwell, J.W., Kollman, P.A. and Case, D.A. (2004) Development and testing of a general amber force field. Journal of computational chemistry, 25, 1157-1174. 193. Kleywegt, G.J. (2007) Crystallographic refinement of ligand complexes. Acta crystallographica. Section D, Biological crystallography, 63, 94-100. 194. Vorlickova, M., Kejnovska, I., Bednarova, K., Renciuk, D. and Kypr, J. (2012) Circular dichroism spectroscopy of DNA: from duplexes to quadruplexes. Chirality, 24, 691-698. 195. Karsisiotis, A.I., Hessari, N.M., Novellino, E., Spada, G.P., Randazzo, A. and Webba da Silva, M. (2011) Topological characterization of nucleic acid G-quadruplexes by UV absorption and circular dichroism. Angewandte Chemie, 50, 10645-10648. 196. Garner, T.P., Williams, H.E., Gluszyk, K.I., Roe, S., Oldham, N.J., Stevens, M.F., Moses, J.E. and Searle, M.S. (2009) Selectivity of small molecule ligands for parallel and anti-parallel DNA G-quadruplex structures. Organic & biomolecular chemistry, 7, 4194-4200. 197. Phan, A.T., Luu, K.N. and Patel, D.J. (2006) Different loop arrangements of intramolecular human telomeric (3+1) G-quadruplexes in K+ solution. Nucleic acids research, 34, 5715-5719. 198. Adrian, M., Heddi, B. and Phan, A.T. (2012) NMR spectroscopy of G-quadruplexes. Methods, 57, 11-24. 199. Feigon, J., Koshlap, K.M. and Smith, F.W. 
(1995) 1H NMR spectroscopy of DNA triplexes and quadruplexes. Methods in enzymology, 261, 225-255. 200. Phan, A.T., Modi, Y.S. and Patel, D.J. (2004) Two-repeat Tetrahymena telomeric d(TGGGGTTGGGGT) Sequence interconverts between asymmetric dimeric G-quadruplexes in solution. Journal of molecular biology, 338, 93-102. 201. Paramasivan, S. and Bolton, P.H. (2008) Mix and measure fluorescence screening for selective quadruplex binders. Nucleic acids research, 36, e106. 202. Wong, H.M., Payet, L. and Huppert, J.L. (2009) Function and targeting of G-quadruplexes. Curr Opin Mol Ther, 11, 146-155. 203. Collie, G.W. and Parkinson, G.N. (2011) The application of DNA and RNA G-quadruplexes to therapeutic medicines. Chemical Society reviews, 40, 5867-5892. 204. Bidzinska, J., Cimino-Reale, G., Zaffaroni, N. and Folini, M. (2013) G-quadruplex structures in the human genome as novel therapeutic targets. Molecules, 18, 12368-12395. 205. Parrotta, L., Ortuso, F., Moraca, F., Rocca, R., Costa, G., Alcaro, S. and Artese, A. (2014) Targeting unimolecular G-quadruplex nucleic acids: a new paradigm for the drug discovery? Expert opinion on drug discovery, 1-21. 206. Bax, A. and Davis, D.G. (1985) Practical Aspects of Two-Dimensional Transverse NOE Spectroscopy. Journal of magnetic resonance, 63, 207-213.


207. Piantini, U., Sorensen, O.W. and Ernst, R.R. (1982) Multiple quantum filters for elucidating NMR coupling networks. Journal of the American Chemical Society, 104, 6800-6801. 208. Bax, A. and Freeman, R. (1981) Investigation of complex networks of spin-spin coupling by two-dimensional NMR. Journal of magnetic resonance, 44, 542-561. 209. Chazin, W.J., Wuthrich, K., Hyberts, S., Rance, M., Denny, W.A. and Leupin, W. (1986) 1H nuclear magnetic resonance assignments for d-(GCATTAATGC)2 using experimental refinements of established procedures. Journal of molecular biology, 190, 439-453. 210. Schwieters, C.D., Kuszewski, J.J., Tjandra, N. and Clore, G.M. (2003) The Xplor-NIH NMR molecular structure determination package. Journal of magnetic resonance, 160, 65-73. 211. MacKerell, A.D., Jr., Wiorkiewicz-Kuczera, J. and Karplus, M. (1995) An all-atom empirical energy function for the simulation of nucleic acids. Journal of the American Chemical Society, 117, 11946-11975. 212. Foloppe, N. and MacKerell, A.D., Jr. (2000) All-Atom Empirical Force Field for Nucleic Acids: 1) Parameter Optimization Based on Small Molecule and Condensed Phase Macromolecular Target Data. Journal of computational chemistry, 21, 86-104. 213. Brooks, B.R., Bruccoleri, R.E., Olafson, B.D., States, D.J., Swaminathan, S. and Karplus, M. (1983) CHARMM: a program for macromolecular energy minimization and dynamics calculations. Journal of computational chemistry, 4, 187-217. 214. Baleja, J.D., Moult, J. and Sykes, B.D. (1990) Distance measurement and structure refinement with NOE data. Journal of magnetic resonance, 87, 375-384. 215. Redman, J.E., Granadino-Roldan, J.M., Schouten, J.A., Ladame, S., Reszka, A.P., Neidle, S. and Balasubramanian, S. (2009) Recognition and discrimination of DNA quadruplexes by acridine-peptide conjugates. Organic & biomolecular chemistry, 7, 76-84. 216. Dai, J., Carver, M. and Yang, D. (2008) Polymorphism of human telomeric quadruplex structures. Biochimie, 90, 1172-1183. 217. Phan, A.T. 
(2010) Human telomeric G-quadruplex: structures of DNA and RNA sequences. The FEBS journal, 277, 1107-1117. 218. Renciuk, D., Kejnovska, I., Skolakova, P., Bednarova, K., Motlova, J. and Vorlickova, M. (2009) Arrangements of human telomere DNA quadruplex in physiologically relevant K+ solutions. Nucleic acids research, 37, 6625-6634. 219. Chebotareva, N.A., Kurganov, B.I. and Livanova, N.B. (2004) Biochemical effects of molecular crowding. Biochemistry. Biokhimiia, 69, 1239-1251. 220. Ai, X., Zhou, Z., Bai, Y. and Choy, W.Y. (2006) 15N NMR spin relaxation dispersion study of the molecular crowding effects on protein folding under native conditions. Journal of the American Chemical Society, 128, 3916-3917. 221. Xue, Y., Kan, Z.Y., Wang, Q., Yao, Y., Liu, J., Hao, Y.H. and Tan, Z. (2007) Human telomeric DNA forms parallel-stranded intramolecular G-quadruplex in K+ solution under molecular crowding condition. Journal of the American Chemical Society, 129, 11185-11191. 222. Hansel, R., Lohr, F., Foldynova-Trantirkova, S., Bamberg, E., Trantirek, L. and Dotsch, V. (2011) The parallel G-quadruplex structure of vertebrate telomeric repeat sequences is not the preferred folding topology under physiological conditions. Nucleic acids research, 39, 5768-5775. 223. Martorell, G., Adrover, M., Kelly, G., Temussi, P.A. and Pastore, A. (2011) A natural and readily available crowding agent: NMR studies of proteins in hen egg white. Proteins, 79, 1408-1415.

152

224. Doghaei, A.V., Housaindokht, M.R. and Bozorgmehr, M.R. (2014) Molecular crowding effects on conformation and stability of G-quadruplex DNA structure: Insights from molecular dynamics simulation. Journal of theoretical biology. 225. Verdian Doghaei, A., Housaindokht, M.R. and Bozorgmehr, M.R. (2014) Molecular crowding effects on conformation and stability of G-quadruplex DNA structure: Insights from molecular dynamics simulation. Journal of theoretical biology, 364C, 103-112. 226. Yaku, H., Murashima, T., Tateishi-Karimata, H., Nakano, S., Miyoshi, D. and Sugimoto, N. (2013) Study on effects of molecular crowding on G-quadruplex-ligand binding and ligand-mediated telomerase inhibition. Methods, 64, 19-27. 227. Hansel, R., Luh, L.M., Corbeski, I., Trantirek, L. and Dotsch, V. (2014) In-Cell NMR and EPR Spectroscopy of Biomacromolecules. Angewandte Chemie, 53, 10300- 10314. 228. Takahashi, S. and Sugimoto, N. (2013) Effect of pressure on the stability of G- quadruplex DNA: thermodynamics under crowding conditions. Angewandte Chemie, 52, 13774-13778. 229. Hansel, R., Foldynova-Trantirkova, S., Dotsch, V. and Trantirek, L. (2013) Investigation of quadruplex structure under physiological conditions using in-cell NMR. Topics in current chemistry, 330, 47-65. 230. Miyoshi, D., Fujimoto, T. and Sugimoto, N. (2013) Molecular crowding and hydration regulating of G-quadruplex formation. Topics in current chemistry, 330, 87-110. 231. Ellis, R.J. (2001) Macromolecular crowding: an important but neglected aspect of the intracellular environment. Current opinion in structural biology, 11, 114-119. 232. Selenko, P. and Wagner, G. (2006) NMR mapping of protein interactions in living cells. Nature methods, 3, 80-81. 233. Inomata, K., Ohno, A., Tochio, H., Isogai, S., Tenno, T., Nakase, I., Takeuchi, T., Futaki, S., Ito, Y., Hiroaki, H. et al. (2009) High-resolution multi-dimensional NMR spectroscopy of proteins in human cells. Nature, 458, 106-109. 234. 
Selenko, P., Serber, Z., Gadea, B., Ruderman, J. and Wagner, G. (2006) Quantitative NMR analysis of the protein G B1 domain in Xenopus laevis egg extracts and intact oocytes. Proceedings of the National Academy of Sciences of the United States of America, 103, 11904-11909. 235. Serber, Z., Selenko, P., Hansel, R., Reckel, S., Lohr, F., Ferrell, J.E., Jr., Wagner, G. and Dotsch, V. (2006) Investigating macromolecules inside cultured and injected cells by in-cell NMR spectroscopy. Nature protocols, 1, 2701-2709. 236. Hansel, R., Foldynova-Trantirkova, S., Lohr, F., Buck, J., Bongartz, E., Bamberg, E., Schwalbe, H., Dotsch, V. and Trantirek, L. (2009) Evaluation of parameters critical for observing nucleic acids inside living Xenopus laevis oocytes by in-cell NMR spectroscopy. Journal of the American Chemical Society, 131, 15761-15768. 237. Hansel, R., Lohr, F., Trantirek, L. and Dotsch, V. (2013) High-resolution insight into G-overhang architecture. Journal of the American Chemical Society, 135, 2816-2824. 238. Theillet, F.X., Smet-Nocca, C., Liokatis, S., Thongwichian, R., Kosten, J., Yoon, M.K., Kriwacki, R.W., Landrieu, I., Lippens, G. and Selenko, P. (2012) Cell signaling, post-translational protein modifications and NMR spectroscopy. Journal of biomolecular NMR, 54, 217-236. 239. Theillet, F.X., Liokatis, S., Jost, J.O., Bekei, B., Rose, H.M., Binolfi, A., Schwarzer, D. and Selenko, P. (2012) Site-specific mapping and time-resolved monitoring of lysine methylation by high-resolution NMR spectroscopy. Journal of the American Chemical Society, 134, 7616-7619.

153

240. Liokatis, S., Stutzer, A., Elsasser, S.J., Theillet, F.X., Klingberg, R., van Rossum, B., Schwarzer, D., Allis, C.D., Fischle, W. and Selenko, P. (2012) Phosphorylation of histone H3 Ser10 establishes a hierarchy for subsequent intramolecular modification events. Nature structural & molecular biology, 19, 819-823. 241. Liokatis, S., Dose, A., Schwarzer, D. and Selenko, P. (2010) Simultaneous detection of protein phosphorylation and acetylation by high-resolution NMR spectroscopy. Journal of the American Chemical Society, 132, 14704-14705. 242. Burz, D.S., Dutta, K., Cowburn, D. and Shekhtman, A. (2006) In-cell NMR for protein-protein interactions (STINT-NMR). Nature protocols, 1, 146-152. 243. Bekei, B., Rose, H.M., Herzig, M., Dose, A., Schwarzer, D. and Selenko, P. (2012) In- cell NMR in mammalian cells: part 1. Methods in molecular biology, 895, 43-54. 244. Bekei, B., Rose, H.M., Herzig, M. and Selenko, P. (2012) In-cell NMR in mammalian cells: part 2. Methods in molecular biology, 895, 55-66. 245. Bekei, B., Rose, H.M., Herzig, M., Stephanowitz, H., Krause, E. and Selenko, P. (2012) In-cell NMR in mammalian cells: part 3. Methods in molecular biology, 895, 67-83. 246. Sakakibara, D., Sasaki, A., Ikeya, T., Hamatsu, J., Hanashima, T., Mishima, M., Yoshimasu, M., Hayashi, N., Mikawa, T., Walchli, M. et al. (2009) Protein structure determination in living cells by in-cell NMR spectroscopy. Nature, 458, 102-105. 247. Ikeya, T., Sasaki, A., Sakakibara, D., Shigemitsu, Y., Hamatsu, J., Hanashima, T., Mishima, M., Yoshimasu, M., Hayashi, N., Mikawa, T. et al. (2010) NMR protein structure determination in living E. coli cells using nonlinear sampling. Nature protocols, 5, 1051-1060. 248. Sket, P. and Plavec, J. (2010) Tetramolecular DNA quadruplexes in solution: insights into structural diversity and cation movement. Journal of the American Chemical Society, 132, 12724-12732. 249. Krishnan-Ghosh, Y., Liu, D. and Balasubramanian, S. 
(2004) Formation of an interlocked quadruplex dimer by d(GGGT). Journal of the American Chemical Society, 126, 11009-11016. 250. Tran, P.L., Moriyama, R., Maruyama, A., Rayner, B. and Mergny, J.L. (2011) A mirror-image tetramolecular DNA quadruplex. Chemical communications, 47, 5437- 5439. 251. Tran, P.L., Virgilio, A., Esposito, V., Citarella, G., Mergny, J.L. and Galeone, A. (2011) Effects of 8-methylguanine on structure, stability and kinetics of formation of tetramolecular quadruplexes. Biochimie, 93, 399-408. 252. Mergny, J.L., De Cian, A., Ghelab, A., Sacca, B. and Lacroix, L. (2005) Kinetics of tetramolecular quadruplexes. Nucleic acids research, 33, 81-94. 253. Gros, J., Rosu, F., Amrane, S., De Cian, A., Gabelica, V., Lacroix, L. and Mergny, J.L. (2007) Guanines are a quartet's best friend: impact of base substitutions on the kinetics and stability of tetramolecular quadruplexes. Nucleic acids research, 35, 3064-3075. 254. Bardin, C. and Leroy, J.L. (2008) The formation pathway of tetramolecular G- quadruplexes. Nucleic acids research, 36, 477-488. 255. Tran, P.L., De Cian, A., Gros, J., Moriyama, R. and Mergny, J.L. (2013) Tetramolecular quadruplex stability and assembly. Topics in current chemistry, 330, 243-273. 256. Rosu, F., Gabelica, V., Poncelet, H. and De Pauw, E. (2010) Tetramolecular G- quadruplex formation pathways studied by electrospray mass spectrometry. Nucleic acids research, 38, 5217-5225.

154

257. Aboul-ela, F., Murchie, A.I. and Lilley, D.M. (1992) NMR study of parallel-stranded tetraplex formation by the hexadeoxynucleotide d(TG4T). Nature, 360, 280-282. 258. Miyoshi, D., Nakao, A. and Sugimoto, N. (2002) Molecular crowding regulates the structural switch of the DNA G-quadruplex. Biochemistry, 41, 15017-15024. 259. Moriyama, R., Shimada, N., Kano, A. and Maruyama, A. (2011) The role of cationic comb-type copolymers in chaperoning DNA annealing. Biomaterials, 32, 7671-7676. 260. De Cian, A. and Mergny, J.L. (2007) Quadruplex ligands may act as molecular chaperones for tetramolecular quadruplex formation. Nucleic acids research, 35, 2483-2493. 261. Cosconati, S., Marinelli, L., Trotta, R., Virno, A., De Tito, S., Romagnoli, R., Pagano, B., Limongelli, V., Giancola, C., Baraldi, P.G. et al. (2010) Structural and conformational requisites in DNA quadruplex groove binding: another piece to the puzzle. Journal of the American Chemical Society, 132, 6425-6433. 262. Neumann, E., Schaefer-Ridder, M., Wang, Y. and Hofschneider, P.H. (1982) Gene transfer into mouse lyoma cells by electroporation in high electric fields. The EMBO journal, 1, 841-845. 263. Teissie, J., Golzio, M. and Rols, M.P. (2005) Mechanisms of cell membrane electropermeabilization: a minireview of our present (lack of ?) knowledge. Biochimica et biophysica acta, 1724, 270-280. 264. Felgner, P.L., Gadek, T.R., Holm, M., Roman, R., Chan, H.W., Wenz, M., Northrop, J.P., Ringold, G.M. and Danielsen, M. (1987) Lipofection: a highly efficient, lipid- mediated DNA-transfection procedure. Proceedings of the National Academy of Sciences of the United States of America, 84, 7413-7417. 265. Gissot, A., Camplo, M., Grinstaff, M.W. and Barthelemy, P. (2008) Nucleoside, nucleotide and oligonucleotide based amphiphiles: a successful marriage of nucleic acids with lipids. Organic & biomolecular chemistry, 6, 1324-1333. 266. Patwa, A., Gissot, A., Bestel, I. and Barthelemy, P. 
(2011) Hybrid lipid oligonucleotide conjugates: synthesis, self-assemblies and biomedical applications. Chemical Society reviews, 40, 5844-5854. 267. O'Brien, J.A. and Lummis, S.C. (2006) Biolistic transfection of neuronal cultures using a hand-held gene gun. Nature protocols, 1, 977-981. 268. Kumar, T.K., Thurman, R. and Jayanthi, S. (2013) In-Cell NMR Spectroscopy- Monitoring of the Structure, Dynamics, Folding, and Interactions of Proteins at Atomic Resolution. Journal of analytical & bioanalytical techniques, 4, e112. 269. Serber, Z., Keatinge-Clay, A.T., Ledwidge, R., Kelly, A.E., Miller, S.M. and Dotsch, V. (2001) High-resolution macromolecular NMR spectroscopy inside living cells. Journal of the American Chemical Society, 123, 2446-2447. 270. Schanda, P. and Brutscher, B. (2005) Very fast two-dimensional NMR spectroscopy for real-time investigation of dynamic events in proteins on the time scale of seconds. Journal of the American Chemical Society, 127, 8014-8015. 271. Schanda, P., Kupce, E. and Brutscher, B. (2005) SOFAST-HMQC experiments for recording two-dimensional heteronuclear correlation spectra of proteins within a few seconds. Journal of biomolecular NMR, 33, 199-211. 272. Ross, A., Salzmann, M. and Senn, H. (1997) Fast-HMQC using Ernst angle pulses: An efficient tool for screening of ligand binding to target proteins. Journal of biomolecular NMR, 10, 389-396. 273. Beaucage, S.L. and Caruthers, M.H. (1981) Deoxynucleoside phosphoramidites—A new class of key intermediates for deoxypolynucleotide synthesis. Tetrahedron Letters, 22, 1859-1862.

155

274. Kettani, A., Bouaziz, S., Skripkin, E., Majumdar, A., Wang, W., Jones, R.A. and Patel, D.J. (1999) Interlocked mismatch-aligned arrowhead DNA motifs. Structure, 7, 803-815. 275. Zimmer, D.P. and Crothers, D.M. (1995) NMR of enzymatically synthesized uniformly 13C15N-labeled DNA oligonucleotides. Proceedings of the National Academy of Sciences of the United States of America, 92, 3091-3095. 276. Schindelin, J., Arganda-Carreras, I., Frise, E., Kaynig, V., Longair, M., Pietzsch, T., Preibisch, S., Rueden, C., Saalfeld, S., Schmid, B. et al. (2012) Fiji: an open-source platform for biological-image analysis. Nature methods, 9, 676-682. 277. Thongwichian, R. and Selenko, P. (2012) In-cell NMR in Xenopus laevis oocytes. Methods in molecular biology, 895, 33-41. 278. Bodart, J.F., Wieruszeski, J.M., Amniai, L., Leroy, A., Landrieu, I., Rousseau- Lescuyer, A., Vilain, J.P. and Lippens, G. (2008) NMR observation of Tau in Xenopus oocytes. Journal of magnetic resonance, 192, 252-257. 279. Selenko, P. and Wagner, G. (2007) Looking into live cells with in-cell NMR spectroscopy. Journal of structural biology, 158, 244-253. 280. Sakai, T., Tochio, H., Tenno, T., Ito, Y., Kokubo, T., Hiroaki, H. and Shirakawa, M. (2006) In-cell NMR spectroscopy of proteins inside Xenopus laevis oocytes. Journal of biomolecular NMR, 36, 179-188. 281. Kanaori, K., Yasumura, A., Tajima, K. and Makino, K. (2004) 1H NMR study on equilibrium between parallel G-quartet structures. Nucleic Acids Symp Ser (Oxf), 233- 234. 282. Phan, A.T. and Patel, D.J. (2003) Two-repeat human telomeric d(TAGGGTTAGGGT) sequence forms interconverting parallel and antiparallel G-quadruplexes in solution: distinct topologies, thermodynamic properties, and folding/unfolding kinetics. Journal of the American Chemical Society, 125, 15021-15027. 283. Hu, L., Lim, K.W., Bouaziz, S. and Phan, A.T. 
(2009) Giardia telomeric sequence d(TAGGG)4 forms two intramolecular G-quadruplexes in K+ solution: effect of loop length and sequence on the folding topology. Journal of the American Chemical Society, 131, 16824-16831. 284. Cang, X., Sponer, J. and Cheatham, T.E., 3rd. (2011) Insight into G-DNA structural polymorphism and folding from sequence and loop connectivity through free energy analysis. Journal of the American Chemical Society, 133, 14270-14279. 285. Ellis, R.J. and Minton, A.P. (2003) Cell biology: join the crowd. Nature, 425, 27-28. 286. Calligari, P.A., Salgado, G.F., Pelupessy, P., Lopes, P., Ouazzani, J., Bodenhausen, G. and Abergel, D. (2012) Insights into internal dynamics of 6-phosphogluconolactonase from Trypanosoma brucei studied by nuclear magnetic resonance and molecular dynamics. Proteins, 80, 1196-1210. 287. Mergny, J.L. (2012) Alternative DNA structures: G4 DNA in cells: itae missa est? Nature chemical biology, 8, 225-226. 288. Sidibe, A., Hamon, F., Largy, E., Gomez, D., Teulade-Fichou, M.P., Trentesaux, C. and Riou, J.F. (2012) Effects of a halogenated G-quadruplex ligand from the pyridine dicarboxamide series on the terminal sequence of XpYp telomere in HT1080 cells. Biochimie, 94, 2559-2568. 289. Maizels, N. (2006) Dynamic roles for G4 DNA in the biology of eukaryotic cells. Nature structural & molecular biology, 13, 1055-1059. 290. Cogoi, S., Paramasivam, M., Spolaore, B. and Xodo, L.E. (2008) Structural polymorphism within a regulatory element of the human KRAS promoter: formation of G4-DNA recognized by nuclear proteins. Nucleic acids research, 36, 3765-3780.

156

291. Paramasivam, M., Cogoi, S. and Xodo, L.E. (2011) Primer extension reactions as a tool to uncover folding motifs within complex G-rich sequences: analysis of the human KRAS NHE. Chemical communications, 47, 4965-4967. 292. Cogoi, S., Zorzet, S., Rapozzi, V., Geci, I., Pedersen, E.B. and Xodo, L.E. (2013) MAZ-binding G4-decoy with locked nucleic acid and twisted intercalating nucleic acid modifications suppresses KRAS in pancreatic cancer cells and delays tumor growth in mice. Nucleic acids research, 41, 4049-4064. 293. Phan, A.T. (2000) Long-range imino proton-13C J-couplings and the through-bond correlation of imino and non-exchangeable protons in unlabeled DNA. Journal of biomolecular NMR, 16, 175-178.

157

Title: Study of DNA G-quadruplex structures by Nuclear Magnetic Resonance

Abstract:

G-quadruplexes (G4) are non-canonical nucleic acid structures formed by guanine-rich (G-rich) sequences, located mainly in telomeres and in the promoter regions of oncogenes. They are built from the stacking of several G-quartets in the presence of cations. Using NMR spectroscopy, we characterized the interaction between the TAP ligand and the human telomeric G4 formed by the sequence d(AG3(T2AG3)3). CD and 1D 1H NMR spectroscopy were used to follow the interaction between the two partners. 2D NMR was used to unambiguously assign all 1H resonances in the complex and to explore the binding site. A model depicting the interaction of TAP with 22AG at the grooves and loops was generated. Another part of this work concerns the study of the tetramolecular G4 formed by TG4T and its interaction with G4 ligands by in-cell NMR. 1H-15N HMQC spectra were recorded inside Xenopus laevis and in HeLa cell lysates and compared with those observed under in vitro conditions, showing good stability of the G4 inside the cell. Furthermore, the interaction of d[TG4T]4 with three G4-specific ligands presenting different modes of interaction was also investigated. The ligand 360A showed promising behavior. Finally, in the last part, different sequences of the KRAS promoter were screened by NMR to select good candidates for high-resolution structure determination. Two different sequences were selected and characterized by CD spectroscopy. The stabilization of the G4 structures formed by these sequences upon interaction with different ligands was also investigated. A 1D 1H NMR titration of 22RT with Braco19 showed an interesting behavior of the KRAS G4: intermediate species formed upon the addition of Braco19.

Keywords: G-quadruplex, NMR, ligand, interaction.

Research unit [INSERM U869, 2, Rue Robert Escarpit Pessac 33607]
