University of Southampton Research Repository

University of Southampton Research Repository Copyright © and Moral Rights for this thesis and, where applicable, any accompanying data are retained by the author and/or other copyright owners. A copy can be downloaded for personal non-commercial research or study, without prior permission or charge. This thesis and the accompanying data cannot be reproduced or quoted extensively from without first obtaining permission in writing from the copyright holder/s. The content of the thesis and accompanying research data (where applicable) must not be changed in any way or sold commercially in any format or medium without the formal permission of the copyright holder/s. When referring to this thesis and any accompanying data, full bibliographic details must be given, e.g. Thesis: Ana Lages (2018) "Characterization of parallel G-quadruplex formation by highly conserved G-rich motifs in INS intron 1", University of Southampton, Faculty of Medicine, PhD Thesis, pagination. FACULTY OF MEDICINE Human Development and Health Characterization of parallel G-quadruplex formation by highly conserved G-rich motifs in INS intron 1 by Ana Luísa Gonçalves das Lages Thesis for the degree of Doctor of Philosophy June 2018 UNIVERSITY OF SOUTHAMPTON ABSTRACT FACULTY OF MEDICINE Human Development and Health Thesis for the degree of Doctor of Philosophy CHARACTERIZATION OF PARALLEL G-QUADRUPLEX FORMATION BY HIGHLY CONSERVED G-RICH MOTIFS IN INS INTRON 1 Ana Luísa Gonçalves das Lages Individuals predisposed to type 1 diabetes (T1DM) carry an adenine allele at rs689, a T-to-A genetic variant located 6 nucleotides upstream of the 3’ splice site of INS intron 1. The A allele disrupts the polypyrimidine tract (Py-tract) and impairs splicing, increasing intron retention (IR) levels in mature transcripts. Intron 1-containing messenger RNAs have an extended 5’ untranslated region (5’UTR), introducing an upstream open reading frame (uORF) and curtailing translation. IR can be reduced using oligonucleotides complementary to an intronic segment flanked by G-rich sequences, mitigating the splicing defect. To investigate their potential to form G-quadruplex (G4), a G4-specific fluorescent probe, thioflavin T, was used to examine DNA and RNA G-tracts flanking the antisense target region, in vitro. Fluorescence intensity was shown to be specific for G4 structures in DNA and RNA. G4 formation was influenced by the adjacent target region in each direction and by concentrations of K+ and Mg2+. G4s were also detected in RNA transcribed in real time and their formation was influenced by mutations that affected intron 1 removal. To identify proteins binding to this intronic region, in vitro RNA transcripts containing the antisense target region were used in RNA pull-down assays with HeLa nuclear extracts, revealing heterogeneous nuclear ribonucleoproteins (hnRNPs) F and H1 binding to G-rich segments. The G-rich 5’ UTRs of representative primate species were also cloned into a dual-luciferase-reporter system to establish primate-specific translation rates. Elimination of the Homoninae-specific uORF significantly increased luciferase translation, demonstrating its importance for INS gene expression. Overall, data obtained in this project has improved the understanding of the molecular mechanisms underlying the allele-specific expression of preproinsulin expression. These results may facilitate development of future preventative strategies for T1DM. i ii I. Table of Contents ABSTRACT ..................................................................................................................... i I. Table of Contents ......................................................................................... iii II. List of Tables ................................................................................................ vii III. List of Figures ................................................................................................xi IV. DECLARATION OF AUTHORSHIP ................................................................... xiii V. Acknowledgements ...................................................................................... xv VI. Abbreviations ............................................................................................. xvii Chapter 1: Introduction ....................................................................................... 1 1.1 Gene Expression ....................................................................................................... 3 1.1.1 Genome complexity and coding capacity .................................................. 4 1.1.2 Maturation of RNA ..................................................................................... 4 1.2 Pre-mRNA splicing .................................................................................................... 5 1.2.1 Cis-acting regulatory sequences and trans-acting factors ........................ 7 1.2.2 Alternative RNA processing ....................................................................... 9 1.2.3 Co-transcriptional splicing ....................................................................... 12 1.2.4 Introns in splicing ..................................................................................... 14 1.2.5 The importance of RNA secondary structure in splicing ......................... 16 1.2.6 Alternative splicing and human disease .................................................. 27 1.3 The potential of antisense oligonucleotide therapies for genetic diseases .......... 39 1.4 Diabetes: classification, pathophysiology and mechanism ................................... 44 1.4.1 Type 1 Diabetes ....................................................................................... 44 1.4.2 Preproinsulin gene (INS) .......................................................................... 47 1.5 Hypothesis.............................................................................................................. 51 1.6 Aims and Objectives ............................................................................................... 51 Chapter 2: General Material and Methods ......................................................... 53 2.1 Materials and reagents .......................................................................................... 55 2.2 Methods ................................................................................................................. 63 2.2.1 Prediction of G4 forming G-rich sequences in INS intron ....................... 63 iii 2.2.2 Screening for G4 formation by the thioflavin T (ThT) fluorescence assay ................................................................................................................. 64 2.2.3 Real-time monitoring of G-quadruplex formation during transcription in vitro ......................................................................................................... 64 2.2.4 Identification of proteins that bind the antisense target regions for the INS intron 1 retention.............................................................................. 65 2.2.5 Cloning, expression and purification of recombinant hnRNPs F and H1 and RRMs ........................................................................................................ 66 2.2.6 Cloning and transfection of 5’UTR regions of Human and primates INS gene ......................................................................................................... 67 2.2.7 Statistical analysis .................................................................................... 70 Chapter 3: Thioflavin T (ThT) as a fluorescent light-up probe for monitoring of G4 formation ..................................................................................................... 73 3.1 Introduction ........................................................................................................... 75 3.2 Results ................................................................................................................... 76 3.2.1 Optimization of experimental conditions ............................................... 76 3.2.2 G4 formation in INS intron 1 DNA-derived oligos ................................... 84 3.2.3 G4 formation in INS intron 1 RNA-derived oligos ................................... 95 Chapter 4: Identification of INS intron1-binding proteins .................................. 113 4.1 Identification and characterization of proteins binding to INS intron 1 antisense target region ........................................................................................................ 115 4.1.1 Pull-down using INS intron 1 transcripts containing the antisense target and flanking G-runs ............................................................................... 115 4.1.2 Cloning, expression and purification of full-length hnNRP F/H1 and their individual RRM domains ........................................................................ 117 Chapter 5: Coupled INS 5’UTRs splicing and translation efficiency in higher primates .................................................................................................... 123 5.1 Translation efficiency of human and primate INS 5’UTRs ................................... 125 5.1.1 Cloning ................................................................................................... 125 5.1.2 Transfection efficiency .......................................................................... 125 iv 5.1.3 Conclusions

University of Southampton Research Repository

1 Lessons from Non-Canonical Splicing 1 2 Christopher R Sibley1,2, Lorea Blazquez1 and Jernej Ule1 3 4 Addresses: 5

Splicing and Editing of Ionotropic Glutamate Receptors: a Comprehensive Analysis Based on Human RNA‑Seq Data

A Collection of Pre-Mrna Splicing Mutants in Arabidopsis Thaliana

An Ancient Germ Cell-Specific RNA-Binding Protein Protects

Intron Retention As a Component of Regulated Gene Expression Programs

Unmasking Alternative Splicing Inside Protein-Coding Exons Defines Exitrons and Their Role in Proteome Plasticity

Structural Disruption of Exonic Stem–Loops Immediately Upstream of The

E Cient Gene Editing Through an Intronic Selection Marker in Cells

Uleetalnrg2016lessonsfromn

European Congress on Thrombosis and Haemostasis ABSTRACT BOOK

A Collection of Pre-Mrna Splicing Mutants in Arabidopsis Thaliana

Oral Presentation - Session1