Intron Retention Induced by Microsatellite Expansions As a Disease Biomarker

Łukasz J. Sznajder (a,1,2), James D. Thomas (a,1,3), Ellie M. Carrell (b), Tammy Reid (a), Karen N. McFarland (c), John D. Cleary (a), Ruan Oliveira (a), Curtis A. Nutter (a), Kirti Bhatt (b), Krzysztof Sobczak (d), Tetsuo Ashizawa (e), Charles A. Thornton (b), Laura P. W. Ranum (a), and Maurice S. Swanson (a,2)

(a) Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, College of Medicine, University of Florida, Gainesville, FL 32610; (b) Department of Neurology, University of Rochester, Rochester, NY 14642; (c) McKnight Brain Institute, Department of Neurology and Center for Translational Research in Neurodegenerative Disease, University of Florida, College of Medicine, Gainesville, FL 32610; (d) Department of Gene Expression, Institute of Molecular Biology and Biotechnology, Adam Mickiewicz University, 61-614 Poznan, Poland; and (e) Neurological Institute, Houston Methodist Hospital, Houston, TX 77030

Edited by Stephen T. Warren, Emory University School of Medicine, Atlanta, GA, and approved March 12, 2018 (received for review September 20, 2017)

Expansions of simple sequence repeats, or microsatellites, have been linked to ∼30 neurological–neuromuscular diseases. While these expansions occur in coding and noncoding regions, microsatellite sequence and repeat length diversity is more prominent in introns with eight different trinucleotide to hexanucleotide repeats, causing hereditary diseases such as myotonic dystrophy type 2 (DM2), Fuchs endothelial corneal dystrophy (FECD), and C9orf72 amyotrophic lateral sclerosis and frontotemporal dementia (C9-ALS/FTD). Here, we test the hypothesis that these GC-rich intronic microsatellite expansions selectively trigger host intron retention (IR). Using DM2, FECD, and C9-ALS/FTD as examples, we demonstrate that retention is readily detectable in affected tissues and peripheral blood lymphocytes and conclude that IR screening constitutes a rapid and inexpensive biomarker for intronic repeat expansion disease.

amyotrophic lateral sclerosis | intron retention | microsatellite | myotonic dystrophy | RNA splicing

Significance

A number of hereditary neurological and neuromuscular diseases are caused by the abnormal expansion of short tandem repeats, or microsatellites, resulting in the expression of repeat expansion RNAs and proteins with pathological properties. Although these microsatellite expansions may occur in either the coding or noncoding regions of the genome, trinucleotide CNG repeats predominate in exonic coding and untranslated regions while intron mutations vary from trinucleotide to hexanucleotide GC-rich, and A/AT-rich, repeats. Here, we use transcriptome analysis combined with complementary experimental approaches to demonstrate that GC-rich intronic expansions are selectively associated with host intron retention. Since these intron retention events are detectable in both affected tissues and peripheral blood, they provide a sensitive and disease-specific diagnostic biomarker.

Author contributions: Ł.J.S. and M.S.S. designed research; Ł.J.S., J.D.T., E.M.C., T.R., K.N.M., J.D.C., R.O., and C.A.N. performed research; E.M.C., T.R., K.N.M., J.D.C., K.B., T.A., C.A.T., and L.P.W.R. contributed new reagents/analytic tools; Ł.J.S., J.D.T., E.M.C., K.S., and M.S.S. analyzed data; Ł.J.S. performed graphics; and Ł.J.S., J.D.T., and M.S.S. wrote the paper.

Conflict of interest statement: M.S.S. is a member of the scientific advisory board of Locana, Inc.

This article is a PNAS Direct Submission.

Published under the PNAS license.

Data deposition: The data reported in this paper have been deposited in the Gene Expression Omnibus (GEO) database, https://www.ncbi.nlm.nih.gov/geo (accession no. GSE101824).

1 Ł.J.S. and J.D.T. contributed equally to this work.

2 To whom correspondence may be addressed. Email: [email protected] or mswanson@ufl.edu.

3 Present address: Computational Biology Program, Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1716617115/-/DCSupplemental.

Published online April 2, 2018.

4234–4239 | PNAS | April 17, 2018 | vol. 115 | no. 16 | www.pnas.org/cgi/doi/10.1073/pnas.1716617115

Repetitive elements are a common sequence feature of eukaryotic genomic DNAs and comprise as much as ∼70% of the human genome (1, 2). These repetitive sequences include transposable element families (DNA transposons and LTR and non-LTR retrotransposons) and simple sequence repeats, such as telomeric repeats and a variety of satellites (centromeric, micro, mini, and mega). Microsatellites, which are repeating units of ≤10 base pairs (bp), are a particularly prominent repetitive element class because they are highly polymorphic due to their tendency to form imperfect hairpins, slipped-stranded, quadruplex-like, and other structures resulting in elevated levels of DNA replication and repair errors (3, 4). While these errors result in both repeat contractions and expansions that may provide beneficial gene regulatory activities, expansions cause ∼30 human hereditary diseases (5, 6). Although human introns are significantly longer and denser in repetitive elements compared with exons (7), only eight microsatellite expansion disorders have been linked to intron repeat instability.

In this study, we examined the pathomolecular consequences of both GC- and A/AT-rich intronic microsatellite mutations associated with myotonic dystrophy type 2 (DM2), C9orf72-linked amyotrophic lateral sclerosis with frontotemporal dementia (C9-ALS/FTD), Fuchs endothelial corneal dystrophy (FECD), Friedreich's ataxia (FRDA), and spinocerebellar ataxia type 10 (SCA10). We demonstrate the GC-rich CCTG, GGGGCC, and CTG expansions lead to host intron retention (IR) in DM2, C9-ALS/FTD, and FECD, respectively, while A/AT-rich expansions in FRDA and SCA10 do not. Based on these and additional observations, we propose IR as an accessible and inexpensive biomarker for both diagnostic and therapeutic trial purposes.

Results

Sequence Diversity and Positional Bias of Intronic Microsatellite Expansions. The human genome contains ∼80,000 3- to 6-bp microsatellites in introns that could potentially undergo expansion, but only 8 tandem repeats have been documented to expand in hereditary disease (Fig. S1 and Dataset S1). While GC-rich trinucleotide expansions (exp) predominate in exonic regions, intron mutations are composed of 3- to 6-bp repeats that vary considerably in GC content (20–100%) (6, 8). Based on this sequence feature, we divided intronic expansions into GC- and A/AT-rich groups (Fig. 1A). In contrast to the majority of A/AU-rich microsatellite RNAs, GC-rich expansions are predicted to form highly stable RNA secondary structures (Fig. S2) (9), increase intron length substantially (Fig. 1B), and even multiply intron length several times, such as the SCA36-associated GGCCTG^exp mutation in NOP56 (Fig. S3A). SCA10 AUUCU repeats also fold into secondary structures consisting of UCU internal loops closed by AU pairs, but these structures are relatively unstable compared with the hairpins and G-quadruplexes formed by comparable-length GC-rich repeats (10, 11).

… three distinct metrics: (i) relative enrichment in reads spanning intron–exon junctions; (ii) average per base pair read coverage across the retained intron; and (iii) the fraction of intron-containing molecules (IR ratio) for all four CNBP introns using IRFinder (19, 20). As expected, IR was exclusively elevated for DM2 CNBP i1 with an IR ratio of ∼0.35, while splicing of introns 2, 3, and 4 was unaffected (Fig. 2B and Fig. S4 A–C). To confirm CNBP i1 retention using additional patient samples and an alternative experimental approach, we analyzed microarray datasets obtained from DM2, DM1, facioscapulohumeral muscular dystrophy (FSHD), and unaffected control muscle biopsies (21, 22) and observed statistically significant and specific CNBP i1 retention in DM2 compared with this large control cohort (Fig. 2C). This analysis of other muscular dystrophies increased our confidence that CNBP i1 retention was DM2-specific rather than reflecting a general myopathic feature (23). Since bidirectional transcription of CNBP i1 occurs (24), we quantified strand-specific RNA-seq read coverage to confirm that our analysis was not confounded by antisense transcription and found >99.9% of reads originated from sense molecules in muscles (Fig. S4D).

CNBP i1 retention was validated by using RT-PCR from biopsied skeletal muscle (tibialis anterior; TA) since RNA degradation is minimized in these samples. IR detection from the 3′ ss allowed selective amplification of the retained intron from pre-mRNA intermediates and simultaneous analysis of introns 1, 2, and 3 (Fig. 2D). In agreement with whole transcriptomic data, we …

[Figure 1 not reproduced. Panel A labels: UTRs (CCG, CGG, CAG, CTG); CDS (CAG, GCN); intron, GC-rich (CTG, CCTG, GGCCTG, GGGGCC); intron, A/AT-rich (GAA, ATTTC, ATTCT, TGGAA); size labels 0.1-0.8 kb and 1.3-75.9 kb. Panel B columns: disease acronym, associated gene, intron size range (drawn to a kb scale, not recoverable here), repeat expansion sequence, structure.]

Panel B entries recoverable from the text:

Disease    Gene      Repeat    Structure
SCA36      NOP56     GGCCTG    +
ALS/FTD    C9orf72   GGGGCC    +
FECD       TCF4      CTG       +
DM2        CNBP      CCTG      +
FRDA       FXN       GAA       -
SCA31      BEAN1     TGGAA     -
SCA10      ATXN10    ATTCT     -
SCA37      DAB1      ATTTC     -

Fig. 1. GC- and A/AT-rich intronic microsatellite expansion mutations. (A) Sequences of disease-associated microsatellites located in exons, including untranslated regions (UTRs) and coding …
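
As an illustration of two quantities used above, the sketch below computes the GC content of the eight intronic repeat units listed in Fig. 1B and a simplified IR ratio. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names (`gc_content`, `classify`, `ir_ratio`), the 0.5 cutoff separating the GC-rich and A/AT-rich groups, and the formula "intronic depth divided by intronic depth plus spliced reads" are all assumptions introduced here for illustration. IRFinder's actual computation involves additional filtering and normalization steps that are not reproduced.

```python
"""Sketch: GC content of repeat units and a simplified intron retention ratio."""

# Intronic repeat units (DNA) keyed by disease acronym, as listed in Fig. 1B.
INTRONIC_REPEATS = {
    "SCA36": "GGCCTG",
    "ALS/FTD": "GGGGCC",
    "FECD": "CTG",
    "DM2": "CCTG",
    "FRDA": "GAA",
    "SCA31": "TGGAA",
    "SCA10": "ATTCT",
    "SCA37": "ATTTC",
}

# Assumption: an illustrative cutoff; the text does not state a threshold.
GC_RICH_CUTOFF = 0.5


def gc_content(unit: str) -> float:
    """Return the fraction of G and C bases in a repeat unit."""
    unit = unit.upper()
    if not unit:
        raise ValueError("repeat unit must not be empty")
    return sum(base in "GC" for base in unit) / len(unit)


def classify(unit: str, cutoff: float = GC_RICH_CUTOFF) -> str:
    """Assign a repeat unit to the GC-rich or A/AT-rich group."""
    return "GC-rich" if gc_content(unit) > cutoff else "A/AT-rich"


def ir_ratio(intron_depth: float, spliced_reads: float) -> float:
    """Simplified IR ratio: fraction of intron-containing molecules.

    Assumption: intron_depth approximates the abundance of transcripts
    retaining the intron and spliced_reads the abundance of correctly
    spliced transcripts across the same junction.
    """
    if intron_depth < 0 or spliced_reads < 0:
        raise ValueError("depths must be non-negative")
    total = intron_depth + spliced_reads
    if total == 0:
        raise ValueError("no reads cover this intron or its junction")
    return intron_depth / total


if __name__ == "__main__":
    for disease, unit in INTRONIC_REPEATS.items():
        print(f"{disease:8s} {unit:7s} GC={gc_content(unit):.0%} {classify(unit)}")
    # Hypothetical read counts chosen only to illustrate a ratio of 0.35.
    print(f"IR ratio: {ir_ratio(35, 65):.2f}")
```

With these eight sequences the GC content spans 20% (ATTCT, ATTTC) to 100% (GGGGCC), matching the range quoted in the text, and the assumed cutoff places the four "+" structure entries of Fig. 1B in the GC-rich group. The read counts passed to `ir_ratio` are hypothetical and were chosen only so that the result equals the ∼0.35 reported for DM2 CNBP i1.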
