Trinucleotide Repeats: a Structural Perspective
Total Page:16
File Type:pdf, Size:1020Kb
REVIEW ARTICLE published: 20 June 2013 doi: 10.3389/fneur.2013.00076 Trinucleotide repeats: a structural perspective † † Bruno Almeida, Sara Fernandes , Isabel A. Abreu and Sandra Macedo-Ribeiro* Instituto de Biologia Molecular e Celular, Universidade do Porto, Porto, Portugal Edited by: Trinucleotide repeat (TNR) expansions are present in a wide range of genes involved in sev- Thomas M. Durcan, McGill University, eral neurological disorders, being directly involved in the molecular mechanisms underlying Canada pathogenesis through modulation of gene expression and/or the function of the RNA or Reviewed by: Denis Soulet, Laval University, Canada protein it encodes. Structural and functional information on the role of TNR sequences in Thomas M. Durcan, McGill University, RNA and protein is crucial to understand the effect of TNR expansions in neurodegenera- Canada tion.Therefore, this review intends to provide to the reader a structural and functional view *Correspondence: ofTNR and encoded homopeptide expansions, with a particular emphasis on polyQ expan- Sandra Macedo-Ribeiro, Instituto de sions and its role at inducing the self-assembly, aggregation and functional alterations of Biologia Molecular e Celular, Universidade do Porto, Rua do Campo the carrier protein, which culminates in neuronal toxicity and cell death. Detail will be given Alegre 823, 4150-180 Porto, Portugal to the Machado-Joseph Disease-causative and polyQ-containing protein, ataxin-3, provid- e-mail: [email protected] ing clues for the impact of polyQ expansion and its flanking regions in the modulation of †Present address: ataxin-3 molecular interactions, function, and aggregation. Sara Fernandes, Shannon ABC, Keywords: amino acid-repeats, microsatellites, protein complexes, protein aggregation, amyloid, protein structure Limerick Institute of Technology, Limerick, Ireland; Isabel A. Abreu, GplantS, Instituto de Tecnologia Química e Biológica, Oeiras, Portugal. TRINUCLEOTIDE REPEATS AND HUMAN DISEASE act in concert to influence the pattern of selective cell toxic- Trinucleotide repeat (TNR) expansions and their association with ity. Some of those toxicity mechanisms will be briefly discussed neurological disorders have been known for the past 20 years below. (La Spada et al., 1991). Expansion of CAG, GCG, CTG, CGG, TRINUCLEOTIDE REPEATS AND RNA STRUCTURE and GAA repeats located in coding or non-coding sequences of different genes (summarized in Table 1; Figures 1 and 2) are asso- The formation of hairpin structures within the TNR RNA is related ciated with a diverse range of human monogenic diseases such to the gain in RNA toxic function, the major pathogenic mecha- as Spinobulbar Muscular Atrophy (SBMA, a.k.a. Kennedy dis- nism associated with CUG and CGG repeat expansions in non- ease), Huntington Disease (HD), Spinocerebellar Ataxias (SCAs), coding regions of DM1 and FXTAS transcripts, which was also Oculopharyngeal Muscular Dystrophy (OPMD), Myotonic Type 1 shown to contribute to pathogenesis in CAG repeat disorders such (DM1), Fragile X-Associated Tremor Ataxia Syndrome (FXTAS), as HD and Machado-Joseph disease (MJD, a.k.a. SCA3) (reviewed and Friedreich Ataxia (FRDA) (for a review see Orr and Zoghbi, in Krzyzosiak et al., 2012). These duplex structures, whose sta- 2007), with longer repeats being correlated with earlier age at onset bility is positively correlated with the repeat size (Napierala and and increased disease severity. These TNR are highly unstable Krzyzosiak, 1997), sequester dsRNA binding proteins involved in and the repeat tract length can change between affected indi- mRNA splicing such as CUG-binding protein (CUGBP) and mus- viduals within the same family and can be different in different cleblind protein 1 (MBNL1) (Miller et al., 2000), inducing aber- tissues (La Spada, 1997; Brouwer et al., 2009). More interestingly, rant splicing in affected cells, compromising multiple intracellular in the brain of patients affected by CAG expansions, differences pathways, affecting cell-quality control regulation, and ultimately in repeat instability have been found between specific cell types resulting in cell dysfunction (Li and Bonini,2010). Structural stud- (Pearson et al., 2005; Gonitel et al., 2008; Lopez Castel et al., ies on model trinucleotide CUG, CAG, and CGG repeats forming 2010). GCG repeats are usually shorter and reveal a higher sta- double-stranded chains revealed the features induced by peri- bility in different tissues and across generations than CAG repeats. odic U-U, A-A, and G-G mismatches, and provided hints into the The dynamic nature of these DNA repeat expansions is a con- structural details of pathogenic RNAs that are recognized by RNA- sequence of their capability to form different secondary struc- binding proteins (Mooers et al., 2005; Kiliszek et al., 2010, 2011; tures, which interfere with the cellular mechanisms of replication, Kumar et al., 2011; Parkesh et al., 2011). MBNL1 is composed repair, recombination and transcription (for a recent review see of four zinc-containing RNA-binding domains arranged in two Lopez Castel et al., 2010). The molecular mechanisms underly- tandem segments, with the C-terminal zinc-finger pair displaying ing pathogenesis in those disorders, either associated with mental a GC-sequence recognition motif (Teplova and Patel, 2008) and retardation, neuronal, or muscular degeneration, might result interacting with the stem region of expanded CUG RNAs (Yuan 136 from alterations in the levels of gene expression and/or the func- et al., 2007). Electron microscopy analysis of MBNL1:CUG tion of the RNA or protein it encodes, mechanisms that likely complexes showed that the pathogenic dsRNA forms a scaffold www.frontiersin.org June 2013 | Volume 4 | Article 76 | 1 Almeida et al. Frontiers in Neurology Table 1 | Human diseases associated with nucleotide repeat expansions (adapted from Messaed and Rouleau, 2009; Lopez Castel et al., 2010; Matos et al., 2011). Disease name Repeat Repeat Gene Protein (UniProt Biological Normal Disease Protein structure determined? type location identifier, number processa repeat repeat of residues) length length Spinal and bulbar CAG Protein coding AR Androgen receptor Transcription, transcription 9–36 38–62 Residues 20–30 and 671–919 (PDB | Neurodegeneration muscular atrophy region (polyQ) (P10275, 919 residues) regulation code 1xow) (SBMA) Huntington’s disease CAG Protein coding HTT Huntingtin (P42858, 3142 Apoptosis 6–34 36–121 Residues 5–18 (3lrh), Residues 1–17 (HD) region (polyQ) residues) (2ld0, 2ld2), Residues 1–64 (3io4, 3io6, 3ior, 3iot, 3iou, 3iov, 3iow) Dentatorubral- CAG Protein coding ATN1 atrophin 1 (P54259, 1190 Transcription, transcription 7–34 49–88 No structural information pallidouysian atrophy region (polyQ) residues) regulation (DRPLA) Spinocerebellar CAG Protein coding ATXN1 ataxin 1 (P54253, 815 Transcription, transcription 6–39 40–82 Residues 563–693 (1oa8) ataxia 1 (SCA1) region (polyQ) residues) regulation Spinocerebellar CAG Protein coding ATXN2 ataxin 2 (Q99700, 1313 No associated GO 15–24 32–200 Residues 912–928 (3ktr) ataxia 2 (SCA2) region (polyQ) residues) keywords for biological process Spinocerebellar CAG Protein coding ATXN3/MJD ataxin 3 (P54252, 364 Transcription, transcription 10–51 55–87 Residues 1–182 (1yzb), Residues ataxia 3 (SCA3) region (polyQ) residues) regulation, Ubl conjugation 222–263 (2klz) pathway Spinocerebellar CAG Protein coding CACNA1 A CACNA 1A, P/Q-type a1A Calcium transport, ion 4–20 20–29 Residues 1955–1975 (3bxk) ataxia 6 (SCA6) region (polyQ) calcium channel subunit transport, transport (O00555, 2505 residues) Spinocerebellar CAG Protein coding ATXN7 ataxin 7 (O15265, 892 Transcription, transcription 4–35 37–306 Residues 330–401 (2kkr) ataxia 7 (SCA7) region (polyQ) residues) regulation Structure and function of trinucleotide repeats Spinocerebellar CAG Protein coding ATXN17 TATA box binding protein Transcription, transcription 25–42 47–63 Residues 159–337 (1cdw, 1c9b, 1jfi, ataxia 17 (SCA17) region (polyQ) (TBP) (P20226, 339 regulation, Host-virus 1nvp, 1tgh) residues) interaction June 2013 | Volume 4 | Article 76 Multiple skeletal GAC Protein coding COMP cartilage oligomeric matrix Apoptosis, cell adhesion 5 4, 6, 7 Residues 225–757 (3fby). dysplasias (COMP) region protein (a.k.a (polyaspartate) Thrombospondin-5) (P49747, 757 residues) Synpolydactyly GCG Protein coding HOXD13 homeobox D13 (P35453, Transcription, transcription 15 22–29 No structural information | (HOXD13) region (polyA) 343 residues) regulation (Continued) 2 Almeida et al. www.frontiersin.org Table 1 | Continued Disease name Repeat Repeat Gene Protein (UniProt Biological Normal Disease Protein structure determined? type location identifier, number processa repeat repeat of residues) length length Oculopharyngeal GCG Protein coding PABPN1 Polyadenylate-binding mRNA processing 10 12–17 Residues 167–254 (3b4d, 3b4m, Muscular Dystrophy region (polyA) protein 2 (Q86U42, 306 3ucg) (OPMD) residues) Cleidocranial GCG Protein coding RUNX2 Runt-related transcription Transcription; transcription 17 27 No structural information dysplasia (CBFA1) region (polyA) factor 2 (Q13950, 521 regulation residues) Holoprosencephaly GCG Protein coding ZIC2 Zinc-finger protein ZIC 2 Differentiation, 15 25 No structural information (ZIC2) region (polyA) (O95409, 532 residues) neurogenesis, transcription, transcription regulation Hand-Foot-Genital GCG Protein coding HOXA13 homeobox A13 (P31271, Transcription, transcription 18 24–26 No structural information Syndrome/HOXA13)