An Update on the Neurological Short Tandem Repeat Expansion Disorders and the Emergence of Long-Read Sequencing Diagnostics
Total Page:16
File Type:pdf, Size:1020Kb
Chintalaphani et al. acta neuropathol commun (2021) 9:98 https://doi.org/10.1186/s40478-021-01201-x REVIEW Open Access An update on the neurological short tandem repeat expansion disorders and the emergence of long-read sequencing diagnostics Sanjog R. Chintalaphani1,2, Sandy S. Pineda3,4 , Ira W. Deveson2,5 and Kishore R. Kumar2,6* Abstract Background: Short tandem repeat (STR) expansion disorders are an important cause of human neurological disease. They have an established role in more than 40 diferent phenotypes including the myotonic dystrophies, Fragile X syndrome, Huntington’s disease, the hereditary cerebellar ataxias, amyotrophic lateral sclerosis and frontotemporal dementia. Main body: STR expansions are difcult to detect and may explain unsolved diseases, as highlighted by recent fnd- ings including: the discovery of a biallelic intronic ‘AAGGG’ repeat in RFC1 as the cause of cerebellar ataxia, neuropathy, and vestibular arefexia syndrome (CANVAS); and the fnding of ‘CGG’ repeat expansions in NOTCH2NLC as the cause of neuronal intranuclear inclusion disease and a range of clinical phenotypes. However, established laboratory tech- niques for diagnosis of repeat expansions (repeat-primed PCR and Southern blot) are cumbersome, low-throughput and poorly suited to parallel analysis of multiple gene regions. While next generation sequencing (NGS) has been increasingly used, established short-read NGS platforms (e.g., Illumina) are unable to genotype large and/or complex repeat expansions. Long-read sequencing platforms recently developed by Oxford Nanopore Technology and Pacifc Biosciences promise to overcome these limitations to deliver enhanced diagnosis of repeat expansion disorders in a rapid and cost-efective fashion. Conclusion: We anticipate that long-read sequencing will rapidly transform the detection of short tandem repeat expansion disorders for both clinical diagnosis and gene discovery. Keywords: Tandem, Repeats, Expansion, Neurological, Clinical, Genetics, Disease, Diagnosis, Long-read, Sequencing Introduction during DNA replication, due to slippage events on mis- A large proportion of the human genome is comprised aligned strands, errors in DNA repair during synthesis of repetitive DNA sequences known as microsatellites and formation of secondary hairpin structures [43]. As a or short tandem repeats (STRs). STRs are small sec- result, STR lengths are relatively unstable, with their fre- tions of DNA, usually 2–6 nucleotides in length, that quent mutation providing a source of genetic variation in are repeated consecutively at a given locus. STRs make human populations. STRs have a mutation rate orders of up at least 6.77% of the human genome and are highly magnitude higher than single nucleotide polymorphisms polymorphic [143]. STR lengths are prone to alteration (SNPs) in non-repetitive contexts [58]. Larger repeats, in general, are more unstable and have an increased pro- pensity to expand during DNA replication. *Correspondence: [email protected] 2 Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Large STR expansions may become pathogenic, under- Research, Darlinghurst, NSW 2010, Australia pinning various forms of primary neurological disease. Full list of author information is available at the end of the article Tere are currently 47 known STR genes that can cause © The Author(s) 2021. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/. The Creative Commons Public Domain Dedication waiver (http:// creat iveco mmons. org/ publi cdoma in/ zero/1. 0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. Chintalaphani et al. acta neuropathol commun (2021) 9:98 Page 2 of 20 disease when expanded; 37 of these exhibit primary (RAN) translation, increased promoter activity, coding neurological presentations (see Table 1) while 10 pre- tract expansions and polyglutamine aggregation [85, sent with developmental abnormalities (see Table 2). 154, 180]. Repeat expansions in coding and non-coding With increased interest and improving molecular tech- regions may disrupt RNA function in many ways, with niques for detecting repeat expansions, the list of known multiple coexisting mechanisms potentially contribut- repeat expansion disorders is growing rapidly, with new ing to pathogenicity. For example, post-mortem exami- genes such as RFC1, GIPC1, LRP12, NOTCH2NLC and nation of brain tissue in patients with an expanded VWA1 recently implicated. Furthermore, STR expan- ‘GGG GCC ’ repeat in the 5’ region of C9orf72 ALS/ sions have been linked to complex polygenic diseases FTD, revealed multiple potential pathogenic RNA spe- such as heart disease, bipolar disorder, major depressive cies: RNA that had been stalled at repeat locations, disorder and schizophrenia [59]. Some theories also sug- RAN proteins, antisense transcription of repeat regions gest STR variability may account for normal brain and and alternative splicing of intron 1 containing the behavioural traits such as anxiety, cognitive function, repeat [48]. Tese species are considered “toxic” as they emotional memory and altruism [41]. Similarly, somatic accumulate as RNA foci within the neurons, astrocytes, instability at STR regions is a hallmark of many cancers microglia and oligodendrocytes and form complexes such as Lynch syndrome-related cancers, gastric cancers, with RNA-binding proteins to dysregulate translation colorectal cancers and endometrial cancers [174]. In this and modify transcription [48, 49]. review, we provide an overview of the primary neurologi- Te other common toxic GOF mechanism is expan- cal repeat expansion diseases, discuss limitations in cur- sion of homopolymer amino acid tracts resulting in rent diagnostic methods and developments in long-read misfolding and proteinopathy. In neurological repeat sequencing technologies that promise to improve the dis- expansion diseases, exonic ‘CAG’ repeat expansions covery and diagnosis of STR expansions. code for the amino acid glutamine; when expanded, they create polyglutamine tract expansions which can General characteristics of repeat expansion reach hundreds of amino acids long. Tis is thought disorders to alter and expand the transcribed protein creat- Molecular mechanisms ing insoluble protein aggregates within neuronal cells Repeat expansion diseases have a wide range of patho- (primarily in the cerebellum), leading to perturbations genic mechanisms, which depend on the location of the of intracellular homeostasis and cell death [81]. Tis expanded STR within a gene loci, and the nature and mechanism is commonly seen in the hereditary spi- function of the gene. It is often hard to determine the nocerebellar ataxias. In congenital and developmental specifc mechanism as multiple may occur simultane- repeat expansion diseases, exonic ‘GCG’ coding tracts ously and all may contribute to the disease form. Te expand to create polyalanine tract expansions (Table 2). mechanisms may be broadly categorised as loss-of-func- However, they are quite diferent to polyglutamine tract tion (LOF) or toxic gain-of-function (GOF). expansions seen in neurological repeat expansion dis- LOF mechanisms include hypermethylation and gene orders; they are smaller and generally meiotically sta- silencing [43, 132], defective transcription, and increased ble when transmitted between generations, thus they messenger RNA (mRNA) degradation [154]; all efects do not exhibit the same large pathogenic range seen in that can be elicited by an STR expansion within a gene neurological repeat expansion disorders. For example, a locus. DNA methylation is an epigenetic process that normal allele in HOXA13 contains 15–18 alanine resi- contributes to genome stability and maintenance, and dues while a pathogenic allele only contains between regulation of gene expression during development, with 7 and 15 extra residues [50]. Tus, the mechanism of aberrant methylation profles often implicated in disease mutation in polyalanine disorders is thought to be dif- [2]. Large expanded STRs may induce local hypermethyl- ferent and hypothesised to be due to unequal crossing ation, thereby silencing gene expression. One such classic between mispaired alleles and duplication during rep- example is an expanded STR in the promoter region of lication rather than dynamic trinucleotide expansions FMR1, seen in Fragile X syndrome (FXS). Te expansion [164]. Tis would explain the relative stability of trans- causes hypermethylation of the FMR1 promoter region mission and small pathogenic ranges. Furthermore, leading to silencing of transcription and LOF in the these polyalanine tract repeat expansion disorders FMR1 gene. Terefore, the methylation state of relevant are more commonly