OPEN Citation: Human Genome Variation (2018) 5, 18003; doi:10.1038/hgv.2018.3 Official journal of the Japan Society of Human Genetics www.nature.com/hgv

ARTICLE Identification of mutations in HEXA and HEXB in Sandhoff and Tay-Sachs diseases: a new large deletion caused by Alu elements in HEXA

Hassan Dastsooz1,3, Mohsen Alipour1,3, Sanaz Mohammadi1, Fatemeh Kamgarpour1, Fatemeh Dehghanian1 and Majid Fardaei1,2

GM2 gangliosidoses are a group of lysosomal lipid storage disorders that are due to mutations in HEXA, HEXB and GM2A. In our study, 10 patients with these diseases were enrolled, and Sanger sequencing was performed for the HEXA and HEXB genes. The results revealed one known splice site mutation (c.346+1G>A, IVS2+1G>A) and three novel mutations (a large deletion involving exons 6–10; a one-nucleotide deletion, c.622delG [p.D208Ifsx15]; and a missense mutation, c.919G>A [p.E307K]) in HEXA. In HEXB, one known mutation (c.1597C>T [p.R533C]) and one variant of uncertain significance (c.619A>G [p.I207V]) were identified. Five patients had c.1597C>T in HEXB, indicating a common mutation in south Iran. In this study, a unique large deletion in HEXA was identified in a homozygous state. To predict the cause of the large deletion in HEXA, RepeatMasker was used to investigate the Alu elements. In addition, to identify the breakpoint of this deletion, PCR was performed around these elements. Using RepeatMasker, different Alu elements were identified across HEXA, mainly in intron 5 and intron 10 adjacent to the deleted exons. PCR around the Alu elements and Sanger sequencing revealed the start point of the large deletion in AluSz6 in the intron 6 and the end of its breakpoint 73 nucleotides downstream of AluJo in intron 10. Our study showed that HEXA is an Alu-rich gene that predisposes individuals to disease-associated large deletions due to these elements.

Human Genome Variation (2018) 5, 18003; doi:10.1038/hgv.2018.3; published online 15 March 2018

INTRODUCTION
GM2 gangliosidoses are a group of lysosomal lipid storage disorders that result from disease-causing mutations in three autosomal recessive genes: hexosaminidase subunit alpha (HEXA), hexosaminidase subunit beta (HEXB) and GM2 ganglioside activator (GM2A). The HEXB gene produces a subunit of two related enzymes, beta-hexosaminidase A and beta-hexosaminidase B. Both enzymes have two subunits. Beta-hexosaminidase A is made from one alpha subunit produced from the HEXA gene and one beta subunit produced from the HEXB gene, whereas beta-hexosaminidase B is made from two beta subunits encoded by the HEXB gene. Beta-hexosaminidase A and B, which are mainly found in lysosomes, break down sphingolipids, oligosaccharides and glycoproteins, and are critical in the central nervous system because they break down the GM2 ganglioside in this system, preventing the accumulation of this product and its damage to the nervous system.1–3

According to the Human Gene Mutation Database (HGMD, http://www.hgmd.cf.ac.uk/ac/index.php), more than 182 disease-causing mutations of the HEXA gene have been identified in patients affected by Tay-Sachs disease (MIM #272800), including missense/nonsense, 96; splicing, 33; small deletions, 28; small insertions, 8; small indels, 3; and gross deletions, 1 (7.6 kb incl. promoter+ex. 1). Regarding the HEXB gene, to date, more than 107 pathogenic mutations have been reported in patients affected by Sandhoff disease (MIM #268800), including missense/nonsense, 39; splicing, 18; small deletion, 18; small insertion, 3; and gross deletion, 6. These mutations result in the accumulation of GM2 ganglioside in the central nervous system, damaging this system and producing the signs and symptoms identified in patients with the corresponding disorders. Depending on whether the HEXA and HEXB proteins are fully nonfunctional or have reduced activity as a result of mutations in their genes, these disorders present as a severe form found in infancy or a less severe type with a later onset.

Tay-Sachs and Sandhoff diseases (which are clinically indistinguishable from each other) are autosomal recessive, progressive neurodegenerative disorders. Their classical infantile forms develop in infancy with developmental retardation, paralysis, dementia, blindness and death usually by 2 or 3 years of age.1,2,4–6 A study conducted by Smith et al.3 showed that the clinical findings in infantile-onset disease were generally as follows: hyperacusis with excessive startle response in 85% of individuals; increased myotatic reflexes in 85% of individuals (children with infantile-onset disease were not able to walk independently); a pattern of combined axial hypotonia with limb spasticity in 86% of individuals; visual deterioration in 89% of individuals; and seizures in 84% of individuals. Considering the importance of identifying genetic causes in our patients whose findings are suggestive of Tay-Sachs and Sandhoff diseases, the aim of our study was to identify disease-causing mutations in these patients from the south of Iran and report their clinical findings.

MATERIALS AND METHODS
Ten unrelated patients suspected of having GM2 gangliosidosis from the south of Iran were recruited in the current study. All patients gave

1Comprehensive Medical Genetic Center, Shiraz University of Medical Sciences, Shiraz, Iran and 2Medical Genetics Department, Shiraz University of Medical Sciences, Shiraz, Iran. Correspondence: M Fardaei ([email protected]) 3These authors contributed equally to this work. Received 31 August 2017; revised 21 December 2017; accepted 21 December 2017
New mechanism of genetic mutation in HEXA. H Dastsooz et al

informed consent before undergoing genetic analysis of the HEXB and HEXA genes based on the requirements of the ethics committee in Comprehensive Medical Genetics Center, Shiraz University of Medical Sciences. Three milliliters of whole blood from each patient was collected in EDTA tubes and stored at −20 °C until use. Genomic DNA was prepared from peripheral blood lymphocytes using a DNA extraction kit (Yekta Tajhiz, Tehran, Iran) according to the manufacturer's instructions. The genomic DNA concentration was measured by a NanoDrop (ND1000, NanoDrop Technologies, Wilmington, DE, USA), and the DNA was stored at −20 °C until use.

All of the exons and exon–intron boundaries of the HEXA and HEXB genes were amplified by the PCR primers given in Table 1. These primer pairs were designed and evaluated according to their reference genomic sequences in Ensembl (https://www.ensembl.org) and the use of NCBI BLAST (https://blast.ncbi.nlm.nih.gov), Ensembl BLAST/BLAT (https://www.ensembl.org), UCSC (BLAT and In Silico PCR, https://genome.ucsc.edu) and Allele ID 7.5 (a bioinformatics software that helps design primers and probes). The total volume of the PCR reaction was 50 μl, which contained 1 μl each of the forward and reverse primers (20 pmol/μl), 3 μl of DNA template (50–200 ng), 25 μl of TEMPase Hot Start 2x Master Mix Blue (Ampliqon, A290806) and 20 μl of dH2O. PCR was carried out according to the Ampliqon TEMPase Hot Start program. Ten microliters of PCR amplicons were visualized on a 2% agarose gel containing SYBR Safe (Invitrogen, Paisley, Scotland, UK). Then, the PCR products were sequenced using Sanger sequencing, and their results were analyzed using NCBI BLAST and CodonCode Aligner software (CodonCode Corporation, Centerville, MA, USA).

In addition, to predict the cause of a new large deletion in the HEXA gene due to Alu elements, RepeatMasker (http://www.repeatmasker.org) was used with the RMBlast search engine (a version that was compatible with the NCBI Blast tool suite). The RepeatMasker program screens DNA sequences for interspersed repeats and low complexity DNA sequences. Sequence comparisons in RepeatMasker are carried out using one of the different search engines, including nhmmer, cross_match, ABBlast/WUBlast, RMBlast and Decypher. RepeatMasker uses information from curated libraries of repeats, Dfam and Repbase. We investigated any Alu repeats (especially Alu repeats flanking deleted exons) in the whole genomic structure of HEXA, covering a total of 34,695 bp, starting 1.4 kb upstream of exon 1 to 1.4 kb downstream of exon 14. Moreover, to identify the breakpoint of the identified large deletion around exons 6 and 10 in our patient, we performed a second PCR around the predicted Alu elements using the primers given in Table 1 (the two forward primers upstream of AluJo and AluSx in intron 5, and the reverse primer downstream of AluJo in intron 10). The PCR products were then used for Sanger sequencing. In addition, a coding sequence translator (http://www.fr33.net/translator.php) was used to predict the possible effect of this deletion on the amino-acid sequence of the HEXA protein.

To predict the functional effects and conservation status of a newly identified missense mutation in another patient, we also used different bioinformatics tools, including MutationTaster, MutationAssessor, PolyPhen, Grantham, PhastCons, GERP, CADD, FATHMM and SIFT. The T-Coffee multiple sequence alignment program (http://www.ebi.ac.uk/Tools/msa/tcoffee/) was applied to compare the amino-acid sequence of HEXA proteins across different species to obtain information about the conservation of amino acids at the site of mutation.

RESULTS
One known splice site mutation (NM_000520 (HEXA-201): c.346+1G>A, IVS2+1G>A)7 and three novel HEXA mutations were identified: NM_000520 (HEXA-201): c.919G>A [p.E307K]; the deletion of exons 6–10 in a homozygous state; and NM_000520 (HEXA-201): c.622delG [p.D208Ifsx15], which is suspected as being homozygous. One variant of uncertain significance (NM_000521 (HEXB-201): c.619A>G [p.I207V])8 and one known mutation (NM_000521 (HEXB-201): c.1597C>T [p.R533C])9 in HEXB were found in a homozygous state (Table 2). Of the 10 patients, 5 had only the homozygous p.R533C mutation in HEXB, indicating its high frequency in our population. For the first time, a large homozygous deletion (deletion of exons 6–10) in HEXA was identified in our patient.

Using the RepeatMasker tool, different Alu elements were found across the HEXA gene (Figure 1), including one Alu at the 5ʹ untranslated region, twelve in intron 1, one in intron 3, four in intron 5 flanking exon 6 (the first exon in the deleted block of five exons), one in intron 7, one in intron 10 flanking exon 10 (the last exon in the deleted block) and three Alus at the 3ʹ untranslated region. The exact breakpoints of the deletion around exons 6 and 10 were determined using different PCR products flanking the Alu elements in the introns adjacent to the deleted exons. In addition, the consequence of this deletion was predicted to be an in-frame deletion of 192 amino acids of the HEXA protein, leading to its abnormal function (Figures 2a–d and 3). In the patient with this deletion, PCR did not show any band for exons 6–10, but the patient had PCR bands around Alu elements in introns 5 and 10 (1,000 bp for AluSx-AluJo and 1,500 bp for AluJo-AluJo), revealing the homozygous status. In her parents, all of the exons produced

Table 1. All of the primer pairs used in the current study

HEXA primer | HEXA sequence | HEXB primer | HEXB sequence
FE1 | 5ʹ-CGGTTATTTACTGCTCTACTGG-3ʹ | FE1 | 5ʹ-TGTTTGAGGTTGCTGTCTGG-3ʹ
RE1 | 5ʹ-GAAGTGGAGTGCCTGTGA-3ʹ | RE1 | 5ʹ-TCGAAACTGAGGACTTCTGC-3ʹ
FE2 | 5ʹ-TCCTCCCTTTCCTTTACC-3ʹ | FE2 | 5ʹ-ATCTCTAGTTGGACTTACAATG-3ʹ
RE2 | 5ʹ-CGAGCATCAGCAGTTTAG-3ʹ | RE2 | 5ʹ-CTCATCCATATAGTGACAGAAC-3ʹ
FE3 | 5ʹ-GACATCTCCTCATTGAAAGA-3ʹ | FE3 | 5ʹ-TGAAATGAGGAACACAGAAGAC-3ʹ
RE3 | 5ʹ-ACACCTGTAATCCCAGAAC-3ʹ | RE3 | 5ʹ-TCACAAAGCACCAACTGAA-3ʹ
FE4 | 5ʹ-AACCCTTACTCTGACATCTC-3ʹ | FE4-5 | 5ʹ-TTAGAGGATAGAGGTATAACACTA-3ʹ
RE4 | 5ʹ-CACTCACATCTCCTCTTCCA-3ʹ | RE4-5 | 5ʹ-AAACAGGAGGGTGATTCTC-3ʹ
FE5 | 5ʹ-TGAGAACAGTCACAGATTG-3ʹ | FE6 | 5ʹ-AAGCAGACATATTGGAAGCA-3ʹ
RE5 | 5ʹ-ACACCCAGTCCATACATTC-3ʹ | RE6 | 5ʹ-AGCCTACATACCTAACATTGG-3ʹ
FE6-7 | 5ʹ-ATGGGAAGGTTTGATAGAC-3ʹ | FE7 | 5ʹ-TCCTTTGAGTATGTACGACTTAG-3ʹ
RE6-7 | 5ʹ-TCACTCTGAGCATAACAAG-3ʹ | RE7 | 5ʹ-CAGTGAGCCGAGATTGTAC-3ʹ
FE8 | 5ʹ-GTGTGACTCGTGTCCTTAC-3ʹ | FE8 | 5ʹ-AAGAGACAGGATTCAGGA-3ʹ
RE8 | 5ʹ-AAGACCTGAGCAATGTGAG-3ʹ | RE8 | 5ʹ-AGTACAGTGGCATGATCTCAG-3ʹ
FE9-10 | 5ʹ-CCACCTGCTTCACATAACT-3ʹ | FE9 | 5ʹ-AGGTGGTAAGGTAAAGAAAGC-3ʹ
R9-10 | 5ʹ-GAGAGTGCTCCGACCATTA-3ʹ | RE9 | 5ʹ-AAGCAAGCAGTGGGTATTG-3ʹ
FE11-12-13 | 5ʹ-CATGTTGCCTGTGTATGAAT-3ʹ | FE10-11 | 5ʹ-CTATGTTAGGTTCTGTATCAAG-3ʹ
RE11-12-13 | 5ʹ-AAGCCTCTGTAAGTGTTAGC-3ʹ | RE10-11 | 5ʹ-ACAATGGTTGCTTCACTTAC-3ʹ
FE14New | 5ʹ-ACAGTGACAGTTTGAGAGTC-3ʹ | FE12-14 | 5ʹ-ATTGCCTCTGTGTATAAG-3ʹ
RE14New | 5ʹ-TGCCTATTAATTCCAGTTACT-3ʹ | RE12-14 | 5ʹ-TAATGAGAAATGTGCTTTG-3ʹ
FHEXA-AluSx | 5ʹ-TTCTAGGGTGATGGAGCAAC-3ʹ | |
FHEXA-AluJo | 5ʹ-TGAGCTGCTCCAAGACTCAA-3ʹ | |
RHEXA-AluJo | 5ʹ-AAGGTTCCTGAAGAGTCCTTG-3ʹ | |
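As a quick sanity check on primers such as those in Table 1, basic properties can be computed directly from the sequences. The sketch below is illustrative only and is not the procedure used in this study (the primers were designed with Allele ID 7.5 and checked with BLAST/BLAT). It assumes the simple Wallace rule, Tm ≈ 2(A+T) + 4(G+C) °C, which is only a rough estimate for short oligonucleotides; the function names are hypothetical.

```python
# Illustrative primer checks; function names are hypothetical, not from the study.

def gc_percent(seq: str) -> float:
    """Return the GC content of a primer as a percentage."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature (deg C) by the Wallace rule: 2(A+T) + 4(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# The three Alu-flanking primers listed in Table 1.
ALU_PRIMERS = {
    "FHEXA-AluSx": "TTCTAGGGTGATGGAGCAAC",
    "FHEXA-AluJo": "TGAGCTGCTCCAAGACTCAA",
    "RHEXA-AluJo": "AAGGTTCCTGAAGAGTCCTTG",
}

if __name__ == "__main__":
    for name, seq in ALU_PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC {gc_percent(seq):.1f}%, Tm ~{wallace_tm(seq)} C")
```

Primer pairs with similar estimated Tm values are generally easier to amplify together, which is one reason such a check can be useful before running a PCR.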

Table 2. Identified mutations in HEXA and HEXB, and the clinical and laboratory findings in this study

Patient | Mutation | Gene | Exon | Novel or known | Laboratory findings | Clinical findings
Case-AMA | Homo IVS2+1G>A | HEXA | - | Known | NA | Onset: 1 year. Death: 3.8 years. Psychomotor degeneration, mild vision deficiency, recurrent seizures, paralysis
Case-MOE | Suspected Homo c.622delG (found in both of her parents) | HEXA | 6 | Novel | NA | Onset: 6 months. Death: 4.5 years. Seizures (sometimes), mild vision deficiency, psychomotor degeneration, paralysis
Case-DAN | Homo exon 6–10 deletion (new) | HEXA | - | Novel | HexA: 0.00 nmol/ml/min (normal: 0.96–1.78); HexAB: 51.66 nmol/ml/min (normal: 18.59–31.33); HexB: 54.51 nmol/ml/min (normal: 5.76–15.77) | Onset: 5 months. Death: 2.6 years. Psychomotor degeneration, mild vision deficiency, seizures (sometimes), paralysis
Case-RAM | Homo c.919G>A, p.E307K | HEXA | - | Novel | NA | Onset: 8 months. Death: 2.1 years. Recurrent seizures, vision deficiency, psychomotor degeneration, paralysis
Case-NEGAH | Homo c.619A>G, p.I207V | HEXB | - | VOUS | NA | Onset: 7.5 months. Death: 2.5 years. Seizures (rarely), vision deficiency, psychomotor degeneration, paralysis
Case-MOHR | Homo c.1597C>T, p.R533C | HEXB | - | Known | HexAB: 0.2 mU/ml (normal: 16.7–40.7); HexA: 0.02 mU/ml (normal: 0.47–2.60); HexB: 0.02 mU/ml (normal: 3.94–14) | Onset: 8 months. Death: 2.5 years. Psychomotor degeneration, vision deficiency, seizures (sometimes), paralysis
Case-HANIE | Homo c.1597C>T, p.R533C | HEXB | - | Known | HexAB: 0.8 mU/ml (normal: 16.7–40.7); HexA: 0.11 mU/ml (normal: 0.47–2.60); HexB: 0.06 mU/ml (normal: 3.94–14) | Onset: 9 months. Death: 2 years. Psychomotor degeneration, vision deficiency, recurrent seizures, paralysis
Case-YAS | Homo c.1597C>T, p.R533C | HEXB | - | Known | HexAB: 0.0 mU/ml (normal: 16.7–40.7); HexA: 0.06 mU/ml (normal: 0.47–2.60); HexB: 0.02 mU/ml (normal: 3.94–14) | Onset: 1.6 years. Death: 5 years. Psychomotor degeneration, vision deficiency, recurrent seizures, hypotonia, paralysis
Case-TALP | Homo c.1597C>T, p.R533C | HEXB | - | Known | NA | Onset: 1.6 years. Death: 5 years. Psychomotor degeneration, vision deficiency, seizures, paralysis
Case-SADA | Homo c.1597C>T, p.R533C | HEXB | - | Known | NA | Onset: 6 months. Death: 4.5 years. Psychomotor degeneration, vision deficiency, seizures (at the late stage of disease), hypotonia, poor head control, paralysis

Abbreviations: HexA, beta-hexosaminidase A; HexB, beta-hexosaminidase B; HexAB, beta-hexosaminidase A and B; Homo, homozygous; NA, not applicable; VOUS, variant of uncertain significance.
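The laboratory values in Table 2 can be read against their normal ranges with a small helper. The sketch below is a simplified illustration rather than a diagnostic rule used in this study: it assumes the commonly described biochemical pattern in which an isolated HexA deficiency points toward a HEXA defect (Tay-Sachs disease) and a combined HexA and HexB deficiency points toward a HEXB defect (Sandhoff disease). The function names are hypothetical.

```python
# Simplified, illustrative reading of hexosaminidase assay values.
# Function names are hypothetical; this is not a clinical decision rule.

def flag(value: float, low: float, high: float) -> str:
    """Classify a measured value against its normal range [low, high]."""
    if value < low:
        return "low"
    if value > high:
        return "high"
    return "normal"


def hex_pattern(hex_a_flag: str, hex_b_flag: str) -> str:
    """Map HexA/HexB flags to the broad biochemical pattern they suggest."""
    if hex_a_flag == "low" and hex_b_flag == "low":
        return "combined HexA and HexB deficiency (Sandhoff-like)"
    if hex_a_flag == "low":
        return "isolated HexA deficiency (Tay-Sachs-like)"
    return "no HexA deficiency"


if __name__ == "__main__":
    # Case-DAN (HEXA exon 6-10 deletion), values in nmol/ml/min from Table 2.
    dan = hex_pattern(flag(0.00, 0.96, 1.78), flag(54.51, 5.76, 15.77))
    # Case-MOHR (HEXB p.R533C), values in mU/ml from Table 2.
    mohr = hex_pattern(flag(0.02, 0.47, 2.60), flag(0.02, 3.94, 14.0))
    print("Case-DAN:", dan)
    print("Case-MOHR:", mohr)
```

With the Table 2 values, the two patterns agree with the gene in which each patient's mutation was found.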

Figure 1. Different Alu elements across the HEXA gene. The most prevalent Alu elements across this gene belong to the AluS class, mainly AluSx. As seen, the possible mechanism for the deletion of exons 6–10 is recombination between AluSz6 in intron 5 and AluJo in intron 10.

Figure 2. The breakpoint of the deletion and deleted region in the HEXA gene. (a) The red arrows show the breakpoint of the deletion (the region between these red arrows was completely deleted), blue arrows show the deleted region in the Sanger sequencing chromatogram and blue letters indicate the deleted nucleotides, including all of exons 6–10. Blue-highlighted nucleotides show the sequence of AluSz6, and yellow-highlighted nucleotides represent AluJo. (b) The homology between AluSz6 and AluJo is indicated as 76%, which is used for recombination between these two Alu elements. (c) Sequencing data confirmed that the breakpoint of the large deletion in HEXA was due to the Alu elements. (d) Normal and mutant HEXA proteins that resulted from the large deletion are compared, indicating the loss of 192 amino acids from the protein. In the normal sequence, the brown underline shows the deleted residues, and in the mutant sequence, the underline shows the breakpoints of the deletion and also the two joined amino acids after this deletion.
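Two of the quantities in this caption follow from simple arithmetic: whether a deletion keeps the reading frame, and the percent identity between two aligned sequences. The sketch below is a minimal illustration under stated assumptions: the aligned sequences are toy strings rather than the real AluSz6 and AluJo sequences, identity is counted over all aligned columns, and the function names are hypothetical.

```python
# Illustrative helpers; toy data only, function names are hypothetical.

def is_in_frame(deleted_coding_nt: int) -> bool:
    """A coding deletion preserves the reading frame if its length is a multiple of 3."""
    return deleted_coding_nt % 3 == 0


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns with identical residues (inputs must be pre-aligned)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Loss of 192 amino acids corresponds to 192 * 3 = 576 coding nucleotides.
    print(is_in_frame(192 * 3))   # in-frame deletion
    # A single-nucleotide deletion such as c.622delG shifts the frame.
    print(is_in_frame(1))
    # Toy alignment, not real Alu sequence.
    print(percent_identity("ACGTACGT", "ACGTTCGA"))
```

The same identity function applied to a real pairwise alignment of the two Alu elements would give the kind of figure reported in panel (b).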


Figure 3. The similarity between AluSz6 and AluJo and its flanking sequence. There was high homology between these Alu elements, and the microhomology between 18 nucleotides of the AluSz6 and the sequence at the breakpoints helps to determine the occurrence of recombination between these sequences, leading to Alu-mediated deletion. Blue letters show homology between AluSz6 and AluJo. Red letters show microhomology at the breakpoints.
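Microhomology of the kind described in this caption can be searched for by finding the longest run of nucleotides shared between the sequences on either side of a breakpoint. The sketch below is a minimal illustration using a standard longest-common-substring dynamic program; the input strings are toy sequences, not the actual HEXA breakpoint sequences, and the function names are hypothetical. The 5–25 nucleotide window is the range quoted in the Discussion for microhomology-mediated end joining.

```python
# Illustrative microhomology search; toy sequences, hypothetical function names.

def longest_shared_run(a: str, b: str) -> str:
    """Return the longest contiguous stretch present in both sequences."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous position of both strings.
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return a[best_end - best_len:best_end]


def in_mmej_range(length: int, low: int = 5, high: int = 25) -> bool:
    """True if a microhomology length falls in the 5-25 nt window cited in the text."""
    return low <= length <= high


if __name__ == "__main__":
    left_flank = "GGCTCACGCCTGTAAT"   # toy sequence
    right_flank = "TTTCACGCCTGAAA"    # toy sequence
    run = longest_shared_run(left_flank, right_flank)
    print(run, len(run), in_mmej_range(len(run)))
```

This exact-match search ignores mismatches; a real analysis would usually start from an alignment of the breakpoint-flanking sequences.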

Figure 4. PCR products from the patient with deletion of exons 6–10 in HEXA. (a) PCR for different exons of HEXA in the patient in whom exons 6–10 were deleted, as shown by the absence of PCR bands for these exons (here, we show only the deletion of exons 6–8 from this deleted block). (b) PCR products using primers upstream of AluJo and AluSx in intron 5 and downstream of AluJo in intron 10. PCR revealed that the proband with the exon 6–10 deletion only had PCR bands produced from primers designed around the Alu elements (1,000 bp for AluSx-AluJo and 1,500 bp for AluJo-AluJo). These reduced-size products, amplified across the deleted exon 6–10 block, together with the absence of exon bands, indicate the homozygous status. In our patient's parents, not only were the bands amplified from the primers around the Alu elements detected but all of the exons of HEXA also produced PCR bands (here, only the PCR bands around the Alu elements are given for parents), indicating their heterozygous status. E, exon.
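The reasoning in this caption, which combines exon-specific PCR with PCR across the Alu-flanked junction, can be written as a small decision table. The sketch below is an illustrative simplification and assumes that the junction product is only amplified from an allele carrying the deletion; the function name and the returned labels are hypothetical.

```python
# Illustrative decision table for the two PCR assays; names are hypothetical.

def infer_deletion_genotype(exon_band_present: bool, junction_band_present: bool) -> str:
    """Infer deletion status from exon 6-10 PCR and Alu-flanking junction PCR."""
    if junction_band_present and not exon_band_present:
        # Only the shortened junction product amplifies: no intact allele.
        return "homozygous deletion"
    if junction_band_present and exon_band_present:
        # One deleted allele plus one intact allele.
        return "heterozygous deletion"
    if exon_band_present:
        return "no deletion detected"
    return "inconclusive"


if __name__ == "__main__":
    print("proband:", infer_deletion_genotype(False, True))
    print("parent:", infer_deletion_genotype(True, True))
```

The "inconclusive" branch covers the case in which neither assay yields a band, for example after a failed reaction.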

PCR bands and bands around Alu elements, indicating their heterozygous status (Figure 4). Using PCR around Alu elements (AluJo and AluSx in intron 5, and AluJo in intron 10) followed by Sanger sequencing, the results revealed that the breakpoints of the large deletion occurred in AluSz6 (its length was 312 nucleotides, and its breakpoint was in nucleotide 307 of this Alu) in the intron 6 and 73 nucleotides downstream of AluJo in intron 10 (this deletion also contained the whole AluJo sequence; Figure 2a, c). The names of the Alu elements were determined from the results of RepeatMasker and were extracted from Dfam and Repbase.

In addition to a novel large deletion, a new c.919G>A [p.E307K] missense mutation in HEXA was also identified as homozygous in one patient and was confirmed as being heterozygous using Sanger sequencing in her parents (Figure 5a). Moreover, different bioinformatics tools supported the pathogenicity of this mutation (Table 3). In addition, the T-Coffee tool revealed that the glutamate residue at position 307 was highly conserved during evolution among different species (Figure 5b). Another new mutation, c.622delG [p.D208Ifsx15], was identified as being heterozygous in the parents of Case-MOE (Table 2 and Figure 5c). Because the patient died before the time of genetic testing, we could only identify this deleterious mutation by performing Sanger sequencing of the whole HEXA and HEXB genes in both parents. Therefore, this case is suspected to be homozygous.

DISCUSSION
The current study revealed the structure and distribution of a large deletion mutation consisting of exons 6–10 of HEXA in Tay-Sachs disease. The deletion was 4.6 kb long and spanned from 576 bp upstream of exon 6 to 475 bp downstream of exon 10. As shown in Figure 1, Alu elements are located across the normal HEXA gene and mainly flank the deleted region, demonstrating the involvement of these elements in large deletions in HEXA.

Alu elements are short interspersed elements that are moderately repetitive sequences and have a copy number of more than 1 million copies across the human genome (~11% of the human genome). Alu elements are considered to be one of the most successful mobile elements.10,11 They are ~300 nucleotides in length and have a dimeric structure. The 3ʹ-end of an Alu element has a poly A-like structure, which plays a key role in its amplification mechanism.12–15 There are different subfamilies of Alu sequences that share similar nucleotides in each group, and the most common Alu subfamily in the human genome is the Alu S class (the most common Alu is AluSx).16 Alu sequences are


Figure 5. (a) Sanger data from the family with a novel c.919G>A (p.E307K) missense mutation. The reference sequence was obtained from the Ensembl database (https://www.ensembl.org/). Reference nucleotides and residues are highlighted in yellow. (b) Comparative alignment of the amino acids of HEXA across different species. The E307 residue is highly conserved during evolution. The conserved residue is shown in the rectangular box. Protein sequences were obtained from the National Center for Biotechnology Information (NCBI). Symbols: (*), identical amino acids; (:), only similar amino acids. Pongo abelii: one of two species of orangutans; Macaca fascicularis: one species of macaque; Bos taurus (domestic cow): a species of oxen; Ovis aries: domestic sheep; Felis catus: domestic cat; Sus scrofa: wild boar or Eurasian wild pig; Rattus norvegicus: brown rat, also referred to as common rat, street rat and Norway rat; Mus musculus: house mouse; Gallus gallus (the red junglefowl): a type of domesticated fowl. (c) The Sanger chromatogram revealed a novel heterozygous c.622delG, p.D208Ifsx15 in both parents of the proband. Reference nucleotides are highlighted in yellow.

Table 3. Bioinformatics analysis of the c.919G>A [p.E307K] mutation in HEXA with the corresponding damage scores

Bioinformatics tool | Score or effect
PolyPhen | 1
CADD score | 35
AsianHapMapFreq | NA
SIFT | 0 (damaging)
GERP conservation score | 5.6
Mutation Assessor | 4.425
PhastCons score | 0.99
Grantham score | 56
ChimpAllele | C
DbSNPValidation | NA
FATHMM | −4.74 (damaging)
EuropeanHapMapFreq | NA
MutationTaster | Disease-causing
AfricanHapMapFreq | NA

Abbreviation: NA, not applicable.

responsible for the instability of the genome, which results from intrachromosomal and interchromosomal recombination events between these sequences.17,18 Deletions due to Alu repeats have been identified in different previously reported genes, such as the low-density lipoprotein receptor,19 NACHT, leucine-rich repeat, PYD-containing 720 and HEXB genes.21

It is worth noting that while most breakpoints of deletions occur in Alu elements, one of the deletion breakpoints in our study was observed 73 bp downstream of the AluJo element in intron 10 (AluJo and 73 bp after this element were deleted along with the deleted block). As shown in Figures 1 and 2, the deletion occurred due to recombination between AluSz6 and AluJo as the breakpoint occurred in the last nucleotides of AluSz6 and 73 bp downstream of AluJo. Although there is an AluJo in intron 6, its direction is opposite to that of the AluJo located in intron 10 (Figure 1; the same direction is needed to cause a deletion when Alu/Alu recombination occurs). In addition, the similarity between AluSz6 and AluJo is 76% (Figure 2b), which is suitable for recombination: Alu/Alu recombination occurs when Alu elements are completely identical (homologous), but in most of the observed events, the Alu elements have a mismatch (homeologous) at the start of recombination.11,22–24

An explanation of why the breakpoint in intron 10 occurred 73 bp downstream from AluJo is that some microhomology was observed at this breakpoint, such as with nucleotide adenine (shown as red letters in Figure 3). This microhomology, along with the homology between the Alu elements, makes it possible to observe these Alu/Alu deletions. It has been shown that these deletions are due to pathways such as microhomology-mediated end joining, which occurs when there are 5–25 nucleotides of microhomology between two Alu elements.25,26 Therefore, as shown in Figure 3, microhomology between two Alu elements can be one of the main causes of the large deletion in the HEXA gene.

The c.919G>A missense mutation in exon 8 of HEXA was a novel mutation and resulted in a substitution from glutamate, a negatively charged, polar amino acid (at position 307), to lysine, a positively charged, polar amino acid (p.E307K). This amino acid is predicted to be essential for the proper function of this protein (Table 3) and is conserved in the evolution of different species, such as Mus musculus, Gallus gallus, Ovis aries and many others (Figure 5b). Moreover, Sanger sequencing of all of the exons and exon/intron boundaries of the HEXA and HEXB genes in the proband only identified this mutation in exon 8 of HEXA. On the basis of our results, this new variation is a real mutation and is predicted to have damaging effects on the protein.

Regarding c.622delG (p.D208Ifsx15), this mutation causes a frameshift that leads to premature translation termination. It is worth noting that one of the reported transcripts of HEXA (ENST00000569410.5, H3BTD4, http://www.ensembl.org), which lacks all of the amino acids after position 373, caused nonsense-mediated decay. Therefore, all of these amino acids are vital for the normal function of the HEXA protein. Since the only identified damaging variation across HEXA and HEXB in both parents of the patient was this deletion and all of the phenotypes of the patient overlapped with the clinical abnormalities found in Tay-Sachs disease, this frameshift mutation can be reported to be the genetic cause of Tay-Sachs disease in this family.

One previously reported variant of uncertain significance (c.619A>G [p.I207V]) that has a high allele frequency (0.15148 for G) in the main database, ExAC (http://exac.broadinstitute.org), was identified as homozygous in one of our patients (Case-NEGAH, Table 2). There are conflicting interpretations of this variant since some studies have reported that the c.619A>G variant converts the Ile to Val (in position 207) in a highly conserved region of the HEXB protein, which is considered to be associated with alterations in the catalytic activity and activator protein binding.8 However, there are several reports that indicate that this variant is benign27,28 or likely benign.29 Studies conducted by Redonnet-Vernhet et al.28 showed that the healthy mother of a patient who was affected by Sandhoff's disease due to a mutation of p.R505Q was homozygous for p.I207V, indicating that this variant did not cause Sandhoff's disease. Therefore, we suggest that the c.619A>G variant identified in our patient cannot be the genetic cause of Sandhoff's disease; other genes with a clinical presentation that overlaps this disease may be involved in this case, and other genetic tests are needed, such as whole-exome sequencing.

The current study showed that HEXA is an Alu-rich gene and is predisposed to disease-associated large deletions. We also found different mutations in the HEXA and HEXB genes and that the most common mutation in HEXB was c.1597C>T, which suggests that screening the HEXB gene for this prevalent mutation is a cost-effective test in individuals suspected of having Sandhoff disease. Such studies can help understand the different possible mechanisms of genetic changes in different diseases and help to investigate large deletions in HEXA and HEXB in individuals affected by Sandhoff and Tay-Sachs diseases who do not have any point mutations or small deletions and insertions. In addition, with the identification of all of the Alu elements across HEXA in this study, other researchers can look for any possible deletions more easily in special cases of Sandhoff and Tay-Sachs diseases. Until now, no drugs have been used to treat Sandhoff and Tay-Sachs diseases, and a better understanding of their exact pathomechanism in the future can shed light on therapeutic approaches for these diseases using components involved in their biological processes.

ACKNOWLEDGEMENTS
We thank the families of our patients for their participation in this research.

AUTHOR CONTRIBUTIONS
MF: conceived and designed the experiments. HD: conceived, designed, performed all of the experiments related to the new large deletion and Alu elements as well as bioinformatics analysis for novel deletions and missense mutations, analyzed the data and wrote the paper. MA: designed all of the primers for HEXA and HEXB and performed PCR. SM: performed some of the experiments. FD: analyzed some of the Sanger sequencing results. FK: extracted DNA from the blood samples.

COMPETING INTERESTS
The authors declare no conflict of interest.

PUBLISHER'S NOTE
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

REFERENCES
1 Gravel RA et al. The Metabolic and Molecular Bases of Inherited Disease. vol. 3, 8th edn. McGraw-Hill: New York, NY, USA, 2001, 3827–3876.
2 Chin E, Bean L, Coffee B, Hegde MR. Novel human pathological mutations. Gene symbol: HEXA. Disease: Tay-Sachs disease. Hum Genet 2009; 126: 329.
3 Smith NJ, Winstone AM, Stellitano L, Cox TM, Verity CM. GM2 gangliosidosis in a UK study of children with progressive neurodegeneration: 73 cases reviewed. Dev Med Child Neurol 2012; 54: 176–182.
4 Langlois S, Wilson RD, Genetics Committee of the Society of O, Gynaecologists of C, CCMG Prenatal Diagnosis. Carrier screening for genetic disorders in individuals of Ashkenazi Jewish descent. J Obstet Gynaecol Can 2006; 28: 324–343.
5 O'Dowd BF, Klavins MH, Willard HF, Gravel R, Lowden JA, Mahuran DJ et al. Molecular heterogeneity in the infantile and juvenile forms of Sandhoff disease (O-variant GM2 gangliosidosis). J Biol Chem 1986; 261: 12680–12685.
6 Neudorfer O, Pastores GM, Zeng BJ, Gianutsos J, Zaroff CM, Kolodny EH et al. Late-onset Tay-Sachs disease: phenotypic characterization and genotypic correlations in 21 affected patients. Genet Med 2005; 7: 119–123.
7 Akli S, Chelly J, Lacorte JM, Poenaru L, Kahn A. Seven novel Tay-Sachs mutations detected by chemical mismatch cleavage of PCR-amplified cDNA fragments. Genomics 1991; 11: 124–134.
8 Banerjee P, Siciliano L, Oliveri D, McCabe NR, Boyers MJ, Horwitz AL et al. Molecular basis of an adult form of beta-hexosaminidase B deficiency with motor neuron disease. Biochem Biophys Res Commun 1991; 181: 108–115.
9 Yoshizawa T, Kohno Y, Nissato S, Shoji S. Compound heterozygosity with two novel mutations in the HEXB gene produces adult Sandhoff disease presenting as a motor neuron disease phenotype. J Neurol Sci 2002; 195: 129–138.
10 Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J et al. Initial sequencing and analysis of the human genome. Nature 2001; 409: 860–921.
11 Batzer MA, Deininger PL. Alu repeats and human genomic diversity. Nat Rev Genet 2002; 3: 370–379.
12 Kazazian HH Jr. Mobile elements: drivers of genome evolution. Science 2004; 303: 1626–1632.
13 Hedges DJ, Batzer MA. From the margins of the genome: mobile elements shape primate evolution. Bioessays 2005; 27: 785–794.
14 Deininger PL, Moran JV, Batzer MA, Kazazian HH Jr. Mobile elements and mammalian genome evolution. Curr Opin Genet Dev 2003; 13: 651–658.
15 Jurka J, Smith T. A fundamental division in the Alu family of repeated sequences. Proc Natl Acad Sci USA 1988; 85: 4775–4778.
16 Price AL, Eskin E, Pevzner PA. Whole-genome analysis of Alu repeat elements reveals complex evolutionary history. Genome Res 2004; 14: 2245–2252.
17 Callinan PA, Wang J, Herke SW, Garber RK, Liang P, Batzer MA et al. Alu retrotransposition-mediated deletion. J Mol Biol 2005; 348: 791–800.
18 Deininger PL, Batzer MA. Alu repeats and human disease. Mol Genet Metab 1999; 67: 183–193.
19 Faiz F, Allcock RJ, Hooper AJ, van Bockxmeer FM. Detection of variations and identifying genomic breakpoints for large deletions in the LDLR by Ion Torrent semiconductor sequencing. Atherosclerosis 2013; 230: 249–255.
20 Reddy R, Nguyen NM, Sarrabay G, Rezaei M, Rivas MC, Kavasoglu A et al. The genomic architecture of NLRP7 is Alu rich and predisposes to disease-associated large deletions. Eur J Hum Genet 2016; 24: 1445–1452.
21 Neote K, McInnes B, Mahuran DJ, Gravel RA. Structure and distribution of an Alu-type deletion mutation in Sandhoff disease. J Clin Invest 1990; 86: 1524–1531.
22 Hedges DJ, Deininger PL. Inviting instability: transposable elements, double-strand breaks, and the maintenance of genome integrity. Mutat Res 2007; 616: 46–59.
23 Witherspoon DJ, Zhang Y, Xing J, Watkins WS, Ha H, Batzer MA et al. Mobile element scanning (ME-Scan) identifies thousands of novel Alu insertions in diverse human populations. Genome Res 2013; 23: 1170–1181.
24 Konkel MK, Walker JA, Batzer MA. LINEs and SINEs of primate evolution. Evol Anthropol 2010; 19: 236–249.
25 Elliott B, Richardson C, Jasin M. Chromosomal translocation mechanisms at intronic alu elements in mammalian cells. Mol Cell 2005; 17: 885–894.
26 McVey M, Lee SE. MMEJ repair of double-strand breaks (director's cut): deleted sequences and alternative endings. Trends Genet 2008; 24: 529–538.
27 Zhang ZX, Wakamatsu N, Akerman BR, Mules EH, Thomas GH, Gravel RA et al. A second, large deletion in the HEXB gene in a patient with infantile Sandhoff disease. Hum Mol Genet 1995; 4: 777–780.
28 Redonnet-Vernhet I, Mahuran DJ, Salvayre R, Dubas F, Levade T. Significance of two point mutations present in each HEXB allele of patients with adult GM2 gangliosidosis (Sandhoff disease) homozygosity for the Ile207-->Val substitution is not associated with a clinical or biochemical phenotype. Biochim Biophys Acta 1996; 1317: 127–133.
29 Hara A, Uyama E, Uchino M, Shimmoto M, Utsumi K, Itoh K et al. Adult Sandhoff's disease: R505Q and I207V substitutions in the HEXB gene of the first Japanese case. J Neurol Sci 1998; 155: 86–91.

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

© The Author(s) 2018
