Human Mutation

Severe neurocognitive and growth disorders due to variation in THOC2, an essential component of nuclear mRNA export machinery

Journal:For Human Peer Mutation Review Manuscript ID Draft

Wiley - Manuscript type: Research Article

Date Submitted by the Author: n/a

Complete List of Authors: Gecz, Jozef; University of Adelaide, Adelaide Medical School and the Robinson Research Institute Sharma , Raman; The University of Adelaide, Adelaide Medical School and the Robinson Research Institute Gardner, Alison; The University of Adelaide, Adelaide Medical School and the Robinson Research Institute Homan, Claire; The University of Adelaide; Robinson Research Institute Douglas, Evelyn; Genetics and Molecular Pathology, SA Pathology Mefford, Heather; Division of Genetic Medicine, Department of Pediatrics, University of Washington Wieczorek, Dagmar; Universitatsklinikum Dusseldorf, Institut für Humangenetik; Heinrich-Heine-Universitat Dusseldorf, Institut für Humangenetik Lüdecke, Hermann-Josef; Heinrich-Heine-University, Medical Faculty, Institute of Human Genetics, Düsseldorf, Germany and Institut für Humangenetik, Universitätsklinikum Essen, Universität Duisburg-Essen Stark, Zornitza; Murdoch Children’s Research Institute, Victorian Clinical Genetics Services Sadedin, Simon; Murdoch Children's Research Institute and Program in Medical and Population Genetics, Broad Institute of MIT and Harvard Nowak, Catherine Bearce; Boston Children's Hospital douglas, Jessica; Boston Children's Hospital Parsons, Gretchen; Spectrum Health Medical Genetics Mark, Paul; Spectrum Health Medical Genetics Loidi, Lourdes; Fundación Pública Galega de Medicina Xenómica, Hospital Clínico Universitario de Santiago de, Compostela Herman, Gail; Nationwide Children’s Hospital and The Ohio State University Mihalic Mosher, Theresa; Nationwide Children’s Hospital and The Ohio State University Gillespie, Meredith; Children's Hospital of Eastern Ontario Research Institute, University of Ottawa Brady, Lauren; Department of Pediatrics, McMaster University Medical Centre Tarnopolsky, Mark; McMaster University Medical Center, Dept. Pediatrics and Medicine Madrigal, Irene; Fundación Pública Galega de Medicina Xenómica, Hospital

John Wiley & Sons, Inc. Page 1 of 67 Human Mutation

1 2 3 4 Clínico Universitario de Santiago de Compostela Eiris, Jesus; Unidad de Neurología Pediátrica, Departamento de Pediatría, 5 Hospital Clínico Universitario de Santiago de Compostela 6 Domenech Salgado, Laura; Centre de Regulacio Genomica; Universitat 7 Pompeu Fabra 8 Rabionet, Raquel; FUNDACIÓ PRIVADA PER A LA RECERCA I LA DOCÈNCIA 9 SANT JOAN DE DÉU , departament de genètica, microbiologia i estadística 10 de la UB; Universitat de Barcelona Facultat de Biologia, Strom, Tim; Helmholtz Zentrum München, Institute of Human Genetics 11 Ishihara, Naoko; Department of Pediatrics, Fujita Health University School 12 of Medicine 13 Inagaki, Hidehito; Fujita Health University, ICMS 14 Kurahashi, Hiroki; Fujita Health University, Institute for Comprehensive 15 Medical Science, Division of Molecular Genetics 16 Dudding, Tracy; Hunter New England Local Health District, Hunter Genetics ; University of Newcastle, 17 Palmer , Emma; Genetics of Learning Disability Service, Hunter Genetics 18 Field, Michael; Hunter Genetics, The GOLD Service 19 For Peer Review XLID, THOC2, mRNA export, stability, partial loss-of-function 20 Key Words: 21 variants 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 John Wiley & Sons, Inc. Human Mutation Page 2 of 67

1 2 3 RESEARCH ARTICLE 4

5 Severe neurocognitive and growth disorders due to variation in THOC2, an essential 6 component of nuclear mRNA export machinery 7 8

9 1 1 1 2 3 10 Raman Kumar , Alison Gardner , Claire Homan , Evelyn Douglas , Heather Mefford , 4 4 5 6 11 Dagmar Wieczorek , HermannJosef Lüdecke , Zornitza Stark , Simon Sadedin , The Broad 7 8 8 9 9 12 CMG , Catherine Bearce Nowak , Jessica Douglas , Gretchen Parsons , Paul Mark , Lourdes 13 Loidi10, Gail Herman11, Theresa Mihalic Mosher11, Meredith K Gillespie12, Lauren Brady13, 14 Mark Tarnopolsky13, Irene Madrigal14, Jesús Eiris15, Laura Domenech Salgado16, Raquel 15 Rabionet16, Tim M Strom17, Naoko Ishihara18, Hidehito Inagaki19, Hiroki Kurahashi19, 16 Tracy DuddingByth20,21, Elizabeth Palmer20,22, Michael Field20, Jozef Gecz1,23* 17 18 19 1Adelaide Medical SchoolFor and thePeer Robinson ResearchReview Institute, The University of Adelaide, 20 Adelaide, SA5000, Australia; 2Genetics and Molecular Pathology, SA Pathology, Adelaide, 21 SA5000, Australia; 3Division of Genetic Medicine, Department of Pediatrics, University of 22 4 23 Washington & Seattle Children’s Hospital, Seattle, Washington, USA; HeinrichHeine 24 University, Medical Faculty, Institute of Human Genetics, Düsseldorf, Germany and Institut 25 für Humangenetik, Universitätsklinikum Essen, Universität DuisburgEssen, Essen, 5 26 Germany; Murdoch Children's Research Institute and 27 Department of Pediatrics, University of Melbourne, Melbourne, Australia; 6Murdoch 28 Children's Research Institute and 29 Program in Medical and Population Genetics, Broad Institute of MIT and 30 Harvard, Cambridge, Massachusetts, USA; 7Broad’s Center for Mendelian Genomics, 31 Cambridge, Massachusetts, USA; 8Boston Children's Hospital, Longwood Avenue, Boston, 32 MA, USA; 9Spectrum Health Medical Genetics, Grand Rapids, MI, USA; 10Fundación 33 Pública Galega de Medicina Xenómica, Hospital Clínico Universitario de Santiago de, 34 11 35 Compostela. Santiago de Compostela, Spain; Nationwide Children’s Hospital and The Ohio State University, Columbus, OH, USA; 12Children's Hospital of Eastern Ontario Research 36 13 37 Institute, University of Ottawa, Ottawa, Canada; Department of Pediatrics, McMaster 14 38 University Medical Centre, Hamilton, Canada; Fundación Pública Galega de Medicina 39 Xenómica, Hospital Clínico Universitario de Santiago de Compostela, Santiago de 40 Compostela, Spain; 15Unidad de Neurología Pediátrica, Departamento de Pediatría, Hospital 41 Clínico Universitario de Santiago de Compostela, Santiago de Compostela, Spain; 16Centre 42 for Genomic Regulation (CRG), Universitat Pompeu Fabra and CIBERESP, Barcelona 43 Institute for Science and Technology, Barcelona, Spain; 17Institut für Humangenetik, 44 Universitätsklinikum Essen, Universität DuisburgEssen, Essen, Germany; 18Department of 45 Pediatrics, Fujita Health University School of Medicine, Japan; 19Division of Molecular 46 Genetics, Institute for Comprehensive Medical Science, Fujita Health University, Japan; 47 20Genetics of Learning Disability Service, Hunter Genetics, Waratah, NSW 2298, Australia; 48 21 22 49 University of Newcastle, Australia GrowUpWell Priority Research Center; School of 50 Women’s and Children’s Health, University of New South Wales, Randwick, NSW 2031, 23 51 Australia; Healthy Mothers, Babies and Children, South Australian Health and Medical 52 Research Institute, Adelaide, SA 5000, Australia. 53 54 *Correspondence: Jozef Gecz, 1School of Medicine, The Robinson Research Institute, The 55 University of Adelaide, North Adelaide, SA5005, Australia. Email: 56 [email protected] 57 58 1 59 60 John Wiley & Sons, Inc. Page 3 of 67 Human Mutation

1 2 3 Abstract 4 5 Highly conserved TREXmediated mRNA export is emerging as a key pathway in neuronal 6 7 development and differentiation. TREX subunit variants cause perturbed mRNA export from 8 9 the cell nucleus to the cytoplasm and lead to neurodevelopmental disorders (NDDs). Our 10 11 previous work implicated the THOC2 on the X in intellectual disability 12 13 14 (ID). Borderline to severe ID with speech delay, behavioural problems, short stature, elevated 15 16 BMI and truncal obesity in adulthood were observed. We now report an additional six 17 18 affected individuals from five unrelated families with pathogenic or likely pathogenic 19 For Peer Review 20 variants in THOC2, two de novo and three maternallyinherited, extending the genotypic and 21 22 phenotypic spectrum. These comprise three rare missense THOC2 variants that affect 23 24 evolutionarily conserved amino acid residues and reduce protein stability and affected males 25 26 27 from two families with canonical splicesite THOC2 variants that result in Cterminally 28 29 truncated THOC2 . We present detailed clinical assessment and functional studies on 30 31 a de novo variant in a female with an epileptic encephalopathy and discuss an additional four 32 33 families with rare variants in THOC2 with supportive evidence for pathogenicity. Severe 34 35 neurocognitive features, including movement and seizure disorders were observed in this 36 37 expanded cohort. Taken together, our data significantly expand existing evidence showing 38 39 40 that even subtle alterations to the canonical molecular pathways such as mRNA export, 41 42 otherwise essential for cellular life, can be compatible with life, but lead to NDDs in humans. 43 44 45 46 47 48 KEY WORDS 49 50 XLID; THOC2; mRNA export; protein stability; partial lossoffunction variants 51 52 53 54 55 56 57 58 2 59 60 John Wiley & Sons, Inc. Human Mutation Page 4 of 67

1 2 3 INTRODUCTION 4 5 Intellectual disability (ID), characterized by substantial limitations in both intellectual 6 7 functioning and adaptive behaviour, affects 13% of the population starting before the age of 8 9 18 years and has significant impact on individuals, families and communities (Vissers, et al., 10 11 2016). Individuals with ID are more likely than members of the general population to 12 13 14 experience poor physical and mental health, have a lower life expectancy, experience 15 16 inequalities accessing health care and frequently have limited or no specific therapies for 17 18 their core symptoms (Bittles, et al., 2002; Hosking, et al., 2016). Both genetic and 19 For Peer Review 20 environmental factors contribute to the development of ID (Milani, et al., 2015). Over 130 of 21 22 the identified >800 ID are located on the Xchromosome, and diagnosis of Xlinked 23 24 causes of ID remain critically important for accurate genetic counseling of families (Ropers 25 26 27 and Hamel, 2005). Dramatic improvements in highthroughput DNA sequencing 28 29 technologies and analyses software has led to identification/discovery of new ID genes and 30 31 additional variants in the known ID genes (Dickinson, et al., 2016; Vissers, et al., 2016). A 32 33 systematic review of clinical data suggests that ID affected individuals commonly have 34 35 comorbid neurological, psychiatric and behavioural disorders (Oeseburg, et al., 2011; 36 37 Vissers, et al., 2016), and disease variants in different parts of a gene can lead to a broad 38 39 40 range of complex neurocognitive disorders (Palmer, et al., 2017; Zhu, et al., 2014). This 41 42 complexity contributes to heterogeneity in clinical symptoms and indistinct boundaries 43 44 between syndromic and nonsyndromic forms of NDD. 45 46 47 48 In 2015, we reported genetic, molecular and protein structural data on four missense variants 49 50 in an Xlinked essential gene THOC2 (NM_001081550.1: p.Leu313Phe, p.Leu438Pro, 51 52 53 p.Ile800Thr and p.Ser1012Pro) (Kumar, et al., 2015). The affected individuals had a 54 55 syndromic NDD, characterized by borderline to severe ID, speech delay, short stature and 56 57 58 3 59 60 John Wiley & Sons, Inc. Page 5 of 67 Human Mutation

1 2 3 adult onset truncal obesity (Kumar, et al., 2015). THOC2 encodes for the THOC2 protein 4 5 the largest subunit of the highly conserved TREX (TranscriptionExport) mRNA export 6 7 complex essential for exporting mRNA from the cell nucleus to the cytoplasm (Heath, et al., 8 9 2016). The TREX complex is composed of a THO subcomplex (THOC1, THOC2, THOC3, 10 11 THOC5 and THOC7) and accessory proteins (UAP56, UIF, Aly, CIP29, PDIP3, ZC11A, 12 13 14 SRRT, Chtop) (Heath, et al., 2016). The TREX complex, besides its canonical role in mRNA 15 16 export in the mammalian cells, has been shown to perform critical roles in gene expression, 17 18 3’ mRNA processing, stress responses, mitotic progression and genome stability as well as 19 For Peer Review 20 developmental processes such as pluripotency maintenance and hematopoiesis (Yamazaki, et 21 22 al., 2010). We and others have recently demonstrated that subtle perturbations in mRNA 23 24 export by gene variants or preferential cytoplasmic aggregation can lead to NDDs (Beaulieu, 25 26 27 et al., 2013; Coe, et al., 2014; Kumar, et al., 2015), neurodegeneration (Woerner, et al., 2016) 28 29 or cancer (Chinnam, et al., 2014; Hautbergue, 2017; Liu, et al., 2015; Viphakone, et al., 30 31 2015). These alterations can have tissuespecific effects as TREX subunits are shown to have 32 33 tissuespecific roles; for example, mouse Thoc5 deficiency interferes with the maintenance of 34 35 hematopoiesis (Guria, et al., 2011; Mancini, et al., 2010) and Thoc1 deficiency interferes with 36 37 testis development (Wang, et al., 2009). Taken together, altered TREX function can have 38 39 40 diverse molecular and cellular consequences resulting in a range of diseases. Here we present 41 42 detailed information on the clinical presentations and functional investigations on an 43 44 additional eight missense and two splice THOC2 variants. These data reaffirm and extend our 45 46 previous findings that THOC2 variation is responsible for causing complex 47 48 neurodevelopmental conditions with the core clinical presentation of ID. 49 50 51 52 53 54 55 56 57 58 4 59 60 John Wiley & Sons, Inc. Human Mutation Page 6 of 67

1 2 3 MATERIALS AND METHODS 4 5 Molecular and cellular studies 6 7 RNA extraction, RTqPCR (primers listed in Supp. Table S1), cycloheximide chase, and 8 9 THOC2 immunofluorescence staining were performed as reported previously (Kumar, et al., 10 11 2015). Molecular studies on the THOC2 exon35:c.44502A>G variant were performed using 12 13 14 blood DNA and skin fibroblasts derived from the affected individual and his heterozygous 15 16 carrier mother. Genomic DNA or cDNA was amplified with KAPA HiFi PCR Kit (Kapa 17 18 Biosystems) using hTHOC24326F/ hTHOC24519R (Supp. Table S1) at 95°C for 3 min, 19 For Peer Review 20 35 cycles of 98°C15sec, 59°C15sec, 72°C30sec, incubation at 72°C for 10 min, gel 21 22 purified (MinElute Gel Extraction kit (Qiagen) and Sanger sequenced using the same 23 24 primers. For the THOC2 exon28:c.3503+4A>C, blood gDNA from unaffected father, carrier 25 26 27 mother and affected son was amplified with TaKaRa ExTaq using THOC2F/THOC2R 28 29 primers (Supp. Table S1) at 94°C for 2 min, 40 cycles of 94°C30sec, 60°C30sec, 72°C 30 31 30sec, incubation at 72°C for 5 min. The cDNA was generated by reverse transcribing the 32 33 white blood cell RNAs using Superscript III reverse transcriptase (Life Technologies) and 34 35 amplified with TaKaRa ExTaq using THOC2ex27F/THOC2ex30R (Supp. Table S1) at 36 37 94°C for 2 min, 28 cycles of 94°C30sec, 60°C30sec, 72°C30sec, incubation at 72°C for 5 38 39 40 min. The amplified products were analyzed by Sanger sequencing. 41 42 43 44 Generation of THOC2 variant expression constructs 45 46 Generation of the wild type Myctagged human THOC2 expression plasmid was reported 47 48 earlier (Kumar, et al., 2015). Briefly, the THOC2 variants were introduced into the existing 49 50 pCMVMycTHOC2 expression construct by overlap PCR method using the primers listed in 51 52 53 Supp. Table S1. The variant plasmid sequences were confirmed by Sanger sequencing. 54 55 56 57 58 5 59 60 John Wiley & Sons, Inc. Page 7 of 67 Human Mutation

1 2 3 Details relating generation of the THOC2 variant expression constructs are available on 4 5 request. 6 7 8 9 Transient expression and Western blotting 10 11 For transient expression experiments, HEK293T and HeLa cells were transfected with 12 13 14 expression constructs (400ng pCMVMycTHOC2 plasmid and 400ng pEGFPC1 15 16 plasmid/transfection for stability and cycloheximide assays and 4µg/transfection for 17 18 immunofluorescence staining, IF) using Lipofectamine 3000 reagent according to 19 For Peer Review 20 manufacturer’s protocol (Life technologies). Twentyfour hours posttransfection, cells were 21 22 either fixed with 4% formaldehyde for IF or collected and lysed in buffer containing 50 mM 23 24 TrisHCl pH 7.5, 150 mM NaCl, 0.1% TritonX 100, 1 mM EDTA, 50 mM NaF, 1× Protease 25 26 27 inhibitor/no EDTA and 0.1 mM Na3VO4 for western blot assay as reported previously 28 29 (Kumar, et al., 2015). 30 31 32 33 RESULTS 34 35 Identification of THOC2 variants 36 37 We previously implicated four missense THOC2 variants in 25 individuals with ID and a 38 39 40 range of other clinical features (Table 1) (Kumar, et al., 2015). We identified an additional 41 42 five THOC2 variants (three missense; de novo p.Thr696Ile, de novo p.Gly713Asp, 43 44 maternallyinherited p.His1187Tyr, and two splicingdefective; maternallyinherited 45 46 chrX:122747409 exon35:c.44502A>G and chrX:122757634 exon28:c.3503+4A>C; 47 48 GenBank: NM_001081550.1) variants in a further six affected individuals, including one pair 49 50 of monozygotic twins (Table 1, Figures 12). Whole exome (WES) or whole genome 51 52 53 sequencing (WGS) of probands and parents was used to identify the variants that were 54 55 confirmed by Sanger sequencing of the PCR amplified variantcarrying region of genomic 56 57 58 6 59 60 John Wiley & Sons, Inc. Human Mutation Page 8 of 67

1 2 3 DNA of the parents and affected individuals. The previously unreported THOC2 variants 4 5 affect amino acids that are highly conserved (Supp. Figure S1), are absent in gnomAD 6 7 database and are predicted to be pathogenic based on a number of in silico prediction criteria 8 9 (Table 1). We also included in our study a de novo missense p.Tyr517Cys variant in a female 10 11 with moderatesevere ID, speech problems, epileptic encephalopathy, cortical visual 12 13 14 impairment, and gait disturbances identified using WES as part of the Epi4K Consortium & 15 16 Epilepsy Phenome/Genome Project (Epi, et al., 2013) (Table 1). We have also collected 17 18 further, rare and potentially pathogenic variants through international collaboration (Table1, 19 For Peer Review 20 Figures 1 and 3) and performed functional testing on several of these. The following three 21 22 variants: p.Arg77Cys, p.Ser1012Pro and p.Asn1261His showed no clear evidence of altered 23 24 stability of variant THOC2 proteins in our assay. 25 26 27 28 29 Clinical presentations 30 31 The clinical features of the five previously unreported affected individuals with (likely) 32 33 pathogenic THOC2 variants, aged between 3 and 12 years, and the 10 year old female with de 34 35 novo p.Tyr517Cys variant are summarised in Table 2 and photographs, when available, are 36 37 shown in Figure 2. Detailed clinical information is available in the supplementary data. ID 38 39 40 was universal and at least moderate in severity: 2/7 were nonambulatory and 3/7 nonverbal. 41 42 Behavioural problems were reported in four individuals, with one meeting diagnostic criteria 43 44 for autism spectrum disorder. Common additional neurological features include infantile 45 46 hypotonia (4/7), tremor (2/7) and gait disturbance including balance problems/ataxia and toe 47 48 walking (4/5 ambulatory individuals). Confirmed seizure disorder was only present in the 49 50 affected female (individual 7) but suspected in individual 2. Neuroradiological studies were 51 52 53 performed in five individuals and reported as abnormal in three (available data presented in 54 55 Supp. Figure S2). Growth abnormalities were common including low birth weight (3/7), 56 57 58 7 59 60 John Wiley & Sons, Inc. Page 9 of 67 Human Mutation

1 2 3 microcephaly (2/7) and short stature (2/7). Facial features are shown in Figure 2. Appropriate 4 5 consent for reporting variants, clinical data and photographs of the affected individuals was 6 7 obtained from their parents or legal guardians. The research has been approved by the 8 9 Women’s and Children’s Health Network Human Research Ethics Committee in Adelaide, 10 11 Australia. 12 13 14 15 16 THOC2 variant protein localisation and stability 17 18 Without access to affected individuals' derived cells, we generated Myctagged THOC2 19 For Peer Review 20 missense variant expression constructs to determine protein stability and localisation. The 21 22 THOC2 protein stability was determined in HEK293T cells and localisation in both the 23 24 HEK293T and HeLa cells. Total protein lysates of HEK293T cells ectopicallyexpressing the 25 26 27 wild type or variant MycTHOC2 proteins were western blotted for THOC2, EGFP and β 28 29 Tubulin. We used HEK293T cells expressing Mycp.Ile800Thr THOC2 as a control for 30 31 protein stability assay as this variant is shown to cause reduced protein stability (Kumar, et 32 33 al., 2015). The results showed reduced stability of p.Tyr517Cys, p.Thr696Ile, p.Gly713Asp 34 35 and p.His1187Tyr THOC2 compared to the wild type protein (Figure 3A). Presence of 36 37 comparable levels of EGFP in the cells transfected with different expression constructs 38 39 40 indicated that the reduced levels of THOC2 protein were not due to difference in transfection 41 42 efficiency (Figure 3). We also determined the turnover rate of Mycp.Tyr517Cys THOC2 43 44 protein by cycloheximide chase. For this assay, the HEK293T cells transfected with pCMV 45 46 MycWT or pCMVMycp.Tyr517Cys THOC2 and pEGFPC1 transfection control plasmids 47 48 were cultured in presence of translation inhibitor cycloheximide for different durations and 49 50 western blotted for THOC2, EGFP and βTubulin. The results showed that p.Tyr517Cys 51 52 53 THOC2 turnover rate was 3h compared with 8h for the wild type protein (Figure 3B). 54 55 56 57 58 8 59 60 John Wiley & Sons, Inc. Human Mutation Page 10 of 67

1 2 3 THOC2 variant proteins, similar to the wild type, were mainly localised to the nucleus in 4 5 both the HEK293T and HeLa cells (Supp. Figure S3). 6 7 8 9 THOC2 splice variant: exon35:c.4450-2A>G 10 11 Sanger sequencing of amplified target region from affected son and mother's blood genomic 12 13 14 DNA showed that the affected boy inherited chrX:122747409 exon35:c.44502A>G variant 15 16 from his unaffected heterozygous carrier mother (Figure 4C). A 2 A>G change in the intron 17 18 exon splicing site boundary (acceptor AG) is predicted to abolish splicing (Ohno, et al., 19 For Peer Review 20 2017). To validate this possibility, we generated skin fibroblast cultures from the 21 22 heterozygous carrier mother and the affected son. We PCR amplified their fibroblast cDNAs 23 24 using primers with binding sequences located within exon 34 and 35. Amplification of a 194 25 26 27 bp DNA fragment from the mother indicated normal splicing but a 537 bp product from the 28 29 affected son indicated retention of the intron located between these exons (Figure 4AB). We 30 31 confirmed this result by Sanger sequencing of the PCR products generated from genomic 32 33 DNA that showed presence of A/G nucleotides in the carrier mother but only G (A>G) 34 35 nucleotide in the affected son (Figure 4C). The cDNA sequence showed presence of 36 37 normallyspliced mRNA in the mother but retention of intronic sequence upstream of the 38 39 40 exon 35 in the affected son indicating defective splicing due to presence of 2 G variant at the 41 42 intronexon 35 junction sequence (Figure 4C). The presence of normallyspliced mRNA in 43 44 the unaffected mother is consistent with Xinactivation (94% skewing) of the variant allele in 45 46 her fibroblasts. We predicted that a retention of intron between exon 3435 in the affected 47 48 fibroblasts would result in loss of 110 Cterminal amino acids of the 1593 wild type THOC2 49 50 protein (that is, 1483 amino acids); however, overall the variant protein would be 58 amino 51 52 53 acid smaller as it would now be a 1535 amino acid protein comprised of 1483 amino acids of 54 55 wild type THOC2 and 52 translated from intronic sequence in the premRNA (Figure 1A). 56 57 58 9 59 60 John Wiley & Sons, Inc. Page 11 of 67 Human Mutation

1 2 3 Consistent with our prediction, the western blot data showed presence of a slightly smaller 4 5 THOC2 protein band in the affected son’s fibroblasts than his unaffected mother. Many 6 7 independent western blot runs showed presence of two closelylocated THOC2 bands 8 9 similar to the fruit fly THO2 (Rehwinkel, et al., 2004) in the unaffected mother but a single 10 11 highly intense band in the affected son’s fibroblasts (Figure 4D). The observed difference in 12 13 14 levels of THOC2 protein was posttranslational as we found comparable amounts of THOC2 15 16 mRNA, as assayed by real time RTqPCR, in the mother and son (Figure. 4E). Finally, we 17 18 observed no difference in THOC2 localisation in fibroblasts of the affected son and his 19 For Peer Review 20 unaffected mother (Figure 4F). 21 22 23 24 THOC2 splice variant: exon28:c.3503+4A>C 25 26 27 For the second splice variant chrX:122757634 exon28:c.3503+4A>C, molecular studies were 28 29 performed on the white blood cells of the unaffected father, carrier mother and the male 30 31 proband. Sanger sequencing of target region amplified from the unaffected father and mother, 32 33 and affected son's genomic DNA showed that the affected son inherited the A>C change 34 35 from his unaffected carrier mother who had A/C nucleotides at this position (Figure 5). The 36 37 intronic nucleotide change A>C at +4 position of the 5' exonintron donor splicing site 38 39 40 sequence is predicted to cause aberrant splicing (https://www.med.nagoya 41 42 u.ac.jp/neurogenetics/SD_Score/sd_score.html). To confirm this possibility, we amplified 43 44 cDNA generated by reverse transcribing blood RNA of the father, mother and the affected 45 46 son using primers located within exon 27 and exon 30 (Figure 5A; Supp. Table S1). 47 48 Interestingly, whereas a 491 bp PCR product was observed in highly skewed carrier mother 49 50 (98:2%) and normal father, 491 bp and 634 bp PCR products were detected in the affected 51 52 53 son. A 491 bp amplified product indicated normal splicing in the mother and father, and 491 54 55 bp and 634 bp bands suggested partiallydefective splicing in the affected son. Amplification 56 57 58 10 59 60 John Wiley & Sons, Inc. Human Mutation Page 12 of 67

1 2 3 of a 634 bp instead of a 994 bp fragment that would have resulted from a complete retention 4 5 of intron 28 indicated aberrant splicing event in the affected son (Figure 5). Sanger 6 7 sequencing of 491 bp and 634 bp PCR products from the mother, father and son confirmed 8 9 normal splicing in the mother and father and aberrant splicing in the affected son. The 10 11 sequence showed retention of a 143 bp instead of complete 503 bp fragment due to activation 12 13 14 of a cryptic splice site within the intron 28 in the son (Figure 5). Retention of 143 bps from 15 16 intron 28 in the mRNA is predicted to result in a truncated THOC2 protein containing 1168 17 18 wild type amino acids and 7 novel amino acids from the translation of the intronic sequence 19 For Peer Review 20 retained in the defective mRNA, which would be present in addition to the wild type 1593 21 22 amino acid protein from the normallyspliced mRNA in the affected son. 23 24

25 26 27 Discussion 28 29 We present here detailed clinical information, and molecular and functional studies, on five 30 31 previously unreported THOC2 variants in six affected males (two de novo variants and one 32 33 maternally inherited variant in monozygotic twins) and on one affected female with a 34 35 previously reported de novo p.Tyr517Cys variant. We present evidence that extends the 36 37 genotypic spectrum beyond the four THOC2 missense variants that we reported previously 38 39 40 (Kumar, et al., 2015) by including two intronic variants that affect splicing, and four missense 41 42 variants that affect protein stability in a cellbased assay system. All of the variants meet 43 44 ACMG criteria for pathogenic or likely pathogenic status (Table 1) (Richards, et al., 2015). 45 46 These findings, along with the four missense variants reported earlier (Kumar, et al., 2015) 47 48 add to the existing evidence that alterations in essential mRNA export pathway can cause 49 50 NDDs (Amos, et al., 2017; Beaulieu, et al., 2013; Kumar, et al., 2015). 51 52 53 54 55 56 57 58 11 59 60 John Wiley & Sons, Inc. Page 13 of 67 Human Mutation

1 2 3 We confirm that the core clinical feature of THOC2related disorder in hemizygous males is 4 5 ID, with several individuals having additional neurocognitive features including infantile 6 7 hypotonia, behavioural disorders and autism spectrum disorder, and (less commonly) gait 8 9 disturbance and tremor. Although the range of neurodevelopmental features is similar, our 10 11 original cohort contained males with borderline to mild ID (Kumar, et al., 2015); whereas, all 12 13 14 individuals in this cohort have at least moderate ID. Five of the affected individuals had 15 16 neuroimaging, with three having evidence of white matter thinning or cortical atrophy, and 17 18 one individual (individual 2) having extensive cortical gyral abnormalities. Similarly, in the 19 For Peer Review 20 original cohort four of the six individuals with neuroimaging available were reported to have 21 22 a variety of abnormalities including inferior cerebellar vermis dysplasia, ventriculomegaly, 23 24 gliosis and mild cervical cord compression. A consistent pattern of extraneurological 25 26 27 features was not evident, although as in the original cohort, several individuals had evidence 28 29 of restricted growth, including short stature, low birth weight and microcephaly. The truncal 30 31 obesity, raised BMI and comorbid anxiety and depression frequently noted in the original 32 33 cohort was not observed in this new cohort, likely reflecting the relatively lower age range (4 34 35 12 years compared to 577 years) given that these features generally emerged in late 36 37 adolescence/adulthood. We specifically evaluated the cohort to assess if a characteristic facial 38 39 40 gestalt was evident across individuals with pathogenic or likely pathogenic variants across 41 42 our original and this expanded clinical cohort (Supp. Figure S4) (DuddingByth, et al., 2017). 43 44 Although a clearly recognizable facial gestalt was not obvious, there are some similarities. 45 46 Children have a wide forehead with variable ptosis, epicanthic folds, downslanting palpebral 47 48 fissures, wide nasal bridge, long philtrum and pointed chin. Adults have a wide forehead with 49 50 prominent supraorbital fissures with variable horizontal eyebrows, thin nasal bridge and 51 52 53 downslanting palpebral fissures. Age progression of the face shows shortening of the 54 55 56 57 58 12 59 60 John Wiley & Sons, Inc. Human Mutation Page 14 of 67

1 2 3 philtrum with widening of the nasal base. The facial gestalt spectrum associated with THOC2 4 5 pathogenic variants will continue to emerge as more individuals are reported. 6 7 8 9 The two males with splice variants (individuals 6 and 7) and the monozygotic twins with 10 11 missense variant p.His1187Tyr (individual 3 and 4) inherited the THOC2 variant from their 12 13 14 mother. As was the case in our original cohort, the heterozygous mothers were unaffected, 15 16 with normal learning and health. All had highly skewed X cell inactivation (XCI) (Table 1). 17 18 In comparison, individual 7, who was a female with a de novo missense variant 19 For Peer Review 20 (p.Tyr517Cys) had a particularly severe neurocognitive presentation with infantile onset 21 22 epileptic encephalopathy and profound intellectual disability. At age 10 y she remains non 23 24 ambulant and nonverbal. Unfortunately, we did not have access to her genomic DNA for 25 26 27 testing the XCI status. We do note that severely affected females with de novo variants in X 28 29 linked genes have been reported by us and others (Palmer, et al., 2016; Snijders Blok, et al., 30 31 2015; Zweier, et al., 2014), and that differences in XCI in blood do not always provide an 32 33 explanation for the range of phenotypic expression in heterozygous females. 34 35 36 37 A range of proteinprotein interactions are required for mRNA export (Chi, et al., 2013). 38 39 40 Proteins with altered stability (Hirayama, et al., 2008), localization (Beaulieu, et al., 2013) 41 42 (e.g., THOC6 p.Gly46Arg implicated in syndromic ID) or interaction (Chi, et al., 2013) can 43 44 impact mRNA export and consequently disrupt normal cell function. Likewise, THOC2 45 46 variants can cause neuronal defects due to altered protein interactions, stability, localization 47 48 and/or structure, which can impact normal THOC2 functions such as mRNA export or its 49 50 mRNA binding properties. We did not observe mislocalization of the THOC2 variant 51 52 53 proteins in cultured cells and did not test alterations in their interaction with the other known 54 55 or unknown TREX proteins. However, significantly reduced levels of a number of new 56 57 58 13 59 60 John Wiley & Sons, Inc. Page 15 of 67 Human Mutation

1 2 3 (p.Tyr517Cys, p.His1187Tyr, p.Thr696Ile, p.Gly713Asp) and published (p.Leu438Pro, 4 5 p.Ile800Thr (Kumar, et al., 2015)) missense THOC2 variant proteins are due to impaired 6 7 protein stability or reduced levels of normal mRNA due to aberrant splicing in 8 9 exon28:c.3503+4A>C affected individuals (see below). We also noted increased stability of 10 11 p.Asn1261His THOC2 protein. We and others have shown that THOC2 controls TREX 12 13 14 function by maintaining the stability of THOC1, 3, 5 and 7 subunits (Chi, et al., 2013; 15 16 Kumar, et al., 2015). Reduced levels of THOC2 missense variant proteins are most likely due 17 18 to enhanced proteasomemediated degradation as THOC2 is ubiquitinated (LopitzOtsoa, et 19 For Peer Review 20 al., 2012). Published evidence shows that molecular chaperons protect proteins from 21 22 degradation by promoting correct protein folding and assembly. Reduced stability of a set of 23 24 variant THOC2 proteins may be due to structural changes or impaired protectiveassociation 25 26 27 with an unknown chaperon, which can be TREX complex subunit or an unrelated protein. 28 29 Indeed, reduced levels of THOC2 protein in a girl with a de novo X;8 translocation creating a 30 31 PTK2-THOC2 gene fusion has also been implicated in nonprogressive form of congenital 32 33 ataxia, cognitive impairment and cerebellar hypoplasia (Di Gregorio, et al., 2013). This is 34 35 consistent with observations showing that THOC2 depletion has different consequences in 36 37 different organisms. Some of the examples include reduced levels of Thoc2 leading to 38 39 40 significant increase in length of neurites in rat primary hippocampal neurons (Di Gregorio, et 41 42 al., 2013) and C. elegans thoc2 knockouts that are completely immobile, slowgrowing, 43 44 sterile, have functional defects in specific sensory neurons and die prematurely (Di Gregorio, 45 46 et al., 2013). D. rerio Thoc2 is essential for embryonic development (Amsterdam, et al., 47 48 2004) and in D. melanogaster S2 cells Thoc2 knockdown inhibits mRNA export and cell 49 50 proliferation (Rehwinkel, et al., 2004). THOC2 depletion also results in chromosome 51 52 53 alignment, mitotic progression and genomic stability in human HeLa cells (Yamazaki, et al., 54 55 56 57 58 14 59 60 John Wiley & Sons, Inc. Human Mutation Page 16 of 67

1 2 3 2010). Finally, Thoc2 and Thoc5 knock down experiments have shown their role in 4 5 regulation of embryonic stem cell selfrenewal (Wang, et al., 2013). 6 7 8 9 Compared to his unaffected heterozygous carrier mother, who showed skewed X 10 11 inactivation, the exon35:c.44502A>G variant caused defective splicing in the male proband 12 13 14 resulting with not only a truncated THOC2 protein but also abnormal accumulation in his 15 16 skin fibroblasts. The truncated 1535 amino acid THOC2 protein lacks Cterminal 110 amino 17 18 acids (part of a putative RNA binding domain) of the 1593 normal THOC2 protein but added 19 For Peer Review 20 52 amino acids coded from the unspliced intron (Figure 5; (Pena, et al., 2012)). Interestingly, 21 22 whereas cDNA sequencing of exon 2730 region for the exon28:c.3503+4A>C THOC2 23 24 variant showed normal splicing in mother and father, the affected son showed aberrant 25 26 rd rd 27 splicing resulting in normally spliced (about 2/3 ) and intron 28 retained (nearly 1/3 ) 28 29 mRNAs. A 143 bp intron 28 retention is predicted to translate a 1175 amino acid THOC2 30 31 protein (1168 normal and 7 amino acids coded by the retained intron 28 fragment) completely 32 33 lacking the Cterminal RNA binding region (Figure 5). Both affected individuals carrying the 34 35 splicevariants presented with severe neurocognitive features, consistent with the phenotype 36 37 of the majority of individuals with missense variants with supportive functional evidence. 38 39 40 Taken together, exon35:c.44502A>G and exon28:c.3503+4A>C THOC2 splice variants 41 42 present interesting biological scenarios; the former resulting in 1535 amino acid truncated 43 44 protein that is present at higher level and the latter with both normal (albeit potentially much 45 46 reduced) and a 1175 amino acid truncated THOC2 protein. We postulate that the clinical 47 48 outcomes in the exon35:c.44502A>G individual are caused by partial loss of function due to 49 50 loss of 110 amino acid Cterminal region and accumulation of the truncated THOC2 protein. 51 52 53 However, pathogenicity in exon28:c.3503+4A>C affected individual is most likely caused by 54 55 reduced levels of normal and potential dominantnegative effects of the Cterminally 56 57 58 15 59 60 John Wiley & Sons, Inc. Page 17 of 67 Human Mutation

1 2 3 truncated THOC2 protein. That reduced THOC2 protein levels are associated with ID and 4 5 other clinical symptoms is emerging as a frequent theme; e.g., due to reduced THOC2 protein 6 7 stability caused by missense variants (see above) and now an aberrant splicing event. Indeed 8 9 reduced THOC2 levels are shown to destabilize the TREX complex in humans (Chi, et al., 10 11 2013; Kumar, et al., 2015) and removal of any THO subunit causes destabilization of other 12 13 14 TREX components in yeast (Pena, et al., 2012). Interestingly, p.Arg1307Trp THOC2 variant 15 16 was recently reported to effect THOC2THOC1 binding and thus THOC1 degradation in 17 18 MDAMB157 breast cancer cells (Lapek, et al., 2017). 19 For Peer Review 20 21 22 Systematic studies on functional effect of Tho2 Cterminal RNA binding region in yeast 23 24 provide interesting explanation as to how the truncated THOC2 protein can perturb normal 25 26 27 mRNA export function in human cells (Pena, et al., 2012) (Figure 6). The yeast data show 28 29 that whereas ∆Tho2 strain does not grow at 37ºC (restrictive temperature), Tho2∆14081597 and 30 31 Tho∆12711597 growth is considerably reduced suggesting that Cterminal 12711597 amino 32 33 acids are required for cell survival at restrictive temperature (Pena, et al., 2012). If the 34 35 36 exon28:c.3503+A>C variant caused complete splicing defect retaining intron 28 in all 37 38 mRNAs, the cells would translate only a 1175 amino acids (with 1168 normal) THOC2 39 40 protein; essentially lacking the Cterminal region encompassing the RBD domain that when 41 42 deleted in yeast strain (Tho∆12711597) totally abolishes their growth at 37ºC. However, the 43 44 affected boy carrying a single allele of the exon28:c.3503+A>C THOC2 variant, although 45 46 47 with severe clinical symptoms, is alive. This could be explained by presence of reduced 48 rd 49 levels of THOC2 protein produced from translation of about 2/3 normallyspliced mRNA in 50 51 the affected white blood cells. Taken together, clinical outcomes in the affected boy may be 52 53 due to perturbed mRNA export caused by reduced levels of THOC2 protein and perhaps also 54 55 Cterminally truncated nonfunctional THOC2 protein translated from about 1/3rd aberrantly 56 57 58 16 59 60 John Wiley & Sons, Inc. Human Mutation Page 18 of 67

1 2 3 spliced mRNAs that retain intron 28. It is though likely that the severe effects due to 4 5 truncated and/or reduced THOC2 protein in the affected individuals may be compensated by 6 7 other TREX proteins and may explain the survival of the girl with congenital ataxia, 8 9 cognitive impairment and cerebellar hypoplasia due to a de novo X;8 translocation creating a 10 11 PTK2-THOC2 gene fusion, who expresses reduced levels of the THOC2 protein (Di 12 13 14 Gregorio, et al., 2013). For example, REF knockdown increases UIF protein levels 15 16 (Hautbergue, et al., 2009) and Nxf1 depletion results in increased levels of THOC2 17 18 (Viphakone, et al., 2012). 19 For Peer Review 20 21 22 We also identified a set of THOC2 missense variants that do not reach ACMG criteria for 23 24 (likely) pathogenicity. This includes four previously unreported variants (p.Arg77Cys; 25 26 27 p.Ser1108Leu; p.Arg1121Gly and p.Asn1261His) in five affected individuals (including a 28 29 brother pair) which are absent from databases of reference individuals (ExAC/ gnomAD) 30 31 (Lek, et al., 2016), affect highly evolutionarily conserved amino acid residues, and in silico 32 33 analyses are supportive of pathogenicity (Table 1). Consistent with not satisfying the ACMG 34 35 criteria, they remain variants of uncertain clinical significance in the absence of additional 36 37 supportive functional evidence. Specifically, the missense variants in individuals 810 and 12 38 39 40 did not affect protein stability in our cellular assay system. However, the variants may have a 41 42 detrimental effect on THOC2 function due to altered protein structure impacting protein 43 44 RNA and/or proteinprotein interactions with known or unknown TREX subunits. 45 46 Indeed amyotrophic lateral sclerosis (ALS)linked Matrin 3 variant proteins exhibit altered 47 48 interactions with specific TREX proteins and lead to nuclear mRNA export defects of both 49 50 global mRNA and more specifically, the TDP43 and FUS mRNAs, linking cellular transport 51 52 53 defects to ALS (Boehringer, et al., 2017). In addition, we note that two of the THOC2 54 55 variants (p.Leu313Phe and p.Ser1012Pro) in our original cohort (Kumar, et al., 2015) 56 57 58 17 59 60 John Wiley & Sons, Inc. Page 19 of 67 Human Mutation

1 2 3 segregated with NDDs in multiple individuals in families, but also lacked evidence of a 4 5 detrimental effect on protein stability. Detailed clinical information on the affected 6 7 individuals carrying these VOUS is presented in the supplementary data and summarized in 8 9 Table 2. Similarities in clinical presentation between the majority of affected individuals in 10 11 this group and with those with (likely) pathogenic variants are apparent, including severe 12 13 14 profound intellectual disability, behavioural difficulties including autistic features, infantile 15 16 hypotonia, seizure disorders, gait abnormalities, neuroradiological features including white 17 18 matter abnormalities and ventricular dilatation, microcephaly and short stature. Individual 8 19 For Peer Review 20 had cerebellar vermal hypoplasia which was also noted in an individual from the original 21 22 cohort (Kumar, et al., 2015). Individuals 8 and 11 were considered to share overlapping facial 23 24 features with other members of the cohort (Supp. Figure S4). However, many of these 25 26 27 features are relatively common in neurodevelopmental disorders in general, and we do not yet 28 29 consider the phenotype of THOC2related disorder to be distinctive enough to strongly 30 31 weight assessment of pathogenicity of novel variants. The challenge of proving causality for 32 33 previously unreported missense variants in NDD genes is well recognized and speaks to the 34 35 need for ongoing intertwined clinical and research efforts to clarify causality of VOUS 36 37 (Wright, et al., 2018). 38 39 40 41 42 THOC2 is ubiquitously expressed in all human tissues (Thul, et al., 2017) and is highly 43 44 expressed in the developing and mature human (Johnson, et al., 2009; Kumar, et al., 2015; 45 46 Uhlen, et al., 2015) and mouse brain, with higher abundance in frontal cortex and cerebellum 47 48 (Di Gregorio, et al., 2013; Kumar, et al., 2015). THOC2 is an essential mRNA export factor 49 50 as its siRNAmediated depletion results in almost complete retention of mRNAs in the cell 51 52 53 nucleus (Chi, et al., 2013), potentially toxic to the cell. These data are consistent with the 54 55 findings that THOC2 is a highlyconstrained gene (Samocha, et al., 2014) and THOC (e.g. 56 57 58 18 59 60 John Wiley & Sons, Inc. Human Mutation Page 20 of 67

1 2 3 THOC1, 3, 5, 6 and 7) genes are essential for cell survival (Blomen, et al., 2015). Taken 4 5 together, as THOC2 knockout cells will not survive due to complete mRNA nuclear 6 7 retention, we predict that the identified THOC2 variants represent partial lossoffunction that 8 9 disrupt normal mRNA export in neuronal and possibly other cell types, potentially causing 10 11 variable clinical presentations. 12 13 14 15 16 TREX complex couples transcription and mRNA biogenesis with nuclear mRNA export and 17 18 has emerged as an essential pathway in embryogenesis, organogenesis and differentiation 19 For Peer Review 20 (Heath, et al., 2016). For example, Thoc2 and Thoc5 selectively bind and regulate export of 21 22 mRNAs involved in maintenance of pluripotency (e.g. Nanog, Sox2, Esrrb, and Klf4 23 24 mRNAs) (Wang, et al., 2013) and Thoc5 in hematopoiesis and heat shock (Katahira, et al., 25 26 27 2009; Mancini, et al., 2010). Lack of Thoc1 and Thoc5 in mice is embryonic lethal (Mancini, 28 29 et al., 2010; Wang, et al., 2006). Mouse modeling shows that both Thoc1 and Thoc5 30 31 knockouts are embryonic lethal (Mancini, et al., 2010; Wang, et al., 2006). However, Thoc1 32 33 and Thoc5 expression in a range of developing and adult tissues may indicate that the two 34 35 genes have more essential role in early embryonic development compared to less stringent 36 37 requirement during later stages of embryonic or adult development (Mancini, et al., 2010; 38 39 40 Wang, et al., 2006); a functional pattern most likely also true for the THOC2 gene. 41 42 Essentiality of THOC2 gene indicates that THOC2 knockout will also be lethal. However, 43 44 reduced levels or perturbed functionality can lead to a range of NDD phenotypes as observed 45 46 for a cohort of THOC2 variants identified by us. It is now wellestablished that development 47 48 of brain depends on tightly regulated and complex sequence of events involving neuronal and 49 50 glial cell proliferation, migration and maturation (Chiurazzi and Pirozzi, 2016). Therefore, it 51 52 53 is not surprising that our THOC2 variant data and published work (Dickinson, et al., 2016) 54 55 provides strong evidence that even subtle alterations to the canonical molecular pathways 56 57 58 19 59 60 John Wiley & Sons, Inc. Page 21 of 67 Human Mutation

1 2 3 such as mRNA export, otherwise essential for cellular life, can be tolerated but at a cost of a 4 5 NDD. 6 7 8 9 In summary, we present detailed clinical data on seven individuals with THOC2-associated 10 11 ID caused by both missense and splice variants that meet ACMG criteria for (likely) 12 13 14 pathogenicity. They have a core phenotype of ID, and common findings of behavioural 15 16 disorders, infantile hypotonia, gait disturbance and growth impairment, similar to the affected 17 18 males with THOC2associated ID we previously reported (Kumar, et al., 2015). Other than 19 For Peer Review 20 the affected female with a de novo missense variant, heterozygote carrier females are 21 22 typically unaffected. We also present clinical data on five individuals with four previously 23 24 unreported rare missense variants that show clinical overlap with our core group, but where 25 26 27 convincing evidence for causality is still required. The significance of these variants may be 28 29 clarified as further individuals with THOC2 variants are reported. 30 31 32 33 Acknowledgements 34 35 We thank the individuals and families for their contribution to this study. This work was 36 37 supported by NHMRC Program Grant APP1091593 and Senior Research Fellowship 38 39 40 APP1041920 and Channel 7 Children’s Research Foundation to J.G., research grants 41 42 APP19PI10/01710 from the Fondo de Investigación Sanitaria; the Spanish Ministry of 43 44 Economy and Competitiveness, ‘Centro de Excelencia Severo Ochoa 20172021’, SEV 45 46 20160571; a MINECO Severo Ochoa fellowship to LD (SVP20130680066); FORGE 47 48 Canada and Care4Rare Canada Consortium funded by Genome Canada, the Canadian 49 50 Institutes of Health Research (CIHR), the Ontario Genomics Institute, Ontario Research 51 52 53 Fund, Génome Québec, and Children's Hospital of Eastern Ontario Foundation; the National 54 55 Research Institute, the National Eye 56 57 58 20 59 60 John Wiley & Sons, Inc. Human Mutation Page 22 of 67

1 2 3 Institute, and the National Heart, Lung and Blood Institute grant UM1 4 5 HG008900 to Daniel MacArthur and Heidi Rehm; the Undiagnosed Diseases Program 6 7 Victoria (UDPVic), Murdoch Children¹s 8 9 Research Institute, Melbourne, Australia. 10 11

12 13 14 15 16 17 18 19 For Peer Review 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 21 59 60 John Wiley & Sons, Inc. Page 23 of 67 Human Mutation

1 2 3 References 4 5 Amos JS, Huang L, Thevenon J, Kariminedjad A, Beaulieu CL, MasurelPaulet A, 6 Najmabadi H, Fattahi Z, Beheshtian M, Tonekaboni SH and others. 2017. Autosomal 7 recessive mutations in THOC6 cause intellectual disability: syndrome delineation 8 requiring forward and reverse phenotyping. Clin Genet 91(1):9299. 9 10 Amsterdam A, Nissen RM, Sun Z, Swindell EC, Farrington S, Hopkins N. 2004. 11 Identification of 315 genes essential for early zebrafish development. Proc Natl Acad 12 Sci U S A 101(35):127927. 13 14 Beaulieu CL, Huang L, Innes AM, Akimenko MA, Puffenberger EG, Schwartz C, Jerry P, 15 16 Ober C, Hegele RA, McLeod DR and others. 2013. Intellectual disability associated 17 with a homozygous missense mutation in THOC6. Orphanet J Rare Dis 8(1):62. 18 19 Bittles AH, Petterson BA,For Sullivan Peer SG, Hussain Review R, Glasson EJ, Montgomery PD. 2002. The 20 influence of intellectual disability on life expectancy. J Gerontol A Biol Sci Med Sci 21 57(7):M4702. 22 23 Blomen VA, Majek P, Jae LT, Bigenzahn JW, Nieuwenhuis J, Staring J, Sacco R, van 24 Diemen FR, Olk N, Stukalov A and others. 2015. Gene essentiality and synthetic 25 lethality in haploid human cells. Science 350(6264):10926. 26 27 Boehringer A, GarciaMansfield K, Singh G, Bakkar N, Pirrotte P, Bowser R. 2017. ALS 28 Associated Mutations in Matrin 3 Alter ProteinProtein Interactions and Impede 29 mRNA Nuclear Export. Sci Rep 7(1):14529. 30 31 Chi B, Wang Q, Wu G, Tan M, Wang L, Shi M, Chang X, Cheng H. 2013. Aly and THO are 32 required for assembly of the human TREX complex and association of TREX 33 components with the spliced mRNA. Nucleic Acids Res 41(2):1294306. 34 35 Chinnam M, Wang Y, Zhang X, Gold DL, Khoury T, Nikitin AY, Foster BA, Li Y, Bshara 36 W, Morrison CD and others. 2014. The Thoc1 ribonucleoprotein and prostate cancer 37 38 progression. J Natl Cancer Inst 106(11). 39 40 Chiurazzi P, Pirozzi F. 2016. Advances in understanding genetic basis of intellectual 41 disability. F1000Res 5. 42 43 Coe BP, Witherspoon K, Rosenfeld JA, van Bon BW, Vultovan Silfhout AT, Bosco P, 44 Friend KL, Baker C, Buono S, Vissers LE and others. 2014. Refining analyses of 45 copy number variation identifies specific genes associated with developmental delay. 46 Nat Genet 46(10):106371. 47 48 Di Gregorio E, Bianchi FT, Schiavi A, Chiotto AM, Rolando M, Verdun di Cantogno L, 49 Grosso E, Cavalieri S, Calcia A, Lacerenza D and others. 2013. A de novo X;8 50 translocation creates a PTK2THOC2 gene fusion with THOC2 expression 51 knockdown in a patient with psychomotor retardation and congenital cerebellar 52 hypoplasia. J Med Genet 50(8):54351. 53 54 Dickinson ME, Flenniken AM, Ji X, Teboul L, Wong MD, White JK, Meehan TF, Weninger 55 WJ, Westerberg H, Adissu H and others. 2016. Highthroughput discovery of novel 56 developmental phenotypes. Nature 537(7621):508514. 57 58 22 59 60 John Wiley & Sons, Inc. Human Mutation Page 24 of 67

1 2 3 DuddingByth T, Baxter A, Holliday EG, Hackett A, O'Donnell S, White SM, Attia J, 4 Brunner H, de Vries B, Koolen D and others. 2017. Computer facematching 5 technology using twodimensional photographs accurately matches the facial gestalt 6 of unrelated individuals with the same syndromic form of intellectual disability. BMC 7 Biotechnol 17(1):90. 8 9 Epi KC, Epilepsy Phenome/Genome P, Allen AS, Berkovic SF, Cossette P, Delanty N, 10 Dlugos D, Eichler EE, Epstein MP, Glauser T and others. 2013. De novo mutations in 11 epileptic encephalopathies. Nature 501(7466):21721. 12 13 Guria A, Tran DD, Ramachandran S, Koch A, El Bounkari O, Dutta P, Hauser H, Tamura T. 14 2011. Identification of mRNAs that are spliced but not exported to the cytoplasm in 15 16 the absence of THOC5 in mouse embryo fibroblasts. RNA 17(6):104856. 17 18 Hautbergue GM. 2017. RNA Nuclear Export: From Neurological Disorders to Cancer. Adv 19 Exp Med Biol For1007:89109. Peer Review 20 21 Hautbergue GM, Hung ML, Walsh MJ, Snijders AP, Chang CT, Jones R, Ponting CP, 22 Dickman MJ, Wilson SA. 2009. UIF, a New mRNA export adaptor that works 23 together with REF/ALY, requires FACT for recruitment to mRNA. Curr Biol 24 19(22):191824. 25 26 Heath CG, Viphakone N, Wilson SA. 2016. The role of TREX in gene expression and 27 disease. Biochem J 473(19):291135. 28 29 Hirayama S, Yamazaki Y, Kitamura A, Oda Y, Morito D, Okawa K, Kimura H, Cyr DM, 30 Kubota H, Nagata K. 2008. MKKS is a centrosomeshuttling protein degraded by 31 diseasecausing mutations via CHIPmediated ubiquitination. Mol Biol Cell 32 19(3):899911. 33 34 Hosking FJ, Carey IM, Shah SM, Harris T, DeWilde S, Beighton C, Cook DG. 2016. 35 Mortality Among Adults With Intellectual Disability in England: Comparisons With 36 the General Population. Am J Public Health 106(8):148390. 37 38 Johnson MB, Kawasawa YI, Mason CE, Krsnik Z, Coppola G, Bogdanovic D, Geschwind 39 40 DH, Mane SM, State MW, Sestan N. 2009. Functional and evolutionary insights into 41 human brain development through global transcriptome analysis. Neuron 62(4):494 42 509. 43 44 Katahira J, Inoue H, Hurt E, Yoneda Y. 2009. Adaptor Aly and coadaptor Thoc5 function in 45 the Tapp15mediated nuclear export of HSP70 mRNA. EMBO J 28(5):55667. 46 47 Kim W, Bennett EJ, Huttlin EL, Guo A, Li J, Possemato A, Sowa ME, Rad R, Rush J, Comb 48 MJ and others. 2011. Systematic and quantitative assessment of the ubiquitin 49 modified proteome. Mol Cell 44(2):32540. 50 51 Kumar R, Corbett MA, van Bon BW, Woenig JA, Weir L, Douglas E, Friend KL, Gardner A, 52 Shaw M, Jolly LA and others. 2015. THOC2 Mutations Implicate mRNAExport 53 Pathway in XLinked Intellectual Disability. Am J Hum Genet 97(2):30210. 54 55 56 57 58 23 59 60 John Wiley & Sons, Inc. Page 25 of 67 Human Mutation

1 2 3 Lapek JD, Jr., Greninger P, Morris R, Amzallag A, PruteanuMalinici I, Benes CH, Haas W. 4 2017. Detection of dysregulated proteinassociation networks by highthroughput 5 proteomics predicts cancer vulnerabilities. Nat Biotechnol 35(10):983989. 6 7 Lek M, Karczewski KJ, Minikel EV, Samocha KE, Banks E, Fennell T, O'DonnellLuria AH, 8 Ware JS, Hill AJ, Cummings BB and others. 2016. Analysis of proteincoding genetic 9 variation in 60,706 humans. Nature 536(7616):28591. 10 11 Liu C, Yue B, Yuan C, Zhao S, Fang C, Yu Y, Yan D. 2015. Elevated expression of Thoc1 is 12 associated with aggressive phenotype and poor prognosis in colorectal cancer. 13 Biochem Biophys Res Commun 468(12):538. 14 15 LopitzOtsoa F, RodriguezSuarez E, Aillet F, CasadoVela J, Lang V, Matthiesen R, Elortza 16 F, Rodriguez MS. 2012. Integrative analysis of the ubiquitin proteome isolated using 17 18 Tandem Ubiquitin Binding Entities (TUBEs). J Proteomics 75(10):29983014. 19 For Peer Review 20 Mancini A, NiemannSeyde SC, Pankow R, El Bounkari O, KlebbaFarber S, Koch A, 21 Jaworska E, Spooncer E, Gruber AD, Whetton AD and others. 2010. THOC5/FMIP, 22 an mRNA export TREX complex protein, is essential for hematopoietic primitive cell 23 survival in vivo. BMC Biol 8:1. 24 25 Milani D, Ronzoni L, Esposito S. 2015. Genetic Advances in Intellectual Disability. J Pediatr 26 Genet 4(3):1257. 27 28 Morgan DO. 2013. The D box meets its match. Mol Cell 50(5):60910. 29 30 Oeseburg B, Dijkstra GJ, Groothoff JW, Reijneveld SA, Jansen DE. 2011. Prevalence of 31 chronic health conditions in children with intellectual disability: a systematic 32 literature review. Intellect Dev Disabil 49(2):5985. 33 34 Ohno K, Takeda JI, Masuda A. 2017. Rules and tools to predict the splicing effects of exonic 35 and intronic mutations. Wiley Interdiscip Rev RNA. 36 37 Olsen JV, Blagoev B, Gnad F, Macek B, Kumar C, Mortensen P, Mann M. 2006. Global, in 38 vivo, and sitespecific phosphorylation dynamics in signaling networks. Cell 39 127(3):63548. 40 41 Palmer EE, Kumar R, Gordon CT, Shaw M, Hubert L, Carroll R, Rio M, Murray L, Leffler 42 M, DuddingByth T and others. 2017. A Recurrent De Novo Nonsense Variant in 43 44 ZSWIM6 Results in Severe Intellectual Disability without Frontonasal or Limb 45 Malformations. Am J Hum Genet 101(6):9951005. 46 47 Palmer EE, Stuhlmann T, Weinert S, Haan E, Van Esch H, Holvoet M, Boyle J, Leffler M, 48 Raynaud M, Moraine C and others. 2016. De novo and inherited mutations in the X 49 linked gene CLCN4 are associated with syndromic intellectual disability and behavior 50 and seizure disorders in males and females. Mol Psychiatry. 51 52 Pena A, Gewartowski K, Mroczek S, Cuellar J, Szykowska A, Prokop A, CzarnockiCieciura 53 M, Piwowarski J, Tous C, Aguilera A and others. 2012. Architecture and nucleic 54 acids recognition mechanism of the THO complex, an mRNP assembly factor. EMBO 55 J 31(6):160516. 56 57 58 24 59 60 John Wiley & Sons, Inc. Human Mutation Page 26 of 67

1 2 3 Rehwinkel J, Herold A, Gari K, Kocher T, Rode M, Ciccarelli FL, Wilm M, Izaurralde E. 4 2004. Genomewide analysis of mRNAs regulated by the THO complex in 5 Drosophila melanogaster. Nat Struct Mol Biol 11(6):55866. 6 7 Richards S, Aziz N, Bale S, Bick D, Das S, GastierFoster J, Grody WW, Hegde M, Lyon E, 8 Spector E and others. 2015. Standards and guidelines for the interpretation of 9 sequence variants: a joint consensus recommendation of the American College of 10 Medical Genetics and Genomics and the Association for Molecular Pathology. Genet 11 Med 17(5):40524. 12 13 Ropers HH, Hamel BC. 2005. Xlinked mental retardation. Nat Rev Genet 6(1):4657. 14 15 Samocha KE, Robinson EB, Sanders SJ, Stevens C, Sabo A, McGrath LM, Kosmicki JA, 16 Rehnstrom K, Mallick S, Kirby A and others. 2014. A framework for the 17 18 interpretation of de novo mutation in human disease. Nat Genet 46(9):94450. 19 For Peer Review 20 Snijders Blok L, Madsen E, Juusola J, Gilissen C, Baralle D, Reijnders MR, Venselaar H, 21 Helsmoortel C, Cho MT, Hoischen A and others. 2015. Mutations in DDX3X Are a 22 Common Cause of Unexplained Intellectual Disability with GenderSpecific Effects 23 on Wnt Signaling. Am J Hum Genet 97(2):34352. 24 25 Thul PJ, Akesson L, Wiking M, Mahdessian D, Geladaki A, Ait Blal H, Alm T, Asplund A, 26 Bjork L, Breckels LM and others. 2017. A subcellular map of the human proteome. 27 Science 356(6340). 28 29 Uhlen M, Fagerberg L, Hallstrom BM, Lindskog C, Oksvold P, Mardinoglu A, Sivertsson A, 30 Kampf C, Sjostedt E, Asplund A and others. 2015. Proteomics. Tissuebased map of 31 the human proteome. Science 347(6220):1260419. 32 33 Viphakone N, Cumberbatch MG, Livingstone MJ, Heath PR, Dickman MJ, Catto JW, Wilson 34 SA. 2015. Luzp4 defines a new mRNA export pathway in cancer cells. Nucleic Acids 35 Res 43(4):235366. 36 37 Viphakone N, Hautbergue GM, Walsh M, Chang CT, Holland A, Folco EG, Reed R, Wilson 38 SA. 2012. TREX exposes the RNAbinding domain of Nxf1 to enable mRNA export. 39 40 Nat Commun 3:1006. 41 42 Vissers LE, Gilissen C, Veltman JA. 2016. Genetic studies in intellectual disability and 43 related disorders. Nat Rev Genet 17(1):918. 44 45 Wagner SA, Beli P, Weinert BT, Nielsen ML, Cox J, Mann M, Choudhary C. 2011. A 46 proteomewide, quantitative survey of in vivo ubiquitylation sites reveals widespread 47 regulatory roles. Mol Cell Proteomics 10(10):M111 013284. 48 49 Wang L, Miao YL, Zheng X, Lackford B, Zhou B, Han L, Yao C, Ward JM, Burkholder A, 50 Lipchina I and others. 2013. The THO complex regulates pluripotency gene mRNA 51 export and controls embryonic stem cell selfrenewal and somatic cell 52 reprogramming. Cell Stem Cell 13(6):67690. 53 54 Wang X, Chang Y, Li Y, Zhang X, Goodrich DW. 2006. Thoc1/Hpr1/p84 is essential for 55 early embryonic development in the mouse. Mol Cell Biol 26(11):43627. 56 57 58 25 59 60 John Wiley & Sons, Inc. Page 27 of 67 Human Mutation

1 2 3 Wang X, Chinnam M, Wang J, Wang Y, Zhang X, Marcon E, Moens P, Goodrich DW. 2009. 4 Thoc1 deficiency compromises gene expression necessary for normal testis 5 development in the mouse. Mol Cell Biol 29(10):2794803. 6 7 Woerner AC, Frottin F, Hornburg D, Feng LR, Meissner F, Patra M, Tatzelt J, Mann M, 8 Winklhofer KF, Hartl FU and others. 2016. Cytoplasmic protein aggregates interfere 9 with nucleocytoplasmic transport of protein and RNA. Science 351(6269):1736. 10 11 Wright CF, McRae JF, Clayton S, Gallone G, Aitken S, FitzGerald TW, Jones P, Prigmore E, 12 Rajan D, Lord J and others. 2018. Making new genetic diagnoses with old data: 13 iterative reanalysis and reporting from genomewide data in 1,133 families with 14 developmental disorders. Genet Med. 15 16 Yamazaki T, Fujiwara N, Yukinaga H, Ebisuya M, Shiki T, Kurihara T, Kioka N, Kambe T, 17 18 Nagao M, Nishida E and others. 2010. The closely related RNA helicases, UAP56 19 and URH49, preferentiallyFor Peer form distinct Review mRNA export machineries and coordinately 20 regulate mitotic progression. Mol Biol Cell 21(16):295365. 21 22 Zhu X, Need AC, Petrovski S, Goldstein DB. 2014. One gene, many neuropsychiatric 23 disorders: lessons from Mendelian diseases. Nat Neurosci 17(6):77381. 24 25 Zweier C, Rittinger O, Bader I, Berland S, Cole T, Degenhardt F, Di Donato N, Graul 26 Neumann L, Hoyer J, Lynch SA and others. 2014. Females with de novo aberrations 27 in PHF6: clinical overlap of BorjesonForssmanLehmann with CoffinSiris 28 syndrome. Am J Med Genet C Semin Med Genet 166C(3):290301. 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 26 59 60 John Wiley & Sons, Inc. Human Mutation Page 28 of 67

1 2 3 Figure legends 4 5 FIGURE 1 Location of variant amino acids and structural features in THOC2 protein. 6 7 Ubiquitinated (K139 and K1084 (Kim, et al., 2011) and K345 (LopitzOtsoa, et al., 2012; 8 9 Wagner, et al., 2011)), phosphorylated (S1448 and 1486 (Olsen, et al., 2006)) amino acid 10 11 residues, potential RNA binding domain (RBD) and destruction box (Dbox) and KEN box 12 13 14 sequences that interact with the Anaphase Promoting Complex/Cyclosome (APC/C) for 15 16 protein ubiquitination and subsequent destruction by the proteasome (Morgan, 2013) are 17 18 shown. Unreported (orange) and published (black: (Kumar, et al., 2015)) missense variants 19 For Peer Review 20 effecting THOC2 protein stability are marked with red lollipops. The positions of two splice 21 22 variants are also shown (red). 23 24

25 26 27 FIGURE 2 Front and side facial views of the affected individuals with THOC2 variants. 28 29 30 31 FIGURE 3 Functional testing of THOC2 missense variants. A: THOC2 variant protein 32 33 stability is reduced in HEK293T cells. pCMVMycTHOC2 wildtype or variant expression 34 35 constructs and pEGFPC1 plasmid (transfection control) were transfected into HEK293T 36 37 cells. Total protein lysates of cells 24hr posttransfection were analysed by western blotting 38 39 40 with mouse antiMyc (clone 9E10; Sigma), mouse antiEGFP (clones 7.1 and 13.1; Roche) 41 42 and rabbit antiβtubulin (loading control; Abcam) antibodies. pCMVMycTHOC2 43 44 p.Ile800Thr construct expressing the p.Ile800Thr protein shown to have reduced stability was 45 46 used as a control (Kumar, et al., 2015). Western blot signals were quantified using ImageJ 47 48 software. Averages of the MycTHOC2 proteins normalised to the housekeeping βtubulin 49 50 signal from two independent runs are shown. B: MycTyr517Cys THOC2 protein halflife is 51 52 53 significantly reduced in HEK293T cells. pCMVMycTHOC2 or pCMVMycTHOC2 54 55 Tyr517Cys expression constructs and pEGFPC1 plasmid (transfection control) were 56 57 58 27 59 60 John Wiley & Sons, Inc. Page 29 of 67 Human Mutation

1 2 3 transfected into HEK293T cells. Next day the cells were cultured in the presence of 100 4 5 µg/ml translation inhibitor cycloheximide and cells harvested at the time points shown. Total 6 7 protein lysates were analysed by western blotting with mouse antiMyc, mouse antiEGFP 8 9 and rabbit antiβtubulin (loading control) antibodies. 10 11 12 13 14 FIGURE 4 Exon35:c.44502A>G variant abolishes splicing of intron between exons 3435. 15 16 A: Part of the THOC2 gene showing location of the A/G nucleotide in the heterozygous 17 18 carrier mother and A>G splice variant in the affected son. The Cterminal part of the 1593 19 For Peer Review 20 amino acid wild type and 1535 amino acid (that contains 1483 normal and 52 amino acids 21 22 coded by the unspliced intron) THOC2 protein in the affected boy are also shown. B: Gel 23 24 25 showing a 194 bp RTPCR product from the normallyspliced mRNA of the heterozygous 26 27 carrier mother and a 537bp product from defective splicing of mRNA causing retention of an 28 29 intron between exon 3435 in the affected son. RTPCR products from total RNA isolated 30 31 from passage 3 (lanes 12) and 5 (lanes 34) fibroblasts. Location of the forward and reverse 32 33 primers within exons 34 and 35 is shown. C: Sanger sequencing chromatograms of PCR 34 35 products amplified from genomic and cDNA of the affected son and his heterozygous carrier 36 37 38 mother using primers located within exon 34 and 35. Genomic DNA around the Exon34 39 40 IntronExon35 region is shown. D: Western blot showing THOC2 protein in the affected son 41 42 and his carrier mother's skin fibroblasts. TREX subunit UAP56 was used as a loading control. 43 44 E: RTqPCR showing levels of the THOC2 mRNA in the affected son and his carrier 45 46 mother's skin fibroblasts. F: Immunofluorescence detection of THOC2 in skin fibroblasts of 47 48 the unaffected mother and affected son. 49 50 51 52 53 FIGURE 5 Exon28:c.3503+4A>C variant causes aberrant splicing of intron 28 A: Part of the 54 55 THOC2 gene showing location of the A/C nucleotide in the heterozygous carrier mother and 56 57 58 28 59 60 John Wiley & Sons, Inc. Human Mutation Page 30 of 67

1 2 3 A>C splice variant in the affected son. The Cterminal part of the 1593 amino acid wild type 4 5 and 1175 amino acid (that contains 1168 normal and 7 amino acids coded by the unspliced 6 7 intron) THOC2 protein in the affected son are also shown. B: Gel showing a 491 bp RT PCR 8 9 product from the normallyspliced heterozygous carrier mother and unaffected father, and 10 11 491 bp and 634 bp (retaining 143 bp of the 503 bp intron 28) RTPCR products derived from 12 13 14 the normally and aberrantly spliced mRNAs, respectively, in the affected son. Location of the 15 16 forward and reverse primers within exons 27 and 30 is shown. C: Sanger sequencing 17 18 chromatograms of PCR products amplified using primers located within exons 27 and 30 19 For Peer Review 20 from genomic and cDNA of unaffected father and mother (carrier with 98:2% skewing), and 21 22 the affected son. 23 24

25 26 27 FIGURE 6 Summary of truncated human THOC2 proteins translated from aberrantlyspliced 28 29 mRNAs and functional outcomes of yeast Cterminal Tho2 deletion strains (Pena, et al., 30 31 2012). Blue boxes depict the 52 and 7 amino acids coded by unspliced intron sequences of 32 33 exon35:c.45502T>C and exon28:c.3503+4A>C variants, respectively. WT, +++ = normal, 34 35 ∆14081597aa, ++ and ∆12711597aa = reduced growth at restrictive temperature. 36 37 38 39 40 TABLE 1 Detailed description of the THOC2 variants. 41 42 43 44 TABLE 2 Summary of clinical data. 45 46 47 48 49 50 51 52 53 54 55 56 57 58 29 59 60 John Wiley & Sons, Inc. Page 31 of 67 Human Mutation

1 2 3 Supplementary data files 4 5 6 7 Supplementary file with detailed clinical data 8 9 10 11 Supplementary Figure S1. THOC2 variant amino acid residues are highly conserved. 12 13 14 Amino acid sequence alignments show the variant amino acid residue (red) at the conserved 15 16 positions (green). The amino acid sequences were aligned using www..org/align/. 17 18 19 For Peer Review 20 Supplementary Figure S2 Brain MRI of the individual carrying the exon28: c.3503+4A>C 21 22 THOC2 variant. Red arrow shows myelination and yellow arrow shows periventricular white 23 24 matter lesion at age 2y 11m. 25 26 27 28 29 Supplementary Figure S3 A: THOC2 variant proteins are normally localised in HeLa and 30 31 HEK293T cells. HeLa and HEK293T cells cultured on PolyLlysine coated coverslips (BD 32 33 Biosciences) were transfected with wild type or variant MycTHOC2 expression plasmids 34 35 and detected by immunofluorescence using mouse antiMyc (9E10; Sigma) primary and 36 37 Donkey antirabbitIgG Alexa 488 secondary antibody (Life Technologies). Only HeLa cell 38 39 40 localisations at 630× magnification are shown. B: Nuclear and cytoplasmic (HEK293T; 41 42 average 250 cells each) and nuclear and nuclear plus cytoplasmic (HeLa; average 400 cells 43 44 each) staining was visually scored and plotted. Note that nuclear plus cytoplasmic instead of 45 46 only cytoplasmic staining for HeLa was scored because all the cells with cytoplasmic staining 47 48 also had nuclear staining. 49 50 51 52 53 Supplementary Figure S4 Children and adolescents/adults within current and previously 54 55 reported THOC2 cohorts. Images connected by vertical grey shading represent the same 56 57 58 30 59 60 John Wiley & Sons, Inc. Human Mutation Page 32 of 67

1 2 3 individual at different ages. 4 5 6 7 Supplementary Table S1 Primers list (PCR and overlap PCR mutagenesis). 8 9 10 11 12 13 14 15 16 17 18 19 For Peer Review 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 31 59 60 John Wiley & Sons, Inc.

* * ACMG ACMG pathogenicity pathogenicity classification*

NO NO LP Protein Stability Stability Reduced

and male tested tested carrier carrier mother mother affected affected proband proband Variant Variant plasmid Fibrobla sts of the sts of

Polyphen2* Polyphen2*

gnomAD gnomAD frequency frequency

GERP++ GERP++ Phylop

Provean Provean prediction prediction

score score Provean

23.9 23.9 -4.01 Deleterious 5.97 2.016 absent P YES YES LP 23.7 23.7 N/A N/A 5.25 1.735 absent N/A 28.1 28.1 -6.08 Deleterious 5.7 1.902 absent D YES YES LP 23.1 23.1 -5.07 Deleterious 6.07 2.571 absent P YES YES LP 10.8 10.8 N/A N/A 5.57 1.86 absent N/A N/A ND LP CADD CADD Human Mutation John Wiley & Sons, Inc.

94:6% 94:6% 98:2% 98:2% Mother Mother Mother Mother Mother Mother skewed skewed skewed skewed skewed skewed De novo De novo 27.4 -5.47 Deleterious 5.03 0.963 absent D YES YES DP De novo De novo 26.6 -7.87 Deleterious 5.84 1.955 absent D YES YES DP De novo De novo 31 -4.69 Deleterious 5.73 2.412 absent D YES YES DP Mode of Mode of Maternal Maternal Maternal Maternal Maternal Maternal Maternal Maternal Maternal Maternal inheritance inheritance inheritance inheritance inheritance; inheritance; inheritance; inheritance; inheritance; inheritance; (99.9:0.1%) (99.9:0.1%) inheritance inheritance

4A>C 2A>G 2A>G r696Ile r696Ile 800Thr 800Thr u438Pro u438Pro y713Asp y713Asp s1187Tyr s1187Tyr yr517Cys yr517Cys c.1550A>G:p.T Exon35:c.4450- c.3559C>T:p.Hi c.1313T>C:p.Le c.2399T>C:p.Ile Exon28:c.3503+ c.2087C>T:p.Th c.2138G>A:p.Gl For Peer Review

Position Position NM_001081550 122799566 122799566 122765621

variants. THOC2 WES WES 122757079 WES WES 122778639 WES WES 122766890 Method of of Method sequencing sequencing sequencing X-Chr Exome Exome X-Chr Exome X-Chr identification identification

USA USA Japan Japan WES Trio 122757634 Canada Canada WES 122747561 y/Russia y/Russia Epilepsy Epilepsy 221, 2013 2013 221, me Project; me Project; Kumar et al al et Kumar al et Kumar 2015 AJHG 2015 AJHG 2015 AJHG USA Epi4K USA Consortium & & Consortium Identical twins twins Identical Phenome/Geno Canada/German Nature 501:217- Nature

1 1 2 Australia WES Trio 122767853 5 6 7 3-4 3-4

Detailed description of the of the description Detailed

Splice Splice Missense Missense Missense

Rare variants with supporting molecular evidence evidence molecular supporting with variants Rare Individual From TABLE TABLE 1 Page 33 of 67 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 Page 34 of 67 25.8 25.8 -3.07 Deleterious 5.09 2.219 absent P YES NO VOUS 22.8 22.8 -1.81 Neutral 5.34 1.768 absent D YES NO VOUS 24.2 24.2 -6.49 Deleterious 5.91 0.834 absent D NO NO VOUS 15.4 15.4 0.96 Neutral 5.91 2.493 absent B YES NO VOUS or LP 23.8 23.8 -2.44 23.8 Neutral -4.54 5.38 Deleterious 2.248 5.6 absent 1.87 absent D P YES NO YES VOUS NO VOUS Human Mutation John Wiley & Sons, Inc. likely likely shows shows family family mother mother blood or or blood mother's mother's germline germline Maternal Maternal Maternal Maternal Maternal Maternal Maternal Maternal Maternal members members testing of of testing variant in in variant peripheral peripheral likely new likely mosaicism mosaicism detected in in detected inheritance inheritance inheritance inheritance inheritance inheritance inheritance Variant not not Variant inheritance; inheritance; hair roots so so hair roots s c.229 c.229 313Phe 313Phe r1012Pro r1012Pro n1261His n1261His r1108Leu r1108Leu g1121Gly g1121Gly C>T:p.Arg77Cy c.3323C>T:p.Se c.3034T>C:p.Se c.937C>T:p.Leu c.3781A>C:p.As Forc.3361A>G:p.Ar Peer Review 122757818 122757818 122757780 122757780 122802090 122802090 122759786 WES WES 122756613 WES WES 122837349 affected affected brothers brothers individual individual sequencing sequencing sequencing and affected and affected WES of two two of WES WES parents WES X-Chr Exome Exome X-Chr Exome X-Chr USA USA USA USA Spain Spain Kumar et al al et Kumar al et Kumar 2015 AJHG 2015 AJHG 2015 AJHG

8 8 11 11 12

9-10 9-10 Spain

Missense Missense Rare variants without supporting molecular evidence evidence molecular supporting without variants Rare Significance, Pathogenic,VOUS LP;unknown Variant of DP; Pathogenic, Likely novo **De determined not B;ND, Benign, N/A, not applicable; P; damaging, Possibly D; damaging, *Probably 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

6 6 12 12 No No No No No No No No No No Yes, Yes, Yes, Yes, Yes, Yes, span span 5200 5200 Male Male Mod+ Mod+ spastic spastic phrases phrases spasticity spasticity hypotonia, hypotonia, quadiplegia quadiplegia Yes, central Yes, central c.3781A>C: c.3781A>C: appendicular appendicular Yes, 23 word word 23 Yes, p.Asn1261His p.Asn1261His short attention short attention LennoxGestaut LennoxGestaut

44 44 11 11 No No No No No No No No No No No No NA NA Yes Yes Non Non Male Male ambulatory ambulatory c.3361A>G: c.3361A>G: p.Arg1121Gly p.Arg1121Gly

12 12 37 37 Premature Term 10 10 No No No No No No No No No No Yes Yes severe Yes, Yes Yes Yes Yes Sib 2 Sib2 VOUS VOUS and toe toe and walking walking c.3323C>T: c.3323C>T: Yes, 'clumsy' 'clumsy' Yes, p.Ser1108Leu p.Ser1108Leu

9 9 17 17 40 40 No No No No No No No Yes Yes 2880 2880 3480 Male Male Male Sib 1 Sib1 and toe toe and walking walking c.3323C>T: c.3323C>T: Yes, 'clumsy' Yes, 'clumsy' p.Ser1108Leu p.Ser1108Leu 9 9 41 41 No No No No No No No No No Yes, ASD Yes, ASD Yes No No No 3770 3770 Male Male spasms spasms Profound Profound Severe Severe Severe spasticity spasticity hypotonia, hypotonia, c.229C>T: c.229C>T: Yes, central central Yes, p.Arg77Cys p.Arg77Cys appendicular appendicular Yes, infantile Yes, infantile Yes, nonverbal nonverbal Yes, Yes Non ambulatory ambulatory Non

10 10 No No No No NR NR NR NA NA NA NA NA NA Yes Yes Non Non verbal verbal Female Female Profound Profound Yes, non non Yes, ambulatory ambulatory c.1550A>G: c.1550A>G: p.Tyr517Cys p.Tyr517Cys Yes, epileptic epileptic Yes, encephalopathy encephalopathy

6 6 7 8 10 10 41 41 No No No No No No No Yes, 4365 4365 Male Male 2A>G 2A>G spasticity spasticity spasticity spasticity hypotonia hypotonia dysarthria dysarthria Yes, nasal nasal Yes, based gait, gait, based Yes, central Yes, central Human Mutation ataxia/broad ataxia/broad appendicular appendicular Exon35:c.4450 John Wiley & Sons, Inc. 3 3 37 37 No No No No No Yes No No No No No Yes Yes Yes Yes Non Non 3018 3018 Male Male verbal verbal Severe Severe Severe Exon28: Exon28: Yes, non Yes, Yes, ASD Yes, ASD No ambulatory ambulatory c.3503+4A>C c.3503+4A>C

4 4 5 37 37 No No No No No No NA NA Yes Yes For Peer Review Yes Twin 2 Twin 2 walking walking (likely) pathogenic pathogenic (likely) Yes, toe toe Yes, c.3559C>T: c.3559C>T: p.His1187Tyr p.His1187Tyr 3 3 37 37 No No No No No No No NA NA Yes Yes 1990 1990 2420 Twin 1 Twin 1 walking walking Yes, toe Yes, toe c.3559C>T: c.3559C>T:

p.His1187Tyr p.His1187Tyr 2 2 No No No No No No Yes Yes Yes, Yes, problems problems intermittent intermittent gait/balance gait/balance c.2138G>A: c.2138G>A: p.Gly713Asp p.Gly713Asp Yes, nonverbal nonverbal Yes, Yes

12 12 5 7 7 36 36 37 No No No No Yes Yes Yes Yes Yes No Yes Yes Male Male Male Male Male signs signs words, words, Severe Severe Mod+ Mod+ Mod+ Yes, single Yes, p.Thr696Ile p.Thr696Ile c.2087C>T: c.2087C>T: Summary of clinical of clinical data. Summary

Gait Gait Tremor Tremor No (weeks) (weeks) Gender Gender Anxiety No features features features features Epilepsy Epilepsy No Suspected No disability disability problems problems Gestation Gestation Perinatal Perinatal Low birth Low Behaviour Behaviour Hypotonia Hypotonia No Individual Individual 1 Intellectual Intellectual Depression No Neurologic Neurologic Age (years) Age (years) disturbances disturbances Speech delay Speech delay Hyperkinesia Hyperkinesia No Variant details details Variant weight (<2.5kg) (<2.5kg) weight TABLE TABLE 2 (g) weight Birth 2000 2650 Page 35 of 67 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

No No No No Unilateral Unilateral coloboma. coloboma. Page 36 of 67 hyperopia, hyperopia, astigmatism, astigmatism, leucomalacia leucomalacia feeding issues. issues. feeding Periventricular Periventricular admission with with admission Postnatal NICU NICU Postnatal Cryptorchidism, Cryptorchidism, right optic nerve nerve right optic

ND ND Yes Yes Yes Yes cataracts, cataracts, glaucoma, glaucoma, micropenis micropenis tachycardia, tachycardia, cardiomegaly, cardiomegaly, Osteoarthritis, Osteoarthritis, supraventricular supraventricular

No No Yes (39.9) (4%) No NA NA ND ND Yes Yes

No No No No ND ND Yes Yes eneuresis eneuresis nocturnal nocturnal Pes cavus, cavus, Pes No No No No Yes (<1%) No Yes (<1%) (<1%) Yes, No No No No and and corpus corpus dilated vermal vermal failure, failure, chronic chronic myopia, myopia, diabetes. diabetes. callosum, callosum, ventricles ventricles Cerebellar Cerebellar sialorrhea, sialorrhea, respiratory respiratory dysphagia, dysphagia, dependent: dependent: esophageal esophageal strabismus. strabismus. gastrostomy gastrostomy astigmatism, astigmatism, reflux. Type I I Type reflux. Tracheostomy Tracheostomy hypoplasia, thin hypoplasia, Cryptorchidism, Cryptorchidism,

No No No No NR No, 5% 5% No, impairment impairment MRI normal MRI normal Cortical visual visual Cortical

No No No No No No Slight spaces. spaces. nystagmus nystagmus soft palate, palate, soft nystagmus, nystagmus, Human Mutation abnormality abnormality hypoteloric, cerebrofluid cerebrofluid Clinodactyly, Clinodactyly, prominence of of prominence John Wiley & Sons, Inc. No No Yes,2% No No No No No No peri delayed delayed dilatation, dilatation, ventricular ventricular Ventricular Ventricular abnormality abnormality myelination, myelination, white matter white matter

No No No No No No For Yes Peer Review planus, planus, Noonan Noonan facies, pes pes facies, hypospadias hypospadias No No ND ND ND Yes Yes planus, planus, Noonan Noonan facies, pes pes facies, hypospadias hypospadias tube tube dysgyria dysgyria abnormal abnormal callosum, callosum, Mild joint joint Mild Thin corpus Thincorpus dependency), dependency), micrognathia, micrognathia, difficulties (g difficulties sleep, feeding feeding sleep, palmar creases creases palmar laxity, subluxed subluxed laxity, disordered hips, laryngomalacia, laryngomalacia,

No No No No Yes Yes 1% Yes,< No Yes Yes Yes Yes Yes Yes MRI MRI normal normal (≤3%) (≤3%) (≤3%) (≤3%) Growth Growth forehead forehead (BMI≥25) (BMI≥25) Broad high high Broad Overweight Overweight parameters parameters Short stature Short Microcephaly Microcephaly Brain MRI/CT MRI/CT Brain Other features features Other not NA: not done; ND: resonance MRI: magnetic imaging; severity; tube; mod+: at moderate least scan; gastrostomy tomography gtube: disorder;computerised CT spectrum autism ASD: centile; %:Abbreviations: significance carevariant uncertain of VOUS: unit; intensive neonatal NICU: not reported; NR:available; 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 Page 37 of 67 Human Mutation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 For Peer Review 20 21 194x70mm (300 x 300 DPI) 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 John Wiley & Sons, Inc. Human Mutation Page 38 of 67

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 For Peer Review 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 215x402mm (300 x 300 DPI) 48 49 50 51 52 53 54 55 56 57 58 59 60 John Wiley & Sons, Inc. Page 39 of 67 Human Mutation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 For Peer Review 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 106x156mm (300 x 300 DPI) 48 49 50 51 52 53 54 55 56 57 58 59 60 John Wiley & Sons, Inc. Human Mutation Page 40 of 67

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 For Peer Review 20 21 22 23 24 25 26 27 223x124mm (300 x 300 DPI) 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 John Wiley & Sons, Inc. Page 41 of 67 Human Mutation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 For Peer Review 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 134x148mm (300 x 300 DPI) 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 John Wiley & Sons, Inc. Human Mutation Page 42 of 67

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 For Peer Review 20 21 22 23 24 25 26 27

28 29 156x99mm (300 x 300 DPI) 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 John Wiley & Sons, Inc. Page 43 of 67 Human Mutation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 For Peer Review 20 21 22 23 24 25 26 27 231x127mm (300 x 300 DPI) 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 John Wiley & Sons, Inc. Human Mutation Page 44 of 67

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 For Peer Review 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 139x169mm (300 x 300 DPI) 48 49 50 51 52 53 54 55 56 57 58 59 60 John Wiley & Sons, Inc. Page 45 of 67 Human Mutation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 For Peer Review 20 21 22 23 24 25 26

27 28 236x140mm (300 x 300 DPI) 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 John Wiley & Sons, Inc. Human Mutation Page 46 of 67

1 SUPPLEMENTARY DATA FILE 2 3 4 Severe neurocognitive and growth disorders due to variation in THOC2, an essential 5 component of nuclear mRNA export machinery 6 7 8 Raman Kumar1, Alison Gardner1, Claire Homan1, Evelyn Douglas2, Heather Mefford3, 9 Dagmar Wieczorek4, Hermann-Josef Lüdecke4, Zornitza Stark5, Simon Sadedin6, The Broad 10 CMG7, Catherine Bearce Nowak8, Jessica Douglas8, Gretchen Parsons9, Paul Mark9, Lourdes 11 Loidi10, Gail Herman11, Theresa Mihalic Mosher11, Meredith K Gillespie12, Lauren Brady13, 12 13 14 15 16 13 Mark Tarnopolsky , Irene Madrigal , Jesús Eiris , Laura Domenech Salgado , Raquel 16 17 18 19 19 14 Rabionet , Tim M Strom , Naoko Ishihara , Hidehito Inagaki , Hiroki Kurahashi , 15 Tracy Dudding-Byth20,21, Elizabeth Palmer20,22, Michael Field20, Jozef Gecz1,23* 16 17 18 1Adelaide Medical School and the Robinson Research Institute, The University of Adelaide, 19 Adelaide, SA5000, Australia; 2Genetics and Molecular Pathology, SA Pathology, Adelaide, 20 SA5000, Australia; 3Division of Genetic Medicine, Department of Pediatrics, University of 21 For Peer Review 4 22 Washington & Seattle Children’s Hospital, Seattle, Washington, USA; Heinrich-Heine- 23 University, Medical Faculty, Institute of Human Genetics, Düsseldorf, Germany and Institut 24 für Humangenetik, Universitätsklinikum Essen, Universität Duisburg-Essen, Essen, 25 Germany; 5Murdoch Children's Research Institute and 26 Department of Pediatrics, University of Melbourne, Melbourne, Australia; 6Murdoch 27 Children's Research Institute and 28 Program in Medical and Population Genetics, Broad Institute of MIT and 29 Harvard, Cambridge, Massachusetts, USA; 7Broad’s Center for Mendelian Genomics, 30 8 31 Cambridge, Massachusetts, USA; Boston Children's Hospital, Longwood Avenue, Boston, 9 10 32 MA, USA; Spectrum Health Medical Genetics, Grand Rapids, MI, USA; Fundación 33 Pública Galega de Medicina Xenómica, Hospital Clínico Universitario de Santiago de, 34 Compostela. Santiago de Compostela, Spain; 11Nationwide Children’s Hospital and The Ohio 35 State University, Columbus, OH, USA; 12Children's Hospital of Eastern Ontario Research 36 Institute, University of Ottawa, Ottawa, Canada; 13Department of Pediatrics, McMaster 37 14 38 University Medical Centre, Hamilton, Canada; Fundación Pública Galega de Medicina 39 Xenómica, Hospital Clínico Universitario de Santiago de Compostela, Santiago de 15 40 Compostela, Spain; Unidad de Neurología Pediátrica, Departamento de Pediatría, Hospital 41 Clínico Universitario de Santiago de Compostela, Santiago de Compostela, Spain; 16Centre 42 for Genomic Regulation (CRG), Universitat Pompeu Fabra and CIBERESP, Barcelona 43 Institute for Science and Technology, Barcelona, Spain; 17Institut für Humangenetik, 44 Universitätsklinikum Essen, Universität Duisburg-Essen, Essen, Germany; 18Department of 45 19 46 Pediatrics, Fujita Health University School of Medicine, Japan; Division of Molecular 47 Genetics, Institute for Comprehensive Medical Science, Fujita Health University, Japan; 48 20Genetics of Learning Disability Service, Hunter Genetics, Waratah, NSW 2298, Australia; 49 21University of Newcastle, Australia Grow-Up-Well Priority Research Center; 22School of 50 Women’s and Children’s Health, University of New South Wales, Randwick, NSW 2031, 51 Australia; 23Healthy Mothers, Babies and Children, South Australian Health and Medical 52 53 Research Institute, Adelaide, SA 5000, Australia.

54 *Correspondence: Jozef Gecz, 1School of Medicine, The Robinson Research Institute, The 55 56 University of Adelaide, North Adelaide, SA5005, Australia. E-mail: 57 [email protected] 58 Detailed clinical data 59 60

1

John Wiley & Sons, Inc. Page 47 of 67 Human Mutation

1 High confidence in pathogenicity of THOC2 variants, ACMG classification likely pathogenic or 2 pathogenic. 3 4 5 Individual 1: affected male with de novo p.Thr696Ile variant 6 This is a 12 year old male, the second child of healthy unrelated Caucasian parent. He was delivered 7 at 36 weeks gestation with a birth weight of 2000g (10th centile), length 42cm (<1st centile) and head 8 circumference of 32cm (10-25th centile). He had speech and global developmental delay in infancy, 9 but was not noted to have low tone. Aged 12 he has a severe intellectual disability. He has limited 10 communication with some signs and a few single words. He was able to walk at 4 years of age. He 11 has no history of epileptic seizures. 12 13 On examination he has subtle dysmorphic features including a broad high forehead, microcephaly 14 (HC 50cm, 2nd centile), and short stature with a height of 111cm (<<1st centile) recorded at 10 years. 15

16 17 Brain MRI was reported as normal. Previous non diagnostic investigations include SNP chromosomal 18 microarray and fragile X syndrome testing. 19 20 21 Individual 2: affected maleFor with maternally Peer inherited Review p.Gly713Asp variant 22 This is a 5 year old male, the first child of healthy parents. He has a maternal half-brother with a 23 diagnosis of attention deficit hyperactivity disorder (ADHD) and a paternal half sister with diagnoses 24 of bipolar disorder and ADHD. Prenatal history was complicated by maternal hyperemesis throughout 25 pregnancy. He was delivered at near term (37+6 weeks), breech by emergency caesarian-section due 26 to prolonged rupture of membranes with a birth weight of 2650gm (10th centile), length 47cm (3-10th 27 percentile) and head circumference 34cm (25-50th percentile). 28 29 In the newborn period he was noted to have respiratory distress with biphasic stridor, mild 30 laryngomalacia and micrognathia, as well as feeding difficulties. He has disordered sleep and failure 31 to thrive, requiring gastrostomy tube feeding. He also has lax joints with a history of subluxing hips. 32 He required myringotomy tubes and adenoidectomy. He has difficulty with sleep and requires 33 clonidine to fall asleep and stay asleep. 34

35 36 Additional features include the presence of tremor and gait/balance problems. He has had 3 seizure- 37 like episodes. He had hypotonia, speech and global developmental delay and has likely a moderate 38 intellectual disability per his developmental assessments from neurology, and aged 5 years remains 39 non-verbal. He has some vocalizations and is adept with a communication device. There are no 40 features of autism. There are behavioral problems including occasional aggression thought to be 41 related to frustration due to limited communication abilities. He has a history of hand flapping 42 behaviors that have now resolved. 43 44 On examination he is small in all growth parameters with head circumference most significantly 45 affected: he is microcephalic with a head circumference of -3.79 SD. A review of his past head 46 circumference measurements show that he has consistently -3 SDs or more since birth to the present. 47 He has a broad, high forehead and flatness of the occiput. He has prominent ears with rolled helices. 48 He has a flat facial profile. Palpebral fissures are straight and short. He has mild median epicanthal 49 folds. There is a broad nasal root with an upturned tip and a smooth philtrum with a thin upper 50 vermillion border and prominent chin. He has small teeth. He has bilateral upper transverse palmar 51 creases with normal lower palmar creases and an accessory flexion crease on each 5th finger with mild 52 clinodactyly bilaterally. There is a flat hemangioma at the nape of the neck. 53

54 On Brain MRI he is noted to have a foreshortened and thin corpus callosum and low brainstem 55 56 volume. No basal ganglia fusion. There is dysgyria with variability in gyral morphology, changes in 57 orientation and gyral frequency. The gyral pattern appears finely nodular/polymicrogyria-like in the 58 inferior frontal lobes. There are also scattered areas of T2 and FLAIR signal. 59 60 Prior non diagnostic investigations include a normal karyotype, normal microarray and a negative Prader-Willi methylation study. EEG was normal. 2

John Wiley & Sons, Inc. Human Mutation Page 48 of 67

1 2 Individuals 3 and 4: affected monozygotic twins with maternally inherited p.His1187Tyr 3 variants 4 These are two 7 1/2 year old monozygotic twin brothers, born to healthy unrelated Russian parents 5 after assisted reproduction for unexplained subfertility. The twins were delivered at 37 weeks 6 gestation. Twin one had a birth weight of 1900g (<3rd centile) and length of 46 cm (3rd centile) and 7 twin two a birth weight of 2420g (10th centile) with length of 48 cm (25th centile) 8 9 Delays in their motor development were noted in the first year of life: they were only able to sit 10 without support at 12 months of age. Twin 1 walked independently at 1 year 9 months, and twin 1 at 2 11 years of age. They have a tendency to toe walk. Expressive and receptive language has been 12 significantly delayed and at 7 ½ years of age they have no verbal communication and only limited 13 receptive language. They are described as hyperactive and can have some behavioral difficulties 14 including auto-aggressive behaviors. They have difficulties coordinating the biting and chewing of 15 their food. Both boys had hypospadias. 16

17 th th 18 On examination twin one (aged 5 yr) was 106cm tall (25-50 centile) with a weight of 16.3kg (25-50 centile) and a head circumference of 40.5cm (10-25th centile). Twin two (aged 5 yr) was 107cm tall 19 th th 20 (25-50 centile) with a head circumference of 51cm (10-25 centile). Both boys were noted to have a 21 flat midface, hypertelorismFor a short nose, Peer short ears Review and prominent upper lip. A differential diagnosis 22 of Noonan syndrome had been considered. 23 24 Prior non diagnostic investigations included a normal karyotype, normal chromosomal microarray 25 (twin two), normal fragile X PCR. Testing of their mother indicated that she was almost completely 26 skewed for X inactivation (0.013:0.987). 27 28 29 Individual 5: Affected male with maternally inherited c.3503+4A>C variant 30 This is a three year old male. He was delivered at 37 weeks with a birth weight of 3.018kg (25th 31 centile). He was noted to have infantile hypotonia. Aged 3 he has been diagnosed with a severe 32 intellectual disability. He is non-verbal and has a diagnosis of autism spectrum disorder with some 33 challenging behaviors including a tendency to self-injure. His non-ambulant, but can crawl. He has 34 no history of seizures. 35 36 37 On examination he is noted to have a flat nasal bridge and a down-turned mouth. He has no tremor. 38 He is noted to be hyperkinetic. Aged 3 years 10 months his height was 94.5cm (-1.2SD), weight 39 13.8kg (-1.0SD) and head circumference was 49cm (-0.8SD) 40 41 MRI brain at age 2 years 11 months demonstrates mild dilatation of the lateral ventricles, especially in 42 frontal lobes, mildly delayed myelination and an abnormal white matter lesion in the periventricular 43 area close to the anterior horn. 44 45 Previous normal genetic investigations include a 46, XY karyotype. Copy number analysis was 46 performed on the data from exome studies and was non-diagnostic. 47 48 49 Individual 6: Affected male with maternally inherited c.4450-2A>G variant. 50 This is a 10 year old male born to non-consanguineous parents with a negative family history at 41 51 weeks gestation after a pregnancy complicated by first trimester bleeding and maternal hypertension. 52 He required oxygen at birth secondary to a small unilateral pneumothorax which resolved 53 spontaneously. Birth weight was 9lb 10 oz (>90th centile). 54

55 56 He had infantile hypotonia and was diagnosed with a severe learning disability. He is in a special 57 class at school and requires an Individualized Education Plan (IEP). He is able to communicate 58 verbally though his speech is affected by an abnormality of the soft palate and nasal dysarthria. 59 Additional neurological features including rest and intention tremor, an ataxic broad based gait, 60 spasticity, and nystagmus. There are no concerns with his behavior.

3

John Wiley & Sons, Inc. Page 49 of 67 Human Mutation

nd 1 On examination his head circumference (aged 5 years) was 49 cm (2 percentile) with a height (aged th th 2 10yr) of 142.1 cm (50-85 centile) and BMI of 17.6kg (50-80 percentile). He is measurably 3 hypoteloric with an intercanthal distance of 2.5 cm (-1SD) and an inter-pupillary distance of 4.5 cm 4 (35th percentile). He is noted to have a startle response. He has a thin eyebrows and a slightly flattened 5 nasal bridge. He has large ears (>2 SD). His palate is very high and narrow. He has slight 6 clinodactyly of the fingers. He has flat toenails that are slightly depressed at the cuticle. Brain MRI 7 shows slight prominence of the cerebro-fluid filled spaces. 8 9 Previous genetic investigations including Fragile-X, microarray, and mitochondrial DNA sequencing 10 were all negative, though he was found to have borderline complex 1+3 and 4 deficiencies. His 11 mother who carries the c.4450-2A>G splice variant has skewed X inactivation (94:6). 12 13 Individual 7: affected female with de novo p.Tyr517Cys variant. 14 This is a 10 year old female, the first child of healthy parents, she has an unaffected younger sister. 15 She has an infantile onset epileptic encephalopathy. Epileptic seizures were diagnosed at 2 ½ months 16 17 of age: with first seizures consisting of eyelid twitching, followed by clonic activity of the left arm 18 and leg. This was treated by phenobarbitone with reasonable seizure control for one month. 19 Subsequently clusters of epileptic spasms were diagnosed and EEG demonstrated hypsarrhythmia 20 with independent left and right epileptiform activity. Trialled medication included ACTH, vigabatrin 21 and clonazepam. She still hasFor breakthrough Peer seizures. Review 22 23 She was noted to have generalised hypotonia, poor head control and inability to visually fix and 24 follow. She has a severe intellectual disability and attends special education schooling with speech 25 therapy, physiotherapy and occupational therapy. She is non-verbal and non –ambulatory. She uses a 26 gait trainer. She has cortical visual impairment. 27 28 At one year of age on examination she was noted to have a head circumference on the 5th centile, 29 with height on the 90th centile and weight on the 25th centile. 30 31 Prior non-diagnostic testing included a karyotype, subtelomeric FISH, FISH at the Angelman locus, 32 sequencing of MECP2 and UBE3A and metabolic screening. Brain CT and MRI were reported as 33 normal. 34

35 36 37 Rare THOC2 variants without supportive functional data and conflicting in silico predictors: 38 variants of uncertain clinical significance on ACMG classification 39 40 Individual 8: affected male with maternally inherited p.Arg77Cys variant 41 This 9 year old male is one of two affected males to a non-consanguineous Caucasian couple. He had 42 a younger brother who died at the age of 4 and had a similar clinical presentation with intellectual 43 disability, seizure disorder and tracheostomy dependency. His mother also has two unaffected 44 daughters by different partners. The mother has a history of early developmental delay but was able to 45 complete high school. She also has bipolar disorder and a speech disorder (stutter). There is an 46 extended maternal family history which could be consistent with X linked inheritance. The proband 47 has a maternal uncle who died before the age of 4 years, who had microcephaly, a severe progressive 48 neurological condition, never being able to walk or talk, and required gastrostomy and tracheostomy. 49 The proband’s mother’s maternal aunt also had a son who died between the ages of 2 and 4 with, by 50 report, a similar presentation. The mother’s sister has learning difficulties as school, bipolar disorder 51 and a speech disorder (stutter). The maternal grandmother completed high school and some tertiary 52 level eductation. 53

54 The proband was born at 41 weeks gestation by spontaneous vaginal delivery, with Apgar scores of 8 55 56 at 1 and 9 at 5 minutes, and minimal resuscitation (transient oxygen) required at birth. Routine 57 antenatal ultrasounds were reported as normal, his mother was treated with lamotrigine through the 58 pregnancy but there were no other teratogenic exposures. Birth weight was 55.2cm, weight 3.7kg and 59 head circumference 33cm. He was noted to have increased muscle tone and poor suck, and was 60 transferred to the special care nursery requiring nasogastic feeds. A head ultrasound scan was

4

John Wiley & Sons, Inc. Human Mutation Page 50 of 67

1 suggestive of Dandy Walker malformation and he was transferred to a tertiary hospital for ongoing 2 care. 3 4 The proband has a profound intellectual disability, hypotonic quadriplegic cerebral palsy (GMFCS 5 level 5) and intractable infantile onset seizure disorder. He has a history of central apnea and chronic 6 respiratory failure and has been dependent on a tracheostomy and ventilator support since the age of 3 7 years. He has severe dysphagia and esophageal reflux and has been dependent on gastrostomy feeding 8 since the first year of life, initially being able to take some thickened oral feeds. He has sialorrhea and 9 previously had parotid duct ligation and salivary gland excision. He is profoundly delayed and does 10 not make eye contact and smiles occasionally in a non-purposeful manner. He has limited vision, 11 strabismus, myopia, astigmatism. He is non-verbal. He is centrally hypotonic with random non- 12 coordinated movements of the limbs and face, and has developed tongue fasciculations. He previously 13 was able to sit and roll, but has now lost those skills. With age he has increasing peripheral spasticity 14 and bilateral ankle clonus. He has no tremor. 15

16 17 18 He has an intractable seizure disorder, which first presented as infantile spasms. His current 19 antiepileptic medication is levetiracetam (Keppra) and diazepam (Valium). He also has non epileptic 20 myoclonus and muscular spasms which are treated with a muscular relaxant. EEG recordings have 21 been abnormal including moderateFor generalizedPeer backgroundReview slowing and occasional bi-central spike 22 and waves. 23 24 He has abnormalities on neuroimaging including hypoplasia of the cerebellar vermis, cerebral 25 aqueduct stenosis, stable enlargement of the ventricular system reflecting diffuse white matter loss 26 and thin corpus callosum, as well as hypoplasia of the brainstem, pons and medulla. 27 28 Other medical issues include type I diabetes and left cryptorchidism (requiring orchidoplexy). He has 29 a structurally normal heart and kidneys. 30 31 On recent examination aged 9 he has a weight of 29.3kg (50th centile), height of 1.132m (25th centile), 32 and is noted to be normocephalic. He has bitemporal narrowing, large ears, a flat nasal bridge, a 33 tented mouth, bilateral ptosis and right sided strabismus. He has tapered fingers with puffy feet and 34 toes. 35 36 37 Prior non diagnostic investigations include a chromosomal microarray, creatine kinase of 408 u/L at 38 birth and 123 u/L aged 2 years, no mutation identified on sanger sequencing of the PLP1 gene. 39 40 Individuals 9 and 10: two brothers with maternally inherited p. Ser1108Leu variants. 41 These are two brothers born to healthy unrelated Spanish parents, with no additional family history of 42 intellectual disability. 43 44 The first brother (individual 9) is currently 17 years of age and has diagnoses on the autistic spectrum 45 and severe intellectual disability. He was born after an uncomplicated pregnancy at 40 weeks 46 gestation. His birth weight was 2880g (10th centile). Apgar scores were 10 at 1 and 5 minutes. He 47 started to walk at 16 months without prior crawling. He had a tendency to toe walk. He is described as 48 ‘clumsy’. Development of his pincer grasp was delayed, and he was delayed in being able to chew, 49 and eat and drink independently. His ability to point to objects and his receptive and expressive 50 language development was delayed. He is able to maintain a simple conversation, solve simple 51 puzzles, and identify different parts of the body. He can recognize a few logos. He can get easily 52 frustrated and has some fears. He has no stereotypical movements. He can initiate simple social 53 interactions but does not have symbolic play. He gets anxious prior to novel situations. Eye contact is 54 limited. 55 56 57 He has no tremor or seizures. His sleep pattern is normal. He had nocturnal enuresis until the age of 58 12. 59 60 There have been significant concerns with short stature, and he was referred to an endocrinologist.

5

John Wiley & Sons, Inc. Page 51 of 67 Human Mutation

1 On examination he was noted to have a broad forehead and a narrow nasal root with a bulbous tip to 2 his nose, thick lips and a short philtrum. His ears were normally sited. He had a clumsy gait and th 3 shortening of the Achilles tendon. His head circumference was 55 cm (50 centile), height 153 cm 4 (<<1st centile) and weight 56kg (3-10th centile). 5 6 The second brother (individual 10) is 12 years of age and born at 37 weeks after an uncomplicated 7 pregnancy. He also has diagnoses of autism and severe intellectual disability, and attends a special 8 school. His birth weight was 3480g (50th centile) and Apgar scores were 9 at 1 and 10 and 5 minutes. 9 10 There were no significant concerns with his motor development in the first year of life, but he was 11 slow to walk, with this beginning at 18 months. He had a tendency to toe walk. He had significantly 12 delayed expressive and receptive speech delay, and no social pointing until 3 years of age. He has no 13 expressive language, and very delayed receptive language. His social development has been delayed 14 and he tended to use adults to obtain objects that he wanted. His eye contact has been limited. He can 15 engage in simple social interactions with adults but cannot sustain peer relationships. He is described 16 17 as being restless and has a tendency to ‘explosive’ reactions, and can have challenging behaviors 18 including mild auto and hetero-aggressive acts when frustrated. He has difficulties adapting to 19 changes in his routine and finds it very difficult to sit still. He has stereotypical movements including 20 raising up his arms, and clutching and shaking his hands when excited. He is described as clumsy. He 21 requires assistance in his activitiesFor of Peer daily living. Review 22 23 On examination he is noted to have a broad forehead and widened nasal tip. He has a tendency to toe 24 walk, and has shortening of the Achilles tendon. At the age of 12 years 9 months he was 146cm tall 25 (10th centile), and weighed 36kg (10th centile). 26 27 Prior non-diagnostic investigations include a normal chromosomal microarray (qChip Post) 28 29 30 Individual 11: male with maternally inherited p.Arg1121Gly variant 31 This male individual died aged 44, and was of Spanish heritage. He was not born prematurely and he 32 did not have a low birth weight. He is reported to have been affected by severe intellectual disability, 33 with a history of severe speech delay and infantile hypotonia. He received a diagnosis of cerebral 34 palsy. He was reported to be non-ambulant. He had aggressive behavior with self-harm including 35 36 hand-biting and face hitting. He lived in an institution for individuals with intellectual disability. He 37 had severe speech and language disorder, with around 10 words. Other health issues included asthma, 38 osteoarthritis, glaucoma, cataracts, supraventricular tachycardia and cardiomegaly. 39 ST 40 He was reported to be microcephalic with short stature, height of 134cm (<<1 centile), and weight th 41 (aged 44 years) was 69.8kg (50 centile). He had a raised BMI (39.9) and subtle dysmorphic features 42 including a broad high forehead. He was reported to have micropenis. 43 44 45 Individual 12: male with maternally inherited p.Asn1261His variant 46 This male individual is the first child to a 26 year old female. He is currently 5 years 9 month of age. 47 The pregnancy was uncomplicated with appropriate prenatal care. There were no known exposures, 48 maternal illnesses or medications used during the pregnancy. Routine prenatal testing was normal, and 49 there was no gestational diabetes. No prenatal ultrasound or genetic testing was performed. Delivery 50 was performed at 42 weeks at home with a midwife and was spontaneous vaginal with vacuum 51 extraction due to large size. Birth weight was 5.2 kg, length 57cm, head circumference 38.5cm. 52 According to the parents, the baby was initially “blue but his color came right back.” Formal Apgar 53 scores were not recorded. He had some irregular breathing at 4 hours of age and was admitted to a 54 regional neonatal intensive care unit where he remained for 10 days, mostly for feeding difficulties. 55 56 57 The parents are both healthy and unrelated. They have a daughter, who does not carry the THOC2 58 variant, who was admitted to hospital at 8 months of age with a cluster of unresponsive tonic 59 episodes. MRA/MVA showed cortical venous thrombi with superior sagittal sinus thrombosis. The 60 tonic episodes were treated with Keppra, but she was weaned off this after a normal EEG. She has made a good recovery and had normal development. 6

John Wiley & Sons, Inc. Human Mutation Page 52 of 67

1 2 The THOC2 variant is maternally inherited and not present in the maternal grandmother. The 3 maternal grandfather is deceased, and no DNA was available for testing. There is no extended family 4 history of intellectual disability. The mother has 6 healthy adult full-brothers with normal 5 development. 6 7 The patient had delayed early milestones, sitting at 11 months and walking at 2 years 11 months. His 8 first words were at 12 months of age but he did not gain additional words until he was over 2 years if 9 age. He developed overt seizures between 18-24 months, although subtle and or absence seizures 10 may have been present earlier. The seizures were treated with multiple medications but progressed to 11 refractory seizures consistent with Lennox-Gastaut syndrome. Initial EEG was consistent with a 12 diffuse encephalopathy, follow-up EEG was indicative of numerous brief atonic seizures in the setting 13 of a slow background. Seizure frequency and severity have partially responded to oral cannabis which 14 was begun as part of a clinical trial at age 5 years 2 months. Prominent seizure type remains as 15 infrequent atonic seizures. His development has progressed with improved seizure control. He knows 16 17 more than 100 words and has several two to three word phrases. He can eat with a spoon, help dress 18 himself and ride a tricycle. He attends a regular preschool class and receives occupational, physio and 19 speech therapies. He has a short attention span. 20 21 He has central hypotoniaFor with appendicular Peer spasticity Review and has received a diagnosis of spastic 22 quadriplegia. He is noted to have visual difficulties with hyperopia with astigmatism and a possible 23 right optic nerve coloboma. He has had bilateral orchidoplexy for cryptorchidism. 24 25 Aged 6 years 10 months, his weight was 17.9kg (3rd centile), height 108.4 cm (<1st centile) and head 26 circumference at 5 years 10 months was 50 cm (29 centile). He has mild dysmorphic features, 27 including somewhat long facies and prominent ears. 28 29 Previous non-diagnostic testing include chromosomal microarray, testing for Fragile X, urinary 30 organic acids, serum amino acids, acylcarnitine profile and transferrin isoforms for carbohydrate 31 deficient glycoprotein disorders. MRI scan performed at 2 was consistent with mild left sided 32 periventricular leukomalacia or delayed myelination. Hearing screening has been normal. Clinical 33 exome sequencing was performed on the proband and his parents at Baylor-Miraca, Houston, TX. 34

35 36 37 38 39 Methods used for identifying the THOC2 variants 40 41 p.Thr696Ile (Individual 1) 42 43 Trio whole exome sequencing (WES) and data processing were performed by Genomics 44 45 46 Platform at the Broad Institute of Harvard and MIT (Broad Institute, Cambridge, MA, USA). 47 48 WES was performed using Illumina exome capture (38 Mb target) and the data was 49 50 processed through a pipeline based on Picard. Single Nucleotide Polymorphism (SNPs) and 51 52 53 insertions/deletions (indels) were jointly called across all samples using Genome Analysis 54 55 Toolkit (GATK) HaplotypeCaller package version 3.4. 56 57 58 59 p.Gly713Asp (Individual 2) 60

7

John Wiley & Sons, Inc. Page 53 of 67 Human Mutation

1 THOC2 variant was identified using the published method (Tanaka, et al., 2015). 2 3 4 5 p.His1187Tyr (Individual 3-4) 6 7 8 Genomic DNA was extracted from peripheral blood samples of the affected individuals and 9 10 their parents using DNA standard extraction kits. Exomes were enriched in solution and 11 12 indexed with version 5 of the SureSelect XT Human All Exon 50Mb kit (Agilent 13 14 15 Technologies). Sequencing was performed as 101 bp paired-end reads on HiSeq2000/2500 16 17 systems (Illumina). For one of the twins and his parents, 9.4, 8.7 and 12 Gigabases of 18 19 sequence were generated, respectively, resulting in an average depth of coverage of 113, 110 20 21 For Peer Review 22 and 150, with 96.5%, 96% and 97.5% of the target regions covered at least 20 times. Image 23 24 analysis and base calling was performed using Illumina Real Time Analysis. Reads were 25 26 aligned against the human assembly hg19 (GRCh37) using BWA v 0.5.9. Variant calling was 27 28 performed specifically for the regions targeted by the exome enrichment kit using SAMtools 29 30 31 (v 0.1.18), PINDEL (v 0.2.4t), ExomeDepth (v 1.0.0) and custom scripts. Variant quality was 32 33 determined using the SAMtools varFilter script with default parameters except for the 34 35 maximum read depth (-D) and the minimum P-value for base quality bias (-2), which were 36 37 38 set to 9999 and 1e-400, respectively. In addition, a custom script was applied to mark all 39 40 variants with adjacent bases of low median base quality. All variants were then annotated 41 42 using custom Perl scripts. Annotation included known transcripts (UCSC Known Genes and 43 44 45 RefSeq genes), known variants (dbSNP v 135), type of mutation and - if applicable - amino 46 47 acid changes. The annotated variants were integrated into an in-house database. Because the 48 49 index case has a likewise affected twin brother, and the mother displayed an almost 50 51 completely skewed X-inactivation, the database was queried to show homozygosity, 52 53 54 compound heterozygosity or 'hemizygosity' for X-linked variants. To reduce false positives, 55 56 variants that were already present in the database, or had a variant quality of less than 30, or 57 58 did not pass the filter scripts were filtered out. The raw read data of the remaining variants 59 60 using the Integrative Genomics Viewer (IGV) was manually checked.

8

John Wiley & Sons, Inc. Human Mutation Page 54 of 67

1 2 3 Exon35:c.4450-2A>G (Individual 5) 4 5 THOC2 variant was identified using the published method (Nishimura, et al., 2014). 6 7 8 9 10 Exon28:c.3503+4A>C (Individual 6) 11 12 XHMM software on exome sequencing data, was used to analyse for copy number variation. 13 14 15 Fifty nanograms of whole blood DNAs from the proband and his parents were used to 16 17 generate a whole exome library using SureSelect QXT Reagent and SureSelect Clinical 18 19 Research Exome (Agilent technologies, Santa Clara, CA). Sequencing was carried out using 20 21 For Peer Review 22 HiSeq 1500 (Illumina, San Diego, CA). After demultiplexing from other sample data, the 23 24 reads were mapped onto the human reference hg19 using BWA 0.7.15 (Li and Durbin, 2009). 25 26 Sorting and recalibration of the mapped reads were done using Picard tools 2.7.1 27 28 (http://broadinstitute.github.io/picard), and variants were called into a VCF file using GATK 29 30 31 3.6 (Van der Auwera, et al., 2013). Annotations were added using SnpEff and SnpSift 32 33 (Cingolani, et al., 2012). Low quality data and common SNPs of more than 1% frequency in 34 35 Japanese or global populations were filtered out by using dbSNP 147, HGVD (Higasa, et al., 36 37 38 2016), iJGVD (Nagasaki, et al., 2015) and ExAC (Lek, et al., 2016) data. 39 40 41 42 p. Tyr517Cys (Individual 7) 43 44 45 THOC2 variant was identified using the published method (Epi, et al., 2013) 46 47 48 49 p.Arg77Cys (Individual 8) 50 51 THOC2 variant was identified using the published method (Farwell, et al., 2015). 52 53 54 55 56 p.Ser1108Leu and p.Arg1121Gly (Individual 9-10 and 11) 57 58 Both patients had prior investigation with the chromosomal microarray (qChip Post). Carriers 59 60 of variants p.Ser1108Leu and p.Arg1121Gly were identified by WES. In short, DNA from

9

John Wiley & Sons, Inc. Page 55 of 67 Human Mutation

1 the proband and family members (parents for p.Arg1121Gly and affected sibling for 2 3 Ser1108Leu) were fragmented with covaris, captured using the Agilent SureSelect 50Mb kit 4 5 and TruSeq adapters, pooled in groups of three samples and sequenced on an Illumina 6 7 8 HiSeq2000 lane with 2x75bp reads. Fastq files were processed using the in-house ediva 9 10 pipeline: they were aligned to the hg19 reference genome, duplicate-marked, locally 11 12 realigned and recalibrated, and variants were called using GATK HaplotypeCaller (Van der 13 14 15 Auwera, et al., 2013) followed by quality assessment with VQSR and retaining tranche 99.9. 16 17 Variants were annotated with ediva and filtered according to familial segregation (de novo, 18 19 recessive and X-linked variants in the trio, shared recessive and X-linked variants in the 20 21 For Peer Review 22 sibpair), MAF < 0.05% in any population/database, and scored based on functionality 23 24 predictors. The trio case p.Arg1121Gly was also a carrier of a de novo variant in SALL3, 25 26 while the THOC2 variant p.Ser1108Leu was the only likely candidate passing the filters in 27 28 the sib-pair. Identified variants were further confirmed by Sanger sequencing in the carriers. 29 30 31 32 33 p.Asn1261His (Individual 12) 34 35 THOC2 variant was identified using the published method (Yang, et al., 2014). 36 37 38 39 40 References 41 Cingolani P, Platts A, Wang le L, Coon M, Nguyen T, Wang L, Land SJ, Lu X, Ruden DM. 42 2012. A program for annotating and predicting the effects of single nucleotide 43 polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain 44 45 w1118; iso-2; iso-3. Fly (Austin) 6(2):80-92. 46 Epi KC, Epilepsy Phenome/Genome P, Allen AS, Berkovic SF, Cossette P, Delanty N, 47 Dlugos D, Eichler EE, Epstein MP, Glauser T and others. 2013. De novo mutations in 48 epileptic encephalopathies. Nature 501(7466):217-21. 49 Farwell KD, Shahmirzadi L, El-Khechen D, Powis Z, Chao EC, Tippin Davis B, Baxter RM, 50 Zeng W, Mroske C, Parra MC and others. 2015. Enhanced utility of family-centered 51 diagnostic exome sequencing with inheritance model-based analysis: results from 500 52 53 unselected families with undiagnosed genetic conditions. Genet Med 17(7):578-86. 54 Higasa K, Miyake N, Yoshimura J, Okamura K, Niihori T, Saitsu H, Doi K, Shimizu M, 55 Nakabayashi K, Aoki Y and others. 2016. Human genetic variation database, a 56 reference database of genetic variations in the Japanese population. J Hum Genet 57 61(6):547-53. 58 Lek M, Karczewski KJ, Minikel EV, Samocha KE, Banks E, Fennell T, O'Donnell-Luria AH, 59 Ware JS, Hill AJ, Cummings BB and others. 2016. Analysis of protein-coding genetic 60 variation in 60,706 humans. Nature 536(7616):285-91.

10

John Wiley & Sons, Inc. Human Mutation Page 56 of 67

1 Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler 2 transform. Bioinformatics 25(14):1754-60. 3 Nagasaki M, Yasuda J, Katsuoka F, Nariai N, Kojima K, Kawai Y, Yamaguchi-Kabata Y, 4 Yokozawa J, Danjoh I, Saito S and others. 2015. Rare variant discovery by deep 5 whole-genome sequencing of 1,070 Japanese individuals. Nat Commun 6:8018. 6 7 Nishimura N, Gotoh T, Oike Y, Yano M. 2014. TMEM65 is a mitochondrial inner-membrane 8 protein. PeerJ 2:e349. 9 Tanaka AJ, Cho MT, Millan F, Juusola J, Retterer K, Joshi C, Niyazov D, Garnica A, Gratz 10 E, Deardorff M and others. 2015. Mutations in SPATA5 Are Associated with 11 Microcephaly, Intellectual Disability, Seizures, and Hearing Loss. Am J Hum Genet 12 97(3):457-64. 13 Van der Auwera GA, Carneiro MO, Hartl C, Poplin R, Del Angel G, Levy-Moonshine A, 14 15 Jordan T, Shakir K, Roazen D, Thibault J and others. 2013. From FastQ data to high 16 confidence variant calls: the Genome Analysis Toolkit best practices pipeline. Curr 17 Protoc Bioinformatics 43:11 10 1-33. 18 Yang Y, Muzny DM, Xia F, Niu Z, Person R, Ding Y, Ward P, Braxton A, Wang M, Buhay 19 C and others. 2014. Molecular findings among patients referred for clinical whole- 20 exome sequencing. JAMA 312(18):1870-9. 21 For Peer Review 22

23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 Supplementary Figure S1. THOC2 variant amino acid residues are highly conserved. 40 41 42 Amino acid sequence alignments show the variant amino acid residue (red) at the conserved 43 44 positions (green). The amino acid sequences were aligned using www.uniprot.org/align/. 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

11

John Wiley & Sons, Inc. Page 57 of 67 Human Mutation

1 p.Arg77Cys p.Tyr517Cys 2 Mutant DISEFCEDMPS 82 Mutant FKTFPCQHRYR 522 3 Hom DISEFREDMPS 82 Hom FKTFPYQHRYR 522 4 Mus DISEFREDMPS 82 Mus FKTFPYQHRYR 522 5 Rattus DISEFREDMPS 82 Rattus FKTFPYQHRYR 522 6 Pan DISEFREDMPS 82 Pan FKTFPYQHRYR 522 7 Macaca DISEFREDMPS 82 Macaca FKTFPYQHRYR 522 8 Canis DISEFREDMPS 82 Canis FKTFPYQHRYR 522 9 Equus DISEFREDMPS 82 Equus FKTFPYQHRYR 522 10 Sus DISEFREDMPS 82 Sus FKTFPYQHRYR 522 11 Bos DISEFREDMPS 82 Bos FKTFPYQHRYR 522 12 Loxo DVSEFREDMPS 82 Loxo FKTFPYQHRYR 522 13 Gallus DIIEFREDMPS 79 Gallus FKTFPYQYRYR 519 14 Xenopus EIIAFRDDMPS 79 Xenopus FKCFPYQYRYR 518 15 16 p.Thr696Ile p.Gly713Asp 17 Mutant ITEEMIMEQLE 701 Mutant QLKAEDGYFGQ 718 18 Hom ITEEMTMEQLE 701 Hom QLKAEGGYFGQ 718 19 Mus ITEEMTMEQLE 701 Mus QLKAEGGYFGQ 718 20 Rattus ITEEMTMEQLE 701 Rattus QLKAEGGYFGQ 718 21 Pan ITEEMTMEQLEFor 701Peer Review Pan QLKAEGGYFGQ 718 22 Macaca ITEEMTMEQLE 701 Macaca QLKAEGGYFGQ 718 23 Canis ITEEMTMEQLE 701 Canis QLKAEGGYFGQ 718 24 25 Equus ITEEMTMEQLE 701 Equus QLKAEGGYFGQ 718 26 Sus ITEEMTMEQLE 701 Sus QLKAEGGYFGQ 718 27 Bos ITEEMTMEQLE 701 Bos QLKAEGGYFGQ 718 28 Loxo ITEEMTMEQLE 701 Loxo QLKAEGGYFGQ 718 29 Gallus ITEEMTMEQLE 698 Gallus QLKAEGGYFGQ 715 30 Xenopus ITEEMTMEQLE 697 Xenopus QLKAEGGYFGQ 714 31 32 p.Ser1108Leu p.Arg1121Gly 33 Mutant KLTKAPVHCLE 1113 Mutant EYTHIGNILIV 1126 34 Hom KLTKASVHCLE 1113 Hom EYTHIRNILIV 1126 35 Mus KLTKASVHCLE 1113 Mus EYTHIRNILIV 1126 36 Rattus KLTKASVHCLE 1113 Rattus EYTHIRNILIV 1126 37 Pan KLTKASVHCLE 1113 Pan EYTHIRNILIV 1126 38 Macaca KLTKASVHCLE 1113 Macaca EYTHIRNILIV 1126 39 Canis KLTKASVHCLE 1113 Canis EYTHIRNILIV 1126 40 Equus KLTKASVHCLE 1113 Equus EYTHIRNILIV 1126 41 Sus KLTKASVHCLE 1113 Sus EYTHIRNILIV 1126 42 Bos KLTKASVHCLE 1113 Bos EYTHIRNILIV 1126 43 Loxo KLTKASVHCLE 1113 Loxo EYTHIRNILIV 1126 44 Gallus KLTKASVHCLE 1110 Gallus EYTHIRNILIV 1123 45 Xenopus KLTKASVHCLE 1109 Xenopus EYTHIRNILIV 1122 46 47 p.His1187Tyr p.Asn1261His 48 Mutant PENEFYHKDPP 1192 Mutant GNSGSHSNKAV 1266 49 Hom PENEFHHKDPP 1192 Hom GNSGSNSNKAV 1266 50 Mus PENEFHHKDPP 1192 Mus GNSGSNSNKAV 1266 51 52 Rattus PENEFHHKDPP 1192 Rattus GNSGSNSNKAV 1266 53 Pan PENEFHHKDPP 1192 Pan GNSGSNSNKAV 1266 54 Macaca PENEFHHKDPP 1192 Macaca GNSGSNSNKAV 1266 55 Canis PENEFHHKDPP 1192 Canis GNSSSNSSKAV 1266 56 Equus PENEFHHKDPP 1192 Equus GNSGSNSSKAV 1266 57 Sus PENEFHHKDPP 1192 Sus GNSGSNSSKTV 1266 58 Bos PENEFHHKDPP 1192 Bos GNSGSNSSKTV 1266 59 Loxo PENEFHHKDPP 1192 Loxo GNSGSTSNKTV 1266 60 Gallus PENEFHHKDPP 1189 Gallus GNSASNSKIIK 1265 Xenopus PENDFHHKDPP 1188 Xenopus GNSASSSKVLK 1264 12

John Wiley & Sons, Inc. Human Mutation Page 58 of 67

1 2 3 4 5 Supplementary Figure S2 Brain MRI of the individual carrying the exon28: c.3503+4A>C 6 7 8 THOC2 variant. Red arrow shows myelination and yellow arrow shows periventricular white 9 10 matter lesion at age 2y 11m. 11 12 13 14 15 16

17

18 19 20 21 For Peer Review 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36

37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55

56 57 58 59 60

13

John Wiley & Sons, Inc. Page 59 of 67 Human Mutation

1 2 Supplementary Figure S3 A: THOC2 variant proteins are normally localised in HeLa and 3 4 HEK293T cells. HeLa and HEK293T cells cultured on Poly-L-lysine coated coverslips (BD 5 6 Biosciences) were transfected with wild type or variant Myc-THOC2 expression plasmids 7 8 9 and detected by immunofluorescence using mouse anti-Myc (9E10; Sigma) primary and 10 11 Donkey anti-rabbit-IgG Alexa 488 secondary antibody (Life Technologies). Only HeLa cell 12 13 localisations at 630 magnification are shown. B: Nuclear and cytoplasmic (HEK293T; 14 15 16 average 250 cells each) and C: nuclear and nuclear plus cytoplasmic (HeLa; average 400 17 18 cells each) staining was visually scored and plotted. Note that nuclear plus cytoplasmic 19 20 instead of only cytoplasmic staining for HeLa was scored because all the cells with 21 For Peer Review 22 23 cytoplasmic staining also had nuclear staining. 24 25 A 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

14

John Wiley & Sons, Inc. Human Mutation Page 60 of 67

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 For Peer Review 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51

52 53 54 55 56 57 58 59 60

15

John Wiley & Sons, Inc. Page 61 of 67 Human Mutation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 For Peer Review 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

16

John Wiley & Sons, Inc. Human Mutation Page 62 of 67

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 For Peer Review 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56

57

58 59 60

17

John Wiley & Sons, Inc. Page 63 of 67 Human Mutation

1 2 3 4 5 B 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 For Peer Review 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45

46

47 48 49 50 51 52 53 54 55 56 57 58 59 60

18

John Wiley & Sons, Inc. Human Mutation Page 64 of 67

1 2 3 4 5 C 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 For Peer Review 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

19

John Wiley & Sons, Inc. Page 65 of 67 Human Mutation

1 2 3 Supplementary Figure 4 Children and adolescent/adults within current and previously reported* THOC2 cohorts. Images connected by vertical grey 4 shading represent the same individual at different ages. 5 6 7 Children Individual 1 2 3 (twin 1) 4 (twin 2) 5 MRX12/V- 11 8 7* 9 Variant p.Thr696lle p.Gly713As p.His1187Tyr p.His1187Tyr ex:28c.3503+ p.Leu438Pro Arg1121Gy 10 p 4A>C 11 ACMG De novo De novo Likely Likely Likely Likely VOUS 12 13 Pathogenicity pathogenic pathogenic pathogenic pathogenic pathogenic Pathogenic 14 Classification For Peer Review 15 Age 12 5 4 4 3 - - 16 17 18 19 20 21

22 23 24 Individual 6 L:2_II-5* 3_twin 1 8 MRX12_V- MRX12_V- 11 25 8* 7* 26 Variant Ex35:c.445 p.lle800Thr p.His1187Tyr p.Arg77Cys p.Leu438Pro p.Leu438Pro Arg1121Gy 27 0-2A>G 28 Pathogenicity Likely Likely Likely VOUS Likely Likely VOUS 29 pathogenic pathogenic pathogenic pathogenic pathogenic 30 31 Age 10 - - 9 9 - - 32 33 34 35 36 37 38 39 40 Adolescent Individual L22_III-7* L:2_II-5* D45_III-7* MRX12_V- MRX12_V- MRX12_V- 11 41 42 20

43 John Wiley & Sons, Inc. 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 Human Mutation Page 66 of 67

1 2 3 and adults 2* 8* 7* 4 Variant p.lle800Thr p.lle800Thr p.Leu313phe p.Leu438Pro p.Leu438Pro p.Leu438Pro Arg1121Gy 5 6 7 Pathogenicity Likely Likley VOUS Likely Likely Likely VOUS 8 pathogenic Pathogenic pathogenic pathogenic pathogenic 9 Age 28 44 30 30 - 14 10 11 12 13 14 For Peer Review 15 16 17

18 19 Individual L33_III-3* L22_III-2* L22_III-1* MRC12_III- MRX12_IV- 11 20 17* 2* 21 Variant p.lle800Thr p.lle800Thr p.lle800Thr p.Leu438Pro p.Leu438Pro p.Arg1121 22 Gly 23 Pathogenicity Likely Likley Likley Likely Likely VOUS 24 pathogenic pathogenic pathogenic pathogenic pathogenic 25 26 Age 34 42 43 44 27 28 29 30 31 32 33 34 35 *Previously published families (Kumar, et al., 2015). 36 37 38 39 40 41 42 21

43 John Wiley & Sons, Inc. 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 Page 67 of 67 Human Mutation

1 2 3 Supplementary Table S1 Primers list (PCR and overlap PCR mutagenesis). 4 5 6 Name 5' → 3' 7 p.Tyr517Cys 8 hTHOC2-1592-A1550G-R GTTTCATTCTTCCACTGGCCATACAGACGATATCTATGTTGACATGGAAATGT 9 10 p.Ser1108Leu Mutagenesis 11 hTHOC2_3302_C3323T_F ATTACAAACTAACCAAGGCATTGGTACATTGCCTTGAAACAG 12 hTHOC2_3343_G3323A_R CTGTTTCAAGGCAATGTACCAATGCCTTGGTTAGTTTGTAAT 13 14 p.His1187Tyr Mutagenesis For Peer Review 15 hTHOC2_3537_C3559T_F CATGATACCTGAAAATGAGTTTTATCACAAGGATCCCCCTCCGAGGAAT 16 hTHOC2_3585_G3559A_F ATTCCTCGGAGGGGGATCCTTGTGATAAAACTCATTTTCAGGTATCATG 17 18 p.Asn1261His 19 hTHOC2_3758_A3781C_F CAAGCAATGGAAATAGTGGATCCCACAGCAACAAAGCTGT 20 21 hTHOC2_3797_T3781G_F ACAGCTTTGTTGCTGTGGGATCCACTATTTCCATTGCTTG 22 p.Thr696Ile 23 hTHOC2-2149-2088T-R CAAAATAACCACCCTCAGCTTTTAGCTGCTCTCCACCAGTCATAGCCTCTAGTTGCTCCATTaTCA 24 25 p.Gly713Asp 26 hTHOC2_2112_2138G>A_F TGGTGGAGAGCAGCTAAAAGCTGAGGATGGTTATTTTGGTCAGA 27 hTHOC2-2216-R GCAAGATCATGGTCCAATAGAGCATC 28 29 p.Arg77Cys 30 hTHOC2_208_C229T_F CTTAGTGACATTAGTGAATTTtGTGAGGATATGCCCTCCATT 31 hTHOC2_249_G229A_R AATGGAGGGCATATCCTCACaAAATTCACTAATGTCACTAAG 32 33 RT-qPCR THOC2 mRNA expression 34 THOC2-RT-PCR-F GAGAATGAAGGTTATGCCAAGCTGA 35 THOC2-RT-PCR-R CGTGTTCTGGCCTGCATTCAAACAC 36 37 RT-qPCR Housekeeping mRNA expression 38 HPRT1/F TGACACTGGCAAAACAATGCA 39 40 HPRT1/R GGTCCTTTTCACCAGCAAGCT 41 42 22

43 John Wiley & Sons, Inc. 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 Human Mutation Page 68 of 67

1 2 3 PCR amplification and Sanger sequencing of amplified cDNA and gDNA of carrier mother and affected son (c.4450-2A>G) 4 5 hTHOC2-4326F TACTCCTCCACCACTGTCCAAGAG 6 hTHOC2-4519-R TTGTTCCATTCTCCTCTTTACGTCTC 7 PCR amplification and Sanger sequencing of amplified gDNA of normal father, carrier mother and affected son (c.3503+4A>C) 8 9 hTHOC2-F GAATCTGGGTCAAGCTTTGG 10 hTHOC2-R CTCCCATAATCTCCTGAAGG 11 PCR amplification and Sanger sequencing of amplified cDNA of normal father, carrier mother and affected son (c.3503+4A>C) 12 13 hTHOC2-ex27F GGATTTGATGGTGGAAATAAGG 14 THOC2-ex30R ACAGCTTTCACACCACACTGFor Peer Review 15

16 17 hTHOC2-4570-R TTGGATAAGGAGATTCACAAGGAC 18 19 20 21 22 23

24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 23

43 John Wiley & Sons, Inc. 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60