Harnessing the Power of Old Data: Exome Sequencing Reanalysis on a Manitoba Cohort

by

Taryn Bryn Tristan Athey

A Thesis submitted to the Faculty of Graduate Studies of

The University of Manitoba

in partial fulfilment of the requirements of the degree of

MASTER OF SCIENCE

Department of Biochemistry and Medical Genetics

Genetic Counselling Program

Max Rady College of Medicine

Rady Faculty of Health Sciences

University of Manitoba

Winnipeg, Manitoba, Canada

Copyright ©2020 by Taryn Athey

ABSTRACT

Rare disorders are thought to affect 1 in 12 Canadians, yet more than 50% of these patients currently do not have a diagnosis. Undiagnosed and misdiagnosed patients with rare disorders cause a significant burden to the healthcare system and the uncertainty often greatly affects their quality of life.

Exome sequencing (ES) is an unbiased genetic testing method that sequences most of the known coding regions in the genome. ES is used when targeted methods have failed to provide a diagnosis and has been found to lead to a definitive diagnosis in 25-50% of cases, depending on the criteria used for case selection.

Clinical ES is limited by the bioinformatics method used to call variants, the interpretation of those variants, and the knowledge of gene-disease associations at the time of interpretation. Therefore, it is not surprising that systematic reanalysis of ES data at regular intervals has been shown to provide a diagnosis in an additional 10-15% of cases. ES reanalysis is not routinely done for Manitoba patients; for this reason, we have developed a pilot project to reanalyze the ES data for patients who previously had non-diagnostic ES.

We recruited 33 participants from 25 families who have received ES that failed to provide a definitive genetic diagnosis. The raw ES data for each participant was collected from the sequencing laboratories and variants were called using the bcbio-nextgen bioinformatics pipeline. Variants were annotated using the Ensembl Variant Effect Predictor. Custom filters were used to prioritize variants for review and pathogenicity of variants was assessed using the American College of Medical Genetics guidelines.

We found candidate variants in 14 (56%) of the families analyzed, including 3 strong candidate variants and 6 variants in novel genes that have not previously been associated with disease. This study suggests that reanalyzing already generated ES data is an efficient way to increase genetic diagnoses for patients in Manitoba. As well, analysis of variants in genes with unknown function may lead to future gene discovery projects, adding to our overall knowledge of human genetics.

ACKNOWLEDGMENTS

Thank you to my thesis supervisor, Dr. Patrick Frosk, for his guidance throughout the project, for sharing his vast knowledge of medical genetics, for going through endless lists of gene candidates with me, and for teaching me that negative is just a state of mind. Thank you to my thesis committee, Claudia Carriles, Taila Hartley, and Dr. Pingzhao Hu, for all of your individual insights and guidance. Thank you to my program director, Jessica Hartley, for keeping me organized and for our office chats.

This work could not have been completed without the help of Shirley Harvey, who contacted participants for their consent, found all of the charts I had to review, and always knew the answer to my questions. I also want to extend my gratitude to the Care4Rare team: Grace Ediae, Magda Price, and Michelle Vandeloo, for assisting with Care4Rare ethics procedures and data accession. Thank you to Matt Osmond for his matchmaking expertise and guidance during exome rounds.

A big thank you to the inaugural class, Angela Krutish, Ashleigh Hansen, and Rachelle Dinchong, for leading the charge and for showing me that success is possible. Thank you to my classmates, Emily Bonnell and Selina Casalino, for the love, support, commiseration, and good times throughout the program. Best of luck to the remaining genetic counselling students, Natasha Osawa, Cassie McDonald, and Dorothy Michalski; I’m so glad to have gotten to know you this past year. I would also like to extend my appreciation to the many supervisors I had throughout this program.

I am grateful to my husband, Charlie Keown-Stoneman, for spending more time on airplanes these past two years than the rest of his life combined, and for always answering the phone when I was having a tough day. Thank you to the rest of my friends and family for their endless love and support.

I would like to acknowledge the families who were involved in this study. Living through a diagnostic odyssey cannot be easy, and I feel privileged to have had the opportunity to assist them in finding diagnoses.

I received funding from the University of Manitoba Graduate Fellowship (Department of Biochemistry and Medical Genetics).

TABLE OF CONTENTS

ABSTRACT
ACKNOWLEDGMENTS
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS

CHAPTER 1: BACKGROUND AND REVIEW OF THE LITERATURE
  1.1. Rare Disease
  1.2. Traditional Methods of Diagnosis
  1.3. Exome Sequencing
    1.3.1. Criteria for Testing
    1.3.2. Factors that Affect Diagnostic Rates
    1.3.3. Beyond Single Nucleotide Causes of Disease
    1.3.4. Secondary Findings
    1.3.5. When Exome Sequencing Cannot Find a Diagnosis
  1.4. Exome Sequencing Reanalysis
    1.4.1. New Disease-Gene Associations
    1.4.2. Variant Interpretation
    1.4.3. Trio Exome Sequencing as a Way to Increase Diagnoses
    1.4.4. Bioinformatic Considerations
    1.4.5. When to Reanalyse
  1.5. Clinical Analysis Versus Research Analysis
  1.6. Potential Benefits of Genetic Diagnosis
  1.7. Manitoba Population
  1.8. Thesis Rationale

CHAPTER 2: METHODS
  2.1. Ethics Approval
  2.2. Participant Recruitment
  2.3. Chart Review and Phenotype Abstraction
  2.4. Data Transfer
  2.5. Bioinformatics Pipeline
  2.6. Variant Filtration
    2.6.1. Filter 1: New Disease-Gene Associations and Previously Miscalled Variants
    2.6.2. Filter 2: Likely Deleterious Variants in Potentially Novel Genes
    2.6.3. Filter 3: Variants in Candidate Genes
  2.7. Exome Rounds

CHAPTER 3: RESULTS

CHAPTER 4: CASE DESCRIPTIONS AND VARIANT INTERPRETATIONS
  4.1. Case 1
  4.2. Case 2
  4.3. Case 3
  4.4. Case 4
    4.4.1. FGF12
  4.5. Case 5
  4.6. Case 6
    4.6.1. BCL7C
  4.7. Case 7
  4.8. Case 8
  4.9. Case 9
  4.10. Case 10
    4.10.1. RYR1
    4.10.2. CHRNE
    4.10.3. SYNE2
    4.10.4. RYR3
    4.10.5. TTN
  4.11. Case 11
    4.11.1. CYP4A11
    4.11.2. GOLPH3
  4.12. Case 12
    4.12.1. USP18
  4.13. Case 13
  4.14. Case 14
    4.14.1. PNKP
  4.15. Case 15
  4.16. Case 16
    4.16.1. HLA-B
    4.16.2. Segregation Analysis
  4.17. Case 17
    4.17.1. EP300
    4.17.2. North American Indian Childhood Cirrhosis (NAIC)
  4.18. Case 18
    4.18.1. OBSCN
    4.18.2. EGF
  4.19. Case 19
    4.19.1. PTOV1
  4.20. Case 20
  4.21. Case 21
    4.21.1. PRKCH
  4.22. Case 22
    4.22.1. ZNFX1
  4.23. Case 23
  4.24. Case 24
    4.24.1. RELB
  4.25. Case 25
    4.25.1. SPTB
    4.25.2. ATG9A
  4.26. Case 26
    4.26.1. COL9A3
    4.26.2. LAMB2

CHAPTER 5: DISCUSSION
  5.1. Study Findings
  5.2. Secondary Findings
  5.3. Whole Genome Sequencing as a Future Direction
  5.4. Significance
  5.5. Study Strengths
  5.6. Limitations
  5.7. Conclusions

REFERENCES

APPENDICES
  Appendix I: Letter of Introduction, Participant Consent and Assent Forms
  Appendix II: Data Release Forms
  Appendix III: Variant Spreadsheet Headers
  Appendix IV: Case Data Summary
  Appendix V: Evidence for Analyzed Variants

LIST OF TABLES

Table 3.1: Participant demographics
Table 3.2: Variant analyses by number of family members sequenced
Table 3.3: Candidate variants

LIST OF FIGURES

Figure 2.1: Participant recruitment and research process.
Figure 2.2: Variant filtration protocol.
Figure 2.3: ACMG criteria evidence framework (Figure 1 from: Richards et al., 2015).
Figure 4.1: Coverage of the RYR1 gene for case 10.
Figure 4.2: Pedigree for case 16.
Figure 4.3: Case 17’s exome sequencing reads aligned to the gene UTP4.
Figure 4.4: Pedigree for case 26.

LIST OF ABBREVIATIONS

ACMG American College of Medical Genetics and Genomics

C4R Care4Rare Canada Consortium

C4R-S Care4Rare-Solve

CCM Centre for Computational Medicine

CNV Copy Number Variant

DCC Data Coordination Centre

EDMD Emery-Dreifuss Muscular Dystrophy

ES Exome Sequencing

FAS Fetal Alcohol Syndrome

GnomAD Genome Aggregation Database

GUS Gene of Uncertain Significance

HLA Human Leukocyte Antigen

MCA Multiple Congenital Anomalies

MHC Major Histocompatibility Complex

NAIC North American Indian Childhood Cirrhosis

NGS Next Generation Sequencing

OMIM Online Mendelian Inheritance in Man

RSS Russell-Silver Syndrome

WGS Whole Genome Sequencing

WRHA Winnipeg Regional Health Authority

VUS Variant of Uncertain Significance

CHAPTER 1: BACKGROUND AND REVIEW OF THE LITERATURE

1.1. Rare Disease

Rare diseases are not regularly considered by most individuals working in healthcare but collectively have a large impact on the healthcare system. In the European Union, rare diseases are defined as diseases that affect no more than 1 in 2000 individuals (Baldovino, Moliner, Taruscio, Daina, & Roccatello, 2016), while the United States defines them as affecting no more than 200,000 people, or 1 in 1,860 (Public Law 107–280, 2002). Canada currently does not have a strict definition for rare diseases, except that they affect few individuals (Canadian Organization for Rare Disorders, 2015). While each individual condition is considered rare, there are more than 6000 disorders that fit within this definition. Therefore, it may not be surprising that, when taken all together, rare disorders are thought to affect 1 in 12 Canadians (Canadian Organization for Rare Disorders, 2015).

Most rare diseases are genetic in origin and often affect multiple body systems. There is significant morbidity and mortality associated with these conditions, with common features including developmental delay/intellectual disability, neurodegeneration, organ failure, and shortened life expectancies. In fact, it has been estimated that 71% of hospitalized children have an underlying disorder that is at least partly genetic in nature (McCandless, Brunger, & Cassidy, 2004). Despite the frequency and the clear medical need, diagnosing these disorders is no easy task. In a survey of patients with rare disorders, the time from disease onset to diagnosis ranged from 5-30 years (Schieppati, Henter, Daina, & Aperia, 2008). While waiting for a correct diagnosis, 40% of patients were diagnosed incorrectly, and the remainder were left with no diagnosis at all. Misdiagnoses can have significant consequences such as inappropriate or denied surgeries and medications, mismanagement of symptoms, and unnecessary travel (Al-Murshedi, Meftah, & Scott, 2019; Schieppati et al., 2008). Hence, there is a clear need for improved diagnostic methodologies for patients with rare disorders.

1.2. Traditional Methods of Diagnosis

Traditional genetic testing methods usually follow a step-wise approach and depend heavily on the features of the disease. Chromosomal microarrays are frequently used as a first-line test for many presentations, such as intellectual disability or multiple congenital anomalies. Chromosomal microarray analysis is genome-wide and can detect copy number variants (CNVs) of 20kb or greater in coding regions and 50kb or greater in non-coding regions. These CNVs are responsible for 15-20% of unexplained developmental delay, although microarrays may be less helpful for many other clinical presentations. Ultimately, though, they fail to identify single nucleotide changes or smaller CNVs, which are the cause of most genetic conditions (Miller et al., 2010). In order to identify abnormalities at the gene level, specific targeted mutation testing or targeted sequencing technologies are needed (single gene sequencing or gene panels). It should be noted that these techniques are hypothesis-based, relying on the clinician having a prediction about the cause of disease. Physicians must recognize which disorder is associated with the patient’s presenting symptoms, based on published descriptions in the literature; however, many genetic disorders have wide phenotypic variability, sometimes with features that have not previously been documented, so recognition of these disorders is not an easy task (S. Chang, Vaccarella, Olatunji, Cebulla, & Christoforidis, 2011).

These traditional step-wise approaches can be extremely costly and time-consuming, and often do not yield a diagnosis (Lionel et al., 2018; Nambot et al., 2018; Stavropoulos et al., 2016). In fact, it is estimated that approximately 50% of individuals with a rare genetic disorder do not have a confirmed molecular diagnosis (Shashi et al., 2014). When molecular diagnostic methods fail to find a genetic cause for a disorder, patients can be stuck in a “diagnostic odyssey.” This is a term used in the literature to describe patients who go years without a diagnosis, see many different specialists, and are exposed to various treatments (Sawyer et al., 2016; Schwarze, Buchanan, Taylor, & Wordsworth, 2018). Not only is it heartbreaking and tiring for these families to go through so many appointments and tests to no avail, but it is also extremely costly for the health care system (Monroe et al., 2016; Tan et al., 2017). For this reason, genome-wide next generation sequencing (NGS) methods have started to be introduced into the clinical setting as a way of finding diagnoses for the remainder of patients who have been unable to achieve a diagnosis by traditional methods.

1.3. Exome Sequencing

Exome sequencing (ES) has become available to some clinics as a second-tier test when previous methods have failed to provide a diagnosis. ES is an unbiased approach that sequences most of the known coding exons in the genome and does not target any specific gene or region. ES is performed through NGS, where library preparation is performed using probes specifically designed for the protein-coding regions of the genome (Myllykangas, Natsoulis, Bell, & Ji, 2011). This method has the ability to capture up to 95% of the exome at a minimum sequencing depth of 20x, allowing accurate single nucleotide variant calling (Lelieveld, Spielmann, Mundlos, Veltman, & Gilissen, 2015).
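The coverage figure quoted above can be made concrete with a short calculation. The following is a minimal sketch, assuming per-base read depths are already available as a list of integers; the function name and the toy depth values are hypothetical illustrations and are not part of this study's pipeline.

```python
def fraction_covered(depths, min_depth=20):
    """Return the fraction of targeted bases sequenced at or above min_depth."""
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths)


# Toy per-base depths for ten targeted bases (illustrative values only).
toy_depths = [35, 42, 18, 27, 60, 5, 22, 31, 20, 48]
print(f"{fraction_covered(toy_depths):.0%} of bases at >=20x")
```

In practice this summary would be computed over every targeted base in the exome rather than a handful of positions, and the 20x threshold is simply the depth cited in the text.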

1.3.1. Criteria for Testing

Because ES has only recently been introduced into the clinical setting, criteria for which patients receive ES vary by institution, with no published guidelines. It has been suggested that in order for patients to receive ES there should be a high confidence of a genetic etiology, a broad differential diagnosis or low phenotypic specificity, and a failure of alternative targeted approaches to obtain a diagnosis (Boycott et al., 2015). It is not surprising that the availability of ES varies greatly between centres. In their study of ES performed at Baylor College in the United States, Yang et al. noted that the only reason that a potential participant did not receive ES was denial of coverage by their insurance (2014). However, in a single-payer Canadian system, selection criteria for participants may be more stringent and depend more heavily on the discretion of the geneticist. In general, ES leads to a confirmed genetic diagnosis in 25-30% of patients, though depending on clinical presentation and criteria used for testing this range can expand to 16-68% (De Ligt et al., 2012; Nambot et al., 2018; Retterer et al., 2016; Sawyer et al., 2016; Stavropoulos et al., 2016; Tarailo-Graovac et al., 2016; Y. Yang et al., 2014).

1.3.2. Factors that Affect Diagnostic Rates

The indication for genetic testing has been found to greatly impact diagnostic rates. Tarailo-Graovac et al. found a 68% diagnostic rate when applying ES to neurometabolic disorders (2016). This high diagnostic rate was thought to be due to the strict inclusion criteria used in their study, as well as to the high prevalence of autosomal recessive inheritance in these types of disorders. In fact, 71% of diagnoses in their study had autosomal recessive inheritance, while other studies show recessive inheritance in 30-40% of ES diagnoses (Retterer et al., 2016; Stavropoulos et al., 2016; Y. Yang et al., 2014). Disorders with autosomal recessive inheritance are particularly amenable to diagnosis by ES due to the presence of more than one variant of suspicion in the same gene. Therefore, it stands to reason that disease indications with high rates of autosomal recessive disease will have a higher diagnostic rate than disorders with other inheritance patterns.

In contrast, de Ligt et al. found a 16% diagnostic rate in their study of ES in 100 people with severe intellectual disability (2012). All patients in this study had an IQ of under 50 and had previously undergone an extensive diagnostic workup, including microarray, targeted gene testing, and metabolic screening. There are a few factors that likely contribute to this low diagnostic rate. First, intellectual disability is well known to be a heterogeneous phenotype that can be caused by both genetic and environmental contributors, but the environmental components would not be evident using ES. Second, diagnostic rates of ES increase the earlier the testing is performed, because of the inclusion of patients with genetic changes in well-established disease genes who would otherwise have been diagnosed via targeted genetic testing (Stark et al., 2016). Third, de Ligt et al. only report on definitive diagnoses found in their study and did not report on candidate variants found in genes that had not yet been associated with disease (2012). If these data were re-examined several years later, the diagnostic rate of this study would likely be higher due to the discovery of new disease-gene associations. This is discussed further in Section 1.4 on ES reanalysis.

Indeed, a variety of publications have found marked differences in successes of ES for differing groups of genetic disorders. Retterer et al. reported the highest diagnostic yield in hearing impairment at 55%, followed by visual impairment at 47% (2016). Similarly, Lee et al. found the highest diagnostic rate in patients with retinal disorders (2014). Interestingly, neurological disorders consistently have some of the highest rates of diagnosis ranging from 30-40% (Nambot et al., 2018; Retterer et al., 2016; Shashi et al., 2019; Stavropoulos et al., 2016; Y. Yang et al., 2014). The rate of diagnosis for developmental disorders varies depending on accompanying symptoms. When a participant displays both developmental delay and autism but no other clinical features, the diagnostic rate ranges from 15-20%, while developmental delay accompanied by dysmorphic features has a diagnostic rate of 30% (Lee et al., 2014; Retterer et al., 2016). Disorders reported to have lower rates of genetic diagnosis include connective tissue disorders and disorders of sexual development (Lee et al., 2014; Stavropoulos et al., 2016). It must be noted, however, that sample sizes in many of these studies were too small to find statistically significant evidence of differences between groups.

Sample size is an important consideration when looking into the diagnostic rates of exome testing for different indications. Retterer et al. reported the highest rate of ES diagnosis in individuals with hearing loss; however, hearing loss was also the smallest cohort in their study, with only 11 participants (2016). With such a small sample size, it is difficult to definitively conclude that hearing loss has a higher diagnostic rate than other indications. Conversely, their largest cohort was those with disorders of the central nervous system, with 1082 participants, and had a diagnostic rate of 31%, while multiple congenital anomalies had 729 participants and a diagnostic rate of 36%. Though a statistical test was not done to compare these groups, it is more appropriate to draw conclusions about the differences in diagnostic rates seen between groups of this size than it is in samples of only 11 individuals. It should be noted that many studies on ES analysis use small sample sizes, due to the low incidence of rare disorders and the limited availability of ES. Those studies that do have larger sample sizes are often looking at a broad spectrum of clinical presentations, making it difficult to draw conclusions about which groups of disorders have higher diagnostic rates.
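The effect of cohort size on the certainty of a reported diagnostic rate can be illustrated with a confidence interval. The sketch below uses the Wilson score interval, which is a choice made here for illustration and not an analysis performed by the cited authors. The diagnosis counts are assumptions back-calculated from the reported percentages (55% of 11 taken as 6, and 31% of 1082 taken as 335).

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


# Counts are assumptions derived from the reported percentages, not published data.
cohorts = [("hearing loss", 6, 11), ("central nervous system", 335, 1082)]
for label, k, n in cohorts:
    low, high = wilson_interval(k, n)
    print(f"{label}: {k}/{n} = {k / n:.0%}, interval {low:.0%} to {high:.0%}")
```

Under these assumed counts, the interval for the 11-person cohort is several times wider than the interval for the 1082-person cohort, which is the quantitative form of the caution expressed above.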

The sequencing strategy can also have a large impact on the success rates for ES. Trio ES is when three members of a family undergo genetic sequencing, often the proband and both biological parents, in order to better identify inheritance patterns of genetic variants. The main advantage of trio sequencing is the ability to identify de novo variants, but it also makes it easier to identify homozygous and compound-heterozygous variants in autosomal recessive disease, and to rule out autosomal dominant variants that have been inherited from an unaffected parent (Retterer et al., 2016). In fact, Yang et al. found that 87% of autosomal dominant disorders were caused by de novo mutations which were identified through trio sequencing (2014). This can be highlighted by a case where a patient was diagnosed after they were found to have a de novo variant in a gene that had not previously been considered because the patient lacked a key phenotypic feature of the disorder: hairy elbows (Lee et al., 2014). In this same study, there was a statistically significant increase in diagnoses for participants with trio sequencing (Lee et al., 2014). Similarly, Retterer et al. found a 24% diagnostic rate for probands sequenced in singleton and a 31% diagnostic rate for those sequenced as a trio (2016).
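The inheritance patterns that trio sequencing helps to resolve can be sketched in a few lines. This is a simplified illustration, assuming autosomal variants and genotypes encoded as alternate-allele counts (0, 1, or 2); the function names and encoding are hypothetical, and a real pipeline would also account for genotype quality, sex chromosomes, and mosaicism.

```python
def classify_trio_genotype(proband, mother, father):
    """Classify one autosomal variant from alternate-allele counts (0, 1, or 2)."""
    for g in (proband, mother, father):
        if g not in (0, 1, 2):
            raise ValueError("genotypes must be 0, 1, or 2")
    if proband == 0:
        return "absent in proband"
    if mother == 0 and father == 0:
        return "candidate de novo"
    if proband == 2 and mother >= 1 and father >= 1:
        return "homozygous, inherited from both parents"
    if proband == 2:
        return "homozygous, one parent lacks the allele"
    return "heterozygous, inherited"


def is_compound_het(gene_variants):
    """Check one gene for heterozygous variants inherited from different parents.

    gene_variants is a list of (proband, mother, father) tuples for that gene.
    """
    from_mother = any(p == 1 and m >= 1 and f == 0 for p, m, f in gene_variants)
    from_father = any(p == 1 and f >= 1 and m == 0 for p, m, f in gene_variants)
    return from_mother and from_father


print(classify_trio_genotype(1, 0, 0))
print(is_compound_het([(1, 1, 0), (1, 0, 1)]))
```

Without parental genotypes, none of these categories can be assigned directly, which is one reason singleton analysis leaves more variants for manual review.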

1.3.3. Beyond Single Nucleotide Causes of Disease

Because of the hypothesis-free nature of ES, it has the advantage of solving disorders with more complicated underlying genetic causes. This includes identifying disease-causing variants with low levels of mosaicism and identifying multiple disease-causing variants for patients with more than one underlying genetic disease. In the case of mosaic causes of disease, two distinct studies showed mosaicism in approximately 1-2% of their participants, and were able to estimate the fraction of cells harbouring the pathogenic mutations (Retterer et al., 2016; Y. Yang et al., 2014). Similarly, for the case of multiple genetic diagnoses, Stavropoulos et al. found genetic changes in multiple disease-causing genes in 4% of their patients; however, due to finding variants in genes that only partially account for participant phenotypes, they estimate that 9% of their cohort may have a second genetic disorder (2016). In addition, Tarailo-Graovac et al. found two genetic diagnoses in 14% of their participants, including a 19-year-old male with progressive dilated cardiomyopathy and sensorineural hearing loss who harboured pathogenic compound heterozygous variants in NPL as well as homozygous pathogenic variants in GJB (Tarailo-Graovac et al., 2016).

Although outside of the strict scope of the technology, ES also has the potential to find diagnoses that are not due to single nucleotide variants. For example, ES was able to identify uniparental disomy based on 2 homozygous regions of 5Mb and 19Mb on the same chromosome, with coverage consistent with the rest of the exome, indicating no CNVs in this area (Lee et al., 2014). There were also no maternally inherited variants found on this chromosome, and coverage across the homozygous regions was consistent across the entire chromosome. This led the researchers to determine that paternal uniparental disomy of chromosome 6 was the cause of disease in this patient (Lee et al., 2014).

It should be noted that ES can also help us learn more about already known disorders. This is evident with the previously mentioned patient, who was diagnosed with Wiedemann-Steiner syndrome, despite not having the hairy elbows previously thought to be a key symptom of the disorder (Lee et al., 2014). Similarly, an 8-year-old boy with abnormal neurotransmitter profiles was found to have an exon deletion in SCN2A, a gene associated with varying phenotypes, but never described in a neurotransmitter disorder (Tarailo-Graovac et al., 2016).

1.3.4. Secondary Findings

All untargeted tests have the ability to identify disease-causing changes that were not related to the initial testing indication, referred to as secondary findings. However, due to the large scope of ES, the chances of coming across a secondary finding are greatly increased with this technology. This can be thought of as both an advantage and a disadvantage of the technology. The identification of medically actionable secondary findings could have life-saving consequences for patients and their relatives; however, if participants are not giving fully informed consent to receive secondary findings, then revealing these findings to the participant may go against their personal autonomy (Taylor et al., 2015).

The American College of Medical Genetics (ACMG) released guidelines on secondary findings, including a list of 59 medically actionable genes, to aid clinicians in knowing what information to disclose to patients who receive ES (Green et al., 2013; Kalia et al., 2017). Additionally, during the consenting process for ES, patients are often given the option to receive secondary findings or not, in order to better preserve patient autonomy.

The proportion of patients wishing to receive information about secondary medically actionable findings ranges from 74-96% (Lee et al., 2014; Stavropoulos et al., 2016). Interestingly, Stavropoulos et al. found that 35% of families did not feel comfortable receiving secondary findings that may affect their life insurance (2016). It is unclear if this number has changed since the introduction of the Canadian Genetic Non-Discrimination Act in 2017, which states that insurance companies cannot request access to genetic testing information or results.

It has been reported that secondary findings are seen in 0.5-6.0% of participants who receive ES (Lee et al., 2014; Nambot et al., 2018; Retterer et al., 2016; Tarailo-Graovac et al., 2016; Y. Yang et al., 2014). The most common secondary findings are related to cardiac conditions. Cardiomyopathies and inherited arrhythmias account for over 50% of secondary findings (Retterer et al., 2016; Shashi et al., 2019; Y. Yang et al., 2014). The next most common type of secondary finding is related to inherited cancer syndromes (Retterer et al., 2016; Y. Yang et al., 2014). These types of diagnoses can have significant impacts on patient health outcomes by leading to increased screening and medical interventions such as the introduction of beta-blockers and/or implantable cardioverter-defibrillators.

1.3.5. When Exome Sequencing Cannot Find a Diagnosis

Unfortunately, although ES is the most powerful clinical test currently available, no test is perfect. Shashi et al. suggest three reasons why ES may not lead to a genetic diagnosis (2019). First, the cause of the disorder may not be Mendelian in nature; this includes disorders that are polygenic, multifactorial, or purely due to environmental factors. In these cases, no genetic test will lead to a diagnosis, but ES may help to rule out a monogenic etiology for the disease. Second, the variant may be undetectable by ES technology. This is because ES is unable to find mutations in deep intronic or promoter regions, and has difficulty identifying moderate to large CNVs, repeat expansions, and any variant present in a segmentally duplicated region, as well as differentiating between genes and pseudogenes. Therefore, genetic disorders caused by these types of genetic changes will be missed by ES. As well, approximately 5-7% of coding exons are not covered adequately in ES to make trusted variant calls, which may lead to missed diagnoses (Lelieveld et al., 2015). Finally, variants may be detected by ES, but may not be reported as disease-causing. This is because ES is limited by the interpretation of each genetic variant found, which relies heavily on manual curation of variants, the availability of patients’ phenotypic information, and reported disease-gene associations (Salmon et al., 2019).

1.4. Exome Sequencing Reanalysis

1.4.1. New Disease-Gene Associations

Systematic reanalysis of ES data at regular intervals has consistently been shown to diagnose an additional 10-15% of cases (Eldomery et al., 2017; Ewans et al., 2018; Nambot et al., 2018; Shashi et al., 2019; Wenger, Guturu, Bernstein, & Bejerano, 2017). The most common reason for the increase in genetic diagnosis is the discovery of new genes associated with the disease of interest since the last analysis of the exome data. There are close to 250 new disease-gene associations and over 9000 new variant-disease associations made each year, showing that the passage of time creates new opportunities for patient diagnosis (Wenger et al., 2017). In fact, Yang et al. found that 30% of diagnoses made in their cohort were in genes with a disease association that had only recently been published, indicating that if ES had been performed only a few months earlier then the patient would not have received a diagnosis (2014).

In their reanalysis study, Lee et al. presented two cases where initial ES was unable to find a diagnosis due to the lack of disease-gene associations that only became available shortly before reanalysis (2014). In one case, a patient with complex epilepsy and developmental regression was found to have a de novo missense variant in a gene of uncertain significance (GUS), KCNT1. Shortly after initial ES analysis had been completed, this gene was reported to cause epileptic encephalopathy, which matched the phenotype of their patient. Similarly, another patient with developmental delay, seizures, and microcephaly was found to have a de novo variant in the gene TUBB2A. A publication linking this gene to epilepsy in infants was released in 2014, not only allowing for a confirmed diagnosis in this patient, but also triggering the authors to review another case with similar findings (Cushion et al., 2014; Lee et al., 2014). The release of this publication led to a confirmed diagnosis in two independent patients with infantile-onset epilepsy.
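The logic of reanalysis driven by new disease-gene associations can be expressed as a comparison between two snapshots of a gene list. The following is a minimal sketch under stated assumptions: the function name is hypothetical, the gene lists are toy examples in which GENE_A and GENE_B are placeholders, and KCNT1 is used only because it is the example discussed above.

```python
def newly_reportable(variant_genes, genes_at_first_analysis, genes_now):
    """Return genes with candidate variants whose disease association is new.

    A gene qualifies if it carries a candidate variant and appears in the
    current disease-gene list but not in the list used at first analysis.
    """
    new_genes = set(genes_now) - set(genes_at_first_analysis)
    return sorted(set(variant_genes) & new_genes)


# Toy data: placeholders except KCNT1, which mirrors the case described above.
candidate_variant_genes = ["KCNT1", "GENE_A"]
known_at_first_analysis = ["GENE_B"]
known_at_reanalysis = ["GENE_B", "KCNT1"]

print(newly_reportable(candidate_variant_genes,
                       known_at_first_analysis,
                       known_at_reanalysis))
```

Here GENE_A remains a gene of uncertain significance because it is in neither list, while KCNT1 is flagged because its association was added between the two analyses.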

1.4.2. Variant Interpretation

Misinterpretation of variants is also a common reason for missed diagnoses. Each ES run can generate over 5000 variants. While most of these variants may be automatically filtered due to high prevalence in the general population, there can still be hundreds of variants that need to be manually reviewed (Retterer et al., 2016). Yang et al. reported that at the start of their ES review it took approximately 6 hours to analyze the variants for each participant (2014). This time was lowered to approximately half an hour per case by the end of their study of 2000 patients. Because of the time required to analyze these variants, laboratories must choose which variants to prioritize, which could lead to filtering of disease-causing variants. Even when variants are properly scrutinized, the interpretation of the variant’s pathogenicity can vary from person to person. This was the case in a study by Amendola et al. who found a 66% variant discordance rate between laboratories using the ACMG variant interpretation guidelines (2016).

During their ES reanalysis study, Shashi et al. contacted the initial sequencing laboratories to determine the reasons for missed diagnoses during initial analyses (2019). They found that one laboratory focussed on de novo variants and therefore missed an inherited disease-causing variant in the proband, and another two variants were in genes initially not thought to fit the patient phenotype.

Similarly, Taylor et al. found disease-causing variants in 4 participants that had initially been screened and misinterpreted by the laboratory (2015). Finally, Nambot et al. found that 2.5% of their reanalysis diagnoses were due to initial misinterpretations of variants (2018). Therefore, it may not be surprising that Al-Murshedi et al. believe that initial variant misinterpretation is an underappreciated problem in the ES world, and use this evidence to encourage more timely reanalysis of negative ES results (2019).

1.4.3. Trio Exome Sequencing as a Way to Increase Diagnoses

It has been suggested that the addition of parental samples (i.e., converting a singleton analysis into a trio reanalysis) may increase the number of diagnoses (Eldomery et al., 2017). This type of reanalysis increases the cost, so it is important to consider whether or not variants found after the addition of parental sequences would have been discovered when reanalysing the singleton exome on its own. Some bioinformatic pipelines can use simultaneous variant calling of trios to prioritize de novo

variants. Because de novo disease-causing variants can often be overlooked during analysis, trio sequencing with these bioinformatics pipelines may aid in increasing the number of diagnoses (Retterer et al., 2016; Y. Yang et al., 2014). However, researchers must take care to consider all modes of inheritance, as seen in the previously mentioned case presented by Shashi et al., where an inherited pathogenic variant was missed because the laboratory highly prioritized de novo variants (2019).

1.4.4. Bioinformatic Considerations

It may also be important to consider using different bioinformatics pipelines during sequencing reanalysis. When a variant is seen in ES data, bioinformatics software is used to annotate that variant with information such as gene and exon location, variant type, and frequency in population databases.

The different annotation tools often collect data from various online databases; therefore, the accuracy of those databases is important to the annotation and variant interpretation. Taylor et al. showed that different annotation tools may have discrepancies between them (2015). In this study, the annotation tools ANNOVAR and Ensembl had a 66% agreement in prediction of loss-of-function annotations and an

87% agreement for annotations of all variants within exons. The greatest number of disagreements occurred for the annotation of putative splice-site variants. For this reason, researchers may want to consider multiple variant callers and annotation tools during reanalysis. The bcbio-nextgen pipeline

(https://github.com/bcbio/bcbio-nextgen) jointly uses the variant callers GATK Haplotype Caller (Poplin et al., 2017), FreeBayes (Garrison & Marth, 2012), Platypus (Rimmer et al., 2014), and Samtools (H. Li et al., 2009), as well as creating annotations using VEP-Ensembl (McLaren et al., 2016) and RefSeq

(https://www.ncbi.nlm.nih.gov/refseq/). This combination of multiple variant callers and annotation tools reduces the chances of losing a true variant during bioinformatics processing.


1.4.5. When to Reanalyse

Despite the clear benefit, no guidelines exist for systematic reanalysis of clinical exome data.

Many groups recommend ES reanalysis every 1-3 years (Ewans et al., 2018; Wenger et al., 2017), but this recommendation does not take into account the resources needed for reanalysis. In their European guidelines for NGS diagnostic testing, Matthijs et al. clearly state that laboratories are not expected to systematically reanalyse old data (2016). Some laboratories do provide systematic reanalysis, while others will only reanalyze data at the request of the ordering physician. However, due to the amount of time required for variant assessment, some laboratories limit data reanalysis to one free request per patient, leaving it up to the ordering physician to determine when it is most optimal to request data reanalysis.

1.5. Clinical Analysis Versus Research Analysis

An important point to consider in the success of various sequencing methods is the difference between clinical sequencing and research sequencing. Several papers in this literature review have noted that the introduction of ES as a standard of care for patients with a suspected rare Mendelian disorder is blurring the lines between clinical care and research (Matthijs et al., 2016; Nambot et al.,

2018). This is because genetic testing has previously been used to molecularly confirm suspected clinical diagnoses, whereas genome wide-sequencing is now being used to discover diagnoses.

In their guidelines for the diagnostic use of NGS, Matthijs et al. state that a research test is a hypothesis-driven test with limited clinical relevance to the patient enrolled (2016). However, genome-wide sequencing, such as ES and whole genome sequencing (WGS), is often only available on a research basis and is generally considered to be hypothesis free. In this case, hypothesis free means that it is an unbiased search of the entire genome, without a specific gene or diagnosis in mind. As well, Matthijs et al. state that “a diagnostic test is any test directed toward answering a clinical question related to a

medical condition of a patient” (2016). This is where the boundary between a research test and a clinical test is blurred. The purpose of genome-wide sequencing is to answer a clinical question even if it is being offered to a patient on a research basis.

The availability of genome-wide sequencing analysis on a research basis opens the door for translational studies to be used in GUSs. Currently only genes that are known to be associated with the patient’s phenotype can be clinically reported; however, NGS technologies create an opportunity for additional gene discovery. In their study, Nambot et al. noted that 50% of their diagnosed cases were solved through the use of both data sharing and translational studies (2018). Not only does enrolling patients in research studies allow them access to more thorough testing, such as WGS, but it also allows for translational studies on GUSs and for data sharing between clinicians. Matchmaker Exchange is an international data sharing site that allows clinicians to upload patient phenotypes and variant information (Philippakis et al., 2015). In this way, clinicians can connect to other healthcare providers anywhere in the world in order to find other individuals with a phenotype similar to those seen in their patients. If patients with similar phenotypes in different parts of the world are then found to have variants in the same candidate gene, this can increase the evidence that the candidate gene is disease-causing, increasing the potential to find a genetic diagnosis. This goes to show that in the field of genetics it is important to push the boundaries of clinical testing into the research realm in order to provide patients with optimal care.

1.6. Potential Benefits of Genetic Diagnosis

The impacts of genetic diagnoses on patient care and disease management can be quite far reaching. Ewans et al. note that a confirmed diagnosis leads to significant savings in cost through the use of appropriate and targeted clinical management, as well as increased life span and quality of life for patients, and the ability for family members to make informed reproductive decisions (2018). Because of

these listed benefits, it has been suggested that the use of ES methods as a first-tier test may result in overall cost savings in the healthcare system (Lionel et al., 2018). Indeed, in their cohort, Ewans et al. reported an average time to diagnosis of 12 years 8 months (2018). No doubt, during this time there were multiple unsuccessful diagnostic tests, misdiagnoses, and care mismanagement. In fact, Tarailo-Graovac et al. reported that having a confirmed genetic diagnosis affected the care in 44% of their patients (2016). Similarly, Eldomery et al. listed that implications of genetic diagnoses included treatment with acetazolamide, surveillance for arrhythmias, kidney and liver surveillance, as well as ICD implantations (2017).

The literature revealed several examples where ES changed the working diagnosis of a patient to a completely different diagnosis, which had significant impact on screening and management implications for that patient. One patient, believed to have a mitochondrial diagnosis, was found to have

Costello syndrome via ES investigation (Ewans et al., 2018). This changed patient care management to no longer use mitochondrial therapy and to instead focus on cardiac and cancer surveillance. In another patient, their features suggested a connective tissue disorder and a second diagnosis of NF1; however,

WGS led to a diagnosis of Sotos syndrome, significantly changing this patient’s care management

(Stavropoulos et al., 2016).

Diet change and supplementation is a common implication for genetic diagnoses of inborn errors of metabolism. Eliminating fructose from the diet of a patient with homozygous variants in

Aldolase B was able to resolve their recurrent comas and seizures. A ketogenic diet was suggested for a patient with GLUT1 deficiency syndrome that led to absence seizures, ataxia, and developmental delay

(Retterer et al., 2016). Similarly, Nambot et al. had success in stopping seizures and improving psychomotor development using the ketogenic diet for their patient with GLUT1 deficiency syndrome

(2018). Supplementation with N-acetylneuraminic acid was used to treat a patient with compound heterozygous variants in the gene NANS. Carglumic acid along with an emergency protocol were able to

prevent irreversible neurological damage in a patient with homozygous missense variants in the gene

CA5A, important for the function of mitochondrial enzymes. A microcephalic patient with a severe seizure disorder was found to have compound heterozygous variants in GOT2, encoding a mitochondrial glutamate oxaloacetate transaminase. This patient was found to have improved head growth, psychomotor development, and seizure control when he was given oral serine and pyridoxine supplements (Tarailo-Graovac et al., 2016). Finally, pyridoxal phosphate was able to treat the seizures in a patient with pyridoxamine 5-prime-phosphate oxidase deficiency (Retterer et al., 2016).

Outside of diet change and supplementation, treatments can also include more substantial medications such as L-DOPA and selegiline used to treat a patient with progressive hypotonia, seizures, and developmental delay found to have tyrosine hydroxylase deficiency (Retterer et al., 2016).

Medications were also helpful in preventing potentially fatal episodes for a patient with postural orthostatic tachycardia syndrome, diagnosed with acute intermittent porphyria (Retterer et al., 2016).

Cascade testing in the sister of a patient found to have MTO1 deficiency led to preventative therapy before the onset of symptoms (Tarailo-Graovac et al., 2016). Chemotherapy and stem-cell transplantation was made available to patients with variants found in SENP1, SYTL2, and KRAS. Finally, screening and avoidance of triggers were implicated for patients with various genetic diagnoses including variants found in the following disease genes: CBL, PIK3R1, ECT2, PIK3R2, NGLY1, KAT6B,

COL4A1, SMAD4, and PORSS1 (Stavropoulos et al., 2016; Tarailo-Graovac et al., 2016).

Even when treatments are not immediately available for patients, diagnoses may lead to investigations of novel disease treatments. For example, Tarailo-Graovac et al. were able to look into potential treatment options for a patient with a deficiency in acetyl-coenzyme A carboxylase beta

(2016). Similarly, Nambot et al. reported that novel therapeutic options were made for 9 of their patients diagnosed via ES, including supplementation with N-acetylneuraminic acid for the patient with

biallelic variants in NANS, described earlier (2018). These examples show that diagnosis of rare genetic disorders has invaluable impacts on the quality of life and treatment options for patients.

1.7. Manitoba Population

Clinical ES and ES reanalysis appear to be promising avenues for finding diagnoses in patients with rare genetic disease. Nevertheless, an important point to consider when assessing the utility of ES is the background of the patient population being studied. Even within Canada, populations can vary considerably between provinces. This is evident by using the 2016 census data on the mother tongue language spoken by Canadians as a proxy for diversity. When looking at the mother-tongue languages of all Canadians, 56% report English as their mother tongue, while 20.6% report French, and 21.1% report speaking a non-official language (Statistics Canada, 2017c). These numbers vary greatly when comparing individual provinces. For example, in Nova Scotia, English is reported as the mother tongue language for 91% of residents, with French at 3.2%, and non-official languages at 4.9%. Similarly, the

Manitoba population mother tongue language distribution differs from the statistics shown to represent all Canadians. In Manitoba, the make-up of self-reported mother tongue languages is 71.4%, 3.2%, and

22.9%, for English, French, and non-official languages, respectively (Statistics Canada, 2017c). This goes to show that the ethnic makeup of Canadians cannot be generalized between provinces.

Manitoba has a unique and extremely diverse population when compared to the rest of Canada.

Manitoba consists of a mixture of populations of varying ethnicities, including a number of refugee populations. Based on the most recent census data 18.3% of Canada’s immigrant population live in

Manitoba, despite the province being home to only 3% of the population as a whole (Statistics Canada, 2017b). Since the current normal variant databases are heavily weighted to include outbred European individuals, this can complicate any genetic study. In addition to this ethnic diversity, Manitoba is home to a number of genetically isolated founder populations that include a number of indigenous populations as well as at

least two major Anabaptist populations, Hutterites and Dutch-German Mennonites. For all of these genetically isolated groups there can be wildly divergent allele frequencies compared to the Manitoba population as a whole, which can complicate genetic analysis. To give an idea of the impact of these genetic isolates in Manitoba, 4.9% of Canadians self-identify as Aboriginal peoples whereas 18% of

Manitobans self-identify with this group (Statistics Canada, 2017a). Similarly, 35,010 Canadians report living on Hutterite colonies, and 11,275 (32.2%) of those individuals reside within Manitoba (Statistics

Canada, 2017d). Given this unique mix, it cannot be assumed that the success rate of ES in Manitoba will be similar to other areas of the country.

1.8. Thesis Rationale

Clinical ES has been made available for select patients in Manitoba since 2014. Since that time,

72 families have undergone ES testing and genetic diagnoses have been made for 29 (40.3%) of these patients, while 7 are waiting on family studies to confirm pathogenicity (Internal, unpublished data).

Currently, ES is the most thorough clinical test that we have to offer patients with suspected monogenic diseases; however, systematic reanalysis of negative ES results is not routinely done and a negative result for many of these patients is often considered a diagnostic dead end. It should also be noted, that before the availability of clinical ES, a number of patients received ES on a research basis. Many of those patients who did not receive a diagnosis after research ES have also not had recent reanalysis of their data, leaving many without a diagnosis for many years. Therefore, we are proposing systematic reanalysis of both clinical and research ES data for this Manitoba cohort.

We hypothesize that systematic reanalysis of negative exomes should lead to increased diagnoses among Manitoba patients with suspected monogenic disorders that have remained undiagnosed after traditional molecular testing methods. This thesis is a pilot study that aims to show the diagnostic utility of such reanalysis of non-diagnostic ES data in our Manitoba cohort. It is already

known that systematic reanalysis of previously non-diagnostic ES data leads to increased diagnoses for patients. Prior to this study, this has never been done on a Manitoba patient population. We would like to develop the local infrastructure for doing such a systematic research reanalysis, as well as show that there is clinical utility (in the form of diagnostic yield) of implementing this approach in Manitoba.

Increased diagnosis of genetic causes of disease will lead to better outcomes for many families with these genetic disorders by dictating treatment and future care (Sawyer et al., 2016; Schieppati et al., 2008). Importantly, knowing the genetic cause of a condition allows for individuals to be counselled on genetics, recurrence risks, and prenatal options. Additionally, thorough analysis of variants in genes with unknown function may lead to future gene discovery projects, adding to our overall knowledge of human genetics.


CHAPTER 2: METHODS

2.1. Ethics Approval

This study was approved by the University of Manitoba’s Bannatyne Campus Health Research

Ethics Board (REB approval number HS22982/H2019:274). This study was done in collaboration with the

Care4Rare (C4R) research consortium and followed the research protocols of the Care4Rare-Solve (C4R-

S) research program.

2.2. Participant Recruitment

An overview of participant recruitment and study design can be found in Figure 2.1. Patients from the Winnipeg Region Health Authority (WRHA) who had received negative exome sequencing (ES) results were eligible to participate in this study. Exclusion criteria included participants without the ability to consent, participants whose ES data was unavailable for reanalysis, and participants who had received a genetic diagnosis that explained the phenotype in its entirety.


Figure 2.1: Participant recruitment and research process. N represents the number of total cases in each step of the process. Patients of the Winnipeg Region Health Authority (WRHA) who were identified as eligible were contacted by a genetics assistant for consent to be contacted for a research study. Those that were interested were contacted by the student PI and given a study package including an introductory letter, consent form, children’s assent form (if applicable), data release form, and return envelope. Participants had a follow-up meeting with the student PI via telephone or BlueJeans teleconference to ask questions about the consent forms and research process. Following consent, raw exome sequencing data was transferred from the original sequencing lab and put into the bcbio-nextgen sequencing pipeline for variant calling and annotation. Simultaneously, eligible participants from the Care4Rare (C4R) cohort were identified and selected for reanalysis. Variants of all participants were analyzed and interpreted by the student PI and presented at exome rounds with C4R.


Participants were contacted by a genetic assistant to request permission to be contacted for a research study. Participants who agreed were contacted by telephone and given an introduction to the purposes of the study. Participants had the option of receiving the information package by mail, fax, or e-mail. Information packages included introductory letters, informed consent forms, child assent forms when applicable (Appendix I), and data release forms for the laboratory that performed the initial ES

(Appendix II). Alternatively, patients who were scheduled for follow-up appointments with the Program for Genetics and Metabolism at the WRHA were asked if they would like to meet with a researcher at the time of their appointment. Participants who received the study package by mail or in-person were given a return envelope. Follow-up phone, BlueJeans teleconference, or in-person meetings were arranged with participants to go through the consent forms and answer any questions the participants may have had. This gave participants time to review the consent form in detail before agreeing to take part in the study.

Patients of the WRHA who had previously enrolled in the C4R study and had received a negative research exome were also eligible to participate in this study. Fifteen participants were selected for reanalysis based on the type of sequencing they had previously received (trio, duo, or singleton), the lack of potential candidate genes in their first analysis, the availability of their data, and the severity of their disease. These individuals were not reconsented for this study, as the consent form for C4R allows for ES reanalysis.

2.3. Chart Review and Phenotype Abstraction

Each participant’s medical chart was reviewed for relevant clinical information. Information that was collected includes: family structure, family history, ethnicity, current age, age of disease onset, phenotype, disease progression, and previous investigations. The phenotypic information was used to generate a list of terms using the Human Phenotype Ontology (HPO; Köhler et al., 2019; P. N. Robinson

et al., 2008). HPO is a standardized and hierarchical way of describing patient phenotypes. Each HPO term is accompanied by a list of associated genes based on disease descriptions in the Online Mendelian

Inheritance in Man (OMIM; https://omim.org/) database and crowdsourced phenotype annotations through the program Phenotate (https://phenotate.org/). HPO updates phenotype-gene associations monthly. Each participant was assigned HPO terms based on the phenotype description recorded during chart review. HPO terms were entered into the PhenomeCentral portal which produces a list of genes associated with all the HPO terms that were selected (Buske et al., 2015). HPO terms were also used to produce a ranked gene list using the program Phenolyzer (H. Yang, Robinson, & Wang, 2015).

Phenolyzer creates ranked gene lists by comparing phenotype terms to disease databases in order to collect lists of candidate genes, then the created candidate gene list is compared to gene-gene interaction databases. The final output is a ranked gene list, weighted by how likely a gene is to be associated with the list of phenotype terms.

2.4. Data Transfer

Laboratory specific data release forms were signed by the participants at the same time as the consent forms (Appendix II). The raw ES data was transferred as either a bam, cram, or fastq file based on the laboratory specific protocol. GeneDx data transfers occurred using a secure file transfer protocol server that has been established with C4R. PreventionGenetics and Baylor data transfers occurred through secure internet servers. In these secure servers, the laboratory uploaded the files to the student PI’s account for a limited number of days to allow them to be accessed and downloaded.

The Centre for Computational Medicine (CCM) at The Hospital for Sick Children (SickKids) in

Toronto, Ontario hosts all C4R data. As this project is under the umbrella of the C4R-S study, all data were transferred to the CCM Data Coordination Centre (DCC). This is a secure site which requires a data transfer code that expires bi-weekly. Once raw data have been uploaded to the DCC they are accessible by


CCM. In order to link the exome data in the DCC to the appropriate participant, participant information

(i.e., project name, participant code, and data type) is added to the SickKids sample tracker website.

2.5. Bioinformatics Pipeline

C4R has developed an ES analysis pipeline, called CRE. This pipeline is available for download from https://github.com/ccmbioinfo/cre. The pipeline is dependent on the bcbio-nextgen pipeline, available to download from https://raw.github.com/bcbio/bcbio-nextgen/master/scripts/bcbio_nextgen_install.py. The pipeline initially aligns reads to the GRCh37 reference genome using the program BWA-mem (H. Li & Durbin, 2010). Variants are then called using the programs FreeBayes, GATK-HaplotypeCaller, Samtools, and Platypus (Garrison & Marth, 2012; H. Li et al., 2009; Poplin et al., 2017; Rimmer et al., 2014). The program then annotates the variants from each caller with information such as gene, frequency in population databases, association with disease, etc., using VEP-Ensembl (McLaren et al., 2016) and RefSeq (https://www.ncbi.nlm.nih.gov/refseq/).

Annotated vcf files are inputted into the CRE pipeline, which outputs an annotated spreadsheet including variants that were called by two or more variant callers with an allele frequency of less than

1%. Information on the final spreadsheet annotation can be found in Appendix III.
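The consensus step described above (keep a variant only if at least two callers report it and its allele frequency is below 1%) can be sketched as follows. This is a minimal illustration and not the CRE implementation: the `Variant` record, the function name `consensus_rare_variants`, and the choice to treat a variant absent from the frequency table as having a frequency of 0.0 are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal variant key (not the CRE data model)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def consensus_rare_variants(calls_by_caller, allele_freq,
                            min_callers=2, max_af=0.01):
    """Keep variants reported by >= min_callers callers with AF < max_af.

    calls_by_caller: dict mapping caller name -> set of Variant
    allele_freq:     dict mapping Variant -> population allele frequency;
                     a missing entry is treated as 0.0 (assumption)
    """
    # Record which callers support each variant.
    support = {}
    for caller, variants in calls_by_caller.items():
        for variant in variants:
            support.setdefault(variant, set()).add(caller)

    kept = [
        variant
        for variant, callers in support.items()
        if len(callers) >= min_callers
        and allele_freq.get(variant, 0.0) < max_af
    ]
    return sorted(kept, key=lambda v: (v.chrom, v.pos, v.ref, v.alt))
```

Requiring agreement between callers trades a small loss in sensitivity for fewer caller-specific artefacts; the thresholds are exposed as parameters so that a stricter or looser consensus can be tried.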

2.6. Variant Filtration

Once variants had been called and annotated, variant filtration was performed as shown in

Figure 2.2. It should be noted that additional filters may have been applied for some participants based on the expected mode of inheritance or the number of family members sequenced. For example, filters to assess de novo variants were applied for participants who received trio-ES with unaffected parents, and filters to assess shared homozygous variants were applied to sibling pairs with a consanguineous family history. Nevertheless, all participants were assessed using the three initial filters described below

and shown in Figure 2.2. After filtration, all variants were analyzed using ACMG guidelines for variant interpretation (Richards et al., 2015). An overview of ACMG criteria can be found in Figure 2.3.


Figure 2.2: Variant filtration protocol. Three initial variant filters were applied for each participant. The goal of the first filter was to classify variants that may have had new gene-disease associations, have been misclassified during initial assessment, or were missed by the original bioinformatics pipeline. This filter contained variants in genes related to the participant’s phenotype using HPO terms, variants with an allele frequency of <0.001 in the gnomAD database, and variants whose zygosity/allele burden matched the expected inheritance of the gene. The goal of the second filter was to analyze variants that had a high chance of deleterious effects on genes without known links to the phenotype. This included frameshift, nonsense, and potential splice-site variants. These variants were filtered for an allele frequency of <0.001. The goal of the third filter was to assess candidate genes associated with the participant’s phenotype. Candidate gene lists were downloaded from databases curated by experienced working groups for the disorders of interest. These variants were also filtered to only include those with an allele frequency of <0.001.


Figure 2.3: ACMG criteria evidence framework (Figure 1 from: Richards et al., 2015). “This chart organizes each of the criteria by the type of evidence as well as the strength of the criteria for a benign (left side) or pathogenic (right side) assertion… BS, benign strong; BP, benign supporting; FH, family history; LOF, loss of function; MAF, minor allele frequency; path., pathogenic; PM, pathogenic moderate; PP, pathogenic supporting; PS, pathogenic strong; PVS, pathogenic very strong” (Richards et al., 2015).


2.6.1. Filter 1: New Disease-Gene Associations and Previously Miscalled Variants

The first filter was meant to assess variants that may have been missed during initial ES analysis.

These variants may have been missed due to a lack of identified disease-gene associations at the time of initial analysis, misclassification of the assessed variants, or a variant that was not called by the initial bioinformatics pipeline. Only variants in genes associated with the HPO terms for each participant were assessed. Variants were filtered to only include those with a maximum population allele frequency of

0.001. This means that only variants that occur in less than 0.1% of any population were assessed. This is because the diseases in our participants are extremely rare and if they were to occur with an allele frequency of greater than 0.001 then they would occur too often in the population for them to be associated with the disease of interest. Zygosity of the variant of interest and gene burden (i.e., number of variants within a single gene) were also assessed.
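As an illustration of the first filter, the sketch below keeps variants in phenotype-associated genes with a maximum population allele frequency below 0.001 and then annotates each with its gene burden. The dictionary keys (`gene`, `max_pop_af`), the function name `filter_one`, and the decision to count gene burden after filtering are assumptions for the example; the spreadsheet columns actually used are described in Appendix III.

```python
from collections import Counter


def filter_one(variants, phenotype_genes, max_af=0.001):
    """Sketch of Filter 1: phenotype-associated genes, rare variants only.

    variants:        list of dicts with (hypothetical) keys
                     'gene' and 'max_pop_af'
    phenotype_genes: set of gene symbols linked to the participant's
                     HPO terms
    """
    kept = [
        v for v in variants
        if v["gene"] in phenotype_genes and v["max_pop_af"] < max_af
    ]
    # Gene burden = number of retained variants within the same gene
    # (assumption: counted after the frequency filter is applied).
    burden = Counter(v["gene"] for v in kept)
    return [dict(v, gene_burden=burden[v["gene"]]) for v in kept]
```

A gene burden of 2 or more flags genes in which a recessive disorder could arise from two different variants, which is why it is reported alongside zygosity.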

Mode of inheritance was assessed for participants who received trio sequencing. Autosomal recessive variants were assessed by filtering for variants that were heterozygous or absent in each parent and homozygous in the proband. Potentially compound heterozygous variants were assessed by filtering for variants with a gene burden of ≥2 in the proband. De novo autosomal dominant inheritance was assessed by filtering for variants that were heterozygous in the proband and absent from both parents. OMIM inheritance filters were applied for the corresponding modes of inheritance.
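The trio rules in this paragraph can be written as simple genotype checks. The sketch below assumes genotypes are encoded as the number of alternate alleles (0, 1, or 2); the function names, labels, and dictionary keys are hypothetical and chosen only for illustration. OMIM inheritance filters and sex-chromosome handling are deliberately left out.

```python
from collections import Counter

# Genotype encoded as number of alternate alleles (assumed encoding).
HOM_REF, HET, HOM_ALT = 0, 1, 2


def inheritance_label(proband, mother, father):
    """Label one variant according to the trio rules described above."""
    # Homozygous in the proband, heterozygous or absent in each parent.
    if proband == HOM_ALT and mother != HOM_ALT and father != HOM_ALT:
        return "recessive_homozygous"
    # Heterozygous in the proband and absent from both parents.
    if proband == HET and mother == HOM_REF and father == HOM_REF:
        return "de_novo_dominant"
    return None


def compound_het_candidates(proband_variants, min_burden=2):
    """Genes carrying >= min_burden non-reference variants in the proband.

    proband_variants: list of dicts with (hypothetical) keys
                      'gene' and 'genotype'
    """
    burden = Counter(
        v["gene"] for v in proband_variants if v["genotype"] != HOM_REF
    )
    return {gene for gene, n in burden.items() if n >= min_burden}
```

Note that a gene burden of 2 or more only nominates a gene; confirming compound heterozygosity still requires showing that the two variants were inherited from different parents.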

2.6.2. Filter 2: Likely Deleterious Variants in Potentially Novel Genes

The second variant filter that was applied assessed variants that were likely to have a deleterious effect on gene function. This included variants that were suspected to result in a loss of gene function (i.e., frameshift variants, splice-site variants, and nonsense variants). Variants were also filtered to only include those with a maximum population allele frequency of 0.001. This filter does not use any phenotypic information so is meant to identify potentially novel genes.
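A minimal sketch of the second filter is given below. It assumes that each variant carries a VEP-style consequence string in which multiple Sequence Ontology terms are joined by `&`; the specific set of terms treated as loss of function here, the dictionary keys, and the function name `filter_two` are assumptions for illustration, not the exact terms used in this study.

```python
# Sequence Ontology terms treated as likely loss of function
# (assumed set; the study's exact term list is not reproduced here).
LOF_TERMS = frozenset({
    "frameshift_variant",
    "stop_gained",
    "splice_donor_variant",
    "splice_acceptor_variant",
})


def filter_two(variants, max_af=0.001):
    """Sketch of Filter 2: rare, likely loss-of-function variants.

    variants: list of dicts with (hypothetical) keys
              'consequence' (terms joined by '&') and 'max_pop_af'
    """
    kept = []
    for v in variants:
        terms = set(v["consequence"].split("&"))
        # No phenotype information is used, only consequence and frequency.
        if terms & LOF_TERMS and v["max_pop_af"] < max_af:
            kept.append(v)
    return kept
```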


2.6.3. Filter 3: Variants in Candidate Genes

When applicable, candidate gene lists were used as a variant filter. A curated list of genes that cause neuromuscular disease was downloaded from http://www.musclegenetable.fr/. An immune disease candidate gene list was created based on the International Union of Immunological Societies

2017 report on inborn errors of immunity (Picard et al., 2018). An epilepsy candidate gene list was created based on a review of current epilepsy genetic knowledge (Myers, Johnstone, & Dyment, 2019).

A list of mitochondrial genes was downloaded from the GeneDX MitoXpanded Gene panel

(https://www.genedx.com/test-catalog/available-tests/mitoxpanded-panel/). A curated list of genes that cause intellectual disability was downloaded from http://gfuncpathdb.ucdenver.edu/iddrc/iddrc/data/IDgenelist_chr.html. Variants in candidate genes were then filtered for allele frequency of less than 0.001 before being analyzed. Please note that participants in the “multiple congenital anomalies” and “other” genetic disorder categories did not have candidate gene lists.

2.7. Exome Rounds

In order to validate exome reanalysis results, each exome was also analyzed by a member of the

C4R team in addition to the analysis by the student PI. The C4R analysis was done using the C4R variant analysis standard operating procedure, which is independent from the variant analysis procedure used by the student PI. The variants analyzed were then presented by the student PI at exome rounds, which took place over the BlueJeans web conferencing application and included the student PI, her master’s thesis supervisor/the referring clinician, and two members of the C4R team. After discussing evidence for each variant of interest, the team would discuss any variant discrepancies between the student PI and the C4R team. Future plans to confirm each variant of interest or potential follow-up studies for each participant were discussed during these meetings.


CHAPTER 3: RESULTS

A total of 26 families consisting of 34 individuals were analyzed in this study. For the sake of clarity, each family is recorded as a single case ID. Summaries of cases and their findings can be found in Appendix IV. A total of 15 cases were recruited from the original Care4Rare (C4R) cohort, and 11 were recruited specifically for this study. The demographics of the participants can be found in Table 3.1. The study consisted of 19 males (55.9%) and 15 females (44.1%). The median participant age was 12 years (range 2 to 56). An average of 3.24 years had passed since initial exome sequencing. A total of 13 cases (50.0%) were affected by neurologic conditions, 5 (19.2%) had immune conditions, 5 (19.2%) had multiple congenital anomalies, and 3 (11.5%) had other conditions, which included an undetermined metabolic condition, cholestasis, and a connective tissue disorder.

A total of 932 variants were analyzed in this study with an average of 37.1 variants analyzed per case. The evidence considered for each variant can be found in Appendix V. A breakdown of the number of variants analyzed compared to the number of family members sequenced can be found in Table 3.2. Candidate variants were found in 14 out of the 25 cases analyzed (Table 3.3). Of these, 3 cases contained strong candidates in known disease-causing genes, 6 cases contained candidate variants in novel disease genes, 2 cases contained heterozygous variants in autosomal recessive genes strongly associated with the participant’s phenotype, and the remaining 3 cases contained only weak candidates. No diagnoses have been confirmed; therefore, an overall success rate cannot be calculated.


Table 3.1: Participant demographics

| Characteristic | Number of Participants (%) |
| --- | --- |
| Age at Thesis Submission in Years (N = 33) [1] | |
| <6 | 7 (21.2) |
| 6-10 | 8 (24.2) |
| 11-15 | 6 (18.2) |
| 16-20 | 7 (21.2) |
| >21 | 5 (15.2) |
| Median | 12 |
| Average (SD) [2] | 13.6 (10.8) |
| Age Range | 2-56 |
| Gender (N = 34) | |
| Male | 19 (55.9) |
| Female | 15 (44.1) |
| Year of Original Exome Sequence (N = 25) [3] | |
| 2019 | 6 (24.0) |
| 2018 | 4 (16.0) |
| 2017 | 3 (12.0) |
| 2016 | 5 (20.0) |
| 2015 | 5 (20.0) |
| 2014 | 1 (4.0) |
| 2013 | 1 (4.0) |
| Average years since ES (SD) | 3.24 (1.78) |
| Condition Type (N = 26) | |
| Neurologic | 13 (50.0) |
| Immune | 5 (19.2) |
| MCA [4] | 5 (19.2) |
| Other [5] | 3 (11.5) |

1. One participant was removed from total age calculations because she was deceased at the time of the study
2. SD = standard deviation
3. One exome was removed from total calculation due to low quality data
4. MCA = multiple congenital anomalies
5. Other category includes one case with an undetermined metabolic condition, one case with cholestasis, and one case with an undetermined connective tissue disorder


Table 3.2: Variant analyses by number of family members sequenced

| Sequencing Type | Avg. Variants | Variant Range | No. of Cases | Total Variants |
| --- | --- | --- | --- | --- |
| Singleton [1] | 46.9 | 30-72 | 9 | 469 |
| Affected siblings (duo [2] or trio [3]) | 51.0 | 5-118 | 6 | 306 |
| Affected siblings (duo or trio), outlier removed [4] | 37.6 | 5-64 | 5 | 188 |
| Duo, unaffected parent | 36.0 | 36 | 1 | 36 |
| Trio, unaffected parents | 13.4 | 5-39 | 9 | 121 |
| Total | 37.1 | 5-118 | 25 | 932 |

1. Singleton refers to exome analyses which include one member of a family
2. Duo refers to exome analyses which include two members of a family
3. Trio refers to exome analyses which include three members of a family
4. An affected sibling pair with 118 variants analyzed was considered an outlier, so data without this case are reported in this row


Table 3.3: Candidate variants

| Case ID | Disorder Type | ES Year | Gene | Zygosity | Inheritance | Variant | Candidate Classification |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 4 | Neurologic | 2017 | FGF12 | Heterozygous | De novo | c.-130-1G>T | Weak [1] |
| 6 | Immune | 2015 | BCL7C | Heterozygous | UK [2] | c.809G>T p.Glu271Ter | Weak |
| 10 | Neurologic | 2015 | TTN | Homozygous | UK | c.33340+3A>G | Strong [3] |
| | | | SYNE2 | Heterozygous | UK | c.18407G>A p.Arg6136His | Weak |
| 11 | Neurologic | 2019 | GOLPH3 | Heterozygous | UK | c.653dup p.Glu219GlyfsTer5 | Novel [4] |
| 12 | Neurologic | 2015 | USP18 | Heterozygous | Mat [5] | c.511C>T p.Arg171Trp | Single Hit [6] |
| 14 | Neurologic | 2015 | PNKP | Heterozygous | UK | c.1253_1269dup p.Thr424GlyfsTer49 | Single Hit |
| 17 | Other | 2019 | PSKH1 | Homozygous | UK | c.686C>T p.Pro229Leu | Novel |
| | | | DUS2 | Homozygous | UK | c.1265C>T p.Ala422Val | Novel |
| | | | NOB1 | Homozygous | UK | c.1095C>A p.Asp365Glu | Novel |
| 18 | MCA [7] | 2018 | EGF | Heterozygous | UK | c.2908C>T p.Arg970Ter | Novel |
| 19 | Neurologic | 2019 | PTOV1 | Homozygous | UK | c.289G>A p.Gly97Ser | Novel |
| 21 | MCA | 2016 | PRKCH | Heterozygous | Mat | c.1214G>A p.Arg405Lys | Weak |
| 22 | Immune | 2018 | ZNFX1 | Homozygous | Pat [8] + UK | c.3152T>C p.Leu1051Pro | Novel |
| 24 | Immune | 2014 | RELB | Multiple Hets [9] | UK | c.433G>A p.Glu145Lys | Strong |
| | | | | | UK | c.1091C>T p.Pro364Leu | |
| 25 | Neurologic | 2016 | SPTB | Biallelic | Mat | c.4819G>A p.Val1607Ile | Novel |
| | | | | | Pat | c.871G>A p.Gly291Ser | |
| | | | ATG9A | Homozygous | Mat + Pat | c.1121C>T p.Thr374Ile | Novel |
| 26 | Other | 2013 | COL9A3 | Mixed [10] | UK | c.423+1del | Strong |
| | | | LAMB2 | Mixed | UK | c.634T>C p.Ser212Pro | Strong |

1. Weak candidates refer to variants that have minimal evidence for pathogenicity but cannot be ruled out as the cause of disease
2. UK = unknown, parent of origin cannot be determined based on current data
3. Strong candidates refer to variants with strong evidence of pathogenicity in known disease genes that are related to the participant’s phenotype
4. Novel candidates refer to variants found in genes that have not been associated with disease or are associated with diseases of a different phenotype
5. Mat = maternally inherited variant
6. Single hit candidates refer to heterozygous variants in autosomal recessive disease genes that are related to the participant’s phenotype
7. MCA = multiple congenital anomalies
8. Pat = paternally inherited variant
9. Multiple hets refers to candidate genes that contained two heterozygous variants of interest
10. Mixed zygosity refers to sibling pairs where one member of the sibling pair contains a homozygous variant and the other sibling has the same variant as a heterozygote


CHAPTER 4: CASE DESCRIPTIONS AND VARIANT INTERPRETATIONS

Each case in this thesis represents a participant or a family. A thorough chart review was performed for each case in order to obtain relevant information about each affected individual’s phenotype, previous investigations, and family history. This section contains a description of each case, the exome sequencing (ES) reanalysis results, and a discussion of the variants of interest. All non-referenced evidence used for variant interpretation was contained within the variant spreadsheet output by the CRE pipeline. Information on the evidence contained within the CRE spreadsheet can be found in Appendix III.

4.1. Case 1

Case 1 is an 8-year-old male. He was adopted with his younger biological sister and is believed to be of Caucasian and possibly Metis descent. His parents are non-consanguineous. There was some concern over potential drug use during pregnancy. His sister has some difficulty with anxiety and insomnia. He has a known relative with pseudocholinesterase deficiency, which is likely unrelated to his presentation.

He was first seen by the Program of Genetics and Metabolism at Winnipeg Health Sciences at 6 years of age for autism spectrum disorder and neuromuscular weakness. He has cognitive delay, progressive proximal muscle weakness, weak cough, and vocal fatigue. He also has widely spaced teeth, hyper-flexibility, and difficulty heel walking. His adoptive parents reported that he has difficulty sleeping, and is unable to hold his head up when he is tired. He has also had regression in his walking, decreased use of his facial muscles, and increased mumbling when talking.

A lower limb MRI was unremarkable. Electromyography, nerve conduction, and muscle biopsy did not show clear evidence of a myopathic or neuropathic process. Genetic testing for fragile X syndrome and myotonic dystrophy was normal. He had a normal microarray and normal biochemical investigations.


An 89-gene muscle disorder panel found a variant of uncertain significance (VUS) in NEB c.2354A>G and a VUS in FKRP c.541C>A, both of which have been ruled out as being causative. Genetic testing for mitochondrial disorders was negative and a singleton clinical exome performed in 2019 was negative.

The HPO terms selected for this case are: proximal muscle weakness (HP:0003701), fatigable weakness (HP:0003473), exercise intolerance (HP:0003546), neck flexor weakness (HP:0003722), joint hyperflexibility (HP:0005692), fatigable weakness of chewing muscles (HP:0030193), autism (HP:0000717), sleep disturbances (HP:0002360), developmental regression (HP:0002376), weak voice (HP:0001621), toe walking (HP:0040083), and widely spaced teeth (HP:0000687).

Exome reanalysis revealed a total of 603 variants. There were 44 variants in genes containing the HPO terms of interest for this participant. Variants found in candidate neuromuscular disease genes and intellectual disability genes were assessed in detail. A total of 41 variants were analyzed for this case. ES reanalysis for this case did not reveal any candidate variants of interest.

Because this participant is adopted, his biological parents are unavailable for trio-ES. Exome reanalysis at regular intervals is recommended. RNA analysis may also be an appropriate approach in order to assess the transcripts of various neuromuscular genes in this participant. Transcript levels may reveal genes that are being under-expressed in this participant, which may help to narrow down genes and/or molecular pathways of interest.

4.2. Case 2

Case 2 is a 6-year-old male of English and Irish descent. He has two healthy full sisters and one healthy maternal half-brother. His family history was unremarkable. He first presented at birth due to intrauterine growth restriction and undervirilization with severe perineal hypospadias. The participant had a left inguinal hernia, right inguinal testicle, a triangular face, lop ears, upslanting palpebral fissures, small palpebral fissures, dimples on his elbows, bilateral fifth finger clinodactyly, and symmetrical rhizomelia. He has frequent otitis media. Testosterone injections increased his penis size, leading clinicians to believe that his undervirilization was due to decreased serum testosterone levels.

Throughout his life he has been reaching his developmental milestones at appropriate ages but is persistently quite small. Clinically he fits best with a diagnosis of Russell-Silver syndrome (RSS), though it is somewhat atypical given his small head size and severe hypospadias.

A skeletal survey showed delayed ossification of the hyoid bone, delayed bone age, and Wormian bones. He has normal electrolytes and normal kidneys. A biochemical CAH screening panel showed normal cortisol levels, decreased testosterone, and a slight increase of 17-alpha-progesterone. Celiac screening, IgA levels, and growth hormone studies were normal. He had a normal 46,XY karyotype and a normal microarray. All investigations of RSS have been normal, including studies of uniparental disomy for chromosome 7, 11p15.5 gene dosage studies, and H19DMR methylation studies. A research-based trio exome performed in 2016 was negative.

The HPO terms selected for this case are: decreased body weight (<-2SD; HP:0004325), microcephaly (<-3SD; HP:0000252), intrauterine growth retardation (HP:0001511), postnatal growth retardation (HP:0008897), Wormian bones (HP:0002645), hypospadias (HP:0000047), cryptorchidism (HP:0000028), failure to thrive (HP:0001508), ambiguous genitalia male (HP:0000033), upslanted palpebral fissures (HP:0000582), triangular face (HP:0000325), narrow palpebral fissures (HP:0045025), inguinal hernia (HP:0000023), hypoplasia of the spleen (HP:0010451), recurrent otitis media (HP:0000403), epicanthus (HP:0000286), clinodactyly of the fifth finger (HP:0004209), and decreased serum testosterone level (HP:0040171).

Exome reanalysis revealed 697 total variants and 73 variants in genes matching HPO terms. No candidate gene lists were used for this case. A total of 8 variants were analyzed. ES reanalysis did not reveal any candidate variants.


Though this participant has a clinical diagnosis of RSS, he has had normal molecular testing for RSS, which includes analysis of both methylation at the H19/IGF2 region and uniparental disomy of chromosome 7. These RSS tests are well known to have only a 60% sensitivity (Wakeling et al., 2017). Taken together with the negative exome results and our reanalysis, it is tempting to speculate that his atypical presentation is due to an unknown molecular mechanism of disease. Since the most common mechanism of RSS is loss of paternal methylation in the H19/IGF2 gene region (Bartholdi et al., 2009) and growth abnormalities are thought to be common in methylation disorders, genome-wide methylation studies may be an appropriate next step for this participant.

4.3. Case 3

Case 3 is an 11-year-old female of Irish, Scottish, and Romanian descent. She has two unaffected brothers and no family history of neuromuscular or metabolic diseases. She presented at 25 days of life with global hypotonia and frequent desaturations associated with bradycardias. She was noted to have long, slender fingers with joint laxity and single palmar creases bilaterally. She has global developmental delay and is currently non-verbal. She continues to have hypotonia and congenital nystagmus. She has ataxia of her upper limbs and mild scoliosis. Her parents report that she has episodes of vomiting at least once or twice a month.

A brain MRI completed in infancy showed incomplete myelination. She had a normal spinal fluid assessment and muscle biopsy. She had normal metabolic investigations including lysosomal enzymes, acylcarnitine profile, and respiratory chain enzymes. Electromyography and nerve conduction were normal. She had slightly elevated alpha-fetoprotein levels suggesting ataxia telangiectasia although this was not supported by cytogenetic analysis. She had a normal microarray and investigations for Pompe, spinal muscular atrophy, and myotonic dystrophy were all normal. Research-based trio ES analysis was negative.


The HPO terms selected for this case are: abnormal myelination (HP:0012447), abnormality of the cerebral white matter (HP:0002500), ataxia (HP:0001251), congenital nystagmus (HP:0006934), elevated alpha-fetoprotein (HP:0006254), generalized hypotonia (HP:0001290), global developmental delay (HP:0001263), infantile muscular hypotonia (HP:0008947), large for gestational age (HP:0001520), limb ataxia (HP:0002070), neonatal hypotonia (HP:0001319), pendular nystagmus (HP:0012043), respiratory insufficiency (HP:0002093), episodic vomiting (HP:0002572), and scoliosis (HP:0002650).

ES reanalysis could not be completed for this case due to poor data quality. Analysis of the data showed that approximately 60% of the reads were PCR duplicates; therefore, there were not enough reads to cover the exome at an adequate depth for variant analysis. Reanalysis can be an efficient way to find new diagnoses in patients using the data that has already been created; however, it is dependent on the quality of that data. It is possible that a diagnosis for this participant was initially missed because of the poor-quality data. It is recommended that sequencing for this participant be re-done before further studies are undertaken.
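The effect of a high duplication rate on usable coverage can be approximated with a short calculation. This is a simplified sketch: it assumes duplicates are spread evenly across targets, and the raw mean depth used below is a hypothetical value, since the actual depth for this exome is not reported here.

```python
def effective_depth(raw_mean_depth, duplicate_fraction):
    """Approximate mean depth remaining after PCR duplicates are discarded."""
    if not 0.0 <= duplicate_fraction <= 1.0:
        raise ValueError("duplicate_fraction must be between 0 and 1")
    return raw_mean_depth * (1.0 - duplicate_fraction)


raw_depth = 100.0  # hypothetical raw mean depth, for illustration only
usable_depth = effective_depth(raw_depth, 0.60)  # ~60% duplicates, as in Case 3
print(f"Usable mean depth: about {usable_depth:.0f}x of {raw_depth:.0f}x raw")
```

Under these assumptions, fewer than half of the sequenced reads contribute to variant calling, which is why many positions can fall below the depth needed for confident genotyping.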

4.4. Case 4

Case 4 is a 10-year-old boy of Ukrainian and Cree descent. He currently lives with his foster parents and has no biological siblings. It was noted in his chart that he had a paternal grandfather with epilepsy. The remainder of his family history was unremarkable. This participant presented at 6 weeks of age with abrupt onset of encephalopathy and seizures. His epilepsy is drug resistant with multiple seizure types including both focal and generalized. He has shown neurodevelopmental regression, which is thought to be secondary to his seizures. He currently has severe spasticity and rigidity as well as recurrent pneumonia.

A brain MRI showed diffuse bilateral infarcts of the central brain matter and cerebral atrophy. A follow-up brain MRI showed hemorrhage and calcification of the thalami and cystic gliosis. An EEG was markedly abnormal when the participant was both drowsy and asleep, showing absence of posterior dominant rhythm, and multi-focus dysfunction with epileptiform discharges. A full metabolic workup showed increased liver enzymes and increased lactate. He had a normal karyotype and microarray. Molecular testing for MELAS, MERRF, and NARP was unrevealing but a mitochondrial condition was still considered to be high on the differential. Research-based trio ES analysis performed in 2017 was negative.

The HPO terms selected for this case are: seizures (HP:0001250), abnormality of the basal ganglia (HP:0002134), increased CSF lactate (HP:0002490), abnormality of the periventricular white matter (HP:0002518), diffuse white matter abnormalities (HP:0007204), abnormal thalamic MRI signal intensity (HP:0012696), profound global developmental delay (HP:0012736), lactic acidosis (HP:0003128), optic atrophy (HP:0000648), spasticity (HP:0001257), and elevated hepatic transaminase (HP:0002910).

Exome reanalysis revealed 1169 variants total and 93 variants in genes matching HPO terms. A total of 37 variants were assessed in this case. ES reanalysis revealed one candidate variant of interest, a 5’-UTR variant in FGF12, c.-130-1G>T.

4.4.1. FGF12

A de novo heterozygous variant of interest, c.-130-1G>T, was found in the gene FGF12. This variant is a suspected splice-acceptor variant with an in silico splice prediction score of 8.598, indicating a high likelihood of an effect on splicing. This variant has not previously been reported in the gnomAD database; however, gnomAD calculates that FGF12 has an observed over expected loss-of-function variant ratio of 0.169, indicating that the gene is intolerant of loss-of-function variants (https://gnomad.broadinstitute.org/). Pathogenic variants in FGF12 cause early infantile epileptic encephalopathy, which shares three HPO terms with our case: optic atrophy, seizures, and spasticity.


Strictly following the ACMG criteria, this variant is predicted to be likely pathogenic using the following evidence: confirmed de novo (PS2), absent from controls (PM2), in silico software predicts a splicing effect (PP3), and the participant’s phenotype is highly specific for the gene (PP4).
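The way these four criteria combine can be shown with a small sketch of the pathogenic-evidence combining rules from the 2015 ACMG/AMP guidelines, as the rules are commonly summarized. This is an illustrative simplification and not the procedure used in this study: it ignores all benign criteria, infers evidence strength from the code prefix (so it does not handle criteria applied at a modified strength), and the function names are hypothetical.

```python
from collections import Counter

# Evidence strength inferred from the criterion prefix (a simplification).
STRENGTH_BY_PREFIX = {
    "PVS": "very_strong",
    "PS": "strong",
    "PM": "moderate",
    "PP": "supporting",
}


def criterion_strength(code):
    """Map a criterion code such as 'PS2' to its default evidence strength."""
    for prefix in ("PVS", "PS", "PM", "PP"):
        if code.startswith(prefix):
            return STRENGTH_BY_PREFIX[prefix]
    raise ValueError(f"Unrecognized pathogenic criterion: {code}")


def classify_pathogenic_evidence(codes):
    """Combine pathogenic criteria only; benign evidence is not modelled."""
    counts = Counter(criterion_strength(code) for code in codes)
    vs, s = counts["very_strong"], counts["strong"]
    m, p = counts["moderate"], counts["supporting"]

    pathogenic = (
        (vs >= 1 and (s >= 1 or m >= 2 or (m == 1 and p == 1) or p >= 2))
        or s >= 2
        or (s == 1 and (m >= 3 or (m == 2 and p >= 2) or (m == 1 and p >= 4)))
    )
    if pathogenic:
        return "Pathogenic"

    likely_pathogenic = (
        (vs == 1 and m == 1)
        or (s == 1 and 1 <= m <= 2)
        or (s == 1 and p >= 2)
        or m >= 3
        or (m == 2 and p >= 2)
        or (m == 1 and p >= 4)
    )
    if likely_pathogenic:
        return "Likely pathogenic"
    return "Uncertain significance"


fgf12_evidence = ["PS2", "PM2", "PP3", "PP4"]  # criteria listed in the text
print(classify_pathogenic_evidence(fgf12_evidence))
```

With one strong, one moderate, and two supporting criteria, the combination reaches the likely pathogenic tier but not the pathogenic tier, which matches the classification stated above.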

As in our participant, all known cases of FGF12-related disease are caused by de novo mutations; however, all pathogenic variants reported in this gene have been in either codon 52, 112, or 114 (Al-Mehmadi et al., 2016; Guella et al., 2016; Paprocka et al., 2019; Takeguchi et al., 2018; Villeneuve et al., 2017). The variant found in this participant is in a non-coding region of the gene; therefore, the likelihood of it being truly pathogenic despite the ACMG criteria is low. We are calling this a weak candidate variant; thus, this case remains unsolved. We are recommending this participant for whole genome sequencing (WGS), in order to look more broadly for variants that may have been missed by technological limitations of ES. A more detailed description of WGS can be found in Section 5.3 below.

4.5. Case 5

Case 5 is an 18-year-old woman of French-Metis, Romanian, Norwegian, and English descent. She presented at birth and was originally diagnosed with CHARGE syndrome. This diagnosis was later retracted due to an ophthalmologist report stating that she did not have a coloboma, but did have small and unusual optic nerve heads. This participant has had ongoing seizures and gastroesophageal reflux since she was 2 months of age. She is reported to have right choanal atresia, optic nerve hypoplasia, slight hypertelorism, broad nasal bridge, micrognathia, clinodactyly, flexion contractures, and cup-shaped ears which are small in size. She has profound developmental delay, small stature, small head circumference, and a short neck. She has respiratory failure as well as recurrent pneumonia. It has been noted that she has abnormal teeth, which are pointed with one fused to the lower jaw.


A skeletal assessment showed maxillary hypoplasia and possible mandibular hypoplasia. She has narrow lumbar vertebrae, slight ulnar deviation of the hands, and flexion deformities of the proximal interphalangeal joints of the fingers. It was noted that these findings are consistent with Freeman-Sheldon Syndrome. A CT of her head showed agenesis of the corpus callosum, hypoplastic cerebellum, intracranial hemorrhage, and a large cisterna magna. An EEG showed a markedly abnormal, severe diffuse disturbance of brain function. She had a normal hearing exam and normal renal ultrasound. Karyotype was normal. Research-based trio-ES analysis performed in 2017 was unrevealing.

The HPO terms used for this case are: generalized myoclonic seizures (HP:0002123), choanal atresia (HP:0000453), optic nerve hypoplasia (HP:0000609), hypertelorism (HP:0000316), wide nasal bridge (HP:0000431), micrognathia (HP:0000347), cupped ears (HP:0000378), clinodactyly (HP:0030084), flexion contracture of digit (HP:0030044), respiratory failure (HP:0002878), short neck (HP:0000470), profound global developmental delay (HP:0012736), short stature (HP:0004322), microcephaly (HP:0000252), generalized hypotonia (HP:0001290), frontal bossing (HP:0002007), fused teeth (HP:0011090), downslanted palpebral fissures (HP:0000494), thickened nuchal skin fold (HP:0000474), cerebellar hypoplasia (HP:0006872), and agenesis of the corpus callosum (HP:0001274).

ES reanalysis showed 772 total variants in this case and 92 variants in genes matching HPO terms. A total of 12 variants were analyzed for this case. Filtering variants for those found in an epilepsy candidate gene list revealed no variants. ES reanalysis for this participant did not reveal any candidate variants of interest. Given the wide array of birth defects found in this participant, a genetic etiology is strongly suspected. Systematic reanalysis of this participant’s exome is recommended. WGS may be an appropriate next step in order to assess variants in the genome that were not captured by ES.


4.6. Case 6

Case 6 consists of an affected sibling pair of Irish, Icelandic, and Mennonite descent. They have 2 unaffected maternal half-siblings. Their mother has mild patches of cutaneous psoriasis and recurrent yeast infections. They have a paternal grandmother with Graves’ disease. They have a strong family history of food allergies. Their parents are non-consanguineous.

The elder of the two siblings is a 14-year-old boy who presented at 5 years of age with recurrent esophageal candidiasis originally found via endoscopy. He was also found to have brittle nails which break and peel, decreased sweat production, cryptorchidism, cramping of the lower legs, and chronic constipation since he was 8 months of age. He has vomiting episodes once a month, usually occurring after several days of constipation. He has fine hair, thin eyebrows, and sparse eyelashes. His teeth are small and widely spaced with some abnormality of the dental enamel. He was found to have decreased CD8 and CD4 T-cell numbers. A rectal biopsy ruled out Hirschsprung disease.

The younger of the two siblings is a 12-year-old boy who presented due to his brother’s diagnosis of candidiasis. He has had multiple instances of oral thrush. He developed gastroesophageal reflux as an infant. Like his brother, he also has recurrent esophageal candidiasis and brittle nails, but no fungal infections of the nails. He has dry, brittle hair that breaks easily. As an infant he had rashes on the back of his legs that did not respond to steroid cream but resolved at 15-16 months of age. He has some erythema of the chin, cheeks, perioral area, and anterior thighs. He also has a scrotal rash consistent with yeast infections, but no swabbing has been performed. A brain MRI showed a linear area of increased T2 and FLAIR signal intensity, which is likely related to a developmental venous anomaly.


Both brothers have been assessed by The Hospital for Sick Children and the National Institutes of Health. They have had normal AIRE, STAT1, and IKBKG gene testing. Research duo ES analysis performed in 2015 did not reveal any candidate gene variants.

The HPO terms selected for this sibling pair are: microdontia (HP:0000691), fragile nails (HP:0001808), sparse hair (HP:0008070), abnormality of dental enamel (HP:0000682), recurrent candida infections (HP:0005401), and vomiting (HP:0002013).

ES reanalysis revealed 372 shared variants total and 15 shared variants in genes matching the selected HPO terms. A total of 5 variants were assessed, including variants found in candidate immunological disease genes. The brothers were found to share a nonsense variant, c.809G>T p.Glu271Ter, in BCL7C, a gene of unknown function.

4.6.1. BCL7C

The heterozygous nonsense variant c.809G>T (p.Glu271Ter) in BCL7C was found in both siblings. This variant has not previously been reported in the gnomAD database, and the gene BCL7C has an observed over expected loss of function score of 0.35, indicating a low tolerance for loss of function variants. The gene itself is labelled as B-cell CLL/Lymphoma 7C in OMIM (https://www.omim.org/entry/605847), which indicates that it may be involved in immune function, though an immune disorder such as the one seen in our participants is more likely to be due to abnormalities in T-cells.

A literature review of BCL7C revealed that this gene and its homologues BCL7A and BCL7B are important components of the SWI/SNF complex, specifically BAF (Elsen et al., 2018; Kadoch et al., 2013; Kaeser, Aslanian, Dong, Yates, & Emerson, 2008; Middeljans et al., 2012). These complexes regulate chromatin remodelling and are involved in neuronal development, transcription development, and multiple DNA repair pathways. It has been shown that deleting the ATPase subunit of BAF blocked thymocyte development (Chi, 2004). Similarly, Jeong et al. (2010) used mouse models to determine that the SWI/SNF complex plays a critical role in T-cell activation and the subsequent immune responses.

This evidence led us to believe that this variant may contribute to the disease seen in these siblings; however, a closer inspection of the gene transcripts revealed that this variant is in an exon of the gene that is not often expressed. Further, this region contains a high number of nonsense mutations. Therefore, we have determined that even though this gene has a low tolerance for loss of function mutations, this variant is contained in a region of the gene that is more tolerant of loss of function variants than expected. We have labelled this variant as a weak candidate and will not be pursuing it further.

We recommend that the exome for these siblings be systematically reanalyzed on a regular basis. Further, adding unaffected parents to the analysis may increase the diagnostic utility of this data, by enabling us to detect genes with two variants for each child and only one variant for each unaffected parent. Adding unaffected parents may also allow us to look for de novo variants shared between the siblings, which would indicate gonadal mosaicism. WGS may also be an appropriate future step.
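The benefit of adding unaffected parents can be illustrated with a minimal sketch of a compound heterozygous search, in which a gene is flagged only when a child carries at least one variant inherited solely from the mother and at least one inherited solely from the father. The function name, the tuple-based input format, and the gene and variant identifiers are hypothetical, and real pipelines would also need to handle genotype quality, homozygous calls, and de novo variants.

```python
from collections import defaultdict


def compound_het_genes(child_het_variants, mother_ids, father_ids):
    """Return genes where the child has one maternal-only and one paternal-only variant.

    child_het_variants: list of (gene, variant_id) pairs heterozygous in the child.
    mother_ids / father_ids: sets of variant IDs carried by each unaffected parent.
    """
    origins = defaultdict(lambda: {"mat": set(), "pat": set()})
    for gene, variant_id in child_het_variants:
        in_mother = variant_id in mother_ids
        in_father = variant_id in father_ids
        if in_mother and not in_father:
            origins[gene]["mat"].add(variant_id)
        elif in_father and not in_mother:
            origins[gene]["pat"].add(variant_id)
    return sorted(g for g, o in origins.items() if o["mat"] and o["pat"])


# Invented identifiers for illustration only.
mother = {"a1", "c1", "c2"}
father = {"a2", "b1"}
sibling_1 = [("GENE_A", "a1"), ("GENE_A", "a2"), ("GENE_B", "b1"),
             ("GENE_C", "c1"), ("GENE_C", "c2")]
sibling_2 = [("GENE_A", "a1"), ("GENE_A", "a2")]

# Keep only genes that satisfy the pattern in both affected siblings.
shared = sorted(
    set(compound_het_genes(sibling_1, mother, father))
    & set(compound_het_genes(sibling_2, mother, father))
)
print(shared)
```

In this toy example only GENE_A is retained: GENE_B has a single paternal variant, and both GENE_C variants come from the mother, so neither fits a recessive model with unaffected carrier parents.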

4.7. Case 7

Case 7 is a 5-year-old boy of Norwegian, Ukrainian, and Austrian descent. He has a healthy older brother and younger sister. He has a paternal cousin with esophageal atresia. The remainder of the family history is unremarkable. His parents are non-consanguineous. He first presented at age 2 due to global developmental delay and soft neurological signs. He has severe language delay, oral motor incoordination, hypotonia, and concerns regarding poor growth, feeding, and swallowing. He also has staring spells where he rubs his knuckles and cannot be roused. An EEG did not show any signs of seizures during these episodes. His parents reported that he often wakes up screaming and is inconsolable.


Metabolic investigations were unrevealing. He had a slightly elevated creatine kinase of 318 U/L, repeated at 537 U/L. An EEG showed intermittent left temporal slowing and a brain MRI showed patchy areas of T2 hyperintensity and T1 normointensity in the deep and subcortical white matter. The MRI also showed bilateral temporal lobe cysts, which is suggestive of RNAse T2-deficient leukoencephalopathy. He had normal enzymatic activity of TPP1 and PPT1, enzymes whose deficiency leads to two forms of neuronal ceroid lipofuscinosis. He had a normal microarray and normal genetic testing for fragile X. A leukodystrophy genetic panel showed that he is heterozygous for a likely pathogenic variant in the gene MRPS22, which causes autosomal recessive combined oxidative phosphorylation deficiency, and has a VUS in the gene ATP7B, which causes autosomal recessive Wilson’s disease. Neither of these variants was felt to be contributing to his phenotype. In addition, no variants were found in RNASET2, which was a gene of high interest known to cause autosomal recessive cystic leukoencephalopathy highly consistent with his MRI findings. A muscular disorder panel showed a VUS in KBTBD13, which causes autosomal dominant nemaline myopathy, and a VUS in DPAGT1, which causes autosomal recessive myasthenic syndrome. Again, neither of these was felt to contribute to his phenotype. A trio clinical exome was performed in 2019 by Prevention Genetics which found a paternally inherited VUS in TCF20, which is associated with autosomal dominant autism spectrum disorder.

The HPO terms selected for this case are: delayed speech and language development (HP:0000750), generalized hypotonia (HP:0001290), drooling (HP:0002307), leukoencephalopathy (HP:0002352), sleep terror (HP:0030765), autistic behaviour (HP:0000729), and elevated serum creatine kinase (HP:0003236).

ES reanalysis revealed 679 variants total and 49 variants in genes related to his selected HPO terms. A total of 16 variants were analyzed for this case, including variants in genes on the neuromuscular and intellectual disability candidate gene lists. No variants of interest were found for this case.


This participant’s trio-ES was performed in 2018, which is relatively recent. Our next steps will include trio-WGS to see if there are any variants of interest that could not be captured by ES. We also recommend systematic reanalysis of the ES data for this patient every 2-3 years to determine if any new genes of interest are discovered.

4.8. Case 8

Case 8 is a 3-year-old male of Irish, English, Icelandic, German, and Scottish descent. He has two older sisters, and his parents had 4 previous pregnancy losses. His parents had normal karyotype analyses and are non-consanguineous. He presented prenatally with an omphalocele, fetal pyelocaliectasis, and unilateral post-axial polydactyly found on prenatal ultrasound. He was born prematurely at 29 weeks gestation and had diazoxide unresponsive hyperinsulinism. He spent seven months in hospital and had chronic lung disease, seizures which have resolved, mild pulmonary hypertension which has resolved, bilateral grade 3 intraventricular hemorrhage, solitary hepatic cyst, hydronephrosis, mild laryngomalacia, inguinal hernia, retinopathy of prematurity, right pig bronchus, mucous cyst in the left submandibular gland which has resolved, ventriculomegaly, and hyperopia. He had difficulty feeding without choking and was, therefore, G-tube fed. He has global developmental delay which includes delays in both gross and fine motor skills. He has dysmorphic facies which include hypertrichosis, coarse facial features, macrocephaly, hypertelorism, and downslanting palpebral fissures.

Methylation and 11p15.5 gene dosage studies for Beckwith-Wiedemann syndrome were normal. A clinical trio exome revealed a paternally inherited VUS in ARID2, which is associated with autosomal dominant Coffin-Siris syndrome 6. He was also found to have homozygous pathogenic variants in BCHE, which is associated with postanesthetic susceptibility to apnea due to butyrylcholinesterase deficiency.


The HPO terms selected for this case are: macrocephaly (HP:0000256), short stature (HP:0004322), feeding difficulties (HP:0011968), downslanted palpebral fissures (HP:0000494), hypertelorism (HP:0000316), bulbous nose (HP:0000414), depressed nasal bridge (HP:0005280), thick vermilion border (HP:0012471), nevus flammeus (HP:0001052), pulmonary arterial hypertension (HP:0002092), laryngomalacia (HP:0001601), postaxial hand polydactyly (HP:0001162), central hypotonia (HP:0011398), omphalocele (HP:0001539), hydronephrosis (HP:0000126), ventriculomegaly (HP:0002119), intraventricular hemorrhage (HP:0030746), inguinal hernia (HP:0000023), hypoglycemia (HP:0001943), and global developmental delay (HP:0001263).

Trio exome reanalysis revealed 606 total variants and 67 variants in genes related to his selected HPO terms. A total of 9 variants were analyzed. There were no candidate gene lists of interest for this case. Exome reanalysis did not reveal any new variants of interest for this case.
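The HPO-based prioritization step reported for each case can be illustrated with a minimal sketch. It assumes that each annotated variant is a record carrying a gene symbol and that a mapping from HPO terms to associated genes is available. The function name, the toy gene-term links, and the example variants are hypothetical and are not taken from the pipeline used in this study.

```python
from typing import Dict, Iterable, List, Set


def variants_in_hpo_genes(
    variants: Iterable[dict],
    hpo_to_genes: Dict[str, Set[str]],
    selected_terms: Iterable[str],
) -> List[dict]:
    """Keep variants whose gene is linked to at least one selected HPO term."""
    genes_of_interest: Set[str] = set()
    for term in selected_terms:
        genes_of_interest |= hpo_to_genes.get(term, set())
    return [v for v in variants if v["gene"] in genes_of_interest]


# Toy data for illustration only (gene names are placeholders, not real links).
hpo_to_genes = {
    "HP:0000256": {"GENE_A", "GENE_B"},  # macrocephaly
    "HP:0001539": {"GENE_C"},            # omphalocele
}
variants = [
    {"gene": "GENE_A", "hgvs": "c.1A>G"},
    {"gene": "GENE_X", "hgvs": "c.2C>T"},
    {"gene": "GENE_C", "hgvs": "c.3G>A"},
]
kept = variants_in_hpo_genes(variants, hpo_to_genes, ["HP:0000256", "HP:0001539"])
print(len(variants), "total;", len(kept), "in HPO-related genes")
```

In this sketch the two counts printed correspond to the "total variants" and "variants in genes related to the selected HPO terms" figures quoted for each case.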

Original investigations for this participant included methylation and gene dosage studies related to Beckwith-Wiedemann syndrome, due to overgrowth and the initial presentation of an omphalocele.

Many overgrowth disorders are caused by somatic variants, which may show only low levels of mosaicism (F. Chang et al., 2017). It is possible that low level mosaic variants were filtered by the bioinformatics pipeline. Therefore, it may be worthwhile to perform deep-sequencing of this participant’s genome, in order to search for low levels of mosaicism that may be related to his phenotype.

4.9. Case 9

Case 9 is a 4-year old female of Scottish, English, Italian, and Icelandic descent. She has no siblings and her parents are non-consanguineous. She presented at 7 months of age with infantile spasms. These seizures have been drug-resistant. She has global developmental delay which includes gross motor delay, and speech and language delay. She has shown developmental regression. It was noted that she has a square-shaped face with a small chin. At age 2 she underwent a right functional hemispherectomy which has resolved her seizures.

A brain MRI revealed that the subarachnoid spaces and basal cisterns overlying the cerebral hemispheres were moderately enlarged, and there was mild enlargement of the ventricular system. She had a normal microarray and a negative epilepsy genetic panel. Clinical trio ES performed in 2018 showed a maternally inherited likely pathogenic variant in ADAR, which is associated with autosomal recessive Aicardi-Goutières syndrome, and a paternally inherited likely pathogenic variant in LRAT, which is associated with autosomal recessive retinitis pigmentosa. Neither of these variants was thought to contribute to her diagnosis.

The HPO terms selected for this case are: global developmental delay (HP:0001263), seizures (HP:0001250), absence seizure (HP:0002121), developmental regression (HP:0002376), focal motor seizure (HP:0011153), infantile spasms (HP:0012469), and epileptic encephalopathy (HP:0200134).

Trio ES reanalysis revealed 593 variants and 65 variants in genes related to her selected HPO terms. A total of 9 variants were analyzed, including variants in an epilepsy candidate gene list. Of interest is a de novo variant in the gene GPR150 (c.527C>A p.Ala176Glu). This variant has not been reported in the GnomAD database and has mixed predictions of pathogenicity by in-silico software (SIFT: 0.06; PolyPhen: 0.74; CADD: 19.29). As of May 2020, this gene does not have an entry in the OMIM database and a PubMed search for GPR150 reveals 6 articles; however, none of these articles relate to gene function. We have entered this gene in the Matchmaker Exchange database.
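As an illustration of how a de novo call is recognized in trio data, the sketch below flags a variant as a candidate de novo when the child carries an alternate allele and both parents are homozygous reference with adequate read depth. The genotype encoding, the `min_depth` default, and the function name are assumptions made for this example, not the settings used in this study.

```python
def is_candidate_de_novo(child_gt, mother_gt, father_gt,
                         child_dp, mother_dp, father_dp, min_depth=10):
    """Genotypes are tuples of allele indices, e.g. (0, 1) for heterozygous."""
    # Require enough reads in all three samples to trust the genotypes.
    if min(child_dp, mother_dp, father_dp) < min_depth:
        return False
    child_has_alt = any(allele > 0 for allele in child_gt)
    parents_hom_ref = all(allele == 0 for allele in mother_gt + father_gt)
    return child_has_alt and parents_hom_ref


# Child heterozygous, both parents homozygous reference, good depth.
print(is_candidate_de_novo((0, 1), (0, 0), (0, 0), 40, 35, 38))
# Variant is inherited from the mother.
print(is_candidate_de_novo((0, 1), (0, 1), (0, 0), 40, 35, 38))
# Maternal depth is too low to exclude inheritance.
print(is_candidate_de_novo((0, 1), (0, 0), (0, 0), 40, 6, 38))
```

The depth requirement matters because a parent with low coverage at the site may carry the variant without it being called, which would make an inherited variant appear de novo.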

A hemispherectomy for this participant was able to resolve her seizures. We are hopeful that this means that a genetic diagnosis is not needed for symptom management and screening for this participant. However, the parents of this participant were concerned about recurrence risks, because they wish to have more children. This is a case where a genetic diagnosis could be used for reassurance of family members and for family planning. We recommend that this participant’s ES be systematically reanalyzed in 2-3 years’ time to see if we are able to find variants in future newly discovered genes related to epilepsy disorders. WGS may also be warranted for this participant.

4.10. Case 10

Case 10 is a 7-year old female of Palestinian and Syrian descent. Her parents are first cousins. She presented at birth and was seen by genetics at 3 days of age. She had mild arthrogryposis, poor feeding, minimal joint contractures in the hands, hypotonia, and impaired gross motor skills. She was also noted to have dolichocephaly with prominent occiput and a ridge over the metopic suture, high arched eyebrows, droopy eyelids, bilateral infraorbital creases, a high arched palate, thin upper vermilion, retrognathia, micrognathia, laterally displaced nipples, a high forehead, and frontal bossing.

She currently has trouble keeping up with friends, struggles to get off the floor, has a waddling gait, and is unable to stand on one foot. She has diffuse weakness which is worse in her knee flexors, hip flexors, and shoulder abductors. She is also hyperflexible with swan necking of her distal joints.

She had a normal karyotype and a normal microarray with several regions of homozygosity. A nerve conduction study was normal and there were no disease-causing copy number variants found on genetic testing of CHST3, a gene that causes spondyloepiphyseal dysplasia with congenital joint dislocations. She had normal biochemical testing, normal spinal muscular atrophy genetic testing, normal DMPK genetic testing for autosomal dominant myotonic dystrophy, and normal FBN2 genetic testing for autosomal dominant contractural arachnodactyly. A muscle biopsy of the left thigh showed contractile disorganization, similar to what is reported in multi-minicore myopathies. Recurrent muscle MRIs show severe fibrofatty replacement in the posterior compartment of the thigh with compensatory hypertrophy of the adductor muscles. Clinical singleton ES performed in 2015 found many VUSs (including variants in TTN, RYR1, and CHRNE described below), but was unable to clarify a specific diagnosis.

The HPO terms selected for this case are: high palate (HP:0000218), thin upper lip vermilion (HP:0000219), dolichocephaly (HP:0000268), microretrognathia (HP:0000308), infra-orbital crease (HP:0100876), highly arched eyebrow (HP:0002553), generalized muscle weakness (HP:0003324), abnormal muscle fiber morphology (HP:0004303), joint hyperflexibility (HP:0005692), delayed gross motor development (HP:0002194), feeding difficulties (HP:0011968), and congenital finger flexion contractures (HP:0005879).

ES reanalysis revealed 719 variants total and 53 variants in genes related to her selected HPO terms. A total of 48 variants were analyzed, including variants in genes from a candidate neuromuscular gene list. There were multiple variants of interest analyzed, including a single variant in the gene RYR1 (c.13369A>T p.Met4457Leu), a heterozygous variant in the gene CHRNE (c.764C>T p.Ser255Leu), a heterozygous variant in the gene SYNE2 (c.18407G>A p.Arg6136His), a single variant in the gene RYR3 (c.11224C>T p.Leu3742Phe), and two homozygous variants in the gene TTN (c.11023G>A p.Gly3675Ser and c.33340+3A>G).

4.10.1. RYR1

Due to this participant’s phenotype, particular attention was paid to the gene RYR1. Analysis of the RYR1 gene revealed a single variant, c.13369A>T (p.Met4457Leu). This variant has been reported 7 times as a VUS in the ClinVar database (https://www.ncbi.nlm.nih.gov/clinvar/variation/287319/), and it has been reported 4 times in the heterozygous state and once in the homozygous state in GnomAD. In-silico software predicts this variant to be non-deleterious. Though this variant is considered to be a VUS, there is more evidence that this variant is benign than pathogenic.


Coverage for the RYR1 gene was also explored. The RYR1 gene has 106 exons, which leaves a lot of room for low or missing coverage in ES analysis (Phillips et al., 1996). We analyzed the coverage of each exon and discovered that 5 of the exons had an average coverage of lower than 25X (Figure 4.1). Further review of each of these exons revealed that 3 exons had regions of coverage as low as 8X. It is not possible to confidently call heterozygous variants in an exome at coverage this low; therefore, we suggest clinical sequencing of the RYR1 gene for this participant in order to determine if any variants were missed within these regions of low coverage.
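The per-exon coverage review described above can be sketched as follows. This is a minimal illustration that assumes a per-base depth table and a list of exon coordinates are already available; the function name, the toy coordinates, and the depths are hypothetical, and only the 25X threshold is taken from the text.

```python
from statistics import mean


def exon_coverage_summary(depth_by_pos, exons, threshold=25):
    """Return (exon_name, mean_depth, min_depth, below_threshold) per exon.

    depth_by_pos: dict mapping genomic position -> read depth
    exons: list of (name, start, end) with 1-based inclusive coordinates
    """
    summary = []
    for name, start, end in exons:
        # Positions missing from the depth table are treated as zero coverage.
        depths = [depth_by_pos.get(pos, 0) for pos in range(start, end + 1)]
        avg = mean(depths)
        summary.append((name, avg, min(depths), avg < threshold))
    return summary


# Toy example with made-up coordinates and depths.
depth_by_pos = {100: 30, 101: 32, 102: 28, 200: 8, 201: 12, 202: 22}
exons = [("exon_1", 100, 102), ("exon_2", 200, 202)]
for name, avg, lowest, low in exon_coverage_summary(depth_by_pos, exons):
    print(name, round(avg, 1), lowest, "LOW" if low else "ok")
```

Reporting the minimum depth alongside the mean is a deliberate choice in this sketch: an exon can have an acceptable average while still containing a short stretch where heterozygous calls are unreliable.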


Figure 4.1: Coverage of the RYR1 gene for case 10. The blue line represents coverage at each base position on chromosome 19. A) Coverage of the entire RYR1 gene. Exon positions are represented by bars beneath the blue lines: black bars represent exons with an average coverage greater than 25X, red bars represent exons with an average coverage of less than 25X. B-F) Coverage plots of RYR1 exons with an average coverage of less than 25X. Red bars represent the position of the exon. The black dotted line is at the 25X coverage position.

[Figure 4.1 panels: A) RYR1 Coverage; B) RYR1 Exon 1 Coverage; C) RYR1 Exon 24 Coverage; D) RYR1 Exon 44 Coverage; E) RYR1 Exon 49 Coverage; F) RYR1 Exon 87 Coverage. x-axis: Position in GRCh37 Chromosome 19; y-axis: Position Coverage.]


4.10.2. CHRNE

A heterozygous variant was found in the gene CHRNE. This variant, c.764C>T p.Ser255Leu, has been reported as likely pathogenic once in the ClinVar database by the clinical diagnostic laboratory Invitae (https://www.ncbi.nlm.nih.gov/clinvar/variation/465864/). This variant has not been seen in the GnomAD database and in-silico software predicts it to be deleterious (SIFT: 0; PolyPhen: 0.996; CADD: 34). Pathogenic variants in CHRNE can cause myasthenic syndrome with either an autosomal dominant or recessive pattern of inheritance. This variant was observed in this participant’s original clinical exome, and was confirmed to be paternally inherited. Because this participant’s father is unaffected, if this variant is likely pathogenic, then it is most likely to cause the autosomal recessive form of disease. Therefore, we do not believe that this variant is the cause of disease in this participant.

4.10.3. SYNE2

Sequencing reanalysis also revealed a variant in the gene SYNE2 that was not seen in the original exome. This variant, c.18407G>A p.Arg6136His, is reported in the GnomAD database 19 times and is predicted to be deleterious by in-silico software (SIFT: 0.01; PolyPhen: 0.986; CADD: 35). The gene SYNE2 is associated with autosomal dominant Emery-Dreifuss muscular dystrophy (EDMD), which is a heterogeneous myopathy (J. Zhang et al., 2009). This participant’s phenotype does not match the phenotype of EDMD; however, genes that are associated with disorders of skeletal muscles often cause heterogeneous disease that does not necessarily fit into the category of any one myopathy. Therefore, this variant may be a candidate for the disorder seen in this participant, albeit a weak one.

4.10.4. RYR3

Of interest, a heterozygous variant was found in the gene RYR3 (c.11224C>T p.Leu3742Phe). Disease caused by variants in RYR3 has only been reported once in the literature, in a 22-year-old woman with compound heterozygous variants (Nilipour et al., 2018). This patient had proximal muscle weakness in all 4 limbs, presenting at age 4. She was also found to have a long narrow face, high arched palate, and bilateral facial weakness. This presentation is not dissimilar to what can be observed in RYR1-related myopathies (Ferreiro et al., 2000). The variant observed in our participant has not been reported in GnomAD and in-silico software predicted it to be deleterious (SIFT: 0.02; PolyPhen: 0.999; CADD: 32). However, only one variant was observed; therefore, parental segregation is recommended to determine if this variant could potentially lead to a dominant form of disease.

4.10.5. TTN

This participant had 2 different homozygous variants within the gene TTN. TTN encodes the largest protein in the human proteome; therefore, it has a large coding region which contains many missense variants that can be difficult to interpret (Chauveau, Rowell, & Ferreiro, 2014). This large size made sequencing difficult using older sequencing technologies; thus, before the use of next generation sequencing panels, TTN was not routinely tested clinically (Vasli et al., 2012). Therefore, when clinical testing for variants within the TTN gene started to become regular practice, not much was known about the gene, which led to the reporting of many VUSs. Since that time, pathogenic variants in TTN have been reported in patients with neuromuscular disorders of various phenotypes including congenital muscular dystrophies and centronuclear myopathies (Ceyhan-Birsoy et al., 2013; O’Grady et al., 2016). Given the size of the gene and the ethnicity of this participant, it is not surprising that several VUSs were found in the TTN gene during this reanalysis.

The homozygous TTN variant c.11461G>A (p.Gly3675Ser) was found in this participant’s exome with a read depth of 58X. This variant has been seen in 6 heterozygotes in GnomAD, but never as a homozygous variant. It is listed as a VUS in ClinVar once by EGL Genetic Diagnostics in 2016, who did not list their evidence for their variant classification (https://www.ncbi.nlm.nih.gov/clinvar/variation/593018/). Our analysis found that in-silico prediction software reports this variant as likely deleterious (SIFT: none; PolyPhen: 1; CADD: 17.39). Based on ACMG guidelines, this variant meets criteria for PM2 for occurring with a low frequency in population databases, and criteria for PP3 because multiple lines of computational evidence predict it to be deleterious. Therefore, this variant is classified as a VUS and will be reported as such to the referring physician. It should be noted that this variant appears exclusively in cardiac transcripts of TTN, and not in the N2A skeletal-muscle transcript (https://www.cardiodb.org/titin/). Therefore, this variant is unlikely to cause a skeletal myopathy phenotype, such as the one seen in this participant.

Another homozygous TTN variant, c.33340+3A>G, was found in this participant with a read depth of 98X. This variant has never been reported in GnomAD, and has an in-silico splice prediction score of 5.622, indicating a high likelihood of affecting splicing. In order to verify the splice prediction of this variant, we assessed it using the software Alamut Visual (https://www.interactive-biosoftware.com/alamut-visual/). Alamut assessed the predicted splicing effect of this variant versus the wildtype sequence using 3 software programs: Splice Site Finder, MaxEntScan, and NNSplice. Each of these programs predicted a significant difference in splicing. Because it is known that TTN can cause both cardiac and skeletal muscular disease, the transcripts affected by this variant were assessed. This variant was found to be 3 bp into intron 147. Exon 147 is a symmetric exon that contributes to the N2A transcript of TTN, which is part of the skeletal isoform (https://www.cardiodb.org/titin/). We are suspicious enough of this variant that we are enrolling this participant in an RNA-seq study to see if this variant has in-vivo effects on splicing of TTN. A well-established in-vivo functional study would earn this variant a PS3 using ACMG criteria. When combined with the rarity of this variant in GnomAD (PM2), a functional study showing a deleterious effect would lead to a likely pathogenic classification. We have labelled this variant as a strong candidate that is worth investigating further.
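The classification reasoning applied to the two TTN variants can be made explicit with a short sketch of the rules for combining pathogenic criteria in the 2015 ACMG/AMP guidelines. This is a simplified illustration: it counts only pathogenic-side criteria, ignores benign criteria and conflicting evidence, and the function name is hypothetical.

```python
def acmg_pathogenic_class(criteria):
    """Classify from pathogenic-side ACMG/AMP criteria codes (e.g. 'PS3', 'PM2').

    Benign criteria are ignored here, so conflicting evidence is not handled.
    """
    pvs = sum(c.startswith("PVS") for c in criteria)  # very strong
    ps = sum(c.startswith("PS") for c in criteria)    # strong
    pm = sum(c.startswith("PM") for c in criteria)    # moderate
    pp = sum(c.startswith("PP") for c in criteria)    # supporting

    pathogenic = (
        (pvs >= 1 and (ps >= 1 or pm >= 2 or (pm == 1 and pp == 1) or pp >= 2))
        or ps >= 2
        or (ps == 1 and (pm >= 3 or (pm == 2 and pp >= 2) or (pm == 1 and pp >= 4)))
    )
    likely_pathogenic = (
        (pvs == 1 and pm == 1)
        or (ps == 1 and 1 <= pm <= 2)
        or (ps == 1 and pp >= 2)
        or pm >= 3
        or (pm == 2 and pp >= 2)
        or (pm == 1 and pp >= 4)
    )
    if pathogenic:
        return "Pathogenic"
    if likely_pathogenic:
        return "Likely pathogenic"
    return "Uncertain significance"


print(acmg_pathogenic_class(["PM2", "PP3"]))  # one moderate, one supporting
print(acmg_pathogenic_class(["PS3", "PM2"]))  # one strong, one moderate
```

Under these rules, one moderate and one supporting criterion are not sufficient for a likely pathogenic call, whereas one strong criterion combined with one moderate criterion is.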


4.11. Case 11

Case 11 is a 2-year old female of Hutterite descent. She has 2 unaffected older sisters. She has a paternal cousin who had seizures as a child and a maternal cousin who passed away due to congenital hydrocephalus. Her parents are third cousins. She presented at 8 months of age with infantile spasms following a fever and vomiting. She has both tonic-clonic and focal seizures which respond to phenobarbital and levetiracetam. At 15 months of age she presented with developmental regression. She develops acute hepatitis during hospital admissions, which then resolves. She has dysmorphic facies including coarse features, inverse epicanthus, thick lips, and a tented upper lip.

A metabolic screen at hospital admission showed low alanine in plasma amino acids, low free and total carnitine, low homocysteine, and urine organic acids showed excretion of 3-hydroxybutyric acid and elevated acetoacetic acid. She also had ketosis. Neurotransmitters revealed elevated pyridoxal 5-phosphate and neopterin. A metabolic screen 3 months prior to admission was normal. A brain MRI showed significant atrophy of both the supratentorial and infratentorial brain matter. A muscle biopsy showed fiber type disproportion. No deficiencies were found in mitochondrial electron transport chain enzymes and qPCR analysis showed normal mitochondrial DNA content. She had a normal microarray with several regions of homozygosity. She had negative clinical exome and mtDNA sequencing.

The HPO terms selected for this case are: epicanthus inversus (HP:0000537), elevated hepatic transaminase (HP:0002910), global developmental delay (HP:0001263), seizures (HP:0001250), choreoathetosis (HP:0001266), global brain atrophy (HP:0002283), developmental regression (HP:0002376), focal tonic seizures (HP:0011167), infantile spasms (HP:0012469), epileptic encephalopathy (HP:0200134), acute hepatitis (HP:0200119), and abnormality of the liver (HP:0001392).

Singleton ES reanalysis revealed 542 variants and 67 variants in genes related to her selected HPO terms. A total of 30 variants were analyzed for this case, including variants in genes from an epilepsy candidate gene list. Of interest, a frameshift variant in the gene CYP4A11 (c.1058del p.Ser353ThrfsTer28) was found, as well as a frameshift variant in the gene of uncertain significance (GUS) GOLPH3 (c.653dup p.Glu219GlyfsTer5).

4.11.1. CYP4A11

A frameshift variant, c.1058del (p.Ser353ThrfsTer28), was found in the GUS CYP4A11. This variant is not reported in the GnomAD database. The gene CYP4A11 has an observed over expected loss-of-function variant ratio of 1.27, indicating that there are more loss-of-function variants in this gene than one may expect; therefore, this gene appears to be tolerant to loss-of-function variants. Disease associated with the gene CYP4A11 has not previously been reported. This gene is part of the CYP4A subfamily, whose function is mainly known to be involved in the metabolism of medium- and long-chain fatty acids (Okita & Okita, 2005). Metabolic disorders of this type are most often recessive. Given that only one variant has been found in the CYP4A11 gene and that this gene appears to be tolerant of loss-of-function variants, we are not pursuing this candidate any further.

4.11.2. GOLPH3

A frameshift variant, c.653dup (p.Glu219GlyfsTer5), was found in the GUS GOLPH3. This variant is not reported in the GnomAD database and GOLPH3 has an observed over expected loss-of-function variant ratio of 0.19, indicating that it is intolerant of loss-of-function variants. GOLPH3 has many reported functions which include vesicular trafficking, Golgi architecture maintenance, receptor sorting, protein glycosylation, and potentially stress-induced apoptosis (T. Li et al., 2014). In their review, Li et al. (2014) suggest that GOLPH3 is involved in regulation of mitochondrial lipids, and that it plays a key role in increasing mitochondrial mass in response to mitochondrial dysfunction. Similarly, Frappaolo et al. (2017) showed that GOLPH3-mutant Drosophila displayed a paralysis phenotype when exposed to a 38°C hot bath. This phenotype was identical to the one seen in COG7-mutant Drosophila; COG7 is a gene known to cause a congenital disorder of glycosylation. Taken together, this evidence suggests that GOLPH3 may cause an autosomal recessive stress-induced mitochondrial disorder.

Because of the likely recessive nature of disease caused by GOLPH3, we explored the possibility of a second variant in this participant. Coverage analysis showed that 253bp of GOLPH3 had a coverage of less than 20X, with some bases showing a coverage as low as 10X. A large proportion of this gene has coverage too low to make appropriate variant calls; therefore, there may be a missed copy number or single nucleotide variant in GOLPH3. Based on this evidence, we believe that GOLPH3 is a novel gene candidate of interest. Next steps include putting this gene into the Matchmaker Exchange database to see if there are any other individuals with similar phenotypes who have variants in this gene. We will also be performing Sanger sequencing of the GOLPH3 gene to confirm our variant of interest and to determine if there are any hidden variants within this gene.

4.12. Case 12

Case 12 is a female participant who passed away at 6 years of age. She had an older brother with global developmental delay and an older maternal half-brother with ADHD. She had two maternal first cousins with ADHD and a maternal cousin with Beckwith-Wiedemann syndrome. Her parents are of Irish and English descent and are non-consanguineous. She presented at 2 years of age due to changes in behaviour and loss of developmental milestones. She had previously reached all milestones at appropriate ages. At presentation she had poor eye contact, jerky movements, was sleeping more than usual, had few words, and would not play or feed herself. Her disease progressed to complete loss of ambulation, bilateral horizontal gaze palsy, left facial weakness, and left tongue deviation. She was hypotonic, constipated, and G-tube fed. She had absence of sweating and a tendency to overheat. She died in hospital due to intractable apneic seizures.


She had several brain MRIs over the course of her disease. These MRIs showed T2 hyperintensities which indicated a possible disorder of the myelin. Later head imaging showed cerebellar atrophy and diffuse calcification affecting the brain stem, deep nuclei, white matter, and cortex. She had a normal metabolic workup. A brain biopsy showed hyperplasia and activation of perivascular pericytes, with non-specific accumulation of lysosomal materials in pericytes. Muscle and GI biopsies were unremarkable. EEGs showed epileptiform discharges. She had a normal microarray, normal MECP2 sequencing and MLPA testing, normal mitochondrial DNA content, and normal mtDNA analysis. A trio clinical exome from Ambry Genetics in 2015 showed a de novo likely pathogenic variant in HNRNPK; however, at the time of analysis, disease related to this gene had not been described. Since that time, disease associated with HNRNPK has been described, but it does not match the participant’s phenotype. ES also revealed biallelic variants in a possible candidate gene for her condition, EDEM2, but this was ruled out as the cause when her brother was found to have the same genotype.

The HPO terms selected for this case are: progressive muscle weakness (HP:0003323), generalized hypotonia (HP:0001290), ataxia (HP:0001251), irritability (HP:0000737), developmental regression (HP:0002376), abnormality of the cerebral white matter (HP:0002500), cranial nerve VI palsy (HP:0006897), sleep-wake cycle disturbance (HP:0006979), focal T2 hyperintense thalamic lesion (HP:0012692), elevated brain lactate level by MRS (HP:0012707), focal T2 hyperintense brainstem lesion (HP:0012748), hepatitis (HP:0012115), constipation (HP:0002019), mental deterioration (HP:0001268), chorea (HP:0002072), lower limb hypertonia (HP:0006895), primitive reflexes (HP:0002476), seizures (HP:0001250), basal ganglia calcification (HP:0002135), calcification of the small brain vessels (HP:0002504), cerebral calcification (HP:0002514), and midline brain calcification (HP:0007045).

Trio ES reanalysis revealed 622 variants total and 56 variants in genes related to her selected HPO terms. A total of 5 variants were analyzed for this case. Of interest, a maternally inherited heterozygous variant in the gene USP18 was noted (c.511C>T p.Arg171Trp). Interestingly, the HNRNPK variant was not observed during exome reanalysis.

4.12.1. USP18

This participant had a maternally inherited variant in the gene USP18 (c.511C>T p.Arg171Trp). This variant has been reported in 15 heterozygotes in the GnomAD database and has mixed predictions of pathogenicity by in-silico software (SIFT: 0.1; PolyPhen: 0.007; CADD: 15.76). This gene is associated with pseudo-TORCH syndrome, which is an autosomal recessive disorder that mimics fetal brain damage caused by in-utero infection (Meuwissen et al., 2016). This includes intracranial hemorrhage, calcification, and brain malformations. The gene USP18 matches 2 HPO terms with our participant: cerebral calcification and generalized hypotonia. Disease caused by USP18 is typically caused by loss-of-function mutations, so it is possible that a missense mutation could cause a later-onset form of disease (Meuwissen et al., 2016). So far, we have only found a single variant in this gene. Trio-WGS is suggested in order to more comprehensively search for a potential second variant. We also suggest simultaneous RNA studies, in order to determine if the expression of USP18 is absent or reduced in this participant.

4.13. Case 13

Case 13 consists of a group of affected siblings. This family is made up of eight siblings, five affected with elevated oxalate and three unaffected. There is a family history of false-positive newborn screens for CPT-1 deficiency, which is unrelated to the given indication. The second youngest sibling has a severe seizure disorder and developmental delay. The family is of Inuit descent. The parents are unaffected and are non-consanguineous. The eldest sibling is a 28-year old male who presented at 2 years of age due to cloudy urine. At this time, he was discovered to have renal stones and elevated oxalate. He is otherwise healthy.


All siblings have had normal urine organic acids and kidney ultrasounds. Only the eldest sibling has had kidney stones to date. A microarray on the eldest sibling revealed multiple regions of homozygosity, which include the genes AGXT and GRHPR, which are candidate genes that cause hyperoxaluria. Sequencing of these genes was normal. Research trio ES of 3 affected siblings performed in 2017 was unrevealing.

The selected HPO terms for these siblings are: nephrolithiasis (HP:0000787), hyperoxaluria (HP:0003159), and calcium oxalate nephrolithiasis (HP:0008672).

Trio ES reanalysis for three of the affected siblings revealed 467 shared variants total and 5 variants in genes related to the selected HPO terms. There were no candidate gene lists of interest for this family. A total of 39 variants were analyzed. There were no candidate variants of interest for this family. Due to the mild nature of disease in this family, it is unlikely that more comprehensive studies will yield more useful results. We suggest reanalysis in a few years’ time to see if any variants are found in genes with newly discovered disease associations.

4.14. Case 14

Case 14 consists of an affected sibling pair of Metis, Irish, and German descent. Their parents are non-consanguineous. They are currently living under the care of foster parents. The eldest sibling is a 19-year old male who presented due to retinal dystrophy, developmental delay, and easy sunburns. His younger sibling is a 16-year old male with retinal dystrophy, thick skin blisters after sun exposure, global developmental delay, microcephaly, and hypotonia.

Biochemical testing for porphyria was unremarkable. They have had normal karyotypes and unrevealing metabolic screens. Duo ES performed in 2015 was unrevealing.


HPO terms selected for this sibling pair are: microcephaly (HP:0000252), retinal dystrophy (HP:0000556), cutaneous photosensitivity (HP:0000992), global developmental delay (HP:0001263), and delayed speech and language development (HP:0000750).

Duo ES reanalysis revealed 377 shared variants and 24 variants in genes related to their selected HPO terms. There were no candidate gene lists of interest for these participants. A total of 29 variants were analyzed. Of interest, these siblings shared a heterozygous frameshift variant in the gene PNKP, c.1253_1269dup p.Thr424GlyfsTer49.

4.14.1. PNKP

The heterozygous frameshift variant c.1253_1269dup (p.Thr424GlyfsTer49) was found in the gene PNKP. This variant has been reported as pathogenic in the ClinVar database and has been reported in 41 heterozygotes in the GnomAD database. The gene PNKP causes an autosomal recessive disorder that is associated with progressive microcephaly and brain abnormalities leading to seizures and developmental delay (Shen et al., 2010). PNKP has two matching HPO terms with these participants: global developmental delay and microcephaly. Though this disorder is not a strong phenotypic match to that seen in our participants, laboratory studies on patients with pathogenic PNKP variants found that they were unable to repair damaged DNA. This is unsurprising, since PNKP is known to be involved in the non-homologous end joining process of DNA repair (Chappell, Hanakahi, Karimi-Busheri, Weinfeld, & West, 2002). This indicates that the gene PNKP may play a role in the photosensitivity phenotype seen in this sibling pair, by preventing the repair of UV-damaged DNA. Next steps include functional studies through chromosome breakage analysis, to determine if a deficiency in DNA-damage repair is the cause of disease in this sibling pair. If this sibling pair is found to have a chromosome-breakage phenotype, then WGS may be pursued to determine if a second causative variant in the PNKP gene can be identified.


4.15. Case 15

Case 15 consists of an affected sibling pair of Somali descent. They have four healthy siblings and their family history is unremarkable. The eldest affected sibling is a 17-year old male with intractable seizures. His seizures are focal with secondary generalization clonic seizures which are drug resistant. He is in a wheelchair due to spastic quadriparesis. He has axial hypotonia and increased deep tendon reflexes. He is non-verbal and is G-tube fed. He has left hemispheric cortical dysplasia and pachygyria.

An EEG showed frequent runs of epileptiform abnormalities without clinical correlates. A brain MRI showed multiple areas of cortical dysplasia involving both the left and right cerebral hemispheres. A renal ultrasound showed multiple areas of possible calcification in both kidneys. He has had a normal microarray and unrevealing metabolic screens.

The younger of the affected siblings is a 13-year old girl. She presented with seizures at 2 years of age which are generalized tonic and tonic-clonic seizures. These seizures are also drug resistant. She had inconsolable crying from 4 months to 9 months of age which resolved on its own. She has speech and language delay, spastic hemiparesis, and worsening axial muscle-tone. She was temporarily in a wheelchair due to her poor muscle-tone but is currently able to walk. She cannot run and has mild leg spasticity. Like her brother, she also has left hemispheric cortical dysplasia, pachygyria, and polymicrogyria. She received a vagus nerve stimulator implant at 9 years of age.

An EEG showed continuous diffuse spike, polyspike, and wave discharges, which were more prominent on the left. A brain MRI showed extensive dysplastic gray matter involving the left cerebral hemisphere and a suspended area of dysplastic gray matter within the right frontal lobe as well as within the right occipital lobe. Research-based duo ES was performed in 2015.

The HPO terms selected for this sibling pair are: muscular hypotonia of the trunk (HP:0008936), global developmental delay (HP:0001263), delayed speech and language development (HP:0000750), seizures (HP:0001250), spasticity (HP:0001257), pachygyria (HP:0001302), generalized tonic-clonic seizures (HP:0002069), polymicrogyria (HP:0002126), cortical dysplasia (HP:0002539), generalized tonic seizures (HP:0010818), and abnormality of brain morphology (HP:0012443).

Duo ES reanalysis revealed 824 shared variants and 110 variants in genes associated with the selected HPO terms. A total of 118 variants were analyzed including those found within genes from an epilepsy candidate gene list. No candidate variants were found in this sibling pair.

The large number of variants analyzed in this sibling pair is due to the ethnic disparity within genomics databases. When little genomic data is available for a specific population, benign ethnic-specific variants appear rare. Since one of the primary filters in this analysis was allele frequency, many variants that would normally have been filtered from analysis were not. Not only does the lack of genomic data for certain ethnicities make it difficult to filter variants of interest, but it also makes it difficult to interpret the pathogenicity of those variants (Popejoy et al., 2018). This highlights one of the challenges of variant interpretation and shows the need for genomic sequencing in unaffected individuals of more diverse populations. ES reanalysis for this sibling pair did not reveal any candidate variants, despite the large number of variants analyzed. We recommend WGS of this sibling pair with the addition of their unaffected parents and two unaffected siblings in order to assess for the segregation of variants of interest.
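The effect of database under-representation on a frequency filter can be illustrated with a minimal sketch. The 1% cutoff, the function name, and the two example frequencies are hypothetical and are not taken from this study's pipeline; the point is only that a variant with no database record cannot be distinguished from a truly rare one.

```python
# Hypothetical rare-variant filter: keep variants whose database allele
# frequency is below a cutoff. A variant absent from the database (None)
# is treated as rare, which is how benign population-specific variants
# from under-represented groups are retained for review.
MAX_AF = 0.01  # assumed cutoff, for illustration only


def passes_af_filter(db_allele_freq, max_af=MAX_AF):
    """Return True when the variant would be retained for review."""
    if db_allele_freq is None:  # never observed in the reference database
        return True
    return db_allele_freq < max_af


# Made-up example: the same benign variant, common in its own population,
# as seen through a well-sampled versus a poorly sampled database.
well_sampled = passes_af_filter(0.12)    # frequency known: filtered out
poorly_sampled = passes_af_filter(None)  # frequency unknown: retained
```

Under these assumptions the variant is removed when its population is well represented and kept when it is not, which would inflate the review list in the way described above.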

4.16. Case 16

Case 16 is a 56-year-old woman of French descent. She has presumed Type 3 hereditary angioedema with an extensive family history. She has an affected father, two affected siblings, one affected daughter, an affected son, an affected granddaughter, three affected paternal uncles, one affected nephew, and one affected niece (Figure 4.2). She has two healthy daughters and three unaffected siblings: two sisters and one brother. She has recurrent angioedema and prodromal symptoms which she treats with C1 esterase inhibitor concentrate therapy. She gets limb pain which increases with activity as well as arm and leg numbness and tingling. She often has headaches and abdominal distension. She bruises easily and has dyspnea and chest pain. Metabolic screening showed slightly elevated homocysteine and decreased vitamin B12. Targeted mutation testing of the F12 gene was negative, as was gene sequencing for HAE type II. Singleton ES was performed on a research basis in 2016.

Figure 4.2: Pedigree for case 16. Circles represent females and squares represent males. Black circles and squares represent family members who are affected with hereditary angioedema. Unfilled circles and squares represent unaffected family members. The arrow represents the participant whose exome was reanalyzed in this study.

The HPO terms selected for this participant are: urticaria (HP:0001025), angioedema (HP:0100665), stridor (HP:0010307), abdominal pain (HP:0002027), paresthesia (HP:0003401), and chest pain (HP:0100749).

ES reanalysis revealed 620 variants total and 17 variants in genes associated with case 16’s selected HPO terms. There were no candidate gene lists of interest for this case. A total of 47 variants were analyzed for this case. Of interest, this participant has a homozygous variant in the gene HLA-B (c.356T>G p.Leu119Arg).

4.16.1. HLA-B

A homozygous variant, c.356T>G (p.Leu119Arg), was found in the gene HLA-B. This variant has not been reported in the GnomAD database and in-silico software predicts that this variant is deleterious (SIFT: 0.01; PolyPhen: 0.97; CADD: 23.6). The gene HLA-B has a Phenolyzer rank of 148, which indicates that out of the over 20,000 known genes in the human genome, this gene is predicted by the Phenolyzer software to be the 148th most likely to be the cause of a disease associated with this participant’s selected HPO terms. This gene is part of the human leukocyte antigen (HLA) group of genes, which make up a part of the major histocompatibility complex (MHC). The HLA-B gene is part of the MHC class I genes, which display antigen peptides to the immune system (Dendrou, Petersen, Rossjohn, & Fugger, 2018). This gene has been associated with increased susceptibility to various diseases including drug-induced liver injury, spondyloarthropathy, Stevens-Johnson syndrome, synovitis, and toxic epidermal necrolysis (https://omim.org/entry/142830). Given the apparent autosomal dominant, monogenic nature of disease in this family, it is unlikely that a homozygous variant in an MHC gene is the cause. Therefore, we are not pursuing this candidate variant any further.

4.16.2. Segregation Analysis

Given the strong family history of disease in this family, WGS using multiple family members is a promising next step. Each individual is expected to share approximately 50% of their genetic information with each of their children. Therefore, if family members from multiple generations are available for WGS, then segregation analysis of variants may aid in discovering the genetic etiology of disease in this family. We recommend recruiting this participant’s affected granddaughter, her affected niece or nephew, and an affected paternal cousin for analysis. We expect that her granddaughter and paternal cousin share approximately 1/32 (3%) of their genetic information; therefore, a variant shared by these distant affected relatives is unlikely to be shared by chance, and their inclusion will aid in the assessment of shared, potentially pathogenic variants in this family.
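The 1/32 figure can be reproduced by counting meioses along each path that connects the two relatives through a shared ancestor. The sketch below assumes no inbreeding and that the paternal cousin is a full first cousin of the participant (sharing both paternal grandparents); the function name is illustrative.

```python
from fractions import Fraction


def relatedness(path_lengths):
    """Coefficient of relationship: sum of (1/2)**n over every path of
    n meioses connecting two relatives through a shared ancestor."""
    return sum(Fraction(1, 2) ** n for n in path_lengths)


# Granddaughter -> her parent -> participant -> participant's father
# -> paternal grandparent of the participant: 4 meioses.
# Paternal first cousin -> cousin's parent -> same grandparent: 2 meioses.
# Two shared ancestors (the participant's paternal grandfather and
# grandmother), so there are two paths of 4 + 2 = 6 meioses each.
r_granddaughter_cousin = relatedness([6, 6])
print(r_granddaughter_cousin, float(r_granddaughter_cousin))  # 1/32 0.03125
```

The same helper gives 1/4 for the participant and her granddaughter (`relatedness([2])`) and 1/8 for the participant and her first cousin (`relatedness([4, 4])`).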

4.17. Case 17

Case 17 is a 3-year-old male of First Nations and Caucasian descent. He currently lives in a foster home and little is known about his family history. He has 6 half-siblings, including a brother with developmental delay. This participant first presented at birth with neonatal abstinence syndrome, which is defined as drug withdrawal from prenatal exposure. He had an umbilical hernia, developed conjugated hyperbilirubinemia, and was below the first percentile for height and weight. He had elevated cholestatic enzymes and cirrhotic liver disease. He has dysmorphic facies including: bilateral epicanthic folds, small mouth, small palpebral fissures, thin upper lip, smooth philtrum, and microcephaly. He had a G-tube insertion due to poor growth. He has had developmental challenges, with a suspicion of fetal alcohol syndrome (FAS), though this does not explain his full phenotype.

An abdominal ultrasound showed no evidence of biliary atresia, but did show hepatomegaly with diffuse coarse echotexture. A liver biopsy showed bile duct changes and cholestasis. He eventually went on to have a liver transplant due to progressive worsening of his liver function. A full metabolic workup showed consistently high liver enzymes, as well as low vitamin D and E. He had a microarray which showed a 208 kbp duplication of unknown significance in the Xq23 region of his genome (X:110,601,358-110,809,103). A cholestasis panel showed a VUS in PEX6 (c.2301-5_2301-2delCTCA), a gene associated with Zellweger syndrome. A singleton clinical exome performed by Prevention Genetics in 2019 did not reveal any new findings.

The HPO terms selected for this case are: short stature (HP:0004322), microcephaly (HP:0000252), failure to thrive (HP:0001508), narrow mouth (HP:0000160), thin upper lip vermilion (HP:0000219), epicanthus (HP:0000286), smooth philtrum (HP:0000319), narrow palpebral fissure (HP:0045025), umbilical hernia (HP:0001537), hepatic failure (HP:0001399), hepatomegaly (HP:0002240), neonatal cholestatic liver disease (HP:0006566), and conjugated hyperbilirubinemia (HP:0002908).

Exome reanalysis revealed 741 variants and 63 variants in genes associated with his selected HPO terms. No candidate gene lists were used for this case. A total of 57 variants were analyzed. Of interest, a heterozygous variant was found in the gene EP300 (c.3256A>G p.Ile1086Val). Based on the participant’s phenotype we also analyzed the 16q22 region of this participant’s exome and found three homozygous variants of interest: PSKH1 c.686C>T (p.Pro229Leu), DUS2 c.1265C>T (p.Ala422Val), and NOB1 c.1095C>A (p.Asp365Glu).

4.17.1. EP300

A heterozygous variant, c.3256A>G (p.Ile1086Val), in the gene EP300 was of interest. Initial evidence for this variant appeared convincing. This variant is absent from the GnomAD database, and in-silico software predicts this variant to be deleterious (SIFT: 0.01; PolyPhen: 0.92; CADD: 26.3). Phenolyzer gave the EP300 gene a rank of 33, indicating that out of over 20,000 known human genes, the Phenolyzer software predicts that this gene is the 33rd most likely gene to be associated with this participant’s HPO terms. Finally, variants in the EP300 gene cause autosomal dominant disease, and the gene is associated with five of the HPO terms selected for this participant (epicanthus, failure to thrive, microcephaly, narrow mouth, and short stature). However, the gene EP300 causes a disorder called Rubinstein-Taybi syndrome, which is associated with distinct facial features, developmental delay, intellectual difficulty, and behavioural difficulties such as hyperactivity (Hamilton et al., 2016; Roelfsema et al., 2005). To our knowledge, Rubinstein-Taybi syndrome has never been associated with liver disease, and the facial features of this participant do not align with this disorder. Most of the HPO term matches between EP300 and our participant are due to facial features which can be attributed to the potential FAS seen in our participant. However, the main feature seen in our participant, liver cirrhosis, is not explained by this variant. Though initial review of this variant looked promising, this analysis highlights the importance of interpreting each variant in the context of the patient. Therefore, the variant found in EP300 is not considered a candidate for this participant and will not be pursued further in this case.
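The gap between a good term-overlap score and a poor clinical fit can be made concrete with a simple set comparison. The term lists below are copied from the text for case 17; grouping four of them as "liver-related" is an assumption made here for illustration and is not part of the original analysis.

```python
# HPO terms selected for case 17, as listed for this case.
participant_terms = {
    "short stature", "microcephaly", "failure to thrive", "narrow mouth",
    "thin upper lip vermilion", "epicanthus", "smooth philtrum",
    "narrow palpebral fissure", "umbilical hernia", "hepatic failure",
    "hepatomegaly", "neonatal cholestatic liver disease",
    "conjugated hyperbilirubinemia",
}

# The five selected terms that the text reports as associated with EP300.
ep300_terms = {
    "epicanthus", "failure to thrive", "microcephaly", "narrow mouth",
    "short stature",
}

# Assumed grouping of the liver-related terms (illustrative only).
liver_terms = {
    "hepatic failure", "hepatomegaly",
    "neonatal cholestatic liver disease", "conjugated hyperbilirubinemia",
}

matched = participant_terms & ep300_terms
unexplained_liver = liver_terms - ep300_terms

print(len(matched), "of", len(participant_terms), "terms matched")
print("liver terms not explained:", sorted(unexplained_liver))
```

Five of the thirteen selected terms match, yet none of the liver-related terms do, which is consistent with the conclusion that the variant does not account for the participant's main feature.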

4.17.2. North American Indian Childhood Cirrhosis (NAIC)

Based on this participant’s phenotype, we were interested to know if this participant has the variant UTP4 c.1693C>T (p.Arg565Trp). North American Indian Childhood Cirrhosis (NAIC) is an autosomal recessive disease that has been described in Ojibway-Cree children from Northwestern Quebec. NAIC is a form of cholestasis which typically begins as unexplained jaundice and eventually progresses to cirrhosis requiring liver transplantation (Chagnon et al., 2002). Mapping and segregation studies previously led to the conclusion that NAIC was caused by the homozygous c.1693C>T (p.Arg565Trp) variant in the gene CIRH1A (currently referred to as UTP4), which was present in all affected individuals. However, analysis of the Exome Aggregation Consortium data found that this variant was present at a high frequency in Latino individuals (0.017; https://gnomad.broadinstitute.org/variant/16-69199289-C-T?dataset=gnomad_r2_1). Follow-up with homozygous individuals from the database revealed no issues with cholestasis or liver disease of any type (Lek et al., 2016). This variant was reclassified as benign based on ACMG criteria BS1 (allele frequency higher than expected for the disorder) and BS2 (observed in the homozygous state in healthy adult individuals) (Richards et al., 2015). Since then, little work has been done to resolve the causative variant of NAIC.
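A rough calculation shows why a frequency of 0.017 is difficult to reconcile with a rare, fully penetrant recessive disorder. This sketch assumes that 0.017 is an allele frequency and that genotypes follow Hardy-Weinberg proportions; neither assumption is stated explicitly in the source, and the variable names are illustrative.

```python
# Assumptions: 0.017 is the allele frequency (q) reported for Latino
# individuals, and the population is in Hardy-Weinberg equilibrium.
q = 0.017

homozygote_freq = q ** 2        # expected fraction of homozygotes
carrier_freq = 2 * q * (1 - q)  # expected fraction of heterozygous carriers

print(f"expected homozygotes: about 1 in {round(1 / homozygote_freq):,}")
print(f"expected carriers: {carrier_freq:.1%}")
```

Under these assumptions roughly 1 in 3,460 individuals in that population would be homozygous, far more than would be expected if every homozygote developed childhood cirrhosis. This is the kind of reasoning that the BS1 criterion captures.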

While the UTP4 c.1693C>T variant is no longer thought to be the cause of disease, given its low frequency in Indigenous populations it can still act as an indirect marker for the presence of the founder haplotype. Analysis of this participant’s assembled reads showed that he is homozygous for the UTP4 c.1693C>T variant (Figure 4.3). This is strong evidence for the diagnosis of NAIC in this participant since previous work showed a low frequency of this variant in Ojibway-Cree individuals. Analysis of other variants in the 16q22 region of this participant’s genome identified three homozygous variants that have low frequency in the GnomAD database: PSKH1 c.686C>T (p.Pro229Leu), DUS2 c.1265C>T (p.Ala422Val), and NOB1 c.1095C>A (p.Asp365Glu).

Figure 4.3: Case 17’s exome sequencing reads aligned to the gene UTP4. The red Ts indicate that all of this participant’s reads that align to position c.1693 show the mutant T nucleotide, instead of the wildtype C.

The variant c.686C>T (p.Pro229Leu) in the gene PSKH1 has not been reported in the GnomAD database and has mixed in-silico predictions of pathogenicity (SIFT: 0.02; PolyPhen: 0.478; CADD: 23.8). The gene PSKH1 is a GUS, but it may be involved in localization of splice proteins to splice factor compartments (Brede, Solheim, & Prydz, 2002) and maintenance of the Golgi apparatus (Brede, Solheim, Stang, & Prydz, 2003). The variant c.1265C>T (p.Ala422Val) in the gene DUS2 is reported in 4 heterozygotes in the GnomAD database, and is unanimously predicted to be pathogenic by in-silico software (SIFT: 0.01; PolyPhen: 0.999; CADD: 35). Not much is known about this GUS, except that it has tRNA-dihydrouridine synthase activity (Kato et al., 2005) and that, though it is ubiquitously expressed, it is mostly found in the testis, small intestine, lymphocytes, and whole blood (https://gtexportal.org/home/gene/DUS2). Finally, the variant c.1095C>A (p.Asp365Glu) in the gene NOB1 has also not been reported in the GnomAD database and has mixed in-silico predictions of pathogenicity (SIFT: 0.02; PolyPhen: 0.71; CADD: 17.77). The gene NOB1 is also a GUS, but is known to have endonuclease activity and is required for the conversion of 20S pre-rRNA to mature 18S rRNA (Fatica, Oeffinger, Dlakić, & Tollervey, 2003; Kato et al., 2005).
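The distinction between "mixed" and "unanimous" in-silico predictions can be made explicit by counting how many tools cross a damaging threshold. The cutoffs below (SIFT at or below 0.05, PolyPhen at or above 0.85, scaled CADD at or above 20) are commonly used conventions and are assumptions here; the thresholds actually applied in this analysis are not stated in this section.

```python
# Assumed, commonly used cutoffs; not taken from this study.
SIFT_MAX = 0.05      # SIFT: lower scores are more damaging
POLYPHEN_MIN = 0.85  # PolyPhen: higher scores are more damaging
CADD_MIN = 20.0      # CADD (scaled): higher scores are more damaging


def damaging_calls(sift, polyphen, cadd):
    """Count how many of the three tools call the variant damaging."""
    return sum([sift <= SIFT_MAX, polyphen >= POLYPHEN_MIN, cadd >= CADD_MIN])


def consensus(sift, polyphen, cadd):
    """Label a variant by agreement between the three predictors."""
    n = damaging_calls(sift, polyphen, cadd)
    return {3: "unanimous damaging", 0: "unanimous benign"}.get(n, "mixed")


# Scores as reported in the text for the three 16q22 variants.
scores = {
    "PSKH1 p.Pro229Leu": (0.02, 0.478, 23.8),
    "DUS2 p.Ala422Val": (0.01, 0.999, 35.0),
    "NOB1 p.Asp365Glu": (0.02, 0.71, 17.77),
}
results = {name: consensus(*s) for name, s in scores.items()}
for name, label in results.items():
    print(name, "->", label)
```

With these assumed cutoffs, only the DUS2 variant is called damaging by all three tools, while the PSKH1 and NOB1 variants receive mixed calls, matching the description above. Different cutoffs could change the labels.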

All three of these variants are novel candidates for NAIC. Follow-up steps will be to determine if other Indigenous patients in Winnipeg or the Care4Rare cohort have a diagnosis of NAIC and to look for the presence or absence of these variants in their ES data. Ultimately, functional analysis of one or more of these candidate genes will be needed to prove causation unless non-Indigenous patients with a similar phenotype can be discovered with different mutations in the same gene. The Matchmaker Exchange may be useful for this purpose.

4.18. Case 18

Case 18 is a 3-year-old male of First Nations descent. He is an only child born to non-consanguineous parents. He has a maternal cousin with congenital heart disease, and the rest of his family history is unremarkable. This participant presented prenatally with unusual contours of the fetal brain. At birth, he had short femurs, missing parietal bones, a small occipital meningocele, and cutis aplasia overlying the occiput. Brain MRIs revealed multiple areas of polymicrogyria, a duplicated posterior portion of the falx cerebri, and a flattened cerebellum. Other features include a sloping forehead with prominent apex, microcephaly, and a hypopigmented patch on his left shoulder. Remarkably, he has had normal development. This participant had a normal microarray, which revealed several regions of homozygosity, and negative genetic testing for COL18A1, which is the gene associated with autosomal recessive Knobloch syndrome. Clinical singleton ES was performed in 2018.

The HPO terms selected for this case were: microcephaly (HP:0000252), sloping forehead (HP:0000340), abnormal parietal bone morphology (HP:0002696), hypopigmentation of the skin (HP:0001010), aplasia cutis congenita (HP:0001057), no global developmental delay (HP:0001263), abnormal cerebellum morphology (HP:0001317), occipital encephalocele (HP:0002085), polymicrogyria (HP:0002126), abnormality of the falx cerebri (HP:0010653), and parietal meningocele (HP:0030730).

Singleton ES reanalysis revealed 776 variants total and 43 variants in genes associated with this participant’s selected HPO terms. No candidate gene lists were used for this case. A total of 48 variants were analyzed. Of interest, two heterozygous variants were found in the gene OBSCN, c.16732C>T and c.18714del, and a heterozygous nonsense variant was found in the gene EGF, c.2908C>T (p.Arg970Ter).

4.18.1. OBSCN

Two heterozygous suspected loss-of-function variants were found in the gene OBSCN. The nonsense variant, c.16732C>T (p.Arg5578Ter), has been reported three times in the GnomAD database and the frameshift variant, c.18714del (p.Ile6239SerfsTer5), has not been reported in GnomAD. The gene OBSCN has an observed over expected loss-of-function variant ratio of 0.559, indicating that it is somewhat intolerant of loss-of-function variants. OBSCN encodes a muscle protein that co-assembles with TTN during myofibrillogenesis (Young, Ehler, & Gautel, 2001) and is mainly expressed in skeletal muscle and heart (https://gtexportal.org/home/gene/OBSCN). Marston et al. described missense OBSCN variants in 4 independent patients with a dilated cardiomyopathy phenotype (2015). They then tested cardiac muscles for obscurin protein levels in these 4 patients and found that obscurin levels were significantly decreased compared to patients with dilated cardiomyopathy due to mutations in other disease-causing genes, increasing their suspicion that OBSCN may be the causative gene. Since that time several other studies have found OBSCN mutations in patients with various cardiomyopathies including hypertrophic cardiomyopathy and left ventricular non-compaction; however, no further functional studies have been done to make a definitive disease-gene association (Grogan & Kontrogianni-Konstantopoulos, 2019; Rowland et al., 2016; Xu et al., 2015).

Based on what is currently known about the function of OBSCN, it is unlikely to be involved in the disorder seen in this participant. However, these loss-of-function variants should not be overlooked, as they may have potential health implications. Next steps will be to determine whether these variants are in trans or in cis and to report them to the referring physician. Although a definitive disease-gene association with OBSCN has not been made, cardiac surveillance may still be indicated.

4.18.2. EGF

A heterozygous nonsense variant, c.2908C>T (p.Arg970Ter), was found in the gene EGF. This variant has not been reported in the GnomAD database and this gene is expected to be intolerant to loss-of-function variants based on an observed over expected loss-of-function ratio of 0.559. The gene EGF is associated with one HPO term selected for this participant, microcephaly, and the Phenolyzer software has given this gene a rank of 186, indicating that out of the more than 20,000 known genes in the human genome, Phenolyzer predicts that this gene is the 186th most likely to be associated with a disease causing this participant’s selected HPO terms.

The gene EGF encodes epidermal growth factor, which is known to affect cell differentiation (O’Keefe, Hollenberg, & Cuatrecasas, 1974). This gene has only been described in disease once before in a consanguineous family with homozygous missense variants leading to renal hypomagnesemia (Groenestege et al., 2007). This disease is significantly different from the disease seen in case 18; however, this could be due to the more severe nonsense mutation seen in our participant. Nonsense variants are especially likely to cause disease because they can cause haploinsufficiency or produce a truncated product which can still bind substrates without appropriately activating downstream effects. Multiple studies have shown that EGF is important in neurogenesis during embryonic development (Lamus et al., 2020; Trujillo-Gonzalez et al., 2019). In fact, Scafidi et al. have shown that intranasal EGF treatment promotes recovery from neonatal hypoxic brain injury and suggest it as a possible therapy (2014). As well, Rawlins et al. have shown that the timing of EGF expression is a critical aspect of appropriate suture closure during development (2008). Similarly, the role of EGF in craniofacial development has also been extensively studied, especially in the context and mandible development (Huang, Solursh, & Sandra, 1996; Shum et al., 1993). Taken together, these studies lead us to believe that this nonsense mutation in EGF, perhaps paired with a less severe missense variant, may have an effect on the brain development and skull formation seen in this participant.

It should be noted that we reviewed the aligned BAM files for this participant and were unable to find a second heterozygous variant in EGF that occurred at a low enough frequency to be potentially pathogenic. It is possible that there is a second variant that cannot be detected using ES technology. Next steps include entering this gene into the Matchmaker Exchange database to see if there are any individuals with similar phenotypes who also carry variants in EGF. We also recommend analysis of EGF transcripts for this participant, to assess for decreased product or altered transcripts.

4.19. Case 19

Case 19 consists of an affected sibling pair of Hutterite descent. Their parents are second cousins and their family history is unremarkable. The eldest of the siblings is a 10-year-old female. She was born at 28 weeks gestation due to preeclampsia in the mother. She remained in hospital for 3.5 months. During this time, she was on a ventilator and had some difficulty feeding. She was noted to have left-sided choanal atresia, a round and broad face, a broad nasal root, and a large mouth.

Currently, she has global developmental delay, including gross motor delay and speech delay. She has gait ataxia, central hypotonia, poor shoulder girdle tone, and a mild increase in tone in her lower extremities. She has limited extension at her elbows and knees and has rapidly progressive scoliosis.

The younger of the siblings is a 7-year-old female. She was born at 29 weeks gestation due to progressive symmetric intrauterine growth restriction. She also presented with left choanal atresia, a broad round face, broad nasal root, and a large mouth. She has global developmental delay including gross motor delay and speech delay. Like her sister, her gait is ataxic and wide. She has hypotonia, increased flexibility, and rapidly progressive scoliosis. At their most recent appointment with genetics, it was revealed that both sisters have already reached menarche.

Brain MRIs in both sisters have found cerebral atrophy and cerebellar hypoplasia. An echocardiogram in the eldest sister found a mild narrowing of the descending aorta. They have both had normal microarrays with multiple regions of homozygosity. The eldest sister had a normal karyotype and normal fragile X testing. A metabolic screen in the younger sibling found high taurine, citrulline, and vitamin E. Clinical duo ES performed by Prevention Genetics in 2019 revealed that both sisters have a paternally inherited VUS in DOCK3 (c.3816C>G), which is associated with an autosomal recessive neurodevelopmental disorder with impaired intellectual development.

The HPO terms selected for this sibling pair are: wide mouth (HP:0000154), broad face (HP:0000283), round face (HP:0000311), wide nasal bridge (HP:0000431), choanal atresia (HP:0000453), scoliosis (HP:0002650), central hypotonia (HP:0011398), global developmental delay (HP:0001263), delayed gross motor development (HP:0002194), delayed speech and language development (HP:0000750), cerebellar hypoplasia (HP:0001321), cerebral atrophy (HP:0002059), gait ataxia (HP:0002066), broad-based gait (HP:0002136), premature birth (HP:0001622), and precocious puberty (HP:0010465).

Duo ES reanalysis revealed 693 shared variants in this sibling pair and 40 shared variants in genes associated with their selected HPO terms. No candidate gene list was used for this sibling pair. A total of 51 variants were analyzed. Of interest, these sisters share a homozygous variant in PTOV1 (c.289G>A p.Gly97Ser).

4.19.1. PTOV1

The homozygous variant c.289G>A (p.Gly97Ser) in the gene PTOV1 was found in both siblings. This variant is absent from the GnomAD database and is predicted to be deleterious by in-silico software (SIFT: 0.02; PolyPhen: 0.974; CADD: 31). The gene PTOV1 is a GUS not currently known to be associated with disease. In vitro studies have shown that when PTOV1 is depleted using RNA interference, flotillin-1 is unable to enter the cell nucleus (Santamaría et al., 2005). Flotillin-1 acts to promote cell proliferation, differentiates epithelial cells into neuronal cells, and has been linked to plaque formation in Alzheimer’s disease (Hazarika et al., 1999; Rajendran et al., 2007). This evidence suggests that variants in PTOV1 could have an effect on neurogenesis and could potentially be the etiology of disease in this sibling pair. The PTOV1 gene has been entered in the Matchmaker Exchange database to see if it is a candidate gene in individuals with similar presentations to this sibling pair.

4.20. Case 20

Case 20 is a 21-year-old male of English, Scottish, French Canadian, and Metis descent. His parents are non-consanguineous. He has an older half-sister with an enlarged spleen and inflammatory bowel disease. He has a maternal grandfather with a degenerative neurologic condition, and a maternal second cousin with Aicardi-Goutières syndrome and developmental regression. This participant had anemia in infancy and recurrent pneumonia. During a bout of pneumonia at age 13 he was discovered to have pancytopenia and low neutrophils. His sicknesses are accompanied by fever, lymphadenopathy, and splenomegaly. He is consistently hypogammaglobulinemic and has low T-cell counts. He is also known to have chronic granulomatous interstitial lung disease. He has an inflammatory skin condition and pityriasis-like patches under his eye. He has bouts of nausea and vomiting, and post work-out myalgia with marked creatine kinase elevation.

He has baseline low IgG, IgM, and IgA levels, with decreased CD8+ T-cells and natural killer cells. He consistently has an elevated serum creatine kinase (500-1200 U/L), as well as elevated chitotriosidase and angiotensin converting enzyme levels. Bone marrow and lymph node biopsies showed granulomatous inflammation, some of it necrotizing. A CT scan showed bilateral ground glass and nodular opacities in his lungs, pulmonary parenchymal nodules, and hypoattenuating lesions throughout his kidneys. He has had a partial nephrectomy due to granulomatous lesions on his kidney. A whole-body MRI showed normal bones and muscles, with several small spiculated lung nodules and splenomegaly. He had a negative genetic panel for severe combined immunodeficiencies and negative familial AFAR testing. A clinical exome with Prevention Genetics in 2018 revealed many heterozygous VUSs which were mainly ruled out due to being inherited from unaffected parents. These include maternally inherited c.542C>A in TNFRSF13B, maternally inherited c.766A>C in ASAH1, maternally inherited c.6734G>C in PIEZO1, paternally inherited c.3370G>T in RTEL1, and maternally inherited c.328C>T in RYR1.

The HPO terms selected for this case are: chronic lung disease (HP:0006528), interstitial pulmonary abnormality (HP:0006530), recurrent pneumonia (HP:0006532), rhabdomyolysis (HP:0003201), lymphadenopathy (HP:0002716), decreased antibody level in blood (HP:0004313), exercise-induced myalgia (HP:0003738), splenomegaly (HP:0001744), neutropenia (HP:0001875), pancytopenia (HP:0001876), anemia (HP:0001903), granulomatosis (HP:0002955), decrease in T cell count (HP:0005403), fever (HP:0001945), and elevated serum creatine phosphokinase (HP:0003236).

Singleton exome reanalysis revealed 883 total variants and 52 variants in genes associated with the HPO terms selected for this participant. A total of 57 variants were analyzed, including variants in genes from a neuromuscular candidate gene list and an immunological candidate gene list. Of interest, a heterozygous variant in the gene TLR3 (c.1234C>G p.Leu412Val) was seen. This variant is listed as a VUS in the ClinVar database and has been reported in 37 heterozygotes in the GnomAD database. In-silico software predicts that this variant has a deleterious effect on gene function (SIFT: 0; PolyPhen: 0.998; CADD: 23.5). The gene TLR3 has a Phenolyzer rank of 771, which means that out of the over 20,000 known genes in the human genome, the Phenolyzer software predicts that this gene is the 771st most likely gene to be associated with this participant’s selected HPO terms. However, the conditions associated with TLR3, susceptibility to herpes virus-induced acute encephalopathy and resistance to HIV, do not match the phenotype of this participant (Sironi et al., 2012; S. Y. Zhang et al., 2007).

For this case, we are suggesting a polygenic effect from the variants of interest that have been described in his original exome. The variant in TNFRSF13B, c.542C>A (p.Ala181Glu), is a known pathogenic variant that is associated with common variable immunodeficiency. However, this variant has variable penetrance and has been seen in both affected and unaffected individuals, although in vitro B-cell dysfunction has been noted in unaffected individuals (Martinez-Gallo et al., 2013). A large case-control study clearly showed an increased risk of variable immunodeficiency in heterozygotes with an estimated odds ratio of 5.6 (Pan-Hammarström et al., 2007). A combination of genetic and environmental factors may be the cause of the immune dysregulation reported in this patient, despite his mother being unaffected. Similarly, the VUS in RYR1 reported in his original exome could have an effect that is exacerbated by his immune phenotype, leading to a combined myopathic and immune disorder. Based on the large number of candidate variants described in this participant’s exome, it is unlikely that follow-up investigations such as WGS will reveal any useful results. We recommend that this participant’s exome be reanalyzed in a few years and that the variants described in his original exome be reinterpreted at that time.

4.21. Case 21

Case 21 consists of a maternal half-sibling pair. Their mother is adopted; therefore, little is known about her family history. Their mother has Sjogren’s syndrome. This sibling pair have very different phenotypes, but it is hypothesized that they may have a similar underlying vascular abnormality that affected prenatal development. The eldest of the siblings is an 18-year-old male who was born with bilateral congenital amputations below his knees. He was also born with synbrachydactyly in both his hands. He presented to genetics at 7 years of age due to unexplained musculoskeletal pain and fatigue. He has had pain his entire life which is particularly bad after exercise.

He discontinued using his prosthetics at the age of 5 due to pain in his lower legs. He also has pain in his shoulders, upper chest, and palms. He is chronically fatigued and has had a decline in function which includes mood swings and occasional aggressive behaviour. He has a mild learning disability, neuropathic paresthesias, as well as itchy and cold sensations in his fingers. He had a normal EEG, and genetic testing for Factor V Leiden and prothrombin were both normal. He has corticated bony outgrowths on the distal ends of his tibia.

The youngest of the siblings is a 15-year-old female who was born with arthrogryposis and hypotonia. She had poor feeding and respiratory insufficiency in infancy. She was initially diagnosed with merosin-positive congenital muscular dystrophy based on a muscle biopsy; however, this diagnosis was declared incorrect based on a later review of the biopsy. She is now thought to have a disorder closer to amyoplasia congenita which is thought to have a prenatal vascular etiology. She has fixed contractures in her shoulders and elbows with no power in her wrists or elbows. She is wheelchair bound, with severe motor delay and some motor regression. She is moderately hypotonic, peripherally more than axially, and her deep tendon reflexes are absent in both her upper and lower limbs. She has interphalangeal fusion in her hands, scoliosis, and is G-tube fed.

She has had a normal brain MRI. An MRI of her pelvis showed severe atrophy with fatty infiltration of the upper legs bilaterally. A muscle biopsy showed marked variation in myofiber size, a slight increase in nuclei, and extensive adipose tissue. An MRI of her thigh and calf showed no muscle in the calf area, as well as relative sparing of rectus femoris, adductor longus, and knee flexors.

She had a normal metabolic workup, normal pulmonary test, normal microarray, normal karyotype, and normal FISH for 22q11. Research-based trio exome analysis of the affected sibling pair and their unaffected mother did not reveal any variants of interest.

The HPO terms selected for this sibling pair are: brachydactyly (HP:0001156), abnormality of the lower limb (HP:0002814), aplasia/hypoplasia of the phalanges of the hand (HP:0009767), amelia involving the lower limbs (HP:0009818), finger syndactyly (HP:0006101), generalized hypotonia (HP:0001290), arthrogryposis multiplex congenita (HP:0002804), and delayed motor development (HP:0002194).

Trio ES reanalysis, which included these half siblings and their unaffected mother, revealed 306 variants shared between the siblings and 22 variants associated with the HPO terms selected for the pair. No candidate gene lists were used for this analysis. A total of 10 variants were analyzed. Of interest was a heterozygous variant in PRKCH (c.1214G>A p.Arg405Lys) for which their mother was also heterozygous.

4.21.1. PRKCH

A heterozygous variant, c.1214G>A (p.Arg405Lys), in the gene PRKCH was found in both siblings and their mother. The variant has been seen in 11 heterozygotes in the GnomAD database and is predicted to be deleterious by in-silico software (SIFT: 0.01; PolyPhen: 0.952; CADD: 32). A variant, c.1120G>A (p.Val374Ile), within the same catalytic domain of PRKCH has been associated with an increased risk of cerebral infarction in a large cohort of Japanese individuals (Kubo et al., 2007). The PRKCH gene was found to be expressed in the vascular endothelial cells in human atherosclerotic lesions, suggesting that pathogenesis in cerebral infarction may be due to effects on the vasculature (Kubo et al., 2007). We hypothesize that this variant, in combination with maternal factors, may have influenced the vasculature of these siblings during embryonic development. If this is the case, then the pattern of inheritance in these siblings would likely be multi-factorial and not Mendelian. Recurrence risks for traits with strong environmental factors can be difficult to assess, especially when we do not know which environmental factors most strongly influence the trait of interest. We will be discussing this hypothesis with the developmental geneticist who first proposed a vascular abnormality in this sibling pair. We propose future studies with animal models to determine if variants within PRKCH can lead to abnormal prenatal development similar to that seen in this case.

4.22. Case 22

Case 22 is a 3-year-old female of First Nations descent. Consanguinity is unknown. She has one older sibling. The family history is unremarkable for her given indication. This participant presented at 5 months of age with acute respiratory distress syndrome associated with group A Streptococcus positive culture and pneumonia on chest x-ray. She had an episode of acute respiratory distress at 12 months of age associated with low levels of Epstein-Barr virus and positive measles PCR two weeks after receiving her MMR vaccination. Upon taking her medical history, it was revealed that she also had an umbilical stump infection at 10 days of age and multiple episodes of oral thrush. Upon hospital admission she had extreme leukocytosis, anemia, thrombocytopenia, hepatosplenomegaly, acute kidney injury, and skin rash. She also had persistent liver dysfunction with elevated liver enzymes.

An abdominal ultrasound revealed abnormal echotexture of the liver with slight irregularity of the surface. She has a history of reduced B and natural killer cells during hospital admissions. She had a normal microarray with >11% homozygosity. Genetic testing of the gene IKBKB was negative for the c.1292dupG variant linked to severe immunodeficiency in First Nations populations. A clinical trio exome through GeneDx in 2018 revealed a homozygous VUS (c.563G>A) in the gene SDR9C7 inherited from each parent. This gene is associated with congenital ichthyosis and was unrelated to her phenotype.

The gene STAT2, known to cause autosomal recessive immunodeficiency and pseudo-TORCH syndrome, was shown in the region of homozygosity, so a skin biopsy was taken to explore this gene further. It should be noted that no variants were seen in the STAT2 gene on her exome.

The HPO terms selected for this case are: skin rash (HP:0000988), elevated hepatic transaminase (HP:0002910), acute kidney injury (HP:0001919), recurrent infections (HP:0002719), sepsis (HP:0100806), decreased liver function (HP:0001410), hepatosplenomegaly (HP:0001433), thrombocytopenia (HP:0001873), anemia (HP:0001903), and leukocytosis (HP:0001974).

Consent to release ES data for this participant’s mother could not be obtained, so analysis was completed as a duo with the participant and her father. Duo exome analysis revealed 551 variants for this participant and 24 variants in genes associated with her selected HPO terms. A total of 36 variants were analyzed, including those from an immune disorder candidate gene list. Of interest were three homozygous variants for which her father was heterozygous: DALRD3 c.1250A>G (p.Tyr417Cys), LMTK2 c.605G>A (p.Gly202Glu), and ZNFX1 c.3152T>C (p.Leu1051Pro).

The variant c.1250A>G (p.Tyr417Cys) in the gene DALRD3 has not been recorded in the GnomAD database and is predicted to be deleterious by in-silico software (SIFT: 0; PolyPhen: 0.999; CADD: 24.1).

The variant c.605G>A (p.Gly202Glu) in the gene LMTK2 has not been reported in the GnomAD database and is predicted to be deleterious by in-silico software (SIFT: 0; PolyPhen: 0.999; CADD: 29.2). The genes DALRD3 and LMTK2 are currently GUSs and a review of the literature was unable to narrow down a proposed function for these genes; therefore, these variants will not be pursued further.

4.22.1. ZNFX1

A homozygous variant, c.3152T>C (p.Leu1051Pro), was observed in the gene ZNFX1. This variant was heterozygous in the participant’s unaffected father and is assumed to be heterozygous in the unanalyzed mother. This variant is absent from the GnomAD database and is predicted to be deleterious by in-silico software (SIFT: 0; PolyPhen: 0.998; CADD: 30). The gene ZNFX1 is absent from the OMIM database, but has been recently studied in the context of immune disorders.

Wang et al. used both in vitro and in vivo models to study the effects of ZNFX1 (2019). They created a ZNFX1-knockout mouse which they determined had no observable phenotypic differences from wildtype; however, these mice had consistently higher virus mRNA levels after infection, as well as more pulmonary infiltration of inflammatory cells. Overall, knockout mice were less resistant to infection, which is similar to the phenotype observed in our participant. Further studies led Wang et al. to hypothesize that the ZNFX1 gene functions as a sensor of viral RNA, which then interacts with mitochondrial antiviral signalling proteins to fight infection (2019).

Entering the ZNFX1 gene into the Matchmaker Exchange program generated a match with a team of researchers from Australia and Germany who have an international cohort of 12 patients with similar immune phenotypes to our participant. All patients in their cohort have biallelic variants in ZNFX1. This team has also generated functional data and is currently in the process of publishing their findings. This evidence suggests that ZNFX1 is a strong novel candidate gene for disease in this patient. Follow-up studies will include collaboration with this international team as well as confirmation of the variant in this participant’s mother.

4.23. Case 23

Case 23 is a 6-year-old female of Hutterite descent. There is distant consanguinity in her family. She has an older brother with severe microcephaly, hypertonia, and developmental stagnation. She has two healthy older brothers and two healthy older sisters. She has one sibling who was born with tetralogy of Fallot who is otherwise healthy. This participant was first seen by genetics in utero due to severe microcephaly. The family elected not to have any prenatal investigations, but she was seen shortly after birth for assessment. She has remained somewhat small throughout her life with very marked microcephaly, similar to her sibling. She has delayed gross motor development, and delayed speech and language development. She has no noticeable dysmorphic features aside from a bulbous nose that appears to be a familial trait. She has a lesion in her temporal left retina which resulted in a dragged disc. This is of uncertain etiology.

A brain MRI in infancy showed microcephaly, with the cerebral hemispheres equally affected and small relative to the posterior fossa structures. She had an unremarkable abdominal ultrasound.

PCR testing for CMV, HSV, EBV, and enterovirus was negative, as was serological testing for toxoplasmosis, CMV, rubella, and herpes. Electron microscopy of a fecal specimen was also negative. A microarray revealed a paternally inherited 10q26.3 deletion of unknown significance (10:131,954,003-135,534,747) and multiple regions of homozygosity. She had a singleton research exome in 2015 which was unrevealing.


The HPO terms selected for this case are: microcephaly (HP:0000252), failure to thrive (HP:0001508), chorioretinal dysplasia (HP:0007731), vitreoretinopathy (HP:0007773), hypertonia (HP:0001276), global developmental delay (HP:0001263), delayed gross motor development (HP:0002194), delayed speech and language development (HP:0000750), hypoplasia of the corpus callosum (HP:0002079), developmental stagnation (HP:0007281), simplified gyral pattern (HP:0009879), feeding difficulties (HP:0011968), thrombocytopenia (HP:0001873), and increased mean platelet volume (HP:0011877).

Singleton exome reanalysis revealed 552 variants total and 56 variants in genes associated with her selected HPO terms. There were no candidate gene lists used for this analysis. A total of 72 variants were analyzed. No candidate variants of interest were found for this case.

It should be noted that the paternally inherited deletion contains the gamma-tubulin gene TUBGCP2, which causes autosomal recessive pachygyria, microcephaly, developmental delay, and dysmorphic facies (Mitani et al., 2019). There are three genes in the same family that are known to cause microcephaly with chorioretinopathy: PLK4, TUBGCP4, and TUBGCP6. It is possible that both siblings have an autosomal recessive TUBGCP2-related disorder where this paternal gene deletion is acting as one allele and a second maternally inherited allele is present which cannot be detected via ES.

To test this hypothesis, we are proposing segregation analysis of the chromosome 10 region in this affected sibling pair, their unaffected siblings, and their unaffected parents. Segregation analysis can help to determine if all affected siblings carry the same combination of inherited alleles, and whether that combination is absent from the unaffected siblings. This would be evidence of an unseen inherited pathogenic variant from the mother. If the results are consistent with disease causation, an RNA-based study of the TUBGCP2 gene may be warranted.
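The logic of the proposed segregation analysis can be illustrated with a minimal sketch. It is not part of the study's pipeline: the haplotype labels ("P-del", "P-wt", "M1", "M2"), the `Sibling` type, and the function name are hypothetical and stand in for whatever phased markers the laboratory would report for the chromosome 10 region.

```python
from typing import NamedTuple


class Sibling(NamedTuple):
    name: str
    affected: bool
    paternal_haplotype: str  # hypothetical label, e.g. "P-del" carries the 10q26.3 deletion
    maternal_haplotype: str  # hypothetical label for the maternal chromosome 10 region


def segregates_with_disease(siblings):
    """True if all affected siblings share one haplotype combination
    and no unaffected sibling carries that same combination."""
    affected = {(s.paternal_haplotype, s.maternal_haplotype)
                for s in siblings if s.affected}
    unaffected = {(s.paternal_haplotype, s.maternal_haplotype)
                  for s in siblings if not s.affected}
    return len(affected) == 1 and affected.isdisjoint(unaffected)


# Invented example family, for illustration only.
family = [
    Sibling("affected sibling 1", True, "P-del", "M1"),
    Sibling("affected sibling 2", True, "P-del", "M1"),
    Sibling("unaffected sibling 1", False, "P-del", "M2"),
    Sibling("unaffected sibling 2", False, "P-wt", "M1"),
]

if __name__ == "__main__":
    print(segregates_with_disease(family))
```

In this invented family, only the affected siblings inherit both the deletion-bearing paternal haplotype and the maternal haplotype "M1", which is the pattern that would support a second, undetected maternal allele.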


4.24. Case 24

Case 24 is a 31-year-old male of Irish, Scottish, French, and German descent. His parents are non-consanguineous. He has a healthy brother. He has a maternal uncle with a renal disorder, and a paternal uncle with psoriasis. His maternal grandmother had breast cancer, his paternal grandfather had a cancer of unknown type, and his paternal grandmother passed away from lung cancer. This participant was believed to have osteopenia and vitamin D deficiency secondary to malabsorption syndrome. It was later thought that he had primary intestinal lymphangiectasia with resulting lymphopenia and hypogammaglobulinemia. He has had chronic lymphedema of both legs since birth, and failure to thrive and diarrhea in infancy which required prolonged hospital stays and tube feedings.

He had chronic thrush and diaper rash in infancy, as well as recurrent otitis media in childhood and recurrent chest infections throughout his entire life. He has a chronic productive cough, shortness of breath, and decreased exercise tolerance. He is frequently fatigued in the mornings, despite sleeping well at night. At the age of 28 he started to experience memory lapses.

This participant also has decompensated liver disease secondary to longstanding cholestatic liver disease. A normal liver biopsy indicated that his liver disease is due to a non-hepatic cause. He has primary hypothyroidism, elevated TSH, low gonadotropins, and low testosterone. At age 22 he was diagnosed with lymphoma, which is currently in remission. He has alopecia, which is reported to have occurred prior to chemotherapy. He has severe warts on his hands and feet as well as scaly papules and plaques on his chest, groin, and lower abdomen.

A full immunological workup at 23 years of age showed severe pan-lymphopenia with severe deficits in CD3 numbers and severe deficits in both CD4 and CD8 T-cells. He had low CD56+ natural killer cells and absent B cells. He was mildly anemic and thrombocytopenic with normal MCV. His IgG was high, IgM was elevated-normal, his IgA was low, and he had no IgE. His lymphocyte responses to mitogens were grossly abnormal. His responses to pokeweed mitogen, phytohemagglutinin, concanavalin A, and candida antigen were 20-30% of normal, and his T-cell receptor excision circles were low.

Gastrointestinal scopes and biopsies have not demonstrated evidence of lymphangiectasia. Alpha-one antitrypsin clearance studies have never shown protein-losing enteropathy, so the etiology of his early GI issues is unclear. An abdominal ultrasound performed at 29 years of age showed bilateral pleural effusions. Wart biopsies were consistent with epidermodysplasia verruciformis. Intravenous immunoglobulin therapy significantly reduced the number of intercurrent infections this participant has yearly. A microarray performed at 24 years of age showed two copy number variants of unknown significance. Clinical singleton ES was performed by Baylor in 2014, which did not reveal any abnormalities in GATA2 or FOXP3, known immunodeficiency genes that were felt to be good candidates based on his phenotype. ES analysis revealed a heterozygous pathogenic variant in VPS45, associated with autosomal recessive neutropenia, a paternally inherited VUS in NOTCH2, which is associated with autosomal dominant Alagille’s syndrome, and a heterozygous VUS in PEX16, which is associated with autosomal recessive peroxisome biogenesis disorder.

The HPO terms selected for this case are: short stature (HP:0004322), failure to thrive in infancy (HP:0001531), recurrent otitis media (HP:0000403), chronic oral candidiasis (HP:0009098), disseminated cutaneous warts (HP:0032215), bronchiectasis (HP:0002110), osteopenia (HP:0000938), cholestasis (HP:0001396), malabsorption (HP:0002024), hypothyroidism (HP:0000821), immunodeficiency (HP:0002721), abnormality of the lymphatic system (HP:0100763), ascites (HP:0001541), chronic diarrhea (HP:0002028), B-cell lymphoma (HP:0012191), and lymphedema (HP:0001004).

Singleton ES reanalysis revealed 526 variants and 56 variants in genes associated with this participant’s selected HPO terms. A total of 69 variants were analyzed, including variants in genes from an immune disorder candidate gene list. Of interest were two heterozygous variants found in the gene RELB, c.433G>A (p.Glu145Lys) and c.1091C>T (p.Pro364Leu).

4.24.1. RELB

Two heterozygous missense variants, c.433G>A (p.Glu145Lys) and c.1091C>T (p.Pro364Leu), were analyzed in the gene RELB. Both variants are absent from the GnomAD database, and both are predicted to be deleterious by in-silico software (c.433G>A: SIFT = 0, PolyPhen = 0.87, CADD = 33; c.1091C>T: SIFT = 0, PolyPhen = 0.99, CADD = 34). A homozygous nonsense mutation in RELB has been associated with immune dysfunction in a pair of siblings and their double first-cousin in a highly consanguineous family (Merico, Sharfe, Hu, Herbrick, & Roifman, 2015). Similar to our participant, the affected individuals had chronic cough, recurrent pneumonia, and multiple episodes of otitis media.

This gene is part of the NFB family alternative pathway. Interestingly, pathogenic variants in other genes within the alternative pathway cause alopecia and panhypogammaglobulinemia, similar to what is seen in our patient (Scott & Roifman, 2019). RELB-knockout mouse models show inflammatory cell infiltration into several organs, hematopoiesis, and impaired cellular immunity (Weih et al., 1995).

Disease-causing variants in RELB were first described in 2015, while this participant’s exome was initially sequenced in 2014, showing how new disease-gene associations can influence variant analysis. These variants are strong candidates. Next steps include confirming that these variants are inherited in trans in our participant. If these variants are confirmed to be independently inherited, then we propose discussing the potential of a collaboration with the team that first described variants in RELB.

4.25. Case 25

Case 25 is a 7-year-old female of mixed-European descent. Her parents are non-consanguineous. She has a healthy younger brother and a maternal grandmother with multiple sclerosis. This participant presented with developmental delay at 6 months of age followed by seizures at 7 months of age. She is microcephalic with no other dysmorphic features. Her seizures are triggered by illness such as fever and fatigue. Seizure types include absence seizures, myoclonic seizures, tonic-clonic seizures, and drop seizures. She has had worsening of symptoms and regression of skills including walking and speech. She has mild hypotonia, and her gait is ataxic and apraxic. She dislikes eating and often forgets to chew. She had a G-tube inserted at 7 years of age. It was previously reported that she slept less than 6 hours a night, which has since reversed, and she is now sleeping all the time. She has also had shudder attacks which were confirmed by EEG to not be associated with seizures.

An EEG done at 18 months of age confirmed seizures. An MRI at 2 years of age showed abnormal myelination, which had resolved in an MRI at 5 years of age, indicating delayed myelination.

Metabolic screens were normal. She has been on five different epilepsy medications in her life and is currently on the ketogenic diet, though she continues to have seizures. She had a negative microarray and negative SNRPN methylation screening for Angelman syndrome, and an epilepsy panel revealed a paternally inherited VUS in SPTAN1 (c.641A>G), which causes autosomal dominant epileptic encephalopathy. She had a trio clinical exome in 2016 from Baylor which revealed a maternally inherited VUS in KATNB1 (c.1108C>T), which causes autosomal recessive lissencephaly with microcephaly, a maternally inherited VUS in CC2D1A (c.1448C>A), which causes autosomal recessive intellectual disability, and a paternally inherited VUS in COASY (c.206C>T), which causes autosomal recessive neurodegeneration and pontocerebellar hypoplasia.

The HPO terms selected for this case are: postnatal microcephaly (HP:0005484), difficulty walking (HP:0002355), global developmental delay (HP:0001263), intellectual disability (HP:0001249), absent speech (HP:0001344), gait apraxia (HP:0010521), generalized hypotonia (HP:0001290), seizures (HP:0001250), ataxia (HP:0001251), generalized tonic-clonic seizures (HP:0002069), absence seizure (HP:0002121), generalized myoclonic seizures (HP:0002123), gait imbalance (HP:0002141), developmental regression (HP:0002376), atonic seizures (HP:0010819), hypersomnia (HP:0100786), and feeding difficulties (HP:0011968).

Trio ES reanalysis revealed 597 variants, including 76 variants in genes associated with her selected HPO terms. A total of 15 variants were analyzed, including variants in genes from a candidate epilepsy gene list. Of interest, biallelic variants were found in the gene SPTB: maternally inherited c.4819G>A (p.Val1607Ile) and paternally inherited c.871G>A (p.Gly291Ser). As well, a homozygous variant (c.1121C>T p.Thr374Ile) in the gene ATG9A was analyzed.

4.25.1. SPTB

Biallelic variants were found in the gene SPTB. The maternally inherited c.4819G>A (p.Val1607Ile) variant has been reported in 13 heterozygotes in GnomAD and has mixed predictions of pathogenicity (SIFT: 0.03; PolyPhen: 0; CADD: 21.7). The paternally inherited c.871G>A (p.Gly291Ser) variant has been reported in 62 heterozygotes in the GnomAD database and in-silico software predicts that it is deleterious (SIFT: 0.01; PolyPhen: 0.992; CADD: 34). The gene SPTB is associated with autosomal dominant hemolytic anemia and spherocytosis, neither of which has been described in this participant. Of interest, SPTB encodes a β-spectrin, which forms heterodimers with α-spectrins such as the product of SPTAN1. The gene SPTAN1 is known to cause autosomal dominant early infantile epileptic encephalopathy. Tohyama et al. have shown that many of the pathogenic in-frame variants found in SPTAN1 are in the last two spectrin repeats of the C-terminal region, which is necessary for heterodimer formation (2015). Interestingly, a Drosophila model showed that loss of β-spectrin reduces the levels of neuronal α-spectrin during embryonic axonal pathfinding. We hypothesize that β-spectrin variants that cannot bind to α-spectrin cause an epileptic phenotype similar to that caused by variants in the α-spectrin gene SPTAN1. We have entered the SPTB gene into the Matchmaker Exchange program to see if there are any other individuals with variants in this gene who have a similar seizure phenotype.
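The distinction drawn above between "mixed" and "deleterious" in-silico predictions can be made explicit with a small sketch. The cut-offs below (SIFT below 0.05, PolyPhen above 0.85, CADD of at least 20) are commonly quoted conventions and are an assumption here; the thesis does not state the thresholds that were applied, and the names `Scores` and `consensus` are illustrative.

```python
from typing import NamedTuple

# Assumed cut-offs; these are NOT taken from the thesis.
SIFT_MAX_DELETERIOUS = 0.05    # lower SIFT scores are more damaging
POLYPHEN_MIN_DAMAGING = 0.85   # higher PolyPhen scores are more damaging
CADD_MIN_DELETERIOUS = 20.0    # higher scaled CADD scores are more damaging


class Scores(NamedTuple):
    sift: float
    polyphen: float
    cadd: float


def consensus(scores):
    """Summarize three predictors as 'deleterious', 'tolerated', or 'mixed'."""
    votes = [
        scores.sift < SIFT_MAX_DELETERIOUS,
        scores.polyphen > POLYPHEN_MIN_DAMAGING,
        scores.cadd >= CADD_MIN_DELETERIOUS,
    ]
    if all(votes):
        return "deleterious"
    if not any(votes):
        return "tolerated"
    return "mixed"


# Scores as reported in the text for the two SPTB variants.
sptb_maternal = Scores(sift=0.03, polyphen=0.0, cadd=21.7)
sptb_paternal = Scores(sift=0.01, polyphen=0.992, cadd=34.0)

if __name__ == "__main__":
    print("c.4819G>A:", consensus(sptb_maternal))
    print("c.871G>A:", consensus(sptb_paternal))
```

Under these assumed cut-offs the maternal variant is classed as mixed because PolyPhen disagrees with SIFT and CADD, while all three tools agree for the paternal variant.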


4.25.2. ATG9A

The homozygous variant c.1121C>T (p.Thr374Ile) was found in the gene ATG9A. This variant has been reported in 28 heterozygotes in the GnomAD database and has mixed predictions of pathogenicity by in-silico software analysis (SIFT: 0.08; PolyPhen: 0.741; CADD: 25.1). The gene ATG9A is essential for autophagy. The exact role that ATG9A plays in autophagy is not fully understood, but it is believed to help control the delivery of other essential proteins to the phagosome (Judith et al., 2019). Congenital disorders of autophagy can be variable, but Ebrahimi-Fakhari et al. state that “these disorders prominently affect the central nervous system at various stages of development, leading to brain malformations, developmental delay, intellectual disability, epilepsy, movement disorders, and neurodegeneration, among others” (2016). Although the gene ATG9A is not currently known to be associated with disease, we cannot ignore the possibility that this participant may have a congenital disorder of autophagy. The gene ATG9A has been entered into the Matchmaker Exchange database.

Determining if these variants of interest are also in her unaffected brother will help to determine their pathogenicity.

4.26. Case 26

Case 26 consists of a sibling pair of Pakistani descent. Their parents are half first-cousins. They have healthy identical twin sisters. Their family history is otherwise unremarkable (Figure 4.4). The youngest of the two siblings is a 14-year-old male. He presented at 4 years of age with retinal detachment, high myopia, and a cataract, all in his left eye. He was found to have peripheral lattice degeneration with a retinal hole in the superotemporal and inferotemporal periphery. His eye could not be saved and he now has a prosthetic eye. A similar process later occurred in his right eye and he is now completely blind. During work-up for his eye surgery he was found to have albuminuria and proteinuria. This has improved over time and his eGFR is stable. He is highly flexible with a Beighton score of 9/9. He has no sign of autoimmune disease and he has normal hearing. A renal biopsy showed normal expression of COL4A3 and COL4A5. The biopsy showed some subtle abnormalities including focal basement membrane splitting, and patchy glomerular epithelial cells with process fusion and cytoplasmic vacuolization.


Figure 4.4: Pedigree for case 26. Circles represent females and squares represent males. Filled-in quadrants represent the phenotypes shown in the legend. Unfilled circles and squares represent unaffected family members. The arrow represents the participant who first presented to genetics. The ages of the two affected siblings are shown above the squares.


The eldest of the siblings is a 17-year-old male with high myopia. Like his brother, he is also extremely flexible with a Beighton score of 9/9. He has high frequency sensorineural hearing loss. He has some small café au lait spots on his trunk. When he was first examined, he was thought to have some facial features similar to Stickler syndrome including a hypoplastic midface and slightly upturned nostrils. As he has aged, his features have been less reminiscent of Stickler syndrome.

Both brothers had normal skeletal surveys and negative COL4A5 testing. Linkage mapping and research-based duo ES revealed a homozygous mutation in ZC3H3 (c.2402A>T p.Lys801Ile). The gene ZC3H3 has never been associated with human disease, and little is known about the function of this gene.

The HPO terms selected for this sibling pair are: high myopia (HP:0011003), retinal detachment (HP:0000541), lattice retinal degeneration (HP:0007992), proteinuria (HP:0000093), albuminuria (HP:0012592), joint hyperflexibility (HP:0005692), high frequency hearing impairment (HP:0005101), midface retrusion (HP:0011800), anteverted nares (HP:0000463), and few café au lait spots (HP:0007429).

Duo ES reanalysis revealed 432 shared variants and 21 variants in genes associated with the HPO terms selected for this sibling pair. No candidate gene lists were used for this sibling pair. A total of 64 variants were analyzed. Of interest, both siblings were heterozygous for a variant in COL9A1 (c.1418T>C p.Ile473Thr). This variant has been reported in 8 heterozygotes in the GnomAD database and has mixed predictions of pathogenicity by in-silico software (SIFT: 0.13; PolyPhen: 0.652; CADD: 24.7). The gene COL9A1 is associated with three of the siblings’ HPO terms: high myopia, joint hyperflexibility, and retinal detachment, and has a phenolyzer rank of 25, indicating that of the over 20,000 genes in the human genome, the software phenolyzer predicts that the gene COL9A1 is 25th most likely to be associated with the HPO terms selected for these siblings. This gene is associated with autosomal recessive Stickler syndrome and was ruled out due to the presence of only one heterozygous variant in each of the siblings. This sibling pair were also found to share a variant in the gene COL9A3 (c.423+1del) and a variant in the gene LAMB2 (c.634T>C p.Ser212Pro). Both of these variants had mixed zygosity between the sibling pair.

4.26.1. COL9A3

The variant c.423+1del was found in the gene COL9A3. This variant was homozygous in the elder sibling and heterozygous in the younger sibling. This variant is not reported in the GnomAD database and the gene has an observed over expected loss-of-function variant ratio of 0.53, indicating that there are fewer loss-of-function variants in this gene than one would expect. Based on its location we expected this variant to have an effect on splicing, so we assessed it using the software Alamut Visual (https://www.interactive-biosoftware.com/alamut-visual/). Alamut assessed the predicted splicing effect of this variant versus the wildtype sequence using 3 software programs: splice site finder, MaxEntScan, and NNSplice. Each of these programs predicted a significant difference in splicing.

The gene COL9A3 has a phenolyzer rank of 28 and is associated with two HPO terms selected for this sibling pair: joint hyperflexibility and retinal detachment.

COL9A3 is a procollagen gene known to cause autosomal dominant epiphyseal dysplasia and autosomal recessive Stickler syndrome (Faletra et al., 2014; Hanson-Kahn et al., 2018). Stickler syndrome is a highly variable connective tissue disorder associated with many of the combined features described in this sibling pair including sensorineural hearing loss, myopia, retinal detachment, cataracts, and hyperflexibility. This variant was homozygous in the brother with facial features of Stickler syndrome and high frequency hearing loss. Given the brothers’ combined symptoms, it was assumed that they had a shared connective tissue disorder. However, it is possible that this variant is the cause of the symptoms in the older brother and that the younger brother has a different condition altogether. Next steps are RNA studies to determine the effects of this variant on splicing.

4.26.2. LAMB2

The variant c.634T>C (p.Ser212Pro) was found in the gene LAMB2. This variant was heterozygous in the elder sibling and homozygous in the younger sibling. This variant is not reported in the GnomAD database and is predicted to be deleterious by in-silico software (SIFT: 0; PolyPhen: 0.999; CADD: 28.3). This gene is associated with one of the HPO terms selected for the sibling pair (proteinuria) and has a phenolyzer rank of 716. The gene LAMB2 encodes an extracellular matrix protein that is associated with nephrotic syndrome with or without ocular anomalies, including retinal detachment (Mohney et al., 2011). This variant was homozygous in the younger brother of this affected sibling pair, who has proteinuria and retinal detachment. It is likely that these brothers have different genetic disorders instead of a single underlying genetic etiology. We are reporting these results back to the referring clinician to see if this can be determined clinically.


CHAPTER 5: DISCUSSION

Currently, exome sequencing (ES) is the most comprehensive genetic test readily available on a clinical basis. This means that when ES fails to find a result, patients are left without a diagnosis and few other options. There are many reasons why ES may fail to lead to a diagnosis, including bioinformatic limitations, variant interpretation, and the lack of known disease-gene associations for causative alleles (Salmon et al., 2019). Therefore, it is not surprising that ES reanalysis at regular intervals diagnoses an additional 10-15% of cases (Eldomery et al., 2017; Ewans et al., 2018; Nambot et al., 2018; Shashi et al., 2019; Wenger et al., 2017). However, systematic reanalysis of ES for patients who have received negative ES results has not been routinely done in Manitoba. For that reason, this study set out to show that ES reanalysis will lead to increased diagnoses in a Manitoba population.

5.1. Study Findings

ES reanalysis in this study found candidate variants of interest in 14 out of 25 cases (56%). Of these, we have classified 3 as strong candidate variants. In order to classify a variant as strong, it has to occur in a gene that has already been associated with disease similar to that seen in the participant, the zygosity/variant burden needs to match the expected inheritance of the disease (e.g., homozygous or biallelic variants in an autosomal recessive disease-gene), and the variants of interest must have evidence of pathogenicity. We also describe candidate variants in novel disease genes in 6 cases, heterozygous candidates in strongly suspicious autosomal-recessive disease genes in 2 cases, and weak candidate variants in 4 cases (1 weak variant in a case that also has a strong candidate variant). No diagnoses have been confirmed in this study, so it is difficult to compare the overall success rate to other published studies. However, if we make the conservative estimate that the strong candidate variants found in 3 of the reanalyzed cases (cases 10, 24, and 26) lead to diagnoses and disregard the other candidates, then a diagnosis would have been found in 12% of the reanalyzed cases. This is comparable to the 10-15% diagnostic rate of previous ES reanalysis studies (Eldomery et al., 2017; Ewans et al., 2018; Nambot et al., 2018; Shashi et al., 2019; Wenger et al., 2017).

The three cases with strong candidate variants each fell into a different disorder category: neurologic, immune, and a connective tissue case that was designated as “other”. Of the 14 cases that had candidate variants, 7 were neurologic, 3 were immune, 2 were multiple congenital anomalies (MCA), and 2 were designated as other (cholestasis and connective tissue). Because case numbers were small for each category, it is difficult to generalize about the success rate for each disorder type. Regardless, candidate variants were found in 7 out of the 12 reanalyzed neurologic cases (58.3%), 3 out of the 5 immune cases (60%), 2 out of the 5 MCA cases (40%), and 2 out of the 3 other cases (66.7%). Therefore, candidate variants were found roughly in proportion to the number of cases in each category.
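The per-category rates above can be reproduced with a short calculation. The sketch below is illustrative only and was not part of the study's analysis: it recomputes the percentages from the reported counts and, as an added assumption, attaches a Wilson score interval to each proportion to show how wide the uncertainty is with so few cases per category. The function and variable names are hypothetical.

```python
from math import sqrt

# Counts taken from the paragraph above:
# (cases with a candidate variant, cases reanalyzed) per disorder category.
CATEGORY_COUNTS = {
    "neurologic": (7, 12),
    "immune": (3, 5),
    "MCA": (2, 5),
    "other": (2, 3),
}


def wilson_interval(successes, total, z=1.96):
    """Return the Wilson score interval (about 95% for z=1.96) for a proportion."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


def summarize(counts):
    """Map each category to (percentage, interval lower bound, interval upper bound)."""
    summary = {}
    for name, (hits, total) in counts.items():
        low, high = wilson_interval(hits, total)
        summary[name] = (round(100 * hits / total, 1), low, high)
    return summary


for name, (pct, low, high) in summarize(CATEGORY_COUNTS).items():
    print(f"{name}: {pct}% (approx. 95% interval {100 * low:.0f}-{100 * high:.0f}%)")
```

With so few cases in each category, every interval spans a large part of the 0-100% range, which is consistent with the caution against generalizing from these rates.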

Trio ES greatly reduced the number of variants analyzed for each case. This is because the addition of unaffected parents provides more evidence as to which variants can be automatically ruled out. For example, a homozygous variant in a participant whose unaffected parent is also homozygous is most likely not a causative variant and does not need to be analyzed further. Similarly, a true de novo variant can be thoroughly scrutinized when it normally may not have been of interest. In this study, we analyzed 9 singleton exomes, 6 duo exomes, and 10 trio exomes. Strong candidate variants were found in 2 singleton exomes (22.2%) and 1 duo exome (16.7%). However, overall candidates were found in 5 singleton exomes (55.6%), 5 duo exomes (83.3%), and 4 trios (40%). With such low numbers it is difficult to make generalizations about success rates; however, in this study, increasing the number of family members sequenced per case greatly reduced the number of variants analyzed but did not seem to increase the success rate of analysis.
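The two trio rules described above (ruling out a homozygous variant shared with an unaffected homozygous parent, and flagging a variant absent from both parents as de novo) can be sketched as a small classifier. This is a minimal illustration under stated assumptions, not the filter implemented in this study: genotypes are encoded as alternate-allele counts (0, 1, or 2), and the function name and return labels are hypothetical.

```python
def classify_trio_variant(proband, mother, father,
                          mother_affected=False, father_affected=False):
    """Classify one variant from alternate-allele counts (0, 1, or 2) per person.

    Returns "exclude", "de_novo", or "review". The encoding and labels are
    assumptions made for this sketch.
    """
    if proband == 0:
        # The proband does not carry the variant at all.
        return "exclude"
    if proband == 2:
        # A homozygous call shared with an unaffected homozygous parent is
        # unlikely to explain the proband's disease.
        if (mother == 2 and not mother_affected) or (father == 2 and not father_affected):
            return "exclude"
    if mother == 0 and father == 0:
        # Carried by the proband but seen in neither parent: candidate de novo.
        return "de_novo"
    return "review"


print(classify_trio_variant(proband=2, mother=2, father=1))  # exclude
print(classify_trio_variant(proband=1, mother=0, father=0))  # de_novo
print(classify_trio_variant(proband=2, mother=1, father=1))  # review
```

In practice a de novo call would still need confirmation, since low coverage in a parent can make an inherited variant look new.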


Of interest, all three cases with strong candidate variants were originally sequenced in 2015 or earlier. In fact, the two earliest sequenced cases had strong candidate variants. Case 24 was originally sequenced in 2014 and had a strong candidate variant in the gene RELB, which was first described as associated with disease in 2015. Similarly, case 26 was originally sequenced in 2013 and had a candidate variant in the gene COL9A3, which was first described as associated with Stickler syndrome in 2014. This case also had a strong candidate variant in the gene LAMB2, which was already known to be associated with disease at the time of the original ES analysis. In case 26, however, these candidate variants may easily have been missed in reanalysis because the mode of inheritance between the sibling pair was not as expected; this is described in more detail in the case discussion in Section 4.26 above. Finally, for case 10, a strong candidate variant was found in the gene TTN, whose disease association was known at the time of sequencing; however, this variant was in a large gene that contained many variants that were difficult to classify. Therefore, if any of these three cases had been reanalyzed within 1-2 years of original sequencing, these strong candidates might have been found.

In terms of candidate variants in novel disease genes, 3 out of the 6 candidate variants (50%) were found in cases that were originally sequenced in 2019, 2 out of the 6 candidates (33.3%) were found in cases that were originally sequenced in 2018, and 1 out of the 6 candidates (16.7%) was found in a case that was originally sequenced in 2016. This implies that ES reanalysis for the sake of gene discovery can be done at any time after original clinical sequencing. Since clinical ES analysis only assesses variants in known disease-genes, reanalysis to look for novel disease-genes can be done as soon as the ES data is generated. In fact, translational studies to show disease-gene associations take time to complete. Appropriate cohorts of patients showing similar phenotypes must be assembled, and collaborations with scientists using appropriate model organisms must be formed. Therefore, it can be argued that reanalysis should be done quickly after receiving a negative ES result, in case a novel candidate is found, so that these collaborative relationships and studies may be implemented.


5.2. Secondary Findings

There is large variability in the incidence of secondary findings reported in the literature, ranging from 0.5% to 6.0% (Lee et al., 2014; Nambot et al., 2018; Retterer et al., 2016; Tarailo-Graovac et al., 2016; Y. Yang et al., 2014). ES reanalysis in this study revealed one secondary finding. In case 18, the participant was found to have two likely loss-of-function variants in the gene OBSCN. This gene does not currently have an established disease association, though a link between variants in OBSCN and cardiomyopathy has been described in the literature (Marston et al., 2015; Rowland et al., 2016; Xu et al., 2015). Even though a definitive disease-gene association has not been made between OBSCN and cardiac defects, this result will still be reported to the referring physician in case it leads to future health implications such as increased cardiac surveillance. If we were to consider these variants in OBSCN as secondary findings, then this study had a secondary finding rate of 4.0% (1 of 25 cases).

The small number of secondary findings described in this study does not come as a surprise. During ES reanalysis, variant filtration was used to prioritize variants in genes associated with the phenotype of interest. Therefore, variants in disease-associated genes that were unrelated to the recorded participant phenotypes were not thoroughly scrutinized. As well, secondary findings that were previously reported during initial ES analysis were not further scrutinized during this analysis. Therefore, few secondary findings were expected to occur during this ES reanalysis project.

5.3. Whole Genome Sequencing as a Future Direction

Though ES reanalysis has the ability to find new diagnoses due to pitfalls of the original analysis such as lack of phenotypic data and lack of disease-phenotype associations, there are still limitations to the technology itself that need to be addressed. These technological limitations include incomplete coverage of the coding regions of the genome and an inability to identify deep intronic or promoter variants, large copy number variants (CNVs), structural variants, and repeat expansions. If a genetic diagnosis cannot be found due to limitations in the technology, no amount of reanalysis on the same data will be able to find the diagnosis. This is where whole genome sequencing (WGS) has an advantage over ES, as it has the ability to overcome many of the technological limitations of ES.

WGS is a method of NGS that does not require probe hybridization, therefore producing more even read coverage (Belkadi et al., 2015; Lelieveld et al., 2015). Because of the uniformity of coverage, WGS can be used to call CNVs that are too small to be called by microarray and too large to be accurately or consistently seen in ES analysis. This uniform coverage also allows WGS to adequately cover approximately 98% of protein-coding regions, which is higher than the 95% of coding regions covered by ES (Alfares et al., 2018). In fact, Lionel et al. suggest that any exons of interest not covered by WGS may easily be sequenced by Sanger if needed, leading to near complete exon coverage of the genome (2018). During the ES probe hybridization step, not all target DNA regions bind as readily to the probes, leading to uneven coverage across the exome. The uniformity of coverage created by WGS also allows for more confident variant calls at lower coverage, because each region is expected to be covered at roughly equal depth. This can be shown by the fact that Taylor et al. found diagnostic success in their population was independent of WGS coverage, meaning that higher sequencing depth did not increase the chances of finding a diagnosis (2015). Outside of increased uniformity and more complete coverage, WGS also has the ability to detect deep intronic and promoter variants, which are not captured via ES but may still lead to disease (Lelieveld et al., 2015).

In a study comparing ES to WGS, Lionel et al. reported that WGS was able to detect all 26 diagnostic variants found in ES, but was also able to diagnose an additional 9 participants out of 70 analyzed (2018). These diagnoses were due to intronic variants, variants in noncoding RNA, variants in mitochondrial DNA, small CNVs, and exonic variants with poor ES coverage. It has also been determined that discordance between ES and WGS results is mostly due to low quality ES calls, often leading to homozygous calls at heterozygous locations (Belkadi et al., 2015). Similarly, studies of WGS success rates typically show higher rates of diagnosis than ES studies. Stavropoulos et al. found a WGS diagnostic rate of 34%, where 22% of their diagnoses were due to CNVs unable to be detected by ES analyses (2016). Lionel et al. found a 41% diagnostic rate in their study on WGS use as a first-tier diagnostic test (2018). This was a statistically significant increase over the 24% diagnostic rate predicted for traditional genetic testing methods.

The ability to detect precise breakpoints and structural variants has considerable utility. In their WGS study, Gilissen et al. detected 8 de novo structural variants, including 3 deletions smaller than 10 kb in size (2014). They were also able to detect a partial duplication of the gene TENM3 that had been inserted with an inverse orientation into the gene IQSEC2, leading to intellectual disability in their patient (Gilissen et al., 2014). This finding would have been missed by virtually all other genetic testing methods. Similarly, Alfares et al. reported a case of exon 3 expansion in the gene PHOX2B undetectable by traditional methods, and a large deletion of exons 3-9 in the gene TPM3 that was too small to be detected by microarray analysis (2018). Even when rearrangements are balanced and do not appear to have a gain or loss of DNA content, gene disruptions can still lead to disease (Aristidou et al., 2018; Harewood & Fraser, 2014). This has been seen in aromatase excess syndrome, where a 15q21.2-q21.3 inversion causes the aromatase gene CYP19A1 to become upregulated by a cryptic promoter (Shozu et al., 2003). The reverse has also been described in a female with incontinentia pigmenti due to a t(X;2)(q23;q33) translocation causing the IKBKG gene to be downregulated due to its new proximity to the heterochromatic band of chromosome 2 (Genesio et al., 2011). These types of balanced rearrangements would be missed by microarray technology but are detectable using WGS.

Recently, studies have started harnessing the potential of WGS to detect disorders caused by tandem repeats, such as Fragile X syndrome or Huntington’s disease. Though the methods used to detect repeat expansions are technical and complex, at their core these bioinformatic analyses require paired-end NGS reads that flank the area of repeat expansion. They then use statistical likelihood tests to determine the size of the repeat expansion. If there is a repeat expansion in this region, there will be a larger number of reads containing large numbers of repeats than would be expected if there were no expansion (Bahlo et al., 2018). With the introduction of repeat expansion tests using WGS, there are now few technical limitations to making diagnoses through NGS technologies.
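The intuition behind comparing observed and expected numbers of repeat-containing reads can be illustrated with a toy calculation. The sketch below is a deliberate simplification and is not one of the published methods reviewed by Bahlo et al. (2018): it assumes that a read counts as "repeat-rich" when copies of the motif cover at least 80% of its length, and that the number of such reads at an unexpanded locus follows a Poisson distribution with a known mean. All function names, the 80% threshold, and the Poisson assumption are hypothetical choices made for illustration.

```python
from math import exp


def is_repeat_rich(read, motif, min_fraction=0.8):
    """True when non-overlapping copies of the motif cover most of the read."""
    if not read:
        return False
    return read.count(motif) * len(motif) / len(read) >= min_fraction


def poisson_upper_tail(observed, expected):
    """Return P(X >= observed) for X ~ Poisson(expected)."""
    term = exp(-expected)  # P(X = 0)
    cumulative = 0.0
    for i in range(observed):
        cumulative += term          # add P(X = i)
        term *= expected / (i + 1)  # advance to P(X = i + 1)
    return max(0.0, 1.0 - cumulative)


def expansion_p_value(reads, motif, expected_rich_reads):
    """Count repeat-rich reads and test them against the no-expansion expectation."""
    observed = sum(is_repeat_rich(read, motif) for read in reads)
    return observed, poisson_upper_tail(observed, expected_rich_reads)


# Toy data: six reads made entirely of CAG repeats and four unrelated reads.
reads = ["CAG" * 20] * 6 + ["ACGTTGCAAGCTTAGGCTA"] * 4
observed, p_value = expansion_p_value(reads, "CAG", expected_rich_reads=1.0)
print(observed, p_value)
```

A small p-value here only indicates an excess of repeat-rich reads; it does not give the size of the expansion, which is one reason conventional sizing methods remain necessary.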

Similar to ES, WGS does have its pitfalls. Though repeat expansion detection through WGS is a promising avenue for increased diagnosis, it is still very new and has yet to be perfected. This technology cannot accurately give the correct size of the repeat expansion and is based on a comparison of read number to repeat length, which is not a perfect 1:1 ratio (Bahlo et al., 2018). As well, the current technology is most effective at looking for repeat expansions of targeted regions and does not work well on a genome-wide scale. Therefore, traditional methods of repeat expansion sizing still need to be implemented. As well, WGS is more expensive than traditional tests and produces a large amount of data which will need to be stored indefinitely (Lionel et al., 2018). Analyzing this data requires substantial computational power, plus the added human hours needed to filter and classify vast numbers of variants (Lelieveld et al., 2015). This goes to show that no individual technology is perfect; therefore, combinations of diagnostic methods must be employed in order to maximize patient diagnoses. For this reason, we are proposing ES reanalysis at regular intervals as a first-tier investigation, followed by WGS for individuals who still remain negative and appear as if they may benefit from a more in-depth analysis.

5.4. Significance

Genetic diagnoses of patients lead to better, more targeted patient care. Several examples of potential improvements to patient care can be found in this study. In case 26, a sibling pair was suspected to share the same connective tissue disorder with variable presentations. The younger brother presented with proteinuria and retinal detachments, while the older brother has high frequency hearing loss. Both brothers have hyperflexibility and high myopia. Even though the older brother has not shown signs of kidney disease or retinopathy, both brothers are being extensively followed by nephrology and ophthalmology, due to the assumption that they have the same underlying etiology. However, in this study, we discovered that the younger brother with proteinuria has homozygous variants in a gene associated with kidney disease (LAMB2), whereas the older brother is heterozygous for this variant. If this variant is confirmed to be the cause of proteinuria and retinal detachment in the younger brother, then the older brother is no longer considered to be at an increased risk of developing proteinuria and retinal detachment and would therefore no longer need to be followed by a nephrologist.

Similarly, through ES reanalysis we have been able to confirm the diagnosis of NAIC in case 17, a young boy of First Nations descent with dysmorphic features and liver disease. Clinicians were suspicious of fetal alcohol syndrome (FAS) in this patient, making it difficult to determine which elements of his disease were due to FAS and which were due to a different underlying genetic cause. NAIC is a non-syndromic liver disease that often leads to liver transplant. Based on this diagnosis, we can predict the course of disease for this participant and can also more confidently say that his dysmorphic features are symptoms of FAS.

We have also found candidate variants in 7 genes that have not previously been associated with disease. These candidate variants can open the door to future collaborations and may not only expand our current knowledge of genetics but may also aid in finding diagnoses for other patients with variants in the same genes. For example, in case 22 homozygous variants were found in the gene ZNFX1 for a young girl of First Nations descent with recurrent illness. Not only did a literature review of this gene raise strong suspicion of an immune disorder, but matchmaking has also allowed for collaboration with international researchers who have a cohort of other patients with similar symptoms and variants in the same gene. As well, the presence of homozygous variants in this patient is suggestive of a potential founder mutation. Therefore, clinicians may be better able to diagnose future patients of First Nations descent who present with similar symptoms. In addition, based on the experiences of some of the other patients in this international cohort, a bone marrow transplant may be the next step for our patient, which is something that likely would not have been considered without a clear diagnosis.

5.5. Study Strengths

ES has many challenges which systematic reanalysis attempts to overcome. Potential limitations in initial ES analysis include missing phenotypic information, bioinformatic limitations, initial misinterpretation of variants, and new disease-gene associations (Salmon et al., 2019; Wenger et al., 2017). This study attempted to overcome each of these pitfalls in various ways.

First, thorough chart reviews were performed by the student PI to develop an initial list of phenotypic terms. These terms were then reviewed with the referring clinician in order to make sure they were as thorough and accurate as possible. As well, the referring clinician was present during exome rounds. This enabled a discussion of variants found in genes that did not seem to fit the phenotypic picture of the participant, and the clinician was able to clarify if any information was missing. As well, we were able to discuss the molecular pathway of novel disease-genes to see if they were consistent with the type of disease seen.

Second, the bcbio-nextgen pipeline uses 4 variant callers and reports any variants that were seen by 2 or more callers. Similarly, the pipeline uses annotations from both VEP-Ensembl (McLaren et al., 2016) and RefSeq (https://www.ncbi.nlm.nih.gov/refseq/). Using multiple variant callers and annotation tools overcomes many of the biases that may be present when using just one variant caller. Potential disease-causing variants that may have been filtered by one pipeline can now be analyzed. As well, using multiple annotation tools ensures that the most up-to-date and relevant information is being used in the interpretation of variant pathogenicity.
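The "2 or more of 4 callers" rule amounts to a simple vote across call sets. The following sketch illustrates that idea only; it is not the bcbio-nextgen implementation, and the caller names, the variant key (chromosome, position, reference allele, alternate allele), and the function name are hypothetical placeholders.

```python
from collections import Counter


def consensus_variants(calls_by_caller, min_callers=2):
    """Return the variants reported by at least `min_callers` callers."""
    support = Counter()
    for variants in calls_by_caller.values():
        support.update(set(variants))  # each caller votes at most once per variant
    return {variant for variant, votes in support.items() if votes >= min_callers}


# Hypothetical call sets; each variant is keyed as (chrom, pos, ref, alt).
calls = {
    "caller_a": [("chr1", 1000, "A", "G"), ("chr2", 500, "C", "T")],
    "caller_b": [("chr1", 1000, "A", "G")],
    "caller_c": [("chr3", 42, "G", "GA"), ("chr2", 500, "C", "T")],
    "caller_d": [("chr4", 7, "T", "C")],
}

print(sorted(consensus_variants(calls)))
```

Here the variants on chr1 and chr2 are retained because two callers agree on each, while the singleton calls on chr3 and chr4 are dropped. Raising `min_callers` trades sensitivity for specificity.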


Third, the filters used in this analysis were designed to assess for variants that had the highest likelihood of causing the disease seen in each participant. This was done by filtering for variants in genes related to the Human Phenotype Ontology (HPO) terms recorded for each participant. This ensured that any variants that may have been initially misinterpreted or any new disease-gene associations related to the participant’s phenotype were thoroughly assessed. As well, each case was considered on an individual basis, and extra filters were applied when necessary. For example, if trio sequencing was done with unaffected parents, then filters to assess for de novo variants were applied. This enabled the researchers to thoroughly assess those variants that were most likely to cause the disease of interest for each participant.
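Phenotype-driven filtering of this kind can be sketched as a lookup from phenotype terms to genes followed by a gene-level filter. The example below is a minimal illustration and not the filter used in this study: the term labels are placeholders standing in for real HPO identifiers, the term-to-gene mapping is a hand-written toy rather than an HPO annotation file, and the function names and variant fields are hypothetical.

```python
def genes_for_terms(patient_terms, term_to_genes):
    """Collect every gene linked to at least one of the participant's terms."""
    genes = set()
    for term in patient_terms:
        genes |= term_to_genes.get(term, set())
    return genes


def filter_by_phenotype(variants, patient_terms, term_to_genes):
    """Keep only variants that fall in phenotype-related genes."""
    panel = genes_for_terms(patient_terms, term_to_genes)
    return [variant for variant in variants if variant["gene"] in panel]


# Toy mapping with placeholder term labels (not real HPO identifiers).
TERM_TO_GENES = {
    "TERM_PROTEINURIA": {"LAMB2"},
    "TERM_HIGH_MYOPIA": {"COL9A3", "LAMB2"},
    "TERM_SEIZURES": {"GENE_X"},
}

variants = [
    {"gene": "LAMB2", "id": "var1"},
    {"gene": "COL9A3", "id": "var2"},
    {"gene": "GENE_Y", "id": "var3"},
]

kept = filter_by_phenotype(variants, ["TERM_PROTEINURIA", "TERM_HIGH_MYOPIA"], TERM_TO_GENES)
print([variant["id"] for variant in kept])
```

Because the gene panel is rebuilt from the current term-to-gene mapping each time, rerunning the same filter after the mapping has been updated is one way newly described disease-gene associations can surface during reanalysis.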

Finally, a strength of this study was the interpretation of variants under a research lens. When ES analysis is done in a purely clinical laboratory, only variants in known causative genes related to the participant’s phenotype can be reported. When ES analysis is done under a research lens, we can thoroughly assess variants of interest in novel genes that have not previously been described as associated with disease. This paves the way for potential gene discovery and collaborations with other clinicians and researchers. However, it should be noted that those participants who were initially recruited from the Care4Rare cohort were also initially assessed on a research basis. Therefore, this added strength only applies to those participants recruited from the Winnipeg Regional Health Authority who had initially received clinical ES analysis.

5.6. Limitations

ES in this study generated an average of over 600 low frequency variants for each participant. It would be extremely time consuming to analyze every variant for each participant in this study; therefore, filters were used to determine which variants were of interest to analyze further. These filters are based on assumptions about which variants are more or less likely to be pathogenic. However, it is possible that these filters may have missed causative variants. For example, the filters used in this study focus more heavily on likely loss-of-function variants, and therefore may have missed causative missense variants. We are hopeful that the use of exome rounds has reduced the risk of missing variants of interest, by having multiple researchers look at the same data. We acknowledge, however, that variant filtration is a limitation of this study.
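The trade-off described above can be made concrete with a toy filter. The sketch below is a deliberately strict illustration, not the filter set used in this study: it assumes a 1% allele frequency cut-off, treats only four consequence terms as loss-of-function, and uses hypothetical field and function names. The consequence labels follow the Sequence Ontology terms commonly reported by annotation tools.

```python
# Consequence terms treated as likely loss-of-function in this sketch (an assumption).
LOF_CONSEQUENCES = {
    "stop_gained",
    "frameshift_variant",
    "splice_donor_variant",
    "splice_acceptor_variant",
}


def passes_filter(variant, max_allele_frequency=0.01, lof_only=True):
    """Return True when a variant is rare and, optionally, likely loss-of-function."""
    if variant["allele_frequency"] > max_allele_frequency:
        return False
    if lof_only and variant["consequence"] not in LOF_CONSEQUENCES:
        return False
    return True


variants = [
    {"id": "rare_frameshift", "allele_frequency": 0.0001, "consequence": "frameshift_variant"},
    {"id": "rare_missense", "allele_frequency": 0.0001, "consequence": "missense_variant"},
    {"id": "common_stop", "allele_frequency": 0.05, "consequence": "stop_gained"},
]

strict = [v["id"] for v in variants if passes_filter(v)]
relaxed = [v["id"] for v in variants if passes_filter(v, lof_only=False)]
print(strict)
print(relaxed)
```

Under the strict setting the rare missense variant is discarded even though it could be causative, which is the kind of miss the paragraph above warns about; relaxing the consequence requirement recovers it at the cost of a longer review list.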

In addition, each variant is at the mercy of the subjective interpretation of the individual analyzing the data. To combat this, the American College of Medical Genetics and Genomics (ACMG) has released guidelines on how to interpret the pathogenicity of variants, which we used during this study (Richards et al., 2015). However, all of the variants interpreted in this study were considered variants of uncertain significance, including all variants in genes of uncertain significance. It was then left up to the subjective interpretation of the individual analyzing the variant to decide if this variant was worth pursuing further with functional studies. The use of exome rounds allowed each variant to be interpreted by multiple individuals in an attempt to overcome this potential limitation.

Similarly, ES analysis is limited by the software available. All bioinformatics pipelines have their limitations and different tools will prioritize variant calling in different ways. In order to overcome this, we used the bcbio-nextgen pipeline (https://github.com/bcbio/bcbio-nextgen), which jointly uses four variant callers and reports any variants that have been observed by two or more callers. However, it is still possible for true variants to have been missed using this technology. As well, much of the evidence collected to determine pathogenicity of variants is based on computer software algorithms, such as in silico predictions of pathogenicity. These predictions are not perfect and are only one criterion recommended by the ACMG to determine pathogenicity. Nevertheless, an in silico prediction of no deleterious effect on gene function may cause a variant to not be scrutinized any further by the individual analyzing the data.


Due to time constraints, we were unable to confirm diagnoses in patients before study completion. Therefore, it is difficult for us to compare our success rate with those presented in the literature. Despite this, we are confident that this study will lead to increased diagnoses for patients in Manitoba and has successfully shown the utility of ES reanalysis in this population.

Finally, the sample size in this study was too small to make generalizations about trends in the data that are more likely to lead to diagnoses during ES reanalysis. However, this was a pilot study, meant simply to show that ES reanalysis has utility in Manitoba. It is important to view each study under the specific lens of what it is testing. Because this is a pilot study, the sample size was kept low in order to only assess what initial clinical utility there may be to ES reanalysis. Similarly, only negative ES were analyzed in order to determine if additional diagnoses could be made; therefore, no positive ES were analyzed in this study. Finally, participants were selected by the research team based on which samples were most likely to lead to a new diagnosis, and therefore may not be representative of a complete, unbiased ES reanalysis cohort. Future studies including larger cohorts of participants may be better suited to assess trends in diagnoses through ES reanalysis.

5.7. Conclusions

This study has demonstrated that there is utility in systematically reanalyzing ES data in Manitoban patients. Reanalyzing already generated ES data is an efficient way to increase genetic diagnoses for patients in Manitoba. An understanding of the underlying molecular mechanism can lead to more targeted treatment and surveillance. Many of these families can now be counselled on recurrence risks, risks to other family members, and prenatal options. Finally, this study has opened the door to international scientific collaborations and gene discovery efforts. Not only does gene discovery increase our overall knowledge of human genetics, but it also allows us to more appropriately care for those patients who have variants in these newly discovered genes.


REFERENCES

Al-Mehmadi, S., Splitt, M., Ramesh, V., DeBrosse, S., Dessoffy, K., Xia, F., … Minassian, B. A. (2016). FHF1 (FGF12) epileptic encephalopathy. Neurology: Genetics, 2(6), e115. https://doi.org/10.1212/NXG.0000000000000115

Al-Murshedi, F., Meftah, D., & Scott, P. (2019). Underdiagnoses resulting from variant misinterpretation: Time for systematic reanalysis of whole exome data? European Journal of Medical Genetics, 62(1), 39–43. https://doi.org/10.1016/j.ejmg.2018.04.016

Alfares, A., Aloraini, T., subaie, L. Al, Alissa, A., Qudsi, A. Al, Alahmad, A., … Alfadhel, M. (2018). Whole-genome sequencing offers additional but limited clinical utility compared with reanalysis of whole-exome sequencing. Genetics in Medicine, 20(11), 1328–1333. https://doi.org/10.1038/gim.2018.41

Amendola, L. M., Jarvik, G. P., Leo, M. C., McLaughlin, H. M., Akkari, Y., Amaral, M. D., … Rehm, H. L. (2016). Performance of ACMG-AMP Variant-Interpretation Guidelines among Nine Laboratories in the Clinical Sequencing Exploratory Research Consortium. American Journal of Human Genetics, 98(6), 1067–1076. https://doi.org/10.1016/j.ajhg.2016.03.024

Aristidou, C., Theodosiou, A., Bak, M., Mehrjouy, M. M., Constantinou, E., Alexandrou, A., … Sismani, C. (2018). Position effect, cryptic complexity, and direct gene disruption as disease mechanisms in de novo apparently balanced translocation cases. PLoS ONE, 13(10), e0205298. https://doi.org/10.1371/journal.pone.0205298

Bahlo, M., Bennett, M. F., Degorski, P., Tankard, R. M., Delatycki, M. B., & Lockhart, P. J. (2018). Recent advances in the detection of repeat expansions with short-read next-generation sequencing. F1000Research, 7, F1000 Faculty Rev-736. https://doi.org/10.12688/f1000research.13980.1

Baldovino, S., Moliner, A. M., Taruscio, D., Daina, E., & Roccatello, D. (2016). Rare diseases in Europe: From a wide to a local perspective. Israel Medical Association Journal, 18(6), 359–363.

Belkadi, A., Bolze, A., Itan, Y., Cobat, A., Vincent, Q. B., Antipenko, A., … Abel, L. (2015). Whole-genome sequencing is more powerful than whole-exome sequencing for detecting exome variants. Proceedings of the National Academy of Sciences of the United States of America, 112(17), 5473–5478. https://doi.org/10.1073/pnas.1418631112

Boycott, K., Hartley, T., Adam, S., Bernier, F., Chong, K., Fernandez, B. A., … Armour, C. M. (2015). The clinical application of genome-wide sequencing for monogenic diseases in Canada: Position statement of the Canadian College of Medical Geneticists. Journal of Medical Genetics, 52(7), 431–437. https://doi.org/10.1136/jmedgenet-2015-103144

Brede, G., Solheim, J., & Prydz, H. (2002). PSKH1, a novel splice factor compartment-associated serine kinase. Nucleic Acids Research, 30(23), 5301–5309. https://doi.org/10.1093/nar/gkf648

Brede, G., Solheim, J., Stang, E., & Prydz, H. (2003). Mutants of the protein serine kinase PSKH1 disassemble the Golgi apparatus. Experimental Cell Research, 291(2), 299–312. https://doi.org/10.1016/j.yexcr.2003.07.009

Buske, O. J., Girdea, M., Dumitriu, S., Gallinger, B., Hartley, T., Trang, H., … Brudno, M. (2015). PhenomeCentral: A Portal for Phenotypic and Genotypic Matchmaking of Patients with Rare Genetic Diseases. Human Mutation, 36(10), 931–940. https://doi.org/10.1002/humu.22851


Canadian Organization for Rare Disorders. (2015). Now is the Time: A Strategy for Rare Diseases is a Strategy for all Canadians. Toronto.

Carter, H., Douville, C., Stenson, P. D., Cooper, D. N., & Karchin, R. (2013). Identifying Mendelian disease genes with the variant effect scoring tool. BMC Genomics, 14 Suppl 3, S3. https://doi.org/10.1186/1471-2164-14-s3-s3

Ceyhan-Birsoy, O., Agrawal, P. B., Hidalgo, C., Schmitz-Abe, K., Dechene, E. T., Swanson, L. C., … Beggs, A. H. (2013). Recessive truncating titin gene, TTN, mutations presenting as centronuclear myopathy. Neurology, 81(14), 1205–1214. https://doi.org/10.1212/WNL.0b013e3182a6ca62

Chagnon, P., Michaud, J., Mitchell, G., Mercier, J., Marion, J. F., Drouin, E., … Richter, A. (2002). A missense mutation (R565W) in Cirhin (FLJ14728) in North American Indian childhood cirrhosis. American Journal of Human Genetics, 71(6), 1443–1449. https://doi.org/10.1086/344580

Chang, F., Liu, L., Fang, E., Zhang, G., Chen, T., Cao, K., … Li, M. M. (2017). Molecular Diagnosis of Mosaic Overgrowth Syndromes Using a Custom-Designed Next-Generation Sequencing Panel. Journal of Molecular Diagnostics, 19(4), 613–624. https://doi.org/10.1016/j.jmoldx.2017.04.006

Chang, S., Vaccarella, L., Olatunji, S., Cebulla, C., & Christoforidis, J. (2011). Diagnostic Challenges in Retinitis Pigmentosa: Genotypic Multiplicity and Phenotypic Variability. Current Genomics, 12(4), 267–275. https://doi.org/10.2174/138920211795860116

Chappell, C., Hanakahi, L. A., Karimi-Busheri, F., Weinfeld, M., & West, S. C. (2002). Involvement of human polynucleotide kinase in double-strand break repair by non-homologous end joining. EMBO Journal, 21(11), 2827–2832. https://doi.org/10.1093/emboj/21.11.2827

Chauveau, C., Rowell, J., & Ferreiro, A. (2014). A rising titan: TTN review and mutation update. Human Mutation, 35(9), 1046–1059. https://doi.org/10.1002/humu.22611

Chi, T. (2004). A BAF-centred view of the immune system. Nature Reviews Immunology, 4(12), 965–977. https://doi.org/10.1038/nri1501

Cushion, T. D., Paciorkowski, A. R., Pilz, D. T., Mullins, J. G. L., Seltzer, L. E., Marion, R. W., … Dobyns, W. B. (2014). De novo mutations in the beta-tubulin gene TUBB2A cause simplified gyral patterning and infantile-onset epilepsy. American Journal of Human Genetics, 94(4), 634–641. https://doi.org/10.1016/j.ajhg.2014.03.009

De Ligt, J., Willemsen, M. H., Van Bon, B. W. M., Kleefstra, T., Yntema, H. G., Kroes, T., … Vissers, L. E. L. M. (2012). Diagnostic exome sequencing in persons with severe intellectual disability. New England Journal of Medicine, 367(20), 1921–1929. https://doi.org/10.1056/NEJMoa1206524

Dendrou, C. A., Petersen, J., Rossjohn, J., & Fugger, L. (2018). HLA variation and disease. Nature Reviews Immunology, 18(5), 325–339. https://doi.org/10.1038/nri.2017.143

Ebrahimi-Fakhari, D., Saffari, A., Wahlster, L., Lu, J., Byrne, S., Hoffmann, G. F., … Sahin, M. (2016). Congenital disorders of autophagy: An emerging novel class of inborn errors of neuro-metabolism. Brain, 139(2), 317–337. https://doi.org/10.1093/brain/awv371

Eldomery, M. K., Coban-Akdemir, Z., Harel, T., Rosenfeld, J. A., Gambin, T., Stray-Pedersen, A., … Lupski, J. R. (2017). Lessons learned from additional research analyses of unsolved clinical exome cases. Genome Medicine, 9(1), 26. https://doi.org/10.1186/s13073-017-0412-6


Elsen, G. E., Bedogni, F., Hodge, R. D., Bammler, T. K., MacDonald, J. W., Lindtner, S., … Hevner, R. F. (2018). The epigenetic factor landscape of developing neocortex is regulated by transcription factors Pax6→ Tbr2→ Tbr1. Frontiers in Neuroscience, 12, 571. https://doi.org/10.3389/fnins.2018.00571

Ewans, L. J., Schofield, D., Shrestha, R., Zhu, Y., Gayevskiy, V., Ying, K., … Roscioli, T. (2018). Whole-exome sequencing reanalysis at 12 months boosts diagnosis and is cost-effective when applied early in Mendelian disorders. Genetics in Medicine, 20(12), 1564–1574. https://doi.org/10.1038/gim.2018.39

Faletra, F., D’Adamo, A. P., Bruno, I., Athanasakis, E., Biskup, S., Esposito, L., & Gasparini, P. (2014). Autosomal recessive stickler syndrome due to a loss of function mutation in the COL9A3 gene. American Journal of Medical Genetics, Part A, 164(1), 42–47. https://doi.org/10.1002/ajmg.a.36165

Fatica, A., Oeffinger, M., Dlakić, M., & Tollervey, D. (2003). Nob1p Is Required for Cleavage of the 3′ End of 18S rRNA. Molecular and Cellular Biology, 23(5), 1798–1807. https://doi.org/10.1128/mcb.23.5.1798-1807.2003

Ferreiro, A., Estournet, B., Chateau, D., Romero, N. B., Laroche, C., Odent, S., … Fardeau, M. (2000). Multi-minicore disease-searching for boundaries: Phenotype analysis of 38 cases. Annals of Neurology, 48(5), 745–757. https://doi.org/10.1002/1531-8249(200011)48:5<745::AID-ANA8>3.0.CO;2-F

Frappaolo, A., Sechi, S., Kumagai, T., Robinson, S., Fraschini, R., Karimpour-Ghahnavieh, A., … Giansanti, M. G. (2017). COG7 deficiency in Drosophila generates multifaceted developmental, behavioral and protein glycosylation phenotypes. Journal of Cell Science, 130(21), 3637–3649. https://doi.org/10.1242/jcs.209049

Garrison, E., & Marth, G. (2012). Haplotype-based variant detection from short-read sequencing. ArXiv Preprint, arXiv, 1207.3907. Retrieved from http://arxiv.org/abs/1207.3907

Genesio, R., Melis, D., Gatto, S., Izzo, A., Ronga, V., Cappuccio, G., … Nitsch, L. (2011). Variegated silencing through epigenetic modifications of a large Xq region in a case of balanced X;2 translocation with Incontinentia Pigmenti-like Phenotype. Epigenetics, 6(10), 1242–1247. https://doi.org/10.4161/epi.6.10.17698

Gilissen, C., Hehir-Kwa, J. Y., Thung, D. T., Van De Vorst, M., Van Bon, B. W. M., Willemsen, M. H., … Veltman, J. A. (2014). Genome sequencing identifies major causes of severe intellectual disability. Nature, 511(7509), 344–347. https://doi.org/10.1038/nature13394

Green, R. C., Berg, J. S., Grody, W. W., Kalia, S. S., Korf, B. R., Martin, C. L., … Biesecker, L. G. (2013). ACMG recommendations for reporting of incidental findings in clinical exome and genome sequencing. Genetics in Medicine, 15(7), 565–574. https://doi.org/10.1038/gim.2013.73

Groenestege, W. M. T., Thébault, S., Van Der Wijst, J., Van Den Berg, D., Janssen, R., Tejpar, S., … Bindels, R. J. (2007). Impaired basolateral sorting of pro-EGF causes isolated recessive renal hypomagnesemia. Journal of Clinical Investigation, 117(8), 2260–2267. https://doi.org/10.1172/JCI31680

Grogan, A., & Kontrogianni-Konstantopoulos, A. (2019). Unraveling obscurins in heart disease. Pflugers Archiv European Journal of Physiology. https://doi.org/10.1007/s00424-018-2191-3

115

Guella, I., Huh, L., McKenzie, M. B., Toyota, E. B., Martina Bebin, E., Thompson, M. L., … Demos, M. (2016). De novo FGF12 mutation in 2 patients with neonatal-onset epilepsy. Neurology: Genetics, 2(6), e120. https://doi.org/10.1212/NXG.0000000000000120 Hamilton, M. J., Newbury-Ecob, R., Holder-Espinasse, M., Yau, S., Lillis, S., Hurst, J. A., … Suri, M. (2016). Rubinstein-Taybi syndrome type 2: Report of nine new cases that extend the phenotypic and genotypic spectrum. Clinical Dysmorphology, 25(4), 135–145. https://doi.org/10.1097/MCD.0000000000000143 Hanson-Kahn, A., Li, B., Cohn, D. H., Nickerson, D. A., Bamshad, M. J., & Hudgins, L. (2018). Autosomal recessive Stickler syndrome resulting from a COL9A3 mutation. American Journal of Medical Genetics, Part A, 176(12), 2887–2891. https://doi.org/10.1002/ajmg.a.40647 Harewood, L., & Fraser, P. (2014). The impact of chromosomal rearrangements on regulation of . Human Molecular Genetics, 23(R1), R76-82. https://doi.org/10.1093/hmg/ddu278 Hazarika, P., Dham, N., Patel, P., Cho, M., Weidner, D., Goldsmith, L., & Duvic, M. (1999). Flotillin 2 is distinct from epidermal surface antigen (ESA) and is associated with filopodia formation. Journal of Cellular Biochemistry, 75(1), 147–159. https://doi.org/10.1002/(SICI)1097- 4644(19991001)75:1<147::AID-JCB15>3.0.CO;2-D Huang, L., Solursh, M., & Sandra, A. (1996). The role of transforming growth factor alpha in rat craniofacial development and chondrogenesis. Journal of Anatomy, 189(1), 73–86. Ioannidis, N. M., Rothstein, J. H., Pejaver, V., Middha, S., McDonnell, S. K., Baheti, S., … Sieh, W. (2016). REVEL: An Ensemble Method for Predicting the Pathogenicity of Rare Missense Variants. American Journal of Human Genetics, 99(4), 877–885. https://doi.org/10.1016/j.ajhg.2016.08.016 Jeong, S. M., Lee, C., Lee, S. K., Kim, J., & Seong, R. H. (2010). The SWI/SNF chromatin-remodeling complex modulates peripheral T cell activation and proliferation by controlling AP-1 expression. 
Journal of Biological Chemistry, 285(4), 2340–2350. https://doi.org/10.1074/jbc.M109.026997 Judith, D., Jefferies, H. B. J., Boeing, S., Frith, D., Snijders, A. P., & Tooze, S. A. (2019). ATG9A shapes the forming autophagosome through Arfaptin 2 and phosphatidylinositol 4-kinase IIIβ. Journal of Cell Biology, 218(5), 1634–1652. https://doi.org/10.1083/jcb.201901115 Kadoch, C., Hargreaves, D. C., Hodges, C., Elias, L., Ho, L., Ranish, J., & Crabtree, G. R. (2013). Proteomic and bioinformatic analysis of mammalian SWI/SNF complexes identifies extensive roles in human malignancy. Nature Genetics, 45(6), 592–601. https://doi.org/10.1038/ng.2628 Kaeser, M. D., Aslanian, A., Dong, M. Q., Yates, J. R., & Emerson, B. M. (2008). BRD7, a novel PBAF- specific SWI/SNF subunit, is required for target gene activation and repression in embryonic stem cells. Journal of Biological Chemistry, 283(47), 32254–32263. https://doi.org/10.1074/jbc.M806061200 Kalia, S. S., Adelman, K., Bale, S. J., Chung, W. K., Eng, C., Evans, J. P., … Miller, D. T. (2017). Recommendations for reporting of secondary findings in clinical exome and genome sequencing, 2016 update (ACMG SF v2.0): A policy statement of the American College of Medical Genetics and Genomics. Genetics in Medicine. https://doi.org/10.1038/gim.2016.190 Kato, T., Daigo, Y., Hayama, S., Ishikawa, N., Yamabuki, T., Ito, T., … Nakamura, Y. (2005). A novel human tRNA-dihydrouridine synthase involved in pulmonary carcinogenesis. Cancer Research, 65(13), 5638–5646. https://doi.org/10.1158/0008-5472.CAN-05-0600

116

Köhler, S., Carmody, L., Vasilevsky, N., Jacobsen, J. O. B., Danis, D., Gourdine, J. P., … Robinson, P. N. (2019). Expansion of the Human Phenotype Ontology (HPO) knowledge base and resources. Nucleic Acids Research. https://doi.org/10.1093/nar/gky1105 Kubo, M., Hata, J., Ninomiya, T., Matsuda, K., Yonemoto, K., Nakano, T., … Kiyohara, Y. (2007). A nonsynonymous SNP in PRKCH (protein kinase C η) increases the risk of cerebral infarction. Nature Genetics, 39(2), 212–217. https://doi.org/10.1038/ng1945 Kumar, P., Henikoff, S., & Ng, P. C. (2009). Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nature Protocols, 4(7), 1073–1082. https://doi.org/10.1038/nprot.2009.86 Lamus, F., Martín, C., Carnicero, E., Moro, J. A., Fernández, J. M. F., Mano, A., … Alonso, M. I. (2020). FGF2/EGF contributes to brain neuroepithelial precursor proliferation and neurogenesis in rat embryos: the involvement of embryonic cerebrospinal fluid. Developmental Dynamics, 249(1), 141–153. https://doi.org/10.1002/dvdy.135 Lee, H., Deignan, J. L., Dorrani, N., Strom, S. P., Kantarci, S., Quintero-Rivera, F., … Nelson, S. F. (2014). Clinical exome sequencing for genetic identification of rare mendelian disorders. JAMA - Journal of the American Medical Association, 312(18), 1880–1887. https://doi.org/10.1001/jama.2014.14604 Lek, M., Karczewski, K. J., Minikel, E. V., Samocha, K. E., Banks, E., Fennell, T., … Williams, A. L. (2016). Analysis of protein-coding genetic variation in 60,706 humans. Nature, 536(7616), 285–291. https://doi.org/10.1038/nature19057 Lelieveld, S. H., Spielmann, M., Mundlos, S., Veltman, J. A., & Gilissen, C. (2015). Comparison of Exome and Genome Sequencing Technologies for the Complete Capture of Protein-Coding Regions. Human Mutation, 36(8), 815–822. https://doi.org/10.1002/humu.22813 Li, H., & Durbin, R. (2010). Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics, 26(5), 589–595. 
https://doi.org/10.1093/bioinformatics/btp698 Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., … Durbin, R. (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics, 25(16), 2078–2079. https://doi.org/10.1093/bioinformatics/btp352 Li, T., You, H., Zhang, J., Mo, X., He, W., Chen, Y., … Hu, Z. (2014). Study of GOLPH3: A potential stress- inducible protein from Golgi apparatus. Molecular Neurobiology, 49(3), 1449–1459. https://doi.org/10.1007/s12035-013-8624-2 Lionel, A. C., Costain, G., Monfared, N., Walker, S., Reuter, M. S., Hosseini, S. M., … Marshall, C. R. (2018). Improved diagnostic yield compared with targeted gene sequencing panels suggests a role for whole-genome sequencing as a first-tier genetic test. Genetics in Medicine, 20(4), 435–443. https://doi.org/10.1038/gim.2017.119 Marston, S., Montgiraud, C., Munster, A. B., Copeland, O., Choi, O., Dos Remedios, C., … Knöll, R. (2015). OBSCN mutations associated with dilated cardiomyopathy and haploinsufficiency. PLoS ONE. https://doi.org/10.1371/journal.pone.0138568 Martinez-Gallo, M., Radigan, L., Almejún, M. B., Martínez-Pomar, N., Matamoros, N., & Cunningham- Rundles, C. (2013). TACI mutations and impaired B-cell function in subjects with CVID and healthy heterozygotes. Journal of Allergy and Clinical Immunology, 131(2), 468–476. https://doi.org/10.1016/j.jaci.2012.10.029

117

Matthijs, G., Souche, E., Alders, M., Corveleyn, A., Eck, S., Feenstra, I., … Bauer, P. (2016). Guidelines for diagnostic next-generation sequencing. European Journal of Human Genetics, 24(1), 2–5. https://doi.org/10.1038/ejhg.2015.226 McCandless, S. E., Brunger, J. W., & Cassidy, S. B. (2004). The Burden of Genetic Disease on Inpatient Care in a Children’s Hospital. American Journal of Human Genetics, 74(1), 121–127. https://doi.org/10.1086/381053 McLaren, W., Gil, L., Hunt, S. E., Riat, H. S., Ritchie, G. R. S., Thormann, A., … Cunningham, F. (2016). The Ensembl Variant Effect Predictor. Genome Biology, 17(1), 122. https://doi.org/10.1186/s13059- 016-0974-4 Merico, D., Sharfe, N., Hu, P., Herbrick, J.-A., & Roifman, C. M. (2015). RelB deficiency causes combined immunodeficiency. LymphoSign Journal, 2(3), 147–155. https://doi.org/10.14785/lpsn-2015-0005 Meuwissen, M. E. C., Schot, R., Buta, S., Oudesluijs, G., Tinschert, S., Speer, S. D., … Mancini, G. M. S. (2016). Human USP18 deficiency underlies type 1 interferonopathy leading to severe pseudo-TOR CH syndrome. Journal of Experimental Medicine, 213(7), 1163–1174. https://doi.org/10.1084/jem.20151529 Middeljans, E., Wan, X., Jansen, P. W., Sharma, V., Stunnenberg, H. G., & Logie, C. (2012). SS18 together with animal-specific factors defines human BAF-type SWI/SNF complexes. PLoS ONE, 7(3), e33834. https://doi.org/10.1371/journal.pone.0033834 Miller, D. T., Adam, M. P., Aradhya, S., Biesecker, L. G., Brothman, A. R., Carter, N. P., … Ledbetter, D. H. (2010). Consensus Statement: Chromosomal Microarray Is a First-Tier Clinical Diagnostic Test for Individuals with Developmental Disabilities or Congenital Anomalies. American Journal of Human Genetics, 86(5), 749–764. https://doi.org/10.1016/j.ajhg.2010.04.006 Mitani, T., Punetha, J., Akalin, I., Pehlivan, D., Dawidziuk, M., Coban Akdemir, Z., … Gawlinski, P. (2019). Bi-allelic Pathogenic Variants in TUBGCP2 Cause Microcephaly and Lissencephaly Spectrum Disorders. 
American Journal of Human Genetics, 105(5), 1005–1015. https://doi.org/10.1016/j.ajhg.2019.09.017 Mohney, B. G., Pulido, J. S., Lindor, N. M., Hogan, M. C., Consugar, M. B., Peters, J., … Harris, P. C. (2011). A novel mutation of LAMB2 in a multigenerational mennonite family reveals a new phenotypic variant of Pierson syndrome. Ophthalmology, 118(6), 1137–1144. https://doi.org/10.1016/j.ophtha.2010.10.009 Monroe, G. R., Frederix, G. W., Savelberg, S. M. C., De Vries, T. I., Duran, K. J., Van Der Smagt, J. J., … Van Haaften, G. (2016). Effectiveness of whole-exome sequencing and costs of the traditional diagnostic trajectory in children with intellectual disability. Genetics in Medicine, 18(9), 949–956. https://doi.org/10.1038/gim.2015.200 Myers, K. A., Johnstone, D. L., & Dyment, D. A. (2019). Epilepsy genetics: Current knowledge, applications, and future directions. Clinical Genetics, 95(1), 95–111. https://doi.org/10.1111/cge.13414 Myllykangas, S., Natsoulis, G., Bell, J. M., & Ji, H. P. (2011). Targeted sequencing library preparation by genomic DNA circularization. BMC Biotechnology, 11, 122. https://doi.org/10.1186/1472-6750-11- 122 Nambot, S., Thevenon, J., Kuentz, P., Duffourd, Y., Tisserant, E., Bruel, A. L., … Thauvin-Robinet, C.

118

(2018). Clinical whole-exome sequencing for the diagnosis of rare disorders with congenital anomalies and/or intellectual disability: Substantial interest of prospective annual reanalysis. Genetics in Medicine, 20(6), 645–654. https://doi.org/10.1038/gim.2017.162 Nilipour, Y., Nafissi, S., Tjust, A. E., Ravenscroft, G., Hossein Nejad Nedai, H., Taylor, R. L., … Tajsharghi, H. (2018). Ryanodine receptor type 3 (RYR3) as a novel gene associated with a myopathy with nemaline bodies. European Journal of Neurology, 25(6), 841–847. https://doi.org/10.1111/ene.13607 O’Grady, G. L., Lek, M., Lamande, S. R., Waddell, L., Oates, E. C., Punetha, J., … North, K. (2016). Diagnosis and etiology of congenital muscular dystrophy: We are halfway there. Annals of Neurology, 80(1), 101–111. https://doi.org/10.1002/ana.24687 O’Keefe, E., Hollenberg, M. D., & Cuatrecasas, P. (1974). Epidermal growth factor. Characteristics of specific binding in membranes from liver, placenta, and other target tissues. Archives of Biochemistry and Biophysics, 164(2), 518–526. https://doi.org/10.1016/0003-9861(74)90062-9 Oktia, R., & Okita, J. (2005). Cytochrome P450 4A Fatty Acid Omega Hydroxylases. Current Drug Metabolism, 2(3), 265–281. https://doi.org/10.2174/1389200013338423 Pan-Hammarström, Q., Salzer, U., Du, L., Björkander, J., Cunningham-Rundles, C., Nelson, D. L., … Hammarström, L. (2007). Reexamining the role of TACI coding variants in common variable immunodeficiency and selective IgA deficiency. Nature Genetics, 39(4), 429–430. https://doi.org/10.1038/ng0407-429 Paprocka, J., Jezela-Stanek, A., Koppolu, A., Rydzanicz, M., Kosińska, J., Stawiński, P., & Płoski, R. (2019). FGF12p.Gly112Ser variant as a cause of phenytoin/phenobarbital responsive epilepsy. Clinical Genetics, 96(3), 274–275. https://doi.org/10.1111/cge.13592 Philippakis, A. A., Azzariti, D. R., Beltran, S., Brookes, A. J., Brownstein, C. A., Brudno, M., … Rehm, H. L. (2015). 
The Matchmaker Exchange: A Platform for Rare Disease Gene Discovery. Human Mutation, 36(10), 915–921. https://doi.org/10.1002/humu.22858 Phillips, M. S., Fujii, J., Khanna, V. K., DeLeon, S., Yokobata, K., De Jong, P. J., & MacLennan, D. H. (1996). The structural organization of the human skeletal muscle ryanodine receptor (RYR1) gene. Genomics, 34(1), 24–41. https://doi.org/10.1006/geno.1996.0238 Picard, C., Bobby Gaspar, H., Al-Herz, W., Bousfiha, A., Casanova, J. L., Chatila, T., … Sullivan, K. E. (2018). International Union of Immunological Societies: 2017 Primary Immunodeficiency Diseases Committee Report on Inborn Errors of Immunity. Journal of Clinical Immunology, 38(1), 96–128. https://doi.org/10.1007/s10875-017-0464-9 Popejoy, A. B., Ritter, D. I., Crooks, K., Currey, E., Fullerton, S. M., Hindorff, L. A., … Bustamante, C. D. (2018). The clinical imperative for inclusivity: Race, ethnicity, and ancestry (REA) in genomics. Human Mutation, 39(11), 1713–1720. https://doi.org/10.1002/humu.23644 Poplin, R., Ruano-Rubio, V., DePristo, M. A., Fennell, T. J., Carneiro, M. O., Auwera, G. A. Van der, … Banks, E. (2017). Scaling accurate genetic variant discovery to tens of thousands of samples. BioRxiv, 201178. https://doi.org/10.1101/201178 Public Law 107–280. (2002). Rare Diseases Act of 2002. Public Law, 1–5. Retrieved from https://history.nih.gov/research/downloads/PL107-280.pdf

119

Rajendran, L., Knobloch, M., Geiger, K. D., Dienel, S., Nitsch, R., Simons, K., & Konietzko, U. (2007). Increased Aβ production leads to intracellular accumulation of Aβ in flotillin-1-positive endosomes. Neurodegenerative Diseases, 4(2–3), 164–170. https://doi.org/10.1159/000101841 Ramensky, V. (2002). Human non-synonymous SNPs: server and survey. Nucleic Acids Research, 30(17), 3894–3900. https://doi.org/10.1093/nar/gkf493 Rawlins, J. T., Fernandez, C. R., Cozby, M. E., & Opperman, L. A. (2008). Timing of egf treatment differentially affects tgf-β2 induced cranial suture closure. Experimental Biology and Medicine, 233(12), 1518–1526. https://doi.org/10.3181/0805-RM-151 Rentzsch, P., Witten, D., Cooper, G. M., Shendure, J., & Kircher, M. (2019). CADD: Predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Research, 47(D1), D886– D894. https://doi.org/10.1093/nar/gky1016 Retterer, K., Juusola, J., Cho, M. T., Vitazka, P., Millan, F., Gibellini, F., … Bale, S. (2016). Clinical application of whole-exome sequencing across clinical indications. Genetics in Medicine, 18(7), 696–704. https://doi.org/10.1038/gim.2015.148 Richards, S., Aziz, N., Bale, S., Bick, D., Das, S., Gastier-Foster, J., … Rehm, H. L. (2015). Standards and guidelines for the interpretation of sequence variants: A joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genetics in Medicine, 17(5), 405–424. https://doi.org/10.1038/gim.2015.30 Rimmer, A., Phan, H., Mathieson, I., Iqbal, Z., Twigg, S. R. F., Wilkie, A. O. M., … Lunter, G. (2014). Integrating mapping-, assembly- and haplotype-based approaches for calling variants in clinical sequencing applications. Nature Genetics, 46(8), 912–918. https://doi.org/10.1038/ng.3036 Robinson, P. N., Köhler, S., Bauer, S., Seelow, D., Horn, D., & Mundlos, S. (2008). 
The Human Phenotype Ontology: A Tool for Annotating and Analyzing Human Hereditary Disease. The American Journal of Human Genetics, 83(5), 610–615. https://doi.org/10.1016/j.ajhg.2008.09.017 Roelfsema, J. H., White, S. J., Ariyürek, Y., Bartholdi, D., Niedrist, D., Papadia, F., … Peters, D. J. M. (2005). Genetic heterogeneity in Rubinstein-Taybi syndrome: Mutations in both the CBP and EP300 genes cause disease. American Journal of Human Genetics, 76(4), 572–580. https://doi.org/10.1086/429130 Rowland, T. J., Graw, S. L., Sweet, M. E., Gigli, M., Taylor, M. R. G., & Mestroni, L. (2016). Obscurin Variants in Patients With Left Ventricular Noncompaction. Journal of the American College of Cardiology. https://doi.org/10.1016/j.jacc.2016.08.052 Salmon, L. B., Orenstein, N., Markus-Bustani, K., Ruhrman-Shahar, N., Kilim, Y., Magal, N., … Bazak, L. (2019). Improved diagnostics by exome sequencing following raw data reevaluation by clinical geneticists involved in the medical care of the individuals tested. Genetics in Medicine, 21(6), 1443–1451. https://doi.org/10.1038/s41436-018-0343-7 Santamaría, A., Castellanos, E., Gómez, V., Benedit, P., Renau-Piqueras, J., Morote, J., … Paciucci, R. (2005). PTOV1 Enables the Nuclear Translocation and Mitogenic Activity of Flotillin-1, a Major Protein of Lipid Rafts. Molecular and Cellular Biology, 25(5), 1900–1911. https://doi.org/10.1128/mcb.25.5.1900-1911.2005 Sawyer, S. L., Hartley, T., Dyment, D. A., Beaulieu, C. L., Schwartzentruber, J., Smith, A., … Boycott, K. M. (2016). Utility of whole-exome sequencing for those near the end of the diagnostic odyssey: Time

120

to address gaps in care. Clinical Genetics, 89(3), 275–284. https://doi.org/10.1111/cge.12654 Scafidi, J., Hammond, T. R., Scafidi, S., Ritter, J., Jablonska, B., Roncal, M., … Gallo, V. (2014). Intranasal epidermal growth factor treatment rescues neonatal brain injury. Nature, 506(7487), 230–234. https://doi.org/10.1038/nature12880 Schieppati, A., Henter, J. I., Daina, E., & Aperia, A. (2008). Why rare diseases are an important medical and social issue. The Lancet, 371(9629), 2039–2041. https://doi.org/10.1016/S0140- 6736(08)60872-7 Schwarze, K., Buchanan, J., Taylor, J. C., & Wordsworth, S. (2018). Are whole-exome and whole-genome sequencing approaches cost-effective? A systematic review of the literature. Genetics in Medicine, 20(10), 1122–1130. https://doi.org/10.1038/gim.2017.247 Scott, O., & Roifman, C. M. (2019). NF-κB pathway and the Goldilocks principle: Lessons from human disorders of immunity and inflammation. Journal of Allergy and Clinical Immunology, 143(5), 1688– 1701. https://doi.org/10.1016/j.jaci.2019.03.016 Shashi, V., McConkie-Rosell, A., Rosell, B., Schoch, K., Vellore, K., McDonald, M., … Goldstein, D. B. (2014). The utility of the traditional medical genetics diagnostic evaluation in the context of next- generation sequencing for undiagnosed genetic disorders. Genetics in Medicine. https://doi.org/10.1038/gim.2013.99 Shashi, V., Schoch, K., Spillmann, R., Cope, H., Tan, Q. K. G., Walley, N., … Zheng, A. (2019). A comprehensive iterative approach is highly effective in diagnosing individuals who are exome negative. Genetics in Medicine, 21(1), 161–172. https://doi.org/10.1038/s41436-018-0044-2 Shen, J., Gilmore, E. C., Marshall, C. A., Haddadin, M., Reynolds, J. J., Eyaid, W., … Walsh, C. A. (2010). Mutations in PNKP cause microcephaly, seizures and defects in DNA repair. Nature Genetics, 42(3), 245–249. https://doi.org/10.1038/ng.526 Shozu, M., Sebastian, S., Takayama, K., Hsu, W. T., Schultz, R. A., Neely, K., … Bulun, S. E. (2003). 
Estrogen excess associated with novel gain-of-function mutations affecting the aromatase gene. New England Journal of Medicine, 348(19), 1855–1865. https://doi.org/10.1056/NEJMoa021559 Shum, L., Sakakura, Y., Bringas, P., Luo, W., Snead, M. L., Mayo, M., … Slavkin, H. C. (1993). EGF abrogation-induced fusilli-form dysmorphogenesis of Meckel’s cartilage during embryonic mouse mandibular morphogenesis in vitro. Development, 118(3), 903–917. Sironi, M., Biasin, M., Cagliani, R., Forni, D., De Luca, M., Saulle, I., … Clerici, M. (2012). A Common Polymorphism in TLR3 Confers Natural Resistance to HIV-1 Infection . The Journal of Immunology, 188(2), 818–823. https://doi.org/10.4049/jimmunol.1102179 Spies, N., Weng, Z., Bishara, A., McDaniel, J., Catoe, D., Zook, J. M., … Sidow, A. (2017). Genome-wide reconstruction of complex structural variants using read clouds. Nature Methods, 14(9), 915–920. https://doi.org/10.1038/nmeth.4366 Stark, Z., Tan, T. Y., Chong, B., Brett, G. R., Yap, P., Walsh, M., … White, S. M. (2016). A prospective evaluation of whole-exome sequencing as a first-tier molecular test in infants with suspected monogenic disorders. Genetics in Medicine, 18(11), 1090–1096. https://doi.org/10.1038/gim.2016.1 Statistics Canada. (2017a). Aboriginal Peoples Highlight Tables, 2016 Census. Ottawa, Statistics Canada

121

Catalogue no. 98-402-X2016009. Retrieved from https://www12.statcan.gc.ca/census- recensement/2016/dp-pd/hlt-fst/abo-aut/index-eng.cfm Statistics Canada. (2017b). Focus on Geography Series, 2016 Census. Ottawa, Statistics Canada Catalogue no. 98-404-X2016001. Retrieved from https://www12.statcan.gc.ca/census- recensement/2016/as-sa/fogs-spg/Index-eng.cfm?TOPIC=7 Statistics Canada. (2017c). Language Highlight Tables, 2016 Census. Ottawa, Statistics Canada Catalogue no. 98-402-X2016005. Retrieved from https://www12.statcan.gc.ca/census-recensement/2016/dp- pd/hlt-fst/lang/index-eng.cfm Statistics Canada. (2017d). Population and Dwelling Count Highlight Tables, 2016 Census. Ottawa, Statistics Canada Catalogue no. 98-402-X2016001. Retrieved from https://www12.statcan.gc.ca/census-recensement/2016/dp-pd/hlt-fst/pd-pl/index-eng.cfm Stavropoulos, D. J., Merico, D., Jobling, R., Bowdin, S., Monfared, N., Thiruvahindrapuram, B., … Marshall, C. R. (2016). Whole-genome sequencing expands diagnostic utility and improves clinical management in paediatric medicine. Npj Genomic Medicine, 1, 15012. https://doi.org/10.1038/npjgenmed.2015.12 Takeguchi, R., Haginoya, K., Uchiyama, Y., Fujita, A., Nagura, M., Takeshita, E., … Sasaki, M. (2018). Two Japanese cases of epileptic encephalopathy associated with an FGF12 mutation. Brain and Development, 40(8), 728–732. https://doi.org/10.1016/j.braindev.2018.04.002 Tan, T. Y., Dillon, O. J., Stark, Z., Schofield, D., Alam, K., Shrestha, R., … White, S. M. (2017). Diagnostic impact and cost-effectiveness of whole-exome sequencing for ambulant children with suspected monogenic conditions. JAMA Pediatrics, 171(9), 855–862. https://doi.org/10.1001/jamapediatrics.2017.1755 Tarailo-Graovac, M., Shyr, C., Ross, C. J., Horvath, G. A., Salvarinova, R., Ye, X. C., … Van Karnebeek, C. D. (2016). Exome sequencing and the management of neurometabolic disorders. New England Journal of Medicine, 374(23), 2246–2255. 
https://doi.org/10.1056/NEJMoa1515792 Taylor, J. C., Martin, H. C., Lise, S., Broxholme, J., Cazier, J. B., Rimmer, A., … McVean, G. (2015). Factors influencing success of clinical genome sequencing across a broad spectrum of disorders. Nature Genetics, 47(7), 717–726. https://doi.org/10.1038/ng.3304 Tohyama, J., Nakashima, M., Nabatame, S., Gaik-Siew, C., Miyata, R., Rener-Primec, Z., … Saitsu, H. (2015). SPTAN1 encephalopathy: Distinct phenotypes and genotypes. Journal of Human Genetics, 60(4), 167–173. https://doi.org/10.1038/jhg.2015.5 Trujillo-Gonzalez, I., Wang, Y., Friday, W. B., Vickers, K. C., Toth, C. L., Molina-Torres, L., … Zeisel, S. H. (2019). MicroRNA-129-5p is regulated by choline availability and controls EGF receptor synthesis and neurogenesis in the . FASEB Journal, 33(3), 3601–3612. https://doi.org/10.1096/fj.201801094RR Vasli, N., Böhm, J., Le Gras, S., Muller, J., Pizot, C., Jost, B., … Laporte, J. (2012). Next generation sequencing for molecular diagnosis of neuromuscular diseases. Acta Neuropathologica, 124(2), 273–283. https://doi.org/10.1007/s00401-012-0982-8 Villeneuve, N., Abidi, A., Cacciagli, P., Mignon-Ravix, C., Chabrol, B., Villard, L., & Milh, M. (2017). Heterogeneity of FHF1 related phenotype: Novel case with early onset severe attacks of apnea, partial mitochondrial respiratory chain complex II deficiency, neonatal onset seizures without

122

neurodegeneration. European Journal of Paediatric Neurology, 21(5), 783–786. https://doi.org/10.1016/j.ejpn.2017.04.001 Wakeling, E. L., Brioude, F., Lokulo-Sodipe, O., O’Connell, S. M., Salem, J., Bliek, J., … Netchine, I. (2017). Diagnosis and management of Silver-Russell syndrome: First international consensus statement. Nature Reviews Endocrinology, 13(2), 105–124. https://doi.org/10.1038/nrendo.2016.138 Wang, Y., Yuan, S., Jia, X., Ge, Y., Ling, T., Nie, M., … Xu, A. (2019). Mitochondria-localised ZNFX1 functions as a dsRNA sensor to initiate antiviral responses through MAVS. Nature Cell Biology, 21(11), 1346–1356. https://doi.org/10.1038/s41556-019-0416-0 Weih, F., Carrasco, D., Durham, S. K., Barton, D. S., Rizzo, C. A., Ryseck, R. P., … Bravo, R. (1995). Multiorgan inflammation and hematopoietic abnormalities in mice with a targeted disruption of RelB, a member of the NF-κB/Rel family. Cell, 80(2), 331–340. https://doi.org/10.1016/0092- 8674(95)90416-6 Wenger, A. M., Guturu, H., Bernstein, J. A., & Bejerano, G. (2017). Systematic reanalysis of clinical exome data yields additional diagnoses: Implications for providers. Genetics in Medicine, 19(2), 209–214. https://doi.org/10.1038/gim.2016.88 Xu, J., Li, Z., Ren, X., Dong, M., Li, J., Shi, X., … Dai, Q. (2015). Investigation of Pathogenic Genes in Chinese sporadic Hypertrophic Cardiomyopathy Patients by Whole Exome Sequencing. Scientific Reports. https://doi.org/10.1038/srep16609 Yang, H., Robinson, P. N., & Wang, K. (2015). Phenolyzer: Phenotype-based prioritization of candidate genes for human diseases. Nature Methods, 12(9), 841–843. https://doi.org/10.1038/nmeth.3484 Yang, Y., Muzny, D. M., Xia, F., Niu, Z., Person, R., Ding, Y., … Eng, C. M. (2014). Molecular findings among patients referred for clinical whole-exome sequencing. JAMA - Journal of the American Medical Association, 312(18), 1870–1879. https://doi.org/10.1001/jama.2014.14601 Yeo, G., & Burge, C. B. (2004). 
Maximum entropy modeling of short sequence motifs with applications to RNA splicing signals. Journal of Computational Biology, 11(2–3), 377–394. https://doi.org/10.1089/1066527041410418 Young, P., Ehler, E., & Gautel, M. (2001). Obscurin, a giant sarcomeric Rho guanine nucleotide exchange factor protein involved in sarcomere assembly. Journal of Cell Biology, 154(1), 123–136. https://doi.org/10.1083/jcb.200102110 Zhang, J., Felder, A., Liu, Y., Guo, L. T., Lange, S., Dalton, N. D., … Chen, J. (2009). Nesprin 1 is critical for nuclear positioning and anchorage. Human Molecular Genetics, 19(2), 329–341. https://doi.org/10.1093/hmg/ddp499 Zhang, S. Y., Jouanguy, E., Ugolini, S., Smahi, A., Elain, G., Romero, P., … Casanova, J. L. (2007). TLR3 deficiency in patients with herpes simplex encephalitis. Science, 317(5844), 1522–1527. https://doi.org/10.1126/science.1139522

123

APPENDICES

Appendix I: Letter of Introduction, Participant Consent and Assent Forms

Appendix II: Data Release Forms

Appendix III: Variant Spreadsheet Headers

The variant Excel spreadsheet contains 55 headers for a singleton exome. The number of headers increases with the number of participants sequenced, so that all participants’ variant data can be represented and compared in one file. The headers and their explanations are listed below:

• Position: Position of the variant based on the GRCh37 version of the human genome

• UCSC_Link: Link to the gene locus on the University of California Santa Cruz (UCSC) genome browser (https://genome.ucsc.edu/)

• GNOMAD_Link: Link to the variant on the Genome Aggregation Database website (GnomAD; https://gnomad.broadinstitute.org/)

• Ref: GRCh37 reference nucleotide at the position of the variant

• Alt: Variant nucleotide found in the current sample

• Zygosity: Whether the variant in the sample genome is heterozygous or homozygous (Het, Homo)

• Gene: Name of the gene containing the variant

• Burden: Total number of variants within the gene containing the current variant of interest that differ from the reference genome

• Gts: Genotype of the sample, alleles separated by a slash (“/”)

• Variation: The expected consequence of the variant seen (i.e., missense, splice_region, stop_gain, stop_loss, start_loss, frameshift, in-frame_deletion, in-frame_insertion). If multiple consequences are predicted depending on the transcript, the highest-impact consequence is displayed.

• Info: The coding effects of the variant from the Ensembl (http://uswest.ensembl.org/index.html) and RefSeq (https://www.ncbi.nlm.nih.gov/refseq/) annotations

• Refseq_change: The nucleotide and protein change of the longest RefSeq isoform of the gene containing the variant (https://www.ncbi.nlm.nih.gov/refseq/)

• Depth: Number of reads covering the position of the variant

• Quality: The quality score of the variant of interest

• Alt_depth: The number of reads showing the alternate allele at the position of interest

• Trio_coverage: Number of unfiltered reads in this region for each family member, separated by an underscore (“_”)

• Ensembl_gene_id: Ensembl gene ID for the gene containing the current variant of interest (http://uswest.ensembl.org/index.html)

• Gene_description: Description of the gene containing the current variant of interest from the BioMart database of Ensembl (http://uswest.ensembl.org/biomart)

• Omim_gene_description: Description of the gene containing the current variant of interest from the OMIM database (https://www.omim.org/). Multiple diseases are separated by a semicolon (“;”)

• Omim_inheritance: Disease inheritance pattern of the gene containing the current variant of interest from OMIM (https://www.omim.org/). Multiple diseases are separated by a semicolon (“;”)

• Orphanet: Description of the gene containing the current variant of interest from Orphanet (https://www.orpha.net/)

• Clinvar: Predicted pathogenicity of the current variant of interest in the ClinVar database (https://www.ncbi.nlm.nih.gov/clinvar/). Multiple interpretations are separated by “|”.

• Frequency_in_C4R: The number of times the current variant of interest has been seen in a Care4Rare (C4R) analyzed exome

• Seen_in_C4R_samples: A list of the C4R samples containing the current variant of interest

• HGMD_id: Identifier for the current variant of interest if it is present in the Human Gene

Mutation Database (HGMD; http://www.hgmd.cf.ac.uk/ac/index.php)

• HGMD_gene: Checks if the gene containing the current variant of interest has variants in HGMD

(http://www.hgmd.cf.ac.uk/ac/index.php), displays the gene symbol if true

• HGMD_tag: Tag given to current variant in HGMD (http://www.hgmd.cf.ac.uk/ac/index.php):

DM = disease-causing mutation, DM? = likely disease-causing mutation, DP = disease-associated

polymorphism, FP = in vitro or in vivo functional polymorphism, DFP = disease-associated

polymorphism with additional functional evidence, R = retired record

• HGMD_ref: If the variant matches a published variant, then the reference to the publication for

that variant (http://www.hgmd.cf.ac.uk/ac/index.php)

• Gnomad_af_popmax: Highest subpopulation allele frequency of the current variant of interest

in GnomAD database (https://gnomad.broadinstitute.org/)

• Gnomad_af: Total allele frequency of the current variant of interest in GnomAD

(https://gnomad.broadinstitute.org/)

• Gnomad_ac: Number of alleles counted with the current variant of interest in the GnomAD

(https://gnomad.broadinstitute.org/)

• Gnomad_hom: Number of individuals who are homozygous for the current variant of interest

seen in GnomAD database (https://gnomad.broadinstitute.org/)

• Ensembl_transcript_id: The Ensembl identification code for the gene transcript most severely

affected by the current variant of interest (http://uswest.ensembl.org/index.html)

• AA_position: The amino acid position for the transcript most severely affected by the current

variant of interest

• Exon: Exon containing the current variant of interest


• Protein_domains: If the variant is in a protein domain, the protein domain is listed in this

column

• rsIDs: Reference SNP (rs) identifier of the variant in the dbSNP database (https://www.ncbi.nlm.nih.gov/snp/)

• Gnomad_oe_lof_score: Ratio of the number of observed loss of function (LOF) variants

compared to the number of expected LOF variants in the gene containing the current variant of

interest calculated by the GnomAD database (https://gnomad.broadinstitute.org/)

• Gnomad_oe_mis_score: Ratio of the number of observed missense variants compared to the

number of expected missense variants seen in gene containing the current variant of interest

calculated by the GnomAD database (https://gnomad.broadinstitute.org/)

• Exac_pli_score: The probability that the gene containing the current variant of interest is

intolerant to LOF variants, calculated by the Exome Aggregation Consortium (ExAC) database

(now hosted by the GnomAD database; https://gnomad.broadinstitute.org/)

• Exac_prec_score: The probability that the gene containing the current variant of interest is

intolerant to homozygous LOF variants, calculated by the ExAC database (now hosted by the

GnomAD database; https://gnomad.broadinstitute.org/)

• Exac_pnull_score: The probability that the gene containing the current variant of interest is

tolerant of both heterozygous and homozygous LOF variants, calculated by the ExAC database

(now hosted by the GnomAD database; https://gnomad.broadinstitute.org/)

• Conserved_in_20_mammals: Phylogenetic p-values (PhyloP) of conservation of this variant

position in 20 mammals. A positive score indicates an evolutionarily conserved site, while a

negative score indicates a site that is evolving. “The absolute values of phyloP scores represent

–log p-values under a null hypothesis of neutral evolution” (http://compgen.cshl.edu/phast/).

• Sift_score: The score given by the program Sorting Intolerant From Tolerant (SIFT), which

predicts whether an amino acid substitution affects protein function based on sequence

homology and the physical properties of amino acids. Scores range from 0 to 1. A score closer

to 0 is more likely to be deleterious and a score closer to 1 is more likely to be benign (Kumar,

Henikoff, & Ng, 2009).

• Polyphen_score: The score given by the program POLYmorphism PHENotyping (PolyPhen) to

predict possible impact of an amino acid substitution on the structure and function of a protein.

Scores range from 0 to 1. A score closer to 1 is more likely to be deleterious and a score closer

to 0 is more likely to be benign (Ramensky, 2002).

• Cadd_score: The score given by the program Combined Annotation Dependent Depletion

(CADD) to predict if a variant is deleterious. CADD uses a PHRED-like scaled score. A score of 10

indicates that the variant is among the 10% most deleterious substitutions in the human

genome, while a score of 20 indicates that the variant is among the 1% most deleterious

substitutions in the human genome. In general, scores of 15 or greater are considered

deleterious (Rentzsch, Witten, Cooper, Shendure, & Kircher, 2019).

• Vest3_score: The score given by the program Variant Effect Scoring Tool (VEST) 3.0. The VEST

program predicts pathogenicity based on a machine learning algorithm with a training set of

~45,000 disease-causing variants and ~45,000 benign high frequency variants. Scores range

from 0 to 1. A score closer to 1 is more likely to be deleterious and a score closer to 0 is more

likely to be benign (Carter, Douville, Stenson, Cooper, & Karchin, 2013).

• Revel_score: Gives a score based on the program Rare Exome Variant Ensemble Learner

(REVEL). REVEL predicts pathogenicity of missense variants by integrating the scores of the

programs: MutPred, FATHMM v2.3, VEST 3.0, Polyphen-2, Mutation Assessor, MutationTaster,

LRT, GERP++, SiPhy, and phastCons. Scores range from 0 to 1 where a score closer to 1 is more

likely to be deleterious and a score closer to 0 is more likely to be benign (Ioannidis et al., 2016).


• Gerp_score: Uses the program Genomic Evolutionary Rate Profiling (GERP) to give a score on

the conservation of a variant. Scores range from -12.3 to 6.17, where negative scores represent

variable regions and positive scores represent conserved regions (Spies et al., 2017).

• Imprinting_status: Whether this gene is known to be affected by imprinting based on the

Imprinted Gene Database (http://geneimprint.org/site/genes-by-species)

• Imprinting_expressed_allele: If the gene is thought to be affected by imprinting, this column shows

which parental allele is thought to be expressed based on the Imprinted Gene Database

(http://geneimprint.org/site/genes-by-species)

• Pseudoautosomal: This column shows if the gene containing the variant is a pseudoautosomal

gene contained on the ends of the X and Y chromosomes.

• Splicing: The splicing effect prediction from the program VEP based on the MaxEntScan.pm

plugin. This column reports the largest difference between wildtype splicing and alternate

variant splicing (REF-ALT). The further from 0, the larger the difference in splicing. The lower

the value, the stronger the alternate variant splice site (Yeo & Burge, 2004).

• Number_of_callers: The number of callers that predicted this variant to be present in the

aligned exome sequencing reads. A total of 4 variant callers are used in this pipeline.

• Old_multiallelic: The original multiallelic call for decomposed variants. Sometimes variant

calling software clusters individual variant calls together; if this is the case, the program will

decompose variants into single nucleotide calls
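
The score directions described in this appendix can be illustrated with a short sketch. It is not the procedure used in this thesis: apart from the CADD rule of thumb (15 or greater) quoted above, the SIFT and PolyPhen cut-offs and both function names are assumptions chosen only for illustration.

```python
from typing import Optional

# Illustrative cut-offs. Only the CADD value comes from the column description
# above; the SIFT and PolyPhen thresholds are assumptions for this sketch.
SIFT_DELETERIOUS_MAX = 0.05      # assumed: scores at or below this count as deleterious
POLYPHEN_DELETERIOUS_MIN = 0.85  # assumed: scores at or above this count as deleterious
CADD_DELETERIOUS_MIN = 15.0      # "scores of 15 or greater are considered deleterious"


def cadd_top_fraction(phred: float) -> float:
    """Convert a PHRED-like CADD score to the implied top fraction.

    A score of 10 maps to 0.1 (top 10%) and a score of 20 maps to 0.01 (top 1%).
    """
    return 10 ** (-phred / 10)


def in_silico_summary(sift: Optional[float],
                      polyphen: Optional[float],
                      cadd: Optional[float]) -> str:
    """Summarize available predictions as deleterious, benign, or mixed."""
    votes = []
    if sift is not None:
        votes.append(sift <= SIFT_DELETERIOUS_MAX)       # lower SIFT = more deleterious
    if polyphen is not None:
        votes.append(polyphen >= POLYPHEN_DELETERIOUS_MIN)  # higher PolyPhen = more deleterious
    if cadd is not None:
        votes.append(cadd >= CADD_DELETERIOUS_MIN)
    if not votes:
        return "no predictions"
    if all(votes):
        return "deleterious"
    if not any(votes):
        return "benign"
    return "mixed"
```

Missing scores are passed as `None` and simply do not vote, which mirrors entries in Appendix V where a program returned no score.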


Appendix IV: Case Data Summary

| Case ID | Age at Thesis Submission (Years) | Gender | Ethnicity | Disorder Category | Primary Indication | Individuals Sequenced | Original Type of Analysis | Year of ES | Total Variants | HPO Variants | Analyzed Variants | Summary of Candidate Variants (Classification) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 8 | M | Caucasian, Metis | Neurologic | Fatigable weakness | Singleton | Clinical | 2019 | 603 | 44 | 41 | None |
| 2 | 6 | M | English, Irish | MCA [1] | Russell-Silver-Like Syndrome | Trio-unaffected parents | Research | 2016 | 697 | 73 | 8 | None |
| 3 | 11 | F | Irish, Scottish, Romanian | Neurologic | Congenital nystagmus | Trio-unaffected parents | Research | N/A [2] | N/A | N/A | N/A | N/A |
| 4 | 10 | M | Ukrainian, Cree | Neurologic | Seizures, basal ganglia abnormalities | Trio-unaffected parents | Research | 2017 | 1169 | 93 | 37 | De novo FGF12 c.-130-1G>T (Weak [3]) |
| 5 | 18 | F | Metis, Romanian, Norwegian, English | MCA | CHARGE-like anomalies | Trio-unaffected parents | Research | 2017 | 772 | 92 | 12 | None |
| 6 | 14, 12 [4] | M, M | Irish, Icelandic, Mennonite | Immune | Recurrent esophageal candidiasis | Duo-affected siblings | Research | 2015 | 372 | 15 | 5 | Het [5] BCL7C c.809G>T p.Glu271Ter (Weak) |
| 7 | 5 | M | Norwegian, Ukrainian, Austrian | Neurologic | Leukoencephalopathy, drooling | Trio-unaffected parents | Clinical | 2019 | 679 | 49 | 16 | None |
| 8 | 3 | M | Irish, English, Icelandic, German, Scottish | MCA | Overgrowth, omphalocele | Trio-unaffected parents | Clinical | 2018 | 606 | 67 | 9 | None |
| 9 | 4 | F | Scottish, English, Italian, Icelandic | Neurologic | Drug resistant seizures | Trio-unaffected parents | Clinical | 2018 | 593 | 65 | 9 | None |
| 10 | 7 | F | Palestinian, Syrian | Neurologic | Congenital myopathy | Singleton | Clinical | 2015 | 719 | 53 | 48 | Homo [6] TTN c.33340+3A>G (Strong [7]), Het SYNE2 c.18407G>A p.Arg6136His (Weak) |
| 11 | 2 | F | Hutterite | Neurologic | Seizures, dysmorphisms | Singleton | Clinical | 2019 | 542 | 67 | 30 | Het GOLPH3 c.653dup p.Glu219GlyfsTer5 (Novel [8]) |
| 12 | Deceased | F | Irish, English | Neurologic | Developmental regression, brain calcification | Trio-unaffected parents | Clinical | 2015 | 622 | 56 | 5 | Het USP18 c.511C>T p.Arg171Trp (Single hit [9]) |
| 13 | 28, 26, 20 | M, M, F | Inuit | Other | Elevated oxalate | Trio-affected siblings | Research | 2017 | 467 | 5 | 39 | None |
| 14 | 19, 16 | M, M | Metis, Irish, German | Neurologic | Photosensitivity, retinal dystrophy | Duo-affected siblings | Research | 2015 | 377 | 24 | 29 | Het PNKP c.1253_1269dup p.Thr424GlyfsTer49 (Single hit) |
| 15 | 17, 13 | M, F | Somali | Neurologic | Spastic quadriparesia | Duo-affected siblings | Research | 2016 | 824 | 110 | 118 | None |
| 16 | 56 | F | French | Immune | Hereditary angioedema | Singleton | Research | 2016 | 620 | 17 | 47 | None |
| 17 | 3 | M | First Nations, Caucasian | Other | Fetal alcohol syndrome, cholestasis | Singleton | Clinical | 2019 | 741 | 63 | 57 | Homo PSKH1 c.686C>T p.Pro229Leu (Novel), Homo DUS2 c.1265C>T p.Ala422Val (Novel), Homo NOB1 c.1095C>A p.Asp365Glu (Novel) |
| 18 | 3 | M | First Nations | MCA | Brain anomalies, missing parietal bone | Singleton | Clinical | 2018 | 776 | 43 | 48 | Het EGF c.2908C>T p.Arg970Ter (Novel) |
| 19 | 10, 7 | F, F | Hutterite | Neurologic | Hypotonia, choanal atresia, scoliosis | Duo-affected siblings | Clinical | 2019 | 693 | 40 | 51 | Homo PTOV1 c.289G>A p.Gly97Ser (Novel) |
| 20 | 21 | M | English, Scottish, French Canadian, Metis | Immune | Immunodeficiency, rhabdomyolysis | Singleton | Clinical | 2019 | 883 | 52 | 57 | None |
| 21 | 18, 15 | F, M | Unknown | MCA | Congenital amputations, motor delay | Trio-affected half-siblings & unaffected parent | Research | 2016 | 306 | 22 | 10 | Het PRKCH c.1214G>A p.Arg405Lys (Weak) |
| 22 | 3 | F | First Nations | Immune | Recurrent infections | Duo-unaffected parent | Clinical | | 551 | 24 | 36 | Homo ZNFX1 c.3152T>C p.Leu1051Pro (Novel) |
| 23 | 6 | F | Hutterite | Neurologic | Hypertonia, microcephaly, retinal findings | Singleton | Research | | 552 | 56 | 72 | None |
| 24 | 31 | M | Irish, Scottish, French, German | Immune | Immunodeficiency, warts on hands | Singleton | Clinical | | 526 | 56 | 69 | Multiple Het [10] RELB c.433G>A p.Glu145Lys & c.1091C>T p.Pro364Leu (Strong) |
| 25 | 7 | F | Mixed-European | Neurologic | Uncontrolled seizures | Trio-unaffected parents | Clinical | | 597 | 76 | 15 | Biallelic SPTB c.4819G>A p.Val1607Ile & c.871G>A p.Gly291Ser (Novel), Homo ATG9A c.1121C>T p.Thr374Ile (Novel) |
| 26 | 17, 14 | M, M | Pakistani | Other | Retinal detachment, proteinuria, hearing loss | Duo-affected siblings | Research | | 432 | 21 | 64 | Mixed [11] COL9A3 c.423+1del (Strong), Mixed LAMB2 c.634T>C p.Ser212Pro (Strong) |

1. MCA = multiple congenital anomalies
2. N/A = not available, information is missing because analysis was unable to be done for this participant
3. Weak candidates refer to variants that have minimal evidence for pathogenicity but cannot be ruled out as the cause of disease
4. Multiple siblings, phenotypes, or variants are separated by commas
5. Het = heterozygous
6. Homo = homozygous
7. Strong candidates refer to variants with strong evidence of pathogenicity in known disease genes that are related to the participant’s phenotype
8. Novel candidates refer to variants found in genes that have not been associated with disease or are associated with diseases of a different phenotype
9. Single hit candidates refer to heterozygous variants in autosomal recessive disease genes that are related to the participant’s phenotype
10. Multiple hets refers to candidate genes that contained two heterozygous variants of interest
11. Mixed zygosity refers to sibling pairs where one member of the sibling pair contains a homozygous variant and the other sibling has the same variant as a heterozygote


Appendix V: Evidence for Analyzed Variants

Case 1: Singleton Exome Reanalysis

603 variants total. 44 variants when filtered for HPO terms. 41 variants analyzed.

Filter 1A – HPO Terms, AF < 0.001, Het variants in autosomal dominant genes: 8 variants analyzed:

• ALDH18A1 c.1774G>A p.Val592Ile, 2 hets in GnomAD, in-silico predicts benign (SIFT = 0.57, PolyPhen = 0, CADD = 14.15), 2 matching HPO terms (Joint hyperflexibility, Toe walking), high phenolyzer rank (172)
• CCNF c.1121G>A p.Gly374Glu, not present in GnomAD, in-silico predicts benign (SIFT = 0.33, PolyPhen = 0.003, CADD = 0.217), 1 matching HPO term (Fatigable weakness)
• PIGS c.819+3G>A (splice region), 4 hets in GnomAD, O/E LOF score of 0.479, splice score of -1.547, 1 matching HPO term (Widely spaced teeth)
• CACNA1A c.6544+7A>G (splice region), 1 het in GnomAD, splicing score of 0, 2 matching HPO terms (Autism, Developmental regression)
• TTN c.64631G>A p.Arg21544Gln, 13 hets in GnomAD, VUS in ClinVar, in-silico predicts deleterious (no SIFT score, PolyPhen = 0.998, CADD = 23), 2 matching HPO terms (Neck flexor weakness, Proximal muscle weakness), high phenolyzer rank (757)
• DNAJC13 c.3434C>G p.Ala1145Gly, causes late-onset hereditary Parkinson’s disease, not seen in GnomAD, mixed in-silico predictions (SIFT = 0.25, PolyPhen = 0.007, CADD = 23.1), 1 matching HPO term (Sleep disturbance)
• FTSJ1 c.331G>T p.Asp111Tyr, seen in 3 out of 13 reads, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.967, CADD = 26.5), only seen by 2 callers, 1 matching HPO term (Autism), heterozygous variant in X-linked gene for male participant indicates artifact
• PCDH19 c.34_36del p.Leu12del (in-frame deletion), seen in 5 out of 13 reads, Benign in ClinVar, 54 hets in GnomAD, 1 matching HPO term (Developmental regression), low phenolyzer rank (13537), heterozygous variant in X-linked gene for male participant indicates artifact

Filter 1B – HPO Terms, AF < 0.001, Homo variants in autosomal recessive genes: 0 variants found

Filter 1C – HPO Terms, AF < 0.001, Burden > 1 in autosomal recessive genes: 0 variants found

Filter 2 – AF < 0.001, Likely Loss of Function Variants: 19 variants analyzed:

• Het TAS1R3 c.120del p.Phe41SerfsTer36 (frameshift), not seen in GnomAD, O/E LOF score of 1.12
• Het ANKRD65 c.210-1_222dup p.Arg75GlyfsTer177 (frameshift), not seen in GnomAD, O/E LOF score of 1.9, low phenolyzer rank (11897)
• Het PDE4DIP c.6598C>T p.Gln2200Ter (nonsense), 14 hets in GnomAD, O/E LOF score of 0.56, only seen by 2 callers, low phenolyzer rank (16295)
• Het FLG c.5840G>A p.Trp1947Ter (nonsense), 58 hets in GnomAD, O/E LOF score of 2.35
• Het AVPR1B c.542C>A p.Ser181Ter (nonsense), 1 het in GnomAD, O/E LOF score of 0.56
• Het CYP4Z2P n.180+5G>C (splice region), not seen in GnomAD, no O/E scores listed, splice score of 1.96, low phenolyzer rank (13512)
• Het ZBTB48 c.1045-107T>A (splice region), not seen in GnomAD, no O/E scores listed, splice score of 2.15, only seen by 2 callers, low phenolyzer rank (16151)
• Het TRIM51GP n.738+4C>T (splice region), 37 hets in GnomAD, no O/E scores listed, splice score of 0.891, no phenolyzer rank
• Het RITA1 c.838C>T p.Arg280Ter (nonsense), 8 hets in GnomAD, O/E LOF score of 0.78, no phenolyzer rank
• Het DIS3 n.395-749A>T (splice region), not seen in GnomAD, O/E LOF score of 0.79, splice score of -0.389
• Het FA2H c.1018del p.His340ThrfsTer30 (frameshift), het variant in AR disease gene, not seen in GnomAD, O/E LOF score of 0.27
• Het WDR59 c.-515C>T (splice region), 5 hets in GnomAD, O/E LOF score of 0.53, splice score of -0.137
• Het GALNS c.899-6T>C (splice region), het variant in AR disease-gene, VUS in ClinVar, 22 hets in GnomAD, O/E LOF score of 0.67, splice score of -0.893, 1 matching HPO term (widely spaced teeth)
• Het GPRC5C c.1355-5T>C (splice region), not seen in GnomAD, O/E LOF score of 0.49, splice score of 0.342, low phenolyzer rank (15566)
• Het TMEM18 c.58-123G>C (splice region), 1 het in GnomAD, O/E LOF score of 0.64, splice score of 8.04, low phenolyzer rank (18256)
• Het ABCG8 c.636del p.Leu213CysfsTer40 (frameshift), het variant in AR disease gene, not seen in GnomAD, O/E LOF score of 1.01
• Het PCNT c.9161dup p.Thr3055AsnfsTer84 (frameshift), het variant in AR disease gene, 4 hets in GnomAD, O/E LOF score of 0.69, only seen by 2 callers, 1 matching HPO term (joint hyperflexibility), high phenolyzer rank (620)
• Het CHCHD6 c.570-4389C>T (splice region), not seen in GnomAD, O/E LOF score of 0.92, splice score of 0.142, low phenolyzer rank (12977)
• Het MOSPD3 c.3G>A p.Met1? (start loss), not seen in GnomAD, O/E LOF score of 0.66, no phenolyzer rank

Filter 3 – Candidate Neuromuscular Genes: 8 variants analyzed:

• Het POMGNT1 c.2137A>G p.Thr713Ala, het variant in AR disease-gene, 7 hets in GnomAD, in-silico predicts benign (SIFT = none, PolyPhen = 0.057, CADD = 11.7), 1 matching HPO term (Proximal muscle weakness)
• Het FKRP c.541C>A p.Arg181Ser, het variant in AR disease-gene, VUS in ClinVar, 12 hets in GnomAD, in-silico predicts benign (SIFT = 0.72, PolyPhen = 0, CADD = 6.49), 4 matching HPO terms (Fatigable weakness, Proximal muscle weakness, Sleep disturbance, Toe walking)
• Het NEB c.2354A>G p.His785Arg, het variant in AR disease-gene, VUS in ClinVar, 5 hets in GnomAD, in-silico predicts benign (SIFT = 0.31, PolyPhen = 0.273, CADD = 4.98), 4 matching HPO terms (Exercise intolerance, Fatigable weakness, Neck flexor weakness, Proximal muscle weakness)
• Het SBF1 c.689C>T p.Ala230Val, het variant in AR disease-gene, not seen in GnomAD, mixed in-silico predictions (SIFT = 0.85, PolyPhen = 0.994, CADD = 7.4), low phenolyzer rank (10821)
• Het DST c.9809C>T p.Thr3270Met, het variant in AR disease-gene, 3 hets in GnomAD, in-silico predicts benign (SIFT = none, PolyPhen = 0.401, CADD = 3.91)
• Het DSP c.2675G>A p.Arg892His, VUS in ClinVar, 5 hets in GnomAD, mixed in-silico predictions (SIFT = 0, PolyPhen = 0.7, CADD = 34), gene mostly causes an arrhythmia which is not a good phenotypic match
• Homo AR c.222_239dup p.Gln75_Gln80dup (in-frame insertion), seen in 1 out of 1 read, not seen in GnomAD, only seen by 2 callers, high phenolyzer rank (2)

Filter 3 – Candidate Intellectual Disability Genes: 5 variants analyzed:

• Het ASPA c.571A>C p.Ile191Leu, het variant in AR disease-gene, not seen in GnomAD, mixed in-silico predictions (SIFT = 0.64, PolyPhen = 0.375, CADD = 23.4), 2 matching HPO terms (Developmental regression, Sleep disturbance)
• Het NPC1 c.3755-6_3755-5dup (splice region), het variant in AR disease-gene, 27 hets in GnomAD, O/E LOF score of 0.32, splice score of 0
• Het MAN2B1 c.1419+89G>A (splice region), het variant in AR disease-gene, 4 hets in GnomAD, O/E LOF score of 0.75, splice score of 0, 1 matching HPO term (Widely spaced teeth)
• Het NDUFS7 c.167C>T p.Pro56Leu, het variant in AR disease-gene, 7 hets in GnomAD, mixed in-silico predictions (SIFT = 0.06, PolyPhen = 0.062, CADD = 23.2), 1 matching HPO term (Developmental delay)
• Het IDUA c.932C>T p.Pro311Leu, het variant in AR disease-gene, VUS in ClinVar, 23 hets in GnomAD, mixed in-silico predictions (SIFT = 0.44, PolyPhen = 0.007, CADD = 17.29), 1 matching HPO term (Sleep disturbance)

OMIM Genes: 1 variant analyzed:

• Het DMGDH c.274G>A p.Ala92Thr, het variant in AR disease-gene, 9 hets in GnomAD, in-silico predicts deleterious (SIFT = 0; PolyPhen = 0.934, CADD = 34)
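
The singleton filter names used for Case 1 can be read as a small set of rules. The sketch below is one possible reading, not the custom filters actually used in this project: the dictionary keys (`af`, `zygosity`, `inheritance`, `hpo_matches`, `burden`, `likely_lof`) are hypothetical names, and which allele frequency field the AF threshold applies to is an assumption.

```python
from typing import Dict, List

AF_CUTOFF = 0.001  # "AF < 0.001" from the filter names above


def singleton_filters(variant: Dict) -> List[str]:
    """Return the labels of the singleton filters that a variant row passes.

    Expected (hypothetical) keys:
      af           allele frequency used for the rarity cut-off
      zygosity     "het" or "hom"
      inheritance  set of inheritance codes for the gene, e.g. {"AD"} or {"AR"}
      hpo_matches  number of participant HPO terms matching the gene
      burden       number of rare variants found in the same gene
      likely_lof   True for frameshift, nonsense, splice, or start-loss variants
    """
    passed: List[str] = []
    if variant["af"] >= AF_CUTOFF:
        return passed  # every filter requires a rare variant
    has_hpo = variant["hpo_matches"] > 0
    if has_hpo and variant["zygosity"] == "het" and "AD" in variant["inheritance"]:
        passed.append("1A")
    if has_hpo and variant["zygosity"] == "hom" and "AR" in variant["inheritance"]:
        passed.append("1B")
    if has_hpo and variant["burden"] > 1 and "AR" in variant["inheritance"]:
        passed.append("1C")
    if variant["likely_lof"]:
        passed.append("2")
    return passed
```

A variant can pass more than one filter in this reading; candidate gene lists (Filter 3) are phenotype specific and are left out of the sketch.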

Case 2: Trio Exome Reanalysis, Unaffected Parents

697 variants total. 73 variants when filtered for HPO terms. 8 variants analyzed.

Filter 1 A – HPO terms, AF < 0.001, de novo het variants: 1 variant analyzed:

• MNX1 c.450C>A p.Ala117Glu, variant seen in 2 out of 4 reads, not seen in GnomAD, in-silico predicts benign (SIFT = 1, PolyPhen = 0.02, CADD = 0.199), 1 matching HPO term (Hypospadias), gene causes Currarino syndrome which is not usually caused by missense variants

Filter 1 B – HPO Terms, AF < 0.001, Homo variants when parents are not homo: 0 variants found

Filter 1 C – HPO Terms, AF < 0.001, Burden > 1 in autosomal recessive genes: 1 variant analyzed:

• HSPG2 c.4912C>T p.Arg1638Cys, father has the same allele burden, 48 hets in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.997, CADD = 35), 6 matching HPO terms (Cryptorchidism, Decreased body weight, Inguinal hernia, Microcephaly, Narrow palpebral fissure, Wormian bones), gene causes Schwartz-Jampel syndrome which has characteristic facies not seen in this patient
  o Other variant in gene c.7426G>C p.Gly2476Arg, variant present in father, 33 hets in GnomAD, in-silico predicts deleterious (SIFT = 0.03, PolyPhen = 0.976, CADD = 28.6)

Filter 2 – AF < 0.001, Likely Loss of Function Variants not present in parents: 6 variants analyzed:

• Het LINC01044 n.467G>A? (splice region), seen in 2 out of 5 reads, not seen in GnomAD, splice score of 2.074, only seen by 2 callers, no phenolyzer rank
• Het DDX52 c.1588+5A>C (splice region), not seen in GnomAD, O/E LOF score of 0.79, splice score of -0.125
• Het GIPC1 c.286G>T p.Glu96Ter (nonsense), seen in 2 out of 3 reads, not seen in GnomAD, O/E LOF score of 0.46, only seen by 2 callers, low phenolyzer rank (10959)
• Het DDX60L c.718G>T p.Glu240Ter (nonsense), not seen in GnomAD, O/E LOF score of 0.80, only seen by 2 callers, low phenolyzer rank (16361)
• Het TEC c.138+1G>T (splice donor), not seen in GnomAD, O/E LOF score of 0.47, splice score of 8.5, only seen by 2 callers
• Het WWP1 c.2724del p.Lys908AsnfsTer6 (frameshift), variant is at the front of the last exon, seen in 2 out of 3 reads, not seen in GnomAD, O/E LOF score of 0.14, only seen by 2 callers

Filter 3 – Candidate gene list: No candidate gene lists for this phenotype.
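
The trio filters used for Case 2 separate variants by how the proband's genotype compares with the parents' genotypes. A minimal sketch of that comparison follows; the genotype labels ("ref", "het", "hom") and the function name are assumptions for illustration, and the pipeline's own genotype encoding may differ.

```python
def classify_trio(proband: str, mother: str, father: str) -> str:
    """Classify a proband genotype against parental genotypes.

    Each argument is one of "ref", "het", or "hom" (assumed labels).
    """
    carriers = {"het", "hom"}
    # Filter 1A: heterozygous in the proband and absent from both parents.
    if proband == "het" and mother == "ref" and father == "ref":
        return "de novo het"
    # Filter 1B: homozygous in the proband when neither parent is homozygous.
    if proband == "hom" and mother != "hom" and father != "hom":
        return "hom, parents not hom"
    # Anything else carried by the proband and at least one parent.
    if proband in carriers and (mother in carriers or father in carriers):
        return "inherited"
    return "other"
```

For example, `classify_trio("het", "ref", "het")` returns "inherited", which is the situation described for the paternal HSPG2 variants above.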

Case 3: Trio Exome Reanalysis, Unaffected Parents

Unable to analyze due to poor quality data

Case 4: Trio Exome Reanalysis, Unaffected Parents

1169 variants total. 165 variants when filtered for HPO terms. 37 variants analyzed.

Filter 1 A – HPO terms, AF < 0.001, de novo het variants: 5 variants analyzed:

• BEST1 c.1021C>T p.Arg341Trp, 2 hets in GnomAD, mixed in-silico predictions (SIFT = 0.09, PolyPhen = 0, CADD = 1.636), 1 matching HPO term (optic atrophy), AD mutations in this gene cause adult onset disease
• RNF213 c.6281T>C p.Leu4021Pro, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.936, CADD = 24.5), 1 matching HPO term (seizures), gene causes susceptibility to Moyamoya disease which is not a good phenotypic match
• SON c.2542A>T p.Thr848Ser, not seen in GnomAD, in-silico predicts benign (SIFT = 0.21, PolyPhen = 0.12, CADD = 11.03), only seen by 2 callers, 4 matching HPO terms (Abnormality of the periventricular white matter, Optic atrophy, Seizures, Spasticity), low phenolyzer rank (11718), individuals with this disorder have distinct facial features which this participant does not have
• FGF12 c.-131G>T (splice acceptor), seen in 4 out of 20 reads, not seen in GnomAD, O/E LOF score of 0.17, splice score of 8.598, only called by 2 callers, 3 matching HPO terms (Optic atrophy, Seizures, Spasticity)
• SCN5A c.3950A>G p.Glu1317Gly, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.998, CADD = 31), only called by 2 callers, 1 matching HPO term (Seizures)

Filter 1 B – HPO Terms, AF < 0.001, Homo variants when parents are not homo: 0 variants found

Filter 1 C – HPO Terms, AF < 0.001, Burden > 1 in autosomal recessive genes: 0 variants found

Filter 2 – AF < 0.001, Likely Loss of Function Variants: 30 variants analyzed:

• Het TMEM63A c.746+1G>T (splice donor), not seen in GnomAD, O/E LOF score of 0.59, splice score of 8.5, only seen by 2 callers
• Het TXNDC12 c.160G>T p.Gly54Ter (nonsense), not seen in GnomAD, O/E LOF score of 0.73, low phenolyzer rank (13699)
• Het FNBP1L c.24G>A p.Trp8Ter (nonsense), not seen in GnomAD, O/E LOF score of 0.17, only seen by 2 callers, low phenolyzer rank (10405)
• Het USP28 c.1929+355G>T (splice region), not seen in GnomAD, no O/E scores listed, splice score of 0, only seen by 2 callers
• Het ARPC3 c.187G>T p.Glu63Ter (nonsense), not seen in GnomAD, O/E LOF score of 0.26, only seen by 2 callers
• Het PLEKHG6 c.1276-8A>C (splice region), 3 hets in GnomAD, O/E LOF score of 0.77, splice score of -1.025, only seen by 2 callers, low phenolyzer rank (13841)
• Het PAPLN c.3420-4C>A (splice region), not seen in GnomAD, O/E LOF score of 0.92, splice score of -0.16, only seen by 2 callers, low phenolyzer rank (11829)
• Het ADAMTS7 c.622+8C>A (splice region), seen in 2 out of 8 reads, not seen in GnomAD, O/E LOF score of 0.47, splice score of 0, only seen by 2 callers
• Het HAGH c.190G>T p.Glu64Ter (nonsense), not seen in GnomAD, O/E LOF score of 0.53, only seen by 2 callers
• Het WIPI1 c.1047+24G>T (splice region), not seen in GnomAD, O/E LOF score of 0.40, only seen by 2 callers
• Het COL5A3 c.3666+1G>T (splice donor), not seen in GnomAD, O/E LOF score of 0.47, splice score of 8.5, only seen by 2 callers
• Het MKNK2 c.-96-5C>T (splice region), only seen in 2 out of 4 reads, 1 het in GnomAD, O/E LOF score of 0.43, splice score of 1.123, only seen by 2 callers
• Het HAUS5 c.486-4C>A (splice region), not seen in GnomAD, O/E LOF score of 0.49, splice score of -0.209, only seen by 2 callers
• Het LYPD4 c.-121+7C>A (splice region), not seen in GnomAD, O/E LOF score of 0.95, splice score of 0, only seen by 2 callers
• Het TPGS1 c.339-2A>G (splice acceptor), only seen in 2 out of 9 reads, not seen in GnomAD, O/E LOF score of 0.56, splice score of 7.96, only seen by 2 callers, no phenolyzer rank
• Het LENG8 c.2241-621C>T (splice region), 11 hets in GnomAD, O/E LOF score of 0.11, splice score of 3.64, only seen by 2 callers, low phenolyzer rank (17580)
• Het KIR2DL3 c.35-3C>A (splice region), not seen in GnomAD, O/E LOF score of 1.06, splice score of 1.88, only seen by 2 callers
• Het GPR108 c.374+1G>T (splice donor), not seen in GnomAD, O/E LOF score of 0.76, splice score of 8.5, only seen by 2 callers, low phenolyzer rank (18119)
• Het GTF3C3 c.1832-515G>T (splice donor), only seen in 2 out of 4 reads, not seen in GnomAD, O/E LOF score of 0.37, splice score of 8.50, only seen by 2 callers
• Het NFAM1 c.772G>T p.Glu258Ter (nonsense), not seen in GnomAD, O/E LOF score of 0.67
• Het ZMYND10-AS1 n.123+3C>A (splice region), only seen in 4 out of 21 reads, not seen in GnomAD, no O/E scores listed, splice score of -15.10, only seen by 2 callers, no phenolyzer rank
• Het DNAH1 c.1656+1G>T (splice donor), only seen in 2 out of 7 reads, not seen in GnomAD, O/E LOF score of 0.39, splice score of 8.5, only seen by 2 callers
• Het H2AFZ c.4-4G>T (splice region), only seen in 2 out of 8 reads, not seen in GnomAD, O/E LOF score of 0, splice score of 1.40, only seen by 2 callers
• Het PHKG1 c.1210G>T p.Glu404Ter (nonsense), only seen in 4 out of 12 reads, not seen in GnomAD, O/E LOF score of 0.89, only seen by 2 callers
• Het FAM199X c.992C>A p.Ser331Ter (nonsense), only seen in 2 out of 7 reads, not seen in GnomAD, O/E LOF score of 0.18, only seen by 2 callers, low phenolyzer rank (17856)
• Het AMMECR1 c.133G>T p.Gly45Ter (nonsense), heterozygous variant in X-linked gene for male participant indicates artifact, seen in 5 out of 23 reads, not seen in GnomAD, O/E LOF score of 0.098
• Het MAGEB16 c.-66-1617C>A (splice region), heterozygous variant in X-linked gene for male participant indicates artifact, seen in 3 out of 18 reads, not seen in GnomAD, O/E LOF score of 0, splice score of 4.35, only seen by 2 callers, no phenolyzer rank listed
• Hemi AF196972.3 n.218+8T>A (splice region), seen in 2 out of 2 reads, not seen in GnomAD, no O/E scores listed, splice score of 0, no phenolyzer rank listed
• Het CCDC120 c.592C>T p.Gln198Ter (nonsense), heterozygous variant in X-linked gene for male participant indicates artifact, seen in 4 out of 7 reads, not seen in GnomAD, O/E LOF score of 0.21, only seen by 2 callers, low phenolyzer rank (18921)
• Het LOC101059915 c.1470+2T>C (splice donor), heterozygous variant in X-linked gene for male participant indicates artifact, seen in 4 out of 11 reads, not seen in GnomAD, no O/E scores listed, splice score of 7.75, no phenolyzer rank listed

Filter 3A – Mitochondrial Candidate Genes: 2 variants analyzed:

• Het TRMT5 c.18A>T p.Leu6Phe, het variant in AR disease-gene, 2 hets seen in GnomAD, mixed in-silico predictions (SIFT = 0.08, PolyPhen = 0.001, CADD = 11.05), only seen by 2 callers
• Het NEB c.19003A>T p.Ile8191Phe, het variant in AR disease-gene, 1 het in GnomAD, mixed in-silico predictions (SIFT = 0.34, PolyPhen = 0.017, CADD = 22.2), low phenolyzer rank (13065)

Filter 3B – Epilepsy Candidate Genes: 0 variants found
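
Several Case 4 entries are dismissed because a heterozygous call in an X-linked gene is not expected in a male participant, and many are supported by only a small share of reads. Both checks are simple to express; the sketch below uses hypothetical function and argument names and does not reproduce any logic from the pipeline itself.

```python
def flag_x_het_in_male(chrom: str, zygosity: str, sex: str,
                       pseudoautosomal: bool = False) -> bool:
    """Return True when a het call is implausible for a male participant.

    Males carry one X chromosome outside the pseudoautosomal regions, so a
    heterozygous genotype there suggests a mapping or calling artifact.
    """
    on_x = chrom in ("X", "chrX")
    return on_x and zygosity == "het" and sex == "M" and not pseudoautosomal


def alt_fraction(alt_depth: int, total_depth: int) -> float:
    """Share of reads supporting the alternate allele (e.g. 4 of 20 reads)."""
    if total_depth <= 0:
        raise ValueError("total_depth must be positive")
    return alt_depth / total_depth
```

For the FGF12 variant reported above, `alt_fraction(4, 20)` gives 0.2, well below the roughly 0.5 expected for a germline heterozygous variant.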

Case 5: Trio Exome Analysis, Unaffected Parents

772 variants total. 92 variants when filtered for HPO terms. 12 variants analyzed.


Filter 1 A – HPO terms, AF < 0.001, de novo het variants: 3 variants analyzed:

• B3GALNT2 c.90C>G p.Cys30Trp, het variant in AR disease-gene, not seen in GnomAD, in-silico predicts benign (SIFT = 0.19, PolyPhen = 0.597, CADD = 14.68), only seen by 2 callers, 4 matching HPO terms (Agenesis of corpus callosum, Generalized hypotonia, Microcephaly, Optic nerve hypoplasia), high phenolyzer rank (399)
• SPINT2 c.554-1G>A (splice acceptor), het variant in AR disease-gene, not seen in GnomAD, O/E LOF score of 0.52, splice score of 8.75, 2 matching HPO terms (Choanal atresia, Hypertelorism), not a good phenotypic match
• GINS1 c.349G>T p.Ala117Ser, seen in 4 out of 6 reads, not seen in GnomAD, het variant in AR disease-gene, mixed in-silico predictions (SIFT = 0.07, PolyPhen = 0, CADD = none), 2 matching HPO terms (Microcephaly, Short stature), low phenolyzer rank (10991), not a good phenotypic match

Filter 1 B – HPO Terms, AF < 0.001, Homo variants when parents are not homo: 0 variants found

Filter 1 C – HPO Terms, AF < 0.001, Burden > 1 in autosomal recessive genes: 1 variant analyzed:

• Maternally inherited PIEZO1 c.3502T>G p.Trp1168Gly, 9 hets in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.998, CADD = 25.1), 3 matching HPO terms (Cupped ears, Micrognathia, Short stature), not a good phenotypic match
  o Paternally inherited PIEZO1 c.5773C>T p.Arg1925Trp, 95 hets in GnomAD, mixed in-silico predictions (SIFT = 0.18, PolyPhen = 0.332, CADD = 23.3)

Filter 2 – AF < 0.001, Likely Loss of Function Variants: 8 variants analyzed:

• Het ENPP7 c.1017G>A p.Trp339Ter (nonsense), seen in 2 out of 4 reads, 6 hets in GnomAD, no O/E scores listed, only seen by 2 callers
• Het CCDC40 c.1159+1034G>T (splice region), seen in 2 out of 6 reads, het variant in AR disease-gene, not seen in GnomAD, no O/E scores listed, splice score of 1.995, only seen by 2 callers, low phenolyzer rank (17720)
• Het SECTM1 c.403+6G>T (splice region), not seen in GnomAD, O/E LOF score of 0.44, splice score of -0.605, only seen by 2 callers, low phenolyzer rank (16424)
• Het TMPRSS9 c.1915+7G>A (splice acceptor), 8 hets in GnomAD, O/E LOF score of 0.70, splice score of 0, low phenolyzer rank (18224)
• Het CYP27C1 c.-25G>A (splice region), seen in 2 out of 4 reads, not seen in GnomAD, O/E LOF score of 0.66, splice score of -2.346, only seen by 2 callers, low phenolyzer rank (14805)
• Het UQCC1 c.24+1893G>A (splice region), seen in 4 out of 11 reads, not seen in GnomAD, O/E LOF score of 0.66, splice score of 0, only seen by 2 callers, low phenolyzer rank (12283)
• Het UQCC1 c.24+1888A>G (splice region), seen in 4 out of 11 reads, 3 hets in GnomAD, O/E LOF score of 0.66, splice score of 2, only seen by 2 callers, low phenolyzer rank (12283)
• Het RP11-120I21 n.184G>A (splice region), seen in 2 out of 4 reads, not seen in GnomAD, no O/E scores listed, splice score of 5.2, only seen by 2 callers, no phenolyzer rank listed

Filter 3 – Candidate gene list: No candidate gene lists for this phenotype.


Case 6: Duo Exome Reanalysis, Affected Sibling Pair

372 shared variants total. 15 shared variants when filtered for HPO terms. 5 variants analyzed.

Filter 1 A – HPO terms, AF < 0.001, shared het variants: 0 variants found

Filter 1 B – HPO Terms, AF < 0.001, shared homo variants: 0 variants found

Filter 1 C – HPO Terms, AF < 0.001, Burden > 1 in autosomal recessive gene: 0 variants found

Filter 2 – AF < 0.001, Likely Loss of Function Variants: 5 variants found:

• Het RB1 c.1421+11dup (splice region), not seen in GnomAD, O/E LOF score of 0.017, splice score of 0, high phenolyzer rank (568), not a good phenotypic match
• Het DIS3 c.2670+5G>A (splice region), 3 hets in GnomAD, O/E LOF score of 0.79, splice score of 2.78
• Het BCL7C c.809G>T p.Glu271Ter (nonsense), not seen in GnomAD, O/E LOF score of 0.35, low phenolyzer rank (17000), gene description is B-cell CLL/Lymphoma 7C
• Het IGF2BP1 c.1642-6C>G (splice region), 42 hets in GnomAD, O/E LOF score of 0.09, splice score of 1.43
• Het MYH14 c.5788-2A>G (splice region), 1 het seen in GnomAD, O/E LOF score of 0.21, splice score of 7.9, not a good phenotypic match

Filter 3 – Candidate Immune Genes: 0 variants found
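
The Filter 2 criteria used in these case summaries (allele frequency below 0.001 and a likely loss-of-function consequence) can be illustrated with a minimal Python sketch for a sibling-pair analysis such as Case 6. This is not the pipeline used in this project: the `Variant` record, the `LOF_CONSEQUENCES` set (assumed here from the consequence labels that appear in these lists), and the `filter2_shared` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass

# Assumption: consequence labels treated as "likely loss of function",
# taken from the labels used in the variant lists of this appendix.
LOF_CONSEQUENCES = {
    "nonsense", "frameshift", "stop loss",
    "splice donor", "splice acceptor", "splice region",
}

AF_CUTOFF = 0.001  # allele-frequency threshold stated in the filter headings


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal variant record."""
    gene: str
    hgvs: str
    consequence: str
    gnomad_af: float  # 0.0 when the variant is not seen in gnomAD


def filter2_shared(sib_a, sib_b, af_cutoff=AF_CUTOFF):
    """Return rare, likely-LoF variants carried by both affected siblings."""
    shared = set(sib_a) & set(sib_b)
    kept = [
        v for v in shared
        if v.gnomad_af < af_cutoff and v.consequence in LOF_CONSEQUENCES
    ]
    # Sort for a stable, readable report order.
    return sorted(kept, key=lambda v: (v.gene, v.hgvs))
```

Variants passing such a filter would still need manual review against read support, caller agreement, and phenotypic fit, as documented for each variant above.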

Case 7: Trio Exome Reanalysis, Unaffected Parents

679 variants total. 49 variants when filtered for HPO terms. 16 variants analyzed.

Filter 1 A – HPO terms, AF < 0.001, de novo het variants: 0 variants found

Filter 1 B – HPO Terms, AF < 0.001, Homo variants when parents are het: 0 variants found

Filter 1 C – HPO Terms, AF < 0.001, Burden > 1 in autosomal recessive gene: 0 variants found

Filter 2 – AF < 0.001, Likely Loss of Function Variants: 2 variants analyzed:

• Hemi G6PD c.1002+5G>A (splice region), VUS in ClinVar, 1 het in GnomAD, O/E LOF score of 0.112, splice score of 2.379, high phenolyzer rank (509), not a good phenotypic match (hemolytic anemia)
• Het RRP36 c.130+4A>G (splice region), not seen in GnomAD, O/E LOF score of 1.12, splice score of 5.64, coverage in parents too low to call de novo, GUS

Filter 3A – Neuromuscular Candidate Genes: 0 variants found

Filter 3B – Intellectual Disability Candidate Genes: 1 variant analyzed:

• Het SGSH c.19G>C p.Ala7Pro, het variant in AR disease-gene, not seen in GnomAD, mixed in-silico predictions (SIFT = 0.06, PolyPhen = 0.162, CADD = 11.84), coverage in parents too low to call de novo


Homo variants when each parent is het: 2 variants found:

• NUP98 c.2198-9_2198-8dup (splice region), not seen in GnomAD, O/E LOF score of 0.05, splice score of 0, only seen by 2 callers, high phenolyzer rank (133)
• GOLGA6L6 c.1456C>A p.His486Asn, coverage too low to conclude homozygous, 2 hets in GnomAD, mixed in-silico predictions (SIFT = none, PolyPhen = 0, CADD = none), only seen by 2 callers, no phenolyzer rank listed

De novo variants: 10 variants analyzed:

• Het MUC19 c.14326C>A p.His4776Asn, seen 37 times in C4R, 73 hets in GnomAD, no O/E scores listed, no in-silico predictions, only seen by 2 callers
• Het MUC19 c.16453T>C p.Ser5485Pro, seen 104 times in C4R, not seen in GnomAD, no O/E scores listed, no in-silico predictions, only seen by 2 callers
• Het KIR2DL3 c.576C>A p.His192Gln, seen 62 times in C4R, 8 hets in GnomAD, in-silico predicts benign (SIFT = 0.16, PolyPhen = 0.347, CADD = 4.23)
• Het ZNF471 c.650A>G p.Lys217Arg, not seen in GnomAD, in-silico predicts benign (SIFT = 0.12, PolyPhen = 0.007, CADD = 10.34)
• Het TNFRSF13C c.67G>A p.Glu23Lys, not seen in GnomAD, in-silico predicts benign (SIFT = 0.45, PolyPhen = 0.024, CADD = 12.05), only seen by 2 callers
• Het ZFR c.120_122del p.Ala43del (in-frame deletion), 151 hets in GnomAD, only seen by 2 callers, low phenolyzer rank (16392)
• Het FST c.13A>G p.Arg5Gly, not seen in GnomAD, mixed in-silico predictions (SIFT = 0.04, PolyPhen = 0.033, CADD = 21.4), only seen by 2 callers
• Het POM121 c.1730G>A p.Ser577Asn, 180 hets in GnomAD, in-silico predicts benign (SIFT = 0.87, PolyPhen = 0.003, CADD = 0.006), only seen by 2 callers, high phenolyzer rank (322)
• Homo KIR2DL3 c.454C>T p.Arg152Trp, seen 240 times in C4R, 71 hets and 3 homos in GnomAD, in-silico predicts benign (SIFT = 0.16, PolyPhen = 0.043, CADD = 0.844)
• Homo KIR2DL3 c.485G>C p.Arg162Thr, seen 217 times in C4R, 97 hets in GnomAD, mixed in-silico predictions (SIFT = 0.01, PolyPhen = 0.696, CADD = 4.31)

X-Linked variants: 2 variants analyzed:

• Hemi TEX13B c.382C>T p.Gln128Ter (nonsense), seen 21 times in C4R, 721 hets, 2 homos and 421 hemizygotes seen in GnomAD, O/E LOF score of 0.73
• Hemi CXorf40A c.433A>C p.Asn145His, 48 hets and 27 hemizygotes seen in GnomAD, in-silico predicts benign (SIFT = none, PolyPhen = 0.656, CADD = 7.25)

Case 8: Trio Exome Reanalysis, Unaffected Parents

606 variants total. 67 variants when filtered for HPO terms. 9 variants analyzed.

Filter 1 A – HPO terms, AF < 0.001, de novo het variants: 1 variant analyzed:

• RELN c.1402A>C, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0.01, PolyPhen = 0.996, CADD = 22.8), only seen by 2 callers, 1 matching HPO term (global developmental delay)


Filter 1 B – HPO Terms, AF < 0.001, Homo variants when parents are het: 0 variants found

Filter 1 C – HPO Terms, AF < 0.001, Burden > 1 in autosomal recessive genes: 0 variants found

Filter 2 – AF < 0.001, Likely Loss of Function variants: 2 variants analyzed:

• Het WSB2 c.13+1G>T (splice donor), not seen in GnomAD, O/E LOF score of 0.139, splice score of 8.5, only seen by 2 callers, gene not in OMIM
• Het PTGR2 c.352_353del, not seen in GnomAD, O/E LOF score of 0.679, only seen by 2 callers

De novo variants: 1 variant analyzed:

• Homo MUC19 c.15842C>T p.Thr5281Ile, seen 25 times in C4R, 6 hets in GnomAD, no in-silico scores

X-linked variants: 5 variants analyzed:

• Hemi COL4A5 c.2694G>A p.Met898Ile, not seen in GnomAD, mixed in-silico predictions (SIFT = 0.05, PolyPhen = 0.019, CADD = 22.1), high phenolyzer rank (526), gene causes Alport’s syndrome which is not a good phenotypic match
• Hemi BCORL1 c.3367A>G p.Lys1123Glu, 8 hets in GnomAD, mixed in-silico predictions (SIFT = none, PolyPhen = 0.255, CADD = 17.53)
• Hemi STARD8 c.1122A>T p.Glu374Asp, 1 het in GnomAD, in-silico predicts benign (SIFT = 0.28, PolyPhen = 0.225, CADD = 0.003)
• Het MAGEC1 c.617T>C p.Phe206Ser, heterozygous variant in X-linked gene for male participant indicates artifact, 22 hets in GnomAD, in-silico predicts benign (SIFT = 0.21, PolyPhen = 0.307, CADD = 13.89), low phenolyzer rank (19017)
• Het MAGEC1 c.618C>G p.Phe206Leu, heterozygous variant in X-linked gene for male participant indicates artifact, 1 het in GnomAD, in-silico predicts benign (SIFT = 0.45, PolyPhen = 0.005, CADD = 7.489), low phenolyzer rank (19017)
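
The reasoning applied to the two MAGEC1 calls above (a heterozygous call in an X-linked gene in a male participant points to an artifact) can be expressed as a simple check. The sketch below is illustrative only: the function name and the dictionary keys `chrom` and `zygosity` are hypothetical, and the check deliberately ignores the pseudoautosomal regions, where heterozygous calls in males are expected.

```python
def flag_x_het_in_male(calls, participant_sex):
    """Return calls that are heterozygous on chromosome X in a male participant.

    `calls` is a list of dicts with hypothetical keys "chrom" and "zygosity".
    Pseudoautosomal regions are NOT handled in this sketch.
    """
    if participant_sex.lower() != "male":
        return []
    return [
        call for call in calls
        if call["chrom"] in {"X", "chrX"} and call["zygosity"].lower() == "het"
    ]
```

Calls flagged this way would be candidates for exclusion as likely mapping or calling artifacts rather than being confirmed as artifacts automatically.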

Case 9: Trio Exome Reanalysis, Unaffected Parents

593 variants total. 65 variants when filtered for HPO terms. 9 variants analyzed.

Filter 1 A – HPO terms, AF < 0.001, de novo het variants: 0 variants found

Filter 1 B – HPO Terms, AF < 0.001, Homo variants when parents are het: 0 variants found

Filter 1 C – HPO Terms, AF < 0.001, Burden > 1 in autosomal recessive gene: 1 variant analyzed:

• HTT c.95_96insACA p.Gln38dup (in-frame insertion), not seen in GnomAD, only seen by 2 callers, 3 matching HPO terms (Developmental regression, Global developmental delay, Seizures), gene causes Huntington’s disease, not a good phenotypic match

Filter 2 – AF < 0.001, Likely Loss of Function Variants: 1 variant found

• Het MTX2 c.112T>C p.Ter38GlnextTer4 (stop loss), seen in 2 out of 6 reads, 3 hets in GnomAD, O/E LOF score of 0.34, only seen by 2 callers


Filter 3 – Epilepsy Candidate Genes: 0 variants found

De novo variants: 7 variants analyzed:

• Homo NPIPB4 c.2382+2G>C (splice donor), 77 hets in GnomAD, O/E LOF score of 0.19, splice score of 0.107, only seen by 2 callers, no phenolyzer rank listed
• Homo NPIPB4 c.2378C>A p.Pro793His, 48 hets in GnomAD, mixed in-silico predictions (SIFT = 0, PolyPhen = 0.994, CADD = 12.83), only seen by 2 callers, no phenolyzer rank listed
• Homo CRIPAK c.644T>C p.Met215Thr, seen 33 times in C4R, 2 hets in GnomAD, in-silico predicts benign (SIFT = 0.28, PolyPhen = 0.001, CADD = 0.002), only seen by 2 callers, no phenolyzer rank listed
• Homo CRIPAK c.646T>C p.Trp216Arg, seen 38 times in C4R, 138 hets and 27 homos in GnomAD, in-silico predicts benign (SIFT = 0.72, PolyPhen = 0, CADD = 0.001), only seen by 2 callers, no phenolyzer rank listed
• Homo HLA-DRB1 c.305C>G p.Ala102Gly, seen 596 times in C4R, not seen in GnomAD, mixed in-silico predictions (SIFT = 0.08, PolyPhen = 0, CADD = 8.13), only seen by 2 callers
• Het KRTAP4-7 c.279C>G p.Ser93Arg, seen 96 times in C4R, 15 hets in GnomAD, mixed in-silico predictions (SIFT = 0.01, PolyPhen = 0.648, CADD = 23.7), only seen by 2 callers
• Het GPR150 c.527C>A p.Ala176Glu, not seen in GnomAD, mixed in-silico predictions (SIFT = 0.06, PolyPhen = 0.74, CADD = 19.29)

Case 10: Singleton Exome Reanalysis

719 variants total. 53 variants when filtered for HPO terms. 48 variants analyzed.

Filter 1 A – HPO terms, AF < 0.001, Het variants in autosomal dominant genes: 11 variants analyzed:

• NFASC c.476C>A p.Thr159Lys, not seen in GnomAD, mixed in-silico predictions (SIFT = 0, PolyPhen = 0.027, CADD = 33), 2 matching HPO terms (Delayed gross motor development, Feeding difficulties)
• KCNH1 c.1849A>G p.Ser590Gly, not seen in GnomAD, mixed in-silico predictions (SIFT = 0.01, PolyPhen = 0.217, CADD = 19.9), 1 matching HPO term (High palate), low phenolyzer rank (11282)
• SYNE2 c.18407G>A p.Arg6136His, 19 hets in GnomAD, in-silico predicts deleterious (SIFT = 0.01, PolyPhen = 0.986, CADD = 35), 1 matching HPO term (Abnormal muscle fiber morphology), low phenolyzer rank (11276)
• CHRNE c.764C>T p.Ser255Leu, likely pathogenic in ClinVar, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.996, CADD = 34), 4 matching HPO terms (Abnormal muscle fiber morphology, Delayed gross motor development, Feeding difficulties, High palate), listed as paternally inherited on original exome
• CHRNB1 c.1218-7C>T (splice region), 18 hets in GnomAD, O/E LOF score of 0.92, splice score of 0.015, 4 matching HPO terms (Abnormal muscle fiber morphology, Delayed gross motor development, Feeding difficulties, High palate)
• RYR1 c.13354A>T p.Met4457Leu, VUS in ClinVar, 4 hets and 1 homo in GnomAD, in-silico predicts benign (SIFT = 1, PolyPhen = 0, CADD = 9.02), 7 matching HPO terms (Abnormal muscle fiber morphology, Delayed gross motor development, Dolichocephaly, Feeding difficulties, Generalized muscle weakness, High palate, Joint hyperflexibility), high phenolyzer rank (214)
• KIAA1109 c.14141A>G p.Asp4714Gly, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0.05, PolyPhen = 0.997, CADD = 25.2), 1 matching HPO term (High palate), no phenolyzer rank listed
• ZFPM2 c.1064C>G p.Ser624Cys, 25 hets in GnomAD, mixed in-silico predictions (SIFT = 0.05, PolyPhen = 0, CADD = 22.8), 1 matching HPO term (Dolichocephaly)
• CYP11B1 c.641A>G p.His214Arg, 5 hets in GnomAD, in-silico predicts benign (SIFT = 0.5, PolyPhen = 0.015, CADD = 0.001), 1 matching HPO term (Feeding difficulties)
• PTCH1 c.2005G>A p.Asp669Asn, VUS in ClinVar, 6 hets in GnomAD, mixed in-silico predictions (SIFT = 0.07, PolyPhen = 0.031, CADD = 23.5), 2 matching HPO terms (Feeding difficulties, Joint hyperflexibility), high phenolyzer rank (78)
• AGTR2 c.154T>G p.Tyr52Asp, 1 hemizygote in GnomAD, in-silico predicts benign (SIFT = 0.15, PolyPhen = 0.213, CADD = 11.23), 1 matching HPO term (Thin upper vermillion), X-linked disorder in a female

Filter 1 B – HPO terms, AF < 0.001, Homo variants in autosomal recessive genes: 2 variants analyzed:

• TTN c.33340+3A>G (splice region), not seen in GnomAD, splice score of 5.622, 5 matching HPO terms (Abnormal muscle fiber morphology, Delayed gross motor development, Generalized muscle weakness, High palate, Microretrognathia) • TTN c.11461G>A p.Gly3675Ser, VUS in ClinVar, 6 hets in GnomAD, mixed in-silico predictions (SIFT = none, PolyPhen = 1, CADD = 17.39), 5 matching HPO terms (Abnormal muscle fiber morphology, Delayed gross motor development, Generalized muscle weakness, High palate, Microretrognathia)

Filter 1 C – HPO Terms, AF < 0.001, Burden > 1 in autosomal recessive genes: 2 variants analyzed:

• FMN2 c.718C

Filter 2 – AF < 0.001, Likely Loss of Function Variants: 23 variants analyzed:

• Het PIK3C2B c.2393del p.Pro798LeufsTer53 (frameshift), not seen in GnomAD, O/E LOF score of 0.18
• Homo BEST4 c.636+8G>A (splice region), not seen in GnomAD, O/E LOF score of 0.51, splice score of 0, low phenolyzer rank (11824), GUS
• Het CELF2 c.687C>A (splice region), not seen in GnomAD, O/E LOF score of 0, splice score of 0.31, low phenolyzer rank (12961)
• Het ADAM8 c.1281dup p.Glu428ArgfsTer17 (frameshift), 5 hets in GnomAD, O/E LOF score of 0.63
• Het COPB1 c.92-135G>A (splice region), 2 hets in GnomAD, O/E LOF score of 0.10, splice score of -3.3, GUS
• Het MRPL23 c.297+3681C>T (splice region), not seen in GnomAD, O/E LOF score of 1.6, splice score of 0.027, only seen by 2 callers, low phenolyzer rank (11450)
• Het TRIM51P c.645+1A>G (splice donor), not seen in GnomAD, no O/E scores listed, splice score of -8.8, no phenolyzer rank listed, pseudogene not in OMIM
• Het TFAP4 c.526-8G>C (splice region), 1 het in GnomAD, O/E LOF score of 0, splice score of -1.34
• Het DNAH9 c.3353+3A>G (splice region), not seen in GnomAD, O/E LOF score of 0.62, splice score of 4.2, het variant in AR disease-gene, low phenolyzer rank (12863)
• Het IL9RP4 n.934+8G>T (splice region), not seen in GnomAD, no O/E score listed, splice score of 0, low phenolyzer rank (16292)
• Het ZRANB3 c.715del p.Ala239HisfsTer11 (frameshift), 1 het seen in GnomAD, O/E LOF score of 0.69, low phenolyzer rank (16812)
• Het CCT4 c.127+118C>T (splice region), not seen in GnomAD, O/E LOF score of 0.07, splice score of 0.44
• Het KIAA1644 c.282-4G>A (splice region), 4 hets in GnomAD, O/E LOF score of 0.41, splice score of 0.23, no phenolyzer rank listed
• Het TMPRSS11E c.828_853del p.Tyr276Ter (frameshift), not seen in GnomAD, O/E LOF score of 0.599, only seen by 2 callers, no phenolyzer rank listed
• Het OSMR c.1585+4A>C (splice region), 1 het in GnomAD, O/E LOF score of 0.70, splice score of 2.68, not a good phenotypic match
• Het IL31RA c.-246+6T>C (splice region), not seen in GnomAD, O/E LOF score of 0.71, splice score of 0.877, not a good phenotypic match
• Het XPO5 c.1442-25_1442-8del (splice region), not seen in GnomAD, O/E LOF score of 0.16, splice score of 0
• Het AKR1B1 c.237G>A (splice region), 3 hets in GnomAD, O/E LOF score of 0.51, splice score of 0.36
• Het PRKAR1B c.770-7C>T (splice region), 3 hets in GnomAD, O/E LOF score of 0.15, splice score of 0.31
• Het SEMA3D c.376-8C>T (splice region), not seen in GnomAD, O/E LOF score of 0.45, splice score of -0.955, low phenolyzer rank, not a good phenotypic match
• Het MTSS1 c.1580-8A>G (splice region), 1 het in GnomAD, O/E LOF score of 0.06, splice score of -0.198, low phenolyzer rank (12929)
• Het GPR124 c.1698_1701dup p.Val568ArgfsTer17 (frameshift), not seen in GnomAD, O/E LOF score of 0.40, low phenolyzer rank (12072), GUS
• Het SSNA1 c.253-1G>A (splice acceptor), 1 het in GnomAD, O/E LOF score of 1.02, splice score of 8.75, low phenolyzer rank, GUS

Filter 3 – Neuromuscular Candidate gene list: 10 variants analyzed:

• Het IBA57 c.272C>T p.Ala91Val, het variant in AR disease-gene, 8 hets in GnomAD, in-silico predicts benign (SIFT = 0.32, PolyPhen = 0.005, CADD = 0.499), 2 matching HPO terms (Feeding difficulties, High palate), low phenolyzer rank (14979)
• Het PYGM c.2447G>A p.Arg728His, het variant in AR disease-gene, 39 hets in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.995, CADD = 35)
• Het DDHD1 c.334_336dup p.Gly112dup (in-frame insertion), het variant in AR disease-gene, 75 hets in GnomAD, only seen by 2 callers
• Het RYR3 c.11224C>T p.Leu3742Phe, 1 het in GnomAD, in-silico predicts deleterious (SIFT = 0.02, PolyPhen = 0.999, CADD = 32)
• Het LAMA5 c.8314G>A p.Glu2772Lys, not seen in GnomAD, in-silico predicts benign (SIFT = 0.87, PolyPhen = 0.007, CADD = 2.992), high phenolyzer rank (370)
• Het SIL1 c.1112C>A p.Pro371Gln, het variant in AR disease-gene, not seen in GnomAD, mixed in-silico predictions (SIFT = 0.03, PolyPhen = 0.276, CADD = 23.2), 1 matching HPO term (Abnormal muscle fiber morphology), low phenolyzer rank (11796)
• Het AARS2 c.1055G>A p.Arg352Gln, het variant in AR disease-gene, 9 hets in GnomAD, mixed in-silico predictions (SIFT = 0.04, PolyPhen = 0.24, CADD = 23.6), 1 matching HPO term (Generalized muscle weakness)
• Het SGCE c.16T>G p.Trp6Gly, not seen in GnomAD, mixed in-silico predictions (SIFT = 0, PolyPhen = 0.31, CADD = 27.1), no phenolyzer rank listed, not a good phenotypic match
• Homo SETX c.5621T>G p.Leu1874Arg, 1 het in GnomAD, in-silico predicts benign (SIFT = 0.36, PolyPhen = 0.216, CADD = 8.33), low phenolyzer rank (11743)
• Het GBA2 c.74A>T p.Asp25Val, het variant in AR disease-gene, not seen in GnomAD, mixed in-silico predictions (SIFT = 0, PolyPhen = 0.179, CADD = 11.94)

Case 11: Singleton Exome Reanalysis

542 variants total. 67 when filtered for HPO terms. 30 variants analyzed.

Filter 1 A – HPO terms, AF < 0.001, Het variants in autosomal dominant genes: 8 variants analyzed:

• WDR26 c.73G>A p.Gly25Ser, 2 hets in GnomAD, mixed in-silico predictions (SIFT = 0, PolyPhen = 0, CADD = 21.9), 2 matching HPO terms (Global developmental delay, Seizures)
• HEPACAM c.121C>T p.Arg41Cys, 2 hets in GnomAD, in-silico predicts deleterious (SIFT = 0.07, PolyPhen = 0.744, CADD = 23.9), 1 matching HPO term (Infantile onset seizures), low phenolyzer rank (13445)
• ATN1 c.1488_1508dup p.Gln496_Gln502dup (in-frame insertion), seen in 6 out of 33 reads, not seen in GnomAD, 2 matching HPO terms (Choreoathetosis, Seizures), gene can cause disease through point mutations or triplet repeat expansion, triplet repeat expansion does not match phenotype
• TCF4 c.-31G>C, not seen in GnomAD, no O/E scores listed, splice score of 0, 4 matching HPO terms (Abnormality of the liver, Elevated hepatic transaminase, Global developmental delay, Seizures)
• TBR1 c.848-109_848-106dup (splice region), 5 hets in GnomAD, O/E LOF score of 0.46, splice score of 0, 2 matching HPO terms (Global developmental delay, Seizures), low phenolyzer rank (12431)
• FLNB c.3025C>T p.Arg1009Trp, seen 4 times in C4R, 1 het in GnomAD, in-silico predicts deleterious (SIFT = 0.06, PolyPhen = 0.924, CADD = 34), 1 matching HPO term (Global developmental delay), not a good phenotypic match
• FGFRL1 c.286G>A p.Val96Met, 48 hets in GnomAD, in-silico predicts deleterious (SIFT = 0.04, PolyPhen = 0.775, CADD = 16.72), 3 matching HPO terms (Abnormality of the liver, Global developmental delay, Seizures), disease associated with Wolf-Hirschhorn, which is a microdeletion syndrome
• ZSWIM6 c.3119G>A p.Arg1040His, 69 hets in GnomAD, in-silico predicts deleterious (SIFT = 0.02, PolyPhen = 0.96, CADD = 25.1), low phenolyzer rank (12799)

Filter 1 B – HPO terms, AF < 0.001, Homo variants in autosomal recessive genes: 0 variants found


Filter 1 C – HPO terms, AF < 0.001, Burden > 1 in autosomal recessive genes: 1 variant analyzed:

• ROBO3 c.3736C>T p.Arg1246Cys, 27 hets in GnomAD, no homos in GnomAD, mixed in-silico predictions (SIFT = 0, PolyPhen = 0.513, CADD = 32), 1 matching HPO term (Seizures)
  o ROBO3 c.2993G>T p.Gly998Val, VUS in ClinVar once and Likely Benign in ClinVar once, seen 19 times in C4R, 442 hets and 2 homos in GnomAD, mixed in-silico predictions (SIFT = 0, PolyPhen = 0.027, CADD = 27.9)
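
One way to read the "Burden > 1 in autosomal recessive genes" criterion of Filter 1 C is as a search for genes carrying more than one rare variant in the same participant, which flags possible compound heterozygotes such as the two ROBO3 variants above. The Python sketch below illustrates that reading only; it is an assumption about the filter's intent, and the function name, the tuple layout `(gene, hgvs, allele_frequency)`, and the `ar_genes` set are hypothetical.

```python
from collections import defaultdict


def burden_gt_one(variants, ar_genes, af_cutoff=0.001):
    """Group rare variants by autosomal recessive gene; keep genes with more than one.

    `variants` is an iterable of (gene, hgvs, allele_frequency) tuples.
    `ar_genes` is the set of genes considered autosomal recessive (assumed input).
    """
    by_gene = defaultdict(list)
    for gene, hgvs, allele_frequency in variants:
        if gene in ar_genes and allele_frequency < af_cutoff:
            by_gene[gene].append(hgvs)
    # "Burden > 1": more than one qualifying variant in the same gene.
    return {gene: hgvs_list for gene, hgvs_list in by_gene.items() if len(hgvs_list) > 1}
```

Without parental data, as in a singleton analysis, a gene returned by this check would still need phasing to establish that the two variants lie on different alleles.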

Filter 2 – AF < 0.001, Likely Loss of Function Variants: 21 variants analyzed:

• Het CYP4A11 c.1058del p.Ser353ThrfsTer28 (frameshift), not seen in GnomAD, O/E LOF score of 1.27
• Het MALRD1 c.943+13del (splice region), 1 het in GnomAD, no O/E scores listed, splice score of 0, no phenolyzer rank listed
• Het WDFY4 c.1252C>T p.Arg418Ter (nonsense), not seen in GnomAD, O/E LOF score of 0.22
• Het RP11-7M8.2 c.125-72A>T (splice region), not seen in GnomAD, no O/E scores listed, splice score of 0.16, no phenolyzer rank listed
• Het RECQL c.700+4T>A (splice region), 5 hets in GnomAD, O/E LOF score of 0.96, splice score of -2.51, low phenolyzer rank (19647)
• Het RECQL c.622_623insTC p.Tyr208PhefsTer35 (frameshift), not seen in GnomAD, O/E LOF score of 0.96, low phenolyzer rank (19647)
• Het CCDC175 c.1491+5G>C (splice region), 2 hets in GnomAD, O/E LOF score of 0.55, splice score of 1.96, no phenolyzer rank listed
• Het TMUB2 c.858del p.Trp286CysfsTer29 (frameshift), not seen in GnomAD, O/E LOF score of 0.86, low phenolyzer rank (20109)
• Het MYPOP c.1099dup p.Arg367ProfsTer83 (frameshift), seen in 7 out of 28 reads, seen 6 times in C4R, not seen in GnomAD, O/E LOF score of 0, low phenolyzer rank (15597)
• Het ESPNL c.913_923dup p.Gly309ProfsTer12 (frameshift), not seen in GnomAD, O/E LOF score of 0.61, low phenolyzer rank (12552)
• Het PASK c.1811dup p.Ser605GlnfsTer12 (frameshift), 7 hets in GnomAD, O/E LOF score of 0.73, no phenolyzer rank listed
• Het TMC2 c.728-1G>T (splice region), 4 hets in GnomAD, O/E LOF score of 0.87, splice score of 8.59, no phenolyzer rank listed
• Homo FBXL c.747+97T>C (splice region), not seen in GnomAD, O/E LOF score of 0.311, splice score of -0.65, low phenolyzer rank (13408)
• Het PPID c.868C>T p.Gln290Ter (nonsense), 4 hets in GnomAD, O/E LOF score of 0.77, low phenolyzer rank (12528)
• Het SIMC1 c.*120C>A (splice region), seen in 4 out of 19 reads, not seen in GnomAD, O/E LOF score of 0.55, splice score of 0.13, no phenolyzer rank listed
• Het TBC1D9B c.1565+8G>A (splice region), not seen in GnomAD, O/E LOF score of 0.49, splice score of 0, low phenolyzer rank (11828)
• Het GOLPH3 c.653dup p.Glu219GlyfsTer5 (frameshift), not seen in GnomAD, O/E LOF score of 0.19, low phenolyzer rank (12618)
• Het MCMDC2 c.10del p.Leu4Ter (frameshift), seen in 7 out of 27 reads, 8 hets in GnomAD, O/E LOF score of 0.61, no phenolyzer rank listed
• Het ZNF883 c.949G>T p.Gly317Ter (nonsense), seen 4 times in C4R, 7 hets in GnomAD, no O/E scores listed, low phenolyzer rank (11271)
• Het NA c.559_560del p.Ser187HisfsTer85 (frameshift), 1 het in GnomAD, no O/E LOF score listed, no phenolyzer rank listed
• Het SDCCAG3 c.1135del p.Val379TrpfsTer2 (frameshift), 1 het in GnomAD, O/E LOF score of 0.89, no phenolyzer rank listed

Filter 3 – Epilepsy Candidate Genes: 0 variants found

Case 12: Trio Exome Reanalysis, Unaffected Parents

622 total variants. 56 when filtered for HPO terms. 5 variants analyzed.

Filter 1 A – HPO terms, AF < 0.001, de novo het variants: 1 variant analyzed:

• MED12 c.1964_1966del p.Ser655del (in-frame deletion), 5 hets in GnomAD, 3 matching HPO terms (Abnormality of the cerebral white matter, Constipation, Generalized hypotonia), no phenolyzer rank listed, gene that causes an X-linked recessive disorder found in a female participant

Filter 1 B – HPO terms, AF < 0.001, Homo variants when parents are het: 0 variants found

Filter 1 C – HPO terms, AF < 0.001, Burden >1 in autosomal recessive genes: 0 variants found

Filter 2 - AF < 0.001, Likely Loss of Function Variants: 3 variants analyzed:

• Het FTOP1 n.369+3A>G (splice region), seen in 2 out of 3 reads, not seen in GnomAD, no O/E scores listed, splice score of 4.1, only seen by 2 callers, no phenolyzer rank listed
• Het WBP1 c.419dup p.Tyr141LeufsTer14 (frameshift), 38 hets in GnomAD, O/E LOF score of 0.78, low phenolyzer rank (18206)
• Het TMEM175 c.194_195del p.Val65AspfsTer104 (frameshift), seen in 3 out of 5 reads, 12 hets in GnomAD, O/E LOF score of 0.7, no phenolyzer rank listed

Filter 3 – Candidate genes: No candidate gene list for this phenotype

OMIM Genes: 1 variant analyzed:

• Het USP18 c.511C>T p.Arg171Trp, het variant in AR disease-gene, 15 hets in GnomAD, mixed in-silico predictions (SIFT = 0.1, PolyPhen = 0.007, CADD = 15.76), 2 matching HPO terms (Cerebral calcification, Generalized hypotonia), maternally inherited variant

Case 13: Trio Exome Reanalysis, Affected Siblings

467 shared variants total. 5 shared variants when filtered for HPO terms. 39 variants analyzed.

Filter 1 A – HPO terms, AF < 0.001, shared het variants: 3 variants analyzed:


• XDH c.3539T>C p.Val1180Ala, het variant in AR disease-gene, 2 hets in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.99, CADD = 32), 1 matching HPO term (Nephrolithiasis), high phenolyzer rank (448)
• SI c.273_274del p.Gly92LeufsTer8 (frameshift), 2 siblings homozygous and 1 sibling heterozygous in AR disease-gene, seen 10 times in C4R, 1 het in GnomAD, O/E LOF score of 0.86, 1 matching HPO term (Nephrolithiasis), high phenolyzer rank (593)
• HLA-DRB1 c.370+8T>C (splice region), not seen in GnomAD, O/E LOF score of 0.56, splice score of 0, only seen by 2 callers, 1 matching HPO term (Nephrolithiasis)

Filter 1 B – HPO terms, AF < 0.001, shared homo variants: 0 variants found

Filter 1 C – HPO terms, AF < 0.001, Burden > 1 in autosomal recessive genes: 0 variants found

Filter 2 – AF <0.001, Likely Loss of Function Variants: 36 variants analyzed:

• Homo MUC5AC c.1902_1902+1insA p.Met635AsnfsTer88 (frameshift), not seen in GnomAD, no O/E scores listed, only seen by 2 callers
• Homo KIAA0430 c.3338_3339insCTCTGCCTCCCTCCTTCTTTCCT p.Pro1114SerfsTer60 (frameshift), not seen in GnomAD, O/E LOF score of 0.05, no phenolyzer rank listed
• Homo MROH8 c.13+21_13+22insATAGACAGGGCCCCGCGGCCGGCACTCTT (splice acceptor), not seen in GnomAD, O/E LOF score of 0.61, splice score of 0, no phenolyzer rank listed
• Homo LIMD1 c.1409-5G>T (splice region), not seen in GnomAD, O/E LOF score of 0.22, splice score of -2.69
• Homo LIMD1 c.1409-4C>G (splice region), not seen in GnomAD, O/E LOF score of 0.22, splice score of 0.34
• Homo LIMD1 c.1409-3A>C (splice region), not seen in GnomAD, O/E LOF score of 0.22, splice score of -3.85
• Homo HLA-DQB1 c.-64+7T>A (splice region), seen in 5 of 5 reads, not seen in GnomAD, O/E LOF score of 0.48, splice score of 0
• Homo HLA-DQB1 c.-64+5A>G (splice region), seen in 5 of 5 reads, not seen in GnomAD, O/E LOF score of 0.48, splice score of -12.08
• Homo HLA-DQB1 c.-64+4C>A (splice region), seen in 5 of 5 reads, not seen in GnomAD, O/E LOF score of 0.48, splice score of -5.21
• Homo BMS1P9 n.199-5T>C (splice region), seen in 2 of 2 reads, not seen in GnomAD, no O/E score listed, splice score of 0.48, no phenolyzer rank listed
• Het STAB2 c.2647-4G>A (splice region), seen 8 times in C4R, 16 hets in GnomAD, O/E LOF score of 0.73, splice score of 0.37
• Het PRB3 c.649+56A>G (splice region), seen in 5 of 13 reads, 2 hets in GnomAD, no O/E scores listed, splice score of 0, only seen by 2 callers
• Het PRB3 c.649+50C>A (splice donor), not seen in GnomAD, no O/E scores listed, splice score of 0.42, only seen by 2 callers
• Het CDH16 c.2209+7T>C (splice region), seen 6 times in C4R, 6 hets in GnomAD, O/E LOF score of 1.04, splice score of 0, low phenolyzer rank (11686)
• Het HYDIN c.4195-8_4195-4del (splice region), seen 5 times in C4R, not seen in GnomAD, O/E LOF score of 0.37, splice score of 0, only seen by 2 callers, no phenolyzer rank listed
• Het KCNJ12 c.-56-4G>C (splice region), not seen in GnomAD, O/E LOF score of 0.39, splice score of -0.25
• Het CABLES1 c.1761+1G>A (splice donor), not seen in GnomAD, O/E LOF score of 0.37, splice score of 8.18
• Het WDR87 c.8046_8049dup p.Pro2684ArgfsTer6 (frameshift), not seen in GnomAD, O/E LOF score of 0.73, low phenolyzer rank (15122)
• Het WDR87 c.5895dup p.Gln1966ThrfsTer32 (frameshift), not seen in GnomAD, O/E LOF score of 0.73, low phenolyzer rank (15122)
• Het RYR1 c.4455-7C>T (splice region), seen 8 times in C4R, not seen in GnomAD, O/E LOF score of 0.38, splice score of 0.29
• Het ANKRD30BL n.440A>G (splice region), seen in 5 of 10 reads, not seen in GnomAD, no O/E scores listed, splice score of 2.09, low phenolyzer rank (15662)
• Het ZNF806 c.942del p.Tyr315ThrfsTer152 (frameshift), not seen in GnomAD, no O/E scores listed, low phenolyzer rank (16332)
• Het ZNF806 c.1367dup p.Asn456LysfsTer85 (frameshift), not seen in GnomAD, no O/E scores listed, low phenolyzer rank (16332)
• Het ZNF806 c.1580del p.Asn527MetfsTer36 (frameshift), not seen in GnomAD, no O/E scores listed, low phenolyzer rank (16332)
• Het LOC101060017 c.34-4C>G (splice region), seen in 2 out of 14 reads, not seen in GnomAD, no O/E scores listed, splice score of -0.75, only seen by 2 callers, no phenolyzer rank listed
• Het FAM182B c.201+1490C>T (splice acceptor), not seen in GnomAD, no O/E scores listed, splice score of 0.53, no phenolyzer rank listed
• Het FAM182B c.201+1483T>A (splice region), not seen in GnomAD, no O/E scores listed, splice score of 0.76, no phenolyzer rank listed
• Het APOL1 c.-59+773G>C (splice donor), seen 9 times in C4R, 6 hets in GnomAD, O/E LOF score of 1.02, splice score of 8.27, low phenolyzer rank (15050)
• Het MUC8 c.83-2436A>C (splice region), not seen in GnomAD, no O/E scores listed, splice score of 8.04, only seen by 2 callers
• Het IQCG c.8+3A>G (splice region), seen 8 times in C4R, not seen in GnomAD, O/E LOF score of 1.23, splice score of 4.6, low phenolyzer rank (12700)
• Het CDHR4 c.848-8C>T (splice region), not seen in GnomAD, O/E LOF score of 0.71, splice score of 0.53, low phenolyzer rank (15296)
• Het HHIP c.1678+6G>A (splice region), not seen in GnomAD, O/E LOF score of 0.16, splice score of 1.26
• Het BTN1A1 c.881-8T>C (splice region), seen 6 times in C4R, not seen in GnomAD, O/E LOF score of 0.74, splice score of -0.38
• Het BMP6 c.1392+7T>G (splice region), seen 5 times in C4R, not seen in GnomAD, O/E LOF score of 0.18, splice score of 0
• Het SDK1 c.566-5del (splice region), not seen in GnomAD, O/E LOF score of 0.30, splice score of, low phenolyzer rank (14080)
• Het FBXL6 c.1225+6C>T (splice region), not seen in GnomAD, O/E LOF score of 0.73, splice score of -2.66, only seen by 2 callers, no phenolyzer rank listed

Filter 3 – Candidate genes: No candidate gene list for this phenotype


Case 14: Duo Exome Reanalysis, Affected Sibling Pair

377 shared variants total. 24 shared variants when filtered for HPO terms. 29 variants analyzed.

Filter 1 A – HPO terms, AF < 0.001, shared het variants: 4 variants analyzed:

• PUM1 c.2120G>C p.Arg707Pro, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.99, CADD = 32), 1 matching HPO term (Delayed speech and language development), not a good phenotypic match
• PIGT c.92G>C p.Arg31Pro, 2 hets in GnomAD, in-silico predicts deleterious (SIFT = 0.03, PolyPhen = 0.89, CADD = 26), 1 matching HPO term (Global developmental delay), not a good phenotypic match
• RP1L1 c.4019A>G p.Glu1340Gly, benign in ClinVar, not seen in GnomAD, in-silico predicts benign (SIFT = 0.37, PolyPhen = 0.51, CADD = 9.38), low phenolyzer rank (14074)
• PLEC c.1204G>A p.Val402Met, 29 hets in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.77, CADD = 27.3), 1 matching HPO term (Global developmental delay), low phenolyzer rank (11311)

Filter 1 B – HPO terms, AF < 0.001, shared homo variants: 0 variants found

Filter 1 C – HPO terms, AF < 0.001, Burden > 1 in autosomal recessive genes: 0 variants found

Filter 2 – AF <0.001, Likely Loss of Function Variants: 24 variants analyzed:

• Het DCLRE1C c.537+7C>G (splice region), 2 hets in GnomAD, O/E LOF score of 0.47, splice score of 0 • Homo MUC5AC c.1902_1902+1insA p.Met635AsnfsTer88 (frameshift), not seen in GnomAD, no O/E scores listed, only seen by 2 callers • Het IGHV7-27 c.196G>T p.Glu66Ter (nonsense), not seen in GnomAD, no O/E scores listed, only seen by 2 callers, low phenolyzer rank (18167) • Homo KIAA0430 c.3338_3339insCTCTGCCTCCCTCCTTCTTTCCT p.Pro1114SerfsTer60 (frameshift), not seen in GnomAD, O/E LOF score of 0.05, low phenolyzer rank (15052) • Het RPL3L c.690G>A (splice region), not seen in GnomAD, O/E LOF score of 1.11, splice score of 0.2, high phenolyzer rank (958) • Het MT4 c.98-2A>G (splice acceptor), 15 hets in GnomAD, O/E LOF score of 1.53, splice score of 7.95 • Het KCNJ12 c.-56-4G>C (splice region), not seen in GnomAD, O/E LOF score of 0.39, splice score of -0.25 • Het VMO1 c.319G>T p.Glu107Ter (nonsense), not seen in GnomAD, O/E LOF score of 1.26, no phenolyzer rank listed • Het FCAR c.208G>T p.Glu70Ter (nonsense), 3 hets seen in GnomAD, O/E LOF score of 1.03 • Het ZNF806 c.942del p.Tyr315ThrfsTer152 (frameshift), not seen in GnomAD, no O/E scores listed, low phenoylzer rank (10685) • Het ZNF806 c.1367dup p.Asn456LysfsTer85 (frameshift), not seen in GnomAD, no O/E scores listed, low phenoylzer rank (10685)


• Het ZNF806 c.1580del p.Asn527MetfsTer36 (frameshift), not seen in GnomAD, no O/E scores listed, low phenolyzer rank (10685)
• Het FAM182B c.201+1490C>T (splice acceptor), not seen in GnomAD, no O/E scores listed, splice score of 0.5, no phenolyzer rank listed
• Het RBL1 c.517_518del p.Leu173ValfsTer2 (frameshift), 5 hets in GnomAD, O/E LOF score of 0.53, high phenolyzer rank (1050)
• Homo MROH8 c.13+21_13+22insATAGACAGGGCCCCGCGGCCGGCACTCTT (splice acceptor), not seen in GnomAD, O/E LOF score of 0.61, splice score of 0, no phenolyzer rank listed
• Het LOC101926954 c.358G>T p.Glu120Ter (nonsense), not seen in GnomAD, no O/E scores listed, no phenolyzer rank listed
• Het MUC4 c.83-2436A>C (splice acceptor), seen in 5 out of 22 reads, not seen in GnomAD, no O/E scores listed, splice score of 8.04, only seen by 2 callers
• Homo LIMD1 c.1409-5G>T (splice region), not seen in GnomAD, O/E LOF score of 0.22, splice score of -2.69
• Homo LIMD1 c.1409-4C>G (splice region), not seen in GnomAD, O/E LOF score of 0.22, splice score of 0.3
• Homo LIMD1 c.1409-3A>C (splice region), not seen in GnomAD, O/E LOF score of 0.22, splice score of -3.85
• Het XR_241674.1 n.79+7del (splice region), not seen in GnomAD, no O/E scores listed, splice score of 0, no phenolyzer rank listed
• Het MAP3K1 c.709C>T p.Gln237Ter (nonsense), not seen in GnomAD, O/E LOF score of 0.098, not a good phenotypic match
• Het SORBS3 c.406-6C>G (splice region), 2 hets in GnomAD, O/E LOF score of 0.57, splice score of 2.48, low phenolyzer rank (14342)
• Het PTPRD c.4304+7_4304+8insACAGTTCAGGAATGGTAAGTT (splice region), not seen in GnomAD, O/E LOF score of 0.04, splice score of 0, low phenolyzer rank (11204)

Filter 3 – Candidate genes: No candidate gene list for this phenotype

OMIM Genes: 1 variant analyzed:

• Het PNKP c.1253_1269dup p.Thr424GlyfsTer49 (frameshift), het variant in AR disease-gene, pathogenic in ClinVar, 41 hets in GnomAD, O/E LOF score of 0.92, 2 matching HPO terms (Global developmental delay, Microcephaly)

Case 15: Duo Exome Reanalysis, Affected Sibling Pair

824 shared variants total. 110 shared variants when filtered for HPO terms. 118 variants analyzed.

Filter 1 A – HPO terms, AF < 0.001, shared het variants: 28 variants analyzed:

• SLC2A1 c.313G>A p.Val105Met, Likely benign 3 times and VUS once in ClinVar, 3 hets in GnomAD, in-silico predicts benign (SIFT = 0.13, PolyPhen = 0.338, CADD = 12.51), 6 matching HPO terms (Abnormality of brain morphology, Delayed speech and language development, Generalized tonic-clonic seizures, Global developmental delay, Seizures, Spasticity), high phenolyzer rank (128)
• FOXE3 c.619G>A p.Gly207Arg, 9 hets in GnomAD, mixed in-silico predictions (SIFT = 0.03, PolyPhen = 0.02, CADD = 21), 2 matching HPO terms (Abnormality of brain morphology, Global developmental delay), low phenolyzer rank (14573)
• FAM149B1 c.1714_1728del p.Gly572_Ser576del (in-frame deletion), not seen in GnomAD, 3 matching HPO terms (Abnormality of brain morphology, Global developmental delay, Seizures), no phenolyzer rank listed
• LRRC56 c.959G>C p.Ser320Thr, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.99, CADD = 24.6), 2 matching HPO terms (Abnormality of brain morphology, Delayed speech and language development), no phenolyzer rank listed
• SPTBN2 c.5500G>A p.Ala1834Thr, 4 hets in GnomAD, mixed in-silico predictions (SIFT = 1, PolyPhen = 0.001, CADD = 19.52), 4 matching HPO terms (Abnormality of brain morphology, Delayed speech and language development, Global developmental delay, Spasticity), high phenolyzer rank (199)
• ATXN2 c.-65+591G>C, seen in 12 of 61 reads, not seen in GnomAD, in-silico predicts benign (SIFT = 0.12, PolyPhen = 0.22, CADD = 0.003), only seen by 2 callers, 2 matching HPO terms (Abnormality of brain morphology, Spasticity)
• NCAPD2 c.2526A>T p.Glu842Asp, 22 hets in GnomAD, in-silico predicts benign (SIFT = 1, PolyPhen = 0, CADD = 0.24), 2 matching HPO terms (Abnormality of brain morphology, Delayed speech and language development), low phenolyzer rank (10024)
• SRCAP c.6494+5T>C (splice region), 1 het in GnomAD, O/E LOF score of 0.86, splice score of 0.08, 3 matching HPO terms (Abnormality of brain morphology, Delayed speech and language development, Global developmental delay)
• CREBBP c.3138C>T (splice region), 2 hets in GnomAD, O/E LOF score of 0.70, splice score of -0.32, 3 matching HPO terms (Abnormality of brain morphology, Delayed speech and language development, Seizures), high phenolyzer rank (58)
• PIEZO1 c.2666C>G p.Pro889Arg, not seen in GnomAD, mixed in-silico predictions (SIFT = 0.05, PolyPhen = 0.99, CADD = 24.6), 1 matching HPO term (Global developmental delay), low phenolyzer rank (14833)
• UNC13A c.739G>A p.Glu247Lys, seen 6 times in C4R, 77 hets in GnomAD, mixed in-silico predictions (SIFT = 0.02, PolyPhen = 0.06, CADD = 26.6), 1 matching HPO term (Spasticity)
• TTN c.53852C>A p.Thr17951Asn, not seen in GnomAD, mixed in-silico predictions (SIFT = none, PolyPhen = 0.72, CADD = 14.3), 1 matching HPO term (Delayed speech and language development)
• CCDC141 c.4498G>A p.Gly1500Ser, not seen in GnomAD, mixed in-silico predictions (SIFT = 0.04, PolyPhen = 0.66, CADD = 24.1), 2 matching HPO terms (Abnormality of brain morphology, Seizures), low phenolyzer rank (15094)
• STAT1 c.541+7G>T (splice region), 2 hets in GnomAD, O/E LOF score of 0.28, splice score of 0, 1 matching HPO term (Abnormality of brain morphology), high phenolyzer rank (605)
• HECW2 c.2171-4dup (splice region), 71 hets in GnomAD, O/E LOF score of 0.69, splice score of 0, 3 matching HPO terms (Abnormality of brain morphology, Global developmental delay, Seizures)
• IDH1 c.-93A>T (splice region), seen in 6 out of 13 reads, 2 hets in GnomAD, O/E LOF score of 0.88, splice score of 4.87, 1 matching HPO term (Abnormality of brain morphology)


• TRIP12 c.2273G>A p.Cys758Tyr, 1 het in GnomAD, mixed in-silico predictions (SIFT = 0.09, PolyPhen = 0.9, CADD = 23.9), 3 matching HPO terms (Delayed speech and language development, Global developmental delay, Seizures)
• CTNNA2 c.42G>C p.Lys14Asn, not seen in GnomAD, in-silico predicts benign (SIFT = 0.15, PolyPhen = 0, CADD = 8.07), 5 matching HPO terms (Abnormality of brain morphology, Delayed speech and language development, Global developmental delay, Pachygyria, Spasticity)
• VSX1 c.328C>T p.Pro110Ser, 2 hets in GnomAD, mixed in-silico predictions (SIFT = 0.31, PolyPhen = 0.81, CADD = 24.1), 1 matching HPO term (Abnormality of brain morphology), low phenolyzer rank (12134)
• HTT c.87_110del p.Gln31_Gln38del (in-frame deletion), not seen in GnomAD, 6 matching HPO terms (Abnormality of brain morphology, Delayed speech and language development, Global developmental delay, Muscular hypotonia of the trunk, Seizures, Spasticity), gene causes Huntington’s disease which is not a good phenotypic match
• PPP2R2B c.89-177612_89-177586dup (in-frame insertion), not seen in GnomAD, 1 matching HPO term (Abnormality of brain morphology)
• ATXN 666_677del p.Gln222_Gln225del (in-frame deletion), seen in 23 of 125 reads, no O/E scores, not seen in GnomAD, only seen by 2 callers, 2 matching HPO terms (Abnormality of brain morphology, Spasticity), low phenolyzer rank (11482)
• HLA-B c.206A>C p.Glu69Ala, 2 hets in GnomAD, in-silico predicts benign (SIFT = 0.22, PolyPhen = 0.13, CADD = 5.25), only seen by 2 callers, 2 matching HPO terms (Abnormality of brain morphology, Seizures)
• HLA-DRB1 c.297G>C p.Gln99His, not seen in GnomAD, mixed in-silico predictions (SIFT = 0.03, PolyPhen = 0.16, CADD = 1.32), only seen by 2 callers, 1 matching HPO term (Spasticity)
• HLA-DRB1 c.295C>G p.Gln99Glu, not seen in GnomAD, in-silico predicts benign (SIFT = 0.36, PolyPhen = 0.005, CADD = 0.001), only seen by 2 callers, 1 matching HPO term (Spasticity)
• NOS3 c.3007C>T p.Pro1003Ser, not seen in GnomAD, in-silico predicts benign (SIFT = 0.45, PolyPhen = 0.007, CADD = 15.05), 2 matching HPO terms (Abnormality of brain morphology, Seizures), high phenolyzer rank (285)
• MT-CYB c.95A>G p.Asn32Ser, not seen in GnomAD, mixed in-silico predictions (SIFT = 0, PolyPhen = 0.97, CADD = none), 3 matching HPO terms (Abnormality of brain morphology, Generalized tonic-clonic seizures, Seizures), no phenolyzer rank listed
• MT-CYB c.928T>C p.Ser310Pro, seen 7 times in C4R, not seen in GnomAD, mixed in-silico predictions (SIFT = 0, PolyPhen = 0.57, CADD = none), 3 matching HPO terms (Abnormality of brain morphology, Generalized tonic-clonic seizures, Seizures), no phenolyzer rank listed

Filter 1 B – HPO terms, AF < 0.001, shared homo variants: 1 variant analyzed:

• DMPK c.*245_*283del (splice acceptor), seen in 7 out of 7 reads, not seen in GnomAD, no O/E scores listed, splice score of 0, 1 matching HPO term (Abnormality of brain morphology)

Filter 1 C – HPO terms, AF < 0.001, Burden > 1 in autosomal recessive genes: 15 variants analyzed:

• Het CCDC114 c.1781C>T p.Ser594Leu, 1 het in GnomAD, mixed in-silico predictions (SIFT = 0, PolyPhen = 0.015, CADD = 21.5), 2 matching HPO terms (Abnormality of brain morphology, Delayed speech and language development), low phenolyzer rank (19585)


• CCDC114 c.1362G>T p.Lys454Asn, Het in one sibling and homo in the other, not seen in GnomAD, mixed in-silico predictions (SIFT = 0, PolyPhen = 0.94, CADD = 23.5), 2 matching HPO terms (Abnormality of brain morphology, Delayed speech and language development), low phenolyzer rank (19585)
• ABCG8 c.1619T>C p.Phe540Ser, Het in one sibling and homo in the other, 3 hets in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.88, CADD = 26.8), 1 matching HPO term (Abnormality of brain morphology)
  o ABCG8 c.1226A>G, 85 hets in GnomAD, in-silico predicts deleterious (SIFT = 0.04, PolyPhen = 0.739, CADD = 23.7)
• Het PCNT c.139G>C p.Asp47His, 15 hets in GnomAD, in-silico predicts benign (SIFT = 0.19, PolyPhen = 0.003, CADD = 9.65), 3 matching HPO terms (Abnormality of brain morphology, Global developmental delay, Seizures)
• Het PCNT c.6946C>A p.Leu2316Ile, Likely benign in ClinVar, 13 hets in GnomAD, in-silico predicts benign (SIFT = 0.23, PolyPhen = 0.435, CADD = 1.217), 3 matching HPO terms (Abnormality of brain morphology, Global developmental delay, Seizures)
• Het PCNT c.7133G>A p.Arg2378His, 2 hets in GnomAD, in-silico predicts benign (SIFT = 0.13, PolyPhen = 0, CADD = 6.81), 3 matching HPO terms (Abnormality of brain morphology, Global developmental delay, Seizures)
• Het DNAH5 c.13448C>T p.Thr4483Met, 17 hets in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.998, CADD = 32), 2 matching HPO terms (Abnormality of brain morphology, Delayed speech and language development), low phenolyzer rank (11979)
• Het DNAH5 c.3043A>G p.Thr1015Ala, 2 hets in GnomAD, in-silico predicts benign (SIFT = 0.72, PolyPhen = 0.007, CADD = 9.82), 2 matching HPO terms (Abnormality of brain morphology, Delayed speech and language development), low phenolyzer rank (11979)
• MICAL1 c.1787T>C p.Leu596Pro, Het in one sibling and homo in the other, seen 5 times in C4R, 29 hets in GnomAD, in-silico predicts benign (SIFT = 0.28, PolyPhen = 0, CADD = 10.96), 2 matching HPO terms (Generalized tonic-clonic seizures, Seizures), low phenolyzer rank (15639)
• MICAL1 c.1363C>G p.Leu455Val, Het in one sibling and homo in the other, seen 5 times in C4R, 29 hets in GnomAD, mixed in-silico predictions (SIFT = 0, PolyPhen = 0.248, CADD = 25), 2 matching HPO terms (Generalized tonic-clonic seizures, Seizures), low phenolyzer rank (15683)
• HLA-B c.356T>G p.Leu119Arg, Het in one sibling and homo in the other, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0.01, PolyPhen = 0.97, CADD = 23.6), 2 matching HPO terms (Abnormality of brain morphology, Seizures)
• HLA-B c.355C>T p.Leu119Phe, Het in one sibling and homo in the other, not seen in GnomAD, in-silico predicts benign (SIFT = 0.4, PolyPhen = 0.40, CADD = 0.38), only seen by 2 callers, 2 matching HPO terms (Abnormality of brain morphology, Seizures)
• HLA-B c.282G>C p.Gln94His, Het in one sibling and homo in the other, 22 hets in GnomAD, in-silico predicts benign (SIFT = 0.47, PolyPhen = 0.19, CADD = 9.39), 2 matching HPO terms (Abnormality of brain morphology, Seizures)
• Het ABCD1 c.1744G>A p.Val582Ile, benign in ClinVar, 8 hets in GnomAD, in-silico predicts benign (SIFT = 1, PolyPhen = 0.001, CADD = 1.38), only seen by 2 callers, 3 matching HPO terms (Abnormality of brain morphology, Seizures, Spasticity), heterozygous variant in X-linked gene for male participant indicates artifact


• Het ABCD1 c.1748T>A p.Val583Glu, likely benign in ClinVar, 11 hets in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.99, CADD = 28.1), only seen by 2 callers, 3 matching HPO terms (Abnormality of brain morphology, Seizures, Spasticity), heterozygous variant in X-linked gene for male participant indicates artifact

Filter 2 – AF < 0.001, Likely Loss of Function Variants: 67 variants analyzed:

• Het C1orf195 c.-267+11685G>A (splice region), 7 hets in GnomAD, O/E LOF score of 0.43, splice score of -0.06, no phenolyzer rank listed
• Het KIAA1751 c.1047-4G>T (splice region), 1 het in GnomAD, no O/E scores, splice score of 0.54, low phenolyzer rank (20277)
• Het SRRM1 c.1390+8_1390+11del (splice region), 2 hets in GnomAD, O/E LOF score of 0.06, splice score of 0
• Het ANKRD13C c.666A>G (splice region), 2 hets in GnomAD, O/E LOF score of 0.10, splice score of -0.43, low phenolyzer rank (12599)
• Het LRRIQ3 c.263C>T p.Arg89Ter (nonsense), 4 hets in GnomAD, O/E LOF score of 0.73, no phenolyzer rank listed
• Het AKR1CL1 c.261+4A>C (splice region), 4 hets in GnomAD, O/E LOF score of 1.13, splice score of 2.29, no phenolyzer rank listed
• Het ASCC1 c.-33-5T>G (splice region), het variant in AR disease-gene, 5 hets in GnomAD, O/E LOF score of 1.01, splice score of 4.35, 2 matching HPO terms (Abnormality of brain morphology, Global developmental delay), low phenolyzer rank (11744)
• Het POLR3A c.1771-5C>T (splice region), single variant in AR disease-gene, 23 hets in GnomAD, O/E LOF score of 0.68, splice score of -0.64, 5 matching HPO terms (Abnormality of brain morphology, Global developmental delay, Polymicrogyria, Seizures, Spasticity)
• Homo MUC5AC c.1902_1902+1insA p.Met635AsnfsTer88 (frameshift), not seen in GnomAD, no O/E scores listed, only seen by 2 callers
• Het INPPL1 c.754-4C>G (splice region), single variant in AR disease-gene, 25 hets in GnomAD, O/E LOF score of 0.28, splice score of -1.42
• Het EP400 c.2936-3C>A (splice region), 15 hets in GnomAD, O/E LOF score of 0.15, splice score of 4.17
• Het CPM c.258+4C>T (splice region), 21 hets in GnomAD, O/E LOF score of 0.53, splice score of 0.39
• Het FRS2 c.253+7C>A (splice region), not seen in GnomAD, O/E LOF score of 0.04, splice score of 0, high phenolyzer rank (907)
• Het LRRIQ1 c.522G>A p.Trp174Ter (nonsense), not seen in GnomAD, O/E LOF score of 0.76, no phenolyzer rank listed, GUS not in OMIM
• Het ARGLU1 c.-107A>G (splice region), 6 hets in GnomAD, O/E LOF score of 0.15, splice score of 0.029, low phenolyzer rank (18661)
• Het ZNF280D c.1372-6A>G (splice region), 5 hets in GnomAD, O/E LOF score of 0.22, splice score of 0.75, low phenolyzer rank (17756)
• Het PAQR5 c.750C>T (splice region), 16 hets in GnomAD, O/E LOF score of 0.33, splice score of 1.84, low phenolyzer rank (18958)


• Het CHRNA5 c.1312_1316del p.Val438AsnfsTer14 (frameshift), 4 hets in GnomAD, O/E LOF score of 0.07, not a good phenotypic match
• Homo KIAA0430 c.3338_3339insCTCTGCCTCCCTCCTTCTTTCCT p.Pro1114SerfsTer60 (frameshift), not seen in GnomAD, O/E LOF score of 0.05, low phenolyzer rank (15720), variant is a known artifact that has been seen in multiple participants in this study
• Het SRCAP c.6494+5T>C (splice region), 1 het in GnomAD, O/E LOF score of 0.04, splice score of 0.08, 3 matching HPO terms (Abnormality of brain morphology, Delayed speech and language development, Global developmental delay), low phenolyzer rank (9966)
• Het CREBBP c.3138C>T (splice region), 2 hets in GnomAD, O/E LOF score of 0.025, splice score of -0.32, 3 matching HPO terms (Abnormality of brain morphology, Delayed speech and language development, Seizures), high phenolyzer rank (58), gene causes Rubinstein-Taybi syndrome, which is not a good phenotypic match
• Het DEF8 c.134G>A p.Trp45Ter (nonsense), not seen in GnomAD, no O/E scores listed, low phenolyzer rank (18718)
• Het KCNJ12 c.-56-4G>C (splice region), seen in 49 of 206 reads, not seen in GnomAD, O/E LOF score of 0.39, splice score of -0.25, high phenolyzer rank (831)
• Het FADS6 c.357+7C>T (splice region), 3 hets in GnomAD, O/E LOF score of 1.13, splice score of 0, low phenolyzer rank (22951)
• Het KIAA0430 c.2219+7G>A (splice region), 1 het in GnomAD, O/E LOF score of 0.51, splice score of 0, low phenolyzer rank (18279)
• Het LGALS3BP c.53-52A>G (splice acceptor), not seen in GnomAD, no O/E scores listed, splice score of 7.95, low phenolyzer rank (11136)
• Het FAM129C c.148+1G>C (splice donor), 20 hets in GnomAD, O/E LOF score of 0.79, splice score of 8.27, low phenolyzer rank (17778)
• Het PBX4 c.94C>T p.Gln32Ter (nonsense), not seen in GnomAD, O/E LOF score of 0.59, low phenolyzer rank (15866)
• Het KCTD15 c.47_48dup p.Gly17ThrfsTer67 (frameshift), not seen in GnomAD, O/E LOF score of 0.39
• Het SIPA1L3 c.4209-3C>T (splice region), het variant in AR disease-gene, 15 hets in GnomAD, O/E LOF score of 0.11, splice score of 1.57, not a good phenotypic match
• Het ACTN4 c.162+5G>A (splice region), not seen in GnomAD, O/E LOF score of 0.02, splice score of 4.82, not a good phenotypic match
• Het CYP2A7 c.493+8A>G (splice region), not seen in GnomAD, O/E LOF score of 1.07, splice score of 0
• Het ATP5SL c.314+4T>A (splice region), 21 hets in GnomAD, O/E LOF score of 0.89, splice score of -2.89, low phenolyzer rank (20817)
• Het EMC10 c.1096del p.Leu366SerfsTer3 (frameshift), 1 het in GnomAD, O/E LOF score of 1.01, no phenolyzer rank listed
• Het ZNF806 c.942del p.Tyr315ThrfsTer152 (frameshift), not seen in GnomAD, no O/E scores listed, low phenolyzer rank (11329)
• Het ZNF806 c.1367dup p.Asn456LysfsTer85 (frameshift), not seen in GnomAD, no O/E scores listed, low phenolyzer rank (11329)
• Het ZNF806 c.1580del p.Asn527MetfsTer36 (frameshift), not seen in GnomAD, no O/E scores listed, low phenolyzer rank (11329)


• Het OLA1 c.255-8G>T (splice region), 4 hets in GnomAD, O/E LOF score of 0.29, splice score of -0.73, low phenolyzer rank (11483)
• Het STAT1 c.541+7G>T (splice region), 2 hets in GnomAD, O/E LOF score of 0.08, splice score of 0, 1 matching HPO term (Abnormality of brain morphology), not a good phenotypic match
• Het CRYGC c.253-6G>T (splice region), 5 hets in GnomAD, O/E LOF score of 0.67, splice score of -1.02, low phenolyzer rank (20146), not a good phenotypic match
• Het PCSK2 c.653-3C>T (splice region), not seen in GnomAD, O/E LOF score of 0.06, splice score of 0.31
• Het FAM182B c.201+1493C>T (splice region), seen in 93 of 464 reads, not seen in GnomAD, no O/E scores listed, splice score of 0.56, only seen by 2 callers, no phenolyzer rank listed
• Het FAM182B c.201+1490C>T (splice region), not seen in GnomAD, no O/E scores listed, splice score of 0.53, no phenolyzer rank listed
• Het FAM182B c.201+1483T>A (splice region), seen in 90 of 417 reads, not seen in GnomAD, no O/E scores listed, splice score of 0.73, no phenolyzer rank listed
• Homo MROH8 c.13+21_13+22insATAGACAGGGCCCCGCGGCCGGCACTCTT (splice acceptor), not seen in GnomAD, O/E LOF score of 0.61, splice score of 0, low phenolyzer rank (22830)
• Het LOC101926954 c.346A>T p.Lys116Ter (nonsense), seen in 182 of 1173 reads, not seen in GnomAD, no O/E scores listed, no phenolyzer rank listed
• Het C21orf58 c.316G>T p.Glu106Ter (nonsense), 1 het in GnomAD, O/E LOF score of 1.15, low phenolyzer rank (18792)
• Het MUC4 c.83-2436A>C (splice acceptor), not seen in GnomAD, no O/E scores listed, splice score of 8.04, only seen by 2 callers
• Het GOLGA4 c.1069-4T>G (splice region), 1 het in GnomAD, O/E LOF score of 0.39, splice score of 1.43
• Homo LIMD1 c.1409-5G>T (splice region), not seen in GnomAD, O/E LOF score of 0.22, splice score of -2.69
• Homo LIMD1 c.1409-4G>T (splice region), not seen in GnomAD, O/E LOF score of 0.22, splice score of 0.34
• Homo LIMD1 c.1409-3G>T (splice region), not seen in GnomAD, O/E LOF score of 0.22, splice score of -3.85
• Het WWC2 c.131+9_131+23del (splice region), not seen in GnomAD, O/E LOF score of 0.39, splice score of 0, low phenolyzer rank (18937)
• Het FIP1L1 c.1620-5T>C (splice region), 27 hets in GnomAD, O/E LOF score of 0.17, splice score of 1.21, low phenolyzer rank (10348)
• Het RP11-813N20.3 n.1018+6T>C (splice region), 1 het in GnomAD, no O/E scores listed, splice score of 0.80, no phenolyzer rank listed
• Het PCDHGA4 c.2461T>C p.Ter821GlnextTer? (stop loss), not seen in GnomAD, no O/E score, low phenolyzer rank (19332)
• Het NUS1 c.691+8T>G (splice region), het variant in AR disease-gene, not seen in GnomAD, O/E LOF score of 0, splice score of 0, 5 matching HPO terms (Abnormality of brain morphology, Delayed speech and language development, Global developmental delay, Seizures, Spasticity)
• Het RAET1G c.562G>T p.Gly188Ter (nonsense), 1 het in GnomAD, O/E LOF score of 0.81


• Het HG6.3 c.499_502del p.Thr167CysfsTer22 (frameshift), 1 het in GnomAD, no O/E scores listed, splice score of 0, only seen by 2 callers, no phenolyzer rank listed
• Het HLA-DRB6 n.743A>G (splice region), not seen in GnomAD, no O/E scores, splice score of -0.13, low phenolyzer rank (16994)
• Het PRSS3P1 n.342+4G>A (splice region), not seen in GnomAD, no O/E scores listed, splice score of -3.79, no phenolyzer rank listed, labelled as pseudogene
• Het GHRHR c.598-3del (splice region), het variant in AR disease-gene, not seen in GnomAD, O/E LOF score of 0.61, splice score of 0, not a good phenotypic match
• Het DPY1PL1 c.104+3A>G (splice region), not seen in GnomAD, O/E LOF score of 0.31, splice score of 1.91, low phenolyzer rank (19311), GUS but may be associated with a duplication syndrome
• Het HUS1 c.481T>C p.Ter161GlnextTer2 (stop loss), not seen in GnomAD, O/E LOF score of 0.65
• Het COBL c.1342_1345dup p.Asn449ThrfsTer11 (frameshift), not seen in GnomAD, O/E LOF score of 0.29, low phenolyzer rank (17008)
• Het MTMR9 c.292-7C>T (splice region), 11 hets in GnomAD, O/E LOF score of 0.34, splice score of -0.29, low phenolyzer rank (17048)
• Het HMCN2 c.1816+5G>A (splice region), not seen in GnomAD, O/E LOF score of 0.74, splice score of 3.99, low phenolyzer rank (17462), GUS not in OMIM

Filter 3 – Epilepsy Candidate Gene: 0 variants found

OMIM Genes: 7 variants analyzed:

• Het WWOX c.67A>G p.Ile23Val, 23 hets in GnomAD, in-silico predicts deleterious (SIFT = 0.05, PolyPhen = 0.98, CADD = 23.1), 6 matching HPO terms (Abnormality of brain morphology, Delayed speech and language development, Global developmental delay, Muscular hypotonia of the trunk, Seizures, Spasticity)
• Het AXOC1 c.878T>C p.Leu293Pro, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 1, CADD = 29.8), 2 matching HPO terms (Global developmental delay, Seizures)
• Het LMNB2 c.514G>A p.Gly172Ser, 16 hets in GnomAD, in-silico predicts benign (SIFT = 1, PolyPhen = 0.003, CADD = 0.99), 3 matching HPO terms (Abnormality of brain morphology, Global developmental delay, Seizures)
• Het BOA3 c.93G>C p.Glu31Asp, 3 hets in GnomAD, in-silico predicts benign (SIFT = 0.32, PolyPhen = 0.04, CADD = 12.88), 3 matching HPO terms (Global developmental delay, Seizures, Spasticity), gene is described as “fatal” which is not a good phenotypic match
• Het VWA3B c.2182G>A p.Gly728Arg, 9 hets in GnomAD, in-silico predicts deleterious (SIFT = 0.03, PolyPhen = 0.91, CADD = 25.5), 2 matching HPO terms (Abnormality of brain morphology, Spasticity)
• Het QARS c.1792G>A p.Ala598Thr, 7 hets in GnomAD, mixed in-silico predictions (SIFT = 0.39, PolyPhen = 0.006, CADD = 20.9)
• Het RFT1 c.155C>T p.Thr52Met, 3 hets in GnomAD, mixed in-silico predictions (SIFT = 0.39, PolyPhen = 0.05, CADD = 17.1), 4 matching HPO terms (Abnormality of brain morphology, Global developmental delay, Seizures, Spasticity)


Case 16: Singleton Exome Reanalysis

620 variants total. 17 variants when filtered for HPO terms. 47 variants analyzed.

Filter 1 A – HPO terms, AF < 0.001, Het variants in autosomal dominant genes: 7 variants analyzed:

• TP53 c.783-5T>C (splice region), not seen in GnomAD, O/E LOF score of 0.27, splice score of -0.099, 3 matching HPO terms (Abdominal pain, Chest pain, Paresthesia), high phenolyzer rank (12)
• HLA-B c.206A>C p.Glu69Ala, 2 hets in GnomAD, in-silico predicts benign (SIFT = 0.22, PolyPhen = 0.013, CADD = 5.25), 3 matching HPO terms (Abdominal pain, Chest pain, Paresthesia), only seen by 2 callers
• HLA-DRB1 c.370+8T>C (splice region), not seen in GnomAD, O/E LOF score of 0.59, splice score of 0, only seen by 2 callers, 4 matching HPO terms (Abdominal pain, Chest pain, Paresthesia, Urticaria), high phenolyzer rank (18)
• HLA-DRB1 c.370+7A>G (splice region), not seen in GnomAD, O/E LOF score of 0.59, splice score of 0, only seen by 2 callers, 4 matching HPO terms (Abdominal pain, Chest pain, Paresthesia, Urticaria), high phenolyzer rank (18)
• HLA-DRB1 c.297G>C p.Gln99His, not seen in GnomAD, mixed in-silico predictions (SIFT = 0.03, PolyPhen = 0.16, CADD = 1.32), only seen by 2 callers, 4 matching HPO terms (Abdominal pain, Chest pain, Paresthesia, Urticaria), high phenolyzer rank (18)
• HLA-DRB1 c.295C>G p.Gln99Glu, not seen in GnomAD, in-silico predicts benign (SIFT = 0.36, PolyPhen = 0.005, CADD = 0.001), only seen by 2 callers, 4 matching HPO terms (Abdominal pain, Chest pain, Paresthesia, Urticaria), high phenolyzer rank (18)
• COL5A1 c.1570-7C>T (splice region), 13 hets in GnomAD, O/E LOF score of 0.02, splice score of -0.29, 1 matching HPO term (Urticaria)

Filter 1 B – HPO terms, AF < 0.001, Homo variants in autosomal recessive genes: 4 variants analyzed:

• HLA-B c.356T>G p.Leu119Arg, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0.01, PolyPhen = 0.97, CADD = 23.6), 3 matching HPO terms (Abdominal pain, Chest pain, Paresthesia)
• HLA-B c.355C>T p.Leu119Phe, not seen in GnomAD, in-silico predicts benign (SIFT = 0.4, PolyPhen = 0.40, CADD = 0.38), 3 matching HPO terms (Abdominal pain, Chest pain, Paresthesia)
• HLA-B c.283G>A p.Ala95Thr, 1 het seen in GnomAD, in-silico predicts benign (SIFT = 0.13, PolyPhen = 0.003, CADD = 11.07), 3 matching HPO terms (Abdominal pain, Chest pain, Paresthesia), only seen by 2 callers
• HLA-B c.282G>C p.Gln94His, 22 hets in GnomAD, in-silico predicts benign (SIFT = 0.47, PolyPhen = 0.19, CADD = 9.39), 3 matching HPO terms (Abdominal pain, Chest pain, Paresthesia), only seen by 2 callers

Filter 1 C – HPO terms, AF < 0.001, Burden > 1 in autosomal recessive genes: 0 variants found

Filter 2 – AF < 0.001, Likely Loss of Function Variants: 36 variants analyzed:

• Het ZFYVE9 c.-107+130T>G (splice region), 2 hets in GnomAD, O/E LOF score of 0.18, splice score of -0.49


• Het LIPA c.193C>T p.Arg65Ter (nonsense), het variant in AR disease-gene, 7 hets in GnomAD, O/E LOF score of 0.65
• Het PRB3 c.649+56A>G (splice region), 2 hets in GnomAD, no O/E scores listed, splice score of 0, only seen by 2 callers, no phenolyzer rank listed
• Het PRB3 c.649+50C>A (splice donor), not seen in GnomAD, no O/E scores listed, splice score of 0.42, only seen by 2 callers, no phenolyzer rank listed
• Het RPLP0 c.318+4A>G (splice region), not seen in GnomAD, O/E LOF score of 0.08, splice score of 1.75
• Het CERS5 c.698+302_698+305dup (splice region), 4 hets in GnomAD, O/E LOF score of 0.64, splice score of 0
• Het PHF2P2 n.128+6C>T (splice region), seen 6 times in C4R, 9 hets in GnomAD, no O/E scores listed, splice score of -4.98, no phenolyzer rank listed
• Het WDFY2 c.831+2T>C (splice donor), not seen in GnomAD, O/E LOF score of 0.62, splice score of 7.75, low phenolyzer rank (11270), GUS
• Het MYCBP2 c.11036+6T>A (splice region), not seen in GnomAD, O/E LOF score of 0.08, splice score of 0.92, low phenolyzer rank (10700)
• Het SLC28A1 c.1252dup p.Ala418GlyfsTer70 (frameshift), not seen in GnomAD, O/E LOF score of 0.92
• Het KCNJ12 c.-56-4G>C (splice region), seen in 53 of 235 reads, not seen in GnomAD, O/E LOF score of 0.39, splice score of -0.25
• Het CCDC178 c.1279del p.Thr427HisfsTer23 (frameshift), 2 hets in GnomAD, O/E LOF score of 0.86, no phenolyzer rank listed
• Het ILVBL c.816+6T>G (splice region), 1 het in GnomAD, O/E LOF score of 0.63, splice score of -0.06, low phenolyzer rank (17664)
• Het ZNF285 c.15+1G>A (splice donor), not seen in GnomAD, O/E LOF score of 0.68, splice score of 8.18, GUS not in OMIM
• Het DPRX c.383C>A p.Ser128Ter (nonsense), 14 hets in GnomAD, O/E LOF score of 1.62, low phenolyzer rank (15876)
• Het SLC25A41 c.*86A>C (splice region), not seen in GnomAD, O/E LOF score of 0.99, splice score of 4.25
• Het C2orf48 c.194-5A>G (splice region), 2 hets in GnomAD, O/E LOF score of 0.27, splice score of 5.2, low phenolyzer rank (18067), GUS not in OMIM
• Het ZNF806 c.942del p.Tyr315ThrfsTer152 (frameshift), not seen in GnomAD, no O/E scores listed, low phenolyzer rank (14617)
• Het ZNF806 c.1367dup p.Asn456LysfsTer85 (frameshift), not seen in GnomAD, no O/E scores listed, low phenolyzer rank (14617)
• Het ZNF806 c.1580del p.Asn527MetfsTer36 (frameshift), not seen in GnomAD, no O/E scores listed, low phenolyzer rank (14617)
• Het AAK1 c.2681-8G>C (splice region), 15 hets in GnomAD, O/E LOF score of 0.04, splice score of -1.10, low phenolyzer rank (10069)
• Het VSX1 c.913G>T p.Glu305Ter (nonsense), 2 hets in GnomAD, O/E LOF score of 0.39, low phenolyzer rank (16042), not a good phenotypic match


• Het FAM182B c.201+1490C>T (splice acceptor), not seen in GnomAD, no O/E scores listed, splice score of 0.53, no phenolyzer rank listed
• Het TMC2 c.1594-4C>G (splice region), 12 hets in GnomAD, O/E LOF score of 0.87, splice score of 0.37, no phenolyzer rank listed
• Homo MROH8 c.13+21_13+22insATAGACAGGGCCCCGCGGCCGGCACTCTT (splice acceptor), not seen in GnomAD, O/E LOF score of 0.60, splice score of 0, low phenolyzer rank (19634)
• Het PRR5 c.529-7C>A (splice region), 10 hets in GnomAD, O/E LOF score of 0.25, splice score of 2.09
• Homo LIMD1 c.1409-5G>T (splice region), not seen in GnomAD, O/E LOF score of 0.22, splice score of -2.69
• Homo LIMD1 c.1409-4C>G (splice region), not seen in GnomAD, O/E LOF score of 0.22, splice score of 0.34
• Homo LIMD1 c.1409-3A>C (splice region), not seen in GnomAD, O/E LOF score of 0.22, splice score of -3.85
• Het PPP2R2C c.48C>T (splice region), 2 hets in GnomAD, O/E LOF score of 0.14, splice score of 0.65
• Het SLC22A5 c.916C>T p.Arg306Ter (nonsense), het variant in AR disease-gene, pathogenic in ClinVar, 11 hets in GnomAD, not a good phenotypic match
• Het LPA c.-52G>A (splice region), 2 hets in GnomAD, O/E LOF score of 1.30, splice score of 6.59, not a good phenotypic match
• Het HLA-DRB6 n.876-5T>C (splice region), not seen in GnomAD, no O/E scores listed, splice score of -0.45
• Homo HLA-DRB6 n.743A>G (splice region), not seen in GnomAD, no O/E scores listed, splice score of -0.13
• Het PCSK5 c.1900+108_1900+127del (splice region), not seen in GnomAD, O/E LOF score of 0.38, only seen by 2 callers
• Het PTPRD c.4304+7_4304+8insACAGTTCAGGAATGGTAAGTT (splice region), not seen in GnomAD, O/E LOF score of 0.04, low phenolyzer rank (10368)

Case 17: Singleton Exome Reanalysis

741 variants total. 63 variants when filtered for HPO terms. 57 variants analyzed.

Filter 1 A - HPO terms, AF < 0.001, Het variants in autosomal dominant genes: 10 variants analyzed:

• HDAC4 c.443G>A p.Arg148Gln, 9 hets in GnomAD, mixed in-silico predictions (SIFT = 0.53, PolyPhen = 0.005, CADD = 21.7), 4 matching HPO terms (Microcephaly, Narrow palpebral fissures, Short stature, Umbilical hernia), high phenolyzer rank (6), gene is part of a microdeletion syndrome
• EP300 c.3256A>G p.Ile1086Val, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0.01, PolyPhen = 0.92, CADD = 26.3), 5 matching HPO terms (Epicanthus, Failure to thrive, Microcephaly, Narrow mouth, Short stature), high phenolyzer rank (33)
• ZFP57 c.751C>G p.Arg251Gly, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.99, CADD = 29.4), 1 matching HPO term (Failure to thrive), low phenolyzer rank (11257)

• GTPBP2 c.97C>G p.Arg33Gly, 10 hets in GnomAD, mixed in-silico predictions (SIFT = 0.1, PolyPhen = 0.04, CADD = 23), 1 matching HPO term (Failure to thrive), low phenolyzer rank (10567) • GLI3 c.2200G>A p.Asp734Asn, VUS in ClinVar, 21 hets in GnomAD, mixed in-silico predictions (SIFT = 0.24, PolyPhen = 0.001, CADD = 20.4), 2 matching HPO terms (Short stature, Umbilical hernia), high phenolyzer rank (745) • TBL2 c.754C>T p.His252Tyr (missense), seen in 4 out of 11 reads, not seen in GnomAD, in-silico predicts benign (SIFT = none, PolyPhen = 0, CADD = none), splice score of 0, 5 matching HPO terms (Epicanthus, Failure to thrive, Microcephaly, Short stature, Umbilical hernia) • SPIDR c.2527A>G p.Asn843Asp, 1 het in GnomAD, mixed in-silico predictions (SIFT = 0, PolyPhen = 0.04, CADD = 0.25), 2 matching HPO terms (Microcephaly, Short stature), low phenolyzer rank (15002) • ABCA1 c.2842G>A p.Gly948Arg, 1 het in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 1, CADD = 35), 1 matching HPO term (Hepatomegaly) • KCNT1 c.238T>C p.Phe80Leu (missense), not seen in GnomAD, mixed in-silico predictions (SIFT = 0.1, PolyPhen = 0.16, CADD = 22.7), 1 matching HPO term (Microcephaly), low phenolyzer rank (12684), not a good phenotypic match • TPM2 c.493-7C>A (splice region), Likely benign in ClinVar, not seen in GnomAD, O/E LOF score of 0.48, splice score of 2.78, low phenolyzer rank (10203)

Filter 1 B - HPO terms, AF < 0.001, Homo variants in autosomal recessive genes: 0 variants found

Filter 1 C - HPO terms, AF < 0.001, Burden > 1 in autosomal recessive genes: 5 variants analyzed:

• TTN c.87264C>A p.Ser29088Arg, 1 het in GnomAD, in-silico predicts benign (SIFT = none, PolyPhen = 0.58, CADD = 12.51) • TTN c.53440G>A p.Gly17814Ser, seen 7 times in C4R, 46 hets in GnomAD, in-silico predicts deleterious (SIFT = none, PolyPhen = 1, CADD = 24), not a good phenotypic match • TTN n.3568+76A>G (splice region), not seen in GnomAD, O/E LOF score of 0.31, splice score of -1.14, not a good phenotypic match • TAF2 c.546G>T p.Arg182Ser, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0.04, PolyPhen = 0.99, CADD = 19.99), 1 matching HPO term (Microcephaly), not a good phenotypic match • TAF2 c.545G>T p.Arg182Met, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0.01, PolyPhen = 1, CADD = 21.2), 1 matching HPO term (Microcephaly), not a good phenotypic match
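The Filter 1 headings above combine three criteria: a match to the participant's HPO terms, an allele frequency below 0.001, and a zygosity or burden requirement tied to the inheritance pattern of the gene. The following is a minimal sketch of how such a filter could be expressed; it is not the code used in this project. The `Variant` fields, the function names, and the reading of "Burden > 1" as more than one rare candidate variant in the same gene are assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass

AF_CUTOFF = 0.001  # allele-frequency threshold named in the filter headings


@dataclass
class Variant:
    # All field names are hypothetical; they do not mirror any pipeline output.
    gene: str
    hgvs: str
    zygosity: str        # "het" or "homo"
    allele_freq: float   # population allele frequency (0.0 if not observed)
    inheritance: str     # "AD" or "AR" for the associated disease gene
    hpo_matches: int     # number of participant HPO terms matched by the gene


def passes_common(v: Variant) -> bool:
    """Criteria shared by Filters 1 A-C: an HPO match and a rare allele."""
    return v.hpo_matches > 0 and v.allele_freq < AF_CUTOFF


def filter_1(variants: list[Variant]) -> dict[str, list[Variant]]:
    """Sort rare, HPO-matched variants into the 1 A, 1 B and 1 C groups."""
    rare = [v for v in variants if passes_common(v)]
    # Assumed meaning of "burden": rare candidate variants per gene.
    burden = Counter(v.gene for v in rare)
    return {
        "1A": [v for v in rare if v.zygosity == "het" and v.inheritance == "AD"],
        "1B": [v for v in rare if v.zygosity == "homo" and v.inheritance == "AR"],
        "1C": [v for v in rare if v.inheritance == "AR" and burden[v.gene] > 1],
    }
```

Under these assumptions, two rare heterozygous variants in the same autosomal recessive gene both land in group 1 C, which is how possible compound heterozygotes would surface for review.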

Filter 2 - AF < 0.001, Likely Loss of Function Variants: 39 variants analyzed:

• Het MEGF6 c.2575+8G>A (splice region), not seen in GnomAD, O/E LOF score of 0.68, splice score of 0, no phenolyzer rank listed • Het KIF2C c.1450-5C>A (splice region), 16 hets in GnomAD, O/E LOF score of 0.36, splice score of 1.02, high phenolyzer rank (959) • Het ACOT7 c.81del p.Lys27AsnfsTer21 (frameshift), not seen in GnomAD, O/E LOF score of 0.16, according to OMIM is downregulated in epilepsy

• Het APBB1IP c.-203+7_-203+8insAGGGATAACAGAGG (splice region), seen 24 times in C4R, not seen in GnomAD, O/E LOF score of 0.14, splice score of 0, only seen by 2 callers, low phenolyzer rank (990) • Het APBB1IP c.-203+7_-203+8insAGGGATAACAGAG (splice region), seen twice in C4R, not seen in GnomAD, O/E LOF score of 0.14, splice score of 0, only seen by 2 callers, low phenolyzer rank (990) • Het HOGA1 c.212-2181C>G (splice region), het variant in AR disease-gene, not seen in GnomAD, O/E LOF score of 1.25, splice score of -1 • Het FTH1 c.-33C>T (splice region), 3 hets in GnomAD, O/E LOF score of 0.29, splice score of -0.16, only seen by 2 callers • Het PLEKHB1 c.390+2786C>G (splice region), not seen in GnomAD, O/E LOF score of 0.56, splice score of 14.58, low phenolyzer rank (14619) • Het RECQL c.-369_-355dup (splice region), seen in 9 out of 12 reads, 3 hets in GnomAD, O/E LOF score of 0.96, splice score of 0, only seen by 2 callers, low phenolyzer rank (19541) • Het KRT86 c.177C>A p.Cys59Ter (nonsense), 3 hets in GnomAD, O/E LOF score of 0.79, not a good phenotypic match • Het TIMELESS c.3452-7A>G (splice region), not seen in GnomAD, O/E LOF score of 0.80, splice score of -0.75, low phenolyzer rank (11865) • Het ITGBL1 c.195_198dup p.Gly67ProfsTer15 (frameshift), not seen in GnomAD, O/E LOF score of 0.63, low phenolyzer rank (19762) • Het VWA8 c.3713dup p.Asn1238LysfsTer58 (frameshift), not seen in GnomAD, O/E LOF score of 0.97, no phenolyzer rank listed • Het FAM216B c.99+24_99+27dup (splice region), seen 15 times in C4R, not seen in GnomAD, O/E LOF score of 0.68, splice score of 0, no phenolyzer rank listed • Het RBM26 c.2829T>C (splice region), seen 2 times in C4R, not seen in GnomAD, O/E LOF score of 0.04, splice score of -0.29, low phenolyzer rank (17591) • Het DCT c.232C>T p.Arg78Ter (nonsense), 23 hets in GnomAD, O/E LOF score of 0.83 • Het TMCO5A c.573A>T p.Ter191CysextTer17 (stop loss), seen 2 times in C4R, 3 hets in GnomAD, O/E LOF score of 0.62, splice score of 5.7, no phenolyzer rank listed • Het ANPEP c.2158-7del (splice region), seen 4 times in C4R, 3 hets in GnomAD, O/E LOF score of 0.45, splice score of 0 • Het NA n.5992-25961G>A (splice region), 2 hets in GnomAD, no O/E scores listed, splice score of 0.24, no phenolyzer rank listed • Het ITFG3 c.972-7T>C (splice region), 1 het in GnomAD, O/E LOF score of 0.88, splice score of 0.10, low phenolyzer rank (16456) • Het VPS9D1-AS1 n.498-1G>A (splice region), seen 2 times in C4R, 1 het in GnomAD, no O/E scores listed, splice score of 8.75, low phenolyzer rank (13760) • Het SSC5D c.362-8C>T (splice region), 5 hets in GnomAD, O/E LOF score of 0.86, splice score of -0.11, no phenolyzer rank listed • Het SH3RF4 c.374_399del p.Pro125GlnfsTer79 (frameshift), seen in 6 out of 19 reads, not seen in GnomAD, O/E LOF score of 0.26, only seen by 2 callers, low phenolyzer rank (12375), GUS not in OMIM

• Het KRTAP19-1 c.1A>T p.Met1? (start loss), not seen in GnomAD, no O/E scores listed, GUS not in OMIM • Het GPX1 c.371dup p.Leu125SerfsTer (frameshift), 3 hets in GnomAD, O/E LOF score of 1.22 • Het KIAA1211 c.1024del p.Glu342ArgfsTer82 (frameshift), seen in 4 out of 14 reads, not seen in GnomAD, O/E LOF score of 0.33, only seen by 2 callers, no phenolyzer rank listed • Het SAP30L-AS1 c.568+235_568+239dup (splice region), seen in 3 out of 14 reads, not seen in GnomAD, no O/E scores listed, only seen by 2 callers, low phenolyzer rank (13258) • Het SASH1 c.727+23_727+30del (splice region), seen in 6 out of 10 reads, not seen in GnomAD, O/E LOF score of 0.29, splice score of 0, only seen by 2 callers, low phenolyzer rank (17721) • Het PEX6 c.2301-5_2301-2del (splice region), VUS in ClinVar, 16 hets in GnomAD, O/E LOF score of 0.39, splice score of 0, 6 matching HPO terms (Epicanthus, Failure to thrive, Hepatic liver failure, Hepatomegaly, Microcephaly, Short stature) • Het PARP12 c.1421+258A>G (splice region), seen in 3 out of 13 reads, not seen in GnomAD, no O/E scores listed, splice score of 0.44, only seen by 2 callers, low phenolyzer rank (16696) • Het KMT2C c.7443-7_7443-6del (splice region), seen in 9 out of 16 reads, seen 9 times in C4R, not seen in GnomAD, O/E LOF score of 0.08, splice score of 0, only seen by 2 callers, low phenolyzer rank (10060) • Het AOAH c.1022-6A>G (splice region), 2 hets in GnomAD, O/E LOF score of 0.69, splice score of 0.04, low phenolyzer rank (17833) • Het RP11-368M16.8 n.77-8T>C (splice region), seen 23 times in C4R, not seen in GnomAD, no O/E scores listed, splice score of 0.14, only seen by 2 callers, no phenolyzer rank listed • Het PKHD1L1 c.75T>C (splice region), het variant in AR disease-gene, 3 hets in GnomAD, O/E LOF score of 0.89, splice score of 1.75, low phenolyzer rank (18006) • Het RP11-219B4.3 c.234+148C>T (splice region), seen in 2 out of 10 reads, not seen in GnomAD, no O/E scores listed, splice score of 0.9, only seen by 2 callers, no phenolyzer rank listed • Het DPY19L4 c.1632+1G>T (splice donor), seen 2 times in C4R, 17 hets in GnomAD, O/E LOF score of 0.82, splice score of 8.50, low phenolyzer rank (17701), GUS • Het MAMDC4 c.1717C>T p.Arg573Ter (nonsense), 5 hets in GnomAD, O/E LOF score of 1.09, no phenolyzer rank listed • Het TESK1 c.1399T>C p.Ter467GlnextTer18 (stop loss), not seen in GnomAD, O/E LOF score of 0.11, low phenolyzer rank (14554) • Het MAMLD1 c.97-6898A>T (splice region), not seen in GnomAD, O/E LOF score of 0.21, splice score of 0, only seen by 2 callers, heterozygous variant in X-linked gene for male participant indicates artifact
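Filter 2 keeps rare variants whose predicted consequence may lead to loss of function, and the last item above notes that a heterozygous call in an X-linked gene in a male participant points to an artifact. The sketch below illustrates both checks under stated assumptions. The consequence names are Sequence Ontology terms as reported by the Ensembl Variant Effect Predictor, chosen here to mirror the categories that appear in the list (nonsense, frameshift, splice acceptor, splice donor, splice region, stop loss, start loss); the exact set used by the custom filter is not stated in this appendix. The function names and the "&"-joined consequence string are likewise assumptions.

```python
# Consequence terms assumed to correspond to the categories listed under Filter 2.
LOF_CONSEQUENCES = {
    "stop_gained",              # nonsense
    "frameshift_variant",       # frameshift
    "splice_acceptor_variant",  # splice acceptor
    "splice_donor_variant",     # splice donor
    "splice_region_variant",    # splice region
    "stop_lost",                # stop loss
    "start_lost",               # start loss
}


def is_likely_lof(consequence: str, allele_freq: float, af_cutoff: float = 0.001) -> bool:
    """Return True for a rare variant carrying at least one listed consequence.

    `consequence` is assumed to be an "&"-joined string of terms,
    for example "splice_region_variant&intron_variant".
    """
    terms = set(consequence.split("&"))
    return allele_freq < af_cutoff and bool(terms & LOF_CONSEQUENCES)


def is_probable_artifact(chrom: str, zygosity: str, sex: str) -> bool:
    """Flag heterozygous X-chromosome calls in a male participant.

    Simplification: pseudoautosomal regions, where a heterozygous call
    in a male can be genuine, are not handled.
    """
    return chrom == "X" and zygosity == "het" and sex == "male"
```

A call flagged by `is_probable_artifact` would still be listed for review, as in the MAMLD1 entry above, but with the reason for discounting it recorded.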

Variants in 16q22 Region: 3 variants analyzed:

• Homo PSKH1 c.686C>T p.Pro229Leu, not seen in GnomAD, mixed in-silico predictions (SIFT = 0.03, PolyPhen = 0.48, CADD = 23.8), low phenolyzer rank (11132) • Homo DUS2 c.1265C>T p.Ala422Val, 4 hets in GnomAD, in-silico predicts deleterious (SIFT = 0.01, PolyPhen = 0.99, CADD = 35), no phenolyzer rank listed • Homo NOB1 c.1095C>A p.Asp365Glu, 1 het in GnomAD, mixed in-silico predictions (SIFT = 0.03, PolyPhen = 0.71, CADD = 17.77)

Case 18: Singleton Exome Reanalysis

776 variants total. 43 variants when filtered for HPO terms. 48 variants analyzed.

Filter 1 A: AF < 0.001, HPO Terms, Het variants in autosomal dominant genes: 9 variants analyzed:

• DHTKD1 c.1267G>A p.Val423Met, 3 hets in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.96, CADD = 26), 1 matching HPO term (Microcephaly), not a good phenotypic match • DYNC1H1 c.13652C>T p.Ala4551Val, seen 3 times in C4R, 2 hets in GnomAD, mixed in-silico predictions (SIFT = 0.31, PolyPhen = 0.17, CADD = 21), 2 matching HPO terms (Abnormal cerebellum morphology, Microcephaly), high phenolyzer rank (887) • PIEZO2 c.1463C>A p.Ala488Asp, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.86, CADD = 25.6), low phenolyzer rank (12001) • TCF4 c.-142C>G (missense), seen in 3 out of 11 reads, not seen in GnomAD, no O/E scores listed, no in-silico predictions, 2 matching HPO terms (Hypopigmentation of the skin, Microcephaly) • ATR c.7734T>A p.Asn2578Lys, 1 het in GnomAD, mixed in-silico predictions (SIFT = 0.09, PolyPhen = 0.55, CADD = 23.4), 3 matching HPO terms (Abnormal cerebellum morphology, Microcephaly, Sloping forehead), high phenolyzer rank (443) • NELFA c.573G>T p.Gln191His, not seen in GnomAD, mixed in-silico predictions (SIFT = 0.12, PolyPhen = 0.06, CADD = 22.1), 3 matching HPO terms (Abnormal cerebellum morphology, Aplasia cutis congenita, Microcephaly), gene associated with Wolf-Hirschhorn which is a microdeletion syndrome • ARID1B c.808_813dup p.Ser270_Ala271dup (in-frame duplication), 1 het in GnomAD, 2 matching HPO terms (Abnormal cerebellum morphology, Microcephaly) • ELN c.590G>A p.Gly197Glu, seen 11 times in C4R, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0.01, PolyPhen = 0.83, CADD = 16.3), 2 matching HPO terms (Abnormal cerebellum morphology, Microcephaly), not a good phenotypic match • MT-ND6 c.394T>C p.Ser132Pro, seen 11 times in C4R, not seen in GnomAD, no O/E scores listed, in-silico predicts benign (SIFT = 0.21, PolyPhen = 0.005, CADD = none), 3 matching HPO terms (Abnormal cerebellum morphology, Hypopigmentation of the skin, Microcephaly), no phenolyzer rank

Filter 1 B: AF < 0.001, HPO Terms, Homo variants in autosomal recessive genes: 1 variant analyzed:

• AHI1 c.74A>G p.Asp25Gly, VUS in ClinVar, seen 6 times in C4R, 11 hets in GnomAD, in-silico predicts benign (SIFT = 0.96, PolyPhen = 0.01, CADD = 4.88), 2 matching HPO terms (Abnormal cerebellum morphology, Polymicrogyria)

Filter 1 C: AF < 0.001, HPO Terms, Burden > 1 in autosomal recessive genes: 0 variants found

Filter 2: AF < 0.001, Likely Loss of Function Variants: 38 variants analyzed:

• Het OBSCN c.16732C>T p.Arg5578Ter (nonsense), 3 hets in GnomAD, O/E LOF score of 0.77 • Het OBSCN c.18662-2970del (frameshift), not seen in GnomAD, O/E LOF score of 0.77 • Homo APBB1IP c.-203+7_-203+8insAGGGATAACAGAGG (splice region), seen 24 times in C4R, not seen in GnomAD, O/E LOF score of 0.14, splice score of 0, only seen by 2 callers

• Homo APBB1IP c.-203+7_-203+8insAGGGATAACAGAG (splice region), not seen in GnomAD, O/E LOF score of 0.14, splice score of 0, only seen by 2 callers • Het FAM25C c.136+8C>T (splice region), 12 hets and 1 homo in GnomAD, O/E LOF score of 1.14, splice score of 0, no phenolyzer rank listed • Het CYP2C18 c.932del p.Leu311ArgfsTer4 (frameshift), not seen in GnomAD, O/E LOF score of 0.87 • Het HCAR2 c.-159-3del (splice region), seen 3 times in C4R, not seen in GnomAD, no O/E scores listed, splice score of 0, only seen by 2 callers • Het GTF2H3 c.487-94T>C (splice region), not seen in GnomAD, O/E LOF score of 0.67, splice score of 0.07 • Het DDX51 c.556_559del p.Asn186ValfsTer42 (frameshift), 20 hets in GnomAD, O/E LOF score of 0.68, high phenolyzer rank (15048) • Het MUC19 c.12863G>A p.Gly4288Glu (nonsense), seen 18 times in C4R, not seen in GnomAD, no O/E scores listed, only seen by 2 callers • Het MUC19 c.12893A>T p.Glu4298Val (stop loss), seen 24 times in C4R, not seen in GnomAD, no O/E scores listed, only seen by 2 callers • Het KDM5A c.2151-14_2151-5del (splice region), not seen in GnomAD, O/E LOF score of 0.09, splice score of 0, low phenolyzer rank (11273) • Het LEPREL2 c.94del p.Ala32ArgfsTer94 (frameshift), seen in 4 out of 16 reads, 1 het in GnomAD, no O/E scores listed, only seen by 2 callers • Het RBM26 c.2829T>C (splice region), not seen in GnomAD, O/E LOF score of 0.04, splice score of -0.29, no phenolyzer rank listed • Het DZIP1 c.597+3G>A (splice region), 1 het in GnomAD, O/E LOF score of 0.78, splice score of -1.81 • Het SLC12A1 c.18C>T p.Gln7Ter (nonsense), het variant in AR disease-gene, seen 5 times in C4R, 2 hets in GnomAD, O/E LOF score of 0.62 • Het TRPM7 c.1134C>A (splice region), 17 hets in GnomAD, O/E LOF score of 0.43, splice score of 0.50, low phenolyzer rank (11151) • Het MAP2K5 c.-217_-212del (splice acceptor), 2 hets in GnomAD, O/E LOF score of 0.28, splice score of 0, GUS • Het NA c.-272+418T>A (splice region), not seen in GnomAD, no O/E scores listed, splice score of 5.30, no phenolyzer rank listed • Het PPM1N c.118G>T p.Glu40Ter (nonsense), not seen in GnomAD, O/E LOF score of 0.83, no phenolyzer rank listed • Het NOSTRIN c.965-479G>T (splice donor), 6 hets in GnomAD, O/E LOF score of 0.77, splice score of 8.50 • Het NLRC4 c.2257+6G>A (splice region), not seen in GnomAD, O/E LOF score of 0.65, splice score of 1.09, not a good phenotypic match • Het C20orf26 c.1591-4G>A (splice region), 14 hets in GnomAD, O/E LOF score of 0.54, splice score of -0.38, no phenolyzer rank listed • Het OGFR c.399-6C>T (splice region), 1 het in GnomAD, O/E LOF score of 0.34, splice score of -0.38, no phenolyzer rank listed

• Het PATZ1 c.191_192del p.Cys64TyrfsTer17 (frameshift), seen in 11 out of 68 reads, seen 9 times in C4R, 1 het in GnomAD, no O/E scores listed, only seen by 2 callers, low phenolyzer rank (10683) • Het MAGEF1 c.888del p.Arg297GlyfsTer15 (frameshift), 33 hets in GnomAD, O/E LOF score of 0.81, low phenolyzer rank (16598) • Het EFHB c.322A>T p.Arg108Ter (nonsense), 3 hets in GnomAD, O/E LOF score of 0.98, low phenolyzer rank (14612) • Het MST1 c.471-2A>G (splice acceptor), not seen in GnomAD, O/E LOF score of 0.75, splice score of 7.96 • Het ADH1C c.567+2T>C (splice donor), 1 het in GnomAD, no O/E scores listed, splice score of 7.75 • Het EGF c.2908C>T p.Arg970Ter (nonsense), het variant in AR disease-gene, not seen in GnomAD, O/E LOF score of 0.56, 1 matching HPO term (Microcephaly), high phenolyzer rank (186) • Het CDKL2 c.542+8A>T (splice region), 1 het in GnomAD, O/E LOF score of 0.98, splice score of 0, low phenolyzer rank (14655) • Het SPDL1 c.*2384del (frameshift), seen 8 times in C4R, 5 hets in GnomAD, O/E LOF score of 0.81 • Het GFM2 c.245-5A>G (splice region), seen 4 times in C4R, 4 hets in GnomAD, O/E LOF score of 0.81, splice score of 1.36 • Het ZFP57 c.373C>T p.Arg125Ter (nonsense), not seen in GnomAD, O/E LOF score of 0.69, low phenolyzer rank (11550) • Het EXOC2 c.2743C>T p.Gln915Ter (nonsense), 2 hets in GnomAD, O/E LOF score of 0.43 • Het ABCA2 c.568-8C>T (splice region), 8 hets in GnomAD, O/E LOF score of 0.09, splice score of -0.58 • Het PCSK5 c.1900+162_1900+166dup (frameshift), seen 18 times in C4R, not seen in GnomAD, O/E LOF score of 0.38, only seen by 2 callers • Het HUWE1 c.10038A>G (splice region), 3 hets in GnomAD, O/E LOF score of 0.03, splice score of -0.27, not a good phenotypic match

Case 19: Duo Exome Reanalysis, Affected Sibling Pair

693 shared variants total. 40 shared variants when filtered for HPO terms. 51 variants analyzed.

Filter 1 A – HPO Terms, AF < 0.001, Shared het variants: 16 variants analyzed:

• PEX19 c.46A>T p.Arg16Trp, het variant in AR disease-gene, 3 hets in GnomAD, mixed in-silico predictions (SIFT = 0.02, PolyPhen = 0, CADD = 27.2), 5 matching HPO terms (Central hypotonia, Cerebral atrophy, Global developmental delay, Premature birth, Wide nasal bridge), high phenolyzer rank (15) • STT3A c.1405A>G p.Ile469Val, het variant in AR disease-gene, 2 hets in GnomAD, in-silico predicts benign (SIFT = 0.84, PolyPhen = 0.007, CADD = 12.51), 1 matching HPO term (Global developmental delay)

• NCAPD3 c.4253-7T>C (splice region), 3 hets in GnomAD, O/E LOF score of 0.36, splice score of -0.29, 1 matching HPO term (Global developmental delay), low phenolyzer rank (11450) • ATN1 c.1488_1508dup p.Gln496_Gln502dup (in-frame insertion), not seen in GnomAD, only seen by 2 callers, 1 matching HPO term (Gait ataxia), triplet repeat disorder, not a good phenotypic match • TTC8 c.436G>A p.Gly146Arg, het variant in AR disease-gene, seen 5 times in C4R, 18 hets in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.99, CADD = 33), 2 matching HPO terms (Global developmental delay, Wide nasal bridge) • JPH3 c.472_473insCTGCTT p.Val158delinsAlaAlaLeu (protein altering), not seen in GnomAD, no O/E scores listed, only seen by 2 callers, 1 matching HPO term (Cerebral atrophy), low phenolyzer rank (13423) • BRCA1 c.339C>G p.Asn113Lys, VUS in ClinVar, not seen in GnomAD, mixed in-silico predictions (SIFT = 0.03, PolyPhen = 0.02, CADD = 21.5), high phenolyzer rank (120), not a good phenotypic match • DYM c.145G>A p.Arg49Gln, het variant in AR disease-gene, seen 6 times in C4R, 62 hets in GnomAD, mixed in-silico predictions (SIFT = 0.01, PolyPhen = 0.16, CADD = none), 2 matching HPO terms (Global developmental delay, Scoliosis), low phenolyzer rank (13068) • SCN2A c.2019C>G (splice region), Benign once and VUS once in ClinVar, 19 hets in GnomAD, O/E LOF score of 0.06, splice score of 0.47, 1 matching HPO term (Global developmental delay) • SLC6A19 c.994C>T p.Arg332Cys, 7 hets in GnomAD, in-silico predicts deleterious (SIFT = 0.02, PolyPhen = 0.79, CADD = 27.9), 1 matching HPO term (Global developmental delay) • CWC27 c.20A>G p.Gln7Arg, het variant in AR disease-gene, not seen in GnomAD, mixed in-silico predictions (SIFT = 0.02, PolyPhen = 0.69, CADD = 23.6), 2 matching HPO terms (Global developmental delay, Delayed speech and language development) • BRAT1 c.920A>C p.His307Pro, het variant in AR disease-gene, 8 hets in GnomAD, in-silico predicts benign (SIFT = 0.24, PolyPhen = 0.06, CADD = 0.05), 5 matching HPO terms (Broad face, Delayed gross motor development, Delayed speech and language development, Gait ataxia, Global developmental delay), high phenolyzer rank (381), not a good phenotypic match • PLEC c.5110G>T p.Gly1704Cys, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.82, CADD = 23.7), 1 matching HPO term (Premature birth) • INPP5E c.197G>A p.Arg66Gln, VUS in ClinVar, 3 hets in GnomAD, in-silico predicts benign (SIFT = 0.36, PolyPhen = 0, CADD = 1.09), 3 matching HPO terms (Delayed speech and language development, Global developmental delay, Scoliosis) • HDAC8 c.839-5T>G (splice region), not seen in GnomAD, O/E LOF score of 0.06, splice score of 2.68, 4 matching HPO terms (Cerebral atrophy, Choanal atresia, Global developmental delay, Premature birth), high phenolyzer rank (405), X-linked dominant disorder in female participants • PCDH19 c.2338C>T p.Arg780Cys, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.89, CADD = 34), 1 matching HPO term (Global developmental delay), low phenolyzer rank (15248), female-restricted X-linked disorder in female participants, not a good phenotypic match

Filter 1 B – HPO Terms, AF < 0.001, Shared Homo variants: 2 variants analyzed:

• DMXL2 c.88-9_88-3dup (splice variant), seen in 5 out of 5 reads, not seen in GnomAD, O/E LOF score of 0.09, splice score of 0, only seen by 2 callers, 1 matching HPO term (Cerebellar hypoplasia) • PLEKHG2 c.882+8del (splice region), 98 hets in GnomAD, O/E LOF score of 0.43, splice score of 0, 1 matching HPO term (Global developmental delay)

Filter 1 C – HPO Terms, AF < 0.001, Burden > 1 in autosomal recessive genes: 0 variants found

Filter 2 – AF < 0.001, Likely Loss of Function Variants: 27 variants analyzed:

• Het PRAMEF1 c.688T>C p.Ter230GlnextTer50 (stop loss), 2 hets in GnomAD, O/E LOF score of 1.44, no phenolyzer rank listed • Het PRRC2C c.1248+3A>G (splice region), not seen in GnomAD, O/E LOF score of 0.06, splice score of 2.17, no phenolyzer rank listed • Het NFASC c.536-4A>G (splice region), 2 hets in GnomAD, O/E LOF score of 0.12, splice score of -0.02 • Het MALRD1 c.597+8T>G (splice region), 12 hets in GnomAD, no O/E scores listed, splice score of 0, no phenolyzer rank listed • Het APBB1IP c.-203+7_-203+8insAGGGATAACAGAGG (splice region), seen 24 times in C4R, not seen in GnomAD, O/E LOF score of 0.14, splice score of 0, only seen by 2 callers, high phenolyzer rank (911) • Het APBB1IP c.-203+7_-203+8insAGGGATAACAGAG (splice region), not seen in GnomAD, O/E LOF score of 0.14, splice score of 0, only seen by 2 callers, high phenolyzer rank (911) • Het RDX c.147_150dup p.Asp51CysfsTer11 (frameshift), single variant in AR disease-gene, 1 het in GnomAD, O/E LOF score of 0.17 • Het FAM216B c.99+23_99+27dup (splice region), not seen in GnomAD, O/E LOF score of 0.68, splice score of 0, only seen by 2 callers, no phenolyzer rank listed • Het LINS c.-272-17G>T (splice region), het variant in AR disease-gene, 4 hets in GnomAD, O/E LOF score of 0.57, splice score of -0.87, no phenolyzer rank listed • Het GGT6 c.1195dup p.Ala399GlyfsTer12 (frameshift), 11 hets in GnomAD, O/E LOF score of 0.96 • Het PMSB6 c.102+3G>A (splice region), 8 hets in GnomAD, O/E LOF score of 0.39, splice score of -2.55, high phenolyzer rank (165) • Het PLD2 c.2123+10dup (splice region), 30 hets in GnomAD, O/E LOF score of 0.87, splice score of 0, high phenolyzer rank (601) • Het FXR2 c.1926+99G>A (splice region), not seen in GnomAD, O/E LOF score of 0.07, splice score of -0.22, low phenolyzer rank (12431) • Het ANKRD20A5P n.502-281G>A (splice region), 28 hets in GnomAD, no O/E scores listed, splice score of 0.13, no phenolyzer rank listed • Het PSMA8 c.230-2A>G (splice acceptor), 4 hets in GnomAD, O/E LOF score of 0.59, splice score of 7.96, high phenolyzer rank (187), GUS • Het PHLPP1 c.3325-4T>C (splice region), 32 hets in GnomAD, O/E LOF score of 0.14, splice score of 1.03 • Het AXL c.410-7C>T (splice region), not seen in GnomAD, O/E LOF score of 0.27, splice score of 0.37

• Het SSC5D c.1053_1071dup p.Phe358GlyfsTer41 (frameshift), 1 het in GnomAD, O/E LOF score of 0.87, only seen by 2 callers, no phenolyzer rank listed • Het C2orf53 c.106C>T p.Gln36Ter (nonsense), 10 hets in GnomAD, O/E LOF score of 1.12, no phenolyzer rank listed • Het TMPRSS2 c.1267dup p.Ala423GlyfsTer21 (frameshift), not seen in GnomAD, O/E LOF score of 0.59, low phenolyzer rank (10742) • Het DDX17 c.1325+8T>C (splice region), 35 hets in GnomAD, O/E LOF score of 0.03, splice score of 0, low phenolyzer rank (12388) • Het MAML3 c.1512_1526del (splice acceptor), seen 12 times in C4R, not seen in GnomAD, O/E LOF score of 0.10, splice score of 0, only seen by 2 callers • Het FAM200A c.1136_1142del p.Gln379ArgfsTer36 (frameshift), seen 5 times in C4R, 6 hets in GnomAD, O/E LOF score of 0.78, low phenolyzer rank (20098) • Het ZNF883 c.949G>T p.Gly317Ter (nonsense), 7 hets in GnomAD, no O/E scores listed, low phenolyzer rank (10641) • Het GPR144 c.2790+2T>G (splice donor), not seen in GnomAD, O/E LOF score of 0.71, splice score of 7.6, only seen by 2 callers, low phenolyzer rank (17422) • Het RBMXL3 c.1205_1206insAGGCCGCTCGCCCGACGCCCACAGCG p.Asn405SerfsTer111 (frameshift), not seen in GnomAD, no O/E scores listed, only seen by 2 callers, low phenolyzer rank (12613), X-linked gene in female participants • Het SNX12 n.602-17_602-8del (splice region), not seen in GnomAD, no O/E scores listed, splice score of 0, low phenolyzer rank (11472)

Shared homozygous variants not yet reviewed: 6 variants analyzed:

• RNF207 c.1754C>T p.Pro585Leu, seen 6 times in C4R, 352 hets in GnomAD, in-silico predicts benign (SIFT = 0.32, PolyPhen = 0, CADD = 11.93), low phenolyzer rank (10857) • LOC388282 c.137_138insTGCAGGTG p.Gly49ValfsTer35 (frameshift), seen 124 times in C4R, not seen in GnomAD, no O/E scores listed, only seen by 2 callers, no phenolyzer rank listed • MEGF8 c.7774G>A p.Val2592Met, VUS in ClinVar, 255 hets in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.99, CADD = 25.7), 3 matching HPO terms (Global developmental delay, Scoliosis, Wide nasal bridge), low phenolyzer rank (12214), gene causes Carpenter syndrome, which is not a good phenotypic match • NTF4 c.172C>T p.Gln58Ter (nonsense), seen 18 times in C4R, 293 hets in GnomAD, O/E LOF score of 0.73 • PTOV1 c.289G>A p.Gly97Ser, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0.02, PolyPhen = 0.97, CADD = 31), low phenolyzer rank (17324) • RGPD8 c.2856G>T p.Arg952Ser, seen in 16 out of 16 reads, seen 34 times in C4R, 8 hets in GnomAD, in-silico predicts benign (SIFT = 1, PolyPhen = 0, CADD = 0.003), only seen by 2 callers, low phenolyzer rank (11838)
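For the affected sibling pair in Case 19, only variants present in both participants were reviewed, and shared homozygous variants were examined separately. A minimal sketch of that intersection step follows; it is an illustration under stated assumptions rather than the procedure used here. Variants are assumed to be dictionaries keyed by chromosome, position, reference allele and alternate allele, and the helper names and the "mixed" label for discordant zygosity are hypothetical.

```python
def variant_key(v: dict) -> tuple:
    """Identify a variant by position and alleles (assumed field names)."""
    return (v["chrom"], v["pos"], v["ref"], v["alt"])


def shared_variants(sib1: list[dict], sib2: list[dict]) -> list[dict]:
    """Return variants called in both siblings, annotated with shared zygosity."""
    sib2_index = {variant_key(v): v for v in sib2}
    shared = []
    for v in sib1:
        other = sib2_index.get(variant_key(v))
        if other is None:
            continue  # private to one sibling, so not a shared candidate
        if v["zygosity"] == other["zygosity"]:
            label = v["zygosity"]   # "het" in both or "homo" in both
        else:
            label = "mixed"         # het in one sibling, homo in the other
        shared.append({**v, "shared_zygosity": label})
    return shared


def shared_homozygous(sib1: list[dict], sib2: list[dict]) -> list[dict]:
    """Subset of shared variants that are homozygous in both siblings."""
    return [v for v in shared_variants(sib1, sib2) if v["shared_zygosity"] == "homo"]
```

Keeping the zygosity label on each shared variant allows the same intersection to feed both the shared heterozygous and the shared homozygous lists.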

Case 20: Singleton Exome Reanalysis

883 variants total. 52 variants when filtered for HPO terms. 57 variants analyzed.

Filter 1 A – HPO Terms, AF < 0.001, Het variants in autosomal dominant genes: 8 variants analyzed:

• SYNE2 c.13526G>T p.Arg4509Leu, 4 hets in GnomAD, mixed in-silico predictions (SIFT = 0.03, PolyPhen = 0.25, CADD = 24.9), 1 matching HPO term (Elevated serum creatine phosphokinase) • PIEZO1 c.6734G>C p.Arg2245Pro, not seen in GnomAD, mixed in-silico predictions (SIFT = 0.02, PolyPhen = 0.48, CADD = 21.7), 2 matching HPO terms (Anemia, Splenomegaly), low phenolyzer rank (12007) • RYR1 c.328C>T p.His110Tyr, Likely Pathogenic and VUS in ClinVar, 1 het in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.99, CADD = 27.4), 4 matching HPO terms (Elevated serum creatine phosphokinase, Exercise-induced myalgia, Fever, Rhabdomyolysis), high phenolyzer rank (3), original exome reported as maternally inherited • RTEL1 c.3298G>T p.Ala1100Ser, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0.01, PolyPhen = 0.99, CADD = 27), 4 matching HPO terms (Anemia, Decreased antibody level in blood, Interstitial pulmonary abnormality, Splenomegaly), original exome reported as paternally inherited • NEFH c.1964_1965insTGAGAAGGCCCAGTCCCC p.Lys657_Glu658insAlaGlnSerProGluLys (in-frame insertion), not seen in GnomAD, 1 matching HPO term (Elevated serum creatine phosphokinase), low phenolyzer rank (11155) • DLEC1 c.868C>T p.Leu290Phe, 7 hets in GnomAD, in-silico predicts benign (SIFT = 0.11, PolyPhen = 0.14, CADD = 0.02), 1 matching HPO term (Lymphadenopathy), low phenolyzer rank (12600) • COLZA1 c.6193_6196del p.Glu2065LysfsTer140 (frameshift), not seen in GnomAD, O/E LOF score of 0.53, 1 matching HPO term (Anemia) • ASAH1 c.700A>C p.Ile234Leu, 22 hets in GnomAD, mixed in-silico predictions (SIFT = 0.49, PolyPhen = 0.03, CADD = 17.5), 2 matching HPO terms (Granulomatosis, Splenomegaly), high phenolyzer rank (100), original exome reported as maternally inherited

Filter 1 B – HPO Terms, AF < 0.001, Homo variants in autosomal recessive genes: 0 variants found

Filter 1 C – HPO Terms, AF < 0.001, Burden > 1 in autosomal recessive genes: 0 variants found

Filter 2 – AF < 0.001, Likely Loss of Function Variants: 37 variants analyzed:

• Het PGLYRP4 c.626-8C>T (splice region), 4 hets in GnomAD, O/E LOF score of 1.29, splice score of -0.24 • Het ELK4 c.-11C>T (splice region), seen in 10 out of 14 reads, not seen in GnomAD, O/E LOF score of 0.39, splice score of 1.67 • Het EIF2D c.260G>A p.Trp87Ter (nonsense), 3 hets in GnomAD, O/E LOF score of 0.56, low phenolyzer rank (16621) • Het SEC23IP c.1312+3G>A (splice region), not seen in GnomAD, O/E LOF score of 0.35, splice score of -2.98 • Het ARMC4 c.820-8dup (splice region), het variant in AR disease-gene, 1 het in GnomAD, O/E LOF score of 0.89, splice score of 0, low phenolyzer rank (18238) • Het SCN4B c.192-2A>G (splice acceptor), VUS in ClinVar, 6 hets in GnomAD, O/E LOF score of 1.12, splice score of 7.96, causes Long-QT which is not a phenotypic match • Het MUC5AC c.4312+5G>A (splice region), 2 hets in GnomAD, no O/E scores listed, splice score of 8.33

• Het NADSYN1 c.*75-7G>A (splice region), 20 hets in GnomAD, O/E LOF score of 0.67, splice score of 0.70 • Het DDX54 c.753-3del (splice region), not seen in GnomAD, O/E LOF score of 0.48, splice score of 0, low phenolyzer rank (21703) • Het TMTC1 c.939-32894_939-32893del (splice donor), not seen in GnomAD, no O/E scores listed, splice score of 0, low phenolyzer rank (12025) • Het MUC19 c.15933A>G p.Ter5311TrpextTer (stop loss), not seen in GnomAD, no O/E scores listed, only seen by 2 callers • Het MUC19 c.16803G>A p.Trp5601Ter (nonsense), seen 20 times in C4R, not seen in GnomAD, no O/E scores listed • Het LRRIQ1 c.3377+2T>G (splice donor), not seen in GnomAD, O/E LOF score of 0.76, splice score of 7.65, low phenolyzer rank (22312), GUS not in OMIM • Het BORA c.1708-4T>C (splice region), 10 hets in GnomAD, O/E LOF score of 0.41, splice score of -0.64, low phenolyzer rank (11279) • Het CTB-31N19.2 n.216C>T (splice region), seen 11 times in C4R, 17 hets in GnomAD, no O/E scores listed, splice score of 0.34, no phenolyzer rank listed • Het ZNF257 c.35-4A>C (splice region), 2 hets in GnomAD, O/E LOF score of 0.61, splice score of 0.63 • Het PPFIA3 c.2462+12_2462+13del (splice region), 1 het in GnomAD, O/E LOF score of 0.05, splice score of 0 • Het ZNF414 c.1049_1077del p.Gly350AlafsTer67 (frameshift), 2 hets in GnomAD, O/E LOF score of 0.28, only seen by 2 callers, low phenolyzer rank (12135), GUS not in OMIM • Het MRPS9 c.315+1G>A (splice donor), 2 hets in GnomAD, O/E LOF score of 0.64, splice score of 8.18, low phenolyzer rank (12399) • Het LRP2 c.428-13_428-8del (splice region), het variant in AR disease-gene, not seen in GnomAD, O/E LOF score of 0.19, splice score of 0, only seen by 2 callers, causes Donnai-Barrow syndrome which is not a good phenotypic match • Het FAM178B c.1045del p.Asp349IlefsTer58 (frameshift), not seen in GnomAD, O/E LOF score of 0.68, low phenolyzer rank (12864), GUS not in OMIM • Het MGAT4A c.403+6T>C (splice region), 6 hets in GnomAD, O/E LOF score of 0.39, splice score of 0.69 • Het NA c.346A>T p.Lys116Ter (nonsense), not seen in GnomAD, no O/E scores listed, no phenolyzer rank listed • Het NDUFV3 c.526C>T p.Arg176Ter (nonsense), 3 hets in GnomAD, O/E LOF score of 1.33 • Het CBS c.-10C>T (splice region), het variant in AR disease-gene, not seen in GnomAD, O/E LOF score of 0.45, splice score of 0.8 • Het PDGFB c.584dup p.Ser196PhefsTer36 (frameshift), 2 hets in GnomAD, O/E LOF score of 0.15, high phenolyzer rank (314) • Het FBLN1 c.1827C>T (splice region), not seen in GnomAD, O/E LOF score of 0.15, splice score of -1.99, low phenolyzer rank (11789) • Het EFCAB12 c.1391del p.Lys464ArgfsTer26 (frameshift), 16 hets in GnomAD, O/E LOF score of 0.53, low phenolyzer rank (12573)

• Homo RBM6 c.873+36_873+37del (frameshift), not seen in GnomAD, O/E LOF score of 0.06, only seen by 2 callers, low phenolyzer rank (14137) • Het ATP8A1 c.1764G>A (splice region), seen 5 times in C4R, 1 het in GnomAD, O/E LOF score of 0.24, splice score of -0.28 • Het DTNBP1 c.269-12672_269-12669del (frameshift), het variant in AR disease-gene, not seen in GnomAD, O/E LOF score of 0.62, only seen by 2 callers • Het SLC17A5 c.1112-8A>C (splice region), het variant in AR disease-gene, not seen in GnomAD, O/E LOF score of 0.69, splice score of -0.83, 1 matching HPO term (Splenomegaly) • Het RELN c.5211-10_5211-7del (splice region), not seen in GnomAD, O/E LOF score of 0.12, splice score of 0, only seen by 2 callers • Het ABCB5 c.1537-5A>G (splice region), 9 hets in GnomAD, O/E LOF score of 0.89, splice score of 1.42, low phenolyzer rank (10623) • Het PAPOLB c.1543_1546del p.Leu515GlnfsTer3 (frameshift), 1 het in GnomAD, O/E LOF score of 0.43, low phenolyzer rank (14948) • Het TYW1B c.1378C>T p.Arg460Ter (nonsense), 8 hets in GnomAD, no O/E scores listed, low phenolyzer rank (13196) • Het LINC00484 n.235+676A>G (splice region), seen in 10 out of 16 reads, not seen in GnomAD, no O/E scores listed, splice score of 1.66, only seen by 2 callers, low phenolyzer rank (17100)

Filter 3 – Neuromuscular Candidate Genes: 7 variants analyzed:

• Het AP4B1 c.577G>A p.Val193Ile, het variant in AR disease-gene, VUS in ClinVar, 19 hets in GnomAD, mixed in-silico predictions (SIFT = 0.55, PolyPhen = 0.06, CADD = 17.87), not a good phenotypic match • Het LAMA5 c.10288G>C p.Val3430Leu, 2 hets in GnomAD, in-silico predicts benign (SIFT = 0.19, PolyPhen = 0.01, CADD = 2.58), high phenolyzer rank (438) • Het ANK2 c.1997A>G p.Asn666Ser, Benign once and VUS 3 times in ClinVar, 14 hets in GnomAD, mixed in-silico score (SIFT = 0.07, PolyPhen = 0.12, CADD = 23.5), gene causes Long QT which is not a good phenotypic match • Het SGCB c.341C>T p.Ser114Phe, het variant in AR disease-gene, Pathogenic in ClinVar, 75 hets in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.83, CADD = 31), 1 matching HPO term (Elevated serum creatine phosphokinase) • Het LAMA2 c.9106C>T p.Arg3036Cys, het variant in AR disease-gene, 7 hets in GnomAD, in-silico predicts deleterious (SIFT = 0.01, PolyPhen = 0.99, CADD = 32), 1 matching HPO term (Elevated serum creatine phosphokinase) • Het DST c.5184+1726G>A (missense), het variant in AR disease-gene, 1 het in GnomAD, mixed in-silico predictions (SIFT = none, PolyPhen = 1, CADD = 16.65), 1 matching HPO term (Fever) • Het DSP c.273+5G>A (splice region), 79 hets in GnomAD, O/E LOF score of 0.19, splice score of 8.13, 1 matching HPO term (Interstitial pulmonary abnormality), not a good phenotypic match

Filter 3 – Immune Candidate Genes: 4 variants analyzed:

• Het AP3D1 c.2854_2856del p.Lys952del (in-frame deletion), het variant in AR disease-gene, 45 hets in GnomAD, 3 matching HPO terms (Interstitial pulmonary abnormality, Neutropenia, Splenomegaly), not a good phenotypic match

• Het DNMT3B c.1151T>C p.Phe384Ser, het variant in AR disease-gene, 37 hets in GnomAD, mixed in-silico predictions (SIFT = 0.15, PolyPhen = 0.003, CADD = 20.5), 3 matching HPO terms (Anemia, Decrease in T cell count, Decreased antibody level in blood), high phenolyzer rank (102), not a good phenotypic match • Het TLR3 c.1234C>G p.Leu412Val, VUS in ClinVar, 37 hets in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.99, CADD = 23.5), high phenolyzer rank (771), not a good phenotypic match • Het SBDS c.107T>C p.Val36Ala, het variant in AR disease-gene, 42 hets in GnomAD, in-silico predicts deleterious (SIFT = 0.03, PolyPhen = 1, CADD = 32), 3 matching HPO terms (Anemia, Neutropenia, Pancytopenia), not a good phenotypic match

OMIM Genes: 1 variant analyzed:

• SEPN1 c.1350C>G p.Ile450Met, seen in 31 out of 114 reads, VUS in ClinVar, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.75, CADD = 25.5), only seen by 2 callers

Case 21: Trio Exome Reanalysis, Affected Half-Sibling Pair and Unaffected Mother

306 sibling shared variants total. 22 sibling shared variants when filtered for HPO terms. 10 variants analyzed.

Filter 1A - HPO Terms, AF < 0.001, Sibling shared het variants not found in mother: 1 variant analyzed:

• APTX c.526-3del (splice region), seen in 4 out of 12 reads, het variant in AR disease-gene, Benign in ClinVar, not seen in GnomAD, O/E LOF score of 0.89, splice score of 0, only seen by 2 callers, 1 matching HPO term (Abnormality of the lower limb), low phenolyzer rank (15407)

Filter 1B - HPO Terms, AF < 0.001, Sibling shared homo variants het in mother: 0 variants found

Filter 1C – HPO, AF < 0.001, Burden > 1 in autosomal recessive genes: 0 variants found

Filter 2 – AF < 0.001, Sibling Shared Likely Loss of Function Variants: 0 variants

Filter 3 – Candidate gene list: No good candidate gene lists for this phenotype

Sibling Shared Variants Not Found in Mother: 8 variants analyzed:

• Het KMT2D c.12485G>A p.Arg4162Gln, 40 hets in GnomAD, mixed in-silico predictions (SIFT = 0.03, PolyPhen = 0, CADD = 21.8), 3 matching HPO terms (Abnormality of the lower limb, Aplasia/hypoplasia of the phalanges of the hand, Generalized hypotonia) • Het ANKLE1 c.1771T>G p.Leu591Val (missense), not seen in GnomAD, in-silico predicts benign (SIFT = none, PolyPhen = 0, CADD = none), low phenolyzer rank (12447) • Het MADCAM1 c.742T>C p.Ser248Pro, not seen in GnomAD, in-silico predicts benign (SIFT = 0.47, PolyPhen = 0, CADD = 2.15), only seen by 2 callers • Het LOC101926954 c.295A>T p.Thr99Ser (missense), not seen in GnomAD, no O/E scores listed, no in-silico scores listed, only seen by 2 callers, no phenolyzer rank listed

• Het LOC101926954 c.285T>G p.His95Gln (missense), not seen in GnomAD, no O/E scores listed, no in-silico scores listed, only seen by 2 callers, no phenolyzer rank listed • Homo HLA-DPB1 c.365-8G>A (splice region), 50 hets in GnomAD, O/E LOF score of 0.98, splice score of 0.75 • Homo HLA-DPB1 c.365-4C>G (splice region), 28 hets in GnomAD, O/E LOF score of 0.98, splice score of 0.55 • Homo HLA-DPB1 c.374G>A p.Arg125Lys, 29 hets in GnomAD, in-silico predicts benign (SIFT = 1, PolyPhen = 0, CADD = 0.001)

OMIM Genes: 1 variant analyzed:

• Het PRKCH c.1214G>A p.Arg405Lys, 11 hets in GnomAD, in-silico predicts deleterious (SIFT = 0.01, PolyPhen = 0.95, CADD = 32), heterozygous in mother, gene causes susceptibility to multifactorial cerebral infarction
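
The sibling-shared filters applied in Case 21 can be illustrated with a minimal Python sketch. It assumes each sample has already been reduced to variants that pass the HPO-term and AF < 0.001 criteria, and that each sample is represented as a dictionary mapping a variant label to its zygosity. The function name `sibling_shared` and the dictionary layout are hypothetical and are not taken from the pipeline used in this thesis; the three example variants and their zygosities are those reported above for Case 21.

```python
def sibling_shared(sib1, sib2, mother, sib_zygosity, mother_zygosity):
    """Return variants both siblings carry with `sib_zygosity`
    while the mother has `mother_zygosity` ("het", "homo" or "absent").

    Each sample is a dict {variant label: zygosity}; a variant that is
    missing from a dict is treated as "absent" in that sample.
    """
    def zyg(sample, variant):
        return sample.get(variant, "absent")

    return sorted(
        v for v in sib1
        if zyg(sib1, v) == sib_zygosity
        and zyg(sib2, v) == sib_zygosity
        and zyg(mother, v) == mother_zygosity
    )


# Illustrative genotypes based on the Case 21 listing (hypothetical data layout).
sibling_1 = {
    "APTX c.526-3del": "het",
    "PRKCH c.1214G>A": "het",
    "HLA-DPB1 c.374G>A": "homo",
}
sibling_2 = dict(sibling_1)              # both half-siblings share these calls
mother = {"PRKCH c.1214G>A": "het"}      # only PRKCH is reported in the mother

# Filter 1A: sibling-shared het variants not found in the mother.
filter_1a = sibling_shared(sibling_1, sibling_2, mother, "het", "absent")
# Filter 1B: sibling-shared homo variants that are het in the mother.
filter_1b = sibling_shared(sibling_1, sibling_2, mother, "homo", "het")

print(filter_1a)
print(filter_1b)
```

In this sketch, Filter 1A retains only the APTX variant and Filter 1B retains nothing, which agrees with the counts listed for Case 21. The PRKCH variant is excluded from Filter 1A because the mother is also heterozygous.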

Case 22: Duo Exome Reanalysis, Unaffected Father

551 variants total. 24 variants when filtered for HPO terms. 36 variants analyzed.

Filter 1 A – HPO Terms, AF < 0.001, Het variant not found in father: 7 variants analyzed:

• GNPTAB c.1284+4A>G (splice region), found in 9 out of 43 reads, het variant in AR disease-gene, 3 hets in GnomAD, O/E LOF score of 0.63, splice score of 2.24, only seen by 2 callers, 1 matching HPO term (Recurrent infections) • NOTCH3 c.2906G>T p.Arg969Leu, 1 het in GnomAD, mixed in-silico predictions (SIFT = 0.04, PolyPhen = 0.08, CADD = 22.8), 1 matching HPO term (Recurrent infections) • NFE2L2 c.407T>C p.Ile136Thr (missense), 1 het in GnomAD, in-silico predicts benign (SIFT = 1, PolyPhen = 0.29, CADD = 5.81), 1 matching HPO term (Recurrent infections) • ALMS1 c.8972A>G p.Asp2993Gly, het variant in AR disease-gene, not seen in GnomAD, mixed in-silico predictions (SIFT = none, PolyPhen = 0.04, CADD = 22.8), 3 matching HPO terms (Decreased liver function, Elevated hepatic transaminase, Recurrent infections) • ACAD9 c.1357C>T p.His453Tyr, het variant in AR disease-gene, not seen in GnomAD, mixed in-silico predictions (SIFT = 0, PolyPhen = 0.01, CADD = 17.01), 3 matching HPO terms (Decreased liver function, Elevated hepatic transaminase, Thrombocytopenia) • WFS1 c.1684G>A p.Gly562Ser, 17 hets in GnomAD, in-silico predicts deleterious (SIFT = 0.01, PolyPhen = 0.96, CADD = 24.8), 3 matching HPO terms (Anemia, Recurrent infections, Thrombocytopenia) • PEX1 c.2427A>C p.Leu809Phe, het variant in AR disease-gene, not seen in GnomAD, mixed in-silico predictions (SIFT = 0.03, PolyPhen = 0.06, CADD = 23.3), 1 matching HPO term (Decreased liver function)

Filter 1 B – HPO Terms, AF < 0.001, Homo variants het in father: 2 variants analyzed:

• SDR9C7 c.563G>A p.Arg188His, 40 hets in GnomAD, mixed in-silico predictions (SIFT = 0.02, PolyPhen = 0.10, CADD = 26.3), splice score of 0.18, 2 matching HPO terms (Recurrent infections, Sepsis) • GALNS c.1495C>G p.Pro499Ala, 1 het in GnomAD, in-silico predicts benign (SIFT = 0.18, PolyPhen = 0.02, CADD = 14.53), 1 matching HPO term (Recurrent infections)

Filter 1 C – HPO Terms, AF < 0.001, Burden > 1 in autosomal recessive genes: 0 variants found

Filter 2 – AF < 0.001, Likely Loss of Function Variants: 15 variants analyzed:

• Het CRY1 c.595+4A>G (splice region), seen 6 times in C4R, 1 het in GnomAD, O/E LOF score of 0.56, splice score of 3.27, only seen by 2 callers, low phenolyzer rank (13409) • Het PPL c.717C>A p.Tyr239Ter (nonsense), not seen in GnomAD, O/E LOF score of 0.85 • Het GRAMD1A c.1312-8C>T (splice region), 1 het in GnomAD, O/E LOF score of 0.33, splice score of 1.01, low phenolyzer rank (20667) • Het ZNF850 c.235+6dup (splice region), 1 het in GnomAD, O/E LOF score of 0, splice score of 0, low phenolyzer rank (12528) • Het SMPD4 c.908+2T>C (splice donor), het variant in AR disease-gene, not seen in GnomAD, O/E LOF score of 0.77, splice score of 7.75, not a good phenotypic match • Het BIRC6 c.839+4T>G (splice region), 1 het in GnomAD, O/E LOF score of 0.07, splice score of -0.33, low phenolyzer rank (10459) • Het VRKN2 c.1521dup p.Leu508SerfsTer14 (frameshift), 9 hets in GnomAD, O/E LOF score of 0.73, low phenolyzer rank (10803) • Het XRN2 c.549+6T>A (splice region), not seen in GnomAD, O/E LOF score of 0.21, splice score of 2.12 • Het FAM53A c.882+6G>A (splice region), 3 hets in GnomAD, O/E LOF score of 0.57, splice score of 0.85, low phenolyzer rank (15228) • Het C4orf50 c.55C>T p.Gln19Ter (nonsense), 5 hets in GnomAD, O/E LOF score of 0.96, low phenolyzer rank (13781) • Het CDHR2 c.3793-2A>G (splice region), seen 7 times in C4R, not seen in GnomAD, O/E LOF score of 0.66, splice score of 7.96, low phenolyzer rank (17571), GUS is not in OMIM • Homo ALKBH7 c.88del p.Ala30ProfsTer145 (frameshift), not seen in GnomAD, no O/E scores listed, low phenolyzer rank (13681) • Homo MROH8 c.2376_2377del p.Arg793ThrfsTer (frameshift), not seen in GnomAD, O/E LOF score of 0.61, low phenolyzer rank (21057) • Homo KIF9 c.2218-7C>T (splice region), seen 7 times in C4R, 1 het in GnomAD, O/E LOF score of 0.89, splice score of -0.005 • Homo SLC26A4 c.1806T>C (splice region), 1 het in GnomAD, O/E LOF score of 0.88, splice score of 0.15

Filter 3 – Immune Candidate Genes: 3 variants analyzed:

• Het JAK1 c.2854A>G p.Ile952Val, seen in 10 out of 56 reads, 12 hets in GnomAD, mixed in-silico predictions (SIFT = 1, PolyPhen = 0.34, CADD = 20.4), high phenolyzer rank (54) • Het LIG1 c.2692C>T p.Arg898Trp, 5 hets in GnomAD, mixed in-silico predictions (SIFT = 0.06, PolyPhen = 0.009, CADD = 22.8) • Het SAMD9 c.2374G>A p.Val792Ile, 134 hets in GnomAD, in-silico predicts benign (SIFT = 0.4, PolyPhen = 0, CADD = 0.004), 4 matching HPO terms (Anemia, Recurrent infections, Sepsis, Thrombocytopenia)

OMIM Genes: 4 variants analyzed:

• Het CR2 c.641G>A p.Arg214His, Likely benign in ClinVar, 137 hets in GnomAD, mixed in-silico predictions (SIFT = 0.01, PolyPhen = 0.62, CADD = 3.43), 4 matching HPO terms (Anemia, Elevated hepatic transaminase, Recurrent infections, Thrombocytopenia), high phenolyzer rank (816), father is heterozygous, gene causes common variable immunodeficiency • Het PHKG2 c.801+6C>T (splice region), het variant in AR disease-gene, 10 hets in GnomAD, O/E LOF score of 0.36, splice score of -2.05, 1 matching HPO term (Elevated hepatic transaminase), father is heterozygous, gene causes cirrhosis due to liver phosphorylase kinase deficiency • Het TFRC c.536G>A p.Ser179Asn, het variant in AR disease-gene, 7 hets in GnomAD, mixed in-silico predictions (SIFT = 0.2, PolyPhen = 0, CADD = 17.37), 4 matching HPO terms (Anemia, Recurrent infections, Sepsis, Thrombocytopenia), high phenolyzer rank (412), father is heterozygous, gene causes immunodeficiency • Het SKIV2L c.2752G>A p.Val918Ile, het variant in AR disease-gene, seen 9 times in C4R, 127 hets and 1 homo in GnomAD, in-silico predicts benign (SIFT = 0.61, PolyPhen = 0.037, CADD = 8.43), gene causes Trichohepatoenteric syndrome

Homozygous Variants with CADD >15, Heterozygous in Father: 5 variants analyzed:

• ZNFX1 c.3152T>C p.Leu1051Pro, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.99, CADD = 30), no phenolyzer rank listed, GUS not in OMIM • DALRD3 c.1250A>G p.Tyr417Cys, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.99, CADD = 24.1), low phenolyzer rank (20763), GUS not in OMIM • RPUSD3 c.484G>A p.Gly162Arg, 2 hets in GnomAD, in-silico predicts deleterious (SIFT = 0.07, PolyPhen = 0.98, CADD = 23.8), low phenolyzer rank (15337) • CSF2 c.122G>A p.Arg41His, 28 hets in GnomAD, in-silico predicts benign (SIFT = 0.39, PolyPhen = 0.61, CADD = 14.92), high phenolyzer rank (470) • LMTK2 c.605G>A p.Gly202Glu, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.99, CADD = 29.2), low phenolyzer rank (15248)

Case 23: Singleton Exome Reanalysis

552 variants total. 56 variants when filtered for HPO terms. 72 variants analyzed.

Filter 1 A – HPO Terms, AF < 0.001, Het variants in autosomal dominant genes: 6 variants analyzed:

• LRRK2 c.7111C>T p.Pro2371Ser, not seen in GnomAD, mixed in-silico predictions (SIFT = 0.98, PolyPhen = 0.164, CADD = 16.01), 1 matching HPO term (Hypertonia) • BUB1 c.2710A>G p.Ile904Val, not seen in GnomAD, mixed in-silico predictions (SIFT = 0.05, PolyPhen = 0.38, CADD = 23.3), 2 matching HPO terms (Global developmental delay, Microcephaly) • GLI2 c.4577C>T p.Ser1526Leu, 17 hets in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.99, CADD = 33), 2 matching HPO terms (Global developmental delay, Microcephaly) • SAMHD1 c.1174A>G p.Lys392Glu, not seen in GnomAD, mixed in-silico predictions (SIFT = 0.1, PolyPhen = 0.005, CADD = 12.45), 6 matching HPO terms (Feeding difficulties, Global
developmental delay, Hypertonia, Hypoplasia of the corpus callosum, Microcephaly, Thrombocytopenia) • ATXN1 c.624_626dup p.Gln208dup (in-frame insertion), Likely benign in ClinVar, not seen in GnomAD, only seen by 2 callers, 1 matching HPO term (Hypertonia), low phenolyzer rank (12214) • HLA-DQA1 c.447C>A p.Ser149Arg, 1 het in GnomAD, in-silico predicts benign (SIFT = 1, PolyPhen = 0.02, CADD = 0.008), 1 matching HPO term (Failure to thrive)

Filter 1B – HPO Terms, AF <0.001, Homo variants in autosomal recessive genes: 0 variants found

Filter 1C – HPO Terms, AF < 0.001, Burden >1 in autosomal recessive genes: 6 variants analyzed:

• WFS1 c.2082G>C p.Glu694Asp, 1 het in GnomAD, mixed in-silico predictions (SIFT = 0.03, PolyPhen = 0.08, CADD = 14.6), 2 matching HPO terms (Feeding difficulties, Thrombocytopenia) o WFS1 c.1675G>A p.Ala559Thr, Benign five times and likely benign four times and VUS once in ClinVar, seen 16 times in C4R, 1112 hets and 2 homos in GnomAD, mixed in-silico predictions (SIFT = 0.63, PolyPhen = 0.09, CADD = 21.6) • HLA-B c.283G>A p.Ala95Thr, 1 het in GnomAD, in-silico predicts benign (SIFT = 0.13, PolyPhen = 0.003, CADD = 11.07), 1 matching HPO term (Thrombocytopenia) • HLA-B c.282G>C p.Gln94His, 22 hets in GnomAD, in-silico predicts benign (SIFT = 0.47, PolyPhen = 0.19, CADD = 9.39), 1 matching HPO term (Thrombocytopenia) • HLA-B c.206A>C p.Glu69Ala, 2 hets in GnomAD, in-silico predicts benign (SIFT = 0.22, PolyPhen = 0.01, CADD = 5.25), 1 matching HPO term (Thrombocytopenia) • HLA-DRB1 c.370+8T>C (splice region), not seen in GnomAD, O/E LOF score of 0.59, splice score of 0, only seen by 2 callers, 2 matching HPO terms (Hypertonia, Thrombocytopenia) • HLA-DRB1 c.351C>A p.Ser117Arg, not seen in GnomAD, in-silico predicts benign (SIFT = 0.46, PolyPhen = 0.02, CADD = 0.08), only seen by 2 callers, 2 matching HPO terms (Hypertonia, Thrombocytopenia)
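
The "Burden > 1 in autosomal recessive genes" criterion (Filter 1C) can be sketched in Python as a per-gene count. The sketch assumes the input has already been restricted to rare variants (AF < 0.001) in HPO-matched autosomal recessive genes, and it treats "burden" as the number of such variants per gene. The function name `burden_filter`, the tuple layout, and the gene `EXAMPLE1` are hypothetical and are not part of the pipeline used in this thesis; the HLA-B and HLA-DRB1 entries are taken from the Case 23 list above.

```python
from collections import defaultdict


def burden_filter(variants, min_burden=2):
    """Group (gene, HGVS) pairs by gene and keep only genes that carry
    at least `min_burden` variants (that is, burden > 1 by default)."""
    by_gene = defaultdict(list)
    for gene, hgvs in variants:
        by_gene[gene].append(hgvs)
    return {
        gene: calls for gene, calls in by_gene.items()
        if len(calls) >= min_burden
    }


# Input is assumed to be pre-filtered for AF < 0.001 and AR inheritance.
rare_ar_variants = [
    ("HLA-B", "c.283G>A"),
    ("HLA-B", "c.282G>C"),
    ("HLA-B", "c.206A>C"),
    ("HLA-DRB1", "c.370+8T>C"),
    ("HLA-DRB1", "c.351C>A"),
    ("EXAMPLE1", "c.100A>G"),  # hypothetical single-variant gene, not from the thesis
]

kept = burden_filter(rare_ar_variants)
for gene, calls in kept.items():
    print(gene, len(calls), calls)
```

A count of two or more variants in one gene does not establish compound heterozygosity in a singleton, since the phase of the variants is unknown without parental data; the filter only flags genes for manual review.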

Filter 2: AF <0.001, Likely Loss of Function Variants: 40 variants analyzed:

• Het KRTCAP2 c.478C>T p.Gln160Ter (nonsense), 1 het in GnomAD, no O/E scores listed, low phenolyzer rank (20863) • Het SMG5 c.714-3C>G (splice region), not seen in GnomAD, O/E LOF score of 0.27, splice score of 8.45 • Het INSRR c.122+6722A>G (splice region), 2 hets in GnomAD, O/E LOF score of 0.76, splice score of 4.91, low phenolyzer rank (10080) • Het GALE c.873+7C>A (splice region), het variant in AR disease-gene, seen 5 times in C4R, 11 hets in GnomAD, O/E LOF score of 0.92, splice score of 0, 4 matching HPO terms (Delayed gross motor development, Delayed speech and language development, Failure to thrive, Global developmental delay), high phenolyzer rank (323) • Het THAP3 c.331-4C>T, 22 hets in GnomAD, O/E LOF score of 0.47, splice score of -0.29, low phenolyzer rank (19278) • Het ABCD1P2 n.235-6C>G (splice region), not seen in GnomAD, no O/E scores listed, splice score of 4.39, no phenolyzer rank listed

• Het NUP98 c.2198-7G>T (splice region), not seen in GnomAD, O/E LOF score of 0.06, splice score of -1.03, only seen by 2 callers, high phenolyzer rank (358) • Het OR5D16 c.910C>T p.Arg304Ter (nonsense), 25 hets in GnomAD, no O/E scores listed • Het RASSF7 c.34+169del (frameshift), seen in 5 out of 18 reads, 4 hets in GnomAD, no O/E scores listed, low phenolyzer rank (12873) • Het PRB4 c.325C>T p.Gln109Ter (nonsense), seen 5 times in C4R, 10 hets in GnomAD, O/E LOF score of 2.7, only seen by 2 callers, no phenolyzer rank listed • Het PRKCH c.-116T>C (splice region), seen in 12 out of 18 reads, 9 hets in GnomAD, O/E LOF score of 0.21, splice score of -0.45 • Het HERC2P9 n.1138+1_1138+2dup (splice region), seen 9 times in C4R, 12 hets in GnomAD, no O/E scores listed, splice score of 0, no phenolyzer rank listed • Het KCNJ12 c.-56-4G>C (splice region), not seen in GnomAD, O/E LOF score of 0.39, splice score of -0.25 • Het NCOR1P2 n.105G>T (splice region), not seen in GnomAD, no O/E scores listed, splice score of 3.21, no phenolyzer rank listed • Het POLRMT c.136_137insGGCGGCGGCGGCGGCGGCGC p.Pro46ArgfsTer13 (frameshift), not seen in GnomAD, no O/E scores listed, only seen by 2 callers • Het FBN3 c.2172C>T (splice region), 5 hets in GnomAD, O/E LOF score of 0.61, splice score of -0.99 • Het ZNF806 c.942del p.Tyr315ThrfsTer152 (frameshift), not seen in GnomAD, no O/E scores listed, low phenolyzer rank (10958) • Het ZNF806 c.1367dup p.Asn456LysfsTer85 (frameshift), not seen in GnomAD, no O/E scores listed, low phenolyzer rank (10958) • Het ZNF806 c.1580del p.Asn527MetfsTer36 (frameshift), not seen in GnomAD, no O/E scores listed, low phenolyzer rank (10958) • Het C2orf80 c.295-4A>G (splice region), 30 hets in GnomAD, O/E LOF score of 1.06, splice score of 2.91, no phenolyzer rank listed • Het NMUR1 c.378dup p.Phe127LeufsTer16 (frameshift), not seen in GnomAD, O/E LOF score of 0.73 • Het FAM182B c.201+1490C>T (splice region), not seen in GnomAD, no O/E scores listed, splice score of 0.53, only seen by 2 callers, no phenolyzer rank listed • Het FAM182B c.201+1483T>A (splice region), not seen in GnomAD, no O/E scores listed, splice score of 0.73, no phenolyzer rank listed • Homo MROH8 c.13+21_13+22insATAGACAGGGCCCCGCGGCCGGCACTCTT (splice acceptor), not seen in GnomAD, O/E LOF score of 0.60, splice score of 0, only seen by 2 callers, low phenolyzer rank (21262) • Het ABCC5 c.129+4378A>C (splice region), 1 het in GnomAD, O/E LOF score of 0.26, splice score of 2.96 • Homo LIMD1 c.1409-5G>T (splice region), not seen in GnomAD, O/E LOF score of 0.22, splice score of -2.69 • Homo LIMD1 c.1409-4C>G (splice region), not seen in GnomAD, O/E LOF score of 0.22, splice score of 0.34

• Homo LIMD1 c.1409-3A>C (splice region), not seen in GnomAD, O/E LOF score of 0.22, splice score of -3.85 • Het DOK7 c.1497+2T>G (splice donor), het variant in AR disease-gene, not seen in GnomAD, O/E LOF score of 0.47, splice score of 7.65, 1 matching HPO term (Delayed gross motor development) • Het MTX3 c.229-6dup (splice region), 81 hets in GnomAD, O/E LOF score of 0.59, splice score of 0, only seen by 2 callers, low phenolyzer rank (18088) • Het HLA-DRB6 n.876-5T>C (splice region), not seen in GnomAD, no O/E scores listed, splice score of -0.45, low phenolyzer rank (17427) • Homo HLA-DRB6 n.743A>G (splice region), not seen in GnomAD, no O/E scores listed, splice score of -0.13, low phenolyzer rank (17427) • Het HLA-DRB6 n.742T>A (splice region), not seen in GnomAD, no O/E scores listed, splice score of 0.51, low phenolyzer rank (17427) • Het PRSS3P1 n.591+4G>A (splice region), not seen in GnomAD, no O/E scores listed, splice score of -3.79, no phenolyzer rank listed • Het IQCE c.83-8T>C (splice region), 26 hets in GnomAD, O/E LOF score of 1.27, splice score of 0.45 • Het FOXK1 c.1920C>T (splice region), 3 hets in GnomAD, O/E LOF score of 0.17, splice score of 1.58 • Het NSUN5 c.93+8A>C (splice region), seen 5 times in C4R, 48 hets in GnomAD, O/E LOF score of 0.87, splice score of 0, low phenolyzer rank (16562) • Het ZNF883 c.949G>T p.Gly317Ter (nonsense), 7 hets in GnomAD, no O/E scores listed, low phenolyzer rank (10984) • Het RNF38 c.910-5C>T (splice region), 2 hets in GnomAD, O/E LOF score of 0.03, splice score of -0.47, low phenolyzer rank (10535) • Het PTPRD c.4304+7_4304+8insACAGTTCAGGAATGGTAAGTT (splice region), not seen in GnomAD, O/E LOF score of 0.04, splice score of 0, only seen by 2 callers, low phenolyzer rank (12034)

Filter 3 – Candidate Gene list: No candidate gene lists for this phenotype

Homo variants not yet analyzed: 18 variants analyzed:

• RP11-108K14.8 c.1A>G p.Met1? (start loss), seen in 4 out of 6 reads, seen 6 times in C4R, 2 hets in GnomAD, mixed in-silico predictions (SIFT = 0, PolyPhen = 0, CADD = 0.001), only seen by 2 callers, no phenolyzer rank listed • ANO5 c.367_368insAGGGGAATGAGGAGGAGGAGGAGG p.Glu122_Ala123insGluGlyAsnGluGluGluGluGlu (in-frame insertion), Benign in ClinVar, not seen in GnomAD, no O/E scores listed, only seen by 2 callers, low phenolyzer rank (11098) • AKAP6 c.3193T>G p.Cys1065Gly, 10 hets in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.98, CADD = 18.61), low phenolyzer rank (13918) • SARM1 c.449G>T p.Gly150Val, 2 hets in GnomAD, in-silico predicts benign (SIFT = 0.57, PolyPhen = 0, CADD = none) • SARM1 c.451T>G p.Ser151Ala, 1 het in GnomAD, in-silico predicts benign (SIFT = 0.6, PolyPhen = 0, CADD = none)

• KRTAP10-1 c.477T>A p.Asp159Glu, not seen in GnomAD, in-silico predicts benign (SIFT = 0.43, PolyPhen = 0.01, CADD = 0.02) • KRTAP10-1 c.476A>C p.Asp159Ala, not seen in GnomAD, in-silico predicts benign (SIFT = 1, PolyPhen = 0, CADD = 0.001) • KRTAP10-1 c.475G>T p.Asp159Tyr, not seen in GnomAD, in-silico predicts benign (SIFT = 0.05, PolyPhen = 0.23, CADD = 0.08) • MUC4 c.7702G>T p.Ala2568Ser, not seen in GnomAD, in-silico predicts benign (SIFT = 1, PolyPhen = 0.37, CADD = 9.13) • HLA-DRB6 n.458+7G>A, 2 hets in GnomAD, no O/E scores listed, splice score of 0, only seen by 2 callers, low phenolyzer rank (17427) • MUC3A c.1347A>T p.Arg449Ser, 19 hets and 4 homos in GnomAD, no in-silico scores listed • LCN8 c.230A>G p.His77Arg, not seen in GnomAD, no in-silico scores listed, low phenolyzer rank (20300) • LCN8 c.227T>C p.Leu76Pro, 1 het in GnomAD, no in-silico scores listed, low phenolyzer rank (20300) • C9orf169 c.55C>T p.Arg19Trp, 2 hets in GnomAD, mixed in-silico predictions (SIFT = 0, PolyPhen = 0.08, CADD = 33), low phenolyzer rank (19244) • MAP7D2 c.1197A>C p.Glu399Asp, 10 hets in GnomAD, mixed in-silico predictions (SIFT = 0.14, PolyPhen = 0.02, CADD = 0.08), no phenolyzer rank listed • MAGEB6 c.293G>A p.Arg98His, seen in 14 out of 15 reads, 9 hets in GnomAD, mixed in-silico predictions (SIFT = 0.09, PolyPhen = 0, CADD = 0.001), no phenolyzer rank listed • MAGEB6 c.299C>T p.Ala100Val, seen in 12 out of 13 reads, 3 hets in GnomAD, mixed in-silico predictions (SIFT = 0.02, PolyPhen = 0.39, CADD = 13.16), no phenolyzer rank listed • AR c.237_239dup p.Gln80dup, seen in 12 out of 12 reads, not seen in GnomAD, only seen by 2 callers, high phenolyzer rank (24)

OMIM Genes: 2 variants analyzed:

• RAB3GAP2 c.3350G>A p.Arg1117Lys, het variant in AR disease-gene, not seen in GnomAD, mixed in-silico predictions (SIFT = 0.91, PolyPhen = 0.02, CADD = 20.2), 4 matching HPO terms (Delayed speech and language development, Feeding difficulties, Global developmental delay, Hypertonia, Hypoplasia of the corpus callosum, Microcephaly), gene causes Martsolf syndrome and Warburg micro syndrome • KRIT1 c.292C>T p.Pro98Ser, 17 hets in GnomAD, mixed in-silico predictions (SIFT = 0.24, PolyPhen = 0.03, CADD = 16.23), low phenolyzer rank (10647), gene causes autosomal dominant cerebral cavernous malformations

Case 24: Singleton Exome Reanalysis

526 variants total. 56 variants when filtered for HPO terms. 69 variants analyzed.

Filter 1A – HPO, AF < 0.001, Het variants in autosomal dominant genes: 14 variants analyzed:

• NOTCH2 c.1861C>T p.Arg621Cys, 1 het in GnomAD, in-silico predicts deleterious (SIFT = 0.03, PolyPhen = 0.92, CADD = 27.5), 4 matching HPO terms (Abnormality of the lymphatic system,
Cholestasis, Osteopenia, Short stature), high phenolyzer rank (525), original exome determined variant is paternally inherited • UFC1 c.127G>C p.Val43Leu, 3 hets in GnomAD, mixed in-silico predictions (SIFT = 0.03, PolyPhen = 0.12, CADD = 23.7), 1 matching HPO term (Short stature), low phenolyzer rank (18502) • LBR c.964G>C p.Val322Leu, not seen in GnomAD, in-silico predicts benign (SIFT = 0.86, PolyPhen = 0.02, CADD = 4.55), 7 matching HPO terms (Abnormality of the lymphatic system, Ascites, Cholestasis, Lymphedema, Malabsorption, Recurrent otitis media, Short stature), high phenolyzer rank (616) • GFI1 c.925-16_925-5dup (splice region), not seen in GnomAD, O/E LOF score of 0.25, splice score of 0, 1 matching HPO term (Osteopenia), low phenolyzer rank (10251) • COL2A1 c.4064G>A p.Gly1355Asp, VUS in ClinVar, 25 hets in GnomAD, mixed in-silico score (SIFT = 0.37, PolyPhen = 0.99, CADD = 23.3), 2 matching HPO terms (Recurrent otitis media, Short stature) • SDR9C7 c.454C>T p.Arg152Trp, 11 hets in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.99, CADD = 34), 1 matching HPO term (Short stature) • TSC2 c.4498G>A p.Val1500Met, 1 het in GnomAD, in-silico predicts deleterious (SIFT = 0; PolyPhen = 0.99, CADD = 32), 4 matching HPO terms (Abnormality of the lymphatic system, Ascites, Hypothyroidism, Lymphedema), high phenolyzer rank (51) • TTN c.13561A>G p.Thr4521Ala (missense), 5 hets in GnomAD, in-silico predicts benign (SIFT = 0.29, PolyPhen = 0, CADD = 1.87), 1 matching HPO term (Short stature) • FN1 c.4652C>A p.Thr1551Asn, 1 het in GnomAD, mixed in-silico predictions (SIFT = 0.1, PolyPhen = 0.03, CADD = 12.96), 1 matching HPO term (Short stature), high phenolyzer rank (457) • SPRY4 c.311_313dup p.Ser104dup (in-frame insertion), 14 hets in GnomAD, 1 matching HPO term (Osteopenia), low phenolyzer rank (11566) • TCOF1 c.1316C>T p.Pro439Leu, VUS in ClinVar, 103 hets in GnomAD, mixed in-silico predictions (SIFT = 0.09, PolyPhen = 0.05, CADD = 12.88), 1 
matching HPO term (Abnormality of the lymphatic system), low phenolyzer rank (14533) • BTNL2 c.1021T>C p.Cys341Arg, 5 hets in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 1, CADD = 25.5), 3 matching HPO terms (Abnormality of the lymphatic system, Bronchiectasis, Hypothyroidism) • HLA-DQA1 c.91G>A p.Val31Ile, not seen in GnomAD, mixed in-silico predictions (SIFT = 0.08, PolyPhen = 0.007, CADD = 4.42), 2 matching HPO terms (Malabsorption, Short stature), high phenolyzer rank (433) • GNE c.1985C>T p.Ala662Val, Pathogenic in ClinVar, 24 hets in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.99, CADD = 33), 2 matching HPO terms (Abnormality of the lymphatic system, Hypothyroidism), gene causes AR Nonaka myopathy and AD Sialuria

Filter 1 B – HPO, AF < 0.001, Homo Variants in autosomal recessive genes: 2 variants analyzed:

• HLA-B c.283G>A p.Ala95Thr, 1 het in GnomAD, in-silico predicts benign (SIFT = 0.13, PolyPhen = 0.003, CADD = 11.07), 2 matching HPO terms (Abnormality of the lymphatic system, Malabsorption), high phenolyzer rank (451) • HLA-B c.282G>C p.Gln94His, 22 hets in GnomAD, in-silico predicts benign (SIFT = 0.47, PolyPhen = 0.19, CADD = 9.39), 2 matching HPO terms (Abnormality of the lymphatic system, Malabsorption), high phenolyzer rank (451)

Filter 1 C – HPO, AF < 0.001, Burden > 1 in autosomal recessive genes: 5 variants analyzed:

• RELB c.433G>A p.Glu145Lys, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.87, CADD = 33), 1 matching HPO term (Recurrent otitis media), high phenolyzer rank (963), gene causes AR immunodeficiency • RELB c.1091C>T p.Pro364Leu, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.99, CADD = 34), 1 matching HPO term (Recurrent otitis media), high phenolyzer rank (963), gene causes AR immunodeficiency • HLA-B c.538C>G p.Arg180Gly, not seen in GnomAD, in-silico predicts benign (SIFT = 0.33, PolyPhen = 0.03, CADD = 4.17), 2 matching HPO terms (Abnormality of the lymphatic system, Malabsorption), high phenolyzer rank (451) • HLA-DRB1 c.370+8T>C (splice region), not seen in GnomAD, O/E LOF score of 0.59, splice score of 0, only seen by 2 callers, 5 matching HPO terms (Abnormality of the lymphatic system, Bronchiectasis, Hypothyroidism, Lymphedema, Malabsorption), high phenolyzer rank (80) • HLA-DRB1 c.370+7A>G (splice region), not seen in GnomAD, O/E LOF score of 0.59, splice score of 0, only seen by 2 callers, 5 matching HPO terms (Abnormality of the lymphatic system, Bronchiectasis, Hypothyroidism, Lymphedema, Malabsorption), high phenolyzer rank (80)

Filter 2: AF < 0.001, Likely Loss of Function Variants: 47 variants analyzed:

• Het CPSF3L c.547-1G>T (splice acceptor), not seen in GnomAD, O/E LOF score of 0.71, splice score of 8.59, low phenolyzer rank (10541) • Het VPS45 c.703C>T p.Arg235Ter (nonsense), het variant in AR disease-gene, 6 hets in GnomAD, O/E LOF score of 0.42, 1 matching HPO term (Abnormality of the lymphatic system) • Het POU2F1 c.*296del (frameshift), not seen in GnomAD, O/E LOF score of 0.18, only seen by 2 callers • Het DCAF6 c.1379-4del (splice region), not seen in GnomAD, O/E LOF score of 0.23, splice score of 0, low phenolyzer rank (10202) • Het KLHL17 c.1701-6C>A (splice region), 1 het in GnomAD, O/E LOF score of 0.91, splice score of 2.98, low phenolyzer rank (22800) • Het TIRAP c.68-8C>T (splice region), 5 hets in GnomAD, O/E LOF score of 0.80, splice score of -0.47 • Het ANKRD10 c.691+8C>T (splice region), 28 hets in GnomAD, O/E LOF score of 0.27, splice score of 0, low phenolyzer rank (11065) • Het IGHV3-33 c.47-2A>C (splice region), 7 hets in GnomAD, no O/E scores listed, splice score of 8.04, only seen by 2 callers, low phenolyzer rank (14929) • Het ADCY4 c.931-5C>A (splice region), 82 hets in GnomAD, O/E LOF score of 0.65, splice score of 0.39, high phenolyzer rank (286) • Het MIS18BP1 c.3263del p.Pro1088GlnfsTer13 (frameshift), not seen in GnomAD, O/E LOF score of 0.56, low phenolyzer rank (12354) • Het CT62 c.135+1G>A (splice donor), 44 hets in GnomAD, O/E LOF score of 1.69, splice score of 8.18, no phenolyzer rank listed • Het KCNJ12 c.-56-4G>C (splice region), seen in 5 out of 16 reads, not seen in GnomAD, O/E LOF score of 0.39, splice score of -0.25

• Het SLC39A11 c.741dup p.Val248CysfsTer41 (frameshift), 5 hets in GnomAD, O/E LOF score of 1.24 • Het SERPINB13 c.-17-7A>G (splice region), 14 hets in GnomAD, O/E LOF score of 0.67, splice score of -1.15, low phenolyzer rank (12328) • Het A1BG-AS1 c.-44C>T (splice region), 59 hets in GnomAD, no O/E scores listed, low phenolyzer rank (12636) • Homo POLRMT c.-18_-17insGGCGGCGGCGGCGGCGGCGC (frameshift), not seen in GnomAD, no O/E scores listed • Het CYS1 c.319-7T>C (splice region), 25 hets in GnomAD, O/E LOF score of 0.72, splice score of 0.43, low phenolyzer rank (12357) • Het SCTR c.1014-5C>T (splice region), 21 hets in GnomAD, O/E LOF score of 0.76, splice score of 1.09 • Het ZNF806 c.942del p.Tyr315ThrfsTer152 (frameshift), not seen in GnomAD, no O/E scores listed, low phenolyzer rank (12517) • Het ZNF806 c.1367dup p.Asn456LysfsTer85 (frameshift), not seen in GnomAD, no O/E scores listed, low phenolyzer rank (12517) • Het ZNF806 c.1580del p.Asn527MetfsTer36 (frameshift), not seen in GnomAD, no O/E scores listed, low phenolyzer rank (12517) • Het FABP1 c.68-134G>A (splice region), seen in 9 out of 19 reads, 2 hets in GnomAD, O/E LOF score of 1.35, splice score of -0.81 • Het FAM182B c.201+1490C>T (splice region), not seen in GnomAD, no O/E scores listed, splice score of 0.53, no phenolyzer rank listed • Het FAM182B c.201+1483T>A (splice region), not seen in GnomAD, no O/E scores listed, splice score of 0.73, no phenolyzer rank listed • Het SLC9A8 c.-88T>C (splice region), 24 hets in GnomAD, O/E LOF score of 0.71, splice score of -0.61 • Het LOC101926954 c.975dup p.Met326TyrfsTer10 (frameshift), not seen in GnomAD, no O/E scores listed, no phenolyzer rank listed • Het LOC101926954 c.868C>T p.Arg290Ter (nonsense), not seen in GnomAD, no O/E scores listed, no phenolyzer rank listed • Het LOC101926954 c.358G>T p.Glu120Ter (nonsense), not seen in GnomAD, no O/E scores listed, no phenolyzer rank listed • Het PRODH c.605+3G>A (splice region), het variant in AR disease-gene, 15 hets in GnomAD, O/E LOF score of 0.74, splice score of -2.87 • Het METTL6 c.339-215A>G (splice region), not seen in GnomAD, O/E LOF score of 0.69, splice score of -0.16, low phenolyzer rank (21928) • Homo LIMD1 c.1409-5G>T (splice region), not seen in GnomAD, O/E LOF score of 0.22, splice score of -2.69 • Homo LIMD1 c.1409-4C>G (splice region), not seen in GnomAD, O/E LOF score of 0.22, splice score of 0.34 • Homo LIMD1 c.1409-3A>C (splice region), not seen in GnomAD, O/E LOF score of 0.22, splice score of -3.85

• Het AC026703.1 c.276del p.Lys93AsnfsTer7 (frameshift), 2 hets in GnomAD, no O/E scores listed, no phenolyzer rank listed
• Het NNT c.1471-5A>T (splice region), het variant in AR disease-gene, 96 hets in GnomAD, O/E LOF score of 0.15, splice score of -0.27, 1 matching HPO term (Hypothyroidism)
• Het TNXB c.2605C>T p.Arg869Ter (nonsense), not seen in GnomAD, O/E LOF score of 0.26
• Homo HLA-DRB6 n.876-5T>C (splice region), not seen in GnomAD, no O/E scores listed, splice score of -0.45, low phenolyzer rank (16030)
• Homo HLA-DRB6 n.743A>G (splice region), not seen in GnomAD, no O/E scores listed, splice score of -0.13, low phenolyzer rank (16030)
• Homo HLA-DRB6 n.742T>A (splice region), not seen in GnomAD, no O/E scores listed, splice score of 0.51, low phenolyzer rank (16030)
• Homo PRSS3P1 n.591+4G>A (splice region), not seen in GnomAD, no O/E scores listed, splice score of -3.79, no phenolyzer rank listed
• Homo RP11-368M16.3 n.101+6C>A (splice region), seen in 12 out of 14 reads, seen 17 times in C4R, not seen in GnomAD, no O/E scores listed, splice score of 1.23, only seen by 2 callers, no phenolyzer rank listed
• Het VPS13B c.6732+1G>A (splice donor), het variant in AR disease-gene, Likely pathogenic and pathogenic in ClinVar, 11 hets in GnomAD, O/E LOF score of 0.54, splice score of 8.18, 2 matching HPO terms (Failure to thrive in infancy, Short stature)
• Het FBXL6 c.1225+6C>T (splice region), seen in 11 out of 41 reads, not seen in GnomAD, O/E LOF score of 0.73, splice score of -2.67, low phenolyzer rank (16309)
• Het NSMAF c.2360T>A p.Leu787Ter (nonsense), 6 hets in GnomAD, O/E LOF score of 0.49
• Het ASHP c.847+300_847+303del (splice region), seen in 9 out of 17 reads, het variant in AR disease-gene, 4 hets in GnomAD, O/E LOF score of 0.62, splice score of 0, low phenolyzer rank (10071)
• Het KIAA1429 c.5141-6G>C (splice region), 9 hets in GnomAD, O/E LOF score of 0.16, splice score of -1.17, low phenolyzer rank (15969)
• Het PTPRD c.4304+7_4304+8insACAGTTCAGGAATGGTAAGTT (splice region), not seen in GnomAD, O/E LOF score of 0.04, splice score of 0, only seen by 2 callers, low phenolyzer rank (14985)

Filter 3 – Immune Candidate Genes: 1 variant analyzed:

• Het IL21R c.495C>A p.Asp165Glu, het variant in AR disease-gene, 12 hets in GnomAD, mixed in-silico predictions (SIFT = 0.08, PolyPhen = 0.01, CADD = 16.03), 3 matching HPO terms (Bronchiectasis, Chronic diarrhea, Immunodeficiency), high phenolyzer rank (395)

Case 25: Trio Exome Reanalysis, Unaffected Parents

597 variants total. 76 variants when filtered for HPO terms. 15 variants analyzed.

Filter 1A – HPO, AF < 0.001, de novo het variants: 1 variant analyzed:

• FMN2 c.2962G>A p.Ala988Thr, seen in 60 out of 247 reads, het variant in AR disease-gene, not seen in GnomAD, in-silico predicts benign (SIFT = 0.35, PolyPhen = 0, CADD = 0.003), only seen by 2 callers, 5 matching HPO terms (Feeding difficulties, Generalized hypotonia, Global developmental delay, Intellectual disability, Seizures), low phenolyzer rank (10707)

Filter 1B – HPO, AF < 0.001, Homo variants when parents are not homo: 0 variants found

Filter 1C – HPO, AF < 0.001, Burden > 1 in autosomal recessive disease genes: 2 variants analyzed:

• SPTB c.4819G>A p.Val1607Ile, 13 hets in GnomAD, mixed in-silico predictions (SIFT = 0.03, PolyPhen = 0, CADD = 21.7), 1 matching HPO term (Ataxia), high phenolyzer rank (470), maternally inherited
• SPTB c.871G>A p.Gly291Ser, 62 hets in GnomAD, in-silico predicts deleterious (SIFT = 0.01, PolyPhen = 0.99, CADD = 34), 1 matching HPO term (Ataxia), high phenolyzer rank (470), paternally inherited
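The "Burden > 1" filter above keeps autosomal recessive disease genes that carry more than one rare variant, and the inheritance notes show whether the pair could sit on opposite parental alleles. The following is a minimal sketch of that logic, assuming each variant record carries a gene symbol, an allele frequency, and a parent of origin. The `Variant` class, the function names, and the toy records are hypothetical illustrations and are not taken from the pipeline used in this thesis.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    hgvs: str
    allele_freq: float   # population allele frequency (hypothetical field)
    inherited_from: str  # "mother", "father", or "de_novo"


def burden_candidates(variants, ar_genes, max_af=0.001):
    """Group rare variants by AR disease gene; keep genes with more than one."""
    by_gene = defaultdict(list)
    for v in variants:
        if v.gene in ar_genes and v.allele_freq < max_af:
            by_gene[v.gene].append(v)
    return {gene: hits for gene, hits in by_gene.items() if len(hits) > 1}


def is_biparental(hits):
    """True when at least one variant came from each parent (possible compound het)."""
    origins = {v.inherited_from for v in hits}
    return {"mother", "father"} <= origins


# Toy records with made-up genes and frequencies, not data from this cohort.
toy = [
    Variant("GENE_A", "c.100G>A", 0.0002, "mother"),
    Variant("GENE_A", "c.200C>T", 0.0004, "father"),
    Variant("GENE_B", "c.300A>G", 0.0001, "mother"),
    Variant("GENE_C", "c.400T>C", 0.0100, "father"),
    Variant("GENE_C", "c.500G>C", 0.0003, "mother"),
]
candidates = burden_candidates(toy, ar_genes={"GENE_A", "GENE_B", "GENE_C"})
for gene, hits in candidates.items():
    print(gene, [v.hgvs for v in hits], "biparental:", is_biparental(hits))
```

In this toy set only GENE_A passes: GENE_B has a single rare variant, and one of the two GENE_C variants exceeds the frequency cut-off.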

Filter 2 – AF < 0.001, Likely Loss of Function Variants: 6 variants analyzed:

• Het MUC4 c.83-6510G>T (splice acceptor), seen in 16 out of 69 reads, not seen in GnomAD, no O/E scores listed, splice score of 8.59, only seen by 2 callers
• Het HLA-DRB5 c.115C>T p.Gln39Ter (nonsense), not seen in GnomAD, O/E LOF score of 1.11, only seen by 2 callers
• Homo HLA-DRB6 n.876-5T>C (splice region), not seen in GnomAD, no O/E scores listed, splice score of -0.45, low phenolyzer rank (15926)
• Homo HLA-DRB6 n.742T>A (splice region), not seen in GnomAD, no O/E scores listed, splice score of 0.51, low phenolyzer rank (15926)
• Homo PRSS3P1 n.591+4G>A (splice region), not seen in GnomAD, no O/E scores listed, splice score of -3.79, no phenolyzer rank listed
• Homo FBXL6 c.1225+6C>T (splice region), seen in 13 out of 14 reads, not seen in GnomAD, O/E LOF score of 0.73, splice score of -2.67, only seen by 2 callers, low phenolyzer rank (16274)

Filter 3 – Epilepsy Candidate Genes: 0 variants found

De Novo Variants Not Yet Analyzed: 5 variants analyzed:

• Het MUC4 c.83-6283T>C (missense), seen in 8 out of 29 reads, 27 hets in GnomAD, in-silico predicts benign (SIFT = 0.24, PolyPhen = 0.16, CADD = 0.006), only seen by 2 callers
• Het MUC4 c.83-6510G>T (splice acceptor), seen in 16 out of 69 reads, not seen in GnomAD, no O/E scores listed, splice score of 8.59, only seen by 2 callers
• Het HLA-DRB5 c.117G>C p.Gln39His, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0.02, PolyPhen = 0.85, CADD = 22.4), only seen by 2 callers
• Homo IGHJ6 c.18_19insGGT p.Tyr6_Met7insGly (in-frame insertion), seen 8 times in C4R, 9 hets in GnomAD, no O/E scores listed, only seen by 2 callers, low phenolyzer rank (20251)
• Homo LOC730268 c.147_148insG p.Leu50ValfsTer23 (frameshift), 124 hets in GnomAD, no O/E scores listed, only seen by 2 callers, no phenolyzer rank listed

Homo variant when each parent is het: 1 variant analyzed:

• ATG9A c.1121C>T p.Thr374Ile, 28 hets in GnomAD, mixed in-silico scores (SIFT = 0.08, PolyPhen = 0.74, CADD = 25.1), low phenolyzer rank (10075)
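
The trio categories used for this case (de novo het, homo when parents are not homo, homo when each parent is het) can be read as simple rules over the three genotypes. The sketch below assumes genotypes are coded as alternate-allele counts (0 = reference, 1 = het, 2 = homo). The function name, the category labels, and the order in which the rules are checked are hypothetical choices made for illustration and are not part of the thesis pipeline.

```python
def classify_trio_genotype(child, mother, father):
    """Classify a variant from alternate-allele counts (0 = ref, 1 = het, 2 = homo)."""
    for genotype in (child, mother, father):
        if genotype not in (0, 1, 2):
            raise ValueError(f"unexpected genotype code: {genotype!r}")
    if child == 1 and mother == 0 and father == 0:
        return "de_novo_het"
    # Checked before the broader rule below, because it is a special case of it.
    if child == 2 and mother == 1 and father == 1:
        return "homo_both_parents_het"
    if child == 2 and mother < 2 and father < 2:
        return "homo_parents_not_homo"
    return "inherited_or_other"


print(classify_trio_genotype(1, 0, 0))
print(classify_trio_genotype(2, 1, 1))
print(classify_trio_genotype(1, 1, 0))
```

A real call set would also need to handle missing genotypes and low read support, which this sketch ignores.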

Case 26: Duo Exome Reanalysis, Affected Sibling Pair

432 shared variants total. 21 shared variants when filtered for HPO terms. 64 variants analyzed.

Filter 1A – HPO, AF < 0.001, shared het variants: 1 variant analyzed:

• COL9A1 c.1418T>C p.Ile473Thr, 8 hets in GnomAD, mixed in-silico predictions (SIFT = 0.13, PolyPhen = 0.65, CADD = 24.7), 3 matching HPO terms (High myopia, Joint hyperflexibility, Retinal detachment), high phenolyzer rank (25)

Filter 1B – HPO, AF < 0.001, shared homo variants: 0 variants found

Filter 1C – HPO, AF < 0.001, Burden > 1 in autosomal recessive genes: 0 variants found

Filter 2 – AF < 0.001, Likely Loss of Function Variants: 39 variants analyzed:

• Het HIPK1-AS1 n.130A>T (splice region), 6 hets in GnomAD, no O/E scores listed, splice score of 4.18, no phenolyzer rank listed
• Het ACAP3 c.738+8C>T (splice region), 2 hets in GnomAD, O/E LOF score of 0.33, splice score of 0, low phenolyzer rank (10946)
• Het C2CD4D c.893dup p.Asp299GlyfsTer58 (frameshift), not seen in GnomAD, O/E LOF score of 0.60, no phenolyzer rank listed
• Het CDK11B c.1426-4G>T (splice region), 24 hets in GnomAD, O/E LOF score of 0.38, splice score of -0.46, low phenolyzer rank (10433)
• Het SYCP3 c.96T>A p.Cys32Ter (nonsense), not seen in GnomAD, no O/E scores listed, low phenolyzer rank (11201)
• Het TAPBPL c.682C>T p.Arg228Ter (nonsense), 24 hets in GnomAD, O/E LOF score of 0.82, low phenolyzer rank (16173)
• Het IGHV7-27 c.196G>T p.Glu66Ter (nonsense), seen in 6 out of 35 reads, not seen in GnomAD, no O/E scores listed, only seen by 2 callers, no phenolyzer rank listed
• Het PSMA6 c.447-8C>A (splice region), 20 hets in GnomAD, O/E LOF score of 0, splice score of 1.39, high phenolyzer rank (324)
• Het TARSL2 c.1866+8G>A (splice region), 13 hets in GnomAD, O/E LOF score of 0.65, splice score of 0, low phenolyzer rank (16289)
• Het CA12 c.874+6G>A (splice region), not seen in GnomAD, O/E LOF score of 0.88, splice score of 0.48
• Het SLCC2A31 c.-304A>G (splice region), 2 hets in GnomAD, O/E LOF score of 1.26, splice score of -0.06, low phenolyzer rank (10044)
• Het KCNJ12 c.-56-4G>C (splice region), not seen in GnomAD, O/E LOF score of 0.39, splice score of -0.25
• Het NOS2 c.1477-7T>C (splice region), 9 hets in GnomAD, O/E LOF score of 0.56, splice score of -0.43, high phenolyzer rank (777)
• Het CCDC124 c.465-6C>T (splice region), 36 hets in GnomAD, O/E LOF score of 0.25, splice score of 0.89, no phenolyzer rank listed

• Het SPTBN4 c.897+6A>T (splice region), 13 hets in GnomAD, O/E LOF score of 0.10, splice score of -0.76
• Het NTN5 c.970+1G>A (splice donor), 14 hets in GnomAD, O/E LOF score of 1.20, splice score of 8.18, no phenolyzer rank listed
• Het POLRMT c.136_137insGGCGGCGGCGGCGGCGGCGC p.Pro46ArgfsTer13 (frameshift), not seen in GnomAD, no O/E scores listed
• Het ZNF806 c.942del p.Tyr315ThrfsTer152 (frameshift), not seen in GnomAD, no O/E scores listed
• Het ZNF806 c.1367dup p.Asn456LysfsTer85 (frameshift), not seen in GnomAD, no O/E scores listed
• Het ZNF806 c.1580del p.Asn527MetfsTer36 (frameshift), not seen in GnomAD, no O/E scores listed
• Het THADA c.172-4A>G (splice region), not seen in GnomAD, O/E LOF score of 0.77, splice score of -0.69, no phenolyzer rank listed
• Het CLHC1 c.1199-6C>G (splice region), 8 hets in GnomAD, O/E LOF score of 1.00, splice score of 3.89, no phenolyzer rank listed
• Homo MROH8 c.13+21_13+22insATAGACAGGGCCCCGCGGCCGGCACTCTT (splice acceptor), not seen in GnomAD, O/E LOF score of 0.61, splice score of 0, only seen by 2 callers, no phenolyzer rank listed
• Het BCAS1 c.1183-3745dup (frameshift), 1 het in GnomAD, O/E LOF score of 0.54, no phenolyzer rank listed
• COL9A3 c.423+1del (frameshift), mixed zygosity between siblings, not seen in GnomAD, O/E LOF score of 0.54, 2 HPO terms (Joint hyperflexibility, Retinal detachment), high phenolyzer rank (28)
• Het CPNE4 c.736-7C>A (splice region), 10 hets in GnomAD, O/E LOF score of 0.23, splice score of 1.75, low phenolyzer rank (13551)
• Homo LIMD1 c.1409-5G>T (splice region), not seen in GnomAD, O/E LOF score of 0.22, splice score of -2.69
• Homo LIMD1 c.1409-4C>G (splice region), not seen in GnomAD, O/E LOF score of 0.22, splice score of 0.34
• Homo LIMD1 c.1409-3A>C (splice region), not seen in GnomAD, O/E LOF score of 0.22, splice score of -3.85
• Het ARHGEF38 c.1545+2T>A (splice donor), 6 hets in GnomAD, O/E LOF score of 0.61, splice score of 8.18
• Het HLA-DRB1 c.370+8T>C (splice region), not seen in GnomAD, O/E LOF score of 0.59, splice score of 0, only seen by 2 callers
• Het HLA-DRB1 c.370+7A>G (splice region), not seen in GnomAD, O/E LOF score of 0.59, splice score of 0, only seen by 2 callers
• Het HLA-DRB1 c.370+1G>A (splice donor), not seen in GnomAD, O/E LOF score of 0.59, splice score of 8.18, only seen by 2 callers
• Het PRSS3P1 n.342+4G>A (splice region), not seen in GnomAD, no O/E scores listed, splice score of -3.79, no phenolyzer rank listed
• Homo ADAM28 c.151-1120G>A (splice region), 10 hets in GnomAD, O/E LOF score of 0.89, splice score of -0.11, low phenolyzer rank (16677)

• Het FOCAD c.2961+11dup (splice region), 13 hets in GnomAD, O/E LOF score of 0.69, splice score of 0, no phenolyzer rank listed
• Het PTPRD c.4304+7_4304+8insACAGTTCAAGAATGGTAAGTT (splice region), not seen in GnomAD, O/E LOF score of 0.04, splice score of 0, low phenolyzer rank (11649)
• Het PTPRD c.4304+7_4304+8insAGTTACAGTTCAAGAATGGTAAGTT (splice region), not seen in GnomAD, O/E LOF score of 0.04, splice score of 0, low phenolyzer rank (11649)
• Het DAPK1 c.423+4C>T (splice region), 24 hets in GnomAD, O/E LOF score of 0.17, splice score of 0.96

Shared homozygous variants not yet analyzed: 14 variants analyzed:

• OVGP1 c.1553C>G p.Thr518Ser, 72 hets in GnomAD, mixed in-silico predictions (SIFT = 0.07, PolyPhen = 0.02, CADD = 9.46), low phenolyzer rank (12878)
• SARM1 c.449G>T p.Gly150Val, 2 hets in GnomAD, in-silico predicts benign (SIFT = 0.57, PolyPhen = 0, CADD = none)
• SARM1 c.451T>G p.Ser151Ala (missense), 1 het in GnomAD, in-silico predicts benign (SIFT = 0.6, PolyPhen = 0, CADD = none)
• KRTAP10-1 c.477T>A p.Asp159Glu, not seen in GnomAD, in-silico predicts benign (SIFT = 0.43, PolyPhen = 0.01, CADD = 0.03)
• KRTAP10-1 c.476A>C p.Asp159Ala, not seen in GnomAD, in-silico predicts benign (SIFT = 1, PolyPhen = 0, CADD = 0.001)
• KRTAP10-1 c.475G>T p.Asp159Tyr, not seen in GnomAD, mixed in-silico predictions (SIFT = 0.05, PolyPhen = 0.23, CADD = 0.08)
• TPRXL c.206C>A p.Pro69His, not seen in GnomAD, mixed in-silico predictions (SIFT = none, PolyPhen = 0.98, CADD = none), low phenolyzer rank (13668)
• HLA-B c.283G>A p.Ala95Thr, 1 het in GnomAD, mixed in-silico predictions (SIFT = 0.13, PolyPhen = 0.003, CADD = 11.07)
• HLA-B c.282G>C p.Gln94His, 22 hets in GnomAD, in-silico predicts benign (SIFT = 0.47, PolyPhen = 0.19, CADD = 9.39)
• JRK c.833C>T p.Ser278Leu, 14 hets in GnomAD, mixed in-silico predictions (SIFT = 0.03, PolyPhen = 0.21, CADD = 11.78), no phenolyzer rank listed
• ZC3H3 c.2402A>T p.Lys801Ile, 1 het in GnomAD, mixed in-silico predictions (SIFT = 0, PolyPhen = 0.43, CADD = 25.5), no phenolyzer rank listed
• GSDMD c.488C>T p.Arg164Cys, 7 hets in GnomAD, mixed in-silico scores (SIFT = 0, PolyPhen = 0, CADD = none)
• PCSK5 c.2008_2022del p.Lys670_Met674del (in-frame deletion), not seen in GnomAD
• MT-CYB c.685G>A p.Ala229Thr, Likely pathogenic in ClinVar, seen 13 times in C4R, not seen in GnomAD, mixed in-silico predictions (SIFT = 0.05, PolyPhen = 0.23, CADD = none), no phenolyzer rank listed

OMIM Genes: 10 variants analyzed:

• Het EPHA2 c.1340C>T p.Thr447Ile, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0.02, PolyPhen = 0.94, CADD = 28.8), gene causes AD early onset cataract

• COL4A1 c.1534C>T p.Pro512Ser, mixed zygosity between siblings, 107 hets and 1 homo in GnomAD, in-silico predicts benign (SIFT = none, PolyPhen = 0.19, CADD = 12.66), 1 HPO term (Retinal detachment), high phenolyzer rank (128)
• Het SIPA1L3 c.4043G>A p.Arg1348His, 70 hets in GnomAD, in-silico predicts benign (SIFT = 0.2, PolyPhen = 0, CADD = 12.41)
• Het TGFB1 c.1010G>C p.Ser337Thr, 147 hets and 2 homos in GnomAD, mixed in-silico predictions (SIFT = 0.09, PolyPhen = 0.02, CADD = 17.35), high phenolyzer rank (51), gene causes AD Camurati-Engelmann disease which causes deafness and optic nerve compression
• Het OTOF c.4642G>A p.Glu1548Lys, 102 hets in GnomAD, mixed in-silico predictions (SIFT = 0.01, PolyPhen = 0.45, CADD = 34), no phenolyzer rank listed, gene causes AR auditory neuropathy and AR deafness
• Het OTOF c.2243T>C p.Met748Thr, 24 hets in GnomAD, mixed in-silico predictions (SIFT = 0.03, PolyPhen = 0, CADD = 17.52), no phenolyzer rank listed, gene causes AR auditory neuropathy and AR deafness
• LAMB2 c.634T>C p.Ser212Pro, mixed zygosity between siblings, not seen in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.99, CADD = 28.3), 1 matching HPO term (Proteinuria), high phenolyzer rank (716), gene causes AR nephrotic syndrome with or without ocular abnormalities and AR Pierson syndrome
• Het RP1L1 c.1267C>T p.Arg423Trp, 34 hets and 1 homo in GnomAD, in-silico predicts benign (SIFT = 0.17, PolyPhen = 0.02, CADD = 11.5), low phenolyzer rank (13530)
• RECQL4 c.2086C>A p.Arg696Ser, mixed zygosity between siblings, Likely benign in ClinVar, 164 hets in GnomAD, in-silico predicts deleterious (SIFT = 0, PolyPhen = 0.99, CADD = 25.9), no phenolyzer rank listed, gene causes AR Baller-Gerold syndrome and AR RAPADILINO syndrome and AR Rothmund-Thomson syndrome
• Het EYA1 c.923G>A p.Arg308Gln, Likely benign in ClinVar, 51 hets in GnomAD, in-silico predicts deleterious (SIFT = 0.02, PolyPhen = 0.98, CADD = 33), 1 matching HPO term (Anteverted nares), gene causes AD anterior segment anomalies and branchiootorenal syndrome with or without cataract
