Contribution of Rare Copy Number Variants to the Development of Autism Spectrum Disorder in High-Risk Siblings

by

Lia D’Abate

A thesis submitted in conformity with the requirements for the degree of Master of Science

Department of Molecular Genetics University of Toronto

© Copyright by Lia D’Abate 2016

Abstract

As microarrays are commonly used to identify the causes of autism spectrum disorder (ASD), we are investigating whether genetic testing in young children can predict ASD status prior to the typical age of onset of symptoms. A total of 441 individuals (190 probands, 88 siblings with ASD, 163 unaffected siblings) from the Infant Siblings Research Consortium were genotyped on the high-resolution Affymetrix CytoScan HD platform. De novo and inherited presumed pathogenic copy number variants (CNVs) were identified in 23/199 families (11.4%). In 6 of these families, the index case and one or more siblings carried the presumed pathogenic CNV; in every such case the carrier individual either had ASD (4) or had a subclinical form of ASD (2). There were 6 additional families in which CNV data would have been informative for early ASD diagnosis or detection of developmental delay. Our results suggest CNVs may serve as biological markers for ASD.


Acknowledgements

The completion of this thesis would not have been possible without the unwavering support of my supervisor Dr. Stephen Scherer. One of the greatest privileges of working under the supervision of Dr. Scherer was to see not only his commitment to scientific excellence, but also his genuine desire to share that with the public and improve the quality of life of children with neurodevelopmental disorders. I now strive to bring that level of humanity and reverence for basic science to all of my projects. I am very grateful to have had access to first-class technologies that make answering these scientific questions possible. I would also like to thank my committee members Dr. Lucy Osborne and Dr. Freda Miller for always giving me excellent guidance and supporting me on my path to becoming a better graduate student.

My ability to make use of a multitude of resources was contingent not only on the mentorship and support of my supervisor, but also on that of all members of The Centre for Applied Genomics (TCAG). In particular, I would like to thank my lab mates Dr. Ryan Yuen, Dr. Susan Walker, Dr. Mohammed Uddin, Dr. Mehdi Zarrei, Dr. Kristiina Tammimies and Dr. Eric Denault for their willingness to share their expertise and sound advice as I worked towards completing this project. Additionally, I would especially like to thank Dr. Richard Wintle, Dr. Daniele Merico, Dr. Liz Li, Dr. Chao Lu, Dr. Giovanna Pellecchia, Mrs. Bhooma Thiruvahindrapuram and Mr. John Wei for guiding me so carefully along the very technical aspects of my analysis. Lastly, my time in the lab would not have been complete without my fellow graduate students Ada Chan, Matthew Gazzellone and Dr. Anath Lionel, who are more like family than lab mates.

A student’s pursuit of graduate studies not only requires a fertile academic environment, but a tremendously supportive home one as well. I would like to deeply thank my parents for their unwavering love and support and all of the wonderful friends I have made both in and out of school. You anchored me during my most difficult days, and I could only hope that I have given you as much friendship and compassion as you have given me.


Table of Contents

Acknowledgements

Table of Contents

List of Tables

List of Figures

List of Appendices

Introduction

1.1 Clinical Presentation of ASD

1.2 Genetic Etiology of ASD

1.2.1 Evidence from Family-Based Studies

1.2.2 The Broader Autism Phenotype

1.3 Genetic Models of ASD

1.3.1 Common Disease-Common Variant Hypothesis

1.3.2 Common Disease-Rare Variant Hypothesis

1.4 Copy Number Variation in ASD

1.5 Developmental Neurobiology of ASD

1.6 Project Rationale

1.6.1 Hypothesis

1.6.2 Objectives

Methods

2.1 Sample Ascertainment and Diagnostic Criteria

2.1.1 Autism Diagnostic Observation Schedule, Second Edition

2.1.2 Mullen Scales of Early Learning

2.1.3 Vineland Adaptive Behavioral Scales, Second Edition

2.2 DNA Collection and Genotyping

2.3 CNV Calling

2.4 Variant Prioritization and Molecular Validation

2.5 Critical Exon Analysis

Results

3.1 De Novo CNV Analysis

3.1.1 De Novo CNVs in Individuals with ASD

3.1.2 De Novo CNVs in Individuals without ASD

3.1.3 Critical Exon Analysis of De Novo CNVs

3.1.4 Distribution of De Novo and Inherited Presumed Pathogenic CNVs

3.2 Predictive Value of Chromosomal Microarray

Discussion

4.1 Current Study Limitations

Conclusion and Significance

Future Directions

6.1 WGS of Infant Siblings Cohort

6.2 Investigating the Role of PTCHD1-AS in ASD

Appendices

7.1 Rare, Exonic CNVs

References


List of Tables

Table 1: De novo CNVs in Probands (n=189) and Infant Siblings (n=119)

Table 2: Identification of Exonic Presumed Pathogenic CNVs in Infant Siblings of ASD Probands

Table 3: Predictive Value Statistic


List of Figures

Figure 1: Prevalence of ASD

Figure 2: Multifactorial Threshold Model

Figure 3: Project Flowchart

Figure 4: CNV Workflow

Figure 5: Ancestry Stratification of ASD Cases from Infant Siblings Cohort

Figure 6: Stratification of CNVs Called per Algorithm

Figure 7: Determining Optimal CNV Size and Number of Overlapping Microarray Probes

Figure 8: Selection of Stringent CNVs ≥15 Kb, Overlapping ≥10 Consecutive Probes

Figure 9: Detection of Deletion on the X using SYBR® Green Method for qPCR

Figure 10: Spatiotemporal Distribution of Critical Exons

Figure 11: Summary of Presumed Pathogenic CNVs

Figure 12: Deletion of MBD5 in Quad

Figure 13: Presumed Pathogenic CNVs Shared Between Proband and Affected or Unaffected Infant Sibling

Figure 14: De Novo and Presumed Pathogenic CNVs Found Uniquely in Siblings with ASD or Other Developmental Concerns

Figure 15: Previously Reported and Novel Deletions Overlapping PTCHD1-AS


List of Appendices

7.1 Rare, Exonic CNVs


Introduction

1.1 Clinical Presentation of ASD

Autism Spectrum Disorder (ASD) is a prevalent neurodevelopmental disorder presenting with core deficits in social communication and restrictive, repetitive patterns of behavior (Fig. 1). These deficits encompass rigid adherence to a routine, fixated interests, stereotyped behaviors and a lack of social and emotional reciprocity. Autism as a unique disorder was first described by Dr. Leo Kanner, who reported a group of 11 children (8 boys, 3 girls) who demonstrated similar patterns of social maladaptation (Kanner 1943). One year later, Hans Asperger described a more moderate form of autism in higher-functioning individuals (Asperger 1944). Over time, clinical definitions of ASD have changed as we gained a better understanding of its clinical presentation. In 1994, the eponymous Asperger Syndrome was placed under the umbrella of Pervasive Developmental Disorders (PDD) in the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV), along with Autistic Disorder, PDD-Not Otherwise Specified (PDD-NOS), Rett Syndrome and Childhood Disintegrative Disorder (American Psychiatric Association 2000). The diagnostic criteria were further amended in the DSM-5 to remove the stratification of PDD and cluster all phenotypes under ASD (American Psychiatric Association 2013).

Clinical manifestations of ASD are highly heterogeneous and frequently comorbid with language disorders, cognitive delays, psychiatric conditions and a variety of medical conditions. Identification of these endophenotypes can direct therapeutic intervention and improve an individual’s outcome. A systematic review of 31 studies (2,121 cases) reported that 39.6% of youth with ASD present with a clinical form of anxiety, including specific phobias (29.8%) and obsessive-compulsive disorder (OCD; 17.4%) (van Steensel, Bogels, and Perrin 2011). Epilepsy occurs in 5-46% of individuals with ASD, with the presence and severity of intellectual disability (ID) being strongly associated with seizure activity (Hughes and Melyn 2005; Spence and Schneider 2009). Estimates place the co-occurrence of ID and ASD at 40-70% (Kaufman 2011).

[Figure 1: bar chart of ASD incidence per 1000 children (y-axis, 0 to 16) by year (x-axis, 2000 to 2010 in two-year steps). Bars are labelled 1 in 150, 1 in 125, 1 in 110, 1 in 88 and 1 in 68.]

Figure 1: Prevalence of ASD

As evidenced by data compiled from 6-14 sites across the United States, the Centers for Disease Control and Prevention (CDC) has demonstrated a steady rise in rates of ASD over the last 15 years (http://www.cdc.gov/ncbddd/autism/data.html). One of the limitations of these reported rates is that they are ascertained from medical and school records of 8-year-old children at participating sites. As a result, only children who are accessing services in their communities are reported, which can lead to underestimates. A new parent survey conducted by the National Center for Health Statistics recently reported that 1 in 45 children in the United States has ASD.
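To relate the “1 in N” ratios quoted for Figure 1 to its y-axis, each ratio can be converted to an incidence per 1000 children. The short Python sketch below is purely illustrative; the function name `per_thousand` is an assumption introduced here, and the inputs are only the ratios quoted in the figure and its caption.

```python
def per_thousand(one_in_n: int) -> float:
    """Convert a '1 in N' prevalence estimate to cases per 1000 children."""
    return 1000 / one_in_n


# Ratios quoted in Figure 1 (CDC) and in the caption (parent survey: 1 in 45).
quoted_ratios = [150, 125, 110, 88, 68, 45]

for n in quoted_ratios:
    print(f"1 in {n} = {per_thousand(n):.1f} per 1000 children")
```

On this scale, the parent-survey estimate of 1 in 45 lies well above the top of the figure's 0 to 16 axis, which is consistent with the caption's point that record-based ascertainment may underestimate prevalence.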


One of the epidemiological features of ASD is gender bias in clinical presentation, classically reported as a 4-5:1 male to female ratio (Fombonne 2009; Lai et al. 2015). More recent population-based studies have placed this estimate as low as 2:1, with females displaying less impairment (Idring et al. 2012; Volkmar, Szatmari, and Sparrow 1993; Jensen, Steinhausen, and Lauritsen 2014). The discrepancy in sex ratios is plausibly due to the population-based approach casting a wider net and capturing more higher-functioning females with ASD than clinical and school-based ascertainment methods.

This epidemiological phenomenon is also compounded by the inherent bias in the tools used to diagnose ASD. All diagnostic measures employed to diagnose an individual with autism are based on specific behavioral criteria that relate to the two broad domains that constitute an autistic phenotype. These tests, however, do not account for the difference in presentation of these behavioral characteristics between males and females. On average, females have been shown to develop faster linguistically, exhibit stronger tendencies to interact with others and camouflage difficulties in social interaction using compensatory behaviors. This suite of developmental differences can influence the scoring of social communication domains on diagnostic evaluations. As such, females with ASD included in most studies often display more serious cognitive deficits than males (Lai et al. 2015).

That said, a slight gender bias remains, which suggests the existence of a “female protective effect”. This effect is thought to reflect etiological differences between the sexes, with females requiring an increased burden of genetic and environmental insults to yield an autistic phenotype (Jacquemont et al. 2014). Gaining a better understanding of the genetic and neurobiological underpinnings of ASD can not only inform us of mechanisms and present novel therapeutic targets, but may also alter the way we clinically define ASD.

1.2 Genetic Etiology of ASD

The data arising from family-based studies have shown a clustering of ASD symptoms among first-degree relatives that exceeds the rate of ASD traits observed in the general population, indicating that genetic events likely drive the phenotype. Furthermore, traits and genetic events associated with ASD do not segregate following a typical Mendelian pattern, which suggests that many liabilities act in concert to produce a phenotype. ASD traits are continuously distributed, with individuals with ASD lying at the extreme ends of this continuum. This manifestation of symptoms is consistent with a multifactorial model in which genetic and environmental factors aggregate to yield a phenotype. Several models have been proposed under this paradigm to explain what factors contribute to ASD.

1.2.1 Evidence from Family-Based Studies

The genetic basis for ASD was first suggested in 1977 by the first published twin study, in which Folstein and Rutter found a 36% concordance for ASD among monozygotic (MZ) twin pairs and 0% among dizygotic (DZ) twin pairs (Folstein and Rutter 1977). These results were replicated in a follow-up study in 1995, with a 60% concordance found in MZ twins (Bailey et al. 1995). Independent studies published within that time on different populations demonstrated concordance rates of up to 95.7% in MZ twins and 23.5% in DZ twins (Ritvo et al. 1985). These earlier studies were small and assessed narrowly defined ASD. When subclinical ASD traits were considered, concordance in MZ twin pairs was 75-92% (Kates et al. 2004; Le Couteur et al. 1996). A study of 3,400 twin pairs from the Twin Early Development Study placed the heritability of strict ASD at 0.64-0.92 and that of ASD-like traits at 0.78-0.81 (Ronald et al. 2006). The largest published twin study in ASD, profiling 6,000 twin pairs, reported a concordance rate of 56-95% (Colvert et al. 2015). The high heritability of ASD demonstrated by twin studies has driven the hypothesis that its etiology is rooted in genetics. However, given that these estimates range from 40% to 95% in MZ twins, short of complete concordance, we can surmise that the environment influences the etiology of ASD in tandem with genetic factors.

Like twin studies, studies of the recurrence risk of ASD among siblings have shed light on the contribution of genetics to ASD. Early studies placed these estimates at 3-10%, similar to the concordance rates observed in DZ twins (Chakrabarti and Fombonne 2001; Icasiano et al. 2004; Lauritsen, Pedersen, and Mortensen 2005). However, many of these studies did not account for the stoppage effect, which occurs when parents opt not to have more children after an initial ASD diagnosis (Jones and Szatmari 1988). The recurrence risks were therefore underestimated. The most recent study, evaluating 1.5 million Danish sibling pairs born between 1980 and 1996, observed a recurrence risk of 7% for full siblings, while prior studies have placed recurrence risk as high as 18.7% (Ozonoff et al. 2011). The discrepancies between studies are most likely due to differences in study design. In addition to factoring in the stoppage effect, studies with a prospective design, as used by Ozonoff et al. (2011), can give a diagnosis in real time, eliminating bias in parental reporting of phenotype and symptom severity. Risk factors for a second-born child developing ASD include the presence of >1 older sibling with ASD and infant gender. Evidence supporting increased ASD risk for siblings of female probands is inconclusive (Goin-Kochel et al. 2007).

1.2.2 The Broader Autism Phenotype

The gradient expression of ASD traits, known as subclinical traits, in non-autistic first-degree relatives of individuals with ASD has been well documented, with the more subtle manifestation of these traits being officially termed a “Broader Autism Phenotype” (BAP) (Bolton et al. 1994; Piven et al. 1990; Szatmari et al. 2000). Specifically, parents of individuals with ASD often present with BAP (Sasson et al. 2013). Although the existence of subclinical traits was controversial for years, more than five independent tools have been used to identify subclinical ASD-like traits. The Broad Autism Phenotype Questionnaire (BAPQ) is the tool most frequently used to evaluate BAP in non-autistic relatives of individuals with ASD. It incorporates elements from the Modified Personality Assessment Schedule (MPAS) and the Pragmatic Rating Scale (PRS), whose subscales assess the triad of symptoms in the DSM-IV: social deficits, stereotyped behaviors and language deficits. The sensitivity and specificity of the BAPQ were determined to be >70% (Hurley et al. 2007).

One distinct contribution of twin and family studies to making genotype-phenotype correlations has been the identification of subclinical ASD traits in non-autistic relatives of index cases. The first twin study observed an 82% concordance rate of cognitive deficits in MZ twins, and a 10% concordance in DZ twins (Folstein and Rutter 1977). Subsequent case-control studies have demonstrated that certain social behaviors and language deficits (e.g. aloofness, rigidity, difficulties with pragmatic language) and displays of anxious tendencies are found in first-degree relatives of children diagnosed with ASD, which in theory may index a genetic predisposition to ASD in a family unit (Murphy et al. 2000; Pickles et al. 2000; Piven et al. 1997). A higher frequency of subclinical traits is found in parents in multiple-incidence autism families (MIAF) than in those in single-incidence autism families (SIAF) (Losh et al. 2008). There is a higher likelihood of both parents in MIAF displaying BAP, denoting bilineal transmission of genetic liabilities consistent with a multifactorial liability model for ASD.


1.3 Genetic Models of ASD

The transmission of these genetic liabilities does not follow a strict pattern of Mendelian inheritance. Evidence such as the presumed bilineal transmission of ASD in parents lends credence to the polygenic inheritance of factors contributing to ASD. A widely accepted paradigm used to explain ASD etiology is the threshold model proposed by Falconer (Falconer 1981), whereby a constellation of genetic and environmental liabilities is distributed along a continuum and a clinical phenotype is expressed only once a threshold is surpassed (Fig. 2). As the gender bias in ASD suggests, this threshold can differ between sexes, with females exhibiting a higher threshold of liability before displaying clinical symptoms. Called the Carter Effect (Carter and Evans 1969), this is supported by evidence that a female expressing a phenotype will harbor a greater genetic load and that her offspring or younger siblings have a greater risk of developing ASD. Extensive genetic analyses have demonstrated that a plethora of genetic events, such as the presence of a rare, private mutation or modifier loci, can act in concert to produce an autistic phenotype.
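The logic of the threshold model can be illustrated with a minimal Python sketch. It assumes, for illustration only, that liability follows a standard normal distribution in the general population, that the threshold is set so the fraction above it equals a population prevalence of 1 in 68, and that relatives and females differ by hypothetical shifts (`sibling_shift`, `female_offset`) chosen arbitrarily here; none of these values are estimates from this thesis.

```python
from statistics import NormalDist

STANDARD = NormalDist(mu=0.0, sigma=1.0)


def liability_threshold(prevalence: float) -> float:
    """Threshold (in SD units) above which a fraction `prevalence` of a
    standard-normal liability distribution lies."""
    return STANDARD.inv_cdf(1.0 - prevalence)


def risk_above(threshold: float, mean_shift: float = 0.0) -> float:
    """Probability that liability exceeds `threshold` in a group whose mean
    liability is shifted by `mean_shift` SD relative to the population."""
    return 1.0 - NormalDist(mu=mean_shift, sigma=1.0).cdf(threshold)


population_prevalence = 1 / 68   # prevalence figure used for illustration
threshold = liability_threshold(population_prevalence)

sibling_shift = 1.0    # hypothetical rightward shift for first-degree relatives
female_offset = 0.3    # hypothetical extra threshold for females

general_risk = risk_above(threshold)
sibling_risk = risk_above(threshold, mean_shift=sibling_shift)
female_sibling_risk = risk_above(threshold + female_offset, mean_shift=sibling_shift)

print(f"threshold: {threshold:.2f} SD")
print(f"general population risk: {general_risk:.3f}")
print(f"sibling risk (shifted mean): {sibling_risk:.3f}")
print(f"female sibling risk (higher threshold): {female_sibling_risk:.3f}")
```

Under these assumptions, shifting the mean liability to the right raises the fraction of individuals above the threshold, while raising the threshold lowers it, which is the qualitative behavior the model attributes to first-degree relatives and to females, respectively.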

1.3.1 Common Disease-Common Variant Hypothesis

The Common Disease-Common Variant (CDCV) hypothesis was a paradigm put forth to explain genetic susceptibility to common disease. It posits that common diseases are attributable to one or a few susceptibility alleles that are commonly found in the population (minor allele frequency of >1%) (Lander 1996). It is based upon the premise that common alleles conferring risk to common disease in a small founder population persist in today’s population because they are under less selective pressure and are difficult to displace by new alleles during a very rapid population expansion. The population remains in disequilibrium, such that the number and frequency of alleles conferring susceptibility to common disease is lower and the genetic risk lies in one common variant or a small number of common variants (Reich and Lander 2001). These variants can be readily detected using linkage and genome-wide association studies (GWAS). These methods rely on the identification of select single-nucleotide polymorphisms (SNPs) that index specific haplotypes. One can then narrow in on the suite of variants most highly associated with the disease (Gibson 2011).


[Figure 2: three panels (A, B, C), each plotting a distribution of liabilities (x-axis: Liabilities) with the population mean and the threshold marked; the distribution for first-degree relatives is also shown.]

Figure 2: Multifactorial Threshold Model for ASD

(A) In the multifactorial threshold model, different liabilities that predispose to ASD are continuously distributed in the general population. In ASD, these genetic liabilities accrue to a critical point at which, confluent with certain environmental factors, an individual will display ASD symptoms. (B) First-degree relatives of individuals with ASD possess a spectrum of genetic liabilities that is shifted to the right of the general population. (C) The “female protective effect” posits that females require a higher burden of genetic and environmental insults than males to display an ASD phenotype.


Four main studies have been conducted to decipher the role of common variants in ASD etiology. Polymorphisms located at 5p14.1 between CDH9 and CDH10 (Wang et al. 2009), in MACROD2 (Anney et al. 2010), between SEMA5A and TAS2R1 (Weiss et al. 2009) and within CNTNAP2 (Arking et al. 2008) showed the strongest association with ASD. Many of these signals have failed to replicate and explain only a small fraction of heritability (15-40%), partially because such studies cumulatively include fewer than 5,000 cases. Common variants are recognized as exerting a weak individual risk, but in aggregate they can account for a significant portion of the ~60% heritability. Methods for estimating the contribution of these variants have come under tremendous scrutiny, as they seem to inflate the influence of common variants on the ASD phenotype (Gaugler et al. 2014).

1.3.2 Common Disease-Rare Variant Hypothesis

In contrast to the CDCV hypothesis, the Common Disease-Rare Variant (CDRV) hypothesis posits a different evolutionary paradigm whereby introduction of mutations, genetic drift and purifying selection have affected the frequency and heterogeneity of susceptibility alleles. Their mutation rates are high and the alleles are moderately deleterious, putting them under more selective pressure than those with lower mutation rates that have a neutral effect (CDCV). Genetic liability thus lies in an array of low-frequency damaging rare variants with a high mutation rate to compensate for their negative selection (Pritchard and Cox 2002).

Typically, rare variants are identified by frequency in a control population, with different studies imposing different frequencies. Minor allele frequencies ranging from 1-5% (Frazer et al. 2009; Gorlov et al. 2011) have been used to qualify rare variants, with an even lower frequency of 0.1% being used to identify variants that are most pathogenic (Bodmer and Bonilla 2008).

Some of the first pieces of evidence gathered in support of the CDRV hypothesis were the overlap of the ASD phenotype with that of various monogenic disorders and the presence of gross chromosomal abnormalities. Collectively, these syndromic forms of ASD and chromosomal aberrations account for ~15% of ASD cases (10% and 5%, respectively), with no single variant accounting for more than 2% of ASD cases (Cook and Scherer 2008). Common examples include Fragile X Syndrome and Rett Syndrome. Fragile X Syndrome is an X-linked syndrome caused by a CGG repeat expansion in the FMR1 gene at Xq27.3 (Santoro, Bray, and Warren 2012). Approximately 30% of individuals with the syndrome have an ASD phenotype (Hagerman, Hoem, and Hagerman 2010).


Rett Syndrome is a developmental disorder characterized by the developmental regression of an individual at approximately 18 months of age. Typically found in females, ~80% of cases are caused by mutations in the MECP2 gene (Amir et al. 1999; Zappella et al. 2003). Another example of a monogenic disorder associated with ASD is tuberous sclerosis, whereby individuals develop benign tumors in multiple organ systems (Smalley 1998). Up to 60% of individuals with this syndrome meet the criteria for ASD (Seri et al. 1999). Cytogenetic analysis was undertaken to identify other risk loci, which included the 22q11.2 region.

Smaller classes of variation have been investigated using multiple technologies, which have helped identify rare variants that contribute to ASD. These efforts include querying the genome for copy number variations (CNVs) using chromosomal microarrays, and for single-nucleotide variants (SNVs) and small insertions/deletions (indels) using whole-exome sequencing (WES) and whole-genome sequencing (WGS). Initial studies adopted a trio design, which identified a significant enrichment of damaging de novo variants, referring to rare, private mutations in index cases (De Rubeis et al. 2014; Iossifov et al. 2012; Neale et al. 2012; O'Roak et al. 2012; Sanders et al. 2012). Next-generation sequencing technology has increased the resolution of mutations that we can detect, and has since helped identify over 100 ASD susceptibility genes. An initial study using whole-genome sequencing on 32 trios identified a de novo rate of 19% in index cases and a genetic cause of ASD in 31% of cases (Jiang et al. 2013).

1.4 Copy Number Variation in ASD

While SNVs are the most common type of variation in the human genome and high-resolution sequencing efforts are making great strides in assessing their contribution to ASD etiology, some of the most momentous findings in psychiatric genetics have concerned structural variation. Copy number variants are an intermediate form of structural variation in the human genome; they are significantly outnumbered by SNPs but impact a similar number of base pairs (Iafrate et al. 2004; Redon et al. 2006; Sebat et al. 2004). CNVs comprise deletions, duplications and complex rearrangements that range in size from 1 Kb to 1 Mb (Conrad et al. 2010). There are four identifiable mechanisms by which CNVs are formed: 1) non-allelic homologous recombination (NAHR); 2) non-homologous end joining (NHEJ); 3) fork stalling/template switching (FoSTeS); and 4) L1-mediated retrotransposition (Malhotra and Sebat 2012). Recurrent CNVs are typically formed via NAHR, whereby structural change is mediated by regions of >90% sequence homology, termed segmental duplications, that flank the region subject to rearrangement (Hastings et al. 2009). One example of a recurrent CNV relevant to ASD formed by this mechanism is the microdeletion/microduplication of the 16p11.2 region, which causes the syndrome of the same name and is seen in roughly 1% of ASD cases (Walsh and Bracken 2011). Non-recurrent, complex rearrangements are typically formed via collaboration between the replication- and repair-based mechanisms FoSTeS (Lee, Carvalho, and Lupski 2007) and NHEJ (Shaw and Lupski 2005).

Chromosomal microarrays remain the technology of choice for the detection of structural variation in the human genome (Miller et al. 2010). They function by detecting the magnitude of the fluorescent signal produced when sample DNA binds to probes that match specific regions of the genome. Probes are immobilized oligonucleotides that target either a non-polymorphic region (copy number probes) or a specific allele at a single-nucleotide polymorphism (SNP probes). Newer platforms use a combination of both probe types to accurately assess copy number states. Over the last 5 years, the resolution of these platforms has increased markedly, from early versions containing a few hundred thousand probes to modern versions containing over 2 million.
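The principle of inferring copy number from signal intensity can be sketched in a few lines of Python. The sketch assumes the common convention of comparing sample and reference intensities as a log2 ratio, so that a diploid region gives a value near 0, a heterozygous deletion a value near log2(1/2) = -1 and a single-copy duplication a value near log2(3/2) ≈ 0.58. The function names and the cutoffs of ±0.3 are hypothetical choices for illustration; they are not the parameters of any calling algorithm used in this thesis.

```python
import math
from statistics import mean


def log2_ratio(sample_intensity: float, reference_intensity: float) -> float:
    """Log2 ratio of a probe's sample signal to its reference signal."""
    return math.log2(sample_intensity / reference_intensity)


def call_copy_state(probe_log2_ratios, loss_cutoff=-0.3, gain_cutoff=0.3) -> str:
    """Classify a run of consecutive probes by their mean log2 ratio.
    The cutoffs are illustrative assumptions only."""
    average = mean(probe_log2_ratios)
    if average <= loss_cutoff:
        return "loss"
    if average >= gain_cutoff:
        return "gain"
    return "normal"


# Idealized, noise-free intensities proportional to copy number (reference = 2 copies).
deletion_probes = [log2_ratio(1.0, 2.0)] * 10      # one copy   -> -1.0 per probe
duplication_probes = [log2_ratio(3.0, 2.0)] * 10   # three copies -> ~0.58 per probe
diploid_probes = [log2_ratio(2.0, 2.0)] * 10       # two copies ->  0.0 per probe

for label, probes in [("deletion", deletion_probes),
                      ("duplication", duplication_probes),
                      ("diploid", diploid_probes)]:
    print(label, "->", call_copy_state(probes))
```

Averaging over a run of consecutive probes, rather than calling each probe separately, reflects why higher probe density improves the reliability of calls for smaller CNVs.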

The use of microarrays provided some of the first rare and highly penetrant genetic hits in ASD. Family-based studies, mainly of trios, have found a substantial contribution of de novo variants to ASD etiology. Although de novo CNVs occur at a generally low rate in the population (0.07-0.12 per generation), CNV studies have consistently found a de novo rate of 5-10% in ASD cases compared with ~1% in controls (Autism Genome Project et al. 2007; Marshall et al. 2008; Sebat et al. 2007). These data can plausibly explain the incomplete Mendelian pattern of inheritance of ASD, as well as the correlation between ASD incidence and increased paternal age. Studies on simplex (SPX) families have observed a discrepancy in de novo rates between affected and unaffected siblings (Levy et al. 2011).

Although the highest burden of risk resides in de novo CNVs, many highly penetrant inherited variants have been identified. These include deletions and duplications overlapping members of the Neurexin (Gauthier et al. 2011; Schaaf et al. 2012; Szatmari et al. 2000; Vaags et al. 2012) and SHANK (Berkel et al. 2010; Durand et al. 2007; Sato et al. 2012) gene families, DDX53-PTCHD1 (Carter and Evans 1969; Noor et al. 2010), the 16p11.2 region (Kumar et al. 2008; Weiss et al. 2008), CNTN4 and NLGN4X (Jamain et al. 2003). One of the challenges of CNV analysis is that CNVs are often large and contain many genes, making it difficult to pinpoint which genes are responsible for conferring a phenotype. Extensive case-gathering has helped to identify critical regions and genes, such as CHRNA7 in the 15q11.2-13.3 microduplication/microdeletion.

This region, like many others, is not fully penetrant, and variable expressivity of an ASD phenotype is seen in the presence of a specific CNV. Many CNVs are found across several neuropsychiatric disorders. This overlap may, in part, be due to common neurobiological mechanisms involved in these disorders. For example, microduplications in 1q21.1 and 22q11.2 are noted in both ASD and schizophrenia. One hypothesis posits that CNVs may give rise to a more general phenotype such as ID, which in turn is comorbid with many neuropsychiatric disorders, rather than specific CNVs being exclusively responsible for the cognitive and behavioral profile associated with ASD. Currently, >100 ASD susceptibility genes are also associated with ID (Betancur 2011), but many are not. This may be because an autistic phenotype is mediated by an additive effect of rare and common variants. CNVs can also act via a dosage effect, one of the best examples being the gene dosage effect observed at the 16p11.2 locus. Deletions are associated with a more severe phenotype in mice and are also more penetrant than reciprocal duplications (Horev et al. 2011). Many ASD risk genes also act through haploinsufficiency, whereby impairment of one allele can prompt an ASD phenotype (Betancur and Buxbaum 2013; Bozdagi et al. 2010; Damaj et al. 2015).

1.5 The Developmental Neurobiology of ASD

Genetic findings have highlighted ASD as resulting in part from oligogenic factors, with no known ASD gene accounting for more than 1% of cases. Despite the heterogeneity of genetic factors associated with ASD, they have been shown to converge on neurodevelopmental pathways, which sheds light on the pathophysiology of ASD. Analysis of genes impacted by CNVs in a study assessing 2,446 individuals with ASD demonstrated enrichment of pathways involved in chromatin remodeling, synapse formation and neuronal signaling (Pinto et al. 2014). Additionally, it was shown that these genes impact the FMRP pathways. These findings were replicated by studies looking at small-scale variants using WES, which showed that de novo mutations were found in genes that target the postsynaptic FMRP pathway (De Rubeis et al. 2014; Iossifov et al. 2012). Some of these genes are from the NRXN, SHANK and NLGN families.


Pathway enrichment has thus alluded to ASD being a disorder rooted in aberrant neuronal connectivity. Functional genomic analyses have allowed us to gain a better understanding of the topology of these dysfunctional mechanisms. ASD susceptibility genes have been shown to aggregate in cortical layers at the surface of the brain and affect neuronal subtypes, specifically at the prenatal time point (DiCicco-Bloom et al. 2006).

1.6 Project Rationale and Objectives

Extensive studies on the phenotypic range of ASD have consistently shown the familial aggregation of these traits in first-degree relatives. The recurrence estimates for ASD, now reliably close to 20%, underscore the risk that siblings of individuals with ASD face of developing the disorder. Given that early intervention is the best recognized way to improve a child’s clinical outcome, it is important for both parents and clinicians to be aware of any aberrations in a child’s developmental course in infancy. As such, biological indicators for ASD have been posited to aid in this endeavor. The role of CNVs in ASD etiology has been well established, with chromosomal microarray being the first-tier diagnostic test to determine the genetic causes of ASD in the clinical population. Using an extensively phenotyped cohort with longitudinal data (12, 24 and 36 months), we can reliably capture the correlation between genotype and phenotype of the infant siblings of children with ASD.

1.6.1 Hypothesis

It is hypothesized that the presence of a presumed pathogenic CNV in an index case, when shared with an infant sibling, will result in that sibling either expressing ASD or subthreshold traits. While the preponderance of clinically-relevant CNVs will be observed in probands and affected siblings, given the incomplete penetrance of many ASD susceptibility loci we predict that a smaller number of siblings without ASD will possess them as well. Infant siblings that did not meet criteria for an ASD diagnosis, herein referred to as unaffected siblings, who harbor presumed pathogenic variants will likely display a subclinical phenotype. Given the presence of a “female protective effect”, it is hypothesized that the preponderance of unaffected siblings with clinically relevant CNVs will be female and likely possess subclinical traits.

1.6.2 Objectives

1. Identify de novo CNVs in entire cohort

2. Classify all presumed pathogenic CNVs in index cases and infant siblings

3. Access longitudinal phenotype information to assess the developmental course of unaffected siblings harboring these variants

4. Quantify the predictive nature of chromosomal microarray in ASD diagnosis

[Figure 3 flowchart. Box labels: Index Case; Infant Siblings; Longitudinal Phenotype Assessment; Phenotype of Index Case; Phenotype/Affection Status of Infant Siblings; Microarray; Presumed Pathogenic CNV (Found Uniquely in Index Case; Found Uniquely in a Sibling; Shared between Index and Sibling); Subclinical Phenotype in Unaffected Siblings?; Can Genotype Predict ASD or Subthreshold ASD Symptoms in Infant Siblings?]

Figure 3: Project Flowchart

Methods

2.1 Sample Ascertainment and Diagnostic Criteria

All families enrolled in the study were recruited from 5 Canadian and 6 U.S. sites registered in the Baby Siblings Research Consortium (BSRC), an initiative of Autism Speaks (University of Toronto, Bloorview Research Institute, Kennedy Krieger Institute, Dalhousie University, McMaster University, University of Alberta, University of Miami, Vanderbilt University, University of California-Davis, University of California-San Diego, and University of Washington). Families were recruited through web-based media, organizations serving individuals with ASD and their families, word of mouth and referrals from medical professionals to the academic institution. A total of 769 individuals were eligible for CNV analysis (Fig. 4): 192 probands (5.0 males : 1 female), 163 unaffected siblings (1.3 males : 1 female), 89 affected siblings (3.0 males : 1 female), 174 mothers and 151 fathers. The institutional review board of each institution approved the data collection and analysis.

All families had one child, typically the first-born, diagnosed with ASD using DSM-IV criteria. Clinical status was confirmed using the Autism Diagnostic Observation Schedule (ADOS; First or Second Edition) or a combination of the Autism Diagnostic Interview (ADI) and Social Responsiveness Scale (SRS). Individuals whose ASD was attributed to a previously diagnosed genetic or neurological disorder (e.g. Fragile X syndrome, Rett syndrome) were excluded from the study. Later-born infant siblings began clinical assessments at a mean age of eight months and returned for testing again at 24 months, with a clinical outcome delivered at a mean of 37 months. In addition to the ADOS, probands and infant siblings were administered the Mullen Scales of Early Learning and Vineland Adaptive Behavioral Scales. Two hundred twenty-five parents were administered the Broader Autism Phenotype Questionnaire (BAPQ) as a self-report.

Figure 4: CNV Workflow

2.1.1 Autism Diagnostic Observation Schedule, Second Edition

The Autism Diagnostic Observation Schedule, Second Edition (ADOS-2) replaced the ADOS as the gold standard in the diagnosis of ASD (Lord et al., 2012; Lord, Rutter et al., 2012). Consisting of a semi-structured interview, it assesses different facets of social interaction and communication by setting up scenarios likely to elicit a specific behavior. Five modules are used in accordance with a person’s age and language skills: the Toddler Module, typically used on children aged 12-30 months lacking phrase speech; Module 1, used on young children >31 months who do not use phrase speech; Module 2, used for individuals using phrase speech but who do not have verbal fluency; and Modules 3 & 4, for verbally fluent children and adults, with the latter module focusing on daily living activities and conversational interview techniques rather than play. The infant siblings in this study were typically administered the Toddler Module at their first and second assessments (~12 and 24 months, respectively). Individuals assessed prior to the availability of the ADOS-2 were assessed starting with Module 1.

2.1.2 Mullen Scales of Early Learning

The Mullen Scales of Early Learning scores children 0-60 months in five domains: visual reception, gross motor skills, fine motor skills, receptive language and expressive language. The administrator of the test will guide the child through different tasks assessing each domain and score them accordingly. An Early Learning Composite score is calculated and underscores overall cognitive functioning (Mullen 1995).

2.1.3 Vineland Adaptive Behavioral Scales, Second Edition

The Vineland Adaptive Behavioral Scales, second edition is used to assess 5 domains of adaptive functioning (communication, daily living skills, socialization, motor skills, and maladaptive behavior) in individuals from birth to age 90 (Sparrow 2005). There are four surveys that can be administered. The first two are completed by the parent/caregiver and come in a rating scale or interview format; the third is an expanded interview used if a child is eligible for a modified curriculum; the fourth is an assessment performed by the teacher. The first four domains are combined to form an adaptive behavior composite score for children aged 0-6 years and 11 months. The first three domains are used to create the composite score for individuals 7 years and over (Community-University Partnership for the Study of Children, Youth and Families, 2011).

2.2 DNA Collection and Genotyping

Blood samples were obtained from 769 probands, infant siblings and parents at each respective site. Samples from U.S. sites (n=436) were subsequently sent to the DNA and Cell Repository at Rutgers University and then distributed to The Centre for Applied Genomics (TCAG) housed in the Peter Gilgan Centre for Research and Learning. Genomic DNA was extracted from whole blood and transformed lymphoblastoid cell lines for 631 and 138 individuals, respectively. Quality control thresholds were imposed, as specified by the manufacturer: a Waviness Standard Deviation (Waviness SD) of ≤0.12; a Median Absolute Pairwise Difference (MAPD) of ≤0.25; and a SNP Quality Control (SNP QC) of ≥15.0. The PLINK software (Purcell et al. 2007) was used to identify loss of heterozygosity (LOH) and Mendelian errors, and to stratify the ancestry of the samples using the 750,000 informative SNPs available on the array (Fig. 5).
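The three array-level thresholds quoted above can be expressed as a simple pass/fail check. The following is a minimal sketch only: the class, field and function names are hypothetical and are not taken from ChAS or any other vendor software, and the two example samples are invented for illustration.

```python
from dataclasses import dataclass

# Thresholds quoted in the text; all names below are hypothetical.
MAX_WAVINESS_SD = 0.12
MAX_MAPD = 0.25
MIN_SNP_QC = 15.0


@dataclass
class ArrayQC:
    sample_id: str
    waviness_sd: float
    mapd: float
    snp_qc: float


def passes_qc(qc: ArrayQC) -> bool:
    """Return True only when all three metrics meet their thresholds."""
    return (
        qc.waviness_sd <= MAX_WAVINESS_SD
        and qc.mapd <= MAX_MAPD
        and qc.snp_qc >= MIN_SNP_QC
    )


# Invented example samples, not data from this study.
good = ArrayQC("sample_A", waviness_sd=0.08, mapd=0.18, snp_qc=17.2)
noisy = ArrayQC("sample_B", waviness_sd=0.15, mapd=0.18, snp_qc=17.2)
print(passes_qc(good), passes_qc(noisy))
```

A sample failing any single metric is rejected in this sketch; whether borderline samples were reviewed manually is not stated in the text.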

2.3 CNV Calling

CNVs were called using a combination of four algorithms: Chromosome Analysis Suite (ChAS) (Affymetrix Inc., USA), iPattern (Pinto et al. 2010), Nexus (Darvishi 2010) and Partek (Downey 2006). A CNV was deemed stringent if it was called by a minimum of two algorithms, at least one of which had to be ChAS or iPattern (Fig. 6). Samples with an excess number of calls for at least 2 algorithms (≥3 standard deviations from the mean number of calls) were also excluded. Eight samples (2 probands, 1 affected sibling, 3 mothers and 2 fathers) did not meet this requirement in addition to quality control criteria and were thus excluded from the CNV analysis. This resulted in a total of 190 probands, 88 affected siblings, 163 unaffected siblings, 174 mothers and 151 fathers being used in the CNV analysis. CNVs on the X chromosome were called exclusively by ChAS and iPattern, with all calls on the Y chromosome being eliminated. To be considered for analysis, CNVs needed to be ≥15 Kb in length and span ≥10 consecutive probes to reduce the detection of false positive calls (Fig. 7). CNVs were then restricted to those in which at least 75% of the variant was present in a copy-number stable region as defined in Zarrei et al. (2014).
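The stringency rule and the size and probe filters described above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in this study: the function names are hypothetical, coordinates are assumed to be half-open so that length is `end - start`, and the special handling of the X and Y chromosomes and of copy-number stable regions is not modelled.

```python
from typing import Iterable

# Callers of which at least one must support a stringent call (from the text).
PRIMARY_CALLERS = {"ChAS", "iPattern"}
MIN_CALLERS = 2
MIN_SIZE_BP = 15_000   # >=15 Kb
MIN_PROBES = 10        # >=10 consecutive probes


def is_stringent(callers: Iterable[str]) -> bool:
    """True if >=2 algorithms made the call and one is ChAS or iPattern."""
    unique = set(callers)
    return len(unique) >= MIN_CALLERS and bool(unique & PRIMARY_CALLERS)


def passes_size_filter(start: int, end: int, n_probes: int) -> bool:
    """True if the call is >=15 Kb long and spans >=10 probes.

    Assumes half-open coordinates (hypothetical convention).
    """
    return (end - start) >= MIN_SIZE_BP and n_probes >= MIN_PROBES


def keep_call(callers: Iterable[str], start: int, end: int, n_probes: int) -> bool:
    """Combine the stringency rule with the size and probe filters."""
    return is_stringent(callers) and passes_size_filter(start, end, n_probes)


# Invented example calls.
print(keep_call({"ChAS", "Nexus"}, 1_000_000, 1_020_000, 14))     # kept
print(keep_call({"Nexus", "Partek"}, 1_000_000, 1_020_000, 14))   # no ChAS/iPattern
print(keep_call({"ChAS", "iPattern"}, 1_000_000, 1_010_000, 14))  # only 10 Kb
```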

Figure 5: Ancestry Stratification of ASD Cases from Infant Siblings Cohort

Sample ancestry was ascertained using 122,368 SNPs present on the Affymetrix 6.0 and CytoScan HD platforms, each with a minor allele frequency >5% and genotyping rate >95%.

A This plot shows the allelic distribution of Canadian and U.S. samples juxtaposed with European controls. A total of 70.7% (312/441) of the samples were of European ancestry, 5.7% (25/441) were of Asian ancestry and 1.4% (6/441) were of African ancestry.

B Of the 441 samples analyzed, 22.2% (98/441) were of alternate ancestry, 31 of which are Canadian and 67 of which are from the U.S.

[Figure 6 bar chart. Y-axis: Mean CNV Calls per Subject (0 to 200); x-axis categories: All CNVs, Autosomal, X Chromosome, Y Chromosome; series: ChAS, iPattern, Nexus, Partek.]

Figure 6: Stratification of CNVs Called per Algorithm

The four algorithms employed different statistical models to detect changes in copy number which varied the yield of variants for each one. They were thus strategically aggregated to identify stringent CNVs. None of the algorithms accurately called copy number changes on the Y chromosome, thus these calls were excluded. A high number of CNVs were identified on the X chromosome by Nexus compared to the other three algorithms, thus increasing the likelihood of false positive results. Partek returned the opposite trend. ChAS and iPattern called a similar number of CNVs per individual, therefore they were the only two algorithms used to reliably call variants on the X chromosome. Given that the latter two displayed similar results in all categories, at least one of ChAS or iPattern was required to call stringent CNVs, in addition to either Partek and/or Nexus.

Figure 7: Determining Optimal CNV Size and Number of Consecutive Microarray Probes

A Analysis conducted on 84 quads demonstrated that the replication rate between WGS and high-resolution microarray was ≥70% for CNVs ≥15 Kb.

B A minimum of 82% of CNVs identified in n=22 duplicate samples from the OPGP control cohort were replicated in CNVs ≥10 Kb. Using the same microarray data, it was determined that ≥69% of CNVs were replicated when spanned by 10 or more consecutive microarray probes.

C The size distribution of CNVs in both the OPGP control cohort and infant sibling data set was similar for CNVs ≥15 Kb.

Figure 8: Selection of Stringent CNVs ≥15 Kb, Overlapping ≥ 10 Consecutive Probes

Depicted in the above boxplot is the number of stringent CNVs per individual (n=441) after imposing the following filtering criteria: ST0 = all CNVs (≥15 Kb and overlapping ≥10 consecutive probes); ST1 = all CNVs + rare status; ST2 = all CNVs + rare status + ≤70% overlap with segmental duplications; ST3 = all CNVs + rare status + ≤70% overlap with segmental duplications + ≥75% copy number stable; ST4 = all CNVs + rare status + ≤70% overlap with segmental duplications + ≥75% copy number stable + exon overlap. The subset of CNVs used in the analysis resulted in a mean of 2 CNVs per individual.

Rare CNVs were defined as those found in less than 0.1% of unrelated control individuals with no psychiatric history, where a control CNV was counted as the same variant if it shared at least 50% reciprocal overlap with the CNV in question (Fig. 8). The primary control data set used consisted of 873 individuals from the Ontario Populations Genomics Platform (OPGP), all genotyped on the Affymetrix CytoScan HD (Uddin et al. 2015). Four unrelated control populations totaling 6,136 individuals were additionally queried to enhance the accurate detection of rare variants: the Collaborative Genetic Study of Nicotine Dependence (COGEND), genotyped on the Illumina Omni 2.5M (Bierut et al. 2008); the Health, Aging, and Body Composition (Health ABC) Study, genotyped on the Illumina 1M-Duo (Goodpaster et al. 2006); and the Ontario Heart Institute Controls (Stewart et al. 2009) and POPGEN (Krawczak et al. 2006), both genotyped on the Affymetrix 6.0 microarray. I was able to ascertain the de novo status of CNVs in individuals for whom both parents were genotyped. This included 133 probands, 119 unaffected siblings and 56 affected siblings from 142 families. ASD-relevant CNVs, herein referred to as presumed pathogenic CNVs, were defined according to the ACMG classifications of pathogenic and variant-of-unknown-significance-likely-pathogenic (Kearney et al. 2011).
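One reading of the rarity criterion is that a control CNV counts as a match when it shares at least 50% reciprocal overlap with the case CNV, and that a case CNV is rare when fewer than 0.1% of controls carry a match. The sketch below illustrates that reading only. The function names, the tuple layout and the example coordinates are hypothetical, and matching on CNV type (deletion versus duplication) is omitted for brevity.

```python
from typing import Iterable, Tuple

Interval = Tuple[str, int, int]          # (chromosome, start, end)
ControlCall = Tuple[str, str, int, int]  # (sample_id, chromosome, start, end)


def reciprocal_overlap(a_start: int, a_end: int, b_start: int, b_end: int) -> float:
    """Smaller of the two overlap fractions; 0.0 for disjoint intervals."""
    shared = min(a_end, b_end) - max(a_start, b_start)
    if shared <= 0:
        return 0.0
    return min(shared / (a_end - a_start), shared / (b_end - b_start))


def control_frequency(cnv: Interval, control_calls: Iterable[ControlCall],
                      n_controls: int, min_overlap: float = 0.5) -> float:
    """Fraction of control individuals carrying a matching CNV."""
    chrom, start, end = cnv
    carriers = {
        sample_id
        for sample_id, c_chrom, c_start, c_end in control_calls
        if c_chrom == chrom
        and reciprocal_overlap(start, end, c_start, c_end) >= min_overlap
    }
    return len(carriers) / n_controls


def is_rare(cnv: Interval, control_calls: Iterable[ControlCall],
            n_controls: int, max_frequency: float = 0.001) -> bool:
    """True when fewer than 0.1% of controls carry a matching CNV."""
    return control_frequency(cnv, control_calls, n_controls) < max_frequency


# Invented control calls in a hypothetical cohort of 2,000 controls.
controls = [
    ("c1", "chr2", 100_000, 180_000),  # 80% reciprocal overlap: a match
    ("c2", "chr2", 150_000, 400_000),  # 20% reciprocal overlap: not a match
    ("c3", "chr7", 100_000, 200_000),  # different chromosome
]
case_cnv = ("chr2", 100_000, 200_000)
print(is_rare(case_cnv, controls, n_controls=2000))
```

Under this reading, a single matching carrier among the 873 OPGP controls alone would already exceed the 0.1% threshold, which may be one reason the additional control cohorts were queried.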

2.4 Variant Prioritization and Molecular Validation

All potential de novo and presumed pathogenic CNVs were experimentally validated using either the SYBR® Green or the TaqMan© Copy Number Assay method for real-time quantitative PCR (qPCR). The SYBR® Green absolute quantification method utilizes a standard curve to interpolate the quantity of the region of interest. Two sets of primers 15-25 bp in length were designed using Primer3 Input version 0.4.0 (http://bioinfo.ut.ee/primer3-0.4.0/) within the boundaries of the CNV, typically ≥500 bp from either designated breakpoint. The amplicons were between 90 and 140 bp in length. Standard primer design protocol was adhered to: primers were designed to have a GC content of 30-70%, were placed outside of repetitive regions, did not overlap common SNPs and did not share sequence identity with other parts of the genome, as determined using tools available in the UCSC Genome Browser (https://genome.ucsc.edu). Control primers were placed in the FOXP2 gene (Forward Primer Sequence: 5’ TGC TAG AGG AGT GGG ACA AGT A 3’; Reverse Primer Sequence: 5’ GAA GCA GGA CTC TAA GTG CAG A 3’), which served as a diploid comparator. The melting temperature of the amplicon in the gene being tested was designed to be 76°C ±4°C to closely match that of the control (MBCF Oligo Calculator: http://mbcf.dfci.harvard.edu/docs/oligocalc.html). Primers were tested on HapMap sample IDs NA10851 (male) and NA15510 (female) and run on a 1% agarose gel to ensure that a single band of the expected size was produced. A standard curve was created by performing 4 serial dilutions on control DNA. The same two HapMap samples were used as diploid negative controls. The ratios of both experimental primer sets to those of FOXP2 were compared to determine if the locus of interest was deleted (<0.7), duplicated (>1.3) or neutral (~1.0) (Fig. 9).

[Figure 9 bar chart. Y-axis: ratio of sample to FOXP2 (0 to 1.2); series: Primer Set 1, Primer Set 2; x-axis: Mother, Father, Proband (Male), Sibling (Male), Male Control (10851), Female Control (15510).]

Figure 9: Detection of Deletion in PTCHD1-AS on the X Chromosome using SYBR® Green Method for qPCR

This graph depicts a deletion spanning the third exon of PTCHD1-AS2 at Xp22.11 (chrX:23043876-23134236). Both the male proband and affected sibling possess this deletion in their single copy. This structural variant was maternally inherited, as evidenced by the mother displaying a 0.5:1 ratio of this region to FOXP2 in comparison to the female control, who shows a ratio of ~1:1.
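The ratio thresholds used to interpret the SYBR® Green assay can be sketched as a small classifier. This is illustrative only: the function names are hypothetical, and the rule that the two primer sets must agree before a call is made is an assumption added here, not a criterion stated in the text.

```python
DELETION_MAX = 0.7      # ratio below this is read as a deletion
DUPLICATION_MIN = 1.3   # ratio above this is read as a duplication


def classify_ratio(ratio: float) -> str:
    """Classify one test:FOXP2 ratio using the thresholds from the text."""
    if ratio < DELETION_MAX:
        return "deletion"
    if ratio > DUPLICATION_MIN:
        return "duplication"
    return "neutral"


def classify_locus(ratio_set1: float, ratio_set2: float) -> str:
    """Combine both primer sets; disagreement is flagged (assumed rule)."""
    call1 = classify_ratio(ratio_set1)
    call2 = classify_ratio(ratio_set2)
    return call1 if call1 == call2 else "discordant"


# Invented ratios for illustration.
print(classify_locus(0.52, 0.48))  # deletion
print(classify_locus(1.02, 0.97))  # neutral
print(classify_locus(1.45, 1.05))  # discordant
```

These thresholds presuppose a locus that is normally present in two copies. For X-linked loci in males, a single normal copy would already give a ratio near 0.5 against the autosomal FOXP2 comparator, so such samples are better interpreted against the sex-matched control.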

The TaqMan© Copy Number Assay is a probe-based assay that uses the comparative Ct method (2^-ΔΔCt) of relative quantitation to determine the copy number of a sample. In place of a standard curve, it utilizes an endogenous reference (RNase P) with a copy number of 2 in a diploid genome. The RNase P probe is 87 bp in length and located within the only exon of the Ribonuclease P RNA component H1 gene located at chr14:20811565 (hg19). Reference assay probes possess a 5’-VIC® reporter dye and 3’-TAMRA™ quencher. Each test assay probe is approximately 80 bp in length and has a 5’-FAM™ reporter dye and 3’-TAMRA™ quencher.
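The comparative Ct calculation can be sketched as below. The sketch assumes a calibrator sample known to carry two copies of the target, which is the usual arrangement for this method but is not specified in the text; the function and argument names are hypothetical and the Ct values are invented.

```python
def copy_number(ct_target: float, ct_reference: float,
                cal_ct_target: float, cal_ct_reference: float,
                calibrator_copies: int = 2) -> float:
    """Estimate copy number with the comparative Ct (2^-ddCt) method.

    dCt  = Ct(target) - Ct(reference, e.g. RNase P) within one sample
    ddCt = dCt(test sample) - dCt(calibrator sample)
    The calibrator is assumed to carry `calibrator_copies` copies.
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = cal_ct_target - cal_ct_reference
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return calibrator_copies * 2 ** (-delta_delta_ct)


# Invented Ct values: the target amplifies one cycle later than in the
# calibrator, which corresponds to half the template (one copy).
print(copy_number(26.0, 25.0, 25.0, 25.0))  # 1.0
print(copy_number(25.0, 25.0, 25.0, 25.0))  # 2.0
```

In practice the estimate is rounded to the nearest integer copy number and replicate wells are averaged; those steps are left out of this sketch.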

2.5 Critical Exon Analysis

De novo deletions and duplications in affected (n=88) and unaffected individuals (n=163) were pooled with those reported by Levy et al. (2011) in 68 probands and 17 related unaffected siblings. Genes impacted by all de novo CNVs from the two studies were collated, and brain-critical exons were identified using previously described criteria (i.e. >75th percentile of expression data and low burden of rare, missense mutations). Results were reported as a proportion of critical exons to total exons. Gene expression data from the BrainSpan Atlas of the Human Developing Brain was used and stratified by developmental window, namely prenatal (8-37 post-conception weeks) and adulthood (18 years onward), and by brain region. P-values were calculated using a two-tailed, paired t-test.

Results

We analyzed 199 families, n=77 from 5 Canadian sites and n=122 from 6 US sites. Of these families, 126 (63.3%) were simplex (i.e., one affected individual at the start of the study) and 73 (36.7%) were multiplex. A total of 251 infant siblings were included in the CNV analysis, of whom 88 obtained a formal ASD diagnosis by 36 months of age and 163 did not meet criteria for ASD. Twenty-five of the non-ASD infant siblings were observed as having other concerns. Of these siblings with differential clinician best estimate diagnoses, 12 were identified as having subclinical ASD, 4 had attention deficit disorder, 2 had developmental delay and 7 had a language delay. A male-to-female ratio of 3:1 was noted for infants with ASD, while ratios of 1.2:1 and 1.7:1 were observed for non-ASD infants with typical and atypical developmental trajectories, respectively. Accounting for birth order of siblings, a recurrence rate of 34.3% was observed. Gender of the infant was observed to be a predictor of ASD and the presence of other developmental concerns (P-value=0.014, OR=2 using a generalized linear model), with 71.7% of atypically developing and ASD infants being male.

3.1 Identification of De Novo and Presumed Pathogenic CNVs in Infant Siblings

I first analyzed the distribution of de novo CNVs between affected and unaffected individuals. A total of 14 de novo CNVs were detected and validated in probands and siblings. Nine de novo CNVs were observed in eight affected individuals (5 deletions: 4 duplications) and five de novo CNVs (3 deletions: 2 duplications) were observed in five unaffected individuals (Table 1). These correspond to de novo rates of 4.8% and 4.2%, respectively (P-value=0.97, OR=1.02 using a generalized linear model).

3.1.1 De Novo CNVs in ASD Individuals

Nine de novo CNVs were validated in eight affected individuals. A proband possessed a 243 Kb deletion (BP2-BP3) at the 16p11.2 locus, which typically presents with general developmental delay, intellectual disability and facial dysmorphology (Barge-Schaapveld et al. 2011). A male proband was found to harbor an 11 Mb deletion encompassing the FOXP2 gene, deletions of which are associated with verbal language dyspraxia (Feuk et al. 2006).

Table 1: de novo CNVs in Affected Individuals (n=189) and Unaffected Individuals (n=119)

Individual ID | Gender | Family | Status | Loci | Size | CNV Type | Genes
5-0098-003 | M | SPX | Proband | 7q31.1 | 11 Mb | Deletion | FOXP2 + 39 genes¹
6-0139-001 | F | SPX | Proband | 19q13.33 | 2.6 Mb | Duplication | PNKP + 149 genes
3-4453-005 | M | SPX | *Non-ASD Sibling | 15q21.3-15q22.2 | 1.6 Mb | Duplication | ADAM10 + 12 genes
5-0289-003 | M | SPX | Proband | 17q25.3 | 859 Kb | Duplication | ACTG1, BAHCC1 + 41 genes²
5-0289-003 | M | SPX | Proband | 17q25.3 | 62 Kb | Deletion | SLC16A3, CSNK1D²
6-0376-004 | M | MPX | Non-ASD Sibling | 16p11.2 | 610 Kb | Deletion | KCTD13 + 28 genes
6-0384-001 | F | SPX | Proband | 16p11.2 | 243 Kb | Deletion | SH2B1 + 11 genes
1-0040-003 | F | SPX | Proband | 18p11.32 | 547 Kb | Duplication | C18orf56, CETN1, TYMS, CLUL1, COLEC12, YES1, ADCYAP1, ENOSF1
6-0258-001 | F | SPX | Proband | 2q32.1-2 | 328 Kb | Deletion | GULP1, LINC01090, MIR561
6-0244-003 | M | SPX | Proband | 21q22.3 | 284 Kb | Duplication | NDUFV3, PDE9A, WDR4, PKNOX1²
6-0243-004 | F | SPX | Non-ASD Sibling | 15q22.31 | 279 Kb | Deletion | TRIP4, ZNF609, OAZ2
5-0514-003 | M | SPX | Proband | 7p22.1 | 130 Kb | Deletion | FBXL18, TNRC18
5-0514-004 | F | SPX | *Non-ASD Sibling | Xq13.1 | 129 Kb | Duplication | PJA1, LINC00269
6-0356-004 | F | SPX | *Non-ASD Sibling | 1q22 | 75 Kb | Deletion | MSTO1, MSTO2P, YY1AP1

A total of 14 de novo CNVs were found in 13 individuals (6 males: 7 females), with 57% being deletions. Five of the 13 individuals were classified as being unaffected by ASD, three of whom displayed other concerns (denoted by *). See section 7.1 in the appendices for CNV breakpoints.

1. Published by Feuk et al. (2006)
2. Published by Pinto et al. (2010)

A 2.6 Mb duplication encompassing 150 genes at 19q13.33 was found in an affected female, including a whole-gene duplication of PNKP, disruptions in which are associated with MCSZ Syndrome whereby cases present with microcephaly, seizures and developmental delay (Reynolds et al. 2012). One affected male harbored a previously-reported 859 Kb duplication at locus 17q25.3 affecting ACTG1 (Riviere et al. 2012; Pinto et al. 2010), and a contiguous 62 Kb deletion encompassing SLC16A3 (Pinto et al. 2010).

3.1.2 De Novo CNVs in non-ASD Individuals

Five de novo CNVs were observed in unaffected siblings. Three of the five siblings were noted as having an atypical developmental trajectory. Of the de novo CNVs observed in unaffected individuals, 25% (1/4) overlapped an ASD risk gene. A 1.6 Mb duplication overlapping 13 genes, including ADAM10, was observed in a male. ADAM10 is an α-secretase responsible for the ectodomain shedding of neuroligin 1 (NLGN1) and amyloid precursor protein (APP) (Suzuki et al. 2012). This individual was administered the ADOS, Vineland Adaptive Behavioral Scale and Mullen Scales of Early Learning at 14, 19, 25 and 37 months. At each time point the male infant sibling was assessed using the original (2003) and revised (2007) versions of the test, with Module 1 being used at the first three visits and Module 2 being used at the last. At 14 months, his scores met the clinical diagnostic threshold for ASD, whereas at 25 months they did not. Discordance in diagnoses between the two versions of the assessment was noted at 36 months, with the revised version placing clinical scores just above the threshold for ASD.

Two other de novo CNVs were found in females who exhibited subclinical symptoms of ASD. An unaffected female sibling possessed a 75 Kb deletion at 1q22, which encompassed genes MSTO1, MSTO2P and YY1AP. The first gene and its corresponding pseudogene affect mitochondrial morphology and distribution (Kimura and Okano 2007), while YY1AP is a co-activator of the zinc finger protein Yin Yang 1 (Ohtomo et al. 2007). This individual was assessed for ASD at 18, 24 and 36 months. Clinical scores met the ASD diagnostic threshold, but no ASD diagnosis was given. Concerns for the presence of attention deficit hyperactivity disorder were also expressed by clinicians. A second female sibling possessed a 129 Kb duplication at Xq13.1 affecting PJA1, an E3 ubiquitin ligase (Yu et al. 2002), and LINC00269. This individual met diagnostic criteria for Pervasive Developmental Disorder (PDD) at age 3, but the diagnosis was revoked two years later. No further clinical details were provided.

Lastly, two typically developing, non-ASD individuals each possessed de novo CNVs. One female harbored a 279 Kb deletion at 15q22.31 affecting genes TRIP4, ZNF609 and OAZ2. Thyroid hormone receptor interactor 4 (TRIP4) is a transcriptional coactivator (Ruiz et al. 2014), ornithine decarboxylase antizyme 2 (OAZ2) is a decarboxylase that regulates cell proliferation (He et al. 2014) and zinc finger protein 609 (ZNF609) is a DNA-binding protein involved in transcriptional regulation (Nagase et al. 1997). While these are not known ASD risk genes, the thyroid hormone receptor pathway has been implicated in ASD and developmental delay. A de novo missense mutation in the thyroid hormone nuclear receptor-α (TRα) has been identified in an autistic individual (Yuen et al. 2015) and disruptions of TRα are consistently associated with developmental delay in mice (Tinnikov et al. 2002). In addition, I validated a 610 Kb deletion at the 16p11.2 locus in a non-ASD male. Deletions in this region are highly associated with ASD and cognitive impairment (Hanson et al. 2015).

3.1.3 Critical Exon Analysis for De Novo CNVs

The spatiotemporal expression of critical exons revealed that there was a significantly larger proportion of critical exons in genes impacted by de novo mutations in affected versus unaffected individuals across the eight brain regions at the prenatal time point (Fig. 10). This higher burden of critical exons across brain regions (P < 0.05) was also observed in adulthood, with the exception of the dorsolateral prefrontal cortex (DFC) and the medial frontal cortex (MFC).

3.1.4 Distribution of De Novo and Inherited Presumed Pathogenic CNVs

A total of 16 infant siblings harbored a de novo or inherited presumed pathogenic CNV (6.4%). Of these individuals, 6 shared a CNV with their related proband (Table 2). For example, a female proband and related male sibling with ASD possessed a 70 Kb maternally-inherited duplication of exon 2 of CNTN4 and a 54 Kb maternally-inherited deletion in catenin (cadherin-associated protein), alpha 3 (CTNNA3). The female proband in this family also had a 547 Kb de novo duplication affecting 8 genes including adenylate cyclase activating polypeptide 1 (ADCYAP1), a modulator of circadian rhythms. A 57 Kb maternally-inherited deletion in the 4th untranslated exon of methyl-CpG-binding domain 5 (MBD5) (Fig. 12), part of the critical region of the 2q23.1 microdeletion syndrome, was found in a male proband and male affected sibling. A 90 Kb deletion affecting the third exon of PTCHD1-AS2 was observed in a male proband and corresponding affected male sibling. The male proband had a male fraternal twin with ASD whose DNA was not provided.

[Figure 10 plots. Panels: Prenatal (8-34 post-conception weeks); Adulthood (>18 yrs.). Y-axis: Proportion of Critical Exons; x-axis: Brain Regions; series: ASD Siblings (Levy et al. + Infant Sibs. Study), Non-ASD Siblings (Levy et al. + Infant Sibs. Study).]

Figure 10: Spatiotemporal Distribution of CNVs Overlapping Critical Exons

Enrichment in affected individuals of critical exons (Uddin et al. 2014) in 8 brain regions (AMY, amygdaloid complex; CBC, cerebellar cortex; DFC, dorsolateral prefrontal cortex; HIP, hippocampus; ITC, inferolateral temporal cortex; MD, mediodorsal nucleus of thalamus; STR, striatum; VFC, ventrolateral prefrontal cortex) prenatally (8-34 post-conception weeks) and in adulthood (>18 years of age).

Figure 11: Summary of CNV Findings in Infant Siblings

Figure 12: Deletion of MBD5 in Quad 4-0049

A This figure plots the maternally-inherited deletion in the male proband and related male infant sibling against control datasets and the CNVs in the Database of Genomic Variants (DGV).

B Pedigree of the two male siblings harboring a maternally-inherited CNV.

Table 2. Identification of de novo and Inherited Exonic Presumed Pathogenic CNVs in Infant Siblings of ASD Probands

Family ID | Sex | Status | CNV Type | Loci | Size | CNV | Genes
3-8257 | M/M | Proband + ASD Sibling | Maternally Inherited | Xp22.11 | 90.1 Kb | Deletion | PTCHD1-AS
4-0049 | M/M | Proband + ASD Sibling | Maternally Inherited | 2q23.1 | 57.5 Kb | Deletion | MBD5
1-0040 | M/F | Proband + ASD Sibling | Maternally Inherited | 10q21.3 | 53.3 Kb | Deletion | CTNNA3
1-0040 | M/F | Proband + ASD Sibling | Maternally Inherited | 3p26.3 | 70.4 Kb | Duplication | CNTN4
5-0455 | M/M | Proband + ASD Sibling | Maternally Inherited | 8p12 | 64.8 Kb | Deletion | NRG1, NRG1-IT1
3-4168 | M/F/M | 3 ASD Siblings | Paternally Inherited | 2q24.1;2q23.3 | 134.5 Kb | Deletion | GALNT13
8-1021 | F | ASD Sibling | Paternally Inherited | 3p26.3 | 264 Kb | Duplication | CNTN4
1-0062 | M/F | Proband + Non-ASD Sibling | Paternally Inherited | 2p16.3 | 228.7 Kb | Deletion | NRXN1
6-0382 | M/F | Proband + Non-ASD Sibling | Maternally Inherited | Xp22.31 | 50.1 Kb | Duplication | NLGN4X
1-0061 | M | Non-ASD Sibling | Paternally Inherited | 3p26.2-3 | 1.8 Mb | Deletion | CNTN6, CNTN4, CNTN4-AS2
3-4453 | M | Non-ASD Sibling | de novo | 15q21.3-15q22.2 | 1.6 Mb | Duplication | ADAM10 + 12 genes
6-0356 | F | Non-ASD Sibling | de novo | 1q22 | 75 Kb | Deletion | MSTO1, MSTO2P, YY1AP1
6-0243 | F | Non-ASD Sibling | de novo | 15q22.31 | 279 Kb | Deletion | TRIP4, ZNF609, OAZ2
6-0376 | M | Non-ASD Sibling | de novo | 16p11.2 | 610 Kb | Deletion | KCTD13 + 28 genes
1-0514 | F | Non-ASD Sibling | de novo | Xq13.1 | 129 Kb | Duplication | PJA1, LINC00269

A total of 15 de novo and inherited presumed pathogenic CNVs were found in 16 high-risk siblings. Three CNVs were shared between probands and affected siblings, while two were shared with unaffected siblings. Ten of these CNVs were found uniquely in siblings. See section 7.1 of appendices for CNV breakpoints.

Figure 13: Presumed Pathogenic CNVs Shared Between Proband and Affected or Unaffected Infant Sibling

Seven presumed pathogenic CNVs were shared between a proband and their respective infant sibling(s). Three of the six siblings were not diagnosed with ASD by age but displayed subclinical symptoms. Of the six infant siblings, four were affected by ASD.

Figure 14: de novo and Inherited Presumed Pathogenic CNVs Found in Siblings with ASD or Other Developmental Concerns and Not in Probands

Eight infant siblings possessed a de novo or inherited presumed pathogenic CNV that was not shared with a related proband. Four of these individuals did not meet criteria for ASD, but were noted as having other developmental concerns.

Two of seven CNVs were shared between a proband and unaffected sibling. These CNVs include a paternally-inherited 228 Kb 5’-NRXN1 deletion and a 50 Kb duplication overlapping the second exon of the third isoform of NLGN4X (unknown inheritance). The high-risk female with the NLGN4X duplication was identified as having a BAP at 24 months, but no ASD diagnosis was ascribed at 36 months. The child’s score on the Social Communication Questionnaire (SCQ) was above the clinical cutoff for ASD. The female sibling sharing the 5’-NRXN1 deletion met the diagnostic cut-offs for ASD, yet was not given an ASD diagnosis. The child’s Early Learning Composite score on the Mullen Scales was low, as was her Adaptive Behavioral Composite derived from the Vineland Adaptive Behavioral Scales. Deficits in fine-motor skills were also recorded in the latter two tests. The father did not display a BAP, as determined by the self-assessed BAPQ.

Eight high-risk siblings possessed de novo and inherited presumed pathogenic CNVs not shared with a related index case. In a multiplex family of 5 children, two male and one female infant sibling with ASD possessed a paternally-inherited duplication of the third exon of the longest isoform of GALNT13, an ASD candidate gene (Bucan et al. 2009). The third-born male and index case did not possess the duplication. Another female infant sibling possessed a paternally-inherited duplication of exon 3 of the longest isoform and exons 1 and 2 of the shortest isoform of CNTN4. Additionally, an unaffected male sibling harboring a paternally-inherited 1.8 Mb deletion of CNTN6, CNTN4 and CNTN4-AS was noted as having a language delay. Analysis of his Early Learning Composite scores tabulated from the Mullen Scales at 12, 24 and 26 months of age showed a regression in cognitive abilities.

3.2 Predictive Value of Chromosomal Microarray for ASD Diagnoses

Of the 88 affected siblings in the cohort, 8 (9%) possessed a presumed pathogenic CNV. Six of 25 (24%) unaffected siblings with subclinical forms of ASD or other developmental concerns possessed a de novo or presumed pathogenic CNV as well. For the purposes of calculating the predictive value of chromosomal microarray, infants in these two groups were pooled and labelled as atypically developing. Two of 138 typically developing high-risk infants harbored de novo CNVs, thus categorizing them as false-positive results.

The specificity of chromosomal microarray in this study was 98.6% (CI 95% =0.8-0.92), meaning that 1.4% of typically developing, non-ASD individuals from an affected family will test positive for a presumed pathogenic variant. The sensitivity of the microarray was 12.4% (CI 95% =0.06-0.20), indicating that just over 10% of individuals with ASD will test positive for presumed pathogenic CNVs using chromosomal microarray.


Table 3: Predictive Value Statistic

ASD Outcomes of HR Siblings

Microarray Result    Atypically Developing    Typically Developing    Total
Positive             14                       2                       16
Negative             99                       136                     235
Total                113                      138

Statistic                    Value (%)    CI (95%)
Positive Predictive Value    87.5         0.62-0.98
Negative Predictive Value    57.9         0.51-0.64
Sensitivity                  12.4         0.06-0.2
Specificity                  98.6         0.8-0.92

The positive predictive value (PPV), negative predictive value (NPV), sensitivity and specificity were calculated and are presented with 95% confidence intervals.
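The point estimates in Table 3 follow directly from the four cell counts. The sketch below recomputes them, treating atypically developing siblings with a positive microarray result as true positives. The function name and dictionary keys are illustrative and are not part of the study's pipeline; confidence intervals are left out because the interval method is not stated in this section.

```python
def predictive_values(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV (in percent) for a 2x2 table."""
    return {
        "sensitivity": 100 * tp / (tp + fn),  # positives among atypically developing
        "specificity": 100 * tn / (tn + fp),  # negatives among typically developing
        "ppv": 100 * tp / (tp + fp),          # atypically developing among positives
        "npv": 100 * tn / (tn + fn),          # typically developing among negatives
    }


# Cell counts taken from Table 3
stats = predictive_values(tp=14, fp=2, fn=99, tn=136)
for name, value in stats.items():
    print(f"{name}: {value:.1f}%")
```

Rounded to one decimal place, these reproduce the percentages listed in Table 3.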


Discussion

The results of this study suggest that the presence of a presumed pathogenic CNV in a sibling is predictive of a strict ASD or subclinical phenotype, specifically if it is shared with the related proband. Importantly, it also mirrors findings from a previously published study of 84 quartets indicating that ASD can have different genetic causes within a family, and thus separate genetic testing of siblings is optimal for early detection of ASD (Yuen et al. 2015). This study provides the first evidence for CNVs being used as biological markers for ASD. Given the importance of early therapeutic intervention, identifying an ASD-relevant CNV in infants may help to indicate which children would benefit from such services at a critical point in their development.

Given the reported contribution of de novo CNVs to the development of ASD, I first assessed the incidence of these mutations in all affected and unaffected individuals. The de novo rate in affected individuals (i.e. probands and affected siblings) was similar to previously reported rates of 5-10% in unrelated samples (Levy et al. 2011; Pinto et al. 2014; Tammimies et al. 2015). The rate in unaffected individuals was reported to be 2.2%, similar to the 2% de novo rate reported in the first published CNV study which included 887 probands and related unaffected siblings from the Simons Simplex Collection (Levy et al. 2011). Three non-ASD siblings with de novo CNVs had other developmental concerns. The de novo rate exclusively among typically developing siblings was 1.4% (2/138), comparable to that observed in other studies.

In order to determine the deleterious nature of de novo variants to ASD, I performed a critical exon enrichment analysis that was adapted from a previously published study demonstrating that exons that are highly expressed in the brain and have a low burden of rare, missense mutations are under purifying selection in the ASD population. I processed these CNVs in combination with unrelated samples from Levy et al. (2011) through this critical exon pipeline. Similar to previously reported results, I found that de novo CNVs in affected individuals harbored a higher proportion of critical exons than unaffected siblings. The proportion of critical exons affected by CNVs between affected individuals from this infant siblings study did not statistically differ from the probands in Levy et al. (2011), allowing us to pool subjects with ASD from both studies (results not shown). This could not be verified for unaffected siblings given the small number of these individuals with de novo CNVs (n=5). This indicates that the genes that confer the most risk are expressed in the developing brain. A statistical difference was observed in the proportion of critical exons between affected and unaffected individuals for all brain regions, with the exception of two. These results indicate that the nature of the de novo CNVs may be more benign in unaffected individuals than in ASD cases.

Given the advantage of having a mixture of both SPX and MPX families available for genotyping, we wished to determine how CNVs were distributed among them. Our first observation was that SPX families had a higher burden of de novo CNVs than MPX families (14 SPX families). The observation that de novo CNVs were abundant in SPX families compared to MPX families would indicate that these genetic events drive ASD in this family type, and that, by default, inherited variants are more causal of an ASD phenotype in MPX families. This data contradicts one of the most recent large CNV studies in ASD comprised of thousands of cases, which demonstrated that there was an even distribution in de novo events among SPX and MPX families (Pinto et al. 2014). This discrepancy is presumed to be due to the small sample size. It is also important to note that two inherited presumed pathogenic CNVs were found in multiplex families, indicating that siblings can harbor different genetic drivers for ASD.

In total, presumed pathogenic CNVs were observed in 23 families (11.4%). Seven of these variants were shared between an index case and an infant sibling, all of whom either had ASD or subclinical forms of ASD. The longitudinal data allowed me to assess the infant’s developmental trajectory in real time, something that previous ASD studies, most of which are retrospective in nature, do not allow. Two of these female non-ASD infants were classified as having BAP or had ADOS-2 scores that met the diagnostic cut-off for ASD between 12 and 36 months, although they were never given a diagnosis.

These findings are supportive of data emerging from studies assessing the diagnostic stability of behavioral diagnostic tools for ASD, as well as the developmental profiles of non-ASD siblings in infancy and childhood. One study gauging the stability of ASD diagnoses between 18 and 24 months of age found that a significant number of individuals that did not receive a diagnosis at these time points (63% and 41%, respectively) went on to receive a diagnosis at 36 months (Ozonoff et al. 2015). Despite being relatively stable by 36 months, the sensitivity of these diagnostic measures is low, which indicates a high number of false-negatives.

A limited number of studies have assessed the outcome of non-ASD individuals past 3 years of age. One study looked at 200 children who received an ASD diagnosis at six years of age and found that most high-risk siblings identified with language delays, motor delays and impaired cognitive functioning (Davidovitch et al. 2015). Importantly, a recent study observed that 6/49 non-ASD siblings (3 boys, 3 girls) were diagnosed with ASD after 3 years of age. These children generally displayed lower ASD symptoms and higher cognitive function (Brian et al. 2015).

The presence of de novo and presumed pathogenic CNVs in individuals with a reported subclinical phenotype warranted investigation into the predictive value of chromosomal microarray in the diagnosis of ASD. In the context of this study, the “gold standard test” for ASD diagnosis was a combination of the ADOS, ADI-R and clinical impression, to which I compared the predictive value of chromosomal microarray. The specificity, which refers to the proportion of individuals without ASD who will have a negative genetic result, was high (98.6%), while the sensitivity was low (12.4%). Both of these values are intrinsic attributes of the diagnostic test in question and indicate that it is suitable to detect a true negative (i.e. high specificity), but not a true positive (i.e. low sensitivity). This is in accordance with current clinical observations that a genetic cause for ASD can be found in 10-20% of cases. The PPV indicates the proportion of individuals with a positive test who have ASD, which is 87.5%. The use of microarray in a clinical setting can thus confidently indicate the presence of ASD or a subclinical phenotype. However, given that the NPV was modest (57.9%), a negative result does not rule out ASD, and not all individuals that are genotyped will be given a diagnosis.

4.1 Current Study Limitations

While this infant sibling cohort is unique in its collection of longitudinal and detailed phenotype assessments, there were certain limitations intrinsic to the study design and execution that prevented me from making conclusive genotype-phenotype correlations. Firstly, there was a lack of consistency across and within sites in how ASD was diagnosed. While all sites used a combination of the ADOS, ADI-R and clinical impression to yield a best estimate diagnosis, the cut-off scores and interpretations of test results varied between clinicians. There were no criteria that were consistently applied to determine if an individual presented with subclinical ASD symptoms, thus making it possible that the number of individuals with a BAP was underestimated. Another limitation with respect to the interpretation of clinical information was the inconsistency in the number of individuals who had scores available for each psychometric test. For example, only 225 of 320 parents completed the self-assessed BAPQ, which restricted my ability to determine if the presence of inherited presumed pathogenic CNVs segregated with autistic traits in parents. Lastly, there was no consistent clinical work-up conducted on infant siblings after 3 years of age, which prevented me from being able to identify individuals that qualified for a diagnosis later in childhood. This information would specifically be useful to track the developmental progress of non-ASD infant siblings possessing pathogenic CNVs. For example, the unaffected male with a de novo 610 Kb deletion at locus 16p11.2 was deemed typically developing at 3 years of age, but no clinical follow-up was provided. Given the pathogenic nature of this CNV (Hanson et al. 2015), psychiatric impairments not perceptible early in childhood may have surfaced later in development.

Limitations of this study extended to the identification of CNVs and interpreting their impact on the autistic phenotype. While all samples were genotyped on the same high-density microarray using an established bioinformatics pipeline (Gazzellone et al. 2014), CNVs smaller than 15 Kb could not be reliably identified. Integration of WGS could assist in calling CNVs as small as a few hundred base pairs in size (Jiang et al. 2013; Yuen et al. 2015). This technology would also allow for the detection of single-nucleotide variants and small insertions and deletions, which would likely increase the diagnostic yield (Stavropoulos 2016; Tammimies et al. 2015). Lastly, the small sample size of affected and unaffected siblings in this study made interpretation of statistical tests difficult. For example, the de novo rates were calculated with fewer than 200 individuals in each category (i.e. affected and unaffected siblings) and the presence of one or two additional CNVs could misrepresent the statistical significance between de novo rates. As described earlier in the discussion, this could alternatively explain the discrepancy between the percentages of unaffected individuals with a de novo CNV in this study versus previously published studies.
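To illustrate how sensitive a rate is at this sample size, the sketch below recomputes the de novo rate among the 138 typically developing siblings (2/138 as reported in the Discussion) under the hypothetical scenario that one or two additional carriers had been observed. The function name is illustrative only.

```python
def de_novo_rate(carriers, total):
    """Percentage of individuals carrying a de novo CNV."""
    return 100 * carriers / total


observed, total = 2, 138  # typically developing siblings in this study
rates = {}
for extra in (0, 1, 2):  # hypothetical additional carriers
    rates[extra] = de_novo_rate(observed + extra, total)
    print(f"{observed + extra}/{total} carriers: {rates[extra]:.1f}%")
```

With two additional carriers the rate would double, from 1.4% to 2.9%, which shows how strongly a handful of events can shift comparisons between groups of this size.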


Conclusions and Significance

In summary, microarray analysis of 199 families identified a genetic cause of ASD in 11.4% of families. In seven families a presumed pathogenic CNV was shared between a proband and infant sibling. Five of the siblings were affected by ASD and two were deemed unaffected by ASD but displayed a subclinical phenotype and cognitive impairment. Additional CNVs found uniquely in eight siblings were deemed informative. As such, this data presents one of the first lines of evidence for the presence of biological markers for ASD. Specifically, chromosomal microarray has been shown to have predictive potential for infants for whom there is a family history of ASD.

Although subclinical phenotypes are observed at an increased frequency in first-degree relatives of those with ASD and recurrence risk is estimated to be 18.7%, this study is one of the first to correlate genetic findings with longitudinal behavioral and cognitive clinical data. Performing a genetic test using a chromosomal microarray can be a method for identifying if a child is high-risk before the onset of symptoms. As such, they may qualify for early intervention which is recognized as the most effective treatment for ASD symptoms (Zwaigenbaum et al. 2015). As behavioral manifestations of ASD are not always evident in children prior to 3 years of age, the identification of a presumed pathogenic CNV can effectively foreshadow the presence of social and communication impairments and atypical behavior patterns at a critical point in development to allow for timely clinical intervention.


Future Directions

Neurodevelopmental disorders, such as ID and ASD, collectively affect more than 2% of the population. Their prevalence and comorbidity with other conditions have made ASD/ID a great public health concern and placed an increasing economic burden on families (Saunders et al. 2015). The main contributor to an optimal clinical outcome is the timely implementation of behavioral and medical interventions, such as the Early Start Denver Model, which can improve an individual’s developmental trajectory (Dawson et al. 2010; Zwaigenbaum et al. 2015). A critical part of early intervention is receiving an accurate diagnosis, ASD being a prime example. As was alluded to early on in this work, the genetic and environmental causes of ASD are multifactorial and extremely heterogeneous. This study has revealed that CNVs may serve as biological markers for ASD where there is a relevant family history. Gaining a better understanding of the genetic underpinnings of ASD by assessing all classes of variation in the genome can further inform diagnosis and treatment options for children.

Intermediate structural variation in the genome constitutes some of the most identifiable genetic causes of ASD. Chromosomal microarray is considered the first-tier diagnostic test for ASD and is currently the best platform to identify this class of variation in the genome (Miller et al. 2010). Clinical diagnostic yields have ranged from 7-9.3% (Tammimies et al. 2015), with microarray being especially beneficial for finding causes for idiopathic ASD. Whole-exome sequencing is currently being implemented in the clinic and has been shown to have a molecular diagnostic yield similar to that of chromosomal microarray, and is paving the way for introduction of WGS in ASD diagnostics. A recent study comparing WGS and clinical microarray deduced that WGS has a diagnostic yield of 34%, roughly fourfold higher than the 8% yield for microarray, when investigating congenital malformations and neurodevelopmental disorders (Stavropoulos 2016). As such, it is expected that the implementation of WGS as a standard of care in ASD diagnostics will increase the identifiable genetic causes of ASD in cases.

Infant siblings of children with ASD have been shown to possess subclinical traits more often than infant siblings with no family history of ASD. Given that a causal genetic event can be identified in ~10% of ASD cases, it is reasonable to suggest that genetic testing of an infant can disclose biological markers for ASD. Conducting microarray analysis on a cohort where cognitive, social and behavioral development is longitudinally assessed has allowed for effective correlation between genetic liabilities and the earliest manifestations of ASD. The work reported here on 199 families acts as a proof of concept to launch an investigation on all classes of variation in the human genome using WGS in a larger number of individuals.

In an attempt to gain a better understanding of the genetic architecture of ASD, functional experiments are helpful to determine how the aberrant functioning of risk genes impacts neurobiology. I therefore aim to couple a broad query of the genome using microarrays and WGS with specific functional experiments on a gene(s) of interest to explain the role of a given gene in nervous system development. Many in vivo and in vitro assays can be used to assess this. These include behavioral characterization of knock-out mice and analyses of synapse formation and electrical signal transduction in neurons.

6.1 WGS of Infant Siblings Cohort

Given the increased diagnostic yield using WGS compared to microarray, the use of the former technology will enable the identification of variants ranging in size from a single nucleotide to hundreds of thousands of nucleotides. Research has indicated that the ASD risk genes identified using WES and microarray technology do differ, furthering the need to query the genome for all classes of variants to gain a more solid understanding of the genetic architecture of ASD. Approximately 315 families have been registered with the BSRC and consented to genetic analysis. All Canadian families that were included in this study have already been whole-genome sequenced, and an estimated 100 families from the US contingent will additionally be sequenced upon approval from each institute’s research and ethics board.

Genomic DNA will be extracted from whole blood for all samples and sequenced at TCAG using Illumina X-Ten sequencers (Illumina, San Diego, CA). DNA libraries will be created using the TruSeq Nano DNA preparatory kit, whereby adaptor nucleotides will be ligated with purified DNA with an insert size of ~350 bp. After filtering using a pre-established in-house protocol, sequence reads will be aligned to the reference genome (build GRCh37). De novo SNVs and indels will be identified using the random forest method (ForestDNM). De novo structural variants and CNVs will be called using a combination of algorithms (SegSeq, ERDS and Meerkat). All SNVs and indels will be validated by Sanger sequencing and CNVs will be validated using the probe-based TaqMan® Copy Number Assay. Over 75% of samples that are being processed for WGS will also have been processed on the Affymetrix CytoScan HD platform. Previous data, as presented in Fig. 7, show that microarrays are still optimal for detecting CNVs >15 Kb, while WGS is more efficient in identifying CNVs between 1 and 10 Kb in size, hence the rationale for integrating both technologies in detecting this class of structural variation.
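The size-based division of labour between the two platforms can be sketched as a simple lookup. The thresholds come from the text (microarray for CNVs >15 Kb; WGS for CNVs of 1-10 Kb and, per the limitations section, down to a few hundred base pairs). The function name is hypothetical, and the label returned for the 10-15 Kb range, which the text does not assign to either platform, is an assumption of this sketch.

```python
KB = 1000  # base pairs per kilobase


def preferred_platform(size_bp):
    """Suggest a CNV-detection platform from CNV size in base pairs."""
    if size_bp > 15 * KB:
        return "microarray"   # stated in the text: optimal for CNVs >15 Kb
    if size_bp <= 10 * KB:
        return "WGS"          # stated in the text: efficient for small CNVs
    return "unspecified"      # 10-15 Kb: not assigned in the text (assumption)


for size in (500, 5 * KB, 12 * KB, 228 * KB):
    print(size, preferred_platform(size))
```

In an integrated workflow, calls from both platforms would be merged rather than chosen exclusively; the lookup only indicates which platform is expected to be more reliable at a given size.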

The resulting DNA sequences will be functionally annotated using ANNOVAR. Sequence conservation will be reported using UCSC PhyloP and phastCons scores. A suite of prediction algorithms will be used to assess the impact of variants on the sequence (SIFT, PolyPhen2, Mutation Assessor, Mutation Taster, CADD). I will prioritize putative loss of function (LoF) mutations in both coding and non-coding regions of the genome. These deleterious variants include missense, nonsense, splice site mutations and insertions/deletions (indels). An excess burden of LoF mutations has been noted to contribute substantially to the ASD phenotype, among other neuropsychiatric disorders.
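As a minimal sketch of the prioritization step, the filter below keeps variants whose consequence falls in one of the classes named above. The record layout and the `consequence` field are hypothetical and do not reflect the actual output format of ANNOVAR or of any tool listed here.

```python
# Consequence classes named in the text as candidate deleterious variants.
PRIORITY_CLASSES = {"missense", "nonsense", "splice_site", "indel"}


def prioritize(variants):
    """Keep variants whose (hypothetical) 'consequence' field is a priority class."""
    return [v for v in variants if v["consequence"] in PRIORITY_CLASSES]


# Made-up example records, for illustration only.
example = [
    {"id": "var1", "consequence": "synonymous"},
    {"id": "var2", "consequence": "nonsense"},
    {"id": "var3", "consequence": "splice_site"},
]
kept = [v["id"] for v in prioritize(example)]
print(kept)
```

In practice the conservation and pathogenicity scores mentioned above would be applied as further filters on the retained variants.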

Importantly, the TCAG will also acquire the Infant Siblings Cohort phenotype database, which will include longitudinal data on all infant siblings. As in the CNV study, the ADOS, Mullen Scales and Vineland Scales will be used to assess ASD, cognitive and behavioral outcomes. As previously alluded to, in-depth, longitudinal phenotype data is what makes this cohort unique and is currently one of the best tools with which to correlate genotype with the earliest signs of ASD. Acquiring the database in full will further enhance our ability to utilize this information and permit us to assess the presence of specific traits (e.g. gross motor dysfunction, verbal delay) across the whole cohort with the presence of presumed pathogenic variants. Current limitations in access to the phenotype information have restricted this portion of the analysis to a case-by-case basis. In the next phase of the project, we will be able to accurately determine how infant siblings harboring variants of interest to ASD develop. This dual approach will continue to allow us to correlate subtypes of symptoms with genetic information to properly assess how genetic testing can predict a clinical outcome.

6.2 Investigating the Role of PTCHD1-AS in ASD

The PTCHD1 gene, located at Xp22.11, is considered to be one of the most highly-penetrant ASD risk genes. Evidence from chromosomal microarray studies has shown that disruptions of this gene and neighboring gene DDX53 segregate predominantly with affected males. PTCHD1 is 61.9 Kb in length and expressed primarily in the brain and testis. The resulting transmembrane protein contains a patched-related domain which interacts with the ligand-binding region of the Hedgehog receptor and is found throughout the developing neural tube, that pathway being critical in central nervous system patterning (Jessell 2000). Over 40% of individuals harboring a mutation in this locus are reported to have minor-to-moderate ID and autistic features (Chaudhry et al. 2015). A recent knock-out mouse model in which PTCHD1 was deleted from the thalamic reticular nucleus indicated that aberration in gene expression can lead to attention deficits, sleep disruption and hyperactivity, none of which are classical features of ASD. Evidence implicating PTCHD1’s long non-coding RNA counterpart, PTCHD1-AS, in ASD has been mounting as more reports of disruptions of this locus in clinical cases have surfaced. An initial search of 996 ASD families revealed eight deletions of PTCHD1-AS in unrelated males with ASD (~0.8%) (Noor et al. 2010; Pinto et al. 2010). This study further identified an additional mutation spanning the third exon of the second isoform (Fig. 14). Although multiple isoforms of the long non-coding RNA have been detected, the third exon is common to nearly all of them and is the exon most frequently affected in ASD cases (7/15 deletions). While assessment of 10,486 controls demonstrated that the preponderance of CNVs in this locus are harbored by females and affected the second exon of the first isoform alone, two males appeared to harbor deletions with one of the breakpoints hovering close to the exon 3 boundary, although microarray resolution (≤ 1 million probes) could have overestimated the CNV breakpoints. Experiments can therefore be undertaken to not only assess the role of PTCHD1-AS in ASD, but specifically of exon 3.

The first step in the two-pronged approach to assessing the contribution of this exon to the ASD phenotype is to create a mouse model in which this exon is disabled. The exon in question will be deleted using the CRISPR-Cas9 system. Classic behavioral analyses in mice, analogous to the core deficits observed in individuals with ASD, will then be performed.

The creation of the mouse model will also allow me to probe the neurocircuitry of the brain and its morphology. Using immunohistochemical assays, I will analyze neurons to see how dendritic branching and synapse formation are maintained. Electrophysiology assays on brain slices will also allow us to determine synapse integrity. Assessing these parameters at various stages of mouse development can give great insight into the role of this long non-coding RNA in brain development.

Figure 15: Previously Reported and Novel Deletions Overlapping PTCHD1-AS. (A) This figure plots the maternally-inherited deletion in the male proband and related male infant sibling along with other novel and previously reported CNVs. (B) Pedigree of the two male siblings harboring a maternally-inherited deletion.


Appendices

7.1 Rare, Exonic CNVs
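Rows of the table in this appendix can be read programmatically. The sketch below assumes one row per line with space-separated fields in the column order ID, Status, Sex, Chr, Start, End, Size, CNV, Genes, and a Status of "Proband", "Aff. Sib" or "Unaff. Sib". The function names are illustrative. The check that Size equals End - Start + 1 (inclusive coordinates) is inferred from the tabulated values rather than stated in the text.

```python
def parse_cnv_row(line):
    """Split one appendix row into named fields."""
    tokens = line.split()
    # 'Aff. Sib' and 'Unaff. Sib' occupy two tokens; 'Proband' occupies one.
    n_status = 1 if tokens[1] == "Proband" else 2
    status = " ".join(tokens[1:1 + n_status])
    sex, chrom, start, end, size, cnv_type = tokens[1 + n_status:7 + n_status]
    # Gene symbols are semicolon-separated; rejoin any that were split by spaces.
    genes = "".join(tokens[7 + n_status:]).split(";")
    return {
        "id": tokens[0],
        "status": status,
        "sex": sex,
        "chr": chrom,
        "start": int(start),
        "end": int(end),
        "size": int(size.replace(",", "")),
        "cnv": cnv_type,
        "genes": genes,
    }


def size_is_consistent(row):
    """True when Size equals End - Start + 1 (assumed inclusive coordinates)."""
    return row["end"] - row["start"] + 1 == row["size"]


row = parse_cnv_row(
    "1-0062-004 Unaff. Sib F 2 51141571 51370150 228,580 Deletion NRXN1"
)
print(row["status"], row["cnv"], row["genes"], size_is_consistent(row))
```

The example row is the 228 Kb NRXN1 deletion in the unaffected sibling discussed in the Results.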

ID Status Sex Chr Start End Size CNV Genes

1-0001-003 Proband F 7 98755291 98807839 52,549 Duplication KPNA7
1-0001-004 Unaff. Sib M 19 22341905 23204005 862,101 Duplication ZNF676;LOC100996349;ZNF729;ZNF728;LOC101929164;LINC01233;LOC101929124;GOLGA2P9;ZNF98;ZNF99;ZNF492
1-0001-004 Unaff. Sib M 7 98755292 98807839 52,548 Duplication KPNA7
1-0001-004 Unaff. Sib M 9 28844606 28953316 108,711 Deletion MIR873;LINGO2;MIR876
1-0001-005 Unaff. Sib F 19 22341905 23207386 865,482 Duplication ZNF676;LOC100996349;ZNF729;ZNF728;LOC101929164;LINC01233;LOC101929124;GOLGA2P9;ZNF98;ZNF99;ZNF492
1-0001-005 Unaff. Sib F 7 98758968 98807839 48,872 Duplication KPNA7
1-0026-003 Proband M 4 8290793 8569232 278,440 Duplication ACOX3;TRMT44;HTRA3
1-0027-003 Proband M 15 29010256 30386553 1,376,298 Duplication LOC100289656;TJP1;APBA2;GOLGA8J;GOLGA6L7P;PDCD6IPP2;NDNL2;FAM189A1
1-0027-003 Proband M 16 19598397 19676290 77,894 Duplication C16orf62
1-0027-003 Proband M 12 95530700 95585836 55,137 Deletion FGD6
1-0027-003 Proband M 2 1072940 1091421 18,482 Deletion SNTG2
1-0027-004 Unaff. Sib F 15 29023368 31073669 2,050,302 Duplication LOC100289656;TJP1;ARHGAP11B;APBA2;GOLGA8T;GOLGA8J;GOLGA8H;GOLGA8R;GOLGA6L7P;PDCD6IPP2;LOC100288637;ULK4P1;ULK4P2;NDNL2;DKFZP434L187;ULK4P3;FAM189A1;CHRFAM7A
1-0027-004 Unaff. Sib F 16 19598506 19676278 77,773 Duplication C16orf62
1-0034-003 Proband M 3 154335929 154905740 569,812 Deletion MME
1-0034-003 Proband M 7 12236075 12380824 144,750 Deletion VWDE;TMEM106B
1-0034-004 Unaff. Sib F 3 154335929 154905740 569,812 Deletion MME
1-0040-003 Proband F 10 68490104 68543430 53,327 Deletion CTNNA3
1-0040-003 Proband F 18 380185 926925 546,741 Duplication TYMSOS;CETN1;TYMS;CLUL1;COLEC12;YES1;ADCYAP1;ENOSF1
1-0040-003 Proband F 3 2609217 2663652 54,436 Duplication CNTN4
1-0040-004 Aff. Sib M 10 68490104 68543430 53,327 Deletion CTNNA3
1-0040-004 Aff. Sib M 3 2609217 2679515 70,299 Duplication CNTN4
1-0042-004 Aff. Sib F 8 467166 495888 28,723 Deletion TDRP
1-0046-003 Proband M 13 115004474 115035651 31,178 Duplication CDC16
1-0046-004 Aff. Sib M 7 140161144 140187482 26,339 Deletion MKRN1
1-0046-004 Aff. Sib M 13 115007056 115031495 24,440 Duplication CDC16
1-0049-003 Proband M 1 222697088 224104993 1,407,906 Deletion TAF1A-AS1;TAF1A;HHIPL2;BROX;TP53BP2;AIDA;FAM177B;SUSD4;CAPN2;MIA3;TLR5;DISP1;CCDC185;CAPN8
1-0049-004 Unaff. Sib F 1 222697088 224104993 1,407,906 Deletion TAF1A-AS1;TAF1A;HHIPL2;BROX;TP53BP2;AIDA;FAM177B;SUSD4;CAPN2;MIA3;TLR5;DISP1;CCDC185;CAPN8
1-0061-003 Proband M 1 196195407 196229968 34,562 Duplication KCNT2
1-0061-003 Proband M 6 166640592 166730657 90,066 Duplication PRR18;LOC101929297


1-0061-003 Proband M 9 91951354 92016916 65,563 Deletion SEMA4D;SECISBP2
1-0061-004 Unaff. Sib M 11 72938407 72954453 16,047 Deletion P2RY2
1-0061-004 Unaff. Sib M 3 1080020 2886098 1,806,079 Deletion CNTN6;CNTN4;CNTN4-AS2
1-0061-004 Unaff. Sib M 9 91951354 92014625 63,272 Deletion SEMA4D;SECISBP2
1-0062-003 Proband M 2 51141571 51363855 222,285 Deletion NRXN1
1-0062-003 Proband M 3 197221740 197312425 90,686 Duplication BDH1
1-0062-004 Unaff. Sib F 2 51141571 51370150 228,580 Deletion NRXN1
1-0062-004 Unaff. Sib F 6 156975 237215 80,241 Deletion LOC285766
1-0069-003 Proband M 2 15104101 15343456 239,356 Duplication NBAS
1-0070-004 Unaff. Sib M 1 46307027 46550052 243,026 Duplication MAST2;PIK3R3
1-0070-004 Unaff. Sib M 19 53882848 53906447 23,600 Duplication ZNF765;ZNF525
1-0070-004 Unaff. Sib M 5 140517382 140576241 58,860 Deletion PCDHB9;PCDHB8;PCDHB7;PCDHB6;PCDHB5;PCDHB10;PCDHB17;PCDHB16
1-0074-003 Proband M 4 108135377 108575736 440,360 Duplication PAPSS1
1-0074-004 Unaff. Sib F 4 108132348 108578164 445,817 Duplication PAPSS1
1-0075-003 Proband M 7 157935822 157985571 49,750 Duplication PTPRN2
1-0075-003 Proband M X 8430920 8536493 105,574 Duplication VCX3B;KAL1
1-0076-003 Proband M 9 71717407 71742989 25,583 Duplication TJP2
1-0076-004 Unaff. Sib M X 154523001 154780118 257,118 Duplication F8A2;CLIC2;F8A1;F8A3;LOC101927830;H2AFB3;H2AFB2;H2AFB1;MIR1184-1;MIR1184-2;MIR1184-3;TMLHE-AS1;TMLHE
2-1019-003 Proband M 15 29429088 29453476 24,389 Duplication FAM189A1
2-1019-004 Unaff. Sib M 15 29429017 29453464 24,448 Duplication FAM189A1
2-1090-004 Aff. Sib F 2 179087487 179152313 64,827 Deletion OSBPL6
2-1090-005 Unaff. Sib M 10 43153955 43218552 64,598 Duplication LINC01518
2-1090-005 Unaff. Sib M 14 61339139 61366019 26,881 Duplication MNAT1
3-4058-001 Proband M 1 24877458 24939645 62,188 Duplication LOC100506985;NCMAP
3-4058-004 Unaff. Sib F 17 51715300 52043990 328,691 Duplication KIF2B
3-4058-004 Unaff. Sib F 9 115778294 115819870 41,577 Deletion ZFP37
3-4098-001 Proband M 7 96066180 96119435 53,256 Deletion C7orf76
3-4098-004 Unaff. Sib M 13 43535714 43734722 199,009 Duplication DNAJC15;LINC00400;EPSTI1
3-4098-004 Unaff. Sib M 2 121051838 121069099 17,262 Deletion RALB
3-4098-004 Unaff. Sib M 4 116612 355132 238,521 Duplication ZNF876P;ZNF732;ZNF718;ZNF141
3-4098-004 Unaff. Sib M 7 96066180 96119435 53,256 Deletion C7orf76
3-4139-001 Proband F 1 145016706 145068480 51,775 Duplication PDE4DIP;NBPF9;NBPF20
3-4139-001 Proband F 7 134220442 134263949 43,508 Deletion AKR1B10;AKR1B15
3-4139-004 Aff. Sib M 7 134220257 134263937 43,681 Deletion AKR1B10;AKR1B15
3-4168-001 Proband M 2 176981810 177003022 21,213 Deletion HOXD-AS2;HOXD8;HOXD9;HOXD10
3-4168-004 Aff. Sib M 2 154881876 155011019 129,144 Deletion GALNT13
3-4168-004 Aff. Sib M X 111665194 111789348 124,155 Duplication ZCCHC16
3-4168-006 Aff. Sib F 2 154876621 155011208 134,588 Deletion GALNT13
3-4168-007 Aff. Sib M 2 154876609 155011019 134,411 Deletion GALNT13
3-4168-007 Aff. Sib M 2 176981810 177003022 21,213 Deletion HOXD-AS2;HOXD8;HOXD9;HOXD10
3-4168-007 Aff. Sib M X 111690557 111789352 98,796 Duplication ZCCHC16

3-4310-001 Proband M 13 26074996 26204258 129,263 Duplication ATP8A2
3-4310-001 Proband M 16 70287667 70305117 17,451 Deletion AARS
3-4310-001 Proband M 19 34514046 34704129 190,084 Duplication LSM14A
3-4310-006 Unaff. Sib F 19 34514046 34692586 178,541 Duplication LSM14A
3-4310-007 Unaff. Sib M 16 70287667 70305117 17,451 Deletion AARS
3-4404-001 Proband M 11 104126965 104542802 415,838 Deletion LOC102723895
3-4404-004 Aff. Sib M 11 104126965 104542802 415,838 Deletion LOC102723895
3-4425-004 Unaff. Sib F 4 69316107 69353300 37,194 Deletion TMPRSS11E
3-4425-005 Unaff. Sib F 4 69316107 69353300 37,194 Deletion TMPRSS11E
3-4425-007 Unaff. Sib M 4 69316107 69353312 37,206 Deletion TMPRSS11E
3-4453-005 Unaff. Sib M 15 57859275 59410122 1,550,848 Duplication CCNB2;LIPC;FAM63B;SLTM;GCOM1;HSP90AB4P;AQP9;ADAM10;LOC101928694;MYZAP;RNF111;POLR2M;ALDH1A2
3-4453-005 Unaff. Sib M 2 713331 944591 231,261 Duplication LINC01115;LOC101060385
3-4676-001 Proband M 10 109821101 109879903 58,803 Deletion LINC01435
3-8161-001 Proband M 22 49067525 49109069 41,545 Deletion FAM19A5
3-8161-004 Unaff. Sib M 11 123381452 123400246 18,795 Duplication GRAMD1B
3-8257-001 Proband M 12 17072246 17174202 101,957 Duplication SKP1P2
3-8257-001 Proband M 13 40786024 40945400 159,377 Duplication LINC00548;LINC00598
3-8257-001 Proband M 18 28726019 28745634 19,616 Deletion DSC1;DSCAS
3-8257-001 Proband M X 23043876 23134236 90,361 Deletion PTCHD1-AS
3-8257-004 Aff. Sib M X 23043876 23124980 81,105 Deletion PTCHD1-AS
4-0031-001 Proband M 22 21318384 21335070 16,687 Deletion AIFM3;LOC101928891
4-0031-004 Aff. Sib M 16 8899461 8950829 51,369 Duplication PMM2;CARHSP1
4-0049-001 Proband M 16 77927974 78594966 666,993 Duplication VAT1L;CLEC3A;WWOX
4-0049-001 Proband M 2 149093863 149151358 57,496 Deletion MBD5
4-0049-001 Proband M 4 174022301 174174689 152,389 Deletion GALNT7
4-0049-001 Proband M 8 9019023 9117819 98,797 Deletion LOC101929128
4-0049-004 Aff. Sib M 2 149094015 149151358 57,344 Deletion MBD5
4-0049-004 Aff. Sib M 5 49441945 50122950 681,006 Duplication PARP8;EMB
4-0049-004 Aff. Sib M 8 9019023 9117819 98,797 Deletion LOC101929128
4-0078-001 Proband M 1 241867791 242197567 329,777 Duplication MAP1LC3C;EXO1;WDR64;BECN1P1
4-0078-001 Proband M 19 23609726 24149545 539,820 Duplication ZNF675;ZNF681;RPSAP58;ZNF726
4-0078-001 Proband M 4 3675815 3750273 74,459 Duplication LOC100133461
4-0078-004 Aff. Sib M 1 241867791 242197567 329,777 Duplication MAP1LC3C;EXO1;WDR64;BECN1P1
4-0078-004 Aff. Sib M 9 8699245 8780048 80,804 Deletion PTPRD
4-0094-001 Proband M 10 91568315 92036154 467,840 Deletion LINC01375;LINC00865
4-0094-004 Unaff. Sib F 4 186431009 187040907 609,899 Deletion TLR3;SORBS2;PDLIM3
4-0095-001 Proband F 1 100195249 100513446 318,198 Duplication AGL;HIAT1;FRRS1;SLC35A3
4-0132-001 Proband M 6 95982530 96072192 89,663 Deletion MANEA-AS1;MANEA
4-0132-004 Unaff. Sib F 10 133435388 133677330 241,943 Duplication LINC01164
4-0132-004 Unaff. Sib F 6 95982530 96071088 88,559 Deletion MANEA-AS1;MANEA
4-0150-004 Unaff. Sib F 1 179308950 179333315 24,366 Deletion SOAT1
5-0025-004 Proband M 7 153499963 153671436 171,474 Duplication DPP6

5-0059-003 Proband M 7 12352210 12443042 90,833 Deletion VWDE
5-0059-004 Unaff. Sib F 7 12352549 12443042 90,494 Deletion VWDE
5-0083-003 Proband M 17 1234258 1259930 25,673 Duplication YWHAE
5-0083-003 Proband M 7 108606430 119657890 11,051,461 Deletion FOXP2;EIF3IP1;MIR6132;IMMP2L;LOC100996249;LINC01393;CTTNBP2;DOCK4;LINC00998;LINC01510;LSMEM1;CAV2;CAV1;ANKRD7;ST7-AS1;PPP1R3A;LINC01392;ST7;TES;GPR85;ST7-OT3;LSM8;TFEC;ST7-OT4;MDFIC;ASZ1;DOCK4-AS1;IFRD1;MIR3666;TMEM168;CAPZA2;ST7-AS2;C7orf60;WNT2;LRRN3;MET;ZNF277;LOC101928036;CFTR;LOC101928012
5-0092-003 Proband M 12 40497609 40558050 60,442 Duplication SLC2A13
5-0092-005 Unaff. Sib M 18 70622593 70958283 335,691 Deletion LOC400655
5-0092-005 Unaff. Sib M 7 44814172 44840028 25,857 Deletion PPIA
5-0112-003 Proband M 4 189259332 189856546 597,215 Duplication LINC01060
5-0112-004 Aff. Sib M 2 202057852 202077627 19,776 Deletion CASP10
5-0112-004 Aff. Sib M 4 99058771 99164872 106,102 Deletion STPG2
5-0112-004 Aff. Sib M 4 189259332 189856546 597,215 Duplication LINC01060
5-0112-004 Aff. Sib M 6 167693591 167787878 94,288 Duplication TTLL2;TCP10;UNC93A
5-0139-005 Aff. Sib F 19 42259136 42305800 46,665 Duplication CEACAM3;CEACAM6
5-0144-004 Proband M 5 149915676 149968071 52,396 Duplication NDST1
5-0171-003 Unaff. Sib F 1 104562739 104645166 82,428 Deletion LOC100129138
5-0171-004 Proband M 7 121769869 121832764 62,896 Deletion AASS
5-0171-005 Aff. Sib M 6 140507594 140865359 357,766 Deletion MIR3668
5-0224-003 Unaff. Sib M 11 18606095 18621686 15,592 Deletion UEVLD;SPTY2D1-AS1
5-0224-003 Unaff. Sib M 19 51409056 51441458 32,403 Duplication KLK4
5-0224-003 Unaff. Sib M 6 17537923 17575353 37,431 Duplication CAP2
5-0224-004 Proband M 11 18606136 18621552 15,417 Deletion UEVLD;SPTY2D1-AS1
5-0224-004 Proband M 6 17537923 17582196 44,274 Duplication CAP2
5-0228-003 Proband M 11 47002107 47089412 87,306 Duplication C11orf49
5-0228-003 Proband M X 104258709 104887352 628,644 Duplication TEX13A;IL1RAPL2
5-0228-004 Unaff. Sib F X 104258709 104887352 628,644 Duplication TEX13A;IL1RAPL2
5-0228-005 Unaff. Sib M 11 47005633 47081275 75,643 Duplication C11orf49
5-0244-003 Proband M 12 52233091 52295228 62,138 Duplication ANKRD33
5-0244-003 Proband M 21 44093000 44134899 41,900 Duplication PDE9A
5-0244-003 Proband M 21 44169804 44454004 284,201 Duplication NDUFV3;PDE9A;WDR4;PKNOX1
5-0244-003 Proband M 7 70769618 70786695 17,078 Deletion MIR3914-1;MIR3914-2;WBSCR17
5-0244-004 Unaff. Sib F 12 52233091 52290477 57,387 Duplication ANKRD33
5-0244-005 Unaff. Sib M 12 52233091 52294624 61,534 Duplication ANKRD33
5-0244-005 Unaff. Sib M 7 70769618 70786695 17,078 Deletion MIR3914-1;MIR3914-2;WBSCR17
5-0261-003 Proband F 1 145014997 145074808 59,812 Duplication PDE4DIP;NBPF9;NBPF20
5-0289-003 Proband F 11 103981953 104113681 131,729 Duplication PDGFD
5-0289-004 Aff. Sib M 11 103981833 104113681 131,849 Duplication PDGFD


5-0289-005 Unaff. Sib M 11 103981833 104113681 131,849 Duplication PDGFD
5-0291-005 Unaff. Sib M 22 44531006 44552071 21,066 Deletion PARVB
5-0298-003 Proband M 17 79330617 80189678 859,062 Duplication C17orf70;ACTG1;TSPAN10;ANAPC11;OXLD1;STRA13;RFNG;ARL16;MIR3186;NPLOC4;PYCR1;SLC25A10;GPS1;DUS1L;MIR6786;MAFGAS1;DCXR;FASN;LOC100130370;ARHGDIA;MAFG;BAHCC1;MIR4740;MRPL12;SIRT7;RAC3;CCDC57;P4HB;PCYT2;HGS;ALYREF;GCGR;MYADML2;FSCN2;ASPSCR1;CCDC137;NOTUM;FAM195B;SLC16A3;NPB;PPP1R27;PDE6G;LRRC45
5-0298-003 Proband M 17 80190109 80252756 62,648 Deletion SLC16A3;CSNK1D;MIR6787
5-0298-004 Unaff. Sib F 8 53403364 53503816 100,453 Duplication FAM150A
5-0299-004 Aff. Sib M 14 55318171 55337426 19,256 Deletion GCH1
5-0345-003 Aff. Sib F 2 97572665 97858150 285,486 Deletion FAHD2B;ANKRD36;FAM178B;LOC101927053
5-0345-005 Proband M 8 71529266 71663110 133,845 Duplication LACTB2;XKR9;LOC286190
5-0389-003 Proband M 1 178446190 178499350 53,161 Duplication RASAL2;TEX35
5-0389-003 Proband M 12 3500517 3615311 114,795 Deletion PRMT8;LOC100129223
5-0389-004 Aff. Sib M 1 178448416 178499350 50,935 Duplication RASAL2;TEX35
5-0389-004 Aff. Sib M 7 122259825 122275049 15,225 Deletion CADPS2
5-0405-003 Proband M 8 146231188 146295771 64,584 Duplication C8orf33;ZNF252P-AS1
5-0405-005 Unaff. Sib M 1 144987766 145080038 92,273 Duplication PDE4DIP;NBPF9;NBPF20
5-0414-003 Proband M 22 29167331 29420651 253,321 Duplication CCDC117;ZNRF3;XBP1
5-0438-003 Proband M 16 81200651 81347422 146,772 Duplication BCO1;PKD1L2
5-0438-004 Aff. Sib M 16 81199040 81358769 159,730 Duplication GAN;BCO1;PKD1L2
5-0438-004 Aff. Sib M 6 57206230 57560028 353,799 Duplication PRIM2
5-0439-003 Proband M 13 43469211 43790867 321,657 Duplication ENOX1;DNAJC15;LINC00400;EPSTI1
5-0439-004 Unaff. Sib F 13 43469157 43790867 321,711 Duplication ENOX1;DNAJC15;LINC00400;EPSTI1
5-0441-003 Proband M 1 152446839 152659909 213,071 Duplication LCE2C;LCE2B;LCE2D;LCE3D;LCE3E;LCE5A;LCE3A;LCE3B;LCE3C;CRCT1
5-0441-003 Proband M 11 71847808 71873232 25,425 Duplication FOLR3
5-0441-003 Proband M 15 75877357 75974622 97,266 Duplication IMP3;SNX33;SNUPN;CSPG4
5-0441-003 Proband M 16 89656577 89772674 116,098 Duplication CDK10;CPNE7;SPATA33;SPATA2L;CHMP1A;DPEP1
5-0441-004 Unaff. Sib M 1 152447346 152661368 214,023 Duplication LCE2C;LCE2B;LCE2D;LCE3D;LCE3E;LCE5A;LCE3A;LCE3B;LCE3C;CRCT1
5-0441-004 Unaff. Sib M 11 71847228 71873232 26,005 Duplication FOLR3
5-0441-004 Unaff. Sib M 15 75877215 75991451 114,237 Duplication IMP3;SNX33;SNUPN;CSPG4
5-0441-004 Unaff. Sib M 16 89656577 89772750 116,174 Duplication CDK10;CPNE7;SPATA33;SPATA2L;CHMP1A;DPEP1
5-0441-004 Unaff. Sib M 17 72212992 72257959 44,968 Duplication TTYH2
5-0455-003 Proband M 5 35015198 35077887 62,690 Duplication PRLR;AGXT2
5-0455-003 Proband M 8 31994987 32056041 61,055 Deletion NRG1;NRG1-IT1
5-0455-004 Aff. Sib M 10 34670467 35339324 668,858 Duplication PARD3;PARD3-AS1;CUL2
5-0455-004 Aff. Sib M 15 99509838 100158444 648,607 Duplication SYNM;HSP90B2P;LUNAR1;PGPEP1L;MEF2A;LRRC28;TTC23
5-0455-004 Aff. Sib M 8 31991227 32056041 64,815 Deletion NRG1;NRG1-IT1
5-0479-003 Proband M 6 118903189 119061445 158,257 Duplication CEP85L
5-0479-005 Unaff. Sib F 6 118903189 119061445 158,257 Duplication CEP85L


5-0479-006 Aff. Sib F 6 118903189 119061445 158,257 Duplication CEP85L
5-0479-006 Aff. Sib F 9 21432478 21448327 15,850 Deletion IFNA1
5-0479-007 Unaff. Sib M 6 118903189 119061163 157,975 Duplication CEP85L
5-0479-007 Unaff. Sib M 9 21432211 21448327 16,117 Deletion IFNA1
5-0514-003 Proband M 7 5392787 5522717 129,931 Deletion FBXL18;TNRC18
5-0514-004 Unaff. Sib F X 68321236 68450191 128,956 Duplication PJA1;LINC00269
5-0533-003 Proband M 22 36540261 36587223 46,963 Deletion APOL4;APOL3
5-0534-003 Proband M 6 34741917 34887258 145,342 Duplication TAF11;UHRF1BP1;ANKS1A
5-0534-004 Aff. Sib M 6 34741917 34887258 145,342 Duplication TAF11;UHRF1BP1;ANKS1A
5-0534-005 Unaff. Sib F 6 34717275 34887258 169,984 Duplication SNRPC;TAF11;UHRF1BP1;ANKS1A
5-0539-004 Unaff. Sib F 18 70908956 71475536 566,581 Duplication LOC400655;LOC100505817
5-0539-004 Unaff. Sib F 8 33317982 33345243 27,262 Deletion FUT10;MAK16
5-0541-005 Unaff. Sib M 12 10265411 10346627 81,217 Duplication TMEM52B;OLR1;CLEC7A
5-0541-006 Unaff. Sib M 12 10265411 10346627 81,217 Duplication TMEM52B;OLR1;CLEC7A
5-0541-006 Unaff. Sib M 20 30053255 30075050 21,796 Duplication REM1;DEFB124;LINC00028
5-0546-003 Proband M 13 43488448 44322314 833,867 Duplication ENOX1;ENOX1-AS2;DNAJC15;LINC00400;EPSTI1
5-0546-003 Proband M 7 15400822 15426956 26,135 Deletion AGMO
5-0546-004 Unaff. Sib M 13 43491472 44322302 830,831 Duplication ENOX1;ENOX1-AS2;DNAJC15;LINC00400;EPSTI1
5-0546-004 Unaff. Sib M 5 59169417 59245687 76,271 Duplication PDE4D
5-0546-004 Unaff. Sib M 7 15400822 15425411 24,590 Deletion AGMO
5-0548-003 Proband M 1 145023338 145079983 56,646 Duplication PDE4DIP;NBPF9;NBPF20
5-0548-003 Proband M 3 43314418 43376205 61,788 Duplication SNRK
5-0548-004 Unaff. Sib F 3 43314418 43369330 54,913 Duplication SNRK
5-0568-003 Proband M 3 121339020 121374032 35,013 Deletion HCLS1;FBXO40
5-0574-004 Aff. Sib M 3 140798675 140923121 124,447 Deletion SPSB4
5-0574-004 Aff. Sib M X 96763922 96832872 68,951 Duplication DIAPH2-AS1;DIAPH2
5-0580-003 Proband M 16 88724255 89304889 580,635 Duplication TRAPPC2L;CDH15;PABPN1L;LOC100129697;LOC339059;CDT1;CBFA2T3;CTU2;LOC400558;LINC00304;MIR4722;PIEZO1;ACSF3;GALNS;APRT;MVD;SNAI3;RNF166;SLC22A31;LOC100289580;ZNF778;SNAI3-AS1
5-0580-004 Aff. Sib M 16 47721456 47738637 17,182 Duplication PHKB
5-0595-003 Proband M 19 11912917 12069592 156,676 Duplication ZNF69;ZNF439;ZNF700;ZNF440;ZNF491
5-0595-003 Proband M 20 58742917 58930497 187,581 Duplication MIR646HG
5-0595-004 Aff. Sib M 10 43270409 43300158 29,750 Duplication BMS1
5-0595-004 Aff. Sib M 10 43394725 43534765 140,041 Duplication MIR5100;LINC01264
5-0595-004 Aff. Sib M 10 43651933 43721154 69,222 Duplication RASGEF1A;CSGALNACT2
5-0653-003 Proband M X 98706559 98860140 153,582 Deletion XRCC6P5
5-0653-005 Unaff. Sib M X 98706559 98860140 153,582 Deletion XRCC6P5
5-0653-006 Unaff. Sib F X 98713494 98860140 146,647 Deletion XRCC6P5
5-0704-003 Proband M 13 80947635 81597109 649,475 Deletion LINC00377
5-0707-004 Unaff. Sib M 16 20858130 20899174 41,045 Duplication DCUN1D3;LOC81691
5-0707-004 Unaff. Sib M 22 33692906 33733995 41,090 Duplication LARGE


5-0707-005 Unaff. Sib F 16 20858130 20899174 41,045 Duplication DCUN1D3;LOC81691
6-0007-001 Proband F 19 52938070 52985232 47,163 Deletion ZNF534;ZNF578
6-0007-004 Unaff. Sib F 19 52938070 52985232 47,163 Deletion ZNF534;ZNF578
6-0017-001 Proband M 1 15170927 15375033 204,107 Duplication KAZN
6-0017-004 Unaff. Sib M 1 15167599 15375033 207,435 Duplication KAZN
6-0020-001 Proband M 10 103123032 103348865 225,834 Duplication POLL;BTRC;DPCD
6-0020-001 Proband M 13 43487161 43810828 323,668 Duplication ENOX1;DNAJC15;LINC00400;EPSTI1
6-0020-004 Unaff. Sib F 10 103122804 103323347 200,544 Duplication BTRC
6-0020-004 Unaff. Sib F 13 43468368 43811763 343,396 Duplication ENOX1;DNAJC15;LINC00400;EPSTI1
6-0022-001 Proband M 16 7101377 7131548 30,172 Deletion RBFOX1
6-0022-004 Unaff. Sib F 10 34759359 34976440 217,082 Deletion PARD3
6-0022-004 Unaff. Sib F 16 7101376 7131548 30,173 Deletion RBFOX1
6-0022-004 Unaff. Sib F 17 39504593 39526296 21,704 Deletion KRT33B;KRT33A
6-0022-004 Unaff. Sib F 18 7079996 7576095 496,100 Duplication LRRC30;LAMA1;PTPRM
6-0022-004 Unaff. Sib F 20 25442598 25593002 150,405 Duplication NINL
6-0024-001 Proband M 2 198148570 198236716 88,147 Duplication ANKRD44;ANKRD44-IT1
6-0024-004 Unaff. Sib M 10 43734650 43861722 127,073 Duplication RASGEF1A
6-0025-001 Proband F 17 41860755 41877231 16,477 Duplication C17orf105
6-0025-001 Proband F 6 65515528 65542276 26,749 Duplication EYS
6-0025-005 Aff. Sib M 22 51061529 51087096 25,568 Duplication ARSA
6-0025-005 Aff. Sib M 6 65515528 65542264 26,737 Duplication EYS
6-0033-004 Unaff. Sib M 10 102874403 103093618 219,216 Duplication LBX1-AS1;LBX1;LOC101927419;TLX1;TLX1NB;LINC01514
6-0033-004 Unaff. Sib M 4 167490482 167663175 172,694 Duplication SPOCK3
6-0033-004 Unaff. Sib M 7 32210330 32426455 216,126 Duplication PDE1C
6-0034-001 Proband F 19 45789286 45850149 60,864 Duplication MARK4;CKM;KLC3
6-0034-004 Unaff. Sib M 19 45789159 45850149 60,991 Duplication MARK4;CKM;KLC3
6-0034-004 Unaff. Sib M 4 186932780 187133823 201,044 Duplication TLR3;CYP4V2;FAM149A;FLJ38576
6-0050-004 Unaff. Sib F 19 21986293 22033652 47,360 Deletion ZNF43
6-0050-004 Unaff. Sib F 9 118397913 118625897 227,985 Deletion LOC101928775
6-0056-001 Proband M 15 66356282 66790939 434,658 Duplication TIPIN;MEGF11;MAP2K1;SNAPC5;SCARNA14;DIS3L
6-0056-001 Proband M 3 36948417 37016476 68,060 Deletion TRANK1
6-0056-001 Proband M 6 135959674 136006301 46,628 Duplication LINC00271
6-0056-004 Unaff. Sib M 6 135959674 136006516 46,843 Duplication LINC00271
6-0062-001 Proband M 6 27656162 27695329 39,168 Deletion LINC01012
6-0063-001 Proband F 10 43017191 43038761 21,571 Duplication ZNF37BP
6-0063-001 Proband F X 7124021 7144389 20,369 Duplication STS
6-0063-004 Unaff. Sib M 4 77475932 77521826 45,895 Duplication SHROOM3;MIR4450
6-0112-004 Unaff. Sib M 8 8584457 8707197 122,741 Deletion MFHAS1
6-0119-001 Proband M 1 151144910 151243214 98,305 Deletion PSMD4;TMOD4;PIP5K1A;VPS72
6-0119-001 Proband M 12 18226577 18316993 90,417 Deletion RERGL
6-0124-001 Proband M 4 122074943 122097469 22,527 Deletion TNIP3
6-0124-004 Aff. Sib M 17 73514076 73713995 199,920 Duplication RECQL5;LLGL2;SMIM6;SMIM5;SAP30BP;TSEN54;MYO15B

6-0134-001 Proband F 13 73577227 73701559 124,333 Duplication KLF5;PIBF1
6-0134-004 Unaff. Sib F 13 73577966 73692789 114,824 Duplication KLF5;PIBF1
6-0135-004 Aff. Sib M 16 28120654 28271103 150,450 Duplication XPO6
6-0135-004 Aff. Sib M 7 65415319 65470691 55,373 Duplication GUSB;VKORC1L1
6-0135-005 Aff. Sib F 16 28128093 28271103 143,011 Duplication XPO6
6-0135-006 Aff. Sib F 16 28128093 28271103 143,011 Duplication XPO6
6-0135-006 Aff. Sib F 22 50538303 50556825 18,523 Deletion MOV10L1
6-0139-001 Proband F 19 48462617 51107899 2,645,283 Duplication MIR4751;LIN7B;SNAR-D;SIGLEC11;MAMSTR;CD37;IZUMO1;MED25;PIH1D1;MYH14;ASPDH;NOSIP;NAPSA;NAPSB;RPL18;GRWD1;SNARG2;LIG1;ADM5;ZNF473;ELSPBP1;LHB;VRK3;PNKP;MIR4749;PRMT1;BCAT2;EMC10;SIGLEC16;ALDH16A1;LRRC4B;MIR6800;CCDC155;MIR6799;NUCB1;HRC;MIR4750;LMTK3;MIR6798;PRRG2;MIR150;IZUMO2;PLEKHA4;DKKL1;DBP;BSPH1;HSD17B14;PTOV1;FUZ;SNORD35A;SNORD35B;SYNGR4;RRAS;PRR12;RUVBL2;TULP2;SEC1P;PTOV1AS2;C19orf68;PTOV1AS1;SNARG1;CARD8;AKT1S1;TRPM4;NR1H2;RCN3;SNORD32A;SNAR-A14;SNARA11;SNARA10;GYS1;PTH2;FTL;SPACA4;RPL13AP5;KDELR1;NUCB1-AS1;CABP5;TBC1D17;MYBPC2;SULT2B1;ZNF114;FGF21;RASIP1;DHDH;KCNA7;C19orf73;NUP62;POLD1;CGB;RPL13A;SNAR-A9;SNARA8;SNARA3;IL4I1;SNRNP70;SNARA7;SLC6A16;SNARA5;SNARA4;IRF3;NTF4;TSKS;CGB5;CGB7;CGB1;CGB2;SPIB;SNAR-B1;FLT3LG;SCAF1;JOSD2;CCDC114;SLC17A7;NTN5;SNARA6;PPP1R15A;EMP3;CARD8-AS1;SPHK2;RPS11;MIR4324;KCNC3;BAX;TEAD2;CGB8;FAM83E;FLJ26850;PPFIA3;BCL2L12;LOC101059948;FAM71E1;MIR5088;FCGRT;CPT1C;FUT1;KCNJ14;CA11;LOC101928295;AP2A1;PLA2G4C;CYTH2;SNAR-B2;GFY;SNORD34;SNORD33;FUT2;GRIN2D;ATF5;TMEM143
6-0144-001 Proband M 1 112996940 113023348 26,409 Duplication WNT2B;MIR4256;CTTNBP2NL
6-0144-001 Proband M 17 78207690 78238585 30,896 Duplication SLC26A11;RNF213
6-0144-001 Proband M X 140041499 140094746 53,248 Duplication SPANXB1
6-0144-004 Unaff. Sib F 15 99017562 99037757 20,196 Deletion FAM169B
6-0145-001 Proband M 14 51101935 51153264 51,330 Deletion SAV1
6-0145-004 Aff. Sib M 14 51101935 51153264 51,330 Deletion SAV1
6-0152-001 Proband M 2 152430332 153267017 836,686 Deletion FMNL2;ARL5A;STAM2;NEB;CACNB4
6-0154-004 Unaff. Sib M 3 190869467 190995191 125,725 Deletion OSTN;UTS2B
6-0179-001 Proband F 16 88553346 88665219 111,874 Duplication ZC3H18;ZFPM1
6-0179-001 Proband F 16 89656577 89772674 116,098 Duplication CDK10;CPNE7;SPATA33;SPATA2L;CHMP1A;DPEP1
6-0179-001 Proband F 17 15276705 15377385 100,681 Deletion TVP23C-CDRT4;CDRT4
6-0179-004 Unaff. Sib F 1 240170510 240191274 20,765 Duplication RPS7P5
6-0179-004 Unaff. Sib F 16 89656577 89769763 113,187 Duplication CDK10;CPNE7;SPATA33;SPATA2L;CHMP1A;DPEP1
6-0185-001 Proband M 19 41451368 41514489 63,122 Duplication CYP2B6;CYP2B7P
6-0185-001 Proband M 5 60239443 60277069 37,627 Deletion ERCC8;NDUFAF2


6-0185-001 Proband M 6 127971152 128092888 121,737 Duplication THEMIS
6-0185-004 Unaff. Sib M 1 225659166 225690830 31,665 Duplication ENAH
6-0185-004 Unaff. Sib M 14 44939853 45004380 64,528 Duplication FSCB
6-0185-004 Unaff. Sib M 19 41451368 41514489 63,122 Duplication CYP2B6;CYP2B7P
6-0185-004 Unaff. Sib M 2 98614334 99155864 541,531 Duplication CNGA3;VWA3B;INPP4A
6-0185-004 Unaff. Sib M 2 186021324 186833705 812,382 Duplication LOC101927196;FSIP2
6-0185-004 Unaff. Sib M 5 60239443 60277069 37,627 Deletion ERCC8;NDUFAF2
6-0185-004 Unaff. Sib M 6 127971152 128092888 121,737 Duplication THEMIS
6-0227-001 Proband M 4 152147857 152260715 112,859 Deletion SH3D19;PRSS48
6-0227-004 Unaff. Sib M 4 152148622 152260715 112,094 Deletion SH3D19;PRSS48
6-0227-004 Unaff. Sib M 6 54098301 54129982 31,682 Deletion MLIP
6-0231-001 Proband M 19 56481555 56543377 61,823 Deletion NLRP5;NLRP8
6-0231-001 Proband M 4 89038578 89280430 241,853 Deletion ABCG2;PPM1K
6-0231-004 Aff. Sib M 19 56481555 56543377 61,823 Deletion NLRP5;NLRP8
6-0231-004 Aff. Sib M 3 28455742 28734773 279,032 Duplication ZCWPW2;LINC00693
6-0231-004 Aff. Sib M 4 89038578 89271165 232,588 Deletion ABCG2;PPM1K
6-0238-004 Unaff. Sib F 5 37237232 37316387 79,156 Duplication NUP155;C5orf42
6-0243-001 Proband M 12 15423569 15505731 82,163 Duplication PTPRO
6-0243-004 Unaff. Sib F 15 64721928 65000963 279,036 Deletion TRIP4;ZNF609;OAZ2
6-0246-001 Proband F 13 34002221 34032926 30,706 Deletion STARD13
6-0246-004 Unaff. Sib F 5 306683 363136 56,454 Duplication PDCD6;AHRR
6-0250-001 Proband M X 24920090 25931562 1,011,473 Duplication ARX;POLA1
6-0250-004 Unaff. Sib M 13 20797315 20833546 36,232 Duplication GJB6
6-0253-001 Proband M 1 241730456 241752434 21,979 Deletion KMO
6-0258-001 Proband F 2 189149450 189477881 328,432 Deletion GULP1;LINC01090;MIR561
6-0258-001 Proband F 3 162857006 162945412 88,407 Deletion LINC01192
6-0258-004 Unaff. Sib M 5 75833647 75914307 80,661 Deletion F2RL2;IQGAP2
6-0260-001 Proband M 16 70365070 70398721 33,652 Duplication DDX19A;DDX19B;LOC100506083
6-0260-001 Proband M 17 4667542 4683035 15,494 Deletion TM4SF5
6-0260-001 Proband M 2 217186190 217250099 63,910 Duplication MARCH4
6-0260-001 Proband M 2 217284892 217317765 32,874 Duplication SMARCAL1
6-0260-004 Unaff. Sib F 13 28486188 28522031 35,844 Duplication PDX1;PDX1-AS1;ATP5EP2
6-0265-001 Proband M 20 45913433 45939862 26,430 Duplication ZMYND8
6-0265-001 Proband M 4 79229435 79265870 36,436 Duplication FRAS1
6-0265-001 Proband M 5 59405485 59846252 440,768 Duplication PART1;PDE4D
6-0265-004 Unaff. Sib M 16 68051364 68078221 26,858 Deletion DDX28;DUS2
6-0265-004 Unaff. Sib M 20 45913433 45939862 26,430 Duplication ZMYND8
6-0265-004 Unaff. Sib M X 127140450 127245906 105,457 Deletion ACTRT1
6-0271-001 Proband M 9 13231782 13253775 21,994 Duplication MPDZ
6-0271-004 Unaff. Sib F 1 181196615 181215213 18,599 Deletion GM140
6-0273-001 Proband M 3 112542793 112559846 17,054 Deletion CD200R1L
6-0273-004 Unaff. Sib M 1 175877677 175917430 39,754 Duplication RFWD2


6-0273-004 Unaff. Sib M 3 112542793 112559846 17,054 Deletion CD200R1L
6-0276-001 Proband M 9 110210658 110233097 22,440 Deletion LINC01509
6-0277-001 Proband M 2 238930130 238973062 42,933 Duplication UBE2F-SCLY;SCLY;UBE2F
6-0284-001 Proband M 12 16369875 16397863 27,989 Deletion SLC15A5
6-0284-001 Proband M 2 176981810 177003022 21,213 Deletion HOXD-AS2;HOXD8;HOXD9;HOXD10
6-0284-004 Unaff. Sib F 19 20720868 20923670 202,803 Duplication ZNF626;ZNF737
6-0284-004 Unaff. Sib F 2 176981810 177003022 21,213 Deletion HOXD-AS2;HOXD8;HOXD9;HOXD10
6-0295-005 Aff. Sib M 3 114399296 114551611 152,316 Deletion MIR4796;ZBTB20
6-0318-004 Unaff. Sib M 2 32626260 33334307 708,048 Duplication TTC27;LTBP1;BIRC6;BIRC6-AS2;MIR4765;MIR558;LOC100271832;LINC00486
6-0323-001 Proband M 16 34197492 34434335 236,844 Deletion UBE2MP1
6-0323-001 Proband M 16 83961788 83989850 28,063 Deletion OSGIN1
6-0325-001 Proband M 19 41451368 41514477 63,110 Duplication CYP2B6;CYP2B7P
6-0325-001 Proband M 5 60239443 60277069 37,627 Deletion ERCC8;NDUFAF2
6-0325-001 Proband M 6 127975067 128092900 117,834 Duplication THEMIS
6-0325-004 Unaff. Sib M 19 41451368 41512841 61,474 Duplication CYP2B6;CYP2B7P
6-0325-004 Unaff. Sib M 6 127975067 128092888 117,822 Duplication THEMIS
6-0332-001 Proband M 1 24492808 24520960 28,153 Duplication IFNLR1
6-0332-001 Proband M 12 126463562 126720475 256,914 Duplication LINC00939;LOC101927464
6-0332-001 Proband M 12 129912960 130254371 341,412 Duplication TMEM132D
6-0332-004 Aff. Sib M 12 126463562 126727689 264,128 Duplication LINC00939;LOC101927464
6-0332-004 Aff. Sib M 12 129913192 130254371 341,180 Duplication TMEM132D
6-0344-001 Proband M 10 89671800 89701020 29,221 Duplication PTEN
6-0344-001 Proband M 22 36937448 36958380 20,933 Deletion CACNG2
6-0344-001 Proband M 6 128245079 128375418 130,340 Deletion LOC101928140;PTPRK
6-0344-004 Aff. Sib M 2 176981810 177003022 21,213 Deletion HOXD-AS2;HOXD8;HOXD9;HOXD10
6-0344-004 Aff. Sib M 6 128243369 128378806 135,438 Deletion LOC101928140;PTPRK
6-0349-001 Proband M 8 95199302 95235890 36,589 Duplication CDH17
6-0349-004 Aff. Sib M 8 95199302 95235890 36,589 Duplication CDH17
6-0351-001 Proband M 4 151668146 152078423 410,278 Duplication SH3D19;SNORD73A;RPS3A;LRBA
6-0351-001 Proband M 5 94958290 94991346 33,057 Duplication SPATA9;RFESD
6-0351-004 Aff. Sib M 3 60802169 60820204 18,036 Duplication FHIT
6-0351-004 Aff. Sib M 4 151668146 152078423 410,278 Duplication SH3D19;SNORD73A;RPS3A;LRBA
6-0351-004 Aff. Sib M 5 94958290 94991346 33,057 Duplication SPATA9;RFESD
6-0356-001 Proband M 15 28172572 28218324 45,753 Deletion OCA2
6-0356-004 Unaff. Sib F 1 155561901 155636887 74,987 Deletion MSTO1;MSTO2P;YY1AP1
6-0356-004 Unaff. Sib F X 148760404 148808419 48,016 Duplication MAGEA11
6-0358-004 Unaff. Sib F 13 111167634 111332899 165,266 Duplication RAB20;CARS2;CARKD
6-0361-001 Proband M 4 58139970 58566388 426,419 Duplication LOC101928851
6-0361-004 Aff. Sib M 15 56021208 56105695 84,488 Duplication PRTG
6-0361-004 Aff. Sib M 4 58139970 58566388 426,419 Duplication LOC101928851
6-0362-004 Unaff. Sib F 11 14863083 14902916 39,834 Duplication CYP2R1;PDE3B
6-0364-001 Proband M 2 12693172 12840663 147,492 Duplication LOC100506457

6-0364-004 Aff. Sib M 2 12693172 12840663 147,492 Duplication LOC100506457
6-0364-004 Aff. Sib M 3 43444990 43474916 29,927 Deletion ANO10
6-0376-004 Aff. Sib M 14 48262530 48380146 117,617 Deletion LINC00648
6-0376-004 Aff. Sib M 16 29567296 30177928 610,633 Deletion DOC2A;ASPHD1;TBX6;PRRT2;CDIPT;QPRT;SMG1P2;YPEL3;SLC7A5P1;PPP4C;MAPK3;SPN;MVP;FAM57B;ZG16;ALDOA;INO80E;SEZ6L2;TAOK2;KCTD13;MAZ;KIF22;GDPD3;C16orf92;C16orf54;CDIPT-AS1;TMEM219;PAGR1;HIRIP3
6-0382-001 Proband M 14 102738583 102767489 28,907 Duplication MOK
6-0382-001 Proband M 4 177866174 178336645 470,472 Duplication NEIL3
6-0382-001 Proband M X 6081390 6127437 46,048 Duplication NLGN4X
6-0382-004 Unaff. Sib F X 6085042 6135144 50,103 Duplication NLGN4X
6-0384-001 Proband F 11 134352349 134476362 124,014 Deletion LOC283177
6-0384-001 Proband F 16 28808206 29051191 242,986 Deletion ATXN2L;ATP2A1;NFATC2IP;ATP2A1-AS1;MIR4721;SPNS1;RABEP2;SH2B1;LAT;MIR4517;TUFM;CD19
6-0384-001 Proband F 19 55080578 55103812 23,235 Duplication LILRA2
6-0384-004 Unaff. Sib F 7 72611543 72818672 207,130 Duplication GTF2IP1;NSUN5;LOC100093631;FKBP6;NCF1B;TRIM50
6-0384-004 Unaff. Sib F 9 37499013 37514163 15,151 Duplication FBXO10;POLR1E
6-0384-004 Unaff. Sib F X 31875748 31894898 19,151 Duplication DMD
6-0385-004 Aff. Sib M 18 72082944 72146980 64,037 Duplication FAM69C
7-0318-001 Proband M 2 32629209 33336179 706,971 Duplication TTC27;LTBP1;BIRC6;BIRC6-AS2;MIR4765;MIR558;LOC100271832;LINC00486
7-0318-001 Proband M 4 75201804 75500577 298,774 Deletion AREG;EREG
7-0318-001 Proband M 6 98462005 98485537 23,533 Duplication MIR2113
7-1117 Unaff. Sib M 12 11152029 11207580 55,552 Deletion TAS2R31;PRH1-PRR4;TAS2R19;PRH1
7-1117-001 Proband M 8 92220201 92240175 19,975 Deletion LRRC69;SLC26A7
7-1126-001 Proband F 20 60019924 60083998 64,075 Deletion CDH4
7-1126-001 Proband F 7 155124600 155465851 341,252 Duplication RBM33;BLACE;EN2;CNPY1;LOC100506302
7-1127-001 Proband M 3 191307146 191742078 434,933 Duplication LINCR-0002
7-1127-001 Proband M 8 53484239 53578696 94,458 Duplication RB1CC1
7-1127-001 Proband M 8 53719997 54063024 343,028 Duplication NPBWR1
7-1127-004 Unaff. Sib F 19 42259136 42305800 46,665 Duplication CEACAM3;CEACAM6
7-1127-004 Unaff. Sib F 3 191307146 191742078 434,933 Duplication LINCR-0002
7-1128-001 Proband M 10 126401796 126456614 54,819 Deletion FAM53B-AS1;FAM53B;METTL10
7-1128-001 Proband M 2 233013157 233033912 20,756 Deletion DIS3L2
7-1128-004 Aff. Sib M 2 233013157 233033912 20,756 Deletion DIS3L2
7-1129-001 Proband M 11 6981241 7051113 69,873 Deletion ZNF214;NLRP14
7-1129-001 Proband M 11 24502121 24618295 116,175 Duplication LUZP2
7-1129-004 Unaff. Sib F 11 6981241 7051113 69,873 Deletion ZNF214;NLRP14
7-1129-004 Unaff. Sib F 3 61975169 62035156 59,988 Deletion PTPRG
7-1129-004 Unaff. Sib F 6 2113809 2192841 79,033 Deletion GMDS
7-1130-001 Proband M 3 61989945 62133088 143,144 Deletion PTPRG
7-1130-004 Aff. Sib M 4 8355962 8373293 17,332 Deletion ACOX3
7-1130-004 Aff. Sib M X 3577569 3658983 81,415 Duplication PRKX-AS1;PRKX

7-1131-001 Proband M 2 173274311 173318571 44,261 Duplication ITGA6
7-1131-004 Unaff. Sib M 2 173274311 173318571 44,261 Duplication ITGA6
7-1131-004 Unaff. Sib M 7 148093306 148112003 18,698 Deletion CNTNAP2
7-1132-001 Proband M 15 55527874 55609214 81,341 Deletion RAB27A
7-1134-001 Proband M 6 11055134 11090113 34,980 Deletion ELOVL2-AS1
7-1134-004 Unaff. Sib F 17 41534491 41631866 97,376 Duplication DHX8;ETV4
8-1000-001 Proband M 6 80212843 80273293 60,451 Duplication LCA5
8-1000-001 Proband M 7 148719680 148746224 26,545 Duplication PDIA4
8-1000-004 Aff. Sib F 4 55144307 55174536 30,230 Duplication PDGFRA
8-1000-004 Aff. Sib F 6 80212843 80273293 60,451 Duplication LCA5
8-1000-004 Aff. Sib F 7 148719680 148746224 26,545 Duplication PDIA4
8--1001-004 Aff. Sib M 17 526 167835 167,310 Duplication RPH3AL;LOC100506371;DOC2B
8--1001-004 Aff. Sib M 19 11518184 11564273 46,090 Deletion PRKCSH;RGL3;CCDC151;ELAVL3
8--1001-004 Aff. Sib M 5 158659678 158702949 43,272 Deletion UBLCP1
8-1003-004 Aff. Sib M 15 43992627 44198616 205,990 Duplication MFAP1;MIR1282;SERF2;PDIA3;PIN4P1;HYPK;SERINC4;ELL3;FRMD5;SERF2-C15ORF63;CATSPER2P1;WDR76
8-1007-001 Proband M 11 14863083 14903360 40,278 Duplication CYP2R1;PDE3B
8-1007-004 Unaff. Sib M 11 14863083 14903360 40,278 Duplication CYP2R1;PDE3B
8-1007-004 Unaff. Sib M 17 78695964 78761151 65,188 Deletion RPTOR
8-1007-004 Unaff. Sib M 6 149967083 150019167 52,085 Duplication KATNA1;LATS1
8-1008-001 Proband F 16 19746481 19803846 57,366 Duplication IQCK
8-1008-001 Proband F 17 57144519 57717737 573,219 Duplication LINC01476;PRR11;YPEL2;TRIM37;SMG8;MIR4729;MIR301A;MIR454;SKA2;CLTC;DHX40;GDPD1
8-1008-001 Proband F 8 71172796 71220651 47,856 Deletion NCOA2
8-1008-004 Aff. Sib M 8 71172796 71220651 47,856 Deletion NCOA2
8-1009-001 Proband M 1 33045723 33060752 15,030 Deletion ZBTB8A
8-1009-001 Proband M 17 72875848 72899769 23,922 Deletion FADS6
8-1009-004 Unaff. Sib F 1 33045600 33060752 15,153 Deletion ZBTB8A
8-1009-004 Unaff. Sib F 17 72875848 72893576 17,729 Deletion FADS6
8-1009-004 Unaff. Sib F 2 28792031 28815986 23,956 Deletion PLB1
8-1010-004 Unaff. Sib M 21 43100690 43135545 34,856 Deletion LINC00479;LINC00111
8-1010-004 Unaff. Sib M 7 16917816 17422044 504,229 Duplication AHR;AGR3;LOC102659288
8--1012-001 Proband M 4 68346 89747 21,402 Deletion ZNF718;ZNF595
8--1012-004 Unaff. Sib F 4 68346 89747 21,402 Deletion ZNF718;ZNF595
8-1013-001 Proband M 1 222686165 222702931 16,767 Deletion HHIPL2
8-1015-001 Proband M 13 51076960 51555944 478,985 Deletion DLEU7-AS1;RNASEH2B;RNASEH2B-AS1;DLEU1-AS1;DLEU1;DLEU7
8-1015-004 Aff. Sib M 13 51086107 51555944 469,838 Deletion DLEU7-AS1;RNASEH2B;RNASEH2B-AS1;DLEU1-AS1;DLEU1;DLEU7
8-1015-004 Aff. Sib M X 45200693 45807324 606,632 Duplication LOC392452;MIR222;LOC401585;MIR221;LINC01204
8-1017-001 Proband M 10 68461662 68544726 83,065 Deletion CTNNA3
8-1017-001 Proband M X 57703003 58019911 316,909 Duplication ZXDA
8-1017-004 Unaff. Sib M X 57703003 58019911 316,909 Duplication ZXDA
8-1021-001 Proband M 12 11450109 11503087 52,979 Deletion PRB4

8-1021-001 Proband M 19 49088533 49112022 23,490 Duplication SPACA4;FAM83E;SULT2B1
8-1021-004 Aff. Sib F 3 2212318 2476350 264,033 Duplication CNTN4
6-0170-004 Unaff. Sib F 14 100898748 100972431 73,684 Deletion WDR25
6-0170-004 Unaff. Sib F 7 47723549 47945846 222,298 Duplication C7orf69;LINC00525;PKD1L1
6-0170-004 Unaff. Sib F X 15208755 15863090 654,336 Duplication FIGF;ASB11;ZRSR2;PIR-FIGF;PIGA;CA5BP1;ASB9;TMEM27;AP1S2;INE2;ACE2;PIR;BMX;CA5B
6-0187-001 Proband M 5 60239443 60277069 37,627 Deletion ERCC8;NDUFAF2
6-0187-001 Proband M X 76113652 76141069 27,418 Deletion MIR384;MIR325HG
6-0187-004 Unaff. Sib M 16 31567485 31630503 63,019 Duplication YBX3P1


References

American Psychiatric Association. 2000. 'Diagnostic and statistical manual of mental disorders (4th ed. text rev.)'.

American Psychiatric Association. 2013. 'Diagnostic and statistical manual of mental disorders: DSM-5'. Washington, DC.

Amir, R. E., I. B. Van den Veyver, M. Wan, C. Q. Tran, U. Francke, and H. Y. Zoghbi. 1999. 'Rett syndrome is caused by mutations in X-linked MECP2, encoding methyl-CpG-binding protein 2', Nat Genet, 23: 185-8.

Anney, R., L. Klei, D. Pinto, R. Regan, J. Conroy, T. R. Magalhaes, C. Correia, B. S. Abrahams, N. Sykes, A. T. Pagnamenta, J. Almeida, E. Bacchelli, A. J. Bailey, G. Baird, A. Battaglia, T. Berney, N. Bolshakova, S. Bolte, P. F. Bolton, T. Bourgeron, S. Brennan, J. Brian, A. R. Carson, G. Casallo, J. Casey, S. H. Chu, L. Cochrane, C. Corsello, E. L. Crawford, A. Crossett, G. Dawson, M. de Jonge, R. Delorme, I. Drmic, E. Duketis, F. Duque, A. Estes, P. Farrar, B. A. Fernandez, S. E. Folstein, E. Fombonne, C. M. Freitag, J. Gilbert, C. Gillberg, J. T. Glessner, J. Goldberg, J. Green, S. J. Guter, H. Hakonarson, E. A. Heron, M. Hill, R. Holt, J. L. Howe, G. Hughes, V. Hus, R. Igliozzi, C. Kim, S. M. Klauck, A. Kolevzon, O. Korvatska, V. Kustanovich, C. M. Lajonchere, J. A. Lamb, M. Laskawiec, M. Leboyer, A. Le Couteur, B. L. Leventhal, A. C. Lionel, X. Q. Liu, C. Lord, L. Lotspeich, S. C. Lund, E. Maestrini, W. Mahoney, C. Mantoulan, C. R. Marshall, H. McConachie, C. J. McDougle, J. McGrath, W. M. McMahon, N. M. Melhem, A. Merikangas, O. Migita, N. J. Minshew, G. K. Mirza, J. Munson, S. F. Nelson, C. Noakes, A. Noor, G. Nygren, G. Oliveira, K. Papanikolaou, J. R. Parr, B. Parrini, T. Paton, A. Pickles, J. Piven, D. J. Posey, A. Poustka, F. Poustka, A. Prasad, J. Ragoussis, K. Renshaw, J. Rickaby, W. Roberts, K. Roeder, B. Roge, M. L. Rutter, L. J. Bierut, J. P. Rice, J. Salt, K. Sansom, D. Sato, R. Segurado, L. Senman, N. Shah, V. C. Sheffield, L. Soorya, I. Sousa, V. Stoppioni, C. Strawbridge, R. Tancredi, K. Tansey, B. Thiruvahindrapduram, A. P. Thompson, S. Thomson, A. Tryfon, J. Tsiantis, H. Van Engeland, J. B. Vincent, F. Volkmar, S. Wallace, K. Wang, Z. Wang, T. H. Wassink, K. Wing, K. Wittemeyer, S. Wood, B. L. Yaspan, D. Zurawiecki, L. Zwaigenbaum, C. Betancur, J. D. Buxbaum, R. M. Cantor, E. H. Cook, H. Coon, M. L. Cuccaro, L. Gallagher, D. H. Geschwind, M. Gill, J. L. Haines, J. Miller, A. P. Monaco, J. I. Nurnberger, Jr., A. D. Paterson, M. A. 
Pericak-Vance, G. D. Schellenberg, S. W. Scherer, J. S. Sutcliffe, P. Szatmari, A. M. Vicente, V. J. Vieland, E. M. Wijsman, B. Devlin, S. Ennis, and J. Hallmayer. 2010. 'A genome-wide scan for common alleles affecting risk for autism', Hum Mol Genet, 19: 4072-82.

Arking, D. E., D. J. Cutler, C. W. Brune, T. M. Teslovich, K. West, M. Ikeda, A. Rea, M. Guy, S. Lin, E. H. Cook, and A. Chakravarti. 2008. 'A common genetic variant in the neurexin superfamily member CNTNAP2 increases familial risk of autism', Am J Hum Genet, 82: 160-4.

Asperger, Hans. 1944. 'Die ‘Autistischen Psychopathen’ im Kindesalter', Arch Psychiatr Nervenkr, 117: 76-136.


Autism Genome Project, Consortium, P. Szatmari, A. D. Paterson, L. Zwaigenbaum, W. Roberts, J. Brian, X. Q. Liu, J. B. Vincent, J. L. Skaug, A. P. Thompson, L. Senman, L. Feuk, C. Qian, S. E. Bryson, M. B. Jones, C. R. Marshall, S. W. Scherer, V. J. Vieland, C. Bartlett, L. V. Mangin, R. Goedken, A. Segre, M. A. Pericak-Vance, M. L. Cuccaro, J. R. Gilbert, H. H. Wright, R. K. Abramson, C. Betancur, T. Bourgeron, C. Gillberg, M. Leboyer, J. D. Buxbaum, K. L. Davis, E. Hollander, J. M. Silverman, J. Hallmayer, L. Lotspeich, J. S. Sutcliffe, J. L. Haines, S. E. Folstein, J. Piven, T. H. Wassink, V. Sheffield, D. H. Geschwind, M. Bucan, W. T. Brown, R. M. Cantor, J. N. Constantino, T. C. Gilliam, M. Herbert, C. Lajonchere, D. H. Ledbetter, C. Lese-Martin, J. Miller, S. Nelson, C. A. Samango-Sprouse, S. Spence, M. State, R. E. Tanzi, H. Coon, G. Dawson, B. Devlin, A. Estes, P. Flodman, L. Klei, W. M. McMahon, N. Minshew, J. Munson, E. Korvatska, P. M. Rodier, G. D. Schellenberg, M. Smith, M. A. Spence, C. Stodgell, P. G. Tepper, E. M. Wijsman, C. E. Yu, B. Roge, C. Mantoulan, K. Wittemeyer, A. Poustka, B. Felder, S. M. Klauck, C. Schuster, F. Poustka, S. Bolte, S. Feineis-Matthews, E. Herbrecht, G. Schmotzer, J. Tsiantis, K. Papanikolaou, E. Maestrini, E. Bacchelli, F. Blasi, S. Carone, C. Toma, H. Van Engeland, M. de Jonge, C. Kemner, F. Koop, M. Langemeijer, C. Hijmans, W. G. Staal, G. Baird, P. F. Bolton, M. L. Rutter, E. Weisblatt, J. Green, C. Aldred, J. A. Wilkinson, A. Pickles, A. Le Couteur, T. Berney, H. McConachie, A. J. Bailey, K. Francis, G. Honeyman, A. Hutchinson, J. R. Parr, S. Wallace, A. P. Monaco, G. Barnby, K. Kobayashi, J. A. Lamb, I. Sousa, N. Sykes, E. H. Cook, S. J. Guter, B. L. Leventhal, J. Salt, C. Lord, C. Corsello, V. Hus, D. E. Weeks, F. Volkmar, M. Tauber, E. Fombonne, A. Shih, and K. J. Meyer. 2007. 'Mapping autism risk loci using genetic linkage and chromosomal rearrangements', Nat Genet, 39: 319-28.

Bailey, A., A. Le Couteur, I. Gottesman, P. Bolton, E. Simonoff, E. Yuzda, and M. Rutter. 1995. 'Autism as a strongly genetic disorder: evidence from a British twin study', Psychol Med, 25: 63-77.

Barge-Schaapveld, D. Q., S. M. Maas, A. Polstra, L. C. Knegt, and R. C. Hennekam. 2011. 'The atypical 16p11.2 deletion: a not so atypical microdeletion syndrome?', Am J Med Genet A, 155A: 1066-72.

Berkel, S., C. R. Marshall, B. Weiss, J. Howe, R. Roeth, U. Moog, V. Endris, W. Roberts, P. Szatmari, D. Pinto, M. Bonin, A. Riess, H. Engels, R. Sprengel, S. W. Scherer, and G. A. Rappold. 2010. 'Mutations in the SHANK2 synaptic scaffolding gene in autism spectrum disorder and mental retardation', Nat Genet, 42: 489-91.

Betancur, C. 2011. 'Etiological heterogeneity in autism spectrum disorders: more than 100 genetic and genomic disorders and still counting', Brain Res, 1380: 42-77.

Betancur, C., and J. D. Buxbaum. 2013. 'SHANK3 haploinsufficiency: a "common" but underdiagnosed highly penetrant monogenic cause of autism spectrum disorders', Mol Autism, 4: 17.

Bierut, L. J., J. A. Stitzel, J. C. Wang, A. L. Hinrichs, R. A. Grucza, X. Xuei, N. L. Saccone, S. F. Saccone, S. Bertelsen, L. Fox, W. J. Horton, N. Breslau, J. Budde, C. R. Cloninger, D. M. Dick, T. Foroud, D. Hatsukami, V. Hesselbrock, E. O. Johnson, J. Kramer, S. Kuperman, P. A. Madden, K. Mayo, J. Nurnberger, Jr., O. Pomerleau, B. Porjesz, O. Reyes, M. Schuckit, G. Swan, J. A. Tischfield, H. J. Edenberg, J. P. Rice, and A. M. Goate. 2008. 'Variants in nicotinic receptors and risk for nicotine dependence', Am J Psychiatry, 165: 1163-71.

Bodmer, W., and C. Bonilla. 2008. 'Common and rare variants in multifactorial susceptibility to common diseases', Nat Genet, 40: 695-701.

Bolton, P., H. Macdonald, A. Pickles, P. Rios, S. Goode, M. Crowson, A. Bailey, and M. Rutter. 1994. 'A case-control family history study of autism', J Child Psychol Psychiatry, 35: 877-900.

Bozdagi, O., T. Sakurai, D. Papapetrou, X. Wang, D. L. Dickstein, N. Takahashi, Y. Kajiwara, M. Yang, A. M. Katz, M. L. Scattoni, M. J. Harris, R. Saxena, J. L. Silverman, J. N. Crawley, Q. Zhou, P. R. Hof, and J. D. Buxbaum. 2010. 'Haploinsufficiency of the autism-associated Shank3 gene leads to deficits in synaptic function, social interaction, and social communication', Mol Autism, 1: 15.

Brian, J., S. E. Bryson, I. M. Smith, W. Roberts, C. Roncadin, P. Szatmari, and L. Zwaigenbaum. 2015. 'Stability and change in autism spectrum disorder diagnosis from age 3 to middle childhood in a high-risk sibling cohort', Autism.

Bucan, M., B. S. Abrahams, K. Wang, J. T. Glessner, E. I. Herman, L. I. Sonnenblick, A. I. Alvarez Retuerto, M. Imielinski, D. Hadley, J. P. Bradfield, C. Kim, N. B. Gidaya, I. Lindquist, T. Hutman, M. Sigman, V. Kustanovich, C. M. Lajonchere, A. Singleton, J. Kim, T. H. Wassink, W. M. McMahon, T. Owley, J. A. Sweeney, H. Coon, J. I. Nurnberger, M. Li, R. M. Cantor, N. J. Minshew, J. S. Sutcliffe, E. H. Cook, G. Dawson, J. D. Buxbaum, S. F. Grant, G. D. Schellenberg, D. H. Geschwind, and H. Hakonarson. 2009. 'Genome-wide analyses of exonic copy number variants in a family-based study point to novel autism susceptibility genes', PLoS Genet, 5: e1000536.

Carter, C. O., and K. A. Evans. 1969. 'Inheritance of congenital pyloric stenosis', J Med Genet, 6: 233-54.

Chakrabarti, S., and E. Fombonne. 2001. 'Pervasive developmental disorders in preschool children', JAMA, 285: 3093-9.

Chaudhry, A., A. Noor, B. Degagne, K. Baker, L. A. Bok, A. F. Brady, D. Chitayat, B. H. Chung, C. Cytrynbaum, D. Dyment, I. Filges, B. Helm, H. T. Hutchison, L. J. Jeng, F. Laumonnier, C. R. Marshall, M. Menzel, S. Parkash, M. J. Parker, D. D. D. Study, L. F. Raymond, A. L. Rideout, W. Roberts, R. Rupps, I. Schanze, C. T. Schrander-Stumpel, M. D. Speevak, D. J. Stavropoulos, S. J. Stevens, E. R. Thomas, A. Toutain, S. Vergano, R. Weksberg, S. W. Scherer, J. B. Vincent, and M. T. Carter. 2015. 'Phenotypic spectrum associated with PTCHD1 deletions and truncating mutations includes intellectual disability and autism spectrum disorder', Clin Genet, 88: 224-33.

Colvert, E., B. Tick, F. McEwen, C. Stewart, S. R. Curran, E. Woodhouse, N. Gillan, V. Hallett, S. Lietz, T. Garnett, A. Ronald, R. Plomin, F. Rijsdijk, F. Happe, and P. Bolton. 2015. 'Heritability of Autism Spectrum Disorder in a UK Population-Based Twin Sample', JAMA Psychiatry, 72: 415-23.

Conrad, D. F., D. Pinto, R. Redon, L. Feuk, O. Gokcumen, Y. Zhang, J. Aerts, T. D. Andrews, C. Barnes, P. Campbell, T. Fitzgerald, M. Hu, C. H. Ihm, K. Kristiansson, D. G. Macarthur, J. R. Macdonald, I. Onyiah, A. W. Pang, S. Robson, K. Stirrups, A. Valsesia, K. Walter, J. Wei, Wellcome Trust Case Control Consortium, C. Tyler-Smith, N. P. Carter, C. Lee, S. W. Scherer, and M. E. Hurles. 2010. 'Origins and functional impact of copy number variation in the human genome', Nature, 464: 704-12.

Cook, E. H., Jr., and S. W. Scherer. 2008. 'Copy-number variations associated with neuropsychiatric conditions', Nature, 455: 919-23.

Damaj, L., A. Lupien-Meilleur, A. Lortie, E. Riou, L. H. Ospina, L. Gagnon, C. Vanasse, and E. Rossignol. 2015. 'CACNA1A haploinsufficiency causes cognitive impairment, autism and epileptic encephalopathy with mild cerebellar symptoms', Eur J Hum Genet, 23: 1505-12.

Darvishi, K. 2010. 'Application of Nexus copy number software for CNV detection and analysis', Curr Protoc Hum Genet, Chapter 4: Unit 4.14.1-28.

Davidovitch, M., N. Levit-Binnun, D. Golan, and P. Manning-Courtney. 2015. 'Late diagnosis of autism spectrum disorder after initial negative assessment by a multidisciplinary team', J Dev Behav Pediatr, 36: 227-34.

Dawson, G., S. Rogers, J. Munson, M. Smith, J. Winter, J. Greenson, A. Donaldson, and J. Varley. 2010. 'Randomized, controlled trial of an intervention for toddlers with autism: the Early Start Denver Model', Pediatrics, 125: e17-23.

De Rubeis, S., X. He, A. P. Goldberg, C. S. Poultney, K. Samocha, A. E. Cicek, Y. Kou, L. Liu, M. Fromer, S. Walker, T. Singh, L. Klei, J. Kosmicki, F. Shih-Chen, B. Aleksic, M. Biscaldi, P. F. Bolton, J. M. Brownfeld, J. Cai, N. G. Campbell, A. Carracedo, M. H. Chahrour, A. G. Chiocchetti, H. Coon, E. L. Crawford, S. R. Curran, G. Dawson, E. Duketis, B. A. Fernandez, L. Gallagher, E. Geller, S. J. Guter, R. S. Hill, J. Ionita-Laza, P. Jimenz Gonzalez, H. Kilpinen, S. M. Klauck, A. Kolevzon, I. Lee, I. Lei, J. Lei, T. Lehtimaki, C. F. Lin, A. Ma'ayan, C. R. Marshall, A. L. McInnes, B. Neale, M. J. Owen, N. Ozaki, M. Parellada, J. R. Parr, S. Purcell, K. Puura, D. Rajagopalan, K. Rehnstrom, A. Reichenberg, A. Sabo, M. Sachse, S. J. Sanders, C. Schafer, M. Schulte-Ruther, D. Skuse, C. Stevens, P. Szatmari, K. Tammimies, O. Valladares, A. Voran, W. Li-San, L. A. Weiss, A. J. Willsey, T. W. Yu, R. K. Yuen, D. D. D. Study, Autism Homozygosity Mapping Collaborative for, Uk K. Consortium, E. H. Cook, C. M. Freitag, M. Gill, C. M. Hultman, T. Lehner, A. Palotie, G. D. Schellenberg, P. Sklar, M. W. State, J. S. Sutcliffe, C. A. Walsh, S. W. Scherer, M. E. Zwick, J. C. Barett, D. J. Cutler, K. Roeder, B. Devlin, M. J. Daly, and J. D. Buxbaum. 2014. 'Synaptic, transcriptional and chromatin genes disrupted in autism', Nature, 515: 209-15.

DiCicco-Bloom, E., C. Lord, L. Zwaigenbaum, E. Courchesne, S. R. Dager, C. Schmitz, R. T. Schultz, J. Crawley, and L. J. Young. 2006. 'The developmental neurobiology of autism spectrum disorder', J Neurosci, 26: 6897-906.

Downey, T. 2006. 'Analysis of a multifactor microarray study using Partek genomics solution', Methods Enzymol, 411: 256-70.

Durand, C. M., C. Betancur, T. M. Boeckers, J. Bockmann, P. Chaste, F. Fauchereau, G. Nygren, M. Rastam, I. C. Gillberg, H. Anckarsater, E. Sponheim, H. Goubran-Botros, R. Delorme, N. Chabane, M. C. Mouren-Simeoni, P. de Mas, E. Bieth, B. Roge, D. Heron, L. Burglen, C. Gillberg, M. Leboyer, and T. Bourgeron. 2007. 'Mutations in the gene encoding the synaptic scaffolding protein SHANK3 are associated with autism spectrum disorders', Nat Genet, 39: 25-7.

Falconer, D. S. 1981. Introduction to quantitative genetics, 2nd edn. (Oliver and Boyd: Edinburgh).

Feuk, L., A. Kalervo, M. Lipsanen-Nyman, J. Skaug, K. Nakabayashi, B. Finucane, D. Hartung, M. Innes, B. Kerem, M. J. Nowaczyk, J. Rivlin, W. Roberts, L. Senman, A. Summers, P. Szatmari, V. Wong, J. B. Vincent, S. Zeesman, L. R. Osborne, J. O. Cardy, J. Kere, S. W. Scherer, and K. Hannula-Jouppi. 2006. 'Absence of a paternally inherited FOXP2 gene in developmental verbal dyspraxia', Am J Hum Genet, 79: 965-72.

Folstein, S., and M. Rutter. 1977. 'Infantile autism: a genetic study of 21 twin pairs', J Child Psychol Psychiatry, 18: 297-321.

Fombonne, E. 2009. 'Epidemiology of pervasive developmental disorders', Pediatr Res, 65: 591-8.

Frazer, K. A., S. S. Murray, N. J. Schork, and E. J. Topol. 2009. 'Human genetic variation and its contribution to complex traits', Nat Rev Genet, 10: 241-51.

Gaugler, T., L. Klei, S. J. Sanders, C. A. Bodea, A. P. Goldberg, A. B. Lee, M. Mahajan, D. Manaa, Y. Pawitan, J. Reichert, S. Ripke, S. Sandin, P. Sklar, O. Svantesson, A. Reichenberg, C. M. Hultman, B. Devlin, K. Roeder, and J. D. Buxbaum. 2014. 'Most genetic risk for autism resides with common variation', Nat Genet, 46: 881-5.

Gauthier, J., T. J. Siddiqui, P. Huashan, D. Yokomaku, F. F. Hamdan, N. Champagne, M. Lapointe, D. Spiegelman, A. Noreau, R. G. Lafreniere, F. Fathalli, R. Joober, M. O. Krebs, L. E. DeLisi, L. Mottron, E. Fombonne, J. L. Michaud, P. Drapeau, S. Carbonetto, A. M. Craig, and G. A. Rouleau. 2011. 'Truncating mutations in NRXN2 and NRXN1 in autism spectrum disorders and schizophrenia', Hum Genet, 130: 563-73.

Gazzellone, M. J., X. Zhou, A. C. Lionel, M. Uddin, B. Thiruvahindrapuram, S. Liang, C. Sun, J. Wang, M. Zou, K. Tammimies, S. Walker, T. Selvanayagam, J. Wei, Z. Wang, L. Wu, and S. W. Scherer. 2014. 'Copy number variation in Han Chinese individuals with autism spectrum disorder', J Neurodev Disord, 6: 34.

Gibson, G. 2011. 'Rare and common variants: twenty arguments', Nat Rev Genet, 13: 135-45.

Goin-Kochel, R. P., A. Abbacchi, J. N. Constantino, and Autism Genetic Resource Exchange Consortium. 2007. 'Lack of evidence for increased genetic loading for autism among families of affected females: a replication from family history data in two large samples', Autism, 11: 279-86.

Goodpaster, B. H., S. W. Park, T. B. Harris, S. B. Kritchevsky, M. Nevitt, A. V. Schwartz, E. M. Simonsick, F. A. Tylavsky, M. Visser, and A. B. Newman. 2006. 'The loss of skeletal muscle strength, mass, and quality in older adults: the health, aging and body composition study', J Gerontol A Biol Sci Med Sci, 61: 1059-64.

Gorlov, I. P., O. Y. Gorlova, M. L. Frazier, M. R. Spitz, and C. I. Amos. 2011. 'Evolutionary evidence of the effect of rare variants on disease etiology', Clin Genet, 79: 199-206.

Hagerman, R., G. Hoem, and P. Hagerman. 2010. 'Fragile X and autism: Intertwined at the molecular level leading to targeted treatments', Mol Autism, 1: 12.

Hanson, E., R. Bernier, K. Porche, F. I. Jackson, R. P. Goin-Kochel, L. G. Snyder, A. V. Snow, A. S. Wallace, K. L. Campe, Y. Zhang, Q. Chen, D. D'Angelo, A. Moreno-De-Luca, P. T. Orr, K. B. Boomer, D. W. Evans, S. Kanne, L. Berry, F. K. Miller, J. Olson, E. Sherr, C. L. Martin, D. H. Ledbetter, J. E. Spiro, W. K. Chung, and Simons Variation in Individuals Project Consortium. 2015. 'The cognitive and behavioral phenotype of the 16p11.2 deletion in a clinically ascertained population', Biol Psychiatry, 77: 785-93.

Hastings, P. J., J. R. Lupski, S. M. Rosenberg, and G. Ira. 2009. 'Mechanisms of change in gene copy number', Nat Rev Genet, 10: 551-64.

He, H., B. Kang, D. Jiang, R. Ma, and L. Bai. 2014. 'Molecular cloning and mRNA expression analysis of ornithine decarboxylase antizyme 2 in ovarian follicles of the Sichuan white goose (Anser cygnoides)', Gene, 545: 247-52.

Horev, G., J. Ellegood, J. P. Lerch, Y. E. Son, L. Muthuswamy, H. Vogel, A. M. Krieger, A. Buja, R. M. Henkelman, M. Wigler, and A. A. Mills. 2011. 'Dosage-dependent phenotypes in models of 16p11.2 lesions found in autism', Proc Natl Acad Sci U S A, 108: 17076-81.

Hughes, J. R., and M. Melyn. 2005. 'EEG and seizures in autistic children and adolescents: further findings with therapeutic implications', Clin EEG Neurosci, 36: 15-20.

Hurley, R. S., M. Losh, M. Parlier, J. S. Reznick, and J. Piven. 2007. 'The broad autism phenotype questionnaire', J Autism Dev Disord, 37: 1679-90.

Iafrate, A. J., L. Feuk, M. N. Rivera, M. L. Listewnik, P. K. Donahoe, Y. Qi, S. W. Scherer, and C. Lee. 2004. 'Detection of large-scale variation in the human genome', Nat Genet, 36: 949-51.

Icasiano, F., P. Hewson, P. Machet, C. Cooper, and A. Marshall. 2004. 'Childhood autism spectrum disorder in the Barwon region: a community based study', J Paediatr Child Health, 40: 696-701.

Idring, S., D. Rai, H. Dal, C. Dalman, H. Sturm, E. Zander, B. K. Lee, E. Serlachius, and C. Magnusson. 2012. 'Autism spectrum disorders in the Stockholm Youth Cohort: design, prevalence and validity', PLoS One, 7: e41280.

Iossifov, I., M. Ronemus, D. Levy, Z. Wang, I. Hakker, J. Rosenbaum, B. Yamrom, Y. H. Lee, G. Narzisi, A. Leotta, J. Kendall, E. Grabowska, B. Ma, S. Marks, L. Rodgers, A. Stepansky, J. Troge, P. Andrews, M. Bekritsky, K. Pradhan, E. Ghiban, M. Kramer, J. Parla, R. Demeter, L. L. Fulton, R. S. Fulton, V. J. Magrini, K. Ye, J. C. Darnell, R. B. Darnell, E. R. Mardis, R. K. Wilson, M. C. Schatz, W. R. McCombie, and M. Wigler. 2012. 'De novo gene disruptions in children on the autistic spectrum', Neuron, 74: 285-99.

Jacquemont, S., B. P. Coe, M. Hersch, M. H. Duyzend, N. Krumm, S. Bergmann, J. S. Beckmann, J. A. Rosenfeld, and E. E. Eichler. 2014. 'A higher mutational burden in females supports a "female protective model" in neurodevelopmental disorders', Am J Hum Genet, 94: 415-25.

Jamain, S., H. Quach, C. Betancur, M. Rastam, C. Colineaux, I. C. Gillberg, H. Soderstrom, B. Giros, M. Leboyer, C. Gillberg, T. Bourgeron, and Paris Autism Research International Sibpair Study. 2003. 'Mutations of the X-linked genes encoding neuroligins NLGN3 and NLGN4 are associated with autism', Nat Genet, 34: 27-9.

Jensen, C. M., H. C. Steinhausen, and M. B. Lauritsen. 2014. 'Time trends over 16 years in incidence-rates of autism spectrum disorders across the lifespan based on nationwide Danish register data', J Autism Dev Disord, 44: 1808-18.

Jessell, T. M. 2000. 'Neuronal specification in the spinal cord: inductive signals and transcriptional codes', Nat Rev Genet, 1: 20-9.

Jiang, Y. H., R. K. Yuen, X. Jin, M. Wang, N. Chen, X. Wu, J. Ju, J. Mei, Y. Shi, M. He, G. Wang, J. Liang, Z. Wang, D. Cao, M. T. Carter, C. Chrysler, I. E. Drmic, J. L. Howe, L. Lau, C. R. Marshall, D. Merico, T. Nalpathamkalam, B. Thiruvahindrapuram, A. Thompson, M. Uddin, S. Walker, J. Luo, E. Anagnostou, L. Zwaigenbaum, R. H. Ring, J. Wang, C. Lajonchere, J. Wang, A. Shih, P. Szatmari, H. Yang, G. Dawson, Y. Li, and S. W. Scherer. 2013. 'Detection of clinically relevant genetic variants in autism spectrum disorder by whole-genome sequencing', Am J Hum Genet, 93: 249-63.

Jones, M. B., and P. Szatmari. 1988. 'Stoppage rules and genetic studies of autism', J Autism Dev Disord, 18: 31-40.

Kanner, Leo. 1943. 'Autistic Disturbances of Affective Contact', Nervous Child, 2: 217-50.

Kates, W. R., C. P. Burnette, S. Eliez, L. A. Strunge, D. Kaplan, R. Landa, A. L. Reiss, and G. D. Pearlson. 2004. 'Neuroanatomic variation in monozygotic twin pairs discordant for the narrow phenotype for autism', Am J Psychiatry, 161: 539-46.

Kaufman, L., A. Noor, M. Ayub, and J. B. Vincent. 2011. 'Common Genetic Etiologies and Biological Pathways Shared Between Autism Spectrum Disorders and Intellectual Disabilities', in Stephen Deutsch (ed.), Autism Spectrum Disorders: The Role of Genetics in Diagnosis and Treatment (InTech: Rijeka, Croatia), 125-58.

Kearney, H. M., E. C. Thorland, K. K. Brown, F. Quintero-Rivera, S. T. South, and Working Group of the American College of Medical Genetics Laboratory Quality Assurance Committee. 2011. 'American College of Medical Genetics standards and guidelines for interpretation and reporting of postnatal constitutional copy number variants', Genet Med, 13: 680-5.

Kimura, M., and Y. Okano. 2007. 'Human Misato regulates mitochondrial distribution and morphology', Exp Cell Res, 313: 1393-404.

Krawczak, M., S. Nikolaus, H. von Eberstein, P. J. Croucher, N. E. El Mokhtari, and S. Schreiber. 2006. 'PopGen: population-based recruitment of patients and controls for the analysis of complex genotype-phenotype relationships', Community Genet, 9: 55-61.

Kumar, R. A., S. KaraMohamed, J. Sudi, D. F. Conrad, C. Brune, J. A. Badner, T. C. Gilliam, N. J. Nowak, E. H. Cook, Jr., W. B. Dobyns, and S. L. Christian. 2008. 'Recurrent 16p11.2 microdeletions in autism', Hum Mol Genet, 17: 628-38.

Lai, M. C., M. V. Lombardo, B. Auyeung, B. Chakrabarti, and S. Baron-Cohen. 2015. 'Sex/gender differences and autism: setting the scene for future research', J Am Acad Child Adolesc Psychiatry, 54: 11-24.

Lander, E. S. 1996. 'The new genomics: global views of biology', Science, 274: 536-9.

Lauritsen, M. B., C. B. Pedersen, and P. B. Mortensen. 2005. 'Effects of familial risk factors and place of birth on the risk of autism: a nationwide register-based study', J Child Psychol Psychiatry, 46: 963-71.

Le Couteur, A., A. Bailey, S. Goode, A. Pickles, S. Robertson, I. Gottesman, and M. Rutter. 1996. 'A broader phenotype of autism: the clinical spectrum in twins', J Child Psychol Psychiatry, 37: 785-801.

Lee, J. A., C. M. Carvalho, and J. R. Lupski. 2007. 'A DNA replication mechanism for generating nonrecurrent rearrangements associated with genomic disorders', Cell, 131: 1235-47.

Levy, D., M. Ronemus, B. Yamrom, Y. H. Lee, A. Leotta, J. Kendall, S. Marks, B. Lakshmi, D. Pai, K. Ye, A. Buja, A. Krieger, S. Yoon, J. Troge, L. Rodgers, I. Iossifov, and M. Wigler. 2011. 'Rare de novo and transmitted copy-number variation in autistic spectrum disorders', Neuron, 70: 886-97.

Losh, M., D. Childress, K. Lam, and J. Piven. 2008. 'Defining key features of the broad autism phenotype: a comparison across parents of multiple- and single-incidence autism families', Am J Med Genet B Neuropsychiatr Genet, 147B: 424-33.

Malhotra, D., and J. Sebat. 2012. 'CNVs: harbingers of a rare variant revolution in psychiatric genetics', Cell, 148: 1223-41.

Marshall, C. R., A. Noor, J. B. Vincent, A. C. Lionel, L. Feuk, J. Skaug, M. Shago, R. Moessner, D. Pinto, Y. Ren, B. Thiruvahindrapuram, A. Fiebig, S. Schreiber, J. Friedman, C. E. Ketelaars, Y. J. Vos, C. Ficicioglu, S. Kirkpatrick, R. Nicolson, L. Sloman, A. Summers, C. A. Gibbons, A. Teebi, D. Chitayat, R. Weksberg, A. Thompson, C. Vardy, V. Crosbie, S. Luscombe, R. Baatjes, L. Zwaigenbaum, W. Roberts, B. Fernandez, P. Szatmari, and S. W. Scherer. 2008. 'Structural variation of chromosomes in autism spectrum disorder', Am J Hum Genet, 82: 477-88.

Miller, D. T., M. P. Adam, S. Aradhya, L. G. Biesecker, A. R. Brothman, N. P. Carter, D. M. Church, J. A. Crolla, E. E. Eichler, C. J. Epstein, W. A. Faucett, L. Feuk, J. M. Friedman, A. Hamosh, L. Jackson, E. B. Kaminsky, K. Kok, I. D. Krantz, R. M. Kuhn, C. Lee, J. M. Ostell, C. Rosenberg, S. W. Scherer, N. B. Spinner, D. J. Stavropoulos, J. H. Tepperberg, E. C. Thorland, J. R. Vermeesch, D. J. Waggoner, M. S. Watson, C. L. Martin, and D. H. Ledbetter. 2010. 'Consensus statement: chromosomal microarray is a first-tier clinical diagnostic test for individuals with developmental disabilities or congenital anomalies', Am J Hum Genet, 86: 749-64.

Mullen, E. M. 1995. Mullen Scales of Early Learning (American Guidance Service: Circle Pines, MN).

Murphy, M., P. F. Bolton, A. Pickles, E. Fombonne, J. Piven, and M. Rutter. 2000. 'Personality traits of the relatives of autistic probands', Psychol Med, 30: 1411-24.

Nagase, T., K. Ishikawa, D. Nakajima, M. Ohira, N. Seki, N. Miyajima, A. Tanaka, H. Kotani, N. Nomura, and O. Ohara. 1997. 'Prediction of the coding sequences of unidentified human genes. VII. The complete sequences of 100 new cDNA clones from brain which can code for large proteins in vitro', DNA Res, 4: 141-50.

Neale, B. M., Y. Kou, L. Liu, A. Ma'ayan, K. E. Samocha, A. Sabo, C. F. Lin, C. Stevens, L. S. Wang, V. Makarov, P. Polak, S. Yoon, J. Maguire, E. L. Crawford, N. G. Campbell, E. T. Geller, O. Valladares, C. Schafer, H. Liu, T. Zhao, G. Cai, J. Lihm, R. Dannenfelser, O. Jabado, Z. Peralta, U. Nagaswamy, D. Muzny, J. G. Reid, I. Newsham, Y. Wu, L. Lewis, Y. Han, B. F. Voight, E. Lim, E. Rossin, A. Kirby, J. Flannick, M. Fromer, K. Shakir, T. Fennell, K. Garimella, E. Banks, R. Poplin, S. Gabriel, M. DePristo, J. R. Wimbish, B. E. Boone, S. E. Levy, C. Betancur, S. Sunyaev, E. Boerwinkle, J. D. Buxbaum, E. H. Cook, Jr., B. Devlin, R. A. Gibbs, K. Roeder, G. D. Schellenberg, J. S. Sutcliffe, and M. J. Daly. 2012. 'Patterns and rates of exonic de novo mutations in autism spectrum disorders', Nature, 485: 242-5.

Noor, A., A. Whibley, C. R. Marshall, P. J. Gianakopoulos, A. Piton, A. R. Carson, M. Orlic-Milacic, A. C. Lionel, D. Sato, D. Pinto, I. Drmic, C. Noakes, L. Senman, X. Zhang, R. Mo, J. Gauthier, J. Crosbie, A. T. Pagnamenta, J. Munson, A. M. Estes, A. Fiebig, A. Franke, S. Schreiber, A. F. Stewart, R. Roberts, R. McPherson, S. J. Guter, E. H. Cook, Jr., G. Dawson, G. D. Schellenberg, A. Battaglia, E. Maestrini, Autism Genome Project Consortium, L. Jeng, T. Hutchison, E. Rajcan-Separovic, A. E. Chudley, S. M. Lewis, X. Liu, J. J. Holden, B. Fernandez, L. Zwaigenbaum, S. E. Bryson, W. Roberts, P. Szatmari, L. Gallagher, M. R. Stratton, J. Gecz, A. F. Brady, C. E. Schwartz, R. J. Schachar, A. P. Monaco, G. A. Rouleau, C. C. Hui, F. Lucy Raymond, S. W. Scherer, and J. B. Vincent. 2010. 'Disruption at the PTCHD1 Locus on Xp22.11 in Autism spectrum disorder and intellectual disability', Sci Transl Med, 2: 49ra68.

O'Roak, B. J., L. Vives, S. Girirajan, E. Karakoc, N. Krumm, B. P. Coe, R. Levy, A. Ko, C. Lee, J. D. Smith, E. H. Turner, I. B. Stanaway, B. Vernot, M. Malig, C. Baker, B. Reilly, J. M. Akey, E. Borenstein, M. J. Rieder, D. A. Nickerson, R. Bernier, J. Shendure, and E. E. Eichler. 2012. 'Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations', Nature, 485: 246-50.

Ohtomo, T., T. Horii, M. Nomizu, T. Suga, and J. Yamada. 2007. 'Molecular cloning of a structural homolog of YY1AP, a coactivator of the multifunctional YY1', Amino Acids, 33: 645-52.

Ozonoff, S., G. S. Young, A. Carter, D. Messinger, N. Yirmiya, L. Zwaigenbaum, S. Bryson, L. J. Carver, J. N. Constantino, K. Dobkins, T. Hutman, J. M. Iverson, R. Landa, S. J. Rogers, M. Sigman, and W. L. Stone. 2011. 'Recurrence risk for autism spectrum disorders: a Baby Siblings Research Consortium study', Pediatrics, 128: e488-95.

Ozonoff, S., G. S. Young, R. J. Landa, J. Brian, S. Bryson, T. Charman, K. Chawarska, S. L. Macari, D. Messinger, W. L. Stone, L. Zwaigenbaum, and A. M. Iosif. 2015. 'Diagnostic stability in young children at risk for autism spectrum disorder: a baby siblings research consortium study', J Child Psychol Psychiatry, 56: 988-98.

Pickles, A., E. Starr, S. Kazak, P. Bolton, K. Papanikolaou, A. Bailey, R. Goodman, and M. Rutter. 2000. 'Variable expression of the autism broader phenotype: findings from extended pedigrees', J Child Psychol Psychiatry, 41: 491-502.

Pinto, D., E. Delaby, D. Merico, M. Barbosa, A. Merikangas, L. Klei, B. Thiruvahindrapuram, X. Xu, R. Ziman, Z. Wang, J. A. Vorstman, A. Thompson, R. Regan, M. Pilorge, G. Pellecchia, A. T. Pagnamenta, B. Oliveira, C. R. Marshall, T. R. Magalhaes, J. K. Lowe, J. L. Howe, A. J. Griswold, J. Gilbert, E. Duketis, B. A. Dombroski, M. V. De Jonge, M. Cuccaro, E. L. Crawford, C. T. Correia, J. Conroy, I. C. Conceicao, A. G. Chiocchetti, J. P. Casey, G. Cai, C. Cabrol, N. Bolshakova, E. Bacchelli, R. Anney, S. Gallinger, M. Cotterchio, G. Casey, L. Zwaigenbaum, K. Wittemeyer, K. Wing, S. Wallace, H. van Engeland, A. Tryfon, S. Thomson, L. Soorya, B. Roge, W. Roberts, F. Poustka, S. Mouga, N. Minshew, L. A. McInnes, S. G. McGrew, C. Lord, M. Leboyer, A. S. Le Couteur, A. Kolevzon, P. Jimenez Gonzalez, S. Jacob, R. Holt, S. Guter, J. Green, A. Green, C. Gillberg, B. A. Fernandez, F. Duque, R. Delorme, G. Dawson, P. Chaste, C. Cafe, S. Brennan, T. Bourgeron, P. F. Bolton, S. Bolte, R. Bernier, G. Baird, A. J. Bailey, E. Anagnostou, J. Almeida, E. M. Wijsman, V. J. Vieland, A. M. Vicente, G. D. Schellenberg, M. Pericak-Vance, A. D. Paterson, J. R. Parr, G. Oliveira, J. I. Nurnberger, A. P. Monaco, E. Maestrini, S. M. Klauck, H. Hakonarson, J. L. Haines, D. H. Geschwind, C. M. Freitag, S. E. Folstein, S. Ennis, H. Coon, A. Battaglia, P. Szatmari, J. S. Sutcliffe, J. Hallmayer, M. Gill, E. H. Cook, J. D. Buxbaum, B. Devlin, L. Gallagher, C. Betancur, and S. W. Scherer. 2014. 'Convergence of genes and cellular pathways dysregulated in autism spectrum disorders', Am J Hum Genet, 94: 677-94.

Pinto, D., A. T. Pagnamenta, L. Klei, R. Anney, D. Merico, R. Regan, J. Conroy, T. R. Magalhaes, C. Correia, B. S. Abrahams, J. Almeida, E. Bacchelli, G. D. Bader, A. J. Bailey, G. Baird, A. Battaglia, T. Berney, N. Bolshakova, S. Bolte, P. F. Bolton, T. Bourgeron, S. Brennan, J. Brian, S. E. Bryson, A. R. Carson, G. Casallo, J. Casey, B. H. Chung, L. Cochrane, C. Corsello, E. L. Crawford, A. Crossett, C. Cytrynbaum, G. Dawson, M. de Jonge, R. Delorme, I. Drmic, E. Duketis, F. Duque, A. Estes, P. Farrar, B. A. Fernandez, S. E. Folstein, E. Fombonne, C. M. Freitag, J. Gilbert, C. Gillberg, J. T. Glessner, J. Goldberg, A. Green, J. Green, S. J. Guter, H. Hakonarson, E. A. Heron, M. Hill, R. Holt, J. L. Howe, G. Hughes, V. Hus, R. Igliozzi, C. Kim, S. M. Klauck, A. Kolevzon, O. Korvatska, V. Kustanovich, C. M. Lajonchere, J. A. Lamb, M. Laskawiec, M. Leboyer, A. Le Couteur, B. L. Leventhal, A. C. Lionel, X. Q. Liu, C. Lord, L. Lotspeich, S. C. Lund, E. Maestrini, W. Mahoney, C. Mantoulan, C. R. Marshall, H. McConachie, C. J. McDougle, J. McGrath, W. M. McMahon, A. Merikangas, O. Migita, N. J. Minshew, G. K. Mirza, J. Munson, S. F. Nelson, C. Noakes, A. Noor, G. Nygren, G. Oliveira, K. Papanikolaou, J. R. Parr, B. Parrini, T. Paton, A. Pickles, M. Pilorge, J. Piven, C. P. Ponting, D. J. Posey, A. Poustka, F. Poustka, A. Prasad, J. Ragoussis, K. Renshaw, J. Rickaby, W. Roberts, K. Roeder, B. Roge, M. L. Rutter, L. J. Bierut, J. P. Rice, J. Salt, K. Sansom, D. Sato, R. Segurado, A. F. Sequeira, L. Senman, N. Shah, V. C. Sheffield, L. Soorya, I. Sousa, O. Stein, N. Sykes, V. Stoppioni, C. Strawbridge, R. Tancredi, K. Tansey, B. Thiruvahindrapuram, A. P. Thompson, S. Thomson, A. Tryfon, J. Tsiantis, H. Van Engeland, J. B. Vincent, F. Volkmar, S. Wallace, K. Wang, Z. Wang, T. H. Wassink, C. Webber, R. Weksberg, K. Wing, K. Wittemeyer, S. Wood, J. Wu, B. L. Yaspan, D. Zurawiecki, L. Zwaigenbaum, J. D. Buxbaum, R. M. Cantor, E. H. Cook, H. Coon, M. L. Cuccaro, B. Devlin, S. Ennis, L. Gallagher, D. H. Geschwind, M. Gill, J. L. Haines, J. Hallmayer, J. Miller, A. P. Monaco, J. I. Nurnberger, Jr., A. D. Paterson, M. A. Pericak-Vance, G. D. Schellenberg, P. Szatmari, A. M. Vicente, V. J. Vieland, E. M. Wijsman, S. W. Scherer, J. S. Sutcliffe, and C. Betancur. 2010. 'Functional impact of global rare copy number variation in autism spectrum disorders', Nature, 466: 368-72.

Piven, J., J. Gayle, G. A. Chase, B. Fink, R. Landa, M. M. Wzorek, and S. E. Folstein. 1990. 'A family history study of neuropsychiatric disorders in the adult siblings of autistic individuals', J Am Acad Child Adolesc Psychiatry, 29: 177-83.

Piven, J., P. Palmer, R. Landa, S. Santangelo, D. Jacobi, and D. Childress. 1997. 'Personality and language characteristics in parents from multiple-incidence autism families', Am J Med Genet, 74: 398-411.

Pritchard, J. K., and N. J. Cox. 2002. 'The allelic architecture of human disease genes: common disease-common variant...or not?', Hum Mol Genet, 11: 2417-23.

Purcell, S., B. Neale, K. Todd-Brown, L. Thomas, M. A. Ferreira, D. Bender, J. Maller, P. Sklar, P. I. de Bakker, M. J. Daly, and P. C. Sham. 2007. 'PLINK: a tool set for whole-genome association and population-based linkage analyses', Am J Hum Genet, 81: 559-75.

Redon, R., S. Ishikawa, K. R. Fitch, L. Feuk, G. H. Perry, T. D. Andrews, H. Fiegler, M. H. Shapero, A. R. Carson, W. Chen, E. K. Cho, S. Dallaire, J. L. Freeman, J. R. Gonzalez, M. Gratacos, J. Huang, D. Kalaitzopoulos, D. Komura, J. R. MacDonald, C. R. Marshall, R. Mei, L. Montgomery, K. Nishimura, K. Okamura, F. Shen, M. J. Somerville, J. Tchinda, A. Valsesia, C. Woodwark, F. Yang, J. Zhang, T. Zerjal, J. Zhang, L. Armengol, D. F. Conrad, X. Estivill, C. Tyler-Smith, N. P. Carter, H. Aburatani, C. Lee, K. W. Jones, S. W. Scherer, and M. E. Hurles. 2006. 'Global variation in copy number in the human genome', Nature, 444: 444-54.

Reich, D. E., and E. S. Lander. 2001. 'On the allelic spectrum of human disease', Trends Genet, 17: 502-10.

Reynolds, J. J., A. K. Walker, E. C. Gilmore, C. A. Walsh, and K. W. Caldecott. 2012. 'Impact of PNKP mutations associated with microcephaly, seizures and developmental delay on enzyme activity and DNA strand break repair', Nucleic Acids Res, 40: 6608-19.

Ritvo, E. R., B. J. Freeman, A. Mason-Brothers, A. Mo, and A. M. Ritvo. 1985. 'Concordance for the syndrome of autism in 40 pairs of afflicted twins', Am J Psychiatry, 142: 74-7.

Riviere, J. B., B. W. van Bon, A. Hoischen, S. S. Kholmanskikh, B. J. O'Roak, C. Gilissen, S. Gijsen, C. T. Sullivan, S. L. Christian, O. A. Abdul-Rahman, J. F. Atkin, N. Chassaing, V. Drouin-Garraud, A. E. Fry, J. P. Fryns, K. W. Gripp, M. Kempers, T. Kleefstra, G. M. Mancini, M. J. Nowaczyk, C. M. van Ravenswaaij-Arts, T. Roscioli, M. Marble, J. A. Rosenfeld, V. M. Siu, B. B. de Vries, J. Shendure, A. Verloes, J. A. Veltman, H. G. Brunner, M. E. Ross, D. T. Pilz, and W. B. Dobyns. 2012. 'De novo mutations in the actin genes ACTB and ACTG1 cause Baraitser-Winter syndrome', Nat Genet, 44: 440-4, S1-2.

Ronald, A., F. Happe, P. Bolton, L. M. Butcher, T. S. Price, S. Wheelwright, S. Baron-Cohen, and R. Plomin. 2006. 'Genetic heterogeneity between the three components of the autism spectrum: a twin study', J Am Acad Child Adolesc Psychiatry, 45: 691-9.

Ruiz, A., S. Heilmann, T. Becker, I. Hernandez, H. Wagner, M. Thelen, A. Mauleon, M. Rosende-Roca, C. Bellenguez, J. C. Bis, D. Harold, A. Gerrish, R. Sims, O. Sotolongo-Grau, A. Espinosa, M. Alegret, J. L. Arrieta, A. Lacour, M. Leber, J. Becker, A. Lafuente, S. Ruiz, L. Vargas, O. Rodriguez, G. Ortega, M. A. Dominguez, IGAP, R. Mayeux, J. L. Haines, M. A. Pericak-Vance, L. A. Farrer, G. D. Schellenberg, V. Chouraki, L. J. Launer, C. van Duijn, S. Seshadri, C. Antunez, M. M. Breteler, M. Serrano-Rios, F. Jessen, L. Tarraga, M. M. Nothen, W. Maier, M. Boada, and A. Ramirez. 2014. 'Follow-up of loci from the International Genomics of Alzheimer's Disease Project identifies TRIP4 as a novel susceptibility gene', Transl Psychiatry, 4: e358.

Sanders, S. J., M. T. Murtha, A. R. Gupta, J. D. Murdoch, M. J. Raubeson, A. J. Willsey, A. G. Ercan-Sencicek, N. M. DiLullo, N. N. Parikshak, J. L. Stein, M. F. Walker, G. T. Ober, N. A. Teran, Y. Song, P. El-Fishawy, R. C. Murtha, M. Choi, J. D. Overton, R. D. Bjornson, N. J. Carriero, K. A. Meyer, K. Bilguvar, S. M. Mane, N. Sestan, R. P. Lifton, M. Gunel, K. Roeder, D. H. Geschwind, B. Devlin, and M. W. State. 2012. 'De novo mutations revealed by whole-exome sequencing are strongly associated with autism', Nature, 485: 237-41.

Santoro, M. R., S. M. Bray, and S. T. Warren. 2012. 'Molecular mechanisms of fragile X syndrome: a twenty-year perspective', Annu Rev Pathol, 7: 219-45.

Sasson, N. J., K. S. Lam, M. Parlier, J. L. Daniels, and J. Piven. 2013. 'Autism and the broad autism phenotype: familial patterns and intergenerational transmission', J Neurodev Disord, 5: 11.

Sato, D., A. C. Lionel, C. S. Leblond, A. Prasad, D. Pinto, S. Walker, I. O'Connor, C. Russell, I. E. Drmic, F. F. Hamdan, J. L. Michaud, V. Endris, R. Roeth, R. Delorme, G. Huguet, M. Leboyer, M. Rastam, C. Gillberg, M. Lathrop, D. J. Stavropoulos, E. Anagnostou, R. Weksberg, E. Fombonne, L. Zwaigenbaum, B. A. Fernandez, W. Roberts, G. A. Rappold, C. R. Marshall, T. Bourgeron, P. Szatmari, and S. W. Scherer. 2012. 'SHANK1 Deletions in Males with Autism Spectrum Disorder', Am J Hum Genet, 90: 879-87.

Saunders, B. S., J. M. Tilford, J. J. Fussell, E. G. Schulz, P. H. Casey, and D. Z. Kuo. 2015. 'Financial and employment impact of intellectual disability on families of children with autism', Fam Syst Health, 33: 36-45.

Schaaf, C. P., P. M. Boone, S. Sampath, C. Williams, P. I. Bader, J. M. Mueller, O. A. Shchelochkov, C. W. Brown, H. P. Crawford, J. A. Phalen, N. R. Tartaglia, P. Evans, W. M. Campbell, A. C. Tsai, L. Parsley, S. W. Grayson, A. Scheuerle, C. D. Luzzi, S. K. Thomas, P. A. Eng, S. H. Kang, A. Patel, P. Stankiewicz, and S. W. Cheung. 2012. 'Phenotypic spectrum and genotype-phenotype correlations of NRXN1 exon deletions', Eur J Hum Genet, 20: 1240-7.

Sebat, J., B. Lakshmi, D. Malhotra, J. Troge, C. Lese-Martin, T. Walsh, B. Yamrom, S. Yoon, A. Krasnitz, J. Kendall, A. Leotta, D. Pai, R. Zhang, Y. H. Lee, J. Hicks, S. J. Spence, A. T. Lee, K. Puura, T. Lehtimaki, D. Ledbetter, P. K. Gregersen, J. Bregman, J. S. Sutcliffe, V. Jobanputra, W. Chung, D. Warburton, M. C. King, D. Skuse, D. H. Geschwind, T. C. Gilliam, K. Ye, and M. Wigler. 2007. 'Strong association of de novo copy number mutations with autism', Science, 316: 445-9.

Sebat, J., B. Lakshmi, J. Troge, J. Alexander, J. Young, P. Lundin, S. Maner, H. Massa, M. Walker, M. Chi, N. Navin, R. Lucito, J. Healy, J. Hicks, K. Ye, A. Reiner, T. C. Gilliam, B. Trask, N. Patterson, A. Zetterberg, and M. Wigler. 2004. 'Large-scale copy number polymorphism in the human genome', Science, 305: 525-8.

Seri, S., A. Cerquiglini, F. Pisani, and P. Curatolo. 1999. 'Autism in tuberous sclerosis: evoked potential evidence for a deficit in auditory sensory processing', Clin Neurophysiol, 110: 1825-30.

Shaw, C. J., and J. R. Lupski. 2005. 'Non-recurrent 17p11.2 deletions are generated by homologous and non-homologous mechanisms', Hum Genet, 116: 1-7.

Smalley, S. L. 1998. 'Autism and tuberous sclerosis', J Autism Dev Disord, 28: 407-14.

Sparrow, S. S., D. V. Cicchetti, and D. A. Balla. 2005. Vineland Adaptive Behavior Scales, 2nd edition: manual (NCS Pearson Inc.: Minneapolis).

Spence, S. J., and M. T. Schneider. 2009. 'The role of epilepsy and epileptiform EEGs in autism spectrum disorders', Pediatr Res, 65: 599-606.

Stavropoulos, D. J., D. Merico, R. Jobling, et al. 2016. 'Whole-genome sequencing expands diagnostic utility and improves clinical management in paediatric medicine', Genomic Medicine, 1.

Stewart, A. F., S. Dandona, L. Chen, O. Assogba, M. Belanger, G. Ewart, R. LaRose, H. Doelle, K. Williams, G. A. Wells, R. McPherson, and R. Roberts. 2009. 'Kinesin family member 6 variant Trp719Arg does not associate with angiographically defined coronary artery disease in the Ottawa Heart Genomics Study', J Am Coll Cardiol, 53: 1471-2.

Suzuki, K., Y. Hayashi, S. Nakahara, H. Kumazaki, J. Prox, K. Horiuchi, M. Zeng, S. Tanimura, Y. Nishiyama, S. Osawa, A. Sehara-Fujisawa, P. Saftig, S. Yokoshima, T. Fukuyama, N. Matsuki, R. Koyama, T. Tomita, and T. Iwatsubo. 2012. 'Activity-dependent proteolytic cleavage of neuroligin-1', Neuron, 76: 410-22.

Szatmari, P., J. E. MacLean, M. B. Jones, S. E. Bryson, L. Zwaigenbaum, G. Bartolucci, W. J. Mahoney, and L. Tuff. 2000. 'The familial aggregation of the lesser variant in biological and nonbiological relatives of PDD probands: a family history study', J Child Psychol Psychiatry, 41: 579-86.

Tammimies, K., C. R. Marshall, S. Walker, G. Kaur, B. Thiruvahindrapuram, A. C. Lionel, R. K. Yuen, M. Uddin, W. Roberts, R. Weksberg, M. Woodbury-Smith, L. Zwaigenbaum, E. Anagnostou, Z. Wang, J. Wei, J. L. Howe, M. J. Gazzellone, L. Lau, W. W. Sung, K. Whitten, C. Vardy, V. Crosbie, B. Tsang, L. D'Abate, W. W. Tong, S. Luscombe, T. Doyle, M. T. Carter, P. Szatmari, S. Stuckless, D. Merico, D. J. Stavropoulos, S. W. Scherer, and B. A. Fernandez. 2015. 'Molecular Diagnostic Yield of Chromosomal Microarray Analysis and Whole-Exome Sequencing in Children With Autism Spectrum Disorder', JAMA, 314: 895-903.

Tinnikov, A., K. Nordstrom, P. Thoren, J. M. Kindblom, S. Malin, B. Rozell, M. Adams, O. Rajanayagam, S. Pettersson, C. Ohlsson, K. Chatterjee, and B. Vennstrom. 2002. 'Retardation of post-natal development caused by a negatively acting thyroid hormone receptor alpha1', EMBO J, 21: 5079-87.

Uddin, M., K. Tammimies, G. Pellecchia, B. Alipanahi, P. Hu, Z. Wang, D. Pinto, L. Lau, T. Nalpathamkalam, C. R. Marshall, B. J. Blencowe, B. J. Frey, D. Merico, R. K. Yuen, and S. W. Scherer. 2014. 'Brain-expressed exons under purifying selection are enriched for de novo mutations in autism spectrum disorder', Nat Genet, 46: 742-7.

Uddin, M., B. Thiruvahindrapuram, S. Walker, Z. Wang, P. Hu, S. Lamoureux, J. Wei, J. R. MacDonald, G. Pellecchia, C. Lu, A. C. Lionel, M. J. Gazzellone, J. R. McLaughlin, C. Brown, I. L. Andrulis, J. A. Knight, J. A. Herbrick, R. F. Wintle, P. Ray, D. J. Stavropoulos, C. R. Marshall, and S. W. Scherer. 2015. 'A high-resolution copy-number variation resource for clinical and population genetics', Genet Med, 17: 747-52.

Vaags, A. K., A. C. Lionel, D. Sato, M. Goodenberger, Q. P. Stein, S. Curran, C. Ogilvie, J. W. Ahn, I. Drmic, L. Senman, C. Chrysler, A. Thompson, C. Russell, A. Prasad, S. Walker, D. Pinto, C. R. Marshall, D. J. Stavropoulos, L. Zwaigenbaum, B. A. Fernandez, E. Fombonne, P. F. Bolton, D. A. Collier, J. C. Hodge, W. Roberts, P. Szatmari, and S. W. Scherer. 2012. 'Rare deletions at the neurexin 3 locus in autism spectrum disorder', Am J Hum Genet, 90: 133-41.

van Steensel, F. J., S. M. Bogels, and S. Perrin. 2011. 'Anxiety disorders in children and adolescents with autistic spectrum disorders: a meta-analysis', Clin Child Fam Psychol Rev, 14: 302-17.

Volkmar, F. R., P. Szatmari, and S. S. Sparrow. 1993. 'Sex differences in pervasive developmental disorders', J Autism Dev Disord, 23: 579-91.

Walsh, K. M., and M. B. Bracken. 2011. 'Copy number variation in the dosage-sensitive 16p11.2 interval accounts for only a small proportion of autism incidence: a systematic review and meta-analysis', Genet Med, 13: 377-84.

Wang, K., H. Zhang, D. Ma, M. Bucan, J. T. Glessner, B. S. Abrahams, D. Salyakina, M. Imielinski, J. P. Bradfield, P. M. Sleiman, C. E. Kim, C. Hou, E. Frackelton, R. Chiavacci, N. Takahashi, T. Sakurai, E. Rappaport, C. M. Lajonchere, J. Munson, A. Estes, O. Korvatska, J. Piven, L. I. Sonnenblick, A. I. Alvarez Retuerto, E. I. Herman, H. Dong, T. Hutman, M. Sigman, S. Ozonoff, A. Klin, T. Owley, J. A. Sweeney, C. W. Brune, R. M. Cantor, R. Bernier, J. R. Gilbert, M. L. Cuccaro, W. M. McMahon, J. Miller, M. W. State, T. H. Wassink, H. Coon, S. E. Levy, R. T. Schultz, J. I. Nurnberger, J. L. Haines, J. S. Sutcliffe, E. H. Cook, N. J. Minshew, J. D. Buxbaum, G. Dawson, S. F. Grant, D. H. Geschwind, M. A. Pericak-Vance, G. D. Schellenberg, and H. Hakonarson. 2009. 'Common genetic variants on 5p14.1 associate with autism spectrum disorders', Nature, 459: 528-33.

Weiss, L. A., D. E. Arking, Gene Discovery Project of Johns Hopkins & the Autism Consortium, M. J. Daly, and A. Chakravarti. 2009. 'A genome-wide linkage and association scan reveals novel loci for autism', Nature, 461: 802-8.

Weiss, L. A., Y. Shen, J. M. Korn, D. E. Arking, D. T. Miller, R. Fossdal, E. Saemundsen, H. Stefansson, M. A. Ferreira, T. Green, O. S. Platt, D. M. Ruderfer, C. A. Walsh, D. Altshuler, A. Chakravarti, R. E. Tanzi, K. Stefansson, S. L. Santangelo, J. F. Gusella, P. Sklar, B. L. Wu, M. J. Daly, and the Autism Consortium. 2008. 'Association between microdeletion and microduplication at 16p11.2 and autism', N Engl J Med, 358: 667-75.

Yu, P., Y. Chen, D. A. Tagle, and T. Cai. 2002. 'PJA1, encoding a RING-H2 finger ubiquitin ligase, is a novel human X chromosome gene abundantly expressed in brain', Genomics, 79: 869-74.

Yuen, R. K., B. Thiruvahindrapuram, D. Merico, S. Walker, K. Tammimies, N. Hoang, C. Chrysler, T. Nalpathamkalam, G. Pellecchia, Y. Liu, M. J. Gazzellone, L. D'Abate, E. Deneault, J. L. Howe, R. S. Liu, A. Thompson, M. Zarrei, M. Uddin, C. R. Marshall, R. H. Ring, L. Zwaigenbaum, P. N. Ray, R. Weksberg, M. T. Carter, B. A. Fernandez, W. Roberts, P. Szatmari, and S. W. Scherer. 2015. 'Whole-genome sequencing of quartet families with autism spectrum disorder', Nat Med, 21: 185-91.

Zappella, M., I. Meloni, I. Longo, R. Canitano, G. Hayek, L. Rosaia, F. Mari, and A. Renieri. 2003. 'Study of MECP2 gene in Rett syndrome variants and autistic girls', Am J Med Genet B Neuropsychiatr Genet, 119B: 102-7.

Zwaigenbaum, L., M. L. Bauman, R. Choueiri, D. Fein, C. Kasari, K. Pierce, W. L. Stone, N. Yirmiya, A. Estes, R. L. Hansen, J. C. McPartland, M. R. Natowicz, T. Buie, A. Carter, P. A. Davis, D. Granpeesheh, Z. Mailloux, C. Newschaffer, D. Robins, S. Smith Roley, S. Wagner, and A. Wetherby. 2015. 'Early Identification and Interventions for Autism Spectrum Disorder: Executive Summary', Pediatrics, 136 Suppl 1: S1-9.
