Molecular Psychiatry (2012) 17, 402–411 & 2012 Macmillan Publishers Limited All rights reserved 1359-4184/12 www.nature.com/mp ORIGINAL ARTICLE Rare structural variation of synapse and neurotransmission in X Gai1, HM Xie1, JC Perin1, N Takahashi2, K Murphy1, AS Wenocur1, M D’arcy1, RJ O’Hara1, E Goldmuntz3,6, DE Grice4, TH Shaikh5, H Hakonarson6,7,8, JD Buxbaum2, J Elia9,10 and PS White1,6,11 1Center for Biomedical Informatics, Children’s Hospital of Philadelphia, Philadelphia, PA, USA; 2Seaver Autism Center and Department of Psychiatry, Mt Sinai School of Medicine, New York, NY, USA; 3Division of Cardiology, Children’s Hospital of Philadelphia, Philadelphia, PA, USA; 4Department of Child and Adolescent Psychiatry, Columbia University, New York, NY, USA; 5Department of Pediatrics, University of Colorado School of Medicine, Denver, CO, USA; 6Department of Pediatrics, University of Pennsylvania School of Medicine, Philadelphia, PA, USA; 7Center for Applied Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA, USA; 8Division of Pulmonary Medicine, Children’s Hospital of Philadelphia, Philadelphia, PA, USA; 9Department of Child and Adolescent Psychiatry, Children’s Hospital of Philadelphia, Philadelphia, PA, USA; 10Department of Psychiatry, University of Pennsylvania School of Medicine, Philadelphia, PA, USA and 11Division of Oncology, Children’s Hospital of Philadelphia, Philadelphia, PA, USA

Autism spectrum disorders (ASDs) comprise a constellation of highly heritable neuropsychia- tric disorders. Genome-wide studies of autistic individuals have implicated numerous minor risk alleles but few common variants, suggesting a complex genetic model with many contributing loci. To assess commonality of biological function among rare risk alleles, we compared functional knowledge of genes overlapping inherited structural variants in idiopathic ASD subjects relative to healthy controls. In this study we show that biological processes associated with synapse function and neurotransmission are significantly enriched, with replication, in ASD subjects versus controls. Analysis of phenotypes observed for mouse models of copy-variant genes established significant and replicated enrichment of observable phenotypes consistent with ASD behaviors. Most functional terms retained significance after excluding previously reported ASD loci. These results implicate several new variants that involve synaptic function and glutamatergic signaling processes as important contributors of ASD pathophysiology and suggest a sizable pool of additional potential ASD risk loci. Molecular Psychiatry (2012) 17, 402–411; doi:10.1038/mp.2011.10; published online 1 March 2011 Keywords: autism; copy-number variation; glutamatergic signaling; neurotransmission; structural variation; synaptic transmission

Introduction copy-number variation (CNV) analyses have identi- fied several candidate loci, including 16p11.2 and Autism and disorders (ASDs) are NRXN1 ( 1).10,12 The lack of common predis- characterized by deficits in social interactions, com- position genes has led to the suggestion that rare munication difficulties and the presence of restricted, 1,2 variants constitute the majority of ASD risk, and that repetitive and stereotyped patterns of behavior. these variants collectively affect a limited number of Autism is co-occurrent in disorders with known common functional pathways. In particular, genetic cytogenetic etiologies such as Fragile X, suggesting a and functional studies of candidate genes suggest the role for genome structural aberrations; however, these 3 involvement of synaptic transmission in ASD etiol- account for < 10% of cases to date. The remainder, ogy.13,14 This evidence served as the basis for our often referred to as ‘idiopathic’ autism, are considered hypothesis that inherited rare structural variants in highly heritable with a 5–10% recurrence rate in genes from a number of closely related neurological siblings and a 60–90% concordance rate in mono- 4,5 pathways collectively contribute to a significant zygotic twins. To date, genetic linkage and associa- portion of disease risk in idiopathic ASDs. tion studies have had limited success in identifying genetic risk loci for ASDs.6–11 Most recently, genome Materials and methods Correspondence: Dr PS White, Children’s Hospital of Philadel- phia, 34th Street and Civic Center Boulevard, Room 1407 CHOP Study cohorts North, Philadelphia, PA 19104-4318, USA. E-mail: [email protected] The Autism Genetic Resource Exchange (AGRE) Received 4 May 2010; revised 6 January 2011; accepted 19 collection includes a total of 5431 affected and January 2011; published online 1 March 2011 parental samples from 1000 families. These samples Rare inherited structural variants in autism X Gai et al 403 are grouped into four different sets based on the time (Illumina, San Diego, CA, USA) as previously de- when they were recruited: set 1 (660), set 2 (1373), set scribed.16 The standard Illumina cluster file was used 3 (598) and set 4 (2800). Pedigree and affected status for the analysis, which is generated at Illumina by information on 4411 subjects are provided in the genotyping 120 HapMap samples, running the Bead- AGRE pedigree file at the AGRE website. Approxi- Studio clustering algorithm and reviewing single- mately 95% of these families are multiplex. The nucleotide polymorphisms (SNPs) with poor perfor- autism discovery and replication cohorts were de- mance statistics, including call frequency, cluster rived from AGRE set 4 and sets 1–3, respectively. separation and Hardy–Weinberg equilibrium. For Cases designated as ‘autism’ and corresponding both control sets, ancestry informative markers avail- parents were included; cases designated as ‘spectrum’ able on the Human-Hap550 BeadChip17 were used to and ‘not quite autism’ were not considered. Cases not evaluate eligible subjects to determine ethnicity. of European descent and cases with known genetic Genotypes that failed quality measures for accurate syndromes or other non-idiopathic autistic causes CNV detection (call rate > 98%) were excluded. were excluded. After applying these exclusion criter- ia and our genotyping and CNV detection quality CNV detection and analysis control criteria (see below), the final discovery cohort The Illumina BeadStudio 3.0. software package was included 1793 subjects (631 cases and 1162 parental used for initial CNV detection analysis. Log R ratios samples), and the replication cohort included 1702 and B allele frequencies were first exported from subjects (593 cases and 1109 parents). A number of BeadStudio. Log R ratio values were used as an parents (90 parents in the discovery cohort and 77 additional sample-wide genotype quality control parents in the replication cohort) were excluded for measure, and log R ratios with the s.d. above 0.35 failing to meet genotype quality measures. For the were excluded from the study. CNV predictions were discovery cohort, 262 of the 335 families have multi- generated using CNV Workshop as described pre- ple siblings included, for a total of 558 siblings. For viously.15,18,19 To reduce the possibility of type I error, the replication cohort, 218 of the 354 families have only deletions spanning X10 consecutive SNPs and multiple siblings included, for a total of 457 siblings. duplications spanning X20 consecutive SNPs were Healthy control samples were recruited from well- included. These CNV size thresholds were selected child visits conducted within the healthcare network based upon previous studies from our group15,18,19 and of the CHOP (Children’s Hospital of Philadelphia). experience with samples undergoing array-based Criteria for discovery control cohort inclusion were: clinical diagnostics at our institution.20 Specifically, age within age 3–18 years; screened for having no in these previous studies, our CNV detection method chronic illness as well as autism; and an extensive was shown to have a negligible false-positive rate and continuous electronic health record with no using these thresholds. An increased threshold for indications of chronic health issues by diagnosis, duplications was used because of the difference in the prescribed medications, diagnostic tests or subspeci- degree of expected signal intensity variation between alty encounters. All controls were unrelated as copy states for each probe: a twofold decrease for determined by health record and genotype assess- heterozygous deletion versus only a 1.5-fold increase ment. After quality metrics used for the autism cohort for duplication. For comparative analysis purposes, were applied, the discovery control cohort comprised two CNVs were considered equivalent if their 1005 individuals of European descent, 723 of African genomic regions overlapped at least 80% reciprocally. descent and 47 of Asian descent. The replication All CNVs meeting quality and experimental design control cohort consisted of 2026 healthy individuals metrics were independently assessed by visual comprising the CHOP CNV resource, as previously inspection with BeadStudio. For pathway analyses, described.15 -specific CNV analyses included all olfactory receptor genes were removed before healthy control individuals confined to European consideration, as this gene family is rapidly evol- descent, whereas functional analyses included all ving21–23 and tends to be associated with gene clusters ethnicities in order to identify and remove a larger that can over-represent such genes in functional constellation of nonpathogenic variants. The conge- enrichment studies. Careful analysis of enriched nital heart defect cohort used as a control comparison gene sets indicated no genomic region-specific consisted of 610 subjects with conotruncal defects bias from other functional gene clusters. For single and 896 parents of European descent who were gene analyses, apparently identical CNVs ( > 80% uniformly assessed, genotyped and analyzed at reciprocal overlap in genomic regions) from related CHOP. The congenital heart defect cohort was siblings were removed before analysis. Significance processed using the same protocols, quality measures testing for individual gene analyses used Fisher’s and CNV size thresholds described for the ASD exact test. cohort, including restricting consideration of CNVs DNA for all AGRE and congenital heart defect to those with unequivocal inheritance from a parent. samples was derived from immortalized cell lines, whereas all healthy control were derived from Genotyping blood. Although acknowledging that the difference in AGRE and control samples were assayed on DNA origin between cases and controls could theore- the Illumina Infinium II Human-Hap550 BeadChip tically have an effect on the results, we consider this

Molecular Psychiatry Rare inherited structural variants in autism X Gai et al 404 to be highly unlikely, for the following reasons: (1) DNA-Glycosylase with ROX (Invitrogen, Carlsbad, with the exception of the increased frequency of CA, USA) and 25 ng genomic DNA). Values were autism subject CNVs detected in the IgH-juxtaposed evaluated using Sequence Detection Software v2.2.1 RNA gene LOC10013346, we saw no significant (Applied Biosystems). Data analysis was further 24 difference in genome-wide CNV rates in autism performed using either the DDCT method or qBase. cases, and no other single gene was significantly Reference genes, chosen from COBL, GUSB, PPIA and enriched in cases; (2) real and presumptive de novo SNCA, were included based on the minimal coeffi- CNVs are excluded from our functional analyses; (3) cient of variation. Data were subsequently normalized our functional analyses would likely not be affected by setting a normal control to a value of 1. by increased cell line artifacts, as each gene is counted in a frequency-independent manner in these Results analyses. We conducted a modified case–control CNV study, analysis restricting inclusion to probands and siblings of Gene ontology analysis was carried out using the European descent with a diagnosis of idiopathic Explain Analysis System 2.4.2 (BIOBASE, Wolfen- autism. We focused our analysis on rare inherited bu¨ ttel, Germany). RefSeq genes overlapping rare and structural variants in the autistic subjects and inherited CNVs observed in autism cases were potential enrichment of any functional categories extracted from the UCSC (University of California, assigned to genes overlapping these CNVs. Genotypes Santa Cruz) Genome Browser and input into the of autistic subjects were obtained from the AGRE25 requisite pathway analysis models using default (see Materials and methods) and were divided into settings. Fisher’s exact test was used when testing discovery and replication cohorts based upon the for enrichment of any gene ontology terms in the time of recruitment. input gene lists. Mouse gene annotations were used In the discovery stage of the study, 631 autism for analysis because of a richer annotation set than for subjects and 1162 parents of these subjects were humans. All analyses were repeated with human compared with a healthy control set of 1775 children. annotations and yielded comparable results, although We observed no significant differences in the fre- significance values were lessened because of the quency of CNV rate (P=0.109), CNV size (P=0.136), paucity of annotations for some genes. A Benjamini– deletions (P=0.987) or duplications (P=0.144) be- Hochberg correction for multiple testing was applied tween the subject and control groups. A total of 395 for all functional enrichment analyses using the inherited CNVs (135 duplications, 256 heterozygous p.adjust function with parameter BH in the statistical deletions and 4 hemizygous deletions of the X package R. ) were identified in 286 autism subjects that were not present in healthy controls (Supple- Mouse phenotype ontology analysis mentary Table 1). CNV transmission to probands was Mouse phenotype analysis was performed using independent of parent gender (185/395 with paternal Mouse Phenotype Ontology (MP) term annotations inheritance; P=1). For the purpose of this study, we from the MGI (Mouse Genome Informatics) resource considered and refer to these CNVs as ‘rare’, as the (Jackson Laboratory; Bar Harbor, ME, USA). Assigned estimated prevalence of each is < 0.05% in an MP terms for mouse models of genes overlapping rare ethnically matched control population. These CNVs and inherited autism CNVs were extracted from MGI. overlapped a total of 382 genes excluding olfactory If a MP term was associated with a gene, all parental receptor genes. In all, 14 genes had CNVs in 5 MP terms were assigned to the gene as well. Fisher’s unrelated individuals, and the remaining 368 genes exact test was employed to measure the enrichment of had CNVs in p4 individuals. This autism-enriched, autism CNV genes annotated with a given MP term CNV-associated gene set (autism gene set) was compared with all genes annotated with a MP term. analyzed to determine if it was enriched for particular The P-values were used to rank the significance of biological or disease processes. enrichments. Gene ontology analysis revealed significant enrich- ment of the autism gene set for a number of biological CNV validation by quantitative PCR processes (BPs) after correction for multiple Universal Probe Library (UPL; Roche, Indianapolis, testing (Table 1 and Supplementary Table 2). Many of IN, USA) probes were selected using the ProbeFinder these processes were associated with synaptic function v2.41 software (Roche). Probes and primer sequences and neurotransmission, including cell–cell signaling selected for each CNV region are listed in Supple- (P = 1.97 Â 10À7), transmission of nerve impulse mentary Table 12. Quantitative PCR was performed (P = 7.06 Â 10À5), synaptic transmission (P = 8.05 Â 10À5), on an ABI 7500 Real Time PCR Instrument or on an neuron adhesion (P = 2.06 Â 10À3), central nervous sys- ABI Prism 7900HT Sequence Detection System tem development (P = 4.30 Â 10À3), regulation of neuro- (Applied Biosystems, Foster City, CA, USA). Each transmitter levels (P = 4.88 Â 10À3), sample was analyzed in quadruplicate in a 10 ml secretion (P = 1.30 Â 10À2)andneurotransmitter trans- reaction mixture (100 nM probe, 200 nM each primer, port (P = 2.95 Â 10À2). Several cellular component (CC) 1 Â Platinum Quantitative PCR SuperMix-Uracil- terms were similarly enriched in the autism gene set,

Molecular Psychiatry Rare inherited structural variants in autism X Gai et al

c 405

7 6 6 2 3 3 3 3 3 2 3 3 À4

À À À À À À À À À À À À including presynaptic membrane (P = 5.65  10 )and À3 10 10 10 10 10 10 10 10 10 10 10 10 synapse (P = 1.11  10 ) (Table 2 and Supplementary             Table 3). Both the BP and CC terms of significance were Adjusted

6.14 4.98 4.98 1.17 3.22 2.09 1.33 1.78 1.78 1.50 1.05 6.43 represented by sizable and only modestly overlapping significance gene sets (BP: n = 114 unique genes, mean = 3.2 terms/

b gene; CC: n = 18 unique genes, mean = 1.4 terms/gene), 9 8 8 4 4 5 5 5 5 4 5 4

À À À À À À À À À À À À indicating considerable independence in implication of 10 10 10 10 10 10 10 10 10 10 10 10 individual terms (Supplementary Figure 1). Autism CNV Â Â Â Â Â Â Â Â Â Â Â Â genes frequently associated with terms related to synapse function and neurotransmission included previous Significance candidates CNTN4, NRXN1 and PARK2, as well as genes CNTN6, CTNND2, GRIN2A, NCAM2, GRM7, a

3 6.69 GRM8 and RCAN1. 523633 3.88 24 5.60 12 6.31 1.63 10 8.60 587512 6.77 8.80 2.99 3.36 137109 4.21 6.64 To demonstrate the specificity of the approach, the CNVs Genes in < 0.05 in both the discovery and replication

P same CNV inclusion criteria and analysis strategies were re-applied using a large set of parent–child trios c 2 3 3 3 2 2 2 2 3 2 3 2

À À À À À À À À À À À À for congenital heart disease. The enriched pathway

10 10 10 10 10 10 10 10 10 10 10 10 spectrum was almost entirely dissimilar in the             cardiovascular cohort, and no function classes con- Adjusted

3.53 8.05 8.05 8.82 4.73 3.28 2.87 2.70 9.54 4.73 8.82 2.87 sistent with neuropsychiatric function were found significance significant, even without application of a multiple

b test correction (HM Xie et al., in preparation). 3 5 5 5 3 3 4 4 4 3 5 4 À À À À À À À À À À À À To replicate the functional analyses, we applied the 10 10 10 10 10 10 10 10 10 10 10 10 Â Â Â Â Â Â Â Â Â Â Â Â same CNV inclusion criteria and methods as used for the discovery cohort to 593 additional idiopathic

Significance autism subjects of European descent, and 1109 corresponding parental samples, from AGRE. These

a were compared with CNVs from 2026 healthy

37 8.80 7 1.15 8 1.19 5.53 individuals comprising the CHOP healthy control 221918 1.32 13 4.01 2.17 7259 2.63 32 6.45 43 4.05 2.83 7.34 15

Genes cohort. A total of 392 inherited CNVs (131 duplica- in CNVs tions, 256 heterozygous deletions, 1 homozygous deletion and 4 hemizygous X chromosome deletions) c 7 5 5 3 3 3 3 3 2 2 2 2

À À À À À À À À À À À À were exclusive to 270 of the autistic subjects

10 10 10 10 10 10 10 10 10 10 10 10 (Supplementary Table 1). CNV transmission was             again independent of parent gender (205/392 with Adjusted

1.97 7.06 8.05 2.06 4.30 4.88 7.08 8.37 1.30 2.27 2.45 2.95 paternal inheritance; P=1). Excluding olfactory significance receptors, these CNVs overlapped 387 unique genes,

b of which only 43 (11.1%) were present in the 9 6 6 4 4 4 4 4 3 3 3 3 À À À À À À À À À À À À discovery autism CNV gene set. 10 10 10 10 10 10 10 10 10 10 10 10

            A total of 12 BP and 3 CC terms reaching corrected significance in the discovery cohort were also

Significance enriched in the replication cohort after multiple test correction. As with the discovery cohort, the genes

a with CNVs in these enriched terms showed only moderate overlap between terms (Supplementary

Genes Figure 1). Furthermore, the autism CNV gene sets in CNVs representing the replicated enriched terms demon- strated only a small overlap between the two cohorts, with only 23.7% (19/99) and 29.4% (5/17) of replica- tion genes also present in the discovery cohort for BP and CC, respectively. For example, of 21 and 18 genes assigned to the BP term synaptic transmission with -values using the Benjamini–Hochberg correction; only categories reaching a corrected statistical significance of

P autism-specific CNVs in the discovery and replication cohorts, respectively, only 6 genes had CNVs in both cohorts. Among the genes with autism-specific CNVs found in both cohorts were known autism candidate -values.

P genes CTNND2, CNTN4, NRXN1 and PARK2, along GO biological process terms significantly enriched in genes overlapping inherited rare autism CNVs with glutamate receptors GRIN2A, GRIN2C and GRM7 (Table 5). The discovery and replication cohorts were then Single-test Number of genes overlapping CNVs exclusively in the autismMultiple cohort test-corrected assigned to the specified GO term. Table 1 GO termCell–cell signalingTransmission of nerve impulseSynaptic transmissionNeuron adhesionCentral nervous system developmentRegulation of neurotransmitter levelsDevelopmental processMulticellular organismal developmentNeurotransmitter 23 secretion 16Post-translational modificationProtein 8 modification process 64Neurotransmitter transport 2.23 37 21 3.80 35Abbreviations: CNV, 4.95 copy-numbera variation; Discovery GO, 9.36 cohort gene ontology. b 2.08 3 3.06 80 3.58 c 6 41 1.29 7.50 7 1.62 4.03 5.29 Replication cohort Combined cohort cohorts are shown. Bold values indicate ancombined adjusted significance of < 0.05. into a single set of 1224 samples. Analysis

Molecular Psychiatry Rare inherited structural variants in autism X Gai et al

406 c 4 5 3

À À À of this combined cohort yielded results that were

10 10 10 consistent with the initial findings. Autism candidate À2 Â Â Â genes NRXN1 (P=1.46 Â 10 ) and CNTNAP2 À3

Adjusted (P=3.51 Â 10 ) demonstrated individual gene signif- 1.10 1.49 2.36 significance icance but failed to retain significance after multiple b

6 8 5 test correction (Table 4). In functional enrichment À À À analysis, all BP and CC terms enriched in the 10 10 10 discovery cohort retained multiple test-corrected    significance in the combined cohort. The most significant functions were BP terms cell–cell signaling < 0.05 in both the discovery Significance À7 P (P=6.14  10 ), transmission of nerve impulse a (P=4.98  10À6) and synaptic transmission À6 85 1.40 5.96

27 9.45 (P=4.98 Â 10 ) (Table 1 and Supplementary Table 2);

Genes and CC terms synapse (P=1.49 Â 10À5), presynaptic in CNVs membrane (P=1.10 Â 10À4) and dystrophin-asso- c

3 3 2 À3

À À À ciated glycoprotein complex (P=2.36 Â 10 ) (Table

10 10 10 2 and Supplementary Table 3). There was again broad    representation of genes among significant terms

Adjusted (Supplementary Figure 1). 8.19 3.54 1.14 significance As a complementary approach, we determined b

4 5 3 whether our autism CNV gene sets were preferentially À À À associated with particular phenotypes assigned to 10 10 10 Â Â Â mouse transgenic and null mice previously devel- oped for these genes. For this analysis, we used MP 26

Significance term assignments from the MGI group that have

a been reported for 113 of the 382 discovery cohort autism CNV genes, and 98 of 387 such genes in the 43 7.96 1.33 14 8.52 replicate cohort. Of 6406 functional terms, 12 were Genes

in CNVs significantly enriched in the autism gene set both for

c the discovery and replication cohorts, and subse- 4 3 3

À À À quently in the combined cohort (Table 3 and

10 10 10 Supplementary Table 4). Many of the terms were    highly consistent with an autistic phenotype, includ-

Adjusted ±

5.65 1.11 1.87 ing abnormal ( CNS) synaptic transmission, abnor- significance mal motor capabilities/coordination/movement, b

6 5 4 abnormal motor coordination/balance, reduced À À À

10 10 10 NMDA-mediated synaptic currents, abnormal

   impulse conducting system conduction, decreased startle reflex, abnormal miniature excitatory postsy-

Significance naptic currents and abnormal spatial learning. After correction for multiple testing, two closely associated a terms retained significance in the replication and combined cohorts (abnormal synaptic transmission, CNVs À2 Genes in combined P=2.52 Â 10 ; abnormal CNS synaptic transmission, combined P=2.67 Â 10À2). The gener- ally lower significance values than observed for the BP and CC analyses likely reflect the relatively small number of genes with well-annotated phenotypes; however, the general trend of enrichment for synaptic function and neurotransmission was consistent with our previous analyses. Only 5 of 33 (15.2%) autism -values using the Benjamini–Hochberg correction; only categories reaching a corrected statistical significance of

P CNV genes within the 12 significant MP assignments were represented in both the discovery and replica- tion cohorts, again suggesting widespread genetic heterogeneity.

-values. To determine the relative contribution of the P unreported autism candidate genes to the enriched GO cellular component terms significantly enriched in genes overlapping inherited rare autism CNVs functional groups, each functional analysis was repeated after excluding a set of 205 genes either previously reported as autism risk candidates, or that Single-test Number of genes overlapping CNVs exclusively in theMultiple autism test-corrected cohort assigned to the specified GO term. Table 2 GO termPresynaptic membraneSynapseDystrophin-associated glycoprotein complexAbbreviations: CNV, copy-numbera 4 variation; GO, gene ontology. b c 1.04 6 7.61 Discovery cohort 16 2.85 Replication cohort Combined cohort and replication cohorts are shown. Bold values indicate an adjusted significance of < 0.05. reside in a genomic rearrangement associated with

Molecular Psychiatry Rare inherited structural variants in autism X Gai et al

c 407 2 2

À À autism susceptibility (Supplementary Table 5). De-

10 10 spite the reduced power of these analyses, most BP Â Â

0.11 0.22 0.37 0.39 0.39 0.40 0.40 0.40 0.44 0.57 and all CC and MP terms retained multiple test Adjusted

2.52 2.67 corrected significance (BP and CC) or uncorrected significance significance (MP) in both the discovery and replica- b

5 5 4 4 3 3 3 3 3 3 2 2 tion cohorts (Supplementary Tables 6–11). Several À À À À À À À À À À À À 10 10 10 10 10 10 10 10 10 10 10 10 additional terms consistent with autism pathophy- Â Â Â Â Â Â Â Â Â Â Â Â siology were newly significant, including cell–cell adhesion (BP), response to mechanical stimulus (BP), Significance cell junction (CC) and cytoplasmic vesicle membrane

a (CC). These results support representation of pre- viously unattributed autism risk in the enriched 3329 1.76 3.73

Genes autism gene sets. in CNVs To search for common variants, we determined the c

2 2 extent to which any single genes were represented by À À CNVs in autism cases relative to matched healthy 10 10

  individuals. We first compared our CNV set with 0.470.260.470.260.43 630.47 30.28 340.31 2.30 30.43 6.12 80.47 1.31 4 2.39 6 57 2.44 4.84 5 5.04 9 3.15 1.17 7.25

Adjusted findings from previous autism genomic studies, 1.97 4.00 significance including a recent study by Glessner and collea- 11,27,28 b gues that reported CNV findings for a set of 5 5 2 3 2 3 2 2 3 2 2 2 À À À À À À À À À À À À autism spectrum disorder subjects partially overlap- 10 10 10 10 10 10 10 10 10 10 10 10

            ping our cohort. Combining the two cohorts, 25 inherited CNVs overlapped one or more previously

Significance implicated candidate genes (NRXN1, CNTN4,

a SUMF1, DISC1, PARK2, AUTS2 and CNTNAP2)or genomic regions (Table 4 and Supplementary Table 5). No gene had evidence of over-representation Genes

in CNVs in the ASD cases, although formal significance testing c -aspartate. was not possible because of the unavailability D of inheritance status of CNVs in the control groups. The majority of CNVs detected in these genes did 0.370.400.170.370.17 210.17 180.40 290.37 2.26 20.40 9.17 160.47 4.34 20.47 2.73 40.47 4.51 2 7.39 4 31 3.00 4.40 3 8.41 6 1.09 3.13 4.88 not substantially overlap CNVs detected in healthy -methyl Adjusted N significance controls. When expanding this approach genome-wide and b 3 3 4 3 4 4 2 3 2 2 2 2

À À À À À À À À À À À À considering all ASD subject CNVs regardless of 10 10 10 10 10 10 10 10 10 10 10 10 inheritance status and presence in controls, only a             single RNA gene on 14q32, LOC10013346, was significantly enriched for CNVs in ASD cases (27 Significance overall CNVs versus none in the discovery control

a cohort, and 29 versus 4 in the replication control cohort). However, CNVs disrupting this gene in CNVs

Genes in autism cases also spanned the adjoining immunoglo- bulin heavy chain , suggesting that these CNVs were likely cell culture artifacts. Moreover, only one of these CNVs in the discovery cohort and two in the replication cohort were present in parental samples. No other individual genes were significantly over- represented in either autism cohort. CNVs overlapping five genomic regions (CTNND2, GRIN2A, NCAM1, KCNE2/FAM165B/KCNE1/RCAN1 and ICA1/NXPH1) were selected for experimental -values using the Benjamini–Hochberg correction; the 12 most significant terms in the combined cohort are listed. Bold values indicate an

P validation of CNVs based upon the following criteria: higher frequency seen in autism cases relative to controls; evidence for involvement in neurotransmis- sion processes; and little or no previous implication

-values. for autism risk. As a validation control, we included P two autism probands with known CNTN4 deletions Mouse phenotype ontology terms significantly enriched in genes overlapping inherited rare autism CNVs and one proband with a known CNTN4 duplication.28 All 17 CNVs identified in probands and parental samples in these loci were experimentally validated Single-test Number of genes overlapping CNVs exclusively inMultiple the test-corrected autism cohort assigned to the specified phenotype term. adjusted significance of < 0.05. Table 3 Phenotype termAbnormal synaptic transmissionAbnormal CNS synaptic transmissionAbnormal motor capabilities/coordination/movement 41 3.30 15 17 8.72 a 7.19 b Discoveryc cohort Replication cohort Combined cohort Small hair folliclesAbnormal motor coordination/balanceReduced NMDA-mediated synaptic currentsAbnormal impulse conducting systemAbnormal conduction QT intervalDecreased startle reflexAbnormal nervous system physiologyAbnormal miniature excitatory postsynapticAbnormal currents spatial 5 learning 3 24Abbreviations: CNS, central 3 nervous 1.19 system; CNV, copy-number variation; 5.49 NMDA, 5.54 4.45 32(Supplementary 2 4.61 3 3.59 4 7 7.34 1.36 Figure 3.33 2).

Molecular Psychiatry Rare inherited structural variants in autism X Gai et al 408 Table 4 Analysis of inherited CNVs identified in genes previously implicated in autism

Gene Location Evidence for Discovery Replication Combined

implicationa Casesb Controls Casesb Controls Casesb Controls

NRXN1 2p16.3 Rare variant 4 2 1 6 5 8 CNTN4 3p26.3 Rare variant 1 1 1 1 2 2 SUMF1 3p26.1 Rare variant 0 0 1 0 1 0 PARK2 6q26 Rare variant 5 15 7 17 12 32 AUTS2 7q11.22 Rare variant 1 0 0 1 1 1 ASTN2 9q33.1 Rare variant 0 1 1 4 1 5 CNTNAP2 7q36.1 Association 2 0 0 0 2 0 DISC1 1q42.2 Association 0 2 1 5 1 7

Abbreviation: CNV, copy-number variation. aRefer to Supplementary Table 1 for gene implication references. bNumber of cases containing a CNV overlapping a listed gene or region.

Discussion social learning, contextual and cued fear conditioning and fear-potentiated startle.41 KCNMA1 encodes a Our functional enrichment analysis of the AGRE calcium-activated potassium channel linked to gluta- cohort identified a number of genes enriched for matergic synapses,42 whereas mice null for this gene inherited CNVs in ASD subjects, several of which are exhibit cerebellar learning deficiencies.43 Disruption known to have functions consistent with ASD of KCNMA1 was identified in an autistic subject with etiology. Of particular interest were the ionotropic a balanced translocation t(9;10)(q23;q22).44 glutamate receptor GRIN2A and CNVs spanning both During review of this manuscript, Pinto et al.45 the islet cell autoantigen ICA1 (islet cell autoantigen 1) reported an analysis of rare CNVs in ASD cases and the adjacent a-neurexin ligand NXPH1 (neurex- relative to healthy controls. Both studies are consis- ophilin-1), as these genes were disrupted by CNVs in tent with the presence of many novel and previously subjects from both autism cohorts and had no reported loci, each with only a small number of evidence of structural variation in healthy controls distinct CNVs observed in the subjects considered. In (Supplementary Figure 3). ICA1 is involved in comparing our results with the study of Pinto et al,45 glutamate receptor-mediated transmission,29 whereas there is little overlap between the sets of individual the neurexophilin NXPH1 is a ligand for the protein genes suggested as ASD candidates. These differences product of the autism candidate gene NRXN1.30 may be because of our exclusive focus on inherited Barnby et al.31 implicated GRIN2A (glutamate recep- events or might further support the presence of a tor, ionotropic, N-methyl D-aspartate 2A) as an autism substantial pool of potential ASD risk loci. Encoura- candidate by gene-specific association, and further gingly, pathway enrichment findings between the two evidence suggests involvement in susceptibility to studies show greater consistency, with enrichment and attention deficit hyperactivity found in cell adhesion, central nervous system disorder.32–34 GRIN2A-null mice exhibit impairments development, synaptic function and glutamate- in contextual memory and synaptic plasticity.35 mediated neuronal signaling processes. A follow-up Also of special interest were 12 autism CNV genes study comparing inherited and de novo CNV func- associated with significantly enriched functions each tional attributes would allow determination of the in the BP, CC and MP analyses (Table 5). Three of degree to which each CNV class is functionally these genes (DOC2A, NRXN1 and PARK2) have distinct. previous support as autism susceptibility loci and Our results are consistent with the hypothesis that validate our approach. The receptor tyrosine kinase inherited autism risk is genetically highly heteroge- ERBB4 controls maturation and plasticity of glutama- neous, both from our observations of a lack of autism- tergic synapses, myelination and dopaminergic func- specific CNVs overlapping any single gene in our tion, and has been suggested as having a role in analysis, and the lack of overlap between gene sets schizophrenia risk.36 GRIN2A (described above), represented by the same autism-enriched functional GRIN2C and GRM7 function as glutamate receptors, terms in our two cohorts. Better estimates of the whereas UNC13A-null mice demonstrate impairment number of genes that can potentially confer risk, and of glutamatergic maturation.37 The the number of risk alleles typically required to confer neurexin homolog NRXN3 is not well characterized sufficient risk in an individual, await higher-resolu- but has roles in synaptic plasticity and synaptic tion genomic studies in larger cohorts, as well as vesicle .38 SYN3 functions in synaptogen- platforms that more fully consider common CNVs. esis and the modulation of neurotransmitter re- Our method, which has yielded results similarly lease,39,40 and SYN3-null mice exhibit deficits in consistent with disease-specific pathophysiology for

Molecular Psychiatry Rare inherited structural variants in autism X Gai et al 409 Table 5 Autism CNV genes implicated in significant functional processes in both the discovery and replication cohorts

Replicated biological Replicated cellular Replicated mouse Genes implicated in process genesa,b component genesc phenotype genesd all functional typese

CALCR GRIN2A CTNND2 DOC2A CDH13 NRXN1 GRIN2A ERBB4 CNTN4 PARK2 NRXN1 GRIN2A CNTN6 SGCZ PARK2 GRIN2C CTNND2 SNTG2 PTPRD GRM7 GRIN2A KCNMA1 ICA1 NRXN1 IMMP2L NRXN3 KCNE1 PARK2 MYOM2 SYN3 NCAM2 UNC13A NRXN1 UTRN PARK2 PCSK6 PTPRD RAF1 RCAN1 SDK1 SNTG2

Abbreviation: CNV, copy-number variation. aPreviously implicated autism candidate genes are underlined. bAutism CNV genes for which at least one autism-specific CNV overlapping the gene was identified in both the discovery and replication cohorts, and which has also been assigned to a biological process functional class demonstrating significant enrichment in both the discovery and replication cohorts. cAutism CNV genes for which at least one autism-specific CNV overlapping the gene was identified in both the discovery and replication cohorts, and which has also been assigned to a cellular component functional class demonstrating significant enrichment in both the discovery and replication cohorts. dAutism CNV genes for which at least one autism-specific CNV overlapping the gene was identified in both the discovery and replication cohorts, and which has also been assigned to a mouse phenotype functional class demonstrating significant enrichment in the combined cohort. eAutism CNV genes for which at least one autism-specific CNV overlapping the gene was identified in the combined cohort, and which has also been assigned to at least one biological process, cellular component and mouse phenotype functional class demonstrating significant enrichment in the combined cohort. attention deficit hyperactivity disorder,18 may be a Conflict of interest viable approach for identifying risk signatures across functionally related processes. Caveats of this ap- The authors declare no conflict of interest. proach include insufficient specificity to identify individual candidate genes within moderate cohort sizes, and reliance upon incomplete and potentially Acknowledgments biased gene knowledge. In addition, our approach only considered ASD variants not found in controls, We acknowledge the resources provided by the which may oversimplify the biological role that Autism Genetic Resource Exchange (AGRE) consor- certain variants might contribute to ASD etiology tium and participating AGRE families, and members and is also dependent upon control cohort size. of the Center for Applied Genomics at the Children’s However, our results do consistently and indepen- Hospital of Philadelphia for making genotype data dently implicate possible pathways, functions and available. The Autism Genetic Resource Exchange is a gene sets that can be readily investigated by follow-up program of Autism Speaks and is supported, in part, genomic and experimental investigation. Recent by Grant 1U24MH081810 from the National Institute studies have implicated individual genes associated of Mental Health to Clara M Lajonchere (PI). This with synaptic function, including several neuroli- work was supported in part by Pennsylvania Depart- gands, as well as indications of glutamatergic path- ment of Health Grant SAP 4100037707 (PSW), NIH way involvement.8,13 Our current results extend these grants GM081519 (THS) and P30HD026979 (MD and findings and, more importantly, for the first time XG) and the Seaver Foundation (JDB). We thank provide statistical evidence that synaptic function Marcella Devoto for helpful comments. We also thank and glutamate-mediated neurotransmission are con- all the participating individuals and families of the tributing factors in autism etiology. AGRE study and for all the control families at the

Molecular Psychiatry Rare inherited structural variants in autism X Gai et al 410 Children’s Hospital of Philadelphia for making this 19 Gai X, Perin JC, Murphy K, O0Hara R, D0Arcy M, Wenocur A et al. study possible. CNV Workshop: an integrated platform for high-throughput copy Author contributions: THS, HH, JDB and BG collected number variation discovery and clinical diagnostics. BMC Bioin- formatics 2010; 11:74. samples and contributed phenotype data. XG, HMX, 20 Conlin LK, Thiel BD, Bonnemann CG, Medne L, Ernst LM, Zackai DDG, MD, JDB, JE and PSW designed the experi- EH et al. Mechanisms of mosaicism, chimerism and uniparental ments. XG, HMX, JCP, KM, ASW, MD, RJO and PSW disomy identified by single nucleotide polymorphism array performed the experiments and analyzed analysis. Hum Mol Genet 2010; 19: 1263–1275. 21 Hasin Y, Olender T, Khen M, Gonzaga-Jauregui C, Kim PM, Urban the data. XG, JDB, JE and PSW reviewed the data, AE et al. High-resolution copy-number variation map reflects assisted with interpretation of the data and edited the human olfactory receptor diversity and evolution. PLoS Genet manuscript. 2008; 4: e1000249. 22 Nozawa M, Nei M. Evolutionary dynamics of olfactory receptor genes in Drosophila species. Proc Natl Acad Sci USA 2007; 104: References 7122–7127. 23 Young JM, Endicott RM, Parghi SS, Walker M, Kidd JM, Trask BJ. 1 Association AP. Diagnostic and Statistical Manual of Mental Extensive copy-number variation of the human olfactory receptor Disorders. American Psychiatric Association: Washington, DC, gene family. Am J Hum Genet 2008; 83: 228–242. 2004. 24 Hellemans J, Mortier G, De Paepe A, Speleman F, Vandesompele J. 2 Gupta AR, State MW. Recent advances in the genetics of autism. qBase relative quantification framework and software for manage- Biol Psychiatry 2007; 61: 429–437. ment and automated analysis of real-time quantitative PCR data. 3 Muhle R, Trentacoste SV, Rapin I. The genetics of autism. Genome Biol 2007; 8: R19. Pediatrics 2004; 113: e472–e486. 25 Geschwind DH, Sowinski J, Lord C, Iversen P, Shestack J, Jones P 4 Folstein S, Rutter M. Genetic influences and infantile autism. et al. The autism genetic resource exchange: a resource for the Nature 1977; 265: 726–728. study of autism and related neuropsychiatric conditions. Am J 5 Ronald A, Happe F, Bolton P, Butcher LM, Price TS, Wheelwright Hum Genet 2001; 69: 463–466. S et al. Genetic heterogeneity between the three components of the 26 Smith CL, Goldsmith CA, Eppig JT. The mammalian phenotype autism spectrum: a twin study. J Am Acad Child Adolesc ontology as a tool for annotating, analyzing and comparing Psychiatry 2006; 45: 691–699. phenotypic information. Genome Biol 2005; 6:R7. 6 Alarcon M, Abrahams BS, Stone JL, Duvall JA, Perederiy JV, Bomar 27 Bucan M, Abrahams BS, Wang K, Glessner JT, Herman EI, JM et al. Linkage, association, and gene-expression analyses Sonnenblick LI et al. Genome-wide analyses of exonic copy identify CNTNAP2 as an autism-susceptibility gene. Am J Hum number variants in a family-based study point to novel autism Genet 2008; 82: 150–159. susceptibility genes. PLoS Genet 2009; 5: e1000536. 7 Arking DE, Cutler DJ, Brune CW, Teslovich TM, West K, Ikeda M 28 Glessner JT, Wang K, Cai G, Korvatska O, Kim CE, Wood S et al. et al. A common genetic variant in the neurexin superfamily Autism genome-wide copy number variation reveals ubiquitin and member CNTNAP2 increases familial risk of autism. Am J Hum neuronal genes. Nature 2009; 459: 569–573. Genet 2008; 82: 160–164. 29 Cao M, Xu J, Shen C, Kam C, Huganir RL, Xia J. PICK1-ICA69 8 Szatmari P, Paterson AD, Zwaigenbaum L, Roberts W, Brian J, Liu heteromeric BAR domain complex regulates synaptic targeting XQ et al. Mapping autism risk loci using genetic linkage and and surface expression of AMPA receptors. J Neurosci 2007; 27: chromosomal rearrangements. Nat Genet 2007; 39: 319–328. 12945–12956. 9 Wang K, Zhang H, Ma D, Bucan M, Glessner JT, Abrahams BS et al. 30 Missler M, Hammer RE, Sudhof TC. Neurexophilin binding to Common genetic variants on 5p14.1 associate with autism alpha-. A single LNS domain functions as an indepen- spectrum disorders. Nature 2009; 459: 528–533. dently folding ligand-binding unit. J Biol Chem 1998; 273: 34716– 10 Weiss LA. Autism genetics: emerging data from genome-wide 34723. copy-number and single nucleotide polymorphism scans. Expert 31 Barnby G, Abbott A, Sykes N, Morris A, Weeks DE, Mott R et al. Rev Mol Diagn 2009; 9: 795–803. Candidate-gene screening and association analysis at the 11 Weiss LA, Arking DE, Daly MJ, Chakravarti A. A genome-wide autism-susceptibility locus on chromosome 16p: evidence of linkage and association scan reveals novel loci for autism. Nature association at GRIN2A and ABAT. Am J Hum Genet 2005; 76: 2009; 461: 802–808. 950–966. 12 Merikangas AK, Corvin AP, Gallagher L. Copy-number variants in 32 Kebir O, Tabbane K, Sengupta S, Joober R. Candidate genes and neurodevelopmental disorders: promises and challenges. Trends neuropsychological phenotypes in children with ADHD: review of Genet 2009; 25: 536–544. association studies. J Psychiatry Neurosci 2009; 34: 88–101. 13 Betancur C, Sakurai T, Buxbaum JD. The emerging role of synaptic 33 Lang UE, Puls I, Muller DJ, Strutz-Seebohm N, Gallinat J. cell-adhesion pathways in the pathogenesis of autism spectrum Molecular mechanisms of schizophrenia. Cell Physiol Biochem disorders. Trends Neurosci 2009; 32: 402–412. 2007; 20: 687–702. 14 Zoghbi HY. Postnatal neurodevelopmental disorders: meeting at 34 Tang J, Chen X, Xu X, Wu R, Zhao J, Hu Z et al. Significant linkage the synapse? Science 2003; 302: 826–830. and association between a functional (GT)n polymorphism in 15 Shaikh TH, Gai X, Perin JC, Glessner JT, Xie H, Murphy K et al. promoter of the N-methyl-D-aspartate receptor subunit gene High-resolution mapping and analysis of copy number variations (GRIN2A) and schizophrenia. Neurosci Lett 2006; 409: 80–82. in the : a data resource for clinical and research 35 Sprengel R, Suchanek B, Amico C, Brusa R, Burnashev N, applications. Genome Res 2009; 19: 1682–1690. Rozov A et al. Importance of the intracellular domain of 16 Hakonarson H, Grant SF, Bradfield JP, Marchand L, Kim CE, NR2 subunits for NMDA receptor function in vivo. Cell 1998; 92: Glessner JT et al. A genome-wide association study identifies 279–289. KIAA0350 as a type 1 diabetes gene. Nature 2007; 448: 591–594. 36 Mei L, Xiong WC. Neuregulin 1 in neural development, synaptic 17 Yang N, Li H, Criswell LA, Gregersen PK, Alarcon-Riquelme ME, plasticity and schizophrenia. Nat Rev Neurosci 2008; 9: 437–452. Kittles R et al. Examination of ancestry and ethnic affiliation 37 Augustin I, Rosenmund C, Sudhof TC, Brose N. Munc13-1 is using highly informative diallelic DNA markers: application essential for fusion competence of glutamatergic synaptic vesicles. to diverse and admixed populations and implications for Nature 1999; 400: 457–461. clinical epidemiology and forensic medicine. Hum Genet 2005; 38 Missler M, Zhang W, Rohlmann A, Kattenstroth G, Hammer RE, 118: 382–392. Gottmann K et al. Alpha-neurexins couple Ca2 þ channels to 18 Elia J, Gai X, Xie HM, Perin JC, Geiger E, Glessner JT et al. Rare synaptic vesicle exocytosis. Nature 2003; 423: 939–948. structural variants found in attention-deficit hyperactivity dis- 39 Kao HT, Porton B, Czernik AJ, Feng J, Yiu G, Haring M et al. A order are preferentially associated with neurodevelopmental third member of the synapsin gene family. Proc Natl Acad Sci USA genes. Mol Psychiatry 2010; 15: 637–646. 1998; 95: 4667–4672.

Molecular Psychiatry Rare inherited structural variants in autism X Gai et al 411 40 Feng J, Chi P, Blanpied TA, Xu Y, Magarinos AM, Ferreira A et al. 44 Laumonnier F, Roger S, Guerin P, Molinari F, M’Rad R, Cahard D Regulation of neurotransmitter release by synapsin III. J Neurosci et al. Association of a functional deficit of the BKCa channel, a 2002; 22: 4372–4380. synaptic regulator of neuronal excitability, with autism and mental 41 Porton B, Rodriguiz RM, Phillips LE, Gilbert JWT, Feng J, retardation. Am J Psychiatry 2006; 163: 1622–1629. Greengard P et al. Mice lacking synapsin III show abnormalities 45 Pinto D, Pagnamenta AT, Klei L, Anney R, Merico D, Regan R et al. in explicit memory and conditioned fear. Genes Brain Behav 2010; Functional impact of global rare copy number variation in autism 9: 257–268. spectrum disorders. Nature 2010; 466: 368–372. 42 Misonou H, Menegola M, Buchwalder L, Park EW, Meredith A, Rhodes KJ et al. Immunolocalization of the Ca2 þ -activated K þ channel Slo1 in axons and nerve terminals of mammalian brain This work is licensed under the Creative and cultured neurons. J Comp Neurol 2006; 496: 289–302. Commons Attribution-NonCommercial- 43 Sausbier M, Hu H, Arntz C, Feil S, Kamm S, Adelsberger H et al. Cerebellar ataxia and Purkinje cell dysfunction caused by Ca2 þ - No Derivative Works 3.0 Unported License. To view activated K þ channel deficiency. Proc Natl Acad Sci USA 2004; a copy of this license, visit http://creativecommons. 101: 9474–9478. org/licenses/by-nc-nd/3.0/

Supplementary Information accompanies the paper on the Molecular Psychiatry website (http://www.nature.com/mp)

Molecular Psychiatry