Nature Neuroscience, advance online publication

ARTICLES

Genotype to phenotype relationships in autism spectrum disorders

Jonathan Chang1,2,5, Sarah R Gilman1,2,5, Andrew H Chiang1,2, Stephan J Sanders3,4 & Dennis Vitkup1,2

1 Department of Systems Biology, Center for Computational Biology and Bioinformatics, Columbia University, New York, New York, USA. 2 Department of Biomedical Informatics, Columbia University, New York, New York, USA. 3 Department of Genetics, Yale University, New Haven, Connecticut, USA. 4 Department of Psychiatry, University of California, San Francisco, California, USA. 5 These authors contributed equally to this work. Correspondence should be addressed to D.V. ([email protected]).

Received 12 August; accepted 26 November; published online 22 December 2014; doi:10.1038/nn.390

© 2014 Nature America, Inc. All rights reserved.

Autism spectrum disorders (ASDs) are characterized by phenotypic and genetic heterogeneity. Our analysis of functional networks perturbed in ASD suggests that both truncating and nontruncating de novo mutations contribute to autism, with a bias against truncating mutations in early embryonic development. We find that functional mutations are observed preferentially in genes likely to be haploinsufficient. Multiple cell types and brain areas are affected, but the impact of ASD mutations appears to be strongest in cortical interneurons, pyramidal neurons and the medium spiny neurons of the striatum, implicating cortical and corticostriatal brain circuits. In females, truncating ASD mutations on average affect genes with 50–100% higher brain expression than in males. Our results also suggest that truncating de novo mutations play a smaller role in the etiology of high-functioning ASD cases. Overall, we find that stronger functional insults usually lead to more severe intellectual, social and behavioral ASD phenotypes.

ASDs are associated with a wide range of cognitive and behavioral abnormalities1,2. It is estimated that many hundreds of genes may contribute to autism and related phenotypes. Functionally important de novo mutations associated with ASDs are individually rare, but their collective contribution to sporadic ASD cases is likely to be substantial as a result of a large number of target genes. Several recent studies have identified a large collection of de novo mutations associated with ASDs, including copy number variations (CNVs) and single nucleotide variations (SNVs). These studies demonstrated that truncating SNVs (such as nonsense, splice site and frameshift mutations) and large CNVs are likely to have a causal role in ASDs.

To explore the underlying biological pathways, we previously developed a computational approach (NETBAG+) that searches for cohesive biological networks using a diverse collection of disease-associated genetic variants12,13. Using network-based approaches, we and others recently found that genetic variations associated with ASDs and other psychiatric disorders converge on several biological networks that are involved in neurogenesis and synaptic function.

In parallel with the identification of genetic variations, complementary sets of brain-related functional data and disease-associated phenotypic resources are rapidly being accumulated. These include a comprehensive database of gene expression across different cell types, distinct anatomical brain regions and developmental stages. In addition, resources such as the Simons Simplex Collection (SSC) have assembled a large compendium of ASD-related phenotypic data, including intelligence and social phenotypic scores. We focused our analyses on genes that were implicated by our network-based computational approach and on a set of all de novo truncating mutations from several recent studies. These two independent approaches provided us with complementary sets enriched in causal ASD mutations. We investigated the temporal, spatial and cell-specific expression profiles and functional properties of implicated genes. We also explored how properties of autism-associated genes affect ASD phenotypes.

RESULTS
Functional gene networks affected by de novo mutations
To elucidate functional networks perturbed in ASD, we applied NETBAG+ to a set of genes affected by de novo mutations observed in autistic patients from the SSC (refs. 3,6–8). All of the mutations used as the input for our analyses were obtained using genome-wide methodologies and therefore are not biased by any preexisting hypotheses of ASD etiology. The combined input data contained a total of 991 unique genes from 624 independent genomic loci, including 580 unique genes with de novo SNVs and 434 genes within de novo CNVs (Supplementary Table 1); we note that the number of genomic loci used was considerably larger than the 47 loci considered in our previous analysis of de novo CNV events in autism. We used NETBAG+ to identify a subset of the input genes that are strongly connected in the underlying phenotypic network (Online Methods). The NETBAG+ search revealed a functional network containing 159 genes (P = 0.036; Fig. 1 and Supplementary Table 2), of which 131 genes were affected by de novo SNVs and 31 by de novo CNVs. The network's significance was estimated using random input sets that matched the real data in terms of gene length and network connectivity. Notably, no significant networks were detected using genes associated with the 368 non-synonymous de novo SNVs identified in siblings.
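The significance estimate described above compares the observed network score with scores obtained from random input sets matched on gene length and connectivity. The following is a minimal sketch of that idea only. It does not reproduce the NETBAG+ scoring function; the binning scheme, the function names and the toy values are all assumptions introduced for illustration.

```python
import random

def sample_matched_set(input_genes, gene_bin, genes_by_bin, rng):
    """Draw one random gene per input gene from the same bin.

    A bin is a hypothetical group of genes with similar length and
    network connectivity; the binning itself is not shown here.
    """
    return [rng.choice(genes_by_bin[gene_bin[g]]) for g in input_genes]

def empirical_p_value(observed_score, random_scores):
    """Fraction of random-input scores at least as high as the observed score."""
    hits = sum(1 for s in random_scores if s >= observed_score)
    return hits / len(random_scores)

# Toy illustration with made-up bins and scores (not data from the study).
gene_bin = {"A": 0, "B": 1}
genes_by_bin = {0: ["A", "C", "D"], 1: ["B", "E"]}
rng = random.Random(0)
random_set = sample_matched_set(["A", "B"], gene_bin, genes_by_bin, rng)
p_toy = empirical_p_value(5.0, [1.0, 2.0, 3.0, 6.0])  # one of four scores >= 5.0
```

In practice each random set would be scored with the same network search used for the real input, and the number of random sets limits the smallest P value that can be resolved.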

To explore the biological functions associated with the implicated network, we used DAVID (ref. 19) to identify Gene Ontology (GO) terms that were significantly enriched among network gene annotations (Table 1 and Supplementary Table 3). This analysis identified a diverse set of functions associated with the implicated network, including synaptic function, chromatin modification and calcium channel activity. To better understand the functional relationships between genes in the network, we performed hierarchical clustering of the network genes using the strength of interactions in the phenotypic network as a metric (Supplementary Fig. 1). This analysis identified four major network clusters, each associated with distinct and non-overlapping biological functions (Supplementary Table 3).

One cluster in our network (Fig. 1) contained genes responsible for synapse formation and function. This cluster contained neurexin (NRXN1) and neuroligin (NLGN1), as well as important components (SHANK2, DLG2/DLG4) of the postsynaptic density (PSD) of excitatory synapses. A related cluster included a diverse set of ion channels and receptors. This cluster contained genes (CACNA1B, CACNA1D, CACNA1E, CACNA1S) for subunits of several voltage-dependent calcium channels, which are important in learning and memory. The largest cluster of the implicated network contained genes that are associated with neuronal signaling and migration (DCC, EPHA1/B2), intracellular signaling (MAPK3, EGFR, PTEN, NF1, PRKCA, CTNNB1, MDM2), and the development of neuronal projections and actin cytoskeleton (CYFIP1, TRIO, SPTAN1, FLNA, ACTN4, LIMK1). As we have previously demonstrated, ASD-associated mutations often perturb multiple signaling pathways that converge on the regulation of actin cytoskeleton and other structural processes that are required for neuron migration, cell-cell adhesion and the development of neuronal projections. Many genes in the clusters described above encode proteins related to the growth and function of dendritic spines, which have been implicated in our previous analyses of genetic insults in ASDs and schizophrenia.

The final cluster was primarily related to functions associated with chromatin modifications, chromatin remodeling and transcriptional regulation. Notably, there is growing evidence that chromatin regulatory mechanisms crucially affect various stages of neural development, neuroplasticity and learning. This cluster also contained genes involved in other processes that have been implicated in ASDs: RNA interference (DICER1, AGO1/EIF2C1), translation (EIF4A1, EIF4G1, EIF3G) and splicing (SFPQ, TRA2B).

Figure 1 The network implicated by NETBAG+ based on ASD-associated de novo SNVs and CNVs from recent studies (network is comprised of 159 genes, P = 0.036). Node sizes are proportional to the contributions of each gene to the overall network score and edge widths are proportional to the likelihood that the corresponding gene pair contributes to the same genetic phenotype (Online Methods). For clarity, only the two strongest edges for each gene are shown. Node shapes indicate the corresponding types of mutations: circles represent genes from SNVs, squares represent genes from CNVs and diamonds represent genes affected by both mutation types. The network was divided into four cohesive functional clusters (indicated by node colors) using hierarchical clustering (Supplementary Fig. 1); general biological functions of these clusters, determined using DAVID, are shown (see Supplementary Table 3 for a complete list of GO terms associated with each cluster). Gray nodes represent genes that are not members of the network clusters.
[Figure 1 cluster labels: Channel activity; Postsynaptic density; Neuronal signaling/cytoskeleton; Chromatin modification/regulation.]
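The clustering step described above groups network genes by the strength of their interactions in the phenotypic network. The sketch below shows one simple way such a grouping could be done, using greedy average-linkage agglomeration on pairwise similarity weights. The linkage rule, the function name, the gene labels and the weights are assumptions for illustration; the text does not state which linkage was used.

```python
def average_linkage_clusters(genes, similarity, n_clusters):
    """Greedy agglomerative clustering on a similarity (not distance) measure.

    `similarity` maps frozenset({gene_a, gene_b}) to an edge weight;
    missing pairs are treated as weight 0.0. At each step the two clusters
    with the highest average pairwise similarity are merged.
    """
    clusters = [[g] for g in genes]

    def avg_sim(c1, c2):
        total = sum(similarity.get(frozenset((a, b)), 0.0) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = avg_sim(clusters[i], clusters[j])
                if best is None or s > best[0]:
                    best = (s, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Toy example with hypothetical genes and made-up weights.
weights = {
    frozenset(("geneA", "geneB")): 0.9,
    frozenset(("geneC", "geneD")): 0.8,
    frozenset(("geneA", "geneC")): 0.1,
    frozenset(("geneB", "geneD")): 0.1,
}
toy_clusters = average_linkage_clusters(
    ["geneA", "geneB", "geneC", "geneD"], weights, n_clusters=2
)
```

With these toy weights the two strongly connected pairs end up in separate clusters, which mirrors the idea of functionally cohesive groups within one network.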

Table 1 GO terms associated with the implicated network

GO ID        Ontology  Term                              P value
GO:0045202   CC        Synapse                           3 × 10^−7
GO:0016568   BP        Chromatin modification            6 × 10^−6
GO:0051015   MF        Actin filament binding            7 × 10^−6
GO:0005262   MF        Calcium channel activity          1 × 10^−5
GO:0031252   CC        Cell leading edge                 2 × 10^−5
GO:0004386   MF        Helicase activity                 3 × 10^−5
GO:0015629   CC        Actin cytoskeleton                6 × 10^−5
GO:0030036   BP        Actin cytoskeleton organization   7 × 10^−5
GO:0016887   MF        ATPase activity                   7 × 10^−5
GO:0005913   CC        Cell-cell adherens junction       8 × 10^−5
GO:0045211   CC        Postsynaptic membrane             8 × 10^−5
GO:0044456   CC        Synapse part                      0.0001
GO:0007611   BP        Learning or memory                0.0001
GO:0030029   BP        Actin filament-based process      0.0001
GO:0005938   CC        Cell cortex                       0.0001
GO:0005911   CC        Cell-cell junction                0.0002
GO:0014069   CC        Postsynaptic density              0.002
GO:0043005   CC        Neuron projection                 0.004
GO:0030425   CC        Dendrite                          0.004

GO terms enriched in the implicated network (Fig. 1), as identified by DAVID (david.abcc.ncifcrf.gov). The ontology column indicates GO domain: BP for biological process, MF for molecular function and CC for cellular component. Nonspecific terms, that is, terms associated with more than 400 human genes, and redundant terms are not shown. For a list of GO terms associated with each network cluster, see Supplementary Table 3. The enrichment calculations were performed using the default background set of all human genes; similar enrichment results were also obtained using brain-expressed genes as background (Supplementary Table 4). P values shown in the table were corrected for multiple hypotheses testing using the Benjamini-Hochberg procedure.

Association of the implicated genes with specific functional subsets
In addition to the molecular and biological GO categories discussed above, we investigated whether there was an overlap between the network and specific gene subsets that may highlight the phenotypic and functional properties of the network genes. Recent exome-sequencing studies have revealed recurrent truncating mutations in several genes (ADNP, ANK2, ARID1B, CHD8, CUL3, DYRK1A, GRIN2B, KATNAL2, POGZ, SCN2A and TBR1)3,7,8,11,25. Although only 131 of 580 (~23%) of all genes harboring missense SNVs form the implicated network, 6 of 11 genes with recurrent truncating mutations are in the network (Fisher's exact one-tail test, P = 0.02). This suggests that there is a substantial enrichment of causal genes in the implicated network.

Because ASD de novo mutations are predominantly heterozygous, it is likely that true causal mutations would preferentially target haploinsufficient genes. Indeed, using haploinsufficiency probabilities from a recent study26, we found that genes in the implicated network were significantly more likely to be haploinsufficient than genes that were not selected by NETBAG+ (median probabilities for network and non-network genes are 0.57 and 0.32, respectively; Wilcoxon rank-sum two-tail test, P < 10^−14); a similar result was also observed for genes with recurrent truncating mutations (median probabilities for genes with recurrent truncating mutations and unaffected sibling SNV genes are 0.69 and 0.39, respectively; Wilcoxon rank-sum two-tail test, P = 0.001). We also considered the Genomic Evolutionary Rate Profiling (GERP) scores to characterize the severity of SNV mutations (see Online Methods). Notably, the average GERP score of network genes was significantly higher than the average GERP score of genes not selected into the network (average network and non-network GERP scores are 4.0 and 3.3, respectively; Wilcoxon rank-sum one-tail test, P = 0.016). As higher GERP scores correspond to more damaging mutations, this analysis demonstrates that NETBAG+ predominantly selects genes harboring SNVs with potentially stronger functional effect.
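The overlap test reported earlier in this section (6 of 11 recurrently mutated genes fall in the network, against 131 of 580 SNV genes overall) can be illustrated with an upper-tail hypergeometric calculation, which is the one-tail form of Fisher's exact test for a 2 × 2 table. This is a sketch under one stated assumption: that the 11 recurrent genes are counted among the 580 SNV genes. The function name is hypothetical.

```python
from math import comb

def hypergeom_upper_tail(total, marked, drawn, observed):
    """P(X >= observed) when `drawn` genes are sampled without replacement
    from `total` genes, of which `marked` belong to the network."""
    denom = comb(total, drawn)
    upper = min(marked, drawn)
    numer = sum(
        comb(marked, k) * comb(total - marked, drawn - k)
        for k in range(observed, upper + 1)
    )
    return numer / denom

# Assumed layout: 580 SNV genes, 131 in the network, 11 recurrent genes, 6 in the network.
p_overlap = hypergeom_upper_tail(total=580, marked=131, drawn=11, observed=6)
```

Under this assumed layout the result is of the same order as the P = 0.02 quoted in the text, although the exact table used by the authors is not given in this section.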

Notably, a recent exome study showed a significant overlap between genes harboring truncating ASD de novo SNVs and targets of the fragile X mental retardation protein (FMRP). FMRP is an RNA-binding protein that is essential for a wide array of cognitive functions including synaptic plasticity and learning. Failure to properly express FMRP can cause fragile X syndrome, a genetic disorder resulting in a spectrum of cognitive disabilities often accompanied by autistic symptoms. Following a previous analysis of truncating de novo SNVs, we calculated the expected number of FMRP targets for ASD genes by inferring gene mutabilities from a large exome-sequencing study30 (Online Methods); the expected number of FMRP targets was then compared with the observed number in various ASD gene sets (Table 2). This analysis revealed that, similar to truncating SNV genes, genes in the implicated network were significantly enriched for FMRP targets (truncating SNV gene enrichment 1:2.13, two-tail binomial test, P = 3 × 10^−4; network gene enrichment 1:2.78, P = 3 × 10^−11). The enrichment remained significant when considering only network genes harboring non-truncating SNVs (enrichment = 1:2.67, P = 4 × 10^−9). Significant enrichment was also observed for FMRP targets from another recent study (Supplementary Table 5). In contrast, there was no significant enrichment for proband SNV genes that were not selected to the network (P = 0.2) nor for genes from unaffected siblings (P = 0.15). Furthermore, enrichment for FMRP targets remained significant when we separately considered each of the four functional clusters of the implicated network (Table 2). Overall, these analyses confirm that FMRP targets are important for autism etiology and suggest a causal role for a substantial fraction of genes with non-truncating de novo SNVs.

Genes forming the PSD are also likely to be important in ASDs and other psychiatric disorders. The PSD is localized at the postsynaptic membrane of excitatory synapses and is crucial to synaptic communication and plasticity. There was a significant enrichment of PSD-associated genes in the implicated network (enrichment 1:3.4, two-tail binomial test, P = 7 × 10^−9) and marginal enrichment for truncating de novo SNVs in probands (enrichment 1:1.9, P = 0.05). No enrichment was observed for genes harboring SNVs in unaffected siblings (enrichment 1:1.05, P = 0.8). For genes not selected to the network, PSD genes were significantly under-represented (enrichment 1:0.58, P = 0.04).

Table 2 Overlap between ASD gene sets and FMRP targets

Gene set                                          Number of genes  Expected:observed   P value
Network genes                                     159              17.3:48 (1:2.78)    3 × 10^−11
Truncating SNV genes                              108              11.7:25 (1:2.13)    0.0003
Network SNV non-truncating genes                  138              15.0:40 (1:2.67)    4 × 10^−9
Neuronal signaling/cytoskeleton cluster genes     69               7.49:15 (1:2.00)    0.01
Chromatin modification/regulation cluster genes   50               5.43:17 (1:3.13)    1 × 10^−5
Postsynaptic density cluster genes                21               2.28:9 (1:3.95)     0.0002
Channel activity cluster genes                    11               1.19:5 (1:4.19)     0.004
Non-network SNV genes                             449              48.8:40 (1:0.82)    0.2
Sibling SNV genes                                 355              38.6:47 (1:1.22)    0.15

The expected and observed number of FMRP targets among different sets of implicated genes. The expected numbers of FMRP targets were obtained on the basis of the proportion of rare synonymous variants observed in a recent large-scale survey of human genetic variation30. The observed and expected overlaps based on another recent study of FMRP targets31 are given in Supplementary Table 5. The significances of the overlaps were established using a two-tail binomial test.
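The enrichment ratios and two-tail binomial P values in Table 2 can be illustrated with a short calculation on the "Network genes" row (159 genes, 17.3 targets expected, 48 observed). This is a sketch under two assumptions that the text does not spell out: the per-gene probability is taken as expected/number of genes, and the two-tail P value sums all outcomes no more likely than the observed count. The function name is hypothetical.

```python
from math import comb

def binomial_two_tail(n, k, p):
    """Two-tailed exact binomial P value.

    Sums the probabilities of every outcome that is no more likely than
    the observed count k (one common two-tail convention, assumed here).
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    cutoff = pmf[k] * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(x for x in pmf if x <= cutoff))

# "Network genes" row of Table 2.
n_genes, expected, observed = 159, 17.3, 48
enrichment = observed / expected  # reported in the table as 1:2.78
p_network = binomial_two_tail(n_genes, observed, expected / n_genes)
```

The same function can be applied to any other row of the table by substituting its gene count, expected value and observed value.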

Temporal and spatial brain expression patterns of implicated genes
Gene expression patterns provide important clues for understanding physiological and developmental contexts of gene function. Thus, we next investigated brain expression of the ASD-implicated genes using the Human Brain Transcriptome database (HBT, http://hbatlas.org/)16. Based on transcriptional analysis of postmortem tissue samples from healthy individuals, the HBT database provides a comprehensive map of mRNA expression across multiple brain regions and developmental stages. We determined the average expression levels across developmental periods for the various gene sets that formed the implicated network, as well as for genes with truncating de novo SNVs. We used a Wilcoxon rank-sum test (P_WT) to estimate the significance of the expression difference between sets of biological samples and a permutation-based test (P_PT) to estimate the probability of observing a greater bias in random probe sets of equal size (Online Methods). The brain expression of network genes (Fig. 2a) was significantly higher than the expression of SNV-containing genes that were not selected to the network (P_WT < 10^−15/P_PT < 10^−4). This result confirms that the implicated network is enriched in brain-related genes. Notably, significantly high brain expression of network genes is not simply a consequence of their length or phenotypic network connectivity; randomly selected human genes or the genes in the human phenotypic network with equivalent length or network connectivity are expressed in the brain at significantly lower levels (Supplementary Fig. 2).

Expression of the network genes (Fig. 2a) across all developmental periods was, on average, highest during the early fetal to early mid-fetal periods (8–19 post-conception weeks), possibly as a result of higher activity of ASD-related genes in brain cells or changes in brain cellular composition. To quantify the prenatal expression bias, we calculated the difference between expression during prenatal and postnatal developmental periods for every gene harboring de novo mutations. The prenatal bias was highly significant for network genes (bias 0.16, one-tail test, P_WT < 10^−15/P_PT < 10^−4) and for genes with truncating SNVs (bias 0.18, P_WT < 10^−15/P_PT < 10^−4). However, for ASD network genes with truncating de novo SNVs or de novo CNVs, embryonic expression (less than 8 post-conception weeks) was substantially lower as compared with that in later developmental periods (two-tail test, P_WT = 6 × 10^−7/P_PT < 10^−4 and P_WT = 10^−14/P_PT < 10^−4, respectively). This effect is apparent when comparing the average expression profiles for network genes with truncating (CNVs and SNVs) and non-truncating mutations (Fig. 2b). The bias against truncating mutations in the embryonic period was absent for non-synonymous de novo SNV genes not selected to the network (P_WT = 0.4/P_PT = 0.3) and for genes with de novo SNVs in siblings (P_WT = 0.2/P_PT = 0.2). The observed bias against truncating CNVs and SNVs suggests that there is strong selection against harmful mutations during embryonic development, as these mutations lead to more severe developmental consequences than mutations typically associated with autism.

We also examined the temporal expression profiles for each of the four clusters forming the implicated network (Fig. 2c). The cluster that was primarily associated with the postsynaptic density (PSD) had the highest overall expression in the brain. Both the PSD cluster and the cluster associated with various channel activities showed a distinct rise during early fetal development, consistent with the start of synaptogenesis at the early mid-fetal developmental stage. In contrast, the cluster associated with chromatin modification and regulation showed high embryonic and fetal expression, consistent with a developmental peak of neuronal proliferation and differentiation, and then gradually decreased to a postnatal plateau. Sustained expression of chromatin modification genes after neurodevelopmental stages may be attributed to their involvement in synaptic plasticity. Finally, the cluster associated with neuronal signaling and cytoskeleton was relatively constant across all developmental stages. This is consistent with the important roles of signaling and structural genes across all developmental stages.

Figure 2 Temporal gene expression profiles in the human brain across developmental stages for implicated gene sets. Expression data (log2) were obtained from the HBT database; average expression levels at each developmental stage were calculated using all genes in a given set. (a) Expression profiles for truncating SNV genes in the network (orange), all network genes (red), network genes from CNVs (green), all truncating SNV genes (blue) and non-network SNV genes (purple). (b) Expression profiles for network genes with truncating (cyan) and non-truncating (red) mutations observed in probands. (c) Expression profiles for the four functional clusters of the implicated network: postsynaptic density genes (cyan), chromatin modification/regulation genes (red), signaling/cytoskeleton genes (green) and channel activity genes (blue). (d) Expression profiles for network and truncating female/male SNV genes: female truncating SNV genes (red), female network genes (green), male network genes (blue) and male truncating SNV genes (purple). Vertical dashed lines separate prenatal and postnatal developmental stages. Error bars represent s.e.m.
[Figure 2, panels a–d. y axis: brain expression level (log2). x axis: developmental stage (embryonic, early fetal, early mid-fetal, late mid-fetal, late fetal, early infancy, late infancy, early childhood, late childhood, adolescence, young adulthood, middle adulthood, late adulthood). Each panel also shows all genes in the database as a reference.]
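The prenatal expression bias and the permutation-based test described in this section can be sketched as follows. This is a toy illustration, not the authors' pipeline: genes stand in for probe sets, the expression values are made up, and the function names are hypothetical. The bias of a gene is taken as mean prenatal log2 expression minus mean postnatal log2 expression, and the permutation P value is the fraction of random equal-size gene sets whose average bias is at least as large as the observed one.

```python
import random
from statistics import mean

def prenatal_bias(prenatal, postnatal):
    """Mean prenatal minus mean postnatal log2 expression for one gene."""
    return mean(prenatal) - mean(postnatal)

def set_bias(genes, expression):
    """Average prenatal bias over a gene set."""
    return mean(prenatal_bias(*expression[g]) for g in genes)

def permutation_p(genes, expression, n_perm=1000, seed=0):
    """Fraction of random equal-size gene sets with bias >= the observed bias."""
    rng = random.Random(seed)
    observed = set_bias(genes, expression)
    pool = sorted(expression)
    hits = 0
    for _ in range(n_perm):
        if set_bias(rng.sample(pool, len(genes)), expression) >= observed:
            hits += 1
    return hits / n_perm

# Made-up log2 expression values: gene -> (prenatal samples, postnatal samples).
expression = {
    "g1": ([9.0, 9.2], [8.0, 8.2]),
    "g2": ([7.5, 7.5], [7.0, 7.0]),
    "g3": ([6.0, 6.0], [6.5, 6.5]),
    "g4": ([8.0, 8.0], [8.0, 8.0]),
}
bias = set_bias(["g1", "g2"], expression)
p_perm = permutation_p(["g1", "g2"], expression)
```

A positive bias means higher prenatal expression. With only four genes the permutation P value is coarse; a real analysis would draw random sets from the full database.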

Notably, there was substantial variability in the expression of the individual genes around the average cluster expression profiles (Supplementary Fig. 3). However, this variability is likely to be at least partially explained by different dosage requirements of individual genes. Normalized expression trajectories of genes (Supplementary Fig. 3) revealed that the temporal profiles of individual genes were generally consistent with the average expression profiles of corresponding functional clusters.

A high male-to-female incidence ratio, estimated at more than 4:1 for high-functioning individuals, is one of the most consistent and notable findings in ASDs36,37. Evidence suggests that this bias may be a result of a female protective effect that requires a higher threshold of genetic insults to trigger ASDs in females than in males38. We also observed that females in the SSC collection had a greater burden of truncating mutations than males (Fisher's exact one-tail test, P = 0.07), which is consistent with this hypothesis. This gender dimorphism was even stronger for the genes in the implicated network (30% of SNV mutations in females are truncating compared with 13% in males; Fisher's exact one-tail test, P = 0.03). Notably, the average brain expression was significantly higher for genes harboring truncating de novo mutations in females than in males (average log2 expression for genes: females, 7.98; males, 7.40; one-tail test, P_WT < 10^−15/P_PT = 2 × 10^−3; Fig. 2d). These log-scale differences translate to absolute expression level differences of 50–100% across developmental stages in the average expression of genes harboring truncating SNVs in females and males. Thus, the expression data also suggest that relatively stronger genetic perturbations, that is, truncating mutations in genes with higher brain expression, are preferentially associated with ASDs in females. Notably, the observed expression patterns cannot be explained by differences between males and females in normal brain expression levels. For genes with truncating SNV mutations in female probands the average brain expression levels in females and males were 8.02 and 7.95 (<5% difference), respectively; for genes with truncating SNVs in male probands the average brain expression levels in females and males were 7.42 and 7.39 (<3% difference), respectively. We also note that the relative difference in brain expression for genes harboring mutations in males versus females was larger for genes with truncating SNVs than for implicated network genes. This result is likely a consequence of the severe effect of truncating mutations on protein function, which makes the expres-

Table 3 Prenatal versus postnatal brain expression biases across brain regions

               Network genes                  Truncating SNVs                Sibling SNVs
Brain region   Bias   S.e.m.  P value         Bias   S.e.m.  P value         Bias
Amygdala       0.25   0.08    1 × 10^−12      0.26   0.10    2 × 10^−6       0.02
Striatum/BG    0.24   0.08    2 × 10^−12      0.22   0.10    3 × 10^−5       0.04
Cerebellum     0.23   0.08    3 × 10^−9       0.17   0.10    4 × 10^−3       0.07
Occipital      0.17   0.08    3 × 10^−7       0.15   0.10    3 × 10^−2       0.04
Parietal       0.16   0.08    3 × 10^−11      0.17   0.10    3 × 10^−4       0.05
Temporal       0.16   0.07    2 × 10^−17      0.19   0.09    2 × 10^−8       0.05
Hippocampus    0.15   0.07    1 × 10^−5       0.18   0.09    4 × 10^−4       0.02
Thalamus       0.14   0.08    1 × 10^−4       0.22   0.09    1 × 10^−5       0.03
Frontal        0.12   0.08    1 × 10^−16      0.17   0.09    4 × 10^−10      0.03

The biases were calculated using human expression data obtained from the HBT database. To quantify the expression bias for each brain region, we calculated the difference between the average log2 prenatal expression and the average log2 postnatal expression, such that positive values in the table indicate higher expression levels in the prenatal periods. The significances of the expression biases were evaluated using the Wilcoxon rank-sum one-tail test and corrected using the Bonferroni procedure. Bias s.e.m. was calculated separately for each brain region. BG, basal ganglia.

Figure 3 Cell-type expression biases for network mutations and recurrent truncating mutations. Proband versus unaffected sibling expression biases were computed across 25 cell types of the CNS for implicated network genes (shown in red) and for 11 genes with recurrent truncating SNVs (blue). The biases were calculated using previously described Mus musculus expression data17. To quantify the expression biases, we calculated for each cell type the difference between the average log2 expression of mouse orthologs for implicated human genes and the average log2 expression of mouse orthologs for human genes with de novo SNVs in unaffected siblings. Cell types are ordered by the magnitude of the cell type expression bias for network genes (red). The significance of the expression biases was evaluated using the Wilcoxon rank-sum one-tail test and corrected for multiple hypothesis testing using the Benjamini-Hochberg procedure with a false discovery rate of 10%; significant cell types are shown in dark red and blue, and non-significant types are shown in light red and blue. In addition, the P values obtained from the two independent approaches, one based on network genes (red) and the other based on genes affected by recurrent truncating mutations (blue), were combined using Fisher's and Stouffer's meta-analysis methods. The combined P values were corrected using the Bonferroni method, and the cell types passing the significance cutoff for both meta-analyses are indicated with an asterisk (*). All P values associated with the figure are presented in Supplementary Table 8.
0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 Layer 6corticothalamicpyramidalneurons Layer 5acorticostriatalpyramidalneurons Motor neurons,cholinergicinterneurons Drd1+ mediumspinyneurons(striatum) Drd2+ mediumspinyneurons(striatum) Unipolar brushcells(mGluR1subtype) Mature oligodendrocytes(cerebellum) Cholinergic neurons(corpusstriatum) Granule cells,deepcerebellarnucle Astroglia (includesBergmannglia Mature oligodendrocytes(cortex Significant, recurrenttruncating Significant, network P value 1.0 1.0 1.0 1.0 0.7 1.0 0.1 1.0 1.0 Mixed neurons(cortex Astroglia (JD133line) Astroglia (JD130line) Interneurons (cortex Stellate, basketcells Layer 5bneurons Neurons (cortex ASDs. Furthermore, we investigated the the investigated we in Furthermore, observed ASDs. abnormalities phenotypic of spectrum wide the with consistent is which ( regions brain all genes network showed significantly higher expression across controls, with Compared siblings. unaffected in SNVs levels with genes expression of the to genes network of levels expression average the compared we region, brain each For database. HBT the in data available expression spatial we analyzed regions, brain specific affect preferentially patients. across phenotypes ASD of variability the explaining in factor important an genes corresponding of sion Bergman glia Purkinje cells To investigate whether ASD mutations mutations ASD whether investigate To –0. ) ) ) ) ) ) ) i –1 * * * * * * 2 for recurrenttruncatingmutations 0 0 0 0 Non-significant, recurrenttruncating Non-significant, network Cell-type expressionbias Supplementary Table 7 Table Supplementary Cell-type expressionbias for networkmutations .5 .1 1 1 0. t r a 0 2 .5 .3 C I 2 0. 0 4 e l 2. 5 .5 s ), ),  - - -

prenatal expression bias for network and truncating genes and compared these with the prenatal bias for genes with SNVs in siblings. We found that the prenatal bias was generally statistically similar and significant for all brain regions (Table 3). We note, however, that the amygdala, which is known to have a crucial role in processing emotional stimuli³⁹, as well as cerebellum and striatum, showed numerically, although not significantly, larger prenatal bias than other regions. Overall, however, the lack of regional specificity underscores the sharing of gene functions across brain areas.

The human brain contains a variety of neuronal and non-neuronal cell types, and we sought to investigate possible biases of implicated genes toward specific cell types. To explore this question, we used an independent data set of cell-specific gene expression generated using translating ribosome affinity purification (TRAP)¹⁷; the data set contains gene expression profiles for 25 distinct cell types from the mouse CNS. To assess cell-specific expression biases, we compared, for each cell type, the expression of mouse orthologs for the implicated network genes with the expression of orthologs for genes harboring mutations in unaffected siblings. This analysis revealed that multiple neuronal and non-neuronal cell types were likely affected by ASD mutations in the network genes (Fig. 3 and Supplementary Table 8). The diversity of affected cells could be explained, at least partially, by shared usage of common signaling, structural and neural pathways across diverse cell types. Although multiple cell types were affected, particularly strong and significant expression biases for implicated network genes were observed in cortical neurons, especially cortical interneurons, pyramidal neurons and medium spiny neurons of the striatum (Fig. 3). Notably, similar cell types were also independently implicated by considering expression biases for 11 genes with recurrent truncating de novo mutations (Fig. 3). On the other hand, some other cell types, such as motor neurons and astroglial cells, were less markedly affected. Notably, deep layer cortical glutamatergic projection neurons were also recently identified as a point of spatio-temporal convergence in coexpression networks built around high confidence ASD genes⁴⁰.

Figure 4 Average numbers of de novo mutations per individual for probands with different IQs. (a) The average number of truncating de novo SNVs (blue) and CNVs (green) per individual. (b) The average number of non-truncating SNVs (purple) and synonymous SNVs (orange) per individual. Horizontal dashed lines represent the corresponding average numbers of mutations per individual for all unaffected siblings. Error bars represent s.e.m.
[Figure 4 plots. Panel a: y axis, average number of truncating mutations per individual; x axis, proband IQ (<55, 70, 90, 110); series, CNVs and truncating SNVs. Panel b: y axis, average number of non-truncating mutations per individual; x axis, proband IQ; series, synonymous SNVs and non-truncating SNVs.]

Figure 5 Temporal gene expression profiles in the human brain across developmental stages for genes affected in subsets of probands with different phenotypic scores. Probands were divided into two groups (high and low) relative to the corresponding median phenotypic scores. Expression data (log₂) were obtained from the HBT database; average expression levels at each developmental stage were calculated using all genes in a given subset and error bars represent s.e.m. Expression profiles for network genes are displayed as solid lines and profiles for truncating SNVs as dashed lines. Profiles for genes affected in probands with more severe phenotypes are shown in red and less severe phenotypes in blue. Vertical dashed lines separate prenatal and postnatal developmental stages. (a) Profiles for probands with high and low IQ. (b) Profiles for probands with low and high ADIR-S scores. (c) Profiles for probands with low and high ADIR-R scores.
[Figure 5 plots. y axis: brain expression level (log₂). x axis: developmental stage (embryonic, early fetal, early mid-fetal, late mid-fetal, late fetal, early infancy, late infancy, early childhood, late childhood, adolescence, young adulthood, middle adulthood, late adulthood). Series in each panel: high and low score, network and truncating.]

Functional properties of implicated genes and disease phenotypes
Because ASDs manifest substantial pathophysiological differences across probands, it is important to understand how functional properties of implicated genes and associated mutations affect phenotypic characteristics of the disease. To investigate the effect of truncating de novo mutations on IQ, we calculated, separately for CNVs and truncating SNVs, the average number of truncating mutations per individual across the IQ spectrum (Fig. 4a). Notably, for high-functioning ASD probands, the average number of truncating mutations (CNVs and truncating SNVs) decreased and became similar to the average number in unaffected siblings. The average number of truncating SNVs for probands with IQ less than 100 was about twofold higher than for probands with IQ greater than or equal to 100 (0.17 for IQ < 100, 0.09 for IQ ≥ 100; Fisher's exact two-tail test, P = 0.02). In contrast, similar analyses for non-truncating mutations (synonymous and non-truncating SNVs) in probands showed that the average number of non-truncating mutations per individual was
relatively constant across IQs (Fig. 4b), without a prominent decrease for high-functioning ASD probands (0.43 for IQ < 100, 0.46 for IQ ≥ 100; Fisher's exact two-tail test, P = 0.7). Relative to synonymous mutations, non-synonymous non-truncating mutations were significantly enriched in probands with IQs above 70 compared with those below 70: 160 non-truncating and 72 synonymous mutations were observed in probands with IQ ≤ 70 (2.2:1 ratio), whereas 330 non-truncating and 110 synonymous mutations were observed in probands with IQ > 70 (3.3:1 ratio) (Fisher's exact one-tail test, P = 0.04). Overall, these analyses suggest that truncating mutations are likely to play a less prominent role in high-functioning ASD cases. Another recent study⁴¹ also demonstrated no excess of de novo loss-of-function mutations for high-functioning ASD probands.

Notably, there was a significant difference in IQs between probands with network genes affected by CNV deletions and duplications. The average IQs for probands with CNV deletions and duplications were 64.8 and 83.9, respectively (Wilcoxon rank-sum one-tail test, P = 6 × 10⁻³). This result suggests that a decrease in gene dosage has a substantially stronger functional effect than an increase in dosage.

Given that genes exhibited diverse expression patterns, we asked whether the level of brain expression for affected genes is associated, on average, with different phenotypic outcomes. For this analysis, we considered the full-scale IQ and Autism Diagnostic Interview-Revised (ADIR) social interaction and repetitive behavior scores. The ADIR behavior scores are based on structured interviews with parents and reflect patterns in proband reciprocal social interactions (ADIR-S) and repetitive/restrictive behaviors (ADIR-R)⁴². To explore the relationship between the average expression level of affected genes and corresponding phenotypes, we divided the ASD cases into high- and low-scoring phenotypic subsets relative to the corresponding median phenotype scores. We then calculated the average expression levels of implicated genes identified in each phenotype subset (Fig. 5). This analysis revealed that affected genes associated with lower IQ scores or higher ADIR scores (both indicating more severe phenotypes) usually had significantly higher brain expression (network genes: one-tail test, P_WT < 10⁻¹⁵/P_PT = 0.01 for IQ, P_WT < 10⁻¹⁵/P_PT < 10⁻³ for ADIR-S, P_WT < 10⁻¹⁵/P_PT = 0.024 for ADIR-R; genes with truncating mutations: one-tail test, P_WT < 10⁻¹⁵/P_PT = 1.5 × 10⁻⁴ for IQ, P_WT < 10⁻¹⁵/P_PT = 0.3 for ADIR-S, P_WT < 10⁻¹⁵/P_PT = 0.08 for ADIR-R).

DISCUSSION
Our results suggest that the pathophysiological heterogeneity of ASD is matched by the diversity of genetic and functional insults associated with the disorder. We found that the affected genes had diverse developmental expression profiles and are therefore likely to be important in multiple stages of neurogenesis, neuron mobility, synaptogenesis and brain function. Although the implicated genes are active across multiple cell types and processes, some types, such as cortical interneurons, pyramidal neurons and medium spiny neurons of the striatum, seemed to be more strongly affected. Notably, cortical layer 5 pyramidal neurons often project to the striatum and layer 6 neurons often project to the thalamus; the corresponding cortico-striatal-thalamic circuits are known to mediate diverse motor, emotional, cognitive and habit-forming behaviors that are often perturbed in ASD⁴³⁻⁴⁶. Consequently, our unbiased genome-wide analysis implicates specific functional neural circuits that may mediate stereotypical and repetitive behaviors in ASD.

Although previous studies have primarily emphasized the role of truncating de novo mutations in ASD³,⁷,⁸, our analysis suggests that a substantial fraction of non-truncating de novo missense mutations observed in probands also contributes to the disorder. Notably, we found that functional mutations were preferentially observed in haploinsufficient genes, which confirms that dosage effects are important for disease mechanisms. We found that functional characteristics of affected genes, such as brain expression levels, are likely to influence the observed phenotypic consequences of de novo ASD mutations. Stronger functional insults lead, on average, to more severe ASD phenotypes. Thus, the distinction between intellectual disability and autism may lie primarily in the degree of overall functional effect rather than specific genes and pathways affected in the two disorders.

Our analysis of brain expression provides further evidence for the hypothesis that stronger functional insults, such as perturbation of genes with substantially higher brain expression, are associated with female autistic phenotypes. Stronger functional perturbations in females were previously demonstrated through analyses of CNVs sizes⁶,⁴⁷ and gene network properties¹³. Thus, multiple independent sources of evidence suggest a protective effect in females, although the mechanisms of this effect remain to be elucidated.

We also find, in agreement with recently published studies⁴¹,⁴⁸, that truncating mutations play a smaller role in high-functioning ASD cases. Because truncating mutations are usually associated with a loss of function, this result suggests that high-functioning autism phenotypes are less likely to be mediated by a loss of normal gene function in the brain. Functional gain and other types of genetic variations, such as non-truncating de novo mutations, common polymorphisms or mutations in non-coding regulatory regions, may be the primary contributors to high-functioning ASD cases.

Taken together, our results suggest that various functional properties of mutations and target genes may be useful in predicting ASD phenotypic severity. In the future, it will be important to investigate the extent to which individual ASD patients can be stratified on the basis of affected pathways, biological functions and cell types. Such patient stratification, a now common practice in cancer, may lead to individualized diagnostic and prognostic predictions and, ultimately, to targeted ASD therapies.

METHODS
Methods and any associated references are available in the online version of the paper.

Note: Any Supplementary Information and Source Data files are available in the online version of the paper.

ACKNOWLEDGMENTS
We would like to sincerely thank M. Wigler, M. State, D. Geschwind, A. Packer, G. Fischbach and all of the members of the Vitkup laboratory for helpful discussions. This work was supported by a grant from the Simons Foundation SFARI# 308962 to D.V. and US National Centers for Biomedical Computing (MAGNet) grant U54CA121852 to Columbia University. S.R.G. was supported in part by US NIGMS training grant T32 GM082797. S.J.S. was supported by a Howard Hughes Medical Institute International Student Research Fellowship.

AUTHOR CONTRIBUTIONS
J.C., S.R.G. and A.H.C. performed computational analysis and interpreted the results. S.J.S. contributed data, interpreted the results and contributed to the functional analysis. D.V. designed the study, supervised the project and interpreted the results. J.C., S.R.G., A.H.C. and D.V. wrote the manuscript.

COMPETING FINANCIAL INTERESTS
The authors declare no competing financial interests.

Reprints and permissions information is available online at http://www.nature.com/reprints/index.html.

1. Abrahams, B.S. & Geschwind, D.H. Advances in autism genetics: on the threshold of a new neurobiology. Nat. Rev. Genet. 9, 341–355 (2008).
2. Berg, J.M. & Geschwind, D.H. Autism genetics: searching for specificity and convergence. Genome Biol. 13, 247 (2012).
3. Iossifov, I. et al. De novo gene disruptions in children on the autistic spectrum. Neuron 74, 285–299 (2012).
4. Freitag, C.M. The genetics of autistic disorders and its clinical relevance: a review of the literature. Mol. Psychiatry 12, 2–22 (2007).
5. Geschwind, D.H. Genetics of autism spectrum disorders. Trends Cogn. Sci. 15, 409–416 (2011).
6. Levy, D. et al. Rare de novo and transmitted copy-number variation in autistic spectrum disorders. Neuron 70, 886–897 (2011).
7. O'Roak, B.J. et al. Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations. Nature 485, 246–250 (2012).
8. Sanders, S.J. et al. De novo mutations revealed by whole-exome sequencing are strongly associated with autism. Nature 485, 237–241 (2012).
9. Veltman, J.A. & Brunner, H.G. De novo mutations in human genetic disease. Nat. Rev. Genet. 13, 565–575 (2012).
10. McClellan, J. & King, M.C. Genetic heterogeneity in human disease. Cell 141, 210–217 (2010).
11. O'Roak, B.J. et al. Multiplex targeted sequencing identifies recurrently mutated genes in autism spectrum disorders. Science 338, 1619–1622 (2012).
12. Gilman, S.R. et al. Rare de novo variants associated with autism implicate a large functional network of genes involved in formation and function of synapses. Neuron 70, 898–907 (2011).
13. Gilman, S.R. et al. Diverse types of genetic variation converge on functional gene networks involved in schizophrenia. Nat. Neurosci. 15, 1723–1728 (2012).
14. Kelleher, R.J. III & Bear, M.F. The autistic neuron: troubled translation? Cell 135, 401–406 (2008).
15. Zoghbi, H.Y. & Bear, M.F. Synaptic dysfunction in neurodevelopmental disorders associated with autism and intellectual disabilities. Cold Spring Harb. Perspect. Biol. 4, a009886 (2012).
16. Kang, H.J. et al. Spatio-temporal transcriptome of the human brain. Nature 478, 483–489 (2011).
17. Doyle, J.P. et al. Application of a translational profiling approach for the comparative analysis of CNS cell types. Cell 135, 749–762 (2008).
18. Fischbach, G.D. & Lord, C. The Simons Simplex Collection: a resource for identification of autism genetic risk factors. Neuron 68, 192–195 (2010).
19. Huang, D.W., Sherman, B.T. & Lempicki, R.A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4, 44–57 (2009).
20. Krey, J.F. & Dolmetsch, R.E. Molecular mechanisms of autism: a possible role for Ca2+ signaling. Curr. Opin. Neurobiol. 17, 112–119 (2007).
21. Pinto, D. et al. Functional impact of global rare copy number variation in autism spectrum disorders. Nature 466, 368–372 (2010).
22. Betancur, C., Sakurai, T. & Buxbaum, J.D. The emerging role of synaptic cell-adhesion pathways in the pathogenesis of autism spectrum disorders. Trends Neurosci. 32, 402–412 (2009).
23. Ronan, J.L., Wu, W. & Crabtree, G.R. From neural development to cognition: unexpected roles for chromatin. Nat. Rev. Genet. 14, 347–359 (2013).
24. Santini, E. et al. Exaggerated translation causes synaptic and behavioural aberrations associated with autism. Nature 493, 411–415 (2013).
25. Ziff, E.B. Enlightening the postsynaptic density. Neuron 19, 1163–1174 (1997).
26. Huang, N., Lee, I., Marcotte, E.M. & Hurles, M.E. Characterising and predicting haploinsufficiency in the human genome. PLoS Genet. 6, e1001154 (2010).
27. Darnell, J.C. et al. FMRP stalls ribosomal translocation on mRNAs linked to synaptic function and autism. Cell 146, 247–261 (2011).
28. Huber, K.M., Gallagher, S.M., Warren, S.T. & Bear, M.F. Altered synaptic plasticity in a mouse model of fragile X mental retardation. Proc. Natl. Acad. Sci. USA 99, 7746–7750 (2002).
29. Edbauer, D. et al. Regulation of synaptic structure and function by FMRP-associated microRNAs miR-125b and miR-132. Neuron 65, 373–384 (2010).
30. Fu, W. et al. Analysis of 6,515 exomes reveals the recent origin of most human protein-coding variants. Nature 493, 216–220 (2013).
31. Ascano, M. Jr. et al. FMRP targets distinct mRNA sequence elements to regulate protein expression. Nature 492, 382–386 (2012).
32. Bayés, A. et al. Characterization of the proteome, diseases and evolution of the human postsynaptic density. Nat. Neurosci. 14, 19–21 (2011).
33. Kennedy, M.B. The postsynaptic density at glutamatergic synapses. Trends Neurosci. 20, 264–268 (1997).
34. Kennedy, M.B. Signal-processing machines at the postsynaptic density. Science 290, 750–754 (2000).
35. Vogel-Ciernia, A. et al. The neuron-specific chromatin regulatory subunit BAF53b is necessary for synaptic plasticity and memory. Nat. Neurosci. 16, 552–561 (2013).
36. Newschaffer, C.J. et al. The epidemiology of autism spectrum disorders. Annu. Rev. Public Health 28, 235–258 (2007).
37. Fombonne, E. Epidemiology of pervasive developmental disorders. Pediatr. Res. 65, 591–598 (2009).
38. Zhao, X. et al. A unified genetic theory for sporadic and inherited autism. Proc. Natl. Acad. Sci. USA 104, 12831–12836 (2007).
39. Adolphs, R. What does the amygdala contribute to social cognition? Ann. NY Acad. Sci. 1191, 42–61 (2010).
40. Willsey, A.J. et al. Coexpression networks implicate human midfetal deep cortical projection neurons in the pathogenesis of autism. Cell 155, 997–1007 (2013).
41. Samocha, K.E. et al. A framework for the interpretation of de novo mutation in human disease. Nat. Genet. 46, 944–950 (2014).
42. Lord, C., Rutter, M. & Le Couteur, A. Autism Diagnostic Interview-Revised: a revised version of a diagnostic interview for caregivers of individuals with possible pervasive developmental disorders. J. Autism Dev. Disord. 24, 659–685 (1994).
43. Shepherd, G.M. Corticostriatal connectivity and its role in disease. Nat. Rev. Neurosci. 14, 278–291 (2013).
44. Langen, M., Durston, S., Kas, M.J., van Engeland, H. & Staal, W.G. The neurobiology of repetitive behavior: ...and men. Neurosci. Biobehav. Rev. 35, 356–365 (2011).
45. Burguière, E., Monteiro, P., Feng, G. & Graybiel, A.M. Optogenetic stimulation of lateral orbitofronto-striatal pathway suppresses compulsive behaviors. Science 340, 1243–1246 (2013).
46. Rothwell, P.E. et al. Autism-associated neuroligin-3 mutations commonly impair striatal circuits to boost repetitive behaviors. Cell 158, 198–212 (2014).
47. Sanders, S.J. et al. Multiple recurrent de novo CNVs, including duplications of the 7q11.23 Williams syndrome region, are strongly associated with autism. Neuron 70, 863–885 (2011).
48. Iossifov, I. et al. The contribution of de novo coding mutations to autism spectrum disorder. Nature 515, 216–221 (2014).
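
The P-value combination described in the Figure 3 legend (two one-tail P values per cell type, combined with Fisher's and Stouffer's methods and then Bonferroni-corrected) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the example cell-type labels and the example P values are hypothetical, Stouffer's method is shown in its unweighted form, and the 0.05 cutoff is an assumption because the legend does not state the cutoff that was used.

```python
import math
from statistics import NormalDist

_LOW, _HIGH = 1e-300, 1.0 - 1e-16  # keep P values strictly inside (0, 1)


def _clip(p):
    """Avoid log(0) and undefined normal quantiles at exactly 0 or 1."""
    return min(max(p, _LOW), _HIGH)


def fisher_combine(pvalues):
    """Fisher's method: X = -2 * sum(ln p) is chi-square with 2k d.f."""
    k = len(pvalues)
    half_x = -sum(math.log(_clip(p)) for p in pvalues)  # X / 2
    # Closed-form chi-square survival function for an even number of d.f.:
    # exp(-X/2) * sum_{i<k} (X/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return min(1.0, math.exp(-half_x) * total)


def stouffer_combine(pvalues):
    """Unweighted Stouffer's method on one-tail P values."""
    nd = NormalDist()
    # z_i = Phi^{-1}(1 - p_i) = -Phi^{-1}(p_i)
    z = sum(-nd.inv_cdf(_clip(p)) for p in pvalues) / math.sqrt(len(pvalues))
    return nd.cdf(-z)  # upper-tail probability of the combined z score


def bonferroni(pvalues):
    """Multiply each P value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def flag_cell_types(p_network, p_truncating, alpha=0.05):
    """Cell types whose corrected combined P passes under BOTH methods."""
    names = list(p_network)
    pairs = [(p_network[n], p_truncating[n]) for n in names]
    fisher = bonferroni([fisher_combine(pair) for pair in pairs])
    stouffer = bonferroni([stouffer_combine(pair) for pair in pairs])
    return [n for n, f, s in zip(names, fisher, stouffer)
            if f < alpha and s < alpha]


# Hypothetical one-tail P values for three made-up cell types.
p_net = {"cell type A": 1e-4, "cell type B": 0.04, "cell type C": 0.6}
p_trunc = {"cell type A": 2e-3, "cell type B": 0.30, "cell type C": 0.5}
print(flag_cell_types(p_net, p_trunc))
```

Requiring both corrected meta-analysis P values to pass mirrors the asterisk rule in the legend; with only two inputs per cell type, the chi-square tail in Fisher's method has a closed form, so no external statistics library is needed.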

ONLINE METHODS
doi:10.1038/nn.3907

Variants. De novo variants were obtained from several recent studies of families in the SSC and included de novo CNVs and de novo SNVs. Overlapping CNVs were combined into a single event, and CNVs larger than 5 Mb were ignored. When a gene affected by an SNV was also contained in a CNV event, the SNV gene was considered individually and the remaining genes in the CNV were considered as a separate event. Combining overlaps and removing duplicate genes resulted in 991 genes at 624 distinct genomic loci that were used as input in our analysis. Eleven probands considered in our study had multiple SNV mutations; these probands were counted multiple times (once for each SNV) in various tests comparing mutations and corresponding functional properties. GERP scores (ref. 49) used to assess the evolutionary severity of SNV mutations were obtained from the original studies.

NETBAG+ algorithm. The NETBAG+ algorithm relies on the previously described phenotypic network in which all pairs of human genes are assigned a score proportional to the likelihood ratio that the genes contribute to the same genetic phenotype. This ratio was calculated using a naive Bayesian integration of various descriptors of protein function: shared GO annotations, shared pathways in Kyoto Encyclopedia of Genes and Genomes (KEGG), protein domains from the InterPro database, tissue expression from the TiGER database, direct protein-protein interactions, shared interaction partners in a number of databases (BIND, BioGRID, DIP, HPRD, InNetDB, IntAct, BiGG, MINT and MIPS), phylogenetic profiles and chromosomal co-clustering across genomes (ref. 50). The likelihood network was constructed using a carefully curated set of human genes compiled by a previous study (ref. 51) that contains 476 human genes associated with 132 different genetic phenotypes. The 991 genes affected by de novo mutations were mapped to the corresponding nodes of the phenotypic network. A greedy search algorithm was then used to find strongly interconnected networks among the input genes; starting from each input gene, the greedy algorithm consecutively adds the genes most strongly connected to the growing network. Networks are scored based on a weighted sum of all pairwise likelihood scores between the network genes. Each CNV event was constrained to contribute at most one gene to the network. Small networks with five or fewer genes were ignored during the network searches.

Network significance was determined by applying the same greedy search algorithm to 5,000 random gene sets and comparing the network scores obtained using real input data to the distribution of network scores obtained using random gene sets. The random gene sets used in the calculations were chosen to match the input genes in terms of network connectivity (based on the average of the top five edge scores) and protein lengths to ensure that network significance was not driven by highly connected or long genes; genes affected by ASD mutations were allowed to participate in the randomization sets. The final network P value reflects the probability of finding a higher scoring network using random gene sets while also accounting for multiple hypothesis testing as a result of various network sizes. The final network of 159 genes was selected as the largest network before a considerable decrease in network significance.

FMRP and PSD enrichment among implicated gene sets. Following a previously described approach and other similar studies, we estimated the enrichments using the overlap between FMRP targets or PSD genes and unique rare synonymous mutations from a large exome sequencing study; for this purpose we used the NHLBI ESP exome-sequencing study. Using the exome-sequencing data of about 6,500 individuals, we calculated the fractions of all rare synonymous mutations, that is, synonymous mutations observed in the NHLBI study only once, that affect FMRP target genes and genes associated with PSD. The ESP variant data was downloaded from http://evs.gs.washington.edu and unique entries were determined using the variant chromosomal position and allele. The calculated fractions represent the background (expected) probabilities for a random SNV mutation to occur in FMRP and PSD genes. We then compared the expected and the observed fractions of SNV events in various ASD disease gene sets using a two-tail binomial test (Supplementary Table 5).
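As an illustration of the enrichment comparison described above, the following minimal Python sketch computes a background (expected) fraction from singleton synonymous variants and applies an exact two-tailed binomial test to an observed count. The function names, gene identifiers and counts are hypothetical and are not taken from the study. The tail convention (summing all outcomes no more probable than the observed one) is an assumption, because the text does not state which two-tailed definition was used.

```python
from math import exp, lgamma, log


def binomial_two_tailed(k: int, n: int, p: float) -> float:
    """Exact two-tailed binomial P value for k hits in n trials (0 < p < 1).

    Assumed convention: sum the probabilities of every outcome that is no
    more likely than the observed count k under Binomial(n, p).
    """
    def pmf(i: int) -> float:
        # Log-space evaluation avoids overflow for large n.
        log_coeff = lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
        return exp(log_coeff + i * log(p) + (n - i) * log(1.0 - p))

    probs = [pmf(i) for i in range(n + 1)]
    # Small relative tolerance so outcomes tied with the observed one are
    # included despite floating-point rounding.
    cutoff = probs[k] * (1.0 + 1e-7)
    return min(1.0, sum(x for x in probs if x <= cutoff))


def background_fraction(variant_genes: list[str], target_genes: set[str]) -> float:
    """Fraction of background variants (one gene label each) in the target set."""
    hits = sum(1 for gene in variant_genes if gene in target_genes)
    return hits / len(variant_genes)


# Toy inputs, invented for illustration only.
background = ["G1", "G2", "G3", "G4", "G5", "G6", "G7", "G8", "G9", "G10"]
targets = {"G2", "G7"}  # hypothetical stand-in for an FMRP or PSD gene list
expected = background_fraction(background, targets)

observed_hits, observed_total = 9, 20  # hypothetical SNV counts in a gene set
p_value = binomial_two_tailed(observed_hits, observed_total, expected)
print(f"expected={expected:.2f} observed={observed_hits / observed_total:.2f} P={p_value:.4f}")
```

In this sketch the background list holds one gene label per singleton variant, so genes that carry several singleton variants contribute proportionally more to the expected fraction.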

We also confirmed the FMRP target enrichment results by comparing the overlap between various ASD gene sets and FMRP target genes with the overlap between FMRP target genes and random sets of human genes matched by protein length and network connectivity to the genes affected by ASD mutations (Supplementary Table 6).

Hierarchical clustering of the implicated network. We used average linkage hierarchical clustering to divide the implicated network into functional clusters (Supplementary Fig. 1). The inverses of the phenotypic network likelihood scores were used as the clustering metric.

Spatial and temporal analysis of human brain expression. Expression data across developmental stages and brain regions was obtained from the Human Brain Transcriptome (HBT) database (GEO accession ID, GSE25219); the HBT data represents quantile-normalized and log2-transformed expression values from post-mortem brain samples from healthy individuals. To quantify the expression biases for each brain region, we calculated the difference between the average log2 expression of implicated genes and the average log2 expression of genes harboring SNVs in siblings. The prenatal expression bias for each brain region was quantified by calculating the difference between the average log2 of prenatal expression (embryonic to late fetal stages) and the average log2 postnatal expression (early infancy to late adulthood stages). The significance of regional and prenatal biases was evaluated using the Wilcoxon rank-sum test with a Bonferroni correction to account for multiple hypothesis testing. Complementary to the Wilcoxon rank-sum test (P_WT), we also estimated the significance of various expression differences using a method based on randomly generated expression probe sets (Supplementary Table 9). The permutation test P values (P_PT) obtained with the random trials reflect the probability to obtain a bias greater or equal to the bias in the original data. The numbers of probe sets sampled in each of 10,000 random trials were equal to the numbers of probes in corresponding original gene sets. Analogous to the procedure used to process the original HBT data, we computed transcript-level expression by taking the median expression value of probe sets; expression biases were then calculated by taking the difference between the corresponding expression averages. P values were corrected using the Bonferroni method.

Cell type–specific expression analysis. To evaluate expression bias of the implicated genes toward specific cell types, we considered mouse expression data previously obtained using ribosome affinity purification (GEO accession ID GSE13379). The considered data set contains expression information for 25 different cell types from the Mus musculus CNS. To connect the mouse expression data and the human genetic data, we used the list of human-mouse orthologs provided by Affymetrix for the Mouse Genome 430 Array. The expression biases in probands compared to siblings were quantified for each cell type by calculating the difference between the average log2 expression of mouse orthologs for implicated human genes and the average log2 expression of mouse orthologs for human genes with de novo SNVs in unaffected siblings. We evaluated the significance of the expression biases using the one-tail Wilcoxon rank-sum test and corrected for multiple hypothesis testing using the Benjamini-Hochberg procedure. P values obtained from the two independent approaches, one based on network genes and the other based on genes affected by recurrent truncating mutations, were combined using Fisher's and Stouffer's combined meta-analysis methods.
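The combination of P values from two independent approaches can be sketched as follows. This is a minimal standard-library Python illustration of Fisher's and Stouffer's methods, not the code used in the study: the function names and the example P values are hypothetical, and the unweighted form of Stouffer's method is an assumption, since the text does not state whether weights were applied.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def fisher_combined(p_values: list[float]) -> float:
    """Fisher's method: X = -2 * sum(ln p) follows chi-square with 2k d.o.f.

    Each input P value must lie strictly between 0 and 1.
    """
    k = len(p_values)
    half_x = -sum(log(p) for p in p_values)  # equals X / 2
    # Closed-form chi-square survival function for even degrees of freedom:
    # sf(X; 2k) = exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return min(1.0, exp(-half_x) * total)


def stouffer_combined(p_values: list[float]) -> float:
    """Stouffer's Z method for one-tailed P values (unweighted, by assumption)."""
    normal = NormalDist()
    z = sum(normal.inv_cdf(1.0 - p) for p in p_values) / sqrt(len(p_values))
    return 1.0 - normal.cdf(z)


# Hypothetical P values for one cell type from the two approaches.
p_network, p_recurrent = 0.05, 0.05
print(f"Fisher:   {fisher_combined([p_network, p_recurrent]):.4f}")
print(f"Stouffer: {stouffer_combined([p_network, p_recurrent]):.4f}")
```

Both methods assume that the input P values are independent; Fisher's method is more sensitive to a single very small P value, whereas Stouffer's method weighs the inputs more evenly.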

A Supplementary Methods Checklist is available.

49. Cooper, G.M. et al. Distribution and intensity of constraint in mammalian genomic sequence. Genome Res. 15, 901–913 (2005).
50. Chen, L. & Vitkup, D. Predicting genes for orphan metabolic activities using phylogenetic profiles. Genome Biol. 7, R17 (2006).
51. Feldman, I., Rzhetsky, A. & Vitkup, D. Network properties of genes harboring inherited disease mutations. Proc. Natl. Acad. Sci. USA 105, 4323–4328 (2008).