1 Supplementary Text Section 1: Sequential convergence from variant, to 2 pathway level

3 Compare to the overlap between SSC and ASC cohorts at variant level, the overlap 4 between probands and siblings in SSC cohort is actually higher (Fig. 1b). They are 5 expected to have more similar genomic background, because they came from the same 6 family in the same cohort.

7 Supplementary Text Section 2: genetic association dissected across multiple 8 levels

9 Previous studies report higher variant or CNV burden in autism1-4. In our analysis, these 10 extra variants most comes from the gene disrupting and pathway hitting categories (Fig. 2 11 row 1, section 2 in main text). In other words, our two-factor model largely explains 12 away the association of extra variant burden with autism, i.e. it is not a direct causal 13 factor of autism. In addition, to affect autism disease status, such extra variants need to 14 occur to the effective variant categories (gene disrupting and pathway hitting), but not 15 other categories (like the silent variants within selected pathways and most variants 16 outside) (Fig. 2 row 1). 17 As described in Fig. 2 and main text, what we proposed is essentially a Noisy-AND 18 model for autism genetic risk factors, i.e. risk genetic variant tend to be both gene 19 disrupting AND pathway hitting. The model is Noisy because our knowledge on gene 20 disrupting and pathway assignment is incomplete. For instance, there are still some level 21 of association between autism and gene disrupting variants outside selected pathways 22 (Fig. 2 row 1, section 2 in main text), presumably because some relevant pathways 23 remain unknown hence unselected and more can be assigned to the selected 24 pathways than currently known. Similarly, error/noise may arise from the inaccurate 25 assignment of variants to silent or gene disrupting categories or their incomplete 26 penetrance. 27 This model is built using DN rare (local, SNV or indel). However, we expect 28 other types of data, i.e. CNV1,4,5 or common variants6,7, would converge to the same two- 29 factor model and the same set of molecular mechanisms (pathways or genes). Although 30 they are different types of variants, they should follow the same set of rules to be 31 associated with autism: they should disrupt the target genes, either the structure/function 32 or the expression, and at the same time they should hit the same set of pathways as these 33 local variants. We would like to explore multi-level integrated analysis on CNV or 34 GWAS data in our future work. 35 36

1

1 Supplementary Text Section 3: Pathways of DN events, integrated molecular 2 mechanism

3 Besides the three pathways described in Section 3 of main text, other pathways and 4 graphs are equally informative (Table 1 and Fig. S4). All pathway identified are 5 potentially causal, except digestion and absorption, which is likely just 6 association8. They even explain associated symptoms of ASD, including intellectual 7 disability9 (Glut, GABA), sleeping10 (Circ) and digestive8 (digestion) problems. Here are 8 the mechanisms revealed by these pathways with respect to ASD.

9 hsa00310 Lysine degradation (Fig. S4a)

10 All data converge to the protein lysine methylation branch, these genes are histone-lysine 11 N-methyltransferase for histone modification and transcription regulation. In addition, the 12 hydroxylation may also be important for autism. The other two genes, TMLHE11-13 and 13 PLOD3, likely affect histone methylation by change histone markers through 14 hydroxylation at up or down stream steps. In , most of these genes also 15 fall to GO:0018024 histone-lysine N-methyltransferase activity. It is known that 16 epigenetics and histone modifications are important for autism14,15.

17 hsa04810 Regulation of actin cytoskeleton (Fig. S4b)

18 Both the inputs (chemotaxis and immediate responses/effectors) and outputs (Actin 19 polymerization, stabilization, focal adhesion and adhesion junction) of this pathway are 20 relevant. 21 The actin cytoskeleton plays a major role in neuronal morphogenesis and structural 22 plasticity16. Example processes include axon initiation, growth, guidance and branching, 23 in morphogenesis of dendrites and dendritic spines, in synapse formation and stability, 24 and in axon and dendrite retraction16. Indeed, autism involves abnormal neuronal 25 development and connectivity17. In addition, actin remodeling in neurons are the key to 26 synapse formation and function18, which is a primary factor in autism pathology19.

27 hsa04010 MAPK signaling pathway (Fig. S4c)

28 Both the classical MAPK and Jnk-p38 pathways are relevant. Similar to the actin 29 cytoskeleton pathway, MAPK signaling also plays an important role in synaptic 30 formation, function and plasticity20,21. Both ERK and p38 MAPKs have defined roles21. 31 Consistently, our results showed that both perturbations in both classical and JNK/p38 32 MAPK pathways lead to autism. In known autism genes in SFARI gene database (labeled 33 in blue in Fig. S4c), there are plenty evidences on the classical MAPK pathway, but not 34 on JNK/p38 pathway. In other words, the latter or the most of its hit genes are novel 35 autism associations.

2

1 hsa04530 Tight junction (Fig. S4d)

2 Neuron communication depends on synapses. Synapses are essentially a type of cell 3 junction. Cell junctions and cell adhesion molecules not only connect pre- and 4 postsynaptic compartments, but also mediate trans-synaptic recognition and signaling 5 processes that are essential for the establishment, specification, and plasticity of 6 synapses22. It is known that brain function depends on cell adhesion molecules22. Tight 7 junction may be part of synaptic structure and function23.

8 hsa04713 Circadian entrainment (Fig. S4e)

9 Circadian entrainment is also a synapse related process. Like other synapse pathways, 10 most part of circadian entrainment is relevant to autism. Our results showed that circadian 11 entrainment pathway is hit by autism genes. In other words, autism shares causal genes or 12 molecular pathways with circadian disorders. In addition, autism or synaptic genes 13 interact with clock genes. In other words, these two conditions may affect each other 14 directly24. Indeed, Circadian disorder is a primary comorbidity of autism. Circadian 15 disorder and sleep problem are prevalent (up to 83%) in autism patients10.

16 hsa04974 Protein digestion and absorption (Fig. S4f)

17 Recent research shows that more than 50% of children with autism have GI symptoms, 18 food allergies, and maldigestion or malabsorption issues8. In addition, altered intestinal 19 permeability was found in 43% of autistic patients, but not found in any of the controls25. 20 Intestinal permeability, commonly called "leaky gut", may allow undigested food and 21 other toxins to enter the blood stream, which could cause health problems. 22

23 Supplementary Text Section 4: Subpathway biology, coherent fine details

24 In general, missense variants hitting selected pathways in probands are more relevant 25 biologically than in silbings or other missense (Fig. 3). The probands-sibling difference in 26 missense variants is well exhibited by the example pathways too. In probands (vs in 27 sibling), 6 out of 8 or 75% (vs 1 out of 4 or 25%, p= 4.8×10-02) missense events in Wnt 28 signaling, and 15 out of 15 or 100% (vs 1 out of 3 or 33%, p= 4.0×10-04) missense events 29 in the synapses pathways hit the relevant function domains. Our data consistently shows, 30 for probands vs silbings, there are not only more DN events in selected pathways, but 31 these events are also more disruptive (LGD and missense vs silent, Fig. 2 and Fig. S10, 32 Fig. S5-7), and more relevant in function (Fig. 4a-b and Fig. S8-9). 33 34 Autistic missense events on the same genes tend to hit residues extremely close and in the 35 same domain. This occurs to all cases we observed in Wnt and synapse pathways (Fig. 5,

3

1 Fig. S8-9), including ADCY5: A534T and R603C on AC domain, SLC6A1 or SLC6A13: 2 A288V and G299V or A450V and V459M on Sodium:neurotransmitter symporter 3 domain, CREBBP: Y1539C and T1569A on KAT11 domain. Assuming missense events 4 occur randomly, the chance to observe such closely occurred events in the same domain 5 are very small (p= 0.01, 0.002, 0.03 and 0.02 respectively). These data strongly suggest 6 that these missense events do not occurred in random, but precisely and consistently 7 targeting specific risky loci for autism. 8 9 Missense event cluster on cAMP second-messenger system (biology) 10 cAMP second-messenger system is key component of the synapse function, 11 neurotransmission, learning and memory (cognitive) processes26,27. In post synapses, 12 glutamate receptors (mGluRs) activate the coupled G upon glutamate binding. G 13 proteins then binds and control Adenylate cyclases, which produce cAMP, a primary 14 second messenger for downstream signaling. 15 16 Another missense event cluster in the glutamatergic synapse pathway 17 GRK can inhibit mGluR or GPCR signaling by phosphorylating activated the receptors 18 and by sequestering heterotrimeric G proteins28 (Fig. S7 and 11). Fig. S11 shows the 19 phosphorylation independent mechanism, including sequestration of G-alpha (Gi/o here) 20 subunits with its RGS homology (RH) domain and G-beta/gamma with its pleckstrin 21 homology (PH) domain (protein structure29). Indeed, missense variant hit GRK 22 (ADRBK2) in its PH domain likely perturb its interaction with G-beta and G-gamma. 23 The missense variant in GNAO1 hit the G-alpha domain close to the RGS binding site. 24 Notice the same GNAO1 hit the G-alpha domain also interacts with AC (ADCY5) in the 25 cAMP system (Fig. S7 and Fig. 4d). 26

27 Supplementary Text Section 5 Superpathway biology, emergent big picture

28 The selected pathways are distinct yet highly interconnected. For example, MAPK feed 29 into canonical Wnt pathway and inhibit TCF/LEF dependent transcription (Fig. 3a and 30 Fig. S4c). MAPK (PKC/PKA-Erk signaling), Glutamatergic synapse and Circadian 31 entrainment are tightly intertwined by definition (Fig. 3c, Fig. S4c and 4e). In addition, 32 they also share numerous other connections. For example, Wnt and MAPK are both 33 involved in adherens jucntions and focal adhesion (Fig. S4g-4h). Calcium signal is part 34 of MAPK pathway and Glutamatergic synapse (Fig. 3c and Fig. S4c). These commonly 35 connected pathways are also perturbed in ASD yet marginally significant (p.val=0.01- 36 0.10, Table S2). They may hit the significance level with a larger sample size. 37 38 Independent GO term analysis converges to the same set of biological themes.

4

1 We analyzed all 3 branches of GO tree, but only describe Biological Process (BP) and 2 Molecular Function (MF) in details here. Cellular Component (CC) results are less 3 informative and specific for functional interpretation. Nonetheless, CC are still consistent 4 and biologically relevant (Table S3). 5 Three major biological themes emerged from the selected BP and MF terms (Fig. S12). 6 These are synapse function, morphology and transcription, highly consistent with KEGG 7 pathway analysis (Fig. 5). 8 Each theme is supported by multiple GO terms. Each GO term associates with at least 1 9 theme, some with multiple themes. Two small GTPase related GO terms contribute to all 10 3 themes. Ras GTPase family regulate actin cytoskeleton hence synapse morphology, and 11 gene transcription through MAPK (Fig. S4b, Regulation of actin cytoskeleton). They also 12 control synapse transmission directly30,31. Another GO term associates with to all 3 13 themes, i.e. Adenyl ribonucleotide binding (mainly ATP binding 79/81). Indeed, 14 neurotransmission, synapse assembly (actin cytoskeleton) and activity dependent 15 transcription are all energy intensive processes32,33. ATP binding and energy supply is 16 critical for all of them. 17 The remaining associations between GO terms and biological themes are 1-to-1 and 18 straightforward. The Transcription theme is supported by chromatin modification, 19 histone-lysine N-methyltransferase activity, RNA polymerase II core binding. The 20 Morphology theme is supported by neuron projection morphogenesis, cell adhesion 21 molecule binding. The Synapse function theme is supported by calcium ion 22 transmembrane transporter activity and intracellular lipid transport. The latter is involved 23 in synaptic vesicle and protein delivery34. 24 In addition to agreement on big themes, the selected GO terms are complimentary yet 25 consistent with selected KEGG pathways. GO:0018024 histone-lysine N- 26 methyltransferase activity is essentially the same gene set as hsa00310 Lysine 27 degradation, both are subsets of chromatin modification. GO:0050839 cell adhesion 28 molecule binding overlap with hsa04520 Adherens junction, and GO:0048812 neuron 29 projection morphogenesis with hsa04530 Tight junction. 30

31 References

32 33 1 Pinto, D. et al. Functional impact of global rare copy number variation in autism spectrum 34 disorders. Nature 466, 368-372, doi:10.1038/nature09146 (2010). 35 2 Iossifov, I. et al. The contribution of de novo coding mutations to autism spectrum disorder. 36 Nature 515, 216-221, doi:10.1038/nature13908 (2014). 37 3 Sanders, S. J. et al. De novo mutations revealed by whole-exome sequencing are strongly 38 associated with autism. Nature 485, 237-241, doi:10.1038/nature10945 (2012). 39 4 Sebat, J. et al. Strong association of de novo copy number mutations with autism. Science 316, 40 445-449, doi:10.1126/science.1138659 (2007). 41 5 Glessner, J. T. et al. Autism genome-wide copy number variation reveals ubiquitin and neuronal 42 genes. Nature 459, 569-573, doi:10.1038/nature07953 (2009).

5

1 6 Weiss, L. A., Arking, D. E., Daly, M. J. & Chakravarti, A. A genome-wide linkage and 2 association scan reveals novel loci for autism. Nature 461, 802-808, doi:10.1038/nature08490 3 (2009). 4 7 Wang, K. et al. Common genetic variants on 5p14.1 associate with autism spectrum disorders. 5 Nature 459, 528-533, doi:10.1038/nature07999 (2009). 6 8 Horvath, K., Papadimitriou, J. C., Rabsztyn, A., Drachenberg, C. & Tildon, J. T. Gastrointestinal 7 abnormalities in children with autistic disorder. The Journal of pediatrics 135, 559-563 (1999). 8 9 Zoghbi, H. Y. & Bear, M. F. Synaptic dysfunction in neurodevelopmental disorders associated 9 with autism and intellectual disabilities. Cold Spring Harb Perspect Biol 4, 10 doi:10.1101/cshperspect.a009886 (2012). 11 10 Glickman, G. Circadian rhythms and sleep in children with autism. Neurosci. Biobehav. Rev. 34, 12 755-768, doi:10.1016/j.neubiorev.2009.11.017 (2010). 13 11 Nava, C. et al. Analysis of the X exome in patients with autism spectrum disorders 14 identified novel candidate genes, including TMLHE. Transl Psychiatry 2, e179, 15 doi:10.1038/tp.2012.102 (2012). 16 12 Celestino-Soper, P. B. et al. Use of array CGH to detect exonic copy number variants throughout 17 the genome in autism families detects a novel deletion in TMLHE. Hum. Mol. Genet. 20, 4360- 18 4370, doi:10.1093/hmg/ddr363 (2011). 19 13 Ziats, M. N. et al. Improvement of regressive autism symptoms in a child with TMLHE deficiency 20 following supplementation. Am J Med Genet A 167, 2162-2167, 21 doi:10.1002/ajmg.a.37144 (2015). 22 14 Grafodatskaya, D., Chung, B., Szatmari, P. & Weksberg, R. Autism spectrum disorders and 23 epigenetics. J. Am. Acad. Child Adolesc. Psychiatry 49, 794-809, doi:10.1016/j.jaac.2010.05.005 24 (2010). 25 15 De Rubeis, S. et al. Synaptic, transcriptional and chromatin genes disrupted in autism. Nature 515, 26 209-215, doi:10.1038/nature13772 (2014). 27 16 Luo, L. Actin cytoskeleton regulation in neuronal morphogenesis and structural plasticity. Annu. 28 Rev. Cell Dev. Biol. 18, 601-635, doi:10.1146/annurev.cellbio.18.031802.150501 (2002). 29 17 Belmonte, M. K. et al. Autism and abnormal development of brain connectivity. The Journal of 30 neuroscience : the official journal of the Society for Neuroscience 24, 9228-9231, 31 doi:10.1523/JNEUROSCI.3340-04.2004 (2004). 32 18 Cingolani, L. A. & Goda, Y. Actin in action: the interplay between the actin cytoskeleton and 33 synaptic efficacy. Nat Rev Neurosci 9, 344-356, doi:10.1038/nrn2373 (2008). 34 19 Bourgeron, T. A synaptic trek to autism. Curr. Opin. Neurobiol. 19, 231-234, 35 doi:10.1016/j.conb.2009.06.003 (2009). 36 20 Giachello, C. N. et al. MAPK/Erk-dependent phosphorylation of synapsin mediates formation of 37 functional synapses and short-term homosynaptic plasticity. J. Cell Sci. 123, 881-893, 38 doi:10.1242/jcs.056846 (2010). 39 21 Thomas, G. M. & Huganir, R. L. MAPK cascade signalling and synaptic plasticity. Nat Rev 40 Neurosci 5, 173-183, doi:10.1038/nrn1346 (2004). 41 22 Missler, M., Sudhof, T. C. & Biederer, T. Synaptic cell adhesion. Cold Spring Harb Perspect Biol 42 4, a005694, doi:10.1101/cshperspect.a005694 (2012). 43 23 Tang, V. W. Proteomic and bioinformatic analysis of epithelial tight junction reveals an 44 unexpected cluster of synaptic molecules. Biol Direct 1, 37, doi:10.1186/1745-6150-1-37 (2006). 45 24 Bourgeron, T. The possible interplay of synaptic and clock genes in autism spectrum disorders. 46 Cold Spring Harb. Symp. Quant. Biol. 72, 645-654, doi:10.1101/sqb.2007.72.020 (2007). 47 25 D'Eufemia, P. et al. Abnormal intestinal permeability in children with autism. Acta Paediatr. 85, 48 1076-1079 (1996). 49 26 Salin, P. A., Malenka, R. C. & Nicoll, R. A. Cyclic AMP mediates a presynaptic form of LTP at 50 cerebellar parallel fiber synapses. Neuron 16, 797-803 (1996). 51 27 Kandel, E. R. & Hawkins, R. D. The biological basis of learning and individuality. Sci. Am. 267, 52 78-86 (1992). 53 28 Kim, C. H., Lee, J., Lee, J. Y. & Roche, K. W. Metabotropic glutamate receptors: phosphorylation 54 and receptor signaling. J. Neurosci. Res. 86, 1-10, doi:10.1002/jnr.21437 (2008).

6

1 29 Tesmer, V. M., Kawano, T., Shankaranarayanan, A., Kozasa, T. & Tesmer, J. J. Snapshot of 2 activated G proteins at the membrane: the Galphaq-GRK2-Gbetagamma complex. Science 310, 3 1686-1690, doi:10.1126/science.1118890 (2005). 4 30 Zhu, J. J., Qin, Y., Zhao, M., Van Aelst, L. & Malinow, R. Ras and Rap control AMPA receptor 5 trafficking during synaptic plasticity. Cell 110, 443-455 (2002). 6 31 Chen, H. J., Rojas-Soto, M., Oguni, A. & Kennedy, M. B. A synaptic Ras-GTPase activating 7 protein (p135 SynGAP) inhibited by CaM kinase II. Neuron 20, 895-904 (1998). 8 32 Rangaraju, V., Calloway, N. & Ryan, T. A. Activity-driven local ATP synthesis is required for 9 synaptic function. Cell 156, 825-835, doi:10.1016/j.cell.2013.12.042 (2014). 10 33 Harris, J. J., Jolivet, R. & Attwell, D. Synaptic energy use and supply. Neuron 75, 762-777, 11 doi:10.1016/j.neuron.2012.08.019 (2012). 12 34 Rohrbough, J. & Broadie, K. Lipid regulation of the synaptic vesicle cycle. Nat Rev Neurosci 6, 13 139-150, doi:10.1038/nrn1608 (2005). 14 15

7

Probands Silent LGD Missense Nonsilent Sum (n=2508) Selected 83 52 134 186 269 (102.1) (34.6)** (132.2) (166.9)* Others 1209 386 1539 1925 3134 (1189.9) (403.4) (1540.8) (1944.1) All 1292 438 1673 2111 3403

Siblings Silent LGD Missense Nonsilent Sum (n=1911) Selected 62 16 69 85 147 (59.4) (13.8) (73.8) (87.6) Others 863 199 1079 1278 2141 (865.6) (201.2) (1074.2) (1275.4) All 925 215 1148 1363 2288

Table S1. The actual (and expected) counts of DN events by gene (columns) and pathway (rows) level effects. Independent categories by gene level effects are Silent, LGD (likely gene disrupting), and Missense. Nonsilent group is a composite of LGD and missense, Sum includes all. Pathway level effects or targets include the Selected pathways, Others and All. We also did overrepresentation tests on each cell, and cell with significant p-values are marked (**p= 0.001; * p= 0.006).

SSC Pathways p.val q.val p.con q.con j k Reference -02 1 hsa04720 Long-term 1.38×10 0.35 0.13 0.19 6 1 potentiation hsa04723 Retrograde 3.09×10-02 0.35 0.43 0.43 7 1 2,3 endocannabinoid signaling hsa04510 Focal adhesion 4.13×10-02 0.41 0.04 0.41 11 11 4,5 -02 6 hsa04520 Adherens junction 6.42×10 0.46 0.06 0.46 5 5 hsa04020 Calcium signaling 8.75×10-02 0.52 0.09 0.52 9 9 7,8 pathway

Table S2. Other pathways connected by 3 or more selected pathways in Table 1. Columns j and k are the marginal and conditional counts of selected genes. These pathways are marginally significant (p.val) in the pathway-level analysis, and likely relevant to ASD based on literature.

Sub SSC Pathways p.val q.val p.con q.con j k Ref. BP GO:0016568 chromatin 6.2×10-08 7.4×10-04 6.2×10-08 7.4×10-04 8 8 9,10 modification -06 -03 -07 -05 11 BP GO:0048812 neuron 2.0×10 5.5×10 3.7×10 2.0×10 8 8 projection morphogenesis BP GO:0032365 intracellular 2.7×10-04 9.4×10-02 1.6×10-04 6.9×10-03 7 7 12,13 lipid transport -04 -02 -04 -03 14-16 BP GO:0051056 regulation of 1.6×10 6.6×10 1.9×10 4.8×10 10 9 small GTPase mediated signal transduction CC GO:0030054 cell junction 4.2×10-06 5.6×10-03 4.2×10-06 5.6×10-03 12 12 6,17,18 CC GO:0035097 histone 1.7×10-03 1.4×10-01 1.2×10-03 1.8×10-02 14 10 9,19 methyltransferase complex CC GO:0031672 A band 8.8×10-06 5.8×10-03 1.8×10-03 1.7×10-02 9 5 CC GO:0044459 plasma 2.3×10-04 6.2×10-02 4.5×10-03 2.7×10-02 8 3 membrane part CC GO:0005694 chromosome 3.6×10-03 1.8×10-01 1.2×10-02 8.8×10-02 8 2 CC GO:0005578 proteinaceous 4.5×10-03 1.9×10-01 1.4×10-02 4.2×10-02 8 8 extracellular matrix -09 -05 -09 -05 14-16 MF GO:0005083 small GTPase 7.7×10 3.0×10 7.7×10 3.0×10 8 8 regulator activity MF GO:0032559 adenyl 1.3×10-07 9.6×10-05 1.4×10-06 3.1×10-05 7 7 20,21 ribonucleotide binding MF GO:0018024 histone-lysine 1.7×10-04 2.3×10-02 7.2×10-05 1.7×10-03 10 9 9,19 N-methyltransferase activity MF GO:0050839 cell adhesion 1.1×10-03 1.2×10-01 3.1×10-04 2.5×10-03 12 12 4,22 molecule binding -03 -01 -04 -03 9 MF GO:0000993 RNA 2.0×10 1.9×10 8.2×10 2.5×10 14 10 polymerase II core binding MF GO:0015085 calcium ion 1.9×10-03 1.8×10-01 1.8×10-03 3.5×10-03 9 5 7,8 transmembrane transporter activity MF GO:0030234 enzyme 1.1×10-07 9.6×10-05 2.0×10-02 2.0×10-02 8 3 regulator activity

Table S3. Significant Gene Ontology groups selected from SSC, their test statistics and references. Columns j and k are the marginal and conditional counts of selected genes. These groups are likely drivers or disease causing groups due to the special analysis procedure (see Methods). Analysis was done the same way as for pathways in Table 1.

Table S4. Validated and selected variants in the multi-level integrated analyses of SSC and ASC data. Events are selected at variant and gene level as described in Methods. For SSC, all validated variants are listed, including the silent group.

Table S5. Validated variants and functional annotations in the selected pathways of SSC data.

References

1 Jung, N. H. et al. Impaired induction of long-term potentiation-like plasticity in patients with high- functioning autism and Asperger syndrome. Dev. Med. Child Neurol. 55, 83-89, doi:10.1111/dmcn.12012 (2013). 2 Zhang, L. & Alger, B. E. Enhanced endocannabinoid signaling elevates neuronal excitability in fragile X syndrome. The Journal of neuroscience : the official journal of the Society for Neuroscience 30, 5724-5729, doi:10.1523/JNEUROSCI.0795-10.2010 (2010). 3 Kerr, D. M., Downey, L., Conboy, M., Finn, D. P. & Roche, M. Alterations in the endocannabinoid system in the rat valproic acid model of autism. Behav. Brain Res. 249, 124-132, doi:10.1016/j.bbr.2013.04.043 (2013). 4 Betancur, C., Sakurai, T. & Buxbaum, J. D. The emerging role of synaptic cell-adhesion pathways in the pathogenesis of autism spectrum disorders. Trends Neurosci. 32, 402-412, doi:10.1016/j.tins.2009.04.003 (2009). 5 Wei, H. et al. Abnormal cell properties and down-regulated FAK-Src complex signaling in B lymphoblasts of autistic subjects. The American journal of pathology 179, 66-74, doi:10.1016/j.ajpath.2011.03.034 (2011). 6 Williams, E. L. & Casanova, M. F. Above genetics: lessons from cerebral development in autism. Transl Neurosci 2, 106-120, doi:10.2478/s13380-011-0016-3 (2011). 7 Gargus, J. J. Genetic calcium signaling abnormalities in the central nervous system: seizures, migraine, and autism. Ann. N. Y. Acad. Sci. 1151, 133-156, doi:10.1111/j.1749-6632.2008.03572.x (2009). 8 Lu, A. T., Dai, X., Martinez-Agosto, J. A. & Cantor, R. M. Support for calcium channel gene defects in autism spectrum disorders. Mol Autism 3, 18, doi:10.1186/2040-2392-3-18 (2012). 9 De Rubeis, S. et al. Synaptic, transcriptional and chromatin genes disrupted in autism. Nature 515, 209-215, doi:10.1038/nature13772 (2014). 10 Lasalle, J. M. Autism genes keep turning up chromatin. OA Autism 1, 14 (2013). 11 Hutsler, J. J. & Zhang, H. Increased dendritic spine densities on cortical projection neurons in autism spectrum disorders. Brain Res. 1309, 83-94, doi:10.1016/j.brainres.2009.09.120 (2010). 12 El-Ansary, A. & Al-Ayadhi, L. Lipid mediators in plasma of autism spectrum disorders. Lipids Health Dis 11, 160, doi:10.1186/1476-511X-11-160 (2012). 13 Rohrbough, J. & Broadie, K. Lipid regulation of the synaptic vesicle cycle. Nat Rev Neurosci 6, 139-150, doi:10.1038/nrn1608 (2005). 14 Pinto, D. et al. Functional impact of global rare copy number variation in autism spectrum disorders. Nature 466, 368-372, doi:10.1038/nature09146 (2010). 15 Zhu, J. J., Qin, Y., Zhao, M., Van Aelst, L. & Malinow, R. Ras and Rap control AMPA receptor trafficking during synaptic plasticity. Cell 110, 443-455 (2002). 16 Chen, H. J., Rojas-Soto, M., Oguni, A. & Kennedy, M. B. A synaptic Ras-GTPase activating protein (p135 SynGAP) inhibited by CaM kinase II. Neuron 20, 895-904 (1998). 17 Liu, Z., Li, N. & Neu, J. Tight junctions, leaky intestines, and pediatric diseases. Acta Paediatr. 94, 386-393 (2005). 18 Tang, V. W. Proteomic and bioinformatic analysis of epithelial tight junction reveals an unexpected cluster of synaptic molecules. Biol Direct 1, 37, doi:10.1186/1745-6150-1-37 (2006). 19 Akbarian, S. & Huang, H. S. Epigenetic regulation in human brain-focus on histone lysine methylation. Biol. Psychiatry 65, 198-203, doi:10.1016/j.biopsych.2008.08.015 (2009). 20 Rangaraju, V., Calloway, N. & Ryan, T. A. Activity-driven local ATP synthesis is required for synaptic function. Cell 156, 825-835, doi:10.1016/j.cell.2013.12.042 (2014). 21 Harris, J. J., Jolivet, R. & Attwell, D. Synaptic energy use and supply. Neuron 75, 762-777, doi:10.1016/j.neuron.2012.08.019 (2012). 22 Ye, H., Liu, J. & Wu, J. Y. Cell adhesion molecules and their involvement in autism spectrum disorder. Neurosignals 18, 62-71, doi:10.1159/000322543 (2010).

a

b

c

Fig. S1. Multi-level recurrence of de novo mutations in SSC study. Venn diagrams and test statistics on recurrence at different levels for a) probands and b) siblings. c) recurrence rates in selected vs other genes or pathways for probands and siblings. Term “considered” or “selected” refers to items before or after selection process at each level (Methods). See Supplementary Table 4 for full lists of selected variants and genes used in the analysis. Error bars represent standard error of the mean (SEM).

Fig. S2. Autism genetic association analysis across variant, gene and pathway levels with the SSC exome DN data. Column 1: DN event rates and association test by gene level effect only without considering pathway level effect, i.e. this is the marginal sum of column 1 and 2 from Figure 2; column 2: A descriptive model for autism genetic association. Rows are marginal (Row 1) and conditional (Row 2-3) association tests and statistics, and corresponding model representations. Error bars represent standard error of the mean (SEM).

Fig. S3. Prevalence of DN variant types by gene and pathway level effects in the SSC exome study. This figure is comparable to row 1 and column 1-2 of Figure 2, and legend is the same as there too. Error bars represent standard error of the mean (SEM).

a

b c d

e f

g

h

Fig. S4 An integrated view of autism associated DN variants or genes from multiple sources: a-f) selected KEGG pathways besides those shown in Figure 4, and g-h) commonly connected pathways. DN variants data come from SSC and ASC studies, and reported autism genes from SFARI Gene Database. P-values are from pathway analysis (Table 1). Data are integrated and visualized on KEGG pathway graphs using Pathview (23).

a

b

Fig. S5 An integrated view of DN variants by gene level effects in selected KEGG pathways, hsa04310 Wnt signaling pathway: a) probands, b) silbings. DN variants data come from SSC study. Data are integrated and visualized on KEGG pathway graphs using Pathview (23).

a

b

Fig. S6 An integrated view of DN variants by gene level effects in selected KEGG pathways, hsa04727 GABAergic synapse: a) probands, b) silbings. DN variants data come from SSC study. Data are integrated and visualized on KEGG pathway graphs using Pathview (23). Nodes in bold black box are synapse membrane proteins (neurotransmitter receptors, transporters and ion channels) hit by missense events. Effects of these events on protein functions are exhibited in Figure 4c.

a

b

Fig. S7 An integrated view of DN variants by gene level effects in selected KEGG pathways, hsa04724 Glutamatergic synapse: a) probands, b) silbings. DN variants data come from SSC study. Data are integrated and visualized on KEGG pathway graphs using Pathview (23).Nodes in bold black box are synapse membrane proteins (neurotransmitter receptors, transporters and ion channels) hit by missense events. Effects of these events on protein functions are exhibited in Figure 4c. We also identified two clusters of interacting missense events as shown in blue and green dashed boxes. Detailed biology of these events is interpreted in Figure 4d and Fig. S11.

Fig. S8. 1D protein domain structure and missense variants of genes in Wnt signaling pathway for both probands and siblings. It is recommended to view and interpret these events in the pathway context in Fig. S6.

Fig. S9. 1D protein domain structure and missense variants of genes in synapse pathways for both probands and siblings. It is recommended to view and interpret these events in the pathway context in Fig. S7- 8.

Fig. S10. 1D protein domain structure and DN mutations of gene pairs in Wnt signaling and synapse pathways. These gene pairs are the same or similar genes hit by DN variants in both probands and siblings. Frame shift variants are labeled by suffix fs with position. It is recommended to view and interpret these events in the pathway context in Fig. S6-8.

Fig. S11. 1 and 3D protein structures and missense variants hitting mGluR inhibitor GRK (ADRBK2) and interacting G proteins. Pathway context is shown in the blue dashed box in in Fig. S7. Biology background and function impact of these missense variants are described in Supplementary Text Section 4.

Fig. S12. Selected GO terms and emerging biological themes from the SSC exome variant data. GO analysis was done the same way as KEGG pathway analysis and results are listed in Supplementary Table 3. Significant terms from Biological Process or Molecular Function branches were considered here. These GO terms converge to 3 higher level biological themes within the synapse context, i.e. synapse function (transmission), morphology (wiring) and transcription (plasticity or dynamics). Each theme is supported by multiple GO terms. Each GO term associates with at least 1 theme, some with multiple themes.