
Supporting Information

Zhang et al. 10.1073/pnas.1309725110 (www.pnas.org/cgi/content/short/1309725110)

SI Methods

Cell Culture and Neural Differentiation. The embryonic stem cell (hESC) lines H1 (WA01) and H9 (WA09), both from WiCell, were maintained in an undifferentiated state by culturing on Matrigel-coated plates (Becton & Dickinson) in feeder-free and serum-free, component-defined conditions. H1 was cultured in DMEM/F12 medium with supplement components described previously (1); H9 was cultured in mTeSR medium (Stem Cell Technologies). The neuronal induction, propagation, and characterization of neuronal progenitor cells (NPCs) were carried out as we and others previously described (1, 2).

Viral Infection of hESCs. Pooled shRNA library screening. Colonies of hESCs maintained on Matrigel (BD Biosciences) were pretreated for 1 h with 10 μM Y27632 compound (Calbiochem) and dissociated with Accutase (Millipore), and the single-cell suspension of undifferentiated hESCs was infected with pools of the pLKO.1-based lentiviral short hairpin RNA (shRNA) library targeting the entire human genome [MISSION library, Sigma-Aldrich, originally from The RNAi Consortium (TRC)/Broad Institute]. For one viral pool (1/10th of the human genome, ∼8,600 distinct shRNAs with four to five different shRNAs targeting the same gene) we infected 1 × 10⁸ undifferentiated hESCs prepared as a single-cell suspension, such that at least 300 cells are infected with the same single species of viral particle when using a multiplicity of infection of 0.3. Forty-eight hours later the cells were split and an aliquot (∼5 × 10⁷ cells) was collected for a time 0 (T0) input reference; the rest of the cells were cultured further in pluripotent condition. Because the pLKO.1 backbone harbors the puromycin resistance gene, the uninfected cells were eliminated by adding this chemical (1 μg/mL) at 72 h after the infection. The rest of the cell population was expanded for an additional 3 d as pluripotent cells and then divided into two experimental arms: (i) propagated in a condition that maintains the undifferentiated state and (ii) differentiated into neuronal precursors/progenitors using a monolayer approach (1, 2). Puromycin selection was maintained throughout the entire experiment. The pluripotent arm of the experiment, performed in parallel, served as an additional control to obtain a time course for self-renewal and overall cellular survival. The undifferentiated cells were maintained in pluripotent ES medium, and ∼5 × 10⁷ cells were collected every 5–6 d for genomic DNA extraction, over a total of 4 wk. For neuronal differentiation, cells were cultured in NPC medium [DMEM/F12 (Gibco-Invitrogen) containing 2% (vol/vol) B27 supplement and 1% (vol/vol) from a 10,000 units/mL penicillin and 10,000 μg/mL streptomycin stock solution] in the presence of noggin (R&D Systems; 100 ng/mL) for 14 d before the cells were split and transferred to poly-L-ornithine (20 μg/mL)/fibronectin (3 μg/cm²)-coated flasks, and maintained in NPC medium with a low concentration of recombinant human basic FGF (Millipore) (10 ng/mL) but without noggin for another 33 d before live-cell FACS sorting. The experimental schedule is illustrated by Fig. 1B.

Individual viral shRNA infection. We infected undifferentiated H1 or H9 hESCs as single-cell suspensions plated in multiwell plates with viral lysates of individual pLKO.1-based shRNAs with a multiplicity of infection of 2. Two to three shRNA species targeting different regions of the same gene were used. As in the primary screen with the pooled library, 72 h after transduction puromycin (1 μg/mL) was added to each well, and this selection was maintained throughout the rest of the experiment. To test the effects of individual shRNAs, two parallel experiments were performed: (i) after recovery from the infection and puromycin selection (day 6 postinfection), the cells were differentiated into NPCs using noggin induction (150 ng/mL) for 10 d, then the cells were gently propagated using Accutase and grown further on poly-L-ornithine (20 μg/mL)/fibronectin (1.7 μg/cm²)-coated plates in NPC medium with basic FGF (10 ng/mL) for 2 wk more before harvesting the samples for RNA extraction or immunostaining (see below); (ii) for each shRNA a simultaneous cell culture was also maintained in hESC medium to maintain pluripotency. The pluripotent cells were cultured for 24 d with medium change every 2 d. At the end of the experiment cell lysates were collected in TRIzol reagent (Invitrogen) and kept at −80 °C for RNA extraction.

FACS Sorting. Live neural cell surface antigen staining using anti-polysialylated neural cell adhesion molecule (anti–PSA-NCAM) antibody was performed as described by Pruszak et al. (3) with minor modifications. In preliminary experiments the concentrations of primary and secondary antibodies were titrated for specificity using isotype-specific antibody controls along with hESCs and their derived NPCs. Briefly, for large-scale, preparative FACS all of the infected NPCs cultured under puromycin selection for 4 wk were quickly digested at 37 °C using Accutase (Millipore) and collected in sterile tubes containing FACS buffer [2% (vol/vol) BSA (Invitrogen) in HBSS (Gibco-Invitrogen)]. Following centrifugation at 100 × g for 4 min at +4 °C, the pellets were resuspended in FACS buffer, triturated with a glass pipet, and filtered using a 40-μm sterile strainer (BD Biosciences). A total of 2.5 × 10⁷ cells were resuspended in 2.5 mL FACS buffer. After removing aliquots for unstained control and unsorted cells, the cell suspensions were incubated with anti–PSA-NCAM IgM monoclonal antibody (Millipore; 1:250 dilution) for 30 min at 4 °C using gentle rotation. Following a wash with washing buffer [0.1% (vol/vol) BSA in HBSS], the cells were incubated with Alexa-488 secondary antibody (1:500 dilution) for 20 min at +4 °C in the dark. The cell suspensions were washed twice using washing buffer containing 0.1% BSA, then sorted into three populations (PSA-NCAM–negative, PSA-NCAM weakly positive, and PSA-NCAM strongly positive cells) by preparative FACS sorting using a VantageSE sorter.

DNA Extraction, PCR Amplification Adapted for Deep Sequencing. To extract genomic DNA, lysis buffer (50 mM Tris, pH 7.5, 50 mM EDTA, pH 8.0, 100 mM NaCl, 5 mM DTT, 0.5 mM Spermidine-HCl, 1% SDS) was added to the various cell samples. The cell lysates were deproteinized by incubation with 100 μL proteinase K (10 mg/mL) overnight at 55 °C followed by RNase A (0.1 mg/mL) treatment at 37 °C for half an hour. Finally, phenol-chloroform extraction and ethanol precipitation were used to recover genomic DNA. A total of 3.0 μg gDNA for each experimental group was PCR-amplified to recover shRNA inserts for deep sequencing. The PCR was performed in multiple tubes, each with 50–100 ng gDNA, 200 nM dNTPs, 200 nM forward (5′ AATGATACGGCGACCACCGATTCTTGGGTAGTTTGC) and reverse primers (5′ CAAGCAGAAGACGGCATACGACTGCCATTTGTCTCG) containing Illumina Genome Analyzer II adapter sequences, 0.6 μL DMSO, and 0.6 U Phusion Polymerase (New England Biosciences). The PCR program setting was 98 °C for 2 min, 26 cycles of 98 °C for 30 s, 60 °C for 15 s, and 72 °C for 15 s, and then 72 °C for 5 min. The PCR products were pooled together and purified using MinElute columns (Qiagen) before running on a 2% (wt/vol) E-gel (Invitrogen), and the PCR products were cut out and purified using a MinElute gel purification kit (Qiagen). The PCR products were then sequenced using Illumina GAII with a custom-designed sequencing primer (5′ TCTTGTGGAAAGGACGAAACACCGG) to capture the sequences of shRNA inserts (single-end, 36-bp sequencing).

RNA Extraction, cDNA Synthesis, and Quantitative Real-Time PCR. For cells cultured in pluripotent condition, total RNA was extracted with the TRIzol reagent (Invitrogen), then treated with DNase I and purified using an RNeasy mini kit (Qiagen). Differentiated NPCs had a limited number of cells (20–5,000 per well), so our modified RNA extraction protocol was applied using reagents of the RNeasy Plus Micro kit (Qiagen) (4). In brief, for samples with fewer than 200 cells, 200 μL of RLT Plus Lysis Buffer with 2-mercaptoethanol (Qiagen) was added, and the cells were resuspended and vortexed at speed 8 on an SP vortex mixer (Boxter) 30 times for 1 s each. For samples with 200–1,000 cells, 350 μL of RLT Plus Lysis Buffer with 2-mercaptoethanol was added and the sample was pipetted up and down 10 times, followed by vortexing for 1 min. Afterward, purification steps followed the protocol of the RNeasy Plus Micro kit, with 20 ng carrier RNA applied. The cDNA synthesis was performed according to a protocol especially suitable for low amounts of total RNA. The purified total RNA (13.5 μL) was mixed with 3.3 mM EDTA, 0.33 mM dNTP, and 6.7 μM oligo-dT23 (all RNase-free) in a total volume of 15 μL, incubated at 65 °C for 5 min, and immediately chilled on ice for 3 min. This process was followed by adding 2 mM DTT, 1 mM RNaseOut, 6 mM magnesium, 1 μL SuperScript reverse transcriptase II, and 1× first strand buffer (Invitrogen), and incubating at 42 °C for 60 min, followed by 70 °C for 10 min to inactivate the transcriptase. For real-time RT-PCR, primers were designed such that primer pairs spanned two exons (primer sequences are listed in Table S8). The primers were prevalidated to show specificity with a single band in agarose gel electrophoresis and a single melting curve peak in real-time PCR. In addition, each set of primers was tested for amplification efficiency using a standard curve, to confirm that they gave close to twofold amplification with each PCR cycle. The expression level of each gene was calculated as the relative abundance compared with an internal control gene, TATA box binding protein (TBP), using the ΔΔCt method. Real-time PCR was performed on an ABI 7900, using 3 μL cDNA (prepared on the same day), FastStart Universal SYBR Green Master mix (Roche), and 0.26 μM of primers.

Immunofluorescent Microscopy. H1 and H9 hESC-derived NPCs either uninfected or harboring various shRNAs (including nontargeting controls) were plated onto poly-L-ornithine/fibronectin-coated eight-well glass-chamber slides (Falcon) after noggin induction, as described above. At the end of differentiation the cells were fixed with 4% paraformaldehyde and immunostaining was performed as previously described (1). Briefly, for SOX1 and NESTIN staining 10% normal donkey serum/0.3% (vol/vol) Triton X-100 in PBS was used for blocking; for TAR DNA binding protein (TARDBP) immunostaining the cells were permeabilized with 0.2% Triton X-100 and washed first before blocking; the PSA-NCAM immunostaining was performed without any detergent present. The following primary antibodies and dilutions were used: PSA-NCAM (mouse mAb IgM, Chemicon; 1:175), NESTIN (mouse mAb, Millipore; 1:200), SOX1 (goat pAb, R&D Systems; 1:20), TARDBP (rabbit pAb, Proteintech; 1:100). Primary antibodies were incubated overnight at +4 °C, and after washing the cells were incubated with the respective fluorescence-labeled secondary antibodies for 1 h at room temperature. After washing and mounting, a Zeiss Apotome fluorescence microscope providing optical sectioning was used for image acquisition.

Statistical Analyses. The statistical approach RIGER was developed at the Broad Institute specifically for the pooled library screening strategy using the TRC/Sigma library (5). It is a nonparametric approach, ultimately based on gene-enrichment analysis methodology with a consideration of the entire list of shRNAs and several hairpin shRNA features. We used the Second Best Rank feature of RIGER, which scores the second highest shRNA effect, therefore requiring that two different shRNAs have the same effect. Analysis of gene hits was performed using the Database for Annotation, Visualization and Integrated Discovery (DAVID) v6.7 (6) and WebGestalt (WEB-based GEne SeT AnaLysis Toolkit) (7), using the library pool as input for comparison.

DAPPLE (Disease Association Protein-Protein Link Evaluator) (8) is based on protein–protein interaction information reported in the literature and was used to test whether a greater degree of connectivity than expected by chance exists among proteins encoded by the gene hits within a phenotypic category of the screen. We submitted the list of 74 genes whose shRNAs showed gradual depletion during neural differentiation and analyzed it against the entire library input; different random inputs of 500 targeted genes did not generate any other network with the level of confidence documented. Genes were defined as being "nervous system-related" if they were implicated in literature analysis primarily using PubMed (www.ncbi.nlm.nih.gov/pubmed). The lists of genes implicated in autism spectrum disorders (ASD) and nonsyndromic mental retardation/intellectual disability (MR/ID) were created by reviewing recent large-scale, reproduced genomic datasets of affected individuals (and their first-degree relatives, when indicated) on genetic association, de novo and inherited copy number variations (CNVs), and de novo and inherited single nucleotide mutations revealed by whole-exome sequencing (9–25). Whether the relative abundance of gene targets of interest (i.e., specificity for neural differentiation by experimental data, relation to nervous system physiology and pathology, or enrichment of genes implicated in ASD/ID) in each phenotypic category of the screen was greater than expected by chance was evaluated by Fisher's two-tailed exact test for 2 × 2 contingency tables.
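The two-tailed Fisher's exact test for a 2 × 2 table can be sketched as follows. This is a minimal illustration using only the Python standard library, not the analysis code used for the screen; the function name and the example counts are hypothetical and are not taken from the data.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Probability of x counts in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )
    return min(1.0, p_value)


# Hypothetical counts (NOT from the paper): rows are two gene-hit categories,
# columns are nervous system-related genes vs. all other genes.
example_p = fisher_exact_two_tailed(12, 8, 30, 150)
```

Each row of the table is one phenotypic category and each column is membership in the gene set of interest, so one call per pair of categories gives the pairwise comparisons.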

1. Wu JQ, et al. (2010) Dynamic transcriptomes during neural differentiation of human embryonic stem cells revealed by short, long, and paired-end sequencing. Proc Natl Acad Sci USA 107(11):5254–5259.
2. Wu H, et al. (2007) Integrative genomic and functional analyses reveal neuronal subtype differentiation bias in human embryonic stem cell lines. Proc Natl Acad Sci USA 104(34):13821–13826.
3. Pruszak J, Sonntag KC, Aung MH, Sanchez-Pernaute R, Isacson O (2007) Markers and methods for cell sorting of human embryonic stem cell-derived neural cell populations. Stem Cells 25(9):2257–2268.
4. Pan X, et al. (2013) Two methods for full-length RNA sequencing for low quantities of cells and single cells. Proc Natl Acad Sci USA 110(2):594–599.
5. Luo B, et al. (2008) Highly parallel identification of essential genes in cancer cells. Proc Natl Acad Sci USA 105(51):20380–20385.
6. Huang W, Sherman BT, Lempicki RA (2009) Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc 4(1):44–57.
7. Zhang B, Kirov S, Snoddy J (2005) WebGestalt: An integrated system for exploring gene sets in various biological contexts. Nucleic Acids Res 33(Web Server issue):W741–W748.
8. Rossin EJ, et al.; International Inflammatory Bowel Disease Genetics Consortium (2011) Proteins encoded in genomic regions associated with immune-mediated disease physically interact and suggest underlying biology. PLoS Genet 7(1):e1001273.
9. Ben-David E, Shifman S (2012) Combined analysis of exome sequencing points toward a major role for transcription regulation during brain development in autism. Mol Psychiatry, 10.1038/mp.2012.148.
10. Cooper GM, et al. (2011) A copy number variation morbidity map of developmental delay. Nat Genet 43(9):838–846.
11. de Ligt J, et al. (2012) Diagnostic exome sequencing in persons with severe intellectual disability. N Engl J Med 367(20):1921–1929.
12. Gilman SR, et al. (2011) Rare de novo variants associated with autism implicate a large functional network of genes involved in formation and function of synapses. Neuron 70(5):898–907.
13. Iossifov I, et al. (2012) De novo gene disruptions in children on the autistic spectrum. Neuron 74(2):285–299.
14. Lim ET, et al.; NHLBI Exome Sequencing Project (2013) Rare complete knockouts in humans: Population distribution and significant role in autism spectrum disorders. Neuron 77(2):235–242.
15. Michaelson JJ, et al. (2012) Whole-genome sequencing in autism identifies hot spots for de novo germline mutation. Cell 151(7):1431–1442.
16. Neale BM, et al. (2012) Patterns and rates of exonic de novo mutations in autism spectrum disorders. Nature 485(7397):242–245.
17. O'Roak BJ, et al. (2012) Multiplex targeted sequencing identifies recurrently mutated genes in autism spectrum disorders. Science 338(6114):1619–1622.
18. O'Roak BJ, et al. (2012) Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations. Nature 485(7397):246–250.
19. Pinto D, et al. (2010) Functional impact of global rare copy number variation in autism spectrum disorders. Nature 466(7304):368–372.
20. Rauch A, et al. (2012) Range of genetic mutations associated with severe non-syndromic sporadic intellectual disability: an exome sequencing study. Lancet 380(9854):1674–1682.
21. Sanders SJ, et al. (2012) De novo mutations revealed by whole-exome sequencing are strongly associated with autism. Nature 485(7397):237–241.
22. Sebat J, et al. (2007) Strong association of de novo copy number mutations with autism. Science 316(5823):445–449.
23. Talkowski ME, et al. (2012) Sequencing chromosomal abnormalities reveals neurodevelopmental loci that confer risk across diagnostic boundaries. Cell 149(3):525–537.
24. van Bokhoven H (2011) Genetic and epigenetic networks in intellectual disabilities. Annu Rev Genet 45:81–104.
25. Yu TW, et al. (2013) Using whole-exome sequencing to identify inherited causes of autism. Neuron 77(2):259–273.

Fig. S1. Recovery of shRNAs for high-throughput sequencing in pooled RNAi screen for genes involved in neural differentiation of hESCs. (A) PSA-NCAM FACS efficiently separates hESC-derived neuronal progenitors. FACS analysis by a monoclonal antibody against the PSA moiety of NCAM of control NPCs derived from H1 hESCs is shown. The blue graph represents the PSA-NCAM pattern; the red graph is the isotype-specific control. FACS separation of live cells is optimal for this phenotype-based genomic screen, so the subsequent analysis of shRNA abundance using deep sequencing can be performed without the interference of fixation. (B) Preparative FACS. Cells differentiated to NPCs after infecting hESCs with the pooled shRNA library and maintaining under puromycin selection were sorted into three groups based on their PSA-NCAM expression. (C) Recovery of shRNAs from gDNA for high-throughput sequencing. The shRNA inserts from each sorted cell population were recovered by single-step PCR using primers adapted to the Illumina GAII platform. The gel image shows the uniformity of PCR products of shRNA species obtained from each gDNA aliquot of two different FACS-sorted cell populations. (D) Distribution of the recovered shRNA species. The number of times each unique shRNA sequence appears in the sequenced library represents the abundance of the given shRNA species in the infected cell population. Forty-eight hours after transduction (T0) about 96% of the shRNA clones from the tested pool were represented in the sequencing reads with a near-normal distribution pattern. Similar patterns were seen from the recovered shRNAs of PSA-NCAM–sorted cells (PSA-NCAM −, negative; PSA-NCAM ++, strongly positive) and in cells grown in pluripotent conditions for 4 wk (T5). These results support the efficiency of the protocol for infecting hES cells and of recovery of shRNA inserts. x axis: log2 sequence read counts; y axis: frequency of the shRNAs.
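The library representation that underlies this recovery can be estimated from the infection parameters given in SI Methods (1 × 10⁸ cells, multiplicity of infection of 0.3, ∼8,600 shRNAs per pool). The sketch below is a back-of-the-envelope estimate only: it assumes Poisson-distributed integrations and equal representation of every shRNA in the viral pool, neither of which is stated in the text, and the function names are hypothetical.

```python
import math


def expected_cells_per_shrna(n_cells, moi, n_shrnas):
    """Expected number of infected cells carrying each shRNA species.

    Assumes Poisson-distributed integrations and equal representation of
    every shRNA in the viral pool (both are simplifying assumptions).
    """
    infected_fraction = 1.0 - math.exp(-moi)  # P(at least one integration)
    return n_cells * infected_fraction / n_shrnas


def single_integration_fraction(moi):
    """Fraction of infected cells that carry exactly one integration."""
    return moi * math.exp(-moi) / (1.0 - math.exp(-moi))


# Values quoted in SI Methods: 1e8 cells, MOI 0.3, ~8,600 shRNAs per pool.
coverage = expected_cells_per_shrna(1e8, 0.3, 8600)
single = single_integration_fraction(0.3)
```

Under these assumptions the estimate is roughly 3,000 infected cells per shRNA species, above the stated minimum of 300, with about 86% of infected cells carrying a single integration. Losses during splitting and selection are not modeled.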

Fig. S2. Protein:protein interaction of all genes targeted by shRNAs that were depleted during neuronal differentiation. To test whether the gene targets whose shRNAs were gradually depleted during differentiation of the pooled shRNA library-infected hESCs had a greater degree of connectivity than expected by chance, we submitted the list of these gene targets (74 genes) to DAPPLE and analyzed it against the genes of the library pool as input. Different random inputs of 500 targeted genes did not generate any other network with the level of confidence documented. The group of hits that were identified as neural system-related by literature curation (n = 32) displayed statistically significant connectivity (P = 0.027) over that expected by chance when considering direct or indirect protein interactions (i.e., interactions through intermediate genes not on the submitted list). The network of indirect interactions is visualized (Lower).

Fig. S3 plot (image not reproduced). x axis: −log10 (P), neural differentiation arm; y axis: −log10 (P), pluripotency arm.

Fig. S3. Rank order analysis of gene hits by RIGER to assess neural differentiation specificity. RIGER analysis (Second Best Ranking) was performed to rank the genes whose shRNAs were gradually depleted (Depleted) in PSA-NCAM strong-positive cell populations at the end of neural differentiation (neural differentiation arm). The behavior of the respective shRNAs analyzed in undifferentiated cells grown for the same time as the neural differentiation arm (T5 vs. T0 input time) (pluripotent arm) was also ranked by RIGER, and the two rankings were plotted. P values indicate the ranking and significance of differences (depletion in this case) between each population in the two arms. Individual dots within the oval are those depleted genes (62 out of 70) whose shRNAs showed no significant change between T5 and T0 in the undifferentiated pluripotent arm (P > 0.05). x axis, −log10 (P values) for the neural differentiation arm; y axis, −log10 (P values) for the undifferentiated arm. The x and y axes intersect at 1.30103 [= −log10 (0.05)].
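The axis intersection and the reading of the oval can be reproduced with a short sketch. It assumes only what the legend states (a cutoff of P = 0.05 on both axes); the function names and the example P values are hypothetical.

```python
import math

ALPHA = 0.05  # significance cutoff quoted in the legend


def neg_log10(p_value):
    """Transform a P value to the -log10 scale used on both axes."""
    return -math.log10(p_value)


def is_neural_specific(p_neural, p_pluripotent, alpha=ALPHA):
    """True if depletion is significant only in the neural differentiation arm."""
    return p_neural < alpha and p_pluripotent > alpha


# Position at which the two axes intersect.
threshold = neg_log10(ALPHA)

# Hypothetical P values for two genes (not taken from the screen).
gene_a = is_neural_specific(0.001, 0.4)   # depleted in the neural arm only
gene_b = is_neural_specific(0.001, 0.01)  # depleted in both arms
```

A gene counts as differentiation-specific here when its −log10 (P) lies above the threshold on the x axis and below it on the y axis.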

Fig. S4. Target validation by individual shRNA-mediated silencing in a second hES line, H9. (A) Quantitative RT-PCR analysis shows fold differences in mRNA expression of the respective target gene in individual shRNA-infected cells differentiated to NPCs from H9 in the same manner as performed for the H1 hESC line. (B) NCAM1 mRNA expression by quantitative RT-PCR of individual shRNA-infected cells after neural differentiation. For both studies, SEM represents three technical replicates, and two biological replicates were performed. (C) Immunostaining of PSA-NCAM (green) and SOX1/Nestin (red/green) expression in control and individual shRNA-infected H9 cells after neural differentiation. DAPI indicates nuclei (blue). (Scale bar, 10 μm.)
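The fold differences in A and B follow from the ΔΔCt method named in SI Methods, with TBP as the internal control. A minimal sketch, assuming exact doubling per PCR cycle; the function name, argument names, and Ct values are hypothetical and are not measurements from this study.

```python
def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression of a target gene by the delta-delta-Ct method.

    Each Ct is first normalized to the reference gene (e.g., TBP), then the
    shRNA-infected sample is compared with the control sample. Assumes the
    product doubles with every PCR cycle.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: the target appears two cycles later in the
# knockdown sample than in the control, with identical TBP signal.
knockdown = fold_change_ddct(26.0, 20.0, 24.0, 20.0)
unchanged = fold_change_ddct(24.0, 20.0, 24.0, 20.0)
```

In this example a two-cycle delay corresponds to 25% of control expression, which is why the primers were checked for close to twofold amplification per cycle.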

Table S1. Summary of shRNA high-throughput sequencing results: Total and postfilter reads for each sample using an Illumina GAII sequencer

Sequencing libraries | Total reads | Postfilter reads | % pass filter
T0 | 25,587,060 | 22,806,937 | 89.1
T3 | 39,843,892 | 37,312,288 | 93.6
T5 | 10,164,458 | 9,402,209 | 92.5
PSA-NCAM − | 16,038,252 | 14,930,780 | 93.1
PSA-NCAM + | 25,294,005 | 22,959,897 | 90.8
PSA-NCAM ++ | 27,294,067 | 21,119,917 | 77.4

ShRNA sequencing libraries were prepared from gDNA from each cell group. T0, T3, and T5 indicate various time points following the pooled shRNA library infection and culturing the cells in pluripotent condition. PSA-NCAM − (negative), + (weak positive), and ++ (strong positive) indicate cell populations based on FACS separation at the end of neural differentiation.
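The "% pass filter" column is the ratio of postfilter to total reads. The sketch below recomputes it from the read counts in Table S1 as a consistency check; the variable and function names are hypothetical, and rounding to one decimal place is an assumption inferred from the table.

```python
# (total reads, postfilter reads) per sequencing library, copied from Table S1.
READS = {
    "T0": (25_587_060, 22_806_937),
    "T3": (39_843_892, 37_312_288),
    "T5": (10_164_458, 9_402_209),
    "PSA-NCAM -": (16_038_252, 14_930_780),
    "PSA-NCAM +": (25_294_005, 22_959_897),
    "PSA-NCAM ++": (27_294_067, 21_119_917),
}


def percent_pass_filter(total, postfilter):
    """Percentage of reads that pass the filter, rounded to one decimal."""
    return round(100.0 * postfilter / total, 1)


pass_filter = {
    name: percent_pass_filter(total, postfilter)
    for name, (total, postfilter) in READS.items()
}
```

The recomputed values match the table, with the PSA-NCAM ++ library showing the lowest pass rate.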

Table S2. Summary of shRNA high-throughput sequencing results: Screening statistics and hit selection criteria during neural differentiation from hESCs infected by the pooled shRNA library

Features | Total
Total number of unique hairpins infected | 8,488
Total number of targeted genes (four to five shRNAs/genes on average) | 1,801
Total number of genes targeted by unique hairpins present 48 h postinfection (T0) | 1,689
> Two hairpins targeting the same gene change > 1.7-fold when comparing PSA-NCAM ++ vs. PSA-NCAM − groups and analogous trend is supported by the PSA-NCAM + group (Depleted and Enriched) | 208
Number of target genes whose shRNAs fully disappeared from all groups during differentiation (Absent) | 354

Table S3. Target genes are enriched for components of RNA granules and nervous system-related genes based on literature curation

Gene hit category | Unchanged | Absent | Enriched
Unchanged | – | – | –
Absent | <0.0001 | – | –
Enriched | <0.0001 | 0.146 | –
Depleted | <0.0001 | 0.458 | 0.074

The gene hit categories are defined as described in Table 1. The relative abundance of nervous system-related genes in each category was calculated by comparing the number of such genes defined by literature curation with the number of all genes within the same category. The relative proportion of nervous system-related genes in each category was then compared between the different categories and the statistical significance of differences was calculated as described in Table 1. The P values in boldface indicate significant differences. The results show significant enrichment of nervous system-related gene targets in the Absent, Enriched, and Depleted categories compared with the nervous system-related genes whose shRNA abundance did not change significantly during neural differentiation (Unchanged).

Table S4. Specific depletion of nervous system-related shRNAs during neural differentiation with target genes involved in early neurogenesis and in late-onset neurological disorders

Gene hit | RIGER gene rank | Nervous system-related function | Disorder of the nervous system
ATF4 | 17 | , neurogenesis |
BMI1 | 2 | Neuronal stem cell multipotency and self-renewal |
CITED2 | 42 | Novel neuronal activity-dependent transcription factor | Brain developmental abnormal (exencephaly)
CNOT4 | 38 | Polyubiquitination of JARID1C/SMCX, a histone demethylase mutated in nonsyndromic mental retardation | Nonsyndromic mental retardation
EBF1 | 83 | Ventral telencephalon development, specification of multiple retinal cell types |
GPR63 | 79 | Predominantly brain expressed G protein-coupled PSP24 beta |
HIVEP3 | 8 | Transcriptional modulator of multiple genes | GWAS, confirmed Parkinson disease locus (PARK10)
IKBKB | 27 | Proliferation of TNF-α–induced adult neural stem cells |
LMO3 | 102 | Neuronal specific transcription factor DAT1 | Malignant phenotype of neuroblastoma
MBD1 | 28 | Epigenetic regulation, neural stem/progenitor cell differentiation and neurogenesis | Autism candidate gene network
MC4R | 76 | Melanocortin 4 receptor, synaptic adaptation, central regulation of metabolism | Anhedonia
NEUROD1 | 29 | Neurogenesis, neural differentiation | Huntington disease
NPFFR1 | 122 | Neuropeptide FF receptor 1, hypothalamus |
NTSR1 | 99 | Neurotensin receptor 1, cortical layer neuron specification | Alzheimer's disease
PAX5 | 72 | Mesencephalon and spinal cord development during embryogenesis | Disruptive mutation in autism
PFDN5 | 36 | Prefoldin subunit, chaperon | Huntington disease
PQBP1 | 9 | Cortical neuronal migration; RNA granule component | Syndromic and nonsyndromic mental retardation
RELA | 59 | Microtubule stability enhanced NF-κB activation by deregulated TARDBP | Amyotrophic Lateral Sclerosis
REST | 126 | Neuron restrictive silencer factor, NPC maintenance | Down syndrome neurological phenotype, X-linked mental retardation
SIRT1 | 6 | Proliferation of NPCs, activates molecular chaperones to α-synuclein aggregation-induced stress | Lewy Body dementia, Parkinson disease
SIX4 | 9 | transcription factor, differentiation or maturation of neuronal cells |
TARDBP | 41 | hnRNP family, RNA processing; RNA granule component | Frontotemporal dementia/Amyotrophic Lateral Sclerosis
TCERG1L | 90 | Transcription elongation regulator 1-like | GWAS, late-onset Alzheimer's disease locus
TIAL1 | 93 | RNA binding protein, translational control, splicing; RNA granule component | Spinal Muscular Atrophy
ZNF586 | 5 | Hub protein, differential regulation by FOXP2 in human brain vs. in chimp |
HELT | 78 | Mesencephalic excitatory/inhibitory neuronal fate |
FOXD4 | 86 | Establishment and maintenance of the embryonic neural ectoderm | Mutation with syndromic obsessive-compulsive disorders, suicidality

The shRNA-mediated silencing of the gene hits was specific to the neural differentiation arm of the screen because the same shRNAs did not interfere with the growth of pluripotent hESCs in parallel experiments. Ranking of the hits by RIGER analysis is indicated. The smaller numbers indicate more significant shRNA depletion when PSA-NCAM strong-positive cell fraction is compared with the PSA-NCAM–negative group (P value ranging from P < 0.001 to P < 0.05). A total of 1,801 genes were ranked. The biological relevance of gene hits in the nervous system was defined by literature curation. GWAS, genomewide association studies.

Table S5. RNA granule components revealed by the pooled RNAi screen on neural differentiation of hESCs

Gene symbols | Gene target categories
ANKRD17 | Absent in all PSA-NCAM sorted groups
CIRBP | Enriched in PSA-NCAM strong-positive group
ELAVL1 | No change
ERF | No change
FXR2 | No change
G3BP1 | *
HNRNPA0 | No change
HNRNPL | No change
ILF2 | Absent
KHSRP | No change
PQBP1 | Depleted from PSA-NCAM strong-positive group
PURA | No change
SYNCRIP | *
TAF15 | Enriched
TARDBP | Depleted
TIAL1 | Depleted
FUBP3 | No change
ARNTL | Absent
BRF1 | Enriched
HTT | *
CNOT8 (Pop2) | Absent

Target genes encoding RNA granule components are indicated in the left column. The right column shows the relative representation of their respective shRNAs when the populations of sorted PSA-NCAM–positive and negative cells were compared at the end of neural differentiation. The classification of changes at a target gene level is described in Table S2. An asterisk indicates peak shRNA abundance in the PSA-NCAM weak positive group. Gene targets that were subsequently studied and confirmed by infection of individual shRNAs are in boldface.

Table S6. Nervous system-related genes with essential and specific function in neural lineage commitment of hESCs in vitro

Gene symbol | Function
AFF4 | Transcription elongation factor, single gene mutation in autism
AKAP1 | Excitatory synaptic plasticity
ARNTL | Neural stem cell differentiation, circadian rhythm
ASCL1 | Development of ventral telencephalon, differentiation of olfactory and autonomic neurons
ATOH1 | Development of key neuronal subtypes
CARHSP1 | Ventral telencephalon development
C2orf3 | Dyslexia candidate gene
DMRT3 | Neural tube formation (zebrafish) Cell cycle progression regulation in neuronal stem cells
EGR4 | Synaptic plasticity
EOMES | Neurogenesis, cortical layer specification, mutated in human polygyria
ERCC6 | Transcription coupled DNA repair, neurodegeneration, mutated in Cockayne syndrome
ETV1 | Terminal differentiation of cortical neurons
FEV | Generation of serotonergic neurons and connectivity, role in SIDS
FOXC1 | Cerebellar development, mutated in Dandy-Walker syndrome
FOXN4 | Retinogenesis
FOXP1 | Cortical layer specification, spinal cord development, de novo mutation in autism
GATA3 | Development of GABAergic precursors, de novo deletion in autism
GPR155 | Dysregulation associated with autism
GPR17 | Specific FOXO1 target in hypothalamic neurons, regulation of satiety
GRM1 | Glutamate receptor, metabotropic 1
HESX1 | Development of anterior neuroectoderm
HIRA | Primary candidate gene for microduplication of 22q11.2 with autism
HMG20B | Regulation of neuronal progenitor differentiation
HMX1 | Development of somatosensory neurons in the geniculate ganglion
HOXA1 | Autism candidate gene
HOXA2 | Rostrocaudal organization of hindbrain
HOXC9 | Motor neuron fate in spinal cord
IRX2 | Cerebellum formation
LHX2 | Dorsal telencephalon specification and patterning
LHX3 | Pituitary development and motor neuron specification
LMX1A | Specification mesencephalic DA-ergic neurons
MED17 | Mutated in human microcephaly associated with cerebellar atrophy
MYST4 | Brain development, adult NSC self-renewal
NEUROG1 | Neural development, dorsal telencephalon
NEUROG3 | Neurogenesis, dendritogenesis and synaptogenesis
NFIX | Telencephalic neuronal progenitor differentiation
NHLH1 | Development of dorsal telencephalon
NKX6-1 | Telencephalic NPC generation
NR4A1 | Immediate early gene, multiple roles in the nervous system
NOTCH3 | Growth and differentiation of neural stem cells, mutated in CADASIL, de novo deletion in autism
ONECUT1 | Neurogenesis, spinal cord
POU3F4 | Striatal DA-ergic neuronal precursor differentiation
POU4F3 | Retinal ganglion cell differentiation
PTH2R | Thalamus, brainstem, nociception modulation
RAX2 | Retina and anterior neural fold development
RGR | Retinal pigment epithelium and Mueller cells with neural tube origin
RUNX3 | Sensory neuron development
SMAD6 | Neural fate regulation
TBX1 | Neural fate determination, de novo mutation in autism
TCERG1 | Interacts with huntingtin, putative modifier of age of onset in Huntington disease
TCF4 | Neural stem cell development and neuronal connectivity, haploinsufficiency; mutated in nonsyndromic autism and in Pitt-Hopkins mental retardation syndrome

All silencing shRNAs targeting the listed genes were fully eliminated in each PSA-NCAM phenotypic group at the end of neural differentiation (“Absent” category) but did not interfere with the self-renewal and growth of undifferentiated hESCs in the parallel arm of the screen. The biological function of the gene hits in the nervous system was defined by literature curation.

Table S7. Genes implicated in autism and nonsyndromic mental retardation/intellectual disability whose shRNA abundances significantly changed during neural differentiation in vitro

Targeted gene  Genetic alteration in affected individuals

ShRNAs absent in all PSA-NCAM sorted cells at the end of neural differentiation
AFF4           Rare single gene mutation in autism
ASCL1          Probable disease causing mutation in AR ID, Timothy syndrome, ASD
BRD1           Susceptibility to autism, schizophrenia and bipolar disorder
BTAF1          Rare single gene mutation in autism
EGR2           Genetic association with autism
EN2            Genetic association with autism
FOXP1          Disruptive de novo mutation in autism
GATA3          De novo deletion in autism
GPR112         Loss-of-function homozygous mutation in ASD cases only
GPR155         Dysregulation associated with autism
GTF2IRD1       De novo CNV deletion with autism
HOXA1          Candidate gene for autism
NOTCH3         De novo mutation in autism
PTH2R          Loss-of-function homozygous mutation in ASD cases only
RUVBL1         De novo mutation in autism
SLC6A7         Differential expression in autistic brain
TBX1           Candidate gene for autism
TBX6           De novo CNV duplication in autism
TCF4           Chr translocation disruptive in autism, syndromic epileptic encephalopathy, mental retardation
HOXB1          Genetic association with autism
HTR1B          Genetic association with autism
MBD4           Single gene mutation in autism, genetic association
CNOT3          Disruptive de novo mutation in autism
WHSC1          Mutated in ASD, Sotos syndrome
HIRA           Microduplication of 22q11.2 with autism
SLC36A1        Probable disease causing frameshift mutation in ID
KDM5A          Probable disease causing mutation in AR ID
ZBTB40         Probable disease causing mutation in AR ID
ZNF526         Probable disease causing mutation in AR ID

ShRNAs gradually depleted during neural differentiation
CNOT4          De novo mutation with autism risk
PQBP1          Nonsyndromic mental retardation
MBD1           Candidate gene for autism
MBD3           Single gene mutation in autism, genetic association
PAX5           Disruptive de novo mutation in autism

ShRNAs enriched during neural differentiation
HMGN1          Genetic association with autism
KAT2B          Disruptive CNV in autism
DOPEY2         Candidate gene in Down syndrome neurological phenotype, overexpressed in patients’ brain
MEF2D          Genetic association with autism
NDN            Recurrent CNV in autism
PKNOX1         De novo CNV duplication in autism
ZNF420         De novo mutation in autism
CHD7           De novo mutation in autism
CHD8           De novo mutation, autism susceptibility gene
ZNF673         Duplication in nonsyndromic mental retardation
FMNL2          RBFOX1(A2BP1)-dependent differential splicing in autistic brain
ADNP           De novo mutation in autism
FOXP2          Rare single gene mutation in autism
SYNCRIP        Frameshift mutation in the setting of de novo duplication, nonsyndromic ID

The ASD and nonsyndromic MR/ID-related gene set used for cross-reference was compiled from the literature using recent, large-scale, reproduced genomic datasets from affected and control subjects (SI Methods). The shRNA-mediated silencing of the gene targets was specific to the neural differentiation arm of the screen: in >80% of cases, the same shRNAs did not interfere with the growth of pluripotent hESCs in parallel experiments. AR, autosomal recessive.
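
The cross-referencing described in this footnote amounts to intersecting the screen hits in each category with a literature-derived gene set. The following is a minimal sketch of that idea, not the authors' pipeline: the function name `cross_reference`, the category keys, and the choice of genes are assumptions made for illustration. The gene symbols are a small subset taken from Tables S6 and S7.

```python
# Illustrative only: a small subset of gene symbols from Tables S6 and S7.
# The category keys and the helper name are hypothetical, not from the paper.
hits_by_category = {
    "absent": {"AFF4", "ASCL1", "FOXP1", "LMX1A"},  # LMX1A is listed in Table S6 only
    "depleted": {"CNOT4", "PQBP1"},
    "enriched": {"CHD8", "FOXP2"},
}

# Stand-in for the literature-compiled ASD and MR/ID gene set (subset of Table S7).
reference_genes = {"AFF4", "ASCL1", "FOXP1", "CNOT4", "PQBP1", "CHD7", "CHD8", "FOXP2"}


def cross_reference(hits, reference):
    """Return, per category, the sorted hits that also occur in the reference set."""
    return {category: sorted(genes & reference) for category, genes in hits.items()}


overlap = cross_reference(hits_by_category, reference_genes)
for category, genes in overlap.items():
    print(category, ", ".join(genes))
```

Keeping the categories as separate sets makes it straightforward to report the overlap per group, which mirrors the three-part layout of Table S7.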

Table S8. Primer sequences for real-time qPCR of validated target genes

Gene symbol  Gene name
             Primer sequences

TBP          Homo sapiens TATA box binding protein
             F:CCACCAACAATTTAGTAGTTATGAGCC
             R:CTGCTCTGACTTTAGCACCTGT
TARDBP       TAR DNA binding protein
             F:CTGCTTCGGTGTCCCTGT
             R:ATGGGCTCATCGTTCTCATC
TIAL1        TIA1 cytotoxic granule-associated RNA binding protein-like 1
             F:GGCAACCATGGAATCAACA
             R:AGCACCAAATCCACCCATC
PQBP1        Polyglutamine binding protein 1
             F:CAGTAAGCCGAAAGGATGAAG
             R:GCTTGGGGAGTCCTGTTGA
CNOT4        CCR4-NOT transcription complex, subunit 4
             F:TCTCTCAGTATAGGGAACGGTGA
             R:GGTGGTGGTGAAGGCGTAT
LMO3         LIM domain only 3 (rhombotin-like 2)
             F:GCTCATCCCTGCCTTTGA
             R:AGGAAAAATTTGTCTCCAACACA
BMI1         Polycomb ring finger oncogene
             F:GATGCCACAACCATAATAGAA
             R:CTCCAGGTAACGAACAATACA
NCAM1        Neural cell adhesion molecule 1
             F:GTGGGCATCCTCATCGTCAT
             R:GTGGTCTCGTTGGGCTCTGT
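
When primer sequences are retyped from a table like this one, a quick sanity check of length and GC fraction can catch transcription errors. The sketch below assumes nothing beyond the sequences in Table S8. The helper names `is_valid_dna` and `gc_fraction` are illustrative, and the script does not reproduce any primer design criteria used by the authors, which are not stated here.

```python
# Primer pairs copied from Table S8 as (forward, reverse), written 5' to 3'.
PRIMERS = {
    "TBP": ("CCACCAACAATTTAGTAGTTATGAGCC", "CTGCTCTGACTTTAGCACCTGT"),
    "TARDBP": ("CTGCTTCGGTGTCCCTGT", "ATGGGCTCATCGTTCTCATC"),
    "TIAL1": ("GGCAACCATGGAATCAACA", "AGCACCAAATCCACCCATC"),
    "PQBP1": ("CAGTAAGCCGAAAGGATGAAG", "GCTTGGGGAGTCCTGTTGA"),
    "CNOT4": ("TCTCTCAGTATAGGGAACGGTGA", "GGTGGTGGTGAAGGCGTAT"),
    "LMO3": ("GCTCATCCCTGCCTTTGA", "AGGAAAAATTTGTCTCCAACACA"),
    "BMI1": ("GATGCCACAACCATAATAGAA", "CTCCAGGTAACGAACAATACA"),
    "NCAM1": ("GTGGGCATCCTCATCGTCAT", "GTGGTCTCGTTGGGCTCTGT"),
}


def is_valid_dna(seq):
    """True if the sequence is non-empty and contains only A, C, G, T."""
    return bool(seq) and set(seq) <= set("ACGT")


def gc_fraction(seq):
    """Fraction of bases in the sequence that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


for gene, (forward, reverse) in PRIMERS.items():
    for label, seq in (("F", forward), ("R", reverse)):
        if not is_valid_dna(seq):
            raise ValueError(f"{gene} {label} primer contains non-ACGT characters")
        print(f"{gene:7s} {label}  length={len(seq):2d}  GC={gc_fraction(seq):.2f}")
```

The check is deliberately limited to character validity, length, and GC fraction; melting temperature and specificity would need a dedicated primer analysis tool.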
