Strong synaptic transmission impact by copy number variations in schizophrenia

Joseph T. Glessnera, Muredach P. Reillyb, Cecilia E. Kima, Nagahide Takahashic, Anthony Albanoa, Cuiping Houa, Jonathan P. Bradfielda, Haitao Zhanga, Patrick M. A. Sleimana, James H. Florya, Marcin Imielinskia, Edward C. Frackeltona, Rosetta Chiavaccia, Kelly A. Thomasa, Maria Garrisa, Frederick G. Otienoa, Michael Davidsond, Mark Weiserd, Abraham Reichenberge, Kenneth L. Davisc,JosephI.Friedmanc, Thomas P. Cappolab, Kenneth B. Marguliesb, Daniel J. Raderb, Struan F. A. Granta,f,g, Joseph D. Buxbaumc, Raquel E. Gurh, and Hakon Hakonarsona,f,g,1

aCenter for Applied Genomics, The Children’s Hospital of Philadelphia, Philadelphia, PA 19104; bPenn Cardiovascular Institute, University of Pennsylvania School of Medicine, Philadelphia, PA 19104; cConte Center for the Neuroscience of Mental Disorders and Department of Psychiatry, Mount Sinai School of Medicine, New York, NY, 10029; dSheba Medical Center, Tel Hashomer, 52621, Israel; eMount Sinai and Institute of Psychiatry, King’s College, London, SE5 8AF, United Kingdom; fDivision of Human Genetics, The Children’s Hospital of Philadelphia, Philadelphia, PA 19104; gDepartment of Pediatrics, University of Pennsylvania School of Medicine, Philadelphia, PA 19104; and hSchizophrenia Center, Neuropsychiatry Division, Department of Psychiatry, University of Pennsylvania, Philadelphia, PA 19104

Edited by James R. Lupski, Baylor College of Medicine, Houston, TX, and accepted by the Editorial Board April 13, 2010 (received for review January 7, 2010) Schizophrenia is a psychiatric disorder with onset in late adoles- contribute to the complex etiology underlying various psychiatric cence and unclear etiology characterized by both positive and ne- and neurodevelopmental disorders (13, 14). Whereas rare recurrent gative symptoms, as well as cognitive deficits. To identify copy CNVs have been reported in patients with schizophrenia, these number variations (CNVs) that increase the risk of schizophrenia, explain only a small fraction of the genetic risk of this common we performed a whole-genome CNV analysis on a cohort of 977 complex disease (15, 16). Accordingly, we have applied approaches schizophrenia cases and 2,000 healthy adults of European ancestry with the objective of discovering variations and biological pathways fi who were genotyped with 1.7 million probes. Positive ndings contributing to the pathobiology of schizophrenia. were evaluated in an independent cohort of 758 schizophrenia

Our study cohort included 1,206 schizophrenia cases and 1,378 GENETICS cases and 1,485 controls. The Ontology synaptic transmission neurologically normal controls from the Genetic Association In- P × family of was notably enriched for CNVs in the cases ( = 1.5 formation Network (GAIN), genotyped on the Affymetrix 6.0 −7 CACNA1B DOC2A 10 ). Among these, and , both calcium-signaling array (17). We downloaded the data files from the database of genes responsible for neuronal excitation, were deleted in 16 cases Genotype and Phenotype (dbGaP) (ncbi.nlm.nih.gov/gap; study and duplicated in 10 cases, respectively. In addition, RET and RIT2, phs000021.v2.p1) and analyzed them for CNV associations. This both ras-related genes important for neural crest development, were significantly affected by CNVs. RET deletion was exclusive GAIN project, also known as Molecular Genetics of Schizophre- to seven cases, and RIT2 deletions were overrepresented common nia (MGS), previously reported linkage to 8p23.3-p21.2 and variant CNVs in the schizophrenia cases. Our results suggest that 11p13.1-q14.1 (18) and association of FGFR2 and ZNF804A in novel variations involving the processes of synaptic transmission a genome-wide association study (GWAS) (19, 20), but failed to contribute to the genetic susceptibility of schizophrenia. associate previously reported candidate genes (21) and found novel associations of common genotype variants on 6p22.1 (22). genetics | genomics | neurobiology In addition, 351 schizophrenia cases and 2,107 control subjects from the University of Pennsylvania (UPenn) were included, along chizophrenia is a devastating mental disorder characterized with 178 schizophrenia cases from Mount Sinai School of Medi- Sby reality distortion. Common features are positive symptoms cine and Sheba Medical Center. Both cohorts were genotyped on of hallucinations, delusions, disorganized speech, and abnormal the Affymetrix 6.0 array at The Children’s Hospital of Philadel- thought process; negative symptoms of social deficit, lack of mo- phia (CHOP). tivation, anhedonia, and impaired emotion processing; and cog- Control subjects from UPenn were originally recruited in re- nitive deficits with occupational dysfunction. Onset of symptoms lation with cross-sectional case-control studies on HDL choles- typically occurs in late adolescence or early adulthood, with ∼1.5% terol, coronary angiography, and heart transplantation outcomes of the population affected (1). at UPenn. The average age of the control cohort was 62 years, Previous studies have associated various copy number variations and no subjects had any major psychoses or other psychological (CNVs) with schizophrenia including deletions of 22q11.2 (2), symptoms. NRXN1 (3), APBA2 (3), and CNTNAP2 (4). These CNVs are rare, Samples from GAIN and UPenn were subsequently randomly however, and they account for a relatively small proportion of the divided into a discovery cohort of 977 schizophrenia cases and 2,000 overall genetic risk in schizophrenia. Large, rare CNVs affecting controls and an independent replication cohort of 758 schizophre- many different genes enriched in neurodevelopmental pathways nia cases and 1,485 controls, including samples from Mount Sinai – have been reported as well (5 7), with novel deletions and dupli- School of Medicine/Sheba Medical Center. Bias of contribution cations of genes observed in 15% of cases versus 5% of controls in to specific loci among these sample sources was monitored. one study (P = 0.0008) (5). A study of CNVs in Chinese schizo- phrenia patients detected no significant difference in rare CNVs between cases and controls, however (8). Another study of 1,013 fi Author contributions: J.D.B., R.E.G., and H.H. designed research; M.P.R., C.E.K., N.T., A.A., cases and 1,084 controls of European ancestry also failed to nd C.H., J.P.B., H.Z., P.M.A.S., J.H.F., M.I., E.C.F., R.C., K.A.T., M.G., F.G.O., M.D., M.W., A.R., more rare CNVs >100 kb in cases or enrichment for neuro- K.L.D., J.I.F., T.P.C., K.B.M., D.J.R., S.F.A.G., J.D.B., and R.E.G. performed research; J.T.G. developmental pathways (9). Specific loci exhibiting runs of ho- analyzed data; and J.T.G. and H.H. wrote the paper. mozygosity (ROHs) also have been associated with schizophrenia The authors declare no conflict of interest. −4 (10), and association of de novo CNVs (P =7.8× 10 ) was recently This article is a PNAS Direct Submission. J.L. is a guest editor invited by the Editorial Board. reported in sporadic schizophrenia cases compared with controls 1To whom correspondence should be addressed. E-mail: [email protected]. fi (11). Comparison of genomic ndings in schizophrenia and autism This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. has suggested a diametric etiology (12). CNVs have been shown to 1073/pnas.1000274107/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1000274107 PNAS Early Edition | 1of6 Downloaded by guest on September 29, 2021 All subjects were diagnosed with schizophrenia based on cri- might overlook bias introduced by certain samples with large teria in the Diagnostic and Statistical Manual of Mental Disorders numbers of CNVs called (typically as a result of sample quality), IV (DSM-IV) (23). Subjects had at least 6 months’ duration of sample age, cell line artifacts, probe resolution, GC content, lack of the “A” criteria for schizophrenia, were at least 18 years old at genotype information, and subphenotype characterization. the time of the interview, and were known by their informant for When using a threshold for CNV calls of 100 kb and larger, we at least 2 years. Additional demographic data for the study replicated the 22q11.2 deletions robustly and detected CNV subjects are presented in Tables S1 and S2. associations with GRID1, CNTNAP2, DISC1, and NRXN1,as Various array technologies, including aCGH, Affymetrix reported previously (2–4). However, further review revealed GeneChip and Illumina BeadChip, have been used to identify multiple smaller CNVs (<100 kb) in these regions in both the CNVs in healthy subjects. Previous studies have revealed signifi- cases and controls, suggesting that large CNVs in these regions cant common variations in the general healthy population (24). and additional environmental and/or genetic factors might be Various algorithms are also being used to call CNVs, most of required for expression of the schizophrenia phenotype. Dele- which use the hidden Markov model (HMM) as implemented in tions of 1q21.1 and 15q13.3 (6) were detected, but were not PennCNV (25). Clustering of all Affymetrix data in one run with significantly associated with schizophrenia (Figs. S1 and S2). Affymetrix Power Tools (APT), which implements BirdSeed, is We next carried out a genome-wide single SNP association essential to minimize the stratification resulting from clustering analysis. We found no loci that were significant genome-wide; bias. Indeed, the genotypes provided by dbGaP in matrix format however, we detected nominally significant associations to several had significant clustering bias among three apparent runs of APT genes that are essential for brain development and function and of on sample subsets based on Eigenstrat analysis. We ran APT with potential relevance to schizophrenia, including, but not limited to, − − all Affymetrix 6.0 samples in a single run, yielding a typical result ASTN2, CNTN5, and GRIK2 (P = 2.29 × 10 6, 6.63 × 10 6, and − showing few African and Asian admixed samples without the 2.53 × 10 5, respectively; Table 1). As demonstrated in reports three modes from clustering bias. Any possible batch effect from associating genotypes within the MHC locus with schizophrenia sample collection or processing was similarly excluded, and no (22), such nominal significance might exist in an analysis of large other processing variation was detected. Significant differential cohorts and result in genome-wide significance in a meta-analysis. CNV detection bias introduced by array batch effects is unlikely, To identify CNV loci potentially contributing to schizophrenia, given that large case and control sets were genotyped at the two we applied a segment-based scoring approach that scans the ge- sites (Broad Institute and CHOP). nome for consecutive probes with more frequent CN changes in cases compared with controls (Fig. S3). The genomic span for these Results consecutive probes forms common CNV regions, or CNVRs. In the We analyzed a total of 1,735 schizophrenia samples, genotyped discovery cohort of 977 schizophrenia cases and 2,000 healthy on the Affymetrix 6.0 array, that met strictly established data subjects, we identified CNVRs that were either unique to or had quality thresholds for CNV, from which 977 cases were used in significantly higher frequency in cases versus controls and were the discovery phase and 758 cases were used in the replication detected in the independent replication cohort as well (Table 2). phase (Methods). An average of 45.4 CNV calls were made for We also uncovered several rare CNV loci that were overrepre- each individual using the PennCNV software. We called four sented in the discovery cohort but not observed in the replication different copy number (CN) states, including 9,059 homozygous cohort, likely because of their low frequencies (Table S4). deletions (CN 0), 21,526 hemizygous deletions (CN 1), 9,750 To assess the reliability of our CNV detection method, we ex- duplications (CN 3), and 4,024 biallelic duplications/monoallelic perimentally validated all of the significant CNVRs using two triplications (CN 4). The CNV calls ranged from 3 to 3,253 additional methods, Illumina Human Hap550 Beadchip and probes, with an average of 48 probes per CNV call, and their quantitative PCR (qPCR), which is widely used by the research sizes ranged from 6 bp to 8.1 Mb, with an average size of 88.4 kb. community for independent validation of CNVs (Table S5). We The CNV calls from the schizophrenia cases were compared with also examined the CNV frequency of 4,000 healthy control sub- those from 3,485 healthy subjects. The control individuals had jects genotyped on the Illumina 550 array recruited by the Center comparable CNV frequency to that of the cases. An average of for Applied Genomics at CHOP and established CNV frequency 45.1 CNV calls were made for each control individual using the in those samples comparable to that observed in controls typed PennCNV software. Among these, we identified 29,257 homozy- on Affymetrix 6.0. We validated all significant schizophrenia- gous deletions (CN 0), 70,052 hemizygous deletions (CN 1), 32,906 associated CNVs detected by the Illumina 550 chip with qPCR duplications (CN 3), and 14,217 biallelic duplications/monoallelic for two-tiered validation. Thus, we applied experimental valida- triplications (CN 4). The CNV calls spanned from 3 to 9,258 tion on all of the CNVRs to ensure positive confirmation of all probes, with an average of 48.6 probes per CNV call, and their final results reported. Although the false-negative rate is un- sizes ranged from 4 bp to 12.7 Mb, with an average size of 87.9 kb. known, based on conservative quality thresholds, it would not be In an attempt to replicate and better classify the reported expected to differ significantly between case and control cohorts. abundance of rare CNVs in schizophrenia cases, we determined To replicate the significant findings, we examined a replication CNV case and control frequencies, applying different CNV asso- cohort of 758 schizophrenia cases and 1,485 controls. Of the 25 ciation conditions: 100+ kb CNV size, 100+ kb CNV size, and not significant loci in the discovery cohort, 8 were enriched in the present in the Database of Genomic Variants (DGV); 10+ probe replication cohort as well, reaching nominal significance (Table 2). CNV size, 10+ probe CNV size, and not present in DGV; and Among these eight loci, five were very rare in controls (<0.25%), samples with multiple novel genes impacted by CNVs. The 100-kb whereas the other three presented common CNVs that were sig- CNV size inclusion threshold excludes many CNVs that are in- nificantly overrepresented in the cases. The resulting combined − − formative and thereby affects many of the loci presented as novel P values ranged from 2.87 × 10 6 to 5.25 × 10 2 for all CNVs to cases. For example, using the 100-kb threshold would have ex- in Table 2, four of which survived correction for 21 tests for de- cluded 77% of the CNV calls in our discovery cohort. In general, letion CNVRs and 5 tests for duplication CNVRs. Notably, two CNVs called with 10 probes demonstrate a very low false-positive genes (CACNA1B and DOC2A) belong to the calcium-signaling rate based on experimental validation and results in the exclusion family and the other two (RET and RIT2) belong to the Ras- of only 6% of our called CNVs. Under all conditions, we were signaling gene family, both of which are involved in neuronal de- unable to replicate the previously reported significant over- velopment and signaling. representation of rare CNVs affecting many genes in schizophre- Although some genes did not replicate in our independent set nia versus controls (Table S3)(5–7). But such broad conclusions of cases and controls of relatively modest size, these genes play

2of6 | www.pnas.org/cgi/doi/10.1073/pnas.1000274107 Glessner et al. Downloaded by guest on September 29, 2021 Table 1. GWA of SNP genotypes from 1,067 schizophrenia cases and 1,304 controls UNADJ SNP P value Chr Position Gene A1 F_A F_U Distance Count SNPs

− rs4697472 1.35 × 10 6 4 24307401 SOD3 1 0.4675 0.3973 85,734 10 rs1587434 1.73 × 10−6 6 66672076 EYS 2 0.0519 0.02527 191,994 4 rs11789407 2.29 × 10−6 9 120399367 ASTN2 1 0.5203 0.4512 595,986 8 rs1555543 4.46 × 10−6 1 96717385 PTBLP,PTBP2 1 0.4545 0.3884 298,064 4 rs35648 5.95 × 10−6 10 80171865 AF086162 1 0.1242 0.1714 61,304 6 rs2155907 6.63 × 10−6 11 97599883 CNTN5 2 0.4035 0.3393 778,692 5 rs2271293 9.96 × 10−6 16 66459571 NUTF2 1 0.1425 0.1006 0 2 − rs4981929 9.96 × 10 6 14 31442403 NUBPL 1 0.527 0.462 2358 11 − rs11713590 1.12 × 10 5 3 5706142 EDEM1 1 0.4225 0.4865 459,455 10 − rs12140791 1.85 × 10 5 1 160357908 NOS1AP 2 0.06232 0.03569 0 2 − rs12538910 1.92 × 10 5 7 57418107 DQ578920 1 0.4243 0.3633 50,874 5 − rs10499040 2.53 × 10 5 6 104889038 GRIK2 2 0.1345 0.09555 2,264,387 3 − rs1357338 1.19 × 10 4 1 174197509 RFWD2 1 0.0188 0.00652 0 2 − rs4509495 1.33 × 10 4 X 42018121 CASK 2 0.1495 0.2015 196,216 4 − rs4813376 1.92 × 10 4 20 19799455 RIN2 2 0.1856 0.1453 18,744 2 − rs6560936 4.43 × 10 4 13 113964074 RASA3 2 0.5009 0.4497 40,707 4

The most significant SNP is reported with neighboring SNPs within 10 kb and significance ranging within a power of 10 noted in the “Count SNPs” column. F_A indicates allele frequency affected; F_U, allele frequency unaffected. Genes associated with brain development and function are in bold type.

supporting functional roles in schizophrenia and might replicate neurons and was found to be associated with bipolar disorder and with further studies of larger sample sizes. Additional associated schizophrenia via GWAS by the Wellcome Trust Case Control

Ras-related cell cycle regulation family genes include PTPRG, Consortium (30). A gene with , CACNA1C GENETICS RAB23, TM2D3, SHC2, and RAPGEF2. In addition, PTBLP, has been robustly associated with bipolar disorder based on RIN2, and RASA3 are Ras genes supported by our genotype genotypes of 4,387 cases and 6,209 controls (31). One deletion GWA presented in Table 1. Additional associated calcium-sig- and four duplications were found in cases, whereas one control naling family genes include CAMK2D and KCNMB4. Also as- duplication was found over the span of CACNA1C. RET is a re- sociated are PARK2, RFWD2, and PTPRB, which we previously ceptor tyrosine kinase, a cell surface molecule that transduces associated with autism (13), the latter interacting with the con- signals for cell growth and differentiation, which plays a crucial tactin gene family. Because these nominally significant loci might role in neural crest development (32). RET loss of function is seem unimpressive individually based on their respective P val- associated with Hirschsprung disease, whereas gain of function is ues, we sought to identify the perturbed gene pathways based on associated with cancer development. WDR1 is involved in actin CNVs of different loci and to determine whether genes in any formation and sensory perception of sound. Studies using shotgun particular signaling or biological pathway were enriched for mass spectrometry found that WDR1 was differentially expressed CNVs. Using this approach, we nominally associated an addi- in the dorsolateral prefrontal cortex of schizophrenia patients tional five Ras neural crest development genes and two calcium (33). The linkage of bipolar disorder to 4p and regulatory signaling genes, for a total of seven Ras genes and haplotype analysis have implicated the WDR1 locus (34). RIT2 four Ras-linked calcium-dependent signaling genes impacted is a Ras-like expressed in neurons. PIK3C3 has been with CNVs associated with schizophrenia. Taken together, the shown to harbor a promoter mutation that increases the risk of (GO) class synaptic transmission genes (CAC- schizophrenia and bipolar disorder (35). We previously reported NA1B, PARK2, KCNMB4, GJD2, DOC2A, COMT, RIT2, and SUMF1 (UNQ3037) deletion in 11 unrelated cases in association − ATXN1) was significantly enriched in the cases (P = 1.5 × 10 7). with autism (13). SUMF1 catalyzes the hydrolysis of sulfate esters, such as glycosaminoglycans, sulfolipids, and steroid sulfates. Discussion Although Ras has been the focus of numerous cancer studies The genes impacted by or proximal to significant CNVs encode as a pivotal tumor suppressor, less emphasis has been given to its with intriguing function. PDPR (pyruvate dehydrogenase native biological roles in neuronal survival, differentiation, and phosphatase regulatory gene) is involved in glycine catabolism, plasticity. Calcium-mediated pathways of Ras activation might be and the International Schizophrenia Consortium (ISC) data show a critical mechanism to couple rapid and transient neuronal six novel deletions in cases and one such deletion in controls. As electrical activity with long-term changes in nervous system de- shown in Fig. 1, this locus replicated in 30 independent cases and velopment and function (36–40). The small GTPase protein Ras directly affected PDPR, using the UCSC Genome Browser (26) couples calcium influx to many forms of synaptic plasticity, in- with Build 36. Pyruvate dehydrogenase is of fundamental im- cluding rapid synaptic potentiation and new synapse formation. portance to glycolysis, and the brain requires half of the total body Consistent with many essential roles of Ras signaling in neuronal glucose utilization (27). DOC2A is expressed mainly in brain and plasticity, mutations in the Ras-signaling pathway are associated is involved in Ca(2+)-dependent neurotransmitter release. This with other diseases causing cognitive impairments and learning large recurrent duplication (and deletion) of 16p11.2 was ob- deficits, such as autism, X-linked mental retardation, and neu- served in autism patients as well (13, 28, 29). Within 22q11.21, rofibromatosis 1 (41–43). Indeed, we have identified rare, highly COMT catalyzes the transfer of a methyl group from S-adeno- penetrant CNVs in ubiquitin genes and common CNVs that were sylmethionine to catecholamines, including the neurotransmitters overrepresented in neuronal development in autism (13). Fur- dopamine, epinephrine, and norepinephrine. The 22q11.21 de- thermore, based on genotype association, a common variant on letion locus was previously reported in 11 cases and no controls by 5p14.1 between CDH10 and CDH9 encoding neuronal cell– the ISC (6), an association to schizophrenia that has been pre- adhesion molecules also has been associated with autism (44). viously reported and well supported (2). CACNA1B is a N-type We also observed enrichment of several mitochondria-linked calcium channel that controls neurotransmitter release from genes affected by CNVs in the schizophrenia cases. Parkin is es-

Glessner et al. PNAS Early Edition | 3of6 Downloaded by guest on September 29, 2021 Downloaded by guest on September 29, 2021 4of6 | www.pnas.org/cgi/doi/10.1073/pnas.1000274107

Table 2. CNVRs statistically overrepresented in schizophrenia cases and replicated in an independent case-control cohort Cases Controls Discovery Cases Controls Combined Distance from Replication CNVR Probes discovery discovery P value replication replication P value OR Gene gene Type ISC Canary − − chr16:68743639– 9 19 10 1.17 × 10 3 16 9 2.87 × 10 6 3.756 PDPR 0 Del 6:1 ISC; N 68770545 P = 0.13 − − chr16:29425212– 217 5 0 3.78 × 10 3 815.69 × 10 6 26.30 52 genes; QPRT, 0 Dup 6:3 ISC; Y 30134444 DOC2A, TBX6 P = 0.51 − − chr22:17404806– 1,529 8 0 1.32 × 10 4 201.62 × 10 5 NA 75 genes; COMT 0 Del 11:0 ISC; Y 19941349 P = 0.001 − − chr9:140145139– 7 12 4 6.54 × 10 4 448.68 × 10 4 4.045 CACNA1B 8.69 kb Del — Y↑ 140152969 − − chr10:42932615– 17 5 0 3.78 × 10 3 2 2 8.09 × 10 3 7.055 RET 0 Del — N 42934354 − − chr4:9881886– 11 6 2 1.80 × 10 2 5 6 2.81 × 10 2 2.773 WDR1 154 kb Del — Y 9884092 − − chr18:38310567– 25 115 163 1.98 × 10 3 67 137 2.91 × 10 2 1.244 RIT2, PIK3C3 265 kb, Del — Y 38311765 395 kb − − chr3:4063809– 30 14 13 4.01 × 10 2 7 10 5.25 × 10 2 1.844 SUMF1 0 Del — N 4074877

Significant CNVRs based on a combined discovery and replication cohort of 1,735 schizophrenia cases and 3,485 healthy controls of European ancestry. Replication ISC: Samples from different sample sources must have reasonable contributing frequency. Canary: A CNV calling algorithm run on the sample set in addition to PennCNV-Affy to establish independent calling positive replication (Y) or lack of replication (N). ↑ indicates more samples with Canary calls. Del, deletion. Dup, duplication. CNVRs that survive multiple testing with Bonferroni adjustment in the discovery phase (P < 0.05 after correction for 20 tests in case of deletion and 5 tests in case of duplications), survived replication, and experimental validation are listed in bold. The CNVR is the CNV region shared between cases. Probes give the number of SNP and CN probes present on the Affymetrix 6.0 array in the respective CNVR from which signal was indicative of a CNV. The discovery and combined P values are based on a Fisher’s exact test of the respective sample cohorts. The count of samples in each subgroup of cases and controls in discovery and replication is provided. The nearest gene and proximal distance are provided for potential functional impact and a means to compare other sample sets that might find CNVs in the region. The “Replication ISC” column shows the frequency of cases and controls in the International Schizophrenia Consortium CNV calls of 3,391 cases and 3,181 controls. The “Canary” column shows if the analysis of the log2 ratio of intensity through the Canary CNV calling algorithm replicates the CNV call from PennCNV-Affy. Key functional genes are provided for brevity. The gene count for the two largest CNVs includes hypothetical genes. lsnre al. et Glessner Methods Affymetrix 6.0 Assay for CNV Discovery. High-throughput, genome-wide SNP and CN genotyping was performed, using the Affymetrix 6.0 technology, at the Center for Applied Genomics at CHOP. dbGaP samples were genotyped on the same platform at the Broad Institute. The genotype data content, together with the intensity data provided by the SNP probes on the geno- typing array, provide high confidence for CNV calls. Importantly, the si- multaneous analysis of intensity data and genotype data in the same ex- perimental setting establishes a highly accurate definition for normal diploid states and any deviation thereof. To call CNVs, we used the PennCNV-Affy algorithm, which combines multiple sources of information, including log R range (LRR) and B allele frequency (BAF) at each SNP marker, an HMM specifically trained on Affymetrix 6.0 data, and SNP spacing and population frequency of the B allele to generate CNV calls.

CNV Quality Control. We calculated quality control measures on our Affy- metrix 6.0 and HumanHap550 GWAS data based on statistical distributions to exclude poor-quality DNA samples and false-positive CNVs. The first threshold is the percentage of attempted SNPs that were successfully genotyped. Only samples with a call rate >96% were included. The genome-wide intensity signal must have as little noise as possible. Only samples with an SD of normalized intensity (LRR) <0.35 were included. All samples must have Caucasian ethnicity based on hierarchical clustering of AIMs genotypes and all other samples were excluded. Wave artifacts roughly correlating with GC content resulting from hybridization bias of low full-length DNA quantity are known to interfere with accurate inference of CNVs (55); thus, only samples in which the GC wave factor of LRR was between −0.02 and 0.02 were accepted. If the count of CNV calls made by PennCNV exceeds 80, then the DNA quality is usually poor; thus, only samples with a CNV call count <80 were included. Any duplicate samples (such as monozygotic twins) had one GENETICS sample excluded. Fig. 1. 16q22.1 deletions found to be overrepresented in 35 independent schizophrenia cases. Affymetrix SNP and CN probe coverages are designated Statistical Analysis of CNVs. We evaluated CNV frequency between cases and by blue lines in two separate tracks. Schizophrenia cases with deletions and controls at each SNP using Fisher’s exact test. We only considered loci that their CNV call boundaries shown by red lines. The schizophrenia cases of our were significant between cases and controls (P < 0.05), where cases in the 1,735 cases population run on Affymetrix 6.0 are shown in comparison with MGS/Gur discovery cohort had the same variation replicated in MGS/Gur or our control cohort of 3,485 showing CNV overrepresentation in cases. ISC were not observed in any of the control subjects, and were validated with an case and control CNV profiles also show overrepresentation. It is also note- independent method. We report statistical local minimums to narrow the worthy that duplications are conversely underrepresented in the schizo- association in reference to a region of nominal significance including SNPs phrenia cases (5) versus controls (27). residing within 1 Mb of one another. Resulting significant CNVRs were ex- cluded if they met any of the following criteria: (i) residing on telomere or “ ” sential for maintaining mitochondrial genome integrity and mi- centromere proximal cytobands, (ii) arising in a peninsula of common CNV arising from variation in boundary truncation of CNV calling, (iii) genomic tochondrial autophagy (45, 46). Inhibition of COMT results in regions with extremes in GC content that produce hybridization bias, or (iv) mitochondrial uncoupling (47). EDEM1 protects the cell from samples contributing to multiple CNVRs. We used DAVID (56) to assess the mitochondrial damage from paraquat (48). Pyruvate dehydroge- significance of functional annotation clustering of independently associated nase is an essential mitochondrial complex, and mutation re- CNV results into functional categories. To adjust for the number of tests sults in neurodegenerative disease (49). KCNMB4 encodes a performed, we made correction of 21 deletion and 5 duplication CNVRs, based on significance in the discovery cohort. potassium channel found in the mitochondrial inner membrane (50). RFWD2 (51), RIT2, and PIK3C3 are involved in the ubi- PennCNV-Affy. The Affymetrix 6.0 provided 848,415 SNP markers and 888,023 quitin E3 ligase growth factor pathway that directly affects mito- CN markers, which were analyzed to construct canonical clustering positions chondria. CAMK2D is the trigger for primary mitochondrial using the PennCNV-Affy workflow, which normalizes the Cartesian coor- apoptosis (52). Mitochondrial dysfunction has been linked with dinates provided by Affymetrix. PennCNV-Affy uses called genotypes and normalizes intensity from APT to create reference cluster positions in polar schizophrenia in several studies (53, 54). coordinates to compute relative differences in the signal from each sample in In conclusion, using a genome-wide approach for high-resolution the form of BAF and LRR. BAF, LRR, population BAF, interprobe distance, and CNV detection, we have identified candidate genomic loci with en- HMM model files were then analyzed by PennCNV to make CNV calls for each richment of CNVs in schizophrenia cases compared with controls, sample. The HMM was trained on Affymetrix 6.0 data. We used PennCNV- and replicated many of these using an independent data set of Affy calls because they use the LRR rather than the log 2 ratio. We reviewed the log 2 ratio values in the visualization tools Affymetrix Genotyping schizophrenia cases and controls. Two of the affected genes (CAC- Console Heat Map and Browser (Fig. S4). The log 2 ratio is based on quantile NA1B and DOC2A) encode calcium-signaling molecules, and two normalization, the sum of signal intensity for the A allele and B allele for other genes (RET and RIT2) belong to the Ras-signaling gene family, each sample and the median across all samples; for a given sample, A + B both of which are involved in neuronal development and signaling. allele intensity is divided by the median value and the logarithm base 2 is Together, these genes show significant enrichment in the gene family taken. In contrast, the LRR is based on defined signal intensity clusters of AA, − AB, and BB genotypes across a large group of samples. Given this expected P × 7 of synaptic transmission molecules based on GO ( =1.5 10 ). intensity value, the observed A + B signal intensity is divided by this expected The enrichment of genes within this molecular system suggests sus- value, and the logarithm is taken. The number of CNVs called per individual ceptibility mechanisms for schizophrenia and should spur identifi- by PennCNV-Affy was lower than that called by BirdsEye. cation of additional variations, including structural variations and single-base changes, in candidates within these gene networks. In ACKNOWLEDGMENTS. We gratefully thank all of the schizophrenia patients who participated in this study and all of the control subjects who donated addition, our results call for functional expression assays to assess the blood samples to CHOP and the UPenn for genetic research purposes. We thank biological effects of CNVs in these candidate genes in brain tissue. the technical staff at CHOP’s Center for Applied Genomics for generating the

Glessner et al. PNAS Early Edition | 5of6 Downloaded by guest on September 29, 2021 genotypes used in this study and the medical assistants, nurses, and medical MIT. The dataset used for the analyses described in this manuscript was ob- staff who recruited the study subjects. We thank Kai Wang for development of tained from dbGaP at http://www.ncbi.nlm.nih.gov/gap (dbGaP accession num- PennCNV-Affy. We also thank S. Kristinsson, L. A. Hermannsson, and A. Krisb- ber phs000021.v2.p1). Data were accessed based on an authorized data access jörnsson for their software design and contributions; Pablo V. Gejman and request by H.H. and S.F.A.G. The study was supported by an Institutional De- Daniel B. Mirel for their oversight of the MGS project and for providing access velopment Award to the Center for Applied Genomics from CHOP (to H.H.) to the data; and Jonathan Sebat and Deborah L. Levy for their collaborative and a Research Development Award from the Cotswold Foundation (to H.H. contributions. We thank the MGS consortium investigators for making the ge- and S.F.A.G.). The schizophrenia case recruitment at UPenn was supported by notype data available. Support for MGS case:control was provided by funding MH-64045 and MH-60722 (to R.E.G.). The sample collection and genotyping for from the National Institute of Mental Health and the National Alliance for Re- the control population typed at CHOP was supported by the National Institutes search on Schizophrenia and Depression. Genotyping of part of the sample was of Health [Research Grants R21HL092379, R01HL0885775, and R01AG17022 (to supported by the Genetic Association Information Network, and the Paul Mi- T.P.C. and K.B.M.)]. The collection and genotyping of samples from Mount chael Donovan Charitable Foundation, The genotyping of samples was pro- Sinai and Sheba Hospital was partially supported by the National Institute of vided through the National Center for Research Resources and performed by Mental Health, Conte Center for Neuroscience of Mental Disorders (Grant the Center for Genotyping and Analysis at the Broad Institute of Harvard and MH066392 to J.D.B.).

1. Arajärvi R, et al. (2005) Prevalence and diagnosis of schizophrenia based on register, 28. Weiss LA, et al., Autism Consortium (2008) Association between microdeletion and case record and interview data in an isolated Finnish birth cohort born 1940–1969. Soc microduplication at 16p11.2 and autism. N Engl J Med 358:667–675. Psychiatry Psychiatr Epidemiol 40:808–816. 29. Kumar RA, et al. (2008) Recurrent 16p11.2 microdeletions in autism. Hum Mol Genet 2. Liu H, et al. (2002) Genetic variation in the 22q11 locus and susceptibility to 17:628–638. schizophrenia. Proc Natl Acad Sci USA 99:16859–16864. 30. Moskvina V, et al. (2009) Wellcome Trust Case Control Consortium (2009) Gene-wide 3. Kirov G, et al. (2008) Comparative genome hybridization suggests a role for NRXN1 analyses of genome-wide association data sets: Evidence for multiple common risk and APBA2 in schizophrenia. Hum Mol Genet 17:458–465. alleles for schizophrenia and bipolar disorder and for overlap in genetic risk. Mol 4. Friedman JI, et al. (2008) CNTNAP2 gene dosage variation is associated with Psychiatry 14:252–260. schizophrenia and epilepsy. Mol Psychiatry 13:261–266. 31. Ferreira MA, et al., Wellcome Trust Case Control Consortium (2008) Collaborative 5. Walsh T, et al. (2008) Rare structural variants disrupt multiple genes in neurodevelop- genome-wide association analysis supports a role for ANK3 and CACNA1C in bipolar mental pathways in schizophrenia. Science 320:539–543. disorder. Nat Genet 40:1056–1058. 6. International Schizophrenia Consortium (2008) Rare chromosomal deletions and du- 32. Li L, et al. (2006) The role of Ret receptor tyrosine kinase in dopaminergic neuron – plications increase risk of schizophrenia. Nature 455:237 241. development. Neuroscience 142:391–400. 7. Stefansson H, et al. (2008) Large recurrent microdeletions associated with schizophrenia. 33. Martins-de-Souza D, et al. (2009) Prefrontal cortex shotgun proteome analysis reveals – Nature 455:232 236. altered calcium homeostasis and immune system imbalance in schizophrenia. Eur 8. Shi YY, et al. (2008) A study of rare structural variants in schizophrenia patients and Arch Psychiatry Clin Neurosci 259:151–163. – normal controls from Chinese Han population. Mol Psychiatry 13:911 913. 34. Le Hellard S, et al. (2007) Haplotype analysis and a novel allele-sharing method refines 9. Need AC, et al. (2009) A genome-wide investigation of SNPs and CNVs in schizophrenia. a chromosome 4p locus linked to bipolar affective disorder. Biol Psychiatry 61: PLoS Genet 5:e1000373. 797–805. 10. Lencz T, et al. (2007) Runs of homozygosity reveal highly penetrant recessive loci in 35. Stopkova P, et al. (2004) Identification of PIK3C3 promoter variant associated with – schizophrenia. Proc Natl Acad Sci USA 104:19942 19947. bipolar disorder and schizophrenia. Biol Psychiatry 55:981–988. 11. Xu B, et al. (2008) Strong association of de novo copy number mutations with 36. Farnsworth CL, et al. (1995) Calcium activation of Ras mediated by neuronal exchange – sporadic schizophrenia. Nat Genet 40:880 885. factor Ras-GRF. Nature 376:524–527. 12. Crespi B, Stead P, Elliot M (2010) Evolution in health and medicine. Sackler 37. Finkbeiner S, Greenberg ME (1996) Ca(2+)-dependent routes to Ras: Mechanisms for Colloquium: Comparative genomics of autism and schizophrenia. Proc Natl Acad Sci neuronal survival, differentiation, and plasticity? Neuron 16:233–236. USA 107(Suppl 1):1736–1741. 38. Oh JS, Manzerra P, Kennedy MB (2004) Regulation of the neuron-specific Ras GTPase- 13. Glessner JT, et al. (2009) Autism genome-wide copy number variation reveals activating protein, synGAP, by Ca2+/calmodulin-dependent protein kinase II. J Biol ubiquitin and neuronal genes. Nature 459:569–573. Chem 279:17980–17988. 14. Lee JA, Lupski JR (2006) Genomic rearrangements and gene copy-number alterations 39. Yoshimura T, et al. (2006) Ras regulates neuronal polarity via the PI3-kinase/Akt/GSK- as a cause of nervous system disorders. Neuron 52:103–121. 3β/CRMP-2 pathway. Biochem Biophys Res Commun 340:62–68. 15. Flomen RH, et al. (2006) Association study of CHRFAM7A copy number and 2 bp 40. Yoshimura T, Arimura N, Kaibuchi K (2006) Signaling networks in neuronal deletion polymorphisms with schizophrenia and bipolar affective disorder. Am J Med polarization. J Neurosci 26:10626–10630. Genet B Neuropsychiatr Genet 141:571–575. 41. Antonarakis SE, Van Aelst L (1998) Mind the GAP, Rho, Rab and GDI. Nat Genet 19: 16. Wilson GM, et al. (2006) DNA copy-number analysis in bipolar disorder and 106–108. schizophrenia reveals aberrations in genes involved in glutamate signaling. Hum Mol 42. Chelly J, Mandel JL (2001) Monogenic causes of X-linked mental retardation. Nat Rev Genet 15:743–749. Genet 2:669–680. 17. Manolio TA, et al., GAIN Collaborative Research Group, Collaborative Association Study 43. Comings DE, Wu S, Chiu C, Muhleman D, Sverd J (1996) Studies of the c-Harvey-Ras of Psoriasis, International Multi-Center ADHD Genetics Project, Molecular Genetics gene in psychiatric disorders. Psychiatry Res 63:25–32. of Schizophrenia Collaboration, Bipolar Genome Study, Major Depression Stage 1 44. Wang K, et al. (2009) Common genetic variants on 5p14.1 associate with autism Genomewide Association in Population-Based Samples Study, Genetics of Kidneys in spectrum disorders. Nature 459:528–533. Diabetes (GoKinD) Study (2007) New models of collaboration in genome-wide 45. Narendra D, Tanaka A, Suen DF, Youle RJ (2009) Parkin-induced mitophagy in the association studies: The Genetic Association Information Network. Nat Genet 39: – 1045–1051. pathogenesis of Parkinson disease. Autophagy 5:706 708. 18. Suarez BK, et al. (2006) Genome-wide linkage scan of 409 European-ancestry and 46. Rothfuss O, et al. (2009) Parkin protects mitochondrial genome integrity and supports – African-American families with schizophrenia: Suggestive evidence of linkage at 8p23.3- mitochondrial DNA repair. Hum Mol Genet 18:3832 3850. p21.2 and 11p13.1-q14.1 in the combined sample. Am J Hum Genet 78:315–333. 47. Haasio K, Koponen A, Penttilä KE, Nissinen E (2002) Effects of entacapone and – 19. O’Donovan MC, et al., Molecular Genetics of Schizophrenia Collaboration (2008) tolcapone on mitochondrial membrane potential. Eur J Pharmacol 453:21 26. Identification of loci associated with schizophrenia by genome-wide association and 48. Yang W, Tiffany-Castiglioni E, Koh HC, Son IH (2009) Paraquat activates the IRE1/ follow-up. Nat Genet 40:1053–1055. ASK1/JNK cascade associated with apoptosis in human neuroblastoma SH-SY5Y cells. – 20. O’Donovan MC, et al., Molecular Genetics of Schizophrenia Collaboration (2009) Analysis Toxicol Lett 191:203 210. β of 10 independent samples provides evidence for association between schizo- 49. Casley CS, Canevari L, Land JM, Clark JB, Sharpe MA (2002) -amyloid inhibits integrated – phrenia and a SNP flanking fibroblast growth factor receptor 2. Mol Psychiatry 14:30–36. mitochondrial respiration and key enzyme activities. JNeurochem80:91 100. 21. Sanders AR, et al. (2008) No significant association of 14 candidate genes with schizo- 50. Piwonska M, Wilczek E, Szewczyk A, Wilczynski GM (2008) Differential distribution of + phrenia in a large European ancestry sample: Implications for psychiatric genetics. Am J Ca2 -activated potassium channel beta4 subunit in rat brain: Immunolocalization in Psychiatry 165:497–506. neuronal mitochondria. Neuroscience 153:446–460. 22. Shi J, et al. (2009) Common variants on chromosome 6p22.1 are associated with 51. Kato S, Ding J, Pisck E, Jhala US, Du K (2008) COP1 functions as a FoxO1 ubi- schizophrenia. Nature 460:753–757. quitin E3 ligase to regulate FoxO1-mediated gene expression. JBiolChem283: 23. Flaum M, Andreasen NC (1991) Diagnostic criteria for schizophrenia and related 35464–35473. disorders: Options for DSM-IV. Schizophr Bull 17:133–156. 52. Zhu W, et al. (2007) Activation of CaMKIIdeltaC is a common intermediate of diverse 24. Redon R, et al. (2006) Global variation in copy number in the . Nature death stimuli-induced heart muscle cell apoptosis. J Biol Chem 282:10833–10839. 444:444–454. 53. Prabakaran S, et al. (2004) Mitochondrial dysfunction in schizophrenia: Evidence for 25. Wang K, et al. (2007) PennCNV: An integrated hidden Markov model designed for compromised brain metabolism and oxidative stress. Mol Psychiatry 9:684–697, 643. high-resolution copy number variation detection in whole-genome SNP genotyping 54. Ben-Shachar D (2002) Mitochondrial dysfunction in schizophrenia: A possible linkage data. Genome Res 17:1665–1674. to dopamine. J Neurochem 83:1241–1251. 26. Kent WJ, et al. (2002) The human genome browser at UCSC. Genome Res 12: 55. Diskin SJ, et al. (2008) Adjustment of genomic waves in signal intensities from whole- 996–1006. genome SNP genotyping platforms. Nucleic Acids Res 36:e126. 27. Fehm HL, Kern W, Peters A (2006) The selfish brain: Competition for energy resources. 56. Dennis G, Jr, et al. (2003) DAVID: Database for Annotation, Visualization, and Prog Brain Res 153:129–140. Integrated Discovery. Genome Biol 4:3.

6of6 | www.pnas.org/cgi/doi/10.1073/pnas.1000274107 Glessner et al. Downloaded by guest on September 29, 2021