Biological Psychiatry Archival Report

Functional Genomics Identify a Regulatory Risk Variation rs4420550 in the 16p11.2 Schizophrenia- Associated Locus

Hong Chang, Xin Cai, Hui-Juan Li, Wei-Peng Liu, Li-Juan Zhao, Chu-Yi Zhang, Jun-Yang Wang, Jie-Wei Liu, Xiao-Lei Ma, Lu Wang, Yong-Gang Yao, Xiong-Jian Luo, Ming Li, and Xiao Xiao

ABSTRACT BACKGROUND: Genome-wide association studies (GWASs) have reported hundreds of genomic loci associated with schizophrenia, yet identifying the functional risk variations is a key step in elucidating the underlying mechanisms. METHODS: We applied multiple bioinformatics and molecular approaches, including expression quantitative trait loci analyses, epigenome signature identification, luciferase reporter assay, chromatin conformation capture, homology- directed genome editing by CRISPR/Cas9 (clustered regularly interspaced short palindromic repeats/Cas9), RNA sequencing, and ATAC-Seq (assay for transposase-accessible chromatin using sequencing). RESULTS: We found that the schizophrenia GWAS risk variations at 16p11.2 were significantly associated with messenger RNA levels of multiple in human brain, and one of the leading expression quantitative trait loci genes, MAPK3, is located w200 kb away from these risk variations in the genome. Further analyses based on the epigenome marks in human brain and cell lines suggested that a noncoding single nucleotide polymorphism, rs4420550 (p=2.36 3 1029 in schizophrenia GWAS), was within a DNA enhancer region, which was validated via in vitro luciferase reporter assays. The chromatin conformation capture experiment showed that the rs4420550 region physically interacted with the MAPK3 promoter and TAOK2 promoter. Precise CRISPR/Cas9 editing of a single in cells followed by RNA sequencing further confirmed the regulatory effects of rs4420550 on the transcription of 16p11.2 genes, and ATAC-Seq demonstrated that rs4420550 affected chromatin accessibility at the 16p11.2 region. The rs4420550-[A/A] cells showed significantly higher proliferation rates compared with rs4420550-[G/G] cells. CONCLUSIONS: These results together suggest that rs4420550 is a functional risk variation, and this study illustrates an example of comprehensive functional characterization of schizophrenia GWAS risk loci. https://doi.org/10.1016/j.biopsych.2020.09.016

Accumulating evidence suggests that schizophrenia is a highly MAPK3). Other studies also reported regulatory effects of heritable polygenic disorder, and genome-wide surveys have 16p11.2 CNV genes on dendritic spine development and reported more than 100 risk loci containing common single synaptic function (10–12), which are believed to contribute to nucleotide polymorphisms (SNPs) (1,2) or rare copy number the pathogenesis of schizophrenia (13–18). variations (CNVs) (3–6). Among them, the Common variations in the proxy 16p11.2 CNV region also 16p11.2 region has been consistently highlighted in genetic showed genome-wide significant associations with schizo- studies of schizophrenia. For example, rare microduplications phrenia (1,2,19). However, risk SNPs primarily locate in non- at the recurrent w600-kb 16p11.2 BP4-BP5 locus are signifi- coding regions with undetermined physiological impact. It is cantly associated with schizophrenia (4,5,7). This 16p11.2 CNV presumed that such noncoding DNA variations should either region encompasses more than 25 genes. A previous study be in linkage with unknown causal mutations or exert regula- examined the effect of each individual on zebrafish head tory effects themselves (20–27), e.g., they might affect sizes and identified KCTD13 as the major driver, which messenger RNA (mRNA) expression through modulating participated in head-size regulation by interacting with MAPK3 physical interaction between transcription factors and and MVP (8). Besides, Blizinsky et al. (9) observed increased sequence elements (e.g., DNA enhancers) (28–34). In the pre- dendritic arborization in cortical pyramidal neurons isolated sent study, we performed brain expression quantitative trait from mice carrying the heterozygous 16p11.2 duplication, loci (eQTLs) analyses and found that schizophrenia risk SNPs which was rescued through pharmacological targeting of at 16p11.2 were significantly associated with mRNA expres- ERK-1 (extracellular signal-regulated kinase 1) (encoded by sion of multiple genes. Combinatory analyses including the

246 ª 2020 Society of Biological Psychiatry. Biological Psychiatry February 1, 2021; 89:246–255 www.sobp.org/journal ISSN: 0006-3223 Biological Molecular Mechanisms Underlying Genetic Risk at 16p11.2 Psychiatry

epigenome signature identification, luciferase reporter assay, was normalized to that of Renilla luciferase to control for chromatin conformation capture (3C), homology-directed transfection efficiency variations. All assays were performed in genome editing by CRISPR/Cas9 (clustered regularly inter- triplicate in at least three independent experiments. Two-tailed spaced short palindromic repeats/Cas9), RNA sequencing t test was performed for statistical analyses. (RNA-Seq), and ATAC-Seq (assay for transposase-accessible The 3C assay was performed following previous studies chromatin using sequencing) identified a functional noncoding (48,49) to detect physical interactions between genomic re- SNP rs4420550 residing in an enhancer region at 16p11.2. gions. A total of 1 3 107 HEK293T cells in 10 mL of Dulbecco’s Modified Eagle Medium culture medium were fixed using 1% formaldehyde at room temperature for 10 minutes, and the METHODS AND MATERIALS fixed cells were lysed in 5 mL lysis buffer for 15 minutes at 4C The study protocol was approved by the Institutional Review with gentle pipetting. The mixture was centrifuged, and the Board of Kunming Institute of Zoology, Chinese Academy of nuclei pellet was then suspended in 500 mL of 1.23 restriction Sciences. Detailed methods and materials are described in the enzyme buffer. Following this, the pellets were digested with Methods and Materials in Supplement 1. The primer se- 400 U restriction enzyme EcoRI (New England Biolabs, Ips- quences in real-time quantitative polymerase chain reaction wich, MA), and T4 DNA ligase (100 U; Promega) was added to (RT-qPCR) analysis are shown in Table S1 in Supplement 2, the diluted chromatin and incubated for 4 hours at 16C. the primers used in the 3C assay are listed in Table S2 in Proteinase K (500 mg) was then added and incubated at 65C Supplement 2, the single guide RNA oligo sequences targeting overnight to reverse crosslinks, and RNA was removed genes are shown in Table S3 in Supplement 2, and the through RNase A (300 mg) treatment for 60 minutes at 37C. nucleotide sequences of the double-stranded oligonucleotides TaqMan RT-qPCR (Thermo Fisher Scientific) was performed to in electrophoretic mobility shift assay are shown in Table S4 in determine the interaction frequencies between target sites of Supplement 2. interest, and the PCR products were purified using a QIAGEN The eQTL analyses were performed using datasets of the quick gel purification kit for Sanger sequencing of each dorsolateral prefrontal cortex (DLPFC) tissues from BrainSeq chimeric DNA (QIAGEN, Hilden, Germany). To normalize primer Phase 1 (n=412) (35),BrainxQTL(n=494) (36), Common- efficiency, control PCR templates were generated by digestion Mind Consortium (n=467) (37), Genotype-Tissue Expression and random ligation of bacterial artificial (GTEx) Project (n=175) (38), and PsychENCODE Consortium covering the 16p11.2 region. (n=1387) (39). We applied the summary data–based Homology-directed repair–mediated genome editing of Mendelian randomization (SMR) analysis (40,41) to prioritize rs4420550 by CRISPR/Cas9 was conducted in HEK293T cells risk genes, which integrated the schizophrenia genome-wide following a previous study with modifications (50). The proto- association studies (GWASs) (40,675 cases and 64,643 spacer sequence of CRISPR/Cas9 targeting rs4420550 (50- controls) (2) with the aforementioned eQTL datasets to assess GATGTTGCTAGGAGCTGACC-30) was designed using Opti- the pleiotropic effects of SNPs on diagnosis and mRNA mized CRISPR Design (http://crispr.mit.edu), and annealed olig- expressions. omers were cloned into the pL-CRISPR-E.EFS.GFP plasmid. A Linkage disequilibrium (LD) between pairwise SNPs was single-stranded oligodeoxynucleotide (50-gttttttgtgaaattattttgct- calculated using Haploview v4.1 (42) based on the European caacaggacaccaccatggccgtgctgttggtgccaggCcagctcctagcaa- genotype data from 1000 Genomes Project (43). Histone catcaaggcttctggagtgagggtgcaaaccagctcccagctggcagc-30)was marks (such as H3K4me1, H3K4me3, and H3K27ac) were synthesized as a homology-directed repair template for precise queried in the available Roadmap Epigenomics projects and editing of the rs4420550 locus. The pL-CRISPR-E.EFS.GFP ENCODE (Encyclopedia of DNA Elements) datasets of brain construct and the single-stranded oligodeoxynucleotide tissues and cell lines (HEK293 [human embryonic kidney 293] were transiently cotransfected into HEK293T cells on 100-mm and SK-N-SH) (44,45). RegulomeDB (http://www.regulomedb. plates using Lipofectamine 3000. Forty-eight hours after org/)(46) and HaploReg v4.1 (https://pubs.broadinstitute.org/ transfection, cells positive for green fluorescent were mammals/haploreg/haploreg.php)(47) were also used to sorted with a flow cytometer. Colonies derived from single determine the SNPs overlapped with open chromatin peaks cells were then obtained from these cells, and four colonies and chromatin immunoprecipitation sequencing (ChIP-Seq) were selected for each homozygous genotype ([A/A] or [G/G]) peaks of histone modifications in brain tissues. of rs4420550. Sanger sequencing was conducted to ensure Luciferase reporter assay was conducted in the HEK293T that the colonies had the same DNA sequence, except for the and SK-N-SH cells to examine the effects of SNPs on DNA rs4420550 locus. enhancer activities. DNA fragments containing target SNPs These eight single-cell colonies were then subject to were amplified, and site-directed mutagenesis was employed RT-qPCR, RNA-Seq, Western blotting, and ATAC-Seq ana- to generate sequences containing either alleles. The DNA lyses. The raw RNA-Seq and ATAC-Seq data in rs4420550- fragments were then cloned into the pGL3-promoter vector, edited HEK293T cells are accessible in the Gene Expression and equal amounts of each plasmid were transiently trans- Omnibus repository (accession number GSE152177). The fected into cells together with pRL-TK using Lipofectamine Wald test in DESeq2 (51) was used to analyze the mRNA 2000 (Thermo Fisher Scientific, Waltham, MA). Twenty-four expression differences revealed by RNA-Seq and different hours posttransfection, cells were lysed and luciferase activ- chromatin accessibility peaks in ATAC-Seq between colonies ity was determined using the Dual-Luciferase Reporter Assay carrying [A/A] or [G/G] genotypes at rs4420550. Relative System (Promega, Madison, WI). The firefly luciferase activity mRNA expression assessed by RT-qPCR was presented as

Biological Psychiatry February 1, 2021; 89:246–255 www.sobp.org/journal 247 Biological Psychiatry Molecular Mechanisms Underlying Genetic Risk at 16p11.2

2ˇ DDCt 27 27 26 25 means of 2 ; densities of bands in Western blotting were pSMR-multi = 2.71 3 10 , 7.59 3 10 , 6.45 3 10 , 7.57 3 10 , quantified using ImageJ (version 1.52a; National Institutes of and 7.22 3 1027, respectively) (Table S6 in Supplement 2). Health, Bethesda, MD); and statistical differences between Genetic risk of schizophrenia consistently predicted higher experimental groups in both assays were examined using 2- mRNA levels of both MAPK3 and INO80E in all these eQTL tailed t test. datasets. Cell proliferation assays were performed using the CCK-8 Besides, the PsychENCODE Consortium has recently (cell counting kit-8) reagent (Beyotime Biotechnology, Hai- analyzed the differentially expressed genes in the DLPFC tis- men, China). HEK293T cells with genotype rs4420550-[A/A] or sues from 559 cases with schizophrenia versus those from 936 -[G/G] were seeded into 96-well plates at a density of 3 3 103 controls (52). Using these data, we found that the mRNA cells/well and cultured for 24, 48, and 72 hours, and 10 mLof expression of MAPK3 was significantly higher in the DLPFC of CCK-8 was added into each well and incubated with the cells patients with schizophrenia compared with unaffected controls at 37C for 1 hour. Optical density at 450 nm was measured for (p = 3.49 3 1024)(Table S7 in Supplement 2). However, the each well using a multidetection microplate reader (Bio-Rad mRNA levels of INO80E did not differ between patients with Laboratories, Hercules, CA). All assays were performed in schizophrenia and controls in their study (p=.429) (52). triplicate in at least three independent experiments, and sta- tistical differences of cell proliferation rates at each time point LD Analysis of the Schizophrenia Risk SNPs at were examined through 2-tailed t test. 16p11.2 We then sought to identify the functional variation(s) among RESULTS the over 70 schizophrenia risk SNPs at the 16p11.2 GWAS locus (p , 5.00 3 1028)(Figure 1A)(1). We first examined the The Schizophrenia GWAS Risk SNP rs12691307 Is LD structure of these SNPs using European genotype data MAPK3 INO80E Associated With mRNA Levels of , , from the 1000 Genomes Project (43). These SNPs primarily and Other Genes at 16p11.2 constitute two LD blocks (Figure S1 in Supplement 1). To investigate potential impact of the schizophrenia risk SNPs rs12691307 locates in LD block 1, and schizophrenia risk at 16p11.2 on human brain transcriptome, we first examined SNPs in this LD block generally show stronger association with the correlations between rs12691307 [the leading schizo- MAPK3 than with INO80E, and their association signals with phrenia GWAS risk SNP at 16p11.2, p=1.30 3 10210 in 34,241 MAPK3 are also stronger than those of the SNPs in LD block 2 cases and 45,604 controls (1)] (odds ratio for A allele [i.e., the [taking results from GTEx Project dataset (38) as an example] risk allele], 1.073) and brain transcriptomes in four independent (Table S8 in Supplement 2). However, the association signals DLPFC eQTL datasets [BrainSeq Phase 1 (n=412) (35), Brain between INO80E and LD block 2 SNPs are stronger than those xQTL (n=494) (36), CommonMind (n=467) (37), and GTEx between INO80E and LD block 1 SNPs. Further analyses Project frontal cortex (n=175) (38)], and we found that the risk revealed that the schizophrenia risk SNPs in LD block 2 were SNP rs12691307 was consistently and significantly associated located either within or near INO80E. Nevertheless, the dis- with mRNA levels of MAPK3 (p=2.57 3 10211, 8.62 3 10218, tance between LD block 1 SNPs and MAPK3 on the genome is 1.93 3 10210, and 9.70 3 10211, respectively) and INO80E (p= quite far (w200 kb), suggesting long-range regulatory effects 2.33 3 10210, 4.58 3 10212, 1.24 3 1029, and 3.60 3 1025, of these SNPs. Given that dysregulation of MAPK3 has been respectively) (Table S5 in Supplement 2). We also examined observed in patients with schizophrenia (52) and in behaviorally the eQTL associations of rs12691307 in a larger dataset abnormal animals (53), further studies investigating whether from PsychENCODE (n=1387) (39). Despite that some of and how the schizophrenia risk SNPs in LD block 1 regulate these samples were overlapped with those of BrainSeq (35) MAPK3 mRNA expression is necessary. and CommonMind (36), rs12691307 was again significantly associated with mRNA expression levels of MAPK3 (p= Identification of a Potential Regulatory SNP 2 2 1.31 3 10 26) and INO80E (p=4.98 3 10 19)(Table S5 in rs4420550 in an Enhancer Element at 16p11.2 Supplement 2). To prioritize the regulatory SNP(s) in LD block 1, we examined their functional annotations in human brain tissues and cells SMR Integrative Analysis of DLPFC eQTL Datasets using data from the ENCODE (44) and Roadmap Epigenomics and Differential Expression Analysis Confirm MAPK3 INO80E projects (45). By comparing the SNP locations with regions of and as Schizophrenia Susceptibility open chromatin depicted by DNase I hypersensitivity, and with Genes regions of active histone H3 lysine modifications (H3K4me1, We then performed the SMR analysis (41) integrating schizo- H3K4me3 and H3K27ac), we found a “potential regulatory phrenia GWAS and brain eQTL datasets to define 16p11.2 region” spanning the GWAS leading risk SNP rs12691307. This genes whose mRNA levels were affected by schizophrenia region exhibited the largest spatial overlap with regulatory genetic risk. In the DLPFC eQTL datasets from BrainSeq marks in multiple human cell lines (HEK293 and SK-N-SH) and Phase 1 (35), Brain xQTL (36), CommonMind (37), GTEx Proj- brain tissues (Figure 1A). ect frontal cortex (38), and PsychENCODE (39), we found This “potential regulatory region” primarily resides in the 50 that mRNA levels of MAPK3 and INO80E were again signifi- flanking region of KCTD13 and harbors 5 high-LD schizo- cantly associated with genetic risk of schizophrenia phrenia risk SNPs (rs12716972, rs12716973, rs4420550, 25 26 25 (MAPK3: pSMR-multi = 2.68 3 10 , 6.56 3 10 , 2.16 3 10 , rs4424923, and rs12691307) (Figure 1A). However, their eQTL 1.75 3 1024, and 2.80 3 1026, respectively; INO80E: associations with KCTD13 expression were much weaker than

248 Biological Psychiatry February 1, 2021; 89:246–255 www.sobp.org/journal Biological Molecular Mechanisms Underlying Genetic Risk at 16p11.2 Psychiatry

A q13 q21 B –11 q11.2 q12.1 p13.3 p13.2 p11.2 p11.1 q11.1 q22.2 q23.1 q23.2 q24.2 p12.1 q12.2 q22.1 q22.3 p12.3 p12.2 q24.1 p=2.22×10 p13.12 p13.13 p13.11 5.6 CHR 16 (hg19) 5.4 16p11.2 CNV region 5.2 29.7 29.8 29.9 30 30.1 30.2 SPN C16orf54 MAZ CDIPT ASPHD1 TMEM219 C16orf92 ALDOA MAPK3

MAPK3 Expression 5.0 QPRT ZG16 MVP SEZ6L2 TAOK2 FAM57B PPP4C GDPD3 CORO1A BrainSeq

KIF22 PAGR1 KCTD13 HIRIP3 DOC2A TBX6 GG AG AA N=95 N=209 N=108

16p11.2 CNV PRRT2 INO80E YPEL3 rs4420550 –11 2 rs12691307 p=9.70×10 r 2.0 10 rs4420550 0.8 •••••••••••• 1.0 8 •••••••••••••••••••• 0.6 ••••••• • –8 0.4 p=5.00×10 0.0 (SCZ-p)6 Genes in ••• 10 ••••••• 0.2 •• • ••••••••••• • • ••• –1.0 4 •• • MAPK3 Expression −log • • • • • –2.0 ••• •••••••••••••••••• •••••••••• GTEx 2 GG AG AA ROADMAP Epigenomics potential regulatory region N=40 N=85 N=50 rs4420550 DNaseI H3K4me1 C

Brain HEK293T p=0.0016 H3K4me3 2 H3K27ac ENCODE

DNaseI 1 H3K4me1 Brain

H3K4me3

DNaseI Relative Luciferase Activity 0 H3K4me1 NC Comparison GG AA rs4420550

HEK293 H3K4me3 p=0.010 H3K27ac SK-N-SH 2 H3K4me1

H3K4me3 SK-N-SH

1 Single SNP prediction and eQTL in the “potential regulatory region” KCTD13

rs12716972 rs12716973 rs4420550 rs4424923 rs12691307 Relative Luciferase Activity 0 PGC2 SCZ p=3.59×10–9 p=3.68×10–9 p=2.36×10–9 p=1.85×10–9 p=1.30×10–10 NC Comparison GG AA Prediction Flanking Active TSS Active TSS EnhancerNon-functional Non-functional rs4420550

Figure 1. Molecular characterization of rs4420550. (A) Genetic associations of SNPs spanning the 16p11.2 CNV region with schizophrenia, and the epi- genome signatures in human brains and cell types, lead to identification of a “potential regulatory region” and a regulatory SNP (rs4420550). A physical map of the region is given and depicts known genes within the region, and the LD is defined based on the SNP rs12691307. (B) eQTL analyses of rs4420550 with MAPK3 mRNA expression in the BrainSeq Phase 1 and GTEx Project datasets. (C) Results of the reporter gene assay testing the regulatory activities of rs4420550. Effects of rs4420550 allele variation on pGL3-promoter activity are shown in the panels for HEK293T and SK-N-SH cells. The Comparison in the figure represents the empty pGL3 promoter vector, and NC represents the empty pGL3 basic vector. The y-axis values represent fold changes of luciferase activity relative to the empty pGL3-promoter vector. Error bars represent SEM of three biological replicates. Data are representative of three independent experiments with consistent results. CHR, chromosome; CNV, copy number variation; ENCODE, Encyclopedia of DNA Elements; eQTL, expression quanti- tative trait locus; GTEx, Genotype-Tissue Expression; LD, linkage disequilibrium; mRNA, messenger RNA; NC, negative control; SCZ, schizophrenia; SNP, single nucleotide polymorphism; TSS, transcription start site. with MAPK3 (Tables S9 and S10 in Supplement 2). We then colocalized with enhancer histone marks in 16 human tissues, used RegulomeDB (46) and HaploReg (47) to prioritize the including the brain, in the HaploReg dataset (Figure S4 in most likely causative SNP(s) among these 5 SNPs. Intriguingly, Supplement 1)(47). analyses using the RegulomeDB dataset (46) suggested that rs4420550 is in high LD with rs12691307 (r2 = .99), and is rs4420550 was in an enhancer region in brain tissues also genome-wide significantly associated with schizophrenia (Figure S2 in Supplement 1), rs12716972 and rs12716973 were (p=2.36 3 1029; odds ratio for A allele, 1.065) (1). The eQTL in a (flanking) active transcription start site (TSS) region, and analyses suggested that rs4420550 was significantly associ- rs4424923 and rs12691307 were unlikely to be functional ated with MAPK3 expression in human brains (Figure 1B) and (Figure S3 in Supplement 1). Consistently, rs4420550 also other organs (Figure S5 in Supplement 1). We then performed

Biological Psychiatry February 1, 2021; 89:246–255 www.sobp.org/journal 249 Biological Psychiatry Molecular Mechanisms Underlying Genetic Risk at 16p11.2

A rs4420550 KCTD13 TAOK2 MAPK3

3

2

1

Interaction frequency 0 Primer chr16 (kb) 29950 30050 30150

B C rs4420550-A/A rs4420550-G/G E rs4420550-A/A rs4420550-G/G Scaled Genes Position MAPK3 TPM MAPK3 16p11.2 MAPK3 10 4 INO80E 16p11.2 INO80E HAGHL 16p13.3 α-tubulin SEZ6L2 16p11.2 HIRIP3 2 MVP 16p11.2 BAHCC1 17q25.3 α-tubulin 8 0 CDIPT 16p11.2 C16orf62 16p12.3 TAOK2 −2 BMP6 6p24.3 CCDC78 16p13.3 α-tubulin TPCN1 12q24.13 (p-value) 6 −4

10 KCNC1 11p15.1 ALDOA FBXL19 16p11.2 −log KCTD13 16p11.2 α-tubulin ZNF629 16p11.2 4 EARS2 16p12.2 3 rs4420550-A/A EBF2 8p21.2 p=0.0004 rs4420550-G/G ALDOA 16p11.2 ERI2 16p12.3 2 p=0.0015 p=0.086 p=0.023 FUOM 10q26.3 2 AC009041.2 16p13.3 HIRIP3 16p11.2 NECAB2 16q23.3 1 TMEM143 19q13.33 ZNF764 16p11.2 0 CORO1A 16p11.2 Average relative density −6 −3 0 36 0 (fold change) PTPRD 9p24.1 MAPK3 HIRIP3 TAOK2 ALDOA log2 D 10 eQTL analysis of rs4420550 in GTEx frontal cortex dataset

(p-value) 5 10 −log

0 15 eQTL analysis of rs4420550 in Brain xQTL dataset

10 (p-value) 10

−log 5

0 RNA-seq differential analysis in rs4420550-edited HEK293T cells 10 (p-value)

10 5 −log

0

MAZ MVP QPRT KIF22 PRRT2 CDIPT TAOK2 HIRIP3 DOC2A ALDOA PPP4C YPEL3 GDPD3 MAPK3 SEZ6L2 ASPHD1 KCTD13 INO80E FAM57B TMEM219 CORO1A Genes at the 16p11.2 CNV region

Figure 2. 3C analysis of 16p11.2 region and RNA-Seq analysis of the rs4420550-[A/A] and -[G/G] HEK293T cells. (A) 3C interaction profiles between the area (containing rs4420550) and the MAPK3 promoter region. 3C libraries were generated with EcoRI, with the anchor point set at the rs4420550 locus. Graphs represent two biological replicates. Error bars indicate standard error. (B) Volcano plots showing effect size estimates by significance of each gene for the

RNA-Seq analysis in the rs4420550-[A/A] and -[G/G] HEK293T cells. Effect sizes captured as log2(fold change) are shown on the x-axis, and significance levels measured as –log10(p value) are shown on the y-axis. Each dot represents an individual gene. Blue dots represent significantly upregulated genes with log2(fold change) . 0.26 at false discovery rate , .05. (C) Heatmap of top dysregulated genes in the RNA-Seq analysis of rs4420550-[A/A] and -[G/G] HEK293T cells. (D) Comparisons of genes within the proxy 16p11.2 CNV region among GTEx Project dataset, Brain xQTL dataset, and rs4420550-[A/A] and -[G/G] HEK293T cells. (E) Western blotting of encoded by the genes at 16p11.2 in rs4420550-[A/A] and -[G/G] HEK293T cells (top) and quantification (bottom). Error bars represent SEM of four biological replicates. Data are representative of three independent experiments with consistent results. Note: the antibodies against these 16p11.2 proteins have been verified through sgRNA knockdown plasmids (Figure S18 in Supplement 1), and the full-scaled gel plot of Western blotting is shown in Figure S19 in Supplement 1. 3C, chromatin conformation capture; CNV, copy number variation; eQTL, expression quantitative trait locus; GTEx, Genotype-Tissue Expression; RNA-Seq, RNA sequencing; sgRNA, single guide RNA; TPM, transcripts per million.

250 Biological Psychiatry February 1, 2021; 89:246–255 www.sobp.org/journal Biological Molecular Mechanisms Underlying Genetic Risk at 16p11.2 Psychiatry

experimental verification of the regulatory effect of rs4420550 homozygotes [G/G] genotype at rs4420550. T7EN1 assay was on gene expression. The 589-bp sequence surrounding performed to ensure that no off-target effects were introduced rs4420550 was cloned into pGL3-promoter vectors, and the by CRISPR/Cas9 (Figure S8 in Supplement 1 and Table S12 in enhancer luciferase activities of vectors containing both alleles Supplement 2). Four single cell–derived colonies of each ge- of rs4420550 were measured in different cell lines. Consistent notype ([A/A] and [G/G]) at rs4420550 were expanded, and with the eQTL associations, the homozygotes of risk allele [A/ their total mRNAs were isolated. RT-qPCR of MAPK3 and A] led to w80% higher luciferase activities compared with the other 16p11.2 genes in these eight colonies was conducted. protective alleles [G/G] in both HEK293T cells (p=.0016) and As expected, the MAPK3 mRNA levels in rs4420550-[A/A] cells SK-N-SH cells (p=.01) (Figure 1C). We also cloned the 477-bp were higher than those in the rs4420550-[G/G] cells (p , region surrounding rs12691307 into pGL3-promoter vectors to .0001), and other 16p11.2 genes (e.g., INO80E, KCTD13, examine its allelic effects on enhancer activities, but no dif- TAOK2, ALDOA, HIRIP3) were also expressed differently be- ference was seen between different alleles (Figure S6 in tween cells of different genotypes (Figure S9 in Supplement 1). Supplement 1). RNA-Seq was then performed on these 8 samples, and 48 differentially expressed genes (false discovery rate [FDR] , The rs4420550 Locus Physically Interacts With .05) with clustering of samples according to the rs4420550 MAPK3 Promoter and TAOK2 Promoter in the 3C genotypes were identified (Table S13 in Supplement 2). The Assay expression changes of MAPK3 showed the most significant association with rs4420550 (p=7.33 3 10211, FDR = 5.47 3 Given the w200-kb distance between rs4420550 and MAPK3 2 2 10 7)(Figure 2B), followed by INO80E (p=8.50 3 10 10, in the genome, we hypothesized that the regulatory effect likely 2 FDR = 5.47 3 10 7). Notably, 12 of the 48 genes were located resulted from interaction between a distal enhancer and target in the 16p11.2 region (FDR , .05) (Figure 2C). We then gene promoters owing to chromatin looping. To investigate if compared the association signals between rs4420550 and the the region surrounding rs4420550 acted as an enhancer genes within the proxy 16p11.2 CNV region in GTEx (38), Brain element for MAPK3 via this mechanism, we performed 3C xQTL (36), and the RNA-Seq results of rs4420550-[A/A] and assay. Briefly, a 16p11.2 regional-wide 3C in HEK293T cells -[G/G] HEK293T cells. In all these datasets, the mRNA was carried out using a series of PCR primers designed with expression levels of MAPK3 and INO80E were significantly sequences spanning from rs4420550 to the distal MAPK3 affected by different alleles of rs4420550 with the same di- promoter. Intriguingly, an increased interaction frequency was rection of allelic effects (Figure 2D). detected between the restriction fragment containing We further confirmed that the MAPK3 protein levels were rs4420550 and that containing MAPK3 promoter in our 3C li- significantly higher in rs4420550-[A/A] cells compared with brary, demonstrating a direct physical interaction between rs4420550-[G/G] cells (p=.0015) (Figure 2E). The endogenous rs4420550 and MAPK3 promoter (Figure 2A; the RT-qPCR Ct proteins of INO80E and KCTD13 were undetectable in values are shown in Table S11 in Supplement 2). Moreover, in HEK293T cells (data not shown). The protein levels encoded the 3C experiments, higher peaks of interaction frequencies by other 16p11.2 genes, such as HIRIP3, TAOK2 and ALDOA, were observed between the restriction fragment containing were also upregulated in rs4420550-[A/A] cells compared with rs4420550 and that containing TAOK2 promoter (Figure 2A), rs4420550-[G/G] cells, despite the fact that TAOK2 did not suggesting that rs4420550 likely affected expression of reach nominal significance level (p=.086) (Figure 2E). TAOK2 through the chromosome looping mechanism. We also examined the potential physical interactions within the 16p11.2 region using public Hi-C data in human tissues rs4420550 Affects Chromatin Accessibility at (30,54) via the 3D Genome Browser (55). In the DLPFC and 16p11.2 Region developing brain cortical plates, the DNA sequences spanning We then tested whether rs4420550-[A/A] and -[G/G] cells had rs4420550 and MAPK3 (and other 16p11.2 genes such as different chromatin accessibility of the 16p11.2 region using TAOK2) were located in the same large topologically associ- the ATAC-Seq method. After quality control, we obtained an ated domain (Figure S7 in Supplement 1). In the nonbrain tis- average of 11.7 million nonduplicated paired fragments only sues (e.g., adrenal gland, bladder, lung, pancreas), rs4420550 mapped to human reference genome GRCh37 autosomes and and those genes were also in the same topologically associ- X-chromosome. After exclusion of blacklisted regions, an ated domain, suggesting that their physical interactions were average of 34,490 narrow peaks called using MACS2 (https:// not restricted to brains. These Hi-C data are in line with the pypi.org/project/MACS2/) remained. The results showed that eQTL analyses in human tissues and the 3C results in the insert size between peaks was about w180 bp or multiple HEK293T cells. 180 bp (Figure S10 in Supplement 1), so that Tn5 can only access the linker DNA rather than nucleosome-occupied re- Precise CRISPR/Cas9 Editing at rs4420550 w fi MAPK3 gions ( 147 bp). We also found that sequence tag intensity Con rms Its Regulatory Effect on and Other was significantly enriched around TSS in all samples 16p11.2 Genes (Figure S11 in Supplement 1). To analyze the chromatin The wild-type HEK293T cells carry [A/A] genotype at accessibility differences between rs4420550-[A/A] and -[G/G] rs4420550. To gain direct evidence of the regulatory effect of HEK293T cells, the read counts of 50,644 merged peaks were rs4420550 on MAPK3 mRNA expression, we employed used as inputs and analyzed using the DESeq2 program. After homology-directed repair–mediated genome editing by comparison, we noticed substantial quantitative differences in CRISPR/Cas9 and thereby obtained HEK293T cells carrying peak signals at 16p11.2 between rs4420550-[A/A] and -[G/G]

Biological Psychiatry February 1, 2021; 89:246–255 www.sobp.org/journal 251 Biological Psychiatry Molecular Mechanisms Underlying Genetic Risk at 16p11.2

Figure 3. ATAC-Seq analysis of rs4420550-[A/A] and -[G/G] HEK293T cells. (A) Genome-wide chromatin accessibility in rs4420550-[A/A] and -[G/G] HEK293T cells. (B) The open chromatin of the MAPK3 gene in the 8 rs4420550-[A/A] and -[G/G] HEK293T samples. (C) The chromatin accessibility in the TSS regulatory region of the MAPK3 gene. (D) Boxplot of ATAC-Seq signal at the TSS regulatory region of MAPK3 for the 8 rs4420550-[A/A] and -[G/G] HEK293T samples. ATAC-Seq, assay for transposase-accessible chromatin using sequencing; BPM, bins per million mapped reads; TSS, transcription start site.

cells (Figure 3A). One of the most significant differences across transcription factors (data not shown). Thus, the molecular the genome was MAZ in the 16p11.2 region (p=7.28 3 1027) mechanisms underlying the regulatory effects of rs4420550 (Figure S12 in Supplement 1 and Table S14 in Supplement 2). remain unclear. We speculate that its regulatory function might Remarkably, in the TSS regulatory regions of MAPK3, the involve mechanisms other than transcription factor binding chromatin accessibility was significantly higher in the (e.g., noncoding RNAs), and this hypothesis awaits further rs4420550-[A/A] cells compared with the rs4420550-[G/G] explorations. cells (p=3.34 3 1024)(Figure 3B–D). The chromatin acces- sibility at the INO80E TSS regulatory region was also higher in rs4420550 Affects Cell Proliferation p= 3 24 rs4420550-[A/A] cells ( 3.90 10 )(Figure S13 in Previous studies suggested that several genes in the 16p11.2 Supplement 1). region, including MAPK3, could affect cell proliferation (8,56); To reveal the potential mechanisms explaining the impact of we therefore investigated whether cells carrying different ge- rs4420550 on gene expression, we screened potential tran- notypes of rs4420550 showed different proliferation rates. By scription factor(s) that could bind the rs4420550 locus. By performing the CCK-8 assay, we observed that the HEK293T functional predictions using RegulomeDB (46) and HaploReg cell clones carrying rs4420550-[A/A] showed significantly v4.1 (47), we found that the DNA sequences spanning higher cell proliferation rates compared with the clones car- rs4420550 likely bound some transcription factors, such as rying rs4420550-[G/G] (p=.015 at 48 hours and p=.017 at 72 CREB3L1, PAX6, and HDAC2 (Figure S14 in Supplement 1). hours) (Figure 4), further confirming the putative physiological We then manipulated the expression of these three transcrip- impact of rs4420550. tion factors respectively and found that overexpression of PAX6 or HDAC2 significantly increased MAPK3 mRNA expression, whereas knockdown of them led to decreased rs4420550 * MAPK3 mRNA levels (Figure S15 in Supplement 1). We con- A/A ducted electrophoretic mobility shift assay assays using nu- G/G clear extracts from wild-type HEK293T cells, HEK293T cells 1.0 overexpressing PAX6, and HEK293T cells overexpressing HDAC2, but no specific shift bands were observed for either allele of rs4420550 (Figure S16 in Supplement 1). We then 0.5 * examined whether different alleles of rs4420550 affected gene

expression by modulating the effects of HDAC2 or PAX6. OD Value (450 nm) Briefly, we overexpressed either HDAC2 or PAX6 in the 0.0 HEK293T cells with rs4420550-[A/A] or -[G/G], but over- 0244872 expression of either transcription factor did not result in any Time (hour) MAPK3 mRNA expression differences between rs4420550-[A/ A] and -[G/G] cells (Figure S17 in Supplement 1). We also Figure 4. rs4420550 affects HEK293T cell proliferation abilities. The blank examined whether the rs4420550 locus was highlighted in the absorbance values from wells without cells were subtracted from all absorbance values. Error bars represent SEM of four biological replicates. ChIP-Seq analyses of diverse transcription factors in multiple Data are representative of three independent experiments with consistent cell lines or tissues using ENCODE dataset (44). However, results. *p , .05, rs4420550-[A/A] versus rs4420550-[G/G]. OD, optical rs4420550 was not implicated in the ChIP-Seq assays of any density.

252 Biological Psychiatry February 1, 2021; 89:246–255 www.sobp.org/journal Biological Molecular Mechanisms Underlying Genetic Risk at 16p11.2 Psychiatry

DISCUSSION physiological impact of these potentially functional variations is Through an integrative functional analysis combining eQTL needed. associations, epigenome annotations, and experimental vali- In summary, we have performed a detailed molecular dations, we identified a functional risk variation rs4420550 in characterization of a potential functional variation within the the schizophrenia-associated 16p11.2 locus. Although not schizophrenia-associated 16p11.2 region. Future studies of showing the most significant association with schizophrenia in rs4420550 using human induced pluripotent stem cells, and GWAS (1), rs4420550 likely locates in the DNA enhancer region reprogrammed cells via genome editing, may provide essential and exerts regulatory effects on transcription of multiple genes insights into its relevant schizophrenia pathophysiology in the 16p11.2 region. While our results suggest that a number (69,70). of genes are affected by rs4420550, MAPK3 remains those most significantly affected. In addition, regulation of the ACKNOWLEDGMENTS AND DISCLOSURES INO80E mRNA by rs4420550 is also consistently observed in This work was supported by National Natural Science Foundation of China multiple analyses, and this gene previously showed significant Grant Nos. 81722019 (to ML), 31701133 (to XX), and 81871067 (to HC); eQTL associations with schizophrenia genetic risk in a tran- Strategic Priority Research Program of the Chinese Academy of Sciences Grant Nos. XDB13000000 (to ML) and XDB02020003 (to Y-GY); Innovative scriptome study of human fetal brains (57). Research Team of Science and Technology Department of Yunnan Province Previous studies have shown that MAPK3 mRNA expres- Grant No. 2019HC004 (to X-JL); the Chinese Academy of Sciences Western sion is significantly higher in the DLPFC of patients with Light Program (to XX); the Youth Innovation Promotion Association, Chinese schizophrenia compared with controls (52), and elevated Academy of Sciences (to XX); the CAS Pioneer Hundred Talents Program (to synaptic ERK-1 (encoded by MAPK3) activation leads to long- ML); and the 1000 Young Talents Program (to ML). Data were generated as term memory deficits in mice (53). Conversely, lacking Mapk3 part of the CommonMind Consortium supported by funding from Takeda Pharmaceuticals Company Limited, F. Hoffman-La Roche and National In- can result in enhanced synaptic plasticity in the striatum and stitutes of Health Grants Nos. R01MH085542, R01MH093725, facilitate striatal-mediated learning and memory (58). Addi- P50MH066392, P50MH080405, R01MH097276, RO1-MH-075916, tionally, inhibition of ERK-1 signaling rescues the pathophysi- P50M096891, P50MH084053S1, R37MH057881 and R37MH057881S1, ological alterations observed in mice carrying the 16p11.2 HHSN271201300031C, AG02219, AG05138 and MH06692. Brain tissue for microduplication (9). These studies suggest that higher MAPK3 the study was obtained from the following brain bank collections: the Mount Sinai National Institutes of Health Brain and Tissue Repository, the Univer- expression (or ERK-1 level/activity) is a risk factor for schizo- ’ INO80E sity of Pennsylvania Alzheimer s Disease Core Center, the University of phrenia and other neurodevelopmental disorders. is a Pittsburgh NeuroBioBank and Brain and Tissue Repositories and the Na- component of chromatin remodeling complex (59), which tional Institute of Mental Health Human Brain Collection Core. CommonMind modulates nucleosome spacing and sliding in an ATP (aden- Consortium Leadership: Pamela Sklar, Joseph Buxbaum (Icahn School of osine triphosphate)-dependent manner (60), and regulates Medicine at Mount Sinai), Bernie Devlin, David Lewis (University of Pitts- transcription during cortical neurogenesis (61–63). burgh), Raquel Gur, Chang-Gyu Hahn (University of Pennsylvania), Keisuke Despite the fact that consistent impacts of rs4420550 on Hirai, Hiroyoshi Toyoshiba (Takeda Pharmaceuticals Company), Enrico MAPK3 INO80E Domenici, Laurent Essioux (F. Hoffman-La Roche), Lara Mangravite, Mette and expression have been reported in a series Peters (Sage Bionetworks), Thomas Lehner, Barbara Lipska (National of analyses, inconsistencies exist regarding its effects on other Institute of Mental Health). Data were generated as part of the Psy- 16p11.2 genes. For example, while many genes at 16p11.2 chENCODE Consortium, supported by Grant Nos. U01MH103392, showed different expression levels between the rs4420550-[A/ U01MH103365, U01MH103346, U01MH103340, U01MH103339, A] and -[G/G] HEK293T cells, not all of them showed evidence R21MH109956, R21MH105881, R21MH105853, R21MH103877, of eQTL associations with rs4420550 in human brains R21MH102791, R01MH111721, R01MH110928, R01MH110927, HIRIP3 ALDOA R01MH110926, R01MH110921, R01MH110920, R01MH110905, (e.g., and in Figure 2D). Given that expression R01MH109715, R01MH109677, R01MH105898, R01MH105898, of a gene is usually regulated by multiple independent or R01MH094714, P50MH106934, U01MH116488, U01MH116487, correlated genetic variations, the eQTL associations of a gene U01MH116492, U01MH116489, U01MH116438, U01MH116441, revealed in currently available datasets likely reflect the U01MH116442, R01MH114911, R01MH114899, R01MH114901, “summed” effects of multiple genetic variations. Therefore, R01MH117293, R01MH117291, and R01MH117292 awarded to Schahram additional functional variations, which either interact with Akbarian (Icahn School of Medicine at Mount Sinai), Gregory Crawford (Duke University), Stella Dracheva (Icahn School of Medicine at Mount Sinai), rs4420550 or directly affect transcription, may explain the Peggy Farnham (University of Southern California), Mark Gerstein (Yale nonsignificant eQTL associations between some 16p11.2 University), Daniel Geschwind (University of California, Los Angeles), Fer- genes and rs4420550. nando Goes (Johns Hopkins University), Thomas M. Hyde (Lieber Institute In line with hypothesis, we acknowledge the potentially for Brain Development), Andrew Jaffe (Lieber Institute for Brain Develop- incomplete assessment of other plausible functional SNPs in ment), James A. Knowles (University of Southern California), Chunyu Liu the 16p11.2 locus as a limitation of the current study. Actually, (State University of New York Upstate Medical University), Dalila Pinto (Icahn School of Medicine at Mount Sinai), Panos Roussos (Icahn School of existence of multiple functional variations (either independent Medicine at Mount Sinai), Stephan Sanders (University of California, San or correlated with each other) in one GWAS risk locus has been Francisco), Nenad Sestan (Yale University), Pamela Sklar (Icahn School of commonly seen (33,64–68). In fact, functional predictions of Medicine at Mount Sinai), Matthew State (University of California, San 16p11.2 variations using HaploReg v4.1 (47) suggested that Francisco), Patrick Sullivan (University of North Carolina), Flora Vaccarino several SNPs are potentially functional. For example, (Yale University), Daniel Weinberger (Lieber Institute for Brain Development), rs12716973, in high LD with rs4420550 and in the active TSS Sherman Weissman (Yale University), Kevin White (University of Chicago), KCTD13 Jeremy Willsey (University of California, San Francisco), and Peter Zandi region of , has 22 bound proteins in the ChIP-Seq (Johns Hopkins University). assays of human tissues as implemented in HaploReg data- We thank Yongbin Chen and Cuiping Yang (Kunming Institute of set (Figure S4 in Supplement 1). Further validation of the Zoology) for providing assistance during the experiments.

Biological Psychiatry February 1, 2021; 89:246–255 www.sobp.org/journal 253 Biological Psychiatry Molecular Mechanisms Underlying Genetic Risk at 16p11.2

HC, XX, and ML designed the study and interpreted the results. HC and 12. Yadav S, Oses-Prieto JA, Peters CJ, Zhou J, Pleasure SJ, XC conducted the primary experiments. H-JL conducted the RNA-Seq and Burlingame AL, et al. (2017): TAOK2 kinase mediates PSD95 stability ATAC-Seq analysis, as well as other bioinformatics analyses. W-PL, L-JZ, and dendritic spine maturation through Septin7 phosphorylation. C-YZ, LW, J-YW, J-WL, and X-LM provided assistance during the experi- Neuron 93:379–393. ments. Y-GY and X-JL provided advice and discussion during the design 13. Penzes P, Cahill ME, Jones KA, VanLeeuwen JE, Woolfrey KM (2011): and practice of the current study. XX and ML drafted the manuscript, and all Dendritic spine pathology in neuropsychiatric disorders. Nat Neurosci authors contributed to the final version of the article. 14:285–293. The authors report no biomedical financial interests or potential conflicts 14. Forrest MP, Parnell E, Penzes P (2018): Dendritic structural plasticity of interest. and neuropsychiatric disease. Nat Rev Neurosci 19:215–234. 15. MacDonald ML, Alhassan J, Newman JT, Richard M, Gu H, Kelly RM, ARTICLE INFORMATION et al. (2017): Selective loss of smaller spines in schizophrenia. Am J Psychiatry 174:586–594. From the Key Laboratory of Animal Models and Human Disease Mecha- 16. Focking M, Lopez LM, English JA, Dicker P, Wolff A, Brindley E, et al. nisms of the Chinese Academy of Sciences and Yunnan Province (HC, XC, (2015): Proteomic and genomic evidence implicates the postsynaptic H-JL, W-PL, L-JZ, C-YZ, J-YW, J-WL, X-LM, LW, Y-GY, X-JL, ML, XX) and density in schizophrenia. Mol Psychiatry 20:424–432. KIZ-CUHK Joint Laboratory of Bioresources and Molecular Research in 17. Osimo EF, Beck K, Reis Marques T, Howes OD (2019): Synaptic loss in Common Diseases (Y-GY, X-JL, ML), Kunming Institute of Zoology, Chinese schizophrenia: A meta-analysis and systematic review of synaptic Academy of Sciences; Kunming College of Life Science (XC, H-JL, W-PL, protein and mRNA measures. Mol Psychiatry 24:549–561. L-JZ, C-YZ, J-YW), University of Chinese Academy of Sciences; and Center 18. Berdenis van Berlekom A, Muflihah CH, Snijders G, MacGillavry HD, for Excellence in Animal Evolution and Genetics (X-JL), Chinese Academy of Middeldorp J, Hol EM, et al. (2020): Synapse pathology in schizo- Sciences, Kunming; and the CAS Center for Excellence in Brain Science and phrenia: A meta-analysis of postsynaptic elements in postmortem Intelligence Technology (Y-GY, ML), Chinese Academy of Sciences, brain studies. Schizophr Bull 46:374–386. Shanghai, China. 19. Steinberg S, de Jong S, Mattheisen M, Costas J, Demontis D, HC, XC, and H-JL contributed equally to this work. Jamain S, et al. (2014): Common variant at 16p11.2 conferring risk of Address correspondence to Xiao Xiao, Ph.D., at [email protected]. psychosis. Mol Psychiatry 19:108–114. cn, or Ming Li, Ph.D., at [email protected]. 20. Yang Z, Cai X, Qu N, Zhao L, Zhong BL, Zhang SF, et al. (2020): Received Mar 5, 2020; revised and accepted Sep 17, 2020. Identification of a functional 339-bp Alu polymorphism in the Supplementary material cited in this article is available online at https:// schizophrenia-associated locus at 10q24.32. Zool Res 41:84–89. doi.org/10.1016/j.biopsych.2020.09.016. 21. Yang Z, Zhou D, Li H, Cai X, Liu W, Wang L, et al. (2020): The genome- wide risk alleles for psychiatric disorders at 3p21.1 show convergent effects on mRNA expression, cognitive function and mushroom den- REFERENCES dritic spine. Mol Psychiatry 25:48–66. 1. Schizophrenia Working Group of the Psychiatric Genomics Con- 22. Edwards SL, Beesley J, French JD, Dunning AM (2013): Beyond sortium (2014): Biological insights from 108 schizophrenia-associated GWASs: Illuminating the dark road from association to function. Am J genetic loci. Nature 511:421–427. Hum Genet 93:779–797. 2. Pardinas AF, Holmans P, Pocklington AJ, Escott-Price V, Ripke S, 23. Li M, Jaffe AE, Straub RE, Tao R, Shin JH, Wang Y, et al. (2016): Carrera N, et al. (2018): Common schizophrenia alleles are enriched in A human-specific AS3MT isoform and BORCS7 are molecular risk mutation-intolerant genes and in regions under strong background factors in the 10q24.32 schizophrenia-associated locus. Nat Med selection. Nat Genet 50:381–389. 22:649–656. 3. Hu Z, Xiao X, Zhang Z, Li M (2019): Genetic insights and neurobio- 24. Liu W, Li W, Cai X, Yang Z, Li H, Su X, et al. (2020): Identification of a logical implications from NRXN1 in neuropsychiatric disorders. Mol functional human-unique 351-bp Alu insertion polymorphism associ- Psychiatry 24:1400–1414. ated with major depressive disorder in the 1p31.1 GWAS risk loci. 4. Marshall CR, Howrigan DP, Merico D, Thiruvahindrapuram B, Wu W, Neuropsychopharmacology 45:1196–1206. Greer DS, et al. (2017): Contribution of copy number variants to 25. Cai X, Yang Z-H, Li H-J, Xiao X, Li M, Chang H (2020): A human- schizophrenia from a genome-wide study of 41,321 subjects. Nat specific schizophrenia risk tandem repeat affects alternative splicing of Genet 49:27–35. a human-unique isoform AS3MTd2d3 and mushroom dendritic spine 5. McCarthy SE, Makarov V, Kirov G, Addington AM, McClellan J, density [published online ahead of print Jul 14]. Schizophr Bull. Yoon S, et al. (2009): Microduplications of 16p11.2 are associated with 26. Zhang S, Zhang H, Zhou Y, Qiao M, Zhao S, Kozlova A, et al. (2020): schizophrenia. Nat Genet 41:1223–1227. Allele-specific open chromatin in human iPSC neurons elucidates 6. Wu Y, Li X, Liu J, Luo XJ, Yao YG (2020): SZDB2.0: An updated functional disease variants. Science 369:561–565. comprehensive resource for schizophrenia research. Hum Genet 27. French JD, Edwards SL (2020): The role of noncoding variants in 139:1285–1297. heritable disease. Trends Genet 36:880–891. 7. Chang H, Li L, Li M, Xiao X (2017): Rare and common variants at 28. Song M, Yang X, Ren X, Maliskova L, Li B, Jones IR, et al. (2019): 16p11.2 are associated with schizophrenia. Schizophr Res 184:105– Mapping cis-regulatory chromatin contacts in neural cells links 108. neuropsychiatric disorder risk variants to target genes. Nat Genet 8. Golzio C, Willer J, Talkowski ME, Oh EC, Taniguchi Y, Jacquemont S, 51:1252–1262. et al. (2012): KCTD13 is a major driver of mirrored neuroanatomical 29. Girdhar K, Hoffman GE, Jiang Y, Brown L, Kundakovic M, phenotypes of the 16p11.2 copy number variant. Nature 485:363–367. Hauberg ME, et al. (2018): Cell-specific histone modification maps in 9. Blizinsky KD, Diaz-Castro B, Forrest MP, Schurmann B, Bach AP, the human frontal lobe link schizophrenia risk to the neuronal epi- Martin-de-Saavedra MD, et al. (2016): Reversal of dendritic pheno- genome. Nat Neurosci 21:1126–1136. types in 16p11.2 microduplication mouse model neurons by phar- 30. Won H, de la Torre-Ubieta L, Stein JL, Parikshak NN, Huang J, macological targeting of a network hub. Proc Natl Acad Sci U S A Opland CK, et al. (2016): Chromosome conformation elucidates reg- 113:8520–8525. ulatory relationships in developing human brain. Nature 538:523–527. 10. Richter M, Murtaza N, Scharrenberg R, White SH, Johanns O, 31. Duan J, Shi J, Fiorentino A, Leites C, Chen X, Moy W, et al. (2014): Walker S, et al. (2019): Altered TAOK2 activity causes autism-related A rare functional noncoding variant at the GWAS-implicated MIR137/ neurodevelopmental and cognitive abnormalities through RhoA MIR2682 locus might confer risk to schizophrenia and bipolar disorder. signaling. Mol Psychiatry 24:1329–1350. Am J Hum Genet 95:744–753. 11. Escamilla CO, Filonova I, Walker AK, Xuan ZX, Holehonnur R, 32. Roussos P, Mitchell AC, Voloudakis G, Fullard JF, Pothula VM, Espinosa F, et al. (2017): Kctd13 deletion reduces synaptic trans- Tsang J, et al. (2014): A role for noncoding variation in schizophrenia. mission via increased RhoA. Nature 551:227–231. Cell Rep 9:1417–1429.

254 Biological Psychiatry February 1, 2021; 89:246–255 www.sobp.org/journal Biological Molecular Mechanisms Underlying Genetic Risk at 16p11.2 Psychiatry

33. Forrest MP, Zhang H, Moy W, McGowan H, Leites C, Dionisio LE, et al. 53. Seese RR, Maske AR, Lynch G, Gall CM (2014): Long-term memory (2017): Open chromatin profiling in hiPSC-derived neurons prioritizes deficits are associated with elevated synaptic ERK1/2 activation and functional noncoding psychiatric risk variants and highlights neuro- reversed by mGluR5 antagonism in an animal model of autism. Neu- developmental loci. Cell Stem Cell 21:305–318.e8. ropsychopharmacology 39:1664–1673. 34. Song JHT, Lowe CB, Kingsley DM (2018): Characterization of a 54. Schmitt AD, Hu M, Jung I, Xu Z, Qiu Y, Tan CL, et al. (2016): human-specific tandem repeat associated with bipolar disorder and A compendium of chromatin contact maps reveals spatially active schizophrenia. Am J Hum Genet 103:421–430. regions in the . Cell Rep 17:2042–2059. 35. Jaffe AE, Straub RE, Shin JH, Tao R, Gao Y, Collado-Torres L, et al. 55. Wang Y, Song F, Zhang B, Zhang L, Xu J, Kuang D, et al. (2018): The (2018): Developmental and genetic regulation of the human cortex 3D Genome Browser: A web-based browser for visualizing 3D genome transcriptome illuminate schizophrenia pathogenesis. Nat Neurosci organization and long-range chromatin interactions. Genome Biol 21:1117–1125. 19:151. 36. Ng B, White CC, Klein HU, Sieberts SK, McCabe C, Patrick E, et al. 56. Gusev A, Mancuso N, Won H, Kousi M, Finucane HK, Reshef Y, et al. (2017): An xQTL map integrates the genetic architecture of the human (2018): Transcriptome-wide association study of schizophrenia and brain’s transcriptome and epigenome. Nat Neurosci 20:1418–1426. chromatin activity yields mechanistic disease insights. Nat Genet 37. Fromer M, Roussos P, Sieberts SK, Johnson JS, Kavanagh DH, 50:538–548. Perumal TM, et al. (2016): Gene expression elucidates functional impact 57. Walker RL, Ramaswami G, Hartl C, Mancuso N, Gandal MJ, de la of polygenic risk for schizophrenia. Nat Neurosci 19:1442–1453. Torre-Ubieta L, et al. (2019): Genetic control of expression and splicing 38. GTEx Consortium; Laboratory Data Analysis, and Coordinating in developing human brain informs disease mechanisms. Cell Center—Analysis Working Group; Statistical Methods groups— 179:750–771.e22. Analysis Working Group; Enhancing GTEx groups; NIH Common 58. Mazzucchelli C, Vantaggiato C, Ciamei A, Fasano S, Pakhotin P, Fund; NIH/NCI, et al. (2017): Genetic effects on gene expression Krezel W, et al. (2002): Knockout of ERK1 MAP kinase enhances across human tissues. Nature 550:204–213. synaptic plasticity in the striatum and facilitates striatal-mediated 39. Wang D, Liu S, Warrell J, Won H, Shi X, Navarro FCP, et al. (2018): learning and memory. Neuron 34:807–820. Comprehensive functional genomic resource and integrative model for 59. Ayala R, Willhoft O, Aramayo RJ, Wilkinson M, McCormack EA, the human brain. Science 362:eaat8464. Ocloo L, et al. (2018): Structure and regulation of the human INO80- 40. Zhu Z, Zhang F, Hu H, Bakshi A, Robinson MR, Powell JE, et al. (2016): nucleosome complex. Nature 556:391–395. Integration of summary data from GWAS and eQTL studies predicts 60. Clapier CR, Iwasa J, Cairns BR, Peterson CL (2017): Mechanisms of complex trait gene targets. Nat Genet 48:481–487. action and regulation of ATP-dependent chromatin-remodelling 41. Wu Y, Zeng J, Zhang F, Zhu Z, Qi T, Zheng Z, et al. (2018): Integrative complexes. Nat Rev Mol Cell Biol 18:407–422. analysis of omics summary data reveals putative mechanisms un- 61. Morrison AJ, Shen X (2009): Chromatin remodelling beyond tran- derlying complex traits. Nat Commun 9:918. scription: The INO80 and SWR1 complexes. Nat Rev Mol Cell Biol 42. Barrett JC, Fry B, Maller J, Daly MJ (2005): Haploview: Analysis and 10:373–384. visualization of LD and haplotype maps. Bioinformatics 21:263–265. 62. Ueda T, Watanabe-Fukunaga R, Ogawa H, Fukuyama H, Higashi Y, 43. Genomes Project Consortium, Auton A, Brooks LD, Durbin RM, Nagata S, et al. (2007): Critical role of the p400/mDomino chromatin- Garrison EP, Kang HM, et al. (2015): A global reference for human remodeling ATPase in embryonic hematopoiesis. Genes Cells genetic variation. Nature 526:68–74. 12:581–592. 44. ENCODE Project Consortium (2012): An integrated encyclopedia of 63. Sokpor G, Castro-Hernandez R, Rosenbusch J, Staiger JF, Tuoc T DNA elements in the human genome. Nature 489:57–74. (2018): ATP-dependent chromatin remodeling during cortical neuro- 45. Roadmap Epigenomics Consortium, Kundaje A, Meuleman W, Ernst J, genesis. Front Neurosci 12:226. Bilenky M, Yen A, et al. (2015): Integrative analysis of 111 reference 64. Chatterjee S, Kapoor A, Akiyama JA, Auer DR, Lee D, Gabriel S, et al. human epigenomes. Nature 518:317–330. (2016): Enhancer variants synergistically drive dysfunction of a gene 46. Boyle AP, Hong EL, Hariharan M, Cheng Y, Schaub MA, Kasowski M, regulatory network in hirschsprung disease. Cell 167:355–368.e10. et al. (2012): Annotation of functional variation in personal genomes 65. Huo Y, Li S, Liu J, Li X, Luo XJ (2019): Functional genomics reveal using RegulomeDB. Genome Res 22:1790–1797. gene regulatory mechanisms underlying schizophrenia risk. Nat 47. Ward LD, Kellis M (2012): HaploReg: A resource for exploring chro- Commun 10:670. matin states, conservation, and regulatory motif alterations within sets 66. He H, Li W, Liyanarachchi S, Srinivas M, Wang Y, Akagi K, et al. (2015): of genetically linked variants. Nucleic Acids Res 40:D930–D934. Multiple functional variants in long-range enhancer elements 48. Naumova N, Smith EM, Zhan Y, Dekker J (2012): Analysis of long- contribute to the risk of SNP rs965513 in thyroid cancer. Proc Natl range chromatin interactions using Chromosome Conformation Cap- Acad Sci U S A 112:6128–6133. ture. Methods 58:192–203. 67. Roman TS, Marvelle AF, Fogarty MP, Vadlamudi S, Gonzalez AJ, 49. Hagege H, Klous P, Braem C, Splinter E, Dekker J, Cathala G, et al. Buchkovich ML, et al. (2015): Multiple hepatic regulatory variants at the (2007): Quantitative analysis of chromosome conformation capture GALNT2 GWAS locus associated with high-density lipoprotein assays (3C-qPCR). Nat Protoc 2:1722–1733. cholesterol. Am J Hum Genet 97:801–815. 50. Gupta RM, Hadaya J, Trehan A, Zekavat SM, Roselli C, Klarin D, et al. 68. Wu Y, Bi R, Zeng C, Ma C, Sun C, Li J, et al. (2019): Identification of the (2017): A genetic variant associated with five vascular diseases is a primate-specific gene BTN3A2 as an additional schizophrenia risk distal regulator of Endothelin-1 gene expression. Cell 170:522–533.e15. gene in the MHC loci. EBioMedicine 44:530–541. 51. Love MI, Huber W, Anders S (2014): Moderated estimation of fold 69. Falk A, Heine VM, Harwood AJ, Sullivan PF, Peitz M, Brustle O, et al. change and dispersion for RNA-seq data with DESeq2. Genome Biol (2016): Modeling psychiatric disorders: From genomic findings to 15:550. cellular phenotypes. Mol Psychiatry 21:1167–1179. 52. Gandal MJ, Zhang P, Hadjimichael E, Walker RL, Chen C, Liu S, et al. 70. Hoffman GE, Schrode N, Flaherty E, Brennand KJ (2019): New con- (2018): Transcriptome-wide isoform-level dysregulation in ASD, siderations for hiPSC-based models of neuropsychiatric disorders. schizophrenia, and bipolar disorder. Science 362:eaat8127. Mol Psychiatry 24:49–66.

Biological Psychiatry February 1, 2021; 89:246–255 www.sobp.org/journal 255