Functional Genomics Identify a Regulatory Risk Variation rs4420550 in the 16p11.2 Schizophrenia-Associated Locus
Biological Psychiatry: Archival Report

Hong Chang, Xin Cai, Hui-Juan Li, Wei-Peng Liu, Li-Juan Zhao, Chu-Yi Zhang, Jun-Yang Wang, Jie-Wei Liu, Xiao-Lei Ma, Lu Wang, Yong-Gang Yao, Xiong-Jian Luo, Ming Li, and Xiao Xiao

ABSTRACT

BACKGROUND: Genome-wide association studies (GWASs) have reported hundreds of genomic loci associated with schizophrenia, yet identifying the functional risk variations is a key step in elucidating the underlying mechanisms.

METHODS: We applied multiple bioinformatics and molecular approaches, including expression quantitative trait loci analyses, epigenome signature identification, luciferase reporter assay, chromatin conformation capture, homology-directed genome editing by CRISPR/Cas9 (clustered regularly interspaced short palindromic repeats/Cas9), RNA sequencing, and ATAC-Seq (assay for transposase-accessible chromatin using sequencing).

RESULTS: We found that the schizophrenia GWAS risk variations at 16p11.2 were significantly associated with messenger RNA levels of multiple genes in human brain, and one of the leading expression quantitative trait loci genes, MAPK3, is located ~200 kb away from these risk variations in the genome. Further analyses based on the epigenome marks in human brain and cell lines suggested that a noncoding single nucleotide polymorphism, rs4420550 (p = 2.36 × 10⁻⁹ in schizophrenia GWAS), was within a DNA enhancer region, which was validated via in vitro luciferase reporter assays. The chromatin conformation capture experiment showed that the rs4420550 region physically interacted with the MAPK3 promoter and TAOK2 promoter. Precise CRISPR/Cas9 editing of a single base pair in cells followed by RNA sequencing further confirmed the regulatory effects of rs4420550 on the transcription of 16p11.2 genes, and ATAC-Seq demonstrated that rs4420550 affected chromatin accessibility at the 16p11.2 region.
The rs4420550-[A/A] cells showed significantly higher proliferation rates compared with rs4420550-[G/G] cells.

CONCLUSIONS: These results together suggest that rs4420550 is a functional risk variation, and this study illustrates an example of comprehensive functional characterization of schizophrenia GWAS risk loci.

https://doi.org/10.1016/j.biopsych.2020.09.016

© 2020 Society of Biological Psychiatry. Biological Psychiatry February 1, 2021; 89:246–255 www.sobp.org/journal ISSN: 0006-3223

Accumulating evidence suggests that schizophrenia is a highly heritable polygenic disorder, and genome-wide surveys have reported more than 100 risk loci containing common single nucleotide polymorphisms (SNPs) (1,2) or rare copy number variations (CNVs) (3–6). Among them, the chromosome 16p11.2 region has been consistently highlighted in genetic studies of schizophrenia. For example, rare microduplications at the recurrent ~600-kb 16p11.2 BP4-BP5 locus are significantly associated with schizophrenia (4,5,7). This 16p11.2 CNV region encompasses more than 25 genes. A previous study examined the effect of each individual gene on zebrafish head sizes and identified KCTD13 as the major driver, which participated in head-size regulation by interacting with MAPK3 and MVP (8). Besides, Blizinsky et al. (9) observed increased dendritic arborization in cortical pyramidal neurons isolated from mice carrying the heterozygous 16p11.2 duplication, which was rescued through pharmacological targeting of ERK-1 (extracellular signal-regulated kinase 1) (encoded by MAPK3). Other studies also reported regulatory effects of 16p11.2 CNV genes on dendritic spine development and synaptic function (10–12), which are believed to contribute to the pathogenesis of schizophrenia (13–18).

Common variations in the proxy 16p11.2 CNV region also showed genome-wide significant associations with schizophrenia (1,2,19). However, risk SNPs primarily locate in noncoding regions with undetermined physiological impact. It is presumed that such noncoding DNA variations should either be in linkage with unknown causal mutations or exert regulatory effects themselves (20–27), e.g., they might affect messenger RNA (mRNA) expression through modulating physical interaction between transcription factors and sequence elements (e.g., DNA enhancers) (28–34). In the present study, we performed brain expression quantitative trait loci (eQTLs) analyses and found that schizophrenia risk SNPs at 16p11.2 were significantly associated with mRNA expression of multiple genes. Combinatory analyses including the epigenome signature identification, luciferase reporter assay, chromatin conformation capture (3C), homology-directed genome editing by CRISPR/Cas9 (clustered regularly interspaced short palindromic repeats/Cas9), RNA sequencing (RNA-Seq), and ATAC-Seq (assay for transposase-accessible chromatin using sequencing) identified a functional noncoding SNP rs4420550 residing in an enhancer region at 16p11.2.

METHODS AND MATERIALS

The study protocol was approved by the Institutional Review Board of Kunming Institute of Zoology, Chinese Academy of Sciences. Detailed methods and materials are described in the Methods and Materials in Supplement 1. The primer sequences in real-time quantitative polymerase chain reaction (RT-qPCR) analysis are shown in Table S1 in Supplement 2, the primers used in the 3C assay are listed in Table S2 in Supplement 2, the single guide RNA oligo sequences targeting genes are shown in Table S3 in Supplement 2, and the nucleotide sequences of the double-stranded oligonucleotides in electrophoretic mobility shift assay are shown in Table S4 in Supplement 2.

The eQTL analyses were performed using datasets of the dorsolateral prefrontal cortex (DLPFC) tissues from BrainSeq Phase 1 (n = 412) (35), Brain xQTL (n = 494) (36), CommonMind Consortium (n = 467) (37), Genotype-Tissue Expression (GTEx) Project (n = 175) (38), and PsychENCODE Consortium (n = 1387) (39). We applied the summary data–based Mendelian randomization (SMR) analysis (40,41) to prioritize risk genes, which integrated the schizophrenia genome-wide association studies (GWASs) (40,675 cases and 64,643 controls) (2) with the aforementioned eQTL datasets to assess the pleiotropic effects of SNPs on diagnosis and mRNA expressions.

Linkage disequilibrium (LD) between pairwise SNPs was calculated using Haploview v4.1 (42) based on the European genotype data from 1000 Genomes Project (43). Histone marks (such as H3K4me1, H3K4me3, and H3K27ac) were queried in the available Roadmap Epigenomics projects and ENCODE (Encyclopedia of DNA Elements) datasets of brain tissues and cell lines (HEK293 [human embryonic kidney 293] and SK-N-SH) (44,45). RegulomeDB (http://www.regulomedb.org/) (46) and HaploReg v4.1 (https://pubs.broadinstitute.org/mammals/haploreg/haploreg.php) (47) were also used to determine the SNPs overlapped with open chromatin peaks

was normalized to that of Renilla luciferase to control for transfection efficiency variations. All assays were performed in triplicate in at least three independent experiments. Two-tailed t test was performed for statistical analyses.

The 3C assay was performed following previous studies (48,49) to detect physical interactions between genomic regions. A total of 1 × 10⁷ HEK293T cells in 10 mL of Dulbecco’s Modified Eagle Medium culture medium were fixed using 1% formaldehyde at room temperature for 10 minutes, and the fixed cells were lysed in 5 mL lysis buffer for 15 minutes at 4°C with gentle pipetting. The mixture was centrifuged, and the nuclei pellet was then suspended in 500 µL of 1.2× restriction enzyme buffer. Following this, the pellets were digested with 400 U restriction enzyme EcoRI (New England Biolabs, Ipswich, MA), and T4 DNA ligase (100 U; Promega) was added to the diluted chromatin and incubated for 4 hours at 16°C. Proteinase K (500 µg) was then added and incubated at 65°C overnight to reverse crosslinks, and RNA was removed through RNase A (300 µg) treatment for 60 minutes at 37°C. TaqMan RT-qPCR (Thermo Fisher Scientific) was performed to determine the interaction frequencies between target sites of interest, and the PCR products were purified using a QIAGEN quick gel purification kit for Sanger sequencing of each chimeric DNA (QIAGEN, Hilden, Germany). To normalize primer efficiency, control PCR templates were generated by digestion and random ligation of bacterial artificial chromosomes covering the 16p11.2 region.

Homology-directed repair–mediated genome editing of rs4420550 by CRISPR/Cas9 was conducted in HEK293T cells following a previous study with modifications (50). The protospacer sequence of CRISPR/Cas9 targeting rs4420550 (5′-GATGTTGCTAGGAGCTGACC-3′) was designed using Optimized CRISPR Design (http://crispr.mit.edu), and annealed oligomers were cloned into the pL-CRISPR-E.EFS.GFP plasmid. A single-stranded oligodeoxynucleotide (5′-gttttttgtgaaattattttgctcaacaggacaccaccatggccgtgctgttggtgccaggCcagctcctagcaacatcaaggcttctggagtgagggtgcaaaccagctcccagctggcagc-3′) was synthesized as a homology-directed repair template for precise editing of the rs4420550 locus. The pL-CRISPR-E.EFS.GFP construct and the single-stranded oligodeoxynucleotide were transiently cotransfected into HEK293T cells on 100-mm plates using Lipofectamine 3000. Forty-eight hours after transfection, cells positive for green fluorescent protein were sorted with a flow cytometer. Colonies derived from single cells were then obtained from these cells, and four