<<

Molecular Psychiatry (2015) 20, 839–849 © 2015 Macmillan Publishers Limited All rights reserved 1359-4184/15 www.nature.com/mp

ORIGINAL ARTICLE Variants of the CNTNAP2 5′ promoter as risk factors for autism spectrum disorders: a genetic and functional approach

AG Chiocchetti1, M Kopp1, R Waltes1, D Haslinger1, E Duketis1, TA Jarczok1, F Poustka1, A Voran2, U Graab3, J Meyer4, SM Klauck5, S Fulda3 and CM Freitag1

Contactin-associated protein-like 2 (CNTNAP2), a member of the Neurexin gene superfamily, is one of the best-replicated risk genes for autism spectrum disorders (ASD). ASD are predominately genetically determined neurodevelopmental disorders characterized by impairments of language development, social interaction and communication, as well as stereotyped behavior and interests. Although CNTNAP2 expression levels were proposed to alter ASD risk, no study to date has focused on its 5′ promoter. Here, we directly sequenced the CNTNAP2 5′ promoter region of 236 German families with one child with ASD and detected four novel variants. Furthermore, we genotyped the three most frequent variants (rs150447075, rs34712024, rs71781329) in an additional sample of 356 families and found nominal association of rs34712024G with ASD and rs71781329GCG[7] with language development. The four novel and the three known minor alleles of the identified variants were predicted to alter transcription factor binding sites (TFBS). At the functional level, the respective sequences spanning these seven variants were bound by nuclear factors. In a luciferase promoter assay, the respective minor alleles showed cell line-specific and differentiation stage-dependent effects at the level of promoter activation. The novel potential rare risk-variant M2, a G>A mutation −215 base pairs 5′ of the transcriptional start site, significantly reduced promoter efficiency in HEK293T and in undifferentiated and differentiated neuroblastoid SH-SY5Y cells. This variant was transmitted to a patient with autistic disorder. The under-transmitted, protective minor G allele of the common variant rs34712024, in contrast, increased transcriptional activity. These results lead to the conclusion that the pathomechanism of CNTNAP2 promoter variants on ASD risk is mediated by their effect on TFBSs, and thus confirm the hypothesis that a reduced CNTNAP2 level during neuronal development increases liability for ASD.

Molecular Psychiatry (2015) 20, 839–849; doi:10.1038/mp.2014.103; published online 16 September 2014

1Department of Child and Adolescent Psychiatry, Psychosomatics and Psychotherapy, JW Goethe University, Frankfurt am Main, Germany; 2Department of Child and Adolescent Psychiatry, Saarland University, Homburg, Germany; 3Institute of Experimental Cancer Research in Pediatrics, Frankfurt am Main, Germany; 4Department of Neurobehavioral Genetics, Institute of Psychobiology, University of Trier, Trier, Germany and 5Division of Molecular Genome Analysis, German Cancer Research Center (DKFZ), Heidelberg, Germany. Correspondence: Professor CM Freitag, Department of Child and Adolescent Psychiatry, Psychosomatics and Psychotherapy, JW Goethe University, Deutschordenstr. 50, Frankfurt am Main 60528, Germany. E-mail: [email protected] Received 18 June 2013; revised 4 June 2014; accepted 14 July 2014; published online 16 September 2014

INTRODUCTION

Autism spectrum disorders (ASD) are childhood-onset neurodevelopmental disorders, showing a prevalence of approximately 1% in the general population.1,2 ASD are characterized by pervasive impairments of social interaction and communication, repetitive and stereotyped patterns of behavior and restricted interests.3,4 They comprise childhood autism, Asperger’s syndrome and atypical autism or, respectively, pervasive developmental disorder not otherwise specified. Although twin studies have shown a heritability of 70–90%,5,6 the genetic architecture of ASD has not yet been fully elucidated.

The contactin-associated protein-like 2 gene (CNTNAP2), a member of the Neurexin gene superfamily, has been suggested as one of the most promising risk genes for ASD and language development.7 It is located within one of the best replicated chromosomal regions detected by linkage studies in ASD (7q22–q36)5 and language development.8 In addition, several studies on CNTNAP2 have reported association of genetic variants with ASD and language development. Increased risk for ASD for rs7794745T was noted in a study combining linkage- and subsequent family-based association methods. Furthermore, this variant of intron 2 of CNTNAP2 displayed a male-specific and a parent-of-origin effect,9 partly explaining the marked sex differences observed in the incidence of autism. In addition, the major C allele of the non-coding variant rs2710102 (intron 13) increased ASD risk. Again, a mainly male-driven association with age at first spoken word was observed.9 The relevance of CNTNAP2 variants for language development is further highlighted by an association of rs2710102C with poor non-word repetition test performance and lower receptive and expressive language abilities in specific language impairment.10,11 In an Australian general population sample, a four-marker haplotype including rs2710102C correlated with language acquisition. The homozygous carriers of the haplotype obtained substantially lower scores in the communication subscale of the Infant Monitoring Questionnaire.12 Functionally, a magnetic resonance imaging study showed that healthy individuals homozygous for both ASD risk alleles, rs7794745T and rs2710102C, exhibited an increased activation in the right inferior frontal gyrus during a verbal fluency task, which typically induces activation of the left side, that is, Broca's area.13 This atypical language lateralization has also been observed in ASD.14

Sequencing studies of CNTNAP2 found an elevated burden of rare and private mutations in patients with ASD compared with controls.15,16 Most of the identified deleterious or non-synonymous mutations resided in the C-terminal part of CNTNAP2 with no specific enrichment in any single domain (exons 14–24). Their functional impact is still elusive. Also, several case studies described rare deletions of CNTNAP2. For most of the patients language-related deficits were reported, including stuttering, delay of language or ASD.17–21 One individual with ASD carried a maternally inherited rare deleterious copy number variation (CNV) within the promoter region of CNTNAP2, which led to reduced CNTNAP2 mRNA expression in lymphoblastoid cell lines of carriers with ASD in this family.22

CNTNAP2 knock-out mice showed behavioral phenotypes reminiscent of ASD including stereotypic motor movements, behavioral inflexibility, abnormalities in social behavior and reduced ultrasonic vocalization. At the cellular level of the brain, an aberrant neuronal migration, reduction of GABAergic interneurons and aberrant neuronal synchrony were described, pointing toward a central role of CNTNAP2 in early brain development and neuronal differentiation.23 The central role of this protein is further highlighted by its role in the clustering of potassium (K+) channels at the juxtaparanodal region of the nodes of Ranvier,24 and its high expression levels in language-related brain areas.25

Taken together, there is convergent evidence from linkage, association, CNV, knock-out, sequencing and expression studies for involvement of CNTNAP2 in ASD and language-related phenotypes. It is thus likely that genetic variants or a reduced gene dosage of this susceptibility gene modulate ASD and/or language development by changing CNTNAP2 functionality and availability during neuronal development. Recently, evidence has emerged showing that FOXP2 downregulates CNTNAP2 levels and that mutations of either FOXP2 or FOXP2 binding elements in intron 1 of the CNTNAP2 gene promote language disorders.10,20 However, this intronic repressor is just one of the many potential regulatory elements. Despite the obvious functional impact of 5′ promoter variants on mRNA expression levels,26,27 no study to date has focused on the identification and functional characterization of CNTNAP2 promoter variants in the context of ASD. This region has so far escaped array-based genome-wide analysis, as none of the annotated variants is covered by any of the commercially available chip-sets. Furthermore, next-generation sequencing studies are limited in their detection of sequence variants in GC-rich regions such as 5′ promoters.28

Therefore, we aimed at identifying novel mutations and studying known polymorphisms of the CNTNAP2 5′ core promoter in patients with an ASD diagnosis, testing their functional impact in vitro, and investigating their association with ASD and language-related phenotypes. After sequencing of a detection cohort, we performed in-silico analyses to specify transcription factor binding sites (TFBS) likely altered by the variants. The significance of the respective alteration of TFBS was assessed by assaying the binding of nuclear factors and by measuring the transcriptional activity of the respective promoter sequences in vitro. As the expression of transcription factors varies depending on cell line and cellular differentiation stage, in-vitro analysis was performed in HEK293T cell lines and at different developmental stages, that is, in undifferentiated as well as during early- and late-neuronal differentiation of neuroblastoid SH-SY5Y cell lines. Using these two cell models, we could also study differential effects between non-neuronal and neuronal cell types. Finally, to assess the effect of the most frequent variants on ASD risk and language development, we studied their association in an extended cohort.

We report four novel mutations and three known polymorphisms of the 1 kb up-stream promoter of CNTNAP2 in ASD families. All variants were bound by nuclear factors. Two of the novel mutations were found in affected individuals. Furthermore, we showed that three of the four new mutations as well as all three known variants altered transcriptional efficiency in vitro depending on neuronal differentiation status and cellular background. We report a nominal association of the functional CNTNAP2 promoter variant rs34712024 with ASD and of the short tandem repeat (STR) rs71781329 with language development. In conclusion, we discuss a potential dosage-dependent effect of CNTNAP2 expression on ASD etiology.

MATERIALS AND METHODS

Participants

Subjects with ASD and their families were recruited at the Departments of Child and Adolescent Psychiatry, Psychosomatics and Psychotherapy at JW Goethe University Frankfurt/Main and at Saarland University Hospital, as described previously.29,30 All parents and children had given written informed consent. The study was approved by the local ethical committees (decision 162/99 (Frankfurt); 73/04 (Homburg) and 237/09 (Frankfurt)). All patients were diagnosed using the gold-standard diagnostic tools Autism Diagnostic Interview-Revised (ADI-R)31 and/or Autism Diagnostic Observation Schedule Generic (ADOS-G).32 Affected individuals met DSM-IV TR criteria for autistic disorder, Asperger’s disorder or pervasive developmental disorder not otherwise specified. We excluded individuals with intelligence quotient (IQ) <35, birth weight <1000 g, cerebral palsy, reported chronic medical conditions (with the exception of epilepsy), blindness, deafness, known karyotypic or cytogenetic aberrations (including 7q35 CNVs). The cohort in total consisted of 510 male and 82 female patients from 592 families (492 trios, 73 duos, 27 singletons). A total of 87.5% of the index patients were diagnosed with strict autistic disorder including high- (IQ ⩾ 70) and low-functioning (IQ <70) autism. The remaining 12.5% of the patients received a spectrum diagnosis (Asperger’s disorder or pervasive developmental disorder not otherwise specified). Of this cohort, we randomly selected 236 families as sequencing sample. The remaining set was used as extended sample to improve power for single-marker association tests. Both samples were matched for diagnoses, age at diagnosis, sex and IQ (Supplementary Table 1).

Sequencing (detection sample)

Genomic DNA was extracted from blood or saliva using standard methods. The sequence of interest was determined by annotated TFBSs 5′ of the transcriptional start site (TSS) as provided by the transcription factor ChIP-Seq data set (ENCODE, UCSC genome browser). This region spanned base pairs −646 to −120 from TSS of CNTNAP2 transcript NM_014141.5 (Figure 1a).

PCR primers were designed to cover transcriptional and translational start site as well as the potential 5′ promoter region of interest here. The final amplicon spanned the region from +585 bp to −709 bp (NM_014141.5). For full details on primers and PCR conditions see Supplementary Table 2. Amplicons were purified using GeneJET PCR Purification Kit (Thermo Scientific, St Leon-Rot, Germany). Sanger sequencing of the detection sample was outsourced to SRD (Bad Homburg, Germany).

Prediction of regulatory elements

To identify target regions for transcriptional regulation, the TF-ChIP-Seq data set from ENCODE was used. Internal ribosome entry sites were identified using IRESite web tool (http://www.iresite.org/). To predict modified transcription factor binding sites (TFBS), Genomatix software MatInspector (Munich, Germany) V2.6 (ref. 33) was used. Sequences including ±50 bp of each variant were analyzed. Transcription factors were not considered for further investigation if the input matrix sequence did not span the variant of interest, or if standard score or matrix similarity was below 0.75. Transcription factors with evidence for regulation during neuronal differentiation from a microarray experiment (data not shown) were further investigated for their regulatory pattern during neuronal differentiation using real-time reverse transcription-polymerase chain reaction (RT-PCR).

Cloning

For construction of the luciferase vectors, a fragment of 1236 bp (+0 bp to −1236 bp; NM_014141.5) was amplified from subjects carrying major alleles only or the respective minor alleles (Supplementary Table 2). This region spans the TSS and additional 500 bp 5′ of the predicted TF binding region. Amplicons were subcloned into pGEM-T Easy Vector-System I before cloning into target vector pGL4.10[luc2] (Promega, Mannheim, Germany). Novel mutations (M1–M4) were generated by site-directed mutagenesis of the wild-type vector pGL4.10[luc2]-WT as described34 (for primers see Supplementary Table 2). Vectors carrying promoter variants rs150447075G, rs34712024G, rs71781329GCG[7] or rs71781329GCG[8] were generated using the same procedure as for wild-type/major-allele


Figure 1. Variants of the CNTNAP2 promoter. (a) Variants analyzed in this study, their schematic localization within the respective region and the potential transcription factor binding sites (TFBS) reported in the ENCODE TF-ChIP-Seq Dataset. The proximity of the different minor alleles is shown. Gray values of DNA binding factors are proportional to the maximum signal strength reported. (b) Pedigrees of families carrying identified novel variants. Nomenclature of the variants is based on their relative position to the transcriptional start site of mRNA NM_014141.5. Black-filled boxes (males) or circles (females) mark individuals with an ASD diagnosis. Colored rectangles around genotype correspond to the identified mutations and are used consistently throughout the other figures. CTCF, Insulator protein (CCCTC-binding factor); NANOG Transcription Factor; EGR1, early growth response 1; , transcription factor 1; HDAC2, histone deacetylase 2; JUND, Jun D Proto-Oncogene; NRSF, neuron-restrictive silencer factor; RAD21, double-strand-break repair protein rad21 homolog; SIN3A, Histone Deacetylase Complex Subunit Sin3a; TAF7, TATA Box binding protein (TBP)-associated factor, RNA Polymerase II; YY1, Yin and Yang 1 protein; ZNF263, Zinc-finger protein 263.

constructs pGL4.10[luc2]-WT but with DNA of the respective minor allele carriers as templates. Constructs were screened using restriction fragment length polymorphism analysis as described below. All final constructs were verified by direct sequencing.

Cell culture

HEK293T and SH-SY5Y cells were seeded at a density of 2 × 10^5 cells cm^−2 and grown in DMEM+GlutaMAX medium (all media were purchased from Life Technologies, Darmstadt, Germany, if not otherwise specified) supplemented with 10% fetal calf serum, 1% Pyruvate, 100 U ml^−1 Penicillin (PAA, Piscataway, NJ, USA) and 100 μg ml^−1 Streptomycin (PAA) in a humidified atmosphere at 37 °C and 5% CO2. For the differentiation assay, SH-SY5Y cells were seeded at a density of 5 × 10^3 cells cm^−2 in standard medium for 24 h followed by treatment for an additional 10 days with differentiation medium containing 10 μM retinoic-acid (Sigma-Aldrich, St Louis, MO, USA), 50 ng μl^−1 brain derived neurotrophic factor (Immunotools, Friesoythe, Germany), 2 mM cAMP (Sigma-Aldrich), 20 mM KCl (Sigma-Aldrich), 1× B27 supplement and 1× GlutaMAX in Neurobasal medium, exchanging the medium every other day. Cells were collected 24 h after exchanging the media. Transition from undifferentiated to differentiated stages was confirmed using real-time RT-PCR to measure mRNA expression levels of neuronal marker gene MAPT (increased expression) and cell division marker gene CDC2 (reduced expression).

Relative quantitative real-time RT-PCR

mRNA was purified using the GeneJet mRNA-Purification Kit (Thermo Fisher Scientific, Waltham, MA, USA) and cDNA was generated with the Revert-Aid H-Plus Kit (Thermo Fisher Scientific) according to the manufacturer’s protocols. mRNA levels were analyzed adapting the UPL-System from Roche using the ABgene-ROX-Mastermix (Thermo Fisher Scientific) on a StepOnePlus System (Life Technologies, Darmstadt, Germany). For details on primer and probe combinations, see Supplementary Table 2. Fold-change of mRNA expression level was calculated using the 2^−ΔΔCT method.35 GAPDH and POLR2F were used as stable reference genes calculating the geometric mean of relative expression as proposed by previous studies36 and as confirmed in own experiments (data not shown).

Transfection and luciferase assay

Transfection of the luciferase vectors was performed using MetafectenePro (Biontex Laboratories, Planegg, Germany) according to the manufacturer’s recommendations. Cells were plated at a density of 2 × 10^5 cells cm^−2 (mitotic cells) or 2.5 × 10^4 cells cm^−2 (differentiating SH-SY5Y) in a 12-well plate. Luciferase activity was measured by pooling two wells using the Dual-Luciferase Reporter Assay System (Promega, Mannheim, Germany) following the manufacturer’s protocol. Firefly luciferase activity was normalized to the empty vector pGL4.10[luc2]. All experiments were corrected for transfection efficiency based on co-transfection with Renilla luciferase-expressing vector pGL4.74[hRluc/TK] (Promega). All experiments were performed in biological triplicates.

Electrophoretic mobility shift assay (EMSA)

To confirm binding of nuclear factors to the variants under study, we applied EMSA using biotin-labeled double stranded DNA sequences (for details see Supplementary Table 2) and nuclear extracts from fibroblastoid HEK293T and neuroblastoid SH-SY5Y undifferentiated cells. Nuclear protein fractions were extracted as published.37 All EMSA were performed on a

native 5% polyacrylamide (Rotiphorese, Roth, Germany) gel in 0.5x TBE Buffer (45 mM Tris, 45 mM boric acid, 1 mM EDTA; all purchased from Applichem, Darmstadt, Germany). Gels were blotted for 10 min at 3.55 mA cm^−2 in 0.5x TBE. Binding reaction and detection of biotin-labeled DNA fragments were performed using the LightShift Chemiluminescent EMSA Kit (Thermo Fisher Scientific) according to manufacturer’s protocol. In each binding reaction (20 μl), 20 fmol of biotin-labeled fragments were used. To reduce unspecific bindings, 1 μg of artificial poly(deoxy-inosine * deoxy-cytosine) DNA was also included.

Genotyping (extension sample)

To increase sample size for association test of known variants, we genotyped additional 356 families (see participant section above). Variants rs150447075, rs34712024 and rs71781329 were genotyped using restriction fragment length polymorphisms (Watcut Database; Supplementary Table 2). A subset of N = 295 families was analyzed using Custom TaqMan SNP Assays (Life Technologies) AH5IQWH (rs150447075) and AH39SP9 (rs34712024), respectively. Details on real-time RT-PCR setups are available upon request. In total, we genotyped 99.73% (rs150447075 and rs34712024) and 98.37% (rs71781329) of the extension sample.

Statistical analysis

Statistical analyses were done by IBM SPSS v20.0 or SAS v9.3. Power analysis was performed by Quanto v1.2.4 (ref. 38) and G*Power v3.1.0 (http://www.psycho.uni-duesseldorf.de/aap/projects/gpower). Detection of Mendelian errors was performed using Haploview v4.1 (ref. 39). UNPHASED v3.1.4 (ref. 40) was used to study single-marker association and to explore imprinting and sex-specific effects. As the ADI-R language items A9 (age at first word) and A10 (age at first phrase) were not normally distributed (Kolmogorov–Smirnov P<0.001), non-parametric regression models investigating the effect of the dichotomized genotypes (dominant model) with adjustment for age at diagnosis and IQ effects were fitted with the SAS macro ‘npar’ (http://www.ams.med.uni-goettingen.de/Projekte/makros/index.html). Luciferase assays were analyzed by repeated measures analysis of variance.

RESULTS

Promoter sequencing

To identify known and novel variants, we sequenced the CNTNAP2 promoter region of 667 members of 236 German families with one child affected with ASD. In total, we identified four novel (M1–M4; Figure 1a) variants 5′ of the TSS of CNTNAP2 (Ref Seq: NM_014141.5, UCSC genome browser build Chr 37). None of them has been characterized previously or mentioned in the 1000 Genome database (http://www.1000genomes.org) or listed on the Exome Variant Server (http://evs.gs.washington.edu). Variant M1, a deletion of a C nucleotide at position −407 (M1_−407C>delC) was identified in an unaffected male half sibling. Maternal transmission could be excluded, but no DNA of the father was available. The G to A nucleotide exchange at position −215 (M2_−215G>A) was paternally transmitted to a patient with autistic disorder (confirmed by ADI-R and ADOS, IQ = 78 and a delayed onset of speech with age at first spoken phrase = 54 months). The third variant at position −43 changes G to C (M3_−43G>C) and was detected in one mother. Variant M4_−26G>C was paternally transmitted to a child diagnosed with autistic disorder (confirmed by ADI-R and ADOS, IQ = 50 and a delayed onset of speech with age at first spoken phrase = 52 months). The same variant was additionally identified in a mother and a father of two independent families, respectively. Neither of them had transmitted the allele to their affected child but the father had transmitted the C allele to an unaffected sibling (Figure 1b). The affected individuals carried a diagnosis of early childhood autism with language delay and atypical autism without language delay, respectively. No phenotypic measures were available for the parents carrying any of the novel mutations.

Variants M1–M4 reside in the 5′ promoter region of CNTNAP2 and are thus of major interest in this study. M1 and M2 were within DNA segments that are bound by transcriptional factors (Figure 1a) with scores ranging from 92 (CTCF, CCCTC-binding factor) to 1000 (NRSF, neuron-restrictive silencing factor) out of 1000. This strongly supports the regulatory potential of these sequences.

In our sequencing cohort, we also detected three out of ten known variants of the CNTNAP2 5′ promoter region of interest (based on dbSNP, HapMap and UCSC data): the single-nucleotide polymorphisms rs150447075 (NG_007092.2:g.4595T>G) and rs34712024 (NG_007092.2:g.4641A>G) as well as the STR rs71781329 (NG_007092.2:g.4864_4865insGCGGCG) (Figure 1a). The STR rs71781329 is a sixfold GCG repeat (major allele termed here GCG[6]) where the annotated insertion leads to eight repeats (GCG[8]). In one family, we identified a previously unknown paternally transmitted GCG[7] allele.

EMSA

Binding of nuclear factors to the sequences spanning the identified variants was confirmed by EMSA (Figure 2). Interestingly, the pattern of interaction between SH-SY5Y and HEK293T nuclear extracts differed, suggesting a cell type-specific expression of nuclear proteins.

The strongest binding was observed for the sequences spanning M2_–215 and rs71781329 followed by rs34712024. Weak protein–DNA interactions were visible for variants M3_–43 and M4_–26. Oligos spanning M1 and rs150447075 show similar

Figure 2. Binding of nuclear factors to 5′ promoter variants. Electrophoretic mobility shift assay (EMSA) proves binding of nuclear factors to the respective alleles. Positive control (black) DNA and EBNA protein was provided with the LightShift Chemiluminescent EMSA Kit. Double-stranded oligos spanning wild-type/major allele (dark gray) and minor allele (colored) sequences +10 base pairs in each direction were incubated with nuclear extracts of HEK293T or SH-SY5Y cells and compared with the DNA incubated without protein extracts (no prot). All experiments were performed at least twice and representative results are shown. Wild-type oligo of variant M1_−407C and major allele variant rs150447075T are identical. Strongest binding is observed for sequences spanning variants M2_−215 and rs71781329. Minor alleles of variants M2_−215A and rs34712024G reduce binding affinity, whereas minor alleles (GCG[7] and GCG[8]) of variant rs71781329 increase binding.
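As a rough illustration of how major- and minor-allele probe pairs of the kind described in this legend could be derived in silico, the following Python sketch cuts a window of 10 bases on either side of a variant and substitutes the alternative allele. The sequence `TOY_PROMOTER`, the index and the function names are hypothetical and chosen only for the example; they are not the oligos listed in Supplementary Table 2.

```python
# Hypothetical sketch: build major/minor allele probe sequences around a variant.
# TOY_PROMOTER is an invented sequence, not CNTNAP2 promoter sequence.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, i.e. the second strand of a double-stranded oligo."""
    return seq.translate(COMPLEMENT)[::-1]


def probe_pair(reference: str, index: int, ref_allele: str, alt_allele: str, flank: int = 10):
    """Return (major, minor) probes: the allele plus `flank` bases on each side.

    `index` is the 0-based position of the first base of `ref_allele`.
    An empty `alt_allele` models a deletion.
    """
    end = index + len(ref_allele)
    if index < flank or end + flank > len(reference):
        raise ValueError("not enough flanking sequence around the variant")
    if reference[index:end] != ref_allele:
        raise ValueError("reference allele does not match the sequence")
    left = reference[index - flank:index]
    right = reference[end:end + flank]
    return left + ref_allele + right, left + alt_allele + right


TOY_PROMOTER = "TTGACCGGTACGATCGGCTAGCATCGGATCA"

# A G-to-A exchange at 0-based index 15 of the toy sequence.
major, minor = probe_pair(TOY_PROMOTER, 15, "G", "A")
print(major, reverse_complement(major))
print(minor, reverse_complement(minor))
```

Keeping the flanks identical between the two probes means that any difference in the shifted band can be attributed to the exchanged allele alone, which is the comparison the assay relies on.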

weak binding of nuclear factors. Please note that both variants (M1 and rs150447075) are next to each other, that is, on position −407 and −406, respectively, and thus show an expected similar protein–DNA interaction pattern. Descriptively, when compared with the respective major allele sequences, we observed reduced binding of nuclear factors at minor allele variants rs34712024G, M2_−215A and M3_−43C, as well as pronounced increased binding for both minor alleles (GCG[7] and GCG[8]) of variant rs71781329, and a mildly increased binding to M4_−26C. In contrast to the similar protein–DNA interaction pattern, minor allele rs150447075G at position −406 did not, whereas M1_−407delC did affect binding of nuclear factors.

CNTNAP2 expression levels during neuronal differentiation

To characterize transcriptional activation of CNTNAP2 during neuronal differentiation we measured mRNA expression levels over 264 h at seven time points during brain derived neurotrophic factor induced neuronal differentiation. The most critical time points were selected for differential functional analysis in the subsequent luciferase assay (see below). mRNA levels of CNTNAP2 were −3.77-fold downregulated 72 h post-induction of differentiation (PD) when compared with time point 0 h PD with a subsequent re-induction of expression reaching its peak at 216 h PD (Figure 3a). This downregulation is in agreement with previous studies on human neuronal progenitor cells,41 and has been reported during mid-fetal development in a post-mortem brain expression study.42

In-silico analysis of TFBS

All identified variants were predicted to change TFBS in-silico (Table 1 and Supplementary Table 3). The deletion of the C allele 407 bp 5′ of the transcription start (novel variant M1) was predicted to disrupt the binding sequences for transcriptionally active estrogen and retinoic acid receptors, as well as for eomesodermin (EOMES). In addition, two new TFBS for GA repeat binding protein, beta 1 and Wilms Tumor Suppressor 1 are generated. Of these transcription factors, the latter three are expressed in brain or nervous system.

The novel variant M2, a paternally inherited G>A mutation at position −215 was predicted to reduce binding of brain expressed PLAG1 (coded on the forward strand) as well as the non-brain–specific TFs Spermatogenic Zip1 factor and ZNF263 (zinc-finger protein 263). In parallel, an additional binding site for ETV1 (Ets variant 1), a brain expressed transcriptional regulator involved in cell differentiation, is generated.

The third identified variant at position −43, a G to C nucleotide change (M3_−43C), potentially generates TFBS for the brain expressed transcription factors CREB (cAMP-responsive element binding protein), EGR3 (early growth response 3), VMAF (v-Maf), ATF (activating transcription factor) and WHN (winged helix protein), while disrupting sequences for non-brain–specific TFs only, namely, CTCF (CCCTC-binding factor) and PAX5 (paired-Box transcription factor 5).

The paternally inherited −26_G>C mutation (variant M4) was predicted to generate additional TFBS for 11 different transcription factors of which only EGR1 (early growth response 1) is active in brain.

Similarly, all three known variants detected in our sequencing sample were predicted to change binding sites for brain expressed transcription factors in silico: Carriers of the minor (G) allele of rs150447075 were predicted to carry an additional TFBS for transcription factors GLI3 (GLI family zinc-finger 3), TLX1 (T-cell leukemia homeobox 1) and EGR1 (Early growth response 1) while losing TFBS for EOMES. The minor allele rs34712024G generates an additional recognition site for the brain expressed TF EGR1. For rs71781329, an increased number of TFBS for EGR1 was predicted by the number of repeats (that is, 3 TFBS for GCG[6], 4 TFBS for GCG[7] and 5 TFBS for GCG[8]).

Expression levels of transcription factors during neuronal differentiation

To determine the regulatory role of the transcription factors we measured the respective mRNA levels during neuronal differentiation of SH-SY5Y cells. One of the most frequently in-silico identified TFBS is bound by EGR1. This early growth response gene 1 was strongly upregulated (Figure 3b), reaching its maximum expression level 120 h PD, that is, at the same time point when CNTNAP2 expression was reactivated after its reduction. We observed a similar pattern for the paralog EGR4 (Figure 3b). In addition, we found an early phase upregulation for ZNF219 (Zinc-finger protein 219), a transcriptional repressor,43 followed by wave-like regulation before its level was reduced below detection limit. The lowest

Figure 3. Characterization of CNTNAP2 regulation during neuronal differentiation. (a) During 264 h of neuronal differentiation, CNTNAP2 is reduced until 72 h post induction of differentiation (PD) followed by a recovery phase until 216 h PD. In parallel, markers for neuronal differentiation (MAPT) are constantly increased, whereas markers for cell division decrease after a short activation period. (b) Transcriptional factors that bind to the target region are regulated during neuronal differentiation. Only factors with a log2 fold-change >2 are depicted. EGR1, EGR4 and ZNF219 are the most strongly regulated factors and are thus most likely to regulate transcriptional activity of the CNTNAP2 5′ promoter. Interestingly, EGR1 and EGR4 (early growth response 1/4) are most strongly regulated in a time window where CNTNAP2 transcription is reactivated, and the transcriptional repressor ZNF219 is slightly downregulated. (c) The 5′ promoter of the CNTNAP2 gene is weakly activated in the HEK293T cells (luciferase assay) and undifferentiated SH-SY5Y cells. The strongest activation of the promoter is observed 216 h after induction of differentiation. Colored squares correspond to the mean values of three biological replicates of the different variants under study here. For further details on relative fold-changes see Figure 4.
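The fold-changes shown in panels a and b are relative mRNA levels, which the Materials and Methods state were obtained with the 2^−ΔΔCT method and two reference genes. The Python sketch below illustrates that calculation under stated assumptions: all Ct values are invented, amplification efficiency is taken as 100%, and the function names are hypothetical. Under these assumptions, normalizing to the geometric mean of the reference-gene quantities is equivalent to subtracting the arithmetic mean of the reference Ct values.

```python
# Hypothetical sketch of a 2^-ddCT calculation with two reference genes.
# All Ct values below are invented for illustration only.
from math import log2
from statistics import mean


def fold_change(ct_target, ct_refs, ct_target_cal, ct_refs_cal):
    """Relative expression of a target gene versus a calibrator sample.

    ct_refs / ct_refs_cal: Ct values of the reference genes (e.g. two genes).
    Averaging Ct values corresponds to the geometric mean of 2^-Ct quantities.
    """
    d_ct_sample = ct_target - mean(ct_refs)              # dCT of the sample
    d_ct_calibrator = ct_target_cal - mean(ct_refs_cal)  # dCT of the calibrator (e.g. 0 h)
    dd_ct = d_ct_sample - d_ct_calibrator
    return 2.0 ** (-dd_ct)


def signed_fold_change(ratio):
    """Express ratios below 1 as negative fold-changes (0.25 becomes -4.0)."""
    return ratio if ratio >= 1.0 else -1.0 / ratio


# Invented example: target Ct rises from 24 to 26 while the references stay at 20 and 22.
ratio = fold_change(26.0, [20.0, 22.0], 24.0, [20.0, 22.0])
print(ratio, signed_fold_change(ratio), log2(ratio))
```

In this toy example the target is detected two cycles later relative to the references, which corresponds to a ratio of 0.25, a signed fold-change of −4 and a log2 fold-change of −2. The signed convention is the one in which a downregulation would be written as a negative fold-change.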


Table 1. Transcription factors predicted to bind to variants (MatInspector results)

Variant M1 M2 M3 M4 rs150447075 rs34712024 rs71781329

TF Strand TF Strand TF Strand TF Strand TF Strand TF Strand TF Strand

No changes when minor allele present: NRSF + NUDR − AHRARNT + AHRARNT + NRSF + EGR1 − CTCF + BNC − MOK2 − NRF1 − NRF1 + BNC − − NGFIC + XCPE1 − ZF5 + ER + NM23 − HDBP1 − SP1 − ZIC2 − SP1 − WT1 − EGR1 + MAZR − EGR1 + – ZNF300 + EGR1 + KLF7 − ZNF219 + AP4 − MYOGENIN + AP4 + − − −

Additional TFBS when minor allele present: GABPB1 ETV1 + CREB + SP4 GLI3 + CTCF (GCG[7] or GCG[8]) WT1 − EGR3 − NRF1 − TLX1 − SP1 − HDBP1 − VMAF + EGR1 − SP1 − EGR1 − EGR1 + ATF + NRF1 + EGR1 + HDBP1 − WHN + HMTE − (GCG[8] only) CTCF − ZNF300 + HDBP1 − CTCF − EGR1 + PAX9 − ZNF219 + PAX5 − DMTE −

Deleted TFBS when minor allele present: ER + PLAG1 + CTCF + EOMES − ZNF300 + EOMES − ZNF263 − PAX5 + XCPE1 − SP1 − RAR_RXR − SPZ1 + RAR_RXR −

Abbreviations: AHRARNT, aryl-hydrocarbon-receptor/aryl-hydrocarbon-receptor-nuclear-translocator dimer; AP4, activator protein 4; ATF, activating transcription factor; BNC, basonuclin, cooperates with UBF1 in rDNA PolI transcription; CREB, cAMP-responsive element binding protein; CTCF, CCCTC-binding factor; DMTE, Drosophila motif ten element; DRE, dioxin response elements; E2F4, E2F transcription factor 4, p107/p130-binding protein; EGR1, early growth response 1; EGR3, early growth response gene 3 product; EOMES, Eomesodermin, TBR-2 (secondary DNA binding preference); ER, estrogen response elements, IR3 sites; ETV1, Ets variant 1; GABPB1, GA repeat binding protein beta 1; GLI3, GLI-Kruppel family member GLI3; HDBP1_2, Huntington's disease gene regulatory region-binding protein 1 and 2 (SLC2A4 regulator and papillomavirus binding factor); HMTE, human motif ten element; KLF7, kruppel-like factor 7 (ubiquitous, UKLF); MAZR, MYC-associated zinc-finger protein related transcription factor; MOK2, ribonucleoprotein associated zinc-finger protein MOK-2 (mouse); MYOGENIN, myogenic bHLH protein myogenin (myf4); NGFIC, nerve growth factor-induced protein C; NM23, NME/NM23 nucleoside diphosphate kinase 1 and 2; NRF1, nuclear respiratory factor 1, bZIP transcription factor that acts on nuclear genes encoding mitochondrial proteins; NRSF, neuron-restrictive silencer factor; NUDR, nuclear DEAF-1 related transcriptional regulator protein; PAX5, B-cell-specific activator protein; PAX9, zebrafish PAX9 binding sites; PLAG1, pleomorphic adenoma gene 1, a developmentally regulated C2H2 zinc-finger protein; RAR_RXR, retinoic acid receptor/retinoid X receptor heterodimer, DR1 sites; SP1, stimulating protein 1, ubiquitous zinc-finger transcription factor; SP4, Sp4 transcription factor; SPZ1, spermatogenic Zip 1 transcription factor; TFBS, transcription factor binding site; TLX1, T-cell leukemia homeobox 1; VMAF, v-Maf; WHN, winged helix protein, involved in hair keratinization and thymus epithelium differentiation; WT1, Wilms tumor suppressor; XCPE1, X gene core promoter element 1; XRE, xenobiotic response elements, bound by AHRARNT heterodimers; ZF5, zinc finger/POZ domain transcription factor; ZIC2, zic family member 2 (odd-paired Drosophila homolog; secondary DNA binding preference); ZNF219, kruppel-like zinc-finger protein 219; ZNF263, zinc-finger protein 263; ZKSCAN12, zinc-finger protein with KRAB and SCAN domains 12; ZNF300, KRAB-containing zinc-finger protein 300. Bold indicates core sequence of TFBS spans variant. Underscored indicates transcription factor is expressed in brain and/or nervous system.

Figure 4. Functional analysis of promoter activation dependent on cell type, SH-SY5Y differentiation status and genetic variants. Promoter activity is influenced by genetic variants depending on cell type and differentiation status. Luciferase assays have been performed to compare the promoter activity of major/wild-type alleles to the respective minor alleles. Mean fold changes compared with the respective reference vectors (wild-type/major alleles vector) are shown with standard deviation (whiskers). Asterisks mark significance as tested in analysis of variance for repetitive measures compared with reference vector (+P<0.1; *P<0.05; **P<0.01). PD, post induction of differentiation.
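The quantities plotted in Figure 4 (mean fold change relative to the reference vector, with standard deviation) can be sketched as follows. This is only an illustration under stated assumptions: the readings are hypothetical normalised luciferase signals, each minor-allele replicate is assumed to be paired with a reference-vector replicate from the same experiment, and the function name `fold_changes` is invented for this example. The repeated-measures analysis of variance used for the significance marks is not reproduced here.

```python
from statistics import mean, stdev

def fold_changes(variant_readings, reference_readings):
    """Per-replicate fold change of a minor-allele vector versus its reference vector."""
    return [v / r for v, r in zip(variant_readings, reference_readings)]

# Hypothetical normalised luciferase readings from three biological replicates.
reference_vector = [100.0, 110.0, 90.0]  # wild-type/major allele construct
minor_allele_vector = [50.0, 49.5, 45.0]  # construct carrying the minor allele

fc = fold_changes(minor_allele_vector, reference_vector)
fc_mean = mean(fc)   # bar height in a plot like Figure 4
fc_sd = stdev(fc)    # whisker length (sample standard deviation)
print(f"FC = {fc_mean:.2f}; s.d. = {fc_sd:.2f}")
```

A fold change below 1 corresponds to reduced promoter efficiency of the minor allele relative to the reference construct, and a value above 1 to increased efficiency.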

peak during the regulatory wave coincided with the reactivation of CNTNAP2 expression 120 h PD.

Luciferase assay
The 1-kb region upstream of the TSS of CNTNAP2 showed significant transcriptional activation in the luciferase assay when compared with the empty vector, thus confirming its effect as transcriptional cis-promoter. In line with the CNTNAP2 mRNA expression profile, all vector constructs harboring the promoter region of interest showed low activity at time points 0 and 72 h PD, and a remarkable increase at time point 216 h (Figure 3c) when CNTNAP2 mRNA expression reached its maximum after reactivation. Subsequent analysis of the promoter activity in SH-SY5Y cells at each individual time point (0, 72 and 216 h PD) and in undifferentiated HEK293T cells showed cell line-specific and PD stage-dependent effects (Figure 4 and Supplementary Table 4).
In-vitro, M1_−407delC significantly increased transcriptional activation in undifferentiated HEK293T (fold-change (FC) = 1.47; s.d. = 0.05; P=0.032) and SH-SY5Y cells (FC = 1.62; s.d. = 0.09; P=0.010). Descriptively, this effect was also present in differentiating SH-SY5Y cells but could not be confirmed statistically. When compared with the wild-type allele G, the A allele of mutation M2 significantly reduced transcriptional efficiency in mitotically active cells (HEK293T: FC = 0.45; s.d. = 0.03; P=0.036; SH-SY5Y: FC = 0.53; s.d. = 0.04; P=0.013) as well as in differentiating SH-SY5Y cells (72 h PD: FC = 0.45; s.d. = 0.02; P=0.047; 216 h PD: FC = 0.618; s.d. = 0.06; P=0.071). The novel identified C nucleotide at M3_−43 significantly increased transcriptional activity of the promoter in undifferentiated neuroblastoma SH-SY5Y (FC = 1.36; s.d. = 0.08; P=0.036) but not in HEK293T cells (FC = 0.99; s.d. = 0.13; P=0.973). A trend toward an upregulation was also observed in differentiating SH-SY5Y cells (72 h PD: FC = 1.38; s.d. = 0.16; P=0.243; 216 h PD: FC = 2.17; s.d. = 0.24; P=0.056). For the M4_−26C allele we did not observe any changes compared with the respective wild-type promoter sequence.
Compared to the major allele promoter, we found a significantly increased activation of the rs150447075G promoter in undifferentiated SH-SY5Y (FC = 1.07; s.d. = 0.03; P=0.005) and HEK293T (FC = 1.52; s.d. = 0.39; P=0.028) cells but not during neuronal differentiation. The rs34712024G allele significantly increased transcriptional efficiency in HEK293T cells (FC = 1.61; s.d. = 0.44; P=0.040) but not at any differentiation stage of SH-SY5Y cells. The transcriptional efficiency of the rs71781329 minor alleles depended on the number of GCG repeats in HEK293T cells: GCG[7] did not change efficiency (FC = 1.22; s.d. = 0.3; P=0.107), whereas GCG[8] induced a 1.66-fold increase (P=0.024). In the SH-SY5Y cell line, we observed a contrasting effect. GCG[7] slightly downregulated efficiency in undifferentiated SH-SY5Y cells (FC = 0.94; s.d. = 0.03; P=0.061) and GCG[8] clearly reduced expression at stage 72 h PD (FC = 0.91; s.d. = 0.03; P=0.028).

Table 2. Single marker transmission disequilibrium test

SNP (minor/major alleles) | MAFa | OR minor allele, N = 592 families | P-value
rs150447075T/G | 0.023 | 0.92, CI95: 0.51–1.64 | 0.768
rs34712024A/G | 0.016 | 0.41, CI95: 0.19–0.89 | 0.018b
rs71781329GCG[6]/GCG(7_8) | 0.006 | 0.88, CI95: 0.32–2.42 | 0.796
rs71781329GCG[6]/GCG[7]c | 0.003 | 0.60, CI95: 0.14–2.51 | 0.477
rs71781329GCG[6]/GCG[8]d | 0.003 | 1.34, CI95: 0.29–6.14 | 0.704

Abbreviations: CI, confidence interval; MAF, minor allele frequency (calculated on all samples); OR, odds ratio. aMinor allele frequency within our autism spectrum disorders sample. bP<0.05. cFamilies of rs71781329GCG[7] allele carriers were omitted; N = 584 families were included into analysis. dFamilies of rs71781329GCG[8] allele carriers were omitted; N = 588 families were included into analysis.

Single marker association study in the extended ASD sample
In the extended sample, a significant under-transmission of allele rs34712024G to ASD patients (odds ratio (OR) = 0.41; confidence interval at 95% (CI95) = 0.19–0.89; P=0.018; Table 2) was observed. When correcting for three statistical tests, the corrected P-value still shows a trend (P=0.054), but no longer reaches significance at the 5% level. The genotype-based association test yielded a similar result (OR [A/G] = 0.45; CI95 = 0.38–0.96; P=0.034). No homozygous [G/G] genotype was detected in any patient. An additional analysis including only individuals with autistic disorder did not explain more of ASD risk (OR = 0.45; CI95 = 0.20–0.99; P=0.039). For the trimeric STR rs71781329 (7+8 repeats combined), no association was observed (OR = 0.88; CI95 = 0.32–2.42; P=0.796). Descriptively, rs71781329GCG[7] was under-transmitted (OR = 0.60; CI95 = 0.14–2.51; P=0.477) and rs71781329GCG[8] (OR = 1.34; CI95 = 0.29–6.14; P=0.704) was over-transmitted when compared with rs71781329GCG[6] (Table 2). No gender-specific or imprinting effects were observed (data not shown). No co-occurrence of any minor alleles was observed with the exception of one index patient and an unrelated mother carrying both minor alleles of rs150447075


Table 3. Non-parametric regression models of genetic effects on ADI-R language measures

Each outcome cell gives T, P.

Model | First words (ADI-R A9) | First phrase (ADI-R A10) | First words, male only (ADI-R A9) | First phrase, male only (ADI-R A10)

rs150447075 TT vs TG/GG | 0.979, 0.322 | 0.943, 0.331 | 0.265, 0.606 | 0.175, 0.676
Age | −0.290, 0.772 | 0.184, 0.854 | −0.154, 0.877 | 0.385, 0.700
IQ | −4.942, <0.0001 | −6.039, <0.0001 | −4.424, <0.0001 | −5.652, <0.0001
N (carriers G) | 383 (17) | 336 (16) | 331 (14) | 292 (13)
Mean (s.d.), TT | 25.48 (14.42) | 37.69 (17.95) | 24.82 (14.14) | 37.01 (18.05)
Mean (s.d.), TG/GG | 21.65 (9.51) | 33.94 (10.47) | 21.93 (10.41) | 33.77 (10.19)

rs34712024 AA vs AG/GG | 2.953, 0.086 | 0.016, 0.899 | 3.362, 0.067 | 0.027, 0.871
Age | −0.277, 0.781 | 0.176, 0.861 | −0.142, 0.887 | 0.375, 0.707
IQ | −4.808, <0.0001 | −5.968, <0.0001 | −4.293, <0.0001 | −5.602, <0.0001
N (carriers G) | 383 (10) | 336 (8) | 331 (10) | 292 (8)
Mean (s.d.), AA | 25.05 (14.07) | 37.36 (17.52) | 24.39 (13.76) | 36.67 (17.60)
Mean (s.d.), AG/GG | 34.80 (18.50) | 43.63 (23.80) | 34.80 (18.50) | 43.63 (23.80)

rs71781329 GCG[6]/[6] vs [6]/(7_8) | 2.689, 0.101 | 0.581, 0.446 | 1.207, 0.272 | 0.039, 0.843
Age | −0.180, 0.858 | 0.237, 0.813 | −0.005, 0.996 | 0.447, 0.655
IQ | −4.866, <0.0001 | −6.036, <0.0001 | −4.329, <0.0001 | −5.664, <0.0001
N (carriers GCG(7_8)) | 380 (5) | 334 (8) | 328 (4) | 290 (4)
Mean (s.d.), GCG[6]/[6] | 25.03 (13.62) | 37.22 (16.98) | 24.44 (13.27) | 36.63 (17.07)
Mean (s.d.), GCG[6]/(7_8) | 45.00 (37.11) | 58.80 (42.51) | 44.25 (42.81) | 55.50 (48.34)

rs71781329 GCG[6]/[6] vs [6]/[7] | 83.972, <0.0001 | 19.135, <0.0001 | Not performed as all carriers were male
Age | −0.304, 0.761 | 0.084, 0.93341
IQ | −4.954, <0.0001 | −6.249, <0.0001
N (carriers GCG[7]) | 377 (2) | 331 (2)
Mean (s.d.), GCG[6]/[6] | 25.03 (13.62) | 37.22 (16.98)
Mean (s.d.), GCG[6]/[7] | 78.00 (42.43) | 99.00 (38.18)

rs71781329 GCG[6]/[6] vs [6]/[8] | 0.109, 0.741 | 0.104, 0.748 | Not performed as all carriers were male
Age | −0.309, 0.757 | 0.046, 0.964
IQ | −4.857, <0.0001 | −6.078, <0.0001
N (carriers GCG[8] allele) | 378 (3) | 332 (3)
Mean (s.d.), GCG[6]/[6] | 25.03 (13.62) | 37.22 (13.98)
Mean (s.d.), GCG[6]/[8] | 23.00 (6.25) | 32.00 (13.86)

ADI-R, autism-diagnostic interview-revised; IQ, intelligence quotient; N, total number of samples included in the model with carriers of the respective minor alleles in parentheses; Mean, mean months at first words or first phrase, respectively; P, P-value of non-parametric regression model; T, T-value of non-parametric regression model.
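For readers less familiar with the single marker transmission disequilibrium test summarised in Table 2, the following minimal sketch shows how an odds ratio, its 95% confidence interval, a McNemar-type chi-square P-value and a multiplicity-corrected P-value can be obtained from transmission counts. The counts are hypothetical and are not the study's data; the Wald-type confidence interval and the Bonferroni multiplication are assumptions made for illustration (the reported corrected value P=0.054 is consistent with multiplying P=0.018 by three tests, but the correction method is not named in the text).

```python
import math

def tdt(transmitted, untransmitted):
    """Transmission disequilibrium test for one allele from heterozygous parents."""
    odds_ratio = transmitted / untransmitted
    se_log_or = math.sqrt(1 / transmitted + 1 / untransmitted)  # Wald standard error
    ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    chi2 = (transmitted - untransmitted) ** 2 / (transmitted + untransmitted)
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival function, 1 d.f.
    return odds_ratio, (ci_low, ci_high), p_value

def bonferroni(p_value, n_tests):
    """Bonferroni-corrected P-value, capped at 1."""
    return min(1.0, p_value * n_tests)

# Hypothetical counts: minor allele transmitted 9 times, not transmitted 22 times.
odds_ratio, ci95, p_value = tdt(transmitted=9, untransmitted=22)
print(f"OR = {odds_ratio:.2f}; CI95 = {ci95[0]:.2f}-{ci95[1]:.2f}; P = {p_value:.3f}")

# Correcting the reported nominal P-value for three tests.
corrected = bonferroni(0.018, 3)
print(f"corrected P = {corrected:.3f}")
```

An odds ratio below 1 indicates under-transmission of the allele to affected offspring, which is the pattern described for rs34712024G.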

and rs34712024. Thus, no haplotype-based association analysis or epistasis analysis was performed.

Influence of promoter variants on ADI-R-derived language measures
To explore if any of the three known variants had an effect on language development, non-parametric regression models were calculated. Age at diagnosis and IQ are known to influence language development and were thus included as covariates. Subjects with genotypes rs71781329GCG[6]/[7] but not GCG[6]/[8] showed a significantly (P<0.00001) older age at first words and at first spoken phrase compared with carriers of GCG[6]/[6] (for details on mean differences see Table 3). However, the number of subjects carrying the minor alleles was very low (GCG[6]/[7] N = 2 and GCG[6]/[8] N = 3); thus, these results have to be interpreted with caution. The other known variants were not associated with language delay in ASD (Table 3).

DISCUSSION
In search of an additional sequence element regulating CNTNAP2 and of genetic variants related to ASD, we Sanger sequenced and functionally analyzed the 5′ promoter sequence of CNTNAP2 in an ASD trio cohort. Furthermore, we tested three polymorphisms for association with ASD and language delay within ASD in an extended sample. In contrast to the FOXP2 regulatory element in intron 1,10 the 5′ promoter studied here is involved in the upregulation of CNTNAP2 during neuronal differentiation. In-silico analysis of novel and known variants of this sequence predicted alternative TFBS for all minor alleles. In-vitro assays showed that all variants but M4_−26C differentially regulated promoter activity as a function of cellular background and neuronal differentiation stage, with potentially different pathological effects in-vivo. This strongly suggests that the variants mediate effects that are dependent on the transcription factor pattern, which in our study has been shown to be specific to cell type and differentiation stage. Our main findings are the description of a novel potentially pathological rare mutation of the CNTNAP2 5′ promoter, the association of the functional variant rs34712024 with ASD and of rs71781329 with language development.

Rare mutations may confer ASD risk via altered transcriptional efficiency
The strongest effect on promoter activity was observed for the paternally transmitted variant M2_−215A, which reduced transcriptional efficiency in all assays. The heterozygous index patient

clinically presented with autistic disorder, a diagnosis which includes a delayed onset of speech. The A allele of this variant disrupts the core TFBS for ZNF263, a transcriptional repressor and activator relevant during cell cycle regulation,44 and for SPZ1 (Spermatogenic Zip1 factor). The chromosomal region of ZNF263 on 16p13 has shown weak linkage with ASD (logarithm of the odds; LOD = 2.17).45 Both ZNF263 and SPZ1 have been nominally associated with ASD in a genome-wide haplotype analysis.46 The identification of SPZ1 and ZNF263 genes in studies for ASD suggests that common and rare variants of these transcription factors may trigger ASD phenotypes by decreasing transcriptional activation of their targeted promoter sequences. Similarly, the pathomechanism of M2_−215A may be driven by a reduced transcription factor binding. In conclusion, a reduced CNTNAP2 mRNA expression, as achieved either by reducing the affinity of the targeting transcription factors or by damaging mutations in their recognition pattern, may increase ASD risk.
Contrasting with M2, M1_−407delC and M3_−43C increased transcriptional efficiency of the CNTNAP2 promoter. M1 was identified in an unaffected half-sibling only, and M3 was not transmitted to affected offspring. The deletion of the C allele (M1) generated or improved the recognition elements for the brain-expressed transcription factors GABPB1 (GA Binding Protein Transcription Factor, Beta Subunit 1) and WT1 (Wilms Tumor 1). Interestingly, GABPB1 levels were reported to be downregulated in affected patients with ASD (Hu et al.47 Supplements) and the coding region shows strong linkage with ASD.48 Deletions of the WT1 gene are associated with the WAGR syndrome (Wilms tumor, Aniridia, Genitourinary malformations and mental Retardation). Patients with this rare syndrome have a high prevalence (>20%) of autistic features.49 Downregulation of GABPB1 or deletions of WT1 lead to reduced availability and thus also to reduced binding of these regulatory factors to the respective DNA sequences with a subsequent reduced transcription of the targeted genes. Increased binding of these factors, in contrast, likely increases promoter efficiency as observed here for the CNTNAP2 promoter variant M1, and thus may even be protective for ASD.
Variant M3_−43C generates a sequence pattern recognized by the brain-expressed transcriptional regulators CREB, ATF4 (alias CREB2), EGR3, VMAF and WHN (for abbreviations see Table 1). CREB transcription factors are regulated by glutamatergic signaling and activate transcription of the ASD-associated FMR1 gene (reviewed in ref. 50). The EGR3 gene has been associated with schizophrenia51 and its gene product functionally regulates axonal growth and dendritic branching.52 An increased recognition of these elements at the 5′ promoter may lead to an increased expression of the targeted gene as observed here in the non-differentiated neuroblastoid SH-SY5Y cells. The role of the binding factors EGR3 and CREB underscores the relevance of M3 on brain function. Taken together, we assume that an increased TF binding will increase expression of CNTNAP2, and this may lead to a protective effect of variant M3.
Variant M4_−26C increases similarity to the binding sequence of EGR1, a brain-expressed transcriptional activator.53 However, neither the EMSA nor the luciferase assay supports increased binding of transcriptional activators in the presence of M4_−26C. We hypothesize that either the predicted reverse binding of this nuclear factor will not affect transcriptional activity, or the binding site may be blocked by other nuclear factors. M4_−26C was detected in three independent families but only once transmitted to the index patient. This renders a pathological effect of M4_−26C very unlikely.
The notion of an increased TF-binding at the CNTNAP2 5′ promoter being protective is supported by the finding that both M1 and M3 were only detected in unaffected but ASD-related individuals within our sample. Owing to the rare frequency of the novel variants (N = 1), association analysis could not be performed. Thus, we cannot exclude a false-positive finding for M2 or prove that M1 and M3 may be protective. However, we speculate that increased promoter efficiency either may not be pathologically relevant or may even be protective and that a reduced CNTNAP2 expression increases risk for ASD.

CNTNAP2 variants are associated with ASD and language development
Testing the three known and more frequent promoter variants of CNTNAP2 for association with ASD showed a less frequent transmission of the minor G allele of rs34712024 from parents to patients in the detection and the extended sample. The protective genetic effect of the minor allele rs34712024G on ASD may be attributed to an additional binding of the brain-specific TF EGR1. This early growth-related transcriptional regulator is strongly regulated during neuronal differentiation. EGR1 plays a central role in the regulation of neuronal plasticity and differentiation. Synaptic activity, a mechanism putatively impaired in ASD,54 regulates EGR1 expression leading to altered transcription of targeted genes.53 In our cellular differentiation model, EGR1 and EGR4 mRNA levels are strongly upregulated when CNTNAP2 5′ promoter efficiency is highest. Furthermore, their expression levels decline shortly during the downregulation of CNTNAP2. Interestingly, EGR1 binds to the minor G allele of rs34712024 in reverse orientation. TFs can be active in both directions, but may exhibit different effects depending on their orientation.55,56 Again, we speculate that an increased binding affinity and thus an increased expression activity of the promoter may explain the protective effect of rs34712024G on ASD.
For STR rs71781329, we observed that each parental minor allele was transmitted to the index patients (complete transmission) in the detection set. In the extended sample, GCG[7] descriptively was under-transmitted and GCG[8] was over-transmitted to the offspring compared with GCG[6]. No nominal significance was achieved, mainly due to the low frequency of these minor alleles. With regard to delayed language within the ASD sample, GCG[7] but not GCG[8] did show a highly significant association with later onset of speech. This finding has to be interpreted with caution as only two individuals carrying the respective allele were observed in our sample. Still, the findings support the reported functional role of CNTNAP2 in language acquisition.10,12,25
At the level of transcriptional efficiency, we observed differential effects of the two identified minor alleles depending on the number of inserts, that is, on the number of TFBS for EGR1, and related to the cellular background, that is, SH-SY5Y or HEK293T cells. Again, the minor alleles of rs71781329 (GCG[7/8]) increased the number of TFBS for EGR1. In contrast to the EGR1 binding sequence spanning the aforementioned variant rs34712024, the EGR1 binding sites on rs71781329 are oriented in the forward direction and may thus have a different effect.55 In the SH-SY5Y model, the GCG[7]-repeat but not the GCG[8]-repeat of rs71781329 showed a slight reduction of promoter efficiency in the pre-differentiating cells. During later stages of differentiation, hardly any difference in transcriptional efficiency was observed between the different repeat-carrying cells. GCG[8] but not GCG[7] reduced transcriptional efficiency 72 h after induction of differentiation, that is, when CNTNAP2 mRNA expression was lowest.
Following the idea of a decreased CNTNAP2 expression increasing ASD risk, and in light of the association findings, differential CNTNAP2 expression in early stages of neuronal differentiation driven by the GCG[7] allele, but not in a later stage as observed for GCG[8], may be correlated with language delay in ASD.

Taken together, we propose a model in which number and orientation of TFBS at the 5′ CNTNAP2 promoter determine the pathogenic effect of the variants investigated here, and lead to the observed altered transcriptional efficiency. In this model, the upregulation of CNTNAP2 exhibits protective effects whilst a downregulation confers increased risk to ASD. This is underscored by our finding that on the one hand all variants upregulating transcriptional efficiency in undifferentiated SH-SY5Y cells were potentially protective (rs34712024G) or had no effect on ASD (M1_−407delC; M3_−43C or M4_−26C), while on the other hand the identified variants decreasing transcriptional efficiency in the undifferentiated SH-SY5Y model (rs71781329GCG[7] and M2_−215A) were possible risk factors for autistic disorder and language delay. This would imply that the CNTNAP2 level prior to the induction of neuronal differentiation, rather than the regulatory pattern during this process, presents the critical element for modulating ASD risk.
In the brain, CNTNAP2 is ubiquitously expressed pre- and perinatally, but its expression is downregulated during mid-fetal development and early childhood (age 1–6 years).42 Subsequently, CNTNAP2 is re-expressed in all brain regions besides the striatum and the cerebellar cortex in later life. Thus, this study confirms a brain tissue-specific regulation of CNTNAP2. This is in line with the reported downregulation during neuronal differentiation of human neuronal precursor cells.41 We confirmed this time- and cell-specific downregulation in the HEK293T cells and the SH-SY5Y cell line model for neuronal differentiation. An aberrant transcriptional fine tuning, an imbalanced regulation or an impaired reactivation of CNTNAP2 may thus underlie the reported ASD-specific effects on language acquisition or hierarchical learning. Both processes are related to the striatum or the cerebellar cortex, which are both strongly involved in ASD neuropathology.57–60 Thus, an imbalanced regulation of CNTNAP2 in any of these functional brain areas, or more specifically a downregulation of CNTNAP2 not compatible with normal neuronal development, may modulate ASD risk.
Our hypothesis is underpinned by the fact that CNV studies in ASD so far only identified gene copy losses but no gains spanning the CNTNAP2 region.22,61,62 Furthermore, a reduction of CNTNAP2 levels in lymphoblastoid cells driven by a reduced copy number of the CNTNAP2 5′ promoter has previously been postulated as risk factor for ASD.22

CONCLUSION
In summary, we performed a genetic and functional characterization study of the CNTNAP2 promoter in ASD. We report a novel and potentially pathogenic mutation M2_−215A and association of rs34712024G with ASD, which might be mediated by an altered transcriptional efficiency of the CNTNAP2 promoter. We conclude that a dosage-dependent effect of CNTNAP2 shapes the etiology of the disorder. Our findings, and specifically the proposed role of rs71781329 in language development and of rs34712024 in ASD, should be studied in more detail in a larger ASD sample. As none of the three known variants is genotyped on any currently available genome-wide platform, (next-generation) sequencing approaches might prove more beneficial in the identification of CNTNAP2 risk variants in ASD. It is suggested that variants that do not reach the threshold for significance after conservative correction for multiple testing should not be immediately dismissed, but rather subjected to in-silico analyses to decide whether to look at functional effects. Finally, to prove causality and to fully understand the biological role of CNTNAP2 expression, specific animal models and patient-derived cell line models need to be studied in more detail.

CONFLICT OF INTEREST
The authors declare no conflict of interest.

ACKNOWLEDGMENTS
We thank all the families and patients for their cooperation and the clinical staff for their support in data collection. We thank Heiko Zerlaut for database management, and Cornelia Wirth and Silvia Lindlar for excellent technical assistance. The study was in part supported by grant Po 255/17-4 of the Deutsche Forschungsgemeinschaft to F Poustka, and grants T 6031000-45 of Saarland University and ERA-NET NEURON/BMBF EUHFAUTISM-01EW1105 to CM Freitag.

REFERENCES
1 Baird G, Simonoff E, Pickles A, Chandler S, Loucas T, Meldrum D et al. Prevalence of disorders of the autism spectrum in a population cohort of children in South Thames: the Special Needs and Autism Project (SNAP). Lancet 2006; 368: 210–215.
2 Brugha TS, McManus S, Bankart J, Scott F, Purdon S, Smith J et al. Epidemiology of autism spectrum disorders in adults in the community in England. Arch Gen Psychiatry 2011; 68: 459–465.
3 WHO. International Classification Of Mental And Behavioral Disorders. Clinical Descriptions And Diagnostic Guidelines, 10th edn. World Health Organization: Geneva, 1992.
4 APA. Diagnostic And Statistical Manual Of Mental Disorders, Fourth Edition, Text Revision (DSM-IV-TR®). American Psychiatric Association, 4th edn. American Psychiatric Publishing: Arlington, VA, USA, 2000.
5 Freitag CM, Staal W, Klauck SM, Duketis E, Waltes R. Genetics of autistic disorders: review and clinical implications. Eur Child Adolesc Psychiatry 2010; 19: 169–178.
6 Lichtenstein P, Carlström E, Råstam M, Gillberg C, Anckarsäter H. The genetics of autism spectrum disorders and related neuropsychiatric disorders in childhood. Am J Psychiatry 2010; 167: 1357–1363.
7 Peñagarikano O, Geschwind DH. What does CNTNAP2 reveal about autism spectrum disorder? Trends Mol Med 2012; 18: 156–163.
8 Alarcón M, Cantor RM, Liu J, Gilliam TC, Geschwind DH. Evidence for a language quantitative trait locus on chromosome 7q in multiplex autism families. Am J Hum Genet 2002; 70: 60–71.
9 Arking DE, Cutler DJ, Brune CW, Teslovich TM, West K, Ikeda M et al. A common genetic variant in the neurexin superfamily member CNTNAP2 increases familial risk of autism. Am J Hum Genet 2008; 82: 160–164.
10 Vernes SC, Newbury DF, Abrahams BS, Winchester L, Nicod J, Groszer M et al. A functional genetic link between distinct developmental language disorders. N Engl J Med 2008; 359: 2337–2345.
11 Newbury DF, Paracchini S, Scerri TS, Winchester L, Addis L, Richardson AJ et al. Investigation of dyslexia and SLI risk variants in reading- and language-impaired subjects. Behav Genet 2011; 41: 90–104.
12 Whitehouse AJO, Bishop DVM, Ang QW, Pennell CE, Fisher SE. CNTNAP2 variants affect early language development in the general population. Genes, Brain and Behav 2011; 10: 451–456.
13 Whalley HC, O'Connell G, Sussmann JE, Peel A, Stanfield AC, Hayiou-Thomas ME et al. Genetic variation in CNTNAP2 alters brain function during linguistic processing in healthy individuals. Am J Med Genet B Neuropsychiatr Genet 2011; 156: 941–948.
14 Eyler LT, Pierce K, Courchesne E. A failure of left temporal cortex to specialize for language is an early emerging and fundamental property of autism. Brain 2012; 135: 949–960.
15 Bakkaloglu B, O'Roak BJ, Louvi A, Gupta AR, Abelson JF, Morgan TM et al. Molecular cytogenetic analysis and resequencing of contactin associated protein-like 2 in autism spectrum disorders. Am J Hum Genet 2008; 82: 165–173.
16 O'Roak BJ, Deriziotis P, Lee C, Vives L, Schwartz JJ, Girirajan S et al. Exome sequencing in sporadic autism spectrum disorders identifies severe de novo mutations. Nat Genet 2011; 43: 585–589.
17 Caselli R, Mencarelli MA, Papa FT, Ariani F, Longo I, Meloni I et al. Delineation of the phenotype associated with 7q36.1q36.2 deletion: long QT syndrome, renal hypoplasia and mental retardation. Am J Med Genet A 2008; 146: 1195–1199.
18 Rossi E, Verri AP, Patricelli MG, Destefani V, Ricca I, Vetro A et al. A 12Mb deletion at 7q33–q35 associated with autism spectrum disorders and primary amenorrhea. Eur J Med Genet 2008; 51: 631–638.
19 Petrin AL, Giacheti CM, Maximino LP, Abramides DVM, Zanchetta S, Rossi NF et al. Identification of a microdeletion at the 7q33-q35 disrupting the CNTNAP2 gene in a Brazilian stuttering case. Am J Med Genet A 2010; 152: 3164–3172.
20 Poot M, Beyer V, Schwaab I, Damatova N, Slot R, Prothero J et al. Disruption of CNTNAP2 and additional structural genome changes in a boy with speech delay and autism spectrum disorder. Neurogenetics 2010; 11: 81–89.
21 Sehested LT, Møller RS, Bache I, Andersen NB, Ullmann R, Tommerup N et al. Deletion of 7q34-q36.2 in two siblings with mental retardation, language delay, primary amenorrhea, and dysmorphic features. Am J Med Genet A 2010; 152: 3115–3119.

22 Nord AS, Roeb W, Dickel DE, Walsh T, Kusenda M, O'Connor KL et al. Reduced transcript expression of genes affected by inherited and de novo CNVs in autism. Eur J Hum Genet 2011; 19:727–731.
23 Peñagarikano O, Abrahams BS, Herman EI, Winden KD, Gdalyahu A, Dong H et al. Absence of CNTNAP2 leads to epilepsy, neuronal migration abnormalities, and core autism-related deficits. Cell 2011; 147:235–246.
24 Poliak S. Juxtaparanodal clustering of Shaker-like K+ channels in myelinated axons depends on Caspr2 and TAG-1. J Cell Biol 2003; 162:1149–1160.
25 Alarcon JM, Abrahams BS, Stone JL, Duvall JA, Perederiy JV, Bomar JM et al. Linkage, association, and gene-expression analyses identify CNTNAP2 as an autism-susceptibility gene. Am J Hum Genet 2008; 82:150–159.
26 Lesch KP, Bengel D, Heils A, Sabol SZ, Greenberg BD, Petri S et al. Association of anxiety-related traits with a polymorphism in the serotonin transporter gene regulatory region. Science 1996; 274:1527–1531.
27 Shastry BS. SNPs: impact on gene function and phenotype. Methods Mol Biol 2009; 578:3–22.
28 Chen Y, Liu T, Yu C, Chiang T, Hwang C. Effects of GC bias in next-generation-sequencing data on de novo genome assembly. PLoS ONE 2013; 8:e62856.
29 Klauck SM, Felder B, Kolb-Kokocinski A, Schuster C, Chiocchetti A, Schupp I et al. Mutations in the ribosomal protein gene RPL10 suggest a novel modulating disease mechanism for autism. Mol Psychiatry 2006; 11:1073–1084.
30 Freitag CM, Agelopoulos K, Huy E, Rothermundt M, Krakowitzky P, Meyer J et al. Adenosine A(2A) receptor gene (ADORA2A) variants may increase autistic symptoms and anxiety in autism spectrum disorder. Eur Child Adolesc Psychiatry 2010; 19:67–74.
31 Lord C, Rutter M, Le Couteur A. Autism Diagnostic Interview-Revised: a revised version of a diagnostic interview for caregivers of individuals with possible pervasive developmental disorders. J Autism Dev Disord 1994; 24:659–685.
32 Lord C, Risi S, Lambrecht L, Cook EH Jr., Leventhal BL, DiLavore PC et al. The autism diagnostic observation schedule-generic: a standard measure of social and communication deficits associated with the spectrum of autism. J Autism Dev Disord 2000; 30:205–223.
33 Quandt K, Frech K, Karas H, Wingender E, Werner T. MatInd and MatInspector: new fast and versatile tools for detection of consensus matches in nucleotide sequence data. Nucleic Acids Res 1995; 23:4878–4884.
34 Zheng L, Baumann U, Reymond J. An efficient one-step site-directed and site-saturation mutagenesis protocol. Nucleic Acids Res 2004; 32:e115.
35 Livak KJ, Schmittgen TD. Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) Method. Methods 2001; 25:402–408.
36 Leduc V, Legault V, Dea D, Poirier J. Normalization of gene expression using SYBR green qPCR: a case for paraoxonase 1 and 2 in Alzheimer's disease brains. J Neurosci Methods 2011; 200:14–19.
37 Dignam JD, Lebovitz RM, Roeder RG. Accurate transcription initiation by RNA polymerase II in a soluble extract from isolated mammalian nuclei. Nucleic Acids Res 1983; 11:1475–1489.
38 Gauderman W, Morrison J. QUANTO 1.1: A computer program for power and sample size calculations for genetic-epidemiology studies. http://hydra.usc.edu/gxe. 2006.
39 Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 2005; 21:263–265.
40 Dudbridge F. Likelihood-based association analysis for nuclear families and unrelated subjects with missing genotype data. Hum Hered 2008; 66:87–98.
41 Konopka G, Wexler E, Rosen E, Mukamel Z, Osborn GE, Chen L et al. Modeling the functional genomics of autism using human neurons. Mol Psychiatry 2012; 17:202–214.
42 Kang HJ, Kawasawa YI, Cheng F, Zhu Y, Xu X, Li M et al. Spatio-temporal transcriptome of the human brain. Nature 2011; 478:483–489.
43 Sakai T, Hino K, Wada S, Maeda H. Identification of the DNA binding specificity of the human ZNF219 protein and its function as a transcriptional repressor. DNA Res 2003; 10:155–165.
44 Frietze S, Lan X, Jin VX, Farnham PJ. Genomic targets of the KRAB and SCAN domain-containing zinc-finger protein 263. J Biol Chem 2010; 285:1393–1403.
45 Buxbaum JD, Silverman J, Keddache M, Smith CJ, Hollander E, Ramoz N et al. Linkage analysis for autism in a subset families with obsessive-compulsive behaviors: evidence for an autism susceptibility gene on chromosome 1 and further support for susceptibility genes on chromosome 6 and 19. Mol Psychiatry 2004; 9:144–150.
46 Lauritsen MB, Als TD, Dahl HA, Flint TJ, Wang AG, Vang M et al. A genome-wide search for alleles and haplotypes associated with autism and related pervasive developmental disorders on the Faroe Islands. Mol Psychiatry 2006; 11:37–46.
47 Hu VW, Sarachana T, Kim KS, Nguyen A, Kulkarni S, Steinberg ME et al. Gene expression profiling differentiates autism case-controls and phenotypic variants of autism spectrum disorders: evidence for circadian rhythm dysfunction in severe autism. Autism Res 2009; 2:78–97.
48 Allen-Brady K, Robison R, Cannon D, Varvil T, Villalobos M, Pingree C et al. Genome-wide linkage in Utah autism pedigrees. Mol Psychiatry 2010; 15:1006–1015.
49 Fischbach BV, Trout KL, Lewis J, Luis CA, Sika M. WAGR syndrome: a clinical review of 54 cases. Pediatrics 2005; 116:984–988.
50 Chiocchetti AG, Bour HS, Freitag CM. Glutamatergic candidate genes in autism spectrum disorder: an overview. J Neural Transm 2014; 121:1081–1106.
51 Kim SH, Song JY, Joo E, Lee KY, Ahn YM, Kim YS. EGR3 as a potential susceptibility gene for schizophrenia in Korea. Am J Med Genet B Neuropsychiatr Genet 2010; 153B:1355–1360.
52 Quach DH, Oliveira-Fernandes M, Gruner KA, Tourtellotte WG. A sympathetic neuron autonomous role for Egr3-mediated gene regulation in dendrite morphogenesis and target tissue innervation. J Neurosci 2013; 33:4570–4583.
53 Knapska E, Kaczmarek L. A gene for neuronal plasticity in the mammalian brain: Zif268/Egr-1/NGFI-A/Krox-24/TIS8/ZENK? Prog Neurobiol 2004; 74:183–211.
54 Ebert DH, Greenberg ME. Activity-dependent neuronal signalling and autism spectrum disorder. Nature 2013; 493:327–337.
55 Kyrchanova O, Chetverina D, Maksimenko O, Kullyev A, Georgiev P. Orientation-dependent interaction between Drosophila insulators is a property of this class of regulatory elements. Nucleic Acids Res 2008; 36:7019–7028.
56 Weingarten-Gabbay S, Segal E. The grammar of transcriptional regulation. Hum Genet 2014; 133:701–711.
57 Haas RH, Townsend J, Courchesne E, Lincoln AJ, Schreibman L, Yeung-Courchesne R. Neurologic abnormalities in infantile autism. J Child Neurol 1996; 11:84–92.
58 Langen M, Schnack HG, Nederveen H, Bos D, Lahuis BE, Jonge MV de et al. Changes in the developmental trajectories of striatum in autism. Biol Psychiatry 2009; 66:327–333.
59 Strick PL, Dum RP, Fiez JA. Cerebellum and nonmotor function. Annu Rev Neurosci 2009; 32:413–434.
60 Desrochers TM, Badre D. Finding parallels in fronto-striatal organization. Trends Cogn Sci (Regul Ed) 2012; 16:407–408.
61 Pinto D, Pagnamenta AT, Klei L, Anney R, Merico D, Regan R et al. Functional impact of global rare copy number variation in autism spectrum disorders. Nature 2010; 466:368–372.
62 Gai X, Xie HM, Perin JC, Takahashi N, Murphy K, Wenocur AS et al. Rare structural variation of synapse and neurotransmission genes in autism. Mol Psychiatry 2012; 17:402–411.

Supplementary Information accompanies the paper on the Molecular Psychiatry website (http://www.nature.com/mp)
