Functional Annotation of Genes Overlapping Copy Number Variants in Autistic Patients: Focus on Axon Pathfinding

136 Current Genomics, 2010, 11, 136-145 Functional Annotation of Genes Overlapping Copy Number Variants in Autistic Patients: Focus on Axon Pathfinding Silvia Sbacchi1, Francesco Acquadro2, Ignazio Calò1, Francesco Calì3 and Valentino Romano*,1,3 1Dipartimento di Oncologia Sperimentale e Applicazioni Cliniche, Università degli Studi di Palermo, Palermo; 2Molecular Cytogenetics Group, Centro Nacional de Investigaciones Oncologicas (C.N.I.O.), and Centro de Investiga- ciones de Enfermidades Raras (CIBERER), Madrid, Spain; 3Associazione Oasi Maria SS. (I.R.C.C.S.), Troina (EN), Italy Abstract: We have used Gene Ontology (GO) and pathway analyses to uncover the common functions associated to the genes overlapping Copy Number Variants (CNVs) in autistic patients. Our source of data were four published studies [1- 4]. We first applied a two-step enrichment strategy for autism-specific genes. We fished out from the four mentioned studies a list of 2928 genes overall overlapping 328 CNVs in patients and we first selected a sub-group of 2044 genes after excluding those ones that are also involved in CNVs reported in the Database of Genomic Variants (enrichment step 1). We then selected from the step 1-enriched list a sub-group of 514 genes each of which was found to be deleted or duplicated in at least two patients (enrichment step 2). The number of statistically significant processes and pathways identified by the Database for Annotation, Visualization and Integrated Discovery and Ingenuity Pathways Analysis softwares with the step 2-enriched list was significantly higher compared to the step 1-enriched list. In addition, statistically significant GO terms, biofunctions and pathways related to nervous system development and function were exclusively identified by the step 2-enriched list of genes. Interestingly, 21 genes were associated to axon growth and pathfinding. The latter genes and other ones associated to nervous system in this study represent a new set of autism candidate genes deserving further investigation. In summary, our results suggest that the autism’s “connectivity genes” in some patients affect very early phases of neurodevelopment, i.e., earlier than synaptogenesis. Received on: September 10, 2009 - Revised on: December 12, 2009 - Accepted on: December 15, 2009 Keywords: Autism Spectrum Disorders, Copy Number Variants, Gene Ontology, axon guidance signalling, neurodevelopment, candidate genes. INTRODUCTION reviewed in [5]). These studies have also led to the identification of several loci implicated in ASDs etiology. Autism Spectrum Disorders (ASDs) are a group of neu- rodevelopmental disorders that affects approximately 1 in Despite these important progresses the genetic bases of 150 individuals and are characterized by deficits in recipro- Autism Spectrum Disorders remain poorly understood. A cal social interaction, communication and patterns of repeti- major factor responsible of this poor outcome relates to the tive behaviours and restricted interests [5]. These “core defi- complex genetic architecture of this disorder likely consist- cits” can be variable among patients as far their severity and ing of a marked multilocus and multiallelic heterogeneity time of onset is concerned. In addition, non-core symptoms with both common and rare risk alleles occurring in the (mental retardation, epilepsy, gastro-intestinal disturbances, autistic population [9]. Additional levels of genetic complex- macrocephaly etc.), are observed only in subgroups of pa- ity would involve epigenetic and environmental factors, pos- tients. This remarkable phenotypic complexity probably un- sibly interacting with risk alleles [10, 11]. derlies multiple etiologies assumed to disrupt one or more In recent years, genomics has added a new dimension to phases of neurodevelopment. autism’s research. Taking advantage of microarray technol- A genetic etiology for ASD has been established and ogy, genomic analyses in autism have proceeded along three supported by several evidences including: (i) a higher con- main directions: whole-genome expression profiling (e.g., cordance rate in monozygotic compared to dizygotic twins [12]), whole-genome association studies (e.g., [13]) and the [6, 7], (ii) detection of chromosomal aberrations in up to 3% analysis of Copy Number Variants [1-3, 14-16]. These three - 7% of children with autism [8], (iii) linkage and linkage experimental paradigms of genomic research have concep- disequilibrium studies, and (iv) a higher chance of associa- tual and technical differences, but on the same time they tion between autism with several monogenic disorders (all share the challenging task of extracting the autism-specific biological information from the large number of data typi- cally generated in microarray analyses. *Address correspondence to this author at the Dipartimento di Oncologia Copy Number Variants are operationally defined as sub- Sperimentale e Applicazioni Cliniche, Università degli Studi di Palermo, microscopic insertions or deletions of DNA segments larger Via S. Lorenzo Colli, 312 Palermo, Italy; Tel: +39 91 6722852; Fax: +39 91 than 1 kb [17, 18]. These genomic mutations overlap thou- 6806420; E-mail: [email protected] 1389-2029/10 $55.00+.00 ©2010 Bentham Science Publishers Ltd. Functional Annotation of Genes Overlapping Copy Number Variants Current Genomics, 2010, Vol. 11, No. 2 137 sands of genes, many of which play pivotal roles in biologi- In this study, we have used Gene Ontology and Ingenuity cal pathways and may thus have profound effects on cell pathway analyses to uncover the common functions associ- functions and human phenotype. In recent years there have ated to deleted or duplicated genes in autistic patients. Using been a growing number of studies demonstrating or suggest- data from four published studies our functional annotation ing that CNVs make a substantial contribution to human analysis uncovered several biological processes related to disease [19, 20]. The predicted functional effects of Copy nervous system development and function potentially related Number Variants would include gene expression changes to autism’s pathogenesis. through a variety of mechanisms such as dosage effects, dis- ruption of gene coding sequences, alteration of splicing, METHODS change in the location or efficiency of important regulatory Data Sets elements, loss of heterozygosity, imprinting (all reviewed in [19]). The four CNV datasets we have used in this study refer to 4 distinct populations of autistic patients previously ana- Further insight on the potential biological implications of lyzed by microarray technology [1-4] accounting for a total CNVs has been gained by studying the common functions number of 268 patients (see Supplementary Table 1 for de- shared by genes overlapping CNVs detected in patients af- tails). Genes overlapping each CNV in patients were identi- fected by various diseases using a variety of bioinformatic approaches. Relevant examples concerning diseases of the fied and gene symbols were manually retrieved using the information contained in the NCBI Map Viewer Homo nervous system include amyotrophic lateral sclerosis (ALS), sapiens (human) genome view Build 35.1. The symbols re- mental retardation (MR), schizophrenia (see Discussion for lated to genes overlapping each CNV in each patient were autism). For example, genes deleted specifically in patients grouped to generate several lists of genes named by the ac- with sporadic ALS encode proteins involved in oxidative ronym “CS” (CS = Combined Set; see Table 1). Each list phosphorylation, regulation of the actin cytoskeleton, and interactions between cytokines and their receptors [21]. was stored as a txt file and used for functional annotation analyses. An additional list of gene symbols corresponds to Webber et al. [22] explored links between 148 MR– genes overlapping CNVs occurring in the general population associated CNVs and phenotypes from 5000 mouse gene that were downloaded from the Database of Genomic Vari- knock-out experiments and found that these CNVs were sig- ants (DGV: http://projects.tcag.ca/variation/). The version of nificantly enriched in those genes whose mouse orthologues, DGV database (Build 35.1) used is updated on November when disrupted, result in either abnormal axon or dopa- th minergic neuron morphologies. Genes disrupted by struc- 10 , 2008 and contains: (i) 31615 total entries (hg17), (ii) 19792 CNVs, (iii) a total of 6225 CNV loci. All data han- tural variants in patients with schizophrenia were signifi- dling and storage were done using Excel (Microsoft). cantly overrepresented in pathways important for brain development including neuregulin signalling, extracellular sig- Cytoscape Networks nal–regulated kinase/mitogen-activated protein kinase (ERK/MAPK) signalling, synaptic long-term potentiation, Bipartite networks were generated by Cytoscape software axonal guidance signalling, integrin signalling and glutamate (http://www.cytoscape.org; version 2.6.2) with the aim of receptor signalling [23]. identifying the genes in CS1_All dataset that are shared by at least two patients. The input file contained (i) ID codes for Compelling evidences that CNVs are implicated in the patients and (ii) the corresponding genes overlapping the autism’s etiology relate to the observation that the frequency CNVs in these patients. Two different types of nodes are of de novo (rare) CNVs is significantly higher in cases ver- sus controls [1, 2, 14]. On the other hand, the relative poten- displayed in

Functional Annotation of Genes Overlapping Copy Number Variants in Autistic Patients: Focus on Axon Pathfinding

Analysis of Protein Complexes Through Model-Based Biclustering of Label-Free Quantitative AP-MS Data

Functional Analysis of Structural Variation in the 2D and 3D Human Genome

STRIPAK Complexes in Cell Signaling and Cancer

The Identification of 64 Novel Genetic Loci Provides an Expanded View on the Genetic Architecture of Coronary Artery Disease

Mouse Asz1 Conditional Knockout Project (CRISPR/Cas9)

Homology-Based Negative Data Sampling Method for Genome-Scale Reconstruction of Human Protein–Protein Interaction Networks

Transcriptional and Post-Transcriptional Profile of Human Chromosome 21

The Genetics of the Mood Disorder Spectrum: Genome-Wide Association Analyses of More Than 185,000 Cases and 439,000 Controls

Specification and Epigenetic Resetting of the Pig Germline Exhibit Conservation with the Human Lineage

MOB (Mps One Binder) Proteins in the Hippo Pathway and Cancer

The Origins, Evolution, and Functional Potential of Alternative Splicing In

Chromatin Dynamics and the Role of G9a in Gene Regulation And