Research Article • DOI: 10.2478/s13380-013-0113-6 • Translational Neuroscience • 4(2) • 2013 • 172-202

TRANSPOSABLE ELEMENTS OCCUR MORE FREQUENTLY IN AUTISM-RISK GENES: IMPLICATIONS FOR THE ROLE OF GENOMIC INSTABILITY IN AUTISM

Emily L. Williams1*, Manuel F. Casanova2, Andrew E. Switala2, Hong Li1, Mengsheng Qiu1

1Department of Anatomical Sciences and Neurobiology, University of Louisville School of Medicine, Louisville, Kentucky, USA
2Department of Psychiatry and Behavioral Sciences, University of Louisville School of Medicine, Louisville, Kentucky, USA

* E-mail: [email protected]

Received 05 April 2013, accepted 03 May 2013
© Versita Sp. z o.o.

Abstract

An extremely large number of genes have been associated with autism. The functions of these genes span numerous domains and prove challenging in the search for commonalities underlying the conditions. In this study, we instead looked at characteristics of the genes themselves, specifically the nature of their transposable element content. Utilizing available sequence databases, we compared the occurrence of transposons in autism-risk genes to randomized controls and found that transposable content was significantly greater in our autism group. These results suggest a relationship between transposable element content and autism-risk genes and have implications for the stability of those genomic regions.

Keywords

Autism-risk genes • Autism spectrum disorders • Genomic instability • Transposons

1. Introduction

Transposable elements (TE) are segments of DNA within genomes capable of mobility, exhibiting either a copy-and-paste method in which an RNA intermediate is transcribed into DNA and reinserted elsewhere within the genome, or a cut-and-paste method in which a DNA segment excises and reinserts itself. These mobile elements comprise approximately 50% of the human genome, although the majority are transpositionally extinct [1]. For instance, many autonomous retrotransposons contain mutations which make protein synthesis, and therefore retrotransposition, no longer possible; they may produce transcripts but are nonetheless considered extinct. Interestingly, some genes which house numerous TE, both mobile and extinct, are known to exhibit relative genetic instability, suggesting that transposition and retrotransposition are not the only TE-related events which may affect rates of mutation [for example, 2].

A broad array of genes has been identified as risk factors for autism, so many that it is a challenge to find a common thread throughout. At present, the most recent version of the database, AutismKB, lists 3,049 genes associated with autism [3]. Pinto et al., in their extensive study of copy number variants (CNV) in the condition, found that on average CNVs tended to occur in and around genes belonging to one of three functional domains: 1) cell proliferation, 2) cell motility, and 3) GTPase/Ras signaling [4]. While these findings aid in understanding how these genes factor into the etiology of autism, they do not answer why such a vast range of genes is targeted or how those mutations occur. The purpose of this study was to elucidate whether these types of mutations are increased in autism due to an inherent genomic instability of these regions in the human genome.

The causes for such mutations may vary. For example, individuals with Li-Fraumeni syndrome, an autosomal dominant disorder characterized by increased risk of various neoplasms, exhibit genomes enriched in CNVs, suggesting that these variants may predispose towards the genomic instability which gradually characterizes advanced stages of tumorigenesis [5]. While the chromosomal rearrangements which typify advanced cancer are an extreme example for comparison to autism, tendencies towards instability may be similar, targeting related groupings of inherently less stable genes. "Weaker genes" may therefore be inherited, as seen in Fragile X syndrome and its CGG-trinucleotide repeats, or environmentally induced, as shown in a number of cancers [6,7]. In addition, some instances of CNV show a strong relationship with their resident TE, such as the Drosophila melanogaster Cyp6g1 gene, the culprit for some fruit flies' adapted resistance to DDT [8].

As discussed, the human genome contains close to 50% TE content, but the majority of these elements are transpositionally extinct. However, though they may no longer be capable of motility or replication, they may still affect local DNA conformation, interaction with binding partners leading to alterations in gene expression, and the overall stability of the local genomic region [9-11]. These are the reasons we wished to investigate a potential relationship between autism-risk genes and the occurrence of TE within those gene regions.

2. Experimental procedures

The newest version of the AutismKB database, "nonsyndromic_list20121018," was downloaded from the website [3]. Our autism-risk genes were selected according to Total Score, a system which ranks from 0-50. Each gene's rank is a compilation of scores comprising 1) genome-wide association studies, 2) expression studies, 3) copy number variant studies, 4) linkage studies, 5) low-scale association studies, and 6) other low-scale association studies. In order to select those sequences with stronger association to autism, we collected a listing of genes with rankings greater than or equal to 10 [N = 441, following exclusion criteria below].

Our 441 control genes were then selected utilizing the random gene generator available through RSAT, which provides a listing of protein-coding Ensembl IDs [12,13]. From our randomized group, genes were cross-checked against the AutismKB database and excluded if listed therein. We then downloaded the RefSeq database, "gene2refseq," and cross-checked all our samples, removing any which do not occur within RefSeq so as to exclude those genes which do not appear within the TranspoGene database (downloaded: 3/12/13). All transposable content from Human Genome 17 (hg17) was downloaded from the TranspoGene database [14].

Once all data were collected, due to the nature of our distributions, for our first analysis a nonparametric Kuiper test was performed which addressed the relationship of total TE content by group. Both our full data set and a reduced set (excluding zero-values) were analyzed. For our second analysis, a logistic regression was applied, utilizing TE content as our independent variable and group (autism v. control) as our dependent variable. Please note that these data reflect the state of knowledge at the time of acquisition, which may be subject to change as databases are updated (for full data set, see Appendix 1).

3. Results and discussion

Our autism and control groups were built using data from two databases, AutismKB and TranspoGene, and occurrences of TE were determined for each gene. From the results of our analysis utilizing a nonparametric Kuiper test, it was clear that our autism genes housed significantly greater numbers of transposable elements as compared to randomized controls [N = 882, D = 0.2177, p-value = 4.321 × 10⁻⁸]. Upon removal of all TE zero-values, an additional Kuiper test likewise returned a p-value of 6.187 × 10⁻⁴, indicating that substantial differences lay not only in the occurrence of TE but in their general distribution, with autism-related genes displaying significantly higher numbers of total TE content [N = 508, D = 0.2090]. We went on to perform a logistic regression utilizing our full data set, finding that for every doubling of TE content, the odds of a given gene being associated with our autism group increased by approximately 17% [OR = 1.17, CI = 95%, p-value = 5.56 × 10⁻⁵].

Overall, 67% of the genes from our autism sample housed TE while only 48% of our control group exhibited the same (Table 1 and Figure 1). As mentioned, our autism group displayed more TE per gene than our controls [autism: N = 441, μ = 132, ± 300 s.d.; control: N = 441, μ = 45, ± 106 s.d.]. In addition, approximately 7% of the autism genes housed extraordinarily large numbers of TE, ranging from 500 to well over 2,000 per gene, while only approximately 1% of controls reached the 500-999 range and only one gene out of 441 tallied over 1000. Our results suggest a strong relationship between the total TE content of a given gene and its risk for mutation, results which may hold relevance for disease susceptibility in multiple conditions such as autism, schizophrenia, and cancer [15,16].

Table 1. Number of transposable elements per gene per group: a descriptive analysis illustrating the percentage breakdown in autism and control genes, separated by total TE content.

# TE/Gene     Autism     Control
0             32.88%     51.93%
1              2.04%      1.13%
2              1.13%      2.04%
3-5            2.72%      3.17%
6-10           3.17%      2.95%
11-50         18.37%     17.46%
51-100        12.93%      9.07%
101-200        9.30%      5.67%
201-500       10.20%      4.99%
501-1000       5.67%      1.36%
1001-2000      1.13%      0.23%
2001-3000      0.45%      0.00%

Figure 1. A histogram showing numbers of TEs across autism and control genes. The greatest differences lie in the occurrence of genes with no TE content (autism < control) and in numbers of large TE content (100+).

Our results are particularly relevant concerning two recent publications, those of the Cross-Disorder Group of the Psychiatric Genomics Consortium [17] and Girirajan et al. [18]. The first study identified several common major risk loci for autism, AD/HD, schizophrenia, bipolar disorder, and major depression, some of which included loci within calcium channel activity genes [17].

[...] for additional risk-genes of the condition and in pinpointing factors, inherited and ecological, which underlie autism-related mutations. Such knowledge can help us better understand what lies at the core of this heterogeneous condition and offer us greater means of intervention.

Acknowledgments

A special thanks to Lena McCue for taking the time to review the manuscript and offer critique. The work in this manuscript was supported by funding from the National Institute of Mental Health (NIMH) R01 MH086784.
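
The summary figures quoted in the Results (67% versus 48% of genes housing TE, roughly 7% of autism genes above 500 TE, and the 17% increase in odds per doubling) can be checked against Table 1 and the reported odds ratio. The following is a minimal sketch, not the authors' analysis code. The function names are hypothetical, the percentages are copied from Table 1, and the odds calculation assumes the reported OR of 1.17 applies multiplicatively to each doubling of TE content, as the text describes.

```python
import math

# Table 1 as printed in the article: (bin label, autism %, control %).
TABLE_1 = [
    ("0", 32.88, 51.93),
    ("1", 2.04, 1.13),
    ("2", 1.13, 2.04),
    ("3-5", 2.72, 3.17),
    ("6-10", 3.17, 2.95),
    ("11-50", 18.37, 17.46),
    ("51-100", 12.93, 9.07),
    ("101-200", 9.30, 5.67),
    ("201-500", 10.20, 4.99),
    ("501-1000", 5.67, 1.36),
    ("1001-2000", 1.13, 0.23),
    ("2001-3000", 0.45, 0.00),
]
COLUMN = {"autism": 1, "control": 2}
LARGE_BINS = {"501-1000", "1001-2000", "2001-3000"}


def share_with_any_te(group):
    """Percentage of genes with at least one TE (100 minus the zero bin)."""
    zero_row = next(row for row in TABLE_1 if row[0] == "0")
    return 100.0 - zero_row[COLUMN[group]]


def share_with_large_te(group):
    """Percentage of genes in the bins above 500 TE."""
    return sum(row[COLUMN[group]] for row in TABLE_1 if row[0] in LARGE_BINS)


def odds_multiplier(fold_change, or_per_doubling=1.17):
    """Odds multiplier for a given fold change in TE content.

    Assumption: the reported odds ratio holds per doubling, so a fold
    change of 2**k multiplies the odds by or_per_doubling**k.
    """
    return or_per_doubling ** math.log2(fold_change)


if __name__ == "__main__":
    for group in ("autism", "control"):
        print(group, round(share_with_any_te(group), 2),
              round(share_with_large_te(group), 2))
    print("odds multiplier for an 8-fold TE increase:",
          round(odds_multiplier(8), 3))
```

Under these assumptions, the zero bin alone reproduces the 67% and 48% figures, and the three largest bins sum to the "approximately 7%" quoted for autism genes. The odds multiplier is only an illustration of how a per-doubling odds ratio compounds; how the original regression handled genes with zero TE is not stated in the text.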