The MER41 Family of Hervs Is Uniquely Involved in the Immune-Mediated Regulation of Cognition/Behavior-Related Genes
Total Page:16
File Type:pdf, Size:1020Kb
bioRxiv preprint doi: https://doi.org/10.1101/434209; this version posted October 3, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. The MER41 family of HERVs is uniquely involved in the immune-mediated regulation of cognition/behavior-related genes: pathophysiological implications for autism spectrum disorders Serge Nataf*1, 2, 3, Juan Uriagereka4 and Antonio Benitez-Burraco 5 1CarMeN Laboratory, INSERM U1060, INRA U1397, INSA de Lyon, Lyon-Sud Faculty of Medicine, University of Lyon, Pierre-Bénite, France. 2 University of Lyon 1, Lyon, France. 3Banque de Tissus et de Cellules des Hospices Civils de Lyon, Hôpital Edouard Herriot, Lyon, France. 4Department of Linguistics and School of Languages, Literatures & Cultures, University of Maryland, College Park, USA. 5Department of Spanish, Linguistics, and Theory of Literature (Linguistics). Faculty of Philology. University of Seville, Seville, Spain * Corresponding author: [email protected] bioRxiv preprint doi: https://doi.org/10.1101/434209; this version posted October 3, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. ABSTRACT Interferon-gamma (IFNa prototypical T lymphocyte-derived pro-inflammatory cytokine, was recently shown to shape social behavior and neuronal connectivity in rodents. STAT1 (Signal Transducer And Activator Of Transcription 1) is a transcription factor (TF) crucially involved in the IFN pathway. It binds consensus sequences that, in humans, are located with a high frequency in the LTRs (Long Terminal Repeats) of the MER41 family of HERVs (Human Endogenous Retrovirus). However, the role of an IFN/STAT1/MER41 pathway in human cognition is still poorly documented. Here, we identified a unique set of human cognition/behavior-related genes harboring MER41 LTR sequence(s) in their promoter regions. Interestingly, these MER41 LTRs exhibited TF binding sites (TFBSs) for not only STAT1 but also a significant number of immune TFs belonging to the NFKB (Nuclear Factor Kappa B) complex. These include notably STAT3 (Signal Transducer And Activator Of Transcription 3) and YY1 (YY1 transcription factor). Furthermore, a survey of proteomics and transcriptomics databases allowed to map a unique network linking TFs of the NFKB complex, with FOXP2 (Forkhead Box P2), currently known as a key TF involved in language evolution. Confirming the specificity of our findings, we demonstrated that a significant number of autism spectrum disorders (ASD) susceptibility genes exhibit a promoter-localized MER41 LTR sequence. Of note, this result was specific to MER41 HERVs since none of the more than 100 HERVs analyzed, most of which are common to all mammals, harbored such a feature. Finally, enrichment analyses of ASD susceptibility genes pointed again to the crucial involvement of STAT3 in human cognitive functions. bioRxiv preprint doi: https://doi.org/10.1101/434209; this version posted October 3, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. INTRODUCTION Interferon gamma (IFN), the prototypical T Helper 1 (TH1) cytokine, is a T-cell derived pro- inflammatory molecule exerting several effects on innate immune cells and others on non- immune cells including neurons. Indeed, IFN was recently shown to be a social behavior regulator and to shape neuronal connectivity in rodents (Filiano et al., 2016). Irrespective of cell type, binding of IFN to its receptor induces the transcriptional regulation of target genes via the recognition of promoter consensus sequences, by the transcription factor STAT1 (Signal Transducer And Activator Of Transcription 1) (Green et al., 2017; Ramana et al., 2002). In humans, an important share of the IFN /STAT1 pro-inflammatory pathway is mediated by the binding of STAT1 to consensus sequences localized in the LTRs (Long Terminal Repeats) of the MER41 family of HERVs (Human Endogenous Retrovirus) (Chuong et al., 2016). MER41 integrated the genome of a primate ancestor 45-60 million years ago and a total of 7,190 LTR elements belonging to 6 subfamilies (MER41A-MER41G) are detectable in the modern human genome (Chuong et al., 2016). That being said, the hypothesis of an IFN/STAT1/MER41 pathway shaping social behavior in humans (Filiano et al., 2016) is not yet supported by experimental data. As a first step, human candidate gene(s) regulated by such a pathway need(s) to be identified. Additionally, it is important to determine if STAT1 is the sole immune TF that potentially regulate behavior- or cognition- related genes via MER41 LTRs in humans. Finally, it is worth considering whether the stepped integration of HERVs into the human genome and more generally any evolutionary change dictated by infectious events could be related to key cognitive specificities experienced by hominins, in particular the emergence of language. To address these issues, we followed a bioinformatics workflow relying on the use of two recently-generated web tools allowing a survey of HERV sequences and their associated bioRxiv preprint doi: https://doi.org/10.1101/434209; this version posted October 3, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. transcription factor binding sites (TFBSs) in the entire human genome. Using this approach, we identified a specific set of genes harboring MER41 LTRs in their promoter region located 2 Kb upstream of the transcription starting site (TSS). Surprisingly, this list of genes was significantly enriched in cognition-related genes. Moreover, the identified MER41 LTRs were enriched in binding sites recognized by specific TFs of the NFKB complex, notably NFKB1, STAT3, CEBPB and YY1. We then performed phylogenetic comparisons between humans and chimpanzees regarding: i) MER41 LTR insertions sites in the promoter regions of candidate genes, and ii) the functional protein domains of immune TFs binding MER41 LTRs in such promoter regions. This was so as to infer putative consequences for the expression pattern of cognition-related genes, which might ultimately account for some of the cognitive differences between species. Our results indicate a possible evolutionary advantage to humans in the impact of MER41 LTRs on the transcriptional regulation of behavior/cognition-related genes. Finally, since FOXP2 is currently known as a key TF regulating aspects of brain development and function important for speech and language processing (Konopka et al., 2009; Oswald et al., 2017; Spiteri et al., 2007; Vernes et al., 2011, 2007), we searched genomics and proteomics databases to map a putative functional interactome linking TFs of the NFKB complex, MER41 LTRs and FOXP2. This way, we were able to infer from experimental data an evolutionary-determined connection between behavior/cognition-related genes, infectious diseases, and immunity. Finally, as a confirmatory investigation, such a connection was assessed in the pathological context of autism spectrum disorders (ASD). MATERIALS AND METHODS A scheme summarizing the workflow followed in the present work is shown in Figure 1. bioRxiv preprint doi: https://doi.org/10.1101/434209; this version posted October 3, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. Figure 1: Wokflow of the sudy. Rectangles (yellow or red) frame the main results obtained following each of the analytical steps briefly described in ellipse shapes (green or blue). Terms in italics correspond to the name of the bioinformatics tools used for each analytical step. LTR: long terminal repeats; GWAS: genome-wide association studies; ASD: autism spectrum disorders . bioRxiv preprint doi: https://doi.org/10.1101/434209; this version posted October 3, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. Bioinformatic tools and corresponding tasks performed in this study are described below. - The EnHERV database and web tool (Tongyoo et al., 2017): identifying human genes harboring MER41 LTR sequence(s) in the promoter region located 2 KB upstream the TSS ; only solo LTRs oriented in the sense direction relative to the gene orientation were taken into account. - The db-HERV-RE database and web tool (Ito et al., 2017): identifying experimentally- demonstrated TFBSs in HERV LTRs. - The EMBL GWAS database "GWAS catalog" (Welter et al., 2014): identifying among a queried list of genes those for which a SNP associated with cognitive and/or behavioral traits or conditions had been previously reported - The “Home genetics” NCBI database (Collins et al., 2016): identifying