Minor Allele Frequencies and Molecular Pathways Differences For
Total Page:16
File Type:pdf, Size:1020Kb
Journal of Clinical Medicine Article Minor Allele Frequencies and Molecular Pathways Differences for SNPs Associated with Amyotrophic Lateral Sclerosis in Subjects Participating in the UKBB and 1000 Genomes Project Salvatore D’Antona 1 , Gloria Bertoli 1 , Isabella Castiglioni 2 and Claudia Cava 1,* 1 Institute of Molecular Bioimaging and Physiology, National Research Council (IBFM-CNR), Via F. Cervi 93, 20090 Segrate, Italy; [email protected] (S.D.); [email protected] (G.B.) 2 Department of Physics “Giuseppe Occhialini”, University of Milan-Bicocca Piazza dell’Ateneo Nuovo, 20126 Milan, Italy; [email protected] * Correspondence: [email protected] Abstract: Amyotrophic lateral sclerosis (ALS) is a complex disease with a late onset and is character- ized by the progressive loss of muscular and respiratory functions. Although recent studies have partially elucidated ALS’s mechanisms, many questions remain such as what the most important molecular pathways involved in ALS are and why there is such a large difference in ALS onset among different populations. In this study, we addressed this issue with a bioinformatics approach, using the United Kingdom Biobank (UKBB) and the European 1000 Genomes Project (1KG) in order to analyze the most ALS-representative single nucleotide polymorphisms (SNPs) that differ for minor allele frequency (MAF) between the United Kingdom population and some European populations including Finnish in Finland, Iberian population in Spain, and Tuscans in Italy. We found 84 SNPs Citation: D’Antona, S.; Bertoli, G.; associated with 46 genes that are involved in different pathways including: “Ca2+ activated K+ Castiglioni, I.; Cava, C. Minor Allele channels”, “cGMP effects”, ”Nitric oxide stimulates guanylate cyclase”, “Proton/oligopeptide co- Frequencies and Molecular Pathways transporters”, and “Signaling by MAPK mutants”. In addition, we revealed that 83% of the 84 SNPs Differences for SNPs Associated with can alter transcription factor-motives binding sites of 224 genes implicated in “Regulation of beta-cell Amyotrophic Lateral Sclerosis in development”, “Transcription-al regulation by RUNX3”, “Transcriptional regulation of pluripotent Subjects Participating in the UKBB and 1000 Genomes Project. J. Clin. stem cells”, and “FOXO-mediated transcription of cell death genes”. In conclusion, the genes and Med. 2021, 10, 3394. https:// pathways analyzed could explain the cause of the difference of ALS onset. doi.org/10.3390/jcm10153394 Keywords: ALS; amyotrophic lateral sclerosis; motor neuron degeneration; molecular pathways; Academic Editor: Emmanuel Andrès minor allele frequencies; UKBB; 1000 genomes project; single nucleotide polymorphism; SNP; GWAS Received: 1 June 2021 Accepted: 28 July 2021 Published: 30 July 2021 1. Introduction Amyotrophic lateral sclerosis (ALS) is a complex and chronic disease with the onset of Publisher’s Note: MDPI stays neutral symptoms occurring generally between the ages of 50 and 65 [1–5]. This disease involves with regard to jurisdictional claims in motor neuron degeneration characterized by the progressive loss of upper and lower motor published maps and institutional affil- neurons at the bulbar and spinal levels [6]. This disorder initially results in a gradual loss of iations. muscular function, and it aggravates with muscle atrophy and an inability to breathe and swallow [7]. ALS occurs in two forms: (i) the sporadic form, which is the most common (90–95% of cases) and has no known hereditary component, and (ii) the family-type (5–10% of cases), which has a hereditary component involving altered genes such as C9orf72, FUS, Copyright: © 2021 by the authors. SOD1, TARDBP, and KIF5A [8–12]. Current therapeutic strategies target one or a few altered Licensee MDPI, Basel, Switzerland. molecular pathways, thus having a minimal effect on the course of the disease and on the This article is an open access article life expectancy of ALS patients [13]. Indeed, more than 50% of patients affected by ALS distributed under the terms and do not survive within three years after diagnosis and 20% of the patients survive between conditions of the Creative Commons five and ten years after symptoms onset [14]. Recent population-based motor neuron Attribution (CC BY) license (https:// disease studies show a significant difference in the incidence of ALS between the world creativecommons.org/licenses/by/ 4.0/). (0.6–2.1/100,000 person per year) and Europe (2.1–3.8/100,000 person per year) [15–18] J. Clin. Med. 2021, 10, 3394. https://doi.org/10.3390/jcm10153394 https://www.mdpi.com/journal/jcm J. Clin. Med. 2021, 10, x FOR PEER REVIEW 2 of 17 J. Clin. Med. 2021, 10, 3394 2 of 16 based motor neuron disease studies show a significant difference in the incidence of ALS between the world (0.6–2.1/100,000 person per year) and Europe (2.1–3.8/100,000 person per year) [15–18] and, in particular, these studies show a disease onset discrepancy among and,Scotland in particular, (3.8 per 100,000 these studies person show‐years) a disease [19] and onset the discrepancyrest of Europe among (such Scotland as Italy (3.8(2.8 perper 100,000100,000 person person-years)‐years)) [[18],19] and suggesting the rest that of Europe differences (such in asgenetic Italy and (2.8 environmental per 100,000 person- com‐ years))ponents [ 18are], crucial suggesting in the that onset differences of this pathology. in genetic and environmental components are crucialAlthough in the onset the ofinvolvement this pathology. of mechanisms such as mitochondrial dysfunction and Although the involvement of mechanisms such as mitochondrial dysfunction and oxidative stress and neuroinflammation have been shown in many studies [13], the path‐ oxidative stress and neuroinflammation have been shown in many studies [13], the patho- ogenetic pathways of ALS are still unclear. To date, the greatest challenge is not only to genetic pathways of ALS are still unclear. To date, the greatest challenge is not only to understand the molecular mechanisms underlying the disease but also to comprehend understand the molecular mechanisms underlying the disease but also to comprehend their their role and how they differ among different populations in order to develop more spe‐ role and how they differ among different populations in order to develop more specific cific strategies for the prediction of onset of ALS and better therapies for managing ALS. strategies for the prediction of onset of ALS and better therapies for managing ALS. Since recent genome‐wide associations studies (GWAS) showed that single nucleo‐ Since recent genome-wide associations studies (GWAS) showed that single nucleotide tide polymorphisms (SNPs) have a central role in the inheritance and onset of complex polymorphisms (SNPs) have a central role in the inheritance and onset of complex traits traits and diseases, such as diabetes II and schizophrenia [20], in this study we investi‐ and diseases, such as diabetes II and schizophrenia [20], in this study we investigated in gated in a systematic way the most significant ALS‐related SNPs that differ between the a systematic way the most significant ALS-related SNPs that differ between the United United Kingdom (UK) population and some European populations (including Finnish Kingdom (UK) population and some European populations (including Finnish populations inpopulations Finland, Iberian in Finland, populations Iberian inpopulations Spain, and in Tuscan Spain, populations and Tuscan in populations Italy). We usedin Italy). the UnitedWe used Kingdom the United Biobank Kingdom (UKBB) Biobank and the (UKBB) European and the 1000 European Genome 1000 Project Genome (1KG) forProject the high(1KG) amount for the and high quality amount of and data quality present. of In data addition, present. these In addition, databases these allow databases us to perform allow anus accurateto perform cross-ethnic an accurate GWAS cross study.‐ethnic Indeed, GWAS thestudy. UK Indeed, population the isUK represented population by is UKBBrepre‐ whilesented the by EuropeanUKBB while populations the European are populations represented are by represented the 1KG. Finally, by the we1KG. explored Finally, the we underlyingexplored the genes underlying and pathways genes and that pathways could be that responsible could be ofresponsible the discrepancy of the discrepancy of the ALS onsetof the among ALS onset these among different these populations. different populations. 2. Materials and Methods 2.1. WorkflowWorkflow The computational approach approach of of this this study study consists consists of ofseven seven steps steps that that we webriefly briefly de‐ describescribe in inFigure Figure 1.1 In. In the the first first step, step, we we selected selected the the SNPs SNPs associated associated with with ALS, ALS, based based on on p‐ pvalue-value and and Minor Minor allele allele frequency frequency (MAF), (MAF), using using a acut cut-off‐off of of 0.001 0.001 and and 0.05, 0.05, respectively. respectively. Figure 1. FlowFlow chart chart for for the the selection selection of ofamyotrophic amyotrophic lateral lateral sclerosis sclerosis-related‐related single single nucleotide nucleotide polymorphisms and pathways. polymorphisms and pathways. In the second step, we submitted the SNPs obtained to a genome association tool to execute the clump procedure. It generates a list of SNPs that we used in the third step to obtain the variants information from the 1KG database and to calculate the differences of J. Clin. Med. 2021, 10, 3394 3 of 16 MAFs between the UK and 1KG populations. Thus, in the third step, we selected SNPs with a statistically significant difference of MAFs between UK and 1KG populations. In the fourth step, we performed an SNP enrichment analysis of our statistically significant SNPs. In the fifth step, we calculated the differences of MAF of SNPs-enriched between the UK and 1KG population, selecting the most ALS-related variants. In the step 6.a, we associated the list of SNPs to genes. In step 6.b, we performed an analysis to investigate the transcription-factor motives of binding sites (TF-MBS) altered by our SNPs.