Integrative Analysis of DNA Methylation and Gene Expression in Skin Cutaneous Melanoma by Bioinformatic Approaches

Yan Sun, Huazhong University of Science and Technology
Zhilin Wu, Huazhong University of Science and Technology
Rui Chen, Huazhong University of Science and Technology
Yan Wu, Huazhong University of Science and Technology
Yun Lin ( [email protected] ), Huazhong University of Science and Technology

Research Article

Keywords: Driven genes, Expression, Immune, Methylation, Prognosis, Skin cutaneous melanoma
Posted Date: September 2nd, 2021
DOI: https://doi.org/10.21203/rs.3.rs-858303/v1
License: This work is licensed under a Creative Commons Attribution 4.0 International License.

Abstract

Skin cutaneous melanoma is the most life-threatening skin cancer. Finding key methylation genes of prognostic value is an under-explored but intriguing field in the research of skin cutaneous melanoma. This work aims to identify survival-related methylated genes and their specific methylation sites in skin cutaneous melanoma through an integrative analysis with bioinformatic approaches. The original data, including gene methylation and expression files, were downloaded from the Cancer Genome Atlas database. Statistical analysis revealed that skin cutaneous melanoma patients with highly expressed and hypomethylated HHEX had a better outcome than patients with lowly expressed and hypermethylated HHEX. In addition, fifteen methylation sites of HHEX were identified as significantly correlated with HHEX expression changes. The expression levels of HHEX differed across pathological stages and exhibited a downward trend from stage to stage . Therefore, we speculate that the driven gene HHEX may play an important role in the survival of skin cutaneous melanoma. This finding provides novel epigenetic molecular clues and potential detection targets for early prediction of the prognosis of skin cutaneous melanoma.
Introduction

Derived from the uncontrolled overgrowth of abnormal melanocytes, skin cutaneous melanoma (SKCM) is the leading cause of skin cancer-related deaths owing to its strong metastatic potential 1, 2. Thickness of the primary tumor, presence of ulceration, lymph node spread, mitoses, and distant metastases are features of middle- or late-stage SKCM, and these indices are used to determine the prognosis of the disease 3. Numerous studies have sought novel biomarkers of prognostic value, including circulating melanoma cells, exosomes, abnormally expressed proteins, mutated genes and non-coding RNAs 4–7. However, these biomarkers were not specific or early enough and had limited use in clinical practice. Identifying new molecules that can be used for early and accurate prediction of the outcomes of SKCM patients remains an open goal.

DNA methylation change at CpG sites is the most widespread and stable epigenetic modification in malignant tumors 8. Abnormal DNA methylation is regarded as a vital mechanism promoting the genesis and progression of tumors 9, 10, although DNA hypomethylation is less common than DNA hypermethylation in tumorigenesis 11. Cumulative findings indicate that melanoma development is prompted by alterations in DNA methylation, which strongly influence the expression of numerous melanoma-associated genes by silencing tumor suppressor genes. DNA methylation changes have also shown prognostic utility in patients with localized or metastatic melanoma 12–16. For instance, RASSF1A (RAS association domain family 1 isoform A) is susceptible to methylation, and its expression is often down-regulated in primary uveal melanoma 17, 18. SKCM is known to be virulent: even a relatively small SKCM has the potential to metastasize, resulting in extremely high mortality 19, 20.
Thus, identifying the relationship between DNA methylation and gene expression in SKCM may provide new clues for the study of the prognosis of the disease.

The Cancer Genome Atlas (TCGA) database catalogues genomic profiles of more than 30 kinds of human tumors 21. It is publicly available, and its use requires no additional patient consent or ethics committee approval. In this work, we downloaded data from TCGA and, through a series of bioinformatic approaches, carried out an integrative analysis of DNA methylation and gene expression in SKCM, with the aim of finding new prognostic methylation genes and their methylation sites and identifying more meaningful biomarkers for evaluating the prognosis of SKCM.

Methods

Data source and preprocessing

All the raw data were collected from the TCGA data portal website (https://portal.gdc.cancer.gov/) 22, including 475 SKCM DNA methylation profiles, 472 mRNA expression profiles and clinical data. The DNA methylation data were generated with the Illumina Infinium HumanMethylation 450 BeadChip array [Platform], and the methylation level of each probe was expressed as the Beta Value [Data Type]. The gene expression data were selected with the following screening conditions: RNA-Seq [Experimental Strategy], HTSeq - Counts [Workflow Type]. Data were processed with Perl and R. Methylation matrices for 2 normal (non-cancerous) methylation samples and 473 SKCM methylation samples were obtained. Among them, 460 had corresponding survival status information.

Screening for differentially methylated genes

The original methylation matrix was normalized and processed with R. Differentially methylated genes were screened with the R limma package. The t-test and the Benjamini-Hochberg method were used for statistical analysis of the genes. Differentially methylated genes were selected by the following criteria: 1) |log2 (fold-change)| > 1, and 2) P-value < 0.05.
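The screening criteria above can be illustrated with a small sketch. This is not the authors' limma pipeline: it is a plain-Python illustration of the Benjamini-Hochberg adjustment followed by the two cutoffs. The record layout, the function names, the use of a tumor-to-normal ratio of mean beta values as the fold-change, and the choice to apply the 0.05 cutoff to adjusted rather than raw P-values are all assumptions made for illustration.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values are non-decreasing in rank (the BH step-up procedure).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def screen_genes(records, lfc_cutoff=1.0, p_cutoff=0.05):
    """Keep genes with |log2 fold-change| > lfc_cutoff and adjusted p < p_cutoff.

    records: list of (gene, mean_normal, mean_tumor, p_value) tuples.
    This layout is hypothetical and chosen only for the example.
    """
    adjusted = benjamini_hochberg([r[3] for r in records])
    hits = []
    for (gene, normal, tumor, _), p_adj in zip(records, adjusted):
        log_fc = math.log2(tumor / normal)  # assumed fold-change definition
        if abs(log_fc) > lfc_cutoff and p_adj < p_cutoff:
            hits.append((gene, log_fc, p_adj))
    return hits


# Made-up numbers, not data from the study.
example = [
    ("GENE_A", 0.2, 0.5, 0.001),
    ("GENE_B", 0.4, 0.45, 0.001),
    ("GENE_C", 0.6, 0.2, 0.2),
]
hits = screen_genes(example)
```

In this toy input only GENE_A passes both cutoffs; a positive log2 fold-change would mark it as hypermethylated under the definition used in the next section.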
The heatmap for the differentially methylated genes was created with the R heatmap package.

Correlation analysis of methylation levels of differentially hypermethylated genes and the corresponding mRNA expressions

To evaluate the relationship between the methylation levels of differentially hypermethylated genes and the corresponding mRNA expressions, differentially methylated genes with log2 (fold-change) > 0 were defined as up-regulated methylation genes, namely hypermethylated genes. The correlation between the methylation levels of up-regulated methylation genes and the corresponding mRNA expressions was analyzed with R. An absolute correlation coefficient (|cor|) >= 0.3 with a P-value < 0.05 was considered a significant and relatively high correlation. Pearson's correlation test was used.

Correlation analysis of methylation sites of driven genes and the corresponding mRNA expressions

To further assess the correlation between the methylation levels of the methylated sites of driven genes and the corresponding mRNA expressions, we analyzed the correlation between the methylation levels of specific methylated sites of driven genes and the corresponding mRNA expressions with R packages. |cor| >= 0.3 with a P-value < 0.05 was taken as a significantly high correlation, and Pearson's correlation test was used.

Joint survival analysis for driven genes

To explore the prognostic value of candidate genes, a joint survival analysis of the methylation levels and expression levels of those genes was conducted. Kaplan-Meier survival curves were used to illustrate and compare the prognosis of patients in whom a given gene was hypermethylated with low expression against those in whom it was hypomethylated with high expression. The P-value cutoff was 0.05.
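As an illustration of the joint survival analysis, the sketch below assigns patients to the two compared groups and computes a Kaplan-Meier (product-limit) survival estimate for a group. It is a minimal plain-Python sketch, not the authors' R code. Splitting patients at the median methylation and expression values is an assumption (the text does not state how high and low were defined), the function names are hypothetical, and the test that would produce the P-value is not included.

```python
def joint_group(methylation, expression, meth_cutoff, expr_cutoff):
    """Assign a patient to one of the two compared groups, or None.

    The cutoffs (for example, cohort medians) are an assumption of this sketch.
    """
    if methylation > meth_cutoff and expression <= expr_cutoff:
        return "hyper_low"   # hypermethylated with low expression
    if methylation <= meth_cutoff and expression > expr_cutoff:
        return "hypo_high"   # hypomethylated with high expression
    return None              # discordant patients are left out


def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up time per patient; events: 1 = death, 0 = censored.
    Returns a list of (time, survival_probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Made-up follow-up data, not data from the study.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 0, 1])
```

Computing one curve per group and comparing them, typically with a log-rank test, would yield the kind of comparison described above.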
Exploratory analysis for corresponding mRNA of survival-methylated genes

Expression analysis: GEPIA (http://gepia.cancer-pku.cn/index.html) is an expression analysis tool based on TCGA and GTEx data that quickly delivers critical interactive and customizable functionalities 23. Expression of survival-methylated genes across pathological stages was analyzed with the “Single Gene Analysis” and “Stage Plots” modules of GEPIA using the “SKCM” dataset. Student's t-test was used to obtain P-values, and P < 0.05 was again the significance criterion.

Immune infiltration analysis: TIMER (http://timer.cistrome.org/) is a powerful resource for analyzing immune infiltration across various cancer types, allowing users to explore the immunological, clinical and genomic features of tumors by dynamically generating high-quality figures 24. Herein, we used the “Immune” module of TIMER2 to explore the potential relationship between HHEX expression and immune infiltration in SKCM. CD8+ T cells, CD4+ T cells, dendritic cells (DC), macrophages, neutrophils and natural killer cells (NK cells) were selected. Several algorithms were used to estimate immune infiltration: EPIC, TIMER, QUANTISEQ, XCELL, CIBERSORT, CIBERSORT-ABS and MCPCOUNTER. Results were visualized with a heatmap and a set of scatter plots. The purity-adjusted Spearman's rank correlation test was used to generate the P-values and partial correlation (cor) values.

Enrichment analysis: In brief, STRING (https://string-db.org/) was queried with a single protein name (“HHEX”) and organism (“Homo sapiens”) using the following parameters: minimum required interaction score [“Low confidence (0.150)”], meaning of network edges (“evidence”), max number of interactors to show (“no more than 50 interactors”) and active interaction sources (“experiments”). Finally, 50 HHEX-binding proteins were obtained. Besides, the “Similar