Screening for Implicated Genes in Colorectal Cancer Using Whole‑Genome Gene Expression Profiling
Total Page:16
File Type:pdf, Size:1020Kb
8260 MOLECULAR MEDICINE REPORTS 17: 8260-8268, 2018 Screening for implicated genes in colorectal cancer using whole‑genome gene expression profiling LONG-CI SUN1,2 and HAI-XIN QIAN1 1Department of General Surgery, The First Affiliated Hospital of Soochow University, Suzhou, Jiangsu 215006; 2Department of Gastrointestinal Surgery, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai 200127, P.R. China Received October 24, 2017; Accepted March 2, 2018 DOI: 10.3892/mmr.2018.8862 Abstract. To identify biologically relevant genes associated WD repeat domains may represent novel target molecules for with the pathogenesis of colorectal cancer (CRC), genome future treatments of CRC. wide expression profiles of 17 pairs of CRC tumor and adjacent tissues, previously published in a DNA microarray study, were Introduction analyzed. Cytoscape, String tools and DAVID tools were used to investigate the biological pathways encoded by the genes Colorectal cancer (CRC) is one of the most commonly diag- identified as being either upregulated or downregulated in nosed cancers worldwide (1). It is the second most prevalent CRC, to determine protein-protein interactions and to identify cancer type among women, following breast cancer, and the potential hub genes associated with CRC. As a result, a total of third most prevalent cancer type among men (2,3). In recent 3,264 genes were identified as being differentially expressed decades, diagnoses and treatment of CRC have significantly in CRC and adjacent tissues, including 1,594 downregulated improved. However, the disease remains a major public health and 1,670 upregulated genes. Furthermore, 306 genes were problem worldwide. In 2016, the American Cancer Society revealed to be clustered in a complex interaction network, estimated 70,820 new male cases and 63,670 new female and the top 20 hub genes in this network were determined by cases of CRC would be diagnosed, and that there would be application of the Matthews Correlation Coefficient algorithm. 49,190 CRC associated deaths in the USA (4). The high rate In addition, the patterns of the expression levels of the 20 hub of mortality of the disease is in part due to limitations in the genes were investigated using reverse transcription‑quantitative currently available therapies for CRC treatment, which is the polymerase chain reaction. Gene Ontology analysis revealed result of an incomplete understanding of the biological mecha- that four of the 20 hub genes encoded small subunit processome nisms underlying the disease. components (UTP3 small subunit processome component; It has been well established that genes have an important UTP14 small subunit processome component; UTP 18 role in tumor pathogenesis, and that the transcriptional small subunit processome component; and UTP20 small activation or inhibition of certain genes serves an important role subunit processome component) and a further four encoded in the development and progression of the majority of human WD repeat domains (WD repeat-containing protein 3, WD tumors (5). The analysis of genome‑wide gene expression repeat domain 12, WD repeat-containing protein 43 and WD levels via DNA microarray experiments is a recently developed repeat‑containing protein 75). In conclusion, the present DNA systematic approach that is used to gain comprehensive microarray study identified genes involved in the pathogenesis information regarding tumor associated gene transcription of CRC. Furthermore, it was revealed that hub genes identified profiles (6). This information can lead to the identification of from among the total identified upregulated and downregulated prognostic biomarkers (7,8), enable the discrimination between genes in CRC encoding subunit processome components and various histological subtypes of tumors (9,10). Interactions between genes can affect the pathogenesis of CRC and tumor progression. The aim of the present study was to identify biologically relevant genes that may be associated with the pathogenesis of CRC, via analysis of genome-wide expression profiles in CRC. Following the identification of Correspondence to: Professor Hai-Xin Qian, Department differentially expressed genes (DEGs), the present study further of General Surgery, The First Affiliated Hospital of Soochow University, 899 Pinghai Road, Suzhou, Jiangsu 215006, P.R. China aimed to determine the hub genes involved in the regulatory E‑mail: [email protected] pathways of CRC progression, by Gene Ontology (GO) analysis, and to subsequently derive a gene interaction network. Key words: colorectal cancer, microarray, hub gene, expression Identification of hub genes in CRC and the development of a gene profiling interaction network may reveal the biological characteristics of CRC tumor development and enable the development of novel molecular‑targeted therapies for the treatment of CRC. SUN and QIAN: SCREENING FOR IMPLICATED GENES IN COLORECTAL CANCER 8261 Materials and methods Hospital Affiliated to Shanghai Jiaotong University School of Medicine (Shanghai, China) between November 2017 and Data collection. The gene expression profiling dataset from December 2017. The inclusion criteria for patients were as previously published relevant study was retrieved from follows: i) colorectal adenocarcinoma confirmed by patho- the NCBI GEO database (accession number GSE32323; logical examination; ii) the patients must have not received a www.ncbi.nlm.nih.gov/geo) (11). Data regarding a total of radical resection prior and must not have been treated with 17 paired samples of CRC tumors and adjacent tissues were preoperative chemoradiotherapy. Characteristics of patients available from the GSE32323 dataset. included in the present study are presented in Table I. Written informed consent was obtained from all patients included in Data processing. Raw RNA expression profile datasets the present study. The Ethics Committee of Renji Hospital were preprocessed using R 3.4.1 statistical software Affiliated to Shanghai Jiaotong University School of Medicine (https://www.r‑project.org/) together with Affy package (12). approved the present study and protocols. CRC tissue samples In accordance with the R‑package, the gcrma algorithm (13) (diameter of 0.5 cm) were obtained via tumor dissection. was used to adjust for background intensities in the Affymtrix Paracancerous tissue (distal tissue 3 cm away from the tumor) array data by including optical noise and non‑specific was also obtained from patients via dissection. Paracancerous binding. The background adjusted probe intensities were then tissues and cancerous tissues were collected as normal tissues converted into expression measures using the normaliza- and tumor tissues, respectively. RNA was extracted from these tion and summarization methods included in the multiarray tissues by TRIzol (Invitrogen; Thermo Fisher Scientific, Inc., average algorithm (12). The k‑Nearest Neighbor algorithm (14) Waltham, MA, USA) for future experiments. was subsequently used to generate the missing values. Reverse transcription‑quantitative polymerase chain reaction Microarray and statistical analysis. The R‑package and (RT‑qPCR). In order to further investigate the results of limma (www.bioconductor.org.uk) (15) were used to inves- the microarray analysis, the 20 hub genes identified to be tigate gene expression microarray data. Limma uses linear associated with CRC were subjected to RT‑qPCR analysis. A models, as well as empirical Bayesian methods, to investigate total of 1 µg of RNA from tumor or normal tissue samples was the differential expression levels in the GSE32323 dataset. reverse transcribed to cDNA using SuperScript™ II reverse Fold‑change (FC) values were calculated using the ratio of transcriptase (Invitrogen; Thermo Fisher Scientific, Inc.; the geometric means of the gene expression levels between heat mixture to 65˚C for 5 min, incubate at 42˚C for 50 min, the two treatment groups (CRC tissue vs. adjacent tissue). A inactivate the reaction by heating at 70˚C for 15 min. RT‑qPCR Wilcoxon Rank‑Sum Test was performed to determine the was performed using cDNA, iQ™ SYBR Green Supermix statistical significance of differences between the groups (Invitrogen; Thermo Fisher Scientific, Inc.) and primers (P<0.05 was considered statistically significant). Genes were (2.5 µM). The reaction was performed using the Bio‑Rad selected for if the log2‑FC values were >1 (equivalent to a RT‑qPCR detection system (Bio‑Rad Laboratories, Inc., FC value of >2.0 or <0.5), and P‑values determined by the Hercules, CA, USA) under the following PCR conditions: Initial Wilcoxon Rank‑Sum Test were P<0.0002. Hierarchical clus- denaturation at 95˚C for 5 min; 35 cycles at 95˚C for 30 sec; tering of the selected genes was performed with R software 60˚C for 30 sec; 72˚C for 30 sec; and final extension at 72˚C using the Euclidean distance and complete linkage method. for 5 min. Expression levels of these genes were normalized Regarding clustering, the expression data were standardized to the house‑keeping gene, β-actin via the 2-ΔΔCq method (19). via conversion to z scores (mean=0, variance=1) for each probe. Primer sequences used were as follows: WD repeat-containing Following this, protein‑protein interactions among the selected protein 3 (WDR3) forward, 5 ' ‑ A A T CCA GCG GGT GAC genes were analyzed using the reference data contained in TAA‑3' and reverse, 5 ' ‑ A C A GGC TGA GGA GTA GGC‑3'; WD the String database of known protein‑protein interactions, repeat‑containing protein 43 (WDR43)