3199

Original Article

Identification of key for esophageal squamous cell carcinoma via integrated bioinformatics analysis and experimental confirmation

Jia Hu, Rongzhen Li, Huikai Miao, Zhesheng Wen

Department of Thoracic Oncology, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China Contributions: (I) Conception and design: J Hu, Z Wen. (II) Administrative support: Z Wen; (III) Provision of study materials or patients: Z Wen; (IV) Collection and assembly of data: J Hu, H Miao, R Li; (V) Data analysis and interpretation: J Hu; (VI) Manuscript writing: All authors. (VII) Final approval of manuscript: All authors. Correspondence to: Zhesheng Wen. Department of Thoracic Oncology, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer center, 651 Dongfengdong, Guangzhou 510060, China. Email: [email protected].

Background: Esophageal squamous cell carcinoma (ESCC) as the main subtype of esophageal cancer (EC) is a leading cause of cancer-related death worldwide. Despite advances in early diagnosis and clinical management, the long-term survival of ESCC patients remains disappointing, due to a lack of full understanding of the molecular mechanisms. Methods: In order to identify the differentially expressed genes (DEGs) in ESCC, the microarray datasets GSE20347 and GSE26886 from Expression Omnibus (GEO) database were analyzed. The enrichment analyses of (GO) terms, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway and Gene Set Enrichment Analysis (GSEA) were performed for the DEGs. The -protein interaction (PPI) network of these DEGs was constructed using the Cytoscape software based on the STRING database to select as hub genes for weighted co-expression network analysis (WGCNA) with ESCC samples from TCGA database. Results: A total of 746 DEGs were commonly shared in the two datasets including 286 upregulated genes and 460 downregulated genes in ESCC. The DEGs were enriched in biological processes such as extracellular matrix organization, proliferation and keratinocyte differentiation, and were enriched in biological pathways such as ECM-receptor interaction and cell cycle. GSEA analysis also indicated the enrichment of upregulated DEGs in cell cycle. The 40 DEGs were selected as hub genes. The MEblack module was found to be enriched in the cell cycle, Spliceosome, DNA replication and Oocyte meiosis. Among the hub genes correlated with MEblack module, GSEA analysis indicated that DEGs of TCGA samples with DLGAP5 upregulation was enriched in cell cycle. Moreover, the highly endogenous expression of DLGAP5 was confirmed in ESCC cells. DLGAP5 knockdown significantly inhibited the proliferation of ESCC cells. Conclusions: DEGs and hub genes such as DLGAP5 from independent datasets in the current study will provide clues to elucidate the molecular mechanisms involved in development and progression of ESCC.

Keywords: Esophageal squamous cell carcinoma (ESCC); differentially expressed genes (DEGs); bioinformatics analysis; proliferation; DLGAP5

Submitted Oct 30, 2019. Accepted for publication Jan 06, 2020. doi: 10.21037/jtd.2020.01.33 View this article at: http://dx.doi.org/10.21037/jtd.2020.01.33

© Journal of Thoracic Disease. All rights reserved. J Thorac Dis 2020;12(6):3188-3199 | http://dx.doi.org/10.21037/jtd.2020.01.33 Journal of Thoracic Disease, Vol 12, No 6 June 2020 3189

Introduction interaction (PPI) networks. Additionally, the correlation between the hub genes and ESCC dataset from the Cancer As one of the most common malignant cancer types, Genome Atlas (TCGA) was analyzed using weighted gene esophageal cancer (EC) ranks the sixth leading cause co-expression network analysis (WGCNA). DLGAP5 was of cancer-related mortality all over the world, with an selected for function confirmation in ESCC cells. DLAGP1 estimated 400,000 deaths annually (1,2). EC contains positively regulated ESCC cells proliferation. two major histologic subtypes: esophageal squamous cell carcinoma (ESCC) and esophageal adenocarcinoma, which are classified based on geographic location and genetic Methods alterations (3). ESCC is the predominant histological Microarray data collection classification worldwide, accounting for about 80% of EC cancer (4). To be specific, ESCC accounts for about 90% of The original microarray data were downloaded from the EC cancer and is the fourth leading cause of cancer-related GEO database (12). The GSE20347 based on GPL571 death in China (5). Current treatments for ESCC include platform (Affymetrix U133A 2.0 Array) chemotherapy, radiation therapy and surgery. Despite contains 17 paired ESCC samples and normal adjacent advances in early diagnosis and clinical management, the esophageal tissues. The GSE26886 based on GPL570 overall 5-year survival rate of ESCC patients remains less platform (Affymetrix Human Genome U133 Plus 2.0 Array) than 25% due to delayed diagnosis at an advanced stage and contains 9 ESCC samples and 19 normal esophageal tissues. lack of effective targeted therapy (6). Development of ESCC We used R, affy package and gcrma package (GC Robust is a multistep process containing a series of genetic and Multi-array Average method) for data processing (13). epigenetic alterations associated with life and environment factors (7). Thus, it is important to fully understand the molecular mechanisms of carcinogenesis process to identify GO and pathway enrichment analysis of DEGs critical targets and develop novel and effective treatments for ESCC. The online tool database for annotation, visualization, Various genes, such as mRNAs and non-coding RNAs integrated discovery (DAVID; https://david.ncifcrf.gov/) including miRNAs and lncRNAs have been reported to containing GO and KEGG pathway analysis was used form complex networks regulating the tumorigenesis and to analyze the biological characteristics and function progression of human cancers. Recently, in order to screen annotation of candidate DEGs (14). GO is a useful differentially expressed genes (DEGs) associated with tool for analyzing characteristic biological information carcinogenesis and progression of human cancer, and to including biological process in the present study (15). identify biomarkers and potential therapeutic targets, more Here, KEGG pathway analysis was also performed to and more microarray and high throughput sequencing analyze the signaling pathways mediated by the DEGs (16). technologies combined with bioinformatics analysis have Additionally, GSEA was selected to determine whether been widely used (8,9). However, despite numerous of DEGs was involved in one phenotype or signaling pathway studies have been performed using high throughput using GSEA software (http://software.broadinstitute.org/ technologies, only very few biomarkers and drug targets gsea/ index.jsp) (17). have been translated into clinical practice, mainly due to false-positive rates in independent microarray analysis, PPI network construction and hub genes screening different technological platforms or small sample size (10,11). In the present study, two original datasets were The Search Tool for Retrieval of Interacting Genes downloaded from Gene Expression Omnibus (GEO) database (STRINGdb: https://string-db.org/) was used to database and analyzed to obtain DEGs in ESCC. A total get the PPI network information (18). Here, the DEGs of 746 DEGs commonly shared by both data sets were were mapped into PPIs using Cytoscape software 3.4.0 selected for further bioinformatics analysis, including gene (http://www.cytoscape.org) and a combined score of >0.4 ontology (GO)/Kyoto Encyclopedia of Genes and Genomes was used as the cut-off value that was considered statistically (KEGG) analysis, Gene Set Enrichment Analysis (GSEA) significant. Then the 40 nodes with edge of >20 were pathway analysis and construction of protein-protein selected as hub genes for further analysis.

© Journal of Thoracic Disease. All rights reserved. J Thorac Dis 2020;12(6):3188-3199 | http://dx.doi.org/10.21037/jtd.2020.01.33 3190 Hu et al. Key genes for esophageal squamous cell carcinoma

WGCNA network construction and key modules immunoreactive bands were visualized using an ECL identification western blotting system (Beyotime, Shanghai, China). The monoclonal mouse anti-β-actin antibody was from Sigma To further confirm the DEGs with a critical role in ESCC, and the polyclonal rabbit anti-DLGAP5 antibody was from ESCC data from TCGA database was downloaded. The abcam. “WGCNA” R package was used to put the DEGs of ESCC samples from TCGA database into modules by average linkage clustering (19,20). Here, the power of β=5 (scale free Proliferation assay R2=0.8) was set as the soft thresholding to ensure a scale- Cells were seeded into 24-well plates at 10,000 cells/well. free network. The average linkage clustering tree based on The cell number was counted every 2 days and the time- the topological overlapping distance of the gene expression cell number curves were plotted. Alternatively, cells were spectrum is constructed. The hierarchical clustering tree seeded into 6-well plates at 1,000 cells/well in triplicates displays the gene classification module, and the final module and incubated to allow colony formation for 7 days. The group is obtained after fusion. The gene tree is visually colonies were stained with crystal violet and then counted. inspected by different module colors. Then, the relevance between each module and the 40 hub genes selected from GEO data was analyzed. The KEGG and GSEA analysis Statistical analysis for selecting module was performed. All data were presented as the mean ± SD. The student’s t test was used to compare the differences among different Cell culture and transfection groups. Statistical analyses were performed using GraphPad Prism 6. P value <0.05 was considered statistically significant. ESCC cell lines including TE-1, KYSE30 and KYSE410, KYSE180 and KYSE520 were cultured in RPMI- 1640 or DMEM medium (Gibco, Carlsbad, CA, Results USA) supplemented with 10% fetal bovine serum. To Identification of DEGs in ESCC establish transfectants with DLGAP5 knockdown, TE-1 and KYSE410 were transfected with psi-LVRU6GP To identify the DEGs those maybe play critical roles in the vectors with DLGAP5 shRNAs (target sequence for development and progression of ESCC, two microarray sh-1#: 5'-GGATATAAGTACTGAAATGAT-3', sh-2#: datasets were collected from the GEO database. In the 5'- GGTATTTCTTGTAAAGTCGAT-3', sh-3#: 5'- GSE20347 dataset concluding ESCC samples and paired CCATATTTCAGAAATATCCTC-3'). The transfection normal adjacent esophageal tissues from 17 patients, 465 was performed using Lipofectamine 3000 (Invitrogen, upregulated and 559 downregulated DEGs were identified Carlsbad, CA, USA) referring to recommendations. using fold change >2 and P<0.05 (Figure 1A). In the other dataset GSE26886 concluding 9 ESCC samples and 19 normal esophageal tissues, 838 upregulated and 947 Western blot downregulated DEGs were identified using fold change Cells were harvested and washed with PBS. Total protein >2 and P<0.05 (Figure 1A). Among these candidate DEGs, was collected with RIPA Lysis Buffer with Protease and a total of 746 DEGs were commonly shared in the two Phosphatase inhibitor. The protein concentration was datasets, including 286 commonly upregulated DEGs measured by BCA Protein Assay Kit. 30 μg of protein and 460 commonly downregulated DEGs shown in the was used for separation by 10% SDS-PAGE gels and Venn diagrams (Figure 1A). Meanwhile, the volcano plots transferred onto 0.2 μm PVDF membranes. The membrane (Figure 1B) and heatmaps (Figure 1C) were used to represent was blocked with 5% non-fat milk in TBS-Tween (TBS-T, the differentially expressed profiles of DEGs within the 0.1% Tween) at 37 for 2 h and incubated overnight datasets. at 4 with the primary antibodies. The membrane was ℃ incubated with horseradish peroxidase (HRP)-conjugated ℃ Enrichment analysis of the candidate DEGs in ESCC secondary antibodies (Sigma-Aldrich) at room temperature for 2 h. Finally, the membranes were washed and the To further gain insights into the function of DEGs, the

© Journal of Thoracic Disease. All rights reserved. J Thorac Dis 2020;12(6):3188-3199 | http://dx.doi.org/10.21037/jtd.2020.01.33 Journal of Thoracic Disease, Vol 12, No 6 June 2020 3191

A GSE20347 GSE20347

179 286 552 99 460 487

GSE26886 GSE26886

B GSE20347 GSE26886 25 25

20 20

15 15

log10 (P-value) 10 log10 (P-value) 10 − −

5 5

0 0 −9 −6 −3 0 3 6 9 −9 −6 −3 0 3 6 9 Log2 (fold change) Log2 (fold change)

C GSE20347 GSE26886

Kind Kind

2.6 2.6 2.4 2.4 2.2 2.2 2 2 1.8 1.8 1.6 1.6 Normal Cancer Normal Cancer

Figure 1 Identification of DEGs in ESCC samples. (A) Venn diagram showed the DEGs commonly shared by GSE20347 and GSE26886 datasets. Left panel represented the upregulated DEGs and right panel represented the downregulated DEGs; (B) the volcano plot indicated the DEGs in GSE20347 and GSE26886 datasets; (C) the samples were hierarchically clustered and the heatmap represented the DEGs in GSE20347 and GSE26886 datasets. DEGs, differentially expressed genes; ESCC, esophageal squamous cell carcinoma.

© Journal of Thoracic Disease. All rights reserved. J Thorac Dis 2020;12(6):3188-3199 | http://dx.doi.org/10.21037/jtd.2020.01.33 3192 Hu et al. Key genes for esophageal squamous cell carcinoma candidate DEGs between ESCC and normal esophageal spectrum is constructed. The hierarchical clustering tree tissues were analyzed using DAVID software for displays the gene classification module, and the final module enrichment. The GO analysis showed that the biological group is obtained after fusion. The gene tree is visually process enrichment terms of upregulated DEGs were inspected by different module colors (Figure S2B). Then, mainly associated with extracellular matrix organization, the relevance between each module and the 40 hub genes collagen biosynthetic and catabolic processes, cell cycle and selected from GEO data was analyzed (Figure 5A). After proliferation (Figure 2A) and the downregulated DEGs were filtering of modules with low quality and relevance lower mainly associated with epidermis development, keratinocyte than 0.5, the MEyellowgreen and MEblack modules met differentiation, oxidation-reduction process and Hippo the requirement (Figure 5A). The MEblack module was signaling (Figure 2B). Then, KEGG analysis of the DEGs selected for further analysis. The biological pathway analysis was performed and the biological pathway analysis indicated indicated that the MEblack module was mainly enriched that the upregulated DEGs were mainly enriched in ECM- in cell cycle, Spliceosome, DNA replication and Oocyte receptor interaction, focal adhesion, PI3K-Akt signaling meiosis (Figure 5B). Among the hub genes correlated with pathway and cell cycle (Figure 3A) and the downregulated MEblack module, GSEA analysis indicated that DEGs of DEGs were mainly enriched in metabolic pathways and TCGA samples with DLGAP5 upregulation was enriched in chemical carcinogenesis (Figure 3B). Furthermore, stratified cell cycle (Figure 5C). GSEA analysis also indicated the enrichment of upregulated DEGs in the cell cycle (Figure 3C). DLGAP5 promotes ESCC cells proliferation

To investigate the biological function of DLGAP5 in ESCC PPI network construction and hub genes screening cells, firstly DLGAP5 expression was analyzed between The PPI network of the DEGs was constructed with ESCC samples from different clinical stages, but DLGAP5 Cytoscape software based on the STRING database. The was not significantly correlated with cancer stage (data not interaction with a combined score of >0.4 were selected to shown). Then, the DLGPA5 protein level was detected construct the PPI network. The PPI network contains 465 in a series of ESCC cell lines including TE-1, KYSE30, nodes and 3,384 edges (Figure 4A). To identify the critical KYSE180, KYSE410 and KYS E520. As indicated DLGAP5 genes involved in ESCC development and progression, protein was endogenously highly expressed in ESCC 40 nodes with edge of >20 were selected as hub genes for cells lines (Figure 6A). Then, DLGAP5 expression was further analysis, among which ESPL1, VAMP8, TIMP2 knocked down using specific shRNAs and sh-1# and sh-2# and RHOA were downregulated genes and others were were selected for further studies because of their high upregulated genes (Figure 4B). efficiency in DLGAP5 knockdown (Figure 6B). MTS assay was performed to examine ESCC cells proliferation, which indicated that DLGAP5 knockdown significantly inhibited Weighted co-expression network construction and key the proliferation of TE-1 and KYSE410 cells (Figure 6C). modules identification The colony formation assay also indicated that DLGAP5 To further investigate the precise expression of DEGs in knockdown markedly inhibited colony formation capacity ESCC, we analyzed the expression pattern of the selected of TE-1 and KYSE410 cells (Figure 6D). These results 40 hub genes in 80 ESCC samples from the TCGA suggested that DLGAP5 expression was correlated with database. After excluding the abnormal samples, under the ESCC cells proliferation. screening conditions of FDR <0.05 and ∣log2FC∣>2, the 40 DEGs were well clustered and were selected for Discussion follow-up analysis (Figure S1). Then, we used “WGCNA” R package to put the DEGs of ESCC samples from TCGA ESCC as the predominant histological subtype of EC, is a database into modules by average linkage clustering. In highly aggressive malignancy. Although great attention has the present study, the power of β=5 (scale free R2=0.8) was been paid to ESCC, the response to treatments is poor and set as the soft thresholding to ensure a scale-free network the clinical outcomes of ESCC are still seriously unfavorable (Figure S2A). The average linkage clustering tree based on (4,6). It has been reported that ESCC is associated with the topological overlapping distance of the gene expression multiple environmental factors including smoking, alcohol

© Journal of Thoracic Disease. All rights reserved. J Thorac Dis 2020;12(6):3188-3199 | http://dx.doi.org/10.21037/jtd.2020.01.33 Journal of Thoracic Disease, Vol 12, No 6 June 2020 3193

A Biological process analysis of upregulated DEGs in ESCC 30 40 Pvalue (−log10) Count 35 25 30 20 25 15 20 15 10 10 5 5 0 0 Cell division Angiogenesis Cell adhesion DNA replication Cell proliferation DNA damage response Regulation of cell growth Collagen catabolic process Regulation of cell migration Regulation of cell proliferation Collagen biosynthetic process Negative regulation of cell cycle Negative regulation Extracellular matrix organization Extracellular matrix disassembly Cellular response to FGF stimulus Cellular response Positive regulation of transcription Positive regulation Cell adhesion mediated by integrin G1/S transition of mitotic cell cycle Positive regulation of cell migration Positive regulation Positive regulation of SMAD protein Positive regulation Negative requlation of cell migration Negative requlation Negative extrinsic apoptotic signalin Positive regulation of cell proliferation Positive regulation Collagen-activated signaling pathway Positive regulation of gene expression Positive regulation Cyclin-dependent protein kinase activity Cyclin-dependent protein Positive regulation of epitheliall proliferation Positive regulation Positive transcription from RNA pl II promote Positive transcription from Transcription from RNA polymerase II promote from Transcription

B Biological process analysis of downregulated DEGs in ESCC 10 45 9 Pvalue (−log10) Count 40 8 35 7 30 6 25 5 4 20 3 15 2 10 1 5 0 0 Keratinization Cell migration Cell activation Hippo signaling Cell-cell adhesion Epidermis development Drug metabolic process Innate immune response Endothelial cell migration Cytoskeleton organization Keratinocyte differentiation Epithelial cell differentiation Oxidation-reduction process Oxidation-reduction Regulation of macroautophagy Extracellular matrix disassembly Positive autophagosome maturation Negative cytokine-mediated signaling Regulation of mitotic spindle assembly Negative regulation of peptidase activity Negative regulation Small GTPase mediated signal transduction Positive regulation of inflammatory response Positive regulation Positive l-kappaB kinase/NF-kappaB signaling

Figure 2 The biological process enrichment terms of DEGs. The GO analysis with the online tool DAVID was used to analyze the biological characteristics and function annotation of upregulated DEGs (A) and downregulated DEGs (B). DEGs, differentially expressed genes; GO, gene ontology; DAVID, database for annotation, visualization, integrated discovery.

© Journal of Thoracic Disease. All rights reserved. J Thorac Dis 2020;12(6):3188-3199 | http://dx.doi.org/10.21037/jtd.2020.01.33 3194 Hu et al. Key genes for esophageal squamous cell carcinoma

A KEGG pathway analysis of upregulated DEGs in ESCC 20 30 18 Pvalue (−log10) Count 16 25 14 20 12 10 15 8 6 10 4 5 2 0 0 Cell cycle TNF signaling Focal adhesion Bladder cancer DNA replication DNA replication Actin cytoskeleton TGF-beta signaling PI3K-AKT signaling Pathways in cancer ECM-receptor inter.. ECM-receptor Proteoglycans cancer Proteoglycans Small cell lung cancer Transcriptional misre.. Transcriptional

B KEGG pathway analysis of downregulated DEGs in ESCC 8 80 7 Pvalue (−log10) Count 70 6 60 5 50 4 40 3 30 2 20 1 10 0 0 Endocytosis Drug metabolism beta-alanine meta.. Metabolic pathways Tyrosine metabollsm Tyrosine Phenylalanine meta.. Histidine metabolism Fatty acid degradation Arachidonic acid meta.. Valine, leucine and isol.. Valine, Xenobiotics metabolism Glycolysis/Gluconeoge.. Chemical carcinogenesis C GSE20347 GSE26886 Enrichment plot: KEGG_CELL_CYCLE Enrichment plot: KEGG_CELL_CYCLE 0.45 0.6 0.40 0.5 0.35 0.30 0.4 0.25 0.3 0.20 0.15 0.2 0.10

0.1 (ES) Enrichment score 0.05 Enrichment score (ES) Enrichment score 0.0 0.00

‘tumor’ (positively correlated) 2 ‘tumor’ (positively correlated) 1 Tumor (positively correlated) 1 Tumor (positively correlated)

0 0 Zero cross at 6615 Zero cross at 6853 −1

−1 (Signal2Noise) (Signal2Noise) Normal (Negatively correlated) Normal (Negatively correlated) Ranked list metric Ranked list metric ‘Normal’ (negatively correlated) −2 ‘Normal’ (negatively correlated) 0 2.500 5.000 7.500 10.000 12.500 0 2.500 5.000 7.500 10.000 12.500 Rank in ordered dataset Rank in ordered dataset Enrichment profile Hits Ranking metric scores Enrichment profile Hits Ranking metric scores

Figure 3 KEGG and GSEA analysis of candidate DEGs. (A,B) KEGG pathway analysis of upregulated DEGs (A) and downregulated DEGs (B) was performed using DAVID database; (C) GSEA enrichment analysis of DEGs was applied using GSEA software. KEGG, kyoto encyclopedia of genes and genomes; GSEA, gene set enrichment analysis; DEGs, differentially expressed genes; DAVID, database for annotation, visualization, integrated discovery.

© Journal of Thoracic Disease. All rights reserved. J Thorac Dis 2020;12(6):3188-3199 | http://dx.doi.org/10.21037/jtd.2020.01.33 Journal of Thoracic Disease, Vol 12, No 6 June 2020 3195

A B FOXM1 20 ESPL1 20 SPP1 21 RAD51AP1 21 LAMC1 21 LAMB1 21 FANCI 21 CXCL1 21 PCNA 21 MCM6 22 GINS2 22 COL5A2 22 COL4A2 22 VAMP8 22 23 COL7A1 23 COL5A1 23 COL4A1 23 CDKN3 23 TIMP1 24 NUSAP1 24 DTL 24 ASPM 25 MMP2 26 DLGAP5 26 CXCL8 26 TTK 27 TPX2 27 COL3A1 27 COL1A2 27 MMP9 28 TOP2A 29 COL1A1 29 RHOA 30 NCAPG 30 RFC4 32 CCNB2 34 UBE2C 35 ITGAV 36 FN1 46

0 10 20 30 40

Figure 4 PPI network construction and hub genes screening. (A) PPI network of DEGs was constructed using Cytoscape software based on STRING database; (B) 40 nodes with edge of >20 in PPI network were selected as hub genes. PPI, protein-protein interaction; DEGs, differentially expressed genes. consumption, diet with pickled vegetables, and exposure analysis (10,11). In the present study, to gain reliable results, to chemical factors such as N-nitroso compounds (21). we first used to two independent datasets to screen DEGs Recent genomic studies have suggested the mutational for ESCCs. After series bioinformatics analysis including genes in ESCCs, including genes involved in tumorigenesis, GO, KEGG, GSEA and PPI network construction, 40 cell cycle, apoptosis and epigenetic modification. Some DEGs were selected as hub genes for WGCNA with of those mutational genes are well-known cancer- ESCCs data from TCGA. Then, MEyellowgreen and associated genes such as TP53, RB1, CDKN2A, PIK3CA MEblack modules were identified. Among the hub genes and NOTCH1, and histone regulator genes such as MLL2, correlated with MEblack module, GSEA analysis indicated SETD1B and EP300, suggesting the important roles of gene that DEGs of TCGA samples with DLGAP5 upregulation mutation and amplification in ESCCs development and was enriched in cell cycle. progression (5,22). Despite intensive studies in molecular DLGAP5 (also known as HURP) belongs to the DLGAP mechanisms and advances in early diagnosis and clinical (discs large-associated protein) family that includes five management, the outcomes of patients with ESCCs are members, termed as DLGAP1-5 (23), which share three still very unsatisfactory. Thus, it is important and urgent key domains: a dynein light chain domain, a 14-amino- to fully understand the molecular mechanisms for ESCCs acid repeat domain and a guanylate kinase-associated development and progress, and then to identify effective protein homology domain (24). DLGAP5 as a - targets and develop novel therapeutic strategies for ESCCs. associated protein has a critical role in spindle assembly, Recently, high-throughput technologies such as fibers stabilization and chromosomal microarray and RNA sequencing, and bioinformatics segregation during mitosis (25,26). The activity of DLGAP5 analysis have been used to identify the DEGs and signaling (HURP) is regulated by Aurora A via phosphorylating pathways involved in the development and progression of the C-terminal domain and then releasing the inhibition ESCCs. However, only very few functional DEGs have on its N-terminal domain binding with microtubule, been confirmed and translated into clinical practice, mainly thus regulating centrosome formation, due to false-positive or negative rates in independent segregation and formation (27). Previous

© Journal of Thoracic Disease. All rights reserved. J Thorac Dis 2020;12(6):3188-3199 | http://dx.doi.org/10.21037/jtd.2020.01.33 3196 Hu et al. Key genes for esophageal squamous cell carcinoma

A Module-trait relationships MEsteelblue MEfloralwhite MEsalmon 1 MEivory MEred MEbrown MEareenyellow MEdarkred MEviole MElightyellow MEdarkolivegreen MEroyalblue Edarkslateblue MEbisque4 MEghtcyan1 0.5 MEgrey60 MEorange MEyellow MEdarkgreen MEmidnightblue MEwhite MEdarkgrey MEpurple MEyellowgreen MEorangered4 MEdarkorange 0 MEdarkturquoise MEpaleturquoise MEgreen MEpink MEskyblue3 MEskyblue MEbrown4 MEplum1 MElightcyan MEcyan MEsienna3 −0.5 MEdarkmagenta MEplum2 MElightsteelblue1 MEsaddlebrown MEthistle2 MEmediumpurple3 MEthistle1 MEtan MEdarkorange2 MElightgreen MEblue −1 MEblack MEmagenta MEgrey TTK DTL FN1 TPX2 SPP1 RFC4 PCNA ITGAV RHOA ASPM FANCI TIMP1 TIMP2 GINS2 MMP9 MMP2 ESPL1 MCM6 CXCL8 CXCL1 TOP2A VAMP8 UBE2C LAMB1 LAMC1 CDKN3 FOXM1 CCNB2 NCAPG COL3A1 COL4A1 COL4A2 COL5A2 COL1A1 COL5A1 COL7A1 COL1A2 DLGAP5 NUSAP1 RAD51AP1 B C

KEGG pathway analysis of black module Enrichment plot: KEGG_CELL_CYCLE 14 25 0.6 P value (−log10) Count 0.5 12 20 0.4 10 0.3 8 15 0.2 0.1 6 10 0.0 4 (ES) Enrichment score 5 2 0 0 1 ‘up_expression_DLGAP5’ (positively correlated) 0.5 Uprequlation DLGAP5 (positively correlated) 0 Zero cross at 6299 Cell cycle −1 Downregulation (Negatively correlated) (Signal2Noise) Spliceosome p53 signaling Ranked list metric

RNA transport −0.5 ‘down_expression_DLGAP5’ (negatively correlated) Oocyte meiosis DNA replication Mismatch repair 0 2.500 5.000 7.500 10.000 Fanconi pathway RNA degradation

oocyte maturation Rank in ordered dataset Herpes si infection Enrichment profile Hits Ranking metric scores Pyrimidine metabolism Progesterone-mediated Progesterone-mediated

Figure 5 Weighted co-expression network construction and key modules identification. (A) Heatmap of the correlation between module eigengenes resulted from analysis of ESCC samples from TCGA database and selected hub genes from GSE20347 and GSE26886 datasets; (B) KEGG pathway analysis of DEGs involved in MEblack module; (C) GSEA enrichment analysis of gene sets was applied for ESCC samples from TCGA database with different DLGAP5 expression. ESCC, esophageal squamous cell carcinoma; TCGA, the cancer genome atlas; KEGG, kyoto encyclopedia of genes and genomes; GSEA, gene set enrichment analysis.

© Journal of Thoracic Disease. All rights reserved. J Thorac Dis 2020;12(6):3188-3199 | http://dx.doi.org/10.21037/jtd.2020.01.33 Journal of Thoracic Disease, Vol 12, No 6 June 2020 3197

A B sh-Con sh-1# sh-2# sh-3# sh-Con sh-# sh-2# sh-3# TE-1 KYSE30 KYSE410 KYSE180 KYSE520 DLGAP5 DLGAP5

β-actin β-actin TE-1 SKYE410

C

10 sh-Con 10 sh-Con sh-1# sh-1# 8 8 ) ) 4 4 sh-2# sh-2# 10 10 × × 6 TE-1 6 SKYE410 *** 4 4 *** ** Cell number ( Cell number ( ** *** 2 2 ** *** ** ** * ** 0 0 0 2 4 6 0 2 4 6 Days Days

D

sh-Con sh-1# sh-2# 500 sh-Con sh-1# TE-1 400 sh-2# 300

200

Colony number **** **** **** **** SKYE410 100

0 TE-1 SKYE410

Figure 6 DLGAP5 promotes ESCC cells proliferation. (A) The DLGAP5 protein levels in ESCC cell lines was detected by western blot assay; (B) the efficiency of shRNAs on DLGAP5 knockdown was measured by western blot assay after transfection; (C,D) the effects of DLGAP5 knockdown on proliferation and colony-forming ability were measured in ESCC cells (crystal violet stain). Student’s t-test, *, P<0.01; **, P<0.01; ***, P<0.001; ****, P<0.0001. ESCC, esophageal squamous cell carcinoma.

study reported that DLGAP5 is highly expressed in the in some cancer types originating from multipotent bone marrow precursor cells but not expressed in the cancer stem cells. Factually, DLGAP5 overexpression peripheral blood monocytes, and that DLGAP5 express has been reported in variety of cancer types such as decreased during the differentiation process (28). hepatocellular carcinoma (29), urinary bladder transitional These data implied that DLGAP5 might be involved cell carcinoma (30), meningioma (31), adrenocortical

© Journal of Thoracic Disease. All rights reserved. J Thorac Dis 2020;12(6):3188-3199 | http://dx.doi.org/10.21037/jtd.2020.01.33 3198 Hu et al. Key genes for esophageal squamous cell carcinoma carcinoma (32,33) and prostate cancer (34). Recently, Ethical Statement: The authors are accountable for all Wang et al. showed that DLGAP5 was highly expressed aspects of the work in ensuring that questions related in aggressive non-small cell lung cancer (NSCLC) and to the accuracy or integrity of any part of the work are negatively correlated with survival. DLGAP5 silence appropriately investigated and resolved. This study resulted into inhibition of NSCLC cell proliferation and was approved by the Ethics Committee of Sun Yat- invasion (35). Here, we confirmed the endogenously high sen University Cancer Center (approval number: GZR expression of DLGAP5 protein in ESCC cells. Although 2018-120). based on DLGAP5 expression we did not get information for prognosis mainly due to small sample size, our Open Access Statement: This is an Open Access article experimental results indicated that DLGAP5 knockdown distributed in accordance with the Creative Commons significantly suppressed ESCC cell proliferation. Of course, Attribution-NonCommercial-NoDerivs 4.0 International we will analyze the role of DLGAP5 in prognosis using License (CC BY-NC-ND 4.0), which permits the non- a larger ESCC patients cohort, investigate its function commercial replication and distribution of the article with in ESCC using in vitro and in vivo models, and explore the strict proviso that no changes or edits are made and the the detailed mechanisms in our further studies. In regard original work is properly cited (including links to both the to the therapeutic point of view, recent study reported a formal publication through the relevant DOI and the license). strong synergistic effect between DLGAP5 knockdown See: https://creativecommons.org/licenses/by-nc-nd/4.0/. and docetaxel in the androgen-sensitive prostate cancer cells (36). Here, our findings also provided strong support References for rational design of novel treatment strategies based on DLGAP5 function inhibition using gene therapy mediated 1. Schweigert M, Dubecz A, Stein HJ. Oesophageal known or inhibitor, combined with chemotherapy or cancer--an overview. Nat Rev Gastroenterol Hepatol radiation. The potent implication also needs further studies. 2013;10:230-44. In summary, here, we screened DEGs that may be 2. Bray F, Ferlay J, Soerjomataram I, et al. Global cancer involved in the development or progression of ESCC using statistics 2018: GLOBOCAN estimates of incidence and two independent GEO database. After bioinformatics mortality worldwide for 36 cancers in 185 countries. CA analysis, potential hub genes from DEGs were correlated Cancer J Clin 2018;68:394-424. with ESCCs from TCGA database using weighted co- 3. Smyth EC, Lagergren J, Fitzgerald RC, et al. Oesophageal expression analysis and DLGAP5 was identified. Preliminary cancer. Nat Rev Dis Primers 2017;3:17048. experiments suggested DLGAP5 promoted ESCC cells 4. Arnold M, Soerjomataram I, Ferlay J, et al. Global incidence of oesophageal cancer by histological subtype in proliferation. However, further studies are required to 2012. Gut 2015;64:381-7. elucidate the roles and mechanisms of DLGAP5 in ESCCs. 5. Song Y, Li L, Ou Y, et al. Identification of genomic alterations in oesophageal squamous cell cancer. Nature Acknowledgments 2014;509:91-5. 6. Pennathur A, Gibson MK, Jobe BA, et al. Oesophageal We would also like to thank all investigators helped in data carcinoma. Lancet 2013;381:400-12. collection and analysis. We are also grateful to all who 7. Hao JJ, Lin DC, Dinh HQ, et al. Spatial intratumoral reviewed and commented on an early draft of the paper. heterogeneity and temporal clonal evolution in esophageal And finally, we would like to thank those who suggested on squamous cell carcinoma. Nat Genet 2016;48:1500-7. ways to enhance the paper and the reviewers. 8. Kulasingam V, Diamandis EP. Strategies for discovering Funding: None. novel cancer biomarkers through utilization of emerging technologies. Nat Clin Pract Oncol 2008;5:588-99. Footnote 9. Zhang L, Yang Y, Cheng L, et al. Identification of Common Genes Refers to Colorectal Carcinogenesis Conflicts of Interest: All authors have completed the with Paired Cancer and Noncancer Samples. Dis Markers ICMJE uniform disclosure form (available at http://dx.doi. 2018;2018:3452739. org/10.21037/jtd.2020.01.33). The authors have no conflicts 10. Ni M, Liu X, Wu J, et al. Identification of Candidate of interest to declare. Biomarkers Correlated With the Pathogenesis and

© Journal of Thoracic Disease. All rights reserved. J Thorac Dis 2020;12(6):3188-3199 | http://dx.doi.org/10.21037/jtd.2020.01.33 Journal of Thoracic Disease, Vol 12, No 6 June 2020 3199

Prognosis of Non-small Cell Lung Cancer via Integrated 26. Ye F, Tan L, Yang Q, et al. HURP regulates chromosome Bioinformatics Analysis. Front Genet 2018;9:469. congression by modulating kinesin Kif18A function. Curr 11. Li L, Lei Q, Zhang S, et al. Screening and identification Biol 2011;21:1584-91. of key biomarkers in hepatocellular carcinoma: Evidence 27. Wong J, Lerrigo R, Jang CY, et al. Aurora A regulates from bioinformatic analysis. Oncol Rep 2017;38:2607-18. the activity of HURP by controlling the accessibility 12. Edgar R, Domrachev M, Lash AE. Gene Expression of its microtubule-binding domain. Mol Biol Cell Omnibus: NCBI gene expression and hybridization array 2008;19:2083-91. data repository. Nucleic Acids Res 2002;30:207-10. 28. Gudmundsson KO, Thorsteinsson L, Sigurjonsson OE, et 13. Yang K, Gao J, Luo M. Identification of key pathways and al. Gene expression analysis of hematopoietic progenitor hub genes in basal-like breast cancer using bioinformatics cells identifies Dlg7 as a potential stem cell gene. Stem analysis. Onco Targets Ther 2019;12:1319-31. Cells 2007;25:1498-506. 14. Dennis G Jr, Sherman BT, Hosack DA, et al. DAVID: 29. Tsou AP, Yang CW, Huang CY, et al. Identification of a novel Database for Annotation, Visualization, and Integrated cell cycle regulated gene, HURP, overexpressed in human Discovery. Genome Biol 2003;4:P3. hepatocellular carcinoma. Oncogene 2003;22:298-307. 15. Torto-Alalibo T, Purwantini E, Lomax J, et al. Genetic 30. Huang YL, Chiu AW, Huan SK, et al. Prognostic resources for advanced biofuel production described with significance of hepatoma-up-regulated protein expression the Gene Ontology. Front Microbiol 2014;5:528. in patients with urinary bladder transitional cell carcinoma. 16. Du J, Yuan Z, Ma Z, et al. KEGG-PATH: Kyoto Anticancer Res 2003;23:2729-33. encyclopedia of genes and genomes-based pathway analysis 31. Stuart JE, Lusis EA, Scheck AC, et al. Identification of using a path analysis model. Mol Biosyst 2014;10:2441-7. gene markers associated with aggressive meningioma by 17. Subramanian A, Kuehn H, Gould J, et al. GSEA-P: a filtering across multiple sets of gene expression arrays. J desktop application for Gene Set Enrichment Analysis. Neuropathol Exp Neurol 2011;70:1-12. Bioinformatics 2007;23:3251-3. 32. de Reyniès A, Assie G, Rickman DS, et al. Gene expression 18. Szklarczyk D, Franceschini A, Wyder S, et al. STRING profiling reveals a new classification of adrenocortical v10: protein-protein interaction networks, integrated over tumors and identifies molecular predictors of malignancy the tree of life. Nucleic Acids Res 2015;43:D447-52. and survival. J Clin Oncol 2009;27:1108-15. 19. Horvath S, Dong J. Geometric interpretation of gene 33. Fragoso MC, Almeida MQ, Mazzuco TL, et al. Combined coexpression network analysis. PLoS Comput Biol expression of BUB1B, DLGAP5, and PINK1 as predictors 2008;4:e1000117. of poor outcome in adrenocortical tumors: validation in 20. Mason MJ, Fan G, Plath K, et al. Signed weighted a Brazilian cohort of adult and pediatric patients. Eur J gene co-expression network analysis of transcriptional Endocrinol 2012;166:61-7. regulation in murine embryonic stem cells. BMC 34. Gomez CR, Kosari F, Munz JM, et al. Prognostic value of Genomics 2009;10:327. discs large homolog 7 transcript levels in prostate cancer. 21. Liu K, Zhao T, Wang J, et al. Etiology, cancer stem cells PLoS One 2013;8:e82833. and potential diagnostic biomarkers for esophageal cancer. 35. Wang Q, Chen Y, Feng H, et al. Prognostic and predictive Cancer Lett 2019;458:21-8. value of HURP in nonsmall cell lung cancer. Oncol Rep 22. Gao YB, Chen ZL, Li JG, et al. Genetic landscape 2018;39:1682-92. of esophageal squamous cell carcinoma. Nat Genet 36. Hewit K, Sandilands E, Martinez RS, et al. A functional 2014;46:1097-102. genomics screen reveals a strong synergistic effect between 23. Rasmussen AH, Rasmussen HB, Silahtaroglu A. The docetaxel and the mitotic gene DLGAP5 that is mediated DLGAP family: neuronal expression, function and role in by the androgen receptor. Cell Death Dis 2018;9:1069. brain disorders. Mol Brain 2017;10:43. 24. Liu J, Liu Z, Zhang X, et al. Examination of the expression and prognostic significance of DLGAPs in gastric cancer Cite this article as: Hu J, Li R, Miao H, Wen Z. Identification using the TCGA database and bioinformatic analysis. Mol of key genes for esophageal squamous cell carcinoma Med Rep 2018;18:5621-9. via integrated bioinformatics analysis and experimental 25. Wong J, Fang G. HURP controls spindle dynamics to confirmation. J Thorac Dis 2020;12(6):3188-3199. doi: promote proper interkinetochore tension and efficient 10.21037/jtd.2020.01.33 kinetochore capture. J Cell Biol 2006;173:879-91.

© Journal of Thoracic Disease. All rights reserved. J Thorac Dis 2020;12(6):3188-3199 | http://dx.doi.org/10.21037/jtd.2020.01.33 Supplementary

Figure S1 Clustering dendrogram of hub genes and ESCC samples from TCGA database. The expression pattern of the selected 40 hub genes in 80 ESCC samples from the TCGA database was analyzed. After excluding the abnormal samples, under the screening conditions of FDR <0.05 and |log2FC| >2, the 40 DEGs were well clustered. ESCC, esophageal squamous cell carcinoma; TCGA, the cancer genome atlas; DEGs, differentially expressed genes. Figure S2 Determination of soft-thresholding power in the weighted gene co-expression network analysis (WGCNA). (A) Analysis of the scale-free fit index for various soft-thresholding powers; (B) dendrogram of the selected hub genes clustered based on a dissimilarity measure with ESCC samples from TCGA database. ESCC, esophageal squamous cell carcinoma; TCGA, the cancer genome atlas.