OncoTargets and Therapy Dovepress open access to scientific and medical research

Open Access Full Text Article ORIGINAL RESEARCH Identification of novel biomarkers and candidate small molecule drugs in non-small-cell by integrated microarray analysis

This article was published in the following Dove Press journal: OncoTargets and Therapy

Qiong Wu1,2,* Background: Non-small-cell lung cancer (NSCLC) remains the leading cause of cancer Bo Zhang1,2,* morbidity and mortality worldwide. In the present study, we identified novel biomarkers Yidan Sun3 associated with the pathogenesis of NSCLC aiming to provide new diagnostic and therapeu- Ran Xu1 tic approaches for NSCLC. Xinyi Hu4 Methods: The microarray datasets of GSE18842, GSE30219, GSE31210, GSE32863 and Shiqi Ren4 GSE40791 from Expression Omnibus database were downloaded. The differential Qianqian Ma5 expressed (DEGs) between NSCLC and normal samples were identified by limma Chen Chen6 package. The construction of –protein interaction (PPI) network, module analysis and Jian Shu7 enrichment analysis were performed using bioinformatics tools. The expression and prog- Fuwei Qi7 nostic values of hub genes were validated by GEPIA database and real-time quantitative fi Ting He7 PCR. Based on these DEGs, the candidate small molecules for NSCLC were identi ed by the CMap database. Wei Wang2 Results: A total of 408 overlapping DEGs including 109 up-regulated and 296 down- Ziheng Wang2 regulated genes were identified; 300 nodes and 1283 interactions were obtained from the 1 Medical School of Nantong University, PPI network. The most significant biological process and pathway enrichment of DEGs were Nantong 226001, People’s Republic of China; 2The Hand Surgery Research Center, response to wounding and cell adhesion molecules, respectively. Six DEGs (PTTG1, TYMS, Department of Hand Surgery, Affiliated ECT2, COL1A1, SPP1 and CDCA5) which significantly up-regulated in NSCLC tissues, Hospital of Nantong University, Nantong 226001, People’s Republic of China; were selected as hub genes according to the results of module analysis. The GEPIA database 3Department of Oncology, First Teaching further confirmed that patients with higher expression levels of these hub genes experienced Hospital of Tianjin University of Traditional fi Chinese Medicine, Tianjin 300193, People’s a shorter overall survival. Additionally, CMap predicted the 20 most signi cant small Republic of China; 4Department of molecules as potential therapeutic drugs for NSCLC. DL-thiorphan was the most promising Biochemistry & Molecular Biology, Nantong small molecule to reverse the NSCLC gene expression. University, Nantong, Jiangsu 226001, People’s Republic of China; 5Emergency Office, Wuxi Conclusions: Based on the gene expression profiles of 696 NSCLC samples and 237 Center for Disease Control and Prevention, normal samples, we first revealed that PTTG1, TYMS, ECT2, COL1A1, SPP1 and Wuxi 214023, People’s Republic of China; 6Department of Oncology, The Affiliated CDCA5 could act as the promising novel diagnostic and therapeutic targets for NSCLC. Cancer Hospital of Nanjing Medical University, Our work will contribute to clarifying the molecular mechanisms of NSCLC initiation and Nanjing, People’s Republic of China; 7The First People‘s Hospital of Taicang City, Taicang progression. Affiliated Hospital of Soochow University, Keywords: non-small-cell lung cancer, novel biomarkers, candidate small molecules, Suzhou 215400, People’sRepublicofChina prognosis, bioinformatics analysis *These authors contributed equally to this work

Correspondence: Ziheng Wang; Wei Wang The Hand Surgery Research Center, Introduction Department of Hand Surgery, Affiliated Lung cancer remains the leading cause of cancer morbidity and mortality worldwide. In Hospital of Nantong University, Nantong 226001, People’s Republic of China 2018, there are 234,030 newly diagnosed lung cancer patients, accounting for 13.5% of Tel +86 159 6297 0534; +86 136 1521 3504 all types of malignant tumors. In addition, lung cancer results in approximately 154,050 Email [email protected]; [email protected] death cases each year, accounting for 25.3% of all cancer-related deaths; 80–85% of all submit your manuscript | www.dovepress.com OncoTargets and Therapy 2019:12 3545–3563 3545 DovePress © 2019 Wu et al. This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution – Non Commercial (unported, v3.0) License (http://creativecommons.org/licenses/by-nc/3.0/). By accessing the work http://doi.org/10.2147/OTT.S198621 you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms (https://www.dovepress.com/terms.php). Wu et al Dovepress lung cancer patients are diagnosed with non-small-cell lung basedonGPL9948(AgilentHuman0.6KmiRNA cancer (NSCLC) subtype and 80% lung cancer-associated Microarray G4471A; Agilent Technologies, Santa Clara, deaths are caused by NSCLC. The subtypes of NSCLC CA, USA) (GSE32863) and GPL570 (Affymetrix Human mainly consist of lung adenocarcinoma, lung squamous cell Genome U113 Plus 2.0 Array) (GSE18842, GSE30219, carcinoma and large cell lung cancer based on histological GSE31210 and GSE40791). A total of 696 NSCLC samples sub-classification.1–4 Although great advance has been made and 237 normal samples were included in our study, of which in the therapeutic methods for NSCLC such as surgical 46 tumor samples and 45 normal samples were in GSE18842 resection, chemotherapy, radiotherapy and targeted therapy, profile, 272 tumor samples and 14 normal samples in the patients’ prognosis is still far from ideal, with 5-year GSE30219 profile, 226 tumor samples and 20 normal samples survival rate less than 20%. For advanced patients who are in GSE31210 profile, 58 tumor samples and 58 normal sam- inoperable, chemotherapy such as platinum still remains the ples in GSE32863 profile, and 94 tumor samples and 100 most ideal and important treatment strategy for NSCLC.5–7 normal samples in GSE40791 profile. The poor long-term survival rate of NSCLC patients is mainly attributed to the lack of specific symptoms and effec- Screeningfor DEGs tive diagnostic methods at an early stage. In additionally, The matrix data of each dataset was performed log2 con- high metastasis rate and drug resistance are also vital factors version and normalization using limma package of R/ 8–10 that can not be ignored. In recent years, with our Bioconductor software.14 The limma package was also increased understanding of molecular characterization of utilized to screen and identify the DEGs between NSCLC, molecular targeting therapies especially individua- NSCLC samples and normal tissue sample. Adjust P- lized precision treatment have undergone remarkable value <0.05 and |log2FC| >1 were considered the statistical 11,12 developments. Despite the prominent progress in the significance of differential expression. molecular diagnosis and treatment for NSCLC, substantive 13 breakthroughs have not yet been made in patients’ survival. Functional enrichment analysis Therefore, there is still an urgent demand to identify the GO and Kyoto Encyclopedia of Genes and Genomes novel biomarkers correlated with NSCLC diagnosis and (KEGG) pathway enrichment analysis were performed to prognosis to elucidate the precise molecular mechanism of determine the biological functions of the overlapping NSCLC occurrence and progression. In this study, the micro- DEGs. GO enrichment analysis is an extensively used array data of GSE18842, GSE30219, GSE31210, GSE32863 method to investigate the molecular function (MF), cell and GSE40791 from Gene Expression Omnibus (GEO) data- component (CC) and biological process (BP) of genes or base was used to identify the differential expressed genes gene products. KEGG is a widely used database for systema- (DEGs) between NSCLC and adjacent normal tissues. Gene tic analysis of high-level gene functions. In this study, we Ontology (GO) and pathway enrichment analysis were per- carried out GO function and KEGG pathway enrichment formed to better understand the biological functions of these analysis based on the platform of Database for Annotation DEGs. We also established a protein–protein interaction Visualization and Integrated Discovery (DAVID, http:// (PPI) network associated the DEGs. Furthermore, we also david.ncifcrf.gov), an online database rich in comprehensive identified potential candidate small molecules for a better annotation information of gene and protein functions. P- treatment of NSCLC. Six novel biomarkers were found to – value <0.05 was considered statistically significant.15 19 be related to the pathogenesis and prognosis of NSCLC. In summary, this study aimed to exploit promising novel bio- – markers for NSCLC diagnosis, prognosis and molecular Protein protein interaction (PPI) targeting therapies from new insights. Figure 1 shows the network construction and module workflow of our study. analysis We used the online database STRING (Search Tool for the Materials and methods Retrieval of Interacting Genes, https://string-db.org/) to better Data resources illustrate the potential interactive relationships among the Series matrix files of GSE18842, GSE30219, GSE31210, overlapping DEGs.20 Then the Cytoscape software was uti- GSE32863 and GSE40791 were downloaded from GEO data- lized for analyzing the interactions with a combined score >0.4 base (http://www.ncbi.nlm.nih.gov/geo/). The platforms were (http://www.cytoscape.org/).21 Finally, the plug-in MCODE

3546 submit your manuscript | www.dovepress.com OncoTargets and Therapy 2019:12 DovePress Dovepress Wu et al

Five microarray datasets downloaded from GEO database

Screening for DEGs

GO function and KEGG pathway enrichment analysis for DEGs

PPI network construction and module analysis

Identification of hub genes

Survival analysis of hub genes

Validation by RT-qPCR

Identification of candidate small mole- cule drugs

Figure 1 The workflow of this study.

(Molecular Complex Detection) was used to filter the signifi- Survival analysis and validation of hub cant modules from the PPI network for the selection of hub genes genes (degree cutoff = 2, node score cutoff = 0.2, k-core = 2, To further explore the roles of module genes in the NSCLC and max. depth = 100).22 We also performed functional and occurrence and development, we predicted the co-expression pathway enrichment analysis for the genes in the significant genes of module genes and the co-expression network was modules. The heat map of module genes was constructed constructed by cBioPortal online platform (http://www.cbio using UCSC Cancer Genomics Browser (http://genome-can portal.org).24,25 The Gene Expression Profiling Interactive cer.ucsc.edu). The Networks Gene Oncology tool (BiNGO), a Analysis (GEPIA) database was utilized to assess the impact plugin in Cytoscape, was used to explore and visualize the BP of hub genes on the patients’ prognosis.26 The NSCLC patients of the selected hub genes.23 were divided into high expression and low expression groups

OncoTargets and Therapy 2019:12 submit your manuscript | www.dovepress.com 3547 DovePress Wu et al Dovepress according to the median expression levels of hub genes. The CDCA5 gene 5ʹ-TTTTCAGTTCCGTGGGTTTC-3ʹ hazard ratio (HR) with 95% CI of overall survival was calcu- (sense) and 5ʹ-CCCAACTAAGGCTCCCTACAT-3ʹ (anti- lated for each group. And the GEPIA platform was applied to sense). further verify the expression level of hub genes between TYMS gene 5ʹ-AGCGAGAACCCAGACCTT-3ʹ (sense) NSCLC and normal samples. We analyzed the protein expres- and 5ʹ-AATAGTTGGATGCGGATTGTA-3ʹ (anti-sense). sion of hub genes by using the human protein atlas (HPA, ECT2 gene 5ʹ-AGGCGGAATGAACAGGA-3ʹ (sense) www.proteinatlas.org) database considering that gene expres- and 5ʹ-TTCATCTCCAAGCGGTAAA-3ʹ (anti-sense) sion was not always consistent with its protein level.27 COL1A1 gene 5ʹ-CAAGGTGTTGTGCGATGACG-3ʹ (sense) and 5ʹ-CGACGCCGGTGGTTTCTT-3ʹ (anti-sense) fi SPP1 gene 5ʹ-CTGCCAGCAACCGAAGT-3ʹ (sense) Identi cation of small molecules and 5ʹ-GTGATGTCCTCGTCTGTAGC-3ʹ (anti-sense); The CMap database (http://www.broadinstitute.org/cmap/) All reactions were performed on the Eppendorf was used to explore potential small molecule drugs for use Mastercycler ep realplex (2S; Eppendorf, Hamburg, in patients based on the gene signature of NSCLC. CMap Germany). using following cycling parameters, 95°C for 2 collects >7,000 gene expression profile changes induced min, followed by 40 cycles of 95°C for 15 s, 60°C for 45 s. by various small molecular agents.28 The overlapping dif- ferently expressed probesets among five datasets were classified into up-regulated and down-regulated groups. Ethics statement Then, these probesets from the two groups were uploaded This study was performed with the approval of the institu- fi into CMap database to match corresponding active small tional ethics committee of the Af liated Hospital of molecules. Finally, the enrichment scores between −1 and Nantong University. And written informed consent had 1, which represent similarity, were calculated. A positive been provided for the NSCLC patients included in the connectivity score (closer to +1) indicated the correspond- present study, which was conducted in accordance with ing small molecule is able to induce the state of NSCLC the Declaration of Helsinki. cells, whereas a negative connectivity score (closer to −1) demonstrate greater similarity between the genes. We Results investigated and calculated negative connectivity scores Identification of DEGs in NSCLC with potential therapeutic value. After integrated bioinformatical analysis for GSE18842, GSE30219, GSE31210, GSE32863 and GSE40791 data- Real-time quantitative PCR sets, a total of 408 overlapping genes were found to be Total RNA from tumor tissues and non-tumorous tissues differentially expressed. The volcano plot showed the up- was extracted with Trizol reagent (Invitrogen, Carlsbad, regulated and down-regulated DEGs in each dataset with CA, USA) according to the protocol. cDNA was synthe- the cutoff criterion of P<0.05 and |log2FC| >1. The Venn sized using an Omniscript Reverse Transcription kit diagrams showed the 408 overlap DEGs among the three fi (Qiagen, Valencia, CA, USA). Quantitative real-time datasets (Figure 2Ba) including 109 signi cantly up-regu- PCR (qPCR) assays were performed using EvaGreen lated genes (Figure 2Bb) and 296 down-regulated genes Master Mix (Biotium Inc., Hayward, CA, USA). The (Figure 2Bc). conditions for qPCR amplification were as follows: 95°C for 120 s followed by 40 cycles of 95°C for 15 s, anneal- Enrichment analyses ing temperature for 45 s. Each sample was run in triplicate. In order to investigate the biological functions of these Relative expression level for each target gene was normal- DEGs in NSCLC, GO and KEGG pathway, enrichment ized by the Ct value of GAPDH (endogenous reference) analysis was performed using DAVID. For BPs, GO ana- −ΔΔ using a 2 Ct relative quantification method. The primers lysis results indicated that up-regulated and down-regu- are as follows: lated DEGs were significantly enriched in response to PTTG1 gene 5ʹ-GACTCAGGCTGGAAGATTTG-3ʹ wounding, negative regulation of cell proliferation, regula- (sense) and 5ʹ- GGGAAGGTGGGAGAAGC-3ʹ (anti-sense). tion of cell proliferation, cell adhesion and response to

3548 submit your manuscript | www.dovepress.com OncoTargets and Therapy 2019:12 DovePress Dovepress Wu et al

GSE32863 GSE40791 A 60.52 87.94

48.42 70.35

36.31 -value) 52.76 P -value) P

24.21 35.18 -log 10(-log -log 10(-log

12.1 17.59

0 0 -4.79 -3.43 -2.06 -0.7 2.03 3.39 -6.6 -4.38 -2.16 0.05 4.49 6.71 0.66 2.27 log2(FoldChange) log2(FoldChange) GSE18842 GSE30219 GSE31210

46.89 57.71 35.17

37.51 46.17 28.14

28.13 34.63 21.1 -value) -value) -value) P P P

18.76 23.08 14.07 -log 10(-log -log 10(-log -log 10(-log

9.38 11.54 7.03

0 0 0 -6.58 -4.51 -2.44 -0.37 1.71 3.78 -5.88 -4.17 -2.45 -0.74 0.98 2.69 -6.8 -4.72 -2.64 -0.56 1.53 3.61

5.85 log2(FoldChange) 4.41 log2(FoldChange) 5.69 log2(FoldChange) B a

185

410 24 2 600 45 14 426 7 434 255 39 555 30 148 138 408 63 20 53 8 18 34 54 28 72 98 152 802 336 1116 b

43 c 156

204 20 233 121 19 79 406 165 5 246 249 19 6 189 413 31 361 416 17 25 13 18 23 122 32 25 4 3 13 109 5 296 105 5 120 2 38 15 73 3 55 15 0 1 22 7 159 47 185 23 1 31 4 11 693 40 505 22 1037 189

Figure 2 (A) Volcano plot of gene expression profile data between NSCLC and normal tissues in each dataset. Red dots: significantly up-regulated genes in NSCLC; green dots: significantly down-regulated genes in NSCLC; black dots: non-differentially expressed genes. Adj. P<0.01 and |log2 FC|>1 were considered as significant. (Ba) Venn diagram of 408 overlapping DEGs from GSE18842, GSE30219, GSE31210, GSE32863 and GSE40791 datasets. (Bb) Up-regulated DEGs (Bc) Down-regulated DEGs.

OncoTargets and Therapy 2019:12 submit your manuscript | www.dovepress.com 3549 DovePress Wu et al Dovepress steroid hormone stimulus. CC analysis showed that these genes were mainly enriched in DNA replication, cell cycle DEGs were particularly involved in extracellular region and oocyte meiosis (Figure 5A). The genes in module 2 were part, extracellular region, extracellular space, proteinac- mainly enriched in tumor necrosis factor signaling pathway, eous extracellular matrix and extracellular matrix. CAMs and African trypanosomiasis (Figure 5B). The genes in Similarly, changes in MF of DEGs were significantly module 3 were significantly enriched in PPAR signaling path- enriched in carbohydrate binding, growth factor binding, way, ECM-receptor interaction and Focal adhesion (Figure glycosaminoglycan binding, transforming growth factor 5C) (Table 2). The heat map clearly showed the significant beta binding and pattern binding. Furthermore, KEGG difference of module genes between cancer tissues and adja- pathway enrichment analysis revealed that these DEGs cent tissues(Figure 6A and B). Macromolecular complex sub- were mainly enriched in cell adhesion molecules unit organization, S phase and cell cycle phase were the main (CAMs), leukocyte transendothelial migration, TGF-beta BP of module genes (Figure 6C). Among the module genes, signaling pathway, complement and coagulation cascades we selected PTTG1, TYMS, ECT2, COL1A1, SPP1 and and ECM-receptor interaction (Figure 3 and Table 1). CDCA5 with high degree of connectivity as hub genes (Table 3). The expression of hub genes in NSCLC tissues PPI network construction and module was significantly up-regulated compared to normal tissues. analysis The STRING database and Cytoscape were used to construct a Analysis and validation of hub genes PPI network of the potential interactions between the over- The prognostic information of PTTG1, TYMS, ECT2, lapping DEGs. As presented in Figure 4, there were 300 nodes COL1A1, SPP1 and CDCA5 was freely obtained from and 1283 interactions found in the network. The top three GEPIA database. A total of 962 NSCLC patients were significant modules were detected by MCODE (Figure 5). available for survival analysis. It was found that the Pathway enrichment analysis suggested that the module1 high expression level of PTTG1, TYMS, ECT2,

AB

GO:0048545~response to steroid GO:0044421~extracellular hormone stimulus region part

GO:0042127~regulation of GO:0031012~extracellular matrix cell proliferation -Log 10(P-value) -Log 10(P-value) 8.50 17.5 8.25 15.0 GO:0009611~response 8.00 GO:0005615~extracellular space 12.5 7.75 10.0 to wounding Count Count 30 40 40 60 GO:0005578~proteinaceous 80 GO:0008285~negative regulation of 100 cell proliferation extracellular matrix

GO:0007155~cell adhesion GO:0005576~extracellular region

0.06 0.08 0.10 0.12 0.10 0.15 0.20 0.25 richFactor richFactor C D

GO:0050431~transforming growth hsa04670:Leukocyte transendothelial factor beta binding migration

-Log 10(P-value) GO:0030246~carbohydrate binding -Log 10(P-value) hsa04610:Complement and coagulation cascades 3.0 6.4 2.5 6.0 2.0 5.6 GO:0019838~growth factor binding hsa04514:Cell adhesion 1.5 Count Count 10 molecules (CAMs) 7 15 8 20 9 25 10 GO:0005539~glycosaminoglycan binding hsa04512:ECM−receptor interaction 11 12 13

GO:0001871~pattern binding hsa04350:TGF−beta signaling pathway

0.02 0.03 0.04 0.05 0.06 0.07 0.020 0.024 0.028 0.032 richFactor richFactor

Figure 3 Functional and signaling pathway analysis of the overlapped DEGs in NSCLC. (A) Biological processes, (B) cellular components, (C) molecular function and (D) KEGG pathway.

3550 submit your manuscript | www.dovepress.com OncoTargets and Therapy 2019:12 DovePress Dovepress Wu et al

Table 1. Functional and pathway enrichment analysis of the overlap DEGs

Category Term Count PValue

GOTERM_BP_FAT GO:0009611~response to wounding 39 3.00E-09 GOTERM_BP_FAT GO:0008285~negative regulation of cell proliferation 31 5.60E-09 GOTERM_BP_FAT GO:0048545~response to steroid hormone stimulus 22 1.08E-08 GOTERM_BP_FAT GO:0042127~regulation of cell proliferation 48 1.34E-08 GOTERM_BP_FAT GO:0007155~cell adhesion 44 2.70E-08 GOTERM_CC_FAT GO:0044421~extracellular region part 78 5.55E-20 GOTERM_CC_FAT GO:0005576~extracellular region 102 5.06E-12 GOTERM_CC_FAT GO:0005615~extracellular space 52 5.87E-12 GOTERM_CC_FAT GO:0005578~proteinaceous extracellular matrix 32 2.19E-10 GOTERM_CC_FAT GO:0031012~extracellular matrix 33 3.36E-10 GOTERM_MF_FAT GO:0030246~carbohydrate binding 28 1.64E-07 GOTERM_MF_FAT GO:0019838~growth factor binding 14 1.59E-06 GOTERM_MF_FAT GO:0005539~glycosaminoglycan binding 16 1.64E-06 GOTERM_MF_FAT GO:0050431~transforming growth factor beta binding 6 1.92E-06 GOTERM_MF_FAT GO:0001871~pattern binding 16 5.41E-06 KEGG_PATHWAY hsa04514:Cell adhesion molecules (CAMs) 13 6.98E-04 KEGG_PATHWAY hsa04670:Leukocyte transendothelial migration 10 0.010253 KEGG_PATHWAY hsa04350:TGF-beta signaling pathway 8 0.017312 KEGG_PATHWAY hsa04610:Complement and coagulation cascades 7 0.019019 KEGG_PATHWAY hsa04512:ECM-receptor interaction 7 0.044383

COL1A1, SPP1 and CDCA5 was markedly associated and Figure 9B. Among these small molecules, DL-thior- with worse overall survival for NSCLC patients phan (enrichment score = −0.826) and phenoxybenza- (Figures 7and 8A). This finding further confirmed the mine (enrichment score = −0.823) showed a highly key role of these hub genes in the onset of NSCLC. significant negative correlation and have the potential to Based on the immunohistochemical staining results reverse the tumoral status of NSCLC. This analysis pro- from HPA database, the protein expression level of vided novel insights into the treatment of NSCLC. PTTG1, TYMS, ECT2, COL1A1, SPP1 and CDCA5 However, further studies were still needed to explore was consistent with their gene expression, that is, the the molecular mechanism of these small molecules in protein levels of hub genes were also in a higher NSCLC. To futher investigate the moleuclar mechinism expression state in NSCLC tissues compared to normal of the hub genes in NSCLCS, we predicted potential tissues (Figure 7B). In addition, we established a net- transcription factors (Figure S1) and constructed a regu- work of module genes and their co-expression genes latory network of lncRNA, miRNA and mRNA (Figure (Figure9A).Insummary,PTTG1,TYMS,ECT2, S2) by Gene-Cloud Biotechnology Information (GCBI) COL1A1, SPP1 and CDCA5 could represent the database. important diagnostic and prognostic biomarkers for NSCLC. Evaluation of gene expression in NSCLC fi To further verify the expression of PTTG1, CDCA5, Identi cation of related active small TYMS, ECT2, COL1A1 and SPP1 genes in NSCLC tis- molecules sues and corresponding adjacent normal tissues, we choose To identify candidate small molecule drugs targeting the seven pairs of tumor tissues and corresponding adjacent gene expression of NSCLC, all the overlapping DEGs, tissues. Relative expression of PTTG1, CDCA5, TYMS, which were divided into up-regulated and down-regu- ECT2, COL1A1 and SPP1 mRNA in NSCLC and adjacent lated groups, were submitted to the CMap database. The non-tumorous tissues were quantified by qPCR. The 20 most significant small molecules matched to the results showed that the average PTTG1, CDCA5, TYMS, NSCLC gene expression changes are listed in Table 4 ECT2, COL1A1 and SPP1 mRNA expression level in

OncoTargets and Therapy 2019:12 submit your manuscript | www.dovepress.com 3551 DovePress Wu et al Dovepress

Figure 4 The protein–protein interaction networks of overlapping DEGs.

NSCLC tissues was significantly higher compared with GEO database, as a public repository for archiving high- non-tumorous tissues (*p<0.05, compared with adjacent throughput microarray experimental data, has provided the non-tumorous tissues, Figure 9C). powerful tools to determine key genes and pathways associated with the pathogenesis of tumors.29,30 In the Discussion present study, based on the GEO database, five gene Recently, the rapid advance in microarray and high- expression profiles including 696 NSCLC samples and throughput technologies has expanded the application bio- 237 normal samples were integrated for a comprehensive medicine in clinical practice, such as cancer early diagnosis, bioinformatics analysis. The aim of our study was to find novel targeted drug discovery and prognosis prediction. the potential small molecule drugs for the treatment of

3552 submit your manuscript | www.dovepress.com OncoTargets and Therapy 2019:12 DovePress Dovepress Wu et al

Module pathway ID pathway description

Module1 3030 DNA replication

Module1 4110 Cell cycle

Module1 4114 Oocyte meiosis

Module pathway ID pathway description

Module2 4668 TNF signaling pathway

Module2 4514 Cell adhesion molecules (CAMs)

Module2 5143 African trypanosomiasis

Module pathway ID pathway description

Module3 3320 PPAR signaling pathway

Module3 4512 ECM-receptor interaction

Module3 4510 Focal adhesion

Figure 5 The three most significant modules extracted from PPI network and KEGG pathway analysis of module genes.

NSCLC and to identify the novel biomarkers correlated overlapping DEGs, the GO function and KEGG pathway with the pathogenesis and prognosis of NSCLC. A total of enrichment for these DEGs were performed. GO term 408 overlapping DEGs between tumor tissues and corre- analysis was carried out via the following aspects: BP, sponding adjacent normal tissues were identified, which MF and CC. The BP analysis showed that these DEGs consisted of 109 up-regulated genes and 296 down-regu- were mainly enriched in response to wounding, negative lated genes. For a better in-depth understanding of these regulation of cell proliferation and regulation of cell

OncoTargets and Therapy 2019:12 submit your manuscript | www.dovepress.com 3553 DovePress Wu et al Dovepress

Table 2 The pathway enrichment analysis of module genes

Module pathway ID pathway description observed gene count false discovery rate

Module1 hsa3030 DNA replication 3 0.00264 Module1 hsa4110 Cell cycle 4 0.00264 Module1 hsa4114 Oocyte meiosis 3 0.0336 Module2 hsa4668 TNF signaling pathway 3 0.00213 Module2 hsa4514 Cell adhesion molecules (CAMs) 3 0.00225 Module2 hsa5143 African trypanosomiasis 2 0.00672 Module3 hsa3320 PPAR signaling pathway 3 0.00117 Module3 hsa4512 ECM-receptor interaction 3 0.00117 Module3 hsa4510 Focal adhesion 3 0.0103

proliferation. MF analysis indicated that these DEGs were levels of PTTG1, TYMS, ECT2, COL1A1, SPP1 and significantly associated with carbohydrate binding, growth CDCA5 experienced a worse prognosis than those with factor binding and glycosaminoglycan binding. Changes in low expression. To validate the results of bioinformatics CC were mainly enriched in extracellular region part, analysis, we performed qPCR analysis to evaluate the extracellular region and extracellular space. The KEGG expression of hub genes expression in seven paired pathway enrichment analysis revealed nine significant sig- NSCLC tissues. The qPCR analysis showed the same naling pathways including CAMs, leukocyte transendothe- gene expression trend as found in the GEO database, lial migration, TGF-beta signaling pathway, complement thereby verifying the reliability of our results. and coagulation cascades and ECM-receptor interaction. Additionally, the establishment of a network of lncRNA– Multiple CAM are involved in the tumor growth, metas- miRNA–mRNA and the prediction of transcription factors tasis and angiogenesis. Vascular CAM-1 was first noted as will enhance our understanding of the mechanism of an endothelial cell adhesion receptor for more than two PTTG1, TYMS, ECT2, COL1A1, SPP1 and CDCA5 in decades, which plays a key role in leukocyte recruitment the pathogenesis of NSCLC. Previous studies have proved in cellular immune responses. The L1 cell adhesion mole- that PTTG1 protein is abundantly expressed in various cule (L1CAM), as neural adhesion molecules, extensively invasive tumors and hematopoietic malignant tumors, but participates in the progression of human malignant its expression level is low or undetectable in most normal tumors.31,32 Targeting the TGFβ pathway has been used tissues. Several studies have further emphasized the role of for various cancer therapy.33,34 An increasing number of PTTG1 in the growth and metastasis of tumors. They have studies reveal that the ECM-receptor interaction pathway shown that ectopic expression of PTTG1 enhances prolif- is significantly associated with the various cancer cells eration or invasiveness in various histologically derived proliferation and invasion. Zhang et al demonstrated that cancer cell lines, whereas silencing of PTTG1 produces Twist2 is involved in the proliferation and invasion of the opposite result.36,37 CDCA5 plays a key role in ensur- kidney cancer cells through regulating the expression of ing the accurate separation of sister chromatids in S and two molecules in the ECM-receptor interaction pathway: G2/M phases of cell cycle by interacting with coherents ITGA6 and CD44.35 In summary, the above theories and cdk1. Additionally, CDCA5 also interacts with the key strongly supported our findings from bioinformatics ana- regulatory factors ERK and cyclin E1 of G1/Smitotic lysis. The PPI network complex based on DEGs-encoding checkpoint. Recent studies have shown that the expression was constructed and 300 nodes with 1283 inter- of CDCA5 in oral squamous cell carcinoma, urothelial cell actions were obtained. The MCODE plug-in extracted carcinoma and gastric cancer, which is related to tumor- three modules with the most significant degree from the igenesis and tissue invasion.38,39 ECT2 is a guanine PPI network. TTG1, TYMS, ECT2, COL1A1, SPP1 and nucleotide exchange factor (GEF), which is related to CDCA5 with high degree of connectivity were selected as tumor cell differentiation, TNM stage, prognosis and hub genes. Survival analysis for 962 NSCLC patients from lymph node metastasis, such as breast cancer, osteosar- GEPIA database showed that patients with high expression coma cells, gastric cancer, and gliomas.40,41 COL1A1 is a

3554 submit your manuscript | www.dovepress.com OncoTargets and Therapy 2019:12 DovePress Dovepress Wu et al

A

CAV1 LDLR TIMP1 DCN VWF ENG 6 COL1A1 SPP1 PLA2G1B SELE IL6 EDN1 7.5 PPARG LPL CD36 CXCL12 CD34 CDH5 TOP2A 9 TPX2 Expression CENPF

TK1 Gene MCM2 MCM4 FEN1 10.5 ECT2 AURKB BIRC5 NEK2 MELK ASPM 12 CDCA8 CDCA5 CCNB2 KIF20A STIL UBE2T KIAA0101 UBE2C AURKA CCNF PTTG1 B TYMS NUSAP 1 LDLR CAV1 VWF TIMP1ENG

DCNCOL1A1 6 SPP1PLA2G1B MCM2 TOP2A MCM4TPX2 TK1 CENPFECT2 CCNF 7.5 KIAA0101

STILAURKB NEK2 MELK TYMS UBE2T 9 NUSAP1 PTTG1 AURKA Expression FEN1CDCA8 KIF20A

ASPMBIRC5 Gene CCNB2 10.5 SELE

UBE2CCD34 CD36 CDCA5 IL6 CXCL12 12 PPARG CDH5 LPL EDN1

double-strand break repair via C break-induced replication

nuclear replication fork replication fork double-strand break repair via homologous recombination replication fork protection complex

intracellular

double-strand break repair protein complex protein-DNA complex assembly macromolecular complex assembly nuclear part chromosomal part pre-replicative complex assembly cellular component assembly cell part recombinational repair MCM complex intracellular part cellular macromolecular complex cell assembly response to DNA damage stimulus nuclear chromosome macromolecular complex subunit response to stress macromolecular complex chromosome DNA-dependent DNA reDplNicAatisotnrand elongation involved in DNA organization protein-DNA complex replication cellular component biogenesis DNA repair intracellular organelle cellular response to stress DNA recombination cellular component organization cellular_compionntreanctellular organelle part mitotic recombination nucleoplasm part nuclear part DNA replication preinitiation complex intracellular non-membrane-bounded response to stimulus organelle DNA synthesis involvedDNA-dependent in DNA DNA repair replication base-excision repair cellular macromolecular complex initiation nucleus subunit organization gene conversion at mating-type locus, DNA strand elongation pre-replicative complex organelle DNA repair synthesis nucleoplasm intracellular membranem-beomubnrdaende-bounded organelle cellular response to stimulus membrane-enclosed lumoerganelle part organelle gene conversion at mating-type locus DNA metabolic process biological_process DNA replication nuclear lumen intracellular organelle lumen non-membrane-bounded organelle base-excision repair, base-free sugar-phosphate removal reproductive process DNA catabolic process reproductive developmental process reproduction organelle lumen developmental procelsluslar process mating type switchingnucleic acid metabolic process cell cycle cellular macromolecule metabolic process mitotic cell cycle DNA replication origin binding metabolic process dephosphorylation reproduction of a single-celled organism cellular macromolecule catabolic sex determination cellular macromolecule biosynthetic process reproductive cellular process primary metabolic process procescsatabolic process cell cycle process interphase of mitotic cell cycle reproductive process in single-celled cellular metabolic process S phase of mitotic cell cycle organism nucleobase, nucleoside, nucleotide and phosphorus metabolic process mating type determination macromolecule metabolic process phosphate metabolic process sequence-specific DNA binding cellular developmental process nucleic acid metabolic process

cell cycle phase biosynthetic process Sphase cellular catabolic process DNA helicase activity nitrogen compound metabolic process macromolecule catabolic process

cellular biosynthetic process

interphase cell fate commitment macromolecule biosynthetic process DNA binding cell differentiation cellular nitrogen compound metabolic process

helicase activity

nucleic acid binding

nucleoside-triphosphatase activity

binding

pyrophosphatase activity binding

molecular_function hydrolase activity, acting on acid anhydrides, in phosphorus-containing anhydrides

catalytic activity hydrolase activity, acting on acid hydrolase activity anhydrides

Figure 6 (A, B) The heatmap of module genes between NSCLC (LUAD and LUSC) and normal samples. (C) The BiNGO revealed the biological process of module genes. The color depth of nodes represents the corrected P-value. The size of nodes represents the number of genes involved. target gene of miR-133a-3p in oral squamous cell carci- In additionally, analyzed with the overlapping DEGs noma and miR-129-5p in gastric cancer.42,43 However, no and CMap database, we determined a set of small mole- studies have reported the potential mechanism of ITGB5 cule drugs that had potential to reverse the gene expres- and RGS4 in the initiation and progression of NSCLC. sion changes of NSCLC. The small molecules with a The above studies indicated PTTG1, TYMS, ECT2, highly significant negative enrichment value may become COL1A1, SPP1 and CDCA5 may also play an important new targeted drugs for the treatment of NSCLC. DL- role in the occurrence and development of NSCLC. thiorphan, as the most significant small molecule

OncoTargets and Therapy 2019:12 submit your manuscript | www.dovepress.com 3555 DovePress Wu et al Dovepress

Table 3 The full name and functional roles of hub genes

Gene Full name Funcution symbol

PTTG1 Pituitary Tumor- The encoded protein is a homolog of yeast securin proteins, which prevent separins from promoting sister Transforming 1 chromatid separation. It is an anaphase-promoting complex (APC) substrate that associates with a separin until activation of the APC. The gene product has transforming activity in vitro and tumorigenic activity in vivo, and the gene is highly expressed in various tumors. The gene product contains 2 PXXP motifs, which are required for its transforming and tumorigenic activities, as well as for its stimulation of basic fibroblast growth factor expression. It also contains a destruction box (D box) that is required for its degradation by the APC. The acidic C-terminal region of the encoded protein can act as a transactivation domain. The gene product is mainly a cytosolic protein, although it partially localizes in the nucleus. Three transcript variants encoding the same protein have been found for this gene. CDCA5 Cell Division Cycle CDCA5 (Cell Division Cycle Associated 5) is a Protein Coding gene. Diseases associated with CDCA5 Associated 5 include Cornelia De Lange Syndrome. Among its related pathways are Cell Cycle, Mitotic and MicroRNAs in cancer. (GO) annotations related to this gene include chromatin binding. TYMS Thymidylate Thymidylate synthase catalyzes the methylation of deoxyuridylate to deoxythymidylate using, 10-methy- Synthetase lenetetrahydrofolate (methylene-THF) as a cofactor. This function maintains the dTMP (thymidine-5-prime monophosphate) pool critical for DNA replication and repair. The enzyme has been of interest as a target for cancer chemotherapeutic agents. It is considered to be the primary site of action for 5-fluorouracil, 5- fluoro-2-prime-deoxyuridine, and some folate analogs. Expression of this gene and that of a naturally occurring antisense transcript, mitochondrial enolase superfamily member 1 (GeneID:55556), vary inver- sely when cell-growth progresses from late-log to plateau phase. Polymorphisms in this gene may be associated with etiology of neoplasia, including breast cancer, and response to chemotherapy. ECT2 Epithelial Cell The protein encoded by this gene is a guanine nucleotide exchange factor and transforming protein that is Transforming 2 related to Rho-specific exchange factors and yeast cell cycle regulators. The expression of this gene is elevated with the onset of DNA synthesis and remains elevated during G2 and M phases. In situ hybridization analysis showed that expression is at a high level in cells undergoing mitosis in regenerating liver. Thus, this protein is expressed in a cell cycle-dependent manner during liver regeneration, and is thought to have an important role in the regulation of cytokinesis. Several transcript variants encoding different isoforms have been found for this gene. COL1A1 Collagen Type I Alpha This gene encodes the pro-alpha1 chains of type I collagen whose triple helix comprises two alpha1 chains 1 Chain and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIA, Ehlers-Danlos syndrome Classical type, Caffey Disease and idiopathic osteoporosis. Reciprocal translocations between 17 and 22, where this gene and the gene for platelet-derived growth factor beta are located, are associated with a particular type of skin tumor called dermatofibrosarcoma protuberans, resulting from unregulated expression of the growth factor. Two transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. SPP1 Secreted The protein encoded by this gene is involved in the attachment of osteoclasts to the mineralized bone Phosphoprotein 1 matrix. The encoded protein is secreted and binds hydroxyapatite with high affinity. The osteoclast vitronectin receptor is found in the cell membrane and may be involved in the binding to this protein. This protein is also a cytokine that upregulates expression of interferon-gamma and interleukin-12. Several transcript variants encoding different isoforms have been found for this gene.

(enrichment score = −0.826), was the most promising also not investigated. This information is beneficial for small molecule to reverse the abnormal NSCLC gene the development of novel targeted drugs for the treatment expression. It is worth noting that so far no research of NSCLC. Given the emergence of these candidate has focused on the potential role of this small molecule biomarkers in silico, in vitro studies (with cell lines, in NSCLC. Similarly, the relationship between phenoxy- etc.) and then in vivo experiments could be worth of benzamine (enrichment score = −0.823) and NSCLC was interest in functional validation.

3556 submit your manuscript | www.dovepress.com OncoTargets and Therapy 2019:12 DovePress Dovepress Wu et al

CDCA5 COL1A1

* 14 * * * 81012 246 01234567

LUSC LUAD LUSC LUAD (num(T)=486; num(N)=338) (num(T)=483; num(N)=347) (num(T)=486; num(N)=338) (num(T)=483; num(N)=347)

ECT2 PTTG1 8 * * * * 80 6 4 0 2 0246

LUSC LUAD LUSC LUAD (num(T)=486; num(N)=338) (num(T)=483; num(N)=347) (num(T)=486; num(N)=338) (num(T)=483; num(N)=347)

SPP1 TYMS * * * * 12 14 10 68 4 2468 0 02

LUSC LUAD LUSC LUAD (num(T)=486; num(N)=338) (num(T)=483; num(N)=347) (num(T)=486; num(N)=338)• (num(T)=483; num(N)=347)

Figure 7 The expression level of hub genes according to the GEPIA database.

OncoTargets and Therapy 2019:12 submit your manuscript | www.dovepress.com 3557 DovePress Wu et al Dovepress

A Overall survival Overall survival Overall survival Low CDCA5 TPM Low COL1A1 TPM Low ECT2 TPM 1.0 High CDCA5 TPM High COL1A1 TPM 1.0 High ECT2 TPM Logrank p=0.022 Logrank p=0.015 Logrank p=0.046 HR(high)=1.3 HR(high)=1.3 HR(high)=1.2 P(HR)=0.022 P(HR)=0.015 P(HR)=0.047 n(high)=481 n(high)=481 n(high)=481 al al al

v n(low)=481 v n(low)=481 n(low)=481 vi r rviv 0.6 0.8 su 0.4 0.4 0.6 0.8 ercent P Percent su 0.2 0.2 0.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0

0 50 100 150 200 250 0 50 100 150 200 250 0 50 100 150 200 250 Months Months Months Overall survival Overall survival Overall survival Low PTTG1 TPM Low SPP1 TPM Low TYMS TPM 1.0 High PTTG1 TPM High SPP1 TPM 1.0 High TYMS TPM Logrank p=0.033 Logrank p=0.01 Logrank p=0.028 HR(high)=1.2 HR(high)=1.3 HR(high)=1.3 P P P 0.8 (HR)=0.034 (HR)=0.011 0.8 (HR)=0.028 n(high)=481 n(high)=481 n(high)=481

val Percent survi n(low)=481 val n(low)=481 n(low)=481 vi r rvival 0.6 0.6 0.4 0.6 0.8 1.0 0.4 0.4 ercent survi ercent su ercent su P P P 0.2 0.2 0.0 0.0 0.2 0.0

0 50 100 150 200 250 0 50 100 150 200 250 0 50 100 150 200 250 Months Months Months B NORMAL NSCLC

Figure 8 (A) The survival analysis for hub genes according to the GEPIA database. (B) The protein level expression of hub genes in NSCLC and normal tissues using immunohistochemistry.

3558 submit your manuscript | www.dovepress.com OncoTargets and Therapy 2019:12 DovePress Dovepress Wu et al

ABA

Y−27632

sulconazole

sanguinarine

rottlerin

repaglinide

pyrvinium

-Log10(PValue) puromycin

phenoxybenzamine 3

milrinone 2

menadione 1

medrysone Count Molecule 2 ginkgolide A 3

estriol 4

5 DL−thiorphan 6 blebbistatin

8−azaguanine

5255229

5224221

4,5−dianilinophthalimide

0297417−0002B

−0.90 −0.85 −0.80 −0.75 −0.70

enrichment

C Data from PCR * Control 8 * * Tumor * * 6 noisse * rpxeeneG 4

2

0 PTTG1 CDCA5 TYMS ECT2 COL1A1 SPPA

Figure 9 (A) The network of module genes and their co-expression genes constructed by cBioPortal. Nodes with thick outline: hub genes; nodes with thin outline: co- expression genes. (B) The 20 most small molecules drugs identified by CMap database. (C) qPCR validation of these six hub genes in seven paired NSCLC samples. *P<0.05.

Conclusion treatment of NSCLC. We also revealed several crucial In this study, six key genes were identified for the first signaling pathways correlated with the NSCLC initiation time in NSCLC by integrated bioinformatics analysis. and progression. Furthermore, we identified a set of can- Survival analysis revealed that high expression levels of didate small molecule drugs which could reverse the PTTG1, TYMS, ECT2, COL1A1, SPP1 and CDCA5 abnormal gene expression of NSCLC. We hope the pre- were markedly associated with worse prognosis of sent study could provide powerful evidence for the future patients. These hub genes could act as the promising development of genomic individualized treatment in novel biomarkers for the diagnosis, prognosis and NSCLC.

OncoTargets and Therapy 2019:12 submit your manuscript | www.dovepress.com 3559 DovePress Wu et al Dovepress

Table 4 List of the 20 most significant small molecule drugs 11. Niu FY, Wu YL. Personalized treatment strategies for non-small-cell lung cancer in Chinese patients: the role of crizotinib. Onco Targets cmap name enrichment p Ther. 2015;8:999–1007. doi:10.2147/OTT.S64664 12. Toro M, Gomez-Lojero C, Montal M, Estrada OS. Charge transfer DL-thiorphan -0.92 0.01312 mediated by nigericin in black lipid membranes. J Bioenerg. 1976;8 phenoxybenzamine -0.9 0.00016 (1):19–26. sanguinarine -0.844 0.04873 13. Ni M, Liu X, Wu J, et al. Identification of candidate biomarkers blebbistatin -0.826 0.05964 correlated with the pathogenesis and prognosis of non-small cell lung cancer via integrated bioinformatics analysis. Front Genet. repaglinide -0.797 0.00334 2018;9:469. doi:10.3389/fgene.2018.00173 medrysone -0.795 0.00016 14. Ritchie ME, Phipson B, Wu D, et al. limma powers differential 8-azaguanine -0.792 0.00378 expression analyses for RNA-sequencing and microarray studies. menadione -0.778 0.09664 Nucleic Acids Res. 2015;43(7):e47. doi:10.1093/nar/gkv007 milrinone -0.774 0.02354 15. Gene Ontology Consortium. The Gene ontology (GO) project in – Y-27632 -0.774 0.10102 2006. Nucleic Acids Res. 2006;34(Databaseissue):D322 D326. doi:10.1093/nar/gkj021 4,5-dianilinophthalimide -0.757 0.11651 16. Ashburner M, Ball CA, Blake JA, et al. Gene ontology: tool for the puromycin -0.748 0.00806 unification of biology. The Gene Ontology Consortium. Nat Genet. 5255229 -0.748 0.12557 2000;25(1):25–29. doi:10.1038/75556 pyrvinium -0.736 0.0007 17. Kanehisa M, Goto S. KEGG: kyoto encyclopaedia of genes and rottlerin -0.736 0.03758 genomes. Nucleic Acids Res. 2000;28(1):27–30(24). 18. Dennis G Jr., Sherman BT, Hosack DA, et al. DAVID: Database for 5224221 -0.736 0.13802 annotation, visualization, and integrated discovery. Genome Biol. ginkgolide A -0.733 0.01011 2003;4(5):P3. doi:10.1186/gb-2003-4-5-p3 0297417-0002B -0.72 0.04521 19. Huang DW, Sherman BT, Lempicki RA. Systematic and integrative estriol -0.718 0.01281 analysis of large gene lists using DAVID bioinformatics resources. sulconazole -0.71 0.01428 Nat Protoc. 2009;4(1):44–57. doi:10.1038/nprot.2008.211 20. Damian S, Andrea F, Stefan W, et al. STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 2015;43(Database issue):D447. doi:10.1093/nar/gku1003 Disclosure 21. Smoot ME, Ono K, Ruscheinski J, Wang PL, Ideker T. Cytoscape 2.8: The authors report no conflicts of interest in this work. new features for data integration and network visualization. Bioinformatics. 2011;27(3):431–432. doi:10.1093/bioinformatics/btq675 22. Bandettini WP, Kellman P, Mancini C, et al. MultiContrast delayed enhancement (MCODE) improves detection of subendocardial myo- References cardial infarction by late gadolinium enhancement cardiovascular magnetic resonance: a clinical validation study. J Cardiovasc Magn 1. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2018. CA Cancer J Reson. 2012;14:83. doi:10.1186/1532-429X-14-83 Clin. 2018;68(1):7–30. doi:10.3322/caac.21442 23. Maere S, Heymans K, Kuiper M. BiNGO: a cytoscape plugin to 2. Ulahannan SV, Brahmer JR. Antiangiogenic agents in combination assess overrepresentation of gene ontology categories in biological with chemotherapy in patients with advanced non-small cell lung networks. Bioinformatics. 2005;21(16):3448–3449. doi:10.1093/ – cancer. Cancer Invest. 2011;29(4):325 337. doi:10.3109/ bioinformatics/bti551 07357907.2011.554476 24. Cerami E, Gao J, Dogrusoz U, et al. The cBio cancer genomics 3. Ramalingam SS, Owonikoko TK, Khuri FR. Lung cancer: new portal: an open platform for exploring multidimensional cancer geno- biological insights and recent therapeutic advances. CA Cancer J mics data. Cancer Discov. 2012;2(5):401–404. doi:10.1158/2159- – Clin. 2011;61(2):91 112. doi:10.3322/caac.20102 8290.CD-12-0095 4. Herbst RS, Heymach JV, Lippman SM. Lung cancer. N Engl J Med. 25. Gao J, Aksoy BA, Dogrusoz U, et al. Integrative analysis of complex – 2008;359(13):1367 1380. cancer genomics and clinical profiles using the cBioPortal. Sci 5. Boolell V, Alamgeer M, Watkins DN, Ganju V. The evolution of Signal. 2013;6(269):pl1. doi:10.1126/scisignal.2004088 – therapies in non-small cell lung cancer. Cancers. 2015;7(3):1815 26. Tang Z, Li C, Kang B, Gao G, Li C, Zhang Z. GEPIA: a web server 1846. for cancer and normal gene expression profiling and interactive 6. Khakwani A, Rich AL, Powell HA, et al. Lung cancer survival in analyses. Nucleic Acids Res. 2017;45(W1):W98–w102. doi:10.1093/ England: trends in non-small-cell lung cancer survival over the dura- nar/gkx247 tion of the national lung cancer audit. Br J Cancer. 2013;109 27. Maier T, Guell M, Serrano L. Correlation of mRNA and protein in (8):2058–2065. complex biological samples. FEBS Lett. 2009;583(24):3966–3973. 7. Rothschild SI. [Advanced and metastatic lung cancer - What is new doi:10.1016/j.febslet.2009.10.036 in the diagnosis and therapy?]. Praxis. 2015;104(14):745–750. 28. Lamb J, Crawford ED, Peck D, et al. The connectivity map: using 8. Song W, Tang Z, Li M, et al. Polypeptide-based combination of gene-expression signatures to connect small molecules, genes, and paclitaxel and cisplatin for enhanced chemotherapy efficacy and disease. Science. 2006;313(5795):1929–1935. doi:10.1126/ reduced side-effects. Acta Biomater. 2014;10(3):1392–1402. science.1132939 9. Sibille A, Paulus A, Martin M, et al. [MANAGEMENT OF NON- 29. Kulasingam V, Diamandis EP. Strategies for discovering novel cancer SMALL CELL LUNG CANCER]. Rev Med Liege. 2015;70(9):432–441. biomarkers through utilization of emerging technologies. Nat Rev 10. Lu X, Zhou D, Hou B, et al. Dichloroacetate enhances the antitumor Clin Oncol. 2008;5(10):588–599. doi:10.1038/ncponc1187 efficacy of chemotherapeutic agents via inhibiting autophagy in non- 30. Edgar R, Domrachev M, Lash AE. Gene expression omnibus: NCBI small-cell lung cancer. Cancer Manag Res. 2018;10:1231–1241. gene expression and hybridization array data repository. Nucleic doi:10.2147/CMAR.S156530 Acids Res. 2002;30(1):207–210.

3560 submit your manuscript | www.dovepress.com OncoTargets and Therapy 2019:12 DovePress Dovepress Wu et al

31. Schlesinger M, Bendas G. Vascular cell adhesion molecule-1 38. Chang IW, Lin VC, He HL, et al. CDCA5 overexpression is an (VCAM-1)–an increasing insight into its role in tumorigenicity and indicator of poor prognosis in patients with urothelial carcinomas of metastasis. Int J Cancer. 2015;136(11):2504–2514. the upper urinary tract and urinary bladder. Am J Transl Res. 2015;7 32. Yu X, Yang F, Fu DL, Jin C. L1 cell adhesion molecule as a therapeutic (4):710–722. target in cancer. Expert Rev Anticancer Ther. 2016;16(3):359–371. 39. Tokuzen N, Nakashiro K, Tanaka H, Iwamoto K, Hamakawa H. 33. Ahmadi A, Najafi M, Farhood B, Mortezaee K. Transforming growth Therapeutic potential of targeting cell division cycle associated 5 for oral factor-beta signaling: tumorigenesis and targeting for cancer therapy. squamous cell carcinoma. Oncotarget. 2016;7(3):2343–2353. J Cell Physiol. 2018. 40. Jin Y, Yu Y, Shao Q, et al. Up-regulation of ECT2 is associated with 34. Gotovac JR, Fujihara KM, Phillips WA, Clemons NJ. TGF-beta poor prognosis in gastric cancer patients. Int J Clin Exp Pathol. signaling and its targeted therapy in gastrointestinal cancers. Discov 2014;7(12):8724–8731. Med. 2018;26(142):103–112. 41. Sano M, Genkai N, Yajima N, et al. Expression level of ECT2 proto- 35. Zhang HJ, Tao J, Sheng L, et al. Twist2 promotes kidney cancer cell oncogene correlates with prognosis in glioma patients. Oncol Rep. proliferation and invasion by regulating ITGA6 and CD44 expression in 2006;16(5):1093–1098. the ECM-receptor interaction pathway. Onco Targets Ther. 2016;9:1801– 42. Oleksiewicz U, Liloglou T, Tasopoulou KM, et al. COL1A1, 1812. PRPF40A, and UCP2 correlate with hypoxia markers in non-small 36. Pei L, Melmed S. Isolation and characterization of a pituitary tumor- cell lung cancer. J Cancer Res Clin Oncol. 2017;143(7):1133–1141. transforming gene (PTTG). Mol Endocrinol. 1997;11(4):433–441. 43. Li J, Ding Y, Li A. Identification of COL1A1 and COL1A2 as 37. Panguluri SK, Yeakel C, Kakar SS. PTTG: an important target gene candidate prognostic factors in gastric cancer. World J Surg Oncol. for ovarian cancer therapy. J Ovarian Res. 2008;1(1):6. 2016;14(1):297.

OncoTargets and Therapy 2019:12 submit your manuscript | www.dovepress.com 3561 DovePress Wu et al Dovepress

Supplementary materials

Figure S1 The potential transcription factors associated with the expression of hub genes.

3562 submit your manuscript | www.dovepress.com OncoTargets and Therapy 2019:12 DovePress Dovepress Wu et al

Figure S2 A regulatory network of lncRNA-miRNA-mRNA constructed by GCBI. Purple nodes: related lncRNA; Blue nodes: targeted miRNA.

OncoTargets and Therapy Dovepress Publish your work in this journal OncoTargets and Therapy is an international, peer-reviewed, open agents and protocols on patient perspectives such as quality of life, access journal focusing on the pathological basis of all cancers, adherence and satisfaction. The manuscript management system is potential targets for therapy and treatment protocols employed to completely online and includes a very quick and fair peer-review improve the management of cancer patients. The journal also system, which is all easy to use. Visit http://www.dovepress.com/ focuses on the impact of management programs and new therapeutic testimonials.php to read real quotes from published authors.

Submit your manuscript here: https://www.dovepress.com/oncotargets-and-therapy-journal

OncoTargets and Therapy 2019:12 submit your manuscript | www.dovepress.com 3563 DovePress