Hindawi BioMed Research International Volume 2020, Article ID 6954793, 24 pages https://doi.org/10.1155/2020/6954793

Research Article Identification and Verification of Biomarker in Clear Cell Renal Cell Carcinoma via Bioinformatics and Neural Network Model

Bin Liu,1 Yu Xiao,2 Hao Li,3 Ai-li Zhang ,1 Ling-bing Meng ,4 Lu Feng,5 Zhi-hong Zhao,1 Xiao-chen Ni,1 Bo Fan,1 Xiao-yu Zhang,1 Shi-bin Zhao,6 and Yi-bo Liu1

1Department of Urinary Surgery, The Fourth Hospital of Hebei Medical University, No. 12 Jiankang Road, 050000, China 2School of Basic Medicine, Peking University, No. 38 Xueyuan Road, Haidian District, Beijing 100191, China 3Department of Oncology, Dongzhimen Hospital, Beijing University of Chinese Medicine, Beijing, China 4School of Basic Medical Sciences, Hebei Medical University, Shijiazhuang, Hebei, China 5MOH Key Laboratory of Geriatrics, Beijing Hospital, National Center of Gerontology, Beijing, China 6Department of Reproductive Medicine, The Fourth Hospital of Hebei Medical University, No. 12 Jiankang Road, 050000, China

Correspondence should be addressed to Ai-li Zhang; [email protected] and Ling-bing Meng; [email protected]

Received 20 December 2019; Revised 10 May 2020; Accepted 22 May 2020; Published 17 June 2020

Academic Editor: Paul Harrison

Copyright © 2020 Bin Liu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Background. Clear cell renal cell carcinoma (ccRCC) is the most common subtype of kidney cancer, which represents the 9th most frequently diagnosed cancer. However, the molecular mechanism of occurrence and development of ccRCC is indistinct. Therefore, the research aims to identify the hub biomarkers of ccRCC using numerous bioinformatics tools and functional experiments. Methods. The public data was downloaded from the Expression Omnibus (GEO) database, and the differently expressed (DEGs) between ccRCC and normal renal tissues were identified with GEO2R. -protein interaction (PPI) network of the DEGs was constructed, and hub genes were screened with cytoHubba. Then, ten ccRCC tumor samples and ten normal kidney tissues were obtained to verify the expression of hub genes with the RT-qPCR. Finally, the neural network model was constructed to verify the relationship among the genes. Results. A total of 251 DEGs and ten hub genes were identified. AURKB, CCNA2, TPX2, and NCAPG were highly expressed in ccRCC compared with renal tissue. With the increasing expression of AURKB, CCNA2, TPX2, and NCAPG, the pathological stage of ccRCC increased gradually (P <0:05). Patients with high expression of AURKB, CCNA2, TPX2, and NCAPG have a poor overall survival. After the verification of RT-qPCR, the expression of hub genes was same as the public data. And there were strong correlations between the AURKB, CCNA2, TPX2, and NCAPG with the verification of the neural network model. Conclusion. After the identification and verification, AURKB, CCNA2, TPX2, and NCAPG might be related to the occurrence and malignant progression of ccRCC.

1. Introduction and management of RCC have changed remarkably rapidly in the past decades through the unremitting efforts of gener- Worldwide, renal cell carcinoma (RCC) represents the 9th ation after generation of researchers. Despite progression in most frequently diagnosed cancer in men and the 10th in cancer control and survival, locally advanced disease and dis- women, accounting for 5% and 3% of all oncological diagno- tant metastases are still diagnosed in a notable proportion of ses, respectively, [1]. According to the most updated data patients. Nevertheless, uncertainties, controversies, and provided by the World Health Organization, there are more research questions remain [3]. Further advances are expected than 140 000 RCC-related deaths yearly, with RCC ranking from the diagnosis, treatment, and prognosis evaluation. as the 13th most common cause of cancer death worldwide. RCC is a group of heterogeneous tumors with different Age and gender factors are closely related to the risk of genetic and molecular changes, clear cell renal cell carcinoma RCC. Other potential risk factors include lifestyle, complica- (ccRCC), papillary RCC (type 1 and type 2), and chromo- tions, drugs, and environmental factors [2]. The diagnosis phobe RCC are the most common solid RCC, accounting 2 BioMed Research International for 85.90% of all malignant RCC [4]. Among them, ccRCC is Table 1: Primers and their sequences for PCR analysis. the most common subtype of kidney cancer. Both sporadic ′ ′ and inherited RCC are usually associated with structural Primer Sequence (5 –3 ) changes in the short arm of 3 [5]. In addition, VEGFA-hF GGCAACTTACTTAGCCTCTT the occurrence of RCC is related to multiple gene alterations, VEGFA-hR AGGACAGTCTGAGTATGGGT such as VHL, PBRM1, BAP1, SETD2, TCEB1, and KDM5C AURKB-hF GTTCGCATTCAACCTACCT [3]. Although our understanding of the biology of RCC has AURKB-hR GACGCCCAATCTCAAAGT improved, surgery is still the main treatment method of RCC. Drugs and comprehensive therapies, identification of CCNA2-hF AACTGGGATAAGGAAGCT new target pathways, and optimal sequencing and combina- CCNA2-hR CAGAAAGTATTGGGTAAGAA tion of existing targeted drugs are areas that are worth MCM2-hF TTCTCCCTCACTTGTCCC researching [6]. MCM2-hR CCTGTAATCCCAGCACTTT Bioinformatics tools can screen differentially expressed MCM7-hF GGGGTAGGCAGAACTCAA genes (DEGs) between diseased and normal tissues [3, 7, 8]. MCM7-hR CATGGAAGCGGTCTCAAA These DEGs are related to the pathological stage, lesion SMC4-hF AGTGGCGTAGCACAGTAA grade, and prognosis of patients. Zou et al. used microarray technology to identify the hub genes between malignant SMC4-hR ATTCCAAGATGATCCCTC glioblastoma and normal brain tissue and obtained the TPX2-hF GCAATCCTTCTGCCTTAG important targets related to brain glioma [9]. Through a TPX2-hR AGACCATCCTGGCTAACA series of bioinformatics analysis, Meng et al. concluded that SLC2A1-hF GAGACGGGAAACCATCAA TPM2 may be an important biomarker for the occurrence SLC2A1-hR CTGCTCCTTCTTCAAACCAC and development of atherosclerosis [10]. MCM5-hF GGGTGCGAGGAGAACAGT Therefore, this study will use bioinformatics technology MCM5-hR TGAGTCTGAGCCAGGGAG to explore the gene molecular markers of abnormal expres- sion during the occurrence of ccRCC and discuss the related NCAPG-hF AGAGTATTGTTGGCTTCC potential mechanisms. These differentially expressed genes NCAPG-hR AACTTCTGGACCATCACA may affect the initiation and malignant progression of ccRCC and can be used as targets for diagnosis and treatment. geo2r) could import data of the GEO database into the R lan- 2. Material and Methods guage and perform differential analysis, essentially through the following two R packages, including limma packages 2.1. Download Public Data. The Omnibus and GEOquery. Therefore, through the GEO2R tool, DEGs (GEO) database (http://www.ncbi.nlm.nih.gov/geo) is the were identified between the normal and ccRCC groups. The largest, most comprehensive, and publicly available source P values < 0.001 was defined as significant. The gene symbols of gene expression data. were necessary. SangerBox (https://shengxin.ren), one open On 20 December, 2019, we set key words “(clear cell renal tool, was used to draw volcano maps. Venn diagrams were cell carcinoma) AND (normal kidney)” to detect the datasets, delineated using FunRich software (http://www.funrich.org/), using a filter of “expression profiling by array.” The inclusion which would visualize common DEGs shared between criteria includes a diagnosis of clear cell renal cell carcinoma GSE105288 and GSE66272. (data from papillary renal cell carcinoma diagnoses were excluded), the dataset including the gene expression profile 2.3. GO and KEGG Analysis. One online tool, DAVID of normal kidney (datasets which were composed of only (https://david.ncifcrf.gov/home.jsp) (version 6.8, Maryland, tumor data were excluded), a sample number of more than America), was applied to carry out the functional annotation forty per dataset (samples of less than forty were excluded), for DEGs. (GO) [11] generally performs data from Homo sapiens (data from other species were enrichment analysis of genomes. And there are mainly cellu- excluded), and a series entry type, expression profiling by lar components (CC), biological processes (BP), and molecu- array (data using methylation profiling only by array were lar functions (MF) in the GO analysis. Kyoto Encyclopedia of excluded). Genes and Genomes (KEGG) (https://www.kegg.jp/) [12] is a Therefore, GSE105288 (GPL10558, Illumina HumanHT- comprehensive database of genomic, chemical, and systemic 12 V4.0 expression beadchip) and GSE66272 (GPL570(HG- functional information. Therefore, DAVID was used to make U133_Plus_2) Affymetrix U133 Plus 2.0 the analysis of GO and KEGG. The Biological Networks Array) were obtained from the GEO database. A total of 44 Gene Oncology tool (BiNGO) (version 3.0.3) was used to samples, including 35 ccRCC tissues and 9 normal renal tis- analyze and visualize the DEGs’ cellular component, biolog- sues, were selected from GSE105288. A total of 53 samples, ical process, and molecular function [13]. including 26 ccRCC tissues and 27 normal renal tissues, were selected from GSE66272. 2.4. Protein-Protein Interaction (PPI) Network. The common DEGs, shared between GSE105288 and GSE66272, were 2.2. Differentially Expressed Genes (DEGs) between Normal converted into differently expressed . The STRING and PCRC. GEO2R (http://www.ncbi.nlm.nih.gov/geo/ (Search Tool for the Retrieval of Interacting Genes) online BioMed Research International 3

Te diferentially expressed genes on 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Y

Te gene positions are based on GRCh38.p2(NCBI). 2961 genes.

Overexpressed genes Underexpressed genes (a) Volcano plot Volcano plot 12.7 12.93

10.16 10.34

7.62 7.76

5.08 5.17 –log10 ( P value) –log10 ( P value)

2.54 2.59

0 0 –0.81 –0.57 –0.34 –0.1 0.14 0.37 0.61 –10.69 –6.82 –2.96 0.9 4.77 8.63 12.5 log2 (FoldChange) log2 (FoldChange) (b) (c)

Figure 1: (a) The differentially expressed genes on chromosomes between ccRCC and normal kidney tissue. (b) One volcano plot presents the DEGs in the GSE105288. (c) Another volcano plot presents the DEGs in the GSE66272. database (http://string-db.org) could construct the PPI net- cut − off = 2, k − score = 2, and node score cut − off = 0:2. work, which was visualized by Cytoscape (version 2.8) [14]. Then, cytoHubba [16], a free plug-in of Cytoscape, was applied to authorize the hub genes, when the degree ≥ 10. 2.5. Significant Module and Hub Genes. Molecular Complex Detection tool (MCODE) (version 1.5.1) [15], an open 2.6. Expression Analysis of Hub Genes. The clustering analysis plug-in of Cytoscape, was performed to identify tested most of expression level of hub genes was performed using heat significant module from the PPI network, and the criteria maps based on the GSE105288 and GSE66272. Also, the was that the maximum depth = 100, MCODE scores > 5, expression profiles of hub genes in the ccRCC and normal 4 BioMed Research International

GSE105288 GSE66272

1082 251 1973

(a)

CASZ1

HEPACAM2 GTF3A PACRG TMPRSS2 TRIM8 PNMA2 DTX3 HACL1 YIPF5 DNASE1 ACPP EIF3B COLGALT1 DTX1 DPP6 IFI16 TACSTD2 CALU COL23A1 PLEKHO1 SCAMP5 ATP1A1 DNASE2 RAB24 P4HA2 UBE2D2 HECW2 STC2 PAPPA PARP9 KLHL3 HIGD1A LOX PRKCSH EHBP1L1 BST2 PARM1 P4HA1 COL1A2 NMI CTSZ ERMP1 NRP2 HRG STX4 RNF123 KCNJ10 KNG1 UNC5B SMDT1 SUCNR1 P4HB B3GNTL1 ATP6V1C2 EMCN PLG SLC4A1 ATP6V0A4 BHLHE41 EGLN3 TAP1 MCFD2 APEH TBC1D24 GNB5 BNIP3 TAPBP CA9 BID CD40 MCOLN3 ATP6V1D PGK1 VEGFA SHMT2 ALDOA ANGPT2 VIM SETBP1 COX4I2 SLC41A2 SCN2A COQ6 TRADD MLKL LAIR1 SLC2A12 PREX1 PSMB8 FUT11 SLC2A1 PDGFD CANX CACNA2D2 PFKFB4 ST14 AURKB SLC25A35 CPOX NPHS1 TYMP USP2 SLC29A4 SLC48A1 TPI1 MAP4K4 SLC7A8 NDUFA4L2 HK2 YBX3 PTTG1 DCTN4 CCNA2 SVOPL GRIK5 SLC1A4 PFKFB2 NEK6 AMT CHN2 MCM2 TMCC1 SND1 XPO1 GSAP CLIC5 RARB SHC1 TPP2 NCK1 TOPBP1 GLDC MTDH NCKAP1L NETO2 TLK2 FAM167A PPP1CC SPC25 NCAPG TMEM213 L2HGDH FSCN1 CDCA7 PHKA2 MCM5 ABCA1 SPTSSA GABRA2 HOXA4 DCAF11 CORO1C CKAP2 BCR SMC4 ENSG00000259288 CENPM MCM7 RFC2 PUS7 SLCO2B1 TNIP1 INTS8 NXPH2 CNEP1R1 DLGAP5 ZDHHC3 LRBA VWA8 TBL1XR1 SYNE2 KIF20B HMMR MBNL1 SORT1 TPX2 SLC51B ELF5 RUFY1 KIF13B ASUN PALM RNF145 ARL2

TRIB3 PRLR TBC1D9B SPAG4 (b)

Figure 2: (a) The Venn diagram manifested that a total of 251 DEGs exist in the two datasets (GSE105288 and GSE66272) simultaneously. (b) The PPI network of the common DEGs. groups were analyzed using Gene Expression Profiling Inter- curve analysis was performed to test the sensitivity and spec- active Analysis (GEPIA, http://gepia.cancer-pku.cn/) [17]. ificity of the hub gene expression for the diagnosis of ccRCC. The SPSS software (version 21.0; IBM; New York; America) 2.7. Effect of Hub Gene Expression for Pathological Stage and was used to conduct all the statistical analysis. A P value < Overall Survival. The effect of hub gene expression for path- 0.05 was defined as statistically significant. ological stage and overall survival was analyzed by the GEPIA. Finally, the correlation and linear regression analyses 2.8. RT-qPCR Assay. A total of 10 ccRCC participates were between AURKB, CCNA2, TPX2, and NCAPG were per- recruited. After surgery, 10 ccRCC tumor samples from formed. And the receiver operator characteristic (ROC) ccRCC patients and 10 adjacent normal kidney tissues BioMed Research International 5

Top 18 of pathway enrichment P value

Sister chromatid cohesion 0.04 Platelet degranulation Cellular response to hypoxia Glycine decarboxylation via glycine cleavage system Glycine catabolic process Carbohydrate phosphorylation 0.03 Regulation of actin cytoskeleton organization Mitotic nuclear division Regulation of insulin secretion DNA replication initiation Cell division Fructose metabolic process 0.02 Cell proliferation Angiogenesis Peptidyl−proline hydroxylation to 4−hydroxy−L−proline Glycolytic process Canonical glycolysis 0.01 0.2 0.4 0.6 RichFactor GeneNumber 2.5 10.0 5.0 12.5 7.5 (a) Top 18 of Pathway Enrichment P value Microvillus TAP complex Microspike Cytoplasm Lysosomal membrane 0.03 Procollagen−proline 4−dioxygenase complex Endoplasmic reticulum Intracellular membrane−bounded organelle Centrosome 0.02 Platelet alpha granule lumen MCM complex Cytosol Melanosome Extracellular exosome 0.01 Membrane Endoplasmic reticulum lumen Basolateral plasma membrane 0.00 0.25 0.50 0.75 1.00 RichFactor GeneNumber 20 60 40 80 (b)

Figure 3: Continued. 6 BioMed Research International

Top 15 of pathway enrichment P value

Double−stranded RNA binding 0.04 ATP binding DNA helicase activity Peptide antigen−transporting ATPase activity TAP2 binding 0.03 Enzyme binding GTPase activator activity Actin binding 0.02 L−ascorbic acid binding Apolipoprotein binding Anion transmembrane transporter activity Identical protein binding 0.01 Procollagen−proline 4−dioxygenase activity Protein binding

0.0 0.2 0.4 0.6 RichFactor

GeneNumber 50 100 150 (c) Top 10 of Pathway Enrichment P value

hsa01230:Biosynthesis of amino acids

hsa00010:Glycolysis / Gluconeogenesis 0.075 hsa05211:Renal cell carcinoma

hsa00630:Glyoxylate and dicarboxylate metabolism 0.050 hsa04066:HIF-1 signaling pathway

hsa03030:DNA replication 0.02

hsa04966:Collecting duct acid secretion

hsa01200:Carbon metabolism 0.025

hsa00051:Fructose and mannose metabolism

0.050 0.075 0.100 0.125 0.150 RichFactor

GeneNumber 3 6 4 7 5 (d)

Figure 3: Continued. BioMed Research International 7

Macromolecule metabolic Cellular macromolecule process metabolic process Small molecule metabolic Cellular protein metabolic process process Protein metabolic process Organic acid metabolic process Metabolic process

Primary metabolic process Cellular amino acid and Protein modifcation process derivative metabolic process Macromolecule modifcation Cellular ketone metabolic Cellular process process Peptidyl-amino acid modifcation

Biological_process Peptidyl-proline modifcation Oxoacid metabolic process

Heterocycle metabolic process

Peptidyl-proline hydroxylation Cellular metabolic process

Peptidyl-proline hydroxylation Carboxylic acid metabolic to 4-hydroxy-L-proline process Nitrogen compound metabolic Monocarboxylic acid metabolic process process

4-hydroxyproline metabolic Cellular amino acid derivative process metabolic process Amine metabolic process

(e)

Figure 3: (a) Detailed information relating to changes in the biological processes (BP) of DEGs in ccRCC and normal kidney tissue. (b) Detailed information relating to changes in the cellular components (CC) of DEGs in ccRCC and normal kidney tissue. (c) Detailed information relating to changes in the molecular functions (MF) of DEGs in ccRCC and normal kidney tissue. (d) KEGG pathway analysis for DEGs. (e) The BP analysis for DEGs via the BiNGO software. samples were obtained. The research conformed to the Dec- IlluminaHiSeq was selected as gene expression RNAseq in laration of Helsinki and was authorized by the Human Ethics the research. In addition, the gene expression levels of and Research Ethics Committees of the Fourth Hospital of VEGFA, AURKB, CCNA2, MCM2, MCM7, SMC4, TPX2, Hebei Medical University. The informed consents were SLC2A1, MCM5, and NCAPG between ccRCC and normal obtained from all participates. renal samples were compared using the one-way ANOVA. Total RNA was extracted from 10 ccRCC tumor samples Furthermore, the effect of the gene expression of VEGFA, and 10 adjacent normal kidney tissue samples by the RNAiso AURKB, CCNA2, MCM2, MCM7, SMC4, TPX2, SLC2A1, Plus (TRIzol) kit (Thermo Fisher, Massachusetts, America MCM5, and NCAPG on overall survival was analyzed by and reverse transcribed to cDNA. RT-qPCR was performed using the TCGA data. using a Light Cycler® 4800 System with specific primers for genes. Table 1 presents the primer sequences used in the 2.10. The Construction of Neural Network Model. The train- −ΔΔ experiments. The RQ values (2 Ct, where Ct is the thresh- ing group was randomly divided into the calibration data old cycle) of each sample were calculated and are presented and training data according to the proportion of 3 : 7. There as fold change in the gene expression relative to the control were 6 samples in the calibration data, and 20 samples in group. GAPDH was used as an endogenous control. the training data. We used MATLAB (version 8.3) to accomplish the normalization processing of variable values, 2.9. The Confirmation Using The Cancer Genome Atlas network simulation, network training, and network initiali- (TCGA) Data. The gene expression dataset of ccRCC in the zation. The number of input neurons in the input layer is TCGA was downloaded using the University of California the same as the number of input variables, and the number Santa Cruz (UCSC) Xena (https://xena.ucsc.edu/welcome- is two. The hidden layer is designed as 1 layer, and the out- to-ucsc-xena/). There were a total of 944 samples including put layer is also designed for 1 layer. One output variable is 537 ccRCC samples and 407 normal renal samples. The the intima-media thickness. When training to 2000 steps 8 BioMed Research International

Nucleoplasm Nucleus

Intracellular part nonmembrane-bounded organelle Nuclear lumen Nuclear part Cytoplasmic Nuclear chromosome part membrane-bounded vesicle Chromosome Intracellular Chromosomal part membrane-bounded organelle MCM complex Membrane-bounded vesicle Cytoplasmic vesicle

Endoplasmic reticulum Melanosome Nonmembrane-bounded Intracellular organelle lumen organelle Subsynaptic reticulum intracellular organelle part

Vesicle Membrane-bounded organelle Organelle part Endoplasmic reticulum part

Macromolecular complex MHC class I peptide loading Intracellular organelle complex Protein complex Organelle lumen Organelle Cell projection membrane Cytoplasm Intracellular part

Cellular_component Membrane-enclosed lumen Cytoplasmic part Intracellular

Pigment granule Cell part

Cell Microvillus membrane

Cell projection Microvillus Cell projection part

(a)

Figure 4: Continued. BioMed Research International 9

Ion transmembrane transporter activity

Substrate-specifc transmembrane transporter activity 6-Phosphofructo-2-kinase Phosphotransferase activity, activity alcohol group as acceptor Substrate-specifc transporter activity Transmembrane transporter Phosphofructokinase activity activity

Transferase activity, transferring Carbohydrate kinase activity Kinase binding phosphorus-containing groups Transporter activity

Enzyme binding Kinase activity Intermediate flament binding Transferase activity

Protein binding Molecular_function Catalytic activity

Protein complex binding Oxidoreductase activity Binding Peptidyl-proline dioxygenase activity TAP binding

TAP1 binding Procollagen-proline TAP2 binding dioxygenase activity Antigen binding Peptide binding Procollagen-proline 4-dioxygenase activity

Peptide antigen binding Peptidyl-proline 4-dioxygenase activity (b)

Figure 4: (a) The CC analysis for DEGs via the BiNGO software. (b) The MF analysis for DEGs via the BiNGO software. after repeated training, the falling gradient is 0, and the 3.3. The Functional Enrichment Analysis of DEGs via GO and training speed is uniform [10]. At the same time, the train- KEGG. GO analysis manifested that variations in DEGs ing error ≤ 0:05, and the R (relativity) value reached 0.9906. related with biological processes (BP) were significantly enriched in canonical glycolysis, glycolytic process, peptidyl- 3. Results proline hydroxylation to 4-hydroxy-L-proline, angiogenesis, cell proliferation, fructose metabolic process, cell division, 3.1. DEGs between Normal Kidney and ccRCC Samples. There DNA replication initiation, regulation of insulin secretion, are plenty of DEGs on all chromosomes between the ccRCC mitotic nuclear division, regulation of actin cytoskeleton orga- and normal samples (Figure 1(a)). One volcano plot presents nization, carbohydrate phosphorylation, glycine catabolic pro- the DEGs in the GSE105288 (Figure 1(b)), and another vol- cess, glycine decarboxylation via glycine cleavage system, cano plot presents the DEGs in the GSE66272 (Figure 1(c)). cellular response to hypoxia, and so on (Figure 3(a)). The var- The Venn diagram manifested that a total of 251 DEGs exist iations in DEGs related with cellular components (CC) were in the two datasets (GSE105288 and GSE66272) simulta- significantly enriched in the basolateral plasma membrane, neously (Figure 2(a)). endoplasmic reticulum lumen, membrane, extracellular exo- some, melanosome, cytosol, MCM complex, and so on 3.2. Construction of the PPI Network. After construction of (Figure 3(b)). The variations in the DEGs related with molec- the PPI network for the common DEGs, there are 189 nodes ular functions (MF) were significantly enriched in protein and 406 edges in the PPI network (Figure 2(b)). binding, procollagen-proline 4-dioxygenase activity, identical 10 BioMed Research International

CKAP2 P4HA2 TPX2 SMC4

AURKB KIF20B COL23A1 P4HA1 COL1A2

HMMR SPC25 CALU

STC2 PRKCSH EGLN3 NCAPG VEGFA PTTG1 HRG

MCM2 MCM7 PGK1 CA9 PLG

CENPM CCNA2 DLGAP5 SLC2A1 BNIP3

(a) (b) Rank Node

AURKB VEGFA SMC4 AURKB

VEGFA CCNA2 TPX2 MCM2 MCM7 SLC2A1 SMC4 TPX2 MCM2 CCNA2 MCM5 SLC2A1 MCM5 NCAPG

MCM7 NCAPG (c) (d)

Figure 5: (a) A significant module was screened from the PPI network, and one module network consisted of 14 nodes and 84 edges. (b) Another module network consisted of 15 nodes and 32 edges. (c) Ten hub genes were identified, including VEGFA, AURKB, CCNA2, MCM2, MCM7, SMC4, TPX2, SLC2A1, MCM5, and NCAPG. protein binding, anion transmembrane transporter activity, network, and one module network consisted of 14 nodes apolipoprotein binding, L-ascorbic acid binding, actin bind- and 84 edges (Figure 5(a)). Another module network con- ing, and so on (Figure 3(c)). The KEGG pathway enrichment sisted of 15 nodes and 32 edges (Figure 5(b)). And ten hub analysis showed that the top pathways related with DEGs genes were identified, including VEGFA, AURKB, CCNA2, were fructose and mannose metabolism, carbon metabolism, MCM2, MCM7, SMC4, TPX2, SLC2A1, MCM5, and collecting duct acid secretion, DNA replication, HIF-1 sig- NCAPG (Figure 5(c)). naling pathway, and so on (Figure 3(d)). The BP analysis for DEGs is presented in Figure 3(e) 3.5. Difference of Expression of Hub Genes between ccRCC via the BiNGO software (Figure 3(e)). The CC analysis for and Normal Kidney Samples. Hierarchical clustering allowed DEGs is presented in Figure 4(a) via the BiNGO software for simple differentiation of ccRCC tissues from normal colo- (Figure 4(a)). The MF analysis for DEGs is presented in rectal tissues via the expression levels of hub genes in the Figure 4(b) via the BiNGO software (Figure 4(b)). GSE105288 and GSE66272 datasets. One heat map showed that the expressions of all the hub genes were higher in the 3.4. Significant Module Network and Identification of Hub ccRCC samples than the normal samples in the GSE105288 Genes. A significant module was screened from the PPI (Figure 6(a)). Another heat map also showed that the BioMed Research International 11

VEGFA 3 SLC2A1

SMC4 2

MCM5 1 MCM7 0 MCM2

CCNA2 −1

TPX2 −2 AURKB −3 NCAPG GSM2825240 GSM2825242 GSM2825243 GSM2825252 GSM2825255 GSM2825263 GSM2825265 GSM2825275 GSM2825277 GSM2825239 GSM2825241 GSM2825244 GSM2825245 GSM2825246 GSM2825247 GSM2825248 GSM2825249 GSM2825250 GSM2825251 GSM2825253 GSM2825254 GSM2825256 GSM2825257 GSM2825258 GSM2825259 GSM2825260 GSM2825261 GSM2825262 GSM2825264 GSM2825266 GSM2825267 GSM2825268 GSM2825269 GSM2825270 GSM2825271 GSM2825272 GSM2825273 GSM2825274 GSM2825276 GSM2825278 GSM2825279 GSM2825280 GSM2825281 GSM2825282 Normal ccRCC (a)

SMC4 4 VEGFA

SLC2A1 2 MCM5 TPX2 0 NCAPG AURKB MCM7 –2 CCNA2

MCM2 −4 GSM1618416 GSM1618388 GSM1618390 GSM1618392 GSM1618394 GSM1618396 GSM1618398 GSM1618400 GSM1618402 GSM1618404 GSM1618406 GSM1618408 GSM1618410 GSM1618412 GSM1618414 GSM1618418 GSM1618420 GSM1618422 GSM1618424 GSM1618426 GSM1618428 GSM1618430 GSM1618432 GSM1618434 GSM1618436 GSM1618438 GSM1618440 GSM1618387 GSM1618389 GSM1618391 GSM1618393 GSM1618395 GSM1618397 GSM1618399 GSM1618401 GSM1618403 GSM1618405 GSM1618407 GSM1618409 GSM1618411 GSM1618413 GSM1618415 GSM1618419 GSM1618421 GSM1618423 GSM1618425 GSM1618427 GSM1618429 GSM1618431 GSM1618433 GSM1618435 GSM1618437 GSM1618439 Normal ccRCC (b)

Figure 6: (a) One heat map showed that the expressions of all the hub genes were higher in the ccRCC samples than the normal samples in the GSE105288. (b) Another heat map also showed that the expressions of all the hub genes were higher in the ccRCC samples than the normal samples in the GSE66272. expressions of all the hub genes were higher in the ccRCC (P <0:05), while the expressions of VEGFA, MCM2, samples than the normal samples in the GSE66272 MCM7, SMC4, SLC2A1, and MCM5 were not (Figure 6(b)). Through the GEPIA analysis, the expressions (Figure 7(b)). The results showed that the expression level of hub genes in the ccRCC patients were higher than the nor- of VEGFA was not related with the overall survival of mal individuals (Figure 7(a)). ccRCC patients (P >0:05, Figure 8(a)). The overall sur- vival analysis showed that ccRCC patients with high 3.6. Association between Hub Gene Expression, Pathological expression levels of AURKB (Figure 8(b)) and CCNA2 Stage, and Overall Survival. The results of GEPIA manifested (Figure 8(c)) had poorer overall survival times than those that the expressions of AURKB, CCNA2, TPX2, and NCAPG with low expression levels (P <0:05). The expression level were significantly positively related with pathological stage of MCM2 was not related with the overall survival of 12 BioMed Research International

VEGFA AURKB CCNA2 MCM2 MCM7 ⁎⁎⁎6 12 6 5 5 10 5 6 4 4 8 4 3 3 4 6 3 2 4 2 2 2 2 1 1 1

0 0 0 0 0 KIRC KIRC KIRC KIRC KIRC (num(T)=523; num(N)=100) (num(T)=523; num(N)=100) (num(T)=523; num(N)=100) (num(T)=523; num(N)=100) (num(T)=523; num(N)=100) SMC4 TPX2 SLC2A1 MCM5 NCAPG 7 7 ⁎ ⁎⁎ 10 5 6 6 6

5 5 8 5 4

4 4 4 6 3 3 3 3 4 2 2 2 2 2 1 1 1 1

0 0 0 0 0 KIRC KIRC KIRC KIRC KIRC (num(T)=523; num(N)=100) (num(T)=523; num(N)=100) (num(T)=523; num(N)=100) (num(T)=523; num(N)=100) (num(T)=523; num(N)=100) (a)

VEGFA AURKB CCNA2 MCM2 MCM7 7 F 6 8 F F value = 2.67 value = 20.8 F value = 13 6 F value = 2.19 value = 2.18 F F 12 Pr(>F) = 0.0471 Pr(> ) = 1.05e−12 Pr(>F) = 3.5e−086 Pr(>F) = 0.0879 Pr(> ) = 0.0895 6 5 7 5 5 10 4 6 4 4 8 3 5 3 3 4 6 2 2 2 3 1 1 4 1 0 2 Stage I Stage II Stage III Stage IV Stage I Stage II Stage III Stage IV Stage I Stage II Stage III Stage IV Stage I Stage II Stage III Stage IV Stage I Stage II Stage III Stage IV

SMC4 TPX2 SLC2A1 MCM5 NCAPG 7 F value = 15.3 F 6 F value = 14.3 F value = 3.26 7 F value = 0.548 7 value = 1.91 Pr(>F) = 1.52e−09 F Pr(>F) = 5.85e−09 Pr(>F) = 0.0213 Pr(>F) = 0.649 Pr(> ) = 0.127 6 10 5 6 6 5 5 4 8 5 4 4 3 3 6 4 3 2 2 3 2 4 1 1 2 1 2 0 0 Stage I Stage II Stage III Stage IV Stage I Stage II Stage III Stage IV Stage I Stage II Stage III Stage IV Stage I Stage II Stage III Stage IV Stage I Stage II Stage III Stage IV (b)

Figure 7: (a) The comparison of expressions of all hub genes between ccRCC and normal kidney samples. (b) The relationship between the expression of hub genes and pathological stage. ccRCC patients (P >0:05, Figure 8(d)). The expression level sion levels (P <0:05, Figure 9(a)). The expression levels of of MCM7 was not related with the overall survival of ccRCC SLC2A1 (Figure 9(b)) and MCM5 (Figure 9(c)) were not patients (P >0:05, Figure 8(e)). The expression level of SMC4 related with the overall survival of ccRCC patients (P >0:05). was not related with the overall survival of ccRCC patients The overall survival analysis showed that ccRCC patients (P >0:05, Figure 8(f)). The overall survival analysis showed with high expression levels of NCAPG had poorer overall that ccRCC patients with high expression levels of TPX2 survival times than those with low expression levels had poorer overall survival times than those with low expres- (P <0:05, Figure 9(d)). BioMed Research International 13

Overall survival Overall survival 1.0 Log-rank P = 0.2 1.0 Log-rank P = 2.8e−06 HR(high) = 0.82 HR(high) = 2.1 P(HR) = 0.2 P 0.8 0.8 (HR) = 4.6e−06 n(high) = 258 n(high) = 257 n (low) = 258 n(low) = 257 0.6 0.6

0.4 0.4 Percent survival Percent survival

0.2 0.2

0.0 0.0

0 50 100 150 0 50 100 150 Months Months

Low VEGFA TPM Low AURKB TPM High VEGFA TPM High AURKB TPM (a) (b) Overall survival Overall survival 1.0 P 1.0 Log-rank = 0.0012 Log-rank P = 0.61 HR(high) = 1.7 HR(high) = 0.92 P(HR) = 0.0014 P(HR) = 0.61 0.8 n(high) = 257 0.8 n n(low) = 257 (high) = 257 n(low) = 258 0.6 0.6

0.4 0.4 Percent survival Percent survival

0.2 0.2

0.0 0.0

0 50 100 150 0 50 100 150 Months Months

Low CCNA2 TPM Low MCM2 TPM High CCNA2 TPM High MCM2 TPM (c) (d) Overall survival Overall survival 1.0 1.0 P Log-rank P = 0.24 Log-rank = 0.64 HR(high) = 0.83 HR(high) = 1.1 P 0.8 P(HR)= 0.24 0.8 (HR) = 0.64 n(high) = 258 n(high) = 257 n(low) = 258 n(low) = 258 0.6 0.6

0.4 0.4 Percent survival Percent survival

0.2 0.2

0.0 0.0 0 50 100 150 0 50 100 150 Months Months

Low MCM7 TPM Low SMC4 TPM High MCM7 TPM High SMC4 TPM (e) (f)

Figure 8: The overall survival Kaplan-Meier of six hub genes. (a) VEGFA, (b) AURKB, (c) CCNA2, (d) MCM2, (e) MCM7, and (f) SMC4. 14 BioMed Research International

Overall survival Overall survival 1.0 Log-rank P = 0.0041 1.0 Logrank P = 0.49 HR(high)= 1.6 HR(high) = 0.9 P P(HR) = 0.0044 (HR) = 0.49 0.8 n(high) = 258 n(high) = 256 0.8 n(low) = 258 n(low) = 257 0.6 0.6

0.4 0.4 Percent survival Percent survival 0.2 0.2

0.0 0.0 0 50 100 150 0 50 100 150 Months Months

Low TPX2 TPM Low SLC2A1 TPM High TPX2 TPM High SLC2A1 TPM (a) (b) Overall survival Overall survival 1.0 Log-rank P = 0.17 1.0 Logrank P = 0.014 HR(high) = 0.81 HR(high) = 1.5 P(HR) = 0.17 P(HR) = 0.015 n 0.8 (high) = 258 0.8 n(high) = 258 n(low) = 258 n(low) = 257

0.6 0.6

0.4 0.4 Percent survival Percent survival 0.2 0.2

0.0 0.0

0 50 100 150 0 50 100 150 Months Months

Low MCM5 TPM Low NCAPG TPM High MCM5 TPM High NCAPG TPM (c) (d)

Figure 9: The overall survival Kaplan-Meier of another four hub genes. (a) TPX2, (b) SLC2A1, (c) MCM5, and (d) NCAPG.

3.7. The Interaction Analysis among the Hub Genes. Through of TPX2 in the GSE66272 was shown in Figure 11(g). The the Pearson correlation test, heat maps manifested that there ROC curve of NCAPG in the GSE66272 was shown in were strong correlations among hub genes in the GSE105288 Figure 11(h). (Figure 10(a)) and GSE66272 (Figure 10(b)) datasets. The correlation between AURKB, CCNA2, TPX2, and NCAPG 3.9. Results of RT-qPCR Analysis. As presented in Figure 12, was strong (Figures 10(c)–10(h)). the relative expression levels of VEGFA, AURKB, CCNA2, MCM2, MCM7, SMC4, TPX2, SLC2A1, MCM5, and 3.8. ROC Analysis. To identify accurate thresholds for hub NCAPG were significantly higher in the ccRCC samples, genes to predict ccRCC, we constructed ROC. The expression compared with the normal kidney tissues groups. The result of all hub genes was associated with a diagnosis of ccRCC. demonstrated that VEGFA, AURKB, CCNA2, MCM2, The ROC curve of AURKB in the GSE105288 was shown MCM7, SMC4, TPX2, SLC2A1, MCM5, and NCAPG might in Figure 11(a). The ROC curve of CCNA2 in the be considered biomarkers for ccRCC. GSE105288 was shown in Figure 11(b). The ROC curve of TPX2 in the GSE105288 was shown in Figure 11(c). The 3.10. The Verification by TCGA. According to the above ROC curve of NCAPG in the GSE105288 was shown in expression analysis, VEGFA, AURKB, CCNA2, MCM2, Figure 11(d). The ROC curve of AURKB in the GSE66272 MCM7, SMC4, TPX2, SLC2A1, MCM5, and NCAPG were was shown in Figure 11(e). The ROC curve of CCNA2 in markedly upregulated in ccRCC tumor samples compared the GSE66272 was shown in Figure 11(f). The ROC curve with the normal renal samples. After confirmation using BioMed Research International 15

1 1 VEGFA VEGFA

AURKB AURKB 0.9 0.8 CCNA2 CCNA2 0.8 MCM2 MCM2 0.7 MCM7 0.6 MCM7

SMC4 SMC4 0.6

TPX2 TPX2 0.4 0.5 SLC2A1 SLC2A1 0.4 MCM5 MCM5 0.2 NCAPG NCAPG 0.3 TPX2 TPX2 SMC4 SMC4 MCM2 MCM7 MCM5 MCM2 MCM7 MCM5 VEGFA CCNA2 VEGFA AURKB SLC2A1 CCNA2 NCAPG AURKB SLC2A1 NCAPG (a) (b)

P value = 0 P value = 0 P value = 0 5 5 R = 0.82 6 R = 0.93 R = 0.83

4 5 4

4 3 3 3 2 2

log2 (TPX2 TPM) 2 log2 (CCNA2 TPM) log2 (NCAPG TPM) 1 1 1

0 0 0 0123456 0123456 0123456 log2 (AURKB TPM) log2 (AURKB TPM) log2 (AURKB TPM) (c) (d) (e)

P value = 0 P value = 0 P value = 0 5 5 6 R = 0.9 R = 0.95 R = 0.91

5 4 4

4 3 3 3 2 2

log2 (TPX2 TPM) 2 log2 (NCAPG TPM) log2 (NCAPG TPM) 1 1 1

0 0 0 012345 012345 0123456 log2 (CCNA2 TPM) log2 (CCNA2 TPM) log2 (TPX2 TPM) (f) (g) (h)

Figure 10: (a) Heat maps showing the correlations between hub genes in the GSE105288 datasets. (b) Heat maps showing the correlations between hub genes in the GSE66272 datasets. (c) The correlation between AURKB and CCNA2. (d) The correlation between AURKB and TPX2. (e) The correlation between AURKB and NCAPG. (f) The correlation between CCNA2 and TPX2. (f) The correlation between AURKB and NCAPG. (g) The correlation between CCNA2 and NCAPG. (h) The correlation between TPX2 and NCAPG. the TCGA data, these gene expression levels in the ccRCC The results showed that the expression level of VEGFA samples were also significantly higher than the normal renal (P =0:4946), MCM2 (P =0:5249), MCM7 (P =0:092), samples (P <0:05, Figure 13). SMC4 (P =0:856), SLC2A1 (P =0:209), and MCM5 16 BioMed Research International

AURKB CCNA2 1.0 1.0 6.829 (0.857, 1.000) 7.237 (0.829, 1.000)

0.8 0.8

0.6 0.6 AUC: 0.937 AUC: 0.933 Sensitivity Sensitivity 0.4 0.4

0.2 0.2

0.0 0.0 1.0 0.8 0.6 0.4 0.2 0.0 1.0 0.8 0.6 0.4 0.2 0.0 Specifcity Specifcity (a) (b) TPX2 NCAPG 1.0 1.0 7.256 (0.829, 1.000) 7.209 (0.743, 0.889) 0.8 0.8

0.6 0.6 AUC: 0.895 AUC: 0.940 Sensitivity 0.4 Sensitivity 0.4

0.2 0.2

0.0 0.0 1.0 0.8 0.6 0.4 0.2 0.0 1.0 0.8 0.6 0.4 0.2 0.0 Specifcity Specifcity (c) (d) AURKB CCNA2 1.0 1.0 0.083 (0.846, 0.926) 0.8 0.8 −0.389 (1.000, 0.852)

0.6 0.6 AUC: 0.930 AUC: 0.983 Sensitivity 0.4 Sensitivity 0.4

0.2 0.2

0.0 0.0 1.0 0.8 0.6 0.4 0.2 0.0 1.0 0.8 0.6 0.4 0.2 0.0 Specifcity Specifcity (e) (f)

Figure 11: Continued. BioMed Research International 17

TPX2 NCAPG 1.0 1.0 −0.002 (0.962, 0.926) 0.302 (0.846, 0.963) 0.8 0.8

0.6 0.6 AUC: 0.977 AUC: 0.934 Sensitivity 0.4 Sensitivity 0.4

0.2 0.2

0.0 0.0 1.0 0.8 0.6 0.4 0.2 0.0 1.0 0.8 0.6 0.4 0.2 0.0 Specifcity Specifcity (g) (h)

Figure 11: ROC curves of hub genes for ccRCC.

(P =0:303) was not related with the overall survival of ccRCC on personal experience and subjective guesses to a more patients. The overall survival analysis showed that ccRCC objective science [18–20]. patients with high expression levels of AURKB (P =0:000), In this paper, bioinformatics tools are used to mine the CCNA2 (P =0:000), TPX2 (P =0:000), and NCAPG targeted biomarkers of ccRCC. The results showed that (P =0:000) had poorer overall survival times than those with AURKB, CCNA2, TPX2, and NCAPG were highly expressed low expression levels (Figure 14). in ccRCC compared with renal tissue. With the increasing expression of AURKB, CCNA2, TPX2, and NCAPG, the 3.11. The Neural Network Prediction Model between AURKB, pathological stage of ccRCC increased gradually. Compared CCNA2, TPX2, and NCAPG. The mean squared error is with the individuals with low expression of AURKB, <0.05 (Figure 15(a)). The relativity of training is 0.9906. CCNA2, TPX2, and NCAPG, patients with high expression The relativity of validation is 0.99768. The relativity of test of AURKB, CCNA2, TPX2, and NCAPG have a poor overall is 0.93812. And the relativity of all procedure is 0.97977 survival. (Figure 15(b)). Through verifying the predicted value of the Aurora kinase B (AURKB) is a serine/threonine kinase data against the actual value, we found that there are only that participates in the regulation of chromosome arrange- small differences in the comparison chart of training results ment and segregation by binding to microtubules [21]. (Figure 15(c)) and error analysis diagram (Figure 15(d)). Numerous studies have found that the overexpression of Based on the above result, we could speculate that there were AURKB exists in a variety of cancer cell lines [22–24]. Sor- strong correlations between AURKB, CCNA2, TPX2, and rentino et al. found that AURKB is highly expressed in thy- NCAPG. roid carcinoma, and its expression level is related to Through the cubic spline interpolation algorithm, we malignant degree. The block of AURKB expression or by find the high-risk warning indicator of TPX2: CCNA2 < 5:0 using an inhibitor of Aurora kinase activity significantly and 5:2 < AURKB. The three-dimensional stereogram could reduced the growth of thyroid carcinoma cells [23]. present the warning range well (Figure 15(e)). The plane Katayama et al. have similar findings in colorectal cancer graph is also shown (Figure 15(f)). [25], Smith et al. in lung cancer [22], and Chieffi et al. in pros- tate cancer [24]. Abnormal mitotic regulation can induce the 4. Discussion production of aneuploid cells and act as a driving role in the process of malignant progression, while serine/theronine RCC is a common disease in the urinary system. According protein kinases of the Aurora family genes play a critical role to the statistics of the World Health Organization in 2018, in the regulation of key cell cycle processes. The abnormal its incidence is second only to prostate cancer and bladder expression of AURKB can produce malignant and invasive cancer and is increasing year by year [1]. Although many aneuploid cells. This further indicates that AURKB is related genes are considered potential therapeutic targets and prog- to tumorigenesis [26, 27]. With the discovery of abnormal nostic predictors of RCC, the molecular mechanism of the expression of AURKB in cancer cells, researchers realized occurrence and development of RCC remains controversial. that it may become a new target for cancer treatments. At With the continuous progress of science, microarray present, many AURKB inhibitors have been developed, technology, as a special data mining method, is very influen- including AZD1152, AT9283, VX-680/MK-0457, PHA- tial at present. This revolutionary technology transforms 680632, AMG-900, PHA-739358, and CYC-116, and some traditional molecular research from a situation that relies of them have entered clinical trials [28]. 18 BioMed Research International

VEGFA AURKB CCNA2 ⁎ ⁎ ⁎ 3 8 10

8 6 2 6 4 4 1 2 2 Relative expression of VEGFA Relative expression of AURKB 0 0 Relative expression of CCNA2 0

Con ccRCC Con ccRCC Con ccRCC (a) (b) (c) MCM2 MCM7 SMC4 ⁎ ⁎ ⁎ 20 10 10

8 8 15 6 6 10 4 4 5 2 2 Relative expression of SMC4

Relative expression of MCM2 0 Relative expression of MCM7 0 0

Con ccRCC Con ccRCC Con ccRCC (d) (e) (f) TPX2 SLC2A1 ⁎ ⁎ 25 20

20 15 15 10 10 5 5 Relative expression of TPX2 0 Relative expression of SLC2A1 0

Con ccRCC Con ccRCC (g) (h) MCM5 NCAPG ⁎ ⁎ 20 20

15 15

10 10

5 5 Relative expression of MCM5 0 Relative expression of NCAPG 0

Con ccRCC Con ccRCC (i) (j)

Figure 12: Relative expression of VEGFA, AURKB, CCNA2, MCM2, MCM7, SMC4, TPX2, SLC2A1, MCM5, and NCAPG by RT-qPCR analysis. ∗P <0:05, compared with normal kidney tissues. BioMed Research International 19

TCGA clear cell renal cell carcinoma (ccRCC) 20 ⁎ 17.5 ⁎ 15 ⁎ ⁎ 12.5 ⁎ ⁎ ⁎ ⁎ 10 ⁎ ⁎ 7.5 5 2.5 Unit: log2 (norm_count+1) 0 Gene expression RNAseq -IlluminaHiSeq –2.5 TPX2 SMC4 MCM2 MCM7 MCM5 VEGFA CCNA2 AURKB SLC2A1 NCAPG Gene symbol Color: sample_type ccRCC Normal

Figure 13: The confirmation of gene expression level using The Cancer Genome Atlas (TCGA) data. The gene expression levels of VEGFA, AURKB, CCNA2, MCM2, MCM7, SMC4, TPX2, SLC2A1, MCM5, and NCAPG in ccRCC samples were significantly higher than the normal renal samples.

The proteins encoded by CyclinA2 (CCNA2) belong to a TPX2 is tyrosine phosphorylated in malignant transformed highly conserved cyclin family, which promotes cell transfor- 16HBE-C, and this phosphorylation may be involved in the mation by binding and activating cyclin-dependent kinases malignant proliferation of cancer cells [39]. Ma et al. found (CDKs) through G1/S and G2/M [29]. Previous studies have that the level of the TPX2 protein in normal bronchial epithe- found that the overexpression of CyclinA2 occurs in lung lium and alveoli was very low, while the level of TPX2 protein cancer [30, 31], breast cancer [32, 33], colorectal cancer increased gradually in squamous metaplasia, dysplasia, and [34], and other tumors and related to poor prognosis of carcinoma in situ and invasive tumor. The immunohisto- cancer patients. Aaltomaa et al. found that CyclinA2 was chemical labeling index of TPX2 was related to the degree expressed in the cytoplasm of RCC but not in the normal of differentiation, stage, and lymph node metastasis of lung tissue near the tumor, and the overexpression of CyclinA2 squamous cell carcinoma, and the overexpression of TPX2 was related to the survival time of patients with RCC, was significantly correlated with the decrease of 5-year sur- suggesting that it may be a prognostic indicator of RCC vival rate [40]. Similar results were found in a variety of [35]. The increase of the CyclinA2 expression is related to cancers, such as colorectal cancer [41], cervical cancer the uncontrolled and accelerated cell cycle, which leads to [42], and prostate cancer [43]. The expression of TPX2 in gene amplification and chromosome ectopia. Gopinathan RCC was significantly higher than that in normal renal tis- et al. found that knockout CyclinA2 in mice can inhibit sue, and it was related to tumor size, histological grade, tumorigenesis [36]. Liang et al. found that the increased tumor stage, and poor prognosis [44–47]. This may be expression of sclerostin domain-containing protein1 due to the significant upregulation of TPX2 in RCC tissues, (SOSTDC1) can inhibit CyclinA2, while SOSTDC1 can thus increasing the proliferation and invasive ability of inhibit tumor growth [37]. CyclinA2 can not only be used renal cancer cells. From this point of view, TPX2 can not as a predictor of prognosis and survival in patients with only become a target for RCC treatment but also play a role RCC but also has great potential in cancer treatment. as an independent prognostic factor of RCC. TPX2 microtubule nucleation factor (TPX2) encodes a Non-SMC I complex subunit G (NCAPG) microtubule-associated protein that activates cell cycle kinase coding a condensed protein complex subunit is responsible called Aurora A and regulates mitotic spindles. The overex- for chromosome condensation and stabilization during pression of TPX2 is related to the genesis of different can- mitosis and meiosis [48]. In recent years, there are more cers and is closely related to chromosome instability. The and more studies on the abnormal expression of NCAPG in uncontrolled expression of TPX2 may eventually become prostate cancer [49], lung cancer [50], breast cancer [51], the driving force of cancer development by inducing aneu- and other cancers. In the study of Liu et al., NCAPG was ploidy [38]. Zhang et al. found that compared with human found to be overexpressed in hepatocellular carcinoma com- bronchial epithelial cells (16HBE), TPX is overexpressed in pared with the adjacent normal tissue, and high levels of malignant transformed 16HBE cells(16HBE-C) through NCAPG expression were found to significantly correlate with anti-benzo[a]pyrene-trans-7,8-dihydrodiol-9,10-epoxide, in recurrence, the time of recurrence, metastasis, differentiation, which TPX2 RNA interference (RNAi) can lead to S-phase and TNM stage. The knockdown of NCAPG expression also arrest, inhibit cell proliferation, and induce cell apoptosis. inhibited tumor cell migration and the cell invasive capacity 20 BioMed Research International

VEGFA AURKB CCNA2 1 1 1

0.75 0.75 0.75

0.5 0.5 0.5 Percent survival Percent survival Percent survival

0.25 0.25 0.25

P = 0.4946 P = 0.000 P = 0.000 Log-rank test statistics = 0.4666 Log-rank test statistics = 23.30 Log-rank test statistics = 21.59 0 0 0 1,000 2,000 3,000 4,000 0 1,000 2,000 3,000 4,000 0 1,000 2,000 3,000 4,000 Days Days Days

MCM2 MCM7 SMC4 1 1 1

0.75 0.75 0.75

0.5 0.5 0.5 Percent survival Percent survival Percent survival 0.25 0.25 0.25 P = 0.5249 P P Log-rank test statistics = 0.4042 = 0.092 = 0.856 0 0 Log-rank test statistics = 2.831 0 Log-rank test statistics = 0.033 0 1,000 2,000 3,000 4,000 0 1,000 2,000 3,000 4,000 0 1,000 2,000 3,000 4,000 Days Days Days

1 TPX2 1 SLC2A1 MCM5

0.75 0.75 0.75

0.5 0.5 0.5 Percent survival Percent survival Percent survival 0.25 0.25 0.25 P P = 0.000 = 0.209 P = 0.303 Log-rank test statistics = 1.581 Log-rank test statistics = 15.16 Log-rank test statistics = 1.059 0 0 1,000 2,000 3,000 4,000 0 1,000 2,000 3,000 4,000 0 1,000 2,000 3,000 4,000 Days Days Days

1 NCAPG

0.75

0.5 Percent survival 0.25

P = 0.000 Log-rank test statistics = 20.47 0 0 1,000 2,000 3,000 4,000 Days

Low High

Figure 14: The effect of gene expression on overall survival by using the TCGA data. The expression level of VEGFA, MCM2, MCM7, SMC4, SLC2A1, and MCM5 was not related with the overall survival of ccRCC patients. The ccRCC patients with high expression levels of AURKB, CCNA2, TPX2, and NCAPG had poorer overall survival times than those with low expression levels. BioMed Research International 21

Training: R = 0.9906 Validation: R = 0.99768 18 18 16 16 14 14 Target+0.28 Target+–0.1 ⁎ 12 ⁎ 12 10 10 8 8 6 6 Output~=0.97 4 Output~=0.93 4 51015 51015 Target Target Data Data 102 Fit Fit Y = T Y = T Test: R = 0.93812 All: R = 0.97977 0 10 18 18 16 16 –2 14 MSE 10 14 Target+0.35 Target+–0.36 ⁎ ⁎ 12 12 10 10 10–4 8 8 6 6 Output~=0.93 10–6 Output~=0.89 4 4 0 50 100 150 200 250 300 51015 51015 Number of iterations Target Target

Data Fit Fit Fit Y = T Y = T (a) (b) 1.0 20

0.5 16 0.0 12 –0.5 TPX2 8

Absolute error Absolute –1.0

4 –1.5

0 –2.0 048121620 036912 15 18 21 Sample number Sample number Raw data Training results (c) (d)

Figure 15: Continued. 22 BioMed Research International

TPX2 26.40 7 23.10 25 6 19.80 20 16.50 5 15 13.20 4 TPX2

10 AURKB 9.900 5 3 6.600 7 0 6 3.300 5 1 2 2 4 KB 0.000 3 3 CCNCCNA24 5 12345678 5 6 AURAURKB 7 2 8 CCNA2 (e) (f)

Figure 15: The neural network models. (a) The best training performance. (b) The relativity of training, validation, test, and all procedure. (c) The comparison chart of training results. (d) The error analysis diagram. (e) The high-risk warning range at the level of the three-dimensional stereogram. (f) The high-risk warning range at the level of the plane form. in vitro [52]. Through genome-wide functional knockout Conflicts of Interest screen, Wang et al. believe that NCAPG is a necessary fl clinical-related target for the growth of hepatocellular The authors declare that they have no con icts of interests. carcinoma cells [53]. Ai et al. found that microRNA-181c (miR-181c) inhibits cancer by downregulating the expression Authors’ Contributions of NCAPG, affecting the infiltration, migration, proliferation, and apoptosis of hepatoma cells [54]. In the study of Arai Ai-li Zhang and Ling-bing Meng contributed equally to this et al., microRNA-99a-3p downregulated the expression of work. NCAPG, thereby inhibiting cancer cell invasion in castration-resistant prostate cancer [49]. In conclusion, Acknowledgments NCAPG represents a promising novel target and a prognos- tic biomarker for clinical management. We gratefully thank the Department of Urinary Surgery in However, this study also has some shortcomings. The Fourth Hospital of Hebei Medical University for the Although the core genes screened in this study may play an technical assistance. important role in the occurrence of ccRCC, more clinical samples and patient prognosis information are still needed References for verification. [1] R. L. Siegel, K. D. Miller, and A. Jemal, “Cancer statistics, 5. Conclusion 2018,” CA: a Cancer Journal for Clinicians, vol. 68, no. 1, pp. 7–30, 2018. To sum up, 251 differentially expressed genes and 10 hub [2] U. Capitanio, K. Bensalah, A. Bex et al., “Epidemiology of renal genes (especially AURKB, CCNA2, TPX2, and NCAPG) cell carcinoma,” European Urology, vol. 75, no. 1, pp. 74–84, were screened from ccRCC and normal renal tissues by 2019. “ ” microarray technology, which could be used as diagnostic [3] U. Capitanio and F. Montorsi, Renal cancer, The Lancet, – and therapeutic biomarkers for ccRCC. AURKB, CCNA2, vol. 387, no. 10021, pp. 894 906, 2016. “ TPX2, and NCAPG which might be related to the occurrence [4] B. Shuch, A. Amin, A. J. Armstrong et al., Understanding and malignant progression of ccRCC. pathologic variants of renal cell carcinoma: distilling therapeu- tic opportunities from biologic complexity,” European Urol- ogy, vol. 67, no. 1, pp. 85–97, 2015. Data Availability [5] R. Srinivasan, C. J. Ricketts, C. Sourbier, and W. M. Linehan, “New strategies in renal cell carcinoma: targeting the genetic The datasets used and/or analyzed during the current study and metabolic basis of disease,” Clinical Cancer Research, are available from the corresponding author on reasonable vol. 21, no. 1, pp. 10–17, 2015. request. [6] P. Singh, N. Agarwal, and S. K. Pal, “Sequencing systemic ther- apies for metastatic kidney cancer,” Current Treatment Disclosure Options in Oncology, vol. 16, article 3, 2015. [7] H. M. Huang, X. Jiang, M. L. Hao et al., “Identification of bio- The authors ensure that questions related to the accuracy or markers in macrophages of atherosclerosis by microarray integrity of any part of the work are appropriately investi- analysis,” Lipids in Health and Disease, vol. 18, no. 1, article gated and resolved. 107, 2019. BioMed Research International 23

[8] Y. Qiu, L. B. Meng, C. Y. Di et al., “Exploration of the differen- [25] H. Katayama, T. Ota, F. Jisaki et al., “Mitotic kinase expression tially expressed long noncoding RNAs and genes of morphine and colorectal cancer progression,” Journal of the National tolerance via bioinformatic analysis,” Journal of Computa- Cancer Institute, vol. 91, no. 13, pp. 1160–1162, 1999. – tional Biology, vol. 26, no. 12, pp. 1379 1393, 2019. [26] M. Tatsuka, H. Katayama, T. Ota et al., “Multinuclearity and [9] Y. F. Zou, L. B. Meng, Z. K. He et al., “Screening and authen- increased ploidy caused by overexpression of the aurora- and tication of molecular markers in malignant glioblastoma based Ipl1-like midbody-associated protein mitotic kinase in human on gene expression profiles,” Oncology Letters, vol. 18, cancer cells,” Cancer Research, vol. 58, no. 21, pp. 4811–4816, pp. 4593–4604, 2019. 1998. [10] L. B. Meng, M. J. Shan, Y. Qiu et al., “TPM2 as a potential pre- [27] T. Ota, S. Suto, H. Katayama et al., “Increased mitotic phos- dictive biomarker for atherosclerosis,” Aging, vol. 11, no. 17, phorylation of histone H3 attributable to AIM-1/Aurora-B pp. 6960–6982, 2019. overexpression contributes to chromosome number instabil- [11] M. Ashburner, C. A. Ball, J. A. Blake et al., “Gene ontology: tool ity,” Cancer Research, vol. 62, no. 18, pp. 5168–5177, 2002. for the unification of biology,” Nature Genetics, vol. 25, pp. 25– [28] A. Tang, K. Gao, L. Chu, R. Zhang, J. Yang, and J. Zheng, 29, 2000. “Aurora kinases: novel therapy targets in cancers,” Oncotarget, [12] M. Tanabe and M. Kanehisa, “Using the KEGG database vol. 8, no. 14, pp. 23937–23954, 2017. resource,” Curr Protoc Bioinformatics, vol. 11, no. 1, 2005. [29] D. Gong, J. R. Pomerening, J. W. Myers et al., “Cyclin A2 reg- [13] S. Maere, K. Heymans, and M. Kuiper, “BiNGO: a Cytoscape ulates nuclear-envelope breakdown and the nuclear accumula- plugin to assess overrepresentation of gene ontology categories tion of cyclin B1,” Current Biology, vol. 17, no. 1, pp. 85–91, in biological networks,” Bioinformatics, vol. 21, no. 16, 2007. pp. 3448-3449, 2005. [30] M. Volm, R. Koomägi, J. Mattern, and G. Stammler, “Cyclin A [14] M. E. Smoot, K. Ono, J. Ruscheinski, P. L. Wang, and T. Ideker, is associated with an unfavourable outcome in patients with “Cytoscape 2.8: new features for data integration and network non- small-cell lung carcinomas,” British Journal of Cancer, visualization,” Bioinformatics, vol. 27, no. 3, pp. 431-432, 2011. vol. 75, no. 12, pp. 1774–1778, 1997. [15] G. D. Bader and C. W. Hogue, “An automated method for [31] Y. Dobashi, M. Shoji, S. X. Jiang, M. Kobayashi, Y. Kawakubo, finding molecular complexes in large protein interaction net- and T. Kameya, “Active cyclin A-CDK2 complex, a possible works,” BMC Bioinformatics, vol. 4, no. 1, p. 2, 2003. critical factor for cell proliferation in human primary lung car- [16] C. H. Chin, S. H. Chen, H. H. Wu, C. W. Ho, M. T. Ko, and cinomas,” The American Journal of Pathology, vol. 153, no. 3, – C. Y. Lin, “cytoHubba: identifying hub objects and sub- pp. 963 972, 1998. networks from complex interactome,” BMC Systems Biology, [32] I. R. Bukholm, G. Bukholm, and J. M. Nesland, “Over-expres- vol. 8, article S11, 2014. sion of cyclin A is highly associated with early relapse and [17] Z. Tang, C. Li, B. Kang, G. Gao, C. Li, and Z. Zhang, “GEPIA: a reduced survival in patients with primary breast carcinomas,” web server for cancer and normal gene expression profiling International Journal of Cancer, vol. 93, no. 2, pp. 283–287, and interactive analyses,” Nucleic Acids Research, vol. 45, 2001. no. W1, pp. W98–W102, 2017. [33] R. Michalides, H. van Tinteren, A. Balkenende et al., “Cyclin A [18] T. Flieder, M. Weiser, T. Eller et al., “Interference of DOACs in is a prognostic indicator in early stage breast cancer with and different DRVVT assays for diagnosis of lupus anticoagu- without tamoxifen treatment,” British Journal of Cancer, lants,” Thrombosis Research, vol. 165, pp. 101–106, 2018. vol. 86, no. 3, pp. 402–408, 2002. [19] G. Hussain, J. Wang, A. Rasul et al., “Role of cholesterol and [34] J. Q. Li, H. Miki, F. Wu et al., “Cyclin a correlates with carcino- kip1 sphingolipids in brain development and neurological dis- genesis and metastasis, and p27 correlates with lymphatic eases,” Lipids in Health and Disease, vol. 18, no. 1, p. 26, 2019. invasion, in colorectal neoplasms,” Human Pathology, – [20] W. Wang, W. Lou, B. Ding et al., “A novel mRNA-miRNA- vol. 33, no. 10, pp. 1006 1015, 2002. lncRNA competing endogenous RNA triple sub-network asso- [35] S. Aaltomaa, P. Lipponen, M. Ala-Opas, M. Eskelinen, ciated with prognosis of pancreatic cancer,” Aging, vol. 11, K. Syrjänen, and V. M. Kosma, “Expression of cyclins A no. 9, pp. 2610–2627, 2019. and D and p21(waf1/cip1) proteins in renal cell cancer and their ” [21] F. L. Chan, B. Vinod, K. Novy et al., “Aurora kinase B, a novel relation to clinicopathological variables and patient survival, – regulator of TERF1 binding and telomeric integrity,” Nucleic British Journal of Cancer, vol. 80, no. 12, pp. 2001 2007, Acids Research, vol. 45, no. 21, pp. 12340–12353, 2017. 1999. [22] S. L. Smith, N. L. Bowers, D. C. Betticher et al., “Overexpres- [36] L. Gopinathan, S. L. Tan, V. C. Padmakumar, V. Coppola, “ sion of aurora B kinase (AURKB) in primary non-small cell L. Tessarollo, and P. Kaldis, Loss of Cdk2 and cyclin A2 ” lung carcinoma is frequent, generally driven from one allele, impairs cell proliferation and tumorigenesis, Cancer – and correlates with the level of genetic instability,” British Jour- Research, vol. 74, no. 14, pp. 3870 3879, 2014. nal of Cancer, vol. 93, no. 6, pp. 719–729, 2005. [37] W. Liang, H. Guan, X. He et al., “Down-regulation of [23] R. Sorrentino, S. Libertini, P. L. Pallante et al., “Aurora B over- SOSTDC1 promotes thyroid cancer cell proliferation via ” expression associates with the thyroid carcinoma undifferenti- regulating cyclin A2 and cyclin E2, Oncotarget, vol. 6, – ated phenotype and is required for thyroid carcinoma cell no. 31, pp. 31780 31791, 2015. proliferation,” The Journal of Clinical Endocrinology and [38] C. Aguirre-Portolés, A. W. Bird, A. Hyman, M. Cañamero, Metabolism, vol. 90, no. 2, pp. 928–935, 2005. I. Pérez de Castro, and M. Malumbres, “Tpx2 controls spindle [24] P. Chieffi, L. Cozzolino, A. Kisslinger et al., “Aurora B expres- integrity, genome stability, and tumor development,” Cancer sion directly correlates with prostate cancer malignancy and Research, vol. 72, no. 6, pp. 1518–1528, 2012. influence prostate cell proliferation,” Prostate, vol. 66, no. 3, [39] L. Zhang, H. Huang, L. Deng et al., “TPX2 in malignantly pp. 326–333, 2006. transformed human bronchial epithelial cells by anti- 24 BioMed Research International

benzo[a]pyrene-7,8-diol-9,10-epoxide,” Toxicology, vol. 252, [53] Y. Wang, B. Gao, P. Y. Tan et al., “Genome-wide CRISPR no. 1-3, pp. 49–55, 2008. knockout screens identify NCAPG as an essential oncogene ” [40] Y. Ma, D. Lin, W. Sun et al., “Expression of targeting protein for hepatocellular carcinoma tumor growth, The FASEB – for xklp2 associated with both malignant transformation of Journal, vol. 33, no. 8, pp. 8759 8770, 2019. respiratory epithelium and progression of squamous cell lung [54] J. Ai, C. Gong, J. Wu et al., “MicroRNA-181c suppresses cancer,” Clinical Cancer Research, vol. 12, no. 4, pp. 1121– growth and metastasis of hepatocellular carcinoma by 1127, 2006. modulating NCAPG,” Cancer Management and Research, – [41] A. H. Sillars-Hardebol, B. Carvalho, M. Tijssen et al., “TPX2 vol. 2019, pp. 3455 3467, 2019. and AURKA promote 20q amplicon-driven colorectal ade- noma to carcinoma progression,” Gut, vol. 61, no. 11, pp. 1568–1575, 2012. [42] H. Chang, J. Wang, Y. Tian, J. Xu, X. Gou, and J. Cheng, “The TPX2 gene is a promising diagnostic and therapeutic target for cervical cancer,” Oncology Reports, vol. 27, no. 5, pp. 1353– 1359, 2012. [43] P. Vainio, J. P. Mpindi, P. Kohonen et al., “High-throughput transcriptomic and RNAi analysis identifies AIM1, ERGIC1, TMED3 and TPX2 as potential drug targets in prostate can- cer,” PLoS One, vol. 7, no. 6, article e39801, 2012. [44] Q. I. Chen, B. Cao, N. Nan et al., “TPX2 in human clear cell renal carcinoma: expression, function and prognostic significance,” Oncology Letters, vol. 11, no. 5, pp. 3515– 3521, 2016. [45] Z. A. Glaser, H. D. Love, S. Guo et al., “TPX2 as a prognostic indicator and potential therapeutic target in clear cell renal cell carcinoma,” Urologic Oncology, vol. 35, no. 5, pp. 286–293, 2017. [46] Y. Gu, L. Lu, L. Wu, H. Chen, W. Zhu, and Y. He, “Identifica- tion of prognostic genes in kidney renal clear cell carcinoma by RNA-seq data analysis,” Molecular Medicine Reports, vol. 15, no. 4, pp. 1661–1667, 2017. [47] W. Wei, Y. Lv, Z. Gan, Y. Zhang, X. Han, and Z. Xu, “Identi- fication of key genes involved in the metastasis of clear cell renal cell carcinoma,” Oncology Letters, vol. 17, no. 5, pp. 4321–4328, 2019. [48] A. Eberlein, A. Takasuga, K. Setoguchi et al., “Dissection of genetic factors modulating fetal growth in cattle indicates a substantial role of the non-SMC condensin I complex, subunit G (NCAPG) gene,” Genetics, vol. 183, no. 3, pp. 951–964, 2009. [49] T. Arai, A. Okato, Y. Yamada et al., “Regulation of NCAPG by miR-99a-3p (passenger strand) inhibits cancer cell aggressive- ness and is involved in CRPC,” Cancer Medicine, vol. 7, no. 5, pp. 1988–2002, 2018. [50] S. Li, Y. Xuan, B. Gao et al., “Identification of an eight-gene prognostic signature for lung adenocarcinoma,” Cancer Man- agement and Research, vol. 2018, pp. 3383–3392, 2018. [51] J. Chen, X. Qian, Y. He, X. Han, and Y. Pan, “Novel key genes in triple-negative breast cancer identified by weighted gene co- expression network analysis,” Journal of Cellular Biochemistry, vol. 120, no. 10, pp. 16900–16912, 2019. [52] W. Liu, B. Liang, H. Liu et al., “Overexpression of non-SMC condensin I complex subunit G serves as a promising prognos- tic marker and therapeutic target for hepatocellular carci- noma,” International Journal of Molecular Medicine, vol. 40, no. 3, pp. 731–738, 2017.