Hindawi BioMed Research International Volume 2021, Article ID 1093702, 18 pages https://doi.org/10.1155/2021/1093702

Research Article Microarray Data Mining and Preliminary Bioinformatics Analysis of Hepatitis D Virus-Associated Hepatocellular Carcinoma

Zhe Yu ,1 Xuemei Ma ,2 Wei Zhang,2 Xiujuan Chang ,2 Linjing An ,2 Ming Niu ,3 Yan Chen ,2 Chao Sun ,1 and Yongping Yang 1,2

1Peking University 302 Clinical Medical School, Beijing 100039, China 2Department of Liver Disease of Chinese PLA General Hospital, The Fifth Medical Center of Chinese PLA General Hospital, Beijing 100039, China 3China Military Institute of Chinese Medicine, The Fifth Medical Centre of Chinese PLA General Hospital, Beijing 100039, China

Correspondence should be addressed to Yongping Yang; [email protected]

Received 18 May 2020; Revised 4 June 2020; Accepted 19 January 2021; Published 30 January 2021

Academic Editor: Xingshun Qi

Copyright © 2021 Zhe Yu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Several studies have demonstrated that chronic hepatitis delta virus (HDV) infection is associated with a worsening of hepatitis B virus (HBV) infection and increased risk of hepatocellular carcinoma (HCC). However, there is limited data on the role of HDV in the oncogenesis of HCC. This study is aimed at assessing the potential mechanisms of HDV-associated hepatocarcinogenesis, especially to screen and identify key and pathways possibly involved in the pathogenesis of HCC. We selected three microarray datasets: GSE55092 contains 39 specimens and 81 paracancer specimens from 11 HBV-associated HCC patients, GSE98383 contains 11 cancer specimens and 24 paracancer specimens from 5 HDV-associated HCC patients, and 371 HCC patients with the RNA-sequencing data combined with their clinical data from the Cancer Atlas (TCGA). Afterwards, 948 differentially expressed genes (DEGs) closely related to HDV-associated HCC were obtained using the R package and filtering with a Venn diagram. We then performed ontology (GO) annotation and Kyoto Encyclopedia of Genes and (KEGG) pathway enrichment analysis to determine the biological processes (BP), cellular component (CC), molecular function (MF), and KEGG signaling pathways most enriched for DEGs. Additionally, we performed Weighted Gene Coexpression Network Analysis (WGCNA) and -to-protein interaction (PPI) network construction with 948 DEGs, from which one module was identified by WGCNA and three modules were identified by the PPI network. Subsequently, we validated the expression of 52 hub genes from the PPI network with an independent set of HCC dataset stored in the Profiling Interactive Analysis (GEPIA) database. Finally, seven potential key genes were identified by intersecting with key modules from WGCNA, including 3 reported genes, namely, CDCA5, CENPH, and MCM7, and 4 novel genes, namely, , CDC45, CDCA8, and MCM4, which are associated with nucleoplasm, cell cycle, DNA replication, and mitotic cell cycle. The CDCA8 and stage of HCC were the independent factors associated with overall survival of HDV-associated HCC. All the related findings of these genes can help gain a better understanding of the role of HDV in the underlying mechanism of HCC carcinogenesis.

1. Introduction with hepatitis B virus (HBV), hepatitis C virus (HCV), and hepatitis delta virus (HDV) [4]. Approximately 292 million Hepatocellular carcinoma (HCC) is the sixth most commonly people worldwide are chronically infected with HBV, which diagnosed cancer and the fourth leading cause of cancer- causes liver injury that can progress to cirrhosis, resulting in related mortality globally [1, 2] and the second in China [3]. HCC, liver failure, and eventually death [5, 6]. The HDV is More than 80% of all HCC causes are associated with infection known as the satellite of HBV and affects 15–20 million people 2 BioMed Research International in the world [7]. Concurrent HBV and HDV infections signif- (NCBI) Gene Expression Omnibus (GEO, https://www.ncbi icantly increase both the incidence and mortality of HCC .nlm.nih.gov/geo/) database. The inclusion criteria for selec- among patients with chronic hepatitis B (CHB) [8]. HDV is tion GEO datasets in this study were as follows: (1) hepato- a kind of defective RNA virus which uses HBV envelope pro- cellular carcinoma containing cancer and paracancer tissue; tein for successful spread in hepatocytes [9]. Although the risk (2) HBsAg positive at least 6 months with serum HBV of HCC is thought to be higher when a HBV-infected patient DNA positive; (3) anti-HDAg positive with serum HDV is superinfected with HDV, the molecular mechanisms of car- RNA positive (applies only to filter HDV-associated HCC cinogenesis remain unclear [10]. Chronic hepatitis D (CHD) dataset); and (4) sample size more than 10 with data unbi- is more severe than any other type of hepatitis, but its carcino- ased. The exclusion criteria were as follows: (1) the second genesis mechanism remains poorly understood. Additionally, liver cancer, (2) HCV-related HCC, (3) dataset biased, and it has been found that the intrahepatic HBV DNA levels in (4) no paracancer tissue and carcinoma tissue present at the patients with HDV-associated HCC and non-HCC cirrhosis same patients. We searched the GEO database using “Hepa- are significantly reduced [11]. This phenomenon of HDV- titis D Virus,”“Hepatocellular Carcinoma,” and “Homo sapi- mediated inhibition of HBV replication suggests that the ens” as keywords, and there were only 2 results, between effects of HDV are mediated through a unique molecular which only GSE98383 met our criteria for HDV-associated mechanism. While HBV and HCV are both included in Inter- HCC. Similarly, we searched the keywords “Hepatitis B national Agency for Research on Cancer (IARC) group 1 (high Virus,”“Hepatocellular Carcinoma,” and “Homo sapiens” evidence of carcinogenicity to ), HDV was assigned and obtained 581 search results. The flow chart of screening several years ago to group 3 (not sufficient evidence of carcino- HBV-associated HCC could be seen in Figure 1, and only genicity) [12], due to inadequacy to support the contribution GSE55092 was suited for an in-depth study. of HDV to HBV-induced HCC. Due to the dependency of Microarray data GSE55092 [18] and GSE98383 [11] were HDV on HBV, there are still controversies regarding the generated using the GPL570 Affymetrix HG-U133 Plus 2.0 increased risk of HCC development in chronically HDV- Array platform, and we performed the analysis on the data infected patients [4], and the available data on the particular from whole liver tissue. GSE55092 contains 39 cancer speci- mechanism by which HDV contributes to HCC are sparse. mens and 81 paracancer specimens from 11 HBV-associated With the development of genomics and other “-omics” disci- HCC patients (average age = 57:7±7:7 years; 10 male plines, substantial omics data from HCC specimens have been patients and 1 female patient). GSE98383 contains 11 cancer accumulated [13]. Therefore, researchers have taken advan- specimens and 24 paracancer specimens from 5 HDV- tage of (GO) and signal pathway analysis tools associated HCC patients (average age = 57 ± 3 years, 5 male to identify and characterize many differentially expressed patients). The comparison of baseline characteristics between genes (DEGs). patients with HBV-associated HCC and HDV-associated In the present study, the GSE55092 and GSE98383 HCC is shown in Table 1. mRNA expression profile datasets were retrieved from Gene We downloaded the GSE55092 and GSE98383 datasets, Expression Omnibus (GEO) online [14, 15]. And the RNA- normalized them by the Affy package of the R Bioconductor, sequencing data of 371 HCC patients and their clinical data and then converted the gene expression profile at the probe were from the Cancer Genome Atlas (TCGA). We performed level into gene symbol level and removed the duplicated sym- DEG analysis between cancerous specimens of HBV- bols. When numerous probes were mapped to one gene, the associated HCC and HDV-associated HCC patients and average value was defined as the expression level of that gene. their respective paracancerous specimens using Linear According to the description of the uploader, an unsuper- Models for Microarray Data (LIMMA) [16] and other pack- vised multidimensional scaling (MDS) of all specimens ages implemented in R/Bioconductor [17]. We aimed to obtained from HBV-associated HCC patients showed a clear investigate potential HDV carcinogenesis mechanisms by separation between two distinct clusters that corresponded to PPI network, Gene Expression Profiling Interactive Analysis cancer areas and paracancerous areas. A similar separation (GEPIA), and Weighted Gene Coexpression Network Analy- between two clusters that corresponded to cancerous areas sis (WGCNA), particularly to screen and identify key genes and paracancerous areas was observed for the specimens and pathways to determine their possible role in HCC from HDV-associated HCC patients. Therefore, all the data pathogenesis, and to help determine their mechanism of are available for the identification of DEGs. inhibition of HBV replication and their effects in diagnosis, treatment and prognosis. 2.2. Identification of DEGs. DEG analysis refers to the identi- fication of genes with significantly different expression levels 2. Methods and Materials between two groups through multiple analysis modes [19]. We performed differential expression analysis using Bayes t 2.1. Acquisition of Data and Preprocessing. The RNA- -statistics from the LIMMA implemented in the R Biocon- sequencing data and clinical data of 371 HCC patients were ductor and corrected p values for multiple testing using the downloaded from TCGA (http://cancergenome.nih.gov/. Benjamini-Hochberg method [20]. We identified the DEGs http://cancergenome.nih.gov/). The expression of genes was in primary cancerous specimens of HCC patients by compar- represented by fragments per kilobase of exon per million ing them with paracancerous normal specimens of the same fragments mapped (FPKM). Microarray data were available HCC patients. The absolute value of log2-fold change (FC) at the National Center for Biotechnology Information was set to ≥1.0, and a p value of <0.01 was used as the BioMed Research International 3

Datasets screened N = 581

Excluded N = 542 Incomplete datasets

Datasets N = 39

Met inclusion criteria N = 3

Same processing platform (GPL570) with GSE98383 N = 1 GSE55092

Diferent processing platform with GSE98383 N = 2 GSE22058 and GSE94660

Excluded N = 36

N Reason for exclusion 3 Not related to HCC 15 No biopsy 13 Not have para-cancer specimens 5 No more than 10 available samples

Figure 1: Flow chart of enrolled datasets and availability of datasets.

Table 1: Comparison of baseline characteristics between patients with HBV-associated HCC and HDV-associated HCC.

HBV-associated HCC HDV-associated HCC p Patients 11 5 — Age (years old) 57:7±7:757±30.849 Male (%) 10 [90.9] 5 [100] 1.000 ALT (U/L) 36:18 ± 17:887±250.000 AST (U/L) 39:09 ± 17:082±210.001 GGT (U/L) 93:9±83:56 98 ± 22 0.917 PT (INR) 1:13 ± 0:14 1:4±0 0.001 TB (mg/dL) 0:88 ± 0:47 2:3±1:3 0.005 PLT (103/mL) 15:381 ± 9:373 101:4±16:1 0.000 Liver pathology Activity grade 5:75 ± 3:06 9:1±1:1 0.034 Fibrosis stage 5:1±1:56 6:0±0:0 0.226 F5/F6 9 5 Tumor grade 0.107 G2 7 1 G3 3 4 G4 1 0 Tumor size 0.407 <2cm 4 0 ≥2 and ≤3cm 4 3 >3cm 3 2 Serum HDV RNA positive, no. 0 5 — Serum HBV DNA positive, no. 11 5 — ALT: alanine aminotransferase; AST: aspartate aminotransferase; GGT: γ-glutamyl transferase; TB: total bilirubin; PT: prothrombin time; PLT: Platelets. 4 BioMed Research International significance criteria, and genes that met these criteria were 2.7. Univariate and Multivariate Cox Proportional Hazards used for further analysis. The final step is to use Venn dia- Model Analysis. We performed univariate and multivariate grams to identify DEGs closely related to HDV-associated Cox proportional hazards model analysis in patients from HCC [21]. TCGA, to find independent factors associated with the overall survival of HCC. 2.3. GO Enrichment and Pathway Analysis. DAVID program (https://david.ncifcrf.gov/) [22] is a bioinformatics resource 3. Result comprising a biological database and a set of annotation and analytical tool that intuitively integrates functional geno- 3.1. Comparison of Baseline Characteristics. GSE98383 was mic annotations with graphics. In this study, the DEGs were the only dataset of HDV-associated HCC that met our cri- submitted to DAVID for GO [23] and KEGG [24] enrich- teria, as to the screen of HBV-associated HCC dataset, overall ment analyses, which included biological process (BP), cellu- 581 datasets were enrolled and screened for eligibility and 3 lar component (CC), molecular function (MF), and related datasets met inclusion criteria. Of these 3 datasets, the data biological metabolic pathways. A p value < 0.05 was consid- processing platform of GSE55092 was GPL570, which was ered statistically significant. the same with GSE98383, while GSE22058 and GSE94660 were different, so we took GSE55092 to stand for HBV- 2.4. Weighted Gene Coexpression Network Construction and associated HCC for study. The number of datasets and rea- Module-Clinical Characteristic Associations. We compared sons for exclusion are shown in Figure 1. The HDV- DEGs with the genes of 371 HCC patients downloaded from associated HCC patients in GSE98383 had higher levels of TCGA website. Expression data of the matched genes from alanine aminotransferase (ALT), aspartate aminotransferase TCGA were applied to find gene modules significantly asso- (AST), total bilirubin (TB), prothrombin time (PT), platelet ciated with clinical trait (stage of HCC) by WGCNA [25]. In counts (PLT), and inflammatory activity grade than the this analysis mode, the soft thresholding acts as the lowest HBV-associated HCC patients in GSE55092, which is consis- power based on the criterion of approximate scale-free topol- tent with the characteristics of CHD of the most severe hep- ogy [26], and analogous modules would be merged together atitis. On the other hand, there was no significant difference due to the similarity. The heatmap of module-clinical in sex, age, tumor grade, and tumor size between the two characteristic relationship could reveal modules significantly groups. associated with clinical characteristics. 3.2. Identification of DEGs. We identified DEGs from the 2.5. Construction of PPI Networks and Module Analysis. The microarray GSE55092 and GSE98383 datasets using the visual protein-to-protein interaction (PPI) networks of DEGs LIMMA package, setting |log2-FC| to ≥1.0 and adjusted p were predicted using the web resource Search Tool for the value to <0.01 as the criteria. By comparing the cancerous Retrieval of Interacting Genes (STRING) [27] to search the and paracancerous specimens in GSE55092 up to 1,375, STRING database (https://string-db.org/), which contains DEGs were identified, comprising 518 upregulated and 857 over 5,000 organisms as well as their over 24.6 million pro- downregulated genes (Table S1). A similar comparison in teins and over 2 billion interactions. We correlated the target GSE98383 contains 1,605 DEGs, including 592 upregulated DEGs with the STRING database and set the significant and 1,013 downregulated genes (Table S1). Volcano plots of threshold to the highest confidence level (interaction score the GSE55092 and GSE98383 microarrays are shown in ≥ 0:900). Subsequently, we used Cytoscape [28], a software Figures 2(a) and 2(b), respectively. We used Venn diagrams to construct PPI networks and analyze highly interconnected to determine the DEGs closely related to HDV-associated modules using the built-in Molecular Complex Detection HCC (Figure 2(c)), and 948 DEGs (373 upregulated and (MCODE) clustering algorithm. The parameters were set 582 downregulated) were identified and selected for further by default except for the K-core value which was equal to 8. analysis (Table S2).

2.6. Validation of Module Gene. First, we uploaded the poten- 3.3. GO Enrichment and Pathway Analysis. In order to tial genes identified by PPI-network analysis to GEPIA [29] further screen HDV-associated HCC potential target genes (http://gepia.cancer-pku.cn/, an online server containing among these DEGs, GO and pathway analysis were TCGA/GTEx datasets) to validate the gene expression con- performed on HDV-associated HCC DEGs using p value < sistency between the microarray datasets (GSE55092 and 0.05 as the threshold (Figure 2(d)). The results are presented GSE98383) and TCGA/GTEx HCC dataset, setting the in Figure 2(d) and show the TOP-7 GO terms (BP, CC, and threshold parameters as follows: |log2FC| cutoff ≥ 1:0 and p MF) and KEGG pathway terms significantly enriched in the value cutoff < 0.01. Afterwards, we performed the overall sur- DEGs. Additionally, the TOP-5 annotations of the DEGs vival analysis as follows: we divided the patients in TCGA/G- are shown in Table 2. In the BP series, the DEGs were mostly TEx dataset into high and low expression groups with the enriched for genes related to the cellular response to chemical TPM (transcripts per kilobase million) midvalue as a break- stimulus and organic substance, defense response, and cell point; a log-rank test was used to determine significance at adhesion. In the CC series, the DEGs were primarily enriched p <0:05. Finally, we took the intersection of related genes for genes involved in cell surface, side of membrane, to the OS of HCC by GEPIA and genes contained in the membrane-bounded vesicle, external side of plasma mem- hub module obtained by WGCNA and got the key genes. brane, and proteinaceous extracellular matrix. In the MF BioMed Research International 5

LIFR CXCL14

ERLIN1 CPEB3 15 HHIP 30 LINC01093 KDM8 HEATR2 RCAN1 SLC7A11 LYVE1 SLC38A6 SLC7A11 CLEC1B SORBS1 CD5L MASP1 HAMP CENPW TMPRSS2

10 20 .Val) p −log 10(adj. p .Val) −log 10(adj.

10 5

0 0

−5.0 −2.5 0.0 2.5 5.0 −6 −3 0 3 log2 fold change log2 fold change

Treshold Treshold Down−regulated Down−regulated Unchanged Unchanged Up−regulated Up−regulated (a) (b) HBV-associated HCC HDV-associated HCC

718 657 948 (30.9%) (28.3%) (40.8%)

(c)

Figure 2: Continued. 6 BioMed Research International

p Gene count –log2 value 250 50 45 200 40 35 150 30 25 100 20 15 50 10 5

0 GO:0070887~cellular response to chemical stimulus GO:0071310~cellular response to organic substance GO:0010033~response to organic substance GO:0006952~defense response GO:0007155~cell adhesion GO:0022610~biological adhesion GO:0001775~cell activation GO:0009986~cell surface GO:0098552~side of membrane GO:0031988~membrane-bounded vesicle GO:0005578~proteinaceous extracellular matrix GO:0009897~external side of plasma membrane GO:0044421~extracellular region part GO:0031012~extracellular matrix GO:0001948~glycoprotein binding GO:0098772~molecular function regulator GO:0019955~cytokine binding GO:0008201~heparin binding GO:0005539~glycosaminoglycan binding GO:0005102~receptor binding GO:0097367~carbohydrate derivative binding hsa04062:Chemokine signaling pathway hsa05150:Staphylococcus aureus infection hsa05202:Transcriptional misregulation in cancer hsa04510:Focal adhesion hsa04670:Leukocyte transendothelial migration hsa04666:Fc gamma R-mediated phagocytosis hsa05146:Amoebiasis 0

Biological process Molecular function Cellular component KEGG Pathways (d)

Figure 2: Identification and filtering of DEGs closely related to HDV-associated HCC and their GO and KEGG pathway analysis. (a) Volcano plot of DEGs related to HBV-associated HCC; the DEGs with the top-10 P value differences are shown in the plot. (b) Volcano plot of DEGs related to HDV-associated HCC; the DEGs with the top-10 P value differences are shown in the plot. (c) The Venn diagram shows the intersection of DEGs between HBV-associated HCC and HDV-associated HCC. The right part (yellow) represents the DEGs closely related to HDV-associated HCC. (d) The GO terms and the KEGG pathways of DEGs significantly enriched in HDV-associated HCC. GO: gene ontology; KEGG: Kyoto Encyclopedia of Genes and Genomes; HCC: hepatocellular carcinoma. series, the DEGs were predominantly enriched for genes contained the most DEGs with the number of 320. The five associated with glycoprotein binding, molecular function modules and contents are stored in supplementary material regulator, cytokine binding, heparin binding, and glycosami- Table S3. Next, we further analyzed these modules with noglycan binding. In the KEGG pathway series, the enrich- clinical characteristics (sex, event, OS, stage, grade, and ment for DEGs was mainly in the chemokine signaling age). Obviously, the MEturquoise module was significantly pathway, Staphylococcus aureus infection, transcriptional associated with event (correlation coefficients ðrÞ =0:19, p misregulation in cancer, focal adhesion, and leukocyte <0:001), OS (r = −0:21, p <0:001), stage (r =0:23, p < transendothelial migration. 0:001), grade (r =0:24, p <0:001), and age (r = −0:14, p = 0:01) (Figure 3(d)). 3.4. Weighted Gene Coexpression Network Construction and Module-Clinical Characteristics. We compared 948 DEGs 3.5. Construction of PPI Networks and Gene Module Analysis. with the genes of 371 samples downloaded from TCGA data- We uploaded the DEGs onto the STRING online tool and sets, matched a total of 883 genes, and performed WGCNA. analyzed them with the Cytoscape software. We then selected As shown in Figures 3(a), 3(b), and 3(c), the soft thresholding 353 nodes and 939 edges with the highest confidence power β was set to 3, and MEblue and MEred were merged (scores > 0:900) to construct the PPI networks together due to the similarity (the height = 0:25). Then, we (Figure 4(a)). Then, the MCODE plugin filtered out three found five coexpressed gene modules. The MEturquoise important gene modules. Genes within module 1 and module BioMed Research International 7

Table 2: The top five annotations in GO and KEGG enrichment analysis of the DEGs.

Category Term Count p value GOTERM_BP_FAT GO:0070887~cellular response to chemical stimulus 208 4:96E − 14 GOTERM_BP_FAT GO:0071310~cellular response to organic substance 180 7:46E − 14 GOTERM_BP_FAT GO:0010033~response to organic substance 210 7:89E − 12 GOTERM_BP_FAT GO:0006952~defense response 129 9:24E − 11 GOTERM_BP_FAT GO:0007155~cell adhesion 139 1:95E − 10 GOTERM_CC_FAT GO:0009986~cell surface 69 1:59E − 06 GOTERM_CC_FAT GO:0098552~side of membrane 43 4:16E − 05 GOTERM_CC_FAT GO:0031988~membrane-bounded vesicle 225 4:83E − 05 GOTERM_CC_FAT GO:0005578~proteinaceous extracellular matrix 37 4:95E − 05 GOTERM_CC_FAT GO:0009897~external side of plasma membrane 28 8:75E − 05 GOTERM_MF_FAT GO:0001948~glycoprotein binding 16 9:68E − 05 GOTERM_MF_FAT GO:0098772~molecular function regulator 97 1:11E − 04 GOTERM_MF_FAT GO:0019955~cytokine binding 15 1:44E − 04 GOTERM_MF_FAT GO:0008201~heparin binding 20 2:02E − 04 GOTERM_MF_FAT GO:0005539~glycosaminoglycan binding 23 3:53E − 04 KEGG_PATHWAY hsa04062: Chemokine signaling pathway 25 1:16E − 04 KEGG_PATHWAY hsa05150: Staphylococcus aureus infection 12 1:73E − 04 KEGG_PATHWAY hsa05202: Transcriptional misregulation in cancer 22 4:38E − 04 KEGG_PATHWAY hsa04510: Focal adhesion 25 5:49E − 04 KEGG_PATHWAY hsa04670: Leukocyte transendothelial migration 17 6:91E − 04

2 are comprised of downregulated genes, while module 3 is genes in module 3, including CDC6, CDC45, CDCA5, comprised of upregulated genes, except for PPP2R5C and CDCA8, CENPH, MCM4, MCM7, and TCEB1, were upregu- LONRF1. Module 1 contains 15 nodes and 105 edges lated in tumor compared to normal specimens in the HCC (Figure 4(b)), which are mainly related to G protein- datasets, which is consistent with the HDV-associated HCC coupled receptor signaling pathway (BP), plasma membrane patients. All the box plots comparing gene expression are (CC), G protein-coupled receptor binding (MF), and chemo- shown in Figure 5(a). For further verification, 11 genes whose kine signaling pathway (KEGG) (Table 3). Module 2 contains gene expression trends are consistent with HCC datasets 12 nodes and 66 edges (Figure 4(c)), which are primarily were selected and used to conduct overall survival analysis. related to type I interferon signaling pathway (BP), cytosol In Figure 5(b), there were the upregulated genes (CDC6, (CC), 2′-5′-oligoadenylate synthetase activity (MF), and CDC45, CDCA5, CDCA8, CENPH, MCM4, MCM7, and hepatitis C (KEGG) (Table 4). Module 3 contains 25 nodes TCEB1) which are associated with a lower survival rate in and 94 edges (Figure 4(d)), which are predominantly related the high expression group than in the low expression group. to the mitotic cell cycle process (BP), chromosomal part The reason for excluding CCL21, FPR1, IFI6, IFI27, (CC), DNA replication origin binding (MF), and cell cycle ISG15, and XAF1 is that their p value did not comply with (KEGG) (Table 5). the standards or the opposite gene expression. All the 8 retained potential genes are related to the nucleoplasm, and most of them are related to the mitotic cell cycle process, cell 3.6. Validation of Module Genes. We compared the gene cycle, and DNA replication (Figure 5(c)). Taken together, the expression changes between the three module genes of intersection of the above validated 8 genes and 320 genes in HDV-associated HCC (a total of 52 genes) and the validation MEturquoise module by WGCNA, the potential 7 key genes HCC (TCGA/GTEx) datasets in the GEPIA website to verify (CDC6, CDC45, CDCA5, CDCA8, CENPH, MCM4, and whether their expression in both datasets is consistent. We MCM7) were found (Figure 5(d)). noticed that CCL21 and FPR1 in module 1 as well as XAF1 in module 2 were downregulated in tumors compared to nor- mal specimens in the HCC datasets, which is in accordance 3.7. Identification of Independent Factors of Overall Survival with the HDV-associated HCC specimens. However, IFI6, of HCC. We performed the univariate analysis in 371 patients IFI27, and ISG15 in module 2, which were downregulated from TCGA and found that CDCA8, stage, CDC45, CDC6, in cancerous specimens compared to paracancerous normal CDCA5, MCM4, CENPH, MCM7, sex, and age were signifi- specimens, were conversely expressed in HCC datasets. The cantly associated with OS of HCC. The multivariate Cox 8 BioMed Research International

Scale independence Mean connectivity 4 5 6 8 10 12 14 1 3 7 9 16 2 120 0.8 2 100

0.6 80

60 0.4

Mean connectivity 40 2 18 20

0.2 20 3

Scale free topology model ft, signed R 4 5 6 7 1 0 8 9 10 12 14 16 18 20 5101520 5 101520 Sof threshold (power) Sof threshold (power) (a) Clustering of ME before combination Clustering of ME afer combination 1.0 1.0 0.9 0.8 0.8 MEgrey MEgrey 0.6 0.7 Height 0.6 Height 0.4 0.5 0.2

MEbrown 0.4 MEgreen MEyellow 0.0 MEturquoise MEbrown MEblue MEyellow MEred MEturquoise MEblue MEgreen (b) Gene dendrogram and module colors 1.0

0.8

Height 0.6

0.4 Dynamic tree cut Merged dynamics

(c)

Figure 3: Continued. BioMed Research International 9

Module−trait relationships 1

−0.065 −0.018 0.088 −0.13 MEyellow −0.027 −0.018 (0.2) (0.8) (0.1) (0.6) (0.8) (0.02)

−0.022 0.031 0.0092 −0.03 0.065 −0.027 MEblue 0.5 (0.7) (0.6) (0.9) (0.6) (0.2) (0.6)

−0.00066 −0.0068 −0.027 0.043 0.052 −0.073 MEgreen (1) (0.9) (0.6) (0.4) (0.4) (0.2)

0

−0.19 0.054 −0.057 0.18 0.12 −0.054 MEbrown (5e−04) (0.3) (0.3) (0.001) (0.03) (0.3)

−0.029 0.19 −0.21 0.23 0.24 −0.14 MEturquoise −0.5 (0.6) (5e−04) (2e−04) (3e−05) (2e−05) (0.01)

0.05 0.057 −0.022 0.072 0.065 −0.12 MEgrey (0.4) (0.3) (0.7) (0.2) (0.2) (0.03)

−1 Age A1_OS A18_Sex A6_Stage A2_Event A7_Grade (d)

Figure 3: The processing steps of WGCNA. (a) Analysis of the soft thresholding power (β =3). (b) MEblue and MEred merged together due to the similarity (the height = 0:25). (c) Gene dendrogram and module colors; the MEturquoise contained the most DEGs (n = 320). (d) Heatmap of module-trait relationships. The MEturquoise was the most significantly associated with event, OS, stage, grade, and age. WGCNA: Weighted Gene Coexpression Network Analysis. proportional hazards model showed that CDCA8 and stage GEPIA database. In the module confirmed by the WGCNA of HCC were independent factors of OS of HCC (Table 6). that was significantly associated with clinical features includ- ing event, OS, stage, grade, and age, only CDC6, CDC45, 4. Discussion CDCA5, CDCA8, CENPH, MCM4, and MCM7 were consis- tent with genes identified by the PPI network, which were As the virus causing the most severe type of hepatitis, HDV found to be significantly correlated with nucleoplasm, cell affects 15-20 million people worldwide, but its specific path- cycle, DNA replication, and mitotic cell cycle. Univariate ogenic mechanism remains unclear. Accordingly, we under- and multivariate Cox proportional hazards model analysis took to find potential genes and pathways involved in the showed the stage of HCC and CDCA8 are the independent pathogenesis of this disease through text mining to help factors associated with the OS of HCC. explain the underlying carcinogenic mechanism of HDV as cycle 6 (CDC6) is thought to be significantly well as the HBV inhibitory mechanism. associated with pancreatic cancer and colorectal cancer In this study, we compared cancerous and paracancerous (CRC) [30]. It has a pivotal role in regulating the process of specimens of patients suffering from HBV or HDV- DNA replication as well as tumorigenesis; its overexpression associated HCC with the aim of identifying potential genes could interfere with the expression of tumor suppressor closely related to HDV-associated HCC. The study identified genes (INK4/ARF) through the mechanism of epigenetic 373 upregulated DEGs and 582 downregulated DEGs. These modification [31]. During the S phase of DNA replication DEGs were subjected to GO and KEGG annotation and in eukaryotic cells, cell division cycle 45 (CDC45) is an essen- enrichment analyses. In addition, we constructed PPI net- tial component of CMG (CDC45–MCM–GINS) . works and sorted out 353 nodes with 939 edges, from which CDC45 acts as a hubprotein, significantly upregulated in can- the three most significant modules were selected and 52 cen- cerous tissues from CRC and non-small-cell tral nodes/genes were selected for validation using the (NSCLC) patients, and promotes tumor progression [32]. It 10 BioMed Research International

MTHFD2L WDR12 MTHFD1 XPO4 BAG2 GP2 MFAP4 CLDN11 CSTF3 XPOT GGCT CSTA DCAF13 GEMIN6 DNAJB6 CD52 NSUN6 UTP15 RNF213 TCEB1 EFEMP1 COPS6 SAP30 PRMT6 LONRF1 SLC12A8 CDC25B KCTD6 NUP205 NME5 FOLR2 EFEMP2 CLDN12 LSM4 GGT5 CHEK2 DDB2 FBXL18 NUP155 ANP32E KLHL13 TYMS MND1 TRIM37 SKP2 H2AFZ CMPK2 ENPP6 NDC1 DTYMK HSPA2 GZMB ANTXR1 TMOD1 EIF5 UBE2T TIPIN RAD54B NCAPG2 CCND3 CKS2 SMC4 FABP4 RAD51 CBX3 NPM1 KPNA3 TNFRSF17 POLQ EXO1 RNASEH2A TIMELESS NCAPH CENPI PRF1 ANTXR2 VIM TMEM86B ASF1A SPAG5 EIF3B MCM4 CDCA8 OIP5 NCOR2 NME1 CDC45 KIF23 CENPH DBF4 MIS18A TNFSF13B RFC3 CDC6 ESPL1 GZMA C1orf162 HKDC1 CCND1 KPNA2 SSPN DLAT MCM10 MCM7 ZWILCH PGR TNFSF11 CCNE1 CENPE CDKN1A CFH LTB POLE2 HLA-DOA GMNN ORC6 PPP2R5C TFAP2A PPAP2C DGAT2 KIF18A PPAPDC1A KIFC2 TNFRSF1B SGCD GNPDA1 ACACA CFB CDCA5 PPP4R4 TACC3 CD55 NFATC1 TMED3 SLC44A2 CD97 PTPRS AGPAT2 SARS PTGDR RUNX3 SPRY1 SSR1 ISG15 KCNJ10 RGS9 SPHK1 SELPLG LILRB2 BST2 CCT6A GLIPR1 PNOC XAF1 CXCL5 SELP CD53 RIT1 SELL RASGRP1 ATP8B4 PTPRB NLRC3 MX2 CCR7 NTF3 OAS2 P2RY12 ERBB2 GARS SEC61A2 IRF9 ADCY7 ADRA2A PRDM1 IFI27 IL2RB ITGB2 JAK3 ETS1 PAX5 TMEM173 IFI6 PRKCB KLB CCL4 GNG2 OAS1 FPR1 KAL1 VSIG4 CXCL16 LAT IFIT1 TRIM6 LCK AREG CSF2RB RRAS2 MX1 IRF7 P2RY14 CXCR4 SLA FGF1 C1QB IFI16 BLK SH2D1A CCL21 GNB4 PRKCH CSF1R TLR8 PIK3CG NRAS CD163 C1QC CCR2 ITGA4 TRIM24 C1QA FYN IGLL5 CYSLTR1 RALAA GBP5 NCAM1 PIK3CD DPYSL3 HOMER1 ASGR1 TGM2 PIK3R5 NLRC5 VAV2 NCKAP1L SYT1 SYT9 RNF135 GPR65 TBXA2R PTGFR PTK2 WASF1 FNBP1L MAP2K6 AGFG1 NRXN3 TLR4 FN1 MAP3K9 ITGA1 PIP5K1B FGB NAA50 NTS GNA14 GRK5 SERPINA1 ASGR2 COL4A3 AMICA1 ARPC1A TFRC TGFB1I1 WIPF1 ATP6V1C1 COL4A4 TXNIP TGOLN2 RAB31 PRKG1 LAMA2 STC2 PDSS1 COL6A1 LAMC3 RAB38 PLA2G4A IGFBP7 AP1S1 NAA20 ADRBK2 SRGN COL13A1 GSN AFP SQLE CHML SLIT2 CALD1 FUCA2 CHSY1 FDPS RAC2 QSOX1 TPD52 RAB29 DENND2C COL28A1 AP1M2 MYL9 RHOH FTH1 SDC3 DOCK2 LSS ENAH NTN4 TIMP2 ROBO2 HMGCS1 DDC TES CSGALNACT1 HMGCR PREX1 CECR1 FGD3 DLC1 NEU1 CRACR2A GLA AOC3 EIF4EBP1 ARHGAP25 SASS6 LAMTOR4 IRAK3 CENPJ ENG ARHGEF37 MYH4 RBP1 PIWIL4 SMAD7 ARHGDIB PTN ALDH1A3 ARHGAP30 TIMP3 LGALS3BP HS3ST3B1 IL1RN PLK4 RRAGC GPC4 MZT1 SMAD6 ITIH3 IL1R2 HENMT1 (a) CXCR4 OAS1

ADCY7 IFI27 IRF9 GNG2 ADRA2A IRF7 ISG15 FPR1 CCR7 CXCL16

PNOC CCL21 XAF1 OAS2

CCR2 GNB4 CCL4

CXCL5 MX2 IFI6 P2RY12 P2RY14

MX1 IFIT1 BST2 (b) (c)

Figure 4: Continued. BioMed Research International 11

FBXL18 LONRF1 TCEB1

SKP2 TRIM37 RNF213

KCTD6 CDCA8 KLHL13 PPP2R5C

CENPE CENPH CDCA5

CENPI ZWILCH ESPL1 KIF18A CDC6 CDC45

MCM4

MCM7 MCM10 ORC6

DBF4 POLE2 (d)

Figure 4: PPI networks and the top-3 significant modules (module 1-3). (a) PPI networks constructed with the DEGs closely related to HDV- associated HCC. The red border indicates upregulation, the blue border indicates downregulation, the pink core represents module 1, the green core represents module 2, the dark blue core represents module 3, and the size of the circle represents the relative expression level of the genes; (b) module 1, the DEGs in module 1 are all downregulated; (c) module 2, the DEGs in module 2 are all downregulated; (d) module 3, the DEGs in module 3 are upregulated except for PPP2R5C and LONRF1. PPI: protein-to-protein interaction; DEGs: differentially expressed genes.

Table 3: Functional and pathway enrichment of module 1 genes.

Category Term Count p value Genes GOTERM_ G protein-coupled receptor ADCY7, ADRA2A, CCL21, CCL4, CCR2, CCR7, CXCL16, CXCL5, 14 2:79E − 13 BP_FAT signaling pathway CXCR4, FPR1, GNG2, P2RY12, P2RY14, PNOC GOTERM_ Cell chemotaxis 8 2:16E − 10 CCL21, CCL4, CCR2, CCR7, CXCL16, CXCL5, CXCR4, FPR1 BP_FAT GOTERM_ Chemokine-mediated 6 6:75E − 09 CCL21, CCL4, CCR2, CCR7, CXCL5, CXCR4 BP_FAT signaling pathway GOTERM_ ADCY7, ADRA2A, CCR2, CCR7, CXCL16, CXCR4, FPR1, GNG2, Plasma membrane 11 0.0205 CC_FAT P2RY12, P2RY14, PNOC GOTERM_ External side of plasma 3 0.0205 CCR2, CCR7, P2RY12 CC_FAT membrane GOTERM_ Side of membrane 4 0.0205 CCR2, CCR7, GNG2, P2RY12 CC_FAT GOTERM_ G protein-coupled receptor 7 7:55E − 08 ADRA2A, CCL21, CCL4, CCR2, CXCL16, CXCL5, PNOC MF_FAT binding GOTERM_ Chemokine receptor binding 5 7:55E − 08 CCL21, CCL4, CCR2, CXCL16, CXCL5 MF_FAT GOTERM_ Chemokine activity 4 2:30E − 06 CCL21, CCL4, CXCL16, CXCL5 MF_FAT KEGG_ ADCY7, CCL21, CCL4, CCR2, CCR7, CXCL16, CXCL5, CXCR4, Chemokine signaling pathway 10 1:15E − 15 PATHWAY GNB4, GNG2 KEGG_ Cytokine-cytokine receptor 7 1:67E − 08 CCL21, CCL4, CCR2, CCR7, CXCL16, CXCL5, CXCR4 PATHWAY interaction KEGG_ Circadian entrainment 3 0.00092 ADCY7, GNB4, GNG2 PATHWAY 12 BioMed Research International

Table 4: Functional and pathway enrichment of module 2 genes.

Category Term Count p value Genes GOTERM_BP_ BST2, IFI27, IFI6, IFIT1, IRF7, IRF9, ISG15, MX1, MX2, Type I interferon signaling pathway 12 2:85E − 27 FAT OAS1, OAS2, XAF1 GOTERM_BP_ Defense response to virus 9 1:15E − 14 BST2, IFIT1, IRF7, IRF9, ISG15, MX1, MX2, OAS1, OAS2 FAT GOTERM_BP_ Negative regulation of viral genome 5 3:48E − 09 BST2, IFIT1, ISG15, MX1, OAS1 FAT replication GOTERM_CC_ BST2, IFIT1, IRF7, IRF9, ISG15, MX1, MX2, OAS1, OAS2, Cytosol 10 0.004 FAT XAF1 GOTERM_CC_ Mitochondrion 6 0.0067 IFI27, IFI6, MX1, MX2, OAS1, XAF1 FAT GOTERM_CC_ BST2, IFI27, IFI6, IFIT1, IRF7, IRF9, ISG15, MX1, MX2, Cytoplasmic part 12 0.0067 FAT OAS1, OAS2, XAF1 GOTERM_MF_ 2′-5′-Oligoadenylate synthetase 2 0.00028 OAS1, OAS2 FAT activity GOTERM_MF_ Double-stranded RNA binding 2 0.0229 OAS1, OAS2 FAT KEGG_ Hepatitis C 5 1:72E − 07 IFIT1, IRF7, IRF9, OAS1, OAS2 PATHWAY KEGG_ Measles 5 1:72E − 07 IRF7, IRF9, MX1, OAS1, OAS2 PATHWAY KEGG_ Influenza A 5 1:92E − 07 IRF7, IRF9, MX1, OAS1, OAS2 PATHWAY

Table 5: Functional and pathway enrichment of module 3 genes.

Category Term Count p value Genes GOTERM_ CDC45, CDC6, CDCA5, CDCA8, CENPE, DBF4, ESPL1, KIF18A, MCM10, Mitotic cell cycle process 16 2:69E − 16 BP_FAT MCM4, MCM7, ORC6, POLE2, PPP2R5C, SKP2, ZWILCH GOTERM_ CDC45, CDC6, CDCA5, CDCA8, CENPE, DBF4, ESPL1, KIF18A, KLHL13, Cell cycle 17 6:32E − 13 BP_FAT MCM10, MCM4, MCM7, ORC6, POLE2, PPP2R5C, SKP2, ZWILCH GOTERM_ G1/S transition of mitotic 9 4:80E − 12 CDC45, CDC6, DBF4, MCM10, MCM4, MCM7, ORC6, POLE2, SKP2 BP_FAT cell cycle GOTERM_ CDC45, CDCA5, CDCA8, CENPE, CENPH, CENPI, KIF18A, MCM10, Chromosomal part 13 5:18E − 10 CC_FAT MCM7, ORC6, POLE2, PPP2R5C, ZWILCH GOTERM_ , 8 3:39E − 09 CDCA5, CDCA8, CENPE, CENPH, CENPI, KIF18A, PPP2R5C, ZWILCH CC_FAT centromeric region Intracellular CDC45, CDC6, CDCA5, CDCA8, CENPE, CENPH, CENPI, ESPL1, KCTD6, GOTERM_ nonmembrane-bounded 18 1:05E − 06 KIF18A, MCM10, MCM7, ORC6, POLE2, PPP2R5C, RNF213, SKP2, CC_FAT organelle ZWILCH GOTERM_ DNA replication origin 3 0.00011 CDC45, MCM10, ORC6 MF_FAT binding GOTERM_ Single-stranded DNA 4 0.00037 CDC45, MCM10, MCM4, MCM7 MF_FAT binding GOTERM_ DNA helicase activity 3 0.00069 CDC45, MCM4, MCM7 MF_FAT KEGG_ Cell cycle 8 8:55E − 11 CDC45, CDC6, DBF4, ESPL1, MCM4, MCM7, ORC6, SKP2 PATHWAY KEGG_ DNA replication 3 0.00022 MCM4, MCM7, POLE2 PATHWAY KEGG_ Ubiquitin-mediated 4 0.00024 KLHL13, SKP2, TCEB1, TRIM37 PATHWAY proteolysis BioMed Research International 13

Module 1 CCL21 FPR1 ⁎⁎6 12 5 10 4 8 3 6 4 2 2 1 0 0 LIHC LIHC (num(T)=369; num(N)=160) (num(T)=369; num(N)=160) Module 2 IFI6 IFI27 ISG15 XAF1 ⁎ 14 ⁎ ⁎ 8 ⁎ 12 12 12 10 10 6 10 8 8 8 4

6 6 6 2 4 4 4

2 2 0 LIHC LIHC LIHC LIHC (num(T)=369; num(N)=160) (num(T)=369; num(N)=160) (num(T)=369; num(N)=160) (num(T)=369; num(N)=160)

Module 3 CDC6 CDC45 CDCA5 CDCA8 ⁎ ⁎ ⁎ ⁎ 6 5 5 5 5 4 4 4 4 3 3 3 3

2 2 2 2

1 1 1 1

0 0 0 0 LIHC LIHC LIHC LIHC (num(T)=369; num(N)=160) (num(T)=369; num(N)=160) (num(T)=369; num(N)=160) (num(T)=369; num(N)=160)

CENPH MCM4 MCM7 TCEB1 ⁎ 7 ⁎ ⁎ ⁎ 8 8 4 6 7 7 5 3 6 4 6 5 2 3 5 4 2 4 1 3 1 3

LIHC LIHC LIHC LIHC (num(T)=369; num(N)=160) (num(T)=369; num(N)=160) (num(T)=369; num(N)=160) (num(T)=369; num(N)=160) (a)

Figure 5: Continued. 14 BioMed Research International

CDC6 CDC45 CDCA5 CDCA8 Overall survival Overall survival Overall survival Overall survival

1.0 Logrank p = 0.02 1.0 Logrank p = 0.00031 1.0 Logrank p = 0.00021 1.0 Logrank p = 0.00026 HR(high) = 1.5 HR(high) = 1.9 HR(high) = 1.9 HR(high) = 1.9 p(HR) = 0.02 p(HR) = 0.00039 p(HR) = 0.00026 p(HR) = 0.00032 0.8 0.8 n 0.8 n 0.8 n n(high) = 182 (high) = 182 (high) = 182 (high) = 182 n n n(low) = 182 (low) = 182 (low) = 182 n(low) = 182 0.6 0.6 0.6 0.6

0.4 0.4 0.4 0.4 Percent survival Percent survival Percent survival Percent Percent survival Percent 0.2 0.2 0.2 0.2

0.0 0.0 0.0 0.0 0 20 40 60 80 100 120 0 20406080100120 0 20 40 60 80 100 120 0 20 40 60 80 100 120 Months Months Months Months

Low CDC6 TPM Low CDC45 TPM Low CDCA5 TPM Low CDCA8 TPM High CDC6 TPM High CDC45 TPM High CDCA5 TPM High CDCA8 TPM

CENPH MCM4 MCM7 TCEB1 Overall survival Overall survival Overall survival Overall survival p 1.0 Logrank = 0.00068 1.0 Logrank p = 0.049 1.0 Logrank p = 0.00032 1.0 Logrank p = 0.024 HR(high) = 1.8 HR(high) = 1.4 HR(high) = 1.9 HR(high) = 1.5 p(HR) = 0.00081 p(HR) = 0.05 p p(HR) = 0.025 0.8 0.8 0.8 (HR) = 4e−04 0.8 n(high) = 182 n(high) = 182 n(high) = 182 n(high) = 182 n(low) = 182 n(low) = 182 n(low) = 182 n(low) = 182 0.6 0.6 0.6 0.6

0.4 0.4 0.4 0.4 Percent survival Percent survival Percent survival Percent survival Percent 0.2 0.2 0.2 0.2

0.0 0.0 0.0 0.0 0 20 40 60 80 100 120 0 20406080100120 0 20 40 60 80 100 120 0 20 40 60 80 100 120 Months Months Months Months

Low CENPH TPM Low MCM4 TPM Low MCM7 TPM Low TCEB1 TPM High CENPH TPM High MCM4 TPM High MCM7 TPM High TCEB1 TPM (b)

Figure 5: Continued. BioMed Research International 15

MF BP DNA helicase activity DNA replication initiation Single stranded DNA binding DNA conformation change logFC mitotic cell cycle process 1 CDCA5 G1 S transition of mitotic cell cycle Chromosome organization DNA duplex unwinding Positive regulation of mitotic cell cycle MCM7 phase transition DNA metabolic process DNA unwinding involved in DNA replication DNA replication checkpoint CDCA8 Positive regulation of G1 S transition of mitotic cell cycle Regulation of transcription involved in G1/S transition of mitotic cell cycle Mitotic metaphase plate congression Regulation of chromosome organization Regulation of sister segregation TCEB1 Cell division Nucleic acid metabolic process Regulation of mitotic nuclear division Cellular response to stress Double strand break repair DNA packaging Cellular response to DNA damage

stimulus CDC45 Protein DNA complex assembly Cellular macromolecule biosynthetic process

CENPH CC Nucleoplasm MCM complex Chromosomal part Chromosome centromeric region Spindle midzone CDC6 Protein containing complex Intracellular non membrane bounded organelle Condensed chromosome MCM4 KEGG Cytosol 0 cytoskeleton Cell cycle Nuclear chromosome part DNA replication Cytoskeletal part (c)

313 7 1 (97.5%) (2.2%) (0.3%)

(d)

Figure 5: Validation of the expression data and survival curve of hub genes from the 3 modules using the GEPIA database and functional and pathway enrichment analysis. (a) The box plots that verify whether the expression of these DEGs is consistent with that in the LIHC datasets. Among the downregulated genes, CCL21 and FPR1 (module 1) as well as XAF1 (module 2) are consistent with the HCC datasets, while IFI6, IFI27, and ISG15 (module 2) are not. All 8 upregulated genes (module 3) are consistent with the LIHC datasets. (b) The genes are associated with overall survival whose expression is consistent with that in the LIHC datasets. All 8 upregulated genes are from module 3. (c) The chord diagram showing GO terms and KEGG pathway enrichment with the 8 hub genes involved. (d) The intersection of these 8 genes from PPI network analysis and 320 genes contained in module MEturquoise obtained by WGCNA, 7 potential key genes in the middle part. HCC: hepatocellular carcinoma; GO: gene ontology; KEGG: Kyoto Encyclopedia of Genes and Genomes; BP: biological process; CC: cellular component; MF: molecular function; PPI: protein-to-protein interaction; WGCNA: Weighted Gene Coexpression Network Analysis. is established that the MCM4/6/7 (minichromosome mainte- the interaction with MCM7, thereby disrupting the stability nance complex component 4/6/7) hexamer complex acts as a of the MCM4/6/7 complex [33]. In addition, MCM4 is also DNA helicase. Additionally, in endometrial cancer and skin a member of significant predictors of poor prognosis in cancer studies, it was found that MCM4 mutations may affect CRC patients [34]. Moreover, MCM7 is also a promising 16 BioMed Research International

Table 6: Univariate and Multivariate Cox proportional hazards model analysis of overall survival of HCC.

Univariate analysis Multivariate analysis Variables HR 95% CI p value HR 95% CI p value CDCA8 1.11 1.08-1.15 <0.001 1.10 1.06-1.14 <0.001 Stage I <0.001 0.019 Stage II 1.43 0.88-2.34 0.151 1.23 0.75-2.02 0.416 Stage IIIA 2.68 1.70-4.21 <0.001 2.11 1.31-3.39 0.002 Stage IIIB 2.87 1.02-8.04 0.045 1.84 0.62-5.44 0.273 CDC45 1.13 1.07-1.19 <0.001 CDC6 1.10 1.05-1.15 <0.001 CDCA5 1.09 1.05-1.13 <0.001 MCM4 1.06 1.04-1.09 <0.001 CENPH 1.18 1.08-1.28 <0.001 MCM7 1.01 1.00-1.01 0.029 Gender (F vs. M) 0.82 0.57-1.16 0.257 Age (years) 1.01 0.99-1.02 0.403 F: female; M: male. biomarker for early diagnosis of gastric cancer and even a comparative analysis with previous studies, these genes were predictor of meningioma recurrence after surgery [35, 36]. found to be involved in many pathways related to tumorigen- Another study found that high expression of MCM7 may esis which provided clues to elucidate the mechanism of hep- be involved in the progression of HCC through the MCM7- atitis D virus-induced HCC or its unique molecular cyclin D1 pathway, and MCM7 may serve as a prognostic mechanism in the inhibition of HBV replication. However, marker for patients with HCC [37]. The cell division cycle- additional in-depth molecular biological research on these associated protein 5 (CDCA5) is a member of the CDCA candidate genes closely related to HDV-associated HCC is family that comprises CDCA1-8. It plays a crucial role as a necessary to confirm their functions. regulator of sister-chromatid cohesion and separation during cell division, and its upregulation has been shown to be asso- Data Availability ciated with various , including breast cancer, esopha- fi geal squamous cell carcinoma, CRC, and HCC [38, 39]. Also, The datasets used to support the ndings of this study are a study found that the activation of the ERK and AKT path- available from GEO (https://www.ncbi.nlm.nih.gov/geo/) ways may be involved in the regulation of HCC cell prolifer- and TCGA (https://portal.gdc.cancer.gov/). ation by CDCA45 [39]. CDCA8 is an essential regulator of , and its overexpression is significantly associated with Conflicts of Interest bladder cancer, cutaneous melanoma, and the progression The authors declare that they have no competing interests. and prognosis of breast cancer [40, 41]. protein H (CENPH) is considered to be an essential part of the active ’ centromere complex, and its overexpression is highly related Authors Contributions to poor prognosis in renal cell carcinoma, nasopharyngeal Zhe Yu, Xuemei Ma, and Wei Zhang contributed equally to carcinoma, CRC, and HCC [42, 43]. Another study found this work. that CENPH may promote the proliferation of HCC through the mitochondrial apoptosis pathway [43]. Acknowledgments Most of the abovementioned genes are significantly asso- ciated with the cell cycle and DNA replication, and their This work was supported by the State Key Projects Special- overexpression may affect the replication of HBV DNA, ized on Infectious Disease, Chinese Ministry of Science and thereby promoting the unique phenomenon of HDV inhibits Technology (2013ZX10005002 and 2018ZX10725506), and HBV replication. All the findings related to these genes may Beijing Key Research Project of Special Clinical Application also help us understand the mechanisms of HDV-induced (Z151100004015221). Thanks are due to Dr. Zhao’s team liver injury and HCC. In the future, we will further verify (Biomedical Official Wechat Account: SCIPhD) of Sheng- those genes’ function by performing animals, cells, and clin- XinZhuShou for assistance. ical trials. Supplementary Materials 5. Conclusions Table S1: DEGs from microarray datasets GSE55092 and In summary, 7 potential candidate genes closely related to GSE98383. Table S2: 948 DEGs related to HDV-associated HDV-associated HCC were identified in this study. Through HCC including 373 upregulated and 582 downregulated BioMed Research International 17 genes. Table S3: the five modules and contents obtained by [18] M. Melis, G. Diaz, D. E. Kleiner et al., “Viral expression and WGCNA. (Supplementary Materials) molecular profiling in liver tissue versus microdissected hepa- tocytes in hepatitis B virus-associated hepatocellular carci- References noma,” Journal of Translational Medicine, vol. 12, no. 1, p. 230, 2014. [1] F. Bray, J. Ferlay, I. Soerjomataram, R. L. Siegel, L. A. Torre, [19] E. Korpelainen, J. Tuimala, and P. Somervuo, RNA-Seq Data and A. Jemal, “Global cancer statistics 2018: GLOBOCAN esti- Analysis - a Practical Approach, Chapman and Hall/CRC, mates of incidence and mortality worldwide for 36 cancers in 2014. 185 countries,” CA: a Cancer Journal for Clinicians, vol. 68, [20] Y. Benjamini and Y. Hochberg, “Controlling the false discov- no. 6, pp. 394–424, 2018. ery rate: a practical and powerful approach to multiple test- [2] A. Villanueva, “Hepatocellular carcinoma,” The New England ing,” Journal of the Royal statistical society: series B Journal of Medicine, vol. 380, no. 15, pp. 1450–1462, 2019. (Methodological), vol. 57, no. 1, pp. 289–300, 1995. [3] W. Chen, R. Zheng, P. D. Baade et al., “Cancer statistics in [21] J. C. Oliveros, Venny. An interactive tool for comparing lists China, 2015,” CA: a Cancer Journal for Clinicians, vol. 66, with Venn's diagrams, 2007, Available from: https:// no. 2, pp. 115–132, 2016. bioinfogp.cnb.csic.es/tools/venny/index.html. [4] R. Romeo, A. Petruzziello, E. I. Pecheur et al., “Hepatitis delta [22] G. Dennis Jr., B. T. Sherman, D. A. Hosack et al., “DAVID: virus and hepatocellular carcinoma: an update,” Epidemiology database for annotation, visualization, and integrated discov- and Infection, vol. 146, no. 13, pp. 1612–1618, 2018. ery,” Genome Biology, vol. 4, no. 5, 2003. [5] W. K. Seto, Y. R. Lo, J. M. Pawlotsky, and M. F. Yuen, “Chronic [23] M. Ashburner, C. A. Ball, J. A. Blake et al., “Gene Ontology: hepatitis B virus infection,” Lancet, vol. 392, no. 10161, tool for the unification of biology,” Nature , vol. 25, pp. 2313–2324, 2018. no. 1, pp. 25–29, 2000. [6] L. S. Y. Tang, E. Covert, E. Wilson, and S. Kottilil, “Chronic [24] H. Ogata, S. Goto, K. Sato, W. Fujibuchi, H. Bono, and hepatitis B infection,” JAMA, vol. 319, no. 17, pp. 1802–1813, M. Kanehisa, “KEGG: Kyoto Encyclopedia of Genes and 2018. Genomes,” Nucleic Acids Research, vol. 27, no. 1, pp. 29–34, 1999. [7] Y. Ni, Z. Zhang, L. Engelskircher et al., “Generation and char- [25] P. Langfelder and S. Horvath, “WGCNA: an R package for acterization of a stable cell line persistently replicating and weighted correlation network analysis,” BMC Bioinformatics, secreting the hepatitis delta virus,” Scientific Reports, vol. 9, no. 1, 2008. vol. 9, no. 1, p. 10021, 2019. [26] B. Zhang and S. Horvath, “A general framework for weighted [8] G. Brancaccio, M. Fasano, A. Grossi, T. A. Santantonio, and gene co-expression network analysis,” Statistical Applications G. B. Gaeta, “Clinical outcomes in patients with hepatitis D, in Genetics and Molecular Biology, vol. 4, no. 1, 2005. cirrhosis and persistent hepatitis B virus replication, and [27] A. Franceschini, D. Szklarczyk, S. Frankild et al., “STRING receiving long-term tenofovir or entecavir,” Alimentary Phar- v9.1: protein-protein interaction networks, with increased cov- macology & Therapeutics, vol. 49, no. 8, pp. 1071–1076, 2019. erage and integration,” Nucleic Acids Research, vol. 41, [9] M. T. Binh, N. X. Hoan, H. van Tong et al., “HDV infection pp. D808–D815, 2013. rates in northern Vietnam,” Scientific Reports, vol. 8, no. 1, [28] M. Kohl, S. Wiese, and B. Warscheid, “Cytoscape: software for p. 8047, 2018. visualization and analysis of biological networks,” Methods in [10] C. Sureau and F. Negro, “The hepatitis delta virus: replication Molecular Biology, vol. 696, pp. 291–303, 2011. and pathogenesis,” Journal of Hepatology, vol. 64, no. 1, [29] Z. Tang, C. Li, B. Kang, G. Gao, C. Li, and Z. Zhang, “GEPIA: a pp. S102–S116, 2016. web server for cancer and normal gene expression profiling [11] G. Diaz, R. E. Engle, A. Tice et al., “Molecular signature and and interactive analyses,” Nucleic Acids Research, vol. 45, mechanisms of hepatitis D virus-associated hepatocellular car- no. W1, pp. W98–W102, 2017. cinoma,” Molecular Cancer Research, vol. 16, no. 9, pp. 1406– [30] Y. Hu, L. Wang, Z. Li et al., “Potential prognostic and diagnos- 1419, 2018. tic values of CDC6, CDC45, ORC6 and SNHG7 in colorectal [12] Humans, I.W.G.o.t.E.o.C.R.t and I.A.f.R.o. Cancer, Hepatitis cancer,” Oncotargets and Therapy, vol. Volume 12, viruses, vol. 59, World Health Organization, 1994. pp. 11609–11621, 2019. [13] H. Lee, B. T. Lau, and H. P. Ji, “Targeted sequencing strategies [31] L. R. Borlado and J. Mendez, “CDC6: from DNA replication to in cancer research,” in Next Generation Sequencing in Cancer cell cycle checkpoints and oncogenesis,” Carcinogenesis, Research, pp. 137–163, Springer, 2013. vol. 29, no. 2, pp. 237–243, 2008. [14] T. Barrett, D. B. Troup, S. E. Wilhite et al., “NCBI GEO: [32] J. Huang, Y. Li, Z. Lu et al., “Analysis of functional hub genes archive for high-throughput functional genomic data,” Nucleic identifies CDC45 as an oncogene in non-small cell lung cancer Acids Research, vol. 37, pp. D885–D890, 2009. - a short report,” Cellular Oncology (Dordrecht), vol. 42, no. 4, [15] R. Edgar, M. Domrachev, and A. E. Lash, “Gene expression pp. 571–578, 2019. omnibus: NCBI gene expression and hybridization array data [33] R. Tatsumi and Y. Ishimi, “An MCM4 mutation detected in repository,” Nucleic Acids Research, vol. 30, no. 1, pp. 207– cancer cells affects MCM4/6/7 complex formation,” Journal 210, 2002. of Biochemistry, vol. 161, no. 3, pp. 259–268, 2017. [16] G. K. Smyth, “limma: linear models for microarray data,” in [34] P. Ahluwalia, A. K. Mondal, C. Bloomer et al., “Identification Bioinformatics and computational biology solutions using R and clinical validation of a novel 4 gene-signature with prog- and Bioconductor, pp. 397–420, Springer, New York, NY, nostic utility in colorectal cancer,” International Journal of 2015. Molecular Sciences, vol. 20, no. 15, p. 3818, 2019. [17] R. B. Dessau and C. B. Pipper, ““R”–project for statistical com- [35] T. L. Winther and S. H. Torp, “MCM7 expression is a promis- puting,” Ugeskrift for Laeger, vol. 170, no. 5, pp. 328–330, 2008. ing predictor of recurrence in patients surgically resected for 18 BioMed Research International

meningiomas,” Journal of Neuro-Oncology, vol. 131, no. 3, pp. 575–583, 2017. [36] J. Y. Yang, D. Li, Y. Zhang et al., “The expression of MCM7 is a useful biomarker in the early diagnostic of gastric cancer,” Pathology Oncology Research, vol. 24, no. 2, pp. 367–372, 2018. [37] K. Qu, Z. Wang, H. Fan et al., “MCM7 promotes cancer pro- gression through cyclin D1-dependent signaling and serves as a prognostic marker for patients with hepatocellular carci- noma,” Cell Death & Disease, vol. 8, no. 2, 2017. [38] N. N. Phan, C. Y. Wang, K. L. Li et al., “Distinct expression of CDCA3, CDCA5, and CDCA8 leads to shorter relapse free survival in breast cancer patient,” Oncotarget, vol. 9, no. 6, pp. 6977–6992, 2018. [39] J. Wang, C. Xia, M. Pu et al., “Silencing of CDCA5 inhibits cancer progression and serves as a prognostic biomarker for hepatocellular carcinoma,” Oncology Reports, vol. 40, no. 4, pp. 1875–1884, 2018. [40] C. Ci, B. Tang, D. Lyu et al., “Overexpression of CDCA8 pro- motes the malignant progression of cutaneous melanoma and leads to poor prognosis,” International Journal of Molecular Medicine, vol. 43, no. 1, pp. 404–412, 2019. [41] Y. Bi, S. Chen, J. Jiang et al., “CDCA8 expression and its clin- ical relevance in patients with bladder cancer,” Medicine (Bal- timore), vol. 97, no. 34, 2018. [42] W. Wu, F. Wu, Z. Wang et al., “CENPH inhibits rapamycin sensitivity by regulating GOLPH3-dependent mTOR signaling pathway in colorectal cancer,” Journal of Cancer, vol. 8, no. 12, pp. 2163–2172, 2017. [43] G. Lu, H. Hou, X. Lu et al., “CENP-H regulates the cell growth of human hepatocellular carcinoma cells through the mito- chondrial apoptotic pathway,” Oncology Reports, vol. 37, no. 6, pp. 3484–3492, 2017.