MOLECULAR MEDICINE REPORTS 18: 5579-5593, 2018

LASSO‑based Cox‑PH model identifies an 11‑lncRNA signature for prognosis prediction in gastric cancer

YONGHONG ZHANG1*, HUAMIN LI2*, WENYONG ZHANG1, YA CHE3, WEIBING BAI4 and GUANGLIN HUANG4

1Department of General Surgery, Shangluo Central Hospital, Shangluo, Shaanxi 726000; 2Department of Pathology, Weinan Central Hospital, Weinan, Shaanxi 714000; 3Department of Medical Oncology, Shangluo Central Hospital, Shangluo, Shaanxi 726000; 4Department of General Surgery, Yulin Xingyuan Hospital, Yulin, Shaanxi 719000, P.R. China

Received December 23, 2017; Accepted September 13, 2018

DOI: 10.3892/mmr.2018.9567

Abstract. The present study aimed to identify a long non‑coding (lnc) RNAs‑based signature for prognosis assessment in gastric cancer (GC) patients. By integrating expression data of GC and normal samples from the National Center for Biotechnology Information Gene Expression Omnibus, the EBI ArrayExpress and The Cancer Genome Atlas (TCGA) repositories, the common RNAs in Genomic Spatial Event (GSE) 65801, GSE29998, E‑MTAB‑1338, and TCGA set were screened and used to construct a weighted correlation network analysis (WGCNA) network for mining GC‑related modules. Consensus differentially expressed RNAs (DERs) between GC and normal samples in the four datasets were screened using the MetaDE method. From the overlapped lncRNAs shared by preserved WGCNA modules and the consensus DERs, an lncRNAs signature was obtained using L1‑penalized (lasso) Cox‑proportional hazard (PH) model. LncRNA‑mRNA networks were constructed for these signature lncRNAs, followed by functional annotation. A total of 14,824 common mRNAs and 2,869 common lncRNAs were identified in the 4 sets and 5 GC‑associated WGCNA modules were preserved across all sets. MetaDE method identified 1,121 consensus DERs. A total of 50 lncRNAs were shared by preserved WGCNA modules and the consensus DERs. Subsequently, an 11‑lncRNA signature was identified by LASSO‑based Cox‑PH model. The lncRNAs signature‑based risk score could divide patients into 2 risk groups with significantly different overall survival and recurrence‑free survival times. The predictive capability of this signature was verified in an independent set. These signature lncRNAs were implicated in several biological processes and pathways associated with the immune response, the inflammatory response and cell cycle control. The present study identified an 11‑lncRNA signature that could predict the survival rate for GC.

Correspondence to: Dr Weibing Bai or Dr Guanglin Huang, Department of General Surgery, Yulin Xingyuan Hospital, 33 Middle Section of West Renmin Road, Yuyang, Yulin, Shaanxi 719000, P.R. China
E‑mail: [email protected]
E‑mail: [email protected]

*Contributed equally

Key words: network, mRNA, pathway, , differentially expressed RNAs

Introduction

Gastric cancer (GC) is the fifth leading cause of malignancy worldwide, with a 5‑year survival rate of <10% (1,2). In China, it is the second most commonly diagnosed cancer in men and the third most commonly diagnosed cancer in women (3). The poor prognosis is primarily attributable to patients being frequently identified at an advanced stage and therefore difficult to cure (4). Early detection is key to improving survival rate of GC patients. Therefore, discovery of valuable molecular biomarkers is of significance for the facilitation of early diagnosis and effective prediction of prognosis and thereby contributing to improved outcomes in GC patients.

Long noncoding RNAs (lncRNAs) are defined as a group of non‑protein‑coding transcripts of greater than 200 nucleotides in length, which are characterized by tissue‑specific expression patterns (5,6). With the number of lncRNAs being triple the number of protein‑coding genes, lncRNAs are predicted to exhibit a more important role in basic, translational and clinical oncology than protein‑coding genes (7). Several lncRNAs have been demonstrated in GC, including H19 (8‑10), HOTAIR (11,12) and ANRIL (13). However, the association of lncRNAs with GC prognosis has not been fully elucidated. Although a recent study by Miao et al (14) reported a 4‑lncRNA signature of prognostic value for GC patients, the signature is yielded by bioinformatics analysis of The Cancer Genome Atlas (TCGA) data only. A comprehensive analysis of gene expression data of GC patients from more databases is required for acquiring a more convincing prognostic lncRNAs signature.

In contrast with the study of Miao et al (14), the present study performed an integrated analysis on GC gene expression data mined in the National Center for Biotechnology Information (NCBI), Gene Expression Omnibus (GEO), EBI

ArrayExpress and TCGA repositories. The present study was mainly focused on revealing the critical lncRNAs involved in GC pathogenesis and the roles of the critical lncRNAs in the molecular mechanisms of GC. An 11‑lncRNA signature was identified for prognostic risk assessment of GC patients using weighted correlation network analysis (WGCNA) network, the MetaDE method and a LASSO‑based Cox‑proportional hazard (PH) model. In addition, the prognostic significance of this signature was validated in an independent set. In order to reveal the molecular mechanisms of these critical lncRNAs, the lncRNA‑mRNA interaction network was constructed for functional and pathway enrichment analysis. The results revealed that these critical lncRNAs can regulate the associated mRNAs to influence the immune response, inflammatory response and cell cycle in the pathogenesis of GC.

Materials and methods

Data resource and preprocessing. Gene expression profiles for GC were searched in publicly accessible GEO at the NCBI (http://www.ncbi.nlm.nih.gov/geo/) and EBI ArrayExpress (https://www.ebi.ac.uk/arrayexpress/). Inclusion criteria were: Human gene expression data; gastric cancer specimens and paired normal specimens; total count of specimens ≥50. Finally, Genomic Spatial Event (GSE) 65801 (15) and GSE29998 downloaded from NCBI GEO and E‑MTAB‑1338 from EBI ArrayExpress were selected in the present study (Table I).

Raw data (TXT) in GSE65801, GSE29998 and E‑MTAB‑1338 were subject to log2 transformation by limma (version 3.34.0) software (16) (https://bioconductor.org/packages/release/bioc/html/limma.html). Subsequently, the data were transformed from a skewed distribution to normal distribution, followed by median normalization. Based on the platform annotation files (Table I), probe sets that were assigned with a RefSeq transcript ID and/or Ensembl gene ID were obtained, of which the probe sets labeled as 'NR' (non‑coding RNA in the RefSeq database) were selected. In addition, platform sequencing data was aligned with human genome (GRCh38) (17,18) using Clustal 2 (http://www.clustal.org/clustal2/) (19). The resulting lncRNAs and the above‑mentioned lncRNAs annotated in the RefSeq database were combined and used in further analysis.

The present study also acquired mRNA‑seq data of 384 GC samples and 26 normal controls from TCGA portal (https://gdc‑portal.nci.nih.gov/), which did not require preprocessing. Common RNAs of the GSE65801, GSE29998, E‑MTAB‑1338 and TCGA sets were used for further analysis.

WGCNA network analysis. WGCNA (20) is a bioinformatics tool used to build gene co‑expression networks to mine network modules closely associated with diseases. Based on the common RNAs identified, the WGCNA package (21) (version 1.61) in R language (version 3.4.1) was applied to identify GC‑associated RNA modules (https://cran.r‑project.org/web/packages/WGCNA/index.html) in the present study. The TCGA set was used as the training set, while GSE65801, GSE29998 and E‑MTAB‑1338 were selected as testing sets. Comparability of these 4 sets was assessed by correlation analysis of RNA expression levels. A weighted gene co‑expression network was built as previously described (20). Briefly, the soft threshold power of β was determined using the scale‑free topology criterion. Following the removal of RNAs with coefficients of variation <0.1, the weighted adjacency matrix was then developed. A dynamic tree cut algorithm was used to mine modules with a module size ≥30 and a minimum cut height of 0.95. In addition, preservation of modules in all 4 datasets was examined using the module preservation function of the WGCNA package. Furthermore, functional annotation of the modules identified was investigated using the userListEnrichment function of the WGCNA package.

Identification of consensus differentially expressed RNAs. Consensus differentially expressed RNAs (DERs) between GC specimens and normal control specimens across the 4 datasets (GSE65801, GSE29998, E‑MTAB‑1338 and TCGA) were identified with the MetaDE package (22,23) (https://cran.r‑project.org/web/packages/MetaDE/) in R language version 3.4.1. The cutoff was set at tau2=0, Qpval>0.05, P<0.05 and false discovery rate (FDR)<0.05. tau2 denotes the amount of heterogeneity while Qpval denotes heterogeneity of a dataset. The common lncRNAs shared by the list of consensus DERs and the RNAs in the preserved WGCNA modules were selected for further analysis.

Development of a prognostic risk scoring system for GC. L1‑penalized (lasso) estimation, characterized by simultaneous variable selection and shrinkage, is a useful method for determining interpretable prediction rules in high‑dimensional data (24). In order to determine an lncRNA signature for prognosis, the penalized package (24) in R language (version 3.4.1) was applied to fit a lasso Cox‑PH model (25) to the overlapped lncRNAs. Based on the optimal lambda value that was selected through 1,000 cross‑validations, a panel of prognostic lncRNAs was determined. An equation for calculating risk score was generated based on the expression levels of these prognostic lncRNAs and their regression coefficients from the Cox‑PH model as follows:

$$\text{Risk score} = \beta_{\text{lncRNA}_1} \times \text{expr}_{\text{lncRNA}_1} + \beta_{\text{lncRNA}_2} \times \text{expr}_{\text{lncRNA}_2} + \cdots + \beta_{\text{lncRNA}_n} \times \text{expr}_{\text{lncRNA}_n}$$

Risk score was calculated and assigned to each patient in the training set (TCGA set, Table II). With the median risk score as cutoff, all patients in the training set were split into a high‑risk group and a low‑risk group. Overall survival (OS) time and recurrence‑free survival (RFS) time of the two risk groups were analyzed and compared by Kaplan‑Meier survival analysis and the log‑rank test.

The robustness of the risk scoring system was validated in an independent dataset (GSE62254) (26) downloaded from NCBI GEO (platform: GPL570, Affymetrix Human Genome U133 Plus 2.0 Array). GSE62254 included the gene expression data of 300 GC tissue samples (Table II). Raw data was preprocessed using the oligo package (27) in R language (version 3.4.1). Risk score and risk groups were determined similarly for the GSE62254 dataset. Discrepancies in OS time and RFS time between the risk groups were analyzed using Kaplan‑Meier survival analysis and the log‑rank test.

Functional analysis of prognostic lncRNAs. To investigate the biological function of these prognostic lncRNAs identified

Table I. Basic information of gene expression profiles from NCBI GEO, EBI ArrayExpress and TCGA.

Accession ID    Platform            Total sample    Tumor    Control
GSE65801        GPL14550 Agilent    64              32       32
GSE29998        GPL6947 Illumina    99              50       49
E‑MTAB‑1338     Illumina HumanHT    71              50       21
TCGA            Illumina HiSeq      420             384      36

NCBI, National Center for Biotechnology Information; GEO, Gene Expression Omnibus; TCGA, The Cancer Genome Atlas; GSE, Genomic Spatial Event.

Table II. Clinical features of TCGA dataset and GSE62254.

Clinical characteristics                   TCGA (n=384)    GSE62254 (n=300)
Age (years, mean ± SD)                     65.15±10.61     61.94±11.36
Gender (male/female/data unavailable)      243/133/8       199/101
Recurrence (yes/no/data unavailable)       78/260/46       125/157/18
Vitality (dead/alive/data unavailable)     122/238/24      135/148/17
DFS (months) (mean ± SD)                   15.84±17.05     33.72±29.82
OS (months) (mean ± SD)                    16.17±16.96     50.59±31.42

TCGA, The Cancer Genome Atlas; GSE, Genomic Spatial Event; SD, standard deviation; ‑, data unavailable; DFS, disease free survival time; OS, overall survival time.

above in GC tumorigenesis, lncRNA‑mRNA networks were constructed for them based on the correlation coefficients between RNAs from WGCNA modules. Gene ontology (GO; http://www.geneontology.org/) function and Kyoto Encyclopedia of Genes and Genomes (KEGG; https://www.kegg.jp/) pathway enrichment analysis was performed for the RNAs in these lncRNA‑mRNA networks by DAVID Bioinformatics Tool (28,29) (version 6.8; https://david‑d.ncifcrf.gov/).

Table III. Numbers of mRNAs and lncRNAs in the datasets.

Accession ID    Total count    mRNA      lncRNA
GSE65801        23,081         17,056    6,025
E‑MTAB‑1338     18,730         15,376    3,354
GSE29998        20,586         15,376    5,210
TCGA            24,840         17,579    7,261
Common          17,693         14,824    2,869

lnc, long non‑coding; GSE, Genomic Spatial Event; TCGA, The Cancer Genome Atlas.

Results

RNA expression data. Following data preprocessing, the present study identified 17,693 common RNAs in the GSE65801, GSE29998, E‑MTAB‑1338 and TCGA sets, including 14,824 mRNAs and 2,869 lncRNAs (Table III).

WGCNA network and modules. Based on these common RNAs, WGCNA was used to mine GC‑associated modules, with the TCGA set as the training set and GSE65801, GSE29998 and E‑MTAB‑1338 as validation sets. The correlation of gene expression between these sets was in the range of 0.4‑1 with P<1x10^‑200 (Fig. 1), indicating good comparability between the sets. For adjacencies calculation, the soft threshold power of β was determined to be 5 when the scale‑free topology fit (scale‑free R^2) achieved 0.9 (Fig. 2).

A total of 11 modules (black, blue, brown, green, grey, magenta, pink, red, turquoise, yellow and purple) were mined with WGCNA for the TCGA dataset. In the resulting dendrogram (Fig. 3A), these modules were represented by branches in different colors. Module mining was also conducted in GSE29998, GSE65801 and E‑MTAB‑1338. The gene dendrograms are presented in Fig. 3B‑D.

As illustrated in a gene multi‑dimensional scaling (MDS) plot (Fig. 4A), RNAs of the same module were prone to cluster together, suggesting similar expression patterns of RNAs in the same module. A hierarchical clustering analysis of the 11 modules identified that the associated modules clustered together, such as the black module and the yellow module, the pink module and the purple module, the magenta module and the red module, and the grey module and the turquoise module (Fig. 4B). Not unexpectedly, these modules were also close to each other in the module MDS plot (Fig. 4C).

In addition, out of the 11 modules, black, blue, brown, turquoise and yellow modules with Z‑score >5 were identified to be well preserved across the GSE65801, GSE29998, E‑MTAB‑1338 and TCGA sets (Table IV). Functional
Functional 5582 ZHANG et al: A PROGNOSTIC lncRNA SIGNATURE FOR GASTRIC CANCER

Figure 1. Analysis of comparability of the TCGA, GSE29998, GSE65801 and E‑MTAB‑1338 sets. Each panel presents the correlation of ranked expression of genes between 2 datasets. Cor value and P‑value are calculated using the WGCNA package. TCGA, The Cancer Genome Atlas; GSE, Genomic Spatial Event; WGCNA, weighted correlation network analysis; Cor, correlation coefficient.

Figure 2. Net topology analysis for optimizing soft‑threshold power. (A) The scale‑free fit index (scale‑free R2, y‑axis) as a function of the soft‑threshold power (x‑axis). When the scale‑free topology fit reaches 0.9 (red line), the soft threshold power is 5. (B) The mean connectivity (degree, y‑axis) as a function of the soft threshold power (x‑axis). When the soft threshold power is 5, the mean connectivity is 2 (red line).
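The role of the soft‑threshold power shown in Figure 2 can be illustrated with a small sketch. It assumes the unsigned adjacency commonly used in weighted co‑expression networks, a_ij = |cor(x_i, x_j)|^β, and defines connectivity as the sum of a gene's adjacencies; the toy gene names and expression values are hypothetical, and the code is not the WGCNA implementation itself.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def soft_threshold_adjacency(expr, beta):
    """Unsigned soft-thresholded adjacency: a_ij = |cor_ij| ** beta (assumed form)."""
    genes = sorted(expr)
    adj = {g: {} for g in genes}
    for i, gi in enumerate(genes):
        for gj in genes[i + 1:]:
            a = abs(pearson(expr[gi], expr[gj])) ** beta
            adj[gi][gj] = a
            adj[gj][gi] = a
    return adj

def connectivity(adj):
    """Connectivity (degree) of each gene: the sum of its adjacencies."""
    return {g: sum(neighbours.values()) for g, neighbours in adj.items()}

# Hypothetical toy expression profiles (4 samples per gene).
toy = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],  # perfectly anti-correlated with geneA
    "geneD": [1.0, 3.0, 2.0, 4.0],  # correlation 0.8 with geneA
}
adj = soft_threshold_adjacency(toy, beta=5)  # beta = 5 as reported in the text
k = connectivity(adj)
mean_connectivity = sum(k.values()) / len(k)
```

Raising the absolute correlation to a power of 5 leaves strong correlations close to 1 while shrinking moderate ones (0.8 becomes about 0.33), which is why the mean connectivity falls as the power increases.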

annotation of the 5 modules was performed using the WGCNA package (Table IV). The black module was associated with digestion. The blue module was associated with immune response. The brown module was correlated with cell cycle. The turquoise module was associated with cell adhesion. The yellow module was linked to protein amino acid glycosylation (Table IV).

Consensus DERs. The MetaDE package identified 1,121 consensus DERs in the GSE65801, GSE29998, E‑MTAB‑1338 and TCGA sets, of which 255 were lncRNAs. A heatmap of these consensus DERs was generated by the heatmap.sig.genes function in the MetaDE package (Fig. 5). Clearly, expression patterns of these consensus DERs were similar in the 4 datasets. Furthermore, 288 RNAs were overlapped between the 5

Table IV. Characteristics of WGCNA network modules.

TCGA     GSE29998    GSE65801    E‑MTAB‑1338    Color        Module size    Module preservation (Z‑score)    Module characterization
D1M1     D2M1        D3M1        D4M1           Black        59             28.06                            Digestion
D1M2     D2M2        D3M2        D4M2           Blue         417            31.59                            Immune response
D1M3     D2M3        D3M3        D4M3           Brown        411            25.26                            Cell cycle
D1M4     D2M4        D3M4        D4M4           Green        111            6.41                             ‑
D1M5     D2M5        D3M5        D4M5           Grey         1,097          4.90                             ‑
D1M6     D2M6        D3M6        D4M6           Magenta      38             10.21                            ‑
D1M7     D2M7        D3M7        D4M7           Pink         56             22.08                            ‑
D1M8     D2M8        D3M8        D4M8           Red          78             17.64                            ‑
D1M9     D2M9        D3M9        D4M9           Turquoise    564            29.46                            Cell adhesion
D1M10    D2M10       D3M10       D4M10          Yellow       215            14.37                            Protein amino acid glycosylation
D1M11    D2M11       D3M11       D4M11          Purple       35             8.30                             ‑

WGCNA, weighted correlation network analysis; TCGA, The Cancer Genome Atlas; GSE, Genomic Spatial Event.

Figure 3. Clustering dendrograms of identified modules in (A) TCGA (B) GSE29998, (C) GSE65801 and (D) E‑MTAB‑1338 sets. Modules are labeled in different colors. TCGA, The Cancer Genome Atlas; GSE, Genomic Spatial Event.

Figure 4. Module analysis. (A) MDS plot demonstrating the similarity of RNAs expression patterns between different modules. RNAs of different modules are marked in different colors. (B) Module cluster tree. (C) MDS plot exhibiting the degree of similarity between the identified modules. Modules are labeled in different colors. MDS, multi‑dimensional scaling.

Figure 5. A heatmap of consensus RNAs identified by MetaDE. RNAs expression patterns are similar in the TCGA, GSE29998, GSE65801 and E‑MTAB‑1338 sets. TCGA, The Cancer Genome Atlas; GSE, Genomic Spatial Event.
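The heterogeneity quantities used as cutoffs for the consensus DERs (tau2 and the Q statistic behind Qpval) can be illustrated with a short sketch. It assumes the standard Cochran's Q and the DerSimonian‑Laird moment estimator of the between‑study variance; whether MetaDE computes them in exactly this way is not stated in the text, and the effect sizes and variances below are hypothetical.

```python
def cochran_q_and_tau2(effects, variances):
    """Cochran's Q and the DerSimonian-Laird tau^2 for one gene.

    effects   -- per-dataset effect sizes (e.g. standardized mean differences)
    variances -- per-dataset sampling variances of those effect sizes
    """
    weights = [1.0 / v for v in variances]           # inverse-variance weights
    w_sum = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / w_sum
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    k = len(effects)
    c = w_sum - sum(w * w for w in weights) / w_sum
    tau2 = max(0.0, (q - (k - 1)) / c)               # truncated at zero
    return q, tau2

# Hypothetical gene with the same effect in all 4 datasets: no heterogeneity.
q_same, tau2_same = cochran_q_and_tau2([1.0, 1.0, 1.0, 1.0], [0.1, 0.2, 0.1, 0.2])

# Hypothetical gene with conflicting effects in 2 datasets: tau^2 > 0.
q_diff, tau2_diff = cochran_q_and_tau2([0.0, 2.0], [1.0, 1.0])
```

Under these assumptions, a gene passes a tau2=0 filter only when Q does not exceed its degrees of freedom (k‑1), that is, when the between‑dataset spread is no larger than expected from sampling error alone. The Q P‑value (not computed here) would come from a chi‑square distribution with k‑1 degrees of freedom.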

Figure 6. Analysis of overlapped RNAs. (A) Venn diagram displaying the overlapped RNAs between the preserved WGCNA modules and the consensus DERs identified by MetaDE. (B) Distribution of overlapped mRNAs (upper) and lncRNAs (lower) in the 5 preserved WGCNA modules (black, blue, brown, turquoise and yellow). lnc, long non‑coding; WGCNA, weighted correlation network analysis; DERs, differentially expressed RNAs.

Table V. The 11 prognostic lncRNAs identified by LASSO‑based Cox‑proportional hazards model.

lncRNA         Coefficient    HR        95% CI
ARHGAP5‑AS1    0.0124         1.1907    0.8259‑1.7166
FLVCR1‑AS1     -0.1191        0.6610    0.4916‑0.8886
H19            0.9171         1.0497    0.9390‑1.1735
HOTAIR         -0.4973        0.8970    0.6584‑1.2222
LINC00221      1.1799         1.9190    1.2021‑3.0633
MCF2L‑AS1      -0.7009        0.7785    0.6053‑1.0014
MUC2           -0.0902        0.9516    0.8631‑1.0492
PRSS30P        0.2572         1.1254    0.8263‑1.5329
SCARNA9        -0.8615        0.7383    0.5449‑1.0004
TP53TG1        0.1493         1.1386    0.8808‑1.4720
XIST           -0.9235        0.5469    0.1926‑1.5527

lnc, long non‑coding; HR, hazard ratio; CI, confidence interval.
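As an illustration of how the coefficients in Table V translate into a per‑patient score, the following sketch computes the weighted sum of expression values and splits patients at the median score. The coefficients are copied from Table V; the patient identifiers and expression values are hypothetical, and assigning a score equal to the median to the low‑risk group is an assumption, since the text does not specify how ties are handled.

```python
from statistics import median

# Regression coefficients as listed in Table V.
COEFFICIENTS = {
    "ARHGAP5-AS1": 0.0124, "FLVCR1-AS1": -0.1191, "H19": 0.9171,
    "HOTAIR": -0.4973, "LINC00221": 1.1799, "MCF2L-AS1": -0.7009,
    "MUC2": -0.0902, "PRSS30P": 0.2572, "SCARNA9": -0.8615,
    "TP53TG1": 0.1493, "XIST": -0.9235,
}

def risk_score(expression):
    """Weighted sum of lncRNA expression levels (coefficient x expression)."""
    return sum(beta * expression[name] for name, beta in COEFFICIENTS.items())

def split_by_median(scores):
    """Label each patient 'high' or 'low' using the median score as cutoff.

    Assumption: a score equal to the median is labelled 'low'.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}

def toy_profile(high_gene=None):
    """Hypothetical expression profile: 1.0 for one lncRNA, 0.0 for the rest."""
    values = {name: 0.0 for name in COEFFICIENTS}
    if high_gene is not None:
        values[high_gene] = 1.0
    return values

patients = {
    "p1": toy_profile("H19"),   # only H19 expressed  -> positive score
    "p2": toy_profile("XIST"),  # only XIST expressed -> negative score
    "p3": toy_profile(),        # nothing expressed   -> score of zero
}
scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = split_by_median(scores)
```

In this toy cohort the median score is zero, so only the patient with elevated H19 lands in the high‑risk group; lncRNAs with negative coefficients pull a patient toward the low‑risk group.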

Figure 7. Kaplan‑Meier curves for OS time (left) and RFS time (right) of patients in (A) TCGA and (B) GSE62254 sets. Patients of each set are divided by risk score into a high‑risk group and a low‑risk group. OS and RFS between two risk groups were analyzed and compared by Kaplan‑Meier analysis and logRank test. TCGA, The Cancer Genome Atlas; GSE, Genomic Spatial Event; OS, overall survival; RFS, recurrence‑free survival.
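The comparison summarized in Figure 7 rests on two standard tools, which the sketch below implements in plain Python: the Kaplan‑Meier product‑limit estimator and the two‑group log‑rank test (chi‑square with 1 degree of freedom). This is an illustrative implementation, not the software used in the study, and the survival times are hypothetical.

```python
import math

def kaplan_meier(times, events):
    """Return [(event_time, survival_probability)] by the product-limit method.

    times  -- follow-up time per patient
    events -- 1 if the event (death/recurrence) was observed, 0 if censored
    """
    surv = 1.0
    curve = []
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, P-value)."""
    pooled = zip(times_a + times_b, events_a + events_b)
    event_times = sorted({t for t, e in pooled if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for u in times_a if u >= t)   # at risk in group A
        n_b = sum(1 for u in times_b if u >= t)   # at risk in group B
        d_a = sum(1 for u, e in zip(times_a, events_a) if u == t and e)
        d_b = sum(1 for u, e in zip(times_b, events_b) if u == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:  # hypergeometric variance; undefined when one patient remains
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))    # chi-square tail, 1 df
    return chi2, p_value

# Hypothetical follow-up times (months); every event is observed.
high_times, high_events = [1, 2, 3], [1, 1, 1]
low_times, low_events = [4, 5, 6], [1, 1, 1]

km_high = kaplan_meier(high_times, high_events)
chi2, p_value = logrank_test(high_times, high_events, low_times, low_events)
```

With real data, censored patients (event flag 0) stay in the at‑risk counts until their last follow‑up time but never trigger a step in the curve, which is what distinguishes this estimator from a simple proportion of survivors.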

preserved modules and the list of consensus DERs (Fig. 6A). Among these overlapped RNAs, 50 were lncRNAs, of which 32 were included in the blue module, 14 in the brown module, 3 in the turquoise module and 1 in the yellow module (Fig. 6B).

Figure 8. Constructed lncRNA‑mRNA networks for prognostic lncRNAs. (A) lncRNA‑mRNA network of 9 lncRNAs. The 9 lncRNAs are also contained in the WGCNA blue module. (B) lncRNA‑mRNA network of 2 lncRNAs. The lncRNAs are also contained in the WGCNA brown module. Each red square node stands for an lncRNA. Each round node stands for an mRNA. A link between two nodes reveals positive (red link) or negative (green link) correlation between an lncRNA and an mRNA. lnc, long non‑coding; WGCNA, weighted correlation network analysis.

Development and validation of an lncRNAs‑based risk scoring system. Based on the expression of these overlapped lncRNAs in the TCGA set, the LASSO‑based Cox‑PH model identified an 11‑lncRNA signature that was significantly associated with survival rate based on the optimal lambda value (19.70021). This signature consisted of FLVCR1‑AS1, H19, LINC00221, MUC2, PRSS30P, SCARNA9, TP53TG1, XIST, ARHGAP5‑AS1, HOTAIR and MCF2L‑AS1 (Table V). LncRNA signature‑based risk score was calculated using the following formula:

$$\begin{aligned}
\text{Risk score} ={}& 0.012437 \times \text{Exp}_{\text{ARHGAP5-AS1}} + (-0.11914) \times \text{Exp}_{\text{FLVCR1-AS1}} + 0.917082 \times \text{Exp}_{\text{H19}} \\
&+ (-0.49726) \times \text{Exp}_{\text{HOTAIR}} + 1.179896 \times \text{Exp}_{\text{LINC00221}} + (-0.70093) \times \text{Exp}_{\text{MCF2L-AS1}} \\
&+ (-0.09017) \times \text{Exp}_{\text{MUC2}} + 0.257189 \times \text{Exp}_{\text{PRSS30P}} + (-0.86146) \times \text{Exp}_{\text{SCARNA9}} \\
&+ 0.149341 \times \text{Exp}_{\text{TP53TG1}} + (-0.92352) \times \text{Exp}_{\text{XIST}}
\end{aligned}$$

Risk score was calculated for each patient. All patients in the TCGA set were split into a high‑risk group and a low‑risk group with the median risk score as the cutoff. Patients in the high‑risk group (n=156) demonstrated significantly shorter OS time (15.56±13.15 months vs. 21.23±19.99, logRank P=7.44x10^‑5) and RFS time (15.76±11.51 months vs. 21.72±21.03, logRank P=0.0117) compared with the patients in the low‑risk group (n=155, Fig. 7A). Prognostic performance of this 11‑lncRNA signature‑based risk scoring system was tested in an independent set (GSE62254). All 300 patients in GSE62254 were divided into a high‑risk group (n=150) and a low‑risk group (n=150) by risk score. Similarly, OS time (54.79±31.83 months vs. 46.40±31.83, logRank P=0.0311) and RFS time (37.45±31.08 months vs. 29.99±28.11, logRank P=0.0282) were markedly elongated in the low‑risk group relative to the high‑risk group (Fig. 7B).

Function analysis of the 11‑lncRNA signature. Among the 11 signature lncRNAs, 9 lncRNAs (FLVCR1‑AS1, H19, LINC00221, MUC2, PRSS30P, SCARNA9, TP53TG1, XIST and ARHGAP5‑AS1) were involved in the blue module, whereas another 2 lncRNAs (HOTAIR and MCF2L‑AS1) were present in the brown module. Correlations between the 9 lncRNAs in the blue module and mRNAs revealed by the

FDR 5.39x10^‑49, 6.22x10^‑20, 1.53x10^‑19, 1.83x10^‑19, 2.32x10^‑19, 3.92x10^‑19, 2.21x10^‑18, 3.44x10^‑18, 3.64x10^‑18, 7.10x10^‑18, 2.55x10^‑16

Genes

FAS, HLA ‑ DMB, DMA, C1QC, PDCD1, CD96, SH2D1A, CLEC4E, MS4A1, LTF, MICB, CD8A, LY86, WAS, TNFRSF17, CMKLR1, LAIR1, POU2AF1, SIT1, NCF2, GZMA, NCF1, LY96, FCGR3A, SPN, CIITA, TNFSF13B, CCR4, LAX1, CCR5, IGSF6, C1QB, LILRB2, IL18BP, CTSW, TRAT1, HLA ‑ DQA1, PDCD1LG2, MADCAM1, GBP4, LCP1, GBP1, LCP2, HLA ‑ DQB1, PSMB10, ITGAL, CCR1, GPSM3, LILRB4, HLA ‑ DPA1, CXCL9, CX3CL1, IL7R, CCL5, CCL4, POU2F2, ZAP70, HLA ‑ DRB5, IL2RG, CD4, DPB1, DOA, APOL1, CD300A, TNFSF10, CYBB, AIM2, CORO1A, TNFRSF13C, CCL19, SLAMF7, CD180, PTPRC, IL2RA, CXCL13, CD209, IRF8, CD274, CD79B, CD79A HLA ‑ DOA, LAG3, SPN, PTPRC, SIT1, IL2RA, KLRK1, IL7R, HLA ‑ DMA, CD2, ZAP70, CD4, IL2RG, FAS, TNFSF13B, TNFRSF13C, CD40, PDCD1LG2, CD38, PRKCQ, CORO1A, SIRPG, IKZF1, PLEK, CD3E, LAX1, CD274, JAK2, IRF4, SASH3 HLA ‑ DOA, LAG3, SPN, PTPRC, SIT1, IL2RA, KLRK1, IL7R, HLA ‑ DMA, CD2, ZAP70, CD4, IL2RG, FAS, TNFSF13B, LAX1, CD274, TNFRSF13C, CD40, PDCD1LG2, CD38, PRKCQ, CORO1A, SIRPG, IKZF1, CD3E, IRF4, SASH3 ITGAL, MICB, CD8A, IL21R, KLRK1, PTPN22, IL7R, HLA ‑ DMA, DOCK2, CXCR5, ZAP70, MS4A1, CD2, LAX1, CD79A, WAS, SPN, RHOH, PTPRC, CD3G, CD3D, IKZF1, CD3E, SLAMF7, ITGA4, CD40, CD4, FAS, IRF4, BANK1, LCP1 C3AR1, MICB, CD247, KLRK1, PTPN22, IL7R, C1QC, HLA ‑ DMA, SH2D1A, CD2, ZAP70, CD4, IL2RG, CD38, PRKCQ, C1QB, TRAT1, TNFRSF13C, CD40, PDCD1LG2, LAG3, SPN, PTPRC, IL2RA, IKZF1, CD3E, TNFSF13B, LAX1, CD79A, SASH3 CORO1A, CD37, SIRPG, ITGAL, MICB, CD8A, IL21R, KLRK1, PTPN22, CX3CL1, IL7R, HLA ‑ DMA, DOCK2, CXCR5, ZAP70, SPN, RHOH, PTPRC, CD3G, CD3D, IKZF1, CD3E, SLAMF7, ITGA4, CD40, MS4A1, CD2, CD4, FAS, LAX1, CD79A, IRF4, BANK1, LCP1, LCP2 WAS, TNFRSF13C, KLRK1, IL7R, HLA ‑ DMA, PDCD1LG2, PRKCQ, CORO1A, PTPRC, SIT1, IL2RA, IKZF1, CD3E, TNFSF13B, LAX1, CD274, ZAP70, CD2, CD4, IL2RG, IRF4, HLA ‑ DOA, SPN, LAG3, SASH3 SIRPG, CCL4, C1QC, CXCL9, ITGB2, CX3CL1, CCL5, PTPRCAP, AIF1, CCR1, LY86, C3AR1, ITGAL, PRF1, HCK, CCL19, CD40, ITK, PTPRC, IL2RA, NCF2, NCF1, LY96, SPN, CIITA, AOAH, LTF, SH2D1A, APOL3, SIGLEC1, 
C1QB, LILRB2, CORO1A, CD163, LSP1, CD84, TRAT1, CD180, SP140, WAS, SLAMF7, APOL1, CCR5, CCR4, CXCL13, MNDA, PLA2G7 CYBB, HLA ‑ DOA, LAG3, SPN, PTPRC, SIT1, IL2RA, KLRK1, IL7R, HLA ‑ DMA, CD2, ZAP70, CD4, IL2RG, FAS, TNFSF13B, LAX1, CD274, TNFRSF13C, CD40, PDCD1LG2, CD38, PRKCQ, CORO1A, SIRPG, IKZF1, CD3E, IRF4, SASH3 ITGAL, MICB, CD8A, IL21R, KLRK1, PTPN22, CX3CL1, IL7R, HLA ‑ DMA, DOCK2, CXCR5, ZAP70, SPN, RHOH, PTPRC, CD3G, CD3D, PLEK, IKZF1, CD3E, SLAMF7, ITGA4, MS4A1, CD2, CD4, FAS, LAX1, CD79A, IRF4, BANK1, LCP1, LCP2 WAS, CD40, TNFRSF13C, CD40, IL7R, HLA ‑ DMA, PDCD1LG2, PTPRC, IL2RA, IKZF1, PLEK, CD3E, KLRK1, TNFSF13B, CD2, ZAP70, JAK2, CD4, IL2RG, SASH3, SPN CD38, PRKCQ, CORO1A, SIRPG,

80 30 28 31 33 33 25 47 28 34 23 Count Term Immune response Regulation of cell activation Regulation of lymphocyte activation activation Lymphocyte Positive regulation of immune system process Leukocyte activation cell T Regulation of activation Defense response Regulation of leukocyte activation Cell activation Positive regulation of cell activation Table VI. Significant GO terms and KEGG pathways for the genes in the constructed lncRNA‑mRNA network of nine prognostic lncRNAs involved in the blue module. VI. Significant GO terms and KEGG pathways for the genes in constructed lncRNA‑mRNA Table GO category Biology process 5588 ZHANG et al: A PROGNOSTIC lncRNA SIGNATURE FOR GASTRIC CANCER ‑ 14 ‑ 13 ‑ 12 ‑ 09 ‑ 09 ‑ 08 ‑ 08 ‑ 08 ‑ 07 ‑ 05 ‑ 04 ‑ 04 ‑ 15 ‑ 08 FDR 3.41x10 1.78x10 1.09x10 2.97x10 6.83x10 1.02x10 2.60x10 2.60x10 4.82x10 4.54x10 8.34x10 8.58x10 6.37x10 8.68x10 Genes PTPRC, IL2RA, IKZF1, CD3E, KLRK1, TNFRSF13C, CD40, IL7R, HLA ‑ DMA, PDCD1LG2, CD38, PTPRC, IL2RA, IKZF1, CD3E, KLRK1, TNFSF13B, CD2, ZAP70, CD4, IL2RG, SASH3, SPN PRKCQ, CORO1A, SIRPG, TNFRSF13C, CD40, IL7R, HLA ‑ DMA, PDCD1LG2, CD38, PTPRC, IL2RA, IKZF1, CD3E, KLRK1, TNFSF13B, ZAP70, CD4, IL2RG, SASH3, SPN PRKCQ, CORO1A, SIRPG, DOCK2, WAS, ITGAL, PTPRC, MICB, CD3G, CD3D, IKZF1, CD8A, CD3E, PTPN22, IL7R, HLA ‑ DMA, IRF4, LCP1, SPN, RHOH ZAP70, CD2, CD4, FAS, PTPRC, CD3D, PLEK, IKZF1, CD8A, CD3E, HCLS1, PTPN22, ITGA4, IFI16, IL7R, HLA ‑ DMA, CD79A, IRF4, SPN, RHOH DOCK2, CXCR5, CXCL13, IRF8, ZAP70, JAK2, CD4, FAS, IL2RA, LY96, AOAH, CIITA, CCR1, CXCL9, ITGB2, CCL5, C1QC, CCL4, AIF1, LY86, ITGAL, C3AR1, APOL3, CYBB, CCR5, CXCL13, CCR4, PLA2G7 CCL19, CD40, CD180, CD163, C1QB, SIGLEC1, PTPRC, CD3D, PLEK, IKZF1, CD8A, CD3E, HCLS1, PTPN22, ITGA4, IFI16, IL7R, HLA ‑ DMA, CD79A, IRF4, SPN, RHOH DOCK2, CXCR5, CXCL13, IRF8, ZAP70, JAK2, CD4, FAS, PTPRC, CD3D, PLEK, IKZF1, CD8A, CD3E, HCLS1, PTPN22, ITGA4, IFI16, IL7R, HLA ‑ DMA, DOCK2, CD79A, IRF4, SPN, RHOH IRF8, ZAP70, JAK2, CD4, FAS, TNFRSF13C, 
PTPN22, CX3CL1, CCL5, HLA ‑ DMA, C3AR1, PTPRC, MICB, CD3E, CD247, KLRK1, TNFSF13B, LAX1, CCR4, ZAP70, JAK2, CD79A, SASH3, LAG3 C1QC, C1QB, SH2D1A, IL2RA, PLEK, AOAH, CIITA, CCR1, CXCL9, ITGB2, CCL5, C1QC, CCL4, AIF1, LY86, C3AR1, ITGAL, APOL3, PRKCQ, C1QB, SIGLEC1, CYBB, CCR5, CCR4, CD180, CD163, WAS, CCL19, CD40, LY96, CXCL13, PLA2G7, JAK2 CMKLR1, MICB, CD8A, PTPN22, CXCR5, CXCR6, SPN, LAG3, KLRB1, PIK3CG, CD3G, CD3D, LY96, CD3E, GPR171, CD40, IGSF6, LILRB2, DOK2, CCR5, CCR4, LAX1, LCP2, C3AR1, ITGAL, CCR1, ITGAX, ITGB7, GPR25, ZAP70, CD2, CD4, CD247, KLRK1, CXCL9, FPR3, ITGB2, IL7R, CCL5, P2RY6, CD274, CD79B, JAK2, PTPRC, IL2RA, PLEK, DTX1, CCL19, RGS19, EVL, ITGA4, BIRC3, P2RY10, JAK3, CD79A, ADAMDEC1 ITGAL, CCR1, FERMT3, ITGB2, CX3CL1, CCL5, CCL4, CD96, ITGAX, ITGB7, CD2, CD22, CD4, CD6, PTPRC, PLEK, SIGLEC10, ITGA4, SLAMF7, EMILIN2, CD84, SIGLEC1, CORO1A, SELPLG, PARVG, SIRPG, CD300A, CD209, MADCAM1 ITGAL, CCR1, FERMT3, ITGB2, CX3CL1, CCL5, CCL4, CD96, ITGAX, ITGB7, CD2, CD22, CD4, CD6, PTPRC, PLEK, SIGLEC10, ITGA4, SLAMF7, EMILIN2, CD84, SIGLEC1, CORO1A, SELPLG, PARVG, SIRPG, CD300A, CD209, MADCAM1 HLA ‑ DQB1, ITGAL, PTPRC, CD8A, ITGB2, CD40, ITGA4, DMB, DMA, PDCD1, DQA1, MADCAM1, HLA ‑ DPB1, PDCD1LG2, SIGLEC1, ITGB7, CD274, CD2, CD22, HLA ‑ DRB5, CD4, DPA1, HLA ‑ DOA, CD6, SELPLG, SPN CD40, HLA ‑ DMB, DOA, HLA ‑ DPB1, FAS, HLA ‑ DQB1, PRF1, DRB5, GZMB, DPA1, HLA ‑ DMA, DQA1

Table VI. Continued.

GO category        Term                                               Count
Biology process    Positive regulation of leukocyte activation        21
                   Positive regulation of lymphocyte activation       20
                   T cell activation                                  21
                   Hemopoietic or lymphoid organ development          24
                   Inflammatory response                              26
                   Immune system development                          24
                   Hemopoiesis                                        22
                   Positive regulation of response to stimulus        22
                   Response to wounding                               30
                   Cell surface receptor linked signal transduction   56
                   Cell adhesion                                      29
                   Biological adhesion                                29
KEGG pathway       Cell adhesion molecules (CAMs)                     26
                   Allograft rejection                                12

FDR 2.80x10^‑06, 4.48x10^‑06, 4.64x10^‑05, 5.69x10^‑04, 2.20x10^‑03, 7.92x10^‑03

WGCNA were used to construct an lncRNA‑mRNA network (Fig. 8A). Similarly, another lncRNA‑mRNA network was built for the 2 lncRNAs (HOTAIR and MCF2L‑AS1) in the brown module (Fig. 8B). The genes in the lncRNA‑mRNA network that correlated with the 9 prognostic lncRNAs in the blue module were significantly associated with 23 GO biological process terms (including immune response, regulation of cell activation and regulation of lymphocyte activation) and 8 KEGG pathways (including cell adhesion molecules, allograft rejection and cytokine‑cytokine receptor interaction; Table VI). The genes in the lncRNA‑mRNA network that correlated with HOTAIR and MCF2L‑AS1 were mainly associated with the cell cycle phase, cell cycle and mitotic cell cycle. In addition, 4 KEGG pathways were enriched for the genes in this lncRNA‑mRNA network including cell cycle, DNA replication, progesterone‑mediated oocyte maturation and steroid biosynthesis pathways (Table VII).

Discussion

A growing number of studies have demonstrated that aberrantly expressed lncRNAs are implicated in GC tumorigenesis and progression (30,31). Nonetheless, the prognostic significance of lncRNAs in GC remains to be elucidated. Based on the common RNAs data and corresponding clinical information of GC patients and normal controls which were obtained through data mining in NCBI GEO, EBI ArrayExpress and TCGA, an 11‑lncRNA prognostic signature was identified by a series of bioinformatics analyses featuring WGCNA, the MetaDE method and a LASSO‑based Cox‑PH model. Furthermore, it was identified that patients could be classified into a high‑risk group and a low‑risk group by the risk score based on the 11‑lncRNA signature in the training set, with noticeable separations being observed in the Kaplan‑Meier curves between the 2 groups. The high‑risk group exhibited significantly shorter OS time and RFS time compared with the low‑risk group. The predictive ability of risk score was confirmed in an independent set. Therefore, the present study demonstrated that the 11‑lncRNA signature has the potential for assessing survival rate of GC patients.
The 11‑lncRNA signature determined in the study was comprised of FLVCR1‑AS1, H19, LINC00221, MUC2, PRSS30P, SCARNA9, TP53TG1, XIST, ARHGAP5‑AS1, HOTAIR and MCF2L‑AS1. Among these lncRNAs, H19 is identified to be upregulated in plasma of GC patients and is proposed as a diagnostic biomarker (8). Increasing evidence also demonstrates that H19 upregulation promotes GC proliferation, migration and invasion (9,10). It has been established that MUC2 is associated with outcome of GC patients (32). lncRNA X inactive specific transcript (XIST) encoded by XIST gene acts as a regulator of X inactivation in mammals (33). Chen et al (34) observed upregulated XIST in GC tissue and identified that this lncRNA serves a regulatory role in GC progression via microRNA (miR)‑101 and its direct target polycomb group protein enhancer of zeste homolog 2. HOTAIR transcribed from the HOXC locus is identified to be overexpressed in GC, which is a characteristic molecular alteration of GC (35). Furthermore, there is evidence that HOTAIR functions as a GC oncogene through regulating the expression of human epithelial growth factor receptor 2 by competing with miR‑331‑3p (12).

Table VI. Continued.

Term: Cytokine‑cytokine receptor interaction; Graft vs. host disease; Chemokine signaling pathway; Natural killer cell mediated cytotoxicity; T cell receptor signaling pathway; Antigen processing and presentation

Count: 11 11 24 19 15 13

Genes: IL2RB, IL2RA, CCR1, IL21R, TNFRSF13C, CXCL9, TNFRSF17, CCL19, CD40, CX3CL1, IL7R, CCL5, TNFRSF13C, CXCL9, IL2RB, IL2RA, CCR1, IL21R, TNFSF13B, CXCR5, CCR5, CCR4, CXCL13, IL10RA, CXCR6, CSF2RB, IL2RG, FAS TNFSF10, CCL4, DOCK2, CXCR5, CCR5, WAS, PIK3CG, ITK, NCF1, HCK, CCR1, CXCL9, CCL19, CX3CL1, CCL5, CCL4, CCR4, CXCL13, CXCR6, JAK2, JAK3 TNFSF10, ZAP70, SH2D1A, PIK3CG, PRF1, ITGAL, MICB, CD247, KLRK1, GZMB, ITGB2, HCST, FCGR3A, LCP2 FAS, PIK3CG, ITK, PRKCQ, PTPRC, CD3G, CD8A, CD3D, CD3E, CD247, ZAP70, CD4, PDCD1, LCP2 HLA ‑ DQA1 DPB1, FAS, HLA ‑ DMB, DOA, DMA, HLA ‑ DPB1, FAS, HLA ‑ DQB1, PRF1, DRB5, GZMB, DPA1, HLA ‑ DPB1, DMB, DOA, DMA, CD8A, HLA ‑ DRB5, CD4, DPA1, HLA ‑ DQB1, CIITA, HLA ‑ DQA1

GO, gene ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; lnc, long non‑coding; FDR, false discovery rate.

FDR 2.14x10^‑22, 2.83x10^‑21, 6.89x10^‑21, 2.26x10^‑19, 2.68x10^‑19, 1.39x10^‑17, 1.39x10^‑17, 2.25x10^‑17, 4.04x10^‑17, 3.42x10^‑12, 4.79x10^‑05, 2.40x10^‑04

Genes

E2F1, KIF23, PRC1, NEK3, NEK2, DBF4, TTK, PKMYT1, ANLN, AURKA, PTTG1, CEP55, AURKB, CCNE1, AURKA, PTTG1, CEP55, ANLN, TTK, PKMYT1, E2F1, KIF23, PRC1, NEK3, NEK2, DBF4, CDC20, BIRC5, TPX2, SKP2, NUF2, CENPF, TRIP13, CDCA3, CDC6, MKI67, MSH5, CDCA2, CDCA5, CDC25B, CCNB1, MAD2L1, , POLD1, DSCC1 TACC3, CENPE, NDC80, ESPL1, PBK, CDKN3, UBE2C, AURKB, CDT1, CCNE2, AURKA, PTTG1, TTK, PKMYT1, E2F1, KIF23, CEP72, PRC1, DBF4, E2F7, UBE2C, UHRF1, TACC3, TPX2, ESPL1, MCM2, PBK, CCNE1, CDCA2, CDCA5, CDCA3, CDC6, SKP2, TRIP13, CKAP2, MKI67, MSH5, PSRC1, ANLN, CEP55, CENPA, MAD2L1, DSCC1, NEK3, NEK2, FOXM1, BIRC5, NDC80, CENPE, CDC20, CDKN3, CDC25B, CCNB1, PLK1, POLD1 NUF2, CENPF, AURKB, AURKA, PTTG1, CEP55, ANLN, TTK, PKMYT1, KIF23, E2F1, PRC1, NEK3, NEK2, DBF4, CDC20, BIRC5, CENPE, TPX2, SKP2, NUF2, CENPF, CDCA2, CDCA5, CDCA3, CDC6, CCNE1, CENPA, NDC80, ESPL1, PBK, CDKN3, UBE2C, CDC25B, CCNB1, MAD2L1, PLK1, POLD1, DSCC1 AURKB, AURKA, PTTG1, ANLN, TTK, PKMYT1, E2F1, KIF23, CEP72, PRC1, NEK3, NEK2, DBF4, TPX2, SKP2, NUF2, TRIP13, CDC6, MKI67, MSH5, CDCA2, CDCA5, CDCA3, CEP55, CCNE1, CENPA, CDC25B, CCNB1, TACC3, CDC20, BIRC5, CENPE, NDC80, ESPL1, PBK, CDKN3, UBE2C, CENPF, MAD2L1, PLK1, POLD1, DSCC1 AURKB, CDCA2, CDCA5, AURKA, PTTG1, CEP55, ANLN, TTK, PKMYT1, KIF23, PRC1, NEK3, NEK2, CDC20, BIRC5, CENPE, NDC80, ESPL1, TPX2, NUF2, CENPF, TRIP13, CDCA3, CDC6, MKI67, MSH5, CDC25B, CCNB1, MAD2L1, PLK1, DSCC1 TACC3, PBK, UBE2C, AURKB, PTTG1, CDCA2, CDCA5, CDCA3, ANLN, CEP55, AURKA, KIF23, NEK3, NEK2, PKMYT1, BIRC5, CENPE, NDC80, ESPL1, CDC20, PBK, UBE2C, CDC25B, CCNB1, TPX2, NUF2, CENPF, CDC6, MAD2L1, PLK1, DSCC1 AURKB, PTTG1, CDCA2, CDCA5, CDCA3, ANLN, CEP55, AURKA, KIF23, NEK3, NEK2, PKMYT1, BIRC5, CENPE, NDC80, ESPL1, CDC20, PBK, UBE2C, CDC25B, CCNB1, TPX2, NUF2, CENPF, CDC6, MAD2L1, PLK1, DSCC1 AURKB, PTTG1, CDCA2, CDCA5, CDCA3, ANLN, CEP55, AURKA, KIF23, NEK3, NEK2, PKMYT1, BIRC5, CENPE, NDC80, ESPL1, CDC20, PBK, UBE2C, CDC25B, CCNB1, TPX2, NUF2, CENPF, CDC6, MAD2L1, PLK1, DSCC1 
AURKB, PTTG1, CDCA2, CDCA5, CDCA3, ANLN, CEP55, AURKA, KIF23, NEK3, NEK2, PKMYT1, BIRC5, CENPE, NDC80, ESPL1, CDC20, PBK, UBE2C, CDC25B, CCNB1, TPX2, NUF2, CENPF, CDC6, MAD2L1, PLK1, DSCC1 AURKB, CCNE2, CCNE1, CDCA2, CDCA5, CDCA3, ANLN, CEP55, PTTG1, KIF23, PRC1, NEK3, NEK2, BIRC5, CDC20, CENPE, NDC80, ESPL1, UBE2C, CDC25B, CCNB1, MAD2L1, PLK1 CDC6, NUF2, CENPF, TACC3, ANLN, BIRC5, TTK, PKMYT1, ESPL1, CENPE, E2F1, CDC6, HOXA13, NEK2, SKP2, CENPF, UBE2C, CDKN3, CDT1, CCNE2, CCNB1, MAD2L1 UBE2C, TACC3, AURKA, NDC80, CENPE, TTK, ESPL1, KIFC2, KIF23, CEP72, PRC1, NEK2, PSRC1, KIF20A HOOK1, CENPA, 40 50 37 42 34 28 28 28 28 26 19 16 Count Term Cell cycle phase Cell cycle Mitotic cell cycle Cell cycle process M phase Mitosis Nuclear division M phase of mitotic cell cycle fission Organelle Cell division Regulation of cell cycle Microtubule ‑ based process Table VII. Significant GO terms and KEGG pathways for the genes in the constructed lncRNA‑mRNA network of two prognostic lncRNAs in the brown module. VII. Significant GO terms and KEGG pathways for the genes in constructed lncRNA‑mRNA Table GO category Biology process MOLECULAR MEDICINE REPORTS 18: 5579-5593, 2018 5591

Investigation of lncRNA profiles in human cancer remains to be performed. Apart from H19, MUC2, XIST and HOTAIR, the other lncRNAs in the signature have not previously been identified as prognostic in GC. FLVCR1-AS1 has been reported in lung adenocarcinoma by a study based on an miR-lncRNA-mRNA network (36). TP53TG1 is a critical lncRNA responsible for the correct response of p53 to DNA damage and acts as a tumor suppressor (37). There is evidence that TP53TG1 expression is elevated in human glioma tissue and that, under glucose deprivation, TP53TG1 may promote cell proliferation and migration by influencing the expression of glucose metabolism-associated genes in glioma (38). LINC00221 has been reported to be aberrantly expressed in bladder cancer (39). Li et al (40) noted that PRSS30P is upregulated in lung adenocarcinoma. SCARNA9 is overexpressed in breast cancer cells on exposure to cadmium (41). However, ARHGAP5-AS1 and MCF2L-AS1 are rarely studied in cancer. In future studies, the expression levels of ARHGAP5-AS1 and MCF2L-AS1 will be investigated in clinical samples of GC patients, since the prognostic value of these lncRNAs for GC was observed in the present study.

Correlations between the critical lncRNAs and mRNAs revealed by the WGCNA were used to construct lncRNA-mRNA networks. In order to investigate the molecular mechanisms of the 11 prognostic lncRNAs in GC, GO function and KEGG pathway enrichment analyses were performed for the genes in the constructed lncRNA-mRNA networks. The results demonstrated that the genes correlated with the 9 lncRNAs in the blue module (FLVCR1-AS1, H19, LINC00221, MUC2, PRSS30P, SCARNA9, TP53TG1, XIST and ARHGAP5-AS1) were associated with the immune response, regulation of cell activation, regulation of lymphocyte activation and cytokine-cytokine receptor interaction. These results suggested that these 9 lncRNAs may serve important roles in the pathogenesis of GC by regulating their associated genes to affect the immune and inflammatory responses. The genes associated with the 2 lncRNAs in the brown module (HOTAIR and MCF2L-AS1) were revealed to be implicated in cell cycle regulation. This indicated that HOTAIR and MCF2L-AS1 may also be critical in the pathogenesis of GC by regulating their associated genes to influence the cell cycle. A growing body of evidence demonstrates the important roles of inflammation, the immune response and dysregulated cell cycle control in tumor growth and progression (42-44). Therefore, it can be concluded that the 11 critical lncRNAs may participate in the development and progression of GC by regulating their correlated genes to influence the immune response, inflammatory response and cell cycle.

Table VII. Continued.

GO category          Term                                       Count   FDR
Biological process   Pattern specification process              15      2.78x10^-03
Biological process   DNA metabolic process                      20      5.77x10^-03
KEGG pathway         Cell cycle                                 18      1.01x10^-12
KEGG pathway         DNA replication                            4       5.38x10^-03
KEGG pathway         Progesterone-mediated oocyte maturation    5       1.04x10^-02
KEGG pathway         Steroid biosynthesis                       3       1.22x10^-02

GO, gene ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; lnc, long non-coding; FDR, false discovery rate.
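The GO and KEGG enrichment results discussed above rest on over-representation testing with multiple-testing correction. As an illustration only, the following sketch computes a one-sided hypergeometric P-value for a single term and Benjamini-Hochberg adjusted values for a list of P-values. It is a plain hypergeometric test and should not be read as the exact procedure of any particular tool; tools such as DAVID (28,29) use a modified Fisher exact test (the EASE score). The function names and all numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(N, K, n, k):
    """One-sided P(X >= k) for over-representation.

    N: genes in the background, K: background genes annotated to the term,
    n: genes in the query list, k: query genes annotated to the term.
    """
    upper = min(K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical example: 18 of 300 network genes fall in a term that
# annotates 200 of 20,000 background genes.
p_term = hypergeom_enrichment_p(N=20000, K=200, n=300, k=18)

# Hypothetical raw P-values for three terms.
fdr = benjamini_hochberg([0.01, 0.04, 0.03])
```

A term would then be reported as significant when its adjusted value falls below the chosen FDR threshold.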
Based on bioinformatics analysis of existing gene expression data from NCBI GEO, EBI ArrayExpress and TCGA, the present study identified an 11-lncRNA signature that could be used for predicting the survival rate of GC patients. These 11 critical lncRNAs may participate in the pathogenesis of GC by regulating their correlated genes that are associated with the immune response, inflammatory response and cell cycle. It is hoped that the present study may contribute to an improved understanding of the roles of lncRNAs in GC development and progression. Validation of this 11-lncRNA signature in large cohorts of GC patients and in clinical trials is also essential in further investigation.

Acknowledgements

Not applicable.

Funding

No funding was received.

Availability of data and materials

The datasets analyzed during the present study are available from the corresponding author on reasonable request.

Authors' contributions

YZ and HL performed data analyses and wrote the manuscript. WZ and YC contributed significantly to the data analyses and critical revision of the manuscript. GH and WB conceived and designed the study. All authors read and approved the final manuscript.

Ethics approval and consent to participate

Not applicable.

Patient consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

References

1. IARC: World Cancer Report 2014. Stewart BW and Wild CP (eds). World Health Organization, Geneva, 2015.
2. Orditura M, Galizia G, Sforza V, Gambardella V, Fabozzi A, Laterza MM, Andreozzi F, Ventriglia J, Savastano B, Mabilia A, et al: Treatment of gastric cancer. World J Gastroenterol 20: 1635-1649, 2014.
3. Chen W, Zheng R, Baade PD, Zhang S, Zeng H, Bray F, Jemal A, Yu XQ and He J: Cancer statistics in China, 2015. CA Cancer J Clin 66: 115-132, 2016.
4. Sun Z, Wang ZN, Zhu Z, Xu YY, Xu Y, Huang BJ, Zhu GL and Xu HM: Evaluation of the seventh edition of American Joint Committee on Cancer TNM staging system for gastric cancer: Results from a Chinese monoinstitutional study. Ann Surg Oncol 19: 1918-1927, 2012.
5. Mikkelsen TS, Ku M, Jaffe DB, Issac B, Lieberman E, Giannoukos G, Alvarez P, Brockman W, Kim TK, Koche RP, et al: Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature 448: 553-560, 2007.
6. Liz J and Esteller M: lncRNAs and microRNAs with a role in cancer development. Biochim Biophys Acta 1859: 169-176, 2016.
7. Evans JR, Feng FY and Chinnaiyan AM: The bright side of dark matter: LncRNAs in cancer. J Clin Invest 126: 2775-2782, 2016.
8. Zhou X, Yin C, Dang Y, Ye F and Zhang G: Identification of the long non-coding RNA H19 in plasma as a novel biomarker for diagnosis of gastric cancer. Sci Rep 5: 11516, 2015.
9. Zhou X, Ye F, Yin C, Zhuang Y, Yue G and Zhang G: The interaction between MiR-141 and lncRNA-H19 in regulating cell proliferation and migration in gastric cancer. Cell Physiol Biochem 36: 1440-1452, 2015.
10. Li H, Yu B, Li J, Su L, Yan M, Zhu Z and Liu B: Overexpression of lncRNA H19 enhances carcinogenesis and metastasis of gastric cancer. Oncotarget 5: 2318-2329, 2014.
11. Pan W, Liu L, Wei J, Ge Y, Zhang J, Chen H, Zhou L, Yuan Q, Zhou C and Yang M: A functional lncRNA HOTAIR genetic variant contributes to gastric cancer susceptibility. Mol Carcinog 55: 90-96, 2016.
12. Liu XH, Sun M, Nie FQ, Ge YB, Zhang EB, Yin DD, Kong R, Xia R, Lu KH, Li JH, et al: Lnc RNA HOTAIR functions as a competing endogenous RNA to regulate HER2 expression by sponging miR-331-3p in gastric cancer. Mol Cancer 13: 92, 2014.
13. Zhang EB, Kong R, Yin DD, You LH, Sun M, Han L, Xu TP, Xia R, Yang JS, De W and Chen JF: Long noncoding RNA ANRIL indicates a poor prognosis of gastric cancer and promotes tumor growth by epigenetically silencing of miR-99a/miR-449a. Oncotarget 5: 2276-2292, 2014.
14. Miao Y, Sui J, Xu SY, Liang GY, Pu YP and Yin LH: Comprehensive analysis of a novel four-lncRNA signature as a prognostic biomarker for human gastric cancer. Oncotarget 8: 75007-75024, 2017.
15. Danford T, Rolfe A and Gifford D: GSE: A comprehensive database system for the representation, retrieval, and analysis of microarray data. Pac Symp Biocomput: 539-550, 2008.
16. Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W and Smyth GK: Limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res 43: e47, 2015.
17. Zhou M, Guo M, He D, Wang X, Cui Y, Yang H, Hao D and Sun J: A potential signature of eight long non-coding RNAs predicts survival in patients with non-small cell lung cancer. J Transl Med 13: 231, 2015.
18. Zhou M, Xu W, Yue X, Zhao H, Wang Z, Shi H, Cheng L and Sun J: Relapse-related long non-coding RNA signature to improve prognosis prediction of lung adenocarcinoma. Oncotarget 7: 29720-29738, 2016.
19. Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, et al: Clustal W and Clustal X version 2.0. Bioinformatics 23: 2947-2948, 2007.
20. Zhai X, Xue Q, Liu Q, Guo Y and Chen Z: Colon cancer recurrence-associated genes revealed by WGCNA co-expression network analysis. Mol Med Rep 16: 6499-6505, 2017.
21. Langfelder P and Horvath S: WGCNA: An R package for weighted correlation network analysis. BMC Bioinformatics 9: 559, 2008.
22. Qi C, Hong L, Cheng Z and Yin Q: Identification of metastasis-associated genes in colorectal cancer using metaDE and survival analysis. Oncol Lett 11: 568-574, 2016.
23. Wang X, Kang DD, Shen K, Song C, Lu S, Chang LC, Liao SG, Huo Z, Tang S, Ding Y, et al: An R package suite for microarray meta-analysis in quality control, differentially expressed gene analysis and pathway enrichment detection. Bioinformatics 28: 2534-2536, 2012.
24. Goeman JJ: L1 penalized estimation in the Cox proportional hazards model. Biom J 52: 70-84, 2010.
25. Tibshirani R: The lasso method for variable selection in the Cox model. Stat Med 16: 385-395, 1997.
26. Cristescu R, Lee J, Nebozhyn M, Kim KM, Ting JC, Wong SS, Liu J, Yue YG, Wang J, Yu K, et al: Molecular analysis of gastric cancer identifies subtypes associated with distinct clinical outcomes. Nat Med 21: 449-456, 2015.
27. Parrish RS and Spencer HJ III: Effect of normalization on significance testing for oligonucleotide microarrays. J Biopharm Stat 14: 575-589, 2004.
28. Huang da W, Sherman BT and Lempicki RA: Bioinformatics enrichment tools: Paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res 37: 1-13, 2009.
29. Huang da W, Sherman BT and Lempicki RA: Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc 4: 44-57, 2009.
30. Li T, Mo X, Fu L, Xiao B and Guo J: Molecular mechanisms of long noncoding RNAs on gastric cancer. Oncotarget 7: 8601-8612, 2016.
31. Sun M, Nie FQ, Wang ZX and De W: Involvement of lncRNA dysregulation in gastric cancer. Histol Histopathol 31: 33-39, 2016.
32. Lee HS, Lee HK, Kim HS, Yang HK, Kim YI and Kim WH: MUC1, MUC2, MUC5AC, and MUC6 expressions in gastric carcinomas. Cancer 92: 1427-1434, 2001.
33. Brown CJ, Ballabio A, Rupert JL, Lafreniere RG, Grompe M, Tonlorenzi R and Willard HF: A gene from the region of the human X inactivation centre is expressed exclusively from the inactive X chromosome. Nature 349: 38-44, 1991.
34. Chen DL, Ju HQ, Lu YX, Chen LZ, Zeng ZL, Zhang DS, Luo HY, Wang F, Qiu MZ, Wang DS, et al: Long non-coding RNA XIST regulates gastric cancer progression by acting as a molecular sponge of miR-101 to modulate EZH2 expression. J Exp Clin Cancer Res 35: 142, 2016.
35. Endo H, Shiroki T, Nakagawa T, Yokoyama M, Tamai K, Yamanami H, Fujiya T, Sato I, Yamaguchi K, Tanaka N, et al: Enhanced expression of long non-coding RNA HOTAIR is associated with the development of gastric cancer. PLoS One 8: e77070, 2013.
36. Li DS, Ainiwaer JL, Sheyhiding I, Zhang Z and Zhang LW: Identification of key long non-coding RNAs as competing endogenous RNAs for miRNA-mRNA in lung adenocarcinoma. Eur Rev Med Pharmacol Sci 20: 2285-2295, 2016.
37. Diaz-Lagares A, Crujeiras AB, Lopez-Serra P, Soler M, Setien F, Goyal A, Sandoval J, Hashimoto Y, Martinez-Cardús A, Gomez A, et al: Epigenetic inactivation of the p53-induced long noncoding RNA TP53 target 1 in human cancer. Proc Natl Acad Sci USA 113: E7535-E7544, 2016.
38. Chen X, Gao Y, Li D, Cao Y and Hao B: LncRNA-TP53TG1 participated in the stress response under glucose deprivation in glioma. J Cell Biochem 118: 4897-4904, 2017.
39. Wang H, Niu L, Jiang S, Zhai J, Wang P, Kong F and Jin X: Comprehensive analysis of aberrantly expressed profiles of lncRNAs and miRNAs with associated ceRNA network in muscle-invasive bladder cancer. Oncotarget 7: 86174-86185, 2016.
40. Li J, Li P, Zhao W, Yang R, Chen S, Bai Y, Dun S, Chen X, Du Y, Wang Y, et al: Expression of long non-coding RNA DLX6-AS1 in lung adenocarcinoma. Cancer Cell Int 15: 48, 2015.
41. Lubovac-Pilav Z, Borràs DM, Ponce E and Louie MC: Using expression profiling to understand the effects of chronic cadmium exposure on MCF-7 breast cancer cells. PLoS One 8: e84646, 2013.
42. Candido J and Hagemann T: Cancer-related inflammation. J Clin Immunol 33 (Suppl 1): S79-S84, 2013.
43. Elinav E, Nowarski R, Thaiss CA, Hu B, Jin C and Flavell RA: Inflammation-induced cancer: Crosstalk between tumours, immune cells and microorganisms. Nat Rev Cancer 13: 759-771, 2013.
44. Evan GI and Vousden KH: Proliferation, cell cycle and apoptosis in cancer. Nature 411: 342-348, 2001.

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) License.