556 (2015) 227–234


Gene


Genome-wide analysis of long noncoding RNA signature in human colorectal cancer

Yao Xue a,b,1, Gaoxiang Ma a,b,1, Dongying Gu c,1, Lingjun Zhu d, Qiuhan Hua a,b, Mulong Du a,b, Haiyan Chu a, Na Tong a, Jinfei Chen c, Zhengdong Zhang a,b,⁎, Meilin Wang a,b,⁎

a Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Cancer Center, Nanjing Medical University, Nanjing, China
b Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology, Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, China
c Department of Oncology, Nanjing First Hospital, Nanjing Medical University, Nanjing, China
d Department of Oncology, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China

Article history:
Received 21 October 2014
Accepted 28 November 2014
Available online 29 November 2014

Keywords: lncRNA; Colorectal cancer; Expression profile; GSEA; GO

Abstract

Long noncoding RNAs (lncRNAs) have been widely regarded as crucial regulators in various biological processes involved in carcinogenesis. However, the comprehensive lncRNA expression signature in colorectal cancer remains largely unknown. We performed a high-throughput microarray assay to detect the lncRNA expression profile in three paired human colorectal cancer tissues and their adjacent normal tissues. An additional 90 paired colorectal samples were collected to verify the differential expression levels of two selected lncRNAs using a q-RT-PCR assay. Bioinformatic approaches were used to explore the functions of these differentially expressed lncRNAs. The microarray assay showed that a series of lncRNAs were differentially expressed in colorectal cancer. Two of the lncRNAs, HOTAIR and a novel lncRNA, lncRNA-422, were confirmed in more samples (P = 0.015 for HOTAIR and P = 0.027 for lncRNA-422). GSEA indicated that the gene sets most correlated with them were those named up-regulated in KRAS-over, down-regulated in JAK2-knockout, down-regulated in PDGF-over and down-regulated in TBK1-knockout, all of which were cancer-related. Subsequently, GO analyses of the coding genes most significantly correlated with HOTAIR and lncRNA-422 showed that these two lncRNAs may participate in carcinogenesis by regulating protein coding genes involved in specific biological processes relevant to cancer. Our study demonstrated that distinct lncRNA expression patterns were involved in colorectal cancer. In addition, HOTAIR and lncRNA-422 were identified as participating in colorectal cancer. Further studies into the biological mechanisms of the differentially expressed lncRNAs identified in our study will help to provide a new perspective on colorectal cancer pathogenesis. © 2014 Elsevier B.V. All rights reserved.

Abbreviations: MALAT1, metastasis associated lung adenocarcinoma transcript 1; HOTAIR, HOX transcript antisense RNA; ANRIL, CDKN2B antisense RNA 1; PCGEM1, prostate cancer (PCa) gene expression marker 1; GAPDH, glyceraldehyde 3-phosphate dehydrogenase; GSEA, gene set enrichment analysis; ES, enrichment score; GO, gene ontology; DAVID, database for annotation, visualization and integrated discovery; GOEAST, Gene Ontology Enrichment Analysis Software Toolkit; KRAS, Kirsten rat sarcoma viral oncogene homolog; JAK2, Janus kinase 2; PDGF, platelet derived growth factor; TBK1, TANK-binding kinase 1; SNHG1, small nucleolar RNA host gene 1; NEAT1, nuclear paraspeckle assembly transcript 1; PRSS3, protease, serine, 3; MMP10, matrix metallopeptidase 10; CXCL2, chemokine (C-X-C motif) ligand 2; TP53, tumor protein p53; E2F3, E2F transcription factor 3; LEF1, lymphoid enhancer-binding factor 1.
⁎ Corresponding authors at: Department of Environmental Genomics, School of Public Health, Nanjing Medical University, 818 Tianyuan Road, Jiangning District, Nanjing 211166, China.
E-mail addresses: [email protected] (Z. Zhang), [email protected] (M. Wang).
1 These authors contributed equally to this work.
http://dx.doi.org/10.1016/j.gene.2014.11.060

1. Introduction

Over the past decades, high-throughput genomic platforms have revealed that numerous sites of the human genome are transcribed into noncoding transcripts (ncRNAs) (Mattick, 2004). This has been among the most surprising findings and a remarkable challenge to, and extension of, the central dogma, which centered on protein coding genes and regarded RNA merely as an information transmitter. Researchers have found that only approximately 1.5% of the whole genome is responsible for protein coding (Wang and Chang, 2011). In addition to their huge number, the relatively conserved parts of their sequences and their specific temporal and spatial expression patterns also support important functions for ncRNAs, which were initially regarded as the proverbial “dark matter” of the genome (Ponting and Belgard, 2010). Indeed, emerging studies have demonstrated a major biological role for ncRNAs in a variety of processes affecting evolution, embryonic development, metabolism, oncogenesis, etc. (Ponting et al., 2009; Wilusz et al., 2009).

NcRNAs are roughly separated into two groups according to transcript length: small ncRNAs and long ncRNAs (lncRNAs) (Mattick, 2001). The former are defined as ncRNAs shorter than 200 nucleotides (nt) and are represented by the widely explored microRNAs, which negatively regulate the expression of protein coding genes by base pairing with their targets (Bartel, 2009). In contrast, the other transcriptional class, lncRNAs, is described as longer than 200 nt (Mercer et al., 2009), and a rising number of studies have reported their functional role as regulatory RNAs (Wilusz et al., 2009; Wapinski and Chang, 2011; Whitehead et al., 2009). Unlike the relatively uniform mechanisms of microRNA regulation, the function of a lncRNA cannot be accurately predicted from its sequence, because various lncRNA mechanisms have been identified, e.g. genomic imprinting (Lee and Bartolomei, 2013), chromatin modification (Marchese and Huarte, 2014), and post-transcriptional processing (Yoon et al., 2013). Given the remarkable regulatory functions of lncRNAs in multiple key biological processes, a growing body of research has explored their role in human disease (Li et al., 2013). It is well established that a series of lncRNAs are dysregulated in diseases, especially malignant tumors, such as MALAT1 (Ji et al., 2003), HOTAIR (Gupta et al., 2010), ANRIL (Cunnington et al., 2010), PCGEM1 (Petrovics et al., 2004), etc. Moreover, the observed abnormal expression of lncRNAs is also an important indication of their biological functions in carcinogenesis. Although aberrant expression of lncRNAs has gradually been recognized as a biological signature of cancers (Gibb et al., 2011), studies of the detailed lncRNA expression pattern in a specific tumor are still lacking.

Colorectal cancer is the third most common cancer and the fourth leading cause of cancer related death worldwide (Jemal et al., 2011), with more than one million new cases diagnosed each year (Karsa et al., 2010). Due to changes in living habits, the prevalence of colorectal cancer has risen dramatically in China (Sung et al., 2005). Colorectal cancer has become a great threat to public health. Therefore, a substantial number of studies have investigated molecular abnormalities in the occurrence of colorectal cancer, in order to learn more about its pathogenesis (Colussi et al., 2013; Vaiopoulos et al., 2014). Among the numerous molecules demonstrated to be involved in colorectal cancer, lncRNAs have drawn increasing attention for their aberrant expression features in carcinogenesis. The abnormal expression level of a lncRNA often indicates a functional role in the biological process of colorectal cancer (Ge et al., 2013; Qi et al., 2013; Kogo et al., 2011). However, comprehensive studies of the specific expression patterns of lncRNAs in colorectal cancer have not been reported.

To investigate the potential role of lncRNAs in the carcinogenesis of colorectal cancer more comprehensively, we performed a microarray analysis to identify the genome-wide differential expression profiles of lncRNAs and mRNAs between 3 pairs of colorectal cancer tissues and their adjacent normal specimens. We further validated the differential expression levels of two lncRNAs in more tissue samples and predicted their putative functions using bioinformatic approaches.

2. Material and methods

2.1. Patient specimens and clinical assessments

The present study recruited 98 pairs of colorectal cancer tissue and corresponding non-tumor tissue samples, all of which were obtained from patients who underwent surgery at the First Affiliated Hospital and Nanjing First Hospital of Nanjing Medical University from September 2010. All participants were histologically confirmed to have colorectal adenocarcinoma and had not received any other form of therapy at the time of enrollment. Clinical information for all the subjects was obtained retrospectively from clinical files. Tumor grade was divided into low, intermediate, and high, while pathological stage was classified into Dukes A, B, C, and D. Detailed information on the 3 samples analyzed on the microarray platform and the other 95 samples in the verification stage is shown in Supplementary Tables 1 and 2. A questionnaire about lifestyle factors was administered to all the subjects through face-to-face interviews. Informed consent was obtained from all the participants, and the procedures used in this study were approved by the institutional review boards of Nanjing Medical University.

Fig. 1. (A) Results of the hierarchical clustering analysis of lncRNA microarray expression data and (B) a visualization for assessing the variation (or reproducibility) between chips. Variables on the X- and Y-axes represent the expression level of each lncRNA (log-transformed) in normal tissues and tumor tissues, respectively.

Table 1
Selection and exclusion criteria of top 3 differentially expressed lncRNAs in microarray.

Seqname                        Regulation  Fold change  P value        Chrom  Relationship        Associated gene  Remark
ENST00000434306                Up          11.93        9.02 × 10^−3   2      Natural antisense   BOK              A
U93033                         Up          9.74         8.30 × 10^−4   8      Intronic antisense  SLA              B
ENST00000428194                Up          7.70         2.31 × 10^−3   X      Intergenic          -                C
ENST00000455246 (HOTAIR) ⁎     Up          1.20         3.10 × 10^−2   12     Intronic antisense  HOXC11           D
ENST00000494900                Down        8.42         4.03 × 10^−2   3      Intergenic          -                E
ENST00000427340                Down        6.76         3.56 × 10^−2   6      Natural antisense   HLA-F            E
ENST00000415820 (LncRNA-422)   Down        6.03         3.86 × 10^−3   21     Intergenic          -                F

Entries with remark D or F are the lncRNAs selected for the following studies.
A: No evident role of its antisense protein coding gene BOK in carcinogenesis, not selected.
B: No comprehensive information in database, not selected.
C: Located on chromosome X, not selected.
D: Previously reported to be involved in colorectal cancer, and significantly upregulated in our microarray, selected for q-RT-PCR verification.
E: Not significantly associated with differentially expressed protein coding mRNA, not selected.
F: With comprehensive information in Ensembl database, and significantly associated with 3 differentially expressed protein coding mRNAs (Pearson correlation coefficient = 1, P < 0.05), selected for q-RT-PCR verification.
⁎ As the first 3 upregulated lncRNAs in the microarray were not selected for verification, we included HOTAIR in our subsequent assays.

2.2. RNA extraction

Samples were immediately frozen in liquid nitrogen after surgical resection. Total RNA was extracted from all the colorectal cancer and paired non-cancerous tissues using Trizol reagent (Invitrogen, CA, USA) according to the manufacturer's instructions. The quantity of RNA was assessed with the NanoDrop ND-1000 spectrophotometer (OD 260 nm, NanoDrop, Wilmington, DE, USA), and RNA integrity was assessed using standard denaturing agarose gel electrophoresis.

2.3. LncRNA and mRNA microarrays

Human LncRNA Microarray V2.0 was manufactured by Arraystar Inc (MD, USA) and covered more than thirty-three thousand lncRNAs as well as thirty thousand mRNAs in the human genome. Transcripts were drawn from NCBI RefSeq, UCSC Genome Browser, the Ensembl database, RNA db, etc., and lncRNAs from the literature were also included. Each transcript is represented by 1–5 probes to improve statistical confidence. The microarray hybridization and collection of expression data were performed by KangChen Bio-tech, Shanghai, China.

2.4. Data analysis of microarray

The microarrays were scanned at a resolution of 5 μm/pixel with an Agilent Microarray Scanner (Agilent p/n G2565BA). Scanned images (TIF format) were subsequently imported into Agilent Feature Extraction software for grid alignment and expression data analysis. Raw signal intensities were quantile-normalized with GeneSpring GX v12.0, and low-intensity lncRNAs were filtered out. Differentially expressed lncRNAs with statistical significance between the two groups were identified through Volcano Plot filtering.

2.5. Quantitative real-time PCR (q-RT-PCR)

After RNA extraction, M-MLV reverse transcriptase (Invitrogen) was used to synthesize cDNA according to the manufacturer's recommendations. The expression levels of two differentially expressed lncRNAs (i.e. lncRNA-422 and HOTAIR) were detected by q-RT-PCR using SYBR Green assays (TaKaRa Biotechnology, Dalian, China) in an additional 95 pairs of colorectal tissues. Glyceraldehyde 3-phosphate dehydrogenase (GAPDH) was used as an internal control and the assay was performed using the ABI 7300 system (Applied Biosystems, CA, USA). For quantitative results, the expression of each lncRNA was calculated as a fold change using the 2^(−ΔΔCt) method. All the primers are available on request.

2.6. Hierarchical clustering analysis

To give a general overview of the characteristics of the lncRNA expression profile between the two groups, we performed unsupervised hierarchical clustering analysis using the Cluster/TreeView program.
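The normalization and filtering steps of Section 2.4 were carried out in GeneSpring GX; the sketch below is not that software's implementation but a minimal illustration of the two ideas, written with the Python standard library only. The function names, the tie handling (ties are broken by position), and the fold-change cutoff are assumptions made for illustration; the text states only that quantile normalization and Volcano Plot filtering were used, and the P < 0.05 threshold is taken from Section 3.1.

```python
def quantile_normalize(columns):
    """Quantile-normalize sample columns of equal length.

    Each value is replaced by the mean, across samples, of the values
    that share its rank. Ties are broken by position (a simplification).
    """
    n_probes = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Mean intensity at each rank across all samples.
    rank_means = [sum(col[r] for col in sorted_cols) / len(columns)
                  for r in range(n_probes)]
    normalized = []
    for col in columns:
        order = sorted(range(n_probes), key=lambda i: col[i])
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


def volcano_filter(log2_fold_changes, p_values,
                   min_abs_log2_fc=1.0, max_p=0.05):
    """Return indices passing both thresholds.

    min_abs_log2_fc is a hypothetical cutoff; the paper does not state one.
    """
    return [i for i, (fc, p) in enumerate(zip(log2_fold_changes, p_values))
            if abs(fc) >= min_abs_log2_fc and p < max_p]


# Hypothetical intensities for two samples and three probes.
samples = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
print(quantile_normalize(samples))
print(volcano_filter([2.0, 0.5, -1.5], [0.01, 0.001, 0.2]))
```

After normalization every sample shares the same set of values, which is what makes intensities comparable across chips before fold changes and P values are computed.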

Fig. 2. HOTAIR (A) and lncRNA-422 (B) expression levels in an additional 95 pairs of colorectal cancer and adjacent normal tissues analyzed by q-RT-PCR assay.
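The fold changes behind Fig. 2 were computed with the 2^(−ΔΔCt) method, with GAPDH as the internal control (Section 2.5). A minimal sketch of that arithmetic for one tumor/normal pair follows; the function name and the Ct values are hypothetical and are not data from the study.

```python
def delta_delta_ct_fold_change(ct_target_tumor, ct_ref_tumor,
                               ct_target_normal, ct_ref_normal):
    """Relative expression (tumor vs. paired normal) by the 2^(-ddCt) method.

    ct_target_*: Ct of the lncRNA of interest (e.g. HOTAIR).
    ct_ref_*:    Ct of the internal control (GAPDH in the paper).
    """
    # Normalize each tissue to its own internal control.
    delta_ct_tumor = ct_target_tumor - ct_ref_tumor
    delta_ct_normal = ct_target_normal - ct_ref_normal
    # Compare tumor against the paired normal tissue.
    delta_delta_ct = delta_ct_tumor - delta_ct_normal
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold two cycles
# earlier in tumor than in normal, relative to GAPDH.
fold_change = delta_delta_ct_fold_change(24.0, 18.0, 26.0, 18.0)
print(fold_change)  # 4.0
```

A value above 1 indicates higher expression in tumor than in the paired normal tissue, and a value below 1 indicates lower expression. The formula assumes roughly 100% amplification efficiency for both target and control.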

Up-regulated or down-regulated lncRNAs with the top 10 fold changes among all the transcripts between cancer tissues and normal tissues were selected for clustering analysis. Average linkage hierarchical cluster analysis was conducted to generate expression patterns of the lncRNAs with the most significantly altered expression levels between the two groups.
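The clustering itself was run in Cluster/TreeView. To make the average linkage rule concrete, the following standard-library sketch merges, at each step, the two clusters with the smallest mean pairwise distance. It assumes Euclidean distance, which the text does not specify, and the sample profiles are hypothetical; it is an illustration of the rule, not a reproduction of the Cluster/TreeView workflow.

```python
from itertools import combinations
from math import dist


def average_linkage(points):
    """Agglomerative clustering with average linkage.

    Returns the merge history as (members_a, members_b, distance),
    where members are indices into `points`.
    """
    clusters = [[i] for i in range(len(points))]
    merges = []

    def avg_dist(a, b):
        # Mean of all pairwise distances between the two clusters.
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: avg_dist(*pair))
        merges.append((a, b, avg_dist(a, b)))
        clusters = [c for c in clusters if c is not a and c is not b]
        clusters.append(a + b)
    return merges


# Hypothetical two-lncRNA profiles: samples 0-2 "normal", 3-5 "tumor".
profiles = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
for members_a, members_b, d in average_linkage(profiles):
    print(sorted(members_a), sorted(members_b), round(d, 3))
```

With well-separated groups, the final merge joins one cluster per tissue type, which corresponds to the two top-level branches of a dendrogram such as Fig. 1A.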

Fig. 3. Identification of gene sets enriched in phenotypes correlated with HOTAIR (A: positively, B: negatively) and lncRNA-422 (C: positively, D: negatively) by GSEA, as well as the top 20 protein-coding genes correlated with lncRNAs HOTAIR (E) and lncRNA-422 (F). Heatmaps were generated by GSEA.
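The enrichment plots in Fig. 3 come from the GSEA software. As a rough sketch of the underlying idea, with a continuous phenotype (here, the expression of one lncRNA) genes can be ranked by their Pearson correlation with it, and a weighted running sum then measures whether a gene set concentrates at the top or bottom of the ranking. The code below is a simplified standard-library illustration with hypothetical gene names and values; it does not perform the permutation-based normalization that yields NES, and it assumes the gene set contains at least one ranked gene and excludes at least one.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rank_by_correlation(expression, phenotype):
    """Order genes by correlation with the phenotype, highest first."""
    corr = {g: pearson(v, phenotype) for g, v in expression.items()}
    ranked = sorted(corr, key=corr.get, reverse=True)
    return ranked, [corr[g] for g in ranked]


def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Running-sum enrichment score (maximum deviation from zero)."""
    hits = [g in gene_set for g in ranked_genes]
    hit_weights = [abs(s) ** weight if h else 0.0
                   for s, h in zip(scores, hits)]
    total_hit = sum(hit_weights)
    n_miss = len(ranked_genes) - sum(hits)
    running, best = 0.0, 0.0
    for w, h in zip(hit_weights, hits):
        # Step up at set members (weighted), step down otherwise.
        running += w / total_hit if h else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical data: one lncRNA and four genes across 6 samples.
lncrna = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
expression = {
    "geneA": [1.1, 2.0, 2.9, 4.2, 5.1, 5.8],
    "geneB": [2.0, 2.5, 3.5, 3.9, 5.2, 6.1],
    "geneC": [3.0, 3.2, 2.9, 3.1, 3.0, 2.8],
    "geneD": [6.0, 5.1, 4.2, 2.9, 2.1, 1.0],
}
ranked, scores = rank_by_correlation(expression, lncrna)
print(ranked)
print(enrichment_score(ranked, scores, {"geneA", "geneB"}))
```

A positive score means the set is concentrated among genes positively correlated with the lncRNA, and a negative score means it is concentrated among negatively correlated genes, matching the "positively" and "negatively" panels of Fig. 3.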

2.7. Gene set enrichment analysis (GSEA)

mRNAs correlated with the two differentially expressed lncRNAs (i.e. lncRNA-422 and HOTAIR) were analyzed in terms of their expression enrichment in predefined biological sets of genes. Microarray expression data of the two selected lncRNAs along with all the coding-gene mRNAs were used to generate an 18,057 (genes) × 6 (samples) expression matrix. GSEA was performed with GSEA software V2.0 (Broad Institute, MIT, USA). The gene set collection used for enrichment analysis was “c6.all.v4.0.symbols.gmt”, which contains oncogenic signatures. Expression levels of lncRNA-422 and HOTAIR were separately used as phenotype labels. GSEA first orders all the genes according to their correlation with lncRNA-422 or HOTAIR; each of the gene sets mentioned above then receives an enrichment score (ES), which reflects the degree to which the gene set is overrepresented at the top or bottom of the ranked list of genes. Finally, the ES is normalized to the NES, which is the primary statistic for examining gene set enrichment results and can be used to compare results across gene sets.

2.8. Gene ontology (GO) analysis

The coding genes most correlated with HOTAIR or lncRNA-422 (top 3000 by Pearson correlation coefficient) were input into the online software DAVID (Database for Annotation, Visualization and Integrated Discovery, http://david.abcc.ncifcrf.gov/home.jsp). The results annotate the input genes with Gene Ontology terms (http://www.geneontology.org/). The P value indicates the significance of enrichment of a GO term in the selected genes.

Gene Ontology Enrichment Analysis was conducted to reveal the relationships among the genes most evidently correlated with the selected lncRNAs (top 20 protein-coding genes shown in Fig. 3E and F), on the basis of their biological functions. The Batch-Genes tool of the Gene Ontology Enrichment Analysis Software Toolkit (GOEAST) was used to identify significantly enriched GO terms among the given list of genes (i.e. genes correlated with the lncRNAs) (http://omicslab.genetics.ac.cn/GOEAST/php/batch_genes.php). The website organizes submitted genes into hierarchical categories and constructs gene regulatory networks based on biological processes and molecular functions. By default, the P value of Batch-Genes is calculated under the null hypothesis that the submitted genes were picked randomly from the genome.

2.9. Statistical analysis

Student's t-test was applied to compare the expression levels of the selected lncRNAs between human colorectal cancer and adjacent normal tissues. Differences in lncRNA expression levels among clinical subgroups were analyzed using one-way ANOVA. A value of P < 0.05 was considered significant. All the analyses were performed using SPSS 13.0 (Chicago, IL, USA).

3. Results

3.1. Differentially expressed lncRNAs in microarray

The microarray expression profile contained a total of 23,920 lncRNAs and 18,056 mRNAs which were detected in the paired colorectal tissues. Results of the hierarchical clustering analysis are represented by a dendrogram (Fig. 1A), and the visualization of the variation between chips is shown in Fig. 1B. We identified 762 significantly differentially expressed lncRNAs (P < 0.05) between cancer and normal tissues. Among them, 390 were up-regulated while 372 were down-regulated. In terms of lncRNA location, 56.8% of the differentially expressed lncRNAs (433 lncRNAs) were transcribed from protein coding genes and 43.2% originated from intergenic genome regions. Unsupervised hierarchical clustering analysis of the differentially expressed lncRNAs among all the subjects was conducted based on the similarity of their expression patterns. We found that samples were distinctly separated into two

groups (i.e. cancer tissues and normal tissues) according to their lncRNA expression patterns.

Fig. 4. GO terms resulting from the GO analysis annotating correlated coding genes in the categories of biological process, cellular component, and molecular function, showing the 10 terms with the lowest P values.

3.2. Selection and verification of differentially expressed lncRNAs

In order to select lncRNAs with fundamental biological functions for q-RT-PCR confirmation, we examined the detailed information on the first 3 up-regulated and down-regulated lncRNAs in the microarray. As a result, we selected only transcript ENST00000415820 (named lncRNA-422 in the present study), which was evidently down-regulated in colorectal cancer tissues and had a significant correlation with 3 differentially expressed coding-gene mRNAs (Pearson correlation coefficient = 1, P < 0.05). The reasons for excluding the other 5 lncRNAs are given in Table 1. In addition, as none of the 3 up-regulated lncRNAs was selected for verification, we searched the published literature and found that lncRNA HOTAIR was reported to be up-regulated in colorectal cancers. Subsequently, the expression level of HOTAIR was looked up in our microarray data, and a significantly elevated level was found in cancer tissues (fold change = 1.2, P = 0.031). Therefore, in the further q-RT-PCR assay, HOTAIR was selected as the representative of the up-regulated lncRNAs in colorectal cancer.

The q-RT-PCR assay for the expression levels of HOTAIR and lncRNA-422 in our additional 95 pairs of tissues provided further confirmation of the differential expression of these two selected lncRNAs. Consistent with the microarray data, the level of HOTAIR was significantly increased while that of lncRNA-422 was decreased in colorectal cancer tissues (P = 0.015 for HOTAIR and P = 0.027 for lncRNA-422) (Fig. 2A and B). Nevertheless, we did not find any associations between the expression levels of these lncRNAs and clinical features of colorectal cancer, suggesting that they were less likely to be involved in the development of this disease (data are shown in Supplementary Tables 3 and 4). It should be mentioned that expression of lncRNA-422 was not very stable, so we obtained only 85 pairs of available values in the q-RT-PCR assay of lncRNA-422.

3.3. GSEA of HOTAIR and lncRNA-422

To further probe the coding-gene sets correlated with HOTAIR or lncRNA-422, we performed GSEA using the expression level of HOTAIR or lncRNA-422, as well as the whole microarray data of coding-gene mRNAs. Of the 186 gene sets of “oncogenic signature”, 149 were positively and 37 were negatively correlated with HOTAIR. Among them, the one with the highest NES was KRAS.LUNG.BREAST_UP.V1_UP, here named up-regulated in KRAS-over, which is a group of genes up-regulated in epithelial lung and breast cancer cell lines over-expressing an oncogenic form of the KRAS gene. The one with the lowest NES was JAK2_DN.V1_DN, here named down-regulated in JAK2-knockout, which comprises genes down-regulated in HEL cells (erythroleukemia) after knockdown of the JAK2 gene by RNAi. The GSEA results for lncRNA-422 showed that 60 gene sets were positively correlated with expression of lncRNA-422. The gene set with the highest NES was PDGF_UP.V1_DN, here called down-regulated in PDGF-over. This gene set represents genes down-regulated in SH-SY5Y cells (neuroblastoma) in response to platelet derived growth factor (PDGF) stimulation. There were 126 gene sets negatively correlated with lncRNA-422, and the one with the lowest NES was TBK1.DN.48HRS_DN, here called down-regulated in TBK1-knockout, which comprises genes down-regulated in epithelial lung cancer cell lines upon over-expression of an oncogenic form of the KRAS gene and knockdown of the TBK1 gene by RNAi. The four gene sets mentioned above are shown in Fig. 3A–D.

3.4. Gene ontology analysis

The top 20 protein-coding genes correlated with lncRNAs HOTAIR and lncRNA-422 are shown in Fig. 3E and F. Subsequently, GO analysis was performed to annotate the expression-correlated coding genes (top 3000 by Pearson correlation coefficient) in the categories of biological process, cellular component, and molecular function. The resulting GO terms with the 10 lowest P values are summarized in Fig. 4. We observed that the coding genes most correlated with HOTAIR or lncRNA-422 were enriched in the regulation of cell proliferation for both HOTAIR and lncRNA-422 (ontology category: biological process), cytosol for HOTAIR (ontology category: cellular component), membrane-enclosed lumen for lncRNA-422 (ontology category: cellular component), and RNA binding for both HOTAIR and lncRNA-422 (ontology category: molecular function). Part of the GO terms in molecular function and their connections were analyzed by GOEAST (Supplementary Figs. 1 and 2).

4. Discussion

With the expansion of research on human cancer, it has gradually been recognized that the expression pattern of lncRNAs is of great importance in revealing the pathogenesis of malignant tumors (Perez et al., 2008; Silva et al., 2010). Dysregulation in a specific tumor may be a strong hint of the biological function of lncRNAs in carcinogenesis. As the primary function of lncRNAs is epigenetic regulation of protein coding genes (Wang and Chang, 2011), exploration of the dysregulated lncRNAs may help in understanding molecular alterations of the whole genome, including coding RNAs and noncoding RNAs. High-throughput microarrays have been applied to obtain the expression profiles of lncRNAs in various cancers and have helped to identify a series of novel lncRNAs that play vital roles in carcinogenesis (Yang et al., 2013; Huang et al., 2013).

In order to probe the aberrant expression pattern of lncRNAs in colorectal cancer, we performed a lncRNA microarray assay using paired colorectal cancer and adjacent normal tissues. The profiling data suggested that numerous lncRNAs were significantly differentially expressed between cancer and normal tissues, which provided striking evidence for the important role of lncRNAs in colorectal cancer. In addition, some lncRNAs previously reported to be cancer-related were also found to exhibit an abnormal expression pattern in our results, e.g. SNHG1 (Chaudhry, 2013), NEAT1 (Kim et al., 2010) and HOTAIR (Gupta et al., 2010). We believe that our microarray result is reliable because lncRNAs known to be dysregulated in colorectal cancer were also picked out in our microarray assay. These consistent results not only support the reliability of our microarray data, but also indicate a degree of universality among various human cancers. Inevitably, some other lncRNAs previously demonstrated to be associated with carcinogenesis may have been missed in our present study. We attribute this inconsistency to the instability of lncRNA expression and the distinct distributions of study subjects.

Subsequently, the up-regulated HOTAIR, which was the first lncRNA reported to be involved in colorectal cancer and was differentially expressed in our data, as well as the down-regulated lncRNA-422, were selected for q-RT-PCR validation in an additional 95 pairs of tissues. Of the two, the novel lncRNA-422 provided novelty, while the known lncRNA HOTAIR supported the credibility of our study. The abnormal expression levels of these two lncRNAs found at the microarray stage were verified in the q-RT-PCR assay, suggesting a functional role for them in the pathogenesis of colorectal cancer.

Studies have demonstrated that lncRNAs exert diverse effects on the regulation of coding gene expression (Mercer et al., 2009; Moran et al., 2012). Therefore, detecting the expression-correlated coding genes is an effective way to reveal the putative functions of a specific lncRNA. In the present study, we explored the regulatory role of lncRNA HOTAIR and lncRNA-422 by bioinformatic approaches, i.e. GSEA and GO analyses. GSEA uses a computational algorithm to determine whether a predefined functional gene set shows a concordant difference between two groups of phenotypes at a statistically significant level. Results of GSEA showed that the gene set with the highest

NES enriched in the phenotype positively correlated with HOTAIR was the one named up-regulated in KRAS-over, which includes PRSS3, MMP10, CXCL2 and many other oncogenic genes induced in cells overexpressing the KRAS gene. This may indicate a regulatory function of HOTAIR in KRAS-induced genes. Besides, enriched gene sets with relatively high NES also included P53_DN.V1_UP, RB_DN.V1_UP and E2F3_UP.V1_UP, representing genes up-regulated in the NCI-60 panel of cell lines with mutated TP53, genes up-regulated in primary keratinocytes from RB1 skin specific knockout mice and genes up-regulated in primary epithelial breast cancer cell culture over-expressing the E2F3 gene, respectively. These results provided further support for the tumor-promoting role of HOTAIR. As for the GSEA results of lncRNA-422, the enriched gene set with the highest NES was down-regulated in PDGF-over, which consists of a group of genes down-regulated in cancer cells in response to PDGF stimulation (Antipova et al., 2008). Besides, the tumor-inhibited gene sets CYCLIN_D1_UP.V1_DN and LEF1_UP.V1_DN were also enriched in the lncRNA-422 positive correlation phenotype. It is worth mentioning that LEF1_UP.V1_DN represents genes down-regulated in colon carcinoma cells over-expressing lymphoid enhancer-binding factor 1 (LEF1), suggesting that LEF1 may be a potential target of lncRNA-422 in the carcinogenesis of colorectal cancer. We also found that the gene set most negatively correlated with lncRNA-422 was down-regulated in TBK1-knockout, which is concordant with the potential inhibitory role of lncRNA-422 in carcinogenesis observed in the present study and indicates an interaction between the oncogene TBK1 and lncRNA-422. However, the gene set predicted to be most negatively correlated with HOTAIR was a group of cancer up-regulated genes. We assume that this discrepancy may arise from the different gene expression patterns between HEL cells (in which the gene set was identified) and our colorectal tissues.

The GO project provides a controlled vocabulary of terms for describing gene product characteristics and function annotation data from GO consortium members. The present GO analysis described the biological process, cellular component and molecular function of the most correlated coding genes in the form of GO terms. In general, most of the biological processes involved were cancer-related, e.g. regulation of cell proliferation (Heger et al., 2013), cell death (Cerella et al., 2014), response to endogenous stimulus (Didelot et al., 2006), etc. This provided credible evidence for a functional role of the dysregulated lncRNAs, HOTAIR and lncRNA-422, in the mechanism of tumor formation. We hypothesize that HOTAIR and lncRNA-422 may participate in carcinogenesis through regulation of protein coding genes involved in specific biological processes relevant to cancer.

Although studies have gained insight into the role of lncRNAs in colorectal cancer, a comprehensive study describing the differential expression pattern between colorectal cancer tissue and normal tissue has not yet been reported. Using high-throughput microarray approaches, we identified a series of lncRNAs differentially expressed in colorectal cancer tissues, and verified the aberrant expression levels of HOTAIR and lncRNA-422 in a larger set of samples. Furthermore, we applied bioinformatic analyses to predict the biological functions of these two lncRNAs and found solid support for their involvement in carcinogenesis. The most notable advantages of our study are that we interpreted lncRNA expression patterns from the viewpoint of both global profiles and two representative lncRNAs, and that we analyzed the correlated coding genes across the whole genome rather than focusing on the coding genes neighboring the lncRNAs. Further studies into the detailed mechanisms of the differentially expressed lncRNAs identified in our study are warranted.

Acknowledgments

This study was partly supported by the National Natural Science Foundation of China (81230068, 81201570, 81373091 and ), the Natural Science Foundation of Jiangsu Province (BK2011773), the Key Program for Basic Research of Jiangsu Provincial Department of Education (12KJA330002 and 11KJB330002), and the Project Funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions (Public Health and Preventive Medicine).

Conflict of interest

Authors declare no conflicts of interest.

Appendix A. Supplementary data

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.gene.2014.11.060.

References

Antipova, A.A., Stockwell, B.R., Golub, T.R., 2008. Gene expression-based screening for inhibitors of PDGFR signaling. Genome Biol. 9, R47.
Bartel, D.P., 2009. MicroRNAs: target recognition and regulatory functions. Cell 136, 215–233.
Cerella, C., Teiten, M.H., Radogna, F., Dicato, M., Diederich, M., 2014. From nature to bedside: pro-survival and cell death mechanisms as therapeutic targets in cancer treatment. Biotechnol. Adv. 32, 1111–1122.
Chaudhry, M.A., 2013. Expression pattern of small nucleolar RNA host genes and long non-coding RNA in X-rays-treated lymphoblastoid cells. Int. J. Mol. Sci. 14, 9099–9110.
Colussi, D., Brandi, G., Bazzoli, F., Ricciardiello, L., 2013. Molecular pathways involved in colorectal cancer: implications for disease behavior and prevention. Int. J. Mol. Sci. 14, 16365–16385.
Cunnington, M.S., Santibanez Koref, M., Mayosi, B.M., Burn, J., Keavney, B., 2010. Chromosome 9p21 SNPs associated with multiple disease phenotypes correlate with ANRIL expression. PLoS Genet. 6, e1000899.
Didelot, C., Schmitt, E., Brunet, M., Maingret, L., Parcellier, A., Garrido, C., 2006. Heat shock proteins: endogenous modulators of apoptotic cell death. Handb. Exp. Pharmacol. 171–198.
Ge, X., Chen, Y., Liao, X., et al., 2013. Overexpression of long noncoding RNA PCAT-1 is a novel biomarker of poor prognosis in patients with colorectal cancer. Med. Oncol. 30, 588.
Gibb, E.A., Brown, C.J., Lam, W.L., 2011. The functional role of long non-coding RNA in human carcinomas. Mol. Cancer 10, 38.
Gupta, R.A., Shah, N., Wang, K.C., et al., 2010. Long non-coding RNA HOTAIR reprograms chromatin state to promote cancer metastasis. Nature 464, 1071–1076.
Heger, Z., Zitka, O., Krizkova, S., Beklova, M., Kizek, R., Adam, V., 2013. Molecular biology of beta-estradiol–estrogen receptor complex binding to estrogen response element and the effect on cell proliferation. Neuro Endocrinol. Lett. 34 (Suppl. 2), 123–129.
Huang, J.F., Guo, Y.J., Zhao, C.X., et al., 2013. Hepatitis B virus X protein (HBx)-related long noncoding RNA (lncRNA) down-regulated expression by HBx (Dreh) inhibits hepatocellular carcinoma metastasis by targeting the intermediate filament protein vimentin. Hepatology 57, 1882–1892.
Jemal, A., Bray, F., Center, M.M., Ferlay, J., Ward, E., Forman, D., 2011. Global cancer statistics. CA Cancer J. Clin. 61, 69–90.
Ji, P., Diederichs, S., Wang, W., et al., 2003. MALAT-1, a novel noncoding RNA, and thymosin beta4 predict metastasis and survival in early-stage non-small cell lung cancer. Oncogene 22, 8031–8041.
Karsa, L.V., Lignini, T.A., Patnick, J., Lambert, R., Sauvaget, C., 2010. The dimensions of the CRC problem. Best Pract. Res. Clin. Gastroenterol. 24, 381–396.
Kim, Y.S., Hwan, J.D., Bae, S., Bae, D.H., Shick, W.A., 2010. Identification of differentially expressed genes using an annealing control primer system in stage III serous ovarian carcinoma. BMC Cancer 10, 576.
Kogo, R., Shimamura, T., Mimori, K., et al., 2011. Long noncoding RNA HOTAIR regulates polycomb-dependent chromatin modification and is associated with poor prognosis in colorectal cancers. Cancer Res. 71, 6320–6326.
Lee, J.T., Bartolomei, M.S., 2013. X-inactivation, imprinting, and long noncoding RNAs in health and disease. Cell 152, 1308–1323.
Li, J., Xuan, Z., Liu, C., 2013. Long non-coding RNAs and complex human diseases. Int. J. Mol. Sci. 14, 18790–18808.
Marchese, F.P., Huarte, M., 2014. Long non-coding RNAs and chromatin modifiers: their place in the epigenetic code. Epigenetics 9, 21–26.
Mattick, J.S., 2001. Non-coding RNAs: the architects of eukaryotic complexity. EMBO Rep. 2, 986–991.
Mattick, J.S., 2004. RNA regulation: a new genetics? Nat. Rev. Genet. 5, 316–323.
Mercer, T.R., Dinger, M.E., Mattick, J.S., 2009. Long non-coding RNAs: insights into functions. Nat. Rev. Genet. 10, 155–159.
Moran, V.A., Perera, R.J., Khalil, A.M., 2012. Emerging functional and mechanistic paradigms of mammalian long non-coding RNAs. Nucleic Acids Res. 40, 6391–6400.
Perez, D.S., Hoage, T.R., Pritchett, J.R., et al., 2008. Long, abundantly expressed non-coding transcripts are altered in cancer. Hum. Mol. Genet. 17, 642–655.
Petrovics, G., Zhang, W., Makarem, M., et al., 2004. Elevated expression of PCGEM1, a prostate-specific gene with cell growth-promoting function, is associated with high-risk patients. Oncogene 23, 605–611.
Ponting, C.P., Belgard, T.G., 2010. Transcribed dark matter: meaning or myth? Hum. Mol. Genet. 19, R162–R168.
Ponting, C.P., Oliver, P.L., Reik, W., 2009. Evolution and functions of long noncoding RNAs. Cell 136, 629–641.

Qi, P., Xu, M.D., Ni, S.J., et al., 2013. Low expression of LOC285194 is associated with poor prognosis in colorectal cancer. J. Transl. Med. 11, 122.
Silva, J.M., Perez, D.S., Pritchett, J.R., Halling, M.L., Tang, H., Smith, D.I., 2010. Identification of long stress-induced non-coding transcripts that have altered expression in cancer. Genomics 95, 355–362.
Sung, J.J., Lau, J.Y., Goh, K.L., Leung, W.K., 2005. Increasing incidence of colorectal cancer in Asia: implications for screening. Lancet Oncol. 6, 871–876.
Vaiopoulos, A.G., Athanasoula, K.C., Papavassiliou, A.G., 2014. Epigenetic modifications in colorectal cancer: molecular insights and therapeutic challenges. Biochim. Biophys. Acta 1842, 971–980.
Wang, K.C., Chang, H.Y., 2011. Molecular mechanisms of long noncoding RNAs. Mol. Cell 43, 904–914.
Wapinski, O., Chang, H.Y., 2011. Long noncoding RNAs and human disease. Trends Cell Biol. 21, 354–361.
Whitehead, J., Pandey, G.K., Kanduri, C., 2009. Regulation of the mammalian epigenome by long noncoding RNAs. Biochim. Biophys. Acta 1790, 936–947.
Wilusz, J.E., Sunwoo, H., Spector, D.L., 2009. Long noncoding RNAs: functional surprises from the RNA world. Genes Dev. 23, 1494–1504.
Yang, F., Huo, X.S., Yuan, S.X., et al., 2013. Repression of the long noncoding RNA-LET by histone deacetylase 3 contributes to hypoxia-mediated metastasis. Mol. Cell 49, 1083–1096.
Yoon, J.H., Abdelmohsen, K., Gorospe, M., 2013. Posttranscriptional gene regulation by long noncoding RNA. J. Mol. Biol. 425, 3723–3730.