Transcriptome Analysis Uncovers the Diagnostic Value of Mir-192-5P/HNF1A-AS1/VIL1 Panel in Cervical Adenocarcinoma
Total Page:16
File Type:pdf, Size:1020Kb
www.nature.com/scientificreports OPEN Transcriptome analysis uncovers the diagnostic value of miR‑192‑5p/ HNF1A‑AS1/VIL1 panel in cervical adenocarcinoma Junfen Xu1*, Jian Zou1, Luyao Wu1 & Weiguo Lu1,2* Despite the fact that the incidence of cervical squamous cell carcinoma has decreased, there is an increase in the incidence of cervical adenocarcinoma. However, our knowledge on cervical adenocarcinoma is largely unclear. Transcriptome sequencing was conducted to compare 4 cervical adenocarcinoma tissue samples with 4 normal cervical tissue samples. mRNA, lncRNA, and miRNA signatures were identifed to discriminate cervical adenocarcinoma from normal cervix. The expression of VIL1, HNF1A‑AS1, MIR194‑2HG, SSTR5‑AS1, miR‑192‑5p, and miR‑194‑5p in adenocarcinoma were statistically signifcantly higher than that in normal control samples. The Receiver Operating Characteristic (ROC) curve analysis indicated that combination of miR‑192‑5p, HNF1A‑AS1, and VIL1 yielded a better performance (AUC = 0.911) than any single molecule -and could serve as potential biomarkers for cervical adenocarcinoma. Of note, the combination model also gave better performance than TCT test for cervical adenocarcinoma diagnosis. However, there was no correlation between miR‑192‑5p or HNF1A‑AS1 and HPV16/18 E6 or E7. VIL1 was weakly correlated with HPV18 E7 expression. In summary, our study has identifed miR‑192‑5p/HNF1A‑AS1/VIL1 panel that accurately discriminates adenocarcinoma from normal cervix. Detection of this panel may provide considerable clinical value in the diagnosis of cervical adenocarcinoma. Abbreviations SCC Cervical squamous cell carcinoma NcRNA Non-coding RNA MiRNA MicroRNA LncRNA Long non-coding RNA ROC Receiver operating characteristic AUC Area under the ROC curve RT-qPCR Quantitative reverse-transcriptase PCR HNF1A-AS1 LncRNA HNF1A antisense RNA 1 VIL1 Villin 1 Cervical cancer ranks fourth for both cancer incidence and mortality in women worldwide 1. Te predomi- nant cases are cervical squamous cell carcinoma (SCC), and the second pool (about 10–25%) is cervical adenocarcinoma2,3. Te former has been greatly prevented by cytological screening and HPV vaccination in developed countries. In contrast, a relative and absolute increasing proportion of adenocarcinoma was seen over the past decades, especially in women aged 20 to 39 years old 4–8. Tus, it is important to investigate the pathogenesis of cervical adenocarcinoma and its related molecular alterations. High-risk human papillomavirus (HR-HPV) is the major cause of cervical cancer, especially in SCC. However, HR-HPV infection rates vary from 60 to 85.8% in cervical adenocarcinoma 9–13. At least 15% of cervical adenocar- cinoma is not related to HPV infection. Other factors may also contribute to the development of adenocarcinoma. Moreover, it has been reported that cervical adenocarcinoma has a higher rate of metastasis and resistance to 1Department of Gynecologic Oncology, Women’s Hospital, Zhejiang University School of Medicine, Hangzhou 310006, Zhejiang, China. 2Center of Uterine Cancer Diagnosis & Therapy of Zhejiang Province, Hangzhou 310006, Zhejiang, China. *email: [email protected]; [email protected] SCIENTIFIC REPORTS | (2020) 10:16584 | https://doi.org/10.1038/s41598-020-73523-0 1 Vol.:(0123456789) www.nature.com/scientificreports/ chemotherapy and radiotherapy than that of SCC14,15. In view of this clinical challenge, there is an urgent need to elucidate molecular mechanisms and explore new reliable prevention strategy for cervical adenocarcinoma. Various types of biomarkers, especially RNA biomarkers, have been applied to clinical diagnosis for diseases for decades. Recently, high-throughput RNA sequencing technology facilitates the quantifcation measurements of RNA expression at the entire transcriptome level. Tese large-scale expression profles of RNAs enable the detection of protein-coding RNAs (i.e., mRNAs) and non-coding RNAs (i.e., ncRNAs), which could be of pro- found value in terms of disease characterization. As the best-characterized type of RNA biomarkers, the mRNA biomarkers have successfully programmed into multi-gene panels for cancer diagnosis 16–19. Lots of functionally important ncRNAs can also be used as biomarkers for increasing evidence of abnormal expression of ncRNAs closely associated with various cancers. For instance, a blind study of 22 diferent tumor types showed that microRNA (miRNA) expression pattern could accurately classify tumors according to tissue of origin20. It was reported that the expression of miR-27a is decreased in cervical cancer cell lines and it functions as a tumor suppressor in cervical adenocarcinoma by inhibiting TGF-βRI signaling pathway21. MiR-362-3p downregula- tion in cervical adenocarcinoma is associated with advanced clinical stage and poor prognosis via targeting minichromosome maintenance protein 5 (MCM5)22. In addition, long non-coding RNAs (lncRNAs) could be considered as biomarkers. For instance, lncRNA HOTAIR plays important roles in the development of cancer 23–26. In cervical cancer, it was reported that high levels of circulating HOTAIR are correlated with tumor recurrence and short overall survival27,28. To date, the lncRNA profle in cervical adenocarcinoma remains largely unknown. In this study, we comprehensively examined the mRNA and ncRNA expression profles (including miRNAs and lncRNAs) in cervical adenocarcinoma and normal cervix by using transcriptome sequencing. We aim to identify and validate a mRNA/ncRNA-combined signature that could improve the screening or early detection of cervical adenocarcinoma. We revealed that a panel with miR-192-5p, HNF1A-AS1 and VIL1 could accurately discriminate adenocarcinoma from normal cervix. Methods Human tissue samples. Tree groups of cervical adenocarcinoma and normal cervical tissue samples were all collected from the Women’s Hospital, Zhejiang University School of Medicine. Cervical samples, including 4 normal and 4 adenocarcinoma tissue samples for RNA sequencing were obtained from January 2018 to February 2018. Te second panel of cervical tissue samples containing 20 normal and 20 adenocarcinoma tissue samples for the frst step validation by RT-qPCR analysis, were collected from March 2018 to March 2019. Te third panel of cervical tissue samples including 57 normal and 141 adenocarcinoma tissue samples for the second step validation by RT-qPCR analysis, were obtained from January 2009 to December 2017. None of these patients received radiotherapy or chemotherapy before operation. Diagnosis of these samples was confrmed indepen- dently by two senior pathologists before further analysis. Afer removal from the patients, tissue samples were put immediately in liquid nitrogen and stored at -80 °C until use. Tis study was approved by the Ethics Com- mittee of Women’s Hospital of Zhejiang University (IRB-2019062-R). All steps of the study were performed with the relevant guidelines and regulations (according to the Declaration of Helsinki Principles). Written informed consent was obtained from each patient prior to the study. Te three diferent phases of our study design are shown in Fig. 1A and the patients’ clinical-pathological information is summarized in supplementary Table S1. Transcriptome sequencing. Four cervical adenocarcinoma tissues and 4 normal cervical tissues were used for transcriptome sequencing. Transcriptome Sequencing was performed by NovelBio Corp. Laboratory, Shanghai, China. Our transcriptome sequencing data have been deposited in NCBI Gene Expression Omnibus (GEO) datasets with the accession number GSE145372 (https ://www.ncbi.nlm.nih.gov/geo). RNA was utilized to construct both rRNA depletion library and miRNA library. Te products were sequenced on Hiseq Sequencer and HiSeq Xten platform (Illumina, San Diego, CA), respectively. Te reads were then mapped to Human genome (GRCh38 NCBI) utilizing HISAT2 (-5 5 -3 5 -p 8 -min-intronlen 20 -max-intronlen 500,000 -k 3)29 and HTSeq was used to calculate the gene count of mRNA and lncRNA annotated by NCBI genome gf3 fle30. To achieve miRNA expression, the reads were fltered (length ranged from 15 to 33 bp; Q20 > 20%; Trim end Q15) and mapped to Human miRNA database (miRBase v22.0) and Human genome (GRCh38 NCBI), together with mapping to the RFAM database (https ://rfam.xfam.org/) for miRNA quality control by BWA (bwa aln -n 0.04 -e 3 -l 32 -k 2 -t 12)31. We applied DESeq2 package32 to discover the diferentially expressed mRNA, lncRNA, and miRNA under the following criteria: Absolute Fold Change > 2 and FDR < 0.05. For heatmap generation, the sample FPKM (Fragments per kilobases million reads) values of Diferent expres- sion genes were clustered using Sofware Cluster 3.0 (https://bonsa i.hgc.jp/~mdeho on/sofw are/clust er/sofw are. htm). Te rows and columns expression matrix was centered according to the mean value and fnally normalized to value ranged from − 1 to 1. Average linkage was applied to cluster genes and samples with the uncentered correlation method. Te heatmap was fnally visualize by TreeView (https://bonsa i.hgc.jp/~mdehoon/sofw are/ clust er/sofw are.htm). Te interactions among lncRNAs, miRNAs and mRNAs were predicted using the miRNA target predic- tion sofware miranda (-strict -sc 150 -en -20)33, RNAhybrid (-e -25 -b 1 -f 2,7 -m 2,000,000 -s 3utr_human -n 50)34 for non-coding RNA target and targetscan for mRNA target35. An mRNA-miRNA-lncRNA network was constructed according to the consistent target miRNAs of the lncRNAs and mRNAs. Co-expression network was constructed according to