J Biosci (2020) 45:104 © Indian Academy of Sciences

DOI: 10.1007/s12038-020-00075-w

Differential transcriptome analysis in HPV-positive and HPV-negative cervical cancer cells through CRISPR knockout of miR-214

PRAKRITI SEN (1), POOJA GANGULY (1), KIRTI K KULKARNI (2), ROLI BUDHWAR (2) and NILADRI GANGULY (1)*
(1) Cancer Biology Lab, School of Biotechnology, KIIT University, Bhubaneswar, India
(2) Bionivid Technology Pvt. Ltd., Kasturi Nagar, Bengaluru, India

*Corresponding author (Email, [email protected])
MS received 14 January 2020; accepted 19 July 2020

In this study, we investigated the effects of a tumour suppressor microRNA, miR-214, on gene expression in HPV-positive (CaSki) and HPV-negative (C33A) cervical cancer cells by RNA sequencing using next generation sequencing. The HPV-positive and HPV-negative cervical cancer cells were either miR-214-knocked-out or miR-214-overexpressed. Gene expression analysis showed that a total of 904 genes were upregulated and 365 genes were downregulated between HPV-positive and HPV-negative cervical cancer cells with a fold change of ±2. Furthermore, 11 differentially expressed and relevant genes (TNFAIP3, RAB25, MET, CYP1B1, NDRG1, CD24, LOXL2, CD44, PMS2, LATS1 and MDM1) that showed a fold change of ±5 were selected for confirmation by real-time PCR. This study is the first report of the effect of miR-214 on global gene expression in the context of HPV.

Keywords. Cervical cancer; CRISPR; Human papilloma virus; miR-214; RNA transcriptome

Abbreviations: 3′-UTR, 3′-untranslated region; ATCC, American Type Culture Collection; CC, cervical cancer; CIN, cervical intraepithelial neoplasia; CRISPR, clustered regularly interspaced short palindromic repeats; DAVID, the database for annotation, visualization and integrated discovery; DEG, differentially expressed genes; DMEM, Dulbecco’s Modified Eagle Medium; FBS, fetal bovine serum; GO, gene ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; miR, microRNA; NGS, next generation sequencing; NIAID, National Institute of Allergy and Infectious Diseases; qRT-PCR, quantitative real-time PCR; RIN, RNA integrity number; RNA-seq, RNA-sequencing; sgRNA, single guide RNA; SRA, sequence read archive.

1. Introduction

According to the World Cancer Report, cervical cancer, or cancer of the uterine cervix, is the second most common cancer and the fourth most common cause of cancer death in women worldwide. It has also been reported that in 2012 an estimated 528,000 cases occurred, with 266,000 deaths around the world (Bray et al. 2018). The most common etiological agent of cervical cancer is infection with a virus called human papillomavirus, or HPV (Sen et al. 2018). Cervical carcinoma progresses from normal tissue to low-grade squamous intraepithelial lesion (LSIL) to high-grade squamous intraepithelial lesion (HSIL) to carcinoma in situ (CIS) and lastly to metastatic disease (Hopman and Ramaekers 2017). Oncogenic human papillomaviruses, mostly HPV types 16, 18, 31, 33, etc., are associated with invasive lesions of cervical cancer. There are over 200 distinct HPV types (Agorastos et al. 2015). Human papillomaviruses carry their viral genome in a capsid composed of 72 capsomeres. The genome is split into

three areas: the Long Control Region (LCR), which has no coding capacity; the Early Region (E1–E7); and the Late Protein Region (L1 and L2) (Ghittoni et al. 2010). The LCR, Early Region and Late Region regulate DNA replication, oncogenesis and cell transformation, and viral capsid formation, respectively. The early region of HPV is used to determine the risk factor involved in cervical cancer. The HPV types found in benign tumors are the low-risk types HPV-6, 11, 42, 43 and 44, whereas the high-risk types found in invasive and malignant tumors are HPV-16, 18, 31, 33, 35, 51, 52 and 58 (Goodman 2015).

MicroRNAs (miRs) are a class of small non-coding single-stranded RNAs composed of nearly 19-24 nucleotides. Depending on the type of cancer, oncogenic miRs are upregulated and tumor suppressor miRs are downregulated (Sochor et al. 2014; Li et al. 2016). Around 250 microRNAs are aberrantly expressed in cervical cancer cells, and miR-214 is one of them (Kawai et al. 2018). miR-214 is reported to be downregulated in all three stages of cervical intraepithelial neoplasia (CIN), namely stages 1, 2 and 3. Cellular activities such as cell proliferation, apoptosis, cell invasion, metastasis and angiogenesis in cervical carcinoma are regulated by miR-214 (Penna et al. 2015; Tomasetti et al. 2016; Frediani and Fabbri 2016). Recently, profiling of individual miRNAs has suggested a crosstalk between miR expression and progression of disease in cervical cancer (Berdasco and Esteller 2017; Abba et al. 2017; Rupaimoole and Slack 2017).

Transcriptome analysis using RNA-sequencing (RNA-seq) has emerged as a powerful tool to study gene expression profiles (Byron et al. 2016; Uhlen et al. 2017). The major drawbacks of microarray-based gene expression analysis, such as cross-hybridization between similar sequences, can be overcome by transcriptome analysis (Mao et al. 2017). It also offers accurate single nucleotide resolution, which permits discrimination between highly related sequences, and provides increased sensitivity to detect rare sequences (Jaskowiak et al. 2018). Furthermore, RNA-seq provides a unique advantage in quantitatively analyzing RNA expression levels. Based on the above-mentioned advantages, NGS technology has replaced microarrays as the tool of choice for RNA analysis (Imam 2016).

In this study, we performed a comparative transcriptome analysis between HPV-positive (CaSki) and HPV-negative (C33A) cervical cancer cells having either overexpression or knockout of miR-214. This is the first report of comparative gene expression on the effect of miR-214 between HPV-negative and HPV-positive cervical cancer cells.

Electronic supplementary material: The online version of this article (https://doi.org/10.1007/s12038-020-00075-w) contains supplementary material, which is available to authorized users.
http://www.ias.ac.in/jbiosci

2. Materials and methods

2.1 Cell lines and reagents

The C33A and CaSki cervical cancer-derived cell lines were obtained from the American Type Culture Collection (ATCC). The C33A cell line was grown as a monolayer and maintained in Dulbecco’s Modified Eagle Medium (DMEM) (GIBCO, Life Technologies), and the CaSki cell line was grown as a monolayer and maintained in RPMI 1640 medium (GIBCO, Life Technologies). The media were supplemented with 1% antibiotics consisting of 100 U/ml penicillin and 10 mg/ml streptomycin (GIBCO, Life Technologies), 10% Fetal Bovine Serum (FBS) (GIBCO, Life Technologies) and 1% (w/v) L-glutamine (GIBCO, Life Technologies), and the cells were kept at 37 °C in a humidified atmosphere containing 5% CO2.

2.2 Knockdown of miR-214 using the CRISPR/Cas9 technique and confirmation of gene expression by RNA extraction followed by quantitative RT-PCR

Using the CRISPR/Cas9 technique, miR-214 was knocked down in the cervical cancer cell lines C33A (C1) and CaSki (K1). Top and bottom sgRNA oligo inserts for miR-214 were designed with the online CRISPR Design Tool (http://crispr.mit.edu/) (miR-214a forward: 5′-GACTGAGAGCGTTGTCTGTCT-3′; miR-214a reverse: 5′-TGCCATCTGTCTGCCTGT-3′). The pSpCas9(BB)-2A-Puro vector was obtained from Addgene (plasmid ID: 48139) for cloning the sgRNA oligos. First, the top and bottom sgRNA strands were phosphorylated and annealed at 37 °C for 30 min, then 95 °C for 5 min, and ramped down to 25 °C at 5 °C min^-1. Next, the pSpCas9(BB)-2A-Puro vector was digested using the BbsI restriction endonuclease at 37 °C for 1–16 h. The diluted sgRNAs were then ligated into the pSpCas9(BB)-2A-Puro vector with T4 DNA Ligase at 16 °C overnight, followed by heat inactivation at 65 °C for 10 minutes. For transformation, 5 µl of the above mix was added to 50 µl of E. coli DH5α chemically competent cells. Plasmid from positive clones was selected, isolated and purified using the Qiagen MidiPrep Plasmid Isolation Kit. The purified plasmid DNAs were then used for transfection of C33A (C1) and CaSki (K1) cervical cancer cells. For this, cells were seeded overnight in a 60 mm cell culture dish at a density of 6 × 10^5 cells/ml in medium containing only 10% FBS and no antibiotics (100 U/ml penicillin and 10 mg/ml streptomycin). A total of 500 ng of sgRNAs at equimolar ratios was transfected using Lipofectamine 2000. After 48 h, 2 µg/ml of the mammalian selective antibiotic puromycin was added to the cells to screen for positively transfected cells. After 3 days, the puromycin dose was changed to 1 µg/ml and maintained thereafter.

Gene expression was confirmed by subjecting the stably transfected cells, C33A knockdown (C3) and CaSki knockdown (K3), to total RNA extraction using TRIZOL reagent (Invitrogen). Reverse transcription was performed using the SuperScript™ IV First-Strand Synthesis System (Invitrogen, Thermo Fisher Scientific) as per the manufacturer’s protocol. The RT primer used for miR-214 was 5′-TCGTATCCAGTGCAGGGTCCGAGGTGCACTGGATACGACACTGCCTG-3′. Quantitative real-time PCR (qRT-PCR) analysis was then performed for the miR-214 gene using Fast SYBR® Green Master Mix (2×) (Applied Biosystems, Thermo Fisher Scientific). β-actin was used as an endogenous control and mRNA fold changes were calculated using the 2^-ΔΔCT method. The primer sequences used for qRT-PCR were: miR-214 forward, 5′-TGCGGACAGCAGGCACAGAC-3′, and reverse, 5′-CCAGTGCAGGGTCCGAGGT-3′; β-actin forward, 5′-TCACCCACACTGTGCCCATCTACGA-3′, and reverse, 5′-CAGCGGAACCGCTCATTGCCAATGG-3′. All primers were purchased from IDT (Integrated DNA Technologies, USA) and were used according to the manufacturer’s protocol. All PCRs were performed in a BioRad Thermocycler.

2.3 Over-expression of miR-214 by pcDNA-HisA plasmid transfection and confirmation of gene expression by RNA extraction followed by quantitative RT-PCR

Over-expression of miR-214 was performed by transfecting C33A (C1) and CaSki (K1) with 500 ng of equimolar miR-214-cloned pcDNA HisA plasmid obtained from GeneArt (Life Technologies). After 48 h, 200 µg/ml of the mammalian selective antibiotic G418 was added to the cells to screen for positively transfected cells. After 3 days, the dose of G418 was reduced to 100 µg/ml and maintained thereafter. RNA from the stable miR-214-overexpressing C33A (C2) and CaSki (K2) cells was extracted using the TRIZOL reagent. Next, the cDNAs of C2 and K2 were prepared with the SuperScript™ IV First-Strand Synthesis System Kit (Invitrogen, Thermo Scientific) using a gene-specific primer for miR-214. Gene expression was confirmed using PCR as described above.

2.4 Sample preparation followed by library construction and sequencing

For library construction, total RNA was extracted from the samples. After quality control (QC), the samples that passed proceeded to library construction. Extracted RNA with an RNA integrity number (RIN) of 7.0 was used for mRNA purification. The mRNAs were purified using oligo-dT beads (TruSeq RNA Sample Preparation Kit, Illumina) from 1 µg of intact total RNA. Next, the purified mRNAs were fragmented at 90 °C in the presence of divalent cations. The fragments were reverse transcribed using random hexamers and Superscript II Reverse Transcriptase (Life Technologies). Second strand cDNA was synthesized on the first strand template using RNaseH and DNA polymerase I. The cDNAs so obtained were cleaned using Beckman Coulter Agencourt Ampure XP SPRI beads.

The sequencing library was prepared by random fragmentation of the cDNA samples obtained above, followed by 5′ and 3′ adapter ligation, after end-repair, the addition of an ‘A’ base and SPRI cleanup. The prepared cDNA library was next amplified using PCR to enrich the adapter-ligated fragments. On the other hand, a process known as ‘tagmentation’ combines the fragmentation and ligation reactions into a single step that greatly increases the efficiency of the library preparation process. The individual libraries were then quantified using a NanoDrop spectrophotometer (Thermo Scientific) and confirmed for quality with a Bioanalyzer (Agilent Technologies).

For cluster generation, the library was loaded into a flow cell where fragments were captured on a lawn of surface-bound oligos complementary to the library adapters. Each fragment was then amplified into distinct, clonal clusters through bridge amplification. When cluster generation was completed, the templates were ready for sequencing. Illumina SBS technology utilized a proprietary reversible terminator-based method that detected single bases as they were incorporated into DNA template strands. As all 4 reversible, terminator-bound dNTPs were present during each sequencing cycle, natural competition minimized incorporation bias and greatly reduced raw error rates compared to other technologies.

2.5 RNA sequencing and data analysis

Stringent quality control of the paired-end sequence reads of all the samples was done using the NGSQC Tool kit. Paired-end sequence reads with Phred score > Q30 were taken for further analysis. The UCSC human genome (hg38) was used for read alignment and identification of transcripts. The TopHat pipeline was used for alignment. Differential expression analysis was done by comparing untreated C33A (C1) vs. untreated CaSki (K1), miR-214-overexpressed C33A (C2) vs. miR-214-overexpressed CaSki (K2), and miR-214 knockdown C33A (C3) vs. miR-214 knockdown CaSki (K3). The sequence read archive (SRA) metadata was created and uploaded under the project number PRJNA484129, which includes the complete RNA transcriptomics study. Differentially expressed transcripts between Treated and Control samples were identified by the CuffDiff data analysis pipeline. Statistical analysis was performed by Student’s t-test; a p-value ≤ 0.05 and a log2FC cutoff of 2 were considered significant. We computed both the p-value and the q-value as a part of our standard analysis. To estimate the degree of biological variation and to identify differentially expressed transcripts with an acceptable false discovery rate (FDR), we proceeded with p-value as FDR to measure the total number of biologically meaningful differentially expressed genes (Kim and Bang 2016). In K1 vs. C1 (Treated vs. Control), out of 16,128 expressed transcripts, according to the cutoff, 285 transcripts were upregulated (log2FC ≥ 2 and p-value ≤ 0.05) and 131 were downregulated (log2FC ≤ -2 and p-value ≤ 0.05). In K2 vs. C2, out of 16,016 expressed transcripts, according to the cutoff, 319 transcripts were upregulated and 111 were downregulated. In K3 vs. C3, out of 15,411 expressed transcripts, according to the cutoff, 300 transcripts were upregulated and 123 were downregulated.

2.6 Biological pathways and gene ontology enrichment (GO) analysis

GO and KEGG pathway analyses were done for significant genes using the DAVID Functional Annotation Tool. Gene ontologies and pathways, along with the differentially expressed genes, were used as input for BridgeIsland Software (Bionivid Technology Pvt Ltd, Bangalore, India) to identify key edges that connect genes to GO terms and pathways. Statistical scores from the differential expression and biological analyses were used as attributes to visualize the network. The output of BridgeIsland Software was used as input to Cytoscape v2.8.3, where the Force Directed Spring Embedded Layout algorithm was used to visualize the network encompassing the biological categories and differentially expressed genes that were significantly enriched. In the gene regulatory figures, gene nodes were colored according to their log2FC (red for upregulation and green for downregulation), and processes were colored in blue.

2.7 RNA extraction and quantitative RT-PCR

Real-time qPCR (RT-qPCR) was performed to further validate 11 selected genes with log2FC ≤ -5 or log2FC > 5 from the NGS results. Total RNA previously harvested using TRIZOL reagent (Invitrogen) from C1 and K1 (control C33A and CaSki cells, respectively), C2 and K2 (miR-214-overexpressed C33A and CaSki cells, respectively) and C3 and K3 (miR-214 knockdown C33A and CaSki cells, respectively) was reverse transcribed into cDNA with random hexamer primers using the SuperScript™ IV First-Strand Synthesis System (Invitrogen, Thermo Fisher Scientific) as per the manufacturer’s protocol. Quantitative real-time PCR was performed using the Fast SYBR Green PCR Master Mix (Life Technologies, Invitrogen) and the BIO-RAD CFX96 Real-Time PCR Detection System (BIO-RAD, USA). The sequences of all the primers are given in supplementary table 4. Each sample was analyzed in duplicate. The expression levels of mRNA were normalized to the internal control β-actin and subsequently calculated using the 2^-ΔΔCT method.

3. Results

3.1 Validation of miR-214 knockout by CRISPR-Cas9 and overexpression using pcDNA His A vector in HPV-positive and HPV-negative cervical cancer cell lines (CaSki and C33A) respectively

Before establishment of the miR-214 knockout and overexpressing stable cell lines, the basal level of miR-214 was measured by quantitative RT-PCR. In this study we compared the expression level of miR-214 in HPV-positive (CaSki) and HPV-negative (C33A) cell lines. The expression level of miR-214 was normalized against its expression level in a normal epithelial cell line, HaCaT. As both CaSki and C33A are cervical epithelial cells, a normal epithelial cell line was used as the reference for comparing the expression level between these two cell lines. This showed a significant decrease in the miRNA fold change in normal C33A and CaSki (supplementary figure 1A). After establishment of the stable miR-214 knockout and overexpressing C33A and CaSki cell lines, they were confirmed by quantitative RT-PCR (supplementary figure 1B), which showed a significant increase in the miRNA fold change in the miR-214-overexpressed cervical cancer cell lines and a significant decrease in the miRNA fold change in the miR-214 knockout cells, more so in CaSki than in C33A, compared to untreated cells.

3.2 Overview of differential gene expression between HPV-positive and HPV-negative cell lines after miR-214 overexpression and knockdown

Differentially expressed transcripts between normal, miR-214-overexpressed and miR-214 knockdown cervical cancer cell lines (C33A and CaSki) were identified by the CuffDiff data analysis pipeline (figure 1). Statistical analysis was performed by Student’s t-test; a p-value ≤ 0.05 and a log2FC cutoff of 2 were considered significant. In K1 vs. C1 (normal CaSki vs. normal C33A) (figure 1A), based on the predetermined cutoff, out of 16,128 expressed transcripts, 285 transcripts were upregulated (log2FC ≥ 2 and p-value ≤ 0.05) and 131 were downregulated (log2FC ≤ -2 and p-value ≤ 0.05). In K2 vs. C2 (miR-214-overexpressed CaSki vs. miR-214-overexpressed C33A) (figure 1B), out of 16,016 expressed transcripts, 319 transcripts were upregulated and 111 were downregulated. In K3 vs. C3 (miR-214 knockdown CaSki vs. miR-214 knockdown C33A) (figure 1C), out of 15,411 expressed transcripts, 300 transcripts were upregulated and 123 were downregulated. The complete and differential data lists, consisting of genes with their p-value, q-value (the FDR-adjusted p-value) and other parameters computed as a part of our standard analysis, are given in supplementary table 6: (A) for C1 vs. K1, (B) for C2 vs. K2 and (C) for C3 vs. K3. This test statistic was computed in Cuffdiff.

It was observed that 49 genes were specifically upregulated in K1 vs. C1, 58 genes in K2 vs. C2 and 56 genes in K3 vs. C3 (figure 1D). It was also observed that there was specific downregulation of 57 genes in K1 vs. C1, 28 genes in K2 vs. C2 and 42 genes in K3 vs. C3 (figure 1E). We also assessed the quality of our transcriptome by checking for the presence of common housekeeping genes, which are generally expected to be constitutively expressed, for C1 vs. K1, C2 vs. K2 and C3 vs. K3 (supplementary table 5). As expected, we recovered housekeeping genes in our transcriptomics data, which further strengthens our analysis, as biological variability does not have any effect on the genes that are present in both cell lines.

The differentially expressed mRNAs between the samples are represented as heat maps in figure 2. The heat maps were developed using Cluster 3.0 and Java TreeView (v 1.1.6) for the top 30 upregulated and top 30 downregulated genes. Downregulated genes are indicated in green and upregulated genes in red. The limit was maintained at a log2FC of ±2. Figure 2A represents the differentially expressed genes (DEGs) between K1 vs. C1, figure 2B the DEGs between K2 vs. C2 and figure 2C the DEGs between K3 vs. C3. In addition, the genes that exhibited a log2FC of +5 and above and -5 and below are listed in tables 1, 2 and 3 for K1 vs. C1, K2 vs. C2 and K3 vs. C3, respectively.

3.3 Analysis of GO and KEGG pathways in differentially expressed HPV-positive and HPV-negative cell lines after miR-214 overexpression and knockdown

To characterize the functions of genes showing significant changes between samples, we performed Gene Ontology (GO) and Kyoto Encyclopaedia of Genes and Genomes (KEGG) pathway analysis using the DAVID functional annotation tool, with a p-value ≤ 0.05 to select the significant genes. Gene Ontology analysis was sub-grouped into three categories: cellular component, biological process and molecular function. Figure 3A shows the GO analysis and top 5 KEGG pathways for K1 vs. C1, whereas figure 3B represents K2 vs. C2 and figure 3C represents K3 vs. C3. The top 5 genes from each category were further selected for GO analysis. The top 5 KEGG pathways selected for K1 vs. C1 and K2 vs. C2 were: PI3K/Akt signaling pathway, focal adhesion, proteoglycans in cancer, HTLV-I infection and pathways in cancer. Similarly, for K3 vs. C3 the top 5 KEGG pathways selected were: PI3K/Akt signaling pathway, focal adhesion, Hippo

signaling pathway, HTLV-I infection and pathways in cancer.

Figure 1. Differential gene expression analysis between (A) normal HPV-positive cell line CaSki (K1) vs. normal HPV-negative cell line C33A (C1), (B) overexpressed miR-214 CaSki (K2) vs. overexpressed miR-214 C33A (C2), (C) knockdown miR-214 CaSki (K3) vs. knockdown miR-214 C33A (C3). Cufflinks Package (v 2.2.1) was used for differential analysis. Transcripts with log2FC ≥ 2 and p-value ≤ 0.05 were considered significantly upregulated, and transcripts with log2FC ≤ -2 and p-value ≤ 0.05 were considered significantly downregulated. We also found sample-specific transcripts. Venn diagram representation for (D) downregulated genes between C1 vs. K1, C2 vs. K2 and K3 vs. C3 and (E) upregulated genes between C1 vs. K1, C2 vs. K2 and K3 vs. C3.

3.4 Effect of miR-214 on gene regulatory networks

MicroRNAs regulate a wide variety of genes. From the list of differentially expressed genes, we mapped a gene network model of the mRNAs that are modulated by miR-214. Figure 4(A) represents the genes present in untreated C33A and CaSki cell lines (C1 vs. K1). Figure 4(B) represents the genes modulated by overexpression of miR-214 in C33A and CaSki (C2 vs. K2), whereas figure 4(C) represents the genes modulated by knockdown of miR-214 in C33A and CaSki (C3 vs. K3). For network analysis between the sample pairs C1 vs. K1, C2 vs. K2 and C3 vs. K3, 42 genes, 39 genes and 43 genes were selected and are listed in supplementary tables 1, 2 and 3, respectively. In the gene regulatory networks, gene nodes are colored according to their log2FC (red and orange show upregulation, green shows downregulation and yellow shows baseline expression), and processes are colored in blue. It was evident from the three gene regulatory networks that the majority of the cellular functions (network hubs) differed with the expression level of miR-214. Furthermore, most of the genes in miR-214 untreated cells (C1 vs. K1) were downregulated, whereas in miR-214-overexpressed cells (C2 vs. K2) most of the genes were upregulated and in miR-214 knockout cells (C3 vs. K3) the genes were downregulated.

3.5 Gene networks regulated by miR-214

MicroRNAs control an extensive range of genes. From the list of differentially expressed genes, 904 genes were observed to be upregulated and 365 genes were downregulated between K1 vs. C1, K2 vs. C2 and K3 vs. C3. Among these genes, 11 that showed high and low log2FC from baseline between the samples were selected and validated using quantitative real time

Figure 2. Heat maps of (A) C1 vs. K1, (B) C2 vs. K2 and (C) C3 vs. K3 were created using Cluster 3.0 and Java TreeView (v 1.1.6) for the significantly top 30 upregulated and top 30 downregulated genes. A cutoff of log2FC ≤ -2 and log2FC ≥ 2 was set. Red indicates high relative expression, and green indicates low relative expression.
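The gene selection behind these heat maps (significant transcripts at the log2FC cutoff of 2, then the 30 strongest in each direction) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the `Transcript` record, the `select_top_degs` function and its parameter names are hypothetical, the log2FC values for TNFAIP3, MET, NDRG1 and PMS2 are the C1 vs. K1 values quoted in the text, and all p-values plus the two `GENE_*` entries are made-up placeholders.

```python
from typing import NamedTuple


class Transcript(NamedTuple):
    gene: str
    log2fc: float
    p_value: float


def select_top_degs(records, n=30, fc_cutoff=2.0, p_cutoff=0.05):
    """Return (top_up, top_down): the n most up- and downregulated
    transcripts that pass both the p-value and the |log2FC| cutoff."""
    significant = [r for r in records if r.p_value <= p_cutoff]
    # Upregulated: log2FC >= cutoff, largest fold change first.
    up = sorted((r for r in significant if r.log2fc >= fc_cutoff),
                key=lambda r: r.log2fc, reverse=True)
    # Downregulated: log2FC <= -cutoff, most negative fold change first.
    down = sorted((r for r in significant if r.log2fc <= -fc_cutoff),
                  key=lambda r: r.log2fc)
    return up[:n], down[:n]


# log2FC values from the C1 vs. K1 comparison; p-values are placeholders.
records = [
    Transcript("TNFAIP3", 18.5629, 0.001),
    Transcript("MET", 6.78671, 0.01),
    Transcript("NDRG1", 5.65499, 0.02),
    Transcript("PMS2", -13.9237, 0.001),
    Transcript("GENE_X", 1.2, 0.001),   # hypothetical: fails the log2FC cutoff
    Transcript("GENE_Y", -4.0, 0.30),   # hypothetical: fails the p-value cutoff
]

top_up, top_down = select_top_degs(records, n=2)
print([r.gene for r in top_up])    # ['TNFAIP3', 'MET']
print([r.gene for r in top_down])  # ['PMS2']
```

With `n=30` and a full differential expression table, the two returned lists would correspond to the rows drawn in each heat map panel.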

PCR. Figure 5 (A–K) shows the quantitative real time PCR validation graphs of these 11 genes (TNFAIP3, RAB25, MET, CYP1B1, NDRG1, CD24, LOXL2, CD44, PMS2, LATS1 and MDM1) between K1 vs. C1, K2 vs. C2 and K3 vs. C3. These genes were selected based on their previously reported roles in cancer, except for MDM1, which might be a novel biomarker for cancer. The control was assigned a fold change of 1 for all the genes.

All 11 genes were selected from table 1 (C1 vs. K1), table 2 (C2 vs. K2) and table 3 (C3 vs. K3) according to the following criteria: TNFAIP3 was upregulated in C1 vs. K1 with a log2FC value of 18.5629; RAB25 was upregulated in C2 vs. K2 with a log2FC value of 5.46678; MET was upregulated in C1 vs. K1 with a log2FC value of 6.78671 and also upregulated in C3 vs. K3 with a log2FC value of 7.33169; CYP1B1 was upregulated in C2 vs. K2 with a log2FC value of 8.9522; NDRG1 was upregulated in C1 vs. K1 with a log2FC value of 5.65499, in C2 vs. K2 with a log2FC value of 7.03665 and in C3 vs. K3 with a log2FC value of 6.42796; CD24 was upregulated in C3 vs. K3 with a log2FC value of 7.22612; LOXL2 was upregulated in C1 vs. K1 with a log2FC value of 6.41133, in C2 vs. K2 with a log2FC value of 7.55625 and in C3 vs. K3 with a log2FC value of 7.38807; CD44 was upregulated in C3 vs. K3 with a log2FC value of 9.22106; PMS2 was downregulated in C1 vs. K1 with a log2FC value of -13.9237; LATS1 was downregulated in C2 vs. K2 with a log2FC value of -15.4909; and MDM1 was downregulated in C3 vs. K3 with a log2FC value of -17.7052 (table 4).

4. Discussion

In this study, we investigated the effect of miR-214 on global gene expression and gene networks in HPV-positive and HPV-negative cervical cancer cells. miR-214, being a tumor suppressor microRNA, plays a critical role in the development of cancer. It is frequently downregulated in cancer. In this study, we either knocked out miR-214 using CRISPR or overexpressed it in the cervical cancer cells C33A and CaSki. The study was aimed at finding the effects of miR-214 when altered above and below the basal level of expression.

Table 1. Genes exhibiting log2FC of +5 and above and -5 and below in C1 vs. K1, obtained from NGS data.

Serial number Gene log2FC Refseq ID Gene Name 1 TMEM102 5.03391 NM_178518 transmembrane protein 102 2 CDCA7L 5.06241 NM_018719 cell division cycle associated 7 like 3 ATP2B4 5.06799 NM_001684 ATPase plasma membrane Ca2? transporting 4 4 TNFSF9 5.08793 NM_003811 TNF superfamily member 9 5 TOR4A 5.11362 NM_017723 torsin family 4 member A 6 PSMB9 5.12104 NM_002800 proteasome subunit beta 9 7 GLI3 5.13128 NM_000168 GLI family zinc finger 3 8 APOBEC3C 5.14932 NM_014508 apolipoprotein B mRNA editing enzyme catalytic subunit 3C 9 PLXNB3 5.15637 NM_005393 plexin B3 10 KIAA1549L 5.16432 NM_012194 KIAA1549 like 11 FAT2 5.17174 NM_001447 FAT atypical cadherin 2 12 NNMT 5.18544 NM_006169 nicotinamide N-methyltransferase 13 CDC42BPG 5.20915 NM_017525 CDC42 binding protein kinase gamma 14 SEMA7A 5.21199 NM_003612 semaphorin 7A (John Milton Hagen blood group) 15 DEF6 5.27814 NM_022047 DEF6, guanine nucleotide exchange factor 16 MACC1 5.2823 NM_182762 MACC1, MET transcriptional regulator 17 KLF4 5.28372 NM_004235 Kruppel like factor 4 18 ZHX2 5.29813 NM_014943 zinc fingers and homeoboxes 2 19 TBX3 5.30389 NM_005996 T-box 3 20 TGFB1I1 5.35309 NM_001042454 transforming growth factor beta 1 induced transcript 1 21 TXNIP 5.36644 NM_006472 thioredoxin interacting protein 22 JAG1 5.36777 NM_000214 jagged 1 23 CARD10 5.41925 NM_014550 caspase recruitment domain family member 10 24 IGFBP4 5.438 NM_001552 insulin like growth factor binding protein 4 25 BAIAP2L2 5.4472 NM_025045 BAI1 associated protein 2 like 2 26 GPNMB 5.45647 NM_002510 glycoprotein nmb 27 FAM129A 5.45758 NM_052966 family with sequence similarity 129 member A 28 IER3 5.47641 NM_003897 immediate early response 3 29 IFI30 5.4861 NM_006332 IFI30, lysosomal thiol reductase 30 MCAM 5.51097 NM_006500 melanoma cell adhesion molecule 31 DDX60L 5.53716 NM_001012967 DExD/H-box 60 like 32 ITPKB 5.54654 NM_002221 inositol-trisphosphate 3-kinase B 33 SOCS3 5.56026 NM_003955 suppressor of cytokine signaling 3 34 MN1 5.59785 NM_002430 
MN1 proto-oncogene, transcriptional regulator 35 ARHGEF19 5.61042 NM_153213 Rho guanine nucleotide exchange factor 19 36 RFTN1 5.61485 NM_015150 raftlin, lipid raft linker 1 37 NDRG1 5.65499 NM_006096 N-myc downstream regulated 1 38 ITGA3 5.69011 NM_002204 integrin subunit alpha 3 39 DRAM1 5.74185 NM_018370 DNA damage regulated autophagy modulator 1 40 GNAO1 5.75621 NM_138736 G protein subunit alpha o1 41 MMP14 5.76634 NM_004995 matrix metallopeptidase 14 42 FAM84B 5.77978 NM_174911 family with sequence similarity 84 member B 43 CCDC71L 5.80334 NM_175884 coiled-coil domain containing 71 like 44 PARP6 5.81115 NM_020214 poly(ADP-ribose) polymerase family member 6 45 AFAP1L1 5.84366 NM_152406 actin filament associated protein 1 like 1 46 KLHL35 5.90967 NM_001039548 kelch like family member 35 47 F2RL1 5.91906 NM_005242 F2R like trypsin receptor 1 48 SOX13 5.93105 NM_005686 SRY-box 13 49 RASD2 5.94043 NM_014310 RASD family member 2 50 ELF4 5.94135 NM_001421 E74 like ETS 4 51 LRRK1 5.94858 NM_024652 leucine rich repeat kinase 1 52 FHDC1 5.97736 NM_033393 FH2 domain containing 1 53 FIBCD1 6.00564 NM_001145106 fibrinogen C domain containing 1 54 SDC4 6.04416 NM_002999 syndecan 4 55 GLT8D2 6.10066 NM_031302 glycosyltransferase 8 domain containing 2 Differential transcriptome analysis in HPV cervical cancer cells Page 9 of 25 104 Table 1. (continued)

56 ERBB3 6.12751 NM_001982 erb-b2 receptor tyrosine kinase 3
57 CEP126 6.13175 NM_020802 centrosomal protein 126
58 WNT7B 6.16222 NM_058238 Wnt family member 7B
59 RUNX1 6.16941 NM_001001890 runt related transcription factor 1
60 PXDC1 6.24605 NM_183373 PX domain containing 1
61 ANXA3 6.26148 NM_005139 annexin A3
62 PSD4 6.26652 NM_012455 pleckstrin and Sec7 domain containing 4
63 LOXL2 6.41133 NM_002318 lysyl oxidase like 2
64 PTGES 6.41165 NM_004878 prostaglandin E synthase
65 NRIP3 6.43448 NM_020645 nuclear receptor interacting protein 3
66 MT2A 6.45004 NM_005953 metallothionein 2A
67 SLFN13 6.45072 NM_144682 schlafen family member 13
68 CSPG4 6.47012 NM_001897 chondroitin sulfate proteoglycan 4
69 FOXF2 6.51358 NM_001452 forkhead box F2
70 ITPR3 6.59796 NM_002224 inositol 1,4,5-trisphosphate receptor type 3
71 S100A2 6.599 NM_005978 S100 calcium binding protein A2
72 TFAP2C 6.6148 NM_003222 transcription factor AP-2 gamma
73 PDLIM1 6.67384 NM_020992 PDZ and LIM domain 1
74 SPINT2 6.67769 NM_021102 serine peptidase inhibitor, Kunitz type 2
75 CDK6 6.70053 NM_001259 cyclin dependent kinase 6
76 MET 6.78671 NM_000245 MET proto-oncogene, receptor tyrosine kinase
77 RNF144B 6.79941 NM_182757 ring finger protein 144B
78 EPN3 6.82797 NM_017957 epsin 3
79 CRISPLD2 6.88477 NM_031476 cysteine rich secretory protein LCCL domain containing 2
80 VAMP8 6.89927 NM_003761 vesicle associated membrane protein 8
81 SFXN3 6.93345 NM_030971 sideroflexin 3
82 EHD2 6.97445 NM_014601 EH domain containing 2
83 COL5A1 6.98219 NM_000093 collagen type V alpha 1 chain
84 IFITM3 7.03192 NM_021034 interferon induced transmembrane protein 3
85 RASIP1 7.09158 NM_017805 Ras interacting protein 1
86 TAGLN2 7.11769 NM_003564 transgelin 2
87 CCBE1 7.11773 NM_133459 collagen and calcium binding EGF domains 1
88 RAB32 7.24856 NM_006834 RAB32, member RAS oncogene family
89 CLDN4 7.2803 NM_001305 claudin 4
90 OXCT1 7.41235 NM_000436 3-oxoacid CoA-transferase 1
91 SEMA5A 7.43227 NM_003966 semaphorin 5A
92 TNFRSF1A 7.5665 NM_001065 TNF receptor superfamily member 1A
93 SH2D3A 7.57404 NM_005490 SH2 domain containing 3A
94 EPAS1 7.64997 NM_001430 endothelial PAS domain protein 1
95 APOL6 7.86017 NM_030641 apolipoprotein L6
96 FERMT1 7.86597 NM_017671 fermitin family member 1
97 CAPG 7.88439 NM_001256139 capping actin protein, gelsolin like
98 LAMB1 7.94343 NM_002291 laminin subunit beta 1
99 LYPD3 8.01614 NM_014400 LY6/PLAUR domain containing 3
100 EPCAM 8.01656 NM_002354 epithelial cell adhesion molecule
101 DUSP4 8.02241 NM_001394 dual specificity phosphatase 4
102 COL17A1 8.14467 NM_000494 collagen type XVII alpha 1 chain
103 CDC42EP1 8.30407 NM_152243 CDC42 effector protein 1
104 HSPA2 8.37218 NM_021979 heat shock protein family A (Hsp70) member 2
105 EPS8L2 8.44245 NM_022772 EPS8 like 2
106 PFKP 8.50479 NM_002627 phosphofructokinase, platelet
107 TRIP6 8.53378 NM_003302 thyroid hormone receptor interactor 6
108 UNC13D 8.58145 NM_199242 unc-13 homolog D
109 UCA1 8.68713 NR_015379 urothelial cancer associated 1
110 CTSZ 8.83891 NM_001336 cathepsin Z
111 CYBA 8.89735 NM_000101 cytochrome b-245 alpha chain

112 CAMK2N1 8.92429 NM_018584 calcium/calmodulin dependent protein kinase II inhibitor 1
113 SERPINE1 8.98948 NM_000602 serpin family E member 1
114 ADAMTS1 9.04743 NM_006988 ADAM metallopeptidase with thrombospondin type 1 motif 1
115 F3 9.2067 NM_001993 coagulation factor III, tissue factor
116 TNFAIP2 10.0456 NM_006291 TNF alpha induced protein 2
117 EGFR 10.7937 NM_005228 epidermal growth factor receptor
118 KRT17 11.8426 NM_000422 keratin 17
119 GPSM3 12.0955 NM_001276501_2 G protein signaling modulator 3
120 CFAP69 13.5461 NM_001039706 cilia and flagella associated protein 69
121 GRM4 14.2699 NM_000841 glutamate metabotropic receptor 4
122 TRIM2 14.2876 NM_001130067 tripartite motif containing 2
123 PCSK6 14.5111 NM_002570 proprotein convertase subtilisin/kexin type 6
124 ARHGEF10L 14.7219 CUFF.41.2 Rho guanine nucleotide exchange factor 10 like
125 ZKSCAN5 14.9635 NM_014569 zinc finger with KRAB and SCAN domains 5
126 FRMD6 16.0338 CUFF.1007.2 FERM domain containing 6
127 KIF21B 16.1681 NM_001252102 kinesin family member 21B
128 EDARADD 16.3659 NM_080738 EDAR associated death domain
129 LPP 17.1434 NM_001167671 LIM domain containing preferred translocation partner in lipoma
130 NR1D2 17.1841 NM_005126 nuclear receptor subfamily 1 group D member 2
131 LRIG3 17.8469 NM_153377 leucine rich repeats and immunoglobulin like domains 3
132 FGD3 18.0218 NM_033086 FYVE, RhoGEF and PH domain containing 3
133 SH3D21 18.3007 NM_001162530 SH3 domain containing 21
134 TNFAIP3 18.5629 NM_001270508 TNF alpha induced protein 3
135 CPA4 18.601 NM_016352 carboxypeptidase A4
136 BCAR3 21.6371 NM_003567 BCAR3, NSP family adaptor protein
137 LOX 21.6747 NM_002317 lysyl oxidase
138 TES 21.8695 NM_015641 testin LIM domain protein
139 CRMP1 -18.5661 NM_001313 collapsin response mediator protein 1
140 CCDC92 -18.2406 NM_001304959 coiled-coil domain containing 92
141 IDH2 -17.1355 NM_001289910 isocitrate dehydrogenase (NADP(+)) 2, mitochondrial
142 DIS3L -16.5862 NM_133375 DIS3 like exosome 3’–5’ exoribonuclease
143 PIK3R3 -16.157 NM_001303428 phosphoinositide-3-kinase regulatory subunit 3
144 CCNB1IP1 -15.9881 NM_182849 cyclin B1 interacting protein 1
145 STRIP1 -15.6995 NM_001270768 striatin interacting protein 1
146 MAPRE3 -15.6511 NM_001303050 microtubule associated protein RP/EB family member 3
147 BLOC1S6 -15.1661 NR_132359 biogenesis of lysosomal organelles complex 1 subunit 6
148 FANCC -14.9743 NM_001243743 FA complementation group C
149 HPCAL1 -14.9721 NM_001258357 hippocalcin like 1
150 RNF19A -14.6313 NM_001280539 ring finger protein 19A, RBR E3 ubiquitin protein ligase
151 ZNF689 -14.514 NR_073482 zinc finger protein 689
152 SEL1L3 -14.182 NM_001297592 SEL1L family member 3
153 PMS2 -13.9237 NR_003085 PMS1 homolog 2, mismatch repair system component
154 TIPARP -13.657 NM_001184718 TCDD inducible poly(ADP-ribose) polymerase
155 ZNF41 -13.6248 NM_007130 zinc finger protein 41
156 FCHO1 -13.5699 NM_001161358 FCH domain only 1
157 NR6A1 -13.1277 CUFF.3763.4 nuclear receptor subfamily 6 group A member 1
158 CAMKK1 -13.0013 NM_172206 calcium/calmodulin dependent protein kinase kinase 1
159 IFNLR1 -12.4642 NM_173065 interferon lambda receptor 1
160 OSCAR -11.1391 NM_001282350 osteoclast associated, immunoglobulin-like receptor
161 DCDC2 -8.99153 NM_001195610 doublecortin domain containing 2
162 COL5A2 -8.53646 NM_000393 collagen type V alpha 2 chain
163 GJA1 -8.35758 NM_000165 gap junction protein alpha 1
164 ZNF608 -7.90456 NM_020747 zinc finger protein 608

165 ID2 -7.71651 NM_002166 inhibitor of DNA binding 2
166 GANAB -7.5451 NM_001278193 glucosidase II alpha subunit
167 DIRAS1 -7.49413 NM_145173 DIRAS family GTPase 1
168 PALD1 -7.10807 NM_014431 phosphatase domain containing, paladin 1
169 IGF2BP1 -7.0317 NM_006546 insulin like growth factor 2 mRNA binding protein 1
170 EGR1 -7.02907 NM_001964 early growth response 1
171 SCG2 -6.89024 NM_003469 secretogranin II
172 PNMA2 -6.54471 NM_007257 PNMA family member 2
173 FKBP10 -6.4379 NM_021939 FK506 binding protein 10
174 PALM3 -6.39595 NM_001145028 paralemmin 3
175 LIN28B -6.33477 NM_001004317 lin-28 homolog B
176 ANKRD1 -6.25256 NM_014391 ankyrin repeat domain 1
177 SDC2 -6.21331 NM_002998 syndecan 2
178 FOS -6.0867 NM_005252 Fos proto-oncogene, AP-1 transcription factor subunit
179 GLDC -6.01753 NM_000170 glycine decarboxylase
180 PADI2 -6.00525 NM_007365 peptidyl arginine deiminase 2
181 EFR3B -6.00226 NM_014971 EFR3 homolog B
182 SAMD5 -5.96897 NM_001030060 sterile alpha motif domain containing 5
183 LINC01447 -5.91591 NR_108090 long intergenic non-protein coding RNA 1447
184 PRDX2 -5.84781 NM_005809 peroxiredoxin 2
185 TENM2 -5.71144 NM_001122679 teneurin transmembrane protein 2
186 FGFBP3 -5.62854 NM_152429 fibroblast growth factor binding protein 3
187 TERT -5.59914 NM_198253 telomerase reverse transcriptase
188 RPSAP58 -5.3902 NR_003662 ribosomal protein SA pseudogene 58
189 UNC13A -5.33583 NM_001080421 unc-13 homolog A
190 TOX -5.31902 NM_014729 thymocyte selection associated high mobility group box
191 NR2F1 -5.28818 NM_005654 nuclear receptor subfamily 2 group F member 1
192 WTIP -5.21562 NM_001080436 WT1 interacting protein
193 CHD5 -5.12072 NM_015557 chromodomain helicase DNA binding protein 5
194 SCCPDH -5.09 NM_016002 saccharopine dehydrogenase (putative)
195 WSCD1 -5.07113 NM_015253 WSC domain containing 1
196 SKIDA1 -5.02472 NM_207371 SKI/DACH domain containing 1
197 HNRNPA1P10 -5.00184 NR_002944 heterogeneous nuclear ribonucleoprotein A1 pseudogene 10
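
As an illustration of the cut-off behind these tables (log2FC of +5 and above or -5 and below, as stated in the captions of Tables 2 and 3 and assumed here to apply to Table 1 as well), the following minimal Python sketch splits a few rows copied from Table 1 into upregulated and downregulated sets. The function name `split_by_log2fc` and its `threshold` parameter are hypothetical and are not part of the analysis pipeline used in this study; the rows are a small excerpt, not the full table.

```python
# Illustrative excerpt only: (gene, log2FC) pairs copied from Table 1.
ROWS = [
    ("NDRG1", 5.65499),
    ("MET", 6.78671),
    ("TNFAIP3", 18.5629),
    ("PMS2", -13.9237),
    ("TERT", -5.59914),
]


def split_by_log2fc(rows, threshold=5.0):
    """Return (up, down) gene lists for log2FC >= threshold or <= -threshold.

    Hypothetical helper for illustration; not the authors' code.
    """
    up = [gene for gene, lfc in rows if lfc >= threshold]
    down = [gene for gene, lfc in rows if lfc <= -threshold]
    return up, down


up, down = split_by_log2fc(ROWS)
print("upregulated:", up)
print("downregulated:", down)

# A log2FC of 5 corresponds to a 2**5 = 32-fold change on the linear scale.
linear_fold = 2 ** 5
print("linear fold change at log2FC = 5:", linear_fold)
```

Working on the log2 scale keeps up- and downregulation symmetric around zero, which is why a single absolute threshold can serve both directions.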

Although miR-214 is downregulated in cervical cancer, its expression was not lost completely; a low level of expression remained. Furthermore, figure S1(A) showed differential levels of miR-214 expression in cervical cancer cells depending on their HPV status. Earlier studies reported that HPV infection affects miRNA expression through the oncoproteins E6 and E7, which further contributes to viral pathogenesis (Santos et al. 2018), and also modulates DNA methylation in cervical cancer (Sen et al. 2018). These findings prompted us to carry out a transcriptomic analysis of HPV-positive and HPV-negative cells in the presence and/or absence of miR-214. This will aid in identifying targets modulated by the miR-214 expression level according to HPV status for therapeutic interventions, and such targets may serve as prognostic biomarkers in cervical cancer.

In this study, a total of 904 genes were found to be upregulated between CaSki and C33A (control, miR-214 knockout and miR-214 overexpressed) and 365 genes were downregulated. Of these, 11 genes were chosen that showed high fold changes from baseline expression between the samples and are relevant to cancer: TNFAIP3, RAB25, MET, CYP1B1, NDRG1, CD24, LOXL2, CD44, PMS2, LATS1 and MDM1. The RNA sequencing data for these 11 genes were further validated by quantitative real-time PCR. Tumor necrosis factor alpha-induced protein 3 (TNFAIP3) is a deubiquitinating factor which plays a

Table 2. Genes exhibiting log2FC of +5 and above and -5 and below in C2 vs. K2, obtained from NGS data.

Serial number Gene log2FC Refseq ID Gene Name
1 COL7A1 5.00492 NM_000094 collagen type VII alpha 1 chain
2 MAML2 5.02578 NM_032427 mastermind like transcriptional coactivator 2
3 MDK 5.03437 NM_002391 midkine
4 GNAO1 5.04851 NM_138736 G protein subunit alpha o1
5 NMI 5.07956 NM_004688 N-myc and STAT interactor
6 IER3 5.0874 NM_003897 immediate early response 3
7 SOX13 5.09054 NM_005686 SRY-box 13
8 KIAA1549L 5.09631 NM_012194 KIAA1549 like
9 PARP10 5.10202 NM_032789 poly(ADP-ribose) polymerase family member 10
10 CDKN1A 5.12503 NM_000389 cyclin dependent kinase inhibitor 1A
11 PSMB9 5.13986 NM_002800 proteasome subunit beta 9
12 KLF4 5.15083 NM_004235 Kruppel like factor 4
13 ABCA1 5.15973 NM_005502 ATP binding cassette subfamily A member 1
14 LOC729737 5.17665 NR_039983 uncharacterized LOC729737
15 AP1M2 5.18736 NM_005498 adaptor related protein complex 1 subunit mu 2
16 TXNIP 5.1983 NM_006472 thioredoxin interacting protein
17 CEP126 5.22265 NM_020802 centrosomal protein 126
18 APOBEC3C 5.22803 NM_014508 apolipoprotein B mRNA editing enzyme catalytic subunit 3C
19 KRT6A 5.24475 NM_005554 keratin 6A
20 MCAM 5.29863 NM_006500 melanoma cell adhesion molecule
21 TRIM29 5.30239 NM_012101 tripartite motif containing 29
22 NRIP3 5.3583 NM_020645 nuclear receptor interacting protein 3
23 GBP6 5.37335 NM_198460 guanylate binding protein family member 6
24 JAG1 5.37954 NM_000214 jagged 1
25 COL4A2 5.4034 NM_001846 collagen type IV alpha 2 chain
26 CSPG4 5.41461 NM_001897 chondroitin sulfate proteoglycan 4
27 RAB13 5.41606 NM_002870 RAB13, member RAS oncogene family
28 CARD10 5.42408 NM_014550 caspase recruitment domain family member 10
29 AHNAK2 5.45015 NM_138420 AHNAK nucleoprotein 2
30 MN1 5.46358 NM_002430 MN1 proto-oncogene, transcriptional regulator
31 RAB25 5.46678 NM_020387 RAB25, member RAS oncogene family
32 CEBPD 5.47758 NM_005195 CCAAT enhancer binding protein delta
33 GLI3 5.47967 NM_000168 GLI family zinc finger 3
34 AFAP1L1 5.48195 NM_152406 actin filament associated protein 1 like 1
35 DDX60L 5.49285 NM_001012967 DExD/H-box 60 like
36 HIVEP2 5.50997 NM_006734 human immunodeficiency virus type I enhancer binding protein 2
37 KRT14 5.54432 NM_000526 keratin 14
38 LYPD3 5.54606 NM_014400 LY6/PLAUR domain containing 3
39 OTUD1 5.56116 NM_001145373 OTU deubiquitinase 1
40 SLFN13 5.56414 NM_144682 schlafen family member 13
41 IFIT2 5.56737 NM_001547 interferon induced protein with tetratricopeptide repeats 2
42 FGD3 5.63202 NM_033086 FYVE, RhoGEF and PH domain containing 3
43 F2RL1 5.63961 NM_005242 F2R like trypsin receptor 1
44 IGFBP4 5.64734 NM_001552 insulin like growth factor binding protein 4
45 GLT8D2 5.64922 NM_031302 glycosyltransferase 8 domain containing 2
46 ELF4 5.65802 NM_001127197 E74 like ETS transcription factor 4
47 EYA2 5.66018 NM_005244 EYA transcriptional coactivator and phosphatase 2
48 PSD4 5.67402 NM_012455 pleckstrin and Sec7 domain containing 4
49 ZHX2 5.67757 NM_014943 zinc fingers and homeoboxes 2
50 ST6GALNAC2 5.67832 NM_006456 ST6 N-acetylgalactosaminide alpha-2,6-sialyltransferase 2
51 TMEM102 5.70495 NM_178518 transmembrane protein 102
52 ITPKB 5.74018 NM_002221 inositol-trisphosphate 3-kinase B
53 SOCS3 5.7434 NM_003955 suppressor of cytokine signaling 3
54 RNF144B 5.79969 NM_182757 ring finger protein 144B

55 SLPI 5.87297 NM_003064 secretory leukocyte peptidase inhibitor
56 FAM129A 5.91406 NM_052966 family with sequence similarity 129 member A
57 RASD2 5.91567 NM_014310 RASD family member 2
58 FAM84B 5.95565 NM_174911 family with sequence similarity 84 member B
59 YBX2 6.02181 NM_015982 Y-box binding protein 2
60 STARD13 6.0401 NM_178006 StAR related lipid transfer domain containing 13
61 S100A2 6.0429 NM_005978 S100 calcium binding protein A2
62 PDZD2 6.08724 NM_178140 PDZ domain containing 2
63 ITGA3 6.09828 NM_002204 integrin subunit alpha 3
64 CRISPLD2 6.1191 NM_031476 cysteine rich secretory protein LCCL domain containing 2
65 WNT7B 6.13115 NM_058238 Wnt family member 7B
66 SDC4 6.15933 NM_002999 syndecan 4
67 GALNT3 6.16932 NM_004482 polypeptide N-acetylgalactosaminyltransferase 3
68 S100A6 6.20813 NM_014624 S100 calcium binding protein A6
69 CCDC71L 6.22769 NM_175884 coiled-coil domain containing 71 like
70 MMP14 6.25006 NM_004995 matrix metallopeptidase 14
71 LRRK1 6.2662 NM_024652 leucine rich repeat kinase 1
72 DSEL 6.30149 NM_032160 dermatan sulfate epimerase like
73 NNMT 6.33605 NM_006169 nicotinamide N-methyltransferase
74 NTN4 6.40073 NM_021229 netrin 4
75 RFTN1 6.4288 NM_015150 raftlin, lipid raft linker 1
76 MACC1 6.44976 NM_182762 MACC1, MET transcriptional regulator
77 EFNB2 6.45966 NM_004093 ephrin B2
78 ERBB3 6.47876 NM_001982 erb-b2 receptor tyrosine kinase 3
79 PARP6 6.49776 NM_020214 poly(ADP-ribose) polymerase family member 6
80 TGFB1I1 6.51313 NM_001042454 transforming growth factor beta 1 induced transcript 1
81 TFAP2C 6.51373 NM_003222 transcription factor AP-2 gamma
82 TNFSF9 6.53355 NM_003811 TNF superfamily member 9
83 FIBCD1 6.59677 NM_001145106 fibrinogen C domain containing 1
84 KRT8 6.67736 NM_002273 keratin 8
85 PXDC1 6.77698 NM_183373 PX domain containing 1
86 SPINT2 6.79206 NM_021102 serine peptidase inhibitor, Kunitz type 2
87 MT2A 6.80783 NM_005953 metallothionein 2A
88 VAMP8 6.81921 NM_003761 vesicle associated membrane protein 8
89 PDLIM1 6.85017 NM_020992 PDZ and LIM domain 1
90 ALCAM 6.85551 NM_001627 activated leukocyte cell adhesion molecule
91 KLHL35 6.93538 NM_001039548 kelch like family member 35
92 TRIM47 6.95984 NM_033452 tripartite motif containing 47
93 RASIP1 7.00392 NM_017805 Ras interacting protein 1
94 NDRG1 7.03665 NM_006096 N-myc downstream regulated 1
95 RAB32 7.05928 NM_006834 RAB32, member RAS oncogene family
96 DRAM1 7.09334 NM_018370 DNA damage regulated autophagy modulator 1
97 ITPR3 7.09718 NM_002224 inositol 1,4,5-trisphosphate receptor type 3
98 PTGES 7.12968 NM_004878 prostaglandin E synthase
99 SFXN3 7.19759 NM_030971 sideroflexin 3
100 ANXA3 7.26923 NM_005139 annexin A3
101 KRT80 7.28627 NM_182507 keratin 80
102 CLDN4 7.29121 NM_001305 claudin 4
103 CCBE1 7.32102 NM_133459 collagen and calcium binding EGF domains 1
104 FERMT1 7.32792 NM_017671 fermitin family member 1
105 MET 7.36656 NM_000245 MET proto-oncogene, receptor tyrosine kinase
106 OXCT1 7.45284 NM_000436 3-oxoacid CoA-transferase 1
107 RUNX1 7.49822 NM_001001890 runt related transcription factor 1
108 ANOS1 7.51535 NM_000216 anosmin 1
109 BACE2 7.54296 NM_012105 beta-secretase 2

110 LOXL2 7.55625 NM_002318 lysyl oxidase like 2
111 IFITM3 7.60026 NM_021034 interferon induced transmembrane protein 3
112 SLC16A3 7.78668 NM_004207 solute carrier family 16 member 3
113 KRT15 7.79916 NM_002275 keratin 15
114 EHD2 7.82677 NM_014601 EH domain containing 2
115 SH2D3A 7.83445 NM_005490 SH2 domain containing 3A
116 EPCAM 7.86726 NM_002354 epithelial cell adhesion molecule
117 SYNPO 7.87492 NM_007286 synaptopodin
118 VGLL3 7.87755 NM_016206 vestigial like family member 3
119 SFN 7.88865 NM_006142 stratifin
120 COL5A1 7.91214 NM_000093 collagen type V alpha 1 chain
121 TAGLN2 7.9236 NM_003564 transgelin 2
122 COL17A1 8.08507 NM_000494 collagen type XVII alpha 1 chain
123 CAV2 8.10074 NM_001206747 caveolin 2
124 EPS8L2 8.43076 NM_022772 EPS8 like 2
125 TRIP6 8.45763 NM_003302 thyroid hormone receptor interactor 6
126 CAPG 8.5697 NM_001256139 capping actin protein, gelsolin like
127 CCDC80 8.64492 NM_199511 coiled-coil domain containing 80
128 PFKP 8.66356 NM_002627 phosphofructokinase, platelet
129 SEMA5A 8.7154 NM_003966 semaphorin 5A
130 LAMB1 8.76926 NM_002291 laminin subunit beta 1
131 ADAMTS1 8.79958 NM_006988 ADAM metallopeptidase with thrombospondin type 1 motif 1
132 TACSTD2 8.81203 NM_002353 tumor associated calcium signal transducer 2
133 EPAS1 8.84498 NM_001430 endothelial PAS domain protein 1
134 UNC13D 8.87214 NM_199242 unc-13 homolog D
135 CYP1B1 8.9522 NM_000104 cytochrome P450 family 1 subfamily B member 1
136 S100A11 9.02893 NM_005620 S100 calcium binding protein A11
137 ARHGAP29 9.11066 NM_004815 Rho GTPase activating protein 29
138 CDC42EP1 9.21072 NM_152243 CDC42 effector protein 1
139 IGFBP6 9.38873 NM_002178 insulin like growth factor binding protein 6
140 CTSZ 9.67481 NM_001336 cathepsin Z
141 EGFR 9.94578 NM_005228 epidermal growth factor receptor
142 TNFAIP2 10.0478 NM_006291 TNF alpha induced protein 2
143 IGFBP5 11.6007 NM_000599 insulin like growth factor binding protein 5
144 ANKRD29 12.4541 NM_173505 ankyrin repeat domain 29
145 FGF1 12.9437 NM_001144892 fibroblast growth factor 1
146 DENND2C 14.6293 CUFF.177.3 DENN domain containing 2C
147 SYT17 14.8891 NM_016524 synaptotagmin 17
148 SSPN 14.9618 NM_005086 sarcospan
149 TOX2 15.3775 NM_001098797 TOX high mobility group box family member 2
150 CTHRC1 15.4351 NM_138455 collagen triple helix repeat containing 1
151 NMNAT2 15.9976 NM_170706 nicotinamide nucleotide adenylyltransferase 2
152 FCHO1 16.083 NM_001161359 FCH domain only 1
153 PTPRB 16.1231 NM_002837 protein tyrosine phosphatase, receptor type B
154 PCDH1 16.1884 NM_002587 protocadherin 1
155 KLF11 16.2794 NM_003597 Kruppel like factor 11
156 ARMCX4 16.6327 CUFF.3893.2 armadillo repeat containing X-linked 4
157 PHF11 16.8888 NM_001040443 PHD finger protein 11
158 TRIM7 17.1368 NM_203293 tripartite motif containing 7
159 LRIG3 17.499 NM_153377 leucine rich repeats and immunoglobulin like domains 3
160 CPA4 17.6446 NM_016352 carboxypeptidase A4
161 PHLDB2 17.7478 NM_001134438 pleckstrin like domain family B member 2
162 TES 22.0969 NM_015641 testin LIM domain protein
163 NME4 -19.6861 NM_005009 NME/NM23 nucleoside diphosphate kinase 4
164 SPRY1 -18.5762 NM_001258039 sprouty RTK signaling antagonist 1

165 CHN1 -18.4563 NM_001822 chimerin 1
166 TERT -17.546 NM_198253 telomerase reverse transcriptase
167 HPCAL1 -17.4556 NM_001258357 hippocalcin like 1
168 SMAD1 -17.1398 NM_001003688 SMAD family member 1
169 CGREF1 -16.7815 NM_001166239 cell growth regulator with EF-hand domain 1
170 GAB2 -16.6371 NM_012296 GRB2 associated binding protein 2
171 PTCH1 -16.3896 NM_001083604 patched 1
172 DTNB -16.2969 NM_001256304 dystrobrevin beta
173 CTDP1 -16.182 NM_001202504_1 CTD phosphatase subunit 1
174 FGFR3 -16.1449 NM_000142 fibroblast growth factor receptor 3
175 UIMC1 -15.8772 NM_016290 ubiquitin interaction motif containing 1
176 C2CD2 -15.6617 NM_199050 C2 calcium dependent domain containing 2
177 LATS1 -15.4909 CUFF.3150.3 large tumor suppressor kinase 1
178 CDYL -15.4147 NM_001143971 chromodomain Y like
179 PPP2R2A -15.3631 NM_001177591 protein phosphatase 2 regulatory subunit Balpha
180 LYSMD2 -15.2559 NM_001143917 LysM domain containing 2
181 CLCN5 -15.1407 NM_000084 chloride voltage-gated channel 5
182 ASGR1 -14.7141 NM_001197216 asialoglycoprotein receptor 1
183 GARNL3 -14.6032 NM_001286779 GTPase activating Rap/RanGAP domain like 3
184 NR1D2 -14.6017 NM_001145425 nuclear receptor subfamily 1 group D member 2
185 FHL1 -14.6 NM_001167819 four and a half LIM domains 1
186 KLF11 -14.464 NM_001177716 Kruppel like factor 11
187 MAN1C1 -14.3265 NM_020379 mannosidase alpha class 1C member 1
188 ROBO1 -14.1937 NM_133631 roundabout guidance receptor 1
189 ZNF41 -13.7562 NM_007130 zinc finger protein 41
190 RHOBTB1 -13.7464 NR_024555 Rho related BTB domain containing 1
191 MIRLET7BHG -13.4684 NR_027033 MIRLET7B host gene
192 TOX2 -12.8591 NM_001098796 TOX high mobility group box family member 2
193 TFEB -12.4808 NM_001271945 transcription factor EB
194 NLRX1 -12.2718 NM_001282358 NLR family member X1
195 GJA1 -10.2722 NM_000165 gap junction protein alpha 1
196 COL5A2 -9.3129 NM_000393 collagen type V alpha 2 chain
197 ZNF608 -8.68827 NM_020747 zinc finger protein 608
198 IGF2BP1 -7.76376 NM_006546 insulin like growth factor 2 mRNA binding protein 1
199 FADS2 -7.55459 NM_004265 fatty acid desaturase 2
200 ID2 -7.22391 NM_002166 inhibitor of DNA binding 2
201 SAMD5 -6.76394 NM_001030060 sterile alpha motif domain containing 5
202 RBM20 -6.74579 NM_001134363 RNA binding motif protein 20
203 GLDC -6.66612 NM_000170 glycine decarboxylase
204 ANKRD1 -6.41758 NM_014391 ankyrin repeat domain 1
205 HSD17B11 -6.17681 NM_016245 hydroxysteroid 17-beta dehydrogenase 11
206 FOS -5.85202 NM_005252 Fos proto-oncogene, AP-1 transcription factor subunit
207 EGR1 -5.84165 NM_001964 early growth response 1
208 PRR36 -5.81272 NM_001190467 proline rich 36
209 TOX -5.75761 NM_014729 thymocyte selection associated high mobility group box
210 PRDX2 -5.72491 NM_005809 peroxiredoxin 2
211 TBX1 -5.71328 NM_080647 T-box 1
212 EFR3B -5.67475 NM_014971 EFR3 homolog B
213 FGFBP3 -5.67255 NM_152429 fibroblast growth factor binding protein 3
214 FKBP10 -5.66514 NM_021939 FK506 binding protein 10
215 SEMA3E -5.61993 NM_012431 semaphorin 3E
216 ATP6V0E2 -5.58759 NM_145230 ATPase H+ transporting V0 subunit e2
217 SIX3 -5.5742 NM_005413 SIX homeobox 3
218 SCCPDH -5.46594 NM_016002 saccharopine dehydrogenase (putative)
219 CRMP1 -5.44081 NM_001313 collapsin response mediator protein 1
220 UNC13A -5.32743 NM_001080421 unc-13 homolog A

221 DOCK3 -5.26758 NM_004947 dedicator of cytokinesis 3
222 WTIP -5.10962 NM_001080436 WT1 interacting protein
223 NR2F1 -5.09111 NM_005654 nuclear receptor subfamily 2 group F member 1
224 ASPHD1 -5.0386 NM_181718 aspartate beta-hydroxylase domain containing 1
225 CHD5 -5.01972 NM_015557 chromodomain helicase DNA binding protein 5
226 SKIDA1 -5.00058 NM_207371 SKI/DACH domain containing 1
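
Because several genes recur across the comparisons, it can help to check whether a gene changes in the same direction in two tables. The following minimal Python sketch does this for a few log2FC values copied from Table 1 and Table 2. The helper name `compare_direction` is hypothetical, the dictionaries are small excerpts rather than the full tables, and collapsing rows to gene symbols is a simplifying assumption: it ignores that the same gene may be listed under different RefSeq transcripts (FCHO1 appears as NM_001161358 in Table 1 and NM_001161359 in Table 2).

```python
# Illustrative excerpts only: gene symbol -> log2FC, copied from the tables.
TABLE1 = {
    "NDRG1": 5.65499,
    "MET": 6.78671,
    "LOXL2": 6.41133,
    "TERT": -5.59914,
    "FCHO1": -13.5699,
    "TNFAIP3": 18.5629,
}
TABLE2 = {
    "NDRG1": 7.03665,
    "MET": 7.36656,
    "LOXL2": 7.55625,
    "TERT": -17.546,
    "FCHO1": 16.083,
    "CYP1B1": 8.9522,
}


def compare_direction(a, b):
    """Split genes present in both tables by agreement of log2FC sign.

    Hypothetical helper for illustration; not the authors' code.
    """
    shared = sorted(set(a) & set(b))
    same = [g for g in shared if (a[g] > 0) == (b[g] > 0)]
    opposite = [g for g in shared if g not in same]
    return same, opposite


same, opposite = compare_direction(TABLE1, TABLE2)
print("same direction:", same)
print("opposite direction:", opposite)
```

Genes that fall in the opposite-direction group would need to be checked at the transcript level before drawing any biological conclusion.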

role in cancer development. TNFAIP3 affects the levels of the EMT markers Snail and Zeb1, leading to malignant transformation (Du et al. 2019). Depletion of TNFAIP3 reduced the migration and proliferation capacity of gastric cancer cells. In our study, TNFAIP3 is upregulated in HPV-positive CaSki cells compared to HPV-negative C33A cells, where it is downregulated. Overexpression of miR-214 led to a decline in TNFAIP3 levels, while knockout increased them. This indicated that miR-214 suppressed TNFAIP3 expression. RAB25 is known to play a dual role: in some cells it acts as a tumor promoter and in others as a tumor inhibitor. In squamous cell carcinoma of the skin, RAB25 acts as a tumor suppressor. Loss of RAB25 in these cells led to increased cell proliferation (Jeong et al. 2019). RAB25 knockout mice showed increased tumor generation and malignant transformation. Since the majority of cervical cancers are squamous cell carcinomas of the epithelial cells of the cervix, RAB25 is most likely acting as a tumor suppressor. Overexpression of miR-214 resulted in upregulation of RAB25 in CaSki compared to C33A, which suggested that miR-214, being a tumor suppressor, upregulated RAB25, which also acts as a tumor suppressor here. Knockout of miR-214 resulted in a marginal reduction of RAB25 levels in HPV-positive cells only, and not in HPV-negative cells. c-MET is a proto-oncogene which plays a role in cervical cancer. MET is overexpressed in cervical cancer and can be a potential molecular therapeutic target (Refaat et al. 2017). We observed that miR-214 overexpression also increased MET expression in both CaSki and C33A, which indicated that miR-214 does not suppress MET expression in these cells. Knockout of miR-214 downregulated c-MET expression in C33A, but no change was observed in CaSki. Cytochrome P450 1B1 (CYP1B1) is a well-known tumor biomarker. Overexpression of CYP1B1 in prostate cancer cells resulted in enhanced cell proliferation, migration and invasiveness (Chang et al. 2017). Conversely, attenuation of CYP1B1 by shRNA leads to apoptosis by caspase-1. We observed in Table 2 and Supplementary Table 2 that CYP1B1 was upregulated in both miR-214-overexpressed C33A and CaSki, which indicated that miR-214 does not overpower the expression of CYP1B1 in these cells. N-myc downstream regulated gene 1 (NDRG1) is a tumor suppressor which inhibits cell migration and invasion. In colorectal cancer cells (CRC), NDRG1 mediates caveolin-1 degradation; caveolin-1 interacts directly with NDRG1 and mediates the metastasis-suppressing functions of NDRG1 (Mi et al. 2017). In our study, with both overexpression and knockout of miR-214, NDRG1 expression was significantly higher in CaSki than in C33A cells. This suggested that NDRG1 expression is not directly dependent on miR-214. CD24 is a cell adhesion molecule and its expression correlates with the prognosis of several tumors. In cervical cancer cells, CD24 was overexpressed, which contributed to reduced apoptosis and increased cell proliferation (Pei et al. 2016). CD24 was also shown to alter the levels of p38, JNK2 and c-Jun in vitro. Overexpression of miR-214 upregulated CD24 expression, which increased further upon knockout of miR-214 in C33A. CaSki cells showed very little change in CD24 levels in response to miR-214. Lysyl oxidase like protein-2 (LOXL2) acts as a promoter of tumorigenesis and metastasis. In a mouse model of breast cancer, overexpression of LOXL2 resulted in increased metastasis and tumor growth, while depletion of LOXL2 reversed the effects (Salvador et al. 2017). We observed that LOXL2 is downregulated in CaSki compared to C33A. Knockout of miR-214 significantly upregulated LOXL2 expression in C33A. In CaSki, the reverse effects were observed: miR-214 overexpression reduced gene expression below the basal level, while knockout of miR-214 partially restored it. Therefore, miR-214 is effective in suppressing LOXL2 levels in HPV-positive CaSki cells. CD44 is a well-known stem

Table 3. Genes exhibiting log2FC of +5 and above and -5 and below in C3 vs. K3, obtained from NGS data.

Serial number Gene log2FC Refseq ID Gene Name
1 NXPH4 5.02325 NM_007224 neurexophilin 4
2 MCAM 5.0409 NM_006500 melanoma cell adhesion molecule
3 NRIP3 5.04446 NM_020645 nuclear receptor interacting protein 3
4 MACC1 5.05975 NM_182762 MACC1, MET transcriptional regulator
5 MYO10 5.06446 NM_012334 myosin X
6 MAML2 5.09455 NM_032427 mastermind like transcriptional coactivator 2
7 FGD3 5.10217 NM_033086 FYVE, RhoGEF and PH domain containing 3
8 IFI30 5.11679 NM_006332 IFI30, lysosomal thiol reductase
9 TMEM102 5.21779 NM_178518 transmembrane protein 102
10 ARL4C 5.23671 NM_005737 ADP ribosylation factor like GTPase 4C
11 KIAA1549L 5.24877 NM_012194 KIAA1549 like
12 KCNIP3 5.25078 NM_001034914 potassium voltage-gated channel interacting protein 3
13 SOX13 5.26316 NM_005686 SRY-box 13
14 TOR4A 5.26538 NM_017723 torsin family 4 member A
15 RAB13 5.29098 NM_002870 RAB13, member RAS oncogene family
16 YBX2 5.34813 NM_015982 Y-box binding protein 2
17 INPP4B 5.35536 NM_001101669 inositol polyphosphate-4-phosphatase type II B
18 CARD10 5.35929 NM_014550 caspase recruitment domain family member 10
19 FHDC1 5.36011 NM_033393 FH2 domain containing 1
20 RIN1 5.38786 NM_004292 Ras and Rab interactor 1
21 IGFBP4 5.39563 NM_001552 insulin like growth factor binding protein 4
22 EPN3 5.43753 NM_017957 epsin 3
23 LIF 5.44943 NM_002309 LIF, interleukin 6 family cytokine
24 KRT14 5.45803 NM_000526 keratin 14
25 ADAM8 5.47037 NM_001109 ADAM metallopeptidase domain 8
26 AP1M2 5.48826 NM_005498 adaptor related protein complex 1 subunit mu 2
27 RFTN1 5.50976 NM_015150 raftlin, lipid raft linker 1
28 PANX2 5.52833 NM_052839 pannexin 2
29 EFEMP1 5.53505 NM_001039349 EGF containing fibulin extracellular matrix protein 1
30 COL4A2 5.54392 NM_001846 collagen type IV alpha 2 chain
31 MN1 5.55571 NM_002430 MN1 proto-oncogene, transcriptional regulator
32 SLC39A4 5.57722 NM_001280557 solute carrier family 39 member 4
33 CDCA7L 5.58519 NM_018719 cell division cycle associated 7 like
34 SOCS3 5.59528 NM_003955 suppressor of cytokine signaling 3
35 CSPG4 5.61702 NM_001897 chondroitin sulfate proteoglycan 4
36 KRT15 5.61734 NM_002275 keratin 15
37 HIVEP2 5.62581 NM_006734 human immunodeficiency virus type I enhancer binding protein 2
38 ABCA1 5.63109 NM_005502 ATP binding cassette subfamily A member 1
39 FAM129A 5.64781 NM_052966 family with sequence similarity 129 member A
40 CDC42BPG 5.65861 NM_017525 CDC42 binding protein kinase gamma
41 AFAP1L1 5.69301 NM_152406 actin filament associated protein 1 like 1
42 GLI3 5.70171 NM_000168 GLI family zinc finger 3
43 AHNAK2 5.70365 NM_138420 AHNAK nucleoprotein 2
44 MST1R 5.7286 NM_002447 macrophage stimulating 1 receptor
45 MOB3B 5.73248 NM_024761 MOB kinase activator 3B
46 DRAM1 5.73931 NM_018370 DNA damage regulated autophagy modulator 1
47 MMP14 5.77736 NM_004995 matrix metallopeptidase 14
48 NTN4 5.86224 NM_021229 netrin 4
49 FRMD6 5.89661 NM_001267046 FERM domain containing 6
50 FAM84B 5.93367 NM_174911 family with sequence similarity 84 member B
51 ITGA3 5.98549 NM_002204 integrin subunit alpha 3
52 TFAP2C 6.00119 NM_003222 transcription factor AP-2 gamma
53 PARP6 6.01806 NM_020214 poly(ADP-ribose) polymerase family member 6

54 LYPD3 6.08819 NM_014400 LY6/PLAUR domain containing 3
55 SDC4 6.10572 NM_002999 syndecan 4
56 KRT80 6.13331 NM_182507 keratin 80
57 C3 6.13873 NM_000064 complement C3
58 CCDC71L 6.23363 NM_175884 coiled-coil domain containing 71 like
59 ATXN1 6.25357 NM_001128164 ataxin 1
60 SPINT2 6.35806 NM_021102 serine peptidase inhibitor, Kunitz type 2
61 KRT8 6.3764 NM_002273 keratin 8
62 CLDN4 6.38374 NM_001305 claudin 4
63 NDRG1 6.42796 NM_006096 N-myc downstream regulated 1
64 KLHL35 6.44791 NM_001039548 kelch like family member 35
65 FBXO32 6.45804 NM_058229 F-box protein 32
66 VAMP8 6.48632 NM_003761 vesicle associated membrane protein 8
67 GALNT14 6.55655 NM_024572 polypeptide N-acetylgalactosaminyltransferase 14
68 PDLIM1 6.61464 NM_020992 PDZ and LIM domain 1
69 MT2A 6.62229 NM_005953 metallothionein 2A
70 ALCAM 6.67232 NM_001627 activated leukocyte cell adhesion molecule
71 PXDC1 6.78658 NM_183373 PX domain containing 1
72 EHD2 6.83716 NM_014601 EH domain containing 2
73 RAB32 6.83873 NM_006834 RAB32, member RAS oncogene family
74 ANXA3 6.84819 NM_005139 annexin A3
75 S100A2 6.85 NM_005978 S100 calcium binding protein A2
76 MISP 6.86417 NM_173481 mitotic spindle positioning
77 OXCT1 7.02394 NM_000436 3-oxoacid CoA-transferase 1
78 SH2D3A 7.1495 NM_005490 SH2 domain containing 3A
79 CD24 7.22612 NM_001291738 CD24 molecule
80 ANOS1 7.26907 NM_000216 anosmin 1
81 EPCAM 7.27376 NM_002354 epithelial cell adhesion molecule
82 COL5A1 7.32714 NM_000093 collagen type V alpha 1 chain
83 MET 7.33169 NM_000245 MET proto-oncogene, receptor tyrosine kinase
84 BACE2 7.37386 NM_012105 beta-secretase 2
85 LOXL2 7.38807 NM_002318 lysyl oxidase like 2
86 TNFRSF1A 7.39981 NM_001065 TNF receptor superfamily member 1A
87 TAGLN2 7.41691 NM_003564 transgelin 2
88 WNT7B 7.43571 NM_058238 Wnt family member 7B
89 ITPR3 7.52253 NM_002224 inositol 1,4,5-trisphosphate receptor type 3
90 FERMT1 7.57487 NM_017671 fermitin family member 1
91 SLC16A3 7.6219 NM_004207 solute carrier family 16 member 3
92 COL17A1 7.63787 NM_000494 collagen type XVII alpha 1 chain
93 LAMB1 7.81292 NM_002291 laminin subunit beta 1
94 MALL 7.92452 NM_005434 mal, T cell differentiation protein like
95 EPS8L2 8.06273 NM_022772 EPS8 like 2
96 CAV2 8.08102 NM_001206747 caveolin 2
97 CAPG 8.12387 NM_001256139 capping actin protein, gelsolin like
98 SFN 8.24478 NM_006142 stratifin
99 LAD1 8.30396 NM_005558 ladinin 1
100 SEMA5A 8.31911 NM_003966 semaphorin 5A
101 TACSTD2 8.31919 NM_002353 tumor associated calcium signal transducer 2
102 CDH1 8.69269 NM_004360 cadherin 1
103 S100A11 9.0997 NM_005620 S100 calcium binding protein A11
104 PROM2 9.12778 NM_001165978 prominin 2
105 CD44 9.22106 NM_001001389 CD44 molecule (Indian blood group)
106 SERPINE1 9.22973 NM_000602 serpin family E member 1
107 FOSL1 9.2825 NM_005438 FOS like 1, AP-1 transcription factor subunit
108 GPRC5A 9.31245 NM_003979 G protein-coupled receptor class C group 5 member A

Serial number | Gene | log2FC | Refseq ID | Gene Name
109 | KRT17 | 10.7611 | NM_000422 | keratin 17
110 | IGFBP5 | 10.8985 | NM_000599 | insulin like growth factor binding protein 5
111 | GPSM3 | 11.3503 | NM_001276501_2 | G protein signaling modulator 3
112 | ENO3 | 12.0713 | NM_001976 | enolase 3
113 | DENND2C | 14.0075 | CUFF.177.3 | DENN domain containing 2C
114 | TBCEL | 14.1926 | NM_152715 | tubulin folding cofactor E like
115 | CFAP69 | 14.2498 | NM_001039706 | cilia and flagella associated protein 69
116 | CDK18 | 14.4553 | NM_002596 | cyclin dependent kinase 18
117 | SSPN | 14.5149 | NM_005086 | sarcospan
118 | GK | 14.6057 | NM_001128127 | glycerol kinase
119 | SLC43A1 | 14.6985 | NM_003627 | solute carrier family 43 member 1
120 | RAB40C | 14.6994 | NM_021168 | RAB40C, member RAS oncogene family
121 | TOX2 | 14.7922 | NM_001098797 | TOX high mobility group box family member 2
122 | GEN1 | 15.1593 | NM_001130009 | GEN1, Holliday junction 5’ flap endonuclease
123 | FAM110A | 15.2252 | NM_001289147 | family with sequence similarity 110 member A
124 | IL11 | 15.9497 | NM_000641 | interleukin 11
125 | KLF11 | 16.103 | NM_003597 | Kruppel like factor 11
126 | RALGAPB | 16.1099 | NM_020336 | Ral GTPase activating protein non-catalytic beta subunit
127 | TGFB1I1 | 16.4686 | NM_001042454 | transforming growth factor beta 1 induced transcript 1
128 | PLEKHG4 | 16.516 | NM_001129727 | pleckstrin homology and RhoGEF domain containing G4
129 | SLC37A2 | 16.533 | NM_198277 | solute carrier family 37 member 2
130 | SH3D21 | 17.2078 | NM_001162530 | SH3 domain containing 21
131 | NMNAT2 | 17.5671 | NM_170706 | nicotinamide nucleotide adenylyltransferase 2
132 | FAS | 17.7143 | NM_000043 | Fas cell surface death receptor
133 | ELAVL2 | 18.1065 | NM_001171195 | ELAV like RNA binding protein 2
134 | THSD4 | 20.3555 | NM_024817 | thrombospondin type 1 domain containing 4
135 | TES | 21.3157 | NM_015641 | testin LIM domain protein
136 | IGF2BP1 | -18.636 | NM_006546 | insulin like growth factor 2 mRNA binding protein 1
137 | ZNF497 | -18.2374 | NM_001207009 | zinc finger protein 497
138 | MYADM | -18.1125 | NM_138373 | myeloid associated differentiation marker
139 | MDM1 | -17.7052 | NM_001205028 | Mdm1 nuclear protein
140 | MAGED1 | -17.4254 | NM_001005332 | MAGE family member D1
141 | NIPA1 | -17.0712 | NM_001142275 | NIPA magnesium transporter 1
142 | ULK3 | -16.9925 | NM_001284364 | unc-51 like kinase 3
143 | RAPGEFL1 | -16.4682 | NM_001303533 | Rap guanine nucleotide exchange factor like 1
144 | EPC1 | -16.0655 | NM_001272019 | enhancer of polycomb homolog 1
145 | FAM169A | -15.9567 | NM_015566 | family with sequence similarity 169 member A
146 | FGFR3 | -15.955 | NM_000142 | fibroblast growth factor receptor 3
147 | RBAK | -15.917 | NM_001204456 | RB associated KRAB zinc finger
148 | FARP1 | -15.7709 | CUFF.955.3 | FERM, ARH/RhoGEF and pleckstrin domain protein 1
149 | NUMBL | -15.6942 | NM_001289980 | NUMB like, endocytic adaptor protein
150 | SMAD1 | -15.5389 | NM_001003688 | SMAD family member 1
151 | SLC9B2 | -15.4344 | NM_178833 | solute carrier family 9 member B2
152 | BVES | -15.2076 | NM_007073 | blood vessel epicardial substance
153 | MYH14 | -14.8458 | NM_024729 | myosin heavy chain 14
154 | BTBD9 | -14.8413 | NM_152733 | BTB domain containing 9
155 | CDYL | -14.8256 | NM_001143971 | chromodomain Y like
156 | CDIP1 | -14.7894 | NM_001199056 | cell death inducing p53 target 1
157 | NR1D2 | -14.7096 | NM_001145425 | nuclear receptor subfamily 1 group D member 2
158 | SLC22A23 | -14.6942 | NM_001286455 | solute carrier family 22 member 23
159 | PPP1R12B | -14.6401 | NM_001197131 | protein phosphatase 1 regulatory subunit 12B

Table 3. (continued)

Serial number | Gene | log2FC | Refseq ID | Gene Name
160 | HAUS3 | -14.5968 | NM_024511 | HAUS augmin like complex subunit 3
161 | ENO3 | -14.5787 | NM_053013 | enolase 3
162 | SCAF8 | -14.5496 | NM_001286194 | SR-related CTD associated factor 8
163 | LPXN | -14.5253 | NM_004811 | leupaxin
164 | CTDP1 | -14.4868 | NM_001202504_1 | CTD phosphatase subunit 1
165 | DMXL2 | -14.4778 | CUFF.1139.4 | Dmx like 2
166 | RCAN1 | -14.4163 | NM_001285389 | regulator of calcineurin 1
167 | ST3GAL5 | -14.366 | NM_001042437 | ST3 beta-galactoside alpha-2,3-sialyltransferase 5
168 | CAB39L | -14.3157 | NM_001079670 | calcium binding protein 39 like
169 | VANGL1 | -14.1838 | NM_001172412 | VANGL planar cell polarity protein 1
170 | FAM227A | -14.1605 | NM_001291030 | family with sequence similarity 227 member A
171 | DICER1 | -14.0689 | NM_001291628 | dicer 1, ribonuclease III
172 | KLF11 | -14.0448 | NM_001177716 | Kruppel like factor 11
173 | TPCN1 | -13.9105 | NM_001301214 | two pore segment channel 1
174 | MPP1 | -13.8877 | NM_002436 | membrane palmitoylated protein 1
175 | ZNF234 | -13.7007 | NM_001144824 | zinc finger protein 234
176 | MIRLET7BHG | -13.3325 | NR_027033 | MIRLET7B host gene
177 | SH2B3 | -13.3161 | NM_001291424 | SH2B adaptor protein 3
178 | RASSF8 | -12.9103 | NM_001164748 | Ras association domain family member 8
179 | TOX2 | -12.5758 | NM_001098796 | TOX high mobility group box family member 2
180 | GJA1 | -10.0172 | NM_000165 | gap junction protein alpha 1
181 | COL5A2 | -8.4367 | NM_000393 | collagen type V alpha 2 chain
182 | ZNF608 | -7.78192 | NM_020747 | zinc finger protein 608
183 | SAMD5 | -7.35078 | NM_001030060 | sterile alpha motif domain containing 5
184 | ID2 | -7.28839 | NM_002166 | inhibitor of DNA binding 2
185 | EGR1 | -6.69874 | NM_001964 | early growth response 1
186 | ANKRD1 | -6.65172 | NM_014391 | ankyrin repeat domain 1
187 | FKBP10 | -6.31386 | NM_021939 | FK506 binding protein 10
188 | TBX1 | -6.30501 | NM_080647 | T-box 1
189 | ATP6V0E2 | -6.01696 | NM_145230 | ATPase H+ transporting V0 subunit e2
190 | TOX | -5.83629 | NM_014729 | thymocyte selection associated high mobility group box
191 | PALM3 | -5.82064 | NM_001145028 | paralemmin 3
192 | RPSAP58 | -5.81012 | NR_003662 | ribosomal protein SA pseudogene 58
193 | FGFBP3 | -5.66274 | NM_152429 | fibroblast growth factor binding protein 3
194 | CENPV | -5.62202 | NM_181716 | centromere protein V
195 | CRMP1 | -5.61241 | NM_001313 | collapsin response mediator protein 1
196 | UNC13A | -5.56359 | NM_001080421 | unc-13 homolog A
197 | ASPHD1 | -5.55324 | NM_181718 | aspartate beta-hydroxylase domain containing 1
198 | LRFN1 | -5.54764 | NM_020862 | leucine rich repeat and fibronectin type III domain containing 1
199 | PRRT2 | -5.48063 | NM_001256443 | proline rich transmembrane protein 2
200 | PRDX2 | -5.39134 | NM_005809 | peroxiredoxin 2
201 | NR2F1 | -5.36001 | NM_005654 | nuclear receptor subfamily 2 group F member 1
202 | CHD5 | -5.22914 | NM_015557 | chromodomain helicase DNA binding protein 5
203 | SKIDA1 | -5.20958 | NM_207371 | SKI/DACH domain containing 1
204 | DOCK3 | -5.20111 | NM_004947 | dedicator of cytokinesis 3
205 | EFR3B | -5.18173 | NM_014971 | EFR3 homolog B
206 | TERT | -5.12728 | NM_198253 | telomerase reverse transcriptase
207 | SCCPDH | -5.07884 | NM_016002 | saccharopine dehydrogenase (putative)
208 | MTRNR2L9 | -5.07431 | NM_001190706 | MT-RNR2 like 9
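As a reading aid for Table 3, the sketch below converts a log2FC value into a signed linear fold change and assigns an UP/DOWN/BASELINE label. It is an illustration only: the function names, the sign convention for downregulated genes, and the cut-off of |log2FC| ≥ 1 (a two-fold change) are assumptions made here, not the pipeline used in the study. The three example rows are copied from Table 3.

```python
def signed_fold_change(log2fc: float) -> float:
    """Convert log2FC to a signed linear fold change.

    Assumed convention: +2**x for x >= 0 and -(2**|x|) for x < 0,
    so a two-fold decrease is reported as -2 rather than 0.5.
    """
    magnitude = 2.0 ** abs(log2fc)
    return magnitude if log2fc >= 0 else -magnitude


def direction(log2fc: float, threshold: float = 1.0) -> str:
    """Label a gene UP, DOWN or BASELINE using an assumed |log2FC| cut-off."""
    if log2fc >= threshold:
        return "UP"
    if log2fc <= -threshold:
        return "DOWN"
    return "BASELINE"


# (gene, log2FC) pairs copied from Table 3
rows = [("CDH1", 8.69269), ("MET", 7.33169), ("TERT", -5.12728)]

for gene, log2fc in rows:
    print(f"{gene}\t{log2fc:+.5f}\t{signed_fold_change(log2fc):+.1f}\t{direction(log2fc)}")
```

Under these assumptions a log2FC of 1 is a two-fold increase and a log2FC of -1 a two-fold decrease, so every row listed above (|log2FC| greater than 5) corresponds to a change of more than 32-fold.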

Figure 3. Gene Ontology and pathway enrichment analysis for significantly up- and downregulated genes was performed using the DAVID Functional Annotation Tool (v 6.7) with p-value ≤ 0.05. Pie-donut charts were plotted for the top 5 GO terms and KEGG pathways using the Highcharts online tool for (A) C1 vs. K1, (B) C2 vs. K2 and (C) C3 vs. K3.
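Enrichment p-values of the kind thresholded in Figure 3 come from an over-representation test. DAVID reports a modified Fisher's exact p-value (the EASE score); the sketch below shows only the plain one-sided hypergeometric tail that such tests build on, so its numbers will not match DAVID's output. The function name and the gene counts in the example are hypothetical.

```python
from math import comb


def enrichment_p(hits: int, list_size: int, annotated: int, background: int) -> float:
    """One-sided hypergeometric tail P(X >= hits).

    hits       -- genes in the submitted list that carry the annotation
    list_size  -- total genes in the submitted list
    annotated  -- genes in the background that carry the annotation
    background -- total genes in the background
    """
    upper = min(list_size, annotated)
    total = comb(background, list_size)
    favourable = sum(
        comb(annotated, i) * comb(background - annotated, list_size - i)
        for i in range(hits, upper + 1)
    )
    return favourable / total


# Hypothetical counts: 3 of 300 list genes fall in a term that
# covers 40 of 30000 background genes.
p = enrichment_p(hits=3, list_size=300, annotated=40, background=30000)
print(f"p = {p:.4f}, passes p <= 0.05: {p <= 0.05}")
```

Exact integer arithmetic via `math.comb` is used so that the sketch needs no third-party packages; a statistics library would normally be preferred for large gene sets.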

Figure 4. Biological network analysis of (A) C1 vs. K1, (B) C2 vs. K2 and (C) C3 vs. K3. Statistical scores from differential expression and biological analysis were used as attributes to visualize the network. In the gene regulatory figures, gene nodes are colored according to their log2FC (red and orange show upregulation, green shows downregulation and yellow shows baseline expression), and processes are colored in blue.

cell marker; however, recent reports suggest its role in malignancy. In syngeneic mouse models of oral squamous cell carcinoma, treatment with anti-CD44 mAbs by Near Infrared Photo Immunotherapy (NIR-PIT) is an effective therapy against oral cancer (Nagaya et al. 2017). Thus, CD44 is a potential anti-cancer target. From RT-PCR data we observed that CD44 showed a similar pattern of expression to LOXL2. CD44 is downregulated in CaSki compared to C33A. MiR-214 overexpression reduced it below basal level while miR-214 knockout partially restored it in CaSki. In C33A, overexpression marginally increased CD44 expression

Figure 5. Quantitative real-time PCR validation of gene expression for genes obtained from NGS data (TNFAIP3, RAB25, MET, CYP1B1, NDRG1, CD24, LOXL2, CD44, PMS2, LATS1 and MDM1) for C1 vs. K1, C2 vs. K2 and C3 vs. K3. The mRNA fold change values of the genes are plotted for each comparison. Statistical significance was determined by unpaired t-test (ns p > 0.05, *p ≤ 0.05, **p ≤ 0.01, ***p ≤ 0.001, ****p ≤ 0.0001).

while knockout significantly increased CD44 expression. Post meiotic segregation increased 2 (PMS2) protein is a member of the DNA mismatch repair (MMR) group of proteins. In ovarian cancer, PMS2 is required for cisplatin-induced apoptosis and arrest of the cell cycle at the G2/M boundary (Jia et al. 2016). In our RT-PCR studies, PMS2 was downregulated significantly upon miR-214 overexpression in both C33A and CaSki, the downregulation being greater in CaSki. Knockout of miR-214 in CaSki resulted in partial restoration of PMS2 levels. It is interesting that CaSki, being HPV-positive, downregulated the tumor suppressor more than HPV-negative C33A. However, miR-214 seems to oppose PMS2 expression in both cell types. Large tumor suppressor kinase 1 (LATS1) plays an important role in cancer. Studies with cervical cancer patients showed that LATS1 is frequently downregulated, while overexpression of LATS1 inhibits cervical cancer cell proliferation and invasion (Deng et al. 2017). In our studies, LATS1 is downregulated to a similar extent in C33A and CaSki. MiR-214 overexpression or knockout did not seem to have much effect on LATS1, as it was similarly downregulated in both cases. Mouse double minute 1 (MDM1) is a microtubule-binding protein, which primarily localizes to the centrioles of dividing cells (Van de Mark et al. 2015). There has been no report of MDM1 being associated with any type of cancer. Our studies show for the first time that MDM1 is downregulated in cervical cancer. Therefore, MDM1 represents a new biomarker for cervical cancer.

In this article we illustrated for the first time that miR-214 can modulate gene expression in HPV-negative and HPV-positive cervical cancer cells, C33A and CaSki respectively. MiR-214 overexpression/knockout altered genes involved in metabolism, cell cycle, DNA repair, apoptosis and cell migration, as well as transcription factors. The gene regulatory network patterns which emerge upon knockout and overexpression of miR-214, respectively, suggest that cellular pathways and genes are activated at different levels, which will give rise to different cell responses. Several genes identified by this transcriptome analysis have a known role in other types of cancer; therefore, these genes might also be playing a critical role in cervical cancer. These genes could be used as potential biomarkers to evaluate disease prognosis and predisposition to future development of cervical cancer. We also identified a

Table 4. A comparative list of common genes with opposite expression, based on their log2FC values, between (A) C1 vs K1 and C2 vs K2, (B) C1 vs K1 and C3 vs K3 and (C) C2 vs K2 and C3 vs K3.

A

S.No | Gene Name | Expression level in C1 vs K1 | Expression level in C2 vs K2
1 | NR1D2 | UP | DOWN
2 | DUSP6 | DOWN | BASELINE
3 | PIK3R3 | DOWN | BASELINE
4 | FOXO1 | DOWN | BASELINE
5 | ID3 | DOWN | BASELINE
6 | TBX1 | BASELINE | DOWN
7 | FGFR3 | BASELINE | DOWN
8 | SMAD1 | BASELINE | DOWN
9 | FGF1 | UP | BASELINE
10 | IGFBP5 | UP | BASELINE
11 | LOX | UP | BASELINE
12 | SERPINE1 | UP | BASELINE
13 | CDKN1A | BASELINE | UP
14 | PCK2 | BASELINE | UP
15 | COL1A1 | BASELINE | UP
16 | NKX3-1 | BASELINE | UP
17 | EFNB2 | BASELINE | UP
18 | ZNF703 | BASELINE | UP
19 | ZC3H12A | BASELINE | UP
20 | OSGIN1 | UP | BASELINE
21 | DKK1 | UP | BASELINE
22 | LYN | UP | BASELINE
23 | SEMA7A | UP | BASELINE
24 | CDK6 | UP | BASELINE
25 | FLNC | UP | BASELINE

B

S.No | Gene Name | Expression level in C1 vs K1 | Expression level in C3 vs K3
1 | TBX1 | BASELINE | DOWN
2 | FGFR3 | BASELINE | DOWN
3 | SMAD1 | BASELINE | DOWN
4 | LATS1 | DOWN | UP
5 | CALM1 | UP | BASELINE
6 | IGFB5 | BASELINE | UP
7 | ZNF703 | BASELINE | UP
8 | ZC3H12A | BASELINE | UP
9 | NKX3-1 | BASELINE | UP
10 | IRS2 | BASELINE | UP
11 | COL1A1 | BASELINE | UP
12 | EFNB2 | BASELINE | UP

C

S.No | Gene Name | Expression level in C3 vs K3 | Expression level in C2 vs K2
1 | SEMA7A | UP | BASELINE
2 | CDK6 | UP | BASELINE
3 | FGF1 | BASELINE | UP
4 | LATS1 | UP | DOWN
5 | FOXO1 | DOWN | BASELINE
6 | DUSP6 | DOWN | BASELINE
7 | PIK3R3 | DOWN | BASELINE
8 | NR1D2 | DOWN | DOWN
9 | CDKN1A | BASELINE | UP

Table 4. (continued)

C

S.No | Gene Name | Expression level in C3 vs K3 | Expression level in C2 vs K2
10 | SERPINE1 | UP | BASELINE
11 | LOX | UP | BASELINE
12 | FOXF2 | UP | BASELINE
13 | OSGIN1 | UP | BASELINE
14 | JUNB | UP | BASELINE
15 | LYN | UP | BASELINE
16 | SOX7 | UP | BASELINE
17 | FLNC | UP | BASELINE

new biomarker, MDM1, which has not been reported earlier in any other cancer.

Acknowledgements

The funding from SERB, India, is duly acknowledged.

References

Abba ML, Patil N, Leupold JH, Moniuszko M, Utikal J, Niklinski J and Allgayer H 2017 MicroRNAs as novel targets and tools in cancer therapy. Cancer Lett. 387 84–94
Agorastos T, Chatzistamatiou K, Katsamagkas T, Koliopoulos G, Daponte A, Constantinidis T and Constantinidis TC 2015 Primary screening for cervical cancer based on high-risk human papillomavirus (HPV) detection and HPV 16 and HPV 18 genotyping, in comparison to cytology. PLoS ONE 10 e0119755
Berdasco M and Esteller M 2017 Crosstalk between non-coding RNAs and the epigenome in development. Chromatin Regul. Dyn. https://doi.org/10.1016/B978-0-12-803395-1.00009-5
Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA and Jemal A 2018 Global cancer statistics GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. Cancer J. Clin. 68 394–424
Byron SA, Van Keuren-Jensen KR, Engelthaler DM, Carpten JD and Craig DW 2016 Translating RNA sequencing into clinical diagnostics: opportunities and challenges. Nat. Rev. Genet. 17 257–271
Chang I, Mitsui Y, Kim SK, et al. 2017 Cytochrome P450 1B1 inhibition suppresses tumorigenicity of prostate cancer via caspase-1 activation. Oncotarget 8 39087–39100
Deng J, Zhang W, Liu S, An H, Tan L and Ma L 2017 LATS1 suppresses proliferation and invasion of cervical cancer. Mol. Med. Rep. 15 1654–1660
Du B, Liu M, Li C, Geng X, Zhang X, Ning D and Liu M 2019 The potential role of TNFAIP3 in malignant transformation of gastric carcinoma. Pathol. Res. Pract. 215 152471
Frediani JN and Fabbri M 2016 Essential role of miRNAs in orchestrating the biology of the tumor microenvironment. Mol. Cancer 15 42
Ghittoni R, Accardi R, Hasan U, Gheit T, Sylla B and Tommasino M 2010 The biological properties of E6 and E7 oncoproteins from human papillomaviruses. Virus Genes 40 1–13
Goodman A 2015 HPV testing as a screen for cervical cancer. BMJ 350 h2372
Hopman AHN and Ramaekers FCS 2017 Development of the uterine cervix and its implications for the pathogenesis of cervical cancer; In Pathology of the Cervix. Essentials of Diagnostic Gynecological Pathology (volume 3) (eds) Herrington C (Springer, Cham) pp 1–20
Imam N 2016 Computational analysis of human cancer related RNA-Seq data: A review. J. Appl. Comput. 1 30–37
Jaskowiak PA, Costa IG and Campello RJGB 2018 Clustering of RNA-Seq samples: Comparison study on cancer data. Methods 132 42–49
Jeong H, Lim KM, Kim KH, et al. 2019 Loss of Rab25 promotes the development of skin squamous cell carcinoma through the dysregulation of integrin trafficking. J. Pathol. 249 227–240
Jia J, Wang Z, Cai J and Zhang Y 2016 PMS2 expression in epithelial ovarian cancer is posttranslationally regulated by Akt and essential for platinum-induced apoptosis. Tumor Biol. 37 3059–3069
Kawai S, Fujii T, Kukimoto I, et al. 2018 Identification of miRNAs in cervical mucus as a novel diagnostic marker for cervical neoplasia. Sci. Rep. 8 7070
Kim J and Bang H 2016 Three common misuses of P-values. Dental Hypotheses 7 73
Li H, Zhang H, Lu G, Li Q, Gu J, Song Y, Gao S and Ding Y 2016 Mechanism analysis of colorectal cancer according to the microRNA expression profile. Oncol. Lett. 12 2329–2336
Mao Y, Shen J, Lu Y, et al. 2017 RNA sequencing analyses reveal novel differentially expressed genes and pathways in pancreatic cancer. Oncotarget 8 42537–42547
Van de Mark D, Kong D, Loncarek J and Stearns T 2015 MDM1 is a microtubule-binding protein that negatively regulates centriole duplication. Mol. Biol. Cell 26 3788–802
Mi L, Zhu F, Yang X, et al. 2017 The metastatic suppressor NDRG1 inhibits EMT, migration and invasion through interaction and promotion of caveolin-1 ubiquitylation in human colorectal cancer cells. Oncogene 36 4323–4335
Nagaya T, Nakamura Y, Okuyama S, Ogata F, Maruoka Y, Choyke PL, Allen C and Kobayashi H 2017 Syngeneic mouse models of oral cancer are effectively targeted by anti-CD44-based NIR-PIT. Mol. Cancer Res. 15 1667–1677
Pei Z, Zhu G, Huo X, et al. 2016 CD24 promotes the proliferation and inhibits the apoptosis of cervical cancer cells in vitro. Oncol. Rep. 35 1593–1601
Penna E, Orso F and Taverna D 2015 miR-214 as a key hub that controls cancer networks: small player, multiple functions. J. Invest. Dermatol. 135 960–9
Refaat T, Donnelly ED, Sachdev S, et al. 2017 c-Met overexpression in cervical cancer, a prognostic factor and a potential molecular therapeutic target. Am. J. Clin. Oncol. 40 590–597
Rupaimoole R and Slack FJ 2017 MicroRNA therapeutics: towards a new era for the management of cancer and other diseases. Nat. Rev. Drug Discov. 16 203–222
Salvador F, Martin A, López-Menéndez C, et al. 2017 Lysyl oxidase-like protein LOXL2 promotes lung metastasis of breast cancer. Cancer Res. 77 5846–5859
Santos JMO, Peixoto da Silva S, Costa NR, Gil da Costa RM and Medeiros R 2018 The role of MicroRNAs in the metastatic process of high-risk HPV-induced cancers. Cancers 10 493
Sen P, Ganguly P and Ganguly N 2018 Modulation of DNA methylation by human papillomavirus E6 and E7 oncoproteins in cervical cancer. Oncol. Lett. 15 11–22
Sochor M, Basova P, Pesta M, Dusilkova N, Bartos J, Burda P, Pospisil V and Stopka T 2014 Oncogenic MicroRNAs: miR-155, miR-19a, miR-181b, and miR-24 enable monitoring of early breast cancer in serum. BMC Cancer 14 448
Tomasetti M, Amati M, Santarelli L and Neuzil J 2016 MicroRNA in metabolic re-programming and their role in tumorigenesis. Int. J. Mol. Sci. 17 754
Uhlen M, Zhang C, Lee S, et al. 2017 A pathology atlas of the human cancer transcriptome. Science 357 eaan2507

Editorial Responsibility: SAUMITRA DAS