Cheng et al. J Transl Med (2021) 19:18 https://doi.org/10.1186/s12967-020-02689-y Journal of Translational Medicine

RESEARCH Open Access Three hematologic/immune system‑specifc expressed are considered as the potential biomarkers for the diagnosis of early rheumatoid arthritis through bioinformatics analysis Qi Cheng1,2, Xin Chen1,2, Huaxiang Wu1* and Yan Du1*

Abstract Background: Rheumatoid arthritis (RA) is the most common chronic autoimmune connective tissue disease. How- ever, early RA is difcult to diagnose due to the lack of efective biomarkers. This study aimed to identify new biomark- ers and mechanisms for RA disease progression at the transcriptome level through a combination of microarray and bioinformatics analyses. Methods: Microarray datasets for synovial tissue in RA or osteoarthritis (OA) were downloaded from the Expres- sion Omnibus (GEO) database, and diferentially expressed genes (DEGs) were identifed by R software. Tissue/organ- specifc genes were recognized by BioGPS. Enrichment analyses were performed and –protein interaction (PPI) networks were constructed to understand the functions and enriched pathways of DEGs and to identify hub genes. Cytoscape was used to construct the co-expressed network and competitive endogenous RNA (ceRNA) networks. Biomarkers with high diagnostic value for the early diagnosis of RA were validated by GEO datasets. The ggpubr pack- age was used to perform statistical analyses with Student’s t-test. Results: A total of 275 DEGs were identifed between 16 RA samples and 10 OA samples from the datasets GSE77298 and GSE82107. Among these DEGs, 71 tissue/organ-specifc expressed genes were recognized. (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis indicated that DEGs are mostly enriched in immune response, immune-related biological process, immune system, and cytokine signal pathways. Fifteen hub genes and gene cluster modules were identifed by Cytoscape. Eight haematologic/immune system-specifc expressed hub genes were verifed by GEO datasets. GZMA, PRC1, and TTK may be potential biomarkers for diagno- sis of early RA. NEAT1-miR-212-3p/miR-132-3p/miR-129-5p-TTK, XIST-miR-25-3p/miR-129-5p-GZMA, and TTK_hsa_ circ_0077158- miR-212-3p/miR-132-3p/miR-129-5p-TTK might be potential RNA regulatory pathways to regulate the disease progression of early RA.

*Correspondence: [email protected]; [email protected] 1 Department of Rheumatology, the Second Afliated Hospital of Zhejiang University School of Medicine, 88 Jiefang Road, Hangzhou 310009, China Full list of author information is available at the end of the article

© The Author(s) 2021. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativeco​ ​ mmons.org/licen​ ses/by/4.0/​ . The Creative Commons Public Domain Dedication waiver (http://creativeco​ mmons​ .org/publi​ cdoma​ in/​ zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. Cheng et al. J Transl Med (2021) 19:18 Page 2 of 15

Conclusions: This work identifed three haematologic/immune system-specifc expressed genes, namely, GZMA, PRC1, and TTK, as potential biomarkers for the early diagnosis and treatment of RA and provided insight into the mechanisms of disease development in RA at the transcriptome level. In addition, we proposed that NEAT1-miR- 212-3p/miR-132-3p/miR-129-5p-TTK, XIST-miR-25-3p/miR-129-5p-GZMA, and TTK_hsa_circ_0077158-miR-212-3p/ miR-132-3p/miR-129-5p-TTK are potential RNA regulatory pathways that control disease progression in early RA. Keywords: Early rheumatoid arthritis, Tissue-specifc expressed genes, Biomarker, Microarray, Bioinformatics analysis, RNA regulatory pathways

Background organ-specific expressed genes by the online tool Rheumatoid arthritis (RA) is a common chronic autoim- BioGPS. Next, GO and KEGG enrichment analyses mune connective tissue disease that mainly involves the were performed by the Gene Set Enrichment Analysis joints. Te incidence of RA is 5 to 10 per 1000 people (GSEA) software, R software clusterProfiler package, [1]. With the progression of the disease and the continu- and online tool KEGG Orthology-Based Annotation ation of synovial infammation, the involved joint tissue System (KOBAS) 3.0. PPI network was constructed is gradually eroded. Eventually, RA leads to irreversible using the online tool STRING, and Cytoscape was damage to the joint, which is a very large burden on indi- used to identify cluster modules and hub genes related viduals and society. However, early diagnosis and treat- to RA. Then, target microRNAs (miRNAs) of selected ment of RA can efectively prevent disease progression, hub genes were predicted by five online miRNA data- joint damage, and other complications in 90% of patients bases, and a co-expressed network was constructed [2]. Terefore, the earlier a patient with RA is diagnosed, with Cytoscape. Subsequently, we validated the the less burden will be placed on the patient and soci- selected hub genes using GEO datasets, and ceRNA ety. At present, serum biomarkers used in the diagnosis networks were constructed based on prediction of established RA are rheumatoid factor and anti-cyclic results of long noncoding RNAs (lncRNAs) and circu- citrullinated peptide antibody [3]. However, early RA lar RNAs (circRNAs). This work provides insight into especially serum RF and anti-CCP antibody-negative is the mechanisms of disease development in RA at the difcult to diagnose due to the lack of efective biomark- transcriptome level and explores potential biomarkers ers. Studies have reported that some biomarkers, such as for the early diagnosis and treatment of RA. 14–3-3η autoantibodies and calprotectin, may be efec- tive in the diagnosis of early RA [4–7]. However, because Methods these biomarkers are not validated in prospective cohorts Microarray data acquisition or the clinical relevance of them are unclear, they have Te GEO database was used to obtain microarray data not been used in clinical diagnosis. Terefore, it is vital to for synovial tissue in RA or OA. Screening criteria identify new and efective biomarkers for the early diag- included the following: (1) Homo sapiens Expression nosis and treatment of RA. Profling by array; (2) synovial tissue of RA or OA from Currently, transcriptomic and microarray analyses joint synovial biopsies; (3) datasets contain more than have been widely used in a variety of diseases, including fve samples, (4) datasets contain complete information a variety of tumours and RA, to identify new biomarkers about the samples, (5) one biopsy sample per subject to improve diagnosis and treatment [8–12]. In addition, was analysed without replicates. Finally, two GPL570 competitive endogenous RNA (ceRNA) networks can datasets GSE77298 and GSE82107, which included 16 elucidate a new mechanism for promoting the develop- RA samples and 10 OA samples, were selected as test ment of the disease in a transcriptional regulatory net- sets; three GPL96 datasets GSE55584, GSE55457, and work [13]. Trough the combination of microarray and GSE55235, which included 33 RA samples and 26 OA bioinformatics analyses, it is possible to explore potential samples, and the GPL11154 GSE89408 dataset, which key genes and pathway networks that are closely related included 57 early RA samples, 95 established RA sam- to the development of diseases. ples and 22 OA samples, were selected as the validation In the present study, we first downloaded microar- sets (Table 1). ray datasets for synovial tissue in RA or OA from the GEO database. After pre-processing and normalizing Data normalization and identifcation of DEGs the data by the Robust Multiarray Average (RMA) Te original fles that were downloaded from the GEO method in R language, we identified DEGs based database were pre-processed and normalized by the on the screening criteria and obtained the tissue/ Robust Multiarray Average (RMA) method based on the Cheng et al. J Transl Med (2021) 19:18 Page 3 of 15

Table 1 Information for selected microarray datasets GEO accession Platform Samples Source tissue Age Sex (male/female) Attribute OA RA OA RA OA RA

GSE77298 GPL570 0 16 Synovium – – – – Test set GSE82107 GPL570 10 0 Synovium – – – – Test set GSE55584 GPL96 6 10 Synovium 73.7 7.1 54.9 12.9 0/6 3/7 Validation set ± ± GSE55457 GPL96 10 13 Synovium 72.4 5.9 60 20 2/8 3/10 Validation set ± ± GSE55235 GPL96 10 10 Synovium – – – – Validation set GSE89408 GPL11154 22 152 (57 early and Synovium 53.3 19.8 55.1 15 9/13 46/106 Validation set 95 established) ± ±

Annotation: GPL570: [HG-U133_Plus_2] Afymetrix U133 Plus 2.0 Array; GPL96: [HG-U133A] Afymetrix Human Genome U133A Array; GPL11154: Illumina HiSeq 2000 (Homo sapiens); DEGs diferentially expressed genes

R software (version 4.0.1) afy package. Te limma pack- KEGG pathway and Reactome enrichment analyses of age was used to conduct gene analysis of inter-sample DEGs. Te signifcant enriched functions and pathways diferences, and multiple hypothesis testing and cor- was selected with Q < 0.05. Te Q value is the adjusted p rection were conducted after p-value was obtained. Te value. threshold value of p-value was determined by controlling False Discovery Rate (FDR), and the corrected p-value Construction of the PPI network was adjusted p value (Q value)[14,15]. Te screening cri- Te PPI network was constructed based on all DEGs by teria were log2 (fold change) > 1 or < -1 and adjusted p the online tool STRING (https​://strin​g-db.org/) with a value (Q value) < 0.05. flter condition (combined score > 0.4). Next, we down- loaded the interaction information and optimized the Heatmap and volcano plot analyses PPI network with Cytoscape software (v3.8.0) for better To better visualize these DEGs, R software was used to visualization. Minimal Common Oncology Data Ele- make heatmaps and volcano plots. Heatmaps were made ments (MCODE) was used to identify signifcant gene with the pheatmap package. clusters and obtain cluster scores (flter criteria: degree cut-of = 2; node score cut-of = 0.2; k-core = 2; max Identifcation of tissue/organ‑specifc expressed genes depth = 100). CytoHubba was used to identify signif- To understand the tissue/organ-specifc expression of cant genes in this network as hub genes [20]. We used these DEGs, the online tool BioGPS (http://biogp​s.org/) fve algorithms, namely Degree, Maximal Clique Cen- was used to analyse the tissue distribution [16]. Te trality (MCC), Maximum Neighborhood Component screening criteria were as follows: (1) transcripts that (MNC), Density of Maximum Neighborhood Compo- mapped to a single organ system with an expression value nent (DMNC), and Clustering Coefcient, to calculate of > 10 multiples of the median, and 2() second-most- the top 30 hub genes [21,22]. Finally, all the results were abundant tissue’s expression was no more than a third as intersected to obtain the fnal hub genes. high [17]. Te genes obtained by these criteria were con- sidered to be tissue-specifc genes. Prediction of target miRNAs We used fve online miRNA databases, namely, RNA22, Enrichment analysis DIANA-micro T, miRWalk, miRDB, and miRcode, GSEA was used to assess the distribution trend of the to predict target miRNAs of hub genes and selected genes of a predefned set in the gene table to determine miRNAs that were found in at least four databases as their contribution to the phenotype [18]. We downloaded the target miRNAs. Te messenger RNA (mRNA)- GSEA_4.1.0 and c5: GO gene sets (c5.all.v7.1.symbols. miRNA co-expressed network based on the relationship gmt) for functional enrichment analyses. Te R soft- between mRNAs and miRNAs was constructed by using ware clusterProfler package was used to analyse the GO Cytoscape. enrichment of DEGs, and a chord plot was created for the visualization of these enrichment results. KOBAS 3.0 is Construction of ceRNA networks an online database widely used for gene/protein function StarBase (version 3.0) (http://starb​ase.sysu.edu.cn/ annotation and pathway enrichment (http://kobas​.cbi. index​.php) was used to predict lncRNAs and circR- pku.edu.cn/kobas​3) [19]. KOBAS 3.0 was used for the NAs that interacted with the selected miRNAs [23]. Cheng et al. J Transl Med (2021) 19:18 Page 4 of 15

Te intersections of the predicted results were used as OA samples, we identifed a total of 275 DEGs in the RA the target lncRNAs and circRNAs. CeRNA networks samples, which comprised 197 downregulated genes and based on the interactions among mRNAs, miRNAs, and 78 upregulated genes. Next, heatmap and volcano plot noncoding RNAs (ncRNAs) were constructed by using analyses were used to visualize these DEGs, which are Cytoscape. shown in Fig. 1a, b.

Statistics analysis Identifcation of the tissue/organ‑specifc expressed genes Te R software ggpubr package was used to perform A total of 71 tissue/organ-specifc expressed genes were statistical analyses, and the ggplot2 package was used to identifed by BioGPS (Table 2). We observed that most of draw boxplots. Student’s t-test was used to compare the these genes were specifcally expressed in the haemato- diferences between the two groups. IBM SPSS Statistics logic/immune system (35/71, 49.29%). Te second organ- 25 (SPSS, Inc., Chicago, IL, USA) was used to analyse the specifc expressed system was the nervous system, which data and draw the ROC curve. included 13 genes (13/71, 18.31%). Tis was followed by the digestive system (7/71, 9.86%), respiratory system Results (4/71, 5.63%), circulatory system (4/71, 5.63%), and pla- Identifcation of DEGs centa (3/71, 4.22%). Finally, the endocrine system, genital Te datasets GSE77298 and GSE82107, which included 16 RA samples and 10 OA samples, were selected to ana- lyse and identify the DEGs. Compared with genes in the

a b

4 Volcano plot of DEGs

2 0

0

−2

−4 alue) v −log10(Q 46 81 2 0

−3 −2 −1 0123

log2FC OA O RA 3 RA 7 RA11 RA15 RA 5 RA 8 RA14 RA16 OA O O OA OA OA OA OA RA 2 RA 1 RA 9 RA 6 RA13 RA12 RA 4 RA10 A9 A2 A3 10 6 8 7 4 5 1

Fig. 1 Identifcation of DEGs. a Heatmap of DEGs between the RA samples and the OA samples. Red rectangles represent high expression, and green rectangles represent low expression. b Volcano plot of DEGs between the RA samples and the OA samples. The red plots represent upregulated genes, the black plots represent nonsignifcant genes, and the green plots represent downregulated genes. Cheng et al. J Transl Med (2021) 19:18 Page 5 of 15

Table 2 Distribution of tissue/organ-specifc expressed genes identifed by BioGPS System/Organ Genes Counts

Haematologic/Immune Haematologic/Immune cells PLA2G7, SLC50A1, T, MSC, MATK, PRKCD, CCR7, CYB561A3, P2RY8, CD3G, EMR2, NOV, 32 BCL2A1, CD52, CD27, IL7R, TTK, MAP3K7CL, PNOC, FCGR1B, GZMB, GZMA, DLGAP5, TRBC1, MYOM2, CORO1A, PRC1, CEP55, CD3D, IER2, ITK, TNFRSF17 Immune organs CXCL13, LCK, CD163 3 Nervous PALM, KCND2, RASL10A, DACH1, STXBP1, DNM1, IL17D, PLP1, WRB, RCAN2, ZNF423, 13 LRRN4CL, LPHN3 Digestive GIPC2, AKR1B10, IGJ, ADAMDEC1, C6, TOX3, C15orf48 7 Respiratory CHAD, MFAP4, CLDN5, LAMP3 4 Circulatory ACTC1, CASQ2, LRRC2, CKMT2 4 Placenta PVRL3, RHOBTB1, AGTR1 3 Endocrine DUOX2 1 Genital MLF1 1 Others Tongue MAL 1 Prostate PPAP2A 1 Adipose HOXC10 1

system, and tongue, prostate, and adipose tissues had the receptor interaction, primary immunodefciency, JAK- lowest specifc expressed genes (1/71, 1.41%). STAT signalling pathway, Fc gamma R-mediated phago- cytosis, and neuroactive ligand-receptor interaction. Enrichment analysis Reactome enrichment analysis showed that DEGs were Te GSEA software, R software clusterProfler package, mostly enriched in the immune system and signal trans- and online tool KOBAS 3.0 were used for functional and duction. According to Q value < 0.05, we selected the top pathway enrichment analyses. First, we uploaded the fve KEGG pathways and the top fve Reactome terms expression profle of all genes in the RA and OA samples and showed them in a bubble plot (Fig. 3b). to GSEA, and the c5: GO gene set was used to perform the GO enrichment analysis of the expression profle at PPI network analysis, MCODE cluster modules and hub an overall level. Te screening criterion for signifcant gene identifcation gene sets was p < 0.05 and Q < 0.25. We observed that Te interaction network between coded by most of the enriched gene sets were related to the innate DEGs, which was comprised of 187 nodes and 307 immune cell-mediated immune response, immune- edges, was constructed by STRING and visualized related biological processes, and pathways (Fig. 2). by Cytoscape (Fig. 4a). Te MCODE plugin was used Next, the R software clusterProfler package and to identify gene cluster modules. In this network, we KOBAS 3.0 were used to perform GO, KEGG pathway, identifed four modules, which are shown in Fig. 4b-e, and Reactome enrichment analyses of DEGs, respec- according to the flter criteria. Cluster 1 had the highest tively. Te KEGG pathway is more comprehensive and cluster score (score: 9, 9 nodes and 36 edges), followed contains more genes; while the Reactome pathway has by cluster 2 (score: 5.167, 13 nodes and 31 edges), clus- more specifc functions and focuses more on biochemi- ter 3 (score: 3.333, 4 nodes and 5 edges), and cluster cal reactions[24]. We will observe the enrichment path- 4 (score: 2.8, 6 nodes and 7 edges). Next, we used the way of DEGs from multiple perspectives. GO enrichment cytoHubba plugin to identify hub genes. Fifteen hub analysis of DEGs also revealed that the immune response genes were identifed by intersecting the results from in RA samples was stronger than that in OA samples, the fve algorithms of cytohubba including Degree, and this included the regulation of humoral immune MCC, MNC, DMNC, and Clustering Coefcient [20]. response, complement activation, leukocyte activation, Tese hub genes with detailed information are shown and migration. Te top 10 biological processes were in Table 3. Tese genes are the most important genes selected based on a Q value < 0.05 and were drawn in a in PPI network and may play an important role in the chord plot (Fig. 3a). KEGG pathway enrichment analysis pathogenesis of RA. Additionally, GO and KEGG showed that DEGs were enriched in cytokine-cytokine enrichment analyses showed that DEGs were mainly Cheng et al. J Transl Med (2021) 19:18 Page 6 of 15

Table 3 15 hub genes identifed by fve algorithms of cytoHubba Gene symbol Description log2FC Q value Regulation

CXCL13 C–X–C motif chemokine ligand 13 2.846 0.012 Up CD52 CD52 molecule 1.928 0.004 Up GZMA Qranzyme A 1.753 0.022 Up CD27 CD27 molecule 1.419 0.004 Up CEP55 Centrosomal protein 55 1.264 0.01 Up SKA3 Spindle and kinetochore associated complex subunit 3 1.215 3.03E 07 Up − IL21R Interleukin 21 receptor 1.208 7E 04 Up − DLGAP5 DLG associated protein 5 1.172 0.027 Up PRC1 Protein regulator of cytokinesis 1 1.112 0.042 Up EOMES eomesodermi 1.108 0.031 Up TTK TTK protein kinas 1.065 0.006 Up CDCA8 Cell division cycle associated 8 1.049 1.74E 08 Up − UHRF1 Ubiquitin-like with PHD and ring fnger domains 1 1.048 0.016 Up KIAA0101 PCNA clamp associated factor 1.017 0.013 Up FNBP1L Formin binding protein 1 like 1.325 0.004 Down − Annotation: FC: fold change, Q value: adjust P-value. The gene symbol in bold indicates eight hematologic/immune system-specifc expressed hub genes enriched in immune-related biological processes and specifcally expressed hub genes. Te R software ggplot2 pathways. As the most common autoimmune disease, package and ggpubr package were used to draw boxplots a better understanding of immune-related mecha- and perform Student’s t-test statistical analyses. Consist- nisms in RA is an important part of current research. ent with our predictions, the mRNA expression levels Te discovery of genes specifcally expressed by the of the 8 specifcally expressed hub genes in the RA sam- immune system in RA synovium may contribute to ples were signifcantly increased compared with those the discovery of key targets in the pathogenesis of RA. in the OA samples (p < 0.01) (Fig. 6a, b). In addition, we Terefore, we intersected 15 hub genes and genes spe- observed that the mRNA expression levels of GZMA, cifcally expressed in the haematologic/immune system. PRC1, and TTK in the 57 early RA samples were signif- Ultimately, we obtained eight haematologic/immune cantly increased compared with those in the 95 estab- system-specifc expressed hub genes, includin–g CD52, lished RA samples (p < 0.05) (Fig. 6b). CD27, TTK, GZMA, DLGAP5, PRC1, CEP55, and CXCL13 (Table 3, in italics). ROC curve of the 8 specifcally expressed hub genes in early RA samples and established RA samples We used IBM SPSS Statistics 25 to analyse the 8 spe- Prediction of target miRNAs and construction cifically expressed hub genes expression profiles of OA of the co‑expressed network samples, early RA samples, and established RA sam- We used fve online miRNA databases to predict target ples and draw the ROC curves. Area under the curve miRNAs of hub genes. Finally, we obtained 95 target (AUC) is an indicator combining sensitivity and speci- miRNAs of 8 specifcally expressed hub genes and deter- ficity, which can describe the intrinsic effectiveness of mined 105 mRNA-miRNA pairs. According to the pre- diagnostic tests [25]. Compared to OA samples, these diction results, a co-expressed network of mRNAs and 8 specifically expressed hub genes have higher diag- miRNAs, which comprised 103 nodes and 105 edges, was nostic value both in the early RA samples and in the constructed by Cytoscape (Fig. 5). established RA samples. Among them, GZMA has the highest diagnostic value (AUC: 0.906) in the early RA Verifcation of the 8 specifcally expressed hub genes by 4 samples, while CXCL13 has the highest diagnostic datasets from the GEO database value (AUC: 0.900) in the established RA samples. The diagnostic value of other genes are follows: in early RA Tree GPL96 datasets, namely, GSE55584, GSE55457 samples, CXCL13 (AUC: 0.893), CD27 (AUC: 0.872), and GSE55235, which included 33 RA samples and 26 CD52 (AUC: 0.863), DLGAP5 (AUC: 0.810), PRC1 OA samples, and the GPL11154 GSE89408 dataset, (AUC: 0.809), CEP55 (AUC: 0.805), TTK (AUC: 0.793) which included 57 early RA samples, 95 established RA (Fig. 7a), while in established RA samples, GZMA samples and 22 OA samples were selected to verify the 8 Cheng et al. J Transl Med (2021) 19:18 Page 7 of 15

Fig. 2 GSEA plot showing most enriched immune-related gene sets in the RA group and OA group. The c5: GO gene set was used to perform the GO enrichment analysis of the expression profle at the overall level. a The most signifcant enriched immune-related gene set was regulation of natural killer cell mediated immunity (ES 0.666, NES 1.615, p < 0.05). b The second signifcant enriched immune-related gene set was positive = = regulation of natural killer cell medicated immunity (ES 0.729, NES 1.604, p < 0.05). c The third signifcant enriched immune-related gene set = = was negative regulation of cytokine production involved in immune response (ES 0.553, NES 1.603, p < 0.05). d The fourth signifcant enriched = = immune-related gene set was positive regulation of natural killer cell mediated cytotoxicity (ES 0.766, NES 1.600, p < 0.05). e The ffth signifcant = = enriched immune-related gene set was positive regulation of monocyte chemotaxis (ES 0.712, NES 1.510, p < 0.05). F. The sixth signifcant = = enriched immune-related gene set was positive regulation of antigen receptor mediated signaling pathway (ES 0.750, NES 1.489, p < 0.05). The = = screening criteria for signifcant gene sets were p < 0.05 and Q < 0.25. ES: enrichment score; NES: normalized enrichment score; Q: also named False Discovery Rates (FDR)or adjust p value

(AUC: 0.852), CD27 (AUC: 0.817), CD52 (AUC: 0.837), Prediction of target ncRNAs and construction of ceRNA DLGAP5 (AUC: 0.786), PRC1 (AUC: 0.703), CEP55 networks (AUC: 0.731), TTK (AUC: 0.726) (Fig. 7b). Due to miRNAs are well known to induce gene silencing and their good diagnostic performance in both early RA downregulate by binding mRNAs. How- and established RA, we combined with their expres- ever, its upstream molecules, circRNAs, and lncRNAs, sion levels in early RA and established RA to identify can afect the function of miRNA by combining miRNA better biomarkers. Compared with established RA, response elements, thus upregulating gene expression. GZMA, PRC1 and TTK were up-regulated in early RA Tis interaction between RNAs is called a ceRNA net- with statistical significance (Fig. 6b). Therefore, we work [13]. Next, we used the online database Starbase hypothesize that GZMA, PRC1 and TTK may be bio- 3.0 to predict the lncRNAs and circRNAs that interact markers for early diagnosis of RA based on our present with the selected miRNAs. Te screening criteria were: samples. mammalian, human h19 genome, strict stringency (> = 5) Cheng et al. J Transl Med (2021) 19:18 Page 8 of 15

a Top 10 biological processes of DEGs b KEGG pathway and Reatome enrichment CXCL13 SL

IL7 C AM IG IG CR7

R L L F V1−4 GZMB C S 7 L 1 log2FC -log10 (Qvalue) PLA2G7 A MF 4 T 3 R 8 6 BC IG HM 1 Immune System C D 5 CR 27 C T OR AM O1 4 IG A K PRKCD C Signal Transduction 3 LC K EO 2 MES IGLV −3 6−57 MSC Adaptive Immune System BTLA CCR5 DOCK8 Vesicle−mediated transport NDRG4 Category

HEY2 KEGG PATHWAY PROS1 Clathrin−mediated endocytosis Reactome JAM2

EDNRB

STXBP1 DISP1 Cytokine−cytokine receptor interaction X8 SO LP TS 2 GeneCount EFNB YOM2 M Primary immunodeficiency 10 SGCA 1 GR E H 3 20 CF R B F TG 30 D3 Jak−STAT signaling pathway SGCE 3 HO R F 5

EG M1 BMP YO 1 C6 M M2 C7 N

TC

ZFP PL AC Fc gamma R−mediated phagocytosis RBM2 4

regulation of humoral immune response striated muscle tissue development Neuroactive ligand−receptor interaction

muscle organ development positive regulation of leukocyte activation 0.02 0.04 0.06 0.08 GO Terms regulation of complement activation leukocyte migration Gene Ratio

lymphocyte mediated immunity muscle tissue development

complement activation positive regulation of cell activation Fig. 3 GO, KEGG pathway, and Reactome enrichment analyses of DEGs. a The chord plot showing the top 10 enriched biological processes of DEGs. Compared with OA, RA patients have stronger immune cells and complement system activation in the synovium, such as leukocyte and lymphocyte. b The bubble plot showing the most enriched KEGG and Reactome pathways of DEGs. The most signifcant KEGG pathways involved cytokines and their signaling pathways, while the most signifcant Reactome pathways were immune system and signal transduction. These indicate that the immune signal is more strongly activated in RA synovium, which is consistent with the clinical manifestations (such as arthritis). The screening criteria for signifcant enriched biological processes and pathways were Q < 0.05. The Q value is the adjusted p value

of CLIP-Data, and with or without degradome data. We disease, Sjogren’s syndrome, for further analysis. We pro- chose the ncRNAs that exist in most of the prediction pose that NEAT1-miR-212-3p/miR-132-3p/miR-129-5p- results of miRNAs as our predicted lncRNAs and circR- TTK (Fig. 8d) and XIST-miR-25-3p/miR-129-5p-GZMA NAs. In addition, since a transcript has multiple circRNA (Fig. 8e) might be potential RNA regulatory pathways to shear sites in the prediction results of the Starbase data- regulate the disease progression of early RA. Regarding base, we selected the circRNA with the most samples the prediction results of circRNAs, we found a circRNA and highest score in the circBase database as the target (TTK_hsa_circ_0077158) predicted by target miRNAs circRNA. Finally, we obtained 3 target lncRNAs and 4 of TTK, and its target is TTK. Hence, we propose the target circRNAs of the target miRNAs of PRC1; 1 target following circRNA-miRNA-mRNA pathway: TTK_ lncRNA and 19 target circRNAs of the target miRNAs hsa_circ_0077158-miR-212-3p/miR-132-3p/miR-129-5p- of GZMA; and 1 target lncRNA and 14 target circR- TTK (Fig. 8f); it might be a key regulatory pathway in the NAs of the target miRNAs of TTK. Tree ceRNA net- pathogenesis of early RA. works based on the prediction results were constructed and illustrated by Cytoscape and are shown in Fig. 8a-c. Discussion Subsequently, according to the ceRNA hypothesis, we Te main characteristic of RA is chronic synovial infam- conducted a literature search and selected four reported mation, which leads to erosion and damage of joints. downregulated miRNAs and an upregulated lncRNA in Early diagnosis and treatment of RA will efectively pre- RA and upregulated lncRNA in another autoimmune vent joint damage and improve quality of life. However, Cheng et al. J Transl Med (2021) 19:18 Page 9 of 15

CFH ZDHHC15 a NDRG4 PCDH19 PRSS35 ZNF711 SHE C6 SYPL2 CPNE4 BCHE FHOD3 C15orf61 SFRP2 GJB6 LIMCH1 PVRL3 GTSF1 C7 GULP1 LAX1 ZFPM2 QPCT GHR STXBP1 PRKCD T SHISA2 MAL MICU3 AMPH BTC SOX8 MATK FNBP1L ADAMTS3 LAMP3 EBF3 WIF1 HEY2 PLP1 TSLP CRTAM ZNF423 TGFBR3 LRCH2 IL17D DNM1 LCK BLOC1S4 SUSD5 CHAD OLA1 BAG3 TOX3 EFNB2 ASPA IER2 ITK IL7R CD3D EOMES GPC5 CD163 FGF13 P2RY8 IGJ HS3ST3A1 MLF1 IL21R LPHN3 KCNA3 IGSF10 CCR7 RCAN2 PROS1 FCGR1B CCR5 TPD52L1 AGTR1 CD3G MZB1 DOCK8 SLAMF8 EGR3 GZMB EGR1 CHRDL1 TCEAL2 GPRASP1 MSC EDNRB CD52 LRRC2 LRRN3 CD27 RASL10A SLAMF7 CXCL13 LYVE1 PLA2G7 KCNS3 CCDC85A BTLA TNFRSF17 ENPP5 ADAMDEC1 MLIP PNOC MRAP2 PCDHB7 GZMA MFAP4 NKX3-2 RERGL CORO1A TOB1 FCRL5 NR3C2 MYOM2 CKMT2 TUBAL3 AR CLIC5 MYOM1 FOXF1 REEP1 PCDHB16 SMAD9 SGCA SGCE KIAA0101 BMP5 PLN COLEC12 CASQ2 EMCN UHRF1 C2orf40 MYPN SDPR BCL2A1 DISP1 RBPMS2 DACH1 PRC1 UBE2C TTK ANO5 CEP55 PDLIM3 SCN4B NBEA FXYD1 PHYH DLGAP5 CLUL1 SKA3 DHRS9 NAP1L5 CALCRL CDCA8 RBM24 PPAP2A SCUBE2 CLDN5 OLFM3 KLHL13

ACTC1 KCND2 RHOBTB1 CYB561A3 CKB ASB9 BOD1 GPR84 JAM2 KHDRBS3 ASAP2 NAP1L2 SHISA6

b c d EFNB2 FNBP1L DNM1 e TTK UBE2C SLAMF7

AMPH CCR5

HEY2 EGR3 CEP55 PRC1 AGTR1 CD52

EOMES GZMA LCK CD3D DLGAP5 KIAA0101

IL21R CD3G IER2 FGF13

SKA3 CDCA8 BTLA GZMB UHRF1 CCR7 CD27 EGR1 cluster 1 score: 9cluster 2 score: 5.167 cluster 3 score: 3.333 cluster 4 score: 2.8 Fig. 4 PPI network of DEGs and four cluster modules extracted by MCODE. a The interaction network between proteins coded by DEGs was comprised of 187 nodes and 307 edges. Each node represents a protein, while each edge represents one protein–protein association. Red diamonds represent the upregulated genes, and green hexagons represent the downregulated genes. The smaller the value of Q is, the larger the shape size. Four cluster modules extracted by MCODE. Cluster 1 (b) had the highest cluster score (score: 9, 9 nodes and 36 edges), followed by cluster 2 (c) (score: 5.167, 13 nodes and 31 edges), cluster 3 (d) (score: 3.333, 4 nodes and 5 edges), and cluster 4 (e) (score: 2.8, 6 nodes and 7 edges)

early RA is difcult to diagnose due to the lack of efec- Fc gamma R-mediated phagocytosis, and neuroactive tive biomarkers. It is crucial to identify new and efective ligand-receptor interaction. Reactome enrichment analy- biomarkers for the early diagnosis and treatment of RA. sis also showed that DEGs were mostly enriched in the In our study, we identifed 275 DEGs, including 71 immune system and signal transduction. GO, KEGG and tissue/organ-specifc expressed genes, by comparing Reactome enrichment analysis all showed that RA syno- genes expressed in RA and OA samples. GO enrich- vial membrane had strong immune activation and signal ment analysis of all genes and DEGs indicated that the transduction, which was the main cause of RA synovial immune responses, such as the immune cell-medi- infammation, leading to arthritis and arthralgia. It is well ated immune response and the regulation of humoral known that arthritis and arthralgia are the main clinical immune response, were stronger in RA samples than manifestations of RA[1]. in OA samples. KEGG pathways that were enriched After the hub genes that were screened by the PPI included cytokine-cytokine receptor interaction, pri- network were validated using the GEO datasets, we mary immunodefciency, JAK-STAT signalling pathway, identifed eight haematologic/immune system-specifc Cheng et al. J Transl Med (2021) 19:18 Page 10 of 15

hsa-miR-6882-5p hsa-miR-26b-5p hsa-miR-455-3p

hsa-miR-196b-5p hsa-miR-92b-3p hsa-miR-26a-5p hsa-miR-214-5p hsa-miR-212-3p hsa-miR-25-3p

TTK CD52

hsa-miR-132-3p hsa-miR-92a-3p GZMA

hsa-miR-138-5p hsa-miR-196a-5p hsa-miR-129-5p

hsa-miR-6780b-5p hsa-miR-3174 hsa-miR-31-5p hsa-miR-9-5p

hsa-miR-1291 hsa-miR-183-5p hsa-miR-125b-5p hsa-miR-34c-5p hsa-miR-26b-3p hsa-miR-3665 hsa-miR-4725-3p hsa-miR-6768-5p hsa-miR-7854-3p CXCL13

hsa-miR-338-3p hsa-miR-7160-3p hsa-miR-4271 hsa-miR-205-5p hsa-miR-6510-5p DLGAP5

hsa-miR-365b-5p hsa-miR-9500 hsa-miR-193a-5p hsa-miR-627-3p hsa-miR-24-3p hsa-miR-19a-3p CD27 hsa-miR-6838-5p hsa-miR-424-5p hsa-miR-497-5p hsa-miR-130b-3p hsa-miR-3619-5p hsa-miR-507 hsa-miR-365a-5p hsa-miR-34a-5p hsa-miR-15a-5p hsa-miR-101-3p hsa-miR-128-3p hsa-miR-7156-3p hsa-miR-670-5p CEP55 hsa-miR-125a-5p hsa-miR-5698 hsa-miR-195-3p hsa-miR-16-5p hsa-miR-3666 hsa-miR-124-3p hsa-miR-3977 hsa-miR-4319 hsa-miR-4536-5p hsa-miR-629-5p hsa-miR-141-5p hsa-miR-6773-3p hsa-miR-15a-3p hsa-miR-4317 hsa-miR-15b-5p hsa-miR-182-5p hsa-miR-5000-3p hsa-miR-22-3p hsa-miR-301b-3p hsa-miR-4742-5p hsa-miR-2682-5p hsa-miR-449c-5p hsa-miR-130a-3p hsa-miR-211-3p hsa-miR-17-3p hsa-miR-449a PRC1 hsa-miR-490-3p hsa-miR-503-3p hsa-miR-214-3p

hsa-miR-3664-3p hsa-miR-16-1-3p hsa-miR-4451 hsa-miR-96-5p hsa-miR-4515

hsa-miR-671-5p hsa-miR-34b-5p hsa-miR-4723-5p hsa-miR-107

hsa-miR-194-5p hsa-miR-223-5p hsa-miR-93-3p

hsa-miR-503-5p hsa-miR-1185-1-3p

hsa-miR-7111-5p Fig. 5 A co-expressed network of mRNAs and target miRNAs. The mRNA-miRNA co-expressed network was constructed by Cytoscape including 103 nodes and 105 edges. PRC1 has the most target miRNAs (37), while CD52 has only 2 target miRNAs. One node represents a mRNA or miRNA, while one edge represents one interaction of mRNA and miRNA. Red diamonds represent the hub genes, and blue circles represent miRNAs

expressed genes. ROC curve analysis suggests that these the expression level of GZMA in OA patients, the expres- genes have high diagnostic value for both early RA and sion level of GZMA increases in plasma, synovial tissues, established RA. Combined with their expression levels and synovial membranes in patients with RA [28,29]. in early RA and established RA, GZMA, PRC1 and TTK Tis indicates that GZMA plays a signifcant role in the were up-regulated in early RA with statistical signif- pathogenesis of RA. Consistent with this research, our cance (p < 0.05). Terefore, we hypothesize that GZMA, study found that GZMA was upregulated in the synovial PRC1 and TTK may be biomarkers for early diagnosis of membrane of RA, especially in early RA. In addition, the RA based on our present samples. In addition, we con- ROC curve of GZMA indicated that it has a very high structed an mRNA-miRNA co-expression network and diagnostic value in early RA (AUC = 0.906). We consid- ceRNA networks to elucidate the pathogenesis of RA at ered GZMA a very efective biomarker for the diagnosis the transcriptome level. of early RA. GZMA, a member of the serine protease family, is PRC1 (also called ASE1), a human mitotic spindle- secreted by cytotoxic cells such as cytotoxic T cells and associated CDK substrate protein, is a key regulator of natural killer (NK) cells and plays an important role cell division [30]. According to BioGPS, PRC1 is specif- in cell death, cytokine processing, and infammation cally expressed in early erythrocytes, endotheliocytes, [26,27]. Several studies have reported that compared with and B lymphocytes. At present, PRC1 has not been Cheng et al. J Transl Med (2021) 19:18 Page 11 of 15

GSE55584&GSE55457&GSE55235

a 13 *** *** 9 ****** ****** 11

12

3

1

5

7 2

L

5

2 5 P

8 C 11 OA ( n=26 ) D

10 D

E X C

C 11

C C

f f

f f

o o

o

o RA ( n=33 )

n n

n n o

9 o

i i o

10 o

i i s

s 7 9

s s

s s

s s

e e

r r

e e

r r

p p

p p x

8 x 9

x x

E E

E E

6 7 7 8

9 ***

*** *** *** 7.0 5

P 10

A

A

1

K

M G

7 C

T

Z

L

R T

6.5

D G f

P

8

f f

o

f

o o

o

n

9

n n

o

n

i

o o

o

s

i i

i

s

s s

s

e

s s

s 6.0

r

e e

e

r r

6 p

r

x

p p

8 p

x x x

7 E

E E E

5.5

7 5

b GSE89408

ns ns ns 16 ns *** *** 15 ***

*** 3

1

2 5 7 *** OA ( n=22 )

L ***

5 5

2 ***

*** 10

P C

D

D

E X C

C 12

10 early_RA ( n=57 )

f

C C

f

o

f f

o

o

o 10

n

n

o

n n

o i

i established_RA ( n=95 )

o o

s

i i

s

s

s s

s

e

s s

e 8

r

r

e e

p

r r

p

x p 5 p

x 5

x x

E

E E E 5

4

0 0 0

ns * 12.5 * 12 * *** 12 * **

5 *** P

1 10.0 K *** *** A

A *** ***

C

T

M

G

T

R

Z 10

L

f

P

G D

10 o

f

f f 8 o

n 7.5

o o

o

n

i

n n

o

s

i

o o

s

i i

s

e

s s s

8 r

s s

e 5.0

p

r

e e

r r x

5 p

p p E

4 x

x x

E

E E 6 2.5

0 0 4 0.0 Fig. 6 Verifcation of the 8 specifcally expressed hub genes by 4 datasets of the GEO database. a Verifcation by three GPL96 datasets: GSE55584, GSE55457 and GSE55235. Compared with OA samples, all hub genes are upregulated in RA samples with signifcance. b Verifcation by the GPL11154 GSE89408 dataset. ***: p < 0.001, **: p < 0.01, *: p < 0.05, ns: no signifcant diference. Compared with OA samples, all hub genes are upregulated in RA samples with signifcance. Compared with established RA samples, GZMA, PRC1, and TTK were upregulated in early RA samples, while others have no signifcance

reported in RA-related studies. However, in our study, highly specifcally expressed in early red blood cells and PRC1 was upregulated in the synovial membrane of RA, endothelial cells. A study by H Ah-Kim et al. reported especially in early RA. Compared with OA, synovial that tumour necrosis factor-alpha (TNF-α) can increase infammation and hyperplasia are marked in RA. In addi- TTK expression in human articular chondrocytes [35], tion, it has been reported that the metabolic level of the suggesting that TTK is regulated by TNF-α in some synovial membrane is elevated, similar to that of tumour biological processes. We know that TNF-α plays a very tissue [31]. Tese results all refect the increased prolif- signifcant role in the pathogenesis of RA [36]. Tus, we eration of cells like synovial fbroblasts and macrophages hypothesized that TTK plays an important role in syno- in the synovial membrane of RA to some extent [32,33]. vial cell proliferation and TNF-α-mediated pathogenesis. Terefore, PRC1 may play an important role in the pro- In addition, we identifed that TTK was highly expressed liferation of synovial cells and the disease progression of in the synovial membrane of RA and has a high diagnos- RA. tic value in early RA (AUC = 0.793). We considered TTK TTK (also called MPS1 and CT96), which encodes a as a novel and efective biomarker for the diagnosis of dual specifcity protein kinase that phosphorylates a vari- early RA. ety of amino acids such as tyrosine and serine, is related Furthermore, target miRNAs and the target lncR- to cell proliferation [34]. Similar to PRC1, TTK is also NAs and circRNAs of these miRNAs were predicted Cheng et al. J Transl Med (2021) 19:18 Page 12 of 15

Fig. 7 ROC curve of the 8 specifcally expressed hub genes. a ROC curve of the 8 specifcally expressed hub genes in early RA samples. b ROC curve of the 8 specifcally expressed hub genes in established RA samples. AUC ​area under the ROC curve

for GZMA, PRC1, and TTK, and a ceRNA network downregulated miRNAs in RA for further analy- was constructed with Cytoscape. This network reveals sis. Among the target miRNAs of GZMA, PRC1, and the mechanism by which selected genes are regulated TTK, the expression of the following miRNAs was at the transcriptome level. According to the ceRNA downregulated in RA: miR-129-5p (in RA synovial hypothesis, we performed a literature search to select tissue and synovial fibroblasts), miR-132-3p (in RA Cheng et al. J Transl Med (2021) 19:18 Page 13 of 15

IQGAP1_hsa_circ_0000651 TTC17_hsa_circ_0021773 a b CLTC_hsa_circ_0107274 c FRS2_hsa_circ_0005578 mRNA hsa-miR-671-5p DST_hsa_circ_0008924 hsa-miR-503-5p hsa-miR-4319 CLIP2_hsa_circ_0002755 SERINC3_hsa_circ_0060491 CNIH_hsa_circ_0031985 miRNA IDH1_hsa_circ_0057967 KCNQ1OT1 hsa-miR-5000-3p hsa-miR-670-5p FBN1_hsa_circ_0008421 lncRNA hsa-miR-132-3p hsa-miR-212-3p BIRC6_hsa_circ_0053441 TTK_hsa_circ_0077158 BRWD1_hsa_circ_0001195 hsa-miR-196b-5p hsa-miR-25-3p circRNA hsa-miR-490-3p STAG3L3_hsa_circ_0080459 hsa-miR-96-5p PRPF8_hsa_circ_0041290

TTK TCONS_l2_00001808_hsa_circ_0003163 GZMA

hsa-miR-449c-5p GTF2I_hsa_circ_0006944 PRC1 CLIP2_hsa_circ_0002755 hsa-miR-107 FNIP1_hsa_circ_11228 NBPF10_hsa_circ_0013872 hsa-miR-196a-5p NBPF10_hsa_circ_0013872 hsa-miR-26b-5p hsa-miR-129-5p hsa-miR-129-5p

ELK4_hsa_circ_0000175 NEAT1 XIST hsa-miR-34b-5p RANBP2_hsa_circ_0006965 hsa-miR-125a-5p RGPD8_hsa_circ_0056134

XIST NEAT1 RGPD6_hsa_circ_0003024 RANBP2_hsa_circ_0006965 BCL9_hsa_circ_0110714 hsa-miR-92b-3p hsa-miR-92a-3p

hsa-miR-34a-5p hsa-miR-182-5p hsa-miR-26a-5p hsa-miR-455-3p ANAPC1_hsa_circ_0004439

GPR89C_hsa_circ_0008497 hsa-miR-24-3p hsa-miR-194-5p ANAPC1_hsa_circ_0004439 RGPD4_hsa_circ_0004542 RGPD6_hsa_circ_0003024 hsa-miR-22-3p NBPF9_hsa_circ_0002258 FNIP1_hsa_circ_11228 RGPD8_hsa_circ_0056134 LIMS1_hsa_circ_0008348 CLIP2_hsa_circ_0002755

NEAT1 deNEAT1 f TTK_hsa_circ_0077158

hsa-miR-129-5p hsa-miR-212-3p hsa-miR-129-5p hsa-miR-34a-5p hsa-miR-132-3p hsa-miR-212-3p hsa-miR-132-3p

TTK TTK PRC1

Fig. 8 Three ceRNA networks of PRC1, TTK, and GZMA and the potential RNA regulatory pathways. a ceRNA network of PRC1. b ceRNA network of TTK. c ceRNA network of GZMA. d NEAT1-miR-212-3p/miR-132-3p/miR-129-5p-TTK. e XIST-miR-25-3p/miR-129-5p-GZMA. f TTK_hsa_ circ_0077158-miR-212-3p/miR-132-3p/miR-129-5p-TTK. Red diamonds represent the hub genes, blue circles represent miRNAs, yellow triangle represents lncRNAs, and the V represents the circRNA

synovial fibroblasts), miR-212-3p (in RA synovial miR-129-5p-TTK; it might be a key regulatory path- tissue and synovial fibroblasts), and miR-25-3p (in way in the pathogenesis of early RA. Of course, in peripheral blood mononuclear cells) [37–40]. In addi- our study, the sample size for analysis and verifica- tion, it has been reported that the lncRNA NEAT1 is tion is relatively small. Therefore, future studies need upregulated in peripheral blood mononuclear cells to increase the sample size and conduct prospective of patients with RA [41]. Therefore, we propose that cohort studies to further confirm our views. NEAT1-miR-212-3p/miR-132-3p/miR-129-5p-TTK might be a potential RNA regulatory pathway to reg- Conclusions ulate the disease progression of early RA. Addition- Our work identifed three haematologic/immune ally, although lncRNA XIST has not been reported in system-specifc expressed genes, GZMA, PRC1, and RA, it has been reported to be upregulated in another TTK, as potential biomarkers for the early diagno- autoimmune disease, Sjogren’s syndrome [42]. We sis and treatment of RA and provided insight into hypothesize that XIST-miR-25-3p/miR-129-5p-GZMA the mechanisms of disease development in RA at has an important regulatory role in RA. Regarding the transcriptome level. In addition, we propose that the prediction results of circRNAs, we found a cir- NEAT1-miR-212-3p/miR-132-3p/miR-129-5p-TTK, cRNA (TTK_hsa_circ_0077158) predicted by tar- XIST-miR-25-3p/miR-129-5p-GZMA, and TTK_hsa_ get miRNAs of TTK, and its target was TTK. Hence, circ_0077158- miR-212-3p/miR-132-3p/miR-129- we proposed a circRNA-miRNA-mRNA pathway: 5p-TTK are potential RNA regulatory pathways that TTK_hsa_circ_0077158-miR-212-3p/miR-132-3p/ control disease progression in early RA. Cheng et al. J Transl Med (2021) 19:18 Page 14 of 15

Abbreviations 5. Jonsson MK, Sundlisæter NP, Nordal HH, Hammer HB, Aga AB, Olsen IC, RA: Rheumatoid arthritis; OA: Osteoarthritis; GEO: Gene Expression Omnibus; Brokstad KA, van der Heijde D, Kvien TK, Fevang BS, et al. Calprotectin as DEGs: Diferentially expressed genes; FDR: False Discovery Rate; GO: Gene a marker of infammation in patients with early rheumatoid arthritis. Ann Ontology; KEGG: Kyoto Encyclopedia of Genes and Genomes; ceRNA: Com- Rheum Dis. 2017;76:2031–7. petitive endogenous RNA; GSEA: Gene Set Enrichment Analysis; RMA: Robust 6. De Winter LM, Hansen WL, van Steenbergen HW, Geusens P, Lenaerts J, Multiarray Average; KOBAS: KEGG Orthology-Based Annotation System; PPI: Somers K, Stinissen P, van der Helm-van Mil AH, Somers V. Autoantibod- Protein–protein interaction; MCC: Maximal Clique Centrality; MNC: Maximum ies to two novel peptides in seronegative and early rheumatoid arthritis. Neighborhood Component; DMNC: Density of Maximum Neighborhood Rheumatology (Oxford). 2016;55:1431–6. Component; ROC: Receiver operating characteristic; AUC​: Area under the ROC 7. Dunaeva M, Blom J, Thurlings R, Pruijn GJM. Circulating serum miR- curve; ncRNAs: Noncoding RNAs. 223-3p and miR-16-5p as possible biomarkers of early rheumatoid arthritis. Clin Exp Immunol. 2018;193:376–85. Acknowledgements 8. Carr HL, Turner JD, Major T, Scheel-Toellner D, Filer A. New developments Not applicable. in transcriptomic analysis of synovial tissue. Front Med (Lausanne). 2020;7:21. Authors’ contributions 9. Li WC, Bai L, Xu Y, Chen H, Ma R, Hou WB, Xu RJ. Identifcation of dif- Conception and design of the work, QC and YD; acquisition and analysis of ferentially expressed genes in synovial tissue of rheumatoid arthritis and data, QC and XC; interpretation of data, QC, YD, and HW; writing and preparing osteoarthritis in patients. J Cell Biochem. 2019;120:4533–44. the original draft, QC; writing, reviewing and editing the paper, QC and YD; 10. Macías-Segura N, Castañeda-Delgado JE, Bastian Y, Santiago-Algarra D, supervision, YD and HW; project administration, YD and HW; and funding Castillo-Ortiz JD, Alemán-Navarro AL, Jaime-Sánchez E, Gomez-Moreno acquisition, YD. All authors have read and agreed to the published version of M, Saucedo-Toral CA, Lara-Ramírez EE, et al. Transcriptional signature the manuscript and to have agreed to both be personally accountable for associated with early rheumatoid arthritis and healthy individuals at high the author’s contributions and ensure to answer any questions related to the risk to develop the disease. PLoS ONE. 2018;13:e0194205. accuracy or integrity of any part of the work. All authors read and approved 11. Demircioğlu D, Cukuroglu E, Kindermans M, Nandi T, Calabrese C, Fon- the fnal manuscript. seca NA, Kahles A, Lehmann KV, Stegle O, Brazma A, et al. A Pan-cancer transcriptome analysis reveals pervasive regulation through alternative Funding promoters. Cell. 2019;178(1465–1477):e1417. The National Natural Science Foundation of China (No. 81501388) and the 12. Kaczkowski B, Tanaka Y, Kawaji H, Sandelin A, Andersson R, Itoh M, National Natural Science Foundation of Zhejiang Province (Nos. LY20H100007 Lassmann T, Hayashizaki Y, Carninci P, Forrest AR. Transcriptome analysis and LQ17H160007). of recurrently deregulated genes across multiple cancers identifes new pan-cancer biomarkers. Cancer Res. 2016;76:216–26. Availability of data and materials 13. Salmena L, Poliseno L, Tay Y, Kats L, Pandolf PP. A ceRNA hypothesis: the The [GSE datasets] data that support the fndings of this study are available Rosetta Stone of a hidden RNA language? Cell. 2011;146:353–8. in the GEO database (https​://www.ncbi.nlm.nih.gov/geo/) with the follow- 14. Benjamini Y, Hochberg Y. Controlling the false discovery rate - a ing data accession identifer(s): GSE77298, GSE82107, GSE55584, GSE55457, practical and powerful approach to multiple testing. J R Stat Soc B. GSE55235, and GSE89408. 1995;57:289–300. 15. Benjamini Y, Yekutieli D. The control of the false discovery rate in multiple Ethics approval and consent to participate testing under dependency. Ann Stat. 2001;29:1165–88. Not applicable. 16. Wu C, Orozco C, Boyer J, Leglise M, Goodale J, Batalov S, Hodge CL, Haase J, Janes J, Huss JW 3rd, Su AI. BioGPS: an extensible and customizable Consent for publication portal for querying and organizing gene annotation resources. Genome Not applicable. Biol. 2009;10:R130. 17. Wang H, Zhu H, Zhu W, Xu Y, Wang N, Han B, Song H, Qiao J. Bioinformatic Competing interests analysis identifes potential key genes in the pathogenesis of turner The authors declare no conficts of interest regarding this work. The corre- syndrome. Front Endocrinol (Lausanne). 2020;11:104. sponding authors have the right to and do speak on behalf of all authors. 18. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, Mesirov JP. Gene Author details set enrichment analysis: a knowledge-based approach for inter- 1 Department of Rheumatology, the Second Afliated Hospital of Zhejiang preting genome-wide expression profles. Proc Natl Acad Sci USA. University School of Medicine, 88 Jiefang Road, Hangzhou 310009, China. 2005;102:15545–50. 2 Department of Clinic Medicine, The Second Afliated Hospital of Zhejiang 19. Xie C, Mao X, Huang J, Ding Y, Wu J, Dong S, Kong L, Gao G, Li CY, Wei L. University School of Medicine, 88 Jiefang Road, Hangzhou 310009, China. KOBAS 2.0: a web server for annotation and identifcation of enriched pathways and diseases. Nucleic Acids Res. 2011;39:316–22. Received: 5 September 2020 Accepted: 22 December 2020 20. Chin CH, Chen SH, Wu HH, Ho CW, Ko MT, Lin CY. cytoHubba: identifying hub objects and sub-networks from complex interactome. BMC Syst Biol. 2014;8(Suppl 4):S11. 21. Luan H, Zhang C, Zhang T, He Y, Su Y, Zhou L. Identifcation of key prog- nostic biomarker and its correlation with immune infltrates in pancreatic References ductal adenocarcinoma. Dis Markers. 2020;2020:8825997. 1. Smolen JS, Aletaha D, McInnes IB. Rheumatoid arthritis. Lancet. 22. Yang X, Li Y, Lv R, Qian H, Chen X, Yang CF. Study on the multitarget 2016;388:2023–38. mechanism and key active ingredients of herba siegesbeckiae and vola- 2. Aletaha D, Smolen JS. Diagnosis and management of rheumatoid arthri- tile oil against rheumatoid arthritis based on network pharmacology. Evid tis: a review. JAMA. 2018;320:1360–72. Based Complement Alternat Med. 2019;2019:8957245. 3. Aletaha D, Neogi T, Silman AJ, Funovits J, Felson DT, Bingham CO 3rd, 23. Li JH, Liu S, Zhou H, Qu LH, Yang JH. starBase v2.0: decoding miRNA- Birnbaum NS, Burmester GR, Bykerk VP, Cohen MD, et al. 2010 Rheuma- ceRNA, miRNA-ncRNA and protein-RNA interaction networks from large- toid arthritis classifcation criteria: an American College of Rheumatology/ scale CLIP-Seq data. Nucleic Acids Res. 2014;42:D92-97. European League Against Rheumatism collaborative initiative. Arthritis 24. Croft D, O’Kelly G, Wu G, Haw R, Gillespie M, Matthews L, Caudy M, Rheum. 2010;62:2569–81. Garapati P, Gopinath G, Jassal B, et al. Reactome: a database of reactions, 4. Maksymowych WP, Boire G, van Schaardenburg D, Wichuk S, Turk S, Boers pathways and biological processes. Nucleic Acids Res. 2011;39:D691-697. M, Siminovitch KA, Bykerk V, Keystone E, Tak PP, et al. 14-3-3η Autoan- 25. Kumar R, Indrayan A. Receiver operating characteristic (ROC) curve for tibodies: Diagnostic Use in Early Rheumatoid Arthritis. J Rheumatol. medical researchers. Indian Pediatr. 2011;48:277–87. 2015;42:1587–94. Cheng et al. J Transl Med (2021) 19:18 Page 15 of 15

26. Anthony DA, Andrews DM, Watt SV, Trapani JA, Smyth MJ. Functional 36. Brennan FM, Maini RN, Feldmann M. TNF alpha–a pivotal role in rheuma- dissection of the granzyme family: cell death and infammation. Immunol toid arthritis? Br J Rheumatol. 1992;31:293–8. Rev. 2010;235:73–92. 37. Zhang Y, Yan N, Wang X, Chang Y, Wang Y. MiR-129–5p regulates cell 27. van Daalen KR, Reijneveld JF, Bovenschen N. Modulation of infammation proliferation and apoptosis via IGF-1R/Src/ERK/Egr-1 pathway in RA- by extracellular granzyme A. Front Immunol. 2020;11:931. fbroblast-like synoviocytes. Biosci Rep. 2019;39:1. 28. Tak PP, Spaeny-Dekking L, Kraan MC, Breedveld FC, Froelich CJ, Hack CE. 38. Tseng CC, Wu LY, Tsai WC, Ou TT, Wu CC, Sung WY, Kuo PL, Yen JH. Dif- The levels of soluble granzyme A and B are elevated in plasma and syno- ferential expression profles of the transcriptome and miRNA interactome vial fuid of patients with rheumatoid arthritis (RA). Clin Exp Immunol. in synovial fbroblasts of rheumatoid arthritis revealed by next generation 1999;116:366–70. sequencing. Diagnostics (Basel). 2019;9:1. 29. Kummer JA, Tak PP, Brinkman BM, van Tilborg AA, Kamp AM, Verweij CL, 39. Kurowska W, Kuca-Warnawin E, Radzikowska A, Jakubaszek M, Maślińska Daha MR, Meinders AE, Hack CE, Breedveld FC. Expression of granzymes M, Kwiatkowska B, Maśliński W. Monocyte-related biomarkers of rheuma- A and B in synovial tissue from patients with rheumatoid arthritis and toid arthritis development in undiferentiated arthritis patients - a pilot osteoarthritis. Clin Immunol Immunopathol. 1994;73:88–95. study. Reumatologia. 2018;56:10–6. 30. Jiang W, Jimenez G, Wells NJ, Hope TJ, Wahl GM, Hunter T, Fukunaga 40. Liu Y, Zhang XL, Li XF, Tang YC, Zhao X. miR-212-3p reduced proliferation, R. PRC1: a human mitotic spindle-associated CDK substrate protein and promoted apoptosis of fbroblast-like synoviocytes via down- required for cytokinesis. Mol Cell. 1998;2:877–85. regulating SOX5 in rheumatoid arthritis. Eur Rev Med Pharmacol Sci. 31. Falconer J, Murphy AN, Young SP, Clark AR, Tiziani S, Guma M, Buckley CD. 2018;22:461–71. Review: synovial cell metabolism and chronic infammation in rheuma- 41. Shui X, Chen S, Lin J, Kong J, Zhou C, Wu J. Knockdown of lncRNA NEAT1 toid arthritis. Arthritis Rheumatol. 2018;70:984–99. inhibits Th17/CD4( ) T cell diferentiation through reducing the STAT3 32. Veale DJ, Orr C, Fearon U. Cellular and molecular perspectives in rheuma- protein level. J Cell +Physiol. 2019;234:22477–84. toid arthritis. Semin Immunopathol. 2017;39:343–54. 42. Mougeot JL, Noll BD, Bahrani Mougeot FK. Sjögren’s syndrome X-chro- 33. Nygaard G, Firestein GS. Restoring synovial homeostasis in rheumatoid mosome dose efect: an epigenetic perspective. Oral Dis. 2019;25:372–84. arthritis by targeting fbroblast-like synoviocytes. Nat Rev Rheumatol. 2020;16:316–33. 34. Liu X, Winey M. The MPS1 family of protein kinases. Annu Rev Biochem. Publisher’s Note 2012;81:561–85. Springer Nature remains neutral with regard to jurisdictional claims in pub- 35. Ah-Kim H, Zhang X, Islam S, Sof JI, Glickberg Y, Malemud CJ, Moskowitz lished maps and institutional afliations. RW, Haqqi TM. Tumour necrosis factor alpha enhances the expression of hydroxyl lyase, cytoplasmic antiproteinase-2 and a dual specifcity kinase TTK in human chondrocyte-like cells. Cytokine. 2000;12:142–50.

Ready to submit your research ? Choose BMC and benefit from:

• fast, convenient online submission • thorough peer review by experienced researchers in your field • rapid publication on acceptance • support for research data, including large and complex data types • gold Open Access which fosters wider collaboration and increased citations • maximum visibility for your research: over 100M website views per year

At BMC, research is always in progress.

Learn more biomedcentral.com/submissions