The m6A-related signature for predicting the prognosis of breast cancer

Shanliang Zhong1,*, Zhenzhong Lin2,*, Huanwen Chen3, Ling Mao4, Jifeng Feng5 and Siying Zhou6

1 Center of Clinical Laboratory Science, The Affiliated Cancer Hospital of Nanjing Medical University & Jiangsu Cancer Hospital & Jiangsu Institute of Cancer Research, Nanjing, China 2 Department of Pathology, The Affiliated Cancer Hospital of Nanjing Medical University & Jiangsu Cancer Hospital & Jiangsu Institute of Cancer Research, Nanjing, China 3 Xinglin laboratory, The First Affiliated Hospital of Xiamen University, Nanjing, China 4 Department of Thyroid Breast Surgery, The Affiliated Huai’an Hospital of Xuzhou Medical University, Huai’an, China 5 Department of Medical Oncology, The Affiliated Cancer Hospital of Nanjing Medical University & Jiangsu Cancer Hospital & Jiangsu Institute of Cancer Research, Nanjing, China 6 Department of General Surgery, The First Affiliated Hospital of Soochow University, Suzhou, China * These authors contributed equally to this work.

ABSTRACT N6-methyladenosine (m6A) modification has been shown to participate in tumorigen- esis and metastasis of human cancers. The present study aimed to investigate the roles of m6A RNA methylation regulators in breast cancer. We used LASSO regression to identify m6A-related gene signature predicting breast cancer survival with the datasets downloaded from Omnibus and The Cancer Genome Atlas (TCGA). RNA-Seq data of 3409 breast cancer patients from GSE96058 and 1097 from TCGA were used in present study. A 10 m6A-related gene signature associated with prognosis was identified from 22 m6A RNA methylation regulators. The signature divided patients into low- and high-risk group. High-risk patients had a worse prognosis than the low-risk group. Further analyses indicated that IGF2BP1 may be a key m6A RNA methylation regulator in breast cancer. Survival analysis showed that IGF2BP1 is an Submitted 26 January 2021 independent prognostic factor of breast cancer, and higher expression level of IGF2BP1 Accepted 13 May 2021 is associated with shorter overall survival of breast cancer patients. In conclusion, we Published 4 June 2021 identified a 10 m6A-related gene signature associated with overall survival of breast Corresponding authors cancer. IGF2BP1 may be a key m6A RNA methylation regulator in breast cancer. Jifeng Feng, [email protected] Siying Zhou, zhousiy- [email protected] Subjects Bioinformatics, Oncology, Women’s Health, Computational Science Academic editor Keywords Breast cancer, Breast neoplasms, m6A modification, IGF2BP1, Prognosis Consolato Sergi Additional Information and Declarations can be found on BACKGROUND page 11 Breast cancer is the most common malignancy in women worldwide. Only in the United DOI 10.7717/peerj.11561 States, more than 250,000 new cases are diagnosed with breast cancer each year and 66,000 Copyright cases die from this malignancy (Siegel, Miller & Jemal, 2019). Besides environmental factors, 2021 Zhong et al. the genetic background has been shown to associate with development of many cancers Distributed under including breast cancer (Zhong et al., 2016). Although global gene expression patterns have Creative Commons CC-BY 4.0 been extensively studied in breast cancer, gene regulation at post-transcriptional level has OPEN ACCESS not yet been fully investigated (Niu et al., 2019).

How to cite this article Zhong S, Lin Z, Chen H, Mao L, Feng J, Zhou S. 2021. The m6A-related gene signature for predicting the progno- sis of breast cancer. PeerJ 9:e11561 http://doi.org/10.7717/peerj.11561 Several levels of post-transcriptional regulation can influence the gene expression in cells. For example, microRNAs (miRNAs) inhibit gene expression post-transcriptionally by either blocking translation or inducing degradation of messenger RNA (mRNA) targets (Zhong et al., 2013). Meanwhile, modifications such as the poly (A) tail and the 50 cap have marked influence on gene expression. Over 150 different RNA modifications have been observed in various RNA molecules, including mRNAs, transfer RNAs (tRNAs), ribosomal RNAs (rRNAs), small non-coding RNAs and long non-coding RNAs (lncRNAs) (Yang et al., 2018). These RNA molecules can experience methylation at different positions, such as N6-methyladenosine (m6A), N1-methyladenosine, 5-methylcytosine, pseudouridine, N7-methyladenosine, and 20-O-methylation (Chai et al., 2019). Of these modifications, m6A was first discovered in the 1970s and is the most prevalent modification in the mRNA (Desrosiers, Friderici & Rottman, 1974). Three classes of m6A regulatory enzymes are referred to as ‘writers’, ‘erasers’ and ‘readers’ (Heck & Wilusz, 2019). ‘writers’, such as RBM15/15B (Meyer & Jaffrey, 2017), METTL3 (Schumann, Shafik & Preiss, 2016), METTL14 (Liu et al., 2014), WTAP (Ping et al., 2014) and KIAA1429 (Schwartz et al., 2014) catalyze the formation of m6A; ‘erasers’, ALKBH5 (Zheng et al., 2013) and FTO (Jia et al., 2011), remove m6A from select transcripts; and ‘readers’ such as HNRNPA2B1 (Alarcon et al., 2015), YT521-B homology (YTH) domain-containing (YTHDF1, YTHDF2, YTHDF3, YTHDC1 and YTHDC2) (Liao, Sun & Xu, 2018), HNRNPC (Liu et al., 2015) and HNRNPG (Liu et al., 2017) recognize and generate functional signals. M6A RNA modifications participate in the tumorigenesis and metastasis of multiple malignancies by regulating RNA transcript, splicing, processing and translation (Chen, Zhang & Zhu, 2019). For example, Wu et al. (2019) found that reduced m6A may contribute to tumorigenesis and is associated with poor prognosis in breast cancer. In the present study, we comprehensively analyzed the expression of previous reported m6A RNA regulators in breast cancer using the data from Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA).

MATERIAL AND METHODS Selection of m6A RNA methylation regulators After searching published literature, we obtained a list of 22 m6A RNA methylation regulators: RBM15/15B (Meyer & Jaffrey, 2017), METTL3 (Schumann, Shafik & Preiss, 2016), METTL14 (Liu et al., 2014), METTL16 (Warda et al., 2017), WTAP (Ping et al., 2014), KIAA1429 (also known as VIRMA) (Schwartz et al., 2014), ALKBH5 (Zheng et al., 2013), FTO (Jia et al., 2011), IGF2BP1/2/3 (Huang et al., 2018b), HNRNPA2B1 (Alarcon et al., 2015), YTH domain-containing proteins (YTHDC1 YTHDC2, YTHDF1, YTHDF2 and YTHDF3) (Liao, Sun & Xu, 2018), HNRNPC (Liu et al., 2015), HNRNPG (also known as RBMX) (Liu et al., 2017), RBMX (Liu et al., 2017), FMR1 (Edupuganti et al., 2017), EIF3 (Meyer et al., 2015). Data acquisition We downloaded clinical data for 1097 female breast cancer patients as well as their level 3 RNA-Seq data (HTSeq-Counts) from TCGA (accessed December 2019) (search term:

Zhong et al. (2021), PeerJ, DOI 10.7717/peerj.11561 2/15 BRCA) (Cancer Genome Atlas, 2012; Tomczak, Czerwinska & Wiznerowicz, 2015). Dataset of GSE96058 for 3409 breast cancer patients was downloaded from GEO (accession number: GSE96058)(Brueffer et al., 2018). Data analysis All of the analyses in present study were done with R software (version 4.0.1). Because GSE96058 includes 3409 patients and detailed clinical information, we investigated the relationships between regulators and pathological features of breast cancer, including molecular classification [Normal, Luminal A (LumA), Luminal B (LumB), Her2 and Basal] and Nottingham histologic grade (grade2, grade3 and grade4) with GSE96058. R package of pheatmap (R package version 1.0.12) was used to plot heatmap. Box plot was drawn using ggpubr package (R package version 0.2.4). R package of ‘‘beeswarm’’ (R package version 0.2.3) was used to show the expression level of each gene in different groups. To analyze the RNA-Seq data from TCGA, raw read counts were normalized and the differentially expressed were compared between breast cancer tissues and adjacent normal tissues using DESeq2 (Love, Huber & Anders, 2014) (R package version 1.18.1). The data of breast cancer tissues or adjacent normal tissues were identified according to their TCGA barcode (https://docs.gdc.cancer.gov/Encyclopedia/pages/TCGA_Barcode/). Adjacent normal tissues were taken from greater than two cm from the tumor (Cancer Genome Atlas, 2012). We used the Benjamini & Hochberg method (Benjamini et al., 2001) to correct for multiple testing. Fold-change >2 and adjusted P value < 0.05 were used as selection criteria to identify differentially expressed genes between different groups. Then, the differentially expressed m6A RNA methylation regulators were selected from the differentially expressed genes. Most useful prognostic markers were selected from m6A RNA methylation regulators using LASSO (Tibshirani, 1996) penalized regression analysis. LASSO is a feature selection method to nullify the impact of irrelevant features. With increasing λ, LASSO shrinks all regression coefficients towards zero and sets the coefficients of irrelevant features exactly to zero. The optimal λ is selected as the λ that yields minimum cross validation error in 10-fold cross validation. GSE96058 was used as the training set, and TCGA was used as the test set. The risk score was calculated for each subject by sum of the product of expression level of each gene and the corresponding regression coefficients as in our previous study (Liu et al., 2019). We used the ‘‘glmnet’’ package (R package version 2.0-16) to conduct the LASSO analysis and a P value < 0.05 was considered statistically significant. In survival analyses, patients were separated into two groups according to the expression level of a gene or the risk scores, and the median was used as cut-off. Then, the log-rank test was used to assess the overall survival (OS) with the survival package (R package version 3.1-7). Hazard ratios (HRs) and their 95% confidence intervals (CIs) were calculated using Cox proportional hazards. We used forestplot package (R package version 3.4.3) to draw forest plot to show the HRs and 95% CIs of the genes. To identify downstream genes of m6A RNA methylation regulators, the correlations between m6A RNA methylation regulators and other genes were assessed with the RNA-Seq data from GSE96058 and TCGA using Spearman’s correlation. The Spearman’s rho was

Zhong et al. (2021), PeerJ, DOI 10.7717/peerj.11561 3/15 used to assess the degree of association between IGF2BP1 and its potential coexpressed genes. Only the coexpressed genes identified from the both datasets were kept. We used clusterProfiler package (Yu et al., 2012) (R package version 4.0.1) to carry out Kyoto Encyclopedia of Genes and Genomes (KEGG) and (GO) enrichment analyses. Benjamini & Hochberg method was applied to correct for multiple testing (Benjamini et al., 2001). An adjusted P value of less than 0.05 was considered statistically significant. A network of genes and enriched GO terms was constructed with Cytoscape software (version 3.8) (Shannon et al., 2003). GOplot package was utilized to illustrate the relationship between enriched KEGG pathways and genes.

RESULTS Expression of m6A RNA methylation regulators in breast cancer To uncover the biological functions of these m6A RNA methylation regulators, we explored the relationships between each regulator and pathological features of breast cancer, including molecular classification (Normal, LumA, LumB, Her2 and Basal) and Nottingham histologic grade (grade1, grade2 and grade3) with GSE96058. The expression level of each m6A RNA methylation regulator according to molecular classification is presented in Fig. 1A, showing that most m6A RNA methylation regulators had different expression levels in different molecular classifications. Similar results were found in different grades (Fig. 1B). We used beeswarm plots to show distributions of the regulators in different molecular classifications and grades (Figs. 1C, S1, 1D and S2). With the increase of malignancy of breast cancer, the expression level of several m6A RNA methylation regulators, such as FTO, HNRNPA2B1 YTHDC1, IGF2BP1, IGF2BP2 and IGF2BP3, decreased or increased (Fig. 1C). The expression level of these six m6A RNA methylation regulators also had a trend with grade from 1 to 3 (Fig. 1D). We further used TCGA RNA-Seq data to explore whether there is a difference in expression levels of m6A RNA methylation genes between adjacent normal tissues and breast cancer tissues. We identified 6168 differentially expressed genes after comparing 1097 breast cancer tissues with 113 adjacent normal tissues. Then, we identified 17 m6A RNA methylation regulators had an adjusted P value less than 0.05, including 2 (IGF2BP1 and IGF2BP3) with a fold change > 2 (Table S1). Figure 2 shows the distribution of expression levels of the 22 m6A RNA methylation genes in breast cancer tissues and adjacent normal tissues. Identification of an m6A-related gene signature associated with prognosis After analyzing the expression level of the 22 m6A RNA methylation regulators using LASSO regression analysis, a 10 m6A-related gene signature associated with OS was identified in breast cancer patients (Figs. 3A–3C). We calculated risk scores for all the patients. When we sorted the patients according to the risk score, it was easily noted that high-risk group had more deaths than low risk group (Figs. 3D–3E). The heatmap indicated that the expression level of the 10 m6A-related genes showed an increase or decrease trend from low risk score to high risk score (Fig. 3F). Then, we analyzed the over survival of breast cancer patients

Zhong et al. (2021), PeerJ, DOI 10.7717/peerj.11561 4/15 Figure 1 Expression of the 22 m6A RNA methylation regulators in GSE96058. (A) Heatmap of the rel- ative expression level of the 22 m6A RNA methylation regulators across all 3409 patients in GSE96058. Pa- tients are categorized as normal or by the molecular classification of their breast cancer Luminal A (Lum A), Luminal B (lum B), Her2 or basal. The 22 m6A methylation regulators are clustered into groups with similar expression behavior in the five different categories. The functional group of the m6A methylation regulator is also indicated. (B) Heatmap of the relative expression level of the 22 m6A RNA methylation regulators across all 3409 patients in GSE96058. Patients are categorized by grade. The 22 m6A methyla- tion regulators are clustered into groups with similar expression behavior in the three categories. (C) The relative expression level of 6 representative m6A RNA methylation regulators in different molecular sub- types of breast cancer. The expression level of the six regulators shows a decreased or increased trend with the increase of malignancy of breast cancer from normal to basal. (D) The relative expression levels of six representative m6A RNA methylation regulators in different grades of breast cancer. The expression level of the six regulators shows a decreased or increased trend with grade from 1 to 3. Full-size DOI: 10.7717/peerj.11561/fig-1

according to the risk score, and found that OS of high-risk patients was shorter OS than that of low risk patients (HR = 1.907, 95% CI [1.533–2.373]; Fig. 3G). Validation of the m6A-related gene signature using data from TCGA We validated the m6A-related gene signature using the data from TCGA. The results were similar (Fig. 4). When the patients were sorted by risk score and divided into two groups, we found that high-risk patients had more deaths and shorter OS than low-risk patients (Figs. 4A–4B). The expression level of the 10 m6A-related genes also showed a trend from low risk score to high risk score (Fig. 4C). The results of survival analysis indicated that high-risk patients had a HR of 1.406 (95%: 1.016−1.946; Fig. 4D).

Zhong et al. (2021), PeerJ, DOI 10.7717/peerj.11561 5/15 Adjacent normal tissues Breast cancer tissues

12

8

Expression 4

0 O AP F T A2B1 FMR1 W T EIF3A RBMX P VIRMA RBM15 METTL3 ALKBH5 YTHDF1 YTHDF2 YTHDF3 YTHDC1 YTHDC2 RBM15B IGF2BP1 IGF2BP2 IGF2BP3 HNRNPC METTL14 METTL16 HNRN

Figure 2 Expression level of the 22 m6A RNA methylation regulators in breast cancer tissues and ad- jacent normal tissues in TCGA. The box plot shows the distribution of normalized read count based on the five number summary: minimum, first quartile, median, third quartile, and maximum. The central rectangle spans the first quartile to the third quartile. The segment inside the rectangle shows the median and ‘‘whiskers’’ above and below the box show the locations of the minimum and maximum. The dots are outliers. Full-size DOI: 10.7717/peerj.11561/fig-2

M6A RNA methylation regulators associated with breast cancer survival We then evaluated the association between the expression levels of the m6A RNA methylation genes and breast cancer survival. Regarding GSE96058, 11 regulators had significant associations in univariate analysis, including 8 beneficial regulators and harmful regulators (Fig. 5A). However, after adjusted to age, molecular classification, grade, tumor size, lymph node status, endocrine treated, and chemotherapy treated, only YTHDC1, FMR1, IGF2BP1 and WTAP showed a significant association (Fig. 5A). With respect to TCGA, higher expression levels of YTHDF3, IGF2BP1 and KIAA1429 were associated with shorter OS of breast cancer patients in univariate analysis (Fig. 5B). After adjusted for age and stage, YTHDF3 and IGF2BP1 still had a significant association with OS (Fig. 5B). Combining the findings in Fig. 2, IGF2BP1 may be a key m6A RNA methylation regulator associated with OS of breast cancer patients. Therefore, we have further explored the potential mechanisms by which IGF2BP1 exerts its effect. Genes coexpressed with m6A RNA methylation regulators Totally, 182 coexpressed genes with a Spearman’s rho larger than 0.3 were identified for IGF2BP1 from GSE96058 and TCGA (Table S2). In GO enrichment analysis, genes were annotated to 3 ontologies: biological process (BP), molecular function (MF) and cellular component (CC). Figure 6A presents the coexpressed genes and the top 10 GO terms in BP, MF and CC. Collagen-containing extracellular matrix, endoplasmic reticulum lumen, and focal adhesion were the 3 most commonly assigned terms for the CC. Endopeptidase activity, extracellular matrix structural

Zhong et al. (2021), PeerJ, DOI 10.7717/peerj.11561 6/15 a 21 21 21 20 19 19 19 15 10 9 7 5 4 3 d 17.0 0.5 16.9 viance e 0.0 Risk score 16.8 elihood D −0.5 High Risk Low 16.7

tial Li k e a r 16.6 P 2500 ys) 16.5

−8 −7 −6 −5 −4 1500

Log (λ) al time (d a vi v

21 20 19 10 4 Alive Death b Su r 500 5 0 0.4 13 15 f YTHDC1 −2 RBM15B 0 0.2 10 WTAP 7 14

9 METTL3 2 3 18 METTL14 1

0.0 FTO 16 Coefficients 114 202 YTHDF1 178 RBMX 19 21 IGF2BP3 −0.2 22 IGF2BP1

6

−8 −7 −6 −5 −4 g λ + Log ( ) 1.00 + + ++++ c ++++++++++++++++++++++++++++++ +++++++++ +++++++++++++++++++++++ ++++++++++++++++++ +++++++++++++++++++ +++++++++++++++++ +++++++++++++++++++++++++++++++++++++ 0.28 +++++++++++++++ ++++++ 0.3 ++++++++++++++++++ 0.25 ++++++++++++++++++++++++ 0.23 ++++++++++++++++++ 0.75 0.2

0.1

AP 0.05 O 0.50 al probability YTHDC1 RBM15B W T METTL3 METTL14 F T 0.0 vi v −0.03 Su r + Low Risk Coefficients 0.25 High Risk −0.1 −0.08−0.08 RBMX + YTHDF1 −0.14 IGF2BP3 IGF2BP1 P<0.0001 −0.2 −0.21 0.00 −0.3 −0.27 0 500 1000 1500 2000 2500 Time (days)

Figure 3 LASSO Cox regression of the 22 m6A RNA methylation regulators using GSE96058. (A) Ten m6A RNA methylation regulators were selected by LASSO Cox regression analysis. The top x-axis shows the number of regulators with non-zero coefficients. The optimal λ is selected as the λ that yields mini- mum cross validation error in 10-fold cross validation. (B) LASSO coefficient profiles of the 22 m6A RNA methylation regulators. The top x-axis shows the number of regulators with a non-zero coefficient. With increasing λ, LASSO shrinks all regression coefficients towards zero and sets the coefficients of irrelevant features exactly to zero. Ten regulators with non-zero coefficients is selected at optimal λ. (C) Ten m6A RNA methylation regulators and their coefficients. (D) The distribution of the risk score for the patients. (E) Survival time and status of the patients in the high- and low-risk groups, as defined by the risk score. (F) Expression patterns of the 10 m6A RNA methylation regulators. (g) Patients with high risk score had shorter overall survival time than those with low risk score (HR = 1.907, 95% CI [1.533–2.373]). Dotted line in (D) and (E) represents the median of the risk score. The patients in (D), (E) and (F) were sorted by risk score in ascending order. Full-size DOI: 10.7717/peerj.11561/fig-3

constituent, and metalloendopeptidase activity were the most commonly assigned terms for the MF. The most commonly assigned GO terms in the BP were extracellular matrix organization, extracellular structure organization, and connective tissue development. The result of KEGG enrichment analysis is presented in Fig. 6B. The 19 coexpressed genes are enriched 3 pathways including proteoglycans in cancer, ECM–receptor interaction, and digestion and absorption.

Zhong et al. (2021), PeerJ, DOI 10.7717/peerj.11561 7/15 hn ta.(2021), al. et Zhong PeerJ 10.7717/peerj.11561 DOI , b a d c Vldto ftem the of Validation 4 Figure insi A,()ad()wr otdb iksoei sedn order. pa- ascending The in score. score risk risk the by of sorted median were the (HR (C) represents score and (B) risk (B) and low (A), (A) with in in those tients line than dotted time groups, The survival low-risk [1.016–1.946]). overall m and CI shorter 10 high- 95% had the the score of in risk patterns patients high Expression the with (C) of tients score. status risk and the time by Survival defined (B) as patients. the for score risk the METTL14 IGF2BP3 IGF2BP1 RBM15B YTHDC1 YTHDF1 METTL3

Survival probability RBMX Survival time (days)

WTAP Risk score FTO 0.00 0.25 0.50 0.75 1.00 0 2000 6000 −4 −3 −2 −1 0 + + + + + + + + + + 0 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + P=0.039 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + 2000 + + High Risk Low Risk + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + 6 + + + + + + + + -eae eesgaueuigdt rmTCGA. from data using signature gene A-related + + + + + + 4000 + + + + + + + + + + + + + + + + + + + 6000 + + Time (da + + 8000 + ys) +

+

2 0 −2 Full-size 6 Death Alive High Risk Low Risk N ehlto euaos D Pa- (D) regulators. methylation RNA A O:10.7717/peerj.11561/fig-4 DOI: A h itiuinof distribution The (A) = 1.406, 8/15 DISCUSSION In this study, we identified a 10 m6A-related gene signature associated with breast cancer OS using RNA-Seq data from GSE96058 and TCGA. The 10 m6A-related gene signature divided patients into low- and high-risk group, and survival time of high-risk patients was shorter than that of low-risk group. Then, we evaluated the expression level of previous reported m6A RNA genes in breast cancer. Our results showed that m6A RNA methylation regulators have different expression characteristics according to molecular classification and grade of breast cancer. IGF2BP1 may be a key m6A RNA methylation regulator associated with OS of breast cancer patients. We found m6A RNA methylation regulators showed different expression characteristics in different molecular classifications and grades of breast cancer. The expression level of HNRNPA2B1, IGF2BP1, IGF2BP2 and IGF2BP3 was shown to increase as malignancy increased, but FTO and YTHDC1 were reduced as malignancy increased. After comparing breast cancer tissues with adjacent normal tissues, we found IGF2BP1 and IGF2BP3 had a fold change >2 in breast cancer tissues. Although IGF2BP1-3 were expressed in a lower level than other m6A RNA methylation regulators, they had a higher level in more malignant cancer. In the survival analyses, we found that IGF2BP1 is an independent prognostic factor of breast cancer. IGF2BP1 may be a key m6A RNA methylation regulator associated with OS of breast cancer patients. The full name of IGF2BP1 is insulin-like growth factor-2 mRNA-binding protein 1, which expresses in more than 16 cancers and most fetal tissues but only in a limited number of normal adult tissues (Huang et al., 2018c). IGF2BP1 can act as m6A readers to regulate more than 3000 transcript targets (Dominissini et al., 2012). After the target mRNAs are methylated at the 30-UTR, IGF2BP1 can recognize it and inhibit the decay of m6A-RNAs under the co-effects of stabilizers such as ELAVL1 (Huang et al., 2018a). IGF2BP1 can also compete for the same m6A sites with other m6A readers such as YT521-B homology (YTH) domain containing proteins (YTHDFs). YTHDFs act as m6A readers and promote degradation of modified RNA (Huang et al., 2018c). Therefore, we used Spearman’s rank analysis to identify potential coexpression genes which may be regulated by IGF2BP1. The result showed that most genes had a positive correlation with IGF2BP1. IGF2BP1 has been shown to implicate in proliferation, migration, invasion, adhesion, and apoptosis of tumor cells (Bell et al., 2013). In the GO annotation, we found the coexpressed genes of IGF2BP1 were mainly enriched in the terms associated with extracellular matrix, adhesion, and collagen metabolism. KEGG enrichment analysis indicated that these coexpressed genes were enriched in ECM-receptor interaction, protein digestion and absorption, and proteoglycans in cancer. ECM-receptor interaction has been shown to play important roles in breast cancer (Bao et al., 2019). Protein digestion and absorption, and proteoglycans in cancer are two pathways related to extracellular matrix and microenvironment of tumor cells (Afratis et al., 2017). Based on the above results, we proposed that IGF2BP1 up-regulates the target genes’ expression, thus the destroying extracellular matrix, inhibiting apoptosis, and promoting migration, adhesion and proliferation, and finally promotes the

Zhong et al. (2021), PeerJ, DOI 10.7717/peerj.11561 9/15 a Genes P value HR (95% CI) Univariate P value HR (95% CI) Multivariate ALKBH5 0.06 0.82 (0.66−1.01) 0.21 0.87 (0.69−1.09) FTO 0.04 0.80 (0.65−0.99) 0.39 1.10 (0.88−1.38) HNRNPA2B1 0.78 1.03 (0.84−1.27) 0.37 0.90 (0.72−1.13) HNRNPC 0.05 0.81 (0.66−1.00) 0.15 0.85 (0.68−1.06) RBMX 0.35 1.10 (0.90−1.36) 0.69 1.05 (0.84−1.31) YTHDC1 0.00 0.62 (0.50−0.76) 0.02 0.76 (0.60−0.96) YTHDC2 0.16 0.86 (0.70−1.06) 0.94 1.01 (0.78−1.30) FMR1 0.46 0.92 (0.75−1.14) 0.01 0.76 (0.61−0.94) EIF3A 0.43 0.92 (0.75−1.13) 0.36 1.11 (0.89−1.38) YTHDF1 0.18 1.15 (0.93−1.42) 0.98 1.00 (0.80−1.25) YTHDF2 0.02 0.77 (0.63−0.96) 0.14 0.84 (0.68−1.05) YTHDF3 0.22 0.88 (0.71−1.08) 0.38 0.91 (0.73−1.13) IGF2BP1 0.00 1.61 (1.30−2.00) 0.01 1.36 (1.07−1.72) IGF2BP2 0.05 1.24 (1.00−1.52) 0.86 1.02 (0.80−1.30) IGF2BP3 0.00 1.58 (1.27−1.95) 0.80 1.03 (0.80−1.33) KIAA1429 0.13 0.85 (0.69−1.05) 0.91 1.01 (0.81−1.26) METTL14 0.00 0.66 (0.53−0.82) 0.61 0.94 (0.73−1.20) METTL16 0.01 0.77 (0.62−0.95) 0.99 1.00 (0.79−1.26) METTL3 0.01 0.77 (0.62−0.94) 0.12 0.84 (0.67−1.05) RBM15 0.51 0.93 (0.76−1.15) 0.40 0.91 (0.73−1.14) WTAP 0.01 0.74 (0.60−0.92) 0.02 0.77 (0.62−0.96) RBM15B 0.00 0.65 (0.52−0.80) 0.36 0.90 (0.72−1.13)

0.5 1 1.5 2 0.5 1 1.5 HR (95% CI) HR (95% CI) b Genes P value HR (95% CI) Univariate P value HR (95% CI) Multivariate ALKBH5 0.97 1.01 (0.73−1.39) 0.43 0.87 (0.62−1.23) FTO 0.67 1.07 (0.77−1.48) 0.44 1.15 (0.81−1.61) HNRNPA2B1 0.80 0.96 (0.69−1.33) 0.78 1.05 (0.75−1.47) HNRNPC 0.56 0.91 (0.66−1.25) 0.96 1.01 (0.72−1.41) RBMX 0.66 0.93 (0.67−1.29) 1.00 1.00 (0.71−1.40) YTHDC1 0.44 0.88 (0.64−1.22) 0.51 0.89 (0.64−1.25) YTHDC2 0.76 1.05 (0.76−1.46) 0.74 1.06 (0.75−1.49) FMR1 0.97 0.99 (0.72−1.37) 0.13 1.31 (0.92−1.86) EIF3A 0.35 0.86 (0.62−1.18) 0.44 0.88 (0.63−1.23) YTHDF1 0.59 1.09 (0.79−1.51) 0.76 1.05 (0.75−1.48) YTHDF2 0.71 0.94 (0.68−1.30) 0.31 1.19 (0.85−1.68) YTHDF3 0.01 1.51 (1.09−2.10) 0.02 1.51 (1.07−2.13) IGF2BP1 0.02 1.47 (1.06−2.03) 0.00 1.65 (1.17−2.33) IGF2BP2 0.71 0.94 (0.68−1.30) 0.34 1.18 (0.84−1.65) IGF2BP3 0.65 1.08 (0.78−1.49) 0.07 1.37 (0.97−1.92) KIAA1429 0.01 1.51 (1.09−2.10) 0.07 1.38 (0.97−1.95) METTL14 0.60 1.09 (0.79−1.51) 0.89 0.98 (0.70−1.37) METTL16 0.20 1.24 (0.89−1.71) 0.56 1.11 (0.79−1.56) METTL3 0.51 0.90 (0.65−1.24) 0.08 0.74 (0.52−1.04) RBM15 0.56 0.91 (0.66−1.26) 0.51 1.12 (0.80−1.57) WTAP 0.37 1.16 (0.84−1.61) 0.33 1.18 (0.84−1.67) RBM15B 0.06 0.73 (0.53−1.02) 0.18 0.79 (0.56−1.12)

0.5 1 1.5 2 0.5 1 1.5 2 2.5 HR (95% CI) HR (95% CI)

Figure 5 Survival analysis of the 22 m6A RNA methylation regulators in breast cancer patients. (A) Survival analysis was conducted using GSE96058. (B) Survival analysis was conducted using TCGA. Full-size DOI: 10.7717/peerj.11561/fig-5

progression of breast cancer. Therefore, IGF2BP1 may serve as a target for breast cancer treatment. In conclusion, a 10 m6A-related gene signature associated with OS was identified in breast cancer patients. The signature divided patients into low- and high-risk group. High-risk patients had shorter survival time than low-risk group. The 10 m6A-related gene signature may be used in clinical practice to predict the survival of breast cancer patients in the future. We then systematically analyzed the expression of 22 m6A RNA regulators in breast cancer and identified that IGF2BP1 may be a key m6A RNA methylation regulator associated with OS of breast cancer patients. Further studies are needed to validate the 10 m6A-related gene signature in larger samples and confirm the roles of IGF2BP1 in breast cancer using experimental methods.

Zhong et al. (2021), PeerJ, DOI 10.7717/peerj.11561 10/15 COL10A1 MMP13 ITGA11 CALU SDC1 COL11A1 a NKX3-2 ADAMTS7 b SULF1 HOXA11 COL12A1 COL22A1 INHBA CDH2 COL22A1 MMP1 COL10A1 PLAUR fibrillar collagen trimer COL5A1 banded collagen fibril cell-substrate junction MMP14 FN1 focal adhesion basement membrane LOXL2 COL11A1 endoplasmic COL5A2 lysosomal lumen reticulum lumen ARSB PLAU collagen-containing collagen trimer SERPINH1 APH1B extracellular matrix CTSL COLGALT1 TAPT1 complex of collagen arylsulfatase activity trimers NOX4 FN1 CTSL

collagen catabolic collagen binding ADAM19 COL5A2 process IBSP ANTXR1 ACTR3 endodermal cell proteoglycan binding differentiation CHST11 COL5A1 ITGA11

metallopeptidase extracellular structure BMP1 MMP8 activity organization

UCHL1 ITGA5 NID2 extracellular matrix structural constituent collagen fibril conferring tensile organization CTSB ADAMTS12 strength MMP9 sulfuric ester extracellular matrix ARPC2 ADAMTS2 hydrolase activity organization U ARSI PLA COL8A1 metalloendopeptidase connective tissue activity development CORO1C SERPINB7 UR endoderm PLA integrin binding development GPC6 SRPX2 endopeptidase activity endoderm formation TGFBI SDC1 MMP9 extracellular matrix formation of primary collagen metabolic structural constituent germ layer process WNT2 CTHRC1 IBSP SLC36A1 SPP1 SEMA7A SPP1 Proteoglycans in cancer PLOD2 Cellular Component LIMS1 THBS2 THBS2 WNT2 Spearman's rho LRRC15 TLL2 Spearman's rho ECM−receptor interaction LMO7 ADAMTS14 Molecular Function PXDN COL12A1MFAP5 ITGA5 -0.5 0.5 Biological Process 0 1 Protein digestion and absorption

Figure 6 Functional annotation of the 182 coexpressed genes of IGF2BP1. (A) The top 10 gene ontol- ogy terms in biological process, molecular function and cellular component as well as the related coex- pressed genes. (B) KEGG enrichment analysis of the coexpressed genes. Full-size DOI: 10.7717/peerj.11561/fig-6

ADDITIONAL INFORMATION AND DECLARATIONS

Funding This work was supported by the National Natural Science Foundation of China (No. 81602551), the Young Talents Program of Jiangsu Cancer Hospital (QL201810) and the Huai’an Natural Science Research Plan (HAB201816). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Grant Disclosures The following grant information was disclosed by the authors: National Natural Science Foundation of China: 81602551. Young Talents Program of Jiangsu Cancer Hospital: QL201810. Huai‘an Natural Science Research Plan: HAB201816. Competing Interests The authors declare there are no competing interests. Author Contributions • Shanliang Zhong and Huanwen Chen performed the experiments, analyzed the data, prepared figures and/or tables, and approved the final draft. • Zhenzhong Lin performed the experiments, authored or reviewed drafts of the paper, and approved the final draft. • Ling Mao analyzed the data, prepared figures and/or tables, and approved the final draft. • Jifeng Feng and Siying Zhou conceived and designed the experiments, authored or reviewed drafts of the paper, and approved the final draft. Data Availability The following information was supplied regarding data availability: All the raw data is available at TCGA (search term: BRCA) and GEO: GSE96058.

Zhong et al. (2021), PeerJ, DOI 10.7717/peerj.11561 11/15 Supplemental Information Supplemental information for this article can be found online at http://dx.doi.org/10.7717/ peerj.11561#supplemental-information.

REFERENCES Afratis NA, Karamanou K, Piperigkou Z, Vynios DH, Theocharis AD. 2017. The role of heparins and nano-heparins as therapeutic tool in breast cancer. Glycoconjugate Journal 34:299–307 DOI 10.1007/s10719-016-9742-7. Alarcon CR, Goodarzi H, Lee H, Liu X, Tavazoie S, Tavazoie SF. 2015. HNRNPA2B1 is a mediator of m(6)A-dependent nuclear RNA processing events. Cell 162:1299–1308 DOI 10.1016/j.cell.2015.08.011. Bao Y, Wang L, Shi L, Yun F, Liu X, Chen Y, Chen C, Ren Y, Jia Y. 2019. Transcriptome profiling revealed multiple genes and ECM-receptor interaction pathways that may be associated with breast cancer. Cellular & Molecular Biology Letters 24:38 DOI 10.1186/s11658-019-0162-0. Bell JL, Wächter K, Mühleck B, Pazaitis N, Köhn M, Lederer M, Hüttelmaier S. 2013. Insulin-like growth factor 2 mRNA-binding proteins (IGF2BPs): post-transcriptional drivers of cancer progression? Cellular and Molecular Life Sciences 70:2657–2675 DOI 10.1007/s00018-012-1186-z. Benjamini Y, Drai D, Elmer G, Kafkafi N, Golani I. 2001. Controlling the false discov- ery rate in behavior genetics research. Behavioural Brain Research 125:279–284 DOI 10.1016/S0166-4328(01)00297-2. Brueffer C, Vallon-Christersson J, Grabau D, Ehinger A, Hakkinen J, Hegardt C, Malina J, Chen Y, Bendahl PO, Manjer J, Malmberg M, Larsson C, Loman N, Ryden L, Borg A, Saal LH. 2018. Clinical value of RNA sequencing-based classifiers for prediction of the five conventional breast cancer biomarkers: a report from the population-based multicenter sweden cancerome analysis network-breast initiative. JCO Precision Oncology 2:1–18 DOI 10.1200/PO.17.00135. Cancer Genome Atlas N. 2012. Comprehensive molecular portraits of human breast tumours. Nature 490:61–70 DOI 10.1038/nature11412. Chai RC, Wu F, Wang QX, Zhang S, Zhang KN, Liu YQ, Zhao Z, Jiang T, Wang YZ, Kang CS. 2019. m(6)A RNA methylation regulators contribute to malignant progression and have clinical prognostic impact in gliomas. Aging (Albany NY) 11:1204–1225 DOI 10.18632/aging.101829. Chen XY, Zhang J, Zhu JS. 2019. The role of m(6)A RNA methylation in human cancer. Molecular Cancer 18:103 DOI 10.1186/s12943-019-1033-z. Desrosiers R, Friderici K, Rottman F. 1974. Identification of methylated nucle- osides in messenger RNA from Novikoff hepatoma cells. Proceedings of the National Academy of Sciences of the United States of America 71:3971–3975 DOI 10.1073/pnas.71.10.3971. Dominissini D, Moshitch-Moshkovitz S, Schwartz S, Salmon-Divon M, Ungar L, Osenberg S, Cesarkas K, Jacob-Hirsch J, Amariglio N, Kupiec M. 2012. Topology

Zhong et al. (2021), PeerJ, DOI 10.7717/peerj.11561 12/15 of the human and mouse m 6 A RNA methylomes revealed by m 6 A-seq. Nature 485:201–206 DOI 10.1038/nature11112. Edupuganti RR, Geiger S, Lindeboom RGH, Shi H, Hsu PJ, Lu Z, Wang SY, Baltissen MPA, Jansen P, Rossa M, Muller M, Stunnenberg HG, He C, Carell T, Ver- meulen M. 2017. N(6)-methyladenosine (m(6)A) recruits and repels proteins to regulate mRNA homeostasis. Nature Structural & Molecular Biology 24:870–878 DOI 10.1038/nsmb.3462. Heck AM, Wilusz CJ. 2019. Small changes, big implications: the impact of m(6)A RNA methylation on gene expression in pluripotency and development. Biochimica et Biophysica Acta (BBA) - Gene Regulatory Mechanisms 1862:194402 DOI 10.1016/j.bbagrm.2019.07.003. Huang H, Weng H, Sun W, Qin X, Shi H, Wu H, Zhao BS, Mesquita A, Liu C, Yuan CL. 2018a. Recognition of RNA N 6-methyladenosine by IGF2BP pro- teins enhances mRNA stability and translation. Nature Cell Biology 20:285–295 DOI 10.1038/s41556-018-0045-z. Huang H, Weng H, Sun W, Qin X, Shi H, Wu H, Zhao BS, Mesquita A, Liu C, Yuan CL, Hu YC, Huttelmaier S, Skibbe JR, Su R, Deng X, Dong L, Sun M, Li C, Nachtergaele S, Wang Y, Hu C, Ferchen K, Greis KD, Jiang X, Wei M, Qu L, Guan JL, He C, Yang J, Chen J. 2018b. Recognition of RNA N(6)-methyladenosine by IGF2BP proteins enhances mRNA stability and translation. Nature Cell Biology 20:285–295 DOI 10.1038/s41556-018-0045-z. Huang X, Zhang H, Guo X, Zhu Z, Cai H, Kong X. 2018c. Insulin-like growth factor 2 mRNA-binding protein 1 (IGF2BP1) in cancer. Journal of Hematology & Oncology 11:88 DOI 10.1186/s13045-018-0628-y. Jia G, Fu Y, Zhao X, Dai Q, Zheng G, Yang Y, Yi C, Lindahl T, Pan T, Yang YG, He C. 2011. N6-methyladenosine in nuclear RNA is a major substrate of the obesity- associated FTO. Nature Chemical Biology 7:885–887 DOI 10.1038/nchembio.687. Liao S, Sun H, Xu C. 2018. YTH Domain: a family of N(6)-methyladenosine (m(6)A) readers. Genomics Proteomics Bioinformatics 16:99–107 DOI 10.1016/j.gpb.2018.04.002. Liu N, Dai Q, Zheng G, He C, Parisien M, Pan T. 2015. N(6)-methyladenosine- dependent RNA structural switches regulate RNA-protein interactions. Nature 518:560–564 DOI 10.1038/nature14234. Liu J, Yue Y, Han D, Wang X, Fu Y, Zhang L, Jia G, Yu M, Lu Z, Deng X, Dai Q, Chen W, He C. 2014. A METTL3-METTL14 complex mediates mammalian nuclear RNA N6-adenosine methylation. Nature Chemical Biology 10:93–95 DOI 10.1038/nchembio.1432. Liu N, Zhou KI, Parisien M, Dai Q, Diatchenko L, Pan T. 2017. N6-methyladenosine alters RNA structure to regulate binding of a low-complexity protein. Nucleic Acids Research 45:6051–6063 DOI 10.1093/nar/gkx141. Liu M, Zhou S, Wang J, Zhang Q, Yang S, Feng J, Xu B, Zhong S. 2019. Identification of genes associated with survival of breast cancer patients. Breast Cancer 26:317–325 DOI 10.1007/s12282-018-0926-9.

Zhong et al. (2021), PeerJ, DOI 10.7717/peerj.11561 13/15 Love MI, Huber W, Anders S. 2014. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology 15:550 DOI 10.1186/s13059-014-0550-8. Meyer KD, Jaffrey SR. 2017. Rethinking m(6)A readers, writers, and erasers. Annual Review of Cell and Developmental Biology 33:319–342 DOI 10.1146/annurev-cellbio-100616-060758. Meyer KD, Patil DP, Zhou J, Zinoviev A, Skabkin MA, Elemento O, Pestova TV, Qian SB, Jaffrey SR. 2015. 50 UTR m(6)A promotes cap-independent translation. Cell 163:999–1010 DOI 10.1016/j.cell.2015.10.012. Niu Y, Lin Z, Wan A, Chen H, Liang H, Sun L, Wang Y, Li X, Xiong XF, Wei B, Wu X, Wan G. 2019. RNA N6-methyladenosine demethylase FTO promotes breast tumor progression through inhibiting BNIP3. Molecular Cancer 18:46 DOI 10.1186/s12943-019-1004-4. Ping XL, Sun BF, Wang L, Xiao W, Yang X, Wang WJ, Adhikari S, Shi Y, Lv Y, Chen YS, Zhao X, Li A, Yang Y, Dahal U, Lou XM, Liu X, Huang J, Yuan WP, Zhu XF, Cheng T, Zhao YL, Wang X, Rendtlew Danielsen JM, Liu F, Yang YG. 2014. Mammalian WTAP is a regulatory subunit of the RNA N6-methyladenosine methyltransferase. Cell Research 24:177–189 DOI 10.1038/cr.2014.3. Schumann U, Shafik A, Preiss T. 2016. METTL3 gains R/W access to the epitranscrip- tome. Molecular Cell 62:323–324 DOI 10.1016/j.molcel.2016.04.024. Schwartz S, Mumbach MR, Jovanovic M, Wang T, Maciag K, Bushkin GG, Mertins P, Ter-Ovanesyan D, Habib N, Cacchiarelli D, Sanjana NE, Freinkman E, Pacold ME, Satija R, Mikkelsen TS, Hacohen N, Zhang F, Carr SA, Lander ES, Regev A. 2014. Perturbation of m6A writers reveals two distinct classes of mRNA methylation at internal and 50 sites. Cell Reports 8:284–296 DOI 10.1016/j.celrep.2014.05.048. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T. 2003. Cytoscape: a software environment for integrated mod- els of biomolecular interaction networks. Genome Research 13:2498–2504 DOI 10.1101/gr.1239303. Siegel RL, Miller KD, Jemal A. 2019. Cancer statistics, 2019. CA: A Cancer Journal for Clinicians 69:7–34 DOI 10.3322/caac.21551. Tibshirani R. 1996. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B 58:267–288 DOI 10.1111/j.2517-6161.1996.tb02080.x. Tomczak K, Czerwinska P, Wiznerowicz M. 2015. The Cancer Genome Atlas (TCGA): an immeasurable source of knowledge. Contemp Oncol (Pozn) 19:A68–A77 DOI 10.5114/wo.2014.47136. Warda AS, Kretschmer J, Hackert P, Lenz C, Urlaub H, Hobartner C, Sloan KE, Bohnsack MT. 2017. Human METTL16 is a N(6)-methyladenosine (m(6)A) methyltransferase that targets pre-mRNAs and various non-coding RNAs. EMBO Reports 18:2004–2014 DOI 10.15252/embr.201744940. Wu L, Wu D, Ning J, Liu W, Zhang D. 2019. Changes of N6-methyladenosine modula- tors promote breast cancer progression. BMC Cancer 19:326 DOI 10.1186/s12885-019-5538-z.

Zhong et al. (2021), PeerJ, DOI 10.7717/peerj.11561 14/15 Yang Y, Hsu PJ, Chen YS, Yang YG. 2018. Dynamic transcriptomic m(6)A decoration: writers, erasers, readers and functions in RNA metabolism. Cell Research 28:616–624 DOI 10.1038/s41422-018-0040-8. Yu G, Wang LG, Han Y, He QY. 2012. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS: A Journal of Integrative Biology 16:284–287 DOI 10.1089/omi.2011.0118. Zheng G, Dahl JA, Niu Y, Fedorcsak P, Huang CM, Li CJ, Vagbo CB, Shi Y, Wang WL, Song SH, Lu Z, Bosmans RP, Dai Q, Hao YJ, Yang X, Zhao WM, Tong WM, Wang XJ, Bogdan F, Furu K, Fu Y, Jia G, Zhao X, Liu J, Krokan HE, Klungland A, Yang YG, He C. 2013. ALKBH5 is a mammalian RNA demethylase that impacts RNA metabolism and mouse fertility. Molecular Cell 49:18–29 DOI 10.1016/j.molcel.2012.10.015. Zhong S, Chen X, Wang D, Zhang X, Shen H, Yang S, Lv M, Tang J, Zhao J. 2016. MicroRNA expression profiles of drug-resistance breast cancer cells and their exosomes. Oncotarget 7:19601–19609 DOI 10.18632/oncotarget.7481. Zhong S, Li W, Chen Z, Xu J, Zhao J. 2013. MiR-222 and miR-29a contribute to the drug-resistance of breast cancer cells. Gene 531:8–14 DOI 10.1016/j.gene.2013.08.062.

Zhong et al. (2021), PeerJ, DOI 10.7717/peerj.11561 15/15