Published OnlineFirst December 11, 2017; DOI: 10.1158/0008-5472.CAN-17-1947

Cancer Genome and Epigenome Research

Race Disparities in the Contribution of miRNA Isoforms and tRNA-Derived Fragments to Triple-Negative Breast Cancer Aristeidis G. Telonis and Isidore Rigoutsos

Abstract

Triple-negative breast cancer (TNBC) is a breast cancer subtype miR-21, the miR-17/92 cluster, the miR-183/96/182 cluster) and characterized by marked differences between White and Black/ from specific tRNA loci (e.g., the nuclear tRNAGly and tRNALeu,the African-American women. We performed a systems-level analysis mitochondrial tRNAVal and tRNAPro) were strongly associated on datasets from The Cancer Genome Atlas to elucidate how the with the observed race disparities in TNBC. We highlight the expression patterns of mRNAs are shaped by regulatory noncoding race-specific aspects of transcriptome wiring by discussing in detail RNAs (ncRNA). Specifically, we studied isomiRs, that is, isoforms the metastasis-related MAPK and the Wnt/b-catenin signaling of miRNAs, and tRNA-derived fragments (tRF). In normal breast pathways, two of the many key pathways that were found differ- tissue, we observed a marked cohesiveness in both the ncRNA and entially wired. In conclusion, by employing a data- and knowl- mRNA layers and the associations between them. This cohesive- edge-driven approach, we comprehensively analyzed the normal ness was widely disrupted in TNBC. Many mRNAs become either and cancer transcriptomes to uncover novel key contributors to differentially expressed or differentially wired between normal the race-based disparities of TNBC. breast and TNBC in tandem with isomiR or tRF dysregulation. Significance: This big data-driven study comparing normal and The affected pathways included energy metabolism, cell signaling, cancer transcriptomes uncovers RNA expression differences between and immune responses. Within TNBC, the wiring of the affected Caucasian and African-American patients with triple-negative breast pathways with isomiRs and tRFs differed in each race. Multiple cancer that might help explain disparities in incidence and aggres- isomiRs and tRFs arising from specific miRNA loci (e.g., miR-200c, sive character. Cancer Res; 78(5); 1140–54. 2017 AACR.

Introduction known contributors to the worse clinical picture of B/Aa TNBC patients, differences between the two races continue to remain Breast cancer is the most common type of cancer among women after adjusting for these factors (5). in the United States with nearly 250,000 new cases in 2016 (1). At the genetic level, the mutational profile of tumors for estab- Breast cancer's clinical classification relies on the expression levels lished cancer-linked is indistinguishable in the two races, of three receptors on which the tumor relies for its growth (2): these and, thus, cannot explain the observed differences in outcome are the receptors for the hormones estrogen and progesterone (ER (5, 6). This observation together with the persistent disparities and PR, respectively), and the human epidermal growth factor suggests other possibilities. For example, there may be consequen- receptor 2 (HER2). In the breast cancer subtype known as "triple tial mutations that lie beyond the exons of the cancer-linked genes, negative" (TNBC), none of the three receptors is expressed. or dynamic processes may be at work that are currently not TNBC accounts for 10% to 20% of all breast cancer cases and is understood and do not relate to mutations. Other possibilities the most aggressive subtype. In the absence of an expressed could include expression differences in signaling cascades that are receptor, chemotherapy and surgery remain the only options for found upregulated in B/Aa patients, such as insulin-like growth TNBC patients (2). Epidemiologically, TNBC is more frequent factor 1 receptor (IGF1R) (7) or Wnt/b-catenin (5). among Black/African-American (B/Aa) premenopausal women Noncoding RNAs (ncRNA) represent a recently emerged com- than is among White (Wh) women (3, 4). In B/Aa patients, TNBC ponent of TNBC biology. Work by others (8, 9) and us (10, 11) has exhibits higher grade, is more aggressive, and has higher meta- shown important differences in the transcriptomic profiles of static potential (3, 5). Even though socioeconomic factors are ncRNAs among individuals who differ by sex, population origin, or race. Such transcripts include long ncRNAs (8), the shorter miRNAs and their isoforms (isomiRs), as well as the tRNA-derived Computational Medicine Center, Sidney Kimmel Medical College, Thomas fragments (tRF; refs. 9–11). Importantly for this presentation, we Jefferson University, Philadelphia, Pennsylvania. showed in recent work that isomiRs and tRFs are expressed in a Note: Supplementary data for this article are available at Cancer Research tissue state–specific- and race-specific manner (10, 11). Given that Online (http://cancerres.aacrjournals.org/). isomiRs and tRFs can directly regulate mRNA abundance in mul- Corresponding Authors: Isidore Rigoutsos, Thomas Jefferson University, 1020 tifaceted ways, for example, through RNAi or by displacing RNAs Locust Street, Philadelphia, PA 19107. Phone: 215-503-4219; Fax: 215-503-0466; from RNA-binding (10–13), it is conceivable that they are E-mail: [email protected]; and Aristeidis G. Telonis, E-mail: also associated with the race-based health disparities of TNBC. [email protected] In this presentation, we build on our previous work (10, 11) doi: 10.1158/0008-5472.CAN-17-1947 by approaching the transcriptomic differences between B/Aa 2017 American Association for Cancer Research. and Wh TNBC patients from a systems-biology and network-

1140 Cancer Res; 78(5) March 1, 2018

Downloaded from cancerres.aacrjournals.org on September 24, 2021. © 2018 American Association for Cancer Research. Published OnlineFirst December 11, 2017; DOI: 10.1158/0008-5472.CAN-17-1947

Short Noncoding RNAs in TNBC Disparities

biology perspective (14, 15). As before, we base our studies on Target prediction and sequence similarity the TNBC datasets of The Cancer Genome Atlas (TCGA) repos- To identify potential binding sites of miRNAs in mRNAs, we itory (16). Our analysis takes a holistic approach to differen- used the rna22 algorithm (24). We allowed up to one unpaired tially expressed (DE) transcripts and their potential interac- base (¼mismatch or bulge) within the seed region and enforced a tions. We use and isomiR/tRF coexpression networks to minimum of 12 matched bases in the putative heteroduplex. As study the coordination of tissue-specific biological processes the mechanisms of interactions for tRFs remain elusive, we opted with short ncRNAs in normal breast and in TNBC. We integrate for a more general approach to minimize false negative results and, data from other modalities and mine the mRNA networks to therefore, identified potential mRNA targets by sequence similarity identify actors and effectors. We do so separately for each tissue with the glsearch36 algorithm (25) with default settings. For all state (¼normal or TNBC) and for each race (¼B/Aa or Wh). We searches involving mRNAs, we used all cDNA variants as reported place particular emphasis on how mRNAs are wired with the in ENSEMBL75 (26). We used both the tRF sequence (in search of two categories of regulatory ncRNAs of interest, the isomiRs putative "decoyed" targets) and its reverse complement (in search and the tRFs, separately for each state and race. We also com- of putative "direct" targets) against the mRNA sequences. We pare how these relationships are rewired across tissue states and queried tRFs for instances of RNA-binding motifs. RNA-binding across races. Our analyses have two goals: identify novel mole- motifs were downloaded from RBPDB (27). In these analyses, each cules and novel interactions in the context of racial disparities isomiR and tRF was considered separately from the rest. and TNBC and elucidate whether isomiRs and tRFs associate with molecular pathways that have known links to disparities. Gene coexpression networks and identification of commonly and differentially wired features Materials and Methods We constructed gene coexpression networks using the meth- odology of previous studies (28–30) and performed correlation Acquisition of the molecular profiles analyses by computing all pair-wise Spearman r correlation As in our previous work (10, 11, 17, 18), we utilized the coefficients among all mRNAs, tRFs, and isomiRs (six sets of molecular profiles of the TCGA breast cancer samples (16) to correlations in all). For each case (Wh-Normal, Wh-TNBC, identify which samples to use in our analysis. We excluded B/Aa-TNBC), the correlations were run independently and P all breast cancer samples from male patients. We employed all values were corrected to FDR values. Our threshold choices here normal breast female samples that are available in TCGA (94 from were |r| ¼ 0.333 and FDR of 1%. In each case, we kept not more Wh and 6 from B/Aa subjects), irrespective of the adjacent breast than the top 5,000 positive and the top 5,000 negative r coeffi- cancer subtype. Of the TNBC samples, we excluded those that cients and used them to form the set of significant correlations in TCGA annotates as potentially problematic (18), which left us with each case. This process allowed us to create coexpression tables 66 samples from Wh and 32 from B/Aa women donors. Each small and convert the significant correlations to network connected RNA-seq dataset was analyzed with our recently published method nodes. In each coexpression network, the nodes represent features for defining the lowest abundance threshold (19), and subsequent- (mRNAs, isomiRs, or tRFs) and the edges denote the existence ly mined using our tRF (17) and isomiR (11, 18) pipelines. We only of a significant correlation between the connecting nodes. In the considered those tRFs and isomiRs with a median abundance of rendered networks, to improve clarity, we collapsed isomiRs 2 RPM. For the mRNA profiles, we used TCGA's rsem_genes. and tRFs to the miRNA arm or tRNA of origin, for example, normalized_results files and filtered out those genes whose average nodes with names like "miR-200c-3p" and "mtLysTTT" stand for abundance was less than the median of the means of abundances isomiRs originating from the miR-200c-3p arm, and for tRFs of all mRNAs. In the case of mRNA isoforms, we used TCGA's rsem. overlapping with the mature mitochondrial (MT) tRNALysTTT, isoforms.normalized_results files and filtered out isoforms that had a respectively. Network edges are undirected as they represent mean expression lower than the 25th percentile. abun- correlation (positive or negative) between two features, and not dance data (Level 4) were downloaded from The Cancer Proteome direct molecular coupling between the features. IsomiR–mRNA Atlas (20). edges and tRF–mRNA edges indicate a putative regulatory action Statistical and pathway enrichment analyses and data on the mRNA by the isomiR or the tRF, respectively. visualization Coexpression networks can have many structures. However, Analyses were carried out in R and Python and data were log2- biological networks exhibit a scale-free topology with a few, transformed before further processing. Variability analysis was central "hubs," that is, nodes of high degree, and many "peri- done by computing the coefficient of variations for each variable pheral" nodes of low degree (31). To evaluate the scale-free (mRNA, isomiR, or tRF) and by plotting their distributions in log10 topology of the gene coexpression network, we plotted the scale. For univariate analyses, we used the Mann–Whitney U test distribution of node degrees (number of connections with and for multivariate the significance analysis of microarrays (21) at other nodes in the network) and performed linear regression 5,000 permutations and a FDR threshold of 5%. The mRNA iso- on the log scale of the distributions. forms between two states were expressed as a percentage (%) of In each mRNA–mRNA network, we assessed how the 50 mRNA total by dividing the calculated expression of each hubs with the highest degree are connected to other mRNA nodes isoform by the sum of expression of all isoforms from the current by computing pair-wise "mRNA–Jaccard" indices. We define the gene. Pathway enrichment was estimated with DAVID (22) using "mRNA–Jaccard" index between two mRNA hubs as the ratio of an FDR threshold of 5%. The terms that were queried were Gene the number of correlated mRNAs that are common to these two Ontology (GO) terms for biological process, molecular function, mRNA hubs over the number of all the distinct mRNAs with and cellular compartment, as well as KEGG pathways. We used which these two mRNA hubs are correlated. We then performed Cytoscape for network visualization (23). All other visualizations hierarchical clustering (metric: Euclidean distance) in each of the were carried out in R as described previously (10, 11, 18). three (square) matrices that we obtained.

www.aacrjournals.org Cancer Res; 78(5) March 1, 2018 1141

Downloaded from cancerres.aacrjournals.org on September 24, 2021. © 2018 American Association for Cancer Research. Published OnlineFirst December 11, 2017; DOI: 10.1158/0008-5472.CAN-17-1947

Telonis and Rigoutsos

In each mRNA–mRNA network, we also assessed how the 50 strength, as defined by the UCSC track, to be 100. Finally, we mRNA hubs with the highest degree are connected to isomiRs considered two mRNAs to be part of the same pathway, if and tRFs by computing pairwise "isomiR/tRF–Jaccard" indices. they were listed under the same biological process by DAVID. We define the "isomiR/tRF–Jaccard" index between two mRNA If both mRNAs in a correlation pair encode for proteins that hubs as the ratio of the number of correlated isomiR/tRF nodes directly interact, are in the same cellular compartment, are part of that are common to these two mRNA hubs over the number of the same biological process as identified by DAVID, or are all the distinct isomiRs/tRFs with which these two hubs are encoded by genes that are under the control of the same tran- correlated. This allowed us to obtain a global perspective of scription factor per the ENCODE ChIP-seq data, then we can which isomiRs/tRFs are connected with which mRNA hubs in conclude that this mRNA pair can be "explained" by the respective the coexpression networks. common feature(s). In the case of immune system, we considered We compared commonly wired and differentially wired pairs correlations to be "immune-system explained" if at least one of transcripts in the following two cases: (i) Wh women in the mRNA in the mRNA pairs is predominantly expressed in the normal and cancer states; and (ii) Wh women and B/Aa women immune system tissues as outlined above. To link mRNA pairs to in the cancer state. A specified pair of transcripts is "commonly isomiRs and tRFs, we examined whether both mRNAs of the pair wired" (CW) in two different states, if and only if the two were significantly correlated with the same isomiR or tRF. transcripts are statistically significantly coexpressed in both states and with the same correlation sign (either positive or Monte Carlo simulations negative) in both states. Analogously, a specified pair of tran- We examined whether we could attribute the mRNA correla- scripts is "differentially wired" (DW) in two states, if and only if tion pairs to factors (e.g., encoded proteins localize to the the two transcripts are statistically significantly coexpressed in same cell compartment) that can potentially cause the observed exactly one state, or statistically significantly coexpressed in correlations and, thus, could account for (i.e., explain) a certain both states but with opposite correlation signs. To further limit portion of the correlations. To examine whether the accounted- potential false positive DW pairs, in those cases where there was for-portion was due to chance, we ran Monte Carlo simulations. no change of sign, we required that the absolute r coefficient of For each of the three cases (Wh-Normal, Wh-TNBC, B/Aa-TNBC), the nonsignificant correlation be <0.2. Thus, our differentially we randomly selected the same number of unique mRNA pairs wired pairs were defined qualitatively, and not quantitatively, and calculated the same percentage. Repeating this random due to the relatively low statistical power (30). selection 20,000 times yielded the distribution of percentage values for each factor that are expected by chance. Finally, using Attributing mRNA correlation pairs to upstream or this distribution, we calculated the respective z-score for the downstream factors or to isomiRs and tRFs percentage value that we observed. z-scores with absolute values We leveraged information from several public databases and 2.0 were deemed statistically significant. projected it on the mRNA correlation pairs. For the cellular compartment of the mRNAs' protein products, we used the Data availability manually reviewed human proteome of UniProt (accessed on The original datasets that we analyzed are available in the November 27, 2016; ref. 32). Protein–protein interactions, nor- TCGA repository (https://gdc.cancer.gov). The tRF profiles that malized at the gene level with default filtering, were drawn from we generated in each case are in MINTbase (https://cm.jeffer the v2.1 release of the PICKLE database (accessed on September son.edu/MINTbase/). The results from our correlation analysis 13, 2017; ref. 33). Collections of genes that share the same and their attributes to other levels of cellular function are transcription-binding site were downloaded from MSigDB (C3. available in the Supplementary Tables that are included with TFT set accessed on March 23, 2017; ref. 34). To find genes that are this presentation. exclusively (or predominantly) expressed in immune cells as compared with normal breast tissue, we used the data from the Expression Atlas of EMBL-EBI (accessed on February 14, 2017; Results ref. 35). We first formed a collection of "immune tissues" com- TNBC progression in B/Aa and Wh patients follows parallel prised of the entries "EBV-transformed lymphocyte," "blood," yet distinct trajectories "small intestine Peyer patch," and "spleen." To identify the rel- In our previous work, we showed that TNBC samples can be evant genes, we ranked each gene's expression across the 53 tissues distinguished from normal breast tissue using isomiR or tRF and kept those genes whose four highest expressing tissues profiles (10, 11). Also, we showed that isomiRs and tRFs can included at least three of the four immune tissues that also showed also distinguish TNBC samples by race (10, 11). Having reported low expression (rank 20) in breast. We obtained transcription recently the isomiR and tRF profiles for all TCGA datasets (17, 18), factor (TF) information from ENCODE's chromatin immunopre- we sought to examine how the expression patterns of mRNAs, cipitation followed by DNA sequencing (ChIP-seq) experiments isomiRs, and tRFs differ between normal breast and TNBC, and (36, 37) that we downloaded from the UCSC Genome Browser also between the two races in TNBC. wgEncodeRegTfbsClusteredV3 track (accessed on April 6, 2017). Looking at the variability of expression for mRNAs, isomiRs, We linked genes with TF-binding sites by searching for binding and tRFs in normal and TNBC for each race, we found that TNBC sites not more than 5,000 base pairs upstream or 5,000 bases or exhibits a significant increase in average variability compared with downstream from the gene's transcriptional start site. In the case normal breast, in all three classes of molecules (Supplementary of multiple start sites, we considered the union of the respective Fig. S1A and S1B). This result reflects the considerable heteroge- 10 Kbp genomic regions around the multiple start sites. Before neity of TNBC. using a binding event, we required that it be observed in four or We also performed a significance analysis to find which features more independent experiments and that the maximum signal are DE between tissue states, or, between races. These analyses

1142 Cancer Res; 78(5) March 1, 2018 Cancer Research

Downloaded from cancerres.aacrjournals.org on September 24, 2021. © 2018 American Association for Cancer Research. Published OnlineFirst December 11, 2017; DOI: 10.1158/0008-5472.CAN-17-1947

Short Noncoding RNAs in TNBC Disparities

show many mRNAs and small ncRNAs to be DE in the TNBC state, DE isomiRs arise from the 5p arms of mir-21, mir-182, and in both races (Supplementary Fig. S1C; Supplementary Table S1). mir-183, whereas most of the DE tRFs arise from the isode- We first examine the transcripts that are DE in TNBC versus the coders of the nuclear tRNAGlyGCC,nucleartRNAHisGTG,and normal state in both races. mitochondrial tRNAValTAC. Among the mRNAs that are upregulated in TNBC compared Among the mRNAs that are downregulated in TNBC compared with normal breast tissue, we found several pathways that with normal tissue, we found several pathways that are DE in are enriched and common to both races (Fig. 1A). They patients from both races (Fig. 1B; Supplementary Tables S1 and include DNA replication, ribosome biogenesis, and pyrimi- S2) and include circadian rhythms, regulation of lipolysis, and dinemetabolism,aswellasproteasomeandoxidativephos- TGFb signaling. Notably, there was no overlap among the path- phorylation (Supplementary Table S2). With regard to the ways of downregulated and upregulated genes, respectively. In the shortncRNAsthatarefoundupregulatedinTNBCcompared case of ncRNAs, we found numerous isomiRs from both arms of with normal breast, we find that in both races, most of the the miR-10b and miR-99a miRNA precursors to be downregulated

A Upregulated KEGG Pathways IsomiRs tRNA Fragments

Proteasome miR−21−5p GlyGCC (n) DNA replication miR−183−5p ValTAC (mt) Oxidative phosphorylation miR−182−5p HisGTG (n) Parkinson's disease miR−151a−3p GlnTTG (n) Huntington's disease miR−30e−5p GlnCTG (n) Ribosome biogenesis in eukaryotes miR−200c−3p GluCTC (n) Spliceosome miR−192−5p ValCAC (n) Alzheimer's disease miR−21−3p TyrGTA (n) Pyrimidine metabolism miR−103a−3p LeuCAG (n) Epstein−Barr virus infection miR−142−3p ArgTCT (n)

3.02.52.01.51.00.50.0 302520151050 302520151050

Mean fold enrichment Mean number of DE isomiRs Mean number of DE tRFs B Downregulated 5'-tRF i-tRF Drug metabolism − cytochrome P450 miR−10b−5p HisGTG (n) Circadian rhythm miR−10a−5p GlnTTG (n) 3'-tRF Neuroactive ligand−receptor interaction miR−143−3p GlnCTG (n) Regulation of lipolysis in adipocytes miR−30a−3p AspGTC (n) TGF−beta signaling pathway let−7b−5p AlaCGC (n) Cytokine−cytokine receptor interaction let−7c−5p PheGAA (mt) Vascular smooth muscle contraction miR−126−3p ValTAC (mt) Circadian entrainment miR−145−5p GluTTC (mt) FoxO signaling pathway miR−199a−3p GluTTC (n) Signaling pathways regulating pluripotency of stem cells miR−30a−5p GlyTCC (n)

3.02.52.01.51.00.50.0 302520151050 302520151050

Mean fold enrichment Mean number of DE isomiRs Mean number of DE tRFs

Upregulated mRNAs C ATP5J NDUFA8 NDUFS3 NDUFB6 SDHB ATP6V1F D Upregulated UQCRH NDUFA9 ATP6AP1 COX7B NDUFA13 COX8A Primary immunodeficiency Upregulated UQCRC1 NDUFA4 NDUFA6 NDUFB3 ATP6V0B NDUFA7 Downregulated mRNAs nGlnCTG Antigen processing and presentation short ncRNAs RNA transport NDUFB9 COX5A COX5B NDUFA11 NDUFA12 NDUFB2 Wh ZFYVE9 SMAD2 SP1 RPS6KB1 Protein processing in endoplasmic reticulum ATP5G1 CYC1 UQCR10 NDUFS2 NDUFB11 ATP5D nGluTTC ID4 BMPR2 PPP2R1B SMAD3 nGlnTTG Biosynthesis of amino acids NDUFS6 ATP5J2 ATP5C1 ATP5I SDHC ATP5G3 ACVR1C EP300 ACVR1B CREBBP Cell cycle nGlyTCC nMetCAT UQCRFS1 COX17 TCIRG1 NDUFB10 ATP5E NDUFS7 Non−alcoholic fatty liver disease (NAFLD) ROCK1 SKP1 SMAD5 ACVR2A Biosynthesis of antibiotics

COX6C NDUFV1 COX6B1 UQCR11 NDUFAB1 B/Aa nSerTGA SMAD1 MAPK3 BMPR1A SMAD4 Metabolic pathways nHisGTG TGFB3 ZFYVE16 TGFBR2 POLR2D POLR3D RRM2 NT5C UCK2 nLeuTAG 3.02.52.01.51.00.50.0 miR-30e-5p UCKL1 POLR1A DCTPP1 NME2 POLR2F CLOCK CREB1 FBXL3 Mean fold enrichment miR-146b-5p NME1 POLR1C PNPT1 POLR2I PNP PER3 BHLHE41 PER2 nCysGCA POLR2G POLR2H POLR2J TK1 NME4 miR-146b-3p E Downregulated nAspGTC BTRC PRKAA1 BHLHE40 TYMS CAD DTYMK TXNRD2 TYMP CRY1 RORA PER1 miR-223-3p Fatty acid degradation nArgCCG Valine, leucine and isoleucine degradation nTyrGTA CRY2 FBXW11 PRKAB1 MCM4 RNASEH1 SSBP1 POLD1 Oxytocin signaling pathway

Wh Ras signaling pathway mtValTAC LIG1 MCM2 MCM7 RFC5 MAPK signaling pathway

MCM5 POLD2 RFC2 RNASEH2A Ox-Phos nGlyGCC miR-30e-3p MCM3 POLA2 FEN1 POLE Adherens junction Pyrimidine metabolism Jak−STAT signaling pathway miRNA DNA replication RFC4 MCM6 Transcriptional misregulation in cancer

Downregulated B/Aa β tRNA TGF signaling short ncRNAs Circadian rhythm 3.02.52.01.51.00.50.0 Mean fold enrichment

Figure 1. A complex transcriptomic landscape. A and B, Commonly upregulated (A) and downregulated (B) mRNAs, grouped by KEGG pathways, isomiRs, and tRFs. The shown pathways, miRNA arms, and mature tRNAs capture the mRNAs, isomiRs, and tRFs, respectively, that are DE between normal breast and TNBC, independently of race. x-axis, mean value in the two races. Pathways such as "Parkinson disease" are shown enriched due to characteristics they share with TNBC, for example, oxidative phosphorylation (Supplementary Table S2). C, Examples of miRNA loci and tRNA loci whose isomiRs and tRFs are anticorrelated with the shown mRNAs. Both isomiRs/tRFs and mRNAs are DE in opposite directions between normal breast and TNBC. The lines connecting the miRNAs/tRNAs and the genes show that isomiRs/tRFs from the respective loci are predicted to target the mRNAs from this gene. Green edges indicate that the archetype miRNA is among the produced isoforms from the that are anticorrelated with the shown mRNA. The mRNAs are colored based on the pathway they participate. Ox-Phos, oxidative phosphorylation. D and E, KEGG pathways that are enriched in a race-specific manner in mRNAs that are upregulated (D) or downregulated (E) in TNBC compared with normal.

www.aacrjournals.org Cancer Res; 78(5) March 1, 2018 1143

Downloaded from cancerres.aacrjournals.org on September 24, 2021. © 2018 American Association for Cancer Research. Published OnlineFirst December 11, 2017; DOI: 10.1158/0008-5472.CAN-17-1947

Telonis and Rigoutsos

(Supplementary Table S1). We also found that in both may also be race-specific differences within a tissue state. We races, tRNAHisGTG, tRNAGlnTTG, and tRNAGlnCTG give rise to tRFs compared the expression profiles of B/Aa and Wh patients in whose abundance decreases significantly in TNBC compared with (i) normal breast and (ii) TNBC, respectively. In the normal normal breast. breast state, compared with Wh donors, B/Aa donors exhibit In a handful of cases, we observed an intriguing dual behavior increased expression of key genes (e.g., CD2 and CXCL9)that for miRNAs and tRNAs (Fig. 1A and B). For example, isomiRs arelinkedtohightissueinflammation (Supplementary Tables from the 5p arm of mir-146b are upregulated in TNBC, whereas S1 and S2). Analogously, in the TNBC state, key genes from cell isomiRs from its 3p arm are downregulated in TNBC (Supple- signaling pathways (e.g., MAP2K7, PIM1,andPIM2)aredys- mentary Table S1). The miR-30 family provides another intrigu- regulated in B/Aa patients (in agreement with the results of Fig. ing example of dual behavior (see Fig. 1A and B; Supplementary 1D and E). Table S1, and Discussion). This phenomenon of opposing Collectively, the above results show extensive and very con- changes in abundance is also present in the case of tRFs, for crete molecular differences between B/Aa and Wh patients at example, see tRNAHisGTG in Supplementary Fig. S1D. This behav- the two physiologic endpoints, that is, normal breast and ior suggests multiple biogenesis mechanisms for tRFs. TNBC. During the transition from normal tissue physiology IsomiRs and tRFs are known to load on Argonaute (AGO) to TNBC (38), the patients in the two race groups follow and to directly target mRNAs through RNAi (10, 11). Thus, we parallel yet distinct trajectories whose specifics are influenced sought possible links between DE isomiRs and tRFs, on one sharplybythepatients'race. hand, and mRNAs that are DE in the opposite direction and contain a predicted binding site (in the case of isomiRs) and a The transcriptomic networks in TNBC exhibit pronounced predicted binding site or sufficient sequence similarity (in the heterogeneity case of tRFs) on the other. For both races, we could link 26% of To investigate the nature of correlations among mRNAs and the downregulated mRNAs to upregulated short ncRNAs. How- short ncRNAs from a network perspective (28, 39, 40), we con- ever, of the upregulated mRNAs, only 7% could be linked to a structed gene coexpression networks (see Materials and Methods; downregulated short ncRNA. Our DE analyses provide some Supplementary Table S3). We focused on the three groups, or first insight into the evidently complex regulatory processes that "patient states," for which there are adequate samples in the TCGA underlie TNBC. repository for the purposes of correlation analysis: normal breast To further demonstrate this complexity, we selected important samples from Wh donors, TNBC samples from Wh donors, and pathways that we found enriched in these analyses and generated TNBC samples from B/Aa donors. The available normal breast a network of mRNAs, miRNAs, and tRNA isoacceptors that are DE samples from B/Aa donors are not adequate in this regard. independent of race. The miRNA arms and tRNA isoacceptors The three resulting networks are found to be scale-free networks of Fig. 1C produce isomiRs and tRFs, respectively, that are DE in a with the number of edges per number of nodes following a power- direction opposite to that of the connected mRNAs. The green law distribution (Supplementary Fig. S2A). We note that the links in Fig. 1C indicate that the archetype miRNA is among the correlations we observe in normal samples are not affected by DE isoforms produced from the shown mRNA locus. Recall here the adjacent tumor subtype (Supplementary Fig. S2B). Despite the that by archetype we refer to the miRNA isoform that is listed in similarity in their topology, the three networks exhibit extensive the public databases. It is important to stress here that the differences in terms of (i) the genes they comprise (Supplemen- archetype neither has to be the corresponding locus' most abun- tary Fig. S2C) and (ii) the correlations in which these genes dant isoform nor is the only functionally important isoform, as we participate (Supplementary Fig. S2C and S2D). showed in other settings (11). Figure 1 makes evident that the vast In the normal state, the hubs of the mRNA–mRNA coexpres- majority of the miRNA:mRNA links, which we uncovered, involve sion networks correspond to "cell growth and survival" regulators isoforms other than the archetype: These isoforms have not been (e.g., RAB25 and EPS15) as well as "lipid metabolism and storage" studied before, and thus, their functional roles have not been genes like PLIN1. In TNBC, for both Wh and B/Aa patients, the elucidated, especially in the context of these molecular processes hubs correspond to "transcription factors/regulators" (like REST in TNBC. and NCOA2), "cell signaling" (e.g., APC and TAOK1) and Among the dysregulated RNAs are transcripts and pathways "immune system" (e.g., CD4 and CD48) genes, as well as enzymes that are DE between TNBC and normal breast in only one of the that participate in "oxidative phosphorylation" (Supplemen- two races. Specifically, there are pathways that are DE between tary Table S4). The KEGG pathway "ribosome" is the only normal breast and TNBC exclusively in Wh patients (Fig. 1D and E, one that is enriched in all three groups (Supplementary top). Analogously, other pathways are DE exclusively in B/Aa Table S4). "Immune system responses" and "oxidative phos- patients (Fig. 1D and E, bottom). We note the multiple manifesta- phorylation" were enriched only in the TNBC states, which sup- tions of immune system responses ("primary immunodeficiency" ports the view of higher tissue- and cellular heterogeneity in the and "antigen processing and presentation") in Wh TNBC patients cancer state. We examine the similarities and differences of the (Fig. 1D). Similarly, there are several entries for cell signaling three mRNA–mRNA networks below. deregulation that are exclusive to B/Aa TNBC patients (oxytocin, Ras, MAPK, and Jak–STAT signaling; Fig. 1D and E). It is interesting There are two distinct subnetworks of mRNAs that are to note that although "amino acid metabolism" is common to both coexpressed with isomiRs and tRFs in TNBC and correspond to groups, it appears as a decrease in the degradation of branched- nonoverlapping modules chain amino acids in Wh TNBC patients and as an increase in the We focused on the genes that are hubs in the mRNA–mRNA biosynthesis of amino acids in B/Aa TNBC patients. coexpression networks and examined the immediate mRNA Inlightoftherace-specific differences across tissue states neighbors of the 50 nodes with the highest degree. Specifically, (i.e., normal breast and TNBC), we hypothesized that there for each pair of hubs, we evaluated their "mRNA overlap" by

1144 Cancer Res; 78(5) March 1, 2018 Cancer Research

Downloaded from cancerres.aacrjournals.org on September 24, 2021. © 2018 American Association for Cancer Research. Published OnlineFirst December 11, 2017; DOI: 10.1158/0008-5472.CAN-17-1947

Short Noncoding RNAs in TNBC Disparities

Wh - Normal Wh - TNBC B/Aa - TNBC A B C

PRKCZ MRPS15 ZNF423 ELMO3 NAA10 JAM3 FAM83H LSM7 DPY30 DDR1 POP7 SF3B14 KIAA1543 SNRPA TEK RTKN EDF1 SPARCL1 AP1M2 SSNA1 LDB2 SPINT2 C16orf13 EBF1 EPHA1 NDUFA3 PEAR1 PTK7 NDUFB11 S1PR1 ZEB2 C19orf43 MMRN2 ZEB1 DRAP1 CD34 SEC23A NDUFS8 ARHGEF15 FERMT2 PRDX5 CXorf36 RHOQ C11orf83 CDH5 ITSN1 ATP5D CDC34 GBE1 RAPGEF6 PSMB3 DOCK11 APC MRPS24 ECM2 ZKSCAN1 GADD45GIP1 EBF1 ROCK1 ERBB2IP C1orf172 PPTC7 AFF4 RAB25 RNF111 KIAA0754 SH3GLB1 LEPROT KIAA1715 QKI CHD9 TAOK1 MBNL1 NR2C2 HCLS1 VAMP3 BIRC6 IL10RA EPS15 REST CD53 PAK4 ASXL2 IL2RG ESRP1 NCOA2 GMFG CCDC69 MGAT5 TNFRSF1B PYGL RIF1 CYTH4 CD36 TAOK1 DOCK2 MMD UHMK1 APBB1IP FAM89A MAP3K2 IKZF1 HSPB6 RPL28 EVI2B ITGB1BP1 RPL18 CD48 TSPAN3 RPS11 JAK3 EPCAM RPL35 SASH3 PPARG IL10RA GPSM3 ACO1 NCKAP1L AKNA LPL CD4 CD3E PDE3B SELPLG ARHGAP25 PLIN1 MYO1F ITGAL AOC3 HAVCR2 TMC8 KCNIP2 CD53 MFNG GYG2 PLEK CD37 mRNA 1st neighbors AQP7 LCP2 CD4 CIDEC SASH3 HLA−DMB GPD1 EVI2B PLEK LIPE IL2RG NCKAP1L QKI LPL TEK CD4 CD4 APC RIF1 LIPE JAK3 MMD PAK4 AFF4 LCP2 EBF1 ZEB1 LDB2 EBF1 ZEB2 PTK7 JAM3 CD53 CD36 CD53 EDF1 CD34 CD48 CD37 PLEK PLEK LSM7 AQP7 CD3E REST POP7 PYGL GBE1 RTKN AOC3 ACO1 CDH5 TMC8 DDR1 CHD9 IKZF1 GPD1 AKNA ECM2 GYG2 PLIN1 ITSN1 EVI2B EVI2B IL2RG IL2RG ITGAL MFNG BIRC6 GMFG RHOQ RPL18 RPL28 RPL35 ATP5D CIDEC EPS15 ASXL2 TAOK1 TAOK1 NAA10 S1PR1 RPS11 DPY30 RAB25 PPTC7 HCLS1 CDC34 CYTH4 NR2C2 SASH3 SSNA1 PEAR1 SASH3 AP1M2 ESRP1 PDE3B HSPB6 EPHA1 VAMP3 MGAT5 SNRPA MBNL1 PRDX5 DRAP1 IL10RA IL10RA MYO1F PPARG PRKCZ ELMO3 PSMB3 NCOA2 ROCK1 DOCK2 UHMK1 SPINT2 GPSM3 EPCAM ZNF423 KCNIP2 SF3B14 MMRN2 CXorf36 RNF111 TSPAN3 FAM89A SEC23A FAM83H SELPLG NDUFA3 LEPROT MAP3K2 HAVCR2 MRPS15 MRPS24 DOCK11 FERMT2 NDUFS8 CCDC69 C16orf13 C11orf83 C19orf43 C1orf172 APBB1IP ERBB2IP SPARCL1 SH3GLB1 KIAA1715 KIAA0754 KIAA1543 NDUFB11 NCKAP1L NCKAP1L ZKSCAN1 RAPGEF6 HLA−DMB ITGB1BP1 TNFRSF1B ARHGEF15 ARHGAP25

GADD45GIP1 Jaccard index D E F 10

ITSN1 BIRC6 DPY30 GBE1 UHMK1 MRPS24 RHOQ TAOK1 KIAA0754 PPARG MAP3K2 CDC34 FAM89A REST KIAA1715 HSPB6 CHD9 ERBB2IP CCDC69 NDUFB11 GADD45GIP1 ITGB1BP1 SNRPA AFF4 TSPAN3 EDF1 TAOK1 EBF1 MRPS15 PSMB3 DOCK11 NCOA2 CD34 ESRP1 RPL18 CDH5 EPCAM RIF1 ARHGEF15 PTK7 ASXL2 MMRN2 ECM2 POP7 S1PR1 ZEB2 LEPROT TEK RAB25 RNF111 EBF1 FERMT2 ATP5D SPARCL1 AQP7 MGAT5 SF3B14 CIDEC C19orf43 CXorf36 KCNIP2 RPL35 PEAR1 ACO1 C11orf83 ZNF423 GYG2 APC JAM3 PLIN1 ROCK1 LDB2 AOC3 RAPGEF6 PLEK PYGL NDUFA3 NCKAP1L CD36 C16orf13 IL10RA MMD RPS11 IL2RG LIPE RPL28 JAK3 GPD1 LSM7 EVI2B LPL PRDX5 AKNA PDE3B DRAP1 IKZF1 SEC23A NDUFS8 ITGAL ELMO3 ZKSCAN1 CD3E EPHA1 SSNA1 ARHGAP25 FAM83H NR2C2 SASH3 C1orf172 PPTC7 CD48 KIAA1543 NAA10 CD53 RTKN HAVCR2 DOCK2 SPINT2 SELPLG GPSM3 DDR1 CD4 MFNG VAMP3 NCKAP1L TMC8 ZEB1 IL10RA APBB1IP AP1M2 MYO1F CD37 PRKCZ EVI2B TNFRSF1B QKI SASH3 HLA−DMB MBNL1 PLEK CYTH4 PAK4 CD53 CD4 SH3GLB1 LCP2 HCLS1 EPS15 IL2RG GMFG QKI LPL CD4 APC Short ncRNA 1st neighbors RIF1 LIPE MMD PAK4 TEK ZEB1 ZEB2 LCP2 PTK7 EBF1 CD4 CD53 PLEK CD36 EDF1 LSM7 AQP7 REST POP7 PYGL GBE1 RTKN AOC3 ACO1 DDR1 CHD9 GPD1 ECM2 PLIN1 GYG2 EVI2B ITSN1 JAK3 IL2RG BIRC6 AFF4 RHOQ EBF1 LDB2 JAM3 RPL35 RPL28 RPL18 CD37 CD53 CD48 CD34 PLEK ATP5D CIDEC EPS15 ASXL2 TAOK1 RPS11 NAA10 RAB25 CD3E PPTC7 TMC8 CDH5 AKNA IKZF1 NR2C2 EPHA1 AP1M2 SASH3 SSNA1 ESRP1 HSPB6 PDE3B VAMP3 SNRPA MGAT5 DRAP1 PRDX5 MBNL1 IL10RA EVI2B MYO1F PPARG ELMO3 PRKCZ IL2RG ITGAL ROCK1 NCOA2 MFNG UHMK1 SPINT2 GMFG EPCAM KCNIP2 TAOK1 DPY30 S1PR1 RNF111 HCLS1 TSPAN3 CYTH4 CDC34 SASH3 PEAR1 FAM89A SEC23A IL10RA FAM83H SELPLG PSMB3 NDUFA3 LEPROT MAP3K2 DOCK2 HAVCR2 FERMT2 NDUFS8 CCDC69 DOCK11 MRPS15 GPSM3 C16orf13 C1orf172 C11orf83 C19orf43 ZNF423 SF3B14 MMRN2 CXorf36 SH3GLB1 KIAA1543 MRPS24 NCKAP1L NDUFB11 ZKSCAN1 RAPGEF6 APBB1IP ERBB2IP ITGB1BP1 SPARCL1 KIAA1715 KIAA0754 NCKAP1L HLA−DMB TNFRSF1B ARHGEF15 ARHGAP25 GADD45GIP1

G Wh - TNBC H B/Aa - TNBC IsomiRs IsomiRs tRFs tRFs

IsomiRs IsomiRs tRFs tRFs

1086420 1086420

Number of ncRNAs (log2) Number of ncRNAs (log2)

miR-142-5p miR-10b-5p miR-150-5p mtProTGG miR-29b-3p miR-142-5p nGlyTCC miR-1307-5p

miR-181a-2-3p miR-142-3p miR-361-3p nLeuCAG miR-141-3p miR-155-5p Other Other

Other miR-142-3p miR-150-5p Other

Figure 2. mRNA hubs and their neighbors. The heatmaps of A–F visualize the matrix of Jaccard indices of the shared connections for the top 50 mRNA hubs. A–C, The hubs are compared by examining their immediate mRNA neighbors (mRNA-Jaccard index). D–F, The hubs are compared by examining their immediate short ncRNA neighbors (isomiR/tRF-Jaccard index). The higher the Jaccard index between two hubs, the higher the number of shared first neighbors (see Materials and Methods). A and D, Normal breast, Wh donors. B and E, Wh TNBC patients. C and F, B/Aa TNBC patients. G–H, Number of isomiRs/tRFs that are first neighbors of the mRNA hubs in Wh-TNBC (G)andB/Aa-TNBC(H). IsomiRs and tRFs are broken down by cluster (orange or blue, as identified in B and C for each race). The pie charts show the composition of the respective bar of the histogram.

computing their mRNA–Jaccard index (see Materials and Meth- again two clusters, for both the Wh and B/Aa patients. However, ods). This index gauges the number of immediate mRNA neigh- the clusters are now very cleanly separated, and hubs belonging bors that the two hubs have in common. to different clusters share no first mRNA neighbors (Fig. 2B and In the normal state, the top 50 hubs form two clusters that C). Strikingly, one cluster comprises multiple genes involved overlap considerably, that is, hubs not belonging to the same in immune system responses and immune system pathways, cluster share many first neighbors (Fig. 2A). In TNBC, we observe whereas the other comprises multiple genes from oxidative

www.aacrjournals.org Cancer Res; 78(5) March 1, 2018 1145

Downloaded from cancerres.aacrjournals.org on September 24, 2021. © 2018 American Association for Cancer Research. Published OnlineFirst December 11, 2017; DOI: 10.1158/0008-5472.CAN-17-1947

Telonis and Rigoutsos

phosphorylation and ribosomal components. These findings (a) protein products that belong to the same pathway, as per strongly suggest that the stroma significantly influences gene DAVID (22); expression (in this case of the immune system): This stromal (b) protein products that localize to the same cellular compart- influence is present in TNBC but absent from normal breast. We ment, as per UniProt (32); note that the "orange" clusters in Wh-TNBC and B/Aa-TNBC (c) protein products that directly interact with each other, as patients, respectively, share only one gene. The "blue" cluster per PICKLE (33); contains several common genes in the two races. (d) gene promoter regions that share the same regulatory motif, Next, for the 50 highest degree hubs of the mRNA–mRNA as per MSigDB (34); and coexpression networks, we examined their immediate isomiR/tRF (e) genes expressed predominantly in immune cells, as per the neighbors. We hypothesized that the distinction in two clusters in Expression Atlas (35). the TNBC state might be reflected on the isomiRs and tRFs with which the mRNAs are correlated. Specifically, for each pair of Colocalization in the same cell compartment and a shared hubs, we evaluated their "isomiR/tRF overlap" by computing their DNA motif accounted evenly for a combined approximately isomiR/tRF–Jaccard index (see Materials and Methods). This 44% of the correlation pairs in each of the three states (Fig. 3A). index gauges the number of immediate isomiR/tRF neighbors Of note, protein–protein interactions accounted for a much that the two hubs have in common. lower, yet statistically significant, fraction of the mRNA–mRNA For normal breast, we did not observe any notable clustering correlation pairs (Fig. 3A). We also observed that in the case of of the mRNA hubs based on their isomiR/tRF–Jaccard index TNBC, many of the correlated mRNAs belong to the same (Fig. 2D). However, in TNBC, we found that the mRNA hubs pathway (42% for Wh and 34% for B/Aa patients). Still, in are split again into two nonoverlapping clusters (Fig. 2E and F). all cases, between 36% and 51% of the correlations cannot be When we examined the formed mRNA clusters and their explained by any of the above five potential "drivers." corresponding first isomiR/tRF neighbors, we made three strik- An important limitation of the above analyses is that they rely ing observations: on static data. With the exception of the Expression Atlas, this is (a) The gene hubs are split again into an "immune response" unavoidable given the nature of the databases. Dynamic dimen- cluster (blue) and an "oxidative phosphorylation and ribo- sions, such as tissue-specific expression and posttranslational somal components" cluster (orange). These are exactly the modifications, would help select among the universe of possible same two groups we obtained when we clustered the hubs interactions but are not available. based on their mRNA neighbors (Fig. 2B and C). To identify potential dynamic regulators that could be behind (b) The ncRNAs that are immediate neighbors of the blue some of the observed correlations, we did the following two cluster in the TNBC state belong exclusively to the isomiR things: category in both races (Fig. 2E–H). IsomiRs originate from (a) we used the ENCODE data to check whether the corre- thesamemiRNAarmslargelyirrespectiveofrace,for sponding gene promoters are bound by the same TF example, from miR-150-4p and both arms of mir-142 (36, 37); and (Fig. 2G and H). (b) we examined which of the mRNA–mRNA correlation pairs (c) The ncRNAs that are immediate neighbors of the orange can be projected to isomiRs and tRFs. There are two cluster belong overwhelmingly to the tRF category in Wh known mechanisms of actions that justify the inclusion patients and exclusively to the isomiR category in B/Aa of tRFs: (i) the previous demonstration that tRFs can act as patients (Fig. 2G and H). Moreover, the isomiR-producing miRNAs by entering the RNAi pathway through AGO- miRNA loci that are linked to the orange cluster in B/Aa loading (10, 13); and (ii) the previous demonstration patients, for example, miR-10b-5p and miR-141-3p (Fig. 2H; that tRFs can regulate mRNA levels in trans by decoying right pie chart), have no overlap with the miRNA loci that are RNA-binding proteins (12). linked to the blue cluster (Fig. 2H; left pie chart). Our results show that the portion of the mRNA–mRNA correla- These findings argue strongly that the biological processes tions that can be explained by TF-binding remains essentially driven by the genes of each cluster are associated with a distinct unchanged at approximately 82% across normal breast and type and identity of short ncRNAs. They also show that the TNBC. However, we note that this could be a significant overes- intercluster coordination of isomiRs and tRFs in normal breast timation (see Discussion). Unlike TF binding, isomiRs or tRFs (Fig. 2A and D) has given its place to an intracluster coordination exhibit a much higher dependence on the tissue and state under in the TNBC state (Fig. 2B, C, E, and F). Moreover, the results reveal consideration. an important race disparity in TNBC that pertains to how immune More specifically, in normal breast, isomiRs can explain 68% system genes could be regulated by isomiRs and tRFs. of the mRNA–mRNA pairs whereas tRFs can explain 86% (Fig. 3B), that is, there is at least one isomiR or tRF that is correlated The short ncRNAs are likely the most important modulators of with both mRNAs of most pairs. In TNBC, isomiRs can link the mRNA transcriptome in normal breast and in TNBC comparativelyfewermRNApairs:21%inWhand35%inB/Aa We should not lose sight of the fact that gene coexpression patients. Analogously, tRFs can still account for 42% of the networks capture correlation patterns. Thus, attempts to infer mRNA–mRNApairsinWhTNBCpatients.However,thereisa causality from these undirected graphs should be based on very drastic drop in the case of B/Aa TNBC patients: a mere 1% additional evidence. With that in mind, we focused on the mRNA of the mRNA–mRNA pairs can be linked to tRFs. These data coexpression networks and sought to identify factors ("drivers") strongly indicate that the disruption of the regulatory layer that could potentially explain the correlated pairs. Specifically, we affected by isomiRs and tRFs is system wide and has a race- first examined five categories of drivers: dependent component.

1146 Cancer Res; 78(5) March 1, 2018 Cancer Research

Downloaded from cancerres.aacrjournals.org on September 24, 2021. © 2018 American Association for Cancer Research. Published OnlineFirst December 11, 2017; DOI: 10.1158/0008-5472.CAN-17-1947

Short Noncoding RNAs in TNBC Disparities

lamroN - hW - lamroN hW - CBNT B/Aa - TNBC A

51% 36% 43%

Cell DNA Immune Cell DNA Immune Cell DNA Immune PPI Pathway PPI Pathway PPI Pathway compartment Motif cell compartment Motif cell compartment Motif cell (0.4%) (19%) (19%) (25%) (0.4%) (1%) (42%) (23%) (20%) (4%) (1%) (34%) (23%) (23%) (4%) * * + * * + * * * + * B

isomiRs tRFs TF isomiRs tRFs TF isomiRs tRFs TF (68%) (86%) (83%) (21%) (42%) (84%) (35%) (1%) (80%)

Pos Pos Pos Pos Pos Pos Pos Pos Pos Neg Neg Neg Neg Neg Neg Neg Neg Neg

CD2 IL10RA C VSIG4 IL2RG SELPLG mtArgCCT MS4A6A CYTH4 CD37 FCER1G NDUFB10 mtArgCCG mRNA CD3E LAIR1 C3AR1 CD52 ITGAL HAVCR2 tRF miR-25-3p HLA-DRA DCI FAM128B IL2RB EVI2B IsomiR CD48 MRPS34 miR-181a-2-3p mtGlyTCC CSF2RB CD53 CD74 IDO1 AIF1 ADAP2 STX11 LZTS2 JAK3 ABCF1 APOL6 GIMAP7 CD93 C15orf63 mtGluTTC GBP4 WARS FMNL1 PEAR1 SSNA1 IL16

BAT3 mtValTAC PCOLCE AEBP1 miR-106b-5p LHFP miR-200c-5p miR-101-3p GNG11 MFRP OLFML1

COL3A1 PECAM1 miR-141-3p miR-141-5p ANXA2P2 ANXA2 miR-92a-3p THY1

LUM COL6A3

MXRA5 miR-96-5p

Figure 3. The mRNA associations within a state can be largely explained by short ncRNAs and common transcription factors. A, Alluvial diagrams representing the percentage of mRNA couplings that are associated with each factor. The percentage in blue shows the fraction of mRNA couplings that cannot be explained by any of the considered factors. An asterisk () indicates that this percentage is significantly higher than expected by chance, whereas a cross (þ) indicates that it is lower. Statistical significance was assessed with Monte Carlo simulations with a threshold of an absolute z-score of at least 2.0 as compared with the constructed expected distribution. B, Alluvial diagrams showing the percentage (also shown below each label) of mRNA coexpression pairs that can be projected to the sameisomiR,tRF,orTFineachstate.Thegrayfields are proportional to the overlap among the clusters. The pie charts show the sign (negative, blue; positive, orange) for those of the coexpressed pairs that are also associated with the corresponding regulator. The TF values are boxed to indicate the possibility that the percentage is an overestimate (see also text). C, Example subnetworks comprising mRNAs (circles) that are coexpressed with isomiRs or tRFs from the shown miRNA (orange diamonds) or tRNA (magenta diamonds) loci in Wh TNBC patients. Blue edges indicate coexpressed and positively correlated mRNA–mRNA pairs. Gray edges represent miRNA or tRNA loci whose isomiRs or tRFs are coexpressed with the corresponding mRNAs. All shown miRNA–mRNA pairs capture negative correlations. The shown tRF–mRNA pairs capture either positive or negative correlations.

Another important finding pertains to the concordance is exceptionally high. Indeed, most of the mRNA correlation between the ability of isomiRs and tRFs to account for pairs that can be linked with isomiRs can also be linked with mRNA–mRNA correlations. In normal breast, the concordance tRFs, and vice versa (Fig. 3B, left). This is true for both

www.aacrjournals.org Cancer Res; 78(5) March 1, 2018 1147

Downloaded from cancerres.aacrjournals.org on September 24, 2021. © 2018 American Association for Cancer Research. Published OnlineFirst December 11, 2017; DOI: 10.1158/0008-5472.CAN-17-1947

Telonis and Rigoutsos

A Wh-Normal -- Wh-TNBC B Wh-TNBC -- B/Aa-TNBC

GADD45GIP1 CD37 RPLP0 MRPS26 MRPS24 RPL14 CMKLR1 TMC8 RPL10 RPS2 PRMT1 ITGAL IL2RB CTSS IL10RA RPL10A RPL12 RPP21 NDUFB10 COX6B1 HLA-DPA1 PLEK FERMT3 AP2S1 RPL6 FYB RPL37 RPS5 RPL13 MRPL55 HLA-DMB APBB1IP NDUFS5 GNB2L1 TIMM10 SLAMF7 MFNG CD4 TNFRSF1B DPP8 RPS14 MRPS24 RPL7A FKBP8 CD48 DOCK2 RPL18 LEPROT PTPRC HLA-DPB1 RPL27A RPS6 CYBB IL6ST LCP2 HLA-DMA RPS2 RPS3 RPS8 RPS11 GPSM3 ATP5G2 CD53 PIK3AP1 VPS13C RPS7 SASH3 HLA-DRA CD74 RPS20 RPL5 ADAM10 HSD17B10 CORO1A RPL19 NDUFB11 CD2 NCKAP1L RPLP2 RPL27 RPS21 CD3E JAK3 RNF111 NHP2 RPL28 LAIR1 PLXNC1 RPL35 IL2RG LPXN C3AR1 HAVCR2 HLA-DOA RPL38 RPL13A RPS9 CIITA RPS18 RPS7 RPS15 RPL36 PPARG C1QC ITGB2 IL16 PLCB2 UHMK1 RPL24 CIDEA AIF1 INPP5D KCNIP2 RPP21 PIN4 LILRB4 CSF2RB HCLS1 PLXNA4 C1QB GPD1 ITGAX TIMP4 FPR3 IRF8 HCK FZD7

SPINT1 ECM2 LPL PHIP PPP3CB PLIN1 APEH MARK2 FZD4 MGLL MIB1 PLEKHB2 NUFIP2 MMD ELMO3 FABP4 GYG2 RAB20 NF1 SPINT2 PNPLA2 DDR1 GPAM SLC39A10 KIDINS220 NDUFB11 PCK1 LIPE CAT NDUFA3 KIAA1543

Signaling/development CW DW Ox-Phos CW DW Ribosome Wh-TNBC Ribosome Tmr B/Aa-TNBC Triglyceride metabolism Immune system responses Nrm Mitochondrion Cell surface receptor signaling

C Wh-Normal -- Wh-TNBC Wh-TNBC -- B/Aa-TNBC D IsomiRs tRFs 1% Pos Pos Same Same Pos Same Pos/Neg Pos/Neg Same Neg Neg Pos

Pos/Neg

Different Different (97%) (86%) Different None None (74%) (69%) Different Neg (67%) Neg (56%)

None (26%) None (15%) Wh-Normal -- Wh-TNBC IsomiRs tRFs IsomiRs tRFs Wh-Normal Wh-TNBC Wh-Normal Wh-TNBC

miR-203a Pos Pos Pos E miR-200b-3p miR-200a-5p ENST00000317610 Pos/Neg Pos/Neg Pos/Neg miR-200c-3p miR-141-3p ENST00000394609 100 nt Neg nMetCAT RTN4 miR-141-5p Neg Neg

mtValTAC miR-9-5p

nThrAGT miR-200a-3p Pos None nGlnCTG Neg (95%) None None None (69%) (67%) (66%) Wh-TNBC -- B/Aa-TNBC CBNT-hW CBNT-aA/B CBNT-hW CBNT-aA/B Ratio (log2) 20 40 60 80 100 20 40 60 80 100 ENST00000394609 (% Expression) ENST00000317610 (% Expression) 012345 0 0

Wh Wh Wh Wh Wh Wh Normal TNBC Normal TNBC Normal TNBC * * *

Figure 4. IsomiRs and tRFs as potential drivers of differentially wired mRNAs. A and B, The panels show CW and DW mRNAs when comparing normal breast and TNBC in Wh donors (A), or, the TNBC state between Wh and B/Aa patients (B). The mRNA nodes are colored by pathway. C, Alluvial diagram showing the percentage of the DW associations that can be attributed to differential wiring ("Different") with an isomiR (left column of each panel) or a tRF (rightcolumn of each panel). The box labeled "Same" captures the percentage of DW mRNA–mRNA pairs that are either not correlated with isomiRs/tRFs or the correlations with isomiRs/tRFs are CW between the two tissue states or races. (Continued on the following page.)

1148 Cancer Res; 78(5) March 1, 2018 Cancer Research

Downloaded from cancerres.aacrjournals.org on September 24, 2021. © 2018 American Association for Cancer Research. Published OnlineFirst December 11, 2017; DOI: 10.1158/0008-5472.CAN-17-1947

Short Noncoding RNAs in TNBC Disparities

positively- and negatively correlated pairs (see accompanying associations of the 40S ribosomal protein S8 (RPS8) mRNA with pie charts) and suggests that isomiRs and tRFs mediate highly other mRNAs encoding ribosomal proteins remain unchanged coordinated regulatory programs in normal breast tissue. (CW) between normal breast and TNBC. Other associations, However, this concordance/coordination is reduced radically however, are tissue-state specific. This indicates an extensive in the cancer state wherein isomiRs and tRFs are linked to different rewiring of the ribosomal protein network in Wh-TNBC through mRNA pairs (Fig. 3B, middle/right). Moreover, there is an appar- losses and gains of correlation. ent dichotomy in the case of Wh TNBC patients: Whereas the Note also how the correlation of the signaling processes with isomiRs are linked primarily to positively correlated mRNA– triglyceride metabolism by means of the PLIN1 hub mRNA is lost mRNA pairs, the tRFs are linked primarily to negatively correlated in TNBC. At the same time, the leptin receptor overlapping pairs in these patients (see pie charts). We visualize examples transcript (LEPROT) mRNA emerges as a hub in TNBC that links of how miRNAs and tRFs can potentially explain positively many mRNAs encoding ribosomal and mitochondrial proteins correlated mRNAs in Fig. 3C. (Fig. 4A). These results indicate a wide-ranging deregulation of The outcome of this analysis further supports the previous normal cell morphology and tissue physiology in the TNBC state. findings (Figs. 1 and 2) that the TNBC states differ considerably from the normal state. The extensive differences between any two The wiring of coexpressed mRNAs differs considerably between of the three patient states (Wh-normal, Wh-TNBC, and B/Aa- Wh TNBC patients and B/Aa TNBC patients TNBC) suggest broad disruptions of the wiring at the molecular Next, we analyzed coexpressed mRNAs in Wh TNBC patients level in TNBC, and also between races. At the same time, the andB/AaTNBCpatientsandfoundsignificant rewiring relatively large percentage of mRNA–mRNA pairs in both Wh between the two races. This is consistent with the view that TNBC and B/Aa TNBC patients that cannot be explained by the these two groups of patients are characterized by extensive categories we considered here suggests either the existence of an molecular differences. Among the coexpressed mRNA pairs additional layer of regulation whose details elude us, or a pro- that are present in either patient group, 17% are CW in Wh found alteration of the posttranscriptional control implemented and B/Aa TNBC patients, and include important components of by isomiRs and tRFs. tumor biology such as the ribosome, oxidative phosphory- lation, and immune system responses. This is exemplified by The wiring of coexpressed mRNAs is disrupted extensively the subnetwork shown in Fig. 4B (see also Supplementary between normal breast and TNBC Table S5). Here too, many correlations are gained and lost In this section, we focus on mRNAs that either remain CW or between the two races in TNBC. become DW (29, 41) between states (see Materials and Methods A modest, yet considerable, fraction (7%) of coexpressed for definitions). Intuitively, two specified mRNAs that remain CW mRNA pairs are DW between the two states. These mRNAs belong across multiple states can be assumed to belong to "housekeep- to pathways that include proteasome, cell signaling, oxidative ing" processes. Conversely, two mRNAs that become DW in a phosphorylation, transcription regulation, and other (Supple- state-dependent manner are likely related to each state's under- mentary Table S5). They are captured by the "nonalcoholic fatty lying biology (41) and can be informative just like DE mRNAs liver disease" label due to the latter's extensive molecular simi- (14, 15, 39–41). larities with those pathways. As evidenced from Supplementary First, we compared normal breast and TNBC in Wh patients and Table S5, for a few pathways (e.g., immune system, oxidative found that a mere 2% of the mRNAs remain CW. Perhaps not phosphorylation, Wnt signaling), a subset of their genes is CW surprisingly, these few CW pairs comprise positively correlated between Wh and B/Aa patients, whereas a different subset is DW. mRNAs that code for ribosomal proteins (for the full list of results, This indicates that the corresponding mRNAs make race-specific see Supplementary Table S5). Of the mRNA pairs that are coex- contributions to the biology of TNBC. pressed in normal breast, 45% are lost in TNBC (DW mRNAs). Of Finally, we note that thioredoxin (TXN) is among the genes that the mRNA pairs that are coexpressed in TNBC, 9% are not present are DW between TNBC patients of different race. TXN participates in normal breast, and are DW as well. in multiple negative correlations in the B/Aa TNBC patients that These results speak to profound differences in coexpression are absent from the Wh TNBC patients. These negative correla- between normal breast and TNBC. We highlight this observation tions include Aquaporin 1 (AQP1) and leptin (LEP), suggesting a with a few example subnetworks in Fig. 4A. In all these instances, race-specific coordination of genes involved in redox balance and the coexpression is absent in one of the states. Note that many energy signaling/homeostasis.

(Continued.) The percentage numbers noted in red show the fraction of DW mRNA associations that cannot be attributed to DW with isomiRs or tRFs. For example, 86% and 97% of the DW associations between Wh-normal and Wh-TNBC can be attributed to DW with an isomiR and a tRF, respectively. The gray area connecting the columns represents the overlap. In the left plot, almost all of the DW mRNA pairs that are DW with isomiRs are also DW with tRFs. D, Alluvial diagrams showing how the isomiRs and tRFs are differentially wired with the DW mRNA associations between the normal breast and TNBC in Wh donors (top) and between Wh TNBC and B/Aa TNBC patients (bottom). Each "Pos" ("Neg" respectively) block indicates the portion of DW mRNA pairs that are positively (negatively, respectively) correlated with at least one isomiR/tRF in the corresponding state. The "Pos/Neg" block captures the portion of DW mRNA pairs that are both positively and negatively correlated with isomiRs/tRFs. "None" captures the portion of DW mRNA pairs that cannot be correlated with any isomiRs/tRF. E, DW of RTN4 as a result of differential targeting by isomiRs. RTN4 is DW with isomiRs and tRFs from several different loci when comparing the Wh-normal and Wh-TNBC state (top left). The archetype miRNA of miR-200b-3p has a putative target site only in one of the two mRNA isoforms of RTN4 (top right). The RTN4 mRNA isoforms are differentially expressed between the normal tissue and tumor in Wh patients (bottom boxplots). , statistically significant differences (P < 0.05; Mann–Whitney U test). A, B,andE, The shown DW transcripts capture the absence of coexpression in one of the states being compared. A, B,andE, The edges connecting DW transcripts are colored by the state in which the transcripts are coexpressed.

www.aacrjournals.org Cancer Res; 78(5) March 1, 2018 1149

Downloaded from cancerres.aacrjournals.org on September 24, 2021. © 2018 American Association for Cancer Research. Published OnlineFirst December 11, 2017; DOI: 10.1158/0008-5472.CAN-17-1947

Telonis and Rigoutsos

The disruption of the wiring of coexpressed mRNAs in TNBC isoacceptors including the nuclear tRNAHisGTG and tRNAAlaTGC is race dependent and involves isomiRs and tRFs and the mitochondrial tRNAGluTTC (Supplementary Table S6). Considering the above findings, we hypothesized that many HuR has long been studied for its ability to stabilize the mRNA of the DW mRNAs are associated with isomiRs or tRFs. Our transcripts of oncogenes, particularly under stress conditions analysis shows that for nearly all (99%) of the mRNA pairs that (45). Thus, it is conceivable that tRFs interfere with HuR function, are DW between normal breast and TNBC in Wh patients, their in a manner analogous to what was shown for YBX1 (12). wiring with isomiRs or tRFs is also DW between these states (shown in cyan in Fig. 4C). It is interesting to note that, despite Genes from key metastasis-linked pathways show race-specific being numerically fewer, tRFs can be linked with 10% more associations with isomiRs and tRFs DW mRNAs than isomiRs. When comparing Wh TNBC with TNBC tumors in Wh and B/Aa patients exhibit striking differ- B/Aa TNBC patients, the results remain unchanged qualitative- ences in their metastatic potential (3, 5). Although not apparent ly, but now the two states exhibit more similarities (Fig. 4C). from the genetic or clinical TCGA data (6), there is indirect Characteristic rewiring examples are shown in Supplementary evidence of an increased potential for metastases among the Fig. S3A and S3B. TCGA B/Aa patients. For example, tumors from these patients Looking deeper into the links between DW mRNAs and have lower expression of E-cadherin (CDH1) at both the mRNA isomiRs/tRFs in Wh patients, we find that the majority of DW and protein levels (Fig. 5A). Thus, we sought to understand the mRNAs lose their links to isomiRs or tRFs between normal and potential contribution of tRFs and isomiRs on the mRNA abun- TNBC (top half of Fig. 4D). Supplementary Figure S3C and S3D dance of two known metastasis-associated pathways. shows that isomiRs from the mir-200 family, which has been linked to metastasis in many settings (42, 43), and tRFs from the The P38 pathway. We first examined metastasis suppressors that nuclear tRNAGlnCTG are associated with several of the mRNAs that have been studied for their potential to represent rate-limiting are DW between the Wh-normal and the Wh-TNBC states. When steps in the metastatic cascade (46, 47). Among them, 13 are comparing Wh TNBC and B/Aa TNBC patients, we find that the found correlated with isomiRs and/or tRFs in TNBC in exactly one DW mRNAs are linked with isomiRs or tRFs predominantly in one of the two races (Supplementary Table S7). Two of the 13 genes of the two races but not in the other (bottom half of Fig. 4D). are noteworthy because of their multiple associations with tRFs: Taken together, these findings suggest that the associations mitogen-activated protein kinase 14 (MAPK14 or P38) and Raf between isomiRs/tRFs and mRNAs appear or disappear as a kinase inhibitor protein (RKIP). RKIP is also known as phospha- function of the patient's race and of the state of the tissue. On tidylethanolamine binding protein 1 (PEBP1). The MAPK14 and the basis of the fact that both isomiRs and tRFs enter the RNAi RKIP mRNAs are correlated with tRFs but only in Wh TNBC pathway, it is reasonable to assume that some of these associa- patients. Notably, the mRNAs' correlations with tRFs have oppo- tions are indicative of direct regulatory interactions that are active site signs (Fig. 5B). Specifically, several tRFs from mitochondrially only in the specific biological context. encoded tRNAs, many of them from the MT tRNAProTGG, are correlated negatively with MAPK14 and positively with RKIP.On Potential contributors to the differential wiring of mRNAs the other hand, tRFs from nuclearly encoded tRNAs exhibit the could include splicing and decoying opposite pattern (Fig. 5B). Onepossiblemechanismbehind the DW between mRNAs We stress that these are nonaccidental correlations. Indeed, and isomiRs could be the existence of splicing variants between these two genes belong to the same signaling pathway, and normal breast and TNBC. For example, isomiRs could be each of the respective protein products interacts directly with targeting specifically one of the mRNA isoforms but not the several protein partners (48). We stress that the MT-derived tRFs other, possibly doing so in only one of the tissue states (44). showninFig.5BaredownregulatedintheB/AaTNBCpatients. We highlight this possibility through an example involving the The observed dynamics suggests that the tRFs compete with one Reticulon 4 (RTN4) gene. In Wh donors, RTN4's mRNA is another for dynamic control of the abundances of these two correlated with the mRNAs of other genes in normal breast genes. The loss of the MT tRFs in B/Aa TNBC patients leads but not in TNBC. The available data indicate that RTN4 is potentially to the rewiring of this module with a consequent correlated with several isomiRs and tRFs from multiple miRNA adverse impact on RKIP's ability to inhibit the metastatic loci and tRNA isoacceptors (top left of Fig. 4E). RTN4 has two cascade. mRNA isoforms. Using the rna22 miRNA target prediction tool (24), we find that only one of the two mRNA isoforms has a The Wnt/b-catenin pathway. The Wnt pathway has been receiving putative target site for the archetype miRNA of miR-200b-3p a lot of attention as a potential contributor to metastasis (49) and, (top right of Fig. 4E). Interestingly, this putative target site is also, to the health disparities in TNBC (5). In Fig. 5C, we depict a present in the mRNA of the transcript that is downregulated in portion of the Wnt signaling pathway (32, 48, 50) with the PPI Wh TNBC patients compared with normal breast (boxplots among its components shown in gray. Information on whether a of Fig. 4E). The relative expression of these two mRNA isoforms gene's mRNA is DW with isomiRs or tRFs between Wh TNBC and changes between normal breast and tumor in Wh patients B/Aa TNBC patients is shown overlaid (the full list of DW (Fig. 4E). Thus, it is conceivable that an alternative splicing isomiRs/tRFs and mRNAs appears in Supplementary Table S7). event is responsible for the DW between miR-200b-3p and Surprisingly, we find isomiRs and tRFs to be DW with numerous RTN4 and the increase of the nontargeted isoform in TNBC. components of the pathway, including cell surface Frizzled recep- Recently, it was shown that tRFs can also act by decoying RNA- tors, intracellular signaling proteins, and gene expression regula- binding proteins (12). In addition to expected alignments of tRFs tors (Fig. 5C). It can be seen that the mRNAs that are DW with with YBX1 and SSB RNA-binding motifs, our in silico studies show isomiRs originate from loci with well-documented roles in metas- that HuR's binding motif matches tRF sequences from several tasis (43, 51). These loci include the miR-200c-3p and miR-21-5p

1150 Cancer Res; 78(5) March 1, 2018 Cancer Research

Downloaded from cancerres.aacrjournals.org on September 24, 2021. © 2018 American Association for Cancer Research. Published OnlineFirst December 11, 2017; DOI: 10.1158/0008-5472.CAN-17-1947

Short Noncoding RNAs in TNBC Disparities

CDH1 mRNA CDH1 Protein A B nGlnTTG nLeuTAG PEBP1 nHisGTG nThrAGT RPKM) 2 MAP3K7 nAlaAGC IKBKB TRAF6 mtProTGG AKT1 nAlaCGC MAP2K1 Expression (log MAPK1 Normalized protein abundance mtGluTTC nLeuCAG −3−2−10123 10 11 12 13 14 15 mtLysTTT MAPK14 nTyrGTA Wh B/Aa Wh B/Aa TNBC TNBC TNBC TNBC mtTyrGTA PPI Pos. correlation ** Neg. correlation

miR-141-3p nGlyTCC C miR-92b-3p SERPINF1 miR-181a-5p miR-18a-5p miR-197-3p miR-195-3p miR-15b-3p FZD5 miR-29c-3p miR-19a-3p miR-17-5p miR-20a-5p WNT3A miR-141-5p miR-200c-5p miR-10b-3p miR-19b-3p NFATC2 PPP3R1 miR-17-3p miR-155-5p NFATC1 ROCK2 WNT1 mtLeuTAA MAPK10 PPP3CB miR-3613-5p RAC3 miR-143-3p RHOA mtAlaTGC miR-29a-3p miR-103a-3p miR-135b-5p WNT2 NFATC3 MAPK9 NFATC4 FZD1 PPP3CC LRP6 miR-1468-5p miR-511-5p miR-150-5p CREBBP FOSL1 MAPK8 miR-22-5p miR-452-5p miR-126-3p FRAT2 SFRP1 miR-660-5p mtProTGG EP300 PPP3CA CCND1 miR-181a-2-3p PRKCA CTBP1 let-7b-5p WNT5A DVL2 PRKACA CSNK2A2 JUN WNT4 CCND2 DVL1 CSNK1A1 TP53 MAP3K7 SFRP2 DAAM1 CTNNBIP1 CSNK2B LEF1 MYC miR-183-5p SMAD4 miR-185-5p RAC1 SMAD3 RUVBL1 miR-32-5p DVL3 miR-375 TCF7L2 CAMK2G GSK3B AXIN1 APC miR-10a-5p miR-217 BTRC nLysCTT FZD4 RBX1 CSNK2A1 miR-505-3p PSEN1 AXIN2 CSNK1E CUL1 NLK CACYBP RAC2 PPI miR-218-5p FBXW11 miR-21-3p DW (Wh-TNBC) TBL1X DW (B/Aa-TNBC) miR-200c-3p SKP1 let-7g-5p miR-101-3p mtPheGAA Gene product CCND3 miR-125a-5p CTNNB1 nGlnCTG miR-582-3p miRNA SIAH1 miR-22-3p miR-15a-5p mtGluTTC tRNA miR-411-5p miR-21-5p mtValTACnProAGG nAlaAGC nArgTCT mtLysTTT miR-30e-3p mtTyrGTA nArgCCT miR-889-3p

Figure 5. Race-specific metastasis-related genes and pathways. A, mRNA and protein levels of E-cadherin (CDH1), a marker of metastatic potential, show differential expression between the two races in TNBC. , P < 0.05; Mann-Whitney U test. B, DW tRFs and mRNAs participating in MAPK signaling. The shown correlations between tRFs and mRNAs are present only in Wh TNBC patients. Shown are tRNA loci (diamonds) that produce tRFs that are correlated with the mRNAs of the shown genes (circles) as well as PPIs among the protein products of these genes. PEBP1 is also known as RKIP and MAPK14 is known as P38. C, PPI network and DW associations of the corresponding mRNAs with isomiRs/tRFs of the Wnt/b-catenin signaling pathway as integrated from KEGG, Pickle, and UniProt databases. The shown DW associations correspond to an absence of coexpression in one of the states. The edges connecting DW transcripts are colored by the state in which the transcripts are coexpressed. Gene nodes are sorted on the basis of the cellular compartment of the encoded protein, from extracellular (left) to nuclear (right). The full list of how the Wnt/b-catenin pathway genes are DW with isomiRs and tRFs is included in Supplementary Table S7.

arms, the miR-17/92 miRNA cluster, and the miR-183/96/182 The various associations between isomiRs/tRFs and mRNAs miRNA cluster. As indicated by the color of the edges connecting from the two pathways mentioned above are but a small fraction the DW mRNA and miRNA transcripts, these transcript pairs are of numerous such associations. As we depict in Fig. 4, other coexpressed almost exclusively in B/Aa TNBC patients. Also, the important pathways with similar differentially wired associations DW mRNA-tRF pairs of transcripts are coexpressed almost exclu- include immune system responses, oxidative phosphorylation, sively in Wh TNBC patients. Finally, we note that more than half of and cell adhesion molecules. Nonetheless, even the two examples the shown DW mRNA-tRF pairs involve tRFs from MT tRNAs, that we showed above for the P38 and Wnt pathways suggest particularly MT tRNAProTGG and MT tRNAValTAC. widespread, race-specific regulatory events that involve isomiRs

www.aacrjournals.org Cancer Res; 78(5) March 1, 2018 1151

Downloaded from cancerres.aacrjournals.org on September 24, 2021. © 2018 American Association for Cancer Research. Published OnlineFirst December 11, 2017; DOI: 10.1158/0008-5472.CAN-17-1947

Telonis and Rigoutsos

and tRFs. Presumably, these events shift B/Aa TNBC patients to a cistrome and not events that are specific to a tissue or a tissue state molecular state that greatly favors metastasis. (57), the portion of mRNA correlation pairs that TFs could account for remained unchanged across tissue states and patient races (Supplementary Table S3). This indicates a potential over- Discussion estimation by our analyses of the TFs' influence (Fig. 3B). We analyzed the molecular profiles of TNBC tumors from B/Aa Examination of the changes in the interaction network from and Wh patients and compared them with the profile of normal normal breast to TNBC allowed us to trace the CW and DW of breast. We identified mRNAs, isomiRs, and tRFs that, in various mRNAs to differences in the associations of mRNAs with isomiRs combinations, are either DE or DW between normal breast and and tRFs. We found widespread loss of these associations in TNBC TNBC, and between patients belonging to the two races. Our compared with normal breast. Presumably, some of these asso- findings suggest pronounced disruption of the normal breast's ciations correspond to regulatory activity by the isomiRs/tRFs, molecular physiology in TNBC. By dissecting the underlying links, which would suggest a profound loss of the isomiRs' and tRFs' we flagged isomiRs and tRFs as potentially very important drivers regulatory capacity in TNBC. In fact, we find that only 2% of the of this disruption and of the racial health disparities in TNBC. mRNA–mRNA associations are CW in the normal breast and Components of the proteasome are significantly upregulated in TNBC in Wh patients. TNBC compared with normal breast tissue as are ribosomal IsomiRs and tRFs appear to have a very broad impact on proteins and ribosome biogenesis factors, in agreement with tumorigenesis and many other aspects of TNBC biology (43). reports on altered protein dynamics at multiple stages of tumor We found analogous molecular differences between Wh TNBC biology (52). In addition to the pathways of metabolism, signal- and B/Aa TNBC patients, which adds support to previous reports ing, and immune response that are DE between the normal breast by others (3, 5–9) and us (10, 11) that molecular causes underlie and TNBC (Fig. 1A), we found that several circadian rhythm health disparities in TNBC. However, to resolve whether the loss genes, like CLOCK and PER1 (Fig. 1C; Supplementary Table of regulatory capacity drives or results from this widespread S1), are also downregulated in both Wh and B/Aa TNBC patients, mRNA rewiring will require exhaustive time-series data. We also in concordance with previous reports (53). note that based on the TCGA data, there is a marked inability of Expanding on our previous work (10, 11), we identified tRFs to be connected with mRNAs in B/Aa TNBC patients. isomiRs and tRFs that are DE between normal tissue and With regard to disparities, we focused on metastasis, one aspect TNBC. IsomiRs from the miR-21-5p arm, the miR-200 and of TNBC where Wh and B/Aa patients exhibit well-documented miR-10 families, and the miR-183/96/182 cluster, all of which differences. Starting with metastasis inhibitor genes and pathways are known to be dysregulated in breast cancer (54), are DE in (46, 47), we sought mRNAs that are DW with tRFs or isomiRs and TNBC compared with normal in both races. Similarly, tRFs from provided evidence of race-based disparities in the context of two tRNAHisGTG, tRNAGlnTTG, and tRNAGlnCTG are DE in TNBC com- pathways: pared with normal breast. (a) P38: We found the axis between MAPK14 (P38) and RKIP to A new finding pertains to the discovery of miRNA and tRNA be DW with MT and nuclear tRFs. RKIP regulates Raf activity loci, each of which produces two groups of transcripts with a (46, 47), thereby inhibiting MAPK activation. The coordi- curious property: each of the two groups from the locus is DE nated association of tRFs with these two mRNAs suggests between normal and TNBC, but the two groups exhibit opposite that tRFs likely maintain a balance that becomes disrupted in signs in their change of abundance. Examples include: miR-146b- B/Aa TNBC patients, thereby increasing the tumor's poten- 5p, miR-146b-3p, miR-223-3p, and several members (mir-30b, tial to metastasize. mir-30c, mir-30d, and mir-30e) of the mir-30 family (Supple- (b) Wnt/b-catenin: We established associations of key Wnt mentary Table S1). This is concordant with our previous report members with several miRNA loci (e.g., miR-21, miR-29, that isomiRs from the 5p arm of mir-183 affect the abundance of miR-200, miR-19a, miR-150) and tRNA isoacceptors (e.g., mRNAs in TNBC in opposing ways (11). MT tRNAProTGG and tRNAValTAC). Although the implica- DE transcripts provide invaluable insights into pathways that tions of these associations cannot be inferred directly from deviate from their normal function. However, expression changes RNA expression studies, our results suggest extensive do not always capture the changes in the relationships among regulation of Wnt signaling by isomiRs and tRFs. The transcripts (28, 39). To determine which network nodes are dynamics of this regulation appear to be affected by the critical to such changes, we used gene coexpression analysis biological context and to be altered in TNBC. We note that (39), an approach that has yielded valuable insights in multiple many of the identifiedmiRNAsarelinkedtoandrogen contexts (28, 30, 55). receptor (AR) signaling (58), and AR is emerging as an We also integrated the observed mRNA correlations with mul- important pathway in breast cancer and TNBC (59). Thus, timodal information. This allowed us to link putatively mRNA our data may describe a race-specificinteractionofARand correlations with events that are temporally upstream (e.g., reg- Wnt/b-catenin signaling and miRNAs. ulation by a common TF, isomiR, or tRF) or downstream (e.g., participation of the protein product in common processes). We Interestingly, we could separate the normal breast samples by found that TFs, isomiRs, and tRFs often have a profound effect on race (Fig. 1). B/Aa women are known to exhibit higher tissue mRNA wiring. The other modalities could not explain adequately inflammation that results in differences of fat tissue physiology the mRNA–mRNA correlations, something that was observed compared with Wh women (5). This difference likely predisposes previously by others (55) and us (56). them to a more aggressive instance of TNBC. The existence of a With regard to TFs, in particular, we leveraged ENCODE's different "starting" point (38) for B/Aa women is also supported experimentally identified binding sites (36, 37) from multiple by recent findings that stem cells in normal breast tissue exhibit cellular settings. Because the ENCODE data reflect a TF's complete ethnicity dependencies (60).

1152 Cancer Res; 78(5) March 1, 2018 Cancer Research

Downloaded from cancerres.aacrjournals.org on September 24, 2021. © 2018 American Association for Cancer Research. Published OnlineFirst December 11, 2017; DOI: 10.1158/0008-5472.CAN-17-1947

Short Noncoding RNAs in TNBC Disparities

These race dependencies appear to extend to the trajectory's Authors' Contributions "ending" point as well, that is, to TNBC. Indeed, we found Conception and design: A.G. Telonis, I. Rigoutsos several immune system response genes to be DW between races Development of methodology: A.G. Telonis, I. Rigoutsos in the TNBC state. In addition, we found several aspects of lipid Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): A.G. Telonis metabolism to be DW between normal breast and TNBC in Wh fi Analysis and interpretation of data (e.g., statistical analysis, biostatistics, women. These ndings indicate that dissecting the complexity computational analysis): A.G. Telonis, I. Rigoutsos of racial disparities in TNBC will benefitfromsystembiology Writing, review, and/or revision of the manuscript: A.G. Telonis, I. Rigoutsos approaches. Administrative, technical, or material support (i.e., reporting or organizing In summary, we provided a systems perspective on the data, constructing databases): A.G. Telonis, I. Rigoutsos transcriptomic differences between normal breast and TNBC, Study supervision: A.G. Telonis, I. Rigoutsos and, on the racial disparities in TNBC. We generated extensive evidence supporting a central role for miRNAs and tRFs in the Acknowledgments regulation of mRNAs in TNBC (61). Our approach provides a This work was supported by a William M. Keck Foundation grant framework in which to generate novel hypotheses that link (I. Rigoutsos), NIH/NCIR21-CA195204 (I. Rigoutsos), and by Institutional specific mRNAs and cellular processes to specificisomiRsand Funds (I. Rigoutsos). tRFs. Our findings also underline the importance and value of integrative systems and network analyses in tackling the com- The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked plexity of dynamic and context-dependent molecular interac- advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate tions (14, 28, 62). this fact.

Disclosure of Potential Conflicts of Interest Received July 7, 2017; revised October 19, 2017; accepted November 30, No potential conflicts of interest were disclosed. 2017; published OnlineFirst December 11, 2017.

References 1. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2016. CA Cancer J Clin 15. Barabasi AL, Gulbahce N, Loscalzo J. Network medicine: a network-based 2016;66:7–30. approach to human disease. Nat Rev Genet 2011;12:56–68. 2. Dawson SJ, Provenzano E, Caldas C. Triple negative breast cancers: clinical 16. The Cancer Genome Atlas Research Network. Comprehensive molecular and prognostic implications. Eur J Cancer 2009;45Suppl 1:27–40. portraits of human breast tumours. Nature 2012;490:61–70. 3. Ademuyiwa FO, Edge SB, Erwin DO, Orom H, Ambrosone CB, Underwood 17. Loher P, Telonis AG, Rigoutsos I. MINTmap: fast and exhaustive profiling W III. Breast cancer racial disparities: unanswered questions. Cancer Res of nuclear and mitochondrial tRNA fragments from short RNA-seq data. Sci 2011;71:640–4. Rep 2017;7:41184. 4. Williams DR, Mohammed SA, Shields AE. Understanding and effectively 18. Telonis AG, Magee R, Loher P, Chervoneva I, Londin E, Rigoutsos I. addressing breast cancer in African American women: Unpacking the social Knowledge about the presence or absence of miRNA isoforms (isomiRs) context. Cancer 2016;122:2138–49. can successfully discriminate amongst 32 TCGA cancer types. Nucleic Acids 5. Dietze EC, Sistrunk C, Miranda-Carboni G, O'Regan R, Seewaldt VL. Triple- Res 2017;45:2973–85. negative breast cancer in African-American women: disparities versus 19. Magee R, Loher P, Londin E, Rigoutsos I. Threshold-seq: a tool for biology. Nat Rev Cancer 2015;15:248–54. determining the threshold in short RNA-seq datasets. Bioinformatics 6. Ademuyiwa FO, Tao Y, Luo J, Weilbaecher K, Ma CX. Differences in the 2017;33:2034–6. mutational landscape of triple-negative breast cancer in African Americans 20. Li J, Lu Y, Akbani R, Ju Z, Roebuck PL, Liu W, et al. TCPA: a resource for and Caucasians. Breast Cancer Res Treat 2017;161:491–9. cancer functional proteomics data. Nat Methods 2013;10:1046–7. 7. Lindner R, Sullivan C, Offor O, Lezon-Geyda K, Halligan K, Fischbach 21. Tusher VG, Tibshirani R, Chu G. Significance analysis of microarrays N, et al. Molecular phenotypes in triple negative breast cancer from applied to the ionizing radiation response. Proc Natl Acad Sci U S A African American patients suggest targets for therapy. PLoS One 2013;8: 2001;98:5116–21. e71915. 22. Huang da W, Sherman BT, Lempicki RA. Systematic and integrative analysis 8. Stewart PA, Luks J, Roycik MD, Sang QX, Zhang J. Differentially expressed of large gene lists using DAVID bioinformatics resources. Nat Protoc transcripts and dysregulated signaling pathways and networks in African 2009;4:44–57. American breast cancer. PLoS One 2013;8:e82460. 23. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. 9. Sugita B, Gill M, Mahajan A, Duttargi A, Kirolikar S, Almeida R, et al. Cytoscape: a software environment for integrated models of biomolecular Differentially expressed miRNAs in triple negative breast cancer between interaction networks. Genome Res 2003;13:2498–504. African-American and non-Hispanic white women. Oncotarget 2016;7: 24. Miranda KC, Huynh T, Tay Y, Ang YS, Tam WL, Thomson AM, et al. A 79274–91. pattern-based method for the identification of MicroRNA binding sites and 10. Telonis AG, Loher P, Honda S, Jing Y, Palazzo J, Kirino Y, et al. Dissecting their corresponding heteroduplexes. Cell 2006;126:1203–17. tRNA-derived fragment complexities using personalized transcriptomes 25. Pearson WR, Lipman DJ. Improved tools for biological sequence compar- reveals novel fragment classes and unexpected dependencies. Oncotarget ison. Proc Natl Acad Sci U S A 1988;85:2444–8. 2015;6:24797–822. 26. Aken BL, Ayling S, Barrell D, Clarke L, Curwen V, Fairley S, et al. The 11. Telonis AG, Loher P, Jing Y, Londin E, Rigoutsos I. Beyond the one-locus- Ensembl gene annotation system. Database 2016;2016:baw093. one-miRNA paradigm: microRNA isoforms enable deeper insights into 27. Cook KB, Kazan H, Zuberi K, Morris Q, Hughes TR. RBPDB: a database of breast cancer heterogeneity. Nucleic Acids Res 2015;43:9158–75. RNA-binding specificities. Nucleic Acids Res 2011;39:D301–8. 12. Goodarzi H, Liu X, Nguyen HC, Zhang S, Fish L, Tavazoie SF. Endogenous 28. Bansal M, Belcastro V, Ambesi-Impiombato A, di Bernardo D. How to infer tRNA-derived fragments suppress breast cancer progression via YBX1 gene networks from expression profiles. Mol Syst Biol 2007;3:78. displacement. Cell 2015;161:790–802. 29. Hudson NJ, Reverter A, Dalrymple BP. A differential wiring analysis of 13. Shigematsu M, Kirino Y. tRNA-derived short non-coding RNA as interact- expression data correctly identifies the gene containing the causal muta- ing partners of argonaute proteins. Gene Regul Syst Bio 2015;9:27–33. tion. PLoS Comput Biol 2009;5:e1000382. 14. Ahn AC, Tewari M, Poon CS, Phillips RS. The clinical applications of a 30. Reznik E, Sander C. Extensive decoupling of metabolic genes in cancer. systems approach. PLoS Med 2006;3:e209. PLoS Comput Biol 2015;11:e1004176.

www.aacrjournals.org Cancer Res; 78(5) March 1, 2018 1153

Downloaded from cancerres.aacrjournals.org on September 24, 2021. © 2018 American Association for Cancer Research. Published OnlineFirst December 11, 2017; DOI: 10.1158/0008-5472.CAN-17-1947

Telonis and Rigoutsos

31. Albert R. Scale-free networks in cell biology. J Cell Sci 2005;118:4947–57. 48. Klapa MI, Tsafou K, Theodoridis E, Tsakalidis A, Moschonas NK. Recon- 32. Consortium TU. UniProt: the universal protein knowledgebase. Nucleic struction of the experimentally supported human protein interactome: Acids Res 2017;45:D158–D69. what can we learn? BMC Syst Biol 2013;7:96. 33. Gioutlakis A, Klapa MI, Moschonas NK. PICKLE 2.0: A human protein- 49. Zhan T, Rindtorff N, Boutros M. Wnt signaling in cancer. Oncogene protein interaction meta-database employing data integration via genetic 2017;36:1461–73. information ontology. PLoS One 2017;12:e0186039. 50. Kanehisa M, Furumichi M, Tanabe M, Sato Y, Morishima K. KEGG: new 34. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res et al. Gene set enrichment analysis: a knowledge-based approach for 2017;45:D353–D61. interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 51. Mogilyansky E, Rigoutsos I. The miR-17/92 cluster: a comprehensive 2005;102:15545–50. update on its genomics, genetics, functions and increasingly important 35. Consortium GT. Human genomics. The Genotype-Tissue Expression and numerous roles in health and disease. Cell Death Differ 2013;20: (GTEx) pilot analysis: multitissue gene regulation in humans. Science 1603–14. 2015;348:648–60. 52. Demarchi F, Brancolini C. Altering protein turnover in tumor cells: 36. GersteinMB,KundajeA,HariharanM,LandtSG,YanKK,ChengC,etal. new opportunities for anti-cancer therapies. Drug Resist Updat 2005; Architecture of the human regulatory network derived from ENCODE 8:359–68. data. Nature 2012;489:91–100. 53. Blakeman V, Williams JL, Meng QJ, Streuli CH. Circadian clocks and breast 37. Wang J, Zhuang J, Iyer S, Lin X, Whitfield TW, Greven MC, et al. Sequence cancer. Breast Cancer Res 2016;18:89. features and chromatin structure around the genomic regions bound by 54. Kurozumi S, Yamaguchi Y, Kurosumi M, Ohira M, Matsumoto H, Hor- 119 human transcription factors. Genome Res 2012;22:1798–812. iguchi J. Recent trends in microRNA research into breast cancer with 38. Mar JC, Quackenbush J. Decomposition of gene expression state space particular focus on the associations between microRNAs and intrinsic trajectories. PLoS Comput Biol 2009;5:e1000626. subtypes. J Hum Genet 2017;62:15–24. 39. GaiteriC,DingY,FrenchB,TsengGC,SibilleE.Beyondmodulesand 55. Aure MR, Jernstrom S, Krohn M, Vollan HK, Due EU, Rodland E, et al. hubs: the potential of gene coexpression networks for investigating Integrated analysis reveals microRNA networks coordinately expressed molecular mechanisms of complex brain disorders. Genes Brain Behav with key proteins in breast cancer. Genome Med 2015;7:21. 2014;13:13–24. 56. Londin ER, Hatzimichael E, Loher P, Edelstein L, Shaw C, Delgrosso K, 40. Padi M, Quackenbush J. Integrating transcriptional and protein inter- et al. The human platelet: strong transcriptome correlations among action networks to prioritize condition-specific master regulators. BMC individuals associate weakly with the platelet proteome. Biol Direct Syst Biol 2015;9:80. 2014;9:3. 41. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol 2012; 57. MacNeil LT, Pons C, Arda HE, Giese GE, Myers CL, Walhout AJ. Transcrip- 8:565. tion factor activity mapping of a tissue-specific in vivo gene regulatory 42. Rigoutsos I, Lee SK, Nam SY, Anfossi S, Pasculli B, Pichler M, et al. N-BLR, a network. Cell Syst 2015;1:152–62. primate-specific non-coding transcript leads to colorectal cancer invasion 58. ChunJiao S, Huan C, ChaoYang X, GuoMei R. Uncovering the roles of and migration. Genome Biol 2017;18:98. miRNAs and their relationship with androgen receptor in prostate cancer. 43. ShahMY,FerrajoliA,SoodAK,Lopez-BeresteinG,CalinGA.microRNA IUBMB Life 2014;66:379–86. therapeutics in cancer - an emerging concept. EBioMedicine 2016;12: 59. Barton VN, D'Amato NC, Gordon MA, Christenson JL, Elias A, Richer JK. 34–42. Androgen receptor biology in triple negative breast cancer: a case for 44. Sandberg R, Neilson JR, Sarma A, Sharp PA, Burge CB. Proliferating cells classification as ARþ or quadruple negative disease. Horm Cancer 2015; express mRNAs with shortened 30 untranslated regions and fewer micro- 6:206–13. RNA target sites. Science 2008;320:1643–7. 60. Nakshatri H, Anjanappa M, Bhat-Nakshatri P. Ethnicity-dependent and 45. Abdelmohsen K, Gorospe M. Posttranscriptional regulation of cancer traits -independent heterogeneity in healthy normal breast hierarchy impacts by HuR. Wiley Interdiscip Rev RNA 2010;1:214–29. tumor characterization. Sci Rep 2015;5:13526. 46. Bohl CR, Harihar S, Denning WL, Sharma R, Welch DR. Metastasis 61. Bracken CP, Scott HS, Goodall GJ. A network-biology perspective of suppressors in breast cancers: mechanistic insights and clinical potential. microRNA function and dysfunction in cancer. Nat Rev Genet 2016; J Mol Med 2014;92:13–30. 17:719–32. 47. Stafford LJ, Vaidya KS, Welch DR. Metastasis suppressors genes in cancer. 62. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human Int J Biochem Cell Biol 2008;40:874–91. disease. Cell 2011;144:986–98.

1154 Cancer Res; 78(5) March 1, 2018 Cancer Research

Downloaded from cancerres.aacrjournals.org on September 24, 2021. © 2018 American Association for Cancer Research. Published OnlineFirst December 11, 2017; DOI: 10.1158/0008-5472.CAN-17-1947

Race Disparities in the Contribution of miRNA Isoforms and tRNA-Derived Fragments to Triple-Negative Breast Cancer

Aristeidis G. Telonis and Isidore Rigoutsos

Cancer Res 2018;78:1140-1154. Published OnlineFirst December 11, 2017.

Updated version Access the most recent version of this article at: doi:10.1158/0008-5472.CAN-17-1947

Supplementary Access the most recent supplemental material at: Material http://cancerres.aacrjournals.org/content/suppl/2017/12/09/0008-5472.CAN-17-1947.DC1

Cited articles This article cites 62 articles, 11 of which you can access for free at: http://cancerres.aacrjournals.org/content/78/5/1140.full#ref-list-1

Citing articles This article has been cited by 5 HighWire-hosted articles. Access the articles at: http://cancerres.aacrjournals.org/content/78/5/1140.full#related-urls

E-mail alerts Sign up to receive free email-alerts related to this article or journal.

Reprints and To order reprints of this article or to subscribe to the journal, contact the AACR Publications Department at Subscriptions [email protected].

Permissions To request permission to re-use all or part of this article, use this link http://cancerres.aacrjournals.org/content/78/5/1140. Click on "Request Permissions" which will take you to the Copyright Clearance Center's (CCC) Rightslink site.

Downloaded from cancerres.aacrjournals.org on September 24, 2021. © 2018 American Association for Cancer Research.