Characterisation of the long noncoding RNA GHSROS in prostate and breast cancer

Patrick Brian Thomas

Bachelor of Applied Science (Hons)
Bachelor of Applied Science (Medical Science)

Translational Research Institute
Institute of Health and Biomedical Innovation
Faculty of Health
Queensland University of Technology

A thesis submitted for the degree of Doctor of Philosophy of the Queensland University of Technology

2018

KEYWORDS

Growth hormone secretagogue receptor, GHSROS, noncoding RNA, long noncoding RNA, natural antisense transcript, transcriptional regulation, cancer, prostate cancer, breast cancer, alternative splicing, tumour progression, chemotherapy, expression, proliferation, migration, cell survival.


ABSTRACT

Characterisation of the long noncoding RNA GHSROS in prostate and breast cancer

By Patrick B. Thomas

In a relatively short amount of time, long noncoding RNA (lncRNA) transcripts have emerged as critical regulators of cancer, playing key roles in cancer progression and providing a largely untapped and abundant source of novel therapeutic targets. However, for many of these lncRNAs, the effects of systematic perturbation in cancer, their cellular functions, and whether they can be targeted remain largely unknown.

Using in silico bioinformatics analyses and quantitative expression studies, the expression of the lncRNA growth hormone secretagogue receptor opposite strand (GHSROS) was demonstrated across multiple cancer types. GHSROS is highly expressed in breast cancer samples and a subset of high-grade prostate tumours. Additionally, GHSROS is upregulated in high Gleason score prostate tumours. Forced GHSROS overexpression significantly increases in vitro cell proliferation and migration of PC3, DU145, and LNCaP prostate cancer cell lines. Increased cellular proliferation is recapitulated in prostate cancer xenografts, where GHSROS overexpression results in larger and more vascularised tumours. These findings suggest that inhibition of this lncRNA in cancer may be a beneficial approach. Two antisense oligonucleotides (ASOs) targeting GHSROS were designed and evaluated in vitro. These potently inhibit GHSROS expression and reciprocally regulate cell growth, migration, and cell survival.

To identify fundamental drivers of the observed tumourigenic phenotype, high-throughput RNA-seq data from in vitro cultured PC3 cells and LNCaP xenografts overexpressing GHSROS were examined. A quarter of the genes (101 genes) differentially expressed by GHSROS-overexpressing PC3 cells were also differentially expressed by GHSROS-overexpressing LNCaP xenografts. These genes represent candidate mediators of GHSROS function. They include several transcription factors with established roles in prostate cancer (including the ), chemoresistance and cell survival, and genes associated with metastasis and poor prognosis. GHSROS may mediate tumour survival and resistance to the prostate cancer chemotherapeutic docetaxel.

Finally, we demonstrate that GHSROS promotes tumour growth in breast cancer. Using cDNA arrays, the expression level of GHSROS was found to be significantly upregulated in breast cancer tumours compared to non-malignant tissues. While forced ectopic expression of GHSROS had no significant effect on in vitro proliferation, in vitro migration of both MDA-MB-231 and MCF10A cells was increased by GHSROS overexpression. Microarray analysis, with selected genes validated by qRT-PCR, showed that GHSROS represses the expression of multiple genes from the HLA class II axis, and that the regulated genes are collectively enriched for both cancer-related and immune response-associated pathways. MDA-MB-231-GHSROS cells showed significantly increased mammary fat pad xenograft growth, suggesting that GHSROS promotes tumour growth across multiple cancer types.

Taken together, this thesis represents the first comprehensive study of the GHSR antisense lncRNA, GHSROS, in breast and prostate cancer. Future studies will enable the discovery of the molecular mechanism of its actions and determine whether targeting this lncRNA in cancer may prove a useful therapeutic strategy.

LIST OF PUBLICATIONS

Manuscripts related to this thesis – under review

Thomas P, Jeffery P, Gahete M, Whiteside E, Walpole C, Maugham M, Jovanovic L, Gunter J, Williams E, Nelson C, Herington A, Luque R, Veedu R, Chopin L, Seim I. (2018). The long noncoding RNA GHSROS mediates expression of genes associated with aggressive prostate cancer and promotes tumour progression. Under review, Oncogenesis.

Thomas P, Seim I, Jeffery P, Stacey A, Shah E, Whiteside E, Walpole C, Maugham M, Nelson C, Herington A, Chopin L. (2018). The long noncoding RNA GHSROS promotes cell migration and tumour growth in breast cancer. Under review, International Journal of Oncology.

Publications not directly related to this thesis

Maugham M, Seim I, Thomas P, Crisp G, Shah E, Herington A, Chen C, Rockstroh A, Gregory L, Nelson C, Jeffery P, Chopin L. (2018). Short-term inhibitory effects of the ghrelin receptor antagonist [D-Lys3]-GHRP-6 on human prostate cancer xenograft growth in NOD/SCID mice. Accepted, Endocrine.

Seim I, Jeffery P, Thomas P, Nelson C, Chopin L. (2017). Whole-genome sequence of the metastatic PC3 and LNCaP human prostate cancer cell lines. G3 (Bethesda). doi: 10.1534/g3.117.039909.

Maugham M, Thomas P, Crisp G, Philp L, Shah E, Herington A, Chen C, Gregory L, Nelson C, Seim I, Jeffery P, Chopin L. (2017). Insights from engraftable immunodeficient mouse models of hyperinsulinaemia. Scientific Reports. doi: 10.1038/s41598-017-00443-x.

Seim I, Jeffery P*, Thomas P*, Walpole C*, Maugham M, Fung J, Yap P, O’Keeffe A, Whiteside E, Herington A, Chopin L. (2016). Multi-species sequence comparison reveals conservation of ghrelin gene-derived splice variants encoding a truncated ghrelin peptide. Endocrine. 52(3):609-17

Seib C, Whiteside E, Humphreys J, Lee K, Thomas P, Chopin L, Crisp G, O’Keeffe A, Kimlin M, Stacey A, Anderson D. (2014). A longitudinal study of the impact of chronic psychological stress on health-related quality of life and clinical biomarkers: Protocol for the Australian Healthy Aging of Women Study. BMC Public Health. 8;14:9

Seim I, Whiteside E, Pauli J, Stacey A, Thomas P, O’Keeffe A, Carter S, Walpole C, Josh P, Herington A, Chopin L. (2013). Identification of a long noncoding RNA gene, growth hormone secretagogue receptor opposite strand, which stimulates cell migration in non-small cell lung cancer cell lines. International Journal of Oncology. 43(2):566-74

Conference Presentations

Thomas P, Jeffery P, Gahete M, Whiteside E, Walpole C, Maugham M, Jovanovic L, Gunter J, Nelson C, Herington A, Luque R, Veedu R, Chopin L, Seim I. (2017). The lncRNA GHSROS mediates tumour growth and expression of genes associated with metastasis and adverse outcome. ESA/SRB Joint Meeting 2017, Perth, Australia. Novartis Junior Scientist oral presentation finalist. Novartis Junior Scientist Award Winner

Thomas P, Jeffery P, Gahete M, Whiteside E, Walpole C, Maugham M, Jovanovic L, Gunter J, Nelson C, Herington A, Luque R, Veedu R, Chopin L, Seim I. (2017). The lncRNA GHSROS mediates tumour growth and expression of genes associated with metastasis and adverse outcome. Translational Research Institute Symposium, Brisbane, Australia. Poster presentation

Thomas P, Seim I, Whiteside E, Walpole C, Maugham M, Crisp G, Jovanovic L, Nelson CC, Herington A, Veedu R, Jeffery P, Chopin L. (2016). The long noncoding RNA, GHSROS, mediates prostate cancer growth. Brisbane Cancer Conference, Brisbane, Aus. Oral presentation.

Thomas P, Seim I, Whiteside E, Walpole C, Maugham M, Crisp G, Jovanovic L, Nelson CC, Herington A, Veedu R, Jeffery P, Chopin L. (2016). The long noncoding RNA, GHSROS, mediates prostate cancer growth. ESA/SRB 2016, Gold Coast, Brisbane, Aus. Oral presentation.

Thomas P, Seim I, Whiteside E, Walpole C, Maugham M, Crisp G, Jovanovic L, Nelson C, Herington A, Veedu R, Jeffery P, Chopin L. (2016). Targeting the long noncoding RNA GHSROS, a mediator of prostate cancer tumour growth, with short antisense oligonucleotides. ENDO 2016, Boston, Massachusetts, USA. Poster presentation. Judges Presidential poster winner

Thomas P, Seim I, Whiteside E, Walpole C, Maugham M, Crisp G, Jovanovic L, Nelson C, Herington A, Veedu R, Jeffery PL, Chopin L. (2015). The ghrelin receptor long noncoding RNA, GHSROS, as a target for prostate cancer. Functional Nucleic Acids: From laboratory to targeted molecular therapy, Perth, Australia. Invited oral presentation

Thomas P, Seim I, Whiteside E, Walpole C, Maugham M, Crisp G, Jovanovic L, Nelson C, Herington A, Veedu R, Jeffery PL, Chopin L. (2015). The ghrelin receptor antisense long noncoding RNA, GHSROS, in prostate cancer growth and survival. IHBI inspires postgraduate student conference, Gold Coast, Australia. Poster presentation

Thomas P, Seim I, Whiteside E, Walpole C, Maugham M, Crisp G, Jovanovic L, Nelson CC, Herington A, Veedu R, Jeffery P, Chopin L. (2015). The ghrelin receptor antisense long noncoding RNA, GHSROS, in prostate cancer growth and survival. Translational Research Institute Symposium, Brisbane, Australia. Poster presentation

Thomas P, Seim I, Whiteside E, Walpole C, Maugham M, Crisp G, Jovanovic L, Nelson C, Herington A, Veedu R, Jeffery PL, Chopin L. (2015). The ghrelin receptor antisense long noncoding RNA, GHSROS, in prostate cancer growth and survival. Prostate Cancer World Congress, Cairns, Australia. Oral and poster presentation. Judges best pick poster

Thomas P, Walpole C, Jeffery P, Herington A, Chopin L, Seim I. (2014). Multi-species sequence comparison reveals conservation of preproghrelin splice variants and a novel variant encoding a truncated ghrelin peptide. International conference on bioinformatics, Sydney, Australia. Oral presentation

Thomas P, Whiteside E, Stacey A, Herington A, Chopin L, Seim I. (2013). The ghrelin receptor antisense gene, GHSROS, in breast cancer progression. ENDO 2013, San Francisco, United States of America. June 2013. Poster presentation.

Thomas P, Seim I, Walpole C, Jeffery P, Jovanovic L, Herington A, Nelson C, Whiteside E, Chopin L. (2013). The expression and function of GHSROS, a long noncoding RNA encoded by the opposite strand of the ghrelin receptor gene, in prostate cancer. IHBI inspires post graduate student conference, Brisbane, Australia. Poster presentation. Student poster prize winner

Thomas P, Seim I, Walpole C, Jeffery P, Jovanovic L, Herington A, Nelson C, Whiteside E, Chopin L. (2013). The expression and function of GHSROS, a long noncoding RNA encoded by the opposite strand of the ghrelin receptor gene, in prostate cancer. The Australian-Canadian Prostate Cancer Research Alliance Symposium, August 2013, Port Douglas. Poster presentation

Thomas P, Seim I, Walpole C, Jeffery P, Jovanovic L, Herington A, Nelson C, Whiteside E, Chopin L. (2013). The expression and function of GHSROS, a long noncoding RNA encoded by the opposite strand of the ghrelin receptor gene, in prostate cancer. Prostate Cancer World Congress, August 2013, Melbourne. Oral and poster presentation

AWARDS RELATING TO THIS THESIS

2017 – Novartis Junior Scientist Award; Endocrine Society of Australia/ Society of Reproductive Biology (ESA/SRB) Joint meeting, Perth, Australia.

2016 – Runner up oral presentation; Australian Society of Medical Research (ASMR) postgraduate student conference, Brisbane, Australia.

2016 – Presidential poster prize (Tumour biology); Meeting of the Endocrine Society, ENDO, Boston, Massachusetts, USA.

2015 – Judges pick poster prize; Prostate Cancer World Congress (PCWC) Conference, Cairns, Australia.

2013 – Judges pick poster prize; IHBI Inspires Student Conference, Brisbane, Australia.

TABLE OF CONTENTS

KEYWORDS
ABSTRACT
LIST OF PUBLICATIONS AND PRESENTATIONS
AWARDS RELATING TO THIS THESIS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF ABBREVIATIONS
STATEMENT OF ORIGINAL AUTHORSHIP
ACKNOWLEDGEMENTS

CHAPTER 1 - INTRODUCTION AND LITERATURE REVIEW
1.1 Introduction
1.2 A brief overview of noncoding RNAs in the
1.3 Long noncoding RNAs (lncRNAs)
1.3.1 LncRNA subtypes
1.3.2 Roles and functions of lncRNAs
1.3.3 Mechanisms of action of lncRNA
1.3.3.1 LncRNA mediated gene and genome regulation
1.3.3.1.1 Regulation of histone modifications by lncRNAs
1.3.3.1.2 Modulation of DNA methylation by lncRNAs
1.3.3.1.3 Chromatin remodelling
1.3.3.1.4 Gene regulation through lncRNA scaffold functions, and decoys for
1.3.3.1.5 Direct control of transcription
1.3.3.2 Function of lncRNA at the post-transcriptional level
1.3.3.2.1 miRNA sponge lncRNAs
1.3.3.2.2 lncRNAs mediating transcription processing of pre-mRNA
1.3.3.3 lncRNA regulation of mRNA stability and translation
1.3.4 Long noncoding RNAs in cancer
1.3.4.1 LncRNAs functioning as oncogenes
1.3.4.2 Tumour suppressing lncRNAs
1.3.5 LncRNAs in prostate cancer
1.3.5.1 Incidence, genomic landscape and treatment of prostate cancer
1.3.5.2 LncRNA and the hallmarks of prostate cancer
1.3.5.2.1 LncRNAs involved in cell proliferation and apoptosis
1.3.5.2.3 Cell migration, invasion, and metastasis
1.3.5.2.4 LncRNAs involved in the process of angiogenesis and tumour hypoxia
1.3.5.3 LncRNAs involved in AR signalling and castrate-resistant prostate cancer
1.3.5.4 LncRNAs that confer resistance to chemotherapeutics
1.3.5.5 LncRNA as prostate cancer diagnostics
1.3.5.6 Therapeutic strategies to target lncRNAs
1.4 Growth Hormone Secretagogue Receptor Opposite Strand (GHSROS) gene
1.5 Conclusion
1.6 Overall Objectives
1.6.1 Hypotheses
1.6.2 The specific aims of this study
1.6.3 Significance

CHAPTER 2 - MATERIALS AND METHODS
2.1 Cell Culture, prostate cancer patient-derived xenograft (PDX) cell lines and treatments
2.2 Production of GHSROS overexpressing cancer cell lines
2.3 RNA extraction, reverse transcription and quantitative reverse transcription Polymerase Chain Reaction (qRT-PCR)
2.4 Cell proliferation assays
2.5 Cell Migration assays
2.6 Locked Nucleic Acid-Antisense Oligonucleotides (LNA-ASO)
2.7 Mouse subcutaneous and mammary fat pad in vivo xenograft models
2.8 Histology and immunohistochemistry
2.9 Viability Assay
2.10 Statistical analyses

CHAPTER 3 - EXPRESSION OF THE LNCRNA GHSROS IN PROSTATE CANCER AND ITS ROLE IN PROSTATE TUMOUR PROMOTION
3.1 Introduction
3.2 Materials and Methods
3.2.1 Bioinformatics
3.2.1.1 GHSROS, GHSR1a and GHSR1b annotation analysis
3.2.1.2 Identification of GHSROS transcription in exon array data sets
3.2.1.3 Evaluation of GHSR/GHSROS transcription in deep RNA-seq data set
3.2.2 Expression of GHSROS in cancer specimens, prostate cell lines and tissues
3.2.2.1 GHSROS qRT-PCR expression in human tissue specimens
3.2.3 Cell lines, prostate cancer PDX cell lines and cell culture
3.2.4 RT-PCR of cell line mRNA
3.2.5 Production of GHSROS overexpressing cancer cell lines
3.2.6 Cell Migration assays
3.2.7 Cell proliferation assays
3.2.8 Anchorage-independent growth assay
3.2.9 Attachment assays
3.2.10 Mouse subcutaneous in vivo xenograft models
3.2.11 Histology and immunohistochemistry
3.2.12 RNA secondary structure prediction
3.2.13 Locked Nucleic Acid-Antisense Oligonucleotides (LNA-ASO)
3.2.14 Statistical analysis
3.3 Results
3.3.1 GHSROS is a bona fide mammalian lncRNA that is actively transcribed
3.3.2 In silico analysis of GHSROS expression using public datasets
3.3.3 GHSROS is upregulated in prostate cancer and associates with advanced Gleason Score
3.3.4 Expression of GHSROS in prostate-derived cell lines and patient-derived xenograft (PDX) samples
3.3.5 GHSROS promotes motility of prostate cancer cells in vitro
3.3.6 GHSROS promotes growth of prostate cancer cells in vitro
3.3.7 The effects of GHSROS on the rate of cell attachment to extracellular matrix constituents
3.3.8 GHSROS potentiates tumour growth in vivo
3.3.9 Locked nucleic antisense oligonucleotide (LNA-ASO) knockdown of GHSROS
3.3.10 GHSROS regulates the expression of antisense GHSR transcripts
3.4 Discussion

CHAPTER 4 - THE LNCRNA GHSROS MEDIATES EXPRESSION OF GENES ASSOCIATED WITH METASTASIS AND ADVERSE OUTCOME
4.1 Introduction
4.2 Materials and Methods
4.2.1 RNA-sequencing of PC3-GHSROS cells
4.2.2 Cell lines, prostate cancer PDX cell lines and cell culture
4.2.3 Production of GHSROS overexpressing cancer cell lines
4.2.4 Locked Nucleic Acid-Antisense Oligonucleotides (LNA-ASO)
4.2.5 RT-PCR of cell line mRNA
4.2.6 Gene Ontology (GO) term analyses
4.2.7 Oncomine Concepts analysis
4.2.8 The Cancer Genome Atlas analysis
4.2.9 Performance of gene signature
4.2.10 Statistical analyses
4.2.11 Code
4.3 Results
4.3.1 RNA-seq analysis of GHSROS overexpressing PC3 cell line
4.3.2 Pathway analysis of RNA-seq gene network derived from GHSROS overexpressing PC3 cell line
4.3.3 GHSROS regulated genes comprise a unique gene signature which characterises advanced prostate cancer
4.3.4 Association of GHSROS-regulated 34-gene signature with disease-free survival (DFS)
4.4 Discussion

CHAPTER 5 - THE LNCRNA GHSROS FACILITATES ANDROGEN RECEPTOR-INDEPENDENT TUMOUR GROWTH AND MEDIATES RESISTANCE TO DOCETAXEL IN PROSTATE CANCER
5.1 Introduction
5.2 Materials and Methods
5.2.1 Cell culture and drug treatments
5.2.2 Production of GHSROS overexpressing cancer cell lines
5.2.3 Cell Viability Assay
5.2.4 Cell Migration assays
5.2.5 Mouse subcutaneous in vivo xenograft models
5.2.6 Histology and immunohistochemistry
5.2.7 RNA sequencing of LNCaP-GHSROS cells
5.2.8 RT-PCR of cell line mRNA
5.2.9 Gene Ontology (GO) and gene set enrichment (GSEA) analyses
5.2.10 LP50 prostate cancer cell line AR knockdown microarray
5.2.11 Survival analysis
5.2.12 Statistical analysis
5.3 Results
5.3.1 GHSROS promotes growth and motility of LNCaP cancer cells in vitro
5.3.2 Overexpression of GHSROS in the LNCaP cell line modulates the expression of cancer-associated genes consistent with its functional effects
5.3.3 Identification of common genes regulated in the PC3-GHSROS cell line and LNCaP-GHSROS xenografts
5.3.4 GHSROS potently represses PPP2R2C, a mediator of androgen pathway-independent prostate tumour growth and mortality
5.3.7 GHSROS is associated with cell survival and resistance to the cytotoxic drug docetaxel
5.4 Discussion

CHAPTER 6 - THE LONG NONCODING RNA GHSROS FACILITATES BREAST CANCER CELL LINE MIGRATION AND ORTHOTOPIC XENOGRAFT TUMOUR GROWTH
6.1 Introduction
6.2 Materials and Methods
6.2.1 Cell culture
6.2.3 Expression of GHSROS across cancer specimens, prostate cell lines and tissues
6.2.3.1 GHSROS expression in human tissue specimens
6.2.3.2 RT-PCR of cell line mRNA
6.2.3 Production of GHSROS overexpressing cancer cell lines
6.2.4 Cell proliferation assays
6.2.5 Cell Migration assays
6.2.6 Oligonucleotide microarray and analysis
6.2.7 Gene Ontology (GO) term and STRING analysis
6.2.8 Orthotopic mammary fat pad in vivo xenografts in a NOD/SCID gamma (NSG) mouse model
6.2.9 Statistical analyses
6.3 Results
6.3.1 GHSROS expression is elevated in breast cancer samples and breast cancer-derived cell lines
6.3.2 Ectopic overexpression of GHSROS promotes in vitro breast-derived cell line migration, but not proliferation
6.3.3 GHSROS regulates genes associated with cancer and immune response
6.3.4 GHSROS increases orthotopic breast xenograft growth
6.4 Discussion

CHAPTER 7 - GENERAL DISCUSSION

CHAPTER 8 - REFERENCES CITED

CHAPTER 9 - APPENDIX

LIST OF FIGURES

Figure 1.1. Categories of long noncoding RNAs classified on the basis of their genomic position relative to protein-coding genes.
Figure 1.2. Diagram showing three mechanisms by which an antisense sequence can be transcribed.
Figure 1.3. The mechanistic functions of action of lncRNAs
Figure 1.4. lncRNAs in different human cancers.
Figure 1.5. Summary of key lncRNAs involved in the hallmarks of prostate cancer.
Figure 1.6. Strategies for the clinical implications of lncRNAs in cancer patients.
Figure 1.7. Identification and verification of antisense transcription in the GHSR locus. Exons are shown as boxes and introns as lines.
Figure 1.8. Expression and in vitro migration effects of GHSROS in lung cancer (NSCLC) and NSCLC cell lines.
Figure 1.9. GHSROS expression across multiple cancer types.
Figure 3.1. Schematic representation of the GHSR and GHSROS genomic loci
Figure 3.2. Structural analysis of GHSROS and GHSR transcript sequence
Figure 3.3. Scatterplot of GHSROS Universal exPression Code values in publicly available exon array data sets
Figure 3.4. UCSC genome browser visualization of GHSR/GHSROS locus expression in castration-resistant prostate cancer.
Figure 3.5. GHSROS expression in cancer
Figure 3.6. The lncRNA GHSROS is highly expressed in a subset of aggressive tumours.
Figure 3.7. GHSROS expression in the Andalusian prostate cancer (PCa) cohort
Figure 3.8. Expression of GHSROS in prostate cancer-derived cell lines and patient-derived prostate cancer xenografts (PDX) lines by qRT-PCR.
Figure 3.9. GHSROS increases cell migration in the PC3 and DU145 prostate cancer cell lines
Figure 3.10. Overexpression of GHSROS promotes DU145 and PC3 cell proliferation
Figure 3.11. GHSROS increases attachment to collagen type I in the PC3 cell lines but not to collagen type IV or Fibronectin
Figure 3.12. GHSROS promotes in vivo xenograft tumour growth in NOD/SCID mice
Figure 3.13. Validation of GHSROS overexpression in PC3 and DU145 tumour xenografts by qRT-PCR.
Figure 3.14. Strand-specific LNA design and mapping to GHSROS
Figure 3.15. LNA-ASOs reduce GHSROS expression and decrease PC3 proliferation, migration and cell survival
Figure 3.16. Regulation of GHSR1b by GHSROS overexpression and knockdown

Figure 4.1. GHSROS overexpression in the PC3 cell line upregulates cancer associated genes related to advanced prostate cancer
Figure 4.2. GHSROS overexpression in the PC3 cell line increases the expression of cancer associated genes
Figure 4.3. GHSROS overexpression enriches for GO terms important for cancer progression
Figure 4.4. GHSROS characterises advanced prostate cancer
Figure 4.5. Identification of a 34-gene signature associated with disease-free survival
Figure 4.6. Kaplan-Meier analyses of 34-gene signature in the TCGA-PRAD cohort.
Figure 4.7. Performance of 34 gene signature.
Figure 5.1. GHSROS promotes LNCaP cell line growth and motility in vitro and in vivo.
Figure 5.2. Transcriptomic analysis and expression of selected genes in LNCaP tumour xenografts overexpressing GHSROS.
Figure 5.3. Comparative analysis of PC3-GHSROS and LNCaP-GHSROS differentially expressed RNA-seq genes demonstrates commonly regulated
Figure 5.4. Zinc finger protein 467 (ZNF467), a gene induced by forced GHSROS-overexpression, is upregulated by metastatic tumours and associated with adverse relapse outcome.
Figure 5.5. Identification and validation of GHSROS perturbed gene expression in various cancer models
Figure 5.6. Effect of androgen receptor (AR) perturbation in LP50 prostate cancer cells on PPP2R2C expression.
Figure 5.7. GHSROS mediates cell survival and resistance to the cytotoxic drug docetaxel
Figure 5.8. GHSROS is not an androgen regulated gene in the LNCaP cell line but its expression is promoted by docetaxel.
Figure 6.1. GHSROS is expressed at low levels in normal breast tissue and at higher levels in breast cancer.
Figure 6.2. GHSROS promotes cell migration, but not cell proliferation in the MCF10A and MDA-MB-231 breast cancer cell lines in vitro.
Figure 6.3. GHSROS significantly differentially regulates 76 genes in the MDA-MB-231 breast cancer cell line.
Figure 6.4. GHSROS promotes orthotopic MDA-MB-231 xenograft tumour growth in vivo
Figure 7.1. Proposed model of GHSROS function in prostate cancer.

LIST OF TABLES

Table 1.1. Aberrantly expressed lncRNAs in prostate cancer
Table 2.1. Patient derived xenograft (PDX) characteristics
Table 2.2. LNA-ASOs used in this study
Table 3.1. Primer sequences used in this study
Table 3.2. Structural annotation of GHSROS, GHSR1a, GHSR1b.
Table 3.3. Interacting protein binding prediction for GHSROS.
Table 3.4. Relationship between GHSROS expression and clinical and pathological parameters in OriGene TissueScan Prostate Cancer Tissue qPCR panels
Table 3.5. Correlation between GHSROS expression and clinical and pathological parameters in the Andalusian Biobank prostate tissue cohort
Table 4.1. Primers used in this study
Table 4.2. Selected prostate cancer associated gene expression signatures from the literature
Table 4.3. Disease-free survival (DFS) analysis of TCGA patients using the 34-gene signature.
Table 5.1. Primer sequences used in this study
Table 5.2. Differentially expressed genes in PC3-GHSROS and LNCaP-GHSROS cells (against the vector control cells) compared to the Grasso and Taylor Oncomine data set
Table 5.3. Disease-free survival (DFS) analysis of differentially expressed genes (in PC3-GHSROS cells, LNCaP-GHSROS cells, and clinical metastatic tumours) in human data sets.
Table 6.1. Primer sequences used in this study.
Table 6.2. GHSROS expression (qRT-PCR) and clinicopathological parameters in breast cancer and normal breast clinical specimens
Table 6.3. Enriched KEGG pathway terms for 40 genes downregulated in microarray analysis from MDA-MB-231-GHSROS cells (compared to empty-vector control).

LIST OF ABBREVIATIONS

˚C    Degrees Celsius
µg    Microgram(s)
µl    Microlitre(s)
µM    Micromolar
18S    Ribosomal 18S RNA
5-AzadC    5-Aza-2’-deoxycytidine
AA    Amino Acid
Airn    Antisense Of IGF2R Non-Protein-coding RNA
ADT    Androgen deprivation therapy
ANOVA    Analysis of Variance
ANRIL    Antisense noncoding RNA in the INK4 locus
AR    Androgen receptor
AR-v7    AR splicing variant 7
ASNR    Apoptosis suppressing-noncoding RNA
ASO    Antisense Oligonucleotide
ATT    Androgen-targeted therapy
AUF1    AU-rich (ARE)/poly (U)-binding/degradation factor 1
BCL    B-cell lymphoma like-2
BCF    Biochemical failure
BCR    Biochemical recurrence
bp    Base pair(s)
BPH    Benign prostatic hyperplasia
BRG1    Brahma-related gene-1
CA    Cribriform architecture
CAGE    Cap analysis of Gene Expression
CBX7    Chromobox 7
cDNA    Complementary DNA
Cdr1as    Cerebellar degeneration-related protein 1 antisense
ceRNA    Competing endogenous RNA
ChOP    Chromatin oligo-affinity precipitation
cis-NAT    cis Natural Antisense Transcript
CKS2    Cyclin-dependent kinases regulatory subunit 2
CRPC    Castration-resistant prostate cancer

CSC    Cancer stem-like cells
CT    Cycle threshold
CTBP1    C-terminal binding protein 1
CTBP1-AS    C-terminal binding protein 1-antisense
DAB2IP    Disabled homolog 2-interacting protein
DANCR    Differentiation antagonising non-protein-coding RNA
DFS    Disease-free survival
DHRS4    Dehydrogenase/reductase SDR family member 4
DHT    Dihydrotestosterone
DMEM    Dulbecco’s Modified Eagle Medium
DNA    Deoxyribonucleic acid
DNMT1    DNA methyltransferase 1
dNTP    Deoxynucleotide triphosphate
DRAIC    Downregulated RNA in cancer
ds    Double-stranded
dsRNA    Double-stranded RNA
ecCEBPA    Extra coding CCAAT/enhancer-binding protein alpha
EDTA    Ethylenediaminetetraacetic acid-disodium salt
EGFR    Epidermal growth factor receptor
eIF4E    Eukaryotic translation initiation factor 4E
ENCODE    ENCyclopedia of DNA Elements
EMP    Epithelial-to-mesenchymal plasticity
ERα    Estrogen receptor alpha
ERG    ETS-(E-twenty-six) related gene
eRNA    Enhancer RNA
ESCC    Esophageal squamous cells
EST    Expressed Sequence Tag
Evf2    Embryonic ventral forebrain-1
EZH2    Enhancer of zeste homolog
FALEC    Focally amplified lncRNA on chromosome 1
FANTOM    Functional Annotation of Mouse
FBS    Fetal bovine serum
Fendrr    Fetal-lethal developmental regulatory RNA
Firre    Functional intergenic repeating RNA element
FOXA1    Forkhead box protein A1
g    G-force

GAPDH    Glyceraldehyde-3-phosphate dehydrogenase
GAS5    Growth arrest-specific 5
GHSR    Growth Hormone Secretagogue Receptor
GHSROS    Growth Hormone Secretagogue Receptor opposite strand
h    Hour(s)
HCC    Hepatocellular carcinoma
HDAC-Sin3A    Histone deacetylase paired amphipathic helix protein Sin3a complex
hnRNP A1    Heterogeneous nuclear ribonucleoprotein A1
hnRNP-K    Heterogeneous nuclear ribonucleoprotein K
hnRNP U    Heterogeneous nuclear ribonucleoprotein U
HOTAIR    HOX transcript antisense RNA
HOX    Homeobox
HULC    Highly upregulated in liver cancer
IDC    Intraductal carcinoma
IFNB1    Fibrillin
Igf2R    Insulin-like growth factor 2 receptor
IPS1    Induced by Phosphate Starvation 1
kb    Kilo base pair(s)
KCNQ1    Potassium Voltage-Gated Channel Subfamily Q Member 1
KCNQ1OT1    Potassium voltage-gated channel subfamily Q member 1 overlapping transcript
kDa    Kilo Daltons
KRAS    Kirsten rat sarcoma
KvDMR1    Potassium Voltage-Gated Channel Subfamily Q Member 1 domain
L    Litre
lincRNA    Long intergenic noncoding RNA
LNA    Locked nucleic acid
lncRNA    Long noncoding RNA
LSD1    Lysine-specific histone demethylase 1A
LUCAT1    Lung cancer associated transcript 1
M    Molar
MAPK    Mitogen-activated protein kinase
MALAT1    Metastasis-associated lung adenocarcinoma transcript-1
MAT2A    Methionine adenosyltransferase
mCRPC    Metastatic castration-resistant prostate cancer
MDM2    Mouse double minute 2 homolog
MEG3    Maternally expressed 3

MER    Medium Reiteration (transposable element)
MIAT    Myocardial infarction associated transcript
MATF    Microphthalmia-associated transcription factor
min    Minutes
miRNA    MicroRNA
mL    Millilitre(s)
MLL    Mixed-lineage leukemia
mM    Millimolar
Mhrt    Myheart
mRNA    Messenger RNA
MST1    Macrophage stimulating 1
NAT    Natural Antisense Transcript
ncRNA    Noncoding RNA
NEAT1    Nuclear-enriched autosomal transcript 1
ng    Nanogram(s)
NKX3-1    Homeobox protein Nkx-3.1
NSCLC    Non-small cell lung cancer
nt    Nucleotide
NTC    No template control
ORF    Open Reading Frame
OS    Overall survival
OSCC    Oral squamous cell carcinomas
P54/NRB    Non-POU domain-containing octamer-binding protein
PIK3CA    Phosphatidylinositol 4,5-bisphosphate 3-kinase catalytic subunit alpha isoform
PAGE    Polyacrylamide gel electrophoresis
PANDA    p21 associated ncRNA DNA damage activated
PARTICL    Promoter of methionine adenosyltransferase 2A (MAT2A)-antisense radiation-induced circulating lncRNA
PBS    Phosphate Buffered Saline
PCA    Prostate cancer
PCA3    Prostate cancer antigen 3
PCAT1    Prostate cancer-associated transcript 1
PCAT5    Prostate cancer-associated transcript 5
PCAT29    Prostate cancer-associated transcript 29
PCGEM1    Prostate cancer gene expression marker 1
PCR    Polymerase Chain Reaction

piRNA    Piwi-interacting RNA
POTEF-AS1    Prostate, ovary, testis expressed protein family member-F gene-antisense transcript 1
PRC1    Polycomb repressive complex 1
PRC2    Polycomb repressive complex 2
pri    Polished rice
PRNCR1    Prostate cancer noncoding RNA-1
PRUNE2    Prune homolog 2
PSA    Prostate-specific antigen
PTEN    Phosphatase and tensin homolog
PTENP1    Phosphatase and tensin homolog pseudogene 1
qRT-PCR    Quantitative real-time Reverse Transcription Polymerase Chain Reaction
RNA    Ribonucleic acid
RNAi    RNA interference
rRNA    Ribosomal RNA
rpm    Revolutions per minute
RPMI    Roswell Park Memorial Institute 1640 medium
RNP    Ribonucleoprotein
RT    Reverse transcription
RT-PCR    Reverse transcription Polymerase Chain Reaction
s    Second(s)
SChLAP1    Second chromosome locus associated with prostate-1
SDS    Sodium dodecyl sulphate
SFM    Serum-free media
SINE    Short Interspersed Nuclear Element
siRNA    Small interfering RNA
SMAD3    Mothers against decapentaplegic homolog 3
SMARCB1    SWI/SNF Related, matrix-associated actin dependent regulator of chromatin subfamily B member 1
snoRNA    Small nucleolar RNA
SNP    Single Nucleotide polymorphism
sORF    Short Open Reading Frame
SRSF    Serine/arginine-rich splicing factors
ss    Single-stranded
STORM    Stress- and TNF-α-activated ORF micropeptide

SUZ12    Suppressor of Zeste 12
SWI/SNF    SWItch Sucrose Non-fermentable complex
Ta    Annealing temperature
TALE    Transcription activator-like effectors
TCGA    The Cancer Genome Atlas
TINCR    Terminal differentiation-induced ncRNA
T-UCRs    Transcribed ultraconserved regions
TE    Tris:EDTA
TE    Transposable Element
TIC    Liver tumour initiating cells
Tm    Melting temperature
TMEM48    Transmembrane Protein 48
TMPRSS2    Transmembrane protease, serine 2
TNBC    Triple-negative breast cancer
TNM    Tumour Node Metastasis
TNFSF10    Tumour necrosis factor superfamily member 10
trans-NAT    trans Natural Antisense Transcript
tRNA    Transfer RNA
Tris    Tris(hydroxymethyl)aminomethane
TrxG    Trithorax-group
TSS    Transcription start site
U    Units
U2AF65    U2 auxiliary factor 65 kDa subunit
UCA1    Urothelial cancer associated 1
UTR    Untranslated region
VEGFA    Vascular endothelial growth factor A
WDR5    WD repeat-containing protein 5
XIST    X-inactive specific transcript
ZEB1    Zinc-finger E-box binding homeobox 1
ZNF217    Zinc finger protein 217

STATEMENT OF ORIGINAL AUTHORSHIP

The work contained in this thesis has not been previously submitted for a degree or diploma at this or any other higher education institution. To the best of my knowledge and belief, the thesis contains no material previously published or written by any other person except where due reference is made.

Signed: QUT Verified Signature

Patrick Brian Thomas BAppliedSci(Medical Science), QUT BAppliedSci(hons), QUT

Date: November 2018

ACKNOWLEDGEMENTS

Firstly, I would like to acknowledge the tremendous efforts of my doctoral supervisors, Professor Lisa Chopin, Dr Inge Seim and Dr Penny Jeffery. My primary supervisor Lisa has provided limitless patience, constant support, guidance and countless sultanas over the preceding years. After over a decade of being your student in some capacity, I owe you for not only teaching me human physiology, but also for your motivation and abundant enthusiasm during my PhD. I have learned a lot from your tutelage and it was a pleasure to be your student. In that regard, I need to thank my associate supervisors, Inge and Penny. Inge, I have gained a lot from your mentorship and cannot applaud you enough for making me a more rigorous scientist. Thank you for all the drunken rants, electronic darts and Boston pizza times. I hope I was not a difficult first PhD student for you. Penny, I cannot truly express my thanks and admiration for your supervision over the last few years. You really have been a fantastic supervisor and mentor, encouraged me, listened to all of my misguided rants, and enhanced my PhD overall. Really, thank you.

I have been blessed to have known some amazing scientists during my PhD. To the countless people who have mentored me over the years – Dr Eliza Whiteside, Dr Carina Walpole, Emeritus Prof Adrian Herington, Assoc. Prof John Harris, Frances Breen, Emeritus Prof Pamela Russell and Assoc. Prof Elizabeth Williams – I hold you all in the highest esteem and thank you for your mentorship. All of you have played a tremendous role in both my research and teaching, so thank you. In particular, I cannot express enough gratitude to Eliza for her enthusiasm, unwavering patience and general love of science. You were a fantastic first supervisor and cherished colleague. Acknowledgements must also go to the Australian Prostate Cancer Research Centre - Queensland (APCRC-Q) and Prof Colleen Nelson for funding travel and presentations and for supporting my PhD.

To my friends and current members of the Ghrelin Research Group – Michelle Maugham, Dr Gabrielle Crisp, Esha Shah – thank you for all the good times, the help, scientific advice, cell culture convos, dance throw-downs and cheeky tequila shots after a tough day (I will forever have a drank bank on hand). It has been a privilege to share these years with you guys.

A very special thanks goes to all the friends I have made over the last few years who have provided me with many a stupid drunk time, morning coffee, rant sessions, and countless waves of memes. In no particular order: Jess, Kate, Jonelle, Ena, Joan, Amanda, Trish, Gemma, Chris, Andrew L, Andrew S, Jacky, Yashna, Jana, Barbara, Jacqui, Sugi, Ash, Sarah-louise, Taylar, Romain, Nathan, Lucas, Cath, Danielle, Charlotte, Stephanie, Nataly, Phoebe, Wendy, Rachel and the Lee family, Ellca, Marianna, Katrina, Liz, Mackenzie, Jess, Yean, Alivia, Elora and Hannah.

To my housemates and exceptional best buds – Joel & Andrew – You guys have been a couple of absolutely great guys for the entire duration of this thesis and I owe you both countless wines for the supporting Part II banter. Our champagne moods, Saturday morning coffees and Masterchef nights kept me sane. Special shout out to the cuter housemates, Archie and Scout, for the endless cuddles and for being comforting feet warmers while writing this thesis.

And finally to my parents Colleen and Michael Thomas (as well as the rest of my family; Bridget, Katie, Sebastian, Lou) – Thank you guys for all the support over the years. Obviously, because of your unwavering support, and given you have basically been feeding me (or taught me to cook), I dedicate this thesis to you.

Chapter 1

Introduction and literature review

1.1 Introduction

In terms of globally estimated incidence of cancer, Australia has the third highest age-standardised rate for all cancers, with 323 people per 100,000 receiving a cancer diagnosis in 2012 (Welfare, 2017). Among men, Australia has the second highest age-standardised rate (373.9 per 100,000), behind only France (Welfare, 2017). This may reflect improvements in, and the frequency of, prostate-specific antigen (PSA) testing for the detection of prostate cancers within the population (Welfare, 2017). Prostate cancer remains the most commonly diagnosed cancer among males and the third most common cause of cancer deaths in Australia (Welfare, 2017). It is estimated that approximately 1 in 30 men will die from prostate cancer by the age of 85 (Welfare, 2017). Similarly, breast cancer is the most commonly diagnosed malignancy in females, and more than 3,400 women succumbed to this disease in 2017 (Welfare, 2017). Improved diagnostics and cancer therapeutics have contributed to reducing cancer-specific mortality for these cancers; however, 5-10% of the 65,000-94,000 people living with these cancers in Australia will experience adverse outcomes (Welfare, 2017). Metastatic prostate cancer remains incurable and limited treatment options exist (AIHW, 2013; Sumanasuriya & De Bono, 2017). It is, therefore, of critical importance that new cancer-specific molecular targets are identified and novel treatments developed in order to improve patient outcomes.

The pace of gene discovery has rapidly accelerated over the last two decades, providing an increasing repertoire of cancer-associated genes and potential therapeutic targets (Morris & Mattick, 2014; Slaby, Laga, & Sedlacek, 2017). The long non-coding RNAs (lncRNAs) provide an abundant source of novel transcripts with important regulatory functions in many diseases, including cancer (Boon et al., 2016). LncRNAs drive cell proliferation, tumour growth, chemoresistance and cell survival, contributing to tumour progression (reviewed in Chiu et al., 2018), and may act as oncogenes or tumour suppressors. Specifically targeting aberrantly expressed lncRNAs in tumours is a promising new field in cancer research (Slaby et al., 2017). It is imperative, therefore, that lncRNAs are identified, characterised, and their functional effects understood in order to develop more targeted and useful cancer therapeutics.

Our laboratory has identified the ghrelin receptor (GHSR) antisense transcript, GHSROS, as an lncRNA that is upregulated in lung, prostate and breast cancer (Whiteside et al., 2013). GHSROS may play a novel regulatory role in cancer and appears to regulate cellular processes that enhance tumourigenesis (Whiteside et al., 2013). Overexpression of GHSROS in lung-derived cell lines increases in vitro migration (Whiteside et al., 2013). In this study, we further analyse the expression of GHSROS in cancer, and focus on its role in prostate and breast cancer using in vitro and in vivo assays. We investigate its role in regulating gene pathways using RNA-sequencing (RNA-seq) and its potential as a therapeutic target in prostate cancer using antisense oligonucleotide knockdown. A more complete analysis of GHSROS in cancer may uncover its potential as a therapeutic target and contribute to our understanding of the role of lncRNA in cancer.

1.2 A brief overview of noncoding RNAs in the human genome

Early studies conducted prior to the Human Genome Project considerably overestimated the number of human protein-coding genes (with early estimates at ~35,000-100,000 gene units) (Ezkurdia et al., 2014; Pertea & Salzberg, 2010). It was not until multinational large-scale transcriptional annotation projects were conducted in concert with the first annotated draft of the human genome that empirical evidence emerged for a more conservative estimate of ~19,000-25,000 human protein-coding genes (Coffey et al., 2011; Derrien et al., 2012; Harrow et al., 2006; Harrow et al., 2012). The single most striking observation uncovered by large-scale annotation studies such as ENCODE (ENCyclopedia Of DNA elements) and GENCODE (Encyclopedia of genes and gene variants) was the abundance of noncoding RNA elements, estimated at 75% of the total human genome (Djebali et al., 2012), in stark contrast to the relatively low number of protein-coding genes mapped from model cell types (Deveson & Mattick, 2017; Lander et al., 2001). While it was originally assumed that mRNAs constitute the bulk of non-ribosomal RNA in the cell, it is actually the pervasive transcription of the many small and long noncoding RNAs that contributes to the breadth and complexity of the mammalian transcriptome (Deveson et al., 2017). Across a range of species, the number of protein-coding genes appears to remain largely static, and indeed, it is well known that mammals possess a relatively similar number of protein-coding genes to less complex organisms such as the nematode worm (Deveson et al., 2017). The complex nature of humans arises from diverse gene expression deriving from our large intergenic and intronic regions, as noncoding genome content increases with developmental complexity (Deveson et al., 2017). It has been proposed that the abundant noncoding genes (in particular, the lncRNAs) form the most important basis of species complexity (Deveson et al., 2017; Mattick, 2005; Morris & Mattick, 2014).

Many types of ncRNA have been identified and are conventionally divided on the basis of their size into small (<200 nucleotides) and long (>200 nucleotides) ncRNAs (Mattick & Rinn, 2015; Morris & Mattick, 2014). The small RNAs include the infrastructural and regulatory classes of RNA: transfer RNAs (tRNAs), ribosomal RNAs (rRNAs), and a number of additional species, such as transcribed ultraconserved regions (T-UCRs), small nucleolar RNAs (snoRNAs), PIWI-interacting RNAs (piRNAs) and microRNAs (miRNAs) (Morris & Mattick, 2014). In contrast, the long RNAs comprise the larger and more diverse range of lncRNAs (Clark & Mattick, 2011; Gutschner & Diederichs, 2012; Ponting & Reik, 2009; Wilusz et al., 2012). Over the last decade, the diverse role of ncRNAs has become widely accepted, and further refinement and elucidation of the functional role of these transcribed elements in normal cellular functions and pathophysiology has been an enthusiastically studied area in molecular biology research (Adams et al., 2017; Khorkova & Wahlestedt, 2015; Q. Nguyen & Carninci, 2016). In particular, lncRNAs have been shown to play essential roles in many developmental processes, normal physiology and disease states, particularly in events surrounding cancer growth and tumourigenesis (Hu et al., 2017; Misawa & Inoue, 2017; Slaby et al., 2017; Sun et al., 2017).

1.3 Long noncoding RNAs (lncRNAs)

LncRNAs are the larger subtype of ncRNA and are arbitrarily classified on a size threshold of >200 nucleotides in length (Mattick & Rinn, 2015; Morris & Mattick, 2014). Long noncoding RNAs are similar in size and structure to mRNAs but have little, if any, protein-coding potential, as they contain only short (<100 AA) open reading frames (sORFs) (Mattick & Rinn, 2015; Morris & Mattick, 2014). Common structural features of lncRNAs include 5’ capping, extensive alternative splicing and polyadenylation at their 3’ ends (Troy & Sharpless, 2012; Wilusz et al., 2012). These broad characteristics are also typical of classical mRNAs transcribed by RNA polymerase II, which transcribes many of the identified lncRNAs (Deniz & Erman, 2017). However, not all lncRNAs are transcribed by polymerase II, and some lncRNAs, such as BC200 (a small cytoplasmic lncRNA expressed in the primate nervous system and in human breast cancer), are transcribed by RNA polymerase III (Czerniak, 2010; Chow et al., 2008; Lamm, 2012). Similar to mRNAs, lncRNAs can also be transcriptionally regulated by conventional transcription factors and morphogens (chemical agents able to govern patterns of tissue development), suggesting that they may be regulated during development and dysregulated during disease (Anastassova-Kristeva, 2015; Herriges et al., 2014; Luo et al., 2016).

A lack of significant protein-coding potential is, therefore, a major characteristic of a noncoding RNA which discriminates it from an mRNA (Mattick & Rinn, 2015). Although it is difficult to demonstrate experimentally, coding potential can be predicted using a number of algorithms based on the intrinsic features of a sequence (e.g., ORF length/coverage, nucleotide composition features, k-mer sequence motifs, conservation scores, secondary structure) (Hu et al., 2017). Historically, the coding potential calculator (CPC) (Kong et al., 2007) and coding-potential assessment tool (CPAT) (Wang & Park, et al., 2013) packages were useful methods for estimating the coding capacity of putative sequences. More recently, coding potential calculator based on multiple evidences (COME) (Hu et al., 2017) and coding potential calculator 2 (CPC2) (Kang et al., 2017) have been released, each employing machine learning approaches using support vector machines (SVM) (Hu et al., 2017). The lack of protein-coding potential of many lncRNAs remains putative, however, and the possibility that they encode short peptides has not been experimentally excluded in most cases (Ruiz-Orera et al., 2014). For example, in Drosophila, polished rice (pri) is a lncRNA that contains only sORFs and it was experimentally shown to encode small bioactive peptides of 11 to 32 amino acids (Clark et al., 2012; Plummer et al., 2013). More recently, it was discovered that phosphorylation of eukaryotic translation initiation factor 4E (eIF4E) by TNF-α-activated macrophage stimulating 1 (MST1) promotes the translation of linc00689 in the absence of canonical mRNA translation (Min et al., 2017). The increased translation of linc00689 produced the 50 AA stress- and TNF-α-activated ORF micropeptide (STORM), a small regulatory peptide (Min et al., 2017). Additionally, the skeletal muscle regenerative response was shown to be modulated by the lncRNA-encoded polypeptide (SPAR), which is generated by the translation of the lncRNA LINC00961 (Tajbakhsh, 2017).
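Two of the intrinsic sequence features mentioned above, ORF length and ORF coverage, can be illustrated with a minimal Python sketch. This is not how CPC, CPAT, COME or CPC2 are implemented; it only scans the forward strand for the longest ATG-to-stop ORF. The function names and the example sequence are hypothetical, and the default thresholds simply reuse the <100 AA and >200 nucleotide cut-offs quoted in this chapter.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (length_in_codons, start_index) of the longest ATG-to-stop ORF.

    Only the three forward-strand reading frames are scanned. The length
    counts the ATG codon but not the stop codon; (0, -1) means no ORF found.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    best_len, best_start = 0, -1
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons > best_len:
                    best_len, best_start = n_codons, start
                start = None  # close the ORF and look for the next ATG
    return best_len, best_start


def orf_features(seq, sorf_max_codons=100, long_min_nt=200):
    """Summarise ORF length and coverage for one transcript (illustrative only)."""
    n_codons, _ = longest_orf(seq)
    coverage = (3 * n_codons) / len(seq) if seq else 0.0
    return {
        "orf_codons": n_codons,
        "orf_coverage": coverage,                      # fraction of transcript in the ORF
        "only_short_orf": n_codons < sorf_max_codons,  # hypothetical sORF cut-off
        "long_transcript": len(seq) > long_min_nt,     # hypothetical >200 nt cut-off
    }


# Hypothetical toy transcript: ATG AAA TTT followed by a TAG stop codon.
print(orf_features("CCATGAAATTTTAGGG"))
```

A real classifier would combine features such as these with composition and conservation scores; a short longest ORF alone does not exclude translation, as the pri and STORM examples show.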

Many lncRNAs are differentially expressed in response to environmental stimuli and exhibit specific spatiotemporal expression patterns in both normal physiology and in disease, strongly supporting their biological relevance (Kopparapu et al., 2013; Truax & Thakkar, 2012). The abundance of lncRNA species is greater in brain region-specific tissues than in liver tissue (Serrano & Castro-Vega, 2011). This extends to other complex and developmentally regulated organs, such as the testis and thymus, where, as in neuronal tissues, noncoding transcript expression is greater (Redondo et al., 2003; Serrano et al., 2011). Thus, in normal physiology, lncRNAs and other noncoding RNAs are important regulators of complex systems such as the central nervous system and spermatogenesis in the testis (Kopparapu et al., 2013). The dynamics of the expression, splicing patterns, organisation and functional role of the large majority of lncRNAs remain to be experimentally verified.

1.3.1 LncRNA subtypes

LncRNAs can be categorised on the basis of their position relative to the nearest protein-coding genes (Wright, 2014). The locus biotypes reported by GENCODE include sense, antisense and intergenic lncRNAs (Derrien et al., 2012) (Figure 1.1). Further sub-classifications are based on these three broad subtypes and include intronic (sense or antisense), intergenic or bi-directional promoter lncRNA (Feng, Fan et al., 2014; Ma & Bajic, 2013; Sanbonmatsu, 2015; Wright, 2014) (Figure 1.1). Intronic lncRNAs reside within the intron of a protein-coding gene, where the protein-coding gene may be on the same or opposite strand (Deveson et al., 2017). Based on their length, a number of RNAs transcribed from enhancer sequences (eRNA) are also lncRNAs (Li, Notani et al., 2016; Rothschild & Basu, 2017). Generally, eRNAs are rarely spliced, while about 30% of lncRNAs show evidence of alternative splicing (Li et al., 2016; Rothschild & Basu, 2017). A noncoding locus which originates from within the promoter of a protein-coding gene and initiates transcription in the opposite direction is classed as a divergent, or bi-directional promoter-associated, lncRNA (Deveson et al., 2017; Morris & Mattick, 2014). Intergenic lncRNAs do not intersect any protein-coding locus and are typically found in long, evolutionarily conserved intergenic regions (Clark & Mattick, 2011; Morris & Mattick, 2014). Conversely, antisense RNAs are located on the opposite strand of the sense gene, and typically intersect one or more exons of the protein-coding gene (Mattick & Rinn, 2015; Morris & Mattick, 2014) (Figure 1.2). Antisense lncRNAs that overlap coding genes are generally termed natural antisense transcripts (NATs) and have diverse roles that include chromatin remodelling and transcriptional/post-transcriptional regulation of neighbouring genes in cis and in trans (Villegas & Zaphiropoulos, 2015).
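The positional categories described above can be sketched as a simple interval comparison between a lncRNA locus and a single protein-coding gene. The following Python sketch is a simplification under stated assumptions: the `Gene` class, the `classify_lncrna` function, the coordinates and the 1,000 bp promoter window are all hypothetical, and annotation projects such as GENCODE apply more detailed rules across all neighbouring genes.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    """A protein-coding gene on one chromosome (hypothetical minimal model)."""
    start: int
    end: int
    strand: str   # "+" or "-"
    exons: list   # list of (start, end) tuples, inclusive coordinates


def overlaps(a_start, a_end, b_start, b_end):
    """True if two inclusive intervals share at least one position."""
    return a_start <= b_end and b_start <= a_end


def classify_lncrna(lnc_start, lnc_end, lnc_strand, gene, promoter_window=1000):
    """Classify a lncRNA locus relative to one protein-coding gene.

    promoter_window is an assumed distance (bp) upstream of the gene's
    transcription start site used to call divergent transcripts.
    """
    same_strand = lnc_strand == gene.strand
    if not overlaps(lnc_start, lnc_end, gene.start, gene.end):
        # Transcription start sites depend on strand orientation.
        gene_tss = gene.start if gene.strand == "+" else gene.end
        lnc_tss = lnc_start if lnc_strand == "+" else lnc_end
        upstream = lnc_tss < gene_tss if gene.strand == "+" else lnc_tss > gene_tss
        if not same_strand and upstream and abs(lnc_tss - gene_tss) <= promoter_window:
            return "divergent"
        return "intergenic"
    orientation = "sense" if same_strand else "antisense"
    hits_exon = any(overlaps(lnc_start, lnc_end, s, e) for s, e in gene.exons)
    # A locus inside the gene body that touches no exon lies within an intron.
    return orientation if hits_exon else f"intronic ({orientation})"


# Hypothetical three-exon gene on the plus strand.
gene = Gene(5000, 9000, "+", [(5000, 5500), (7000, 7400), (8500, 9000)])
for locus in [(1000, 2000, "+"), (4000, 4800, "-"), (5600, 6800, "-"), (6900, 7200, "-")]:
    print(locus, classify_lncrna(*locus, gene))
```

In this toy model, a natural antisense transcript corresponds to the "antisense" outcome: an opposite-strand locus that overlaps at least one exon of the sense gene.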

Figure 1.1. Five major categories of long noncoding RNAs based on their genomic position relative to protein-coding genes. Long noncoding RNAs are broadly classified into five main types. Intergenic lncRNA (purple) are transcribed between two protein-coding genes (blue) and can either be sense or antisense in their orientation. Intronic lncRNA are transcribed entirely from intronic regions. Sense lncRNA transcription occurs in the sense strand of a protein-coding gene, overlapping with protein-coding genes, or covering entire sequences that include exons and can be enhancer-like or divergent transcription of lncRNA/mRNA pairs. Antisense lncRNA are transcribed from the antisense strand of protein-coding genes either from intronic regions, regions that overlap with exons, or covering the entire protein-coding sequence (Adapted from Knoll, Lodish et al., 2015).


Figure 1.2. Diagram showing three mechanisms by which an antisense sequence can be transcribed. These consist of exon overlap, intronic overlap or a complete overlap of the protein-coding sense sequence. (Adapted from Ma et al., 2013).

1.3.2 Roles and functions of lncRNAs

LncRNAs play critical roles in both cellular homeostasis and pathologies such as cancer, cardiovascular disease, some Mendelian diseases and neurological disorders (Brazao et al., 2016; Briggs, Mattick et al., 2015; Hu et al., 2017; Lin & Yang, 2017; Taylor, Chu et al., 2015; Zhang, 2016) and are therefore being explored as suitable targets for antisense gene therapies (Hauptman & Glavac, 2013; Li & Chen, 2013). Low levels of sequence conservation and lower tissue expression levels compared to protein-coding mRNAs do not indicate a lack of function (Briggs et al., 2015; Mattick & Rinn, 2015; Morris & Mattick, 2014). LncRNAs frequently demonstrate tissue-specific and temporally regulated expression patterns, and have been documented to be involved in a diverse range of functional roles. LncRNAs act in transcriptional and post-transcriptional gene regulation, as molecular decoys, as regulators of mRNA processing and alternative splicing, and in epigenetic processes (Sun & Kraus, 2015). LncRNAs may also exert regulatory effects via sequence-specific mechanisms (similar to those employed by miRNAs), may act as templates to sequester miRNAs, or may even be spliced into small RNA fragments with gene regulatory actions (Liu et al., 2016). In normal physiology, lncRNAs can act as determinants of lineage progression, drivers of developmental processes and regulators of cellular stress (Gesualdo & Capaccioli, 2014; Gordiiuk, 2014; Liu et al., 2015; Tani et al., 2014). For example, a 3.7 kb lncRNA known as terminal differentiation-induced ncRNA (TINCR) is an important determinant of somatic tissue differentiation (Kretz et al., 2013). LncRNAs therefore show a diverse spectrum of biological functions and regulate gene expression changes in response to a wide range of endogenous and exogenous factors.

1.3.3 Mechanisms of action of lncRNA

1.3.3.1 LncRNA mediated gene and genome regulation

1.3.3.1.1 Regulation of histone modifications by lncRNAs

One of the most well studied and predominant functions of lncRNAs in the fine control of gene expression changes relates to epigenetic regulation of target genes. Epigenetics is defined as the study of changes caused by modification of gene expression rather than the base genetic code itself (Cao, 2014). Generally, this mechanism of regulation results in the repression of gene transcription of entire loci, via histone and DNA chemical modifications or, alternatively, precise interactions with chromatin remodelling protein complexes (Felden & Paillard, 2017; Qin & Xu, 2015; Wen-Jun & Yi-Lei, 2015). LncRNAs typically achieve this by forming complexes with RNA-binding protein partners, termed ribonucleoproteins (RNPs), and forming RNA:DNA duplexes in proximity to target gene promoters (Cao, 2014; Marchese & Huarte, 2014; Ramalho-Carvalho et al., 2016). One of the most common lncRNA-protein interactions, and functional mechanisms of repression, is the association of lncRNAs with the polycomb repressive complex 1 and 2 (PRC1 and PRC2) family of repressive multi-protein complexes (Glazko & Rogozin, 2012). Large numbers of lncRNAs have been experimentally shown to specifically co-purify with PRC2, suggesting that this may be an important cellular mechanism for regulating gene expression (Glazko et al., 2012). For example, a well characterised and abundantly expressed lncRNA, homeobox transcript antisense RNA (HOTAIR), has been associated with PRC2 occupancy and subsequent repression of genes in trans (Portoso et al., 2017; Yang et al., 2016; Yu & Li, 2015). HOTAIR, a 2.2-kb antisense intergenic lncRNA, acts as a molecular scaffold to direct two histone modifiers, enhancer of zeste homolog 2 (EZH2; trimethylation of H3K27) and lysine-specific histone demethylase 1A (LSD1; demethylation of H3K4), to repress target genes in trans (Hajjari & Salavaty, 2015; Y. Liu et al., 2016; Meredith et al., 2016; Oh et al., 2016; Portoso et al., 2017; Wu et al., 2015; Yang et al., 2016).

Many other lncRNAs, including the cardiac development specific Braveheart (Bvht) and fetal-lethal developmental regulatory RNA (Fendrr), have been reported to exert their control over target genes through the PRC2 complex (Grote & Herrmann, 2013; Hou et al., 2017; Klattenhoff et al., 2013). Bvht, a core upstream activator of mesoderm posterior 1 (MesP1), recruits a component of the polycomb protein, suppressor of Zeste 12 (SUZ12), to mediate epigenetically directed control over cardiomyocytes during mammalian development (Hou et al., 2017; Klattenhoff et al., 2013; Xue et al., 2016). The tissue-specific lncRNA, Fendrr, is expressed in the lateral mesoderm, where it binds to both PRC2 and Trithorax-group/mixed-lineage leukemia (TrxG/MLL) complexes, which ultimately direct cardiac mesoderm differentiation (Grote & Herrmann, 2013). Given that Fendrr acts to promote both PRC2 occupancy to induce H3K27 methylation (repressing mark) and H3K4 methylation (opposing active mark) through its direct interaction with WD repeat-containing protein 5 (WDR5), it has been suggested that this lncRNA may act to adjust the expression of poised genes, or of those involved in developmental processes which require more precise control (Grote & Herrmann, 2013; Grote et al., 2013).

1.3.3.1.2 Modulation of DNA methylation by lncRNAs

A large proportion of lncRNAs promote repressive marks through histone methyltransferases (of lysine or arginine residues); however, some regulate the transfer of methyl groups to cytosines via DNA methyltransferases (Lai & Shiekhattar, 2014). This type of action typically serves to oppose histone methylation and activate target genes rather than repress their expression (Gonzalez-Ramirez et al., 2014; Kornienko & Pauler, 2013; Thurman et al., 2012; Yoon et al., 2017; Zhang & Zhu, 2014). For example, lung cancer associated transcript 1 (LUCAT1) stabilises the levels of DNA methyltransferase 1 (DNMT1), and siRNA-mediated inhibition of the lncRNA induced the expression of tumour suppressor genes generally suppressed by methylation (Yoon et al., 2017). Similarly, knockdown of extra coding CCAAT/enhancer-binding protein alpha (ecCEBPA), expressed from the CEBPA locus, led to an increase of DNMT1 activity and, subsequently, increased levels of CEBPA promoter methylation (Hackanson et al., 2008).

1.3.3.1.3 Chromatin remodelling

In addition to providing covalent histone-modifying enzymatic complexes (such as PRC2) access to nucleosomal DNA, some lncRNAs have also been shown to regulate gene expression through control of ATP-dependent chromatin remodelling complexes (Han & Chang, 2015; Kawaguchi & Hirose, 2015). Specifically, this involves energy derived from adenosine triphosphate (ATP) to restructure, replace, eject or physically move nucleosomes, with lncRNAs, in part, becoming important facilitators of this process (Saxena & Carninci, 2011). An example of this type of regulation is seen in SWI/SNF (SWItch/Sucrose Non-Fermentable) Complex Antagonist Associated With Prostate Cancer 1 (SChLAP1). SChLAP1 has been shown to oppose the SWI/SNF complex, a multi-protein complex which regulates gene transcription through physical alterations of nucleosomes at gene promoters (Prensner et al., 2013). SChLAP1, unlike other lncRNAs such as HOTAIR and KCNQ1OT1 which facilitate epigenetic modifications, globally inhibits binding of hSNF5 (the SWI/SNF Related, matrix-associated, actin-dependent regulator of chromatin subfamily B member 1, encoded by SMARCB1), a crucial subunit of this complex (Prensner et al., 2013). This impairment leads to widespread depression of genomic binding and expression of target genes in prostate cancer (Prensner et al., 2013).

1.3.3.1.4 Gene regulation through lncRNA scaffold functions, and decoys for proteins

Interestingly, there is a growing body of evidence linking lncRNAs within the nuclear matrix with crucial roles as structural scaffolds and organisers of nuclear architecture (Mele & Rinn, 2016). The transcription of nuclear-enriched autosomal transcript 1 (NEAT1) has been directly correlated with the formation of discrete nuclear compartments known as paraspeckles (Hirose & Nakagawa, 2012; Nishimoto et al., 2013). While the function of paraspeckles has not been completely elucidated, the highly abundant NEAT1 transcript may function as a structural RNA, binding to proteins which are essential to the framework of these distinct nuclear domains (Nishimoto et al., 2013). XIST is one of the best studied and functionally annotated lncRNAs, and has been shown to initiate condensation of an X chromosome, rendering it inactive (forming a Barr body) (Dixon-McDougall & Brown, 2016). XIST localises to the X chromosome and is subsequently anchored into its specific location through a meshwork of architectural proteins of the nucleus via scaffold attachment factor A (SAF-A) (Helbig & Fackelmayer, 2003; Smeets et al., 2014). This appears to be one component of a unimolecular bridge, where XIST is given access to impact higher-order chromatin condensation to initiate its actions (de Graaf et al., 2004). While these studies reflect how important lncRNAs are to the spatial and temporal expression of genes, their mechanistic diversity and overall influence on nuclear architecture and temporary rearrangements remain difficult to elucidate.

1.3.3.1.5 Direct control of transcription

Several lncRNAs have been observed to bind to transcription factors and thus influence the production of transcripts and, in some cases, entire transcriptional programs to regulate cell functions (Deveson et al., 2017). In contrast to lncRNAs that mediate the control and repurposing of transcription factors to guide gene transcription, others have been observed to display enhancer-like functions (Chen & Li, 2017; Derrien & Guigo, 2011; Lai et al., 2013; Li & Notani, 2014). Termed enhancer RNAs (eRNAs) or ncRNA-activating (ncRNA-a), these lncRNAs are non-polyadenylated transcripts which are predominantly unspliced and bi-directional (Orom & Shiekhattar, 2011; Shibayama & Mhlanga, 2014). These eRNAs have a high correlation of expression with neighbouring genes, and can work in a bidirectional manner to regulate transcription (Shibayama, Fanucchi, & Mhlanga, 2017). AS1eRNA, which arises from an enhancer region downstream of the Dehydrogenase/reductase SDR family member 4 (DHRS4) locus, is required to promote the physical interaction of the enhancer region with the locus (Yang et al., 2016). AS1eRNA mediated the transcription of DHRS4 and effectively combined with RNA polymerase II and the p300/CBP transcriptional activators (Yang et al., 2016). Direct control of gene expression can also occur through physical obstruction of the complexes which mediate transcription (Hobson & Svejstrup, 2012). Head-to-head collisions of RNA polymerases can displace the entire process of transcription (Hobson et al., 2012). For example, the imprinted lncRNA Antisense Of IGF2R Non-Protein-coding RNA (Airn) is an established cis-acting lncRNA which overlaps the Insulin-like growth factor 2 receptor (igf2r) promoter, interfering with RNA polymerase II recruitment (Latos et al., 2012). Airn transcriptional overlap serves as a prerequisite and necessary silencer of mouse igf2r in host embryonic stem cells (ESC) (Latos et al., 2012).

1.3.3.2 Function of lncRNA at the post-transcriptional level

1.3.3.2.1 miRNA sponge lncRNAs

Aside from sequestering epigenetic modification machinery and RNA-dependent effectors, lncRNAs can also act as decoys to attenuate miRNA levels inside the cell (Dey & Dutta, 2014; Du et al., 2016; Kumar & Nazir, 2016; Pan & Xiong, 2015). Under the competing endogenous RNA (ceRNA) theory, lncRNAs can act as sequence-specific templates for complementary miRNA binding, thus reducing the pool of small RNAs available to regulate gene expression (Du et al., 2016; Paci et al., 2014). Examples of competitive small RNA binding via endogenous target mimicry (alternatively known as “sponges”) include highly upregulated in liver cancer (HULC) (J. Wang et al., 2010), induced by Phosphate Starvation 1 (IPS1) (Pacak et al., 2016), cerebellar degeneration-related protein 1 antisense (Cdr1as) (Tang et al., 2017; Xu et al., 2017; Yu et al., 2016) and phosphatase and tensin homolog pseudogene 1 (PTENP1) (Chen et al., 2015; Li, Gao et al., 2017). HULC is a two-exon transcript which is elevated in hepatocellular carcinoma (HCC), making it useful as a prognostic marker of survival in resected HCC specimens (Liu et al., 2012; Lu et al., 2015; Sonohara et al., 2017; Wang et al., 2010; Wang et al., 2017). Expression of HULC is promoted in HCC via activation of the protein kinase A (PKA) signal transduction pathway, where HULC acts to sequester hsa-miR-372 and thus reduces the translational repression of hsa-miR-372 target genes (Wang et al., 2010).

1.3.3.2.2 lncRNAs mediating transcription processing of pre-mRNA

There is strong evidence supporting the role of lncRNAs in transcriptional processing of mRNAs, by recruiting or impairing transcription factors, co-factors or RNA polymerase II at the promoters of target genes (Atianand & Fitzgerald, 2014; Zhang & Zhu, 2014). The nucleus-restricted lncRNA MALAT1, which is constitutively transcribed in many cells and frequently upregulated in multiple cancer types, has been implicated as a direct regulator of alternative splicing (Hu et al., 2016; Malakar et al., 2017). The 5’-end of MALAT1 contains numerous binding sites for multiple splicing factors, leading to changes in the phosphorylation and distribution of serine/arginine-rich splicing factors (SRSFs) (Hu et al., 2016). SRSFs are a conserved family of proteins central to differential gene expression, partly through constitutive splicing, mRNA export and nonsense-mediated decay (Hu et al., 2016).

1.3.3.3 lncRNA regulation of mRNA stability and translation

It is now clear that some lncRNAs also directly regulate mRNA turnover and the translation of protein-coding genes (Khorkova et al., 2015; Sun et al., 2017). For example, FIRRE physically interacts with heterogeneous nuclear ribonucleoprotein U (hnRNPU) to direct the stability of inflammatory genes through their AU-rich elements following lipopolysaccharide exposure (Lu et al., 2017). Similarly, the lncRNA LUCAT1 promotes oesophageal squamous cell carcinoma progression by inhibiting tumour suppressors through enhancing the stability of DNMT1 and subsequent DNA methylation events (Yoon et al., 2017).


Figure 1.3. The mechanisms of action of lncRNAs. LncRNAs can act as (A) scaffolds for protein remodelling complexes, (B) decoys that bind transcription factors and prevent genomic binding, (C) guides that recruit proteins to certain genomic locations, (D) mediators of chromatin interactions such as those between enhancers and promoters, (E) sponges that bind miRNAs and prevent their interaction with target mRNAs, and (F) regulators of mRNA stability, alternative splicing and mRNA degradation. (Adapted from Hu et al., 2012 and Mercer et al., 2009).

1.3.4 Long noncoding RNAs in cancer

It is becoming increasingly appreciated that lncRNAs play a role in cancer and that dysregulation of gene expression is a critical step in cancer progression (Gutschner & Diederichs, 2012). LncRNAs that are aberrantly expressed in human cancers may be useful as biomarkers and/or predictors of patient outcomes, including overall survival (OS) and disease-free survival (DFS) (Mehra et al., 2015; Misawa & Fujimura, et al., 2017; Peng & Fan, 2015; Wu et al., 2015; Zhang et al., 2015; Zhu, Liu et al., 2015; Zou et al., 2017). Their mechanisms and functions within cancer cells are often diverse (Angrand et al., 2015; Lin & Yang, 2017; Sun & Kraus, 2015). Generally, they repress the transcription of target genes and are often upregulated in the earliest stages of cancer to target genomic regions (cis-regulation), or regions outside their own genomic loci (trans-regulation) (Du et al., 2016; Vance et al., 2014; Villegas & Zaphiropoulos, 2015). Some lncRNAs have been shown to promote tumour growth and metastasis, while others appear to play a tumour suppressive role (Gutschner & Diederichs, 2012; Ishibashi et al., 2013) (Summarised in Figure 1.4).

1.3.4.1 LncRNAs functioning as oncogenes

Many lncRNAs are now considered bona fide oncogenes, cancer initiators and drivers of tumourigenesis (Adams et al., 2017; Hu et al., 2017; Slaby et al., 2017; Xu & Chen et al., 2017). The lincRNA metastasis-associated lung adenocarcinoma transcript 1/noncoding nuclear enriched abundant transcript 2 (MALAT1/NEAT2), for example, is highly expressed in many different cancer types, including non-small cell lung cancer (NSCLC), breast, prostate, thyroid, colorectal, hepatocellular carcinoma (HCC) and acute monocytic leukemia (Chen et al., 2017; Fang et al., 2016; Huang et al., 2016; Huang et al., 2017; Zhang et al., 2017). High MALAT1 expression has been associated with increased metastasis, proliferation, angiogenesis and a poor prognosis, particularly in NSCLC (He et al., 2014; Kogo et al., 2011; Li et al., 2013; Nie et al., 2013; Pavlov et al., 2013; Rak et al., 2009; Sana et al., 2012; Tani et al., 2012). Elevated levels of MALAT1 have also been shown to regulate epithelial to mesenchymal plasticity (EMP) in oral squamous cell carcinomas (OSCC). Effective inhibition of the lncRNA with specific siRNAs reduced the levels of EMP markers in OSCC cell lines, and decreased invasion and migration of the A549 NSCLC and CaSki human cervical cancer cell lines (Zhou et al., 2015; Gittelman et al., 2013; Mattick, 2012; Tombal et al., 2013).

The well-characterised oncogenic lncRNA homeobox (HOX) transcript antisense RNA (HOTAIR) is a 2.2-kb antisense lincRNA transcribed from the HOX C locus (Rinn et al., 2007). The HOX genes are a family of well-conserved transcription factors that encode homeodomain protein products to regulate morphogenesis in animals, fungi and plant species (Ishibashi et al., 2013). This lincRNA, like MALAT1, is overexpressed in breast, pancreatic, hepatocellular and colorectal cancer and is associated with increased metastasis, tumour growth, and adverse clinical outcomes, including recurrence and a poor prognosis, in many different cancer types (Avazpour et al., 2017; Jiang et al., 2015; Miao et al., 2016; Oh et al., 2016; Qiu et al., 2015; Wang et al., 2015; Xing et al., 2015; Yang et al., 2016; Zhong et al., 2018). Its capacity to increase breast cancer growth was confirmed recently, with HOTAIR shown to be a key regulator of proliferation, colony formation and self-renewal in breast cancer stem-like cells (CSCs) (Deng, Yang, et al., 2017).

The antisense noncoding RNA in the INK4 locus (ANRIL or CDKN2B-AS) is a large natural antisense lncRNA gene, spanning approximately 126.3 kb of genomic DNA and traversing the tumour suppressor gene p15/CDKN2B (Yap et al., 2010). ANRIL works directly through epigenetic transcriptional repression of chromobox 7 (CBX7) and ultimately controls cell senescence (Yap et al., 2010). This antisense gene is highly expressed in prostatic tumours, HCC, triple-negative breast cancer (TNBC) and leukaemia (Zhao et al., 2017; Hua et al., 2015; Ma, Li et al., 2017). As with MALAT1, suppression of ANRIL levels impedes HCC and breast cancer xenograft growth in vivo (Hua et al., 2015). ANRIL is highly expressed in breast cancer and lung cancer, in which its expression levels closely correlate with poor prognosis in TNBC and with higher tumour, node, metastasis (TNM) staging in NSCLC (Lin et al., 2015; Xu et al., 2017). This was recently confirmed in a large lncRNA-centric parametric statistical evaluation of 1104 genomes from 23 different cancers available through the Cancer Genome Project at the Sanger Institute and The Cancer Genome Atlas (TCGA) (Lanzos et al., 2017). The authors identified SAMMSON as a potential critical tumour driver of stomach adenocarcinomas (Lanzos et al., 2017). This analysis also confirmed MALAT1 to be an important lncRNA driver in pan-cancer datasets, highlighting the important role individual genes like MALAT1 and other lncRNAs play in general tumourigenesis (Lanzos et al., 2017).

1.3.4.2 Tumour suppressing lncRNAs

While many lncRNAs activate oncogene signalling, or function to repress tumour suppressor genes, some serve to activate or repress genes leading to tumour suppression (Gutschner & Diederichs, 2012). The tumour-suppressive lncRNA phosphatase and tensin homolog pseudogene 1 (PTENP1) is a pseudogene of PTEN. PTENP1 has been implicated in breast, hepatocellular and bladder cancers (Chen et al., 2015; Li et al., 2017), and is associated with poor patient survival when found at decreased levels in head and neck squamous cell carcinoma (HNSCC) (Liu et al., 2017). This lncRNA is expressed in a wide variety of human tissues and acts as a ceRNA; its 3’ untranslated region (UTR) acts as a decoy for miRNAs which attenuate the levels of PTEN, including hsa-miR-193a-3p in hepatocellular cancer (Qian et al., 2017), hsa-miR-19b in breast cancer (Chen et al., 2017; Li et al., 2017; Shi & Su, 2017) and hsa-miR-21 in oral squamous cell carcinomas (Gao et al., 2017).

The lncRNA lincRNA-p21 and the maternally expressed, imprinted lncRNA maternally expressed 3 (MEG3) have both been directly implicated in activating the p53 signalling pathway (Huarte et al., 2010). In oesophageal squamous cell carcinoma (ESCC), gastric cancer and neuroblastoma cells, MEG3 overexpression inhibited cell proliferation, promoted apoptosis and suppressed metastasis through an accumulation of p53 and activation of its target genes (Lv & Zhang, 2016; Tang et al., 2016; Wei & Wang, 2017). In ESCC this was demonstrated to be via repression of the negative regulator of p53, mouse double minute 2 homolog (MDM2) (Iyer & Agarwal, 2017; Tang et al., 2016; Wei & Wang, 2017). Inhibition of DNMT1 with the DNA methylation inhibitor 5-Aza-2’-deoxycytidine (5-AzadC) reversed the aberrant hypermethylation of MEG3 in glioma tissues, preventing the loss of MEG3 and leading to activation of the p53 pathway and decreased cell proliferation (Li et al., 2016). In murine lung cells, lincRNA-p21 expression is elevated upon activation of p53, where it reduces the expression of genes normally repressed by p53 (Huarte et al., 2010). This occurs, in part, via a direct interaction with the promoter-associating protein heterogeneous nuclear ribonucleoprotein K (hnRNP-K), effectively ablating p53-regulated gene expression (Huarte et al., 2010). This was shown to be important for p53-dependent apoptosis induction, but not cell cycle arrest, and contributed to the inhibition of human prostate cancer cells (Wang et al., 2017). Interestingly, lincRNA-p21 also works in cis to activate the expression of its neighbouring gene, p21, as a coactivator for p53-mediated p21 expression (Dimitrova et al., 2014).

Figure 1.4. lncRNAs in different human cancers. Colour indicates whether a lncRNA is upregulated (red) or downregulated (blue) compared to normal tissue (Adapted from Vitiello & Tuccoli, 2015).

1.3.5 LncRNAs in prostate cancer

1.3.5.1 Incidence, genomic landscape and treatment of prostate cancer

Prostate cancer, or malignancy of the prostate gland, accounts for 6.6% of cancer-related deaths worldwide (Ferlay et al., 2015). Prostate cancer remains prevalent in the Australian population, with approximately 17,000–18,000 new diagnoses expected to be reported in the coming year (Welfare, 2017). It is the second leading cause of cancer-related death, after lung cancer, with lethal prostate cancer expected to cause 3,500 male deaths (12.7% of all male deaths from cancer in 2018) nationwide (Welfare, 2017). Prostate malignancies are predominantly believed to be derived from basal and luminal glandular epithelial cells (Litwin & Tan, 2017).

Prostate cancer, unlike other cancer types, is not commonly attributed to a large mutational burden (Attard et al., 2016; Barbieri et al., 2012; Cancer Genome Atlas Research, 2015). Mutations in oncogenes which typically characterise the landscape of tumours such as colon cancer, melanoma and lung carcinomas, including Kirsten rat sarcoma (KRAS), BRAF, epidermal growth factor receptor (EGFR) and phosphatidylinositol 4,5-bisphosphate 3-kinase catalytic subunit alpha isoform (PIK3CA), are uncommon (Barbieri et al., 2012; Rycaj & Tang, 2017). Despite the low mutation rates in prostate cancer, copy-number alterations, amplification of oncogenes and deletions of tumour suppressors characterise much of its genomic landscape, including deletions in PTEN, amplifications in cMYC and gene rearrangements such as the TMPRSS2:ERG fusion gene (Barbieri et al., 2012; Grasso et al., 2012).

Treatment of localised prostate cancer reflects the presentation and clinicopathological features of the cancer, including prostate-specific antigen (PSA) levels, tumour stage and Gleason grade and score (Bill-Axelson et al., 2014). Active surveillance is the most common outcome for men who show features of low-risk disease (i.e., Gleason score <6, PSA <20 ng/mL). Men at higher risk (i.e., Gleason score >7, PSA >20 ng/mL), and with local and/or advanced spread of the prostate tumour, will typically undergo radical prostatectomy (removal of the prostate) or radiotherapy in combination with long-term androgen deprivation therapy (ADT) (Bill-Axelson et al., 2014). As most primary prostate tumours retain histological features of the prostate, including androgen receptor (AR) expression levels, ADT is useful for diminishing circulating testosterone (to a low concentration of <50 ng/dL) and ablating signalling via the androgen receptor, ultimately leading to a reduction in growth of the prostate tumour (Attard et al., 2016; Beer et al., 2014; Bill-Axelson et al., 2014; Ferraldeschi et al., 2015; Litwin & Tan, 2017). During the course of this treatment, which shows excellent efficacy for ~12-36 months, men receiving ADT will inevitably become hormone refractory and develop lethal metastatic castration-resistant prostate cancer (mCRPC) (Beer et al., 2014; Litwin & Tan, 2017; Sumanasuriya & De Bono, 2017; Yin & Hartmann, 2013). This is characterised phenotypically by an androgen-independent activation of the AR and downstream AR-driven pathways, and clinically by hormone refractory tumour cells and a renewed rise in PSA and AR levels (Beer et al., 2014; Bill-Axelson et al., 2014; Li et al., 2017; Litwin & Tan, 2017). At this point, men who present with mCRPC are administered taxane-based chemotherapies or radium-based radiotherapies to treat the inevitably fatal progression of the disease (Beer et al., 2014; Litwin & Tan, 2017).

To date, a number of lncRNAs have been associated with prostate cancer growth and cancer hallmarks, progression and prognostic outcomes (Smolle & Bauernhofer et al., 2017). Numerous studies have associated the aberrant expression of lncRNAs in clinical manifestations of prostatic disease, including benign prostatic hyperplasia, malignant prostate cancer, and during the development to mCRPC (Misawa et al., 2017; Mitobe & Takayama et al., 2018; Smolle et al., 2017). The function of some prostate-associated oncogenic lncRNAs has also been associated with resistance to current therapies, such as ADT, and first-line chemotherapeutics, such as docetaxel and cisplatin (Misawa, Takayama et al., 2017; Ozgur et al., 2017; Stone, 2017; Wang et al., 2017).

1.3.5.2 LncRNA and the hallmarks of prostate cancer

In 2000, Hanahan and Weinberg proposed a number of hallmarks of cancer that are common to all subtypes of cancer and which constitute necessary steps for tumour growth (Hanahan & Weinberg, 2000) (Figure 1.5). These hallmarks were updated and extended a decade after their conception (Hanahan & Weinberg, 2011). They propose that self-sufficiency in growth signals, insensitivity to anti-growth signals, evasion of apoptosis, limitless replicative potential, sustained angiogenesis, and tissue invasion and metastasis are required for cancer development (Hanahan & Weinberg, 2000; Hanahan & Weinberg, 2011). In prostate cancer (and many other tumour types), lncRNAs have been shown to be vital contributors, and often drivers, within each of the intrinsic hallmarks of cancer development (Gutschner & Diederichs, 2012). Some of these will be discussed in the context of critically evaluated lncRNAs specifically involved in prostate cancer.

1.3.5.2.1 LncRNAs involved in cell proliferation and apoptosis

The ability to sustain proliferative signals provides a neoplastic cell with opportunities for cell division which occurs independently of exogenous growth factors, allowing for uncontrolled growth (Hanahan & Weinberg, 2011). Older pathological studies of organ-confined prostate cancer from radical prostatectomies have shown that proliferation (defined by positive Ki-67 immunohistochemical staining) is useful in distinguishing between more advanced cancers (Gleason score ≥6) and low-risk indolent tumours (Gleason score ≤4) (Brown, Sauvageot et al., 1996). A gain of proliferative signalling is therefore important for the transitional growth of these tumours in order to break the confines of the prostate. Many lncRNAs have been implicated in increased cell proliferation through a number of the cellular mechanisms outlined above, including acting as ceRNAs and modifying cell cycle-associated locations of the genome through histone modifications and DNA methylation (Gutschner & Diederichs, 2012). To increase proliferation, small nucleolar RNA host gene 1 (SNHG1) acts as a ceRNA to negatively regulate the levels of hsa-miR-199a-3p, a miRNA which targets cyclin-dependent kinase 7 (CDK7) (J. Li et al., 2017). By interfering with hsa-miR-199a-3p and reducing the translational repression of CDK7, this lncRNA enhances proliferation in prostate cancer, where it is expressed well above its baseline levels in normal prostate (Li et al., 2017).

It is well understood that many lncRNAs are actively transcribed during phases of the cell cycle (Hung et al., 2011). Indeed, one study demonstrated that ~216 putative lncRNAs are derived from the promoter regions of cell cycle genes, including the lncRNA p21 associated ncRNA DNA damage activated (PANDA), which derives from the p21 (CDKN1A) promoter (Hung et al., 2011). While PANDA has not been implicated in prostate cancer, other lncRNAs have been described as cell cycle gene regulators and intrinsic responders to genotoxic stress, including DNA damage (Ozgur et al., 2013). As an example, the lncRNA Differentiation Antagonizing Non-Protein-coding RNA (DANCR) is a direct target of the multifunctional nuclear phosphoprotein and proto-oncogene MYC, and was established as a functional suppressor of the cyclin-dependent kinase inhibitor p21 (alternatively, p21WAF1/Cip1) (Lu et al., 2018). DANCR is upregulated in cancer and is broadly expressed in multiple tumour types (Lu et al., 2018). Importantly, DANCR expression levels correlate with p21 and MYC transcript levels in prostate tumours from the TCGA (TCGA-PRAD) (Lu et al., 2018). In vitro studies using the PC3 and DU145 androgen-independent cell lines demonstrated that sufficient knockdown of DANCR expression with siRNA resulted in decreased cell proliferation, significant decreases in S-phase cell cycle entry and an upregulation of the tumour suppressor protein p21 (Lu et al., 2018). The precise mechanism by which DANCR mediates p21-dependent inhibition of proliferation is currently unknown. Many lncRNAs are important mediators of proliferation in prostate tumour cells, with some facilitating resistance to cell death following stress, an important hallmark in the regulation of cancer development.

1.3.5.2.3 Cell migration, invasion, and metastasis

The ability to invade the local milieu is an important capability a tumour requires in order to form distant metastases (Rycaj & Tang, 2017). In prostate cancer, metastatic disease remains incurable and difficult to treat (Rycaj & Tang, 2017). LncRNAs have received considerable attention for their role in regulating cell invasion and coordinated invasion processes (Dhamija & Diederichs, 2016; Gutschner & Diederichs, 2012; Shen et al., 2015; Smolle et al., 2017; Zhong et al., 2018). One of the key lncRNAs involved in metastasis is MALAT1, whose expression levels are significantly elevated during the transition from hormone-sensitive to castration-resistant prostate cancer (Ren, Liu, et al., 2013). Furthermore, MALAT1 expression is highly enriched in bone marrow biopsy samples taken from mCRPC patients (Sowalsky et al., 2015; Wang et al., 2015). This lncRNA regulates EZH2 occupancy of target genes, which serves to promote migration and invasion in vitro (Wang et al., 2015). Knockdown of either EZH2 or MALAT1 abolished transwell invasion of PC3 and C4-2 prostate cancer cells through Matrigel (Wang et al., 2015). This response could be rescued with restored expression of siRNA-resistant EZH2 (Wang et al., 2015). The increases in migration and invasion were due, in part, to MALAT1 mediating the expression of pro-invasive EZH2-activated oncogenes, including CKS2 and TMEM48, which were shown to be repressed in vitro when MALAT1 levels were knocked down (Wang et al., 2015). Another functional lncRNA in prostate cancer, NEAT1, is an important promoter of invasion in vitro (Chakravarty et al., 2014). Complementary studies using both stable overexpression models of NEAT1 and shRNA-mediated specific NEAT1 knockdown in the VCaP prostate cancer cell line showed that this lncRNA enhances prostate cancer cell invasion, and that invasion could be reduced when NEAT1 levels were knocked down (Chakravarty et al., 2014). While the authors used mouse models to validate their proliferation studies, the effect of NEAT1 on in vivo invasion remains undetermined (Chakravarty et al., 2014).

1.3.5.2.4 LncRNAs involved in the process of angiogenesis and tumour hypoxia

LncRNAs also play a major role in neovascularisation and angiogenesis in cancer, which is an essential step in promoting and sustaining tumour growth (Gutschner & Diederichs, 2012); however, few studies have described their role in these processes in prostate cancer. Most notably, the prostate-specific lncRNA prostate cancer antigen 3 (PCA3; or DD3) is a known partial modulator of EMP in prostate cancer cells and regulates a number of genes involved in migration and invasion (Lemos et al., 2016). Repression of PCA3, via siRNA or shRNA targeting, reduced the expression of a number of important genes involved in the process of angiogenesis, including vascular endothelial growth factor A (VEGFA) and fibrillin (IFNB1), both major growth factors and promoters of neovascularisation (Lemos et al., 2016). Development of hypoxia in the prostate tumour is considered to be an early event in prostate cancer and is associated with high Gleason scores and adverse prognostic parameters (Choudhry & McIntyre, 2016; Khorshidi et al., 2016). Recently, it has been established that increased hypoxic subpopulations characterise two pro-metastatic prostate carcinoma sub-pathologies: intraductal carcinoma (IDC) and cribriform architecture (CA) (Chua et al., 2017). Moreover, the only transcript upregulated in IDC/CA+ versus IDC/CA- was SChLAP1, a lncRNA associated with unfavourable prognostic outcomes (Chua et al., 2017). Other than SChLAP1, lncRNAs such as XIST and HOTAIR increase angiogenesis in glioma, while MALAT1 has been shown to promote vascularisation in neuroblastoma (Cheng et al., 2017; Ma et al., 2017; Tee et al., 2016). Furthermore, it has been recently demonstrated that the lncRNA focally amplified lncRNA on chromosome 1 (FALEC), a promoter of in vivo tumour growth and invasion in prostate cancer cell lines, is a potential hypoxia-induced lncRNA (R. Zhao et al., 2017). However, the contribution of FALEC, and of other lncRNAs, as regulators of angiogenesis requires further investigation.

Figure 1.5. Summary of key lncRNAs involved in the hallmarks of prostate cancer. This illustration encompasses the 10 hallmark capabilities for tumour progression proposed by Hanahan and Weinberg (2011). In prostate cancer, a number of lncRNAs regulate these hallmarks and can be considered as oncogenes (Adapted from Hanahan & Weinberg, 2011 and Gutschner & Diederichs, 2012).


1.3.5.3 LncRNAs involved in AR signalling and castrate-resistant prostate cancer

The AR and its downstream signalling pathways are vital to prostate cancer growth, differentiation and advanced metastatic disease (Misawa, Takayama et al., 2017; Sumanasuriya & De Bono, 2017). Given the importance of the downstream AR signalling pathways in prostate growth and tumour progression, androgen blockade and AR antagonism have become a central approach for locally advanced prostate cancer (Attard et al., 2016; Litwin & Tan, 2017). While androgen ablation through surgical or chemical castration has been a mainstay therapy since its first description in the 1940s, it frequently fails after its initial use and can lead to an aggressive, often lethal form of the disease, castration-resistant prostate cancer (CRPC) (Attard et al., 2016; Beer et al., 2014). CRPC tumours can show persistent androgen signalling in the absence of exogenous androgens due to genomic alterations, including amplifications of the AR, as well as increased expression of co-activators of the AR, increased stability and intra-tumoural production of DHT (Beer et al., 2014; Litwin & Tan, 2017; Yin et al., 2013). The dysregulated expression of many lncRNAs may aid in mediating some of the molecular pathways leading to CRPC, including interactions with the AR and its alternative splicing and expression, as well as modulating accessory growth signals to support CRPC tumour growth (Beer et al., 2014; Misawa, Takayama, & Inoue, 2017; Sumanasuriya & De Bono, 2017). For example, SChLAP1 is highly expressed in ~25% of prostate cancers and correlates with lethal advanced cancer and CRPC (Cimadamore et al., 2017; Prensner et al., 2013). SChLAP1 expression, measured by in situ hybridisation (ISH), is a useful independent predictor of biochemical recurrence (BCR) after radical prostatectomy. More recently, out of 11 lncRNAs specific for prostate cancer, SChLAP1 showed the highest association and independent prediction of biochemical failure (BCF)-overall survival (OS) (W. Ma et al., 2017; Mehra et al., 2014). Functionally, SChLAP1 showed no association with the AR and had no regulatory action; however, it serves to counteract the effects of the SWI/SNF chromatin remodelling complex, thereby promoting prostate cancer metastasis (Cimadamore et al., 2017; Prensner et al., 2013).

Androgens can also induce the expression of lncRNAs involved in CRPC progression (Misawa, Takayama, Fujimura, et al., 2017; Mitobe et al., 2018; Zhang et al., 2017). Prostate, ovary, testis expressed protein family member-F gene-antisense transcript 1 (POTEF-AS1), the lincRNA LINC01138 and suppressor of cytokine signaling 2-antisense transcript 1 (SOCS2-AS1) are considered part of a family of androgen-inducible lncRNAs and are upregulated in CRPC (Misawa et al., 2016; Misawa et al., 2017; Wan et al., 2016). Moreover, SOCS2-AS1 directly associates with the AR to promote AR signalling (Misawa et al., 2016). Knockdown and overexpression models of these lncRNAs lead to apoptosis and proliferation of CRPC cell lines, respectively, warranting future investigation into their actions (Mitobe et al., 2018). Another androgen-inducible NAT lncRNA, C-terminal binding protein 1 antisense RNA (CTBP1-AS), suppresses the expression of its sense protein-coding gene, CTBP1, a co-repressor of the AR, to enhance AR signalling (Sung & Cheung, 2013; Takayama et al., 2013). The tumour suppressor lncRNA GAS5 represses the function of the AR by forming a secondary structure that blocks its DNA-binding site, reducing androgen-mediated transcription and causing cell death (Sung & Cheung, 2013; Takayama et al., 2013). NEAT1 levels are increased by treatment with the antiandrogens enzalutamide and bicalutamide, and further increased by the oestrogen steroid hormone oestradiol (E2) (Chakravarty et al., 2014). NEAT1 acts as a downstream mediator of oestrogen receptor alpha (ERα) signalling and works to adapt prostate cancer cells to a lack of androgens, bypassing the need for AR activation by utilising alternative steroid receptors in prostate cancer (Chakravarty et al., 2014). Moreover, NEAT1 levels are an independent prognostic indicator of early biochemical recurrence and advanced prostate cancer (Chakravarty et al., 2014).

The prostate-specific lncRNAs prostate cancer noncoding RNA-1 (PRNCR1; alternatively PCAT8) and prostate cancer gene expression marker 1 (PCGEM1; alternatively known as PCAT9) were initially identified as transcriptional co-regulators of the AR (Srikantan et al., 2000; Yang et al., 2013). Both were shown to be elevated in prostate cancer, leading to enhanced ligand-dependent and ligand-independent AR-mediated gene networks and increasing proliferation in CRPC cell lines (Yang et al., 2013). Subsequently, siRNA-mediated knockdown of PRNCR1 levels in C4-2 cells was shown to inhibit the expression of AR protein, implying direct transcriptional or translational regulation of the AR (Wang, Shi, et al., 2013). However, analysis of a large cohort of high-risk prostate cancer patients demonstrated that PCGEM1 and PRNCR1 do not directly interact with the AR, and that neither gene is a prognostic factor in prostate cancer, refuting their roles in CRPC (Prensner, Sahu, et al., 2014). PRNCR1 was shown to be at the limits of detection in AR+ prostate cancer xenograft models, while PCGEM1 was found to be regulated by the AR in vivo (Parolia et al., 2015). Induction of PCGEM1 by androgen deprivation in vitro promotes splicing of the most common AR splice variant, AR splicing variant 7 (AR-v7, alternatively known as AR3) (Zhang et al., 2016). Interestingly, AR-v7, like other AR splice variants, has been associated with resistance to the ATTs enzalutamide and abiraterone acetate (Li et al., 2017; Peacock, Burnstein et al., 2012). This regulatory role was confirmed in CRPC, where it was found that PCGEM1 is regulated by Non-POU domain-containing octamer-binding protein (p54/NRB) in response to ADT, and thus could contribute to castrate-resistance by upregulating AR-v7 (Ho et al., 2016). However, given the lack of consensus in the literature, further experimental studies are required to determine the roles of these lncRNAs in AR signalling.
MALAT1 is upregulated in mCRPC samples and, like PCGEM1, contributes to the splicing of the AR into the AR-v7 splice variant (Wang et al., 2017). This occurs via interactions with the serine/arginine rich splicing factor 1 (SF2) complex, which has increased activity in enzalutamide-resistant C4-2 cells compared to treatment-naïve cells (Stone, 2017; Wang et al., 2017). Both AR-v7 and MALAT1 levels are increased in in vitro generated enzalutamide-resistant prostate cancer cells and in the circulating tumour cells (CTCs) of men post ADT-enzalutamide treatment (Wang et al., 2017). Increased MALAT1 expression induces the levels of AR-v7, thus making prostate cancer cell lines more resistant to enzalutamide (Wang et al., 2017). Furthermore, this effect was blocked and the cells could be re-sensitised to enzalutamide treatment using MALAT1-siRNA (Wang et al., 2017).

1.3.5.4 LncRNAs that confer resistance to chemotherapeutics

While abiraterone acetate and enzalutamide are the most commonly implemented first-line therapies for metastatic CRPC, chemotherapy is still used for men who fail ATT (Litwin & Tan, 2017; Teply & Hauke, 2016). Alkylating agents such as cisplatin, the cytotoxic antibiotic doxorubicin and the taxane-based chemotherapies, including both docetaxel (Taxotere) and paclitaxel (Taxol), have improved the overall survival and quality of life for patients (Teply & Hauke, 2016). Despite these advances, the benefits are commonly limited by the development of drug resistance (Teply & Hauke, 2016). Given the diverse regulatory roles of lncRNAs in cellular pathways, they could contribute to chemoresistance. Indeed, in breast cancer drug studies, a number of lncRNAs whose expression is altered by chemotherapy have been shown to play roles in drug resistance, including lncRNA-ATB, GAS5, H19, HOTAIR, UCA1 and Adriamycin Resistance Associated (ARA) (Jiang et al., 2014; Nagini, 2017; Pickard & Williams, 2014; Xue et al., 2016).

In prostate cancer, notable lncRNAs include UCA1, which is upregulated in docetaxel-resistant 22Rv1 prostate cancer cells (Wang et al., 2016). In vitro, siRNA-mediated knockdown of UCA1 expression significantly reduced the IC50 of docetaxel following a 72-hour treatment (Wang et al., 2016). UCA1, while typically reported as a tumour suppressor, is upregulated in prostate cancer and imparts resistance by acting as a ceRNA to sponge miR-204, a miRNA which would normally target the chemotherapy-associated gene Sirtuin-1 (Sirt1) (Wang et al., 2016). Conversely, abnormally low levels of the tumour suppressor lncRNA GAS5 may reduce the effectiveness of chemotherapeutic agents in prostate cancer (Pickard et al., 2013). Ectopic overexpression of GAS5 sensitised prostate cancer cells to cell death following treatment with various cytotoxic agents, including docetaxel, the cis-imidazoline analog nutlin-3a, and mitoxantrone (an anthracenedione antineoplastic agent) (Pickard et al., 2013).

PCGEM1 participates in regulating pathways that reduce the apoptotic response, and thus provides resistance to the widely used chemotherapy drug doxorubicin in prostate cancer cells (Fu, Ravindranath, Tran, Petrovics, & Srivastava, 2006). In addition to its growth-promoting role in prostate cancer, overexpression of PCGEM1 in LNCaP cells (LNCaP-PCGEM1) attenuates doxorubicin-induced expression of p53 and its major target p21Waf1/Cip1 (p21) (Fu et al., 2006). As a major inhibitor of the cyclin-dependent kinase (CDK) complexes CDK4/6-cyclin D and CDK2-cyclin E, nuclear p21 inhibits cell cycle progression and arrests cells at the G1/S and G2/M transitions. The nuclear repression of p21 is thought to decrease the apoptotic response to cytotoxic agents (Karimian et al., 2016; Liu et al., 2013). In vitro cell line models showed that overexpression of PCGEM1 reduced the levels of p21, which remained reduced in the presence of doxorubicin, attenuating apoptosis and increasing the survival of LNCaP cells (Fu et al., 2006). Specifically, cleaved caspase 7, cleaved poly(ADP-ribose) polymerase (PARP) and the levels of apoptotic cells measured by fluorescence-activated cell sorting (FACS) were reduced in LNCaP-PCGEM1 cells exposed to other chemotherapy agents (staurosporine, sodium selenite, and etoposide), indicating that this lncRNA promotes cell survival (Fu et al., 2006).

1.3.5.5 LncRNA as prostate cancer diagnostics

It is predicted that lncRNAs may provide a resource as novel diagnostic and prognostic indicators for prostate cancer (Table 1.1). Blood serum levels of PSA have been in routine use as a minimally invasive indicator of prostate cancer since FDA approval in 1986. However, improvements to the PSA test, including measuring the percentage of free, unbound PSA (%fPSA) as a proportion of total PSA (tPSA) and the measurement of structural isoforms of the PSA protein, have only minimally improved the test, which is widely criticised for its lack of specificity. PSA is elevated in various conditions such as benign prostate hyperplasia, prostatitis and some infectious conditions, and it is not a suitable predictor of mCRPC development (Martignano et al., 2017; Nadler & Humphrey et al., 1995).

To date, the most pivotal lncRNA developed as a biomarker is the prostate cancer-specific PCA3 (Deng, Tang & Zhu, 2017). PCA3, originally discovered in 1999, was significantly upregulated in ~95% (65/66 tumours) of prostate tumours compared to the adjacent non-neoplastic tissue (Bussemakers et al., 1999). PCA3, unlike PSA, is not readily detected in extra-prostatic tissues and is not adversely influenced by other prostatic diseases such as prostatitis (Martignano et al., 2017). Given its remarkable prostate cancer-specificity, PCA3 was developed into the first lncRNA-based diagnostic test (PROGENSA), which was approved by the FDA in 2012 for use (in combination with PSA levels) in determining the need for repeat prostate biopsies in men ≥50 years of age who have had one or more previous negative prostate biopsies (Bourdoumis et al., 2015; Nicholson et al., 2015). The test requires the use of stabilisation buffers on collection to allow the preservation of PSA RNA in the urinary sediment, following a digital rectal examination (DRE) procedure (Nicholson et al., 2015). PCA3 concentrations are used in a ratio termed the “PCA3 Score” (Leyten et al., 2015; Nicholson et al., 2015; Perdona et al., 2013; Ramos et al., 2013; Zhang et al., 2015). However, PCA3 scores cannot be correlated with tumour grade, are likely affected by ADT, and cannot be inferred following docetaxel regimens administered after ADT (Bourdoumis et al., 2015; Ozgur et al., 2017). Indeed, recent studies by affiliates in our lab and others have shown that PCA3 is likely androgen regulated and that its knockdown may enhance the actions of enzalutamide (Lai et al., 2017). Thus, it may not be suitable as a prognostic predictor in these settings.
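As an illustration of the ratio described above, the following minimal sketch computes a PCA3 score from urinary transcript measurements. It assumes the commonly described definition, in which the score is the ratio of PCA3 mRNA to PSA mRNA multiplied by 1000; this scaling is not stated in the text above, and the function name and input values are hypothetical.

```python
def pca3_score(pca3_copies_per_ml: float, psa_copies_per_ml: float) -> float:
    """Return a PCA3 score as (PCA3 mRNA / PSA mRNA) x 1000.

    Assumption: the x1000 scaling follows the commonly described
    definition of the score; it is not given in the surrounding text.
    """
    if psa_copies_per_ml <= 0:
        # PSA mRNA acts as the normaliser for prostate-derived RNA,
        # so a sample without detectable PSA mRNA cannot be scored.
        raise ValueError("PSA mRNA must be detectable to compute a score")
    return 1000.0 * pca3_copies_per_ml / psa_copies_per_ml


# Hypothetical measurements, for illustration only.
score = pca3_score(pca3_copies_per_ml=50.0, psa_copies_per_ml=2000.0)
print(f"PCA3 score: {score:.1f}")
```

Normalising to PSA mRNA accounts for the amount of prostate-derived RNA recovered in each urine sample, which is why a sample with no detectable PSA mRNA is treated as uninformative in this sketch.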

Other markers alone, or in combination, may be more useful to predict advanced disease or for prognosis. For instance, combining urinary PCA3 with serum PSA levels improves upon the independent values of both measures as a biomarker (PCA3 area under the curve (AUC) of 0.66-0.72; PSA AUC of 0.54-0.63; PCA3 + PSA AUC of 0.71-0.75) (Prensner & Chinnaiyan, 2012). Moreover, combining urinary PCA3 levels with the Prostate Health Index (Phi) and with other prostate cancer biomarkers, including the prostate cancer fusion gene transmembrane protease, serine 2:ERG (TMPRSS2:ERG), the metabolic marker sarcosine, and annexin A3 (a calcium-dependent phospholipid-binding protein), in a multiplex panel has also shown improved sensitivity and diagnostic accuracy in a clinical setting (Perdona et al., 2013). Furthermore, incorporation of TMPRSS2:ERG into the PCA3 test improves the predictive value of the European Randomized Study of Screening for Prostate Cancer (ERSPC) risk calculator for predicting biopsy need (Leyten et al., 2014). Unfortunately, while TMPRSS2:ERG has improved diagnostic accuracy when combined with PCA3, its expression appears to be much higher in Western populations and it may not be suitable as a universal clinical test when considering repeat biopsies (Ren et al., 2012).

In addition to PCA3, urinary MALAT1 levels have shown potential as a useful clinical indicator of prostate cancer aggressiveness (Wang et al., 2014). Clinically, PSA measurements of 4-10 ng/ml are considered a “grey zone” for indication of a prostate cancer biopsy, given that only 25-35% of men will be diagnosed with prostate cancer, which generally presents as low grade (Zhou et al., 2017). A multi-centre validation study of 434 prostate cancer patients in China showed that urinary MALAT1 levels were an independent predictor of prostate cancer in patients with PSA values of ~4-10 ng/ml (AUC 0.670 and 0.742) and could outperform both tPSA (AUC 0.545 and 0.601) and %fPSA (AUC 0.622 and 0.627) (Wang et al., 2014). However, MALAT1 could not improve upon PSA predictions in men with a PSA > 10 ng/ml, and MALAT1 scores did not correlate with Gleason score, a useful clinicopathological parameter in prostate cancer diagnosis (Wang et al., 2014). Aside from minimally invasive urine samples, MALAT1-derived fragments available in serum and plasma, termed MD-miniRNA, are able to distinguish positive prostate biopsies from those that are negative (Ren & Wang, et al., 2013; Xue & Zhou et al., 2015). MD-miniRNA had superior specificity but lower sensitivity for prostate cancer detection than PSA. Importantly, however, MD-miniRNA measurements were elevated in patients with high-grade tumours and could discriminate between prostate cancer and BPH (Xue et al., 2015). Once combined, the high specificity of MD-miniRNA and high sensitivity of PSA proved useful as an improved diagnostic tool for prostate tumours (Xue et al., 2015).
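The AUC values quoted in this section summarise how well a marker separates biopsy-positive from biopsy-negative men. The sketch below shows one standard way such a value can be obtained: the rank-based (Mann-Whitney) estimate, in which the AUC is the probability that a randomly chosen positive case has a higher marker level than a randomly chosen negative case. It is not the analysis used in the cited studies, and the marker values are hypothetical.

```python
def roc_auc(positive_scores, negative_scores):
    """Rank-based (Mann-Whitney) estimate of the area under the ROC curve.

    Each positive/negative pair contributes 1 if the positive case has the
    higher marker value, 0.5 for a tie, and 0 otherwise.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both groups must contain at least one value")
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical marker levels (arbitrary units), for illustration only.
biopsy_positive = [3.1, 4.7, 2.2, 5.0]
biopsy_negative = [1.9, 2.5, 1.2, 3.0]
print(f"AUC = {roc_auc(biopsy_positive, biopsy_negative):.3f}")
```

An AUC of 0.5 corresponds to a marker that performs no better than chance and 1.0 to perfect separation, which gives context for values such as 0.545 for tPSA in the grey zone.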

One of the most notable advanced prostate cancer biomarkers is PCAT11 (also known as the lincRNA SChLAP1) (Cimadamore et al., 2017; Ma et al., 2017). This lncRNA associates with E26 transformation-specific (ETS)-positive prostate cancer and is highly upregulated in metastatic CRPC (W. Ma et al., 2017; Mehra et al., 2014; Mehra et al., 2015; Mouraviev et al., 2015; Prensner et al., 2013; Prensner & Zhao, et al., 2014). SChLAP1 in situ hybridisation (ISH) signal scores and mRNA expression in patients with a radical prostatectomy correlate with Gleason score (<7 vs ≥7), lethal CRPC and biochemical recurrence. SChLAP1 expression is robustly prostate-specific and holds potential as a future tissue-based biomarker for screening patients who might be at higher risk of CRPC, where PSA measurements are typically unreliable (Cimadamore et al., 2017; Ma et al., 2017; Mehra et al., 2016; Prensner & Zhao, et al., 2014). Another promising lncRNA is the transcript FR0348383 (W. Zhang et al., 2015), which, like MALAT1, is upregulated in prostate cancer compared to matched non-neoplastic prostate tissue and is able to discriminate prostate cancer in post-DRE first-catch urine samples in the PSA grey zone (PSA >4.0 ng/ml) (Zhang et al., 2015). Remarkably, when expressed as a ratio to PSA mRNA and analysed using multivariable logistic analysis, this lncRNA outperformed PSA and %fPSA, and could potentially avoid 52% of unnecessary biopsies and perfectly identify high-grade prostate cancer (Zhang et al., 2015).

Table 1.1. Aberrantly expressed lncRNAs in prostate cancer.

lncRNA | Genomic location | Expression pattern | Molecular mechanism in PCa | Association with PCa (clinically) | Association with clinical treatments in PCa | Citations

Biomarkers
PCA3 | 9q21.2 | Upregulated | Increases EMT and repression of tumour suppressor, PRUNE2 | Predicts prostate cancer in combination with PSA | Enzalutamide, Flutamide | (Bourdoumis et al., 2015; Nicholson et al., 2015; Ozgur et al., 2017; Salameh et al., 2015)
MALAT1 | 11q13.1 | Upregulated | Interacts with SF2 to promote AR-v7 splicing. Binds EZH2 leading to reduced levels of EZH2 targets | More sensitive than PSA for initial diagnosis | Enzalutamide | (Ren, Liu, et al., 2013) (Stone, 2017; Wang et al., 2017)
FR0348383 | Missing | Upregulated | Unknown | Independent predictor of prostate cancer in post-DRE urine | Unknown | (Zhang et al., 2015)

Prognostic/Predictive Biomarkers
HOTAIR | 12q13.13 | Upregulated | Prevents ubiquitination/degradation of AR by binding to its NTD to prevent MDM2 interaction | Associated with resistance to enzalutamide | Enzalutamide, Cisplatin | (Smolle et al., 2017; Xiang et al., 2017; Zhang et al., 2015)
MALAT1 | 11q13.1 | Upregulated | Interacts with SF2 to promote AR-v7 splicing. Binds EZH2 leading to reduced levels of EZH2 targets | Correlates with ADT-resistance | Enzalutamide | (Stone, 2017; Wang et al., 2017)
SChLAP1 | 2q31.3 | Upregulated | Counteracts the actions of the SWI/SNF chromatin remodelling complex | Predicts lethal mCRPC | Unknown | (Mehra et al., 2016; Prensner et al., 2013)
LncRNA-ATB | 14q11.2 | Upregulated | ceRNA for the miR-200 family, leading to upregulation of ZEB1, ZEB2 | Correlates with unfavourable clinical features | Unknown in PCa | (Xu et al., 2016)
DRAIC | 15q23 | Downregulated | Decreases migration through unknown mechanism. Transcription is repressed by AR binding to gene loci | Associated with early biochemical recurrence | Unknown | (Colditz & Baniahmad, 2016)
NEAT1 | 11q13.1 | Upregulated | Recruited to the promoter of oncogenes and induces a transcriptional ‘active’ state at histone H3 to promote invasion and cell proliferation | Predicts early biochemical recurrence | Enzalutamide, Bicalutamide | (Chakravarty et al., 2014)

Therapeutic targets
PCA3 | 9q21.2 | Upregulated | Increases EMT and repression of tumour suppressor, PRUNE2 | Inhibition may retard progression of disease | Enzalutamide, Flutamide | (Ozgur et al., 2017; Salameh et al., 2015)
MEG3 | 14q32 | Downregulated | Activates p53, upregulates caspase 3 and inhibits cyclin D1 to reduce cell proliferation | Induction of expression could decelerate tumour progression | Unknown | (Luo et al., 2015)
HOTAIR | 12q13.13 | Upregulated | Prevents ubiquitination/degradation of AR by binding to its NTD to prevent MDM2 interaction | Efficacy of enzalutamide enhanced upon blockage | Enzalutamide, Cisplatin | (Smolle et al., 2017; Xiang et al., 2017; Zhang et al., 2015)
PCGEM1 | 2q32.3 | Upregulated | Interacts with hnRNP A1 & U2AF65 to promote AR-v7 expression. Directly downregulates p21 | Efficacy of ADT enhanced by blockage | Enzalutamide, doxorubicin | (Ho et al., 2016; Parolia et al., 2015)
DANCR | 4q12 | Upregulated | Mediates binding of EZH2 and TIMP2/3, thus increasing cancer growth through TIMP2/3 reduction | Metastatic spread prevented by blockage upon enzalutamide-treatment | Enzalutamide | (Lu et al., 2018)

DANCR: Differentiation antagonising non-protein-coding RNA; TIMP: Tissue inhibitor of metalloproteinase; MEG3: lncRNA Maternally expressed gene 3; PCA3: Prostate cancer antigen 3; PRUNE2: Prune homolog 2; DRAIC: Downregulated RNA in cancer; PCGEM1: Prostate cancer gene expression marker 1; MALAT1: Metastasis-associated lung adenocarcinoma transcript-1; SF2: Serine/arginine rich splicing factor 1; EZH2: Enhancer of zeste homolog 2; NEAT1: Nuclear-enriched abundant transcript 1; SChLAP1: Second chromosome locus associated with prostate-1; HOTAIR: HOX transcript antisense RNA; AR: Androgen receptor; NTD: N-terminal domain; MDM2: Mouse double minute 2 homolog; PSA: Prostate-specific antigen; ADT: Androgen deprivation therapy; mCRPC: Metastatic castration-resistant prostate cancer; U2AF65: Splicing factor U2 auxiliary factor 65 kDa subunit; AR-v7: AR splicing variant 7; ZEB1, ZEB2: Zinc finger E-box binding homeobox 1, 2.

1.3.5.6 Therapeutic strategies to target lncRNAs

Many current cancer therapies are not tumour-specific and have many adverse effects (Gutschner & Diederichs, 2012). The ability to target these lncRNAs using new or evolving chemical tools provides a wealth of promising targets for the development of cancer therapies (Slaby et al., 2017). RNA interference (RNAi) techniques, including short interfering RNAs (siRNAs), short hairpin RNAs (shRNAs) and microRNAs (miRNAs) have been applied to silence lncRNAs in vitro and siRNAs have been safely used in preclinical models of cancer (Adams et al., 2017; Li & Chen, 2013; Slaby et al., 2017). Antisense oligonucleotides (ASOs) have undergone a number of generations of refinements, and modified ASOs demonstrate some advantages over siRNAs, particularly for targeting nuclear transcripts (Sharma & Singh, 2014). Catalytic oligonucleotides, aptamers and small molecules may also be used to silence lncRNA, or to disrupt lncRNA function in cancer (Fokina et al., 2015; Sharma et al., 2014). Additionally, newly described gene-editing tools, including CRISPR, also show significant promise for silencing lncRNAs and inhibiting processes required for cancer progression (S. J. Liu et al., 2017). These research efforts are likely to provide substantial health benefits in the future, including avenues for safer and more targeted treatments for cancer (Summarised in Figure 1.6).

Figure 1.6. Strategies for the clinical implications of lncRNAs in cancer patients. Abbreviations: CRISPR, clustered regularly interspaced short palindromic repeats; Cas, CRISPR associated; FISH, fluorescent in situ hybridization; ic-SHAPE, in vivo click selective 2′-hydroxyl acylation by primer extension; qRT-PCR, quantitative real-time polymerase chain reaction; RNAi, RNA interference; ASOs, antisense oligonucleotides; lncRNA, long noncoding RNA. (Adapted from Gupta & Tripathi, 2017).

1.4 Growth Hormone Secretagogue Receptor Opposite Strand (GHSROS) gene

The natural ghrelin receptor antisense gene, GHSROS, is a single-exon, antisense long noncoding gene complementary to the 2.1 kb intronic region of the GHSR gene (Figure 1.7) (Whiteside et al., 2013). GHSROS was recently identified by our group and is predicted to be an intronic, antisense long noncoding transcript (Whiteside et al., 2013). GHSROS has three transcription start sites (TSS); one TSS, located downstream of a consensus TATA box motif, is hypothesised to mark the major promoter (Whiteside et al., 2013). The other two promoters are upstream of a thymidine-rich poly-T repeat within an ancient MER5B (medium reiteration frequency 5B) DNA transposable element (Whiteside et al., 2013). The existence of this thymidine-rich area, which is absent in non-primates, suggests that a higher-order eukaryote-specific or primate-specific antisense promoter may have evolved through ab initio generation and the accumulation of mutations in the GHSR intron (Whiteside et al., 2013). MER5B DNA transposable elements have also been reported to contain promoter regions for natural antisense transcripts and could play a role in the transcription of novel and species-specific RNA transcripts (Whiteside et al., 2013). GHSROS is 5’ capped and 3’ polyadenylated and hence shows characteristics of an mRNA-like noncoding RNA that could potentially be involved in gene regulation (Whiteside et al., 2013). These observations provide strong evidence that the GHSROS transcript is processed into an mRNA-like transcript from a single antisense exon within the GHSR intron (Whiteside et al., 2013).

Figure 1.7. Identification and verification of antisense transcription in the GHSR locus. Growth hormone secretagogue receptor (GHSR) exons 1 and 2 are shown in black. An exon of a putative antisense transcript, GHSROS, is shown in grey. Lung cancer-derived ESTs (GenBank entries AW451317 and AI681234) and an antisense transcript deduced from a strand-specific cDNA microarray (TIN_36629) are shown as green boxes below the antisense strand exon. The region amplified by qRT-PCR and the region targeted using a riboprobe in Northern blot analysis experiments are shown as yellow boxes. Full-length sequences obtained by RACE (rapid amplification of cDNA ends) and by sequencing of a full-length cDNA clone (IMAGE cDNA clone 2272492) are shown as blue boxes (Whiteside et al., 2013).

Preliminary evidence from qRT-PCR studies in cDNA panels originally demonstrated that GHSROS is highly expressed in normal testis, with slightly lower expression in foetal brain. In contrast, low levels of GHSROS were detected across other normal tissues, including lung, pancreas, ovary, thymus and stomach (Whiteside et al., 2013) (Figure 1.8A). Additionally, GHSROS is highly expressed in lung cancer compared to normal lung tissue (Whiteside et al., 2013) (Figure 1.11A). Furthermore, forced GHSROS overexpression significantly increased cell migration in the A549 and NCI-H1299 lung cancer-derived cell lines, but led to a significant decrease in cell migration in the Beas-2B normal lung epithelium-derived cell line (Whiteside et al., 2013) (Figure 1.8B).

Thus, there is evidence that GHSROS may play a novel regulatory role in cancer and that it may play different roles in cancer and normal tissues (Whiteside et al., 2013). Our validation studies have also demonstrated that GHSROS gene expression increases in lung cancer clinical samples, compared to normal matched pair lung tissue, as well as in cancer cell lines compared to normal-derived cells (Whiteside et al., 2013), strengthening our hypothesis that GHSROS plays a role in lung cancer progression. The mechanism of action of GHSROS has not been determined; however, intronic noncoding RNA molecules may regulate the abundance of their sense protein-coding gene pair, or the splicing of alternative isoforms transcribed from the corresponding sense genes (Whiteside et al., 2013). GHSROS could, therefore, regulate the relative expression of the GHSR isoforms, GHSR1a and GHSR1b. LncRNAs may also have widespread effects working in trans and can influence a large number of genes at distant loci; however, the role of GHSROS in regulating gene expression has not been explored. Further, preliminary data generated in our laboratory have shown that GHSROS is dysregulated in a number of different tumour types, including breast and prostate cancer (Figure 1.9).

Figure 1.8. Expression and the effects of GHSROS on in vitro migration in non-small cell lung cancer (NSCLC) and NSCLC cell lines. (A) Relative expression of GHSROS in human tissues using qRT-PCR. Data are represented as means ± standard error of two technical replicates of two independent replicate experiments. The housekeeping gene 18S ribosomal RNA was used as a reference for normalisation. Data are represented as fold change relative to expression of transcripts in stomach. (B) GHSROS overexpression affects cell migration. Effect of GHSROS overexpression on cell migration in the normal-derived Beas-2B cell line after 24 h, and in the A549 and NCI-H1299 NSCLC cell lines after 6 h (compared to cells expressing vector alone). Migration was examined in a Transwell migration assay with 8 µm pores. Results are from triplicate samples from three independent experiments and expressed as percentage above or below control (vector-transfected cells). *P<0.05 (Student's t-test) between control cells (vector-transfected cells) and cells engineered to overexpress GHSROS (Whiteside et al., 2013).

Figure 1.9. GHSROS expression across multiple cancers. GHSROS expression in 8 cancers (TissueScan Cancer Survey Tissue qRT-PCR panel; CSRT101). Blue bars denote normal tissue; pink bars denote tumour tissue. For each cancer, data were expressed as mean fold change using the comparative 2^-ΔΔCt method against non-malignant control tissue. Normalized to β-actin (ACTB). *P<0.05, **P<0.01 (Student's t-test) (Unpublished data).

1.5 Conclusion

It is now well established that the transcription of thousands of lncRNAs represents more than just spurious transcriptional noise within the mammalian genome. Widespread expression of lncRNAs across every chromosome is abundant in many normal developmental processes, and is widely dysregulated in disease states, including cancer. Their diverse, often unusual molecular mechanisms ensure that lncRNAs are a unique and promising avenue for exploring novel mechanisms of cancer biology. To date, only a relatively small proportion of lncRNAs have been identified and thoroughly characterised in prostate cancer, and fewer have been verified as useful targets for cancer treatments. It is clear that identifying and examining the functions of noncoding transcripts is going to be important for the future of clinical therapies for cancer. The lncRNA GHSROS is aberrantly expressed in multiple tumour types, including lung, prostate and breast cancers. Furthermore, our previous studies have shown that GHSROS plays a functional and oncogenic role in lung cancer cell lines. Whether or not this is true in other cancer types is currently unknown. Dissecting the actions of GHSROS in cancers may result in novel therapeutic options for targeting these cancers.

1.6 Overall Objectives

The overall objective of this study is to investigate the expression and function of the lncRNA GHSROS in prostate and breast cancer and to determine if it may be a novel clinically relevant therapeutic target. This study will provide a deeper understanding of the role of lncRNAs in cancer.

1.6.1 Hypotheses

This investigation hypothesises that the novel ghrelin receptor natural antisense transcript lncRNA, GHSROS, is dysregulated in cancer and is a mediator of important pro-tumourigenic functions required for tumour growth and cancer progression.

1.6.2 The specific aims of this study are:

i. to characterise the expression of GHSROS in multiple tumour types using in silico analysis.
ii. to examine the relationship between clinical and pathological (clinicopathological) characteristics and the level of GHSROS expression in clinical samples using quantitative RT-PCR.
iii. to investigate the role of GHSROS in processes associated with cancer progression using in vitro cell proliferation and cell migration assays and in vivo mouse xenograft models of tumour growth established from cell lines.
iv. to investigate the mechanism of action of GHSROS, including the pathways and cellular processes that are activated by GHSROS, using next generation RNA-sequencing, bioinformatics analyses, exon microarrays, qRT-PCR and Western immunoblot analysis.

1.6.3 Significance

Long noncoding RNAs are key regulators of gene expression and play an important role in cancer. An increasing number of cancer-associated transcripts are being uncovered and have potential as biomarkers and therapeutic targets. The function and mechanism of action of lncRNAs in cancer are still poorly understood. Although we have identified the lncRNA GHSROS, which is expressed in lung cancer, its function has not been fully elucidated. As we have demonstrated that GHSROS is overexpressed in cancer and may play a role in cancer progression, targeting this gene using novel antisense approaches may be useful for the development of targeted therapeutics. This study will also increase our understanding of the biology of long noncoding RNAs and their role in cancer.

Chapter 2

Materials and methods

2.1 Cell Culture, prostate cancer patient-derived xenograft (PDX) cell lines and treatments

PC3 (CRL-1435), DU145 (HTB-81), LNCaP (CRL-1740), MDA-MB-231 (HTB-26), MCF-7 (HTB-22), MDA-MB-468 (HTB-132), MDA-MB-453 (HTB-131), HMLE, MCF-10A (CRL-10317), T-47D (HTB-133), BT474 (HTB-20), Hose 17.1, PEO4, OV-MZ-6, OV-90 (CRL-11732), OVCAR-3 (HTB-161), SKOV-3 (HTB-77), CaOv3 (HTB-75), ES-2 (CRL-1978), A549 (CCL-185), RWPE-1 (CRL-11609), RWPE-2 (CRL-11610), C4-2B (Thalmann et al., 1994), 22Rv1 (CRL-2505), and DUCaP (Y. G. Lee et al., 2001) cell lines were obtained from the American Type Culture Collection (ATCC, Rockville, MD, USA). Unless specified, all prostate, breast and ovarian cancer cell lines were maintained in Roswell Park Memorial Institute (RPMI) 1640 medium (Invitrogen, Carlsbad, CA) with 10% fetal bovine serum (FBS, Thermo Fisher Scientific Australia, Scoresby, VIC, Australia), supplemented with 100 U/mL penicillin G and 100 ng/mL streptomycin (Invitrogen). The A549 lung cancer cell line and the MDA-MB-231, BT474, MDA-MB-468, MCF-10A and MDA-MB-453 breast-derived cell lines were maintained in Dulbecco’s Modified Eagle Medium: Nutrient Mixture F-12 (DMEM/F12) medium (Invitrogen) with 10% FBS (Thermo Fisher Scientific Australia) supplemented with 100 U/mL penicillin G and 100 ng/mL streptomycin (Invitrogen). The non-tumourigenic RWPE-1 and the transformed, tumourigenic RWPE-2 prostate epithelium-derived cell lines were cultured in keratinocyte serum-free medium (Invitrogen) supplemented with 50 µg/mL bovine pituitary extract and 5 ng/mL epidermal growth factor (Invitrogen). MCF-10A cells were further supplemented with 0.5 μg/mL hydrocortisone, 200 μg/mL cholera toxin and 20 ng/mL epidermal growth factor (EGF).
The HMLE human mammary epithelial cell-derived cell line was grown in HuMEC Ready Medium (Invitrogen, Auckland, New Zealand) supplemented with 25 mg of bovine pituitary extract (Invitrogen) and HuMEC supplement (Thermo Fisher Scientific), which contains epidermal growth factor (EGF), hydrocortisone, isoproterenol, transferrin, insulin and bovine pituitary extract (BPE). The OV-MZ-6 ovarian cancer cell line was maintained in DMEM (high glucose, 4500 mg/L) (Invitrogen), containing 10 mM HEPES (Invitrogen), 0.55 mM L-arginine (Sigma-Aldrich), 0.272 mM L-asparagine (Sigma-Aldrich) and 20 µg/mL gentamicin (Sigma-Aldrich). All cell lines were passaged at 2- to 3-day intervals on reaching 70% confluency using TrypLE Select (Invitrogen). Cell morphology and viability were monitored by microscopic observation and regular Mycoplasma testing was performed (Universal Mycoplasma Detection Kit, ATCC). Six LuCaP PDX lines, as well as a locally developed PDX cell line (BM18), were obtained from Dr. Elizabeth Williams. All seven PDXs express wild-type AR, differentially express PSA and ERG, and present with deletions in PTEN and TP53 as well as TMPRSS2-ERG rearrangements (McCulloch & Williams, 2005; Nguyen et al., 2017).

Table 2.1. Patient derived xenograft (PDX) characteristics. LN; lymph node, AR; Androgen receptor, IHC; immunohistochemistry, TURP; transurethral resection of prostate, CRPC; castration-resistant prostate cancer, PSA; prostate-specific antigen, met; metastasis.

PDX model | Tissue | Gleason score | AR status (IHC) | PTEN (IHC) | ERG (IHC) | Response to castration | Response to docetaxel | Androgen ablation | CRPC disease
LuCaP 35 | LN met | 5+5 | ++ | - | + | +++ | +++ | Yes | Yes
LuCaP 70 | Liver met | 3+4 | +++ | + | - | ++ | +++ | Yes | Yes
LuCaP 96 | TURP | 5+4 | +++ | - | - | +++ | ++ | Yes | No
LuCaP 23-12 | LN met | NA | +++ | + | ++ | +++ | +++ | NA | NA
LuCaP 105 | Rib met | 5+3 | +++ | - | - | ++ | +/- | Yes | Yes
LuCaP 141 | TURP | NA | +++ | + | - | +++ | +++ | Yes | Yes
BM18 | Femoral met | 3+4 | +++ | NA | NA | + | NA | NA | NA

2.2 Production of GHSROS overexpressing cancer cell lines

Full-length GHSROS transcript was cloned into the pTargeT mammalian expression vector (Promega, Madison, WI). PC3, DU145, MDA-MB-231, MCF-10A and A549 cell lines were transfected (using Lipofectamine LTX, Invitrogen) with GHSROS-pTargeT DNA, or vector alone (empty vector), according to the manufacturer’s instructions. Cells were incubated for 24 hours in LTX and selected with Geneticin (100-1500 µg/mL G418, Invitrogen). As LNCaP prostate cancer cells were difficult to transfect using lipid-mediated transfection, we employed lentiviral transduction. Briefly, pReceiver-Lv105 vectors, expressing full-length GHSROS, or empty control vectors, were obtained from GeneCopoeia (Rockville, MD). For stable overexpression, LNCaP cells were seeded at 50-60% confluency and transduced with GHSROS, or empty vector control lentiviral constructs in the presence of 8 µg/mL polybrene (to increase transduction efficiency). Following a 48-hour incubation, transduced cells were selected with 1 µg/mL puromycin (Invitrogen).

2.3 RNA extraction, reverse transcription and quantitative reverse transcription Polymerase Chain Reaction (qRT-PCR)

Total RNA was extracted from cell pellets using either the RNeasy Plus Mini Kit (QIAGEN, Hilden, Germany) with a genomic DNA (gDNA) Eliminator spin column or High Pure RNA kit (Roche, Germany). To remove contaminating genomic DNA, 1 µg RNA was DNase treated prior to cDNA synthesis by Superscript III (Invitrogen). qRT-PCR was performed using the AB7500 FAST sequence detection thermal cycler (Applied Biosystems, Foster City, CA), or the ViiA Real-Time PCR system (Applied Biosystems) with SYBR Green PCR Master Mix (QIAGEN) using primers listed in table 2.2.

A negative control (water instead of template) as well as a minus reverse transcriptase control (minus RT) was used in each real-time plate for each primer set. All real-time experiments were performed in triplicate. Baseline and threshold cycle (Ct) values were obtained using ABI 7500 Prism and the relative expression of mRNA was calculated using the comparative 2^-ΔΔCt method (Livak & Schmittgen, 2001). Expression was normalized to the housekeeping gene ribosomal protein L32 (RPL32). All primer sequences used are listed in each chapter.
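The comparative Ct calculation referred to above can be sketched as follows. This is a minimal illustration of the 2^-ΔΔCt arithmetic (Livak & Schmittgen, 2001), not the analysis script used in this study; the function name and the Ct values are hypothetical, and the reference gene is assumed to be RPL32 as stated in the text.

```python
from statistics import mean


def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression by the comparative 2^-ddCt method.

    Each argument is a list of replicate Ct values. The reference gene
    (assumed here to be RPL32) normalises for input cDNA amount.
    """
    # dCt: target Ct minus reference Ct, within each condition.
    d_ct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    d_ct_control = mean(ct_target_control) - mean(ct_ref_control)
    # ddCt: sample dCt relative to the calibrator (control) dCt.
    dd_ct = d_ct_sample - d_ct_control
    # Assumes roughly 100% amplification efficiency (doubling per cycle).
    return 2.0 ** (-dd_ct)


# Hypothetical triplicate Ct values, for illustration only.
fold = fold_change_ddct(
    ct_target_sample=[24.0, 24.0, 24.0],   # target gene, overexpressing cells
    ct_ref_sample=[18.0, 18.0, 18.0],      # RPL32, overexpressing cells
    ct_target_control=[26.0, 26.0, 26.0],  # target gene, vector control
    ct_ref_control=[18.0, 18.0, 18.0],     # RPL32, vector control
)
print(f"Fold change relative to control: {fold:.2f}")
```

Because a lower Ct indicates more template, a ΔΔCt of -2 cycles corresponds to a fourfold increase in the target transcript relative to the control.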

2.4 Cell proliferation assays

Proliferation assays were performed using an xCELLigence real-time cell analyzer (RTCA) DP instrument (ACEA Biosciences, San Diego, CA). This system employs sensor impedance technology to quantify the status of the cell using a unit-less parameter termed the cell index (CI). The CI represents the status of the cell based on the measured relative changes in electrical impedance that occur in the presence and absence of cells in the wells (generated by the software, according to the formula CI = (Zi - Z0)/15 Ω, where Zi is the impedance at an individual point of time during the experiment and Z0 the impedance at the start of the experiment). Impedance is measured at three different frequencies (10, 25 or 50 kHz). Briefly, 5 × 10^3 cells were trypsinized and seeded into a 96-well plate (E-plate) and grown for 48 hours in 150 µl growth media. Cell index was measured every 15 minutes and all experiments were performed in triplicate with at least three independent repeats. For cell lines incompatible with the xCELLigence assay and for measurements using cytotoxic drugs, cell proliferation was quantified by measuring the cleavage of WST-1 (Roche, Basel, Switzerland) or fluorescent quantification of DNA content (Cyquant; Invitrogen). Briefly, 5 × 10^4 cells/well were seeded in 96-well plates (BD Biosciences, Franklin Lakes, NJ) and propagated for 72 hours in complete medium. For cell treatments, 10 µM Enz (Selleck Chemicals, Houston, TX, USA), 10 nM dihydrotestosterone (DHT) or 10-100 nM Docetaxel (Sigma Aldrich, St. Louis, MO, USA) were added to cell cultures for 48 (gene expression studies) or 96 (functional studies) hours in medium containing 2% FBS. Each was compared to a dimethyl sulfoxide (DMSO) (Sigma Aldrich, St. Louis, MO, USA) vehicle control. To determine cell number, absorbance was measured using the FLUOstar Omega spectrophotometer (BMG, Ortenberg, Germany) at 440 nm using a reference wavelength of 600 nm.
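The cell index formula above can be expressed as a short Python sketch. In practice the instrument software generates the CI; the function name and the impedance values used here are hypothetical and serve only to show the arithmetic.

```python
def cell_index(z_i_ohm, z_0_ohm, nominal_ohm=15.0):
    """Unit-less cell index: CI = (Zi - Z0) / 15 ohm.

    z_i_ohm: impedance at an individual time point (ohm)
    z_0_ohm: impedance at the start of the experiment (ohm)
    """
    return (z_i_ohm - z_0_ohm) / nominal_ohm


# Invented impedance trace sampled at successive time points
baseline = 15.0
trace = [15.0, 22.5, 30.0, 45.0]
print([cell_index(z, baseline) for z in trace])
```

A CI of 0 corresponds to the starting impedance, and the CI rises as cells attach and proliferate on the electrodes.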

2.5 Cell Migration assays

Migration assays were performed using the xCELLigence RTCA DP instrument (ACEA Biosciences). Briefly, 5 × 10^4 cells/well were seeded on the top chamber in 150 µl serum-free media. The lower chamber contained 160 µl media with 10% FBS as a chemo-attractant. Cell index was measured every 15 minutes for 24 hours to indicate the rate of cell migration to the lower chamber. All experiments were performed in triplicate with at least 3 independent repeats. For cell lines incompatible with the xCELLigence assay, migration assays were performed using a Transwell assay. Briefly, 6 × 10^5 cells were suspended in serum-free medium and added to the upper chamber of inserts coated with a polycarbonate membrane (8 µm pore size; BD Biosciences). Cells in 12-well plates were allowed to migrate for 24 h in response to a chemoattractant (10% FBS) in the lower chamber. After 24 h, cells remaining in the upper chamber were removed. Cells that had migrated to the lower surface of the membrane were fixed with methanol (100%) and stained with 1% crystal violet. Acetic acid (10%, v/v) was used to extract the crystal violet and absorbance was measured at 595 nm. Each experiment consisted of three replicates and was repeated independently three times.

2.6 Locked Nucleic Acid-Antisense Oligonucleotides (LNA-ASO)

Two distinct LNA-ASOs, RNV104L and RNV124, complementary to different regions of GHSROS, were designed in-house and synthesized commercially (Exiqon, Vedbæk, Denmark). The sequences contained two consecutive LNA nucleotides at the 5’-end and three consecutive LNA nucleotides at the 3’-end, in line with a gapmer design (RNV104L had LNA nucleotides at positions 2, 3 and 16, 17, 18; RNV124 had LNA nucleotides at positions 3, 4 and 18, 19, 20 [Table 2.2]). Lyophilized oligonucleotides were resuspended in ultrapure H2O (Invitrogen) and stored as a 100 µM stock solution at -20°C. Briefly, LNA-ASOs were diluted to 20 µM in OptiMEM I Reduced Serum Medium (Invitrogen) and cultured cells were transfected according to the manufacturer’s instructions. Cultured cells were incubated at 37°C in 5% CO2 for 4 hours, before 500 µl growth medium, containing 30% FBS, was added to the serum-free medium. The cells were transfected for 24-72 hours and GHSROS levels were assessed by qRT-PCR.

Table 2.2. LNA-ASOs used in this study. Red, enlarged font denotes LNA modifications. [a] Parameters calculated using Oligo Calc: http://biotools.nubic.northwestern.edu/OligoCalc.html

| Designation | Sequence | Length (mer) [a] | GC (%) [a] | Molecular weight (Da) [a] |
|---|---|---|---|---|
| Scrambled control | 5’-GCTTCGACTCGTAATCACCTA-3’ | 21 | 48 | 6341.22 |
| RNV124 | 5’-ATAAACCTGCTAGTGTCCTCC-3’ | 21 | 48 | 6341.22 |
| RNV104L | 5’-GTTAACTTTCTTCTTCCTTG-3’ | 20 | 35 | 6015.00 |
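The length and GC content reported in Table 2.2 can be recomputed directly from the sequences. The following Python sketch does so; it is an independent illustration and not the Oligo Calc implementation, and the dictionary and function names are hypothetical. Molecular weight is omitted because it depends on the chemistry assumed by the calculator.

```python
# Sequences as listed in Table 2.2 (5'-3')
OLIGOS = {
    "Scrambled control": "GCTTCGACTCGTAATCACCTA",
    "RNV124": "ATAAACCTGCTAGTGTCCTCC",
    "RNV104L": "GTTAACTTTCTTCTTCCTTG",
}


def gc_percent(sequence):
    """Percentage of G and C bases in a nucleotide sequence."""
    sequence = sequence.upper()
    gc_count = sequence.count("G") + sequence.count("C")
    return 100.0 * gc_count / len(sequence)


for name, seq in OLIGOS.items():
    # Length in nucleotides (mer) and GC content rounded to a whole percent
    print(name, len(seq), round(gc_percent(seq)))
```

The scrambled control has the same base composition as RNV124, which is why the two share identical length, GC content and molecular weight.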

2.7 Mouse subcutaneous in vivo xenograft models

All mouse studies were carried out under an approved Institutional Animal Care and Use Committee protocol. PC3-GHSROS, PC3-vector, DU145-GHSROS, DU145-vector, LNCaP-GHSROS and LNCaP-vector cell lines were injected subcutaneously into the right dorsal flank of 4-5-week-old male NOD/SCID gamma mice (obtained from Animal Resource Centre, Murdoch, Western Australia). PC3- and DU145-derived cell lines were injected in a 1:1 ratio with growth factor-reduced Matrigel (Thermo Fisher) (n = 8-10 per cell line) and tumours measured bi-weekly with digital calipers (ProSciTech, Kirwan, Queensland, Australia). LNCaP-derived cell lines were injected in a 1:1 ratio with growth factor-containing Matrigel and tumours measured at the same intervals. Animals were euthanased once tumour volume reached 1,000 mm^3 (calculated using the equation tumour volume = length × width^2/2), or at other ethical endpoints, including signs of stress such as increased heart rate, inactivity, abnormal posture/hunching or >20% body weight loss, per ethical approval and the Australian code. At the experimental endpoint the primary tumour was resected, divided in half, snap frozen and stored at -80°C. For histological analysis, cryosections (6-10 µm thick) were prepared using a Leica CM1850 cryotome (Wetzlar, Germany). Sections were collected onto warm, charged Menzel Superfrost slides (Thermo Fisher), fixed in ice-cold 100% acetone, air dried and stored at -80°C. LNCaP tumour xenografts were resected and placed into enough RNAlater (Thermo Fisher Scientific) to exceed 10x the tumour volume, for 24 hours. Following this, tumours were divided and placed in 10% neutral buffered formalin for 24 hours or placed back into fresh RNAlater to be extracted as needed.
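The tumour volume equation and the 1,000 mm^3 endpoint described in this section can be sketched in Python as follows. The function names and the caliper measurements are hypothetical; the sketch covers only the volume criterion and none of the other ethical endpoints.

```python
def tumour_volume_mm3(length_mm, width_mm):
    """Tumour volume = length x width^2 / 2, from caliper measurements in mm."""
    return length_mm * width_mm ** 2 / 2.0


def reached_volume_endpoint(volume_mm3, limit_mm3=1000.0):
    """True once the tumour volume has reached the ethical volume limit."""
    return volume_mm3 >= limit_mm3


# Invented caliper measurements (length, width) in mm
for length, width in [(10.0, 8.0), (20.0, 10.0)]:
    volume = tumour_volume_mm3(length, width)
    print(length, width, volume, reached_volume_endpoint(volume))
```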

2.8 Histology and immunohistochemistry

For histological analysis, cryosections (6-10 µm thick) were prepared using a Leica CM1850 cryotome (Wetzlar, Germany). Sections were collected onto warm, charged Menzel Superfrost slides (Thermo Fisher), fixed in ice-cold 100% acetone, air dried and stored at -80°C. For immunohistochemistry, tissues were fixed in paraformaldehyde and dehydrated through a graded series of ethanol and xylene, before being embedded in paraffin. Sections (5 µm) were mounted onto glass Menzel Superfrost slides (Thermo Fisher Scientific). Immunohistochemistry was performed using antibodies for the proliferation marker Ki67 (rabbit anti-human Ki67, Abcam, Cambridge, UK) and for the infiltration of murine blood vessels using rabbit anti-murine CD31 antibody (Abcam). Tissue sections were incubated with HRP-polymer conjugates (SuperPicture, Thermo Fisher Scientific), and then with the chromogen diaminobenzidine (DAB) (Dako, Glostrup, Denmark), as per manufacturer’s specifications. All sections were counterstained with Mayer’s hematoxylin (Sigma Aldrich) and mounted with coverslips using D.P.X with Colourfast (Fronine, Thermo Fisher Scientific).

2.9 Viability Assay

LNCaP and PC3 vector or GHSROS over-expressing cells (5000 cells/well) were seeded in 96-well plates (BD Biosciences) and cultured overnight in complete medium. LNCaP cells were treated with standard doses of test compounds in both charcoal stripped FBS (CSS) and 2% FBS. PC3 cells were treated with increasing doses of docetaxel in 2% FBS. After a 96-hour period, cell viability was measured using a WST-1 cell proliferation assay as performed previously (see section 2.4; Cell proliferation assays). All viability experiments were performed independently three times, with 3-4 replicates each.

2.10 Statistical analyses

Data values were expressed as means ± s.e.m. of at least two independent experiments (expressed as an n value) and, for all direct comparisons, evaluated using Student’s t-test for unpaired samples or the Mann-Whitney-Wilcoxon test. One-way or two-way ANOVA and post hoc tests were used to assess the significance of multiple comparisons. Parametric tests were performed on data tested for normality and non-normally distributed data were analysed with non-parametric tests. Mean differences were considered significant when P ≤ 0.05. Q-values denote P-values adjusted for multiple testing (Benjamini-Hochberg). Normalized high-throughput gene expression data were analyzed using LIMMA, employing a modified version of the Student’s t-test (moderated t-test) where the standard errors are reduced toward a common value using an empirical Bayesian model robust for data sets with few biological replicates (Ritchie et al., 2015). Statistical analyses were performed using GraphPad Prism v.6.01 software (GraphPad Software, Inc., San Diego, CA), or the R statistical programming language.
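To make the Benjamini-Hochberg adjustment concrete, the following Python sketch computes Q-values from a list of P-values. It is a plain illustration of the standard step-up procedure, not the code used in this thesis (analyses were run in GraphPad Prism and R); the function name and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values (Q-values), in input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Invented P-values from four hypothetical comparisons
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Each Q-value is the smallest false discovery rate at which the corresponding comparison would be called significant.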


Chapter 3

Expression of the lncRNA GHSROS in prostate cancer and its role in prostate tumour promotion


3.1 Introduction

It is now recognized that the human genome yields a multitude of RNA transcripts with no obvious protein-coding ability, collectively termed ncRNAs (Mattick & Rinn, 2015). A decade of intensive research has revealed that many ncRNAs greater than 200 nucleotides in length have expression patterns and functions as diverse as protein-coding RNAs (Huarte, 2015; Mattick & Rinn, 2015). These long noncoding RNAs (lncRNAs) have emerged as important regulators of gene expression and can act on nearby (cis) or distant (trans) protein-coding genes (Huarte, 2015). Nevertheless, the vast majority of lncRNAs remain uncharacterized. Much effort has been devoted to understanding their role and diagnostic and therapeutic potential in disease (Slaby et al., 2017). This is particularly the case in cancer, where many lncRNAs present with low basal, yet differential expression between tumour subtypes (Qi & Du, 2013). Notable examples associated with aggressive disease processes and adverse outcomes include HOTAIR (HOX transcript antisense RNA), which is upregulated in a range of cancers, and the prostate-cancer specific SChLAP1 (Huarte, 2015). We previously identified GHSROS (also known as AS-GHSR), a 1.1-kb capped and polyadenylated lncRNA gene antisense to the intronic region of the growth hormone secretagogue receptor (GHSR) (Whiteside et al., 2013). GHSROS harbours an AT-rich human-specific promoter in a MER5B SINE repeat element (Whiteside et al., 2013), a pattern frequently found in promoters of lncRNAs with high tissue specificity and low expression levels (Kannan et al., 2015; Pheasant & Mattick, 2007), and belongs to a category of unspliced lncRNAs termed totally intronic RNAs (TINs), many of which exhibit distinct expression in cancer (Engelhardt & Stadler, 2015; Nakaya et al., 2007; Reis et al., 2004). 
GHSROS is highly expressed in lung tumours compared to normal lung tissue and its forced over-expression in lung adenocarcinoma-derived cell lines increased cell migration (Whiteside et al., 2013). It is likely that GHSROS is also differentially expressed in other cancers, where it may have pathophysiological significance.

We therefore investigated the expression and function of this lncRNA in prostate cancer, a disease diagnosed in nearly 1.5 million men annually worldwide (Global Burden of Disease Cancer et al., 2015).


3.2 Materials and Methods

General Materials and Methods are outlined in detail in Chapter 2. Experimental procedures which are specific to this chapter are described below.

3.2.1 Bioinformatics

3.2.1.1 GHSROS, GHSR1a and GHSR1b annotation analysis

Sequence-specific mRNA annotation analysis for GHSROS, GHSR1a and GHSR1b was performed using the AnnoLnc program (M. Hou et al., 2016). Briefly, FASTA sequences derived from GenBank (Benson, Karsch-Mizrachi, Lipman, Ostell, & Wheeler, 2005) were directly inputted into the portal (http://annolnc.cbi.pku.edu.cn/index.jsp). Further data on the complete sequences, including alignments and PhyloCSF scores, were obtained from the UCSC genome browser (hg19) (Casper et al., 2018).

3.2.1.2 Identification of GHSROS transcription in exon array data sets

Northern blot and quantitative RT-PCR (qRT-PCR) analyses suggest that the lncRNA GHSROS is expressed at low levels (Whiteside et al., 2013). To expand on these observations and assess GHSROS expression, we interrogated Affymetrix GeneChip Exon 1.0 ST arrays, strand-specific oligonucleotide microarrays with probes for known and predicted exons (hereafter termed exon arrays). Exon arrays are comparable to RNA-seq in experiments aimed at assessing exon expression (i.e. gene isoforms) and suitable for experiments where the exon of interest is known (Dapas & Davuluri, 2017; Griffith et al., 2010). In the Exon 1.0 ST array, known (genes and ESTs) and putative exons are combined to form ‘transcript clusters’, with each exon defined as a probe set (typically, a set of 2-4 probes). By combining all probe sets, the expression of a transcript cluster (known or putative gene) can be measured (see https://goo.gl/4RSTG3). To identify probe set(s) corresponding to GHSROS, we downloaded the Exon 1.0 ST probe annotation file from NCBI (NCBI Gene Expression Omnibus (GEO) accession no. GPL5188). Full-length GHSROS (1.1 kb) was aligned to the human genome (NCBI36/hg18; March 2006 assembly) to generate genomic coordinates compatible with the probe file (chr3:173,646,439-173,647,538). Next, the probe annotation file (GPL5188) was interrogated to reveal probe sets spanning GHSROS by entering the following command in a UNIX terminal window:


This revealed a probe set, 2652604, consisting of 4 probes complementary to GHSROS. Cell and tissue exon array data were downloaded from NCBI GEO (Barrett et al., 2013), EBI ArrayExpress (Brazma et al., 2003) and the Affymetrix web site (see Supplementary Table 1). GEO data sets were bulk-downloaded using v3.6.2.117442 of the Aspera Connect Linux software (Aspera, Emeryville, CA, USA). In total, 3,924 samples were downloaded, corresponding to ~46% of all exon array data deposited in the NCBI GEO database. Arrays (individual CEL files) were normalized (output on a log2 scale, centered at 0) using the SCAN function in the R package ‘SCAN.UPC’ (Piccolo et al., 2012; Piccolo, Withers, Francis, Bild, & Johnson, 2013). SCAN normalizes each array (sample) individually by removing probe- and array-specific background noise using data from within the array. Next, arrays were interrogated using the UPC function in ‘SCAN.UPC’. UPC outputs standardized expression values (UPC value), ranging from 0 to 1, which indicate whether a gene is actively transcribed in a sample of interest: higher values indicate that a gene is ‘active’ (Piccolo et al., 2013). UPC scores are platform-independent and allow cross-experimental and cross-platform integration.
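The probe-set lookup that identified 2652604 is, in essence, an interval-overlap query of the annotation file against the hg18 coordinates of GHSROS. The Python sketch below illustrates that idea only; it is not the UNIX command used in this study. The record field names (`probeset_id`, `chrom`, `start`, `stop`) and the toy records are assumptions made for the example and do not reflect the actual GPL5188 column headers or probe coordinates.

```python
# GHSROS coordinates on NCBI36/hg18, as given in this section
GHSROS_REGION = ("chr3", 173_646_439, 173_647_538)


def overlaps(region, chrom, start, stop):
    """True if [start, stop] on chrom overlaps the region (inclusive ends)."""
    region_chrom, region_start, region_stop = region
    return chrom == region_chrom and start <= region_stop and stop >= region_start


def probesets_in_region(records, region):
    """Return IDs of annotation records overlapping the region."""
    return [
        record["probeset_id"]
        for record in records
        if overlaps(region, record["chrom"], record["start"], record["stop"])
    ]


# Toy annotation records with placeholder IDs and invented coordinates
toy_records = [
    {"probeset_id": "toy_1", "chrom": "chr3", "start": 173_646_900, "stop": 173_647_000},
    {"probeset_id": "toy_2", "chrom": "chr3", "start": 173_650_000, "stop": 173_650_100},
    {"probeset_id": "toy_3", "chrom": "chr5", "start": 173_646_900, "stop": 173_647_000},
]
print(probesets_in_region(toy_records, GHSROS_REGION))
```

Because the arrays are strand-specific, a full query would also need to match the strand of each probe set, which this sketch leaves out.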

3.2.1.3 Evaluation of GHSR/GHSROS transcription in deep RNA-seq data set

It has been estimated that reliable detection of low abundance transcripts in humans warrants very deep sequencing (> 200 million reads per sample) (Tarazona et al., 2011), far beyond most current data sets. To illustrate, we considered the expression of GHSR/GHSROS in a comparable clinical data set. Publicly available RNA-seq data (NCBI GEO accession no. GSE31528) from eight subjects with metastatic castration-resistant prostate cancer (bone marrow metastases) (Sowalsky et al., 2015) were interrogated. Briefly, total RNA-seq was performed on random-primed paired-end read libraries, to ensure consistent transcript coverage (Adiconis et al., 2013; Sowalsky et al., 2015), generating an average of 160M reads per sample. Paired-end FASTQ files were aligned to the human genome (UCSC build hg19) using the spliced-read mapper TopHat (v2.0.9) (Kim et al., 2013), with reference gene annotations to guide the alignment. BigWig sequencing tracks for the UCSC genome browser (Karolchik et al., 2004; Kent et al., 2002) were obtained from TopHat-generated BAM files (indexed by samtools v1.2 (H. Li et al., 2009)) using a local instance of the bamCoverage command in deepTools v2.5.4 (Ramirez et al., 2016). BigWig files were visualized in the UCSC genome browser (hg19). A region with fewer than ~10 supporting reads can be considered to have low coverage, rendering active transcription difficult to interpret (Sims & Ponting, 2014; Tarazona et al., 2011).


3.2.2 Expression of GHSROS in cancer specimens, prostate cell lines and tissues

3.2.2.1 GHSROS qRT-PCR expression in human tissue specimens

To survey the expression of GHSROS in cancer, we initially interrogated a TissueScan Cancer Survey Tissue qPCR panel (CSRT102; OriGene, Rockville, MD, USA), with cDNA arrayed on multi-well PCR plates, by qRT-PCR. Some samples were derived from normal, non-malignant tissue, making it possible to compare expression in tumour versus normal tissue. For each cancer type, data were expressed as mean fold change using the comparative 2^(-ΔΔCt) method against a non-malignant control tissue. Expression was normalized to β-actin (ACTB) as described in Chapter 2, section 2.3.

To further investigate the expression of GHSROS in prostate cancer, TissueScan Prostate Cancer Tissue qPCR panels (HPRT101, HPRT102, and HPRT103) were obtained from OriGene. The cDNA panels contained a total of 24 normal prostate-derived samples, 31 abnormal prostate samples (defined as lesions), and 88 prostate tumour samples. These panels were examined by qRT-PCR, using the method described in Chapter 2, section 2.3, except that the housekeeping gene ribosomal protein L32 (RPL32) was employed.

An independent cohort was obtained from the Andalusian Biobank (Servicio Andaluz de Salud, Spain). It consisted of tissue from 28 patients with clinical high-grade prostate cancer (10 localized and 18 metastatic tumours) and 8 normal prostate tissue samples. RT-PCR was performed using Brilliant III SYBR Green Master Mix and a Stratagene Mx3000p instrument (both from Agilent, La Jolla, CA, USA), as previously described (Hormaechea-Agulla et al., 2016). Briefly, samples on the same plate were analysed with a standard curve to estimate mRNA copy number (tenfold dilutions of synthetic cDNA template for each transcript). No-RNA controls were carried out for all primer pairs. To control for variations in the amount of RNA used, and the efficiency of the reverse-transcription reaction, the expression level (copy number) of each transcript was adjusted by a normalization factor (NF) obtained from the expression of three housekeeping genes (ACTB, HPRT, and GAPDH) using the geNorm algorithm (Vandesompele et al., 2002). Primers used are listed in Table 3.1.
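The normalization step can be illustrated with a short Python sketch. In geNorm, the normalization factor is the geometric mean of the selected housekeeping genes; the sketch below assumes all three genes (ACTB, HPRT and GAPDH) are used, and the function names and copy numbers are hypothetical.

```python
import math


def normalisation_factor(housekeeping_copies):
    """Geometric mean of housekeeping gene copy numbers (geNorm-style NF)."""
    log_values = [math.log(value) for value in housekeeping_copies]
    return math.exp(sum(log_values) / len(log_values))


def normalised_copy_number(target_copies, housekeeping_copies):
    """Target copy number adjusted by the sample's normalization factor."""
    return target_copies / normalisation_factor(housekeeping_copies)


# Invented copy numbers for one sample: ACTB, HPRT, GAPDH
sample_housekeeping = [10000.0, 100.0, 1000.0]
print(normalisation_factor(sample_housekeeping))
print(normalised_copy_number(500.0, sample_housekeeping))
```

The geometric mean is used rather than the arithmetic mean so that a single highly abundant housekeeping gene does not dominate the factor.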

3.2.3 Cell lines, prostate cancer PDX cell lines and cell culture

In this study, assays were performed using the PC3 (Kaighn & Jones, 1979), DU145, LNCaP, DUCaP (Y. G. Lee et al., 2001), 22Rv1, C4-2B (Thalmann et al., 1994), ES-2 and RWPE-1 cell lines, and the RWPE-2 cell line, which is derived from the RWPE-1 cell line by Ki-ras proto-oncogene malignant transformation using the Kirsten murine sarcoma virus (Ki-MuSV) (Webber & Hoffman, 1997). The PC3, DU145, LNCaP, DUCaP, RWPE-1, RWPE-2, C4-2B, 22Rv1, ES-2 and PDX cell lines used in this study were propagated as described in the General Methods, Chapter 2, section 2.1.

3.2.4 RT-PCR of cell line mRNA

RT-PCR was used to determine the expression of GHSROS in the normal, prostate cancer and PDX cell lines. RNA was extracted, cDNA synthesised and qRT-PCR performed for all cell lines (as described in Chapter 2, section 2.3). Primers used for this chapter are found in Table 3.1. No-template control RT-PCRs, in which template was substituted with water, were also performed.

Table 3.1. Primer sequences used in this study. F: Forward primer; R: Reverse primer

| Primer | Gene name | Primer sequence (5'-3') |
|---|---|---|
| GHSROS | growth hormone secretagogue receptor opposite strand | F: ACATTCAGCAAATCCAGTTAATGACA; R: CGACTGGAGCACGAGGACACTTGA |
| GHSROS-RT linker | growth hormone secretagogue receptor opposite strand | CGACTGGAGCACGAGGACACTGACAACAGAATTCACTACTTCCCCAAA |
| GHSR1a | growth hormone secretagogue receptor transcript variant 1a | F: TGAAAATGGTGGCTGTAGTGG; R: AGGACAAAGGACACGAGGTGG |
| GHSR1b | growth hormone secretagogue receptor transcript variant 1b | F: GGACCAGAACCACAAGCAAA; R: AGAGAGAAGGGAGAAGGCACA |
| RPL32 | ribosomal protein L32 (housekeeping gene) | F: CCCCTTGTGAAGCCCAAGA; R: GACTGGTGCCGGATGAACTT |
| ACTB | actin beta (housekeeping gene) | F: ACTCTTCCAGCCTTCCTTCCT; R: CAGTGATCTCCTTCTGCATCCT |

3.2.5 Production of GHSROS overexpressing cancer cell lines

Full-length GHSROS transcript was cloned into the pTargeT mammalian expression vector (Promega, Madison, WI). PC3 and DU145 cell lines were transfected with GHSROS-pTargeT DNA, or vector alone (empty vector), (using Lipofectamine LTX, Invitrogen) according to the manufacturer’s instructions, and as described in Chapter 2, section 2.2.

3.2.6 Cell Migration assays

Migration assays were performed (as described in Chapter 2, section 2.5) using an xCELLigence RTCA DP instrument (ACEA Biosciences) using the PC3-vector, PC3-GHSROS and DU145-vector and DU145-GHSROS cell lines. Briefly, 5 × 10^4 cells/well were seeded on the top chamber in 150 µl serum-free media. The lower chamber contained 160 µl media with 10% FBS as a chemo-attractant. Cell index was measured every 15 minutes for 24 hours to indicate the rate of cell migration to the lower chamber. All experiments were performed in triplicate with at least 3 independent repeats.


3.2.7 Cell proliferation assays

Cell proliferation assays were performed (as described in Chapter 2, section 2.4) using an xCELLigence RTCA DP instrument (ACEA Biosciences) using the PC3-vector, PC3-GHSROS and DU145-vector and DU145-GHSROS cell lines. Briefly, 5 × 10^3 cells were trypsinized and seeded into a 96-well plate (E-plate) and grown for 48 hours in 150 µl growth media. Cell index was measured every 15 minutes and all experiments were performed in triplicate with at least three independent repeats.

3.2.8 Anchorage-independent growth assay

In order to observe the growth of cancer cells overexpressing GHSROS (or vector controls), anchorage-independent growth assays were performed. Preparation of the suspension agar consisted of two layers. Initially, 1.5 mL 0.5% agar base (UltraPure Low Melting Point Agarose, Life Technologies, CA, USA) was prepared by mixing 1.0% agarose (diluted in UltraPure DNase/RNase-Free Distilled Water, Life Technologies) with 2X RPMI, or 2X DMEM/F12 with 20% FBS, in equal amounts to make a 0.5% base layer. The agarose base layer was set aside for 30 minutes to allow it to solidify. The top layer was prepared by mixing 0.7% agarose with 2X RPMI-1640 or DMEM/F12 to make a 0.35% agarose suspension, which was kept in solution at 40°C in the water bath. Cells were harvested with 0.05% trypsin and resuspended at a cell density of 5 × 10^3 cells/mL in 0.35% agarose solution. Approximately 1 mL of this suspension was dispensed into each well of a 6-well culture plate (Corning) and allowed to settle overnight at 37°C. The following day, 2 mL fresh 1X media with 10% FBS was added to the wells and replaced every 2-3 days for 21 days. After 21 days, colonies were fixed and stained with 0.1% crystal violet and the number of colonies was determined using a phase contrast light microscope at 40X magnification, by manual counting from triplicate wells for each cell line.

3.2.9 Attachment assays

Tissue culture plates with 96 wells were coated overnight with purified extracellular matrix (ECM) molecules and basement membrane components, fibronectin (BD Biosciences), collagen I (Calbiochem; EMD Biosciences, Merck, Kilsyth, Victoria, Australia) or collagen IV (Calbiochem; EMD Biosciences, Merck), diluted to a concentration of 10 µg/mL (50 µL/well) in RPMI, or DMEM/F12 (medium without additives). After overnight incubation, PBS was removed and non-specific binding sites were blocked with 1% (w/v) BSA. Approximately 2 × 10^4 cells/well from each overexpressing cell line (PC3-vector, PC3-GHSROS, DU145-vector, DU145-GHSROS) were seeded in 100 µL serum-free medium with 0.1% (w/v) BSA into each well of the matrix-coated plate or in a BSA-blocked only plate (for background adherence) and incubated for 1 h at 37°C. A third plate with no coating or blocking was also seeded with cells (including 5% FBS) and incubated for 5 h to ensure total cell attachment and so estimate the absorbance of total cell adherence (100% plate). After 1 h, or 5 h, media was aspirated from the wells and the cells carefully washed twice with PBS to remove any non-adhered cells. Adherence was then determined by staining the cells using CyQuant, as per the manufacturer’s instructions. Results were corrected for background attachment. The assay was performed in triplicate with 3 independent replicates of each ECM molecule per assay.

3.2.10 Mouse subcutaneous in vivo xenograft models

All mouse studies were carried out with approval from the University of Queensland and the Queensland University of Technology Animal Ethics Committees. PC3-GHSROS, PC3-vector, DU145-GHSROS and DU145-vector cell lines were injected subcutaneously into the flank of 4-5-week-old male non-obese diabetic (NOD) scid gamma (NSG) mice (Shultz et al., 2005) (obtained from Animal Resource Centre, Murdoch, WA, Australia). A complete description of the procedure is given in Chapter 2, section 2.7.

3.2.11 Histology and immunohistochemistry

For histological analysis, cryosections were prepared, sectioned, fixed and immunohistochemistry performed using antibodies for the proliferation marker Ki67 (rabbit anti-human Ki67, Abcam, Cambridge, UK), as described in Chapter 2, section 2.8.

3.2.12 RNA secondary structure prediction

To predict the secondary structure of GHSROS, single-stranded RNA sequence was used to interrogate the ViennaRNA web server (Gruber & Lorenz, 2015), and the minimum free energy (Wan & Chang, 2011; Zuker & Stiegler, 1981) of the GHSROS structure calculated.

3.2.13 Locked Nucleic Acid-Antisense Oligonucleotides (LNA-ASO)

The design and general transfection procedure for custom-made LNA-ASO sequences were as previously described (Chapter 2, section 2.6). Cell survival assays were performed using an xCELLigence RTCA system. Briefly, 5 × 10^3 PC3 cells were trypsinized and seeded into a 96-well plate (E-plate) and grown for 24 hours in 150 µl growth medium. Following this, the cells were transfected with the LNA-ASO RNV124 (as described in Chapter 2, section 2.6) and grown for a further 48 hours in serum-starved conditions. Cell index was measured every 15 minutes and all experiments were performed in triplicate with at least three independent repeats.


3.2.14 Statistical analysis

Student’s t-test was carried out to assess the statistical significance of all the direct comparisons, and one-way ANOVA and post hoc tests were used to assess the significance of multiple comparisons. Parametric tests were performed on data tested for normality and non-normally distributed data were analysed with non-parametric tests. Statistical analyses were undertaken using GraphPad Prism v.6.01 as described in Chapter 2, section 2.10.


3.3 Results

3.3.1 GHSROS is a bona fide mammalian lncRNA that is actively transcribed

Previous studies in our laboratory mapping the ghrelin receptor locus identified a 1.1 kb lncRNA denoted as GHSROS, an overlapping antisense transcript to the GHSR gene located on chromosome 3 (Ch3) Ch3q32.3 (Figure 3.1) (Whiteside et al., 2013). We sought to further characterise the structure of GHSROS and its expression in normal tissues and in disease, including cancer tissues. To that end, we initially employed publicly available Broad/ENCODE ChIP-seq datasets (Bernstein et al., 2005; Ernst et al., 2011) readily visualised through the UCSC genome browser (Casper et al., 2018; Kent et al., 2002; Raney et al., 2014). The predicted transcriptional start site (TSS) of the unspliced GHSROS gene was observed to have low levels of monomethylated histone H3 lysine 4 (H3K4Me1) and trimethylated H3K4 (H3K4Me3) occupancy (Figure 3.1). Both marks are epigenetic features of RNA polymerase II activity and indicate active transcription of genes (J. Lv, Liu, et al., 2013). No CpG sites could be aligned within the TSS of the GHSROS gene, unlike that of its opposite strand gene, the GHSR. DNA hypermethylation of the promoter of the GHSR is reported to be a common and early occurrence in numerous solid tumours (Moskalev et al., 2015). Indeed, our analysis supports these published data, which show the significant presence of CpG sites within exon 1 of the GHSR (Moskalev et al., 2015) (Figure 3.1). Interestingly, unlike the GHSR, further alignments showed no obvious levels of occupancy of histone H3 lysine 27 trimethylation (H3K27Me3) within the promoter of GHSROS (data not shown), suggesting that the GHSR (but not GHSROS) is subject to both DNA and histone methylation by lysine specific methyltransferases (Rice & Allis, 2001). However, we are unable to determine if this would affect the transcription of GHSROS.

While GHSROS has been verified as a noncoding gene, we wished to computationally exclude the possibility that it may generate small bioactive peptides. LncRNAs are rarely translated in human cells and we found no evidence of active translation or production of small peptide products from the GHSROS sequence. The codon substitution frequency (CSF) (Liao et al., 2011) obtained for GHSROS was negative (equivalent to noncoding genes), and similar to other verified lncRNA sequences (Lee et al., 2016). Furthermore, the txCdsPredict score derived from UCSC for the full-length sequence fell under the threshold for protein-coding potential (≥800 for a protein-coding sequence) in our analyses (txCdsPredict score = 392.50; UCSC ID: uc021xhj.2), making it highly unlikely that GHSROS is translated into functional peptides (Figure 3.1).


Figure 3.1. Schematic representation of the GHSR and GHSROS genomic loci. Visualisation of the genomic location of the noncoding gene GHSROS (red) and its associated protein-coding gene, GHSR, through the UCSC Genome Browser (hg19), providing evidence that it is a transcription unit. For reference, the two flanking protein-coding genes, FNDC3B and TNFSF10, are shown. Histone modifications which denote active transcriptional units (enrichment of H3K4Me1, H3K4Me3 and H3K27Ac sites) are located at the promoter and within the GHSROS exon. PhyloCSF data (negative = noncoding) and txCdsPredict scores (<800 = noncoding) were exported from the UCSC Genome Browser (hg19) and are negative or below 800, confirming the previous characterisation of GHSROS as a noncoding gene (Whiteside et al., 2013). Genomic data and CpG island annotations are from the ENCODE project and were visualised through the UCSC Genome Browser (hg19) (Casper et al., 2018; Kent et al., 2002; Raney et al., 2014).


3.3.1.1 Structural features of the lncRNA GHSROS transcript

The functional role and regulation of putative lncRNAs are often difficult to establish experimentally, and therefore, in silico prediction pipelines are useful tools for rapid data generation. We used the AnnoLnc portal, a flexible bioinformatics platform consisting of multiple annotation modules (including BLAST-like alignment (BLAST), RNAfold v2.0.7, TargetScan V6.0, Gene ontology (GO) enrichment analysis, ENCODE ChIP-Seq dataset(s) integration, integrated cross-linking immunoprecipitation (CLIP-Seq) dataset(s) and single nucleotide polymorphism (SNP)-trait associations (available through the National Human Genome Research Institute (NHGRI) Catalog of Published Genome Wide Association Studies)), in order to assess GHSROS and GHSR mRNA sequences. The 1043 bp GHSROS sequence (Chr:3, exon 1: 172163753-172164795, strand (+), Genome assembly: hg19) was not predicted to be transcriptionally regulated by any transcription factors upstream of its TSS (Table 3.2, Figure 3.1). Both the 3014 bp GHSR1a and 914 bp GHSR1b sequences were predicted to be regulated by EZH2, a transcription factor and histone-lysine N-methyltransferase enzyme which regulates the repression of numerous genes through H3K27me3 modifications (Yoo & Hennighausen, 2012). Binding sites were predicted upstream of and overlapping the TSS, and within the gene sequences. Similarly, analysis of ChIP-seq data from ENCODE/Broad Institute (Bernstein et al., 2005; Ernst et al., 2011) shows the overlapping EZH2 binding site within exon 1 of the GHSR, with a SUZ12 binding sequence <3000 bp upstream (Table 3.2, Figure 3.2A and Figure 3.2B).

LncRNAs can act as decoys, preventing miRNA interactions with target genes by sequestering them through sequence-specific interactions. We therefore examined TargetScan miRNA binding site predictions, which are assessed as high confidence by employing primate-specific conservation scores (Jeggari & Larsson, 2012). This analysis predicted that GHSROS possesses a number of binding sites for three different miRNA families (miR-34ac/34bc-5p/449abc/449c-5p, miR-125a-5p/125b-5p/351/670/4319, miR-1ab/206/613), each with a primate-specific conservation score (PCS) > 0.80 (Table 3.2). In contrast, there were minimal miRNA binding sites for either GHSR mRNA isoform (GHSR1a: let-7/98/4458/4500, PCS = 0.80; GHSR1b: miR-503, PCS = 0.80). The miRNA families identified for the GHSR isoforms were not shared with those predicted to interact with GHSROS (Table 3.2).

Interestingly, two SNPs were present within the flanking 6 kb of the GHSR genomic locus, and the tag SNP rs572169 was represented in all three sequences (GHSROS, GHSR1a, GHSR1b) and was significantly associated with height (P = 3 × 10⁻¹⁸, Table 3.2).


Table 3.2. Structural annotation of GHSROS, GHSR1a, GHSR1b. Annotated through the AnnoLnc portal (Hou et al., 2016). Transcription factor interactions were only considered if present in 3 individual ENCODE cell lines. TSS: transcription start site; TES: transcription end site; SNP: single nucleotide polymorphism; “-” indicates no data available. miRNA interactions were only considered with a primate conservation score of >0.80. SNPs were only considered if P < 0.05.

Transcription factor regulation (5kb upstream, 1kb SNPs (5kb downstream) Predicted miRNA Gene name GenBank upstream to Inside interactions (strand) accession ID Overlap Downstream 1kb Upstream TSS gene (TargetScan V6.0) TSS TES downstream) sequence miR-34ac/34bc- 5p/449abc/449c-5p, miR- rs572169 GHSROS (+) GU289929.1 - - - EZH2 125a-5p/125b- rs509035 5p/351/670/4319, miR- 1ab/206/613 NRSF, SUZ12, rs572169 GHSR1a (-) NM_198407.2 EZH2, EZH2 - let-7/98/4458/4500 EZH2, YY1 rs509035 YY1, EZH2, GHSR1b (-) NM_004122.2 EZH2 EZH2 - miR-503 rs572169 NRSF, SUZ12


Given the strong occupancy of EZH2 at the GHSR locus, we used the in silico tool catRAPID omics (Agostini et al., 2013) to determine if GHSROS had the propensity to bind and potentially tether EZH2 to the GHSR promoter to regulate its expression. Our analysis, however, determined that while GHSROS had a strong association with multiple RNA-binding proteins (Table 3.3), including heterogeneous nuclear ribonucleoprotein D-like (HNRDL; LncPro value = 92.128, catRAPID ranking = high), KH domain-containing, RNA-binding signal transduction-associated protein 3 (KHDR3; LncPro value = 91.703, catRAPID ranking = high), nucleolysin TIA-1 isoform p40 (TIA1; LncPro value = 88.799, catRAPID ranking = high), and nucleolysin TIAR (TIAR; LncPro value = 85.366, catRAPID ranking = high), there was no evidence of potential binding to EZH2 (Table 3.3). Instead, GHSROS is predicted to bind to numerous transcriptional regulators, alternative splicing modulators and regulators of apoptosis, proliferation and mRNA stability (outlined in Table 3.3). Taken together, GHSROS appears to be an actively transcribed lncRNA in the human genome (albeit at a low level), where it may serve to regulate multiple processes within its own genomic locus (cis) and at distant loci (trans).


Figure 3.2. Structural analysis of GHSROS and GHSR transcript sequence. (A) 5’ and 3’ ends of GHSR (minus strand) with a 5’-based 3000 bp extension, visualised through the UCSC Genome Browser. The presence of a hAT-Charlie family MER5B element is indicated in blue/bold. GHSR and GHSROS exons are indicated by grey/white outlines. Splice donor and acceptor sites are indicated in gold. EZH2 binding motifs (outlined in green), SUZ12 binding motifs (underlined) and CpG sites (indicated by vertical line and underline) are as indicated. (B) Visualisation of EZH2 binding sites at the GHSR TSS. EZH2 binding site data were extracted from the UCSC Genome Browser's annotation of ENCODE/Broad Institute model cell line ChIP-string data (Ram et al., 2011). Exons are indicated as black bars, with introns denoted by black lines. Sense and antisense orientations are indicated. The GHSROS full-length transcript (uc021xhj.2) is highlighted by a green bar.


Table 3.3. Interacting protein binding prediction for GHSROS. Assessed through the catRAPID and lncPro in silico pipelines. catRAPID ranking determined by the presence of protein binding domain, motif and interaction strength (%) (Agostini et al., 2013). Red text indicates targets of high prediction scores.

Protein name | lncPro value | catRAPID omics ranking | Function (Apweiler et al., 2004)
HNRDL_HUMAN | 92.1288 | High | Transcriptional regulator. Transcription repression.
KHDR3_HUMAN | 91.7032 | High | Alternative splicing.
TIA1_HUMAN | 88.7992 | High | Pre-RNA splicing and regulation of apoptosis.
TIAR_HUMAN | 85.3665 | High | Apoptosis. Stem cell division. Proliferation.
ESRP2_HUMAN | 83.0551 | High | mRNA splicing. EMT.
SRSF6_HUMAN | 79.4687 | High | Proliferation. Wound healing.
ROA1_HUMAN | 77.0664 | High | Splice site selection. Pre-mRNA into hnRNP. Transport of poly(A) mRNA from nucleus to cytoplasm.
PCBP1_HUMAN | 76.8218 | High | Binds ssDNA.
SRSF5_HUMAN | 75.7587 | High | Constitutive splicing. Cellular response to insulin stimulus. mRNA splicing. Transport of mature mRNA from intron-containing transcript.
SRSF4_HUMAN | 75.581 | High | Alternative splicing site selection during pre-mRNA splicing. Transport of mature mRNA from intron-containing transcript.
SFPQ_HUMAN | 75.5776 | High | Signal-induced alternative splicing. Homologous DNA repair. Transcriptional regulation. Transcriptional corepressor in absence of hormonal ligands. Inhibits IGF-1-stimulated transcriptional activity. Histone H3 deacetylation. RNA splicing. Pre-mRNA splicing factor. Binds to intronic polypyrimidine tracts. Regulation of signal-induced alternative splicing. Involved in homologous DNA repair. Interacts with SFPQ and SIN3A for recruitment of HDACs.
SRS10_HUMAN | 74.6111 | High | Splicing factor. Regulation of alternative splicing in neurons. Cytoplasmic transport.
HNRPD_HUMAN | 74.5869 | High | Binds to AREs in 3’-UTR of proto-oncogenes and cytokine mRNAs. Involved in translationally coupled mRNA turnover. Response to estradiol stimulus.
SRSF7_HUMAN | 74.3563 | High | Required for pre-mRNA splicing. Modulates alternative splicing in vitro.
ELAV1_HUMAN | 74.2641 | High | Binds to 3’-UTR of mRNAs and increases their stability. Involved in embryonic stem cell differentiation.
SRSF2_HUMAN | 73.9554 | High | Required for pre-mRNA splicing. mRNA export from the nucleus.
TRA2B_HUMAN | 73.582 | High | Control of pre-mRNA splicing. Regulation of mRNA splicing, via spliceosome.
LN28B_HUMAN | 72.7019 | High | Suppressor of miRNA biogenesis. Increases proliferation in the breast cancer cell line MCF-7.
ELAV2_HUMAN | 71.0501 | High | mRNA 3’UTR binding. Poly(A) RNA binding.


3.3.2 In silico analysis of GHSROS expression using public datasets

Microarrays and RNA-sequencing are commonly used to assess genes associated with disease; however, lncRNAs are often expressed at levels orders of magnitude lower than protein-coding transcripts, making them difficult to detect using current high-throughput technologies (Cabili et al., 2011; Derrien et al., 2012; Kutter et al., 2012; Necsulea et al., 2014; Ruiz-Orera et al., 2014). We have previously explored the expression of GHSROS in commercial cDNA arrays of normal tissues and a limited number of cancer samples (see Chapter 1, Figure 1.9).

In this study, we interrogated data from multiple exon arrays (3,924 samples deposited in the NCBI GEO database; Supplementary Table 1) harbouring four different strand-specific probes against GHSROS to demonstrate that it is actively transcribed and primarily expressed in endocrine tissues, lymphocytes and cancer tissues. GHSROS was expressed at low levels in cancer cell lines and tissues (Figure 3.3), which is consistent with previous observations from locus mapping, Northern blotting and qRT-PCR experiments (Whiteside et al., 2013). Furthermore, this is a typical feature of natural antisense transcripts (NATs) (Werner, 2005; Werner & Berdal, 2005). GHSROS shows variable expression in different tissue and cell types, as is generally described for lncRNA transcripts (Mattick & Rinn, 2015), with the most abundant expression in gliomas, prostate cancer, lung cancer and breast cancer. A high transcription score (denoted by the Universal exPression Code [UPC]) was also seen in mobilised CD34 primary cells and a number of key fetal tissues, including the heart, brain, kidney and lung (Figure 3.3). The low expression across the GHSROS and GHSR loci in RNA-seq data is illustrated in Figure 3.4, which shows normalised whole-transcriptome expression data from eight clinical samples of metastatic prostate cancer (Sowalsky et al., 2015).


Figure 3.3. Scatterplot of GHSROS Universal exPression Code values in publicly available exon array data sets. The scatter plot shows the log of Universal exPression Code (UPC) value, an estimate on whether a gene is actively transcribed, in exon array samples. The dotted, horizontal line separates samples with a UPC ≥ 0.1.


Figure 3.4. UCSC genome browser visualization of GHSR/GHSROS locus expression in castration-resistant prostate cancer. GHSR exons (black), antisense GHSROS exon (red). SRR332266 to SRR332273 denote NCBI Sequence Read Archive (SRA) database accession numbers. The y-axis represents read counts normalized to sequencing depth.

3.3.3 GHSROS is upregulated in prostate cancer and associates with advanced Gleason Score

To investigate the expression of GHSROS in prostate cancer, we interrogated cDNA tissue panels by qRT-PCR, where samples of increasing stage and Gleason score reflect the transition to advanced, aggressive cancers. We first evaluated GHSROS expression in a qRT-PCR tissue array of 19 different cancer types. This analysis revealed particularly high GHSROS expression in lung tumours, and elevated expression in prostate tumours (Figure 3.5). Analysis of prostate-specific qRT-PCR arrays revealed GHSROS expression in approximately 41.7% of normal prostate tissues (n=24), 55.7% of tumours (n=88), and 58.1% of samples from other prostatic diseases (e.g. prostatitis; n=31) (Table 3.1). GHSROS was highly expressed by a subset of prostate tumours (~11.4%; Z-score above 1) (Figure 3.6A, B) and elevated in tumours with Gleason scores 8-10 (Figure 3.6C). A non-significant yet strong association with prostate cancer stage was observed, with GHSROS upregulated across stage II, III and IV prostate tumours (Table 3.4). To expand on these observations, we examined an independent cohort from southern Spain, consisting of eight normal prostate tissue specimens and 28 primary tumours with high Gleason scores (with 18 patients presenting with metastases at biopsy). Similarly, GHSROS expression was significantly elevated in tumours compared to normal prostate tissue (Mann-Whitney-Wilcoxon test, P = 0.0070) (Figure 3.7A; Table 3.5) and in tumours from all Gleason scores (Figure 3.7B).


Figure 3.5. GHSROS expression in cancer. GHSROS expression in 19 different cancers (TissueScan Cancer Survey Tissue qPCR panel). For each cancer, qRT-PCR data are expressed as mean fold change ± standard deviation (SD) using the comparative 2^(-ΔΔCt) method compared to a matching non-malignant control tissue. Normalized to β-actin (ACTB). Where tissue types are denoted in red, GHSROS levels are high in tumours compared to matched normal tissues. N = normal tissue; T = tumour.
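For readers unfamiliar with the comparative Ct calculation named in this caption, the following is a minimal sketch of the standard ΔΔCt arithmetic. It assumes roughly 100% amplification efficiency (one doubling per cycle); the function name and the Ct values are hypothetical and are not taken from the thesis data.

```python
def fold_change_ddct(ct_target_tumour: float, ct_ref_tumour: float,
                     ct_target_normal: float, ct_ref_normal: float) -> float:
    """Relative expression by the comparative Ct method (assumes ~100% efficiency)."""
    # Normalise the target gene to the reference gene within each sample.
    delta_tumour = ct_target_tumour - ct_ref_tumour
    delta_normal = ct_target_normal - ct_ref_normal
    # Compare the tumour to its matched non-malignant control.
    delta_delta = delta_tumour - delta_normal
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: the target amplifies two cycles earlier in the tumour,
# while the reference gene is unchanged.
example = fold_change_ddct(28.0, 18.0, 30.0, 18.0)
print(example)  # 4.0
```

A ΔΔCt of -2 therefore corresponds to a four-fold increase relative to the control tissue.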


Figure 3.6. The lncRNA GHSROS is highly expressed in a subset of aggressive tumours. (A) GHSROS expression in the OriGene TissueScan Prostate Cancer Tissue qPCR panel, determined using qRT-PCR. *P ≤ 0.05, Mann-Whitney-Wilcoxon test. Expression was normalized to the housekeeping gene RPL32 and is shown relative to a normal prostate sample. N = normal prostate; T = tumour. Mean ± SD. (B) Relative gene expression of GHSROS in OriGene cDNA panels of tissues from normal prostate (n=24; blue), primary prostate cancer (n=88; red), and other prostatic diseases (n=31; orange). Determined by qRT-PCR, normalized to ribosomal protein L32 (RPL32), and represented as standardized expression values (z-scores). (C) GHSROS expression in OriGene TissueScan Prostate Cancer Tissue qPCR panels stratified by Gleason score. *P ≤ 0.05, **P ≤ 0.01, Mann-Whitney-Wilcoxon test. Mean ± SD. Expression was normalized to the housekeeping gene RPL32 and is shown relative to a normal prostate sample. N = normal prostate.
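The standardized expression values (z-scores) referred to in panel B can be illustrated with a short sketch. It assumes that each value is standardized against the mean and sample standard deviation of all values supplied; the reference population actually used for the figure is not stated here, and both the function names and the input values below are hypothetical.

```python
from statistics import mean, stdev


def z_scores(values: list[float]) -> list[float]:
    """Standardize values to zero mean and unit (sample) standard deviation."""
    mu = mean(values)
    sd = stdev(values)  # sample SD (n - 1 denominator); an assumption
    return [(v - mu) / sd for v in values]


def fraction_above(values: list[float], threshold: float = 1.0) -> float:
    """Fraction of samples whose z-score exceeds the threshold."""
    scores = z_scores(values)
    return sum(score > threshold for score in scores) / len(scores)


# Hypothetical relative expression values with one high-expressing sample.
hypothetical_expression = [1.0, 1.2, 0.9, 1.1, 5.0]
print(fraction_above(hypothetical_expression))
```

In this toy input only the last sample has a z-score above 1, so one in five samples would be classed as highly expressing under that cut-off.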


Table 3.4. Relationship between GHSROS expression and clinical and pathological parameters in OriGene TissueScan Prostate Cancer Tissue qPCR panels. Six samples were excluded due to missing clinical information. GHSROS expression (determined using qRT-PCR) in tumours stratified by clinical stage and Gleason score was compared to normal prostate (N). P values were calculated using the Mann-Whitney-Wilcoxon test. NA = not applicable.

Clinicopathological parameters | Sample number (n) | P-value
Age at diagnosis (mean ± SD) | 62.2 ± 7.80 |
Clinical stage | |
N (normal prostate) | 55 |
I | 0 | NA
II | 47 | 0.311
III | 33 | 0.0855
IV | 3 | 0.0185
Gleason score | |
N (normal prostate) | 55 |
2-6 | 14 | 0.0278
7 | 47 | 0.0920
8-10 | 22 | 0.00210

Figure 3.7. GHSROS expression in the Andalusian prostate cancer cohort. (A) GHSROS expression in the Andalusian Biobank prostate tissue cohort. Expression stratified by (B) Gleason score and (C) number of metastatic sites. 1 Met = one metastatic site; ≥2 = two or more metastatic sites. Absolute expression levels were determined by qRT-PCR and adjusted by a normalization factor calculated from the expression levels of three housekeeping genes (HPRT, ACTB and GAPDH). Mean ± SD. N = normal prostate. *P ≤ 0.05, Mann-Whitney-Wilcoxon test.


Table 3.5. Correlation between GHSROS expression and clinical and pathological parameters in the Andalusian Biobank prostate tissue cohort. GHSROS expression in tumours stratified by Gleason score and the number of tumour metastatic sites was compared to normal prostate (N). Tumours positive or negative for extraprostatic extension and perineural infiltration were compared to each other. P values were calculated using the Mann-Whitney-Wilcoxon test. NA denotes not applicable.

Clinicopathological parameters | Sample number (n) | P-value
Age at diagnosis (mean ± SD) | 73.7 ± 9.81 |
Gleason score | |
N (normal prostate) | 8 |
7 | 6 | 0.00870
8 | 9 | 0.0240
9-10 | 13 | 0.0420
Number of metastatic sites | |
N (normal prostate) | 8 |
Primary prostate tumour: 0/localized | 10 | 0.0420
Primary prostate tumour: 1 metastatic site | 6 | 0.0556
Primary prostate tumour: ≥2 metastatic sites | 12 | 0.0127
Extraprostatic extension | |
- | 16 |
+ | 11 | 0.379
Perineural infiltration | |
- | 8 |
+ | 20 | 0.415

3.3.4 Expression of GHSROS in prostate-derived cell lines and patient-derived xenograft (PDX) samples

As the functional thresholds of noncoding RNAs are difficult to gauge and are likely to be cell-type specific (Geisler & Coller, 2013), we first identified cell lines with a range of endogenous GHSROS expression levels using strand-specific primers. Significantly higher levels of GHSROS expression were observed in the metastatic PC3 cell line (7.48 ± 2.15 fold, P = 0.00040), the DuCaP cell line (111.08 ± 26.26 fold, P = 0.0024) and the RWPE-2 cell line (3.32 ± 1.06 fold, P = 0.001) compared to the RWPE-1 normal prostate-derived cell line (Figure 3.8). Expression in the DU145 (2.85 ± 3.18 fold, P = 0.29) and LNCaP prostate cancer cell lines (1.65 ± 1.71 fold, P = 0.49) was low and similar to the RWPE-1 cell line (Figure 3.8). We also assessed the expression of GHSROS in patient-derived xenografts (PDXs), which reflect the patient tumour biology more accurately than cancer cell lines (McCulloch et al., 2005; Nguyen et al., 2017). GHSROS was significantly upregulated in 5 of the 6 LuCaP series of PDX lines (compared to the RWPE-1 cell line). The highest expression was observed in the BM18 line (6.94 ± 1.55 fold, P = 0.0005), a PDX with high androgen receptor expression derived from a femoral metastasis (Figure 3.8). Taken together, in silico analysis of curated cancer datasets and qRT-PCR analyses of prostate cancer cell lines and clinical specimens demonstrate that GHSROS expression is elevated in prostate cancer, particularly in the high-grade primary prostate cancer specimens tested. GHSROS may therefore have a functional role in prostate cancer.

Figure 3.8. Expression of GHSROS in prostate cancer-derived cell lines and patient-derived prostate cancer xenograft (PDX) lines by qRT-PCR. Expression was normalized to the housekeeping gene RPL32 and is shown relative to the RWPE-1 normal prostate-derived cell line. Mean ± s.e.m, n = 3 biological repeats. *P ≤ 0.05, **P ≤ 0.01, ***P ≤ 0.001, one-way ANOVA, Tukey’s post hoc test. Clinical features of each of the PDX lines are detailed in Chapter 2, Table 2.1. Orange font indicates androgen-insensitive cell lines.

3.3.5 GHSROS promotes motility of prostate cancer cells in vitro

Typically, cancer relies on acquiring a number of biological hallmarks to sustain growth and eventually metastasise, processes that have been linked to the functions of some lncRNAs (Gutschner & Diederichs, 2012). We have previously demonstrated that GHSROS overexpression increases the rate of cell migration in lung cancer cell lines compared to vector-expressing controls (Whiteside et al., 2013); however, its role in prostate cancer has not been determined. To gain insights into the functional significance of GHSROS in cancer, we assessed its function in the androgen-independent DU145 and PC3 prostate cancer cell lines. We stably overexpressed the full-length GHSROS sequence or vector alone in these cell lines (denoted DU145-GHSROS and PC3-GHSROS) (Figure 3.9A). Using an xCELLigence RTCA migration assay, we observed that GHSROS overexpression increased the rate of cell migration in the PC3 (1.54 ± 0.35 fold, P = 0.0064) and DU145 prostate cancer cell lines (1.94 ± 0.43 fold, P = 0.017) over 18 hours compared to vector controls (Figure 3.9B). Representative masked phase contrast images and extrapolation of cell confluence data from the IncuCyte instrument show that GHSROS-overexpressing PC3 cells migrate into a scratch wound at a faster rate than the vector control cells at 18 hours (2.51 ± 0.27 fold, P = 0.016) (Figure 3.9C).

Figure 3.9 GHSROS increases cell migration in the PC3 and DU145 prostate cancer cell lines. (A) Relative expression levels of GHSROS when overexpressed in prostate (PC3, DU145) derived cancer cell lines. Expression was normalized to the RPL32 housekeeping gene. Results are relative to the respective vector control. Mean ± s.e.m, n=3, **P ≤ 0.01, Student’s t-test. (B) PC3-GHSROS and DU145-GHSROS cell migration. PC3 and DU145 cells were assessed using an xCELLigence real time cell analyser for 18 hours. Left panel and bottom left panel: representative kinetic plot over 18 hours. Right panel and bottom right panel: Mean ± s.e.m of triplicate experiments, n=3, *P ≤ 0.05, Student’s t-test. (C) Representative phase contrast image and confluence analysis of PC3-GHSROS and vector cells following 18 hours using the Incucyte™ scratch wound system. Mean ± s.e.m, n=2.

3.3.6 GHSROS promotes growth of prostate cancer cells in vitro

Another hallmark in the multi-step progression of tumourigenesis is sustained or increased cell proliferation (Gutschner & Diederichs, 2012; Hanahan & Weinberg, 2000; Hanahan & Weinberg, 2011). Employing real-time cell analysis, we observed that cell proliferation was increased in the PC3 (1.76 ± 0.18 fold, P = 0.029) and DU145 (1.74 ± 0.73 fold, P = 0.026) GHSROS-overexpressing cell lines compared with empty vector control cells after 72 h (Figure 3.10A). We performed anchorage-independent growth assays over a 21-day period to investigate the effect of GHSROS on tumourigenicity of the prostate cell lines. Using this model, we observed that GHSROS overexpression increased the size and number of colonies present after 21 days of growth on agarose (visualised by phase contrast images and manual counting of colonies), although this was not statistically significant (Figure 3.10C).

Figure 3.10. Overexpression of GHSROS promotes DU145 and PC3 cell proliferation. (A) xCELLigence real-time cell analysis for 72 hours demonstrated increased proliferation in the PC3-GHSROS and DU145-GHSROS cell lines. Mean ± s.e.m, *P ≤ 0.05, Student’s t-test. Experiments were performed independently three times (n=3) with three replicates per experiment. (B) Representative phase contrast image of PC3-vector and PC3-GHSROS cells after 72 hours of growth. Scale = 100 μm. (C) Representative micrographs of anchorage-independent growth assay of PC3-GHSROS and DU145-GHSROS cells at 21 days. Experiments were performed independently two times (n=2) with two replicates. Scale = 50 μm. Count data were obtained from colonies stained with 0.1% crystal violet.


3.3.7 The effects of GHSROS on the rate of cell attachment to extracellular matrix constituents

Changes in cell number in these assays may be due to decreased levels of apoptosis or increased rates of attachment to any given substrate. To investigate this, we performed attachment assays with the extracellular matrix (ECM) substrates fibronectin, collagen type IV and collagen type I, and performed CyQUANT® fluorescent measurement of cellular DNA to quantify cell number. GHSROS overexpression significantly increased attachment of the PC3 cell line to collagen type I (1.38 ± 0.38 fold, P = 0.0107) (Figure 3.11). No difference was noted in the DU145-GHSROS cell line compared to the vector control. There were no statistically significant differences in attachment to fibronectin or collagen type IV in either cell line.

Figure 3.11. GHSROS increases attachment to collagen type I in the PC3 cell line but not to collagen type IV or fibronectin. Attachment (measured using the CyQUANT® assay as a direct representation of the number of cells that have attached over a 24-hour period) to fibronectin and collagen type IV was not altered in the PC3-GHSROS and DU145-GHSROS cell lines compared to controls. Mean fluorescence reading measured at 480 nm ± s.e.m, n=3, *P ≤ 0.05, Student’s t-test. Ex = excitation wavelength. Em = emission wavelength.


3.3.8 GHSROS potentiates tumour growth in vivo

To determine if GHSROS had effects on tumourigenesis in vivo, we subcutaneously injected GHSROS-overexpressing and control PC3 or DU145 cell lines into NOD/SCID mice. Tumour growth over time was significantly increased in the PC3-GHSROS and DU145-GHSROS xenografts compared to the respective vector control xenografts (Figure 3.12A). Xenograft tumour growth was significantly elevated in PC3-GHSROS cells at 25 (PC3-vector: 37.55 ± 69.80 mm³, PC3-GHSROS: 178.02 ± 138.26 mm³, P = 0.0447), 33 (PC3-vector: 168.33 ± 26.26 mm³, PC3-GHSROS: 348.33 ± 130.46 mm³, P = 0.0041), and 37 (PC3-vector: 169.28 ± 33.86 mm³, PC3-GHSROS: 366.15 ± 83.48 mm³, P = 0.0013) days after day 0 (tumour implantation). This was replicated in the DU145 cells, which showed significant xenograft growth at 35 (DU145-vector: 99.53 ± 79.82 mm³, DU145-GHSROS: 378.40 ± 170.04 mm³, P = 0.0010) and 38 (DU145-vector: 142.99 ± 75.60 mm³, DU145-GHSROS: 451.14 ± 215.24 mm³, P = 0.002) days. At endpoint, xenograft tumour volume was significantly greater in both the PC3-GHSROS group at 40 days after tumour implantation (PC3-vector: 174.73 ± 44.8 mm³, PC3-GHSROS: 374.81 ± 66.6 mm³, P = 0.0002) and the DU145-GHSROS group at 42 days after implantation (DU145-vector: 203.91 ± 99.42 mm³, DU145-GHSROS: 492.36 ± 66.6 mm³, P = 0.036), compared to vector control xenografts (Figure 3.12A). The tumour weight at resection was significantly greater in the DU145-GHSROS xenografts compared to the DU145-vector controls (DU145-vector: 0.1 ± 0.039 g, DU145-GHSROS: 0.265 ± 0.121 g, P = 0.036) (Figure 3.12C). This can be clearly observed by gross visualisation of excised tumours (Figure 3.12D). Using immunohistochemistry for the proliferation marker Ki67, positive immunoreactivity was increased in the PC3-GHSROS (1.41 ± 0.08, P = 0.00030) and DU145-GHSROS (1.30 ± 0.18, P = 0.012) tumours, indicating an increased rate of cell proliferation in vivo compared to control xenografts (Figure 3.12E). GHSROS overexpression in the PC3-GHSROS and DU145-GHSROS xenograft tumours was confirmed by qRT-PCR (Figure 3.13).


Figure 3.12. GHSROS promotes in vivo xenograft tumour growth in NOD/SCID mice. (A) PC3-GHSROS (n=8) and PC3-vector control (n=4) xenograft tumour volumes after subcutaneous implantation in NOD/SCID mice. *P ≤ 0.05, **P ≤ 0.01, ***P ≤ 0.001, two-way ANOVA with Bonferroni’s post hoc analysis. (B) DU145-GHSROS (n=6) and DU145-vector control (n=4) xenograft tumour volumes after subcutaneous implantation in NSG mice. Mean ± s.e.m. **P ≤ 0.01, ***P ≤ 0.001, two-way ANOVA with Bonferroni’s post hoc analysis. (C) Weight of DU145-vector (n=4) and DU145-GHSROS (n=6) tumours excised at 42 days, measured in grams (g). Mean ± s.e.m. *P ≤ 0.05, two-way ANOVA with Bonferroni’s post hoc analysis. (D) Excised xenograft tumours 42 days after injection of male mice with DU145-GHSROS or DU145-vector (empty vector control) cells. Scale indicated below image. (E) Ki67-positive staining in PC3 xenografts and DU145 xenografts. Scale = 20 μm. (F) The number of Ki67-positive cells is greater in PC3-GHSROS and DU145-GHSROS xenograft tumours compared to vector control tumours. Mean ± s.e.m of 5 different fields, n=3 for each tumour type, *P ≤ 0.05, ***P ≤ 0.001, Student’s t-test.


Figure 3.13. Validation of GHSROS overexpression in PC3 and DU145 tumour xenografts by qRT-PCR. Expression changes were measured from (A) excised PC3 (n=2 vector, n=3 GHSROS), and (B) DU145 xenografts (n=2 vector, n=3 GHSROS) at in vivo endpoint. Expression was normalized to the housekeeping gene RPL32. Results are relative to the respective vector control. Mean ± s.e.m., n=3, *P ≤ 0.05, Student’s t-test.

3.3.9 Locked nucleic acid antisense oligonucleotide (LNA-ASO) knockdown of GHSROS

To confirm the functional effects of GHSROS and develop a tool for exploring its molecular functions, we designed locked nucleic acid antisense oligonucleotides (LNA-ASOs) to strand-specifically silence endogenous GHSROS expression (Figure 3.14A). After extensive validation with GHSROS-targeting antisense oligonucleotides (ASOs) lacking modifications (data not shown), we developed two different LNA-ASOs targeting distinct hairpin regions of GHSROS, RNV124 and RNV104L (Figure 3.14B). Transfection with either LNA-ASO (10 nM) reduced the expression of GHSROS (by ~80%) in the PC3 cell line 48 hours post transfection compared to a scrambled control (RNV124: -4.32 ± 0.15, P = 0.0002; RNV104L: -4.87 ± 0.10, P = 0.0001) (Figure 3.15A). Moreover, GHSROS knockdown using these LNA-ASOs attenuated cell proliferation (RNV124: -1.14 ± 0.06, P = 0.049; RNV104L: -1.18 ± 0.05, P = 0.030) (Figure 3.15B) and migration in the PC3 cell line over 18 h (RNV124: -1.96 ± 0.11, P = 0.0042) (Figure 3.15C), the reciprocal of the effects observed when GHSROS was overexpressed. We observed a more pronounced effect on cell growth when GHSROS was knocked down in serum-deprived conditions (0% serum) (Figure 3.15D). RNV124 caused a significant reduction in cell survival (cell number) over a 48-hour period when assessed by normalised impedance measurements using the xCELLigence RTCA (RNV124: -2.64 ± 0.25 fold, P = 0.0052) (Figure 3.15D).
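The relationship between the signed fold changes and the approximate percentage knockdown quoted above can be sketched as follows. This assumes that a negative value denotes a fold reduction (that is, -4.32 means expression fell to 1/4.32 of the scrambled control); that reading is an interpretation of the reported values, and the function name is illustrative only.

```python
def percent_knockdown(signed_fold_change: float) -> float:
    """Convert a negative fold change (e.g. -4.32) into a percentage reduction.

    Assumes -x means expression is 1/x of the control level.
    """
    if signed_fold_change >= 0:
        raise ValueError("expected a negative fold change for a knockdown")
    remaining_fraction = 1.0 / abs(signed_fold_change)
    return (1.0 - remaining_fraction) * 100.0


for name, fold in (("RNV124", -4.32), ("RNV104L", -4.87)):
    print(name, round(percent_knockdown(fold), 1))
```

Under this reading, the two LNA-ASOs correspond to reductions of roughly 77% and 79%, in line with the approximate 80% knockdown stated in the text.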


Figure 3.14. Strand-specific LNA design and mapping to GHSROS. (A) Schematic model of the regions of LNA-ASO hybridisation to the GHSROS sequence, and strand-specific knockdown, independent of the GHSR region. (B) GHSROS RNA secondary structure prediction performed using the single-stranded sequence and the RNAfold WebServer. MFE = minimum free energy. The locations of the locked nucleic acid antisense oligonucleotides (LNA-ASOs) that directly bind to and repress the lncRNA are shown in red.


Figure 3.15. LNA-ASOs reduce GHSROS expression and decrease PC3 proliferation, migration and cell survival. (A) LNA-ASOs reduce GHSROS expression in the PC3 prostate cell line (48h). Mean ± s.e.m, n=3, ***P ≤ 0.001, Student’s t-test. (B) GHSROS knockdown reduces PC3 migration. Left panel: Representative plot of the Cell Index (CI) impedance values measured up to 20 hours. Right panel: Reduction of GHSROS levels with RNV124 reduces cell migration at 18 hours. Assessment of ∆ Cell Index, Mean ± s.e.m (* P ≤ 0.05, n=3). Mean ± s.e.m (** P ≤ 0.01, n=3). (C) GHSROS knockdown reduces PC3 proliferation. Mean ± s.e.m, n=3, ***P ≤ 0.001, Student’s t-test. (D) GHSROS knockdown with RNV124 reduces the survival of PC3 cells in serum deprived conditions. Left panel: Representative plot of the ∆ CI impedance values measured over 72 hours. Dotted vertical line indicates the transfection time-point of scramble control and RNV124 LNA-ASO. Right Panel: Assessment of ∆ Cell Index, Mean ± s.e.m (* P ≤ 0.05, n=3).

3.3.10 GHSROS regulates the expression of antisense GHSR transcripts

Antisense lncRNAs typically serve to regulate their sense genes through various mechanisms, including transcriptional interference, regulation of alternative splicing, scaffolding of epigenetic modifying enzymes, and Watson-Crick base pair hybridisation and degradation (Campbell & Wengel, 2011; Roux, Lindsay, & Heward, 2017; Veedu & Wengel, 2010). In Section 3.3.1 we showed preliminary evidence that GHSROS may bind and regulate proteins involved in alternative splicing. Splicing of GHSR into its isoforms (GHSR1a and GHSR1b) is frequently seen in cancers that are positive for its expression (Leung et al., 2007; Seim & Chopin, 2009). Expression of the GHSR1a isoform (which encodes the canonical ghrelin receptor) could not be detected by qRT-PCR in the PC3 cell line (data not shown), although this has previously been published using non-quantitative RT-PCR and qRT-PCR by our laboratory (Hormaechea-Agulla et al., 2017; Jeffery & Chopin et al., 2002). GHSR1b expression was increased in both PC3- and LNCaP-GHSROS cell lines compared to the vector control (PC3: 25.12 ± 15.96 fold, P = 0.059; LNCaP: 12.66 ± 9.31 fold, P = 0.098; Figure 3.16A), but decreased in the DU145-GHSROS cells (-1.59 ± 0.19, P = 0.037; Figure 3.16A). Both GHSROS-targeting LNA-ASOs had the reciprocal effect, reducing GHSR1b expression in the PC3 cell line, suggesting that GHSROS regulates GHSR1b expression. We confirmed this in the ES-2 ovarian clear cell carcinoma-derived cell line (Figure 3.16B), which also expresses both GHSR isoforms (data not shown). While we saw a trend towards reduced GHSR1b expression in the prostate-derived DuCaP cell line, the downregulation was variable (Figure 3.16B).

Figure 3.16 Regulation of GHSR1b by GHSROS overexpression and knockdown. (A) GHSR1b qRT-PCR of GHSROS overexpressing DU145, PC3 and LNCaP cell lines. *P ≤ 0.05, Student’s t-test, n=3. Mean ± s.e.m. Expression was normalized to the housekeeping gene RPL32 and relative to vector control. (B) GHSROS knockdown reduces GHSR1b mRNA levels in the PC3 prostate cancer and ES-2 ovarian cancer cell lines. *P ≤ 0.05, ***P ≤ 0.001, Student’s t-test, n=3. Mean ± s.e.m. Expression was normalized to the housekeeping gene RPL32 and relative to scrambled LNA-ASO.


3.4 Discussion

In this chapter, we characterised the endogenous expression levels of the lncRNA GHSROS in model cell lines and across multiple cancer types. Specifically, we show that GHSROS is actively transcribed and differentially expressed in prostate cancer, where it promotes cellular functions associated with tumour progression, including migration, cell proliferation and in vivo tumour growth. Finally, we show preliminary evidence that GHSROS may post-transcriptionally regulate the ghrelin receptor isoform GHSR1b.

The 1.1 kb sequence that encodes GHSROS (GenBank accession: GU289929) is a single-exon lncRNA located on chromosome 3q26.31 that is fully contained within the 2.1 kb intron flanking the coding exons of the GHSR gene (Whiteside et al., 2013). GHSROS is specific to primates and poorly conserved across other species (Whiteside et al., 2013). LncRNAs are typically transcribed by RNA polymerase II and are subject to the control of chromatin modifications that are also a feature of canonical mRNA transcription (Deveson et al., 2017; Mattick & Rinn, 2015). GHSROS bears features of an actively transcribed gene; however, the low levels of enrichment for the active epigenetic marks H3K4Me1, H3K4Me3 and H3K27ac in its promoter region (Cui et al., 2013; Sun et al., 2015) suggest that GHSROS is transcribed at low levels in all ENCODE cell types (Ernst et al., 2011). Interestingly, visualisation of the ENCODE ChIP-seq datasets showed a strong enrichment for the polycomb group (PcG) proteins EZH2 and SUZ12 at the promoter of the GHSR gene, but not upstream of GHSROS. The PRC2 complex, which includes EZH2 and SUZ12, facilitates transcriptional repression through trimethylation of histone H3 on lysine 27 (H3K27me3) (Lu et al., 2016; Yoo & Hennighausen, 2012), which, following EZH2, was the second most enriched histone modification seen at the GHSR promoter. In silico data and methodologies are useful in suggesting that GHSROS does not interact with EZH2 and SUZ12 directly; however, future studies relying on RNA-based capture and mass spectrometry, RNA pull-down methods, or protein-focused methodologies such as RNA immunoprecipitation chip (RIP-Chip) and cross-linking and immunoprecipitation (CLIP) will prove valuable in dissecting this interaction (Ferre et al., 2016). Given the limited reliability of some in silico algorithms, this will be a crucial future aim.

GHSROS was originally designated a noncoding RNA based on its antisense orientation with respect to GHSR, in silico open reading frame (ORF) analysis, and in vitro ORF-GFP fusion-protein translation assays indicating that GHSROS lacked significant protein-coding potential (Whiteside et al., 2013). In agreement, GHSROS was computationally verified to be a noncoding RNA using updated sequence-based algorithms such as the sequence conservation-based PhyloCSF (Lin, Jungreis, & Kellis, 2011), the UCSC-derived txCdsPredict, and the conventional CPAT (Wang & Park, et al., 2013) pipelines, visualised through the UCSC genome browser. There remains the possibility, however, that GHSROS may be translated into a discrete peptide under specific conditions that cannot be factored into these updated programs. While we currently recognise GHSROS as a natural antisense noncoding transcript, we acknowledge that the field is still expanding and that GHSROS may still give rise to small bioactive peptides, as is recognised for other genes (transcripts of uncertain coding potential; TUCP). Furthermore, our analyses have not uncovered any additional exons to the single-exon GHSROS sequence. However, it has been posited that there are further upstream exons (I. Seim, personal communication, 2009). In this thesis, we have examined the function and role of the 1.1 kb sequence originally characterised using strand-specific qRT-PCR and 5' and 3' rapid amplification of cDNA ends (RACE) (Whiteside et al., 2013).

We thus investigated the active transcription of GHSROS across a large number of Affymetrix exon arrays containing a number of strand-specific probes for GHSROS. Consistent with our evaluations of active transcriptional marks, GHSROS is expressed at low levels in multiple cancer types, cell lines and tissues. Notably, this included a significant number of lung cancer specimens and multiple prostate cancer cell lines. Independent analysis across 19 major cancer types demonstrated that GHSROS is expressed at high levels in lung tumours, as previously described. This is not surprising, given that GHSROS was originally identified from EST clones derived from lung cancer cDNA libraries (Whiteside et al., 2013). As in lung cancer, GHSROS is significantly elevated in prostate cancer, which we detected in our exon array and clinical specimen cohorts. We observed that GHSROS is more highly expressed in a subset of clinical prostate cancers, characterising both local and advanced tumours of a high Gleason score (>7, and highly expressed in Gleason score 8-10 specimens). In agreement with this, we find that GHSROS is expressed in both prostate cancer cell lines and PDX models that resemble aggressive and metastatic prostate cancers (McCulloch et al., 2005; Nguyen et al., 2017). The importance of lncRNAs in the regulation of prostate cancer has been widely appreciated since the discovery of PCA3 and its use as a diagnostic tool (Mitobe et al., 2018; Smolle et al., 2017). As evidence of their roles in prostate cancer rapidly accumulates, lncRNAs have been shown to play key roles in promoting castrate resistance, organising epigenetic modifications, and organising networks of genes that promote cancer progression (Alahari et al., 2016; Ma et al., 2017; Smolle et al., 2017). GHSROS is a single-exon antisense lncRNA embedded within intron 1 of the corresponding sense gene, GHSR.
It has been shown that the increasing presence of antisense transcripts positively correlates with increasing degrees of differentiation in tumour samples (estimated by Gleason score) (Reis et al., 2004), implying that they play a widespread functional role. One such antisense lncRNA, transient receptor potential cation channel, subfamily M, member 2 antisense transcript (TRPM2-AS), which resides on the opposite strand to the oxidative stress-activated ion channel-encoding gene, TRPM2, is abundantly expressed in prostate cancer samples, where high expression is linked to poor clinical outcome (Orfanelli et al., 2015). Indeed, a considerable number of intronic transcripts are correlated with prostate cancer, and a high proportion of these are classified as antisense sequences (Morris & Vogt, 2010; Reis & Louro et al., 2005; Reis et al., 2004). Our data suggest that GHSROS falls into this category of single-exon intronic cancer-associated antisense lncRNAs, and is likely to be associated with significant functional effects in cancers where it is dysregulated.

Very recent work suggests that a small proportion (~3%) of long noncoding RNA genes mediate cell growth (Liu et al., 2017). Herein, we demonstrate that the lncRNA GHSROS is one such gene. GHSROS functions as an oncogene, promoting the in vitro growth and migration of the PC3 and DU145 metastatic prostate cancer cell lines. PC3 and DU145 subcutaneous xenograft models showed significant increases in tumour growth when GHSROS was ectopically expressed. Furthermore, we showed significant increases in the amount of immunohistochemical staining of Ki67, a marker for cellular proliferation. Forced over-expression in prostate cancer cells demonstrates that GHSROS plays a role in promoting the growth of prostate cancer cells in vitro and in vivo. GHSROS appears to provide a pro-survival benefit, and depletion of its expression with strand-specific LNA-ASOs caused a reciprocal regulation of its in vitro proliferation and migratory effects. Similarly, the lncRNA known as SChLAP1 was recently identified as a prostate cancer-specific lncRNA which is overexpressed in clinical prostate cancers (Prensner et al., 2013; Prensner & Zhao, et al., 2014). Utilising in vitro and in vivo gain- and loss-of-function experiments, SChLAP1 was shown to be a critical driver of prostate cancer cell line invasiveness and tumour growth (Prensner et al., 2013). SChLAP1 is overexpressed in a subset of clinical prostate cancers and was observed to be a useful predictor for poor patient outcomes, including prostate cancer specific mortality (Prensner et al., 2013; Prensner & Zhao, et al., 2014). GHSROS, like SChLAP1, is expressed at higher levels in a subset of prostate tumours and similarly promoted prostate cancer cell migration and proliferation. The expression levels of GHSROS in DU145-GHSROS cells (compared to the DU145-vector) in resected xenograft tumours grown in vivo (4-fold increase) differed from those in the same cells grown in vitro (40-60-fold increase in GHSROS expression).
The discrepancy might reflect the influence of the in vivo microenvironment on GHSROS expression, or the transfection stability. We predict that the GHSROS overexpression cassette should be relatively stable for a number of weeks without antibiotic selection, as it is in vitro. The GHSROS overexpression cassette is driven by the commonly used CMV promoter, which is sensitive to methylation (Scharfmann et al., 1991) and this could have contributed to the lower levels of GHSROS expression at the end of our xenograft experiments.

Given that the ~1.1 kb GHSROS is fully contained within the relatively short, 2.2 kb intron separating the two coding exons of GHSR (Whiteside et al., 2013), we wished to measure the effects of GHSROS modulation on GHSR. Previous studies have implicated the upregulation of the GHSR in a wide range of cancers, including prostate cancer, where alternative splicing events leading to the expression of the GHSR can facilitate cell proliferation and migration (Fung et al., 2013). GHSR1a encodes the 366-AA seven-transmembrane cognate receptor for the peptide hormone ghrelin (Howard et al., 1996), while the intron-retained GHSR1b encodes a 289-AA C-terminally truncated receptor that does not bind ghrelin, but interacts with GHSR1a and other receptors (Chu et al., 2007; Leung et al., 2007; Mary et al., 2013; Takahashi et al., 2006). GHSR1b is overexpressed in human tumours of the prostate, lung, breast, ovary, and pituitary (Barzon et al., 2005; Gahete et al., 2011; Gaytan et al., 2005; Ibanez-Costa et al., 2015; Jeffery et al., 2002; Takahashi et al., 2006); however, the basis for its differential expression is unknown. GHSR1a is encoded by exons 1 and 2 of GHSR, with the intron that overlaps GHSROS being removed during the splicing process (Chow et al., 2012; Seim & Chopin, 2011). GHSROS could regulate its overlapping protein-coding sense gene by binding to GHSR pre-mRNA to modulate the bioavailability of receptor isoforms. GHSROS could thus contribute to the high expression of GHSR1b in a range of human cancers, including prostate cancer (Barzon et al., 2005; Gnanapavan et al., 2002; Jeffery et al., 2005; Jeffery et al., 2002). The reciprocal regulation of a receptor by its antisense gene has been described in a small number of reports in the literature, including for the transcription factor Rev-erbα (NR1D1), thyroid hormone receptor alpha (THRA) (Hastings et al., 2000; Munroe & Lazar, 1991), DEAD-box helicase 39A (DDX39A), adhesion G protein-coupled receptor E5 (ADGRE5) (Katayama et al., 2005) and Fas cell surface death receptor (FAS) (Sehgal et al., 2014; Yan et al., 2005). The antisense and intronic lncRNA, PCA3, is embedded in intron 6 of the tumour suppressor gene, PRUNE2, where it acts as a dominant-negative oncogene to inactivate PRUNE2 in prostate cancer (Salameh et al., 2015).
Critical next steps in the study of GHSROS and its relationship to the GHSR isoforms in prostate cancer include interrogating a large number of well-annotated clinical samples, such as CRPC patient-derived xenografts (PDXs) and metastatic specimens (which are difficult to sample). Furthermore, clustered regularly interspaced short palindromic repeat (CRISPR)/Cas9-mediated strand specific deletion of the gene encoding GHSROS will be useful in teasing apart any post-transcriptional regulation mechanisms.

In summary, we have shown for the first time that GHSROS is expressed in prostate tumours and regulates growth, motility, and survival of androgen-sensitive and androgen-independent prostate cancer cell lines. Although cancers are highly heterogeneous diseases (Hanahan & Weinberg, 2011), few therapies target molecular cancer phenotypes; lncRNAs, however, provide a largely untapped source of new molecular targets (Adams et al., 2017). In this study, we developed antisense oligonucleotides targeting GHSROS and assessed their function in cultured cancer cells. Targeting GHSROS may present an opportunity for clinical intervention; however, it is appreciated that translational and regulatory challenges exist for oligonucleotide therapies (Gutschner et al., 2017). Finally, it is suggested that GHSROS is a member of the growing list of bona fide lncRNAs that are differentially expressed in prostate cancer and may play a post-transcriptional role. While this study firmly establishes that GHSROS plays a role in prostate cancer and may possess a large regulatory role in trans, its role in the gene expression changes mediating these functional responses remains unknown.

Chapter 4

The lncRNA GHSROS mediates expression of genes associated with metastasis and adverse outcome

4.1 Introduction

In a relatively short amount of time, long noncoding RNA (lncRNA) transcripts have emerged as critical regulators of cancer progression (Cheetham & Dinger, 2013; Deveson et al., 2017; Taft & Mattick, 2010). These molecules play key roles in cancer progression and provide a largely untapped and abundant source of novel therapeutic targets for cancer (Slaby et al., 2017).

In the previous chapter we showed that GHSROS is upregulated in multiple cancer types, including prostate cancer. We demonstrated that forced GHSROS over-expression in the DU145 and PC3 prostate cancer cell lines significantly increased cell proliferation and migration in vitro. Furthermore, GHSROS overexpression in these cell line models potently increased xenograft growth in vivo. LNA-ASOs targeting specific regions of GHSROS significantly reduced its expression in the PC3 cell line, which in turn significantly reduced proliferation and cell migration, the reciprocal of the effects of forced GHSROS over-expression. Our preliminary investigation suggested that GHSROS regulates the expression of the GHSR1b isoform of GHSR, its sense gene, in prostate cancer cell lines. However, it is likely that GHSROS, like other lncRNAs (Sun et al., 2017), regulates the transcription of a large number of genes, acting in trans to coordinate its functional effects on tumour growth and survival. The lncRNA SChLAP1, for example, which correlates with higher Gleason score prostate tumours and is highly prognostic for advanced prostate cancers (Mehra et al., 2014; Prensner et al., 2013), coordinates the expression of a pro-invasive and migratory gene network (Prensner et al., 2013).

In this study, we used high-throughput RNA-sequencing (RNA-seq) to examine genes regulated by GHSROS in vitro in the PC3 prostate cancer cell line. Our analysis provides evidence that GHSROS promotes a gene expression pattern that enhances the propensity for metastasis and adverse disease outcome.

4.2 Materials and Methods

General Materials and Methods are outlined in detail in Chapter 2. Experimental procedures which are specific to this chapter are described below.

4.2.1 RNA-sequencing of PC3-GHSROS cells

RNA was extracted from in vitro-cultured PC3-GHSROS cells and controls, as outlined in Chapter 2, section 2.3. RNA integrity was assessed using an Agilent 2100 Bioanalyzer, and samples with an RNA Integrity Number (RIN) above 7 were used for RNA-sequencing (RNA-seq). Strand-specific RNA-seq was performed by Macrogen, South Korea. A TruSeq stranded mRNA library (Illumina) was constructed and RNA sequencing performed (50 million reads) on a HiSeq 2000 instrument (Illumina) with 100 bp paired-end reads. Pre-processing of raw FASTQ reads, including removal of contaminating adapter sequences, was performed with scythe v0.994 (https://github.com/vsbuffalo/scythe). Paired-end human FASTQ files were aligned to the human genome (UCSC build hg19) using the spliced-read mapper TopHat (v2.0.9) (Kim et al., 2013), with reference gene annotations used to guide the alignment.

Raw gene counts were computed from TopHat-generated BAM files using featureCounts v1.4.5-p1 (Y. Liao, Smyth, & Shi, 2014), counting coding sequence (CDS) features of the UCSC hg19 gene annotation file (gtf). FeatureCounts output files were analysed using the R programming language (v.3.2.2). Briefly, raw counts were normalised by Trimmed Mean of M-values (TMM) correction (Robinson & Smyth, 2010; Robinson & Oshlack, 2010). Library size-normalised read counts (counts per million; CPM) were subjected to the voom function (variance modelling at the observation level) in limma v3.22.1 (Linear Models for Microarray Data) (Law & Smyth, 2014; Ritchie et al., 2015), with trend=TRUE for the eBayes function and correction for multiple testing (Benjamini-Hochberg false discovery rate; Q-value cut-off set at 0.05). Genes with an absolute log2 fold-change of at least 1.5 between PC3-GHSROS and PC3-vector (empty vector) cells were defined as differentially expressed. Although validation is not required, as RNA-seq gives very accurate measurements of relative expression across a broad dynamic range (Wang et al., 2008), selected differentially regulated genes were validated using quantitative reverse-transcription PCR (qRT-PCR) (see Chapter 2, section 2.3). Detailed gene annotations were obtained by querying Ensembl with the R/Bioconductor package ‘biomaRt’ (Durinck & Huber, 2009).
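The final filtering step described above (Benjamini-Hochberg correction followed by a fold-change cut-off) can be illustrated with a minimal Python sketch. This is not the limma/voom code used in this chapter, which was run in R: the moderated t-test itself is omitted, and the function names, gene labels and toy values below are hypothetical and chosen for illustration only.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values (Q-values) in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest so that the adjusted
    # values never increase as the raw P-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def call_differential(genes, log2_fc, pvalues, fc_cutoff=1.5, q_cutoff=0.05):
    """Keep genes with |log2 fold-change| >= fc_cutoff and Q <= q_cutoff."""
    qvalues = benjamini_hochberg(pvalues)
    return [
        (gene, fc, q)
        for gene, fc, q in zip(genes, log2_fc, qvalues)
        if abs(fc) >= fc_cutoff and q <= q_cutoff
    ]


# Toy input (hypothetical values, not data from this study).
genes = ["geneA", "geneB", "geneC", "geneD"]
log2_fc = [2.0, -1.6, 0.5, -3.0]
pvalues = [0.01, 0.04, 0.03, 0.005]
hits = call_differential(genes, log2_fc, pvalues)
```

In this toy example geneC is excluded because its fold-change falls below the cut-off, even though its adjusted P-value passes.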

4.2.2 Cell lines, prostate cancer PDX cell lines and cell culture

In this study, assays were performed using the PC3, DU145, LNCaP and DuCaP prostate cancer cell lines, the A549 lung cancer cell line and the ES-2 ovarian cancer cell line. All cell lines were propagated as described in the General Methods (Chapter 2, section 2.1).

4.2.3 Production of GHSROS overexpressing cancer cell lines

Full-length GHSROS was obtained by RT-PCR from A549 cell line mRNA and cloned into the pTargeT mammalian expression vector (Promega, Madison, WI) (Whiteside et al., 2013). PC3, DU145, and A549 cell lines were transfected with GHSROS-pTargeT DNA, or vector alone (empty vector), using Lipofectamine LTX (Invitrogen) according to the manufacturer’s instructions, and as described in Chapter 2, section 2.2.

4.2.4 Locked Nucleic Acid-Antisense Oligonucleotides (LNA-ASO)

Custom-made LNA-ASO sequences were designed and transfections were performed as previously described (Chapter 2, section 2.6).

4.2.5 RT-PCR of cell line mRNA

RNA was extracted and cDNA synthesised from all cell lines (as described in Chapter 2, section 2.3). RT-PCR was performed using cDNA from the normal prostate and from prostate cancer cell lines (as described in Chapter 2, section 2.3). No-template control RT-PCRs, in which the template was substituted with water, were also performed.

Table 4.1 Primers used in this study

Primer | Gene name | Primer sequence (5'-3')
GHSROS | growth hormone secretagogue receptor opposite strand | ACATTCAGCAAATCCAGTTAATGACA; CGACTGGAGCACGAGGACACTTGA
GHSROS-RT linker | growth hormone secretagogue receptor opposite strand | CGACTGGAGCACGAGGACACTGACAACAGAATTCACTACTTCCCCAAA
AR | androgen receptor | CTGGACACGACAACAACCAG; CAGATCAGGGGCGAAGTAGA
NTSR1 | neurotensin receptor 1 (high affinity) | Proprietary - QIAGEN QuantiTect Primer Assay QT00018494
TFF1 | trefoil factor 1 | Proprietary - QIAGEN QuantiTect Primer Assay QT00209608
TFF2 | trefoil factor 2 | Proprietary - QIAGEN QuantiTect Primer Assay QT00001785
MUC5B | mucin 5B, oligomeric mucus/gel-forming | Proprietary - QIAGEN QuantiTect Primer Assay QT01322818
MUC2 | mucin 2, oligomeric mucus/gel-forming | Proprietary - QIAGEN QuantiTect Primer Assay QT01004675
MUC3A | mucin 3A, cell surface associated | Proprietary - QIAGEN QuantiTect Primer Assay QT02503613
PPP2R2C | protein phosphatase 2, regulatory subunit B, gamma | Proprietary - QIAGEN QuantiTect Primer Assay QT01006383
RPL32 | ribosomal protein L32 (housekeeping gene) | CCCCTTGTGAAGCCCAAGA; GACTGGTGCCGGATGAACTT
ACTB | β-actin (housekeeping gene) | ACTCTTCCAGCCTTCCTTCCT; CAGTGATCTCCTTCTGCATCCT
GAPDH | glyceraldehyde-3-phosphate dehydrogenase (housekeeping gene) | AATCCCATCACCATCTTCCA; AAATGAGCCCCAGCCTTC
HPRT | hypoxanthine phosphoribosyltransferase 1 (housekeeping gene) | CAGTCAACGGGGGACATAAA; AGAGGTCCTTTTCACCAGCAA

4.2.6 Gene Ontology (GO) term analyses

GO term enrichment analyses were performed using DAVID (Database for Annotation, Visualization and Integrated Discovery) (Dennis et al., 2003). Briefly, to test for enrichment we interrogated DAVID’s GO FAT database with genes differentially expressed in PC3-GHSROS cells. The DAVID functional annotation tool categorises GO terms and calculates an ‘enrichment score’, or EASE score (a modified Fisher's exact test-derived P-value). Categories with smaller P-values (P ≤ 0.01) and larger fold-enrichments (≥ 2.0) were considered interesting and most likely to convey biological meaning (Huang da et al., 2009). Visualisation of the data as a bubble chart was performed in R using a custom script.
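To illustrate how an EASE-type score relates to a standard Fisher's exact test, the following Python sketch computes a one-sided Fisher P-value from a 2x2 table and then repeats the calculation with one gene removed from the list hits, which is how the EASE adjustment is generally described. The table layout, function names and toy counts are assumptions made for this example; DAVID's own implementation may differ in detail.

```python
from math import comb


def fisher_greater(a, b, c, d):
    """One-sided Fisher's exact P-value for the 2x2 table [[a, b], [c, d]].

    Returns the probability of observing at least `a` in the top-left cell
    given fixed row and column totals (hypergeometric upper tail).
    """
    row1 = a + b
    col1 = a + c
    total = a + b + c + d
    upper = min(row1, col1)
    numerator = sum(
        comb(col1, i) * comb(total - col1, row1 - i)
        for i in range(a, upper + 1)
    )
    return numerator / comb(total, row1)


def ease_score(list_hits, list_misses, background_hits, background_misses):
    """Conservative variant: remove one gene from the list hits, then test.

    Assumed layout: first row is the gene list (in category / not in
    category), second row is the background (in category / not in category).
    """
    return fisher_greater(
        max(list_hits - 1, 0), list_misses, background_hits, background_misses
    )


# Toy counts (hypothetical): 4 of 5 list genes fall in the category,
# compared with 1 of 4 background genes.
standard_p = fisher_greater(4, 1, 1, 3)
conservative_p = ease_score(4, 1, 1, 3)
```

Because one supporting gene is removed before testing, the conservative score is never smaller than the standard one-sided P-value, which penalises categories supported by very few genes.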

4.2.7 Oncomine Concepts analysis

To perform Oncomine meta-analysis, genes differentially expressed in PC3-GHSROS cells were separated into ‘over-expressed’ and ‘under-expressed’ gene sets. The Oncomine database (Rhodes et al., 2007; Rhodes et al., 2004) was interrogated by importing these genes, and enriched concepts were generated and ordered by P-values (calculated using Fisher’s exact test). Only data sets with an odds ratio ≥ 3.0 and a P-value ≤ 0.01 were retained.

4.2.8 The Cancer Genome Atlas analysis

The UCSC Xena Browser (Vivian et al., 2017) was used to obtain normalised gene expression values, represented as log2 (normalised counts+1), from the ‘TCGA TARGET GTeX’ data set consisting of ~12,000 tissue samples from 31 cancers ("The Genotype-Tissue Expression (GTEx) project," 2013). To obtain up-to-date overall survival (OS) and disease-free survival (DFS) information, we manually queried the cBioPortal for Cancer Genomics (Cerami et al., 2012).

We performed non-hierarchical k-means clustering (Hartigan & Wong, 1979) to partition patients into groups with similar gene expression patterns (Quackenbush, 2001). The following 34 genes obtained by Oncomine meta-analysis (Section 4.2.7) were employed: FBXL16, DIRAS1, CRIP2, TP53I11, MUC5B, TFF2, ZNF467, NUDT11, PARM1, CNTN1, ST6GAL1, UNC80, EYA1, MUM1L1, HSPB8, VWA5A, KLF9, DMD, CHRDL1, MCTP2, RYR2, ANGPT1, NUDT10, STOX2, CAPN6, EGF, RNASEL, AASS, HEPH, LRCH2, LCP1, CPA6, IFI16, and LRIG1. Clustering was performed using the kmeans function in the R package ‘stats’ with two clusters/groups (k=2), and the best cluster pair after 500 runs (nstart=500) was retained (Weichselbaum et al., 2008). Kaplan-Meier survival analysis (Rich et al., 2010) was performed with the R package ‘survival’, fitting survival curves (survfit) and computing log-rank P-values using the survdiff function with rho=0 (equivalent to the method employed by UCSC Xena; see https://goo.gl/4knf62). Survival curves were plotted when survival was significantly different between two groups (log-rank P ≤ 0.05). We considered groups (clusters) with fewer than 10 samples with a recorded event to be unreliable.
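The clustering strategy (k = 2, best of many random starts) can be sketched in Python as below. Note that R's kmeans function defaults to the Hartigan-Wong algorithm, whereas this sketch swaps in the simpler Lloyd iteration; it is intended only to show what retaining the best of nstart runs means, namely keeping the partition with the lowest total within-cluster sum of squares. Function names and toy data are hypothetical.

```python
import random


def _sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _lloyd(points, k, rng, max_iter=100):
    """One k-means run (Lloyd's algorithm) from k randomly chosen samples."""
    centres = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: _sq_dist(p, centres[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the run has converged
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the previous centre if a cluster empties
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    wss = sum(_sq_dist(p, centres[lab]) for p, lab in zip(points, labels))
    return labels, wss


def kmeans_best_of(points, k=2, nstart=500, seed=0):
    """Repeat k-means nstart times; keep the run with the lowest total WSS."""
    rng = random.Random(seed)
    best = None
    for _ in range(nstart):
        labels, wss = _lloyd(points, k, rng)
        if best is None or wss < best[1]:
            best = (labels, wss)
    return best


# Toy expression matrix (hypothetical): six samples, two genes each.
samples = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, wss = kmeans_best_of(samples, k=2, nstart=20, seed=1)
```

Each sample's label would then define the patient group used for the survival comparison.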

4.2.9 Performance of gene signature

We assessed the performance of the 34-gene signature by comparing it to 1,000,000 gene signatures of the same size, generated at random from the UCSC ‘TCGA TARGET GTeX’ data set (which contains 58,585 genes, including noncoding RNA genes) (Vivian et al., 2016), and stratified the TCGA-PRAD cohort by k-means clustering of gene expression alone. The 34-gene signature was also compared to 23 prostate cancer associated gene expression signatures reported in the literature. A scaled heat map (unsupervised hierarchical clustering by Euclidean distance) was generated in R using heatmap.3 (available at https://goo.gl/Yd9aTY).
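A comparison against random signatures of the same size can be sketched as follows. This is a hedged Python illustration, not the R code used here: the text above does not state which statistic was used to score each signature, so score_fn is a hypothetical placeholder that the reader would supply (for example, a function returning a survival test statistic after clustering on the given genes), and the add-one correction is a common convention rather than something taken from this chapter.

```python
import random


def empirical_rank(signature_score, all_genes, size, score_fn,
                   n_random=1000, seed=0):
    """Fraction of random signatures scoring at least as well as the real one.

    all_genes must be a list (or other sequence) of gene identifiers and
    score_fn a callable taking a list of genes and returning a number, where
    larger means better. One is added to the numerator and denominator so
    the estimate is never exactly zero.
    """
    rng = random.Random(seed)
    at_least_as_good = 0
    for _ in range(n_random):
        random_signature = rng.sample(all_genes, size)
        if score_fn(random_signature) >= signature_score:
            at_least_as_good += 1
    return (at_least_as_good + 1) / (n_random + 1)


# Toy usage with a placeholder scoring function (hypothetical).
toy_genes = [f"gene{i}" for i in range(200)]
toy_fraction = empirical_rank(
    signature_score=1.0,
    all_genes=toy_genes,
    size=34,
    score_fn=lambda signature: 0.0,  # every random signature scores worse
    n_random=99,
)
```

A small fraction indicates that few random gene sets of the same size perform as well as the signature under the chosen score.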

4.2.10 Statistical analyses

Data values were expressed as mean ± s.e.m. of at least two independent experiments and evaluated using Student’s t-test for unpaired samples, unless otherwise specified. Mean differences were considered significant when P ≤ 0.05. Q-values denote multiple testing correction (Benjamini-Hochberg) adjusted P-values (Benjamini & Hochberg, 1995). Normalised high-throughput gene expression data were analysed using LIMMA, employing a modified version of the Student’s t-test (moderated t-test) where the standard errors are reduced toward a common value using an empirical Bayesian model robust for data sets with few biological replicates (Ritchie et al., 2015). Statistical analyses were performed using GraphPad Prism v.6.01 software (GraphPad Software, Inc., San Diego, CA), or the R statistical programming language.

4.2.11 Code

Selected R code is available in a repository at: https://github.com/sciseim/GHSROS_MS

4.3 Results

4.3.1 RNA-seq analysis of GHSROS overexpressing PC3 cell line

Having established that GHSROS plays a role in regulating hallmarks of cancer, including cell proliferation and migration (Hanahan & Weinberg, 2011), we sought to determine its effects on gene expression in prostate cancer cell lines. To detect genes that are likely to mediate GHSROS function, we examined the transcriptomes of cultured PC3 cell lines.

High-throughput RNA-seq of PC3-GHSROS cells showed that approximately 400 genes were differentially expressed (168 upregulated, 232 downregulated; moderated t-test; cut-off set at log2 fold-change ± 1.5, Q ≤ 0.05; Supplementary Table 2) compared to empty vector control cells (Figure 4.1). Among the genes regulated by GHSROS, we observed an upregulation of cancer-associated genes implicated in prostate cancer signalling pathways, including the neurotensin receptor 1 (NTSR1; 4.0-fold, Q = 6.28 × 10⁻¹¹), mucin 5B, oligomeric mucus/gel-forming (MUC5B; 2.8-fold, Q = 9.16 × 10⁻⁸), and mucin 3A, cell surface associated (MUC3A; 3.7-fold, Q = 2.4 × 10⁻⁹), as well as trefoil factors 1 and 2 (TFF1; 3.6-fold, Q = 1.91 × 10⁻⁹, and TFF2; 3.4-fold, Q = 2.49 × 10⁻⁸). Among the downregulated genes, we observed significant repression of the androgen receptor (AR; -14.9-fold, moderated t-test Q = 6.6 × 10⁻⁸) and serine/threonine-protein phosphatase 2A 55 kDa regulatory subunit B gamma (PPP2R2C; -29.9-fold, Q = 3.4 × 10⁻¹⁰), which encodes a PP2A subunit and was the third most downregulated gene in PC3-GHSROS cells (Supplementary Table 2).

It is well established that AR reactivation is a key event in the re-initiation of tumour growth and cellular differentiation in a majority of metastatic CRPC patients (Ferraldeschi et al., 2015; Wyatt & Gleave, 2015). However, evidence is emerging that in a subset of patients who display resistance to AR-targeting therapies, inactivation of the tumour suppressor protein phosphatase 2A (PP2A) mediates CRPC by promoting AR-independent growth and survival (Bluemn et al., 2013; Gonzalez-Alonso et al., 2015). PPP2R2C was the third most downregulated gene in PC3-GHSROS cells (-29.9-fold, Q = 3.4 × 10⁻¹⁰; Supplementary Table 2). The expression of this gene is decreased in a subset of primary and metastatic prostate cancers resistant to androgen-deprivation therapy and associated with adverse disease outcomes (Bluemn et al., 2013). Interestingly, the AR transcript negative regulator, calcium/calmodulin-dependent protein kinase II inhibitor 1 (CAMK2N1; 9.9-fold, Q = 1.7 × 10⁻⁹), the plasma vitamin A binding protein (RBP4; 3.0-fold, Q = 3.2 × 10⁻⁹), and the carriers of the vitamin A metabolite retinoic acid (CRABP1 and CRABP2; 17.1-fold, Q = 3.0 × 10⁻⁹ and 5.3-fold, Q = 7.8 × 10⁻¹⁰, respectively) were upregulated in PC3-GHSROS cells (Figure 4.1 and Supplementary Table 2), suggesting that signalling pathways alternative to those of androgens are being transcriptionally activated in PC3-GHSROS cells.

Figure 4.1. GHSROS overexpression in the PC3 cell line upregulates cancer-associated genes related to advanced prostate cancer. Scatter plot visualisation of induced (red) or repressed (blue) genes identified by RNA sequencing. The threshold was set at a log2 fold-change of 1.5 and Q (Benjamini-Hochberg adjusted P-value) ≤ 0.05. CPM = counts per million. NTSR1 (neurotensin receptor 1), CRABP2 (cellular retinoic acid binding protein 2), CAMK2N1 (calcium/calmodulin dependent protein kinase II inhibitor 1), TFF1 (trefoil factor 1), MUC5B (mucin 5B, oligomeric mucus/gel-forming), TFF2 (trefoil factor 2), MUC3A (mucin 3A, cell surface associated), GHSROS (growth hormone secretagogue receptor opposite strand), CRABP1 (cellular retinoic acid binding protein 1), MUC2 (mucin 2, oligomeric mucus/gel-forming), PPP2R2C (protein phosphatase 2 regulatory subunit B gamma), PARM1 (prostate androgen-regulated mucin-like protein 1), AR (androgen receptor).

Differentially expressed genes of interest in PC3-GHSROS cells were explored by qRT-PCR in cell lines from a range of cancers (Figure 4.2, Supplementary Figure 1 and Supplementary Figure 2). In PC3-GHSROS cells this included significant upregulation of NTSR1 (6.0 fold ± 1.43, P = 0.0038) and TFF2 (4.0 fold ± 0.63, P = 0.0011), while there was a trend towards upregulation of MUC5B (4.3 fold ± 2.35, P = 0.07). LNA-ASO mediated GHSROS knockdown (<80%; Figure 4.2 and Supplementary Figure 2) in native PC3 cells over 48 hours (Chapter 3, section 3.14) reversed the expression of several genes differentially regulated upon forced GHSROS over-expression in the cell line (Figure 4.2 and Supplementary Figure 1). We observed a significant reduction in the expression of NTSR1 (RNV124; -2.14 fold ± 0.22, P = 0.0029 and RNV104L; -3.91 fold ± 0.18, P = 0.0002), MUC5B (RNV124; -1.47 fold ± 0.26, P = 0.0488 and RNV104L; -6.87 fold ± 0.11, P < 0.0001), and TFF2 (RNV124; -2.35 fold ± 0.17, P = 0.0044 and RNV104L; -1.35 fold ± 0.06, P = 0.0018), summarised in the normalised heatmap in Figure 4.2. In the ES-2 human serous ovarian cancer cell line we observed similar gene changes upon LNA-ASO treatment, including decreased NTSR1 (RNV124; -1.22 fold ± 0.46, P = 0.52 and RNV104L; -1.80 fold ±, P = < 0.5), MUC5B (RNV124; -0.65 fold ± 0.81, P = 0.32 and RNV104L; -0.55 fold ± 1.11, P = 0.26), and TFF2 (RNV124; -6.26 fold ± 0.01, P = 0.0001 and RNV104L; -8.6 fold ± 0.10, P = 0.0001). Forced overexpression of GHSROS in the A549 lung adenocarcinoma cell line, which we have previously shown to increase cell migration (Whiteside et al., 2013), increased the expression of NTSR1 (1.98 fold ± 0.19, P = 0.0031) and TFF2 (4.87 fold ± 1.33, P = 0.0117) (Figure 4.2 and Supplementary Figure 1). MUC5B expression levels showed larger variability and the result thus remains inconclusive (1.69 fold ± 1.146, P = 0.3365).
NTSR1 and TFF2 are both expressed in aggressive prostate tumours and cell lines, where they enhance tumour growth and progression (Gouyer et al., 2001; Hashimoto et al., 2015; Legrier et al., 2004; Vestergaard & Torring, 2006).

Consistently, forced overexpression or knockdown of GHSROS in prostate cancer cell lines (PC3, LNCaP, and DuCaP) reciprocally regulated the expression of endogenous AR and PPP2R2C (Figure 4.2, Supplementary Figure 1 and Supplementary Figure 3). The AR is also expressed in ovarian and lung cancer tissue and cell lines, where it has a functional role (Harlos et al., 2015; Zhu et al., 2016). Forced overexpression of GHSROS in the A549 cell line decreased AR (-3.25 fold ± 0.18, P = 0.0001) and PPP2R2C expression (-2.08 fold ± 0.06, P = 0.0005). The ES-2 ovarian cancer cell line does not express PPP2R2C; neither did ES-2 cells treated with GHSROS-targeting LNA-ASOs (Figure 4.2 and Supplementary Figure 2). Taken together, these data indicate a possible auto-regulatory feedback loop or shared negative regulation of these genes by GHSROS.

Figure 4.2. GHSROS overexpression in the PC3 cell line increases the expression of cancer-associated genes. Heat map of gene expression assessed by qRT-PCR in GHSROS overexpressing or silenced (knockdown) cell lines. Each row shows the relative expression level for a single gene, and each column shows the expression level of a single sample (biological replicate). Expression was normalised to RPL32 in each sample and expressed as fold-change relative to vector control (overexpression) or scrambled LNA-ASO control (knockdown). Fold changes were log2-transformed and are displayed in the heat map as the relative expression of a gene in a sample compared to all other samples (z-score).
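The scaling described in this legend (log2 transformation of fold changes followed by a per-gene z-score) can be sketched in Python as below. This is an illustrative sketch rather than the plotting code used for the figure; it assumes the sample standard deviation (as R's scale function uses), and the function name and toy values are hypothetical.

```python
import math
from statistics import mean, stdev


def row_z_scores(fold_changes):
    """Log2-transform one gene's fold changes and scale them to z-scores.

    Each value is expressed relative to the mean of that gene across all
    samples, in units of the sample standard deviation (assumed here).
    """
    log2_values = [math.log2(fc) for fc in fold_changes]
    centre = mean(log2_values)
    spread = stdev(log2_values)
    return [(value - centre) / spread for value in log2_values]


# Toy fold changes for a single gene across three samples (hypothetical).
z = row_z_scores([1.0, 2.0, 4.0])
```

Because the scaling is applied per gene, colours in such a heat map compare samples within a row and are not directly comparable between rows.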

4.3.2 Pathway analysis of RNA-seq gene network derived from GHSROS overexpressing PC3 cell line

To identify pathways and fundamental processes altered by GHSROS we performed gene ontology enrichment analysis using DAVID (Dennis et al., 2003). GO analysis with our gene lists (upregulated and downregulated genes), derived from the PC3-GHSROS cells, showed enrichment for cancer, cell motility, cell migration, and regulation of growth (Figure 4.3, Supplementary Table 3 and Supplementary Table 4). Processes including epithelial structure maintenance (GO:0010669), response to hormone stimulus (GO:0009725), steroid hormone stimulus (GO:0048545), estradiol stimulus (GO:0032355), response to hypoxia (GO:0001666) and response to drug (GO:0042493) were among the top-ranking gene ontologies (Figure 4.3).

Figure 4.3. GHSROS overexpression enriches for GO terms important for cancer progression. Bubble plot illustrating DAVID gene ontology (GO) enrichment of genes differentially expressed by PC3-GHSROS cells (compared to the vector control). Bubble size is proportional to the number of genes enriched per category; red nodes indicate upregulated GO terms and blue nodes indicate downregulated GO terms. The bubble plot was generated in RStudio (R version 3.3.1), with the image modified for aesthetics using Inkscape (v0.91).

4.3.3 GHSROS regulated genes comprise a unique gene signature which characterises advanced prostate cancer

In order to explore the clinical relationship between GHSROS-regulated genes and prostate cancer, and given that GHSROS is not readily detectable by high-throughput sequencing and array technologies, we queried the 400 genes differentially expressed in PC3-GHSROS cells using Oncomine concept map analysis (Rhodes et al., 2007). The enriched Oncomine concepts found to overlap the GHSROS RNA-seq gene signature included poor clinical outcome and metastatic progression (Figure 4.4 and Supplementary Table 5), with a large overlap with the Grasso (Grasso et al., 2012) and Taylor (Taylor et al., 2010) metastatic prostate cancer datasets. These well-annotated clinical prostate cancer data sets, the Grasso dataset (with 59 localised and 35 metastatic prostate tumours) and the Taylor dataset (with 123 localised and 27 metastatic prostate tumours), were further interrogated to identify potential driver genes.

We observed a core set of 34 of the 400 genes differentially expressed in PC3-GHSROS cells that were also differentially expressed in both clinical data sets (moderated t-test Q ≤ 0.25; termed the ‘34-gene signature’; Figure 4.5A, Supplementary Table 5 and Supplementary Table 6). Interestingly, TFF2 was among these 34 genes, indicating that it may be an important tumour driver (Figure 4.5A). Heat maps of the 34-gene signature in the Grasso (Figure 4.5B), Taylor (Figure 4.5C) and TCGA (Figure 4.5D) datasets confirmed distinct expression of our upregulated genes in metastatic tumours and a high level of correlation with our downregulated genes in primary tumours.


Figure 4.4. GHSROS characterises advanced prostate cancer. Network analysis of genes associated with genes differentially expressed by PC3-GHSROS cells in our RNA-seq dataset (FDR-corrected P-value ≤ 0.05). Generated by Oncomine analysis and visualised using Cytoscape. Modified for aesthetic purposes in Inkscape 0.91. Node size (gene overlap) reflects the number of genes per molecular concept which were derived through Oncomine concepts analysis. Edges connect nodes with significant enrichment (odds ratio ≥ 3.0).


Figure 4.5. Identification of a 34-gene signature associated with disease-free survival. (A) Venn diagram of differentially expressed genes (DEG) in PC3-GHSROS cells (compared to vector control) and Grasso and Taylor Oncomine data sets (metastatic compared to primary prostate tumours). 34 overlapping genes are indicated in a text box. (B) Heat map of 34-gene signature in the Grasso data set. Normalised to depict relative values within rows (samples) with high (red) and low expression (blue). Vertical bars show grouping as primary (green) or metastatic (pink) tumours. (C) Heat map of 34-gene signature in the Taylor data set. (D) Heat map of the 34-gene signature in 489 primary prostate carcinoma tumours (TCGA-PRAD). Vertical bars show patient grouping by k-means clustering (cluster 1, n=334, red; cluster 2, n=155, black) and Gleason score (high-grade ≥ 8, purple). Horizontal top bar indicates differential gene expression in the Grasso and Taylor data sets: high (pink) or low expression (green) in metastatic tumours.


4.3.4 Association of GHSROS-regulated 34-gene signature with disease-free survival (DFS).

Most patients with metastatic prostate cancer eventually die of the disease. Given the expression pattern of the 34-gene signature in metastatic tumours, we hypothesised that the signature could be used to identify patients with adverse outcomes. To test this hypothesis, we assessed the expression pattern of the signature in a high-throughput RNA-seq data set generated by The Cancer Genome Atlas (TCGA) consortium. This dataset, TCGA-Prostate Adenocarcinoma (PRAD), contains tumours from patients with moderate- (~39% Gleason 6 and 3+4) and high- (~61% Gleason 4+3 and Gleason 8-10) grade primary prostate carcinoma (Cancer Genome Atlas Research et al., 2013). Unsupervised k-means clustering of the 34-gene set was employed to divide the cohort into two groups based on gene expression alone. A heat map revealed that k-means clustering separated patients into groups of either low or high expression of the 34 genes (Figure 4.5D). Cluster 2 largely mirrored the expression pattern of metastatic prostate cancer. Encouraged by these results, we tested whether prostate carcinoma patients in cluster 2 showed evidence of adverse disease outcome. As overall survival data for the TCGA-PRAD data set were available for only a few patients, we assessed disease-free survival (overall relapse). Kaplan-Meier survival analysis indicated that patients in cluster 2 showed a worse outcome (P = 1.9 × 10⁻⁴, log-rank test) than patients in cluster 1 (Figure 4.6). The relapse rate was approximately twice as great in cluster 2 (cluster 1: 334 patients, ~15% relapsed; cluster 2: 155 patients, ~28% relapsed).
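The relapse proportions quoted above can be recomputed from the event counts listed for the PRAD cohort in Table 4.3 (48 relapses among 334 patients in cluster 1; 43 among 155 patients in cluster 2). The short Python snippet below is only an illustrative check of that arithmetic and was not part of the original analysis; the variable names are our own.

```python
# Relapse (DFS event) counts and cohort sizes per k-means cluster,
# taken from the PRAD row of Table 4.3.
clusters = {
    "cluster 1": {"events": 48, "patients": 334},
    "cluster 2": {"events": 43, "patients": 155},
}

# Proportion of patients with a recorded relapse in each cluster.
relapse_rate = {
    name: c["events"] / c["patients"] for name, c in clusters.items()
}

# Ratio of the two proportions (cluster 2 relative to cluster 1).
rate_ratio = relapse_rate["cluster 2"] / relapse_rate["cluster 1"]

for name, rate in relapse_rate.items():
    print(f"{name}: {rate:.1%} relapsed")
print(f"cluster 2 / cluster 1 = {rate_ratio:.2f}")
```

The ratio is a crude comparison of proportions; unlike the log-rank test, it ignores follow-up time and censoring.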


Figure 4.6. Kaplan-Meier analyses of 34-gene signature in the TCGA-PRAD cohort. Primary prostate carcinoma patients were stratified by k-means clustering. Time denotes years.

To assess whether the 34-gene signature confers a genuine signal in the TCGA-PRAD data set, we compared it to random and known signatures. Compared to a million random signatures of the same size, the 34-gene signature (Figure 4.7A) performed in the 97th percentile, with 2.7% of random signatures demonstrating an equal or lesser P-value (empirical P = 0.026) (Figure 4.7B). Of 23 prostate cancer gene expression signatures previously reported in the literature (Table 4.2), five had equal or smaller P-values, placing our 34-gene signature in the 78th percentile (empirical P = 0.22) (Figure 4.7C, Table 4.2). This analysis suggests that expression of the 34 genes in our signature is indeed associated with a poor prognosis in prostate cancer. These data show that a subset of 34 genes differentially expressed in PC3-GHSROS cells correlate with metastatic tumour expression, and are differentially expressed in prostate tumours versus normal prostate.
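The empirical P-value used in this comparison is the fraction of reference signatures whose log-rank P-value is equal to or smaller than that of the 34-gene signature. The Python sketch below recomputes it for the 23 published signatures using the P-values listed in Table 4.2. It is an illustration rather than the original analysis code; the function name is hypothetical, and the two smallest P-values are wrapped in the extracted table, so they are entered here as approximately 1.5 × 10⁻⁷ and 1.4 × 10⁻⁷ (an assumption that does not affect the count).

```python
# Log-rank P-values (DFS, TCGA-PRAD) of the 23 published signatures in Table 4.2.
published_p = [
    0.0757, 0.00105, 0.0367, 0.217, 0.332, 0.0783, 0.03, 0.222, 0.968,
    0.0000105, 0.00192, 0.557, 0.193, 0.000155, 0.00179, 0.0161, 0.042,
    0.000172, 0.00059, 1.53e-7, 0.144, 0.000737, 1.37e-7,
]

# Log-rank P-value of the 34-gene signature in the same cohort.
signature_p = 1.9e-4


def empirical_p(observed, reference):
    """Fraction of reference P-values that are <= the observed P-value."""
    as_good_or_better = sum(1 for p in reference if p <= observed)
    return as_good_or_better / len(reference)


n_better = sum(1 for p in published_p if p <= signature_p)
emp_p = empirical_p(signature_p, published_p)
percentile = 100 * (1 - emp_p)

print(f"{n_better} of {len(published_p)} signatures perform at least as well")
print(f"empirical P = {emp_p:.2f}, percentile = {percentile:.0f}")
```

The same function applies to the random-signature comparison, with the list of one million random-signature P-values as the reference.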


Figure 4.7. Performance of the 34-gene signature. (A) List of the genes in the 34-gene signature. (B) Assessment of the 34-gene signature in the 489-patient TCGA-PRAD data set compared to 1,000,000 random signatures of the same size. The density plot shows the distribution of random signature survival (log-rank) P-values for DFS in 489 patients separated by k-means clustering of gene expression alone. The x-axis denotes –log10 log-rank P-values, and the y-axis their frequency. The 34-gene signature is shown as a solid blue line, and the dotted blue line indicates the frequency of signatures with a log-rank P-value of 0.05. (C) Assessment of the 34-gene signature in the 489-patient TCGA-PRAD data set compared to 23 prostate cancer associated gene expression signatures in the literature. Annotated as in (B). The pink line indicates published gene signatures with a DFS log-rank P ≤ 0.05.


Table 4.2. Selected prostate cancer associated gene expression signatures from the literature. Patients were stratified into groups by k-means clustering of gene expression (k=2). Disease-free survival (DFS) was assessed in the TCGA-PRAD data set (n=489). A log-rank test was used to assign statistical significance, with a P-value below 0.05 considered significant.

| Title | Previous study outcome; comments | No. genes | Log-rank P (TCGA-PRAD) | Citation |
|---|---|---|---|---|
| Prensner-SChLAP1 | Biochemical recurrence, clinical progression to systemic disease, prostate cancer-specific mortality, overall survival, Gleason score | 1 | 0.0757 | (Prensner et al., 2013; Prensner & Zhao, et al., 2014) |
| Lapointe-2004 | Biochemical recurrence | 2 | 0.00105 | (Lapointe et al., 2004) |
| Peng-2014 | Overall survival (OS) and prostate cancer specific survival | 3 | 0.0367 | (Peng et al., 2014) |
| Thompson-2012 | Biochemical recurrence | 3 | 0.217 | (Thompson et al., 2012) |
| Singh-2002 | Biochemical recurrence | 5 | 0.332 | (Singh et al., 2002) |
| Glinsky-2004 | Signature 1. Biochemical recurrence | 5 | 0.0783 | (Glinsky & Gerald, 2004) |
| Rajan-2014 | Disease free survival | 7 | 0.03 | (Rajan et al., 2014) |
| Chen-2012 | Biochemical recurrence | 7 | 0.222 | (Chen et al., 2012) |
| Varambally-2005 | Biochemical recurrence | 9 | 0.968 | (Varambally et al., 2005) |
| Glinsky-2005 | Biochemical recurrence, unfavourable response to therapy, distant metastasis | 11 | 0.0000105 | (Glinsky & Glinskii, 2005) |
| Li-2013 | Postoperative relapse | 12 | 0.00192 | (Li et al., 2013) |
| Bismar-2006 | Biochemical recurrence | 12 | 0.557 | (Bismar et al., 2006) |
| Yu-2007 | Cancer progression and poor clinical outcome | 14 | 0.193 | (Yu et al., 2007) |
| Bibikova-2007 | Biochemical recurrence, Gleason grade | 16 | 0.000155 | (Bibikova et al., 2007) |
| Nakagawa-2008 | Systemic progression after biochemical recurrence | 17 | 0.00179 | (Nakagawa et al., 2008) |
| Ramaswamy-2003 | Metastasis and poor clinical outcome | 17 | 0.0161 | (Ramaswamy & Golub, 2003) |
| Zhao-2016 | Distant metastasis after postoperative radiotherapy | 24 | 0.042 | (Zhao et al., 2016) |
| Wu-2013 | Biochemical recurrence and metastatic disease (32-genes in paper, but 4 input normalisation genes) | 29 | 0.000172 | (Wu et al., 2013) |
| Sinnot-2016 | Predictive of lethal prostate cancer | 30 | 0.00059 | (Sinnott et al., 2016) |
| Cuzick-2011 | Biochemical recurrence and time to death from prostate cancer | 31 | 0.000000153 | (Cuzick et al., 2011) |
| Chen-2011 | Overall survival (OS) | 42 | 0.144 | (Chen & Lussier, 2011) |
| Penney-2011 | Predictive of lethal prostate cancer | 157 | 0.000737 | (Penney et al., 2011) |
| Saal-2007 | Poor prognosis. Distant disease free survival. Some missing gene annotations. | 185 | 0.000000137 | (Saal et al., 2007) |


Finally, we explored the 34-gene signature in other cancer data sets available from the TCGA to determine whether it is associated with prognosis in other cancers (Table 4.3). As in our previous analysis, patients were stratified by gene expression alone and associations with P ≤ 0.05 (log-rank test) were considered significant. The signature was associated with disease-free survival in bladder cancer (P = 3.2 × 10⁻³), glioblastoma (P = 4.0 × 10⁻²), melanoma (P = 1.7 × 10⁻³), and stomach cancer (P = 1.4 × 10⁻⁴) (Table 4.3). We observed a strong trend towards an association between our gene signature and lung adenocarcinoma (P = 5.5 × 10⁻²), but not lung squamous cell carcinoma (P = 5.0 × 10⁻¹).

Further to that, overall survival was associated with the 34-gene set in bladder cancer (P = 3.2 × 10⁻³), melanoma (P = 2.7 × 10⁻³), and stomach cancer (P = 3.0 × 10⁻⁴) (Supplementary Table 8).


Table 4.3. Disease-free survival (DFS) analysis of TCGA patients using the 34-gene signature. Patients were stratified into groups by k-means clustering of gene expression (k=2). Log-rank test was used to assign statistical significance, where blue text indicates cohorts with a Log-rank P ≤ 0.05. Log-rank P-values calculated from cohorts where less than 10 events (here: relapse) per group (in cluster 1 and/or 2) were recorded are indicated in red. NA denotes not available due to missing information.

| Cohort | Total (n) | Cancer type | Cluster 1: n | Cluster 1: number of events | Cluster 1: median (days) | Cluster 1: mean (days ± s.e.m) | Cluster 2: n | Cluster 2: number of events | Cluster 2: median (days) | Cluster 2: mean (days ± s.e.m) | Log-rank P |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ACC | 76 | Adrenocortical Cancer | 38 | 0 | NA | NA | 38 | 0 | NA | NA | 1.00000 |
| BLCA | 319 | Bladder Cancer | 175 | 64 | 370 | 602±90 | 144 | 77 | 321 | 493±64 | 0.00318 |
| BRCA | 1002 | Breast Invasive Carcinoma | 543 | 57 | 993 | 608±122 | 459 | 55 | 789 | 529±94 | 0.12632 |
| CESC | 264 | Cervical Cancer | 175 | 27 | 412 | 114±43 | 89 | 21 | 396 | 311±69 | 0.07222 |
| CHOL | 31 | Bile Duct Cancer | 10 | 4 | 83 | 1205±147 | 21 | 12 | 251 | 1150±147 | 0.85311 |
| COADREAD | 332 | Colon and Rectal Cancer | 180 | 54 | 480 | 698±91 | 152 | 38 | 560 | 676±93 | 0.09752 |
| DLBC | 42 | Large B-cell Lymphoma | 9 | 1 | 3666 | 3666±NA | 33 | 11 | 290 | 1184±590 | 0.21452 |
| ESCA | 139 | Esophageal Cancer | 64 | 35 | 362 | 498±64 | 75 | 34 | 215 | 223±25 | 0.38058 |
| GBM | 119 | Glioblastoma | 66 | 52 | 190 | 319±58 | 53 | 47 | 152 | 222±32 | 0.03984 |
| HNSC | 391 | Head and Neck Cancer | 191 | 70 | 392 | 579±65 | 200 | 74 | 281 | 448±55 | 0.78423 |
| KICH | 62 | Kidney Chromophobe | 8 | 2 | 512 | 512±364 | 54 | 7 | 616 | 1280±605 | 0.12199 |
| KIRC | 62 | Kidney Clear Cell Carcinoma | 8 | 2 | 512 | 512±364 | 54 | 7 | 616 | 1280±605 | 0.12199 |
| KIRP | 268 | Kidney Papillary Cell Carcinoma | 142 | 33 | 340 | 717±143 | 126 | 20 | 431 | 759±205 | 0.17663 |
| LAML | 58 | Acute Myeloid Leukemia | 24 | 0 | NA | NA | 34 | 0 | NA | NA | 1.00000 |
| LGG | 484 | Lower Grade Glioma | 301 | 118 | 503 | 815±80 | 183 | 58 | 687 | 1004±122 | 0.05444 |
| LIHC | 320 | Liver Cancer | 180 | 94 | 336 | 537±58 | 140 | 81 | 258 | 414±48 | 0.06999 |
| LUAD | 434 | Lung Adenocarcinoma | 204 | 92 | 434 | 586±74 | 230 | 93 | 500 | 723±74 | 0.05503 |
| LUSC | 373 | Lung Squamous Cell Carcinoma | 206 | 70 | 529 | 755±83 | 167 | 59 | 420 | 636±94 | 0.50413 |
| MESO | 86 | Mesothelioma | 41 | 0 | NA | NA | 45 | 0 | NA | NA | 1.00000 |
| OV | 357 | Ovarian Cancer | 205 | 155 | 462 | 576±34 | 152 | 112 | 447 | 566±37 | 0.75955 |
| PAAD | 139 | Pancreatic Cancer | 14 | 2 | 163 | 163±34 | 125 | 80 | 363 | 427±38 | 0.00004 |
| PCPG | 181 | Pheochromocytoma and Paraganglioma | 94 | 0 | NA | NA | 87 | 0 | NA | NA | 1.00000 |
| PRAD | 489 | Prostate Cancer | 334 | 48 | 671 | 832±94 | 155 | 43 | 648 | 750±94 | 0.00019 |
| SARC | 234 | Sarcoma | 128 | 69 | 378 | 529±59 | 106 | 59 | 742 | 927±105 | 0.34843 |
| SKCM | 409 | Melanoma | 192 | 119 | 779 | 1367±149 | 217 | 140 | 1114 | 1790±164 | 0.00168 |
| STAD | 324 | Stomach Cancer | 165 | 70 | 323 | 367±32 | 159 | 39 | 296 | 423±67 | 0.00014 |
| TGCT | 134 | Testicular Cancer | 32 | 12 | 423 | 1167±467 | 102 | 24 | 667 | 1548±407 | 0.23833 |
| THCA | 497 | Thyroid Cancer | 241 | 29 | 483 | 563±78 | 256 | 20 | 464 | 652±118 | 0.06850 |
| UCEC | 167 | Endometrioid Cancer | 76 | 18 | 452 | 580±106 | 91 | 20 | 662 | 753±122 | 0.63195 |
| UCS | 56 | Uterine Carcinosarcoma | 22 | 0 | NA | NA | 34 | 0 | NA | NA | 1.00000 |
| UVM | 60 | Ocular Melanomas | 26 | 8 | 400 | 499±163 | 34 | 5 | 1031 | 782±235 | 0.02394 |


4.4 Discussion

In the last decade, the repertoire of human lncRNAs in cancer has rapidly expanded (Deveson et al., 2017). In general, approximately 50% of human lncRNAs in the GENCODE catalogue (GENCODE is a project aimed at identifying all genes in the mouse and human genomes) were identified in the most recent five years, with approximately twice as many lncRNAs in GENCODE v24 as in GENCODE v7 (Derrien et al., 2012). It is now appreciated that many lncRNAs are equivalent to classical oncogenes or tumour suppressors and can drive similar transcriptional programs in diverse cancer types (Han & Chang, 2015; Hu et al., 2017). However, the functional roles or transcriptional regulation caused by aberrant expression of many lncRNAs remain largely elusive (Quek et al., 2015). Indeed, it has been reported that less than 1% of identified human lncRNAs have been experimentally investigated, including those involved in cancer pathways (Quek et al., 2015).

In this chapter, we investigated the transcriptome of cultured PC3 prostate cancer cells altered by an upregulation of the lncRNA GHSROS. In Chapter 3 we demonstrated that GHSROS is upregulated in prostate cancer and that ectopic overexpression in the PC3 and DU145 cell lines led to increased cell proliferation, migration and cell survival in vitro, and tumour growth in vivo. Our RNA-seq analysis suggests that GHSROS has a role similar to that of other notable oncogenic lncRNAs, including SChLAP1 and HOTAIR, which mediate pro-oncogenic gene networks in prostate cancer cell lines (Aiello et al., 2016; Prensner et al., 2013; Zhang et al., 2015). Many lncRNAs are dysregulated in tumours and, given their diverse gene regulatory mechanisms, are likely to elicit responses in downstream gene networks (Hu et al., 2017).

We determined that GHSROS upregulates a number of genes important for prostate cancer growth and survival, supporting our initial functional studies in Chapter 3. Of note, we observed a robust increase in NTSR1 gene expression. NTSR1 is expressed in aggressive prostate tumours and cell lines and enhances tumour growth and progression (Hashimoto et al., 2015). In lung cancer, NTSR1 and GHSR1b are co-expressed (Takahashi et al., 2006). These GPCRs heterodimerise to form a functional neuromedin U receptor, resulting in increased cell growth (Takahashi et al., 2006). As GHSROS regulates expression of both NTSR1 and GHSR1b, it may facilitate this interaction. We could not detect GHSR1b in this study; however, future studies focusing on protein expression and function (including western blots and immunohistochemistry), monitoring these interactions (using Bioluminescence Resonance Energy Transfer or similar techniques), and implementing CRISPR-generated NTSR1 knockout PC3 cell lines (currently ongoing in our laboratory) will be useful in teasing this interaction apart and determining its role in prostate cancer progression.


Other noteworthy genes upregulated by GHSROS include MUC5B and TFF2, genes highly expressed in advanced tumours and with distinct roles in prostate cancer, including metastasis (Legrier et al., 2004; Vestergaard et al., 2006). Trefoil factors (TFFs) are small proteins associated with mucin glycoproteins and play roles in mucosal injury through anti-inflammatory signalling and inhibition of apoptosis (Pandey et al., 2014; Radiloff et al., 2011). Their expression is increased in castration-resistant prostate cancer (CRPC) and may facilitate the acquisition of hormone independence (Legrier et al., 2004; Vestergaard et al., 2006). It would be interesting to assess whether GHSROS facilitates mucin production and trefoil factor changes, in particular in colon cancer, where these factors are well-established drivers of colorectal carcinoma progression (Gouyer et al., 2001). GHSROS, however, is unlikely to be a significant driver of colorectal cancer, given that our initial exon array analysis of ~4,000 samples (Chapter 3) only demonstrated active GHSROS transcription in a single dataset for colon cancer. Additionally, GHSROS expression levels were downregulated in colorectal cancer compared to primary colon in our cDNA array (Chapter 3; assessed by qRT-PCR). However, given the limited sample numbers in our analysis, it would be useful to screen larger, more stratified cohorts of gastric and colorectal cancer types.

The core 34 genes we identified as common between our GHSROS RNA-seq derived gene list and clinical datasets (34-gene signature) suggest that GHSROS overexpression mediates a downstream gene network which characterises advanced cancer. Several genes in the signature have previously been associated with increased risk of prostate cancer progression. These include NUDT10 and NUDT11, which both encode the enzyme diphosphoinositol polyphosphate phosphohydrolase 3-alpha that removes deleterious metabolites (Cheng et al., 2010), and the tumour suppressor KLF9 (Shen et al., 2014; Stelloo et al., 2015). EYA transcriptional coactivator and phosphatase 1 (EYA1) ranked 3rd and cysteine rich protein 2 (CRIP2) ranked 11th in a 6,100-gene custom microarray study that identified a 157-gene expression signature distinguishing low (≤ 6) from high (≥ 8) Gleason score tumours and predicted lethal disease in men with Gleason 7 tumours (Penney et al., 2011). EYA1 is a transcriptional coactivator and a candidate driver gene of aggressive prostate cancer (Brennan et al., 2004). A very recent study interrogated the Grasso microarray data set (Grasso et al., 2012) to predict ‘master regulators’, transcription factors that modulate the differential expression of genes in metastatic, castration-resistant prostate cancer (CRPC) (Drake et al., 2016). Of the 74 transcription factors identified, two (NEO1 and AR) were also differentially expressed in PC3-GHSROS cells. A peculiar exception was the androgen receptor (AR), whose expression in PC3-GHSROS cells was opposite to that in the 27 CRPC patients in the Grasso data set. Interestingly, its negative regulator, Calcium/Calmodulin dependent Protein Kinase II Inhibitor 1 (Wang et al., 2014), CRABP1 (the plasma vitamin A binding protein), and CRABP2 (the carriers of the vitamin A metabolite retinoic acid) were upregulated in PC3-GHSROS cells. 
Retinoic acid signalling has been associated with suppressed androgen metabolism and modulation of androgen receptor (AR) expression (Wang et al., 2016). We can speculate that GHSROS may regulate an interaction with, or adaptation to, the AR-mediated gene pathway. Notably, HOTAIR increases AR chromatin targeting and enhances the AR-mediated gene program, driving androgen-independent AR activation and promotion of CRPC (Zhang et al., 2015). Androgen-independent AR activation, like that facilitated by the lncRNA HOTAIR, is a hallmark of CRPC (Zhang et al., 2015). It remains to be elucidated, however, whether GHSROS plays a role in the acquisition of CRPC in prostate cancer.

We speculate that the functional effects and gene expression profile regulated by GHSROS in prostate cancer are not restricted to this cancer alone, and that GHSROS belongs to a growing list of long noncoding RNAs that function as bona fide oncogenes. GHSROS expression alone, or the 34-gene signature, could therefore be developed further as a potential prognostic indicator for disease-free or overall survival in a number of cancers, including prostate cancer. Taken together, we demonstrate that GHSROS is able to act in trans to regulate a number of genes, including several transcription factors, with established roles in prostate cancer. GHSROS has widespread effects on gene expression in cancer cells, including genes associated with metastasis and poor prognosis; however, the precise mechanism by which it reprograms gene expression requires further biochemical analysis.

In summary, we have shown that forced expression of the lncRNA GHSROS promotes a unique gene expression signature that characterises advanced prostate cancer. Our signature of 34 GHSROS-regulated genes supports our previous observations indicating that this lncRNA regulates cell migration and tumour growth. The results of transcriptome profiling of GHSROS-overexpressing PC3 cells, employing RNA-seq, qRT-PCR gene expression validations and in silico informatics, demonstrate that GHSROS promotes a gene signature which supports metastasis and predicts adverse disease outcomes in prostate cancer. We aim to extend these studies into other cell lines in order to fully understand how GHSROS may contribute to prostate cancer tumour growth and progression.


Chapter 5

The lncRNA GHSROS facilitates androgen receptor-independent tumour growth and mediates resistance to docetaxel in prostate cancer


5.1 Introduction

In the previous chapters of this thesis, we have demonstrated that GHSROS is expressed at higher levels in high Gleason score prostate tumours, and that it influences a number of hallmarks of prostate cancer when overexpressed. GHSROS regulates cell migration in both lung (Whiteside et al., 2013) and prostate cancer cell lines (Chapter 3). In prostate cancer, GHSROS overexpression significantly regulates tumour growth, cell proliferation and cell adhesion to collagen type IV (Chapter 3). GHSROS induces significant changes in cancer gene expression in the PC3 cancer cell line, favouring a transcriptome that characterises prostate cancer progression (GHSROS-regulated 34-gene signature; Chapter 4). Additionally, GHSROS reduced expression of the AR, a critical enhancer of prostate cancer cell growth and differentiation, in the androgen-independent PC3 cell line. LncRNAs can act to reprogram gene expression pathways within cancer cells in order to evade stress-induced cell death and enhance cell division. We, therefore, aimed to investigate the role of GHSROS in an androgen-sensitive cell line in order to provide additional insights into the function of GHSROS in prostate cancer. There is compelling evidence that lncRNAs drive similar transcriptional programs in diverse cancer types (Huarte, 2015). The lncRNA HOTAIR drives androgen-independent AR activity and CRPC progression by preventing AR protein degradation and mediating the AR-transcriptional program in the absence of androgens (Zhang et al., 2015). Indeed, prostate cancers which grow in the absence of this AR signalling axis may become recalcitrant to conventional androgen ablation or AR antagonists, and inevitably develop into CRPC (Chakravarty et al., 2014; Zhang et al., 2015). NEAT1 expression in prostate cancer generates a cancer-favourable transcriptome independent of the AR (Chakravarty et al., 2014). Instead, NEAT1 is induced by ERα signalling, allowing tumours to bypass the AR and develop into aggressive CRPC (Chakravarty et al., 2014). This alternative mechanism also provides a degree of resistance to the anti-androgens bicalutamide and enzalutamide (Chakravarty et al., 2014).

In this study, we investigate the role of GHSROS in LNCaP xenograft tumour growth and the genes regulated by GHSROS in these tumours. By comparing genes regulated by GHSROS in LNCaP cells to those identified in the PC3 cell line, we aim to identify GHSROS-regulated genes that are common to androgen-sensitive and androgen-independent prostate cancer and are likely to be key for androgen independence. The importance of GHSROS in prostate cancer resistance to commonly used anti-androgens and chemotherapeutics was also investigated in this study.


5.2 Materials and Methods

General Materials and Methods are outlined in detail in Chapter 2. Experimental procedures which are specific to this chapter are described below.

5.2.1 Cell culture and drug treatments

In this study, assays were performed using the PC3 and LNCaP prostate cancer cell lines (Chapter 2; section 2.1). The PC3, DU145, LNCaP and DUCaP prostate cell lines, the A549 lung cancer cell line and the ES-2 ovarian cancer cell line were propagated as described in the General Methods, Chapter 2.1. LNCaP cells were treated with 10 µM of the AR signalling inhibitor enzalutamide (ENZ; Selleck Chemicals, Houston, TX, USA) for 46 hours. In separate experiments, LNCaP cells were treated with 5 nM of the chemotherapeutic taxane docetaxel (DTX; Sigma Aldrich, St. Louis, MO, USA) for 96 hours (functional assays) or with 1-20 nM for 48 hours (qRT-PCR studies), with dimethyl sulfoxide (DMSO) as vehicle (Sigma Aldrich, St. Louis, MO, USA).

5.2.2 Production of GHSROS overexpressing cancer cell lines

Full-length GHSROS transcript was generated by RT-PCR from A549 cell line mRNA and cloned into the pTargeT mammalian expression vector (Promega, Madison, WI) (Whiteside et al., 2013). The PC3 cell line was transfected with GHSROS-pTargeT DNA, or vector alone (empty vector), using Lipofectamine LTX (Invitrogen) according to the manufacturer’s instructions (Chapter 2). Briefly, cells were incubated for 24 hours in LTX and selected with geneticin (100-1500 µg/mL G418, Invitrogen). As LNCaP prostate cancer cells were difficult to transfect using lipid-mediated transfection, we employed lentiviral transduction. Briefly, pReceiver-Lv105 vectors expressing full-length GHSROS, or empty control vectors, were obtained from GeneCopoeia (Rockville, MD) and used following the methodology described in Chapter 2.2.

5.2.3 Cell Viability Assay

LNCaP and PC3 vector or GHSROS-overexpressing cells (5000 cells/well) were seeded in 96-well plates (BD Biosciences) and propagated overnight in complete medium. LNCaP cells were treated with standard doses of test compounds in both charcoal stripped FBS (CSS) and 2% FBS. PC3 cells were treated with increasing doses of docetaxel in 2% FBS. After a 96-hour period, cell viability was measured using a WST-1 cell proliferation assay (Roche, Nonnenwald, Penzberg, Germany) according to the manufacturer’s instructions. All viability experiments were performed independently three times, with 4 replicates each. Methods are described in Chapter 2.4 and 2.9.

5.2.4 Cell Migration assays

LNCaP cells are incompatible with the xCELLigence migration assay; therefore, standard transwell assays were used for LNCaP migration assays. Briefly, 6 × 10⁵ cells were suspended in serum-free medium and added to the upper chamber of a polycarbonate membrane (8 µm pore size; BD Biosciences). Cells in 12-well plates were allowed to migrate for 24 h in response to a chemoattractant (10% FBS) in the lower chamber. After 24 h, cells remaining in the upper chamber were removed. Cells that had migrated to the lower surface of the membrane were fixed with methanol (100%) and stained with 1% crystal violet. Acetic acid (10%, v/v) was used to extract the crystal violet and absorbance was measured at 595 nm. Each experiment consisted of three replicates and was repeated independently three times.

5.2.5 Mouse subcutaneous in vivo xenograft models

All mouse studies were carried out with approval from the University of Queensland and the Queensland University of Technology Animal Ethics Committees (QUT/TRI/328/16). LNCaP-vector and LNCaP-GHSROS cell lines were injected subcutaneously into the flank of 4-5-week-old male NSG mice (Shultz, Lyons et al., 2005) (obtained from Animal Resource Centre, Murdoch, WA, Australia) as described in detail in Chapter 2.7.

5.2.6 Histology and immunohistochemistry

For histological analysis, cryosections were prepared, sectioned, fixed and immunohistochemistry performed using antibodies for the proliferation marker Ki67 (rabbit anti-human Ki67, Abcam, Cambridge, UK), as described in Chapter 2.8.

5.2.7 RNA sequencing of LNCaP-GHSROS cells

RNA was extracted from LNCaP-GHSROS xenograft tumours and controls (empty vector control lentiviral constructs). RNA purity was analysed using an Agilent 2100 Bioanalyzer, and RNA with an RNA Integrity Number (RIN) above 7 was used for RNA-seq. Strand-specific RNA-seq was performed by the South Australian Health and Medical Research Institute (SAHMRI, Adelaide, SA, Australia). A TruSeq stranded mRNA library (Illumina) was constructed and RNA-seq performed (35 million reads) on a Nextseq 500 instrument (Illumina) with 75bp single end reads. Pre-processing of raw FASTQ reads, including elimination of contamination adapters, was performed with scythe v0.994 (https://github.com/vsbuffalo/scythe). Human (xenograft tumour; the graft) and mouse (the host) RNA-seq reads were separated using Xenome (Conway et al., 2012) on the trimmed FASTQ files, leaving ~20M human reads. Reads were aligned to the human genome and processed as described for PC3-GHSROS cells above.

5.2.8 RT-PCR of cell line mRNA

RNA was extracted and cDNA synthesised from all cell lines (as described in Chapter 2.3). RT-PCR was performed using cDNA from the normal prostate and from prostate cancer cell lines (as described in Chapter 2, section 2.3). No-template control RT-PCRs, in which template was substituted with water, were also performed.

Table 5.1. Primer sequences used in this study

| Primer | Gene name | Primer sequence (5'-3') |
|---|---|---|
| GHSROS | growth hormone secretagogue receptor opposite strand | ACATTCAGCAAATCCAGTTAATGACA; CGACTGGAGCACGAGGACACTTGA |
| AR | androgen receptor | CTGGACACGACAACAACCAG; CAGATCAGGGGCGAAGTAGA |
| NTSR1 | neurotensin receptor 1 (high affinity) | Proprietary - QIAGEN QuantiTect Primer Assay QT00018494 |
| TFF1 | trefoil factor 1 | Proprietary - QIAGEN QuantiTect Primer Assay QT00209608 |
| TFF2 | trefoil factor 2 | Proprietary - QIAGEN QuantiTect Primer Assay QT00001785 |
| MUC5B | mucin 5B, oligomeric mucus/gel-forming | Proprietary - QIAGEN QuantiTect Primer Assay QT01322818 |
| MUC2 | mucin 2, oligomeric mucus/gel-forming | Proprietary - QIAGEN QuantiTect Primer Assay QT01004675 |
| MUC3A | mucin 3A, cell surface associated | Proprietary - QIAGEN QuantiTect Primer Assay QT02503613 |
| PPP2R2C | protein phosphatase 2, regulatory subunit B, gamma | Proprietary - QIAGEN QuantiTect Primer Assay QT01006383 |
| RPL32 | ribosomal protein L32 (housekeeping gene) | CCCCTTGTGAAGCCCAAGA; GACTGGTGCCGGATGAACTT |

5.2.9 Gene Ontology (GO) and gene set enrichment (GSEA) analyses

GO term analyses of our RNA-seq data were performed using DAVID (Database for Annotation, Visualization and Integrated Discovery) (Dennis et al., 2003). Briefly, to test for enrichment we interrogated DAVID’s GO FAT database with genes differentially expressed in LNCaP-GHSROS cells. The DAVID functional annotation tool categorizes GO terms and calculates an ‘enrichment score’ or EASE score (a modified Fisher's exact test-derived P-value). Categories with smaller P-values (P ≤ 0.01) and larger fold-enrichments (≥ 2.0) were considered interesting and most likely to convey biological meaning (Huang da et al., 2009). In order to further assess pathways that may be enriched in the LNCaP-GHSROS cells, we imported our differentially expressed genes into the GSEA program (Subramanian et al., 2005).
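To make the EASE score more concrete, the Python sketch below computes a one-sided Fisher's exact P-value for a 2×2 gene-list contingency table, and then the more conservative variant obtained by removing one gene from the list hits before testing, which is how the DAVID documentation describes the modification. This is an illustration only: DAVID was used as published, the function names and gene counts here are hypothetical, and the exact table construction used internally by DAVID is an assumption.

```python
from math import comb


def fisher_enrichment_p(list_hits, list_misses, bg_hits, bg_misses):
    """One-sided Fisher's exact P-value for over-representation.

    The 2x2 table is
                    in category   not in category
        gene list   list_hits     list_misses
        background  bg_hits       bg_misses
    and the P-value is the hypergeometric probability of observing
    list_hits or more category genes in the gene list.
    """
    list_size = list_hits + list_misses
    category_size = list_hits + bg_hits
    total = list_size + bg_hits + bg_misses
    max_hits = min(list_size, category_size)
    tail = sum(
        comb(category_size, x) * comb(total - category_size, list_size - x)
        for x in range(list_hits, max_hits + 1)
    )
    return tail / comb(total, list_size)


def ease_score(list_hits, list_misses, bg_hits, bg_misses):
    """EASE-style score: the same test after removing one gene from the hits."""
    return fisher_enrichment_p(max(list_hits - 1, 0), list_misses, bg_hits, bg_misses)


# Hypothetical counts: 3 of 300 list genes and 40 of 30,000 background genes
# are annotated with the GO term of interest.
fisher_p = fisher_enrichment_p(3, 297, 40, 29960)
ease_p = ease_score(3, 297, 40, 29960)
print(f"Fisher exact P = {fisher_p:.4f}; EASE score = {ease_p:.4f}")
```

Removing one hit penalises categories supported by only a few genes, so the EASE score is never smaller than the unmodified Fisher P-value for the same counts.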

5.2.10 LP50 prostate cancer cell line AR knockdown microarray

Publicly available Affymetrix HG-U133 Plus 2.0 microarray data (NCBI GEO accession no. GSE22483) from a substrain of the LNCaP cell line: androgen-independent late passage LNCaP cells (LP50) was interrogated. This cell line was subjected to AR knockdown by shRNA (Gonit et al., 2011). The array (n=2, of AR shRNA and scrambled control) was normalized to housekeeping genes using the Affymetrix Gene Chip Operating System v1.4 (Gonit et al., 2011). Prior to differential expression analysis, the probe set was pre-filtered, using the R statistical programming language, as follows: probes with mean expression values in the lowest 20th percentile of the array were removed. Differential expression was determined by the R package ‘limma’ (Ritchie et al., 2015) and probes with a Benjamini-Hochberg adjusted P-value (Q; BH-FDR) ≤ 0.05 considered significant. Gene annotations were obtained using the R/Bioconductor packages ‘Biobase’ (Ritchie et al., 2015) and ‘GEOquery’ (Davis & Meltzer, 2007).
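The two generic steps in this workflow, the low-expression pre-filter and the Benjamini-Hochberg adjustment, can be sketched as follows. The analysis itself was done in R with limma; this Python version is only an illustration, it does not reproduce limma's moderated t-test, and the probe names, expression values and raw P-values are hypothetical. Treating the 20th percentile as an exclusive cutoff computed by linear interpolation is also an assumption.

```python
from statistics import mean, quantiles


def filter_low_expression(expression, percentile=20):
    """Drop probes whose mean expression lies at or below the given percentile."""
    probe_means = {probe: mean(values) for probe, values in expression.items()}
    # quantiles(..., n=100) returns the 1st..99th percentile cut points.
    cut_points = quantiles(probe_means.values(), n=100, method="inclusive")
    cutoff = cut_points[percentile - 1]
    return {p: v for p, v in expression.items() if probe_means[p] > cutoff}


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values (Q), returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical array: 10 probes, two samples each, mean expression 1..10.
expression = {f"probe_{i}": [float(i), float(i)] for i in range(1, 11)}
kept = filter_low_expression(expression)

# Hypothetical raw P-values for four of the retained probes.
raw_p = [0.01, 0.04, 0.03, 0.005]
q_values = benjamini_hochberg(raw_p)
significant = [q <= 0.05 for q in q_values]
print(len(kept), [round(q, 3) for q in q_values], significant)
```

Filtering before testing reduces the number of hypotheses, which makes the subsequent FDR adjustment less severe for the probes that remain.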

5.2.11 Survival analysis

Two data sets were interrogated: Taylor (Taylor et al., 2010) (123 localized and 27 metastatic prostate tumours) and TCGA-PRAD from The Cancer Genome Atlas (TCGA) consortium, which contains tumours from patients with moderate- (~39% Gleason 6 and 3+4) and high- (~61% Gleason 4+3 and Gleason 8-10) risk localized prostate carcinoma (Cancer Genome Atlas Research et al., 2013). In the case of TCGA-PRAD, the UCSC Xena Browser (Casper et al., 2018) was used to obtain normalized gene expression values, represented as log2 (normalized counts+1), from the ‘TCGA TARGET GTEx’ data set consisting of ~12,000 tissue samples from 31 cancers ("The Genotype-Tissue Expression (GTEx) project," 2013). To obtain up-to-date overall survival (OS) and disease-free survival (DFS) information, we manually queried cBioPortal for Cancer Genomics (Cerami et al., 2012; Gao et al., 2013).

We performed non-hierarchical k-means clustering (Selim & Ismail, 1984) to partition patients into groups with similar gene expression patterns (Quackenbush, 2001). The following 10 genes obtained by Oncomine meta-analysis (Chapter 4) were assessed: AASS, CHRDL1, CNTN1, DIRAS1, FBXL16, IFI16, MUM1L1, TP53I11, TFF2, and ZNF467. Clustering was performed using the kmeans function in the R package ‘stats’ with two clusters/groups (k=2) and the best cluster pair after 500 runs (nstart=500) was retained (Weichselbaum et al., 2008). Kaplan-Meier survival analysis (Rich et al., 2010) was performed with the R package ‘survival’, fitting survival curves (survfit) and computing log-rank P-values using the survdiff function, with rho=0 (equivalent to the method employed by UCSC Xena; see https://goo.gl/4knf62). Survival curves were plotted when survival was significantly different between two groups (log-rank P ≤ 0.05). We used the coxph function in the R package ‘survival’ to test the prognostic significance of genes (implementing the Cox proportional hazard model to analyse the association of gene expression with patient survival) with P ≤ 0.05 (Wald test) considered significant. Because there is a single categorical covariate (k-means cluster; group), the P-values from the log-rank and the Cox regression tests are comparable. We considered groups (clusters) that had fewer than 10 samples with a recorded event to be unreliable. A scaled heat map (unsupervised hierarchical clustering by Euclidean distance) was generated in R using heatmap.3 (available at https://goo.gl/Yd9aTY) and a custom R script.
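The clustering and log-rank steps described above can be outlined in a short Python sketch. It is illustrative only and differs from the R functions named in the text: it uses Lloyd's algorithm on a single gene (R's kmeans defaults to a different algorithm), retains the restart with the lowest within-cluster sum of squares, and computes an unweighted (rho = 0) two-group log-rank P-value from a chi-square statistic with one degree of freedom. Function names and input layout are assumptions.

```python
import random
from math import erfc, sqrt


def kmeans_two_groups(values, nstart=500, seed=0):
    """Partition one gene's expression values into two clusters (k=2)."""
    rng = random.Random(seed)
    best_labels, best_wss = None, float("inf")
    for _restart in range(nstart):
        centres = rng.sample(values, 2)  # random initial centres
        labels = None
        for _step in range(100):  # iteration cap as a safeguard
            new = [0 if abs(v - centres[0]) <= abs(v - centres[1]) else 1
                   for v in values]
            if new == labels:
                break
            labels = new
            groups = [[v for v, l in zip(values, labels) if l == g] for g in (0, 1)]
            if not groups[0] or not groups[1]:
                break
            centres = [sum(g) / len(g) for g in groups]
        if len(set(labels)) < 2:
            continue  # degenerate restart with an empty cluster
        wss = sum((v - centres[l]) ** 2 for v, l in zip(values, labels))
        if wss < best_wss:
            best_labels, best_wss = labels, wss
    return best_labels


def logrank_p(times, events, groups):
    """Two-group log-rank P-value; events: 1 = relapse, 0 = censored; groups: 0 or 1."""
    observed = expected = variance = 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = [g for ti, g in zip(times, groups) if ti >= t]
        n, n1 = len(at_risk), sum(at_risk)
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)
        d1 = sum(1 for ti, e, g in zip(times, events, groups)
                 if ti == t and e and g == 1)
        observed += d1
        expected += d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0
    chi2 = (observed - expected) ** 2 / variance
    return erfc(sqrt(chi2 / 2))  # upper tail of chi-square with 1 d.f.
```

In this outline the cluster labels returned by `kmeans_two_groups` would be passed as `groups` to `logrank_p`, mirroring the stratification of patients by expression before the survival comparison.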

5.2.12 Statistical analysis

Data were expressed as mean ± s.e.m. of at least two independent experiments and evaluated using Student’s t-test for unpaired samples, or tests as otherwise specified. Mean differences were considered significant when P ≤ 0.05. Q-values denote multiple testing correction (Benjamini-Hochberg) adjusted P-values (Benjamini & Hochberg, 1995). Normalised high-throughput gene expression data were analysed using LIMMA, employing a modified version of the Student’s t-test (moderated t-test) in which the standard errors are shrunk toward a common value using an empirical Bayesian model that is robust for data sets with few biological replicates (Ritchie et al., 2015). Statistical analyses were performed using GraphPad Prism v.6.01 software (GraphPad Software, Inc., San Diego, CA), or the R statistical programming language.

5.3 Results

5.3.1 GHSROS promotes growth and motility of LNCaP cancer cells in vitro

GHSROS overexpression (Figure 5.1A) significantly increased the rate of cell migration of LNCaP cells across an 8 µm porous membrane in Transwell assays (1.27 ± 0.035 fold, P = 0.00020) (Figure 5.1B). Cell proliferation was also increased (1.34 ± 0.283 fold, P = 0.040) compared to empty vector control cells (Figure 5.1C).

To determine the effects of GHSROS overexpression in vivo, LNCaP-GHSROS or vector cells were injected subcutaneously into NSG mice. Although xenograft tumours were not palpable in LNCaP-GHSROS mice in vivo, as they had invaded subcutaneous muscle, tumours dissected post mortem (at 72 days) were significantly larger by weight (1.82 ± 0.346 fold, P = 0.042) (Figure 5.1D). Similar size increases were seen in the DU145-GHSROS xenografts (Chapter 3.3.8 and Figure 3.11). Expression of the proliferation marker Ki67 appeared to be increased in LNCaP-GHSROS xenografts (Figure 5.1F); however, insufficient tumour tissue was available to allow quantitation of Ki67 staining. LNCaP-GHSROS tumours invaded the muscle of the flank and the peritoneum (data not shown) and were more vascularized than control tumours upon gross observation (Figure 5.1E) and as estimated by Masson’s trichrome collagen histological staining and CD31+ immunostaining (Figure 5.1G).

Figure 5.1. GHSROS promotes LNCaP cell line growth and motility in vitro and in vivo. (A) Confirmation of GHSROS overexpression. Bar graphs show qRT-PCR quantification of the relative expression level of GHSROS in the engineered LNCaP-GHSROS prostate cancer cell line. Expression was normalized to the housekeeping gene RPL32 using the comparative 2^-ΔΔCt method of quantification. Results are relative to the respective vector control. Mean ± s.e.m., n=3, ***P ≤ 0.001, Student’s t-test. (B) Increased proliferation in LNCaP GHSROS-overexpressing cells. LNCaP proliferation was assessed using a WST-1 assay at 72 hours. Vector denotes empty control plasmid. Mean ± s.e.m., n=3, ***P ≤ 0.001, Student’s t-test. (C) Increased migration in GHSROS-overexpressing cells. LNCaP migration was assessed using a Transwell assay (at 24 hours). Mean ± s.e.m., n=3, *P ≤ 0.05, Student’s t-test. (D) Tumour weights of LNCaP xenograft tumours post mortem. *P ≤ 0.05, Mann-Whitney-Wilcoxon test. (E) Size comparisons of LNCaP xenograft tumours overexpressing GHSROS or empty vector. (F) Representative Ki67 immunostaining of LNCaP-GHSROS xenograft tumours post mortem. Scale bar = 20 μm. (G) Representative morphology of LNCaP xenograft tumours overexpressing GHSROS or empty vector. Tissue was stained with haematoxylin and eosin (H&E), Masson’s trichrome (MT; collagen = blue) or immunostained for CD31 (endothelial marker; brown immunoreactivity). Scale bar = 20 μm.
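The comparative 2^-ΔΔCt quantification named in the legend can be written out as a brief Python sketch. It shows the standard form of the calculation, assuming one target gene, one reference gene (RPL32 in this chapter), and a vector control; the function names and the Ct values in the usage note are hypothetical and are not data from this study.

```python
from math import sqrt
from statistics import mean, stdev


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold-change of the target gene relative to control by the 2^-ddCt method."""
    delta_sample = ct_target_sample - ct_ref_sample      # dCt in the treated sample
    delta_control = ct_target_control - ct_ref_control   # dCt in the vector control
    return 2 ** -(delta_sample - delta_control)


def mean_and_sem(values):
    """Mean and standard error of the mean across biological replicates."""
    return mean(values), stdev(values) / sqrt(len(values))
```

For example, with made-up Ct values of 22 (target) and 18 (reference) in the overexpressing sample and 25 and 18 in the control, `relative_expression(22.0, 18.0, 25.0, 18.0)` gives an 8-fold increase, because the target crosses the threshold three cycles earlier after normalisation.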

5.3.2 Overexpression of GHSROS in the LNCaP cell line modulates the expression of cancer-associated genes consistent with its functional effects

Given these changes in tumour growth, we complemented the PC3 transcriptome data (~50M reads) with relatively lower-coverage RNA-seq (~20M reads) of LNCaP xenografts overexpressing GHSROS (Figure 5.2A and Supplementary Table 9). Surprisingly, a large number of genes were differentially expressed in LNCaP-GHSROS xenografts (1,961 upregulated, 2,372 downregulated; moderated t-test; cutoff set at log2 fold-change ± 1.5, Q ≤ 0.05; Supplementary Table 9). Upregulated genes included prostate specific antigen (PSA; KLK3) (750.9-fold, Q = 3.6 × 10^-6), TMPRSS2 (335.4-fold, Q = 4.5 × 10^-6), and the androgen receptor (AR) (27.8-fold, Q = 4.5 × 10^-4). We also observed downregulation of numerous genes associated with cell migration and adhesion, epithelial–mesenchymal transition (EMT) (including ZEB1, -97.0-fold, Q = 1.5 × 10^-5), angiogenesis, and vasculature development.

Gene set enrichment analysis (GSEA) revealed a negative enrichment for apoptosis (normalised enrichment score (NES) = 2.71, Q ≤ 0.001) and enrichment for the androgen response for the GHSROS-regulated genes (NES = 2.71, Q ≤ 0.001) (Figure 5.2C). This was supported by gene ontology analysis (applying the DAVID functional analysis tool), which demonstrated substantial enrichment for biological processes in the LNCaP-GHSROS xenograft tumours (Table 5.5). Enrichment for lipid-centric processes was observed, including lipid metabolic process (GO:0006629, P = 3 × 10^-9), cellular lipid metabolic process (GO:0044255, P = 1 × 10^-10), and fatty acid metabolic process (GO:0006631, P = 2.9 × 10^-6) (Supplementary Table 10). Response to xenobiotic stimulus (GO:1902600, P = 7.4 × 10^-4) and cellular response to xenobiotic stimulus (GO:0071466, P = 6.6 × 10^-4), indicating a response to foreign toxins, were also enriched, supporting the GSEA results and aligning with our findings that GHSROS may enhance cell survival. Intriguingly, migration was the predominant biological process enriched among the genes downregulated in LNCaP-GHSROS xenografts (Table 5.6 and Supplementary Table 11). Furthermore, cell death (GO:0008219, P = 5.7 × 10^-15), apoptotic process (GO:0006915, P = 3.2 × 10^-14), and regulation of programmed cell death (GO:0043067, P = 3.6 × 10^-14) were significantly enriched terms for the genes downregulated by GHSROS.

Selected genes of interest (guided by the literature and our previous RNA-seq analyses) were validated by qRT-PCR (Figure 5.2D). TFF1 and TFF2, which were both upregulated by GHSROS in the PC3 cell line (Chapter 3), were also induced by GHSROS in the LNCaP cell line (Figure 5.2D). Similarly, PPP2R2C was significantly reduced in the LNCaP-GHSROS tumours (-3.7 ± 0.062 fold, P = 7.9 × 10^-3), as in the PC3 cell line. AR expression varied between individual tumours and did not support our RNA-seq data, which indicated that GHSROS upregulated AR expression in vivo (Figure 5.2D).

Figure 5.2. Transcriptomic analysis and expression of selected genes in LNCaP tumour xenografts overexpressing GHSROS. (A) Schematic of LNCaP-vector and LNCaP-GHSROS tumour high-throughput RNA-sequencing and differential gene expression analysis. (B) Heat map of expression changes in each tumour extraction. Normalised to depict relative values within rows (samples) with high (red) and low expression (blue). Corresponding tumours are above each of the rows. The associated table shows the top upregulated RNA-seq genes (shown in red) and downregulated genes (shown in blue) with the log2 fold change and Q-value. (C) Gene set enrichment analysis (GSEA) of genes differentially expressed by LNCaP-GHSROS xenografts reveals enrichment for the androgen response. The normalised enrichment score (NES) and GSEA false-discovery corrected P-value (Q) are indicated. (D) Changes in expression of differentially expressed genes identified by RNA-seq in the PC3 cell line (AR, PPP2R2C, TFF1 and TFF2) were measured by qRT-PCR from excised LNCaP xenografts (n=4-8 vector, n=4-5 GHSROS) at in vivo endpoint. Expression was normalised to the housekeeping gene RPL32. Results are relative to the vector controls. Mean, n=3, *P ≤ 0.05, **P ≤ 0.01, Student’s t-test.

5.3.3 Identification of common genes regulated in the PC3-GHSROS cell line and LNCaP-GHSROS xenografts

The bone metastasis-derived, androgen-independent PC3 and the lymph node metastasis-derived, androgen-sensitive LNCaP prostate cancer cell lines represent genetically and presumably metabolically distinct subtypes (Seim & Chopin et al., 2017), rendering them useful cell line models for revealing broad, functional gene expression changes associated with aggressive disease. By comparing the lists of differentially expressed genes from our two RNA-seq data sets (Figure 5.3A), we observed that a quarter of differentially expressed genes in PC3-GHSROS cells (25.3%; 101 genes) had a corresponding expression pattern in LNCaP-GHSROS cells (Figure 5.3B). We subsequently refer to these as the GHSROS-regulated genes.
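The comparison of the two gene lists can be illustrated with a short Python sketch. It assumes that a "corresponding expression pattern" means a gene is differentially expressed in both data sets with the same direction of change; this interpretation, the function names, and the placeholder gene identifiers in the usage note are assumptions for illustration and do not reproduce the thesis analysis.

```python
def concordant_overlap(deg_a, deg_b):
    """Genes present in both lists whose log2 fold-changes share the same sign.

    deg_a and deg_b map gene symbol -> log2 fold-change versus vector control.
    """
    shared = deg_a.keys() & deg_b.keys()
    return {gene for gene in shared if (deg_a[gene] > 0) == (deg_b[gene] > 0)}


def percent_of_list(overlap, deg_a):
    """Share of the first list (as a percentage) that falls in the overlap."""
    return 100.0 * len(overlap) / len(deg_a)
```

With placeholder inputs such as `{"geneA": 2.0, "geneB": -3.0, "geneC": 1.5, "geneD": -2.0}` and `{"geneA": 4.0, "geneB": -1.6, "geneC": -2.2}`, the overlap contains geneA and geneB (geneC changes direction and geneD is absent from the second list), which is 50% of the first list.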

Ten of the 101 GHSROS-regulated genes were differentially expressed in metastatic compared to primary tumours in two well-annotated clinical prostate cancer data sets: the Grasso (Grasso et al., 2012) (59 localized and 35 metastatic prostate tumours) and Taylor data sets (Taylor et al., 2010) (123 localized and 27 metastatic prostate tumours) (Table 5.2; moderated t-test Q ≤ 0.25). DIRAS1, FBXL16, TP53I11, TFF2, and ZNF467 were upregulated in metastatic tumours and GHSROS-overexpressing PC3 and LNCaP cells, while AASS, CHRDL1, CNTN1, IFI16, and MUM1L1 were downregulated.

We interrogated the STRING database (Szklarczyk et al., 2017) to reveal protein interactions between the 101 genes. A number of genes associated with cell-cell adhesion, migration, and growth were connected (Figure 5.3C). This included increased epithelial cadherin (CDH1), occludin (OCLN), and claudin-7 (CLDN7); and decreased contactin 1 (CNTN1), noggin (NOG), and transforming growth factor beta induced (TGFBI) in GHSROS-overexpressing cells. In particular, increased CDH1 expression is associated with exit from EMT and growth of aggressive, metastatic prostate tumours (Putzke et al., 2011). A second network of interest comprised increased expression of trefoil factor 1 and 2 (TFF1 and TFF2) and anterior gradient protein 2 homolog (AGR2); of these, TFF2 was also upregulated in metastatic tumours compared to primary tumours in the Grasso and Taylor data sets (Table 5.2).

Figure 5.3. Comparative analysis of PC3-GHSROS and LNCaP-GHSROS differentially expressed RNA-seq genes demonstrates commonly regulated gene expression. (A) Schematic representation of comparative analysis of RNA-seq derived gene expression between the cultured PC3-GHSROS cells and LNCaP-GHSROS xenografts. (B) Venn diagram of differentially expressed genes (DEG) in LNCaP-GHSROS and PC3-GHSROS cells. (C) Interaction between 101 genes differentially expressed in both the PC3-GHSROS and LNCaP-GHSROS cells. Lines represent protein-protein interaction networks from the STRING database. Genes induced (red) or repressed (blue) by GHSROS overexpression are indicated.

Table 5.2. Differentially expressed genes in PC3-GHSROS and LNCaP-GHSROS cells (against the vector control cells) compared to the Grasso and Taylor Oncomine data sets. The Grasso data set includes 59 localized and 35 metastatic prostate tumours. The Taylor data set includes 123 localized and 27 metastatic prostate tumours. Red: higher expression in metastatic tumours; Black: lower expression in metastatic tumours. Fold-changes are log2 transformed; Q-value denotes the false discovery rate (FDR; Benjamini-Hochberg)-adjusted P-value.

| Data set | Gene | Gene name | Reporter ID | Fold change | P-value | Q-value |
|---|---|---|---|---|---|---|
| Grasso | AASS | aminoadipate-semialdehyde synthase | A_23_P8754 | -1.5 | 6.0E-03 | 2.5E-02 |
| Grasso | CHRDL1 | chordin-like 1 | A_24_P168925 | -48.0 | 3.6E-21 | 2.9E-18 |
| Grasso | CNTN1 | contactin 1 | A_23_P204541 | -19.3 | 1.6E-16 | 3.4E-14 |
| Grasso | DIRAS1 | DIRAS family, GTP-binding RAS-like 1 | A_23_P386942 | 2.2 | 3.5E-07 | 4.4E-06 |
| Grasso | FBXL16 | F-box and leucine-rich repeat protein 16 | A_23_P406385 | 6.0 | 2.5E-08 | 4.7E-07 |
| Grasso | IFI16 | interferon, gamma-inducible protein 16 | A_23_P160025 | -2.2 | 1.1E-05 | 8.8E-05 |
| Grasso | MUM1L1 | melanoma associated antigen (mutated) 1-like 1 | A_23_P73571 | -8.8 | 7.9E-11 | 2.5E-09 |
| Grasso | TFF2 | trefoil factor 2 | A_23_P57364 | 1.3 | 6.4E-04 | 3.1E-03 |
| Grasso | TP53I11 | tumour protein p53 inducible protein 11 | A_23_P150281 | 1.5 | 4.8E-05 | 3.1E-04 |
| Grasso | ZNF467 | zinc finger protein 467 | A_23_P59470 | 3.5 | 5.4E-07 | 6.3E-06 |
| Taylor | AASS | aminoadipate-semialdehyde synthase | 10093 | -1.5 | 3.9E-05 | 1.7E-03 |
| Taylor | CHRDL1 | chordin-like 1 | 20828 | -5.0 | 1.6E-18 | 1.3E-15 |
| Taylor | CNTN1 | contactin 1 | 6403 | -3.5 | 4.2E-22 | 6.8E-19 |
| Taylor | DIRAS1 | DIRAS family, GTP-binding RAS-like 1 | 20799 | 1.1 | 1.2E-02 | 1.4E-01 |
| Taylor | FBXL16 | F-box and leucine-rich repeat protein 16 | 21824 | 1.1 | 8.0E-03 | 1.2E-01 |
| Taylor | IFI16 | interferon, gamma-inducible protein 16 | 9878 | -1.5 | 8.4E-05 | 3.3E-03 |
| Taylor | MUM1L1 | melanoma associated antigen (mutated) 1-like 1 | 21313 | -1.3 | 1.0E-02 | 1.3E-01 |
| Taylor | TFF2 | trefoil factor 2 | 9774 | 1.1 | 3.6E-02 | 2.4E-01 |
| Taylor | TP53I11 | tumour protein p53 inducible protein 11 | 4038 | 1.1 | 2.2E-02 | 1.9E-01 |
| Taylor | ZNF467 | zinc finger protein 467 | 25037 | 1.3 | 2.8E-04 | 1.7E-02 |

We next investigated whether expression of these genes contributes to adverse disease outcome by assessing disease-free survival (overall relapse) in the Taylor data set, and a data set generated by The Cancer Genome Atlas (TCGA) consortium. The TCGA-PRAD data set consists of localized prostate tumours (Cancer Genome Atlas Research, 2015). As overall survival data are available for only a small number of patients in these data sets, we assessed disease-free survival (relapse). For prostate cancer, relapse is a suitable surrogate for overall survival, given that recurrence of disease would be expected to contribute significantly to mortality and metastatic disease is not curable (Rycaj & Tang, 2017; Sumanasuriya & De Bono, 2017). Unsupervised k-means clustering was employed to divide each data set into two groups based on gene expression alone. Two genes, zinc finger protein 467 (ZNF467) and chordin-like 1 (CHRDL1), correlated with relapse in both data sets (Table 5.3). ZNF467 is induced and CHRDL1 repressed by forced GHSROS overexpression. CHRDL1 is a negative regulator of bone morphogenetic protein 4-induced migration and invasion in breast cancer (Cyr-Depauw et al., 2016). It was also downregulated in metastatic tumours compared to localized tumours. Interrogation of the Chandran prostate cancer data set (60 localized tumours and 63 adjacent, normal tissues) (Chandran et al., 2005) revealed that CHRDL1 is downregulated in prostate tumours in general (Xu et al., 2010). Using CHRDL1 expression, the Taylor data set was stratified into groups with obvious differences in overall survival (relapse; 438 days; Cox P = 0.000032, absolute hazard ratio (HR) = 4.2), while a significant, yet negligible, difference in relapse was observed in the TCGA-PRAD data set consisting of localized tumours (9 days; Cox P = 0.0078, absolute HR = 1.8). This suggests that CHRDL1 plays a particularly important role in metastatic tumours.
In contrast to CHRDL1, ZNF467 stratified patients into clusters with an obvious difference in overall median survival (relapse) (697 days; Cox P = 0.0039, absolute hazard ratio (HR) = 2.5; Table 5.3) between groups in both the Taylor data set, which includes both metastatic and localized tumours, and the TCGA-PRAD data set consisting solely of localized tumours (685 days; Cox P = 0.000026, HR = 2.7). Clustering of patients into groups of either low or high ZNF467 expression revealed that elevated expression is associated with a worse relapse outcome (Figure 5.4 and Table 5.3). In agreement, ZNF467 gene expression appears to be androgen stimulated and can distinguish low (≤ 6) from high (≥ 8) Gleason score prostate tumours in a Fred Hutchinson Cancer Research Center data set (381 localized and 27 metastatic prostate tumours) (Jhun et al., 2017) (Figure 5.4). ZNF467 expression is also elevated in chemotherapy-resistant ovarian (Zhu et al., 2015) and breast cancer cells (Davies et al., 2014).

Table 5.3. Disease-free survival (DFS) analysis of differentially expressed genes (in PC3-GHSROS cells, LNCaP-GHSROS cells, and clinical metastatic tumours) in human data sets. Patients in the Taylor (n=150; n=123 localized and n=27 metastatic tumours) and TCGA-PRAD (n=489; localized tumours) data sets were stratified into two groups by k-means clustering of gene expression (k=2). The log-rank test was used to assign statistical significance, with P ≤ 0.05 considered significant (shown in red bold). The Cox P-value and absolute hazard ratio (HR) between k-means cluster 1 and 2 for each gene are indicated. Overall median survival (in days) is indicated for each cluster.

| Gene symbol | Taylor (n=150) log-rank P | Taylor Cox P | Taylor absolute HR | Taylor overall median survival, cluster 1 | Taylor overall median survival, cluster 2 | TCGA-PRAD (n=489) log-rank P | TCGA-PRAD Cox P | TCGA-PRAD absolute HR | TCGA-PRAD overall median survival, cluster 1 | TCGA-PRAD overall median survival, cluster 2 |
|---|---|---|---|---|---|---|---|---|---|---|
| ZNF467 | 0.0027 | 0.0039 | 2.7 | 174 | 871 | 0.000050 | 0.000026 | 2.5 | 546 | 685 |
| CHRDL1 | 0.0047 | 0.0062 | 2.5 | 840 | 402 | 0.0079 | 0.0071 | 1.8 | 649 | 640 |
| FBXL16 | 0.017 | 0.020 | 2.2 | 300 | 871 | 0.089 | 0.087 | 1.5 | 627 | 663 |
| DIRAS1 | 0.09 | 0.099 | 1.7 | 709 | 329 | 0.012 | 0.011 | 1.7 | 425 | 723 |
| TFF2 | 0.11 | 0.11 | 1.7 | 840 | 125 | 0.84 | 0.089 | 1.1 | 648 | 896 |
| CNTN1 | 0.13 | 0.14 | 1.6 | 701 | 457 | 0.10 | 0.094 | 1.4 | 627 | 691 |
| IFI16 | 0.27 | 0.28 | 1.5 | 579 | 181 | 0.95 | 0.95 | 1.0 | 671 | 648 |
| AASS | 0.62 | 0.63 | 1.2 | 843 | 348 | 0.35 | 0.35 | 1.2 | 552 | 697 |
| MUM1L1 | 0.78 | 0.78 | 1.1 | 472 | 676 | 0.14 | 0.14 | 1.4 | 765 | 426 |
| TP53I11 | 0.98 | 0.98 | 1.0 | 122 | 843 | 0.57 | 0.57 | 1.1 | 533 | 751 |

Figure 5.4. Zinc finger protein 467 (ZNF467), a gene induced by forced GHSROS overexpression, is upregulated in metastatic tumours and associated with adverse relapse outcome. Heat map of ZNF467 expression in the Taylor cohort normalised to depict relative values within rows (samples) with high (red) and low expression (green). Vertical bars show patient grouping by k-means clustering (cluster 1, red; cluster 2, black), tumour type (primary, green; metastatic, pink), and relapse status (relapse event, orange). Kaplan-Meier analyses of ZNF467 in the Taylor cohort. Patients were stratified by k-means clustering, as previously described. Kaplan-Meier analyses of ZNF467 in the TCGA-PRAD cohort of 489 localized prostate tumours. Patients were stratified by k-means clustering (cluster 1, purple; cluster 2, turquoise).

5.3.4 GHSROS potently represses PPP2R2C, a mediator of androgen pathway-independent prostate tumour growth and mortality

The 101 GHSROS-regulated genes were visualized in a scatter plot to reveal genes with particularly distinct (≥ 8-fold) differential expression in GHSROS-overexpressing prostate cancer cell lines, which are candidate drivers of the observed tumourigenic phenotypes (Figure 5.5A). This revealed downregulation of PPP2R2C, a gene encoding a subunit of the protein phosphatase 2A (PP2A) holoenzyme (Figure 5.5A). In the PC3-GHSROS RNA-seq data set, PPP2R2C was the third most downregulated gene (Chapter 4; Figure 4.1, Table 4.2). Consistently, forced overexpression of GHSROS in prostate cancer cell lines repressed, and GHSROS knockdown increased, the expression of endogenous PPP2R2C (Figure 5.5B). We have previously observed that GHSROS was also able to regulate AR expression in prostate cancer cell lines, including the LNCaP cell line in vitro. Forced overexpression of GHSROS in the A549 lung adenocarcinoma cell line decreased AR and PPP2R2C expression (Figure 5.5B). GHSROS knockdown in the ES-2 ovarian clear cell carcinoma cell line, which does not express PPP2R2C, increased the expression of AR (Figure 5.5B). Interestingly, small interfering RNA knockdown of PPP2R2C in cultured LNCaP and VCaP prostate cancer cells did not alter the expression of AR (Bluemn et al., 2013). In contrast, AR knockdown in the androgen-independent LP50 prostate cancer cell line, derived from the LNCaP cell line (Gonit et al., 2011), markedly decreased PPP2R2C expression (Figure 5.6; moderated t-test of all probes Q ≤ 0.05), suggestive of an adaptive response to loss of androgen receptor expression.

Figure 5.5. Identification and validation of GHSROS-perturbed gene expression in various cancer models. (A) Gene expression scatter plot comparing GHSROS-overexpressing PC3 and LNCaP cells. Differentially expressed genes compared to vector controls in both data sets are shown in red (induced) and blue (repressed); of these, DEGs with ≥ 8-fold change (log2 cutoff at -3 and 3) are highlighted by a green box. (B) GHSROS perturbation in cultured cancer cell lines. Heat map of gene expression assessed by qRT-PCR in GHSROS-overexpressing or silenced (knockdown) cell lines. Each row shows the relative expression level for a single gene, and each column shows the expression level of a single sample (biological replicate). Expression was normalized to RPL32 in each sample and expressed as fold-change relative to vector control (overexpression) or scrambled LNA-ASO control (knockdown). Fold changes were log2 transformed and are displayed in the heat map as the relative expression of a gene in a sample compared to all other samples (z-score).
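The scaling described for the heat map (log2 transformation of fold changes followed by a per-gene z-score) can be sketched in Python. This is an illustrative outline, assuming the sample standard deviation (n-1 denominator) is used and that each row holds the fold changes of one gene across samples; the function name is hypothetical.

```python
from math import log2
from statistics import mean, stdev


def row_z_scores(fold_changes):
    """Convert one gene's fold changes across samples to z-scores of log2 values."""
    logs = [log2(fc) for fc in fold_changes]
    mu, sd = mean(logs), stdev(logs)  # sample SD, an assumption
    if sd == 0:
        return [0.0] * len(logs)  # no variation across samples
    return [(x - mu) / sd for x in logs]
```

Because each row is centred and scaled independently, the colours convey how a sample compares with the other samples for that gene, not the absolute magnitude of the fold change. A gene with fold changes of 0.5, 1, and 2 therefore maps to z-scores of -1, 0, and 1.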

Figure 5.6. Effect of androgen receptor (AR) perturbation in LP50 prostate cancer cells on PPP2R2C expression. Assessed by microarray (NCBI GEO accession no. GSE22483). Mean ± s.e.m. n=2, *Q ≤ 0.05, moderated t-test.

5.3.7 GHSROS is associated with cell survival and resistance to the cytotoxic drug docetaxel

GHSROS knockdown experiments also revealed that GHSROS protected PC3 prostate cancer cells from death by serum starvation (Chapter 3.3.9, Figure 3.14). Given that our various RNA-seq analyses showed concept mapping to drug sensitivities, and strong enrichment for a decreased apoptosis response, we aimed to examine whether GHSROS contributes to cell survival following chemotherapy.

The current treatment of choice for advanced, castration-resistant prostate cancer (CRPC; the fatal final stage of the disease) after the failure of hormonal therapy is the cytotoxic drug docetaxel, a semi-synthetic taxoid that results in cell cycle arrest (Puente et al., 2017). At the half maximal inhibitory concentration (IC50) of docetaxel (5 nM for LNCaP) (Komura et al., 2016), cell survival was significantly increased in GHSROS-overexpressing LNCaP cells (Figure 5.7). A similar, less pronounced response was observed in LNCaP cells treated with enzalutamide, a drug used to target the androgen receptor in metastatic, castration-resistant tumours (Drake & Gerritsen, 2014) (Figure 5.7). Survival pathways are induced after docetaxel treatment in prostate cancer (Chandrasekar et al., 2015), and drug resistance may develop after chemotherapy (acquired resistance) or exist in treatment-naïve patients (innate resistance) (Sonpavde et al., 2015).

Figure 5.7. GHSROS mediates cell survival and resistance to the cytotoxic drug docetaxel. Viability of GHSROS-overexpressing LNCaP cells under different culture conditions. Cell number was assessed using WST-1, a reagent that stains metabolically viable cells. Cells were treated with enzalutamide (ENZ; 10 μM) or docetaxel (DTX; 5 nM) for 96 hours and grown in either 2% FBS or 5% charcoal stripped (CSS) RPMI-1640 media (n=3). Mean ± s.e.m. *P ≤ 0.05, ***P ≤ 0.001, Student’s t-test.

The pronounced increase in survival in response to docetaxel treatment in GHSROS-overexpressing LNCaP cells led us to speculate that endogenous expression of GHSROS may also contribute to cell survival. In LNCaP cells, GHSROS was not differentially expressed in response to charcoal stripped serum (CSS; to simulate androgen deprivation therapy), or following treatment with the androgen receptor antagonist enzalutamide (Figure 5.8A). In agreement with previous reports (Lee et al., 2014; Tran et al., 2009), the gene coding for prostate specific antigen (PSA; KLK3) was downregulated by docetaxel and enzalutamide in LNCaP cells (-6.6-fold, P = 0.00070, Student’s t-test) (Figure 5.8A). Docetaxel significantly increased GHSROS expression in both native LNCaP and PC3 cells in a dose-dependent manner, and at concentrations both above and below their IC50 values (Figure 5.8B). Taken together, these data suggest that GHSROS mediates tumour survival and, consequently, protection from the cytotoxic chemotherapy drug docetaxel.

Figure 5.8. GHSROS is not an androgen regulated gene in the LNCaP cell line but its expression is promoted by docetaxel. (A) GHSROS and PSA expression in native LNCaP cells treated with enzalutamide (ENZ 10 μM in 2% FBS or 5% CSS RPMI-1640) or docetaxel (DTX 5 nM in 2% FBS RPMI-1640) for 48 hours (n=3). Fold-enrichment of GHSROS normalized to RPL32 and compared to empty vector control. Mean ± s.e.m. *P ≤ 0.05, ***P ≤ 0.001, Student’s t-test. (B) GHSROS expression in native PC3 and LNCaP cells treated with different concentrations (1-20 nM) of docetaxel (DTX). Cells were grown in RPMI-1640 media with 2% FBS and treated for 48 hours (n=3). Fold-enrichment of GHSROS normalized to RPL32 and compared to empty vector control. Mean ± s.e.m. *P ≤ 0.05, **P ≤ 0.01, one-way ANOVA, Dunnett’s post hoc test.

5.4 Discussion

We have previously determined that GHSROS is expressed in prostate tumours and regulates growth, motility, and survival of androgen-sensitive and androgen-independent prostate cancer cell lines. In Chapter 4 we demonstrated that ectopic GHSROS expression reduced AR expression in the androgen-independent PC3 cell line. Detecting the expression of the AR in this cell line was a peculiarity, but reports indicate it has measurable levels of AR transcript (Alimirah & Choubey, 2006). Given its controversial role in the PC3 cell line, we undertook further experiments using the LNCaP androgen-dependent cell line. Consistent with our previous observations in the PC3 and DU145 cell lines, elevated GHSROS expression increased in vitro proliferation and migration, and significantly increased tumour growth in subcutaneous LNCaP-GHSROS xenografts. GHSROS appeared to facilitate a particularly aggressive phenotype, with xenograft tumours becoming invasive and gaining access to the blood supply. This would allow access to elevated circulating androgens and nutrients compared to control cells, and would be likely to have influenced the changes in gene expression observed 72 days post xenograft inoculation. As the expression of the AR varied among the different xenograft tumours, GHSROS may not act by regulating AR signalling. In Chapter 4, however, we showed that elevated GHSROS expression characterises a metastatic-like gene expression program which is associated with advanced prostate cancer and poor clinical outcome in clinical cohorts. It is therefore likely that this GHSROS-induced program exists within the LNCaP cells and contributes to tumour progression, but that this is independent of AR levels.

To better understand how GHSROS mediates its effects in prostate cancer, we examined transcriptomes of prostate cancer cell lines with forced GHSROS overexpression in cultured PC3 cells and mouse LNCaP xenograft tumours. We identified 101 differentially expressed GHSROS-regulated genes shared between PC3 and LNCaP cells, which included several transcription factors with established roles in prostate cancer and genes associated with metastasis and poor prognosis. Expression of the tumour suppressor gene PPP2R2C, which encodes a substrate-binding regulatory subunit of the tumour suppressor protein phosphatase 2A (PP2A) (Sangodkar et al., 2016), was repressed by forced GHSROS overexpression and increased by GHSROS knockdown. In clinical samples, PPP2R2C expression is decreased in a subset of primary and metastatic prostate tumours resistant to androgen-targeted therapy (ATT) and associated with adverse disease outcomes (Bluemn et al., 2013). Loss of PPP2R2C expression alone is thought to reprogram prostate tumours towards AR pathway-independent growth and survival, circumventing therapeutic approaches which target this pathway (Bluemn et al., 2013). It is now increasingly recognised that, like other endocrine-related cancers, several subtypes of prostate cancer exist (Inamura, 2018) and include subtypes characterized by androgen pathway-independent growth (termed AR-null) (Bluemn et al., 2017; Bluemn et al., 2013). Evidence is emerging that in a subset of patients who display resistance to AR-targeting therapies, inactivation of PP2A enables CRPC by promoting AR-independent growth and survival (Bluemn et al., 2013). Knockdown of PPP2R2C in the LNCaP and VCaP androgen-sensitive prostate cancer cell lines drives tumour growth which is independent of the AR and AR-regulated genes, effectively circumventing therapeutic approaches targeting the AR (Bluemn et al., 2013).

It may seem paradoxical that GHSROS promotes prostate tumour growth while repressing AR and PPP2R2C, which are distinct prostate tumour drivers; however, this can be rationalized. Knockdown of PPP2R2C using small interfering RNA in cultured LNCaP and VCaP cells did not alter AR expression or canonical AR-regulated gene expression (Bluemn et al., 2013). In contrast, AR knockdown in androgen-independent LP50 cells (Gonit et al., 2011) (an LNCaP-derived cell line) markedly decreased PPP2R2C expression. This suggests an adaptive response to loss of androgen receptor expression (and function). Taken together, we speculate that GHSROS downregulates PPP2R2C to prime prostate tumours for androgen receptor-independent growth. Additionally, several independent lines of evidence suggest that PPP2R2C is a critical tumour suppressor. Loss of PPP2R2C expression has been attributed to oesophageal adenocarcinoma tumourigenesis (Peng et al., 2017), and PPP2R2C downregulation by distinct microRNAs promotes the proliferation of cultured cancer cells derived from the prostate (Bi & Ding, 2016), nasopharynx (Yan et al., 2017), and ovary (Wu et al., 2016). PPP2R2C is also a classical growth-inhibiting tumour suppressor in brain cancers (Fan & Wan, 2013). A subtype of paediatric medulloblastoma brain tumours is characterized by high expression of the chemokine receptor CXCR4 and concordant suppression of PPP2R2C (Sengupta et al., 2012). Similarly, the gene is ablated in A2B5+ glioma stem-like cells, a population which mediates a particularly aggressive chemotherapy-resistant glioblastoma phenotype (Auvergne et al., 2013). Precisely how GHSROS mediates PPP2R2C downregulation and tumour growth remains to be determined; however, GHSROS is the first lncRNA shown to downregulate this critical tumour suppressor, suggesting a role in adaptive survival pathways and CRPC development.

Among drugs targeting advanced, castration-resistant prostate cancer (CRPC), docetaxel is currently the drug of choice after failure of standard androgen targeted therapy (ATT). Unfortunately, approximately 50% of patients present with docetaxel failure or resistance, and there is an urgent need to identify associated molecular events, including gene expression changes (Berthold et al., 2008; Drake et al., 2014). Survival pathways are induced after docetaxel treatment in prostate cancer (Chandrasekar et al., 2015), and drug resistance may develop after chemotherapy (acquired resistance), or exist in treatment-naïve patients (innate resistance) (Hwang, 2012; Puente et al., 2017; Teply & Hauke, 2016). We demonstrate that forced overexpression of GHSROS in prostate cancer cell lines facilitates tumour survival and recalcitrance to the cytotoxic chemotherapy docetaxel. Docetaxel is commonly prescribed for late-stage, metastatic CRPC patients, but large randomized trials suggest that it is also effective against recently diagnosed, localized prostate tumours (Puente et al., 2017). Our data suggest that GHSROS acts as a cell survival factor in prostate cancer. Additionally, gene ontology analysis suggests that GHSROS is involved in the xenobiotic (synthetic) response to drug treatment and in decreased levels of apoptosis when overexpressed in both the PC3 and LNCaP cell lines. While the underlying mechanisms are largely unknown, two genes associated with chemotherapeutic resistance, ZNF467 and PPP1R1B (also known as DARPP-32), were upregulated in PC3 and LNCaP cells overexpressing GHSROS. PPP1R1B is a potent anti-apoptotic gene which, when overexpressed, renders cancer cells resistant to several chemotherapeutic agents by modulating proteins associated with survival, including upregulation of the anti-apoptotic Bcl2 (Belkhiri & El-Rifai, 2008; Zhu et al., 2017) and BCL-xL in gastric cancer (Belkhiri & El-Rifai, 2012) and prevention of pro-apoptotic Bim accumulation in breast cancer (Christenson & Kane, 2015). Further experiments utilising fluorescence-activated cell sorting (FACS) and human apoptosis antibody arrays will explore the effects of GHSROS targeting (through LNA-ASO treatment) on apoptosis and prostate cancer cell survival in response to chemotherapies like docetaxel. Additionally, it would be useful to explore, through knockdown and overexpression models, whether any relationship exists between ZNF467 and PPP1R1B in GHSROS-mediated chemoresistance.

In summary, we propose that GHSROS is an oncogene that regulates cancer hallmarks and the expression of a number of genes, including the tumour suppressor PPP2R2C – an emerging alternative driver of prostate cancer. Furthermore, GHSROS may attenuate apoptosis in the presence of docetaxel, and thus could provide a novel mechanism of chemotherapeutic resistance. Targeting GHSROS may present an opportunity for clinical intervention; however, it is appreciated that translational and regulatory challenges exist for RNA-centric oligonucleotide-based therapies (Stein & Castanotto, 2017). Addressing these challenges will be pertinent to determining whether pharmacological targeting of this lncRNA could prove useful to treat cancer.

Chapter 6

The long noncoding RNA GHSROS facilitates breast cancer cell line migration and orthotopic xenograft tumour growth

6.1 Introduction

In the previous research chapters of this thesis, it was demonstrated that the antisense intronic transcript GHSROS is an important lncRNA with oncogenic functions in prostate cancer. Specifically, induction of GHSROS reprograms prostate cancer cells towards a more aggressive phenotype (Chapters 3, 4 and 5) and delays apoptosis in response to cytotoxic agents (Chapter 5). Given the diversity of expression of its sense gene, GHSR (Chopin & Herington, 2012; Fung et al., 2013; Leung et al., 2007; Nikolopoulos & Kouraklis, 2010; Seim et al., 2011), it is likely that GHSROS is expressed in other tumour types and facilitates a similar phenotype, as is the case in lung cancer (Whiteside et al., 2013). In Chapter 3, in silico screening of 3,924 Affymetrix exon array samples and qRT-PCR of clinical tumour versus normal cDNA panels validated our preliminary observations that GHSROS is also expressed in breast cancer. As in prostate and lung cancer, we hypothesise that GHSROS may enhance the growth of breast tumours, making GHSROS a useful target for multiple cancer types.

Breast cancer is the most commonly diagnosed cancer and a leading cause of cancer-related deaths in females (Jemal et al., 2011). Given the high incidence of breast cancer in the population, there is a need to explore the therapeutic potential of new molecular targets, particularly for triple negative breast cancer, which is diagnosed in 15% of breast cancer patients (Welfare, 2017). Due to a lack of targeted therapies, these patients require more aggressive treatment regimens (Welfare, 2017). Historically, molecular classification and therapeutic targeting of breast cancer-associated genes has focused on protein-coding genes, which represent less than one per cent of the genome (ENCODE Project Consortium, 2004). It is now appreciated that many lncRNAs are feasible biomarkers and targets for molecular therapies (Amorim & Jeronimo, 2016; Cerk et al., 2016; Kumar & Herschkowitz, 2016; Soudyab & Ghafouri-Fard, 2016). A key example is HOTAIR (HOX transcript antisense intergenic RNA), which is upregulated in primary and metastatic breast tumours (Bhan et al., 2013; Bhan & Mandal, 2016; He et al., 2014). Overexpression of HOTAIR in breast cancer constitutively activates an oestrogen receptor-associated transcriptional program, enhancing cancer growth and tamoxifen resistance (Xue et al., 2016). Similarly, MALAT1 (metastasis-associated lung adenocarcinoma transcript 1), a highly conserved and abundant lncRNA which is upregulated in a broad range of tumour types (including metastatic breast cancer), stimulates cell proliferation and migration (Aiello et al., 2016). Targeting Malat1 in a mouse model of mammary carcinoma using modified antisense oligonucleotides (ASOs) significantly reduced breast cancer metastasis and slowed primary tumour growth (Arun et al., 2016). Taken together, these studies highlight the value of studying the expression and function of lncRNAs in breast cancer.

6.2 Materials and Methods

General Materials and Methods are outlined in detail in Chapter 2. Experimental procedures specific to this chapter are described below.

6.2.1 Cell culture

In this study, assays were performed using the non-tumourigenic, normal breast-derived MCF-10A cell line and the MDA-MB-231 breast cancer cell line. Both cell lines were propagated as described in the General Methods (Chapter 2, section 2.1).

6.2.2 Expression of GHSROS across cancer specimens, breast cell lines and tissues

6.2.2.1 GHSROS expression in human tissue specimens

To survey the expression of GHSROS in cancer, we interrogated cDNA panels of breast tumour and normal breast tissue samples using qRT-PCR (TissueScan Cancer Survey Tissue qPCR panels BCRT101, BCRT102, BCRT103, and BCRT104 were arranged on a single 384-well reaction plate; OriGene, Rockville, MD, USA). Data are expressed as mean fold change using the comparative 2^-ΔΔCt method, compared to a non-malignant control tissue and normalized to ACTB, as described in Chapter 2, section 2.3.

6.2.2.2 RT-PCR of cell line mRNA

RNA was extracted and cDNA synthesised from all cell lines (as described in Chapter 2, section 2.3). RT-PCR was performed using cDNA from normal breast-derived and breast cancer cell lines (as described in Chapter 2, section 2.3). No-template control qRT-PCRs, in which the template was substituted with water, were also performed.
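The comparative 2^-ΔΔCt calculation referred to in this section can be illustrated with a minimal Python sketch. This is not the analysis code used in this thesis. The function name and the Ct values below are hypothetical, and the sketch assumes a single reference gene (such as ACTB) and a single calibrator sample (such as a non-malignant control).

```python
def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression by the comparative 2^-ΔΔCt method.

    ct_target_* : Ct of the gene of interest (e.g. GHSROS)
    ct_ref_*    : Ct of the reference gene (e.g. ACTB)
    *_sample    : test sample; *_control : calibrator sample
    """
    # ΔCt: normalise the target gene to the reference gene within each sample
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    # ΔΔCt: compare the test sample with the calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values, for illustration only: the target gene reaches
# threshold two cycles earlier in the test sample at equal reference Ct.
example = fold_change_ddct(28.0, 18.0, 30.0, 18.0)
print(f"fold change relative to control: {example:.2f}")
```

A two-cycle difference in ΔCt corresponds to a four-fold difference in relative expression, because the method assumes that the amount of product doubles with each cycle.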

Table 6.1. Primer sequences used in this study. F: Forward primer; R: Reverse primer; MHC: Major Histocompatibility Complex

Primer | Gene name | Primer sequence (5'-3')
GHSROS | growth hormone secretagogue receptor opposite strand | F: ACATTCAGCAAATCCAGTTAATGACA; R: CGACTGGAGCACGAGGACACTTGA
GHSROS-RT linker | growth hormone secretagogue receptor opposite strand | CGACTGGAGCACGAGGACACTGACAACAGAATTCACTACTTCCCCAAA
ACTB | actin beta (housekeeping gene) | F: ACTCTTCCAGCCTTCCTTCCT; R: CAGTGATCTCCTTCTGCATCCT
HPRT | hypoxanthine phosphoribosyltransferase 1 (housekeeping gene) | F: CAGTCAACGGGGGACATAAA; R: AGAGGTCCTTTTCACCAGCAA
HLA-DRB3 | MHC class II, DR beta 3 | QIAGEN QuantiTect Primer Assay
HLA-DRA | MHC class II, DR alpha | QIAGEN QuantiTect Primer Assay
HLA-DPB1 | MHC class II, DP beta 1 | QIAGEN QuantiTect Primer Assay
HLA-DPA1 | MHC class II, DP alpha 1 | QIAGEN QuantiTect Primer Assay
TBX-3 | T-box 3 | QIAGEN QuantiTect Primer Assay
HTR1F | 5-hydroxytryptamine receptor 1F | QIAGEN QuantiTect Primer Assay
ODZ1 | odd Oz/ten-m homolog 1 | QIAGEN QuantiTect Primer Assay

6.2.3 Production of GHSROS overexpressing cancer cell lines

The full-length GHSROS transcript was generated by RT-PCR from A549 cell line mRNA and cloned into the pTargeT mammalian expression vector (Promega, Madison, WI) (Whiteside et al., 2013). The MCF-10A and MDA-MB-231 cell lines were transfected with GHSROS-pTargeT DNA, or vector alone (empty vector), using Lipofectamine LTX (Invitrogen) according to the manufacturer’s instructions, and as described in Chapter 2, section 2.2. Briefly, cells were incubated for 24 hours with Lipofectamine LTX and selected with geneticin (G418, Invitrogen) at concentrations of 500 μg/mL for the MCF-10A and 600 μg/mL for the MDA-MB-231 cell line. Cells were grown in the presence of G418 for at least two weeks before functional analyses were performed. For in vivo experiments, MDA-MB-231 cells stably expressing luciferase (pGL4.51[luc2/CMV/Neo]) were obtained (kindly provided by Dr. Eloïse Dray, QUT) and lentiviral transduction was employed to overexpress GHSROS. Briefly, pReceiver-Lv105 vectors, expressing full-length GHSROS, or empty control vectors, were obtained from GeneCopoeia (Rockville, MD). The methodology is described in Chapter 2, section 2.2.

6.2.4 Cell proliferation assays

Cell proliferation assays were performed (as described in Chapter 2) on an xCELLigence RTCA DP instrument (ACEA Biosciences) using the MCF-10A-vector, MCF-10A-GHSROS, MDA-MB-231-vector and MDA-MB-231-GHSROS cell lines. Briefly, cells were trypsinized and 5 × 10^3 cells were seeded into a 96-well plate (E-plate) and grown for 48 hours in 150 µl growth media. Cell index was measured every 15 minutes and all experiments were performed in triplicate with at least three independent repeats.

6.2.5 Cell migration assays

Migration assays were performed (as described in Chapter 2) on an xCELLigence RTCA DP instrument (ACEA Biosciences) using the MCF-10A-vector, MCF-10A-GHSROS, MDA-MB-231-vector and MDA-MB-231-GHSROS cell lines. Briefly, 5 × 10^4 cells/well were seeded in the top chamber in 150 µl serum-free media. The lower chamber contained 160 µl media with 10% FBS as a chemo-attractant. Cell index was measured every 15 minutes for 24 hours to indicate the rate of cell migration to the lower chamber. All experiments were performed in triplicate with at least three independent repeats.

6.2.6 Oligonucleotide microarray and analysis

Oligonucleotide microarray analysis was performed using RNA extracted from MDA-MB-231 cell lines overexpressing GHSROS or vector controls. The MDA-MB-231 breast cancer cell line was transfected independently three times with GHSROS-pTargeT, or empty pTargeT vector, and RNA was extracted and purity analysed using an Agilent 2100 Bioanalyzer (Agilent Technologies, VIC, Australia). Total RNA (500 ng) was processed and hybridised to Affymetrix Human Gene Arrays 1.0 by the Ramaciotti Centre for Gene Function Analysis (Sydney, Australia). The arrays (n=3 each for GHSROS-pTargeT and empty vector control) were quantile normalised and log2-transformed using the R statistical programming language. Up-to-date gene annotations were obtained from NCBI (Platform GPL6244; downloaded September 2017). Differential expression was determined using the R package ‘limma’ v3.33.12 (Ritchie et al., 2015). Differentially expressed genes were defined as those with an absolute fold-change ≥ 1.5 and a P-value ≤ 0.05.
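The differential expression cut-offs stated above (absolute fold-change ≥ 1.5 and P ≤ 0.05) can be illustrated with a minimal Python sketch. This is not the limma workflow used in this thesis, which was run in R. The function name is hypothetical, and the sketch assumes signed fold-changes in which negative values denote downregulation, as reported in the Results of this chapter. The three example genes and their values are taken from the microarray results reported in section 6.3.3.

```python
def classify_gene(fold_change, p_value, fc_cutoff=1.5, p_cutoff=0.05):
    """Label a gene as 'up', 'down' or 'unchanged'.

    fold_change is signed: +2.0 means two-fold higher and -2.0 means
    two-fold lower in GHSROS-overexpressing cells than in vector controls.
    """
    if p_value > p_cutoff or abs(fold_change) < fc_cutoff:
        return "unchanged"
    return "up" if fold_change > 0 else "down"


# (fold-change, P-value) pairs as reported for the MDA-MB-231 microarray
reported = {
    "HTR1F": (3.1, 0.0040),
    "TENM1": (-2.00, 0.0238),
    "TBX3": (1.5, 0.0041),
}
calls = {gene: classify_gene(fc, p) for gene, (fc, p) in reported.items()}
print(calls)
```

Both cut-offs are inclusive, so a gene such as TBX3, which sits exactly at a 1.5 fold-change, is retained.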

6.2.7 Gene Ontology (GO) term and STRING analysis

To identify associations between differentially expressed genes, we implemented STRING (STRING v.10.5) functional analysis (Szklarczyk et al., 2017). STRING integrates predicted and experimentally confirmed relationships between proteins that are likely to contribute to a common biological purpose. Using default parameters, differentially expressed genes were mapped into the STRING user interface and interactions were partitioned into distinct clusters using k-means analysis (Hartigan & Wong, 1979). Interaction networks were exported into Inkscape (v.0.91). To test for gene enrichment, we compared differentially expressed genes using the Kyoto Encyclopedia of Genes and Genomes (KEGG) (J. Du et al., 2014) pathway databases within the STRING functional analysis tool. STRING categorises KEGG pathway terms and calculates an ‘enrichment score’ or EASE score (a modified Fisher's exact test-derived P-value).
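The idea behind a Fisher's exact test-derived enrichment P-value can be illustrated with a minimal Python sketch of the one-sided hypergeometric tail probability, which is equivalent to a one-sided Fisher's exact test on the corresponding 2 × 2 table. This is the unmodified test, not the EASE modification, and it is not STRING's implementation. The function name and all counts in the example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, pathway_size, list_size, overlap):
    """P(X >= overlap) when drawing list_size genes without replacement
    from a background of `population` genes, of which `pathway_size`
    belong to the pathway of interest."""
    upper = min(pathway_size, list_size)
    total = comb(population, list_size)
    tail = sum(
        comb(pathway_size, i) * comb(population - pathway_size, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 3 of 40 downregulated genes fall in a 30-gene
# pathway, against a background of 20,000 annotated genes.
p = hypergeom_enrichment_p(20000, 30, 40, 3)
print(f"enrichment P = {p:.3g}")
```

In practice, each pathway is tested in turn and the resulting P-values are corrected for multiple testing.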

6.2.8 Orthotopic mammary fat pad in vivo xenografts in a NOD/SCID gamma (NSG) mouse model

Experiments were approved by the University of Queensland (UQ) and Queensland University of Technology (QUT) animal ethics committees (TRI/QUT/328/16). Mice were housed under pathogen-free conditions, in individually ventilated cages, at a room temperature of 20–23 °C, and with a 12-hour light-dark cycle. MDA-MB-231-GHSROS or MDA-MB-231-Vector cell lines were injected at a 1:1 ratio with growth factor-reduced Matrigel (Corning, NY, USA) (n=8-10 per cell line) directly into the right inguinal mammary fat pad of 3-week-old female NOD.Cg-Prkdc^scid Il2rg^tm1Wjl/SzJ (NSG) mice (generated by the Jackson Laboratory; provided by Animal Resource Centre, Murdoch, WA, Australia). Tumour growth was measured twice weekly with digital calipers (ProSciTech, Kirwan, QLD, Australia) and tumour volume was calculated with a formula for the volume of an ellipsoid: V = π/6 × (d1 × d2)^(3/2), where d1 and d2 represent perpendicular tumour measurements (Russell et al., 1986). In addition, tumour size and growth were monitored weekly by bioluminescent imaging (Lim, Modi, & Kim, 2009). Briefly, mice were injected intraperitoneally with 150 mg/kg of the firefly luciferase substrate D-Luciferin (Perkin Elmer, Boston, MA, USA) diluted in PBS (Thermo Fisher Scientific) 10 minutes prior to imaging. Following anaesthesia with isoflurane, bioluminescent imaging was performed using an IVIS Spectrum in vivo imaging system (Xenogen, Perkin Elmer) (Lim et al., 2009). Images were analysed with the associated Living Image Software. Briefly, total flux in photons/second (p/s) was used as a surrogate for primary tumour size and determined within a defined region of interest, individually, for each mouse (Lim et al., 2009). Animals were euthanased once tumour volume reached 1000 mm^3, or earlier at other ethical endpoints.
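The caliper-based volume formula above, V = π/6 × (d1 × d2)^(3/2), can be written as a short Python sketch. The function names and the example measurements are hypothetical. The 1000 mm^3 limit is the volume endpoint stated in this section.

```python
from math import pi


def tumour_volume_mm3(d1_mm, d2_mm):
    """Tumour volume from two perpendicular caliper measurements (mm),
    using V = pi/6 * (d1 * d2)^(3/2)."""
    return (pi / 6.0) * (d1_mm * d2_mm) ** 1.5


def reached_volume_endpoint(volume_mm3, limit_mm3=1000.0):
    """True once the tumour volume reaches the stated volume endpoint."""
    return volume_mm3 >= limit_mm3


# Hypothetical measurements: a tumour measuring 10 mm x 10 mm
volume = tumour_volume_mm3(10.0, 10.0)
print(f"{volume:.1f} mm^3, endpoint reached: {reached_volume_endpoint(volume)}")
```

When d1 equals d2 the formula reduces to the volume of a sphere of that diameter, which gives a quick sanity check on caliper readings.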

6.2.9 Statistical analyses

Data values were expressed as mean ± s.e.m. of at least two independent experiments and evaluated by Student’s t-test or Mann-Whitney-Wilcoxon test. Mean differences were considered significant when P ≤ 0.05. Statistical analyses were performed using GraphPad Prism v.6.01.
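The mean ± s.e.m. summary used throughout this chapter can be illustrated with a minimal Python sketch. The analyses reported here were performed in GraphPad Prism, not with this code. The function name and the replicate values are hypothetical, and the sketch assumes that the s.e.m. is the sample standard deviation divided by the square root of the number of replicates.

```python
from math import sqrt
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, s.e.m.) for a list of replicate measurements."""
    if len(values) < 2:
        raise ValueError("at least two values are needed to estimate the s.e.m.")
    # stdev() is the sample standard deviation (n - 1 denominator)
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical fold-changes from three independent experiments
fold_changes = [1.0, 2.0, 3.0]
m, sem = mean_sem(fold_changes)
print(f"{m:.2f} ± {sem:.2f}")
```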

6.3 Results

6.3.1 GHSROS expression is elevated in breast cancer samples and breast cancer-derived cell lines

To investigate the role of GHSROS in breast cancer, we first performed qRT-PCR assays on cDNA array panels, revealing that GHSROS was expressed at low levels in 3 out of 16 normal breast tissue samples (Figure 6.1A). In contrast, 46% (81/176) of tumours expressed GHSROS at significantly higher levels than normal breast specimens (P = 0.030). GHSROS was significantly elevated in both stage I (P = 0.0077) and stage III (P = 0.034) breast cancer samples (Figure 6.1B; Table 6.2). GHSROS expression was not correlated with a range of other clinical parameters and features, including age, hormone receptor status (ER, PR, and HER2), and metastasis (Table 6.2). Similar to the normal clinical tissue specimens, very low levels of GHSROS expression were observed in the MCF10A and HMEC non-malignant cell lines (0.726 ± 0.377 fold in HMEC, P = 0.27 compared to MCF10A). GHSROS was expressed at higher levels in the triple-negative (ER-, PR-, HER2-) MDA-MB-231 (2.71 ± 0.27 fold, P = 0.0013) and MDA-MB-453 (33.60 ± 3.33 fold, P = 0.0003) cell lines and in the (ER-, PR+, HER2+) BT474 (8.36 ± 0.94 fold, P = 0.0002) and (ER+, PR+, HER2+) MCF7 (3.28 ± 0.65 fold, P = 0.0019) cell lines compared to the MCF10A cell line (Figure 6.1C).

Figure 6.1. GHSROS is expressed at low levels in normal breast tissue and at higher levels in breast cancer. (A) GHSROS expression (qRT-PCR) in clinical breast cancer (BCa, n=176) and normal breast (N) samples (n=17). (B) GHSROS expression (qRT-PCR) stratified by clinical stage of breast cancer. N (n=16), stage I (n=35), stage II (n=74), stage III (n=57), and stage IV (n=10). (C) GHSROS expression (qRT-PCR) in the MCF-10A and HMLE normal-breast derived cell lines and the MDA-MB-231, MCF-7, T-47D, BT474, MDA-MB-453 breast cancer cell lines. All experiments were performed independently three times (n=3) with three replicates per experiment (n=3). Mean fold change ± s.e.m. using the comparative 2^-ΔΔCt method. Samples were normalized using β-actin (ACTB). *P ≤ 0.05, **P ≤ 0.01, ***P ≤ 0.001, Student’s t-test.

6.3.2 Ectopic overexpression of GHSROS promotes in vitro breast-derived cell line migration, but not proliferation

In order to investigate GHSROS function, we generated MCF10A (MCF10A-GHSROS) cells and MDA-MB-231 breast cancer (MDA-MB-231-GHSROS) cells stably overexpressing full-length GHSROS and corresponding vector control lines (MCF10A-Vector; MDA-MB-231-Vector) (Figure 6.2A). Using real-time cell analysis (xCELLigence system), it was observed that GHSROS overexpression did not significantly increase the in vitro proliferation rate of cultured MCF10A (0.96 ± 0.04 fold, P = 0.179) or MDA-MB-231 cell lines (0.89 ± 0.81 fold, P = 0.290) over 72 hours (Figure 6.2B). Overexpression of GHSROS, however, significantly increased the rate of migration of MDA-MB-231 (1.34 ± 0.26 fold, P = 0.045, Figure 6.2C) and MCF10A cell lines (1.71 ± 0.43 fold, P = 0.040, Figure 6.2D) over an 18-hour period.

Table 6.2. GHSROS expression (qRT-PCR) and clinicopathological parameters in breast cancer and normal breast clinical specimens. GHSROS expression in tumours (T) stratified by clinical stage and clinical features was compared to normal breast (N). P-values were calculated using the Mann-Whitney-Wilcoxon test. ER = estrogen receptor; PR = progesterone receptor; HER2 = human epidermal growth factor receptor 2; M = metastatic tumour; Non-M = non-metastasised tumour (primary breast tumour confined to site); NA = not applicable. *P ≤ 0.05 compared to normal breast (N). Some samples were missing information for clinical features and were excluded from that analysis.

Clinicopathological parameters | Sample number (n) | P-value
Age at diagnosis (mean ± SD) | 55.83 ± 14.02 |
N/T | 16/176 | 0.0300*
Clinical stage | |
N | 16 | NA
I | 35 | 0.0077*
II | 74 | 0.0537
III | 57 | 0.0345*
IV | 10 | 0.8796
Clinical features, n (%) | |
ER+/ER- | 74/59 | 0.3829
PR+/PR- | 57/67 | 0.3402
HER2+/HER2- | 93/42 | 0.6185
ER-, PR-, HER2-/Other | 152/33 | 0.7110
M/Non-M | 17/159 | 0.4236

Figure 6.2. GHSROS promotes cell migration, but not cell proliferation, in the MCF10A and MDA-MB-231 breast-derived cell lines in vitro. (A) Relative expression of GHSROS in vector and MDA-MB-231-GHSROS and vector and MCF10A-GHSROS cell lines. Expression was normalised to the housekeeping gene ACTB using the comparative 2^-ΔΔCt method of quantification. Results are relative to each vector control. Mean ± s.e.m., n=3, **P ≤ 0.01, Student’s t-test. (B) Proliferation was not significantly increased in the MCF10A or MDA-MB-231 cell lines overexpressing GHSROS compared to cells expressing vector alone when assessed using an xCELLigence real-time cell analyser for 72 hours. (C) GHSROS overexpression increases MCF10A migration across a porous membrane (8 µm pores). Left panel: representative plot of raw cell index impedance measurements from 0 to 20 hours after cell seeding. Right panel: GHSROS overexpression increases cell migration at 18 hours. Mean ± s.e.m. (n=3). **P ≤ 0.01, Student’s t-test. (D) GHSROS overexpression increases MDA-MB-231 migration across a porous membrane (8 µm pores). Left panel: representative plot of raw cell index impedance measurements from 0 to 20 hours after cell seeding. Right panel: GHSROS overexpression increases cell migration at 18 hours. Mean ± s.e.m. (n=3). *P ≤ 0.05, Student’s t-test.

6.3.3 GHSROS regulates genes associated with cancer and immune response

In order to gain further insight into the function of GHSROS in breast cancer, gene expression microarrays were performed on RNA isolated from GHSROS-overexpressing MDA-MB-231 cells and control cells. Our analysis revealed that a total of 76 genes were differentially regulated by at least 1.5-fold in GHSROS-overexpressing MDA-MB-231 cells (36 upregulated and 40 downregulated genes, P ≤ 0.05) (Figure 6.3A and Supplementary Table 13). Quantitative RT-PCR validated many of these genes, some of which are discussed below.

The most upregulated genes in the microarray were serotonin binding 5-hydroxytryptamine receptor 1F (HTR1F, 3.1 fold-change, P = 0.0040) and ephrin (EPH) receptor A3 (EPHA3, 2.3 fold-change, P = 0.0190) (Supplementary Table 13). Significant upregulation of HTR1F was verified by qRT-PCR in the MDA-MB-231-GHSROS cell line (19.2 ± 1.173 fold-change, P = 0.0001) (Figure 6.3B). Conversely, HTR1F expression was reduced in MCF10A-GHSROS cells (-2.68 ± 0.002 fold-change, P = 0.0001) (Figure 6.3B). Notably, a number of established oncogenes were differentially expressed in the MDA-MB-231-GHSROS microarray data. These included teneurin transmembrane protein 1 (TENM1), which was downregulated in MDA-MB-231-GHSROS cells (-2.00 fold-change, P = 0.0238). Quantitative RT-PCR validated this finding in MDA-MB-231-GHSROS cells (-2.65 ± 0.286 fold-change, P = 0.0048) and also showed a similar downregulation in MCF10A-GHSROS cells (-2.31 ± 0.262 fold-change, P = 0.0201) (Figure 6.3B). The T-box transcription factor gene (TBX3), a gene associated with breast cancer cell migration and growth (Amir et al., 2016), was upregulated (1.5 fold-change, P = 0.0041) (Supplementary Table 13) in the MDA-MB-231-GHSROS microarray. While qRT-PCR confirmed upregulation in MDA-MB-231-GHSROS cells (5.60 ± 0.82 fold-change, P = 0.0249) (Figure 6.3B), TBX3 expression was not altered in MCF10A-GHSROS cells (Figure 6.3B).

In our data set, we observed that a single probe matched to several class II major histocompatibility complex (MHC) genes. A number of MHC class II genes were significantly repressed by GHSROS (including HLA-DRB3, -2.7 fold-change, P = 0.0068; HLA-DRB1, -2.7 fold-change, P = 0.0063; HLA-DRA, -2.4 fold-change, P = 0.0058; HLA-DPB1, -2.3 fold-change, P = 0.039) (Supplementary Table 13). MHC gene loci are complex. Their exons are highly similar, and person-to-person variation in their exon sequence can confound microarray probe hybridisation (Boegel et al., 2012; Trowsdale & Knight, 2013). To firmly establish which MHC-II genes were differentially expressed upon forced GHSROS overexpression, we validated selected genes by qRT-PCR. We were able to confirm that GHSROS overexpression in the MDA-MB-231 cells induced downregulation of the MHC-II genes HLA-DRA (-4.62 ± 0.024 fold-change, P = 0.0001), HLA-DPB1 (-4.53 ± 0.30 fold-change, P = 0.0008), HLA-DPA1 (-3.69 ± 0.02 fold-change, P = 0.0001) and HLA-DRB3 (-3.84 ± 0.02 fold-change, P = 0.0001) (Figure 6.3B). These transcripts were not detected in the MCF10A-GHSROS or MCF10A-Vector cell lines, consistent with the fact that normal breast epithelial tissue is typically MHC-II negative (Tabibzadeh & Satyaswaroop, 1990).

Differentially expressed genes identified by microarray analysis were interrogated for biological interactions using the STRING tool (Szklarczyk et al., 2017). Within the functional protein-protein interaction (PPI) network we observed a small, distinct interaction between the MHC-II gene set (HLA-DRA, HLA-DPA1, HLA-DPB1) and protein tyrosine phosphatase, non-receptor type 22 (lymphoid) (PTPN22) (Figure 6.3C; expected interactions = 7, observed interactions = 15, PPI enrichment P = 0.00986, hypergeometric test). KEGG pathway analysis derived from 56 genes analysed in STRING demonstrated that downregulated genes in the MDA-MB-231-GHSROS cells were enriched for pathways typically associated with the expression of multiple MHC-II genes (Figure 6.3C, Table 6.3), including antigen processing and presentation pathways (Benjamini-Hochberg corrected P-value, BH-FDR = 0.00201), asthma (BH-FDR = 0.00107), and graft-versus-host disease (BH-FDR = 0.00107) (Table 6.3).
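The Benjamini-Hochberg correction behind the BH-FDR values quoted above can be illustrated with a minimal Python sketch. This is not STRING's implementation. The function name and the input P-values are hypothetical, and the sketch assumes the standard step-up procedure in which each sorted P-value is multiplied by m/rank and monotonicity is then enforced from the largest P-value downwards.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    # indices of the P-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # walk from the largest P-value to the smallest so that adjusted
    # values never increase as the raw P-value decreases
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P-values for four pathway terms
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
print([round(q, 3) for q in adj])
```

A term is then reported as enriched when its adjusted value falls at or below the chosen false discovery rate, which is 5% in Table 6.3.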

Figure 6.3. GHSROS significantly differentially regulates 76 genes in the MDA-MB-231 breast cancer cell line. (A) Scatter plot visualization of induced (red) or repressed (blue) genes identified by microarray. The threshold was set at a log2 1.5-fold change and Q (Benjamini-Hochberg adjusted P-value) ≤ 0.05. EPHA3 (EPH receptor A3), HTR1F (5-hydroxytryptamine receptor 1F), FKBP10 (FK506 binding protein 10), LOC645188 (uncharacterized LOC645188), ZNF585B (zinc finger protein 585B), HLA (Human Leukocyte Antigen class II). (B) Expression of TBX3, HTR1F, TENM1, HLA-DRA, HLA-DRB3, HLA-DPA1, HLA-DPB1 was measured by qRT-PCR from cultured MDA-MB-231-GHSROS, MDA-MB-231-Vector, MCF10A-Vector and MCF10A-GHSROS cells (n=3). Expression was normalised to the housekeeping gene ACTB. Results are relative to the respective vector control. Mean ± s.e.m., n=3, ***P ≤ 0.01, Student’s t-test. (C) STRING network consisting of 56 proteins encoded by genes differentially expressed in the MDA-MB-231-GHSROS cells. Nodes represent differentially expressed genes. Genes induced (red edging) or repressed (blue edging) by GHSROS are indicated. Lines between protein nodes indicate biological associations inferred or experimentally demonstrated.

Table 6.3. Enriched KEGG pathway terms for 40 genes downregulated in microarray analysis from MDA-MB-231-GHSROS cells (compared to empty-vector control). * indicates that the term passes the 5% Benjamini-Hochberg false discovery rate (BH-FDR) corrected P-value threshold.

Pathway ID | Pathway description | Gene count | BH-FDR | Genes
5310 | Asthma | 3 | 0.00107* | HLA-DPA1, HLA-DPB1, HLA-DRA
5330 | Allograft rejection | 3 | 0.00107* | HLA-DPA1, HLA-DPB1, HLA-DRA
5332 | Graft-versus-host disease | 3 | 0.00107* | HLA-DPA1, HLA-DPB1, HLA-DRA
4940 | Type I diabetes mellitus | 3 | 0.00111* | HLA-DPA1, HLA-DPB1, HLA-DRA
4672 | Intestinal immune network for IgA production | 3 | 0.00119* | HLA-DPA1, HLA-DPB1, HLA-DRA
5150 | Staphylococcus aureus infection | 3 | 0.00132* | HLA-DPA1, HLA-DPB1, HLA-DRA
5320 | Autoimmune thyroid disease | 3 | 0.00132* | HLA-DPA1, HLA-DPB1, HLA-DRA
5416 | Viral myocarditis | 3 | 0.00146* | HLA-DPA1, HLA-DPB1, HLA-DRA
5321 | Inflammatory bowel disease (IBD) | 3 | 0.00185* | HLA-DPA1, HLA-DPB1, HLA-DRA
4612 | Antigen processing and presentation | 3 | 0.00201* | HLA-DPA1, HLA-DPB1, HLA-DRA
5140 | Leishmaniasis | 3 | 0.00209* | HLA-DPA1, HLA-DPB1, HLA-DRA
5323 | Rheumatoid arthritis | 3 | 0.00356* | HLA-DPA1, HLA-DPB1, HLA-DRA
5322 | Systemic lupus erythematosus | 3 | 0.00443* | HLA-DPA1, HLA-DPB1, HLA-DRA
5145 | Toxoplasmosis | 3 | 0.0069* | HLA-DPA1, HLA-DPB1, HLA-DRA
4514 | Cell adhesion molecules | 3 | 0.0114* | HLA-DPA1, HLA-DPB1, HLA-DRA
4145 | Phagosome | 3 | 0.0121* | HLA-DPA1, HLA-DPB1, HLA-DRA
5164 | Influenza A | 3 | 0.0179* | HLA-DPA1, HLA-DPB1, HLA-DRA
5152 | Tuberculosis | 3 | 0.0181* | HLA-DPA1, HLA-DPB1, HLA-DRA
5168 | Herpes simplex infection | 3 | 0.0183* | HLA-DPA1, HLA-DPB1, HLA-DRA
5169 | Epstein-Barr virus infection | 3 | 0.0221* | HLA-DPA1, HLA-DPB1, HLA-DRA
5166 | HTLV-I infection | 3 | 0.0465* | HLA-DPA1, HLA-DPB1, HLA-DRA

6.3.4 GHSROS increases orthotopic breast xenograft growth

In order to investigate the effect of GHSROS on tumour growth in vivo, we used an orthotopic xenograft model (Kocaturk & Versteeg, 2015). MDA-MB-231luc-GHSROS and MDA-MB-231luc-Vector cells were injected into the mammary fat pad of NSG mice (Figure 6.4A). Compared to vector controls, mammary fat pad xenograft tumour size (measured by total flux [p/s]) was significantly increased in mice bearing MDA-MB-231luc-GHSROS tumours at post-implantation day 28 (Mann-Whitney-Wilcoxon test, P = 0.0002), day 42 (P < 0.0001), and day 49, the experimental endpoint (P ≤ 0.0001) (Figure 6.4B). Upon excision post-mortem, the MDA-MB-231luc-GHSROS tumours also weighed significantly more than vector control tumours (Mann-Whitney-Wilcoxon test, P = 0.0169) (Figure 6.4C).

Figure 6.4. GHSROS promotes orthotopic MDA-MB-231 xenograft tumour growth in vivo. (A) NSG mice were injected in the mammary fat pad with MDA-MB-231luc cell lines stably transfected with GHSROS (n = 10 mice) or empty vector (n = 10). Representative IVIS images showing total flux (bioluminescence) at day 49 (endpoint) demonstrating that tumours are larger in mice with GHSROS-overexpressing tumours. (B) Time course for MDA-MB-231luc-GHSROS (n = 10) and luc-vector control (n = 10) mammary fat pad xenograft tumour bioluminescence. Tumour bioluminescence (total flux, p/s) was measured using the IVIS Spectrum in vivo imaging system. Mean ± s.e.m. ***P ≤ 0.001, two-way ANOVA with Bonferroni’s post hoc analysis. (C) Tumour weights of MDA-MB-231 xenografts (GHSROS overexpressing n=10, vector n=10) at end point. Mean ± s.e.m. *P < 0.05, Mann-Whitney-Wilcoxon test.

6.4 Discussion

We previously demonstrated that the lncRNA GHSROS, derived from a gene antisense to the ghrelin receptor gene, is expressed by non-small cell lung tumours and increases lung cancer cell line migration in vitro (Whiteside et al., 2013). Similarly, in this study we demonstrate that GHSROS is expressed in breast cancer and promotes in vitro cell line migration. GHSROS also promotes MDA-MB-231 breast cancer cell line xenograft tumour growth and alters the expression of cancer-related and immune genes. While lncRNAs are rarely highly conserved or abundantly expressed, they can play integral regulatory roles in a large array of cellular processes, including under pathological conditions such as tumourigenesis (Lanzos et al., 2017; Qiu & Xu, 2013). More recently, genome-wide RNA-seq profiling of 22 paired tumour and non-malignant tissues from oestrogen receptor positive (ER+) breast cancers demonstrated that the expression of natural antisense transcript (NAT) lncRNAs is increased in tumour samples (Wenric et al., 2017), suggesting that they may play important regulatory roles.

A number of lncRNAs that are differentially expressed in breast cancer function as critical mediators of breast cancer tumourigenesis (Nie et al., 2012; Pan et al., 2011; Qiu et al., 2013). For example, the lncRNAs H19 (Collette, Le Bourhis, & Adriaenssens, 2017), MALAT1 (Miao & Qian, 2016), HOTAIR (Avazpour et al., 2017), and TUG1 (Taurine upregulated 1 gene) (Chiu et al., 2018) are more highly expressed in breast cancer and breast cancer cell lines (Avazpour et al., 2017; Nagini, 2017). LncRNAs have also been proposed as potential breast cancer biomarkers. MALAT1, an abundant and highly conserved lncRNA, can be detected in the serum of breast cancer patients at much higher levels than in patients with benign breast disease (Miao et al., 2016). Although we have not identified a direct relationship between GHSROS expression and clinical parameters (including ER, PR and HER2 status), further studies interrogating a larger patient cohort are warranted. GHSROS increases in vitro cell migration, but not in vitro cell proliferation, of the MDA-MB-231 breast cancer cell line and the non-tumourigenic normal-breast derived MCF10A cell line. Previously we observed similar effects when overexpressing GHSROS in non-small-cell lung carcinoma cell lines (Whiteside et al., 2013). Although we observed no changes in proliferation in our two-dimensional cell models, tumour size was significantly greater in MDA-MB-231-GHSROS orthotopic xenograft tumours. In vitro models of cell proliferation do not replicate many aspects of cancer progression (Kocaturk & Versteeg, 2015), including the tumour microenvironment and growth factors which are not present in an in vitro system (Cavo et al., 2016).

We performed microarray analysis of MDA-MB-231-GHSROS cultured in vitro to reveal potential GHSROS target genes. The most highly upregulated gene, HTR1F, belongs to a subgroup of serotonin (5-HT) receptors and is significantly associated with breast cancer recurrence (Kopparapu et al., 2013). As this gene was downregulated in the MCF10A-GHSROS cell line, which also showed increased in vitro migration, we speculate that HTR1F is unlikely to play a major role in GHSROS-mediated cell migration; however, its function in migration could be complex. Upregulation of the 5-HT signalling pathways in metastatic breast cancer is pro-oncogenic, stimulating pro-proliferative, invasive, and anti-apoptotic pathways (Kopparapu et al., 2013). Conversely, normal physiological levels of the ligand, 5-HT, induce growth inhibition and apoptosis in breast cancer cell lines, presumably through increases in expression of the receptor (Pai & Horseman, 2009). The key cancer gene TBX3, which is elevated in GHSROS-overexpressing cells, is also elevated in metastatic breast cancer, correlates with reduced metastasis-free survival, and potently promotes cell survival and tumour growth in early-stage breast cancer cell models (Amir et al., 2016; Li & Prince, 2013; Peres et al., 2010). Additionally, TBX3 overexpression stimulates cell migration in normal breast and breast cancer cells (Peres et al., 2010). We postulate that TBX3 represents a key gene mediating the effects of GHSROS in breast cancer.

Transcripts encoding a number of major histocompatibility complex class II (MHC-II) subunits were repressed in MDA-MB-231 cells overexpressing GHSROS. MHC-II genes encode cell surface proteins primarily involved in antigen presentation and adaptive immunity (Doonan & Haque, 2010; Forero et al., 2016; Trowsdale & Knight, 2013). Reduced tumour expression of the MHC-II complex increases breast tumour aggressiveness and results in poor overall survival (Forero et al., 2016; Thibodeau, Bourgeois-Daigneault, & Lapointe, 2012). Conversely, increased MHC-II expression is associated with a positive prognosis in triple-negative breast cancer (Forero et al., 2016). Given that in vivo xenograft models using human cell lines require immunocompromised or syngeneic mice, the role of GHSROS in anti-tumour immunity is currently challenging to investigate. Humanised mice that support human cell lines and patient-derived xenografts (Walsh et al., 2017; Zhao et al., 2018) will be critical in assessing whether GHSROS overexpression indeed facilitates immune system evasion. We speculate that GHSROS, by downregulating critical components of the acquired immune system, promotes breast tumour cell survival.

In conclusion, in this study we examined the expression and function of the lncRNA GHSROS in breast cancer, demonstrating a role in breast cancer cell migration and orthotopic xenograft tumour growth. These data expand on recent findings on GHSROS in lung cancer and provide a rationale for further investigation into this lncRNA in cancer.

Chapter 7

General Discussion

7.1 General discussion

Summary of results

This thesis characterised the expression and function of the lncRNA GHSROS in cancer, with a focus on prostate and breast cancer as model diseases. We provide evidence that GHSROS expression is dysregulated in breast and prostate cancer and that it regulates a number of hallmarks of cancer progression (Gutschner & Diederichs, 2012) – including cell proliferation, migration, and survival, and xenograft growth in vivo, indicating its oncogenic properties. Our transcriptomic analyses indicate that GHSROS reprograms prostate and breast cancer tumours towards a more aggressive phenotype. GHSROS stimulates androgen-independent prostate cancer growth and, therefore, may also play a role in the development of castrate resistant prostate cancer – an untreatable and inevitably lethal step in the progression of prostate cancer. In LNCaP and PC3 prostate cancer cell lines GHSROS expression is upregulated by the chemotherapeutic agent docetaxel and it also permits cell survival under stress, indicating that it may contribute to cytotoxic drug resistance. GHSROS overexpression in the LNCaP cell line significantly delayed cell death following treatment with the antiandrogen enzalutamide.

GHSROS expression is highly elevated in a subset of advanced prostate cancers and other tumour types

GHSROS, originally described in our laboratory, is highly expressed in lung cancer (Whiteside et al., 2013). These findings were extended to characterise the expression and role of GHSROS in prostate cancer in this thesis. Screening publicly accessible microarray data from ~4000 tissues, we detected GHSROS in the brain and numerous foetal tissues, as well as in tumours of the brain (glioma), lung, and prostate. GHSROS expression was confirmed in a subset of prostate cancers, using commercial cDNA arrays and prostate cancer-specific arrays (Chapter 3). GHSROS harbours an AT-rich human-specific promoter in a MER5B SINE repeat element (Whiteside et al., 2013), a pattern frequently found in promoters of lncRNAs which have high tissue specificity and low expression levels (Kannan et al., 2015; Pheasant & Mattick, 2007). In a secondary cohort of primary prostate specimens, elevated GHSROS expression significantly correlated with high Gleason score and advanced cancer stage, a feature in common with other prostate cancer lncRNAs such as SChLAP1 and NEAT1 (Chakravarty et al., 2014; Cimadamore et al., 2017; Mehra et al., 2016). Evidence of cancer-specific upregulation of GHSROS was not limited to prostate cancer, however, as it was also increased compared to normal tissue in thyroid, testicular, ovarian, lymphoid, kidney, oesophageal, cervical, endometrial, adrenal, lung and breast cancers. We confirmed that GHSROS was upregulated in breast cancers of a range of subtypes and classifications (including basal, luminal and HER2+) (Chapter 5).

It is estimated that ~20-63% of human transcripts exist as sense-antisense pairs (Chen et al., 2004), with key regulatory lncRNAs overlapping coding genes (Ning et al., 2017). GHSROS fully overlaps the single 2.2 kb intronic region of the GHSR gene and is, therefore, a natural antisense gene (Whiteside et al., 2013). Typically, antisense lncRNAs are able to regulate the gene expression and splicing of their sense coding gene counterparts (Villegas & Zaphiropoulos, 2015), influencing transcriptional complexity and cell signalling pathways (Korneev et al., 2015; Latge et al., 2018). In Chapter 3, we employed qRT-PCR to provide preliminary evidence that ectopic GHSROS overexpression in prostate cancer cell lines upregulates the gene expression of the truncated ghrelin receptor isoform GHSR1b. GHSR1b is overexpressed by many human tumours, including those of the prostate, lung, breast, ovary, adrenal cortex, and pituitary (Barzon et al., 2005; Gahete et al., 2011; Gaytan et al., 2005; Ibanez-Costa et al., 2015; Jeffery et al., 2002; Takahashi et al., 2006). The basis for its alternative splicing and differential expression is currently unknown. GHSROS may regulate its overlapping coding sense gene by binding to GHSR pre-mRNA, thereby modulating the expression of the receptor isoforms. Further studies are required to investigate this association, including Western immunoblotting and immunohistochemistry for GHSR1b following overexpression and knockdown of GHSROS in cancer cell lines. Further, more sophisticated methods, such as the construction of a minigene containing the GHSR and GHSROS loci, will be a crucial future direction in order to investigate cis splicing events in vitro.

GHSROS has oncogenic functions in prostate and breast cancer

Given the large amount of data supporting a role in cancer, we hypothesised that overexpression of GHSROS may modulate cell growth and migration in our cell lines (including the androgen-independent PC3 and DU145 cell lines). Strand-specific knockdown was performed using locked nucleic acid (LNA) antisense oligonucleotides. The lncRNAs FALEC, MALAT1, and SChLAP1 are among a small number of lncRNAs which increase prostate cancer cell proliferation and migration when overexpressed (Prensner et al., 2013; Wang et al., 2015; Zhao et al., 2017). Overexpression of GHSROS in the DU145 and PC3 prostate cancer cells led to increases in in vitro cell proliferation and migration. Furthermore, GHSROS enhanced attachment to collagen type IV, a major component of the basement membrane (Tanjore & Kalluri, 2006). A number of lncRNAs are capable of sustaining chronic proliferation (Gutschner & Diederichs, 2012). Recent large-scale screens of lncRNAs exhibiting temporal expression and control within the S-phase of cell division show that they are common regulators of cell replication (Ali et al., 2018). Increased cellular proliferation of GHSROS-overexpressing PC3 and DU145 prostate cancer cell lines was recapitulated in subcutaneous xenograft tumours in mice. Our studies also demonstrated that this tumour-promoting role is not confined to prostate cancer, as ectopic overexpression significantly increased orthotopic xenograft growth in the MDA-MB-231 breast cancer cell line. Taken together, our studies show that GHSROS acts in a similar manner to other bona fide cancer-associated lncRNAs to, ultimately, regulate tumour growth and progression.

It is now appreciated that lncRNAs can regulate a multitude of genes through transcriptional and posttranscriptional mechanisms. Small, unspliced lncRNAs nested within introns, like GHSROS, exhibit distinct expression in tumours compared to normal tissue (Engelhardt & Stadler, 2015; Nakaya et al., 2007; Reis et al., 2004). Furthermore, they are often involved in the fine-tuning of gene expression, acting in trans (Louro et al., 2008). We used RNA-seq to investigate the transcriptome of PC3 cells overexpressing GHSROS, hypothesising that upregulation of this lncRNA in prostate cancer cells would alter a wide range of tumour-promoting genes (Chapter 4). Indeed, GHSROS overexpression led to the dysregulation of ~400 genes involved in cancer hallmarks, including cell migration, cell growth, and adaptation to stress (including hypoxia and drug responses). GHSROS upregulates a number of genes reported to enhance prostate tumour growth and survival, including NTSR1 (a GPCR and high-affinity neurotensin receptor), TFF2, and MUC5B. NTSR1, the mucin genes and trefoil factor family (TFF) peptides are expressed in advanced prostate tumours and cell lines, where they mediate cell migration and growth (Almeida et al., 2010; Bougen et al., 2013; Hashimoto et al., 2015; Lapointe et al., 2004; Legrier et al., 2004; Radiloff et al., 2011; Swift & Maitland, 2010; Vestergaard et al., 2006). The secretion of TFF peptides and mucins is a co-ordinated process. TFF2 is specifically upregulated during differentiation into the mucin-secreting phenotype, where secretory mucins (including MUC5B) are expressed and have potent mitogenic functions (Gouyer et al., 2001; Legrier et al., 2004; Vestergaard et al., 2006; Yu et al., 1996).

A transcriptomics approach for identifying tumour drivers in our RNA-seq list identified a core set of 34 (from 400) GHSROS-regulated transcripts which were also differentially expressed in the Taylor (B. S. Taylor et al., 2010) and Grasso (Grasso et al., 2012) prostate cancer datasets, both of which contained primary and metastatic tumours. This core gene set included TFF2 and MUC5B, suggesting that these genes may indeed be part of a pro-metastatic phenotype in prostate cancer cells. Additional clustering of these genes demonstrated that the expression of the 34 core genes was useful in predicting adverse disease outcomes in prostate cancer. This signature performed in the 97th percentile when compared to the performance of 23 published gene expression-based signatures (outlined in Chapter 4), indicating that a reasonable number of GHSROS-regulated genes are differentially expressed in prostate tumours versus normal prostate and that, as a network, they are important for advanced cancer and metastasis. The potential for each of these genes to be functionally significant in prostate cancer metastasis has yet to be elucidated. Moreover, testing the prognostic utility of the expression of these 34 genes as a signature for predicting survival outcomes in large and well-annotated clinical cohorts is an important future direction.
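
The overlap step described above can be illustrated with a minimal sketch. It is not the analysis pipeline used in this thesis (see Chapter 4 for the actual methods): the function names `core_gene_set` and `percentile_rank` are hypothetical, the gene lists are placeholders (only TFF2 and MUC5B are taken from the text), and the percentile definition shown is one common convention assumed here for illustration.

```python
def core_gene_set(regulated, *cohort_gene_sets):
    """Return, sorted, the genes found in the regulated list and in every cohort list."""
    core = set(regulated)
    for genes in cohort_gene_sets:
        core &= set(genes)  # keep only genes also differentially expressed in this cohort
    return sorted(core)


def percentile_rank(score, reference_scores):
    """Percentage of reference scores that are less than or equal to the given score.

    Assumed convention for illustration; other percentile definitions exist.
    """
    if not reference_scores:
        raise ValueError("reference_scores must not be empty")
    at_or_below = sum(1 for s in reference_scores if s <= score)
    return 100.0 * at_or_below / len(reference_scores)


# Placeholder inputs, NOT thesis data: GENE_* symbols are invented for illustration.
ghsros_regulated = {"TFF2", "MUC5B", "GENE_A", "GENE_B", "GENE_C"}
cohort_one = {"TFF2", "MUC5B", "GENE_A", "GENE_X"}  # stands in for one clinical dataset
cohort_two = {"TFF2", "MUC5B", "GENE_B", "GENE_X"}  # stands in for a second dataset

core = core_gene_set(ghsros_regulated, cohort_one, cohort_two)
print(core)  # genes shared by all three lists
```

Requiring a gene to be present in every list is a deliberately strict criterion: it trades sensitivity for a smaller set of transcripts that are supported by independent clinical cohorts.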

GHSROS promotes androgen-independent prostate cancer growth

Our current data also suggest that GHSROS mediates tumour growth independent of a reduction in the expression of androgen receptor (AR), one of the most downregulated genes following GHSROS overexpression in the PC3 and LNCaP cells in vitro. Recent evidence suggests that subsets of genes or gene networks could confer a castration-resistant phenotype in prostate cancer cells which were previously dependent upon androgen/AR-mediated signalling (Bluemn et al., 2013). While metastatic or advanced tumours typically rely on AR-signalling to enhance tumour growth and promote survival, some prostate tumours may be able to bypass these pathways and permit growth independent of AR expression and action (Bluemn et al., 2017; Fan et al., 2013; Grossmann, Huang, & Tindall, 2001; McKeehan, Adams, & Fast, 1987). Interestingly, some of these gene networks exist among the genes differentially expressed by GHSROS and within the 34-gene signature. For example, mucin glycoproteins (including MUC5B) and TFF peptides, as well as being powerful promoters of cell migration, are also associated with hormone independence and AR-independent growth (Fladeby et al., 2008; Legrier et al., 2004; Pandey et al., 2014; Radiloff et al., 2011). Moreover, growth factor receptors, like NTSR1, may be recruited in advanced prostate cancer as an alternative growth pathway in the absence of androgens (Swift et al., 2010). As discussed in Chapter 4, overexpression of NTSR1 and GHSR1b protein in non-small cell lung cancer can lead to heterodimeric complexing of these two GPCRs and formation of a functioning neuromedin U (NMU) receptor, enhancing in vitro cell migration and growth (Takahashi et al., 2006). Given that GHSROS upregulates each of these genes, it is tempting to speculate that it may regulate this interaction. This could be investigated by overexpressing GHSROS in prostate cancer cells and exploring GPCR heterodimerisation using AR-specific signalling assays and GPCR receptor proximity assays such as quantitative bioluminescence resonance energy transfer (qBRET) (Szalai et al., 2014).

The effect of reduced AR expression in androgen-insensitive cell lines, such as the PC3 cell line, which has been shown to express minute amounts of AR (Alimirah et al., 2006), and its impact on tumour growth are unknown. In the strongly androgen-dependent LNCaP cell line, GHSROS promoted cell migration, proliferation, and subcutaneous tumour growth. Interestingly, these effects on LNCaP cells were observed in vitro (where AR was repressed by GHSROS; see Chapter 4) and in xenografts (where AR expression was variable; Chapter 5). With this variability, a subset of the LNCaP xenograft tumours overexpressing GHSROS had higher levels of AR expression, which led to gene set enrichment of the androgen response in vivo. This suggests that the AR may be regulated by local changes in the microenvironment at the level of the tumour, or be subject to a regulatory feedback loop even with the effects exerted by GHSROS. In vitro, LNCaP cells overexpressing GHSROS had a delayed response to the actions of enzalutamide, which is likely to be due to the reduced levels of AR. Given the discrepancies in AR levels in vitro and in vivo, future studies will include an orthotopic xenograft arm, where GHSROS-overexpressing androgen-responsive prostate cancer cells will be grown in castrated mice. Immunohistochemical measurements of the AR and PSA (encoded by KLK3) will also be performed to characterise the response to GHSROS overexpression. In this context, GHSROS may act similarly to the lncRNA LOC283070 which, when overexpressed, mediates LNCaP cell proliferation and migration under AR-independent conditions and tumour growth in vivo in castrated mice (in the absence of androgens) (Wang et al., 2016). Luciferase-linked androgen response element (ARE) reporter vectors (under exogenous dihydrotestosterone (DHT) exposure) in GHSROS-overexpressing LNCaP cells will be useful in determining whether GHSROS mutes signalling through the AR and primes prostate cancer cells to reduced androgen conditions (Azeem et al., 2017).

In Chapter 5, RNA-seq and qRT-PCR studies of the largely proliferative GHSROS-overexpressing LNCaP xenograft tumours confirmed a potent repression of PPP2R2C and upregulation of genes encoding TFF peptides (TFF1 and TFF2), supporting our previous findings (Chapter 4). RNA-sequencing data from PC3-GHSROS cultured in vitro were complemented by RNA-sequencing of LNCaP-GHSROS xenografts (Chapter 5). This revealed 101 common GHSROS-regulated genes between the PC3 and LNCaP GHSROS cell lines, including PPP2R2C and TFF2. In subsets of patients who develop resistance to AR-targeting therapies, reductions in the level of PPP2R2C facilitate CRPC by promotion of AR-independent growth and survival (Bluemn et al., 2013). This is evident in LNCaP and VCaP prostate cancer cells which, when subjected to PPP2R2C knockdown, grow independently of the AR (Bluemn et al., 2013). As we speculated (Chapter 5), GHSROS reduces the expression of PPP2R2C, an event which may prime tumours to adapt to AR-independent growth and facilitate CRPC development. Critical additional experiments are required to explore this further in vivo, for example by depleting PPP2R2C in GHSROS-overexpressing xenografts and assessing tumour growth in response to enzalutamide and, conversely, by determining the effect of restoring PPP2R2C expression in these experiments. Furthermore, a number of important metastatic genes, including TFF2, were regulated by GHSROS and shared between the Taylor and Grasso prostate cancer datasets. Interestingly, protein network analysis and disease-free survival outcomes revealed that GHSROS regulates genes not previously associated with cancer metastasis and disease outcomes, including CHRDL1. This is similar to other lncRNAs which drive CRPC, including HOTAIR, which promotes prostate cancer tumour growth in the absence of androgens by driving AR-independent gene activation (Zhang et al., 2015). Additionally, NEAT1 is a driver of prostate tumour growth and facilitator of an AR-independent gene regulatory network through an ERα-specific transcriptome (Chakravarty et al., 2014). Potentially, these gene expression changes, including strong repression of PPP2R2C, may support a model where GHSROS works independently of the expression or function of the AR. In this model, GHSROS overexpression, which is unaffected by DHT or enzalutamide treatment, may facilitate enzalutamide resistance and prime prostate cancer cells for CRPC.

In order to exert gene-regulatory changes, intronic ncRNAs may be responsive to physiological stimuli (which include androgens) or xenobiotic stimuli (Louro et al., 2009). Among drugs targeting advanced CRPC, the cytotoxic drug docetaxel is used upon failure of standard androgen targeted therapy (ATT) or in patients presenting with a high tumour burden (Puente et al., 2017). Although not responsive to exogenous DHT and seemingly recalcitrant to the effects of enzalutamide, GHSROS expression showed a strong dose-dependent increase following docetaxel treatment in LNCaP and PC3 cells. Additionally, LNCaP cells overexpressing GHSROS showed a delayed rate of cell death and apoptosis in response to the cytotoxic effects of docetaxel. Our findings are consistent with those reported for other tumour-promoting lncRNAs in prostate cancer, including POTEF-AS1 and MALAT1, which when overexpressed decrease the rate of apoptosis in response to docetaxel (Misawa, Fujimura, et al., 2017; Xue et al., 2018). POTEF-AS1 in particular achieves this by repressing genes in toll-like receptor signalling and apoptosis pathways (Misawa, Takayama, et al., 2017). Additionally, overexpression of the noncoding gene PCGEM1 reduces the levels of p21 that increase in response to doxorubicin (Fu et al., 2006). The mechanisms by which GHSROS exerts its effects are still unknown; interestingly, however, RNA-seq of LNCaP-GHSROS xenografts shows that GHSROS represses the gene encoding the cell cycle inhibitor p21 (CDKN1A). A critical future aim will be to address the expression of p21 in response to docetaxel in GHSROS-overexpressing prostate cancer cells.

Taken together, data provided in this thesis offer evidence that GHSROS regulates an adaptation to the AR-mediated gene pathway and reduces apoptosis in prostate cancer cells. However, it remains to be elucidated whether GHSROS plays a role in the acquisition of CRPC or chemoresistance. Future studies will determine if targeting GHSROS is beneficial for inhibiting prostate cancer growth and the progression of androgen-independent prostate tumours. In particular, the mechanism and signalling pathways which GHSROS may regulate in order to permit this phenotype will guide future pre-clinical experiments, as will the efficacy of targeting GHSROS in vivo.

GHSROS as a therapeutic target for cancer

LncRNAs provide a wealth of promising targets for the development of new cancer therapies using RNA interference (RNAi) (Slaby et al., 2017). Antisense oligonucleotides (ASOs) have undergone a number of generations of refinements, and modified ASOs demonstrate some advantages (including increased hybridisation specificity and reduced toxicity) over siRNAs and conventional unmodified ASOs, particularly for targeting nuclear transcripts (Frazier, 2015; Ling, 2016; Pauli & Schier, 2015; Sharma et al., 2014; Stein & Castanotto, 2017). Antisense oligonucleotide-targeting of oncogenic lncRNAs may be a beneficial approach for the subset of cancer patients whose tumours overexpress these lncRNAs (Ling, 2016). As in the case of MALAT1 (Arun et al., 2016), targeting GHSROS may present an opportunity for clinical intervention. In this study we developed antisense oligonucleotides targeting GHSROS and assessed their function in the PC3 cell line in vitro. Preliminary knockdown studies with locked nucleic acid antisense oligonucleotides (LNA ASOs) reversed the in vitro effects of GHSROS on proliferation and migration in the PC3 cell line (Chapter 3). However, it is appreciated that translational and regulatory challenges exist for oligonucleotide therapies (Stein & Castanotto, 2017). To that end, we are in the process of refining the LNA-modified oligonucleotides used in this study and investigating various nanoparticle formulations, which reduce toxicity and improve the pharmacological profile of treatments, with the aim of targeting in vivo xenografts in proof-of-principle studies. Future studies will assess the practicality and usefulness of an LNA-based cancer therapeutic targeting GHSROS.

Figure 7.1. Proposed model of GHSROS function in prostate cancer. Summary of the role of GHSROS in prostate cancer, outlining its proposed expression and contribution to aggressive disease through transcriptional regulation. By promoting a pro-tumourigenic environment, GHSROS may help facilitate tumour growth independently of the AR, thus rendering the tumour partially resistant to the effects of current AR-targeting therapies. Through unknown mechanisms, GHSROS confers resistance to the effects of docetaxel, a cytotoxic drug which also upregulates the expression of this lncRNA. Targeting GHSROS with LNA-modified antisense oligonucleotides may lead to tumour regression.

Summary and conclusion

In the past decade there has been significant progress towards characterising genes in the human genome, an effort which unexpectedly revealed a large number of long noncoding RNAs with significant regulatory functions. An understanding of the expression and function of many of these lncRNAs still remains a significant experimental challenge, however. The findings in this PhD thesis have contributed further to the knowledge of lncRNAs and the important role they play in cancer. Additional studies will be crucial to elucidate the molecular mechanism of GHSROS action and to determine whether targeting GHSROS can be translated into the clinic as a viable RNA-based therapy for cancer.

Chapter 8

References cited

Abraham Czerniak, M., Nader Hanna, MD, Fred Konikoff, MD, Ayala Hubert, MD. (2010, October 7). Phase 1/2a, dose-escalation, safety, pharmacokinetic, and preliminary efficacy study of intratumoral administration of DTA-H19 in patients with unresectable pancreatic cancer. Retrieved from http://clinicaltrials.gov/ct2/show/study/NCT00711997

Adams, B. D., Parsons, C., Walker, L., Zhang, W. C., & Slack, F. J. (2017). Targeting noncoding RNAs in disease. J Clin Invest, 127(3), 761-771. doi:10.1172/jci84424

Adiconis, X., Borges-Rivera, D., Satija, R., DeLuca, D. S., Busby, M. A., Berlin, A. M., Levin, J. Z. (2013). Comparative analysis of RNA sequencing methods for degraded or low-input samples. Nat Methods, 10(7), 623-629. doi:10.1038/nmeth.2483

Agostini, F., Zanzoni, A., Klus, P., Marchese, D., Cirillo, D., & Tartaglia, G. G. (2013). catRAPID omics: a web server for large-scale prediction of protein-RNA interactions. Bioinformatics, 29(22), 2928-2930. doi:10.1093/bioinformatics/btt495

Aiello, A., Bacci, L., Re, A., Ripoli, C., Pierconti, F., Pinto, F., Farsetti, A. (2016). MALAT1 and HOTAIR long non-coding RNAs play opposite role in estrogen-mediated transcriptional regulation in prostate cancer cells. Sci Rep, 6, 38414. doi:10.1038/srep38414

Australian Institute of Health and Welfare. (2013). Prostate cancer in Australia. Asia Pac J Clin Oncol, 9(3), 199-213. doi:10.1111/ajco.12127

Alahari, S. V., Eastlack, S. C., & Alahari, S. K. (2016). Role of long noncoding RNAs in neoplasia: special emphasis on prostate cancer. Int Rev Cell Mol Biol, 324, 229-254. doi:10.1016/bs.ircmb.2016.01.004

Ali, M. M., Akhade, V. S., Kosalai, S. T., Subhash, S., Statello, L., Meryet-Figuiere, M., Kanduri, C. (2018). PAN-cancer analysis of S-phase enriched lncRNAs identifies oncogenic drivers and biomarkers. Nat Commun, 9(1), 883. doi:10.1038/s41467-018-03265-1

Alimirah, F., Chen, J., Basrawala, Z., Xin, H., & Choubey, D. (2006). DU-145 and PC-3 human prostate cancer cell lines express androgen receptor: implications for the androgen receptor functions and regulation. FEBS Lett, 580(9), 2294-2300. doi:10.1016/j.febslet.2006.03.041

Almeida, T. A., Rodriguez, Y., Hernandez, M., Reyes, R., & Bello, A. R. (2010). Differential expression of new splice variants of the neurotensin receptor 1 gene in human prostate cancer cell lines. Peptides, 31(2), 242-247. doi:10.1016/j.peptides.2009.12.007

Amir, S., Simion, C., Umeh-Garcia, M., Krig, S., Moss, T., Carraway, K. L., 3rd, & Sweeney, C. (2016). Regulation of the T-box transcription factor Tbx3 by the tumour suppressor microRNA-206 in breast cancer. Br J Cancer, 114(10), 1125-1134. doi:10.1038/bjc.2016.73

Amorim, M., Salta, S., Henrique, R., & Jeronimo, C. (2016). Decoding the usefulness of non-coding RNAs as breast cancer markers. J Transl Med, 14, 265. doi:10.1186/s12967-016-1025-3

Anastassova-Kristeva, M. (2015). Morphogens reveal the appearance and functions of lncRNAs. Stem Cells Dev, 24(13), 1591-1593. doi:10.1089/scd.2014.0528

Angrand, P. O., Vennin, C., Le Bourhis, X., & Adriaenssens, E. (2015). The role of long non-coding RNAs in genome formatting and expression. Front Genet, 6, 165. doi:10.3389/fgene.2015.00165

Apweiler, R., Bairoch, A., Wu, C. H., Barker, W. C., Boeckmann, B., Ferro., Yeh, L. S. (2004). UniProt: the universal protein knowledgebase. Nucleic Acids Res, 32(Database issue), D115-119. doi:10.1093/nar/gkh131

Arun, G., Diermeier, S., Akerman, M., Chang, K. C., Wilkinson, J. E., Hearn, S., Spector, D. L. (2016). Differentiation of mammary tumors and reduction in metastasis upon Malat1 lncRNA loss. Genes Dev, 30(1), 34-51. doi:10.1101/gad.270959.115

Atianand, M. K., & Fitzgerald, K. A. (2014). Long non-coding RNAs and control of gene expression in the immune system. Trends Mol Med, 20(11), 623-631. doi:10.1016/j.molmed.2014.09.002

Attard, G., Parker, C., Eeles, R. A., Schroder, F., Tomlins, S. A., Tannock, I., de Bono, J. S. (2016). Prostate cancer. Lancet, 387(10013), 70-82. doi:10.1016/s0140-6736(14)61947-4

Au, C. C., Furness, J. B., & Brown, K. A. (2016). Ghrelin and breast cancer: emerging roles in obesity, estrogen regulation, and cancer. Front Oncol, 6, 265. doi:10.3389/fonc.2016.00265

Auvergne, R. M., Sim, F. J., Wang, S., Chandler-Militello, D., Burch, J., Al Fanek, Y., Goldman, S. A. (2013). Transcriptional differences between normal and glioma-derived glial progenitor cells identify a core set of dysregulated genes. Cell Rep, 3(6), 2127-2141. doi:10.1016/j.celrep.2013.04.035

Avazpour, N., Hajjari, M., & Tahmasebi Birgani, M. (2017). HOTAIR: A promising long non-coding RNA with potential role in breast invasive carcinoma. Front Genet, 8, 170. doi:10.3389/fgene.2017.00170

Azeem, W., Hellem, M. R., Olsen, J. R., Hua, Y., Marvyin, K., Qu, Y., Kalland, K. H. (2017). An androgen response element driven reporter assay for the detection of androgen receptor activity in prostate cells. PLoS One, 12(6), e0177861. doi:10.1371/journal.pone.0177861

Barbieri, C. E., Baca, S. C., Lawrence, M. S., Demichelis, F., Blattner, M., Theurillat, J. P., Garraway, L. A. (2012). Exome sequencing identifies recurrent SPOP, FOXA1 and MED12 mutations in prostate cancer. Nat Genet, 44(6), 685-689. doi:10.1038/ng.2279

Barrett, T., Wilhite, S. E., Ledoux, P., Evangelista, C., Kim, I. F., Tomashevsky, M., Soboleva, A. (2013). NCBI GEO: archive for functional genomics data sets--update. Nucleic Acids Res, 41(Database issue), D991-995. doi:10.1093/nar/gks1193

Barzon, L., Pacenti, M., Masi, G., Stefani, A. L., Fincati, K., & Palu, G. (2005). Loss of growth hormone secretagogue receptor 1a and overexpression of type 1b receptor transcripts in human adrenocortical tumors. Oncology, 68(4-6), 414-421. doi:10.1159/000086983

Beer, T. M., Armstrong, A. J., Rathkopf, D. E., Loriot, Y., Sternberg, C. N., Higano, C. S., Tombal, B. (2014). Enzalutamide in metastatic prostate cancer before chemotherapy. N Engl J Med, 371(5), 424-433. doi:10.1056/NEJMoa1405095

Belkhiri, A., Dar, A. A., Zaika, A., Kelley, M., & El-Rifai, W. (2008). t-Darpp promotes cancer cell survival by up-regulation of Bcl2 through Akt-dependent mechanism. Cancer Res, 68(2), 395-403. doi:10.1158/0008-5472.CAN-07-1580

Belkhiri, A., Zhu, S., Chen, Z., Soutto, M., & El-Rifai, W. (2012). Resistance to TRAIL is mediated by DARPP-32 in gastric cancer. Clin Cancer Res, 18(14), 3889-3900. doi:10.1158/1078-0432.CCR-11-3182

Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Series B (Methodol), 289-300.

Benson, D. A., Karsch-Mizrachi, I., Lipman, D. J., Ostell, J., & Wheeler, D. L. (2005). GenBank. Nucleic Acids Res, 33(Database Issue), D34-38. doi:10.1093/nar/gki063

Bernstein, B. E., Kamal, M., Lindblad-Toh, K., Bekiranov, S., Bailey, D. K., Huebert, D. J., Lander, E. S. (2005). Genomic maps and comparative analysis of histone modifications in human and mouse. Cell, 120(2), 169-181. doi:10.1016/j.cell.2005.01.001

Berthold, D. R., Pond, G. R., Roessner, M., de Wit, R., Eisenberger, M., Tannock, A. I., & investigators, T. A. X. (2008). Treatment of hormone-refractory prostate cancer with docetaxel or mitoxantrone: relationships between prostate-specific antigen, pain, and quality of life response and survival in the TAX-327 study. Clin Cancer Res, 14(9), 2763-2767. doi:10.1158/1078-0432.CCR-07-0944

Bhan, A., Hussain, I., Ansari, K. I., Kasiri, S., Bashyal, A., & Mandal, S. S. (2013). Antisense transcript long noncoding RNA (lncRNA) HOTAIR is transcriptionally induced by estradiol. J Mol Biol, 425(19), 3707-3722. doi:10.1016/j.jmb.2013.01.022

Bhan, A., & Mandal, S. S. (2016). Estradiol-induced transcriptional regulation of long non-coding RNA, HOTAIR. Methods Mol Biol, 1366, 395-412. doi:10.1007/978-1-4939-3127-9_31

Bi, D., Ning, H., Liu, S., Que, X., & Ding, K. (2016). miR-1301 promotes prostate cancer proliferation through directly targeting PPP2R2C. Biomed Pharmacother, 81, 25-30. doi:10.1016/j.biopha.2016.03.043

Bibikova, M., Chudin, E., Arsanjani, A., Zhou, L., Garcia, E. W., Modder, J., Wang-Rodriguez, J. (2007). Expression signatures that correlated with Gleason score and relapse in prostate cancer. Genomics, 89(6), 666-672. doi:10.1016/j.ygeno.2007.02.005

Bill-Axelson, A., Holmberg, L., Garmo, H., Rider, J. R., Taari, K., Busch, C., Johansson, J. E. (2014). Radical prostatectomy or watchful waiting in early prostate cancer. N Engl J Med, 370(10), 932-942. doi:10.1056/NEJMoa1311593

Bismar, T. A., Demichelis, F., Riva, A., Kim, R., Varambally, S., He, L., Rubin, M. A. (2006). Defining aggressive prostate cancer using a 12-gene model. Neoplasia, 8(1), 59-68. doi:10.1593/neo.05664

Bluemn, E. G., Coleman, I. M., Lucas, J. M., Coleman, R. T., Hernandez-Lopez, S., Tharakan, R., Nelson, P. S. (2017). Androgen receptor pathway-independent prostate cancer is sustained through FGF signaling. Cancer Cell, 32(4), 474-489.e476. doi:10.1016/j.ccell.2017.09.003

Bluemn, E. G., Spencer, E. S., Mecham, B., Gordon, R. R., Coleman, I., Lewinshtein, D., Nelson, P. S. (2013). PPP2R2C loss promotes castration-resistance and is associated with increased prostate cancer-specific mortality. Mol Cancer Res, 11(6), 568-578. doi:10.1158/1541-7786.MCR-12-0710

Boegel, S., Lower, M., Schafer, M., Bukur, T., de Graaf, J., Boisguerin, V., Sahin, U. (2012). HLA typing from RNA-Seq sequence reads. Genome Med, 4(12), 102. doi:10.1186/gm403

Boon, R. A., Jae, N., Holdt, L., & Dimmeler, S. (2016). Long Noncoding RNAs: From clinical genetics to therapeutic targets? J Am Coll Cardiol, 67(10), 1214-1226. doi:10.1016/j.jacc.2015.12.051

Bougen, N. M., Amiry, N., Yuan, Y., Kong, X. J., Pandey, V., Vidal, L. J., Lobie, P. E. (2013). Trefoil factor 1 suppression of E-cadherin enhances prostate carcinoma cell invasiveness and metastasis. Cancer Lett, 332(1), 19-29. doi:10.1016/j.canlet.2012.12.012

Bourdoumis, A., Chrisofos, M., Stasinou, T., Christopoulos, P., Mourmouris, P., Kostakopoulos, A., & Deliveliotis, C. (2015). The role of PCA 3 as a prognostic factor in patients with castration-resistant prostate cancer (CRPC) treated with docetaxel. Anticancer Res, 35(5), 3075-3079.

Brazao, T. F., Johnson, J. S., Muller, J., Heger, A., Ponting, C. P., & Tybulewicz, V. L. (2016). Long noncoding RNAs in B-cell development and activation. Blood, 128(7), e10-19. doi:10.1182/blood-2015-11-680843

Brazma, A., Parkinson, H., Sarkans, U., Shojatalab, M., Vilo, J., Abeygunawardena, N., Sansone, S. A. (2003). ArrayExpress--a public repository for microarray gene expression data at the EBI. Nucleic Acids Res, 31(1), 68-71.

Brennan, P. A., Jing, J., Ethunandan, M., & Gorecki, D. (2004). Dystroglycan complex in cancer. Eur J Surg Oncol, 30(6), 589-592. doi:10.1016/j.ejso.2004.03.014

Briggs, J. A., Wolvetang, E. J., Mattick, J. S., Rinn, J. L., & Barry, G. (2015). Mechanisms of long non-coding RNAs in mammalian nervous system development, plasticity, disease, and evolution. Neuron, 88(5), 861-877. doi:10.1016/j.neuron.2015.09.045

Brown, C., Sauvageot, J., Kahane, H., & Epstein, J. I. (1996). Cell proliferation and apoptosis in prostate cancer--correlation with pathologic stage? Mod Pathol, 9(3), 205-209.

Bussemakers, M. J., van Bokhoven, A., Verhaegh, G. W., Smit, F. P., Karthaus, H. F., Schalken, J. A., Isaacs, W. B. (1999). DD3: a new prostate-specific gene, highly overexpressed in prostate cancer. Cancer Res, 59(23), 5975-5979.

Cabili, M. N., Trapnell, C., Goff, L., Koziol, M., Tazon-Vega, B., Regev, A., & Rinn, J. L. (2011). Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses. Genes Dev, 25(18), 1915-1927. doi:10.1101/gad.17446611

Campbell, M. A., & Wengel, J. (2011). Locked vs. unlocked nucleic acids (LNA vs. UNA): contrasting structures work towards common therapeutic goals. Chem Soc Rev, 40(12), 5680-5689. doi:10.1039/c1cs15048k

Cancer Genome Atlas Research Network. (2015). The Molecular Taxonomy of Primary Prostate Cancer. Cell, 163(4), 1011-1025. doi:10.1016/j.cell.2015.10.025

Cancer Genome Atlas Research Network, Weinstein, J. N., Collisson, E. A., Mills, G. B., Shaw, K. R., Ozenberger, B. A., . . . Stuart, J. M. (2013). The Cancer Genome Atlas Pan-Cancer analysis project. Nat Genet, 45(10), 1113-1120. doi:10.1038/ng.2764

Cao, J. (2014). The functional role of long non-coding RNAs and epigenetics. Biol Proced Online, 16, 11. doi:10.1186/1480-9222-16-11

Casper, J., Zweig, A. S., Villarreal, C., Tyner, C., Speir, M. L., Rosenbloom, K. R., Kent, W. J. (2018). The UCSC genome browser database: 2018 update. Nucleic Acids Res, 46(D1), D762-D769. doi:10.1093/nar/gkx1020

Cavo, M., Fato, M., Penuela, L., Beltrame, F., Raiteri, R., & Scaglione, S. (2016). Microenvironment complexity and matrix stiffness regulate breast cancer cell activity in a 3D in vitro model. Sci Rep, 6, 35367. doi:10.1038/srep35367

Cerami, E., Gao, J., Dogrusoz, U., Gross, B. E., Sumer, S. O., Aksoy, B. A., Schultz, N. (2012). The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer Discov, 2(5), 401-404. doi:10.1158/2159-8290.CD-12-0095

Cerk, S., Schwarzenbacher, D., Adiprasito, J. B., Stotz, M., Hutterer, G. C., Gerger, A., Pichler, M. (2016). Current Status of Long Non-Coding RNAs in Human Breast Cancer. Int J Mol Sci, 17(9). doi:10.3390/ijms17091485

Chakravarty, D., Sboner, A., Nair, S. S., Giannopoulou, E., Li, R., Hennig, S., Rubin, M. A. (2014). The oestrogen receptor alpha-regulated lncRNA NEAT1 is a critical modulator of prostate cancer. Nat Commun, 5, 5383. doi:10.1038/ncomms6383

Chandra Gupta, S., & Nandan Tripathi, Y. (2017). Potential of long non-coding RNAs in cancer patients: From biomarkers to therapeutic targets. Int J Cancer, 140(9), 1955-1967. doi:10.1002/ijc.30546

Chandran, U. R., Dhir, R., Ma, C., Michalopoulos, G., Becich, M., & Gilbertson, J. (2005). Differences in gene expression in prostate cancer, normal appearing prostate tissue adjacent to cancer and prostate tissue from cancer free organ donors. BMC Cancer, 5, 45. doi:10.1186/1471-2407-5-45

Chandrasekar, T., Yang, J. C., Gao, A. C., & Evans, C. P. (2015). Mechanisms of resistance in castration-resistant prostate cancer (CRPC). Transl Androl Urol, 4(3), 365-380. doi:10.3978/j.issn.2223-4683.2015.05.02

Cheetham, S. W., Gruhl, F., Mattick, J. S., & Dinger, M. E. (2013). Long noncoding RNAs and the genetics of cancer. Br J Cancer, 108(12), 2419-2425. doi:10.1038/bjc.2013.233

Chen, C. L., Tseng, Y. W., Wu, J. C., Chen, G. Y., Lin, K. C., Hwang, S. M., & Hu, Y. C. (2015). Suppression of hepatocellular carcinoma by baculovirus-mediated expression of long non-coding RNA PTENP1 and MicroRNA regulation. Biomaterials, 44, 71-81. doi:10.1016/j.biomaterials.2014.12.023

Chen, D., Liu, L., Wang, K., Yu, H., Wang, Y., Liu, J., Zhang, H. (2017). The role of MALAT-1 in the invasion and metastasis of gastric cancer. Scand J Gastroenterol, 52(6-7), 790-796. doi:10.1080/00365521.2017.1280531

Chen, H., Du, G., Song, X., & Li, L. (2017). Non-coding transcripts from enhancers: new insights into enhancer activity and gene expression regulation. Genomics Proteomics Bioinformatics, 15(3), 201-207. doi:10.1016/j.gpb.2017.02.003

Chen, J., Sun, M., Kent, W. J., Huang, X., Xie, H., & Wang, W. (2004). Over 20% of human transcripts might form sense-antisense pairs. Nucleic Acids Res, 32. doi:10.1093/nar/gkh818

Chen, J. L., Li, J., Stadler, W. M., & Lussier, Y. A. (2011). Protein-network modeling of prostate cancer gene signatures reveals essential pathways in disease recurrence. J Am Med Inform Assoc, 18(4), 392-402. doi:10.1136/amiajnl-2011-000178

Chen, S., Wang, Y., Zhang, J. H., Xia, Q. J., Sun, Q., Li, Z. K., Dong, M. S. (2017). Long non-coding RNA PTENP1 inhibits proliferation and migration of breast cancer cells via AKT and MAPK signaling pathways. Oncol Lett, 14(4), 4659-4662. doi:10.3892/ol.2017.6823

Chen, X., Xu, S., McClelland, M., Rahmatpanah, F., Sawyers, A., Jia, Z., & Mercola, D. (2012). An accurate prostate cancer prognosticator using a seven-gene signature plus Gleason score and taking cell type heterogeneity into account. PLoS One, 7(9), e45178. doi:10.1371/journal.pone.0045178

Cheng, I., Plummer, S. J., Neslund-Dudas, C., Klein, E. A., Casey, G., Rybicki, B. A., & Witte, J. S. (2010). Prostate cancer susceptibility variants confer increased risk of disease progression. Cancer Epidemiol Biomarkers Prev, 19(9), 2124-2132. doi:10.1158/1055-9965.EPI-10-0268

Cheng, Z., Li, Z., Ma, K., Li, X., Tian, N., Duan, J., Wang, Y. (2017). Long non-coding RNA XIST promotes glioma tumorigenicity and angiogenesis by acting as a molecular sponge of miR-429. J Cancer, 8(19), 4106-4116. doi:10.7150/jca.21024

Chiu, H. S., Somvanshi, S., Patel, E., Chen, T. W., Singh, V. P., Zorman, B., Sumazin, P. (2018). Pan-cancer analysis of lncRNA regulation supports their targeting of cancer genes in each tumor context. Cell Rep, 23(1), 297-312.e212. doi:10.1016/j.celrep.2018.03.064

Chopin, L. K., Seim, I., Walpole, C. M., & Herington, A. C. (2012). The ghrelin axis--does it have an appetite for cancer progression? Endocr Rev, 33(6), 849-891. doi:10.1210/er.2011-1007

Choudhry, H., Harris, A. L., & McIntyre, A. (2016). The tumour hypoxia induced non-coding transcriptome. Mol Aspects Med, 47-48, 35-53. doi:10.1016/j.mam.2016.01.003

Chow, K. B., Leung, P. K., Cheng, C. H., Cheung, W. T., & Wise, H. (2008). The constitutive activity of ghrelin receptors is decreased by co-expression with vasoactive prostanoid receptors when over-expressed in human embryonic kidney 293 cells. Int J Biochem Cell Biol, 40(11), 2627-2637. doi:10.1016/j.biocel.2008.05.008

Chow, K. B., Sun, J., Chu, K. M., Tai Cheung, W., Cheng, C. H., & Wise, H. (2012). The truncated ghrelin receptor polypeptide (GHS-R1b) is localized in the endoplasmic reticulum where it forms heterodimers with ghrelin receptors (GHS-R1a) to attenuate their cell surface expression. Mol Cell Endocrinol, 348(1), 247-254. doi:10.1016/j.mce.2011.08.034

Christenson, J. L., Denny, E. C., & Kane, S. E. (2015). t-Darpp overexpression in HER2-positive breast cancer confers a survival advantage in lapatinib. Oncotarget, 6(32), 33134-33145. doi:10.18632/oncotarget.5311

Chu, K. M., Chow, K. B., Leung, P. K., Lau, P. N., Chan, C. B., Cheng, C. H., & Wise, H. (2007). Over-expression of the truncated ghrelin receptor polypeptide attenuates the constitutive activation of phosphatidylinositol-specific phospholipase C by ghrelin receptors but has no effect on ghrelin-stimulated extracellular signal-regulated kinase 1/2 activity. Int J Biochem Cell Biol, 39(4), 752-764. doi:10.1016/j.biocel.2006.11.007

Chua, M. L. K., Lo, W., Pintilie, M., Murgic, J., Lalonde, E., Bhandari, V., Bristow, R. G. (2017). A prostate cancer "nimbosus": Genomic Instability and SChLAP1 dysregulation underpin aggression of intraductal and cribriform subpathologies. Eur Urol, 72(5), 665-674. doi:10.1016/j.eururo.2017.04.034

Cimadamore, A., Gasparrini, S., Mazzucchelli, R., Doria, A., Cheng, L., Lopez-Beltran, A., Montironi, R. (2017). Long non-coding RNAs in prostate cancer with emphasis on second chromosome locus associated with prostate-1 Expression. Front Oncol, 7, 305. doi:10.3389/fonc.2017.00305

Clark, M. B., Johnston, R. L., Inostroza-Ponta, M., Fox, A. H., Fortini, E., Moscato, P., Mattick, J. S. (2012). Genome-wide analysis of long noncoding RNA stability. Genome Res, 22(5), 885-898.

Clark, M. B., & Mattick, J. S. (2011). Long noncoding RNAs in cell biology. Semin Cell Dev Biol, 22(4), 366-376.

Coffey, A. J., Kokocinski, F., Calafato, M. S., Scott, C. E., Palta, P., Drury, E., Palotie, A. (2011). The GENCODE exome: sequencing the complete human exome. Eur J Hum Genet, 19(7), 827-831. doi:10.1038/ejhg.2011.28

Colditz, J., Rupf, B., Maiwald, C., & Baniahmad, A. (2016). Androgens induce a distinct response of epithelial-mesenchymal transition factors in human prostate cancer cells. Mol Cell Biochem, 421(1-2), 139-147. doi:10.1007/s11010-016-2794-y

Collette, J., Le Bourhis, X., & Adriaenssens, E. (2017). Regulation of human breast cancer by the long non-coding RNA H19. Int J Mol Sci, 18(11). doi:10.3390/ijms18112319

Conway, T., Wazny, J., Bromage, A., Tymms, M., Sooraj, D., Williams, E. D., & Beresford-Smith, B. (2012). Xenome--a tool for classifying reads from xenograft samples. Bioinformatics, 28(12), i172-178. doi:10.1093/bioinformatics/bts236

Cuzick, J., Swanson, G. P., Fisher, G., Brothman, A. R., Berney, D. M., Reid, J. E., Transatlantic Prostate Group. (2011). Prognostic value of an RNA expression signature derived from cell cycle proliferation genes in patients with prostate cancer: a retrospective study. Lancet Oncol, 12(3), 245-255. doi:10.1016/S1470-2045(10)70295-3

Cyr-Depauw, C., Northey, J. J., Tabaries, S., Annis, M. G., Dong, Z., Cory, S., Siegel, P. M. (2016). chordin-like 1 suppresses bone morphogenetic protein 4-induced breast cancer cell migration and invasion. Mol Cell Biol, 36(10), 1509-1525. doi:10.1128/MCB.00600-15

Dapas, M., Kandpal, M., Bi, Y., & Davuluri, R. V. (2017). Comparative evaluation of isoform-level gene expression estimation algorithms for RNA-seq and exon-array platforms. Brief Bioinform, 18(2), 260-269. doi:10.1093/bib/bbw016

Davies, G. F., Berg, A., Postnikoff, S. D., Wilson, H. L., Arnason, T. G., Kusalik, A., & Harkness, T. A. (2014). TFPI1 mediates resistance to doxorubicin in breast cancer cells by inducing a hypoxic-like response. PLoS One, 9(1), e84611. doi:10.1371/journal.pone.0084611

Davis, S., & Meltzer, P. S. (2007). GEOquery: a bridge between the Gene Expression Omnibus (GEO) and BioConductor. Bioinformatics, 23(14), 1846-1847. doi:10.1093/bioinformatics/btm254

de Graaf, C., Blom, W. A., Smeets, P. A., Stafleu, A., & Hendriks, H. F. (2004). Biomarkers of satiation and satiety. Am J Clin Nutr, 79(6), 946-961.

Deng, J., Tang, J., Wang, G., & Zhu, Y. S. (2017). Long non-coding RNA as potential biomarker for prostate cancer: is it making a difference? Int J Environ Res Public Health, 14(3). doi:10.3390/ijerph14030270

Deng, J., Yang, M., Jiang, R., An, N., Wang, X., & Liu, B. (2017). Long non-coding RNA HOTAIR regulates the proliferation, self-renewal capacity, tumor formation and migration of the cancer stem-like cell (CSC) subpopulation enriched from breast cancer cells. PLoS One, 12(1), e0170860. doi:10.1371/journal.pone.0170860

Deniz, E., & Erman, B. (2017). Long noncoding RNA (lincRNA), a new paradigm in gene expression control. Funct Integr Genomics, 17(2-3), 135-143. doi:10.1007/s10142-016-0524-x

Dennis, G., Jr., Sherman, B. T., Hosack, D. A., Yang, J., Gao, W., Lane, H. C., & Lempicki, R. A. (2003). DAVID: Database for annotation, visualization, and integrated discovery. Genome Biol, 4(5), P3.

Derrien, T., & Guigo, R. (2011). Long non-coding RNAs with enhancer-like function in human cells. Med Sci (Paris), 27(4), 359-361. doi:10.1051/medsci/2011274009

Derrien, T., Johnson, R., Bussotti, G., Tanzer, A., Djebali, S., Tilgner, H., Guigo, R. (2012). The GENCODE v7 catalog of human long noncoding RNAs: analysis of their gene structure, evolution, and expression. Genome Res, 22(9), 1775-1789. doi:10.1101/gr.132159.111

Deveson, I. W., Hardwick, S. A., Mercer, T. R., & Mattick, J. S. (2017). The dimensions, dynamics, and relevance of the mammalian noncoding transcriptome. Trends Genet, 33(7), 464-478. doi:10.1016/j.tig.2017.04.004

Dey, B. K., Mueller, A. C., & Dutta, A. (2014). Long non-coding RNAs as emerging regulators of differentiation, development, and disease. Transcription, 5(4), e944014. doi:10.4161/21541272.2014.944014

Dhamija, S., & Diederichs, S. (2016). From junk to master regulators of invasion: lncRNA functions in migration, EMT and metastasis. Int J Cancer, 139(2), 269-280. doi:10.1002/ijc.30039

Di Gesualdo, F., Capaccioli, S., & Lulli, M. (2014). A pathophysiological view of the long non-coding RNA world. Oncotarget, 5(22), 10976-10996. doi:10.18632/oncotarget.2770

Dimitrova, N., Zamudio, J. R., Jong, R. M., Soukup, D., Resnick, R., Sarma, K., Jacks, T. (2014). LincRNA-p21 activates p21 in cis to promote Polycomb target gene expression and to enforce the G1/S checkpoint. Mol Cell, 54(5), 777-790. doi:10.1016/j.molcel.2014.04.025

Dixon-McDougall, T., & Brown, C. (2016). The making of a Barr body: the mosaic of factors that eXIST on the mammalian inactive X chromosome. Biochemistry and Cell Biology, 94(1), 56-70. doi:10.1139/bcb-2015-0016

Djebali, S., Davis, C. A., Merkel, A., Dobin, A., Lassmann, T., Mortazavi, A., Gingeras, T. R. (2012). Landscape of transcription in human cells. Nature, 489(7414), 101-108. doi:10.1038/nature11233

Donald Lamm, M. (2012, 13/05/2012). Phase 2b, multicenter trial of intravesical dta-h19/pei in patients with intermediate-risk superficial bladder cancer. Retrieved from http://clinicaltrials.gov/ct2/show/study/NCT00595088

Doonan, B. P., & Haque, A. (2010). HLA class II antigen presentation in prostate cancer cells: a novel approach to prostate tumor immunotherapy. Open Cancer Immunol J, 3, 1-7. doi:10.2174/1876401001003010001

Drake, C. G., Sharma, P., & Gerritsen, W. (2014). Metastatic castration-resistant prostate cancer: new therapies, novel combination strategies and implications for immunotherapy. Oncogene, 33(43), 5053-5064. doi:10.1038/onc.2013.497

Drake, J. M., Paull, E. O., Graham, N. A., Lee, J. K., Smith, B. A., Titz, B., Stuart, J. M. (2016). Phosphoproteome integration reveals patient-specific networks in prostate cancer. Cell, 166(4), 1041-1054. doi:10.1016/j.cell.2016.07.007

Du, J., Yuan, Z., Ma, Z., Song, J., Xie, X., & Chen, Y. (2014). KEGG-PATH: Kyoto encyclopedia of genes and genomes-based pathway analysis using a path analysis model. Mol Biosyst, 10(9), 2441-2447. doi:10.1039/c4mb00287c

Du, Z., Sun, T., Hacisuleyman, E., Fei, T., Wang, X., Brown, M., Liu, X. S. (2016). Integrative analyses reveal a long noncoding RNA-mediated sponge regulatory network in prostate cancer. Nat Commun, 7, 10982. doi:10.1038/ncomms10982

Durinck, S., Spellman, P. T., Birney, E., & Huber, W. (2009). Mapping identifiers for the integration of genomic datasets with the R/Bioconductor package biomaRt. Nat Protoc, 4(8), 1184-1191. doi:10.1038/nprot.2009.97

The ENCODE (ENCyclopedia Of DNA Elements) Project. (2004). Science, 306(5696), 636-640. doi:10.1126/science.1105136

Engelhardt, J., & Stadler, P. F. (2015). Evolution of the unspliced transcriptome. BMC Evol Biol, 15, 166. doi:10.1186/s12862-015-0437-7

Ernst, J., Kheradpour, P., Mikkelsen, T. S., Shoresh, N., Ward, L. D., Epstein, C. B., Bernstein, B. E. (2011). Mapping and analysis of chromatin state dynamics in nine human cell types. Nature, 473(7345), 43-49. doi:10.1038/nature09906

Espinosa, J. M. (2017). On the Origin of lncRNAs: Missing Link Found. Trends Genet, 33(10), 660-662. doi:10.1016/j.tig.2017.07.005

Ezkurdia, I., Juan, D., Rodriguez, J. M., Frankish, A., Diekhans, M., Harrow, J., Tress, M. L. (2014). Multiple evidence strands suggest that there may be as few as 19,000 human protein-coding genes. Hum Mol Genet, 23(22), 5866-5878. doi:10.1093/hmg/ddu309

Fan, Y. L., Chen, L., Wang, J., Yao, Q., & Wan, J. Q. (2013). Over expression of PPP2R2C inhibits human glioma cells growth through the suppression of mTOR pathway. FEBS Lett, 587(24), 3892-3897. doi:10.1016/j.febslet.2013.09.029

Fang, Z., Zhang, S., Wang, Y., Shen, S., Wang, F., Hao, Y., Yang, H. (2016). Long non-coding RNA MALAT-1 modulates metastatic potential of tongue squamous cell carcinomas partially through the regulation of small proline rich proteins. BMC Cancer, 16, 706. doi:10.1186/s12885-016-2735-x

Felden, B., & Paillard, L. (2017). When eukaryotes and prokaryotes look alike: the case of regulatory RNAs. FEMS Microbiol Rev, 41(5), 624-639. doi:10.1093/femsre/fux038

Feng, Y., Fan, Y., Huiqing, C., Zicai, L., & Quan, D. (2014). The emerging landscape of long non-coding RNAs. Yi Chuan, 36(5), 456-468.

Ferlay, J., Soerjomataram, I., Dikshit, R., Eser, S., Mathers, C., Rebelo, M., Bray, F. (2015). Cancer incidence and mortality worldwide: sources, methods and major patterns in GLOBOCAN 2012. Int J Cancer, 136(5), E359-386. doi:10.1002/ijc.29210

Ferraldeschi, R., Welti, J., Luo, J., Attard, G., & de Bono, J. S. (2015). Targeting the androgen receptor pathway in castration-resistant prostate cancer: progresses and prospects. Oncogene, 34(14), 1745-1757. doi:10.1038/onc.2014.115

Ferre, F., Colantoni, A., & Helmer-Citterich, M. (2016). Revealing protein-lncRNA interaction. Brief Bioinform, 17(1), 106-116. doi:10.1093/bib/bbv031

Fladeby, C., Gupta, S. N., Barois, N., Lorenzo, P. I., Simpson, J. C., Saatcioglu, F., & Bakke, O. (2008). Human PARM-1 is a novel mucin-like, androgen-regulated gene exhibiting proliferative effects in prostate cancer cells. Int J Cancer, 122(6), 1229-1235. doi:10.1002/ijc.23185

Fokina, A. A., Stetsenko, D. A., & Francois, J. C. (2015). DNA enzymes as potential therapeutics: towards clinical application of 10-23 DNAzymes. Expert opinion on biological therapy, 15(5), 689-711. doi:10.1517/14712598.2015.1025048

Forero, A., Li, Y., Chen, D., Grizzle, W. E., Updike, K. L., Merz, N. D., Varley, K. E. (2016). Expression of the MHC class II pathway in triple-negative breast cancer tumor cells is associated with a good prognosis and infiltrating lymphocytes. Cancer Immunol Res, 4(5), 390-399. doi:10.1158/2326-6066.cir-15-0243

Frazier, K. S. (2015). Antisense oligonucleotide therapies: the promise and the challenges from a toxicologic pathologist's perspective. Toxicologic pathology, 43(1), 78-89. doi:10.1177/0192623314551840

Fu, X., Ravindranath, L., Tran, N., Petrovics, G., & Srivastava, S. (2006). Regulation of apoptosis by a prostate-specific and prostate cancer-associated noncoding gene, PCGEM1. DNA Cell Biol, 25(3), 135-141. doi:10.1089/dna.2006.25.135

Fung, J. N., Jeffery, P. L., Lee, J. D., Seim, I., Roche, D., Obermair, A., Chen, C. (2013). Silencing of ghrelin receptor expression inhibits endometrial cancer cell growth in vitro and in vivo. Am J Physiol Endocrinol Metab, 305, E305-313. doi:10.1152/ajpendo.00156.2013

Gahete, M. D., Cordoba-Chacon, J., Hergueta-Redondo, M., Martinez-Fuentes, A. J., Kineman, R. D., Moreno-Bueno, G., Castano, J. P. (2011). A novel human ghrelin variant (In1-ghrelin) and ghrelin-O-acyltransferase are overexpressed in breast cancer: potential pathophysiological relevance. PLoS One, 6(8), e23302. doi:10.1371/journal.pone.0023302

Gao, J., Aksoy, B. A., Dogrusoz, U., Dresdner, G., Gross, B., Sumer, S. O., Schultz, N. (2013). Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci Signal, 6(269), pl1. doi:10.1126/scisignal.2004088

Gao, L., Ren, W., Zhang, L., Li, S., Kong, X., Zhang, H., Zhi, K. (2017). PTENp1, a natural sponge of miR-21, mediates PTEN expression to inhibit the proliferation of oral squamous cell carcinoma. Mol Carcinog, 56(4), 1322-1334. doi:10.1002/mc.22594

Gaytan, F., Morales, C., Barreiro, M. L., Jeffery, P., Chopin, L. K., Herington, A. C., Tena-Sempere, M. (2005). Expression of growth hormone secretagogue receptor type 1a, the functional ghrelin receptor, in human ovarian surface epithelium, mullerian duct derivatives, and ovarian tumors. J Clin Endocrinol Metab, 90(3), 1798-1804. doi:10.1210/jc.2004-1532

Geisler, S., & Coller, J. (2013). RNA in unexpected places: long non-coding RNA functions in diverse cellular contexts. Nat Rev Mol Cell Biol, 14(11), 699-712. doi:10.1038/nrm3679

GTEx Consortium. (2013). The Genotype-Tissue Expression (GTEx) project. Nat Genet, 45(6), 580-585. doi:10.1038/ng.2653

Gittelman, M. C., Hertzman, B., Bailen, J., Williams, T., Koziol, I., Henderson, R. J., Ward, J. F. (2013). PCA3 molecular urine test as a predictor of repeat prostate biopsy outcome in men with previous negative biopsies: a prospective multicenter clinical study. J Urol, 190(1), 64-69. doi:10.1016/j.juro.2013.02.018

Glazko, G. V., Zybailov, B. L., & Rogozin, I. B. (2012). Computational prediction of polycomb-associated long non-coding RNAs. PLoS One, 7(9), e44878. doi:10.1371/journal.pone.0044878

Glinsky, G. V., Berezovska, O., & Glinskii, A. B. (2005). Microarray analysis identifies a death-from-cancer signature predicting therapy failure in patients with multiple types of cancer. J Clin Invest, 115(6), 1503-1521. doi:10.1172/JCI23412

Glinsky, G. V., Glinskii, A. B., Stephenson, A. J., Hoffman, R. M., & Gerald, W. L. (2004). Gene expression profiling predicts clinical outcome of prostate cancer. J Clin Invest, 113(6), 913-923. doi:10.1172/JCI20032

Global Burden of Disease Cancer Collaboration, Fitzmaurice, C., Dicker, D., Pain, A., Hamavid, H., Moradi-Lakeh, M., Naghavi, M. (2015). The Global Burden of Cancer 2013. JAMA Oncol, 1(4), 505-527. doi:10.1001/jamaoncol.2015.0735

Gnanapavan, S., Kola, B., Bustin, S., Morris, D., McGee, P., Fairclough, P., , M. (2002). The tissue distribution of the mRNA of ghrelin and subtypes of its receptor, GHS-R, in humans. Journal of Clinical Endocrinology and Metabolism, 87, 2988.

Gonit, M., Zhang, J., Salazar, M., Cui, H., Shatnawi, A., Trumbly, R., & Ratnam, M. (2011). Hormone depletion-insensitivity of prostate cancer cells is supported by the AR without binding to classical response elements. Mol Endocrinol, 25(4), 621-634. doi:10.1210/me.2010-0409

Gonzalez-Alonso, P., Cristobal, I., Manso, R., Madoz-Gurpide, J., Garcia-Foncillas, J., & Rojo, F. (2015). PP2A inhibition as a novel therapeutic target in castration-resistant prostate cancer. Tumour Biol, 36(8), 5753-5755. doi:10.1007/s13277-015-3849-5

Gonzalez-Ramirez, I., Soto-Reyes, E., Sanchez-Perez, Y., Herrera, L. A., & Garcia-Cuellar, C. (2014). Histones and long non-coding RNAs: the new insights of epigenetic deregulation involved in oral cancer. Oral Oncol, 50(8), 691-695. doi:10.1016/j.oraloncology.2014.04.006

Gordiiuk, V. V. (2014). Long non-coding RNAs--"tuning fork" in regulation of cell processes. Ukr Biochem J, 86(2), 5-15.

Gouyer, V., Wiede, A., Buisine, M. P., Dekeyser, S., Moreau, O., Lesuffleur, T., Huet, G. (2001). Specific secretion of gel-forming mucins and TFF peptides in HT-29 cells of mucin-secreting phenotype. Biochim Biophys Acta, 1539(1-2), 71-84.

Grasso, C. S., Wu, Y. M., Robinson, D. R., Cao, X., Dhanasekaran, S. M., Khan, A. P., Tomlins, S. A. (2012). The mutational landscape of lethal castration-resistant prostate cancer. Nature, 487(7406), 239-243. doi:10.1038/nature11125

Griffith, M., Griffith, O. L., Mwenifumbo, J., Goya, R., Morrissy, A. S., Morin, R. D., Marra, M. A. (2010). Alternative expression analysis by RNA sequencing. Nat Methods, 7(10), 843-847. doi:10.1038/nmeth.1503

Grossmann, M. E., Huang, H., & Tindall, D. J. (2001). Androgen receptor signaling in androgen-refractory prostate cancer. J Natl Cancer Inst, 93(22), 1687-1697.

Grote, P., & Herrmann, B. G. (2013). The long non-coding RNA Fendrr links epigenetic control mechanisms to gene regulatory networks in mammalian embryogenesis. RNA Biol, 10(10), 1579-1585. doi:10.4161/rna.26165

Grote, P., Wittler, L., Hendrix, D., Koch, F., Wahrisch, S., Beisaw, A., Herrmann, B. G. (2013). The tissue-specific lncRNA Fendrr is an essential regulator of heart and body wall development in the mouse. Dev Cell, 24(2), 206-214. doi:10.1016/j.devcel.2012.12.012

Gruber, A. R., Bernhart, S. H., & Lorenz, R. (2015). The ViennaRNA web services. Methods Mol Biol, 1269, 307-326. doi:10.1007/978-1-4939-2291-8_19

Gutschner, T., & Diederichs, S. (2012). The hallmarks of cancer: a long non-coding RNA point of view. RNA Biol, 9(6), 703-719. doi:10.4161/rna.20481

Gutschner, T., Richtig, G., Haemmerle, M., & Pichler, M. (2017). From biomarkers to therapeutic targets-the promises and perils of long non-coding RNAs in cancer. Cancer Metastasis Rev. doi:10.1007/s10555-017-9718-5

Hackanson, B., Bennett, K. L., Brena, R. M., Jiang, J., Claus, R., Chen, S. S., Plass, C. (2008). Epigenetic modification of CCAAT/enhancer binding protein alpha expression in acute myeloid leukemia. Cancer Res, 68(9), 3142-3151. doi:10.1158/0008-5472.can-08-0483

Hajjari, M., & Salavaty, A. (2015). HOTAIR: an oncogenic long non-coding RNA in different cancers. Cancer Biol Med, 12(1), 1-9. doi:10.7497/j.issn.2095-3941.2015.0006

Han, P., & Chang, C. P. (2015). Long non-coding RNA and chromatin remodeling. RNA Biol, 12(10), 1094-1098. doi:10.1080/15476286.2015.1063770

Hanahan, D., & Weinberg, R. (2000). The hallmarks of cancer. Cell, 100, 57-70.

Hanahan, D., & Weinberg, R. A. (2011). Hallmarks of cancer: the next generation. Cell, 144(5), 646-674. doi:10.1016/j.cell.2011.02.013

Harlos, C., Musto, G., Lambert, P., Ahmed, R., & Pitz, M. W. (2015). Androgen pathway manipulation and survival in patients with lung cancer. Horm Cancer, 6(2-3), 120-127. doi:10.1007/s12672-015-0218-1

Harrow, J., Denoeud, F., Frankish, A., Reymond, A., Chen, C. K., Chrast, J., Guigo, R. (2006). GENCODE: producing a reference annotation for ENCODE. Genome Biol, 7 Suppl 1, S4.1-9. doi:10.1186/gb-2006-7-s1-s4

Harrow, J., Frankish, A., Gonzalez, J. M., Tapanari, E., Diekhans, M., Kokocinski, F., Hubbard, T. J. (2012). GENCODE: the reference human genome annotation for The ENCODE Project. Genome Res, 22(9), 1760-1774. doi:10.1101/gr.135350.111

Hartigan, J. A., & Wong, M. A. (1979). Algorithm AS 136: A k-means clustering algorithm. Journal of the Royal Statistical Society. Series C (Applied Statistics), 28(1), 100-108.

Hashimoto, K., Kyoda, Y., Tanaka, T., Maeda, T., Kobayashi, K., Uchida, K., Masumori, N. (2015). The potential of neurotensin secreted from neuroendocrine tumor cells to promote gelsolin-mediated invasiveness of prostate adenocarcinoma cells. Lab Invest, 95(3), 283-295. doi:10.1038/labinvest.2014.165

Hastings, M. L., Ingle, H. A., Lazar, M. A., & Munroe, S. H. (2000). Post-transcriptional regulation of expression by cis-acting sequences and a naturally occurring antisense RNA. J Biol Chem, 275(15), 11507-11513.

Hauptman, N., & Glavac, D. (2013). MicroRNAs and long non-coding RNAs: prospects in diagnostics and therapy of cancer. Radiol Oncol, 47(4), 311-318. doi:10.2478/raon-2013-0062

He, X., Bao, W., Li, X., Chen, Z., Che, Q., Wang, H., & Wan, X. P. (2014). The long non-coding RNA HOTAIR is upregulated in endometrial carcinoma and correlates with poor prognosis. Int J Mol Med, 33(2), 325-332. doi:10.3892/ijmm.2013.1570

Helbig, R., & Fackelmayer, F. O. (2003). Scaffold attachment factor A (SAF-A) is concentrated in inactive X chromosome territories through its RGG domain. Chromosoma, 112(4), 173-182. doi:10.1007/s00412-003-0258-0

Herriges, M. J., Swarr, D. T., Morley, M. P., Rathi, K. S., Peng, T., Stewart, K. M., & Morrisey, E. E. (2014). Long noncoding RNAs are spatially correlated with transcription factors and regulate lung development. Genes Dev, 28(12), 1363-1379. doi:10.1101/gad.238782.114

Hirose, T., & Nakagawa, S. (2012). Paraspeckles: possible nuclear hubs by the RNA for the RNA. Biomol Concepts, 3(5), 415-428. doi:10.1515/bmc-2012-0017

Ho, T. T., Huang, J., Zhou, N., Zhang, Z., Koirala, P., Zhou, X., Mo, Y. Y. (2016). Regulation of PCGEM1 by p54/nrb in prostate cancer. Sci Rep, 6, 34529. doi:10.1038/srep34529

Hobson, D. J., Wei, W., Steinmetz, L. M., & Svejstrup, J. Q. (2012). RNA polymerase II collision interrupts convergent transcription. Mol Cell, 48(3), 365-374. doi:10.1016/j.molcel.2012.08.027

Hormaechea-Agulla, D., Gahete, M. D., Jimenez-Vacas, J. M., Gomez-Gomez, E., Ibanez-Costa, A., F, L. L., Luque, R. M. (2017). The oncogenic role of the In1-ghrelin splicing variant in prostate cancer aggressiveness. Mol Cancer, 16(1), 146. doi:10.1186/s12943-017-0713-9

Hormaechea-Agulla, D., Gomez-Gomez, E., Ibanez-Costa, A., Carrasco-Valiente, J., Rivero-Cortes, E., F, L. L., Luque, R. M. (2016). Ghrelin O-acyltransferase (GOAT) enzyme is overexpressed in prostate cancer, and its levels are associated with patient's metabolic status: Potential value as a non-invasive biomarker. Cancer Lett, 383(1), 125-134. doi:10.1016/j.canlet.2016.09.022

Hou, J., Long, H., Zhou, C., Zheng, S., Wu, H., Guo, T., Wang, T. (2017). Long noncoding RNA Braveheart promotes cardiogenic differentiation of mesenchymal stem cells in vitro. Stem Cell Res Ther, 8(1), 4. doi:10.1186/s13287-016-0454-5

Hou, M., Tang, X., Tian, F., Shi, F., Liu, F., & Gao, G. (2016). AnnoLnc: a web server for systematically annotating novel human lncRNAs. BMC Genomics, 17(1), 931. doi:10.1186/s12864-016-3287-9

Howard, A. D., Feighner, S. D., Cully, D. F., Arena, J. P., Liberator, P. A., Rosenblum, C. I., Van der Ploeg, L. H. (1996). A receptor in pituitary and hypothalamus that functions in growth hormone release. Science, 273(5277), 974-977.

Hu, L., Xu, Z., Hu, B., & Lu, Z. J. (2017). COME: a robust coding potential calculation tool for lncRNA identification and characterization based on multiple features. Nucleic Acids Res, 45(1), e2. doi:10.1093/nar/gkw798

Hu, W., Alvarez-Dominguez, J. R., & Lodish, H. F. (2012). Regulation of mammalian cell differentiation by long non-coding RNAs. EMBO Rep, 13(11), 971-983. doi:10.1038/embor.2012.145

Hu, X., Sood, A. K., Dang, C. V., & Zhang, L. (2017). The role of long noncoding RNAs in cancer: the dark matter matters. Curr Opin Genet Dev, 48, 8-15. doi:10.1016/j.gde.2017.10.004

Hu, Z. Y., Wang, X. Y., Guo, W. B., Xie, L. Y., Huang, Y. Q., Liu, Y. P., Kan, H. (2016). Long non-coding RNA MALAT1 increases AKAP-9 expression by promoting SRPK1-catalyzed SRSF1 phosphorylation in colorectal cancer cells. Oncotarget, 7(10), 11733-11743. doi:10.18632/oncotarget.7367

Hua, L., Wang, C. Y., Yao, K. H., Chen, J. T., Zhang, J. J., & Ma, W. L. (2015). High expression of long non-coding RNA ANRIL is associated with poor prognosis in hepatocellular carcinoma. Int J Clin Exp Pathol, 8(3), 3076-3082.

Huang da, W., Sherman, B. T., & Lempicki, R. A. (2009). Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc, 4(1), 44-57. doi:10.1038/nprot.2008.211

Huang, J. K., Ma, L., Song, W. H., Lu, B. Y., Huang, Y. B., Dong, H. M., Zhou, R. (2016). MALAT1 promotes the proliferation and invasion of thyroid cancer cells via regulating the expression of IQGAP1. Biomed Pharmacother, 83, 1-7. doi:10.1016/j.biopha.2016.05.039

Huang, J. K., Ma, L., Song, W. H., Lu, B. Y., Huang, Y. B., Dong, H. M., Zhou, R. (2017). LncRNA-MALAT1 promotes angiogenesis of thyroid cancer by modulating tumor-associated macrophage FGF2 protein secretion. J Cell Biochem, 118(12), 4821-4830. doi:10.1002/jcb.26153

Huang, J. L., Liu, W., Tian, L. H., Chai, T. T., Liu, Y., Zhang, F., Shen, J. Z. (2017). Upregulation of long non-coding RNA MALAT-1 confers poor prognosis and influences cell proliferation and apoptosis in acute monocytic leukemia. Oncol Rep, 38(3), 1353-1362. doi:10.3892/or.2017.5802

Huarte, M. (2015). The emerging role of lncRNAs in cancer. Nat Med, 21(11), 1253-1261. doi:10.1038/nm.3981

Huarte, M., Guttman, M., Feldser, D., Garber, M., Koziol, M. J., Kenzelmann-Broz, D., Rinn, J. L. (2010). A large intergenic noncoding RNA induced by p53 mediates global gene repression in the p53 response. Cell, 142(3), 409-419. doi:10.1016/j.cell.2010.06.040

Hung, T., Wang, Y., Lin, M. F., Koegel, A. K., Kotake, Y., Grant, G. D., Chang, H. Y. (2011). Extensive and coordinated transcription of noncoding RNAs within cell-cycle promoters. Nat Genet, 43(7), 621-629. doi:10.1038/ng.848

Hwang, C. (2012). Overcoming docetaxel resistance in prostate cancer: a perspective review. Ther Adv Med Oncol, 4(6), 329-340. doi:10.1177/1758834012449685

Ibanez-Costa, A., Gahete, M. D., Rivero-Cortes, E., Rincon-Fernandez, D., Nelson, R., Beltran, M., Luque, R. M. (2015). In1-ghrelin splicing variant is overexpressed in pituitary adenomas and increases their aggressive features. Sci Rep, 5, 8714. doi:10.1038/srep08714

Inamura, K. (2018). Prostatic cancers: understanding their molecular pathology and the 2016 WHO classification. Oncotarget, 9(18), 14723-14737. doi:10.18632/oncotarget.24515

Ishibashi, M., Kogo, R., Shibata, K., Sawada, G., Takahashi, Y., Kurashige, J., Mori, M. (2013). Clinical significance of the expression of long non-coding RNA HOTAIR in primary hepatocellular carcinoma. Oncol Rep, 29(3), 946-950.

Iyer, S., Modali, S. D., & Agarwal, S. K. (2017). Long noncoding RNA MEG3 is an epigenetic determinant of oncogenic signaling in functional pancreatic neuroendocrine tumor cells. Mol Cell Biol, 37(22). doi:10.1128/mcb.00278-17

Jeffery, P., Murray, R., Yeh, A., McNamara, J., Duncan, R. P., Francis, G., Chopin, L. (2005). Expression and function of the ghrelin axis, including a novel preproghrelin isoform, in human breast cancer tissues and cell lines. Endocrine Related Cancer, 12(4), 839-850.

Jeffery, P. L., Herington, A. C., & Chopin, L. K. (2002). Expression and action of the growth hormone releasing peptide ghrelin and its receptor in prostate cancer cell lines. J Endocrinol, 172(3), R7-11.

Jeggari, A., Marks, D. S., & Larsson, E. (2012). miRcode: a map of putative microRNA target sites in the long non-coding transcriptome. Bioinformatics, 28(15), 2062-2063. doi:10.1093/bioinformatics/bts344

Jemal, A., Bray, F., Center, M. M., Ferlay, J., Ward, E., & Forman, D. (2011). Global cancer statistics. CA Cancer J Clin, 61(2), 69-90. doi:10.3322/caac.20107

Jhun, M. A., Geybels, M. S., Wright, J. L., Kolb, S., April, C., Bibikova, M., Stanford, J. L. (2017). Gene expression signature of Gleason score is associated with prostate cancer outcomes in a radical prostatectomy cohort. Oncotarget, 8(26), 43035-43047. doi:10.18632/oncotarget.17428

Jiang, M., Huang, O., Xie, Z., Wu, S., Zhang, X., Shen, A., Shen, K. (2014). A novel long non-coding RNA-ARA: adriamycin resistance-associated. Biochem Pharmacol, 87(2), 254-283. doi:10.1016/j.bcp.2013.10.020

Jiang, Y., Li, Z., Zheng, S., Chen, H., Zhao, X., Gao, W., Chen, R. (2015). The long non-coding RNA HOTAIR affects the radiosensitivity of pancreatic ductal adenocarcinoma by regulating the expression of Wnt inhibitory factor 1. Tumour Biol. doi:10.1007/s13277-015-4234-0

Kaighn, M. E., Narayan, K. S., Ohnuki, Y., Lechner, J. F., & Jones, L. W. (1979). Establishment and characterization of a human prostatic carcinoma cell line (PC-3). Invest Urol, 17(1), 16-23.

Kang, Y. J., Yang, D. C., Kong, L., Hou, M., Meng, Y. Q., Wei, L., & Gao, G. (2017). CPC2: a fast and accurate coding potential calculator based on sequence intrinsic features. Nucleic Acids Res, 45(W1), W12-W16. doi:10.1093/nar/gkx428

Kannan, S., Chernikova, D., Rogozin, I. B., Poliakov, E., Managadze, D., Koonin, E. V., & Milanesi, L. (2015). Transposable element insertions in long intergenic non-coding RNA genes. Front Bioeng Biotechnol, 3, 71. doi:10.3389/fbioe.2015.00071

Karimian, A., Ahmadi, Y., & Yousefi, B. (2016). Multiple functions of p21 in cell cycle, apoptosis and transcriptional regulation after DNA damage. DNA Repair (Amst), 42, 63-71. doi:10.1016/j.dnarep.2016.04.008

Karolchik, D., Hinrichs, A. S., Furey, T. S., Roskin, K. M., Sugnet, C. W., Haussler, D., & Kent, W. J. (2004). The UCSC Table Browser data retrieval tool. Nucleic Acids Res, 32(Database issue), D493-496. doi:10.1093/nar/gkh103

Katayama, S., Tomaru, Y., Kasukawa, T., Waki, K., Nakanishi, M., Nakamura, M., Consortium, F. (2005). Antisense transcription in the mammalian transcriptome. Science, 309(5740), 1564-1566. doi:10.1126/science.1112009

Kawaguchi, T., & Hirose, T. (2015). Chromatin remodeling complexes in the assembly of long noncoding RNA-dependent nuclear bodies. Nucleus, 6(6), 462-467. doi:10.1080/19491034.2015.1119353

Kent, W. J., Sugnet, C. W., Furey, T. S., Roskin, K. M., Pringle, T. H., Zahler, A. M., & Haussler, D. (2002). The human genome browser at UCSC. Genome Res, 12(6), 996-1006. doi:10.1101/gr.229102

Khorkova, O., Hsiao, J., & Wahlestedt, C. (2015). Basic biology and therapeutic implications of lncRNA. Adv Drug Deliv Rev, 87, 15-24. doi:10.1016/j.addr.2015.05.012

Khorshidi, A., Dhaliwal, P., & Yang, B. B. (2016). Noncoding RNAs in tumor angiogenesis. Advances in experimental medicine and biology, 927, 217-241. doi:10.1007/978-981-10-1498-7_8

Kim, D., Pertea, G., Trapnell, C., Pimentel, H., Kelley, R., & Salzberg, S. L. (2013). TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol, 14(4), R36. doi:10.1186/gb-2013-14-4-r36

Klattenhoff, C. A., Scheuermann, J. C., Surface, L. E., Bradley, R. K., Fields, P. A., Steinhauser, M. L., Boyer, L. A. (2013). Braveheart, a long noncoding RNA required for cardiovascular lineage commitment. Cell, 152(3), 570-583. doi:10.1016/j.cell.2013.01.003

Knoll, M., Lodish, H. F., & Sun, L. (2015). Long non-coding RNAs as regulators of the endocrine system. Nat Rev Endocrinol, 11(3), 151-160. doi:10.1038/nrendo.2014.229

Kocaturk, B., & Versteeg, H. H. (2015). Orthotopic injection of breast cancer cells into the mammary fat pad of mice to study tumor growth. J Vis Exp(96). doi:10.3791/51967

Kogo, R., Shimamura, T., Mimori, K., Kawahara, K., Imoto, S., Sudo, T., Mori, M. (2011). Long noncoding RNA HOTAIR regulates polycomb-dependent chromatin modification and is associated with poor prognosis in colorectal cancers. Cancer Res, 71(20), 6320-6326. doi:10.1158/0008-5472.can-11-1021

Komura, K., Jeong, S. H., Hinohara, K., Qu, F., Wang, X., Hiraki, M., Sweeney, C. J. (2016). Resistance to docetaxel in prostate cancer is associated with androgen receptor activation and loss of KDM5D expression. Proc Natl Acad Sci U S A, 113(22), 6259-6264. doi:10.1073/pnas.1600420113

Kong, L., Zhang, Y., Ye, Z. Q., Liu, X. Q., Zhao, S. Q., Wei, L., & Gao, G. (2007). CPC: assess the protein-coding potential of transcripts using sequence features and support vector machine. Nucleic Acids Res, 35(Web Server issue), W345-349. doi:10.1093/nar/gkm391

Kopparapu, P. K., Tinzl, M., Anagnostaki, L., Persson, J. L., & Dizeyi, N. (2013). Expression and localization of serotonin receptors in human breast cancer. Anticancer Res, 33(2), 363-370.

Korneev, S. A., Maconochie, M., Naskar, S., Korneeva, E. I., Richardson, G. P., & O'Shea, M. (2015). A novel long non-coding natural antisense RNA is a negative regulator of Nos1 gene expression. Sci Rep, 5, 11815. doi:10.1038/srep11815

Kornienko, A. E., Guenzl, P. M., Barlow, D. P., & Pauler, F. M. (2013). Gene regulation by the act of long non-coding RNA transcription. BMC Biol, 11, 59. doi:10.1186/1741-7007-11-59

Kretz, M., Siprashvili, Z., Chu, C., Webster, D. E., Zehnder, A., Qu, K., Khavari, P. A. (2013). Control of somatic tissue differentiation by the long non-coding RNA TINCR. Nature, 493(7431), 231-235. doi:10.1038/nature11661

Kumar, L., Shamsuzzama, Haque, R., Baghel, T., & Nazir, A. (2016). Circular RNAs: the emerging class of non-coding RNAs and their potential role in human neurodegenerative diseases. Mol Neurobiol. doi:10.1007/s12035-016-0213-8

Kumar, M., DeVaux, R. S., & Herschkowitz, J. I. (2016). Molecular and cellular changes in breast cancer and new roles of lncRNAs in breast cancer initiation and progression. Progress in molecular biology and translational science, 144, 563-586. doi:10.1016/bs.pmbts.2016.09.011

Kutter, C., Watt, S., Stefflova, K., Wilson, M. D., Goncalves, A., Ponting, C. P., Marques, A. C. (2012). Rapid turnover of long noncoding RNAs and the evolution of gene expression. PLoS Genet, 8(7), e1002841. doi:10.1371/journal.pgen.1002841

Lai, F., Orom, U. A., Cesaroni, M., Beringer, M., Taatjes, D. J., Blobel, G. A., & Shiekhattar, R. (2013). Activating RNAs associate with Mediator to enhance chromatin architecture and transcription. Nature, 494(7438), 497-501. doi:10.1038/nature11884

Lai, F., & Shiekhattar, R. (2014). Where long noncoding RNAs meet DNA methylation. Cell Res, 24(3), 263-264. doi:10.1038/cr.2014.13

Lai, J., Moya, L., An, J., Hoffman, A., Srinivasan, S., Panchadsaram, J., Batra, J. (2017). A microsatellite repeat in PCA3 long non-coding RNA is associated with prostate cancer risk and aggressiveness. Sci Rep, 7(1), 16862. doi:10.1038/s41598-017-16700-y

Lander, E. S., Linton, L. M., Birren, B., Nusbaum, C., Zody, M. C., Baldwin, J., Szustakowki, J. (2001). Initial sequencing and analysis of the human genome. Nature, 409(6822), 860-921. doi:10.1038/35057062

Lanzos, A., Carlevaro-Fita, J., Mularoni, L., Reverter, F., Palumbo, E., Guigo, R., & Johnson, R. (2017). Discovery of Cancer Driver Long Noncoding RNAs across 1112 Tumour Genomes: New Candidates and Distinguishing Features. Sci Rep, 7, 41544. doi:10.1038/srep41544

Lapointe, J., Li, C., Higgins, J. P., van de Rijn, M., Bair, E., Montgomery, K., Pollack, J. R. (2004). Gene expression profiling identifies clinically relevant subtypes of prostate cancer. Proc Natl Acad Sci U S A, 101(3), 811-816. doi:10.1073/pnas.0304146101

Latge, G., Poulet, C., Bours, V., Josse, C., & Jerusalem, G. (2018). Natural antisense transcripts: molecular mechanisms and implications in breast cancers. Int J Mol Sci, 19(1). doi:10.3390/ijms19010123

Latos, P. A., Pauler, F. M., Koerner, M. V., Senergin, H. B., Hudson, Q. J., Stocsits, R. R., Barlow, D. P. (2012). Airn transcriptional overlap, but not its lncRNA products, induces imprinted Igf2r silencing. Science, 338(6113), 1469-1472. doi:10.1126/science.1228110

Law, C. W., Chen, Y., Shi, W., & Smyth, G. K. (2014). voom: Precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biol, 15(2), R29. doi:10.1186/gb-2014-15-2-r29

Lee, H. Y., Wu, W. J., Huang, C. H., Chou, Y. H., Huang, C. N., Lee, Y. C., Huang, S. P. (2014). Clinical predictor of survival following docetaxel-based chemotherapy. Oncol Lett, 8(4), 1788-1792. doi:10.3892/ol.2014.2349

Lee, S., Kopp, F., Chang, T. C., Sataluri, A., Chen, B., Sivakumar, S., Mendell, J. T. (2016). Noncoding RNA NORAD regulates genomic stability by sequestering pumilio proteins. Cell, 164(1-2), 69-80. doi:10.1016/j.cell.2015.12.017

Lee, Y. G., Korenchuk, S., Lehr, J., Whitney, S., Vessela, R., & Pienta, K. J. (2001). Establishment and characterization of a new human prostatic cancer cell line: DuCaP. In Vivo, 15(2), 157-162.

Legrier, M. E., de Pinieux, G., Boye, K., Arvelo, F., Judde, J. G., Fontaine, J. J., Poupon, M. F. (2004). Mucinous differentiation features associated with hormonal escape in a human prostate cancer xenograft. Br J Cancer, 90(3), 720-727. doi:10.1038/sj.bjc.6601570

Lemos, A. E., Ferreira, L. B., Batoreu, N. M., de Freitas, P. P., Bonamino, M. H., & Gimba, E. R. (2016). PCA3 long noncoding RNA modulates the expression of key cancer-related genes in LNCaP prostate cancer cells. Tumour Biol, 37(8), 11339-11348. doi:10.1007/s13277-016-5012-3

Leung, P. K., Chow, K. B., Lau, P. N., Chu, K. M., Chan, C. B., Cheng, C. H., & Wise, H. (2007). The truncated ghrelin receptor polypeptide (GHS-R1b) acts as a dominant-negative mutant of the ghrelin receptor. Cell Signal, 19(5), 1011-1022. doi:10.1016/j.cellsig.2006.11.011

Leyten, G. H., Hessels, D., Jannink, S. A., Smit, F. P., de Jong, H., Cornel, E. B., Schalken, J. A. (2014). Prospective multicentre evaluation of PCA3 and TMPRSS2-ERG gene fusions as diagnostic and prognostic urinary biomarkers for prostate cancer. Eur Urol, 65(3), 534-542. doi:10.1016/j.eururo.2012.11.014

Leyten, G. H., Hessels, D., Smit, F. P., Jannink, S. A., de Jong, H., Melchers, W. J., Schalken, J. A. (2015). Identification of a candidate gene panel for the early diagnosis of prostate cancer. Clin Cancer Res, 21(13), 3061-3070. doi:10.1158/1078-0432.ccr-14-3334

Li, C. H., & Chen, Y. (2013). Targeting long non-coding RNAs in cancers: progress and prospects. Int J Biochem Cell Biol, 45(8), 1895-1910. doi:10.1016/j.biocel.2013.05.030

Li, C. R., Su, J. J., Wang, W. Y., Lee, M. T., Wang, T. Y., Jiang, K. Y., Tsai, K. K. (2013). Molecular profiling of prostatic acinar morphogenesis identifies PDCD4 and KLF6 as tissue architecture-specific prognostic markers in prostate cancer. Am J Pathol, 182(2), 363-374. doi:10.1016/j.ajpath.2012.10.024

Li, D., Feng, J., Wu, T., Wang, Y., Sun, Y., Ren, J., & Liu, M. (2013). Long intergenic noncoding RNA HOTAIR is overexpressed and regulates PTEN methylation in laryngeal squamous cell carcinoma. Am J Pathol, 182(1), 64-70. doi:10.1016/j.ajpath.2012.08.042

Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Durbin, R. (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics, 25(16), 2078-2079. doi:10.1093/bioinformatics/btp352

Li, H., Wang, Z., Tang, K., Zhou, H., Liu, H., Yan, L., Ye, Z. (2017). Prognostic value of androgen receptor splice variant 7 in the treatment of castration-resistant prostate cancer with next generation androgen receptor signal inhibition: a systematic review and meta-analysis. Eur Urol Focus. doi:10.1016/j.euf.2017.01.004

Li, J., Bian, E. B., He, X. J., Ma, C. C., Zong, G., Wang, H. L., & Zhao, B. (2016). Epigenetic repression of long non-coding RNA MEG3 mediated by DNMT1 represses the p53 pathway in gliomas. Int J Oncol, 48(2), 723-733. doi:10.3892/ijo.2015.3285

Li, J., Weinberg, M. S., Zerbini, L., & Prince, S. (2013). The oncogenic TBX3 is a downstream target and mediator of the TGF-beta1 signaling pathway. Mol Biol Cell, 24(22), 3569-3576. doi:10.1091/mbc.E13-05-0273

Li, J., Zhang, Z., Xiong, L., Guo, C., Jiang, T., Zeng, L., Wang, J. (2017). SNHG1 lncRNA negatively regulates miR-199a-3p to enhance CDK7 expression and promote cell proliferation in prostate cancer. Biochem Biophys Res Commun, 487(1), 146-152. doi:10.1016/j.bbrc.2017.03.169

Li, R. K., Gao, J., Guo, L. H., Huang, G. Q., & Luo, W. H. (2017). PTENP1 acts as a ceRNA to regulate PTEN by sponging miR-19b and explores the biological role of PTENP1 in breast cancer. Cancer Gene Ther, 24(7), 309-315. doi:10.1038/cgt.2017.29

Li, W., Lam, M. T., & Notani, D. (2014). Enhancer RNAs. Cell Cycle, 13(20), 3151-3152. doi:10.4161/15384101.2014.962860

Li, W., Notani, D., & Rosenfeld, M. G. (2016). Enhancers as non-coding RNA transcription units: recent insights and future perspectives. Nat Rev Genet, 17(4), 207-223. doi:10.1038/nrg.2016.4

Liao, Q., Liu, C., Yuan, X., Kang, S., Miao, R., Xiao, H., Zhao, Y. (2011). Large-scale prediction of long non-coding RNA functions in a coding-non-coding gene co-expression network. Nucleic Acids Res, 39(9), 3864-3878. doi:10.1093/nar/gkq1348

Liao, Y., Smyth, G. K., & Shi, W. (2014). featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics, 30(7), 923-930. doi:10.1093/bioinformatics/btt656

Lim, E., Modi, K. D., & Kim, J. (2009). In vivo bioluminescent imaging of mammary tumors using IVIS spectrum. J Vis Exp(26). doi:10.3791/1210

Lin, C., & Yang, L. (2017). Long noncoding RNA in cancer: wiring signaling circuitry. Trends Cell Biol. doi:10.1016/j.tcb.2017.11.008

Lin, L., Gu, Z. T., Chen, W. H., & Cao, K. J. (2015). Increased expression of the long non-coding RNA ANRIL promotes lung cancer cell metastasis and correlates with poor prognosis. Diagn Pathol, 10, 14. doi:10.1186/s13000-015-0247-7

Lin, M. F., Jungreis, I., & Kellis, M. (2011). PhyloCSF: a comparative genomics method to distinguish protein coding and non-coding regions. Bioinformatics, 27(13), i275-282. doi:10.1093/bioinformatics/btr209

Ling, H. (2016). Non-coding RNAs: therapeutic strategies and delivery systems. Advances in experimental medicine and biology, 937, 229-237. doi:10.1007/978-3-319-42059-2_12

Litwin, M. S., & Tan, H. J. (2017). The Diagnosis and Treatment of Prostate Cancer: A Review. JAMA, 317(24), 2532-2542. doi:10.1001/jama.2017.7248

Liu, D., Yu, X., Wang, S., Dai, E., Jiang, L., Wang, J., Jiang, W. (2016). The gain and loss of long noncoding RNA associated-competing endogenous RNAs in prostate cancer. Oncotarget, 7(35), 57228-57238. doi:10.18632/oncotarget.11128

Liu, J., Xing, Y., Xu, L., Chen, W., Cao, W., & Zhang, C. (2017). Decreased expression of pseudogene PTENP1 promotes malignant behaviours and is associated with the poor survival of patients with HNSCC. Sci Rep, 7, 41179. doi:10.1038/srep41179

Liu, Q., Sun, S., Yu, W., Jiang, J., Zhuo, F., Qiu, G., Jiang, X. (2015). Altered expression of long non-coding RNAs during genotoxic stress-induced cell death in human glioma cells. J Neurooncol, 122(2), 283-292. doi:10.1007/s11060-015-1718-0

Liu, R., Wettersten, H. I., Park, S. H., & Weiss, R. H. (2013). Small-molecule inhibitors of p21 as novel therapeutics for chemotherapy-resistant kidney cancer. Future medicinal chemistry, 5(9), 991-994. doi:10.4155/fmc.13.56

Liu, S. J., Horlbeck, M. A., Cho, S. W., Birk, H. S., Malatesta, M., He, D., Lim, D. A. (2017). CRISPRi-based genome-scale identification of functional long noncoding RNA loci in human cells. Science, 355(6320). doi:10.1126/science.aah7111

Liu, Y., Pan, S., Liu, L., Zhai, X., Liu, J., Wen, J., Hu, Z. (2012). A genetic variant in long non-coding RNA HULC contributes to risk of HBV-related hepatocellular carcinoma in a Chinese population. PLoS One, 7(4), e35145. doi:10.1371/journal.pone.0035145

Liu, Y., Wang, B., Liu, X., Lu, L., Luo, F., Lu, X., Liu, Q. (2016). Epigenetic silencing of p21 by long non-coding RNA HOTAIR is involved in the cell cycle disorder induced by cigarette smoke extract. Toxicol Lett, 240(1), 60-67. doi:10.1016/j.toxlet.2015.10.016

Livak, K. J., & Schmittgen, T. D. (2001). Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) Method. Methods, 25(4), 402-408. doi:10.1006/meth.2001.1262

Louro, R., El-Jundi, T., Nakaya, H. I., Reis, E. M., & Verjovski-Almeida, S. (2008). Conserved tissue expression signatures of intronic noncoding RNAs transcribed from human and mouse loci. Genomics, 92(1), 18-25. doi:10.1016/j.ygeno.2008.03.013

Louro, R., Smirnova, A. S., & Verjovski-Almeida, S. (2009). Long intronic noncoding RNA transcription: expression noise or expression choice? Genomics, 93(4), 291-298. doi:10.1016/j.ygeno.2008.11.009

Lu, H., Li, G., Zhou, C., Jin, W., Qian, X., Wang, Z., Wang, X. (2016). Regulation and role of post-translational modifications of enhancer of zeste homologue 2 in cancer development. Am J Cancer Res, 6(12), 2737-2754.

Lu, Y., Hu, Z., Mangala, L. S., Stine, Z. E., Hu, X., Jiang, D., Dang, C. V. (2018). MYC targeted long noncoding RNA DANCR promotes cancer in part by reducing p21 levels. Cancer Res, 78(1), 64-74. doi:10.1158/0008-5472.can-17-0815

Lu, Y., Liu, X., Xie, M., Liu, M., Ye, M., Li, M., Zhou, R. (2017). The NF-kappaB-responsive long noncoding RNA FIRRE regulates posttranscriptional regulation of inflammatory gene expression through interacting with hnRNPU. J Immunol, 199(10), 3571-3582. doi:10.4049/jimmunol.1700091

Lu, Z., Xiao, Z., Liu, F., Cui, M., Li, W., Yang, Z., Zhang, X. (2015). Long non-coding RNA HULC promotes tumor angiogenesis in liver cancer by up-regulating sphingosine kinase 1 (SPHK1). Oncotarget. doi:10.18632/oncotarget.6280

Luo, G., Wang, M., Wu, X., Tao, D., Xiao, X., Wang, L., Jiang, G. (2015). Long Non-Coding RNA MEG3 Inhibits Cell Proliferation and Induces Apoptosis in Prostate Cancer. Cell Physiol Biochem, 37(6), 2209-2220. doi:10.1159/000438577

Luo, S., Lu, J. Y., Liu, L., Yin, Y., Chen, C., Han, X., Shen, X. (2016). Divergent lncRNAs regulate gene expression and lineage differentiation in pluripotent cells. Cell Stem Cell, 18(5), 637-652. doi:10.1016/j.stem.2016.01.024

Lv, D., Sun, R., Yu, Q., & Zhang, X. (2016). The long non-coding RNA maternally expressed gene 3 activates p53 and is downregulated in esophageal squamous cell cancer. Tumour Biol. doi:10.1007/s13277-016-5426-y

Lv, J., Cui, W., Liu, H., He, H., Xiu, Y., Guo, J., Wu, Q. (2013). Identification and characterization of long non-coding RNAs related to mouse embryonic brain development from available transcriptomic data. PLoS One, 8(8), e71152. doi:10.1371/journal.pone.0071152

Lv, J., Liu, H., Huang, Z., Su, J., He, H., Xiu, Y., Wu, Q. (2013). Long non-coding RNA identification over mouse brain development by integrative modeling of chromatin and genomic features. Nucleic Acids Res, 41(22), 10044-10061. doi:10.1093/nar/gkt818

Ma, J., Li, T., Han, X., & Yuan, H. (2017). Knockdown of LncRNA ANRIL suppresses cell proliferation, metastasis, and invasion via regulating miR-122-5p expression in hepatocellular carcinoma. J Cancer Res Clin Oncol. doi:10.1007/s00432-017-2543-y

Ma, L., Bajic, V. B., & Zhang, Z. (2013). On the classification of long non-coding RNAs. RNA Biol, 10(6), 925-933. doi:10.4161/rna.24604

Ma, W., Chen, X., Ding, L., Ma, J., Jing, W., Lan, T., Yuan, Y. (2017). The prognostic value of long noncoding RNAs in prostate cancer: a systematic review and meta-analysis. Oncotarget, 8(34), 57755-57765. doi:10.18632/oncotarget.17645

Ma, X., Li, Z., Li, T., Zhu, L., Li, Z., & Tian, N. (2017). Long non-coding RNA HOTAIR enhances angiogenesis by induction of VEGFA expression in glioma cells and transmission to endothelial cells via glioma cell derived-extracellular vesicles. Am J Transl Res, 9(11), 5012-5021.

Malakar, P., Shilo, A., Mogilevsky, A., Stein, I., Pikarsky, E., Nevo, Y., Karni, R. (2017). Long noncoding RNA MALAT1 promotes hepatocellular carcinoma development by SRSF1 upregulation and mTOR activation. Cancer Res, 77(5), 1155-1167. doi:10.1158/0008-5472.can-16-1508

Marchese, F. P., & Huarte, M. (2014). Long non-coding RNAs and chromatin modifiers: their place in the epigenetic code. Epigenetics, 9(1), 21-26. doi:10.4161/epi.27472

Martignano, F., Rossi, L., Maugeri, A., Galla, V., Conteduca, V., De Giorgi, U., Schepisi, G. (2017). Urinary RNA-based biomarkers for prostate cancer detection. Clin Chim Acta, 473, 96-105. doi:10.1016/j.cca.2017.08.009

Mary, S., Fehrentz, J. A., Damian, M., Gaibelet, G., Orcel, H., Verdie, P., Baneres, J. L. (2013). Heterodimerization with its splice variant blocks the ghrelin receptor 1a in a non-signaling conformation: a study with a purified heterodimer assembled into lipid discs. J Biol Chem, 288(34), 24656-24665. doi:10.1074/jbc.M113.453423

Mattick, J. S. (1994). Introns: evolution and function. Curr Opin Genet Dev, 4(6), 823-831.

Mattick, J. S. (2005). The functional genomics of noncoding RNA. Science, 309(5740), 1527-1528. doi:10.1126/science.1117806

Mattick, J. S. (2012). Rocking the foundations of molecular genetics. Proc Natl Acad Sci U S A, 109(41), 16400-16401.

Mattick, J. S., & Rinn, J. L. (2015). Discovery and annotation of long noncoding RNAs. Nat Struct Mol Biol, 22(1), 5-7. doi:10.1038/nsmb.2942

McCulloch, D. R., Opeskin, K., Thompson, E. W., & Williams, E. D. (2005). BM18: A novel androgen-dependent human prostate cancer xenograft model derived from a bone metastasis. Prostate, 65(1), 35-43. doi:10.1002/pros.20255

McKeehan, W. L., Adams, P. S., & Fast, D. (1987). Different hormonal requirements for androgen-independent growth of normal and tumour epithelial cells from rat prostate. In Vitro Cell Development and Biology, 23, 147-152.

Mehra, R., Shi, Y., Udager, A. M., Prensner, J. R., Sahu, A., Iyer, M. K., Chinnaiyan, A. M. (2014). A Novel RNA In Situ Hybridization Assay for the Long Noncoding RNA SChLAP1 Predicts Poor Clinical Outcome After Radical Prostatectomy in Clinically Localized Prostate Cancer. Neoplasia, 16(12), 1121-1127. doi:10.1016/j.neo.2014.11.006

Mehra, R., Udager, A. M., Ahearn, T. U., Cao, X., Feng, F. Y., Loda, M., Chinnaiyan, A. M. (2015). Overexpression of the long non-coding RNA SChLAP1 independently predicts lethal prostate cancer. Eur Urol. doi:10.1016/j.eururo.2015.12.003

Mele, M., & Rinn, J. L. (2016). "Cat's Cradling" the 3D Genome by the Act of LncRNA Transcription. Mol Cell, 62(5), 657-664. doi:10.1016/j.molcel.2016.05.011

Mercer, T. R., Dinger, M. E., & Mattick, J. S. (2009). Long non-coding RNAs: insights into functions. Nat Rev Genet, 10(3), 155-159. doi:10.1038/nrg2521

Meredith, E. K., Balas, M. M., Sindy, K., Haislop, K., & Johnson, A. M. (2016). An RNA matchmaker protein regulates the activity of the long noncoding RNA HOTAIR. RNA, 22(7), 995-1010. doi:10.1261/rna.055830.115

Miao, Y., Fan, R., Chen, L., & Qian, H. (2016). Clinical significance of long non-coding RNA MALAT1 expression in tissue and serum of breast cancer. Ann Clin Lab Sci, 46(4), 418-424.

Miao, Z., Ding, J., Chen, B., Yang, Y., & Chen, Y. (2016). HOTAIR overexpression correlated with worse survival in patients with solid tumors. Minerva Med, 107(6), 392-400.

Min, K. W., Davila, S., Zealy, R. W., Lloyd, L. T., Lee, I. Y., Lee, R., Yoon, J. H. (2017). eIF4E phosphorylation by MST1 reduces translation of a subset of mRNAs, but increases lncRNA translation. Biochim Biophys Acta, 1860(7), 761-772. doi:10.1016/j.bbagrm.2017.05.002

Misawa, A., Takayama, K., Urano, T., & Inoue, S. (2016). Androgen-induced long noncoding RNA (lncRNA) SOCS2-AS1 promotes cell growth and inhibits apoptosis in prostate cancer cells. J Biol Chem, 291(34), 17861-17880. doi:10.1074/jbc.M116.718536

Misawa, A., Takayama, K. I., Fujimura, T., Homma, Y., Suzuki, Y., & Inoue, S. (2017). Androgen-induced lncRNA POTEF-AS1 regulates apoptosis-related pathway to facilitate cell survival in prostate cancer cells. Cancer Sci, 108(3), 373-379. doi:10.1111/cas.13151

Misawa, A., Takayama, K. I., & Inoue, S. (2017). Long non-coding RNAs and prostate cancer. Cancer Sci, 108(11), 2107-2114. doi:10.1111/cas.13352

Mitobe, Y., Takayama, K. I., Horie-Inoue, K., & Inoue, S. (2018). Prostate cancer-associated lncRNAs. Cancer Lett. doi:10.1016/j.canlet.2018.01.012

Morris, K. V., & Mattick, J. S. (2014). The rise of regulatory RNA. Nat Rev Genet, 15(6), 423-437. doi:10.1038/nrg3722

Morris, K. V., & Vogt, P. K. (2010). Long antisense non-coding RNAs and their role in transcription and oncogenesis. Cell Cycle, 9(13), 2544-2547. doi:10.4161/cc.9.13.12145

Moskalev, E. A., Jandaghi, P., Fallah, M., Manoochehri, M., Botla, S. K., Kolychev, O. V., Riazalhosseini, Y. (2015). GHSR DNA hypermethylation is a common epigenetic alteration of high diagnostic value in a broad spectrum of cancers. Oncotarget, 6(6), 4418-4427. doi:10.18632/oncotarget.2759

Mouraviev, V., Lee, B., Patel, V., Albala, D., Johansen, T. E., Partin, A., Perera, R. J. (2015). Clinical prospects of long noncoding RNAs as novel biomarkers and therapeutic targets in prostate cancer. Prostate cancer and prostatic diseases. doi:10.1038/pcan.2015.48

Munroe, S. H., & Lazar, M. A. (1991). Inhibition of c-erbA mRNA splicing by a naturally occurring antisense RNA. J Biol Chem, 266(33), 22083-22086.

Nadler, R. B., Humphrey, P. A., Smith, D. S., Catalona, W. J., & Ratliff, T. L. (1995). Effect of inflammation and benign prostatic hyperplasia on elevated serum prostate specific antigen levels. J Urol, 154(2 Pt 1), 407-413.

Nagini, S. (2017). Breast cancer: current molecular therapeutic targets and new players. Anticancer Agents Med Chem, 17(2), 152-163.

Nakagawa, T., Kollmeyer, T. M., Morlan, B. W., Anderson, S. K., Bergstralh, E. J., Davis, B. J., Jenkins, R. B. (2008). A tissue biomarker panel predicting systemic progression after PSA recurrence post-definitive prostate cancer therapy. PLoS One, 3(5), e2318. doi:10.1371/journal.pone.0002318

Nakaya, H. I., Amaral, P. P., Louro, R., Lopes, A., Fachel, A. A., Moreira, Y. B., Verjovski-Almeida, S. (2007). Genome mapping and expression analyses of human intronic noncoding RNAs reveal tissue-specific patterns and enrichment in genes related to regulation of transcription. Genome Biol, 8(3), R43. doi:10.1186/gb-2007-8-3-r43

Necsulea, A., Soumillon, M., Warnefors, M., Liechti, A., Daish, T., Zeller, U., Kaessmann, H. (2014). The evolution of lncRNA repertoires and expression patterns in tetrapods. Nature, 505(7485), 635-640. doi:10.1038/nature12943

Nguyen, H. M., Vessella, R. L., Morrissey, C., Brown, L. G., Coleman, I. M., Higano, C. S., Corey, E. (2017). LuCaP prostate cancer patient-derived xenografts reflect the molecular heterogeneity of advanced disease and serve as models for evaluating cancer therapeutics. Prostate, 77(6), 654-671. doi:10.1002/pros.23313

Nguyen, Q., & Carninci, P. (2016). Expression Specificity of Disease-Associated lncRNAs: Toward Personalized Medicine. Curr Top Microbiol Immunol, 394, 237-258. doi:10.1007/82_2015_464

Nicholson, A., Mahon, J., Boland, A., Beale, S., Dwan, K., Fleeman, N., Dundar, Y. (2015). The clinical effectiveness and cost-effectiveness of the PROGENSA(R) prostate cancer antigen 3 assay and the Prostate Health Index in the diagnosis of prostate cancer: a systematic review and economic evaluation. Health Technol Assess, 19(87), i-xxxi, 1-191. doi:10.3310/hta19870

Nie, L., Wu, H. J., Hsu, J. M., Chang, S. S., Labaff, A. M., Li, C. W., Hung, M. C. (2012). Long non-coding RNAs: versatile master regulators of gene expression and crucial players in cancer. Am J Transl Res, 4(2), 127-150.

Nie, Y., Liu, X., Qu, S., Song, E., Zou, H., & Gong, C. (2013). Long non-coding RNA HOTAIR is an independent prognostic marker for nasopharyngeal carcinoma progression and survival. Cancer Sci, 104(4), 458-464. doi:10.1111/cas.12092

Nikolopoulos, D., Theocharis, S., & Kouraklis, G. (2010). Ghrelin: a potential therapeutic target for cancer. Regul Pept, 163(1-3), 7-17. doi:10.1016/j.regpep.2010.03.011

Ning, Q., Li, Y., Wang, Z., Zhou, S., Sun, H., & Yu, G. (2017). The evolution and expression pattern of human overlapping lncRNA and protein-coding gene pairs. Sci Rep, 7, 42775. doi:10.1038/srep42775

Nishimoto, Y., Nakagawa, S., Hirose, T., Okano, H. J., Takao, M., Shibata, S., Okano, H. (2013). The long non-coding RNA nuclear-enriched abundant transcript 1_2 induces paraspeckle formation in the motor neuron during the early phase of amyotrophic lateral sclerosis. Mol Brain, 6, 31. doi:10.1186/1756-6606-6-31

Oh, E. J., Kim, S. H., Yang, W. I., Ko, Y. H., & Yoon, S. O. (2016). Long non-coding RNA HOTAIR expression in diffuse large B-cell lymphoma: in relation to polycomb repressive complex pathway proteins and H3K27 trimethylation. J Pathol Transl Med, 50(5), 369-376. doi:10.4132/jptm.2016.06.06

Okada, Y., Tashiro, C., Numata, K., Watanabe, K., Nakaoka, H., Yamamoto, N., Kiyosawa, H. (2008). Comparative expression analysis uncovers novel features of endogenous antisense transcription. Hum Mol Genet, 17(11), 1631-1640.

Orfanelli, U., Jachetti, E., Chiacchiera, F., Grioni, M., Brambilla, P., Briganti, A., Lavorgna, G. (2015). Antisense transcription at the TRPM2 locus as a novel prognostic marker and therapeutic target in prostate cancer. Oncogene, 34, 2094-2102. doi:10.1038/onc.2014.144

Orom, U. A., & Shiekhattar, R. (2011). Long non-coding RNAs and enhancers. Curr Opin Genet Dev, 21(2), 194-198. doi:10.1016/j.gde.2011.01.020

Ozgur, E., Celik, A. I., Darendeliler, E., & Gezer, U. (2017). PCA3 silencing sensitizes prostate cancer cells to enzalutamide-mediated androgen receptor blockade. Anticancer Res, 37(7), 3631-3637. doi:10.21873/anticanres.11733

Ozgur, E., Mert, U., Isin, M., Okutan, M., Dalay, N., & Gezer, U. (2013). Differential expression of long non-coding RNAs during genotoxic stress-induced apoptosis in HeLa and MCF-7 cells. Clin Exp Med, 13(2), 119-126. doi:10.1007/s10238-012-0181-x

Pacak, A., Barciszewska-Pacak, M., Swida-Barteczka, A., Kruszka, K., Sega, P., Milanowska, K., Szweykowska-Kulinska, Z. (2016). Heat stress affects pi-related genes expression and inorganic phosphate deposition/accumulation in barley. Front Plant Sci, 7, 926. doi:10.3389/fpls.2016.00926

Paci, P., Colombo, T., & Farina, L. (2014). Computational analysis identifies a sponge interaction network between long non-coding RNAs and messenger RNAs in human breast cancer. BMC Syst Biol, 8, 83. doi:10.1186/1752-0509-8-83

Pai, V. P., Marshall, A. M., Hernandez, L. L., Buckley, A. R., & Horseman, N. D. (2009). Altered serotonin physiology in human breast cancers favors paradoxical growth and cell survival. Breast Cancer Res, 11(6), R81. doi:10.1186/bcr2448

Pan, X., & Xiong, K. (2015). PredcircRNA: computational classification of circular RNA from other long non-coding RNA using hybrid features. Mol Biosyst, 11(8), 2219-2226. doi:10.1039/c5mb00214a

Pan, Y. F., Feng, L., Zhang, X. Q., Song, L. J., Liang, H. X., Li, Z. Q., & Tao, F. B. (2011). Role of long non-coding RNAs in gene regulation and oncogenesis. Chin Med J (Engl), 124(15), 2378-2383.

Pandey, V., Wu, Z. S., Zhang, M., Li, R., Zhang, J., Zhu, T., & Lobie, P. E. (2014). Trefoil factor 3 promotes metastatic seeding and predicts poor survival outcome of patients with mammary carcinoma. Breast Cancer Res, 16(5), 429. doi:10.1186/s13058-014-0429-3

Parolia, A., Crea, F., Xue, H., Wang, Y., Mo, F., Ramnarine, V. R., Helgason, C. D. (2015). The long non-coding RNA PCGEM1 is regulated by androgen receptor activity in vivo. Mol Cancer, 14, 46. doi:10.1186/s12943-015-0314-4

Pauli, A., Montague, T. G., Lennox, K. A., Behlke, M. A., & Schier, A. F. (2015). Antisense oligonucleotide-mediated transcript knockdown in zebrafish. PLoS One, 10(10), e0139504. doi:10.1371/journal.pone.0139504

Pavlov, K. A., Shkoporov, A. N., Khokhlova, E. V., Korchagina, A. A., Sidorenkov, A. V., Grigor'ev, M. E., Chekhonin, V. P. (2013). Development of a diagnostic test system for early non-invasive detection of prostate cancer based on PCA3 mRNA levels in urine sediment using quantitative reverse transcription polymerase chain reaction (qRT-PCR). Vestn Ross Akad Med Nauk, (5), 45-51.

Peacock, S. O., Fahrenholtz, C. D., & Burnstein, K. L. (2012). Vav3 enhances androgen receptor splice variant activity and is critical for castration-resistant prostate cancer growth and survival. Mol Endocrinol, 26(12), 1967-1979. doi:10.1210/me.2012-1165

Peng, D., Guo, Y., Chen, H., Zhao, S., Washington, K., Hu, T., El-Rifai, W. (2017). Integrated molecular analysis reveals complex interactions between genomic and epigenomic alterations in esophageal adenocarcinomas. Sci Rep, 7, 40729. doi:10.1038/srep40729

Peng, W., & Fan, H. (2015). Long non-coding RNA PANDAR correlates with poor prognosis and promotes tumorigenesis in hepatocellular carcinoma. Biomed Pharmacother, 72, 113-118. doi:10.1016/j.biopha.2015.04.014

Peng, Z., Skoog, L., Hellborg, H., Jonstam, G., Wingmo, I. L., Hjalm-Eriksson, M., Li, C. (2014). An expression signature at diagnosis to estimate prostate cancer patients' overall survival. Prostate Cancer Prostatic Dis, 17(1), 81-90. doi:10.1038/pcan.2013.57

Penney, K. L., Sinnott, J. A., Fall, K., Pawitan, Y., Hoshida, Y., Kraft, P., Mucci, L. A. (2011). mRNA expression signature of Gleason grade predicts lethal prostate cancer. J Clin Oncol, 29(17), 2391-2396. doi:10.1200/JCO.2010.32.6421

Perdona, S., Bruzzese, D., Ferro, M., Autorino, R., Marino, A., Mazzarella, C., Terracciano, D. (2013). Prostate health index (phi) and prostate cancer antigen 3 (PCA3) significantly improve diagnostic accuracy in patients undergoing prostate biopsy. Prostate, 73(3), 227-235. doi:10.1002/pros.22561

Peres, J., Davis, E., Mowla, S., Bennett, D. C., Li, J. A., Wansleben, S., & Prince, S. (2010). The Highly Homologous T-Box Transcription Factors, TBX2 and TBX3, Have Distinct Roles in the Oncogenic Process. Genes Cancer, 1(3), 272-282. doi:10.1177/1947601910365160

Pertea, M., & Salzberg, S. L. (2010). Between a chicken and a grape: estimating the number of human genes. Genome Biol, 11(5), 206. doi:10.1186/gb-2010-11-5-206

Pheasant, M., & Mattick, J. S. (2007). Raising the estimate of functional human sequences. Genome Res, 17(9), 1245-1253. doi:10.1101/gr.6406307

Piccolo, S. R., Sun, Y., Campbell, J. D., Lenburg, M. E., Bild, A. H., & Johnson, W. E. (2012). A single-sample microarray normalization method to facilitate personalized-medicine workflows. Genomics, 100(6), 337-344. doi:10.1016/j.ygeno.2012.08.003

Piccolo, S. R., Withers, M. R., Francis, O. E., Bild, A. H., & Johnson, W. E. (2013). Multiplatform single-sample estimates of transcriptional activation. Proc Natl Acad Sci U S A, 110(44), 17778-17783. doi:10.1073/pnas.1305823110

Pickard, M. R., Mourtada-Maarabouni, M., & Williams, G. T. (2013). Long non-coding RNA GAS5 regulates apoptosis in prostate cancer cell lines. Biochim Biophys Acta, 1832(10), 1613-1623. doi:10.1016/j.bbadis.2013.05.005

Pickard, M. R., & Williams, G. T. (2014). Regulation of apoptosis by long non-coding RNA GAS5 in breast cancer cells: implications for chemotherapy. Breast Cancer Res Treat, 145(2), 359-370. doi:10.1007/s10549-014-2974-y

Plummer, P. N., Freeman, R., Taft, R. J., Vider, J., Sax, M., Umer, B. A., Mellick, A. S. (2013). MicroRNAs regulate tumor angiogenesis modulated by endothelial progenitor cells. Cancer Res, 73(1), 341-352.

Ponting, C. P., Oliver, P. L., & Reik, W. (2009). Evolution and functions of long noncoding RNAs. Cell, 136(4), 629-641.

Portoso, M., Ragazzini, R., Brencic, Z., Moiani, A., Michaud, A., Vassilev, I., Margueron, R. (2017). PRC2 is dispensable for HOTAIR-mediated transcriptional repression. EMBO J, 36(8), 981-994. doi:10.15252/embj.201695335

Prensner, J. R., Iyer, M. K., Sahu, A., Asangani, I. A., Cao, Q., Patel, L., Chinnaiyan, A. M. (2013). The long noncoding RNA SChLAP1 promotes aggressive prostate cancer and antagonizes the SWI/SNF complex. Nat Genet, 45(11), 1392-1398. doi:10.1038/ng.2771

Prensner, J. R., Rubin, M. A., Wei, J. T., & Chinnaiyan, A. M. (2012). Beyond PSA: the next generation of prostate cancer biomarkers. Sci Transl Med, 4(127), 127rv123. doi:10.1126/scitranslmed.3003180

Prensner, J. R., Sahu, A., Iyer, M. K., Malik, R., Chandler, B., Asangani, I. A., Chinnaiyan, A. M. (2014). The lncRNAs PCGEM1 and PRNCR1 are not implicated in castration resistant prostate cancer. Oncotarget, 5(6), 1434-1438. doi:10.18632/oncotarget.1846

Prensner, J. R., Zhao, S., Erho, N., Schipper, M., Iyer, M. K., Dhanasekaran, S. M., Feng, F. Y. (2014). RNA biomarkers associated with metastatic progression in prostate cancer: a multi-institutional high-throughput analysis of SChLAP1. Lancet Oncol, 15(13), 1469-1480. doi:10.1016/S1470-2045(14)71113-1

Puente, J., Grande, E., Medina, A., Maroto, P., Lainez, N., & Arranz, J. A. (2017). Docetaxel in prostate cancer: a familiar face as the new standard in a hormone-sensitive setting. Ther Adv Med Oncol, 9(5), 307-318. doi:10.1177/1758834017692779

Putzke, A. P., Ventura, A. P., Bailey, A. M., Akture, C., Opoku-Ansah, J., Celiktas, M., Knudsen, B. S. (2011). Metastatic progression of prostate cancer and e-cadherin regulation by Zeb1 and SRC family kinases. Am J Pathol, 179(1), 400-410. doi:10.1016/j.ajpath.2011.03.028

Qi, P., & Du, X. (2013). The long non-coding RNAs, a new cancer diagnostic and therapeutic gold mine. Mod Pathol, 26(2), 155-165.

Qian, Y. Y., Li, K., Liu, Q. Y., & Liu, Z. S. (2017). Long non-coding RNA PTENP1 interacts with miR-193a-3p to suppress cell migration and invasion through the PTEN pathway in hepatocellular carcinoma. Oncotarget, 8(64), 107859-107869. doi:10.18632/oncotarget.22305

Qin, D., & Xu, C. (2015). Study strategies for long non-coding RNAs and their roles in regulating gene expression. Cell Mol Biol Lett, 20(2), 323-349. doi:10.1515/cmble-2015-0021

Qiu, J. J., Wang, Y., Ding, J. X., Jin, H. Y., Yang, G., & Hua, K. Q. (2015). The long non-coding RNA HOTAIR promotes the proliferation of serous ovarian cancer cells through the regulation of cell cycle arrest and apoptosis. Exp Cell Res, 333(2), 238-248. doi:10.1016/j.yexcr.2015.03.005

Qiu, M. T., Hu, J. W., Yin, R., & Xu, L. (2013). Long noncoding RNA: an emerging paradigm of cancer research. Tumour Biol, 34(2), 613-620.

Quackenbush, J. (2001). Computational analysis of microarray data. Nat Rev Genet, 2(6), 418-427. doi:10.1038/35076576

Quek, X. C., Thomson, D. W., Maag, J. L., Bartonicek, N., Signal, B., Clark, M. B., Dinger, M. E. (2015). lncRNAdb v2.0: expanding the reference database for functional long noncoding RNAs. Nucleic Acids Res, 43(Database issue), D168-173. doi:10.1093/nar/gku988

Radiloff, D. R., Wakeman, T. P., Feng, J., Schilling, S., Seto, E., & Wang, X. F. (2011). Trefoil factor 1 acts to suppress senescence induced by oncogene activation during the cellular transformation process. Proc Natl Acad Sci U S A, 108(16), 6591-6596. doi:10.1073/pnas.1017269108

Rajan, P., Stockley, J., Sudbery, I. M., Fleming, J. T., Hedley, A., Kalna, G., . . . Leung, H. Y. (2014). Identification of a candidate prognostic gene signature by transcriptome analysis of matched pre- and post-treatment prostatic biopsies from patients with advanced prostate cancer. BMC Cancer, 14, 977. doi:10.1186/1471-2407-14-977

Rak, A., Szczepankiewicz, D., & Gregoraszczuk, E. L. (2009). Expression of ghrelin receptor, GHSR-1a, and its functional role in the porcine ovarian follicles. Growth Horm IGF Res, 19(1), 68-76.

Ram, O., Goren, A., Amit, I., Shoresh, N., Yosef, N., Ernst, J., Bernstein, B. E. (2011). Combinatorial patterning of chromatin regulators uncovered by genome-wide location analysis in human cells. Cell, 147(7), 1628-1639. doi:10.1016/j.cell.2011.09.057

Ramalho-Carvalho, J., Fromm, B., Henrique, R., & Jeronimo, C. (2016). Deciphering the function of non-coding RNAs in prostate cancer. Cancer Metastasis Rev, 35(2), 235-262. doi:10.1007/s10555-016-9628-y

Ramaswamy, S., Ross, K. N., Lander, E. S., & Golub, T. R. (2003). A molecular signature of metastasis in primary solid tumors. Nat Genet, 33(1), 49-54. doi:10.1038/ng1060

Ramirez, F., Ryan, D. P., Gruning, B., Bhardwaj, V., Kilpert, F., Richter, A. S., Manke, T. (2016). deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res, 44(W1), W160-165. doi:10.1093/nar/gkw257

Ramos, C. G., Valdevenito, R., Vergara, I., Anabalon, P., Sanchez, C., & Fulla, J. (2013). PCA3 sensitivity and specificity for prostate cancer detection in patients with abnormal PSA and/or suspicious digital rectal examination. First Latin American experience. Urol Oncol, 31(8), 1522-1526. doi:10.1016/j.urolonc.2012.05.002

Raney, B. J., Dreszer, T. R., Barber, G. P., Clawson, H., Fujita, P. A., Wang, T., Kent, W. J. (2014). Track data hubs enable visualization of user-defined genome-wide annotations on the UCSC Genome Browser. Bioinformatics, 30(7), 1003-1005. doi:10.1093/bioinformatics/btt637

Redondo, M., Garcia, J., Villar, E., Rodrigo, I., Perea-Milla, E., Serrano, A., & Morell, M. (2003). Major histocompatibility complex status in breast carcinogenesis and relationship to apoptosis. Hum Pathol, 34(12), 1283-1289.

Reis, E. M., Louro, R., Nakaya, H. I., & Verjovski-Almeida, S. (2005). As antisense RNA gets intronic. OMICS, 9(1), 2-12. doi:10.1089/omi.2005.9.2

Reis, E. M., Nakaya, H. I., Louro, R., Canavez, F. C., Flatschart, A. V., Almeida, G. T., Verjovski-Almeida, S. (2004). Antisense intronic non-coding RNA levels correlate to the degree of tumor differentiation in prostate cancer. Oncogene, 23(39), 6684-6692. doi:10.1038/sj.onc.1207880

Ren, S., Liu, Y., Xu, W., Sun, Y., Lu, J., Wang, F., Sun, Y. (2013). Long noncoding RNA MALAT-1 is a new potential therapeutic target for castration resistant prostate cancer. J Urol, 190(6), 2278-2287. doi:10.1016/j.juro.2013.07.001

Ren, S., Peng, Z., Mao, J. H., Yu, Y., Yin, C., Gao, X., Sun, Y. (2012). RNA-seq analysis of prostate cancer in the Chinese population identifies recurrent gene fusions, cancer-associated long noncoding RNAs and aberrant alternative splicings. Cell Res, 22(5), 806-821. doi:10.1038/cr.2012.30

Ren, S., Wang, F., Shen, J., Sun, Y., Xu, W., Lu, J., Sun, Y. (2013). Long non-coding RNA metastasis associated in lung adenocarcinoma transcript 1 derived miniRNA as a novel plasma-based biomarker for diagnosing prostate cancer. Eur J Cancer, 49(13), 2949-2959. doi:10.1016/j.ejca.2013.04.026

Renganathan, A., & Felley-Bosco, E. (2017). Long noncoding RNAs in cancer and therapeutic potential. Adv Exp Med Biol, 1008, 199-222. doi:10.1007/978-981-10-5203-3_7

Rhodes, D. R., Kalyana-Sundaram, S., Mahavisno, V., Varambally, R., Yu, J., Briggs, B. B., Chinnaiyan, A. M. (2007). Oncomine 3.0: genes, pathways, and networks in a collection of 18,000 cancer gene expression profiles. Neoplasia, 9(2), 166-180.

Rhodes, D. R., Yu, J., Shanker, K., Deshpande, N., Varambally, R., Ghosh, D., Chinnaiyan, A. M. (2004). ONCOMINE: a cancer microarray database and integrated data-mining platform. Neoplasia, 6(1), 1-6.

Rice, J. C., & Allis, C. D. (2001). Histone methylation versus histone acetylation: new insights into epigenetic regulation. Curr Opin Cell Biol, 13(3), 263-273.

Rich, J. T., Neely, J. G., Paniello, R. C., Voelker, C. C., Nussenbaum, B., & Wang, E. W. (2010). A practical guide to understanding Kaplan-Meier curves. Otolaryngol Head Neck Surg, 143(3), 331-336. doi:10.1016/j.otohns.2010.05.007

Rinn, J. L., Kertesz, M., Wang, J. K., Squazzo, S. L., Xu, X., Brugmann, S. A., Chang, H. Y. (2007). Functional demarcation of active and silent chromatin domains in human HOX loci by noncoding RNAs. Cell, 129(7), 1311-1323. doi:10.1016/j.cell.2007.05.022

Ritchie, M. E., Phipson, B., Wu, D., Hu, Y., Law, C. W., Shi, W., & Smyth, G. K. (2015). limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res, 43(7), e47. doi:10.1093/nar/gkv007

Robinson, M. D., McCarthy, D. J., & Smyth, G. K. (2010). edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics, 26(1), 139-140. doi:10.1093/bioinformatics/btp616

Robinson, M. D., & Oshlack, A. (2010). A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol, 11(3), R25. doi:10.1186/gb-2010-11-3-r25

Rothschild, G., & Basu, U. (2017). Lingering questions about enhancer RNA and enhancer transcription-coupled genomic instability. Trends Genet, 33(2), 143-154. doi:10.1016/j.tig.2016.12.002

Roux, B. T., Lindsay, M. A., & Heward, J. A. (2017). Knockdown of nuclear-located enhancer RNAs and long ncRNAs using locked nucleic acid gapmeRs. Methods Mol Biol, 1468, 11-18. doi:10.1007/978-1-4939-4035-6_2

Ruiz-Orera, J., Messeguer, X., Subirana, J. A., & Alba, M. M. (2014). Long non-coding RNAs as a source of new peptides. Elife, 3, e03523. doi:10.7554/eLife.03523

Russell, P. J., Raghavan, D., Gregory, P., Philips, J., Wills, E. J., Jelbart, M., . . . Vincent, P. C. (1986). Bladder cancer xenografts: a model of tumor cell heterogeneity. Cancer Res, 46(4 Pt 2), 2035-2040.

Rycaj, K., & Tang, D. G. (2017). Molecular determinants of prostate cancer metastasis. Oncotarget, 8(50), 88211-88231. doi:10.18632/oncotarget.21085

Saal, L. H., Johansson, P., Holm, K., Gruvberger-Saal, S. K., She, Q. B., Maurer, M., Parsons, R. (2007). Poor prognosis in carcinoma is associated with a gene expression signature of aberrant PTEN tumor suppressor pathway activity. Proc Natl Acad Sci U S A, 104(18), 7564-7569. doi:10.1073/pnas.0702507104

Salameh, A., Lee, A. K., Cardo-Vila, M., Nunes, D. N., Efstathiou, E., Staquicini, F. I., Arap, W. (2015). PRUNE2 is a human prostate cancer suppressor regulated by the intronic long noncoding RNA PCA3. Proc Natl Acad Sci U S A, 112(27), 8403-8408. doi:10.1073/pnas.1507882112

Samudyata, Castelo-Branco, G., & Bonetti, A. (2017). Birth, coming of age and death: The intriguing life of long noncoding RNAs. Semin Cell Dev Biol. doi:10.1016/j.semcdb.2017.11.012

Sana, J., Faltejskova, P., Svoboda, M., & Slaby, O. (2012). Long non-coding RNAs and their relevance in cancer. Klin Onkol, 25(4), 246-254.

Sanbonmatsu, K. Y. (2015). Towards structural classification of long non-coding RNAs. Biochim Biophys Acta. doi:10.1016/j.bbagrm.2015.09.011

Sangodkar, J., Farrington, C. C., McClinch, K., Galsky, M. D., Kastrinsky, D. B., & Narla, G. (2016). All roads lead to PP2A: exploiting the therapeutic potential of this phosphatase. FEBS J, 283(6), 1004-1024. doi:10.1111/febs.13573

Saxena, A., & Carninci, P. (2011). Long non-coding RNA modifies chromatin: epigenetic silencing by long non-coding RNAs. Bioessays, 33(11), 830-839. doi:10.1002/bies.201100084

Scharfmann, R., Axelrod, J. H., & Verma, I. M. (1991). Long-term in vivo expression of retrovirus-mediated gene transfer in mouse fibroblast implants. Proc Natl Acad Sci U S A, 88(11), 4626-4630. doi:10.1073/pnas.88.11.4626

Sehgal, L., Mathur, R., Braun, F. K., Wise, J. F., Berkova, Z., Neelapu, S., Samaniego, F. (2014). FAS-antisense 1 lncRNA and production of soluble versus membrane Fas in B-cell lymphoma. Leukemia, 28(12), 2376-2387. doi:10.1038/leu.2014.126

Seim, I., Herington, A. C., & Chopin, L. K. (2009). New insights into the molecular complexity of the ghrelin gene locus. Cytokine Growth Factor Rev, 20(4), 297-304. doi:10.1016/j.cytogfr.2009.07.006

Seim, I., Jeffery, P. L., Thomas, P. B., Nelson, C. C., & Chopin, L. K. (2017). Whole-genome sequence of the metastatic PC3 and LNCaP human prostate cancer cell lines. G3 (Bethesda), 7(6), 1731-1741. doi:10.1534/g3.117.039909

Seim, I., Josh, P., Cunningham, P., Herington, A., & Chopin, L. (2011). Ghrelin axis genes, peptides and receptors: recent findings and future challenges. Mol Cell Endocrinol, 340(1), 3-9. doi:10.1016/j.mce.2011.05.002

Selim, S. Z., & Ismail, M. A. (1984). K-means-type algorithms: a generalized convergence theorem and characterization of local optimality. IEEE Trans Pattern Anal Mach Intell, 6(1), 81-87.

Sengupta, R., Dubuc, A., Ward, S., Yang, L., Northcott, P., Woerner, B. M., Rubin, J. B. (2012). CXCR4 activation defines a new subgroup of Sonic hedgehog-driven medulloblastoma. Cancer Res, 72(1), 122-132. doi:10.1158/0008-5472.can-11-1701

Serrano, A., Castro-Vega, I., & Redondo, M. (2011). Role of gene methylation in antitumor immune response: implication for tumor progression. Cancers (Basel), 3(2), 1672-1690. doi:10.3390/cancers3021672

Sharma, V., Sharma, R., & Singh, S. (2014). Antisense oligonucleotides: modifications and clinical trials. Medicinal Chemistry Communications, 5, 1454-1471.

Shen, P., Sun, J., Xu, G., Zhang, L., Yang, Z., Xia, S., Shi, G. (2014). KLF9, a transcription factor induced in flutamide-caused cell apoptosis, inhibits AKT activation and suppresses tumor growth of prostate cancer cells. Prostate, 74(9), 946-958. doi:10.1002/pros.22812

Shen, X. H., Qi, P., & Du, X. (2015). Long non-coding RNAs in cancer invasion and metastasis. Mod Pathol, 28(1), 4-13. doi:10.1038/modpathol.2014.75

Shi, X., Tang, X., & Su, L. (2017). Over-expression of long non-coding RNA PTENP1 inhibits cell proliferation and migration via suppression of miR-19b in breast cancer cells. Oncological Research. doi:10.3727/096504017x15123838050075

Shibayama, Y., Fanucchi, S., Magagula, L., & Mhlanga, M. M. (2014). lncRNA and gene looping: what's the connection? Transcription, 5(3), e28658. doi:10.4161/trns.28658

Shibayama, Y., Fanucchi, S., & Mhlanga, M. M. (2017). Visualization of enhancer-derived noncoding RNA. Methods Mol Biol, 1468, 19-32. doi:10.1007/978-1-4939-4035-6_3

Shultz, L. D., Lyons, B. L., Burzenski, L. M., Gott, B., Chen, X., Chaleff, S., Handgretinger, R. (2005). Human lymphoid and myeloid cell development in NOD/LtSz-scid IL2R gamma null mice engrafted with mobilized human hemopoietic stem cells. J Immunol, 174(10), 6477-6489.

Sims, D., Sudbery, I., Ilott, N. E., Heger, A., & Ponting, C. P. (2014). Sequencing depth and coverage: key considerations in genomic analyses. Nat Rev Genet, 15(2), 121-132. doi:10.1038/nrg3642

Singh, D., Febbo, P. G., Ross, K., Jackson, D. G., Manola, J., Ladd, C., Sellers, W. R. (2002). Gene expression correlates of clinical prostate cancer behavior. Cancer Cell, 1(2), 203-209.

Sinnott, J. A., Peisch, S., Tyekucheva, S., Gerke, T. A., Lis, R. T., Rider, J. R., Penney, K. L. (2016). Prognostic utility of a new mRNA expression signature of Gleason score. Clin Cancer Res. doi:10.1158/1078-0432.CCR-16-1245

Slaby, O., Laga, R., & Sedlacek, O. (2017). Therapeutic targeting of non-coding RNAs in cancer. Biochem J, 474(24), 4219-4251. doi:10.1042/bcj20170079

Smeets, D., Markaki, Y., Schmid, V. J., Kraus, F., Tattermusch, A., Cerase, A., Cremer, M. (2014). Three-dimensional super-resolution microscopy of the inactive X chromosome territory reveals a collapse of its active nuclear compartment harboring distinct Xist RNA foci. Epigenetics Chromatin, 7, 8. doi:10.1186/1756-8935-7-8

Smolle, M. A., Bauernhofer, T., Pummer, K., Calin, G. A., & Pichler, M. (2017). Current insights into long non-coding RNAs (LncRNAs) in prostate cancer. Int J Mol Sci, 18(2). doi:10.3390/ijms18020473

Sonohara, F., Inokawa, Y., Hayashi, M., Yamada, S., Sugimoto, H., Fujii, T., Nomoto, S. (2017). Prognostic value of long non-coding RNA HULC and MALAT1 following the curative resection of hepatocellular carcinoma. Sci Rep, 7(1), 16142. doi:10.1038/s41598-017-16260-1

Sonpavde, G., Wang, C. G., Galsky, M. D., Oh, W. K., & Armstrong, A. J. (2015). Cytotoxic chemotherapy in the contemporary management of metastatic castration-resistant prostate cancer (mCRPC). BJU Int, 116(1), 17-29. doi:10.1111/bju.12867

Soudyab, M., Iranpour, M., & Ghafouri-Fard, S. (2016). The role of long non-coding RNAs in breast cancer. Arch Iran Med, 19(7), 508-517. doi:0161907/aim.0011

Sowalsky, A. G., Xia, Z., Wang, L., Zhao, H., Chen, S., Bubley, G. J., Li, W. (2015). Whole transcriptome sequencing reveals extensive unspliced mRNA in metastatic castration-resistant prostate cancer. Mol Cancer Res, 13(1), 98-106. doi:10.1158/1541-7786.mcr-14-0273

Srikantan, V., Zou, Z., Petrovics, G., Xu, L., Augustus, M., Davis, L., Srivastava, S. (2000). PCGEM1, a prostate-specific gene, is overexpressed in prostate cancer. Proc Natl Acad Sci U S A, 97(22), 12216-12221. doi:10.1073/pnas.97.22.12216

Stein, C. A., & Castanotto, D. (2017). FDA-Approved Oligonucleotide Therapies in 2017. Mol Ther, 25(5), 1069-1075. doi:10.1016/j.ymthe.2017.03.023

Stelloo, S., Nevedomskaya, E., van der Poel, H. G., de Jong, J., van Leenders, G. J., Jenster, G., Zwart, W. (2015). Androgen receptor profiling predicts prostate cancer outcome. EMBO Mol Med, 7(11), 1450-1464. doi:10.15252/emmm.201505424

Stone, L. (2017). Prostate cancer: Escaping enzalutamide: Malat1 contributes to resistance. Nat Rev Urol, 14(8), 450. doi:10.1038/nrurol.2017.91

Subramanian, A., Tamayo, P., Mootha, V. K., Mukherjee, S., Ebert, B. L., Gillette, M. A., Mesirov, J. P. (2005). Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A, 102(43), 15545-15550. doi:10.1073/pnas.0506580102

Sumanasuriya, S., & De Bono, J. (2017). Treatment of advanced prostate cancer-a review of current therapies and future promise. Cold Spring Harb Perspect Med. doi:10.1101/cshperspect.a030635

Sun, M., Gadad, S. S., Kim, D. S., & Kraus, W. L. (2015). Discovery, annotation, and functional analysis of long noncoding RNAs controlling cell-cycle gene expression and proliferation in breast cancer cells. Mol Cell, 59(4), 698-711. doi:10.1016/j.molcel.2015.06.023

Sun, M., & Kraus, W. L. (2015). From discovery to function: the expanding roles of long non-coding RNAs in physiology and disease. Endocr Rev, er00009999. doi:10.1210/er.0000-9999

Sun, W., Yang, Y., Xu, C., & Guo, J. (2017). Regulatory mechanisms of long noncoding RNAs on gene expression in cancers. Cancer Genet, 216-217, 105-110. doi:10.1016/j.cancergen.2017.06.003

Sung, Y. Y., & Cheung, E. (2013). Antisense now makes sense: dual modulation of androgen-dependent transcription by CTBP1-AS. EMBO J, 32(12), 1653-1654. doi:10.1038/emboj.2013.112

Swift, S. L., Burns, J. E., & Maitland, N. J. (2010). Altered expression of neurotensin receptors is associated with the differentiation state of prostate cancer. Cancer Res, 70(1), 347-356. doi:10.1158/0008-5472.can-09-1252

Szalai, B., Hoffmann, P., Prokop, S., Erdelyi, L., Varnai, P., & Hunyady, L. (2014). Improved methodical approach for quantitative BRET analysis of G Protein Coupled Receptor dimerization. PLoS One, 9(10), e109503. doi:10.1371/journal.pone.0109503

Szklarczyk, D., Morris, J. H., Cook, H., Kuhn, M., Wyder, S., Simonovic, M., von Mering, C. (2017). The STRING database in 2017: quality-controlled protein-protein association networks, made broadly accessible. Nucleic Acids Res, 45(D1), D362-d368. doi:10.1093/nar/gkw937

Tabibzadeh, S. S., Sivarajah, A., Carpenter, D., Ohlsson-Wilhelm, B. M., & Satyaswaroop, P. G. (1990). Modulation of HLA-DR expression in epithelial cells by interleukin 1 and estradiol-17 beta. J Clin Endocrinol Metab, 71(3), 740-747. doi:10.1210/jcem-71-3-740

Taft, R. J., Pang, K. C., Mercer, T. R., Dinger, M., & Mattick, J. S. (2010). Non-coding RNAs: regulators of disease. J Pathol, 220(2), 126-139. doi:10.1002/path.2638

Tajbakhsh, S. (2017). lncRNA-encoded polypeptide SPAR(s) with mTORC1 to regulate skeletal muscle regeneration. Cell Stem Cell, 20(4), 428-430. doi:10.1016/j.stem.2017.03.016

Takahashi, K., Furukawa, C., Takano, A., Ishikawa, N., Kato, T., Hayama, S., Daigo, Y. (2006). The neuromedin U-growth hormone secretagogue receptor 1b/neurotensin receptor 1 oncogenic signaling pathway as a therapeutic target for lung cancer. Cancer Res, 66(19), 9408-9419. doi:10.1158/0008-5472.CAN-06-1349

Takayama, K., Horie-Inoue, K., Katayama, S., Suzuki, T., Tsutsumi, S., Ikeda, K., Inoue, S. (2013). Androgen-responsive long noncoding RNA CTBP1-AS promotes prostate cancer. EMBO J, 32(12), 1665-1680. doi:10.1038/emboj.2013.99

Tang, W., Dong, K., Li, K., Dong, R., & Zheng, S. (2016). MEG3, HCN3 and linc01105 influence the proliferation and apoptosis of neuroblastoma cells via the HIF-1alpha and p53 pathways. Sci Rep, 6, 36268. doi:10.1038/srep36268

Tang, W., Ji, M., He, G., Yang, L., Niu, Z., Jian, M., Xu, J. (2017). Silencing CDR1as inhibits colorectal cancer progression through regulating microRNA-7. Onco Targets Ther, 10, 2045-2056. doi:10.2147/ott.s131597

Tani, H., Mizutani, R., Salam, K. A., Tano, K., Ijiri, K., Wakamatsu, A., Akimitsu, N. (2012). Genome-wide determination of RNA stability reveals hundreds of short-lived noncoding transcripts in mammals. Genome Res, 22(5), 947-956.

Tani, H., Onuma, Y., Ito, Y., & Torimura, M. (2014). Long non-coding RNAs as surrogate indicators for chemical stress responses in human-induced pluripotent stem cells. PLoS One, 9(8), e106282. doi:10.1371/journal.pone.0106282

Tanjore, H., & Kalluri, R. (2006). The role of type IV collagen and basement membranes in cancer progression and metastasis. Am J Pathol, 168(3), 715-717. doi:10.2353/ajpath.2006.051321

Tarazona, S., Garcia-Alcalde, F., Dopazo, J., Ferrer, A., & Conesa, A. (2011). Differential expression in RNA-seq: a matter of depth. Genome Res, 21(12), 2213-2223. doi:10.1101/gr.124321.111

Taylor, B. S., Schultz, N., Hieronymus, H., Gopalan, A., Xiao, Y., Carver, B. S., Gerald, W. L. (2010). Integrative genomic profiling of human prostate cancer. Cancer Cell, 18(1), 11-22. doi:10.1016/j.ccr.2010.05.026

Taylor, D. H., Chu, E. T., Spektor, R., & Soloway, P. D. (2015). Long non-coding RNA regulation of reproduction and development. Mol Reprod Dev. doi:10.1002/mrd.22581

Tee, A. E., Liu, B., Song, R., Li, J., Pasquier, E., Cheung, B. B., Liu, T. (2016). The long noncoding RNA MALAT1 promotes tumor-driven angiogenesis by up-regulating pro-angiogenic gene expression. Oncotarget, 7(8), 8663-8675. doi:10.18632/oncotarget.6675

Teply, B. A., & Hauke, R. J. (2016). Chemotherapy options in castration-resistant prostate cancer. Indian J Urol, 32(4), 262-270. doi:10.4103/0970-1591.191239

Thalmann, G. N., Anezinis, P. E., Chang, S. M., Zhau, H. E., Kim, E. E., Hopwood, V. L., . . . Chung, L. W. (1994). Androgen-independent cancer progression and bone metastasis in the LNCaP model of human prostate cancer. Cancer Res, 54(10), 2577-2581.

Thibodeau, J., Bourgeois-Daigneault, M. C., & Lapointe, R. (2012). Targeting the MHC Class II antigen presentation pathway in cancer immunotherapy. Oncoimmunology, 1(6), 908-916. doi:10.4161/onci.21205

Thompson, V. C., Day, T. K., Bianco-Miotto, T., Selth, L. A., Han, G., Thomas, M., . . . Tilley, W. D. (2012). A gene signature identified using a mouse model of androgen receptor-dependent prostate cancer predicts biochemical relapse in human disease. Int J Cancer, 131(3), 662-672. doi:10.1002/ijc.26414

Thurman, R. E., Rynes, E., Humbert, R., Vierstra, J., Maurano, M. T., Haugen, E., Stamatoyannopoulos, J. A. (2012). The accessible chromatin landscape of the human genome. Nature, 489(7414), 75-82. doi:10.1038/nature11232

Tombal, B., Andriole, G. L., de la Taille, A., Gontero, P., Haese, A., Remzi, M., Stoevelaar, H. (2013). Clinical judgment versus biomarker prostate cancer gene 3: which is best when determining the need for repeat prostate biopsy? Urology, 81(5), 998-1004. doi:10.1016/j.urology.2012.11.069

Tran, C., Ouk, S., Clegg, N. J., Chen, Y., Watson, P. A., Arora, V., Sawyers, C. L. (2009). Development of a second-generation antiandrogen for treatment of advanced prostate cancer. Science, 324(5928), 787-790. doi:10.1126/science.1168175

Trowsdale, J., & Knight, J. C. (2013). Major histocompatibility complex genomics and human disease. Annu Rev Genomics Hum Genet, 14, 301-323. doi:10.1146/annurev-genom-091212-153455

Troy, A., & Sharpless, N. E. (2012). Genetic "lnc"-age of noncoding RNAs to human disease. J Clin Invest, 122(11), 3837-3840.

Truax, A. D., Thakkar, M., & Greer, S. F. (2012). Dysregulated recruitment of the histone methyltransferase EZH2 to the class II transactivator (CIITA) promoter IV in breast cancer cells. PLoS One, 7(4), e36013. doi:10.1371/journal.pone.0036013

Vance, K. W., Sansom, S. N., Lee, S., Chalei, V., Kong, L., Cooper, S. E., Ponting, C. P. (2014). The long non-coding RNA Paupar regulates the expression of both local and distal genes. EMBO J, 33(4), 296-311. doi:10.1002/embj.201386225

Vandesompele, J., De Preter, K., Pattyn, F., Poppe, B., Van Roy, N., De Paepe, A., & Speleman, F. (2002). Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes. Genome Biol, 3(7), RESEARCH0034.

Varambally, S., Yu, J., Laxman, B., Rhodes, D. R., Mehra, R., Tomlins, S. A., Chinnaiyan, A. M. (2005). Integrative genomic and proteomic analysis of prostate cancer reveals signatures of metastatic progression. Cancer Cell, 8(5), 393-406. doi:10.1016/j.ccr.2005.10.001

Veedu, R. N., & Wengel, J. (2010). Locked nucleic acids: promising nucleic acid analogs for therapeutic applications. Chem Biodivers, 7(3), 536-542. doi:10.1002/cbdv.200900343

Vestergaard, E. M., Borre, M., Poulsen, S. S., Nexo, E., & Torring, N. (2006). Plasma levels of trefoil factors are increased in patients with advanced prostate cancer. Clin Cancer Res, 12(3 Pt 1), 807-812. doi:10.1158/1078-0432.CCR-05-1545

Villegas, V. E., & Zaphiropoulos, P. G. (2015). Neighboring gene regulation by antisense long non-coding RNAs. Int J Mol Sci, 16(2), 3251-3266. doi:10.3390/ijms16023251

Vitiello, M., Tuccoli, A., & Poliseno, L. (2015). Erratum to: Long non-coding RNAs in cancer: implications for personalized therapy. Cell Oncol (Dordr), 38(1), 91. doi:10.1007/s13402-014-0202-8

Vivian, J., Rao, A., Nothaft, F. A., Ketchum, C., Armstrong, J., Novak, A., Paten, B. (2016). Rapid and efficient analysis of 20,000 RNA-seq samples with Toil. bioRxiv. doi:10.1101/062497

Vivian, J., Rao, A. A., Nothaft, F. A., Ketchum, C., Armstrong, J., Novak, A., Paten, B. (2017). Toil enables reproducible, open source, big biomedical data analyses. Nat Biotechnol, 35(4), 314-316. doi:10.1038/nbt.3772

Walsh, N. C., Kenney, L. L., Jangalwe, S., Aryee, K. E., Greiner, D. L., Brehm, M. A., & Shultz, L. D. (2017). Humanized Mouse Models of Clinical Disease. Annu Rev Pathol, 12, 187-215. doi:10.1146/annurev-pathol-052016-100332

Wan, X., Huang, W., Yang, S., Zhang, Y., Pu, H., Fu, F., Li, Y. (2016). Identification of androgen-responsive lncRNAs as diagnostic and prognostic markers for prostate cancer. Oncotarget, 7(37), 60503-60518. doi:10.18632/oncotarget.11391

Wan, Y., Kertesz, M., Spitale, R. C., Segal, E., & Chang, H. Y. (2011). Understanding the transcriptome through RNA structure. Nat Rev Genet, 12(9), 641-655. doi:10.1038/nrg3049

Wang, B., Su, Y., Yang, Q., Lv, D., Zhang, W., Tang, K., Liu, Y. (2015). Overexpression of long non-coding RNA HOTAIR promotes tumor growth and metastasis in human osteosarcoma. Mol Cells, 38(5), 432-440. doi:10.14348/molcells.2015.2327

Wang, D., Ding, L., Wang, L., Zhao, Y., Sun, Z., Karnes, R. J., Huang, H. (2015). LncRNA MALAT1 enhances oncogenic activities of EZH2 in castration-resistant prostate cancer. Oncotarget, 6(38), 41045-41055. doi:10.18632/oncotarget.5728

Wang, E. T., Sandberg, R., Luo, S., Khrebtukova, I., Zhang, L., Mayr, C., Burge, C. B. (2008). Alternative isoform regulation in human tissue transcriptomes. Nature, 456(7221), 470-476. doi:10.1038/nature07509

Wang, F., Ren, S., Chen, R., Lu, J., Shi, X., Zhu, Y., Sun, Y. (2014). Development and prospective multicenter evaluation of the long noncoding RNA MALAT-1 as a diagnostic urinary biomarker for prostate cancer. Oncotarget, 5(22), 11091-11102. doi:10.18632/oncotarget.2691

Wang, J., Liu, X., Wu, H., Ni, P., Gu, Z., Qiao, Y., Fan, Q. (2010). CREB up-regulates long non-coding RNA, HULC expression through interaction with microRNA-372 in liver cancer. Nucleic Acids Res, 38(16), 5366-5383. doi:10.1093/nar/gkq285

Wang, J., Zou, J. X., Xue, X., Cai, D., Zhang, Y., Duan, Z., Chen, H. W. (2016). ROR-gamma drives androgen receptor expression and represents a therapeutic target in castration-resistant prostate cancer. Nat Med, 22(5), 488-496. doi:10.1038/nm.4070

Wang, L., Lin, Y., Meng, H., Liu, C., Xue, J., Zhang, Q., Jiang, A. (2016). Long non-coding RNA LOC283070 mediates the transition of LNCaP cells into androgen-independent cells possibly via CAMK1D. Am J Transl Res, 8(12), 5219-5234.

Wang, L., Park, H. J., Dasari, S., Wang, S., Kocher, J. P., & Li, W. (2013). CPAT: Coding-Potential Assessment Tool using an alignment-free logistic regression model. Nucleic Acids Res, 41(6), e74. doi:10.1093/nar/gkt006

Wang, L., Shi, S., Wang, L., Xie, Y., Bai, E., Zhou, X., Zhu, Q. (2013). Role of PRNCR1 in the castration resistant prostate cancer. Xi Bao Yu Fen Zi Mian Yi Xue Za Zhi, 29(8), 789-793.

Wang, R., Sun, Y., Li, L., Niu, Y., Lin, W., Lin, C., Chang, C. (2017). Preclinical study using malat1 small interfering RNA or androgen receptor splicing variant 7 degradation enhancer ASC-J9® to suppress enzalutamide-resistant prostate cancer progression. Eur Urol, 72(5), 835-844. doi:10.1016/j.eururo.2017.04.005

Wang, T., Guo, S., Liu, Z., Wu, L., Li, M., Yang, J., Chen, K. (2014). CAMK2N1 inhibits prostate cancer progression through androgen receptor-dependent signaling. Oncotarget, 5(21), 10293-10306. doi:10.18632/oncotarget.2511

Wang, X., Ruan, Y., Wang, X., Zhao, W., Jiang, Q., Jiang, C., Xu, D. (2017). Long intragenic non-coding RNA lincRNA-p21 suppresses development of human prostate cancer. Cell Prolif, 50(2). doi:10.1111/cpr.12318

Wang, X., Yang, B., & Ma, B. (2016). The UCA1/miR-204/Sirt1 axis modulates docetaxel sensitivity of prostate cancer cells. Cancer Chemother Pharmacol, 78(5), 1025-1031. doi:10.1007/s00280-016-3158-8

Wang, Y., Chen, F., Zhao, M., Yang, Z., Li, J., Zhang, S., Zhang, X. (2017). The long noncoding RNA HULC promotes liver cancer by increasing the expression of the HMGA2 oncogene via sequestration of the microRNA-186. J Biol Chem, 292(37), 15395-15407. doi:10.1074/jbc.M117.783738

Webber, M. M., Bello, D., Kleinman, H. K., & Hoffman, M. P. (1997). Acinar differentiation by non-malignant immortalized human prostatic epithelial cells and its loss by malignant cells. Carcinogenesis, 18(6), 1225-1231.

Wei, G. H., & Wang, X. (2017). lncRNA MEG3 inhibit proliferation and metastasis of gastric cancer via p53 signaling pathway. Eur Rev Med Pharmacol Sci, 21(17), 3850-3856.

Weichselbaum, R. R., Ishwaran, H., Yoon, T., Nuyten, D. S., Baker, S. W., Khodarev, N., Minn, A. J. (2008). An interferon-related gene signature for DNA damage resistance is a predictive marker for chemotherapy and radiation for breast cancer. Proc Natl Acad Sci U S A, 105(47), 18490-18495. doi:10.1073/pnas.0809242105

Australian Institute of Health and Welfare. (2017). Cancer in Australia: Actual incidence data from 1982 to 2013 and mortality data from 1982 to 2014 with projections to 2017. Asia Pac J Clin Oncol. doi:10.1111/ajco.12761

Wen-Jun, L., & Yi-Lei, M. (2015). Regulatory Effects of Long Non-coding RNA on Tumorigenesis. Zhongguo Yi Xue Ke Xue Yuan Xue Bao, 37(3), 358-363. doi:10.3881/j.issn.1000-503X.2015.03.023

Wenric, S., ElGuendi, S., Caberg, J. H., Bezzaou, W., Fasquelle, C., Charloteaux, B., Bours, V. (2017). Transcriptome-wide analysis of natural antisense transcripts shows their potential role in breast cancer. Sci Rep, 7(1), 17452. doi:10.1038/s41598-017-17811-2

Werner, A. (2005). Natural antisense transcripts. RNA Biol, 2(2), 53-62.

Werner, A., & Berdal, A. (2005). Natural antisense transcripts: sound or silence? Physiol Genomics, 23(2), 125-131. doi:10.1152/physiolgenomics.00124.2005

Whiteside, E. J., Seim, I., Pauli, J. P., O'Keeffe, A. J., Thomas, P. B., Carter, S. L., Chopin, L. K. (2013). Identification of a long non-coding RNA gene, growth hormone secretagogue receptor opposite strand, which stimulates cell migration in non-small cell lung cancer cell lines. Int J Oncol, 43(2), 566-574. doi:10.3892/ijo.2013.1969

Wilusz, J. E., Jnbaptiste, C. K., Lu, L. Y., Kuhn, C. D., Joshua-Tor, L., & Sharp, P. A. (2012). A triple helix stabilizes the 3' ends of long noncoding RNAs that lack poly(A) tails. Genes Dev, 26(21), 2392-2407.

Wright, M. W. (2014). A short guide to long non-coding RNA. Hum Genomics, 8, 7. doi:10.1186/1479-7364-8-7

Wu, A. H., Huang, Y. L., Zhang, L. Z., Tian, G., Liao, Q. Z., & Chen, S. L. (2016). MiR-572 prompted cell proliferation of human ovarian cancer cells by suppressing PPP2R2C expression. Biomed Pharmacother, 77, 92-97. doi:10.1016/j.biopha.2015.12.005

Wu, C. L., Schroeder, B. E., Ma, X. J., Cutie, C. J., Wu, S., Salunga, R., McDougal, W. S. (2013). Development and validation of a 32-gene prognostic index for prostate cancer progression. Proc Natl Acad Sci U S A, 110(15), 6121-6126. doi:10.1073/pnas.1215870110

Wu, S., Zheng, C., Chen, S., Cai, X., Shi, Y., Lin, B., & Chen, Y. (2015). Overexpression of long non-coding RNA HOTAIR predicts a poor prognosis in patients with acute myeloid leukemia. Oncol Lett, 10(4), 2410-2414. doi:10.3892/ol.2015.3552

Wu, Y., Zhang, L., Zhang, L., Wang, Y., Li, H., Ren, X., Hao, X. (2015). Long non-coding RNA HOTAIR promotes tumor cell invasion and metastasis by recruiting EZH2 and repressing E-cadherin in oral squamous cell carcinoma. Int J Oncol, 46(6), 2586-2594. doi:10.3892/ijo.2015.2976

Wyatt, A. W., & Gleave, M. E. (2015). Targeting the adaptive molecular landscape of castration-resistant prostate cancer. EMBO Mol Med, 7(7), 878-894. doi:10.15252/emmm.201303701

Xiang, S., Zou, P., Tang, Q., Zheng, F., Wu, J., Chen, Z., & Hann, S. S. (2017). HOTAIR-mediated reciprocal regulation of EZH2 and DNMT1 contribute to polyphyllin I-inhibited growth of castration-resistant prostate cancer cells in vitro and in vivo. Biochim Biophys Acta, 1862(3), 589-599. doi:10.1016/j.bbagen.2017.12.001

Xing, C. Y., Hu, X. Q., Xie, F. Y., Yu, Z. J., Li, H. Y., Bin, Z., Gao, S. M. (2015). Long non-coding RNA HOTAIR modulates c-KIT expression through sponging miR-193a in acute myeloid leukemia. FEBS Lett, 589(15), 1981-1987. doi:10.1016/j.febslet.2015.04.061

Xu, K., Cui, J., Olman, V., Yang, Q., Puett, D., & Xu, Y. (2010). A comparative analysis of gene-expression data of multiple cancer types. PLoS One, 5(10), e13696. doi:10.1371/journal.pone.0013696

Xu, L., Zhang, M., Zheng, X., Yi, P., Lan, C., & Xu, M. (2017). The circular RNA ciRS-7 (Cdr1as) acts as a risk factor of hepatic microvascular invasion in hepatocellular carcinoma. J Cancer Res Clin Oncol, 143(1), 17-27. doi:10.1007/s00432-016-2256-7

Xu, S., Kong, D., Chen, Q., Ping, Y., & Pang, D. (2017). Oncogenic long noncoding RNA landscape in breast cancer. Mol Cancer, 16(1), 129. doi:10.1186/s12943-017-0696-6

Xu, S., Yi, X. M., Tang, C. P., Ge, J. P., Zhang, Z. Y., & Zhou, W. Q. (2016). Long non-coding RNA ATB promotes growth and epithelial-mesenchymal transition and predicts poor prognosis in human prostate carcinoma. Oncol Rep, 36(1), 10-22. doi:10.3892/or.2016.4791

Xu, S. T., Xu, J. H., Zheng, Z. R., Zhao, Q. Q., Zeng, X. S., Cheng, S. X., Hu, Q. F. (2017). Long non-coding RNA ANRIL promotes carcinogenesis via sponging miR-199a in triple-negative breast cancer. Biomed Pharmacother, 96, 14-21. doi:10.1016/j.biopha.2017.09.107

Xue, D., Lu, H., Xu, H. Y., Zhou, C. X., & He, X. Z. (2018). Long noncoding RNA MALAT1 enhances the docetaxel resistance of prostate cancer cells via miR-145-5p-mediated regulation of AKAP12. J Cell Mol Med. doi:10.1111/jcmm.13604

Xue, D., Zhou, C. X., Shi, Y. B., Lu, H., & He, X. Z. (2015). MD-miniRNA could be a more accurate biomarker for prostate cancer screening compared with serum prostate-specific antigen level. Tumour Biol, 36(5), 3541-3547. doi:10.1007/s13277-014-2990-x

Xue, X., Yang, Y. A., Zhang, A., Fong, K. W., Kim, J., Song, B., Yu, J. (2016). LncRNA HOTAIR enhances ER signaling and confers tamoxifen resistance in breast cancer. Oncogene, 35(21), 2746-2755. doi:10.1038/onc.2015.340

Xue, Z., Hennelly, S., Doyle, B., Gulati, A. A., Novikova, I. V., Sanbonmatsu, K. Y., & Boyer, L. A. (2016). A G-Rich Motif in the lncRNA Braveheart Interacts with a Zinc-Finger Transcription Factor to Specify the Cardiovascular Lineage. Mol Cell, 64(1), 37-50. doi:10.1016/j.molcel.2016.08.010

Yan, L., Cai, K., Liang, J., Liu, H., Liu, Y., & Gui, J. (2017). Interaction between miR-572 and PPP2R2C, and their effects on the proliferation, migration, and invasion of nasopharyngeal carcinoma (NPC) cells. Biochemistry and Cell Biology, 95(5), 578-584. doi:10.1139/bcb-2016-0237

Yan, M. D., Hong, C. C., Lai, G. M., Cheng, A. L., Lin, Y. W., & Chuang, S. E. (2005). Identification and characterization of a novel gene Saf transcribed from the opposite strand of Fas. Hum Mol Genet, 14(11), 1465-1474. doi:10.1093/hmg/ddi156

Yang, L., Lin, C., Jin, C., Yang, J. C., Tanasa, B., Li, W., Rosenfeld, M. G. (2013). lncRNA-dependent mechanisms of androgen-receptor-regulated gene activation programs. Nature, 500(7464), 598-602. doi:10.1038/nature12451

Yang, X. D., Xu, H. T., Xu, X. H., Ru, G., Liu, W., Zhu, J. J., Li, M. (2016). Knockdown of long non-coding RNA HOTAIR inhibits proliferation and invasiveness and improves radiosensitivity in colorectal cancer. Oncol Rep, 35(1), 479-487. doi:10.3892/or.2015.4397

Yang, Y., Su, Z., Song, X., Liang, B., Zeng, F., Chang, X., & Huang, D. (2016). Enhancer RNA-driven looping enhances the transcription of the long noncoding RNA DHRS4-AS1, a controller of the DHRS4 gene cluster. Sci Rep, 6, 20961. doi:10.1038/srep20961

Yap, K. L., Li, S., Munoz-Cabello, A. M., Raguz, S., Zeng, L., Mujtaba, S., . . . Zhou, M. M. (2010). Molecular interplay of the noncoding RNA ANRIL and methylated histone H3 lysine 27 by polycomb CBX7 in transcriptional silencing of INK4a. Mol Cell, 38(5), 662-674. doi:10.1016/j.molcel.2010.03.021

Yin, L., Hu, Q., & Hartmann, R. W. (2013). Recent progress in pharmaceutical therapies for castration-resistant prostate cancer. Int J Mol Sci, 14(7), 13958-13978. doi:10.3390/ijms140713958

Yoo, K. H., & Hennighausen, L. (2012). EZH2 methyltransferase and H3K27 methylation in breast cancer. Int J Biol Sci, 8(1), 59-65.

Yoon, J. H., You, B. H., Park, C. H., Kim, Y. J., Nam, J. W., & Lee, S. K. (2017). The long noncoding RNA LUCAT1 promotes tumorigenesis by controlling ubiquitination and stability of DNA methyltransferase 1 in esophageal squamous cell carcinoma. Cancer Lett. doi:10.1016/j.canlet.2017.12.016

Yu, C. J., Yang, P. C., Shun, C. T., Lee, Y. C., Kuo, S. H., & Luh, K. T. (1996). Overexpression of MUC5 genes is associated with early post-operative metastasis in non-small-cell lung cancer. Int J Cancer, 69(6), 457-465. doi:10.1002/(SICI)1097-0215(19961220)69:6<457::AID-IJC7>3.0.CO;2-3

Yu, J., Yu, J., Rhodes, D. R., Tomlins, S. A., Cao, X., Chen, G., Chinnaiyan, A. M. (2007). A polycomb repression signature in metastatic prostate cancer predicts cancer outcome. Cancer Res, 67(22), 10657-10663. doi:10.1158/0008-5472.CAN-07-2498

Yu, L., Gong, X., Sun, L., Zhou, Q., Lu, B., & Zhu, L. (2016). The circular RNA Cdr1as act as an oncogene in hepatocellular carcinoma through targeting miR-7 expression. PLoS One, 11(7), e0158347. doi:10.1371/journal.pone.0158347

Yu, X., & Li, Z. (2015). Long non-coding RNA HOTAIR: A novel oncogene (Review). Mol Med Rep, 12(4), 5611-5618. doi:10.3892/mmr.2015.4161

Zhang, A., Zhao, J. C., Kim, J., Fong, K. W., Yang, Y. A., Chakravarti, D., Yu, J. (2015). LncRNA HOTAIR Enhances the Androgen-Receptor-Mediated Transcriptional Program and Drives Castration-Resistant Prostate Cancer. Cell Rep, 13(1), 209-221. doi:10.1016/j.celrep.2015.08.069

Zhang, E., Li, W., Yin, D., De, W., Sun, S., & Han, L. (2015). c-Myc-regulated long non-coding RNA H19 indicates a poor prognosis and affects cell proliferation in non-small-cell lung cancer. Tumour Biol. doi:10.1007/s13277-015-4185-5

Zhang, H., & Zhu, J. K. (2014). Emerging roles of RNA processing factors in regulating long non-coding RNAs. RNA Biol, 11(7), 793-797. doi:10.4161/rna.29731

Zhang, R., Xia, Y., Wang, Z., Zheng, J., Chen, Y., Li, X., Ming, H. (2017). Serum long non coding RNA MALAT-1 protected by exosomes is up-regulated and promotes cell proliferation and migration in non-small cell lung cancer. Biochem Biophys Res Commun, 490(2), 406-414. doi:10.1016/j.bbrc.2017.06.055

Zhang, W., Ren, S. C., Shi, X. L., Liu, Y. W., Zhu, Y. S., Jing, T. L., Sun, Y. H. (2015). A novel urinary long non-coding RNA transcript improves diagnostic accuracy in patients undergoing prostate biopsy. Prostate, 75(6), 653-661. doi:10.1002/pros.22949

Zhang, X., He, X., Liu, Y., Zhang, H., Chen, H., Guo, S., & Liang, Y. (2017). MiR-101-3p inhibits the growth and metastasis of non-small cell lung cancer through blocking PI3K/AKT signal pathway by targeting MALAT-1. Biomed Pharmacother, 93, 1065-1073. doi:10.1016/j.biopha.2017.07.005

Zhang, Y., Su, X., Kong, Z., Fu, F., Zhang, P., Wang, D., Li, Y. (2017). An androgen reduced transcript of LncRNA GAS5 promoted prostate cancer proliferation. PLoS One, 12(8), e0182305. doi:10.1371/journal.pone.0182305

Zhang, Z. (2016). Long non-coding RNAs in Alzheimer's disease. Curr Top Med Chem, 16(5), 511-519.

Zhang, Z., Zhou, N., Huang, J., Ho, T. T., Zhu, Z., Qiu, Z., Mo, Y. Y. (2016). Regulation of androgen receptor splice variant AR3 by PCGEM1. Oncotarget, 7(13), 15481-15491. doi:10.18632/oncotarget.7139

Zhao, B., Yang, Y., Hu, L. B., Bai, Y., Li, R. Q., Zhang, G. Y., . . . Lu, Y. L. (2017). Overexpression of lncRNA ANRIL promoted the proliferation and migration of prostate cancer cells via regulating let-7a/TGF-beta1/Smad signaling pathway. Cancer biomarkers : section A of Disease markers. doi:10.3233/cbm-170683

Zhao, R., Sun, F., Bei, X., Wang, X., Zhu, Y., Jiang, C., Xia, S. (2017). Upregulation of the long non-coding RNA FALEC promotes proliferation and migration of prostate cancer cell lines and predicts prognosis of PCa patients. Prostate, 77(10), 1107-1117. doi:10.1002/pros.23367

Zhao, S. G., Evans, J. R., Kothari, V., Sun, G., Larm, A., Mondine, V., Feng, F. Y. (2016). The Landscape of Prognostic Outlier Genes in High-Risk Prostate Cancer. Clin Cancer Res, 22(7), 1777-1786. doi:10.1158/1078-0432.CCR-15-1250

Zhao, Y., Shuen, T. W. H., Toh, T. B., Chan, X. Y., Liu, M., Tan, S. Y., Chen, Q. (2018). Development of a new patient-derived xenograft humanised mouse model to study human-specific tumour microenvironment and immunotherapy. Gut. doi:10.1136/gutjnl-2017-315201

Zhong, D. N., Luo, Y. H., Mo, W. J., Zhang, X., Tan, Z., Zhao, N., Tang, W. (2018). High expression of long noncoding HOTAIR correlated with hepatocarcinogenesis and metastasis. Mol Med Rep, 17(1), 1148-1156. doi:10.3892/mmr.2017.7999

Zhou, X., Liu, S., Cai, G., Kong, L., Zhang, T., Ren, Y., Wang, X. (2015). Long non coding RNA MALAT1 promotes tumor growth and metastasis by inducing epithelial-mesenchymal transition in oral squamous cell carcinoma. Sci Rep, 5, 15972. doi:10.1038/srep15972

Zhou, Y., Li, Y., Li, X., & Jiang, M. (2017). Urinary biomarker panel to improve accuracy in predicting prostate biopsy result in Chinese men with PSA 4-10 ng/mL. Biomed Res Int, 2017, 2512536. doi:10.1155/2017/2512536

Zhu, H., Zhu, X., Zheng, L., Hu, X., Sun, L., & Zhu, X. (2016). The role of the androgen receptor in ovarian cancer carcinogenesis and its clinical implications. Oncotarget. doi:10.18632/oncotarget.12561

Zhu, L., Hu, Z., Liu, J., Gao, J., & Lin, B. (2015). Gene expression profile analysis identifies metastasis and chemoresistance-associated genes in epithelial ovarian carcinoma cells. Med Oncol, 32(1), 426. doi:10.1007/s12032-014-0426-5

Zhu, L., Liu, J., Ma, S., & Zhang, S. (2015). Long noncoding RNA MALAT-1 can predict metastasis and a poor prognosis: a meta-analysis. Pathol Oncol Res, 21(4), 1259-1264. doi:10.1007/s12253-015-9960-5

Zhu, S., Soutto, M., Chen, Z., Peng, D., Romero-Gallo, J., Krishna, U. S., El-Rifai, W. (2017). Helicobacter pylori-induced cell death is counteracted by NF-kappaB-mediated transcription of DARPP-32. Gut, 66(5), 761-762. doi:10.1136/gutjnl-2016-312141

Zou, Y., Zhong, Y., Wu, J., Xiao, H., Zhang, X., Liao, X., Zhang, F. (2017). Long non-coding PANDAR as a novel biomarker in human cancer: A systematic review. Cell Prolif. doi:10.1111/cpr.12422

Zuker, M., & Stiegler, P. (1981). Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information. Nucleic Acids Res, 9(1), 133-148.

Chapter 9

Appendix

SUPPLEMENTARY FIGURES AND TABLES

Chapter 3 related tables and figures
• Supplementary Table 1. Overview of human Affymetrix exon array data sets interrogated.

Chapter 4 related tables and figures
• Supplementary Table 2. Differentially expressed genes in PC3-GHSROS cells.
• Supplementary Figure 1. Effects of GHSROS overexpression perturbation in cultured cells assessed by qRT-PCR.
• Supplementary Figure 2. Effects of GHSROS knockdown perturbation in cultured cells assessed by qRT-PCR.
• Supplementary Table 3. Enrichment for GO terms in the category ‘biological process’ for genes upregulated in PC3-GHSROS cells (compared to empty-vector control).
• Supplementary Table 4. Enrichment for GO terms in the category ‘biological process’ for genes downregulated in PC3-GHSROS cells (compared to empty-vector control).
• Supplementary Table 5. Oncomine concepts analysis of positively and negatively correlated PC3-GHSROS gene signature.
• Supplementary Table 6. Differentially expressed genes in PC3-GHSROS cells compared to the Grasso Oncomine data set.
• Supplementary Table 7. Differentially expressed genes in PC3-GHSROS cells compared to the Taylor Oncomine data set.
• Supplementary Table 8. Overall survival (OS) analysis of TCGA patients using the 34-gene signature.

Chapter 5 related tables and figures
• Supplementary Table 9. Differentially expressed genes in LNCaP-GHSROS cells.
• Supplementary Table 10. Enrichment for GO terms in the category ‘biological process’ for genes upregulated in LNCaP-GHSROS cells (compared to empty-vector control).
• Supplementary Dataset 11. Enrichment for GO terms in the category ‘biological process’ for genes downregulated in LNCaP-GHSROS cells (compared to empty-vector control).

Chapter 6 related tables and figures
• Supplementary Table 13. Differentially expressed genes in MDA-MB-231-GHSROS cells.

Supplementary Table 1. Overview of human Affymetrix exon array data sets interrogated.

Resource | Unique ID | Tissue/cell type | Type | n
ArrayExpress | E-MEXP-2644 | lung (18 benign and 18 cancer) | tissues | 36
ArrayExpress | E-MEXP-3931 | THP1 (acute monocytic leukemia) | cell lines | 12
ArrayExpress | E-MTAB-1273 | induced pluripotent stem (iPS) cells derived from glioblastoma-derived neural stem cells | primary cells | 16
ArrayExpress | E-MTAB-2471 | large B-cell lymphoma | tissues | 16
Affymetrix web site | goo.gl/rBWrFv | breast, cerebellum, heart, kidney, liver, muscle, pancreas, prostate, spleen, testes, thyroid, mixture | tissues | 53
Affymetrix web site | goo.gl/Yack5K | colon cancer (10 benign and 10 cancer) | tissues | 20
GEO | GSE11967 | thymus (4 benign and 4 cancer) | tissues | 8
GEO | GSE16732 | breast (cancer) | cell lines | 41
GEO | GSE18927 | NIH Epigenomics Roadmap Initiative (stem cells and primary ex vivo tissues) | tissues and primary cells | 99
GEO | GSE19090 | ENCODE Project Consortium (84 cell lines and primary cells) | cell lines and primary cells | 182
GEO | GSE19891 | HeLa (cervical cancer) | cell lines | 15
GEO | GSE20342 | MCF7 (breast cancer) | cell lines | 32
GEO | GSE20567 | HL60, THP-1, U937 (myeloid leukemia) | cell lines | 17
GEO | GSE21034 | prostate (cancer) | tissues | 310
GEO | GSE21163 | pancreas (1 benign and 6 cancer) | cell lines | 22
GEO | GSE21337 | acute myeloid leukemia | tissues | 64
GEO | GSE21840 | MCF7 (breast cancer) | cell lines | 6
GEO | GSE23361 | lung (cancer) | tissues | 12
GEO | GSE23514 | HeLa S3 (cervical cancer) | cell lines | 12
GEO | GSE23768 | breast, lung, ovarian and prostate cancer | tissues | 153
GEO | GSE24778 | K562 (chronic myelogenous leukemia) | cell lines | 10
GEO | GSE29682 | breast, Central Nervous System, colon, leukemia, melanoma, lung, ovary, prostate, kidney (cancer) | cell lines | 178
GEO | GSE29778 | HEK293 (embryonic kidney) | cell lines | 12
GEO | GSE30472 | brain (cancer; glioma) | tissues | 55
GEO | GSE30521 | prostate (benign and cancer) | tissues | 23
GEO | GSE30727 | stomach (cancer) | tissues | 60
GEO | GSE32875 | LNCaP (prostate cancer) | cell lines | 8
GEO | GSE37138 | lung (cancer) | tissues | 117
GEO | GSE40871 | acute myeloid leukemia | primary cells | 67
GEO | GSE43107 | brain (cancer; glioma) | tissues | 95
GEO | GSE43754 | bone marrow stem and progenitor cells (chronic myeloid leukemia) | cells | 20
GEO | GSE43830 | WI38 (fetal lung fibroblasts) | cell lines | 6
GEO | GSE45379 | HeLa (cervical cancer) | cell lines | 6
GEO | GSE46691 | prostate (cancer) | tissues | 545
GEO | GSE47032 | kidney (cancer) | tissues | 40
GEO | GSE53405 | MCF10A (benign) | cell lines | 26
GEO | GSE57076 | THP1 (acute monocytic leukemia) | cell lines | 7
GEO | GSE57933 | bladder (cancer) | tissues | 199
GEO | GSE58598 | breast (cancer) | tissues | 10
GEO | GSE62116 | prostate (cancer) | tissues | 235
GEO | GSE62667 | prostate (cancer) | tissues | 182
GEO | GSE67312 | bladder (cancer) | primary xenografts | 10
GEO | GSE68591 | sarcoma (84; cancer) and 5 benign | cell lines | 75
GEO | GSE71010 | neutrophils (cystic fibrosis and healthy controls) | cells | 93
GEO | GSE72291 | prostate (cancer) | tissues | 139
GEO | GSE78246 | brain (schizophrenia, bipolar disorder, major depressive disorder, and controls) | tissues | 20
GEO | GSE79956 | prostate (cancer) | tissues | 211
GEO | GSE79957 | prostate (cancer) | tissues | 260
GEO | GSE80683 | prostate (cancer) | tissues | 17
GEO | GSE9342 | T-cell acute lymphoblastic leukaemia | cell lines | 17
GEO | GSE9385 | brain (26 glioblastomas, 22 oligodendrogliomas and 6 control brain samples) | tissues | 55

Supplementary Table 2. Differentially expressed genes in PC3-GHSROS cells. Compared to empty vector control. Red: higher expression in PC3-GHSROS cells; Black: lower expression in PC3-GHSROS cells. Fold-changes are log2 transformed; Q-value denotes the false discovery rate (FDR; Benjamini-Hochberg)-adjusted P-value (cutoff ≤0.05).
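
As a reading aid for the fold-change and Q-value columns, the following is a minimal Python sketch of the Benjamini-Hochberg adjustment and of converting a log2 fold-change back to a linear expression ratio. The function names and the example P-values are hypothetical and are not taken from the thesis analysis pipeline. The Q-values listed in the table cannot be recomputed from the rows shown, because the adjustment depends on the P-values of every gene tested, not only those passing the cutoff.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values (Q-values) in input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that Q-values never
    # increase as the P-values decrease (monotonicity step).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


def log2_fold_to_ratio(log2_fc):
    """Convert a log2 fold-change to a linear expression ratio."""
    return 2.0 ** log2_fc


# Hypothetical P-values, for illustration only.
example_p = [0.01, 0.04, 0.03, 0.005]
example_q = benjamini_hochberg(example_p)
significant = [q <= 0.05 for q in example_q]
```

Under this convention a log2 fold-change of 2.0 corresponds to a four-fold increase relative to the empty-vector control, and -1.0 to a halving.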

log2 Fold Gene Symbol Gene Name ID P-value Q-value Change AADACP1 arylacetamide deacetylase pseudogene 1 201651 -1.9 8.2E-08 1.3E-06 AASS aminoadipate-semialdehyde synthase 10157 -2.4 2.5E-08 5.3E-07 AATK apoptosis associated tyrosine kinase 9625 1.8 5.1E-08 8.9E-07 ABCC3 ATP binding cassette subfamily C member 3 8714 1.8 7.7E-13 7.1E-10 ABCG1 ATP binding cassette subfamily G member 1 9619 2.1 1.7E-09 7.5E-08 ACHE acetylcholinesterase (Cartwright blood group) 43 2.5 4.3E-10 3.0E-08 ACSS1 acyl-CoA synthetase short-chain family member 1 84532 -1.9 7.4E-11 9.6E-09 ADAM23 ADAM metallopeptidase domain 23 8745 -2.7 5.3E-10 3.5E-08 ADAM8 ADAM metallopeptidase domain 8 101 2.7 5.5E-14 1.5E-10 ADD2 adducin 2 119 -3.2 2.1E-09 8.8E-08 ADSSL1 adenylosuccinate synthase like 1 122622 1.5 6.2E-08 1.0E-06 AFF2 AF4/FMR2 family member 2 2334 -3.9 1.1E-11 3.0E-09 anterior gradient 2, protein disulphide isomerase family AGR2 10551 2.1 3.9E-12 1.7E-09 member AGTR1 angiotensin II receptor type 1 185 -2.1 3.6E-08 7.0E-07 AIF1L allograft inflammatory factor 1 like 83543 1.6 2.7E-08 5.6E-07 AMOT angiomotin 154796 -3.1 2.3E-11 4.7E-09 ANGPT1 angiopoietin 1 284 -2.8 1.0E-08 2.7E-07 ANGPTL4 angiopoietin like 4 51129 1.6 2.0E-08 4.5E-07 ANK1 ankyrin 1 286 1.5 9.3E-06 5.5E-05 ANK2 ankyrin 2, neuronal 287 -3.1 2.4E-09 9.5E-08 ANOS1 anosmin 1 3730 1.9 6.1E-09 1.8E-07 ANXA10 annexin A10 11199 -2.3 1.1E-09 5.7E-08 apolipoprotein B mRNA editing enzyme catalytic subunit APOBEC3G 60489 2.0 3.3E-08 6.5E-07 3G AR androgen receptor 367 -3.6 1.4E-09 6.6E-08 ARHGAP44 Rho GTPase activating protein 44 9912 -2.1 1.6E-08 3.8E-07 ARHGDIB Rho GDP dissociation inhibitor beta 397 -3.4 5.3E-11 7.7E-09 ARHGEF16 Rho guanine nucleotide exchange factor 16 27237 2.4 7.8E-10 4.6E-08 ARNT2 aryl hydrocarbon receptor nuclear translocator 2 9915 -1.6 1.0E-07 1.5E-06 ASS1 argininosuccinate synthase 1 445 1.8 2.2E-08 4.8E-07 B3GALT5 beta-1,3-galactosyltransferase 5 10317 -2.0 2.1E-06 1.7E-05 B3GALT5-AS1 B3GALT5 antisense RNA 1 
114041 -2.2 6.1E-09 1.8E-07 UDP-GlcNAc:betaGal beta-1,3-N- B3GNT7 93010 1.6 4.2E-09 1.4E-07 acetylglucosaminyltransferase 7 B4GALNT1 beta-1,4-N-acetyl-galactosaminyltransferase 1 2583 -1.8 8.8E-11 1.1E-08 BCAT1 branched chain amino acid transaminase 1 586 -3.9 1.5E-11 3.4E-09 BEX2 brain expressed X-linked 2 84707 -1.8 4.2E-07 4.4E-06 BEX4 brain expressed X-linked 4 56271 -1.8 1.1E-08 2.9E-07 BEX5 brain expressed X-linked 5 340542 -1.7 8.1E-07 7.5E-06 BIK BCL2 interacting killer 638 1.8 2.1E-06 1.6E-05 BLMH bleomycin hydrolase 642 -1.8 6.6E-13 6.5E-10 BMP6 bone morphogenetic protein 6 654 -2.3 5.1E-09 1.6E-07 BTBD11 BTB domain containing 11 121551 -3.1 5.7E-10 3.6E-08 BTG3 BTG family member 3 10950 -2.0 1.1E-10 1.3E-08

198

C11orf45 | chromosome 11 open reading frame 45 | 219833 | -1.8 | 2.0E-08 | 4.5E-07
C20orf166-AS1 | C20orf166 antisense RNA 1 | 253868 | 1.8 | 3.0E-07 | 3.5E-06
C9orf152 | chromosome 9 open reading frame 152 | 401546 | 1.8 | 5.3E-07 | 5.4E-06
CA9 | carbonic anhydrase 9 | 768 | 2.5 | 3.1E-08 | 6.2E-07
CACNA2D2 | calcium voltage-gated channel auxiliary subunit alpha2delta 2 | 9254 | -1.6 | 2.6E-08 | 5.5E-07
CADM4 | cell adhesion molecule 4 | 199731 | 2.3 | 8.8E-10 | 5.0E-08
CADPS2 | calcium dependent secretion activator 2 | 93664 | -2.0 | 8.9E-12 | 2.6E-09
CALB1 | calbindin 1 | 793 | 3.3 | 2.1E-10 | 1.9E-08
CAMK2N1 | calcium/calmodulin dependent protein kinase II inhibitor 1 | 55450 | 3.3 | 4.1E-12 | 1.7E-09
CAPN6 | calpain 6 | 827 | -2.5 | 3.3E-08 | 6.5E-07
CAV1 | caveolin 1 | 857 | 1.6 | 5.2E-13 | 6.0E-10
CBLC | Cbl proto-oncogene C | 23624 | 1.8 | 2.8E-07 | 3.3E-06
CCBE1 | collagen and calcium binding EGF domains 1 | 147372 | 1.5 | 1.5E-08 | 3.7E-07
CCDC160 | coiled-coil domain containing 160 | 347475 | -2.5 | 1.6E-08 | 3.8E-07
CCNB3 | cyclin B3 | 85417 | -2.1 | 5.3E-08 | 9.2E-07
CD24 | CD24 molecule | 100133941 | 1.5 | 2.9E-12 | 1.5E-09
CD33 | CD33 molecule | 945 | -2.0 | 5.7E-08 | 9.7E-07
CD70 | CD70 molecule | 970 | -2.1 | 6.3E-07 | 6.2E-06
CDH1 | cadherin 1 | 999 | 2.2 | 4.1E-08 | 7.8E-07
CDH12 | cadherin 12 | 1010 | -1.6 | 8.8E-10 | 5.0E-08
CDH3 | cadherin 3 | 1001 | 2.0 | 1.5E-07 | 2.1E-06
CEACAM6 | carcinoembryonic antigen related cell adhesion molecule 6 | 4680 | 1.7 | 6.2E-07 | 6.1E-06
CEND1 | cell cycle exit and neuronal differentiation 1 | 51286 | -2.0 | 2.4E-07 | 2.9E-06
CHD7 | chromodomain helicase DNA binding protein 7 | 55636 | -1.6 | 1.1E-06 | 9.6E-06
CHRDL1 | chordin-like 1 | 91851 | -3.0 | 5.4E-10 | 3.5E-08
CKB | creatine kinase B | 1152 | 1.8 | 2.6E-11 | 5.0E-09
CLDN7 | claudin 7 | 1366 | 2.1 | 1.4E-07 | 1.9E-06
CLMN | calmin (calponin-like, transmembrane) | 79789 | -2.4 | 4.9E-09 | 1.6E-07
CNKSR2 | connector enhancer of kinase suppressor of Ras 2 | 22866 | -2.4 | 9.2E-09 | 2.5E-07
CNTN1 | contactin 1 | 1272 | -4.4 | 4.2E-11 | 6.6E-09
COBL | cordon-bleu WH2 repeat protein | 23242 | -4.6 | 1.0E-11 | 2.9E-09
COL21A1 | collagen type XXI alpha 1 chain | 81578 | -1.9 | 1.2E-07 | 1.7E-06
COL5A1 | collagen type V alpha 1 | 1289 | 2.1 | 1.9E-09 | 8.2E-08
COX7B2 | cytochrome c oxidase subunit 7B2 | 170712 | 3.8 | 1.8E-09 | 7.6E-08
CPA6 | carboxypeptidase A6 | 57094 | -2.3 | 1.5E-08 | 3.5E-07
CPEB1 | cytoplasmic polyadenylation element binding protein 1 | 64506 | -2.5 | 5.4E-11 | 7.7E-09
CPZ | carboxypeptidase Z | 8532 | 1.6 | 1.7E-09 | 7.5E-08
CRABP1 | cellular retinoic acid binding protein 1 | 1381 | 4.1 | 1.1E-11 | 3.0E-09
CRABP2 | cellular retinoic acid binding protein 2 | 1382 | 2.4 | 1.1E-12 | 7.8E-10
CREB3L1 | cAMP responsive element binding protein 3 like 1 | 90993 | 4.0 | 2.4E-13 | 3.4E-10
CRIP1 | cysteine rich protein 1 | 1396 | 2.2 | 1.4E-10 | 1.5E-08
CRIP2 | cysteine rich protein 2 | 1397 | 2.3 | 9.6E-12 | 2.8E-09
CSMD2 | CUB and Sushi multiple domains 2 | 114784 | -2.5 | 7.9E-09 | 2.2E-07
CST1 | cystatin SN | 1469 | -3.3 | 1.3E-10 | 1.4E-08
CST4 | cystatin S | 1472 | -2.6 | 1.8E-09 | 7.6E-08
CXADR | coxsackie virus and adenovirus receptor | 1525 | -2.8 | 4.0E-11 | 6.5E-09
CXADRP2 | coxsackie virus and adenovirus receptor pseudogene 2 | 646243 | -1.8 | 3.5E-07 | 3.9E-06
CXCL5 | C-X-C motif chemokine ligand 5 | 6374 | 1.5 | 1.0E-06 | 9.0E-06
CXorf57 | chromosome X open reading frame 57 | 55086 | -3.0 | 4.7E-10 | 3.2E-08
CYP4F35P | cytochrome P450 family 4 subfamily F member 35, pseudogene | 284233 | -1.5 | 1.4E-07 | 2.0E-06
CYP4V2 | cytochrome P450 family 4 subfamily V member 2 | 285440 | -2.6 | 2.3E-09 | 9.1E-08
CYP7B1 | cytochrome P450 family 7 subfamily B member 1 | 9420 | -2.3 | 2.2E-08 | 4.8E-07
DEPTOR | DEP domain containing MTOR-interacting protein | 64798 | -1.7 | 6.7E-08 | 1.1E-06
DGKG | diacylglycerol kinase gamma | 1608 | -2.7 | 1.0E-10 | 1.2E-08
DIRAS1 | DIRAS family GTPase 1 | 148252 | 2.2 | 2.8E-10 | 2.2E-08
DLX3 | distal-less homeobox 3 | 1747 | -1.6 | 1.7E-06 | 1.4E-05
DMD | dystrophin | 1756 | -3.0 | 4.0E-10 | 2.8E-08


DNAH5 | dynein axonemal heavy chain 5 | 1767 | -1.7 | 3.3E-09 | 1.2E-07
DNAJA4 | DnaJ heat shock protein family (Hsp40) member A4 | 55466 | -2.0 | 1.8E-08 | 4.1E-07
DNAJC6 | DnaJ heat shock protein family (Hsp40) member C6 | 9829 | -2.4 | 4.6E-10 | 3.1E-08
DOCK3 | dedicator of cytokinesis 3 | 1795 | -1.5 | 9.4E-09 | 2.6E-07
DPY19L2P1 | DPY19L2 pseudogene 1 | 554236 | -1.9 | 1.2E-06 | 1.1E-05
DTX4 | deltex 4, E3 ubiquitin ligase | 23220 | -1.8 | 1.1E-08 | 2.8E-07
DYSF | dysferlin | 8291 | -2.0 | 5.9E-11 | 8.3E-09
EDA | ectodysplasin A | 1896 | -3.0 | 6.2E-09 | 1.8E-07
EGF | epidermal growth factor | 1950 | -2.5 | 9.1E-10 | 5.1E-08
EGLN3 | egl-9 family hypoxia inducible factor 3 | 112399 | -1.8 | 1.3E-07 | 1.8E-06
EHD2 | EH domain containing 2 | 30846 | 2.3 | 9.7E-09 | 2.6E-07
EHF | ETS homologous factor | 26298 | 1.9 | 8.5E-09 | 2.4E-07
ENTPD3 | ectonucleoside triphosphate diphosphohydrolase 3 | 956 | -1.8 | 1.1E-08 | 2.8E-07
EOMES | | 8320 | -2.1 | 1.2E-09 | 6.1E-08
EPHB2 | EPH receptor B2 | 2048 | -1.5 | 2.6E-10 | 2.1E-08
EPHB6 | EPH receptor B6 | 2051 | 1.7 | 1.4E-10 | 1.5E-08
ESM1 | endothelial cell specific molecule 1 | 11082 | -1.8 | 4.3E-11 | 6.7E-09
EYA1 | EYA transcriptional coactivator and phosphatase 1 | 2138 | -3.5 | 1.5E-10 | 1.6E-08
F2RL2 | coagulation factor II thrombin receptor like 2 | 2151 | -1.6 | 2.0E-07 | 2.6E-06
FAM110C | family with sequence similarity 110 member C | 642273 | -3.0 | 1.6E-10 | 1.6E-08
FAM131B | family with sequence similarity 131 member B | 9715 | 1.5 | 1.7E-11 | 3.6E-09
FAM134B | family with sequence similarity 134 member B | 54463 | -1.5 | 6.3E-08 | 1.0E-06
FAM20A | family with sequence similarity 20 member A | 54757 | 2.0 | 1.3E-07 | 1.9E-06
FAM50B | family with sequence similarity 50 member B | 26240 | 1.7 | 8.5E-08 | 1.3E-06
FAM89A | family with sequence similarity 89 member A | 375061 | -1.6 | 5.1E-08 | 8.9E-07
FBP1 | fructose-bisphosphatase 1 | 2203 | 2.5 | 3.1E-11 | 5.6E-09
FBXL16 | F-box and leucine rich repeat protein 16 | 146330 | 2.0 | 1.4E-08 | 3.5E-07
FBXL7 | F-box and leucine rich repeat protein 7 | 23194 | -1.7 | 1.6E-08 | 3.7E-07
FCGBP | Fc fragment of IgG binding protein | 8857 | -3.1 | 1.5E-10 | 1.6E-08
FEZF1-AS1 | FEZF1 antisense RNA 1 | 154860 | -2.0 | 1.1E-06 | 9.8E-06
FGF13 | fibroblast growth factor 13 | 2258 | -1.8 | 1.7E-08 | 4.0E-07
FNDC4 | fibronectin type III domain containing 4 | 64838 | 2.0 | 4.2E-08 | 7.9E-07
FOXL1 | forkhead box L1 | 2300 | 1.6 | 4.1E-10 | 2.9E-08
FOXRED2 | FAD dependent oxidoreductase domain containing 2 | 80020 | 3.6 | 2.3E-12 | 1.5E-09
FRMD4B | FERM domain containing 4B | 23150 | -1.9 | 5.5E-09 | 1.7E-07
GAL | galanin and GMAP prepropeptide | 51083 | -3.2 | 1.5E-09 | 6.7E-08
GALNT12 | polypeptide N-acetylgalactosaminyltransferase 12 | 79695 | 2.1 | 9.9E-10 | 5.3E-08
GALNT14 | polypeptide N-acetylgalactosaminyltransferase 14 | 79623 | -1.6 | 1.3E-08 | 3.3E-07
GALNT16 | polypeptide N-acetylgalactosaminyltransferase 16 | 57452 | -1.7 | 1.2E-06 | 1.0E-05
GAS6 | growth arrest specific 6 | 2621 | 1.6 | 2.1E-09 | 8.8E-08
GBP7 | guanylate binding protein 7 | 388646 | -1.5 | 2.9E-06 | 2.1E-05
GCNT1 | glucosaminyl (N-acetyl) transferase 1, core 2 | 2650 | -1.6 | 1.4E-07 | 1.9E-06
GCNT3 | glucosaminyl (N-acetyl) transferase 3, mucin type | 9245 | -2.4 | 6.8E-09 | 2.0E-07
GDF15 | growth differentiation factor 15 | 9518 | 1.5 | 1.5E-08 | 3.6E-07
GHSROS | Growth Hormone Secretagogue Receptor Opposite Strand | - | 5.3 | 3.4E-13 | 4.3E-10
GJB3 | gap junction protein beta 3 | 2707 | 1.7 | 6.0E-10 | 3.8E-08
GNAI1 | G protein subunit alpha i1 | 2770 | 1.5 | 1.6E-10 | 1.6E-08
GPR153 | G protein-coupled receptor 153 | 387509 | 1.6 | 5.7E-13 | 6.1E-10
GPR63 | G protein-coupled receptor 63 | 81491 | -2.3 | 2.5E-09 | 9.7E-08
GRID2 | glutamate ionotropic receptor delta type subunit 2 | 2895 | 1.7 | 1.5E-07 | 2.1E-06
GRIN2D | glutamate ionotropic receptor NMDA type subunit 2D | 2906 | 2.0 | 1.6E-08 | 3.7E-07
GRTP1 | growth hormone regulated TBC protein 1 | 79774 | 2.1 | 6.0E-08 | 1.0E-06
GUCA1B | guanylate cyclase activator 1B | 2979 | 1.5 | 6.7E-07 | 6.5E-06
HEPH | hephaestin | 9843 | -2.4 | 2.4E-09 | 9.3E-08
HR | hair growth associated | 55806 | 2.2 | 5.3E-09 | 1.7E-07
HRASLS | HRAS like suppressor | 57110 | -1.9 | 2.7E-07 | 3.2E-06
HSPA12A | heat shock protein family A (Hsp70) member 12A | 259217 | -3.7 | 2.7E-11 | 5.1E-09
HSPB8 | heat shock protein family B (small) member 8 | 26353 | -3.5 | 9.9E-10 | 5.3E-08
HTR7 | 5-hydroxytryptamine receptor 7 | 3363 | -2.0 | 2.4E-07 | 2.9E-06
HTRA3 | HtrA serine peptidase 3 | 94031 | -1.8 | 1.5E-08 | 3.6E-07
IFI16 | interferon gamma inducible protein 16 | 3428 | -2.2 | 9.4E-13 | 7.5E-10
IFITM1 | interferon induced transmembrane protein 1 | 8519 | 1.6 | 3.6E-06 | 2.5E-05


IGFBP5 | insulin like growth factor binding protein 5 | 3488 | 1.6 | 2.4E-12 | 1.5E-09
IGFBP6 | insulin like growth factor binding protein 6 | 3489 | 1.9 | 3.8E-11 | 6.3E-09
IL13RA2 | interleukin 13 receptor subunit alpha 2 | 3598 | -3.0 | 6.4E-10 | 4.0E-08
ITGB2 | integrin subunit beta 2 | 3689 | 2.2 | 6.1E-11 | 8.4E-09
ITGB4 | integrin subunit beta 4 | 3691 | 1.9 | 9.2E-13 | 7.5E-10
ITM2A | integral membrane protein 2A | 9452 | -2.7 | 7.3E-10 | 4.4E-08
JAM3 | junctional adhesion molecule 3 | 83700 | -1.8 | 1.3E-08 | 3.3E-07
JUP | junction plakoglobin | 3728 | 1.5 | 5.9E-12 | 2.0E-09
KCNJ12 | potassium voltage-gated channel subfamily J member 12 | 3768 | -2.2 | 4.3E-09 | 1.4E-07
KCNJ3 | potassium voltage-gated channel subfamily J member 3 | 3760 | 1.5 | 2.4E-05 | 1.2E-04
KCNN3 | potassium calcium-activated channel subfamily N member 3 | 3782 | -2.3 | 5.9E-07 | 5.9E-06
KIAA0319 | KIAA0319 | 9856 | -3.6 | 4.6E-11 | 6.9E-09
KIAA1210 | KIAA1210 | 57481 | -1.7 | 2.9E-07 | 3.4E-06
KIAA1211 | KIAA1211 | 57482 | -3.8 | 2.7E-12 | 1.5E-09
KIAA1644 | KIAA1644 | 85352 | 2.0 | 2.6E-07 | 3.1E-06
KIF26A | kinesin family member 26A | 26153 | -1.7 | 4.5E-08 | 8.3E-07
KIF5C | kinesin family member 5C | 3800 | -2.7 | 1.6E-11 | 3.5E-09
KLF9 | Kruppel like factor 9 | 687 | -3.3 | 2.5E-10 | 2.1E-08
KRT6B | keratin 6B | 3854 | -2.2 | 1.7E-08 | 3.9E-07
KRT7 | keratin 7 | 3855 | 2.6 | 1.9E-14 | 6.7E-11
KRT75 | keratin 75 | 9119 | -5.5 | 3.5E-16 | 2.5E-12
KRT81 | keratin 81 | 3887 | 1.5 | 3.1E-12 | 1.5E-09
LAMA1 | laminin subunit alpha 1 | 284217 | -2.1 | 1.6E-08 | 3.8E-07
LAMC3 | laminin subunit gamma 3 | 10319 | -1.7 | 2.1E-09 | 8.5E-08
LARGE2 | LARGE xylosyl- and glucuronyltransferase 2 | 120071 | 1.8 | 4.3E-08 | 8.0E-07
LCK | LCK proto-oncogene, Src family tyrosine kinase | 3932 | 1.5 | 2.6E-06 | 2.0E-05
LCP1 | lymphocyte cytosolic protein 1 | 3936 | -2.3 | 3.4E-12 | 1.6E-09
LEF1 | lymphoid enhancer binding factor 1 | 51176 | -1.5 | 6.7E-07 | 6.5E-06
LGR5 | leucine rich repeat containing G protein-coupled receptor 5 | 8549 | -1.7 | 3.9E-09 | 1.3E-07
LIMCH1 | LIM and calponin homology domains 1 | 22998 | 2.2 | 6.1E-10 | 3.8E-08
LIMS2 | LIM zinc finger domain containing 2 | 55679 | 1.7 | 3.1E-09 | 1.2E-07
LINC00346 | long intergenic non-protein-coding RNA 346 | 283487 | 1.5 | 8.0E-07 | 7.4E-06
LINC01133 | long intergenic non-protein-coding RNA 1133 | 100505633 | 1.5 | 4.1E-06 | 2.8E-05
LITAF | lipopolysaccharide induced TNF factor | 9516 | 3.1 | 1.1E-10 | 1.3E-08
LOC100506123 | uncharacterized LOC100506123 | 100506123 | -1.6 | 1.4E-09 | 6.7E-08
LOC101927870 | uncharacterized LOC101927870 | 101927870 | 1.5 | 5.2E-09 | 1.6E-07
LOXL4 | lysyl oxidase like 4 | 84171 | 1.7 | 4.4E-12 | 1.8E-09
LRCH2 | leucine rich repeats and calponin homology domain containing 2 | 57631 | -2.4 | 3.7E-09 | 1.3E-07
LRIG1 | leucine rich repeats and immunoglobulin like domains 1 | 26018 | -2.0 | 1.6E-08 | 3.7E-07
LRP4 | LDL receptor related protein 4 | 4038 | -1.5 | 1.8E-05 | 9.3E-05
LRRN2 | leucine rich repeat neuronal 2 | 10446 | -1.5 | 8.3E-07 | 7.7E-06
MAGEH1 | MAGE family member H1 | 28986 | -1.9 | 1.1E-08 | 2.8E-07
MAP2 | microtubule associated protein 2 | 4133 | -1.5 | 3.0E-08 | 6.0E-07
MAP3K15 | mitogen-activated protein kinase kinase kinase 15 | 389840 | -1.5 | 8.5E-09 | 2.3E-07
MARCH3 | membrane associated ring-CH-type finger 3 | 115123 | 1.5 | 2.2E-06 | 1.7E-05
MARK1 | microtubule affinity regulating kinase 1 | 4139 | -2.9 | 2.7E-10 | 2.2E-08
MCOLN3 | mucolipin 3 | 55283 | -1.8 | 2.1E-06 | 1.7E-05
MCTP2 | multiple C2 domains, transmembrane 2 | 55784 | -3.0 | 3.1E-10 | 2.3E-08
MEGF6 | multiple EGF like domains 6 | 1953 | 2.0 | 4.0E-12 | 1.7E-09
MEST | mesoderm specific transcript | 4232 | 1.7 | 7.0E-09 | 2.0E-07
MFAP3L | microfibrillar associated protein 3 like | 9848 | -2.5 | 1.3E-09 | 6.3E-08
MIR31HG | MIR31 host gene | 554202 | 1.7 | 2.4E-09 | 9.3E-08
MISP | mitotic spindle positioning | 126353 | 1.7 | 1.0E-10 | 1.2E-08
MLLT11 | myeloid/lymphoid or mixed-lineage leukemia; translocated to, 11 | 10962 | -3.2 | 3.1E-16 | 2.5E-12
MMRN1 | multimerin 1 | 22915 | -1.8 | 1.1E-08 | 2.9E-07
MN1 | meningioma (disrupted in balanced translocation) 1 | 4330 | -2.6 | 1.2E-10 | 1.3E-08
MSX2 | msh homeobox 2 | 4488 | -2.1 | 1.3E-08 | 3.3E-07
MTSS1 | metastasis suppressor 1 | 9788 | -1.5 | 4.2E-07 | 4.4E-06
MUC2 | mucin 2, oligomeric mucus/gel-forming | 4583 | 2.0 | 1.3E-08 | 3.3E-07
MUC3A | mucin 3A, cell surface associated | 4584 | 3.7 | 7.5E-12 | 2.4E-09


MUC5B | mucin 5B, oligomeric mucus/gel-forming | 727897 | 2.8 | 2.3E-09 | 9.2E-08
MUM1L1 | MUM1 like 1 | 139221 | -3.5 | 1.1E-09 | 5.6E-08
MYT1 | myelin transcription factor 1 | 4661 | -1.6 | 4.8E-06 | 3.2E-05
NACAD | NAC alpha domain containing | 23148 | 2.9 | 4.0E-09 | 1.4E-07
NAP1L2 | nucleosome assembly protein 1 like 2 | 4674 | -1.7 | 1.4E-11 | 3.3E-09
NAP1L6 | nucleosome assembly protein 1 like 6 | 645996 | -1.8 | 2.0E-06 | 1.6E-05
NAV3 | neuron navigator 3 | 89795 | -1.6 | 4.3E-07 | 4.6E-06
NBEAP1 | neurobeachin pseudogene 1 | 606 | -1.8 | 6.9E-07 | 6.6E-06
NCR3LG1 | natural killer cell cytotoxicity receptor 3 ligand 1 | 374383 | -1.7 | 8.9E-06 | 5.3E-05
NEK3 | NIMA related kinase 3 | 4752 | 2.0 | 1.4E-08 | 3.4E-07
NELL2 | neural EGFL like 2 | 4753 | -4.3 | 2.6E-12 | 1.5E-09
NEO1 | neogenin 1 | 4756 | -1.5 | 3.2E-06 | 2.3E-05
NFASC | neurofascin | 23114 | -2.3 | 7.1E-11 | 9.5E-09
NLGN1 | neuroligin 1 | 22871 | -1.7 | 1.9E-06 | 1.5E-05
NLRC5 | NLR family CARD domain containing 5 | 84166 | -1.5 | 3.9E-07 | 4.2E-06
NOG | noggin | 9241 | -1.5 | 1.3E-09 | 6.4E-08
NPR3 | natriuretic peptide receptor 3 | 4883 | 1.6 | 1.2E-07 | 1.7E-06
NPY1R | neuropeptide Y receptor Y1 | 4886 | 1.9 | 3.0E-11 | 5.4E-09
NRXN3 | neurexin 3 | 9369 | -1.9 | 1.2E-07 | 1.7E-06
NTSR1 | neurotensin receptor 1 (high affinity) | 4923 | 4.0 | 1.4E-14 | 6.3E-11
NUDT10 | nudix hydrolase 10 | 170685 | -2.6 | 9.7E-10 | 5.3E-08
NUDT11 | nudix hydrolase 11 | 55190 | -4.9 | 2.4E-13 | 3.4E-10
OASL | 2'-5'-oligoadenylate synthetase like | 8638 | -1.6 | 1.8E-10 | 1.6E-08
OCLN | occludin | 100506658 | 1.7 | 4.0E-09 | 1.4E-07
OPN3 | opsin 3 | 23596 | 1.5 | 5.9E-09 | 1.8E-07
OSBP2 | oxysterol binding protein 2 | 23762 | 1.8 | 1.5E-12 | 1.0E-09
OXTR | oxytocin receptor | 5021 | 1.7 | 7.0E-12 | 2.3E-09
PALD1 | phosphatase domain containing, paladin 1 | 27143 | -2.2 | 2.0E-08 | 4.5E-07
PALM2 | paralemmin 2 | 114299 | -2.1 | 5.0E-07 | 5.2E-06
PAQR8 | progestin and adipoQ receptor family member 8 | 85315 | 1.7 | 3.1E-09 | 1.1E-07
PARM1 | prostate androgen-regulated mucin-like protein 1 | 25849 | -4.4 | 3.2E-12 | 1.5E-09
PARP8 | poly(ADP-ribose) polymerase family member 8 | 79668 | -1.9 | 4.0E-09 | 1.4E-07
PCDH19 | protocadherin 19 | 57526 | -1.7 | 2.0E-06 | 1.6E-05
PCDHGB2 | protocadherin gamma subfamily B, 2 | 56103 | 1.7 | 3.4E-07 | 3.8E-06
PCSK5 | proprotein convertase subtilisin/kexin type 5 | 5125 | -1.9 | 2.4E-07 | 2.9E-06
PCSK9 | proprotein convertase subtilisin/kexin type 9 | 255738 | 1.7 | 4.8E-08 | 8.7E-07
PDGFA | platelet derived growth factor subunit A | 5154 | 1.6 | 1.9E-08 | 4.3E-07
PDLIM2 | PDZ and LIM domain 2 | 64236 | 1.9 | 8.0E-09 | 2.3E-07
PELI2 | pellino E3 ubiquitin protein ligase family member 2 | 57161 | -1.6 | 2.3E-06 | 1.8E-05
PI3 | peptidase inhibitor 3 | 5266 | 1.7 | 5.8E-10 | 3.7E-08
PIANP | PILR alpha associated neural protein | 196500 | 1.6 | 8.1E-07 | 7.5E-06
PLAT | plasminogen activator, tissue type | 5327 | 1.8 | 1.9E-13 | 3.4E-10
PLBD1 | phospholipase B domain containing 1 | 79887 | -1.5 | 4.1E-06 | 2.8E-05
PLCH2 | phospholipase C eta 2 | 9651 | 1.7 | 8.4E-10 | 4.8E-08
PLD5 | phospholipase D family member 5 | 200150 | -2.1 | 4.5E-08 | 8.2E-07
PLEKHG6 | pleckstrin homology and RhoGEF domain containing G6 | 55200 | 1.5 | 3.3E-06 | 2.3E-05
PLLP | plasmolipin | 51090 | 1.5 | 9.7E-07 | 8.7E-06
PLXNA4 | plexin A4 | 91584 | 2.6 | 1.9E-07 | 2.4E-06
POF1B | premature ovarian failure, 1B | 79983 | -1.5 | 1.6E-06 | 1.3E-05
POU3F2 | POU class 3 homeobox 2 | 5454 | -3.6 | 2.7E-11 | 5.1E-09
PPL | periplakin | 5493 | 2.2 | 1.7E-10 | 1.6E-08
PPP1R1B | protein phosphatase 1 regulatory inhibitor subunit 1B | 84152 | 3.2 | 5.9E-10 | 3.7E-08
PPP1R3G | protein phosphatase 1 regulatory subunit 3G | 648791 | 2.3 | 4.3E-09 | 1.4E-07
PPP2R2C | protein phosphatase 2 regulatory subunit Bgamma | 5522 | -4.9 | 1.9E-13 | 3.4E-10
PREX2 | phosphatidylinositol-3,4,5-trisphosphate dependent Rac exchange factor 2 | 80243 | -1.6 | 8.3E-07 | 7.7E-06
PRKXP1 | protein kinase, X-linked, pseudogene 1 | 441733 | -1.5 | 6.6E-06 | 4.2E-05
PRR15 | proline rich 15 | 222171 | 1.8 | 4.8E-08 | 8.6E-07
PTGER2 | prostaglandin E receptor 2 | 5732 | 1.9 | 1.5E-09 | 6.9E-08
PTGFRN | prostaglandin F2 receptor inhibitor | 5738 | -1.6 | 4.6E-10 | 3.2E-08
PTGS2 | prostaglandin-endoperoxide synthase 2 | 5743 | 1.7 | 4.9E-07 | 5.1E-06
PTPN20 | protein tyrosine phosphatase, non-receptor type 20 | 26095 | 1.8 | 5.1E-08 | 9.0E-07
PTPRG | protein tyrosine phosphatase, receptor type G | 5793 | -2.5 | 5.3E-09 | 1.7E-07


RAB39B | RAB39B, member RAS oncogene family | 116442 | -2.1 | 9.3E-08 | 1.4E-06
RASD1 | ras related dexamethasone induced 1 | 51655 | 2.4 | 4.9E-09 | 1.6E-07
RASEF | RAS and EF-hand domain containing | 158158 | -1.8 | 3.6E-07 | 4.0E-06
RASL11A | RAS like family 11 member A | 387496 | 1.9 | 6.8E-11 | 9.2E-09
RBM11 | RNA binding motif protein 11 | 54033 | -3.2 | 1.4E-10 | 1.5E-08
RBP4 | retinol binding protein 4 | 5950 | 1.6 | 1.3E-11 | 3.2E-09
RCOR2 | REST corepressor 2 | 283248 | 1.6 | 1.3E-06 | 1.1E-05
REG4 | regenerating family member 4 | 83998 | 2.2 | 5.1E-09 | 1.6E-07
RFTN1 | raftlin, lipid raft linker 1 | 23180 | -1.8 | 6.6E-09 | 1.9E-07
RGAG4 | retrotransposon gag domain containing 4 | 340526 | -1.6 | 5.0E-06 | 3.3E-05
RGCC | regulator of cell cycle | 28984 | 1.7 | 3.3E-06 | 2.3E-05
RHOU | ras homolog family member U | 58480 | 1.7 | 1.5E-07 | 2.0E-06
RIMKLB | ribosomal modification protein rimK-like family member B | 57494 | -1.6 | 1.5E-08 | 3.6E-07
RNASEL | ribonuclease L | 6041 | -2.4 | 1.4E-09 | 6.6E-08
RNF128 | ring finger protein 128, E3 ubiquitin protein ligase | 79589 | -3.8 | 1.5E-11 | 3.5E-09
RNF152 | ring finger protein 152 | 220441 | -1.5 | 2.7E-06 | 2.0E-05
ROBO4 | roundabout guidance receptor 4 | 54538 | -2.1 | 2.0E-11 | 4.1E-09
RPS6KA2 | ribosomal protein S6 kinase A2 | 6196 | 1.5 | 6.3E-08 | 1.0E-06
RRAGD | Ras related GTP binding D | 58528 | -3.0 | 8.0E-11 | 1.0E-08
RUNDC3B | RUN domain containing 3B | 154661 | -1.8 | 2.8E-06 | 2.0E-05
RYR2 | ryanodine receptor 2 | 6262 | -2.9 | 3.2E-09 | 1.2E-07
S100A2 | S100 calcium binding protein A2 | 6273 | 2.1 | 5.4E-11 | 7.7E-09
S100A9 | S100 calcium binding protein A9 | 6280 | -1.5 | 8.9E-06 | 5.3E-05
S100P | S100 calcium binding protein P | 6286 | 1.9 | 9.6E-07 | 8.6E-06
SCIN | scinderin | 85477 | 1.5 | 7.2E-07 | 6.8E-06
SCN3A | sodium voltage-gated channel alpha subunit 3 | 6328 | -1.9 | 2.4E-08 | 5.1E-07
SEMA3B | semaphorin 3B | 7869 | 1.9 | 3.4E-10 | 2.5E-08
SEMA6B | semaphorin 6B | 10501 | 2.3 | 5.3E-10 | 3.5E-08
SERPINA1 | serpin family A member 1 | 5265 | -2.1 | 3.2E-08 | 6.4E-07
SERPINB2 | serpin family B member 2 | 5055 | -2.9 | 2.4E-11 | 4.7E-09
SERTAD4 | SERTA domain containing 4 | 56256 | -1.6 | 2.3E-06 | 1.7E-05
SESN3 | sestrin 3 | 143686 | -2.6 | 5.4E-10 | 3.5E-08
SGK494 | uncharacterized serine/threonine-protein kinase SgK494 | 124923 | -1.5 | 3.1E-08 | 6.2E-07
SH2D1B | SH2 domain containing 1B | 117157 | -2.2 | 1.5E-08 | 3.5E-07
SIRPB1 | signal regulatory protein beta 1 | 10326 | -2.3 | 6.3E-11 | 8.7E-09
SIRPB2 | signal regulatory protein beta 2 | 284759 | -1.6 | 3.2E-06 | 2.3E-05
SLC16A9 | solute carrier family 16 member 9 | 220963 | 1.7 | 4.0E-11 | 6.5E-09
SLC1A3 | solute carrier family 1 member 3 | 6507 | -3.7 | 4.2E-11 | 6.6E-09
SLC25A25-AS1 | SLC25A25 antisense RNA 1 | 100289019 | -1.5 | 3.2E-08 | 6.3E-07
SLC26A10 | solute carrier family 26 member 10 | 65012 | -1.8 | 1.6E-10 | 1.6E-08
SLC26A5 | solute carrier family 26 member 5 | 375611 | 1.5 | 1.4E-05 | 7.6E-05
SLC29A4 | solute carrier family 29 member 4 | 222962 | 1.6 | 2.6E-08 | 5.4E-07
SLC38A11 | solute carrier family 38 member 11 | 151258 | 1.7 | 2.7E-07 | 3.2E-06
SLC44A5 | solute carrier family 44 member 5 | 204962 | -2.1 | 6.3E-08 | 1.1E-06
SLC47A1 | solute carrier family 47 member 1 | 55244 | 1.8 | 2.6E-07 | 3.1E-06
SLC6A11 | solute carrier family 6 member 11 | 6538 | -2.3 | 4.9E-08 | 8.8E-07
SLC7A8 | solute carrier family 7 member 8 | 23428 | -3.4 | 4.0E-10 | 2.8E-08
SLC9A2 | solute carrier family 9 member A2 | 6549 | -3.6 | 2.1E-11 | 4.2E-09
SLFN13 | schlafen family member 13 | 146857 | -2.3 | 2.9E-10 | 2.2E-08
SLITRK4 | SLIT and NTRK like family member 4 | 139065 | -1.8 | 1.7E-09 | 7.5E-08
SOD3 | superoxide dismutase 3, extracellular | 6649 | 1.7 | 7.5E-09 | 2.1E-07
SPDEF | SAM pointed domain containing ETS transcription factor | 25803 | 1.5 | 2.3E-10 | 2.0E-08
SPOCK3 | sparc/osteonectin, cwcv and kazal-like domains proteoglycan (testican) 3 | 50859 | 1.5 | 7.9E-08 | 1.3E-06
SPTB | spectrin beta, erythrocytic | 6710 | -1.7 | 1.0E-09 | 5.5E-08
SRPX | sushi repeat containing protein, X-linked | 8406 | -1.5 | 1.2E-09 | 6.1E-08
ST6GAL1 | ST6 beta-galactoside alpha-2,6-sialyltransferase 1 | 6480 | -3.8 | 2.8E-12 | 1.5E-09
ST6GAL2 | ST6 beta-galactoside alpha-2,6-sialyltransferase 2 | 84620 | 2.0 | 9.2E-09 | 2.5E-07
ST6GALNAC2 | ST6 N-acetylgalactosaminide alpha-2,6-sialyltransferase 2 | 10610 | 2.4 | 1.6E-09 | 7.2E-08
STARD8 | StAR related lipid transfer domain containing 8 | 9754 | -1.8 | 1.7E-07 | 2.2E-06
STAT6 | signal transducer and activator of transcription 6 | 6778 | 1.9 | 1.3E-11 | 3.2E-09
STC1 | stanniocalcin 1 | 6781 | -1.5 | 4.5E-08 | 8.3E-07
STMN3 | stathmin 3 | 50861 | 1.5 | 3.8E-08 | 7.3E-07


STOX2 | storkhead box 2 | 56977 | -2.6 | 5.3E-09 | 1.7E-07
SUSD3 | sushi domain containing 3 | 203328 | 1.8 | 1.9E-07 | 2.5E-06
SYT14 | synaptotagmin 14 | 255928 | -2.2 | 1.0E-07 | 1.5E-06
TACSTD2 | tumour-associated calcium signal transducer 2 | 4070 | -2.0 | 3.0E-09 | 1.1E-07
TAGLN3 | transgelin 3 | 29114 | -1.5 | 5.2E-09 | 1.6E-07
TBC1D30 | TBC1 domain family member 30 | 23329 | -1.8 | 7.1E-08 | 1.2E-06
TCN1 | transcobalamin 1 | 6947 | -5.4 | 1.7E-13 | 3.4E-10
TDO2 | tryptophan 2,3-dioxygenase | 6999 | -2.0 | 1.3E-06 | 1.1E-05
TENM1 | teneurin transmembrane protein 1 | 10178 | -1.8 | 7.8E-09 | 2.2E-07
TFF1 | trefoil factor 1 | 7031 | 3.6 | 5.2E-12 | 1.9E-09
TFF2 | trefoil factor 2 | 7032 | 3.4 | 3.4E-10 | 2.5E-08
TGFBI | transforming growth factor beta induced | 7045 | -1.6 | 2.2E-07 | 2.7E-06
THBS1 | thrombospondin 1 | 7057 | 1.5 | 5.5E-09 | 1.7E-07
TLR3 | toll like receptor 3 | 7098 | -1.9 | 2.1E-07 | 2.6E-06
TMC6 | transmembrane channel like 6 | 11322 | 1.7 | 2.4E-09 | 9.3E-08
TMEM108-AS1 | TMEM108 antisense RNA 1 | 101927455 | -3.1 | 9.8E-11 | 1.2E-08
TMEM229B | transmembrane protein 229B | 161145 | -1.5 | 8.4E-08 | 1.3E-06
TMEM27 | transmembrane protein 27 | 57393 | -1.7 | 4.4E-11 | 6.8E-09
TMEM74 | transmembrane protein 74 | 157753 | -1.8 | 1.7E-07 | 2.2E-06
TMOD2 | tropomodulin 2 | 29767 | -2.3 | 3.0E-10 | 2.3E-08
TMPRSS15 | transmembrane protease, serine 15 | 5651 | 2.4 | 2.8E-08 | 5.7E-07
TMPRSS3 | transmembrane protease, serine 3 | 64699 | 2.1 | 3.3E-07 | 3.8E-06
TMX4 | thioredoxin related transmembrane protein 4 | 56255 | -1.7 | 8.7E-12 | 2.6E-09
TNFRSF11B | tumour necrosis factor receptor superfamily member 11b | 4982 | -1.9 | 8.9E-08 | 1.4E-06
TNFRSF14 | tumour necrosis factor receptor superfamily member 14 | 8764 | 1.7 | 1.0E-08 | 2.7E-07
TNFSF9 | tumour necrosis factor superfamily member 9 | 8744 | -1.5 | 1.5E-07 | 2.0E-06
TNXB | tenascin XB | 7148 | 2.6 | 2.8E-10 | 2.2E-08
TP53I11 | tumour protein p53 inducible protein 11 | 9537 | 2.5 | 1.0E-08 | 2.7E-07
TPTE | transmembrane phosphatase with tensin homology | 7179 | -2.1 | 2.4E-08 | 5.1E-07
TSPEAR | thrombospondin type laminin G domain and EAR repeats | 54084 | 1.9 | 3.8E-08 | 7.2E-07
TSPOAP1 | TSPO associated protein 1 | 9256 | 1.7 | 4.7E-07 | 4.9E-06
TTC3P1 | tetratricopeptide repeat domain 3 pseudogene 1 | 286495 | -1.8 | 9.6E-08 | 1.5E-06
TTC9 | tetratricopeptide repeat domain 9 | 23508 | 1.6 | 6.7E-07 | 6.5E-06
UBE2QL1 | ubiquitin conjugating enzyme E2 Q family like 1 | 134111 | -2.4 | 1.3E-11 | 3.2E-09
UNC80 | unc-80 homolog, NALCN activator | 285175 | -3.6 | 3.0E-11 | 5.4E-09
VASH1 | vasohibin 1 | 22846 | -1.6 | 2.8E-10 | 2.2E-08
VAV3 | vav guanine nucleotide exchange factor 3 | 10451 | -1.6 | 2.5E-07 | 3.0E-06
VSTM2L | V-set and transmembrane domain containing 2 like | 128434 | 2.0 | 1.2E-08 | 3.1E-07
VWA5A | von Willebrand factor A domain containing 5A | 4013 | -3.4 | 1.5E-10 | 1.6E-08
WDR72 | WD repeat domain 72 | 256764 | -3.2 | 3.7E-11 | 6.2E-09
WNT2B | Wnt family member 2B | 7482 | -1.5 | 2.1E-06 | 1.7E-05
YPEL2 | yippee like 2 | 388403 | -1.5 | 1.7E-07 | 2.3E-06
ZFPM2 | zinc finger protein, FOG family member 2 | 23414 | -3.8 | 5.1E-12 | 1.9E-09
ZG16B | zymogen granule protein 16B | 124220 | 1.6 | 8.2E-08 | 1.3E-06
ZNF334 | zinc finger protein 334 | 55713 | 3.1 | 4.5E-09 | 1.5E-07
ZNF415 | zinc finger protein 415 | 55786 | 2.5 | 2.0E-10 | 1.8E-08
ZNF43 | zinc finger protein 43 | 7594 | 2.0 | 2.5E-07 | 3.0E-06
ZNF467 | zinc finger protein 467 | 168544 | 3.5 | 1.7E-10 | 1.6E-08
ZNF470 | zinc finger protein 470 | 388566 | 2.5 | 5.3E-09 | 1.7E-07
ZNF521 | zinc finger protein 521 | 25925 | -1.5 | 7.8E-07 | 7.3E-06
ZNF585B | zinc finger protein 585B | 92285 | -2.3 | 2.9E-08 | 5.9E-07
ZNF607 | zinc finger protein 607 | 84775 | 2.4 | 3.0E-09 | 1.1E-07
ZNF702P | zinc finger protein 702, pseudogene | 79986 | 2.2 | 3.2E-09 | 1.2E-07
ZNF818P | zinc finger protein 818, pseudogene | 390963 | 2.0 | 1.5E-07 | 2.0E-06
ZNF85 | zinc finger protein 85 | 7639 | 3.9 | 1.6E-11 | 3.5E-09


Supplementary Figure 1. Effects of GHSROS overexpression perturbation in cultured cells assessed by qRT-PCR. qRT-PCR validation of genes regulated by GHSROS. Expression was normalized to the housekeeping gene RPL32. Results are relative to the respective vector control. Coloured bars indicate individual genes. Genes that were not expressed are represented as NE (no expression). Mean ± s.e.m., n = 3; *P ≤ 0.05, **P ≤ 0.01, ***P ≤ 0.001, Student’s t-test.


Supplementary Figure 2. Effects of GHSROS knockdown perturbation in cultured cells assessed by qRT-PCR. qRT-PCR validation of regulated genes following knockdown of GHSROS by transfection with LNA-ASOs for 48 hours. Expression was normalized to the housekeeping gene RPL32. Results are relative to the respective scrambled control. Coloured bars indicate individual genes. Genes that were not expressed are represented as NE (no expression). Mean ± s.e.m., n = 3; *P ≤ 0.05, **P ≤ 0.01, ***P ≤ 0.001, Student’s t-test.
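
The normalisation described in these figure legends (target gene expression normalised to RPL32 and expressed relative to a control) can be illustrated with a minimal sketch. It assumes the widely used 2^-ΔΔCt method, which the legends do not name explicitly. The function names and all Ct values below are hypothetical and are not data from this thesis.

```python
import math
import statistics


def relative_expression(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Fold change of a target gene by the 2^-ddCt method (assumed here).

    ct_target / ct_ref: Ct of the target gene and of the reference gene
    (e.g. RPL32) in the treated sample; the *_ctrl arguments are the
    corresponding Ct values in the control sample.
    """
    delta_ct_sample = ct_target - ct_ref
    delta_ct_control = ct_target_ctrl - ct_ref_ctrl
    return 2.0 ** -(delta_ct_sample - delta_ct_control)


def mean_and_sem(values):
    """Mean and standard error of the mean for a list of replicates."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Hypothetical Ct values for n = 3 replicates (illustration only).
replicate_fold_changes = [
    relative_expression(24.0, 18.0, 26.0, 18.0),
    relative_expression(24.2, 18.1, 26.0, 18.0),
    relative_expression(23.9, 18.0, 26.1, 18.1),
]
mean_fc, sem_fc = mean_and_sem(replicate_fold_changes)
print(f"fold change = {mean_fc:.2f} +/- {sem_fc:.2f} (mean +/- s.e.m.)")
```

A fold change above 1 indicates higher expression than in the control; the significance test (Student's t-test in the legends) would be applied to the replicate values and is not reproduced here.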


Supplementary Table 3. Enrichment for GO terms in the category ‘biological process’ for genes upregulated in PC3-GHSROS cells (compared to empty-vector control). P ≤ 0.01, Fisher's exact test.

GO term | Description | Count | % | Gene Symbols | Fold Enrichment | Fisher Exact P-value
GO:0010669 | epithelial structure maintenance | 4 | 2.6 | MUC2, RBP4, MUC3A, TFF1 | 70 | 1.1E-07
GO:0030277 | maintenance of gastrointestinal epithelium | 4 | 2.6 | MUC2, RBP4, MUC3A, TFF1 | 70 | 1.1E-07
GO:0070482 | response to oxygen levels | 9 | 5.9 | PLAT, CAV1, CA9, PDGFA, OXTR, CD24, THBS1, SOD3, ANGPTL4 | 7 | 7.5E-06
GO:0009725 | response to hormone stimulus | 13 | 8.5 | RBP4, CAV1, PTGS2, PDGFA, FBP1, OXTR, NPY1R, ABCG1, CA9, PCSK9, CD24, TFF1, THBS1 | 4 | 4.4E-05
GO:0001666 | response to hypoxia | 8 | 5.2 | PLAT, CAV1, CA9, PDGFA, CD24, THBS1, SOD3, ANGPTL4 | 6 | 4.0E-05
GO:0022600 | digestive system process | 5 | 3.3 | MUC2, RBP4, MUC3A, OXTR, TFF1 | 16 | 1.6E-05
GO:0009719 | response to endogenous stimulus | 13 | 8.5 | RBP4, CAV1, PTGS2, PDGFA, FBP1, OXTR, NPY1R, ABCG1, CA9, PCSK9, CD24, TFF1, THBS1 | 3 | 1.2E-04
GO:0048545 | response to steroid hormone stimulus | 9 | 5.9 | CAV1, PTGS2, CA9, PDGFA, OXTR, TFF1, NPY1R, CD24, THBS1 | 5 | 8.6E-05
GO:0008285 | negative regulation of cell proliferation | 12 | 7.8 | MUC2, RBP4, CAV1, TP53I11, IFITM1, PTGS2, IGFBP6, SCIN, TNFRSF14, CD24, THBS1, IGFBP5 | 4 | 1.6E-04
GO:0051241 | negative regulation of multicellular organismal process | 8 | 5.2 | RBP4, CAV1, ACHE, PTGS2, PDGFA, PCSK9, CD24, THBS1 | 5 | 1.6E-04
GO:0042493 | response to drug | 9 | 5.9 | CAV1, PTGS2, CA9, PDGFA, LCK, OXTR, CDH1, CDH3, SLC47A1 | 4 | 2.1E-04
GO:0032355 | response to estradiol stimulus | 5 | 3.3 | PTGS2, PDGFA, OXTR, TFF1, NPY1R | 10 | 1.5E-04
GO:0007586 | digestion | 6 | 3.9 | MUC2, RBP4, MUC3A, TFF2, OXTR, TFF1 | 7 | 2.2E-04
GO:0042127 | regulation of cell proliferation | 17 | 11.1 | MUC2, RBP4, CAV1, PTGER2, TP53I11, PTGS2, IFITM1, CXCL5, PDGFA, CRIP2, IGFBP6, TNFRSF14, STAT6, SCIN, CD24, THBS1, IGFBP5 | 2 | 1.2E-03
GO:0010033 | response to organic substance | 16 | 10.5 | RBP4, CAV1, PTGS2, PDGFA, FBP1, OXTR, CDH1, NPY1R, ABCG1, STAT6, CA9, PCSK9, CREB3L1, CD24, TFF1, THBS1 | 2 | 1.3E-03
GO:0031644 | regulation of neurological system process | 7 | 4.6 | PLAT, ACHE, S100P, PTGS2, GRIN2D, OXTR, CALB1 | 5 | 6.2E-04


GO:0050730 | regulation of peptidyl-tyrosine phosphorylation | 5 | 3.3 | CAV1, PDGFA, TNFRSF14, ITGB2, CD24 | 8 | 4.5E-04
GO:0007267 | cell-cell signaling | 14 | 9.2 | PLAT, GUCA1B, ACHE, CXCL5, PDGFA, OXTR, ITGB2, NTSR1, GRIN2D, GRID2, CEACAM6, SEMA3B, CD24, GDF15 | 2 | 1.6E-03
GO:0042632 | cholesterol homeostasis | 4 | 2.6 | CAV1, PCSK9, CD24, ABCG1 | 11 | 4.4E-04
GO:0055092 | sterol homeostasis | 4 | 2.6 | CAV1, PCSK9, CD24, ABCG1 | 11 | 4.4E-04
GO:0007155 | cell adhesion | 15 | 9.8 | CLDN7, ACHE, CADM4, TNXB, ITGB4, ITGB2, CDH1, CDH3, PCDHGB2, COL5A1, JUP, CD24, ADAM8, THBS1, MUC5B | 2 | 2.5E-03
GO:0022610 | biological adhesion | 15 | 9.8 | CLDN7, ACHE, CADM4, TNXB, ITGB4, ITGB2, CDH1, CDH3, PCDHGB2, COL5A1, JUP, CD24, ADAM8, THBS1, MUC5B | 2 | 2.6E-03
GO:0045907 | positive regulation of vasoconstriction | 3 | 2.0 | CAV1, PTGS2, NPY1R | 24 | 2.2E-04
GO:0035249 | synaptic transmission, glutamatergic | 3 | 2.0 | PLAT, GRIN2D, GRID2 | 23 | 2.8E-04
GO:0044057 | regulation of system process | 9 | 5.9 | PLAT, CAV1, ACHE, S100P, PTGS2, GRIN2D, OXTR, NPY1R, CALB1 | 3 | 2.7E-03
GO:0048878 | chemical homeostasis | 12 | 7.8 | RBP4, CAV1, LCK, GRID2, PCSK9, OXTR, PLLP, NPY1R, CD24, ABCG1, TMPRSS3, CKB | 2 | 3.4E-03
GO:0050804 | regulation of synaptic transmission | 6 | 3.9 | PLAT, ACHE, S100P, PTGS2, OXTR, CALB1 | 5 | 1.8E-03
GO:0042592 | homeostatic process | 15 | 9.8 | RBP4, MUC2, CAV1, OXTR, NPY1R, TMPRSS3, ABCG1, CKB, MUC3A, LCK, GRID2, PCSK9, PLLP, CD24, TFF1 | 2 | 4.9E-03
GO:0055088 | lipid homeostasis | 4 | 2.6 | CAV1, PCSK9, CD24, ABCG1 | 8 | 1.4E-03
GO:0051969 | regulation of transmission of nerve impulse | 6 | 3.9 | PLAT, ACHE, S100P, PTGS2, OXTR, CALB1 | 4 | 2.7E-03
GO:0007610 | behavior | 11 | 7.2 | S100P, PTGS2, CXCL5, PDGFA, PPP1R1B, GRIN2D, OXTR, ITGB2, NPY1R, NTSR1, CALB1 | 2 | 4.9E-03
GO:0032570 | response to progesterone stimulus | 3 | 2.0 | CAV1, OXTR, THBS1 | 16 | 8.4E-04
GO:0016337 | cell-cell adhesion | 8 | 5.2 | JUP, CLDN7, CDH1, ITGB2, CD24, ADAM8, CDH3, PCDHGB2 | 3 | 4.7E-03
GO:0032101 | regulation of response to external stimulus | 6 | 3.9 | CAV1, PTGS2, PDGFA, GRID2, CD24, THBS1 | 4 | 4.0E-03
GO:0051050 | positive regulation of transport | 7 | 4.6 | RBP4, CAV1, ACHE, SCIN, PCSK9, OXTR, CDH1 | 3 | 5.3E-03


GO:0001894 | tissue homeostasis | 4 | 2.6 | MUC2, RBP4, MUC3A, TFF1 | 7 | 3.0E-03
GO:0048167 | regulation of synaptic plasticity | 4 | 2.6 | PLAT, S100P, PTGS2, CALB1 | 7 | 3.1E-03
GO:0033273 | response to vitamin | 4 | 2.6 | RBP4, PTGS2, PDGFA, MEST | 6 | 3.5E-03
GO:0050865 | regulation of cell activation | 6 | 3.9 | STAT6, PDGFA, LCK, TNFRSF14, CD24, THBS1 | 4 | 6.4E-03
GO:0006873 | cellular ion homeostasis | 9 | 5.9 | CAV1, LCK, GRID2, OXTR, PLLP, NPY1R, CD24, TMPRSS3, CKB | 3 | 9.1E-03
GO:0051480 | cytosolic calcium ion homeostasis | 5 | 3.3 | CAV1, LCK, OXTR, NPY1R, CD24 | 4 | 5.2E-03
GO:0007618 | mating | 3 | 2.0 | PPP1R1B, PI3, OXTR | 12 | 2.0E-03
GO:0055082 | cellular chemical homeostasis | 9 | 5.9 | CAV1, LCK, GRID2, OXTR, PLLP, NPY1R, CD24, TMPRSS3, CKB | 3 | 1.0E-02
GO:0001568 | blood vessel development | 7 | 4.6 | PLAT, CAV1, PDGFA, CCBE1, THBS1, COL5A1, ANGPTL4 | 3 | 8.7E-03
GO:0010648 | negative regulation of cell communication | 7 | 4.6 | CBLC, CAV1, ACHE, PTGS2, STMN3, THBS1, IGFBP5 | 3 | 9.3E-03
GO:0008544 | epidermis development | 6 | 3.9 | PTGS2, PDGFA, PPL, CRABP2, GJB3, COL5A1 | 3 | 8.1E-03
GO:0043588 | skin development | 3 | 2.0 | PDGFA, GJB3, COL5A1 | 11 | 2.5E-03
GO:0001944 | vasculature development | 7 | 4.6 | PLAT, CAV1, PDGFA, CCBE1, THBS1, COL5A1, ANGPTL4 | 3 | 9.9E-03
GO:0010038 | response to metal ion | 5 | 3.3 | CAV1, PTGS2, TFF1, THBS1, SOD3 | 4 | 7.6E-03
GO:0007270 | nerve-nerve synaptic transmission | 3 | 2.0 | PLAT, GRIN2D, GRID2 | 10 | 3.1E-03
GO:0006875 | cellular metal ion homeostasis | 6 | 3.9 | CAV1, LCK, OXTR, NPY1R, CD24, TMPRSS3 | 3 | 1.1E-02
GO:0044092 | negative regulation of molecular function | 8 | 5.2 | CBLC, CAV1, GNAI1, HR, PCSK9, NPY1R, NPR3, ANGPTL4 | 3 | 1.4E-02
GO:0031667 | response to nutrient levels | 6 | 3.9 | RBP4, CAV1, PTGS2, PDGFA, PCSK9, MEST | 3 | 1.1E-02
GO:0032526 | response to retinoic acid | 3 | 2.0 | RBP4, PDGFA, MEST | 10 | 3.7E-03
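
The fold enrichment and Fisher's exact P-value columns in the GO tables can be illustrated with a minimal sketch of a one-sided enrichment test based on the hypergeometric distribution. This is an illustration under stated assumptions, not the software used to produce these tables: the function name is hypothetical, the background and annotation counts are invented for the example, and some annotation tools report a modified (more conservative) variant of this test.

```python
from math import comb


def go_enrichment(k, n, K, N):
    """One-sided Fisher's exact test for over-representation of a GO term.

    k: genes in the query list annotated with the term
    n: size of the query list
    K: genes in the background annotated with the term
    N: size of the background
    Returns (fold_enrichment, p_value), where p_value = P(X >= k) for
    X drawn from a hypergeometric distribution with parameters N, K, n.
    """
    fold_enrichment = (k / n) / (K / N)
    total = comb(N, n)
    tail = sum(
        comb(K, i) * comb(N - K, n - i)
        for i in range(k, min(K, n) + 1)
    )
    return fold_enrichment, tail / total


# Hypothetical counts (illustration only, not values from this thesis):
# 4 of 150 query genes carry a term annotated to 10 of 20,000 background genes.
fold, p = go_enrichment(k=4, n=150, K=10, N=20000)
print(f"fold enrichment = {fold:.1f}, P = {p:.2e}")
```

Because the calculation uses exact integer binomial coefficients, it stays accurate for small tail probabilities, at the cost of speed for very large gene lists.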


Supplementary Table 4. Enrichment for GO terms in the category ‘biological process’ for genes downregulated in PC3-GHSROS cells (compared to empty-vector control). P ≤ 0.01, Fisher's exact test.

GO term | Description | Count | % | Gene Symbols | Fold Enrichment | Fisher Exact P-value
GO:0007155 | cell adhesion | 22 | 1.1 | MTSS1, COL21A1, ADAM23, NRXN3, LRRN2, NELL2, NLGN1, NFASC, LEF1, NEO1, MMRN1, CXADR, PCDH19, CDH12, LAMA1, SRPX, LAMC3, CD33, TGFBI, CNTN1, FCGBP, CXADRP2, EDA | 3 | 5.8E-06
GO:0022610 | biological adhesion | 22 | 1.1 | MTSS1, COL21A1, ADAM23, NRXN3, LRRN2, NELL2, NLGN1, NFASC, LEF1, NEO1, MMRN1, CXADR, PCDH19, CDH12, LAMA1, SRPX, LAMC3, CD33, TGFBI, CNTN1, FCGBP, CXADRP2, EDA | 3 | 6.0E-06
GO:0007267 | cell-cell signaling | 16 | 0.8 | AR, NRXN3, S100A9, NLGN1, CD70, FGF13, GAL, TNFSF9, SLC1A3, KCNN3, HTR7, CD33, DMD, TMOD2, STC1, PCSK5 | 2 | 7.6E-04
GO:0000904 | cell morphogenesis involved in differentiation | 9 | 0.4 | NOG, SLITRK4, SLC1A3, NRXN3, KIF5C, NFASC, EOMES, LEF1, EPHB2 | 3 | 1.3E-03
GO:0000902 | cell morphogenesis | 11 | 0.5 | LAMA1, NOG, SLITRK4, SLC1A3, NRXN3, DMD, KIF5C, NFASC, EOMES, LEF1, EPHB2 | 3 | 1.7E-03
GO:0001655 | urogenital system development | 6 | 0.3 | AGTR1, EYA1, AR, NOG, LEF1, PCSK5 | 5 | 1.2E-03
GO:0043009 | chordate embryonic development | 10 | 0.5 | EYA1, AR, NOG, CHD7, ARNT2, EOMES, LEF1, AMOT, ZFPM2, PCSK5 | 3 | 3.2E-03
GO:0009792 | embryonic development ending in birth or egg hatching | 10 | 0.5 | EYA1, AR, NOG, CHD7, ARNT2, EOMES, LEF1, AMOT, ZFPM2, PCSK5 | 3 | 3.4E-03
GO:0032989 | cellular component morphogenesis | 11 | 0.5 | LAMA1, NOG, SLITRK4, SLC1A3, NRXN3, DMD, KIF5C, NFASC, EOMES, LEF1, EPHB2 | 3 | 3.8E-03
GO:0030509 | BMP signaling pathway | 4 | 0.2 | MSX2, NOG, CHRDL1, BMP6 | 8 | 1.3E-03
GO:0006928 | cell motion | 12 | 0.6 | LAMA1, MTSS1, VAV3, NRXN3, KIF5C, S100A9, NFASC, AMOT, POU3F2, DNAH5, EPHB2, ARHGDIB | 2 | 5.4E-03
GO:0003013 | circulatory system process | 7 | 0.3 | AGTR1, CHD7, HTR7, RYR2, AMOT, KCNJ12, PCSK5 | 3 | 4.0E-03
GO:0008015 | blood circulation | 7 | 0.3 | AGTR1, CHD7, HTR7, RYR2, AMOT, KCNJ12, PCSK5 | 3 | 4.0E-03
GO:0048732 | gland development | 6 | 0.3 | EYA1, AR, NOG, LEF1, POU3F2, EDA | 4 | 3.4E-03
GO:0001837 | epithelial to mesenchymal transition | 3 | 0.1 | NOG, EOMES, LEF1 | 15 | 8.9E-04
GO:0021545 | cranial nerve development | 3 | 0.1 | SLC1A3, CHD7, EPHB2 | 15 | 1.1E-03
GO:0030182 | neuron differentiation | 11 | 0.5 | SLITRK4, SLC1A3, MCOLN3, NRXN3, DGKG, DMD, KIF5C, MAP2, NFASC, POU3F2, EPHB2 | 2 | 7.9E-03
GO:0001822 | kidney development | 5 | 0.2 | AGTR1, EYA1, NOG, LEF1, PCSK5 | 5 | 3.8E-03
GO:0001501 | skeletal system development | 9 | 0.4 | MSX2, EYA1, TNFRSF11B, NOG, CHD7, CHRDL1, STC1, PCSK5, BMP6 | 3 | 7.8E-03
GO:0035108 | limb morphogenesis | 5 | 0.2 | MSX2, NOG, CHD7, LEF1, PCSK5 | 5 | 4.3E-03
GO:0035107 | appendage morphogenesis | 5 | 0.2 | MSX2, NOG, CHD7, LEF1, PCSK5 | 5 | 4.3E-03
GO:0048736 | appendage development | 5 | 0.2 | MSX2, NOG, CHD7, LEF1, PCSK5 | 4 | 5.1E-03
GO:0060173 | limb development | 5 | 0.2 | MSX2, NOG, CHD7, LEF1, PCSK5 | 4 | 5.1E-03
GO:0043627 | response to estrogen stimulus | 5 | 0.2 | TNFRSF11B, ARNT2, ANGPT1, SERPINA1, GAL | 4 | 5.6E-03
GO:0021675 | nerve development | 3 | 0.1 | SLC1A3, CHD7, EPHB2 | 11 | 2.4E-03


Supplementary Table 5. Oncomine concepts analysis of positively and negatively correlated PC3-GHSROS gene signature. Red: positively correlated gene signature; Black: negatively correlated gene signature. P ≤ 0.01, Fisher's exact test.

Concept 1 ID | Concept 1 Name | Concept 2 ID | Concept 2 Name | P-value | Odds Ratio | Overlap Size
C41610 | PC3 GHSROS downregulated gene list | 17697 | Cancer Type: Prostate Cancer - Top 10% Over-expressed (Bittner Multi-cancer) | 2.17E-08 | 3.7 | 32
C41610 | PC3 GHSROS downregulated gene list | 122189617 | Prostate Cancer - Metastasis - Top 10% Under-expressed (Taylor Prostate 3) | 3.14E-06 | 3.1 | 28
C41610 | PC3 GHSROS downregulated gene list | 122210891 | Cancer Type: Prostate Cancer - Top 5% Under-expressed (Garnett CellLine) | 3.94E-06 | 4.5 | 16
C41610 | PC3 GHSROS downregulated gene list | 122208916 | Cancer Type: Prostate Cancer - Top 5% Under-expressed (Barretina CellLine) | 3.55E-05 | 3.5 | 17
C41610 | PC3 GHSROS downregulated gene list | 122213069 | Prostate Cancer - Metastasis - Top 5% Under-expressed (Grasso Prostate) | 9.84E-05 | 3.3 | 16
C41610 | PC3 GHSROS downregulated gene list | 28483 | Prostate Cancer - Metastasis - Top 5% Under-expressed (Varambally Prostate) | 4.13E-04 | 3 | 15
C41610 | PC3 GHSROS downregulated gene list | 23100 | Prostate Cancer - Metastasis - Top 10% Under-expressed (LaTulippe Prostate) | 4.20E-04 | 3 | 16
C41610 | PC3 GHSROS downregulated gene list | 28344 | Prostate Cancer - Metastasis - Top 10% Under-expressed (Vanaja Prostate) | 0.001 | 2.3 | 21
17697 | Cancer Type: Prostate Cancer - Top 10% Over-expressed (Bittner Multi-cancer) | 122189617 | Prostate Cancer - Metastasis - Top 10% Under-expressed (Taylor Prostate 3) | 1.55E-120 | 4.2 | 559
17697 | Cancer Type: Prostate Cancer - Top 10% Over-expressed (Bittner Multi-cancer) | 122213069 | Prostate Cancer - Metastasis - Top 5% Under-expressed (Grasso Prostate) | 1.33E-120 | 6.4 | 356
17697 | Cancer Type: Prostate Cancer - Top 10% Over-expressed (Bittner Multi-cancer) | 28483 | Prostate Cancer - Metastasis - Top 5% Under-expressed (Varambally Prostate) | 3.67E-134 | 6.8 | 377
17697 | Cancer Type: Prostate Cancer - Top 10% Over-expressed (Bittner Multi-cancer) | 23100 | Prostate Cancer - Metastasis - Top 10% Under-expressed (LaTulippe Prostate) | 1.22E-54 | 4.1 | 250
17697 | Cancer Type: Prostate Cancer - Top 10% Over-expressed (Bittner Multi-cancer) | 28344 | Prostate Cancer - Metastasis - Top 10% Under-expressed (Vanaja Prostate) | 3.10E-189 | 6.1 | 619
122189617 | Prostate Cancer - Metastasis - Top 10% Under-expressed (Taylor Prostate 3) | 122210891 | Cancer Type: Prostate Cancer - Top 5% Under-expressed (Garnett CellLine) | 1.63E-12 | 2.1 | 143
122189617 | Prostate Cancer - Metastasis - Top 10% Under-expressed (Taylor Prostate 3) | 122208916 | Cancer Type: Prostate Cancer - Top 5% Under-expressed (Barretina CellLine) | 1.94E-26 | 2.5 | 218
122189617 | Prostate Cancer - Metastasis - Top 10% Under-expressed (Taylor Prostate 3) | 122213069 | Prostate Cancer - Metastasis - Top 5% Under-expressed (Grasso Prostate) | 6.85E-300 | 15.4 | 554
122189617 | Prostate Cancer - Metastasis - Top 10% Under-expressed (Taylor Prostate 3) | 28483 | Prostate Cancer - Metastasis - Top 5% Under-expressed (Varambally Prostate) | 1.26E-181 | 8.7 | 446
122189617 | Prostate Cancer - Metastasis - Top 10% Under-expressed (Taylor Prostate 3) | 23100 | Prostate Cancer - Metastasis - Top 10% Under-expressed (LaTulippe Prostate) | 7.69E-149 | 8.1 | 422
122189617 | Prostate Cancer - Metastasis - Top 10% Under-expressed (Taylor Prostate 3) | 28344 | Prostate Cancer - Metastasis - Top 10% Under-expressed (Vanaja Prostate) | 1.66E-141 | 4.8 | 574
122210891 | Cancer Type: Prostate Cancer - Top 5% Under-expressed (Garnett CellLine) | 122208916 | Cancer Type: Prostate Cancer - Top 5% Under-expressed (Barretina CellLine) | 6.57E-135 | 13.7 | 232
122210891 | Cancer Type: Prostate Cancer - Top 5% Under-expressed (Garnett CellLine) | 122213069 | Prostate Cancer - Metastasis - Top 5% Under-expressed (Grasso Prostate) | 4.37E-06 | 1.9 | 65
122210891 | Cancer Type: Prostate Cancer - Top 5% Under-expressed (Garnett CellLine) | 23100 | Prostate Cancer - Metastasis - Top 10% Under-expressed (LaTulippe Prostate) | 3.26E-04 | 1.6 | 73

211

Cancer Type: Prostate Cancer Prostate Cancer - Metastasis - Top 122208916 - Top 5% Under-expressed 122213069 5% Under-expressed (Grasso 9.74E-20 2.9 118 (Barretina CellLine) Prostate) Cancer Type: Prostate Cancer Prostate Cancer - Metastasis - Top 122208916 - Top 5% Under-expressed 28483 5% Under-expressed (Varambally 1.44E-09 2.1 93 (Barretina CellLine) Prostate) Cancer Type: Prostate Cancer Prostate Cancer - Metastasis - Top 122208916 - Top 5% Under-expressed 23100 10% Under-expressed (LaTulippe 1.04E-05 1.7 87 (Barretina CellLine) Prostate) Prostate Cancer - Metastasis - Prostate Cancer - Metastasis - Top 1.06E- 122213069 Top 5% Under-expressed 28483 5% Under-expressed (Varambally 14.8 332 199 (Grasso Prostate) Prostate) Prostate Cancer - Metastasis - Prostate Cancer - Metastasis - Top 122213069 Top 5% Under-expressed 23100 10% Under-expressed (LaTulippe 5.82E-85 7.7 221 (Grasso Prostate) Prostate) Prostate Cancer - Metastasis - Prostate Cancer - Metastasis - Top 122213069 Top 5% Under-expressed 28344 10% Under-expressed (Vanaja 1.21E-75 4.7 288 (Grasso Prostate) Prostate) Prostate Cancer - Metastasis - Prostate Cancer - Metastasis - Top 28483 Top 5% Under-expressed 23100 10% Under-expressed (LaTulippe 1.59E-59 5.7 192 (Varambally Prostate) Prostate) Prostate Cancer - Metastasis - Prostate Cancer - Metastasis - Top 28483 Top 5% Under-expressed 28344 10% Under-expressed (Vanaja 3.84E-80 4.7 302 (Varambally Prostate) Prostate) Prostate Cancer - Metastasis - Prostate Cancer - Metastasis - Top 23100 Top 10% Under-expressed 28344 10% Under-expressed (Vanaja 2.79E-54 4 258 (LaTulippe Prostate) Prostate) Prostate Carcinoma - Dead at 3 PC3 GHSROS downregulated C41610 122199554 Years - Top 10% Under-expressed 0.003 2.9 12 gene list (Setlur Prostate) Prostate Carcinoma - Advanced PC3 GHSROS downregulated C41610 122189630 Gleason Score - Top 5% Under- 0.004 2.5 13 gene list expressed (Taylor Prostate 3) Prostate Carcinoma - Recurrence at PC3 GHSROS downregulated C41610 
122189606 5 Years - Top 5% Under-expressed 0.004 2.5 13 gene list (Taylor Prostate 3) Prostate Carcinoma - Dead at Prostate Carcinoma - Advanced 122199554 3 Years - Top 10% Under- 122189630 Gleason Score - Top 5% Under- 3.07E-37 4.7 141 expressed (Setlur Prostate) expressed (Taylor Prostate 3) Prostate Carcinoma - Dead at Prostate Carcinoma - Recurrence at 122199554 3 Years - Top 10% Under- 122189606 5 Years - Top 5% Under-expressed 3.29E-24 3.4 130 expressed (Setlur Prostate) (Taylor Prostate 3) Prostate Carcinoma - Prostate Carcinoma - Recurrence at Advanced Gleason Score - 122189630 122189606 5 Years - Top 5% Under-expressed 0.00E+00 26.5 493 Top 5% Under-expressed (Taylor Prostate 3) (Taylor Prostate 3) Prostate Carcinoma vs. Normal - PC3 GHSROS upregulated C41601 29459 Top 1% Over-expressed (Yu 4.78E-04 12.5 4 gene list_RNAseq Prostate) Prostate Carcinoma - Advanced N PC3 GHSROS upregulated C41602 122189633 Stage - Top 5% Over-expressed 0.002 3.5 9 gene list_RNAseq (Taylor Prostate 3) Prostate Adenocarcinoma - PC3 GHSROS upregulated C41603 17807 Advanced Stage - Top 1% Over- 0.003 7.5 4 gene list_RNAseq expressed (Bittner Prostate) Prostate Carcinoma vs. Normal - PC3 GHSROS upregulated C41604 23091 Top 10% Over-expressed 0.003 3.4 10 gene list_RNAseq (LaTulippe Prostate) Prostate Carcinoma vs. Prostate Adenocarcinoma - 29459 Normal - Top 1% Over- 17807 Advanced Stage - Top 1% Over- 3.33E-04 7.2 6 expressed (Yu Prostate) expressed (Bittner Prostate) Prostate Carcinoma vs. Prostate Carcinoma vs. Normal - 29459 Normal - Top 1% Over- 23091 Top 10% Over-expressed 5.61E-07 3.8 25 expressed (Yu Prostate) (LaTulippe Prostate) Prostate Carcinoma - Prostate Carcinoma vs. Normal - Advanced N Stage - Top 5% 122189633 23091 Top 10% Over-expressed 3.73E-06 2 61 Over-expressed (Taylor (LaTulippe Prostate) Prostate 3) Prostate Adenocarinoma - Prostate Carcinoma vs. 
Normal - Advanced Stage - Top 1% 17807 23091 Top 10% Over-expressed 6.07E-04 2.5 20 Over-expressed (Bittner (LaTulippe Prostate) Prostate)
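
The overlap statistics in Supplementary Table 5 (P-value, odds ratio and overlap size) can be illustrated with a minimal sketch of a one-sided Fisher's exact test on the overlap between two gene lists. This is not the Oncomine implementation, which is not described here; the function name `overlap_enrichment`, the choice of a one-sided test, and the example list sizes and gene universe are assumptions made purely for illustration.

```python
from math import comb


def overlap_enrichment(overlap, size_a, size_b, universe):
    """One-sided Fisher's exact test for the overlap of two gene lists.

    Hypothetical helper: `universe` is the assumed number of genes
    measured in both studies; it is not stated in the table above.
    """
    # 2x2 contingency table: in A and B, A only, B only, neither
    a = overlap
    b = size_a - overlap
    c = size_b - overlap
    d = universe - size_a - size_b + overlap
    if min(a, b, c, d) < 0:
        raise ValueError("list sizes are inconsistent with the universe")

    # Sample odds ratio of the 2x2 table
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")

    # Upper tail of the hypergeometric distribution: P(X >= overlap)
    numerator = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, min(size_a, size_b) + 1)
    )
    p_value = numerator / comb(universe, size_b)
    return odds_ratio, p_value


# Toy example: two lists of 4 genes sharing 3, out of 8 measured genes
odds, p = overlap_enrichment(overlap=3, size_a=4, size_b=4, universe=8)
print(f"odds ratio = {odds:.1f}, P = {p:.4f}")
```

Exact integer arithmetic via `math.comb` avoids underflow for the very small P-values seen in concept-to-concept comparisons, although the final division still returns a float and values below roughly 1E-308 will be reported as zero.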


Supplementary Table 6. Differentially expressed genes in PC3-GHSROS cells compared to the Grasso Oncomine data set. The Grasso data set includes 59 localized and 35 metastatic prostate tumours. Red: higher expression in metastatic tumours; Black: lower expression in metastatic tumours. Fold-changes are log2 transformed; Q-value denotes the false discovery rate (FDR; Benjamini-Hochberg)-adjusted P-value.

| Gene Symbol | Gene Name | Reporter ID | Fold Change | P-value | Q-value |
|---|---|---|---|---|---|
| AASS | aminoadipate-semialdehyde synthase | A_23_P8754 | -1.5 | 6.0E-03 | 2.5E-02 |
| ACHE | acetylcholinesterase (Yt blood group) | A_24_P60845 | 1.3 | 9.6E-03 | 3.4E-02 |
| ADAM8 | ADAM metallopeptidase domain 8 | A_24_P300777 | 1.5 | 3.8E-03 | 1.5E-02 |
| AGTR1 | angiotensin II receptor, type 1 | A_23_P166616 | -4.2 | 1.7E-07 | 2.0E-06 |
| AMOT | angiomotin | A_24_P344961 | -4.3 | 2.6E-08 | 3.9E-07 |
| ANGPT1 | angiopoietin 1 | A_23_P216023 | -18.5 | 9.8E-21 | 7.0E-18 |
| ARHGDIB | Rho GDP dissociation inhibitor (GDI) beta | A_23_P151075 | -1.8 | 6.5E-04 | 3.4E-03 |
| ARHGEF16 | Rho guanine nucleotide exchange factor (GEF) 16 | A_23_P114670 | 1.4 | 1.9E-04 | 1.1E-03 |
| BCAT1 | branched chain amino-acid transaminase 1, cytosolic | A_24_P935986 | -1.8 | 2.6E-03 | 1.2E-02 |
| CA9 | carbonic anhydrase IX | A_23_P157793 | 3.4 | 4.8E-06 | 4.2E-05 |
| CADM4 | cell adhesion molecule 4 | A_23_P5064 | 1.9 | 3.3E-05 | 2.3E-04 |
| CAMK2N1 | calcium/calmodulin-dependent protein kinase II inhibitor 1 | A_24_P117620 | 2.1 | 3.5E-03 | 1.4E-02 |
| CAPN6 | calpain 6 | A_23_P217570 | -5.6 | 1.7E-15 | 2.6E-13 |
| CCNB3 | cyclin B3 | A_23_P171107 | -1.5 | 1.3E-02 | 5.1E-02 |
| CEND1 | cell cycle exit and neuronal differentiation 1 | A_23_P170030 | -1.5 | 3.9E-03 | 1.7E-02 |
| CHRDL1 | chordin-like 1 | A_24_P168925 | -48.0 | 3.6E-21 | 2.9E-18 |
| CLMN | calmin (calponin-like, transmembrane) | A_23_P25706 | -1.8 | 9.9E-06 | 7.7E-05 |
| CNTN1 | contactin 1 | A_23_P204541 | -19.3 | 1.6E-16 | 3.4E-14 |
| COL5A1 | collagen, type V, alpha 1 | A_32_P95034 | 2.4 | 8.3E-06 | 6.8E-05 |
| COX7B2 | cytochrome c oxidase subunit VIIb2 | A_23_P136246 | 21.2 | 4.2E-03 | 1.7E-02 |
| CPA6 | carboxypeptidase A6 | A_23_P216291 | -6.1 | 1.3E-10 | 3.8E-09 |
| CPEB1 | cytoplasmic polyadenylation element binding protein 1 | A_23_P106322 | -1.5 | 7.2E-03 | 3.0E-02 |
| CRABP1 | cellular retinoic acid binding protein 1 | A_23_P117882 | 1.5 | 2.2E-03 | 9.5E-03 |
| CRABP2 | cellular retinoic acid binding protein 2 | A_23_P115064 | 1.9 | 3.2E-05 | 2.2E-04 |
| CRIP2 | cysteine-rich protein 2 | A_23_P112798 | 2.1 | 8.2E-08 | 1.3E-06 |
| CXADR | coxsackie virus and adenovirus receptor | A_23_P57268 | -2.5 | 3.9E-06 | 3.4E-05 |
| CXorf57 | chromosome X open reading frame 57 | A_23_P96369 | -2.7 | 7.2E-05 | 4.7E-04 |
| CYP4V2 | cytochrome P450, family 4, subfamily V, polypeptide 2 | A_32_P23838 | -3.1 | 1.2E-11 | 4.9E-10 |
| CYP7B1 | cytochrome P450, family 7, subfamily B, polypeptide 1 | A_23_P169092 | -2.5 | 2.6E-08 | 3.9E-07 |
| DGKG | diacylglycerol kinase, gamma 90kDa | A_23_P40926 | -2.1 | 6.0E-04 | 3.2E-03 |
| DIRAS1 | DIRAS family, GTP-binding RAS-like 1 | A_23_P386942 | 2.2 | 3.5E-07 | 4.4E-06 |
| DMD | dystrophin | A_24_P185854 | -8.4 | 2.6E-15 | 3.6E-13 |
| EGF | epidermal growth factor | A_23_P155979 | -2.2 | 1.5E-03 | 7.1E-03 |
| EOMES | eomesodermin homolog (Xenopus laevis) | A_24_P97374 | -1.6 | 1.1E-02 | 4.5E-02 |
| EYA1 | eyes absent homolog 1 (Drosophila) | A_23_P502363 | -17.0 | 2.2E-20 | 1.3E-17 |
| FAM20A | family with sequence similarity 20, member A | A_24_P352952 | 1.3 | 4.6E-02 | 1.3E-01 |
| FBXL16 | F-box and leucine-rich repeat protein 16 | A_23_P406385 | 6.0 | 2.5E-08 | 4.7E-07 |
| FOXRED2 | FAD-dependent oxidoreductase domain containing 2 | A_24_P147765 | 1.7 | 4.0E-06 | 3.6E-05 |
| GPR63 | G protein-coupled receptor 63 | A_23_P214727 | -1.3 | 3.3E-02 | 1.2E-01 |
| HEPH | hephaestin | A_23_P22526 | -4.8 | 8.2E-11 | 2.5E-09 |
| HSPA12A | heat shock 70kDa protein 12A | A_23_P408376 | -1.3 | 3.8E-02 | 1.3E-01 |
| HSPB8 | heat shock 22kDa protein 8 | A_23_P162579 | -3.9 | 7.5E-13 | 4.6E-11 |
| IFI16 | interferon, gamma-inducible protein 16 | A_23_P160025 | -2.2 | 1.1E-05 | 8.8E-05 |
| IL13RA2 | interleukin 13 receptor, alpha 2 | A_23_P85209 | -10.4 | 9.4E-13 | 5.6E-11 |
| ITM2A | integral membrane protein 2A | A_23_P171074 | -5.4 | 5.0E-07 | 5.4E-06 |
| KCNN3 | potassium intermediate/small conductance calcium-activated channel, subfamily N, member 3 | A_23_P126167 | -1.5 | 1.1E-02 | 4.5E-02 |
| KIF5C | kinesin family member 5C | A_32_P154473 | -1.3 | 3.5E-02 | 1.2E-01 |
| KLF9 | Kruppel-like factor 9 | A_23_P415401 | -3.3 | 6.4E-09 | 1.1E-07 |
| LCP1 | lymphocyte cytosolic protein 1 (L-plastin) | A_32_P181443 | -5.8 | 1.3E-14 | 1.5E-12 |
| LRCH2 | leucine-rich repeats and calponin homology (CH) domain containing 2 | A_23_P73809 | -11.4 | 2.6E-15 | 3.6E-13 |
| LRIG1 | leucine-rich repeats and immunoglobulin-like domains 1 | A_23_P109636 | -1.8 | 7.6E-05 | 4.9E-04 |
| MCTP2 | multiple C2 domains, transmembrane 2 | A_32_P72758 | -4.0 | 8.8E-05 | 5.6E-04 |
| MFAP3L | microfibrillar-associated protein 3-like | A_24_P76675 | -1.4 | 7.7E-03 | 3.2E-02 |
| MUC2 | mucin 2, oligomeric mucus/gel-forming | A_24_P341761 | 2.2 | 5.2E-06 | 4.5E-05 |
| MUC3A | mucin 3A, cell surface associated | A_23_P313278 | 1.3 | 1.7E-02 | 5.7E-02 |
| MUC5B | mucin 5B, oligomeric mucus/gel-forming | A_24_P324141 | 2.3 | 1.2E-07 | 1.8E-06 |
| MUM1L1 | melanoma associated antigen (mutated) 1-like 1 | A_23_P73571 | -8.8 | 7.9E-11 | 2.5E-09 |
| NELL2 | NEL-like 2 (chicken) | A_23_P10025 | -4.3 | 5.2E-08 | 7.1E-07 |
| NUDT10 | nudix (nucleoside diphosphate linked moiety X)-type motif 10 | A_23_P73787 | -9.2 | 3.2E-13 | 2.2E-11 |
| NUDT11 | nudix (nucleoside diphosphate linked moiety X)-type motif 11 | A_24_P345002 | -3.2 | 1.1E-06 | 1.1E-05 |
| PARM1 | prostate androgen-regulated mucin-like protein 1 | A_24_P191781 | -13.0 | 3.9E-17 | 1.0E-14 |
| PLD5 | phospholipase D family, member 5 | A_23_P510 | -1.4 | 1.2E-02 | 4.7E-02 |
| PLXNA4 | plexin A4 | A_24_P265051 | 2.3 | 4.5E-04 | 2.3E-03 |
| PTPRG | protein tyrosine phosphatase, receptor type, G | A_23_P41054 | -2.0 | 6.7E-05 | 4.4E-04 |
| RBM11 | RNA binding motif protein 11 | A_23_P342000 | -3.1 | 5.1E-06 | 4.3E-05 |
| RNASEL | ribonuclease L (2',5'-oligoisoadenylate synthetase-dependent) | A_23_P390172 | -1.8 | 1.4E-05 | 1.1E-04 |
| RNF128 | ring finger protein 128 | A_23_P148345 | -2.3 | 7.9E-03 | 3.2E-02 |
| RYR2 | ryanodine receptor 2 (cardiac) | A_23_P137797 | -1.9 | 2.5E-02 | 8.9E-02 |
| SEMA6B | sema domain, transmembrane domain (TM), and cytoplasmic domain, (semaphorin) 6B | A_23_P208900 | 2.8 | 6.3E-11 | 3.6E-09 |
| SLC1A3 | solute carrier family 1 (glial high affinity glutamate transporter), member 3 | A_24_P286114 | -2.0 | 1.9E-04 | 1.1E-03 |
| SLC9A2 | solute carrier family 9 (sodium/hydrogen exchanger), member 2 | A_23_P5536 | -1.8 | 8.3E-04 | 4.2E-03 |
| ST6GAL1 | ST6 beta-galactosamide alpha-2,6-sialyltranferase 1 | A_24_P388528 | -2.5 | 8.7E-04 | 4.4E-03 |
| ST6GAL2 | ST6 beta-galactosamide alpha-2,6-sialyltranferase 2 | A_23_P429425 | 5.6 | 1.2E-02 | 4.2E-02 |
| STOX2 | storkhead box 2 | A_23_P251364 | -1.9 | 1.7E-03 | 8.1E-03 |
| TCN1 | transcobalamin I (vitamin B12 binding protein, R binder family) | A_23_P64372 | -1.4 | 4.1E-02 | 1.4E-01 |
| TFF2 | trefoil factor 2 | A_23_P57364 | 1.3 | 6.4E-04 | 3.1E-03 |
| TNXB | tenascin XB | A_24_P15834 | 1.8 | 3.2E-08 | 5.8E-07 |
| TP53I11 | tumour protein p53 inducible protein 11 | A_23_P150281 | 1.5 | 4.8E-05 | 3.1E-04 |
| TPTE | transmembrane phosphatase with tensin homology | A_23_P345746 | -1.4 | 2.5E-02 | 8.9E-02 |
| UBE2QL1 | ubiquitin-conjugating enzyme E2Q family-like 1 | A_24_P479551 | -2.4 | 9.6E-06 | 7.6E-05 |
| UNC80 | unc-80 homolog (C. elegans) | A_23_P350808 | -4.7 | 2.0E-10 | 5.4E-09 |
| VWA5A | von Willebrand factor A domain containing 5A | A_23_P98455 | -7.5 | 4.0E-14 | 3.9E-12 |
| ZFPM2 | zinc finger protein, multitype 2 | A_23_P168909 | -1.9 | 5.3E-05 | 3.5E-04 |
| ZNF467 | zinc finger protein 467 | A_23_P59470 | 3.5 | 5.4E-07 | 6.3E-06 |
| ZNF85 | zinc finger protein 85 | A_23_P67702 | 1.3 | 3.5E-02 | 1.1E-01 |
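
The Q-values reported in these tables are Benjamini-Hochberg-adjusted P-values. As a minimal sketch of that adjustment (the function name `benjamini_hochberg` and the example P-values are illustrative assumptions, not taken from the Oncomine pipeline), each sorted P-value is scaled by the number of tests divided by its rank, and monotonicity is then enforced from the largest P-value downwards:

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values (Q-values).

    Illustrative helper; the output order matches the input order.
    """
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so each Q-value is the minimum
    # of its own scaled value and all scaled values ranked above it
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Made-up P-values for four hypothetical reporters
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Because the adjustment depends on the total number of reporters tested, applying it only to the genes listed in a table would not reproduce the tabulated Q-values, which were presumably computed across every reporter in the respective data set.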


Supplementary Table 7. Differentially expressed genes in PC3-GHSROS cells compared to the Taylor Oncomine data set. The Taylor data set includes 123 localized and 35 metastatic prostate tumours. Red: higher expression in metastatic tumours; Black: lower expression in metastatic tumours. Fold-changes are log2 transformed; Q-value denotes the false discovery rate (FDR; Benjamini-Hochberg)-adjusted P-value.

| Gene Symbol | Gene Name | Reporter ID | Fold Change | P-value | Q-value |
|---|---|---|---|---|---|
| AASS | aminoadipate-semialdehyde synthase | 10093 | -1.5 | 3.9E-05 | 1.7E-03 |
| ANGPT1 | angiopoietin 1 | 5779 | -1.7 | 5.8E-12 | 2.0E-09 |
| BTBD11 | BTB (POZ) domain containing 11 | 2600 | -1.4 | 1.2E-04 | 4.3E-03 |
| CAPN6 | calpain 6 | 12342 | -1.4 | 1.0E-04 | 3.9E-03 |
| CHRDL1 | chordin-like 1 | 20828 | -5.0 | 1.6E-18 | 1.3E-15 |
| CLDN7 | claudin 7 | 5927 | 1.4 | 8.5E-04 | 3.4E-02 |
| CNKSR2 | connector enhancer of kinase suppressor of Ras 2 | 12888 | -1.4 | 3.4E-03 | 5.8E-02 |
| CNTN1 | contactin 1 | 6403 | -3.5 | 4.2E-22 | 6.8E-19 |
| CPA6 | carboxypeptidase A6 | 15668 | -1.4 | 1.1E-03 | 2.4E-02 |
| CRIP1 | cysteine-rich protein 1 (intestinal) | 5930 | 1.3 | 3.5E-03 | 7.4E-02 |
| CRIP2 | cysteine-rich protein 2 | 5931 | 1.5 | 1.8E-02 | 1.7E-01 |
| DIRAS1 | DIRAS family, GTP-binding RAS-like 1 | 20799 | 1.1 | 1.2E-02 | 1.4E-01 |
| DMD | dystrophin | 8445 | -2.1 | 6.1E-10 | 1.3E-07 |
| EGF | epidermal growth factor | 6520 | -1.3 | 2.2E-03 | 4.1E-02 |
| EYA1 | eyes absent homolog 1 (Drosophila) | 22097 | -2.5 | 3.1E-10 | 7.1E-08 |
| FBXL16 | F-box and leucine-rich repeat protein 16 | 21824 | 1.1 | 8.0E-03 | 1.2E-01 |
| HEPH | hephaestin | 12775 | -1.6 | 1.6E-07 | 1.7E-05 |
| HSPB8 | heat shock 22kDa protein 8 | 12408 | -2.5 | 4.8E-14 | 2.5E-11 |
| IFI16 | interferon, gamma-inducible protein 16 | 9878 | -1.5 | 8.4E-05 | 3.3E-03 |
| KLF9 | Kruppel-like factor 9 | 5835 | -1.3 | 1.2E-03 | 2.6E-02 |
| LCP1 | lymphocyte cytosolic protein 1 (L-plastin) | 6845 | -1.9 | 2.0E-03 | 3.8E-02 |
| LRCH2 | leucine-rich repeats and calponin homology (CH) domain containing 2 | 15991 | -2.2 | 1.1E-10 | 2.9E-08 |
| LRIG1 | leucine-rich repeats and immunoglobulin-like domains 1 | 13390 | -1.5 | 8.1E-03 | 1.1E-01 |
| MCTP2 | multiple C2 domains, transmembrane 2 | 14977 | -1.8 | 3.7E-03 | 6.1E-02 |
| MUC5B | mucin 5B, oligomeric mucus/gel-forming | 6986 | 1.1 | 3.8E-02 | 2.5E-01 |
| MUM1L1 | melanoma associated antigen (mutated) 1-like 1 | 21313 | -1.3 | 1.0E-02 | 1.3E-01 |
| NFASC | neurofascin homolog (chicken) | 13029 | -1.2 | 1.4E-02 | 1.6E-01 |
| NUDT10 | nudix (nucleoside diphosphate linked moiety X)-type motif 10 | 21729 | -1.2 | 2.3E-05 | 1.1E-03 |
| NUDT11 | nudix (nucleoside diphosphate linked moiety X)-type motif 11 | 14812 | -1.2 | 2.1E-02 | 2.1E-01 |
| PARM1 | prostate androgen-regulated mucin-like protein 1 | 13279 | -2.4 | 1.6E-06 | 1.2E-04 |
| RNASEL | ribonuclease L (2',5'-oligoisoadenylate synthetase-dependent) | 16187 | -1.2 | 9.0E-06 | 5.2E-04 |
| RYR2 | ryanodine receptor 2 (cardiac) | 3246 | -1.4 | 3.9E-03 | 6.4E-02 |
| ST6GAL1 | ST6 beta-galactosamide alpha-2,6-sialyltranferase 1 | 22304 | -2.2 | 3.8E-03 | 6.3E-02 |
| STOX2 | storkhead box 2 | 15602 | -1.1 | 1.9E-02 | 2.0E-01 |
| TFF2 | trefoil factor 2 | 9774 | 1.1 | 3.6E-02 | 2.4E-01 |
| TMPRSS3 | transmembrane protease, serine 3 | 16977 | 1.1 | 2.5E-03 | 6.2E-02 |
| TP53I11 | tumour protein p53 inducible protein 11 | 4038 | 1.1 | 2.2E-02 | 1.9E-01 |
| UNC80 | unc-80 homolog (C. elegans) | 23549 | -1.3 | 7.5E-03 | 1.1E-01 |
| VWA5A | von Willebrand factor A domain containing 5A | 24065 | -1.9 | 1.0E-18 | 9.5E-16 |
| ZNF467 | zinc finger protein 467 | 25037 | 1.3 | 2.8E-04 | 1.7E-02 |
| ZNF607 | zinc finger protein 607 | 18630 | 1.1 | 8.1E-03 | 1.2E-01 |


Supplementary Table 8. Overall survival (OS) analysis of TCGA patients using the 34-gene signature. Patients were stratified into two groups by k-means clustering of gene expression (k=2). The log-rank test was used to assign statistical significance. Log-rank P-values calculated from cohorts where fewer than 10 events (here: death) per group (cluster 1 or 2) were recorded are indicated in red. NA denotes not available due to missing information.

| Cohort | Cancer type | Total (n) | Cluster 1: n | Cluster 1: number of events | Cluster 1: median (days) | Cluster 1: mean (days ± s.e.m) | Cluster 2: n | Cluster 2: number of events | Cluster 2: median (days) | Cluster 2: mean (days ± s.e.m) | Log-rank P |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ACC | Adrenocortical Cancer | 76 | 38 | 13 | 579 | 998±220 | 38 | 13 | 557 | 741±164 | 0.3414 |
| BLCA | Bladder Cancer | 319 | 175 | 37 | 467 | 560±60 | 144 | 54 | 509 | 671±88 | 0.0032 |
| BRCA | Breast Invasive Carcinoma | 1002 | 543 | 32 | 1082 | 1746±295 | 459 | 28 | 1206 | 1689±263 | 0.4853 |
| CESC | Cervical Cancer | 264 | 175 | 20 | 747 | 884±167 | 89 | 12 | 556 | 764±165 | 0.4725 |
| CHOL | Bile Duct Cancer | 31 | 10 | 3 | 385 | 370±153 | 21 | 10 | 740 | 892±188 | 0.6262 |
| COADREAD | Colon and Rectal Cancer | 332 | 180 | 26 | 698 | 922±130 | 152 | 16 | 649 | 1126±245 | 0.0537 |
| DLBC | Large B-cell Lymphoma | 42 | 9 | 0 | NA | NA | 33 | 5 | 595 | 1794±1168 | 0.2411 |
| ESCA | Esophageal Cancer | 139 | 64 | 21 | 480 | 654±109 | 75 | 13 | 279 | 392±103 | 0.7660 |
| GBM | Glioblastoma | 119 | 66 | 47 | 460 | 606±70 | 53 | 40 | 396 | 481±54 | 0.1178 |
| HNSC | Head and Neck Cancer | 391 | 191 | 48 | 488 | 804±104 | 200 | 46 | 419 | 660±117 | 0.9614 |
| KICH | Kidney Chromophobe | 62 | 8 | 1 | 325 | 325±NA | 54 | 5 | 854 | 924±180 | 0.4285 |
| KIRC | Kidney Clear Cell Carcinoma | 62 | 8 | 1 | 325 | 325±NA | 54 | 5 | 854 | 924±180 | 0.4285 |
| KIRP | Kidney Papillary Cell Carcinoma | 268 | 142 | 20 | 551 | 740±144 | 126 | 4 | 479 | 716±360 | 0.0034 |
| LAML | Acute Myeloid Leukemia | 58 | 24 | 0 | NA | NA | 34 | 0 | NA | NA | 1.0000 |
| LGG | Lower Grade Glioma | 484 | 301 | 65 | 932 | 1365±142 | 183 | 29 | 1208 | 1695±260 | 0.0665 |
| LIHC | Liver Cancer | 320 | 180 | 43 | 642 | 857±102 | 140 | 35 | 581 | 774±106 | 0.6938 |
| LUAD | Lung Adenocarcinoma | 434 | 204 | 51 | 598 | 703±67 | 230 | 56 | 866 | 1063±116 | 0.3794 |
| LUSC | Lung Squamous Cell Carcinoma | 373 | 206 | 51 | 733 | 1007±104 | 167 | 40 | 572 | 913±134 | 0.9996 |
| MESO | Mesothelioma | 86 | 41 | 36 | 469 | 529±65 | 45 | 37 | 406 | 598±118 | 0.8861 |
| OV | Ovarian Cancer | 357 | 205 | 118 | 1248 | 1297±59 | 152 | 78 | 1160 | 1373±95 | 0.3820 |
| PAAD | Pancreatic Cancer | 139 | 14 | 2 | 248 | 248±30 | 125 | 51 | 467 | 539±58 | 0.0089 |
| PCPG | Pheochromocytoma and Paraganglioma | 181 | 94 | 5 | 95 | 787±694 | 87 | 3 | 596 | 553±131 | 0.6236 |
| PRAD | Prostate Cancer | 489 | 334 | 1 | 1854 | 1854±NA | 155 | 3 | 2467 | 2241±780 | 0.1643 |
| SARC | Sarcoma | 234 | 128 | 41 | 687 | 865±95 | 106 | 30 | 752 | 989±129 | 0.2604 |
| SKCM | Melanoma | 409 | 192 | 78 | 1025 | 1662±212 | 217 | 85 | 1332 | 2096±211 | 0.0027 |
| STAD | Stomach Cancer | 324 | 165 | 50 | 388 | 494±48 | 159 | 24 | 446 | 565±83 | 0.0003 |
| TGCT | Testicular Cancer | 134 | 32 | 1 | 513 | 513±NA | 102 | 0 | NA | NA | 0.0797 |
| THCA | Thyroid Cancer | 497 | 241 | 1 | 1022 | 1022±NA | 256 | 1 | 1752 | 1752±NA | 0.9358 |
| UCEC | Endometrioid Cancer | 167 | 76 | 9 | 666 | 721±74 | 91 | 12 | 944 | 1040±213 | 0.8803 |
| UCS | Uterine Carcinosarcoma | 56 | 22 | 12 | 398 | 407±79 | 34 | 22 | 541 | 685±121 | 0.1045 |
| UVM | Ocular Melanomas | 60 | 26 | 3 | 872 | 735±174 | 34 | 0 | NA | NA | 0.0365 |
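
The log-rank comparison between the two k-means clusters can be sketched as follows. This is a minimal two-group log-rank test written for illustration only; the function name `logrank_p`, the toy follow-up times and the assumption that cluster membership has already been assigned are not taken from the analysis pipeline used for the table. Under the null hypothesis the test statistic follows a chi-square distribution with one degree of freedom, whose upper tail equals erfc(sqrt(x/2)).

```python
from math import erfc, sqrt


def logrank_p(times1, events1, times2, events2):
    """Two-group log-rank test; returns the P-value.

    times*: follow-up in days; events*: 1 if death observed, 0 if censored.
    Illustrative helper, assuming patients are already split into clusters.
    """
    data = [(t, e, 0) for t, e in zip(times1, events1)]
    data += [(t, e, 1) for t, e in zip(times2, events2)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Patients still at risk in each group just before time t
        n1 = sum(1 for u, _, g in data if u >= t and g == 0)
        n2 = sum(1 for u, _, g in data if u >= t and g == 1)
        # Deaths at time t in group 1 and in both groups combined
        d1 = sum(1 for u, e, g in data if u == t and e and g == 0)
        d = sum(1 for u, e, _ in data if u == t and e)
        n = n1 + n2
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)

    if variance == 0:
        return 1.0  # no informative events, e.g. no deaths recorded
    chi_square = observed_minus_expected ** 2 / variance
    # Upper tail of the chi-square distribution with 1 degree of freedom
    return erfc(sqrt(chi_square / 2))


# Toy data: cluster 1 dies early, cluster 2 dies late
p = logrank_p([1, 2, 3, 4, 5], [1] * 5, [10, 11, 12, 13, 14], [1] * 5)
print(f"log-rank P = {p:.4f}")
```

With no recorded events in either group the sketch returns 1.0, which is consistent with the P-value of 1.0000 listed for a cohort with zero events in both clusters; with very few events the chi-square approximation is unreliable, which is the reason such cohorts are flagged in the table.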


Supplementary Dataset 9. Differentially expressed genes in LNCaP-GHSROS cells. Compared to empty vector control. Red: higher expression in LNCaP-GHSROS cells; Black: lower expression in LNCaP-GHSROS cells. Fold-changes are log2 transformed; Q-value denotes the false discovery rate (FDR; Benjamini-Hochberg)-adjusted P-value (cutoff ≤ 0.05).

(Provided in a separate file as “Supplementary Dataset 9.xlsx”: https://www.dropbox.com/sh/yxq17lquo8nlceu/AAC8REiSlvNKyUez6fjT9hKWa?dl=0).


Supplementary Dataset 10. Enrichment for GO terms in the category ‘biological process’ for genes upregulated in LNCaP-GHSROS cells (compared to empty-vector control). P ≤ 0.01, Fisher's exact test.

(Provided in a separate file as “Supplementary Dataset 10.xlsx”: https://www.dropbox.com/sh/yxq17lquo8nlceu/AAC8REiSlvNKyUez6fjT9hKWa?dl=0).


Supplementary Table 11. Enrichment for GO terms in the category ‘biological process’ for genes downregulated in LNCaP-GHSROS cells (compared to empty-vector control). P ≤ 0.01, Fisher's exact test.

(Provided in a separate file as “Supplementary Dataset 11.xlsx”: https://www.dropbox.com/sh/yxq17lquo8nlceu/AAC8REiSlvNKyUez6fjT9hKWa?dl=0).


Supplementary Table 12. Disease-free survival (DFS) analysis of differentially expressed genes (in PC3-GHSROS cells, LNCaP-GHSROS cells and clinical metastatic tumours) in human datasets. Patients in the Taylor (n=150; n=123 localized and n=27 metastatic tumours) and TCGA-PRAD (n=489; localized tumours) datasets were stratified into two groups by k-means clustering of gene expression (k=2). The log-rank test was used to assign statistical significance, with P ≤ 0.05 considered significant (shown in bold). The Cox P-value and absolute hazard ratio (HR) between k-means clusters 1 and 2 are indicated for each gene. Overall median disease-free survival (DFS) in days is indicated for each cluster.

| Gene Symbol | Taylor (n=150): log-rank P | Taylor: Cox P | Taylor: absolute HR | Taylor: overall median DFS cluster 1 | Taylor: overall median DFS cluster 2 | TCGA-PRAD (n=489): log-rank P | TCGA-PRAD: Cox P | TCGA-PRAD: absolute HR | TCGA-PRAD: overall median DFS cluster 1 | TCGA-PRAD: overall median DFS cluster 2 |
|---|---|---|---|---|---|---|---|---|---|---|
| ZNF467 | 0.0027 | 0.0039 | 2.7 | 174 | 871 | 0.000050 | 0.000026 | 2.5 | 546 | 685 |
| CHRDL1 | 0.0047 | 0.0062 | 2.5 | 840 | 402 | 0.0079 | 0.0071 | 1.8 | 649 | 640 |
| FBXL16 | 0.017 | 0.020 | 2.2 | 300 | 871 | 0.089 | 0.087 | 1.5 | 627 | 663 |
| DIRAS1 | 0.09 | 0.099 | 1.7 | 709 | 329 | 0.012 | 0.011 | 1.7 | 425 | 723 |
| TFF2 | 0.11 | 0.11 | 1.7 | 840 | 125 | 0.84 | 0.089 | 1.1 | 648 | 896 |
| CNTN1 | 0.13 | 0.14 | 1.6 | 701 | 457 | 0.10 | 0.094 | 1.4 | 627 | 691 |
| IFI16 | 0.27 | 0.28 | 1.5 | 579 | 181 | 0.95 | 0.95 | 1.0 | 671 | 648 |
| AASS | 0.62 | 0.63 | 1.2 | 843 | 348 | 0.35 | 0.35 | 1.2 | 552 | 697 |
| MUM1L1 | 0.78 | 0.78 | 1.1 | 472 | 676 | 0.14 | 0.14 | 1.4 | 765 | 426 |
| TP53I11 | 0.98 | 0.98 | 1.0 | 122 | 843 | 0.57 | 0.57 | 1.1 | 533 | 751 |


Supplementary Table 13. Differentially expressed genes in MDA-MB-231-GHSROS cells compared to empty vector control. Red coloured text represents higher expression in MDA-MB-231-GHSROS cells; black text represents lower expression in MDA-MB-231-GHSROS cells. P-value obtained using the R package ‘limma’ (moderated t-test).

| Gene Symbol | Gene Name | Log2 Fold Change | P-Value | Affymetrix Probe |
|---|---|---|---|---|
| HTR1F | 5-hydroxytryptamine receptor 1F | 1.64 | 0.0040 | 8081067 |
| EPHA3 | EPH receptor A3 | 1.18 | 0.0190 | 8081081 |
| FKBP10 | FK506 binding protein 10 | 1.16 | 0.0032 | 8007154 |
| SHISA3 | shisa family member 3 | 1.09 | 0.0046 | 8094870 |
| MYO1D | myosin ID | 1.05 | 0.0111 | 8014115 |
| STK26 | serine/threonine protein kinase 26 | 0.933 | 0.0008 | 8169949 |
| BICC1 | BicC family RNA binding protein 1 | 0.857 | 0.0021 | 7927681 |
| PTGS2 | prostaglandin-endoperoxide synthase 2 | 0.837 | 0.0106 | 7922976 |
| EPCAM | epithelial cell adhesion molecule | 0.822 | 0.0089 | 8098439 |
| SH3BGRL2 | SH3 domain binding glutamate rich protein like 2 | 0.768 | 0.0008 | 8120833 |
| FZD3 | frizzled class receptor 3 | 0.752 | 0.0021 | 8145611 |
| PRKAA2 | protein kinase AMP-activated catalytic subunit alpha 2 | 0.704 | 0.0118 | 7901720 |
| SLC16A6 | solute carrier family 16 member 6 | 0.703 | 0.0045 | 8017843 |
| SLC4A4 | solute carrier family 4 member 4 | 0.702 | 0.0056 | 8095585 |
| GPR63 | G protein-coupled receptor 63 | 0.7 | 0.0092 | 8128316 |
| MIPOL1 | mirror-image polydactyly 1 | 0.698 | 0.0013 | 7973985 |
| LGR5 | leucine rich repeat containing G protein-coupled receptor 5 | 0.661 | 0.0184 | 7957140 |
| PLCB1 | phospholipase C beta 1 | 0.656 | 0.0222 | 8060854 |
| ADAM23 | ADAM metallopeptidase domain 23 | 0.639 | 0.0083 | 8047788 |
| SLITRK4 | SLIT and NTRK like family member 4 | 0.605 | 0.0070 | 8175574 |
| GNAL | G protein subunit alpha L | 0.604 | 0.0006 | 8020164 |
| AMOT | angiomotin | 0.603 | 0.0128 | 8174576 |
| MGAT4A | mannosyl (alpha-1,3-)-glycoprotein beta-1,4-N-acetylglucosaminyltransferase, isozyme A | 0.603 | 0.0418 | 8054135 |
| ULK2 | unc-51 like autophagy activating kinase 2 | 0.585 | 0.0008 | 8013399 |
| OR2A9P///OR2A20P | olfactory receptor family 2 subfamily A member 9 pseudogene///olfactory receptor family 2 subfamily A member 20 pseudogene | 0.584 | 0.0373 | 8136983 |
| ZNF704 | zinc finger protein 704 | 0.581 | 0.0032 | 8151496 |
| TBX3 | T-box 3 | 0.573 | 0.0041 | 7966690 |
| SUPT20HL1///SUPT20HL2 | SPT20 homolog, SAGA complex component-like 1///SPT20 homolog, SAGA complex component-like 2 | 0.568 | 0.0245 | 8166509 |
| SUPT20HL1///SUPT20HL2 | SPT20 homolog, SAGA complex component-like 1///SPT20 homolog, SAGA complex component-like 2 | 0.568 | 0.0245 | 8171844 |
| TENM2 | teneurin transmembrane protein 2 | 0.556 | 0.0391 | 8109752 |
| PDK3 | pyruvate dehydrogenase kinase 3 | 0.555 | 0.0220 | 8166511 |
| OR2A9P///OR2A20P | olfactory receptor family 2 subfamily A member 9 pseudogene///olfactory receptor family 2 subfamily A member 20 pseudogene | 0.554 | 0.0334 | 8143629 |
| TPD52L1 | tumour protein D52-like 1 | 0.55 | 0.0024 | 8121838 |
| TMEM56-RWDD3///TMEM56 | TMEM56-RWDD3 readthrough///transmembrane protein 56 | 0.543 | 0.0363 | 7903162 |
| CLEC2B | C-type lectin domain family 2 member B | 0.541 | 0.0498 | 7961083 |
| CSF3 | colony stimulating factor 3 | 0.539 | 0.0052 | 8006999 |
| PICK1 | protein interacting with PRKCA 1 | -0.551 | 0.0386 | 8072989 |
| MAOA | monoamine oxidase A | -0.563 | 0.0217 | 8166925 |
| CRISPLD1 | cysteine rich secretory protein LCCL domain containing 1 | -0.58 | 0.0009 | 8146967 |
| MSLN | mesothelin | -0.584 | 0.0059 | 7992071 |
| ZNF558 | zinc finger protein 558 | -0.611 | 0.0070 | 8033667 |
| PTPN22 | protein tyrosine phosphatase, non-receptor type 22 | -0.619 | 0.0004 | 7918657 |
| NDP | NDP, norrin cystine knot growth factor | -0.624 | 0.0004 | 8172220 |
| IFI27 | interferon alpha inducible protein 27 | -0.637 | 0.0225 | 7976443 |
| ZNF814 | zinc finger protein 814 | -0.638 | 0.0033 | 8039692 |
| PARP15 | poly(ADP-ribose) polymerase family member 15 | -0.639 | 0.0147 | 8082086 |
| CPT1C | carnitine palmitoyltransferase 1C | -0.642 | 0.0267 | 8030448 |
| RNU5D-1 | RNA, U5D small nuclear 1 | -0.647 | 0.0412 | 7915592 |
| ARHGAP6 | Rho GTPase activating protein 6 | -0.664 | 0.0042 | 8171313 |
| MEST | mesoderm specific transcript | -0.701 | 0.0026 | 8136248 |
| STOM | stomatin | -0.744 | 0.0110 | 8163896 |
| NLRP2 | NLR family pyrin domain containing 2 | -0.785 | 0.0001 | 8031398 |
| MUSK | muscle associated receptor tyrosine kinase | -0.841 | 0.0389 | 8157173 |
| CSF2RA | colony stimulating factor 2 receptor alpha subunit | -0.849 | 0.0030 | 8165735 |
| CSF2RA | colony stimulating factor 2 receptor alpha subunit | -0.849 | 0.0030 | 8176306 |
| LYPD6B | LY6/PLAUR domain containing 6B | -0.976 | 0.0188 | 8045664 |
| TENM1 | teneurin transmembrane protein 1 | -0.98 | 0.0238 | 8174937 |
| TGIF2LY///TGIF2LX | TGFB induced factor homeobox 2 like, Y-linked///TGFB induced factor homeobox 2 like, X-linked | -0.981 | 0.0002 | 8176397 |
| HLA-DPB1 | major histocompatibility complex, class II, DP beta 1 | -0.992 | 0.0485 | 8118594 |
| TGIF2LY///TGIF2LX | TGFB induced factor homeobox 2 like, Y-linked///TGFB induced factor homeobox 2 like, X-linked | -0.996 | 0.0002 | 8168646 |
| ADGRF1 | adhesion G protein-coupled receptor F1 | -0.999 | 0.0038 | 8126820 |
| SLC38A4 | solute carrier family 38 member 4 | -1.03 | 0.0081 | 7962559 |
| HLA-DPB1 | major histocompatibility complex, class II, DP beta 1 | -1.05 | 0.0403 | 8178220 |
| SPANXC///SPANXD | SPANX family member C///SPANX family member D | -1.07 | 0.0027 | 8175558 |
| HLA-DPA1 | major histocompatibility complex, class II, DP alpha 1 | -1.15 | 0.0466 | 8125556 |
| HLA-DPA1 | major histocompatibility complex, class II, DP alpha 1 | -1.15 | 0.0466 | 8178891 |
| HLA-DPA1 | major histocompatibility complex, class II, DP alpha 1 | -1.15 | 0.0470 | 8180100 |
| HLA-DPB1 | major histocompatibility complex, class II, DP beta 1 | -1.21 | 0.0391 | 8179519 |
| ZNF585B | zinc finger protein 585B | -1.21 | 0.0001 | 8036389 |
| HLA-DRA///HLA-DQA1 | major histocompatibility complex, class II, DR alpha///major histocompatibility complex, class II, DQ alpha 1 | -1.23 | 0.0065 | 8118548 |
| HLA-DRA///HLA-DQA1 | major histocompatibility complex, class II, DR alpha///major histocompatibility complex, class II, DQ alpha 1 | -1.24 | 0.0058 | 8179481 |
| LOC645188 | uncharacterized LOC645188 | -1.26 | 0.0003 | 8170257 |
| HLA-DRA///HLA-DQA1 | major histocompatibility complex, class II, DR alpha///major histocompatibility complex, class II, DQ alpha 1 | -1.35 | 0.0072 | 8178193 |
| LOC105369230///HLA-DRB6///HLA-DRB5///HLA-DRB4///HLA-DRB3///HLA-DRB1///LOC105369230///HLA-DRB5///HLA-DRB4///HLA-DRB3///HLA-DRB1///HLA-DQB1 | HLA class II histocompatibility antigen, DRB1-7 beta chain///major histocompatibility complex, class II, DR beta 6 (pseudogene)///major histocompatibility complex, class II, DR beta 5///major histocompatibility complex, class II, DR beta 4///major histocompatibility complex, class II, DR beta 3///major histocompatibility complex, class II, DR beta 1///HLA class II histocompatibility antigen, DRB1-7 beta chain///major histocompatibility complex, class II, DR beta 5///major histocompatibility complex, class II, DR beta 4///major histocompatibility complex, class II, DR beta 3///major histocompatibility complex, class II, DR beta 1///major histocompatibility complex, class II, DQ beta 1 | -1.39 | 0.0054 | 8180003 |
| LOC105369230///HLA-DRB6///HLA-DRB5///HLA-DRB4///HLA-DRB3///HLA-DRB1///LOC105369230///HLA-DRB5///HLA-DRB4///HLA-DRB3///HLA-DRB1///HLA-DQB1 | HLA class II histocompatibility antigen, DRB1-7 beta chain///major histocompatibility complex, class II, DR beta 6 (pseudogene)///major histocompatibility complex, class II, DR beta 5///major histocompatibility complex, class II, DR beta 4///major histocompatibility complex, class II, DR beta 3///major histocompatibility complex, class II, DR beta 1///HLA class II histocompatibility antigen, DRB1-7 beta chain///major histocompatibility complex, class II, DR beta 5///major histocompatibility complex, class II, DR beta 4///major histocompatibility complex, class II, DR beta 3///major histocompatibility complex, class II, DR beta 1///major histocompatibility complex, class II, DQ beta 1 | -1.41 | 0.0063 | 8178811 |
| LOC105369230///HLA-DRB6///HLA-DRB5///HLA-DRB4///HLA-DRB3///HLA-DRB1///LOC105369230///HLA-DRB5///HLA-DRB4///HLA-DRB3///HLA-DRB1///HLA-DQB1 | HLA class II histocompatibility antigen, DRB1-7 beta chain///major histocompatibility complex, class II, DR beta 6 (pseudogene)///major histocompatibility complex, class II, DR beta 5///major histocompatibility complex, class II, DR beta 4///major histocompatibility complex, class II, DR beta 3///major histocompatibility complex, class II, DR beta 1///HLA class II histocompatibility antigen, DRB1-7 beta chain///major histocompatibility complex, class II, DR beta 5///major histocompatibility complex, class II, DR beta 4///major histocompatibility complex, class II, DR beta 3///major histocompatibility complex, class II, DR beta 1///major histocompatibility complex, class II, DQ beta 1 | -1.44 | 0.0068 | 8178802 |
