THE ROLE OF GWAS IDENTIFIED 5P15 LOCUS IN PROSTATE CANCER RISK AND PROGRESSION

Panchadsaram Janaththani

BSc in Molecular Biology and Biotechnology

Submitted in fulfilment of the requirements for the degree of

Doctor of Philosophy

Institute of Health and Biomedical Innovation

School of Biomedical Sciences

Faculty of Health

Queensland University of Technology

2019

Keywords

5p15 locus, Iroquois 4, IRX4, IRX4lncRNA, Genome-Wide Association Studies, Multiple Nucleotide Length Polymorphism, Single nucleotide polymorphism, Prostate cancer, Long non-coding RNA, Androgens, ERG.


Abstract

Prostate cancer is the second most common cause of cancer death in Australian men. Androgens and the androgen receptor (AR) play a critical role in prostate cancer pathogenesis. Genome-wide association studies (GWAS) have led to the identification of 150 loci associated with prostate cancer risk. Through these GWAS, the rs12653946 SNP at the 5p15 locus was found to be significantly associated with prostate cancer risk in multi-ethnic populations. The rs12653946 SNP genotype has also been correlated with the expression levels of the downstream gene Iroquois Homeobox 4 (IRX4) and its antisense long non-coding RNA, IRX4lncRNA. Interestingly, expression of IRX4 and IRX4lncRNA was found to be increased following castration in the LNCaP tumour progression mouse model. Therefore, we hypothesised that the prostate cancer risk-associated 5p15 locus confers its risk via IRX4 and IRX4lncRNA. The aim of this study was to determine the functional role of IRX4 and IRX4lncRNA in prostate cancer progression and to characterise their androgen-mediated expression regulation in prostate cancer cells.

The studies in this thesis have shown that IRX4 and IRX4lncRNA were overexpressed in prostate tumour samples compared to adjacent non-malignant tissues. The expression of IRX4 was correlated with ERG-fusion status, a prostate cancer specific fusion, while IRX4lncRNA expression was correlated with aggressive disease. The transient knockdown of IRX4 in LNCaP prostate cancer cells reduced cell proliferation and migration. Gene microarray analysis of IRX4 knockdown LNCaP samples revealed that the AR pathway was inhibited, suggesting IRX4 may mediate effective androgen signalling. This was further supported by an IRX4 immunoprecipitation assay, in which IRX4 was found to interact with AR co-factors, including FOXA1. Similarly, knockdown of IRX4lncRNA reduced proliferation and migration of LNCaP cells, while overexpression of IRX4lncRNA increased PC3 cell proliferation, suggesting a tumour promoting role for this lncRNA.

Both IRX4 and IRX4lncRNA expression were up-regulated with androgen (DHT) treatment in VCaP and DuCaP cells, but down-regulated in LNCaP cells. Interestingly, in-silico analysis identified binding of AR and ERG upstream of IRX4 in VCaP and DuCaP cells, but not in LNCaP cells. Sequencing of this region identified a Multiple-Nucleotide Length Polymorphism (MNLP-rs386684493), where a stretch of 47bp sequence was replaced by a novel 21bp sequence. VCaP and DuCaP cells have an intact AR binding site whereas LNCaP cells have a disrupted AR binding site, suggesting this MNLP may be the functional variant guiding AR binding at this locus. The androgen responsiveness of the MNLP was further confirmed by a reporter vector assay. Large-scale genetic association studies of this MNLP in ~80,000 men through the PRACTICAL Consortium identified the MNLP as the most significant prostate cancer risk associated variant at this locus. Furthermore, the 21bp/21bp genotype was correlated with lower expression levels of IRX4 and IRX4lncRNA and poor survival outcome in those patients who underwent androgen deprivation therapy. IRX4 expression was also correlated with ERG-fusion status in prostate cancer datasets, which was also confirmed in our prostate cancer tissue samples. ERG knockdown in VCaP and DuCaP cells increased the androgen-mediated up-regulation of IRX4 expression, while ERG overexpression in LNCaP cells reduced IRX4 expression independent of androgen treatment.

In summary, our study identified IRX4 as a potential mediator of effective androgen signalling, and targeting IRX4 in patients stratified for ERG-fusion status and MNLP genotype may be a novel therapeutic strategy. In addition, this study discovered the functional role of IRX4lncRNA, and identifying the molecular mechanisms of this lncRNA in prostate cancer pathology may provide insights for the development of biologically meaningful targets for new therapeutics.


Table of Contents

Keywords
Abstract
List of Figures
List of Tables
List of Abbreviations
Statement of Original Authorship
Awards and Publications
Acknowledgements
Chapter 1: Literature Review
1.1. Prostate Cancer
1.1.1. Prostate cancer staging and grading
1.1.2. Diagnosis
1.1.3. Treatment
1.2. The role of the androgen receptor in the progression of prostate cancer
1.3. The role of TMPRSS2-ERG fusion in prostate cancer
1.4. Prostate cancer risk factors
1.5. Genome-Wide Association Studies (GWAS)
1.6. The rs12653946 SNP at the 5p15 locus is associated with prostate cancer risk
1.7. Iroquois Homeobox 4 (IRX4)
1.8. Antisense long non-coding RNA, IRX4lncRNA, at 5p15 locus
1.9. Long non-coding RNAs (lncRNAs)
1.10. Summary and knowledge gap
1.11. Hypothesis and Aims
Chapter 2: Materials and Methods
2.1. Cell culture
2.2. Genomic DNA samples for genetic association studies
2.3. RNA isolation from FFPE prostate tumour tissues and matched controls
2.4. RNA isolation from cell lines
2.5. cDNA synthesis
2.6. Reverse transcription polymerase chain reaction (RT-PCR)
2.7. Quantitative RT-PCR (qRT-PCR)
2.8. Western blot
2.9. siRNA mediated knockdown
2.10. Proliferation assay
2.11. Migration assay
2.12. Statistical analysis
Chapter 3: The role of Iroquois-Homeobox 4 (IRX4) at the GWAS identified 5p15 locus in prostate cancer
3.1. Introduction
3.2. Methods
3.2.1. Analysis of IRX transcription factors' expression in published data sets
3.2.2. Immunofluorescence (IF) analysis
3.2.3. Microarray gene expression profiling
3.2.4. Microarray data analysis
3.2.5. Immunoprecipitation (IP)
3.2.6. Mass spectrometry analysis
3.3. Results
3.3.1. Expression of IRX factors in clinical samples
3.3.2. IRX4 expression in prostate cancer cell lines
3.3.3. IRX4 knockdown in prostate cancer cell lines
3.3.4. IRX4 knockdown inhibits cell proliferation in LNCaP and C4-2B cells
3.3.5. IRX4 knockdown reduced LNCaP cell migration
3.3.6. IRX4 knockdown modulates the expression of genes involved in EMT
3.3.7. Microarray analysis of IRX4 knockdown samples
3.3.8. The AR transcriptome is regulated by IRX4 in LNCaP cells
3.3.9. IRX4 interacts with the AR co-factor, FOXA1 in LNCaP cells
3.4. Discussion
Chapter 4: IRX4lncRNA at the prostate cancer risk 5p15 locus promotes prostate cancer progression
4.1. Introduction
4.2. Methods
4.2.1. In-silico prediction of IRX4lncRNA expression and function
4.2.2. Strand specific RT-qPCR
4.2.3. Establishment of stable cell lines
4.3. Results
4.3.1. Characterisation of IRX4lncRNA in clinical samples
4.3.2. Expression of IRX4lncRNA splice variants in prostate cancer cells
4.3.3. Optimisation of strand specific qRT-PCR for IRX4lncRNA
4.3.4. Relative expression of IRX4lncRNA in a panel of prostate cell lines by qPCR
4.3.5. IRX4lncRNA expression during Epithelial to Mesenchymal Transition (EMT) in prostate cancer cells
4.3.6. Screening siRNAs to determine IRX4lncRNA knockdown efficiency
4.3.7. Cell proliferation assay with transient knockdown of IRX4lncRNA
4.3.8. Stable cell line establishment for overexpression and knockdown IRX4lncRNA
4.3.9. The role of IRX4lncRNA in prostate cancer cell proliferation using stable models
4.3.10. The role of IRX4lncRNA in LNCaP cell migration
4.3.11. Regulation of IRX4 expression by IRX4lncRNA
4.3.12. In-silico prediction of IRX4lncRNA function
4.4. Discussion
Chapter 5: Genotype specific androgen mediated regulation of IRX4 and IRX4lncRNA
5.1. Introduction
5.2. Methods
5.2.1. Androgen deprivation assay
5.2.2. Androgen deprivation assay with transient siRNA knockdown
5.2.3. Establishment of ERG overexpressing LNCaP cells
5.2.4. Genotyping of cell lines and patients
5.2.5. Reporter Gene assay
5.3. Results
5.3.1. Expression of IRX4 and IRX4lncRNA is regulated by androgens
5.3.2. In-silico analysis of ChIP-Sequence data at the 5p15 locus
5.3.3. Expression of IRX4 correlates with ERG expression
5.3.4. Sequencing of AR/ERG binding DNA region in prostate cancer cell lines
5.3.5. Allele specific androgen responsiveness of the MNLP
5.3.6. Genotyping MNLP in prostate cancer patients and cancer free controls and Linkage Disequilibrium (LD) analysis
5.3.7. IRX4 and IRX4lncRNA expression correlation with MNLP genotype
5.3.8. In-silico prediction of transcription factor binding at MNLP
5.3.9. MNLP association with prostate cancer risk and survival for patients treated with ADT
5.4. Discussion
Chapter 6: Conclusions, Limitations and Future Directions
Bibliography
Appendix
Appendix A
Appendix B
Appendix C
Appendix D
Appendix E
Appendix F
Appendix G
Appendix H
Appendix I
Appendix J
Appendix K
Appendix L
Appendix M
Appendix N
Appendix O
Appendix P
Appendix Q


List of Figures

Figure 1.1 - Androgen receptor signalling in prostate cells
Figure 1.2 - TMPRSS2:ERG fusion in prostate cancer cells
Figure 1.3 - The prostate cancer risk associated 5p15 locus
Figure 1.4 - Origins of LncRNAs
Figure 1.5 - Possible lncRNA targeting mechanisms
Figure 3.1 - Genomic organisation of Homo sapiens IRX genes
Figure 3.2 - The expression of IRX transcription factors in different cancers from the TCGA dataset
Figure 3.3 - IRX4 is highly expressed in prostate cancer compared to other cancers
Figure 3.4 - IRX4 is overexpressed in prostate tumour tissues compared to adjacent benign prostate samples in multiple data sets
Figure 3.5 - IRX4 is overexpressed in prostate tumour samples compared to adjacent non-malignant tissues
Figure 3.6 - IRX4 expression in primary and metastatic prostate cancer samples
Figure 3.7 - IRX4 expression correlation with the Gleason score
Figure 3.8 - IRX4 expression in a panel of prostate cancer cell lines
Figure 3.9 - IRX4 knockdown efficiency in LNCaP cells
Figure 3.10 - IRX4 knockdown reduced LNCaP cell proliferation
Figure 3.11 - CyQUANT assay for LNCaP cell proliferation
Figure 3.12 - Transient knockdown of IRX4 in C4-2B cells
Figure 3.13 - LNCaP cell migration with transient knockdown of IRX4
Figure 3.14 - Cell morphology of IRX4 knockdown of LNCaP cells
Figure 3.15 - EMT gene expression in IRX4 knockdown LNCaP cells
Figure 3.16 - IRX4 knockdown efficiency in LNCaP and VCaP cells
Figure 3.17 - IRX4 knockdown was confirmed by immunofluorescence
Figure 3.18 - Effect of siIRX4 knockdown in other IRXs' mRNA expression
Figure 3.19 - The differential expression of genes upon IRX4 knockdown in LNCaP and VCaP cells
Figure 3.20 - qPCR validation of IRX4 knockdown microarray results in (a) LNCaP and (b) VCaP cells
Figure 3.21 - Schematic of genes regulated by IRX4 and AR, annotated by IPA
Figure 3.22 - A comparison between IRX4 regulated genes and androgen regulated genes (ARG)
Figure 3.23 - IRX4 interacts with FOXA1 in LNCaP cells
Figure 4.1 - Expression profile of IRX4lncRNA in different tissue samples
Figure 4.2 - IRX4lncRNA expression in various tumour and normal tissues
Figure 4.3 - The expression of IRX4lncRNA and PCA3 in the TCGA prostate cancer dataset
Figure 4.4 - Correlation between the Gleason score and IRX4lncRNA/PCA3 expression
Figure 4.5 - IRX4lncRNA expression is associated with aggressive disease
Figure 4.6 - IRX4lncRNA and PCA3 expression in disease free and cancer progressed patients
Figure 4.7 - IRX4lncRNA is overexpressed in prostate tumour samples compared to adjacent non-malignant tissues
Figure 4.8 - Predicted variants of IRX4lncRNA
Figure 4.9 - RT-PCR analysis of IRX4lncRNA variants in prostate cancer cells
Figure 4.10 - IRX4lncRNA expression in a panel of prostate cell lines
Figure 4.11 - Relative expression of IRX4lncRNA in cells undergoing EMT
Figure 4.12 - siRNA screening for IRX4lncRNA knockdown in LNCaP cells
Figure 4.13 - The effect of transient knockdown of IRX4lncRNA on cell proliferation of LNCaP cells
Figure 4.14 - IRX4lncRNA knockdown efficiency in stable LNCaP models
Figure 4.15 - Validation of doxycycline inducible IRX4lncRNA overexpression PC3 model
Figure 4.16 - The effect of doxycycline inducible IRX4lncRNA knockdown on LNCaP cell proliferation
Figure 4.17 - The effect of IRX4lncRNA overexpression on PC3 cell proliferation
Figure 4.18 - The effect of IRX4lncRNA knockdown on LNCaP cell migration
Figure 4.19 - IRX4 expression in IRX4lncRNA knockdown LNCaP cells
Figure 4.20 - IRX4lncRNA expression in IRX4 knockdown cells
Figure 5.1 - Regulation of IRX4 and IRX4lncRNA expression by androgens (DHT) and anti-androgens
Figure 5.2 - Regulation of IRX4 and IRX4lncRNA expression by androgens (DHT) in VCaP cells
Figure 5.3 - Transient AR knockdown in VCaP cells
Figure 5.4 - Binding of AR and ERG at the prostate cancer risk associated 5p15 locus
Figure 5.5 - IRX4 expression correlates with ERG-fusion status
Figure 5.6 - IRX4lncRNA expression does not correlate with ERG fusion status
Figure 5.7 - High IRX4 expression correlates with high ERG expression
Figure 5.8 - IRX4lncRNA expression is not correlated with ERG expression
Figure 5.9 - Validation of ERG expression in LNCaP-pIND21-ERG
Figure 5.10 - Differential gene expression in LNCaP cells over expressing ERG (LNCaP-pIND21-ERG)
Figure 5.11 - Differential gene expression in ERG knockdown prostate cancer cell lines
Figure 5.12 - MNLP genotype of the panel of prostate cancer cell lines
Figure 5.13 - Allele specific luciferase promoter vector assay for the MNLP
Figure 5.14 - Genotyping of the MNLP in Australian males' blood DNA
Figure 5.15 - eQTL analysis of IRX4 and IRX4lncRNA expression with MNLP genotype
Figure 5.16 - Survival analysis for Queensland men with the MNLP who underwent androgen deprivation therapy
Figure 5.17 - Allele specific androgen-mediated regulation by the MNLP


List of Tables

Table 1-1 - Iroquois transcription factors in cancer
Table 1-2 - LncRNAs associated with prostate cancer and their functional roles
Table 1-3 - Evidence for the role of LncRNAs as biomarkers in prostate cancer
Table 3-1 - Expression microarray studies derived from the Oncomine database for characterising IRX4 expression
Table 3-2 - Top canonical pathways and molecular and cellular functions deregulated in IRX4 knockdown LNCaP cells
Table 3-3 - Top canonical pathways and molecular and cellular functions deregulated in IRX4 knockdown VCaP cells
Table 3-4 - Proteins found to be interacting with both IRX4 and AR in LNCaP cells
Table 3-5 - The top upstream regulators of the IRX4-interacting proteins
Table 4-1 - Alterations detected for IRX4lncRNA in different cancers from the TCGA dataset
Table 4-2 - Pathways predicted to be associated with IRX4lncRNA expression
Table 5-1 - The correlation of TMPRSS2/ERG fusion with IRX4/IRX4lncRNA expression in patient derived xenograft models
Table 5-2 - Predicted transcription factor binding to the MNLP


List of Abbreviations

ADT: Androgen deprivation therapy
AMHC1: Atrial myosin heavy chain-1
APCB: Australian Prostate Cancer BioResource
AR: Androgen receptor
ARE: Androgen responsive elements
BPC3: Breast and Prostate Cancer Cohort Consortium
BPH: Benign prostatic hyperplasia
C2orf43: Open reading frame 43 of chromosome 2
ceRNA: Competing endogenous RNA
CLPTM1L: Cisplatin resistance-related CRR9p
CRPC: Castration resistant prostate cancer
CSF2: Colony stimulating factor 2
DHT: 5α dihydrotestosterone
DO: Disease ontology
Dox: Doxycycline
EMSA: Electrophoretic mobility shift assay
EMT: Epithelial to mesenchymal transition
eQTL: Expression quantitative trait locus
ER: Estrogen receptor
ERG: v-ets erythroblastosis virus E26 oncogene homolog
eRNA: Enhancer associated RNA
FBS: Fetal bovine serum
FFPE: Formalin-fixed, paraffin-embedded
FOXA1: Forkhead Box A1
FOXP4: Forkhead box P4
fPSA: Free PSA
GO: Gene ontology
GPRC6A: G protein-coupled receptor, family C, group 6, member A
GR: Glucocorticoid receptor
GSEA: Gene set enrichment analysis
GWAS: Genome-wide association studies
HOTAIR: HOX Antisense intergenic RNA
HP: Human phenotypes
IP: Immunoprecipitation
IPA: Ingenuity pathway analysis
iPSA: Intact PSA
IPSS: International Prostate Symptom Score
IRX: Iroquois Homeobox
LD: Linkage disequilibrium
lncRNA: Long non-coding RNA
MAF: Minor allele frequency
MALAT1: Metastasis-associated lung carcinoma transcript 1
MET: Mesenchymal to epithelial transition
MNLP: Multiple nucleotide length polymorphism
NCOAs: Nuclear receptor coactivators
ncRNA: Non-protein coding RNA
NFAT: Nuclear factor activation of T cell trafficking
Ns: Non-significant
NUPR1: Nuclear protein 1
OR: Odds Ratio
paRNA: Promoter associated RNA
PCA3: Prostate cancer antigen 3
PCATs: Prostate cancer associated transcripts
PCR: Polymerase chain reaction
PIN: Prostatic intraepithelial neoplasia
PRACTICAL: Prostate cancer AssoCiation group To Investigate Cancer Associated aLteration
PRC1: Polycomb complex 1
PRC2: Polycomb complex 2
qRT-PCR: Quantitative real time polymerase chain reaction
RT-PCR: Reverse transcription polymerase chain reaction
SChLAP1: Second Chromosome Locus Associated with Prostate 1
SNP: Single nucleotide polymorphism
sORFs: Short open reading frames
TCGA: The Cancer Genome Atlas
TERT: Telomerase reverse transcriptase
TF: Transcription factor
TMPRSS2: Transmembrane protease, serine 2
TNM: Tumour, Node, Metastasis
VDR: Vitamin D receptor
VMHC1: Ventricular myosin heavy chain-1
VPC: Vancouver prostate cancer
Xist: X-inactive specific transcript

Statement of Original Authorship

The work contained in this thesis has not been previously submitted to meet requirements for an award at this or any other higher education institution. To the best of my knowledge and belief, the thesis contains no material previously published or written by another person except where due reference is made.

Signature: QUT Verified Signature

Date: February 2019


Awards and Publications

Awards relating to this thesis

1. Finalist for the ESA-Novartis Junior Scientist Award, Endocrine Society of Australia / Society of Reproductive Biology (ESA/SRB) Joint meeting (2018)
2. Best Poster in the Laboratory/Basic Science Student Category, Research Excellence Awards at the PAH-Health Symposium (2018)
3. People's Choice Award for Poster Presentation at the ASMR Postgraduate Student Conference (2018)
4. Best Oral Presentation at the Indo-Australian Biotechnology Conference (2017)
5. People's Choice Award for Poster Presentation at the PAH-Health Symposium (2017)
6. Oral Finalist in the Basic Science Student Category, Research Excellence Awards at the PAH-TRI Research Excellence Awards (2016)
7. Best Student Oral Presentation at the TRI Student Poster Symposium (2015)

Peer-reviewed journal articles as co-author

1. John Lai, Leire Moya, Jiyuan An, Andrea Hoffman, Srilakshmi Srinivasan, Janaththani Panchadsaram, Carina Walpole, Joanna L. Perry-Keene, Suzanne Chambers, Australian Prostate Cancer BioResource, Melanie L. Lehman, Colleen C. Nelson, Judith A. Clements and Jyotsna Batra. A microsatellite repeat in PCA3 long non-coding RNA is associated with prostate cancer risk and aggressiveness. Scientific Reports. 2017. 7: 16862.
2. Leire Moya, John Lai, Andrea Hoffman, Srilakshmi Srinivasan, Janaththani Panchadsaram, Suzanne Chambers, Australian Prostate Cancer BioResource, Judith Clements and Jyotsna Batra. A Short Tandem Repeat in the TRIB1 Gene is Associated with Prostate Cancer Risk. Frontiers in Genetics. 2018. 9: 428.
3. Srilakshmi Srinivasan, Carson Stephens, Emily Wilson, Janaththani Panchadsaram, Kerry DeVoss, Hannu Koistinen, Ulf-Håkan Stenman, The PRACTICAL Consortium, Ashley M. Buckle, Robert J. Klein, Hans Lilja, Judith Clements and Jyotsna Batra. Prostate cancer risk associated Single Nucleotide Polymorphism affects PSA glycosylation and its function. Clinical Chemistry. 2019. 65(1): e1-e9.


Acknowledgements

First and foremost, I would like to express my sincere gratitude to my Principal Supervisor, A/Prof Jyotsna Batra, for giving me the opportunity to pursue my PhD studies at the Queensland University of Technology and for her continuous support, motivation and guidance throughout my studies. I would like to thank A/Prof Jyotsna Batra for being very supportive and for always responding so promptly to my questions and queries. I would also like to thank my associate supervisor, Dr Gregor Tevz, for his tremendous support and guidance in the laboratory, and also for the hard questions which motivated me to widen my research from various perspectives. My sincere thanks also go to my associate supervisor, D/Prof Judith Clements, for her valuable mentoring and insightful comments during my research studies. I could not have maintained my interest in my research during the hard times without the support of my supervisory panel. I would also like to acknowledge the QUT Postgraduate Research Award (QUTPRA) and the HDR Tuition Fee Waiver for supporting my studies.

I would like to thank the past and present Batra and Clements group members, APCRC-Q members and colleagues from the TRI for their support throughout my PhD studies. I would also like to extend my sincere gratitude to the following people for their assistance with the data collected for this project:

- Dr Anja Rockstroh, Dr Atefeh Taherin Fard, Dr Melanie Lehman and Prof Colleen Nelson at the Australian Prostate Cancer Research Centre – Queensland, QUT for microarray experiments and analysis of the androgen regulated gene signature;
- Dr Thomas Kryza at the Australian Prostate Cancer Research Centre – Queensland, QUT for assistance with the immunofluorescence assays;
- Dr Carina Walpole at the Australian Prostate Cancer Research Centre – Queensland, QUT for generating reporter vector plasmids;
- Ms Dorothy Loo at the TRI Proteomics facility for the mass spectrometry analysis of IRX4 immunoprecipitation samples;
- A/Prof Elizabeth Williams at the Australian Prostate Cancer Research Centre – Queensland, QUT for providing genomic DNA samples and RNA-sequencing data from the xenograft models;
- Dr Brett Hollier and Dr Nataly Stylianou at the Australian Prostate Cancer Research Centre – Queensland, QUT for providing cDNA samples from the epithelial to mesenchymal transition cell models;
- Dr Trina Yeadon and Ms Allison Eckert at the Australian Prostate Cancer BioResource (APCB), QLD for retrieving the patient tissues for gene expression studies;
- QIMR Histology for the FFPE tissue sectioning and H&E staining of the prostate tumour samples obtained from the APCB;
- Dr Joanna Perry-Keene and Dr Katie Buzacott at Anatomical Pathology, Pathology Queensland for marking tumour and adjacent non-malignant tissues on the FFPE slides and scoring them;
- Dr Srilakshmi Srinivasan at the Australian Prostate Cancer Research Centre – Queensland, QUT for RNA isolation from FFPE samples;
- The PRACTICAL Consortium for the SNP genotyping data;
- Ms Leire Moya at the Australian Prostate Cancer Research Centre – Queensland, QUT for helping me retrieve DNA samples for genotyping and assisting with chemical and consumable ordering;
- Dr John Lai at the Australian Prostate Cancer Research Centre – Queensland, QUT for providing assistance with the use of online tools, databases and RNA-sequencing data.

I have been blessed to have good friends who have listened to my struggles and given me advice throughout this entire process. In particular, I would like to thank Ramethaa Pirathiban, Farhana Matin, Srilakshmi Srinivasan, Lakmali Silva, Sugarniya Subramaniam, Thomas Kryza, Leire Moya, Ruth Fuhrman-Luck, Patrick Thomas, Mohanan Maharaj, Ellca Ratther, Carina Walpole, Ying Dong and Carson Stephens. I would also like to thank the honours and vacation research students, Elizabeth Cheeseman, Lok Wan Ko, Zeinab Kooshan and Julia Liukkonen, who have accompanied me during this journey.

I would like to acknowledge all the teachers who have taught me since my school days. I would not be here without their blessing and guidance. Special thanks are also due to my undergraduate lecturers, Prof Suneth Sooriyapathirana, Prof Sanath Rajapakse, Dr Preminda Samaraweera and Ms Chandima Dhanapala at the University of Peradeniya, Sri Lanka, for laying a strong foundation and steering me towards my goals.

I would like to thank my parents, my siblings and my in-laws for all the support over these years. Finally, I would like to extend my heartfelt thanks to my husband Logapirathap Naguleswaran for his moral support, encouragement and sacrifices he made during this time. I would like to dedicate my thesis to my parents.


Chapter 1: Literature Review

1.1. Prostate Cancer

Prostate cancer is potentially life-threatening and is the most serious of the four main disorders of the prostate: prostatitis, benign prostatic hyperplasia (BPH), prostatodynia and prostate cancer. It is the most commonly diagnosed cancer in men worldwide, accounting for ~1.6 million incident cases in 2015 (Global Burden of Disease Cancer et al., 2017). In Australia, 20,000 new prostate cancer cases are diagnosed annually (Prostate Cancer Foundation of Australia). One in 11 Australian men is at risk of developing prostate cancer by the age of 70, and 3,300 Australian men die of the disease every year (Prostate Cancer Foundation of Australia).

Acinar adenocarcinoma is the most common type of prostate cancer diagnosed, while rare forms include mucinous adenocarcinoma, ductal adenocarcinoma, adenosquamous carcinoma, signet ring cell carcinoma and neuroendocrine carcinoma (Li et al., 2016). Prostate cancer is a clinically heterogeneous disease, ranging from an indolent tumour that remains relatively insignificant to a patient's health to a rapidly progressing, fatal disease (Boyd et al., 2012).

1.1.1. Prostate cancer staging and grading

Determining the tumour stage is important for assessing the severity of disease and choosing the relevant treatment options. The Tumour, Node, Metastasis (TNM: Tumour extent, lymph Node invasion, presence of Metastasis) staging system is a well-accepted standard practice in this regard (Buyyounouski et al., 2017). The Gleason grading system is also used to determine the prognosis and select treatment options for prostate cancer (Egevad, 2008). It assigns a score ranging from 1 to 5 to each of the primary and secondary histological patterns of the tumour, and the two scores are summed to estimate the risk (6: low risk, 7: intermediate risk, and 8 or above: high risk) (Tagai et al., 2018). A high score predicts aggressive disease with a poor clinical outcome (Egevad, 2008; Mak et al., 2010).
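
The summation described above can be illustrated with a short sketch. This is a minimal example in Python and not a clinical tool: the function names are hypothetical, and treating sums below 6 as low risk is an assumption, since the text names only 6 as the low-risk value.

```python
def gleason_score(primary: int, secondary: int) -> int:
    """Sum the primary and secondary histological pattern grades (each 1 to 5)."""
    for pattern in (primary, secondary):
        if not 1 <= pattern <= 5:
            raise ValueError("each Gleason pattern must be between 1 and 5")
    return primary + secondary


def risk_group(score: int) -> str:
    """Map a summed Gleason score to the risk groups quoted in the text."""
    if not 2 <= score <= 10:
        raise ValueError("a summed Gleason score must be between 2 and 10")
    if score >= 8:
        return "high"
    if score == 7:
        return "intermediate"
    # Assumption: sums below 6 are grouped with 6 as low risk.
    return "low"


if __name__ == "__main__":
    score = gleason_score(primary=3, secondary=4)
    print(score, risk_group(score))
```

In this sketch, a tumour with a primary pattern of 3 and a secondary pattern of 4 has a summed score of 7 and falls in the intermediate-risk group.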


1.1.2. Diagnosis

The serum prostate-specific antigen (PSA) test is the most commonly used non-invasive diagnostic approach for prostate cancer. Even though the PSA test is considered a sensitive biomarker for the detection of prostate cancer, it is not considered a specific marker (Crawford et al., 2014). The specificity and sensitivity of the diagnostic PSA test range from 20-40% and 70-90%, respectively, depending on the applied cut-off for PSA levels (Prensner et al., 2012). One possible explanation for the poor specificity of the PSA blood test is that several non-cancerous conditions may increase the PSA level in men. For example, benign prostatic hyperplasia (BPH) and prostatitis may cause an elevation in PSA levels (Qu et al., 2014). Thus, modifications to the PSA test, such as measuring PSA density, PSA velocity, free PSA (fPSA) and complexed PSA, have been recommended to improve its performance (Prensner, et al., 2012). A four-kallikrein panel comprising total PSA, fPSA, intact PSA (iPSA) and human KLK2 has been suggested to have increased accuracy compared to conventional PSA testing (Filella et al., 2015). Recent studies have suggested the use of molecular biomarkers such as PCA3, TMPRSS2:ERG gene fusion, microRNAs and circulating tumour cells for improved diagnostics and better prognostic outcomes for prostate cancer patients (Filella et al., 2018). Even though each of these tests provides advantages for prostate cancer diagnosis, additional studies are essential to determine the appropriate use of these biomarkers for prostate cancer management.
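
The difference between a sensitive test and a specific test can be made concrete with a small sketch. The counts below are hypothetical and were chosen only so that the results fall inside the ranges quoted above; they are not data from any study, and the function names are illustrative.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of men with cancer whom the test correctly flags."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Fraction of men without cancer whom the test correctly clears."""
    return true_negatives / (true_negatives + false_positives)


if __name__ == "__main__":
    # Hypothetical cohort: 100 men with cancer and 100 men without.
    # The test flags 80 of the men with cancer, but also flags 70 of
    # the men without cancer (for example, because of BPH or prostatitis).
    print(f"sensitivity = {sensitivity(80, 20):.2f}")
    print(f"specificity = {specificity(30, 70):.2f}")
```

With these illustrative counts the test has a sensitivity of 0.80 and a specificity of 0.30: it misses few cancers, but most men without cancer would still receive a positive result.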

1.1.3. Treatment

The growth of prostate cancer is stimulated by androgens, such as testosterone produced by the testes (Zhou et al., 2015). Localised prostate cancer is treated with radical prostatectomy and radiation therapy, with or without androgen deprivation therapy (ADT) (Gamat et al., 2017). ADT is used to reduce the levels of circulating androgens for the treatment of prostate cancer (Rodrigues et al., 2014). This is achieved through disruption of the hypothalamic-pituitary-gonadal axis by surgical or medical means (Zareba et al., 2016). Despite the initial efficacy of ADT, most patients with advanced prostate cancer eventually develop castrate-resistant prostate cancer (CRPC) (Bracarda et al., 2005), in which androgen receptor (AR) signalling is reactivated through various mechanisms such as amplification of the AR locus, mutations in the AR gene and/or AR splice variants (Green et al., 2012). In addition, de novo synthesis of androgens in prostate cancer cells, ligand-independent activation of AR and altered expression of AR co-factors have also been implicated in CRPC (Green, et al., 2012). Therefore, compounds targeting various aspects of the AR signalling pathway have been developed to treat CRPC patients. These include Enzalutamide, which prevents the binding of androgens to AR, AR nuclear translocation and DNA binding (Rawlinson et al., 2012); Abiraterone acetate, which blocks de novo synthesis of androgens (Rodrigues, et al., 2014); and Galeterone, which acts through multiple mechanisms to block androgen synthesis, antagonise AR and degrade AR (Crona et al., 2015). Even though these drugs are potent, they do not completely inhibit AR signalling and increase average survival by only several months (Mitsiades, 2013). Therefore, more targeted therapeutics are essential for the treatment of advanced prostate cancer.

1.2. The role of the androgen receptor in the progression of prostate cancer

The AR is a member of the nuclear receptor family of transcription factors, which is essential for normal prostate development as well as for driving prostate carcinogenesis (Rodrigues, et al., 2014). The binding of testosterone and 5α-dihydrotestosterone (DHT) to AR causes AR dimerisation and translocation to the nucleus, which enables the recruitment of the AR to the androgen response elements on DNA and promotes expression of the AR target genes (Figure 1.1) (Daniels et al., 2014). Prostate luminal cells express high levels of AR, while basal cells express low levels or no AR. The mechanism of transformation of prostate cells to cancer initiating cells is not clearly understood. In prostate adenocarcinoma, the AR transcriptional program switches from cell differentiation to cell proliferation, which is influenced by post-translational modifications, genetic mutations or gene rearrangements and altered association of AR co-factors (Zhou, et al., 2015). Even though the circulating androgen levels become limited in CRPC, prostate cancer cells maintain AR activity through other mechanisms such as AR locus amplification, splice variants and crosstalk with other signalling pathways (Zhou, et al., 2015). Indirect reactivation of the AR is mostly regulated by the altered expression of AR co-chaperones, such as heat shock proteins induced by castration, and by genomic modification of AR co-factors such as Forkhead Box A1 (FOXA1/HNF-3A) and Nuclear receptor coactivators (NCOAs) (Wyatt et al., 2015).

Figure 1.1 – Androgen receptor signalling in prostate cells. Testosterone is activated to 5α-dihydrotestosterone (DHT) by 5α-reductase in the cytoplasm. The binding of DHT to the androgen receptor (AR) leads to receptor dimerisation and conformational changes which translocates AR to the nucleus. In the nucleus, AR binds to specific transcription factors (TF) or co-factors and its specific androgen responsive elements (ARE) on gene promoter or enhancer regions to regulate gene expression. (Figure modified from (Liao et al., 2013)).

1.3. The role of TMPRSS2-ERG fusion in prostate cancer

The recurrent genomic rearrangements of ETS family transcription factor proto-oncogenes result in AR-driven overexpression of these transcription factors in prostate cancer (Cai et al., 2013). In particular, the genomic rearrangement of the v-ets erythroblastosis virus E26 oncogene homolog, ERG, has been well studied for its role in prostate cancer (Figure 1.2). A gene fusion between the 5′ untranslated region (5′ UTR) of the androgen-regulated transmembrane protease, serine 2 (TMPRSS2) gene and an exon in the ERG gene results in overexpression of ERG in approximately 50% of prostate cancers, which defines a subset of prostate cancer (Kumar-Sinha et al., 2008; Tomlins et al., 2005). This fusion is found in precursor prostatic intraepithelial neoplasia (PIN) lesions adjacent to prostate tumours, suggesting a role in the early stages of the cancer (Perner et al., 2007). The fusion gene is also overexpressed in CRPC, indicating that ERG may contribute to prostate cancer metastasis (Cai et al., 2009). In addition, ERG overexpression in benign prostate cells and in the metastatic prostate cancer cell line VCaP was shown to involve the plasminogen activation pathway and to mediate the cellular invasion process (Tomlins et al., 2008). Studies also report that androgen-mediated ERG overexpression regulates genes involved in prostate cancer (Cai, et al., 2013). Furthermore, ERG is reported to disrupt AR signalling by binding to the promoter region of the AR gene, where it directly affects the AR transcriptional activity. In addition, it modulates the AR binding region in the chromatin DNA and acts as a negative regulator of AR regulated genes (Yu et al., 2010). It has also been shown that ERG directly activates a Polycomb protein, the H3K27 methyltransferase EZH2, and therefore induces repressive epigenetic programmes (Yu, et al., 2010). ERG overexpression in prostate cancer cells was shown to promote prostate cancer progression, invasion and cellular motility (St John et al., 2012). Pre-clinical and clinical studies targeting ERG are emerging and include inhibition of ERG mRNA by small molecule inhibitors, inhibition of ERG co-factors, inhibition of ERG binding to DNA and destabilisation of ERG protein (Sedarsky et al., 2017).

Figure 1.2 – TMPRSS2:ERG fusion in prostate cancer cells. The androgen responsive promoter of the TMPRSS2 gene fuses with the ERG gene by either chromosomal translocation or deletion, which results in AR-driven overexpression of ERG in approximately 50% of prostate cancer patients. Binding of ERG to its target genes either activates or inhibits the transcription of the downstream genes. (AR – Androgen receptor, TF – Co-factor, DHT – Dihydrotestosterone)

1.4. Prostate cancer risk factors

As a multifactorial disease, prostate cancer has several factors contributing to its aetiology, comprising both modifiable and non-modifiable factors (Adjakly et al., 2015). Age is a well-known non-modifiable risk factor for prostate cancer, where the risk of developing cancer increases with age (Brawley, 2012; Crawford, 2003). Ethnicity is another non-modifiable contributing factor to the development of prostate cancer, with Asians having lower prostate cancer rates than Europeans and Americans (Adjakly, et al., 2015). Furthermore, family history and/or heredity is also a known non-modifiable prostate cancer risk factor (Crawford, 2003). There is a considerable amount of evidence for a genetic basis (~57%) for the risk of prostate cancer (Mucci et al., 2016). On the other hand, diet and exposure to environmental disruptors, such as bisphenol A, chlordecone and pesticides (Adjakly, et al., 2015), are reported as modifiable prostate cancer risk factors.

1.5. Genome-Wide Association Studies (GWAS)

Genome-wide association studies (GWAS) identify risk loci involved in human disease. In this analysis, DNA variations called single nucleotide polymorphisms (SNPs) are investigated throughout the genome to find SNPs whose frequencies differ between a population with a particular disease and a population without the disease. Prostate cancer GWAS have identified more than 150 risk loci associated with prostate cancer (Al Olama et al., 2014; Eeles et al., 2013; Hazelett et al., 2014; Hicks et al., 2013; Schumacher et al., 2018). The initial prostate cancer GWAS were carried out in European populations from the UK, USA, Sweden and Iceland (Duggan et al., 2007; Eeles et al., 2008; Gudmundsson et al., 2007; Thomas et al., 2008; Yeager et al., 2007), with most risk alleles having a similar effect in different ethnic populations in other replication studies (Haiman et al., 2011; Waters et al., 2009; Xu et al., 2011). Most of these loci are typically identified using tagged-SNPs, which act as markers for other potential causal genetic variations that are in linkage disequilibrium (LD) at that genomic region and are inherited together with the tagged-SNPs for a disease or phenotype (Hazelett, et al., 2014). Fine-mapping studies, which genotype GWAS identified risk regions more densely, identify these additional risk associated variants and characterise the genetic architecture of the risk association (Dadaev et al., 2018).

Most of the risk associated SNPs, including the majority of GWAS tag-SNPs, occur in non-protein coding DNA. Prostate cancer risk is therefore likely to result from the deregulation of gene promoter/enhancer activity rather than from altered protein coding function (Edwards et al., 2013; Srinivasan et al., 2016).

1.6. The rs12653946 SNP at the 5p15 locus is associated with prostate cancer risk

As mentioned before, most of the prostate cancer GWAS were performed in European populations. Takata et al. performed a GWAS in a Japanese population comprising 1,583 cases and 3,386 controls in stage 1, and 3,001 cases and 5,415 controls in the replication stage, confirming the risk association of eight previously known independent loci with prostate cancer in the Japanese population (Takata et al., 2010). In addition, this study reported five new susceptibility loci (rs13385191 – chr2:20751746, rs9600079 – chr13:72626140, rs12653946 – chr5:1948829, rs1983891 – chr6:41644405 and rs339331 – chr6:117316745), for three of which plausible candidate genes were identified: GPRC6A (G protein-coupled receptor, family C, group 6, member A), C2orf43 (open reading frame 43 of chromosome 2) and FOXP4 (forkhead box P4). The other two loci, the 5p15 locus (rs12653946, Odds Ratio (OR) 1.26 (1.20-1.33), p-value = 3.9 × 10⁻¹⁸) and the 13q22 locus, were reported to be present in “gene desert” regions (Takata, et al., 2010).
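The odds ratios and confidence intervals reported in this section summarise how much more frequent the risk allele is in cases than in controls. As a simplified sketch, the calculation below derives an allelic odds ratio and a Wald 95% confidence interval from a 2x2 table of allele counts. The counts are hypothetical, and the published estimates may have been obtained with regression models and covariate adjustment rather than this basic formula.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Allelic odds ratio with a Wald confidence interval.

    a: risk alleles in cases      b: non-risk alleles in cases
    c: risk alleles in controls   d: non-risk alleles in controls
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts (assumption), not data from any cited study.
or_value, ci_low, ci_high = odds_ratio_ci(a=1200, b=1800, c=1000, d=2000)
print(f"OR = {or_value:.2f} ({ci_low:.2f}-{ci_high:.2f})")
```

An interval that excludes 1 indicates that the allele frequency difference between cases and controls is unlikely to be due to chance alone at the chosen confidence level.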

A replication study was carried out by Batra et al. to investigate the possible association of the five GWAS SNPs with prostate cancer in an Australian sample set (Batra et al., 2011). The study comprised 1,357 prostate cancer patients and 1,403 healthy Australian male controls, which were genotyped using iPLEX Gold assays on the Sequenom MassARRAY platform. Apart from rs13385191, the minor allele frequencies (MAF) of the studied SNPs in Australian males were not significantly different from those in the Japanese population. This study found that rs12653946 at the 5p15 locus was significantly associated with prostate cancer risk (OR = 1.20 (1.07-1.34), p-value = 0.002). This suggested that rs12653946 at the 5p15 locus is not ethnicity-specific for the Japanese population, but is also associated with prostate cancer risk in a European population. An additional replication study was carried out by the Breast and Prostate Cancer Cohort Consortium (BPC3), which confirmed the strongest association for rs12653946 (OR = 1.41 (1.18–1.68), p-value = 1.33 × 10⁻⁶) using a large sample set of 7,956 prostate cancer cases and 8,148 controls (Lindstrom et al., 2012).

The association of rs12653946 with prostate cancer risk was also confirmed in a Chinese population (OR = 1.10 (1.06–1.16), p-value = 1.33 × 10⁻⁵) (Long et al., 2012). Another study in a Chinese population reported that the rs12653946 SNP homozygous genotype had an approximately three-fold risk of developing prostate cancer in smokers compared to non-smokers (Liu, Shi, et al., 2016). Moreover, rs12653946 is associated with aggressive benign prostatic hyperplasia in Chinese men (OR = 1.40 (1.04–1.88), p-value = 0.03) and with an elevated International Prostate Symptom Score (IPSS) after treatment (p-value = 0.01) (Qi et al., 2013). In addition, the Prostate cancer AssoCiation group To Investigate Cancer Associated aLteration in the genome (PRACTICAL) Consortium has identified rs10866527 as the most significant prostate cancer risk associated SNP at the 5p15 locus (OR = 1.08 (1.06-1.11), p-value = 1.1 × 10⁻⁸) by fine-mapping studies (Amin Al Olama et al., 2015). A study on the association of prostate cancer GWAS SNPs with clinical outcomes did not find any correlation between the rs12653946 SNP and PSA levels at diagnosis (Sullivan et al., 2015). Interestingly, a recent study by Penney et al. reported that the risk association of the GWAS SNP rs12653946 was in the opposite direction in ERG-fusion positive and negative tumours (OR = 0.69 (0.54-0.89), p-value = 0.004) (Penney et al., 2016). Thus, consideration of molecular subtypes of prostate cancer in risk association studies may improve our understanding of the aetiology of this disease.

GWAS SNPs act as markers to identify the locus that is related to the disease or phenotype, but the GWAS SNPs themselves may not be the actual functional SNPs. Therefore, any SNPs that are found in linkage disequilibrium (LD) with a GWAS SNP within the locus should be analysed for a functional role. Nguyen et al. performed an electrophoretic mobility shift assay (EMSA) and reporter vector assay to identify the functional variants at the 5p15 region that were in LD with the GWAS SNP (Nguyen et al., 2012). Of the 12 variants they examined, 8 SNPs, including the GWAS SNP, showed differential binding affinities to the nuclear proteins from LNCaP cells between their risk and non-risk alleles. Moreover, three of the SNPs exhibited an increased luciferase activity with the non-risk allele.
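The strength of LD between a GWAS SNP and a candidate functional SNP is commonly summarised by the squared correlation coefficient r². The sketch below computes r² from haplotype and allele frequencies using the standard definition D = p(AB) − p(A)p(B). The frequencies are hypothetical and serve only to illustrate the calculation; they do not describe any SNP pair at the 5p15 locus.

```python
def ld_r2(p_ab, p_a, p_b):
    """Squared correlation (r^2) between alleles A and B at two SNPs.

    p_ab: frequency of the haplotype carrying both allele A and allele B
    p_a:  frequency of allele A at the first SNP
    p_b:  frequency of allele B at the second SNP
    """
    d = p_ab - p_a * p_b  # LD coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Hypothetical frequencies (assumption) for two partially correlated SNPs.
r2_partial = ld_r2(p_ab=0.30, p_a=0.40, p_b=0.35)
# Identical allele and haplotype frequencies correspond to complete LD.
r2_complete = ld_r2(p_ab=0.40, p_a=0.40, p_b=0.40)
print(f"partial LD r2 = {r2_partial:.2f}, complete LD r2 = {r2_complete:.2f}")
```

An r² close to 1 means that the tag SNP is an almost perfect proxy for the second variant, so association data alone cannot distinguish which of the two is functional.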

Takata et al. initially stated that the rs12653946 locus represents a 20 kb block region, which contains no known genes (Takata, et al., 2010). However, an in-depth analysis of the locus by Batra et al. reported that this SNP falls in the intron of a putative gene, tojy (CTD-2194D22.4), which is predicted to code for an 8.1 kDa protein of 76 amino acids (Batra, et al., 2011) (Figure 1.3). The expression of CTD-2194D22.4 is detected in testis and not in prostate according to the GTEx tissue expression data. Moreover, there was no expression of CTD-2194D22.4 detected in prostate cancer cell lines or prostate cancer tissue samples from RNA-sequencing data available in our laboratory (data not shown). This locus encompasses additional credible candidate genes, telomerase reverse transcriptase (TERT) and cisplatin resistance-related protein CRR9p (CLPTM1L). However, the rs12653946 SNP was not in LD with the risk SNPs associated with these two genes (Batra, et al., 2011), suggesting that the rs12653946 SNP represents an independent prostate cancer risk locus. There are two genes of interest, IRX4 and its antisense lncRNA (IRX4lncRNA), that may be the target of the causative SNPs at this prostate cancer risk associated locus. IRX4 is a protein coding gene proximal to this GWAS SNP, and the rs12653946 SNP has been correlated with lower expression of the IRX4 transcript in prostate cancer cells (Xu et al., 2014). A recent study on lncRNAs at prostate cancer risk associated GWAS loci has reported an expression-genotype correlation between a novel lncRNA, IRX4lncRNA, and the rs12653946 SNP. This suggests that causative SNPs at the 5p15 locus may confer risk by regulating the expression of IRX4 and IRX4lncRNA.

Figure 1.3 - The prostate cancer risk associated 5p15 locus. Upper panel shows ~700 kb region surrounding the GWAS SNP, rs12653946, harbouring several cancer associated genes such as TERT and CLPTM1L derived from the UCSC Genome Browser. Genes on the positive strand are shown above the blue line. The middle panel is zoomed in to the positions of GWAS SNP and fine-mapping SNP. The lower panel shows the RNA-Seq wiggle plot from the LNCaP cell lines treated with 10 nM 5α dihydrotestosterone (DHT) or vehicle (EtOH) (unpublished data). Purple reads correspond to the positive strand, IRX4lncRNA and orange reads correspond to IRX4.

1.7. Iroquois Homeobox 4 (IRX4)

IRX4 belongs to the highly conserved homeodomain-containing Iroquois transcription factor family (IRX), whose members play fundamental roles in diverse developmental processes (Cheng et al., 2005; Gomez-Skarmeta et al., 2002; Matsumoto et al., 2004). Similar to most other homeodomain genes, IRXs were first discovered in invertebrates (Drosophila) and then in vertebrates (Xenopus, zebrafish, chicken, and mammals) (Gomez-Skarmeta et al., 1996). The human IRX complex is composed of six genes, found in two clusters of three genes each on chromosomes 5 (IRX1, 2 and 4) and 16 (IRX3, 5 and 6). IRX proteins contain the unique Iro-box motif, a conserved motif of 13 amino acid residues in the carboxyl-terminal region, as well as an atypical homeodomain with three extra amino acids between the first and second alpha helices, which groups them in the 3-amino-acid-loop-extension (TALE) family of transcription factors (Cavodeassi et al., 2001).

Homeobox genes, including IRXs, play critical roles in the developmental process and embryogenesis, including cell differentiation, cell-type specification, and organ formation (Cheng, et al., 2005; Gomez-Skarmeta, et al., 2002; Matsumoto, et al., 2004). Recent studies in the field of developmental biology have provided evidence of similarities between normal growth and tumorigenesis (Bhatlekar et al., 2014; Shah et al., 2010). This implies that developmental regulators, such as homeobox genes, may have a role in cancer pathogenesis. Emerging evidence details the involvement of IRXs in various cancers (Table 1-1). Understanding their involvement in cancer may identify new biologically meaningful diagnostic and therapeutic targets for cancer. Even though some members of the homeodomain family, including HOXs, HNFs and NANOGs (NKX genes), are well characterised for their role in various cancers (Abate-Shen, 2002; Samuel et al., 2005; Shah, et al., 2010), the mechanistic function of IRX proteins in tumorigenesis and their DNA binding sequence are still not fully explored.

Table 1-1 - Iroquois transcription factors in cancer

IRXs | Cancer | Function | Reference
IRX1 | MLL-r leukemia | Reciprocal gene expression with HOXA gene associated with poor prognosis free survival and outcome. IRX-1 dependent actions may promote stem cell compartment. | (Kuhn et al., 2016)
IRX1 | Osteosarcoma | IRX1 overexpression associated with the overexpression of its own promoter and altered migration, invasion and resistance to anoikis by upregulating CXCL14/NF-kB signalling. IRX1 hypomethylation is implicated as potential biomarker for metastasis. | (Lu, Song, et al., 2015; Lu & Wang, 2015)
IRX1 | Gastric cancer | Inhibited cell growth, invasion and tumorigenesis both in vitro and in vivo by targeting genes involved in cell proliferation and invasion. IRX1 may potentially target BDKRB2 or its effector PAK1. Hypomethylation and risk associated SNPs at IRX1 locus may be used as a biomarker for diagnosis and prognosis of gastric cancer. | (Guo et al., 2010; Jiang et al., 2011; Wang, Xu, et al., 2015)
IRX1 | Head and neck cancer | Commonly methylated and downregulated. IRX1 overexpression exhibited tumour suppressive potential by reducing mitotic activity and promoting apoptosis, maybe via interacting with components of the TGF-β pathway. | (Bennett et al., 2008; Bennett et al., 2009)
IRX1 | Pediatric embryonal and alveolar rhabdomyosarcomas | IRX1 hypomethylation along with seven other genes may be used to diagnose or risk stratify pediatric rhabdomyosarcoma patients. | (Mahoney et al., 2012)
IRX2 | Breast cancer | CpG islands in the IRX2 gene are more methylated in luminal A tumours. Highly expressed in malignant cell lines and may act as an oncogene. Inhibits cellular motility of breast cancer cells and repressed chemokine expression. | (Kadota et al., 2009; Kamalakaran et al., 2011; Werner et al., 2015)
IRX2 | Osteosarcoma | Overexpression of IRX2 increased cell proliferation and invasion by activating the PI3K/Akt signalling pathway, while knockdown of IRX2 reduced cell proliferation and invasion by activating the AKT/MMP pathway. IRX2 overexpressed in tumour tissues compared to normal samples and correlated with tumour progression and patient prognosis. | (Liu et al., 2015; Liu et al., 2014)
IRX2 | Kidney renal cell carcinoma | IRX2-TERT fusion transcript was identified and IRX2 promoter influenced the upregulation of TERT expression. | (Karlsson et al., 2015)
IRX2 | Acute myeloid leukemia (AML) | Higher survival was observed in heterozygous genotype carriers for IRX2: rs2897047 in adult and pediatric population with AML. | (Megias-Vericat et al., 2017)
IRX2 | Infant acute lymphoblastic leukemia | IRX2 expression along with FLT3 and TACC2 may improve the prediction of the event-free-survival. | (Kang et al., 2012)
IRX3 | Hepatocellular carcinoma | Potential oncogene and a direct target of miR-337. | (Wang, Zhuang, et al., 2016)
IRX3 | Colorectal cancer | One of the most upregulated transcription factors in adenoma tissues compared to healthy controls. Upregulation of IRX3 and IRX5 inversely correlates with the TGF-β response gene signature. Exhibited a positive correlation with panitumumab monotherapy resistance. | (Barry et al., 2016; Martorell et al., 2014; Sabates-Bellver et al., 2007)
IRX3 | Wilms tumour | Low expression correlated with advanced disease and poor outcome. | (Mengelbier et al., 2010)
IRX3 | Pheochromocytomas and paragangliomas | Overexpressed in malignant tumours. | (Evenepoel et al., 2015)
IRX3 | Prostate cancer | Hypermethylation of the 5′ CpG island at IRX3 suppressed gene expression. | (Morey et al., 2006)
IRX4 | Oral squamous cell carcinoma | One of the genes in a proposed 11 gene signature prognosticator of outcome in patients without nodal metastasis. | (Wang, Lim, et al., 2015)
IRX4 | Colorectal cancer | Colocalised with heat shock factor 4 (HSF4). | (Yang, Jin, et al., 2017)
IRX4 | Prostate cancer | Suppressed cancer growth via interaction with Vitamin D receptor. Expression correlated with a prostate cancer risk associated GWAS SNP rs12653946. | (Nguyen, et al., 2012; Xu, et al., 2014)
IRX5 | Colorectal cancer | Expression positively correlated with its antisense lncRNA, CRNDE. | (Liu, Zhang, et al., 2016)
IRX5 | Prostate cancer | Regulates prostate cancer cell proliferation and cell cycle and regulated by 1,25-dihydroxyvitamin D3. IRX5 is differentially methylated in prostate cancer patients based on Gleason grade. | (Myrthue et al., 2008; Wu et al., 2016)

IRX4 is the most divergent member of the vertebrate IRX family (Kim, Rosen, et al., 2012) and is well studied for its role in heart development. As a ventricular-specific transcription factor, IRX4 has been reported to be highly expressed in the ventricular myocardium but not in the atria or the outflow tract (Bao et al., 1999). In addition, IRX4 positively regulated ventricular-chamber-specific gene expression by activating the ventricular myosin heavy chain-1 (VMHC1) gene, and by suppressing the expression of atrial myosin heavy chain-1 (AMHC1/slow MyHC3) through the formation of an inhibitory complex with the vitamin D receptor (VDR) and retinoid X receptor (RXR) (Bruneau et al., 2000; Wang et al., 2001). Moreover, two potential congenital heart disease associated mutations, p.Asn85Tyr and p.Glu92Gly, were identified in the IRX4 coding region in a Chinese population, and these mutations are reported to affect IRX4 interaction with RXRA (Cheng et al., 2011). In addition, a recent study performed by Molck et al. detected a clinically significant deletion of the IRX4 locus in 10% of patients (8/78) with congenital heart disease (Molck et al., 2017).

IRX4 has recently been proposed for inclusion in an eleven gene molecular signature for extra-capsular spread in oral squamous cell carcinoma, which can serve as a predictor of outcome in patients without nodal metastases (Wang, et al., 2015). IRX4 has been identified as a potential candidate gene in prostate cancer after GWAS identified the 5p15 locus to be associated with prostate cancer risk. IRX4 expression was correlated with the genotype of the prostate cancer risk associated GWAS SNP rs12653946 (Xu, et al., 2014). IRX4 has recently been shown to suppress prostate cancer proliferation through interaction with the VDR (Nguyen, et al., 2012). However, no published studies have explored the detailed mechanistic role of IRX4.

1.8. Antisense long non-coding RNA, IRX4lncRNA, at 5p15 locus

A study on long non-coding RNAs (lncRNAs) at the prostate cancer risk associated GWAS loci has identified the antisense lncRNA, IRX4lncRNA, as one of the top 20 candidate lncRNAs associated with prostate cancer risk (Guo et al., 2016). In this study, IRX4lncRNA was reported to be differentially expressed in prostate tumour samples compared to normal tissues. In addition, the expression of this lncRNA was correlated with the rs12653946 SNP genotype (Guo, et al., 2016). Interestingly, IRX4lncRNA was found to be increased following castration in the LNCaP xenograft tumour progression mouse model (Nelson, unpublished). Moreover, expression of this lncRNA was observed to be down-regulated by DHT treatment in LNCaP cells in the strand-specific RNA-Sequencing data available in our lab (Figure 1.3). However, the function of this lncRNA in prostate cancer pathogenesis is still unknown.

1.9. Long non-coding RNAs (lncRNAs)

It is estimated that there are only 20,000 - 25,000 protein coding genes in the human genome, which account for less than 2% of the human genome (Anastasiadou et al., 2018; Nie et al., 2012). The simple nematode, Caenorhabditis elegans, has a similar number of protein-coding genes to humans, suggesting that more developmentally complex organisms such as humans might rely on non-protein-coding regions of the genome (Mattick, 2011). This suggests that, apart from the role of protein-coding DNA, the non-protein coding DNA which was previously thought to be “junk DNA” is important for driving evolution (Mattick, 2011; Palazzo et al., 2015; Taft et al., 2007). This non-protein coding DNA can regulate the expression of protein-coding genes and maintain the 3D structure of the genome by serving as a scaffold for transcription factors. Alternatively, some non-coding DNA has been found, using high-throughput next-generation sequencing platforms, to be transcribed as non-protein coding RNA (ncRNA) (Carninci et al., 2005). In most cases, the function of these ncRNAs is unknown and therefore, the functional characterisation and identification of molecular pathways in which these ncRNAs are involved remain a challenge.

Even though ncRNAs are not translated into protein, they play a vital part in human complexity, from maintaining normal cellular function to playing a broader role in human diseases including cancer (Alexander et al., 2010; Li, Xuan, et al., 2013). Earlier, the terms ‘oncogene’ and ‘tumour suppressor gene’ referred only to protein-coding genes, but they now also include ncRNAs. Both ncRNAs and protein machineries that are involved in the development of diseases have become targets of novel therapeutic approaches (Esteller, 2011). Based on transcript size, these ncRNAs are grouped into two major classes: small non-coding RNAs (<200 nt) and lncRNAs (>200 nt). The small ncRNA class comprises miRNAs, tRNAs, snRNAs, siRNAs and piRNAs (Cao, 2014). miRNAs are approximately 22 nt long ncRNAs which regulate gene expression by post-transcriptional modifications and have received attention recently for their biomarker and therapeutic potential (Matin et al., 2016). Interestingly, lncRNAs have been recently identified as important mediators in many diseases including cancer (Clark et al., 2011; Zhang, Pitchiaya, et al., 2018).

The biogenesis of lncRNAs is similar to that of mRNAs. Most lncRNAs are transcribed by RNA polymerase II, while some are transcribed by RNA polymerase III. Most lncRNAs undergo post-transcriptional modifications such as splicing, polyadenylation and 5′ capping, like protein coding RNAs (Chen, 2016). However, these molecules have several short open reading frames (sORFs) and very little protein coding potential, which discriminates them from mRNA (Rinn et al., 2012). Interestingly, recent research has identified several putative coding sORFs, suggesting that lncRNAs may be translated into micropeptides with a functional role (Andrews et al., 2014; Matsumoto et al., 2017; Nelson, Makarewich, et al., 2016).

Based on their origin, these lncRNAs can be classified as intronic, exonic, intergenic, intragenic, antisense, 3′ and 5′ UTR, promoter associated (paRNA), and enhancer associated (eRNA) (Figure 1.4). Most lncRNAs are localised in the nucleus, while some are found in both the nucleus and cytoplasm, and some are specifically distributed in the cytoplasm. These lncRNAs play a functional role in gene expression regulation by either cis (targeting genomically local genes) or trans (targeting distant genes) action (Nie, et al., 2012). One of the well-studied lncRNAs, X-inactive specific transcript (Xist), is involved in X-chromosome inactivation by epigenetic silencing, which is mediated through cis-regulation (Brown et al., 1991; Brown et al., 1992; van Bemmel et al., 2016). On the other hand, HOX Antisense intergenic RNA (HOTAIR) trans-regulates the chromosomal domain in the HOXD locus (Hajjari et al., 2015; Rinn et al., 2007).

Figure 1.4 – Origins of lncRNAs. LncRNAs originate from various positions in the genome. They can be transcribed from intronic, exonic, intergenic, intragenic and antisense regions of protein coding genes or generated from promoter, enhancer or UTR regions of genes. (Figure reproduced from Nie et al., 2012).

As prostate cancer is a heterogeneous disease, a single biomarker is not sufficient for diagnosis. Previous studies (Hessels et al., 2003; Hessels et al., 2013; Malik et al., 2014; Ren et al., 2013; Wang, Ren, et al., 2014) have proposed that a panel of prostate cancer specific lncRNAs could be used as biomarkers for prostate cancer risk along with PSA for better prostate cancer management. Prostate cancer antigen 3 (PCA3), a well characterised lncRNA overexpressed in prostate tumours compared to healthy prostate, is the first FDA approved urinary biomarker for prostate cancer detection (Deng et al., 2017). In addition, Second Chromosome Locus Associated with Prostate 1 (SChLAP1) was reported to have prognostic biomarker potential, while metastasis-associated lung carcinoma transcript 1 (MALAT-1) was suggested to have a predictive function for prostate cancer progression (Smolle et al., 2017).

Aberrant expression of lncRNAs has been observed in various cancer types, such as breast, prostate, liver, colorectal and bladder cancer (Chen et al., 2017), and can be used as an indicator of different cancer stages and may predict cancer progression or regulate tumour related signalling pathways (Misawa et al., 2017). Approximately 1800 lncRNAs were found to be expressed in prostate tissues in a study performed by Prensner et al. using transcriptome sequencing on a cohort of 102 prostate tissues and cell lines, including 121 lncRNAs that are transcriptionally dysregulated in prostate cancer (Prensner, Iyer, et al., 2011). These 121 lncRNAs are named prostate cancer associated transcripts (PCATs), and it is suggested that these PCATs may have functional roles in conferring prostate cancer risk. PCAT1 and PCAT18 are upregulated in prostate cancer and exhibit tissue-specific expression (Crea et al., 2014; Prensner, Iyer, et al., 2011). PCAT1 functions as a transcriptional repressor of genes involved in mitosis and cell division, including the tumour suppressor gene, BRCA2 (Prensner, Chen, Iyer, et al., 2014). PCAT18 upregulation is triggered by AR activation. The silencing of PCAT18 inhibited cell proliferation and triggered caspase 3/7 activation and, most interestingly, the inhibition of PCAT18 had no effect on non-neoplastic cells, suggesting that novel targeted therapeutics such as antisense oligonucleotides could be developed against this lncRNA (Crea, et al., 2014). In addition to the lncRNAs discussed above, other lncRNAs identified to have a role in prostate cancer or to have potential as biomarkers are summarised below (Table 1-2, Table 1-3).


These studies suggest that lncRNAs might play a role in prostate cancer progression by regulating cancer-associated signalling pathways. Furthermore, most of these lncRNAs are tissue-specific and show cancer-specific expression, and can therefore be used as biomarkers for both tissue-of-origin tests and cancer diagnostics (Prensner & Chinnaiyan, 2011). However, a few of the lncRNAs mentioned have also been studied for their role in other types of cancer. For example, the tumour suppressor role of GAS5 has also been discussed in breast cancer (Mourtada-Maarabouni et al., 2009) and renal cell carcinoma (Qiao et al., 2013). Polymorphisms in H19 were suggested as low-risk markers in bladder cancer (Verhaegh et al., 2008), and H19 was proposed as a target for therapy (Amit et al., 2010). PTENP1 was reported to have a tumour suppressive function in colon cancer (Swami, 2010), and the role of SRA has also been discussed in breast cancer (Leygue et al., 1999).

In summary, lncRNAs are important regulators of gene expression that control the biological pathways of cell proliferation, migration and apoptosis, which play a crucial role in cancer development and progression. The modulation of lncRNA expression has been shown to affect tumour formation, progression and metastasis. Furthermore, the rapid increase in our understanding of the mechanisms of action of lncRNAs as critical regulators of these cellular processes in cancer increasingly suggests that lncRNAs may represent a poorly characterised layer of cancer biology (Prensner & Chinnaiyan, 2011; Zhang, Pitchiaya, et al., 2018) and thus could be targeted for prostate cancer therapy along with conventional treatments.


Table 1-2 - LncRNAs associated with prostate cancer and their functional roles

LncRNA | Functional role | Mechanism of action | Reference
Prostate Cancer Associated Transcript 1 (PCAT 1) | Promotes cell proliferation through multiple pathways. | Post-transcriptionally modulates 3'UTR of cMyc. Suppresses BRCA2 expression. Upregulates FSCN1 by acting as a competing endogenous RNA to FSCN1 targeting miRNA, miR-145-5p. Interacts with AR and LSD1. | (Guo, et al., 2016; Prensner, Chen, Han, et al., 2014; Prensner, et al., 2014; Prensner & Chinnaiyan, 2011; Xu et al., 2017)
Prostate Cancer Associated Transcript 18 (PCAT 18) | Promotes cell proliferation, migration and invasion. | Knockdown triggers caspase 3/7 activation. | (Crea, et al., 2014)
Antisense non-coding RNA in the INK4 locus (ANRIL) | Epigenetic transcriptional repression. Promotes cell proliferation and migration of prostate cancer cells. | Represses INK4A/INK4B through direct binding to a member of polycomb complex 1 (PRC1), CBX7 and a member of polycomb complex 2 (PRC2), SUZ12. Regulates let-7a/TGF-β1/SMAD signalling pathway. | (Kotake et al., 2011; Yap et al., 2010; Zhao et al., 2017)
C-Terminal Binding Protein-Antisense (CTBP1-AS) | Promotes both hormone-dependent and castration-resistant tumour growth and cell cycle progression. | Represses the expression of its cis gene, CTBP1 by recruiting the RNA-binding transcriptional repressor, PSF together with histone deacetylases. Exhibits global androgen-dependent functions by inhibiting tumour-suppressor genes via PSF-dependent mechanism. | (Takayama et al., 2013)
Prostate Cancer Associated Non-Coding RNA 1 (PRNCR1) / Prostate Cancer Associated Transcript (PCAT 8) | Promotes cell proliferation. | AR dependent gene activation events and progression to CRPC. | (Chung et al., 2011; Wang, Shi, et al., 2013; Yang, Lin, et al., 2013)
Prostate-specific transcript 1 (PCGEM1) | Involved in cell proliferation and colony formation, regulation of apoptosis. | Overexpressed in prostate cancer. Involved in AR dependent gene activation events and progression to CRPC with conflicting studies published by (Prensner, Sahu, et al., 2014). | (Fu et al., 2006; He et al., 2014; Hirsch et al., 2015; Petrovics et al., 2004; Srikantan et al., 2000; Xue et al., 2013; Yang, et al., 2013)
Growth Arrest-Specific 5 (GAS5) | Promotes apoptosis of prostate cancer cells. Effectiveness of chemotherapies may be improved with increased GAS5 expression. | Acts as a host gene for Small nucleolar RNA (snoRNA) and positively regulates mTOR inhibitor action. | (Pickard et al., 2013; Yacqub-Usman et al., 2015)
SWI/SNF Complex Antagonist Associated With Prostate Cancer 1 (SChLAP1) / Prostate Cancer Associated Transcript 114 (PCAT 114) | Regulates cancer cell invasiveness and metastasis. | Antagonizes the tumour-suppressive functions of the SWI/SNF (SWItch/Sucrose NonFermentable) complex therefore promotes the development of fatal cancer at least in a subset of prostate cancer. | (Mehra et al., 2014; Prensner et al., 2013)
Nuclear Enriched Abundant Transcript 1 (NEAT-1) | Promotes tumour growth. | Alters the epigenetic landscape of target gene promoters to favour transcription. Over expression causes resistance to both androgens and anti-androgens in prostate cancer cells. Upregulates the AKT via SRC3/IGF1R pathway. | (Chakravarty et al., 2014; Xiong et al., 2018)
Protein sprouty homolog 4 (SPRY4)-intronic Transcript 1 (SPRY4-IT1) | Regulates cell proliferation, invasion and apoptosis. | Unknown. | (Lee et al., 2014)
LncRNA H19 (H19) | Represses cell migration in cells. | Upregulates miR-675 expression which represses TGFβI mRNA expression. | (Zhu et al., 2014)
Phosphatase and tensin homolog pseudogene 1 (PTENP1) | Suppresses cancer cell growth. | Acts as decoy for biomolecules, preventing them from fulfilling their cellular functions. Positively regulates the tumour suppressor gene, PTEN. | (Poliseno et al., 2010)
HOX transcript antisense RNA (HOTAIR) | Increases prostate cancer cell proliferation, migration and invasion and induces apoptosis and cell cycle arrest. | Enhances the AR-mediated transcriptional program by directly binding and stabilising AR. | (Chiyomaru et al., 2013; Zhang, Zhao, et al., 2015)
Prostate cancer-up-regulated long noncoding RNA (PlncRNA1) | Modulates cancer cell proliferation and induces apoptosis in both androgen-dependent and androgen-independent prostate cancer cells. | Plays a role in the transactivation activity of AR. | (Cui et al., 2013)
Linc00963 | Modulates cell proliferation, motility, invasion and cell apoptosis. | Promotes the epidermal growth factor receptor (EGFR) expression and phosphorylation levels of protein kinase B (AKT) and therefore involved in the prostate cancer transition from androgen-dependent to androgen-independent. | (Wang, Han, et al., 2014)
Prostate cancer associated transcript 5 (PCAT5) | Involved in cell growth, invasion, apoptosis, colony formation and migration. | Unknown. | (Ylipaa et al., 2015)
TRPM2-AS | Modulates apoptosis. | Knockdown activates the sense gene, TRPM2 and increases intracellular hydrogen peroxide. | (Orfanelli et al., 2015)
Differentiation antagonizing non-protein coding RNA (DANCR) | Modulates invasion and migration. | Downregulates the expression of metastasis inhibitor TIMP2/3. Suppresses the expression of cell-cycle inhibitor p21. | (Jia et al., 2016; Lu et al., 2018)
PCAT29 | Inhibits cancer cell migration and proliferation. | Unknown. | (Malik, et al., 2014; Sakurai et al., 2015)
Maternally Expressed Gene 3 (MEG3) | Induces apoptosis of prostate cancer cells. | Represses BCL-2 expression, enhances Bax expression and activates caspase 3. | (Luo et al., 2015)
Plasmacytoma variant translocation 1 (PVT1) | Regulates cell viability and apoptosis. Promotes tumour growth in-vivo. | Methylates miR-146a. Knockdown upregulates the expression of cleaved caspase-3 and cleaved caspase-9. | (Liu, Fang, et al., 2016; Yang, Li, et al., 2017)
Colon cancer-associated transcript (CCAT2) | Involved in cell migration and invasion. | Modulates epithelial-mesenchymal transition via modulating NCAD, vimentin and ECAD expression. | (Zheng et al., 2016)
Cytokine signalling 2-antisense transcript 1 (SOCS2-AS1) | Promotes cell growth and represses apoptosis. | Regulates the expression of genes involved in apoptosis pathway, including TNSF10 and sensitise cancer cells to docetaxel treatment. | (Misawa et al., 2016)
lncRNA-ATB | Promotes cancer cell proliferation. | Activates ERK and PI3K/AKT signalling pathway and affects ZEB1 and ZNF217 expression level. | (Xu et al., 2016)
Lnc-MX1-1 | Promotes cell proliferation and invasion. | Unknown. | (Jiang et al., 2016)
PART-1 | Knockdown inhibits cell proliferation and increased apoptosis. | Inhibits Toll-like receptor (TLR) pathway. | (Sun et al., 2017)
LincRNA-p21 | Inhibits prostate cancer cell proliferation and colony formation. | Knockdown induces PKM2 expression and activates glycolysis. | (Wang, Ruan, et al., 2017; Wang, Xu, et al., 2017)


Table 1-3 – Evidence for the role of LncRNAs as biomarkers in prostate cancer

LncRNAs | Description | Reference
Prostate Cancer Associated Transcript (PCAT29) | The low expression of PCAT29 can be correlated with poor prognostic outcomes. A subset of patients at higher risk for disease recurrence can be designated with the loss of PCAT29. | (Malik, et al., 2014)
Prostate Cancer Associated Transcript 18 (PCAT 18) | High expression in plasma samples of localised and metastatic patients compared to healthy individuals. | (Crea, et al., 2014)
Prostate-specific transcript 1 (PCGEM1) | Associated with high-risk prostate cancer. Polymorphisms in PCGEM1 correlated with prostate cancer risk. | (Deng, et al., 2017)
X-inactive specific transcript (XIST) | Potential role as a biomarker for early diagnosis and monitoring of cancer in men using serum since the cases with pronounced hypomethylation tend to be more aggressive. | (Laner et al., 2005; Song et al., 2007)
Metastasis associated lung adenocarcinoma transcript 1 (MALAT-1) / Nuclear-enriched abundant transcript 2 (NEAT-2) | Involved in increased cell growth, invasion and migration, and decreased apoptosis rate. In addition, the correlation of higher MALAT-1 expression with high Gleason score, PSA, tumour stage and CRPC suggests its role as a biomarker. | (Ren, et al., 2013; Wang, et al., 2014)
SWI/SNF Complex Antagonist Associated With Prostate Cancer 1 (SChLAP1) / Prostate Cancer Associated Transcript 114 (PCAT 114) | Overexpression predicts lethal prostate cancer. Correlates with poor clinical outcome after radical prostatectomy in benign prostate cancer. | (Mehra et al., 2016; Prensner, et al., 2013)
Prostate cancer antigen 3 (PCA3) | First FDA-approved urinary biomarker for prostate cancer in men with elevated serum PSA and a previous negative biopsy. Overexpressed in more than 95% of primary prostate cancer specimens and metastasis. The PCA3 score is not influenced by age, inflammation, prostate volume, or 5α-reductase inhibitors. | (Hessels, et al., 2003; Hessels, et al., 2013; van Gils et al., 2007)
Plasmacytoma variant translocation 1 (PVT1) | Higher expression correlates with poor prognosis and advanced tumour stage. | (Yang, et al., 2017)
LncRNA FR0348383 (FR0348383) | Suggested as a novel urinary biomarker after post-DRE for the detection of prostate cancer. | (Zhang, Ren, et al., 2015)
Protein sprouty homolog 4 (SPRY4)-intronic Transcript 1 (SPRY4-IT1) | Overexpressed in urine samples of prostate cancer patients compared to healthy controls. | (Lee, et al., 2014)
Prostate cancer associated transcript 14 (PCAT14) | High expression associated with a low grade tumour and expression inversely correlates with Gleason Grade. | (Shukla et al., 2016)
TRPM2-AS | Associated with poor clinical outcome. | (Orfanelli, et al., 2015)
HCG11 | Expression correlates with age, preoperative PSA level, Gleason score and biochemical recurrence. Low expression associated with poor survival of prostate cancer patients. | (Zhang et al., 2016)
LncRNA LOC400891 (LOC400891) | Expression predicts biochemical recurrence-free survival. | (Wang, Cheng, et al., 2016)
Colon cancer-associated transcript (CCAT2) | High expression associated with poor recurrence-free survival and overall survival. | (Zheng, et al., 2016)
Lnc-MX1-1 | Associated with PSA level, Gleason score and recurrence-free survival. | (Jiang, et al., 2016)
LOC440040 | Higher expression is correlated with advanced disease and poor survival outcome. | (Zhang et al., 2017)
Long intergenic non-protein coding RNA 1296 (LINC01296) | Correlated with preoperative PSA, Gleason score, tumour stage and lymph-node metastasis. Higher expression associated with shorter biochemical recurrence-free survival. | (Wu et al., 2017)


Although mechanistic studies of lncRNAs are at an early stage, the considerable variability in lncRNA function can be illustrated through well-characterised lncRNAs. Some lncRNAs are suggested to be involved in post-transcriptional regulation, such as regulation of alternative splicing, mRNA stability, degradation, trafficking and translational efficiency of mRNA (Kim & Sung, 2012; Nie, et al., 2012), mostly by homologous pairing of lncRNA with mRNA (Figure 1.5 (a)). For instance, MALAT1 has been shown to have an important role in regulating alternative splicing (Bernard et al., 2010; Tripathi et al., 2010). NRON, a ncRNA repressor of the nuclear factor of activated T cells (NFAT), produces variant transcripts by alternative splicing of three exons. NRON is reported to modulate the nuclear trafficking of NFAT and to be specific for NFAT translocation (Imam et al., 2015; Willingham et al., 2005).

Figure 1.5 - Possible lncRNA targeting mechanisms. (a) Homologous pairing – LncRNA can target RNA through direct (RNA-RNA) pairing and therefore alter the translational efficiency of mRNA. (b) RNA-DNA hybrid – LncRNA can form RNA-DNA duplexes and triplexes by binding to DNA through sequence complementarity. If these RNA-DNA complexes are formed at promoter or enhancer regions, the transcriptional activity of the DNA is altered. (c) RNA structure mediated interaction – LncRNA can form secondary and tertiary structures which provide more complex structures for lncRNA targeting. Some lncRNAs can form hairpin structures which enable them to bind proteins and alter the interaction between proteins and DNA, and therefore alter the transcriptional activity of the DNA. (d) Protein linker – Proteins which have nucleic acid binding domains link lncRNAs to target loci. As shown in the figure, lncRNA can bind to proteins and bring them into the vicinity of the DNA so that its transcriptional activity is modified. (Figure reproduced from (Hung et al., 2010)).


Some lncRNAs transcriptionally regulate target genes by DNA-RNA pairing (Figure 1.5 (b)). For example, paRNA and eRNA are categorized under this function (Nie, et al., 2012). The 3.8 kb antisense paRNA, Evf, associates with Dlx-2 in trans to increase the transcriptional activity of Dlx-5/6 (members of the Dlx/dll homeodomain-containing protein family) (Feng et al., 2006; Lee, 2012). The association of another paRNA with the PRC2 component SUZ12, and its recruitment to the target gene promoter, causes gene silencing and therefore represses gene transcription in cis (Kanhere et al., 2010). An example of an eRNA is ncRNA-a3, which regulates the expression of its flanking gene Tal1/SCL (a key regulator in hematopoiesis) (Orom et al., 2010).

Moreover, some lncRNAs can form secondary and tertiary structures, such as hairpins, that alter the binding of proteins to DNA (Figure 1.5 (c)). LncRNA Gas5 belongs to this group: it serves as a decoy for the Glucocorticoid Receptor (GR) and prevents the GR from binding to its responsive elements in DNA (Kino et al., 2010). LncRNAs can also bind to proteins which have nucleic acid binding domains and bring them into close vicinity of target DNA (Figure 1.5 (d)). For example, the telomere repeat factor TRF2 forms a stable complex with telomere-repeat encoding RNA and telomere DNA repeats (Bilaud et al., 1997; Deng et al., 2009). In addition, some lncRNAs act as competing endogenous RNAs (ceRNAs) or RNA sponges by interacting with miRNAs to inhibit their regulatory effect on target mRNAs (Schmitt et al., 2016). For instance, OCT4-pg4, a pseudogene of OCT4, competes with miR-145 to regulate OCT4 expression (Wang, Guo, et al., 2013). However, the role of lncRNAs as miRNA competitors is debated because most of these lncRNAs are present at low levels compared to mRNAs (Dykes et al., 2017; Klinge, 2018).

1.10. Summary and knowledge gap

Prostate cancer is the second most common cause of cancer death in Australian men. Recent prostate cancer GWAS found the rs12653946 SNP at the 5p15 locus to be significantly associated with prostate cancer risk in multi-ethnic populations. The IRX4 gene is the only protein-coding region proximal to this GWAS SNP, and the rs12653946 SNP genotype has been correlated with lower expression of the IRX4 transcript in prostate cancer. Interestingly, expression of a novel lncRNA, IRX4lncRNA, was observed on the opposite strand of IRX4 in prostate cancer. Similar to IRX4, expression of this lncRNA has been correlated with the rs12653946 SNP genotype. Both IRX4 and IRX4lncRNA expression were found to be regulated by androgens in LNCaP cells. However, the functional role of these two possible candidate genes in prostate cancer pathogenesis and the regulation of their expression in prostate cancer cells are not fully known.

1.11. Hypothesis and Aims

Prostate cancer risk-associated SNPs at the 5p15 locus confer prostate cancer risk by regulating the expression of IRX4 and IRX4lncRNA, and these two genes play a functional role in prostate cancer aetiology. To address this hypothesis, the following aims were generated:

1. To determine the function and mechanism of action of IRX4 in prostate cancer aetiology
2. To determine if the novel lncRNA, IRX4lncRNA, plays a role in prostate cancer pathology
3. To determine the androgen-mediated regulation of IRX4 and IRX4lncRNA expression in prostate cancer cells


Chapter 2: Materials and Methods

This chapter includes the general materials and methods used for tissue culture, cloning, transfection and gene expression analysis to address the aims of the project. It also includes the reagents and instruments along with their manufacturing details. Other specific materials and methods are included in the relevant chapters.

2.1. Cell culture

A panel of cell lines representing prostate cancer (LNCaP, VCaP, DuCaP, C4-2B, PC3, DU145, RWPE-2, 22RV1), benign prostate (BPH1, RWPE-1) as well as an immortalized prostate stromal cell line (WPMY1) was used in this study. All cell lines were obtained from the American Type Culture Collection (ATCC). RWPE-1 and RWPE-2 cell lines were grown in Keratinocyte-SFM (1X) (K-SFM, Catalog number - 17005-042), whereas the rest of the cell lines were grown in RPMI1640 (1X) with no phenol red (Life Technologies, Catalog number - 11835-030) supplemented with either 5% or 10% fetal bovine serum (FBS, Sigma, Catalog number - F2442). Cells were passaged at 70 – 80% confluence. RWPE-1 and RWPE-2 cell lines were detached with TrypLE™ Select Enzyme (1X) (Life Technologies, Catalog number - 12563-011); other cell lines were detached with Trypsin/EDTA Solution (TE, Life Technologies, Catalog number - R-001-100). The cell lines were authenticated by short tandem repeat (STR) profiling and tested negative for Mycoplasma.

2.2. Genomic DNA samples for genetic association studies

Genotyping data were obtained from the PRACTICAL Consortium for 46,939 cases and 27,910 controls (Schumacher, et al., 2018). This included samples from Queensland men (n=3100), consisting of 350 patients recruited via collaborations with consultant urologists as a retrospective study, 2000 patients from the Australian Prostate Cancer BioResource (APCB) and 750 patients (46-81 years) recruited in collaboration with The Cancer Council Queensland community-based prostate cancer Supportive Care and Patient Outcomes Project (ProsCan). A total of 1300 cancer-free participants were used as controls, including 450 age- and postal code-matched healthy male controls recruited through the Electoral Roll to complement participants in the ProsCan study, and 850 age-selected male controls recruited through the Australian Red Cross Blood Services. Ethical clearance was provided by the relevant authorities for this study, and informed consent was obtained from all cases and controls.

2.3. RNA isolation from FFPE prostate tumour tissues and matched controls

Samples for expression analysis included 50 formalin-fixed, paraffin-embedded (FFPE) prostatic tissues and their adjacent normal tissues as controls, from prostate cancer patients, received from the APCB. Tissue blocks containing the tumour cells were serially sectioned (20 μm), transferred to glass slides and stained with haematoxylin and eosin (H&E). Tumour areas were marked by two pathologists. Macrodissection of the marked areas was performed manually using a sterile injection needle, and the tissue was deparaffinised using deparaffinisation solution (Qiagen, Catalog number – 19093). RNA extraction was performed using the miRNeasy FFPE kit (Qiagen, Catalog number – 217504) according to the manufacturer’s instructions. RNA quality and quantity were measured using a NanoDrop™ 1000 (Thermo Scientific, Biolab, Scoresby, VIC, Australia).

2.4. RNA isolation from cell lines

Total RNA was extracted from prostate cancer cells using either the RNeasy Mini Kit (Qiagen, Catalog number - 74106) or the Isolate II RNA Mini Kit (Bioline, Catalog number - BIO-52073) according to the standard protocol. DNase digestion (Qiagen, Catalog number - 79254) was performed on the column during the extraction process. RNA concentration and purity were measured using a NanoDrop™ 1000 (Thermo Scientific, Biolab, Scoresby, VIC, Australia).

2.5. cDNA synthesis

1 μg of RNA was reverse transcribed to cDNA using SuperScript® III Reverse Transcriptase (Invitrogen, Catalog number - 18080-044). Briefly, RNA was diluted to 10 µl and incubated with 200 nM of random hexamers (Invitrogen, Catalog number - N8080127) and 1 mM dNTPs (Invitrogen, Catalog number – 18427013) at 65ºC for 5 minutes, followed by incubation on ice for at least 1 minute. First strand synthesis was performed using 1 Unit of SuperScript III reverse transcriptase, 1 Unit of RNaseOUT (Invitrogen, Catalog number - 10777019), 2 mM dithiothreitol (DTT), and 1X first strand synthesis buffer in a total volume of 20 µl. The reaction mix was incubated at room temperature for 5 minutes, followed by incubation at 50ºC for 50 minutes, and inactivated by heating at 80ºC for 15 minutes. The cDNA was diluted to 100 µl before being used as template for PCR.
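The amount of template carried into each downstream reaction follows from the dilutions described above. The short calculation below is an illustrative sketch only, not part of the original protocol: the function name is hypothetical, it assumes complete conversion of RNA to cDNA, and it assumes that the 1:5 dilution mentioned in Section 2.7 means one part of the 100 µl cDNA stock in five parts total.

```python
def rna_equivalent_ng(rna_input_ng, stock_volume_ul, dilution_factor, template_volume_ul):
    """Return the input-RNA equivalent (ng) added to one PCR reaction.

    Assumes 100% reverse transcription efficiency (an idealisation).
    """
    stock_ng_per_ul = rna_input_ng / stock_volume_ul       # cDNA stock after dilution to 100 µl
    working_ng_per_ul = stock_ng_per_ul / dilution_factor  # optional further dilution
    return working_ng_per_ul * template_volume_ul


# RT-PCR (Section 2.6): 1 µl of the 100 µl stock made from 1 µg (1000 ng) RNA
rt_pcr_template = rna_equivalent_ng(1000, 100, 1, 1.0)

# qRT-PCR (Section 2.7): 2 µl of a 1:5 dilution (assumed to be of the same stock)
qpcr_template = rna_equivalent_ng(1000, 100, 5, 2.0)

print(rt_pcr_template, qpcr_template)
```

Under these assumptions each RT-PCR reaction receives the equivalent of about 10 ng of input RNA and each qRT-PCR reaction about 4 ng.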

2.6. Reverse transcription polymerase chain reaction (RT-PCR)

RT-PCR was performed with a reaction comprising 1X PCR buffer, 1.5 mM MgCl2, 0.2 mM dNTPs, 0.2 µM each of forward and reverse primers (Sigma Aldrich), 1 U Platinum™ Taq DNA Polymerase (Catalog No - 10966018, Invitrogen) and 1 µl of cDNA template. The samples were amplified on a Mastercycler® nexus machine (Eppendorf, North Ryde, NSW, Australia) using the following cycling conditions: 94ºC for 4 minutes; 30-40 cycles of 94ºC for 30 seconds, 60ºC for 30 seconds and 72ºC for 1 minute per kb of product; and a final extension step of 72ºC for 8 minutes. Samples mixed with loading dye (NEB) were loaded onto 0.7-2% agarose gels (Bioline, Alexandria, NSW, Australia) prepared in Tris-borate-EDTA (TBE) buffer (89 mM Tris base, 89 mM Borate, 2 mM EDTA) and containing 0.5 µg/mL ethidium bromide (Invitrogen™). Approximately 0.5 µg of 1 kb ladder or 100 bp ladder (both from NEB) was loaded to compare the size of the DNA products. Images were captured with the gel documentation system QUANTUM ST5 (Fisher Biotec, Wembley, WA, Australia).
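Because the extension step scales with product length, the total programmed hold time depends on both the cycle number and the amplicon size. The following is a minimal sketch of that arithmetic using the step durations stated above; the function name is hypothetical, and instrument ramp times between temperatures are ignored, so real run times will be longer.

```python
def pcr_hold_time_minutes(cycles, product_kb):
    """Sum the programmed hold times (minutes) for the RT-PCR programme.

    Ramp times between steps are not included.
    """
    initial_denaturation = 4.0      # 94ºC for 4 minutes
    denaturation = 0.5              # 94ºC for 30 seconds
    annealing = 0.5                 # 60ºC for 30 seconds
    extension = 1.0 * product_kb    # 72ºC for 1 minute per kb of product
    final_extension = 8.0           # 72ºC for 8 minutes
    per_cycle = denaturation + annealing + extension
    return initial_denaturation + cycles * per_cycle + final_extension


# Example: a 0.5 kb product amplified for 35 cycles
print(pcr_hold_time_minutes(35, 0.5))
```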

2.7. Quantitative RT-PCR (qRT-PCR)

Quantitative RT-PCR was performed in MicroAmp® Optical 384-Well Reaction Plates with barcode (Applied Biosystems, Catalog number - 4309849) using the ViiA7 Real-Time PCR system (Applied Biosystems). Each reaction contained SYBR Green PCR Master Mix (Life Technologies, Catalog number - 4309155; 2X stock used at 1X final concentration), 50 nM forward and reverse primer, 2.0 µl of diluted cDNA (1:5) and nuclease-free water to a final volume of 8 µl. Primer sequences are provided in Appendix A. The cycling parameters were 95ºC for 10 minutes, then 40 cycles of 95ºC for 15 seconds and 60ºC for 1 minute, followed by a dissociation step. Expression relative to control was calculated by the comparative CT (∆∆CT) method. The CT value is defined as the number of cycles required for the fluorescent signal to cross the threshold. CT values are inversely related to the amount of target present in the total cDNA; therefore, lower CT values correspond to a greater amount of target.
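For reference, the comparative CT calculation can be sketched as follows. This is an illustration rather than the analysis script used in this study: the function and argument names are hypothetical, the reference (housekeeping) gene is not specified here, and the formula assumes that target and reference amplify with approximately 100% efficiency (a doubling per cycle).

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene in a sample relative to a control (2^-ddCT)."""
    # Normalise the target CT to the reference gene within each condition.
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the normalised sample with the normalised control.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    # Each cycle of difference corresponds to a two-fold change (assumed efficiency).
    return 2.0 ** (-delta_delta_ct)


# Hypothetical CT values: the target crosses the threshold two cycles earlier
# in the sample than in the control, with an unchanged reference gene.
print(relative_expression(24.0, 18.0, 26.0, 18.0))
```

A ∆∆CT of -2, as in this hypothetical example, corresponds to a four-fold higher expression in the sample than in the control.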


2.8. Western blot

The whole cell lysate was isolated with RIPA buffer (50 mM Tris-HCl pH 7.5, 150 mM NaCl, 1% SDS, 1% Triton X-100, 1% CHAPS/IGEPAL and 1x Protease Inhibitor Cocktail) and the western blot was carried out (Odyssey® system) with either Anti-IRX4 antibody (ab123542, Abcam, Rabbit polyclonal), Anti-ERG antibody [EPR3864] (ab92513, Abcam, Rabbit monoclonal) or Anti-FOXA1 antibody (ab55178, Abcam, Mouse monoclonal). Briefly, the protein concentrations were measured using a standard BCA assay with the Pierce™ BCA Protein Assay Kit (Sigma). 30 μg of protein was loaded into a pre-cast gel (NuPAGE® Novex® 4-12% Bis-Tris Protein Gels, 1.0 mm, 10 well) and electrophoresed at 150 V for 1.5 hours. The proteins were then transferred onto a nitrocellulose membrane (Bio Trace NT, Pall Life Sciences, United States) using a Transblot apparatus (Biorad) at 4ºC for 60 min (100 V) in buffer containing 25 mM Tris, 192 mM glycine and 10% methanol (v/v). Membranes were subsequently blocked in Odyssey® blocking buffer (LI-COR) for 30-45 min at room temperature. Primary antibody was diluted in Odyssey® blocking buffer (1:1000 – 1:5000) and incubated with the membrane overnight at 4ºC. After washing the membranes three times (10 min each) with TBS-Tween (50 mM Tris-HCl, 150 mM NaCl containing 0.005% (v/v) Tween-20), Alexa Fluor-conjugated secondary antibody was diluted in Odyssey buffer (1:10,000) and incubated with the membranes for 1 hour at room temperature. For the ECL method, the membrane was instead incubated with a horseradish peroxidase (HRP)-linked IgG secondary antibody (1:1000). Membranes were washed three times (10 min each) with TBS-Tween, and then scanned with the Odyssey® system (LI-COR) or developed with ECL (Merck Millipore) according to the manufacturer’s instructions.
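Loading a fixed 30 μg of protein per lane means the lysate volume varies with the concentration measured in the BCA assay. A minimal sketch of that conversion is given below; the function name and the example concentration are hypothetical and are not taken from this study.

```python
def loading_volume_ul(target_ug, concentration_ug_per_ul):
    """Volume of lysate (µl) that contains the desired mass of protein."""
    if concentration_ug_per_ul <= 0:
        raise ValueError("protein concentration must be positive")
    return target_ug / concentration_ug_per_ul


# Hypothetical lysate measured at 2.5 µg/µl by BCA assay
print(loading_volume_ul(30.0, 2.5))
```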

2.9. siRNA mediated knockdown

siRNAs used in this study included siIRX4 (Ambion, Catalog number – AM16708, s27097 and s224173), siERG (Ambion, Catalog No 4392420, s4811), non-targeting siRNA (Ambion, Silencer select no 1 siRNA, Catalog number – 4390843), and custom-designed siRNAs targeting IRX4lncRNA (Life Technologies). The custom-designed siRNA sequences are provided in Appendix A. Briefly, the cells were transfected with 25 pmoles of siRNA using the Lipofectamine® RNAiMAX Transfection Reagent (Invitrogen, Catalog number – 13778150) according to the manufacturer’s instructions and incubated at 37ºC (5% CO2) for 72 hours. The transfection efficiency was confirmed by qRT-PCR analysis.

2.10. Proliferation assay

Proliferation assays were performed using the IncuCyte live cell imaging system (Essen BioScience) with LNCaP cells and C4-2B cells transiently transfected with siIRX4 (Ambion, Catalog number – AM16708), siIRX4lncRNA and non-targeting siRNA (Ambion, Silencer select no 1 siRNA, Catalog number – 4390843) using Lipofectamine® RNAiMAX transfection reagent (Invitrogen, Catalog number – 13778150), or with stable overexpression and knockdown cells treated with 250 ng/ml doxycycline. Briefly, the transfected cells were seeded at a density of 5000 cells/well in a 96-well plate, and the plate was placed in the IncuCyte live cell imaging system. Two images per well were taken every two hours for five consecutive days and the confluency of the cells was measured. CyQuant NF assays (ThermoFisher Scientific, Catalog number – C35006) were performed in black plastic plates (Perkin Elmer Life Sciences, Catalog number - 6005182) according to the manufacturer’s instructions, and fluorescence at 520 nm was measured after excitation at 480 nm using a microplate reader (FLUOstar Omega, BMG LABTECH). The assay was performed in at least two separate experiments. The data were plotted as monolayer confluence (Mean ± SEM) vs. time using GraphPad Prism 7.0.
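The Mean ± SEM values plotted for each time point can be reproduced with a few lines of code. The sketch below is illustrative only (the plots in this study were generated in GraphPad Prism); the function names and the confluence values are hypothetical, and the SEM is computed as the sample standard deviation divided by the square root of the number of replicates.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for replicate measurements."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are needed to estimate the SEM")
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(n)  # sample SD / sqrt(n)
    return mean, sem


def summarise_timecourse(confluence_by_time):
    """Map each time point (hours) to the (mean, SEM) of its replicate wells."""
    return {t: mean_sem(wells) for t, wells in confluence_by_time.items()}


# Hypothetical percent confluence from three replicate wells
example = {0: [10.0, 12.0, 14.0], 2: [11.0, 13.0, 15.0]}
for hours, (mean, sem) in summarise_timecourse(example).items():
    print(hours, round(mean, 2), round(sem, 2))
```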

2.11. Migration assay

Transfected/transduced cells were seeded in a poly-L-Ornithine (Sigma-Aldrich, Catalog number - P4957) coated 96-well ImageLock plate (Essen BioScience, Catalog number - 4379) at a density of 50,000 cells/well and incubated at 37ºC overnight to form a monolayer. The cells were treated with mitomycin C at a concentration of 10 µg/ml (Sigma-Aldrich, Catalog number – M4287) for 2 hours to inhibit cell proliferation and then fed with fresh media. Uniform wounds were created in the wells using the WoundMaker™ and each well was washed twice with culture media. Finally, 100 µl of media was added to the cells and the plate was placed in the IncuCyte (Essen BioScience). Wound closure was monitored by repeated scanning every two hours for 48 hours, and migration was determined as a percentage of wound closure by the IncuCyte integrated image analysis software. The assay was performed in three separate experiments. The data were plotted as relative wound closure (Mean ± SEM) vs. time using GraphPad Prism 7.0.

2.12. Statistical analysis

Assays were performed in three biological replicates. Data from qPCR experiments, functional assays and clinical data were statistically analysed using GraphPad Prism 7.0. Comparisons between two unpaired groups were analysed by the Mann-Whitney U test and between paired groups by the Wilcoxon matched-pairs signed-rank test. More than two groups were analysed using the Kruskal-Wallis test with Dunn’s multiple comparison test for unpaired samples and using the Friedman test for matched groups. The results were considered statistically significant if p < 0.05.
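To illustrate how a rank-based comparison of two unpaired groups works, the sketch below computes the Mann-Whitney U statistic and an exact two-sided p-value by enumerating every possible reassignment of the pooled values to the two groups. It is not the GraphPad Prism implementation used in this study, and its handling of ties and its p-value convention may differ from Prism's; the function names and data are hypothetical, and the enumeration is only practical for small samples.

```python
from itertools import combinations


def u_statistic(a, b):
    """Count pairs (x in a, y in b) with x > y; ties count as one half."""
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def mann_whitney_exact(a, b):
    """Return (U, exact two-sided p-value) for two unpaired groups."""
    pooled = list(a) + list(b)
    n1, n2 = len(a), len(b)
    centre = n1 * n2 / 2.0                      # expected U under the null hypothesis
    u_observed = u_statistic(a, b)
    observed_deviation = abs(u_observed - centre)
    indices = range(len(pooled))
    extreme = 0
    total = 0
    # Enumerate every way of choosing which pooled values form the first group.
    for chosen in combinations(indices, n1):
        chosen_set = set(chosen)
        group1 = [pooled[i] for i in chosen]
        group2 = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        if abs(u_statistic(group1, group2) - centre) >= observed_deviation:
            extreme += 1
    return u_observed, extreme / total


# Hypothetical, fully separated groups of five values each
u, p = mann_whitney_exact([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
print(u, p, p < 0.05)
```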


Chapter 3: The role of Iroquois-Homeobox 4 (IRX4) at the GWAS identified 5p15 locus in prostate cancer

3.1. Introduction

GWAS have identified genomic regions that alter an individual’s risk of developing prostate cancer. Through these studies, the rs12653946 SNP at the 5p15 locus was found to be significantly associated with prostate cancer risk in multi-ethnic populations (Batra, et al., 2011; Lindstrom, et al., 2012; Takata, et al., 2010). This SNP falls within the intronic region of a predicted lncRNA, CTD-2194D22.4 (Batra, et al., 2011); however, this lncRNA is not expressed in prostate tissue (GTEx portal). The protein-coding gene proximal to this SNP is Iroquois Homeobox 4 (IRX4), and expression of IRX4 is correlated with the rs12653946 SNP genotype in prostate cancer patients (Xu, et al., 2014).

IRX4 belongs to the IRX family of homeodomain-containing transcription factors, which are highly conserved from invertebrates (Drosophila) to mammals (Gomez-Skarmeta and Modolell 2002; Cheng et al. 2005; Matsumoto et al. 2004) and play fundamental prepatterning roles in diverse developmental processes, such as growth and differentiation, including cell-type specification and organogenesis (Cheng, et al., 2005; Gomez-Skarmeta, et al., 2002; Matsumoto, et al., 2004). The Homo sapiens IRX genes are organised as two clusters of three genes each, IRX1, 2 and 4 on chromosome 5 and IRX3, 5 and 6 on chromosome 16, with the genes in each cluster separated by large intergenic regions (Figure 3.1). The deregulation of different IRXs has been implicated in various cancers: IRX1 hypomethylation is a potential molecular biomarker for lung metastasis (Lu, et al., 2015); IRX2 is an oncogene in osteosarcoma (Liu, et al., 2014); IRX3 is overexpressed in colorectal cancer (Martorell, et al., 2014); IRX4 is a predictor of outcome in oral squamous cell carcinoma (Wang, et al., 2015) and IRX5 is a cell cycle regulator in prostate cancer (Myrthue, et al., 2008). Interestingly, IRX4 has been reported to suppress prostate cancer growth by interacting with the Vitamin D Receptor (VDR) (Nguyen, et al., 2012). Moreover, several other homeobox proteins have also been investigated for their role in Epithelial to Mesenchymal Transition (EMT). During EMT, epithelial cells transiently change into mesenchymal cells to facilitate cell migration and invasion. This includes loss of epithelial markers such as ECAD and β-catenin and gain of mesenchymal markers such as vimentin and NCAD (Montanari et al., 2017). Overexpression of the homeobox gene Six1 in breast cancer cells induced EMT, while HOXA7 and HOXA10 were reported to promote mesenchymal to epithelial transition (MET) in ovarian and endometrial cancer respectively (Haria et al., 2013; Taniguchi, 2014).

Figure 3.1 – Genomic organisation of Homo sapiens IRX genes. IRX genes are organised in two clusters on chromosomes 5 and 16. IRX4 and IRX2 are on the negative strand of chromosome 5, while IRX1 is on the positive strand. IRX5 and IRX6 are on the positive strand of chromosome 16, and IRX3 is on the negative strand. The intergenic region of the IRX cluster on chromosome 5 (~1.7 Mb) is longer than that of the cluster on chromosome 16 (~1 Mb).

Even though evidence is accumulating for the regulation of cell proliferation, apoptosis, angiogenesis and metastasis by different IRXs (Bhatlekar, et al., 2014; Shah, et al., 2010), the transcriptome affected by these transcription factors is still not fully understood. No studies published to date have examined the IRX4 transcriptome or its binding partners, so the biological pathways regulated by IRX4 in prostate cancer cell lines remain unidentified. In the current study, in vitro assays were performed to help determine the potential functional role of IRX4 in prostate cancer progression using IRX4 knockdown prostate cancer cell line models. To follow up the observations from the functional assays, changes in EMT genes were determined in the IRX4 knockdown models. Gene microarray analysis was performed to identify the genes regulated by IRX4, and immunoprecipitation (IP) studies were performed to identify the co-factors of IRX4. This chapter aims to characterise the IRX4 transcriptome and its co-factors, which will provide crucial information for evaluating the clinical relevance of IRX4 in prostate cancer.


3.2. Methods

3.2.1. Analysis of IRX transcription factors’ expression in published data sets

Expression data for the IRX transcription factors in various cancer and normal tissues were downloaded from Oncomine (Rhodes et al., 2004), FireBrowse (Broad Institute) and cBioPortal (Cerami et al., 2012; Gao et al., 2013).

3.2.2. Immunofluorescence (IF) analysis

IF analysis was performed to confirm the knockdown efficiency with the assistance of Dr Thomas Kryza. Transient knockdown cells were fixed with 4% PFA for 15 minutes at room temperature, then permeabilized using PBS + 0.05 Triton X-100 for 5 minutes at room temperature. After three washes with PBS, cells were stained for F-actin using Phalloidin-488 (1/40 in PBS). After washes to eliminate excess phalloidin, cells were blocked with PBS containing 3% BSA (30 minutes at room temperature), then incubated overnight at 4 °C with anti-IRX4 antibody (1/300 in PBS-3% BSA) and washed three times with PBS at room temperature, followed by incubation with anti-Rabbit IgG Secondary Antibody coupled with Alexa Fluor® 563 conjugate (1/1000 in PBS-3% BSA, 1 hour at room temperature). After three washes with PBS, cell nuclei were stained with DAPI and images were taken with an Olympus Inverted Fluorescence microscope.

3.2.3. Microarray gene expression profiling

For gene expression profiling, triplicates of each sample (LNCaP and VCaP cells with IRX4 knockdown and non-targeting control) were analysed on a custom 180k Agilent oligo microarray (ID032034, GPL16604) with the assistance of Dr Anja Rockstroh and Prof Colleen Nelson at the Australian Prostate Cancer Research Centre – Queensland (APCRC-Q). This array contains probes mapping to human protein-coding and non-coding loci, with probes designed for exons, 3’UTRs, 5’UTRs, intronic and intergenic regions (Levrier et al., 2017; Mertens-Walker et al., 2015). RNA was isolated using the RNA Mini Kit (Bioline) according to the manufacturer’s protocol, including an on-column DNase treatment step. The purity and quality of the RNA were analysed on a NanoDrop1000 and an Agilent 2100 Bioanalyzer. For each sample, 150 ng of RNA was amplified and labelled using the Agilent ‘Low Input Quick Amp Labeling Kit’ for One-Color Microarray-Based Gene Expression Analysis. Briefly, the RNA was reverse transcribed into cDNA using an oligo-dT/T7-promoter hybrid primer, which introduced a T7 promoter region into the newly synthesised cDNA. Then in vitro transcription was performed using a T7 RNA polymerase, which simultaneously amplified the target material and incorporated cyanine 3-labeled CTP. cDNA synthesis and in vitro transcription were each performed at 40 °C for 2 h. The labelled cRNA was purified using the RNeasy Mini Kit (Qiagen, Catalog number - 74106) and quantified on a NanoDrop1000. Finally, 1650 ng of cRNA from each sample was hybridised at 65 °C for 17 h and the arrays were subsequently scanned on an Agilent Microarray Scanner G2565CA.

3.2.4. Microarray data analysis

The microarray raw data were processed using the Agilent Feature Extraction Software (v10.7). A quantile between-array normalization was applied and differential expression was determined using the Bayesian adjusted t-statistic linear model of the ‘Linear Models for Microarray Data’ (LIMMA) package in R. p-values were corrected for a false discovery rate of 5% and gene expression levels are presented as log2 transformed intensity values. Normalized gene expression data from the experiment are ‘Minimum Information About a Microarray Experiment’ (MIAME) compliant. Genes that were significantly different between two groups were identified with an adjusted p-value of ≤0.05 and an average fold change of ≥1.5. For functional annotation and gene network analysis, filtered gene lists were examined using QIAGEN’s Ingenuity® Pathway Analysis (IPA®, QIAGEN, Redwood City) and Gene Set Enrichment Analysis (GSEA, Broad Institute). The analysis of the androgen regulated gene signature was performed with the assistance of Dr Anja Rockstroh.
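The filtering step described above (adjusted p-value ≤0.05 and fold change ≥1.5 in either direction) can be sketched in a few lines. This is a hedged illustration in Python rather than the LIMMA workflow used in the study: it assumes the false discovery rate correction is Benjamini-Hochberg, which the text does not state, it does not reproduce the moderated t-statistic, and the gene names, fold changes and p-values are hypothetical.

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_genes(log2_fc, p_values, fc_cutoff=1.5, alpha=0.05):
    """Keep genes with adjusted p <= alpha and |fold change| >= fc_cutoff."""
    adj = benjamini_hochberg([p_values[g] for g in log2_fc])
    threshold = math.log2(fc_cutoff)  # data are on a log2 scale
    return [g for g, q in zip(log2_fc, adj)
            if q <= alpha and abs(log2_fc[g]) >= threshold]


# Hypothetical log2 fold changes (knockdown vs control) and raw p-values.
log2_fc = {"GENE_A": 1.2, "GENE_B": -0.9, "GENE_C": 0.3, "GENE_D": 0.7}
p_raw = {"GENE_A": 0.001, "GENE_B": 0.010, "GENE_C": 0.020, "GENE_D": 0.600}
print(select_genes(log2_fc, p_raw))
```

Because the expression values are log2 transformed, the fold-change cut-off is applied as |log2 FC| ≥ log2(1.5), which treats up- and down-regulation symmetrically.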

3.2.5. Immunoprecipitation (IP)

The nuclear enriched lysate from LNCaP cells was isolated using the NE-PER nuclear and cytoplasmic extraction kit (ThermoFisher Scientific, Catalog number – 78833) according to the manufacturer’s instructions. The lysate was pre-cleared with 25 μl of SureBeads™ Protein G magnetic beads (BIO-RAD, Catalog number – 1614023) for 1 hour at 4 °C and the beads were removed three times using a magnetic stand. The extract was then incubated with 5 μl of 1 mg/ml antibody for IgG or IRX4 (Abcam, Catalog number – ab123542) and 50 μl of protein G beads overnight at 4 °C. The beads were washed three times with ice-cold PBS and the proteins were eluted with 100 μl of nuclear extraction buffer by boiling the samples at 95 °C for 5 minutes.


3.2.6. Mass spectrometry analysis

The mass spectrometry analysis was performed with assistance from Ms Dorothy Loo at the TRI Proteomics facility.

The IP samples (5 µg) were boiled in sodium deoxycholate buffer (1% sodium deoxycholate, 10 mM TCEP, 40 mM 2CAA, 100 mM Tris pH 8.5) and diluted in H2O. The samples were then digested with trypsin (0.1 µg) at 37 °C overnight and acidified to a final formic acid concentration of 1%. The sodium deoxycholate precipitate was removed by centrifugation at 13,000 g for 10 minutes. The samples were then cleaned with C18 tips, resuspended in 0.1% formic acid, 3% acetonitrile at a concentration of 0.5 µg/µl, and finally sonicated for 5 minutes.

For the LC-MS/MS analysis, 1 µg of sample was injected onto a trap column (Easy LC THC164705 column (C18, 20 mm × 75 µm ID, 3 µm particle)) and then separated on an analytical column (Easy LC THCES803 column (C18, 500 mm × 50 µm ID, 2 µm particle)). The following LC gradient was applied: 3% B for 5 min, 25% B for 80 min, 40% B for 20 min, 95% B for 1 min, wash at 95% B for 10 min, and back to 3% B in 1 min. Data analysis was performed on Spectrum Mill B.05 with the following parameters. Extraction: precursor MH+ 600-6000 m/z, retention time and m/z tolerance of +45 sec and +1.4 m/z, spectral similarity & RT & m/z merging, Xcalibur centroiding algorithm for profile mode data, precursor of 6, min MS1 S/N of 25, find 12C precursor m/z. The search was done against Swissprot Human Nov 2014 with trypsin digest, fixed carbamidomethylation (C), variable oxidized M, min matched peak intensity of 50%, instrument ESI QExactive HCD, monoisotopic masses, precursor mass tolerance of + 20 ppm, product mass tolerance of + 20 ppm, max ambiguous precursor charge of 3, reversed database scores calculated, discriminant scoring off, variable modification search mode, and precursor mass shift range of -18 to 177 Da. The proteins identified in at least two replicates were considered for further analysis.


3.3. Results

3.3.1. Expression of IRX factors in clinical samples

The expression of IRX transcription factors was inspected in publicly available cancer datasets (The Cancer Genome Atlas (TCGA) data derived from FireBrowse - Broad Institute) to understand their expression pattern in different tumour and normal tissues (Figure 3.2). IRX1 was mostly expressed at higher levels in normal tissues than in tumour samples, including kidney, lung and breast tissues. On the other hand, esophageal carcinoma and glioma had higher expression of IRX1 compared to their respective normal samples. No difference in IRX1 expression was observed in other cancers. IRX2 had a similar expression pattern to IRX1 in most of the tumour and normal tissues. IRX3 had higher expression levels in most of the tissue samples. It was overexpressed in breast, renal and colon cancers compared to their respective normal tissues, while it showed the opposite trend in prostate, lung and esophageal samples. Interestingly, compared to other IRXs, IRX4 had a distinctive expression pattern, as higher expression of IRX4 was seen in only a few tissue types, including esophageal, head and neck, breast and prostate. Moreover, it was overexpressed in prostate, head and neck, and esophageal carcinoma compared to their corresponding normal tissues. IRX5 had a similar trend of expression to IRX3 in different cancers. Although IRX6 was expressed across various tumour and normal tissues, including breast, lung and thyroid, its expression level was low across the different tissue types.

IRX4 was found to be highly expressed in prostate cancer compared to other types of cancer in the Bittner Multi-Cancer dataset (Reporter – 220225_at, Source – Oncomine (Rhodes, et al., 2004)), suggesting its importance in prostate cancer pathogenesis (Figure 3.3). This dataset includes data analysed on Affymetrix U133 Plus 2.0 microarrays from 1911 tumour samples from various cancers.


Figure 3.2 - The expression of IRX transcription factors in different cancers from the TCGA dataset. The mRNA expression of IRX4 shows a distinctive pattern compared to other IRXs. IRX1 has low expression levels in various tissues except normal kidney and breast samples. Similarly, IRX2 has higher expression in normal kidney samples, and is overexpressed in other tissues including breast and lymphoma. IRX3 is overexpressed in most of the tumour samples except colon, colorectal and rectum samples. IRX4 is highly expressed in prostate, head and neck and testicular carcinoma samples. IRX5 has a similar trend of expression to IRX3 in most of the cancers. IRX6 is expressed across various tumour and normal samples including colon, colorectal, breast, thymoma and uveal melanoma. (ACC – Adrenocortical carcinoma, BLCA – Bladder Urothelial carcinoma, BRCA – Breast invasive carcinoma, CESC – Cervical Squamous cell carcinoma and endocervical adenocarcinoma, CHOL – Cholangiocarcinoma, COAD – Colon adenocarcinoma, COADREAD – Colorectal adenocarcinoma, DLBC – Lymphoid Neoplasm Diffuse Large B-Cell Lymphoma, ESCA – Esophageal carcinoma, GBM – Glioblastoma multiforme, GBMLGG – Glioma, HNSC – Head and Neck squamous cell carcinoma, KICH – Kidney Chromophobe, KIPAN – Pan-kidney cohort, KIRC – Kidney renal clear cell carcinoma, KIRP – Kidney renal papillary cell carcinoma, LAML – Acute Myeloid Leukemia, LGG – Brain Lower Grade Glioma, LIHC – Liver hepatocellular carcinoma, LUAD – Lung adenocarcinoma, LUSC – Lung squamous cell carcinoma, MESO – Mesothelioma, OV – Ovarian serous cystadenocarcinoma, PAAD – Pancreatic adenocarcinoma, PCPG – Pheochromocytoma and Paraganglioma, PRAD – Prostate adenocarcinoma, READ – Rectum adenocarcinoma, SARC – Sarcoma, SKCM – Skin Cutaneous Melanoma, STAD – Stomach adenocarcinoma, STES – Stomach and Esophageal carcinoma, TGCT – Testicular Germ Cell Tumours, THCA – Thyroid carcinoma, THYM – Thymoma, UCEC – Uterine Corpus Endometrial Carcinoma, UCS – Uterine Carcinosarcoma, UVM – Uveal Melanoma) (RSEM - RNA-Seq by Expectation Maximization; Figures derived from Firehose browser, Broad Institute).


Figure 3.3 - IRX4 is highly expressed in prostate cancer compared to other cancers. Higher expression of IRX4 is detected in prostate cancer samples compared to other types of cancer, followed by breast and cervical cancer (Mean ± SD, Source – Oncomine, Bittner multicancer dataset, Bladder cancer = 32, Brain and CNS cancer = 5, Breast cancer = 328, Cervical cancer = 35, Colorectal cancer = 330, Esophageal cancer = 7, Gastric cancer = 7, Head and neck cancer = 41, Kidney cancer = 254, Liver cancer = 11, Lung cancer = 107, Lymphoma = 19, Ovarian cancer = 166, Pancreatic cancer = 19, Prostate cancer = 59 and Sarcoma = 49).

Next, the expression of IRX4 in prostate cancer clinical samples was characterised using multiple datasets from the Oncomine database (Rhodes, et al., 2004). The information for each dataset is summarised in Table 3-1. Similar to the TCGA data, higher expression of IRX4 was observed in prostate tumour samples compared to normal prostate tissues in the Liu prostate and Wallace prostate datasets (Figure 3.4a and b). Although a similar trend was observed in the Grasso prostate, Taylor prostate and Arredouani prostate datasets, the differences in IRX4 expression between normal and prostate tumour samples were not statistically significant (Figure 3.4c, d and e). No difference in IRX4 expression was observed between these groups in the Vanaja prostate dataset (Figure 3.4f). The overexpression of IRX4 in prostate cancer samples was further confirmed by qRT-PCR analysis of RNA samples extracted from FFPE prostate tumour and adjacent non-malignant tissues obtained from the APCB (n=50, Figure 3.5).


Table 3-1 – Expression microarray studies derived from the Oncomine database for characterising IRX4 expression

Dataset | Samples | Platform | Reference
Liu Prostate | 44 prostate carcinoma, 13 adjacent normal | Human Genome U133A Array | (Liu et al., 2006)
Wallace Prostate | 69 prostate tumour samples, 18 adjacent normal prostate samples and 2 pooled normal samples | Human Genome U133A 2.0 Array | (Wallace et al., 2008)
Taylor Prostate | 131 primary tumours, 19 metastasis, 29 normal adjacent prostate tissue specimens, and 6 cell lines | Not pre-defined in Oncomine | (Taylor et al., 2010)
Grasso Prostate | 59 localized prostate carcinoma, 35 castrate-resistant metastatic prostate cancer and 28 benign prostate tissue specimens | Agilent Human Genome 44K | (Grasso et al., 2012)
Arredouani Prostate | 13 prostate carcinoma and 8 normal prostate samples | Human Genome U133 Plus 2.0 Array | (Arredouani et al., 2009)
Vanaja Prostate | 32 prostate adenocarcinoma and 8 normal prostate gland samples | Human Genome U133A Array, Human Genome U133B Array | (Vanaja et al., 2003)


Figure 3.4 - IRX4 is overexpressed in prostate tumour tissues compared to adjacent benign prostate samples in multiple data sets. IRX4 is highly expressed in prostate tumour samples compared to the normal prostate in (a) Liu prostate (normal = 13, tumour = 44) and (b) Wallace prostate (normal = 20, tumour = 69). Although a similar trend is observed in (c) Grasso prostate (normal = 28, tumour = 94), (d) Taylor prostate (normal = 29, tumour = 131) and (e) Arredouani prostate (normal = 8, tumour = 13), the results were not statistically significant. No difference in expression was observed between normal and prostate tumour samples in (f) Vanaja prostate (normal = 8, tumour = 32) (Mean±SEM, Source: Oncomine, Mann-Whitney test, *p<0.05, **p<0.01, ns – non-significant).


Figure 3.5 - IRX4 is overexpressed in prostate tumour samples compared to adjacent non-malignant tissues. qRT-PCR analysis was performed with the RNA extracted from prostate tumour and adjacent non-malignant tissues from FFPE blocks (n=50, APCB). The relative expression of IRX4 was calculated by the ΔCT method using geometric mean of RPL32 and HPRT1 expression as an endogenous control. (Wilcoxon matched-pairs signed rank test, p=0.0074).
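The ΔCT calculation with two reference genes can be sketched as follows. This is an illustrative Python example, not the analysis script used in the study: it assumes 100% amplification efficiency, and the function name and CT values are hypothetical. Because CT values are on a log2 scale, the geometric mean of the reference gene expression levels corresponds to the arithmetic mean of their CT values.

```python
import statistics


def relative_expression(ct_target, ct_references):
    """2^-ΔCT relative expression against one or more reference genes.

    Averaging the reference CT values arithmetically is equivalent to taking
    the geometric mean of their expression levels. Assumes one doubling of
    product per cycle (100% amplification efficiency).
    """
    delta_ct = ct_target - statistics.mean(ct_references)
    return 2.0 ** (-delta_ct)


# Hypothetical CT values: IRX4 = 28, RPL32 = 20, HPRT1 = 22.
print(relative_expression(28.0, [20.0, 22.0]))  # 2^-7 = 0.0078125
```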

There was no difference in IRX4 expression observed between primary tumour and metastatic cancer in three different datasets (Figure 3.6). Although there was no difference in IRX4 expression between different Gleason scores in the Liu prostate, Taylor prostate and Wallace prostate datasets, higher expression of IRX4 was correlated with Gleason score 9 compared to Gleason score 6 in the Vanaja dataset (Figure 3.7).


Figure 3.6 - IRX4 expression in primary and metastatic prostate cancer samples. No difference in IRX4 expression was observed between primary and metastatic prostate cancer in (a) Taylor prostate (primary site = 131, metastasis = 19), (b) Grasso prostate (primary site = 59, metastasis = 35) and (c) Vanaja prostate (primary site = 27, metastasis = 5) datasets. (Source: Oncomine, Mean±SEM, Mann Whitney test).


Figure 3.7 – IRX4 expression correlation with the Gleason score. No difference in expression levels of IRX4 was seen in (a) Liu prostate (GS6 = 13, GS7 = 16, GS8 = 10 and GS8+ = 5), (b) Taylor prostate (GS6 = 41, GS7 = 76, GS8 = 11 and GS9 = 11) and (c) Wallace prostate (GS6 = 17, GS7 = 48 and GS8-9 = 3) (Kruskal-Wallis test), while higher expression of IRX4 is correlated with Gleason score 9 compared to Gleason score 6 in the Vanaja prostate dataset (GS6 = 12 and GS9 = 15) (Mann-Whitney test, **p<0.01). (Source: Oncomine, Mean ± SEM).

3.3.2. IRX4 expression in prostate cancer cell lines

IRX4 expression was determined in a panel of prostate cell lines by qRT-PCR (Figure 3.8) in order to select the cell lines for further functional assays. The castration resistant prostate cancer cell line, C4-2B, had the highest expression of IRX4. LNCaP, VCaP, DuCaP, RWPE-1 and RWPE-2 cells expressed IRX4, while the androgen independent prostate cancer cell lines, PC3 and DU145, had no or minimal expression of IRX4. Low levels of IRX4 expression were detected in the benign prostatic cell line, BPH1, and no expression was observed in the prostate stromal cell line, WPMY1 (Figure 3.8).


Figure 3.8 - IRX4 expression in a panel of prostate cancer cell lines. IRX4 expression (RT-qPCR) in a panel of cell lines representing benign prostate (BPH1, RWPE1), prostate cancer (RWPE2, DuCaP, VCaP, LNCaP, C4-2B, 22RV1, PC3, DU145) and an immortalized prostate stromal cell line (WPMY1). RPL32 was used as the endogenous control and the relative expression was determined using the ΔCT method (n=3 biological replicates, Mean±SD, Kruskal-Wallis test with Dunn’s multiple comparisons test with respect to BPH1, * p<0.05, *** p<0.001).

3.3.3. IRX4 knockdown in prostate cancer cell lines

In order to determine the functional role of IRX4 in prostate cancer progression, transient knockdown models were used. Approximately 90% and 75% knockdown of IRX4 mRNA expression was achieved in LNCaP cells with siRNA_1 and siRNA_2 respectively (Figure 3.9).


Figure 3.9 - IRX4 knockdown efficiency in LNCaP cells. The cells seeded in a 6-well plate were transfected with 25 pmoles of siIRX4 or non-targeting siRNA (siNT) using Lipofectamine® RNAiMAX transfection reagent. After 72 hours incubation at 37 °C, RNA was extracted and the knockdown efficiency was determined by qRT-PCR. RPL32 was used as the housekeeping control and expression was normalised to the RNAiMAX control group (n=3 biological replicates, Mean±SEM, Kruskal-Wallis test with Dunn’s multiple comparisons test, *p<0.05, ***p<0.001).
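Knockdown efficiency expressed as a percentage can be derived from housekeeping-normalised CT values. The following is a minimal sketch in Python, assuming 100% PCR efficiency; the function name and ΔCT values are hypothetical and are not taken from the study data.

```python
def knockdown_percent(delta_ct_control, delta_ct_knockdown):
    """Percentage reduction in target expression relative to the control.

    Each ΔCT is CT(target) - CT(reference gene); assumes 100% PCR efficiency.
    """
    delta_delta_ct = delta_ct_knockdown - delta_ct_control
    remaining = 2.0 ** (-delta_delta_ct)  # expression relative to control
    return (1.0 - remaining) * 100.0


# Hypothetical ΔCT values: a 2-cycle shift leaves 25% of control expression.
print(knockdown_percent(8.0, 10.0))  # 75.0
```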

3.3.4. IRX4 knockdown inhibits cell proliferation in LNCaP and C4-2B cells

To date only one study has been published on the potential functional role of IRX4 in prostate cancer, in which IRX4 was proposed to suppress prostate cancer cell growth (Nguyen, et al., 2012). Therefore, we decided to further elucidate the potential role of IRX4 in prostate cancer aetiology. Initially, a cell proliferation assay was performed using the IncuCyte live cell imaging system with siRNA mediated transient knockdown in LNCaP cells. IRX4 knockdown cells had significantly lower proliferation compared to the non-targeting siRNA and transfection reagent controls (Figure 3.10). However, these results contradict the previously published study, in which IRX4 knockdown was shown to increase prostate cancer cell proliferation (Nguyen, et al., 2012).


[Figure 3.10 plot: cell confluency (%) versus time (hours, 0 to 72) for siNT, siIRX4_1 (*) and siIRX4_2 (*).]

Figure 3.10 - IRX4 knockdown reduced LNCaP cell proliferation. LNCaP cells were seeded in a 96-well plate at a density of 5000 cells/well and transfected with siIRX4_1, siIRX4_2 or non-targeting siRNA (siNT) using Lipofectamine® RNAiMAX transfection reagent. Images of the cells were taken at two-hour intervals for 96 wells and the confluency was analysed using the IncuCyte live cell analysis system. (n=3 biological replicates, Mean±SEM, Friedman test with Dunn’s multiple comparisons test, *p<0.05).

In order to confirm the results obtained from the IncuCyte proliferation assay, the CyQUANT assay was performed to measure the DNA content of the cells, which in turn represents the cell number. CyQUANT assay results obtained at 72 hours of transfection were consistent with the IncuCyte results: the DNA content of IRX4 knockdown cells was lower than that of the control cells (Figure 3.11). This further confirms the reduction in cell proliferation in IRX4 knockdown LNCaP cells. As siIRX4_1 exhibited higher knockdown efficiency, it was used in the following assays.

To further confirm the effect of IRX4 knockdown on cell proliferation, we also performed this assay with C4-2B cells, which had a higher expression of IRX4. Similar to LNCaP cells, ~85% knockdown was observed with siIRX4 in C4-2B cells after 72 hours of transfection (Figure 3.12(a)). IRX4 knockdown also reduced the proliferation of C4-2B cells compared to non-targeting siRNA (Figure 3.12(b)).


Figure 3.11 – CyQUANT assay for LNCaP cell proliferation. LNCaP cells were seeded in three 96-well black plates at a density of 5000 cells/well and transfected with siIRX4_1, siIRX4_2 or non-targeting siRNA (siNT) using Lipofectamine® RNAiMAX transfection reagent. The cells were treated with CyQUANT reagent at 72 hours of transfection and the fluorescence was measured after excitation at 480 nm using a microplate reader. siIRX4 transfected cells had lower fluorescence measurements compared to control cells, indicating reduced proliferation in these cells (n=3 biological replicates, Mean±SEM, Kruskal-Wallis test with Dunn’s multiple comparisons test, * p<0.05, **p<0.01).

Figure 3.12 - Transient knockdown of IRX4 in C4-2B cells. C4-2B cells seeded in a 96-well plate at a density of 2500 cells/well were transiently transfected with 10 nM siIRX4 or non-targeting siRNA (siNT) using RNAiMAX transfection reagent; siNT and the transfection reagent alone served as controls. (a) ~85% knockdown was observed at the mRNA level after 72 hours of transfection (n=2 biological replicates, Mean±SEM, Mann-Whitney test, *p<0.05). (b) IRX4 knockdown reduced the proliferation of C4-2B cells compared to non-targeting siRNA and transfection reagent control (n=2 biological replicates, Mean±SEM, Wilcoxon test, **** p<0.0001).


3.3.5. IRX4 knockdown reduced LNCaP cell migration

The effect of IRX4 knockdown on LNCaP cell migration was then determined using the IncuCyte live cell imaging system. Before migration was assessed, the cells were treated with the cell cycle inhibitor mitomycin C (Lee et al., 2001) to prevent proliferation from interfering with the results. IRX4 knockdown in LNCaP cells reduced the migration of cells as observed by wound closure (Figure 3.13).

[Figure 3.13 plot: relative wound density (%) versus time (hours, 0 to 48) for siNT and siIRX4 (****).]

Figure 3.13 - LNCaP cell migration with transient knockdown of IRX4. siIRX4 and non-targeting siRNA (siNT) transfected cells were seeded in the ImageLock plate at a density of 50000 cells/well and allowed to form a monolayer overnight. The cells were then treated with mitomycin C (10 µg/ml) and uniform, reproducible scratches were made using the WoundMaker. Images of the cells were taken at two-hour intervals and the relative wound density was determined using the IncuCyte live cell analysis system. IRX4 knockdown cells migrated more slowly towards wound closure compared to the control cells (n=3 biological replicates, Mean±SEM, Wilcoxon test, **** p<0.0001).

3.3.6. IRX4 knockdown modulates the expression of genes involved in EMT

The reduction in LNCaP cell proliferation and migration was accompanied by marked morphological changes in IRX4 knockdown cells (Figure 3.14). Thus, genes involved in EMT were assessed to further understand this process (Figure 3.15). EMT is induced by upregulation of transcription factors such as ZEB1, ZEB2 and SLUG (Li, Xu, et al., 2013). A significant down-regulation of the EMT-inducing transcription factor, SLUG, was observed in IRX4 knockdown cells, and the mesenchymal marker, Vimentin, was also down-regulated, suggesting that IRX4 may play a role in EMT. No changes were observed in ZEB1 and ECAD expression in IRX4 knockdown cells.

Figure 3.14 – Cell morphology of IRX4 knockdown LNCaP cells. The cells transfected with siIRX4 had a clear change in cell morphology compared to control cells (siNT – Non-targeting siRNA).

Figure 3.15 - EMT gene expression in IRX4 knockdown LNCaP cells. Cells seeded in a 6-well plate were transfected with 25 pmoles of siIRX4 or non-targeting siRNA (siNT) using Lipofectamine® RNAiMAX transfection reagent. After 72 hours incubation at 37 °C, RNA was extracted and the expression of the EMT genes Vimentin, ECAD, ZEB1 and SLUG was determined by qRT-PCR. RPL32 was used as the housekeeping control and expression was normalised to the siNT group for each gene. (n=3 biological replicates, Mean±SEM, Mann-Whitney test, *p<0.05).

3.3.7. Microarray analysis of IRX4 knockdown samples

As IRX4 is a transcription factor, it modulates gene expression by binding to its responsive elements on chromatin. Determining the genes regulated by IRX4 is important for identifying its function in prostate cancer cells. However, no studies to date have assessed the effect of IRX4 on global gene expression in any model. Therefore, we performed gene microarray analysis with IRX4 knockdown in prostate cancer cell models. The IRX4 knockdown efficiency in the samples used for microarray analysis was determined by qRT-PCR (Figure 3.16) and was further confirmed at the protein level by IF analysis (Figure 3.17).

Figure 3.16 - IRX4 knockdown efficiency in LNCaP and VCaP cells. The cells seeded in a 6-well plate were transfected with 25 pmoles of siIRX4 or non-targeting siRNA (siNT) using Lipofectamine® RNAiMAX transfection reagent. After 72 hours incubation at 37 °C, RNA was extracted and the knockdown efficiency was determined by qRT-PCR. RPL32 was used as the housekeeping control and expression was normalised to the RNAiMAX control group for each cell line. (Mean±SEM, n=3 biological replicates, Mann-Whitney test, *p<0.05).

Figure 3.17 – IRX4 knockdown was confirmed by immunofluorescence. Prostate cancer cells transfected with either siIRX4 or non-targeting siRNA were fixed with 4% PFA and stained for F-actin using Phalloidin-488 (green). Cells were incubated with anti-IRX4 antibody overnight at 4 °C and then with an anti-rabbit IgG secondary antibody conjugated to Alexa Fluor® 563 (red). Nuclei were stained with DAPI (blue) and imaged (60X) with an Olympus inverted fluorescence microscope.

Because the IRX family members contain similar sequences, RT-qPCR was performed to check whether the IRX4 siRNA affected the expression of other IRXs in the IRX4 knockdown cells (Figure 3.18). Both LNCaP and VCaP cells had no or low expression of IRX1 and IRX6 (data not shown). Similarly, these two genes exhibited low levels of expression in prostate tumour and normal samples (Figure 3.2). No difference in expression levels was observed for the other IRXs, IRX2, 3 and 5 (Figure 3.18).

Figure 3.18 – Effect of siIRX4 knockdown on the mRNA expression of other IRXs. The expression of IRX2, 3 and 5 in IRX4 knockdown (a) LNCaP and (b) VCaP cells. The cells seeded in a 6-well plate were transfected with 25 pmoles of siIRX4 or non-targeting siRNA (siNT) using Lipofectamine® RNAiMAX transfection reagent. After 72 hours of incubation at 37 °C, RNA was extracted and the effect of IRX4 knockdown on the expression levels of IRX1, 2, 3, 5 and 6 was determined by qRT-PCR. IRX1 and 6 had no or minimal levels of expression. The expression levels of IRX2, 3 and 5 were not changed by IRX4 knockdown. RPL32 was used as the housekeeping control, and expression was normalised to siNT for each gene. (Mean±SEM, n=3 biological replicates, Mann-Whitney test).

The integrity of the RNA samples for microarray analysis was confirmed using the Bioanalyser (RNA Integrity Number, RIN > 9.5), while the concentration and purity of the RNA samples (A260/A280 > 2) were confirmed by NanoDrop 1000 measurements (Appendix B). The microarray run passed all the QC steps and the samples clustered appropriately by cell line and treatment group (Appendix C). Genes with a fold change ≤ -1.5 or ≥ 1.5 and p<0.05 were considered differentially regulated and were used for further analysis.
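The selection rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gene names and values are hypothetical, and the signed fold-change convention (a ratio below 1 is reported as its negative reciprocal) is assumed here rather than taken from the microarray software used in this study.

```python
from dataclasses import dataclass


@dataclass
class ProbeResult:
    gene: str
    ratio: float    # knockdown / control expression ratio on a linear scale
    p_value: float


def signed_fold_change(ratio: float) -> float:
    """Convert a linear ratio to a signed fold change (e.g. 0.5 -> -2.0)."""
    if ratio <= 0:
        raise ValueError("expression ratio must be positive")
    return ratio if ratio >= 1.0 else -1.0 / ratio


def is_differentially_regulated(result: ProbeResult,
                                fc_cutoff: float = 1.5,
                                p_cutoff: float = 0.05) -> bool:
    """Apply the 'fold change <= -1.5 or >= 1.5 and p < 0.05' filter."""
    fold_change = signed_fold_change(result.ratio)
    return abs(fold_change) >= fc_cutoff and result.p_value < p_cutoff


# Hypothetical values for illustration only (not data from this study).
results = [
    ProbeResult("GENE_A", 2.10, 0.003),  # up-regulated and significant
    ProbeResult("GENE_B", 0.40, 0.010),  # signed fold change of -2.5
    ProbeResult("GENE_C", 1.20, 0.001),  # fold change below the cutoff
    ProbeResult("GENE_D", 3.00, 0.200),  # large change but not significant
]
selected = [r.gene for r in results if is_differentially_regulated(r)]
print(selected)
```

Both conditions must hold at once, so a large but non-significant change (GENE_D) and a significant but small change (GENE_C) are both excluded.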

IRX4 knockdown resulted in down-regulation of IRX4 by ~6.9 fold in LNCaP and ~5.6 fold in VCaP cells. This knockdown resulted in a number of differentially expressed probes, with the effect being larger in LNCaP cells. Approximately 2000 genes were differentially regulated after knockdown of IRX4 in LNCaP cells (687 genes up-regulated, 1309 down-regulated; fold change ≤ -1.5 or ≥ 1.5, p<0.05; Figure 3.19 (a), Appendix D). In contrast, only 352 genes were found to be differentially regulated in VCaP cells (165 up-regulated and 187 down-regulated; fold change ≤ -1.5 or ≥ 1.5, p<0.05; Figure 3.19 (b), Appendix E). This may be due to the slight difference in knockdown efficiency and to differences in endogenous gene expression between these two cell lines. A total of 202 genes were found to be deregulated by IRX4 knockdown in both LNCaP and VCaP cell lines (Figure 3.19 (c) and (d)), of which 195 were regulated by IRX4 in the same direction in both cell lines (Appendix F). RT-qPCR was performed to validate the microarray results for a few of the genes commonly regulated by IRX4 in both LNCaP and VCaP cells (Figure 3.20). DPP4, MYB, ITGA1, ITGB3BP, PTDSS1 and TMEM123 were validated as downregulated with IRX4 knockdown, while MAP2K4, NR3C2, PTPN, POLR3F, WNT5A and JAG1 were validated as upregulated in IRX4 knockdown cells.
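The comparison between the two cell lines comes down to a set intersection followed by a check of direction. A minimal sketch follows, assuming each signature is stored as a mapping from gene name to +1 (up-regulated) or -1 (down-regulated). The gene names are hypothetical and the function name `compare_signatures` is introduced here for illustration; only the counts in the final lines come from the text above.

```python
def compare_signatures(sig_a: dict, sig_b: dict):
    """Return the shared genes and the subset changed in the same direction.

    Each signature maps a gene name to +1 (up-regulated) or -1 (down-regulated).
    """
    shared = sorted(set(sig_a) & set(sig_b))
    concordant = [gene for gene in shared if sig_a[gene] == sig_b[gene]]
    return shared, concordant


# Hypothetical toy signatures (not data from this study).
lncap_signature = {"GENE_A": -1, "GENE_B": +1, "GENE_C": -1, "GENE_D": +1}
vcap_signature = {"GENE_A": -1, "GENE_B": +1, "GENE_C": +1, "GENE_E": -1}

shared, concordant = compare_signatures(lncap_signature, vcap_signature)
print(shared, concordant)

# The same arithmetic applied to the counts reported in the text.
print(f"Shared genes as a share of the VCaP signature: {202 / 352:.1%}")
print(f"Shared genes regulated in the same direction: {195 / 202:.1%}")
```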

Ingenuity Pathway Analysis (IPA) was used to identify the canonical pathways and the molecular and cellular functions modulated by IRX4 regulated genes, and therefore the potential functional consequences of IRX4 knockdown in these cells. In addition, IPA includes a tool known as ‘Upstream Regulator’, which uses prior knowledge of the expected effects between transcription factors and their target genes to explain the observed gene expression changes in the input dataset and so to infer the biological activities occurring in the samples. The IPA analysis was performed separately for LNCaP and VCaP cell data. As the number of genes deregulated by IRX4 knockdown was higher in LNCaP cells than in VCaP cells, more pathways were identified as deregulated in LNCaP cells, and with higher activation/inhibition scores (z-scores), than in VCaP cells.
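To give a feel for how an activation z-score behaves, the sketch below uses the simple unweighted form commonly described for this kind of analysis: each target gene whose observed change agrees with the regulator being active contributes +1, each target consistent with inhibition contributes -1, and the sum is divided by the square root of the number of targets. This is an assumption for illustration only. IPA's own calculation may apply weighting and bias corrections that are not reproduced here, and the example counts are hypothetical.

```python
import math


def activation_z_score(predictions: list) -> float:
    """Unweighted activation z-score from per-target predictions.

    Each entry is +1 (observed change consistent with activation of the
    regulator) or -1 (consistent with inhibition).
    """
    if not predictions:
        raise ValueError("at least one target gene is required")
    return sum(predictions) / math.sqrt(len(predictions))


# Hypothetical regulator with 16 targets: 14 consistent with inhibition,
# 2 consistent with activation.
z = activation_z_score([-1] * 14 + [+1] * 2)
print(z)
print("predicted inhibited" if z <= -2 else "no confident prediction")
```

With this form, a regulator only reaches an absolute z-score of 2 when its targets agree strongly in one direction, which is why a significant overlap p-value can coexist with a z-score that does not meet the cutoff.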

Figure 3.19 - The differential expression of genes upon IRX4 knockdown in LNCaP and VCaP cells. (a) and (b) Volcano plots of the IRX4 regulated genes in LNCaP and VCaP cells, respectively. The array contained 23250 genes; 1996 genes were found to be differentially regulated by IRX4 knockdown in LNCaP cells, while 354 genes were identified as differentially regulated in VCaP cells. The volcano plots show log2 (fold change) on the x-axis and –log10 (p-value) on the y-axis. Each point represents a gene, and genes with a p-value ≤ 0.05 and fold change ≤ -1.5 or ≥ 1.5 are highlighted as differentially expressed. (c) and (d) Comparison between the genes regulated by IRX4 in LNCaP cells and in VCaP cells. The two sets had 202 genes in common (~57% of IRX4 regulated genes in VCaP cells and ~11% of IRX4 regulated genes in LNCaP cells).

Figure 3.20 – qPCR validation of IRX4 knockdown microarray results in (a) LNCaP and (b) VCaP cells. Cells seeded in a 6-well plate were transfected with 10 nM of siIRX4 or non-targeting siRNA (siNT) using Lipofectamine® RNAiMAX transfection reagent. At 72 hours after transfection, RNA was extracted and qRT-PCR was performed to determine the gene expression levels. Relative fold expression was calculated using the ΔΔCT method. RPL32 was used as the endogenous control, and the expression of each gene was normalised to siNT. DPP4, MYB, ITGA1, ITGB3BP, PTDSS1 and TMEM123 were downregulated with IRX4 knockdown, while MAP2K4, NR3C2, PTPN, POLR3F, WNT5A and JAG1 were upregulated in IRX4 knockdown cells (n=3 biological replicates, Mean±SEM).
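The ΔΔCT calculation named in the caption can be illustrated as follows. This is a sketch under the usual assumption of roughly 100% amplification efficiency, so that one cycle corresponds to a two-fold change. The CT values are hypothetical and the function name is introduced here only for illustration.

```python
def relative_expression(ct_target_treated: float, ct_reference_treated: float,
                        ct_target_control: float, ct_reference_control: float) -> float:
    """Relative fold expression by the delta-delta-CT method (2 ** -ddCT)."""
    # Normalise the target gene to the reference gene within each sample.
    delta_ct_treated = ct_target_treated - ct_reference_treated
    delta_ct_control = ct_target_control - ct_reference_control
    # Compare the treated sample (siIRX4) with the control sample (siNT).
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical CT values: the target gene in siIRX4 and siNT cells,
# with RPL32 as the reference gene in each sample.
fold = relative_expression(
    ct_target_treated=27.0, ct_reference_treated=18.0,
    ct_target_control=25.0, ct_reference_control=18.0,
)
print(fold)
```

In this toy case the target crosses the threshold two cycles later in the knockdown sample while the reference gene is unchanged, which corresponds to a four-fold reduction relative to siNT.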

3.3.8.1. Pathway analysis for genes regulated by IRX4 in LNCaP cells

As expected from the observed inhibition of cell proliferation, the canonical pathways involved in the cell cycle were found to be deregulated on IRX4 knockdown in LNCaP cells (Table 3-2). These included signalling by Rho Family GTPases (p-value = 5.10E-07, z-score = -2.08), cell cycle control of chromosomal replication (p-value = 7.53E-07) and RhoA signalling (p-value = 1.07E-06, z-score = -2.38). Moreover, cell cycle was predicted as the topmost deregulated molecular and cellular function (p-value = 1.78E-03 – 4.60E-17) in these cells, followed by cellular assembly and organisation (p-value = 1.89E-03 – 3.81E-16) and cellular growth and proliferation (p-value = 1.91E-03 – 3.81E-16) (Table 3-2).

Table 3-2 – Top canonical pathways and molecular and cellular functions deregulated in IRX4 knockdown LNCaP cells

Top Canonical Pathways                          p-value
Signaling by Rho Family GTPases                 5.10E-07
Cell cycle control of chromosomal replication   7.53E-07
RhoA signalling                                 1.07E-06

Molecular and Cellular Functions                p-value
Cell cycle                                      1.78E-03 – 4.60E-17
Cellular assembly and organisation              1.89E-03 – 3.81E-16
Cellular Function and Maintenance               1.91E-03 – 3.81E-16
Cellular growth and proliferation               1.89E-03 – 1.98E-14

Colony stimulating factor 2 (CSF2) was predicted to be the most likely inhibited upstream regulator in IRX4 knockdown LNCaP cells (z-score = -5.97, p-value = 5.21E-09, Appendix G and Appendix I). On the other hand, Nuclear Protein 1 (NUPR1) was found to be the most likely activated upstream regulator in IRX4 knockdown samples (z-score = 4.849, p-value = 1.80E-16, Appendix G and Appendix J). In addition, the Upstream Regulator analysis identified the inhibition of CCND1, which plays a pivotal role in G1-phase progression (z-score = -3.230, p-value = 8.07E-32), and the activation of CDKN1A (z-score = 3.196, p-value = 6.34E-24), CDKN2A (z-score = 4.220, p-value = 6.90E-14) and TP53 (z-score = 3.694, p-value = 4.44E-25) (Appendix G). Together, these data suggest a tumour-suppressive state in LNCaP cells after IRX4 knockdown.

Interestingly, the upstream regulators AR (z-score = -3.174, p-value = 6.18E-09, Figure 3.21 (a)) and dihydrotestosterone (z-score = -4.094, p-value = 3.12E-07) were found to be inhibited when IRX4 was knocked down in LNCaP cells, suggesting that IRX4 may be involved in effective androgen signalling (Appendix G).

3.3.8.2. Pathway analysis for genes regulated by IRX4 in VCaP cells

As noted above, the number of statistically significant pathways modulated by IRX4 was lower in VCaP cells than in LNCaP cells. Top canonical pathways regulated by the IRX4 gene signature included regulation of the epithelial-mesenchymal transition pathway (p-value = 1.77E-04), Human embryonic stem cell pluripotency (p-value = 2.90E-04) and Factors promoting cardiogenesis in vertebrates (p-value = 6.90E-04) (Table 3-3). The most likely deregulated molecular and cellular function in IRX4 knockdown VCaP cells was cell death and survival (p-value = 6.45E-03 – 2.10E-08), followed by cellular movement (p-value = 6.45E-03 – 1.24E-07) and cellular growth and proliferation (p-value = 6.48E-03 – 1.29E-06) (Table 3-3). The data obtained from our microarray analysis reflect the known roles of homeobox genes in developmental processes, including differentiation, early embryonic patterning, cell-type specification and organogenesis (Duverger et al., 2008).

Table 3-3 - Top canonical pathways and molecular and cellular functions deregulated in IRX4 knockdown VCaP cells

Top canonical pathways                                         p-value
Regulation of the Epithelial-Mesenchymal Transition Pathway    1.77E-04
Human embryonic stem cell pluripotency                         2.90E-04
Factors promoting cardiogenesis in vertebrates                 6.49E-04

Molecular and Cellular Functions                               p-value
Cell death and survival                                        6.45E-03 – 2.10E-08
Cellular movement                                              6.45E-03 – 1.24E-07
Cellular growth and proliferation                              6.48E-03 – 1.29E-06

Similar to LNCaP cells, CSF2 was found to be the most likely inhibited upstream regulator (z-score = -2.625, p-value = 0.038, Appendix H and Appendix I), while NUPR1 was found to be in the direction of activation with a significant overlap of genes (p-value = 7E-04, z-score = 1.414, Appendix H and Appendix J). However, the z-score did not meet the cutoff of 2, which was used as the threshold for a statistically significant activation score. On the other hand, the chemotherapy drug doxorubicin was the most likely activated upstream regulator in VCaP cells after IRX4 knockdown (Appendix H and Appendix K), which may be an indicator of the tumour-suppressive effect in these knockdown cells. Similarly, in LNCaP cells, doxorubicin was found to be activated, with an activation score of 2.391 (p-value = 1.44E-05). Even though the activation score for AR did not meet the cutoff, it showed the same trend of inhibition as in LNCaP cells, with a statistically significant overlap of genes between the two datasets (z-score = -0.317, p-value = 7E-04).

In summary, similar pathways and/or upstream regulators were modulated by IRX4 knockdown in both LNCaP and VCaP cells, which likely suggests a tumour-promoting role for IRX4 in these cells through the regulation of cell cycle and EMT pathways.

Figure 3.21 - Schematic of genes regulated by IRX4 and AR, annotated by IPA. The genes regulated by IRX4 are also found to be targets of the androgen receptor, AR, (a) in LNCaP cells and (b) in VCaP cells. Orange symbols – up-regulated genes, Blue symbols – down-regulated genes. AR is represented in blue to indicate its predicted inhibition by IRX4 knockdown. Upregulation of a gene by both IRX4 knockdown and AR activity is indicated by orange arrowed lines, while downregulation is indicated by blue arrowed lines. Inconsistent relationships between IRX4 knockdown and AR activity are represented by yellow lines. Grey dashed lines indicate that the effect of AR on the expression of the gene is not annotated in IPA.

3.3.8. The AR transcriptome is regulated by IRX4 in LNCaP cells

One of the important upstream regulators in prostate cancer, AR, was found to be inhibited in the comparison analysis between the genes regulated by IRX4 in LNCaP and VCaP cells, with the effect being larger in LNCaP cells. The modulation of androgen regulated genes by IRX4 was further confirmed by Gene Set Enrichment Analysis (GSEA – MSigDB overlap, Hallmark gene sets) in LNCaP cells (FDR p-value = 7.12E-09) and VCaP cells (FDR p-value = 5.1E-04), which identified androgen responsiveness as one of the top ten deregulated gene signatures.

In addition, an androgen regulated gene signature in LNCaP cells was available in our lab through an APCRC-Q dataset, encompassing microarray analysis of five independent experiments of androgen (DHT) treatment in LNCaP cells. This signature comprised 1926 genes regulated by DHT. As some of these microarray experiments were performed using previous versions of the Vancouver Prostate Cancer (VPC) microarray chips, the genes that were not on the earlier chips were excluded when generating the androgen regulated gene signature. Therefore, only 1766 IRX4-regulated genes were considered for further analysis instead of 1996 genes.

In the androgen gene signature, 863 genes were found to be up-regulated and 1063 genes were down-regulated. The IRX4 regulated gene signature for this analysis comprised 621 up-regulated and 1145 down-regulated genes (Figure 3.22 (a)). The comparison between these two signatures identified 492 genes as regulated by both IRX4 and DHT. This accounts for 28% of IRX4 regulated genes (492 out of 1766). Conversely, IRX4 knockdown affected 26% of the androgen regulated genes (492 out of 1926) in LNCaP cells (Figure 3.22 (b) and (c)). This overlap between the two datasets was found to be statistically significant (Fisher's exact test, p-value < 2.2e-16). The majority of the dysregulated genes (62%) showed an opposite directionality of change after IRX4 knockdown versus DHT treatment (inhibition z-score = -6.18, Figure 3.22 (d)). This further confirms the negative correlation between the genes regulated by IRX4 knockdown and by androgens (DHT), and therefore suggests that IRX4 may be important in modulating AR signalling.
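The overlap figures above can be checked with a short calculation. The sketch below computes the two percentages and a one-sided enrichment p-value from the hypergeometric distribution, which corresponds to a one-sided Fisher's exact test on the 2×2 table. The background of 23250 genes is an assumption taken from the array size given in the Figure 3.19 caption; the background actually used for the test, after restriction to genes shared between chip versions, is not stated in the text and may be smaller.

```python
import math


def log_comb(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def hypergeom_upper_tail(overlap: int, size_a: int, size_b: int, universe: int) -> float:
    """P(X >= overlap) when size_a genes are drawn from a universe of genes
    that contains size_b marked genes (one-sided enrichment test)."""
    log_denominator = log_comb(universe, size_a)
    log_terms = [
        log_comb(size_b, k) + log_comb(universe - size_b, size_a - k) - log_denominator
        for k in range(overlap, min(size_a, size_b) + 1)
    ]
    peak = max(log_terms)  # factor out the largest term for numerical stability
    return math.exp(peak) * sum(math.exp(term - peak) for term in log_terms)


irx4_genes, androgen_genes, shared = 1766, 1926, 492
assumed_universe = 23250  # assumption: total genes on the array (Figure 3.19)

print(f"Share of IRX4 regulated genes: {shared / irx4_genes:.1%}")
print(f"Share of androgen regulated genes: {shared / androgen_genes:.1%}")

p_value = hypergeom_upper_tail(shared, irx4_genes, androgen_genes, assumed_universe)
print(p_value < 2.2e-16)
```

Under this assumed background, roughly 146 shared genes would be expected by chance (1766 × 1926 / 23250), so an observed overlap of 492 lies far in the upper tail.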

Figure 3.22 – A comparison between IRX4 regulated genes and androgen regulated genes (ARG). (a) 621 genes were found to be upregulated after IRX4 knockdown in LNCaP cells, while 1145 genes were downregulated. Androgen treatment in LNCaP cells upregulated 863 genes and downregulated 1063 genes. (b) and (c) A statistically significant overlap of 492 genes was found to be regulated by both IRX4 and androgens in LNCaP cells. (d) 62% of the overlapping genes exhibited opposite directionality in regulation by IRX4 knockdown and androgen treatment.

3.3.9. IRX4 interacts with the AR co-factor, FOXA1 in LNCaP cells

Based on the microarray data, we hypothesised that IRX4 may regulate AR. However, previous studies investigating AR interacting proteins did not identify IRX4 as a binding partner of AR in prostate cancer cells, including LNCaP cells (Paltoglou et al., 2017; Stelloo et al., 2017), suggesting that the regulation of the AR transcriptome by IRX4 is not a result of direct interaction between these proteins. Moreover, IRX4 knockdown did not have any effect on AR transcript levels, as observed in the microarray data. Therefore, IRX4 may be modulating the AR cistrome by either competing with, or guiding, AR to its chromatin binding sites, or this observation might be a consequence of IRX4 interacting with other AR binding partners and thereby modulating AR activity.

In order to identify the interacting partners of IRX4 in LNCaP cells, an immunoprecipitation (IP) was performed with LNCaP nuclear enriched cell lysate. IRX4 was successfully pulled down and detected with 19% – 31.1% coverage in replicates. Compared to the IgG control, 86 unique interacting proteins were identified in the IRX4 IP sample (Intensity > E+07, Appendix G).

AR was not found in the IRX4 IP samples, suggesting that these factors may not directly interact in these cells. Thus, to elucidate the mechanism by which IRX4 regulates the androgen gene signature, this study focused on proteins known to be potential interactors and co-regulators of AR. The proteins identified by IRX4 IP were compared with the AR interacting proteins in androgen treated LNCaP cells published in a recent study (Stelloo, et al., 2017). In that study, Stelloo et al. performed the IP experiment for endogenous AR in LNCaP cells stimulated with the synthetic androgen R1881 for 4 hours, whereas in the present study endogenous IRX4 was pulled down without any treatment. A comparison between these two datasets identified 15 overlapping proteins, suggesting that 17% of IRX4 interacting proteins (15 out of 86) are also known interactors of AR (Table 3-4).

Interestingly, the pioneer transcription factor FOXA1 (also known as hepatocyte nuclear factor 3, HNF-3) was detected in the IRX4 IP sample (Table 3-4). The interaction between IRX4 and FOXA1 was further confirmed by Western blot analysis in an independent IP experiment (Figure 3.23). This suggests that IRX4 might play a role in altering the AR transcriptome by interacting with its co-factor FOXA1. Further comparison of the FOXA1 transcriptome with IRX4 regulated genes will provide more insight into the regulation of AR signalling by IRX4 via FOXA1.

Table 3-4 - Proteins found to be interacting with both IRX4 and AR in LNCaP cells

Gene name   Accession ID   Description
DHX15       O43143         Putative pre-mRNA-splicing factor ATP-dependent RNA helicase DHX15
RPS2        P15880         40S ribosomal protein S2
MYH9        P35579         Myosin-9
HSD17B4     P51659         Peroxisomal multifunctional enzyme type 2; (3R)-hydroxyacyl-CoA dehydrogenase; Enoyl-CoA hydratase 2
FOXA1       P55317         Hepatocyte nuclear factor 3-alpha
RPS6        P62753         40S ribosomal protein S6
TJP1        Q07157         Tight junction protein ZO-1
DHX9        Q08211         ATP-dependent RNA helicase A
PCBP2       Q15366         Poly(rC)-binding protein 2
IMMT        Q16891         Mitochondrial inner membrane protein
LONP2       Q86WA8         Lon protease homolog 2, peroxisomal
SMARCC1     Q92922         SWI/SNF complex subunit SMARCC1
NCOA5       Q9HCD5         Nuclear receptor coactivator 5
CHCHD3      Q9NX63         Coiled-coil-helix-coiled-coil-helix domain-containing protein 3, mitochondrial
CHTOP       Q9Y3Y2         Chromatin target of PRMT1 protein

Figure 3.23 - IRX4 interacts with FOXA1 in LNCaP cells. IRX4 IP was performed with nuclear enriched LNCaP cell lysate using a rabbit anti-IRX4 antibody. Western blot analysis with (a) an IRX4 antibody confirmed IRX4 pulldown (blue arrow) and with (b) a FOXA1 antibody detected FOXA1 (blue arrow) at 49 kDa only in the IRX4 pulldown sample.

Moreover, IPA analysis was performed with the proteins found to be interacting with IRX4. The upstream regulator analysis identified Estrogen Receptor 1 (ESR1) as one of the top regulators (Table 3-5), further suggesting that IRX4 may be involved in hormone-regulated pathways. It also identified other cancer-related factors, including VEGFA, HNF4A and SNAI1, as upstream regulators of the IRX4-interacting partners. Further studies on the role of IRX4 in these pathways should clarify the mechanistic role of IRX4 in prostate cancer.

Table 3-5 – The top upstream regulators of the IRX4-interacting proteins

Upstream Regulator   Molecule Type                       p-value of overlap
ESR1                 ligand-dependent nuclear receptor   4.55E-05
MYCN                 transcription regulator             0.000127
INSR                 kinase                              0.000168
MKNK1                kinase                              0.000689
EIF2AK3              kinase                              0.000826
ETV5                 transcription regulator             0.000903
VEGFA                growth factor                       0.00102
MYC                  transcription regulator             0.00125
HNF4A                transcription regulator             0.00129
SNAI1                transcription regulator             0.00135

3.4. Discussion

The prostate cancer risk associated 5p15 locus encompasses the plausible candidate genes TERT and CLPTM1L within a 700 kb region of the rs12653946 SNP. However, Batra et al. showed that there is no LD between the rs12653946 SNP and the risk SNPs associated with these two genes (Batra, et al., 2011); thus the rs12653946 SNP represents an independent prostate cancer risk locus. IRX4 has been identified as a potential prostate cancer risk gene from GWAS and additional eQTL studies. This emphasises the importance of these studies in identifying novel genes associated with prostate cancer pathogenesis.

The IRX transcription factors have distinct expression patterns across different tissues and cancers. Previous studies of IRX1 and IRX2 in the development of mouse embryonic lung tissue (Becker et al., 2001) and the frog pronephros (Marra et al., 2014) correlate with the higher expression of these genes in these tissues. A study by Tena et al. identified the distribution of cis-regulatory elements throughout the IRXa cluster (consisting of IRX1, 2 and 4) and shared enhancers among the genes in the cluster (Tena et al., 2011). Even with these shared enhancers, the expression patterns of the IRXs in each cluster were not found to be completely similar (Bosse et al., 1997; de la Calle-Mustienes et al., 2005; Gomez-Skarmeta et al., 1998; Houweling et al., 2001; Tena, et al., 2011), as only the promoters of the first two genes in these clusters (IRXa – IRX1 and IRX2, IRXb – IRX3 and IRX5) preferentially interacted with the enhancers. This long-range interaction and conserved three-dimensional architecture were confirmed by Tena et al. and found to be dependent on the transcriptional repressor CTCF (Tena, et al., 2011). Likewise, a similar expression pattern was observed between IRX1 and IRX2, as well as between IRX3 and IRX5, in various tissues and cancers. Recent studies in the field of developmental biology, including studies of homeobox genes, provide evidence of the resemblance between normal development and tumorigenesis, suggesting a potential role for developmental genes in cancer pathogenesis (Bhatlekar, et al., 2014; Shah, et al., 2010). Therefore, understanding the expression pattern of these IRX factors in various cancers could provide insights into tumour development and lead to the development of new therapeutic targets. Interestingly, higher expression of IRX4 is observed in prostate cancer compared to other types of cancer, which further suggests a role for IRX4 in prostate cancer progression. In addition, IRX4 is overexpressed in prostate tumour tissues compared to normal tissues, which was also confirmed in our prostate cancer FFPE samples.

The siRNA mediated knockdown of IRX4 reduced cell proliferation in LNCaP and C4-2B cells. The lower rate of proliferation in IRX4 knockdown LNCaP cells was confirmed by both the IncuCyte live cell imaging system and the cyQUANT assay using two siRNAs. However, this result challenges previously published data, in which IRX4 knockdown was proposed to increase prostate cancer cell proliferation (Nguyen, et al., 2012). This could be because the siRNAs used in these two studies target different exons of the IRX4 gene, and thus might target different splice variants/isoforms of IRX4. The presence of a putative exon is reported for IRX4 in an expressed sequence tag (EST) data set (BY799479). The expression of this exon is reported in prostate tissue and a few other tissues in the GTEx portal data (Appendix N). This variant is predicted to produce a different protein isoform. Alternative splicing is reported as a common feature of genes encoding transcription factors, and the expression of the spliced exons is tissue-specific (Haendeler et al., 2013). Interestingly, the homeodomain gene HNF1B, which encodes three variants, A, B and C, has been shown to have different functions depending on the variant: HNF1B variants A and B act as transcriptional activators, while HNF1B variant C functions as a transcriptional repressor (Harries et al., 2010). Thus, using another siRNA targeting the variant containing the new exon would further clarify the results obtained in our studies. Moreover, overexpression studies of the different IRX4 variants would identify their individual functions. Similar to our observation in prostate cancer cell proliferation, IRX4-positive ventricular progenitor cells were observed to have a high proliferation rate (Nelson et al., 2016). Moreover, knockdown of another IRX family member, IRX5, also led to reduced proliferation in prostate cancer cells (Myrthue, et al., 2008). Furthermore, IRX4 knockdown in LNCaP cells decreased migration. A marked morphological change was observed in these IRX4 knockdown cells, and the key EMT transcription factor Slug and the mesenchymal marker Vimentin were downregulated with IRX4 knockdown.

No studies published to date have comprehensively analysed the transcriptome and pathways affected by IRX4 in any cells. Therefore, we performed microarray analysis with IRX4 knockdown in LNCaP and VCaP cells and found cell cycle and EMT pathways to be modulated, which explains the observations in the functional assays and the changes in cell morphology. Epithelial-mesenchymal plasticity has been reported to play an important role in prostate cancer metastatic progression (Bitting et al., 2014; Ye et al., 2015). During this transition, cancerous epithelial cells lose cellular properties such as cell-cell adhesion and cell polarity and acquire a mesenchymal phenotype with enhanced migratory and invasive potential (Montanari, et al., 2017). Interestingly, human embryonic stem cell pluripotency was one of the top deregulated pathways in the IRX4 knockdown VCaP microarray analysis. Published literature describing IRX4 as a marker of ventricular myocardium differentiation, with multipotential ability to differentiate into ventricular myocytes, smooth muscle cells and endothelial cells (Bruneau, et al., 2000; Nelson, Lalit, et al., 2016), suggests that similar genes and pathways may be regulated by IRX4 in prostate cancer cells.

Even though the number of genes regulated by IRX4 knockdown was higher in LNCaP cells than in VCaP cells, similar upstream regulators were predicted to be deregulated in both cell lines, including inhibition of CSF2 and activation of NUPR1. Although not directly implicated in prostate cancer, CSF2 overexpression has been reported to be associated with STAT5 phosphorylation, which is proposed as a potential therapeutic target for epithelial tumours such as prostate and breast cancer (Lee et al., 2016; Koptyra et al., 2011; Page et al., 2012). The cellular stress responsive gene NUPR1 has been implicated in context-dependent biological functions, which accounts for both the oncogenic and the tumour-suppressive roles of this protein in tumour growth (Chowdhury et al., 2009). NUPR1 has been found to promote tumour progression in breast, thyroid, brain and pancreatic malignancies (Chowdhury, et al., 2009; Ito et al., 2003; Ree et al., 2000; Su et al., 2001), while it acts as a tumour suppressor in prostate cancer (Jiang et al., 2006).

Interestingly, the expression of AR regulated genes was influenced by IRX4 knockdown in both cell types, with the effect being larger in LNCaP cells. Additional comparative analysis between the androgen gene signature and IRX4 regulated genes in LNCaP cells identified a significant overlap between these two datasets, with a negative correlation between androgen regulated genes and IRX4 knockdown, further supporting a role for IRX4 in regulating the AR transcriptome. We further identified a few common binding partners of AR and IRX4, including the pioneer co-factor FOXA1, suggesting that the modulation of androgen signalling by IRX4 may be mediated via FOXA1. FOXA1 is known to remodel chromatin to allow genomic access by hormone-regulated transcription factors, including AR and the estrogen receptor (ER) (Yang et al., 2015). This AR co-factor has been reported to promote tumour progression by inducing cell cycle pathways in androgen dependent prostate cancer. A well-known homeobox protein, HOXB13, has been reported to colocalize with FOXA1 and thereby reprogram AR binding sites in prostate tumour tissue (Pomerantz et al., 2015). Interestingly, we also noted a negative correlation between FOXA1 and IRX4 expression in the TCGA data set. However, detailed exploration of the interplay of IRX4, FOXA1 and AR at the cistromic level is essential to definitively identify the shared targets of these transcription factors and to confirm the role of IRX4 in androgen signalling.

Chapter 4: IRX4lncRNA at the prostate cancer risk 5p15 locus promotes prostate cancer progression

4.1. Introduction

Through GWAS, the 5p15 locus was identified as being associated with prostate cancer risk in multi-ethnic populations (Batra, et al., 2011; Lindstrom, et al., 2012; Takata, et al., 2010). It has been revealed that at least three genes of interest may be the target of causative SNPs at the prostate cancer risk associated 5p15 locus – IRX4, CTD-2194D22.4 and IRX4lncRNA. CTD-2194D22.4 was not expressed in normal prostate or cancerous tissues in our in-house RNA-sequencing data, implying that this lncRNA may not be functional in prostate cancer pathogenesis, even though the prostate cancer risk associated SNP, rs12653946, is located in the intronic region of this gene. The IRX4 gene at the 5p15 locus has previously been identified as the top-ranked expression quantitative trait locus (eQTL) (Xu, et al., 2014). We have already shown the potential role of IRX4 in androgen signalling via interaction with the AR co-factor FOXA1 in prostate cancer cells (Chapter 3). Interestingly, we observed the expression of IRX4lncRNA from the antisense strand of IRX4 in our paired-end RNA-sequencing data. A recent study on lncRNAs at the GWAS identified prostate cancer risk loci has shown CTD-2194D22.3, or IRX4lncRNA, to be one of the 45 candidate lncRNAs associated with prostate cancer risk (Guo, et al., 2016). In that study, IRX4lncRNA was found to be differentially expressed in prostate tumour samples compared to normal prostate tissue in TCGA data. In addition, the expression of this lncRNA was correlated with the SNP rs12653946 genotype at the 5p15 locus (Guo, et al., 2016). However, the functional role of this lncRNA in prostate cancer aetiology still needs to be explored.

LncRNAs act as important mediators of gene expression, regulating the biological pathways of cell proliferation, migration and apoptosis, which play a crucial role in cancer development and progression. The modulation of lncRNA expression has been shown to have an effect on tumour formation, progression and metastasis. Furthermore, the rapid increase in our understanding of the mechanisms of action of lncRNAs as critical regulators of these cellular processes in various cancers increasingly suggests that lncRNAs may represent a poorly characterised layer of cancer biology (Prensner & Chinnaiyan, 2011). Emerging studies have suggested that lncRNAs may be used as biomarkers or targeted therapies for prostate cancer alongside conventional methods; the diagnostic, prognostic and therapeutic potential of lncRNAs in prostate cancer management is discussed in Chapter 1. These lncRNAs can regulate gene expression by either cis (targeting genomically local genes) or trans (targeting distant genes) action (Ponting et al., 2009; Smolle, et al., 2017). For instance, the tumour suppressor ncRNA PTENP1 acts in cis by decoying miRNAs which target the 3’ UTR (untranslated region) of the PTEN gene (phosphatase and tensin homolog) (Poliseno, et al., 2010). A well-known lncRNA, ANRIL, an antisense RNA transcribed from the INK4 locus, has the potential to act both in cis and in trans. In vitro studies indicate that ANRIL represses INK4A/INK4B by directly binding to a member of the polycomb complex 1 (PRC1), CBX7 (Yap, et al., 2010), and a member of the polycomb complex 2 (PRC2), SUZ12 (Kotake, et al., 2011), to subsequently repress histone modifications at that locus. In addition to this, ANRIL can bind to specific sequences such as Alu elements to regulate genes in trans (Aguilo et al., 2016; Chi et al., 2017). The lncRNA HOTAIR is involved in epigenetic regulation at a genome-wide level by interacting with the histone modification complexes PRC2 and lysine specific demethylase 1 (LSD1) (Chen et al., 2016).

Interestingly, some of the candidate lncRNAs of prostate cancer are known to be regulated by androgens as well as to alter the androgen signalling pathway. For instance, the lncRNA CTBP1-AS, which is located in the antisense region of the AR corepressor C-terminal binding protein 1 (CTBP1), is androgen responsive and has been shown to promote both hormone-dependent and castration-resistant tumour growth (Takayama, et al., 2013). It has also been suggested that CTBP1-AS exhibits a global androgen-dependent function by inhibiting tumour-suppressor genes through PSF (polypyrimidine tract-binding protein (PTB)-associated splicing factor)-dependent mechanisms, thus promoting cell cycle progression (Takayama, et al., 2013). Polymorphisms in the lncRNA PRNCR1, located at the cancer susceptibility region 8q24, were associated with prostate cancer risk (Chu et al., 2017; Chung, et al., 2011), and PRNCR1 was involved in prostate cancer cell viability and AR transactivation (Yang, et al., 2013). Another androgen-regulated prostate-specific lncRNA, PCGEM1, which is located at chromosome 2q32, is overexpressed in a significant percentage of primary prostate cancer specimens (Yang, et al., 2013). The overexpression of PCGEM1 in LNCaP and NIH3T3 cells promoted cell proliferation and colony formation, suggesting a biological role for this lncRNA in cell growth regulation (Petrovics, et al., 2004).

In this chapter, the expression of IRX4lncRNA was characterised in clinical samples by in-silico analysis of public databases and was further confirmed in prostate cancer samples from the APCB. Expression analysis was then performed in a panel of prostate cancer cell lines to select the best cell lines for establishing overexpression and knockdown models of IRX4lncRNA. In-vitro assays were performed to identify the potential function of this lncRNA in prostate cancer aetiology, and the regulation of IRX4 by its antisense IRX4lncRNA was examined in prostate cancer cells.

4.2. Methods

4.2.1. In-silico prediction of IRX4lncRNA expression and function

The expression of IRX4lncRNA in different tissues was assessed using data derived from the Noncode database (Bu et al., 2012; Liu et al., 2005; Xie et al., 2014; Zhao et al., 2016). The differential expression of this lncRNA in tumour versus normal tissues was assessed using the MiTranscriptome database (Iyer et al., 2015). IRX4lncRNA alterations at the transcriptional, genomic and epigenetic levels were obtained from The Cancer LncRNome Atlas (http://52.25.87.215/TCLA/index.php). The function of IRX4lncRNA was predicted using an online tool, FuncPred (Perron et al., 2017).

4.2.2. Strand specific RT-qPCR

1 μg of RNA was reverse transcribed to cDNA using SuperScript® III Reverse Transcriptase (Invitrogen, Catalog number - 18080-044). Strand specific cDNA was synthesised with 5 μM strand specific primers (Sigma) for IRX4lncRNA and RPL32 (used as a normalisation control in RT-qPCR), and the expression levels of IRX4lncRNA and RPL32 were quantified using the SYBR Green PCR method (Life Technologies).
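As an illustration of how SYBR Green CT values are commonly converted into relative expression, the sketch below applies the ΔCT method with RPL32 as the normalisation control. The function name and the CT values are hypothetical and are not taken from this study, and the calculation assumes an amplification efficiency of approximately 100% (one doubling per cycle).

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Relative expression by the delta-CT method: 2 ** -(CT_target - CT_reference).

    Assumes ~100% PCR efficiency for both amplicons (an assumption of this sketch).
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical CT values for illustration only (not data from this thesis).
ct_irx4lncrna = 27.0
ct_rpl32 = 18.0

rel = relative_expression(ct_irx4lncrna, ct_rpl32)
print(f"delta-CT = {ct_irx4lncrna - ct_rpl32:.1f}, relative expression = {rel:.6f}")
```

A higher CT for the target than for the reference gives a relative expression below 1, so a lowly expressed transcript appears as a small fraction of the reference signal.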

4.2.3. Establishment of stable cell lines

The pLKO Tet-ON vector (Sigma, kindly provided by Dr Patrick Ling) was used to clone shRNA sequences. Briefly, the 1.9 kb stuffer was removed from the vector by double restriction digestion of 1 μg of plasmid DNA with AgeI (10 units, New England Biolabs Inc, NEB, Catalog number - R0552S) and EcoRI (10 units, NEB, Catalog number - R0101S) in 10X CutSmart buffer (NEB). The annealed shRNA oligos (0.1 nmol/µl) were ligated into the gel-purified, double digested pLKO-Tet-On vector. The ligated product was then transformed into competent Stbl3 E. coli cells, and positive clones were screened by restriction digestion with XhoI (NEB, R0146S) followed by electrophoresis on a 2% agarose gel (two closely migrating bands at 200 bp) and by sequencing.

The overexpression of IRX4lncRNA was carried out according to the Gateway® Technology (Invitrogen) instructions. The full length IRX4lncRNA was PCR amplified from LNCaP cDNA and cloned into the pDONR223 vector using the BP recombination reaction protocol (Gateway® Technology, Invitrogen). The insert was then cloned into pINDUCER21 (a gift from Stephen Elledge & Thomas Westbrook, Addgene plasmid # 46948; Meerbrey et al., 2011) using the LR recombination reaction protocol (Gateway® Technology, Invitrogen). The inserts were confirmed by sequencing.

Lentiviral particles (293T) production - On the day before transfection, 1.5 × 10^6 HEK-293T cells were plated onto 10 cm dishes in 10 ml DMEM supplemented with 10% heat inactivated FBS without antibiotics. Then, 12 µl of X-tremeGENE HP DNA Transfection Reagent was incubated with 1.8 µg of pCMV-Δ8.2R, 0.2 µg of pCMV-VSVG (kindly provided by Dr Brett Hollier) and 2.0 µg of pDNA (plasmid of interest) at room temperature for 15 minutes. The mixture was then added dropwise onto the 293T cells using a Pipetman and incubated overnight at 37 °C. The viral supernatant was harvested at 48 and 72 hours and filtered through a 0.45 micron filter.
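The amounts above correspond to 4.0 µg of total DNA and 3 µl of transfection reagent per µg of DNA for one 10 cm dish. The sketch below shows how such a mix could be tallied for several dishes. The function name is hypothetical, and the linear scaling by dish number is an assumption of this sketch rather than a procedure described in this thesis.

```python
# Per-dish amounts as stated in the text above (one 10 cm dish).
PER_DISH_DNA_UG = {
    "pCMV-Δ8.2R": 1.8,
    "pCMV-VSVG": 0.2,
    "plasmid of interest": 2.0,
}
PER_DISH_REAGENT_UL = 12.0


def transfection_mix(n_dishes: int) -> dict:
    """Scale the per-dish mix linearly (an assumption) and report totals."""
    if n_dishes < 1:
        raise ValueError("n_dishes must be at least 1")
    dna_ug = {name: ug * n_dishes for name, ug in PER_DISH_DNA_UG.items()}
    total_dna_ug = sum(dna_ug.values())
    reagent_ul = PER_DISH_REAGENT_UL * n_dishes
    return {
        "dna_ug": dna_ug,
        "total_dna_ug": total_dna_ug,
        "reagent_ul": reagent_ul,
        "reagent_ul_per_ug_dna": reagent_ul / total_dna_ug,
    }


mix = transfection_mix(3)
print(f"total DNA: {mix['total_dna_ug']:.1f} µg, reagent: {mix['reagent_ul']:.0f} µl")
```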

Infection of target cells - The LNCaP and PC3 cells were seeded 1-2 days before the last collection of the viral particles, so that they were at 30-50% confluence at the time of infection. The target cells were infected with the viral supernatant and protamine sulfate (8 µg/ml). The cells were fed with fresh medium on the day following infection. When the cells were confluent, they were scaled up into larger flasks and the selection was done based on antibiotic resistance for knockdown models and GFP for overexpression models.

4.3. Results

4.3.1. Characterisation of IRX4lncRNA in clinical samples

Initially, the expression of IRX4lncRNA in different tissues was checked in publicly available databases. Noncode is an integrated knowledge database comprising non-coding RNAs from 17 species, including human (Bu, et al., 2012; Liu, et al., 2005; Xie, et al., 2014; Zhao, et al., 2016). Briefly, newly identified lncRNAs were retrieved from the literature and other public databases such as Ensembl, RefSeq, lncRNAdb and GENCODE, and the data were processed through a standard pipeline for each species. According to this database, the IRX4lncRNA gene is conserved in the mouse and rat genomes, on chromosomes 13 and 1 respectively (Appendix O). The IRX4 gene is also located on the same chromosomes in these species. Conservation data for other species are not available. In addition, transcription of mRNA was reported in this conserved region of both the mouse and rat genomes. IRX4lncRNA expression was observed only in some specific tissues, including prostate (Figure 4.1). The highest expression of IRX4lncRNA was observed in heart and skeletal muscle, which correlates with the expression of IRX4 in these tissues.

Figure 4.1 - Expression profile of IRX4lncRNA in different tissue samples. IRX4lncRNA is expressed only in specific tissues, including prostate. The highest lncRNA expression was observed in heart and skeletal muscle. (Source – www.noncode.org; FPKM: fragments per kilobase of exon per million fragments mapped).
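For readers unfamiliar with the FPKM unit used in these database figures, the following minimal sketch shows the standard definition as a calculation. The function name and the example counts are hypothetical and are not values from the Noncode or MiTranscriptome databases.

```python
def fpkm(fragments_on_gene: int, exon_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of exon per million fragments mapped."""
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("exon length and total mapped fragments must be positive")
    kilobases = exon_length_bp / 1_000                 # exon model length in kb
    millions = total_mapped_fragments / 1_000_000      # library size in millions
    return fragments_on_gene / kilobases / millions


# Hypothetical example: 500 fragments on a 2 kb transcript in a 20 million fragment library.
print(fpkm(500, 2_000, 20_000_000))
```

Because the value is normalised for both transcript length and library size, FPKM allows rough comparison of expression across transcripts and samples.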

The expression of IRX4lncRNA was then checked in different cancer types in the MiTranscriptome database (Iyer, et al., 2015). This database is a catalog of poly-adenylated RNA transcripts derived from computational analysis of RNA-sequencing data from 6500 samples of various cancers and normal tissues from 25 independent data sets, including TCGA data. IRX4lncRNA is overexpressed in prostate cancer samples compared to normal prostate. A similar trend was observed in lung and head and neck cancer, while it is overexpressed in normal breast tissues compared to breast cancer. However, only low/minimal expression was observed in heart and skeletal muscle, whereas these two tissues had a higher expression of IRX4lncRNA according to the Noncode database (Figure 4.1, Figure 4.2). This could be due to the small number of samples representing heart and skeletal muscle tissues in the MiTranscriptome database (13 and 16 samples respectively) (Iyer, et al., 2015).

Figure 4.2 - IRX4lncRNA expression in various tumour and normal tissues. The figure is derived from MiTranscriptome for the expression of IRX4lncRNA in clinical samples. The red box shows the overexpression of IRX4lncRNA in prostate tumour tissues compared to normal samples. Similarly, overexpression of IRX4lncRNA was observed in lung and head and neck cancer tissues compared to controls, while the opposite trend was observed in breast tissues. Other tissues expressed relatively lower levels of this lncRNA (FPKM: fragments per kilobase of exon per million fragments mapped).

In addition, IRX4lncRNA alterations at the transcriptional, genomic and epigenetic levels were obtained from The Cancer LncRNome Atlas (TCLA, http://52.25.87.215/TCLA/index.php). This database analyses 13,562 lncRNAs from 13 cancer types obtained from TCGA, comprising expression, copy number variation, methylation and SNP data of 5037 specimens. The alterations in IRX4lncRNA in various cancers are summarised in Table 4-1. IRX4lncRNA is upregulated in prostate cancer samples in the TCGA sample set, as also seen in the MiTranscriptome data. IRX4lncRNA is one of 140 lncRNAs whose upregulation was unique to prostate cancer compared to the other cancer types available in the TCLA database. Even though it was detected in head and neck cancer, no differential expression was observed between cancer and normal samples. Interestingly, this gene was epigenetically silenced in many cancer types, including bladder and ovarian cancer, but not in prostate cancer. In addition, copy number variation was detected at this locus in bladder, breast and skin cancer (Table 4-1).

The expression levels of IRX4lncRNA obtained from the TCGA prostate cancer dataset were analysed for correlation with disease aggressiveness and treatment outcome. PCA3 expression was used as a comparison, as PCA3 is a well-characterised lncRNA biomarker used for prostate cancer diagnosis. The PCA3 score has the ability to predict a low-volume prostate tumour (Auprich et al., 2011), with conflicting reports on its potential to predict disease aggressiveness (Salagierski et al., 2010). The overexpression of IRX4lncRNA in TCGA prostate cancer samples compared to their matched controls is shown in Figure 4.3 (a). Even though IRX4lncRNA was overexpressed in prostate tumour samples with similar statistical significance, PCA3 had a much higher expression than IRX4lncRNA (Figure 4.3 (b)).

Table 4-1 - Alterations detected for IRX4lncRNA in different cancers from the TCGA dataset

Columns: Cancer Type; Expression (Detectable, Dysregulated); CNV1 (Alteration, Focal alteration loci); CAESLG2

Bladder Urothelial Carcinoma (BLCA): √ √; 5p15.33 (chr5:1-2732677): amplification
Breast Invasive Carcinoma (BRCA): √; 5p15.33 (chr5:1-4509373): amplification
Colon Adenocarcinoma (COAD)
Glioblastoma Multiforme (GBM): √
Head and Neck Squamous Cell Carcinoma (HNSC): √
Kidney Renal Clear Cell Carcinoma (KIRC)
Acute Myeloid Leukemia (LAML): √
Lung Adenocarcinoma (LUAD)
Lung Squamous Cell Carcinoma (LUSC)
Ovarian Serous Cystadenocarcinoma (OV): √
Prostate Adenocarcinoma (PRAD): √ √ (Up)
Skin Cutaneous Melanoma (SKCM): √ √; 5p15.31 (chr5:1-31199922): deletion
Uterine Corpus Endometrial Carcinoma (UCEC): √

1 – Copy Number Variation, 2 - Cancer Associated Epigenetically Silenced LncRNA Genes

Figure 4.3 - The expression of IRX4lncRNA and PCA3 in the TCGA prostate cancer dataset. Differential expression between tumour and matched controls was determined in TCGA prostate cancer samples for (a) IRX4lncRNA and (b) PCA3. (Source – TCGA prostate cancer adenocarcinoma, n=52 patients; Mean±SEM, Wilcoxon matched-pairs signed rank test, ****p<0.0001).

Then, the correlation of lncRNA expression with the Gleason Score, a clinical predictor of prostate cancer prognosis, was assessed. Even though an increasing trend in the expression of IRX4lncRNA was observed with increasing Gleason score (Figure 4.4 (a)), the results were not statistically significant, while PCA3 expression decreased with increasing Gleason score (Figure 4.4 (b)).

Figure 4.4 - Correlation between the Gleason score and IRX4lncRNA/PCA3 expression. (a) IRX4lncRNA expression increased with increasing Gleason score. (b) PCA3 expression reduced with increasing Gleason score (GS6 = 26, GS7 = 229, GS8 = 43, GS9 = 73 and GS10 = 3 patients; Mean±SEM, Kruskal-Wallis test, *p<0.05, **p<0.001, ****p<0.0001).

Although Gleason Score 7 is considered an intermediate risk by clinicians, Gleason Score 4+3 has been reported to have a poorer pathological stage and biochemical recurrence outcome compared to Gleason Score 3+4 (Gordetsky et al., 2016). Patients with Gleason Score 4+3 had a higher expression of IRX4lncRNA compared to patients with Gleason Score 3+4 (Figure 4.5 (a)). On the other hand, no differences were observed in PCA3 expression between Gleason Scores 3+4 and 4+3 (Figure 4.5 (b)).
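The two-group comparisons in this section rely on the Mann-Whitney test. As a minimal sketch of the statistic underlying that test, the code below counts, over all pairs of observations, how often a value from one group exceeds a value from the other (ties count as one half). The function name and the expression values are hypothetical and are not data from the TCGA dataset; the sketch computes only the U statistic and does not derive a p-value, for which a statistics package would normally be used.

```python
from typing import Sequence, Tuple


def mann_whitney_u(group_a: Sequence[float], group_b: Sequence[float]) -> Tuple[float, float]:
    """Return (U_a, U_b), where U_a counts pairs in which a value from group_a
    exceeds a value from group_b, with ties contributing 0.5."""
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b


# Hypothetical expression values for illustration only.
gs_3_plus_4 = [1.2, 0.8, 1.5, 0.9]
gs_4_plus_3 = [2.1, 1.7, 1.4, 2.6]

u_43, u_34 = mann_whitney_u(gs_4_plus_3, gs_3_plus_4)
print(f"U(4+3) = {u_43}, U(3+4) = {u_34}")
```

A U value close to the total number of pairs for one group indicates that its values are consistently higher than those of the other group.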

Figure 4.5 - IRX4lncRNA expression is associated with aggressive disease. (a) IRX4lncRNA was overexpressed in patients with Gleason Score 4+3 compared to 3+4. (b) No difference was observed in PCA3 expression between Gleason Score 3+4 and 4+3. (Source – TCGA prostate cancer adenocarcinoma, GS3+4 = 140 and GS4+3 = 89 patients; Mean±SEM, Mann-Whitney test, two-tailed, ***p<0.001).

Interestingly, the patients who had cancer recurrence or progression had a higher IRX4lncRNA expression compared to disease free patients (Figure 4.6 (a)). On the other hand, these patients had a lower PCA3 expression compared to disease free patients (Figure 4.6 (b)).

The overexpression of IRX4lncRNA in prostate tumour samples compared to adjacent non-malignant tissues was further confirmed by RT-qPCR analysis of RNA samples extracted from FFPE tissues obtained from the APCB (Figure 4.7).

Figure 4.6 - IRX4lncRNA and PCA3 expression in disease free and cancer progressed patients. (a) IRX4lncRNA was overexpressed in patients with cancer recurrence compared to disease free patients. (b) PCA3 expression was higher in the disease free patients compared to the patients with progressed disease. (Source – TCGA prostate cancer adenocarcinoma, Disease free = 308, Recurred/Progressed = 61 patients; Mean±SEM, Mann-Whitney test, ****p<0.0001).

Figure 4.7 - IRX4lncRNA is overexpressed in prostate tumour samples compared to adjacent non-malignant tissues. qRT-PCR analysis was performed with RNA extracted from prostate tumour and adjacent non-malignant tissues from FFPE blocks. The relative expression of IRX4lncRNA was calculated by the ΔCT method using the geometric mean of RPL32 and HPRT1 expression as the endogenous control. (APCB samples, n=78, Mean±SEM, Wilcoxon matched-pairs signed rank test, ****p<0.0001).

4.3.2. Expression of IRX4lncRNA splice variants in prostate cancer cells

According to the alternative splicing predictions from the Swiss Institute of Bioinformatics, derived from the UCSC genome browser, two transcript variants were predicted for IRX4lncRNA (NC_000005_71, Figure 4.8). This lncRNA has three exons, of which variant 1 comprises exons 1 and 3, while variant 2 is encoded by exons 2 and 3. Expression of the middle exon (exon 2) was not seen in the RNA-sequencing data of prostate cancer cell lines, suggesting that variant 2 may not be expressed in prostate cancer. Androgen responsive LNCaP cells had a higher expression of variant 1 and the normal prostate cell lines HPR1 and RWPE1 had a moderate expression, while the metastatic prostate cancer cell line, PC3, did not express IRX4lncRNA. The benign prostate cell line, BPH1, had low levels of IRX4lncRNA expression.

Expression of the IRX4lncRNA variants was further validated by RT-PCR (Figure 4.9). Variant 1 had a higher expression than variant 2 in a panel of prostate cancer cell lines. The benign prostate cell lines, BPH1 and RWPE1, and the prostate cancer cell lines RWPE2, DuCaP, VCaP, LNCaP and C4-2B had higher expression levels of variant 1 compared to the androgen independent cell lines PC3, DU145 and 22RV1. Low levels of variant 2 expression were detected in RWPE1, DuCaP, VCaP and C4-2B cells. As the expression levels of variant 2 were very low, this study focused only on variant 1.

Figure 4.8 - Predicted variants of IRX4lncRNA. Two variants are predicted for IRX4lncRNA according to the Swiss Institute of Bioinformatics gene predictions. The lncRNA has three exons, with variant 1 consisting of exons 1 and 3, while variant 2 consists of exons 2 and 3 (shown in orange). The lower panel shows the strand-specific RNA-sequencing data of prostate cancer cell lines for IRX4lncRNA expression. Expression of the middle exon is not seen in the RNA-sequencing data. Androgen responsive LNCaP cells had a higher expression of variant 1 and the normal prostate cell lines, HPR1 and RWPE1, had a moderate expression. The androgen independent prostate cancer cell line, PC3, did not express either variant of IRX4lncRNA.

Figure 4.9 - RT-PCR analysis of IRX4lncRNA variants in prostate cancer cells. 1% agarose gel showing the expression of IRX4lncRNA variants in a panel of prostate cancer cell lines. Higher expression of variant 1 was observed in normal prostate cell lines, BPH1 and RWPE1, and prostate cancer cell lines RWPE2, DuCaP, VCaP, LNCaP and C4-2B cells. Metastatic prostate cancer cell lines 22RV1, PC3 and DU145 expressed low levels of variant 1 (Upper panel, expected band size – 153bp). Lower panel shows the expression of variant 2 in RWPE1, RWPE2, DuCaP and VCaP (Expected band size – 173 bp) (NTC-Non-template control).

4.3.3. Optimisation of strand specific qRT-PCR for IRX4lncRNA

The antisense IRX4lncRNA is located in the intronic region of IRX4. Strand specific RNA-sequencing data of prostate cancer cell lines (including LNCaP, RWPE1, RWPE2 and BPH1) generated by our group showed no reads in the intronic region of IRX4. To further ensure specificity, strand specific qRT-PCR was carried out to avoid any amplification from the sense gene. For this purpose, reverse primer sequences for IRX4lncRNA and RPL32 were designed and attached to different universal adapter sequences (non-specific sequences which do not have any complementarity to the human genome) for use in cDNA synthesis. The reverse primer against the adapter sequence was then used for qRT-PCR, so that only the specific targets were amplified.
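The logic of the adapter-tagged primer design can be sketched with toy sequences. In the code below, the gene-specific part of the reverse transcription primer anneals only to the antisense transcript, so only cDNA made from that strand carries the adapter and can be amplified with an adapter-directed primer. All sequences, including the adapter, and all function names are hypothetical placeholders and are not the primers used in this study; RNA is written in the DNA alphabet for simplicity.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA-alphabet sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def first_strand_cdna(adapter: str, gene_specific: str, rna: str):
    """Return the adapter-tagged first-strand cDNA (5'->3'), or None if the
    gene-specific part of the primer has no complementary site on the RNA."""
    site = rna.find(reverse_complement(gene_specific))
    if site == -1:
        return None  # primer cannot anneal, so no tagged cDNA is produced
    # Extension copies the RNA upstream (5') of the primer binding site.
    return adapter + gene_specific + reverse_complement(rna[:site])


# Hypothetical toy sequences for illustration only.
SENSE_RNA = "ATGGCCTTAGCGTACCGGATTCAGT"
ANTISENSE_RNA = reverse_complement(SENSE_RNA)
ADAPTER = "GACTGGATCGTACGTCAAGC"  # placeholder universal adapter

# Gene-specific primer designed against the 3' end of the antisense transcript.
gene_specific = reverse_complement(ANTISENSE_RNA[-10:])

antisense_cdna = first_strand_cdna(ADAPTER, gene_specific, ANTISENSE_RNA)
sense_cdna = first_strand_cdna(ADAPTER, gene_specific, SENSE_RNA)

print("antisense cDNA carries adapter:", antisense_cdna is not None and antisense_cdna.startswith(ADAPTER))
print("sense transcript primed:", sense_cdna is not None)
```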

The CT values obtained for both IRX4lncRNA and RPL32 with the combined IRX4lncRNA+RPL32 specific cDNA, in which 0.2 μM of each primer was used, were comparable with the CT values obtained with oligo dT and random hexamer primers.

Thus, the strand specific cDNA synthesis for the subsequent experiments in cell lines was carried out according to the strand specific RT-qPCR protocol. In addition, a separate cDNA was prepared from each RNA sample with random hexamers to determine the expression levels of IRX4 and other genes of interest. However, cDNA from the patient RNA samples was synthesised only with random primers, due to limited RNA availability.

4.3.4. Relative expression of IRX4lncRNA in a panel of prostate cell lines by qPCR

IRX4lncRNA was expressed in the cell lines which also expressed IRX4; however, the two genes did not share the same trend of expression in all the cell lines. Similar to the expression of IRX4, C4-2B cells exhibited the highest expression of IRX4lncRNA, followed by LNCaP cells (Figure 4.10). In addition, IRX4lncRNA was expressed in the prostate cancer cell lines VCaP, DuCaP and RWPE2 and the normal prostate cell lines BPH1 and RWPE1, while minimal expression was detected in the remaining cell lines. Similarly, low levels of IRX4lncRNA variant 1 were observed in the RT-PCR analysis of 22RV1, PC3 and DU145 cells (Figure 4.9).

Figure 4.10 - IRX4lncRNA expression in a panel of prostate cell lines. IRX4lncRNA expression (from strand specific RT-qPCR) in a panel of cell lines representing prostate cancer (LNCaP, DuCaP, C4-2B, PC3, DU145, RWPE2, 22RV1), benign prostate (BPH1, RWPE1) as well as an immortalized prostate stromal cell line (WPMY1). RPL32 was used as the endogenous control and the relative expression was calculated using the ΔCT method. (n=3 biological replicates, Mean±SD).

4.3.5. IRX4lncRNA expression during Epithelial to Mesenchymal Transition (EMT) in prostate cancer cells

An EMT model of prostate cancer was established by Dr Nataly Stylianou (Dr Hollier’s group) using a doxycycline (dox) inducible SNAI1 (an important transcription factor driving EMT) system in LNCaP cells. IRX4lncRNA expression was slightly down-regulated during EMT and then increased when the cells underwent MET (Figure 4.11).

Figure 4.11 - Relative expression of IRX4lncRNA in cells undergoing EMT. RT-qPCR was carried out with the RNA for samples, including control (No dox), EMT7 (7 days after dox was added, Epithelial to mesenchymal transition induced), and MET (7 days after dox was removed from EMT7, Mesenchymal to Epithelial Transition restored) (n=3 biological replicates, Mean ±SEM, Kruskal-Wallis test with Dunn’s multiple comparisons test, *p<0.05).

4.3.6. Screening siRNAs to determine IRX4lncRNA knockdown efficiency

To determine the functional role of IRX4lncRNA in prostate cancer progression, transient knockdown models were used. To determine the efficiency of IRX4lncRNA knockdown, six siRNAs, three sequences targeting each exon, were screened in LNCaP cells using transient transfection. A 90-95% knockdown of IRX4lncRNA expression was achieved with all six siRNAs, and the knockdown was more effective in LNCaP cells than in DuCaP cells (Figure 4.12). Thus, a proliferation assay was performed with siRNA 2 and siRNA 5, targeting exons 1 and 3 respectively. siRNA 5 may also target the low levels of variant 2, while siRNA 2 specifically targets variant 1.
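A knockdown efficiency such as the 90-95% reported above is typically derived from the relative fold expression of the target in siRNA treated cells versus control cells. The sketch below shows that calculation using the ΔΔCT method; the function names and CT values are hypothetical and are not measurements from this study, and an amplification efficiency of approximately 100% is assumed.

```python
def fold_change_ddct(ct_target_treated: float, ct_ref_treated: float,
                     ct_target_control: float, ct_ref_control: float) -> float:
    """Relative fold expression (treated vs control) by the delta-delta-CT method."""
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    return 2.0 ** (-(delta_ct_treated - delta_ct_control))


def knockdown_percent(fold_change: float) -> float:
    """Percentage reduction of the target relative to the control."""
    return (1.0 - fold_change) * 100.0


# Hypothetical CT values: the target CT rises by 4 cycles after siRNA treatment,
# while the reference gene CT is unchanged.
fold = fold_change_ddct(ct_target_treated=29.0, ct_ref_treated=18.0,
                        ct_target_control=25.0, ct_ref_control=18.0)
print(f"fold expression = {fold}, knockdown = {knockdown_percent(fold)}%")
```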

Figure 4.12 - siRNA screening for IRX4lncRNA knockdown in LNCaP cells. The cells seeded in a 6-well plate were transfected with Lipofectamine® RNAiMAX Transfection Reagent and 10 nM of siRNA (targeting Exon1 – siRNA 1-3 and targeting Exon2 – siRNA 4-6) and incubated for 72 hours at 37 °C. qRT-PCR was performed with strand specific cDNA. The bars represent the relative fold expression of IRX4lncRNA. RPL32 was used as the endogenous control and the IRX4lncRNA expression of each group was normalised to the RNAiMAX control. (n=2, Mean±SEM).

4.3.7. Cell proliferation assay with transient knockdown of IRX4lncRNA

The effect of transient IRX4lncRNA knockdown on LNCaP cell proliferation was assessed using the IncuCyte Live Cell Imaging system. Proliferation of LNCaP cells was reduced in IRX4lncRNA knockdown cells (Figure 4.13), which suggests a functional role for this lncRNA in prostate cancer cells.

[Figure 4.13 plot: confluence (%) against time (0-120 hours) for RNAiMAX, siNT, siRNA2 and siRNA5.]

Figure 4.13 - The effect of transient knockdown of IRX4lncRNA on cell proliferation of LNCaP cells. LNCaP cells were seeded in a 96-well plate at a density of 4000 cells/well and transfected with 10 nM of siRNA2 or siRNA5 on the following day using Lipofectamine® RNAiMAX transfection reagent. Proliferation of LNCaP cells was reduced with IRX4lncRNA knockdown. The confluency of the cells was measured with the IncuCyte system for five days from the day of transfection (siNT – Non-targeting siRNA, n=3 biological replicates, Mean±SEM, Friedman test with Dunn’s multiple comparisons test, *P<0.05; ****P<0.0001).

4.3.8. Stable cell line establishment for overexpression and knockdown of IRX4lncRNA

The inducible knockdown models were established using the pLKO-Tet-ON plasmid with the siRNA2 and siRNA5 sequences to generate shRNAs 1 and 2 respectively. LNCaP cells were infected with the viral particles containing the destination vector, and selection was based on puromycin resistance. The LNCaP-shRNA lentiviral system was then validated for IRX4lncRNA knockdown with or without 250 ng/ml doxycycline (dox) treatment, and the relative IRX4lncRNA expression was quantified using qRT-PCR (Figure 4.14). shRNA2 had a higher knockdown efficiency than shRNA1, similar to its transient knockdown efficiency. However, shRNA1 had a lower knockdown efficiency compared to its corresponding transient transfection.

The overexpression models were established using the pINDUCER21 (ORF-EG) vector system and PC3 cells, which had minimal endogenous expression of IRX4lncRNA (Figure 4.15). The vector exhibited stable expression of GFP, which was used for selection by FACS. The infected cells were sorted into three populations based on the expression of GFP – low, medium and high. The overexpression of IRX4lncRNA was confirmed by qRT-PCR (Figure 4.15). The lncRNA expression correlated with the GFP expression. However, the system was found to have a leaky promoter, as the cells with a higher plasmid copy number had a relatively higher expression of IRX4lncRNA even in the absence of doxycycline. Therefore, we used the “low” group of cells for further functional assays.

Figure 4.14 - IRX4lncRNA knockdown efficiency in stable LNCaP models. Stably transduced LNCaP cells seeded in normal culture medium (RPMI1640 with 5% FBS) were treated with or without 250 ng/ml of doxycycline (Dox) for 72 hours. The knockdown efficiency was determined by RT-qPCR with strand specific cDNA. RPL32 was used as the housekeeping gene and the expression was normalised to the no dox treatment of each group using the ΔΔCT method (n=3, Mean±SEM, Mann-Whitney test, *p<0.05).

Figure 4.15 - Validation of the doxycycline inducible IRX4lncRNA overexpression PC3 model. Stably transduced PC3 cells seeded in normal culture medium (RPMI1640 with 5% FBS) were treated with or without 250 ng/ml of doxycycline (Dox) for 72 hours. The overexpression was determined by RT-qPCR with strand specific cDNA. RPL32 was used as the housekeeping gene and the expression was normalised to the no dox treatment of the low GFP group using the ΔΔCT method. (n=3, Mean±SEM, Mann-Whitney test, *p<0.05).

4.3.9. The role of IRX4lncRNA in prostate cancer cell proliferation using stable models

Similar to the observation with siRNA mediated knockdown, the inducible knockdown of IRX4lncRNA in stably transduced LNCaP cells resulted in lower cell proliferation compared to the control cells (Figure 4.16). However, only a minimal effect on cell proliferation was observed with shRNA2 knockdown compared to shRNA1 knockdown.

[Figure 4.16 plot: cell confluency (%) against time (0-96 hours) for shRNA1 no dox, shRNA1 dox, shRNA2 no dox and shRNA2 dox.]

Figure 4.16 - The effect of doxycycline inducible IRX4lncRNA knockdown on LNCaP cell proliferation. Inducible shRNA LNCaP knockdown cells seeded in a 96-well plate at 5000 cells/well density were treated with or without doxycycline (Dox 250ng/ml). The knockdown induced LNCaP cells had reduced proliferation compared to the untreated cells. Two images per well were taken every 2 hours and the confluency was measured by the IncuCyte Live Cell Imaging system (n=3, Mean±SEM, Friedman test with Dunn’s multiple comparisons test *** P<0.001; ****P <0.0001).

Furthermore, the proliferation of IRX4lncRNA overexpressing cells was assessed using stable PC3 cells. The cells treated with doxycycline to induce the overexpression of IRX4lncRNA had a higher proliferation than the untreated cells, confirming the role of IRX4lncRNA in prostate cancer cell proliferation (Figure 4.17).

[Figure 4.17 plot: cell confluency (%) against time (0-84 hours) for No dox and Dox.]

Figure 4.17 – The effect of IRX4lncRNA overexpression on PC3 cell proliferation. Inducible lncRNA overexpressing PC3 cells seeded in 96-well plate at a density of 2500 cells/well were treated with or without doxycycline (Dox 250ng/ml). The overexpression induced PC3 cells had increased proliferation compared to the untreated cells. Two images per well were taken every 2 hours and the confluency was measured by the IncuCyte Live Cell Imaging system (n=2 biological replicates, Mean±SEM, Mann-Whitney test, P <0.0001).

4.3.10. The role of IRX4lncRNA in LNCaP cell migration

The effect of IRX4lncRNA knockdown on LNCaP cell migration was then assessed using the established inducible knockdown models. The lncRNA knockdown cells (doxycycline treated) had a lower migration rate compared to the untreated cells (Figure 4.18).

[Figure 4.18 plot: relative wound density (%) against time (0-72 hours) for shRNA1 no dox, shRNA1 dox, shRNA2 no dox and shRNA2 dox.]

Figure 4.18 - The effect of IRX4lncRNA knockdown on LNCaP cell migration. Inducible lncRNA knockdown cells seeded in 96-well plate to form a monolayer were treated with or without doxycycline (Dox 250ng/ml). The reproducible scratches were made using WoundMaker. The migration of the cells towards the wound was measured using the IncuCyte Live Cell Imaging system. The knockdown induced LNCaP cells migrated more slowly than the untreated cells (n=3 biological replicates, Mean±SEM, Friedman test with Dunn’s multiple comparisons test * P<0.05; **p <0.01).

4.3.11. Regulation of IRX4 expression by IRX4lncRNA

Antisense lncRNAs are reported to modulate gene expression, often that of their overlapping sense gene (Goyal et al., 2017; Villegas et al., 2015). However, no effect on IRX4 expression was observed in LNCaP cells with IRX4lncRNA knockdown (Figure 4.19). This suggests that IRX4lncRNA may exert its function in trans. Nevertheless, the regulation of IRX4 by its antisense RNA should also be examined at the level of IRX4 mRNA splicing and at the protein level.

Moreover, the effect of IRX4 knockdown on the expression of IRX4lncRNA was examined by RT-qPCR. No or minimal differences were observed in IRX4lncRNA expression in the IRX4 knockdown cells (Figure 4.20).

Figure 4.19 – IRX4 expression in IRX4lncRNA knockdown LNCaP cells. The cells seeded in a 6-well plate were transfected with 25 pmoles of siRNAs using Lipofectamine® RNAiMAX transfection reagent. After 72 hours of incubation at 37 °C, RNA was extracted and the relative fold expression of IRX4 was determined by qRT-PCR. RPL32 expression was used as a housekeeping control and expression was normalised to the RNAiMAX control group from each cell line. (siNT – Non-targeting siRNA, n=2, Mean±SEM).

Figure 4.20 – IRX4lncRNA expression in IRX4 knockdown cells. The cells seeded in a 6-well plate were transfected with 25 pmoles of siRNA using Lipofectamine® RNAiMAX transfection reagent. After 72 hours of incubation at 37 °C, RNA was extracted and the relative fold expression of IRX4lncRNA was determined by qRT-PCR. RPL32 expression was used as a housekeeping control and expression was normalised to the RNAiMAX control group from each cell line. (siNT – Non-targeting siRNA, n=3, Mean±SEM).
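The relative fold expression values in Figures 4.19 and 4.20 are normalised first to the housekeeping gene RPL32 and then to the RNAiMAX control group. One common way to compute such values is the 2^-ΔΔCt method, sketched below. The text does not state which formula was used, so the method, the function name and the Ct values are illustrative assumptions only.

```python
def relative_fold_expression(ct_target, ct_reference,
                             ct_target_ctrl, ct_reference_ctrl):
    """Fold change by the 2^-ddCt method (an assumed, commonly used approach).

    ct_target / ct_reference: Ct of the gene of interest and of the
    housekeeping gene (e.g. RPL32) in the treated sample.
    ct_target_ctrl / ct_reference_ctrl: the same two Ct values in the
    control group (e.g. RNAiMAX only).
    """
    d_ct_sample = ct_target - ct_reference             # normalise to housekeeping gene
    d_ct_control = ct_target_ctrl - ct_reference_ctrl  # same for the control group
    dd_ct = d_ct_sample - d_ct_control                 # normalise to control group
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: the target amplifies one cycle later in the
# knockdown sample than in the control, with an unchanged reference gene.
fold = relative_fold_expression(27.0, 18.0, 26.0, 18.0)
print(fold)  # 0.5, i.e. half the control expression
```

A one-cycle delay corresponds to a two-fold reduction only if amplification efficiency is close to 100%, which is a further assumption of this sketch.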

4.3.12. In-silico prediction of IRX4lncRNA function

There are online tools available to predict the function of lncRNAs based on their expression data and network in different tissues. FuncPred is an in-silico prediction tool that predicts the function of an lncRNA based on tissue specificity and evolutionarily conserved expression data gathered from the GTEx portal; the pathways involved are predicted using the Gene Ontology (GO), Gene Set Enrichment Analysis (GSEA), Human Phenotypes (HP) and Disease Ontology (DO) databases (Perron, et al., 2017). According to this prediction, approximately 200 pathways were predicted to be associated with IRX4lncRNA (p<0.05) (Appendix P). The predicted pathways depend on the expression data for this lncRNA available through the GTEx portal; therefore, the top pathways included functions in muscle tissue or heart, which had higher expression of this lncRNA (Figure 4.1, Table 4-2).

As these predictions depend on the expression data used for the analysis, they might not provide all the information. Performing RNA pull-down assays to identify the protein, RNA and chromatin interactions will clarify the exact functional role of this lncRNA in prostate cancer cells.

Table 4-2 - Pathways predicted to be associated with IRX4lncRNA expression

Keyword             Category  Description                               FDR p-value
Hallmarkmyogenesis  GSEA_H    Hallmark myogenesis                       4.00E-07
Haloxipho           GSEA_H    Hallmark oxidative phosphorylation        4.00E-07
GO:0031966          GO_CC     mitochondrial membrane                    1.40E-06
GO:0005865          GO_CC     striated muscle thin filament             1.40E-06
GO:0045259          GO_CC     proton-transporting ATP synthase complex  1.40E-06
GO:0019866          GO_CC     organelle inner membrane                  1.40E-06
GO:0030016          GO_CC     myofibril                                 1.40E-06
GO:0005859          GO_CC     muscle myosin complex                     1.40E-06
GO:0070469          GO_CC     respiratory chain                         1.40E-06
DOID:12930          DO        Dilated cardiomyopathy                    1.30E-05

GSEA – Gene set enrichment analysis, GO – Gene ontology, DO – Disease ontology databases.

4.4. Discussion

In this chapter, we aimed to elucidate whether IRX4lncRNA plays a functional role in prostate cancer progression. IRX4lncRNA is reported as one of the top 20 candidate lncRNAs associated with the risk of prostate cancer (Guo, et al., 2016). This lncRNA is located on the antisense strand in the intronic region of the IRX4 gene, as previously shown (Figure 1.3). Until recently, antisense transcripts were considered to be transcriptional noise because of their low expression levels and low evolutionary conservation (Villegas, et al., 2015). This subclass of lncRNAs has since been shown to have a role in gene regulation at the transcriptional, post-transcriptional and translational levels, by interacting with RNA, DNA or proteins (Villegas, et al., 2015).

IRX4lncRNA expression is predicted only in some specific tissues, including prostate. The higher expression of IRX4lncRNA in heart correlates with the expression of IRX4 in this tissue (The Human Protein Atlas). Overexpression of IRX4lncRNA in the TCGA prostate cancer dataset was reported in the literature (Guo, et al., 2016) during the time course of this study. We confirmed the overexpression of IRX4lncRNA in tumour tissues compared to the adjacent non-malignant tissues. Even though IRX4lncRNA exhibited a low expression level compared to PCA3 in the TCGA dataset from prostate tumour and normal tissues, its expression increased with increasing Gleason score. In addition, higher expression of IRX4lncRNA was observed in Gleason score 4+3 compared to Gleason score 3+4, suggesting an association of IRX4lncRNA with more aggressive disease. Furthermore, patients who had cancer recurrence had higher expression of IRX4lncRNA than disease-free patients.

PCA3 (also known as DD3) is the only FDA approved lncRNA used for prostate cancer diagnosis (Sartori et al., 2014), and is highly overexpressed in prostate tumours compared to non-malignant tissues (Bussemakers et al., 1999). The studies by Hessels et al. and van Gils et al. found no correlation between PCA3 expression and prognostic parameters such as Gleason score (Hessels et al., 2010; van Gils et al., 2008). Moreover, we observed a reduction in PCA3 expression in prostate tumour samples with increasing Gleason score and cancer recurrence in the TCGA prostate cancer dataset. However, a few other studies have reported that high PCA3 expression correlates with high Gleason score and also described the relationship between the PCA3 score and tumour volume (Hessels, et al., 2010; Marks et al., 2007). Therefore, the ability of the PCA3 score to detect prostate cancer aggressiveness or clinical outcome remains questionable, as these studies have reported conflicting results (Salagierski, et al., 2010). Although the data reported herein look promising, further exploration of IRX4lncRNA expression in plasma or urinary samples will determine its potential as a prognostic biomarker for prostate cancer. In addition, the prognostic potential of this lncRNA should also be compared with the widely used prostate cancer biomarker PSA.

Although there are two transcript variants predicted for IRX4lncRNA, only variant 1 exhibited higher expression levels in prostate cancer cells and was therefore the focus of the functional assays. Interestingly, the expression of IRX4lncRNA was upregulated in those cells undergoing MET, confirming its association with aggressive disease. We generated knockdown and overexpression models of IRX4lncRNA and performed cell-based assays to determine the functional role of IRX4lncRNA in prostate cancer aetiology. Knockdown of IRX4lncRNA was achieved by both siRNA and inducible shRNA mediated transfection. Although all the siRNAs tested had a high knockdown efficiency on IRX4lncRNA expression, shRNA1 had a lower knockdown efficiency than its corresponding siRNA. However, its effect on LNCaP cell proliferation was similar to that of the siRNA. IRX4lncRNA knockdown in LNCaP cells reduced cell proliferation and migration, while IRX4lncRNA overexpressing PC3 cells had increased proliferation. These data suggest that this lncRNA might play a functional role in prostate cancer pathogenesis. However, the observed effects were relatively moderate, and performing in vivo studies with these overexpression and knockdown models will further clarify the role of this lncRNA in prostate cancer pathogenesis.

Antisense transcripts often act in cis, where they interact with the protein-coding gene transcribed from the opposite strand, but can also interact with a distant locus or even with genes on different chromosomes (Brantl, 2007; Pelechano et al., 2013). In order to determine whether IRX4lncRNA plays a role in regulating IRX4 gene expression, we assessed IRX4 expression after lncRNA knockdown. However, determining the effect of IRX4lncRNA knockdown and overexpression on the expression of IRX4 transcript variants would identify whether this lncRNA plays a role in IRX4 mRNA splicing. As already discussed in Chapter 3, the opposite trends in prostate cancer cell proliferation with IRX4 knockdown between our studies and the published literature may be due to different splice variants of IRX4. Thus, it is important to determine whether IRX4lncRNA has any effect on regulating IRX4 isoform expression. In addition, no changes were observed in IRX4lncRNA expression with IRX4 knockdown.

In silico analysis using the online prediction tool FuncPred identified potential pathways associated with IRX4lncRNA based on its expression data available through the GTEx portal. The top pathways predicted to be associated with this lncRNA, such as myogenesis and dilated cardiomyopathy, correlate with the higher expression levels of IRX4lncRNA in muscle and heart tissues in the GTEx data. To further clarify the trans action of this lncRNA in prostate cancer cells, immunoprecipitation assays should be performed to identify its interacting partners. Even though studies on lncRNAs in prostate cancer have emerged rapidly over the past decade, only a few lncRNAs have been explored for their potential mechanistic role. Therefore, identifying the mechanism of action of IRX4lncRNA in prostate cancer progression will provide more insight into the role of lncRNAs in cancer pathogenesis.

Chapter 5: Genotype specific androgen mediated regulation of IRX4 and IRX4lncRNA

5.1. Introduction

The androgen receptor (AR), a member of the nuclear hormone receptor family of transcription factors, plays a vital role in the development and progression of prostate cancer. The binding of the steroidal androgens testosterone and 5α-dihydrotestosterone (DHT) to AR causes AR dimerization and translocation to the nucleus, which enables binding to androgen response elements on DNA and the recruitment of a series of cofactors to promote expression of AR target genes (Daniels, et al., 2014). This results in activation of pro-proliferation pathways, which in turn leads to prostate cancer cell growth (Schiewer et al., 2012). Clinically, AR activity is monitored through serum levels of PSA, the product of a well-characterised AR target gene (Beekman et al., 2008). Blocking AR activity by androgen deprivation treatment (ADT) remains the most effective treatment for recurrent prostate cancer (Beekman, et al., 2008). ADT is commonly achieved with drugs that disrupt signalling between the pituitary gland and the testes (Crawford et al., 2015). Despite the initial efficacy of this treatment, most patients with advanced prostate cancer eventually develop resistance to this therapy and other androgen targeted therapies and progress to castrate-resistant prostate cancer (CRPC) (Knudsen et al., 2011). Therefore, more targeted therapeutics are essential for the treatment of prostate cancer.

GWAS have identified the 5p15 locus as being associated with prostate cancer risk in multi-ethnic populations (Batra, et al., 2011; Lindstrom, et al., 2012; Takata, et al., 2010). A recent study of GWAS SNPs and prostate cancer risk in ERG-fusion positive and negative tumours reported opposite directions of risk association in these sub-types (Penney, et al., 2016). ERG is reported to be overexpressed through fusion with the androgen-responsive promoter of TMPRSS2; ERG then exerts negative feedback on androgen signalling by interacting with the AR protein, by binding to the AR promoter and by binding to androgen responsive elements on chromatin (Yu, et al., 2010).

We have already shown in Chapter 3 and Chapter 4 that two genes at this prostate cancer risk locus, IRX4 and IRX4lncRNA, may promote prostate cancer progression. According to RNA-sequencing data of LNCaP cells (Figure 1.3, unpublished data), the expression of IRX4 and IRX4lncRNA was observed to be regulated by the androgen dihydrotestosterone (DHT). In addition, expression of these genes was found to be upregulated after castration in LNCaP xenograft models (data not shown). Therefore, the androgen-mediated regulation of IRX4 and IRX4lncRNA expression was further characterised in this chapter.

5.2. Methods

5.2.1. Androgen deprivation assay

LNCaP, VCaP and DuCaP cells were seeded in RPMI1640 media (Life Technologies, Catalog number - 11835-030) supplemented with 5% fetal calf serum (FCS) and incubated at 37 °C for 3 days. The medium was then replaced with androgen-depleted culture medium (RPMI1640) containing 5% charcoal-stripped serum (CSS). After 48 hours, the cells in CSS were supplemented with 10 nM DHT, ethanol (EtOH, vehicle control) or 10 μM of the anti-androgens bicalutamide or enzalutamide and incubated at 37 °C for 48 hours.

5.2.2. Androgen deprivation assay with transient siRNA knockdown

The medium of cells grown in RPMI1640 with 5% FCS was replaced with androgen-depleted culture medium containing charcoal-stripped serum (CSS). After 24 hours, cells were transiently transfected with 25 pmol siERG or siAR using Lipofectamine® RNAiMAX Transfection Reagent (Invitrogen, Catalog number - 13778150). The cells in CSS were supplemented with 10 nM DHT 24 hours after transfection and incubated at 37 °C for 48 hours. Controls included RNAiMAX and non-targeting siRNA (Silencer select no 1 siRNA, catalog no – 4390843, Ambion).

5.2.3. Establishment of ERG overexpressing LNCaP cells

ERG overexpressing cells were established with assistance from Associate Supervisor, Dr Gregor Tevz, who also was responsible for the GMO handling training for the PhD candidate. pINDUCER21-ERG was a gift from George Daley (Addgene plasmid # 51301; http://n2t.net/addgene:51301; RRID:Addgene_51301) (Doulatov et al., 2013). The stable cells were established according to the method described in section 4.2.3.

5.2.4. Genotyping of cell lines and patients

PCR was carried out to amplify the AR/ERG binding region with 10 ng genomic DNA of cell lines or blood DNA of prostate cancer cases and controls using Platinum® Taq DNA Polymerase (Invitrogen, Catalog number - 11304011). The cycling parameters used were 95 ˚C for 5 min, 35 cycles of 95 ˚C for 15 sec, 60 ˚C for 30 sec and 72 ˚C for 1 min, and a final extension at 72 ˚C for 10 min. The products were genotyped for genetic variation by running them on a 2% agarose gel as described in section 2.6 with the assistance of Ms Elizabeth Cheesman. The sequencing of the PCR products was performed at the Australian Genome Research Facility (AGRF).

5.2.5. Reporter Gene assay

The reporter vector constructs were established using a pGL3 promoter (pGL3-p) vector (Promega, Catalog number - E1761) with assistance from Dr Carina Walpole. Briefly, the primers were designed with overhangs containing the restriction sites for SacI and XhoI (restriction sites to be used in the pGL3 promoter vector) to amplify the MNLP. Genomic DNA of DuCaP cells was used as a template for the 47bp sequence and LNCaP DNA was used as a template for the 21bp sequence. Both the pGL3-p vector and the 21/47bp insert fragments were digested with SacI and XhoI, gel purified using the Wizard® Gel and PCR Clean-Up System (Promega) and ligated together using T4 DNA Ligase as per the manufacturer’s instructions. The presence of the inserts in the vector was then confirmed by restriction digest and sequencing prior to transfection into target cell lines. The LNCaP cells, plated in six-well plates for 24 hours in CSS containing media, were transfected with the luciferase reporter plasmid and pRL-TK (Renilla luciferase) using the Fugene 6 reagent (Roche) according to the manufacturer’s instructions. After 24 hours, cells were treated with either 10 nM DHT or EtOH (vehicle control). After a further 24 hours, the cells were solubilized with lysis buffer and treated with the luciferase assay reagent (LAR II) to measure the firefly luciferase activity, followed by addition of Stop & Glo® reagent to measure the Renilla luciferase activity in a luminometer according to the manufacturer’s instructions (Dual-Luciferase® Reporter Assay and Dual-Luciferase® Reporter 1000 Assay Systems, Promega). Renilla luciferase activity was used to normalise the recorded firefly luciferase activity of the transfected cells.
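The normalisation described for the reporter assay (firefly activity divided by Renilla activity, then compared between DHT and vehicle treatment) can be sketched as follows. This is a minimal illustration and not the analysis script used in the thesis; the function names and luminometer readings are hypothetical.

```python
def normalised_luciferase(firefly, renilla):
    """Firefly signal corrected for transfection efficiency via Renilla."""
    if renilla <= 0:
        raise ValueError("Renilla reading must be positive")
    return firefly / renilla


def fold_induction(treated, vehicle):
    """Ratio of normalised activity in treated versus vehicle wells.

    treated, vehicle: (firefly, renilla) tuples of raw luminometer readings.
    """
    return normalised_luciferase(*treated) / normalised_luciferase(*vehicle)


# Hypothetical readings for one construct (e.g. the 47bp insert in pGL3-p)
dht_well = (12000.0, 400.0)   # 10 nM DHT
etoh_well = (6000.0, 400.0)   # EtOH vehicle control
print(fold_induction(dht_well, etoh_well))  # 2.0
```

In practice the same calculation would be repeated for each construct (empty pGL3-p, 21bp insert, 47bp insert) and averaged over replicates before comparison.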

5.3. Results

5.3.1. Expression of IRX4 and IRX4lncRNA is regulated by androgens

Initially, the expression of IRX4 and IRX4lncRNA was determined in two androgen-responsive cell lines, LNCaP and DuCaP, after treatment with DHT and the anti-androgens bicalutamide and enzalutamide (Figure 5.1). The relative expression of KLK3 (PSA), a key androgen-regulated gene, was determined to confirm the effectiveness of the androgen treatment; KLK3 expression was induced by DHT treatment in both cell lines. IRX4 and IRX4lncRNA expression was differentially regulated in these cell lines. DHT treatment up-regulated the expression of both IRX4 and IRX4lncRNA in the DuCaP cell line. In LNCaP cells, on the other hand, their expression was down-regulated with DHT treatment (Figure 5.1). The anti-androgens bicalutamide and enzalutamide inhibited androgen action in both LNCaP and DuCaP cells.

Figure 5.1 - Regulation of IRX4 and IRX4lncRNA expression by androgens (DHT) and anti-androgens. Cells in androgen-depleted culture medium containing 5% charcoal-stripped serum (CSS) were supplemented after 48 hours with 10 nM DHT, ethanol (EtOH, vehicle control) or the anti-androgens bicalutamide (10 μM) and enzalutamide (10 μM), and incubated at 37 °C for 48 hours. IRX4 and IRX4lncRNA were upregulated with 10 nM DHT in the DuCaP cell line, but down regulated in LNCaP cells. KLK3 expression was used as a positive control to validate the androgen treatment. KLK3 was overexpressed with DHT treatment in both the cell lines. (Mean±SEM, n=3 biological replicates, Mann-Whitney test * P<0.05).

To further determine the androgen mediated regulation of IRX4 and IRX4lncRNA, we treated the androgen-responsive cell line VCaP with DHT. Similar to DuCaP, DHT treatment upregulated the expression of both IRX4 and IRX4lncRNA in VCaP cells (Figure 5.2).

Figure 5.2 - Regulation of IRX4 and IRX4lncRNA expression by androgens (DHT) in VCaP cells. Cells in androgen-depleted culture medium containing 5% charcoal-stripped serum (CSS) were supplemented with 10 nM DHT or ethanol (EtOH, vehicle control) after 48 hours and incubated at 37 °C for 48 hours. IRX4 and IRX4lncRNA were upregulated with 10 nM DHT in the VCaP cell line. KLK3 expression was used as a positive control to validate the androgen treatment. (Mean±SEM, n=3 biological replicates, Mann-Whitney test * P<0.05).

To further validate the androgen mediated upregulation of IRX4 and IRX4lncRNA in VCaP cells, we checked the expression of these genes in transient AR knockdown cells (Figure 5.3). KLK3 expression was analysed in order to validate the treatment. Similar to previous results, IRX4 and IRX4lncRNA were upregulated with DHT treatment in the transfection reagent and non-targeting siRNA controls, while this regulation was suppressed in AR knockdown cells, confirming the androgen mediated upregulation of IRX4 and IRX4lncRNA in VCaP cells.

Figure 5.3 - Transient AR knockdown in VCaP cells. Cells in androgen-depleted culture medium containing 5% charcoal-stripped serum (CSS) were transfected with either siAR or non-targeting siRNA (siNT) using Lipofectamine® RNAiMAX transfection reagent. The cells were then supplemented with 10 nM DHT or ethanol (EtOH, vehicle control) after 24 hours and incubated at 37 °C for 48 hours. IRX4 and IRX4lncRNA were upregulated with 10 nM DHT in the RNAiMAX and non-targeting siRNA controls, but not in AR knockdown cells. KLK3 expression was used as a positive control to validate the androgen treatment. KLK3 was overexpressed with DHT treatment in VCaP cells, while AR knockdown suppressed this effect. (n=3, Mean±SEM, Mann-Whitney test, *p<0.05).

5.3.2. In-silico analysis of ChIP-Sequence data at the 5p15 locus

An opposite effect was observed in androgen regulation of IRX4 and IRX4lncRNA in LNCaP and VCaP cells, which differ in ERG fusion status. VCaP and DuCaP cells express ERG, while LNCaP cells are ERG-negative. There is evidence that ERG, which is overexpressed in more than 50% of all prostate cancer cases, can interfere with the expression of androgen regulated genes. Previous studies provide evidence for induction of gene expression by ERG via androgen-mediated regulation. In addition, ERG is found to disrupt AR signalling by binding to the promoter region of the AR gene as well as to AR binding regions in chromatin, and therefore acts as a negative regulator of AR regulated genes (Figure 1.2).

We hypothesised that the opposite effects are due to differential binding of transcription factors at the 5p15 locus, including ERG, and thus we mined ChIP-sequencing data (Cistrome Finder System, http://cistrome.org/finder/) (Mei et al., 2017; Zheng et al., 2019) in order to investigate the binding of ERG and AR in the genomic locus at IRX4 in androgen responsive LNCaP, VCaP and DuCaP cells (Asangani et al., 2014; Bu et al., 2016; Sahu et al., 2013; Yu, et al., 2010).

Interestingly, AR binding peaks were observed upstream of IRX4 (downstream of IRX4lncRNA) in VCaP and DuCaP cell lines, while no AR binding was observed at this locus in LNCaP AR ChIP-Seq data (Figure 5.4). In addition, an ERG binding peak was also observed in VCaP ERG ChIP-Seq data (Figure 5.4). No ERG ChIP-seq data were available for DuCaP cells. This suggests that AR and ERG binding at this locus may be important for up-regulation of the expression of IRX4 and IRX4lncRNA by androgens in DuCaP and VCaP cells.

Figure 5.4 - Binding of AR and ERG at the prostate cancer risk associated 5p15 locus. The upper panel shows the genes at this locus, which encompass the GWAS SNP rs12653946 associated with prostate cancer risk. AR binding peaks were observed in VCaP and DuCaP cells (~2kb upstream of IRX4). AR binding at this locus was not observed in the LNCaP cell line. In addition, an ERG binding peak was observed at the same locus in VCaP cells (Cistrome Finder, http://cistrome.org/finder/; figure derived from UCSC Genome Browser).

5.3.3. Expression of IRX4 correlates with ERG expression

ERG overexpression has been reported to result from fusion between the promoter of the androgen responsive gene TMPRSS2 and the open reading frame of ERG. This fusion renders ERG responsive to androgens. Interestingly, we noted a correlation between IRX4 and IRX4lncRNA expression and ERG fusion in the RNA-sequencing data provided by Associate Professor Elizabeth Williams, from a cohort of seven androgen-responsive patient-derived xenografts (Table 5-1).

Table 5-1 - The correlation of TMPRSS2/ERG fusion with IRX4/IRX4lncRNA expression in patient derived xenograft models.

PDX       TMPRSS2-ERG fusion  IRX4 expression  IRX4lncRNA expression
LuCaP105  No                  No               No
LuCaP70   No                  No               No
LuCaP23   Yes                 Yes              Yes
BM18      Yes                 Yes              Yes
LuCaP141  No                  Yes              Yes
LuCaP96   No                  No               No
LuCaP35   Yes                 Yes              Yes
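As a small worked check of Table 5-1, the sketch below counts how many xenografts show agreement between TMPRSS2-ERG fusion status and IRX4 or IRX4lncRNA expression. The Yes/No values are transcribed from the table; the variable and function names are illustrative, and no statistical test is implied for a cohort of this size.

```python
# (fusion, IRX4 expressed, IRX4lncRNA expressed), transcribed from Table 5-1
PDX_TABLE = {
    "LuCaP105": (False, False, False),
    "LuCaP70":  (False, False, False),
    "LuCaP23":  (True,  True,  True),
    "BM18":     (True,  True,  True),
    "LuCaP141": (False, True,  True),
    "LuCaP96":  (False, False, False),
    "LuCaP35":  (True,  True,  True),
}


def concordant_models(table, column):
    """Names of models where fusion status equals the chosen expression column.

    column: 1 for IRX4 expression, 2 for IRX4lncRNA expression.
    """
    return [name for name, row in table.items() if row[0] == row[column]]


irx4_match = concordant_models(PDX_TABLE, 1)
lnc_match = concordant_models(PDX_TABLE, 2)
print(len(irx4_match), "of", len(PDX_TABLE))  # 6 of 7
print(len(lnc_match), "of", len(PDX_TABLE))   # 6 of 7
```

The single discordant model in both cases is LuCaP141, which expresses IRX4 and IRX4lncRNA without the fusion.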

Thus, the correlation of ERG with IRX4 expression was explored in the global transcriptional profile database Oncomine, and the expression of IRX4 was found to be higher in ERG fusion positive prostate cancer than in ERG fusion negative samples (Grasso prostate cancer data set, Figure 5.5 (a)). Similarly, in the TCGA dataset high IRX4 expression correlates with high ERG expression (Figure 5.5 (b)). In addition, the contribution of ERG to the up-regulation of IRX4 expression is evident in the literature, where IRX4 is overexpressed in primary and metastatic samples with the ERG fusion (Cai, et al., 2013). However, we did not observe any correlation between IRX4lncRNA expression and ERG-fusion status (Figure 5.6), suggesting IRX4 and IRX4lncRNA expression are not always regulated by the same machinery.

Figure 5.5 - IRX4 expression correlates with ERG-fusion status. (a) Higher expression of IRX4 is detected in prostate tumours with ERG rearrangement compared to no fusion. (Source – Oncomine, Grasso prostate cancer dataset, No ERG rearrangement – 51, ERG rearrangement – 43, Mean±SEM, Mann-Whitney test, * p < 0.05). (b) High IRX4 expression correlates with high ERG expression (Source – TCGA, Low ERG expression = 83 (less than Q1) and High ERG expression = 83 (higher than Q3), Mean±SEM, Mann-Whitney test, *** p<0.001).

Figure 5.6 - IRX4lncRNA expression does not correlate with ERG fusion status. No relationship was observed between IRX4lncRNA expression and ERG fusion status. (Source –TCGA, ERG rearrangement = 140 and No ERG rearrangement = 152, Mean±SEM, Mann-Whitney test, ns – non-significant).

We further confirmed the correlation between IRX4 and ERG expression by qRT-PCR analysis in 50 FFPE prostate tumour samples (APCB). ERG was overexpressed in prostate tumour samples compared to adjacent non-malignant tissues (Figure 5.7 (a)). Similar to our observations in public datasets, high IRX4 expression correlated with high ERG expression in prostate tumour samples (Figure 5.7 (b)). Although a similar trend was observed in adjacent non-malignant tissues, the results were not statistically significant. No statistically significant differences in IRX4lncRNA expression were observed between ERG expression levels in either tumour or non-malignant tissues (Figure 5.8).

Figure 5.7 - High IRX4 expression correlates with high ERG expression. (a) ERG is overexpressed in prostate tumour samples compared to adjacent non-malignant tissues (n=50, APCB, Mean±SEM, Wilcoxon matched-pairs signed rank test, **p < 0.01). (b) High IRX4 expression correlates with high ERG expression in prostate tumour samples. (APCB, T – Tumour, N – Adjacent non-malignant, Low = 10 (less than Q1) and High = 12 (higher than Q3), Mean±SEM, Mann-Whitney test, ** p<0.01, ns – non-significant).

Figure 5.8 – IRX4lncRNA expression is not correlated with ERG expression. No difference in IRX4lncRNA expression was observed with low and high ERG expression levels (APCB, T – Tumour, N – Adjacent non-malignant, Low = 11 (less than Q1) and High = 12 (higher than Q3), Mean±SEM, Mann-Whitney test, ns – non-significant).
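The low/high ERG grouping used in Figures 5.5 (b), 5.7 and 5.8 (samples below Q1 versus samples above Q3) can be sketched as below, together with the Mann-Whitney U statistic for the two resulting groups. This is an illustration under stated assumptions: the expression values are invented, the quartile convention is that of Python's `statistics.quantiles` (other software may place Q1 and Q3 slightly differently), and only the U statistic is computed, since a p-value would normally come from a statistics package.

```python
import statistics


def split_by_erg_quartiles(erg, target):
    """Return target-gene values for samples with ERG below Q1 and above Q3."""
    q1, _median, q3 = statistics.quantiles(erg, n=4)
    low = [t for e, t in zip(erg, target) if e < q1]
    high = [t for e, t in zip(erg, target) if e > q3]
    return low, high


def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: pairs where a > b count 1, ties count 0.5."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical relative expression values for eight tumour samples
erg_expr = [1, 2, 3, 4, 5, 6, 7, 8]
irx4_expr = [10, 11, 12, 13, 14, 15, 16, 17]

low_irx4, high_irx4 = split_by_erg_quartiles(erg_expr, irx4_expr)
print(low_irx4, high_irx4)                  # [10, 11] [16, 17]
print(mann_whitney_u(high_irx4, low_irx4))  # 4.0
```

Samples between Q1 and Q3 are deliberately excluded, which matches the group sizes quoted in the captions being smaller than the full cohorts.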

Thus, we hypothesised that the difference in androgen responsiveness of IRX4 expression observed in LNCaP cells in comparison to VCaP and DuCaP cells may be the result of low expression of ERG in the TMPRSS2-ERG fusion-negative LNCaP cell line. Therefore, we aimed to produce LNCaP cell lines with doxycycline inducible overexpression of ERG and to determine the androgen responsiveness of IRX4 and IRX4lncRNA in ERG overexpressing LNCaP cells under androgen deprivation.

Initially, we established doxycycline (dox) inducible LNCaP ERG overexpressing cells with the pINDUCER21-ERG plasmid using the lentiviral system. The established overexpressing model was validated for ERG overexpression with or without 250 ng/ml doxycycline treatment; the relative ERG expression was determined by RT-qPCR and also at the protein level by Western blot (Figure 5.9). ERG was overexpressed 150-fold at the mRNA level in doxycycline induced stable cells compared to the control group (No dox).

Figure 5.9 - Validation of ERG expression in LNCaP-pIND21-ERG. Cells seeded in normal cell culture medium (RPMI1640 with 5% FCS) were supplemented with 250 ng/ml of doxycycline (dox) for 48 hours. The overexpression of ERG was confirmed (a) at the mRNA level by RT-qPCR and (b) at the protein level by Western blot with an Anti-ERG antibody [EPR3864] (ab92513, Abcam, Rabbit monoclonal).

KLK3 was down-regulated with ERG overexpression, which may be due to the negative regulation by ERG of AR transcript levels and AR chromatin binding, as already reported in the literature (Yu, et al., 2010). Interestingly, the expression of IRX4 was down-regulated with ERG overexpression, independent of androgen treatment (Figure 5.10). No changes in IRX4lncRNA expression were observed with ERG overexpression (Figure 5.10), further confirming that this lncRNA is not regulated by ERG.

Figure 5.10 - Differential gene expression in LNCaP cells overexpressing ERG (LNCaP-pIND21-ERG). LNCaP-pIND21-ERG and LNCaP-iGFP control cells in androgen-depleted culture medium containing CSS, with or without 250 ng/ml of doxycycline (Dox) for 48 hours, were supplemented with either 10 nM DHT or 250 ng/ml Dox or both. The DHT treatment was validated by checking KLK3 expression and the ERG overexpression was validated by measuring the relative ERG mRNA expression levels with ERG primers (Mean±SEM, n=3 biological replicates, Mann-Whitney test, * p<0.05).

To further understand the ERG-mediated regulation of IRX4 in prostate cancer cells, we determined the expression of IRX4 in transient ERG knockdown VCaP and DuCaP cells (Figure 5.11). KLK3 expression was used as the positive control for DHT treatment. Interestingly, the androgen mediated up-regulation of KLK3 expression was further increased with ERG knockdown in VCaP cells, while ERG knockdown made no difference to KLK3 expression in DuCaP cells. IRX4 expression was upregulated with DHT treatment and further increased with ERG knockdown in both VCaP and DuCaP cells, suggesting ERG may have a negative effect on the androgen response of IRX4 expression. On the other hand, ERG knockdown did not have any effect on IRX4lncRNA expression.

Figure 5.11 - Differential gene expression in ERG knockdown prostate cancer cell lines. Cells were seeded in RPMI1640 medium supplemented with FBS, which was then replaced with androgen-depleted culture medium containing 5% charcoal-stripped serum (CSS). 24 hours later, the cells were transfected with either siERG or non-targeting siRNA (siNT) using Lipofectamine® RNAiMAX transfection reagent. The cells in CSS were supplemented with 10 nM DHT or ethanol (EtOH, vehicle control) after 48 hours and incubated at 37°C for 48 hours. IRX4 and IRX4lncRNA were up-regulated with 10 nM DHT in the RNAiMAX and non-targeting siRNA controls, and the androgen-mediated up-regulation of IRX4 was further increased with ERG knockdown. KLK3 expression was used as a positive control to validate the androgen treatment. KLK3 was overexpressed with DHT treatment in both cell lines. (n=3 biological replicates, Mean±SEM, Mann-Whitney test, *p<0.05).

5.3.4. Sequencing of AR/ERG binding DNA region in prostate cancer cell lines

The regulation of IRX4 expression in ERG overexpressing and knockdown cells contradicts the observation from clinical data, in which IRX4 expression correlated with the ERG-fusion status. Since we did not observe the expected results, we hypothesised that the AR/ERG binding region of the LNCaP cell line might contain genetic variations which can alter the ability of the AR and ERG transcription factors to bind to this region. Therefore, we sequenced the AR/ERG binding region of these cell lines to determine whether any difference is observed at the genomic DNA level. We detected three PCR product patterns: a single 252bp band, a single 226bp band, or both bands together, representing the heterozygous genotype (Figure 5.12).

Sequencing of these purified PCR products revealed a rare type of polymorphism, a Multiple Nucleotide Length Polymorphism (MNLP), coinciding with the AR/ERG ChIP-peak upstream of IRX4. The genomic sequence from VCaP completely matched the reference sequence in the GRCh38 Primary Assembly, whereas in the LNCaP sequence, a fragment of 47bp is replaced by a novel 21bp sequence (Figure 5.12). Effectively, the loss of the 47bp sequence would delete the binding site of AR and ERG and could change how IRX4 and IRX4lncRNA are regulated in androgen responsive tissue. This 47bp/21bp MNLP upstream of IRX4 was recently included (rs386684493) in the dbSNP 141 database (Oct 2014 release). Based on these results, the genetic association of the novel MNLP identified at the 5p15 locus with prostate cancer risk was further explored and investigated for its role in the regulation of IRX4 and IRX4lncRNA expression.

Figure 5.12 - MNLP genotype of the panel of prostate cancer cell lines. 2% agarose gel showing the MNLP genotype of the panel of prostate cancer cell lines. The 47bp allele of the MNLP yields a 252bp product and the 21bp allele yields a 226bp product; the heterozygous genotype shows two distinct bands. (NTC: non-template control; 1 kb plus ladder (Invitrogen, Catalogue number 10787-018)). The lower panel shows the sequences of the 47bp/47bp homozygous, 47bp/21bp heterozygous and 21bp/21bp homozygous genotypes of the MNLP.
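The band-size rule used for genotyping can be written as a short genotype-calling sketch. It is illustrative only and not part of the thesis workflow: the function name `call_mnlp_genotype` and the 5bp sizing tolerance are assumptions, while the 252bp and 226bp product sizes are those reported for the 47bp and 21bp alleles.

```python
# Expected PCR product sizes (bp) for each MNLP allele, as reported in the text.
AMPLICON_TO_ALLELE = {252: "47bp", 226: "21bp"}


def call_mnlp_genotype(band_sizes, tolerance=5):
    """Return an MNLP genotype string from observed gel band sizes (bp).

    `tolerance` is an assumed allowance for sizing error on an agarose gel;
    it is far smaller than the 26bp difference between the two products.
    """
    alleles = set()
    for band in band_sizes:
        for expected, allele in AMPLICON_TO_ALLELE.items():
            if abs(band - expected) <= tolerance:
                alleles.add(allele)
    if alleles == {"47bp"}:
        return "47bp/47bp"
    if alleles == {"21bp"}:
        return "21bp/21bp"
    if alleles == {"47bp", "21bp"}:
        return "47bp/21bp"
    return "no call"  # no band near either expected size
```

For example, a lane with bands estimated at 250bp and 228bp would be called 47bp/21bp, whereas an empty lane such as the non-template control gives no call.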

5.3.5. Allele specific androgen responsiveness of the MNLP

To confirm the androgen responsiveness of the 47bp allele of the MNLP, a promoter vector assay was performed. The promoter vector cloned with the 47bp allele in the enhancer region had a higher luciferase activity when treated with DHT compared to vehicle control, while 21bp allele clones showed no difference in activity with androgen treatment (Figure 5.13). There was no difference observed at the basal level with vehicle treatment between different groups.

Figure 5.13 – Allele specific luciferase promoter vector assay for the MNLP. LNCaP cells were seeded in RPMI1640 medium supplemented with FBS, which was then replaced with androgen-depleted culture medium containing 5% charcoal-stripped serum (CSS). 24 hours later, the cells were co-transfected with a luciferase promoter vector harbouring either the 47bp or the 21bp allele of the MNLP and a Renilla luciferase vector using Fugene transfection reagent. The cells in CSS were supplemented with 10 nM DHT or ethanol (EtOH, vehicle control) after 48 hours and incubated at 37°C for 24 hours. The cells transfected with the 47bp allele had significantly higher luciferase activity with DHT treatment compared to the 21bp reporter construct. (n=2, Mean±SEM, Unpaired t-test, * p < 0.05).
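Reporter readouts from a co-transfection of this kind are commonly normalised by dividing the firefly signal by the Renilla signal and then expressing DHT treatment as a fold change over vehicle. The sketch below assumes that convention; the thesis does not state the exact normalisation used, and the function names are hypothetical.

```python
def relative_luciferase(firefly, renilla):
    """Normalise a firefly reading to the co-transfected Renilla control."""
    return firefly / renilla


def fold_induction(treated, vehicle):
    """Fold change of mean normalised activity, treated over vehicle.

    Each argument is a list of (firefly, renilla) readings, one per replicate.
    """
    def mean_ratio(readings):
        ratios = [relative_luciferase(f, r) for f, r in readings]
        return sum(ratios) / len(ratios)

    return mean_ratio(treated) / mean_ratio(vehicle)
```

Under this convention, an androgen-responsive construct gives a fold induction above 1 with DHT, while an unresponsive construct stays close to 1.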

5.3.6. Genotyping MNLP in prostate cancer patients and cancer free controls and Linkage Disequilibrium (LD) analysis

In a preliminary study, we genotyped the MNLP in DNA extracted from 180 patients and 85 controls in a Caucasian population by performing the above PCR and confirming the products by 2% agarose gel electrophoresis (Figure 5.14 (a)). The polymorphic status of this MNLP was confirmed in an Australian population. Linkage Disequilibrium (LD) calculation was performed by the principal supervisor A/Prof Jyotsna Batra using the Haploview online software (http://www.broadinstitute.org/scientific-community/science/programmes/medical-and-population-genetics/haploview/haploview). LD, the non-random association between alleles that tend to be inherited together on a chromosome, was calculated for the MNLP and other SNPs at the 5p15 locus, including rs10866528, the GWAS hit in the recent fine mapping study. A strong LD (r² > 0.6) was observed between the MNLP and the rs10866528 SNP, which suggests this MNLP could be a functional polymorphism at this locus (Figure 5.14 (b)).
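For reference, the r² statistic between two biallelic variants is defined from the haplotype frequency p_AB and the allele frequencies p_A and p_B as D = p_AB − p_A·p_B and r² = D² / (p_A(1 − p_A)·p_B(1 − p_B)). The sketch below computes it under the simplifying assumption that phased haplotype frequencies are already known; software working from unphased genotypes has to estimate these frequencies first, a step not reproduced here. The function name is hypothetical.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """Return r^2 between two biallelic variants.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    Assumes phased haplotype frequencies are known (an illustrative shortcut).
    """
    d = p_ab - p_a * p_b  # LD coefficient D
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("r^2 is undefined when a variant is monomorphic")
    return d * d / denominator
```

Two variants whose alleles always travel together give r² = 1, and independent variants give r² = 0, so a threshold such as r² > 0.6 indicates substantial but incomplete correlation.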

Figure 5.14 – Genotyping of the MNLP in blood DNA from Australian males. (a) An example of the 2% agarose gel image obtained for the PCR amplification of the MNLP region in Australian men. 18 patients were 47bp/47bp homozygous, 20 were heterozygous and five were 21bp/21bp homozygous. Positive controls include LNCaP (21bp/21bp), DuCaP (47bp/47bp) and BPH1 (47bp/21bp). (A1-D11 represent different patient IDs, LN – LNCaP, BPH – BPH1, Du – DuCaP, and NTC – non-template control; 1 kb plus ladder (Invitrogen)). (b) Linkage Disequilibrium (LD) map generated through genotyping 150 Australian males. SNPs in green, from left, show the MNLP (rs386684493), the fine mapping SNP (rs10866528) and the GWAS top hit (rs12653946). The novel MNLP falls in the same LD block as the fine mapping SNP (Block 1).

5.3.7. IRX4 and IRX4lncRNA expression correlation with MNLP genotype

The expression correlation of IRX4 and IRX4lncRNA with the MNLP genotype was then determined. The 21bp/21bp homozygous genotype was correlated with lower transcript levels of both IRX4 and IRX4lncRNA (Figure 5.15).

Figure 5.15 - eQTL analysis of IRX4 and IRX4lncRNA expression with MNLP genotype. The expression of IRX4 and IRX4lncRNA was correlated with the MNLP genotype. The 21bp/21bp genotype correlates with lower transcript levels of these genes. (47bp/47bp = 12, 47bp/21bp = 24, 21bp/21bp = 5 patients, APCB, Mean±SD, Kruskal-Wallis test, * p < 0.05, ** p < 0.01, **** p < 0.0001).

5.3.8. In-silico prediction of transcription factor binding at MNLP

In addition to AR and ERG, we also observed FOXA1 and POLR2A binding at the MNLP in ChIP sequencing data from VCaP cells, while no such binding was observed in LNCaP cells (Appendix Q; Niskanen et al., 2015; Sahu, et al., 2013; Toropainen et al., 2015; Yang, Nickols, et al., 2013; Yu, et al., 2010). Interestingly, binding of ETV1, another member of the ETS family of transcription factors, to the MNLP was observed in LNCaP cells (Appendix Q) (Chen et al., 2013).

To identify other transcription factors binding to the alleles of the MNLP, the in-silico prediction tool TFBIND was utilised (cut-off score >=8; Tsunoda et al., 1999). A few transcription factors were predicted to bind only to the 47bp allele of the MNLP, including FOXA2 (also known as HNF3B) and SOX5 (Table 5-2). On the other hand, only NKX2.5 was predicted to bind specifically to the 21bp allele of the MNLP. IRX4 was previously reported to be a downstream target of NKX2.5.

Table 5-2 - Predicted transcription factor binding to the MNLP

47bp      21bp
HNF3B     NKX2.5
SOX5
CAP
EVI
NF
LYF
STAF
OLF
NFY
MYB
IRF

5.3.9. MNLP association with prostate cancer risk and survival for patients treated with ADT

Risk association analysis of prostate cancer GWAS loci in 82,591 cases and 61,213 controls of European ancestry identified rs19957702 (INDEL, _/G, chr5: 1889346) as the most significant prostate cancer risk associated variant at this locus (OR 1.10 (1.09-1.11), p-value 1.62E-27, MAF = 0.40; OncoArray Project, PRACTICAL Consortium (Dadaev, et al., 2018)). This variant coincides with the last nucleotide of the MNLP (chr5: 1889300-1889346) and, since the probe for this variant is designed to the negative strand, it may actually identify the MNLP genotype. This was further confirmed by genotyping the MNLP in ~400 samples by PCR and then cross-checking with the genotyping data for rs19957702 from the OncoArray project. Notably, the 21bp/21bp genotype was associated with poor survival outcome for those patients who underwent ADT in the Queensland cohort (Figure 5.16).
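A cross-check between PCR-based and array-based genotypes amounts to computing the fraction of shared samples with matching calls. The sketch below illustrates this and also verifies that the stated MNLP coordinates span 47 positions; it assumes both call sets have already been recoded to the same genotype labels, and the function names are hypothetical rather than taken from the thesis.

```python
# MNLP coordinates on chr5 as given in the text (inclusive on both ends).
MNLP_START, MNLP_END = 1889300, 1889346


def interval_length(start, end):
    """Length in bases of an interval whose ends are both inclusive."""
    return end - start + 1


def concordance(calls_a, calls_b):
    """Fraction of samples present in both call sets with identical genotypes.

    calls_a, calls_b: dicts mapping sample ID -> genotype label.
    Assumes the two sets use the same genotype coding (an illustrative
    simplification; array and PCR calls would need recoding first).
    """
    shared = [sample for sample in calls_a if sample in calls_b]
    if not shared:
        raise ValueError("no samples in common")
    matches = sum(1 for sample in shared if calls_a[sample] == calls_b[sample])
    return matches / len(shared)
```

The coordinate check is consistent with the 47bp allele: the inclusive interval chr5: 1889300-1889346 covers exactly 47 bases.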

[Figure 5.16 plot: survival curves of percent survival (0-100) against time (years, 0-12) for the 47bp/47bp, 47bp/21bp and 21bp/21bp genotypes]

Figure 5.16 - Survival analysis for Queensland men with the MNLP who underwent androgen deprivation therapy. The survival of the patients who underwent hormone deprivation therapy was less in individuals harbouring the 21bp/21bp homozygous genotype (Log-rank test, p=0.04).
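Survival curves of this kind are typically drawn with the Kaplan-Meier estimator, in which the survival probability is multiplied by (1 − deaths / number at risk) at each time an event occurs. The following is a minimal sketch of that estimator for a single genotype group; it is not the analysis pipeline used in the thesis, it does not implement the log-rank test, and the function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate for one group.

    times: follow-up time for each patient
    events: 1 if death was observed at that time, 0 if the patient was censored
    Returns a list of (time, survival_probability) at each observed event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after time t.
        n_at_risk -= leaving
    return curve
```

Running the function separately on each genotype group yields the step curves that a log-rank test then compares.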

5.4. Discussion

Androgens play a crucial role in prostate development as well as prostate cancer pathogenesis. The activity of androgens is mediated through their receptor, AR (Vaarala et al., 2012). In this chapter, we determined the androgen-mediated regulation of IRX4 and IRX4lncRNA. Interestingly, the androgen-mediated regulation of these genes varied in different androgen-responsive cell lines, as they were up-regulated with DHT treatment in VCaP and DuCaP cells, while down-regulated in LNCaP cells (Figure 5.17). The anti-androgens bicalutamide and enzalutamide inhibited the effect of androgen treatment on the expression of IRX4 and IRX4lncRNA in both LNCaP and DuCaP cells. We further observed an AR binding peak in the upstream region of IRX4 in VCaP and DuCaP cells using the Cistrome Finder database, but no AR binding was found in LNCaP cells at this locus (Asangani, et al., 2014; Sahu, et al., 2013; Yu, et al., 2010). In addition, an ERG binding peak was also observed at this locus in VCaP cells (Yu, et al., 2010).

Studies on ERG in prostate cancer have been increasing over the last decade. High expression of ERG has been associated with more aggressive parameters in prostate cancer, such as tumour stage and Gleason Score (Hagglof et al., 2014). We observed a correlation between IRX4 expression and ERG-fusion status in prostate cancer datasets, and confirmed this association in our prostate cancer tissue samples, in which high expression of IRX4 was associated with high ERG expression. However, no association between IRX4lncRNA and ERG-fusion status was observed. ERG knockdown in VCaP and DuCaP cells increased the androgen-mediated up-regulation of IRX4 expression, while ERG overexpression in LNCaP cells reduced IRX4 expression independent of androgen treatment. The results on ERG-mediated regulation of IRX4 in cell lines conflict with the observation in clinical samples, suggesting additional factors may be contributing to this regulation of IRX4 expression. A recent study on prostate cancer GWAS SNPs stratified the patients into ERG-fusion positive and ERG-fusion negative groups and found that the risk association of the 5p15 SNP rs12653946 is in opposite directions in these sub-groups (Penney et al., 2016). Therefore, performing ERG-IRX4 correlation analysis after stratifying the patients according to the rs12653946 SNP genotype may explain the contradictory results observed.

Sequencing of the AR/ERG binding region identified an MNLP (rs386684493) where a stretch of 47bp sequence is replaced by a novel 21bp sequence. The important implication of this finding is that the replacement of the 47bp sequence by a 21bp sequence in the LNCaP cell line could render IRX4 and IRX4lncRNA unresponsive to androgen regulation. This MNLP was previously reported to be the variant most strongly associated with prostate cancer risk at the 5p15 locus in a Japanese population (Nguyen, et al., 2012). Although that study performed a reporter gene assay with the MNLP alleles, no differences were observed, as androgen treatment was not included in the assay. As LNCaP cells have a missense mutation (T877A) in the ligand binding domain of the AR gene, which alters the binding characteristics of the AR (Tan et al., 2015), we performed the promoter vector assay in LNCaP cells to check whether the differential results observed were due to the mutated AR in LNCaP cells or the MNLP alleles. The reporter vector assay confirmed the androgen responsiveness of the 47bp allele of the MNLP: the 47bp allele had a higher luciferase activity with DHT treatment compared to vehicle control, while the 21bp allele construct did not show any difference between the treatments. The mechanism by which this MNLP is generated is poorly understood, and this type of polymorphism has been reported in only a few studies, including in the bovine and chicken genomes, where it was found to modulate promoter activity (Jiang et al., 2007; Li, Chen, et al., 2013; Nguyen, et al., 2012).

The MNLP falls in the same LD block as the fine mapping SNP, rs10866528 (Block 1), along with two other SNPs, rs260406 and rs4975758. The alleles of eight SNPs at this locus, rs10866528, rs4975758, rs12655062, rs34695572, rs12656007, rs12653946, rs35010507 and rs4975758, were shown to have differential binding with nuclear proteins by Electrophoretic Mobility Shift Assay (EMSA), while alleles of SNPs rs34695572, rs35010507 and rs12656007 exhibited different luciferase activity in a previous study (Nguyen, et al., 2012), suggesting these SNPs could also have an equally important functional role. However, the main focus of this study was to determine the genetic association of the MNLP with prostate cancer risk and to determine its role in the androgen-mediated regulation of IRX4 and IRX4lncRNA expression. We also noted that the expression of IRX4 and IRX4lncRNA correlates with the MNLP genotype in prostate cancer patients.

Figure 5.17 – Allele specific androgen-mediated regulation by the MNLP. In the presence of androgens (DHT), AR binds to the 47bp allele of the MNLP upstream of IRX4 in VCaP cells and up-regulates the expression of IRX4 and IRX4lncRNA. In LNCaP cells, which harbour the 21bp allele of the MNLP, the AR binding site is disrupted and the expression of IRX4 and IRX4lncRNA is down-regulated.

In-silico analysis of transcription factor binding at this MNLP predicted NKX2.5 to bind to the 21bp allele of the MNLP. Interestingly, cardiac expression of IRX4 is reported to be modulated by NKX2.5, and the loss of this transcription factor in mice resulted in lower levels of IRX4 transcripts (Bruneau, et al., 2000). On the other hand, the 47bp allele was predicted to bind more proteins, including the developmental transcription factors FOXA2 and SOX5. Binding of FOXA1 to the MNLP region was also observed in VCaP cells from the Cistrome Finder database. However, binding of AR was not predicted in this analysis, as the MNLP sequence does not contain a canonical AR consensus sequence. Binding of POLR2A to the MNLP region was observed in VCaP cells, while ETV1 binding to the MNLP was observed in LNCaP cells. ETV1 overexpression occurs in a relatively low percentage of prostate cancer patients, and ETV1 at least partially shares the ERG binding sites (Gasi Tandefelt et al., 2014). The observation of ERG binding in VCaP cells and ETV1 binding in LNCaP cells at the MNLP region suggests that the binding of these two transcription factors may not depend on the MNLP allele. Performing ChIP-qPCR of this region in ERG overexpressing LNCaP cells would clarify whether ERG can bind to the 21bp allele of the MNLP. However, studies have suggested that these transcription factors might have different effects on their target genes (Gasi Tandefelt, et al., 2014) and therefore, determining the regulation of IRX4 by ETV1 may provide more insight into the regulation of this locus in prostate cancer cells.

This MNLP was identified as the most significant prostate cancer risk associated variant at the 5p15 locus through large scale genetic association studies. Interestingly, the 21bp/21bp genotype was associated with poor survival outcome for patients who underwent ADT. However, this association should be confirmed in a larger population, which would also provide insight into the potential of this MNLP as a prediction tool for the outcome of hormone deprivation therapy. Consideration of ERG-fusion status along with the MNLP genotype, IRX4/IRX4lncRNA expression and disease progression will provide insights for identifying the potential of this novel polymorphism as a genetic prognostic biomarker for prostate cancer progression. Moreover, genetic modification of the MNLP by CRISPR in prostate cancer cells, followed by functional and/or transcriptomic analysis and chromosome conformation capture assays, will identify the long-range interactions of this MNLP and further clarify its role in prostate cancer pathogenesis.

Chapter 6: Conclusions, Limitations and Future Directions

Prostate cancer is one of the leading causes of cancer related death in men worldwide (Mateo et al., 2015). This disease is more common in Western countries than in developing countries (Eeles et al., 2014). PSA screening is the common method used for prostate cancer diagnosis, despite recommendations against this test due to harmful effects resulting from overdiagnosis (Bell et al., 2015). It is well established that AR plays a critical role in this hormone-stimulated disease and thus, therapies targeting androgen synthesis or AR activity are used to treat metastatic prostate cancer (Culig, 2017). Although most patients initially respond to ADT, they develop resistance to this therapy and subsequently progress to CRPC (Ho et al., 2017). This emphasises the urgent need to identify better biomarkers for prostate cancer diagnosis and prognosis and to develop better therapeutic options to overcome metastatic disease. Interestingly, genetic predisposition has been identified as one of the factors contributing to the risk of prostate cancer (Helfand et al., 2015). Despite the questions raised about the importance of findings from genetic association studies for therapeutic interventions, a few studies have demonstrated the emerging clinical implications of analysing genetic variations (Visscher et al., 2017). For instance, a recent clinical trial on type 2 diabetes reported that the risk variant rs553668, located in the 3’-UTR of the ADRA2A gene, has the potential to guide treatment options using Yohimbine, the α-2A adrenergic receptor antagonist (Tang et al., 2014).

GWAS have identified more than 150 loci associated with prostate cancer risk to date (Al Olama, et al., 2014; Eeles, et al., 2013; Schumacher, et al., 2018). It is anticipated that understanding the functional role of these prostate cancer risk-associated SNPs will provide insights into the mechanisms driving prostate cancer. Most of these SNPs are located outside the exons of protein-coding genes and they are enriched in regulatory regions (Edwards, et al., 2013; Whitington et al., 2016). However, efforts to identify the functional consequences of these SNPs have mainly focused on the regulation of protein-coding genes, with the exception of a few studies involving lncRNAs (Guo, et al., 2016).

SNP rs12653946 at the 5p15 locus has been identified to be associated with prostate cancer risk in multi-ethnic populations through GWAS (Batra, et al., 2011; Lindstrom, et al., 2012; Takata, et al., 2010). This SNP is located in the intronic region of a non-coding gene, CTD-2194D22.4, which is not expressed in the prostate. Even though this locus contains well-known cancer associated genes such as TERT and CLPTM1L within a 700kb region of the GWAS SNP, no LD was observed between the SNPs associated with these genes and the rs12653946 SNP (Batra, et al., 2011). This suggests that the SNP rs12653946 represents an independent risk locus. Besides the genes mentioned above, this prostate cancer risk associated locus encompasses a transcription factor encoding gene, IRX4, and an antisense lncRNA, IRX4lncRNA, within a 20kb block surrounding the GWAS SNP, rs12653946. IRX4 has been shown to suppress prostate cancer cell proliferation by interacting with the vitamin D receptor (Nguyen, et al., 2012). Interestingly, a correlation has been reported between the expression levels of both IRX4 and IRX4lncRNA and the rs12653946 genotype (Guo, et al., 2016; Xu, et al., 2014). Thus, we hypothesised that the prostate cancer risk associated 5p15 locus may confer its risk via IRX4 and IRX4lncRNA, and in this study we aimed to delineate their functional role and expression regulation in prostate cancer aetiology.

IRX4, which belongs to the Iroquois homeodomain transcription factor family, is reported as the most divergent member of this family (Kim, et al., 2012). Even though the role of IRX4 is well-studied in heart development, it has been implicated in prostate cancer only after GWAS identified the genetic association of the 5p15 locus with prostate cancer risk. Interestingly, overexpression of IRX4 was observed in prostate cancer compared to other types of cancer in publicly available cancer datasets, and it was also overexpressed in prostate tumour tissues compared to normal tissues. This study has identified that the knockdown of IRX4 in prostate cancer cells reduced cell proliferation and migration. However, a previous study by Nguyen et al. reported the opposite function for IRX4 in prostate cancer cells (Nguyen, et al., 2012). As the siRNAs used in these studies targeted different regions of the IRX4 gene, it is important to check the expression of different splice variants of IRX4 with these knockdowns to get a clear understanding of this observation. It is essential to use additional siRNAs to confirm the results obtained and to eliminate the effects of non-specific targeting by the siRNA. Although siRNAs have been used as important molecules in gene silencing and implicated as therapeutic targeting agents by RNAi technology, off-target silencing of unintended genes has limited the use of these siRNAs (Zhang, Liang, et al., 2018). Interestingly, a recent study has described that the use of circular siRNAs will overcome the non-specific effects of siRNA, while also extending the duration of gene silencing (Zhang, Liang, et al., 2018). As IRX family members share similar sequences in their homeodomain, it is critical to design siRNA sequences that specifically target IRX4. A study on IRX5 in prostate cancer cells showed that two siRNAs targeting either the 3’-UTR or the open reading frame of IRX5 reduced cell proliferation, while one of the siRNAs targeting the open reading frame did not show a statistically significant difference (Myrthue, et al., 2008). As the transcripts of IRX4 share similar exons, it is technically challenging to target an individual variant. Thus, performing functional studies with overexpression models of different IRX4 isoforms would clarify their individual roles in prostate cancer. Stable overexpression models will also provide an opportunity to study the phenotype and EMT related changes induced by IRX4 in prostate cancer cells.

Gene microarray analysis of IRX4 knockdown samples identified AR as one of the upstream regulators of the genes regulated by IRX4. Further comparison of the androgen regulated gene signature in LNCaP cells with IRX4 regulated genes identified a negative correlation between these two data sets, suggesting that IRX4 knockdown negatively affects androgen signalling. However, there was no change in AR transcript levels observed with IRX4 knockdown, suggesting that AR gene transcription may not be the direct target of IRX4. In addition, IRX4 was not identified as a binding partner of AR in previously published studies on the AR. Interestingly, in our analysis a few AR co-factors in LNCaP cells, including the crucial AR co-factor FOXA1, were identified as interacting with IRX4 by IP studies. FOXA1 is expressed in various tissues, including prostate, breast, liver and pancreas, and is known to modulate the expression of genes involved in the cell cycle, cell signalling and metabolic processes (Qiu et al., 2014). It is also known as a pioneer factor which binds to chromatin DNA and remodels it for additional transcription factor binding, including AR (Yang et al., 2015). Moreover, FOXA1 has been shown to directly bind AR and modulate the expression of AR regulated genes in prostate cancer (Gao et al., 2003). Therefore, additional comparison of the FOXA1 transcriptome with IRX4 regulated genes may reveal the association between these two transcription factors in regulating their downstream genes and provide more understanding of whether IRX4 regulated genes are co-dependent on FOXA1. The function of FOXA1 in prostate cancer is debated and depends on the patient cohort. High FOXA1 was reported to be associated with poor prognosis by enhancing AR activity, while some other reports suggest that low FOXA1 expression is associated with poor prognosis in androgen-independent cancer (Jin et al., 2013).

There are no studies published to date looking at the IRX4 binding sites on the chromatin in prostate cells. Therefore, performing IRX4 ChIP-sequencing will provide a better understanding of the role of this transcription factor. Comparing the IRX4 cistrome with the IRX4 regulated genes from microarray analysis will identify the direct targets of IRX4. In addition, it is necessary to delineate the interaction of IRX4, FOXA1 and AR at the cistromic level to identify the shared targets between these transcription factors in order to determine the role of IRX4 in androgen signalling. Identifying the role of IRX4 in androgen signalling will further open up an avenue for developing new therapeutic targets for prostate cancer treatment. Additionally, other pathways and upstream regulators identified in the microarray and immunoprecipitation studies should be studied to further clarify the role of IRX4 in prostate cancer progression.

Besides IRX4, expression of a lncRNA, IRX4lncRNA, located on the anti-sense strand in the intronic region of the IRX4 gene, was reported in prostate cancer cells. IRX4lncRNA was identified as a potential candidate lncRNA for prostate cancer in a study exploring lncRNAs at prostate cancer risk associated GWAS loci (Guo, et al., 2016). IRX4lncRNA was overexpressed in prostate tumour samples compared to the adjacent non-malignant tissues. Moreover, the expression of IRX4lncRNA was associated with increasing Gleason Score. Higher expression of IRX4lncRNA was observed in Gleason Score 4+3 compared to Gleason Score 3+4, suggesting the association of IRX4lncRNA with more aggressive disease. Furthermore, the patients who had cancer recurrence had a higher expression of IRX4lncRNA compared to disease free patients. The PCA3 test, which measures the mRNA levels of PCA3 in urine, is the only FDA approved lncRNA-mediated test for prostate cancer diagnosis. As opposed to PSA, PCA3 is not elevated in other physiological conditions, such as inflammation or BPH, which accounts for the specificity of this test. Yet, it is not recommended as a standalone test for decision making due to its lower sensitivity (Deng, et al., 2017). Gene expression analysis of PCA3 in prostate tumour samples from TCGA data showed a reduction of expression levels with increasing Gleason Score. However, the expression pattern in the tumour tissues may not always match the levels in circulation. There are also conflicting reports published on the prognostic potential of this test, as discussed in the Chapter 4 Discussion. Of note, detection of the TMPRSS2:ERG fusion transcript in urine samples in combination with PCA3 has been recommended to increase the sensitivity of prostate cancer diagnosis (Martignano et al., 2017). Besides this urine test, PCAT18, a prostate specific lncRNA, was detected in plasma samples and was highly expressed in prostate cancer patients compared to healthy controls (Crea, et al., 2014). These studies highlight the potential of lncRNAs to be detected in liquid biopsy samples and their specificity in prostate cancer diagnosis. Detecting IRX4lncRNA expression in liquid biopsy samples, such as plasma or urine, and further comparison with the current PSA test will determine the potential of this lncRNA as a prognostic biomarker for prostate cancer.

Although the mechanistic studies of lncRNAs are at an early stage, it is suggested that lncRNAs may have potential as targets for novel therapies for prostate cancer management (Chandra Gupta et al., 2017; Prensner & Chinnaiyan, 2011; Smolle, et al., 2017). Thus, it is essential to determine the functional role of a lncRNA in disease progression. IRX4lncRNA knockdown in LNCaP cells reduced cell proliferation and migration, while IRX4lncRNA overexpressing PC3 cells had increased proliferation, suggesting a tumour promoting role for this lncRNA in prostate cancer progression. Only a moderate effect was observed in functional assays with the IRX4/IRX4lncRNA knockdown and overexpression models. PC3 cells, which were used for IRX4lncRNA overexpression, already expressed low levels of this lncRNA and the system became leaky over time, so it may be beneficial to knock down this endogenous expression before IRX4lncRNA overexpression to further validate its function in these cells. Neither the siRNA nor the shRNA system completely knocked down the expression of its target gene. CRISPR knockout of IRX4 and IRX4lncRNA may provide complete gene knockout and can be used to study the effect of these genes on morphological changes and in functional assays. Since all the functional studies performed in this study are in-vitro based, it is also important to confirm the results in in-vivo models.

Nevertheless, it is important to identify how this lncRNA exerts these effects in prostate cancer cells. LncRNAs are known to be involved in several biological and cellular processes through mechanisms such as regulation of transcription and mRNA post-transcriptional processing (Geisler et al., 2013). Owing to their sequence complementarity, they can bind to either DNA or RNA molecules to modulate their expression or function. In addition, lncRNAs are reported to bind to RNA-binding proteins, epigenetic modulators and transcription factors to modulate gene expression (Xing et al., 2016). Therefore, the interactors of IRX4lncRNA should be determined to identify the mechanistic role of this lncRNA and thereby its potential as a therapeutic target. Recent studies commonly use RNA pull-down assays for this purpose, in which biotinylated oligos against the target lncRNA are used to precipitate the lncRNA:protein/DNA complexes, followed by sequencing or mass spectrometry to identify individual targets (Cerase et al., 2015; Xing et al., 2016). A similar approach can be adapted to identify the interacting partners of IRX4lncRNA in prostate cancer cells.

As AR plays an important role in prostate cancer progression, the regulation of the expression of IRX4 and IRX4lncRNA was determined with androgen treatment as described in Chapter 5. Both IRX4 and IRX4lncRNA were upregulated with androgen treatment in VCaP and DuCaP cells, but downregulated in LNCaP cells. In silico analysis using Cistrome Finder identified AR and ERG binding peaks in the upstream region of IRX4 in VCaP cells (Asangani et al., 2014; Sahu et al., 2013; Yu et al., 2010). Higher expression of IRX4 was correlated with ERG-fusion positive status and high ERG expression. However, IRX4lncRNA expression was not correlated with ERG status, suggesting that the regulatory mechanisms of these two genes are not always shared. Similarly, IRX4lncRNA expression was not affected by either ERG knockdown in VCaP cells or ERG overexpression in LNCaP cells. On the other hand, ERG knockdown in VCaP and DuCaP cells increased the androgen-mediated upregulation of IRX4 expression, while ERG overexpression in LNCaP cells reduced IRX4 expression independent of androgen treatment, which was not in line with the association in clinical samples. Interestingly, a recent study on prostate cancer GWAS SNPs in ERG-fusion sub-types reported opposite trends of risk association for the 5p15 SNP rs12653946 in ERG-fusion positive and negative tumours (Penney et al., 2016). Therefore, the risk association of this locus may be complex, depending on tumour sub-type and also AR status. These aspects have to be considered in further analysis, and determining the expression correlation of IRX4 with the SNP genotype in ERG-fusion sub-types may provide a better understanding of the observations from the in vitro assays.

A rare type of polymorphism, the MNLP rs386684493, was identified at the AR/ERG binding region, in which a 47bp stretch of sequence is replaced by a novel 21bp sequence. This MNLP was previously reported to be the variant most strongly associated with prostate cancer risk at 5p15 in a Japanese population (variant 06, OR = 1.34 (1.23–1.46), p-value = 2.18E-11) (Nguyen et al., 2012). Our study confirmed the androgen responsiveness of the MNLP by reporter vector assay. CRISPR modification of the MNLP allele in VCaP and LNCaP cells followed by transcriptomic analysis would identify the global changes in gene expression affected by the MNLP allele. ChIP-qPCR of the MNLP region using the CRISPR models would further confirm the MNLP allele-dependent AR binding at the 5p15 locus. In addition, Chromosome Conformation Capture (3C) assays should be performed to identify the long-range interactions of this MNLP. Interestingly, obesity-associated variants within the FTO gene on chromosome 16 have been shown to exert their functional role through a long-range target, IRX3 (Smemo et al., 2014).
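
As a rough consistency check on the association statistics quoted above for the MNLP, the reported p-value can be approximately recovered from the odds ratio and its confidence interval. The sketch below is illustrative only: it assumes that the interval is a 95% Wald interval on the log-odds scale and that the test is two-sided, neither of which is stated in this chapter, and the function name is hypothetical.

```python
import math


def wald_z_and_p(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate z statistic and two-sided p-value from an OR and its CI.

    Assumption: (ci_low, ci_high) is a 95% Wald interval for the OR,
    i.e. symmetric around ln(OR) on the log scale.
    """
    # Standard error of ln(OR), recovered from the CI width on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability: P(|Z| > z) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Values quoted in the text for the MNLP (variant 06).
z, p = wald_z_and_p(1.34, 1.23, 1.46)
print(f"z = {z:.2f}, two-sided p = {p:.2e}")
```

Because the inputs are rounded to two decimal places, only agreement in order of magnitude with the reported p-value should be expected from this check.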

Genetic association studies identified this MNLP as the most significant prostate cancer risk-associated variant at the 5p15 locus. The expression of IRX4 and IRX4lncRNA correlated with the MNLP genotype in prostate cancer patients. Moreover, the 21bp/21bp genotype was associated with poor survival outcome in prostate cancer patients who underwent ADT. However, it is important to confirm this association in a larger cohort. Additional analysis of the risk association of the MNLP alleles by ERG-fusion status and AR status, and of the expression correlation of IRX4/IRX4lncRNA with the MNLP genotype in these fusion sub-types, may provide insights into how this locus contributes to prostate cancer risk and identify the potential of the MNLP as a genetic prognostic biomarker for prostate cancer. However, the unavailability of ERG-fusion status data for the genotyped samples is a limitation of this study. In addition, determining the expression correlation of different IRX4 variants with the MNLP genotype may provide a clearer understanding of their potential role in prostate cancer.

In summary, this study has identified IRX4 at the prostate cancer risk-associated 5p15 locus as a mediator of androgen-mediated signalling pathways in prostate cancer, probably via interaction with FOXA1. The IRX4lncRNA, encoded on the antisense strand of IRX4 at the 5p15 locus, may promote prostate cancer cell proliferation and migration. We also identified that a functional polymorphism at the 5p15 locus, rs386684493, correlates with IRX4 and IRX4lncRNA expression and modulates their androgen-mediated regulation. This polymorphism was also identified as the most significant genetic variant at the 5p15 locus associated with prostate cancer risk through large-scale genetic association studies. Further studies to identify the molecular mechanisms by which IRX4 and IRX4lncRNA play their role in prostate cancer pathogenesis may provide insight for the development of biologically meaningful therapeutic targets for prostate cancer.

Bibliography

Abate-Shen, C. (2002). Deregulated homeobox gene expression in cancer: cause or consequence? Nat Rev Cancer, 2(10), 777-785.

Adjakly, M., Ngollo, M., Dagdemir, A., Judes, G., Pajon, A., Karsli-Ceppioglu, S., . . . Bernard-Gallon, D. (2015). Prostate cancer: The main risk and protective factors - Epigenetic modifications. Ann Endocrinol (Paris).

Aguilo, F., Di Cecilia, S., & Walsh, M. J. (2016). Long Non-coding RNA ANRIL and Polycomb in Human Cancers and Cardiovascular Disease. Curr Top Microbiol Immunol, 394, 29-39.

Al Olama, A. A., Kote-Jarai, Z., Berndt, S. I., Conti, D. V., Schumacher, F., Han, Y., . . . Haiman, C. A. (2014). A meta-analysis of 87,040 individuals identifies 23 new susceptibility loci for prostate cancer. Nat Genet, 46(10), 1103-1109.

Alexander, R. P., Fang, G., Rozowsky, J., Snyder, M., & Gerstein, M. B. (2010). Annotating non-coding regions of the genome. Nat Rev Genet, 11(8), 559-571.

Amin Al Olama, A., Dadaev, T., Hazelett, D. J., Li, Q., Leongamornlert, D., Saunders, E. J., . . . Kote-Jarai, Z. (2015). Multiple novel prostate cancer susceptibility signals identified by fine-mapping of known risk loci among Europeans. Hum Mol Genet, 24(19), 5589-5602.

Amit, D., & Hochberg, A. (2010). Development of targeted therapy for bladder cancer mediated by a double promoter plasmid expressing diphtheria toxin under the control of H19 and IGF2-P4 regulatory sequences. J Transl Med, 8, 134.

Anastasiadou, E., Jacob, L. S., & Slack, F. J. (2018). Non-coding RNA networks in cancer. Nat Rev Cancer, 18(1), 5-18.

Andrews, S. J., & Rothnagel, J. A. (2014). Emerging evidence for functional peptides encoded by short open reading frames. Nat Rev Genet, 15(3), 193-204.

Arredouani, M. S., Lu, B., Bhasin, M., Eljanne, M., Yue, W., Mosquera, J. M., . . . Sanda, M. G. (2009). Identification of the transcription factor single-minded homologue 2 as a potential biomarker and immunotherapy target in prostate cancer. Clin Cancer Res, 15(18), 5794-5802.

Asangani, I. A., Dommeti, V. L., Wang, X., Malik, R., Cieslik, M., Yang, R., . . . Chinnaiyan, A. M. (2014). Therapeutic targeting of BET bromodomain proteins in castration-resistant prostate cancer. Nature, 510(7504), 278-282.

Auprich, M., Chun, F. K., Ward, J. F., Pummer, K., Babaian, R., Augustin, H., . . . Haese, A. (2011). Critical assessment of preoperative urinary prostate cancer antigen 3 on the accuracy of prostate cancer staging. Eur Urol, 59(1), 96-105.

Bao, Z. Z., Bruneau, B. G., Seidman, J. G., Seidman, C. E., & Cepko, C. L. (1999). Regulation of chamber-specific gene expression in the developing heart by Irx4. Science, 283(5405), 1161-1164.

Barry, G. S., Cheang, M. C., Chang, H. L., & Kennecke, H. F. (2016). Genomic markers of panitumumab resistance including ERBB2/HER2 in a phase II study of KRAS wild-type (wt) metastatic colorectal cancer (mCRC). Oncotarget, 7(14), 18953-18964.

Batra, J., Lose, F., Chambers, S., Gardiner, R. A., Aitken, J., Yaxley, J., . . . Australian Prostate Cancer, B. (2011). A replication study examining novel common single nucleotide polymorphisms identified through a prostate cancer genome-wide association study in a Japanese population. Am J Epidemiol, 174(12), 1391-1395.

Becker, M. B., Zulch, A., Bosse, A., & Gruss, P. (2001). Irx1 and Irx2 expression in early lung development. Mech Dev, 106(1-2), 155-158.

Beekman, K. W., & Hussain, M. (2008). Hormonal approaches in prostate cancer: application in the contemporary prostate cancer patient. Urol Oncol, 26(4), 415-419.

Bell, K. J., Del Mar, C., Wright, G., Dickinson, J., & Glasziou, P. (2015). Prevalence of incidental prostate cancer: A systematic review of autopsy studies. Int J Cancer, 137(7), 1749-1757.

Bennett, K. L., Karpenko, M., Lin, M. T., Claus, R., Arab, K., Dyckhoff, G., . . . Plass, C. (2008). Frequently methylated tumor suppressor genes in head and neck squamous cell carcinoma. Cancer Res, 68(12), 4494-4499.

Bennett, K. L., Romigh, T., & Eng, C. (2009). Disruption of transforming growth factor-beta signaling by five frequently methylated genes leads to head and neck squamous cell carcinoma pathogenesis. Cancer Res, 69(24), 9301-9305.

Bernard, D., Prasanth, K. V., Tripathi, V., Colasse, S., Nakamura, T., Xuan, Z., . . . Bessis, A. (2010). A long nuclear-retained non-coding RNA regulates synaptogenesis by modulating gene expression. EMBO J, 29(18), 3082-3093.

Bhatlekar, S., Fields, J. Z., & Boman, B. M. (2014). HOX genes and their role in the development of human cancers. J Mol Med (Berl), 92(8), 811-823.

Bilaud, T., Brun, C., Ancelin, K., Koering, C. E., Laroche, T., & Gilson, E. (1997). Telomeric localization of TRF2, a novel human telobox protein. Nat Genet, 17(2), 236-239.

Bitting, R. L., Schaeffer, D., Somarelli, J. A., Garcia-Blanco, M. A., & Armstrong, A. J. (2014). The role of epithelial plasticity in prostate cancer dissemination and treatment resistance. Cancer Metastasis Rev, 33(2-3), 441-468.

Bosse, A., Zulch, A., Becker, M. B., Torres, M., Gomez-Skarmeta, J. L., Modolell, J., & Gruss, P. (1997). Identification of the vertebrate Iroquois homeobox gene family with overlapping expression during early development of the nervous system. Mech Dev, 69(1-2), 169-181.

Boyd, L. K., Mao, X., & Lu, Y. J. (2012). The complexity of prostate cancer: genomic alterations and heterogeneity. Nat Rev Urol, 9(11), 652-664.

Bracarda, S., de Cobelli, O., Greco, C., Prayer-Galetti, T., Valdagni, R., Gatta, G., . . . Bartsch, G. (2005). Cancer of the prostate. Crit Rev Oncol Hematol, 56(3), 379-396.

Brantl, S. (2007). Regulatory mechanisms employed by cis-encoded antisense RNAs. Curr Opin Microbiol, 10(2), 102-109.

Brawley, O. W. (2012). Prostate cancer epidemiology in the United States. World J Urol, 30(2), 195-200.

Brown, C. J., Ballabio, A., Rupert, J. L., Lafreniere, R. G., Grompe, M., Tonlorenzi, R., & Willard, H. F. (1991). A gene from the region of the human X inactivation centre is expressed exclusively from the inactive X chromosome. Nature, 349(6304), 38-44.

Brown, C. J., Hendrich, B. D., Rupert, J. L., Lafreniere, R. G., Xing, Y., Lawrence, J., & Willard, H. F. (1992). The human XIST gene: analysis of a 17 kb inactive X-specific RNA that contains conserved repeats and is highly localized within the nucleus. Cell, 71(3), 527-542.

Bruneau, B. G., Bao, Z. Z., Tanaka, M., Schott, J. J., Izumo, S., Cepko, C. L., . . . Seidman, C. E. (2000). Cardiac expression of the ventricle-specific homeobox gene Irx4 is modulated by Nkx2-5 and dHand. Dev Biol, 217(2), 266-277.

Bu, D., Yu, K., Sun, S., Xie, C., Skogerbo, G., Miao, R., . . . Zhao, Y. (2012). NONCODE v3.0: integrative annotation of long noncoding RNAs. Nucleic Acids Res, 40(Database issue), D210-215.

Bu, H., Narisu, N., Schlick, B., Rainer, J., Manke, T., Schafer, G., . . . Klocker, H. (2016). Putative Prostate Cancer Risk SNP in an Androgen Receptor-Binding Site of the Melanophilin Gene Illustrates Enrichment of Risk SNPs in Androgen Receptor Target Sites. Hum Mutat, 37(1), 52-64.

Bussemakers, M. J., van Bokhoven, A., Verhaegh, G. W., Smit, F. P., Karthaus, H. F., Schalken, J. A., . . . Isaacs, W. B. (1999). DD3: a new prostate-specific gene, highly overexpressed in prostate cancer. Cancer Res, 59(23), 5975-5979.

Buyyounouski, M. K., Choyke, P. L., McKenney, J. K., Sartor, O., Sandler, H. M., Amin, M. B., . . . Lin, D. W. (2017). Prostate cancer - major changes in the American Joint Committee on Cancer eighth edition cancer staging manual. CA Cancer J Clin, 67(3), 245-253.

Cai, C., Wang, H., He, H. H., Chen, S., He, L., Ma, F., . . . Yuan, X. (2013). ERG induces androgen receptor-mediated regulation of SOX9 in prostate cancer. J Clin Invest, 123(3), 1109-1122.

Cai, C., Wang, H., Xu, Y., Chen, S., & Balk, S. P. (2009). Reactivation of androgen receptor-regulated TMPRSS2:ERG gene expression in castration-resistant prostate cancer. Cancer Res, 69(15), 6027-6032.

Cao, J. (2014). The functional role of long non-coding RNAs and epigenetics. Biol Proced Online, 16, 11.

Carninci, P., Kasukawa, T., Katayama, S., Gough, J., Frith, M. C., Maeda, N., . . . Genome Science, G. (2005). The transcriptional landscape of the mammalian genome. Science, 309(5740), 1559-1563.

Cavodeassi, F., Modolell, J., & Gomez-Skarmeta, J. L. (2001). The Iroquois family of genes: from body building to neural patterning. Development, 128(15), 2847-2855.

Cerami, E., Gao, J., Dogrusoz, U., Gross, B. E., Sumer, S. O., Aksoy, B. A., . . . Schultz, N. (2012). The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer Discov, 2(5), 401-404.

Cerase, A., Pintacuda, G., Tattermusch, A., & Avner, P. (2015). Xist localization and function: new insights from multiple levels. Genome Biol, 16, 166.

Chakravarty, D., Sboner, A., Nair, S. S., Giannopoulou, E., Li, R., Hennig, S., . . . Rubin, M. A. (2014). The oestrogen receptor alpha-regulated lncRNA NEAT1 is a critical modulator of prostate cancer. Nat Commun, 5, 5383.

Chandra Gupta, S., & Nandan Tripathi, Y. (2017). Potential of long non-coding RNAs in cancer patients: From biomarkers to therapeutic targets. Int J Cancer, 140(9), 1955-1967.

Chen, J., Miao, Z., Xue, B., Shan, Y., Weng, G., & Shen, B. (2016). Long Non-coding RNAs in Urologic Malignancies: Functional Roles and Clinical Translation. J Cancer, 7(13), 1842-1855.

Chen, L. L. (2016). Linking Long Noncoding RNA Localization and Function. Trends Biochem Sci, 41(9), 761-772.

Chen, S., Zhu, J., Wang, F., Guan, Z., Ge, Y., Yang, X., & Cai, J. (2017). LncRNAs and their role in cancer stem cells. Oncotarget, 8(66), 110685-110692.

Chen, Y., Chi, P., Rockowitz, S., Iaquinta, P. J., Shamu, T., Shukla, S., . . . Sawyers, C. L. (2013). ETS factors reprogram the androgen receptor cistrome and prime prostate tumorigenesis in response to PTEN loss. Nat Med, 19(8), 1023-1029.

Cheng, C. W., Chow, R. L., Lebel, M., Sakuma, R., Cheung, H. O., Thanabalasingham, V., . . . Cheng, S. H. (2005). The Iroquois homeobox gene, Irx5, is required for retinal cone bipolar cell development. Dev Biol, 287(1), 48-60.

Cheng, Z., Wang, J., Su, D., Pan, H., Huang, G., Li, X., . . . Ma, X. (2011). Two novel mutations of the IRX4 gene in patients with congenital heart disease. Hum Genet, 130(5), 657-662.

Chi, J. S., Li, J. Z., Jia, J. J., Zhang, T., Liu, X. M., & Yi, L. (2017). Long non-coding RNA ANRIL in gene regulation and its duality in atherosclerosis. J Huazhong Univ Sci Technolog Med Sci, 37(6), 816-822.

Chiyomaru, T., Yamamura, S., Fukuhara, S., Yoshino, H., Kinoshita, T., Majid, S., . . . Dahiya, R. (2013). Genistein inhibits prostate cancer cell growth by targeting miR-34a and oncogenic HOTAIR. PLoS One, 8(8), e70372.

Chowdhury, U. R., Samant, R. S., Fodstad, O., & Shevde, L. A. (2009). Emerging role of nuclear protein 1 (NUPR1) in cancer biology. Cancer Metastasis Rev, 28(1-2), 225-232.

Chu, H., Chen, Y., Yuan, Q., Hua, Q., Zhang, X., Wang, M., . . . Zhang, Z. (2017). The HOTAIR, PRNCR1 and POLR2E polymorphisms are associated with cancer risk: a meta-analysis. Oncotarget, 8(26), 43271-43283.

Chung, S., Nakagawa, H., Uemura, M., Piao, L., Ashikawa, K., Hosono, N., . . . Kubo, M. (2011). Association of a novel long non-coding RNA in 8q24 with prostate cancer susceptibility. Cancer Sci, 102(1), 245-252.

Clark, M. B., & Mattick, J. S. (2011). Long noncoding RNAs in cell biology. Semin Cell Dev Biol, 22(4), 366-376.

Crawford, E. D. (2003). Epidemiology of prostate cancer. Urology, 62(6 Suppl 1), 3-12.

Crawford, E. D., & Moul, J. W. (2015). ADT Risks and Side Effects in Advanced Prostate Cancer: Cardiovascular and Acute Renal Injury. Oncology (Williston Park), 29(1).

Crawford, E. D., Ventii, K., & Shore, N. D. (2014). New biomarkers in prostate cancer. Oncology (Williston Park), 28(2), 135-142.

Crea, F., Watahiki, A., Quagliata, L., Xue, H., Pikor, L., Parolia, A., . . . Helgason, C. D. (2014). Identification of a long non-coding RNA as a novel biomarker and potential therapeutic target for metastatic prostate cancer. Oncotarget, 5(3), 764-774.

Crona, D. J., Milowsky, M. I., & Whang, Y. E. (2015). Androgen receptor targeting drugs in castration-resistant prostate cancer and mechanisms of resistance. Clin Pharmacol Ther, 98(6), 582-589.

Cui, Z., Ren, S., Lu, J., Wang, F., Xu, W., Sun, Y., . . . Sun, Y. (2013). The prostate cancer-up-regulated long noncoding RNA PlncRNA-1 modulates apoptosis and proliferation through reciprocal regulation of androgen receptor. Urol Oncol, 31(7), 1117-1123.

Culig, Z. (2017). Molecular Mechanisms of Enzalutamide Resistance in Prostate Cancer. Curr Mol Biol Rep, 3(4), 230-235.

Dadaev, T., Saunders, E. J., Newcombe, P. J., Anokian, E., Leongamornlert, D. A., Brook, M. N., . . . Kote-Jarai, Z. (2018). Fine-mapping of prostate cancer susceptibility loci in a large meta-analysis identifies candidate causal variants. Nat Commun, 9(1), 2256.

Daniels, G., Jha, R., Shen, Y., Logan, S. K., & Lee, P. (2014). Androgen receptor coactivators that inhibit prostate cancer growth. Am J Clin Exp Urol, 2(1), 62-70.

de la Calle-Mustienes, E., Feijoo, C. G., Manzanares, M., Tena, J. J., Rodriguez-Seguel, E., Letizia, A., . . . Gomez-Skarmeta, J. L. (2005). A functional survey of the enhancer activity of conserved non-coding sequences from vertebrate Iroquois cluster gene deserts. Genome Res, 15(8), 1061-1072.

Deng, J., Tang, J., Wang, G., & Zhu, Y. S. (2017). Long Non-Coding RNA as Potential Biomarker for Prostate Cancer: Is It Making a Difference? Int J Environ Res Public Health, 14(3).

Deng, Z., Norseen, J., Wiedmer, A., Riethman, H., & Lieberman, P. M. (2009). TERRA RNA binding to TRF2 facilitates heterochromatin formation and ORC recruitment at telomeres. Mol Cell, 35(4), 403-413.

Doulatov, S., Vo, L. T., Chou, S. S., Kim, P. G., Arora, N., Li, H., . . . Daley, G. Q. (2013). Induction of multipotential hematopoietic progenitors from human pluripotent stem cells via respecification of lineage-restricted precursors. Cell Stem Cell, 13(4), 459-470.

Duggan, D., Zheng, S. L., Knowlton, M., Benitez, D., Dimitrov, L., Wiklund, F., . . . Carpten, J. D. (2007). Two genome-wide association studies of aggressive prostate cancer implicate putative prostate tumor suppressor gene DAB2IP. J Natl Cancer Inst, 99(24), 1836-1844.

Duverger, O., & Morasso, M. I. (2008). Role of homeobox genes in the patterning, specification, and differentiation of ectodermal appendages in mammals. J Cell Physiol, 216(2), 337-346.

Dykes, I. M., & Emanueli, C. (2017). Transcriptional and Post-transcriptional Gene Regulation by Long Non-coding RNA. Genomics Proteomics Bioinformatics, 15(3), 177-186.

Edwards, S. L., Beesley, J., French, J. D., & Dunning, A. M. (2013). Beyond GWASs: illuminating the dark road from association to function. Am J Hum Genet, 93(5), 779-797.

Eeles, R., Goh, C., Castro, E., Bancroft, E., Guy, M., Al Olama, A. A., . . . Kote-Jarai, Z. (2014). The genetic epidemiology of prostate cancer and its clinical implications. Nat Rev Urol, 11(1), 18-31.

Eeles, R. A., Kote-Jarai, Z., Giles, G. G., Olama, A. A., Guy, M., Jugurnauth, S. K., . . . Easton, D. F. (2008). Multiple newly identified loci associated with prostate cancer susceptibility. Nat Genet, 40(3), 316-321.

Eeles, R. A., Olama, A. A., Benlloch, S., Saunders, E. J., Leongamornlert, D. A., Tymrakiewicz, M., . . . Easton, D. F. (2013). Identification of 23 new prostate cancer susceptibility loci using the iCOGS custom genotyping array. Nat Genet, 45(4), 385-391, 391e381-382.

Egevad, L. (2008). Recent trends in gleason grading of prostate cancer. II. Prognosis, reproducibility and reporting. Anal Quant Cytol Histol, 30(5), 254-260.

Esteller, M. (2011). Non-coding RNAs in human disease. Nat Rev Genet, 12(12), 861-874.

Evenepoel, L., Van Nederveen, F. H., Oudijk, L., Papathomas, T. G., Restuccia, D. F., Belt, E. J., . . . Korpershoek, E. (2015). 9b.09: Identification of Markers Predictive for Malignant Behavior of Pheochromocytomas and Paragangliomas. J Hypertens, 33 Suppl 1, e122.

Feng, J., Bi, C., Clark, B. S., Mady, R., Shah, P., & Kohtz, J. D. (2006). The Evf-2 noncoding RNA is transcribed from the Dlx-5/6 ultraconserved region and functions as a Dlx-2 transcriptional coactivator. Genes Dev, 20(11), 1470-1484.

Filella, X., Fernandez-Galan, E., Fernandez Bonifacio, R., & Foj, L. (2018). Emerging biomarkers in the diagnosis of prostate cancer. Pharmgenomics Pers Med, 11, 83-94.

Filella, X., & Foj, L. (2015). Emerging biomarkers in the detection and prognosis of prostate cancer. Clin Chem Lab Med, 53(7), 963-973.

Fu, X., Ravindranath, L., Tran, N., Petrovics, G., & Srivastava, S. (2006). Regulation of apoptosis by a prostate-specific and prostate cancer-associated noncoding gene, PCGEM1. DNA Cell Biol, 25(3), 135-141.

Gamat, M., & McNeel, D. G. (2017). Androgen deprivation and immunotherapy for the treatment of prostate cancer. Endocr Relat Cancer, 24(12), T297-T310.

Gao, J., Aksoy, B. A., Dogrusoz, U., Dresdner, G., Gross, B., Sumer, S. O., . . . Schultz, N. (2013). Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci Signal, 6(269), pl1.

Gao, N., Zhang, J., Rao, M. A., Case, T. C., Mirosevich, J., Wang, Y., . . . Matusik, R. J. (2003). The role of hepatocyte nuclear factor-3 alpha (Forkhead Box A1) and androgen receptor in transcriptional regulation of prostatic genes. Mol Endocrinol, 17(8), 1484-1507.

Gasi Tandefelt, D., Boormans, J., Hermans, K., & Trapman, J. (2014). ETS fusion genes in prostate cancer. Endocr Relat Cancer, 21(3), R143-152.

Geisler, S., & Coller, J. (2013). RNA in unexpected places: long non-coding RNA functions in diverse cellular contexts. Nat Rev Mol Cell Biol, 14(11), 699-712.

Global Burden of Disease Cancer, C., Fitzmaurice, C., Allen, C., Barber, R. M., Barregard, L., Bhutta, Z. A., . . . Naghavi, M. (2017). Global, Regional, and National Cancer Incidence, Mortality, Years of Life Lost, Years Lived With Disability, and Disability-Adjusted Life-years for 32 Cancer Groups, 1990 to 2015: A Systematic Analysis for the Global Burden of Disease Study. JAMA Oncol, 3(4), 524-548.

Gomez-Skarmeta, J. L., Glavic, A., de la Calle-Mustienes, E., Modolell, J., & Mayor, R. (1998). Xiro, a Xenopus homolog of the Drosophila Iroquois complex genes, controls development at the neural plate. EMBO J, 17(1), 181-190.

Gomez-Skarmeta, J. L., & Modolell, J. (1996). araucan and caupolican provide a link between compartment subdivisions and patterning of sensory organs and veins in the Drosophila wing. Genes Dev, 10(22), 2935-2945.

Gomez-Skarmeta, J. L., & Modolell, J. (2002). Iroquois genes: genomic organization and function in vertebrate neural development. Curr Opin Genet Dev, 12(4), 403-408.

Gordetsky, J., & Epstein, J. (2016). Grading of prostatic adenocarcinoma: current state and prognostic implications. Diagn Pathol, 11, 25.

Goyal, A., Fiskin, E., Gutschner, T., Polycarpou-Schwarz, M., Gross, M., Neugebauer, J., . . . Diederichs, S. (2017). A cautionary tale of sense-antisense gene pairs: independent regulation despite inverse correlation of expression. Nucleic Acids Res.

Grasso, C. S., Wu, Y. M., Robinson, D. R., Cao, X., Dhanasekaran, S. M., Khan, A. P., . . . Tomlins, S. A. (2012). The mutational landscape of lethal castration-resistant prostate cancer. Nature, 487(7406), 239-243.

Green, S. M., Mostaghel, E. A., & Nelson, P. S. (2012). Androgen action and metabolism in prostate cancer. Mol Cell Endocrinol, 360(1-2), 3-13.

Gudmundsson, J., Sulem, P., Manolescu, A., Amundadottir, L. T., Gudbjartsson, D., Helgason, A., . . . Stefansson, K. (2007). Genome-wide association study identifies a second prostate cancer susceptibility variant at 8q24. Nat Genet, 39(5), 631-637.

Guo, H., Ahmed, M., Zhang, F., Yao, C. Q., Li, S., Liang, Y., . . . He, H. H. (2016). Modulation of long noncoding RNAs by risk SNPs underlying genetic predispositions to prostate cancer. Nat Genet, 48(10), 1142-1150.

Guo, X., Liu, W., Pan, Y., Ni, P., Ji, J., Guo, L., . . . Yu, Y. (2010). Homeobox gene IRX1 is a tumor suppressor gene in gastric carcinoma. Oncogene, 29(27), 3908-3920.

Haendeler, J., Mlynek, A., Buchner, N., Lukosz, M., Graf, M., Guettler, C., . . . Altschmied, J. (2013). Two isoforms of Sister-Of-Mammalian Grainyhead have opposing functions in endothelial cells and in vivo. Arterioscler Thromb Vasc Biol, 33(7), 1639-1646.

Hagglof, C., Hammarsten, P., Stromvall, K., Egevad, L., Josefsson, A., Stattin, P., . . . Bergh, A. (2014). TMPRSS2-ERG expression predicts prostate cancer survival and associates with stromal biomarkers. PLoS One, 9(2), e86824.

Haiman, C. A., Chen, G. K., Blot, W. J., Strom, S. S., Berndt, S. I., Kittles, R. A., . . . Henderson, B. E. (2011). Characterizing genetic risk at known prostate cancer susceptibility loci in African Americans. PLoS Genet, 7(5), e1001387.

Hajjari, M., & Salavaty, A. (2015). HOTAIR: an oncogenic long non-coding RNA in different cancers. Cancer Biol Med, 12(1), 1-9.

Haria, D., & Naora, H. (2013). Homeobox Gene Deregulation: Impact on the Hallmarks of Cancer. Cancer Hallm, 1(2-3), 67-76.

Harries, L. W., Perry, J. R., McCullagh, P., & Crundwell, M. (2010). Alterations in LMTK2, MSMB and HNF1B gene expression are associated with the development of prostate cancer. BMC Cancer, 10, 315.

Hazelett, D. J., Rhie, S. K., Gaddis, M., Yan, C., Lakeland, D. L., Coetzee, S. G., . . . Coetzee, G. A. (2014). Comprehensive functional annotation of 77 prostate cancer risk loci. PLoS Genet, 10(1), e1004102.

He, J. H., Zhang, J. Z., Han, Z. P., Wang, L., Lv, Y., & Li, Y. G. (2014). Reciprocal regulation of PCGEM1 and miR-145 promote proliferation of LNCaP prostate cancer cells. J Exp Clin Cancer Res, 33(1), 72.

Helfand, B. T., Catalona, W. J., & Xu, J. (2015). A genetic-based approach to personalized prostate cancer screening and treatment. Curr Opin Urol, 25(1), 53-58.

Hessels, D., Klein Gunnewiek, J. M., van Oort, I., Karthaus, H. F., van Leenders, G. J., van Balken, B., . . . Schalken, J. A. (2003). DD3(PCA3)-based molecular urine analysis for the diagnosis of prostate cancer. Eur Urol, 44(1), 8-15; discussion 15-16.

Hessels, D., & Schalken, J. A. (2013). Urinary biomarkers for prostate cancer: a review. Asian J Androl, 15(3), 333-339.

Hessels, D., van Gils, M. P., van Hooij, O., Jannink, S. A., Witjes, J. A., Verhaegh, G. W., & Schalken, J. A. (2010). Predictive value of PCA3 in urinary sediments in determining clinico-pathological characteristics of prostate cancer. Prostate, 70(1), 10-16.

Hicks, C., Miele, L., Koganti, T., & Vijayakumar, S. (2013). Comprehensive assessment and network analysis of the emerging genetic susceptibility landscape of prostate cancer. Cancer Inform, 12, 175-191.

Hirsch, G. E., Parisi, M. M., Martins, L. A., Andrade, C. M., Barbe-Tuana, F. M., & Guma, F. T. (2015). gamma-oryzanol reduces caveolin-1 and PCGEM1 expression, markers of aggressiveness in prostate cancer cell lines. Prostate.

Ho, Y., & Dehm, S. M. (2017). Androgen Receptor Rearrangement and Splicing Variants in Resistance to Endocrine Therapies in Prostate Cancer. Endocrinology, 158(6), 1533-1542.

Houweling, A. C., Dildrop, R., Peters, T., Mummenhoff, J., Moorman, A. F., Ruther, U., & Christoffels, V. M. (2001). Gene and cluster-specific expression of the Iroquois family members during mouse development. Mech Dev, 107(1-2), 169-174.

Hung, T., & Chang, H. Y. (2010). Long noncoding RNA in genome regulation: prospects and mechanisms. RNA Biol, 7(5), 582-585.

Imam, H., Bano, A. S., Patel, P., Holla, P., & Jameel, S. (2015). The lncRNA NRON modulates HIV-1 replication in a NFAT-dependent manner and is differentially regulated by early and late viral proteins. Sci Rep, 5, 8639.

Ito, Y., Yoshida, H., Motoo, Y., Miyoshi, E., Iovanna, J. L., Tomoda, C., . . . Miyauchi, A. (2003). Expression and cellular localization of p8 protein in thyroid neoplasms. Cancer Lett, 201(2), 237-244.

Iyer, M. K., Niknafs, Y. S., Malik, R., Singhal, U., Sahu, A., Hosono, Y., . . . Chinnaiyan, A. M. (2015). The landscape of long noncoding RNAs in the human transcriptome. Nat Genet, 47(3), 199-208.

Jia, J., Li, F., Tang, X. S., Xu, S., Gao, Y., Shi, Q., . . . Guo, P. (2016). Long noncoding RNA DANCR promotes invasion of prostate cancer through epigenetically silencing expression of TIMP2/3. Oncotarget, 7(25), 37868-37881.

Jiang, C. Y., Gao, Y., Wang, X. J., Ruan, Y., Bei, X. Y., Wang, X. H., . . . Zhao, F. J. (2016). Long non-coding RNA lnc-MX1-1 is associated with poor clinical features and promotes cellular proliferation and invasiveness in prostate cancer. Biochem Biophys Res Commun, 470(3), 721-727.

Jiang, J., Liu, W., Guo, X., Zhang, R., Zhi, Q., Ji, J., . . . Yu, Y. (2011). IRX1 influences peritoneal spreading and metastasis via inhibiting BDKRB2-dependent neovascularization on gastric cancer. Oncogene, 30(44), 4498-4508.

Jiang, W. G., Davies, G., Martin, T. A., Kynaston, H., Mason, M. D., & Fodstad, O. (2006). Com-1/p8 acts as a putative tumour suppressor in prostate cancer. Int J Mol Med, 18(5), 981-986.

Jiang, Z., Wang, Z., Kunej, T., Williams, G. A., Michal, J. J., Wu, X. L., & Magnuson, N. S. (2007). A novel type of sequence variation: multiple-nucleotide length polymorphisms discovered in the bovine genome. Genetics, 176(1), 403-407.

Jin, H. J., Zhao, J. C., Ogden, I., Bergan, R. C., & Yu, J. (2013). Androgen receptor-independent function of FoxA1 in prostate cancer metastasis. Cancer Res, 73(12), 3725-3736.

Kadota, M., Sato, M., Duncan, B., Ooshima, A., Yang, H. H., Diaz-Meyer, N., . . . Lee, M. P. (2009). Identification of novel gene amplifications in breast cancer and coexistence of gene amplification with an activating mutation of PIK3CA. Cancer Res, 69(18), 7357-7365.

Kamalakaran, S., Varadan, V., Giercksky Russnes, H. E., Levy, D., Kendall, J., Janevski, A., . . . Hicks, J. B. (2011). DNA methylation patterns in luminal breast cancers differ from non-luminal subtypes and can identify relapse risk independent of other clinical variables. Mol Oncol, 5(1), 77-92.

Kang, H., Wilson, C. S., Harvey, R. C., Chen, I. M., Murphy, M. H., Atlas, S. R., . . . Willman, C. L. (2012). Gene expression profiles predictive of outcome and age in infant acute lymphoblastic leukemia: a Children's Oncology Group study. Blood, 119(8), 1872-1881.

Kanhere, A., Viiri, K., Araujo, C. C., Rasaiyaah, J., Bouwman, R. D., Whyte, W. A., . . . Jenner, R. G. (2010). Short RNAs are transcribed from repressed polycomb target genes and interact with polycomb repressive complex-2. Mol Cell, 38(5), 675-688.

Karlsson, J., Lilljebjorn, H., Holmquist Mengelbier, L., Valind, A., Rissler, M., Ora, I., . . . Gisselsson, D. (2015). Activation of human telomerase reverse transcriptase through gene fusion in clear cell sarcoma of the kidney. Cancer Lett, 357(2), 498-501.

Kim, E. D., & Sung, S. (2012). Long noncoding RNA: unveiling hidden layer of gene regulatory networks. Trends Plant Sci, 17(1), 16-21.

Kim, K. H., Rosen, A., Bruneau, B. G., Hui, C. C., & Backx, P. H. (2012). Iroquois homeodomain transcription factors in heart development and function. Circ Res, 110(11), 1513-1524.

Kino, T., Hurt, D. E., Ichijo, T., Nader, N., & Chrousos, G. P. (2010). Noncoding RNA gas5 is a growth arrest- and starvation-associated repressor of the glucocorticoid receptor. Sci Signal, 3(107), ra8.

Klinge, C. M. (2018). Non-coding RNAs: long non-coding RNAs and microRNAs in endocrine-related cancers. Endocr Relat Cancer, 25(4), R259-R282.

Knudsen, K. E., & Kelly, W. K. (2011). Outsmarting androgen receptor: creative approaches for targeting aberrant androgen signaling in advanced prostate cancer. Expert Rev Endocrinol Metab, 6(3), 483-493.

Koptyra, M., Gupta, S., Talati, P., & Nevalainen, M. T. (2011). Signal transducer and activator of transcription 5a/b: biomarker and therapeutic target in prostate and breast cancer. Int J Biochem Cell Biol, 43(10), 1417-1421.

Kotake, Y., Nakagawa, T., Kitagawa, K., Suzuki, S., Liu, N., Kitagawa, M., & Xiong, Y. (2011). Long non-coding RNA ANRIL is required for the PRC2 recruitment to and silencing of p15(INK4B) tumor suppressor gene. Oncogene, 30(16), 1956-1962.

Kuhn, A., Loscher, D., & Marschalek, R. (2016). The IRX1/HOXA connection: insights into a novel t(4;11)-specific cancer mechanism. Oncotarget, 7(23), 35341-35352.

Kumar-Sinha, C., Tomlins, S. A., & Chinnaiyan, A. M. (2008). Recurrent gene fusions in prostate cancer. Nat Rev Cancer, 8(7), 497-511.

Laner, T., Schulz, W. A., Engers, R., Muller, M., & Florl, A. R. (2005). Hypomethylation of the XIST gene promoter in prostate cancer. Oncol Res, 15(5), 257-264.

Lee, B., Mazar, J., Aftab, M. N., Qi, F., Shelley, J., Li, J. L., . . . Perera, R. J. (2014). Long noncoding RNAs as putative biomarkers for prostate cancer detection. J Mol Diagn, 16(6), 615-626.

Lee, J. S., Oum, B. S., & Lee, S. H. (2001). Mitomycin c influence on inhibition of cellular proliferation and subsequent synthesis of type I collagen and laminin in primary and recurrent pterygia. Ophthalmic Res, 33(3), 140-146.

Lee, J. T. (2012). Epigenetic regulation by long noncoding RNAs. Science, 338(6113), 1435-1439.

Lee, Y. Y., Wu, W. J., Huang, C. N., Li, C. C., Li, W. M., Yeh, B. W., . . . Li, C. F. (2016). CSF2 Overexpression Is Associated with STAT5 Phosphorylation and Poor Prognosis in Patients with Urothelial Carcinoma. J Cancer, 7(6), 711-721.

Levrier, C., Sadowski, M. C., Rockstroh, A., Gabrielli, B., Kavallaris, M., Lehman, M., . . . Nelson, C. C. (2017). 6alpha-Acetoxyanopterine: A Novel Structure Class of Mitotic Inhibitor Disrupting Microtubule Dynamics in Prostate Cancer Cells. Mol Cancer Ther, 16(1), 3-15.

Leygue, E., Dotzlaw, H., Watson, P. H., & Murphy, L. C. (1999). Expression of the steroid receptor RNA activator in human breast tumors. Cancer Res, 59(17), 4190-4193.

Li, J., & Wang, Z. (2016). The pathology of unusual subtypes of prostate cancer. Chin J Cancer Res, 28(1), 130-143.

Li, J., Xuan, Z., & Liu, C. (2013). Long non-coding RNAs and complex human diseases. Int J Mol Sci, 14(9), 18790-18808.

Li, S., Chen, W., Kang, X., Han, R., Sun, G., & Huang, Y. (2013). Distinct tissue expression profiles of chicken Lpin1-alpha/beta isoforms and the effect of the variation on muscle fiber traits. Gene, 515(2), 281-290.

Li, Y. M., Xu, S. C., Li, J., Han, K. Q., Pi, H. F., Zheng, L., . . . Liang, P. (2013). Epithelial-mesenchymal transition markers expressed in circulating tumor cells in hepatocellular carcinoma patients with different stages of disease. Cell Death Dis, 4, e831.

Liao, R. S., Ma, S., Miao, L., Li, R., Yin, Y., & Raj, G. V. (2013). Androgen receptor-mediated non-genomic regulation of prostate cancer cell proliferation. Transl Androl Urol, 2(3), 187-196.

Lindstrom, S., Schumacher, F. R., Campa, D., Albanes, D., Andriole, G., Berndt, S. I., . . . Kraft, P. (2012). Replication of five prostate cancer loci identified in an Asian population--results from the NCI Breast and Prostate Cancer Cohort Consortium (BPC3). Cancer Epidemiol Biomarkers Prev, 21(1), 212-216.

Liu, C., Bai, B., Skogerbo, G., Cai, L., Deng, W., Zhang, Y., . . . Chen, R. (2005). NONCODE: an integrated knowledge database of non-coding RNAs. Nucleic Acids Res, 33(Database issue), D112-115.

Liu, H. T., Fang, L., Cheng, Y. X., & Sun, Q. (2016). LncRNA PVT1 regulates prostate cancer cell growth by inducing the methylation of miR-146a. Cancer Med, 5(12), 3512-3519.

Liu, M., Shi, X., Yang, F., Wang, J., Xu, Y., Wei, D., . . . Yang, Z. (2016). The Cumulative Effect of Gene-Gene and Gene-Environment Interactions on the Risk of Prostate Cancer in Chinese Men. Int J Environ Res Public Health, 13(2), 162.

Liu, P., Ramachandran, S., Ali Seyed, M., Scharer, C. D., Laycock, N., Dalton, W. B., . . . Moreno, C. S. (2006). Sex-determining region Y box 4 is a transforming oncogene in human prostate cancer cells. Cancer Res, 66(8), 4011-4019.

Liu, T., Zhang, X., Yang, Y. M., Du, L. T., & Wang, C. X. (2016). Increased expression of the long noncoding RNA CRNDE-h indicates a poor prognosis in colorectal cancer, and is positively correlated with IRX5 mRNA expression. Onco Targets Ther, 9, 1437-1448.

Liu, T., Zhou, W., Cai, B., Chu, J., Shi, G., Teng, H., . . . Wang, Y. (2015). IRX2-mediated upregulation of MMP-9 and VEGF in a PI3K/AKT-dependent manner. Mol Med Rep, 12(3), 4346-4351.

Liu, T., Zhou, W., Zhang, F., Shi, G., Teng, H., Xiao, J., & Wang, Y. (2014). Knockdown of IRX2 inhibits osteosarcoma cell proliferation and invasion by the AKT/MMP9 signaling pathway. Mol Med Rep, 10(1), 169-174.

Long, Q. Z., Du, Y. F., Ding, X. Y., Li, X., Song, W. B., Yang, Y., . . . Liu, X. G. (2012). Replication and fine mapping for association of the C2orf43, FOXP4, GPRC6A and RFX6 genes with prostate cancer in the Chinese population. PLoS One, 7(5), e37866.

Lu, J., Song, G., Tang, Q., Zou, C., Han, F., Zhao, Z., . . . Wang, J. (2015). IRX1 hypomethylation promotes osteosarcoma metastasis via induction of CXCL14/NF-kappaB signaling. J Clin Invest, 125(5), 1839-1856.

Lu, J., & Wang, J. (2015). IRX1 hypomethylation in osteosarcoma metastasis. Oncotarget, 6(19), 16802-16803.

Lu, Y., Hu, Z., Mangala, L. S., Stine, Z. E., Hu, X., Jiang, D., . . . Dang, C. V. (2018). MYC Targeted Long Noncoding RNA DANCR Promotes Cancer in Part by Reducing p21 Levels. Cancer Res, 78(1), 64-74.

Luo, G., Wang, M., Wu, X., Tao, D., Xiao, X., Wang, L., . . . Jiang, G. (2015). Long Non-Coding RNA MEG3 Inhibits Cell Proliferation and Induces Apoptosis in Prostate Cancer. Cell Physiol Biochem, 37(6), 2209-2220.

Mahoney, S. E., Yao, Z., Keyes, C. C., Tapscott, S. J., & Diede, S. J. (2012). Genome-wide DNA methylation studies suggest distinct DNA methylation patterns in pediatric embryonal and alveolar rhabdomyosarcomas. Epigenetics, 7(4), 400-408.

Mak, P., Leav, I., Pursell, B., Bae, D., Yang, X., Taglienti, C. A., . . . Mercurio, A. M. (2010). ERbeta impedes prostate cancer EMT by destabilizing HIF-1alpha and inhibiting VEGF-mediated snail nuclear localization: implications for Gleason grading. Cancer Cell, 17(4), 319-332.

Malik, R., Patel, L., Prensner, J. R., Shi, Y., Iyer, M. K., Subramaniyan, S., . . . Chinnaiyan, A. M. (2014). The lncRNA PCAT29 inhibits oncogenic phenotypes in prostate cancer. Mol Cancer Res, 12(8), 1081-1087.

Marks, L. S., Fradet, Y., Deras, I. L., Blase, A., Mathis, J., Aubin, S. M., . . . Groskopf, J. (2007). PCA3 molecular urine assay for prostate cancer in men undergoing repeat biopsy. Urology, 69(3), 532-535.

Marra, A. N., & Wingert, R. A. (2014). Roles of Iroquois Transcription Factors in Kidney Development. Cell Dev Biol, 3(1), 1000131.

Martignano, F., Rossi, L., Maugeri, A., Galla, V., Conteduca, V., De Giorgi, U., . . . Schepisi, G. (2017). Urinary RNA-based biomarkers for prostate cancer detection. Clin Chim Acta, 473, 96-105.

Martorell, O., Barriga, F. M., Merlos-Suarez, A., Stephan-Otto Attolini, C., Casanova, J., Batlle, E., . . . Casali, A. (2014). Iro/IRX transcription factors negatively regulate Dpp/TGF-beta pathway activity during intestinal tumorigenesis. EMBO Rep, 15(11), 1210-1218.

Mateo, J., Carreira, S., Sandhu, S., Miranda, S., Mossop, H., Perez-Lopez, R., . . . de Bono, J. S. (2015). DNA-Repair Defects and Olaparib in Metastatic Prostate Cancer. N Engl J Med, 373(18), 1697-1708.

Matin, F., Jeet, V., Clements, J. A., Yousef, G. M., & Batra, J. (2016). MicroRNA Theranostics in Prostate Cancer Precision Medicine. Clin Chem, 62(10), 1318-1333.

Matsumoto, A., Pasut, A., Matsumoto, M., Yamashita, R., Fung, J., Monteleone, E., . . . Pandolfi, P. P. (2017). mTORC1 and muscle regeneration are regulated by the LINC00961-encoded SPAR polypeptide. Nature, 541(7636), 228-232.

Matsumoto, K., Nishihara, S., Kamimura, M., Shiraishi, T., Otoguro, T., Uehara, M., . . . Ogura, T. (2004). The prepattern transcription factor Irx2, a target of the FGF8/MAP kinase cascade, is involved in cerebellum formation. Nat Neurosci, 7(6), 605-612.

Mattick, J. S. (2011). The central role of RNA in human development and cognition. FEBS Lett, 585(11), 1600-1616.

Meerbrey, K. L., Hu, G., Kessler, J. D., Roarty, K., Li, M. Z., Fang, J. E., . . . Elledge, S. J. (2011). The pINDUCER lentiviral toolkit for inducible RNA interference in vitro and in vivo. Proc Natl Acad Sci U S A, 108(9), 3665-3670.

Megias-Vericat, J. E., Montesinos, P., Herrero, M. J., Moscardo, F., Boso, V., Martinez-Cuadron, D., . . . Alino, S. F. (2017). Impact of novel polymorphisms related to cytotoxicity of cytarabine in the induction treatment of acute myeloid leukemia. Pharmacogenet Genomics, 27(7), 270-274.

Mehra, R., Shi, Y., Udager, A. M., Prensner, J. R., Sahu, A., Iyer, M. K., . . . Chinnaiyan, A. M. (2014). A novel RNA in situ hybridization assay for the long noncoding RNA SChLAP1 predicts poor clinical outcome after radical prostatectomy in clinically localized prostate cancer. Neoplasia, 16(12), 1121-1127.

Mehra, R., Udager, A. M., Ahearn, T. U., Cao, X., Feng, F. Y., Loda, M., . . . Chinnaiyan, A. M. (2016). Overexpression of the Long Non-coding RNA SChLAP1 Independently Predicts Lethal Prostate Cancer. Eur Urol, 70(4), 549-552.

Mei, S., Qin, Q., Wu, Q., Sun, H., Zheng, R., Zang, C., . . . Liu, X. S. (2017). Cistrome Data Browser: a data portal for ChIP-Seq and chromatin accessibility data in human and mouse. Nucleic Acids Res, 45(D1), D658-D662.

Mengelbier, L. H., Karlsson, J., Lindgren, D., Ora, I., Isaksson, M., Frigyesi, I., . . . Gisselsson, D. (2010). Deletions of 16q in Wilms tumors localize to blastemal-anaplastic cells and are associated with reduced expression of the IRXB renal tubulogenesis gene cluster. Am J Pathol, 177(5), 2609-2621.

Mertens-Walker, I., Fernandini, B. C., Maharaj, M. S., Rockstroh, A., Nelson, C. C., Herington, A. C., & Stephenson, S. A. (2015). The tumour-promoting receptor kinase, EphB4, regulates expression of integrin-beta8 in prostate cancer cells. BMC Cancer, 15, 164.

Misawa, A., Takayama, K., Urano, T., & Inoue, S. (2016). Androgen-induced Long Noncoding RNA (lncRNA) SOCS2-AS1 Promotes Cell Growth and Inhibits Apoptosis in Prostate Cancer Cells. J Biol Chem, 291(34), 17861-17880.

Misawa, A., Takayama, K. I., & Inoue, S. (2017). Long non-coding RNAs and prostate cancer. Cancer Sci, 108(11), 2107-2114.

Mitsiades, N. (2013). A road map to comprehensive androgen receptor axis targeting for castration-resistant prostate cancer. Cancer Res, 73(15), 4599-4605.

Molck, M. C., Simioni, M., Paiva Vieira, T., Sgardioli, I. C., Paoli Monteiro, F., Souza, J., . . . Gil-da-Silva-Lopes, V. L. (2017). Genomic imbalances in syndromic congenital heart disease. J Pediatr (Rio J), 93(5), 497-507.

Montanari, M., Rossetti, S., Cavaliere, C., D'Aniello, C., Malzone, M. G., Vanacore, D., . . . Facchini, G. (2017). Epithelial-mesenchymal transition in prostate cancer: an overview. Oncotarget, 8(21), 35376-35389.

Morey, S. R., Smiraglia, D. J., James, S. R., Yu, J., Moser, M. T., Foster, B. A., & Karpf, A. R. (2006). DNA methylation pathway alterations in an autochthonous murine model of prostate cancer. Cancer Res, 66(24), 11659-11667.

Mourtada-Maarabouni, M., Pickard, M. R., Hedge, V. L., Farzaneh, F., & Williams, G. T. (2009). GAS5, a non-protein-coding RNA, controls apoptosis and is downregulated in breast cancer. Oncogene, 28(2), 195-208.

Mucci, L. A., Hjelmborg, J. B., Harris, J. R., Czene, K., Havelick, D. J., Scheike, T., . . . Nordic Twin Study of Cancer (NorTwinCan) Collaboration. (2016). Familial Risk and Heritability of Cancer Among Twins in Nordic Countries. JAMA, 315(1), 68-76.

Myrthue, A., Rademacher, B. L., Pittsenbarger, J., Kutyba-Brooks, B., Gantner, M., Qian, D. Z., & Beer, T. M. (2008). The iroquois homeobox gene 5 is regulated by 1,25-dihydroxyvitamin D3 in human prostate cancer and regulates apoptosis and the cell cycle in LNCaP prostate cancer cells. Clin Cancer Res, 14(11), 3562-3570.

Nelson, B. R., Makarewich, C. A., Anderson, D. M., Winders, B. R., Troupes, C. D., Wu, F., . . . Olson, E. N. (2016). A peptide encoded by a transcript annotated as long noncoding RNA enhances SERCA activity in muscle. Science, 351(6270), 271-275.

Nelson, D. O., Lalit, P. A., Biermann, M., Markandeya, Y. S., Capes, D. L., Addesso, L., . . . Lyons, G. E. (2016). Irx4 Marks a Multipotent, Ventricular-Specific Progenitor Cell. Stem Cells, 34(12), 2875-2888.

Nguyen, H. H., Takata, R., Akamatsu, S., Shigemizu, D., Tsunoda, T., Furihata, M., . . . Nakagawa, H. (2012). IRX4 at 5p15 suppresses prostate cancer growth through the interaction with vitamin D receptor, conferring prostate cancer susceptibility. Hum Mol Genet, 21(9), 2076-2085.

Nie, L., Wu, H. J., Hsu, J. M., Chang, S. S., Labaff, A. M., Li, C. W., . . . Hung, M. C. (2012). Long non-coding RNAs: versatile master regulators of gene expression and crucial players in cancer. Am J Transl Res, 4(2), 127-150.

Niskanen, E. A., Malinen, M., Sutinen, P., Toropainen, S., Paakinaho, V., Vihervaara, A., . . . Palvimo, J. J. (2015). Global SUMOylation on active chromatin is an acute heat stress response restricting transcription. Genome Biol, 16, 153.

Orfanelli, U., Jachetti, E., Chiacchiera, F., Grioni, M., Brambilla, P., Briganti, A., . . . Lavorgna, G. (2015). Antisense transcription at the TRPM2 locus as a novel prognostic marker and therapeutic target in prostate cancer. Oncogene, 34(16), 2094-2102.

Orom, U. A., Derrien, T., Beringer, M., Gumireddy, K., Gardini, A., Bussotti, G., . . . Shiekhattar, R. (2010). Long noncoding RNAs with enhancer-like function in human cells. Cell, 143(1), 46-58.

Page, B. D., Khoury, H., Laister, R. C., Fletcher, S., Vellozo, M., Manzoli, A., . . . Gunning, P. T. (2012). Small molecule STAT5-SH2 domain inhibitors exhibit potent antileukemia activity. J Med Chem, 55(3), 1047-1055.

Palazzo, A. F., & Lee, E. S. (2015). Non-coding RNA: what is functional and what is junk? Front Genet, 6, 2.

Paltoglou, S., Das, R., Townley, S. L., Hickey, T. E., Tarulli, G. A., Coutinho, I., . . . Selth, L. A. (2017). Novel Androgen Receptor Coregulator GRHL2 Exerts Both Oncogenic and Antimetastatic Functions in Prostate Cancer. Cancer Res, 77(13), 3417-3430.

Pelechano, V., & Steinmetz, L. M. (2013). Gene regulation by antisense transcription. Nat Rev Genet, 14(12), 880-893.

Penney, K. L., Pettersson, A., Shui, I. M., Graff, R. E., Kraft, P., Lis, R. T., . . . Mucci, L. A. (2016). Association of Prostate Cancer Risk Variants with TMPRSS2:ERG Status: Evidence for Distinct Molecular Subtypes. Cancer Epidemiol Biomarkers Prev, 25(5), 745-749.

Perner, S., Mosquera, J. M., Demichelis, F., Hofer, M. D., Paris, P. L., Simko, J., . . . Rubin, M. A. (2007). TMPRSS2-ERG fusion prostate cancer: an early molecular event associated with invasion. Am J Surg Pathol, 31(6), 882-888.

Perron, U., Provero, P., & Molineris, I. (2017). In silico prediction of lncRNA function using tissue specific and evolutionary conserved expression. BMC Bioinformatics, 18(Suppl 5), 144.

Petrovics, G., Zhang, W., Makarem, M., Street, J. P., Connelly, R., Sun, L., . . . Srivastava, S. (2004). Elevated expression of PCGEM1, a prostate-specific gene with cell growth-promoting function, is associated with high-risk prostate cancer patients. Oncogene, 23(2), 605-611.

Pickard, M. R., Mourtada-Maarabouni, M., & Williams, G. T. (2013). Long non-coding RNA GAS5 regulates apoptosis in prostate cancer cell lines. Biochim Biophys Acta, 1832(10), 1613-1623.

Poliseno, L., Salmena, L., Zhang, J., Carver, B., Haveman, W. J., & Pandolfi, P. P. (2010). A coding-independent function of gene and pseudogene mRNAs regulates tumour biology. Nature, 465(7301), 1033-1038.

Ponting, C. P., Oliver, P. L., & Reik, W. (2009). Evolution and functions of long noncoding RNAs. Cell, 136(4), 629-641.

Prensner, J. R., Chen, W., Han, S., Iyer, M. K., Cao, Q., Kothari, V., . . . Feng, F. Y. (2014). The long non-coding RNA PCAT-1 promotes prostate cancer cell proliferation through cMyc. Neoplasia, 16(11), 900-908.

Prensner, J. R., Chen, W., Iyer, M. K., Cao, Q., Ma, T., Han, S., . . . Feng, F. Y. (2014). PCAT-1, a long noncoding RNA, regulates BRCA2 and controls homologous recombination in cancer. Cancer Res, 74(6), 1651-1660.

Prensner, J. R., & Chinnaiyan, A. M. (2011). The emergence of lncRNAs in cancer biology. Cancer Discov, 1(5), 391-407.

Prensner, J. R., Iyer, M. K., Balbin, O. A., Dhanasekaran, S. M., Cao, Q., Brenner, J. C., . . . Chinnaiyan, A. M. (2011). Transcriptome sequencing across a prostate cancer cohort identifies PCAT-1, an unannotated lincRNA implicated in disease progression. Nat Biotechnol, 29(8), 742-749.

Prensner, J. R., Iyer, M. K., Sahu, A., Asangani, I. A., Cao, Q., Patel, L., . . . Chinnaiyan, A. M. (2013). The long noncoding RNA SChLAP1 promotes aggressive prostate cancer and antagonizes the SWI/SNF complex. Nat Genet, 45(11), 1392-1398.

Prensner, J. R., Rubin, M. A., Wei, J. T., & Chinnaiyan, A. M. (2012). Beyond PSA: the next generation of prostate cancer biomarkers. Sci Transl Med, 4(127), 127rv123.

Prensner, J. R., Sahu, A., Iyer, M. K., Malik, R., Chandler, B., Asangani, I. A., . . . Chinnaiyan, A. M. (2014). The lncRNAs PCGEM1 and PRNCR1 are not implicated in castration resistant prostate cancer. Oncotarget, 5(6), 1434-1438.

Qi, J., Tian, L., Chen, Z., Wang, L., Tao, S., Gu, X., . . . Sun, J. (2013). Genetic variants in 2q31 and 5p15 are associated with aggressive benign prostatic hyperplasia in a Chinese population. Prostate, 73(11), 1182-1190.

Qiao, H. P., Gao, W. S., Huo, J. X., & Yang, Z. S. (2013). Long non-coding RNA GAS5 functions as a tumor suppressor in renal cell carcinoma. Asian Pac J Cancer Prev, 14(2), 1077-1082.

Qiu, M., Bao, W., Wang, J., Yang, T., He, X., Liao, Y., & Wan, X. (2014). FOXA1 promotes tumor cell proliferation through AR involving the Notch pathway in endometrial cancer. BMC Cancer, 14, 78.

Qu, M., Ren, S. C., & Sun, Y. H. (2014). Current early diagnostic biomarkers of prostate cancer. Asian J Androl, 16(4), 549-554.

Rawlinson, A., Mohammed, A., Miller, M., & Kunkler, R. (2012). The role of enzalutamide in the treatment of castration-resistant prostate cancer. Future Oncol, 8(9), 1073-1081.

Ree, A. H., Pacheco, M. M., Tvermyr, M., Fodstad, O., & Brentani, M. M. (2000). Expression of a novel factor, com1, in early tumor progression of breast cancer. Clin Cancer Res, 6(5), 1778-1783.

Ren, S., Liu, Y., Xu, W., Sun, Y., Lu, J., Wang, F., . . . Sun, Y. (2013). Long noncoding RNA MALAT-1 is a new potential therapeutic target for castration resistant prostate cancer. J Urol, 190(6), 2278-2287.

Rhodes, D. R., Yu, J., Shanker, K., Deshpande, N., Varambally, R., Ghosh, D., . . . Chinnaiyan, A. M. (2004). ONCOMINE: a cancer microarray database and integrated data-mining platform. Neoplasia, 6(1), 1-6.

Rinn, J. L., & Chang, H. Y. (2012). Genome regulation by long noncoding RNAs. Annu Rev Biochem, 81, 145-166.

Rinn, J. L., Kertesz, M., Wang, J. K., Squazzo, S. L., Xu, X., Brugmann, S. A., . . . Chang, H. Y. (2007). Functional demarcation of active and silent chromatin domains in human HOX loci by noncoding RNAs. Cell, 129(7), 1311-1323.

Rodrigues, D. N., Butler, L. M., Estelles, D. L., & de Bono, J. S. (2014). Molecular pathology and prostate cancer therapeutics: from biology to bedside. J Pathol, 232(2), 178-184.

Sabates-Bellver, J., Van der Flier, L. G., de Palo, M., Cattaneo, E., Maake, C., Rehrauer, H., . . . Marra, G. (2007). Transcriptome profile of human colorectal adenomas. Mol Cancer Res, 5(12), 1263-1275.

Sahu, B., Laakso, M., Pihlajamaa, P., Ovaska, K., Sinielnikov, I., Hautaniemi, S., & Janne, O. A. (2013). FoxA1 specifies unique androgen and glucocorticoid receptor binding events in prostate cancer cells. Cancer Res, 73(5), 1570-1580.

Sakurai, K., Reon, B. J., Anaya, J., & Dutta, A. (2015). The lncRNA DRAIC/PCAT29 Locus Constitutes a Tumor-Suppressive Nexus. Mol Cancer Res, 13(5), 828-838.

Salagierski, M., & Schalken, J. A. (2010). PCA3 and TMPRSS2-ERG: Promising Biomarkers in Prostate Cancer Diagnosis. Cancers (Basel), 2(3), 1432-1440.

Samuel, S., & Naora, H. (2005). Homeobox gene expression in cancer: insights from developmental regulation and deregulation. Eur J Cancer, 41(16), 2428-2437.

Sartori, D. A., & Chan, D. W. (2014). Biomarkers in prostate cancer: what's new? Curr Opin Oncol, 26(3), 259-264.

Schiewer, M. J., Augello, M. A., & Knudsen, K. E. (2012). The AR dependent cell cycle: mechanisms and cancer relevance. Mol Cell Endocrinol, 352(1-2), 34-45.

Schmitt, A. M., & Chang, H. Y. (2016). Long Noncoding RNAs in Cancer Pathways. Cancer Cell, 29(4), 452-463.

Schumacher, F. R., Al Olama, A. A., Berndt, S. I., Benlloch, S., Ahmed, M., Saunders, E. J., . . . Eeles, R. A. (2018). Association analyses of more than 140,000 men identify 63 new prostate cancer susceptibility loci. Nat Genet.

Sedarsky, J., Degon, M., Srivastava, S., & Dobi, A. (2017). Ethnicity and ERG frequency in prostate cancer. Nat Rev Urol.

Shah, N., & Sukumar, S. (2010). The Hox genes and their roles in oncogenesis. Nat Rev Cancer, 10(5), 361-371.

Shukla, S., Zhang, X., Niknafs, Y. S., Xiao, L., Mehra, R., Cieslik, M., . . . Malik, R. (2016). Identification and Validation of PCAT14 as Prognostic Biomarker in Prostate Cancer. Neoplasia, 18(8), 489-499.

Smemo, S., Tena, J. J., Kim, K. H., Gamazon, E. R., Sakabe, N. J., Gomez-Marin, C., . . . Nobrega, M. A. (2014). Obesity-associated variants within FTO form long-range functional connections with IRX3. Nature, 507(7492), 371-375.

Smolle, M. A., Bauernhofer, T., Pummer, K., Calin, G. A., & Pichler, M. (2017). Current Insights into Long Non-Coding RNAs (LncRNAs) in Prostate Cancer. Int J Mol Sci, 18(2).

Song, M. A., Park, J. H., Jeong, K. S., Park, D. S., Kang, M. S., & Lee, S. (2007). Quantification of CpG methylation at the 5'-region of XIST by pyrosequencing from human serum. Electrophoresis, 28(14), 2379-2384.

Srikantan, V., Zou, Z., Petrovics, G., Xu, L., Augustus, M., Davis, L., . . . Srivastava, S. (2000). PCGEM1, a prostate-specific gene, is overexpressed in prostate cancer. Proc Natl Acad Sci U S A, 97(22), 12216-12221.

Srinivasan, S., Clements, J. A., & Batra, J. (2016). Single nucleotide polymorphisms in clinics: Fantasy or reality for cancer? Crit Rev Clin Lab Sci, 53(1), 29-39.

St John, J., Powell, K., Conley-Lacomb, M. K., & Chinni, S. R. (2012). TMPRSS2-ERG Fusion Gene Expression in Prostate Tumor Cells and Its Clinical and Biological Significance in Prostate Cancer Progression. J Cancer Sci Ther, 4(4), 94-101.

Stelloo, S., Nevedomskaya, E., Kim, Y., Hoekman, L., Bleijerveld, O. B., Mirza, T., . . . Zwart, W. (2017). Endogenous androgen receptor proteomic profiling reveals genomic subcomplex involved in prostate tumorigenesis. Oncogene.

Su, S. B., Motoo, Y., Iovanna, J. L., Berthezene, P., Xie, M. J., Mouri, H., . . . Sawabu, N. (2001). Overexpression of p8 is inversely correlated with apoptosis in pancreatic cancer. Clin Cancer Res, 7(5), 1320-1324.

Sullivan, J., Kopp, R., Stratton, K., Manschreck, C., Corines, M., Rau-Murthy, R., . . . Klein, R. J. (2015). An analysis of the association between prostate cancer risk loci, PSA levels, disease aggressiveness and disease-specific mortality. Br J Cancer, 113(1), 166-172.

Sun, M., Geng, D., Li, S., Chen, Z., & Zhao, W. (2017). LncRNA PART1 modulates Toll-like receptor pathway to influence cell proliferation and apoptosis in prostate cancer cells. Biol Chem.

Swami, M. (2010). Small RNAs: Pseudogenes act as microRNA decoys. Nat Rev Genet, 11(8), 530-531.

Taft, R. J., Pheasant, M., & Mattick, J. S. (2007). The relationship between non-protein-coding DNA and eukaryotic complexity. Bioessays, 29(3), 288-299.

Tagai, E. K., Miller, S. M., Kutikov, A., Diefenbach, M. A., Gor, R. A., Al-Saleem, T., . . . Roy, G. (2018). Prostate Cancer Patients' Understanding of the Gleason Scoring System: Implications for Shared Decision-Making. J Cancer Educ.

Takata, R., Akamatsu, S., Kubo, M., Takahashi, A., Hosono, N., Kawaguchi, T., . . . Nakagawa, H. (2010). Genome-wide association study identifies five new susceptibility loci for prostate cancer in the Japanese population. Nat Genet, 42(9), 751-754.

Takayama, K., Horie-Inoue, K., Katayama, S., Suzuki, T., Tsutsumi, S., Ikeda, K., . . . Inoue, S. (2013). Androgen-responsive long noncoding RNA CTBP1-AS promotes prostate cancer. EMBO J, 32(12), 1665-1680.

Tan, M. H., Li, J., Xu, H. E., Melcher, K., & Yong, E. L. (2015). Androgen receptor: structure, role in prostate cancer and drug discovery. Acta Pharmacol Sin, 36(1), 3-23.

Tang, Y., Axelsson, A. S., Spegel, P., Andersson, L. E., Mulder, H., Groop, L. C., . . . Rosengren, A. H. (2014). Genotype-based treatment of type 2 diabetes with an alpha2A-adrenergic receptor antagonist. Sci Transl Med, 6(257), 257ra139.

Taniguchi, Y. (2014). Hox transcription factors: modulators of cell-cell and cell-extracellular matrix adhesion. Biomed Res Int, 2014, 591374.

Taylor, B. S., Schultz, N., Hieronymus, H., Gopalan, A., Xiao, Y., Carver, B. S., . . . Gerald, W. L. (2010). Integrative genomic profiling of human prostate cancer. Cancer Cell, 18(1), 11-22.

Tena, J. J., Alonso, M. E., de la Calle-Mustienes, E., Splinter, E., de Laat, W., Manzanares, M., & Gomez-Skarmeta, J. L. (2011). An evolutionarily conserved three-dimensional structure in the vertebrate Irx clusters facilitates enhancer sharing and coregulation. Nat Commun, 2, 310.

Thomas, G., Jacobs, K. B., Yeager, M., Kraft, P., Wacholder, S., Orr, N., . . . Chanock, S. J. (2008). Multiple loci identified in a genome-wide association study of prostate cancer. Nat Genet, 40(3), 310-315.

Tomlins, S. A., Laxman, B., Varambally, S., Cao, X., Yu, J., Helgeson, B. E., . . . Chinnaiyan, A. M. (2008). Role of the TMPRSS2-ERG gene fusion in prostate cancer. Neoplasia, 10(2), 177-188.

Tomlins, S. A., Rhodes, D. R., Perner, S., Dhanasekaran, S. M., Mehra, R., Sun, X. W., . . . Chinnaiyan, A. M. (2005). Recurrent fusion of TMPRSS2 and ETS transcription factor genes in prostate cancer. Science, 310(5748), 644-648.

Toropainen, S., Malinen, M., Kaikkonen, S., Rytinki, M., Jaaskelainen, T., Sahu, B., . . . Palvimo, J. J. (2015). SUMO ligase PIAS1 functions as a target gene selective androgen receptor coregulator on prostate cancer cell chromatin. Nucleic Acids Res, 43(2), 848-861.

Tripathi, V., Ellis, J. D., Shen, Z., Song, D. Y., Pan, Q., Watt, A. T., . . . Prasanth, K. V. (2010). The nuclear-retained noncoding RNA MALAT1 regulates alternative splicing by modulating SR splicing factor phosphorylation. Mol Cell, 39(6), 925-938.

Tsunoda, T., & Takagi, T. (1999). Estimating transcription factor bindability on DNA. Bioinformatics, 15(7-8), 622-630.

Vaarala, M. H., Hirvikoski, P., Kauppila, S., & Paavonen, T. K. (2012). Identification of androgen-regulated genes in human prostate. Mol Med Rep, 6(3), 466-472.

van Bemmel, J. G., Mira-Bontenbal, H., & Gribnau, J. (2016). Cis- and trans-regulation in X inactivation. Chromosoma, 125(1), 41-50.

van Gils, M. P., Hessels, D., Hulsbergen-van de Kaa, C. A., Witjes, J. A., Jansen, C. F., Mulders, P. F., . . . Schalken, J. A. (2008). Detailed analysis of histopathological parameters in radical prostatectomy specimens and PCA3 urine test results. Prostate, 68(11), 1215-1222.

van Gils, M. P., Hessels, D., van Hooij, O., Jannink, S. A., Peelen, W. P., Hanssen, S. L., . . . Schalken, J. A. (2007). The time-resolved fluorescence-based PCA3 test on urinary sediments after digital rectal examination; a Dutch multicenter validation of the diagnostic performance. Clin Cancer Res, 13(3), 939-943.

Vanaja, D. K., Cheville, J. C., Iturria, S. J., & Young, C. Y. (2003). Transcriptional silencing of zinc finger protein 185 identified by expression profiling is associated with prostate cancer progression. Cancer Res, 63(14), 3877-3882.

Verhaegh, G. W., Verkleij, L., Vermeulen, S. H., den Heijer, M., Witjes, J. A., & Kiemeney, L. A. (2008). Polymorphisms in the H19 gene and the risk of bladder cancer. Eur Urol, 54(5), 1118-1126.

Villegas, V. E., & Zaphiropoulos, P. G. (2015). Neighboring gene regulation by antisense long non-coding RNAs. Int J Mol Sci, 16(2), 3251-3266.

Visscher, P. M., Wray, N. R., Zhang, Q., Sklar, P., McCarthy, M. I., Brown, M. A., & Yang, J. (2017). 10 Years of GWAS Discovery: Biology, Function, and Translation. Am J Hum Genet, 101(1), 5-22.

Wallace, T. A., Prueitt, R. L., Yi, M., Howe, T. M., Gillespie, J. W., Yfantis, H. G., . . . Ambs, S. (2008). Tumor immunobiological differences in prostate cancer between African-American and European-American men. Cancer Res, 68(3), 927-936.

Wang, F., Ren, S., Chen, R., Lu, J., Shi, X., Zhu, Y., . . . Sun, Y. (2014). Development and prospective multicenter evaluation of the long noncoding RNA MALAT-1 as a diagnostic urinary biomarker for prostate cancer. Oncotarget, 5(22), 11091-11102.

Wang, G. F., Nikovits, W., Jr., Bao, Z. Z., & Stockdale, F. E. (2001). Irx4 forms an inhibitory complex with the vitamin D and retinoic X receptors to regulate cardiac chamber-specific slow MyHC3 expression. J Biol Chem, 276(31), 28835-28841.

Wang, J., Cheng, G., Li, X., Pan, Y., Qin, C., Yang, H., . . . Wang, Z. (2016). Overexpression of long non-coding RNA LOC400891 promotes tumor progression and poor prognosis in prostate cancer. Tumour Biol, 37(7), 9603-9613.

Wang, L., Guo, Z. Y., Zhang, R., Xin, B., Chen, R., Zhao, J., . . . Yang, A. G. (2013). Pseudogene OCT4-pg4 functions as a natural micro RNA sponge to regulate OCT4 expression by competing for miR-145 in hepatocellular carcinoma. Carcinogenesis, 34(8), 1773-1781.

Wang, L., Han, S., Jin, G., Zhou, X., Li, M., Ying, X., . . . Zhu, Q. (2014). Linc00963: a novel, long non-coding RNA involved in the transition of prostate cancer from androgen-dependence to androgen-independence. Int J Oncol, 44(6), 2041-2049.

Wang, L., Shi, S., Wang, L., Xie, Y., Bai, E., Zhou, X., . . . Zhu, Q. (2013). [Role of PRNCR1 in the castration resistant prostate cancer]. Xi Bao Yu Fen Zi Mian Yi Xue Za Zhi, 29(8), 789-793.

Wang, P., Zhuang, C., Huang, D., & Xu, K. (2016). Downregulation of miR-377 contributes to IRX3 deregulation in hepatocellular carcinoma. Oncol Rep, 36(1), 247-252.

Wang, T., Xu, Y., & Hou, P. (2015). Identifying novel biomarkers of gastric cancer through integration analysis of single nucleotide polymorphisms and gene expression profile. Int J Biol Markers, 30(3), e321-326.

Wang, W., Lim, W. K., Leong, H. S., Chong, F. T., Lim, T. K., Tan, D. S., . . . Iyer, N. G. (2015). An eleven gene molecular signature for extra-capsular spread in oral squamous cell carcinoma serves as a prognosticator of outcome in patients without nodal metastases. Oral Oncol, 51(4), 355-362.

Wang, X., Ruan, Y., Wang, X., Zhao, W., Jiang, Q., Jiang, C., . . . Xu, D. (2017). Long intragenic non-coding RNA lincRNA-p21 suppresses development of human prostate cancer. Cell Prolif, 50(2).

Wang, X., Xu, Y., Wang, X., Jiang, C., Han, S., Dong, K., . . . Xu, D. (2017). LincRNA-p21 suppresses development of human prostate cancer through inhibition of PKM2. Cell Prolif, 50(6).

Waters, K. M., Le Marchand, L., Kolonel, L. N., Monroe, K. R., Stram, D. O., Henderson, B. E., & Haiman, C. A. (2009). Generalizability of associations from prostate cancer genome-wide association studies in multiple populations. Cancer Epidemiol Biomarkers Prev, 18(4), 1285-1289.

Werner, S., Stamm, H., Pandjaitan, M., Kemming, D., Brors, B., Pantel, K., & Wikman, H. (2015). Iroquois homeobox 2 suppresses cellular motility and chemokine expression in breast cancer cells. BMC Cancer, 15, 896.

Whitington, T., Gao, P., Song, W., Ross-Adams, H., Lamb, A. D., Yang, Y., . . . Wiklund, F. (2016). Gene regulatory mechanisms underpinning prostate cancer susceptibility. Nat Genet, 48(4), 387-397.

Willingham, A. T., Orth, A. P., Batalov, S., Peters, E. C., Wen, B. G., Aza-Blanc, P., . . . Schultz, P. G. (2005). A strategy for probing the function of noncoding RNAs finds a repressor of NFAT. Science, 309(5740), 1570-1573.

Wu, J., Cheng, G., Zhang, C., Zheng, Y., Xu, H., Yang, H., & Hua, L. (2017). Long noncoding RNA LINC01296 is associated with poor prognosis in prostate cancer and promotes cancer-cell proliferation and metastasis. Onco Targets Ther, 10, 1843-1852.

Wu, Y., Davison, J., Qu, X., Morrissey, C., Storer, B., Brown, L., . . . Fang, M. (2016). Methylation profiling identified novel differentially methylated markers including OPCML and FLRT2 in prostate cancer. Epigenetics, 11(4), 247-258.

Wyatt, A. W., & Gleave, M. E. (2015). Targeting the adaptive molecular landscape of castration-resistant prostate cancer. EMBO Mol Med, 7(7), 878-894.

Xie, C., Yuan, J., Li, H., Li, M., Zhao, G., Bu, D., . . . Zhao, Y. (2014). NONCODEv4: exploring the world of long non-coding RNA genes. Nucleic Acids Res, 42(Database issue), D98-103.

Xing, Z., Lin, C., & Yang, L. (2016). LncRNA Pulldown Combined with Mass Spectrometry to Identify the Novel LncRNA-Associated Proteins. Methods Mol Biol, 1402, 1-9.

Xiong, W., Huang, C., Deng, H., Jian, C., Zen, C., Ye, K., . . . Zhu, L. (2018). Oncogenic non-coding RNA NEAT1 promotes the prostate cancer cell growth through the SRC3/IGF1R/AKT pathway. Int J Biochem Cell Biol, 94, 125-132.

Xu, S., Yi, X. M., Tang, C. P., Ge, J. P., Zhang, Z. Y., & Zhou, W. Q. (2016). Long non-coding RNA ATB promotes growth and epithelial-mesenchymal transition and predicts poor prognosis in human prostate carcinoma. Oncol Rep, 36(1), 10-22.

Xu, W., Chang, J., Du, X., & Hou, J. (2017). Long non-coding RNA PCAT-1 contributes to tumorigenesis by regulating FSCN1 via miR-145-5p in prostate cancer. Biomed Pharmacother, 95, 1112-1118.

Xu, X., Hussain, W. M., Vijai, J., Offit, K., Rubin, M. A., Demichelis, F., & Klein, R. J. (2014). Variants at IRX4 as prostate cancer expression quantitative trait loci. Eur J Hum Genet, 22(4), 558-563.

Xu, Z., Bensen, J. T., Smith, G. J., Mohler, J. L., & Taylor, J. A. (2011). GWAS SNP Replication among African American and European American men in the North Carolina-Louisiana prostate cancer project (PCaP). Prostate, 71(8), 881-891.

Xue, Y., Wang, M., Kang, M., Wang, Q., Wu, B., Chu, H., . . . Wu, D. (2013). Association between lncrna PCGEM1 polymorphisms and prostate cancer risk. Prostate Cancer Prostatic Dis, 16(2), 139-144, S131.

Yacqub-Usman, K., Pickard, M. R., & Williams, G. T. (2015). Reciprocal regulation of GAS5 lncRNA levels and mTOR inhibitor action in prostate cancer cells. Prostate, 75(7), 693-705.

Yang, F., Nickols, N. G., Li, B. C., Marinov, G. K., Said, J. W., & Dervan, P. B. (2013). Antitumor activity of a pyrrole-imidazole polyamide. Proc Natl Acad Sci U S A, 110(5), 1863-1868.

Yang, J., Li, C., Mudd, A., & Gu, X. (2017). LncRNA PVT1 predicts prognosis and regulates tumor growth in prostate cancer. Biosci Biotechnol Biochem, 81(12), 2301-2306.

Yang, L., Lin, C., Jin, C., Yang, J. C., Tanasa, B., Li, W., . . . Rosenfeld, M. G. (2013). lncRNA-dependent mechanisms of androgen-receptor-regulated gene activation programs. Nature, 500(7464), 598-602.

Yang, Y., Jin, L., Zhang, J., Wang, J., Zhao, X., Wu, G., . . . Zhang, Z. (2017). High HSF4 expression is an independent indicator of poor overall survival and recurrence free survival in patients with primary colorectal cancer. IUBMB Life, 69(12), 956-961.

Yang, Y. A., & Yu, J. (2015). Current perspectives on FOXA1 regulation of androgen receptor signaling and prostate cancer. Genes Dis, 2(2), 144-151.

Yap, K. L., Li, S., Munoz-Cabello, A. M., Raguz, S., Zeng, L., Mujtaba, S., . . . Zhou, M. M. (2010). Molecular interplay of the noncoding RNA ANRIL and methylated histone H3 lysine 27 by polycomb CBX7 in transcriptional silencing of INK4a. Mol Cell, 38(5), 662-674.

Ye, X., & Weinberg, R. A. (2015). Epithelial-Mesenchymal Plasticity: A Central Regulator of Cancer Progression. Trends Cell Biol, 25(11), 675-686.

Yeager, M., Orr, N., Hayes, R. B., Jacobs, K. B., Kraft, P., Wacholder, S., . . . Thomas, G. (2007). Genome-wide association study of prostate cancer identifies a second risk locus at 8q24. Nat Genet, 39(5), 645-649.

Ylipaa, A., Kivinummi, K., Kohvakka, A., Annala, M., Latonen, L., Scaravilli, M., . . . Nykter, M. (2015). Transcriptome Sequencing Reveals PCAT5 as a Novel ERG-Regulated Long Noncoding RNA in Prostate Cancer. Cancer Res, 75(19), 4026-4031.

Yu, J., Yu, J., Mani, R. S., Cao, Q., Brenner, C. J., Cao, X., . . . Chinnaiyan, A. M. (2010). An integrated network of androgen receptor, polycomb, and TMPRSS2-ERG gene fusions in prostate cancer progression. Cancer Cell, 17(5), 443-454.

Zareba, P., Duivenvoorden, W., Leong, D. P., & Pinthus, J. H. (2016). Androgen deprivation therapy and cardiovascular disease: what is the linking mechanism? Ther Adv Urol, 8(2), 118-129.

Zhang, A., Zhao, J. C., Kim, J., Fong, K. W., Yang, Y. A., Chakravarti, D., . . . Yu, J. (2015). LncRNA HOTAIR Enhances the Androgen-Receptor-Mediated Transcriptional Program and Drives Castration-Resistant Prostate Cancer. Cell Rep, 13(1), 209-221.

Zhang, C., Liu, C., Wu, J., Zheng, Y., Xu, H., Cheng, G., & Hua, L. (2017). Upregulation of long noncoding RNA LOC440040 promotes tumor progression and predicts poor prognosis in patients with prostate cancer. Onco Targets Ther, 10, 4945-4954.

Zhang, L., Liang, D., Chen, C., Wang, Y., Amu, G., Yang, J., . . . Tang, X. (2018). Circular siRNAs for Reducing Off-Target Effects and Enhancing Long-Term Gene Silencing in Cells and Mice. Mol Ther Nucleic Acids, 10, 237-244.

Zhang, W., Ren, S. C., Shi, X. L., Liu, Y. W., Zhu, Y. S., Jing, T. L., . . . Sun, Y. H. (2015). A novel urinary long non-coding RNA transcript improves diagnostic accuracy in patients undergoing prostate biopsy. Prostate.

Zhang, Y., Pitchiaya, S., Cieslik, M., Niknafs, Y. S., Tien, J. C., Hosono, Y., . . . Chinnaiyan, A. M. (2018). Analysis of the androgen receptor-regulated lncRNA landscape identifies a role for ARLNC1 in prostate cancer progression. Nat Genet, 50(6), 814-824.

Zhang, Y., Zhang, P., Wan, X., Su, X., Kong, Z., Zhai, Q., . . . Li, Y. (2016). Downregulation of long non-coding RNA HCG11 predicts a poor prognosis in prostate cancer. Biomed Pharmacother, 83, 936-941.

Zhao, B., Yang, Y., Hu, L. B., Bai, Y., Li, R. Q., Zhang, G. Y., . . . Lu, Y. L. (2017). Overexpression of lncRNA ANRIL promoted the proliferation and migration of prostate cancer cells via regulating let-7a/TGF-beta1/Smad signaling pathway. Cancer Biomark.

Zhao, Y., Li, H., Fang, S., Kang, Y., Wu, W., Hao, Y., . . . Chen, R. (2016). NONCODE 2016: an informative and valuable data source of long non-coding RNAs. Nucleic Acids Res, 44(D1), D203-208.

Zheng, J., Zhao, S., He, X., Zheng, Z., Bai, W., Duan, Y., . . . Zhang, G. (2016). The up-regulation of long non-coding RNA CCAT2 indicates a poor prognosis for prostate cancer and promotes metastasis by affecting epithelial-mesenchymal transition. Biochem Biophys Res Commun, 480(4), 508-514.

Zheng, R., Wan, C., Mei, S., Qin, Q., Wu, Q., Sun, H., . . . Liu, X. S. (2019). Cistrome Data Browser: expanded datasets and new tools for gene regulatory analysis. Nucleic Acids Res, 47(D1), D729-D735.

Zhou, Y., Bolton, E. C., & Jones, J. O. (2015). Androgens and androgen receptor signaling in prostate tumorigenesis. J Mol Endocrinol, 54(1), R15-29.

Zhu, M., Chen, Q., Liu, X., Sun, Q., Zhao, X., Deng, R., . . . Yu, J. (2014). lncRNA H19/miR-675 axis represses prostate cancer metastasis by targeting TGFBI. FEBS J, 281(16), 3766-3775.

Appendix

Appendix A

The sequences of primers and siRNAs used in the thesis

Common primers for RT-qPCR

Primer | Forward (5’-3’) | Reverse (5’-3’)
IRX4lncRNA | GAGCTGGTCAAAATGCCCTC | GGCTACCGACGCTAAACTGT
RPL32 | CCCTTGTGAAGCCCAAGA | GACTGGTGCCGGATGAACTT
IRX4 | CGCCTTCTACTCGCTGAACA | AGAGCTGGCTCGTAAGGGTA
KLK3 | AGTGCGAGAAGCATTCCCAAC | CCAGCAAGATCACGCTTTTGTT
AR | AAAAGAGCCGCTGAAGGGAA | GAAGACGACAAGATGGACAATTT
ERG | CGCAGAGTTATCGTGCCAGCAGAT | CCATATTCTTTCACCGCCCACTCC
AR/ERG binding region | GTGGGGAGAGCGTCTTCAAT | CCGCAGTGCCTATGAAAGGA
DPP4 | GGTTCTGCTGAACAAAGGCA | TGATCTGAAATCCATCTTAAGGAGT
MYB | TGGGCTGCTTCCCAAGTCTG | CACATCTGTTCGATTCGGGAGA
ITGB3BP | GCGTTTCCTTTGGCGGATTT | TCTTCTAACAGACCATCCAACTT
ITGA | TCAATGACTTTCAGCGGCCC | GGCCAACTAACGGAGAACCA
PTDSS1 | TGTCTGTACGGCATGATTTGGTAT | ATGCTTGGGTGGGCTGTCTTC
TMEM123 | CATCCTGCCCTCGGAACAAT | CTCTATGTTTGCAGATGCCGC
MAP2K4 | GGCCATACATGGCACCTGAAA | TTGGATAAGGAAATCGGCCTG
NR3C2 | GGGGATGAGGCTTCAGGATG | TGCCCTTCCACTGCTCTTTT
PTPN1 | ACATGCGGTCACTTTTGGGA | TGTGCGCATTTTAACGAACCT
POLR3F | GCGGATCCGGTCGAAATAGA | TATTGATGGCTACTGCCCGC
WNT5A | GCTCGCATCCTCATGAACCT | GCCACATCAGCCAGGTTGTA
JAG1 | CAACACCTTCAACCTCAAGGC | AATCCCACGCCTCCACAAG

Strand specific primers for RT-qPCR

Primer | Forward (5’-3’) | Reverse (5’-3’)
IRX4lncRNA | GAGCTGGTCAAAATGCCCTC | CGACTGGAGCACGAGGACACTGA
RPL32 | CCCTTGTGAAGCCCAAGA | CTAATACGACTCACTATAGGGAGA

Strand specific primer for cDNA synthesis

Primer | Sequence (5’-3’)
IRX4lncRNA | CGACTGGAGCACGAGGACACTGAGGCTACCGACGCTAAACTGT
RPL32 | CTAATACGACTCACTATAGGGAGAGACTGGTGCCGGATGAACTT

IRX4lncRNA siRNAs

siRNA | Sense (5’-3’) | Antisense (5’-3’)
siRNA1 | GACUUUCAGAACCCACUGCtt | GCAGUGGGUUCUGAAAGUCca
siRNA2 | GGAAUGGACUUUCAGAACCtt | GGUUCUGAAAGUCCAUUCCgg
siRNA3 | GGACUUUCAGAACCCACUGtt | CAGUGGGUUCUGAAAGUCCat
siRNA4 | GGGAAACCAUCGGAAAUAAtt | UUAUUUCCGAUGGUUUCCCtc
siRNA5 | GCUCCAUGCAAAUUAUUCAtt | UGAAUAAUUUGCAUGGAGCgg
siRNA6 | CAUGCAAAUUAUUCAGACAtt | UGUCUGAAUAAUUUGCAUGga
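
The sequences listed above can be checked for internal consistency with a short script. The following is a minimal Python sketch, not part of the thesis methods. It assumes that the lowercase letters at the end of each siRNA strand are 3’ overhangs and that the uppercase portion is the paired core. It also assumes that each strand specific cDNA synthesis primer is a 5’ tag (the sequence listed as the strand specific reverse primer) followed by the common gene-specific reverse primer. All function and variable names are hypothetical.

```python
# Sequences are copied from Appendix A; all names below are illustrative only.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")

# name: (sense, antisense). Assumption: lowercase letters are 3' overhangs.
SIRNAS = {
    "siRNA1": ("GACUUUCAGAACCCACUGCtt", "GCAGUGGGUUCUGAAAGUCca"),
    "siRNA2": ("GGAAUGGACUUUCAGAACCtt", "GGUUCUGAAAGUCCAUUCCgg"),
    "siRNA3": ("GGACUUUCAGAACCCACUGtt", "CAGUGGGUUCUGAAAGUCCat"),
    "siRNA4": ("GGGAAACCAUCGGAAAUAAtt", "UUAUUUCCGAUGGUUUCCCtc"),
    "siRNA5": ("GCUCCAUGCAAAUUAUUCAtt", "UGAAUAAUUUGCAUGGAGCgg"),
    "siRNA6": ("CAUGCAAAUUAUUCAGACAtt", "UGUCUGAAUAAUUUGCAUGga"),
}

# name: (5' tag, common reverse primer, strand specific cDNA synthesis primer)
TAGGED_PRIMERS = {
    "IRX4lncRNA": (
        "CGACTGGAGCACGAGGACACTGA",
        "GGCTACCGACGCTAAACTGT",
        "CGACTGGAGCACGAGGACACTGAGGCTACCGACGCTAAACTGT",
    ),
    "RPL32": (
        "CTAATACGACTCACTATAGGGAGA",
        "GACTGGTGCCGGATGAACTT",
        "CTAATACGACTCACTATAGGGAGAGACTGGTGCCGGATGAACTT",
    ),
}


def core(strand: str) -> str:
    """Return the uppercase core, dropping the lowercase overhang."""
    return "".join(base for base in strand if base.isupper())


def reverse_complement_rna(seq: str) -> str:
    """Reverse complement of an uppercase RNA sequence (A, C, G, U)."""
    return seq.translate(RNA_COMPLEMENT)[::-1]


def duplex_is_complementary(sense: str, antisense: str) -> bool:
    """True if the antisense core is the reverse complement of the sense core."""
    return reverse_complement_rna(core(sense)) == core(antisense)


def is_tag_plus_reverse(tag: str, reverse: str, cdna_primer: str) -> bool:
    """True if the cDNA primer is the tag followed by the reverse primer."""
    return cdna_primer == tag + reverse


if __name__ == "__main__":
    for name, (sense, antisense) in SIRNAS.items():
        print(name, "cores complementary:", duplex_is_complementary(sense, antisense))
    for name, parts in TAGGED_PRIMERS.items():
        print(name, "cDNA primer = tag + reverse:", is_tag_plus_reverse(*parts))
```

A check of this kind is mainly useful for catching transcription errors when sequences are copied between a table and an ordering form. It says nothing about knockdown efficiency or primer specificity.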

Appendix B

Bioanalyser results summary for the samples used for microarray analysis. All the samples had an RNA Integrity Number (RIN) >9.95. (Samples 1-3: non-targeting siRNA transfected LNCaP cell samples; 4-6: siIRX4 knockdown LNCaP cell samples; 7-9: non-targeting siRNA transfected VCaP cell samples; 10-12: siIRX4 knockdown VCaP cell samples.)

Appendix C

Correspondence analysis for the normalised intensities of the samples from microarray analysis. The difference between the LNCaP and VCaP cell lines was greater than the difference between the treatment groups within each cell line. Replicates of each treatment group clustered together. The difference between the non-targeting siRNA and siIRX4 treated groups was greater in LNCaP cells than in VCaP cells.

Appendix D

Genes differentially regulated by IRX4 knockdown in LNCaP cells

Symbol | Gene Description | Chr | Fold Change | p-Value
AADAT | aminoadipate aminotransferase | chr4 | -2.59 | 0.008
AAED1 | AhpC/TSA antioxidant enzyme domain containing 1 | chr9 | -1.86 | 0.001
AAMDC | adipogenesis associated, Mth938 domain containing | chr11 | -1.50 | 0.005
ABCC4 | ATP-binding cassette, sub-family C (CFTR/MRP), member 4 | chr13 | -2.16 | 0.000
ABCC8 | ATP-binding cassette, sub-family C (CFTR/MRP), member 8 | chr11 | -1.58 | 0.015
ABHD11 | abhydrolase domain containing 11 | chr7 | -1.57 | 0.005
ABHD2 | abhydrolase domain containing 2 | chr15 | -1.59 | 0.001
ABI2 | abl-interactor 2 | chr2 | -1.53 | 0.048
ABLIM1 | actin binding LIM protein 1 | chr10 | -1.60 | 0.003
AC004381.6 | Putative RNA exonuclease NEF-sp | chr16 | -1.79 | 0.002
AC005076.5 |  | chr7 | -1.71 | 0.003
AC145676.2 |  | chr7 | -1.69 | 0.000
ACAD11 | acyl-CoA dehydrogenase family, member 11 | chr3 | -1.67 | 0.001
ACADM | acyl-CoA dehydrogenase, C-4 to C-12 straight chain | chr1 | -1.50 | 0.004
ACOT7 | acyl-CoA thioesterase 7 | chr1 | -4.88 | 0.000
ACSL5 | acyl-CoA synthetase long-chain family member 5 | chr10 | -2.15 | 0.018
ACSM1 | acyl-CoA synthetase medium-chain family member 1 | chr16 | -1.51 | 0.002
ACSM3 | acyl-CoA synthetase medium-chain family member 3 | chr16 | -2.64 | 0.003
ACTA2 | actin, alpha 2, smooth muscle, aorta | chr10 | -2.51 | 0.001
ACTG2 | actin, gamma 2, smooth muscle, enteric | chr2 | -2.97 | 0.000
ACTR3 | ARP3 actin-related protein 3 homolog (yeast) | chr2 | -1.65 | 0.002
ADAM11 | ADAM metallopeptidase domain 11 | chr17 | -1.73 | 0.005
ADAM1A | ADAM metallopeptidase domain 1A, pseudogene | chr12 | -1.61 | 0.024
ADCY7 | adenylate cyclase 7 | chr16 | -1.70 | 0.001
ADPRHL1 | ADP-ribosylhydrolase like 1 | chr13 | -1.59 | 0.000
ADRA2C | adrenoceptor alpha 2C | chr4 | -1.57 | 0.036
ADSSL1 | adenylosuccinate synthase like 1 | chr14 | -1.82 | 0.001
AGBL3 | ATP/GTP binding protein-like 3 | chr7 | -1.84 | 0.001
AGBL5 | ATP/GTP binding protein-like 5 | chr2 | -1.59 | 0.000
AGFG2 | ArfGAP with FG repeats 2 | chr7 | -2.53 | 0.002
AGPAT3 | 1-acylglycerol-3-phosphate O-acyltransferase 3 | chr21 | -1.76 | 0.000
AHI1 | Abelson helper integration site 1 | chr6 | -1.68 | 0.004
AIF1L | allograft inflammatory factor 1-like | chr9 | -1.59 | 0.001
AIM1 | absent in melanoma 1 | chr6 | -1.57 | 0.003
AK2 | adenylate kinase 2 | chr1 | -1.63 | 0.013
AK7 | adenylate kinase 7 | chr14 | -1.57 | 0.022
AK8 | adenylate kinase 8 | chr9 | -1.60 | 0.006
AKAP7 | A kinase (PRKA) anchor protein 7 | chr6 | -1.52 | 0.048
AKIP1 | A kinase (PRKA) interacting protein 1 | chr11 | -1.57 | 0.001
ALDH16A1 | aldehyde dehydrogenase 16 family, member A1 | chr19 | -1.80 | 0.001
ALDH3B1 | aldehyde dehydrogenase 3 family, member B1 | chr11 | -1.78 | 0.000
ALPK2 | alpha-kinase 2 | chr18 | -1.56 | 0.002
AMER1 | APC membrane recruitment protein 1 | chrX | -1.52 | 0.003
ANAPC10 | anaphase promoting complex subunit 10 | chr4 | -1.58 | 0.000
ANG | angiogenin, ribonuclease, RNase A family, 5 | chr14 | -3.21 | 0.000
ANKEF1 | ankyrin repeat and EF-hand domain containing 1 | chr20 | -1.71 | 0.000
ANKRD2 | ankyrin repeat domain 2 (stretch responsive muscle) | chr10 | -1.62 | 0.016
ANKRD32 | ankyrin repeat domain 32 | chr5 | -1.52 | 0.023
ANKRD34A | ankyrin repeat domain 34A | chr1 | -1.52 | 0.004
ANKRD50 | ankyrin repeat domain 50 | chr4 | -1.67 | 0.000
ANLN | anillin, actin binding protein | chr7 | -1.52 | 0.031

ANTXR2 | anthrax toxin receptor 2 | chr4 | -1.56 | 0.002
AP000487.5 |  | chr11 | -1.52 | 0.001
AP001062.7 |  | chr21 | -1.60 | 0.015
AP3B2 | adaptor-related protein complex 3, beta 2 subunit | chr15 | -1.50 | 0.002
APH1B | APH1B gamma secretase subunit | chr15 | -2.12 | 0.000
APLP1 | amyloid beta (A4) precursor-like protein 1 | chr19 | -1.64 | 0.005
APLP2 | amyloid beta (A4) precursor-like protein 2 | chr11 | -2.43 | 0.000
APOBEC3H | apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3H | chr22 | -2.90 | 0.000
APPL1 | adaptor protein, phosphotyrosine interaction, PH domain and leucine zipper containing 1 | chr3 | -1.60 | 0.001
AQP1 | aquaporin 1 (Colton blood group) | chr7 | -1.54 | 0.007
AQP5 | aquaporin 5 | chr12 | -1.52 | 0.008
ARG1 | arginase 1 | chr6 | -2.78 | 0.001
ARHGAP11A | Rho GTPase activating protein 11A | chr15 | -1.64 | 0.021
ARHGAP17 | Rho GTPase activating protein 17 | chr16 | -1.72 | 0.034
ARHGAP33 | Rho GTPase activating protein 33 | chr19 | -1.76 | 0.001
ARHGAP5-AS1 | ARHGAP5 antisense RNA 1 (head to head) | chr14 | -1.59 | 0.003
ARHGDIB | Rho GDP dissociation inhibitor (GDI) beta | chr12 | -1.68 | 0.015
ARHGEF10 | Rho guanine nucleotide exchange factor (GEF) 10 | chr8 | -1.51 | 0.008
ARHGEF17 | Rho guanine nucleotide exchange factor (GEF) 17 | chr11 | -1.53 | 0.001
ARHGEF28 | Rho guanine nucleotide exchange factor (GEF) 28 | chr5 | -1.55 | 0.006
ARHGEF40 | Rho guanine nucleotide exchange factor (GEF) 40 | chr14 | -1.64 | 0.007
ARNTL2 | aryl hydrocarbon receptor nuclear translocator-like 2 | chr12 | -1.59 | 0.007
ARPC1A | actin related protein 2/3 complex, subunit 1A, 41kDa | chr7 | -1.52 | 0.001
ARPIN | actin-related protein 2/3 complex inhibitor | chr15 | -1.52 | 0.004
ARRB2 | arrestin, beta 2 | chr17 | -1.70 | 0.000
ARRDC1-AS1 | ARRDC1 antisense RNA 1 | chr9 | -1.65 | 0.001
ARSA | arylsulfatase A | chr22 | -1.60 | 0.004
ARSB | arylsulfatase B | chr5 | -2.61 | 0.000
ARSG | arylsulfatase G | chr17 | -1.77 | 0.002
ARX | aristaless related homeobox | chrX | -2.57 | 0.001
ASAP1 | ArfGAP with SH3 domain, ankyrin repeat and PH domain 1 | chr8 | -1.70 | 0.007
ASB6 | ankyrin repeat and SOCS box containing 6 | chr9 | -1.52 | 0.005
ASIC1 | acid-sensing (proton-gated) ion channel 1 | chr12 | -1.62 | 0.015
ASPM | asp (abnormal spindle) homolog, microcephaly associated (Drosophila) | chr1 | -1.87 | 0.012
ASPRV1 | aspartic peptidase, retroviral-like 1 | chr2 | -2.05 | 0.001
ASRGL1 | asparaginase like 1 | chr11 | -2.59 | 0.000
ASXL1 | additional sex combs like transcriptional regulator 1 | chr20 | -1.71 | 0.002
ASXL2 | additional sex combs like transcriptional regulator 2 | chr2 | -1.72 | 0.002
ASZ1 | ankyrin repeat, SAM and basic leucine zipper domain containing 1 | chr7 | -1.68 | 0.009
ATAD2 | ATPase family, AAA domain containing 2 | chr8 | -2.16 | 0.004
ATAD5 | ATPase family, AAA domain containing 5 | chr17 | -1.68 | 0.016
ATL3 | atlastin GTPase 3 | chr11 | -1.70 | 0.001
ATP11A | ATPase, class VI, type 11A | chr13 | -2.00 | 0.000
ATP6V0A4 | ATPase, H+ transporting, lysosomal V0 subunit a4 | chr7 | -1.83 | 0.044
ATP6V0E1 | ATPase, H+ transporting, lysosomal 9kDa, V0 subunit e1 | chr5 | -2.16 | 0.000
ATP6V0E1P2 | ATPase, H+ transporting, lysosomal 9kDa, V0 subunit e1 pseudogene 2 | chr3 | -2.12 | 0.000
ATP6V0E2-AS1 | ATP6V0E2 antisense RNA 1 | chr7 | -1.98 | 0.000
ATP6V1G2 | ATPase, H+ transporting, lysosomal 13kDa, V1 subunit G2 | chr6 | -1.54 | 0.038
ATP8B3 | ATPase, aminophospholipid transporter, class I, type 8B, member 3 | chr19 | -1.82 | 0.014
AUNIP | aurora kinase A and ninein interacting protein | chr1 | -2.59 | 0.000
AVPI1 | arginine vasopressin-induced 1 | chr10 | -1.80 | 0.002

B9D1 | B9 protein domain 1 | chr17 | -1.78 | 0.007
BAG1 | BCL2-associated athanogene | chr9 | -1.52 | 0.026
BAI2 | brain-specific angiogenesis inhibitor 2 | chr1 | -1.60 | 0.002
BCAP29 | B-cell receptor-associated protein 29 | chr7 | -1.78 | 0.000
BCAP31 | B-cell receptor-associated protein 31 | chrX | -1.71 | 0.012
BCL2 | B-cell CLL/lymphoma 2 | chr18 | -1.65 | 0.001
BDH2 | 3-hydroxybutyrate dehydrogenase, type 2 | chr4 | -1.61 | 0.010
BEGAIN | brain-enriched guanylate kinase-associated | chr14 | -1.52 | 0.010
BID | BH3 interacting domain death agonist | chr22 | -1.74 | 0.001
BIRC5 | baculoviral IAP repeat containing 5 | chr17 | -1.85 | 0.025
BLM | Bloom syndrome, RecQ helicase-like | chr15 | -2.36 | 0.003
BLNK | B-cell linker | chr10 | -1.67 | 0.002
BLOC1S5 | biogenesis of lysosomal organelles complex-1, subunit 5, muted | chr6 | -1.51 | 0.001
BLZF1 | basic leucine zipper nuclear factor 1 | chr1 | -1.58 | 0.001
BORA | bora, aurora kinase A activator | chr13 | -1.66 | 0.011
BPNT1 | 3'(2'), 5'-bisphosphate nucleotidase 1 | chr1 | -1.68 | 0.001
BRCA1 | breast cancer 1, early onset | chr17 | -2.33 | 0.004
BRCA2 | breast cancer 2, early onset | chr13 | -2.29 | 0.004
BRICD5 | BRICHOS domain containing 5 | chr16 | -1.55 | 0.011
BRIP1 | BRCA1 interacting protein C-terminal helicase 1 | chr17 | -1.62 | 0.023
BROX | BRO1 domain and CAAX motif containing | chr1 | -2.27 | 0.000
BTBD9 | BTB (POZ) domain containing 9 | chr6 | -2.00 | 0.003
BTG1 | B-cell translocation gene 1, anti-proliferative | chr12 | -1.53 | 0.019
BTG3 | BTG family, member 3 | chr21 | -1.55 | 0.040
BTN2A1 | butyrophilin, subfamily 2, member A1 | chr6 | -1.58 | 0.020
BTN2A2 | butyrophilin, subfamily 2, member A2 | chr6 | -1.56 | 0.002
BTN3A2 | butyrophilin, subfamily 3, member A2 | chr6 | -1.60 | 0.002
BZRAP1 | benzodiazepine receptor (peripheral) associated protein 1 | chr17 | -1.58 | 0.000
C11orf24 | chromosome 11 open reading frame 24 | chr11 | -2.01 | 0.000
C11orf52 | chromosome 11 open reading frame 52 | chr11 | -1.58 | 0.002
C14orf28 | chromosome 14 open reading frame 28 | chr14 | -1.59 | 0.001
C14orf80 | chromosome 14 open reading frame 80 | chr14 | -1.66 | 0.005
C14orf93 | chromosome 14 open reading frame 93 | chr14 | -1.62 | 0.004
C15orf65 | chromosome 15 open reading frame 65 | chr15 | -1.57 | 0.008
C16orf74 | chromosome 16 open reading frame 74 | chr16 | -1.88 | 0.000
C16orf89 | chromosome 16 open reading frame 89 | chr16 | -1.56 | 0.007
C17orf53 | chromosome 17 open reading frame 53 | chr17 | -1.59 | 0.018
C19orf40 | chromosome 19 open reading frame 40 | chr19 | -1.69 | 0.018
C19orf57 | chromosome 19 open reading frame 57 | chr19 | -2.35 | 0.003
C19orf83 | chromosome 19 open reading frame 83 | chr19 | -1.71 | 0.000
C1orf228 | chromosome 1 open reading frame 228 | chr1 | -1.95 | 0.001
C1RL | complement component 1, r subcomponent-like | chr12 | -1.98 | 0.002
C2 | complement component 2 | chr6 | -1.74 | 0.005
C3orf67 | chromosome 3 open reading frame 67 | chr3 | -1.60 | 0.004
C6orf132 | chromosome 6 open reading frame 132 | chr6 | -2.63 | 0.002
C6orf222 | chromosome 6 open reading frame 222 | chr6 | -1.57 | 0.018
C6orf25 | chromosome 6 open reading frame 25 | chr6 | -2.12 | 0.000
C6orf52 | chromosome 6 open reading frame 52 | chr6 | -1.52 | 0.012
C8G | complement component 8, gamma polypeptide | chr9 | -1.65 | 0.029
C9orf152 | chromosome 9 open reading frame 152 | chr9 | -1.58 | 0.016
C9orf40 | chromosome 9 open reading frame 40 | chr9 | -1.73 | 0.008
C9orf43 | chromosome 9 open reading frame 43 | chr9 | -1.71 | 0.001
CA11 | carbonic anhydrase XI | chr19 | -1.84 | 0.006
CA4 | carbonic anhydrase IV | chr17 | -1.52 | 0.010
CA5BP1 | carbonic anhydrase VB pseudogene 1 | chrX | -1.55 | 0.007
CACNB1 | calcium channel, voltage-dependent, beta 1 subunit | chr17 | -1.83 | 0.000

CADPS2 | Ca++-dependent secretion activator 2 | chr7 | -1.73 | 0.007
CALM3 | calmodulin 3 (phosphorylase kinase, delta) | chr19 | -1.50 | 0.021
CALML4 | calmodulin-like 4 | chr15 | -1.87 | 0.000
CAMK2D | calcium/calmodulin-dependent protein kinase II delta | chr4 | -1.74 | 0.048
CAP1 | CAP, adenylate cyclase-associated protein 1 (yeast) | chr1 | -1.57 | 0.001
CAPN1 | calpain 1, (mu/I) large subunit | chr11 | -1.70 | 0.006
CAPN13 | calpain 13 | chr2 | -1.59 | 0.041
CAPN3 | calpain 3, (p94) | chr15 | -1.54 | 0.001
CAPSL | calcyphosine-like | chr5 | -1.55 | 0.001
CARD10 | caspase recruitment domain family, member 10 | chr22 | -1.77 | 0.001
CARNS1 | carnosine synthase 1 | chr11 | -1.64 | 0.011
CASC5 | cancer susceptibility candidate 5 | chr15 | -1.74 | 0.046
CASP4 | caspase 4, apoptosis-related peptidase | chr11 | -1.65 | 0.007
CASP5 | caspase 5, apoptosis-related cysteine peptidase | chr11 | -1.62 | 0.024
CAST | calpastatin | chr5 | -2.52 | 0.001
CBFB | core-binding factor, beta subunit | chr16 | -1.54 | 0.000
CBLL1 | Cbl proto-oncogene-like 1, E3 ubiquitin protein ligase | chr7 | -2.27 | 0.000
CBR3 | carbonyl reductase 3 | chr21 | -1.96 | 0.035
CBX1 | chromobox homolog 1 | chr17 | -1.81 | 0.000
CBX7 | chromobox homolog 7 | chr22 | -1.79 | 0.000
CBY1 | chibby homolog 1 (Drosophila) | chr22 | -1.93 | 0.000
CCDC109B | coiled-coil domain containing 109B | chr4 | -2.31 | 0.000
CCDC11 | coiled-coil domain containing 11 | chr18 | -1.50 | 0.002
CCDC113 | coiled-coil domain containing 113 | chr16 | -1.95 | 0.000
CCDC127 | coiled-coil domain containing 127 | chr5 | -1.67 | 0.001
CCDC136 | coiled-coil domain containing 136 | chr7 | -1.54 | 0.007
CCDC138 | coiled-coil domain containing 138 | chr2 | -1.65 | 0.003
CCDC150 | coiled-coil domain containing 150 | chr2 | -1.54 | 0.025
CCDC153 | coiled-coil domain containing 153 | chr11 | -1.51 | 0.004
CCDC155 | coiled-coil domain containing 155 | chr19 | -1.56 | 0.008
CCDC176 | coiled-coil domain containing 176 | chr14 | -1.54 | 0.001
CCDC18 | coiled-coil domain containing 18 | chr1 | -2.14 | 0.001
CCDC180 | coiled-coil domain containing 180 | chr9 | -1.78 | 0.001
CCDC40 | coiled-coil domain containing 40 | chr17 | -1.62 | 0.006
CCDC50 | coiled-coil domain containing 50 | chr3 | -1.79 | 0.002
CCDC57 | coiled-coil domain containing 57 | chr17 | -1.82 | 0.001
CCDC60 | coiled-coil domain containing 60 | chr12 | -1.52 | 0.018
CCDC92 | coiled-coil domain containing 92 | chr12 | -2.99 | 0.000
CCNA2 | cyclin A2 | chr4 | -2.37 | 0.012
CCNB1 | cyclin B1 | chr5 | -1.65 | 0.042
CCNB2 | cyclin B2 | chr15 | -1.89 | 0.017
CCND1 | cyclin D1 | chr11 | -1.76 | 0.015
CCNE1 | cyclin E1 | chr19 | -2.00 | 0.000
CCNE2 | cyclin E2 | chr8 | -2.71 | 0.001
CCNG2 | cyclin G2 | chr4 | -1.56 | 0.002
CD27-AS1 | CD27 antisense RNA 1 | chr12 | -2.47 | 0.000
CD302 | CD302 molecule | chr2 | -2.37 | 0.000
CD82 | CD82 molecule | chr11 | -2.44 | 0.001
CD83 | CD83 molecule | chr6 | -1.64 | 0.001
CDADC1 | cytidine and dCMP deaminase domain containing 1 | chr13 | -1.77 | 0.000
CDC20 | cell division cycle 20 | chr1 | -1.83 | 0.023
CDC20P1 | cell division cycle 20 pseudogene 1 | chr9 | -1.69 | 0.037
CDC42EP1 | CDC42 effector protein (Rho GTPase binding) 1 | chr22 | -1.97 | 0.000
CDC42EP2 | CDC42 effector protein (Rho GTPase binding) 2 | chr11 | -1.76 | 0.002
CDC45 | cell division cycle 45 | chr22 | -2.17 | 0.019
CDC6 | cell division cycle 6 | chr17 | -2.08 | 0.017

CDC7 | cell division cycle 7 | chr1 | -1.92 | 0.003
CDCA2 | cell division cycle associated 2 | chr8 | -1.86 | 0.043
CDCA7 | cell division cycle associated 7 | chr2 | -2.16 | 0.005
CDCA7L | cell division cycle associated 7-like | chr7 | -2.35 | 0.001
CDCA8 | cell division cycle associated 8 | chr1 | -2.27 | 0.004
CDK14 | cyclin-dependent kinase 14 | chr7 | -1.76 | 0.004
CDK16 | cyclin-dependent kinase 16 | chrX | -1.79 | 0.000
CDKN2A | cyclin-dependent kinase inhibitor 2A | chr9 | -1.62 | 0.025
CDKN2C | cyclin-dependent kinase inhibitor 2C (p18, inhibits CDK4) | chr1 | -2.13 | 0.004
CDKN3 | cyclin-dependent kinase inhibitor 3 | chr14 | -1.75 | 0.006
CDNF | cerebral dopamine neurotrophic factor | chr10 | -1.79 | 0.001
CDR2L | cerebellar degeneration-related protein 2-like | chr17 | -2.17 | 0.007
CDS2 | CDP-diacylglycerol synthase (phosphatidate cytidylyltransferase) 2 | chr20 | -4.25 | 0.000
CDT1 | chromatin licensing and DNA replication factor 1 | chr16 | -1.93 | 0.015
CECR1 | cat eye syndrome chromosome region, candidate 1 | chr22 | -1.70 | 0.047
CELSR1 | cadherin, EGF LAG seven-pass G-type receptor 1 | chr22 | -1.58 | 0.001
CENPA | centromere protein A | chr2 | -2.01 | 0.019
CENPE | centromere protein E, 312kDa | chr4 | -1.84 | 0.025
CENPF | centromere protein F, 350/400kDa | chr1 | -1.85 | 0.018
CENPI | centromere protein I | chrX | -1.72 | 0.027
CENPJ | centromere protein J | chr13 | -1.70 | 0.002
CENPK | centromere protein K | chr5 | -1.92 | 0.037
CENPN | centromere protein N | chr16 | -2.51 | 0.001
CENPP | centromere protein P | chr9 | -2.19 | 0.001
CENPU | centromere protein U | chr4 | -1.84 | 0.026
CEP128 | centrosomal protein 128kDa | chr14 | -1.66 | 0.017
CEP152 | centrosomal protein 152kDa | chr15 | -1.59 | 0.034
CEP162 | centrosomal protein 162kDa | chr6 | -1.56 | 0.001
CEP55 | centrosomal protein 55kDa | chr10 | -1.93 | 0.031
CEP68 | centrosomal protein 68kDa | chr2 | -1.61 | 0.001
CERK | ceramide kinase | chr22 | -2.30 | 0.001
CERS1 | ceramide synthase 1 | chr19 | -1.51 | 0.018
CES3 | carboxylesterase 3 | chr16 | -1.51 | 0.022
CFAP61 | cilia and flagella associated protein 61 | chr20 | -1.52 | 0.001
CHEK1 | checkpoint kinase 1 | chr11 | -1.84 | 0.019
CHRAC1 | chromatin accessibility complex 1 | chr8 | -1.55 | 0.005
CHST11 | carbohydrate (chondroitin 4) sulfotransferase 11 | chr12 | -1.55 | 0.038
CHST12 | carbohydrate (chondroitin 4) sulfotransferase 12 | chr7 | -1.84 | 0.000
CHST15 | carbohydrate (N-acetylgalactosamine 4-sulfate 6-O) sulfotransferase 15 | chr10 | -2.14 | 0.000
CISD3 | CDGSH iron sulfur domain 3 | chr17 | -2.07 | 0.000
CIZ1 | CDKN1A interacting zinc finger protein 1 | chr9 | -1.54 | 0.001
CKAP2 | cytoskeleton associated protein 2 | chr13 | -1.82 | 0.013
CKAP2L | cytoskeleton associated protein 2-like | chr2 | -1.86 | 0.012
CLDN3 | claudin 3 | chr7 | -1.81 | 0.000
CLDND1 | claudin domain containing 1 | chr3 | -1.62 | 0.002
CLIP1 | CAP-GLY domain containing linker protein 1 | chr12 | -1.67 | 0.001
CLSPN | claspin | chr1 | -2.35 | 0.002
CMC4 | C-x(9)-C motif containing 4 | chrX | -1.91 | 0.000
CNDP2 | CNDP dipeptidase 2 (metallopeptidase M20 family) | chr18 | -1.51 | 0.043
CNGB3 | cyclic nucleotide gated channel beta 3 | chr8 | -1.73 | 0.004
CNKSR2 | connector enhancer of kinase suppressor of Ras 2 | chrX | -1.92 | 0.002
CNN2 | calponin 2 | chr19 | -1.66 | 0.001
CNNM2 | cyclin and CBS domain divalent metal cation transport mediator 2 | chr10 | -1.78 | 0.000
CNOT6 | CCR4-NOT transcription complex, subunit 6 | chr5 | -2.34 | 0.000


COL4A2 | collagen, type IV, alpha 2 | chr13 | -1.63 | 0.001
COL4A6 | collagen, type IV, alpha 6 | chrX | -1.56 | 0.009
COL7A1 | collagen, type VII, alpha 1 | chr3 | -1.88 | 0.002
COMMD8 | COMM domain containing 8 | chr4 | -1.97 | 0.000
CRACR2B | calcium release activated channel regulator 2B | chr11 | -1.81 | 0.035
CREB3 | cAMP responsive element binding protein 3 | chr9 | -4.08 | 0.001
CREG1 | cellular repressor of E1A-stimulated genes 1 | chr1 | -1.51 | 0.013
CRIP2 | cysteine-rich protein 2 | chr14 | -2.10 | 0.010
CSE1L | CSE1 chromosome segregation 1-like (yeast) | chr20 | -1.73 | 0.009
CTA-250D10.23 |  | chr22 | -1.55 | 0.031
CTB-193M12.5 |  | chr16 | -1.71 | 0.013
CTB-50L17.8 |  | chr19 | -1.67 | 0.000
CTB-58E17.1 |  | chr17 | -1.61 | 0.009
CTC-301O7.4 |  | chr19 | -1.50 | 0.028
CTCF | CCCTC-binding factor (zinc finger protein) | chr16 | -1.50 | 0.011
CTD-2366F13.1 |  | chr5 | -2.55 | 0.000
CTSO | cathepsin O | chr4 | -1.56 | 0.040
CUL4B | cullin 4B | chrX | -1.50 | 0.002
CUX1 | cut-like homeobox 1 | chr7 | -2.41 | 0.000
CYFIP2 | cytoplasmic FMR1 interacting protein 2 | chr5 | -1.71 | 0.002
CYP2E1 | cytochrome P450, family 2, subfamily E, polypeptide 1 | chr10 | -1.60 | 0.001
CYR61 | cysteine-rich, angiogenic inducer, 61 | chr1 | -1.59 | 0.016
CYSRT1 | cysteine-rich tail protein 1 | chr9 | -1.51 | 0.003
D4S234E | Neuron-specific protein family member 1 | chr4 | -1.58 | 0.002
DAB1 | Dab, reelin signal transducer, homolog 1 (Drosophila) | chr1 | -1.74 | 0.016
DBF4B | DBF4 zinc finger B | chr17 | -1.80 | 0.003
DBNDD1 | dysbindin (dystrobrevin binding protein 1) domain containing 1 | chr16 | -1.58 | 0.002
DCLRE1B | DNA cross-link repair 1B | chr1 | -1.77 | 0.010
DCP2 | decapping mRNA 2 | chr5 | -1.61 | 0.003
DCTN5 | dynactin 5 (p25) | chr16 | -1.65 | 0.001
DDB2 | damage-specific DNA binding protein 2, 48kDa | chr11 | -1.54 | 0.045
DDIAS | DNA damage-induced apoptosis suppressor | chr11 | -1.76 | 0.021
DEK | DEK proto-oncogene | chr6 | -1.61 | 0.008
DERA | deoxyribose-phosphate aldolase (putative) | chr12 | -1.69 | 0.001
DGCR14 | DiGeorge syndrome critical region gene 14 | chr22 | -1.59 | 0.021
DGKA | diacylglycerol kinase, alpha 80kDa | chr12 | -1.63 | 0.010
DHRS4-AS1 | DHRS4 antisense RNA 1 | chr14 | -1.95 | 0.001
DIAPH3 | diaphanous-related formin 3 | chr13 | -1.59 | 0.041
DIS3L | DIS3 like exosome 3'-5' exoribonuclease | chr15 | -1.62 | 0.000
DKKL1 | dickkopf-like 1 | chr19 | -1.55 | 0.015
DMBX1 | diencephalon/mesencephalon homeobox 1 | chr1 | -1.66 | 0.011
DMXL2 | Dmx-like 2 | chr15 | -1.66 | 0.000
DNA2 | DNA replication helicase/nuclease 2 | chr10 | -1.65 | 0.027
DNAH11 | dynein, axonemal, heavy chain 11 | chr7 | -1.99 | 0.002
DNAH2 | dynein, axonemal, heavy chain 2 | chr17 | -2.05 | 0.000
DNAJB5 | DnaJ (Hsp40) homolog, subfamily B, member 5 | chr9 | -1.51 | 0.004
DNAJC16 | DnaJ (Hsp40) homolog, subfamily C, member 16 | chr1 | -1.88 | 0.000
DNAJC22 | DnaJ (Hsp40) homolog, subfamily C, member 22 | chr12 | -2.77 | 0.002
DNALI1 | dynein, axonemal, light intermediate chain 1 | chr1 | -1.53 | 0.001
DNASE2B | deoxyribonuclease II beta | chr1 | -1.97 | 0.001
DNMT1 | DNA (cytosine-5-)-methyltransferase 1 | chr19 | -1.61 | 0.003
DNMT3A | DNA (cytosine-5-)-methyltransferase 3 alpha | chr2 | -1.81 | 0.000
DOCK11 | dedicator of cytokinesis 11 | chrX | -1.76 | 0.000
DOK7 | docking protein 7 | chr4 | -1.97 | 0.002


DONSON | downstream neighbor of SON | chr21 | -1.66 | 0.005
DPP4 | dipeptidyl-peptidase 4 | chr2 | -1.63 | 0.030
DPP9 | dipeptidyl-peptidase 9 | chr19 | -1.72 | 0.004
DR1 | down-regulator of transcription 1, TBP-binding (negative cofactor 2) | chr1 | -1.89 | 0.000
DSCC1 | DNA replication and sister chromatid cohesion 1 | chr8 | -1.61 | 0.050
DSE | dermatan sulfate epimerase | chr6 | -1.57 | 0.017
DSN1 | DSN1, MIS12 kinetochore complex component | chr20 | -1.70 | 0.008
DTD1 | D-tyrosyl-tRNA deacylase 1 | chr20 | -1.56 | 0.011
DTL | denticleless E3 ubiquitin protein ligase homolog (Drosophila) | chr1 | -2.12 | 0.010
DUS4L | dihydrouridine synthase 4-like (S. cerevisiae) | chr7 | -1.50 | 0.013
DUSP3 | dual specificity phosphatase 3 | chr17 | -3.24 | 0.000
DUSP4 | dual specificity phosphatase 4 | chr8 | -2.12 | 0.006
DUSP5 | dual specificity phosphatase 5 | chr10 | -1.85 | 0.048
DUSP6 | dual specificity phosphatase 6 | chr12 | -1.58 | 0.012
DUT | deoxyuridine triphosphatase | chr15 | -1.61 | 0.014
DYM | dymeclin | chr18 | -1.57 | 0.000
DYNLRB1 | dynein, light chain, roadblock-type 1 | chr20 | -1.93 | 0.001
DYRK3 | dual-specificity tyrosine-(Y)-phosphorylation regulated kinase 3 | chr1 | -1.87 | 0.019
DYX1C1 | dyslexia susceptibility 1 candidate 1 | chr15 | -1.55 | 0.018
E2F7 | E2F transcription factor 7 | chr12 | -1.96 | 0.021
E2F8 | E2F transcription factor 8 | chr11 | -1.95 | 0.014
EBAG9 | estrogen receptor binding site associated, antigen, 9 | chr8 | -3.15 | 0.000
EBPL | emopamil binding protein-like | chr13 | -1.52 | 0.002
ECT2 | epithelial cell transforming 2 | chr3 | -2.03 | 0.007
EDN2 | endothelin 2 | chr1 | -1.57 | 0.037
EFCAB2 | EF-hand calcium binding domain 2 | chr1 | -1.81 | 0.000
EFNA5 | ephrin-A5 | chr5 | -2.69 | 0.001
EFNB3 | ephrin-B3 | chr17 | -1.91 | 0.003
EGR1 | early growth response 1 | chr5 | -2.49 | 0.001
EHF | ets homologous factor | chr11 | -2.25 | 0.000
EIF4EBP2 | eukaryotic translation initiation factor 4E binding protein 2 | chr10 | -1.81 | 0.002
ELK4 | ELK4, ETS-domain protein (SRF accessory protein 1) | chr1 | -1.76 | 0.001
EMB | embigin | chr5 | -1.55 | 0.000
EMBP1 | embigin pseudogene 1 | chr1 | -1.52 | 0.003
EMC3-AS1 | EMC3 antisense RNA 1 | chr3 | -1.72 | 0.000
EME1 | essential meiotic structure-specific endonuclease 1 | chr17 | -1.84 | 0.046
EML1 | echinoderm microtubule associated protein like 1 | chr14 | -1.61 | 0.006
EMP2 | epithelial membrane protein 2 | chr16 | -1.86 | 0.011
ENO3 | enolase 3 (beta, muscle) | chr17 | -1.61 | 0.004
ENPP1 | ectonucleotide pyrophosphatase/phosphodiesterase 1 | chr6 | -1.63 | 0.003
EPHB2 | EPH receptor B2 | chr1 | -2.04 | 0.000
ERCC1 | excision repair cross-complementation group 1 | chr19 | -2.82 | 0.000
ERI2 | ERI1 exoribonuclease family member 2 | chr16 | -1.53 | 0.025
ERVMER34-1 | endogenous retrovirus group MER34, member 1 | chr4 | -2.15 | 0.007
ESCO2 | establishment of sister chromatid cohesion N-acetyltransferase 2 | chr8 | -1.87 | 0.009
ESPN | espin | chr1 | -1.74 | 0.001
ESPNL | espin-like | chr2 | -1.62 | 0.012
ESRP2 | epithelial splicing regulatory protein 2 | chr16 | -1.76 | 0.001
ETNK2 | ethanolamine kinase 2 | chr1 | -1.76 | 0.005
ETNPPL | ethanolamine-phosphate phospho-lyase | chr4 | -1.80 | 0.000
EVA1C | eva-1 homolog C (C. elegans) | chr21 | -1.63 | 0.001
EVC | Ellis van Creveld syndrome | chr4 | -1.57 | 0.001
EXO1 | exonuclease 1 | chr1 | -2.27 | 0.010
EXOSC1 | exosome component 1 | chr10 | -2.32 | 0.000


EXOSC9 | exosome component 9 | chr4 | -1.58 | 0.008
EXT2 | exostosin glycosyltransferase 2 | chr11 | -2.42 | 0.000
EZH2 | enhancer of zeste 2 polycomb repressive complex 2 subunit | chr7 | -1.63 | 0.028
EZR | ezrin | chr6 | -1.59 | 0.027
F8 | coagulation factor VIII, procoagulant component | chrX | -1.98 | 0.000
FA2H | fatty acid 2-hydroxylase | chr16 | -1.75 | 0.000
FADS1 | fatty acid desaturase 1 | chr11 | -3.99 | 0.000
FAIM3 | Fas apoptotic inhibitory molecule 3 | chr1 | -1.52 | 0.013
FAM104B | family with sequence similarity 104, member B | chrX | -3.17 | 0.000
FAM111B | family with sequence similarity 111, member B | chr11 | -2.46 | 0.004
FAM118B | family with sequence similarity 118, member B | chr11 | -1.77 | 0.001
FAM122B | family with sequence similarity 122B | chrX | -1.93 | 0.001
FAM134B | family with sequence similarity 134, member B | chr5 | -1.59 | 0.006
FAM13A | family with sequence similarity 13, member A | chr4 | -1.56 | 0.037
FAM183A | family with sequence similarity 183, member A | chr1 | -1.80 | 0.038
FAM214B | family with sequence similarity 214, member B | chr9 | -1.64 | 0.005
FAM221B | family with sequence similarity 221, member B | chr9 | -2.24 | 0.000
FAM222B | family with sequence similarity 222, member B | chr17 | -2.40 | 0.000
FAM229B | family with sequence similarity 229, member B | chr6 | -1.70 | 0.001
FAM64A | family with sequence similarity 64, member A | chr17 | -1.74 | 0.044
FAM65A | family with sequence similarity 65, member A | chr16 | -2.35 | 0.000
FAM69A | family with sequence similarity 69, member A | chr1 | -2.15 | 0.004
FAM83E | family with sequence similarity 83, member E | chr19 | -1.87 | 0.003
FANCA | Fanconi anemia, complementation group A | chr16 | -2.13 | 0.003
FANCD2 | Fanconi anemia, complementation group D2 | chr3 | -1.69 | 0.035
FANCE | Fanconi anemia, complementation group E | chr6 | -1.92 | 0.000
FANCG | Fanconi anemia, complementation group G | chr9 | -1.59 | 0.050
FANCI | Fanconi anemia, complementation group I | chr15 | -1.80 | 0.023
FARP1 | FERM, RhoGEF (ARHGEF) and pleckstrin domain protein 1 (chondrocyte-derived) | chr13 | -1.75 | 0.004
FBRS | fibrosin | chr16 | -1.64 | 0.001
FBXL5 | F-box and leucine-rich repeat protein 5 | chr4 | -1.62 | 0.000
FBXO11 | F-box protein 11 | chr2 | -1.54 | 0.000
FBXO15 | F-box protein 15 | chr18 | -2.50 | 0.000
FBXO22 | F-box protein 22 | chr15 | -1.68 | 0.002
FBXO36 | F-box protein 36 | chr2 | -1.65 | 0.008
FBXO4 | F-box protein 4 | chr5 | -1.62 | 0.036
FBXO5 | F-box protein 5 | chr6 | -1.69 | 0.040
FEN1 | flap structure-specific endonuclease 1 | chr11 | -1.65 | 0.049
FER1L4 | fer-1-like family member 4, pseudogene (functional) | chr20 | -1.61 | 0.009
FERMT1 | fermitin family member 1 | chr20 | -2.91 | 0.000
FERMT2 | fermitin family member 2 | chr14 | -1.80 | 0.001
FGF12 | fibroblast growth factor 12 | chr3 | -1.50 | 0.003
FGFR1 | fibroblast growth factor receptor 1 | chr8 | -1.65 | 0.001
FHL2 | four and a half LIM domains 2 | chr2 | -1.57 | 0.001
FHL3 | four and a half LIM domains 3 | chr1 | -1.74 | 0.008
FICD | FIC domain containing | chr12 | -1.90 | 0.002
FLNA | filamin A, alpha | chrX | -1.61 | 0.024
FMO5 | flavin containing monooxygenase 5 | chr1 | -1.59 | 0.006
FNDC3A | fibronectin type III domain containing 3A | chr13 | -1.66 | 0.000
FOS | FBJ murine osteosarcoma viral oncogene homolog | chr14 | -2.20 | 0.033
FOXJ3 | forkhead box J3 | chr1 | -1.54 | 0.004
FOXM1 | forkhead box M1 | chr12 | -1.61 | 0.038
FRMPD2 | FERM and PDZ domain containing 2 | chr10 | -1.92 | 0.000
FRRS1 | ferric-chelate reductase 1 | chr1 | -1.52 | 0.001
FSD1L | fibronectin type III and SPRY domain containing 1-like | chr9 | -1.62 | 0.004
FYCO1 | FYVE and coiled-coil domain containing 1 | chr3 | -2.20 | 0.002


GALNT10 | polypeptide N-acetylgalactosaminyltransferase 10 | chr5 | -2.03 | 0.002
GALNT12 | polypeptide N-acetylgalactosaminyltransferase 12 | chr9 | -1.76 | 0.011
GAS2 | growth arrest-specific 2 | chr11 | -1.66 | 0.032
GAS6-AS1 | GAS6 antisense RNA 1 | chr13 | -1.71 | 0.009
GAS8 | growth arrest-specific 8 | chr16 | -1.95 | 0.000
GATS | GATS, stromal antigen 3 opposite strand | chr7 | -2.30 | 0.018
GBA3 | glucosidase, beta, acid 3 (gene/pseudogene) | chr4 | -1.67 | 0.034
GCLM | glutamate-cysteine ligase, modifier subunit | chr1 | -1.74 | 0.015
GDF11 | growth differentiation factor 11 | chr12 | -2.13 | 0.000
GFOD2 | glucose-fructose oxidoreductase domain containing 2 | chr16 | -1.84 | 0.002
GIMAP2 | GTPase, IMAP family member 2 | chr7 | -1.64 | 0.003
GINS4 | GINS complex subunit 4 (Sld5 homolog) | chr8 | -1.90 | 0.026
GIT2 | G protein-coupled receptor kinase interacting ArfGAP 2 | chr12 | -1.54 | 0.002
GJC3 | gap junction protein, gamma 3, 30.2kDa | chr7 | -1.70 | 0.004
GLIS2 | GLIS family zinc finger 2 | chr16 | -1.93 | 0.003
GLS2 | glutaminase 2 (liver, mitochondrial) | chr12 | -2.19 | 0.000
GMPS | guanine monophosphate synthase | chr3 | -1.85 | 0.001
GNAI2 | guanine nucleotide binding protein (G protein), alpha inhibiting activity polypeptide 2 | chr3 | -1.63 | 0.001
GNAS | GNAS complex locus | chr20 | -1.86 | 0.001
GNB4 | guanine nucleotide binding protein (G protein), beta polypeptide 4 | chr3 | -1.80 | 0.010
GNG12 | guanine nucleotide binding protein (G protein), gamma 12 | chr1 | -1.63 | 0.005
GNG13 | guanine nucleotide binding protein (G protein), gamma 13 | chr16 | -1.80 | 0.010
GNPDA1 | glucosamine-6-phosphate deaminase 1 | chr5 | -2.98 | 0.000
GNPNAT1 | glucosamine-phosphate N-acetyltransferase 1 | chr14 | -2.18 | 0.000
GOLIM4 | golgi integral membrane protein 4 | chr3 | -1.67 | 0.001
GPALPP1 | GPALPP motifs containing 1 | chr13 | -1.64 | 0.005
GPC4 | glypican 4 | chrX | -2.43 | 0.000
GPR125 | G protein-coupled receptor 125 | chr4 | -1.70 | 0.001
GPR19 | G protein-coupled receptor 19 | chr12 | -3.06 | 0.000
GPSM2 | G-protein signaling modulator 2 | chr1 | -1.70 | 0.025
GPT | glutamic-pyruvate transaminase (alanine aminotransferase) | chr8 | -2.03 | 0.000
GRIN2C | glutamate receptor, ionotropic, N-methyl D-aspartate 2C | chr17 | -1.56 | 0.025
GRK4 | G protein-coupled receptor kinase 4 | chr4 | -1.55 | 0.005
GSDMD | gasdermin D | chr8 | -2.03 | 0.000
GTPBP1 | GTP binding protein 1 | chr22 | -1.76 | 0.001
GTSE1 | G-2 and S-phase expressed 1 | chr22 | -1.82 | 0.024
GXYLT2 | glucoside xylosyltransferase 2 | chr3 | -1.51 | 0.010
HADH | hydroxyacyl-CoA dehydrogenase | chr4 | -1.62 | 0.002
HAUS4 | HAUS augmin-like complex, subunit 4 | chr14 | -1.84 | 0.002
HAUS5 | HAUS augmin-like complex, subunit 5 | chr19 | -1.89 | 0.003
HBQ1 | hemoglobin, theta 1 | chr16 | -2.04 | 0.000
HCLS1 | hematopoietic cell-specific Lyn substrate 1 | chr3 | -1.56 | 0.023
HCP5 | HLA complex P5 (non-protein coding) | chr6 | -1.54 | 0.028
HCST | hematopoietic cell signal transducer | chr19 | -2.20 | 0.000
HDGFRP3 | Hepatoma-derived growth factor-related protein 3 | chr15 | -1.54 | 0.021
HEG1 | heart development protein with EGF-like domains 1 | chr3 | -1.56 | 0.020
HELLS | helicase, lymphoid-specific | chr10 | -2.00 | 0.005
HEPH | hephaestin | chrX | -1.63 | 0.004
HERC6 | HECT and RLD domain containing E3 ubiquitin protein ligase family member 6 | chr4 | -2.38 | 0.002
HES2 | hes family bHLH transcription factor 2 | chr1 | -1.56 | 0.003
HGSNAT | heparan-alpha-glucosaminide N-acetyltransferase | chr8 | -2.40 | 0.000
HIP1 | huntingtin interacting protein 1 | chr7 | -1.54 | 0.001
HIPK1 | homeodomain interacting protein kinase 1 | chr1 | -1.59 | 0.004
HIST1H1B | histone cluster 1, H1b | chr6 | -2.60 | 0.003


HIST1H2AI | histone cluster 1, H2ai | chr6 | -1.91 | 0.022
HIST1H3B | histone cluster 1, H3b | chr6 | -2.02 | 0.016
HIST1H3D | histone cluster 1, H3d | chr6 | -1.72 | 0.046
HIST1H4B | histone cluster 1, H4b | chr6 | -2.07 | 0.018
HIST1H4C | histone cluster 1, H4c | chr6 | -2.17 | 0.001
HJURP | Holliday junction recognition protein | chr2 | -2.09 | 0.001
HKR1 | HKR1, GLI-Kruppel zinc finger family member | chr19 | -1.52 | 0.001
HLA-DMA | major histocompatibility complex, class II, DM alpha | chr6 | -2.38 | 0.000
HLA-E | major histocompatibility complex, class I, E | chr6 | -1.73 | 0.018
HLA-F | major histocompatibility complex, class I, F | chr6 | -1.64 | 0.001
HLF | hepatic leukemia factor | chr17 | -1.77 | 0.036
HMGB2 | high mobility group box 2 | chr4 | -2.22 | 0.012
HMGN3 | high mobility group nucleosomal binding domain 3 | chr6 | -1.53 | 0.007
HMMR | hyaluronan-mediated motility receptor (RHAMM) | chr5 | -1.85 | 0.034
HNRNPUL2 | heterogeneous nuclear ribonucleoprotein U-like 2 | chr11 | -1.52 | 0.002
HOMER2 | homer homolog 2 (Drosophila) | chr15 | -1.62 | 0.005
HOXA10 | homeobox A10 | chr7 | -2.91 | 0.000
HOXA6 | homeobox A6 | chr7 | -1.67 | 0.050
HOXB3 | homeobox B3 | chr17 | -1.73 | 0.000
HOXB9 | homeobox B9 | chr17 | -1.67 | 0.022
HOXC13-AS | HOXC13 antisense RNA | chr12 | -1.59 | 0.026
HPGD | hydroxyprostaglandin dehydrogenase 15-(NAD) | chr4 | -1.67 | 0.027
HPS1 | Hermansky-Pudlak syndrome 1 | chr10 | -1.68 | 0.005
HSD17B2 | hydroxysteroid (17-beta) dehydrogenase 2 | chr16 | -1.57 | 0.010
HSD17B6 | hydroxysteroid (17-beta) dehydrogenase 6 | chr12 | -1.97 | 0.001
HSPA1L | heat shock 70kDa protein 1-like | chr6 | -1.54 | 0.003
HSPA5 | heat shock 70kDa protein 5 (glucose-regulated protein, 78kDa) | chr9 | -1.78 | 0.003
HTR2C | 5-hydroxytryptamine (serotonin) receptor 2C, G protein-coupled | chrX | -1.75 | 0.002
HTRA1 | HtrA serine peptidase 1 | chr10 | -2.48 | 0.001
HYKK | hydroxylysine kinase | chr15 | -2.70 | 0.000
ICAM3 | intercellular adhesion molecule 3 | chr19 | -1.61 | 0.002
ICMT | isoprenylcysteine carboxyl methyltransferase | chr1 | -2.52 | 0.000
ICT1 | immature colon carcinoma transcript 1 | chr17 | -1.73 | 0.000
IDI2-AS1 | IDI2 antisense RNA 1 | chr10 | -1.55 | 0.007
IDNK | idnK, gluconokinase homolog (E. coli) | chr9 | -2.12 | 0.000
IER5 | immediate early response 5 | chr1 | -1.51 | 0.006
IFITM10 | interferon induced transmembrane protein 10 | chr11 | -1.53 | 0.044
IFT46 | intraflagellar transport 46 | chr11 | -1.70 | 0.000
IFT80 | intraflagellar transport 80 | chr3 | -1.68 | 0.002
IKBIP | IKBKB interacting protein | chr12 | -2.62 | 0.000
IL13RA1 | interleukin 13 receptor, alpha 1 | chrX | -1.59 | 0.002
IL17RB | interleukin 17 receptor B | chr3 | -1.60 | 0.016
IL1B | interleukin 1, beta | chr2 | -1.52 | 0.008
IL1R1 | interleukin 1 receptor, type I | chr2 | -2.96 | 0.000
IL23A | interleukin 23, alpha subunit p19 | chr12 | -1.53 | 0.002
IL27RA | interleukin 27 receptor, alpha | chr19 | -1.88 | 0.030
ILF3 | interleukin enhancer binding factor 3, 90kDa | chr19 | -1.55 | 0.005
INPP5D | inositol polyphosphate-5-phosphatase, 145kDa | chr2 | -1.68 | 0.015
INTU | inturned planar cell polarity protein | chr4 | -1.62 | 0.001
IQCD | IQ motif containing D | chr12 | -1.82 | 0.001
IQCH-AS1 | IQCH antisense RNA 1 | chr15 | -1.65 | 0.001
IQCK | IQ motif containing K | chr16 | -2.05 | 0.001
IQGAP3 | IQ motif containing GTPase activating protein 3 | chr1 | -1.51 | 0.047
IQSEC1 | IQ motif and Sec7 domain 1 | chr3 | -2.16 | 0.001
IRX4 | iroquois homeobox 4 | chr5 | -6.88 | 0.000
ITGA1 | integrin, alpha 1 | chr5 | -1.72 | 0.002


ITGA2 | integrin, alpha 2 (CD49B, alpha 2 subunit of VLA-2 receptor) | chr5 | -1.61 | 0.002
ITGB3BP | integrin beta 3 binding protein (beta3-endonexin) | chr1 | -2.73 | 0.000
ITGB4 | integrin, beta 4 | chr17 | -2.26 | 0.001
ITPKA | inositol-trisphosphate 3-kinase A | chr15 | -1.52 | 0.016
ITPR3 | inositol 1,4,5-trisphosphate receptor, type 3 | chr6 | -1.76 | 0.006
JAM3 | junctional adhesion molecule 3 | chr11 | -1.64 | 0.000
KANK2 | KN motif and ankyrin repeat domains 2 | chr19 | -1.63 | 0.004
KAZALD1 | Kazal-type serine peptidase inhibitor domain 1 | chr10 | -2.00 | 0.000
KB-1732A1.1 |  | chr8 | -1.79 | 0.000
KB-431C1.4 |  | chr8 | -1.56 | 0.001
KCNAB2 | potassium voltage-gated channel, shaker-related subfamily, beta member 2 | chr1 | -2.17 | 0.000
KCNQ1 | potassium voltage-gated channel, KQT-like subfamily, member 1 | chr11 | -1.73 | 0.024
KCNQ2 | potassium voltage-gated channel, KQT-like subfamily, member 2 | chr20 | -1.53 | 0.004
KCTD3 | potassium channel tetramerization domain containing 3 | chr1 | -1.71 | 0.014
KDELR1 | KDEL (Lys-Asp-Glu-Leu) endoplasmic reticulum protein retention receptor 1 | chr19 | -1.77 | 0.000
KDELR2 | KDEL (Lys-Asp-Glu-Leu) endoplasmic reticulum protein retention receptor 2 | chr7 | -1.93 | 0.001
KDM4A | lysine (K)-specific demethylase 4A | chr1 | -1.71 | 0.002
KHDC1 | KH homology domain containing 1 | chr6 | -1.64 | 0.000
KIAA0040 | KIAA0040 | chr1 | -1.57 | 0.002
KIAA0247 | KIAA0247 | chr14 | -1.61 | 0.010
KIAA1217 | KIAA1217 | chr10 | -1.53 | 0.027
KIAA1524 | KIAA1524 | chr3 | -1.83 | 0.013
KIAA1549 | KIAA1549 | chr7 | -1.67 | 0.001
KIF11 | kinesin family member 11 | chr10 | -2.37 | 0.003
KIF15 | kinesin family member 15 | chr3 | -2.22 | 0.013
KIF16B | kinesin family member 16B | chr20 | -1.88 | 0.000
KIF18B | kinesin family member 18B | chr17 | -2.05 | 0.014
KIF1C | kinesin family member 1C | chr17 | -3.11 | 0.000
KIF20A | kinesin family member 20A | chr5 | -1.90 | 0.012
KIF20B | kinesin family member 20B | chr10 | -1.81 | 0.014
KIF21B | kinesin family member 21B | chr1 | -1.50 | 0.009
KIF23 | kinesin family member 23 | chr15 | -1.77 | 0.020
KIF24 | kinesin family member 24 | chr9 | -3.37 | 0.000
KIF2C | kinesin family member 2C | chr1 | -1.75 | 0.036
KIFC1 | kinesin family member C1 | chr6 | -1.98 | 0.022
KLF10 | Kruppel-like factor 10 | chr8 | -1.76 | 0.028
KLF13 | Kruppel-like factor 13 | chr15 | -1.51 | 0.008
KLHDC2 | kelch domain containing 2 | chr14 | -1.68 | 0.001
KLHDC9 | kelch domain containing 9 | chr1 | -1.56 | 0.012
KLHL24 | kelch-like family member 24 | chr3 | -2.03 | 0.002
KLHL42 | kelch-like family member 42 | chr12 | -1.82 | 0.000
KLHL5 | kelch-like family member 5 | chr4 | -2.07 | 0.001
KLK1 | kallikrein 1 | chr19 | -2.05 | 0.025
KLK2 | kallikrein-related peptidase 2 | chr19 | -2.86 | 0.006
KLK3 | kallikrein-related peptidase 3 | chr19 | -3.20 | 0.005
KLK4 | kallikrein-related peptidase 4 | chr19 | -1.53 | 0.002
KLLN | killin, p53-regulated DNA replication inhibitor | chr10 | -1.82 | 0.000
KMT2E | lysine (K)-specific methyltransferase 2E | chr7 | -1.54 | 0.019
KNSTRN | kinetochore-localized astrin/SPAG5 binding protein | chr15 | -2.22 | 0.000
KNTC1 | kinetochore associated 1 | chr12 | -1.74 | 0.021
KREMEN2 | kringle containing transmembrane protein 2 | chr16 | -1.85 | 0.034
KRT20 | keratin 20 | chr17 | -1.59 | 0.014
KSR2 | kinase suppressor of ras 2 | chr12 | -1.50 | 0.003


KXD1 | KxDL motif containing 1 | chr19 | -1.80 | 0.013
LCLAT1 | lysocardiolipin acyltransferase 1 | chr2 | -4.90 | 0.000
LCP1 | lymphocyte cytosolic protein 1 (L-plastin) | chr13 | -1.55 | 0.001
LDLRAD1 | low density lipoprotein receptor class A domain containing 1 | chr1 | -1.55 | 0.047
LGALS4 | lectin, galactoside-binding, soluble, 4 | chr19 | -2.30 | 0.003
LGI2 | leucine-rich repeat LGI family, member 2 | chr4 | -1.89 | 0.005
LIFR-AS1 | LIFR antisense RNA 1 | chr5 | -1.51 | 0.003
LIG1 | ligase I, DNA, ATP-dependent | chr19 | -1.69 | 0.020
LIG3 | ligase III, DNA, ATP-dependent | chr17 | -1.60 | 0.002
LIMK1 | LIM domain kinase 1 | chr7 | -1.74 | 0.000
LIN7A | lin-7 homolog A (C. elegans) | chr12 | -1.56 | 0.050
LIN9 | lin-9 DREAM MuvB core complex component | chr1 | -1.86 | 0.004
LINC00471 | long intergenic non-protein coding RNA 471 | chr2 | -1.68 | 0.001
LINC00886 | long intergenic non-protein coding RNA 886 | chr3 | -1.57 | 0.006
LINC01125 | long intergenic non-protein coding RNA 1125 | chr2 | -1.54 | 0.014
LINC01133 | long intergenic non-protein coding RNA 1133 | chr1 | -3.02 | 0.011
LL22NC03-N27C7.1 |  | chr22 | -1.50 | 0.002
LMAN1 | lectin, mannose-binding, 1 | chr18 | -2.79 | 0.000
LMF2 | lipase maturation factor 2 | chr22 | -1.68 | 0.002
LMNA | lamin A/C | chr1 | -1.51 | 0.007
LMNB2 | lamin B2 | chr19 | -2.04 | 0.000
LONP2 | lon peptidase 2, peroxisomal | chr16 | -2.05 | 0.000
LONRF2 | LON peptidase N-terminal domain and ring finger 2 | chr2 | -2.06 | 0.000
LOXL1 | lysyl oxidase-like 1 | chr15 | -1.53 | 0.008
LOXL3 | lysyl oxidase-like 3 | chr2 | -1.70 | 0.010
LPGAT1 | lysophosphatidylglycerol acyltransferase 1 | chr1 | -1.94 | 0.001
LRG1 | leucine-rich alpha-2-glycoprotein 1 | chr19 | -1.94 | 0.002
LRR1 | leucine rich repeat protein 1 | chr14 | -2.54 | 0.000
LRRC23 | leucine rich repeat containing 23 | chr12 | -1.69 | 0.000
LRRC48 | leucine rich repeat containing 48 | chr17 | -1.77 | 0.001
LRRC8B | leucine rich repeat containing 8 family, member B | chr1 | -1.76 | 0.000
LRRCC1 | leucine rich repeat and coiled-coil centrosomal protein 1 | chr8 | -2.06 | 0.002
LRTOMT | leucine rich transmembrane and O-methyltransferase domain containing | chr11 | -1.61 | 0.001
LUZP1 | leucine zipper protein 1 | chr1 | -1.54 | 0.011
LYPD8 | LY6/PLAUR domain containing 8 | chr1 | -1.81 | 0.000
MAML1 | mastermind-like 1 (Drosophila) | chr5 | -1.65 | 0.000
MAP1A | microtubule-associated protein 1A | chr15 | -1.53 | 0.032
MAP1LC3B | microtubule-associated protein 1 light chain 3 beta | chr16 | -2.42 | 0.001
MAP3K12 | mitogen-activated protein kinase kinase kinase 12 | chr12 | -1.88 | 0.000
MAP3K6 | mitogen-activated protein kinase kinase kinase 6 | chr1 | -1.63 | 0.001
MAP3K8 | mitogen-activated protein kinase kinase kinase 8 | chr10 | -1.61 | 0.013
MAP4K4 | mitogen-activated protein kinase kinase kinase kinase 4 | chr2 | -1.96 | 0.000
MAP7 | microtubule-associated protein 7 | chr6 | -1.56 | 0.006
MAP7D3 | MAP7 domain containing 3 | chrX | -1.55 | 0.003
MAPK12 | mitogen-activated protein kinase 12 | chr22 | -1.73 | 0.002
MAPK8 | mitogen-activated protein kinase 8 | chr10 | -1.76 | 0.000
MAPKAPK3 | mitogen-activated protein kinase-activated protein kinase 3 | chr3 | -2.20 | 0.000
MARCKS | myristoylated alanine-rich protein kinase C substrate | chr6 | -2.30 | 0.000
MARVELD1 | MARVEL domain containing 1 | chr10 | -1.71 | 0.003
MASTL | microtubule associated serine/threonine kinase-like | chr10 | -1.68 | 0.026
MAVS | mitochondrial antiviral signaling protein | chr20 | -1.55 | 0.005
MAX | MYC associated factor X | chr14 | -1.79 | 0.001
MBD3 | methyl-CpG binding domain protein 3 | chr19 | -1.65 | 0.001
MBLAC2 | metallo-beta-lactamase domain containing 2 | chr5 | -1.72 | 0.004
MBNL1-AS1 | MBNL1 antisense RNA 1 | chr3 | -1.81 | 0.001


MBNL2 | muscleblind-like splicing regulator 2 | chr13 | -1.96 | 0.022
MBNL3 | muscleblind-like splicing regulator 3 | chrX | -1.64 | 0.001
MBOAT7 | membrane bound O-acyltransferase domain containing 7 | chr19 | -1.97 | 0.002
MCCC2 | methylcrotonoyl-CoA carboxylase 2 (beta) | chr5 | -1.56 | 0.042
MCF2L2 | MCF.2 cell line derived transforming sequence-like 2 | chr3 | -1.52 | 0.004
MCM10 | minichromosome maintenance complex component 10 | chr10 | -2.15 | 0.008
MCM2 | minichromosome maintenance complex component 2 | chr3 | -1.66 | 0.043
MCM3 | minichromosome maintenance complex component 3 | chr6 | -2.32 | 0.003
MCM4 | minichromosome maintenance complex component 4 | chr8 | -2.00 | 0.009
MCM5 | minichromosome maintenance complex component 5 | chr22 | -1.62 | 0.044
MCM6 | minichromosome maintenance complex component 6 | chr2 | -1.56 | 0.038
MCM8 | minichromosome maintenance complex component 8 | chr20 | -1.82 | 0.004
MCMBP | minichromosome maintenance complex binding protein | chr10 | -1.50 | 0.013
MDC1 | mediator of DNA-damage checkpoint 1 | chr6 | -1.51 | 0.034
MDK | midkine (neurite growth-promoting factor 2) | chr11 | -1.98 | 0.002
MDM1 | Mdm1 nuclear protein homolog (mouse) | chr12 | -1.62 | 0.010
MED12 | mediator complex subunit 12 | chrX | -1.54 | 0.001
MEF2B | myocyte enhancer factor 2B | chr19 | -1.66 | 0.000
MELK | maternal embryonic leucine zipper kinase | chr9 | -1.98 | 0.030
METTL21B | methyltransferase like 21B | chr12 | -1.57 | 0.002
METTL9 | methyltransferase like 9 | chr16 | -2.02 | 0.000
MFAP3L | microfibrillar-associated protein 3-like | chr4 | -1.65 | 0.002
MFSD10 | major facilitator superfamily domain containing 10 | chr4 | -1.99 | 0.000
MFSD2A | major facilitator superfamily domain containing 2A | chr1 | -1.69 | 0.046
MFSD6 | major facilitator superfamily domain containing 6 | chr2 | -1.72 | 0.017
MFSD6L | major facilitator superfamily domain containing 6-like | chr17 | -1.76 | 0.027
MGLL | monoglyceride lipase | chr3 | -2.65 | 0.000
MGME1 | mitochondrial genome maintenance exonuclease 1 | chr20 | -1.61 | 0.009
MGP | matrix Gla protein | chr12 | -1.71 | 0.004
MICAL2 | microtubule associated monooxygenase, calponin and LIM domain containing 2 | chr11 | -1.80 | 0.001
MICALCL | MICAL C-terminal like | chr11 | -1.70 | 0.003
MIF4GD | MIF4G domain containing | chr17 | -6.00 | 0.000
MIR4674HG | MIR4674 host gene (non-protein coding) | chr9 | -1.52 | 0.001
MIS18BP1 | MIS18 binding protein 1 | chr14 | -1.61 | 0.009
MKI67 | marker of proliferation Ki-67 | chr10 | -2.13 | 0.008
MKNK2 | MAP kinase interacting serine/threonine kinase 2 | chr19 | -2.00 | 0.001
MLK4 | Mitogen-activated protein kinase kinase kinase MLK4 | chr1 | -2.40 | 0.000
MLLT3 | myeloid/lymphoid or mixed-lineage leukemia (trithorax homolog, Drosophila); translocated to, 3 | chr9 | -1.62 | 0.005
MLLT6 | myeloid/lymphoid or mixed-lineage leukemia (trithorax homolog, Drosophila); translocated to, 6 | chr17 | -1.51 | 0.002
MME | membrane metallo-endopeptidase | chr3 | -1.77 | 0.000
MMP24 | matrix metallopeptidase 24 (membrane-inserted) | chr20 | -1.75 | 0.024
MMP7 | matrix metallopeptidase 7 (matrilysin, uterine) | chr11 | -2.80 | 0.001
MMS22L | MMS22-like, DNA repair protein | chr6 | -1.77 | 0.025
MND1 | meiotic nuclear divisions 1 homolog (S. cerevisiae) | chr4 | -2.13 | 0.013
MOK | MOK protein kinase | chr14 | -1.54 | 0.024
MORC4 | MORC family CW-type zinc finger 4 | chrX | -1.50 | 0.010
MOSPD1 | motile sperm domain containing 1 | chrX | -1.50 | 0.001
MOV10 | Mov10 RISC complex RNA helicase | chr1 | -1.55 | 0.014
MPHOSPH9 | M-phase phosphoprotein 9 | chr12 | -1.70 | 0.004
MPP7 | membrane protein, palmitoylated 7 (MAGUK p55 subfamily member 7) | chr10 | -2.16 | 0.002
MR1 | major histocompatibility complex, class I-related | chr1 | -2.21 | 0.001
MROH8 | maestro heat-like repeat family member 8 | chr20 | -1.58 | 0.004
MRPL11 | mitochondrial ribosomal protein L11 | chr11 | -2.48 | 0.000
MRPL51 | mitochondrial ribosomal protein L51 | chr12 | -1.76 | 0.045


MRPS11 | mitochondrial ribosomal protein S11 | chr15 | -1.97 | 0.000
MSH5 | mutS homolog 5 | chr6 | -1.62 | 0.002
MSMB | microseminoprotein, beta- | chr10 | -1.79 | 0.001
MSRB2 | methionine sulfoxide reductase B2 | chr10 | -1.56 | 0.001
MSRB3 | methionine sulfoxide reductase B3 | chr12 | -1.82 | 0.001
MST1P2 | macrophage stimulating 1 (hepatocyte growth factor-like) pseudogene 2 | chr1 | -1.67 | 0.001
MTBP | MDM2 binding protein | chr8 | -1.62 | 0.023
MTCL1 | microtubule crosslinking factor 1 | chr18 | -1.70 | 0.001
MTFR1L | mitochondrial fission regulator 1-like | chr1 | -2.70 | 0.000
MTFR2 | mitochondrial fission regulator 2 | chr6 | -1.87 | 0.028
MTURN | maturin, neural progenitor differentiation regulator homolog (Xenopus) | chr7 | -1.64 | 0.002
MUC20 | mucin 20, cell surface associated | chr3 | -1.53 | 0.039
MUTYH | mutY homolog | chr1 | -1.50 | 0.006
MYB | v-myb avian myeloblastosis viral oncogene homolog | chr6 | -1.69 | 0.005
MYBL1 | v-myb avian myeloblastosis viral oncogene homolog-like 1 | chr8 | -1.64 | 0.047
MYBL2 | v-myb avian myeloblastosis viral oncogene homolog-like 2 | chr20 | -2.75 | 0.001
MYD88 | myeloid differentiation primary response 88 | chr3 | -1.86 | 0.004
MYH3 | myosin, heavy chain 3, skeletal muscle, embryonic | chr17 | -1.51 | 0.008
MYL9 | myosin, light chain 9, regulatory | chr20 | -1.75 | 0.000
MYLPF | myosin light chain, phosphorylatable, fast skeletal muscle | chr16 | -1.63 | 0.001
MYO19 | myosin XIX | chr17 | -1.50 | 0.019
MYO1D | myosin ID | chr17 | -2.33 | 0.000
MYO1E | myosin IE | chr15 | -2.27 | 0.001
MYO5C | myosin VC | chr15 | -1.69 | 0.001
MYOF | myoferlin | chr10 | -1.88 | 0.030
NASP | nuclear autoantigenic sperm protein (histone-binding) | chr1 | -1.71 | 0.003
NCAPD2 | non-SMC condensin I complex, subunit D2 | chr12 | -2.26 | 0.002
NCAPD3 | non-SMC condensin II complex, subunit D3 | chr11 | -1.61 | 0.011
NCAPG | non-SMC condensin I complex, subunit G | chr4 | -2.08 | 0.003
NCAPH | non-SMC condensin I complex, subunit H | chr2 | -1.62 | 0.024
NDE1 | nudE neurodevelopment protein 1 | chr16 | -1.52 | 0.007
NEB | nebulin | chr2 | -1.60 | 0.011
NEDD4L | neural precursor cell expressed, developmentally down-regulated 4-like, E3 ubiquitin protein ligase | chr18 | -1.77 | 0.048
NEIL1 | nei endonuclease VIII-like 1 (E. coli) | chr15 | -1.62 | 0.003
NEIL3 | nei endonuclease VIII-like 3 (E. coli) | chr4 | -1.67 | 0.042
NEK4 | NIMA-related kinase 4 | chr3 | -1.79 | 0.000
NEO1 | neogenin 1 | chr15 | -1.70 | 0.000
NET1 | neuroepithelial cell transforming 1 | chr10 | -1.56 | 0.006
NEU4 | sialidase 4 | chr2 | -2.00 | 0.008
NEURL1B | neuralized E3 ubiquitin protein ligase 1B | chr5 | -1.80 | 0.011
NFATC3 | nuclear factor of activated T-cells, cytoplasmic, calcineurin-dependent 3 | chr16 | -2.36 | 0.000
NFIB | nuclear factor I/B | chr9 | -1.54 | 0.049
NFIC | nuclear factor I/C (CCAAT-binding transcription factor) | chr19 | -1.61 | 0.017
NICN1 | nicolin 1 | chr3 | -2.00 | 0.000
NIN | ninein (GSK3B interacting protein) | chr14 | -3.08 | 0.000
NINJ1 | ninjurin 1 | chr9 | -1.89 | 0.001
NIPAL2 | NIPA-like domain containing 2 | chr8 | -2.42 | 0.010
NIPAL3 | NIPA-like domain containing 3 | chr1 | -1.80 | 0.000
NKD2 | naked cuticle homolog 2 (Drosophila) | chr5 | -1.57 | 0.007
NLGN2 | neuroligin 2 | chr17 | -1.66 | 0.001
NMB | neuromedin B | chr15 | -1.72 | 0.001
NMRK1 | nicotinamide riboside kinase 1 | chr9 | -1.69 | 0.000
NMU | neuromedin U | chr4 | -2.13 | 0.044
NOA1 | nitric oxide associated 1 | chr4 | -1.78 | 0.000


NOD1 | nucleotide-binding oligomerization domain containing 1 | chr7 | -1.80 | 0.001
NOSTRIN | nitric oxide synthase trafficking | chr2 | -1.60 | 0.036
NPDC1 | neural proliferation, differentiation and control, 1 | chr9 | -1.52 | 0.016
NPNT | nephronectin | chr4 | -1.52 | 0.002
NR2C1 | nuclear receptor subfamily 2, group C, member 1 | chr12 | -2.11 | 0.000
NRBP2 | nuclear receptor binding protein 2 | chr8 | -1.74 | 0.004
NRM | nurim (nuclear envelope membrane protein) | chr6 | -1.54 | 0.027
NSA2 | NSA2 ribosome biogenesis homolog (S. cerevisiae) | chr5 | -4.31 | 0.000
NSL1 | NSL1, MIS12 kinetochore complex component | chr1 | -1.80 | 0.002
NTAN1 | N-terminal asparagine amidase | chr16 | -1.73 | 0.003
NTF4 | neurotrophin 4 | chr19 | -1.90 | 0.001
NTN5 | netrin 5 | chr19 | -2.10 | 0.000
NUAK2 | NUAK family, SNF1-like kinase, 2 | chr1 | -1.85 | 0.001
NUCB2 | nucleobindin 2 | chr11 | -1.65 | 0.012
NUDT5 | nudix (nucleoside diphosphate linked moiety X)-type motif 5 | chr10 | -1.64 | 0.011
NUF2 | NUF2, NDC80 kinetochore complex component | chr1 | -1.90 | 0.044
NUP214 | nucleoporin 214kDa | chr9 | -1.66 | 0.001
NUP54 | nucleoporin 54kDa | chr4 | -1.83 | 0.000
NUP62 | nucleoporin 62kDa | chr19 | -1.56 | 0.002
NUP62CL | nucleoporin 62kDa C-terminal like | chrX | -1.71 | 0.006
NWD1 | NACHT and WD repeat domain containing 1 | chr19 | -2.67 | 0.000
OIP5 | Opa interacting protein 5 | chr15 | -1.63 | 0.048
ONECUT1 | one cut homeobox 1 | chr15 | -1.51 | 0.022
ONECUT2 | one cut homeobox 2 | chr18 | -3.42 | 0.000
ONECUT3 | one cut homeobox 3 | chr19 | -2.06 | 0.003
ORAI2 | ORAI calcium release-activated calcium modulator 2 | chr7 | -1.73 | 0.025
ORAI3 | ORAI calcium release-activated calcium modulator 3 | chr16 | -1.93 | 0.000
ORC1 | origin recognition complex, subunit 1 | chr1 | -1.88 | 0.033
ORC6 | origin recognition complex, subunit 6 | chr16 | -2.23 | 0.003
OSBP2 | oxysterol binding protein 2 | chr22 | -1.54 | 0.002
OSBPL10 | oxysterol binding protein-like 10 | chr3 | -1.55 | 0.001
OSBPL3 | oxysterol binding protein-like 3 | chr7 | -1.63 | 0.006
OSTC | oligosaccharyltransferase complex subunit (non-catalytic) | chr4 | -1.59 | 0.001
OSTCP1 | oligosaccharyltransferase complex subunit pseudogene 1 | chr6 | -1.57 | 0.001
P2RX7 | purinergic receptor P2X, ligand-gated ion channel, 7 | chr12 | -2.01 | 0.032
PABPC4 | poly(A) binding protein, cytoplasmic 4 (inducible form) | chr1 | -1.59 | 0.001
PACSIN1 | protein kinase C and casein kinase substrate in neurons 1 | chr6 | -1.64 | 0.002
PADI2 | peptidyl arginine deiminase, type II | chr1 | -1.67 | 0.014
PALLD | palladin, cytoskeletal associated protein | chr4 | -2.09 | 0.000
PALMD | palmdelphin | chr1 | -2.09 | 0.002
PAQR4 | progestin and adipoQ receptor family member IV | chr16 | -2.18 | 0.003
PARP2 | poly (ADP-ribose) polymerase 2 | chr14 | -1.54 | 0.007
PARP3 | poly (ADP-ribose) polymerase family, member 3 | chr3 | -1.58 | 0.006
PARP6 | poly (ADP-ribose) polymerase family, member 6 | chr15 | -1.70 | 0.040
PASK | PAS domain containing serine/threonine kinase | chr2 | -1.88 | 0.006
PCDHB6 | protocadherin beta 6 | chr5 | -1.55 | 0.011
PCGF5 | polycomb group ring finger 5 | chr10 | -1.69 | 0.018
PCNA | proliferating cell nuclear antigen | chr20 | -2.12 | 0.008
PCNP | PEST proteolytic signal containing nuclear protein | chr3 | -1.57 | 0.005
PCNXL2 | pecanex-like 2 (Drosophila) | chr1 | -2.41 | 0.000
PDE4A | phosphodiesterase 4A, cAMP-specific | chr19 | -1.56 | 0.031
PDE7A | phosphodiesterase 7A | chr8 | -1.60 | 0.000
PDGFA | platelet-derived growth factor alpha polypeptide | chr7 | -1.68 | 0.004
PDGFRL | platelet-derived growth factor receptor-like | chr8 | -1.63 | 0.035
PDLIM1 | PDZ and LIM domain 1 | chr10 | -1.89 | 0.012
PDP1 | pyruvate dehydrogenase phosphatase catalytic subunit 1 | chr8 | -1.63 | 0.001

Symbol  Gene Description  Chr  Fold Change  p-Value
PDPR  pyruvate dehydrogenase phosphatase regulatory subunit  chr16  -1.87  0.001
PDZD2  PDZ domain containing 2  chr5  -1.64  0.002
PDZK1IP1  PDZK1 interacting protein 1  chr1  -1.77  0.009
PFKFB3  6-phosphofructo-2-kinase/fructose-2,6-biphosphatase 3  chr10  -2.32  0.000
PFN2  profilin 2  chr3  -1.90  0.001
PFN4  profilin family, member 4  chr2  -1.55  0.022
PGP  phosphoglycolate phosphatase  chr16  -1.51  0.046
PHACTR4  phosphatase and actin regulator 4  chr1  -1.55  0.004
PHC3  polyhomeotic homolog 3 (Drosophila)  chr3  -1.50  0.030
PHF10  PHD finger protein 10  chr6  -1.77  0.001
PHF19  PHD finger protein 19  chr9  -1.98  0.003
PHF8  PHD finger protein 8  chrX  -1.65  0.000
PHGR1  proline/glycine-rich 1  chr15  -2.61  0.030
PHIP  pleckstrin homology domain interacting protein  chr6  -1.58  0.011
PHKG2  phosphorylase kinase, gamma 2 (testis)  chr16  -1.69  0.014
PIF1  PIF1 5'-to-3' DNA helicase  chr15  -1.67  0.027
PIGP  phosphatidylinositol glycan anchor biosynthesis, class P  chr21  -2.06  0.001
PIGS  phosphatidylinositol glycan anchor biosynthesis, class S  chr17  -1.68  0.000
PIGZ  phosphatidylinositol glycan anchor biosynthesis, class Z  chr3  -2.31  0.000
PIH1D2  PIH1 domain containing 2  chr11  -1.85  0.023
PIK3R5  phosphoinositide-3-kinase, regulatory subunit 5  chr17  -1.52  0.018
PITPNC1  phosphatidylinositol transfer protein, cytoplasmic 1  chr17  -1.52  0.017
PKMYT1  protein kinase, membrane associated tyrosine/threonine 1  chr16  -1.86  0.014
PLCB4  phospholipase C, beta 4  chr20  -2.46  0.000
PLCH1  phospholipase C, eta 1  chr3  -1.66  0.002
PLEKHA6  pleckstrin homology domain containing, family A member 6  chr1  -1.89  0.004
PLK1  polo-like kinase 1  chr16  -1.86  0.025
PLK2  polo-like kinase 2  chr5  -2.99  0.000
PLOD1  procollagen-lysine, 2-oxoglutarate 5-dioxygenase 1  chr1  -1.57  0.001
PLP2  proteolipid protein 2 (colonic epithelium-enriched)  chrX  -2.30  0.000
PNKD  paroxysmal nonkinesigenic dyskinesia  chr2  -1.83  0.000
PNMA2  paraneoplastic Ma antigen 2  chr8  -1.67  0.035
PNP  purine nucleoside phosphorylase  chr14  -1.70  0.010
POLA2  polymerase (DNA directed), alpha 2, accessory subunit  chr11  -1.66  0.028
POLD1  polymerase (DNA directed), delta 1, catalytic subunit  chr19  -1.57  0.027
POLD3  polymerase (DNA-directed), delta 3, accessory subunit  chr11  -1.62  0.027
POLE2  polymerase (DNA directed), epsilon 2, accessory subunit  chr14  -2.06  0.015
POLR2J4  polymerase (RNA) II (DNA directed) polypeptide J4, pseudogene  chr7  -1.85  0.001
POLR3H  polymerase (RNA) III (DNA directed) polypeptide H (22.9kD)  chr22  -1.53  0.002
PORCN  porcupine homolog (Drosophila)  chrX  -3.24  0.000
PPAP2A  phosphatidic acid phosphatase type 2A  chr5  -3.71  0.000
PPDPF  pancreatic progenitor cell differentiation and proliferation factor  chr20  -2.68  0.000
PPIL2  peptidylprolyl isomerase (cyclophilin)-like 2  chr22  -1.68  0.002
PPP1R12B  protein phosphatase 1, regulatory subunit 12B  chr1  -1.80  0.000
PPP1R14C  protein phosphatase 1, regulatory (inhibitor) subunit 14C  chr6  -3.05  0.000
PPP1R26-AS1  PPP1R26 antisense RNA 1  chr9  -1.51  0.006
PPP1R3E  protein phosphatase 1, regulatory subunit 3E  chr14  -2.23  0.000
PPP2R2C  protein phosphatase 2, regulatory subunit B, gamma  chr4  -1.51  0.002
PPP4R4  protein phosphatase 4, regulatory subunit 4  chr14  -1.52  0.018
PRAC2  prostate cancer susceptibility candidate 2  chr17  -2.89  0.001
PRC1  protein regulator of cytokinesis 1  chr15  -1.75  0.012
PRCP  prolylcarboxypeptidase (angiotensinase C)  chr11  -3.27  0.000
PRIM1  primase, DNA, polypeptide 1 (49kDa)  chr12  -1.86  0.027
PRIMA1  proline rich membrane anchor 1  chr14  -1.76  0.007

Symbol  Gene Description  Chr  Fold Change  p-Value
PRKAA2  protein kinase, AMP-activated, alpha 2 catalytic subunit  chr1  -1.73  0.000
PRKAG2-AS1  PRKAG2 antisense RNA 1  chr7  -1.55  0.014
PRKAR2B  protein kinase, cAMP-dependent, regulatory, type II, beta  chr7  -1.64  0.001
PRNP  prion protein  chr20  -2.06  0.000
PROCR  protein C receptor, endothelial  chr20  -2.36  0.000
PROSER1  proline and serine rich 1  chr13  -2.19  0.000
PRPH  peripherin  chr12  -2.27  0.009
PRR14  proline rich 14  chr16  -2.05  0.000
PRRG4  proline rich Gla (G-carboxyglutamic acid) 4 (transmembrane)  chr11  -1.69  0.018
PRRT2  proline-rich transmembrane protein 2  chr16  -1.92  0.001
PRTG  protogenin  chr15  -1.62  0.004
PSIP1  PC4 and SFRS1 interacting protein 1  chr9  -1.53  0.031
PSMC3IP  PSMC3 interacting protein  chr17  -1.71  0.021
PSMD7  proteasome (prosome, macropain) 26S subunit, non-ATPase, 7  chr16  -1.98  0.000
PSMG2  proteasome (prosome, macropain) assembly chaperone 2  chr18  -2.14  0.000
PSMG3-AS1  PSMG3 antisense RNA 1 (head to head)  chr7  -1.85  0.002
PTCRA  pre T-cell antigen receptor alpha  chr6  -1.64  0.006
PTDSS1  phosphatidylserine synthase 1  chr8  -1.62  0.002
PTEN  phosphatase and tensin homolog  chr10  -1.64  0.000
PTGFRN  prostaglandin F2 receptor inhibitor  chr1  -1.97  0.000
PTGS1  prostaglandin-endoperoxide synthase 1 (prostaglandin G/H synthase and cyclooxygenase)  chr9  -1.55  0.011
PTH1R  parathyroid hormone 1 receptor  chr3  -1.68  0.004
PTPDC1  protein tyrosine phosphatase domain containing 1  chr9  -1.66  0.001
PTPLB  protein tyrosine phosphatase-like (proline instead of catalytic arginine), member b  chr3  -1.54  0.000
PTPRA  protein tyrosine phosphatase, receptor type, A  chr20  -1.82  0.000
PURA  purine-rich element binding protein A  chr5  -1.62  0.039
PUSL1  pseudouridylate synthase-like 1  chr1  -1.53  0.017
PXDC1  PX domain containing 1  chr6  -1.54  0.027
PYGL  phosphorylase, glycogen, liver  chr14  -1.69  0.001
PYGO1  pygopus family PHD finger 1  chr15  -1.53  0.007
QTRTD1  queuine tRNA-ribosyltransferase domain containing 1  chr3  -1.98  0.000
RAB19  RAB19, member RAS oncogene family  chr7  -1.87  0.000
RAB26  RAB26, member RAS oncogene family  chr16  -1.53  0.001
RAB35  RAB35, member RAS oncogene family  chr12  -2.08  0.000
RAB40B  RAB40B, member RAS oncogene family  chr17  -1.76  0.000
RAB9B  RAB9B, member RAS oncogene family  chrX  -1.99  0.000
RACGAP1  Rac GTPase activating protein 1  chr12  -1.73  0.018
RAD23B  RAD23 homolog B (S. cerevisiae)  chr9  -2.43  0.000
RAD51  RAD51 recombinase  chr15  -2.25  0.002
RAD51AP1  RAD51 associated protein 1  chr12  -1.92  0.028
RAD51B  RAD51 paralog B  chr14  -1.66  0.013
RAD54B  RAD54 homolog B (S. cerevisiae)  chr8  -1.65  0.015
RAD9B  RAD9 homolog B (S. pombe)  chr12  -1.52  0.016
RAI14  retinoic acid induced 14  chr5  -1.75  0.000
RALGPS1  Ral GEF with PH domain and SH3 binding motif 1  chr9  -1.61  0.025
RAP1B  RAP1B, member of RAS oncogene family  chr12  -1.80  0.003
RAP2A  RAP2A, member of RAS oncogene family  chr13  -1.72  0.002
RAPGEF3  Rap guanine nucleotide exchange factor (GEF) 3  chr12  -1.56  0.030
RARG  retinoic acid receptor, gamma  chr12  -1.63  0.001
RASAL3  RAS protein activator like 3  chr19  -1.54  0.002
RASL11A  RAS-like, family 11, member A  chr13  -1.74  0.000
RAVER1  ribonucleoprotein, PTB-binding 1  chr19  -1.92  0.000
RBBP8  retinoblastoma binding protein 8  chr18  -1.96  0.004
RBM27  RNA binding motif protein 27  chr5  -1.98  0.000
RBMS2  RNA binding motif, single stranded interacting protein 2  chr12  -2.18  0.000

Symbol  Gene Description  Chr  Fold Change  p-Value
RCAN3  RCAN family member 3  chr1  -2.62  0.000
REC8  REC8 meiotic recombination protein  chr14  -1.81  0.049
RECQL4  RecQ protein-like 4  chr8  -1.87  0.002
RELT  RELT tumor necrosis factor receptor  chr11  -1.57  0.004
RFC3  replication factor C (activator 1) 3, 38kDa  chr13  -2.62  0.004
RFC5  replication factor C (activator 1) 5, 36.5kDa  chr12  -1.59  0.044
RFPL1  ret finger protein-like 1  chr22  -1.66  0.021
RFX2  regulatory factor X, 2 (influences HLA class II expression)  chr19  -1.54  0.005
RFX5  regulatory factor X, 5 (influences HLA class II expression)  chr1  -2.00  0.001
RFX7  regulatory factor X, 7  chr15  -1.55  0.014
RHBDD1  rhomboid domain containing 1  chr2  -2.04  0.000
RHBDL2  rhomboid, veinlet-like 2 (Drosophila)  chr1  -1.56  0.009
RHNO1  RAD9-HUS1-RAD1 interacting nuclear orphan 1  chr12  -1.57  0.001
RHOA  ras homolog family member A  chr3  -1.74  0.001
RIBC1  RIB43A domain with coiled-coils 1  chrX  -2.28  0.000
RIMS1  regulating synaptic membrane exocytosis 1  chr6  -3.14  0.001
RIN2  Ras and Rab interactor 2  chr20  -2.47  0.000
RINL  Ras and Rab interactor-like  chr19  -1.56  0.038
RIPK1  receptor (TNFRSF)-interacting serine-threonine kinase 1  chr6  -1.62  0.007
RLN1  relaxin 1  chr9  -1.68  0.001
RMI2  RecQ mediated genome instability 2  chr16  -1.90  0.020
RMND5A  required for meiotic nuclear division 5 homolog A (S. cerevisiae)  chr2  -1.65  0.002
RNASE4  ribonuclease, RNase A family, 4  chr14  -2.02  0.011
RNASEH2A  ribonuclease H2, subunit A  chr19  -1.57  0.031
RNASEH2B  ribonuclease H2, subunit B  chr13  -1.62  0.004
RNF10  ring finger protein 10  chr12  -2.34  0.000
RNF138  ring finger protein 138, E3 ubiquitin protein ligase  chr18  -1.84  0.001
RNF157-AS1  RNF157 antisense RNA 1  chr17  -2.46  0.000
RNF219  ring finger protein 219  chr13  -1.52  0.002
RNF32  ring finger protein 32  chr7  -1.82  0.000
RNPEPL1  arginyl aminopeptidase (aminopeptidase B)-like 1  chr2  -1.65  0.003
ROCK2  Rho-associated, coiled-coil containing protein kinase 2  chr2  -1.66  0.003
ROPN1B  rhophilin associated tail protein 1B  chr3  -1.91  0.000
ROPN1L  rhophilin associated tail protein 1-like  chr5  -1.50  0.044
ROR2  receptor tyrosine kinase-like orphan receptor 2  chr9  -1.58  0.005
RP11-1246C19.1  chr7  -1.54  0.012
RP11-1260E13.2  chr17  -2.00  0.000
RP11-145M9.4  chr3  -1.96  0.012
RP11-162A12.4  chr18  -1.54  0.012
RP11-181G12.2  chr1  -1.91  0.000
RP11-191L17.1  chr2  -1.58  0.047
RP11-229P13.25  chr9  -1.81  0.001
RP11-253E3.3  chr12  -1.82  0.001
RP11-259N19.1  chr2  -2.18  0.001
RP11-286N22.14  chr11  -1.87  0.001
RP11-295D4.1  chr16  -1.94  0.003
RP11-303E16.2  chr16  -3.30  0.000

Symbol  Gene Description  Chr  Fold Change  p-Value
RP11-333E13.4  chr4  -3.72  0.000
RP11-343H5.6  chr1  -1.98  0.003
RP11-391M1.4  chr3  -1.78  0.000
RP11-421M1.8  chr6  -1.57  0.012
RP11-443B20.1  chr2  -1.54  0.033
RP11-488P3.1  chr1  -1.55  0.001
RP1-152L7.5  chr6  -1.70  0.009
RP11-579D7.4  chr12  -1.81  0.002
RP11-589M4.1  chr14  -1.53  0.007
RP11-5O17.1  chr3  -1.63  0.003
RP11-617F23.1  chr15  -1.53  0.027
RP11-61A14.1  chr16  -1.90  0.002
RP11-676M6.1  chr11  -2.33  0.000
RP11-72M17.1  chr14  -1.93  0.003
RP11-744D14.1  chr16  -1.51  0.001
RP11-767N6.7  chr1  -2.07  0.000
RP11-7K24.3  chr6  -2.58  0.000
RP11-83N9.5  chr9  -1.79  0.008
RP11-96D1.11  chr16  -2.27  0.001
RP13-228J13.8  chrX  -1.53  0.004
RP4-694B14.8  chr20  -2.02  0.001
RP4-740C4.5  chr1  -1.55  0.003
RP4-751H13.7  chr7  -1.77  0.001
RP5-1065J22.8  chr1  -1.66  0.001
RP5-1136G13.2  chr7  -1.53  0.034
RP5-991G20.1  chr16  -1.61  0.003
RPA3  replication protein A3, 14kDa  chr7  -2.06  0.001
RPGRIP1L  RPGRIP1-like  chr16  -1.66  0.000
RPL31  ribosomal protein L31  chr2  -2.16  0.003
RPP25  ribonuclease P/MRP 25kDa subunit  chr15  -1.71  0.006
RPS29  ribosomal protein S29  chr14  -1.76  0.000
RRM2  ribonucleotide reductase M2  chr2  -2.66  0.000
RSL1D1  ribosomal L1 domain containing 1  chr16  -1.51  0.004
RSPH1  radial spoke head 1 homolog (Chlamydomonas)  chr21  -1.56  0.006
RTDR1  rhabdoid tumor deletion region gene 1  chr22  -1.61  0.015
RTF1  Rtf1, Paf1/RNA polymerase II complex component, homolog (S. cerevisiae)  chr15  -1.65  0.005
RTKN2  rhotekin 2  chr10  -1.70  0.035
RTN3  reticulon 3  chr11  -1.85  0.000
RTN3P1  reticulon 3 pseudogene 1  chr4  -1.71  0.000
RWDD2B  RWD domain containing 2B  chr21  -1.53  0.003
S100A14  S100 calcium binding protein A14  chr1  -1.56  0.010

Symbol  Gene Description  Chr  Fold Change  p-Value
SAMD11  sterile alpha motif domain containing 11  chr1  -1.59  0.045
SAMD12  sterile alpha motif domain containing 12  chr8  -1.80  0.000
SAP30  Sin3A-associated protein, 30kDa  chr4  -2.10  0.004
SAPCD1  suppressor APC domain containing 1  chr6  -1.53  0.017
SAPCD2  suppressor APC domain containing 2  chr9  -2.53  0.000
SARM1  sterile alpha and TIR motif containing 1  chr17  -1.58  0.001
SBSN  suprabasin  chr19  -1.52  0.015
SCD5  stearoyl-CoA desaturase 5  chr4  -1.61  0.008
SCLT1  sodium channel and clathrin linker 1  chr4  -1.66  0.002
SCNN1A  sodium channel, non-voltage-gated 1 alpha subunit  chr12  -1.94  0.002
SDC1  syndecan 1  chr2  -3.02  0.001
SDCCAG8  serologically defined colon cancer antigen 8  chr1  -1.56  0.001
SEC22A  SEC22 vesicle trafficking protein homolog A (S. cerevisiae)  chr3  -1.62  0.006
SEC31B  SEC31 homolog B (S. cerevisiae)  chr10  -1.50  0.036
SELT  selenoprotein T precursor  chr3  -1.55  0.002
SEMA3G  sema domain, immunoglobulin domain (Ig), short basic domain, secreted, (semaphorin) 3G  chr3  -2.05  0.000
SEMA4A  sema domain, immunoglobulin domain (Ig), transmembrane domain (TM) and short cytoplasmic domain, (semaphorin) 4A  chr1  -2.09  0.000
SENP1  SUMO1/sentrin specific peptidase 1  chr12  -1.62  0.002
SEPT11  septin 11  chr4  -1.74  0.000
SEPT12  septin 12  chr16  -1.53  0.001
SEPT2  septin 2  chr2  -1.82  0.000
SEPT6  septin 6  chrX  -1.64  0.001
SERP1  stress-associated endoplasmic reticulum protein 1  chr3  -1.57  0.001
SFR1  SWI5-dependent recombination repair 1  chr10  -1.73  0.004
SFT2D1  SFT2 domain containing 1  chr6  -1.55  0.006
SFXN3  sideroflexin 3  chr10  -1.94  0.002
SGCB  sarcoglycan, beta (43kDa dystrophin-associated glycoprotein)  chr4  -2.11  0.000
SGK2  serum/glucocorticoid regulated kinase 2  chr20  -2.30  0.046
SGK3  serum/glucocorticoid regulated kinase family, member 3  chr8  -1.73  0.004
SGMS2  sphingomyelin synthase 2  chr4  -2.23  0.000
SGOL1  shugoshin-like 1 (S. pombe)  chr3  -2.17  0.002
SH2D4A  SH2 domain containing 4A  chr8  -1.62  0.013
SH3BGRL  SH3 domain binding glutamate-rich protein like  chrX  -1.66  0.001
SHB  Src homology 2 domain containing adaptor protein B  chr9  -1.94  0.002
SHC3  SHC (Src homology 2 domain containing) transforming protein 3  chr9  -1.80  0.009
SHISA5  shisa family member 5  chr3  -1.59  0.003
SHMT1  serine hydroxymethyltransferase 1 (soluble)  chr17  -1.65  0.002
SIGMAR1  sigma non-opioid intracellular receptor 1  chr9  -1.65  0.001
SIMC1  SUMO-interacting motifs containing 1  chr5  -1.63  0.005
SIRT5  sirtuin 5  chr6  -1.58  0.000
SIX1  SIX homeobox 1  chr14  -1.72  0.020
SKA1  spindle and kinetochore associated complex subunit 1  chr18  -1.79  0.015
SKA3  spindle and kinetochore associated complex subunit 3  chr13  -1.94  0.016
SLC13A3  solute carrier family 13 (sodium-dependent dicarboxylate transporter), member 3  chr20  -1.52  0.045
SLC16A10  solute carrier family 16 (aromatic amino acid transporter), member 10  chr6  -1.60  0.012
SLC16A2  solute carrier family 16, member 2 (thyroid hormone transporter)  chrX  -1.54  0.000
SLC16A9  solute carrier family 16, member 9  chr10  -3.14  0.000
SLC18B1  solute carrier family 18, subfamily B, member 1  chr6  -1.83  0.004
SLC1A3  solute carrier family 1 (glial high affinity glutamate transporter), member 3  chr5  -1.70  0.003
SLC23A1  solute carrier family 23 (ascorbic acid transporter), member 1  chr5  -1.63  0.004
SLC25A18  solute carrier family 25 (glutamate carrier), member 18  chr22  -2.17  0.012

Symbol  Gene Description  Chr  Fold Change  p-Value
SLC25A19  solute carrier family 25 (mitochondrial thiamine pyrophosphate carrier), member 19  chr17  -1.58  0.014
SLC25A34  solute carrier family 25, member 34  chr1  -2.91  0.001
SLC2A11  solute carrier family 2 (facilitated glucose transporter), member 11  chr22  -1.54  0.004
SLC2A4  solute carrier family 2 (facilitated glucose transporter), member 4  chr17  -1.60  0.001
SLC30A5  solute carrier family 30 (zinc transporter), member 5  chr5  -1.93  0.000
SLC35G1  solute carrier family 35, member G1  chr10  -1.52  0.025
SLC37A1  solute carrier family 37 (glucose-6-phosphate transporter), member 1  chr21  -3.35  0.000
SLC39A8  solute carrier family 39 (zinc transporter), member 8  chr4  -3.70  0.009
SLC5A1  solute carrier family 5 (sodium/glucose cotransporter), member 1  chr22  -1.62  0.002
SLC5A11  solute carrier family 5 (sodium/inositol cotransporter), member 11  chr16  -1.58  0.001
SLC7A7  solute carrier family 7 (amino acid transporter light chain, y+L system), member 7  chr14  -1.93  0.008
SLC7A9  solute carrier family 7 (amino acid transporter light chain, bo,+ system), member 9  chr19  -1.60  0.020
SLC9A2  solute carrier family 9, subfamily A (NHE2, cation proton antiporter 2), member 2  chr2  -1.54  0.001
SLC9B2  solute carrier family 9, subfamily B (NHA2, cation proton antiporter 2), member 2  chr4  -1.65  0.001
SLFN5  schlafen family member 5  chr17  -1.61  0.023
SLPI  secretory leukocyte peptidase inhibitor  chr20  -2.42  0.014
SMAD2  SMAD family member 2  chr18  -1.51  0.014
SMARCD3  SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily d, member 3  chr7  -1.53  0.006
SMARCE1  SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily e, member 1  chr17  -3.10  0.000
SMC1A  structural maintenance of chromosomes 1A  chrX  -1.86  0.000
SMC2  structural maintenance of chromosomes 2  chr9  -1.84  0.008
SMIM22  small integral membrane protein 22  chr16  -1.56  0.036
SMIM3  small integral membrane protein 3  chr5  -1.83  0.001
SMKR1  small lysine-rich protein 1  chr7  -1.65  0.004
SMOC2  SPARC related modular calcium binding 2  chr6  -1.80  0.032
SNORA42  Small nucleolar RNA SNORA42/SNORA80 family  chr14  -1.52  0.004
SNX24  sorting nexin 24  chr5  -1.53  0.013
SNX7  sorting nexin 7  chr1  -1.59  0.001
SOBP  sine oculis binding protein homolog (Drosophila)  chr6  -1.54  0.000
SOGA1  suppressor of glucose, autophagy associated 1  chr20  -1.54  0.007
SORT1  sortilin 1  chr1  -1.81  0.001
SP9  Sp9 transcription factor  chr2  -1.53  0.005
SPA17  sperm autoantigenic protein 17  chr11  -1.52  0.028
SPAG5  sperm associated antigen 5  chr17  -1.64  0.025
SPAG8  sperm associated antigen 8  chr9  -1.50  0.016
SPAST  spastin  chr2  -1.57  0.002
SPATA33  spermatogenesis associated 33  chr16  -1.85  0.002
SPDL1  spindle apparatus coiled-coil protein 1  chr5  -1.81  0.009
SPEF2  sperm flagellar 2  chr5  -1.83  0.000
SPEG  SPEG complex locus  chr2  -1.59  0.001
SPG11  spastic paraplegia 11 (autosomal recessive)  chr15  -1.58  0.022
SPHK1  sphingosine kinase 1  chr17  -1.76  0.001
SPINK1  serine peptidase inhibitor, Kazal type 1  chr5  -2.18  0.004
SPRR2F  small proline-rich protein 2F  chr1  -2.86  0.000
SRSF10  serine/arginine-rich splicing factor 10  chr1  -1.51  0.004
SSH1  slingshot protein phosphatase 1  chr12  -1.65  0.001
SSPO  SCO-spondin  chr7  -1.62  0.013
ST6GAL1  ST6 beta-galactosamide alpha-2,6-sialyltransferase 1  chr3  -1.52  0.023

Symbol  Gene Description  Chr  Fold Change  p-Value
ST6GALNAC1  ST6 (alpha-N-acetyl-neuraminyl-2,3-beta-galactosyl-1,3)-N-acetylgalactosaminide alpha-2,6-sialyltransferase 1  chr17  -1.95  0.015
STAMBPL1  STAM binding protein-like 1  chr10  -2.26  0.000
STAU1  staufen double-stranded RNA binding protein 1  chr20  -1.62  0.000
STEAP2  STEAP family member 2, metalloreductase  chr7  -1.56  0.004
STIL  SCL/TAL1 interrupting locus  chr1  -1.92  0.005
STON1  stonin 1  chr2  -1.68  0.035
STRADB  STE20-related kinase adaptor beta  chr2  -1.76  0.003
STRIP2  striatin interacting protein 2  chr7  -1.64  0.001
STX17-AS1  STX17 antisense RNA 1  chr9  -1.69  0.000
SULT2B1  sulfotransferase family, cytosolic, 2B, member 1  chr19  -1.59  0.008
SURF4  surfeit 4  chr9  -2.12  0.006
SVIL-AS1  SVIL antisense RNA 1  chr10  -1.53  0.011
SYBU  syntabulin (syntaxin-interacting)  chr8  -1.51  0.030
SYCE2  synaptonemal complex central element protein 2  chr19  -2.11  0.017
SYNCRIP  synaptotagmin binding, cytoplasmic RNA interacting protein  chr6  -1.82  0.000
SYNE2  spectrin repeat containing, nuclear envelope 2  chr14  -2.78  0.000
SYNGR1  synaptogyrin 1  chr22  -2.10  0.000
SYNGR3  synaptogyrin 3  chr16  -1.57  0.033
SYTL3  synaptotagmin-like 3  chr6  -1.52  0.002
TACC1  transforming, acidic coiled-coil containing protein 1  chr8  -1.67  0.003
TACSTD2  tumor-associated calcium signal transducer 2  chr1  -1.53  0.012
TAF4B  TAF4b RNA polymerase II, TATA box binding protein (TBP)-associated factor, 105kDa  chr18  -1.55  0.002
TAS1R1  taste receptor, type 1, member 1  chr1  -2.06  0.000
TBC1D20  TBC1 domain family, member 20  chr20  -1.62  0.006
TBX1  T-box 1  chr22  -1.58  0.002
TBX10  T-box 10  chr11  -1.60  0.004
TCF19  transcription factor 19  chr6  -2.00  0.010
TCTEX1D2  Tctex1 domain containing 2  chr3  -1.97  0.000
TDH  L-threonine dehydrogenase (pseudogene)  chr8  -1.78  0.002
TDP1  tyrosyl-DNA phosphodiesterase 1  chr14  -1.51  0.009
TDRD1  tudor domain containing 1  chr10  -1.73  0.016
TEAD4  TEA domain family member 4  chr12  -1.58  0.002
TECPR2  tectonin beta-propeller repeat containing 2  chr14  -3.39  0.000
TENC1  tensin like C1 domain containing phosphatase (tensin 2)  chr12  -2.87  0.002
TERT  telomerase reverse transcriptase  chr5  -1.94  0.027
TEX19  testis expressed 19  chr17  -1.82  0.025
TEX22  testis expressed 22  chr14  -1.55  0.003
TFPI  tissue factor pathway inhibitor (lipoprotein-associated coagulation inhibitor)  chr2  -1.66  0.003
TGOLN2  trans-golgi network protein 2  chr2  -2.22  0.000
THBS4  thrombospondin 4  chr5  -1.52  0.001
THRA  thyroid hormone receptor, alpha  chr17  -1.70  0.000
TICRR  TOPBP1-interacting checkpoint and replication regulator  chr15  -2.16  0.003
TIMM10B  translocase of inner mitochondrial membrane 10 homolog B (yeast)  chr11  -1.56  0.004
TIMM21  translocase of inner mitochondrial membrane 21 homolog (yeast)  chr18  -1.69  0.001
TJP1  tight junction protein 1  chr15  -1.94  0.000
TLN1  talin 1  chr9  -1.64  0.001
TM4SF1  transmembrane 4 L six family member 1  chr3  -2.73  0.001
TM4SF18  transmembrane 4 L six family member 18  chr3  -1.65  0.006
TMC1  transmembrane channel-like 1  chr9  -1.63  0.001
TMC3  transmembrane channel-like 3  chr15  -1.60  0.020
TMED7  transmembrane emp24 protein transport domain containing 7  chr5  -1.76  0.001
TMED8  transmembrane emp24 protein transport domain containing 8  chr14  -1.71  0.001

Symbol  Gene Description  Chr  Fold Change  p-Value
TMEFF2  transmembrane protein with EGF-like and two follistatin-like domains 2  chr2  -3.42  0.000
TMEM107  transmembrane protein 107  chr17  -2.21  0.011
TMEM121  transmembrane protein 121  chr14  -1.58  0.031
TMEM123  transmembrane protein 123  chr11  -7.27  0.000
TMEM143  transmembrane protein 143  chr19  -1.91  0.001
TMEM14B  transmembrane protein 14B  chr6  -1.61  0.000
TMEM14C  transmembrane protein 14C  chr6  -1.72  0.000
TMEM179  transmembrane protein 179  chr14  -2.40  0.004
TMEM19  transmembrane protein 19  chr12  -1.72  0.001
TMEM194A  transmembrane protein 194A  chr12  -1.84  0.001
TMEM194B  transmembrane protein 194B  chr2  -1.70  0.007
TMEM199  transmembrane protein 199  chr17  -1.67  0.001
TMEM209  transmembrane protein 209  chr7  -1.54  0.007
TMEM217  transmembrane protein 217  chr6  -1.85  0.008
TMEM237  transmembrane protein 237  chr2  -1.59  0.018
TMEM25  transmembrane protein 25  chr11  -1.54  0.006
TMEM27  transmembrane protein 27  chrX  -2.18  0.000
TMEM38B  transmembrane protein 38B  chr9  -1.87  0.000
TMEM56  transmembrane protein 56  chr1  -1.50  0.003
TMEM70  transmembrane protein 70  chr8  -1.52  0.023
TMEM97  transmembrane protein 97  chr17  -1.52  0.011
TMOD2  tropomodulin 2 (neuronal)  chr15  -1.69  0.001
TMPO  thymopoietin  chr12  -1.86  0.014
TMX4  thioredoxin-related transmembrane protein 4  chr20  -3.07  0.000
TNFAIP1  tumor necrosis factor, alpha-induced protein 1 (endothelial)  chr17  -1.53  0.005
TNFAIP8L1  tumor necrosis factor, alpha-induced protein 8-like 1  chr19  -2.00  0.016
TNFRSF10C  tumor necrosis factor receptor superfamily, member 10c, decoy without an intracellular domain  chr8  -1.55  0.005
TNFSF13  tumor necrosis factor (ligand) superfamily, member 13  chr17  -1.81  0.000
TNNC2  troponin C type 2 (fast)  chr20  -1.51  0.044
TNS4  tensin 4  chr17  -2.35  0.001
TOMM34  translocase of outer mitochondrial membrane 34  chr20  -2.63  0.000
TOP2A  topoisomerase (DNA) II alpha 170kDa  chr17  -1.94  0.028
TPM2  tropomyosin 2 (beta)  chr9  -1.60  0.006
TPM4  tropomyosin 4  chr19  -1.96  0.001
TPX2  TPX2, microtubule-associated  chr20  -1.67  0.030
TRAIP  TRAF interacting protein  chr3  -1.62  0.022
TRAK1  trafficking protein, kinesin binding 1  chr3  -2.38  0.000
TRAM2-AS1  TRAM2 antisense RNA 1 (head to head)  chr6  -1.77  0.001
TRAPPC5  trafficking protein particle complex 5  chr19  -1.91  0.001
TRAPPC6B  trafficking protein particle complex 6B  chr14  -1.56  0.004
TRERF1  transcriptional regulating factor 1  chr6  -1.61  0.001
TRIM34  tripartite motif containing 34  chr11  -1.76  0.004
TRIM37  tripartite motif containing 37  chr17  -1.51  0.023
TRIM46  tripartite motif containing 46  chr1  -1.51  0.021
TRIM7  tripartite motif containing 7  chr5  -1.70  0.017
TRIM8  tripartite motif containing 8  chr10  -1.60  0.001
TRIP13  thyroid hormone receptor interactor 13  chr5  -1.56  0.037
TROAP  trophinin associated protein  chr12  -1.71  0.020
TRPM8  transient receptor potential cation channel, subfamily M, member 8  chr2  -2.02  0.013
TRPV1  transient receptor potential cation channel, subfamily V, member 1  chr17  -1.59  0.017
TSLP  thymic stromal lymphopoietin  chr5  -1.55  0.002
TSPAN14  tetraspanin 14  chr10  -1.82  0.000
TSPAN6  tetraspanin 6  chrX  -2.86  0.000
TSPAN7  tetraspanin 7  chrX  -1.55  0.048

Symbol  Gene Description  Chr  Fold Change  p-Value
TTBK1  tau tubulin kinase 1  chr6  -1.52  0.040
TTC18  tetratricopeptide repeat domain 18  chr10  -1.59  0.001
TTC28  tetratricopeptide repeat domain 28  chr22  -1.96  0.000
TTC28-AS1  TTC28 antisense RNA 1  chr22  -1.60  0.004
TTC39C  tetratricopeptide repeat domain 39C  chr18  -2.13  0.000
TTC6  tetratricopeptide repeat domain 6  chr14  -1.54  0.009
TTK  TTK protein kinase  chr6  -1.82  0.024
TTL  tubulin tyrosine ligase  chr2  -1.51  0.003
TTLL6  tubulin tyrosine ligase-like family, member 6  chr17  -1.63  0.037
TTYH1  tweety family member 1  chr19  -1.60  0.002
TTYH2  tweety family member 2  chr17  -1.56  0.004
TUBB6  tubulin, beta 6 class V  chr18  -2.43  0.002
TWSG1  twisted gastrulation BMP signaling modulator 1  chr18  -1.77  0.000
TXNIP  thioredoxin interacting protein  chr1  -1.88  0.011
TYMS  thymidylate synthetase  chr18  -2.68  0.002
TYRO3  TYRO3 protein tyrosine kinase  chr15  -1.55  0.008
UACA  uveal autoantigen with coiled-coil domains and ankyrin repeats  chr15  -1.89  0.001
UBA52  ubiquitin A-52 residue ribosomal protein fusion product 1  chr19  -1.58  0.001
UBAP2  ubiquitin associated protein 2  chr9  -1.55  0.011
UBE2D4  ubiquitin-conjugating enzyme E2D 4 (putative)  chr7  -2.40  0.000
UBE2E3  ubiquitin-conjugating enzyme E2E 3  chr2  -1.73  0.001
UBL7-AS1  UBL7 antisense RNA 1 (head to head)  chr15  -1.58  0.001
UBXN8  UBX domain protein 8  chr8  -1.68  0.000
UCHL3  ubiquitin carboxyl-terminal esterase L3 (ubiquitin thiolesterase)  chr13  -1.51  0.003
UFC1  ubiquitin-fold modifier conjugating enzyme 1  chr1  -2.06  0.000
UNC5B  unc-5 homolog B (C. elegans)  chr10  -1.75  0.001
UPK3A  uroplakin 3A  chr22  -1.61  0.022
UQCC1  ubiquinol-cytochrome c reductase complex assembly factor 1  chr20  -1.65  0.001
USP31  ubiquitin specific peptidase 31  chr16  -1.90  0.000
USP6NL  USP6 N-terminal like  chr10  -1.52  0.005
VAMP1  vesicle-associated membrane protein 1 (synaptobrevin 1)  chr12  -1.60  0.000
VANGL1  VANGL planar cell polarity protein 1  chr1  -2.17  0.000
VDAC3  voltage-dependent anion channel 3  chr8  -2.32  0.000
VEGFB  vascular endothelial growth factor B  chr11  -1.86  0.000
VPREB3  pre-B lymphocyte 3  chr22  -1.60  0.001
VPS54  vacuolar protein sorting 54 homolog (S. cerevisiae)  chr2  -1.76  0.048
VRK1  vaccinia related kinase 1  chr14  -1.78  0.012
VWA1  von Willebrand factor A domain containing 1  chr1  -1.79  0.000
WDHD1  WD repeat and HMG-box DNA binding protein 1  chr14  -1.81  0.034
WDR66  WD repeat domain 66  chr12  -1.56  0.001
WDR72  WD repeat domain 72  chr15  -1.56  0.003
WDR92  WD repeat domain 92  chr2  -1.74  0.000
WHAMM  WAS protein homolog associated with actin, golgi membranes and microtubules  chr15  -1.64  0.012
WIPI1  WD repeat domain, phosphoinositide interacting 1  chr17  -1.88  0.043
WTAP  Wilms tumor 1 associated protein  chr6  -1.80  0.001
XDH  xanthine dehydrogenase  chr2  -1.66  0.006
XRCC2  X-ray repair complementing defective repair in Chinese hamster cells 2  chr7  -1.71  0.014
YBX2  Y box binding protein 2  chr17  -1.84  0.014
YBX3  Y box binding protein 3  chr12  -2.17  0.000
ZBED3  zinc finger, BED-type containing 3  chr5  -3.41  0.000
ZBTB8A  zinc finger and BTB domain containing 8A  chr1  -1.55  0.008
ZC3H12D  zinc finger CCCH-type containing 12D  chr6  -1.50  0.023
ZCWPW1  zinc finger, CW type with PWWP domain 1  chr7  -1.68  0.010
ZDHHC16  zinc finger, DHHC-type containing 16  chr10  -1.53  0.029

Symbol  Gene Description  Chr  Fold Change  p-Value
ZDHHC5  zinc finger, DHHC-type containing 5  chr11  -1.54  0.007
ZFP36L1  ZFP36 ring finger protein-like 1  chr14  -1.66  0.002
ZGRF1  zinc finger, GRF-type containing 1  chr4  -1.70  0.012
ZHX1  zinc fingers and homeoboxes 1  chr8  -1.51  0.016
ZIC5  Zic family member 5  chr13  -2.12  0.001
ZMIZ2  zinc finger, MIZ-type containing 2  chr7  -2.06  0.003
ZMYND11  zinc finger, MYND-type containing 11  chr10  -1.56  0.001
ZMYND12  zinc finger, MYND-type containing 12  chr1  -1.60  0.025
ZMYND15  zinc finger, MYND-type containing 15  chr17  -1.59  0.024
ZNF135  zinc finger protein 135  chr19  -1.55  0.037
ZNF219  zinc finger protein 219  chr14  -1.86  0.000
ZNF318  zinc finger protein 318  chr6  -1.86  0.002
ZNF33B  zinc finger protein 33B  chr10  -1.59  0.001
ZNF362  zinc finger protein 362  chr1  -2.59  0.000
ZNF365  zinc finger protein 365  chr10  -2.43  0.002
ZNF367  zinc finger protein 367  chr9  -1.67  0.029
ZNF391  zinc finger protein 391  chr6  -1.50  0.001
ZNF398  zinc finger protein 398  chr7  -1.66  0.004
ZNF425  zinc finger protein 425  chr7  -1.56  0.001
ZNF43  zinc finger protein 43  chr19  -1.53  0.010
ZNF480  zinc finger protein 480  chr19  -2.06  0.000
ZNF534  zinc finger protein 534  chr19  -1.55  0.041
ZNF556  zinc finger protein 556  chr19  -1.59  0.007
ZNF618  zinc finger protein 618  chr9  -1.57  0.002
ZNF620  zinc finger protein 620  chr3  -1.65  0.001
ZNF629  zinc finger protein 629  chr16  -3.62  0.000
ZNF677  zinc finger protein 677  chr19  -2.21  0.000
ZNF718  zinc finger protein 718  chr4  -1.66  0.001
ZNF76  zinc finger protein 76  chr6  -1.54  0.004
ZNF778  zinc finger protein 778  chr16  -1.72  0.029
ZNF789  zinc finger protein 789  chr7  -1.65  0.006
ZNF791  zinc finger protein 791  chr19  -1.63  0.002
ZNF793  zinc finger protein 793  chr19  -1.54  0.001
ZNF850  zinc finger protein 850  chr19  -1.51  0.033
ZNF883  zinc finger protein 883  chr9  -1.51  0.003
ZNF93  zinc finger protein 93  chr19  -1.65  0.013
ZP3  zona pellucida glycoprotein 3 (sperm receptor)  chr7  -3.49  0.000
ZSCAN25  zinc finger and SCAN domain containing 25  chr7  -1.69  0.000
ZWILCH  zwilch kinetochore protein  chr15  -1.94  0.001
ABCB5  ATP-binding cassette, sub-family B (MDR/TAP), member 5  chr7  1.56  0.011
ABCC5  ATP-binding cassette, sub-family C (CFTR/MRP), member 5  chr3  1.85  0.004
ABI1  abl-interactor 1  chr10  1.51  0.007
ABTB1  ankyrin repeat and BTB (POZ) domain containing 1  chr3  1.62  0.003
AC002456.2  chr7  1.50  0.020
AC093673.5  chr7  1.68  0.021
ACADSB  acyl-CoA dehydrogenase, short/branched chain  chr10  1.59  0.005
ACKR3  atypical chemokine receptor 3  chr2  1.51  0.032
ACOX3  acyl-CoA oxidase 3, pristanoyl  chr4  1.76  0.004
ACPP  acid phosphatase, prostate  chr3  1.79  0.035
ACSL1  acyl-CoA synthetase long-chain family member 1  chr4  1.67  0.002
ADAM15  ADAM metallopeptidase domain 15  chr1  1.51  0.002
ADH5  alcohol dehydrogenase 5 (class III), chi polypeptide  chr4  1.69  0.012
ADM  adrenomedullin  chr11  1.92  0.002
AF131217.1  chr21  1.78  0.040
AFMID  arylformamidase  chr17  1.79  0.003
AGO2  argonaute RISC catalytic component 2  chr8  1.65  0.001

Symbol  Gene Description  Chr  Fold Change  p-Value
AGO4  argonaute RISC catalytic component 4  chr1  1.83  0.005
AGPAT6  1-acylglycerol-3-phosphate O-acyltransferase 6  chr8  1.53  0.006
AKAP2  A kinase (PRKA) anchor protein 2  chr9  1.52  0.041
AKIRIN1  akirin 1  chr1  1.66  0.000
ALAD  aminolevulinate dehydratase  chr9  1.65  0.000
ALCAM  activated leukocyte cell adhesion molecule  chr3  2.31  0.000
ANKRD16  ankyrin repeat domain 16  chr10  2.95  0.000
ANKRD46  ankyrin repeat domain 46  chr8  1.66  0.000
ANKRD6  ankyrin repeat domain 6  chr6  1.55  0.002
APBA2  amyloid beta (A4) precursor protein-binding, family A, member 2  chr15  1.50  0.010
APOF  apolipoprotein F  chr12  1.79  0.014
AQP11  aquaporin 11  chr11  1.87  0.005
ARG2  arginase 2  chr14  2.79  0.000
ARHGAP28  Rho GTPase activating protein 28  chr18  2.16  0.000
ARHGAP8  Rho GTPase activating protein 8  chr22  2.24  0.000
ARID3A  AT rich interactive domain 3A (BRIGHT-like)  chr19  2.39  0.000
ARL1  ADP-ribosylation factor-like 1  chr12  1.52  0.046
ARPP19  cAMP-regulated phosphoprotein, 19kDa  chr15  1.52  0.005
ARRDC3  arrestin domain containing 3  chr5  1.90  0.002
ATG4A  autophagy related 4A, cysteine peptidase  chrX  1.78  0.001
ATP11B  ATPase, class VI, type 11B  chr3  1.73  0.002
ATP2A2  ATPase, Ca++ transporting, cardiac muscle, slow twitch 2  chr12  1.62  0.000
ATP2B1  ATPase, Ca++ transporting, plasma membrane 1  chr12  1.56  0.001
ATP2B4  ATPase, Ca++ transporting, plasma membrane 4  chr1  1.95  0.002
ATP6AP1  ATPase, H+ transporting, lysosomal accessory protein 1  chrX  1.53  0.020
ATP6V1B2  ATPase, H+ transporting, lysosomal 56/58kDa, V1 subunit B2  chr8  1.53  0.006
ATP8A1  ATPase, aminophospholipid transporter (APLT), class I, type 8A, member 1  chr4  1.51  0.019
ATP8B1  ATPase, aminophospholipid transporter, class I, type 8B, member 1  chr18  1.58  0.006
ATP9A  ATPase, class II, type 9A  chr20  1.64  0.017
ATXN1  ataxin 1  chr6  1.63  0.031
AURKC  aurora kinase C  chr19  1.59  0.002
B3GALNT2  beta-1,3-N-acetylgalactosaminyltransferase 2  chr1  1.52  0.013
BACH1  BTB and CNC homology 1, basic leucine zipper transcription factor 1  chr21  1.54  0.003
BAIAP2  BAI1-associated protein 2  chr17  1.71  0.000
BANF1  barrier to autointegration factor 1  chr11  1.69  0.014
BCCIP  BRCA2 and CDKN1A interacting protein  chr10  2.38  0.000
BCL2L13  BCL2-like 13 (apoptosis facilitator)  chr22  1.50  0.006
BDKRB2  bradykinin receptor B2  chr14  1.55  0.007
BIK  BCL2-interacting killer (apoptosis-inducing)  chr22  1.50  0.019
BLOC1S4  biogenesis of lysosomal organelles complex-1, subunit 4, cappuccino  chr4  1.52  0.046
BOD1  biorientation of chromosomes in cell division 1  chr5  1.57  0.018
BRWD1  bromodomain and WD repeat domain containing 1  chr21  1.89  0.000
C10orf11  chromosome 10 open reading frame 11  chr10  1.57  0.015
C12orf49  chromosome 12 open reading frame 49  chr12  1.58  0.001
C12orf57  chromosome 12 open reading frame 57  chr12  1.54  0.047
C12orf66  chromosome 12 open reading frame 66  chr12  1.87  0.000
C14orf132  chromosome 14 open reading frame 132  chr14  1.53  0.015
C16orf72  chromosome 16 open reading frame 72  chr16  1.53  0.002
C1D  C1D nuclear receptor corepressor  chr2  1.57  0.015
C1orf85  chromosome 1 open reading frame 85  chr1  1.58  0.006
C1orf86  chromosome 1 open reading frame 86  chr1  1.80  0.003
C21orf90  chromosome 21 open reading frame 90  chr21  1.62  0.006
C2CD2  C2 calcium-dependent domain containing 2  chr21  1.51  0.039

Symbol | Gene Description | Chr | Fold Change | p-Value
C4orf29 | chromosome 4 open reading frame 29 | chr4 | 1.90 | 0.000
C5orf46 | chromosome 5 open reading frame 46 | chr5 | 3.35 | 0.001
C5orf66 | chromosome 5 open reading frame 66 | chr5 | 1.66 | 0.019
C6orf120 | chromosome 6 open reading frame 120 | chr6 | 1.56 | 0.000
C7orf49 | chromosome 7 open reading frame 49 | chr7 | 1.67 | 0.003
C9orf92 | chromosome 9 open reading frame 92 | chr9 | 1.77 | 0.028
CACFD1 | calcium channel flower domain containing 1 | chr9 | 1.57 | 0.024
CALU | calumenin | chr7 | 2.71 | 0.001
CAMK1 | calcium/calmodulin-dependent protein kinase I | chr3 | 1.57 | 0.004
CAMK2N1 | calcium/calmodulin-dependent protein kinase II inhibitor 1 | chr1 | 2.19 | 0.003
CAMSAP2 | calmodulin regulated spectrin-associated protein family, member 2 | chr1 | 1.52 | 0.004
CANT1 | calcium activated nucleotidase 1 | chr17 | 1.62 | 0.020
CAPN5 | calpain 5 | chr11 | 1.66 | 0.049
CARD14 | caspase recruitment domain family, member 14 | chr17 | 1.54 | 0.001
CASC7 | cancer susceptibility candidate 7 (non-protein coding) | chr8 | 1.86 | 0.002
CASK | calcium/calmodulin-dependent serine protein kinase (MAGUK family) | chrX | 1.55 | 0.012
CBFA2T2 | core-binding factor, runt domain, alpha subunit 2; translocated to, 2 | chr20 | 1.72 | 0.001
CBLN2 | cerebellin 2 precursor | chr18 | 5.13 | 0.008
CBR3-AS1 | CBR3 antisense RNA 1 | chr21 | 1.56 | 0.013
CCDC149 | coiled-coil domain containing 149 | chr4 | 1.50 | 0.004
CCDC71L | coiled-coil domain containing 71-like | chr7 | 1.62 | 0.003
CCDC81 | coiled-coil domain containing 81 | chr11 | 1.68 | 0.018
CCDC82 | coiled-coil domain containing 82 | chr11 | 1.79 | 0.000
CCDC83 | coiled-coil domain containing 83 | chr11 | 1.86 | 0.013
CCL20 | chemokine (C-C motif) ligand 20 | chr2 | 4.57 | 0.018
CCNL2 | cyclin L2 | chr1 | 1.71 | 0.003
CD320 | CD320 molecule | chr19 | 1.52 | 0.028
CDC14A | cell division cycle 14A | chr1 | 1.59 | 0.007
CDC23 | cell division cycle 23 | chr5 | 1.55 | 0.002
CDC37L1 | cell division cycle 37-like 1 | chr9 | 1.58 | 0.006
CDC42EP3 | CDC42 effector protein (Rho GTPase binding) 3 | chr2 | 2.98 | 0.000
CDH18 | cadherin 18, type 2 | chr5 | 1.58 | 0.021
CDON | cell adhesion associated, oncogene regulated | chr11 | 1.73 | 0.009
CDS1 | CDP-diacylglycerol synthase (phosphatidate cytidylyltransferase) 1 | chr4 | 2.11 | 0.000
CECR2 | cat eye syndrome chromosome region, candidate 2 | chr22 | 1.63 | 0.001
CHFR | checkpoint with forkhead and ring finger domains, E3 ubiquitin protein ligase | chr12 | 1.63 | 0.001
CHTOP | chromatin target of PRMT1 | chr1 | 1.51 | 0.031
CIAO1 | cytosolic iron-sulfur assembly component 1 | chr2 | 1.53 | 0.011
CIRBP | cold inducible RNA binding protein | chr19 | 1.72 | 0.026
CITED2 | Cbp/p300-interacting transactivator, with Glu/Asp-rich carboxy-terminal domain, 2 | chr6 | 2.54 | 0.000
CLCN3 | chloride channel, voltage-sensitive 3 | chr4 | 1.51 | 0.003
CLCN5 | chloride channel, voltage-sensitive 5 | chrX | 2.07 | 0.000
CNBD1 | cyclic nucleotide binding domain containing 1 | chr8 | 1.55 | 0.007
CNOT6L | CCR4-NOT transcription complex, subunit 6-like | chr4 | 1.55 | 0.006
COG8 | component of oligomeric golgi complex 8 | chr16 | 1.68 | 0.001
CORO2A | coronin, actin binding protein, 2A | chr9 | 1.52 | 0.006
CPPED1 | calcineurin-like phosphoesterase domain containing 1 | chr16 | 1.53 | 0.001
CREB1 | cAMP responsive element binding protein 1 | chr2 | 1.54 | 0.001
CRELD1 | cysteine-rich with EGF-like domains 1 | chr3 | 1.59 | 0.001
CRK | v-crk avian sarcoma virus CT10 oncogene homolog | chr17 | 2.35 | 0.000
CSMD1 | CUB and Sushi multiple domains 1 | chr8 | 2.43 | 0.010
CSNK1A1 | casein kinase 1, alpha 1 | chr5 | 1.59 | 0.001

Symbol | Gene Description | Chr | Fold Change | p-Value
CTC1 | CTS telomere maintenance complex component 1 | chr17 | 1.67 | 0.000
CTD-2270P14.5 |  | chr16 | 1.52 | 0.016
CTD-2371O3.3 |  | chr11 | 1.73 | 0.037
CTD-2630F21.1 |  | chr19 | 1.80 | 0.006
CTNNA2 | catenin (cadherin-associated protein), alpha 2 | chr2 | 1.65 | 0.001
CTTNBP2NL | CTTNBP2 N-terminal like | chr1 | 1.82 | 0.002
CWC15 | CWC15 spliceosome-associated protein | chr11 | 2.14 | 0.000
CYB561D1 | cytochrome b561 family, member D1 | chr1 | 1.68 | 0.001
CYP7A1 | cytochrome P450, family 7, subfamily A, polypeptide 1 | chr8 | 2.85 | 0.000
DCAF17 | DDB1 and CUL4 associated factor 17 | chr2 | 1.54 | 0.004
DCAF7 | DDB1 and CUL4 associated factor 7 | chr17 | 1.76 | 0.003
DCUN1D5 | DCN1, defective in cullin neddylation 1, domain containing 5 | chr11 | 1.50 | 0.003
DDC | dopa decarboxylase (aromatic L-amino acid decarboxylase) | chr7 | 2.09 | 0.001
DDTL | D-dopachrome tautomerase-like | chr22 | 1.53 | 0.008
DDX10 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 10 | chr11 | 1.76 | 0.002
DDX18 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 18 | chr2 | 2.53 | 0.000
DDX3X | DEAD (Asp-Glu-Ala-Asp) box helicase 3, X-linked | chrX | 1.96 | 0.000
DDX3Y | DEAD (Asp-Glu-Ala-Asp) box helicase 3, Y-linked | chrY | 3.03 | 0.000
DENND1A | DENN/MADD domain containing 1A | chr9 | 1.57 | 0.002
DENND1B | DENN/MADD domain containing 1B | chr1 | 1.77 | 0.005
DENND6A | DENN/MADD domain containing 6A | chr3 | 1.56 | 0.008
DEPDC5 | DEP domain containing 5 | chr22 | 1.52 | 0.006
DERL1 | derlin 1 | chr8 | 1.64 | 0.012
DET1 | de-etiolated homolog 1 (Arabidopsis) | chr15 | 1.54 | 0.006
DGKH | diacylglycerol kinase, eta | chr13 | 1.59 | 0.004
DHCR7 | 7-dehydrocholesterol reductase | chr11 | 1.68 | 0.001
DICER1 | dicer 1, ribonuclease type III | chr14 | 1.74 | 0.047
DIDO1 | death inducer-obliterator 1 | chr20 | 1.61 | 0.005
DIP2B | DIP2 disco-interacting protein 2 homolog B (Drosophila) | chr12 | 1.75 | 0.000
DIP2C | DIP2 disco-interacting protein 2 homolog C (Drosophila) | chr10 | 1.53 | 0.002
DLEU1 | deleted in lymphocytic leukemia 1 (non-protein coding) | chr13 | 1.74 | 0.005
DNAH9 | dynein, axonemal, heavy chain 9 | chr17 | 1.82 | 0.022
DNAJC5 | DnaJ (Hsp40) homolog, subfamily C, member 5 | chr20 | 1.52 | 0.003
DNASE1 | deoxyribonuclease I | chr16 | 2.04 | 0.002
DNMT3B | DNA (cytosine-5-)-methyltransferase 3 beta | chr20 | 1.61 | 0.003
DOCK4 | dedicator of cytokinesis 4 | chr7 | 2.31 | 0.000
DOT1L | DOT1-like histone H3K79 methyltransferase | chr19 | 1.66 | 0.005
DRAM1 | DNA-damage regulated autophagy modulator 1 | chr12 | 1.57 | 0.004
DSEL | dermatan sulfate epimerase-like | chr18 | 1.75 | 0.004
DTNB | dystrobrevin, beta | chr2 | 1.60 | 0.002
DUSP10 | dual specificity phosphatase 10 | chr1 | 1.63 | 0.017
DUSP2 | dual specificity phosphatase 2 | chr2 | 1.72 | 0.007
DUSP8 | dual specificity phosphatase 8 | chr11 | 1.82 | 0.018
EEA1 | early endosome antigen 1 | chr12 | 1.61 | 0.001
EFNA3 | ephrin-A3 | chr1 | 1.76 | 0.001
EFNA4 | ephrin-A4 | chr1 | 1.62 | 0.008
EGLN2 | egl-9 family hypoxia-inducible factor 2 | chr19 | 1.83 | 0.000
EIF2AK4 | eukaryotic translation initiation factor 2 alpha kinase 4 | chr15 | 1.76 | 0.001
EIF3B | eukaryotic translation initiation factor 3, subunit B | chr7 | 1.70 | 0.003
EIF5B | eukaryotic translation initiation factor 5B | chr2 | 2.38 | 0.000
ELAVL2 | ELAV like neuron-specific RNA binding protein 2 | chr9 | 1.71 | 0.038
ENPP5 | ectonucleotide pyrophosphatase/phosphodiesterase 5 (putative) | chr6 | 2.02 | 0.047
ENTPD3 | ectonucleoside triphosphate diphosphohydrolase 3 | chr3 | 1.74 | 0.002
ENTPD4 | ectonucleoside triphosphate diphosphohydrolase 4 | chr8 | 1.58 | 0.001

Symbol | Gene Description | Chr | Fold Change | p-Value
EOGT | EGF domain-specific O-linked N-acetylglucosamine (GlcNAc) transferase | chr3 | 1.71 | 0.000
EPB41L1 | erythrocyte membrane protein band 4.1-like 1 | chr20 | 1.71 | 0.001
EPHA1 | EPH receptor A1 | chr7 | 1.54 | 0.004
EPHA6 | EPH receptor A6 | chr3 | 2.22 | 0.008
EPHA7 | EPH receptor A7 | chr6 | 1.90 | 0.001
ERBB3 | v-erb-b2 avian erythroblastic leukemia viral oncogene homolog 3 | chr12 | 1.60 | 0.001
ERC2 | ELKS/RAB6-interacting/CAST family member 2 | chr3 | 1.74 | 0.000
ERCC5 | excision repair cross-complementation group 5 | chr13 | 1.74 | 0.001
ERP44 | endoplasmic reticulum protein 44 | chr9 | 2.26 | 0.000
ESYT2 | extended synaptotagmin-like protein 2 | chr7 | 2.34 | 0.000
EXOC2 | exocyst complex component 2 | chr6 | 1.54 | 0.003
EYA3 | EYA transcriptional coactivator and phosphatase 3 | chr1 | 1.53 | 0.002
FAAH2 | fatty acid amide hydrolase 2 | chrX | 1.61 | 0.006
FAM160B1 | family with sequence similarity 160, member B1 | chr10 | 1.73 | 0.005
FAM162A | family with sequence similarity 162, member A | chr3 | 1.62 | 0.004
FAM168B | family with sequence similarity 168, member B | chr2 | 1.59 | 0.001
FAM173B | family with sequence similarity 173, member B | chr5 | 1.57 | 0.005
FAM177B | family with sequence similarity 177, member B | chr1 | 1.76 | 0.028
FAM210A | family with sequence similarity 210, member A | chr18 | 1.50 | 0.002
FAM32A | family with sequence similarity 32, member A | chr19 | 1.59 | 0.002
FAM49B | family with sequence similarity 49, member B | chr8 | 1.81 | 0.001
FAM83F | family with sequence similarity 83, member F | chr22 | 1.61 | 0.007
FAM84A | family with sequence similarity 84, member A | chr2 | 2.12 | 0.002
FAR1 | fatty acyl CoA reductase 1 | chr11 | 1.56 | 0.000
FARP2 | FERM, RhoGEF and pleckstrin domain protein 2 | chr2 | 2.08 | 0.000
FASN | fatty acid synthase | chr17 | 1.58 | 0.018
FGFR2 | fibroblast growth factor receptor 2 | chr10 | 1.77 | 0.000
FNBP1L | formin binding protein 1-like | chr1 | 1.54 | 0.016
FNDC3B | fibronectin type III domain containing 3B | chr3 | 1.58 | 0.004
FNIP1 | folliculin interacting protein 1 | chr5 | 1.61 | 0.003
FREM1 | FRAS1 related extracellular matrix 1 | chr9 | 1.76 | 0.002
FRS2 | fibroblast growth factor receptor substrate 2 | chr12 | 1.53 | 0.003
FUT8-AS1 | FUT8 antisense RNA 1 | chr14 | 1.83 | 0.004
G3BP2 | GTPase activating protein (SH3 domain) binding protein 2 | chr4 | 1.70 | 0.003
GABPB1 | GA binding protein transcription factor, beta subunit 1 | chr15 | 1.68 | 0.007
GALNT2 | polypeptide N-acetylgalactosaminyltransferase 2 | chr1 | 1.54 | 0.021
GALNT3 | polypeptide N-acetylgalactosaminyltransferase 3 | chr2 | 1.65 | 0.022
GAREM | GRB2 associated, regulator of MAPK1 | chr18 | 1.51 | 0.022
GAS2L1 | growth arrest-specific 2 like 1 | chr22 | 2.01 | 0.000
GATAD2A | GATA zinc finger domain containing 2A | chr19 | 2.16 | 0.007
GATC | glutamyl-tRNA(Gln) amidotransferase, subunit C | chr12 | 1.52 | 0.014
GCAT | glycine C-acetyltransferase | chr22 | 1.69 | 0.003
GCG | glucagon | chr2 | 4.26 | 0.002
GCNT1 | glucosaminyl (N-acetyl) transferase 1, core 2 | chr9 | 2.07 | 0.002
GDI1 | GDP dissociation inhibitor 1 | chrX | 1.59 | 0.003
GDPD1 | glycerophosphodiester phosphodiesterase domain containing 1 | chr17 | 1.61 | 0.026
GET4 | golgi to ER traffic protein 4 homolog (S. cerevisiae) | chr7 | 1.85 | 0.001
GLUD1 | glutamate dehydrogenase 1 | chr10 | 1.51 | 0.006
GLUD2 | glutamate dehydrogenase 2 | chrX | 1.52 | 0.010
GNPTAB | N-acetylglucosamine-1-phosphate transferase, alpha and beta subunits | chr12 | 1.94 | 0.000
GP2 | glycoprotein 2 (zymogen granule membrane) | chr16 | 1.72 | 0.012
GPATCH3 | G patch domain containing 3 | chr1 | 1.50 | 0.002
GPATCH8 | G patch domain containing 8 | chr17 | 1.52 | 0.002
GPC1 | glypican 1 | chr2 | 1.51 | 0.002

Symbol | Gene Description | Chr | Fold Change | p-Value
GPR137B | G protein-coupled receptor 137B | chr1 | 3.09 | 0.000
GPR137C | G protein-coupled receptor 137C | chr14 | 2.20 | 0.001
GPR158 | G protein-coupled receptor 158 | chr10 | 1.88 | 0.001
GPRIN2 | G protein regulated inducer of neurite outgrowth 2 | chr10 | 1.66 | 0.001
GRIK1 | glutamate receptor, ionotropic, kainate 1 | chr21 | 1.78 | 0.035
GSDMC | gasdermin C | chr8 | 1.51 | 0.009
GSE1 | Gse1 coiled-coil protein | chr16 | 1.61 | 0.002
GSPT1 | G1 to S phase transition 1 | chr16 | 1.68 | 0.003
GTDC1 | glycosyltransferase-like domain containing 1 | chr2 | 1.68 | 0.002
GTF2E1 | general transcription factor IIE, polypeptide 1, alpha 56kDa | chr3 | 1.55 | 0.001
H1FX | H1 histone family, member X | chr3 | 1.80 | 0.017
H3F3AP3 | H3 histone, family 3A pseudogene 3 | chr3 | 1.62 | 0.034
HCCS | holocytochrome c synthase | chrX | 1.80 | 0.015
HECTD2 | HECT domain containing E3 ubiquitin protein ligase 2 | chr10 | 1.84 | 0.013
HEPACAM | hepatic and glial cell adhesion molecule | chr11 | 2.51 | 0.000
HIC2 | hypermethylated in cancer 2 | chr22 | 1.90 | 0.002
HIST1H3A | histone cluster 1, H3a | chr6 | 1.64 | 0.032
HIST2H2BE | histone cluster 2, H2be | chr1 | 1.69 | 0.046
HMBOX1 | homeobox containing 1 | chr8 | 1.81 | 0.000
HMGCR | 3-hydroxy-3-methylglutaryl-CoA reductase | chr5 | 1.60 | 0.002
HMGCS1 | 3-hydroxy-3-methylglutaryl-CoA synthase 1 (soluble) | chr5 | 2.23 | 0.000
HPX | hemopexin | chr11 | 1.61 | 0.006
HTT | huntingtin | chr4 | 1.54 | 0.028
IAH1 | isoamyl acetate-hydrolyzing esterase 1 homolog (S. cerevisiae) | chr2 | 1.64 | 0.002
ICE2 | interactor of little elongator complex ELL subunit 2 | chr15 | 1.50 | 0.027
ID2 | inhibitor of DNA binding 2, dominant negative helix-loop-helix protein | chr2 | 2.45 | 0.001
ID4 | inhibitor of DNA binding 4, dominant negative helix-loop-helix protein | chr6 | 2.38 | 0.004
IGF2R | insulin-like growth factor 2 receptor | chr6 | 2.02 | 0.001
IGSF3 | immunoglobulin superfamily, member 3 | chr1 | 1.56 | 0.004
IGSF8 | immunoglobulin superfamily, member 8 | chr1 | 1.58 | 0.005
IGSF9 | immunoglobulin superfamily, member 9 | chr1 | 1.59 | 0.004
INHBB | inhibin, beta B | chr2 | 2.04 | 0.000
INPP5A | inositol polyphosphate-5-phosphatase, 40kDa | chr10 | 1.52 | 0.029
INPP5K | inositol polyphosphate-5-phosphatase K | chr17 | 1.65 | 0.001
INSIG1 | insulin induced gene 1 | chr7 | 1.61 | 0.028
INSIG2 | insulin induced gene 2 | chr2 | 1.78 | 0.008
INSL5 | insulin-like 5 | chr1 | 1.82 | 0.018
IP6K1 | inositol hexakisphosphate kinase 1 | chr3 | 1.54 | 0.001
ISL2 | ISL LIM homeobox 2 | chr15 | 1.76 | 0.004
ITGA5 | integrin, alpha 5 (fibronectin receptor, alpha polypeptide) | chr12 | 1.72 | 0.007
ITSN1 | intersectin 1 (SH3 domain protein) | chr21 | 1.69 | 0.001
JAG1 | jagged 1 | chr20 | 2.32 | 0.000
JUN | jun proto-oncogene | chr1 | 1.59 | 0.023
KALRN | kalirin, RhoGEF kinase | chr3 | 1.82 | 0.022
KATNAL1 | katanin p60 subunit A-like 1 | chr13 | 1.97 | 0.001
KBTBD2 | kelch repeat and BTB (POZ) domain containing 2 | chr7 | 1.68 | 0.002
KDM3A | lysine (K)-specific demethylase 3A | chr2 | 1.79 | 0.001
KHNYN | KH and NYN domain containing | chr14 | 2.56 | 0.000
KIAA0513 | KIAA0513 | chr16 | 1.52 | 0.004
KIAA1147 | KIAA1147 | chr7 | 1.55 | 0.020
KIAA1467 | KIAA1467 | chr12 | 1.56 | 0.049
KIF5C | kinesin family member 5C | chr2 | 1.84 | 0.002
KLC1 | kinesin light chain 1 | chr14 | 1.58 | 0.004
KLHL15 | kelch-like family member 15 | chrX | 1.62 | 0.006
KLHL26 | kelch-like family member 26 | chr19 | 1.53 | 0.006

Symbol | Gene Description | Chr | Fold Change | p-Value
KLHL28 | kelch-like family member 28 | chr14 | 1.82 | 0.007
KLHL9 | kelch-like family member 9 | chr9 | 1.55 | 0.017
KPNA1 | karyopherin alpha 1 (importin alpha 5) | chr3 | 1.67 | 0.002
KPNA5 | karyopherin alpha 5 (importin alpha 6) | chr6 | 1.50 | 0.011
KRAS | Kirsten rat sarcoma viral oncogene homolog | chr12 | 1.87 | 0.000
LACTB | lactamase, beta | chr15 | 1.73 | 0.006
LAMA3 | laminin, alpha 3 | chr18 | 1.80 | 0.010
LARGE | like-glycosyltransferase | chr22 | 1.53 | 0.025
LARP1B | La ribonucleoprotein domain family, member 1B | chr4 | 1.57 | 0.028
LASP1 | LIM and SH3 protein 1 | chr17 | 1.57 | 0.038
LCOR | ligand dependent nuclear receptor corepressor | chr10 | 2.04 | 0.008
LCORL | ligand dependent nuclear receptor corepressor-like | chr4 | 1.61 | 0.001
LDLR | low density lipoprotein receptor | chr19 | 1.74 | 0.004
LEPREL1 | leprecan-like 1 | chr3 | 1.63 | 0.006
LEPROT | leptin receptor overlapping transcript | chr1 | 1.75 | 0.003
LEPROTL1 | leptin receptor overlapping transcript-like 1 | chr8 | 2.02 | 0.000
LGALS8 | lectin, galactoside-binding, soluble, 8 | chr1 | 1.88 | 0.003
LINC00161 | long intergenic non-protein coding RNA 161 | chr21 | 1.90 | 0.001
LINC00472 | long intergenic non-protein coding RNA 472 | chr6 | 1.69 | 0.002
LINC00476 | long intergenic non-protein coding RNA 476 | chr9 | 1.75 | 0.000
LINC00938 | long intergenic non-protein coding RNA 938 | chr12 | 1.58 | 0.003
LINC01003 | long intergenic non-protein coding RNA 1003 | chr7 | 1.53 | 0.025
LPIN1 | lipin 1 | chr2 | 2.65 | 0.000
LPPR2 | Lipid phosphate phosphatase-related protein type 2 | chr19 | 1.76 | 0.001
LRFN4 | leucine rich repeat and fibronectin type III domain containing 4 | chr11 | 1.61 | 0.009
LRRC3 | leucine rich repeat containing 3 | chr21 | 1.54 | 0.001
LRRC8A | leucine rich repeat containing 8 family, member A | chr9 | 1.61 | 0.001
LRRFIP2 | leucine rich repeat (in FLII) interacting protein 2 | chr3 | 1.64 | 0.001
LRRN3 | leucine rich repeat neuronal 3 | chr7 | 1.51 | 0.009
LSS | lanosterol synthase (2,3-oxidosqualene-lanosterol cyclase) | chr21 | 2.39 | 0.000
LYST | lysosomal trafficking regulator | chr1 | 1.72 | 0.004
LZTFL1 | leucine zipper transcription factor-like 1 | chr3 | 2.78 | 0.002
MAGI3 | membrane associated guanylate kinase, WW and PDZ domain containing 3 | chr1 | 1.55 | 0.002
MAN1A1 | mannosidase, alpha, class 1A, member 1 | chr6 | 2.14 | 0.013
MAP2K4 | mitogen-activated protein kinase kinase 4 | chr17 | 2.32 | 0.000
MAP3K13 | mitogen-activated protein kinase kinase kinase 13 | chr3 | 1.88 | 0.003
MAPKAP1 | mitogen-activated protein kinase associated protein 1 | chr9 | 2.50 | 0.000
MBD2 | methyl-CpG binding domain protein 2 | chr18 | 3.13 | 0.000
MBTD1 | mbt domain containing 1 | chr17 | 1.53 | 0.003
MBTPS1 | membrane-bound transcription factor peptidase, site 1 | chr16 | 1.74 | 0.001
MCFD2 | multiple coagulation factor deficiency 2 | chr2 | 1.60 | 0.031
MCL1 | myeloid cell leukemia 1 | chr1 | 2.07 | 0.001
MCTP1 | multiple C2 domains, transmembrane 1 | chr5 | 1.74 | 0.017
MCTP2 | multiple C2 domains, transmembrane 2 | chr15 | 2.60 | 0.014
MED28 | mediator complex subunit 28 | chr4 | 1.58 | 0.025
MED4 | mediator complex subunit 4 | chr13 | 1.58 | 0.010
METTL16 | methyltransferase like 16 | chr17 | 1.65 | 0.000
MFSD12 | major facilitator superfamily domain containing 12 | chr19 | 1.66 | 0.005
MGAT4A | mannosyl (alpha-1,3-)-glycoprotein beta-1,4-N-acetylglucosaminyltransferase, isozyme A | chr2 | 2.08 | 0.000
MIEF1 | mitochondrial elongation factor 1 | chr22 | 2.08 | 0.000
MIER2 | mesoderm induction early response 1, family member 2 | chr19 | 1.62 | 0.008
MINA | MYC induced nuclear antigen | chr3 | 1.70 | 0.000
MIR600HG | MIR600 host gene (non-protein coding) | chr9 | 1.74 | 0.004
MIR99AHG | mir-99a-let-7c cluster host gene (non-protein coding) | chr21 | 1.78 | 0.026

Symbol | Gene Description | Chr | Fold Change | p-Value
MKLN1 | muskelin 1, intracellular mediator containing kelch motifs | chr7 | 1.64 | 0.002
MKRN3 | makorin ring finger protein 3 | chr15 | 1.62 | 0.005
MLPH | melanophilin | chr2 | 1.77 | 0.006
MMAA | methylmalonic aciduria (cobalamin deficiency) cblA type | chr4 | 1.84 | 0.001
MOSPD2 | motile sperm domain containing 2 | chrX | 1.65 | 0.002
MPZL3 | myelin protein zero-like 3 | chr11 | 1.50 | 0.005
MRGBP | MRG/MORF4L binding protein | chr20 | 1.63 | 0.008
MRPL45 | mitochondrial ribosomal protein L45 | chr17 | 1.53 | 0.009
MRS2 | MRS2 magnesium transporter | chr6 | 1.60 | 0.005
MSMO1 | methylsterol monooxygenase 1 | chr4 | 2.16 | 0.001
MTAP | methylthioadenosine phosphorylase | chr9 | 2.01 | 0.002
MTERF4 | mitochondrial transcription termination factor 4 | chr2 | 1.53 | 0.018
MTF1 | metal-regulatory transcription factor 1 | chr1 | 1.54 | 0.012
MTF2 | metal response element binding transcription factor 2 | chr1 | 1.51 | 0.013
MTHFR | methylenetetrahydrofolate reductase (NAD(P)H) | chr1 | 2.23 | 0.001
MTMR10 | myotubularin related protein 10 | chr15 | 1.62 | 0.004
MVK | mevalonate kinase | chr12 | 1.77 | 0.001
MYCBP2 | MYC binding protein 2, E3 ubiquitin protein ligase | chr13 | 1.71 | 0.015
MYH10 | myosin, heavy chain 10, non-muscle | chr17 | 1.78 | 0.003
MYNN | myoneurin | chr3 | 2.14 | 0.027
MYO10 | myosin X | chr5 | 1.82 | 0.003
MYO1B | myosin IB | chr2 | 1.86 | 0.003
N4BP1 | NEDD4 binding protein 1 | chr16 | 1.56 | 0.025
N6AMT1 | N-6 adenine-specific DNA methyltransferase 1 (putative) | chr21 | 1.82 | 0.000
NAP1L1 | nucleosome assembly protein 1-like 1 | chr12 | 1.93 | 0.000
NAT8L | N-acetyltransferase 8-like (GCN5-related, putative) | chr4 | 1.80 | 0.035
NDN | necdin, melanoma antigen (MAGE) family member | chr15 | 1.52 | 0.009
NECAP2 | NECAP endocytosis associated 2 | chr1 | 1.57 | 0.003
NEDD8 | neural precursor cell expressed, developmentally down-regulated 8 | chr14 | 1.55 | 0.004
NEDD9 | neural precursor cell expressed, developmentally down-regulated 9 | chr6 | 1.85 | 0.024
NEK6 | NIMA-related kinase 6 | chr9 | 1.94 | 0.001
NEU1 | sialidase 1 (lysosomal sialidase) | chr6 | 1.57 | 0.010
NFYA | nuclear transcription factor Y, alpha | chr6 | 1.67 | 0.009
NGRN | neugrin, neurite outgrowth associated | chr15 | 1.92 | 0.000
NKIRAS1 | NFKB inhibitor interacting Ras-like 1 | chr3 | 2.44 | 0.000
NKPD1 | NTPase, KAP family P-loop domain containing 1 | chr19 | 1.96 | 0.000
NPHP3 | nephronophthisis 3 (adolescent) | chr3 | 1.52 | 0.003
NR3C2 | nuclear receptor subfamily 3, group C, member 2 | chr4 | 1.67 | 0.001
NRAS | neuroblastoma RAS viral (v-ras) oncogene homolog | chr1 | 1.56 | 0.022
NRBF2 | nuclear receptor binding factor 2 | chr10 | 1.82 | 0.000
NRBP1 | nuclear receptor binding protein 1 | chr2 | 1.75 | 0.004
NRN1L | neuritin 1-like | chr16 | 1.75 | 0.001
NSMF | NMDA receptor synaptonuclear signaling and neuronal migration factor | chr9 | 1.87 | 0.003
NUDT16 | nudix (nucleoside diphosphate linked moiety X)-type motif 16 | chr3 | 1.58 | 0.001
NUPL1 | nucleoporin like 1 | chr13 | 2.00 | 0.000
NUS1 | nuclear undecaprenyl pyrophosphate synthase 1 homolog (S. cerevisiae) | chr6 | 1.59 | 0.002
OSBPL8 | oxysterol binding protein-like 8 | chr12 | 1.62 | 0.025
OTUB2 | OTU deubiquitinase, ubiquitin aldehyde binding 2 | chr14 | 1.66 | 0.002
PACRG | PARK2 co-regulated | chr6 | 3.67 | 0.005
PAG1 | phosphoprotein membrane anchor with glycosphingolipid microdomains 1 | chr8 | 1.54 | 0.003
PAN3 | PAN3 poly(A) specific ribonuclease subunit | chr13 | 1.50 | 0.001
PAPD5 | PAP associated domain containing 5 | chr16 | 1.58 | 0.002
PARD6B | par-6 family cell polarity regulator beta | chr20 | 1.85 | 0.000

Symbol | Gene Description | Chr | Fold Change | p-Value
PBX1 | pre-B-cell leukemia homeobox 1 | chr1 | 1.56 | 0.027
PCAT6 | prostate cancer associated transcript 6 (non-protein coding) | chr1 | 1.59 | 0.011
PCF11 | PCF11 cleavage and polyadenylation factor subunit | chr11 | 1.87 | 0.004
PCSK7 | proprotein convertase subtilisin/kexin type 7 | chr11 | 1.79 | 0.000
PDCD2 | programmed cell death 2 | chr6 | 2.33 | 0.000
PDCD6 | programmed cell death 6 | chr5 | 2.30 | 0.000
PDIA4 | protein disulfide isomerase family A, member 4 | chr7 | 1.74 | 0.040
PDLIM5 | PDZ and LIM domain 5 | chr4 | 1.61 | 0.007
PEX13 | peroxisomal biogenesis factor 13 | chr2 | 1.60 | 0.001
PEX2 | peroxisomal biogenesis factor 2 | chr8 | 1.51 | 0.005
PFDN4 | prefoldin subunit 4 | chr20 | 1.83 | 0.001
PFKFB2 | 6-phosphofructo-2-kinase/fructose-2,6-biphosphatase 2 | chr1 | 1.92 | 0.001
PGM2 | phosphoglucomutase 2 | chr4 | 1.59 | 0.009
PGM2L1 | phosphoglucomutase 2-like 1 | chr11 | 1.99 | 0.003
PGRMC2 | progesterone receptor membrane component 2 | chr4 | 1.59 | 0.007
PHF20L1 | PHD finger protein 20-like 1 | chr8 | 1.58 | 0.012
PIAS4 | protein inhibitor of activated STAT, 4 | chr19 | 1.79 | 0.000
PIKFYVE | phosphoinositide kinase, FYVE finger containing | chr2 | 1.83 | 0.000
PIP4K2A | phosphatidylinositol-5-phosphate 4-kinase, type II, alpha | chr10 | 1.80 | 0.001
PITX1 | paired-like homeodomain 1 | chr5 | 1.60 | 0.014
PITX2 | paired-like homeodomain 2 | chr4 | 1.70 | 0.001
PKD1 | polycystic kidney disease 1 (autosomal dominant) | chr16 | 1.86 | 0.000
PLAA | phospholipase A2-activating protein | chr9 | 1.72 | 0.005
PLD1 | phospholipase D1, phosphatidylcholine-specific | chr3 | 2.40 | 0.009
PLEKHH1 | pleckstrin homology domain containing, family H (with MyTH4 domain) member 1 | chr14 | 1.71 | 0.008
PNMA1 | paraneoplastic Ma antigen 1 | chr14 | 1.69 | 0.002
POFUT1 | protein O-fucosyltransferase 1 | chr20 | 1.66 | 0.002
POLR3F | polymerase (RNA) III (DNA directed) polypeptide F, 39 kDa | chr20 | 1.52 | 0.002
PPA2 | pyrophosphatase (inorganic) 2 | chr4 | 1.75 | 0.001
PPAP2B | phosphatidic acid phosphatase type 2B | chr1 | 2.31 | 0.000
PPFIBP1 | PTPRF interacting protein, binding protein 1 (liprin beta 1) | chr12 | 1.80 | 0.011
PPM1A | protein phosphatase, Mg2+/Mn2+ dependent, 1A | chr14 | 1.55 | 0.010
PPM1H | protein phosphatase, Mg2+/Mn2+ dependent, 1H | chr12 | 1.73 | 0.036
PPP1CB | protein phosphatase 1, catalytic subunit, beta isozyme | chr2 | 2.67 | 0.000
PPP2R1B | protein phosphatase 2, regulatory subunit A, beta | chr11 | 3.65 | 0.000
PPP2R5B | protein phosphatase 2, regulatory subunit B', beta | chr11 | 1.74 | 0.021
PPP3CB | protein phosphatase 3, catalytic subunit, beta isozyme | chr10 | 1.59 | 0.006
PPT2 | palmitoyl-protein thioesterase 2 | chr6 | 2.17 | 0.007
PRDM2 | PR domain containing 2, with ZNF domain | chr1 | 1.60 | 0.004
PREPL | prolyl endopeptidase-like | chr2 | 1.59 | 0.016
PRKCE | protein kinase C, epsilon | chr2 | 1.64 | 0.001
PROSER2 | proline and serine rich 2 | chr10 | 1.73 | 0.001
PRRC2C | proline-rich coiled-coil 2C | chr1 | 1.62 | 0.001
PRSS8 | protease, serine, 8 | chr16 | 1.53 | 0.025
PSMB4 | proteasome (prosome, macropain) subunit, beta type, 4 | chr1 | 1.51 | 0.003
PSMD11 | proteasome (prosome, macropain) 26S subunit, non-ATPase, 11 | chr17 | 1.67 | 0.002
PSMG4 | proteasome (prosome, macropain) assembly chaperone 4 | chr6 | 1.51 | 0.041
PTPMT1 | protein tyrosine phosphatase, mitochondrial 1 | chr11 | 3.53 | 0.000
PTPN1 | protein tyrosine phosphatase, non-receptor type 1 | chr20 | 2.38 | 0.000
PTPN13 | protein tyrosine phosphatase, non-receptor type 13 (APO-1/CD95 (Fas)-associated phosphatase) | chr4 | 1.78 | 0.019
PTPN14 | protein tyrosine phosphatase, non-receptor type 14 | chr1 | 1.66 | 0.007
PTPN21 | protein tyrosine phosphatase, non-receptor type 21 | chr14 | 2.38 | 0.000
PTPN5 | protein tyrosine phosphatase, non-receptor type 5 (striatum-enriched) | chr11 | 2.23 | 0.001
PTPRR | protein tyrosine phosphatase, receptor type, R | chr12 | 2.32 | 0.002

Symbol | Gene Description | Chr | Fold Change | p-Value
PYGO2 | pygopus family PHD finger 2 | chr1 | 2.20 | 0.000
QSOX1 | quiescin Q6 sulfhydryl oxidase 1 | chr1 | 1.82 | 0.037
RAB3A | RAB3A, member RAS oncogene family | chr19 | 1.69 | 0.016
RAB3D | RAB3D, member RAS oncogene family | chr19 | 1.73 | 0.000
RABGGTB | Rab geranylgeranyltransferase, beta subunit | chr1 | 2.05 | 0.005
RAD54L2 | RAD54-like 2 (S. cerevisiae) | chr3 | 1.60 | 0.000
RAP2C | RAP2C, member of RAS oncogene family | chrX | 1.65 | 0.003
RASA1 | RAS p21 protein activator (GTPase activating protein) 1 | chr5 | 1.59 | 0.004
RASA2 | RAS p21 protein activator 2 | chr3 | 1.67 | 0.015
RBFOX2 | RNA binding protein, fox-1 homolog (C. elegans) 2 | chr22 | 2.00 | 0.000
RBM24 | RNA binding motif protein 24 | chr6 | 1.74 | 0.001
RBPJ | recombination signal binding protein for immunoglobulin kappa J region | chr4 | 1.85 | 0.002
RCE1 | Ras converting CAAX endopeptidase 1 | chr11 | 1.55 | 0.009
RCHY1 | ring finger and CHY zinc finger domain containing 1, E3 ubiquitin protein ligase | chr4 | 1.52 | 0.018
RDH10 | retinol dehydrogenase 10 (all-trans) | chr8 | 1.52 | 0.014
RFK | riboflavin kinase | chr9 | 2.36 | 0.000
RGL1 | ral guanine nucleotide dissociation stimulator-like 1 | chr1 | 1.51 | 0.002
RGS9BP | regulator of G protein signaling 9 binding protein | chr19 | 1.87 | 0.009
RHOB | ras homolog family member B | chr2 | 2.11 | 0.001
RHOU | ras homolog family member U | chr1 | 3.49 | 0.000
RHPN2 | rhophilin, Rho GTPase binding protein 2 | chr19 | 1.75 | 0.003
RIC1 | RAB6A GEF complex partner 1 | chr9 | 1.73 | 0.015
RIMKLA | ribosomal modification protein rimK-like family member A | chr1 | 1.71 | 0.002
RNF144A | ring finger protein 144A | chr2 | 1.50 | 0.006
RNF19A | ring finger protein 19A, RBR E3 ubiquitin protein ligase | chr8 | 1.65 | 0.007
RNF6 | ring finger protein (C3H2C3 type) 6 | chr13 | 1.87 | 0.001
RP11-10L12.4 |  | chr4 | 1.75 | 0.001
RP11-119B16.2 |  | chr20 | 1.78 | 0.001
RP11-20I20.4 |  | chr4 | 1.87 | 0.000
RP11-279F6.1 |  | chr15 | 2.24 | 0.001
RP11-279F6.2 |  | chr15 | 1.52 | 0.047
RP11-279O9.4 |  | chr4 | 1.55 | 0.035
RP11-327P2.5 |  | chr13 | 1.72 | 0.001
RP11-32B5.8 |  | chr15 | 1.79 | 0.038
RP11-356O9.1 |  | chr14 | 1.74 | 0.004
RP11-35N6.1 | Lipid phosphate phosphatase-related protein type 1 | chr9 | 1.57 | 0.013
RP11-362K14.6 |  | chr3 | 1.71 | 0.028
RP11-379H18.1 |  | chr7 | 1.62 | 0.002
RP11-420A23.1 |  | chr4 | 1.68 | 0.004
RP11-448A19.1 |  | chr7 | 1.84 | 0.005
RP11-467L19.16 |  | chr15 | 2.07 | 0.000
RP11-513I15.6 |  | chr6 | 1.51 | 0.017
RP11-533E19.7 |  | chr1 | 1.75 | 0.002
RP11-563N4.1 |  | chr2 | 2.14 | 0.004

Symbol | Gene Description | Chr | Fold Change | p-Value
RP11-567M16.6 |  | chr18 | 1.70 | 0.001
RP11-574F21.2 |  | chr1 | 1.60 | 0.007
RP11-629O1.2 |  | chr8 | 1.77 | 0.000
RP11-670E13.6 |  | chr17 | 1.69 | 0.032
RP11-672L10.6 |  | chr18 | 2.13 | 0.001
RP11-708J19.1 |  | chr3 | 1.66 | 0.014
RP11-727F15.12 |  | chr11 | 1.52 | 0.018
RP11-85A1.3 |  | chr10 | 1.56 | 0.037
RP11-872J21.3 |  | chr14 | 1.54 | 0.007
RPL29 | ribosomal protein L29 | chr3 | 1.76 | 0.002
RPTOR | regulatory associated protein of MTOR, complex 1 | chr17 | 1.56 | 0.001
RPUSD3 | RNA pseudouridylate synthase domain containing 3 | chr3 | 1.55 | 0.006
RSPRY1 | ring finger and SPRY domain containing 1 | chr16 | 1.54 | 0.001
RTCA | RNA 3'-terminal phosphate cyclase | chr1 | 1.56 | 0.001
RTN1 | reticulon 1 | chr14 | 1.60 | 0.023
RTN4 | reticulon 4 | chr2 | 1.96 | 0.000
SBF1 | SET binding factor 1 | chr22 | 1.87 | 0.000
SBF2 | SET binding factor 2 | chr11 | 1.79 | 0.001
SBNO1 | strawberry notch homolog 1 (Drosophila) | chr12 | 1.73 | 0.001
SCAF1 | SR-related CTD-associated factor 1 | chr19 | 1.51 | 0.004
SCAMP3 | secretory carrier membrane protein 3 | chr1 | 1.51 | 0.001
SCARNA13 | small Cajal body-specific RNA 13 | chr14 | 2.75 | 0.012
SCARNA14 | small Cajal body-specific RNA 14 | chr15 | 2.06 | 0.015
SCARNA20 | small Cajal body-specific RNA 20 | chr17 | 2.70 | 0.009
SCARNA8 | small Cajal body-specific RNA 8 | chr9 | 2.29 | 0.033
SCGB1D2 | secretoglobin, family 1D, member 2 | chr11 | 1.54 | 0.006
SCLY | selenocysteine lyase | chr2 | 1.71 | 0.002
SCUBE2 | signal peptide, CUB domain, EGF-like 2 | chr11 | 2.93 | 0.001
SEL1L | sel-1 suppressor of lin-12-like (C. elegans) | chr14 | 2.03 | 0.002
SEMA4F | sema domain, immunoglobulin domain (Ig), transmembrane domain (TM) and short cytoplasmic domain, (semaphorin) 4F | chr2 | 1.70 | 0.002
SERINC5 | serine incorporator 5 | chr5 | 1.51 | 0.003
SFXN1 | sideroflexin 1 | chr5 | 1.67 | 0.003
SH3BP4 | SH3-domain binding protein 4 | chr2 | 1.53 | 0.036
SI | sucrase-isomaltase (alpha-glucosidase) | chr3 | 4.22 | 0.030
SIAE | sialic acid acetylesterase | chr11 | 1.69 | 0.006
SKI | SKI proto-oncogene | chr1 | 1.53 | 0.002
SKIV2L2 | superkiller viralicidic activity 2-like 2 (S. cerevisiae) | chr5 | 2.01 | 0.000
SLC11A2 | solute carrier family 11 (proton-coupled divalent metal ion transporter), member 2 | chr12 | 1.51 | 0.007
SLC12A6 | solute carrier family 12 (potassium/chloride transporter), member 6 | chr15 | 1.53 | 0.003
SLC17A5 | solute carrier family 17 (acidic sugar transporter), member 5 | chr6 | 1.51 | 0.017
SLC19A1 | solute carrier family 19 (folate transporter), member 1 | chr21 | 1.65 | 0.002
SLC22A3 | solute carrier family 22 (organic cation transporter), member 3 | chr6 | 1.87 | 0.002
SLC25A16 | solute carrier family 25 (mitochondrial carrier), member 16 | chr10 | 1.83 | 0.001
SLC25A17 | solute carrier family 25 (mitochondrial carrier; peroxisomal membrane protein, 34kDa), member 17 | chr22 | 1.75 | 0.001
SLC26A1 | solute carrier family 26 (anion exchanger), member 1 | chr4 | 1.63 | 0.030
SLC30A10 | solute carrier family 30, member 10 | chr1 | 2.14 | 0.000
SLC31A1 | solute carrier family 31 (copper transporter), member 1 | chr9 | 1.65 | 0.044

Symbol | Gene Description | Chr | Fold Change | p-Value
SLC35B4 | solute carrier family 35 (UDP-xylose/UDP-N-acetylglucosamine transporter), member B4 | chr7 | 1.70 | 0.004
SLC36A1 | solute carrier family 36 (proton/amino acid symporter), member 1 | chr5 | 2.12 | 0.000
SLC38A2 | solute carrier family 38, member 2 | chr12 | 1.73 | 0.000
SLC39A13 | solute carrier family 39 (zinc transporter), member 13 | chr11 | 1.54 | 0.007
SLC39A9 | solute carrier family 39, member 9 | chr14 | 1.61 | 0.001
SLC41A3 | solute carrier family 41, member 3 | chr3 | 1.54 | 0.034
SLC44A1 | solute carrier family 44 (choline transporter), member 1 | chr9 | 1.57 | 0.003
SLC45A3 | solute carrier family 45, member 3 | chr1 | 1.66 | 0.003
SLC5A6 | solute carrier family 5 (sodium/multivitamin and iodide cotransporter), member 6 | chr2 | 1.68 | 0.026
SLC7A13 | solute carrier family 7 (anionic amino acid transporter), member 13 | chr8 | 1.52 | 0.003
SLC9A6 | solute carrier family 9, subfamily A (NHE6, cation proton antiporter 6), member 6 | chrX | 1.52 | 0.001
SLK | STE20-like kinase | chr10 | 1.66 | 0.003
SMAP1 | small ArfGAP 1 | chr6 | 1.73 | 0.003
SMUG1 | single-strand-selective monofunctional uracil-DNA glycosylase 1 | chr12 | 2.52 | 0.000
SNHG10 | small nucleolar RNA host gene 10 (non-protein coding) | chr14 | 1.71 | 0.011
SNHG14 | small nucleolar RNA host gene 14 (non-protein coding) | chr15 | 2.03 | 0.004
SNHG16 | small nucleolar RNA host gene 16 (non-protein coding) | chr17 | 1.98 | 0.005
SNHG20 | small nucleolar RNA host gene 20 (non-protein coding) | chr17 | 1.55 | 0.006
SNN | stannin | chr16 | 1.53 | 0.003
SNORA13 | small nucleolar RNA, H/ACA box 13 | chr5 | 2.05 | 0.026
SNORA28 | small nucleolar RNA, H/ACA box 28 | chr14 | 1.65 | 0.015
SNORA37 | small nucleolar RNA, H/ACA box 37 | chr18 | 1.59 | 0.027
SNORA51 | small nucleolar RNA, H/ACA box 51 | chr20 | 2.25 | 0.026
SNORA74B | small nucleolar RNA, H/ACA box 74B | chr5 | 1.88 | 0.008
SNRPN | small nuclear ribonucleoprotein polypeptide N | chr15 | 1.68 | 0.007
SNX10 | sorting nexin 10 | chr7 | 1.59 | 0.013
SOCS7 | suppressor of cytokine signaling 7 | chr17 | 1.56 | 0.006
SOX11 | SRY (sex determining region Y)-box 11 | chr2 | 1.67 | 0.049
SOX9 | SRY (sex determining region Y)-box 9 | chr17 | 1.58 | 0.008
SPAG6 | sperm associated antigen 6 | chr10 | 2.79 | 0.000
SPHK2 | sphingosine kinase 2 | chr19 | 1.56 | 0.008
SPON2 | spondin 2, extracellular matrix protein | chr4 | 1.53 | 0.016
SPRTN | SprT-like N-terminal domain | chr1 | 1.68 | 0.001
SPTBN2 | spectrin, beta, non-erythrocytic 2 | chr11 | 1.58 | 0.007
SQLE | squalene epoxidase | chr8 | 1.71 | 0.000
SRPR | signal recognition particle receptor (docking protein) | chr11 | 1.70 | 0.001
SSBP3 | single stranded DNA binding protein 3 | chr1 | 1.57 | 0.002
SSFA2 | sperm specific antigen 2 | chr2 | 1.95 | 0.016
SSTR2 | somatostatin receptor 2 | chr17 | 1.52 | 0.008
STARD4 | StAR-related lipid transfer (START) domain containing 4 | chr5 | 1.67 | 0.001
STRBP | spermatid perinuclear RNA binding protein | chr9 | 1.84 | 0.001
SUSD4 | sushi domain containing 4 | chr1 | 1.97 | 0.000
SWSAP1 | SWIM-type zinc finger 7 associated protein 1 | chr19 | 1.96 | 0.000
SYT7 | synaptotagmin VII | chr11 | 1.55 | 0.018
TADA2B | transcriptional adaptor 2B | chr4 | 1.60 | 0.003
TAPT1 | transmembrane anterior posterior transformation 1 | chr4 | 2.05 | 0.000
TARS2 | threonyl-tRNA synthetase 2, mitochondrial (putative) | chr1 | 1.56 | 0.034
TARSL2 | threonyl-tRNA synthetase-like 2 | chr15 | 1.80 | 0.001
TBL1XR1 | transducin (beta)-like 1 X-linked receptor 1 | chr3 | 1.51 | 0.008
TBRG1 | transforming growth factor beta regulator 1 | chr11 | 1.79 | 0.001
TC2N | tandem C2 domains, nuclear | chr14 | 1.61 | 0.010
TENM2 | teneurin transmembrane protein 2 | chr5 | 1.66 | 0.023

Symbol | Gene Description | Chr | Fold Change | p-Value
TESK2 | testis-specific kinase 2 | chr1 | 2.64 | 0.000
TFRC | transferrin receptor | chr3 | 1.56 | 0.029
THAP7-AS1 | THAP7 antisense RNA 1 | chr22 | 1.51 | 0.017
TIMM17B | translocase of inner mitochondrial membrane 17 homolog B (yeast) | chrX | 1.51 | 0.007
TK2 | thymidine kinase 2, mitochondrial | chr16 | 2.03 | 0.001
TLDC1 | TBC/LysM-associated domain containing 1 | chr16 | 1.60 | 0.003
TMCC1 | transmembrane and coiled-coil domain family 1 | chr3 | 1.69 | 0.006
TMCO1 | transmembrane and coiled-coil domains 1 | chr1 | 1.59 | 0.001
TMEM135 | transmembrane protein 135 | chr11 | 1.53 | 0.018
TMEM145 | transmembrane protein 145 | chr19 | 1.63 | 0.007
TMEM164 | transmembrane protein 164 | chrX | 1.81 | 0.001
TMEM2 | transmembrane protein 2 | chr9 | 2.97 | 0.000
TMEM241 | transmembrane protein 241 | chr18 | 1.98 | 0.000
TMEM245 | transmembrane protein 245 | chr9 | 1.52 | 0.021
TMEM41B | transmembrane protein 41B | chr11 | 1.62 | 0.003
TMTC3 | transmembrane and tetratricopeptide repeat containing 3 | chr12 | 2.52 | 0.000
TMX1 | thioredoxin-related transmembrane protein 1 | chr14 | 2.19 | 0.003
TNFRSF11B | tumor necrosis factor receptor superfamily, member 11b | chr8 | 1.59 | 0.036
TNKS1BP1 | tankyrase 1 binding protein 1, 182kDa | chr11 | 1.56 | 0.010
TNKS2 | tankyrase, TRF1-interacting ankyrin-related ADP-ribose polymerase 2 | chr10 | 1.79 | 0.004
TOR1AIP2 | torsin A interacting protein 2 | chr1 | 1.51 | 0.015
TP53INP1 | tumor protein p53 inducible nuclear protein 1 | chr8 | 2.30 | 0.015
TRAM2 | translocation associated membrane protein 2 | chr6 | 1.86 | 0.000
TRAPPC3 | trafficking protein particle complex 3 | chr1 | 1.53 | 0.002
TRIM2 | tripartite motif containing 2 | chr4 | 2.33 | 0.017
TRIM48 | tripartite motif containing 48 | chr11 | 2.10 | 0.012
TRIQK | triple QxxK/R motif containing | chr8 | 1.96 | 0.007
TRPM2 | transient receptor potential cation channel, subfamily M, member 2 | chr21 | 1.68 | 0.002
TRUB2 | TruB pseudouridine (psi) synthase family member 2 | chr9 | 1.55 | 0.046
TSPAN5 | tetraspanin 5 | chr4 | 2.05 | 0.007
TTC21B | tetratricopeptide repeat domain 21B | chr2 | 1.73 | 0.003
TTR | transthyretin | chr18 | 1.88 | 0.001
TTTY15 | testis-specific transcript, Y-linked 15 (non-protein coding) | chrY | 1.78 | 0.000
TULP4 | tubby like protein 4 | chr6 | 1.64 | 0.002
UBE2D3 | ubiquitin-conjugating enzyme E2D 3 | chr4 | 2.70 | 0.000
UBE2G1 | ubiquitin-conjugating enzyme E2G 1 | chr17 | 2.22 | 0.000
UBE2R2 | ubiquitin-conjugating enzyme E2R 2 | chr9 | 2.01 | 0.000
UBE2W | ubiquitin-conjugating enzyme E2W (putative) | chr8 | 1.52 | 0.006
UBOX5 | U-box domain containing 5 | chr20 | 1.71 | 0.000
UFSP1 | UFM1-specific peptidase 1 (non-functional) | chr7 | 1.63 | 0.002
UGCG | UDP-glucose ceramide glucosyltransferase | chr9 | 2.75 | 0.000
UGT2B11 | UDP glucuronosyltransferase 2 family, polypeptide B11 | chr4 | 2.07 | 0.000
UGT2B17 | UDP glucuronosyltransferase 2 family, polypeptide B17 | chr4 | 1.95 | 0.027
UGT2B4 | UDP glucuronosyltransferase 2 family, polypeptide B4 | chr4 | 2.06 | 0.011
UHMK1 | U2AF homology motif (UHM) kinase 1 | chr1 | 1.62 | 0.001
USP25 | ubiquitin specific peptidase 25 | chr21 | 1.92 | 0.001
USP3 | ubiquitin specific peptidase 3 | chr15 | 1.57 | 0.028
USP39 | ubiquitin specific peptidase 39 | chr2 | 1.53 | 0.001
USP47 | ubiquitin specific peptidase 47 | chr11 | 1.55 | 0.008
UTP20 | UTP20, small subunit (SSU) processome component, homolog (yeast) | chr12 | 1.62 | 0.006
UTRN | utrophin | chr6 | 1.54 | 0.011
VAMP4 | vesicle-associated membrane protein 4 | chr1 | 1.56 | 0.003
VAPA | VAMP (vesicle-associated membrane protein)-associated protein A, 33kDa | chr18 | 1.50 | 0.003

Symbol | Gene Description | Chr | Fold Change | p-Value
VGLL4 | vestigial-like family member 4 | chr3 | 2.17 | 0.000
VLDLR | very low density lipoprotein receptor | chr9 | 2.35 | 0.000
VPS13A | vacuolar protein sorting 13 homolog A (S. cerevisiae) | chr9 | 1.58 | 0.004
VPS33A | vacuolar protein sorting 33 homolog A (S. cerevisiae) | chr12 | 1.56 | 0.006
WASF3 | WAS protein family, member 3 | chr13 | 1.82 | 0.001
WDFY2 | WD repeat and FYVE domain containing 2 | chr13 | 2.11 | 0.000
WDFY3 | WD repeat and FYVE domain containing 3 | chr4 | 1.71 | 0.006
WDR1 | WD repeat domain 1 | chr4 | 1.87 | 0.001
WDR20 | WD repeat domain 20 | chr14 | 1.63 | 0.000
WDR4 | WD repeat domain 4 | chr21 | 1.66 | 0.004
WDR83 | WD repeat domain 83 | chr19 | 1.59 | 0.004
WNT5A | wingless-type MMTV integration site family, member 5A | chr3 | 2.16 | 0.002
WNT5B | wingless-type MMTV integration site family, member 5B | chr12 | 1.57 | 0.006
WSB1 | WD repeat and SOCS box containing 1 | chr17 | 1.90 | 0.001
WWOX | WW domain containing oxidoreductase | chr16 | 1.52 | 0.023
WWTR1 | WW domain containing transcription regulator 1 | chr3 | 2.37 | 0.000
YOD1 | YOD1 deubiquitinase | chr1 | 2.42 | 0.000
YRDC | yrdC N(6)-threonylcarbamoyltransferase domain containing | chr1 | 1.85 | 0.000
YTHDF3 | YTH domain family, member 3 | chr8 | 1.61 | 0.001
ZBED4 | zinc finger, BED-type containing 4 | chr22 | 1.57 | 0.004
ZBTB45 | zinc finger and BTB domain containing 45 | chr19 | 1.62 | 0.005
ZBTB5 | zinc finger and BTB domain containing 5 | chr9 | 1.67 | 0.003
ZBTB7A | zinc finger and BTB domain containing 7A | chr19 | 1.65 | 0.001
ZCCHC24 | zinc finger, CCHC domain containing 24 | chr10 | 1.53 | 0.017
ZCCHC3 | zinc finger, CCHC domain containing 3 | chr20 | 1.95 | 0.000
ZDHHC7 | zinc finger, DHHC-type containing 7 | chr16 | 1.52 | 0.010
ZDHHC9 | zinc finger, DHHC-type containing 9 | chrX | 1.52 | 0.005
ZFYVE1 | zinc finger, FYVE domain containing 1 | chr14 | 1.58 | 0.004
ZMAT3 | zinc finger, matrin-type 3 | chr3 | 1.65 | 0.002
ZNF197 | zinc finger protein 197 | chr3 | 1.52 | 0.009
ZNF24 | zinc finger protein 24 | chr18 | 1.65 | 0.001
ZNF264 | zinc finger protein 264 | chr19 | 1.76 | 0.000
ZNF281 | zinc finger protein 281 | chr1 | 1.75 | 0.000
ZNF583 | zinc finger protein 583 | chr19 | 1.51 | 0.002
ZNF687 | zinc finger protein 687 | chr1 | 1.53 | 0.020
ZNF703 | zinc finger protein 703 | chr8 | 2.19 | 0.002
ZNF721 | zinc finger protein 721 | chr4 | 1.89 | 0.007
ZNF75A | zinc finger protein 75a | chr16 | 2.35 | 0.000
ZNF780A | zinc finger protein 780A | chr19 | 1.56 | 0.021
ZNHIT6 | zinc finger, HIT-type containing 6 | chr1 | 1.53 | 0.002
ZPLD1 | zona pellucida-like domain containing 1 | chr3 | 3.76 | 0.000
ZSCAN31 | zinc finger and SCAN domain containing 31 | chr6 | 1.53 | 0.012
ZSCAN9 | zinc finger and SCAN domain containing 9 | chr6 | 1.52 | 0.001
ZSWIM8 | zinc finger, SWIM-type containing 8 | chr10 | 1.64 | 0.037
ZYG11A | zyg-11 family member A, cell cycle regulator | chr1 | 1.60 | 0.004

Appendix E

Genes differentially regulated by IRX4 knockdown in VCaP cells

Symbol | Gene Description | Chr | Fold Change | p-Value
ABCB5 | ATP-binding cassette, sub-family B (MDR/TAP), member 5 | chr7 | -1.73 | 0.001
ABCC4 | ATP-binding cassette, sub-family C (CFTR/MRP), member 4 | chr13 | -1.60 | 0.048
ACOT7 | acyl-CoA thioesterase 7 | chr1 | -4.70 | 0.000
ACSL5 | acyl-CoA synthetase long-chain family member 5 | chr10 | -1.68 | 0.017
AFP | alpha-fetoprotein | chr4 | -2.06 | 0.001
AGBL1 | ATP/GTP binding protein-like 1 | chr15 | -1.86 | 0.018
ALB | albumin | chr4 | -1.99 | 0.004
ALOX5 | arachidonate 5-lipoxygenase | chr10 | -1.59 | 0.031
ANG | angiogenin, ribonuclease, RNase A family, 5 | chr14 | -2.62 | 0.008
AOX1 | aldehyde oxidase 1 | chr2 | -1.68 | 0.018
APLP2 | amyloid beta (A4) precursor-like protein 2 | chr11 | -2.00 | 0.003
APOH | apolipoprotein H (beta-2-glycoprotein I) | chr17 | -2.38 | 0.001
ARHGAP33 | Rho GTPase activating protein 33 | chr19 | -1.53 | 0.025
ARSB | arylsulfatase B | chr5 | -1.57 | 0.008
ASXL1 | additional sex combs like transcriptional regulator 1 | chr20 | -1.51 | 0.040
ATP6V0E1 | ATPase, H+ transporting, lysosomal 9kDa, V0 subunit e1 | chr5 | -1.71 | 0.004
ATP6V0E1P2 | ATPase, H+ transporting, lysosomal 9kDa, V0 subunit e1 pseudogene 2 | chr3 | -1.64 | 0.002
AUNIP | aurora kinase A and ninein interacting protein | chr1 | -1.95 | 0.024
AZGP1 | alpha-2-glycoprotein 1, zinc-binding | chr7 | -1.67 | 0.013
CADM1 | cell adhesion molecule 1 | chr11 | -1.54 | 0.031
CCDC109B | coiled-coil domain containing 109B | chr4 | -1.88 | 0.004
CCDC92 | coiled-coil domain containing 92 | chr12 | -1.50 | 0.012
CCNE2 | cyclin E2 | chr8 | -2.35 | 0.018
CD302 | CD302 molecule | chr2 | -1.56 | 0.036
CDK14 | cyclin-dependent kinase 14 | chr7 | -1.91 | 0.010
CDKL1 | cyclin-dependent kinase-like 1 (CDC2-related kinase) | chr14 | -1.69 | 0.008
CDS2 | CDP-diacylglycerol synthase (phosphatidate cytidylyltransferase) 2 | chr20 | -1.98 | 0.001
CNOT6 | CCR4-NOT transcription complex, subunit 6 | chr5 | -2.02 | 0.000
COL4A4 | collagen, type IV, alpha 4 | chr2 | -1.63 | 0.050
CSTA | cystatin A (stefin A) | chr3 | -2.25 | 0.017
CUX1 | cut-like homeobox 1 | chr7 | -2.02 | 0.002
CYR61 | cysteine-rich, angiogenic inducer, 61 | chr1 | -1.68 | 0.040
DNAJC16 | DnaJ (Hsp40) homolog, subfamily C, member 16 | chr1 | -1.70 | 0.004
DPP4 | dipeptidyl-peptidase 4 | chr2 | -1.60 | 0.042
DUSP6 | dual specificity phosphatase 6 | chr12 | -1.61 | 0.044
EBAG9 | estrogen receptor binding site associated, antigen, 9 | chr8 | -1.52 | 0.013
EHF | ets homologous factor | chr11 | -1.99 | 0.003
ELL2 | elongation factor, RNA polymerase II, 2 | chr5 | -1.74 | 0.008
EPHX3 | epoxide hydrolase 3 | chr19 | -1.62 | 0.047
ERCC1 | excision repair cross-complementation group 1 | chr19 | -1.59 | 0.032
ERVMER34-1 | endogenous retrovirus group MER34, member 1 | chr4 | -2.51 | 0.013
FADS1 | fatty acid desaturase 1 | chr11 | -1.74 | 0.035
FAM104B | family with sequence similarity 104, member B | chrX | -2.07 | 0.000
FAM111A | family with sequence similarity 111, member A | chr11 | -1.54 | 0.040
FAM222B | family with sequence similarity 222, member B | chr17 | -1.67 | 0.041
FAM83E | family with sequence similarity 83, member E | chr19 | -1.67 | 0.049
FANCI | Fanconi anemia, complementation group I | chr15 | -1.62 | 0.020
FERMT1 | fermitin family member 1 | chr20 | -1.56 | 0.041
FREM1 | FRAS1 related extracellular matrix 1 | chr9 | -1.63 | 0.029

Symbol | Gene Description | Chr | Fold Change | p-Value
FREM2 | FRAS1 related extracellular matrix protein 2 | chr13 | -1.51 | 0.009
FUT7 | fucosyltransferase 7 (alpha (1,3) fucosyltransferase) | chr9 | -2.46 | 0.001
GABBR2 | gamma-aminobutyric acid (GABA) B receptor, 2 | chr9 | -1.59 | 0.025
GABRD | gamma-aminobutyric acid (GABA) A receptor, delta | chr1 | -1.65 | 0.035
GAS8 | growth arrest-specific 8 | chr16 | -1.61 | 0.005
GATA6 | GATA binding protein 6 | chr18 | -1.55 | 0.035
GNAS | GNAS complex locus | chr20 | -1.57 | 0.026
GNPDA1 | glucosamine-6-phosphate deaminase 1 | chr5 | -2.40 | 0.000
HABP2 | hyaluronan binding protein 2 | chr10 | -2.35 | 0.001
HAO1 | hydroxyacid oxidase (glycolate oxidase) 1 | chr20 | -1.58 | 0.008
HGF | hepatocyte growth factor (hepapoietin A; scatter factor) | chr7 | -1.57 | 0.007
HGSNAT | heparan-alpha-glucosaminide N-acetyltransferase | chr8 | -1.76 | 0.012
HIST1H3B | histone cluster 1, H3b | chr6 | -2.16 | 0.047
HNF1A | HNF1 homeobox A | chr12 | -1.50 | 0.034
HRH3 | histamine receptor H3 | chr20 | -1.66 | 0.005
HSPA8 | heat shock 70kDa protein 8 | chr11 | -1.55 | 0.030
HSPG2 | heparan sulfate proteoglycan 2 | chr1 | -1.55 | 0.022
HYKK | hydroxylysine kinase | chr15 | -2.09 | 0.001
IFFO1 | intermediate filament family orphan 1 | chr12 | -2.41 | 0.026
IRX4 | iroquois homeobox 4 | chr5 | -5.60 | 0.000
ITGA1 | integrin, alpha 1 | chr5 | -1.59 | 0.032
ITGB3BP | integrin beta 3 binding protein (beta3-endonexin) | chr1 | -2.37 | 0.008
KIF16B | kinesin family member 16B | chr20 | -1.51 | 0.032
KIF1C | kinesin family member 1C | chr17 | -1.56 | 0.011
KIF24 | kinesin family member 24 | chr9 | -2.45 | 0.005
KNTC1 | kinetochore associated 1 | chr12 | -1.66 | 0.045
LCLAT1 | lysocardiolipin acyltransferase 1 | chr2 | -2.86 | 0.000
LINC00883 | long intergenic non-protein coding RNA 883 | chr3 | -1.65 | 0.006
LINC01091 | long intergenic non-protein coding RNA 1091 | chr4 | -1.64 | 0.008
LMAN1 | lectin, mannose-binding, 1 | chr18 | -1.66 | 0.006
LONP2 | lon peptidase 2, peroxisomal | chr16 | -1.57 | 0.003
LOXL3 | lysyl oxidase-like 3 | chr2 | -1.81 | 0.027
LRR1 | leucine rich repeat protein 1 | chr14 | -1.63 | 0.025
MARCKS | myristoylated alanine-rich protein kinase C substrate | chr6 | -1.97 | 0.000
MASTL | microtubule associated serine/threonine kinase-like | chr10 | -1.64 | 0.025
MCAM | melanoma cell adhesion molecule | chr11 | -1.70 | 0.006
MFSD10 | major facilitator superfamily domain containing 10 | chr4 | -1.56 | 0.011
MGP | matrix Gla protein | chr12 | -1.83 | 0.007
MIF4GD | MIF4G domain containing | chr17 | -1.81 | 0.033
MKNK2 | MAP kinase interacting serine/threonine kinase 2 | chr19 | -2.00 | 0.034
MMP15 | matrix metallopeptidase 15 (membrane-inserted) | chr16 | -1.53 | 0.007
MRPL11 | mitochondrial ribosomal protein L11 | chr11 | -1.78 | 0.006
MRPS11 | mitochondrial ribosomal protein S11 | chr15 | -1.77 | 0.002
MTFR1L | mitochondrial fission regulator 1-like | chr1 | -2.12 | 0.000
MYB | v-myb avian myeloblastosis viral oncogene homolog | chr6 | -2.04 | 0.004
MYBL2 | v-myb avian myeloblastosis viral oncogene homolog-like 2 | chr20 | -2.36 | 0.019
MYBPC1 | myosin binding protein C, slow type | chr12 | -1.51 | 0.041
MYH13 | myosin, heavy chain 13, skeletal muscle | chr17 | -1.58 | 0.031
MYO19 | myosin XIX | chr17 | -1.57 | 0.014
MYO1D | myosin ID | chr17 | -1.96 | 0.003
NASP | nuclear autoantigenic sperm protein (histone-binding) | chr1 | -1.60 | 0.036
NDE1 | nudE neurodevelopment protein 1 | chr16 | -1.62 | 0.018
NEK4 | NIMA-related kinase 4 | chr3 | -1.52 | 0.003
NEO1 | neogenin 1 | chr15 | -1.65 | 0.001
NFATC3 | nuclear factor of activated T-cells, cytoplasmic, calcineurin-dependent 3 | chr16 | -1.84 | 0.005
NINJ1 | ninjurin 1 | chr9 | -1.68 | 0.006

Symbol | Gene Description | Chr | Fold Change | p-Value
NRCAM | neuronal cell adhesion molecule | chr7 | -1.61 | 0.003
NSA2 | NSA2 ribosome biogenesis homolog (S. cerevisiae) | chr5 | -1.95 | 0.003
PCLO | piccolo presynaptic cytomatrix protein | chr7 | -1.57 | 0.036
PCNXL2 | pecanex-like 2 (Drosophila) | chr1 | -1.55 | 0.026
PEX13 | peroxisomal biogenesis factor 13 | chr2 | -1.59 | 0.045
PFKFB3 | 6-phosphofructo-2-kinase/fructose-2,6-biphosphatase 3 | chr10 | -2.04 | 0.000
PGC | progastricsin (pepsinogen C) | chr6 | -2.00 | 0.004
PIGP | phosphatidylinositol glycan anchor biosynthesis, class P | chr21 | -1.62 | 0.034
PIGS | phosphatidylinositol glycan anchor biosynthesis, class S | chr17 | -1.51 | 0.016
PLAUR | plasminogen activator, urokinase receptor | chr19 | -1.60 | 0.023
PLOD1 | procollagen-lysine, 2-oxoglutarate 5-dioxygenase 1 | chr1 | -1.51 | 0.011
PLXND1 | plexin D1 | chr3 | -1.89 | 0.004
PNKD | paroxysmal nonkinesigenic dyskinesia | chr2 | -1.61 | 0.002
POLQ | polymerase (DNA directed), theta | chr3 | -1.65 | 0.033
POU5F1 | POU class 5 homeobox 1 | chr6 | -1.67 | 0.009
PPAP2A | phosphatidic acid phosphatase type 2A | chr5 | -1.62 | 0.008
PPDPF | pancreatic progenitor cell differentiation and proliferation factor | chr20 | -1.86 | 0.027
PRAC2 | prostate cancer susceptibility candidate 2 | chr17 | -2.03 | 0.039
PRCP | prolylcarboxypeptidase (angiotensinase C) | chr11 | -2.07 | 0.000
PRKAA2 | protein kinase, AMP-activated, alpha 2 catalytic subunit | chr1 | -1.53 | 0.007
PROCR | protein C receptor, endothelial | chr20 | -1.76 | 0.004
PROSER1 | proline and serine rich 1 | chr13 | -1.55 | 0.018
PSMD7 | proteasome (prosome, macropain) 26S subunit, non-ATPase, 7 | chr16 | -1.63 | 0.001
PSMG2 | proteasome (prosome, macropain) assembly chaperone 2 | chr18 | -1.51 | 0.005
PTDSS1 | phosphatidylserine synthase 1 | chr8 | -1.66 | 0.027
PYGM | phosphorylase, glycogen, muscle | chr11 | -2.48 | 0.017
RAB40B | RAB40B, member RAS oncogene family | chr17 | -1.58 | 0.005
RAD51 | RAD51 recombinase | chr15 | -1.94 | 0.033
RASSF10 | Ras association (RalGDS/AF-6) domain family (N-terminal) member 10 | chr11 | -1.50 | 0.023
RAVER1 | ribonucleoprotein, PTB-binding 1 | chr19 | -1.54 | 0.008
RGS3 | regulator of G-protein signaling 3 | chr9 | -1.53 | 0.017
RIBC2 | RIB43A domain with coiled-coils 2 | chr22 | -1.76 | 0.004
RIPPLY3 | ripply transcriptional repressor 3 | chr21 | -2.06 | 0.015
RNASE4 | ribonuclease, RNase A family, 4 | chr14 | -2.27 | 0.024
RNF10 | ring finger protein 10 | chr12 | -1.52 | 0.019
RNPEPL1 | arginyl aminopeptidase (aminopeptidase B)-like 1 | chr2 | -1.50 | 0.040
RP11-229P13.25 |  | chr9 | -1.52 | 0.044
RP11-333E13.4 |  | chr4 | -3.63 | 0.000
RP11-676M6.1 |  | chr11 | -1.52 | 0.036
RP11-834C11.4 |  | chr12 | -1.58 | 0.010
RRM2 | ribonucleotide reductase M2 | chr2 | -2.41 | 0.006
RSPH9 | radial spoke head 9 homolog (Chlamydomonas) | chr6 | -2.29 | 0.003
SCG3 | secretogranin III | chr15 | -4.16 | 0.000
SGMS2 | sphingomyelin synthase 2 | chr4 | -1.57 | 0.039
SHB | Src homology 2 domain containing adaptor protein B | chr9 | -1.89 | 0.005
SIGMAR1 | sigma non-opioid intracellular receptor 1 | chr9 | -1.61 | 0.022
SLC16A10 | solute carrier family 16 (aromatic amino acid transporter), member 10 | chr6 | -1.68 | 0.034
SLC25A18 | solute carrier family 25 (glutamate carrier), member 18 | chr22 | -2.84 | 0.010
SLC35F1 | solute carrier family 35, member F1 | chr6 | -2.70 | 0.000
SLC37A1 | solute carrier family 37 (glucose-6-phosphate transporter), member 1 | chr21 | -1.80 | 0.027
SLFN13 | schlafen family member 13 | chr17 | -1.67 | 0.010

Symbol | Gene Description | Chr | Fold Change | p-Value
SMARCE1 | SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily e, member 1 | chr17 | -1.57 | 0.012
SNHG14 | small nucleolar RNA host gene 14 (non-protein coding) | chr15 | -1.55 | 0.004
SNORD88A | small nucleolar RNA, C/D box 88A | chr19 | -1.63 | 0.005
SP8 | Sp8 transcription factor | chr7 | -1.98 | 0.001
SPINK1 | serine peptidase inhibitor, Kazal type 1 | chr5 | -2.23 | 0.002
SPRR2F | small proline-rich protein 2F | chr1 | -2.51 | 0.001
SPSB1 | splA/ryanodine receptor domain and SOCS box containing 1 | chr1 | -1.83 | 0.038
STMN1 | stathmin 1 | chr1 | -1.55 | 0.036
SYTL5 | synaptotagmin-like 5 | chrX | -2.47 | 0.009
TCF7L2 | transcription factor 7-like 2 (T-cell specific, HMG-box) | chr10 | -1.54 | 0.025
TEX11 | testis expressed 11 | chrX | -1.50 | 0.025
TEX29 | testis expressed 29 | chr13 | -2.32 | 0.026
TGM4 | transglutaminase 4 | chr3 | -1.59 | 0.015
TICRR | TOPBP1-interacting checkpoint and replication regulator | chr15 | -1.87 | 0.047
TMCO2 | transmembrane and coiled-coil domains 2 | chr1 | -1.68 | 0.013
TMEFF2 | transmembrane protein with EGF-like and two follistatin-like domains 2 | chr2 | -1.88 | 0.002
TMEM123 | transmembrane protein 123 | chr11 | -1.91 | 0.004
TMEM143 | transmembrane protein 143 | chr19 | -1.66 | 0.030
TMEM199 | transmembrane protein 199 | chr17 | -1.72 | 0.004
TMOD4 | tropomodulin 4 (muscle) | chr1 | -1.68 | 0.024
TMX4 | thioredoxin-related transmembrane protein 4 | chr20 | -2.35 | 0.000
TOMM34 | translocase of outer mitochondrial membrane 34 | chr20 | -1.57 | 0.015
TRAK1 | trafficking protein, kinesin binding 1 | chr3 | -1.52 | 0.028
TSPAN14 | tetraspanin 14 | chr10 | -1.51 | 0.011
UBE2D4 | ubiquitin-conjugating enzyme E2D 4 (putative) | chr7 | -1.60 | 0.018
UFC1 | ubiquitin-fold modifier conjugating enzyme 1 | chr1 | -1.64 | 0.001
UGT2B7 | UDP glucuronosyltransferase 2 family, polypeptide B7 | chr4 | -1.68 | 0.027
UGT3A2 | UDP glycosyltransferase 3 family, polypeptide A2 | chr5 | -1.86 | 0.002
VANGL1 | VANGL planar cell polarity protein 1 | chr1 | -1.76 | 0.003
XK | X-linked Kx blood group | chrX | -3.58 | 0.000
ZBED3 | zinc finger, BED-type containing 3 | chr5 | -1.69 | 0.005
ACVR2B | activin A receptor, type IIB | chr3 | 1.60 | 0.041
ALCAM | activated leukocyte cell adhesion molecule | chr3 | 1.79 | 0.004
ANKRD46 | ankyrin repeat domain 46 | chr8 | 1.59 | 0.012
ANKRD52 | ankyrin repeat domain 52 | chr12 | 1.70 | 0.010
ANKRD6 | ankyrin repeat domain 6 | chr6 | 1.75 | 0.002
AP1S3 | adaptor-related protein complex 1, sigma 3 subunit | chr2 | 1.56 | 0.028
ARHGAP20 | Rho GTPase activating protein 20 | chr11 | 1.73 | 0.011
ARHGAP28 | Rho GTPase activating protein 28 | chr18 | 1.86 | 0.010
ATP11B | ATPase, class VI, type 11B | chr3 | 1.58 | 0.050
ATP2B4 | ATPase, Ca++ transporting, plasma membrane 4 | chr1 | 1.60 | 0.004
BCCIP | BRCA2 and CDKN1A interacting protein | chr10 | 1.80 | 0.028
BCL2L13 | BCL2-like 13 (apoptosis facilitator) | chr22 | 1.62 | 0.012
BOD1 | biorientation of chromosomes in cell division 1 | chr5 | 2.01 | 0.017
BOD1L2 | biorientation of chromosomes in cell division 1-like 2 | chr18 | 1.80 | 0.010
C14orf132 | chromosome 14 open reading frame 132 | chr14 | 2.01 | 0.004
C20orf194 | chromosome 20 open reading frame 194 | chr20 | 1.52 | 0.044
C6orf120 | chromosome 6 open reading frame 120 | chr6 | 1.61 | 0.001
CA1 | carbonic anhydrase I | chr8 | 1.76 | 0.004
CA13 | carbonic anhydrase XIII | chr8 | 1.52 | 0.028
CA2 | carbonic anhydrase II | chr8 | 1.61 | 0.012
CALU | calumenin | chr7 | 1.88 | 0.048
CFL2 | cofilin 2 (muscle) | chr14 | 1.51 | 0.022
CNTN4 | contactin 4 | chr3 | 1.51 | 0.013
CRK | v-crk avian sarcoma virus CT10 oncogene homolog | chr17 | 1.58 | 0.014

Symbol | Gene Description | Chr | Fold Change | p-Value
CRTAM | cytotoxic and regulatory T cell molecule | chr11 | 1.84 | 0.005
CSRP2 | cysteine and glycine-rich protein 2 | chr12 | 1.92 | 0.013
CTBP2 | C-terminal binding protein 2 | chr10 | 1.55 | 0.002
CWC15 | CWC15 spliceosome-associated protein | chr11 | 1.73 | 0.002
CYB561D1 | cytochrome b561 family, member D1 | chr1 | 1.62 | 0.010
DACT1 | dishevelled-binding antagonist of beta-catenin 1 | chr14 | 1.66 | 0.006
DACT2 | dishevelled-binding antagonist of beta-catenin 2 | chr6 | 1.72 | 0.013
DDX18 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 18 | chr2 | 1.58 | 0.027
DDX3X | DEAD (Asp-Glu-Ala-Asp) box helicase 3, X-linked | chrX | 1.53 | 0.013
DDX3Y | DEAD (Asp-Glu-Ala-Asp) box helicase 3, Y-linked | chrY | 2.24 | 0.000
DEPTOR | DEP domain containing MTOR-interacting protein | chr8 | 1.52 | 0.025
DGKH | diacylglycerol kinase, eta | chr13 | 1.72 | 0.018
DLEU1 | deleted in lymphocytic leukemia 1 (non-protein coding) | chr13 | 1.63 | 0.050
DOCK10 | dedicator of cytokinesis 10 | chr2 | 1.91 | 0.006
EEA1 | early endosome antigen 1 | chr12 | 1.58 | 0.002
EIF5B | eukaryotic translation initiation factor 5B | chr2 | 1.71 | 0.013
ENTPD5 | ectonucleoside triphosphate diphosphohydrolase 5 | chr14 | 1.53 | 0.041
EPB41L1 | erythrocyte membrane protein band 4.1-like 1 | chr20 | 1.58 | 0.032
EPHA4 | EPH receptor A4 | chr2 | 1.64 | 0.011
EPHA5-AS1 | EPHA5 antisense RNA 1 | chr4 | 1.63 | 0.021
ESYT2 | extended synaptotagmin-like protein 2 | chr7 | 1.62 | 0.006
ETS2 | v-ets avian erythroblastosis virus E26 oncogene homolog 2 | chr21 | 1.77 | 0.018
F3 | coagulation factor III (thromboplastin, tissue factor) | chr1 | 1.60 | 0.024
FAM135B | family with sequence similarity 135, member B | chr8 | 1.53 | 0.043
FAM49B | family with sequence similarity 49, member B | chr8 | 1.60 | 0.039
FOLH1 | folate hydrolase (prostate-specific membrane antigen) 1 | chr11 | 1.70 | 0.018
FZD4 | frizzled class receptor 4 | chr11 | 1.69 | 0.039
G3BP2 | GTPase activating protein (SH3 domain) binding protein 2 | chr4 | 1.54 | 0.050
GCNT2 | glucosaminyl (N-acetyl) transferase 2, I-branching enzyme (I blood group) | chr6 | 1.65 | 0.010
GPCPD1 | glycerophosphocholine phosphodiesterase GDE1 homolog (S. cerevisiae) | chr20 | 1.63 | 0.033
GPR137B | G protein-coupled receptor 137B | chr1 | 1.81 | 0.011
GPR137C | G protein-coupled receptor 137C | chr14 | 1.72 | 0.030
GPR158 | G protein-coupled receptor 158 | chr10 | 1.77 | 0.014
GRIA2 | glutamate receptor, ionotropic, AMPA 2 | chr4 | 2.25 | 0.002
GSDMC | gasdermin C | chr8 | 1.76 | 0.007
HIVEP3 | human immunodeficiency virus type I enhancer binding protein 3 | chr1 | 1.73 | 0.013
IGFBP5 | insulin-like growth factor binding protein 5 | chr2 | 1.61 | 0.011
ISLR2 | immunoglobulin superfamily containing leucine-rich repeat 2 | chr15 | 1.94 | 0.008
ISOC1 | isochorismatase domain containing 1 | chr5 | 1.58 | 0.018
ITPKB | inositol-trisphosphate 3-kinase B | chr1 | 1.53 | 0.002
JAG1 | jagged 1 | chr20 | 1.69 | 0.027
KCNH1 | potassium voltage-gated channel, subfamily H (eag-related), member 1 | chr1 | 1.78 | 0.009
KCNH8 | potassium voltage-gated channel, subfamily H (eag-related), member 8 | chr3 | 1.63 | 0.026
KCNMB2 | potassium large conductance calcium-activated channel, subfamily M, beta member 2 | chr3 | 1.53 | 0.004
KHDRBS3 | KH domain containing, RNA binding, signal transduction associated 3 | chr8 | 1.60 | 0.011
KHNYN | KH and NYN domain containing | chr14 | 1.70 | 0.002
KLC1 | kinesin light chain 1 | chr14 | 1.57 | 0.011
KPNA5 | karyopherin alpha 5 (importin alpha 6) | chr6 | 1.62 | 0.020
LGALS17A | Charcot-Leyden crystal protein pseudogene | chr19 | 1.58 | 0.049
LPHN2 | latrophilin 2 | chr1 | 1.72 | 0.007
LRRC4 | leucine rich repeat containing 4 | chr7 | 1.55 | 0.016
LRRTM4 | leucine rich repeat transmembrane neuronal 4 | chr2 | 1.65 | 0.037

Symbol | Gene Description | Chr | Fold Change | p-Value
LUM | lumican | chr12 | 1.87 | 0.023
MAP2K4 | mitogen-activated protein kinase kinase 4 | chr17 | 1.86 | 0.000
MAPKAP1 | mitogen-activated protein kinase associated protein 1 | chr9 | 1.58 | 0.023
MBTPS1 | membrane-bound transcription factor peptidase, site 1 | chr16 | 1.66 | 0.014
MOSPD2 | motile sperm domain containing 2 | chrX | 1.50 | 0.026
MSI2 | musashi RNA-binding protein 2 | chr17 | 1.53 | 0.018
MSMB | microseminoprotein, beta- | chr10 | 2.49 | 0.003
MSRB3 | methionine sulfoxide reductase B3 | chr12 | 1.52 | 0.019
MYLK | myosin light chain kinase | chr3 | 1.89 | 0.024
NEK3 | NIMA-related kinase 3 | chr13 | 1.61 | 0.004
NIPSNAP3B | nipsnap homolog 3B (C. elegans) | chr9 | 1.52 | 0.046
NKIRAS1 | NFKB inhibitor interacting Ras-like 1 | chr3 | 2.12 | 0.003
NR3C1 | nuclear receptor subfamily 3, group C, member 1 (glucocorticoid receptor) | chr5 | 1.52 | 0.038
NR3C2 | nuclear receptor subfamily 3, group C, member 2 | chr4 | 1.70 | 0.004
NRBF2 | nuclear receptor binding factor 2 | chr10 | 1.64 | 0.008
NTF3 | neurotrophin 3 | chr12 | 1.97 | 0.004
NTN4 | netrin 4 | chr12 | 1.51 | 0.008
NUPL1 | nucleoporin like 1 | chr13 | 2.00 | 0.001
PDCD6 | programmed cell death 6 | chr5 | 1.74 | 0.027
PFDN4 | prefoldin subunit 4 | chr20 | 1.72 | 0.011
PIP4K2A | phosphatidylinositol-5-phosphate 4-kinase, type II, alpha | chr10 | 1.87 | 0.006
POLR3F | polymerase (RNA) III (DNA directed) polypeptide F, 39 kDa | chr20 | 1.53 | 0.025
PPAP2B | phosphatidic acid phosphatase type 2B | chr1 | 1.66 | 0.002
PPP2R1B | protein phosphatase 2, regulatory subunit A, beta | chr11 | 2.10 | 0.001
PPP3CB | protein phosphatase 3, catalytic subunit, beta isozyme | chr10 | 1.72 | 0.012
PRKCE | protein kinase C, epsilon | chr2 | 1.59 | 0.006
PSMG4 | proteasome (prosome, macropain) assembly chaperone 4 | chr6 | 1.58 | 0.006
PTPMT1 | protein tyrosine phosphatase, mitochondrial 1 | chr11 | 3.15 | 0.001
PTPN1 | protein tyrosine phosphatase, non-receptor type 1 | chr20 | 1.82 | 0.005
PTPN18 | protein tyrosine phosphatase, non-receptor type 18 (brain-derived) | chr2 | 1.51 | 0.008
PYGO2 | pygopus family PHD finger 2 | chr1 | 1.61 | 0.035
RAB30 | RAB30, member RAS oncogene family | chr11 | 1.66 | 0.013
RAD54L2 | RAD54-like 2 (S. cerevisiae) | chr3 | 1.58 | 0.003
RALB | v-ral simian leukemia viral oncogene homolog B | chr2 | 1.57 | 0.008
RASSF8 | Ras association (RalGDS/AF-6) domain family (N-terminal) member 8 | chr12 | 1.55 | 0.004
RBFOX2 | RNA binding protein, fox-1 homolog (C. elegans) 2 | chr22 | 1.64 | 0.003
RBPJ | recombination signal binding protein for immunoglobulin kappa J region | chr4 | 1.66 | 0.034
RELN | reelin | chr7 | 1.78 | 0.008
RFK | riboflavin kinase | chr9 | 2.02 | 0.001
RGS9BP | regulator of G protein signaling 9 binding protein | chr19 | 1.90 | 0.038
RND3 | Rho family GTPase 3 | chr2 | 1.50 | 0.017
RNF212 | ring finger protein 212 | chr4 | 1.99 | 0.004
RP11-553L6.5 |  | chr3 | 1.53 | 0.012
RUNX1 | runt-related transcription factor 1 | chr21 | 1.78 | 0.001
SAMD4A | sterile alpha motif domain containing 4A | chr14 | 1.54 | 0.024
SBNO1 | strawberry notch homolog 1 (Drosophila) | chr12 | 1.57 | 0.022
SERGEF | secretion regulating guanine nucleotide exchange factor | chr11 | 1.57 | 0.013
SERPINI1 | serpin peptidase inhibitor, clade I (neuroserpin), member 1 | chr3 | 1.70 | 0.030
SESN3 | sestrin 3 | chr11 | 2.17 | 0.004
SH3PXD2A | SH3 and PX domains 2A | chr10 | 1.57 | 0.036
SLC2A3 | solute carrier family 2 (facilitated glucose transporter), member 3 | chr12 | 1.82 | 0.016
SLC2A5 | solute carrier family 2 (facilitated glucose/fructose transporter), member 5 | chr1 | 1.59 | 0.013

Symbol | Gene Description | Chr | Fold Change | p-Value
SLC30A4 | solute carrier family 30 (zinc transporter), member 4 | chr15 | 1.71 | 0.003
SLC36A1 | solute carrier family 36 (proton/amino acid symporter), member 1 | chr5 | 2.22 | 0.002
SLC7A2 | solute carrier family 7 (cationic amino acid transporter, y+ system), member 2 | chr8 | 1.92 | 0.007
SLK | STE20-like kinase | chr10 | 1.55 | 0.012
SMAP1 | small ArfGAP 1 | chr6 | 1.60 | 0.008
SNAP91 | synaptosomal-associated protein, 91kDa | chr6 | 1.53 | 0.035
SRPR | signal recognition particle receptor (docking protein) | chr11 | 1.63 | 0.008
SSUH2 | ssu-2 homolog (C. elegans) | chr3 | 1.56 | 0.035
ST3GAL6 | ST3 beta-galactoside alpha-2,3-sialyltransferase 6 | chr3 | 1.60 | 0.039
STX12 | syntaxin 12 | chr1 | 1.52 | 0.024
SUSD1 | sushi domain containing 1 | chr9 | 1.93 | 0.003
TAOK3 | TAO kinase 3 | chr12 | 1.72 | 0.001
TAPT1 | transmembrane anterior posterior transformation 1 | chr4 | 2.00 | 0.001
TESK2 | testis-specific kinase 2 | chr1 | 1.84 | 0.000
TGFB3 | transforming growth factor, beta 3 | chr14 | 1.53 | 0.026
THOC5 | THO complex 5 | chr22 | 1.54 | 0.018
THSD7A | thrombospondin, type I, domain containing 7A | chr7 | 2.65 | 0.002
TK2 | thymidine kinase 2, mitochondrial | chr16 | 1.93 | 0.009
TLN2 | talin 2 | chr15 | 1.63 | 0.006
TMEM178A | transmembrane protein 178A | chr2 | 1.61 | 0.044
TMEM241 | transmembrane protein 241 | chr18 | 1.58 | 0.008
TMEM26 | transmembrane protein 26 | chr10 | 1.56 | 0.007
TMSB4Y | thymosin beta 4, Y-linked | chrY | 1.50 | 0.010
TMTC3 | transmembrane and tetratricopeptide repeat containing 3 | chr12 | 1.62 | 0.032
TRAPPC2 | trafficking protein particle complex 2 | chrX | 1.50 | 0.041
TRIB2 | tribbles pseudokinase 2 | chr2 | 1.90 | 0.001
TRPC7 | transient receptor potential cation channel, subfamily C, member 7 | chr5 | 1.57 | 0.007
UBE2D3 | ubiquitin-conjugating enzyme E2D 3 | chr4 | 1.53 | 0.007
UGCG | UDP-glucose ceramide glucosyltransferase | chr9 | 1.80 | 0.002
VGLL4 | vestigial-like family member 4 | chr3 | 1.70 | 0.004
WASF3 | WAS protein family, member 3 | chr13 | 1.93 | 0.005
WNT5A | wingless-type MMTV integration site family, member 5A | chr3 | 2.20 | 0.001
YOD1 | YOD1 deubiquitinase | chr1 | 1.52 | 0.028
ZBTB20 | zinc finger and BTB domain containing 20 | chr3 | 1.56 | 0.038
ZCCHC3 | zinc finger, CCHC domain containing 3 | chr20 | 1.66 | 0.006
ZHX1 | zinc fingers and homeoboxes 1 | chr8 | 1.68 | 0.015
ZNF467 | zinc finger protein 467 | chr7 | 1.58 | 0.038

Appendix F

Genes differentially regulated by IRX4 knockdown in both LNCaP and VCaP cells

Symbol | LNCaP_siIRX4vssiNT | VCaP_siIRX4vssiNT
ABCC4 | -1 | -1
ACOT7 | -1 | -1
ACSL5 | -1 | -1
ANG | -1 | -1
APLP2 | -1 | -1
ARHGAP33 | -1 | -1
ARSB | -1 | -1
ASXL1 | -1 | -1
ATP6V0E1 | -1 | -1
ATP6V0E1P2 | -1 | -1
AUNIP | -1 | -1
CCDC109B | -1 | -1
CCDC92 | -1 | -1
CCNE2 | -1 | -1
CD302 | -1 | -1
CDK14 | -1 | -1
CDS2 | -1 | -1
CNOT6 | -1 | -1
CUX1 | -1 | -1
CYR61 | -1 | -1
DNAJC16 | -1 | -1
DPP4 | -1 | -1
DUSP6 | -1 | -1
EBAG9 | -1 | -1
EHF | -1 | -1
ERCC1 | -1 | -1
ERVMER34-1 | -1 | -1
FADS1 | -1 | -1
FAM104B | -1 | -1
FAM222B | -1 | -1
FAM83E | -1 | -1
FANCI | -1 | -1
FERMT1 | -1 | -1
GAS8 | -1 | -1
GNAS | -1 | -1
GNPDA1 | -1 | -1
HGSNAT | -1 | -1
HIST1H3B | -1 | -1
HYKK | -1 | -1
IRX4 | -1 | -1
ITGA1 | -1 | -1
ITGB3BP | -1 | -1
KIF16B | -1 | -1
KIF1C | -1 | -1
KIF24 | -1 | -1
KNTC1 | -1 | -1
LCLAT1 | -1 | -1
LMAN1 | -1 | -1
LONP2 | -1 | -1
LOXL3 | -1 | -1
LRR1 | -1 | -1

Symbol | LNCaP_siIRX4vssiNT | VCaP_siIRX4vssiNT
MARCKS | -1 | -1
MASTL | -1 | -1
MFSD10 | -1 | -1
MGP | -1 | -1
MIF4GD | -1 | -1
MKNK2 | -1 | -1
MRPL11 | -1 | -1
MRPS11 | -1 | -1
MTFR1L | -1 | -1
MYB | -1 | -1
MYBL2 | -1 | -1
MYO19 | -1 | -1
MYO1D | -1 | -1
NASP | -1 | -1
NDE1 | -1 | -1
NEK4 | -1 | -1
NEO1 | -1 | -1
NFATC3 | -1 | -1
NINJ1 | -1 | -1
NSA2 | -1 | -1
PCNXL2 | -1 | -1
PFKFB3 | -1 | -1
PIGP | -1 | -1
PIGS | -1 | -1
PLOD1 | -1 | -1
PNKD | -1 | -1
PPAP2A | -1 | -1
PPDPF | -1 | -1
PRAC2 | -1 | -1
PRCP | -1 | -1
PRKAA2 | -1 | -1
PROCR | -1 | -1
PROSER1 | -1 | -1
PSMD7 | -1 | -1
PSMG2 | -1 | -1
PTDSS1 | -1 | -1
RAB40B | -1 | -1
RAD51 | -1 | -1
RAVER1 | -1 | -1
RNASE4 | -1 | -1
RNF10 | -1 | -1
RNPEPL1 | -1 | -1
RP11-229P13.25 | -1 | -1
RP11-333E13.4 | -1 | -1
RP11-676M6.1 | -1 | -1
RRM2 | -1 | -1
SGMS2 | -1 | -1
SHB | -1 | -1
SIGMAR1 | -1 | -1
SLC16A10 | -1 | -1
SLC25A18 | -1 | -1
SLC37A1 | -1 | -1
SMARCE1 | -1 | -1
SPINK1 | -1 | -1
SPRR2F | -1 | -1
TICRR | -1 | -1

Symbol | LNCaP_siIRX4vssiNT | VCaP_siIRX4vssiNT
TMEFF2 | -1 | -1
TMEM123 | -1 | -1
TMEM143 | -1 | -1
TMEM199 | -1 | -1
TMX4 | -1 | -1
TOMM34 | -1 | -1
TRAK1 | -1 | -1
TSPAN14 | -1 | -1
UBE2D4 | -1 | -1
UFC1 | -1 | -1
VANGL1 | -1 | -1
ZBED3 | -1 | -1
ALCAM | 1 | 1
ANKRD46 | 1 | 1
ANKRD6 | 1 | 1
ARHGAP28 | 1 | 1
ATP11B | 1 | 1
ATP2B4 | 1 | 1
BCCIP | 1 | 1
BCL2L13 | 1 | 1
BOD1 | 1 | 1
C14orf132 | 1 | 1
C6orf120 | 1 | 1
CALU | 1 | 1
CRK | 1 | 1
CWC15 | 1 | 1
CYB561D1 | 1 | 1
DDX18 | 1 | 1
DDX3X | 1 | 1
DDX3Y | 1 | 1
DGKH | 1 | 1
DLEU1 | 1 | 1
EEA1 | 1 | 1
EIF5B | 1 | 1
EPB41L1 | 1 | 1
ESYT2 | 1 | 1
FAM49B | 1 | 1
G3BP2 | 1 | 1
GPR137B | 1 | 1
GPR137C | 1 | 1
GPR158 | 1 | 1
GSDMC | 1 | 1
JAG1 | 1 | 1
KHNYN | 1 | 1
KLC1 | 1 | 1
KPNA5 | 1 | 1
MAP2K4 | 1 | 1
MAPKAP1 | 1 | 1
MBTPS1 | 1 | 1
MOSPD2 | 1 | 1
NKIRAS1 | 1 | 1
NR3C2 | 1 | 1
NRBF2 | 1 | 1
NUPL1 | 1 | 1
PDCD6 | 1 | 1
PFDN4 | 1 | 1


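| Symbol | LNCaP_siIRX4vssiNT | VCaP_siIRX4vssiNT |
|---|---|---|
| PIP4K2A | 1 | 1 |
| POLR3F | 1 | 1 |
| PPAP2B | 1 | 1 |
| PPP2R1B | 1 | 1 |
| PPP3CB | 1 | 1 |
| PRKCE | 1 | 1 |
| PSMG4 | 1 | 1 |
| PTPMT1 | 1 | 1 |
| PTPN1 | 1 | 1 |
| PYGO2 | 1 | 1 |
| RAD54L2 | 1 | 1 |
| RBFOX2 | 1 | 1 |
| RBPJ | 1 | 1 |
| RFK | 1 | 1 |
| RGS9BP | 1 | 1 |
| SBNO1 | 1 | 1 |
| SLC36A1 | 1 | 1 |
| SLK | 1 | 1 |
| SMAP1 | 1 | 1 |
| SRPR | 1 | 1 |
| TAPT1 | 1 | 1 |
| TESK2 | 1 | 1 |
| TK2 | 1 | 1 |
| TMEM241 | 1 | 1 |
| TMTC3 | 1 | 1 |
| UBE2D3 | 1 | 1 |
| UGCG | 1 | 1 |
| VGLL4 | 1 | 1 |
| WASF3 | 1 | 1 |
| WNT5A | 1 | 1 |
| YOD1 | 1 | 1 |
| ZCCHC3 | 1 | 1 |
| MSMB | -1 | 1 |
| MSRB3 | -1 | 1 |
| ZHX1 | -1 | 1 |
| ABCB5 | 1 | -1 |
| FREM1 | 1 | -1 |
| PEX13 | 1 | -1 |
| SNHG14 | 1 | -1 |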


Appendix G

The most likely activated or inhibited upstream regulators of IRX4-regulated genes in LNCaP cells

| Upstream Regulator | Exp Fold Change | Molecule Type | Predicted Activation State | Activation z-score | p-value of overlap |
|---|---|---|---|---|---|
| NUPR1 | 1.57 | transcription regulator | Activated | 4.849 | 1.80E-16 |
| CDKN2A | -1.62 | transcription regulator | Activated | 4.22 | 6.90E-14 |
| fulvestrant | | chemical drug | Activated | 4.047 | 1.33E-08 |
| let-7 | 1.24 | microrna | Activated | 3.85 | 3.47E-18 |
| calcitriol | | chemical drug | Activated | 3.841 | 1.73E-13 |
| TP53 | -1.17 | transcription regulator | Activated | 3.694 | 4.44E-25 |
| ATP7B | 1.47 | transporter | Activated | 3.606 | 6.30E-06 |
| E2F6 | 1.15 | transcription regulator | Activated | 3.317 | 4.89E-09 |
| CDKN1A | 1.21 | kinase | Activated | 3.196 | 6.34E-24 |
| RBL1 | -1.43 | transcription regulator | Activated | 3.053 | 1.00E-07 |
| RBL2 | -1.39 | other | Activated | 3.005 | 5.11E-07 |
| BNIP3L | -1.33 | other | Activated | 2.998 | 2.46E-08 |
| miR-291a-3p (and other miRNAs w/seed AAGUGCU) | | mature microrna | Activated | 2.967 | 5.99E-05 |
| SMARCB1 | 1.12 | transcription regulator | Activated | 2.922 | 5.80E-05 |
| RB1 | -1.18 | transcription regulator | Activated | 2.821 | 7.51E-13 |
| Rb | | group | Activated | 2.81 | 1.48E-11 |
| SCAP | 1.35 | other | Activated | 2.749 | 5.45E-05 |
| Irgm1 | | other | Activated | 2.72 | 4.58E-04 |
| BMS-690514 | | chemical drug | Activated | 2.714 | 3.26E-05 |
| rhodamine 6G | | chemical toxicant | Activated | 2.626 | 1.61E-07 |


| Upstream Regulator | Exp Fold Change | Molecule Type | Predicted Activation State | Activation z-score | p-value of overlap |
|---|---|---|---|---|---|
| CSF2 | 1.04 | cytokine | Inhibited | -5.973 | 5.21E-09 |
| Vegf | | group | Inhibited | -5.558 | 4.59E-10 |
| | -1.55 | transcription regulator | Inhibited | -5.466 | 1.01E-13 |
| RABL6 | 1.35 | other | Inhibited | -5.396 | 1.65E-19 |
| MITF | -1.38 | transcription regulator | Inhibited | -5.118 | 8.18E-15 |
| HGF | -1.17 | growth factor | Inhibited | -5.021 | 4.18E-09 |
| PTGER2 | -1.09 | g-protein coupled receptor | Inhibited | -4.987 | 2.96E-10 |
| TBX2 | -1.03 | transcription regulator | Inhibited | -4.559 | 1.41E-12 |
| dihydrotestosterone | | chemical - endogenous mammalian | Inhibited | -4.094 | 3.12E-07 |
| E2f | | group | Inhibited | -4.026 | 1.14E-12 |
| EP400 | -1.14 | other | Inhibited | -3.85 | 3.42E-09 |
| FOXM1 | -1.61 | transcription regulator | Inhibited | -3.655 | 1.39E-07 |
| RAF1 | 1.13 | kinase | Inhibited | -3.463 | 2.12E-06 |
| | 1.08 | transcription regulator | Inhibited | -3.362 | 1.49E-08 |
| CCND1 | -1.76 | transcription regulator | Inhibited | -3.23 | 8.07E-32 |
| AR | -1.24 | ligand-dependent nuclear receptor | Inhibited | -3.174 | 6.18E-09 |
| ESR1 | -1.22 | ligand-dependent nuclear receptor | Inhibited | -3.161 | 2.38E-14 |
| | -1.65 | transcription regulator | Inhibited | -3.053 | 6.84E-12 |
| hydroxyurea | | chemical drug | Inhibited | -2.902 | 2.44E-06 |
| LMNB1 | -1.68 | other | Inhibited | -2.8 | 3.42E-05 |


Appendix H

The most likely activated or inhibited upstream regulators of IRX4-regulated genes in VCaP cells

| Upstream Regulator | Expr Fold Change | Molecule Type | Predicted Activation State | Activation z-score | p-value of overlap |
|---|---|---|---|---|---|
| doxorubicin | | chemical drug | Activated | 2.411 | 6E-04 |
| CD3 | | complex | Activated | 2.111 | 0.031 |
| miR-155-5p (miRNAs w/seed UAAUGCU) | | mature microrna | | 1.769 | 5E-04 |
| SMARCA4 | 1.18 | transcription regulator | | 1.471 | 7E-07 |
| NUPR1 | 1.19 | transcription regulator | | 1.414 | 7E-04 |
| NCOA4 | 1.08 | transcription regulator | | 1.387 | 8E-05 |
| OSM | 1.1 | cytokine | | 1.26 | 4E-04 |
| TP53 | 1.19 | transcription regulator | | 1.243 | 6E-05 |
| STK11 | 1.1 | kinase | | 1 | 3E-05 |
| SYVN1 | 1.16 | transporter | | 0.707 | 1E-03 |
| ERG | 1.45 | transcription regulator | | 0.632 | 7E-04 |
| PLG | 1.1 | peptidase | | 0.61 | 8E-04 |
| ERBB2 | 1.17 | kinase | | 0.478 | 1E-04 |
| IL10RA | 1.13 | transmembrane receptor | | 0.336 | 7E-05 |
| CSF2 | -1.03 | cytokine | Inhibited | -2.625 | 0.038 |
| NFKBIA | 1.05 | transcription regulator | Inhibited | -2.331 | 0.031 |
| HRAS | -1.18 | enzyme | Inhibited | -2.207 | 0.002 |
| EHF | -1.99 | transcription regulator | Inhibited | -2.236 | 0.002 |


| Upstream Regulator | Expr Fold Change | Molecule Type | Predicted Activation State | Activation z-score | p-value of overlap |
|---|---|---|---|---|---|
| dalfampridine | | chemical drug | Inhibited | -2 | 0.001 |
| E2f | | group | | -1.981 | 6E-04 |
| HNF1A | -1.5 | transcription regulator | | -1.937 | 4E-04 |
| beta-estradiol | | chemical - endogenous mammalian | | -1.823 | 3E-04 |
| cisplatin | | chemical drug | | -1.242 | 6E-04 |
| triamcinolone acetonide | | chemical drug | | -1.108 | 0.001 |
| EGF | -1.15 | growth factor | | -1.102 | 0.001 |
| Vegf | | group | | -1.009 | 6E-05 |
| FN1 | -1.24 | enzyme | | -1 | 3E-04 |
| FOS | -1.45 | transcription regulator | | -0.896 | 4E-04 |
| progesterone | | chemical - endogenous mammalian | | -0.771 | 8E-08 |
| PRKAA1 | 1.11 | kinase | | -0.707 | 7E-05 |
| PLAU | -1.17 | peptidase | | -0.7 | 4E-04 |
| F2 | -1.09 | peptidase | | -0.601 | 7E-06 |
| PRKAA2 | -1.53 | kinase | | -0.378 | 0.001 |
| F7 | -1.14 | peptidase | | -0.338 | 1E-04 |
| AR | -1.3 | ligand-dependent nuclear receptor | | -0.317 | 7E-04 |
| HNF1B | -1.13 | transcription regulator | | -0.243 | 7E-05 |
| HGF | -1.57 | growth factor | | -0.201 | 5E-05 |


Appendix I


Schematic of genes regulated by IRX4 and CSF2, annotated by IPA. The genes regulated by IRX4 are also targets of Colony Stimulating Factor 2 (CSF2) in (a) LNCaP cells and (b) VCaP cells. Orange symbols: up-regulated genes. Blue symbols: down-regulated genes. CSF2 is shown in blue to indicate its predicted inhibition by IRX4 knockdown. Upregulation of a gene by both IRX4 knockdown and CSF2 activity is indicated by orange arrowed lines, while downregulation is indicated by blue arrowed lines. Inconsistent relationships between IRX4 knockdown and CSF2 activity are represented by yellow lines. Grey dashed lines indicate that the effect of CSF2 on the expression of the gene is not annotated in IPA.


Appendix J


Schematic of genes regulated by IRX4 and NUPR1, annotated by IPA. The genes regulated by IRX4 are also targets of Nuclear Protein 1 (NUPR1) in (a) LNCaP cells and (b) VCaP cells. Orange symbols: up-regulated genes. Blue symbols: down-regulated genes. NUPR1 is shown in orange to indicate its predicted activation by IRX4 knockdown. Upregulation of a gene by both IRX4 knockdown and NUPR1 activity is indicated by orange arrowed lines, while downregulation is indicated by blue arrowed lines. Inconsistent relationships between IRX4 knockdown and NUPR1 activity are represented by yellow lines. Grey dashed lines indicate that the effect of NUPR1 on the expression of the gene is not annotated in IPA.


Appendix K


Schematic of genes regulated by IRX4 and doxorubicin, annotated by IPA. The genes regulated by IRX4 are also targets of the chemotherapeutic drug doxorubicin in (a) LNCaP cells and (b) VCaP cells. Orange symbols: up-regulated genes. Blue symbols: down-regulated genes. Doxorubicin is shown in orange to indicate its predicted activation by IRX4 knockdown. Upregulation of a gene by both IRX4 knockdown and doxorubicin activity is indicated by orange arrowed lines, while downregulation is indicated by blue arrowed lines. Inconsistent relationships between IRX4 knockdown and doxorubicin activity are represented by yellow lines. Grey dashed lines indicate that the effect of doxorubicin on the expression of the gene is not annotated in IPA.


Appendix L

Proteins identified to be interacting with IRX4

| Accession number | Entry name |
|---|---|
| Q07157 | Tight junction protein ZO-1 |
| Q92734 | Protein TFG |
| Q8NDX5 | Polyhomeotic-like protein 3 |
| Q06587 | E3 ubiquitin-protein ligase RING1 |
| Q9BZF9 | Uveal autoantigen with coiled-coil domains and ankyrin repeats |
| Q9Y3Y2 | Chromatin target of PRMT1 protein |
| Q5T5P2 | Sickle tail protein homolog |
| Q9UDY2 | Tight junction protein ZO-2 |
| Q99700 | Ataxin-2 |
| Q9H2D6 | TRIO and F-actin-binding protein |
| Q9BPW8 | Protein NipSnap homolog 1 |
| O00257 | E3 SUMO-protein ligase CBX4 |
| P78364 | Polyhomeotic-like protein 1 |
| Q13501 | Sequestosome-1 |
| Q9HC52 | Chromobox protein homolog 8 |
| P35579 | Myosin-9 |
| P35226 | Polycomb complex protein BMI-1 |
| Q96HS1 | Serine/threonine-protein phosphatase PGAM5, mitochondrial |
| O94916 | Nuclear factor of activated T-cells 5 |
| O95817 | BAG family molecular chaperone regulator 3 |
| P62847 | 40S ribosomal protein S24 |
| Q8N9N5 | Protein BANP |
| Q16891 | MICOS complex subunit MIC60 |
| Q9NPI6 | mRNA-decapping enzyme 1A |
| Q9Y676 | 28S ribosomal protein S18b, mitochondrial |
| Q9NX63 | MICOS complex subunit MIC19 |
| P61326 | Protein mago nashi homolog |
| P38117 | Electron transfer flavoprotein subunit beta |
| Q5JSZ5 | Protein PRRC2B |
| Q9HCD5 | Nuclear receptor coactivator 5 |
| P13804 | Electron transfer flavoprotein subunit alpha, mitochondrial |
| P69905 | Hemoglobin subunit alpha |
| P35227 | Polycomb group RING finger protein 2 |
| P55317 | Hepatocyte nuclear factor 3-alpha |
| Q9UHB6 | LIM domain and actin-binding protein 1 |
| P0CG12 | Chromosome transmission fidelity protein 8 homolog isoform 2 |
| P11387 | DNA topoisomerase 1 |
| P14373 | Zinc finger protein RFP |
| P15880 | 40S ribosomal protein S2 |
| P62753 | 40S ribosomal protein S6 |
| O75934 | Pre-mRNA-splicing factor SPF27 |
| Q16643 | Drebrin |
| P02751 | Fibronectin |
| P0C0L4 | Complement C4-A |
| Q49A26 | Putative oxidoreductase GLYR1 |
| Q15293 | Reticulocalbin-1 |


| Accession number | Entry name |
|---|---|
| Q9Y3B7 | 39S ribosomal protein L11, mitochondrial |
| Q9BUT9 | Protein FAM195A |
| Q86WA8 | Lon protease homolog 2, peroxisomal |
| Q96F86 | Enhancer of mRNA-decapping protein 3 |
| O95931 | Chromobox protein homolog 7 |
| Q9H5H4 | Zinc finger protein 768 |
| P78413 | Iroquois-class homeodomain protein IRX-4 |
| P40939 | Trifunctional enzyme subunit alpha, mitochondrial |
| P36542 | ATP synthase subunit gamma, mitochondrial |
| Q99623 | Prohibitin-2 |
| O96019 | Actin-like protein 6A |
| Q92922 | SWI/SNF complex subunit SMARCC1 |
| Q9P2M7 | Cingulin |
| Q9NYL9 | Tropomodulin-3 |
| P11182 | Lipoamide acyltransferase component of branched-chain alpha-keto acid dehydrogenase complex, mitochondrial |
| Q16698 | 2,4-dienoyl-CoA reductase, mitochondrial |
| O75390 | Citrate synthase, mitochondrial |
| P34897 | Serine hydroxymethyltransferase, mitochondrial |
| O43143 | Putative pre-mRNA-splicing factor ATP-dependent RNA helicase DHX15 |
| Q6UN15 | Pre-mRNA 3'-end-processing factor FIP1 |
| Q9UM54 | Unconventional myosin-VI |
| Q15366 | Poly(rC)-binding protein 2 |
| O95503 | Chromobox protein homolog 6 |
| Q99714 | 3-hydroxyacyl-CoA dehydrogenase type-2 |
| Q9BSD7 | Cancer-related nucleoside-triphosphatase |
| P51659 | Peroxisomal multifunctional enzyme type 2 |
| Q6ZRV2 | Protein FAM83H |
| Q16822 | Phosphoenolpyruvate carboxykinase [GTP], mitochondrial |
| Q99459 | Cell division cycle 5-like protein |
| P32322 | Pyrroline-5-carboxylate reductase 1, mitochondrial |
| P48735 | Isocitrate dehydrogenase [NADP], mitochondrial |
| Q9BZL4 | Protein phosphatase 1 regulatory subunit 12C |
| Q9UPQ0 | LIM and calponin homology domains-containing protein 1 |
| P10515 | Dihydrolipoyllysine-residue acetyltransferase component of pyruvate dehydrogenase complex, mitochondrial |
| O75955 | Flotillin-1 |
| Q9BUJ2 | Heterogeneous nuclear ribonucleoprotein U-like protein 1 |
| Q08211 | ATP-dependent RNA helicase A |
| P02042 | Hemoglobin subunit delta |
| Q96GD3 | Polycomb protein SCMH1 |
| Q15773 | Myeloid leukemia factor 2 |


Appendix M

Heatmap showing the correlation between IRX4 and FOXA1 expression in prostate cancer samples. An inverse relationship was observed between IRX4 and FOXA1 mRNA expression levels in prostate cancer patient samples (Source: cBioPortal, TCGA prostate cancer dataset).


Appendix N

Exon expression for IRX4 derived from GTEx. Expression of a putative exon (indicated by arrow) is predicted in skin, esophagus, vagina, salivary gland, prostate and heart left ventricle (Data Source: GTEx Analysis Release V7, dbGaP Accession phs000424.v7.p2).


Appendix O

Conservation information for IRX4lncRNA (NONHSAG039687) retrieved from the NONCODE database. The sequence of IRX4lncRNA is conserved in the mouse and rat genomes.


Appendix P

Pathways associated with IRX4lncRNA as predicted by the FuncPred online tool

| Keyword | Category | Description | FDR |
|---|---|---|---|
| HALLMARKMYOGENESIS | GSEA_H | HALLMARK MYOGENESIS | 4.00E-07 |
| HALOXIPHO | GSEA_H | HALLMARK OXIDATIVE PHOSPHORYLATION | 4.00E-07 |
| GO:0031672 | GO_CC | A band | 1.40E-06 |
| GO:0031966 | GO_CC | mitochondrial membrane | 1.40E-06 |
| GO:0005865 | GO_CC | striated muscle thin filament | 1.40E-06 |
| GO:0045259 | GO_CC | proton-transporting ATP synthase complex | 1.40E-06 |
| GO:0019866 | GO_CC | organelle inner membrane | 1.40E-06 |
| GO:0030017 | GO_CC | sarcomere | 1.40E-06 |
| GO:0030016 | GO_CC | myofibril | 1.40E-06 |
| GO:0005859 | GO_CC | muscle myosin complex | 1.40E-06 |
| GO:0036379 | GO_CC | myofilament | 1.40E-06 |
| GO:0070469 | GO_CC | respiratory chain | 1.40E-06 |
| GO:0005739 | GO_CC | mitochondrion | 1.40E-06 |
| GO:0005743 | GO_CC | mitochondrial inner membrane | 1.40E-06 |
| GO:0005747 | GO_CC | mitochondrial respiratory chain complex I | 1.40E-06 |
| GO:0005753 | GO_CC | mitochondrial proton-transporting ATP synthase complex | 1.40E-06 |
| GO:0044455 | GO_CC | mitochondrial membrane part | 1.40E-06 |
| GO:0044429 | GO_CC | mitochondrial part | 1.40E-06 |
| GO:0005759 | GO_CC | mitochondrial matrix | 1.40E-06 |
| GO:0045271 | GO_CC | respiratory chain complex I | 1.40E-06 |
| KIVPTU | GSEA_C2 | KUNINGER IGF1 VS PDGFB TARGETS UP | 1.50E-06 |
| KEGHUNDIS | GSEA_C2 | KEGG HUNTINGTONS DISEASE | 1.50E-06 |
| STESTAUP | GSEA_C2 | STEIN ESRRA TARGETS UP | 1.50E-06 |
| KEGPARDIS | GSEA_C2 | KEGG PARKINSONS DISEASE | 1.50E-06 |
| RHANCF | GSEA_C2 | RICKMAN HEAD AND NECK CANCER F | 1.50E-06 |
| WOMIGEMO | GSEA_C2 | WONG MITOCHONDRIA GENE MODULE | 1.50E-06 |
| MHM62 | GSEA_C2 | MOOTHA HUMAN MITODB 6 2002 | 1.50E-06 |


| Keyword | Category | Description | FDR |
|---|---|---|---|
| RFOABCC | GSEA_C2 | REACTOME FORMATION OF ATP BY CHEMIOSMOTIC COUPLING | 1.50E-06 |
| GO:0016460 | GO_CC | myosin II complex | 2.50E-06 |
| BIOKREPAT | GSEA_C2 | BIOCARTA KREB PATHWAY | 3.10E-06 |
| GO:0005861 | GO_CC | troponin complex | 4.40E-06 |
| HALLMARKADIPOGENESIS | GSEA_H | HALLMARK ADIPOGENESIS | 5.30E-06 |
| RPMACATC | GSEA_C2 | REACTOME PYRUVATE METABOLISM AND CITRIC ACID TCA CYCLE | 5.40E-06 |
| GO:0016651 | GO_MF | oxidoreductase activity, acting on NAD(P)H | 6.50E-06 |
| GO:0008137 | GO_MF | NADH dehydrogenase (ubiquinone) activity | 6.50E-06 |
| GO:0003954 | GO_MF | NADH dehydrogenase activity | 6.50E-06 |
| GO:0050136 | GO_MF | NADH dehydrogenase (quinone) activity | 6.50E-06 |
| GO:0016469 | GO_CC | proton-transporting two-sector ATPase complex | 6.60E-06 |
| STEESRTAR | GSEA_C2 | STEIN ESRRA TARGETS | 6.80E-06 |
| GO:0016655 | GO_MF | oxidoreductase activity, acting on NAD(P)H, quinone or similar compound as acceptor | 7.10E-06 |
| GO:0031967 | GO_CC | organelle envelope | 8.60E-06 |
| GO:0031975 | GO_CC | envelope | 8.60E-06 |
| KCCTC | GSEA_C2 | KEGG CITRATE CYCLE TCA CYCLE | 1.10E-05 |
| DOID:0050700 | DO | Cardiomyopathy | 1.30E-05 |
| DOID:0060036 | DO | NA | 1.30E-05 |
| DOID:114 | DO | Heart diseases | 1.30E-05 |
| DOID:11984 | DO | Hypertrophic cardiomyopathy | 1.30E-05 |
| DOID:12930 | DO | Dilated cardiomyopathy | 1.30E-05 |
| GO:0008307 | GO_MF | structural constituent of muscle | 1.40E-05 |
| REAGLUMET | GSEA_C2 | REACTOME GLUCOSE METABOLISM | 1.80E-05 |
| RMFABO | GSEA_C2 | REACTOME MITOCHONDRIAL FATTY ACID BETA OXIDATION | 2.00E-05 |
| GO:0031674 | GO_CC | I band | 2.20E-05 |
| DOID:3652 | DO | Leigh disease | 2.90E-05 |
| DOID:700 | DO | Mitochondrial diseases | 5.40E-05 |
| GO:0031432 | GO_MF | titin binding | 6.10E-05 |
| DOID:422 | DO | Myopathies, Structural, Congenital | 7.20E-05 |
| DTOMAS | GSEA_C2 | DELASERNA TARGETS OF MYOD AND SMARCA4 | 7.80E-05 |
| REMIPRIM | GSEA_C2 | REACTOME MITOCHONDRIAL PROTEIN IMPORT | 0.00014 |


| Keyword | Category | Description | FDR |
|---|---|---|---|
| REALRHUP | GSEA_C2 | REN ALVEOLAR RHABDOMYOSARCOMA UP | 0.00015 |
| HAFAACME | GSEA_H | HALLMARK FATTY ACID METABOLISM | 0.00033 |
| RROPDPC | GSEA_C2 | REACTOME REGULATION OF PYRUVATE DEHYDROGENASE PDH COMPLEX | 0.00035 |
| BRUSEPATR | GSEA_C2 | BRUNEAU SEPTATION ATRIAL | 0.00056 |
| DOID:3649 | DO | Pyruvate decarboxylase deficiency | 0.00061 |
| GO:0044325 | GO_MF | ion channel binding | 0.00062 |
| GO:0030018 | GO_CC | Z disc | 0.00074 |
| GO:0014701 | GO_CC | junctional sarcoplasmic reticulum membrane | 0.00083 |
| REAMUSCON | GSEA_C2 | REACTOME MUSCLE CONTRACTION | 0.00094 |
| GO:0031013 | GO_MF | troponin I binding | 0.001 |
| DOID:0080000 | DO | NA | 0.001 |
| DOID:66 | DO | Muscle tissue disease | 0.0012 |
| HSCPD | GSEA_C2 | HUMMERICH SKIN CANCER PROGRESSION DN | 0.0013 |
| GO:0000314 | GO_CC | organellar small ribosomal subunit | 0.0016 |
| GO:0005763 | GO_CC | mitochondrial small ribosomal subunit | 0.0016 |
| REAPYRMET | GSEA_C2 | REACTOME PYRUVATE METABOLISM | 0.0018 |
| GO:0016491 | GO_MF | oxidoreductase activity | 0.0019 |
| DOID:423 | DO | Myopathy | 0.002 |
| LEBPD | GSEA_C2 | LANDIS ERBB2 BREAST PRENEOPLASTIC DN | 0.0021 |
| GO:0016459 | GO_CC | myosin complex | 0.0022 |
| KCRMD | GSEA_C2 | KAYO CALORIE RESTRICTION MUSCLE DN | 0.0033 |
| MOOFFAOXY | GSEA_C2 | MOOTHA FFA OXYDATION | 0.0034 |
| ETOPFFU | GSEA_C2 | EBAUER TARGETS OF PAX3 FOXO1 FUSION UP | 0.0035 |
| KRTRU | GSEA_C2 | KEEN RESPONSE TO ROSIGLITAZONE UP | 0.0039 |
| DOID:705 | DO | Leber hereditary optic neuropathy | 0.0048 |
| DOID:1891 | DO | disorder of the optic nerve | 0.0048 |
| DOID:3191 | DO | Myopathies, Nemaline | 0.0049 |
| GO:0014874 | GO_BP | response to stimulus involved in regulation of muscle adaptation | 0.0051 |
| GO:0032781 | GO_BP | positive regulation of ATPase activity | 0.0051 |
| GO:1901659 | GO_BP | glycosyl compound biosynthetic process | 0.0054 |
| GO:0031000 | GO_BP | response to caffeine | 0.0058 |


| Keyword | Category | Description | FDR |
|---|---|---|---|
| GO:0031430 | GO_CC | M band | 0.0059 |
| GO:0010890 | GO_BP | positive regulation of sequestering of triglyceride | 0.0062 |
| GO:0042383 | GO_CC | sarcolemma | 0.0064 |
| GO:0005761 | GO_CC | mitochondrial ribosome | 0.0067 |
| GO:0000313 | GO_CC | organellar ribosome | 0.0067 |
| GO:0009156 | GO_BP | ribonucleoside monophosphate biosynthetic process | 0.0072 |
| GO:0033017 | GO_CC | sarcoplasmic reticulum membrane | 0.0073 |
| GO:0070296 | GO_BP | sarcoplasmic reticulum calcium ion transport | 0.0073 |
| GO:0042455 | GO_BP | ribonucleoside biosynthetic process | 0.0079 |
| GO:0031305 | GO_CC | integral component of mitochondrial inner membrane | 0.008 |
| GO:0007005 | GO_BP | mitochondrion organization | 0.0083 |
| GO:0009124 | GO_BP | nucleoside monophosphate biosynthetic process | 0.0083 |
| GO:0000502 | GO_CC | proteasome complex | 0.0084 |
| DEMYTAUP | GSEA_C2 | DELASERNA MYOD TARGETS UP | 0.0087 |
| GO:0016528 | GO_CC | sarcoplasm | 0.0089 |
| GO:0016529 | GO_CC | sarcoplasmic reticulum | 0.0094 |
| MOOTHAPYR | GSEA_C2 | MOOTHA PYR | 0.0096 |
| KARVCA | GSEA_C2 | KEGG ARRHYTHMOGENIC RIGHT VENTRICULAR CARDIOMYOPATHY ARVC | 0.01 |
| GO:0086069 | GO_BP | bundle of His cell to Purkinje myocyte communication | 0.01 |
| SANPPATAR | GSEA_C2 | SANDERSON PPARA TARGETS | 0.012 |
| HBSTD | GSEA_C2 | HUMMERICH BENIGN SKIN TUMOR DN | 0.012 |
| GO:0009055 | GO_MF | electron carrier activity | 0.012 |
| JAEMETDN | GSEA_C2 | JAEGER METASTASIS DN | 0.012 |
| GO:0017022 | GO_MF | myosin binding | 0.013 |
| GO:0006090 | GO_BP | pyruvate metabolic process | 0.013 |
| CSVEMD | GSEA_C2 | CHEMELLO SOLEUS VS EDL MYOFIBERS DN | 0.013 |
| GO:0098900 | GO_BP | regulation of action potential | 0.013 |
| GO:0042692 | GO_BP | muscle cell differentiation | 0.014 |
| GO:0060307 | GO_BP | regulation of ventricular cardiac muscle cell membrane repolarization | 0.014 |
| DOID:3762 | DO | Cytochrome-c Oxidase Deficiency | 0.014 |
| GO:0043034 | GO_CC | costamere | 0.014 |


| Keyword | Category | Description | FDR |
|---|---|---|---|
| GO:0006818 | GO_BP | hydrogen transport | 0.015 |
| GO:0015992 | GO_BP | proton transport | 0.017 |
| RAKNBD | GSEA_C2 | RODWELL AGING KIDNEY NO BLOOD DN | 0.017 |
| BRUSEPVEN | GSEA_C2 | BRUNEAU SEPTATION VENTRICULAR | 0.017 |
| GO:0015078 | GO_MF | hydrogen ion transmembrane transporter activity | 0.018 |
| ROAGKIDN | GSEA_C2 | RODWELL AGING KIDNEY DN | 0.018 |
| DCPRE | GSEA_C2 | DAIRKEE CANCER PRONE RESPONSE E2 | 0.018 |
| GO:0055119 | GO_BP | relaxation of cardiac muscle | 0.018 |
| GO:0009068 | GO_BP | aspartate family amino acid catabolic process | 0.02 |
| GO:0006853 | GO_BP | carnitine shuttle | 0.02 |
| GO:1902603 | GO_BP | carnitine transmembrane transport | 0.02 |
| DOID:5656 | DO | Cranial nerve diseases | 0.02 |
| GO:0043267 | GO_BP | negative regulation of potassium ion transport | 0.02 |
| GO:0086002 | GO_BP | cardiac muscle cell action potential involved in contraction | 0.021 |
| GO:0014850 | GO_BP | response to muscle activity | 0.021 |
| GO:0014823 | GO_BP | response to activity | 0.021 |
| DOID:1882 | DO | Heart Septal Defects, Atrial | 0.022 |
| DOID:1682 | DO | Congenital heart disease | 0.022 |
| DOID:1681 | DO | Heart Septal Defects | 0.022 |
| GO:0086004 | GO_BP | regulation of cardiac muscle cell contraction | 0.022 |
| DEMAGIDN | GSEA_C2 | DEMAGALHAES AGING DN | 0.022 |
| GO:0031304 | GO_CC | intrinsic component of mitochondrial inner membrane | 0.023 |
| GO:0042645 | GO_CC | mitochondrial nucleoid | 0.023 |
| WANNFKTAR | GSEA_C2 | WANG NFKB TARGETS | 0.023 |
| GO:0005744 | GO_CC | mitochondrial inner membrane presequence translocase complex | 0.024 |
| DOID:0050451 | DO | Brugada syndrome | 0.024 |
| GO:0015002 | GO_MF | heme-copper terminal oxidase activity | 0.026 |
| GO:0004129 | GO_MF | cytochrome-c oxidase activity | 0.026 |
| GO:0016675 | GO_MF | oxidoreductase activity, acting on a heme group of donors | 0.026 |
| GO:0016676 | GO_MF | oxidoreductase activity, acting on a heme group of donors, oxygen as acceptor | 0.026 |
| GO:0086001 | GO_BP | cardiac muscle cell action potential | 0.026 |


| Keyword | Category | Description | FDR |
|---|---|---|---|
| GO:0030315 | GO_CC | T-tubule | 0.026 |
| GO:0007519 | GO_BP | skeletal muscle tissue development | 0.026 |
| GO:0060373 | GO_BP | regulation of ventricular cardiac muscle cell membrane depolarization | 0.026 |
| GO:0010880 | GO_BP | regulation of release of sequestered calcium ion into cytosol by sarcoplasmic reticulum | 0.027 |
| KVLAID | GSEA_C2 | KEGG VALINE LEUCINE AND ISOLEUCINE DEGRADATION | 0.027 |
| GO:0071313 | GO_BP | cellular response to caffeine | 0.027 |
| GO:0071415 | GO_BP | cellular response to purine-containing compound | 0.027 |
| GO:0009295 | GO_CC | nucleoid | 0.027 |
| GO:0090075 | GO_BP | relaxation of muscle | 0.027 |
| GO:0014704 | GO_CC | intercalated disc | 0.029 |
| GO:0086010 | GO_BP | membrane depolarization during action potential | 0.029 |
| GO:0003215 | GO_BP | cardiac right ventricle morphogenesis | 0.03 |
| RITIMU | GSEA_C2 | ROME INSULIN TARGETS IN MUSCLE UP | 0.03 |
| GO:0006006 | GO_BP | glucose metabolic process | 0.031 |
| DOID:11724 | DO | Muscular Dystrophies, Limb-Girdle | 0.031 |
| GO:0031588 | GO_CC | AMP-activated protein kinase complex | 0.032 |
| GO:0003785 | GO_MF | actin monomer binding | 0.032 |
| GO:0086009 | GO_BP | membrane repolarization | 0.033 |
| YTRTPC1 | GSEA_C2 | YAO TEMPORAL RESPONSE TO PROGESTERONE CLUSTER 13 | 0.033 |
| GO:0051539 | GO_MF | 4 iron, 4 sulfur cluster binding | 0.033 |
| GO:0042805 | GO_MF | actinin binding | 0.034 |
| GO:0003857 | GO_MF | 3-hydroxyacyl-CoA dehydrogenase activity | 0.034 |
| GO:0060306 | GO_BP | regulation of membrane repolarization | 0.035 |
| KEGPROMET | GSEA_C2 | KEGG PROPANOATE METABOLISM | 0.036 |
| GO:0005523 | GO_MF | tropomyosin binding | 0.037 |
| KEGPYRMET | GSEA_C2 | KEGG PYRUVATE METABOLISM | 0.037 |
| GO:0010882 | GO_BP | regulation of cardiac muscle contraction by calcium ion signaling | 0.038 |
| GO:0010881 | GO_BP | regulation of cardiac muscle contraction by regulation of the release of sequestered calcium ion | 0.038 |
| CROMETDN | GSEA_C2 | CROMER METASTASIS DN | 0.039 |
| DOID:0050773 | DO | Paraganglioma | 0.039 |


| Keyword | Category | Description | FDR |
|---|---|---|---|
| DOID:169 | DO | Neuroendocrine Tumors | 0.039 |
| GO:0051371 | GO_MF | muscle alpha-actinin binding | 0.041 |
| GO:0010889 | GO_BP | regulation of sequestering of triglyceride | 0.041 |
| REACTOMEGLUCONEOGENESIS | GSEA_C2 | REACTOME GLUCONEOGENESIS | 0.041 |
| GO:0046716 | GO_BP | muscle cell cellular homeostasis | 0.042 |
| GO:0048038 | GO_MF | quinone binding | 0.042 |
| GO:0015879 | GO_BP | carnitine transport | 0.042 |
| GO:0015838 | GO_BP | amino-acid betaine transport | 0.042 |
| RAASFAOIM | GSEA_C2 | REACTOME ACTIVATED AMPK STIMULATES FATTY ACID OXIDATION IN MUSCLE | 0.045 |
| GO:0086065 | GO_BP | cell communication involved in cardiac conduction | 0.045 |
| MODHIPPOS | GSEA_C2 | MODY HIPPOCAMPUS POSTNATAL | 0.045 |
| GO:0005758 | GO_CC | mitochondrial intermembrane space | 0.046 |
| GO:0086012 | GO_BP | membrane depolarization during cardiac muscle cell action potential | 0.046 |
| GO:0022624 | GO_CC | proteasome accessory complex | 0.047 |
| GO:0014808 | GO_BP | release of sequestered calcium ion into cytosol by sarcoplasmic reticulum | 0.049 |
| GO:0000315 | GO_CC | organellar large ribosomal subunit | 0.049 |
| GO:0005762 | GO_CC | mitochondrial large ribosomal subunit | 0.049 |
| GO:1901016 | GO_BP | regulation of potassium ion transmembrane transporter activity | 0.049 |
| GO:0010959 | GO_BP | regulation of metal ion transport | 0.05 |
| GO:0045823 | GO_BP | positive regulation of heart contraction | 0.05 |
| HMSTD | GSEA_C2 | HUMMERICH MALIGNANT SKIN TUMOR DN | 0.05 |


Appendix Q

Transcription factors binding at the MNLP in VCaP and LNCaP cells. Binding of AR, ERG, FOXA1 and POLR2A was observed at the MNLP region upstream of IRX4 in VCaP cells. In LNCaP cells, no binding of AR or FOXA1 to the MNLP sequence was observed, whereas binding of ETV1 was observed at this region (Derived from UCSC Genome Browser; Source: Cistrome Finder).

