Retrotransposition in Colorectal Cancer
Total Page:16
File Type:pdf, Size:1020Kb
Retrotransposition in colorectal cancer Tatiana Cajuso Pons, M.Sc. Applied Tumor Genomics Research Program Department of Medical and Clinical Genetics Faculty of Medicine University of Helsinki Finland DOCTORAL DISSERTATION To be presented for public examination with the permission of the Faculty of Medicine of the University of Helsinki, in Lecture Hall 2, Haartman Institute, on the 27th of November, 2020 at 15 o’clock. Helsinki 2020 Supervised by Academy Professor Lauri A. Aaltonen, M.D., Ph.D. Applied Tumor Genomics Research Program Department of Medical and Clinical Genetics Faculty of Medicine University of Helsinki, Finland Docent Outi Kilpivaara, Ph.D. Applied Tumor Genomics Research Program Department of Medical and Clinical Genetics Faculty of Medicine University of Helsinki, Finland Kimmo Palin, Ph.D. Applied Tumor Genomics Research Program Department of Medical and Clinical Genetics Faculty of Medicine University of Helsinki, Finland Reviewed by Adjunct professor Csilla Sipeky, Ph.D. Institute of Biomedicine Cancer Research Laboratories Western Cancer Centre FICAN West University of Turku, Finland Adjunct professor Kaisa Huhtinen, Ph.D. Institute of Biomedicine Cancer Research Laboratories Western Cancer Centre FICAN West University of Turku, Finland Official Opponent Kathleen H. Burns, M.D., Ph.D. Chair, Dana-Farber Cancer Institute, Department of Oncologic Pathology Professor of Pathology, Harvard Medical School Boston, MA USA Doctoral Programme in Biomedicine ISBN 978-951-51-6701-9 (paperback) ISBN 978-951-51-6702-6 (PDF) Unigrafia Oy Helsinki 2020 The Faculty of Medicine uses the Urkund system (plagiarism recognition) to examine all doctoral dissertations. 1 “If you know you are on the right track, if you have this inner knowledge, then nobody can turn you off... no matter what they say.” - Barbara McClintock To my family, 2 TABLE OF CONTENTS LIST OF ORIGINAL PUBLICATIONS 5 ABBREVIATIONS 6 1. List of genes 7 ABSTRACT 9 REVIEW OF THE LITERATURE 10 1. Cancer 10 1.1. Genomic instability 11 1.2. Oncogenes 13 1.3. Tumor suppressor genes 13 1.4. Epigenetics 14 2. Colorectal cancer 15 2.1. Adenoma-to- carcinoma sequence 15 2.2. Molecular subtypes 16 3. Transposable elements 17 3.1. DNA transposons 18 3.2. Retrotransposons 19 3.2.1. LTR retrotransposons 19 3.2.2. Non-LTR retrotransposons 19 3.2.2.1. LINE -1 19 3.2.2.2. SINE 20 Alu 20 SVA 20 3.2.3. LINE-1 cycle 20 4. Retrotransposition in humans 22 5. Retrotransposition in cancer 23 AIMS OF THE STUDY 27 MATERIALS AND METHODS 28 1. Sample collection (Study I - Study III) 28 2. DNA extraction (Study I - Study III) 28 3. RNA extraction (Study III) 28 4. Whole genome sequencing (Study I - Study III) 29 4.1. Structural variation calls (Study I - Study III) 29 4.2. Retrotransposon insertion calls (Study III) 29 4.2.1. Detection of solo retrotransposon insertions 29 4.2.2. Detection of LINE -1 transductions 29 3 5. Nanopore sequencing (Study II) 30 6. SNP arrays (Study I and Study III) 30 7. Methylation assay (Study III) 30 8. RNA sequencing (Study III) 31 9. Long-distance inverse-PCR (LDI-PCR) (Study II) 31 10. Sanger validation (Study I and Study II) 31 RESULTS 33 1. Identification of frequent insertions from a LINE-1 in TTC28 33 1.2. Germline insertions in the Finnish population 34 2. Detection of clonal and subclonal transductions from the LINE-1 in TTC28 35 2.1. Analysis of the insertion sequence 37 2.2. Comparison of Nanopore and WGS local assembly 39 3. Characterization of somatic retrotransposition in 202 colorectal tumors 39 3.1. Insertions in protein-coding genes 40 3.1.1. Insertions in common fragile site genes 41 3.1.2. Insertions in known cancer genes 41 3.2. Hot reference LINE-1s 43 3.3. Retrotransposition and molecular characteristics 43 3.4. Retrotransposition and disease-specific survival 44 DISCUSSION 46 1. LINE-1 in TTC28 46 1.1. Accurate interpretation of WGS data 46 1.2. Detection of subclonal insertions 47 1.3. Elucidation of insertion characteristics 47 2. Genomic characterization of retrotransposition 48 2.1. Recurrent insertions in protein- coding genes 48 2.2. Retrotransposition as a source of fragility 49 2.3. Retrotransposition as early tumor initiating events 49 3. Retrotransposition and molecular and clinical characteristics 50 3.1. Retrotransposition correlates with CIMP and AI 50 3.2. Retrotransposition correlates with poor CRC survival 51 CONCLUSIONS AND FUTURE PROSPECTS 52 ACKNOWLEDGEMENTS 55 REFERENCES 57 4 LIST OF ORIGINAL PUBLICATIONS This thesis is based on the following publications which are referred to in the text as Study I, Study II, and Study III. I. Pitkänen, E., Cajuso T., Katainen, R., Kaasinen, E., Välimäki, N., Palin, K., Taipale, J., Aaltonen, L.A. & Kilpivaara, O. Frequent L1 retrotranspositions originating from TTC28 in colorectal cancer. Oncotarget. 5, 853-859 (2014). II. Pradhan, B.*, Cajuso, T.*, Katainen, R., Sulo, P., Tanskanen, T., Kilpivaara, O., Pitkänen, E., Aaltonen, L.A., Kauppi, L. & Palin, K. Detection of subclonal L1 transductions in colorectal cancer by long- distance inverse-PCR and Nanopore sequencing. Scientific Reports. 7, 14521 (2017). III. Cajuso, T., Sulo, P., Tanskanen, T., Katainen, R., Taira, A., Hänninen, U.A., Kondelin, J., Forsström, L., Välimäki, N., Aavikko, M., Kaasinen, E., Ristimäki, A., Koskensalo, S., Lepistö, A., Renkonen-Sinisalo, L., Seppälä, T., Kuopio, T., Böhm, J., Mecklin, J.P., Kilpivaara, O., Pitkänen, E., Palin, K. & Aaltonen, L.A. Retrotransposon insertions can initiate colorectal cancer and are associated with poor survival. Nature Communications. 10, 4022 (2019). * Equal contribution The publications are reproduced under a CC BY license. 5 ABBREVIATIONS AI allelic imbalance bp base pair BWA Burrows-Wheeler Aligner cDNA complementary deoxyribonucleic acid CIMP CpG island methylator phenotype CIN chromosomal instability CRC colorectal cancer DNA deoxyribonucleic acid EN endonuclease HERV human endogenous retrovirus Indels small insertions and deletions LDI-PCR long-distance inverse PCR LINE-1 long interspersed element-1 LINE-1HS human specific long interspersed element-1 LTR long terminal repeat MAPK mitogen-activated protein kinases miRNA microRNA MMR mismatch repair mRNA messenger RNA MSI microsatellite instability MSS microsatellite stable ORF open reading frame PCAWG Pan-Cancer Analysis of Whole Genomes PCR polymerase chain reaction RNA ribonucleic acid RNP ribonucleoprotein RT reverse transcriptase SINE short interspersed element SNP single nucleotide polymorphism SNV single nucleotide variant SVA SINE-VNTR-alu TE transposable element TPRT target primed reverse transcription TraFiC Transposon Finder in Cancer TSD target site duplication 6 TSG tumor suppressor gene TSM target site modification UTR untranslated region VNTR variable number tandem repeat WGS whole genome sequencing 1. List of genes AFF3 AF4/FMR2 family member 3 ALK ALK receptor tyrosine kinase APC APC regulator of WNT signaling pathway ARAP2 ArfGAP with RhoGAP domain, ankyrin repeat and PH domain 2 ARHGEF4 Rho guanine nucleotide exchange factor 4 BRAF B-Raf proto-oncogene, serine/threonine kinase CDKN1B cyclin dependent kinase inhibitor 1B COL25A1 collagen type XXV alpha 1 chain DLG2 discs large MAGUK scaffold protein 2 ERBB4 erb-b2 receptor tyrosine kinase 4 F8 coagulation factor VIII FAT4 FAT atypical cadherin 4 FHIT fragile histidine triad diadenosine triphosphatase FLI1 Fli-1 proto-oncogene, ETS transcription factor GABRA4 gamma-aminobutyric acid type A receptor subunit alpha4 GRIN2A glutamate ionotropic receptor NMDA type subunit 2A IKZF1 IKAROS family zinc finger 1 KRAS KRAS proto-oncogene LRP1B LDL receptor related protein 1B MECOM MDS1 and EVI1 complex locus MLH1 mutL homolog 1 MYC MYC proto-oncogene, bHLH transcription factor NOVA1 NOVA alternative splicing regulator 1 NRG1 neuregulin 1 POLE DNA Polymerase Epsilon Catalytic Subunit PREX2 phosphatidylinositol-3,4,5-trisphosphate dependent Rac exchange factor2 PTEN phosphatase and tensin homolog PTPRD protein tyrosine phosphatase receptor type D PTPRT protein tyrosine phosphatase receptor type T 7 RUNX1T1 RUNX1 partner transcriptional co-repressor 1 SGIP1 SH3GL interacting endocytic adaptor 1 ST18 ST18 C2H2C-type zinc finger transcription factor TP53 tumor protein p53 TTC28 tetratricopeptide repeat domain 28 ZNF251 zinc finger protein 251 8 ABSTRACT Colorectal cancer (CRC) is the third most common cancer type and a major cause for cancer deaths. Retrotransposons are transposable elements (TE) able to mobilize and insert via an RNA intermediate. Although germline retrotransposition can contribute to genetic diversity, uncontrolled retrotransposition can lead to genomic instability. Retrotransposons are normally repressed by promoter methylation however, they become highly active in many cancers, such as CRC. Since retrotransposons are difficult to detect, their role in tumorigenesis has remained considerably unexplored. The main goal of this thesis project was to further understand the impact of retrotransposition in colorectal tumorigenesis utilizing whole genome sequencing (WGS) and Nanopore sequencing. In study I, we detected an active reference long interspersed element-1 (LINE-1) located in the first intron of TTC28. This active LINE-1 led to the most frequent somatic structural rearrangement in our dataset, with a total of 83 somatic retrotranspositions in 92 CRCs. In study II, we applied long-distance inverse-PCR (LDI-PCR) with Nanopore sequencing to detect retrotranspositions arising from the active LINE-1 identified in study I. We identified