Molecular genetic characterization of ovarian tumors

Thesis for the degree of Philosophiae Doctor (PhD)

University of Oslo, 2018

Antonio Agostini

Section for Cancer Cytogenetics Institute for Cancer Genetics and Informatics The Norwegian Radium Hospital Oslo University Hospital

Department of Biosciences Faculty of Mathematics and Natural Sciences University of Oslo

The Norwegian Cancer Society

Radium Hospital Foundation

© Antonio Agostini, 2018

Series of dissertations submitted to the Faculty of Mathematics and Natural Sciences, University of Oslo No. 1992

ISSN 1501-7710

All rights reserved. No part of this publication may be reproduced or transmitted, in any form or by any means, without permission.

Cover: Hanne Baadsgaard Utigard. Print production: Reprosentralen, University of Oslo.


Cover Illustration

Bellerophon fighting the Chimera, Roman mosaic, 1st century A.D.

(Image from Wikimedia commons)

Bellerophon was one of the most ancient heroes of Greek mythology and a monster hunter. One day King Iobates gave Bellerophon an impossible quest: to hunt the Chimera, a fire- breathing monster. The Chimera was truly ferocious, and Bellerophon could not harm the monster even if he was riding Pegasus, the winged horse. Feeling the heat of the breath of the Chimera, an idea struck him: He got a large block of lead that he mounted on his spear. Then he flew head-on towards the Chimera and managed to lodge the block of lead inside the monster's jaws. The lead melted in the monster’s throat and killed the Chimera. Finally Bellerophon returned victorious to King Iobates.

Nora Heisterkamp and colleagues in 1983 described the first chimeric, cancer-specific fusion gene, BCR-ABL1, originating from the translocation t(9;22)(q34;q11) in chronic myeloid leukemia. Since then many other fusion genes have been discovered and found to be primary events in the pathogenesis of different types of cancer. Nowadays the hunt for these chimeras goes on with the help of ever more advanced technologies.


Acknowledgements

The work described in this thesis was performed at the Section of Cancer Genetics, Institute of Cancer Genetics and Informatics, The Norwegian Radium Hospital, Oslo. The studies were funded by grants from the Norwegian Radium Hospital Foundation, the Norwegian Cancer Society, the Center for Cancer Biomedicine, the John and Inger Fredriksen Foundation, and the Anders Jahre’s foundation through UNIFOR (University of Oslo).

I would first like to thank my main supervisor, Francesca Micci, for giving me the opportunity to work in one of the best cytogenetic groups in the world. You are a great leader, mentor, and friend. I also want to thank you for your continuous support and guidance, and for having entrusted me with such interesting, complicated but fun studies.

The second person that I want to thank is Ioannis Panagopoulos. I want to thank you for everything you taught me about fusion genes, for always encouraging me; and for all the fun times we had. Anyway, you will never convince me that Greeks were better than Romans.

Marta Brunetti deserves special thanks for having helped me with the research projects presented here. I want to thank you for your collaboration and friendship.

I want to thank Sverre Heim for having shown me the beauty of , and for all the funny and interesting history lessons. But most of all I want to thank you for your contribution in the writing process of articles and thesis.

I would like to thank all the co-authors of the papers presented in this thesis. In particular I want to thank Ben Davidson for providing the patients’ material and for his contributions to my research projects.

I want to thank all my colleagues at the Section for Cancer Genetics. Thanks for all the laughs, the fun, and the “drinks” we shared. Special thanks go to Hege Kilen Andersen, Kristin Andersen, Lene Elisabeth Johannesen, and Laila Bergly for their contributions in my research projects.

Finally, I want to thank my parents, my family, and all my friends for having supported and encouraged me during all these years far from home.

Oslo, February 2018


List of Papers

Paper I:

A novel truncated form of HMGA2 in tumors of the ovaries

Agostini A, Panagopoulos I, Davidson B, Tropé CG, Heim S, Micci F.

Oncology Letters 2016 Aug;12(2):1559-1563

Paper II:

Genomic imbalances are involved in miR-30c and let-7a deregulation in ovarian tumors: implications for HMGA2 expression

Agostini A, Brunetti M, Davidson B, Tropé CG, Heim S, Panagopoulos I, Micci F.

Oncotarget 2017 Mar 28;8(13):21554-21560

Paper III:

The microRNA mir-192/215 family is upregulated in mucinous ovarian carcinomas

Agostini A, Brunetti M, Davidson B, Tropé CG, Heim S, Panagopoulos I, Micci F.

Submitted manuscript (Scientific Reports)

Paper IV:

Recurrent involvement of DPP9 in gene fusions in serous ovarian carcinoma

Smebye ML, Agostini A, Johannessen B, Thorsen J, Davidson B, Tropé CG, Heim S, Skotheim RI, Micci F.

BMC Cancer 2017 Sep 11;17(1):642

Paper V:

Identification of novel cyclin gene fusion transcripts in endometrioid ovarian carcinomas

Agostini A, Brunetti M, Davidson B, Tropé CG, Heim S, Panagopoulos I, Micci F.

Submitted manuscript (International Journal of Cancer)


“Science, my lad, is made of mistakes, but they are mistakes which are useful to make, because they lead little by little to the truth”

― Jules Verne, A Journey to the Center of the Earth


Table of contents

List of Papers
Part 1: Introduction
Ovarian tumors
Epithelial tumors
Borderline epithelial tumors
Ovarian carcinomas
Sex cord-stromal tumors
Pure stromal tumors
Pure sex cord tumors
The hallmarks of cancer
Genomic instability
Chromosomal instability
Fusion genes
Clinical relevance of fusion genes in solid tumors
Somatic point mutations in cancer
Genomic alterations and mutations in ovarian tumors
Fusion genes in ovarian carcinomas
microRNAs: biogenesis and main functions
miRNAs and cancer
Clinical relevance
miRNAs in ovarian cancer
High-mobility group AT-hook 2 (HMGA2)
HMGA2 regulation
HMGA2 role in cancer
Part 2: Materials and methods
Patient material
Methods
Cancer cytogenetics
Short-term cultures and karyotyping of solid tumors
Fluorescence in situ hybridization (FISH)
Comparative genomic hybridization (CGH)
Polymerase chain reaction (PCR) based techniques: brief history of a “special” DNA polymerase
PCR
Reverse transcription PCR (RT-PCR)
Rapid amplification of cDNA ends (RACE)
Real-Time quantitative PCR (Real-Time qPCR)
Basics of Real-Time qPCR
Methylation-specific qPCR (MSP-qPCR)
Nucleotide sequencing
Sanger sequencing
High-throughput sequencing (HTS)
High-throughput RNA-sequencing (RNA-seq)
Bioinformatic analyses
Specific bioinformatic analyses for RNA-seq
Part 3: Aims
Part 4: Results in brief
Paper I: A novel truncated form of HMGA2 in tumors of the ovaries
Paper II: Genomic imbalances are involved in miR-30c and let-7a deregulation in ovarian tumors: implications for HMGA2 expression
Paper III: The microRNA mir-192/215 family is upregulated in mucinous ovarian carcinomas
Paper IV: Recurrent involvement of DPP9 in gene fusion in serous ovarian carcinomas
Paper V: Identification of novel cyclin gene fusion transcripts in endometrioid ovarian carcinomas
Part 5: General discussion
Methodological considerations
Study design
Transcriptomics
miRNA profiling
Methods for fusion transcript detection
The synergistic use of cytogenetics, transcriptomics, and molecular analyses
Tissue references used in the studies
Biological considerations
HMGA2 regulation in ovarian tumors
miRNAs deregulation in ovarian tumors
Fusion transcripts in ovarian carcinomas
Part 6: Conclusion and future perspectives
Appendix
Abbreviations
Gene abbreviations
Reference list

Part 1: Introduction

Ovarian tumors

Tumors of the ovaries represent 30 % of all cancers of the female genital tract. In 2017 ovarian cancer cases were estimated to be 225,000 worldwide and the number of deaths was 145,000 (65 %) [1]. The last report of the Norwegian National Cancer Registry (https://www.kreftregisteret.no) for 2014 showed that five-year survival (2010-2014) was 44.5 % [2]. Ovarian cancer is ten times less frequent than cancer of the breast, yet it is associated with a much greater number of deaths, as 75 % of patients with advanced cancer, i.e., stage III, experience recurrence after surgery and chemotherapy and most, ultimately, die of the disease [3]. The high mortality rate is mainly explained by the fact that the majority of patients are diagnosed with advanced stage carcinoma, a highly aggressive malignant tumor [4]. Nevertheless, ovarian tumors are a heterogeneous group of neoplasms that differ with respect to essential features of etiology, pathogenesis, prognosis, pathology, and molecular signature [5]. These tumors are grouped according to two main classification systems, both updated in the year 2014. The FIGO staging system, developed by the International Federation of Gynecology and Obstetrics (FIGO) Committee on Gynecologic Oncology, scores the grade of the malignancy [6]. The World Health Organization (WHO) classification divides ovarian tumors into three major groups, namely epithelial tumors, sex cord-stromal tumors, and germ cell tumors.

In the next chapters only the types of ovarian tumors analyzed as part of the present research project will be discussed.

Epithelial tumors

Epithelial tumors are the most common neoplasms of the ovary (90 %) [3]. They arise from the ovarian surface epithelium or its derivatives. Epithelial tumors are classified on the basis of the relative amounts of epithelium and stroma, the cell type, the presence of papillary processes, and the location of the epithelial elements [4].

Microscopic pathological features, such as evidence of aggressiveness and stromal invasion, and molecular analyses determine whether an epithelial tumor is benign, borderline, or malignant. This group of tumors is characterized by different histotypes: serous, mucinous, endometrioid, clear cell, and transitional cell tumors, though mixed, undifferentiated, and unclassified types also exist [4].

Borderline epithelial tumors

Borderline epithelial tumors are neoplasms characterized by an active cellular proliferation and the presence of slight nuclear atypia but without destructive stromal invasion [7]. These tumors are divided into six histological subtypes. The most common borderline tumors are serous and mucinous, while the endometrioid, clear cell, seromucinous, and borderline Brenner tumors are rare. The vast majority of borderline epithelial tumors are limited to the ovaries at presentation, with 75 % being diagnosed at FIGO stage I. Borderline tumors generally have an excellent prognosis with a 10-year survival of 97 %, although recurrences and malignant transformation can occur [1]. It is not clear if borderline tumors are a transition state from adenomas to carcinomas, or if they are distinct entities. Genetic and molecular studies showed that these tumors share many similarities with both their benign and malignant counterparts (see below) [8]. Moreover, cytogenetic studies also detected common, recurrent genomic imbalances present in both borderline tumors and carcinomas [9, 10], suggesting that there is a tight developmental connection between them.

Ovarian carcinomas

Carcinomas are the most common ovarian cancers, accounting for 90 % of all cases [4]. Although traditionally referred to as a single entity, ovarian cancer is not a homogeneous disease but rather a group of neoplasms, each with different morphology, genetics, and biological behavior. Carcinomas account for over 100,000 female deaths per year worldwide and constitute the fifth most frequent cause of cancer death in women in the Western world, being the most lethal gynecological cancer [3]. There are five main types of ovarian carcinomas: high-grade serous (HGSC; accounting for 70-75 % of all ovarian carcinomas), low-grade serous (LGSC; < 5 %), endometrioid (EC; 10 %), clear cell (CCC; 10 %), and mucinous (MC; 3 %) carcinomas [3]. HGSC is the most aggressive type of carcinoma; however, CCC shows a poor response rate to chemotherapy (15 %) compared to HGSC (80 %), resulting in a lower 5-year survival for CCC than for HGSC in patients with advanced stage tumors (20 % versus 30 %) [3]. The prognosis for the other carcinoma subtypes (MC, EC, and LGSC) is relatively favorable [11].

The origin of ovarian carcinomas is still debated. The cells characterizing the five subtypes are not present in the normal ovarian epithelium and are more similar to cells of other sites in the reproductive system. In the past, the cancerous cells were thought to derive exclusively from the surface epithelium (mesothelium) and epithelial inclusion cysts through a process of Müllerian neometaplasia, so that they would morphologically resemble the epithelia of the fallopian tube, endometrium, or endocervix [12]. However, although the mesothelial origin cannot be excluded, there is also strong evidence that primary ovarian cancers actually originate in neighboring pelvic organs and involve the ovary secondarily. Recently, it was proposed that HGSC arise from precursor epithelial lesions in the fimbriae of the fallopian tube [13], whereas EC and CCC originate from ovarian endometriosis [14]. The different origins of these carcinomas may also explain their different histological and genetic features (see below).

Sex cord-stromal tumors

Sex cord-stromal tumors are composed of different cell types, alone or in various combinations, which belong to the sex-cord lineage and the stromal component of the ovary. These represent 7 % of all ovarian tumors [15, 16]. The cells composing these tumors are of different types: granulosa, theca, Sertoli, Leydig cells, and fibroblasts. The WHO classification of sex cord-stromal tumors was recently revised, regrouping the tumors into the following clinicopathologic entities: pure stromal tumors, pure sex cord tumors, and mixed sex cord-stromal tumors [4]. Although sex cord-stromal tumors are diverse, the majority present as a low-grade disease that usually follows a nonaggressive clinical course in younger patients [16].

Pure stromal tumors

Fibromas (F) are composed of spindle stromal cells that produce a collagenous stroma. They are the most common sex cord-stromal tumors, representing < 4 % of all ovarian neoplasms [16]. F can occur at any age, with a mean age in the late forties [17]. Thecomas are composed of lipid-laden stromal cells that resemble theca cells, which are located around the ovarian follicles and exhibit estrogenic activity in most cases [4]. Thecomas account for around 1 % of all primary ovarian tumors. They are more likely to occur in postmenopausal women and, with rare exceptions, are considered benign neoplasms [18]. Thecomas showing presence of fibrous tissue may be classified as thecofibromas (Thf) [4].

Pure sex cord tumors

Granulosa cell tumors are low-grade malignant ovarian sex cord-stromal tumors which represent less than 5 % of all ovarian malignancies. Clinicopathologically, they are divided into two types, adult and juvenile, of which the former accounts for 95 % of the neoplasms [4]. As the names suggest, adult granulosa cell tumors are more frequent in postmenopausal women while the juvenile form is frequent in women under 30 years of age and can also occur in infants [19]. These tumors are the most common estrogen-producing tumors, although a small subset is androgenic. The production of these hormones causes a range of different symptoms [16]. Estrogen overproduction can be responsible for endometrial hyperplasia and concomitant endometrial cancer that is nearly always of a low-grade, low-stage adenocarcinoma type [20]. Even though these tumors are generally benign, they have a potential for aggressive behavior as 10-50 % of patients develop recurrences.

The hallmarks of cancer

“We are made up of thousands of parts with thousands of functions all working in tandem to keep us alive. Yet if only one part of our imperfect machine fails, life fails.”

—Emil Pagliarulo, Skyrim

The neoplastic process irreversibly transforms a normal cell into a cell that is beyond control, either dividing too much or living too long. Different mechanisms drive this process, but they all result in the gain of functional properties that are necessary for a given neoplasm’s development, be it benign or malignant. These properties were summarized by Weinberg et al. [21] in ten capabilities known as the hallmarks of cancer (Figure 1).

Figure 1. The Hallmarks of Cancer. Adapted with permission.

Each of these hallmarks is fundamental for the cancer cells’ growth and survival. The first and probably most important hallmark is the ability to sustain a proliferative signal. Normal tissues carefully control the production and release of growth-promoting signals, thereby maintaining cell number, architecture, and function. The proliferation signals can stem from both extracellular and intracellular pathways and these stimuli can, in many instances, affect other hallmarks of cancer. In addition to the capability of inducing and sustaining positively acting growth-stimulatory signals, cancer cells must bypass the cellular programs that negatively regulate cell proliferation [22]. Senescence and apoptosis are the only two barriers that prevent unlimited replication; therefore, cancer cells must evolve to evade these barriers [23, 24]. A growing neoplastic tissue needs nutrients, which reach it through the development of new blood vessels (angiogenesis) [25], and its cells must utilize these nutrients at a speed that matches their high rate of proliferation [26, 27]. Tumor growth is accompanied by increasing selective pressure on tumor cells from different factors such as the basement membrane, increased interstitial pressure, limited oxygen and nutrient supply to tumor cells, the formation of reactive oxygen species, hypoxic conditions, and permanent exposure to immune system cells [28]. Under this selective pressure some tumor cells may be subjected to regression and death, while other cells, which resist powerful, counteracting microenvironmental factors, gain an aggressive phenotype and the ability of metastatic progression. The involvement of the host immune system in cancer progression was demonstrated in many types of tumors that are densely infiltrated by cells of both the innate and adaptive arms of the immune system. These tumor-associated inflammatory responses have in some instances the paradoxical effect of enhancing tumorigenesis and progression [29, 30]. On the other hand, the cells of the immune system represent a clear menace for the tumors, so cancer cells have to acquire the ability to evade immune surveillance [31, 32].

Genomic instability

The acquisition of the hallmarks described above depends, in large part, on a succession of alterations in the genome of neoplastic cells (genomic instability). There are different types of genomic instability. Most cancers have a form that is called chromosomal instability (CIN), in which such alterations influence chromosome structure and/or number. CIN is generally caused by a defective mitotic apparatus [33]. Equal segregation of chromosomes during mitosis is pivotal for the maintenance of genomic stability. Failure of accurate chromosome segregation usually leads to cell death, but it can also be the starting point of malignant transformation. Accurate chromosome segregation during cell division is monitored and safeguarded by several closely linked, yet distinct, molecular machineries involved in DNA repair and mitosis control. The genes coding for the proteins responsible for control of genomic stability may be mutated in both hereditary and sporadic cancers [34].

This is the main reason why it is still not clear for many types of tumors if CIN is a causal event in tumorigenesis or occurs during the later stage as a result of mutation(s) in caretaking genes [34]. Although CIN is the major form of genomic instability in human cancers, other forms have also been described. Microsatellite instability (MSI) is characterized by the expansion or contraction of the number of oligonucleotide repeats present in microsatellite sequences. MSI is most commonly found in autosomal dominant malignancies such as the hereditary nonpolyposis colorectal cancer syndrome (HNPCC) [35]; however, it can also be found in sporadic tumors [36]. Another form of genome instability is caused by point mutations. The loss of function of DNA repair genes causes increased frequencies of point mutations. For example, hereditary MYH-associated polyposis, with biallelic germline mutations in MYH, a DNA base excision repair (BER) gene, results in increased G-C to T-A mutation frequencies and cancer [37].

Chromosomal instability

CIN is seen as an alteration of chromosome number and/or structure. Structural and numerical chromosomal aberrations lead to imbalances of genetic material (gain and/or loss). Structural aberrations also cause the relocation of material with or without net gain or loss (unbalanced or balanced rearrangements). Chromosomal aberrations are used for diagnostic and prognostic purposes, as well as to study cancer heterogeneity [38]. Continuous studies of such changes in different types of neoplasia help unravel the pathogenetic mechanisms behind tumorigenesis and tumor progression.

Most chromosomal aberrations lead to gain of genetic material either as additional copies of chromosome(s) or parts of chromosomal arms or regions. Presence of additional genetic material in tumor cells can result in amplification of oncogenes. Ring chromosomes are recurrent chromosomal aberrations in highly differentiated liposarcoma whose amplicons typically contain material from chromosomal region 12q13-21 containing many copies of the proto-oncogene MDM2 and the high mobility group AT-hook 2 gene (HMGA2). The amplification of these oncogenes leads to their overexpression which is important in tumorigenesis [39].

The effect of loss of chromosomal material is thought to be loss of a copy of a tumor suppressor gene, possibly accompanied by inactivation of the second allele on the homologous chromosome. In chronic lymphocytic leukemia (CLL) the three most common chromosomal aberrations lead to loss of tumor suppressor genes (Figure 2) [40]. Deletion of 13q14.3 is the most common cytogenetic abnormality in CLL, seen in 50–60 % of the patients. This deletion leads to the loss of the miRNAs miR-15a and miR-16 and the long non-coding RNA DLEU7; all three non-coding RNAs act as tumor suppressors in lymphocytes [40]. Structural abnormalities of the short arm of chromosome 17 (17p), including deletions, translocations (usually unbalanced), and isochromosome of 17q, are detectable cytogenetically in less than 5 % of the patients with early-stage CLL, rising to over 30 % in patients with advanced chemo-refractory disease. The deletion is usually associated with the loss of the tumor protein p53 gene (TP53) [40]. Structural abnormalities of the long arm of chromosome 11 occur in approximately 20 % of the patients with CLL. The great majority of 11q deletions result in loss of the ataxia telangiectasia mutated (ATM) gene, and approximately 40 % of the patients with ATM loss carry a mutation of the remaining allele.

Figure 2. Chromosomal locations of genes often deleted in CLL. A) DLEU2, MIR-16-1, MIR-15A, DLEU1, and DLEU7 are included in the minimal deleted region of del(13q) in CLL. B) Deletion of 17p material causes loss of the TP53 gene. C) ATM loss by del(11q). Image acquired by GenomeBrowser.

Chromosomal rearrangements, such as translocations, are common aberrations in human cancer, particularly in hematopoietic, lymphoid, and mesenchymal tumors [41]. Translocations change the original locations of genes and may generate new products relevant for carcinogenesis, i.e., fusion genes. Translocations are pathognomonic in many types of tumors and are thought to be the initial event of tumorigenesis. It was demonstrated that in most cases of childhood acute lymphoblastic leukemia (ALL) the translocation t(12;21)(p13;q22) arises prenatally, in utero. Monozygotic twins affected by t(12;21)(p13;q22)-positive ALL share the same ETV6-RUNX1 genomic fusion sequence [42], which may be sufficient to initiate ALL [43]. Recurrent chromosomal translocations are seen in both benign and malignant tumors [41].

Fusion genes

“To create, something of equal value must be lost. That is the principle of the equivalent exchange.”

— Hiromu Arakawa, Full Metal Alchemist

A fusion gene is a hybrid gene composed of parts from two independent genes which have been fused together. Fusion genes are caused by balanced and unbalanced chromosomal rearrangements (Figure 3) [44] and are found in many types of neoplasia, especially solid connective tissue tumors and hematological malignancies. Presently, 11,125 fusions have been discovered and scientifically reported (https://cgap.nci.nih.gov/Chromosomes/Mitelman, update …2017). Many of these fusion genes are pathognomonic of specific neoplasias and are valid diagnostic, prognostic, and predictive markers [45]; however, not all identified fusion genes have a documented pathogenetic effect.

Formation of a fusion gene may lead to two different scenarios: its eventual translation into a chimeric protein with a qualitatively new structure and new function(s), or deregulation of the genes involved in the fusion with subsequent production of an altered amount of otherwise normal protein.

Figure 3. Examples of chromosomal rearrangements leading to formation of fusion genes. A) The translocation t(9;22)(q34;q11.2) causing the BCR-ABL fusion gene in CML. B) The inversion inv(16)(p13q22) causing CBFB-MYH11 in acute myeloid leukemia. C) The insertion ins(11;2)(q23:q11.2) leading to formation of the MLL-LAF4 fusion gene in acute lymphoblastic leukemia. D) The 5q-deletion del(5)(q32q33.3) leading to the fusion gene EBF1-PDGFRB. Arrows point to breakpoint positions.

One example of a chimeric fusion is known from studies of lung cancer. The fusion involves exon 13 of the echinoderm microtubule-associated protein-like 4 gene (EML4) and exon 22 of the anaplastic lymphoma kinase gene (ALK) (Figure 4A). The resulting chimeric protein contains the kinase domain of ALK and the basic domain, the HELP domain, and the WD repeats of EML4, which are all indispensable for the oncogenic activity of the chimeric kinase (Figure 4A) [46].

A fusion event can cause deregulation of the 3’ partner gene in different ways. Oncogenes can be brought into proximity of new cis-regulatory elements. The classic example is the overexpression of the proto-oncogene MYC in Burkitt lymphoma due to the translocation t(8;14)(q24;q32), which causes MYC to become juxtaposed to immunoglobulin heavy chain (IGH) regulatory elements (Figure 4B) [47]. Some fusions lead to a phenomenon called “promoter swapping”, where the breakpoint is located in the 5’ untranslated region (UTR) of the 5’ partner gene, resulting in ectopic expression of the wild-type 3’ partner gene. A clear example of promoter swapping is the fusion gene resulting from the translocation t(3;8)(p21;q12) in pleomorphic adenomas of the salivary glands. This results in promoter swapping between the pleomorphic adenoma gene 1 (PLAG1) and the gene for β-catenin (CTNNB1), leading to activation of PLAG1 expression but reduced expression of CTNNB1 [48]. Out-of-frame fusions can also cause aberrant expression. Typical examples are the out-of-frame fusion products involving HMGA2. The functional impact of these fusions is the truncation of HMGA2, because the partner gene generally contributes only a stop codon a few amino acids downstream of the breakpoint; such truncation causes HMGA2 overexpression (see below) [49].
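
The reading-frame reasoning behind in-frame and out-of-frame fusions can be illustrated with a minimal Python sketch. The function names, the phase convention, and the toy sequences are hypothetical and serve only as an illustration; they are not taken from the analyses described in this thesis. The sketch assumes that the 5’ partner contributes coding sequence starting at its start codon and that translation continues across the junction in that frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_in_frame(five_prime_len: int, three_prime_phase: int) -> bool:
    """Return True if the junction preserves the 3' partner's native frame.

    five_prime_len: number of coding bases retained from the 5' partner.
    three_prime_phase: number of bases (0, 1, or 2) at the start of the
    3' fragment that complete a codon begun upstream in its own transcript.
    """
    leftover = five_prime_len % 3  # bases of an unfinished codon at the junction
    return (leftover + three_prime_phase) % 3 == 0


def codons_before_stop(five_prime_cds: str, three_prime_seq: str):
    """Count the codons read from the junction until the first stop codon.

    The fused sequence is read in the frame of the 5' partner. Counting
    starts at the first codon that contains bases from the 3' partner.
    Returns None if no stop codon is found in the supplied sequence.
    """
    fused = five_prime_cds + three_prime_seq
    junction = len(five_prime_cds)
    start = junction - (junction % 3)
    for i in range(start, len(fused) - 2, 3):
        if fused[i:i + 3] in STOP_CODONS:
            return (i - start) // 3
    return None


# Toy example (hypothetical sequences): ten coding bases from the 5' partner
# joined to a 3' fragment whose native frame starts at its first base.
print(is_in_frame(10, 0))
print(codons_before_stop("ATGGCCAAAG", "GTTAGCCC"))
```

In this toy case the junction shifts the frame of the 3’ fragment and a stop codon is met shortly after the breakpoint, which mirrors the truncation scenario described above for the 5’ partner.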

Figure 4. Illustration of the two main consequences of fusion gene formation. A) Amino acid domains originating from EML4 (blue) and ALK (red) included in the EML4-ALK fusion kinase. B) The MYC gene (blue) is fused with the regulatory part of the IGH gene (red) in the IGH-MYC fusion.

Clinical relevance of fusion genes in solid tumors

Fusion genes are valid biomarkers in both hematological malignancies and solid tumors. In hematological malignancies, screening for fusion genes is a routine diagnostic procedure, as the finding of aberrant genes is fundamental for the stratification of different types of leukemia and lymphoma into prognostic and therapeutic subgroups [50]. In solid tumors, particularly soft tissue tumors, fusion genes play a similar role. This group of neoplasms includes many rare tumors that are difficult to diagnose by standard pathological examinations. The fact that fusion genes are not only recurrent, but in some tumor types even pathognomonic, makes them a perfect diagnostic tool. The fusions NAB2-STAT6 [51] and FUS-CREB3L2 [52] are pathognomonic for solitary fibrous tumor and low-grade fibromyxoid sarcoma, respectively, and are currently used for the diagnosis of these fibrous tumors. Some fusions can also be used as prognostic markers, e.g., PAX7-FOXO1 and PAX3-FOXO1 in alveolar rhabdomyosarcoma (ARMS). PAX3-FOXO1 is expressed in 65 % of ARMS whereas PAX7-FOXO1 is expressed in only 20 %. In pediatric patients, PAX7-FOXO1 is associated with an improved outcome while PAX3-FOXO1 is associated with a poor prognosis [53].

Only a few fusion genes have been found to be clinically relevant in carcinomas. The TMPRSS2-ERG fusion is the most frequent genomic alteration in prostate cancer (50 %), leading to androgen-dependent up-regulation of the ETS transcription factor ERG. The prognostic relevance of the ERG fusion has long been debated; however, there is growing evidence that other markers combined with ERG are strong prognostic tools in this setting [54-56]. In many epithelial tumors, chimeric genes are rare, though their discovery is clinically relevant. For example, fusions involving the tyrosine kinase gene ALK and the ROS proto-oncogene 1 (ROS1), found in 3-7 % of lung carcinomas, are predictive markers, as patients with ALK and ROS1 fusions can be treated with the kinase inhibitor crizotinib to good effect [57].

Somatic point mutations in cancer

The genome of most types of cancer is characterized by a high rate of point mutations. Over the past decades, different international projects aiming to characterize the most frequent somatic mutation(s) in cancer have found that some tumors are characterized by a high number of mutations (~200) while others have only a few [58, 59]. The neoplasms showing a high mutation rate usually arise in tissues exposed to high doses of mutagenic agents, such as smoke in lung cancer or UV light in melanoma [60]. Pediatric tumors and leukemias instead harbor few point mutations (~20), possibly because these tumors arise in tissues with a low or absent rate of self-renewal. In fact, pediatric tumors such as glioblastoma often occur in non-self-renewing tissues, and those that arise in renewing tissues (such as leukemias) originate from precursor cells that have not renewed themselves as often as in adults [60]. All other tumors arising in adults usually harbor around 50-100 point mutations, though not all these mutations are necessarily pathogenic. Historically, the mutations are divided into “drivers” and “passengers”. The “drivers” are recurrent mutations that confer a significant selective growth advantage and are frequent, while the “passengers” confer little selective advantage on the cancer cells (if any) and are usually rarer. In recent years, it has become clear that this division may be outdated, as it has been shown that mutations occurring at a low rate can be tumorigenically important and affect the same pathways that drive tumor progression [61].

Recently, different studies have highlighted the role of point mutations in non-coding sequences during oncogenesis. These mutations can arise in both cis- and trans-regulatory regions, affecting the transcription and expression of genes [62]. An example is provided by the mutations C228T and C250T in the promoter region of the telomerase reverse transcriptase gene (TERT), identified in a large number of tumors of different types [23]. They introduce a new binding site (TTCCGG) for members of the E-twenty-six/ternary complex factor (Ets/TCF) transcription factor family, resulting in increased TERT expression.

Genomic alterations and mutations in ovarian tumors

Borderline tumors of the ovary with abnormal karyotype usually have simple chromosomal aberrations with +7 and +12 (trisomies for chromosomes 7 and 12, respectively) as the most common [10]. Other major copy number changes are gains of or from chromosome arms 2q, 6q, 8q, 9p, and 13q and losses of or from 1p, 12q, 14q, 15q, 16p, 17p, 17q, 19p, 19q, and 22q [10]. Ovarian serous borderline tumors have largely the same mutations as cystadenomas and LGSC. Ho et al. (9) identified mutations in either codons 12 and 13 of the KRAS proto-oncogene (KRAS) or codon 599 of the B-Raf proto-oncogene (BRAF) in the serous borderline tumors as well as in adjacent cystadenoma and found that the frequency of these mutations in cystadenomas associated with borderline tumors was significantly higher than in tumors lacking borderline progression. Another study found the very same mutations in serous borderline tumors and LGSC [63].

Ovarian carcinomas usually show a complex karyotype. The chromosomal breakpoints identified by banding analysis tend to cluster to 19p/q and to 11q, but there is no one unquestionably recurrent rearrangement. Common imbalances were noted as gains from 1q, 3q, 7q, and 8q and losses from 17p, 19q, and 22q [64]. The different histotypes of ovarian carcinoma display different molecular signatures at the gene mutation level. CCC are characterized by a high frequency of mutations of the AT-rich interaction domain 1A gene (ARID1A) (50 %) and the phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha gene (PIK3CA) (46 %) [65, 66]. EC are characterized by a high frequency of mutations in the genes ARID1A, catenin beta 1 (CTNNB1), and phosphatase and tensin homolog (PTEN) [67]. Interestingly, CTNNB1 mutations, occurring in 38–50 % of the cases, are associated with a “squamous/endometrioid” differentiation, low tumor grade, and favorable outcome [67]. MC have a high frequency of KRAS (60 %) and TP53 (52 %) mutations, while BRAF, cyclin-dependent kinase inhibitor 2A (CDKN2A), and Kruppel-like factor 5 (KLF5) mutations are recurrent but rarer [68, 69]. LGSC have a high rate of KRAS mutations (38 %), but BRAF (20 %) is also frequently mutated [70]. A study conducted by The Cancer Genome Atlas (TCGA) project analyzed 489 tumors of the HGSC subtype. The most frequently mutated gene was TP53 (96 %), whereas BRCA1, BRCA2, neurofibromin 1 (NF1), and RB transcriptional corepressor 1 (RB1) had fewer but recurrent somatic mutations [71].

There are few studies investigating the genetic features of sex cord-stromal tumors. Thecomas and fibromas are characterized by simple karyotypes with aneuploidies such as +12 as the most recurrent aberration [72]. Granulosa cell tumors are characterized by supernumerary chromosomes 8, 9, 12, and 14, the latter being present in 30 % of the cases. Partial or complete loss of chromosomes and/or chromosome arms 1p, 13p, 16, 11, and 22 are recurrent (20-39 %), with monosomy 22 being common (51 %) [73].

Fusion genes in ovarian carcinomas

Different studies have investigated the presence of fusion genes in ovarian carcinomas. Around 700 samples have been analyzed so far, but only a small number of recurrent fusion transcripts have been found. Two studies that were part of the TCGA project analyzed more than 400 samples of HGSC and found that this histological subtype is characterized by many fusion genes but that only some of them are recurrent (frequency 0.5-1 %) [71, 74]. The authors concluded that fusion genes are secondary to the high-grade genomic instability characteristic of HGSC. On the other hand, other studies have hinted that fusion transcripts may be recurrent and relevant in subsets of ovarian carcinomas. Patch et al. [75] performed a whole-genome characterization of 165 samples of chemoresistant HGSC and found the SLC25A40-ABCB1 fusion transcript in 6 samples. The fusion juxtaposes the promoter of the solute carrier family 25 member 40 gene (SLC25A40) to exon 2 of the ATP binding cassette subfamily B member 1 gene (ABCB1), causing upregulation of the latter gene, which codes for MDR1, an efflux pump for chemotherapeutic agents that has been implicated in the mechanism of resistance to paclitaxel in ovarian cancer [75]. Earp et al. [76] analyzed a series of 220 tumors representing different histotypes of ovarian carcinomas and found UBAP1-TGM7 to be a recurrent fusion transcript in CCC (2 out of 20 tumors).

Several studies found fusion genes at higher frequencies, but subsequent research failed to confirm these findings [77-79]. ESRRA-C11orf20 was reported at a 15 % frequency in HGSC [77]; however, a later study from our group did not find the fusion in any of 163 samples analyzed [80]. The same goes for the CDKN2D-WDFY2 [78] and BCAM-AKT2 [79] fusions, which were described as recurrent at rates of 20 % and 7 %, respectively.

microRNAs: biogenesis and main functions

microRNAs (miRNAs) are a special class of small non-coding RNAs involved in gene regulation. These molecules were first identified in C. elegans by Lee et al. in 1993 [81]. miRNAs are expressed and highly conserved in the majority of eukaryotes, plants included, and are important regulators of gene expression [82]. Around 1,881 mature miRNAs have been identified so far in humans (http://www.mirbase.org/cgi-bin/mirna_summary.pl?org=hsa). miRNA genes account for 1-5 % of the genome and regulate at least 30 % of protein-coding genes [83]. The genes coding for miRNAs are generally polycistronic (clustered) and code for different miRNA species. The clusters are located in intergenic regions as well as in the introns and exons of host genes [84].

The biogenesis of miRNAs is a complex process starting in the nucleus and finishing in the cytoplasm; it requires the action of many proteins (Figure 5). Intergenic miRNAs are transcribed by RNA polymerases II and III into miRNA precursors, named pri-miRNAs, which possess a large stem-loop structure with single-stranded RNA extensions at both ends. Intragenic miRNAs are instead transcribed as part of the pre-mRNA, and the pri-miRNAs are then excised through a mechanism that is still not fully understood [85]. The pri-miRNAs undergo a first step of maturation in the nucleus, facilitated by a processing complex formed by the RNase III endonuclease DROSHA and the double-stranded RNA-binding protein DiGeorge syndrome critical region gene 8 (DGCR8). This complex cleaves the pri-miRNA into a ~60-70 bp miRNA hairpin precursor named pre-miRNA. Subsequently, the pre-miRNA assembles into a complex with the nucleo-cytoplasmic transporter Exportin-5 and RanGTP, which prevents nuclear degradation and facilitates translocation into the cytoplasm, where DICER, in concert with other proteins, cuts the pre-miRNA into the final mature product of 20-21 nt and loads it into the RNA-induced silencing complex (RISC) (Figure 5) [86].

The RISC complex is formed by different ribonucleoproteins in its core, among them a protein of the Argonaute family that binds the miRNA [87]. The miRNA guides the RISC complex to the target transcripts by binding specific sites located in the 3’UTR of the mRNAs. In general, the consequence of recognition by RISC is down-regulation of the targeted gene, even though the recognition can also trigger other processes, including mRNA localization and alternative splicing [87]. The most prevalent mode of gene silencing by RISC in mammalian systems is miRNA-guided translational repression, which does not require extensive sequence complementarity between miRNA and target mRNA. Only bases 2–7 of the guide RNA (the seed region) are required to match a target and initiate translational repression.

Mechanisms for translation repression by RISCs include, among others, inhibition of translation initiation and deadenylation [87].
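The limited complementarity requirement described above can be illustrated with a minimal Python sketch. It assumes that the relevant stretch is positions 2–7 from the 5' end of the guide RNA and searches a 3’UTR for perfectly complementary sites. The function names and the toy UTR are hypothetical and serve only as illustration; the let-7a-5p sequence is quoted from memory and should be checked against miRBase before any real use. Actual target prediction tools also weigh site context, conservation, and pairing outside this stretch.

```python
# Watson-Crick complement for RNA bases (G:U wobble pairs are ignored here).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna, start=2, end=7):
    """Return the mRNA motif (5'->3') complementary to miRNA bases start..end (1-based)."""
    seed = mirna[start - 1:end]
    # Pairing is antiparallel, so the seed is reversed before complementing.
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def find_seed_matches(mirna, utr):
    """Return 0-based start positions of perfect seed matches in a 3'UTR sequence."""
    site = seed_site(mirna)
    utr = utr.upper().replace("T", "U")  # accept DNA or RNA notation
    positions = []
    index = utr.find(site)
    while index != -1:
        positions.append(index)
        index = utr.find(site, index + 1)
    return positions


# Assumed sequence of hsa-let-7a-5p; verify against miRBase.
let7a = "UGAGGUAGUAGGUUGUAUAGUU"
# Hypothetical toy 3'UTR fragment, not a real HMGA2 sequence.
toy_utr = "AAGCUACCUCAGGAUUACCUCUU"

if __name__ == "__main__":
    print(seed_site(let7a))
    print(find_seed_matches(let7a, toy_utr))
```

Because only six bases have to match, a single miRNA can have sites in many transcripts, and one 3’UTR can carry several sites for the same miRNA, as in the toy sequence used here.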

Figure 5. Schematic illustration of miRNA biogenesis. Adapted with permission.

miRNAs and cancer

“It is not the size of the dog in the fight; it is the size of the fight in the dog.”

— Mark Twain

miRNA dysregulation has been found in most tumors examined [88, 89]. Most of these deregulated miRNAs participate actively in the pathogenesis of different types of tumors by controlling the expression of several genes, although it is not always possible to correlate miRNA deregulation directly with an oncogenic effect [90]. It is likewise not always clear whether altered miRNA patterns are a direct cause of cancer or rather an indirect result of changes in cellular phenotype, as it is known that miRNA expression patterns differ between tissues and differentiation states [86].

Different factors thus influence the expression of miRNAs in cancer. Several signaling pathways [91, 92] and transcription factors usually deregulated in cancer, such as MYC [93], can alter the regulation of several miRNA species. Genomic alterations can also affect miRNA expression levels. In different types of solid tumors, both mesenchymal and epithelial, there is net loss of tumor suppressor miRNA clusters, such as MIRLET7A, MIR15A, MIR16, and MIR125B [94, 95]. Gene amplification of miRNA coding clusters is also frequent and is usually, though not always, correlated with high expression levels [95, 96]. In their seminal study, Zhang et al. [95] analyzed 293 tumors of the ovaries and breast as well as melanomas using both array comparative genomic hybridization (aCGH) and miRNA expression arrays. They showed that miRNA cluster amplification is not always correlated with overexpression, concluding that other factors must be behind the discrepancy between copy number variation and expression. The miRNA biogenesis machinery can be defective in cancer and impair miRNA expression. The genes coding for DROSHA and DICER are recurrently downregulated and/or mutated in different types of tumors [97, 98]. Changes in the activity of these two proteins profoundly affect miRNA biogenesis and impact the expression of different miRNA species [85]. miRNAs are also epigenetically regulated on a genome-wide level in cancer [99]. The members of the miR-34 gene family (miR-34a, miR-34b, and miR-34c) are direct targets of p53 which induces cell cycle arrest and apoptosis in cancer cells. The promoters of the genes coding for these miRNAs are targets of CpG island hypermethylation in different neoplasms, including lung and breast cancer [100]. Some miRNAs are epigenetically activated via DNA hypomethylation, e.g., the upregulation of miR-196b [101], miR-106a [102], and miR-519d [103] is caused by hypomethylation in their promoter regions.

Generally, miRNAs are divided into oncomirs and tumor suppressors; however, this classification fails in some instances due to the intricate pathways in which miRNAs are involved. A single miRNA can regulate multiple targets and can act as an oncogene as well as a tumor suppressor in different tumors [90]. miR-125b acts as an oncomir in the vast majority of hematologic malignancies exerting its oncogenic role via the suppression of hematopoietic differentiation factors [104]. In solid tumors, whether miR-125b is oncogenic or tumor suppressive is far more variable, probably because there is a more even balance between its oncogenic and tumor suppressive effect (Figure 6) [90].

Figure 6. Overview of miR-125b targets in cancer. A) Tumor suppressor activity of miR-125b. miR-125b suppresses proliferation and tumor growth by targeting LIN28B in hepatocellular carcinoma [105], BCL3 in ovarian cancer [106], and STAT3 in osteosarcoma [107]. B) miR-125b acts as an oncomir in endometrial carcinoma [108], prostate [109], and breast cancer [110] repressing the genes TP53INP1, BAK1, and STARD13, respectively.

miR-21 is one of the best characterized oncomirs. It is upregulated in many epithelial cancers and in some hematological neoplasms [111]. miR-21 can target different genes promoting almost every hallmark of cancer; however, it exerts its main oncogenic function through inhibition of apoptosis [112]. The let-7 family of miRNAs, which probably provides the best example of tumor suppressor miRNAs, is highly conserved among invertebrates and vertebrates, including humans, who have twelve let-7 genes encoding nine miRNAs. The let-7 family clusters are frequently deleted in a wide range of tumors, including ovarian, lung, and breast carcinomas [113]. The main targets of the let-7 miRNAs are the RAS oncogenes [114] and the HMGA2 gene [115, 116]. Moreover, let-7 miRNAs play a role in controlling cell proliferation by inhibiting the expression of numerous cell cycle regulators, including MYC, cyclin dependent kinase 6 (CDK6), and cyclin D2 (CCND2) [117].

Clinical relevance

A large number of ongoing clinical trials are evaluating the possible use of miRNAs as diagnostic, prognostic, and predictive markers (for details, see www.clinicaltrials.gov). Lu et al. demonstrated that miRNA expression profiles can be used to precisely classify various types of cancers and are superior to mRNA expression profiles in classifying poorly differentiated tumors [118]. Furthermore, they can be used as reliable biomarkers in both solid tumors and hematological malignancies [119-121]. Zhang et al. built an algorithm using the expression patterns of six miRNAs (miR-21-5p, miR-20a-5p, miR-103a-3p, miR-106b-5p, miR-143-5p, and miR-215) which was able to differentiate patients with high-risk colorectal cancer from those with lower-risk disease. They found that this six-miRNA-based classifier had better prognostic value than clinicopathological risk factors and mismatch repair status [122].

In recent years, several studies have identified distinctive patterns of expression of miRNAs in the serum of patients affected by different types of cancer. Recently, Yang et al. [123] showed that seven miRNAs, miR-15b, miR-23a, miR-133a, miR-150, miR-197, miR-497, and miR-548b-5p, were significantly decreased in the serum of patients with advanced stage (grade II–IV) astrocytoma and that the miRNA-based signature could accurately distinguish between normal controls and cancer patients. The use of circulating miRNAs as biomarkers might obviate the need for biopsies and could also allow cheap and effective screening of at-risk populations, thus avoiding a range of collateral health and cost issues associated with invasive techniques.

miRNAs in ovarian cancer

Several studies focusing on miRNA expression profiles have been performed in ovarian cancer. Most of them addressed the situation in HGSC, whereas other histotypes were analyzed less extensively. Zhang et al. showed that 34 miRNAs were recurrently dysregulated [124]. The tumor suppressors let-7a/b/d/f, miR-31, miR-34abc, miR-125b, and miR-127 were the most down-regulated, while the oncomirs miR-20a, miR-23a/b, and miR-200a/b/c were expressed at high levels. Mining the TCGA data, Miles et al. identified seventeen miRNAs that were dysregulated in HGSC, including eight up-regulated miRNAs (miR-183-3p, miR-15b-3p, miR-15b, miR-590-5p, miR-18a, miR-16, miR-96, and miR-18b) and nine down-regulated miRNAs (miR-140-3p, miR-145-3p, miR-143-5p, miR-34b-5p, miR-145, miR-139-5p, miR-34c-3p, miR-133a, and miR-34c-5p) [125]. EC seems to be characterized by the upregulation of three miRNAs, miR-21, miR-203, and miR-205 [126]. Calura et al. found that CCC are characterized by a higher expression of miR-30a, whereas MC have higher levels of miR-192 and miR-194 [127].

Reportedly, miRNA deregulation in ovarian cancer is mainly caused by copy number alterations. Ovarian cancer is characterized by genomic instability, and it has been demonstrated that the loci of miRNAs previously found deregulated in ovarian carcinomas are affected by genomic imbalances [94, 95]. Seminal studies by two different groups [126, 128] showed that miRNAs are also epigenetically silenced in ovarian cancer. Zhang et al. [128] treated five ovarian cancer cell lines with 5-AZA and the histone deacetylase inhibitor 4-phenylbutyric acid and found that 16 of 44 miRNAs (36.4 %) that are down-regulated in advanced stage ovarian cancer could show a normal expression profile after treatment with these drugs. Another study found that MIR34A was frequently methylated in ovarian carcinomas [129]. This gene codes for miR-34a, which possesses tumor suppressive functions and is frequently downregulated in ovarian carcinomas [129]. Several studies showed that mutations in genes often mutated in ovarian carcinomas (see above), such as TP53, KRAS, BRAF, and PTEN, can lead to altered expression of miRNAs [130-132]; however, only TP53 and BRAF mutations have been proven to alter miRNA expression in HGSC and LGSC, respectively [130].

In recent years, miRNAs have been proven to be valid prognostic and predictive markers in ovarian cancer. miR-9 upregulation was associated with improved outcome [133] and sensitivity to cisplatin [134]. Low expression of the miR-200 family of miRNAs, on the other hand, was associated with paclitaxel resistance (Figure 7) [135]. Numerous studies have also demonstrated the clinical relevance of cell-free circulating miRNAs in ovarian cancer [136]. Circulating miRNAs belonging to the miR-200c family were found to have a potential application as diagnostic markers of ovarian carcinomas. These miRNAs were found in exosomes as well as free in the serum of ovarian cancer patients in different studies [137-139]. miR-21 was also found at high levels in the serum and correlated with an unfavorable prognosis [140, 141].

Figure 7. miR-200 family regulation of paclitaxel sensitivity. A) miRNAs belonging to the miR-200 family aided by ROS repress p38α and downregulate TUBB3 with nuclear HuR increasing this way sensitivity to paclitaxel. B) In chemoresistant cells, miR-200 miRNAs are expressed at low levels impairing the sensitivity mechanisms. Adapted with permission.

High-mobility group AT-hook 2 (HMGA2)

The HMGA2 gene belongs, together with the three homologs HMGA1a, HMGA1b, and HMGA1c, to a family of architectural transcription factors named high-mobility group AT-hook. HMGA proteins are characterized by three DNA-binding domains, called AT-hooks, located at the N-terminal region, and an acidic carboxy-terminal tail. These proteins are mainly involved in various chromosome and chromatin dynamics as well as in gene regulation [142]. HMGA2 proteins do not possess direct transcriptional activation capacity but act as regulators of gene transcription by controlling the global structure of large domains of chromatin. HMGA2 binds to the scaffold-attachment regions (SARs) at the base of repressed chromatin loops and causes concomitant displacement/exclusion of histone H1 from these sequences leading to a local opening of chromatin and initiation of the gene activation process. This initial binding is followed by a propagation of HMGA2 binding throughout the entire chromatin region that makes the structure accessible allowing gene transcription [143]. In this way, HMGA2 regulates the expression of a wide number of genes that are involved in different biological processes including cell growth, proliferation, differentiation, and death [142].

HMGA2 regulation

HMGA2 plays a major role in embryonal development [144, 145]. It is rarely expressed in normal, adult tissues where tight post-transcriptional regulation prevents its expression. The HMGA2 gene is located on chromosome band 12q14 and contains five exons dispersed over a genomic region ≥ 160 kb. The main regulation sites of HMGA2 are in the promoter region, which contains multiple transcription initiation sites [146], and a 2.9 kb long highly-conserved 3’UTR. Early studies demonstrated that the 3’UTR is essential for HMGA2 regulation [147, 148]. A systematic analysis performed with advanced methodologies mapped the entire HMGA2 3’UTR at high resolution, showing that this region possesses a large number of independent regulatory elements [149]. Many of these regulatory elements are binding sites for miRNAs such as the let-7 family [116, 150], miR-30c [151], and the miR-33 family [152-154], which are all able to repress HMGA2. It is widely accepted that miRNA-mediated regulation is the main mechanism of regulation of HMGA2 (Figure 8A). Evidence in favor of this hypothesis comes from the study of HMGA2 deregulation in neoplasia. The gene is expressed in tumors of different natures (benign and malignant) and origins (mesenchymal and epithelial). High expression levels of the gene are correlated with poor clinical outcome and a high rate of metastasis in different types of tumors such as lung cancer [155], colon cancer [156], and HGSC [157]. Interestingly, high expression levels of HMGA2 were also found in leukaemias [158, 159]. Two main mechanisms cause HMGA2 deregulation in neoplasia: one is disruption of the HMGA2 locus due to chromosomal aberrations; the other is the downregulation of miRNAs that repress HMGA2. Both mechanisms circumvent miRNA-dependent repression and lead to HMGA2 expression that is enhanced by signaling pathways usually active in cancer such as the Raf/ERK/MEK [160] and WNT10B/β-catenin pathways [161].

Several cytogenetic studies have shown chromosomal rearrangements targeting the region 12q13~15 (where the gene HMGA2 is located) with a number of different partner chromosomes in various tumors of mainly mesenchymal origin [162], such as lipomas [163, 164], myolipomas [165], uterine leiomyomas [166], pulmonary chondroid hamartomas [167], pleomorphic adenomas of the salivary glands [168], and, less commonly, acute myeloid leukaemia [169] and polycythemia vera [159]. The HMGA2 gene is the target of these chromosomal aberrations, with breakpoints preferentially clustering to the large third intron of the gene (>140 kb), more rarely to the fourth intron, resulting in HMGA2 truncation or generation of an HMGA2-chimeric transcript. The result of the HMGA2 fusion or chimeric transcript is a truncated but functional protein [170] (since the AT-hook domains are encoded by the first three exons) that can evade repression because the transcript lacks the 3’UTR (Figure 8B).

Figure 8. Mechanism of HMGA2 upregulation caused by chromosomal aberrations. A) In normal adult cells, HMGA2 is repressed by a miRNA-mediated mechanism. B) Disruption of the HMGA2 locus resulting in a truncated transcript that evades miRNA-induced silencing. The result is a functional but truncated protein.

Several miRNAs targeting HMGA2 are downregulated in different types of tumors. Loss of these miRNAs is usually correlated with HMGA2 upregulation and such loss seems to be the main cause of high HMGA2 expression in epithelial cancers (Table 1), where HMGA2 disruption seems to be rare [171] (Paper I).

Table 1. List of miRNAs targeting HMGA2 in different types of cancer

miRNAs          Tumor type                              References
let-7 family    colorectal cancer                       [172]
let-7 family    glioblastoma                            [173]
let-7 family    hepatocellular carcinoma                [174]
let-7 family    lung cancer                             [151, 175]
let-7 family    HGSC                                    [176]
let-7 family    squamous cell carcinoma of the vulva    [177]
miR-26a         gallbladder cancer                      [178]
miR-26a         non-small cell lung cancer              [179]
miR-30c         breast cancer                           [180]
miR-30c         squamous cell carcinoma of the vulva    [181]
miR-33b         breast cancer                           [153]
miR-33b         melanoma                                [182]
miR-101         pancreatic cancer                       [183]
miR-154         prostatic cancer                        [184]
miR-204         thyroid cancer                          [185]
miR-485         bladder cancer                          [186]
miR-543         colorectal cancer                       [187]

HMGA2 role in cancer

Early studies showed the transforming effect of both full length [188] and truncated HMGA2 [170]. Many studies have focused on the oncogenic role of HMGA2 through four main processes: cell proliferation, DNA damage repair, epithelial-mesenchymal transition (EMT), and stem cell self-renewal.

HMGA2 promotes cell proliferation by targeting a set of cell-cycle related genes such as cyclin B2 (CCNB2) [189], cyclin A1 (CCNA1) [190], and cyclin dependent kinase inhibitor 1A (CDKN1A) [191]. Fedele et al. (196) showed that HMGA2 promotes E2F1 expression and influences its activity affecting the expression of downstream genes such as aspartokinase (ASK), cyclin B1 (CCNB1), and aurora kinase B (AURKB).

HMGA2 is able to repress the non-homologous end joining DNA repair mechanism by impairing the binding of DNA-dependent kinases to double-strand break sites [192]. HMGA2 can also repress ERCC1 gene expression, thereby affecting nucleotide excision repair [193].

HMGA2 is one of the main drivers of tumor metastasis as it controls the expression of the main EMT switches. Morishita et al. [171] showed that HMGA2 is expressed by tumor cells located in the invasive front of a primary tumor and in its metastases, where it enhances TGFβ signaling by promoting the expression of TGFβRII. The gene can activate Wnt/β-catenin signaling by promoting the expression of the Axin-1 (AXIN1) and Twist-related protein 1 (TWIST1) genes, which are responsible for β-catenin activation and its translocation to the nucleus [194]. HMGA2 also regulates the expression of Snail 1 (SNAI1) [195] and zinc finger E-box binding homeobox 1 (ZEB1) [196] and of many other EMT-related genes such as lumican (LUM), vimentin (VIM), inhibitor of DNA binding (ID1), and wingless type 2 (WNT2) [191].

HMGA2 participates in the development of different organs [197-200]. Particularly, it regulates key pluripotency genes such as POU class 5 homeobox 1 (POU5F1), SRY-box 2 (SOX2), and Nanog homeobox (NANOG), modulating the chromatin structure near these loci [201].

Part 2: Materials and methods

Patient material

The cohort presented in this thesis consists of 193 samples of ovarian tumors collected at the Norwegian Radium Hospital in Oslo, Norway, between 1999 and 2010. The series is composed of 39 sex-cord stromal tumors (19 Thf, 16 F, two granulosa cell tumors, and two teratomas), 22 borderline epithelial tumors, and 126 carcinomas (56 HGSC, 30 EC, 18 MC, 14 LGSC, 12 CCC, five mixed carcinoma subtypes, and one undifferentiated carcinoma). All the samples are part of a tumor biobank that has been registered according to Norwegian legislation and approved by the Regional Committee for Medical Research Ethics in South-East Norway. The patients included in this work gave written informed consent at the time of surgery to participate in research studies on gynecological tumors. All samples have been examined by pathologists of the Department of Pathology at the Norwegian Radium Hospital (Oslo University Hospital) and some of these were cytogenetically analyzed at the Section for Cancer Cytogenetics at the Norwegian Radium Hospital.

In Paper I, a series of 187 samples were analyzed, consisting of 39 sex-cord stromal tumors (19 Thf, 16 F, two granulosa cell tumors, and two teratomas), 22 borderline epithelial tumors, and 126 carcinomas (56 HGSC, 30 EC, 18 MC, 14 LGSC, 12 CCC, five mixed-type carcinomas, and one undifferentiated carcinoma).

In Paper II, a series of 165 ovarian tumors were studied, consisting of 30 sex-cord stromal tumors (16 Thf and 14 F), 22 borderline epithelial tumors, and 103 carcinomas (35 HGSC, 30 EC, 16 MC, 10 LGSC, and 12 CCC).

In Paper III, 89 tumors (53 HGSC, 17 EC, 7 LGSC, five CCC, four borderline epithelial tumors, and three MC) were investigated by means of next generation sequencing (NGS), more specifically miRNA-sequencing (miRNA-seq). Sixty-six additional samples (10 HGSC, 12 MC, 10 EC, 10 CCC, 10 Thf, 10 F, and four LGSC) were used as an independent cohort to validate the results from miRNA-seq and bioinformatic analyses.

The series of Paper IV consisted of 18 samples of HGSC and one undifferentiated carcinoma characterized at chromosomal level by rearrangement of . These samples were sequenced (RNA-sequencing) and analyzed with exon-level expression microarray.

In Paper V, 34 ovarian tumors (14 HGSC, 9 EC, 4 CCC, 3 MC, 2 LGSC, and 2 borderline) were studied by means of RNA-sequencing. An additional series of 113 tumors (10 F, 10 Thf, 10 borderline epithelial tumors, and 83 carcinomas, of which 35 were HGSC, 18 EC, 16 MC, 10 CCC, and 4 LGSC) was used to test whether the novel fusion transcripts identified by RNA-seq were recurrent.

Methods

In this section, an overview is provided of the methods used in the different subprojects.

Since cytogenetic analyses were used only in Papers II and IV, cytogenetic methods will only be briefly introduced. The candidate did not perform any cytogenetic analysis himself; the data were provided by the diagnostic part of the section or by research projects performed by other members of our group over the years. These data were nevertheless the backbone and starting point for some of the research subprojects that are included in this thesis.

Cancer cytogenetics

The discovery of the Philadelphia chromosome by Nowell and Hungerford in 1960 [202] stimulated interest in cancer cytogenetics. Once banding was invented and karyotyping techniques refined in the 1970s [203], the field of cancer cytogenetics quickly developed and provided discoveries that massively increased our knowledge about tumorigenesis. Today, cancer cytogenetics is still contributing to this growing understanding using a combination of classical banding techniques and more modern molecular cytogenetic methods, such as fluorescence in situ hybridization (FISH) and comparative genomic hybridization (CGH).

Short-term cultures and karyotyping of solid tumors

Fresh tumor samples must be short-term cultured before karyotyping [204]. Briefly, the samples are disrupted mechanically and enzymatically before cells are cultured in a suitable medium at 37 °C. Cell proliferation is monitored and when cultures are about to reach confluency, colcemid is added to arrest the dividing cells in metaphase. The cells are harvested and suspended in a hypotonic solution of potassium chloride, which induces them to swell. Cells are then fixed with a solution of acetic acid and methanol. When they are later dropped onto microscopy slides, the cells burst and the metaphases spread out on the glass surface. Finally, the cells are stained using Wright’s stain. The metaphases are then cytogenetically analyzed and the karyotype is written according to the ISCN system [205].

Fluorescence in situ hybridization (FISH)

Molecular cytogenetic techniques are based on fluorescence in situ hybridization (FISH). This technology was designed to provide a description of chromosome structure at a resolution exceeding that of microscopic analysis and thus bridges the gap between cytogenetic and molecular approaches. FISH techniques were first introduced in the early 1980s and refined during the next decade [206]. The methodology is based on the use of fluorescently labeled, DNA-specific probes that bind to the genomic site of interest because of base-pair complementarity. The technique can be used on metaphase spreads and/or interphase nuclei. The resolution varies from whole chromosomes in metaphase spreads (resolution of 5-10 Mb per band), to the analysis of interphase nuclei by locus-specific probes (50 kb–2 Mb), and to the level of chromatin strands (5 kb–500 kb) using fiber FISH. Moreover, the use of DNA microarray FISH now provides even higher resolution, down to the single-nucleotide level.

Comparative genomic hybridization (CGH)

The preparation of high-quality metaphase spreads, especially from solid tumors, is often difficult. Comparative genomic hybridization (CGH) was developed to overcome this problem as well as the presence of many aberrations incompletely described by karyotyping [206]. DNA is extracted directly from the tumor, and DNA from a normal sample is used as a reference. The two DNAs are differentially labeled using two different fluorophores, usually green for the tumor sample and red for the reference. The two probes are then applied simultaneously to normal metaphase chromosomes where they compete for complementary hybridization sites. If a region is gained or amplified in the tumor sample, the corresponding region on the metaphase chromosome becomes predominantly green. Conversely, if a region is deleted in the tumor sample, the corresponding region becomes red. The ratios of tumor vs normal fluorescence along the chromosomes are quantified using digital image analysis. Gains and/or amplifications in the tumor DNA are identified as chromosomal regions with increased fluorescence ratios, whereas losses result in a reduced ratio. One of the main advantages of CGH is that it requires no a priori knowledge of the chromosome rearrangements present in a genome. However, CGH cannot detect balanced rearrangements, changes in ploidy, and/or intratumor heterogeneity. Furthermore, tumor samples admixed with stromal or normal tissue in a percentage higher than 50 % can create a problem as imbalances may remain undetected.

Despite these limitations, CGH has become one of the most widely used molecular cytogenetic techniques in both basic research and diagnostics [206, 207]. It has increased our knowledge of cancer biology, revealing that many tumors display specific patterns of chromosomal imbalances [208, 209].
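The ratio-based reading of a CGH experiment described above can be illustrated with a minimal Python sketch. The thresholds (1.25 for gains, 0.75 for losses), the region names, and the intensity values are assumptions chosen for illustration and are not taken from this thesis. The second function shows, under a simple mixing model, why admixture of normal cells pulls the ratio back towards 1.

```python
def classify_cgh_regions(intensities, gain_threshold=1.25, loss_threshold=0.75):
    """Call gain, loss, or balanced from tumor/reference fluorescence ratios.

    intensities maps a region name to (tumor_signal, reference_signal).
    The thresholds are hypothetical cut-offs used only for illustration.
    """
    calls = {}
    for region, (tumor, reference) in intensities.items():
        if reference <= 0:
            raise ValueError(f"reference signal must be positive for {region}")
        ratio = tumor / reference
        if ratio >= gain_threshold:
            calls[region] = "gain"
        elif ratio <= loss_threshold:
            calls[region] = "loss"
        else:
            calls[region] = "balanced"
    return calls


def expected_ratio(tumor_copies, tumor_fraction, normal_copies=2):
    """Expected ratio when only tumor_fraction of the cells carry the change."""
    mixed = tumor_fraction * tumor_copies + (1.0 - tumor_fraction) * normal_copies
    return mixed / normal_copies


# Hypothetical intensities: (tumor, reference) per chromosomal region.
example = {
    "8q24": (1800.0, 1000.0),
    "17p13": (550.0, 1000.0),
    "2p16": (1020.0, 1000.0),
}
calls = classify_cgh_regions(example)

# A single-copy loss present in only 40% of the cells gives a ratio near 0.8,
# which stays inside the "balanced" window of this sketch.
diluted_loss = expected_ratio(tumor_copies=1, tumor_fraction=0.4)
```

In this toy model, a heterozygous deletion in a sample with 60% normal cells is not called, which mirrors the limitation noted above for heavily admixed samples.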

The methods described in the following are those used to perform the different subprojects presented in this thesis.

Polymerase chain reaction (PCR) based techniques: brief history of a “special” DNA polymerase

The story of PCR began in 1969, when the Norwegian researcher Kjell Kleppe described a process of in vitro DNA amplification involving oligonucleotide primers and DNA polymerase at the Gordon Conference [210]. The paternity of PCR is nevertheless attributed to Kary Mullis, a researcher at Cetus Corporation who refined the PCR methodology in the late 1980s [210]. Kary Mullis and colleagues were the first to use a pair of oligonucleotide primers and the thermostable DNA polymerase Taq for in vitro DNA amplification [211]. Although this discovery represented significant progress for biology, Taq polymerase was not perfect: it was unstable at high temperatures, error prone, and had difficulty amplifying DNA that was rich in GC content or had strong secondary structures. These problems were finally overcome in 2003 with the creation of next-generation engineered DNA polymerases, which made it possible to perform high-fidelity PCR [212]. The development of these next-generation polymerases, alongside the development of both Real-Time and digital PCR technologies [213], made PCR-based methods the indispensable research tools that they are today.

PCR

As the name says, the PCR principle is based on a chain reaction in which one DNA molecule is used to produce two copies, then four, then eight, and so forth. The technique amplifies a specific sequence of DNA. Most PCR methods amplify DNA fragments of between 0.1 and 10 kilobase pairs (kbp), although some techniques allow amplification of fragments up to 40 kbp in size [214]. The reaction requires a mix containing the DNA polymerase, a buffer suited to the reaction and to the type of enzyme, the deoxynucleotide triphosphates (dNTPs) used by the enzyme for amplification, and the primer set. The primers are short strands of DNA, generally about 19–20 nucleotides long, which serve as a starting point for the amplification reaction. They are required for DNA replication because the enzymes that catalyze the process, the DNA polymerases, can only add new nucleotides to an existing strand of DNA. The design of the primers is crucial for the result of the PCR, as they must be highly specific for the targeted DNA sequence; otherwise, the polymerase will also amplify other regions of the genome, leading to unspecific products [215]. Different tools are used to design highly specific primers, such as Primer3 (http://primer3.ut.ee/) and Primer-BLAST (https://www.ncbi.nlm.nih.gov/tools/primer-blast/).
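To make the notion of primer properties more concrete, the Python sketch below computes two quantities that are commonly inspected during primer design: the GC content and a rough melting temperature from the Wallace rule (2 °C per A/T and 4 °C per G/C). This rule of thumb applies to short oligonucleotides only, and it is not claimed to be the method implemented by Primer3 or Primer-BLAST. The primer sequence and the function names are hypothetical.

```python
def _validated(primer):
    """Return the primer in upper case, rejecting non-ACGT characters."""
    primer = primer.upper()
    if not primer or set(primer) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    return primer


def gc_content(primer):
    """Fraction of G and C nucleotides in the primer."""
    primer = _validated(primer)
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer):
    """Approximate melting temperature in degrees Celsius (Wallace rule)."""
    primer = _validated(primer)
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


# Hypothetical 20-nucleotide primer used only as an example.
primer = "AGCTGACCTGAAGTCCATGA"
primer_gc = gc_content(primer)
primer_tm = wallace_tm(primer)
```

Dedicated design tools additionally check specificity against the genome, self-complementarity, and primer-pair compatibility, none of which is covered by this sketch.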

The typical procedure of a PCR consists of a series of 20–40 repeated temperature changes, called cycles, with each cycle commonly consisting of 2–3 discrete temperature steps. The typical steps of a PCR are shown in Figure 9.

Figure 9. Polymerase chain reaction cycling. Figure publicly available at Wikimedia commons.
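The doubling principle described above can be written as a one-line model, shown here as a minimal Python sketch. The per-cycle efficiency parameter is an assumption added for illustration: a value of 1.0 corresponds to perfect doubling, whereas real reactions are less efficient and eventually reach a plateau that this model ignores.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized number of copies after a given number of PCR cycles.

    efficiency is the fraction of templates copied per cycle (0 to 1);
    1.0 means every molecule is duplicated in every cycle.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# One template molecule followed over a few cycle numbers.
for cycles in (1, 2, 3, 30):
    print(cycles, pcr_copies(1, cycles))
```

With perfect doubling, a single template yields more than a billion copies after 30 cycles, which is why 20–40 cycles are sufficient in practice.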

Reverse transcription PCR (RT-PCR)

Reverse transcription PCR (RT-PCR) is a PCR-based technique commonly used to investigate RNA expression and alterations such as splicing variants. The RNA template is first converted into complementary DNA (cDNA) using a reverse transcriptase. The cDNA is then used as a template for exponential amplification by PCR. Most commercially available reverse transcriptases are derived from Avian Myeloblastosis Virus (AMV), Moloney Murine Leukemia Virus (MMLV), and/or Human Immunodeficiency Virus (HIV) [216].

Rapid amplification of cDNA ends (RACE)

This application of PCR is used to obtain the cDNA of a target transcript when only one of its mRNA ends is known. The technique is frequently used to study the 3’ regulatory region of mRNAs [217, 218], as well as chimeric transcripts [163, 165]. In Paper I, 3’ RACE-PCR was used to characterize the truncated forms of HMGA2. The 3’ variant uses the poly(A) tail for priming the reverse transcription. cDNAs are generated using an oligo-dT-adaptor primer that complements the poly(A) stretch and adds a special adaptor sequence to the 3’ end of each cDNA. PCR is then used to amplify the obtained cDNA using an antisense primer complementary to the adaptor sequence and a specific primer designed in the known 5’ region. A second, nested PCR is used to eliminate unspecific products that may have been generated in the first round of PCR.

Real-Time quantitative PCR (Real-Time qPCR)

In 1992, Higuchi et al. [219] devised a technique to analyze the kinetics of PCR reactions. They added ethidium bromide to the PCR mix and built a thermocycler that was able to irradiate the samples with ultraviolet light and detect the fluorescence generated. They also created an algorithm that could correlate the fluorescence with the DNA amplification. They called this technique Kinetic PCR; today it is known as Real-Time quantitative PCR. The technique is a variant of RT-PCR that is mainly used for gene expression studies; however, it also has other applications, such as the quantification of pathogens [220], methylation analyses [221], and genotyping [222]. The combination of excellent sensitivity and specificity, reproducible data, low contamination risk, and reduced hands-on time has made Real-Time qPCR technology a pillar of genetic research.

Basics of Real-Time qPCR

The Real-Time qPCR instrument consists of a thermal cycler with an integrated excitation light source, a fluorescence detection system, and software that displays the recorded fluorescence data as a DNA amplification curve. There are two main approaches to DNA analysis by qPCR: 1) methods that detect both specific and non-specific amplified products using dsDNA-binding dyes, and 2) methods that detect only specific PCR products using fluorophore-linked oligonucleotides. SYBR® Green is the most widely used dye that binds unspecifically to dsDNA. Though used in many applications, it requires melt curve analysis at the end of the amplification cycles in order to verify the specificity of the amplified fragment, which increases the time of analysis. In contrast, a fluorescently labeled oligonucleotide (probe) has a DNA sequence specific for the target gene and is usually linked to two different fluorophores: the donor/reporter and the acceptor/quencher. When a reporter fluorophore absorbs energy from light, it rises to an excited state. Its return to the ground state is accompanied by the emission of energy as fluorescence. When the probe is not bound to target DNA, the fluorophores are at a distance of some 10 to 100 angstrom and the energy emitted by the reporter is absorbed by the quencher; no fluorescence signal is emitted by probes that are not involved in amplification. On the other hand, the degradation of the probe during amplification releases the reporter and separates it from the quencher, thus relieving the quenching effect and allowing the reporter to fluoresce. This fluorescence is then detected by the instrument. The fluorescence detected in the quantitative PCR thermal cycler is directly proportional to the amount of fluorophore released and thus to the amount of DNA amplified in the PCR reaction. Different probes and chemistries are used in Real-Time qPCR, and there are also different strategies for using fluorophore-linked oligonucleotides. For the projects included in this thesis, the reliable TaqMan probes were used. The TaqMan principle relies on the 5´ to 3´ exonuclease activity of Taq polymerase to cleave the labeled probe after its hybridization to the complementary target sequence. In brief, as the Taq polymerase extends the primer and synthesizes the nascent strand, its 5´ to 3´ exonuclease activity degrades the probe that has annealed to the template (Figure 10). This mechanism is called the “Pac-Man principle” after the video game of the same name, and it is also the reason behind the name TaqMan (Taq polymerase + Pac-Man = TaqMan).

Figure 10. Schematic representation of the “Pac-Man principle”. Figure publicly available at Wikimedia commons.

Methylation-specific qPCR (MSP-qPCR)

Precise mapping of DNA methylation patterns in CpG islands has become essential for understanding diverse biological processes. Aberrant DNA methylation is frequent in cancer and is often central to tumorigenesis [223]. MSP-qPCR is a quantitative methylation assay that can be used to study the methylation of the CpG islands of a gene of interest. MSP requires only small quantities of DNA and is sensitive to 0.1 % methylated alleles of a given CpG island locus [224]. The technique entails the modification of DNA by sodium bisulfite, which converts all unmethylated, but not methylated, cytosines to uracil. A qPCR is then performed using primers specific for methylated and unmethylated DNA. The qPCR results are evaluated using melting curve analysis [225]. Melting curves are generated by continuous acquisition of fluorescence from the samples while they are subjected to a linear temperature gradient. For basic analyses, the melting curves can be converted to peaks by plotting the negative derivative of fluorescence with respect to temperature (–dF/dT) versus temperature. In methylation studies, two peaks, one unmethylated and one methylated, are obtained from standard samples. The methylation status of the template sample is assessed by comparing the peak of the template with the peaks of the standards. Different melt curve peaks, obtained using different ratios of methylated/unmethylated standards, can be generated to obtain a rough estimate of the methylation ratio of the template (Figure 11).

Figure 11. Example of a melting curve analysis plot. The methylated (red) and unmethylated (green) references are used to quantify the methylation status of the template. Shown are an unmethylated (blue), a partially methylated (light blue), and a completely methylated (orange) template. Figure adapted from Wikimedia commons.
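The conversion of a melting curve into a peak can be sketched in Python as follows. The curves are simulated with a simple sigmoid, and the melting temperatures of 78 °C and 83 °C are illustrative assumptions rather than measured values; they only reflect the expectation that amplicons from methylated templates retain more cytosines after bisulfite treatment and therefore melt at a higher temperature. The function names are hypothetical.

```python
import math


def simulated_melt_curve(temperatures, tm, width=1.0):
    """Simulated dye fluorescence that drops sigmoidally around tm."""
    return [1.0 / (1.0 + math.exp((t - tm) / width)) for t in temperatures]


def negative_derivative(temperatures, fluorescence):
    """Central-difference estimate of -dF/dT at the interior points."""
    temps, values = [], []
    for i in range(1, len(temperatures) - 1):
        slope = (fluorescence[i + 1] - fluorescence[i - 1]) / (
            temperatures[i + 1] - temperatures[i - 1]
        )
        temps.append(temperatures[i])
        values.append(-slope)
    return temps, values


def melting_peak(temperatures, fluorescence):
    """Temperature at which -dF/dT is largest."""
    temps, values = negative_derivative(temperatures, fluorescence)
    best = max(range(len(values)), key=values.__getitem__)
    return temps[best]


# Linear temperature gradient from 70 to 90 degrees Celsius in 0.5 degree steps.
temperatures = [70.0 + 0.5 * i for i in range(41)]
unmethylated = simulated_melt_curve(temperatures, tm=78.0)
methylated = simulated_melt_curve(temperatures, tm=83.0)

peak_unmethylated = melting_peak(temperatures, unmethylated)
peak_methylated = melting_peak(temperatures, methylated)
```

A template of unknown status would be processed in the same way and its peak position compared with those of the two standards.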

Nucleotide sequencing

Watson and Crick solved the three-dimensional structure of DNA in 1953, but the ability to sequence DNA did not follow for some time. During the study of bacterial RNA, Fred Sanger and colleagues developed, in 1965, a technique based on the detection of radiolabeled, partially digested fragments after two-dimensional fractionation [226]. This method, subsequently named Sanger sequencing, was the main sequencing technique for many years and is still widely used. Sanger sequencing was modified and improved over the years and was never abandoned [227-229]. Only in recent years did work begun in the 1980s [230, 231] result in a new breakthrough in how sequencing is performed: the birth of high-throughput sequencing technology [232].

Sanger sequencing

“Knowledge of sequences could contribute much to our understanding of living matter.”

— Frederick Sanger

Sanger sequencing was originally based on four DNA polymerization reactions, one for each base, carried out using radiolabelled dideoxynucleotides (ddNTPs). The ddNTPs are included in the reaction at low concentrations. Once incorporated, they prevent further extension; sequence fragments with 3′ truncations are thus generated, as a ddNTP is randomly incorporated at a particular instance of that base. Different chemical treatments are then used to selectively remove the base from a small proportion of DNA sites. The fragments generated can be visualized using gel electrophoresis: sequences are then inferred by reading ‘up’ the gel, as the shorter DNA fragments migrate fastest. Nowadays, radiolabelled ddNTPs have been replaced by fluorescently labelled ddNTPs, and the whole process is capillary-based and semi-automated. The sequence of the template is determined by high-resolution electrophoretic separation of the end-labelled extension products in a capillary-based polymer. The fluorescent labels are excited by lasers that are coupled to a four-color detection system. This provides the readout that is represented in a Sanger sequencing trace, and the software translates the traces into a DNA sequence.
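The logic of reading a sequencing gel can be sketched in a few lines of Python. The example is an idealized illustration which assumes that every possible termination product is present and perfectly resolved by size; the function names and the example strand are hypothetical.

```python
def termination_fragments(strand):
    """Fragment lengths produced in each of the four ddNTP reactions.

    For every position in the newly synthesized strand, the reaction
    containing the matching ddNTP yields a fragment ending at that position.
    """
    strand = strand.upper()
    if set(strand) - set("ACGT"):
        raise ValueError("strand must contain only A, C, G, T")
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(strand, start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Infer the sequence by reading fragments from shortest to longest."""
    base_by_length = {
        length: base for base, lengths in lanes.items() for length in lengths
    }
    return "".join(base_by_length[length] for length in sorted(base_by_length))


lanes = termination_fragments("GATTACA")
sequence = read_gel(lanes)
```

In the capillary-based variant, the four lanes are replaced by four fluorescent colors in a single separation, but the ordering by fragment length is the same.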

High-throughput sequencing (HTS)

High-throughput sequencing (HTS) technologies emerged in the early years of the new millennium [232-234]. The methods have since been improved, and new technologies have been developed [235]. Even though these sequencing technologies differ in terms of biochemistry and library preparation, their workflows are quite similar. The projects presented in this thesis used Illumina as the sequencing platform. The Illumina sequencing workflow consists of three main steps. The first step is library preparation, in which DNA is fragmented and ligated to adaptors. The fragments are then amplified using solid-phase bridge amplification (Figure 12).

Figure 12. The solid-phase bridge amplification used by the Illumina sequencing platform.

Adapted with permission.

In brief, the adaptors attach to specific anchors attached to the solid surface, the free ends hybridize with the complementary anchors forming a “bridge” structure, and then amplification begins. The bridge structure immobilizes the amplicons, causing the formation of clonal clusters containing around 1000 copies of the same fragment. The sequencing step is performed using cyclic reversible termination [236] (Figure 13). The technique uses reversible terminators in a cyclic method that comprises nucleotide incorporation, fluorescence imaging, and cleavage. In the first step, a DNA polymerase bound to the primed template adds one fluorescently modified nucleotide, which is the complement of the template base. Following incorporation, a special camera detects the fluorescence signal, identifying the incorporated nucleotide and thereby, cycle by cycle, determining the sequence of the fragment. The imaging step is followed by a cleavage step, which removes the fluorescent label and the terminating group from the incorporated nucleotide. A washing step is performed before starting the next incorporation. At the end, the sequencer generates a data file containing the sequences of all the fragments analyzed (called reads). This final step will be discussed in detail below.

Figure 13. Illumina (Solexa) sequencing. Image adapted with permission.

High-throughput RNA-sequencing (RNA-seq)

HTS technologies have lately been applied routinely to a wide range of important topics in biology and medicine. High-throughput RNA-sequencing (RNA-seq) is one of these applications. It is the only sequencing technology used in the projects described in this thesis. The method employs high-throughput sequencing of cDNA fragments generated from a library of total or fractionated RNA. The preparation of the library depends on the nature of the RNA molecules to be analyzed, e.g., mRNA or miRNA. The most widely used type of RNA-seq is paired-end sequencing, which sequences both ends of a fragment and generates high-quality, alignable sequence data. Paired-end RNA-seq allows unambiguous mapping to unique regions of the genome and hence improves the quality of the dataset. RNA-seq allows the precise quantification of transcripts and exons, as well as the analysis of transcript isoforms, with at least a 5000-fold dynamic range [237]. Not only is RNA-seq able to quantify more accurately the transcriptome consisting of known genes, it is also a great tool for identifying new transcript isoforms [238], new fusion genes [239, 240], and novel non-coding RNAs such as lncRNAs [241] and miRNAs [242]. RNA-seq has been the most powerful tool for miRNA profiling, as it allows precise detection of both novel and known miRNAs. RNA-seq can distinguish between miRNAs that differ by a single nucleotide, as well as isomiRs of varying length [243].

Bioinformatic analyses

A multitude of informatic tools have been developed to use the information encoded in the data generated by HTS. Complex algorithms are required to analyze every single read, which will eventually lead to new genetic findings. In this chapter, the basic downstream pipeline for the analysis of HTS data is illustrated. The bioinformatic analyses involve different steps and are divided into primary, secondary, and tertiary analyses. Primary and secondary analyses are usually common to every DNA/RNA sequencing bioinformatics pipeline. The tertiary analyses are more specific and are chosen according to the purpose of the research study. The software for primary analyses is nowadays usually integrated with the sequencing instruments. First, these tools convert the fluorescence intensity data of each cluster into nucleotide bases (base calling). Then this process is checked in order to assess the quality of the sequencing. The quality is scored using the Phred scale [244], which is logarithmically related to the base-calling error probability (P) and is described by the formula $Q = -10 \log_{10} P$, where a Q value of 20 corresponds to a base-calling accuracy of 99%. The secondary analyses are those usually performed by the user. Depending on the nature of the study, these analyses can occur at the level of the genome, exome, transcriptome, or selected gene panels [245]. The secondary analyses begin with the collective alignment of reads to a reference human genome. Once this step has been performed, several refinement steps are often needed [246] before proceeding to the tertiary analysis. Another type of secondary analysis commonly used in genomics is variant calling to identify new mutations. The classes of genomic variation profiled can vary and include single nucleotide variants, small insertions and deletions, and larger alterations such as structural rearrangements and copy number changes. Variant calling errors are common, as HTS is inherently less accurate than traditional sequencing methods and artifacts therefore occur with greater regularity. This problem is partially corrected by increasing the sequencing depth (i.e., sequencing each base position several times). Different algorithms perform the tertiary analysis. The majority of these tools are involved in the annotation of the variants detected by variant calling. This process is important to ensure that the variants detected have biological importance.
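The Phred relation given above can be checked numerically with a short Python sketch. The decoding function assumes the Phred+33 ASCII encoding commonly used in FASTQ files, which is not specified in the text; the function names and the example quality string are hypothetical.

```python
import math


def phred_from_error(error_probability):
    """Phred quality score Q = -10 * log10(P)."""
    if not 0.0 < error_probability <= 1.0:
        raise ValueError("error probability must be in the interval (0, 1]")
    return -10.0 * math.log10(error_probability)


def error_from_phred(quality):
    """Base-calling error probability P = 10 ** (-Q / 10)."""
    return 10.0 ** (-quality / 10.0)


def decode_phred33(quality_string):
    """Decode a FASTQ quality string, assuming the Phred+33 offset."""
    return [ord(character) - 33 for character in quality_string]


q_for_one_percent = phred_from_error(0.01)   # error rate of 1 in 100
accuracy_at_q30 = 1.0 - error_from_phred(30)  # fraction of correct calls
qualities = decode_phred33("I5!")             # hypothetical quality string
```

Each increase of 10 in Q corresponds to a tenfold reduction of the error probability, so Q30 implies one expected error in 1000 base calls.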

Specific bioinformatic analyses for RNA-seq

The pipeline for bioinformatic analyses of RNA-seq data follows the workflow described in the previous paragraph [247]. The primary analyses are basically the same, while the secondary analyses for RNA-seq differ from the ones used for DNA sequencing. In fact, the reads from RNA-seq contain only exonic sequences, many of which will map to two different neighboring exons. As a consequence, alignment algorithms were created that permit gaps in the alignment. The most widely used RNA-seq alignment tools are TopHat [248] and STAR [249], which were used in the present project (Paper III, Paper IV, Paper V). The tertiary analyses vary and correspond to the different applications of RNA-seq. The most common application of RNA-seq is to estimate gene expression. This application is primarily based on the number of reads that map to each transcript sequence. The simplest approach to quantification is to aggregate raw counts of mapped reads. However, raw read counts alone are not sufficient to compare expression levels among samples, as these values are affected by transcript length, total number of reads, sequencing biases, etc. For this reason, the measure RPKM (reads per kilobase of exon model per million mapped reads) was introduced as a within-sample normalization method that removes the feature-length and library-size effects [247]. This measure and its derivative FPKM (fragments per kilobase of exon model per million mapped reads) are the most frequently reported RNA-seq gene expression values. There are different, sophisticated algorithms that estimate transcript-level expression while accounting for the biases that can affect the quantification. The most widely used tools are Cufflinks [250], kallisto (http://pachterlab.github.io/kallisto/), and Sailfish [251]. Differential expression analysis requires that gene expression values are compared among samples. RPKM and FPKM are the most important factors for comparing samples. There are different tools for differential expression analysis using different statistics. The Bioconductor package DESeq was used in the third subproject described here (Paper III) [252]. An additional tertiary analysis can be alternative splicing profiling [253]. This analysis can detect new isoforms and/or assess their expression. Many of the programs listed above, such as Cufflinks and TopHat, can be used for this purpose.
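The RPKM normalization described above amounts to dividing the read count by the feature length in kilobases and by the library size in millions of mapped reads. The Python sketch below is a minimal illustration; the gene names, read counts, exon lengths, and library size are hypothetical.

```python
def rpkm(read_count, feature_length_bp, total_mapped_reads):
    """Reads per kilobase of exon model per million mapped reads."""
    if feature_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("feature length and library size must be positive")
    kilobases = feature_length_bp / 1_000
    millions_of_reads = total_mapped_reads / 1_000_000
    return read_count / (kilobases * millions_of_reads)


# Hypothetical genes: (mapped reads, summed exon length in bp).
counts = {"GENE_A": (500, 2_000), "GENE_B": (500, 4_000)}
library_size = 10_000_000  # total mapped reads in the sample

values = {
    gene: rpkm(reads, length, library_size)
    for gene, (reads, length) in counts.items()
}
```

Both hypothetical genes have the same raw count, but the longer one receives half the RPKM value, which illustrates the removal of the feature-length effect.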

Another application of RNA-seq is miRNA profiling. Using miRNA-sequencing (miRNA-seq), it is possible to identify novel miRNAs and perform miRNA differential expression analyses. miRDeep [254] is the best known, and probably the most widely used, tool to identify new miRNAs. This program analyzes reads shorter than 30 bp and maps them to the genome, then calculates the reads’ probability of being miRNAs using specific algorithms. Furthermore, it uses the algorithm TargetScan to identify the possible target genes [254].

In recent years, RNA-seq has been used extensively to identify new fusion transcripts in cancer. More than 20 tools have been developed for such detection from RNA-seq data [239]. None of these algorithms is flawless, and there are still many issues associated with these tools. Many of these software packages require a huge amount of time and computational power to function properly. Moreover, they are not free from errors: the results change with the dataset, and they can miss true fusion events (false negatives) as well as create false positives. A few programs, like FusionCatcher [255] and Chimerascan [256], were “trained” to recognize and discard possible/probable false positives and the read-through transcripts that are present in normal cells. The analyses behind the subprojects described in this thesis were based on the use of four different algorithms: FusionCatcher, Chimerascan, FusionMap [257], and TopHat [258].

Part 3: Aims

We work from the premise that an understanding of the pathogenesis underlying cancer development is a prerequisite to finding specific medical treatments that counteract exactly those molecular rearrangements that render the cells neoplastic.

Knowledge about the genomic rearrangements that make cells neoplastic is central to such an understanding. We took as our starting point for some of the subprojects in this thesis ovarian tumors that we knew, based on prior cytogenetic and molecular cytogenetic investigations, carried a rearrangement we thought was interesting. The subsequent studies of this material relied on transcriptome analyses using RNA sequencing, Sanger sequencing, and PCR-based methods.

The main aim of this PhD project was to gain insight into the transcriptional consequences of these genomic rearrangements. Furthermore, we wanted to use the genomic and transcriptomic profile of each tumor to possibly identify new genetic and molecular features that characterize the different types of ovarian tumors, which may help to unravel the pathogenesis of these neoplasms. We particularly tried, where possible, to find a correlation between genomic alterations and tumorigenic molecular changes.

Since most studies of ovarian cancer have focused on HGSC, whereas the rarer types of carcinomas have not yet been extensively analyzed, we chose to work with relatively large series of tumors of other histotypes as well, from the very common HGSC to epithelial borderline tumors and sex cord-stromal tumors.

The specific aims of the studies presented in this thesis were:

• Paper I: To verify the presence in ovarian tumors of mutations, epigenetic changes, and genomic alterations of specific genes known to be involved in the pathogenesis of other types of tumors. We therefore checked for mutations in the genes IDH1, IDH2, and TERT, the methylation of the MGMT promoter, as well as the expression and truncation of HMGA2.

• Paper II: To characterize the regulation of the HMGA2 gene in ovarian tumors by studying the expression and regulation of let-7a and miR-30c, which are two miRNAs known to repress HMGA2 expression, as well as the expression levels of the enhancer of miR-30c expression (FHIT) and the let-7a repressors (LIN28A/B).

• Paper III: To identify new miRNA signatures that could be used as diagnostic markers for the different types of ovarian tumors by means of the latest sequencing technology, i.e., miRNA-sequencing.

• Paper IV: To find new candidate fusion genes arising from a specific chromosomal rearrangement involving chromosome 19. The study combined cytogenetics, exon-level expression, and transcriptome sequencing data.

• Paper V: To identify novel fusion genes in ovarian carcinomas.

Part 4: Results in brief

Paper I: A novel truncated form of HMGA2 in tumors of the ovaries.

The first project aimed to screen a series of 187 tumors belonging to different groups of ovarian neoplasia to verify the presence of specific genomic changes known to be involved in oncogenic mechanisms, i.e., mutations in the genes IDH1, IDH2, and TERT, as well as the methylation status of the MGMT promoter. Moreover, we focused on the detection of HMGA2 expression and the possible genomic alterations that can affect this locus. We investigated the mutations IDH1R100, IDH1R109, and IDH1R132 of IDH1, and IDH2R140, IDH2R149, and IDH2R172 of IDH2 in all the 185 samples from which DNA was available. We obtained informative results for all 185 tumors but did not find any mutation of IDH1/2 in our series. We checked for the possible presence of mutations in the TERT promoter by looking specifically at positions 228 and 250 (mutations known as C228T and C250T). We found four tumors, out of 185 investigated, harboring the C228T mutation. Interestingly, this mutation seems to be recurrent in fibromas, as it was found in three out of 16 such tumors analyzed (18.7 %). The fourth mutation was found in a borderline epithelial tumor sample. Analysis of the methylation status of the MGMT promoter gave informative results in 185 tumors, and two borderline tumors were found to be methylated. The expression of HMGA2 was investigated by means of RT-PCR in 161 tumors of the original series, and in 120 of them the gene was found to be expressed (74.5 %). Furthermore, we checked for possible truncation of the HMGA2 gene and found the gene truncated in 11 samples: five HGSC, three MC, two EC, and one borderline tumor. Interestingly, we found a novel truncated form of HMGA2 in HGSC samples in which the third exon was fused with different regions of the third intron.

Paper II: Genomic imbalances are involved in miR-30c and let-7a deregulation in ovarian tumors: implications for HMGA2 expression.

In line with the results of Paper I, we decided to assess the expression levels of HMGA2 in the different types of ovarian tumors present in our series. HMGA2 was expressed at high levels in the majority of tumors analyzed. The highest relative normalized expression levels of HMGA2 were found in tumors of the serous subgroup, where the high-grade group had a mean value of 74.3 and a median of 31, whereas the LGSC showed a mean value of 67.6 and a median of 27.2. Interestingly, HMGA2 was not expressed at high levels in CCC (mean=1.8; median=0.5).

Since in Paper I we discovered that HMGA2 truncation and/or the generation of fusion transcripts is not a frequent event in ovarian tumors, we decided to investigate the role of miRNAs in regulating the expression of this gene. We therefore assessed the expression of two miRNAs, let-7a and miR-30c, known to target HMGA2 and to be deregulated in different types of tumors. Both let-7a and miR-30c were found to be downregulated in all types of ovarian tumors investigated, with no particular differences in expression level among the groups analyzed.

We also investigated the possible factors that could affect the expression of the aforementioned miRNAs. In order to do so, we checked the expression levels of the lin-28 homologues LIN28A/B, which are known to repress the let-7 family of miRNAs in cancer, and of the FHIT gene, which is able to enhance miR-30c and whose downregulation was found in previous studies to correlate with low levels of miR-30c in lung cancer [151] and squamous cell carcinoma of the vulva [181]. LIN28A was overexpressed in all carcinoma histotypes, with the highest expression seen in HGSC (mean=10.6; median=2.32). LIN28B expression was not detected in any of the samples analyzed, nor was it seen in the normal controls. FHIT was slightly overexpressed in all ovarian tumor types. The highest expression levels for the gene were found in Thf (mean=2.4; median=2) and the lowest in CCC (mean=1.4; median=1.1).

In order to find out whether genomic imbalances caused the low expression levels of these miRNAs, we went back to our previously generated cytogenetic data (karyotype and/or high-resolution CGH; HR-CGH) to see if the regions to which the let-7a and miR-30c clusters map were visibly lost, i.e., at the chromosomal level. We found a deletion corresponding to at least one of the chromosomal bands where the three clusters for let-7a map (9q22.32, 11q24.1, and 22q13.31) in 47 out of 62 tumors showing aberrations at the chromosomal level (76 %). The bands where the two clusters for miR-30c map, 1p34.2 and 6q13, were found deleted in 41 out of 62 cases with genomic aberrations (66 %). We also performed FISH analyses using locus-specific probes for the let-7a and miR-30c clusters on 42 cases to confirm, at the gene level, the loss of one or more clusters. Twelve cases showed a normal diploid pattern of signals for all the probes analyzed. The remaining 26 cases showed mostly heterozygous deletions of one or more miRNAs; only in four cases did the signal pattern indicate a homozygous deletion. At least one let-7a cluster was deleted in 88.5 % of the cases with an abnormal genome, while one miR-30c cluster was deleted in at least 19 % of the cases analyzed.

The results of this study showed that HMGA2 was expressed in the majority of ovarian tumor subtypes, CCC being the only exception, but that the miRNAs that target this gene, let-7a and miR-30c, are downregulated in ovarian neoplasms. Moreover, we found that genomic imbalances are a cause of the downregulation of the miRNAs analyzed.

Paper III: The microRNA mir-192/215 family is upregulated in mucinous ovarian carcinomas.

To investigate the expression of miRNAs in the different groups of ovarian carcinoma, 89 samples of such tumors were sequenced by means of miRNA-sequencing. The subsequent bioinformatic analysis was performed in order to obtain a miRNA profile of all the miRNAs whose expression levels could characterize the different subtypes of ovarian cancer.

The bioinformatic analyses performed on the sequenced samples showed that miRNAs belonging to the miR-192/215 family were consistently upregulated in mucinous carcinomas with a mean 6-fold change. These results were further validated by performing Real-Time qPCR for the mentioned miRNAs in an independent cohort of another 66 samples from several subgroups of ovarian tumors, i.e., different histological subtypes of carcinoma as well as sex cord-stromal tumors. The results obtained by Real-Time qPCR confirmed that at least two of the miRNAs of the miR-192/215 family were upregulated with a mean > 6-fold change in the majority of MC (9 out of 12), whereas the same family of miRNAs was consistently downregulated in all other tumor types. Interestingly, the three MC cases showing miRNA downregulation presented particular features such as a mixed mucinous and endometrioid histotype, presence of atypia, and neuroendocrine differentiation.

Paper IV: Recurrent involvement of DPP9 in gene fusion in serous ovarian carcinomas.

We designed a study to identify novel fusion genes in ovarian carcinomas, starting our search with samples harboring chromosome 19 changes, the most common genomic alterations in this tumor subtype.

Using both exon-level expression microarray data and RNA-seq, we identified two fusion transcripts, DPP9-PPP6R3 and DPP9-PLIN3, involving the dipeptidyl peptidase 9 gene (DPP9) located on chromosome band 19p13.3 in two samples of HGSC. The junction in the DPP9-PPP6R3 fusion was between exon 11 of DPP9 and exon 18 of PPP6R3, whereas in DPP9-PLIN3 the fusion junction detected by RNA-seq was between exon 16 of DPP9 and exon 8 of PLIN3. We were able to validate only the DPP9-PPP6R3 fusion transcript with PCR and direct sequencing. Although different partner genes (PPP6R3 and PLIN3) were involved in the rearrangements, both led to loss of the 3’ part of the DPP9 transcript. The effect of the disruption of DPP9 would be loss of the peptidase and esterase-lipase domains, the gene’s main functional domains. DPP9 encodes a serine protease that acts as a tumor suppressor in several cancer types, inducing apoptosis and suppressing proliferation.
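The exon-level expression screen rests on a simple idea: when a gene is broken by a fusion, the exons on either side of the breakpoint tend to be expressed at different levels. The following Python sketch illustrates that idea only; it is not the algorithm actually used in the study, and the function name, the difference-of-means score, and the example values are assumptions made for illustration.

```python
from statistics import mean


def best_breakpoint(exon_expr, min_exons=2):
    """Find the exon split with the largest 5'/3' expression imbalance.

    exon_expr: log2 expression values ordered from the 5' to the 3' exon.
    Returns (index, score), where exons [0:index] form the 5' part and
    score = mean(5' part) - mean(3' part). Returns None if the gene has
    too few exons to leave min_exons on each side.
    """
    best = None
    for i in range(min_exons, len(exon_expr) - min_exons + 1):
        score = mean(exon_expr[:i]) - mean(exon_expr[i:])
        if best is None or abs(score) > abs(best[1]):
            best = (i, score)
    return best


# Hypothetical gene with eight exons: the 3' half is expressed far lower,
# the pattern expected when the 3' part of a transcript is lost.
example = [8.1, 7.9, 8.0, 8.2, 3.1, 2.9, 3.0, 3.2]
split, score = best_breakpoint(example)
print(f"candidate breakpoint after exon {split}, imbalance {score:.2f}")
```

A real screen would also need a significance test and a correction for probe-specific effects; the score above only ranks candidate genes for closer inspection.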

Paper V: Identification of novel cyclin gene fusion transcripts in endometrioid ovarian carcinomas

In this study, we used an unbiased approach to detect novel fusion transcripts in ovarian tumors, i.e., no cytogenetic information was used to select the samples for sequence analysis. We sent 34 tumors representing the different histotypes of ovarian carcinoma for RNA-seq. The data obtained by RNA-seq were analyzed using four different algorithms, FusionCatcher, Chimerascan, FusionMap, and TopHat, all designed for the discovery of fusion transcripts and freely available online.

A total of 22 novel fusion transcripts were identified. Since all these transcripts were non-recurrent in the original series of 34 tumors sequenced, we also tested a larger cohort of 113 ovarian tumors for their possible presence. We found that the in-frame fusion transcript PCMTD1-CCNL2, involving the Cyclin L2 gene (CCNL2) mapping on 1p36.33, was present in four out of 18 EC samples. The result of this fusion is loss of the two cyclin domains and secondary loss of function of Cyclin L2. Since this cyclin displays tumor suppressor activity, possibly this fusion, like the DPP9 rearrangements found in the previous project (Paper IV), causes the loss of tumor suppressor activity, which may explain its tumorigenic role in a subgroup of EC. Two additional chimeric transcripts involving other cyclin genes, ANXA5-CCNA2 and PDE4D-CCNB1, were also found in two additional samples of EC. These fusions put almost the entire cyclin genes (CCNA2 and CCNB1) under the regulation of their 5’ partners (ANXA5 and PDE4D), which are expressed in ovarian cells at considerable levels. The main effect of these fusions is thought to be deregulated expression of CCNA2 and CCNB1 with eventual deregulation of the cell cycle.

Two fusion transcripts, CCNY-NRG4 and TSPAN3-NRG4, involving the Neuregulin 4 gene were also found. In these two transcripts, exon 4 of NRG4 was fused respectively with exon 1 of another Cyclin gene (CCNY) in one CCC and exon 6 of the Tetraspanin-3 gene (TSPAN3) in an HGSC. These findings suggest that NRG4 could be a promiscuous fusion gene in ovarian carcinoma, even though we cannot at present see how these fusions might exert a tumorigenic effect.

Part 5: General discussion

Methodological considerations

Study design

Around 200,000 new cases of ovarian cancer are diagnosed every year and the mortality rate is about 55-65 % worldwide [70]. This high mortality rate is mainly due to the fact that most ovarian cancers belong to the aggressive HGSC type [3]. HGSC is by far the best characterized type of ovarian carcinoma; however, ovarian tumors are a very heterogeneous group of neoplasms consisting also of relatively rare tumors of disparate origins and histology. Because of their rarity, most of these tumors were not extensively studied, at least not from the vantage point of the research projects presented in this thesis. We based all our projects on analyses of a range of tumors that included sex cord-stromal tumors, epithelial borderline tumors, and all the five histological types of ovarian carcinomas. Our starting series consisted of 39 sex cord-stromal tumors (19 Thf, 16 F, two granulosa cell tumors, and two teratomas), 22 borderline epithelial tumors, and 126 carcinomas (56 HGSC, 30 EC, 18 MC, 14 LGSC, 12 CCC, five mixed-type carcinomas, and one undifferentiated carcinoma).

Unfortunately, we could not use the same series of samples for all the projects because we soon ran out of fresh frozen material. The majority of the samples included in the starting series were previously analyzed by karyotyping and HR-CGH in studies that were already published [64, 259].

Based on the examination of this series, we were able to identify molecular features that were common to all the ovarian tumors analyzed as well as new specific aberrations characteristic of a subgroup of tumors, such as the TERT promoter mutations in ovarian F and presence of the PCMTD1-CCNL2 fusion transcript in EC (see above).

Transcriptomics

miRNA profiling

miRNA profiling has become of interest in oncology because miRNAs can be informative biomarkers of both diagnostic and prognostic value for different patient groups [243]. Three major approaches for miRNA profiling are currently well established: Real-Time qPCR, hybridization-based methods (microarrays), and high-throughput sequencing (miRNA-seq). In Paper II we used only TaqMan assays for Real-Time qPCR analyses [260], while in Paper III we used both miRNA-seq and the TaqMan assays. miRNA-seq allows profiling of the entire pool of miRNAs of a given total RNA template. The major advantages of this method are the detection of known and novel miRNAs and the precise identification of miRNA sequences (since it can readily distinguish between variants differing by a single nucleotide as well as isomiRs of varying length). Even though miRNA-seq is the most powerful tool for miRNA profiling, many biases can affect the method’s reliability. Library preparation is a crucial step in miRNA-seq since any problem at this stage can affect the whole profiling process. This step indeed affected the miRNA-seq in Paper III, where we sent 89 samples of ovarian epithelial tumors for sequencing and miRNA profiling.

The core facility used the Illumina HiSeq 2500 platform. Sequencing was successful in all 89 samples, but once we started the bioinformatics analyses, we noticed that after the trimming step (adapter sequence removal), only 10-38 % of the reads were retained (Table 2).

Table 2. Discrepancies between raw and trimmed reads. Raw and trimmed reads in nine samples analyzed with miRNA-seq.

Samples*   Raw reads   Trimmed reads
1          18427640    3082867 (16%)
17         13984610    1589429 (11.5%)
24         17422351    5609984 (32%)
35         23718803    2125912 (9%)
47         13802396    2563594 (18.5%)
54         33291454    5289467 (15%)
66         25624715    9839221 (38%)
77         47643658    8322222 (17%)
89         27493687    5250261 (19%)
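The percentages in Table 2 are the trimmed read count divided by the raw read count. The short Python sketch below repeats that calculation with the counts from the table; the function and variable names are illustrative and were not part of the original analysis.

```python
# Raw and trimmed read counts copied from Table 2 (sample -> (raw, trimmed)).
READS = {
    1: (18427640, 3082867),
    17: (13984610, 1589429),
    24: (17422351, 5609984),
    35: (23718803, 2125912),
    47: (13802396, 2563594),
    54: (33291454, 5289467),
    66: (25624715, 9839221),
    77: (47643658, 8322222),
    89: (27493687, 5250261),
}


def retained_percent(raw, trimmed):
    """Percentage of raw reads that remain after adapter trimming."""
    return 100.0 * trimmed / raw


percentages = {s: retained_percent(raw, trimmed) for s, (raw, trimmed) in READS.items()}

for sample, pct in sorted(percentages.items()):
    print(f"sample {sample}: {pct:.1f} % of reads retained")
```

Computing the fraction directly from the counts makes it easy to flag samples with unusually heavy read loss before any alignment is attempted.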

We concluded that, in all likelihood, unknown errors occurred during the library preparation, causing the difference between the number of raw and trimmed reads.

We checked if there was mRNA contamination in the sequencing process using the Linux grep command to search for sequences of the housekeeping genes GAPDH and RPL4 in the raw and trimmed data files, but found no evidence of mRNA contamination; we therefore decided to proceed with the downstream analyses using strict guidelines to minimize the bias that could be introduced by library preparation errors. We assessed the miRNA detection rate for each sample after alignment with the program STAR. The miRNA detection rate was measured by correlating the number of reads mapped to reference miRNAs obtained from a freely available miRNA database (http://www.mirbase.org/) with the number of detected miRNAs. An arbitrary cutoff was set at 1.5 million mapped reads as suggested by Metpally et al. [252], and the 13 samples (out of 89) whose reads were below that threshold were removed from further analyses. After the alignment, we performed differential expression analysis using DESeq2 and screened the results searching only for miRNAs that were overexpressed with a fold change higher than 4 and with an adjusted p-value (padj) below 0.005. We considered only the overexpressed miRNAs since it is known that errors in library preparation cause the loss of some miRNAs from the pool of miRNAs sequenced [261]. Therefore, we could not be sure if any miRNA downregulation found in the DESeq2 results was a real biological event or an artifact. Finally, we decided to try and confirm the bioinformatic results using Real-Time qPCR since this method is more accurate.
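The two filtering rules described above (a minimum of 1.5 million mapped reads per sample, and retention of only those miRNAs with a fold change above 4 and padj below 0.005) can be sketched in Python as follows. This is a minimal illustration, not the scripts used in the study: it assumes the DESeq2 results have already been exported as records carrying the standard `log2FoldChange` and `padj` columns, and the sample names, miRNA names, and values are made up.

```python
import math

MIN_MAPPED_READS = 1_500_000  # cutoff on reads mapped to reference miRNAs
MIN_FOLD_CHANGE = 4.0         # keep only miRNAs overexpressed more than 4-fold
MAX_PADJ = 0.005              # adjusted p-value threshold


def keep_samples(mapped_reads):
    """Return the samples whose mapped read count reaches the cutoff."""
    return [s for s, n in mapped_reads.items() if n >= MIN_MAPPED_READS]


def overexpressed(results):
    """Return miRNAs with fold change > 4 and padj < 0.005.

    results: list of dicts with keys 'mirna', 'log2FoldChange', 'padj'.
    A padj of None stands for the NA that DESeq2 reports for some rows.
    """
    min_log2fc = math.log2(MIN_FOLD_CHANGE)  # fold change 4 equals log2FC 2
    return [
        r["mirna"]
        for r in results
        if r["padj"] is not None
        and r["padj"] < MAX_PADJ
        and r["log2FoldChange"] > min_log2fc
    ]


# Made-up example input.
samples = {"S1": 2_000_000, "S2": 900_000, "S3": 1_500_000}
table = [
    {"mirna": "miR-A", "log2FoldChange": 2.6, "padj": 0.0004},
    {"mirna": "miR-B", "log2FoldChange": -3.1, "padj": 0.0001},  # downregulated
    {"mirna": "miR-C", "log2FoldChange": 2.4, "padj": 0.02},     # not significant
    {"mirna": "miR-D", "log2FoldChange": 3.0, "padj": None},     # NA in DESeq2
]
print(keep_samples(samples))
print(overexpressed(table))
```

Downregulated miRNAs are dropped by design, mirroring the decision to disregard apparent downregulation that could be an artifact of library preparation.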

The guidelines used were not optimal, but we thought they were necessary to obtain robust results from a project conditioned by such technical errors. We therefore excluded some miRNAs which could be of interest and left out from our study those miRNAs that seemed to be downregulated in the differential expression analysis. We saw this as a necessary “sacrifice” since we could not rely on the bioinformatic analyses and we could not afford to buy multiple TaqMan assays on the basis of such unreliable evidence. We were aware that the best option would be to sequence the samples again, but performing the entire procedure a second time was not feasible within our budget.

Methods for fusion transcript detection

The development of transcriptomics made possible the discovery of many novel fusion genes that were previously difficult to detect by karyotyping followed by FISH and direct sequencing. RNA-seq has become the most efficient method for the identification of fusion transcripts [262], as it is cheaper than whole genome sequencing and requires “user-friendly” downstream analyses. In fact, there is a plethora of free algorithms able to detect the fusion transcripts in RNA-seq data [239]. Many of these programs, such as TopHat [258] and FusionMap [263], generate a long list with thousands of fusion candidates which are difficult to analyze and filter. In Paper IV, we overcame this problem by screening the long list of candidate fusion transcripts produced by FusionMap using both cytogenetics and exon-level expression microarray data. In that paper, we focused exclusively on finding fusion transcripts arising from chromosome 19 changes (see below), which shortened the list of candidates considerably. After this step, we used an algorithm [264] to screen exon-level expression microarray data and select genes having a significant difference in expression levels between the 5′ and 3′ parts of the transcript, something that would indicate a difference brought about by a fusion event. Using these analyses, we were able to identify fusion transcripts involving the gene DPP9 (Paper IV). Since we focused only on fusion events involving chromosome 19, we excluded from consideration all fusion transcripts involving genes located on other chromosomes. We knew that we might miss interesting transcripts by taking this approach, and therefore tried to look again at the entire genome in the subsequent project. In Paper V, we used an unbiased approach (one not based on cytogenetic findings) to detect possible fusion genes. This required a different bioinformatics pipeline based on the use of multiple algorithms. We first used the program FusionCatcher [255], which was developed to reduce the fusion transcript list to only a few candidates (mean of 10-15) using different filters during the pre-processing, alignment, and final filtering [255].

Even though FusionCatcher is a powerful tool, it is not free from errors and we know from first-hand experience that the program sometimes excludes real fusion transcripts (i.e., it generates false negatives) because of the strict filtering guidelines of the algorithm [265]. We therefore decided to screen the lists generated by the three algorithms Chimerascan, FusionMap, and TopHat, searching for the genes involved in fusions previously found with FusionCatcher, to either confirm the fusion transcripts found or to find new fusions involving these genes with other partners and, possibly, to find a recurrence, i.e., the same fusion in another tumor. By way of example, screening the FusionMap list of one case of endometrioid carcinoma where NRG4 was involved in a fusion event, we found this gene recombined with a new partner, i.e., TSPAN3-NRG4. We then decided to screen the TopHat fusion lists using the coordinates of the genes belonging to the Cyclin family, since we found fusions involving these genes with FusionCatcher. Doing this we found the fusion transcript PCMTD1-CCNL2, which was then proved to be recurrent in EC.
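The cross-screening step described above amounts to searching the long candidate lists of the permissive callers for any fusion that involves a gene already flagged by the stricter caller. A minimal Python sketch of that logic is given below. It assumes the caller outputs have already been parsed into (5′ gene, 3′ gene) pairs and it matches by gene symbol rather than by genomic coordinates; the function name and the placeholder genes GENE_A and GENE_B are hypothetical.

```python
def screen_candidates(candidates, genes_of_interest):
    """Return fusions from other callers that involve a gene of interest.

    candidates: dict mapping a caller name to an iterable of
                (five_prime_gene, three_prime_gene) tuples.
    genes_of_interest: set of gene symbols flagged in an earlier step.
    """
    hits = []
    for caller, fusions in candidates.items():
        for five_prime, three_prime in fusions:
            if five_prime in genes_of_interest or three_prime in genes_of_interest:
                hits.append((caller, f"{five_prime}-{three_prime}"))
    return hits


# Genes flagged earlier (here NRG4 and one cyclin gene), then looked up
# in the unfiltered lists of two other callers.
flagged = {"NRG4", "CCNL2"}
lists = {
    "FusionMap": [("TSPAN3", "NRG4"), ("GENE_A", "GENE_B")],
    "TopHat": [("PCMTD1", "CCNL2")],
}
for caller, fusion in screen_candidates(lists, flagged):
    print(caller, fusion)
```

Restricting the search to a short list of flagged genes keeps the review of thousands of candidates manageable, at the cost of missing fusions between genes that no caller has highlighted.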

The synergistic use of cytogenetics, transcriptomics, and molecular analyses

Ovarian tumors, carcinomas in particular, are characterized by a high degree of genomic instability [266]. Previous studies by our group showed that the karyotype of borderline epithelial tumors often carries trisomies for chromosomes 12 and 7 [10], whereas ovarian carcinomas are characterized by complex karyotypes with frequent changes on chromosomes 11 and 19 [64]. CGH analyses performed on the same samples [10, 64, 72] confirmed that ovarian tumors are characterized by many genomic imbalances. We therefore wanted to make a deeper analysis of these alterations by studying the aberrations present at the RNA level (in terms of fusion genes) and at the miRNA level.

In Paper II, cytogenetic data were used to formulate hypotheses about the role of imbalances for the expression of let-7a and miR-30c. Real-Time qPCR analyses showed that these two miRNAs were downregulated with expression levels near zero in all ovarian tumor types examined (see Results). miR-30c expression is enhanced by the Fragile Histidine Triad protein (FHIT), whereas let-7a is repressed by LIN28A/B. When we investigated the expression of these three regulators, we saw that their levels were not concordant with the low expression levels found for the miRNAs (see Paper II). We therefore concluded that other causes were involved in the deregulation of miR-30c and let-7a, one of them possibly involving the loss of these miRNA loci from the genome. The cytogenetic data and FISH analyses (Figure 14) showed that indeed the clusters coding for these miRNAs were frequently deleted in the whole spectrum of ovarian tumors.

Figure 14. FISH analyses for detection of let-7a and miR-30c clusters in Project II. Heterozygous deletion of MIR30C1 on 1p34.2 A), MIR30C2 on 6q13 B), MIRLET7A1 on 9q22.32 C), MIRLET7A2 on 11q24.1, and MIRLET7A3 on 22q13.31 D).

The recurrent rearrangement(s) of chromosome 19 was the starting point for the study behind Paper IV where we investigated 19 HGSC and one undifferentiated carcinoma with chromosome 19 changes. Early cytogenetic studies had shown that ovarian carcinomas are characterized by aberrations occurring on both arms of chromosome 19 [267, 268] but with no particular region being recurrently involved. The project was designed to search for presence of fusion genes/transcripts that could arise from such aberrations.

Tissue references used in the studies

The use of correct reference materials is crucial in gene expression studies. Usually, samples from a healthy person or group of persons are used. For ovarian carcinomas, the choice of a suitable reference is challenging. The changing views as to which is the cell of origin in ovarian carcinogenesis (see Introduction) have made earlier preferences less biologically relevant. In the past, when the site of origin of carcinomas was still believed to be the surface epithelium, ovarian epithelium brushes from healthy women were used as controls [269]. Today, tissue from fallopian tube is increasingly used as reference for HGSC [75].

However, the PhD project presented in this thesis focused on studying different types of ovarian tumors with disparate histology and origins and we therefore thought that the most suitable reference would be the whole ovary. This may not have been the most biologically correct choice for all tumor subsets, but it represented a reasonable compromise when trying to compare different types of tumors. We therefore used the commercially available total RNAs from Agilent, Clontech, and Zyagen. These references were used to normalize the expression of both genes and miRNAs (Papers II and III) in Real-Time qPCR analyses.

Each of these RNA references was tested multiple times in the different assays to make sure that all genes and miRNAs investigated were consistently expressed. In Paper III, we did not sequence any ovarian reference with miRNA-seq because we focused on the differential expression between types of carcinoma and not between carcinomas and normal tissue.

Biological considerations

HMGA2 regulation in ovarian tumors

HMGA2 is an oncogene that is currently attracting much attention in cancer genetics. High expression levels of this gene were found to correlate with a poor survival rate in different types of carcinoma such as esophageal carcinoma [270], breast cancer [271], and HGSC [272]. In ovarian tumors, most studies focused only on the expression of HMGA2 in HGSC [273, 274] and its relation to EMT [275, 276], whereas the causes leading to overexpression of this gene in ovarian tumors were scarcely investigated. We decided to look into the mechanisms behind HMGA2 deregulation in ovarian tumors, extending our research also to ovarian tumor types that were not or only rarely investigated before. The results form the backbone for Papers I and II.

The truncation/fusion of HMGA2 caused by genomic aberrations is one of the main mechanisms that cause overexpression of this gene [162]. We therefore decided to check if this mechanism of deregulation was operative also in ovarian tumors using a PCR assay to check for presence of a truncated/chimeric form of HMGA2. We found that HMGA2 truncation was rare in ovarian tumors, found only in 11 samples out of 109 showing expression of this gene. Interestingly, we found a novel truncated form of HMGA2 containing the normal mRNA sequence until exon 3 fused with different parts of the third intron varying in length in four samples of HGSC (Figure 15).

Figure 15. HMGA2 truncated transcripts found in HGSC. Chromatograms of the truncated form of HMGA2. The junctions between exon 3 and the different intronic parts, A) and B), are shown in the yellow box.

In Paper I, we demonstrated that deregulation of HMGA2 was mostly caused by other mechanisms than truncation/fusion of the gene. In contrast, we found in Paper II that downregulation of miRNAs targeting HMGA2 is the main cause of the overexpression of this gene. We also wanted to gain more insight into the pathogenic mechanisms that may cause the miRNAs’ downregulation and consequently lead to HMGA2 deregulation. We therefore investigated the expression and regulation of let-7a and miR-30c, which are known to target HMGA2 [115, 151]. We found that HMGA2 was expressed at high levels in pure stromal ovarian tumors, borderline epithelial tumors, and in all carcinomas with CCC as the only exception. A recent study found that HMGA2 is expressed also in ovarian germ cell tumors, especially in immature teratomas and yolk sac tumors [277]. These findings, together with the results of Paper II, suggest that HMGA2 plays a major role in the pathogenesis of ovarian tumors as it is expressed over the full spectrum of these neoplasms. The two miRNAs, let-7a and miR-30c, were highly downregulated in all tumor types analyzed and we demonstrated that one of the causes of deregulation of these miRNAs is the genomic loss of the clusters coding for them (see below). The same inverse correlation between the expression levels of HMGA2 and its repressor miRNAs found in ovarian tumors was earlier identified also in other tumor types such as lung cancer [151], pituitary adenoma [278], and gastric cancer [279].

miRNA deregulation in ovarian tumors

miRNA deregulation plays a central role in the genesis of several types of tumors, including ovarian neoplasms [85]. Several studies have identified miRNA markers expressed in the various types of ovarian carcinoma, demonstrating miRNA deregulation in them [280].

Nevertheless, only a few studies investigated the mechanisms behind the miRNA deregulation and how it might act pathogenetically. The main aim of Papers II and III was to study the deregulation of miRNAs in ovarian tumors with special emphasis on the possible causes and effects of miRNA deregulation.

In Paper II, we demonstrated that the miRNAs let-7a and miR-30c are highly deregulated in pure stromal tumors, borderline epithelial tumors, and ovarian carcinomas (Figure 16). We showed that one of the main causes behind this deregulation was genomic loss of the clusters encoding these miRNAs; CGH data showed deletions involving the clusters for let-7a and miR-30c in 76 % and 66 % of the tumors with a rearranged genome, respectively. FISH analyses validated, at the gene level of resolution, the above data and showed that the deletions were heterozygous in most samples. This fits well with what was also reported recently by Kan et al. [29], who found that genomic alterations led to dysregulation of miRNA expression in HGSC. We showed that this also holds for other types of ovarian benign tumors and cancer. In particular, deletions of let-7a clusters were found in all four F analyzed with FISH. In malignant tumors, the CGH and FISH data together showed that genomic deletion of at least one of the clusters was seen in six out of seven borderline tumors, whereas the same pattern was found in two out of three CCC, in five out of eight MC, in 10 out of 17 EC, in all three LGSC, and in 28 out of 33 HGSC. Interestingly, MIRLET7A3, mapping on 22q13.31, was found deleted in 62 % of the cases analyzed by CGH (39/62) but in as many as 81 % of cases analyzed by FISH (21/26). Our results indicated that this cluster is deleted across the entire spectrum of ovarian tumors from benign sex cord-stromal tumors to carcinomas, underscoring yet again the importance of dysregulation of let-7a in ovarian tumorigenesis and progression.

Figure 16. Let-7a and miR-30c expression levels (Project II). Relative normalized expression of let-7a and miR-30c in thecofibroma (ThF), fibroma (F), epithelial borderline tumors (B), clear cell carcinoma (CCC), mucinous carcinoma (M), endometrioid carcinoma (E), LGSC (S-LG), and HGSC (S-HG).

miR-30c was found downregulated in all ovarian tumor types analyzed (Figure 16). A study by Wang et al. [281] published after the publication of Paper II confirmed that this miRNA is downregulated in ovarian cancer. In particular, Wang and colleagues showed that the downregulation of miR-30c was correlated with high expression levels of metastasis-associated gene 1 (MTA1), which is targeted and regulated by this miRNA in a similar manner to HMGA2. MTA1 displays chromatin remodeling properties and regulates the expression of different genes involved in cancer progression and metastasis [282]. MTA1 and HMGA2 share many properties, including that they both are chromatin remodeling proteins and drivers of EMT [171, 282] regulated by miR-30c [151, 281]. The results of Paper II and the work by Wang et al. [281] together show that miR-30c downregulation is correlated with upregulation of HMGA2 and MTA1 in ovarian tumors. These findings suggest that downregulation of miR-30c is relevant to ovarian tumorigenesis and that this miRNA may be an important suppressor of EMT in such neoplasms.

In Paper III, we found that miR-192, miR-194, and miR-215 (the miR-192/215 family) were exclusively overexpressed in MC (Figure 17A). We tried to investigate the possible effects of miR-192/215 overexpression by analyzing the expression levels of the oncogenes MDM2, TYMS, and ZEB2, which are known to be targeted by all three of these miRNAs [283-285]. We expected to find these target genes downregulated in mucinous carcinomas, but did not find any significant difference in the expression of MDM2, TYMS, and ZEB2 between MC and other ovarian tumors (Figure 17B). These results suggest that the overexpression of the miR-192/215 family in MC may have no significant biological effect, even though we cannot exclude that these miRNAs may target other genes.

Recent articles by An et al. [286] and Lin et al. [287] identified two new target genes for miR-194 and miR-215. An et al. showed that ZEB1 is a target of miR-194 in paclitaxel-resistant HGSC [286], whereas Lin et al. found that NOB1 is a novel target of miR-215 in HGSC [287]. Because these studies were published after our Paper III was finalized for submission to the journal, we did not assess the expression of these genes in our series. It would definitely be of interest to check whether the expression of these genes was also altered in our series.

Figure 17. Expression levels of the miR-192/215 family and MDM2, TYMS, and ZEB2 in ovarian tumors. A) miR-192, miR-194, and miR-215 expression in ovarian tumors. B) Expression levels for MDM2, TYMS, and ZEB2 in ovarian tumors (data not included in the article).

The results of Paper III showed that the miR-192/215 family seemed to have no tumor suppressive effect in MC, so it is reasonable to consider the overexpression of these miRNAs as an indirect effect of the cancer cells’ differentiation into a mucinous phenotype. In this regard, it is well demonstrated that miRNA expression patterns change during differentiation [288, 289]. This interpretation is also supported by the fact that downregulation of the aforementioned miRNA family was found in three mucinous carcinomas displaying particular phenotypic features (mixed mucinous and endometrioid phenotype, cellular atypia, and neuroendocrine differentiation). The strong correlation found between a mucinous phenotype and overexpression of the miR-192/215 family highlights the possibility that these molecules may be used as diagnostic biomarkers.

Fusion transcripts in ovarian carcinomas

Several studies have looked for fusion genes in ovarian carcinomas. Taken together, the findings of these studies suggest that fusion genes are passenger mutations generated by the high degree of genomic instability which characterizes these neoplasms; at least, no single fusion has been detected in many ovarian tumors. In fact, no fusion gene has been found in more than 5 % of ovarian carcinomas [75, 76]. Moreover, only the fusion SLC25A40-ABCB1 [75], out of these low-recurrent fusion genes, has a plausible pathogenic effect, while the other fusions involve genes with no known oncogenic properties.

In Paper IV, we identified two fusion transcripts involving the gene DPP9 (DPP9-PPP6R3 and DPP9-PLIN3) mapping on 19p13.3 in a series of 18 HGSC and one undifferentiated carcinoma with chromosome 19 aberrations. The junction in the DPP9-PPP6R3 fusion was found between exon 11 of DPP9 and exon 18 of PPP6R3, while in DPP9-PLIN3 the fusion junction detected by RNA-seq was between exon 16 of DPP9 and exon 8 of PLIN3. Both fusions disrupt DPP9, causing loss of the 3’ part of the gene that codes for the functional domains. The result would be loss of protein function. The protein structure of DPP9 was recently published [290]. The main residues necessary for structure/function of DPP9 are located in the C-terminal part (N422, H500, loop 645-651, α/β hydrolase domain 691-892, Y762) [291]. All these residues/domains are lost in DPP9-PPP6R3 as this fusion results in loss of the exons (11-22) coding for amino acids 397-892. Similarly, in DPP9-PLIN3 the C-terminal part is lost as a result of gene disruption (exons 16-22, amino acids 588-892); therefore this fusion causes loss of the loop 645-651, the α/β hydrolase domain 691-892, and residue Y762.
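Which of the listed residues and domains disappear in each fusion follows from a simple interval comparison between the lost amino-acid range and the position of each feature. The Python sketch below works through that comparison using the positions quoted above. It is an illustration only: the function name is hypothetical, and the lost interval of DPP9-PLIN3 is assumed here to run from amino acid 588 to the C-terminal residue 892.

```python
# DPP9 residues/domains named in the text, as (start, end) amino-acid positions.
FEATURES = {
    "N422": (422, 422),
    "H500": (500, 500),
    "loop 645-651": (645, 651),
    "alpha/beta hydrolase domain 691-892": (691, 892),
    "Y762": (762, 762),
}


def lost_features(first_lost_aa, last_lost_aa=892):
    """Return the features that overlap the lost amino-acid interval."""
    return [
        name
        for name, (start, end) in FEATURES.items()
        if start <= last_lost_aa and end >= first_lost_aa
    ]


# DPP9-PPP6R3: amino acids 397-892 are lost.
print("DPP9-PPP6R3 loses:", lost_features(397))
# DPP9-PLIN3: amino acids 588-892 are assumed lost.
print("DPP9-PLIN3 loses:", lost_features(588))
```

Under these assumptions the DPP9-PLIN3 transcript would retain N422 and H500 while still losing the hydrolase domain, which is consistent with the features listed for each fusion in the text.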

DPP9 is a serine protease of the DPP4 family that possesses tumor suppressor activities and induces apoptosis through inhibition of AKT [292, 293]. The DPP9 fusion transcripts identified in this study cause loss of the gene’s tumor suppressor activity, hinting that DPP9 disruption may be pathogenetic in HGSC with 19p chromosomal alterations.

Hoogstraat and colleagues [294] previously found another in-frame fusion of DPP9 (DPP9-PAX2) in an HGSC by means of whole-genome mate-pair sequencing. The DPP9-PAX2 fusion breakpoint was after exon 11 of DPP9, i.e., close to the breakpoint identified in the DPP9-PPP6R3 fusion transcript; hence, the fusion DPP9-PAX2 also causes DPP9 disruption and loss of function. Analyzing the TCGA dataset of HGSC, we found that DPP9 was downregulated in 7 % of the samples without DNA copy number changes [71]. In all likelihood, in some of these samples the downregulation of DPP9 may have been caused by gene breakage through a mechanism similar to the one found in our series. These findings together suggest that DPP9 disruption may be tumorigenically relevant in a small subset of HGSC.

In Paper V, we found that fusion transcripts involving genes coding for cyclins are frequent in EC. In particular, we found that the fusion transcript PCMTD1-CCNL2 was recurrent in EC (22 %). The PCMTD1-CCNL2 in-frame fusion juxtaposes exon 3 of PCMTD1, located on 8q11.23, with exon 6 of the Cyclin L2 gene CCNL2, which maps on 1p36.33. The fusion event causes the loss of exons (2-5) of CCNL2 that code for the functional cyclin domains. It therefore generates a non-functional chimeric Cyclin L2. This cyclin participates together with Cyclin L1 and CDK11 in the pre-mRNA splicing processes, acting as a tumor suppressor enhancing both apoptosis and chemosensitivity [295]. We concluded that the main pathogenic effect of the PCMTD1-CCNL2 fusion is the loss of one functional copy of the tumor suppressor gene CCNL2. Additionally, we found the fusions ANXA5-CCNA2 and PDE4D-CCNB1 involving genes coding for two main mitosis regulators, Cyclin A2 and Cyclin B1 [296]. These fusions (ANXA5-CCNA2 and PDE4D-CCNB1) bring almost the entire cyclin genes under the control of the promoter of their 5’ partners, leading to deregulated expression of functional Cyclins A2 and B1. Moreover, we found the CCNY-NRG4 fusion in a CCC sample. In this transcript, exon 4 of NRG4 is fused with exon 1 of CCNY, i.e., another cyclin gene. The fusion causes complete genomic loss of the entire CCNY. Even though the tumor sample examined showed clear cell histology in the ovarian tumor, it is worthy of note that the primary tumor in the uterus had mixed clear cell-endometrioid morphology. That endometrioid cells were present in the primary carcinoma underscores the consistency with which cyclin genes are involved in fusion transcripts in neoplastic cells showing endometrioid features.

Part 6: Conclusion and future perspectives

As described above, the aim of the research projects presented in this thesis was to identify novel genetic and molecular properties that specify the different types of ovarian tumors and that may help to unravel the pathogenesis of these neoplasms. The novel genetic and molecular features that we found to occur nonrandomly in ovarian tumors are listed below (Table 3).

Table 3. Overview of the main novel histotype-specific genetic features identified in the papers presented in this thesis.

Tumor type    Genetic and molecular features     Paper
F             TERT promoter mutations            I
CCC           absence of HMGA2 expression        II
MC            miR-192/215 family upregulation    III
EC            Cyclin fusion genes                V
HGSC          Novel HMGA2 truncated form         I
HGSC          DPP9 fusion genes                  IV

Each of these findings proves that molecular genetic features are typical of pathogenetic pathways in the various types of ovarian cancer and that they may help distinguish different subtypes of ovarian carcinomas. By way of example, in Paper V, we found that a subgroup of EC is characterized by fusion events involving the genes coding for cyclins. This finding hints that a subclassification of endometrioid carcinomas based on the tumors’ molecular genetic features is possible.

Solid tumors have usually been classified according to their microscopic cellular and histological features and grade of differentiation. These features are currently used to differentiate between the five types of ovarian carcinoma that are recognized (see above).

Recent advances in cancer genomics and biology have shown that this classification is of limited value if the morphological grouping of tumors is also expected to reflect differences in pathogenesis. The most recent classifications instead integrate tumor morphology with genetic and molecular features. That is the case, for example, in the 2016 WHO classification of tumors of the central nervous system, where astrocytomas are classified into two different groups according to their mutation status at the IDH loci [297]. It is to be expected that future classifications of ovarian carcinomas, too, will include genetic and molecular information.

The papers presented in this thesis are part of a research project that our section initiated in 2004, focusing on the discovery of novel genetic features of ovarian tumors. The first research aim [10, 64, 72] was to characterize the main acquired chromosomal alterations found in different types of ovarian tumors; the results thus obtained were the starting point for the projects discussed here, in which we focused on transcriptomic analyses of miRNAs and a hunt for fusion transcripts. Additional investigations could start from this point. In this project we focused only on miRNAs, but there are also other classes of non-coding RNA that would be interesting to analyze, such as circular RNAs (circRNAs) [298]. DNA sequencing and DNA methylation sequencing are fundamental tools for the identification of novel genetic changes in cancer [299]. The characterization of the DNA methylome of ovarian tumors will provide new insight into the pathogenesis of these neoplasms and may identify epigenetic markers specific to each type of ovarian tumor. Even though the most frequent mutations in ovarian tumors have already been characterized, it is still possible to find rarer mutations by DNA sequencing that may characterize subgroups within the groups of ovarian tumors of the current classification. Since we are moving towards the era of personalized medicine, every clinically relevant mutation, even the rarest, will help clinicians to formulate the proper diagnosis and choose the right treatment.

Appendix

Abbreviations

5-AZA: 5-azacytidine
3’-UTR: 3’ untranslated region
aCGH: array CGH
ALL: acute lymphocytic leukemia
AML: acute myeloid leukemia
ARMS: alveolar rhabdomyosarcoma
BER: base excision repair
BLAST: basic local alignment search tool
CCC: clear cell carcinoma
cDNA: complementary DNA
CGH: comparative genomic hybridization
CIN: chromosomal instability
CLL: chronic lymphocytic leukemia
ddNTPs: dideoxynucleotides
DNA: deoxyribonucleic acid
EC: endometrioid carcinoma
ERK: extracellular signal-regulated protein kinase
ETS: E-twenty-six DNA binding domain
F: fibroma
FIGO: International Federation of Gynecology and Obstetrics
FISH: fluorescence in situ hybridization
HELP: hydrophobic EMAP-like protein domain
HGSC: high-grade serous carcinoma
HNPCC: hereditary nonpolyposis colorectal cancer
HTS: high throughput sequencing
ISCN: International System for Chromosome Nomenclature
LGSC: low-grade serous carcinoma
MC: mucinous carcinoma
MDR1: multidrug resistance 1 protein
MEK: MAPK/extracellular signal-regulated protein kinase
MSI: microsatellite instability
MSP-qPCR: methylation specific quantitative PCR
NGS: next generation sequencing
PCR: polymerase chain reaction
RACE: rapid amplification of cDNA ends
RAF: rapidly accelerated fibrosarcoma protein kinase
Real-time qPCR: real-time quantitative PCR
RISC: RNA-induced silencing complex
RNA: ribonucleic acid
RNA-seq: RNA sequencing
RPKM: reads per kilobase million
RT-PCR: reverse-transcriptase PCR
SYBR: Synergy brands
TGFβ: transforming growth factor-β
TGFβRII: transforming growth factor-β receptor II
Thf: thecofibroma
WD: beta-transducin domain
WHO: World Health Organization
WNT10B: wingless type 10B protein

Gene abbreviations

ABCB1: ATP binding cassette subfamily B member 1
ABL1: ABL proto-oncogene 1
AKT2: AKT serine/threonine kinase 2
ALK: anaplastic lymphoma receptor kinase
ANXA5: annexin A5
ARID1A: AT-rich interaction domain 1A
ATM: ATM serine/threonine kinase
AURKB: aurora kinase B
AXIN1: axin 1
BAK1: BCL2 antagonist/killer 1
BCAM: basal cell adhesion molecule
BCL3: B-cell CLL/lymphoma 3
BCR: breakpoint cluster region
BRAF: B-Raf proto-oncogene
BRCA1: BRCA1, DNA repair associated
BRCA2: BRCA2, DNA repair associated
C11orf20/CATSPERZ: catsper channel auxiliary subunit zeta
CCNA1: cyclin A1
CCNA2: cyclin A2
CCNB1: cyclin B1
CCNB2: cyclin B2
CCND2: cyclin D2
CCNL2: cyclin L2
CCNY: cyclin Y
CDKG: cyclin-dependent kinase G
CDKN2A: cyclin-dependent kinase inhibitor 2A
CDKN2D: cyclin-dependent kinase inhibitor 2D
CREBL2: cAMP responsive element binding protein-like 2
CTNNB1: catenin beta 1
DGCR8: DGCR8, microprocessor complex subunit
DICER: dicer 1, ribonuclease III
DPP9: dipeptidyl peptidase 9
DROSHA: drosha, ribonuclease III
E2F1: E2F transcription factor 1
EBF1: early B-cell factor 1
EML4: echinoderm microtubule associated protein-like 4
ERG: ERG, ETS transcription factor
ERCC1: ERCC excision repair 1
ESRRA: estrogen related receptor alpha
FHIT: fragile histidine triad
FOXO4: forkhead box O4
FUS: FUS RNA binding protein
HMGA2: high mobility group AT-hook 2
ID1: inhibitor of DNA binding 1
IDH1: isocitrate dehydrogenase (NADP(+)) 1, cytosolic
IDH2: isocitrate dehydrogenase (NADP(+)) 2, mitochondrial
IGH: immunoglobulin heavy chain complex
KLF5: Kruppel like factor 5
KRAS: KRAS proto-oncogene, GTPase
LAF4: AF4/FMR2 family member 3
LIN28A: lin-28 homolog A
LIN28B: lin-28 homolog B
LUM: lumican
MGMT: O-6-methylguanine-DNA-methyltransferase
MDM2: MDM2 proto-oncogene
MLL/KMT2A: lysine (K)-specific methyltransferase 2A
MTA1: metastasis associated 1
MYC: MYC proto-oncogene
MYH11: myosin heavy chain 11
NAB2: NGFI-A binding protein 2
NANOG: Nanog homeobox
NF1: neurofibromin 1
NRG4: neuregulin 4
PAX7: paired box 7
PAX8: paired box 8
PCMTD1: protein-L-isoaspartate (D-aspartate) O-methyltransferase domain containing 1
PDE4D: phosphodiesterase 4D
PDGFRB: platelet derived growth factor receptor beta
PLAG1: PLAG1 zinc finger
PLIN3: perilipin 3
POU5F1: POU class 5 homeobox 1
PPP6R3: protein phosphatase 6 regulatory subunit 3
PTEN: phosphatase and tensin homolog
RB1: RB transcriptional corepressor 1
ROS1: ROS proto-oncogene 1
SLC25A40: solute carrier family 25 member 40
SNAI1: snail family transcriptional repressor 1
SOX2: SRY (sex determining region)-box 2
STARD13: StAR related lipid transfer domain containing 13
STAT3: signal transducer and activator of transcription 3
STAT6: signal transducer and activator of transcription 6
TGM7: transglutaminase 7
TERT: telomerase reverse transcriptase
TMPRSS2: transmembrane serine protease 2
TP53: tumor protein p53
TP53INP1: tumor protein p53 inducible nuclear protein 1
TSPAN3: tetraspanin 3
TUBB3: tubulin, beta 3 class III
TWIST1: twist family bHLH transcription factor 1
UBAP1: ubiquitin associated protein 1
VIM: vimentin
WDFY2: WD repeat and FYVE domain containing 2
WNT2: Wnt family member 2
ZEB1: zinc finger E-box binding homeobox 1

Reference List

1. La Vecchia C. Ovarian cancer: epidemiology and risk factors. European journal of cancer prevention : the official journal of the European Cancer Prevention Organisation (ECP). 2017; 26(1):55-62.
2. Cancer Registry of Norway. Cancer incidence, mortality, survival and prevalence in Norway. Oslo: Cancer Registry of Norway. 2015.
3. Prat J. Ovarian carcinomas: five distinct diseases with different origins, genetic alterations, and clinicopathological features. Virchows Arch. 2012; 460(3):237-249.
4. Kurman RJ, Carcangiu ML, Herrington CS and Young RH. (2014). WHO Classification of Tumours of Female Reproductive Organs. (IARC).
5. Meinhold-Heerlein I and Hauptmann S. The heterogeneity of ovarian cancer. Archives of Gynecology and Obstetrics. 2014; 289(2):237-239.
6. Mutch DG and Prat J. 2014 FIGO staging for ovarian, fallopian tube and peritoneal cancer. Gynecol Oncol. 2014; 133(3):401-404.
7. Hauptmann S, Friedrich K, Redline R and Avril S. Ovarian borderline tumors in the 2014 WHO classification: evolving concepts and diagnostic criteria. Virchows Archiv : an international journal of pathology. 2017; 470(2):125-142.
8. Ho CL, Kurman RJ, Dehari R, Wang TL and Shih Ie M. Mutations of BRAF and KRAS precede the development of ovarian serous borderline tumors. Cancer Res. 2004; 64(19):6915-6918.
9. Staebler A, Heselmeyer-Haddad K, Bell K, Riopel M, Perlman E, Ried T and Kurman RJ. Micropapillary serous carcinoma of the ovary has distinct patterns of chromosomal imbalances by comparative genomic hybridization compared with atypical proliferative serous tumors and serous carcinomas. Hum Pathol. 2002; 33(1):47-59.
10. Micci F, Haugom L, Ahlquist T, Andersen HK, Abeler VM, Davidson B, Trope CG, Lothe RA and Heim S. Genomic aberrations in borderline ovarian tumors. Journal of translational medicine. 2010; 8:21.
11. Zeppernick F and Meinhold-Heerlein I. The new FIGO staging system for ovarian, fallopian tube, and primary peritoneal cancer. Arch Gynecol Obstet. 2014; 290(5):839-842.
12. Scully RE. Pathology of ovarian cancer precursors. Journal of cellular biochemistry Supplement. 1995; 23:208-218.
13. Jarboe E, Folkins A, Nucci MR, Kindelberger D, Drapkin R, Miron A, Lee Y and Crum CP. Serous carcinogenesis in the fallopian tube: a descriptive classification. International journal of gynecological pathology : official journal of the International Society of Gynecological Pathologists. 2008; 27(1):1-9.
14. Sato N, Tsunoda H, Nishida M, Morishita Y, Takimoto Y, Kubo T and Noguchi M. Loss of heterozygosity on 10q23.3 and mutation of the tumor suppressor gene PTEN in benign endometrial cyst of the ovary: possible sequence progression from benign endometrial cyst to endometrioid carcinoma and clear cell carcinoma of the ovary. Cancer Res. 2000; 60(24):7052-7056.
15. Lim G and Oliva E. (2011). Sex Cord Stromal Tumors of the Ovary. In: Soslow RA and Tornos C, eds. Diagnostic Pathology of Ovarian Tumors. (Springer New York), pp. 193-234.
16. Horta M and Cunha TM. Sex cord-stromal tumors of the ovary: a comprehensive review and update for radiologists. Diagnostic and Interventional Radiology. 2015; 21(4):277-286.

17. Fonseca RB and Grzeszczak EF. Case 128: Bilateral ovarian fibromas in nevoid basal cell carcinoma syndrome. Radiology. 2008; 246(1):318-321.
18. Haroon S, Zia A, Idrees R, Memon A, Fatima S and Kayani N. Clinicopathological spectrum of ovarian sex cord-stromal tumors; 20 years' retrospective study in a developing country. Journal of ovarian research. 2013; 6(1):87.
19. Buy J and Ghossain M. (2013). Sex Cord-Stromal Tumors. Gynecological Imaging. (Springer Berlin Heidelberg), pp. 329-375.
20. Rabban JT, Gupta D, Zaloudek CJ and Chen LM. Synchronous ovarian granulosa cell tumor and uterine serous carcinoma: a rare association of a high-risk endometrial cancer with anestrogenic ovarian tumor. Gynecologic oncology. 2006; 103(3):1164-1168.
21. Hanahan D and Weinberg RA. Hallmarks of Cancer: The Next Generation. Cell. 144(5):646-674.
22. Sherr CJ and McCormick F. The RB and p53 pathways in cancer. Cancer Cell. 2002; 2(2):103-112.
23. Heidenreich B, Rachakonda PS, Hemminki K and Kumar R. TERT promoter mutations in cancer development. Curr Opin Genet Dev. 2014; 24:30-37. doi: 10.1016/j.gde.2013.11.005.
24. Adams JM and Cory S. The Bcl-2 apoptotic switch in cancer development and therapy. Oncogene. 2007; 26(9):1324-1337.
25. Bergers G and Benjamin LE. Tumorigenesis and the angiogenic switch. Nat Rev Cancer. 2003; 3(6):401-410.
26. The metabolism of tumours. Investigations from the Kaiser-Wilhelm Institute for Biology, Berlin-Dahlem. Edited by Otto Warburg, Kaiser-Wilhelm Institute for Biology, Berlin-Dahlem. Translated from the German edition, with accounts of additional recent researches, by Frank Dickens, M.A., Ph.D., whole-time worker for the Medical Research Council, Courtauld Institute of Biochemistry, Middlesex Hospital, London. Demy 8vo. Pp. 327 + xxix. Illustrated. 1930. London: Constable & Co. Ltd. 40s. net. British Journal of Surgery. 1931; 19(73):168-168.
27. Warburg O. On respiratory impairment in cancer cells. Science. 1956; 124(3215):269-270.
28. Krakhmal NV, Zavyalova MV, Denisov EV, Vtorushin SV and Perelmuter VM. Cancer Invasion: Patterns and Mechanisms. Acta Naturae. 2015; 7(2):17-28.
29. Coussens LM and Werb Z. Inflammation and cancer. Nature. 2002; 420(6917):860-867.
30. DeNardo DG, Andreu P and Coussens LM. Interactions between lymphocytes and myeloid cells regulate pro- versus anti-tumor immunity. Cancer metastasis reviews. 2010; 29(2):309-316.
31. Gasparoto TH, de Souza Malaspina TS, Benevides L, de Melo EJ, Jr., Costa MR, Damante JH, Ikoma MR, Garlet GP, Cavassani KA, da Silva JS and Campanelli AP. Patients with oral squamous cell carcinoma are characterized by increased frequency of suppressive regulatory T cells in the blood and tumor microenvironment. Cancer immunology, immunotherapy : CII. 2010; 59(6):819-828.
32. Yokokawa J, Cereda V, Remondo C, Gulley JL, Arlen PM, Schlom J and Tsang KY. Enhanced Functionality of CD4+ CD25high FoxP3+ Regulatory T Cells in the Peripheral Blood of Patients with Prostate Cancer. Clinical Cancer Research. 2008; 14(4):1032-1040.
33. Rao CV, Yamada HY, Yao Y and Dai W. Enhanced genomic instabilities caused by deregulated microtubule dynamics and chromosome segregation: a perspective from genetic studies in mice. Carcinogenesis. 2009; 30(9):1469-1474.
34. Negrini S, Gorgoulis VG and Halazonetis TD. Genomic instability - an evolving hallmark of cancer. Nat Rev Mol Cell Biol. 2010; 11(3):220-228.

35. Aaltonen LA, Peltomaki P, Leach FS, Sistonen P, Pylkkanen L, Mecklin JP, Jarvinen H, Powell SM, Jen J, Hamilton SR, et al. Clues to the pathogenesis of familial colorectal cancer. Science. 1993; 260(5109):812-816.
36. Yao Y and Dai W. Genomic Instability and Cancer. Journal of carcinogenesis & mutagenesis. 2014; 5:1000165.
37. Al-Tassan N. Inherited variants of MYH associated with somatic G:C-T:A mutations in colorectal tumors. Nature Genet. 2002; 30:227-232.
38. Heim S. Boveri at 100: Boveri, chromosomes and cancer. The Journal of pathology. 2014; 234(2):138-141.
39. Heim S and Mitelman F. (2015). Cancer Cytogenetics: Chromosomal and Molecular Genetic Aberrations of Tumor Cells. (Elsevier).
40. Zhang S and Kipps TJ. The Pathogenesis of Chronic Lymphocytic Leukemia. Annual review of pathology. 2014; 9:103-118.
41. Mitelman F, Johansson B and Mertens F. The impact of translocations and gene fusions on cancer causation. Nat Rev Cancer. 2007; 7(4):233-245.
42. Greaves MF, Maia AT, Wiemels JL and Ford AM. Leukemia in twins: lessons in natural history. Blood. 2003; 102(7):2321-2333.
43. Greaves MF and Wiemels J. Origins of chromosome translocations in childhood leukaemia. Nature Rev Cancer. 2003; 3:639-649.
44. Mitelman F, Johansson B and Mertens F. Fusion genes and rearranged genes as a linear function of chromosome aberrations in cancer. Nature Genet. 2004; 36:331-334.
45. Parker BC and Zhang W. Fusion genes in solid tumors: an emerging target for cancer diagnosis and treatment. Chinese Journal of Cancer. 2013; 32(11):594-603.
46. Soda M, Choi YL, Enomoto M, Takada S, Yamashita Y, Ishikawa S, Fujiwara S-i, Watanabe H, Kurashina K, Hatanaka H, Bando M, Ohno S, Ishikawa Y, Aburatani H, Niki T, Sohara Y, et al. Identification of the transforming EML4-ALK fusion gene in non-small-cell lung cancer. Nature. 2007; 448(7153):561-566.
47. Dang CV. MYC on the Path to Cancer. Cell. 2012; 149(1):22-35.
48. Kas K, Voz ML, Roijer E, Astrom AK, Meyen E, Stenman G and Van de Ven WJ. Promoter swapping between the genes for a novel zinc finger protein and beta-catenin in pleiomorphic adenomas with t(3;8)(p21;q12) translocations. Nat Genet. 1997; 15(2):170-174.
49. Broberg K, Zhang M, Strombeck B, Isaksson M, Nilsson M, Mertens F, Mandahl N and Panagopoulos I. Fusion of RDC1 with HMGA2 in lipomas as the result of chromosome aberrations involving 2q35-37 and 12q13-15. International journal of oncology. 2002; 21(2):321-326.
50. Saleem M and Yusoff NM. Fusion genes in malignant neoplastic disorders of haematopoietic system. Hematology. 2016; 21(9):501-512.
51. Robinson DR, Wu YM, Kalyana-Sundaram S, Cao X, Lonigro RJ, Sung YS, Chen CL, Zhang L, Wang R, Su F, Iyer MK, Roychowdhury S, Siddiqui J, Pienta KJ, Kunju LP, Talpaz M, et al. Identification of recurrent NAB2-STAT6 gene fusions in solitary fibrous tumor by integrative sequencing. Nat Genet. 2013; 45(2):180-185.
52. Panagopoulos I, Storlazzi CT, Fletcher CD, Fletcher JA, Nascimento A, Domanski HA, Wejde J, Brosjo O, Rydholm A, Isaksson M, Mandahl N and Mertens F. The chimeric FUS/CREB3l2 gene is specific for low-grade fibromyxoid sarcoma. Genes Chromosomes Cancer. 2004; 40(3):218-228.
53. Sorensen PH, Lynch JC, Qualman SJ, Tirabosco R, Lim JF, Maurer HM, Bridge JA, Crist WM, Triche TJ and Barr FG. PAX3-FKHR and PAX7-FKHR gene fusions are prognostic indicators in alveolar rhabdomyosarcoma: a report from the children's oncology group. Journal of clinical oncology : official journal of the American Society of Clinical Oncology. 2002; 20(11):2672-2679.

54. Simon R, Kluth M, Hube-Magg C, Tsourlakis MC, Minner S, Burdelski C, Grupp K, Melling N, Heumann A, Koop C, Steuber T, Graefen M, Huland H, Sauter G and Schlomm T. Abstract 5019: Impact of the ERG status on the prognostic relevance of prostate cancer biomarkers: a survey of 13,000 patients. Cancer Research. 2016; 76(14 Supplement):5019.
55. Yang Z, Yu L and Wang Z. PCA3 and TMPRSS2-ERG gene fusions as diagnostic biomarkers for prostate cancer. Chinese Journal of Cancer Research. 2016; 28(1):65-71.
56. Hägglöf C, Hammarsten P, Strömvall K, Egevad L, Josefsson A, Stattin P, Granfors T and Bergh A. TMPRSS2-ERG Expression Predicts Prostate Cancer Survival and Associates with Stromal Biomarkers. PLoS ONE. 2014; 9(2):e86824.
57. Awad MM and Shaw AT. ALK Inhibitors in Non–Small Cell Lung Cancer: Crizotinib and Beyond. Clinical advances in hematology & oncology : H&O. 2014; 12(7):429-439.
58. Bamford S, Dawson E, Forbes S, Clements J, Pettett R, Dogan A, Flanagan A, Teague J, Futreal PA, Stratton MR and Wooster R. The COSMIC (Catalogue of Somatic Mutations in Cancer) database and website. Br J Cancer. 2004; 91(2):355-358.
59. Zhang J, Baran J, Cros A, Guberman JM, Haider S, Hsu J, Liang Y, Rivkin E, Wang J, Whitty B, Wong-Erasmus M, Yao L and Kasprzyk A. International Cancer Genome Consortium Data Portal--a one-stop shop for cancer genomics data. Database : the journal of biological databases and curation. 2011; 2011:bar026.
60. Vogelstein B, Papadopoulos N, Velculescu VE, Zhou S, Diaz LA, Jr. and Kinzler KW. Cancer genome landscapes. Science. 2013; 339(6127):1546-1558.
61. Jones S, Stransky N, McCord CL, Cerami E, Lagowski J, Kelly D, Angiuoli SV, Sausen M, Kann L, Shukla M, Makar R, Wood LD, Diaz Jr LA, Lengauer C and Velculescu VE. Genomic analyses of gynaecologic carcinosarcomas reveal frequent mutations in chromatin remodelling genes. Nature Communications. 2014; 5:5006.
62. Khurana E, Fu Y, Chakravarty D, Demichelis F, Rubin MA and Gerstein M. Role of non-coding sequence variants in cancer. Nature reviews Genetics. 2016; 17(2):93-108.
63. Singer G, Oldt R, 3rd, Cohen Y, Wang BG, Sidransky D, Kurman RJ and Shih Ie M. Mutations in BRAF and KRAS characterize the development of low-grade ovarian serous carcinoma. Journal of the National Cancer Institute. 2003; 95(6):484-486.
64. Micci F, Haugom L, Abeler VM, Davidson B, Trope CG and Heim S. Genomic profile of ovarian carcinomas. BMC Cancer. 2014; 14:315. doi: 10.1186/1471-2407-14-315.
65. Takeda T, Banno K, Okawa R, Yanokura M, Iijima M, Irie-Kunitomi H, Nakamura K, Iida M, Adachi M, Umene K, Nogami Y, Masuda K, Kobayashi Y, Tominaga E and Aoki D. ARID1A gene mutation in ovarian and endometrial cancers (Review). Oncology reports. 2016; 35(2):607-613.
66. Kuo K-T, Mao T-L, Jones S, Veras E, Ayhan A, Wang T-L, Glas R, Slamon D, Velculescu VE, Kurman RJ and Shih I-M. Frequent Activating Mutations of PIK3CA in Ovarian Clear Cell Carcinoma. The American Journal of Pathology. 2009; 174(5):1597-1601.
67. Catasus L, Bussaglia E, Rodriguez I, Gallardo A, Pons C, Irving JA and Prat J. Molecular genetic alterations in endometrioid carcinomas of the ovary: similar frequency of beta-catenin abnormalities but lower rate of microsatellite instability and PTEN alterations than in uterine endometrioid carcinomas. Hum Pathol. 2004; 35(11):1360-1368.
68. Cuatrecasas M, Villanueva A, Matias-Guiu X and Prat J. K-ras mutations in mucinous ovarian tumors: a clinicopathologic and molecular study of 95 cases. Cancer. 1997; 79(8):1581-1586.
69. Ryland GL, Hunter SM, Doyle MA, Caramia F, Li J, Rowley SM, Christie M, Allan PE, Stephens AN, Bowtell DDL, Australian Ovarian Cancer Study G, Campbell IG and Gorringe KL. Mutational landscape of mucinous ovarian carcinoma and its neoplastic precursors. Genome medicine. 2015; 7(1):87.

70. Prat J. New insights into ovarian cancer pathology. Ann Oncol. 2012; 23 Suppl 10:x111-x117.
71. Integrated genomic analyses of ovarian carcinoma. Nature. 2011; 474(7353):609-615.
72. Micci F, Haugom L, Abeler VM, Trope CG, Danielsen HE and Heim S. Consistent numerical chromosome aberrations in thecofibromas of the ovary. Virchows Archiv : an international journal of pathology. 2008; 452(3):269-276.
73. Fuller PJ, Leung D and Chu S. Genetics and genomics of ovarian sex cord-stromal tumors. Clinical genetics. 2017; 91(2):285-291.
74. Yoshihara K, Wang Q, Torres-Garcia W, Zheng S, Vegesna R, Kim H and Verhaak RG. The landscape and therapeutic relevance of cancer-associated transcript fusions. Oncogene. 2015; 34(37):4845-4854.
75. Patch AM, Christie EL, Etemadmoghadam D, Garsed DW, George J, Fereday S, Nones K, Cowin P, Alsop K, Bailey PJ, Kassahn KS, Newell F, Quinn MC, Kazakoff S, Quek K, Wilhelm-Benartzi C, et al. Whole-genome characterization of chemoresistant ovarian cancer. Nature. 2015; 521(7553):489-494.
76. Earp MA, Raghavan R, Li Q, Dai J, Winham SJ, Cunningham JM, Natanzon Y, Kalli KR, Hou X, Weroha SJ, Haluska P, Lawrenson K, Gayther SA, Wang C, Goode EL and Fridley BL. Characterization of fusion genes in common and rare epithelial ovarian cancer histologic subtypes. Oncotarget. 2017.
77. Salzman J, Marinelli RJ, Wang PL, Green AE, Nielsen JS, Nelson BH, Drescher CW and Brown PO. ESRRA-C11orf20 is a recurrent gene fusion in serous ovarian carcinoma. PLoS biology. 2011; 9(9):e1001156.
78. Kannan K, Coarfa C, Rajapakshe K, Hawkins SM, Matzuk MM, Milosavljevic A and Yen L. CDKN2D-WDFY2 is a cancer-specific fusion gene recurrent in high-grade serous ovarian carcinoma. PLoS Genet. 2014; 10(3):e1004216.
79. Kannan K, Coarfa C, Chao PW, Luo L, Wang Y, Brinegar AE, Hawkins SM, Milosavljevic A, Matzuk MM and Yen L. Recurrent BCAM-AKT2 fusion gene leads to a constitutively activated AKT2 fusion kinase in high-grade serous ovarian carcinoma. Proceedings of the National Academy of Sciences of the United States of America. 2015; 112(11):E1272-1277.
80. Micci F, Panagopoulos I, Thorsen J, Davidson B, Trope CG and Heim S. Low frequency of ESRRA-C11orf20 fusion gene in ovarian carcinomas. PLoS biology. 2014; 12(2):e1001784.
81. Lee RC, Feinbaum RL and Ambros V. The C. elegans heterochronic gene lin-4 encodes small RNAs with antisense complementarity to lin-14. Cell. 1993; 75(5):843-854.
82. MacFarlane L-A and Murphy PR. MicroRNA: Biogenesis, Function and Role in Cancer. Current Genomics. 2010; 11(7):537-561.
83. Rajewsky N. L(ou)sy miRNA targets? Nature structural & molecular biology. 2006; 13(9):754-755.
84. Winter J, Jung S, Keller S, Gregory RI and Diederichs S. Many roads to maturity: microRNA biogenesis pathways and their regulation. Nat Cell Biol. 2009; 11(3):228-234.
85. Lin S and Gregory RI. MicroRNA biogenesis pathways in cancer. Nature reviews Cancer. 2015; 15(6):321-333.
86. Shivdasani RA. MicroRNAs: regulators of gene expression and cell differentiation. Blood. 2006; 108(12):3646-3653.
87. Pratt AJ and MacRae IJ. The RNA-induced Silencing Complex: A Versatile Gene-silencing Machine. The Journal of biological chemistry. 2009; 284(27):17897-17901.
88. Volinia S, Calin GA, Liu CG, Ambs S, Cimmino A, Petrocca F, Visone R, Iorio M, Roldo C, Ferracin M, Prueitt RL, Yanaihara N, Lanza G, Scarpa A, Vecchione A, Negrini M, et al. A microRNA expression signature of human solid tumors defines cancer gene targets. Proceedings of the National Academy of Sciences of the United States of America. 2006; 103(7):2257-2261.
89. Fatica A and Fazi F. MicroRNA-Regulated Pathways in Hematological Malignancies: How to Avoid Cells Playing Out of Tune. International Journal of Molecular Sciences. 2013; 14(10):20930-20953.
90. Svoronos AA, Engelman DM and Slack FJ. OncomiR or Tumor Suppressor? The Duplicity of MicroRNAs in Cancer. Cancer Res. 2016; 76(13):3666-3670.
91. Xue L, Su D, Li D, Gao W, Yuan R and Pang W. MiR-200 Regulates Epithelial-Mesenchymal Transition in Anaplastic Thyroid Cancer via EGF/EGFR Signaling. Cell biochemistry and biophysics. 2015; 72(1):185-190.
92. Rahmani F, Avan A, Hashemy SI and Hassanian SM. Role of Wnt/beta-catenin signaling regulatory microRNAs in the pathogenesis of colorectal cancer. J Cell Physiol. 2017.
93. Frenzel A, Lovén J and Henriksson MA. Targeting MYC-Regulated miRNAs to Combat Cancer. Genes & Cancer. 2010; 1(6):660-667.
94. Kan CWS, Howell VM, Hahn MA and Marsh DJ. Genomic alterations as mediators of miRNA dysregulation in ovarian cancer. Genes, Chromosomes and Cancer. 2015; 54(1):1-19.
95. Zhang L, Huang J, Yang N, Greshock J, Megraw MS, Giannakakis A, Liang S, Naylor TL, Barchetti A, Ward MR, Yao G, Medina A, O’Brien-Jenkins A, Katsaros D, Hatzigeorgiou A, Gimotty PA, et al. microRNAs exhibit high frequency genomic alterations in human cancer. Proceedings of the National Academy of Sciences of the United States of America. 2006; 103(24):9136-9141.
96. Czubak K, Lewandowska MA, Klonowska K, Roszkowski K, Kowalewski J, Figlerowicz M and Kozlowski P. High copy number variation of cancer-related microRNA genes and frequent amplification of DICER1 and DROSHA in lung cancer. Oncotarget. 2015; 6(27):23399-23416.
97. Torres A, Torres K, Paszkowski T, Jodłowska-Jędrych B, Radomański T, Książek A and Maciejewski R. Major regulators of microRNAs biogenesis Dicer and Drosha are down-regulated in endometrial cancer. Tumour Biology. 2011; 32(4):769-776.
98. Rakheja D, Chen KS, Liu Y, Shukla AA, Schmid V, Chang T-C, Khokhar S, Wickiser JE, Karandikar NJ, Malter JS, Mendell JT and Amatruda JF. Somatic mutations in DROSHA and DICER1 impair microRNA biogenesis through distinct mechanisms in Wilms tumors. Nature communications. 2014; 2:4802-4802.
99. Baer C, Claus R and Plass C. Genome-Wide Epigenetic Regulation of miRNAs in Cancer. Cancer Research. 2013; 73(2):473.
100. Suzuki H, Maruyama R, Yamamoto E and Kai M. Epigenetic alteration and microRNA dysregulation in cancer. Frontiers in Genetics. 2013; 4:258.
101. Tsai KW, Hu LY, Wu CW, Li SC, Lai CH, Kao HW, Fang WL and Lin WC. Epigenetic regulation of miR-196b expression in gastric cancer. Genes Chromosomes Cancer. 2010; 49(11):969-980.
102. Yuan R, Zhi Q, Zhao H, Han Y, Gao L, Wang B, Kou Z, Guo Z, He S, Xue X and Hu H. Upregulated expression of miR-106a by DNA hypomethylation plays an oncogenic role in hepatocellular carcinoma. Tumour Biol. 2015; 36(4):3093-3100.
103. Fornari F, Milazzo M, Chieco P, Negrini M, Marasco E, Capranico G, Mantovani V, Marinello J, Sabbioni S, Callegari E, Cescon M, Ravaioli M, Croce CM, Bolondi L and Gramantieri L. In hepatocellular carcinoma miR-519d is up-regulated by p53 and DNA hypomethylation and targets CDKN1A/p21, PTEN, AKT3 and TIMP2. The Journal of pathology. 2012; 227(3):275-285.
104. Shaham L, Binder V, Gefen N, Borkhardt A and Izraeli S. MiR-125 in normal and malignant hematopoiesis. Leukemia. 2012; 26(9):2011-2018.

99 105. Liang L, Wong CM, Ying Q, Fan DN, Huang S, Ding J, Yao J, Yan M, Li J, Yao M, Ng IO and He X. MicroRNA-125b suppressesed human liver cancer cell proliferation and metastasis by directly targeting oncogene LIN28B2. Hepatology (Baltimore, Md). 2010; 52(5):1731-1740. 106. Guan Y, Yao H, Zheng Z, Qiu G and Sun K. MiR-125b targets BCL3 and suppresses ovarian cancer proliferation. Int J Cancer. 2011; 128(10):2274-2283. 107. Liu LH, Li H, Li JP, Zhong H, Zhang HC, Chen J and Xiao T. miR-125b suppresses the proliferation and migration of osteosarcoma cells through down-regulation of STAT3. Biochemical and biophysical research communications. 2011; 416(1-2):31-38. 108. Jiang F, Liu T, He Y, Yan Q, Chen X, Wang H and Wan X. MiR-125b promotes proliferation and migration of type II endometrial carcinoma cells through targeting TP53INP1 tumor suppressor in vitro and in vivo. BMC Cancer. 2011; 11(1):425. 109. Shi XB, Xue L, Yang J, Ma AH, Zhao J, Xu M, Tepper CG, Evans CP, Kung HJ and deVere White RW. An androgen-regulated miRNA suppresses Bak1 expression and induces androgen-independent growth of prostate cancer cells. Proceedings of the National Academy of Sciences of the United States of America. 2007; 104(50):19983-19988. 110. Tang F, Zhang R, He Y, Zou M, Guo L and Xi T. MicroRNA-125b induces metastasis by targeting STARD13 in MCF-7 and MDA-MB-231 breast cancer cells. PLoS One. 2012; 7(5):e35435. 111. Krichevsky AM and Gabriely G. miR-21: a small multi-faceted RNA. Journal of cellular and molecular medicine. 2009; 13(1):39-53. 112. Buscaglia LE and Li Y. Apoptosis and the target genes of microRNA-21. Chin J Cancer. 2011; 30(6):371-380. 113. Calin GA, Sevignani C, Dumitru CD, Hyslop T, Noch E, Yendamuri S, Shimizu M, Rattan S, Bullrich F, Negrini M and Croce CM. Human microRNA genes are frequently located at fragile sites and genomic regions involved in cancers. Proceedings of the National Academy of Sciences of the United States of America. 2004; 101(9):2999-3004. 
114. Johnson SM, Grosshans H, Shingara J, Byrom M, Jarvis R, Cheng A, Labourier E, Reinert KL, Brown D and Slack FJ. RAS is regulated by the let-7 microRNA family. Cell. 2005; 120(5):635-647.
115. Wang YY, Ren T, Cai YY and He XY. MicroRNA let-7a inhibits the proliferation and invasion of nonsmall cell lung cancer cell line 95D by regulating K-Ras and HMGA2 gene expression. Cancer Biother Radiopharm. 2013; 28(2):131-137.
116. Wu A, Wu K, Li J, Mo Y, Lin Y, Wang Y, Shen X, Li S, Li L and Yang Z. Let-7a inhibits migration, invasion and epithelial-mesenchymal transition by targeting HMGA2 in nasopharyngeal carcinoma. J Transl Med. 2015; 13:105. doi: 10.1186/s12967-015-0462-8.
117. Johnson CD, Esquela-Kerscher A, Stefani G, Byrom M, Kelnar K, Ovcharenko D, Wilson M, Wang X, Shelton J, Shingara J, Chin L, Brown D and Slack FJ. The let-7 MicroRNA Represses Cell Proliferation Pathways in Human Cells. Cancer Research. 2007; 67(16):7713.
118. Lu J, Getz G, Miska EA, Alvarez-Saavedra E, Lamb J, Peck D, Sweet-Cordero A, Ebert BL, Mak RH, Ferrando AA, Downing JR, Jacks T, Horvitz HR and Golub TR. MicroRNA expression profiles classify human cancers. Nature. 2005; 435(7043):834-838.
119. Gefen N, Binder V, Zaliova M, Linka Y, Morrow M, Novosel A, Edry L, Hertzberg L, Shomron N, Williams O, Trka J, Borkhardt A and Izraeli S. Hsa-mir-125b-2 is highly expressed in childhood ETV6/RUNX1 (TEL/AML1) leukemias and confers survival advantage to growth inhibitory signals independent of p53. Leukemia. 2010; 24.
120. Liu N, Chen N-Y, Cui R-X, Li W-F, Li Y, Wei R-R, Zhang M-Y, Sun Y, Huang B-J, Chen M, He Q-M, Jiang N, Chen L, Cho WCS, Yun J-P, Zeng J, et al. Prognostic value of a microRNA signature in nasopharyngeal carcinoma: a microRNA expression analysis. The Lancet Oncology. 13(6):633-641.
121. Hu X, Macdonald DM, Huettner PC, Feng Z, El Naqa IM, Schwarz JK, Mutch DG, Grigsby PW, Powell SN and Wang X. A miR-200 microRNA cluster as prognostic marker in advanced ovarian cancer. Gynecologic oncology. 2009; 114(3):457-464.
122. Zhang J-X, Song W, Chen Z-H, Wei J-H, Liao Y-J, Lei J, Hu M, Chen G-Z, Liao B, Lu J, Zhao H-W, Chen W, He Y-L, Wang H-Y, Xie D and Luo J-H. Prognostic and predictive value of a microRNA signature in stage II colon cancer: a microRNA expression analysis. The Lancet Oncology. 14(13):1295-1306.
123. Yang C, Wang C, Chen X, Chen S, Zhang Y, Zhi F, Wang J, Li L, Zhou X, Li N, Pan H, Zhang J, Zen K, Zhang CY and Zhang C. Identification of seven serum microRNAs from a genome-wide serum microRNA expression profile as potential noninvasive biomarkers for malignant astrocytomas. Int J Cancer. 2013; 132(1):116-127.
124. Zhang S, Lu Z, Unruh AK, Ivan C, Baggerly KA and Calin GA. Clinically relevant microRNAs in ovarian cancer. Mol Cancer Res. 2015; 13.
125. Miles GD, Seiler M, Rodriguez L, Rajagopal G and Bhanot G. Identifying microRNA/mRNA dysregulations in ovarian cancer. BMC Res Notes. 2012; 5.
126. Iorio MV, Visone R, Di Leva G, Donati V, Petrocca F, Casalini P, Taccioli C, Volinia S, Liu C-G, Alder H, Calin GA, Ménard S and Croce CM. MicroRNA Signatures in Human Ovarian Cancer. Cancer Research. 2007; 67(18):8699-8707.
127. Calura E, Fruscio R, Paracchini L, Bignotti E, Ravaggi A, Martini P, Sales G, Beltrame L, Clivio L, Ceppi L, Di Marino M, Fuso Nerini I, Zanotti L, Cavalieri D, Cattoretti G, Perego P, et al. miRNA Landscape in Stage I Epithelial Ovarian Cancer Defines the Histotype Specificities. Clinical Cancer Research. 2013; 19(15):4114-4123.
128. Zhang L, Volinia S, Bonome T, Calin GA, Greshock J, Yang N, Liu CG, Giannakakis A, Alexiou P, Hasegawa K, Johnstone CN, Megraw MS, Adams S, Lassus H, Huang J, Kaur S, et al. Genomic and epigenetic alterations deregulate microRNA expression in human epithelial ovarian cancer. Proceedings of the National Academy of Sciences of the United States of America. 2008; 105(19):7004-7009.
129. Corney DC, Hwang CI, Matoso A, Vogt M, Flesken-Nikitin A, Godwin AK, Kamat AA, Sood AK, Ellenson LH, Hermeking H and Nikitin AY. Frequent downregulation of miR-34 family in human ovarian cancers. Clin Cancer Res. 2010; 16(4):1119-1128.
130. Lee C-H, Subramanian S, Beck AH, Espinosa I, Senz J, Zhu SX, Huntsman D, van de Rijn M and Gilks CB. MicroRNA Profiling of BRCA1/2 Mutation-Carrying and Non-Mutation-Carrying High-Grade Serous Carcinomas of Ovary. PLoS ONE. 2009; 4(10):e7314.
131. Choi YW, Song YS, Lee H, Yi K, Kim Y-B, Suh KW and Lee D. MicroRNA Expression Signatures Associated With BRAF-Mutated Versus KRAS-Mutated Colorectal Cancers. Medicine. 2016; 95(15):e3321.
132. Patnaik SK, Dahlgaard J, Mazin W, Kannisto E, Jensen T, Knudsen S and Yendamuri S. Expression of MicroRNAs in the NCI-60 Cancer Cell-Lines. PLoS ONE. 2012; 7(11):e49918.
133. Sun H, Shao Y, Huang J, Sun S, Liu Y, Zhou P and Yang H. Prognostic value of microRNA-9 in cancers: a systematic review and meta-analysis. Oncotarget. 2016; 7(41):67020-67032.
134. Sun C, Li N, Yang Z, Zhou B, He Y, Weng D, Fang Y, Wu P, Chen P, Yang X, Ma D, Zhou J and Chen G. miR-9 regulation of BRCA1 and ovarian cancer sensitivity to cisplatin and PARP inhibition. Journal of the National Cancer Institute. 2013; 105(22):1750-1758.
135. Brozovic A, Duran GE, Wang YC, Francisco EB and Sikic BI. The miR-200 family differentially regulates sensitivity to paclitaxel and carboplatin in human ovarian carcinoma OVCAR-3 and MES-OV cells. Molecular oncology. 2015; 9(8):1678-1693.

136. Schwarzenbach H, Nishida N, Calin GA and Pantel K. Clinical relevance of circulating cell-free microRNAs in cancer. Nat Rev Clin Oncol. 2014; 11.
137. Taylor DD and Gercel-Taylor C. MicroRNA signatures of tumor-derived exosomes as diagnostic biomarkers of ovarian cancer. Gynecologic oncology. 2008; 110.
138. Kan CW, Hahn MA, Gard GB, Maidens J, Huh JY, Marsh DJ and Howell VM. Elevated levels of circulating microRNA-200 family members correlate with serous epithelial ovarian cancer. BMC Cancer. 2012; 12:627.
139. Zuberi M, Mir R, Das J, Ahmad I, Javid J and Yadav P. Expression of serum miR-200a, miR-200b, and miR-200c as candidate biomarkers in epithelial ovarian cancer and their association with clinicopathological features. Clin Transl Oncol. 2015; 17.
140. Xu YZ, Xi QH, Ge WL and Zhang XQ. Identification of serum microRNA-21 as a biomarker for early detection and prognosis in human epithelial ovarian cancer. Asian Pac J Cancer Prev. 2013; 14(2):1057-1060.
141. Vaksman O, Tropé C, Davidson B and Reich R. Exosome-derived miRNAs and ovarian carcinoma progression. Carcinogenesis. 2014; 35.
142. Cleynen I and Van de Ven WJ. The HMGA proteins: a myriad of functions (Review). Int J Oncol. 2008; 32(2):289-305.
143. Zhao K, Kas E, Gonzalez E and Laemmli UK. SAR-dependent mobilization of histone H1 by HMG-I/Y in vitro: HMG-I/Y is enriched in H1-depleted chromatin. The EMBO journal. 1993; 12(8):3237-3247.
144. Chiappetta G, Avantaggiato V, Visconti R, Fedele M, Battista S, Trapasso F, Merciai BM, Fidanza V, Giancotti V, Santoro M, Simeone A and Fusco A. High level expression of the HMGI (Y) gene during embryonic development. Oncogene. 1996; 13(11):2439-2446.
145. Zhou X, Benson KF, Ashar HR and Chada K. Mutation responsible for the mouse pygmy phenotype in the developmentally regulated factor HMGI-C. Nature. 1995; 376(6543):771-774.
146. Ayoubi TA, Jansen E, Meulemans SM and Van de Ven WJ. Regulation of HMGIC expression: an architectural transcription factor involved in growth control and development. Oncogene. 1999; 18(36):5076-5087.
147. Borrmann L, Wilkening S and Bullerdiek J. The expression of HMGA genes is regulated by their 3'UTR. Oncogene. 2001; 20(33):4537-4541.
148. Arlotta P, Tai AK, Manfioletti G, Clifford C, Jay G and Ono SJ. Transgenic mice expressing a truncated form of the high mobility group I-C protein develop adiposity and an abnormally high prevalence of lipomas. J Biol Chem. 2000; 275(19):14394-14400.
149. Kristjansdottir K, Fogarty EA and Grimson A. Systematic analysis of the Hmga2 3' UTR identifies many independent regulatory sequences and a novel interaction between distal sites. RNA. 2015; 21(7):1346-1360.
150. Wang T, Wang G, Hao D, Liu X, Wang D, Ning N and Li X. Aberrant regulation of the LIN28A/LIN28B and let-7 loop in human malignant tumors and its effects on the hallmarks of cancer. Mol Cancer. 2015; 14:125. doi: 10.1186/s12943-015-0402-5.
151. Suh SS, Yoo JY, Cui R, Kaur B, Huebner K, Lee TK, Aqeilan RI and Croce CM. FHIT suppresses epithelial-mesenchymal transition (EMT) and metastasis in lung cancer through modulation of microRNAs. PLoS Genet. 2014; 10(10):e1004652.
152. Yang X, Zhao Q, Yin H, Lei X and Gan R. MiR-33b-5p sensitizes gastric cancer cells to chemotherapy drugs via inhibiting HMGA2 expression. Journal of drug targeting. 2017:1-16.
153. Lin Y, Liu AY, Fan C, Zheng H, Li Y, Zhang C, Wu S, Yu D, Huang Z, Liu F, Luo Q, Yang CJ and Ouyang G. MicroRNA-33b Inhibits Breast Cancer Metastasis by Targeting HMGA2, SALL4 and Twist1. Sci Rep. 2015; 5:9995. doi: 10.1038/srep09995.

154. Wang H, Sun Z, Wang Y, Hu Z, Zhou H, Zhang L, Hong B, Zhang S and Cao X. miR-33-5p, a novel mechano-sensitive microRNA promotes osteoblast differentiation by targeting Hmga2. Sci Rep. 2016; 6:23170.
155. Di Cello F, Hillion J, Hristov A, Wood LJ, Mukherjee M, Schuldenfrei A, Kowalski J, Bhattacharya R, Ashfaq R and Resar LM. HMGA2 participates in transformation in human lung cancer. Mol Cancer Res. 2008; 6(5):743-750.
156. Wang X, Liu X, Li AY, Chen L, Lai L, Lin HH, Hu S, Yao L, Peng J, Loera S, Xue L, Zhou B, Zhou L, Zheng S, Chu P, Zhang S, et al. Overexpression of HMGA2 promotes metastasis and impacts survival of colorectal cancers. Clin Cancer Res. 2011; 17(8):2570-2580.
157. Wu J and Wei JJ. HMGA2 and high-grade serous ovarian carcinoma. Journal of molecular medicine (Berlin, Germany). 2013; 91(10):1155-1165.
158. Meyer B, Krisponeit D, Junghanss C, Murua Escobar H and Bullerdiek J. Quantitative expression analysis in peripheral blood of patients with chronic myeloid leukaemia: correlation between HMGA2 expression and white blood cell count. Leukemia & lymphoma. 2007; 48(10):2008-2013.
159. Storlazzi CT, Albano F, Locunsolo C, Lonoce A, Funes S, Guastadisegni MC, Cimarosto L, Impera L, D'Addabbo P, Panagopoulos I, Specchia G and Rocchi M. t(3;12)(q26;q14) in polycythemia vera is associated with upregulation of the HMGA2 gene. Leukemia. 2006; 20(12):2190-2192.
160. Li D, Lin HH, McMahon M, Ma H and Ann DK. Oncogenic raf-1 induces the expression of non-histone chromosomal architectural protein HMGI-C via a p44/p42 mitogen-activated protein kinase-dependent pathway in salivary epithelial cells. The Journal of biological chemistry. 1997; 272(40):25062-25070.
161. Wend P, Runke S, Wend K, Anchondo B, Yesayan M, Jardon M, Hardie N, Loddenkemper C, Ulasov I, Lesniak MS, Wolsky R, Bentolila LA, Grant SG, Elashoff D, Lehr S, Latimer JJ, et al. WNT10B/beta-catenin signalling induces HMGA2 and proliferation in metastatic triple-negative breast cancer. EMBO molecular medicine. 2013; 5(2):264-279.
162. Schoenmakers EF, Wanschura S, Mols R, Bullerdiek J, Van den Berghe H and Van de Ven WJ. Recurrent rearrangements in the high mobility group protein gene, HMGI-C, in benign mesenchymal tumours. Nat Genet. 1995; 10(4):436-444.
163. Agostini A, Gorunova L, Bjerkehagen B, Lobmaier I, Heim S and Panagopoulos I. Molecular characterization of the t(4;12)(q27~28;q14~15) chromosomal rearrangement in lipoma. Oncology Letters. 2016; 12(3):1701-1704.
164. Mandahl N, Hoglund M, Mertens F, Rydholm A, Willen H, Brosjo O and Mitelman F. Cytogenetic aberrations in 188 benign and borderline adipose tissue tumors. Genes Chromosomes Cancer. 1994; 9(3):207-215.
165. Panagopoulos I, Gorunova L, Agostini A, Lobmaier I, Bjerkehagen B and Heim S. Fusion of the HMGA2 and C9orf92 genes in myolipoma with t(9;12)(p22;q14). Diagnostic Pathology. 2016; 11:22.
166. Takahashi T, Nagai N, Oda H, Ohama K, Kamada N and Miyagawa K. Evidence for RAD51L1/HMGIC fusion in the pathogenesis of uterine leiomyoma. Genes Chromosomes Cancer. 2001; 30(2):196-201.
167. Kazmierczak B, Meyer-Bolte K, Tran KH, Wockel W, Breightman I, Rosigkeit J, Bartnitzke S and Bullerdiek J. A high frequency of tumors with rearrangements of genes of the HMGI(Y) family in a series of 191 pulmonary chondroid hamartomas. Genes Chromosomes Cancer. 1999; 26(2):125-133.
168. Geurts JM, Schoenmakers EF and Van de Ven WJ. Molecular characterization of a complex chromosomal rearrangement in a pleomorphic salivary gland adenoma involving the 3'-UTR of HMGIC. Cancer Genet Cytogenet. 1997; 95(2):198-205.

169. Nyquist KB, Panagopoulos I, Thorsen J, Roberto R, Wik HS, Tierens A, Heim S and Micci F. t(12;13)(q14;q31) leading to HMGA2 upregulation in acute myeloid leukaemia. British journal of haematology. 2012; 157(6):769-771.
170. Fedele M, Berlingieri MT, Scala S, Chiariotti L, Viglietto G, Rippel V, Bullerdiek J, Santoro M and Fusco A. Truncated and chimeric HMGI-C genes induce neoplastic transformation of NIH3T3 murine fibroblasts. Oncogene. 1998; 17(4):413-418.
171. Morishita A, Zaidi MR, Mitoro A, Sankarasharma D, Szabolcs M, Okada Y, D'Armiento J and Chada K. HMGA2 is a driver of tumor metastasis. Cancer Res. 2013; 73(14):4289-4299.
172. Liu TP, Huang CC, Yeh KT, Ke TW, Wei PL, Yang JR and Cheng YW. Down-regulation of let-7a-5p predicts lymph node metastasis and prognosis in colorectal cancer: Implications for chemotherapy. Surgical oncology. 2016; 25(4):429-434.
173. Wang Z, Lin S, Zhang J, Xu Z, Xiang Y, Yao H, Ge L, Xie D, Kung HF, Lu G, Poon WS, Liu Q and Lin MC. Loss of MYC and E-box3 binding contributes to defective MYC-mediated transcriptional suppression of human MC-let-7a-1~let-7d in glioblastoma. Oncotarget. 2016; 7(35):56266-56278.
174. Deng L, Yang SB, Xu FF and Zhang JH. Long noncoding RNA CCAT1 promotes hepatocellular carcinoma progression by functioning as let-7 sponge. Journal of experimental & clinical cancer research : CR. 2015; 34:18.
175. Yang G, Zhang W, Yu C, Ren J and An Z. MicroRNA let-7: Regulation, single nucleotide polymorphism, and therapy in lung cancer. J Cancer Res Ther. 2015; 11 Suppl 1:C1-6.
176. Yang N, Kaur S, Volinia S, Greshock J, Lassus H, Hasegawa K, Liang S, Leminen A, Deng S, Smith L, Johnstone CN, Chen XM, Liu CG, Huang Q, Katsaros D, Calin GA, et al. MicroRNA microarray identifies Let-7i as a novel biomarker and therapeutic target in human epithelial ovarian cancer. Cancer Res. 2008; 68(24):10307-10314.
177. Panda AC, Grammatikakis I, Kim KM, De S, Martindale JL, Munk R, Yang X, Abdelmohsen K and Gorospe M. Identification of senescence-associated circular RNAs (SAC-RNAs) reveals senescence suppressor CircPVT1. Nucleic Acids Res. 2017; 45(7):4021-4035.
178. Zhou H, Guo W, Zhao Y, Wang Y, Zha R, Ding J, Liang L, Hu J, Shen H, Chen Z, Yin B and Ma B. MicroRNA-26a acts as a tumor suppressor inhibiting gallbladder cancer cell proliferation by directly targeting HMGA2. International journal of oncology. 2014; 44(6):2050-2058.
179. Yang Y, Zhang P, Zhao Y, Yang J, Jiang G and Fan J. Decreased MicroRNA-26a expression causes cisplatin resistance in human non-small cell lung cancer. Cancer biology & therapy. 2016; 17(5):515-525.
180. Tanic M, Yanowsky K, Rodriguez-Antona C, Andres R, Marquez-Rodas I, Osorio A, Benitez J and Martinez-Delgado B. Deregulated miRNAs in hereditary breast cancer revealed a role for miR-30c in regulating KRAS oncogene. PLoS One. 2012; 7(6):e38847.
181. Agostini A, Brunetti M, Davidson B, Trope CG, Heim S, Panagopoulos I and Micci F. Expressions of miR-30c and let-7a are inversely correlated with HMGA2 expression in squamous cell carcinoma of the vulva. Oncotarget. 2016; 7(51):85058-85062.
182. Zhang P, Bai H, Liu G, Wang H, Chen F, Zhang B, Zeng P, Wu C, Peng C, Huang C, Song Y and Song E. MicroRNA-33b, upregulated by EF24, a curcumin analog, suppresses the epithelial-to-mesenchymal transition (EMT) and migratory potential of melanoma cells by targeting HMGA2. Toxicology letters. 2015; 234(3):151-161.
183. Jiang W, Gu W, Qiu R, He S, Shen C, Wu Y, Zhang J, Zhou J, Guo Y, Wan D, Li Z, Deng J, Zeng L, Tang J, Zhou J, Zhi Q, et al. miRNA-101 Suppresses Epithelial-to-Mesenchymal Transition by Targeting HMGA2 in Pancreatic Cancer Cells. Anti-cancer agents in medicinal chemistry. 2016; 16(4):432-439.
184. Zhu C, Li J, Cheng G, Zhou H, Tao L, Cai H, Li P, Cao Q, Ju X, Meng X, Wang M, Zhang Z, Qin C, Hua L, Yin C and Shao P. miR-154 inhibits EMT by targeting HMGA2 in prostate cancer cells. Molecular and cellular biochemistry. 2013; 379(1-2):69-75.
185. Wu ZY, Wang SM, Chen ZH, Huv SX, Huang K, Huang BJ, Du JL, Huang CM, Peng L, Jian ZX and Zhao G. MiR-204 regulates HMGA2 expression and inhibits cell proliferation in human thyroid cancer. Cancer biomarkers : section A of Disease markers. 2015; 15(5):535-542.
186. Chen Z, Li Q, Wang S and Zhang J. miR-485-5p inhibits bladder cancer metastasis by targeting HMGA2. International journal of molecular medicine. 2015; 36(4):1136-1142.
187. Fan C, Lin Y, Mao Y, Huang Z, Liu AY, Ma H, Yu D, Maitikabili A, Xiao H, Zhang C, Liu F, Luo Q and Ouyang G. MicroRNA-543 suppresses colorectal cancer growth and metastasis by targeting KRAS, MTA1 and HMGA2. Oncotarget. 2016; 7(16):21825-21839.
188. Wood LJ, Maher JF, Bunton TE and Resar LM. The oncogenic properties of the HMG-I gene family. Cancer Res. 2000; 60(15):4256-4261.
189. De Martino I, Visone R, Wierinckx A, Palmieri D, Ferraro A, Cappabianca P, Chiappetta G, Forzati F, Lombardi G, Colao A, Trouillas J, Fedele M and Fusco A. HMGA proteins up-regulate CCNB2 gene in mouse and human pituitary adenomas. Cancer Res. 2009; 69(5):1844-1850.
190. Tessari MA, Gostissa M, Altamura S, Sgarra R, Rustighi A, Salvagno C, Caretti G, Imbriano C, Mantovani R, Del Sal G, Giancotti V and Manfioletti G. Transcriptional activation of the cyclin A gene by the architectural transcription factor HMGA2. Molecular and cellular biology. 2003; 23(24):9104-9116.
191. Wu J, Liu Z, Shao C, Gong Y, Hernando E, Lee P, Narita M, Muller W, Liu J and Wei J-J. HMGA2 overexpression induced ovarian surface epithelial transformation is mediated through regulation of EMT genes. Cancer research. 2011; 71(2):349-359.
192. Li AY, Boo LM, Wang SY, Lin HH, Wang CC, Yen Y, Chen BP, Chen DJ and Ann DK. Suppression of nonhomologous end joining repair by overexpression of HMGA2. Cancer Res. 2009; 69(14):5699-5706.
193. Borrmann L, Schwanbeck R, Heyduk T, Seebeck B, Rogalla P, Bullerdiek J and Wisniewski JR. High mobility group A2 protein and its derivatives bind a specific region of the promoter of DNA repair gene ERCC1 and modulate its activity. Nucleic Acids Res. 2003; 31(23):6841-6851.
194. Zha L, Zhang J, Tang W, Zhang N, He M, Guo Y and Wang Z. HMGA2 Elicits EMT by Activating the Wnt/β-catenin Pathway in Gastric Cancer. Digestive diseases and sciences. 2013; 58(3):724-733.
195. Thuault S, Tan EJ, Peinado H, Cano A, Heldin CH and Moustakas A. HMGA2 and Smads co-regulate SNAIL1 expression during induction of epithelial-to-mesenchymal transition. The Journal of biological chemistry. 2008; 283(48):33437-33446.
196. Tan EJ, Kahata K, Idas O, Thuault S, Heldin CH and Moustakas A. The high mobility group A2 protein epigenetically silences the Cdh1 gene during epithelial-to-mesenchymal transition. Nucleic Acids Res. 2015; 43(1):162-178.
197. Rowe RG, Wang LD, Coma S, Han A, Mathieu R, Pearson DS, Ross S, Sousa P, Nguyen PT, Rodriguez A, Wagers AJ and Daley GQ. Developmental regulation of myeloerythroid progenitor function by the Lin28b-let-7-Hmga2 axis. The Journal of experimental medicine. 2016; 213(8):1497-1512.
198. Ge Y, Kong Z, Guo Y, Tang W, Guo W and Tian W. The role of odontogenic genes and proteins in tooth epithelial cells and their niche cells during rat tooth root development. Archives of oral biology. 2013; 58(2):151-159.

199. Singh I, Mehta A, Contreras A, Boettger T, Carraro G, Wheeler M, Cabrera-Fuentes HA, Bellusci S, Seeger W, Braun T and Barreto G. Hmga2 is required for canonical WNT signaling during lung development. BMC biology. 2014; 12:21.
200. Ashar HR, Chouinard RA, Jr., Dokur M and Chada K. In vivo modulation of HMGA2 expression. Biochimica et biophysica acta. 2010; 1799(1-2):55-61.
201. Pfannkuche K, Summer H, Li O, Hescheler J and Droge P. The high mobility group protein HMGA2: a co-regulator of chromatin structure and pluripotency in stem cells? Stem cell reviews. 2009; 5(3):224-230.
202. Nowell PC and Hungerford DA. Chromosome studies on normal and leukemic human leukocytes. Journal of the National Cancer Institute. 1960; 25:85-109.
203. Ferguson-Smith MA. History and evolution of cytogenetics. Molecular Cytogenetics. 2015; 8:19.
204. Mandahl N. (2001). Methods in solid tumors cytogenetics. In: Rooney D, ed. Human cytogenetics: malignancies and acquired abnormalities, pp. 165-203.
205. Willatt L and Morgan S. (2009). Shaffer LG, Slovak ML, Campbell LJ (2009): ISCN 2009 an international system for human cytogenetic nomenclature. Human Genetics (Springer-Verlag), pp. 603-604.
206. Speicher MR and Carter NP. The new cytogenetics: blurring the boundaries with molecular biology. Nature reviews Genetics. 2005; 6(10):782-792.
207. Ahn JW, Bint S, Bergbaum A, Mann K, Hall RP and Ogilvie CM. Array CGH as a first line diagnostic test in place of karyotyping for postnatal referrals - results from four years’ clinical application for over 8,700 patients. Molecular Cytogenetics. 2013; 6:16.
208. Mohapatra G, Sharma J and Yip S. (2013). Array CGH in Brain Tumors. In: Banerjee D and Shah SP, eds. Array Comparative Genomic Hybridization: Protocols and Applications. (Totowa, NJ: Humana Press), pp. 325-338.
209. van Beers EH and Nederlof PM. Array-CGH and breast cancer. Breast Cancer Research. 2006; 8(3):210.
210. Templeton NS. The polymerase chain reaction. History, methods, and applications. Diagnostic molecular pathology : the American journal of surgical pathology, part B. 1992; 1(1):58-72.
211. Saiki RK, Gelfand DH, Stoffel S, Scharf SJ, Higuchi R, Horn GT, Mullis KB and Erlich HA. Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science. 1988; 239(4839):487-491.
212. Kermekchiev MB, Tzekov A and Barnes WM. Cold-sensitive mutants of Taq DNA polymerase provide a hot start for PCR. Nucleic Acids Res. 2003; 31(21):6139-6147.
213. Morley AA. Digital PCR: A brief history. Biomolecular Detection and Quantification. 2014; 1(1):1-2.
214. Korbie DJ and Mattick JS. Touchdown PCR for increased specificity and sensitivity in PCR amplification. Nat Protoc. 2008; 3(9):1452-1456.
215. Rychlik W, Spencer WJ and Rhoads RE. Optimization of the annealing temperature for DNA amplification in vitro. Nucleic Acids Research. 1990; 18(21):6409-6412.
216. Okello JBA, Rodriguez L, Poinar D, Bos K, Okwi AL, Bimenya GS, Sewankambo NK, Henry KR, Kuch M and Poinar HN. Quantitative Assessment of the Sensitivity of Various Commercial Reverse Transcriptases Based on Armored HIV RNA. PLoS ONE. 2010; 5(11):e13931.
217. Friedrich M, Grahnert A and Hauschildt S. Analysis of the 3' UTR of the ART3 and ART4 gene by 3' inverse RACE-PCR. DNA sequence : the journal of DNA sequencing and mapping. 2005; 16(1):53-57.

218. Grahnert A, Friedrich M, Engeland K and Hauschildt S. Analysis of mono-ADP-ribosyltransferase 4 gene expression in human monocytes: splicing pattern and potential regulatory elements. Biochimica et biophysica acta. 2005; 1730(3):173-186.
219. Higuchi R, Dollinger G, Walsh PS and Griffith R. Simultaneous amplification and detection of specific DNA sequences. Bio/technology (Nature Publishing Company). 1992; 10(4):413-417.
220. Rinttila T, Kassinen A, Malinen E, Krogius L and Palva A. Development of an extensive set of 16S rDNA-targeted primers for quantification of pathogenic and indigenous bacteria in faecal samples by real-time PCR. Journal of applied microbiology. 2004; 97(6):1166-1177.
221. Lorente A, Mueller W, Urdangarin E, Lazcoz P, von Deimling A and Castresana JS. Detection of methylation in promoter sequences by melting curve analysis-based semiquantitative real time PCR. BMC Cancer. 2008; 8.
222. Kirk BW, Feinsod M, Favis R, Kliman RM and Barany F. Single nucleotide polymorphism seeking long term association with complex disease. Nucleic Acids Res. 2002; 30(15):3295-3311.
223. Klutstein M, Nejman D, Greenfield R and Cedar H. DNA Methylation in Cancer and Aging. Cancer Research. 2016.
224. Fraga MF and Esteller M. DNA methylation: a profile of methods and applications. Biotechniques. 2002; 33.
225. Worm J, Aggerholm A and Guldberg P. In-tube DNA methylation profiling by fluorescence melting curve analysis. Clin Chem. 2001; 47.
226. Sanger F, Brownlee GG and Barrell BG. A two-dimensional fractionation procedure for radioactive nucleotides. Journal of molecular biology. 1965; 13(2):373-398.
227. Sanger F, Nicklen S and Coulson AR. DNA sequencing with chain-terminating inhibitors. Proceedings of the National Academy of Sciences of the United States of America. 1977; 74(12):5463-5467.
228. Hunkapiller T, Kaiser R, Koop B and Hood L. Large-scale and automated DNA sequence determination. Science. 1991; 254(5028):59-67.
229. Maxam AM and Gilbert W. A new method for sequencing DNA. Proceedings of the National Academy of Sciences of the United States of America. 1977; 74(2):560-564.
230. Ronaghi M, Karamohamed S, Pettersson B, Uhlen M and Nyren P. Real-time DNA sequencing using detection of pyrophosphate release. Anal Biochem. 1996; 242(1):84-89.
231. Nyren P and Lundin A. Enzymatic method for continuous monitoring of inorganic pyrophosphate synthesis. Anal Biochem. 1985; 151(2):504-509.
232. Shendure J and Ji H. Next-generation DNA sequencing. Nat Biotechnol. 2008; 26(10):1135-1145.
233. Shendure J, Porreca GJ, Reppas NB, Lin X, McCutcheon JP, Rosenbaum AM, Wang MD, Zhang K, Mitra RD and Church GM. Accurate multiplex polony sequencing of an evolved bacterial genome. Science. 2005; 309(5741):1728-1732.
234. Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, Berka J, Braverman MS, Chen YJ, Chen Z, Dewell SB, Du L, Fierro JM, Gomes XV, Godwin BC, He W, et al. Genome sequencing in microfabricated high-density picolitre reactors. Nature. 2005; 437(7057):376-380.
235. Goodwin S, McPherson JD and McCombie WR. Coming of age: ten years of next-generation sequencing technologies. Nature reviews Genetics. 2016; 17(6):333-351.
236. Metzker ML. Sequencing technologies - the next generation. Nature reviews Genetics. 2010; 11(1):31-46.
237. Wang Z, Gerstein M and Snyder M. RNA-Seq: a revolutionary tool for transcriptomics. Nature reviews Genetics. 2009; 10(1):57-63.

238. Birney E, Stamatoyannopoulos JA, Dutta A, Guigo R, Gingeras TR, Margulies EH, Weng Z, Snyder M, Dermitzakis ET, Thurman RE, Kuehn MS, Taylor CM, Neph S, Koch CM, Asthana S, Malhotra A, et al. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature. 2007; 447(7146):799-816.
239. Kumar S, Vo AD, Qin F and Li H. Comparative assessment of methods for the fusion transcripts detection from RNA-Seq data. Scientific Reports. 2016; 6:21597.
240. Maher CA, Palanisamy N, Brenner JC, Cao X, Kalyana-Sundaram S, Luo S, Khrebtukova I, Barrette TR, Grasso C, Yu J, Lonigro RJ, Schroth G, Kumar-Sinha C and Chinnaiyan AM. Chimeric transcript discovery by paired-end transcriptome sequencing. Proc Natl Acad Sci USA. 2009; 106.
241. Prensner JR, Iyer MK, Balbin OA, Dhanasekaran SM, Cao Q, Brenner JC, Laxman B, Asangani IA, Grasso CS, Kominsky HD, Cao X, Jing X, Wang X, Siddiqui J, Wei JT, Robinson D, et al. Transcriptome sequencing across a prostate cancer cohort identifies PCAT-1, an unannotated lincRNA implicated in disease progression. Nat Biotechnol. 2011; 29(8):742-749.
242. Friedlander MR, Chen W, Adamidi C, Maaskola J, Einspanier R, Knespel S and Rajewsky N. Discovering microRNAs from deep sequencing data using miRDeep. Nat Biotech. 2008; 26(4):407-415.
243. Pritchard CC, Cheng HH and Tewari M. MicroRNA profiling: approaches and considerations. Nature reviews Genetics. 2012; 13(5):358-369.
244. Ewing B, Hillier L, Wendl MC and Green P. Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res. 1998; 8(3):175-185.
245. Oliver GR, Hart SN and Klee EW. Bioinformatics for clinical next generation sequencing. Clin Chem. 2015; 61(1):124-135.
246. DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis AA, del Angel G, Rivas MA, Hanna M, McKenna A, Fennell TJ, Kernytsky AM, Sivachenko AY, Cibulskis K, Gabriel SB, et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 2011; 43(5):491-498.
247. Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szcześniak MW, Gaffney DJ, Elo LL, Zhang X and Mortazavi A. A survey of best practices for RNA-seq data analysis. Genome Biology. 2016; 17:13.
248. Trapnell C, Pachter L and Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics. 2009; 25.
249. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M and Gingeras TR. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2012.
250. Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR, Pimentel H, Salzberg SL, Rinn JL and Pachter L. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nature Protocols. 2012; 7(3):562-578.
251. Anders S, Pyl PT and Huber W. HTSeq—a Python framework to work with high-throughput sequencing data. Bioinformatics. 2015; 31(2):166-169.
252. Metpally R, Nasser S, Malenica I, Courtright A, Carlson E, Ghaffari L, Villa S, Tembe W and Van Keuren-Jensen K. Comparison of Analysis Tools for miRNA High Throughput Sequencing Using Nerve Crush as a Model. Frontiers in Genetics. 2013; 4(20).
253. Angelini C, De Canditiis D and De Feis I. Computational approaches for isoform detection and estimation: good and bad news. BMC Bioinformatics. 2014; 15:135.
254. An J, Lai J, Lehman ML and Nelson CC. miRDeep*: an integrated application tool for miRNA identification from RNA sequencing data. Nucleic Acids Res. 2013; 41(2):727-737.
255. Nicorici D, Satalan M, Edgren H, Kangaspeska S, Murumagi A, Kallioniemi O, Virtanen S and Kilkku O. FusionCatcher - a tool for finding somatic fusion genes in paired-end RNA-sequencing data. bioRxiv. 2014.

256. Iyer MK, Chinnaiyan AM and Maher CA. ChimeraScan: a tool for identifying chimeric transcription in sequencing data. Bioinformatics. 2011; 27(20):2903-2904.
257. Ge H, Liu K, Juan T, Fang F, Newman M and Hoeck W. FusionMap: detecting fusion genes from next-generation sequencing data at base-pair resolution. Bioinformatics. 2011; 27.
258. Kim D and Salzberg SL. TopHat-Fusion: an algorithm for discovery of novel fusion transcripts. Genome Biology. 2011; 12(8):1-15.
259. Micci F, Haugom L, Ahlquist T, Andersen HK, Abeler VM, Davidson B, Trope CG, Lothe RA and Heim S. Genomic aberrations in borderline ovarian tumors. Journal of translational medicine. 2010; 8:21.
260. Redshaw N, Wilkes T, Whale A, Cowen S, Huggett J and Foy CA. A comparison of miRNA isolation and RT-qPCR technologies and their effects on quantification accuracy and repeatability. Biotechniques. 2013; 54(3):155-164.
261. Linsen SEV, de Wit E, Janssens G, Heater S, Chapman L, Parkin RK, Fritz B, Wyman SK, de Bruijn E, Voest EE, Kuersten S, Tewari M and Cuppen E. Limitations and possibilities of small RNA digital gene expression profiling. Nat Meth. 2009; 6(7):474-476.
262. Annala M, Parker B, Zhang W and Nykter M. Fusion genes and their discovery using high throughput sequencing. Cancer letters. 2013; 340(2). doi: 10.1016/j.canlet.2013.01.011.
263. Ge H, Liu K, Juan T, Fang F, Newman M and Hoeck W. FusionMap: detecting fusion genes from next-generation sequencing data at base-pair resolution. Bioinformatics. 2011; 27.
264. Hoff AM, Johannessen B, Alagaratnam S, Zhao S, Nome T, Lovf M, Bakken AC, Hektoen M, Sveen A, Lothe RA and Skotheim RI. Novel RNA variants in colorectal cancers. Oncotarget. 2015; 6(34):36587-36602.
265. Panagopoulos I, Gorunova L, Viset T and Heim S. Gene fusions AHRR-NCOA2, NCOA2-ETV4, ETV4-AHRR, P4HA2-TBCK, and TBCK-P4HA2 resulting from the translocations t(5;8;17)(p15;q13;q21) and t(4;5)(q24;q31) in a soft tissue angiofibroma. Oncology reports. 2016; 36(5):2455-2462.
266. Wei W, Dizon D, Vathipadiekal V and Birrer MJ. Ovarian cancer: genomic analysis. Annals of Oncology. 2013; 24(Suppl 10):x7-x15.
267. Pejovic T, Heim S, Mandahl N, Elmfors B, Floderus UM, Furgyik S, Helm G, Willen H and Mitelman F. Consistent occurrence of a 19p+ marker chromosome and loss of 11p material in ovarian seropapillary cystadenocarcinomas. Genes Chromosomes Cancer. 1989; 1(2):167-171.
268. Aman P, Pejovic T, Wennborg A, Heim S and Mitelman F. Mapping of the 19p13 breakpoint in an ovarian carcinoma between the INSR and TCF3 loci. Genes Chromosomes Cancer. 1993; 8(2):134-136.
269. Zorn KK, Jazaeri AA, Awtrey CS, Gardner GJ, Mok SC, Boyd J and Birrer MJ. Choice of Normal Ovarian Control Influences Determination of Differentially Expressed Genes in Ovarian Cancer Expression Profiling Studies. Clinical Cancer Research. 2003; 9(13):4811.
270. Mito JK, Agoston AT, Dal Cin P and Srivastava A. Prevalence and significance of HMGA2 expression in oesophageal adenocarcinoma. Histopathology. 2017.
271. Wu J, Zhang S, Shan J, Hu Z, Liu X, Chen L, Ren X, Yao L, Sheng H, Li L, Ann D, Yen Y, Wang J and Wang X. Elevated HMGA2 expression is associated with cancer aggressiveness and predicts poor outcome in breast cancer. Cancer Lett. 2016; 376(2):284-292.
272. Califano D, Pignata S, Losito NS, Ottaiano A, Greggi S, De Simone V, Cecere S, Aiello C, Esposito F, Fusco A and Chiappetta G. High HMGA2 expression and high body mass index negatively affect the prognosis of patients with ovarian cancer. J Cell Physiol. 2014; 229(1):53-59.

109 273. Mahajan A, Liu Z, Gellert L, Zou X, Yang G, Lee P, Yang X and Wei JJ. HMGA2: a biomarker significantly overexpressed in high-grade ovarian serous carcinoma. Mod Pathol. 2010; 23(5):673-681. 274. Hetland TE, Holth A, Kaern J, Florenes VA, Trope CG and Davidson B. HMGA2 protein expression in ovarian serous carcinoma effusions, primary tumors, and solid metastases. Virchows Arch. 2012; 460(5):505-513. 275. Wu J, Liu Z, Shao C, Gong Y, Hernando E, Lee P, Narita M, Muller W, Liu J and Wei JJ. HMGA2 overexpression-induced ovarian surface epithelial transformation is mediated through regulation of EMT genes. Cancer Res. 2011; 71(2):349-359. 276. Wu J and Wei JJ. HMGA2 and high-grade serous ovarian carcinoma. JMolMed(Berl). 2013; 91(10):1155-1165. 277. Solheim O, Førsund M, Tropé CG, Kraggerud SM, Nesland JM and Davidson B. Epithelial–mesenchymal transition markers in malignant ovarian germ cell tumors. APMIS. 2017; 125(9):781-786. 278. Qian ZR, Asa SL, Siomi H, Siomi MC, Yoshimoto K, Yamada S, Wang EL, Rahman MM, Inoue H, Itakura M, Kudo E and Sano T. Overexpression of HMGA2 relates to reduction of the let-7 and its relationship to clinicopathological features in pituitary adenomas. Mod Pathol. 2009; 22(3):431-441. 279. Motoyama K, Inoue H, Nakamura Y, Uetake H, Sugihara K and Mori M. Clinical significance of high mobility group A2 in human gastric cancer and its relationship to let-7 microRNA family. Clin Cancer Res. 2008; 14(8):2334-2340. 280. Katz B, Tropé CG, Reich R and Davidson B. MicroRNAs in Ovarian Cancer. Human Pathology. 2015; 46(9):1245-1256. 281. Wang X, Qiu L-W, Peng C, Zhong S-P, Ye L and Wang D. MicroRNA-30c inhibits metastasis of ovarian cancer by targeting metastasis-associated gene 1. Journal of Cancer Research and Therapeutics. 2017; 13(4):676-682. 282. Sen N, Gui B and Kumar R. Role of MTA1 in cancer progression and metastasis. Cancer and Metastasis Reviews. 2014; 33(4):879-889. 283. 
Khella HWZ, Bakhet M, Allo G, Jewett MAS, Girgis AH, Latif A, Girgis H, Von Both I, Bjarnason GA and Yousef GM. miR-192, miR-194 and miR-215: a convergent microRNA network suppressing tumor progression in renal cell carcinoma. Carcinogenesis. 2013; 34(10):2231-2239. 284. Boni V, Bitarte N, Cristobal I, Zarate R, Rodriguez J, Maiello E, Garcia-Foncillas J and Bandres E. miR-192/miR-215 influence 5-fluorouracil resistance through cell cycle- mediated mechanisms complementary to its post-transcriptional thymidilate synthase regulation. Molecular cancer therapeutics. 2010; 9(8):2265-2275. 285. Pichiorri F, Suh S-S, Rocci A, De Luca L, Taccioli C, Santhanam R, Zhou W, Benson Jr DM, Hofmainster C, Alder H, Garofalo M, Di Leva G, Volinia S, Lin H-J, Perrotti D, Kuehl M, et al. Downregulation of p53-inducible microRNAs 192, 194, and 215 Impairs the p53/MDM2 Autoregulatory Loop in Multiple Myeloma Development. Cancer Cell. 2010; 18(4):367-381. 286. An J, Lv W and Zhang Y. LncRNA NEAT1 contributes to paclitaxel resistance of ovarian cancer cells by regulating ZEB1 expression via miR-194. OncoTargets and therapy. 2017; 10:5377-5390. 287. Lin Y, Jin Y, Xu T, Zhou S and Cui M. MicroRNA-215 targets NOB1 and inhibits growth and invasion of epithelial ovarian cancer. American journal of translational research. 2017; 9(2):466-477. 288. Joshi SR, Comer BS, McLendon JM and Gerthoffer WT. MicroRNA Regulation of Smooth Muscle Phenotype. Molecular and cellular pharmacology. 2012; 4(1):1-16.

110 289. Graff JW, Dickson AM, Clay G, McCaffrey AP and Wilson ME. Identifying functional microRNAs in macrophages with polarized phenotypes. The Journal of biological chemistry. 2012; 287(26):21816-21825. 290. Ross B, Krapp S, Augustin M, Kierfersauer R, Arciniega M, Geiss-Friedlander R and Huber R. Structures and mechanism of dipeptidyl peptidases 8 and 9, important players in cellular homeostasis and cancer. Proceedings of the National Academy of Sciences. 2018. 291. Agrawal N, Akbani R, Aksoy BA, Ally A, Arachchi H, Asa Sylvia L, Auman JT, Balasundaram M, Balu S, Baylin Stephen B, Behera M, Bernard B, Beroukhim R, Bishop Justin A, Black Aaron D, Bodenheimer T, et al. Integrated Genomic Characterization of Papillary Thyroid Carcinoma. Cell. 159(3):676-690. 292. López-Otín C and Matrisian LM. Emerging roles of proteases in tumour suppression. Nat Rev Cancer. 2007 Oct; 7:800. 293. Yao TW, Kim WS, Yu DM, Sharbeen G, McCaughan GW, Choi KY, Xia P and Gorrell MD. A novel role of dipeptidyl peptidase 9 in epidermal growth factor signaling. Mol Cancer Res. 2011; 9(7):948-959. 294. Hoogstraat M, de Pagter MS, Cirkel GA, van Roosmalen MJ, Harkins TT, Duran K, Kreeftmeijer J, Renkens I, Witteveen PO, Lee CC, Nijman IJ, Guy T, van ’t Slot R, Jonges TN, Lolkema MP, Koudijs MJ, et al. Genomic and transcriptomic plasticity in treatment- naïve ovarian cancer. Genome Research. 2014; 24(2):200-211. 295. Li HL, Huang DZ, Deng T, Zhou LK, Wang X, Bai M and Ba Y. Overexpression of cyclin L2 inhibits growth and enhances chemosensitivity in human gastric cancer cells. Asian Pac J Cancer Prev. 2012; 13(4):1425-1430. 296. Gong D and Ferrell JE. The Roles of Cyclin A2, B1, and B2 in Early and Late Mitotic Events. Molecular Biology of the Cell. 2010; 21(18):3149-3161. 297. Louis DN, Perry A, Reifenberger G, von Deimling A, Figarella-Branger D, Cavenee WK, Ohgaki H, Wiestler OD, Kleihues P and Ellison DW. 
The 2016 World Health Organization Classification of Tumors of the Central Nervous System: a summary. Acta Neuropathologica. 2016; 131(6):803-820. 298. Bolha L, Ravnik-Glavac M and Glavac D. Circular RNAs: Biogenesis, Function, and a Role as Possible Cancer Biomarkers. International journal of genomics. 2017; 2017:6218353. 299. Tomczak K, Czerwińska P and Wiznerowicz M. The Cancer Genome Atlas (TCGA): an immeasurable source of knowledge. Contemporary Oncology. 2015; 19(1A):A68-A77.

111

Paper I

A novel truncated form of HMGA2 in tumors of the ovaries

Agostini A, Panagopoulos I, Davidson B, Tropé CG, Heim S, Micci F.

Oncology Letters 2016 Aug;12(2):1559-1563

ONCOLOGY LETTERS 12: 1559-1563, 2016

A novel truncated form of HMGA2 in tumors of the ovaries

ANTONIO AGOSTINI¹,², IOANNIS PANAGOPOULOS¹,², BEN DAVIDSON³,⁴, CLAES GORAN TROPE⁵, SVERRE HEIM¹,²,⁴ and FRANCESCA MICCI¹,²

¹Section for Cancer Cytogenetics, Institute for Cancer Genetics and Informatics, The Norwegian Radium Hospital, Oslo University Hospital; ²Centre for Cancer Biomedicine, University of Oslo; ³Department of Pathology, The Norwegian Radium Hospital, Oslo University Hospital; ⁴Faculty of Medicine, University of Oslo; ⁵Department of Gynecology, The Norwegian Radium Hospital, Oslo University Hospital, 0310 Oslo, Norway

Received July 27, 2015; Accepted May 10, 2016

DOI: 10.3892/ol.2016.4805

Abstract. Neoplasms of the ovary are the second most common tumor of the female reproductive system, and the most lethal of the gynecological malignancies. Ovarian tumors are divided into a large number of different groups reflecting their different features. The present study analyzed 187 ovarian tumors (39 sex-cord stromal tumors, 22 borderline tumors and 126 carcinomas) for the expression of the high-mobility group AT-hook 2 (HMGA2) gene, for mutations in the isocitrate dehydrogenase (NADP(+)) 1, cytosolic (IDH1), isocitrate dehydrogenase (NADP(+)) 2, mitochondrial (IDH2) and telomerase reverse transcriptase (TERT) genes, and for methylation of the O6-methylguanine-DNA methyltransferase (MGMT) promoter. Reverse transcription-polymerase chain reaction analysis showed that HMGA2 was expressed in 74.5% of the samples (120/161). A truncated transcript of HMGA2 was identified in 11 cases. A novel truncated form of HMGA2 was found in 4 serous high-grade carcinomas. Only 4 tumors (4/185) showed the TERT C228T mutation. No IDH1 or IDH2 mutations were found. Methylation of the promoter of MGMT was found in 2 borderline tumors (2/185). HMGA2 was expressed, in its truncated and native form, in different ovarian tumors, even the less aggressive types, underscoring the general importance of this gene in ovarian tumorigenesis. Mutations involving TERT, as well as MGMT promoter methylation, are rare events in ovarian tumors.

Correspondence to: Dr Francesca Micci, Section for Cancer Cytogenetics, Institute for Cancer Genetics and Informatics, The Norwegian Radium Hospital, Oslo University Hospital, Ullernchausseen 64A, 0310 Oslo, Norway
E-mail: [email protected]

Key words: HMGA2, IDH1, IDH2, TERT, MGMT, ovarian tumors

Introduction

Tumors of the ovaries account for 30% of all cancers of the female genital system; they are a heterogeneous group of neoplasms, divided into a number of different subgroups, depending largely on histological and cytological features (1). The majority (90%) of ovarian tumors are epithelial in nature. Malignant epithelial ovarian tumors are currently divided into high-grade serous, low-grade serous, endometrioid, mucinous and clear cell carcinomas (2).

Borderline tumors of the ovary are neoplasms of low malignant potential. These tumors present with cellular atypia, but are not invasive. A few of these tumors share genomic and molecular features with carcinomas (3), but it remains unclear whether this is a feature of only a subset or of borderline tumors in general.

Sex-cord stromal tumors account for 8% of all ovarian tumors and are further classified based on their predominant cell content; for example, granulosa cell tumors contain ≥[…]% granulosa cells. The thecoma-fibroma group of tumors is dominated by theca cells (thecoma), stromal fibroblasts (fibroma) or the two cell types in different proportions (thecofibroma) (4).

The biological and clinical heterogeneity of ovarian tumors represents a considerable challenge. The molecular and genetic profiles of the tumors could aid in predicting their inherent aggressiveness and could also provide keys to more specific treatments. As a contribution toward this goal, the present study analyzed samples of different types of ovarian tumors, namely granulosa cell tumors, thecofibromas, fibromas, teratomas, borderline tumors and infiltrating carcinomas of different histologies, for their expression of the high-mobility group AT-hook 2 (HMGA2) gene, for mutations of the isocitrate dehydrogenase (NADP(+)) 1, cytosolic (IDH1), isocitrate dehydrogenase (NADP(+)) 2, mitochondrial (IDH2) and telomerase reverse transcriptase (TERT) genes, and for methylation of the promoter of the O6-methylguanine-DNA methyltransferase (MGMT) gene.

Materials and methods

Tumor material. The material consisted of fresh frozen samples from 187 ovarian tumors surgically resected at The Norwegian Radium Hospital (Oslo, Norway) between December 1999 and January 2010. The tumors have been previously characterized for chromosomal aberrations and genomic imbalances (3,5,6). The present series consisted of 39 sex-cord stromal tumors ([…] thecofibromas, […] fibromas, 2 granulosa cell tumors and 2 teratomas), 22 borderline tumors and 126 carcinomas (56 serous high-grade carcinomas, 30 endometrioid carcinomas, 18 mucinous carcinomas, 12 clear cell carcinomas, 4 serous low-grade carcinomas, 1 undifferentiated carcinoma, and 5 mixed-type carcinomas). This study was approved by the regional ethics committee (Regional Komité For Medisinsk Forskningsetikk Sør-Øst, Norge; http://helseforskning.etikkom.no) and written informed consent was obtained from the patients.

DNA and RNA extraction, and cDNA synthesis. DNA extraction was performed using the Maxwell 16 Extractor (Promega, Madison, WI, USA) and the Maxwell 16 Tissue DNA Purification kit (Promega) according to the manufacturer's instructions. RNA was extracted using the miRNeasy kit (Qiagen GmbH, Hilden, Germany) and QIAcube (Qiagen GmbH). The concentration and purity of the DNA and RNA were measured with the Nanovue Spectrophotometer (GE Healthcare, Pittsburgh, PA, USA). Extracted RNA (1 μg) was reverse-transcribed in a 20-μl reaction volume using the iScript Advanced cDNA Synthesis kit according to the manufacturer's instructions (Bio-Rad Laboratories, Oslo, Norway).

Molecular analyses. Molecular analyses of IDH1, IDH2, TERT and MGMT were performed according to previously described protocols (7).

HMGA2. Molecular analyses of HMGA2 were performed using a previously described protocol (7), but with minor variations. In cases 7, 8, 9, 10 and 11, the nested polymerase chain reaction (PCR) of the 3' rapid amplification of cDNA ends (3'RACE) was performed using the Touchdown-PCR conditions described by Korbie et al (8) to increase the specificity of the PCR and improve the quality of the products. The 3'RACE results for cases 7, 8, 9 and 11 were validated through additional PCRs using primers designed from the sequences of the 3'RACE products (Table I). More specifically, HMGA2-982-F1 and HMGA2R169 were used in case 7, HMGA2F323 and HMGA2R323 were used in case 8, HMGA2-982-F1 and HMGA2R363 were used in case 9, and HMGA2-982-F1 and HMGA2R175 were used in case 11 (Table II). The products of these PCRs were sequenced and analyzed by BLAT (http://genome.ucsc.edu/cgi-bin/hgBlat).

Table I. Specific primers for the 3' rapid amplification of cDNA ends product amplification.

Primer name    Primer sequence
HMGA2-982-F1   5'-CAA GAG TCC CTC TAA AGC AGC TCA-3'
HMGA2R169      5'-TGG GAT GCA GAC TTC AGT TGG AA-3'
HMGA2F323      5'-CCC TAT CAC CTC ATC TCC CG-3'
HMGA2R323      5'-TTG TCC ACT CAT TCA GCA GAT C-3'
HMGA2R363      5'-CAG GCA TGG CTC TGC ATG TG-3'
HMGA2R175      5'-TGA CCA CTG AAT TCT GGC CTC A-3'

Methylation-specific quantitative polymerase chain reaction (MSP-qPCR). Unmethylated cytosine residues were converted to uracil by bisulfite treatment of 1.3 μg DNA using the EpiTect Bisulfite Kit (Qiagen GmbH) according to the manufacturer's protocol. Following conversion, DNA was eluted in buffer (Qiagen GmbH) to a final concentration of […] ng/μl. Bisulfite-treated DNA ([…] μl) was amplified with a qPCR in a 20-μl reaction mixture containing 10 μl of Precision Mix (Bio-Rad Laboratories, Oslo, Norway), 0.4 μl of forward and reverse primers, and 7.2 μl of H2O. Two different mixes were used, one containing the primer sequences of MGMT for the unmethylated reaction [5'-TTT GTG TTT TGA TGT TTG TAG GTT TTTGT-3' (forward) and 5'-AAC TCC ACA CTC TTC CAA AAA CAA AACA-3' (reverse)] and one mix containing the primers for the methylated reaction [5'-TTT CGA CGT TCG TAG GTT TTCGC-3' (forward) and 5'-GCA CTC TTC CGA AAA CGA AACG-3' (reverse)]. All qPCR analyses were performed in triplicate with the CFX96 Touch Real-Time PCR detection system (Bio-Rad Laboratories), and methylated and unmethylated standards (Epitect PCR Control DNA Set; Qiagen GmbH) were used as controls. The thermal cycling included […] min at […]°C, followed by […] cycles of […] sec at […]°C, […] sec at […]°C, […] sec at […]°C and […] sec at […]°C, and a final process in which the temperature increased by […]°C in steps, starting from […]°C and ending at […]°C. Data were analyzed with the CFX Manager software (Bio-Rad Laboratories), and all the melting curves of the samples were referenced to the methylated and unmethylated controls (9).

Results

HMGA2. Informative results for HMGA2 expression were obtained for all 161 tumors from which RNA was available. The gene was found to be expressed in 74.5% of the samples (120/161 tumors). The frequency varied among the different tumor types and histological subgroups, with the thecofibromas showing the highest frequency (100%; 16/16 tumors), followed by the high-grade serous carcinomas (90.2%; 37/41 tumors), the fibromas ([…]), the borderline tumors (66.7%; 14/21 tumors), the mucinous carcinomas (61.1%; 11/18 tumors), the endometrioid carcinomas (60.0%; 18/30 tumors) and the clear cell carcinomas with only 18.2% (2/11 tumors).

The samples were run for two parallel PCRs, which amplified exons […] and exons […] of HMGA2, respectively. Out of the 161 samples analyzed, 109 showed expression of full-length HMGA2, whereas 41 tumors showed no HMGA2 expression. The remaining 11 tumors showed expression of a truncated HMGA2, i.e., exons 1-3 (Table II). In these cases, 3'RACE PCR was performed in search of possible fusion transcripts. In 9 cases, the HMGA2 transcript could be identified by Sanger sequencing, whereas in cases 2 and 10, sequencing was not informative. In 4 out of the 9 tumors showing a truncated form of HMGA2 (cases 1, 3, 4 and 5), an ectopic sequence was found fused with HMGA2. This sequence has been previously found in human lipomas (accession number U29117). A single tumor (case 6) displayed a truncated transcript of HMGA2 with a sequence similar to the previously reported Homo sapiens cDNA FLJ18441 (accession number AK311399). Another 4 serous high-grade tumors (cases 7, 8, 9 and 11) showed a

Table II. Truncated forms of HMGA2 found in ovarian tumors.

Case/lab no.  Diagnosis               RT-PCR                Sequence       Karyotype

1/05-1270     Mucinous carcinoma      Truncated transcript  U29117         Culture failure
2/07-1630     Mucinous carcinoma      Truncated transcript  Not available  46,XX[69]
3/08-1650     Mucinous carcinoma      Truncated transcript  U29117         46,XX,del(1)(q21)[2]/46,XX[88]
4/01-196      Endometrioid carcinoma  Truncated transcript  U29117         46,XX[16]
5/07-0449     Endometrioid carcinoma  Truncated transcript  U29117         46,XX[12]
6/03-481      Borderline              Truncated transcript  AK311399       46,XX[32]
7/01-169      Serous high-grade       Truncated transcript  Novel          46,XX[48]
8/02-333      Serous high-grade       Truncated transcript  Novel          47~49,XX,+8,+9[2]/49,idem,+5,-6,+7[4]/54,idem,+3,+5,+6,+7,+14,+17,+19[5]
9/02-363      Serous high-grade       Truncated transcript  Not available  65~68,XX,-X,+1,+2,del(3)(p13),+4,add(4)(p12)x2,add(5)(p15),-6,+7,add(7)(p15),-8,-9,-12,-13,-14,-15,-17,-18,der(19)add(19)(p13)add(19)(q13),+20,+20,-21,-22,+2mar[9]
10/04-499     Serous high-grade       Truncated transcript  Not available  46,XX[5]
11/04-715     Serous high-grade       Truncated transcript  Novel          Culture failure

HMGA2, high-mobility group AT-hook 2; RT-PCR, reverse transcription-polymerase chain reaction.
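The counts quoted in the Results and in Table II lend themselves to a quick consistency check. The following is a minimal sketch, not part of the original study: it recomputes the reported expression frequencies from the stated counts and tallies Table II by sequence result and karyotype status. The dictionary and function names, and the karyotype labels "normal", "abnormal" and "culture failure", are shorthand assumed for this illustration only.

```python
from collections import Counter

# Counts transcribed from the Results text:
# (tumors expressing HMGA2, tumors with RNA available).
EXPRESSION_COUNTS = {
    "all tumors": (120, 161),
    "thecofibromas": (16, 16),
    "high-grade serous carcinomas": (37, 41),
    "borderline tumors": (14, 21),
    "mucinous carcinomas": (11, 18),
    "endometrioid carcinomas": (18, 30),
    "clear cell carcinomas": (2, 11),
}

# Table II reduced to (case number, sequence result, karyotype status).
# The status labels are this sketch's own shorthand, not the paper's wording:
# "normal" means only 46,XX cells were reported for the case.
TABLE_II = [
    (1, "U29117", "culture failure"),
    (2, "Not available", "normal"),
    (3, "U29117", "abnormal"),
    (4, "U29117", "normal"),
    (5, "U29117", "normal"),
    (6, "AK311399", "normal"),
    (7, "Novel", "normal"),
    (8, "Novel", "abnormal"),
    (9, "Not available", "abnormal"),
    (10, "Not available", "normal"),
    (11, "Novel", "culture failure"),
]


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


def full_length_count(total, negative, truncated):
    """Tumors with full-length HMGA2 = all tumors minus negative and truncated ones."""
    return total - negative - truncated


frequencies = {name: percent(p, w) for name, (p, w) in EXPRESSION_COUNTS.items()}
sequence_tally = Counter(seq for _, seq, _ in TABLE_II)
karyotype_tally = Counter(status for _, _, status in TABLE_II)

for name, value in frequencies.items():
    print(f"{name}: {value}%")
print("full-length HMGA2:", full_length_count(161, 41, 11))
print("sequence results:", dict(sequence_tally))
print("karyotype status:", dict(karyotype_tally))
```

Under these assumptions, the tally agrees with the text: karyotypes are available for 9 of the 11 cases, and 6 of those are normal.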

Figure 1. HMGA2 truncated transcript found in serous high-grade carcinomas. Chromatograms of the HMGA2 transcripts found in (A) cases 7 and 11, and (B) case 8. The junctions between exon 3 (left) and the intronic regions (right) are highlighted in yellow.

truncated form of HMGA2 not reported previously. 3'RACE results for these cases were validated through an additional PCR using primers designed from the sequences of the 3'RACE products. This was informative for 3 out of 4 cases, but gave no informative result for case 9. The transcripts contained the normal mRNA sequence until exon 3, with different regions of the third intron varying in length from case to case (Fig. 1).

TERT. All 185 cases from which DNA was available were analyzed for mutations in the promoter region of TERT. The analysis focused on the most frequent mutations, i.e., C228T and C250T, since they have been observed in a large number of tumors of different types (10). Only 1 borderline tumor and 3 fibromas showed the C228T mutation. All other cases showed no mutation.

IDH1 and IDH2 mutations. A total of 184 samples were analyzed for mutations in IDH1 and IDH2. The following mutation sites were investigated: IDH1R100, IDH1R109, and IDH1R132 of IDH1, and IDH2R140, IDH2R149 and IDH2R172 of IDH2. All analyses gave informative results. No mutations were found in these tumors. In 8 tumors, the single nucleotide polymorphism (SNP) IDH1G105 was identified.

MGMT. The MGMT promoter methylation status of the 185 samples from which DNA was extracted was assessed using MSP-qPCR. All analyses gave informative results. Only 2 borderline tumors were found to have MGMT promoter methylation.

Discussion

The high-mobility group AT-hook proteins, HMGA1 and HMGA2, are non-histone proteins that are involved in a range of nuclear processes from chromatin dynamics to gene regulation. HMGA gene family expression is observed during embryonic development (11), while expression is largely absent in adult normal tissues (11). However, high HMGA2 expression levels have been found in a variety of benign and malignant tumors (12,13). HMGA2 is involved in a number of different processes, from cellular proliferation to epithelial-mesenchymal transition (14). Recent studies have highlighted a pivotal role for HMGA2 in tumor metastasis (15). HMGA2 expression, in its full and/or truncated length, has not been investigated extensively in ovarian tumors, although a study by Hetland et al (16) showed the expression of HMGA2 by immunohistochemistry in primary solid tumors (96%), metastases (90%) and effusions (94%) of serous carcinomas. The present results showed that HMGA2 is expressed in a number of ovarian tumors (74.5%; 120/161 tumors). High-level expression was noted throughout the entire spectrum of malignancy, i.e., from sex-cord stromal neoplasms to borderline tumors to aggressive carcinomas. The only quantitative exception was the clear cell carcinomas, among which only 18.2% showed HMGA2 expression (2/11 tumors). Whether neoplasms of this subtype really express HMGA2 more rarely is a conclusion that must await examination of more tumors.

A truncated gene was found in 11 ovarian tumors of the present cohort. HMGA2 has previously been found disrupted, due to rearrangement of chromosomal band 12q15, in different benign connective tissue tumors, including lipomas (17), pleomorphic salivary gland adenomas (18), uterine leiomyomas (19) and lung hamartomas (20). The alterations involve exon 3 and cause the deletion of downstream regions, resulting in a truncated transcript that is able to evade miRNA-dependent gene silencing. Another alteration involves the chromosomal rearrangement of 12q13-15 to form a fusion gene. In the present study, karyotypic information was available for 9 out of the 11 tumors with a truncated HMGA2 (Table II). The karyotype was normal in 6 cases, whereas case 3 showed a del(1)(q21) as a sole rearrangement, case 8 exhibited a karyotype described as 47~49,XX,+8,+9[2]/49,idem,+5,-6,+7[4]/54,idem,+3,+5,+6,+7,+14,+17,+19[5], and the karyotype of case 9 was interpreted as 65~68,XX,-X,+1,+2,del(3)(p13),+4,add(4)(p12)x2,add(5)(p15),-6,+7,add(7)(p15),-8,-9,-12,-13,-14,-15,-17,-18,der(19)add(19)(p13)add(19)(q13),+20,+20,-21,-22,+2mar[9]. Notably, none of these cases showed a structural rearrangement involving 12q15 despite the fact that a truncated form of the HMGA2 gene was found in all of them. Possibly, this gene-level change may be due to a small deletion not visible at the chromosomal level.

The HMGA2 truncated transcripts were further characterized by 3'RACE-PCR, searching for possible fusion partners. However, it emerged that in 9 tumors (2 cases were not informative), the HMGA2 transcript was disrupted in the third exon. Notably, in 4 of them (cases 1, 3, 4 and 5), a sequence previously found in human lipomas was identified. Moreover, a novel truncated transcript was detected for HMGA2 in 4 high-grade serous carcinomas. The sequence of these transcripts contains the normal mRNA sequence until exon 3, followed by different regions of the third intron, which vary in length from case to case. It therefore appears that HMGA2 breakage leads to a truncated transcript that possesses exonic and intronic sequences. He et al (21) suggested that HMGA2 transcript shortening in serous ovarian cancer is the result of alternative polyadenylation that leads to a novel 3'UTR formation. In the present study, the finding of HMGA2 expression in different ovarian tumors, even the less aggressive types such as fibroma, thecofibroma and borderline tumors, highlights once more the importance of HMGA2 for the development of ovarian tumors.

TERT encodes the telomerase reverse transcriptase. It is well known that the gene is involved in cancer, and numerous studies have shown that mutations in the promoter region of the gene can increase telomerase expression (22-24). The present study focused on the most frequently occurring TERT mutations, i.e., C228T and C250T. These mutations introduce a novel binding site (TTCCGG) for members of the E-twenty-six/ternary complex factor transcription factor family (10). The C228T mutation was found in 4 out of the 185 tumors analyzed, of which 3 were fibromas, whereas the fourth was a borderline tumor. The low percentage of TERT mutations found in the various ovarian tumor samples analyzed leads us to hypothesize that it is neither a primary event nor even an important step in the majority of types of ovarian tumors. Notably, TERT C228T appears to be recurrent in ovarian fibromas, but further studies of a larger cohort are necessary to more reliably evaluate the frequency of this mutation. The present study did not find any TERT mutations in the clear cell samples; however, Huang et al (25) found the mutation in 16% of tumors (9/56 tumors) of this carcinoma subtype. The discrepancy between the present data and previous data may be due to the low number of clear cell carcinomas analyzed in the present cohort (n=[…]).

The IDH1 and IDH2 genes encode two types of isocitrate dehydrogenase. Mutations in either of the genes can result in an enzyme that produces 2-hydroxyglutarate. This metabolite is an inhibitor of α-ketoglutarate-dependent oxygenases; impairment of these enzymes can cause genome-wide methylation changes that affect gene expression. IDH1 and/or IDH2 mutations have been found in gliomas (26) and hematological malignancies (27). SNP IDH1G105, which is an adverse prognostic factor in cytogenetically normal acute myeloid leukemia (28), was identified in 8 out of the 184 tumors analyzed in the present series. This indicates that IDH1 and IDH2 are only rarely involved in ovarian tumorigenesis.

MGMT encodes O6-methylguanine DNA methyltransferase, a DNA repair enzyme that removes alkyl adducts from the O6-position of guanine. Expression of MGMT can result in resistance to alkylating cytostatics. MGMT promoter methylation increases the sensitivity of cells to alkylating drugs, as has been shown in a number of cancer types, particularly gliomas (29). Due to the efficacy of MGMT promoter methylation status as a prognostic and predictive tumor marker, this assessment has become one of the most commonly requested analyses for gliomas (30). The present study found this gene to be altered in only 2 borderline tumors out of 185 analyzed tumors of different types, indicating that MGMT promoter methylation is not a common event in ovarian tumorigenesis.

In conclusion, the present study contributes to elucidating the genetic features of the HMGA2 gene in ovarian neoplasms in that it has been found expressed in benign and malignant tumors. Furthermore, a novel truncated form of HMGA2 has been identified.

Acknowledgements

This study was supported by grants from the Norwegian Radium Hospital Foundation, the Norwegian Cancer Society, the Inger and John Fredriksen Foundation for Ovarian Cancer Research, and the Research Council of Norway through its Centers of Excellence funding scheme, project number 179571.

References

1. Tavassoli FA and Devilee P (eds): Pathology and Genetics. Tumours of the Breast and Female Genital organs. IARC Press, Lyon, 2003.
2. Prat J: Ovarian carcinomas: Five distinct diseases with different origins, genetic alterations, and clinicopathological features. Virchows Arch 460: 237-249, 2012.
3. Micci F, Haugom L, Ahlquist T, Andersen HK, Abeler VM, Davidson B, Trope CG, Lothe RA and Heim S: Genomic aberrations in borderline ovarian tumors. J Transl Med 8: 21, 2010.
4. Buy JN and Ghossain M (eds): Sex cord-stromal tumors. In: Gynecological Imaging: A Reference Guide to Diagnosis. Springer-Verlag, Berlin, Heidelberg, pp329-375, 2013.
5. Micci F, Haugom L, Abeler VM, Tropé CG, Danielsen HE and Heim S: Consistent numerical chromosome aberrations in thecofibromas of the ovary. Virchows Arch 452: 269-276, 2008.
6. Micci F, Haugom L, Abeler VM, Davidson B, Tropé CG and Heim S: Genomic profile of ovarian carcinomas. BMC Cancer 14: 315, 2014.
7. Agostini A, Panagopoulos I, Andersen HK, Johannesen LE, Davidson B, Tropé CG, Heim S and Micci F: HMGA2 expression pattern and TERT mutations in tumors of the vulva. Oncol Rep 33: 2675-2680, 2015.
8. Korbie DJ and Mattick JS: Touchdown PCR for increased specificity and sensitivity in PCR amplification. Nat Protoc 3: 1452-1456, 2008.
9. Smith E, Jones ME and Drew PA: Quantitation of DNA methylation by melt curve analysis. BMC Cancer 9: 1-12, 2009.
10. Heidenreich B, Rachakonda PS, Hemminki K and Kumar R: TERT promoter mutations in cancer development. Curr Opin Genet Dev 24: 30-37, 2014.
11. Chiappetta G, Avantaggiato V, Visconti R, Fedele M, Battista S, Trapasso F, Merciai BM, Fidanza V, Giancotti V, Santoro M, et al: High level expression of the HMGI (Y) gene during embryonic development. Oncogene 13: 2439-2446, 1996.
12. Rogalla P, Drechsler K, Frey G, Hennig Y, Helmke B, Bonk U and Bullerdiek J: HMGI-C expression patterns in human tissues. Implications for the genesis of frequent mesenchymal tumors. Am J Pathol 149: 775-779, 1996.
13. Pallante P, Sepe R, Puca F and Fusco A: High mobility group A (HMGA) proteins as tumor markers. Front Med (Lausanne) 2: 15, 2015.
14. Wu J and Wei JJ: HMGA2 and high-grade serous ovarian carcinoma. J Mol Med (Berl) 91: 1155-1165, 2013.
15. Morishita A, Zaidi MR, Mitoro A, Sankarasharma D, Szabolcs M, Okada Y, D'Armiento J and Chada K: HMGA2 is a driver of tumor metastasis. Cancer Res 73: 4289-4299, 2013.
16. Hetland TE, Holth A, Kærn J, Flørenes VA, Tropé CG and Davidson B: HMGA2 protein expression in ovarian serous carcinoma effusions, primary tumors, and solid metastases. Virchows Arch 460: 505-513, 2012.
17. Schoenmakers EF, Wanschura S, Mols R, Bullerdiek J, Van den Berghe H and Van de Ven WJ: Recurrent rearrangements in the high mobility group protein gene, HMGI-C, in benign mesenchymal tumours. Nat Genet 10: 436-444, 1995.
18. Geurts JM, Schoenmakers EF and Van de Ven WJ: Molecular characterization of a complex chromosomal rearrangement in a pleomorphic salivary gland adenoma involving the 3'-UTR of HMGIC. Cancer Genet Cytogenet 95: 198-205, 1997.
19. Mine N, Kurose K, Nagai H, Doi D, Ota Y, Yoneyama K, Konishi H, Araki T and Emi M: Gene fusion involving HMGIC is a frequent aberration in uterine leiomyomas. J Hum Genet 46: 408-412, 2001.
20. Kazmierczak B, Meyer-Bolte K, Tran KH, Wöckel W, Breightman I, Rosigkeit J, Bartnitzke S and Bullerdiek J: A high frequency of tumors with rearrangements of genes of the HMGI(Y) family in a series of 191 pulmonary chondroid hamartomas. Genes Chromosomes Cancer 26: 125-133, 1999.
21. He X, Yang J, Zhang Q, Cui H and Zhang Y: Shortening of the 3' untranslated region: An important mechanism leading to overexpression of HMGA2 in serous ovarian cancer. Chin Med J (Engl) 127: 494-499, 2014.
22. Killela PJ, Reitman ZJ, Jiao Y, Bettegowda C, Agrawal N, Diaz LA Jr, Friedman AH, Friedman H, Gallia GL, Giovanella BC, et al: TERT promoter mutations occur frequently in gliomas and a subset of tumors derived from cells with low rates of self-renewal. Proc Natl Acad Sci USA 110: 6021-6026, 2013.
23. Huang DS, Wang Z, He XJ, Diplas BH, Yang R, Killela PJ, Meng Q, Ye ZY, Wang W, Jiang XT, et al: Recurrent TERT promoter mutations identified in a large-scale study of multiple tumour types are associated with increased TERT expression and telomerase activation. Eur J Cancer 51: 969-976, 2015.
24. Vinagre J, Almeida A, Pópulo H, Batista R, Lyra J, Pinto V, Coelho R, Celestino R, Prazeres H, Lima L, et al: Frequency of TERT promoter mutations in human cancers. Nat Commun 4: 2185, 2013.
25. Huang HN, Chiang YC, Cheng WF, Chen CA, Lin MC and Kuo KT: Molecular alterations in endometrial and ovarian clear cell carcinomas: Clinical impacts of telomerase reverse transcriptase promoter mutation. Mod Pathol 28: 303-311, 2015.
26. Havik AB, Lind GE, Honne H, Meling TR, Scheie D, Hall KS, van den Berg E, Mertens F, Picci P, Lothe RA, et al: Sequencing IDH1/2 glioma mutation hotspots in gliomas and malignant peripheral nerve sheath tumors. Neuro Oncol 16: 320-322, 2014.
27. Patel KP, Barkoh BA, Chen Z, Ma D, Reddy N, Medeiros LJ and Luthra R: Diagnostic testing for IDH1 and IDH2 variants in acute myeloid leukemia: an algorithmic approach using high-resolution melting curve analysis. J Mol Diagn 13: 678-686, 2011.
28. Wagner K, Damm F, Göhring G, Görlich K, Heuser M, Schäfer I, Ottmann O, Lübbert M, Heit W, Kanz L, et al: Impact of IDH1 R132 mutations and an IDH1 single nucleotide polymorphism in cytogenetically normal acute myeloid leukemia: SNP rs11554137 is an adverse prognostic factor. J Clin Oncol 28: 2356-2364, 2010.
29. Margison GP and Santibáñez-Koref MF: O6-alkylguanine-DNA alkyltransferase: Role in carcinogenesis and chemotherapy. Bioessays 24: 255-266, 2002.
30. Håvik AB, Brandal P, Honne H, Dahlback HS, Scheie D, Hektoen M, Meling TR, Helseth E, Heim S, Lothe RA and Lind GE: MGMT promoter methylation in gliomas-assessment by pyrosequencing and quantitative methylation-specific PCR. J Transl Med 10: 36, 2012.

Paper II

Genomic imbalances are involved in miR-30c and let-7a deregulation in ovarian tumors: implications for HMGA2 expression

Agostini A, Brunetti M, Davidson B, Tropé CG, Heim S, Panagopoulos I, Micci F.

Oncotarget 2017 Mar 28;8(13):21554-21560

www.impactjournals.com/oncotarget/ Oncotarget, Advance Publications 2017

Genomic imbalances are involved in miR-30c and let-7a deregulation in ovarian tumors: implications for HMGA2 expression

Antonio Agostini1,2, Marta Brunetti1,2, Ben Davidson3,4, Claes G Tropé5, Sverre Heim1,2,4, Ioannis Panagopoulos1,2, Francesca Micci1,2

1Section for Cancer Cytogenetics, Institute for Cancer Genetics and Informatics, The Norwegian Radium Hospital, Oslo University Hospital, Oslo, Norway
2Centre for Cancer Biomedicine, University of Oslo, Oslo, Norway
3Department of Pathology, The Norwegian Radium Hospital, Oslo University Hospital, Oslo, Norway
4Faculty of Medicine, University of Oslo, Oslo, Norway
5Department of Gynecology, The Norwegian Radium Hospital, Oslo University Hospital, Oslo, Norway

Correspondence to: Francesca Micci, email: [email protected]
Keywords: HMGA2, miR-30c, let-7a, FHIT, LIN28A
Received: July 05, 2016
Accepted: January 31, 2017
Published: March 01, 2017

ABSTRACT

The High-mobility group AT-hook 2 protein (HMGA2) is involved in different processes during tumorigenesis. High expression levels of HMGA2 are found in various types of cancer, with recent studies highlighting the important role of miRNAs in the regulation of HMGA2 expression. We report a study of 155 ovarian tumors (30 sex-cord stromal tumors, 22 borderline tumors, and 103 carcinomas) analyzed for HMGA2 expression as well as the expression of two miRNAs targeting this gene, let-7a and miR-30c. We also evaluated the expression of the fragile histidine triad (FHIT) and lin28 homologues (LIN28A/B) genes, which are known to be an enhancer of miR-30c expression and a repressor of let-7a, respectively. HMGA2 was found expressed at high levels in most samples analyzed, with clear cell carcinomas as the only exception. let-7a and miR-30c were highly deregulated in all tumor types. LIN28A and FHIT were found overexpressed in all examined tumor types. The chromosomal imbalances that might lead to loss of the genes expressing let-7a and miR-30c could be evaluated on the basis of previously generated karyotypic and high resolution comparative genomic hybridization (CGH) data on 103 tumors. 76% of the samples with an imbalanced genome had at least one chromosomal aberration leading to a deletion of a miRNA cluster for let-7a and miR-30c. FISH using locus-specific probes for these clusters validated the aberrations at the gene level. Our study shows that genomic imbalances are involved in miR-30c and let-7a deregulation. One can reasonably assume that dysregulation of these miRNAs is a cause leading to HMGA2 upregulation in ovarian tumors.

INTRODUCTION

Tumors of the ovary are a heterogeneous group which is divided into many different subentities based on histologic and cytologic features. Most ovarian tumors (about 90%) are epithelial in nature. Malignant epithelial ovarian tumors are currently divided into high-grade serous, low-grade serous, endometrioid, mucinous, and clear cell carcinomas [1]. Each carcinoma histotype is characterized by different genetic aberrations. Serous high-grade carcinomas are associated with TP53 and BRCA mutations, whereas their low-grade counterparts often carry KRAS and BRAF mutations. KRAS and HER2 mutations are frequent in mucinous carcinomas whereas ARID1A is frequently mutated in endometrioid and clear cell carcinomas [2]. Complex chromosomal rearrangements, involving chromosome bands 19p13, 19q23, 8q23, and 22q in decreasing order of frequency, are seen in most types of ovarian carcinoma [3]. Borderline tumors of the ovary are neoplasms of low or uncertain malignant potential. They present cellular atypia but are not invasive [4]. Sex-cord stromal tumors account for 8% of all ovarian tumors; thecofibromas and fibromas are among the most common tumors of this type [5], whose genetic features remain largely unknown. Nevertheless, some nonrandom chromosomal aberrations have been reported, with trisomy and/or tetrasomy 12 as the most common aneuploidies in thecofibromas followed by trisomy for chromosomes 10, 18, 4, and 9 [6].

Altered expression of the high-mobility group AT-hook protein gene, HMGA2, has been reported in both benign and malignant ovarian tumors [7, 8]. HMGA2 is usually expressed during embryonic development [9] but not in adult normal tissues [10]. The gene is assumed to be involved in several different tumorigenic processes from cellular proliferation to epithelial-mesenchymal transition and invasive growth [11]. Lately, the role of specific miRNAs has been highlighted in the regulation of HMGA2 expression in neoplastic cells [12–14]. MicroRNAs are non-coding RNAs with diverse biological functions [15]. They play an important regulatory role by targeting specific mRNAs for degradation or translation repression [16]. This may be the main event in the development of some cancers as many transcripts are affected with profound influence on signaling pathways [17].

To obtain more insight into the role of HMGA2 and its possible regulation by miRNAs in ovarian malignancies, we analyzed a series of 155 ovarian tumors of different histologies for HMGA2 gene expression as well as for the expression of two miRNAs targeting HMGA2, let-7a and miR-30c. We also evaluated the expression of the fragile histidine triad (FHIT) and the lin28 homologues (LIN28A and LIN28B) genes, which are known to enhance miR-30c expression and repress let-7a, respectively [citation illegible]. The findings were collated with the pattern of genomic imbalances previously detected in the majority of these tumors (n = 103) by means of karyotyping and/or high resolution comparative genomic hybridization (HR-CGH). FISH with locus-specific probes was used to validate these results at gene level.

RESULTS

All molecular investigations gave informative results. The expressions of the genes and miRNAs were normalized using normal, commercially available tissue samples from the ovary. HMGA2 was expressed at high level in the majority of tumors analyzed (Figure 1). The highest HMGA2 relative normalized expression levels were found in tumors of the serous subgroup where HG-S had a mean value of 74.3 and a median of 31, whereas the LG-S showed a mean value of 67.6 and a median of 27.2. The serous ovarian carcinomas were followed by the thecofibromas (mean = [illegible]; median = [illegible]) and the endometrioid (mean = 36; median = 5) carcinomas. The borderline tumors and the mucinous carcinomas both showed a mean of 21 with a median of 21 and 16, respectively, whereas the fibroma subgroup showed a mean = median = 16. The lowest levels of HMGA2 expression were detected in clear cell carcinomas (mean = 1.8; median = 0.5).

let-7a and miR-30c were highly deregulated in the ovarian tumors, with low levels of expression being found for both miRNAs in all tumor types examined (Figure 1). LIN28A, on the other hand, was overexpressed in all histotypes (Figure 1) with the highest expression being seen in HG-S (mean = 10.6; median = 2.32) followed by thecofibromas and fibromas, both with a mean of [illegible] and a median of 5 and 1.5, respectively. The group of borderline tumors showed a mean value of 5.7 (median = 2.09). The remaining histotypes showed a mean relative expression of 4.3 for LG-S, 3.8 for clear cell, 3.6 for endometrioid, and 2.5 for mucinous carcinomas. LIN28B expression was not detected in any of the samples analyzed nor was it seen in the normal controls. FHIT was slightly overexpressed in all ovarian tumor types. The highest expression levels for the gene were found in thecofibromas (mean = [illegible]; median = 2) followed by HG-S (mean = 2; median = 1.7), LG-S (mean = 2; median = 1.5), endometrioid carcinomas (mean = [illegible]; median = [illegible]), fibromas (mean = [illegible]; median = 1.1), mucinous (mean = 1.5; median = 1.4), borderline (mean = 1.5; median = 1.3), and clear cell carcinomas (mean = 1.4; median = 1.1) (Figure 1).

In order to find out if genomic imbalances could explain the low expression levels of these miRNAs, we went back to our previously generated cytogenetic data (karyotype and/or HR-CGH) available on 103 of the 155 samples [3] to see if the regions where the let-7a and miR-30c clusters are located were visibly lost. We found a deletion corresponding to at least one of the chromosomal bands where the three clusters for let-7a map (9q22.32, 11q24.1, and 22q13.31) in 47 out of 62 tumors with an abnormal genome (76% of the tumors; Supplementary Table 1). The most frequent deletion was of chromosomal subband 22q13.31 where MIRLET7A3 maps (39 out of 47 samples with deletion of the let-7a cluster, 83%). MIRLET7A1, located in chromosomal subband 9q22.32, was deleted in 49% of the samples with deletion of the let-7a cluster, while MIRLET7A2 (11q24.1) was deleted in 19%. Moreover, the bands where the two clusters for miR-30c (MIR30C1/2) map, 1p34.2 and 6q13, were found deleted in 41 out of 62 cases with genomic aberrations (66%). MIR30C1 on 1p34.2 had a rate of loss of 88% (36 out of 41), while MIR30C2 (6q13) was deleted in 32% (13 out of 41) of the samples with a miR-30c cluster deletion.

We performed FISH analyses using locus-specific probes for the let-7a and miR-30c clusters on 42 cases, 22 of which had not been cytogenetically characterized, to confirm at the genic resolution level that loss of one or more clusters had indeed occurred. FISH experiments gave informative results in all but four tumors (38 out of 42). Twelve cases showed a normal diploid pattern of signals for all the probes analyzed. The remaining 26 cases showed mostly heterozygous deletions of one or more miRNAs (Figure 2); only in four cases did the signal pattern indicate a homozygous deletion. At least one let-7a cluster was deleted in 88.5% (23 out of 26) of the tumors with abnormalities. MIRLET7A3 on 22q13.31 was deleted in 91% of the cases with a deleted let-7a cluster (21/23), while the deletion rate for MIRLET7A2 (11q24.1) and MIRLET7A1 (9q22) was 21% (5/23) and 30% (7/23), respectively. At least one miR-30c cluster was deleted in 19% (5/26) of the cases analyzed. MIR30C1 on 1p34.2 was deleted in 2 out of 5 cases with a miR-30c cluster deletion, while MIR30C2 on 6q13 was deleted in 4 out of 5 cases.

DISCUSSION

The tumorigenic mechanisms behind different ovarian malignancies are still poorly understood, as is tumor progression leading to recurrences and/or metastases. In a previous study [7], we found HMGA2 expressed in 74% of the ovarian tumors examined. We have now quantified the expression of this gene in the same series of tumors.

HMGA2 belongs to the High-mobility group AT-hook family of non-histone proteins involved in a wide variety of nuclear processes from chromatin dynamics to gene regulation. The gene is expressed during embryonic development [9] but is usually silent in adult normal tissues except adipose tissues, lung, and kidney [10]. High expression levels of HMGA2 have been found in various tumors, both benign and malignant. In the present series, the gene was found highly expressed in all tumor types analyzed; however, the level differed among the different histological subgroups. Serous ovarian carcinomas showed the highest expression, with HG-S generally demonstrating higher levels than did the LG-S subgroup. Although this certainly indicates a general correlation between expression level and how malignant the tumor is, exceptions were seen. Whereas HMGA2 was expressed at high levels in the sex-cord stromal tumors analyzed (fibromas and thecofibromas), tumors that have hitherto not been subjected to this type of examination, clear cell carcinomas were found to show the lowest expression level despite these tumors' obvious malignant potential.

The mechanisms that lead to expression of HMGA2 are still not fully understood, but interaction between miRNAs and the HMGA2 3′ untranslated region (3′UTR) seems to play a crucial role. It has been shown that the HMGA2 3′UTR has many regulatory sequences which are targeted by different families of miRNAs [19], and it is thought that miRNA-dependent repression is the main mechanism for controlling HMGA2 expression [12–14].
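The deletion frequencies quoted in the Results use two kinds of denominator: tumors with an abnormal genome (for example 47 of 62) and, for the individual clusters, only those tumors in which the cluster family was already found deleted (for example 39 of 47). A minimal Python sketch that re-derives the percentages from the counts given in the text may help keep the two apart. The helper name `percent` and the dictionary labels are illustrative choices, not part of the original analysis.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    if whole <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * part / whole


# Counts quoted in the Results: (tumors with the deletion, tumors in the denominator).
reported = {
    "let-7a cluster lost, tumors with abnormal genome (CGH/karyotype)": (47, 62),
    "MIRLET7A3 lost, tumors with a let-7a cluster deletion": (39, 47),
    "miR-30c cluster lost, tumors with abnormal genome (CGH/karyotype)": (41, 62),
    "MIR30C1 lost, tumors with a miR-30c cluster deletion": (36, 41),
    "MIR30C2 lost, tumors with a miR-30c cluster deletion": (13, 41),
    "let-7a cluster lost, abnormal cases by FISH": (23, 26),
}

for label, (deleted, total) in reported.items():
    print(f"{label}: {deleted}/{total} = {percent(deleted, total):.1f}%")
```

Because MIR30C1 and MIR30C2 can be lost in the same tumor, the cluster-specific percentages are not expected to sum to 100.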

Figure 1: Relative normalized expression of HMGA2, LIN28A, and FHIT as well as the miRNAs miR-30c and let-7a. The relative expression of HMGA2 (A), LIN28A (D), FHIT (E) and the miRNAs let-7a (B) and miR-30c (C) is correlated with histotype.

Another piece of evidence for the importance of the interaction between HMGA2 and miRNAs is the many disrupted forms of HMGA2, due to rearrangements of chromosomal band 12q15, that are seen in different benign mesenchymal tumors [20–22]. These alterations involve exon 3 and cause deletion of downstream regions leading to a truncated transcript that can evade miRNA-dependent gene silencing. Since we have previously shown that truncation of HMGA2 is not common in ovarian malignant tumors [7], we in this study focused on other causes of HMGA2 deregulation, analyzing two miRNAs that are known to target HMGA2 and suppress its function, miR-30c and let-7a. Deregulation of the former has been identified as the main mechanism of HMGA2 regulation in lung cancer [13] whereas similar involvement of the latter was found in other tumors such as nasopharyngeal carcinoma and non small cell lung cancer [23, 24]. Both miRNAs were found expressed at low levels in all the ovarian tumors analyzed with no major group-to-group differences. Each of these miRNAs is regulated in a different manner: miR-30c expression is enhanced by the Fragile Histidine Triad protein (FHIT), whereas let-7a is repressed by LIN28A/B. Since it is not known which of these two pathways is active in ovarian cancer, we investigated both and found the FHIT gene expressed in all tumors. This is a well known tumor suppressor gene [25] whose expression has been shown to exhibit an inverse correlation with HMGA2 in lung cancer [13], where FHIT enhances the expression of miR-30c and causes HMGA2 repression. This does not seem to be the case for ovarian tumors as both FHIT and HMGA2 were found expressed in all tumors analyzed. Another path known to regulate HMGA2 expression involves the miRNAs belonging to the let-7 family that are frequently downregulated in cancer, usually resulting in HMGA2 overexpression [26, 27]. In all tumors analyzed, we found low expression of let-7a and corresponding overexpression of LIN28A. Reportedly, the downregulation of the let-7 family of miRNAs is caused by overexpression of the RNA binding protein homologues LIN28A and/or LIN28B that inhibit the maturation of both pri- and pre-let-7 [28]. This seems to be the case also for ovarian tumors, where the pathway leading to HMGA2 deregulation relies on overexpression of LIN28A, downregulating let-7a and leading to overexpression of HMGA2. We found that LIN28A showed an average normalized expression of 2 in almost all groups, while let-7a showed a normalized expression close to zero.

CGH data showed deletions involving the clusters for let-7a and miR-30c in 76% and 66% of the tumors with a rearranged genome, respectively. FISH analyses validated at the genic resolution level the above data and showed that the deletions were heterozygous in the majority of samples. This fits well with what was also reported by Kan et al. 2015 [29], who found that genomic alterations led to dysregulation of miRNA expression in serous ovarian carcinomas. We now show that this also holds for other types of ovarian cancer. Deletions of let-7a clusters were found in all four fibromas analyzed with FISH. Taken together, the CGH and FISH data show that, in borderline tumors, genomic deletion of at least one of the clusters was seen in six out of seven cases, while the same pattern was seen in two out of three clear cell carcinomas, in five out of eight mucinous carcinomas, in 10 out of 17 endometrioid carcinomas, in all three LG-S, and in 28 out of 33 HG-S. Interestingly, MIRLET7A3 on 22q13.31 was found frequently deleted in ovarian tumors, in 62% of the cases analyzed by CGH (39/62) but 81% of the cases analyzed by FISH (21/26). Our results indicated that this cluster is deleted across the entire spectrum of ovarian tumors: from benign sex-cord stromal tumors to carcinomas, underscoring yet again the importance of dysregulated miRNAs in ovarian tumorigenesis and progression. Further analysis of larger cohorts of ovarian tumors of different types/histologies should clarify the extent to which genomic losses act through altering miRNA function.

In conclusion, our study shows that the miRNAs let-7a and miR-30c are deregulated in ovarian cancer, and that genomic imbalances may be a cause of this deregulation. Although we suppose that the low levels of let-7a in ovarian tumors are brought about by deletions of let-7a clusters, we cannot exclude that also LIN28A may play a role in the downregulation of this miRNA when expressed (Figure 1). FHIT does not seem to enhance miR-30c expression in ovarian tumors, so probably other causes act together with genomic imbalances leading to miR-30c deregulation. The high frequency of downregulation of these miRNAs found indicates that this is a significant way of obtaining HMGA2 deregulation in ovarian tumors. Our data nevertheless lead us to suppose that also other players are involved in HMGA2 regulation, not least because although the let-7a and miR-30c levels differ but little among the various tumor histotypes, the levels of HMGA2 expression do. The fact that HMGA2 is not frequently expressed in clear cell carcinomas, although this is one of the tumor groups with the lowest expression of the miRNAs, suggests some other type of HMGA2 regulation in at least this ovarian cancer subgroup.

Figure 2: FISH analyses of the let-7a and miR-30c clusters. A signal pattern indicating heterozygous deletion is seen of MIR30C1 on 1p34.2 (A), MIR30C2 on 6q13 (B), MIRLET7A1 on 9q22 (C), MIRLET7A2 on 11q24, and MIRLET7A3 on 22q13 (D).

MATERIALS AND METHODS

Tumor material

The material consisted of fresh frozen samples from 155 ovarian tumors surgically removed at The Norwegian Radium Hospital between 1999 and 2010. The series consists of 30 sex-cord stromal tumors ([illegible] thecofibromas, ThF; [illegible] fibromas, F), 22 borderline tumors (B), and 103 carcinomas of which 12 were of the clear cell (CC) histological subtype, 16 were mucinous (M), 30 endometrioid (E), 10 low-grade (LG-S) serous, and 35 high-grade (HG-S) serous carcinomas. The study was approved by the regional ethics committee (Regional komité for medisinsk forskningsetikk Sør-Øst, Norge, http://helseforskning.etikkom.no) and written informed consent was obtained from the patients.

RNA extraction

Total RNA was extracted using miRNeasy Kit (Qiagen, Hilden, Germany) and QIAcube (Qiagen). The concentration and purity of the RNA were measured with a Nanovue Spectrophotometer (GE Healthcare, Pittsburgh, PA, USA).

Real-time polymerase chain reaction (real-time PCR)

The expression of the miRNAs and genes of interest was assessed by Real-Time PCR using the CFX96 Touch Real-Time PCR detection system (Bio-Rad Laboratories, Oslo, Norway). The reactions were carried out in quadruplicate using TaqMan Assays and the TaqMan Universal Master Mix II with UNG (Applied Biosystems, Foster City, CA, USA) following the manufacturer's protocol. Human Universal Reference Total RNA (Clontech, Mountain View, CA, USA) was used as internal reaction control. Two commercial Total RNAs from the ovary, MVP Total RNA Human Ovary (Agilent Technologies, Santa Clara, CA, USA) and Human Ovary Total RNA (Zyagen, San Diego, CA, USA), were used as reference for relative expression normalization. The Real-Time data were analyzed with Bio-Rad CFX manager 3.1 (Bio-Rad). The normalized expression was calculated using the 2^(−ΔΔCt) (Livak) method [30].

microRNA expression

Ten ng of total RNA were reverse transcribed with the TaqMan MicroRNA Reverse Transcription Kit (Applied Biosystems) following the manufacturer's protocol. miRNA expressions were assessed with Real-Time PCR using the TaqMan MicroRNA Assays (Applied Biosystems) for let-7a (TM:000377) and miR-30c (TM:[illegible]), whereas RNU6B (TM:[illegible]) was used as reference.

Gene expression

One μg of extracted total RNA for each tumor was reverse-transcribed in a 20 μl reaction volume using iScript Advanced cDNA Synthesis Kit according to the manufacturer's instructions (Bio-Rad Laboratories, Oslo, Norway). Gene expression was assessed with Real-Time PCR using the TaqMan Gene Expression Assays (Applied Biosystems) for HMGA2 (Hs_04397751_m1), FHIT (Hs_00179987_m1), LIN28A (Hs_00702808_s1), and LIN28B (Hs_01013729_m1), whereas RPL4 (Hs_01939407_gH) was used as a reference gene as it has been proven to be stably expressed in ovarian cells [31, 32].

Fluorescence in situ hybridization (FISH)

FISH analyses were performed on interphase nuclei. Bacterial Artificial Chromosome (BAC) clones were retrieved from the RPCI-11 Human BAC and CalTech Human BAC libraries (P. De Jong Libraries; http://bacpac.chori.org/home.htm). The BAC clones were selected according to physical and genetic mapping data as reported on the Human Genome Browser at the University of California, Santa Cruz website (May 2004, http://genome.ucsc.edu/). The clones used were: RP11-2B6 for MIRLET7A1 (9q22.32), RP11-453C14 for MIRLET7A2 (11q24.1), CTD-2504J8 for MIRLET7A3 (22q13.31), RP11-170L4 for MIR30C1 (1p34.2), and RP11-756H9 for MIR30C2 (6q13). All clones were grown in selective media, and DNA was extracted and labelled according to the manufacturer's recommendations (http://bacpac.chori.org/home.htm). The slides were counter-stained with 0.2 μg/ml DAPI and overlaid with a 24 × 50 mm2 coverslip. Fluorescent signals were captured and analyzed using the CytoVision system (Leica Biosystems, Newcastle, UK).
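The relative quantification named in the real-time PCR section, the 2^(−ΔΔCt) (Livak) method [30], can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `relative_expression` and all Ct values are hypothetical, the reference gene is taken to be RPL4 and the calibrator to be normal ovary RNA as described in the text, and the formula assumes close to 100% amplification efficiency for both assays. The study itself used Bio-Rad CFX Manager for this calculation.

```python
from statistics import mean


def relative_expression(ct_target_tumor, ct_ref_tumor,
                        ct_target_normal, ct_ref_normal):
    """Livak 2^(-ddCt): expression of a target gene in a tumor relative to a
    normal calibrator, each normalized to a reference gene.

    Every argument is a list of replicate Ct values (e.g. quadruplicates).
    """
    # dCt: target Ct minus reference Ct, within each sample.
    delta_ct_tumor = mean(ct_target_tumor) - mean(ct_ref_tumor)
    delta_ct_normal = mean(ct_target_normal) - mean(ct_ref_normal)
    # ddCt: tumor dCt minus calibrator dCt.
    delta_delta_ct = delta_ct_tumor - delta_ct_normal
    return 2.0 ** (-delta_delta_ct)


# Hypothetical quadruplicate Ct values, invented for illustration only.
tumor_hmga2 = [24.0, 24.1, 23.9, 24.0]
tumor_rpl4 = [20.0, 20.0, 20.1, 19.9]
normal_hmga2 = [29.0, 29.1, 28.9, 29.0]
normal_rpl4 = [20.0, 20.1, 19.9, 20.0]

fold = relative_expression(tumor_hmga2, tumor_rpl4, normal_hmga2, normal_rpl4)
print(f"HMGA2 relative normalized expression: {fold:.1f}")
```

With these invented values the tumor ΔCt is 4 and the calibrator ΔCt is 9, so ΔΔCt is −5 and the tumor expresses the target about 32 times more strongly than the calibrator. A value near zero, as reported for let-7a, corresponds to a large positive ΔΔCt.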

ACKNOWLEDGMENTS AND FUNDING

The authors wish to thank Hege Kilen Andersen and Kristin Andersen for excellent technical assistance with the FISH analyses. This work was supported by grants from the Norwegian Radium Hospital Foundation, the John and Inger Fredriksen Foundation for Ovarian Cancer Research, and the Anders Jahres foundation through UNIFOR (University of Oslo).

CONFLICTS OF INTEREST

The authors declare no competing financial interests.

REFERENCES

1. Prat J. Ovarian carcinomas: five distinct diseases with different origins, genetic alterations, and clinicopathological features. Virchows Arch. 2012; 460:237–249.
2. Prat J. New insights into ovarian cancer pathology. Ann Oncol. 2012; 23:x111–x117.
3. Micci F, Haugom L, Abeler VM, Davidson B, Trope CG, Heim S. Genomic profile of ovarian carcinomas. BMC Cancer. 2014; 14:315. doi: 10.1186/1471-2407-14-315.
4. Tibiletti MG, Bernasconi B, Furlan D, Bressan P, Cerutti R, Facco C, Franchi M, Riva C, Cinquetti R, Capella C, Taramelli R. Chromosome 6 abnormalities in ovarian surface epithelial tumors of borderline malignancy suggest a genetic continuum in the progression model of ovarian neoplasms. Clin Cancer Res. 2001; 7:3404–3409.
5. Lim G, Oliva E. Sex Cord Stromal Tumors of the Ovary. In: Soslow RA and Tornos C, eds. Diagnostic Pathology of Ovarian Tumors. Springer New York. 2011; 193–234.
6. Micci F, Haugom L, Abeler VM, Trope CG, Danielsen HE and Heim S. Consistent numerical chromosome aberrations in thecofibromas of the ovary. Virchows Archiv. 2008; 452:269–276.
7. Agostini A, Panagopoulos I, Davidson B, Trope CG, Heim S, Micci F. A new truncated form of HMGA2 in tumors of the ovaries. Onc Lett. 2015.
8. Hetland TE, Holth A, Kaern J, Florenes VA, Trope CG, Davidson B. HMGA2 protein expression in ovarian serous carcinoma effusions, primary tumors, and solid metastases. Virchows Arch. 2012; 460:505–513.
9. Chiappetta G, Avantaggiato V, Visconti R, Fedele M, Battista S, Trapasso F, Merciai BM, Fidanza V, Giancotti V, Santoro M, Simeone A, Fusco A. High level expression of the HMGI (Y) gene during embryonic development. Oncogene. 1996; 13:2439–2446.
10. Rogalla P, Drechsler K, Frey G, Hennig Y, Helmke B, Bonk U, Bullerdiek J. HMGI-C expression patterns in human tissues. Implications for the genesis of frequent mesenchymal tumors. Am J Pathol. 1996; 149:775–779.
11. Cleynen I, Van de Ven WJ. The HMGA proteins: a myriad of functions (Review). Int J Oncol. 2008; 32:289–305.
12. Liu Y, Liang H, Jiang X. miR-1297 promotes apoptosis and inhibits the proliferation and invasion of hepatocellular carcinoma cells by targeting HMGA2. Int J Mol Med. 2015:10.
13. Suh SS, Yoo JY, Cui R, Kaur B, Huebner K, Lee TK, Aqeilan RI, Croce CM. FHIT suppresses epithelial-mesenchymal transition (EMT) and metastasis in lung cancer through modulation of microRNAs. PLoS Genet. 2014; 10:e1004652.
14. Lin Y, Liu AY, Fan C, Zheng H, Li Y, Zhang C, Wu S, Yu D, Huang Z, Liu F, Luo Q, Yang CJ, Ouyang G. MicroRNA-33b Inhibits Breast Cancer Metastasis by Targeting HMGA2, SALL4 and Twist1. Sci Rep. 2015; 5:9995. doi: 10.1038/srep09995.
15. Ambros V. The functions of animal microRNAs. Nature. 2004; 431:350–355.
16. MacFarlane LA, Murphy PR. MicroRNA: Biogenesis, Function and Role in Cancer. Current Genomics. 2010; 11:537–561.
17. Jansson MD, Lund AH. MicroRNA and cancer. Mol Oncol. 2012; 6:590–610.
18. Su JL, Chen PS, Johansson G, Kuo ML. Function and regulation of let-7 family microRNAs. Microrna. 2012; 1:34–39.
19. Kristjansdottir K, Fogarty EA, Grimson A. Systematic analysis of the Hmga2 3′ UTR identifies many independent regulatory sequences and a novel interaction between distal sites. RNA. 2015; 21:1346–1360.
20. Schoenmakers EF, Wanschura S, Mols R, Bullerdiek J, Van den Berghe H, Van de Ven WJ. Recurrent rearrangements in the high mobility group protein gene, HMGI-C, in benign mesenchymal tumours. Nat Genet. 1995; 10:436–444.
21. Mine N, Kurose K, Nagai H, Doi D, Ota Y, Yoneyama K, Konishi H, Araki T, Emi M. Gene fusion involving HMGIC is a frequent aberration in uterine leiomyomas. J Hum Genet. 2001; 46:408–412.
22. Bartuma H, Hallor KH, Panagopoulos I, Collin A, Rydholm A, Gustafson P, Bauer HC, Brosjo O, Domanski HA, Mandahl N, Mertens F. Assessment of the clinical and molecular impact of different cytogenetic subgroups in a series of 272 lipomas with abnormal karyotype. Genes Chromosomes Cancer. 2007; 46:594–606.
23. Wu A, Wu K, Li J, Mo Y, Lin Y, Wang Y, Shen X, Li S, Li L, Yang Z. Let-7a inhibits migration, invasion and epithelial-mesenchymal transition by targeting HMGA2 in nasopharyngeal carcinoma. J Transl Med. 2015; 13:105. doi: 10.1186/s12967-015-0462-8.
24. Wang YY, Ren T, Cai YY, He XY. MicroRNA let-7a inhibits the proliferation and invasion of nonsmall cell lung cancer cell line 95D by regulating K-Ras and HMGA2 gene expression. Cancer Biother Radiopharm. 2013; 28:131–137.
25. Waters CE, Saldivar JC, Hosseini SA, Huebner K. The FHIT gene product: tumor suppressor and genome "caretaker". Cell Mol Life Sci. 2014; 71:4577–4587.
26. Park SM, Shell S, Radjabi AR, Schickel R, Feig C, Boyerinas B, Dinulescu DM, Lengyel E, Peter ME. Let-7 prevents early cancer progression by suppressing expression of the embryonic gene HMGA2. Cell Cycle. 2007; 6:2585–2590.
27. Kolenda T, Przybyla W, Teresiak A, Mackiewicz A, Lamperska KM. The mystery of let-7d-a small RNA with great power. Contemp Oncol (Pozn). 2014; 18:293–301.
28. Wang T, Wang G, Hao D, Liu X, Wang D, Ning N, Li X. Aberrant regulation of the LIN28A/LIN28B and let-7 loop in human malignant tumors and its effects on the hallmarks of cancer. Mol Cancer. 2015; 14:125. doi: 10.1186/s12943-015-0402-5.
29. Kan CWS, Howell VM, Hahn MA, Marsh DJ. Genomic alterations as mediators of miRNA dysregulation in ovarian cancer. Genes Chromosomes Cancer. 2015; 54:1–19.
30. Livak KJ, Schmittgen TD. Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) Method. Methods. 2001; 25:402–408.
31. Kolkova Z, Arakelyan A, Casslen B, Hansson S, Kriegova E. Normalizing to GAPDH jeopardises correct quantification of gene expression in ovarian tumours - IPO8 and RPL4 are reliable reference genes. J Ovarian Res. 2013; 6:60–66.
32. Fu J, Bian L, Zhao L, Dong Z, Gao X, Luan H, Sun Y and Song H. Identification of genes for normalization of quantitative real-time PCR data in ovarian tissues. Acta Biochim Biophys Sin (Shanghai). 2010; 42:568–574.


Paper III

The microRNA mir-192/215 family is upregulated in mucinous ovarian carcinomas

Agostini A, Brunetti M, Davidson B, Tropé CG, Heim S, Panagopoulos I, Micci F.

Submitted manuscript (Scientific Reports)

Paper IV

Recurrent involvement of DPP9 in gene fusions in serous ovarian carcinoma

Smebye ML, Agostini A, Johannessen B, Thorsen J, Davidson B, Tropé CG, Heim S, Skotheim RI, Micci F.

BMC Cancer 2017 Sep 11;17(1):642

Smebye et al. BMC Cancer (2017) 17:642 DOI 10.1186/s12885-017-3625-6

RESEARCH ARTICLE Open Access

Involvement of DPP9 in gene fusions in serous ovarian carcinoma

Marianne Lislerud Smebye1,2, Antonio Agostini1,2, Bjarne Johannessen2,3, Jim Thorsen1,2, Ben Davidson4,5, Claes Göran Tropé6, Sverre Heim1,2,5, Rolf Inge Skotheim2,3 and Francesca Micci1,2*

Abstract

Background: A fusion gene is a hybrid gene consisting of parts from two previously independent genes. Chromosomal rearrangements leading to gene breakage are frequent in high-grade serous ovarian carcinomas and have been reported as a common mechanism for inactivating tumor suppressor genes. However, no fusion genes have been repeatedly reported to be recurrent driver events in ovarian carcinogenesis. We combined genomic and transcriptomic information to identify novel fusion gene candidates and aberrantly expressed genes in ovarian carcinomas.

Methods: Examined were 19 previously karyotyped ovarian carcinomas (18 of the serous histotype and one undifferentiated). First, karyotypic aberrations were compared to fusion gene candidates identified by RNA sequencing (RNA-seq). In addition, we used exon-level gene expression microarrays as a screening tool to identify aberrantly expressed genes possibly involved in gene fusion events, and compared the findings to the RNA-seq data.

Results: We found a DPP9-PPP6R3 fusion transcript in one tumor showing a matching genomic 11;19-translocation. Another tumor had a rearrangement of DPP9 with PLIN3. Both rearrangements were associated with diminished expression of the 3′ end of DPP9 corresponding to the breakpoints identified by RNA-seq. For the exon-level expression analysis, candidate fusion partner genes were ranked according to deviating expression compared to the median of the sample set. The results were collated with data obtained from the RNA-seq analysis. Several fusion candidates were identified, among them TMEM123-MMP27, ZBTB46-WFDC13, and PLXNB1-PRKAR2A, all of which led to stronger expression of the 3′ genes. In view of our previous findings of nonrandom rearrangements of chromosome 19 in this cancer type, particular emphasis was given to changes of this chromosome and a DDA1-FAM129C fusion event was identified.

Conclusions: We have identified novel fusion gene candidates in high-grade serous ovarian carcinoma. DPP9 was involved in two different fusion transcripts that both resulted in deregulated expression of the 3′ end of the transcript and thus possible loss of the active domains in the DPP9 protein. The identified rearrangements might play a role in tumorigenesis or tumor progression.

Keywords: Ovarian carcinoma, Fusion genes, Gene expression, DPP9
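The screening idea mentioned in the Results paragraph, ranking candidate genes by how far their expression in one sample deviates from the median of the sample set, can be illustrated with a small Python sketch. It is not the authors' pipeline: the function name `rank_deviations`, the gene and sample labels, and the log2 expression values are all hypothetical, and the published analysis worked at exon level rather than on one value per gene.

```python
from statistics import median


def rank_deviations(expression):
    """Rank (gene, sample) pairs by absolute deviation from the cohort median.

    expression: {gene: {sample: log2 expression value}}
    Returns a list of (gene, sample, deviation), largest |deviation| first.
    """
    ranked = []
    for gene, by_sample in expression.items():
        cohort_median = median(by_sample.values())
        for sample, value in by_sample.items():
            ranked.append((gene, sample, value - cohort_median))
    ranked.sort(key=lambda row: abs(row[2]), reverse=True)
    return ranked


# Hypothetical log2 expression values, invented for illustration only.
expression = {
    "GENE_A": {"T1": 5.0, "T2": 5.2, "T3": 9.0},
    "GENE_B": {"T1": 7.0, "T2": 7.1, "T3": 6.9},
}

ranked = rank_deviations(expression)
for gene, sample, deviation in ranked[:3]:
    print(f"{gene} in {sample}: {deviation:+.2f} log2 units from the median")
```

In this toy data set GENE_A in sample T3 ranks first, which is the kind of outlier that would then be checked against the RNA-seq fusion calls.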

* Correspondence: [email protected] 1Section for Cancer Cytogenetics, Institute for Cancer Genetics and Informatics, The Norwegian Radium Hospital, Oslo University Hospital, Oslo, Norway 2Centre for Cancer Biomedicine, University of Oslo, Oslo, Norway Full list of author information is available at the end of the article

© The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Background

Ovarian malignancies account for 4% of cancer in women and are the most frequent cause of death due to gynecological cancer in Western countries [1]. Carcinomas are the most common subtype, with the serous histotype being particularly prevalent [1]. Most serous ovarian carcinomas are genomically unstable. Approximately 50% of the tumors have defects in the homologous recombination DNA repair pathway, with BRCA1 and BRCA2 alterations as the most frequent, while the remaining half show less characteristic aberration patterns [2–4].

Genomic imbalances are widespread in serous ovarian carcinomas and in a large The Cancer Genome Atlas (TCGA) study of ovarian carcinomas, 113 significant focal DNA copy number alterations were identified [4]. Structural chromosomal aberrations are also frequently seen, with involvement of chromosome 19 being particularly common [4–10]. Chromosome 11 has been reported to be one of the recurrent partners in such rearrangements [9, 11, 12]. The functional consequences of these aberrations are not understood.

Genomic rearrangements may lead to transcriptional deregulation, gene truncations, or gene fusions that encode fusion proteins [13]. Abnormal transcription events can also result in chimeric RNAs, e.g., by RNA polymerase read-through of adjacent genes [14]. Fusion genes have been identified in several epithelial cancers [15], but none have so far been validated as recurrent in independent cohorts of ovarian cancer [13]. In a recent study of 92 serous ovarian carcinomas using DNA and RNA sequencing, gene breakage was found to be a common mechanism for inactivating tumor suppressor genes but no fusion gene was identified as a recurrent, biologically plausible driver in tumorigenesis [10]. Still, for subgroups of patients, gene fusions might play such a role.

In the present study, we looked for fusion gene candidates and aberrantly expressed genes as well as the possible mechanisms behind such expression changes in a series of ovarian carcinomas with cytogenetically identified chromosome 19 changes. We used two different genome-scale approaches: exon-level microarrays and transcriptome sequencing (RNA-seq).

Methods

Patient samples

Ovarian carcinomas from 16 women were included in the study. Three patients had bilateral tumors so that in total 19 carcinoma samples were examined (see Table 1 for clinical information). Histologic classification of the carcinomas identified 18 tumors as serous and one as undifferentiated (immunohistochemistry was negative for

Table 1 Clinical information

Sample | Patient | Age | Dx | Stage | Karyotype: chr19 | RNA-seq | Neo chemo
1 | I | 66 | HGSC | IIIC | No der(19)^b | |
2 | II | 47 | HGSC | IIIB | der(19) | V |
3 | IIIa | 54 | HGSC | IIIC | No der(19)^b | |
4 | IIIb | 54 | HGSC | IIIC | der(19) | V |
5 | IVa | 61 | HGSC | IIIC | Culture failure^b | |
6 | IVb | 61 | HGSC | IIIC | der(19) | V |
7 | V | 52 | HGSC | IIIC | der(19) | V |
8 | VI | 50 | HGSC | IV | der(19) | V |
9 | VII | 66 | HGSC | IV | der(19) | V |
10 | VIII | 59 | HGSC | IV | der(19) | V | yes
11 | IX | 56 | HGSC | IIIC | der(19) | V |
12 | X | 51 | UC | IIC | der(19) | V |
13 | XI | 74 | HGSC | IIIC | der(19) | V | yes
14 | XII | 57 | HGSC | IIIC | 46,XX^b | |
15 | XIIIa | 64 | HGSC | IIC | der(19) | V |
16 | XIIIb | 64 | HGSC | IIC | 46,XX^b | |
17 | XIV | 63 | HGSC | IIIC | der(19) | V |
18 | XV | 56 | SC^a | IIIC | der(19) | | yes
19 | XVI | 77 | HGSC | IIIC | der(19) | V |

Dx diagnosis, HGSC high-grade serous carcinoma, Neo chemo neoadjuvant chemotherapy, UC undifferentiated carcinoma
^a SC = serous carcinoma, cannot be graded due to chemo
^b bilateral tumor has der(19) in the karyotype

WT-1). The tumors were selected based on the presence of cytogenetically visible genomic changes involving chromosome 19; all tumors either had a structurally rearranged chromosome 19 in the karyotype or were the contralateral counterpart of a tumor with such a change. Material from the same patients has previously been examined by karyotyping, fluorescence in situ hybridization (FISH), and comparative genomic hybridization (CGH) analysis at the chromosomal level as well as by microarrays [16–18]. The tumor biobank has been registered according to national legislation and the study has been approved by the Regional Committee for Medical Research Ethics South-East, REK; project numbers S-07194a and 2.2007.425.

Isolation of RNA

Fresh tumor tissue was frozen and stored at −80 °C. Total RNA was extracted using TRIzol reagent (Invitrogen, Carlsbad, CA) and miRNeasy spin columns (Qiagen GmbH, Hilden, Germany). First, tumor tissue was homogenized in TRIzol and the aqueous phase was removed and used further with the Qiagen miRNeasy Mini kit according to the manufacturer’s protocol. Quantitation and quality control of the isolated total RNA were performed using a NanoVue spectrophotometer (GE Healthcare, Little Chalfont, UK) and the Experion automated electrophoresis system (Bio-Rad Laboratories, Hercules, CA, USA). Total RNA degradation was evaluated by reviewing the electropherograms and the RNA quality indicator (RQI).

RNA-seq

High-throughput paired-end RNA-sequencing was performed according to the TruSeq paired-end RNA-sequencing protocols from Illumina for Solexa sequencing on a Genome Analyzer IIx with paired-end module (Illumina Inc., San Diego, CA, USA) as described earlier [19]. Reads of 76 base pairs in length were sequenced. The RNA-seq files gave on average 32 million read pairs per sample (range: 24–42 million). The FASTQC software was used for quality control of the raw sequence data [20]. The software FusionMap [21] (release date 2012-04-16; paired-end utility using default parameters) and the associated pre-built Human B37 and RefGene from the FusionMap website were used for discovery of fusion transcripts [22].

RT-PCR and Sanger sequencing

Putative gene fusions were validated by Reverse Transcriptase-Polymerase Chain Reaction (RT-PCR) followed by Sanger sequencing. Expression of the housekeeping gene ABL1 was used as internal control. cDNA (originally prepared for the microarray analysis, see below) equivalent to 10 ng RNA was amplified using TAKARA Premix Ex Taq (TaKaRa-Bio, Europe/SAS, Saint-Germain-en-Laye, France) with PCR conditions as previously described [23]. Primers are listed in Additional file 1: Table S1. PCR products were analyzed by electrophoresis through 1.0% agarose gel, and products of the correct length were subjected to Sanger sequencing using the BigDye Terminator V1.1 cycle sequencing kit on an ABI 3500 Genetic Analyzer (ThermoFisher Scientific, Waltham, MA, USA). The BLAST and BLAT programs were used for computer analysis of sequence data [24, 25].

Microarray gene expression analysis

Total RNA (100 ng) was used as input for global gene expression analysis at the exon level using Affymetrix GeneChip Human Exon 1.0 ST Arrays (Affymetrix, Santa Clara, CA, USA). Each microarray contained 1.4 million probe sets (most of which comprised four probes), where each probe set corresponded to approximately one known or computationally predicted exon. RNA from each sample was individually amplified, reverse transcribed, fragmented, and labeled. Labeled sense strand cDNA was hybridized onto the arrays for 16–18 h, after which the arrays were washed, stained, and scanned as described earlier [26].

Cell intensity (CEL) files from the tumor samples were background corrected, inter-chip quantile normalized, and summarized at gene level by the robust multi-array average (RMA) approach [27] implemented in the Affymetrix Expression Console 1.1 software. Genes, i.e., transcript clusters annotated with gene symbols, were identified using the HuEx-1_0-st-v2.r2 core library files and the annotation files HuEx-1_0-st-v2.na35.hg19.probeset.csv and HuEx-1_0-st-v2.na35.hg19.transcript.csv, available from the Affymetrix web page [28]. For technical control, three runs of one normal ovarian sample (Cat#HR-406 Human Ovary Total RNA, Zyagen, San Diego, CA, USA) were included.

Identification of gene fusions from exon-microarray data

To identify putative fusion events from the microarray data, we computationally selected genes in which some of the samples showed outlier expression profiles for one of the gene moieties (i.e., either the 5′ or the 3′ end of the transcript), as previously described by Hoff et al. [29]. We particularly focused on fusions corresponding to structural changes of chromosome 19 and/or its partner chromosomes in the rearrangements previously identified by karyotyping [12]. For robustness, filtering procedures excluded breakpoint candidates with only one probe set on either side of the suggested breakpoint. Breakpoints in genes with gene symbols including two gene names (e.g. “DCAF8L2 // LOC101928481”) or “—” were also filtered out. Candidate fusion partner genes

were ranked according to the magnitude of their expression deviation from the median of the set, and the results were compared with the RNA-seq data analysis.

Results

We used two different approaches to identify aberrantly expressed genes possibly involved in fusion events: 1) a combination of karyotypic information and RNA-seq data; 2) a comparison of microarray and RNA-seq data.

RNA-seq fusion candidates

RNA-seq data on 13 ovarian carcinomas were analyzed by the software FusionMap, yielding a list of 2069 unique fusion gene candidates from all the samples combined. The average number of fusion candidates per sample was 274 (range: 209–345).

Three tumors (samples 7, 11, and 17) had structural rearrangements involving chromosomes 11 and 19 in their karyotype, with breakpoint positions in 11q13~q14, 19p13, and 19q13. Five fusion gene candidates were identified by RNA-seq analysis as corresponding to these breakpoints (Additional file 1: Table S2). In sample 7, there was a fusion sequence between Dipeptidyl-Peptidase 9 (DPP9, mapping to 19p13.3) and Protein Phosphatase 6, Regulatory Subunit 3 (PPP6R3, 11q13.2). The BLAT analysis of the DPP9-PPP6R3 fusion transcript sequence showed 100% identity to only these two gene sequences, supporting the presence of a true fusion rearrangement (Additional file 1: Table S3). The remaining 11–19 fusion candidates did not show similar specificity by BLAT.

DPP9

The junction in the DPP9-PPP6R3 fusion was found between exon 11 of DPP9 (accession number NM_139159.4) and exon 18 of PPP6R3 (NM_001164162.1). The presence of the DPP9-PPP6R3 fusion was validated by RT-PCR and Sanger sequencing (Fig. 1). Sanger sequencing showed that the fusion introduced a stop codon directly after the junction (Fig. 1), indicating a truncated DPP9 protein, if any protein at all. Presence of the DPP9-PPP6R3 fusion rearrangement was tested for by RT-PCR in the remaining samples with cDNA of sufficient quality (n = 15) but the hybrid transcript was not expressed by any other tumor.

We then searched the RNA-seq data for other fusion transcripts involving DPP9 and found it recombined with another partner, Perilipin 3 (PLIN3, NM_005817.4, 19p13.3), in sample 8 (Additional file 1: Table S2). The fusion junction detected by RNA-seq was between exon 16 of DPP9 (NM_139159.4) and exon 8 of PLIN3. The

Fig. 1 Experimental validation of DPP9 fusions. a RT-PCR validation of the DPP9-PPP6R3 rearrangement in sample 7. Lanes from left: (1) DNA size marker, (2–3) PCR-reactions, and (4–5) nested-PCR. b RT-PCR with primers of the putative DPP9-PLIN3 rearrangement gave a weak band of the expected length of base pairs (3rd lane, band indicating PCR-product of approximately 300 base pairs). c-d Sanger sequencing. Partial sequence of the DPP9-PPP6R3 rearrangement. The fusion junction is marked by an arrow. The rearrangement introduces a stop codon in the new sequence (“tag”, marked by an asterisk)
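The stop codon noted in Fig. 1 is the kind of feature that can be checked programmatically once a junction sequence is available. The following is a minimal sketch, not the authors' procedure: the function name, the `frame_offset` parameter, and the toy sequences are hypothetical and chosen only for illustration. It scans a fusion transcript in the reading frame of the 5′ gene, starting with the codon that contains the first base contributed by the 3′ partner.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_at_or_after_junction(transcript, junction, frame_offset=0):
    """Return the 0-based start index of the first in-frame stop codon that
    contains or follows the junction, or None if there is none.

    transcript   -- fusion transcript sequence (DNA alphabet, any case)
    junction     -- index of the first base contributed by the 3' partner
    frame_offset -- index of the first base of any codon in the 5' gene's
                    reading frame (assumed known from the annotation)
    """
    seq = transcript.upper()
    # Step back to the start of the codon that contains the junction base,
    # so that a stop codon spanning the junction is not missed.
    start = junction - ((junction - frame_offset) % 3)
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


# Toy example (invented sequence): the 3' partner starts with "TAG",
# i.e., a stop codon directly after the junction.
five_prime_part = "ATGGCCAAG"
three_prime_part = "tagGTACC"
print(first_stop_at_or_after_junction(five_prime_part + three_prime_part,
                                      len(five_prime_part)))  # → 9
```

A stop codon found at or immediately after the junction index is consistent with a truncated protein product, which is how the result in Fig. 1 is interpreted in the text.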

gene expression of DPP9 in this sample showed a decrease in expression downstream of exon 16; however, the decrease in expression was not as marked as in sample 7 (Fig. 2). The expression of PLIN3 was similar in all analyzed tumors, admittedly with the caveat that for this gene, the microarray analysis data was based on only one probe set so we cannot identify possible breakpoints (data not shown). The BLAT analysis of the DPP9-PLIN3 fusion sequence showed 100% identity to only these two gene sequences (Additional file 1: Table S3). RT-PCR of the DPP9-PLIN3 fusion gave a band on the agarose gel corresponding to the expected length (361 base pairs; Fig. 1), but we were not able to get an informative sequence from the product by Sanger sequencing. BLAT analysis of the junction sequence supported the presence of a gene fusion since the DPP9-PLIN3 fusion sequence matched only these two gene sequences. The genomic region where both DPP9 and PLIN3 reside, i.e., cytoband 19p13, was found to be rearranged in the sample with the reported fusion (case 8) inasmuch as the karyotype showed a der(19)add(19)(p13)add(19)(q13).

Exon-level gene expression

Microarray expression analysis provided information on 17,506 genes (i.e., unique transcript clusters annotated with gene symbols) at exon-level and 17,638 genes at gene-level. As expected, hierarchical clustering of the global expression data set at gene-level separated cancer samples from controls (Fig. 3). The undifferentiated carcinoma (cancer sample 12) clustered together with the serous carcinomas.

We used the exon-level microarray data as a screening tool to search for genes in which either the 5′ or 3′ end expression differed from the median of the sample set, i.e., showed “transcript breakpoints”, hypothesizing that transcribed fusion rearrangements would rank high on this list. In the total data set, the algorithm identified 120,099 such breakpoints after filtering procedures. The top 1000 candidates, corresponding to approximately 0.8% of the suggested breakpoints, were selected for further analysis (Table 2: Exon-level expression analysis). A clearly aberrant expression was found for the LIM homeobox 2 gene (LHX2, XM_006717323, 9q33.3) in sample 18 (Fig. 4a), with a breakpoint ranking as the 15th best of all breakpoints. In LHX2, the microarray results showed a higher level of expression from the probe set corresponding to exon 2 (i.e., probe set 3,188,660). This was not among the samples from which RNA-seq data were available (Table 1); however, we searched for involvement of this gene in fusions in the RNA-sequenced samples but found no fusion gene candidates involving LHX2.

The microarray data and the RNA-seq data were combined by comparing the top 1000 exon-level breakpoints to the complete list of RNA-seq fusion candidates. Two genes, the Transporter 2, ATP-Binding Cassette, Sub-Family B (TAP2) and the Collagen, Type IX, Alpha 1 (COL9A1), were found with altered expression in two and three of the samples, respectively. Thus, a total of 32 genes showed a match between presence of altered expression on microarray and involvement as an RNA-seq fusion transcript candidate (Additional file 1: Table S4).
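The screening idea described above (looking for a position within a gene where median-centered expression of the 5′ and 3′ probe sets diverges in a given sample) can be illustrated with a short sketch. This is not the published algorithm of Hoff et al. [29]; the scoring rule (absolute difference between the mean centered expression on either side of a candidate split), the function names, and the toy data are assumptions made for illustration. Only the requirement of at least two probe sets on each side of a breakpoint and the ranking of candidates are taken from the text.

```python
from statistics import mean, median


def median_center(expr):
    """expr maps sample -> list of log2 values, one per probe set (5' to 3').
    Each value is centered on the median of that probe set across samples."""
    samples = list(expr)
    n_probesets = len(expr[samples[0]])
    medians = [median(expr[s][i] for s in samples) for i in range(n_probesets)]
    return {s: [expr[s][i] - medians[i] for i in range(n_probesets)]
            for s in samples}


def best_breakpoint(centered, min_side=2):
    """Return (score, k) for the split with the largest 5'/3' difference,
    where k is the index of the first probe set on the 3' side.
    Splits leaving fewer than min_side probe sets on a side are skipped."""
    best = None
    for k in range(min_side, len(centered) - min_side + 1):
        score = abs(mean(centered[k:]) - mean(centered[:k]))
        if best is None or score > best[0]:
            best = (score, k)
    return best


def rank_breakpoints(genes, top_n=1000):
    """genes maps gene -> {sample -> probe set values}.
    Returns up to top_n tuples (score, gene, sample, k), highest score first."""
    hits = []
    for gene, expr in genes.items():
        for sample, values in median_center(expr).items():
            bp = best_breakpoint(values)
            if bp is not None:
                hits.append((bp[0], gene, sample, bp[1]))
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return hits[:top_n]


# Toy data (invented): sample "s3" loses expression from the fourth probe set on.
toy = {"GENE_A": {"s1": [8.0] * 6, "s2": [8.0] * 6,
                  "s3": [8.0, 8.0, 8.0, 5.0, 5.0, 5.0]}}
print(rank_breakpoints(toy)[0])  # → (3.0, 'GENE_A', 's3', 3)
```

With 120,099 candidate breakpoints, keeping the top 1000 corresponds to 1000 / 120,099 ≈ 0.83% of the list, in line with the approximately 0.8% stated in the text.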

Fig. 2 DPP9 gene expression profile. Sample-wise median-centered exon-level expression of DPP9 (y-axis; log2) is plotted for each of the probe sets (x-axis; sorted according to their genome positions). The DPP9-PPP6R3 fusion rearrangement was identified in sample 7 (red) and the gene expression is reduced correspondingly, downstream of the RNA-seq fusion breakpoint in exon 11 (targeted by probe set 3,846,952). The DPP9-PLIN3 rearrangement was identified in sample 8 (blue), and the expression is reduced downstream of the RNA-seq fusion breakpoint in exon 16 (probe set 3,846,940)

Fig. 3 Hierarchical clustering. Hierarchical clustering of the 22 expression microarray runs (19 tumors and 1 normal control run in triplicate) based on the total data set at gene-level. Normal runs (blue) separate from cancer samples; the undifferentiated carcinoma (orange) clusters together with the serous carcinomas (red)

Among the genes in which changes in the expression level fit with a possible fusion detected by transcriptome sequencing, we found the following:

LYNX1

Ly6/Neurotoxin 1 (LYNX1, NM_177458.1, 8q24.3) was found as the highest ranked gene. In sample 17, its expression increased markedly from probe set 3,157,159 and on (Fig. 4b). The nominated RNA-seq 5′ fusion partner was FCF1 RRNA-Processing Protein (FCF1, 14q24.3, NM_015962; Additional file 2: Figure S1). In two additional instances (samples 1 and 3), the gene expression profiles for LYNX1 were similar (see Fig. 4b); however, these two tumors were not RNA-sequenced. Analyses of RNA-seq data also identified LYNX1 as the 5′ gene in the reciprocal fusion in the same sample, i.e., LYNX1-FCF1. The two fusion junction transcripts from the RNA-seq data both mapped to several genomic locations according to BLAT analysis.

MMP27

A transcript breakpoint in the Matrix Metallopeptidase 27 gene (MMP27, NM_022122, 11q22.2) in sample 12 was also highly ranked, i.e., breakpoint number 80 (Additional file 1: Table S3). The RNA-seq showed a matching fusion with Transmembrane Protein 123 (TMEM123, NM_052932, 11q22.2; Additional file 2: Figure S1). The junction in the TMEM123-MMP27 transcript was found between exons 2 and 7 targeted by the probe sets 3,388,639 and 3,388,742, respectively (Fig. 4c, Additional file 2: Figure S1). In this fusion, the 5′ gene TMEM123 was more strongly expressed than the 3′ gene. The two genes reside only 220 kb apart on the same strand with two other matrix metallopeptidase genes in between (MMP7 and MMP20).

WFDC13

The gene WAP Four-Disulfide Core Domain 13 (WFDC13, NM_172005, 20q13.12) had an RNA-seq suggested fusion after exon 2, matching the point of increase in gene expression (Fig. 4d). The putative fusion partner was the Zinc Finger And BTB Domain Containing 46 (ZBTB46, 20q13.33, NM_025224, Additional file 2: Figure S1), which resides 18 million base pairs (Mbp) downstream, leading to formation of a ZBTB46-WFDC13 transcript.

PRKAR2A

The Protein Kinase, CAMP-Dependent, Regulatory, Type II, Alpha gene (PRKAR2A, NM_004157, 3p21.31) showed altered expression from exon 7 on in sample 9 (Fig. 4e). The RNA-seq data identified the Plexin B1 gene (PLXNB1, NM_002673.5, 3p21.31, Additional file 2: Figure S1), which is located 347 kilobase pairs (kbp) downstream of PRKAR2A, as a putative partner with three different fusion junction sequences (Additional file 1: Table S4), of which two showed 100% identity according to BLAT analysis. The PRKAR2A gene showed altered expression in an additional tumor (sample 14) with increase from exon 9 on. Unfortunately, the latter sample was not RNA-sequenced.
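The collation of exon-level transcript breakpoints with RNA-seq fusion candidates used throughout these results amounts to matching on gene and sample. A minimal sketch is given below; the data structures and the function name are hypothetical, and the input lists are a small excerpt built from events named in the text (the actual comparison covered the top 1000 breakpoints and all 2069 fusion candidates).

```python
def collate(breakpoints, fusions):
    """Match exon-level breakpoints to RNA-seq fusion candidates.

    breakpoints -- iterable of (gene, sample) pairs from the microarray screen
    fusions     -- iterable of (five_prime_gene, three_prime_gene, sample)
    Returns a list of (gene, sample, fusion_name) for every breakpoint gene
    that is also a nominated fusion partner in the same sample.
    """
    fusion_index = {}
    for five, three, sample in fusions:
        name = f"{five}-{three}"
        for gene in (five, three):
            fusion_index.setdefault((gene, sample), []).append(name)

    matches = []
    for gene, sample in breakpoints:
        for name in fusion_index.get((gene, sample), []):
            matches.append((gene, sample, name))
    return matches


# Excerpt of events mentioned in the text (sample numbers as in Table 1).
breakpoints = [("LYNX1", 17), ("MMP27", 12), ("LHX2", 18)]
fusions = [("FCF1", "LYNX1", 17), ("LYNX1", "FCF1", 17),
           ("TMEM123", "MMP27", 12), ("DPP9", "PPP6R3", 7)]

for match in collate(breakpoints, fusions):
    print(match)
# ('LYNX1', 17, 'FCF1-LYNX1')
# ('LYNX1', 17, 'LYNX1-FCF1')
# ('MMP27', 12, 'TMEM123-MMP27')
```

Matching on the (gene, sample) pair rather than on the gene alone reflects the requirement that the breakpoint and the fusion candidate occur in the same tumor; LHX2 in sample 18 yields no match here because that sample was not RNA-sequenced.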

Table 2 Exon-level expression analysis

 | Total gene set | Chr 19 genes
No. breakpoints analyzed | 1000 | 28
Unique genes | 706 | 25
Recurrent genes | 176 | 2
RNA-seq fusion candidates | 86 genes | 4
Exon breakpoint + RNA-seq fusion | 35^a | 1

^a In 35 events, the same gene was among the top 1000 transcript-breakpoints and a nominated RNA-seq fusion gene partner in the same sample. Two genes are listed more than once: TAP2 and COL9A1, which are listed two and three times, respectively. Table S4 lists the RNA-seq fusion gene candidates of the 32 genes

Chromosome 19 sub-analysis

We also performed a separate analysis of genes from chromosome 19 that were among the top ranked breakpoint candidates (Table 2). The highest ranked exon-level breakpoint in a gene from chromosome 19 was found for the Zinc Finger Protein 257 gene (ZNF257) in sample 15. This gene ranked as number 30 in the total data set. Analysis of the RNA-seq data gave no corresponding fusion partner information.

When collating the lists of transcript breakpoints and RNA-seq fusion transcript candidates, the highest ranked chromosome 19 gene was seen to be Family

Fig. 4 Exon-level gene expression profiles with RNA-seq fusion breakpoints. The Y-axes show the median-centered gene expression values which are in log2, X-axes show the probe sets sorted according to their genome positions. a For LHX2, sample 18 (red) has clearly deviating expression from the rest of the samples in the 3′ end. b Expression of LYNX1. For sample 17, RNA-seq analyses nominated both the fusion FCF1-LYNX1 and LYNX1-FCF1. Thus, two RNA-seq fusion breakpoints are indicated. c Expression of MMP27. d Expression of WFDC13. e Expression of PRKAR2A. f Expression of FAM129C

With Sequence Similarity 129, Member C (FAM129C, NM_173544.4, 19p13.11) with breakpoint number 37 in the total data set (Fig. 4f). In sample 8, the RNA-seq analysis showed involvement of this gene in a fusion with the DET1 and DDB1 Associated 1 gene (DDA1, NM_024050, 19p13.11, Additional file 2: Figure S1). DDA1-FAM129C was furthermore among the RNA-seq candidates that were seen to correspond to a karyotypic finding (this sample had an identified breakpoint in 19p13), so for this fusion candidate examination at all three resolution levels led to concordant results.

Discussion

Large cohort studies of serous ovarian carcinomas have not found recurrent gene fusions [4, 10]. However, isolated events in single patients can still be important steps in tumorigenesis and/or progression of individual tumors. We report here a combination of transcriptome analyses of ovarian carcinomas selected on the basis of known presence of structural rearrangements of chromosome 19 in their karyotypes. The DPP9-PPP6R3 fusion rearrangement was identified by RNA-seq and exon-level microarray analyses and validated by RT-PCR in one ovarian

serous carcinoma. The fusion leads to disruption and subsequent deregulation of DPP9 gene expression. DPP9 was found rearranged with PLIN3 in another serous carcinoma, and also in this case the DPP9 expression was disrupted and lowered toward the 3′ end. The lost fragment of the DPP9 transcript includes the code for the functional peptidase and esterase-lipase domains of the DPP9 protein. Previously, Hoogstraat et al. [30] found rearrangement of DPP9 in a high-grade serous ovarian carcinoma by means of whole-genome mate-pair sequencing, a DPP9-PAX2 in-frame rearrangement with the breakpoint after exon 11 in DPP9, i.e., close to the breakpoint identified in the present DPP9-PPP6R3 fusion. No information was reported on DPP9 expression. Despite the presence of three different partner genes (PPP6R3, PLIN3, and PAX2) involved in the DPP9 rearrangements seen until now, it seems that they all lead to loss of the 3′ part of the DPP9 transcript (Additional file 1: Table S2). The effect of the disruption of DPP9 expression was evaluated by BLAT; despite different fusion breakpoint positions, both the DPP9-PPP6R3 and the DPP9-PLIN3 rearrangements would lead to loss of the same functional domains at protein level, namely the peptidase and esterase-lipase domains.

In the TCGA dataset of high-grade serous ovarian carcinomas, 7% of the samples had downregulated gene expression of DPP9 without DNA copy number alteration (i.e., 23 of 316 samples, data available at cBio Cancer Genomics Portal, Memorial Sloan-Kettering Cancer Center (MSKCC), www.cbioportal.org, default settings [4, 31, 32]). This was also the case in our samples where DPP9 was found downregulated without DNA copy number change. Patch et al. [10] showed that, in ovarian carcinomas, gene breakage was a common mechanism for inactivating tumor suppressor genes (RB1, NF1, RAD51B, and PTEN); thus, loss-of-function gene changes could explain the observed DPP9 downregulation.

We have found no studies describing fusions involving DPP9 in the Mitelman database or in the TCGA fusion gene portal [33, 34]; thus, this is the first report showing that DPP9 rearrangements occur in serous ovarian carcinoma.

The DPP9 gene encodes a serine protease that belongs to the DPPIV subfamily and is ubiquitously expressed (www..org, [35]). Proteases may act as tumor suppressors [36], as exemplified by DPP4, which is homologous to DPP9; both encode proteins that harbor the DPPIV-domain as well as hydrolase and peptidase conserved domains. Loss of DPP4 contributes to tumorigenesis in several cancers [36], including ovarian carcinoma [37], whereas forced expression has shown growth inhibitory effect on glioma cells [38]. The DPP9 protein participates in cell signaling and has several tumor suppressing abilities such as inducing apoptosis, suppressing proliferation, and attenuating activation of the oncogene AKT (protein kinase B) [39]. The DPP9 rearrangements resulted in loss of the active sites of DPP9; it is well known that gene fusions may represent loss-of-function events which play a role in carcinogenesis, as reported in colorectal [40] and prostate cancer [41].

LHX2 was among the genes showing differential expression of either the 5′ or 3′ end of the transcript (Fig. 4a). Different gene expression profiles among samples can sometimes be due to expression of alternative transcripts. This does not seem to be the case for LHX2, however, which has several annotated transcript isoforms according to the ENSEMBL data base [42], but none that can explain the observed gene expression breakpoint. The gene encodes a transcriptional activator that has been reported to promote tumor growth and metastasis in breast cancer [43]. Its involvement in three gene fusions (IGH-LHX2, ADAMTS13-LHX2, and AAK1-LHX2) has been described in chronic myelogenous leukemia, breast cancer, and uterine carcinosarcoma, respectively [34, 44].

Three of our samples showed strong expression of the 3′ end of LYNX1. The exon-level gene expression results could possibly be explained by alternative transcripts since the increased expression matches the starting point of two transcript isoforms (transcripts ENST00000521396 and ENST00000317543). No fusions involving LYNX1 have been reported in the literature so far, but the suggested fusion partner FCF1 was involved in a TIMM9-FCF1 fusion in an astrocytoma [34]. Another example of a transcript breakpoint that corresponds to the start of a transcript isoform is provided by PRKAR2A, but in that case a true fusion event seems more likely given the RNA-seq data. A CDC25A-PRKAR2A fusion was previously reported in an ovarian carcinoma [34].

Deregulation of MMP27 was consonant with the findings made by cytogenetics-based genomic analysis, RNA-sequencing, and studies of the exon-level gene expression profile. The RNA-seq listed TMEM123 as the 5′ partner. A fusion of the same two genes has been reported in a breast carcinoma, while in one ovarian carcinoma TMEM123 was found fused with MMP7 [34]. In all these three reports, TMEM123 had the same breakpoint position. Since MMP27 and TMEM123 map closely on the same DNA strand (in 11q22.2), an interstitial deletion could have caused the fusion, and imbalances in this chromosomal region have been seen by both HR-CGH and karyotyping [12]. Upregulation of matrix metalloproteinases is known to play a role in cancer and metastasis [45].

Previously performed karyotypic analyses gave information about the possible chromosomal rearrangements behind several of the identified fusion gene candidates. One example is ZBTB46-WFDC13, where both genes map to chromosome 20 and where the karyotype included the following highly rearranged chromosome:

der(19)(20p13 → 20q13::19p13 → 19q13::8q22 → 8qter) [18]; it is reasonable to assume that additional submicroscopic alterations were also present leading to the gene rearrangement. Interestingly, WFDC13 is closely related to the clinically important gene WFDC2, which is also known as Human Epididymis Protein 4. WFDC2 is known to be overexpressed in ovarian carcinomas and encodes the HE4 protein, one of very few biomarkers used to monitor the disease in ovarian cancer patients [46].

PRKAR2A rearrangements have been reported as an in-frame CDC25A-PRKAR2A fusion in the TCGA ovarian cancer data set [34]. Interestingly, the breakpoint thus identified corresponds to the exon-level breakpoint in our sample 14. This sample was not RNA-sequenced, so we do not know if this rearrangement caused the gene expression profile.

The chromosome 19 sub-analysis highlighted the FAM129C gene, which by RNA-seq was found to participate in the generation of the fusion gene candidate DDA1-FAM129C. Both the genomic findings and gene expression analyses gave results pointing in the same direction. The two genes are located only 211 kbp apart on the same strand (19p13.11); thus, an interstitial deletion could have caused the fusion. Another fusion involving this gene, namely CLTC-FAM129C in breast cancer, was reported before [34]. Little is known about the function of FAM129C or the possible consequences of its upregulated expression.

Some notes of caution are warranted on the limitations of the approach we have used to identify potential fusion gene partners by looking for genes with different expression of their 3′ and 5′ ends. Gene fusions involving the promoter region of one gene and the coding sequence of another will not be detected and would constitute “false negatives”. Furthermore, the algorithm might misevaluate expression of different transcript isoforms from a single gene and nominate it as a fusion gene candidate, i.e., a “false positive”. The algorithm is also sensitive to technical noise in the data. Some of the breakpoints identified did not seem to be due to abnormal transcription but to noise within a limited sample set. By using different methods for gene expression analysis and comparing the results, we have tried to identify gene expression alterations that arise due to fusion events rather than due to expression of different transcript isoforms or alternative splicing.

Conclusions

By combining karyotype information, RNA-seq, and exon-level gene expression microarray analysis, we have identified several fusion gene candidates in high-grade serous ovarian carcinomas. Most of these fusions seem to be isolated events. However, rearrangements of DPP9, leading to decreased expression of its 3′ end, were identified in two cases and might possibly result in loss of tumor suppressor function.

Additional files

Additional file 1: Table S1. RT-PCR primers. Table S2. RNA-seq fusion gene candidates. Table S3. BLAT results of fusion transcripts. Table S4. RNA-seq fusion gene partner candidates among the top 1000 exon-level expression breakpoints. (XLS 86 kb)

Additional file 2: Figure S1. Gene expression of the nominated fusion partners. In general, the expression of the 5′ genes in the samples with the reported fusion genes does not differ from the rest. Median-centered gene expression values are in log2 (y-axes). Probe sets are shown along the x-axis, sorted according to their genome positions. Examples of fusion gene candidates: (A) DPP9-PPP6R3; (B) FCF1-LYNX1; (C) TMEM123-MMP27; (D) ZBTB46-WFDC13; (E) PLXNB1-PRKAR2A; and (F) DDA1-FAM129C.

Abbreviations

CEL: Cell intensity; CGH: Comparative genomic hybridization; FISH: Fluorescence in situ hybridization; RMA: Robust multi-array average; RNA-seq: RNA sequencing; RQI: RNA quality indicator; RT-PCR: Reverse Transcriptase-Polymerase Chain Reaction; TCGA: The Cancer Genome Atlas

Acknowledgments

We thank Lisbeth Haugom and Kaja Chrisitine Graue Berg for technical assistance with the RNA analyses.

Funding

The study was supported by grants from the Norwegian Cancer Society, the Inger and John Fredriksen Foundation for Ovarian Cancer Research, The Radium Hospital Foundation, and the Research Council of Norway through its Centers of Excellence funding scheme, project number 179571. The funding bodies were not involved in the design of the study.

Availability of data and materials

The exon microarray data set is available at the Gene Expression Omnibus (GEO) data repository (http://www.ncbi.nlm.nih.gov/geo/), Series GSE83111.

Authors’ contributions

Study design: FM, SH. Provided samples, clinical and pathological advice: BD, CGT. Molecular analyses, interpretation of data: MLS, AA, BJ, JT, RIS, FM. Writing, revision of manuscript: MLS, AA, BJ, JT, BD, CGT, SH, RIS, FM. All authors have read and approved the final version of the manuscript.

Ethics approval and consent to participate

The tumor biobank has been registered according to national legislation and the study has been approved by the Regional Committee for Medical Research Ethics South-East, Norway (project numbers S-07194a and 2.2007.425). Written informed consent was obtained from the patients at the time of surgery.

Consent for publication

Written informed consent to report patient data was obtained from the patients at the time of surgery.

Competing interests

The authors declare no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Author details

1 Section for Cancer Cytogenetics, Institute for Cancer Genetics and Informatics, The Norwegian Radium Hospital, Oslo University Hospital, Oslo, Norway. 2 Centre for Cancer Biomedicine, University of Oslo, Oslo, Norway. 3 Department of Molecular Oncology, Institute for Cancer Research, The Norwegian Radium Hospital, Oslo University Hospital, Oslo, Norway. 4 Department of Pathology, The Norwegian Radium Hospital, Oslo University Hospital, Oslo, Norway. 5 Faculty of Medicine, University of Oslo, Oslo, Norway. 6 Department of Gynecology, The Norwegian Radium Hospital, Oslo University Hospital, Oslo, Norway.

Received: 9 March 2017 Accepted: 28 August 2017

References

1. Kurman RJ, Hendrick Ellenson L, Ronnett BM, editors. Blaustein's pathology of the female genital tract. 6th ed. New York: Springer-Verlag; 2011.
2. Bowtell DD, Bohm S, Ahmed AA, Aspuria P-J, Bast RC Jr, Beral V, Berek JS, Birrer MJ, Blagden S, Bookman MA, et al. Rethinking ovarian cancer II: reducing mortality from high-grade serous ovarian cancer. Nat Rev Cancer. 2015;15(11):668–79.
3. Ciriello G, Miller ML, Aksoy BA, Senbabaoglu Y, Schultz N, Sander C. Emerging landscape of oncogenic signatures across human cancers. Nat Genet. 2013;45(10):1127–33.
4. Cancer Genome Atlas Research Network. Integrated genomic analyses of ovarian carcinoma. Nature. 2011;474(7353):609–15.
5. Pejovic T, Heim S, Mandahl N, Baldetorp B, Elmfors B, Floderus UM, Furgyik S, Helm G, Himmelmann A, Willen H, et al. Chromosome aberrations in 35 primary ovarian carcinomas. Genes Chromosomes Cancer. 1992;4(1):58–68.
6. Pejovic T, Heim S, Mandahl N, Elmfors B, Floderus UM, Furgyik S, Helm G, Willen H, Mitelman F. Consistent occurrence of a 19p+ marker chromosome and loss of 11p material in ovarian seropapillary cystadenocarcinomas. Genes Chromosomes Cancer. 1989;1(2):167–71.
7. Taetle R, Aickin M, Yang JM, Panda L, Emerson J, Roe D, Adair L, Thompson F, Liu Y, Wisner L, et al. Chromosome abnormalities in ovarian adenocarcinoma: I. Nonrandom chromosome abnormalities from 244 cases. Genes Chromosomes Cancer. 1999;25(3):290–300.
8. Kiechle-Schwarz M, Bauknecht T, Schmidt J, Walz L, Pfleiderer A. Recurrent cytogenetic aberrations in human ovarian carcinomas. Cancer Detect Prev. 1995;19(3):234–43.
9. Heim S, Mitelman F, editors. Cancer Cytogenetics: chromosomal and molecular genetic aberrations of tumor cells. Hoboken: Wiley-Blackwell; 2015.
10. Patch A-M, Christie EL, Etemadmoghadam D, Garsed DW, George J, Fereday S, Nones K, Cowin P, Alsop K, Bailey PJ, et al. Whole genome characterization of chemoresistant ovarian cancer. Nature. 2015;521(7553):489–94.
11. Kiechle-Schwarz M, Bauknecht T, Karck U, Kommoss F, du Bois A, Pfleiderer A. Recurrent cytogenetic aberrations and loss of constitutional heterozygosity in ovarian carcinomas. Gynecol Oncol. 1994;55(2):198–205.
12. Micci F, Weimer J, Haugom L, Skotheim RI, Grunewald R, Abeler VM, Silins I, Lothe RA, Trope CG, Arnold N, et al. Reverse painting of microdissected chromosome 19 markers in ovarian carcinoma identifies a complex rearrangement map. Genes Chromosomes Cancer. 2009;48(2):184–93.
13. Mertens F, Johansson B, Fioretos T, Mitelman F. The emerging complexity of gene fusions in cancer. Nat Rev Cancer. 2015;15(6):371–81.
14. Annala MJ, Parker BC, Zhang W, Nykter M. Fusion genes and their discovery using high throughput sequencing. Cancer Lett. 2013;340(2):192–200.
15. Kumar-Sinha C, Kalyana-Sundaram S, Chinnaiyan A. Landscape of gene fusions in epithelial cancers: seq and ye shall find. Genome Medicine. 2015;7(1):129.
16. Micci F, Haugom L, Abeler VM, Davidson B, Trope CG, Heim S. Genomic profile of ovarian carcinomas. BMC Cancer. 2014;14:315.
17. Micci F, Haugom L, Ahlquist T, Abeler VM, Trope CG, Lothe RA, Heim S. Tumor spreading to the contralateral ovary in bilateral ovarian carcinoma is a late event in clonal evolution. J Oncology. 2010;2010:646340.
18. Micci F, Skotheim RI, Haugom L, Weimer J, Eibak AM, Abeler VM, Trope CG, Arnold N, Lothe RA, Heim S. Array-CGH analysis of microdissected chromosome 19 markers in ovarian carcinoma identifies candidate target genes. Genes Chromosomes Cancer. 2010;49(11):1046–53.
19. Nyquist KB, Panagopoulos I, Thorsen J, Haugom L, Gorunova L, Bjerkehagen B, Fossa A, Guriby M, Nome T, Lothe RA, et al. Whole-transcriptome sequencing identifies novel IRF2BP2-CDX1 fusion gene brought about by translocation t(1;5)(q42;q32) in mesenchymal chondrosarcoma. PLoS One. 2012;7(11):e49705.
20. FASTQ. Babraham Bioinformatics. http://www.bioinformatics.babraham.ac.uk/projects/fastqc. Accessed 8 Apr 2016.
21. Ge H, Liu K, Juan T, Fang F, Newman M, Hoeck W. FusionMap: detecting fusion genes from next-generation sequencing data at base-pair resolution. Bioinformatics. 2011;27(14):1922–8.
22. FusionMap website: http://www.omicsoft.com/fusionmap/ Accessed 8 Apr 2016.
23. Agostini A, Panagopoulos I, Andersen HK, Johannesen LE, Davidson B, Trope CG, Heim S, Micci F. HMGA2 expression pattern and TERT mutations in tumors of the vulva. Oncol Rep. 2015;33(6):2675–80.
24. BLAST. http://blast.ncbi.nlm.nih.gov/Blast.cgi. Accessed 8 Apr 2016.
25. BLAT. http://genome.ucsc.edu/cgi-bin/hgBlat. Accessed 8 Apr 2016.
26. Smebye ML, Sveen A, Haugom L, Davidson B, Trope CG, Lothe RA, Heim S, Skotheim RI, Micci F. Chromosome 19 rearrangements in ovarian carcinomas: zinc finger genes are particularly targeted. Genes Chromosomes Cancer. 2014;53(7):558–67.
27. Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, Speed TP. Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics. 2003;4(2):249–64.
28. Affymetrix. http://www.affymetrix.com/estore/catalog/131452/AFFY/Human+Exon+ST+Array#1_1. Accessed 21 June 2016.
29. Hoff AM, Johannessen B, Alagaratnam S, Zhao S, Nome T, Lovf M, Bakken AC, Hektoen M, Sveen A, Lothe RA, et al. Novel RNA variants in colorectal cancers. Oncotarget. 2015;6(34):36587–602.
30. Hoogstraat M, de Pagter MS, Cirkel GA, van Roosmalen MJ, Harkins TT, Duran K, Kreeftmeijer J, Renkens I, Witteveen PO, Lee CC, et al. Genomic and transcriptomic plasticity in treatment-naive ovarian cancer. Genome Res. 2014;24(2):200–11.
31. Cerami E, Gao J, Dogrusoz U, Gross BE, Sumer SO, Aksoy BA, Jacobsen A, Byrne CJ, Heuer ML, Larsson E, et al. The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer Discovery. 2012;2(5):401–4.
32. Gao J, Aksoy BA, Dogrusoz U, Dresdner G, Gross B, Sumer SO, Sun Y, Jacobsen A, Sinha R, Larsson E, et al. Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci Signal. 2013;6(269):pl1.
33. Mitelman F, Johansson B, Mertens F. Mitelman database of chromosome aberrations and gene fusions in cancer. http://cgap.nci.nih.gov/Chromosomes/Mitelman. Accessed 21 June 2016.
34. Yoshihara K, Wang Q, Torres-Garcia W, Zheng S, Vegesna R, Kim H, Verhaak RG. The landscape and therapeutic relevance of cancer-associated transcript fusions. Oncogene. 2015;34(37):4845–54.
35. Consortium TU. UniProt: a hub for protein information. Nucleic Acids Res. 2015;43(D1):D204–12.
36. Lopez-Otin C, Matrisian LM. Emerging roles of proteases in tumour suppression. Nat Rev Cancer. 2007;7(10):800–8.
37. Kajiyama H, Kikkawa F, Suzuki T, Shibata K, Ino K, Mizutani S. Prolonged survival and decreased invasive activity attributable to dipeptidyl peptidase IV overexpression in ovarian carcinoma. Cancer Res. 2002;62(10):2753–7.
38. Busek P, Stremenova J, Sromova L, Hilser M, Balaziova E, Kosek D, Trylcova J, Strnad H, Krepela E, Sedo A. Dipeptidyl peptidase-IV inhibits glioma cell growth independent of its enzymatic activity. Int J Biochem Cell Biol. 2012;44(5):738–47.
39. Yao TW, Kim WS, Yu DM, Sharbeen G, McCaughan GW, Choi KY, Xia P, Gorrell MD. A novel role of dipeptidyl peptidase 9 in epidermal growth factor signaling. Molecular Cancer Res. 2011;9(7):948–59.
40. Yu J, Wu WKK, Liang Q, Zhang N, He J, Li X, Zhang X, Xu L, Chan MTV, Ng SSM, et al. Disruption of NCOA2 by recurrent fusion with LACTB2 in colorectal cancer. Oncogene. 2016;35(2):187–95.
41. Berger MF, Lawrence MS, Demichelis F, Drier Y, Cibulskis K, Sivachenko AY, Sboner A, Esgueva R, Pflueger D, Sougnez C, et al. The genomic complexity of primary human prostate cancer. Nature. 2011;470(7333):214–20.
42. Cunningham F, Amode MR, Barrell D, Beal K, Billis K, Brent S, Carvalho-Silva D, Clapham P, Coates G, Fitzgerald S, et al. Ensembl 2015. Nucleic Acids Res. 2015;43(Database issue):D662–9.
43. Kuzmanov A, Hopfer U, Marti P, Meyer-Schaller N, Yilmaz M, Christofori G. LIM-homeobox gene 2 promotes tumor growth and metastasis by inducing autocrine and paracrine PDGF-B signaling. Mol Oncol. 2014;8(2):401–16.
44. Nadal N, Chapiro E, Flandrin-Gresta P, Thouvenin S, Vasselon C, Beldjord K, Fenneteau O, Bernard O, Campos L, Nguyen-Khac F. LHX2 deregulation by juxtaposition with the IGH locus in a pediatric case of chronic myeloid leukemia in B-cell lymphoid blast crisis. Leuk Res. 2012;36(9):e195–8.
45. Egeblad M, Werb Z. New functions for the matrix metalloproteinases in cancer progression. Nat Rev Cancer. 2002;2(3):161–74.
46. Drapkin R, von Horsten HH, Lin Y, Mok SC, Crum CP, Welch WR, Hecht JL. Human epididymis protein 4 (HE4) is a secreted glycoprotein that is overexpressed by serous and endometrioid ovarian carcinomas. Cancer Res. 2005;65(6):2162–9.

Paper V

Identification of novel cyclin gene fusion transcripts in endometrioid ovarian carcinomas

Agostini A, Brunetti M, Davidson B, Tropé CG, Heim S, Panagopoulos I, Micci F.

Submitted manuscript (International Journal of Cancer)

Identification of novel cyclin gene fusion transcripts in endometrioid ovarian carcinomas

Antonio Agostini1, Marta Brunetti1, Ben Davidson2,3, Claes G Tropé4, Sverre Heim1,3, Ioannis Panagopoulos1, Francesca Micci1*

1 Section for Cancer Cytogenetics, Institute for Cancer Genetics and Informatics, The Norwegian Radium Hospital, Oslo University Hospital, Oslo, Norway

2 Department of Pathology, The Norwegian Radium Hospital, Oslo University Hospital, Oslo, Norway

3 Faculty of Medicine, University of Oslo, Oslo, Norway

4 Department of Gynecology, The Norwegian Radium Hospital, Oslo University Hospital, Oslo, Norway

* Corresponding author: Francesca Micci, Section for Cancer Cytogenetics, Institute for Cancer Genetics and Informatics, The Norwegian Radium Hospital, Oslo University Hospital, Ullernchausseen 64A, 0310 Oslo, Norway

Abstract

Formation of fusion genes is pathogenetically crucial in many solid tumors. Fusion genes are particularly characteristic of several mesenchymal tumors, but may also be found in epithelial neoplasms. Ovarian carcinomas, too, may harbor fusion genes, but only a few of these have been found to be recurrent, at rates ranging from 0.5 to 5 %. Because most attempts to find specific and recurrent fusion transcripts in ovarian carcinomas have focused exclusively on high-grade serous carcinomas, the other carcinoma subgroups remain largely uninvestigated as far as fusion genes are concerned.

We performed transcriptome sequencing on a series of 34 samples from ovarian tumors that included borderline, clear cell, mucinous, endometrioid, low-grade and high-grade serous carcinomas in search of fusion genes typical of these subtypes. We found a total of 24 novel fusion transcripts. The PCMTD1-CCNL2 fusion transcript, which involves a member of the cyclin family, was recurrent, but only in endometrioid carcinomas (4 of 18 tumors; 22 %). We also found three additional fusion transcripts involving genes belonging to the cyclin family: ANXA5-CCNA2 and PDE4D-CCNB1 were detected in two endometrioid carcinomas, whereas CCNY-NRG4 was identified in a clear cell carcinoma. The recurrent involvement of CCNL2 in four fusions and of three other genes of the cyclin family in three additional transcripts hints that deregulation of cyclin genes is important in the pathogenesis of ovarian carcinomas in general but of endometrioid carcinomas in particular.

Introduction

Malignant epithelial tumors (carcinomas) are the most common ovarian cancers and also the most lethal gynecological malignancies (1). Based on histopathology and genetic profiling, ovarian carcinomas are divided into five main types: high-grade serous (HGSC) (representing 70 % of the malignancies), endometrioid (EC) (10 %), clear cell (10 %), mucinous (3 %), and low-grade serous carcinomas (LGSC) (<5 %); together they account for over 95 % of ovarian malignant tumors (2). These histotypes differ in their precursor lesion(s), oncogenesis, response to chemotherapy, and prognosis (3). Serous high-grade carcinomas harbor TP53 and BRCA mutations, whereas their low-grade counterparts often carry KRAS and BRAF mutations. KRAS and HER2 mutations are frequent in mucinous carcinomas whereas in endometrioid and clear cell carcinomas, ARID1A is frequently mutated (3).

Several studies have focused on the identification of fusion genes in ovarian carcinomas. Although more than 700 samples have been analyzed so far, only a few recurrent transcripts were found, and always with a low rate of recurrence (0.5-5 %). Two studies used the genomic data produced by the Cancer Genome Atlas project (4, 5) to find that three fusion transcripts were recurrent in HGSC: CCDC6-ANK3 (found in 4 samples or 1 % of the tumors), and COL14A1-DEPTOR and KAT6B-ADK (each found in 2 samples or 0.5 %). Patch et al. (6) analyzed 114 samples from chemoresistant HGSC and found promoter swapping affecting the SLC25A40-ABCB1 transcript in six samples (5 %). Earp et al. (7) found CRHR1-KANSL1 to be the most frequent fusion transcript in their series (2.7 % of all tumors). Our group recently reported the involvement of DPP9 in two out of 18 samples of HGSC karyotypically characterized by rearrangements of chromosome 19 (8). Taken together, these results suggest that ovarian cancer is not characterized by highly recurrent fusion transcripts. It should be taken into account, however, that the majority of the studies referred to above focused exclusively on high-grade serous carcinomas, meaning that the other histotypes have not yet been extensively analyzed. We therefore screened a series of 34 tumors representing the whole spectrum of ovarian malignant epithelial tumors in order to look for new recurrent fusion transcripts arising in non-HGSC tumors.

Material and Methods

Tumor material

The material consisted of fresh frozen samples from ovarian tumors surgically removed at The Norwegian Radium Hospital between 1999 and 2010. Samples from 34 ovarian carcinomas (including borderline tumors) were sequenced (two borderline, two low-grade serous, three mucinous, four clear cell, nine EC, and 14 HGSC). A second cohort of 113 samples was subsequently used to validate the results and to test how frequent the novel fusion transcripts were. The latter series consisted of 10 fibromas, 10 thecofibromas, 10 borderline

epithelial tumors, and 83 carcinomas of which 35 were HGSC, 16 mucinous, 18 EC, 10 clear cell, and four low-grade serous. The study was approved by the regional ethics committee (Regional komité for medisinsk forskningsetikk Sør-Øst, Norge, https://helseforskning.etikkom.no/) and written informed consent was obtained from the patients.

RNA extraction

Total RNA was extracted using miRNeasy Kit (Qiagen, Hilden, Germany) and QIAcube (Qiagen). The concentration and purity of the RNA were measured with the Nanovue Spectrophotometer (GE Healthcare, Pittsburgh, PA, USA). The RNA quality of the 34 samples sequenced was checked with the Experion Automated Electrophoresis System using the RNA StdSens analysis kit (Bio-Rad Laboratories, Oslo, Norway).

High-throughput paired-end RNA-sequencing and bioinformatics analyses

Three μg of total RNA were sent for high-throughput paired-end RNA-sequencing at the Norwegian Sequencing Center, Ullevål University Hospital (https://www.sequencing.uio.no/) as described previously (9). The software used for detection of fusion transcripts included Fusioncatcher v0.99.4e (https://github.com/ndaniel/fusioncatcher) (10), Chimerascan v0.4.5 (https://github.com/genome/chimerascan-vrl) (11), FusionMap 31.03.15 (https://omictools.com/fusionmap-tool) (12), and TopHat 2.0.9 (https://ccb.jhu.edu/software/tophat/index.shtml) (13). The candidate fusion transcripts obtained by bioinformatic analysis were checked using BLAST (http://blast.ncbi.nlm.nih.gov/Blast.cgi) and BLAT (https://genome.ucsc.edu/cgi-bin/hgBlat?command=start).

Reverse transcriptase-polymerase chain reaction (RT-PCR) and Sanger sequencing

One μg of RNA was reverse transcribed using iScript Advance cDNA synthesis kit (Biorad). In order to validate the fusion genes detected as part of the bioinformatic analyses, cDNA equivalent to 10 ng RNA was amplified using the TAKARA Premix Ex Taq (TaKaRa-Bio, Europe/SAS, Saint-Germain-en-Laye, France). The primers are listed in Supplementary Material I. The PCR cycling program for all reactions was as follows: 30 sec at 94 °C followed by 35 cycles of 7 sec at 98 °C, 30 sec at 55 °C, 60 sec at 72 °C, and a final extension for 2 min at 72 °C. Expression of the housekeeping gene ABL1 was monitored as cDNA quality control. We also tested our series of tumors for presence of the fusion gene CDKN2D-WDFY2. The primers and PCR conditions were as reported (14). Three μl of the PCR products were stained with GelRed (Biotium, Hayward, CA, USA) and analyzed by electrophoresis through 1.0 % agarose gel. The gel was scanned with G-Box (Syngene, Los Altos, CA, USA) and the images were acquired using GeneSnap (Syngene). The remaining 22 μl of the amplified fragments were purified using the QIAquick PCR purification Kit (Qiagen). Direct sequencing

was performed using the light run sequencing service of GATC Biotech (http://www.gatc-biotech.com/en/sanger-services/lightrun-sequencing.html) or the ABI3500 Genetic Analyzer (ThermoFisher Scientific, Waltham, MA, USA) using BigDye Terminator V1.1 cycle sequencing kit. The BLAST and BLAT programs were used for computer analysis of sequence data.

Results

RNA-sequencing gave informative results for all 34 samples. The subsequent bioinformatic analysis was also informative, giving a mean of four fusion transcripts per tumor. The list of fusion candidates was shortened by checking every transcript with the BLAST and BLAT programs. All fusion sequences that did not involve the coding regions of both genes (3′UTR-coding DNA sequence (CDS), intronic-CDS, and/or intronic-intronic) were discarded, as were sequences identified as read-throughs. We focused on transcripts involving genes known to be relevant in cancer and transcripts that were identified by more than one program, the only exception being PCMTD1-CCNL2 which was identified only by TopHat. Using these criteria, we came up with a list of 42 candidate fusion transcripts (Table I) present in 11 samples out of 34 sequenced. More specifically, we found fusions in seven of the 14 HGSC, three of the nine EC, and one of the four clear cell carcinomas analyzed.

Twenty-two out of the 39 candidate fusion genes could be validated by PCR and Sanger (direct) sequencing (Table I). The uniqueness of these fusion transcripts was checked against the Mitelman Database for Chromosome Aberrations and Gene Fusions in Cancer (https://cgap.nci.nih.gov/Chromosomes/Mitelman). Only the fusion transcripts KDM5A-NINJ2 and NSD1-ZNF346 had been identified previously, by Yoshihara et al. (5) in HGSC. All other fusion genes were novel.

We found recurrent involvement of genes belonging to the cyclin family in endometrioid carcinomas in the form of fusion transcripts PCMTD1-CCNL2, ANXA5-CCNA2, and PDE4D-CCNB1 (Figure I). Furthermore, CCNY-NRG4, another fusion involving a cyclin gene, was found in a tumor showing mixed endometrioid/clear cell histotype in its primary location (uterus) but only a clear cell pattern in the ovarian recurrence (see below). In addition, we found two transcripts involving the neuregulin 4 (NRG4) gene, the already mentioned CCNY-NRG4 and TSPAN3-NRG4; the latter was found in an HGSC. In both cases, the fusion involved exon 4 of NRG4 (Figure II). The bioinformatic analyses identified an additional fusion transcript involving exon 4 of NRG4 and exon 7 of the Tripartite Motif Containing 68 gene (TRIM68) in another sample of HGSC; however, we could not confirm the presence of this transcript by means of PCR and sequencing analysis.

Since all these fusion transcripts were non-recurrent in the original series of 34 tumors subjected to NGS, we also tested a larger cohort of 113 ovarian tumors for their possible presence. PCMTD1-CCNL2 was

Table 1. List of the candidate fusion transcripts.

| Sample | Diagnosis | Candidate fusion genes a | Location (5′ gene; 3′ gene) | Breakpoint (exons) | Fusion outcome |
|---|---|---|---|---|---|
| | | CCNY-NRG4 | 10p11.21; 15q24.2 | 1; 4 | in-frame |
| I | CCC b | MRPL21-TADA2A | 11q13.3; 17q12 | 6; 7 | in-frame |
| | | MICALL1-GGA1 | 22q13.3; 22q13 | 13; 14 | in-frame |
| II | E | PCMTD1-CCNL2 | 8q11.23; 1p36.33 | 3; 6 | in-frame |
| | | ANXA5-CCNA2 | 4q27; 4q27 | 3; 3 | in-frame |
| | | PAK1-GYLTL1B | 11q13.5; 11p11.2 | 14; 15 | in-frame |
| III | E | AP1M2-HIP1 | 19p13.2; 7q11.23 | 10; 31 | out-of-frame |
| | | CTBP2-DENND3 | 10q26.13; 8q24.3 | 1; 6 | in-frame |
| | | NAIP-OCLN | 5q13.2; 5q13.2 | 2; 5 | in-frame |
| | | PDE4D-CCNB1 | 5q12.1; 5q13.2 | 1; 2 | in-frame |
| IV | E | MELK-TMEM88 | 9p13.2; 9p13.3 | 10; 11 | in-frame |
| | | RGS10-ZMYM1 | 10q26.11; 1p34.3 | 3; 3 | in-frame |
| V | E | PCMTD1-CCNL2 | 8q11.23; 1p36.33 | 3; 6 | in-frame |
| VI | E | PCMTD1-CCNL2 | 8q11.23; 1p36.33 | 3; 6 | in-frame |
| VII | E | PCMTD1-CCNL2 | 8q11.23; 1p36.33 | 3; 6 | in-frame |
| | | SCNN1A-CHD4 | 12p13.31; 12p13.31 | 8; 26 | in-frame |
| VIII | HGSC | TSPAN3-NRG4 | 15q24.3; 15q24.2 | 6; 4 | in-frame |
| IX | HGSC | TRIM68-NRG4 | 11p15.4; 15q24.2 | 7; 4 | in-frame |
| | | NCAPG2-RBPMS | 7q36.3; 8p12 | 15; 2 | in-frame |
| | | MBD2-PERP | 18q21; 6q23 | 2; 2 | out-of-frame |
| X | HGSC | DHX30-ABHD14B | 3p21; 3p21.2 | 6; 5 | in-frame |
| | | SNTB1-ZNF250 | 8q24.1; 8q24.3 | 1; 6 | in-frame |
| | | MAP3K10-C19orf47 | 19q13.2; 19q13.2 | 5; 7 | in-frame |
| | | FUT8-FNTB | 14q23.3; 14q23.3 | 2; 6 | in-frame |
| | | AP2B1-ZNF512 | 17q12; 2p23.3 | 2; 13 | in-frame |
| XI | HGSC | FARP2-PPP1R7 | 2q37.3; 2q37.3 | 8; 2 | in-frame |
| | | FAM160B1-NHLRC2 | 10q25.3; 10q25.3 | 5; 6 | out-of-frame |
| | | CTIF-MOB2 | 18q21.1; 11p15.5 | 7; 4 | in-frame |
| | | MGEA5-KCNIP2 | 10q24.32; 10q24.32 | 7; 4 | in-frame |
| XII | HGSC | FAM20C-SUGCT | 7p22.3; 7p14.1 | 3; 14 | out-of-frame |
| | | ARHGAP35-UNC13A | 19q13.32; 19p13.11 | 4; 40 | in-frame |
| | | FGFR2-FAM24B | 10q26.13; 10q26.13 | 9; 2 | in-frame |
| XIII | HGSC | KDM5A-NINJ2 | 12p13.3; 12p13.3 | 24; 2 | in-frame |
| | | PDZD8-ABLIM1 | 10q26.11; 10q25.3 | 2; 2 | in-frame |
| | | HDAC7-VDR | 12q13.11; 12q13.11 | 16; 3 | out-of-frame |
| | | VRK1-TDP1 | 14q32.2; 14q32.11 | 3; 15 | in-frame |
| | | NSD1-ZNF346 | 5q35.3; 5q35.2 | 3; 6 | in-frame |
| | | NFIX-RAD23A | 19p13.3; 19p13.3 | 2; 1 | in-frame |
| XIV | HGSC | PRKD1-CNIH1 | 14q12; 14q22.2 | 1; 2 | in-frame |
| | | TMEM123-MMP27 | 11q22.2; 11q22.2 | 2; 7 | in-frame |
| | | KLC1-ZFAT | 14q32.33; 8q24.22 | 1; 8 | in-frame |

a The fusion transcripts that were validated with RT-PCR and Sanger sequencing are written in bold.
b Clear cell carcinoma
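The selection criteria described in the Results (discard junctions that do not join the coding regions of both genes, discard read-throughs, then keep candidates that involve a cancer-relevant gene or were reported by more than one program) can be sketched as a small filter. This is only an illustrative sketch, not the procedure actually run in the study: the junction labels, the `cancer_relevant` flag, the "or" combination of the last two criteria, and all `TOY` candidates are assumptions made for the example. The only detail taken from the text is that PCMTD1-CCNL2 was reported by TopHat alone and was nevertheless retained.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str               # fusion name, e.g. "PCMTD1-CCNL2"
    junction: str           # hypothetical labels: "CDS-CDS", "3UTR-CDS", "intronic-CDS", ...
    read_through: bool      # True if flagged as a read-through transcript
    callers: frozenset      # programs that reported the candidate
    cancer_relevant: bool   # assumed manual annotation of the partner genes


def keep(c: Candidate, min_callers: int = 2) -> bool:
    """Return True if the candidate passes the (assumed) selection criteria."""
    # Both breakpoints must lie in coding sequence, and read-throughs are dropped.
    if c.junction != "CDS-CDS" or c.read_through:
        return False
    # Assumption: the two remaining criteria are combined with a logical "or".
    return c.cancer_relevant or len(c.callers) >= min_callers


# Toy input; only the PCMTD1-CCNL2 caller set reflects the text.
candidates = [
    Candidate("PCMTD1-CCNL2", "CDS-CDS", False, frozenset({"TopHat"}), True),
    Candidate("TOY1-TOY2", "3UTR-CDS", False, frozenset({"A", "B", "C"}), True),
    Candidate("TOY3-TOY4", "CDS-CDS", True, frozenset({"A", "B"}), False),
    Candidate("TOY5-TOY6", "CDS-CDS", False, frozenset({"A", "B"}), False),
    Candidate("TOY7-TOY8", "CDS-CDS", False, frozenset({"A"}), False),
]

kept = [c.name for c in candidates if keep(c)]
print(kept)
```

With a stricter reading of the criteria (both conditions required), the final `or` would become `and`, and a single-caller candidate such as PCMTD1-CCNL2 would need to be handled as an explicit exception.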

Figure I. Schematic illustration of the putative chimeric proteins resulting from the detected fusions of cyclin genes. A) Illustration of CCNL2 protein and the putative chimeric protein resulting from the PCMTD1-CCNL2 fusion gene with a chromatogram showing the fusion junction identified by Sanger sequencing. B) Wild-type CCNA2 and putative fusion protein translated from ANXA5-CCNA2 with a chromatogram showing the fusion junction. C) CCNB1 protein illustration and the putative chimeric protein encoded by the fusion gene PDE4D-CCNB1 with a chromatogram of the fusion junction.
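Whether a functional domain survives in a chimeric protein follows from simple interval arithmetic on the residue ranges reported in the text. The sketch below uses the ranges given for Cyclin A2 (residues 158-432 retained; cyclin domains at 181-307 and 309-427) and Cyclin L2 (residues 253-526 retained; cyclin domains at 76-150 and 192-281). Treating all ranges as inclusive 1-based coordinates is an assumption, so exact residue counts may differ slightly from those quoted in the manuscript; the function name is hypothetical.

```python
def retained_fraction(domain, kept):
    """Fraction of a domain (inclusive residue range) lying inside the kept range."""
    d_start, d_end = domain
    k_start, k_end = kept
    overlap = max(0, min(d_end, k_end) - max(d_start, k_start) + 1)
    return overlap / (d_end - d_start + 1)


# Residue ranges as reported in the text (inclusive coordinates assumed).
CCNA2_KEPT = (158, 432)
CCNA2_DOMAINS = [(181, 307), (309, 427)]
CCNL2_KEPT = (253, 526)
CCNL2_DOMAINS = [(76, 150), (192, 281)]

ccna2 = [retained_fraction(d, CCNA2_KEPT) for d in CCNA2_DOMAINS]
ccnl2 = [retained_fraction(d, CCNL2_KEPT) for d in CCNL2_DOMAINS]

print("Cyclin A2 domains retained:", ccna2)
print("Cyclin L2 domains retained:", ccnl2)
```

Under these assumptions both cyclin domains of Cyclin A2 are fully retained in ANXA5-CCNA2, whereas the PCMTD1-CCNL2 product loses the first cyclin domain of Cyclin L2 entirely and keeps only a minor part of the second, in line with the interpretation given in the Discussion.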

Figure II. Schematic illustration of the putative chimeric NRG4-encoded proteins. A) Illustration of the Neuregulin 4 (NRG4) protein. B) Putative chimeric NRG4 resulting from the CCNY-NRG4 fusion and chromatogram showing the fusion junction identified by Sanger sequencing. C) Putative chimeric NRG4 resulting from the TSPAN3-NRG4 fusion and chromatogram showing the fusion junction.

found in three additional cases of EC (thus it was present in four out of 18 carcinomas of the endometrioid histotype in total; 22 %). The PCMTD1-CCNL2 in-frame fusion juxtaposes exon 3 of the Protein-L-Isoaspartate (D-Aspartate) O-Methyltransferase Domain Containing 1 gene (PCMTD1; accession number: NM_052937.3) from 8q11.23 with exon 6 of the Cyclin L2 gene (CCNL2; NR_135154.1) from 1p36.33 (Figure I). The putative chimeric transcript is 3209 bp long and consists of a sequence of 762 bp from PCMTD1 fused with a sequence of 2447 bp from CCNL2. It codes for a chimeric protein of 409 aa containing the first 136 aa (1-136 out of 357) of PCMTD1 (NP_443169) and 273 aa (253-526 out of 526) of Cyclin L2 (NP_112199.2).

No other fusion was found to be recurrent. We also tested the cohort for presence of the CDKN2D-WDFY2 transcript that was previously reported with a recurrence rate of 20 % in HGSC (14), finding no such fusion.

Discussion

Studies over the past decades have uncovered the oncogenic role of fusion genes in hematological malignancies and mesenchymal tumors, and have highlighted the diagnostic and therapeutic advantages provided by the detection of these chimeric transcripts and their tumor-specific expression (15). A similar search for fusion transcripts in ovarian cancer has shown that they are not common. Though the fusions ESRRA-C11orf20 (16), CDKN2D-WDFY2 (14), and BCAM-AKT2 (17) were initially described as recurrent in HGSC at rates of 15 %, 20 %, and 7 %, respectively, the findings have not been validated by other groups or in different series (18). More than 700 samples were screened in other studies and fusion transcripts were found at frequencies ranging from 0.5 % (KAT6B-ADK) to 2.7 % (CRHR1-KANSL1) (4-7). It is worth noting in this context that most studies focused on HGSC whereas the other four types of ovarian carcinoma were less often investigated; only 63 endometrioid carcinomas, 36 clear cell carcinomas, and six mucinous carcinomas had been analyzed for fusion genes prior to the present study (7, 16). In a recent study, Earp et al. (7) identified UBAP1-TGMT in clear cell carcinomas exclusively, finding it in two out of 20 tumors of this histotype.

We identified PCMTD1-CCNL2 as a novel and recurrent fusion in endometrioid carcinomas, finding the transcript in four out of 18 (22 %) EC. The CCNL2 gene encodes three cyclin L2 isoforms (19). The main isoform (Cyclin L2α) contains two cyclin domains, spanning amino acids 76–150 and 192–281, and a C-terminal RS site (arginine-serine dipeptide) (385-423) that plays a role in protein-protein interactions with the SR family of splicing factors (20). The three splicing variants L2βA1/2/3 have exon 6 as the last coding exon and code for the 226 aa Cyclin L2βA isoform. The variant L2βB terminates in exon 7 and codes for the 236 aa Cyclin L2βB. Cyclin L2, unlike most other cyclins, is expressed during the entire cell

cycle, and was detected in many tissues, ovary included (Table II). Cyclin L2 participates together with Cyclin L1 and CDK11 in pre-mRNA splicing processes (19, 20). Lack of a functional Cyclin L2 may impair normal splicing mechanisms as all three Cyclin L2 isoforms have been shown to be fundamental components of the splicing complex (19).

The role of Cyclin L2 in cancer has not been investigated extensively; however, Li et al. [16] showed that it acts as a tumor suppressor protein in gastric cancer, enhancing both apoptosis and chemosensitivity. Yang et al. [15] showed a similar tumor suppressor activity of Cyclin L2 in hepatocarcinoma and that both cyclin domains were fundamental for the protein's proper functioning (20). The fusion PCMTD1-CCNL2 leads to a chimeric Cyclin L2α lacking the first cyclin domain (76-150) and containing only 27 aa of the second cyclin domain (Figure I). Due to the fusion, the chimeric protein is no longer able to bind CDK11 and suppress tumor growth. The fusion also leads to loss of the other two isoforms since the splicing variants coding for isoforms L2βA/B are lost.

Chromosomal band 1p36, which harbors the CCNL2 locus, was identified as a hotspot for genomic imbalances in EC (21). In particular, the band was found deleted by HR-CGH in 75 % of the EC analyzed but not in other types of ovarian carcinoma (21). These data indicate that not only fusions, but also other genomic events, including imbalanced ones, may target CCNL2 in EC, and that loss of this tumor suppressor gene may be an important event in the pathogenesis of this subtype of ovarian carcinoma.

Besides the aforementioned fusion, we also found two other fusion transcripts, ANXA5-CCNA2 and PDE4D-CCNB1, involving other cyclin genes in two different samples of EC. The in-frame fusion ANXA5-CCNA2 juxtaposes exon 3 of the Annexin A5 gene (ANXA5; NM_001154.3, from 4q27) with exon 3 of the gene coding for Cyclin A2 (CCNA2; NM_001237.3, mapping in the same genomic region only 160 kb more distal). The fusion results in a 2250 bp (259 bp from ANXA5 and 1991 bp from CCNA2) transcript which codes for a functional chimeric cyclin composed of 31 aa (1-31 out of 320) of ANXA5 (AAH01429.1) and 274 aa (158-432 out of 432) of Cyclin A2 (AAI04784.1) (Figure I). The functional cyclin domains of Cyclin A2 are located in regions 181-307 and 309-427 (Figure I) and are therefore conserved in the chimeric cyclin encoded by ANXA5-CCNA2. The fusion gene PDE4D-CCNB1 brings together exon 1 of the Phosphodiesterase 4D gene (PDE4D; NM_001197223.1) from 5q11.2 and exon 2 of the gene coding for Cyclin B1 (CCNB1, NM_031966.3) from 5q13.2. The fusion results in a chimeric transcript of 1886 bp (627 bp from PDE4D and 1886 bp from CCNB1) which codes for a putative protein of 584 aa containing 151 aa (1-151 out of 809) of Phosphodiesterase 4D (NP_001098101) and the whole Cyclin B1 (433 aa) (AAP88038) (Figure I). The consequences of these two

fusions should be similar despite the fact that they affect different genes.

Cyclin genes act in concert (22). The expression levels of Cyclin A2 are tightly synchronized with cell cycle progression. CCNA2 transcription begins in late G1, peaks and plateaus in mid-S, and then declines in G2 (23). The transcription is mostly regulated by the transcription factor E2F that de-represses the promoter (24). Cyclin B1 appears in S phase and accumulates in G2 and mitosis before disappearing at the transition from metaphase to anaphase. Synthesis of Cyclin B1 during the cell cycle is mainly regulated at the transcriptional level by p53 (25) and is enhanced by the P300 coactivator (26), USF1 (27), and C-Myc (28). The fusions ANXA5-CCNA2 and PDE4D-CCNB1 bring the cyclins under the control of the promoters of their 5′ partners, which are normally expressed in ovarian cells (Table II). This promoter swapping overcomes the normal regulation of the cyclin genes, resulting in deregulation, i.e., overexpression/permanent expression of chimeric Cyclins A2 and B1. This may profoundly affect cell cycle regulation since these two cyclins cooperate in both early and late mitosis (29).

The CCNY-NRG4 fusion transcript was found in a clear cell ovarian carcinoma. In this transcript, exon 4 of NRG4 is fused with exon 1 of CCNY, i.e., another cyclin gene. The fusion causes the complete loss, at the genomic level, of the entire cyclin gene (CCNY; NM_181698.3), replacing it by NRG4 (exons 4-6; NM_138573.3). Despite the fact that the metastatic tumor sample examined showed clear cell histology, it is interesting to note that its primary in the uterus had mixed morphology with clear cell as well as endometrioid features (Figure III). Presence of the latter phenotype again seems consistent with involvement of cyclin genes in fusion transcripts in tumors showing endometrioid features.

The NRG4 gene was found rearranged also with another partner, Tetraspanin-3 (TSPAN3). The transcript consisted of exon 6 from TSPAN3 fused with exon 4 from NRG4 (similar to CCNY-NRG4). The gene NRG4 is located on chromosomal band 15q24 and codes for Neuregulin 4 (CAL35829.1), a ligand of the EGF receptor family. Whereas the Neuregulin 4 gene is not expressed at high levels in the normal ovary (13 RPKM), CCNY (70 RPKM) and TSPAN3 (143 RPKM) are consistently transcribed (Table II). Fusion between these genes results in an increased level of the chimeric NRG4. The NRG4 gene codes for a 115 aa protein which contains two main functional domains: an extracellular EGF-like domain (17-46 aa) and a transmembrane domain (63-83 aa) (Figure II). The EGF-like domain is fundamental in the activation of EGF family receptors HER4 and HER3 (30). In the CCNY-NRG4 and TSPAN3-NRG4 fusion transcripts, the EGF-like domain of NRG4 is partially lost as only 10 out of 29 aa are conserved in the chimeric protein while the transmembrane domain is conserved (Figure II). These findings suggest that NRG4 is recurrently found involved in fusion

Table 2. Overview of the expression status, at RNA and protein level, for the genes found involved in fusion events

Gene      RNA expression (RPKM),        Protein expression       Protein expression
          Illumina BodyMap 2.0 (a)      (normal tissue) (b)      (cancer samples) (c)
PCMTD1    48                            Low                      Medium
CCNL2     59                            Medium                   Medium
ANXA5     156                           Medium                   Low
CCNA2     10                            Not-detected             Low
PDE4D     21                            Medium                   Medium
CCNB1     15                            Not-detected             Medium/Low
CCNY      70                            Low                      Medium/Low
TSPAN3    143                           Not-detected             Medium/Low
NRG4      13                            Not-detected             Low

(a) Ovarian tissue RNA expression from the Illumina Human BodyMap 2.0 dataset in reads per kilobase per million mapped reads (RPKM), assessed on 70 samples of normal ovarian tissue. Source: GeneCards (http://www.genecards.org)
(b) Protein expression in normal ovarian tissue assessed by immunohistochemistry. Source: the Human Protein Atlas (http://www.proteinatlas.org)
(c) Protein expression in 12 samples of ovarian cancer assessed by immunohistochemistry. Source: the Human Protein Atlas
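
As an illustration of the unit used for the RNA expression values in Table 2, RPKM normalizes a gene's read count by transcript length (in kilobases) and by sequencing depth (in millions of mapped reads). The following Python sketch shows the calculation; the function name and the input numbers are hypothetical and are not taken from the BodyMap 2.0 dataset.

```python
def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = reads * 1e9 / (gene length in bp * total mapped reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical example: 520 reads on a 2 kb transcript in a library
# of 20 million mapped reads.
value = rpkm(read_count=520, gene_length_bp=2000, total_mapped_reads=20_000_000)
print(f"RPKM = {value:.1f}")
```

Because the measure is normalized for both length and depth, values such as those in Table 2 can be compared between genes of different sizes within the same dataset.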



Figure III. Histological appearance of the tumor of case I. Hematoxylin-eosin staining at x50 (A) and x200 (B) magnification and Napsin A immunostaining (C) of the primary uterine carcinoma with a mixed clear cell-endometrioid morphology. Hematoxylin-eosin staining at x50 (D) and x200 (E) magnification and Napsin A immunostaining (F) of the secondary clear cell carcinoma in the ovary.

transcripts as it was rearranged in 8.8 % of the samples analyzed by RNA-sequencing (3 out of 34); the gene is also promiscuous, as it is fused with three different partners.

The role of cyclins in ovarian carcinomas has been studied by several groups. Cyclin E was found deregulated due to gene amplification in 20 % of HGSC analyzed by the TCGA project (4) but also in other tumor types such as endometrioid (31) and clear cell ovarian carcinomas (32). Other cyclins, too, were found deregulated and involved in the pathogenesis of HGSC, such as Cyclin D1 (33), F (34), B1 (35), and Y (36). However, this is the first time that the fusion transcript PCMTD1-CCNL2 is found recurrently in EC, in 22 % of the examined tumors.

All other fusion transcripts that we validated using PCR were non-recurrent in our series. However, taking into account information from the Mitelman database (http://cgap.nci.nih.gov/Chromosomes/Mitelman), we found that some of the fusion genes identified in this study have indeed been previously reported, both in ovarian cancer and other tumors (Table III), albeit in some cases with a different partner. The fusions KDM5A-NINJ2 and NSD1-ZNF346 were previously identified in HGSC, as was seen in our case. The genes NINJ2, PAK1, PCMTD1, SNTB1, and ZNF512 were also found fused with different partners in HGSC (5). Taking all these results together, we see that all the mentioned genes were recurrently rearranged in ovarian cancer, admittedly at very low frequencies. Furthermore, 21 of the genes we found involved in fusions were previously reported in 35 different fusion transcripts in studies of breast cancer and in 14 fusion transcripts in lung cancer (5, 37, 38) (Table III), hinting that they may be generally relevant in carcinogenesis of different types.

Since we did not find any sample carrying the CDKN2D-WDFY2 fusion and no evidence for the presence of ESRRA-C11orf20 and/or BCAM-AKT2 in our series, we conclude that the frequency of these fusions must be much lower than was initially reported (14, 16, 17). Indeed, it seems evident that the occurrence of pathogenetically essential fusion events in ovarian carcinomas is well below what is the case in hematological malignancies and/or mesenchymal tumors (39). This implies that fusion genes are not the main mechanism in the pathogenesis of ovarian carcinomas; perhaps they even represent secondary events. On the other hand, the fusion genes found in this and other studies may also be a reflection of massive pathogenetic heterogeneity in ovarian cancer, thus identifying tumor subsets within, but on other occasions transcending, the accepted phenotypic subgroups of ovarian cancer. One may hope that further elucidation of this tumorigenic variability will contribute to a more meaningful classification of these malignancies and eventually to the finding of medicines directed at the molecular genetic changes that are central to the disease process.
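
To give a sense of the statistical uncertainty attached to recurrence frequencies derived from small series, such as the 3 of 34 samples (8.8 %) with NRG4 rearrangements, the following Python sketch computes the proportion together with a 95 % Wilson score interval. The interval method is chosen here for illustration only and was not part of the original analysis; the function name is hypothetical.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95 %)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# NRG4 rearrangements reported in the text: 3 of 34 sequenced samples.
positives, total = 3, 34
low, high = wilson_interval(positives, total)
print(f"frequency = {100 * positives / total:.1f} %")
print(f"95 % Wilson interval = {100 * low:.1f} % to {100 * high:.1f} %")
```

The width of such an interval is one reason why frequencies from series of a few dozen tumors should be compared cautiously with those initially reported elsewhere.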

Table 3. Overview of genes found involved in chimeric transcripts in this series and in previous studies

Gene        Type of cancer                                                  References
ABHD14B     breast                                                          [1]
AP2B1       lung, breast                                                    [1]
ARHGAP35    breast                                                          [1]
CCNB1       osteosarcoma, breast                                            [2], [1]
CCNY        prostate, breast, kidney, bladder, thyroid                      [3], [1], [4]
CTIF        lung, breast                                                    [1]
DHX30       breast                                                          [1]
FARP2       prostate, lung                                                  [3], [5]
KDM5A       Acute Myeloid Leukemia, breast, lung, kidney                    [6], [1]
MELK        breast                                                          [1]
MGEA5       soft tissue tumors                                              [7], [8]
NINJ2       lung, brain, ovary, breast                                      [1]
NRG4        breast                                                          [1]
NSD1        Acute Myeloid Leukemia, ovary, lung, breast                     [9], [1]
PAK1        ovary, breast                                                   [1], [10]
PCMTD1      Acute Lymphocytic Leukemia, ovary, prostate, lung, breast       [11], [1]
RBPMS       lung, breast, kidney, thyroid                                   [1], [10], [4]
SNTB1       ovary, breast                                                   [1]
TSPAN3      breast                                                          [1]
UNC13A      brain, lung                                                     [1]
VRK1        lung, oral cavity                                               [1]
ZNF512      ovary, breast                                                   [1]

Reference List

1. Yoshihara K, Wang Q, Torres-Garcia W, Zheng S, Vegesna R, Kim H and Verhaak RG. The landscape and therapeutic relevance of cancer-associated transcript fusions. Oncogene. 2015; 34(37):4845-4854.
2. Yang J, Annala M, Ji P, Wang G, Zheng H, Codgell D, Du X, Fang Z, Sun B, Nykter M, Chen K and Zhang W. Recurrent LRP1-SNRNP25 and KCNMB4-CCND3 fusion genes promote tumor cell motility in human osteosarcoma. Journal of hematology & oncology. 2014; 7:76.
3. Teles Alves I, Hartjes T, McClellan E, Hiltemann S, Bottcher R, Dits N, Temanni MR, Janssen B, van Workum W, van der Spek P, Stubbs A, de Klein A, Eussen B, Trapman J and Jenster G. Next-generation sequencing reveals novel rare fusion events with functional implication in prostate cancer. Oncogene. 2015; 34(5):568-577.
4. Agrawal N, Akbani R, Aksoy BA, Ally A, Arachchi H, Asa SL, Auman JT, Balasundaram M, Balu S, Baylin SB, Behera M, Bernard B, Beroukhim R, Bishop JA, Black AD, Bodenheimer T, et al. Integrated Genomic Characterization of Papillary Thyroid Carcinoma. Cell. 2014; 159(3):676-690.
5. The Cancer Genome Atlas Research Network. Comprehensive molecular profiling of lung adenocarcinoma. Nature. 2014; 511(7511):543-550.
6. van Zutven LJ, Onen E, Velthuizen SC, van Drunen E, von Bergh AR, van den Heuvel-Eibrink MM, Veronese A, Mecucci C, Negrini M, de Greef GE and Beverloo HB. Identification of NUP98 abnormalities in acute leukemia: JARID1A (12p13) as a new partner gene. Genes Chromosomes Cancer. 2006; 45(5):437-446.
7. Antonescu CR, Zhang L, Nielsen GP, Rosenberg AE, Dal Cin P and Fletcher CD. Consistent t(1;10) with rearrangements of TGFBR3 and MGEA5 in both myxoinflammatory fibroblastic sarcoma and hemosiderotic fibrolipomatous tumor. Genes Chromosomes Cancer. 2011; 50(10):757-764.
8. Hallor KH, Sciot R, Staaf J, Heidenblad M, Rydholm A, Bauer HC, Astrom K, Domanski HA, Meis JM, Kindblom LG, Panagopoulos I, Mandahl N and Mertens F. Two genetic pathways, t(1;10) and amplification of 3p11-12, in myxoinflammatory fibroblastic sarcoma, haemosiderotic fibrolipomatous tumour, and morphologically similar lesions. The Journal of pathology. 2009; 217(5):716-727.
9. Brown J, Jawad M, Twigg SR, Saracoglu K, Sauerbrey A, Thomas AE, Eils R, Harbott J and Kearney L. A cryptic t(5;11)(q35;p15.5) in 2 children with acute myeloid leukemia with apparently normal karyotypes, identified by a multiplex fluorescence in situ hybridization telomere assay. Blood. 2002; 99(7):2526-2531.
10. Stransky N, Cerami E, Schalm S, Kim JL and Lengauer C. The landscape of kinase fusions in cancer. Nature communications. 2014; 5:4846.
11. Atak ZK, Gianfelici V, Hulselmans G, De Keersmaecker K, Devasia AG, Geerdens E, Mentens N, Chiaretti S, Durinck K, Uyttebroeck A, Vandenberghe P, Wlodarska I, Cloos J, Foa R, Speleman F, Cools J, et al. Comprehensive analysis of transcriptome variation uncovers known and novel driver events in T-cell acute lymphoblastic leukemia. PLoS genetics. 2013; 9(12):e1003997.

Acknowledgments

This work was supported by grants from the Norwegian Radium Hospital Foundation, the John and Inger Fredriksen Foundation, and Anders Jahre's Foundation through UNIFOR (University of Oslo).

Reference List

1. Prat J. Ovarian carcinomas: five distinct diseases with different origins, genetic alterations, and clinicopathological features. Virchows Arch. 2012;460(3):237-49.
2. Kurman RJ, Carcangiu ML, Herrington CS, Young RH. WHO Classification of Tumours of Female Reproductive Organs. Volume 6 ed: IARC; 2014.
3. Prat J. New insights into ovarian cancer pathology. Ann Oncol. 2012;23 Suppl 10:x111-7.
4. Integrated genomic analyses of ovarian carcinoma. Nature. 2011 Jun 29;474(7353):609-15. PubMed PMID: 21720365. PubMed Central PMCID: PMC3163504.
5. Yoshihara K, Wang Q, Torres-Garcia W, Zheng S, Vegesna R, Kim H, et al. The landscape and therapeutic relevance of cancer-associated transcript fusions. Oncogene. 2015 Sep 10;34(37):4845-54. PubMed PMID: 25500544. PubMed Central PMCID: PMC4468049.
6. Patch AM, Christie EL, Etemadmoghadam D, Garsed DW, George J, Fereday S, et al. Whole-genome characterization of chemoresistant ovarian cancer. Nature. 2015 May 28;521(7553):489-94. PubMed PMID: 26017449.
7. Earp MA, Raghavan R, Li Q, Dai J, Winham SJ, Cunningham JM, et al. Characterization of fusion genes in common and rare epithelial ovarian cancer histologic subtypes. Oncotarget. 2017 Apr 01. PubMed PMID: 28423358.
8. Smebye ML, Agostini A, Johannessen B, Thorsen J, Davidson B, Trope CG, et al. Involvement of DPP9 in gene fusions in serous ovarian carcinoma. BMC Cancer. 2017 Sep 11;17(1):642. PubMed PMID: 28893231. PubMed Central PMCID: PMC5594496.
9. Heim S, Mandahl N, Mitelman F. Genetic convergence and divergence in tumor progression. Cancer Res. 1988;48:5911-6.
10. Nicorici D, Satalan M, Edgren H, Kangaspeska S, Murumagi A, Kallioniemi O, et al. FusionCatcher - a tool for finding somatic fusion genes in paired-end RNA-sequencing data. bioRxiv. 2014.
11. Iyer MK, Chinnaiyan AM, Maher CA. ChimeraScan: a tool for identifying chimeric transcription in sequencing data. Bioinformatics. 2011;27(20):2903-4. PubMed Central PMCID: PMC3187648.
12. Ge H, Liu K, Juan T, Fang F, Newman M, Hoeck W. FusionMap: detecting fusion genes from next-generation sequencing data at base-pair resolution. Bioinformatics (ISSN 1367-4811). 2011.
13. Kim D, Salzberg SL. TopHat-Fusion: an algorithm for discovery of novel fusion transcripts. Genome Biology. 2011;12(8):1-15.
14. Kannan K, Coarfa C, Rajapakshe K, Hawkins SM, Matzuk MM, Milosavljevic A, et al. CDKN2D-WDFY2 is a cancer-specific fusion gene recurrent in high-grade serous ovarian carcinoma. PLoS Genet. 2014;10(3):e1004216.
15. Parker BC, Zhang W. Fusion genes in solid tumors: an emerging target for cancer diagnosis and treatment. Chinese Journal of Cancer. 2013;32(11):594-603. PubMed Central PMCID: PMC3845546.
16. Salzman J, Marinelli RJ, Wang PL, Green AE, Nielsen JS, Nelson BH, et al. ESRRA-C11orf20 is a recurrent gene fusion in serous ovarian carcinoma. PLoS biology. 2011 Sep;9(9):e1001156. PubMed PMID: 21949640. PubMed Central PMCID: PMC3176749.
17. Kannan K, Coarfa C, Chao PW, Luo L, Wang Y, Brinegar AE, et al. Recurrent BCAM-AKT2 fusion gene leads to a constitutively activated AKT2 fusion kinase in high-grade serous ovarian carcinoma. Proceedings of the National Academy of Sciences of the United States of America. 2015 Mar 17;112(11):E1272-7. PubMed PMID: 25733895. PubMed Central PMCID: PMC4371965.
18. Micci F, Panagopoulos I, Thorsen J, Davidson B, Trope CG, Heim S. Low frequency of ESRRA-C11orf20 fusion gene in ovarian carcinomas. PLoS biology. 2014 Feb;12(2):e1001784. PubMed PMID: 24504521. PubMed Central PMCID: PMC3913552.
19. Loyer P, Trembley JH, Grenet JA, Busson A, Corlu A, Zhao W, et al. Characterization of cyclin L1 and L2 interactions with CDK11 and splicing factors: influence of cyclin L isoforms on splice site selection. The Journal of biological chemistry. 2008 Mar 21;283(12):7721-32. PubMed PMID: 18216018.
20. Yang L, Li N, Wang C, Yu Y, Yuan L, Zhang M, et al. Cyclin L2, a novel RNA polymerase II-associated cyclin, is involved in pre-mRNA splicing and induces apoptosis of human hepatocellular carcinoma cells. The Journal of biological chemistry. 2004 Mar 19;279(12):11639-48. PubMed PMID: 14684736.
21. Micci F, Haugom L, Abeler VM, Davidson B, Trope CG, Heim S. Genomic profile of ovarian carcinomas. BMC Cancer. 2014;14:315. doi: 10.1186/1471-2407-14-315.
22. Vermeulen K, Van Bockstaele DR, Berneman ZN. The cell cycle: a review of regulation, deregulation and therapeutic targets in cancer. Cell proliferation. 2003 Jun;36(3):131-49. PubMed PMID: 12814430.
23. Henglein B, Chenivesse X, Wang J, Eick D, Bréchot C. Structure and cell cycle-regulated transcription of the human cyclin A gene. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(12):5490-4. PubMed Central PMCID: PMC44021.
24. Yam CH, Fung TK, Poon RY. Cyclin A in cell cycle control and cancer. Cellular and molecular life sciences: CMLS. 2002 Aug;59(8):1317-26. PubMed PMID: 12363035.
25. Krause K, Wasner M, Reinhard W, Haugwitz U, Lange-zu Dohna C, Mössner J, et al. The tumour suppressor protein p53 can repress transcription of cyclin B. Nucleic Acids Research. 2000;28(22):4410-8. PubMed Central PMCID: PMC113869.
26. Wasner M, Tschöp K, Spiesbach K, Haugwitz U, Johne C, Mössner J, et al. Cyclin B1 transcription is enhanced by the p300 coactivator and regulated during the cell cycle by a CHR-dependent repression mechanism. FEBS letters. 2003;536(1):66-70.
27. Cogswell JP, Godlevski MM, Bonham M, Bisi J, Babiss L. Upstream stimulatory factor regulates expression of the cell cycle-dependent cyclin B1 gene promoter. Molecular and cellular biology. 1995 May;15(5):2782-90. PubMed PMID: 7739559. PubMed Central PMCID: PMC230509.
28. Yin XY, Grove L, Datta NS, Katula K, Long MW, Prochownik EV. Inverse regulation of cyclin B1 by c-Myc and p53 and induction of tetraploidy by cyclin B1 overexpression. Cancer Res. 2001 Sep 01;61(17):6487-93. PubMed PMID: 11522645.
29. Gong D, Ferrell JE. The Roles of Cyclin A2, B1, and B2 in Early and Late Mitotic Events. Molecular Biology of the Cell. 2010;21(18):3149-61. PubMed Central PMCID: PMC2938381.
30. Falls DL. Neuregulins: functions, forms, and signaling strategies. Experimental cell research. 2003 Mar 10;284(1):14-30. PubMed PMID: 12648463.
31. Marone M, Scambia G, Giannitelli C, Ferrandina G, Masciullo V, Bellacosa A, et al. Analysis of cyclin E and CDK2 in ovarian cancer: gene amplification and RNA overexpression. Int J Cancer. 1998 Jan 05;75(1):34-9. PubMed PMID: 9426687.
32. Ayhan A, Kuhn E, Wu RC, Ogawa H, Bahadirli-Talbott A, Mao TL, et al. CCNE1 copy-number gain and overexpression identify ovarian clear cell carcinoma with a poor prognosis. Mod Pathol. 2017 Feb;30(2):297-303. PubMed PMID: 27767100.
33. Shigemasa K, Tanimoto H, Parham GP, Parmley TH, Ohama K, O'Brien TJ. Cyclin D1 overexpression and p53 mutation status in epithelial ovarian cancer. Journal of the Society for Gynecologic Investigation. 1999 Mar-Apr;6(2):102-8. PubMed PMID: 10205781.
34. Comisso E, Scarola M, Rosso M, Piazza S, Marzinotto S, Ciani Y, et al. OCT4 controls mitotic stability and inactivates the RB tumor suppressor pathway to enhance ovarian cancer aggressiveness. Oncogene. 2017 Mar 20. PubMed PMID: 28319064.
35. Zheng H, Hu W, Deavers MT, Shen DY, Fu S, Li YF, et al. Nuclear cyclin B1 is overexpressed in low-malignant-potential ovarian tumors but not in epithelial ovarian cancer. Am J Obstet Gynecol. 2009 Oct;201(4):367.e1-6. PubMed PMID: 19608149.
36. Liu H, Shi H, Fan Q, Sun X. Cyclin Y regulates the proliferation, migration, and invasion of ovarian cancer cells via Wnt signaling pathway. Tumour Biol. 2016 Aug;37(8):10161-75. PubMed PMID: 26831658.
37. The Cancer Genome Atlas Research Network. Comprehensive molecular profiling of lung adenocarcinoma. Nature. 2014;511(7511):543-50.
38. Comprehensive molecular portraits of human breast tumours. Nature. 2012;490(7418):61-70.
39. Heim S, Mitelman F. Cancer Cytogenetics: Chromosomal and Molecular Genetic Aberrations of Tumor Cells. 4th ed: Elsevier; 2015.

Supplementary material 1

List of the primers used to validate the candidate fusion transcripts

Oligo Name      Sequence (5' to 3')
DNAJC717-FW     TCTATGCTCCAACCGGAA
KRT12-RV        CAGCATGTTACTCTGAAAGGA
DNAJC717-NFW    AGTGCGATGTGGTAATGG
KRT12-NRV       TGAACTTGAGATGAGACCAC
CCNY-FW         AACAGGATAGCAGCAGGAGT
NRG4-RV1        CTGGACTGAACTGGCTCTCT
CCNY-NFW        AGCCGCGAGGACACG
NRG4-NRV1       AGGAAGTAGAAGGCTCCAATGATA
MRPL21-FW       TGAACTAGACCTTGCGTGTG
TADA2A-RV       CGTCTTTGTCTCTCCTTTAACCT
MRPL21-NFW      TCGAGTAGAAGCCACAGTCA
TADA2A-NRV      CTCTCAAGTCCCATTCTGCAT
MICALL1-FW      GATGGATACCATTGAGCGCC
GGA1-RV         TTGCTGGGTTTGATGGACTC
MICALL1-NFW     CCGAGCTCATCTATGTCTTCA
GGA1-NRV        AGTGAGAGCTCGGTTGGTA
ISY1-FW         GAGGGAAAAGTGAAGGAACGA
CNBP-RV         TGGCAAGATGACCAGACTCA
ISY1-NFW        GAGACGACAGATCATTGGAGAG
CNBP-NRV        TCCACCAGTAGGACATTCCC
ANXA5-FW        CTGACCTGAGTAGTCGCCAT
CCNA2-RV        CCAAATGCAGGGTCTCATTCT
ANXA5-NFW       ACTTCCCTGGATTTGATGAGC
CCNA2-NRV       AGTGATGTCTGGCTGTTTCTT
PAK1-FW         TCGAAATGATTGAAGGGGAGC
GYLTL1B-RV      CCCAATGCTAAGATGAGGGG
PAK1-NFW        GACTTTCTGAACCGCTGTCT
GYLTL1B-NRV     CAAGTCCTGGTGGAATTCGT
AP1M2-FW        CCCAGATTCAAGACCAGTGT
HIP1-RV         TCACTTAAATGCTGGGGTCC
AP1M2-NFW       GAGAAACGTCGTGATTTGGAG
HIP1-NRV        TGGCTCTGTCCTTTCTCATTT
CTBP2-FW        CATTTTAAGGTGGCTCGGGA
DENND3-RV       GAAGAAGACGATCCGCTGTT
CTBP2-NFW       GGGACCGAACCGGGAG
DENND3-NRV      CTCGGGCAGGTAACACAATC
PDE4D-FW        TTATAGCCCAGCGTACGAGA
CCNB1-RV        CCAAAATAGGCTCAGGCGAA
CCNB1-NRV       CATTTTGGCCTGCAGTTGTT
MELK-FW         GTGTGGATTTCTACCATTTGATGA
TMEM88-RV       CAGGTTCCAGAGTCCATGTC
MELK-NFW        ATGATGTTCCCAAGTGGCTC
TMEM88-NRV      AGAACTGCAGCACATCGTAG
PCMTD1F1        TGGGAAGTGGAACCGGATA
CCNL2R1         CCGAGCATAAAGCTGCAAG
PCMTD1F2        AGCTTCATTCAGATGTGGTGG
CCNL2R2         TCCGGGCAGCAAGATAAATG
RGS10-FW        CCGTCAGACATCCACGAC
ZMYM-RV         TCATTGTCACTACCAACGACTG
RGS10-NFW       GGAGAATCTGCTGGAAGACC
ZMYM-NRV        ATGCTGTCAAGGGTAGGAGT
SCNN1A-FW       CAGACTTGGGGGCGATTATG
CHD4-RV         TGGGAGCCATCATTGTAGTTG
SCNN1A-NFW      TTCAAAGTACACACAGCAGGT
CHD4-NRV        TCCCAGTAGTCAGGATCCAC
TSPAN3-FW       GGGTGTGAGGCTCTAGTAGT
NRG4-RV2        TACTGCTCGTCTCTACCAGG
TSPAN3-NFW      TACGAGCTCCTCATCACTGG
NRG4-NRV2       TGCAAAGGAAGTAGAAGGCTC
NCAPG2-FW       CCTGGTGAGCCTCATCTTTAAT
RBPMS-RV        ACTCTAGTCGTAGTGTTTGCG
NCAPG2-NFW      TTCGTCATTGCTTAAATGCCTG
RBPMS-NRV       AGGCTGTTTAGATGTGAGCTT
DHX30-FW        CCCATGTGTGTCAACCCTAC
ABHD14B-RV      CGGCTTGTGGAGGTAACAG
DHX30-NFW       CACTGCACATAAAATGGCCC
ABHD14B-NRV     TTCACCACAGAGTGGTTGG
SNTB1-FW        GTGCTGAAGCAGGAGCTG
ZNF250-RV       GTGAGTGGCATGCTTTGG
SNTB1-NFW       CATCCTCATCAGCAAGATCTTCAA
ZNF250-NRV      TTCCCCAGAATCACTGTTTGC
SLC45A4-FW      GGTAATGAAAATGGCTCCGC
MINOS1-RV       TCTCAGGTGAAGTCACTGCT
SLC45A4-NFW     CATAGACCGAATCCCCATGC
MINOS1-NRV      AGCCATTCCTAATCCCATGC
PLXNB1-FW       TGGGAGTGTGTTCTCCGT
PRKAR2A-RV      TCACCCTGAGTGATTATGCG
PLXNB1-NFW      TGGTGGCTATGATAGGGGAT
PRKAR2A-NRV     TGATTTAAGGAGGGGCACAG
MAP3K10-FW      GTTTGATGACCTTCGGACCAA
C19orf47-RV     CGTCCTATGGAGACCTTTTGC
MAP3K10-NFW     GATGGACATCGTGGAACGG
C19orf47-NRV    GGGCATGTTGATGACGTACT
FUT8-FW         GGTTCCTGGCGTTGGATTAT
FNTB-RV         CACATCATGCAACATGGCTC
FUT8-NFW        TCCTGATCACTCTAGCCGAG
FNTB-NRV        AGGTGTGGTAGAAATCACGC
AP2B1-FW        CGGGATCTCATAGCAGATTCAA
ZNF512-RV       GAGGCTGACACTACCAGTTC
AP2B1-NFW       CTGAATGAATGCACTGAATGGG
ZNF512-NRV      TCTGTCTCGGGAAGCTCAAT
FARP2-FW        ATTACGATGAAACGCTGGACC
PPP1R7-RV       CCAGCTCTCGAAGACTCTGTA
FARP2-NFW       AGTTGGAAATGTACGGCATCA
PPP1R7-NRV      CCTCAAATCCTTCAATCTTCCCT
FAM160B1-FW     GTCCTCCGGGAATGAAACAG
NHLRC2-RV       TGCCTTGTGAGGATAGGCA
FAM160B1-NFW    GTCCTAGCAACACCAACAGA
NHLRC2-NRV      ACTTCCAGCAAACCTAAGGC
CTIF-FW         AAAAGCTGGACTTCACCCAG
MOB2-RV         TCTGCAGATCTTCCTCACCA
CTIF-NFW        TGAATGACATCGAGAAGGTCC
MOB2-NRV        TCTCTGCCGTATTTTGTGGG
MGEA5-FW        AGAGCTCCAGTAATCTGGGAT
KCNIP2-RV       ACTGGGAGTAAATCTGCTTGAA
MGEA5-NFW       AGCCAACTACGTTGCTATCC
KCNIP2-NRV      GACATTCGTTCTTGAAGCCC
FAM20C-FW       CGGCCGTGGACTCCTAT
SUGCT-RV        TTCATGTTGGTCCACCACTC
FAM20C-NFW      GCTCATCATGACCTTCCAGAAT
SUGCT-NRV       CTCCTTCAGGATGTGCGTTG
ARHGAP35-FW     ATTGAGTACATTGAAGCCACAGG
UNC13A-RV       TCTTGGATTTGGTCGCAAACT
ARHGAP35-NFW    GCGGGAACAAGTCTGAGATG
UNC13A-NRV      GTTCCTGGATGAGTGAACAGC
FGFR2-FW        TTCACTCTGCATGGTTGACA
FAM24B-RV       GATACCACCAGCGATGACAG
FGFR2-NFW       TACAGCTTCCCCAGACTACC
FAM24B-NRV      GAGCTCCAATCATGTGTCAGA
KDM5A-FW        CTGAAAAGATCATCAGTGCAGAA
NINJ2-RV        GTTCAGCCGTGCAATGAC
KDM5A-NFW       TCTCCTCGACAAACAATGGAC
NINJ2-NRV       CCAGGGTGGTGTAGTAGTGA
PDZD8-FW        TCTCCTTCGTGGAAGACCC
ABLIM1-RV       TCTTTATGAAGAAGCCCCCTTG
PDZD8-NFW       GCACACCCTACCGAATTACAA
ABLIM1-NRV      CCGAAGCACTTCACCCTTG
HDAC7-FW        TCTACGGCACCAACCCG
VDR-RV          GCTCCTCCTCATGCAAGTTC
HDAC7-NFW       GACACCATCTGGAATGAGCTT
VDR-NRV         CGCGGTACTTGTAGTCTTGG
VRK1-FW         CCTCGTGTAAAAGCAGCTCA
TDP1-RV         ATATGGCACAGGAAAGGTGG
VRK1-NFW        AGGAATGGAAAGTAGGATTACCC
TDP1-NRV        CCAAATGCTGAAGGGAGGAA
NSD1-FW         ATCAGACCTGTGAACTACCCA
ZNF346-RV       GGGCAGAATCACCTCTAGTC
NSD1-NFW        TTTAGATGCCCCTGAAGACAAG
ZNF346-NRV      GATGCCACTGTTTTTGGTGAC
NFIX-FW         GACCAGAAGGGCAAGATCC
RAD23A-RV       TCACCATGACGACCACAAAG
NFIX-FW         AGCCACATCACATTGGAGTC
RAD23A-RV       AGGGACATCGTCACTCAAGA
PRKD1-FW        TCTGGGCTCTCGGAGAAAG
CNIH1-RV        AGAGGGGCATATTGAGACCC
PRKD1-NFW       GATCGGCCTGAGCCGT
CNIH1-NRV       AGCGTGGATGAGGTACTCTG
TMEM123-FW      GCATCCTGCCCTCGGA
MMP27-RV        TTCACACGTCCTGGAAAACC
TMEM123-NFW     GGTGCTAGCGCTGCTG
MMP27-NRV       CGTATGCAGCTTGCAGATCA
KLC1-FW         GCTCGGCGCACAGTC
ZFAT-RV         GTTTTCTCAGTGCTGTGACG
ZFAT-NRV        GCCTTGAGGTTACTCTTGCT
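
A quick sanity check of primers such as those listed in Supplementary material 1 is to compute their GC content and an approximate melting temperature. The Python sketch below uses the Wallace rule, Tm = 2(A+T) + 4(G+C) °C, which gives only a rough estimate for short oligonucleotides. Its use is an assumption of this example and not necessarily the method used to design the primers. The function names are hypothetical, and the four sequences are the CCNY-NRG4 primers from the list.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a primer sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Approximate melting temperature (deg C) by the Wallace rule."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Outer (FW/RV) and nested (NFW/NRV) primers for the CCNY-NRG4 transcript,
# copied from the primer list.
primers = {
    "CCNY-FW": "AACAGGATAGCAGCAGGAGT",
    "NRG4-RV1": "CTGGACTGAACTGGCTCTCT",
    "CCNY-NFW": "AGCCGCGAGGACACG",
    "NRG4-NRV1": "AGGAAGTAGAAGGCTCCAATGATA",
}

for name, seq in primers.items():
    print(f"{name:10s} len={len(seq):2d}  GC={gc_content(seq):5.1f} %  Tm~{wallace_tm(seq)} C")
```

For longer primers, a nearest-neighbor model would normally give a more realistic estimate than this simple base-count rule.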