Oncogene (2014) 33, 5434–5441 & 2014 Macmillan Publishers Limited All rights reserved 0950-9232/14 www.nature.com/onc

ORIGINAL ARTICLE Novel fusion transcripts in human gastric cancer revealed by transcriptome analysis

H-P Kim1,10, G-A Cho1,10, S-W Han1,2, J-Y Shin3,4, E-G Jeong1, S-H Song1, W-C Lee3,5, K-H Lee1,2, D Bang6, J-S Seo3,4,5,7,8, J-Il Kim3,4,5,7 and T-Y Kim1,2,9

Gene fusion is involved in the development of various types of malignancies. Recent advances in sequencing technology have facilitated identification of fusions and have stimulated the research of this field in cancer. In the present study, we performed next-generation transcriptome sequencing in order to discover novel gene fusions in gastric cancer. A total of 282 fusion transcript candidates were detected from 12 gastric cancer cell lines by bioinformatic filtering. Among the candidates, we have validated 19 fusion transcripts, which are 7 inter-chromosomal and 12 intra-chromosomal fusions. A novel DUS4L–BCAP29 fusion transcript was found in 2 out of 12 cell lines and 10 out of 13 gastric cancer tissues. Knockdown of DUS4L–BCAP29 transcript using siRNA inhibited cell proliferation. Soft agar assay further confirmed that this novel fusion transcript has tumorigenic potential. We also identified that microRNA-coding gene PVT1, which is amplified in double minute in SNU-16 cells, is recurrently involved in gene fusion. PVT1 produced six different fusion transcripts involving four different as fusion partners. Our findings provide better insight into transcriptional and genetic alterations of gastric cancer: namely, the tumorigenic effects of transcriptional read-through and a candidate region for genetic instability.

Oncogene (2014) 33, 5434–5441; doi:10.1038/onc.2013.490; published online 18 November 2013 Keywords: gene fusion; fusion transcript; gastric cancer; transcriptome sequencing; next-generation sequencing

INTRODUCTION In the present study, we used paired end transcriptome Gene fusion is one of the common oncogenic activation sequencing, which is an efficient tool for detecting fusion genes. mechanisms involved in the development of various types of This approach allowed us to detect 23 fusion transcripts, including malignancies including leukaemia, lymphoma, sarcoma and chimeric RNAs and fusion genes, all of which, to our knowledge, are prostate cancer.1 More importantly, gene fusions can produce novel. Fusion partner genes included hepatocyte growth factor important druggable targets, which make detection of gene receptor (MET), cyclin-dependent kinase12 (CDK12) and fibroblast fusion as an essential part of diagnosis and treatment strategy for growth factor receptor 2 (FGFR2) that are potential druggable some malignancies. Representative example is imatinib targeting targets. We also confirmed that transcriptional read-through of BCR-ABL tyrosine kinase, which has entirely changed treatment dihydrouridine synthase 4-like (DUS4L)–B-cell receptor-associated paradigm of chronic myelogenous leukaemia.2 Recent success of 29 (BCAP29) can be tumorigenic. Transcriptional read- crizotinib in nonsmall cell lung cancer harbouring EML4-ALK fusion through has previously been neglected because of the lack of has further underscored the importance of identifying novel gene genetic changes that occur and has never been functionally studied. fusions and effective targeted agents.3,4 In addition, we found the plasmacytoma variant translocation 1 (Pvt- Although other methods for detecting genetic alterations have 1) oncogene (PVT1), amplified and located on double minute (DM) been used, it was the advent of next-generation sequencing that , was involved in six different fusions. This suggests stimulated the fusion gene research in epithelial cancers. that PVT1 is associated with genetic instability resulting in gene Transcriptome sequencing has been the main contributor to fusions with various partners. Taken together, transcriptome fusion gene studies, for example, on breast,5,6 prostate7,8 and sequencing provides various novel fusion transcript candidates in gastric cancers.9 gastric cancer, and functional studies of these candidates contribute Gastric cancer is one of the most common and deadly to enhanced understanding of gastric carcinogenesis. cancers.10,11 Several gastric cancer-related fusion genes have been reported;9,12 however, the impact of gene fusion in gastric cancer has not been thoroughly studied. As with the discoveries of RESULTS fusion genes in other tumours, the identification of additional Detection and filtration of gene fusion candidates fusion genes in gastric cancer might provide new perspectives for We sequenced the transcriptomes of 12 gastric cancer cell lines its categorisation and understanding. (Supplementary Table S1) and detected 282 gene fusion

1Cancer Research Institute, Seoul National University College of Medicine, Seoul, Korea; 2Department of Internal Medicine, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Korea; 3Genomic Medicine Institute, Medical Research Center, Seoul National University, Seoul, Korea; 4Psoma Therapeutic Inc, Seoul, Korea; 5Department of Biomedical Sciences, Seoul National University Graduate School, Seoul, Korea; 6Department of Chemistry, College of Science, Yonsei University, Seoul, Korea; 7Department of Biochemistry, Seoul National University College of Medicine, Seoul, Korea; 8Macrogen Inc., Seoul, Korea and 9Department of Molecular Medicine & Biopharmaceutical Sciences, Graduate School of Convergence Science and Technology, Seoul National University College of Medicine, Seoul, Korea. Correspondence: Dr T-Y Kim, Department of Internal Medicine, Seoul National University College of Medicine, 101 Daehak-ro, Jongno-gu, Seoul 110-744, Korea. E-mail: [email protected] 10These authors contributed equally to the work. Received 4 April 2013; revised 1 October 2013; accepted 4 October 2013; published online 18 November 2013 Novel fusion transcripts in gastric cancer H-P Kim et al 5435 candidates (Figure 1). From the initial list of 282 gene fusion splicing or trans-splicing, were filtered out at the beginning step candidates, we discarded candidates with fewer than 10 junction because the main focus of these studies was on genetic changes. reads because fewer junction reads are likely indicators of false However, recent studies have suggested that RNA chimera positives. In addition, we observed false positive results in about TMEM79-SMG5 is generated by splicing in prostate cancer by 50 candidate fusions, with less than 10 reads. After this step, only deep sequencing.8 Similarly, SLC45A3–ELK4 fusion transcript can 22 candidates remained; however, we added back PVT1–ATE1, arise from cis-splicing of adjacent genes without corresponding PVT1–PPAPDC1A and CD44–FGFR2 because, in terms of their loci, DNA changes, suggesting that chimeric transcript is a novel class these three candidates were closely related to the already selected of gene fusion.13 On the basis of these, we included fusion PVT1–PDHX and PVT1–APIP fusion genes. In contrast, two transcript in our study. candidates were excluded as they are known to encode In order to define genetic rearrangement status of the 19 fusion hypothetical . Thus, the final list contained 23 gene fusion candidates, we performed long-range PCR in the genomic DNA of candidates (Figure 2). Apart from SNU-216, the other 11 cell lines cancer cell lines. We confirmed that two fusion transcripts (MED1– exhibited one or more fusion transcripts. CDK12 and CAPZA2–MET) were results of gene fusion (Supplementary Text S2). In contrast, long-range PCR revealed that five fusion transcripts are produced by read-through Validation of gene fusion candidates mechanism (Table 1). Although we were unable to validate RT–PCR and Sanger sequencing across fusion junctions were genetic rearrangement in remaining 12 fusion transcripts owing to conducted to confirm 23 candidates identified by transcriptome the technical limitations of long-range PCR, the 7 inter-chromo- sequencing; 19 of the 23 candidates were validated with 7 inter- somal fusion transcripts are most likely resulting from genetic chromosomal and 12 intra-chromosomal fusions (Table 1 and rearrangements. The genetic status of the five intra-chromosomal Supplementary Text S1). Five read-through fusion transcripts were fusion transcripts needs to be investigated using alternative among the 12 intra-chromosomal fusions. In most gene fusion technique in future studies. In the genome, MED1 and CDK12 are studies, chimeric RNAs, which are generated by read-through/ neighbours, but are located on different strands with opposite

Figure 1. Circos plot representing the fusion events in 12 gastric cancer cell lines. Transcriptome analysis was conducted on 12 gastric cancer cell lines: SNU-1, SNU-5, SNU-16, SNU-216, SNU-484, SNU-601, SNU-620, SNU-638, SNU-668, SNU-719, MKN45 and NCI-N87. Each connection with a line in a circle indicates a gene fusion in the cell line.

& 2014 Macmillan Publishers Limited Oncogene (2014) 5434 – 5441 Novel fusion transcripts in gastric cancer H-P Kim et al 5436 orientations; therefore, MED1–CDK12 fusion can be classified as cis Next, to investigate their relationships with tumours, we intra-chromosomal fusion gene.14 CAPZA2 and MET are examined the expression of the fusion transcripts in 13 primary neighbours on the same strand. Unlike most transcriptional gastric cancer tissues and their matched adjacent normal mucosa read-throughs, however, the genes in CAPZA2–MET fusion (Supplementary Table S2). CLN6–CALML4 and DUS4L–BCAP29 were transcripts are in a different order from their order on the DNA detected in 13 and 10 tumour tissues, while these are also present strand where MET is upstream of CAPZA2, suggesting that this is in 5 and 2 normal tissues, respectively. On the contrary, EIF3A– not a transcriptional read-through, but a genetic fusion. Similarly, SFXN4, VAV2–CCDC67 and PVT1–ATE1 were each detected in only RB1–ITM2B fusion was reported in melanoma and was categorised one tumour tissue. as a fusion gene rather than a chimeric transcript.15 Next, we used Prosite (http://prosite.expasy.org/) and SMART (http://smart.embl.de/) to predict protein translated from fusion trancripts.16,17 Although all the other candidate genes could encode protein products, it does not directly follow that they were all in the correct reading frame. For example, the STAG1–MSL2 fusion transcript was assumed to encode a 1289-residue protein by their sequence, but a guanine omission at the fusion junction caused a frameshift, resulting in a 786-residue protein instead. However, other fusion transcripts, for example, CAPZA2–MET, MED1–CDK12 and CD44–FGFR2, were predicted to encode protein kinase domain-containing proteins (Supplementary Text S1). The kinase domain of MED1–CDK12 is located at the fusion junction, which might be expected to alter the original CDK12 kinase, and CDK12 has been reported as a fusion partner of ERBB2 in the MKN7 cell line.18 In contrast, the kinase domains in the CAPZA2–MET and CD44–FGFR2 proteins are not at fusion junctions; therefore, the kinase domains of MET and FGFR2 are likely to be unaffected.

Identification of novel DUS4L–BCAP29 fusion transcript An examination of transcript frequency in the cancer tissues revealed that DUS4L–BCAP29 was the most frequently detected fusion transcript (Supplementary Table S2). DUS4L–BCAP29 was detected and validated in the SNU-5 cell line (Figure 3a), but no genomic level changes were detected by long-range PCR (data Figure 2. Schematic diagram of the candidate-filtering process. To not shown). Considering their loci and orientation on genome, this increase the validation rate, candidates with fewer than 10 junction fusion transcript is likely to be a transcriptional read-through reads were discarded. Candidates predicted to encode not-fully without genetic alterations. DUS4L–BCAP29 fusion transcript was validated/reviewed protein were also removed, leaving 23 final detected in 10 tumour and 2 normal tissues (Figure 3b; left). As the candidate fusion genes. normal tissues are from adjacent mucosa in the same patients, the

Table 1. The 19 validated fusion transcripts

No 50-Gene 30-Gene Context Locus Number of In frame Detected junction reads cell lines 50-Gene 30-Gene

1 TFEB TRERF1 Intra 6p21 6p21.1-p12.1 170 No SNU-620 2 EIF3A SFXN4 Intra 10q26 10q26.11 102 Yes SNU-668 3 DYM DAZAP1 Inter 19p13.3 18q21.1 32 Yes SNU-484 4a CAPZA2 MET Intra 7q31.2-q31.3 7q31 24 Yes MKN45 5 ARL6IP1 RPS15A Read-through 16p12-p11.2 16p 23 Yes SNU-484, SNU-638 6 E2F3 CDKAL1 Read-through 6p22.3 6p22 17 Yes SNU-719 7 IPP GPBP1L1 Intra 1p34-p32 1p34-p32 16 No MKN45 8 PVT1 PDHX Inter 11p13 8q24 15 — SNU-16 9 CLN6 CALML4 Read-through 15q23 15q23 15 No SNU-5, SNU-16, SNU-601, SNU-620, SNU-638, MKN-45, NCI-N87 10 STAG1 MSL2 Intra 3q22.3 3q22.3 13 No SNU-638 11 ESCO1 ABHD3 Intra 18q11.2 18q11.2 12 Yes SNU-1 12 VAV2 CCDC67 Inter 11q21 9q34.1 12 No MKN45 13 DUS4L BCAP29 Read-through 7q22-q31 7q22-q31 10 Yes SNU-5, SNU-719 14 PHACTR4 RCC1 Read-through 1p35.3 1p36.1 10 Yes SNU-719 15 APIP PVT1 Inter 11p13 8q24 10 — SNU-16 16a MED1 CDK12 Cis 17q12 17q12 10 Yes NCI-N87 17 PVT1 ATE1 Inter 10q26.13 8q24 7 — SNU-16 18 PVT1 PPAPDC1A Inter 10q26.12 8q24 4 — SNU-16 19 CD44 FGFR2 Inter 11p13 10q26 2 Yes SNU-16 aGenetic rearrangement.

Oncogene (2014) 5434 – 5441 & 2014 Macmillan Publishers Limited Novel fusion transcripts in gastric cancer H-P Kim et al 5437

5 p <0.001 Si-cont 10 Junction reads * 4 Si-DUS4L-BCAP29

3 DUS4L-BCAP29 5`1 6 7 2 3 8 3`

2 cyclin D

1 cyclin E

Relative proliferation Relative -tubulin 0 0day 1day 2day 3day

p <0.001 35 * 30 1 6 7 8 1 2 3 8 25 DUS4L BCAP29 20 Mock 15 10

Number of colonies Number 5 Chr.7q22-q31 0 Mock DUS4L-BCAP29 DUS4L-BCAP29

Frequency in gastric tissues 10 Mock p <0.001 8 DUS4L-BCAP29 *

6 GFP 4 cyclin D 2 cyclin E

Relative proliferation Relative  0 -tubulin DUS4L 0day 1day 2day 3day

DUS4L-BCAP29 1.2 1.8 1.0 1.6 1.4 0.8 1.2 0.6 1.0 * 0.8 0.4 0.6 0.4 0.2 0.2 0 0 Relative mRNA expression mRNA Relative Si-cont Si-DUS4L expression mRNA Relative Si-cont Si-DUS4L -BCAP29 -BCAP29 Figure 3. Characterisation of the DUS4L–BCAP29 fusion transcript. (a) The DUS4L–BCAP29 fusion transcript was validated in the SNU-5 cell line using RT–PCR and Sanger sequencing. (b) DUS4L–BCAP29 frequency in matched gastric cancer and normal tissue pairs was studied (left). mRNA levels of DUS4L–BCAP29 were assessed via quantitative real-time PCR (right). (c, d) SNU-5 cells were transfected with DUS4L–BCAP29 small interfering RNA (siRNA) (Si-DUS4L–BCAP29) or scramble siRNA (Si-cont), and viable cells were measured for 3 days. The expressions of DUS4L–BCAP29 and DUS4L were evaluated by real-time RT–PCR. Western blot analysis of cyclin D, cyclin E and a-tubulin in SNU-5 cells. Data represent the mean±s.d. (*Po0.001) of three independent experiments. (e) Soft agar assay of SNU-5 cells was performed for 14 days. Data represent the mean±s.d. (*Po0.001) of three independent experiments. (f) DUS4L–BCAP29-negative SNU-638 cells were transfected with plasmids encoding for DUS4L–BCAP29, in addition to mock plasmid. Viable cells were measured for 3 days (left). Western blot analysis of GFP, cyclin D, cyclin E and a-tubulin in SNU-638 cells. Data represent the mean±s.d. (*Po0.001) of three independent experiments. (g) S-phase progression were analysed using the BrdU incorporation assay in SNU-638 cells. Data represent the mean±s.d. (*Po0.005) of three independent experiments. two normal samples harboring the fusion transcripts could be the of SNU-5 cells compared with treatment with the si-control result of field effects. The expression of DUS4L–BCAP29 fusion (Figure 3d). In contrast, siRNA specifically targeting DUS4L but not transcript was higher than its normal counterpart (Figure 3b; the fusion transcript did not affect cell proliferation (data not right). To study its biologic function, SNU-5 cells were treated with shown). Taken together, it is the fusion transcript DUS4L–BCAP29 small interfering RNA (siRNA) to silence the expression of DUS4L– that is promoting cell proliferation but not DUS4L itself. Treatment BCAP29 transcript. Si-DUS4L–BCAP29 repressed the expression of with si-DUS4L–BCAP29 reduced the expression of AKT, ERK and the fusion transcripts effectively, but hardly affected the expres- their phosphorylated forms (Supplementary Figure S1). It was also sion of wild-type DUS4L, indicating that si-DUS4L–BCAP29 found that the expression of cyclin D was repressed in si-DUS4L– specifically silenced the fusion transcript (Figure 3c). Treatment BCAP29-treated cells. Cell cycle analysis showed that cells in G1 with si-DUS4L–BCAP29 significantly suppressed the proliferation and subG1 were increased in DUS4L–BCAP29-silenced cells.

& 2014 Macmillan Publishers Limited Oncogene (2014) 5434 – 5441 Novel fusion transcripts in gastric cancer H-P Kim et al 5438 Western blot showed evidence of apoptosis revealed by the for validation by RT–PCR and Sanger sequencing. With one cleavage of PARP, caspase-7 and caspase-3 and the down- exception, PDHX exon3–PVT1 exon4, all of these gene fusion had regulation of Bcl-2 and Bcl-XL and upregulation of Bim in PVT1 exon1 just before the fusion junctions (Figure 4b). Recently DUS4L–BCAP29-silenced cells compared with the controls reported PVT1-associated fusions, PVT1–NBEA and PVT1–WWOX in (Supplementary Figure S2). To confirm the potential tumorigenic multiple myeloma, also have PVT1 exon1 at the fusion junctions.19 effect of DUS4L–BCAP29 transcript, SNU-638 cell line not Break points near PVT1 were also detected frequently in multiple expressing the fusion transcript was transfected with mock or myeloma patients and in cell lines.19 We performed array CGH to DUS4L–BCAP29 and was investigated for anchorage-independent identify genetic changes in the joining regions (Figure 4c). The growth. After 14-day culture in soft agar, DUS4L–BCAP29- PVT1 copy-number showed abrupt changes within the gene itself, expressing cells clearly formed more foci, compared with control implying the presence of genomic breaks. The copy-numbers of cells (Figure 3e). We also examined cell proliferation after its partner genes also did not seem stable (log2 ratio ¼ 1). The overexpressing DUS4L–BCAP29 in SNU-638 cell line. Proliferation regions covering APIP to CD44 and FGFR2 were amplified, and was increased in a time-dependent manner in DUS4L–BCAP29- ATE1 showed two sudden changes in copy-number, indicating overexpressed SNU-638 cells but not in control cells (Figure 3f). genomic break points. These findings support the susceptibility of We also observed that the expressions of cyclin D and E increased. PVT1 to genomic break, in agreement with previous studies.19 These results were confirmed using BrdU incorporation assay Fluorescence in situ hybridisation (FISH) added more evidence of (Figure 3g). The series of experiments we have performed strongly PVT1 breakage in SNU-16 cells (Figure 4d). Multiple extrachromo- suggest that the novel DUS4L–BCAP29 fusion transcript is somal double minute chromosomes on which PVT1 was highly tumorigenic in gastric cancer. amplified have been reported in SNU-16 cells.20 In contrast, DNA obtained from the tissue of a normal male showed only one PVT1 PVT1: a tangled web of gene fusion pair on chromosome 8. Although most amplified PVT1 genes were We found complicated genetic alterations in SNU-16 cells, around not broken apart, some exhibited only one probe colour (red or PVT1 located on chromosome 8q24. PVT1 was involved in four green), indicating that the chromosome was unpaired and had different fusion genes including, PVT1–PDHX, PVT1–APIP, PVT1– broken apart (Figure 4d). Therefore, gene amplification of PVT1 ATE1 and PVT1–PPAPDC1A (Figure 4a). It was the most frequent and its breakage seem to be responsible for most of the tangled fusion partner we observed in a single cell line. Interestingly, all gene fusion web found in SNU-16. the validated PVT1-associated fusions were inter-chromosomal. APIP and PDHX are neighbouring genes on ; PPAPDC1A and ATE1 are close to each other on ; DISCUSSION CD44 is a neighbour of PDHX; and FGFR2 is located between From the initial 282 gene fusion candidates, we obtained 23 PPAPDC1A and ATE1. Because of their proximities, we returned candidates after filtering using our criteria for a higher validation PVT1–ATE1, PVT1–PPAPDC1A and CD44–FGFR2 to the candidate list rate (Figure 2). Of the 23 candidates, 19 were validated using

PVT1-PDHX PVT1-PDHX PDHX-PVT1 5` - 1 2 3 11 -3` 5` - 1 9 10 11 - 3` 5`- 1 2 3 4 10 - 3` CTGCCCTCCAGTGATCCCAT CTGCCTCCACAAATGCCA CAAAATCGTGCTGACCACA

Chromosome 8 PVT1 myc 8q24 PVT1-APIP PVT1-APIP Chromosome 10 5` - 1 2 3 7 - 3` 5` - 1 6 7 - 3` PPAPDC1A FGFR2 ATE1 10q26.12 10q26 10q26.13 CTGCCCTCCAGACAAGGAG TGCCCTCCAATATGATGA Chromosome 11 APIP PDHX CD44 11p13 11p13 11p13 PVT1-PPAPDC1A PVT1-ATE1 5` - 1 6 7 - 3` 5` - 1 12 - 3` TGCCCTCCATTGCCTTTTC CTGCCCTCCATGGATGAGG

Normal male 46, XY

DAPI FITC Cy3 Merged SNU-16

DAPI FITC Cy3 Merged Figure 4. Gene fusions involving PVT1 in the SNU-16 cell line. (a) Genetic alterations in SNU-16 cells around PVT1 on chromosome 8q24. (b) PVT1-participating fusion transcripts validated by RT–PCR and Sanger sequencing. (c) Array CGH showing copy-number changes in PVT1 and ATE1. Other involved genes were also amplified. (d) Fluorescence in situ hybridisation (FISH) showing PVT1 breakage and PVT1 amplification on double minute chromosomes in SNU-16 cells. Some double minute chromosomes were broken apart (FITC (green): 30 PVT1, Cy3 (red): 50 PVT1).

Oncogene (2014) 5434 – 5441 & 2014 Macmillan Publishers Limited Novel fusion transcripts in gastric cancer H-P Kim et al 5439 RT–PCR and Sanger sequencing (Table 1; Supplementary Text S1). with fusion genes is supported by a previous finding that NUP214– Seven were classified as inter-chromosomal fusions and 12 were ABL1 fusion was amplified and detected in a double minute classified as intra-chromosomal fusions. The 12 intra-chromosomal chromosome from a T-cell acute lymphoblastic leukaemia.31 fusions included five read-through candidates. In the present Moreover, it is plausible that the tangled web of gene fusion is study, read-through transcript was defined when any genetic related with chromothripsis, in that localised blocks of change was not identified by long-range PCR. However, negative chromosomes are rearranged.32 Future studies elucidating the result from long-range PCR may not completely exclude the relationships between the gene fusion, double minute possibility of genetic rearrangement. chromosome and chromothripsis are warranted. In the present study, 19 fusion candidates were categorised into In summary, we describe here several novel gene fusion events three groups by their frequencies in tissue samples and their in gastric cancer revealed by next-generation transcriptome component genes. The first group contains two fusion transcripts sequencing. It is notable that transcriptional read-through without that were common in the tumour tissues, namely, DUS4L–BCAP29 DNA rearrangement could be another mechanism of tumorigen- and CLN6–CALML4, which, because of their adjacent genomic loci esis. We also found that an amplified gene in a double minute and lack of genomic PCR products, were categorised as read- chromosome could be a source of various gene fusions with through transcripts. Although they were detected in a few normal different partners. tissues, they were much more common in tumour tissues. DUS4L– BCAP29 and CLN6(exon2)–CALML4(exon4) were detected in 76.9% and 46.2% of the tumour tissue sample, respectively MATERIALS AND METHODS (Supplementary Table S2). CLN6–CALML4 had three variants, and Cell lines and culture all were commonly expressed in tumour tissues. The functions of Human gastric adenocarcinoma cell lines (SNU-1, SNU-5, SNU-16, SNU-216, read-through fusion transcripts without genetic alteration remain SNU-484, SNU-601, SNU-620, SNU-638, SNU-668 and SNU-719) were relatively understudied. Mechanisms of their formation and their purchased from the Korean Cell Line Bank;20 the NCI-N87 and MKN45 recurrence in cancer tissues have been reported,8,13,21,22 but their cell lines were obtained from the American Type Culture Collection specific functions and contributions to cancer have not been (Manassas, VA, USA) and Health Science Research Resource Bank (Osaka, studied in depth. Here, we have shown that DUS4L–BCAP29 Japan), respectively. All cell lines were cultured in RPMI1640 (Hyclone Laboratories, Ind., Logan, UT, USA), supplemented with 10% fetal bovine contributes to cell proliferation via antiapoptosis signalling serum at 37 1C in a 5% CO -humidified atmosphere. (Figure 3). Further studies are planned to elucidate the mechanism 2 by which DUS4L–BCAP29 are generated. We also found that three candidates show tumour-specific expression, EIF3A–SFXN4, VAV2– Paired-end transcriptome sequencing CCDC67 and PVT1–ATE1, but detected in only in one tumour each. Complementary DNA (cDNA) samples were sequenced using a Genome Second group contains oncogenes for which drugs exist, Analyzer IIx (Illumina Inc., San Diego, CA, USA). For the mRNA-Seq sample namely, CAPZA2–MET, MED1–CDK12 and CD44–FGFR2. Currently, preparation, sequencing libraries were generated according to the standard Illumina protocol for high-throughput sequencing. We hybridised various MET, CDK and FGFR inhibitors are being developed and 23,24 8 pM of each library to a flow cell, with a single lane for each sample, and under clinical trials. The CAPZA2–MET fusion transcript was used an Illumina cluster station for cluster generation. Finally, we detected in MKN45 cells containing amplified MET. Abnormal MET generated 2 Â 53 bp and 1 Â 38 bp sequences using Illumina GAIIx signalling is a well-known marker of poor prognosis and drug (Illumina Inc) sequencer (Supplementary Table S1). resistance, and, as a result, MET has been studied intensively as an anticancer drug target.25 MKN45 cells containing amplified MET Alignment of sequenced reads. We used the standard Illumina pipeline and CAPZA2–MET fusion are sensitive to MET inhibitors.26 (cassava 1.6) with default options to analyse the images and extract base Considering that MET is amplified in MKN45 cells,27 it is possible calls. The sequenced reads were aligned using GSNAP,33 with the non- that regional gene amplification of MET might have generated the default parameters -m 3 INSERT SIZE:-p 300 –i 3, and allowing up to two CAPZA2–MET fusion. MET-specific antibodies might be less mismatches with the reference sequences. The sequenced reads were aligned to human transcript reference sequences from the UCSC database sensitive than tyrosine kinase inhibitors for this protein because (Homo_sapiens.GRCh36), and to the sequence (NCBI Build it has lost a large part of its extracellular domain. In contrast, 36.3) and known exon–exon junctions, for expression analysis at the gene/ CDK12 has not been a mainstream target of anticancer drugs, transcript levels. Discordant read pairs and fusion-spanning reads were including the CDK inhibitors. The MED1–CDK12 fusion gene is extracted from the mapping results and used to identify gene fusion predicted to alter the protein kinase domain of CDK12 events. The pipeline that we used to detect fusion genes has been (Supplementary Text S1). The biological consequence of the described previously.34,35 altered protein kinase needs to be elucidated in the future. FGFR2 is another anticancer drug target for which several drugs have Filtration steps. (1) Homology between two genes was detected using been already developed.28 We confirmed CD44–FGFR2 in an SNU- bl2seq (BLAST 2 sequences); similar genes were filtered out (E-value 16 cell line (Supplementary Text S1). SNU-16 cells containing o0.01). (2) Fusion-spanning reads mapped by fewer than 10 bp to either amplified FGFR2 and CD44–FGFR2 fusion are sensitive to FGFR gene in the fusion were discarded. Somatic gene fusions were obtained by removing the gene fusions that were also observed in the normal sample. inhibitor.29 Third group contains four gene fusions and their variants that RT–PCR and quantitative real-time RT–PCR. Total RNA was extracted from were detected in SNU-16 cell lines and that have PVT1 as a the each cell line using an RNeasy mini kit (Qiagen, Valencia, CA, USA) for component gene (Figure 4). PVT1 exon1 is known to encode the transcriptome sequencing following the manufacturer’s instructions. cDNA 30 oncogenic microRNA miR1204 and, because the microRNA- was synthesised from total RNA (1 mg) with ImProm-II reverse transcriptase coding gene PVT1 is located at the head of the fusion, further (Promega, Madison, WI, USA) and amplified by RT–PCR using AmpliTaq studies should define the functional results of these fusions. Most Gold DNA polymerase (Applied Biosystems, Foster City, CA, USA) with of the amplified PVT1 was located on double minute buffer supplied by the manufacturer with gene/fusion junction-specific chromosomes (Figure 4d). The amplified FGFR2 in SNU-16 also primers. The primers were synthesised by Macrogen Inc. (Seoul, Korea). resides on double minute chromosomes.20 Considering that the b-Actin expression was used as an internal RT–PCR control. Sequencing was carried out by Macrogen Inc. and Bionics Co., Ltd (Seoul, Korea). For amplified PVT1 and FGFR2 in SNU-16 exist on double minute quantitative real-time RT–PCR (qRT–PCR), cDNA was amplified using chromosomes, and their involvement in gene fusion, these genes Premix Ex Taq (TaKaRa, Shiga, Japan) with SYBR Green I (Molecular Probes, could mutate and fuse with each other during amplification Eugene, OR, USA) using a Step One Plus system (Applied Biosystems). The events or during the generation of double minute chromosomes. sequences of all the PCR primers used in this study are listed in The suggestion that double minute chromosomes are associated Supplementary Table S3.

& 2014 Macmillan Publishers Limited Oncogene (2014) 5434 – 5441 Novel fusion transcripts in gastric cancer H-P Kim et al 5440 Primary human gastric tissues Array Comparative Genomic Hybridisation (array CGH) analysis Thirteen human gastric tumour tissues and the matched normal tissues Genomic DNA from cells was analysed using array CGH using a 2 Â 400 K were obtained from the tissue bank of Seoul National University Hospital oligonucleotide microarray (Agilent Technologies, Santa Clara, CA, USA) (Seoul, Korea). After surgical removal, the tissues were immediately frozen according to the manufacturer’s recommendations. The test DNA (2 mg) in liquid nitrogen and stored until required. and reference DNA (2 mg) were digested with AluI and RsaI (Promega). The digested test and reference DNAs were labelled with cyanine (Cy)3- deoxyuridine triphosphate (dUTP) or Cy5-dUTP, respectively, using the siRNAs and transfection Agilent Genomic DNA Labeling Kit PLUS (Agilent). The individually labelled The siRNAs, including the control siRNA (si-CONTROL) with scrambled test and reference samples were then purified using Microcon YM-30 filters sequences, used in this study were synthesised by Genolution Pharma- (Millipore, Billerica, MA, USA). Following purification, the appropriate Cy3- ceuticals, Inc. (Seoul, Korea). The siRNA sequences used against the DUS4L– labelled test DNA and Cy5-labelled reference DNA were mixed together BCAP29 fusion transcript were 50-GAAACAUAUCUGUAUUAAUUU-30 (sense) and combined with 2 Â hybridisation buffer (Agilent), 10 Â blocking agent and 50-AUUAAUACAGAUAUGUUUCUU-30 (antisense). Each siRNA (50 nM) (Agilent) and human Cot-1 DNA (Invitrogen). The hybridisation mixture was treated in each cell line, twice separately, with an interval of 48 h. was dispensed slowly to a microarray chip and assembled with an Agilent Transfection was performed with Lipofectamine 2000 (Invitrogen, Carlsbad, SureHyb chamber. The assembled slide chamber was placed in the rotator CA, USA) according to the manufacturer’s instruction. rack in a hybridisation oven set to 65 1C for 40 h with suitable rotation. The hybridisation was followed by two washing steps in Agilent’s wash buffer 1 and wash buffer 2 following the manufacturer’s instructions. After washing, Cell proliferation assay all microarray slides were scanned on an Agilent Microarray Scanner Immediately after the second treatment with si-DUS4L–BCAP29 or its G2505C with a 5-mm resolution. Captured images were transformed to control, the cells were plated in complete growth media in 60 mm dishes data with Feature Extraction Software, version 10.7 (Agilent) and then (7 Â 105 cells per dish). From the next day and for 3 days, the number of imported into Agilent CGH Analytics 5.0.14 software for analysis. total cells in each dish was counted. Three independent experiments were performed. Data analysis The analysis and visualisation of the array CGH-2 Â 400 K data were DUS4L–BCAP29 Cloning performed using Agilent CGH Analytics 5.0.14 software. A comprehensive Human DUS4L–BCAP29 from a cDNA pool of SNU-16 cells was amplified description of the statistical algorithms is available in the user’s manual and cloned into diverse mammalian expression vectors by using PCR provided by Agilent Technologies. In brief, the Aberration Detection methods. Forward primer was 50-CCGCTCGAGCTAAGCTTATGAAGAGT- Method 2 quality-weighted interval score algorithm identifies aberrant GACTGCATGCAAACGACAATA-30 and reverse primer was 50-CGGAATT- intervals in samples that have consistently high or low log ratios, based on CAAGCTTTCACAGTCTTTTCTTGTTG CCTCTTTCTAA -30. The cDNA was their statistical score. The score represents the deviation of the weighted inserted into XhoI/EcoRI cloning sites of pEGFP-C1 (pEGFP-DUS4L–BCAP29). average of normalised log ratios from its expected value of zero calculated with Derivative Log2 Ratio Standard Deviation algorithm. A Fuzzy Zero algorithm is applied to incorporate quality information about each probe BrdU incorporation and cell growth analysis measurement. The threshold settings that we used for the CGH analytics BrdU incorporation analysis of confluent cells under different experimental software to make a positive call were 6.0 for sensitivity, 0.25 for minimum conditions and cell growth analysis by viable cell counting were performed absolute average log ratio per region and 3 consecutive probes with the as described previously.17 same polarity for the minimum number of probes per region.

Soft agar assay Fluorescence in situ hybridisation FISH and its analysis were carried out by Macrogen Inc. with specific BAC Cells without or with DUS4L–BCAP29 expression were processed for soft clones, which were also from Macrogen Inc. The probes were labelled with agar assay as described previously.19 The plates were then incubated for 14 two different fluorochromes, FITC (green) and Cy3 (red). Detailed probe days in a 5% CO incubator at 37 1C, with replenishment of medium 2 information is given in Supplementary Table S4. containing 200 mg/ml G418 and 10% FBS every other day. Colonies were imaged using a digital camera–equipped microscope. CONFLICT OF INTEREST Cell cycle analysis The authors declare no conflict of interest. After incubation with the second treatment of siRNAs for 24 h, the cells were centrifuged at 1500 r.p.m. for 5 min, fixed in 70% alcohol and stored at À 20 1C. Ten microliters of RNAse (100 mg/ml) was added to the samples; ACKNOWLEDGEMENTS subsequently, the samples were incubated at 37 1C for 10 min and then This study was partially funded by the Korean Healthcare 21 and Technology R&D treated with propidium iodide, after which the DNA content of the cells project, Ministry for Health, Welfare & Family Affairs, Republic of Korea (Grant No. (10 000 cells per experimental group) was determined using an A091081), and by the Priority Research Centers Program through the National FACSCalibur flow cytometer (Becton Dickinson Biosciences, San Jose, CA, USA) with a ModFit LT program (Verity Software House, Inc., Topsham, ME, Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology, Republic of Korea (2009-0093820). USA).

Western blotting and antibodies REFERENCES Cultured cells were washed with ice-cold phosphate-buffered saline and 1 Mitelman F, Johansson B, Mertens F. The impact of translocations and gene lysed with lysis buffer (50 mM Tris-HCl, pH 7.5, 1% NP-40, 0.1% sodium fusions on cancer causation. Nat Rev Cancer 2007; 7: 233–245. deoxycholate, 150 mM NaCl, 50 mM NaF, 1 mM sodium pyrophosphate, 1 mM 2 Savage DG, Antman KH. Imatinib mesylate--a new oral targeted therapy. N Engl J EDTA and protease/phosphatase inhibitors). Twenty micrograms of total Med 2002; 346: 683–693. protein was resolved using SDS–PAGE. The resolved proteins were then 3 Soda M, Choi YL, Enomoto M, Takada S, Yamashita Y, Ishikawa S et al. Identifi- transferred to nitrocellulose membranes. After blocking with 1% skim milk cation of the transforming EML4-ALK fusion gene in non-small-cell lung cancer. and 1% BSA/Tris-buffered saline with Tween 20 (TBST), the membranes Nature 2007; 448: 561–566. were incubated with primary antibodies at 4 1C overnight. Antibodies 4 Kwak EL, Bang YJ, Camidge DR, Shaw AT, Solomon B, Maki RG et al. Anaplastic against cyclin D, p-AKT (pS-473), AKT, p-ERK (p44/42), ERK, PARP, caspase-7, lymphoma kinase inhibition in non-small-cell lung cancer. N Engl J Med 2010; 363: caspase-3, Bcl-2 and Bim were purchased from Cell Signaling Technology 1693–1703. (Beverley, MA, USA). Antibodies against cyclin E, actin and Bcl-XL were from 5 Robinson DR, Kalyana-Sundaram S, Wu YM, Shankar S, Cao X, Ateeq B et al. Santa Cruz Biotechnology (Santa Cruz, CA, USA). An anti-a-tubulin antibody Functionally recurrent rearrangements of the MAST kinase and Notch gene was acquired from Sigma-Aldrich (St Louis, MO, USA). families in breast cancer. Nat Med 2011; 17: 1646–1651.

Oncogene (2014) 5434 – 5441 & 2014 Macmillan Publishers Limited Novel fusion transcripts in gastric cancer H-P Kim et al 5441 6 Edgren H, Murumagi A, Kangaspeska S, Nicorici D, Hongisto V, Kleivi K et al. 22 Li H, Wang J, Ma X, Sklar J. Gene fusions and RNA trans-splicing in normal and Identification of fusion genes in breast cancer by paired-end RNA-sequencing. neoplastic human cells. Cell Cycle 2009; 8: 218–222. Genome Biol 2011; 12: R6. 23 Underiner TL, Herbertz T, Miknyoczki SJ. Discovery of small molecule c-Met 7 Pflueger D, Terry S, Sboner A, Habegger L, Esgueva R, Lin PC et al. Discovery of inhibitors: Evolution and profiles of clinical candidates. Anticancer Agents Med non-ETS gene fusions in human prostate cancer using next-generation RNA Chem 2010; 10: 7–27. sequencing. Genome Res 2011; 21: 56–67. 24 Malumbres M, Pevarello P, Barbacid M, Bischoff JR. CDK inhibitors in cancer 8 Kannan K, Wang L, Wang J, Ittmann MM, Li W, Yen L. Recurrent chimeric RNAs therapy: what is next? Trends Pharmacol Sci 2008; 29: 16–21. enriched in human prostate cancer identified by deep sequencing. Proc Natl Acad 25 Peters S, Adjei AA. MET: a promising anticancer therapeutic target. Nat Rev Clin Sci USA 2011; 108: 9172–9177. Oncol 2012; 9: 314–326. 9 Palanisamy N, Ateeq B, Kalyana-Sundaram S, Pflueger D, Ramnarayanan K, 26 Smolen GA, Sordella R, Muir B, Mohapatra G, Barmettler A, Archibald H et al. Shankar S et al. Rearrangements of the RAF kinase pathway in prostate cancer, Amplification of MET may identify a subset of cancers with extreme sensitivity to gastric cancer and melanoma. Nat Med 2010; 16: 793–798. the selective tyrosine kinase inhibitor PHA-665752. Proc Natl Acad Sci USA 2006; 10 Jemal A, Bray F, Center MM, Ferlay J, Ward E, Forman D. Global cancer statistics. 103: 2316–2321. CA Cancer J Clin 2011; 61: 69–90. 27 Fushida S, Yonemura Y, Urano T, Yamaguchi A, Miyazaki I, Nakamura T et al. 11 Jung KW, Won YJ, Kong HJ, Oh CM, Seo HG, Lee JS. Prediction of cancer incidence Expression of hepatocyte growth factor(hgf) and C-met gene in human gastric- and mortality in Korea, 2013. Cancer Res Treatment 2013; 45: 15–21. cancer cell-lines. Int J Oncol 1993; 3: 1067–1070. 12 Tao J, Deng NT, Ramnarayanan K, Huang B, Oh HK, Leong SH et al. CD44-SLC1A2 28 Katoh Y, Katoh M. FGFR2-related pathogenesis and FGFR2-targeted therapeutics gene fusions in gastric cancer. Sci Transl Med 2011; 3: 77ra30. (Review). Int J Mol Med 2009; 23: 307–311. 13 Zhang Y, Gong M, Yuan H, Park HG, Frierson HF, Li H. Chimeric transcript gen- 29 Gozgit JM, Wong MJ, Moran L, Wardwell S, Mohemmad QK, Narasimhan NI et al. erated by cis-splicing of adjacent genes regulates prostate cancer cell prolifera- Ponatinib (AP24534), a multitargeted pan-FGFR inhibitor with activity in tion. Cancer Discov 2012; 2: 598–607. multiple FGFR-amplified or mutated cancer models. Mol Cancer Ther 2012; 11: 14 Sboner A, Habegger L, Pflueger D, Terry S, Chen DZ, Rozowsky JS et al. FusionSeq: 690–699. a modular framework for finding gene fusions by analyzing paired-end RNA- 30 Huppi K, Volfovsky N, Runfola T, Jones TL, Mackiewicz M, Martin SE et al. The sequencing data. Genome Biol 2010; 11: R104. identification of microRNAs in a genomically unstable region of human chro- 15 Berger MF, Levin JZ, Vijayendran K, Sivachenko A, Adiconis X, Maguire J et al. mosome 8q24. Mol Cancer Res 2008; 6: 212–221. Integrative analysis of the melanoma transcriptome. Genome Res 2010; 20: 413–427. 31 Kawamata N, Zhang L, Ogawa S, Nannya Y, Dashti A, Lu D et al. Double minute 16 Sigrist CJ, de Castro E, Cerutti L, Cuche BA, Hulo N, Bridge A et al. New and chromosomes containing MYB gene and NUP214-ABL1 fusion gene in T-cell continuing developments at PROSITE. Nucleic Acids Res 2013; 41: D344–D347. leukemia detected by single nucleotide polymorphism DNA microarray and 17 Letunic I, Doerks T, Bork P. SMART 7: recent updates to the protein domain fluorescence in situ hybridization. Leuk Res 2009; 33: 569–571. annotation resource. Nucleic Acids Res 2012; 40: D302–D305. 32 Forment JV, Kaidi A, Jackson SP. Chromothripsis and cancer: causes and con- 18 Zang ZJ, Ong CK, Cutcutache I, Yu W, Zhang SL, Huang D et al. Genetic and sequences of chromosome shattering. Nat Rev Cancer 2012; 12: 663–670. structural variation in the gastric cancer kinome revealed through targeted deep 33 Wu TD, Nacu S. Fast and SNP-tolerant detection of complex variants and splicing sequencing. Cancer Res 2011; 71: 29–39. in short reads. Bioinformatics 2010; 26: 873–881. 19 Nagoshi H, Taki T, Hanamura I, Nitta M, Otsuki T, Nishida K et al. Frequent PVT1 34 Ju YS, Lee WC, Shin JY, Lee S, Bleazard T, Won JK et al. A transforming KIF5B and rearrangement and novel chimeric genes PVT1-NBEA and PVT1-WWOX occur in RET gene fusion in lung adenocarcinoma revealed from whole-genome and multiple myeloma with 8q24 abnormality. Cancer Res 2012; 72: 4954–4962. transcriptome sequencing. Genome Res 2012; 22: 436–445. 20 Ku JL, Park JG. Biology of SNU cell lines. Cancer Res Treatment 2005; 37: 1–19. 35 Seo JS, Ju YS, Lee WC, Shin JY, Lee JK, Bleazard T et al. The transcriptional 21 Li H, Wang J, Mor G, Sklar J. A neoplastic gene fusion mimics trans-splicing of landscape and mutational profile of lung adenocarcinoma. Genome Res 2012; 22: RNAs in normal human cells. Science 2008; 321: 1357–1361. 2109–2119.

Supplementary Information accompanies this paper on the Oncogene website (http://www.nature.com/onc)

& 2014 Macmillan Publishers Limited Oncogene (2014) 5434 – 5441