Genetics: Early Online, published on December 26, 2017 as 10.1534/genetics.117.300552

The hidden genomic and transcriptomic plasticity of giant marker chromosomes in cancer

Gemma Macchia (1)*, Marco Severgnini (2), Stefania Purgato (3), Doron Tolomeo (1), Hilen Casciaro (1), Ingrid Cifola (2), Alberto L’Abbate (1), Anna Loverro (1), Orazio Palumbo (4), Massimo Carella (4), Laurence Bianchini (5), Giovanni Perini (3), Gianluca De Bellis (2), Fredrik Mertens (6), Mariano Rocchi (1), Clelia Tiziana Storlazzi (1)#.

(1) Department of Biology, University of Bari “Aldo Moro”, Bari, Italy;
(2) Institute for Biomedical Technologies (ITB), CNR, Segrate, Italy;
(3) Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy;
(4) Laboratorio di Genetica Medica, IRCCS Casa Sollievo della Sofferenza, San Giovanni Rotondo, Italy;
(5) Laboratory of solid tumor genetics, Université Côte d'Azur, CNRS, IRCAN, Nice, France;
(6) Department of Clinical Genetics, University and Regional Laboratories, Lund University, Lund, Sweden.

EMBL-EBI ArrayExpress database: E-MTAB-5625
NCBI Short Read Archive: PRJNA378952
GenBank repository: KY966261-KY966313 and KY966314-KY966332

Running Title: Neocentromeres and chimeric transcripts in cancer

Copyright 2017.

Keywords: neocentromere, fusion transcript, WDLPS, LSC, gene amplification

* Corresponding author:
Gemma Macchia, Department of Biology, University of Bari, Via Orabona no.4, 70125 Bari (Italy)
Email: [email protected]
Tel No: +39 0805443582
Fax: +39 0805443386

ABSTRACT

Genome amplification in the form of rings or giant rod-shaped marker chromosomes is a common genetic alteration in soft tissue tumours. The mitotic stability of these structures is often rescued by perfectly functioning analphoid neocentromeres, which therefore significantly contribute to cancer progression.
Here, we disentangled the genomic architecture of many neocentromeres stabilizing marker chromosomes in well-differentiated liposarcoma and lung sarcomatoid carcinoma samples. In cells carrying heavily rearranged ring and giant rod-shaped marker chromosomes (RGMs), these structures were assembled as patchworks of multiple short amplified sequences, disclosing an extremely high level of complexity and definitely ruling out the existence of regions prone to neocentromere seeding. Moreover, by studying two well-differentiated liposarcoma samples derived from the onset and the recurrence of the same tumour, we documented an expansion of the neocentromeric domain that occurred during tumour progression, which reflects a strong selective pressure acting toward the improvement of neocentromeric functionality in cancer. In lung sarcomatoid carcinoma cells, extensive “centromere sliding” phenomena, giving rise to multiple, closely mapping neocentromeric epialleles on separate co-existing markers, occur, likely due to the instability of neocentromeres arising in cancer cells. Finally, by investigating the transcriptional activity of neocentromeres, we came across a burst of chimeric transcripts, generated both by extremely complex genomic rearrangements and by cis/trans-splicing events. Post-transcriptional editing events have been reported to expand and variegate the genetic repertoire of higher eukaryotes, so they might have a determining role in cancer. The increased incidence of fusion transcripts might act as a driving force for the genomic amplification process, together with the increased transcription of oncogenes.

INTRODUCTION

Genome amplification is a frequent genetic alteration in cancer, with variable cytogenetic manifestations including double minutes, homogeneously staining regions and/or ring and giant rod-shaped marker chromosomes (RGM) (MATSUI et al. 2013; L'ABBATE et al. 2014; NORD et al. 2014).
While double minutes and homogeneously staining regions have been described in a variety of cancer types (MATSUI et al. 2013), RGMs are particularly common in soft tissue tumours, notably in well-differentiated liposarcomas (WDLPS), and have been shown to contain amplified sequences from several chromosomes (NORD et al. 2014). During tumour progression, the ring chromosomes are frequently broken and resealed or transformed into rod-shaped markers capturing the telomeres from other chromosomes (NORD et al. 2014). This instability results in a highly complex internal structure of these markers, as well as in extensive heterogeneity with respect to size and number per cell (GARSED et al. 2014; NORD et al. 2014). RGMs frequently lack functional centromeric alphoid sequences and their mitotic stability is rescued by the emergence of perfectly functioning analphoid neocentromeres, which might indirectly contribute to cancer progression (MACCHIA et al. 2015). Nonetheless, there are few studies addressing neocentromeres in cancer, probably because most of the technologies employed to study tumour genotypes are unable to unveil them. The occurrence of neocentromeres in cancer, therefore, could be more frequent than reported. Similarly, very little is known about the impact of neocentromeres on transcription, although centromeric satellite regions have been reported to produce non-coding transcripts actively involved in centromere assembly (CHAN et al. 2012; ROSIC et al. 2014; QUENET AND DALAL 2015; MCNULTY et al. 2017). Also, genes within neocentromeres are still actively transcribed (AMOR AND CHOO 2002; WONG et al. 2006). In line with these notions, the occurrence of neocentromeres in colon cancer cell lines was reported to correlate with large DNase I hypersensitive sites, which are usually sites of active transcription or high nucleosome turnover (ATHWAL et al. 2015).
By combining chromatin immunoprecipitation (IP) deep sequencing (ChIP-seq), whole genome sequencing (WGS), immuno-fluorescence in situ hybridisation (immuno-FISH), whole transcriptome sequencing (total RNA-seq) and other molecular analyses, we investigated in detail the genomic architecture of neocentromeres arising on RGMs, as well as their contribution to transcription, in the lung sarcomatoid carcinoma (LSC) cell line 04T036 and in the three liposarcoma cell lines 93T449, 94T778 and 95T1000. Overall, our study uncovered the complex organization of neocentromeres in cancer and shed light on the extraordinarily high genomic and transcriptomic plasticity associated with RGMs in solid tumours.

MATERIALS AND METHODS

Tumour cell lines

Four tumour cell lines (04T036, 93T449, 94T778, and 95T1000), kindly provided by the Centre Hospitalier Universitaire de Nice (France), were included in the study. 04T036 was established from the LSC of a 50-year-old man. Cytogenetic and multicolour FISH analyses showed a near-triploid karyotype with numerous structural aberrations and four to six small RGMs containing chromosome 9 amplified sequences, and two RGMs containing chromosome 3 amplified sequences (ITALIANO et al. 2006). The 93T449 and 94T778 cell lines were obtained from a primary retroperitoneal WDLPS at onset and at relapse, respectively. These commercial cell lines showed complex karyotypes with multiple RGMs at G-banding and multicolour FISH analysis, and a clear difference in the overall chromosome arrangement between them (SIRVENT et al. 2000; GARSED et al. 2014). The 95T1000 cell line was generated from a WDLPS relapse; SKY analysis revealed a hypertriploid karyotype with multiple chromosomal structural abnormalities (PEDEUTOUR et al. 2012). All cells retained a giant marker chromosome, previously identified in the primary cell cultures.
This giant chromosome contained high-level amplification of chromosomal regions deriving from 10p and 12q and lacked alpha-satellite DNA (PEDEUTOUR et al. 2012).

SNP array data

All cell lines were analysed using the Affymetrix Genome-Wide Human SNP Array 6.0 platform (Affymetrix, Santa Clara, CA, USA), as described (STORLAZZI et al. 2010).

Whole Genome Sequencing

WGS was carried out to disentangle the genomic architecture of RGMs holding neocentromeres. Library preparations were performed using the TruSeq DNA Nano 350 bp protocol (Illumina, San Diego, CA, USA). The sequencing data were acquired using the Illumina Xten at the NYGC (New York, US), in a paired-end 150-cycle run (mean coverage 40× per sample). Reads were aligned to the human reference genome (GRCh37/hg19) using BWA-MEM (v.0.7.12) [http://bio-bwa.sourceforge.net/, (LI AND DURBIN 2009)] and PCR duplicates were removed using Picard (v.1.119) (http://picard.sourceforge.net/). Candidate structural variations (SVs) were identified using Delly (v. 0.5.9) and Crest (v. 1.0) with default parameters (WANG et al. 2011; RAUSCH et al. 2012). Copy number analysis was performed using BIC-seq 0.7alpha (XI et al. 2011); genomic intervals showing a log2 copyRatio > 0.5 were considered amplified, and those showing a log2 copyRatio > 2.5 were considered highly amplified.

ChIP-sequencing

To determine the internal structure of the neocentromeres, native ChIP-seq was performed as described (WADE et al. 2009). Immunoprecipitation was run using a polyclonal antibody against CENP-A (TRAZZI et al. 2009). Both input and IP DNA fragments were purified and processed using the TruSeq ChIP Library Preparation Kit (Illumina) and sequenced on the Illumina HiSeq 2500 at the IGA Technology Services facility (Udine, Italy) (single-end
