Edinburgh Research Explorer

Interspecies conservation of organisation and function between nonhomologous regional

Citation for published version: Tong, P, Pidoux, A, Toda, NRT, Ard, R, Berger, H, Shukla, M, Torres-Garcia, J, Mueller, CA, Nieduszynski, CA & Allshire, RC 2019, 'Interspecies conservation of organisation and function between nonhomologous regional centromeres', Nature Communications, vol. 10, 2343. https://doi.org/10.1038/s41467-019-09824-4

Digital Object Identifier (DOI): 10.1038/s41467-019-09824-4

Link: Link to publication record in Edinburgh Research Explorer

Document Version: Publisher's PDF, also known as Version of record

Published In: Nature Communications

General rights Copyright for the publications made accessible via the Edinburgh Research Explorer is retained by the author(s) and / or other copyright owners and it is a condition of accessing these publications that users recognise and abide by the legal requirements associated with these rights.

Take down policy The University of Edinburgh has made every reasonable effort to ensure that Edinburgh Research Explorer content complies with UK legislation. If you believe that the public display of this file breaches copyright please contact [email protected] providing details, and we will remove access to the work immediately and investigate your claim.

Download date: 24. Sep. 2021 ARTICLE

https://doi.org/10.1038/s41467-019-09824-4 OPEN Interspecies conservation of organisation and function between nonhomologous regional centromeres

Pin Tong1,6, Alison L. Pidoux1,6, Nicholas R.T. Toda1,3, Ryan Ard 1,4, Harald Berger1,5, Manu Shukla1, Jesus Torres-Garcia 1, Carolin A. Müller 2, Conrad A. Nieduszynski 2 & Robin C. Allshire 1

1234567890():,; Despite the conserved essential function of centromeres, centromeric DNA itself is not conserved. The histone-H3 variant, CENP-A, is the epigenetic mark that specifies identity. Paradoxically, CENP-A normally assembles on particular sequences at specific genomic locations. To gain insight into the specification of complex centromeres, here we take an evolutionary approach, fully assembling genomes and centromeres of related fission yeasts. Centromere domain organization, but not sequence, is conserved between Schizo- saccharomyces pombe, S. octosporus and S. cryophilus with a central CENP-ACnp1 domain flanked by heterochromatic outer-repeat regions. Conserved syntenic clusters of tRNA genes and 5S rRNA genes occur across the centromeres of S. octosporus and S. cryophilus, suggesting conserved function. Interestingly, nonhomologous centromere central-core sequences from S. octosporus and S. cryophilus are recognized in S. pombe, resulting in cross-species estab- lishment of CENP-ACnp1 and functional . Therefore, despite the lack of sequence conservation, Schizosaccharomyces centromere DNA possesses intrinsic conserved properties that promote assembly of CENP-A chromatin.

1 Wellcome Centre for Cell Biology and Institute of Cell Biology, School of Biological Sciences, The University of Edinburgh, Mayfield Road, Edinburgh EH9 3BF, UK. 2 Sir William Dunn School of Pathology, University of Oxford, South Parks Road, Oxford OX1 3RE, UK. 3Present address: UPMC CNRS, Roscoff Marine Station, Place Georges Teissier, 29680 Roscoff, France. 4Present address: Copenhagen Plant Science Centre, University of Copenhagen, Bülowsvej 34, 1870 Frederiksberg C, Denmark. 5Present address: Symbiocyte, Universität für Bodenkultur Wien, University of Natural Resources and Life Sciences, 1180 Vienna, Austria. 6These authors contributed equally: Pin Tong, Alison L. Pidoux. Correspondence and requests for materials should be addressed to A.L.P. (email: [email protected]) or to R.C.A. (email: [email protected])

NATURE COMMUNICATIONS | (2019) 10:2343 | https://doi.org/10.1038/s41467-019-09824-4 | www.nature.com/naturecommunications 1 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-09824-4

entromeres are the chromosomal regions upon which arm-sized assemblies and comparative analysis kinetochores assemble to mediate accurate chromosome identified translocations and inversion events that occurred C fi 22 segregation. Evidence suggests that both genetic and epi- during divergence of ssion yeast species . Synteny is preserved genetic influences define centromere identity1–9. Neocentromere adjacent to centromeres (Fig. 1b). Circos plots indicate a chro- formation at new locations lacking homology to centromeres10 mosome arm translocation occurred within two ancestral cen- and the inactivation of one centromere of a dicentric chromo- tromeres to generate S. cryophilus cen2 (S.cry-cen2) and S.cry-cen3 some despite it retaining centromeric sequences11 indicate that relative to S. octosporus and S. pombe (Fig. 1b). Despite centromere sequences are neither necessary nor sufficient for centromere-adjacent synteny, Schizosaccharomyces centromeres centromere assembly1,7,9. CENP-A is found at all active cen- lack detectable sequence homology (see below). All centromeres tromeres and is the epigenetic mark that specifies centromere contain a central domain: central core (cnt) surrounded by identity1,7,9. Artificial tethering of CENP-A or CENP-A loading inverted repeat (imr) elements unique to each centromere (Fig. 2, factors at non-centromeric locations on metazoan Supplementary Fig. 4, Supplementary Tables 1,2, Supplementary is sufficient to trigger assembly5,12. Thus, it is spe- Data 3,4). CENP-ACnp1 localizes to fission yeast centromeres cialized chromatin rather than primary sequences of centromeric (Fig. 2a) and chromatin immunoprecipitation sequencing (ChIP- DNA that determines where kinetochores and hence functional Seq) indicates that central domains are assembled in CENP- centromeres are assembled. On the contrary, however, CENP-A ACnp1 chromatin, flanked by various outer-repeat elements is generally found on particular sequences in any given assembled in H3K9me2 (Fig. 2b, c). Despite the organism1,2,4 and naked repetitive centromere DNA such as lack of sequence conservation, S. octosporus and S. cryophilus alpha-satellite DNA can provide a substrate for the de novo centromere organization is strongly conserved with that of S. assembly of functional centromeres when introduced into human pombe, having CENP-ACnp1-assembled central domains sepa- cells1–4. These observations suggest that, despite the lack of rated by clusters of tRNA genes from outer repeats assembled in conservation between species, centromere sequences possess heterochromatin13,14 (Supplementary Fig. 5, Supplementary properties that make them attractive for assembly of CENP-A Table 3, Supplementary Data 5). In contrast, our analyses of chromatin. partially assembled, transposon-rich centromeres of Schizo- Schizosaccharomyces pombe, a paradigm for dissecting complex saccharomyces japonicus reveals the presence of heterochromatin regional centromere function, has demarcated centromeres on all classes of retrotransposons and CENP-ACnp1 on only two (35–110 kb) with a central domain assembled in CENP-ACnp1 classes (Tj6 and Tj7; Supplementary Fig. 6, Supplementary chromatin, flanked by outer-repeat elements assembled in RNA Table 4)20. interference-dependent heterochromatin, in which histone- H3 is methylated on lysine-9 (H3K9)13–16. Heterochromatin is required for establishment but not maintenance of CENP-ACnp1 Syntenic clusters of tRNA genes at centromeres. Numerous 5S chromatin6,17. We have proposed that it is not the sequence rRNA genes (5S rDNAs) are located in the heterochromatic per se of S. pombe central-core that is key in its ability to establish outer repeats of S. octosporus and S. cryophilus centromeres CENP-A chromatin but the properties programmed by it18,19. (but not S. pombe) (Fig. 1a, Supplementary Data 6, 7). Almost all To investigate whether these properties are conserved, here (25/26; 20/20) are within Five-S-Associated Repeats (FSARs; we completely assemble the sequence across centromeres of 0.6–4.2 kb) (Fig. 3a), encompassing ~35% of outer-repeat other Schizosaccharomyces species and test their cross-species regions. FSARs exhibit 90% intra-class homology (Supplemen- functionality. We show that although Schizosaccharomyces tary Table 5) but no interspecies homology. The three types centromeres are not conserved in sequence, those of Schizo- of FSAR repeats almost always occur together, in the same order saccharomyces octosporus and Schizosaccharomyces cryophilus and orientation, but vary in copy number: S. octosporus: (oFSAR- share with S. pombe a conserved organization of a central domain 1)1(oFSAR-2)1–9(oFSAR-3)1; S. cryophilus: (cFSAR-1)1-3(cFSAR- Cnp1 assembled in CENP-A chromatin, flanked by outer repeats 2)1-2(cFSAR-3)1. Both sides of S. octosporus and S. cryophilus assembled in heterochromatin. Syntenic clusters of tRNA and centromeres contain at least one FSAR-1-2-3 array, except the 5S-rRNA genes are present across S. octosporus and S. cryophilus right side of S.cry-cen2 with two lone cFSAR-3 elements (Fig. 3a, centromeres, further emphasizing their conserved organization. Supplementary Fig. 4). S. cryophilus cFSAR-2 and cFSAR-3 By introducing minichromosomes bearing central domain repeats share ~400 bp homology (88% identity), constituting sequences from S. octosporus and S. cryophilus into S. pombe, hsp16 heat-shock protein open reading frames (ORFs) (Fig. 3a, b, we demonstrate that these nonhomologous centromere sequences Supplementary Data 8) that are intact, implying functionality, can be recognized between divergent species, allowing the selection and expression in some situations. Phylogenetic gene establishment of CENP-ACnp1 chromatin and functional cen- trees indicate that cFSAR-3-hsp16 genes are more closely related tromeres. These observations indicate that centromere DNA with each other than with those in subtelomeric regions or possesses conserved properties that promote the establishment cFSAR-2s (Fig. 3b), consistent with repeat homogenization23–25. of CENP-A chromatin. cFSAR-1s contain an eroded ORF with homology to a small hypothetical protein and S. octosporus oFSAR-2s contain a region of homology with a family of membrane proteins (Fig. 3a). The Results functions of centromere-associated hsp16 genes and other ORF- Conserved organization of fission yeast centromeres. Long-read homologous regions remain to be explored. (PacBio) sequencing permitted complete assembly of the gen- S. cryophilus heterochromatic outer repeats contain additional omes across centromeres of S. octosporus (11.9 Mb) and S. cryo- repetitive elements, including a 6.2 kb element (cTAR-14) with philus (12.0 Mb), extending genome sequences20 to telomeric or homology to the retrotransposon Tcry1 and transposon remnants subtelomeric repeats, or rDNA arrays (Supplementary Figs. 1–3, at the mating-type locus20 (Figs. 1a, 2b, Supplementary Fig. 4, Supplementary Data 1, 2). Consistent with their closer evolu- Supplementary Tables 1,6 and Supplementary Data 3). Tcry1 is tionary relationship20–22, S. octosporus and S. cryophilus (32 My located in the chrIII-R subtelomeric region (Supplementary separation, compared with 119 My separation from S. pombe) Figs. 3, 4 and Supplementary Data 1). Although no retro- exhibit greatest synteny (Fig. 1a), in agreement with a recent transposons have been identified in S. octosporus, remnants are report in which joining of S. cryophilus supercontigs20 into present in the mating-type locus and oTAR-14ex in S.oct-cen3

2 NATURE COMMUNICATIONS | (2019) 10:2343 | https://doi.org/10.1038/s41467-019-09824-4 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-09824-4 ARTICLE

a

b

S. octosporus vs. S. pombe S. cryophilus vs. S. pombe S. octosporus vs. S. cryophilus

Fig. 1 Genome organization and synteny in Schizosaccharomyces. a Circos plots depicting pairwise S. pombe, S. octosporus and S. cryophilus genome synteny. Rings from outside to inside represent the following: chromosomes; GC content (high: red, low: yellow); 5S rDNAs (red); tRNA genes (black); LTRs (green); CENP-ACnp1 ChIP-seq (purple); H3K9me2 ChIP-seq (orange); innermost ring and coloured connectors indicate regions of synteny between species. S. pombe chromosomes are indicated by blue (S.pom-chr1), green (S.pom-chr2), red (S.pom-chr3) in the left and right panels, and regions of synteny on S. octosporus and S. cryophilus chromosomes, respectively, are indicated in corresponding colours. A similar designation is used for S. octosporus chromosomes in the middle panel. b Circos plot isolating regions adjacent to centromeres highlighting preserved synteny and an intra-centromeric chromosome arm swap involving S. cryophilus cen2 and cen3 relative to S. pombe and S. octosporus. Source data available: GEO: GSE112454 outer repeats (Fig. 2c, Supplementary Figs. 1, 4, Supplementary tRNA genes and LTRs are thus likely to act as chromatin Tables 2, 7 and Supplementary Data 4). Hence, transposon boundaries at fission yeast centromeres. remnants, FSARs and other repeats are assembled in hetero- A high proportion (~32%) of tRNA genes in S. pombe, chromatin at S. octosporus and S. cryophilus centromeres, and S. octosporus and S. cryophilus genomes are located within potentially mediate heterochromatin nucleation. centromere regions33 (Figs. 1a, 3c, Supplementary Table 8 and tRNA gene clusters occur at transitions between CENP-A and Supplementary Data 9, 10). Centromeric tRNA genes are intact heterochromatin domains in two of three centromeres in and are conserved in sequence with their genome-wide counter- S. octosporus (S.oct-cen2, S.oct-cen3) and S. cryophilus (S.cry- parts, indicating that they are functional genes. Two major, cen1, S.cry-cen2), and are associated with low levels of both conserved tRNA gene clusters reside exclusively within H3K9me2 and CENP-ACnp1 (Fig. 2b, c), suggesting that they may S. octosporus and S. cryophilus centromeres (p-value < 0.00001; act as boundaries, as in S. pombe26–28. No tRNA genes demarcate q-value < 0.05) (Fig. 3c, d). Cluster 1 comprises several subclusters the CENP-A/heterochromatin transition at S.cry-cen3. Instead, of 2–3 tRNA genes in various combinations of up to 8 tRNA this transition coincides precisely with 270 bp LTRs (Fig. 2b, genes, whereas Cluster2 contains up to 5 tRNA genes (Fig. 3d); 17 Supplementary Tables 1, 6 and Supplementary Data 3), which different tRNA genes (14 amino acids) are represented, none may also act as boundaries29–31. Similar to tRNA genes, LTRs of which are unique to centromeres (Fig. 3c). Intriguingly, have been shown to be regions of low nucleosome occupancy, the order and orientation of tRNA genes within clusters is which may counter spreading of heterochromatin31,32. The conserved between species, but intervening sequence is not transition between CENP-ACnp1 and heterochromatin is poorly (Fig. 3d, e). Strikingly, as well as local tRNA gene cluster demarcated at S.oct-cen1 compared with other centromeres. conservation, inspection of centromere maps reveals synteny of This region lacks tRNA genes and, as only retrotransposon tRNA genes and clusters across large portions of S. octosporus remnants are detectable in S. octosporus, the sequence of putative and S. cryophilus centromeres. For example, the tRNA gene order LTRs is unknown. It is possible that the long inverted imr repeats AIR-RKL-E-T-T-L-DVAIR-RKLEF-A-DV (single-letter code) is comprise a gradual transition zone at this centromere. tRNA gene observed at S.oct-cen1 and S.cry-cen3 (Supplementary Fig. 7). This clusters also occur near the extremities of all centromeres in both synteny, together with both possessing small central cores and species, separating heterochromatin from adjacent euchromatin. long imrs, suggests that these two centromeres are ancestrally

NATURE COMMUNICATIONS | (2019) 10:2343 | https://doi.org/10.1038/s41467-019-09824-4 | www.nature.com/naturecommunications 3 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-09824-4

a

DNA CENP-ACnp1 S. pombe S. octosporus S. cryophilus S. japonicus

b 1400 kb 1410 kb 1420 kb 1430 kb 1440 kb 1450 kb 1460 kb 1470 kb 1480 kb

[0–90] CENP-ACnp1 S.cry-cen1

[0–30] H3K9me2

RIAVD A5S5S 5S VD E LK KLEF 5S 5S5S 5S 5S 5S R PRIAVD LE DVA 2730 kb 2740 kb 2750 kb 2760 kb 2770 kb 2780 kb 2790 kb 2800 kb [0–90] S.cry-cen2

[0–30]

FELKR RIAVD5S 5S 5S NM E DVAIRKE EKRIAVD E MN5S RIAVD DVAIE5S E

920 kb 930 kb 940 kb 950 kb 960 kb 970 kb 980 kb 990 kb [0–90] S.cry-cen3

[0–30]

E L DVAIR 5S5S 5S RKL ETTLLTR LTR DVAIR RKLEF 5S 5S 5S ADV

c 3310 kb 3320 kb 3330 kb 3340 kb 3350 kb 3360 kb 3370 kb [0–90] S.oct-cen1

[0–30]

VD A5S5S VD AIR RKL E T T 5S5S LDVAIR RKLEF 5S 5S A DVAIE E 2570 kb 2580 kb 2590 kb 2600 kb 2610 kb 2620 kb 2630 kb 2640 kb 2650 kb

[0–90] S.oct-cen2

[0–30]

AVD AKL5S5S 5S 5S 5S 5S 5S 5S5S 5S FELKERPDVAIRDV 5S 5S 5S RIAVD EL 1780 kb 1790 kb 1800 kb 1810 kb 1820 kb 1830 kb 1840 kb 1850 kb 1860 kb

[0–90] S.oct-cen3

[0–30]

FELKR DVAIRNMEDV 5S 5S AIRKE EKRIA5S 5S 5S VD EMNRIAVD LE

related (Fig. 3f ). Similarly, at S.oct-cen3 and S.cry-cen2, tRNA each containing long (oCNT-L(6.4 kb); cCNT-L(6.0 kb)) and genes occur in the order NME-DV-AIRKE-EKRIA-VD-EMN- short (oCNT-S(1.2 kb); cCNT-S(1.3 kb)) species-specific repeats RIAVD, and at S.oct-cen2 and S.cry-cen1 the same tRNA genes (Fig. 3f, Supplementary Tables 1, 2, 9 and Supplementary Data 3, are present in the imr repeats and beyond (FELK-KL-E-DV). 4). CNT repeats are arranged head-to-tail at one centromere Central cores have similar sizes and structures in the two species, and head-to-head at the other centromere in each species.

4 NATURE COMMUNICATIONS | (2019) 10:2343 | https://doi.org/10.1038/s41467-019-09824-4 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-09824-4 ARTICLE

Fig. 2 Domain organization of Schizosaccharomyces centromeres. a Immunostaining of centromeres in indicated Schizosaccharomyces species with anti- CENP-ACnp1 antibody (green) and DNA staining (DAPI; red). Scale bar, 5 μm. b S. cryophilus centromere organization indicating DNA repeat elements. ChIP-seq profiles for CENP-ACnp1 (purple) and H3K9me2-heterochromatin (orange) are shown above each centromere. Positions of tRNA genes (single- letter code of cognate amino acid; black), 5S rDNAs (red) and solo LTRs are indicated (pink). Central cores (cnt—purples) innermost repeats (imr—blue shades). 5S-associated repeats (cFSARs—orange shades); tRNA gene-associated repeats (TARs) containing clusters of tRNA genes (green shades); heterochromatic repeats (cHR) and TARs associated with single tRNA genes (various colours: brown/pink/red). cTAR-14s, containing retrotransposon remnants (deep pink). For details, including individual repeat annotation, see Supplementary Fig. 4 and Supplementary Tables 3,4. c S. octosporus centromere organization indicating DNA repeat elements. Labelling and shading as in b. Only oTAR-14ex (pale pink part) contain retrotransposon remnants. Colouring is indicative of homology within each species but only of possible repeat equivalence (not homology) between species; see Supplementary Table 5,6,19. Source data available: GEO: GSE112454 and in Source Data file

Together, these similarities suggest ancestral relationships (Fig. 6b, c and Table 1). CENP-ACnp1 ChIP-quantitative PCR between S.oct-cen2 and S.cry-cen1, So-cen3 and Scry-cen2. (ChIP-qPCR) indicated that, for minichromosomes with estab- Further, in places where synteny appears to break down, patterns lished centromere function, CENP-ACnp1 chromatin was of tRNA gene clusters suggest specific centromeric rearrange- assembled on nonhomologous S.oct-cnt DNA, to levels similar ments occurred between the species. For instance, tRNA gene to those at endogenous S. pombe centromeres and to S.pom-cnt2 clusters at the edges of S.cry-cen2R and S.cry-cen3L are consistent on a minichromosome (Fig. 6d). Minichromosomes containing with an inter-centromere arm translocation relative to S.oct- S.oct-cnt provided efficient segregation function (Table 1), no cen1R and S.oct-cen2R, indicated by gene synteny maps (Figs. 1b, longer requiring CENP-ACnp1 overexpression to maintain that 4a and Supplementary Fig. 7). function once established (Fig. 6e), consistent with the self- propagating ability of CENP-A chromatin5,18. Minichromosomes Fission yeast centromeres show interspecies functionality.No containing S. cryophilus central-core regions (S.cry-cnt) were central-core sequence homology was revealed between species also able to establish functional centromeres and segregation using BLASTN. To identify potential underlying centromere function in S. pombe. These S.cry-cnt-bearing minichromosomes sequence features, k-mer frequencies (5-mers), normalized for assembled CENP-ACnp1 chromatin to high levels, similar to those centromeric AT-bias, were subjected to principal component at endogenous S. pombe centromeres (Supplementary Fig. 8). analysis (PCA). CENP-ACnp1-associated regions of S. pombe, Centromere function was not due to minichromosomes gaining S. octosporus and S. cryophilus genomes all group together, dis- portions of S. pombe central-core DNA (Supplementary Fig. 9). A tinct from the majority of non-centromere sequences (p-value, similar minichromosome bearing a region (retrotransposon Tj7) 9.3 × 10−7) (Fig. 4b, c). Interestingly, S. pombe neocentromere- highly enriched for CENP-ACnp1 in S. japonicus did not forming regions34 also cluster separately from other genomic convincingly form functional centromeres when introduced into regions, sharing sequence features with centromeres. Surprisingly, S. pombe or assemble CENP-ACnp1 chromatin to an appreciable taking GC content into account, the S. japonicus genome as extent (Supplementary Fig. 6). Thus, S. pombe, S. octosporus and a whole shows no significant difference in 5-mer frequency S. cryophilus centromeres share a similar organization, underlying compared with the other three fission yeast genomes. In contrast, sequence features and cross-species establishment of CENP- S. japonicus CENP-ACnp1-associated 5-mer frequencies show ACnp1 chromatin, whereas putative S. japonicus centromeres significant differences from its own wider genome sequence and appear not to share these attributes. Our analyses indicate that from centromere sequences of the other fission yeast species S.oct-cnt and S.cry-cnt DNAs are competent to establish CENP-A (Supplementary Fig. 6). chromatin and centromere function in S. pombe when CENP- K-mer analysis and conserved centromeric organization ACnp1 is overexpressed, suggesting that S. octosporus and prompted us to investigate cross-species functionality of protein S. cryophilus central-core DNA have intrinsic properties that and DNA components of Schizosaccharomyces centromeres. promote the establishment of CENP-A chromatin despite lacking green fluorescent protein (GFP)-tagged CENP-ACnp1 protein sequence homology. from each species localized to S. pombe centromeres and complemented the cnp1-1 mutant35 (Fig. 5a–c), indicating that Discussion heterologous CENP-A proteins assemble and function at Our analyses indicate that the centromeres of S. pombe, S. pombe centromeres, despite normally assembling on non- S. octosporus and S. cryophilus share a conserved organization of homologous sequences in their respective organisms. a CENP-ACnp1-assembled central-core flanked by outer repeats Introduction of S. pombe central-core (S.pom-cnt) DNA on assembled in H3K9me heterochromatin. Despite this conserva- minichromosomes into S. pombe results in the establishment and tion of organization, centromere sequence is not conserved, maintenance of CENP-ACnp1 chromatin if S.pom-cnt is adjacent although underlying sequence features are detectable by PCA of to heterochromatin, or if CENP-A is overexpressed6,17,18,36. S.oct- 5-mer frequencies. The cross-species functionality of S.oct-cnt cnt regions (3.2–10 kb) or S.pom-cnt2 (positive control) were and S.cry-cnt central-core DNA in S. pombe suggests that the placed adjacent to S. pombe outer-repeat DNA in minichromo- central-core regions of these three species are favoured substrates, some constructs (Fig. 6a), which were transformed into S. pombe sharing intrinsic properties that promote the establishment of cells overexpressing S. pombe GFP-CENP-ACnp1 (hi-CENP- CENP-ACnp1 chromatin, properties that S.jap-Tj7 may lack. ACnp1)18. Acquisition of centromere function is indicated by Although the nature of putative conserved CENP-A-promoting minichromosome retention on non-selective indicator plates properties is unknown, recent studies have revealed distinctive (white/pale pink colonies) and by the appearance of sectored characteristics of centromeric DNA. S. pombe central-core DNA colonies (Fig. 6b, c). The pHET-S.pom-cnt2 minichromosome has the innate property of driving high rates of histone-H3 containing S.pom-cnt2 established centromere function at high nucleosome turnover, causing low nucleosome occupancy19 and frequency immediately upon transformation in hi-CENP-ACnp1 may programme pervasive low-quality RNAPII transcription cells (Table 1). Centromere function was also established on to promote assembly of CENP-A chromatin18. These and other S.oct-cnt-containing minichromosomes in hi-CENP-ACnp1 cells properties, such as non-B form DNA37, may contribute to an

NATURE COMMUNICATIONS | (2019) 10:2343 | https://doi.org/10.1038/s41467-019-09824-4 | www.nature.com/naturecommunications 5 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-09824-4

a S. cryophilus S. octosporus 5S rRNA gene

hsp16

Hypothetical protein cFSAR-3 cFSAR-2 cFSAR-1 oFSAR-3 oFSAR-2 oFSAR-1 2 kb 1 copy 1-2 copies 1–3 copies 1 copy 1–9 copies 1 copy Homology to family of membrane proteins

b cry-ORF8 d hsp16 S. cry cry-ORF6 cry-ORF9 cry-ORF20 << cry-ORF7 KE cry-ORF21 cry-ORF22 ><><><< hsp16_SPOG_03590T0 >< DVAIRKE cry-ORF11 DV cry-ORF24 ><><> cry-ORF23 DVAIR ><><< cry-ORF10 AIRKE cry-ORF12-cFSAR-2 Cluster 1 ><> cry-ORF16-cFSAR-2 AIR cry-ORF3-cFSAR-2 ><><><<< cry-ORF4-cFSAR-2 DVAIRNME cry-ORF2-cFSAR-2 <<< cry-ORF18-cFSAR-2 NME cry-ORF5-cFSAR-3 cry-ORF17-cFSAR-3 cry-ORF14-cFSAR-3 cry-ORF1-cFSAR-3 > cry-ORF19-cFSAR-3 R cry-ORF13-cFSAR-3p ><> cry-ORF15-cFSAR-3 RKL oct-ORF4 <> ><>>< S. oct oct-ORF5 RKLEF oct-ORF2 KL oct-ORF1 <>>< oct-ORF3 Cluster 2 KLEF hsp16_SOCG_01107T0 >< oct-ORF6 EF both oct cry oct-ORF7 oct-ORF8 hsp16_SPBC3E7.02c_T0 hsp16_SJAG_03815T0 hsp20_SPOG_00514T0 hsp20 hsp20_SOCG_04797T0 hsp20_SPCC338.06c_T0 hsp20_SJAG_02919T0 SC_HSP42 SC_HSP26 0.2

c 200 e 150 tRNA count 100 50 Centromeric Non-centromeric 1800 (DVAIR) (1.9 kb) 1000 Cluster 1

200

0 S.cry-cen3R cTAR-4/5 0 200 1000 1800 S.oct-cen1R oTAR-4/5 (DVAIR) (2 kb) Cluster 1 & 2

1800 Cluster 2 (DVAIR) (2 kb) 1000

tRNA count 20 200 15 10 5 0 S.oct-cen3L oTAR-4/5 0 0 200 1000 1800 S.oct-cen1R oTAR4-oTAR5 (DVAIR) (2 kb) 8 6 4 5 5 2

15 10 15 10 S. pom S. cry S. oct S. pom S. cry S. oct S. pom S. cry S. oct

f c-imr1 c-cnt1 c-imr1 c-imr2 c-cnt2 c-imr2 c-imr3 c-cnt3 c-imr3 S. cryophilus cCNT-S cCNT-L cCNT-L cCNT-S FELK KL DVAIRKE EKRIAVD LTR TTLTR o-imr2 o-cnt2 o-imr2 o-imr3 o-cnt3 o-imr3 o-imr1 o-cnt1 o-imr1 S. octosporus oCNT-S oCNT-L oCNT-L oCNT-S FELK KL AIRKE EKRIA TT p-imr3 p-cnt3 p-imr3 p-imr1 p-cnt1 p-imr1 p-imr2 p-cnt2 p-imr2 S. pombe TM TM DRVT L E L TVRD EA I I AE IAV VAI intrinsic CENP-A deposition programme conserved between transposon integration38, followed by heterochromatin formation Schizosaccharomyces centromeric DNA. to silence retrotransposons and preserve genome integrity39,40. Based on conserved features, ancestral Schizosaccharomyces The ability of heterochromatin to recruit cohesin41,42, benefitting centromeres may have consisted of a CENP-ACnp1-assembled chromosome segregation, selected for heterochromatin central-core surrounded by tRNA gene clusters and 5S rDNAs. maintenance43,44, rather than selecting for underlying sequence, We surmise that RNAPIII promoters perhaps provided targets for which evolved by repeat expansion and continuous

6 NATURE COMMUNICATIONS | (2019) 10:2343 | https://doi.org/10.1038/s41467-019-09824-4 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-09824-4 ARTICLE

Fig. 3 S. cryophilus and S. octosporus contain conserved clusters of tRNA genes and similar nonhomologous repeat elements. a Schematic of S. cryophilus and S. octosporus FSAR repeats, indicating positions of 5S rDNAs, hsp16 genes and other ORFs. Copy number of each FSAR within centromeric arrays is indicated. b Phylogenetic relationship of S. cryophilus centromeric hsp16 genes with genomic hsp16 and hsp20 genes of S. cryophilus, S. octosporus, S. pombe and S. japonicas. c Heat map of tRNA gene frequency at centromeric and non-centromeric sites (blue shades) for S. pombe, S. cryophilus and S. octosporus. Anticodons and cognate amino acids indicated right (purple: present at centromeres). Clusters containing these tRNA genes indicated. Histogram (top): total tRNA gene frequencies in centromeres and non-centromeric sites of indicated species. Histogram (left): tRNA gene frequencies in each species. d Depiction of centromeric tRNA gene clusters and subclusters. Combinations of 2 or 3 tRNA genes subclusters present in both species (purple) or specific to S. octosporus (red) or S. cryophilus (blue) are indicated (single-letter code of cognate amino acid; arrows indicate plus or minus strand). e Top: Dot-plot alignment (MEGABLAST) showing synteny between oTAR-4/oTAR-5 (DVAIR-Cluster 1) from S.oct-cen1R (chr1:3355194-3357165) with oTAR-4/oTAR-5 (DVAIR-Cluster 1) from S.cry-cen3R (chr3:964707-966623). Bottom: Dot-plot of oTAR-4/oTAR-5 (DVAIR-Cluster 1) from S.oct-cen1R (chr1:3355194- 3357165) and oTAR-4/oTAR-5 (DVAIR-Cluster 1) from S.oct-cen3L (chr3:1791072-1793051). f Schematic of central domain similarity between species. Central cores (purple shades), imr (blues), TARs containing tRNA gene clusters (greens). Long (CNT-L) and short (CNT-S) central-core repeats are indicated. tRNA genes indicated in single-letter amino acid code. Colours highlight similarity of organization between species and indicates homology within, not between, species. Source data available: GEO: GSE112454

– homogenization23 25. As tRNA genes may have performed sequenced on the PacBio RSII instrument with P4/C3 chemistry. Magbead loading important functions—as boundaries preventing heterochromatin was used to load each sample at a concentration between 50 and 200 pM. Addi- spread into central cores and perhaps in higher-order centromere tional PacBio sequencing (without BluePippin) was performed by Biomedical — Research Core Facilities, University of Michigan. There, the following kits were organization and architecture tRNA gene clusters were main- used: DNA Sequencing Kit XL 1.0, DNA Template Prep Kit 2.0 (3 kb–10 kb) and tained26.InS. pombe, non-centromeric and centromeric tRNA DNA/Polymerase Binding Kit P4. MagBead Standard Seq v2 sequencing was genes and 5S rDNAs cluster adjacent to centromeres in a TFIIIC- performed using 10,000 bp size bin with no Stage Start with a 2 h observation dependent manner27,28. The multiple tandem centromeric 5S time on a PacBio RSII sequencer. A summary of PacBio sequencing performed is listed in Supplementary Table 11. rDNAs and tRNA genes could contribute to a robust, highly folded heterochromatin structure promoting optimum kine- fi De novo whole genome assembly of PacBio sequence reads. PacBio reads tochore con guration for co-ordinated microtubule attachments were assembled using HGAP3 (The Hierarchical Genome Assembly Process 44 and accurate chromosome segregation . version 3)48. Reads were first sorted by length and the top 30% used as seed reads The lack of overt sequence conservation between centromeres by HGAP3. All remaining reads of at least 1 kb in length were used to polish the of different species appears not to prevent functional conserva- seed reads. These polished reads were used to de novo assemble the genomes and Quiver software used to generate consensus genome contigs. Comparisons tion, which may be driven by underlying sequence features or to the ChIP-seq input data and Broad Institute Schizosaccharomyces reference properties such as the transcriptional landscape. Maintenance genomes20 showed very high agreement with these datasets. The S. octosporus and of centromere function has been observed at a pre-established S. cryophilus chromosomes were named according to their sequence lengths, the human centromere (pre-assembled with CENP-A and an intact, longest chromosome being labelled as chromosome I in each case. functional kinetochore) in chicken cells45 (310 My divergence). Establishment of CENP-A chromatin on human alpha-satellite in De novo assembly of S. pombe genome using nanopore technology. Genomic 49 fl mouse cells46 (90 My divergence) is dependent on the 17 bp DNA was extracted as described previously . Brie y, cells were incubated with Zymolyase 20T to digest the cell wall, pelleted, resuspended in TE (10 mM Tris- CENP-B box present at both human (alpha-satellite) and mouse HCl pH8, 1 mM EDTA) and lysed with SDS, followed by addition of potassium (minor satellite) centromere sequences, and on the CENP-B acetate and precipitation with isopropanol. After treatment with RNase A and protein. This conserved functionality is surpassed by the com- proteinase K, two phenol chloroform extractions were performed. DNA was petence of S. octosporus central-core DNA to establish CENP-A precipitated in the presence of sodium acetate and isopropanol, followed by centrifugation and washing of the pellet with 75% ethanol. After air drying, the chromatin in S. pombe from which it is separated by 119 My of pellet was resuspended in TE. DNA purity and concentration were assessed using a 20 evolution (equivalent to 383 My using a chordate molecular Nanodrop 2000 and the double-stranded high-sensitivity assay on a Qubit fluo- clock) and lacks any clear conserved sequence elements akin to rometer, respectively. Genomic DNA was sequenced using the MinION nanopore a CENP-B box. The analyses presented thus extend the evolu- sequencer (Oxford Nanopore Technologies). Three sequencing libraries were generated using the one-dimensional (1D) ligation kit SQK-LSK108, the two- tionary timescale over which cross-species establishment of dimensional (2D) ligation kit SQK-NSK007 and the 1D Rapid sequencing kit SQK- CENP-A chromatin has been demonstrated. RAD002, following the manufacturer’s guidelines. Each library was sequenced on one MinION flow cell. Sequencing reads were base-called using Metrichor (1D and 2D ligation libraries) or Albacore (rapid sequencing library). The com- Methods bined dataset incorporating reads from three flow cells was assembled using Canu Cell growth and manipulation. Standard genetic and molecular techniques were v1.5. The assembly was computed using default Canu parameters and a genome 47 followed. Fission yeast methods were as described . Strains used in this study are size of 13.8 Mbp. QUAST v3.2 was used to evaluate the genome assembly. listed in Supplementary Table 10. All Schizosaccharomyces strains were grown at 32 °C in YES (Yeast Extract with Supplements), except S. cryophilus, which was grown at 25 °C, unless otherwise stated. S. pombe cells carrying minichromosomes Genome annotation and chromosome structure. Genes were annotated onto the were grown in PMG-ade-ura. For low GFP-tagged CENP-ACnp1 protein expression genome both de novo, using BLAST and the sequences of known genes, and by from episomal plasmids, cells were grown in PMG-leu with thiamine. using liftover (https://genome-store.ucsc.edu) to carry over the previous gene annotation information from the Broad institute reference genomes (ref). Cross- Map50 was then used to lift the chain files over to the new, updated genome. The PacBio sequencing of genomic DNA. High-molecular-weight genomic DNA was locations of tRNA genes were predicted using tRNAscan51,52. Dfam 2.053 was used prepared from S. cryophilus, S. octosporus and S. japonicus using a Qiagen Blood to annotate repetitive DNA elements. MUMmer3.2354 was used to compare the and Cell Culture DNA Kit (Qiagen), according to the manufacturer’s instructions. genomes and annotate repeat elements and tandem repeat sequences, including Pacific Biosciences (PacBio) sequencing was carried out at the CSHL Cancer Center those located in centromeric domain and telomere sequences. Centromeric repeat Next Generation Genomics Shared Resource. Samples were prepared following the elements were manually identified using BLASTN and MEGABLAST (https://blast. standard 20 kb PacBio protocol. Briefly, 10–20 μg of genomic material was sheared ncbi.nlm.nih.gov). Each repeat element was named according to their sequence via g-tube (Covaris) to 20 kb. Samples were damage repaired via ExoVII (PacBio), features (association with tRNA gene and rDNAs) and locations. The sequence of damage-repair mix and end-repair mix using standard PacBio 20 kb protocol. the wild-type (h90) S. pombe mating-type locus was obtained by manually merging Repaired DNA underwent blunt-end ligation to add SMRTbell adaptors. For some nanopore and PacBio contigs using available data20 (Supplementary Fig. 10) and libraries, 10–50 kb molecules from 1 to 2 μg SMRTbell libraries were size selected information at www.pombase.org/status/mating-type-region. Genome synteny using BluePippin (Sage Science), after which samples were annealed to Pacbio alignment analysis was carried out using syMAP4255,56, based on orthologous SMRTbell primers per the standard PacBio 20 kb protocol. Annealed samples were genes among the three genomes.

NATURE COMMUNICATIONS | (2019) 10:2343 | https://doi.org/10.1038/s41467-019-09824-4 | www.nature.com/naturecommunications 7 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-09824-4

a S.cry-cen1 S.cry-cen2 S.cry-cen3 rpb1 rec6 tip41 rad50 per1 chk1

rpb1 rec6 tip41 chk1 per1 rad50 S.oct-cen2 S.oct-cen3 S.oct-cen1 10 kb b 20

10

0 PC2 (2.8%)

–10

–20

–40 –20 0 20 PC1 (4.4%)

Neocentromere CENP-ACnp1 associated Genomes

Centromeric Mating-type locus Subtelomeres heterochromatin

c 60 Kruskal–Wallis, p < 2.2e–16 *** ** * 40 **** **** ns 20 PC1

0

–20

–40 Neo- CENP-ACnp1 Genomes Cen Mating-type Sub- centromere associated hetero- locus telomere chromatin

Fig. 4 Schizosaccharomyces centromeres share ancestry and sequence features. a Structural alignment of putatively equivalent centromere repeat elements of S. cryophilus and S. octosporus to highlight potential centromere rearrangements during evolution. b Principal component analysis PC1 and PC2 of 5-mer frequencies of three fission yeast genomes. Genome regions (12 kb window) were assigned to one of five specific annotated groups (CENP-ACnp1- associated (purple, n = 24), centromeric heterochromatin (orange, n = 112), mating-type locus (blue, n = 18), subtelomeres (yellow, n = 67), neocentromere-forming regions34 (red, n = 9), or other genome regions (grey, n = 7652). For each group the oval line encloses 95% of the data points. c Boxplot principal component PC1 of each group. Colours and values for n as in b. Mean comparison between groups was used (p-value: > 0.05, ns; *p > 0.01; **p > 0.001; ***p > 0.0001; ****p < 0.0001)66. Centre line, medium; box limits, upper and lower quartiles; whiskers, 1.5 × interquartile range; points, outliers. Source data available: GEO: GSE112454

8 NATURE COMMUNICATIONS | (2019) 10:2343 | https://doi.org/10.1038/s41467-019-09824-4 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-09824-4 ARTICLE

a αN α1 S. pombe 1 S. cryophilus S. octosporus S. japonicus α2 α3 S. pombe 61 S. cryophilus S. octosporus S. japonicus

b 25 °C32 °C 36 °C

pGFP

pGFP-Sp-CENP-ACnp1

pGFP-So-CENP-ACnp1

pGFP-Sc-CENP-ACnp1

pGFP-Sj-CENP-ACnp1

c

pGFP

pGFP-Sp-CENP-ACnp1

pGFP-So-CENP-ACnp1

pGFP-Sc-CENP-ACnp1

pGFP-Sj-CENP-ACnp1

Cdc11 GFP DAPI Merge

Fig. 5 Cross-species functionality of CENP-ACnp1 proteins. a Alignment of Schizosaccharomyces CENP-ACnp1 proteins. Positions of alpha helices (yellow), N-terminal tail (green) and CENP-A-targeting domain (CATD; red) are indicated. b S. pombe temperature-sensitive cnp1-1 cells expressing plasmid-borne GFP-CENP-ACnp1 from the indicated species (Sp, S. pombe; So, S. octosporus; Sc, S. cryophilus; Sj, S. japonicus), or GFP alone, were spotted on phloxine B-containing plates and incubated for 2–5 days at the indicated temperatures. c Localization of GFP-tagged CENP-ACnp1 from indicated Schizosaccharomyces species in S. pombe. Wild-type S. pombe cells bearing plasmids described in a were grown at 32 °C before fixation and staining with anti-GFP (green), anti- Cdc11 (red, spindle-pole body) and DAPI (blue, DNA). Centromeres cluster at the spindle-pole body in S. pombe. Scale bar, 5 μm. Source data available as a Source Data file

ChIP-quantitative PCR analysis. For analysis of CENP-ACnp1 association with Chromatin immunoprecipitation sequencing. A modified ChIP protocol was minichromosomes bearing S. octosporus central-core DNA, three independent used. Briefly, pellets containing 7.5 × 108 cells were lysed by four 1 min pulses of transformants with established centromere function (indicated by ability to form bead beating in 500 μl of lysis buffer (50 mM HEPES-KOH, pH 7.5, 140 mM NaCl, sectored colonies) for each minichromosome were grown in PMG-ade-ura cultures 1 mM EDTA, 1% Triton X-100, 0.1% sodium deoxycholate), with resting on ice and fixed with 1% formaldehyde for 15 min at room temperature. ChIP was per- in between. The insoluble chromatin fraction was pelleted by centrifugation at formed as previously described57. Briefly, 2.5 × 108 cells were lysed by bead beating 6000 × g and washed with 1 ml lysis buffer before resuspension in 300 μl lysis buffer (Biospec) in 300 μl Lysis Buffer (50 mM Hepes-KOH pH 7.5, 140 mM NaCl, 1 mM containing 0.2% SDS. Chromatin was sheared by sonication using a Bioruptor EDTA, 1% (v/v) Triton X-100, 0.1% (w/v) sodium deoxycholate). Lysates were (Diagenode) for 30 min (30 s on/off, high setting). Nine hundred microlitres of lysis sonicated (Bioruptor, Diagenode) for 20 min (30 s on/off, high setting), followed by buffer (no SDS) was added and samples clarified by centrifugation at 17,000 × g for centrifugation at 17,000 × g (2 × 10 min) to pellet cell debris. Lysates were pre- 20 min and the supernatant used for ChIP. Six microlitres of anti-H3K9me2 mouse cleared for 1 h with 25 μl of Protein-G agarose beads (Roche) and 10 μl precleared monoclonal mAb5.1.158 (kind gift from Takeshi Urano) or 30 μl sheep anti-CENP- lysate retained as ‘input’ sample. Three hundred microlitres of lysate was incubated ACnp1 antiserum57 were used, along with protein-G-dynabeads (ThermoFisher overnight with 10 μl sheep anti-CENP-ACnp1 serum and 25 μl Protein-G agarose Scientific) or Protein-G agarose beads (Roche), respectively. (For neocentromere beads. Beads were washed with Lysis Buffer, Lysis Buffer with 500 mM NaCl, strains, cells were first treated with Zymolyase 100T (AMS Biotechnology), washed WASH buffer (10 mM Tris-HCl pH 8, 0.25 M LiCl, 0.5% NP-40, 0.5% (w/v) in sorbitol and permeablized. Chromatin was fragmented with incubation with sodium deoxycholate,1 mM EDTA) and TE. DNA was recovered from input and micrococcal nuclease. Cell suspensions were adjusted to standard ChIP buffer IP samples using Chelex resin (BioRad). Ten microlitres of anti-CENP-ACnp1 conditions and extracted chromatin was processed as per standard ChIP.) sheep antiserum57 (raised to the N-terminal 19 amino acids of S. pombe CENP- Immunoprecipitated DNA was recovered using Qiagen PCR purification columns. ACnp1) and 25 μl Protein-G-Agarose beads were used per ChIP. qPCR was per- ChIP-Seq libraries were prepared with 1–5 ng of ChIP or 10 ng of input DNA. formed using a LightCycler 480 and reagents (Roche), and analysed using Light- DNA was end-repaired using NEB Quick blunting kit (E1201L). The blunt, Cycler 480 Software 1.5 (Roche). Primers used in qPCR are listed in Supplementary phosphorylated ends were treated with Klenow-exo− (NEB, M0212S) and dATP. Table 12. Mean %IP ChIP values for Sp-cnt or So-cnt on minichromsomes were After ligation of NEXTflex adaptors (Bioo Scientific) DNA was PCR amplified with normalized to %IP for endogenous S. pombe cnt1. Error bars represent SD. Illumina primers for 12–15 cycles and library fragments of ~300 bp (insert plus

NATURE COMMUNICATIONS | (2019) 10:2343 | https://doi.org/10.1038/s41467-019-09824-4 | www.nature.com/naturecommunications 9 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-09824-4

ado-imr1 o-cnt1 o-imr1 ChIP: CENP-ACnp1

1.5 TT 3.2 kb o-imr2 o-cnt2 o-imr2 1.0 ChIP

oCNT-S oCNT-L S. pombe cnt1 FELK KL Cnp1 10 kb 6.5 kb 0.5

4.7 kb CENP-A o-imr3 o-cnt3 o-imr3 normalised to

oCNT-L oCNT-S 0.0 AIRKE EKRIA 6.5 kb 3.6 kb 2.6 kb

CENP-ACnp1 pK-So-cnt2–10 kb Heterochromatin chromatin? pK-Sp-cnt2–8.5 kb pK-So-cnt2–6.5 kb pK-So-cnt2–4.7 kb pK-So-cnt1–3.2 kb pK-So-cnt3–2.6 kb S. pombe S. octosporus K-rpt S.oct-cnt central core central core S. pombe S. octosporus sup3e bce

–T CENP-ACnp1 overexpressed

+T wt CENP-ACnp1 levels

Fig. 6 S. octosporus central-core DNA establishes CENP-ACnp1 chromatin upon introduction into S. pombe. a Indicated regions of S. octosporus central-core DNA placed adjacent to a portion of S. pombe heterochromatin-forming outer-repeat sequence on a plasmid. b S. pombe transformants containing minichromsome plasmids were replica-plated to low-adenine non-selective plates: colonies retaining the chimeric minichromosome plasmid are white/ pale pink, those that lose it are red. Representative plate showing pKp-So-cnt3-6.5kb-containing colonies. c S. pombe cells containing pKp-So-cnt3-6.5 kb chimeric minichromosome were streaked to single colonies. Red colour indicates loss of minichromosome; small red sectors indicate low-frequency minichromosome loss and mitotic segregation function. d ChIP-qPCR for CENP-ACnp1 on S. pombe hi-CENP-ACnp1 cells containing chimeric minichromosomes with established centromere function. Three biologically independent transformants were analysed for each minichromosome (n = 3). ChIP enrichment on S.pom-cnt2 and S.oct-cnt-bearing minichromosomes is normalized to the level at endogenous S. pombe cnt1. Individual data points are shown as black dots. Error bars, SD. e Propagation of chimeric minichromosome stability. Cells containing pK(5.6 kb)-So-cnt2-10 kb were streaked on low- adenine-containing plates with or without thiamine, which results in repression or expression of high levels of S. pombe CENP-ACnp1. Source data available as a Source Data file adaptor sequences) were selected using Ampure XP beads (Beckman Coulter). The account for multiple testing, the centromere tRNA gene clusters each exhibited a libraries were sequenced following Illumina HiSeq2000 work flow (or as indicated q-value <0.005. in Supplementary Table 11). Hsp16 gene tree analysis. hsp16 paralogs from S. octosporus and S. cryophilus genomes were predicted using BLASTP. The predicted protein sequences from fi fi Cnp1 De ning ssion yeast centromeres. CENP-A and H3K9me2 ChIP-seq data hsp16 genes across all four fission yeasts were aligned together with those from were generated to identify centromere regions. ChIP-Seq reads with mapping Saccharomyces cerevisiae using Clustal Omega. BEAST (Bayesian Evolutionary qualities lower than 30, or read pairs that were over 500 nt or <100 nt apart, were Analysis Sampling Trees)64 and FigTree (http://tree.bio.ed.ac.uk/software/figtree/) discarded. ChIP-seq data were normalized with respect to input data. Paired-end was used to generate and view the hsp16 gene phylogenetic tree. ChIP-seq data (single-end for S. japonicus) was aligned to the updated genome 59 60 61 62 sequences using Bowtie2 . Samtools , Deeptools and IGV were subsequently 5-mer frequency PCA. The CENP-ACnp1-associated sequences in the S. pombe, S. fi 63 used to generate sequence data coverage les and to visualize the data. MACS2 cryophilus and S. octosporus genomes are all ~12 kb in length. Each genome was Cnp1 was used to detect CENP-A and heterochromatin-enriched regions of the therefore split into 12 kb sliding windows with a 4.5 kb overlap. The frequencies of genome. each 5-mer was calculated in each window using Jellyfish65. CENP-ACnp1-asso- ciated regions showed a general enrichment of AT base pairs relative to the genome Centromere tRNA gene cluster analysis. To test for the enrichment of tRNA as a whole. To normalize for GC content among the windows, all base pairs were gene clusters at centromere regions, a greedy search approach was used to randomized in each sequence window to generate 1000 artificial sequences with the identify potential clusters. All tRNA genes <1000 bp apart were grouped into same GC content. 5-mer frequencies were then recalculated for each of these 1000 clusters. To test for significant clustering of tRNA genes at the centromere, the artificial sequences and the true original 5-mer frequencies compared with these locations of tRNA genes across the genome were shuffled 1000 times. For each background frequencies by calculating a z-score. Consequently, these enrichment cluster observed in the real genome, the proportion of permutations where the scores represent the k-mer enrichments in a given sequence normalized for GC same cluster was observed at least as many times was calculated to provide content. Genome windows were split into six groups: CENP-ACnp1-associated estimates of significance. Following conversion of these p-values to q-values to sequences (CENP-ACnp1 peaks covering >6 kb of sequence); outer-repeat

10 NATURE COMMUNICATIONS | (2019) 10:2343 | https://doi.org/10.1038/s41467-019-09824-4 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-09824-4 ARTICLE

— Table 1 Establishment frequency and stability of of integration into endogenous chromosomes and the establishment frequency adjusted accordingly. minichromosomes in S. pombe Minichromosome stability assay. Minichromosome loss frequency was deter- fl Plasmid Establishment Loss rate per mined by half-sector assay. Brie y, transformants containing minichromsomes frequency % (n) division % (n) with established centromere function were grown in PMG-ade-ura to select for cells containing the minichromosome. At least two transformants were analysed pK-Sp-cnt2-8.5 kb 94 (217) 5.8 (3284) per minichromosome. Cells were plated on low-adenine-containing plates and pK-So-cnt2-10 kb 36.4 (88) 11.2 (1636) allowed to grow non-selectively for 4–7 days. Minichromosome loss is indicated by pK-So-cnt3-6.5 kb 40.1 (262) 5.8 (3705) red sectors and retention by white sectors. To determine loss rate per division, pK-So-cnt2-6.5 kb 28.1 (208) 6.6 (3621) all colonies were examined with a dissecting microscope. All colonies—except pure — pK-So-cnt2-4.7 kb 5.2 (973) 6.8 (7176) reds were counted to give total number of colonies. Pure reds were checked for pK-So-cnt1-3.2 kb 1.9 (1529) 11.5 (2017) the absence of white sectors and were excluded, because they had lost the mini- chromosome before plating. To determine colonies that lost the minichromosome pK-So-cnt3-2.6 kb 0.3 (1443) 21.1 (1237) in the first division after plating, ‘half-sectored’ colonies were counted. This pKp-So-cnt3-6.5 kb 6.6 (916) 15.9 (2099) included any colony that was 50% or greater red (including those with only a tiny pKp-So-cnt3-3.6 kb 0 (1538) NA white sector). Loss rate per division is calculated as the number of half-sectored pKp 0 (295) NA colonies as a percentage of all (non-pure-red) colonies.

Establishment frequency of chimeric minichromosomes in S. pombe hi-CENP-ACnp1 cells Recovery of minichromosomes from S. pombe. To confirm that establishment of determined by replica plating of transformants (Methods) as shown in Fig. 6b(n = number centromere function by minichromosome plasmids was not due to rearrangement of transformants analysed). Chromosome loss rate of established minichromosomes was determined by half-sector assay (Methods). Two transformants containing established or gain of S. pombe central-core sequences, minichromosomes were recovered from 8 centromeres were analysed for each minichromosome and the mean loss rate determined, S. pombe into E. coli. Approximately 1 × 10 S. pombe cells containing mini- n = number of colonies screened. NA, not applicable as the minichromosomes did not establish chromosome plasmids with established centromere function were incubated 1 ml centromere function PEMS buffer (100 mM PIPES pH 7, 1 mM EDTA, 1 mM MgCl2, 1.2 M Sorbitol) containing 1 mg/ml Zymolyase-100T (AMS Biotechnology) for 30–60 min at 36 °C to digest cell walls. After pelleting and washing with PEMS, spheroplasts were lysed heterochromatin regions (more than half the window covered by H3K9me2 peaks and plasmid DNA isolated using Qiagen miniprep kit, following manufacturer’s adjacent to CENP-A domains); subtelomeric regions (more than half the window instructions. Due to low-yield plasmids were recovered by transformation into covered by H3K9me peaks and close to the end of a chromosome); mating-type E. coli gt116 cells, followed by restriction enzyme analysis of resultant miniprep locus; neocentromere regions (identified using CENP-ACnp1 ChIP-seq data of S. DNA. Digestion patterns of recovered and original minichromsome plasmids was pombe neocentromere-containing strains34); and remaining genome sequences. As compared by agarose gel electrophoresis. the highly repetitive transposon-rich S. japonicus centromere regions are not fully assembled, the precise location of the centromere-kinetochore is unknown. We Immunolocalization. For localization of CENP-ACnp1, Schizosaccharomyces cul- therefore adopted a limited PCA approach and selected the top 11 most highly tures were fixed with 3.7% formaldehyde for 7 min, before processing for immu- enriched 12 kb regions from CENP-ACnp1 ChIP-seq (Supplementary Data 11). nofluorescence as described57. Briefly, cells were fixed with 3.7% formaldehyde for These were compared with ten randomly selected non-centromeric sequences from 7 min, followed by cell-wall digestion with Zymolyase-100T (AMS Biotechnology) each of the fission yeast genomes as above. in PEMS buffer (100 mM PIPES pH 7, 1 mM EDTA, 1 mM MgCl2, 1.2 M Sorbitol). Logistic regression and mean comparison were used to determine whether After permeablization with Triton X-100, cells were washed, blocked in PEMBAL principal components were linked to the probability of a sequence belonging to a (PEM containing 1% bovine serum albumin, 0.1% sodium azide, 100 mM lysine particular sequence group66. Logistic regression and mean comparison were used hydrochloride). Anti-CENP-ACnp1 sheep antiserum57 (raised to the N-terminal 19 to determine whether principal components (FactoMineR) were linked to the amino acids of S. pombe CENP-ACnp1) was used in PEMBAL at 1:1000 dilution probability of a sequence belonging to a particular sequence group. and Alexa-488-coupled donkey anti-sheep secondary antibody (A11015; Invitro- gen) at 1:1000 dilution. Cells were stained with 4′,6-diamidino-2-phenylindole Construction of minichromosomes. S. pombe functional minichromosomes (DAPI) and mounted in Vectashield. Microscopy was performed with a Zeiss contain central domain DNA and flanking repeat DNA on one side; the long, Imaging 2 microscope (Zeiss) using a × 100 1.4 NA Plan-Apochromat objective, fi inverted repeats found in the natural context are not tolerated by Escherichia coli67. Prior lter wheel, illumination by HBO100 mercury bulb. Image acquisition with Regions of S. octosporus and S. cryophilus central-core regions were amplified with a Photometrics Prime sCMOS camera (Photometrics, https://www.photometrics. primers indicated in Supplementary Table 12 and inserted as BglII-NcoI, BamHI- com) was controlled using Metamorph software (Universal Imaging Corporation). – NcoIorBglII-SalI fragments into BglII-NcoI- or BglII-SalI-digested plasmid pK Exposures were 1500 ms for FITC/Alexa-488 channel and 300 1000 ms for DAPI. (5.6 kb)-MCS-ΔBam, which contains a 5.6 kb fragment of the S. pombe K(dg) Images shown in Fig. 2a are autoscaled. Cnp1 outer repeat. To create plasmid pK-So-cnt2-10 kb, an additional 3.6 kb region from To express GFP-tagged versions of Schizosaccharomyces CENP-A proteins fi S.oct-cnt2 was inserted as a BamHI-SalI fragment into BglII-SalI-digested pK-So- in S. pombe, ORFs were ampli ed from relevant genomic DNA using primers listed cnt2-6.5 kb to make a 10 kb region of S. octosporus central core. For pKp plasmids, in Supplementary Table 12. Fragments were digested with NdeI-BamHI or NdeI- 69 S. octosporus or S. cryophilus central-core regions were by inserted as BglII-NcoI, BglII and ligated into NdeI-BamHI digested pREP41X-GFP vector SalI-BamHI or XhoI-BamHI fragments into BamHI-NcoI- or SalI-BamHI-digested (Supplementary Table 13). For detection of GFP-tagged versions of Cnp1 plasmid pKp (pMC91), which contains 2 kb region from S. pombe K(dg) outer Schizosaccharomyces CENP-A proteins in S. pombe, cells containing Cnp1 repeat. For the S. japonicus CENP-ACnp1-associated retrotransposon Tj720 (Sup- pREP41X-GFP-CENP-A episomal plasmids (variable copy number) were + Cnp1 plementary Fig. 6), a region spanning the almost the entire retrotransposon (but grown in PMG-leu thiamine to allow low GFP-CENP-A expression. Cells fi lacking the second LTR to avoid rearrangement or transposition problems in E. coli were xed, processed for immunolocalization and microscopy as above. Anti-GFP 57 or S. pombe) was amplified by PCR with primers indicated in Supplementary antibody (A11122; Invitrogen) was used at 1:300 and anti-Cdc11 (a spindle-pole Table 12 and cloned in two steps into the NotI-XbaI site of pK(5.6 kb)-MCS-ΔBam body marker; gift from Ken Sawin) was used at 1:600. Secondary antibodies were, to make pK-Sj-Tj7-4.8 kb. Plasmids are listed in Supplementary Table 13. respectively, Alexa-488-coupled chicken anti-rabbit (A21441; Invitrogen) and Alexa-594-coupled donkey anti-sheep (A11016; Invitrogen), both at 1:1000. Exposures were FITC/488 channel 1500 ms, TRITC/594 1000 ms and DAPI Centromere establishment assay. Strains A7373 or A7408, which contains 500–1000 ms. For display of images in Fig. 5C, TRITC/594 and FITC/488 images integrated nmt41-GFP-CENP-ACnp1 to allow high level expression of CENP-A18, are scaled relative to the maximum intensity in the set of images, whereas DAPI were grown in PMG-complete medium and transformed using sorbitol- images are autoscaled. electroporation method68. Cells were plated on PMG-uracil-adenine plates and incubated at 32 °C for 5–10 days until medium-sized colonies had grown. Colonies μ Reporting summary. Further information on research design is available in the were replica-plated to PMG-low-adenine (10 g/ml) plates to determine the fre- Nature Research Reporting Summary linked to this article. quency of establishment of centromere function. These indicator plates allow minichromosome loss (red) or retention (white/pale pink) to be determined. Data availability Minichromosome retention indicates that centromere function has been estab- lished, and that minichromosomes segregate efficiently in mitosis. In the absence of Sequence data generated in this study have been submitted to GEO under accession centromere establishment, minichromosomes behave as episomes that are rapidly number: GSE112454. This study used PacBio and nanopore sequencing data under lost. Minichromosomes occasionally integrate giving a false positive white phe- project PRJNA472404. Assembled genomes are available at http://bifx-core.bio.ed.ac.uk/ fi notype. To assess the frequency of such integration events and to confirm estab- ~ptong/genome_assembly/. All other relevant data supporting the key ndings of this fi lishment of centromere segregation function, a proportion of colonies giving the study are available within the article and its Supplementary Information les or from the white/pale pink phenotype upon replica plating were re-streaked to single colonies corresponding authors upon reasonable request. The source data underlying Figs. 2, 5, 6 fi on low-adenine plates—sectored colonies are indicative of segregation function and Supplementary Figs 6, 8, 9, are provided in a Source Data le. A reporting summary with low levels of minichromosome loss, whereas pure white colonies are indicative for this Article is available as a Supplementary Information file.

NATURE COMMUNICATIONS | (2019) 10:2343 | https://doi.org/10.1038/s41467-019-09824-4 | www.nature.com/naturecommunications 11 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-09824-4

Received: 3 May 2018 Accepted: 27 March 2019 29. Donze, D., Adams, C. R., Rine, J. & Kamakaka, R. T. The boundaries of the silenced HMR domain in Saccharomyces cerevisiae. Genes Dev. 13, 698–708 (1999). 30. Carabana, J., Watanabe, A., Hao, B. & Krangel, M. S. A barrier-type insulator forms a boundary between active and inactive chromatin at the murine TCRβ locus. J. Immunol. 186, 3556–3562 (2011). References 31. Valenzuela, L. & Kamakaka, R. T. Chromatin insulators. Annu. Rev. Genet. 40, – 1. Buscaino, A., Allshire, R. & Pidoux, A. Building centromeres: home sweet 107 138 (2006). home or a nomadic existence? Curr. Opin. Genet Dev. 20, 118–126 (2010). 32. Steglich, B. et al. The Fun30 chromatin remodeler Fft3 controls nuclear fi 2. Plohl, M., Mestrović, N. & Mravinac, B. Centromere identity from the DNA organization and chromatin structure of insulators and subtelomeres in ssion point of view. Chromosoma 123, 313–325 (2014). yeast. PLoS Genet. 11, e1005101 (2015). 3. Malik, H. S. & Henikoff, S. Major evolutionary transitions in centromere 33. Iben, J. R. & Maraia, R. J. tRNAomics: tRNA gene copy number variation and complexity. Cell 138, 1067–1082 (2009). codon use provide bioinformatic evidence of a new anticodon:codon wobble – 4. Dumont, M. & Fachinetti, D. DNA sequences in centromere formation and pair in a eukaryote. RNA 18, 1358 1372 (2012). function. Prog. Mol. Subcell. Biol. 56, 305–336 (2017). 34. Ishii, K. et al. Heterochromatin integrity affects chromosome reorganization – 5. Mendiburo, M. J., Padeken, J., Fülöp, S., Schepers, A. & Heun, P. Drosophila after centromere dysfunction. Science 321, 1088 1091 (2008). CENH3 is sufficient for centromere formation. Science 334, 686–690 (2011). 35. Takahashi, K., Chen, E. S. & Yanagida, M. Requirement of Mis6 centromere fi 6. Folco, H. D., Pidoux, A. L., Urano, T. & Allshire, R. C. Heterochromatin and connector for localizing a CENP-A-like protein in ssion yeast. Science 288, – RNAi are required to establish CENP-A chromatin at centromeres. Science 2215 2219 (2000). 319,94–97 (2008). 36. Baum, M., Ngan, V. K. & Clarke, L. The centromeric K-type repeat and the fi 7. Gómez-Rodríguez, M. & Jansen, L. E. Basic properties of epigenetic systems: central core are together suf cient to establish a functional – lessons from the centromere. Curr. Opin. Genet Dev. 23, 219–227 (2013). Schizosaccharomyces pombe centromere. Mol. Biol. Cell 5, 747 761 (1994). 8. McKinley, K. L. & Cheeseman, I. M. The molecular basis for centromere 37. Kasinathan, S. & Henikoff, S. Non-B-form DNA is enriched at centromeres. – identity and function. Nat. Rev. Mol. Cell Biol. 17,16–29 (2015). Mol. Biol. Evol. 35, 949 962 (2018). 9. Karpen, G. H. & Allshire, R. C. The case for epigenetic effects on centromere 38. Guo, Y., Singh, P. K. & Levin, H. L. A long terminal repeat retrotransposon identity and function. Trends Genet 13, 489–496 (1997). of Schizosaccharomyces japonicus integrates upstream of RNA pol III 10. Scott, K. C. & Sullivan, B. A. Neocentromeres: a place for everything and transcribed genes. Mob. DNA 6, 19 (2015). ć everything in its place. Trends Genet. 30,66–74 (2014). 39. Plohl, M., Luchetti, A., Mestrovi , N. & Mantovani, B. Satellite DNAs between fi 11. Stimpson, K. M., Matheny, J. E. & Sullivan, B. A. Dicentric chromosomes: sel shness and functionality: structure, genomics and evolution of tandem – unique models to study centromere function and inactivation. Chromosome repeats in centromeric (hetero)chromatin. Gene 409,72 82 (2008). Res. 20, 595–605 (2012). 40. Nishibuchi, G. & Déjardin, J. The molecular basis of the organization of 12. Barnhart, M. C. et al. HJURP is a CENP-A chromatin assembly factor repetitive DNA-containing constitutive heterochromatin in mammals. – sufficient to form a functional de novo kinetochore. J. Cell Biol. 194, 229–243 Chromosome Res. 25,77 87 (2017). (2011). 41. Nonaka, N. et al. Recruitment of cohesin to heterochromatic regions by fi – 13. Partridge, J. F., Borgstrøm, B. & Allshire, R. C. Distinct protein interaction Swi6/HP1 in ssion yeast. Nat. Cell Biol. 4,89 93 (2002). domains and protein spreading in a complex centromere. Genes Dev. 14, 42. Bernard, P. et al. Requirement of heterochromatin for cohesion at – 783–791 (2000). centromeres. Science 294, 2539 2542 (2001). 14. Allshire, R. C. & Ekwall, K. Epigenetic regulation of chromatin states in 43. Peters, J.-M. & Nishiyama, T. Sister chromatid cohesion. Cold Spring Harb. – Schizosaccharomyces pombe. Cold Spring Harb. Perspect. Biol. 7, a018770 Perspect. Biol. 4, a011130 a011130 (2012). (2015). 44. Gregan, J. et al. The kinetochore proteins Pcs1 and Mde4 and 15. Martienssen, R. & Moazed, D. RNAi and heterochromatin assembly. heterochromatin are required to prevent merotelic orientation. Curr. Biol. 17, – Cold Spring Harb. Perspect. Biol. 7, a019323 (2015). 1190 1200 (2007). 16. Reyes-Turcu, F. E. & Grewal, S. I. Different means, same end-heterochromatin 45. Shen, M. H., Yang, J. W., Yang, J., Pendon, C. & Brown, W. R. The accuracy of formation by RNAi and RNAi-independent RNA processing factors in fission segregation of human mini-chromosomes varies in different vertebrate cell yeast. Curr. Opin. Genet Dev. 22, 156–163 (2012). lines, correlates with the extent of centromere formation and provides 17. Kagansky, A. et al. Synthetic heterochromatin bypasses RNAi and evidence for a trans-acting centromere maintenance activity. Chromosoma – centromeric repeats to establish functional centromeres. Science 324, 109, 524 535 (2001). 1716–1719 (2009). 46. Okada, T. et al. CENP-B controls centromere formation depending on the – 18. Catania, S., Pidoux, A. L. & Allshire, R. C. Sequence features and chromatin context. Cell 131, 1287 1300 (2007). fi transcriptional stalling within centromere DNA promote establishment of 47. Moreno, S., Klar, A. & Nurse, P. Molecular genetic analysis of ssion yeast – CENP-A chromatin. PLoS Genet. 11, e1004986 (2015). Schizosaccharomyces pombe. Meth. Enzym. 194, 795 823 (1991). fi 19. Shukla, M. et al. Centromere DNA destabilizes H3 nucleosomes to promote 48. Chin, C.-S. et al. Nonhybrid, nished microbial genome assemblies from long- – CENP-A deposition during the cell cycle. Curr. Biol. 28, 3924–3936.e4 (2018). read SMRT sequencing data. Nat. Methods 10, 563 569 (2013). https://doi.org/10.1016/j.cub.2018.10.049 49. Murray, J. M., Watson, A. T. & Carr, A. M. Molecular genetic tools and fi 20. Rhind, N. et al. Comparative functional genomics of the fission yeasts. Science techniques in ssion yeast. Cold Spring Harb. Protoc. 2016, pdb.top087601 332, 930–936 (2011). (2016). 21. Helston, R. M., Box, J. A., Tang, W. & Baumann, P. Schizosaccharomyces 50. Zhao, H. et al. CrossMap: a versatile tool for coordinate conversion between – cryophilus sp. nov., a new species of fission yeast. FEMS Yeast Res. 10, genome assemblies. Bioinformatics 30, 1006 1007 (2014). 779–786 (2010). 51. Lowe, T. M. & Eddy, S. R. tRNAscan-SE: a program for improved detection – 22. Ács-Szabó, L., Papp, L.A., Antunovics, Z., Sipiczki, M., & Miklós I. Assembly of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25, 955 964 of Schizosaccharomyces cryophilus chromosomes and their comparative (1997). genomic analyses revealed principles of genome evolution of the haploid 52. Schattner, P., Brooks, A. N. & Lowe, T. M. The tRNAscan-SE, snoscan and fission yeasts. Sci. Rep. 8, 14629 (2018). snoGPS web servers for the detection of tRNAs and snoRNAs. Nucleic Acids – 23. Dawe, R. K. & Henikoff, S. Centromeres put in the driver’s seat. Res. 33, W686 W689 (2005). Trends Biochem Sci. 31, 662–669 (2006). 53. Hubley, R. et al. The Dfam database of repetitive DNA families. Nucleic Acids – 24. Roizès, G. Human centromeric alphoid domains are periodically homogenized so Res. 44, D81 D89 (2016). that they vary substantially between homologues. Mechanism and implications 54. Kurtz, S. et al. Versatile and open software for comparing large genomes. for centromere functioning. Nucleic Acids Res. 34, 1912–1924 (2006). Genome Biol. 5, R12 (2004). 25. Sharma, A., Wolfgruber, T. K. & Presting, G. G. Tandem repeats derived from 55. Soderlund, C., Nelson, W., Shoemaker, A. & Paterson, A. SyMAP: A system centromeric retrotransposons. BMC Genomics 14, 142 (2013). for discovering and viewing syntenic regions of FPC maps. Genome Res. 16, – 26. Scott, K. C., Merrett, S. L. & Willard, H. F. A heterochromatin barrier 1159 1168 (2006). partitions the fission yeast centromere into discrete chromatin domains. Curr. 56. Soderlund, C., Bomhoff, M. & Nelson, W. M. SyMAP v3.4: a turnkey synteny – Biol. 16, 119–129 (2006). system with application to plant genomes. Nucleic Acids Res. 39, e68 e68 27. Noma, K.-I., Cam, H. P., Maraia, R. J. & Grewal, S. I. S. A role for TFIIIC (2011). fi transcription factor complex in genome organization. Cell 125, 859–872 (2006). 57. Castillo, A. G. et al. Plasticity of ssion yeast CENP-A chromatin driven by 28. Mizuguchi, T., Barrowman, J. & Grewal, S. I. S. Chromosome domain relative levels of histone H3 and H4. PLoS Genet. 3, e121 (2007). architecture and dynamic organization of the fission yeast genome. FEBS Lett. 58. Nakagawachi, T. et al. Silencing effect of CpG island hypermethylation fi 589, 2975–2986 (2015). and histone modi cations on O6-methylguanine-DNA methyltransferase

12 NATURE COMMUNICATIONS | (2019) 10:2343 | https://doi.org/10.1038/s41467-019-09824-4 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-09824-4 ARTICLE

(MGMT) gene expression in human cancer. Oncogene 22, 8835–8844 Z. Pacific Biosciences (PacBio) sequencing carried out at the CSHL Cancer Center Next (2003). Generation Genomics Shared Resource, which is supported by the Cancer Center Sup- 59. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. port Grant 5P30CA045508 was paid for by a kind gift from Kathryn W. Davis to G.J.H. Nat. Methods 9, 357–359 (2012). 60. Li, H. et al. The Sequence Alignment/Map format and SAMtools. – Author contributions Bioinformatics 25, 2078 2079 (2009). R.C.A. and A.L.P. designed the study. P.T. performed the PacBio genome assemblies 61. Ramírez, F., Dündar, F., Diehl, S., Grüning, B. A. & Manke, T. deepTools: a fl and bioinformatics, ChIP-seq analysis and PCA analysis. C.M. performed the nanopore exible platform for exploring deep-sequencing data. Nucleic Acids Res. 42, sequencing of S. pombe supervised by C.N. H.B., N.R.T.T., J.T.-G. and R.A. generated W187–W191 (2014). – ChIP-seq data with contribution from M.S. A.L.P. performed cytology, analysis of 62. Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29,24 26 repetitive regions and experiments on cross-species functionality. R.C.A. supervised (2011). the study. A.L.P. wrote the manuscript with contributions from P.T., R.C.A. and other 63. Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, authors. All authors read and approved the final version of the manuscript. R137 (2008). 64. Drummond, A. J. & Rambaut, A. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol. Biol. 7, 214 (2007). Additional information 65. Marçais, G. & Kingsford, C. A fast, lock-free approach for efficient parallel Supplementary Information accompanies this paper at https://doi.org/10.1038/s41467- counting of occurrences of k-mers. Bioinformatics 27, 764–770 (2011). 019-09824-4. 66. Kruskal, W. H. & Wallis, W. A. Use of ranks in one-criterion variance analysis. Joournal Am. Ctatistical Soc. 47, 583–621 (1952). Competing interests: The authors declare no competing interests. 67. Hahnenberger, K. M., Carbon, J. & Clarke, L. Identification of DNA regions required for mitotic and meiotic functions within the centromere of Reprints and permission information is available online at http://npg.nature.com/ Schizosaccharomyces pombe chromosome I. Mol. Cell Biol. 11,2206–2215 (1991). reprintsandpermissions/ 68. Suga, M. & Hatakeyama, T. High efficiency transformation of Schizosaccharomyces pombe pretreated with thiol compounds by Journal peer review information: Nature Communications thanks Fernando Azorin, electroporation. Yeast 18, 1015–1021 (2001). Gernot Presting and other anonymous reviewer(s) for their contribution to the peer 69. Craven, R. A. et al. Vectors for the expression of tagged proteins in review of this work. Peer reviewer reports are available. Schizosaccharomyces pombe. Gene 221,59–68 (1998). Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. Acknowledgements We thank Alastair Kerr, Shaun Webb and Daniel Robertson for bioinformatics support, David Kelly for microscopy support, Ken Sawin and Takeshi Urano for antibodies, and Open Access This article is licensed under a Creative Commons Kojiro Ishii, Ken Sawin and Nick Rhind for yeast strains. We thank Robert Lyons, Joe Attribution 4.0 International License, which permits use, sharing, Washburn, Christina McHenry (University of Michigan) and Greg J. Hannon, Richard adaptation, distribution and reproduction in any medium or format, as long as you give McCombie, Eric Antoniou and Sara Goodwin (CSHL) for PacBio sequencing. We are appropriate credit to the original author(s) and the source, provide a link to the Creative grateful to Chris Ponting for advice and comments on the manuscript and to Sandra Commons license, and indicate if changes were made. The images or other third party Catania and other members of the Allshire and Heun labs for helpful discussions. material in this article are included in the article’s Creative Commons license, unless N.R.T.T., R.A. and J.T.-G. were supported by the Darwin Trust of Edinburgh. The indicated otherwise in a credit line to the material. If material is not included in the Darwin Trust and a Principal’s Career Development scholarship supported N.R.T.T. P.T. article’s Creative Commons license and your intended use is not permitted by statutory was partly supported by funding from the European Commission Network of Excellence regulation or exceeds the permitted use, you will need to obtain permission directly from EpiGeneSys- (HEALTH-F4-2010-257082) and a Wellcome Enhancement Award the copyright holder. To view a copy of this license, visit http://creativecommons.org/ (095021) to R.C.A. R.C.A. is a Wellcome Principal Research Fellow (095021, 200885); the licenses/by/4.0/. Wellcome Centre for Cell Biology is supported by core funding from Wellcome (203149). C.A.M. and C.A.N. are supported by Biotechnology and Biological Sciences Research Council (BBSRC) grant BB/N016858/1 and Wellcome Investigator Award 110064/Z/15/ © The Author(s) 2019

NATURE COMMUNICATIONS | (2019) 10:2343 | https://doi.org/10.1038/s41467-019-09824-4 | www.nature.com/naturecommunications 13