bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 Early diverging fungus Mucor circinelloides lacks centromeric histone CENP-A and
2 displays a mosaic of point and regional centromeres
3
4 María Isabel Navarro-Mendoza1#, Carlos Pérez-Arques1#, Shweta Panchal2, Francisco E.
5 Nicolás1, Stephen J. Mondo3,4, Promit Ganguly2, Jasmyn Pangilinan3, Igor V. Grigoriev3,5,
6 Joseph Heitman6*, Kaustuv Sanyal2*, and Victoriano Garre1*
7 1Department of Genetics and Microbiology, Faculty of Biology, University of Murcia, Spain;
8 2Molecular Biology and Genetics Unit, Jawaharlal Nehru Centre for Advanced Scientific
9 Research, Jakkur, Bangalore, India; 3US Department of Energy Joint Genome Institute, Walnut
10 Creek, California, USA; 4Bioagricultural Science and Pest Management Department, Colorado
11 State University, Fort Collins, Colorado, 80521, USA; 5Department of Plant and Microbial
12 Biology, University of California Berkeley, Berkeley, CA 94598, USA; 6Department of
13 Molecular Genetics and Microbiology, Duke University Medical Center, NC, USA
14 #Equally contributed; *Corresponding author
15 Abstract
16 Centromeres are rapidly evolving across eukaryotes, despite performing a conserved
17 function to ensure high fidelity chromosome segregation. CENP-A chromatin is a hallmark of
18 a functional centromere in most organisms. Due to its critical role in kinetochore architecture,
19 the loss of CENP-A is tolerated in only a few organisms, many of which possess holocentric
20 chromosomes. Here, we characterize the consequence of the loss of CENP-A in the fungal
21 kingdom. Mucor circinelloides, an opportunistic human pathogen, lacks CENP-A along with
22 the evolutionarily conserved CENP-C, but assembles a monocentric chromosome with a
23 localized kinetochore complex throughout the cell cycle. Mis12 and Dsn1, two conserved
24 kinetochore proteins were found to bind nine short overlapping regions, each comprising an
1
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
25 ~200-bp AT-rich sequence followed by a centromere-specific conserved motif that echoes the
26 structure of budding yeast point centromeres. Resembling fungal regional centromeres, these
27 core centromere regions are embedded in large genomic expanses devoid of genes yet marked
28 by Grem-LINE1s, a novel retrotransposable element silenced by the Dicer-dependent RNAi
29 pathway. Our results suggest that these hybrid features of point and regional centromeres arose
30 from the absence of CENP-A, thus defining novel mosaic centromeres in this early-diverging
31 fungus.
32 Introduction
33 Accurate chromosome segregation is crucial to maintain genome integrity during cell
34 division. The timely attachment of microtubules to centromere DNA is essential to achieve
35 proper chromosome segregation. This is accomplished by a specialized multilayered protein
36 complex, the kinetochore which links microtubules to centromere DNA. This protein bridge is
37 divided into two layers – the inner and outer kinetochore. The fundamental inner kinetochore
38 protein is the histone H3 variant CENP-A. It binds directly to centromere DNA and lays the
39 foundation to recruit other essential proteins of the kinetochore complex, playing a fundamental
40 role in centromere structure and function, and hence, precise chromosome segregation1,2.
41 CENP-A is also found at all identified neocentromeres3 and at the active centromeres of
42 dicentric chromosomes4, acting as an epigenetic determinant of centromeric identity.
43 Despite its conserved function, the centromere is one of the most rapidly evolving
44 regions of the genome5. This so-called “centromere paradox” has led to centromeres of diverse
45 sizes and content. The first centromeres identified in Saccharomyces cerevisiae were found to
46 be point centromeres - small regions of ~120 bp defined by specific DNA sequences6,7. In
47 contrast to point centromeres described in only a few budding yeasts of the phylum
48 Ascomycota, most other fungi and metazoans have regional centromeres that are larger,
49 ranging from a few kilobases to several megabases8. Regional centromeres are often
2
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
50 interspersed with repetitive sequences and are mostly defined by epigenetic factors rather than
51 DNA sequence per se9. While the centromeres described thus far are confined to one region of
52 the DNA, there are organisms in which their entire chromosomes are loaded with kinetochores
53 and display a parallel separation of the two chromatids. Such chromosomes possess
54 holocentromeres and have been found in several clades of insects10 and nematodes like
55 Caenorhabditis elegans11. Despite this remarkable variation in the types of centromeres, most
56 organisms have their centromere loci defined by the binding of CENP-A. However, there are
57 a few exceptions, including some insect lineages and kinetoplastids, which have independently
58 lost CENP-A. The kinetoplastid Trypanosoma brucei has evolved a unique set of proteins to
59 perform the role of the kinetochore complex12, whereas all insects that have recurrently lost
60 CENP-A have transitioned from monocentricity to holocentricity10.
61 Neither holocentricity nor lack of CENP-A has ever been reported in the fungal
62 kingdom until a recent study showed that this protein could be absent in the Mucoromycotina
63 subphylum13, which belongs to the early diverging fungi. Kinetochore structure and centromere
64 function are well-studied in species of the Dikarya, whereas centromere identity as well as the
65 kinetochore architecture remain largely unexplored in basal fungi because these are challenging
66 to manipulate genetically. Mucor circinelloides, a fungus of the subphylum Mucoromycotina,
67 is an exception and molecular tools are available to modify its genome14. M. circinelloides, as
68 with other Mucorales, causes an infectious disease known as mucormycosis, which have been
69 associated with high mortality rates due to rhino-orbital-cerebral, pulmonary, or cutaneous
70 infections15. Despite its medical significance and research efforts to identify new antifungal
71 targets, the genome biology and cell cycle of this fungus remain poorly understood. Using
72 evolutionarily conserved kinetochore proteins as tools, we investigated the dynamics of the
73 kinetochore complex during nuclear division and unveiled the unique characteristics of the
74 centromeres of M. circinelloides. Studying distinct and divergent factors involved in the cell
3
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
75 cycle of this organism will open avenues for the development of species-specific antifungal
76 drugs.
77 Results
78 Last Mucoralean common ancestor lost CENP-A and CENP-C, and the descendant
79 Mucoromycotina lack centromere-specific histone variants
80 A recent study of kinetochore evolution in eukaryotes reported the striking absence of
81 two well conserved inner kinetochore proteins, CENP-A and CENP-C, in two species of the
82 Mucoromycotina13. To ascertain the presence or loss of well-studied kinetochore proteins
83 across the entire subphylum, we analyzed the curated proteomes of 75 fungal species, including
84 55 species of Mucoromycotina and 20 species from other fungal clades (Supplementary Table
85 1). The kinetochore protein sequences from Mus musculus, Ustilago maydis, S. cerevisiae, and
86 Schizosaccharomyces pombe served as queries to conduct iterative Blast and Hmmer searches
87 against these proteomes. These sequences were confirmed as putative orthologs by the presence
88 of conserved Pfam domains or by obtaining a matching first hit in a reciprocal Blastp search
89 against the initial four species: mouse, U. maydis, budding, and fission yeasts (Supplementary
90 Table 1, Supplementary Data 1).
91 To confirm the presence or absence of centromere-specific histone H3 CENP-A in each
92 fungal species, 289 histone H3 orthologs were analyzed, identifying 27 putative CENP-A
93 proteins (Supplementary Fig. 1, Supplementary Table 1, Supplementary Data 1). These
94 proteins share common features with well-known CENP-A proteins16, such as an N-terminal
95 tail that varies significantly in length and sequence compared to the canonical H3 counterparts
96 (Supplementary Fig. 1b), the absence of a conserved glutamine residue in the first α-helix
97 (69Q) and a phenylalanine residue (84F) (Supplementary Fig. 1c), a longer and divergent first
98 loop (Supplementary Fig. 1c), and a non-conserved C-terminal tail (Supplementary Fig. 1d).
4
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
99 In addition, the histone fold domain (HFD) sequences of the putative CENP-A proteins have
100 diverged rapidly from the conserved canonical H3 sequences16 (Supplementary Fig. 1e). Nine
101 additional H3 orthologs also share several CENP-A common features but were regarded as rare
102 histones because they lack these conserved substitutions. The analysis revealed that CENP-A
103 was absent in most of the Mucoromycotina clade, specifically in the orders Umbelopsidales
104 and Mucorales; nevertheless, we identified putative CENP-A proteins in all Endogonalean
105 species (Fig. 1). Similarly, CENP-C is present in the Endogonales and Umbelopsidales but not
106 in the Mucorales (Fig. 1). These findings suggest that the last Mucoromycotina common
107 ancestor possessed CENP-A, which was lost after the Endogonales diverged from the rest of
108 the clade. In the absence of CENP-A, CENP-C was presumably lost soon after. In addition,
109 CENP-Q is absent in all Mucoralean species, and CENP-U is lost in the whole subphylum.
110 These proteins have been recurrently lost across the fungal kingdom, as evidenced by their
111 absence in species of all phyla of early-diverging fungi and most species in Basidiomycota
112 (Fig. 1). The absence of orthologs that were present in closely-related species should be taken
113 with caution because this could be a result of the incomplete state of their genome assemblies
114 and annotation.
115 Despite these relevant losses, the remaining kinetochore proteins were well conserved
116 among most species of Mucoromycotina, especially in M. circinelloides. Therefore, we
117 hypothesized that another histone protein could have taken the role of CENP-A as a part of
118 centromeric nucleosome for kinetochore assembly. We searched the M. circinelloides genome
119 sequence for histone protein-coding genes and examined their sequences seeking distinctive
120 features of histone variants. In contrast to canonical mammalian histones H3.1 and H3.2, fungal
121 cell cycle-dependent histones may have intronic sequences and polyadenylation [poly(A)]
122 signals17,18. Thus, the analysis of possible variants focused on the presence of substitutions in
123 relevant amino acid residues targeted by post-translational modifications19. The M.
5
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
124 circinelloides genome contains four histone H3- and three histone H4-coding genes, named
125 histone H three (hht1-4) and histone H four (hhf1-3), respectively (Supplementary Fig. 2a),
126 and all of them feature a poly(A) signal. Three of the four histone H3-coding genes are
127 intronless and their protein sequence identical, except the intron-containing gene hht4
128 (Supplementary Fig. 2a). Also, the amino acid residues at positions 32, 43, and 97 in the Hht4
129 protein sequence differ compared to the conserved H3 sequence (Supplementary Fig. 2b).
130 Similarly, histone H4-coding genes hhf2 and hhf3 lack introns and encode identical proteins,
131 whereas hhf1 has intronic sequences (Supplementary Fig. 2a) and its product has a longer N-
132 terminal tail (Supplementary Fig. 2c).
133 To test if the minor differences observed in Hht4 and Hhf1 could have resulted in a
134 centromere-specific binding protein, the cellular localization was analyzed and compared to
135 their canonical histone counterpart Hhf3. We constructed alleles that contained each histone-
136 coding gene fused in-frame at the C-terminus with the enhanced green fluorescent protein
137 (eGFP) flanked by each of the histone gene native promoter and terminator sequences, and a
138 selectable marker (Supplementary Fig. 3). Homokaryotic strains expressing fluorescent
139 histones Hht4, Hhf1, or Hhf3 were obtained by transforming a wild-type strain with their
140 respective eGFP-tagged alleles. Integration by homologous recombination at the corresponding
141 native loci was confirmed by PCR (Supplementary Fig. 3, Supplementary Tables 2,3).
142 Asexual spores from these strains were observed with a confocal microscope under two
143 stages of germination: ungerminated and 4-h pre-germinated spores, that had started swelling
144 and forming germ tubes. In both conditions, all fluorescent histone fusions were localized
145 encompassing the entire nucleus instead of forming kinetochore-like nuclear clusters
146 (Supplementary Fig. 2d). Most species of the Mucoromycotina are coenocytic fungi and have
147 multinucleated spores, as does the M. circinelloides wild-type strain used in this study which
148 has large spores with a high number of nuclei20. We established that nuclear division in M.
6
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
149 circinelloides is asynchronous by observing the nuclear distribution of Hht4-eGFP protein in
150 pre-germinated spores during a time-lapse imaging session (Supplementary video 1).
151 Altogether, these findings indicate that either none of these histone variants functions like
152 CENP-A in kinetochore assembly or Mucor centromeres are holocentric in nature. We could
153 test these two possibilities by studying other kinetochore proteins that are evolutionarily
154 conserved and easily identifiable in M. circinelloides.
155 Mis12 and CENP-T complexes show discrete nuclear localization
156 Analyzing the nuclear localization and assembly dynamics of the kinetochore complex
157 of M. circinelloides could offer insights into its architecture. In the absence of CENP-A or
158 another similar histone H3 variant, and CENP-C, that could anchor the kinetochore formation,
159 we focused our attention on the evolutionarily conserved and obvious homologs of inner
160 kinetochore proteins that are expected to be closer to the centromere DNA in this organism
161 (Fig. 2a). We designed fluorescent kinetochore fusion proteins of Mis12, Dsn1, and CENP-T
162 because these are well-conserved in almost all species of the Mucoromycotina. First, we
163 constructed alleles to tag the Mis12 and Dsn1 proteins with the red fluorescent protein
164 mCherry, both at their C-termini and N-termini, aiming to integrate the alleles in their
165 endogenous loci following a similar strategy as for the epitope tagging of the histone genes
166 described above. Homologous recombination in the desired transformants was confirmed by
167 PCR, and six mutant colonies from the four tagged alleles were checked for fluorescent
168 localization. Unfortunately, fluorescent signals were not detected in any of the four different
169 tagged strains possibly due to low levels of expression.
170 As an alternative approach, the Mis12 and Dsn1 proteins were tagged with mCherry at
171 both ends and overexpressed from a strong promoter (Pzrt1) in M. circinelloides (Fig. 2b,
172 Supplementary Fig. 4). The alleles were designed to target integration into the carRP locus, a
173 gene encoding an enzyme involved in carotenogenesis. Thus, the mutant strains obtained
7
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
174 should lack colored-carotenes and display an albino phenotype14. After confirming the
175 integration of the alleles in the carRP locus by PCR (Supplementary Fig. 4a, b, Supplementary
176 Table 2), one strain harboring each construct was used as a parental strain to integrate the allele
177 for H3-eGFP expression at its own locus as previously described, allowing monitoring
178 colocalization of the kinetochore proteins within the nucleus. Once the double integration was
179 confirmed, two independent strains of each type were selected for fluorescent localization
180 (Supplementary Table 3).
181 Mis12 and Dsn1 displayed a clear localization signal forming small fluorescent puncta
182 within the nucleus marked by histone H3 (Fig 2c, d respectively). The microscopy screening
183 confirmed that the expression of both N-terminal and C-terminal tagged proteins was similar.
184 The Mis12 and Dsn1 localization patterns revealed a single cluster of kinetochores as observed
185 in many fungal species. This result strongly suggests that M. circinelloides has monocentric
186 chromosomes with a localized kinetochore even in the absence of CENP-A.
187 Next, another evolutionarily conserved inner kinetochore protein CENP-T, which is
188 known to interact with DNA, was tagged with eGFP at the C-terminus following the same
189 procedure used for the histone tagging and integrated by homologous recombination at the
190 endogenous locus in strains expressing the Mis12-mCherry and Dsn1-mCherry fusion proteins.
191 Integration of the alleles at the desired loci was confirmed by PCR in both parental backgrounds
192 (Supplementary Fig. 4c), and the localization of the kinetochore proteins was examined by
193 analyzing their fluorescent signal in pre-germinated spores. CENP-T colocalized with Mis12
194 and Dsn1 in each nucleus (Fig 2e, f, respectively), indicating that all three proteins assemble
195 together in the kinetochore ensemble. Time-lapse imaging during spore germination of the
196 double-tagged strains allowed the analysis of nuclear division and kinetochore structure during
197 the cell cycle in live cells. The Mis12 and Dsn1 proteins clustered at the nuclear periphery
8
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
198 during division (Fig 2g), a typical feature of fungal kinetochore proteins, suggesting all these
199 three proteins are constitutively present at the kinetochore during the cell cycle.
200 M. circinelloides exhibits small mosaic centromeres associated with RNAi-suppressed L1-
201 like retrotransposons
202 The centromere regions of early-diverging fungi have never been described, possibly
203 because of their genetic intractability. The generation of kinetochore-tagged strains of M.
204 circinelloides offered the opportunity to identify the nature of centromeres in an organism that
205 lost CENP-A and CENP-C. To map M. circinelloides centromeres, chromatin
206 immunoprecipitation followed by next generation sequencing (ChIP-seq) was performed with
207 Mis12- and Dsn1-mCherry proteins. Illumina sequencing was conducted in two independently
208 immunoprecipitated (IP DNA) samples for each kinetochore protein, one for Mis12-mCherry
209 and the other for Dsn1-mCherry, using RFP-Trap MA beads (Chromotek) for mCherry
210 immunoprecipitation. Input control samples consisting of total sheared DNA (Input DNA)
211 were also sequenced for each tagged strain, as well as mock binding control (beads only DNA)
212 samples with MA beads without antibody. Raw sequencing data were processed, and reads
213 were aligned to a newly generated PacBio genome assembly of Mucor circinelloides f.
214 lusitanicus MU402 strain (hereafter the M. circinelloides genome), the parental strain of the
215 kinetochore protein-tagged strains.
216 Both Mis12 and Dsn1 IP DNA sequence reads were compared to the corresponding
217 control Input DNA reads to identify statistically significant kinetochore protein-enriched
218 genomic loci in both IP DNA samples (FDR ≤ 5x10-5, fold enrichment ≥ 1.6; Supplementary
219 Table 4). The result of this analysis was visualized in a genome browser to ensure that the
220 enrichment was robust and absent in either the Input or beads only DNA controls
221 (Supplementary Fig. 5). By this approach, we detected nine significantly enriched peaks that
222 overlapped in both Mis12 and Dsn1 IP DNA samples, and thus define core centromeres (CCs)
9
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
223 (Fig 3a). Each of these core CEN sequences was located in a different scaffold of the genome
224 assembly (Fig. 3b) and was designated by their scaffold number. The identification of five
225 repeated regions matching the telomeric sequence (5’-TTAGGG-3’) at the 3’-end of scaffolds
226 2, 3, 8, 9, and 10, suggested that CC2 and CC9 display a subtelomeric localization (Fig. 3b).
227 Because the length of the overlapping peaks was minimally different in the two kinetochore IP
228 DNA samples, DNA sequence enriched with at least one of the kinetochore proteins was
229 assumed to be the length of kinetochore binding region on each chromosome. In addition,
230 contiguous enriched regions were added to each CC, defining the core centromere regions
231 (Supplementary Table 4).
232 The average length of core centromere regions of M. circinelloides is 941 bp (Fig. 3c).
233 These unusually short regions are consistently AT-rich (71% average) (Fig. 3d), resembling
234 point centromeres of S. cerevisiae and its related budding yeast species. Further analysis
235 revealed a 41-bp DNA sequence motif conserved in all nine kinetochore protein-bound core
236 centromere regions (Fig. 3e). An extensive search of the presence of this motif across the
237 whole genome revealed that the 41-bp motif is centromere-specific in M. circinelloides.
238 Moreover, a closer inspection of the core centromere regions revealed a conserved pattern.
239 Each core centromere is composed of a minor peak enriched of kinetochore proteins spaced by
240 a highly AT-rich stretch spanning the major binding peak of kinetochore proteins that starts
241 with the 41-bp conserved motif (Fig. 3f).
242 Surrounding the core centromere regions, both upstream and downstream, are stretches
243 of genomic sequences completely devoid of genes. These pericentric regions are considerably
244 larger than the core and vary in length, ranging from 15 to 75 kb (Fig. 4a). These pericentric
245 regions contain a variable number of large repeats oriented away from the core centromere
246 regions (Supplementary Fig. 6). These repeats cluster in nine groups according to their
247 similarity (≥ 95% identity and ≥ 70% coverage across their entire sequence). Clusters 1, 2 and
10
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
248 3 contain all of the full-length elements, which were termed autonomous elements, while the
249 remaining clusters are composed of incomplete repeats or remnants (Supplementary Fig. 7a).
250 A pairwise comparison of all of the repeats revealed a high similarity in their aligned sequences
251 (≥ 90% identity), though the truncated sequences of remnants showed higher differences
252 accounting for the gaps in the alignment.
253 Each autonomous repeat comprises two sequential open reading frames (ORFs) with
254 an overlapping codon. The first ORF encodes a product containing several zinc finger domains,
255 and the second codes for a protein with AP endonuclease and reverse transcriptase domains
256 (Fig 4b, Supplementary Table 5). Based on their high similarity, their ORF domain
257 architecture, the presence of a poly(A) signal at their 3’-end, and the absence of long terminal
258 repeats (LTR) at either end, they were classified as a single non-LTR retrotransposon.
259 Phylogenetic profiling revealed that these retrotransposons belong to the LINE1 clade
260 (Supplementary Fig. 7b) and are closely related to other LINE1-like retrotransposons of
261 Mucoromycotina species, so they were termed Genomic Retrotransposable Element of
262 Mucoromycotina (Grem) LINE1-like. A search of the M. circinelloides genome revealed that
263 the Grem-LINE1s are found specifically in the pericentromeric regions; or at the non-telomeric
264 ends of three small scaffolds, oriented away from the end (Supplementary Table 5). Four major
265 scaffolds end abruptly in a putative centromere region, suggesting that they may be linked to
266 these minor scaffolds that contain the Grem elements at one end. Broadening the search across
267 55 genomes of species of Mucoromycotina identified the Grem elements in most Mucorales
268 and Umbelopsidales (≥ 30% identity and ≥ 50% coverage), but they could not be found in any
269 of the Endogonalean genomes, indicating an inverse correlation between the presence of Grem-
270 LINE1s and CENP-A (Fig. 1 and Supplementary Table 6).
271 RNAi silences expression and prevents transposition of retrotransposable elements in
272 several fungal species featuring functional RNAi21,22. M. circinelloides possess an elaborate
11
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
273 RNAi mechanism that protects the genome against invading elements, that could prevent the
274 movement of retrotransposable elements. The two Dicer enzymes, particularly Dcl2, and one
275 of the three Argonaute proteins (Ago1), play a pivotal role in this protective RNAi-related
276 pathway23. To analyze the transcriptional landscape of the pericentric regions, previously
277 generated and publicly available transcriptomic raw data for mRNA and small RNAs were
278 reprocessed and aligned to the M. circinelloides genome assembly. The pericentric regions
279 were almost devoid of transcription in the wild-type strain, although the retrotransposons were
280 modestly expressed. Indeed, a high number of sRNAs corresponding to the retrotransposons
281 were detected (Fig. 4c, Supplementary Fig. 6), indicating an inverse correlation between
282 mRNA and sRNA levels. This inverse correlation was confirmed by analyzing the
283 transcriptomic profile of mutant strains lacking essential components of the RNAi machinery,
284 Ago1 or Dcl1/Dcl2. In these RNAi-deficient mutants, the pattern was reversed; high mRNA
285 levels from the retrotransposons were observed, while the production of sRNAs originating
286 from those regions was considerably lower than in the wild-type strain. These findings suggest
287 that the retrotransposons contained in the pericentric regions of M. circinelloides are silenced
288 by the Dicer-dependent RNAi pathway.
289 In most organisms, deposition of CENP-A at the centromere-specific nucleosomes is
290 correlated to depletion of histone H324–26. Because CENP-A is absent in M. circinelloides, we
291 quantified the presence of histone H3 at the centromere-specific nucleosomes by performing
292 ChIP with anti-histone H3 antibodies and a qPCR analysis. The occupancy of histone H3 was
293 examined by qPCR in four regions across the scaffold_2: the spanning kinetochore protein-
294 binding region (core centromere, CC2), the ORF-free pericentromeric region, the centromere-
295 flanking ORFs and the control region far-CEN ORF ~2 Mb away from the centromere (Fig.
296 4d). There are no significant differences in histone H3 occupancy across the analyzed regions,
12
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
297 indicating that H3 is not depleted at either the kinetochore protein binding region or the
298 pericentric region compared to the flanking ORFs and the control far-CEN ORF (Fig. 4d).
299 Discussion
300 Mucoralean species are the causative agents of mucormycosis, an emerging fungal
301 infection with extremely poor clinical outcomes, probably as a consequence of their innate
302 resistance to all current antifungal drugs27,28 and their ability to evade host innate immunity29,30.
303 In spite of their clinical relevance, limited information is available on essential biological
304 processes in these fungal pathogens, especially nuclear division and cell cycle. In this study,
305 we identified major kinetochore components in M. circinelloides, and utilized them to identify
306 and characterize nine chromosomal loci as centromeres.
307 In most eukaryotes, the centrochromatin is composed of nucleosomes containing the
308 histone H3 variant CENP-A as a hallmark of centromere identity. The most striking
309 observation of our study was the absence of this centromere determining protein in all
310 Mucoralean species. Absence of CENP-A was accompanied by loss of CENP-C, while most
311 other kinetochore protein orthologs were present in these species. Loss of a previously-thought
312 indispensable protein like CENP-A raised questions on the evolutionary bases of centromere
313 and kinetochore structure. Absence of CENP-A had been previously observed in certain insect
314 orders including butterflies and moths, bugs and lice, earwigs, and dragonflies which in all
315 known cases led to a transition to holocentricity, suggesting that the presence of a centromere-
316 specific protein is dispensable for segregation of a chromosome with localized kinetochore10.
317 Our study is the first to demonstrate loss of core kinetochore proteins in the fungal
318 kingdom, reinforcing that the absence of CENP-A can be tolerated in eukaryotic organisms.
319 Surprisingly, punctate localization patterns of other conserved kinetochore proteins including
320 Mis12, Dsn1, and CENP-T throughout the cell cycle indicate that these organisms still retain a
13
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
321 monocentric arrangement of a canonical kinetochore structure, as opposed to the holocentric
322 insects that lack CENP-A. Kinetochore proteins in M. circinelloides were observed to be
323 localized at the periphery of each nucleus in the asexual spores as well as mycelia. Three types
324 of nuclear divisions have been studied in multinucleated fungi – synchronous, asynchronous,
325 and parasynchronous31. In all ascomycetes except S. pombe and Zymoseptoria tritici,
326 kinetochore proteins are centromere-localized and clustered throughout the cell cycle32,33. In
327 the basidiomycete C. neoformans, kinetochore proteins transition between clustering and
328 declustering states in different stages of the cell cycle34.
329 We have observed asynchronous nuclear divisions during germination of asexual
330 spores of M. circinelloides, a multinucleated and coenocytic fungus. The kinetochore was
331 found to be consistently present as a cluster in all stages of mitosis, implying that the basic
332 kinetochore structure in this organism is constitutively assembled throughout the cell cycle.
333 Overall, these findings hint that the role of CENP-A is being carried out by either a distinct
334 protein that has a newly evolved function or by one of the known kinetochore proteins like
335 CENP-T, which also has a histone-fold DNA binding domain. We also tried to find whether
336 one of the histone H3 variants plays a role of centromeric histone, but none of them displayed
337 kinetochore-like localization, indicating that they are probably not involved in replacing
338 CENP-A function in chromosome segregation mediated by the kinetochore machinery.
339 A ChIP-based analysis using Dsn1 and Mis12 revealed the first glimpse of the
340 centromere structure of this Mucoromycotina species. The centromeres in this organism
341 possess features of both point and regional centromeres, resulting in a unique mosaic; thus, we
342 named them mosaic centromeres (Fig. 5). One of their distinctive features is a small
343 kinetochore protein binding domain (~750-1150 bp) defined by the presence of an AT-rich
344 sequence stretch (~70-80% AT, ~100 bp), which is followed by a conserved centromere-
345 specific 41-bp sequence motif. This motif is only found in M. circinelloides centromeres and
14
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
346 marks the start of the major kinetochore-binding region, suggesting that it could be a binding
347 site for the unknown centromere-specific protein. Overall, this pattern resembles the
348 distribution of conserved elements in S. cerevisiae point centromeres, in which two conserved
349 Centromere Determining Elements (CDEI and CDEIII) are separated by an AT-rich non-
350 conserved DNA sequence stretch (CDEII)7.
351 Other features of M. circinelloides mosaic centromeres are reminiscent of fungal
352 regional centromeres like the ones described in Candida albicans35. The pericentric regions of
353 M. circinelloides are large (~15-75 kb), gene-free, and transcriptionally silenced sequences that
354 are interspersed by Grem-LINE1s, which are repeats of a LINE1-like non-LTR
355 retrotransposable element. These types of retroelements have been identified in the
356 centromeres and neocentromeres of highly diverged eukaryotes, especially LINE1-like
357 elements in mammals36–41 and also other non-LTR retroelements as active components of fruit
358 fly centromeres42, and LTR retrotransposons in plants43 and fungi22. Based on these findings,
359 a model has been proposed that CENP-A is recruited by genomic sequences rich in
360 retrotransposable elements and thus, gives rise to the centromeres44. M. circinelloides is the
361 first organism to display an enrichment in retrotransposable elements in their centromeric
362 regions in the absence of CENP-A, suggesting that these non-LTR elements determine the
363 formation of centromeres. Thus, we hypothesize that identifying Grem clusters in other species
364 of Mucoromycotina could pinpoint the location of their putative pericentromeric regions.
365 RNAi machinery is essential to control the mobility of transposable elements, but also
366 to maintain centromere structure and stability22. Fungi that have lost essential components of
367 the RNAi machinery have shorter pericentric regions and fewer transposable elements than
368 closely related RNAi-proficient species. M. circinelloides has a functional and complex RNAi
369 system with Dicer-dependent and -independent pathways23. Our results demonstrate that the
15
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
370 expression of the centromeric Grem-LINE1s is suppressed by the Dicer-dependent RNAi
371 pathway, which may have resulted in accumulation of these elements in the centromeres22.
372 The composition of centromere-specific nucleosomes has been debated and several
373 conflicting models have been proposed, which include a hemisome in yeast and Drosophila
374 with one CENP-A/H4 and one H2A/H2B dimer45,46, a conventional octamer with two CENP-
375 A/H4/H2A/H2B47, hexasomes with two CENP-A/H4/Scm3, and mixed octasomes which have
376 one molecule of H3 with CENP-A48. In addition, H3 nucleosomes have been shown to be
377 interspersed with CENP-A nucleosomes in certain plants and animals49–51. We observed the
378 presence of H3 at the relatively short kinetochore protein-binding region at a level that is
379 similar to the pericentric region or gene bodies. Because histone H3 was found to be enriched
380 at the core centromere, we propose that either a homotypic histone H3 nucleosome or a
381 heterotypic histone H3 nucleosome exists where CENP-A is replaced with some other protein.
382 Further studies will provide the exact composition of centromere-specific nucleosomes in this
383 basal fungus.
384 Overall this study has uncovered for the first time the centromeres of an early-diverging
385 fungus, which are characterized by a mosaic of features from both point and regional
386 centromeres found in the fungal kingdom. Several studies on rapidly-evolving centromeres
387 suggest that point centromeres may be a recently evolved state as compared to regional
388 centromeres9. Therefore, we hypothesize that the last mucoralean common ancestor had
389 regional centromeres with pericentric regions that harbored retrotransposable elements
390 contained by an active RNAi machinery. The loss of CENP-A triggered an evolutionary
391 rearrangement from typical regional centromeres to the smaller kinetochore-binding regions
392 that defines the mosaic centromeres of M. circinelloides.
16
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
393 Methods
394 Fungal strains and culture conditions
395 All the strains used and generated in this work derive from M. circinelloides f.
396 lusitanicus CBS277.49. The double auxotroph MU40214 (Ura-, Leu-) or the MU636 (Ura+, Leu-
397 ) strain was used for transformation with cassettes containing the selectable marker for uracil
398 pyrG or for leucine leuA, respectively. Single tagged strains in kinetochore proteins MU840,
399 MU842, MU844, and MU846 were used as parental strains for transformation with the Hht4-
400 eGFP cassette to obtain double-tagged strains. All the generated strains in this work are listed
401 in Supplementary Table 3.
402 After transformation, to select uracil prototrophy M. circinelloides spores were plated
403 in minimal medium with Casamino Acids (MMC)14, and for leucine prototrophy spores were
404 plated in medium yeast nitrogen base (YNB)14, both adjusted at pH 3.2 for transformation and
405 colony isolation. M. circinelloides spores were cultured in rich medium yeast-peptone-glucose
406 (YPG)14 for optimal growth and sporulation at pH 4.5. The microscopy and ChIP assays were
407 performed with pre-germinated spores growing in YPG pH 4.5 for 3 hours at 250 rpm and 26
408 ºC.
409 Microscopy imaging and acquisition
410 Freshly isolated M. circinelloides spores were washed twice with 1X phosphate
411 buffered saline (PBS). The spores were resuspended in PBS and 10 μl suspension was placed
412 on a slide, covered with a coverslip and sealed with clear nail polish. The images were acquired
413 at room temperature using laser scanning inverted confocal microscope LSM 880-Airyscan
414 (Zeiss, Plan Apochromat 63x, NA oil 1.4) equipped with highly sensitive photo-detectors. The
415 filters used were GFP/FITC 488, mCherry 561 for excitation and GFP/FITC 500/550 band
416 pass, mCherry 565/650 long pass for emission. Z- stack images were taken at every 0.5 μm and
17
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
417 processed using Zeiss Zen software. All the images were displayed after the maximum intensity
418 projection of images using Zeiss Zen software.
419 Live cell imaging
420 Freshly isolated M. circinelloides spores were incubated in YPG growth medium at
421 26ºC for 2 h. Then, the spores to be imaged were pelleted at 3000 xg and washed twice with
422 1X phosphate buffered saline (PBS). The spores were resuspended in PBS and 10μl suspension
423 was placed on a slide containing thin 2% agarose patch containing 2% dextrose and covered
424 with a coverslip. Live cell imaging was performed at 30ºC in a temperature-controlled chamber
425 of an inverted confocal microscope (Zeiss, LSM-880) using a Plan Apochromat 100X NA oil
426 1.4 objective and GaAsp photodetectors. Images were collected at 60-s intervals with 0.5 µm
427 Z‐steps using GFP/FITC 488, mCherry 561 for excitation. GFP/FITC 500/550 band pass or
428 GFP 495/500 and mCherry 565/650 or mCherry 580‐750 band pass for emission. All the
429 images were displayed after the maximum intensity projection of images at each time using
430 Zeiss Zen software.
431 Ortholog search
432 Seventy five fungal proteomes were retrieved from the Joint Genome Institute (JGI)
433 Mycocosm genome portal52 (Supplementary Table 1). BLAST+53 v2.7.1 individual protein-
434 protein BLASTp, iterative BLASTp (PSI-BLAST) and iterative HMMER v3.2.1
435 (http://www.hmmer.org) jackhammer searches (E-value ≤ 1x10-5) were conducted against
436 these proteomes using M. musculus, U. maydis, S. cerevisiae, and S. pombe protein sequences
437 (Supplementary Table 1) as queries. In addition, an hmmsearch was launched using the Pfam-
438 A54 HMM models for each kinetochore protein [Pfam gathering threshold (GA) as cut-off
439 value]. All the putative orthologs found were used to perform a reciprocal BLASTp against M.
440 musculus, U. maydis, S. cerevisiae, and S. pombe protein sequences. Also, Pfam-A protein
18
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
441 domains were searched in all the possible orthologs using HMMER hmmscan. Protein
442 sequences that either lacked appropriate Pfam domains or failed to produce a hit in a reciprocal
443 blastp search were discarded, resulting in a matrix of putative orthologs for each given species
444 (Supplementary Table 1). The cladogram showing the relationship among species was drawn
445 using the tree data from JGI Mycocosm52 and the interactive Tree of Life55 (iTOL v4) tool. To
446 determine CENP-A presence or absence in each given species, all 289 protein sequences of
447 histone H3 putative orthologs (Supplementary Table 1) were aligned using MUSCLE56
448 v3.8.1551, and the phylogenetic relationship of their HFD was inferred by the neighbor-joining
449 method employing the Jones, Taylor, and Thornton (JTT) substitution model57, with a bootstrap
450 procedure of 1,000 iterations (MEGA58 v10.0.5).
451 Construction of the strains expressing the histone and kinetochore proteins with the
452 fluorescent tag
453 Alleles containing fluorescent fusion protein coding-sequences were designed to either
454 integrate them in their corresponding endogenous locus, or in the carRP locus. The Hht4-eGFP,
455 Hhf1-eGFP, Hhf4-eGFP and CENP-T-eGFP alleles contained the ORF of each gene fused in-
456 frame at their C-termini with the eGFP sequence, followed by the leuA gene as a selectable
457 marker. The alleles were obtained by overlapping PCR using 5’-modified primers harboring
458 restriction sites for easy cloning. Briefly, each allele was obtained by fusing the 1-kb fragment
459 upstream the stop codon of each gene, the 0.7-kb eGFP fragment (ending in a stop codon), the
460 3.4-kb leuA fragment, and the 1-kb sequence downstream the stop codon of each tagged gene.
461 All the linear alleles were digested with appropriate restriction enzymes, ligated into the
462 plasmid Bluescript SK (+), and cloned into Escherichia coli.
463 The Mis12-mCherry and Dsn1-mCherry constructions were designed to allow the
464 integration and ectopic overexpression of the fused proteins in the carRP locus. First, a
19
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
465 universal mCherry-expression vector was constructed, containing the strong promoter Pzrt1,
466 the ORF of the mCherry coding-gene including the stop codon, and the pyrG gene as a
467 selectable marker (named pMAT1915). To do that, a fragment containing the 1-kb downstream
468 and upstream sequences from the carRP gene flanking the Pzrt1 promoter was amplified with
469 an inverse PCR using the plasmid pMAT147714 as a template. Then, the mCherry and pyrG
470 fragments were added by In-Fusion cloning (Takara) using primers with appropriate
471 overlapping 5’-tails (Supplementary Table 2). Once the universal pMAT1915 vector was
472 constructed, it was used for the C-terminal tagging of mis12 or dsn1, fusing their ORFs without
473 the stop codon upstream and in-frame with the mCherry ORF by In-Fusion cloning. The
474 resulting plasmids were pMAT1917 and pMAT1919, respectively for mis12 and dsn1. For the
475 N-terminal tagging, the plasmid pMAT1915 was inverse-amplified using primers that exclude
476 the stop codon of the mCherry fragment, and the mis12 and dsn1 ORFs, including their stop
477 codons, were cloned in-frame. The resulting plasmids were pMAT1916 and pMAT1918 for
478 mCherry-Mis12 and mCherry-Dsn1 expression, respectively.
479 StellarTM competent cells (Takara) were used for transformation by heat-shock
480 following the supplier procedures. The correct construction of all the generated plasmids was
481 confirmed by restriction analysis and Sanger sequencing.
482 For transformation, the alleles were released from the vector backbone by restriction
483 digestion and used to transform M. circinelloides protoplasts by electroporation14. Spores from
484 individual colonies were collected and plated in selective media to ensure homokaryosis. After
485 at least five vegetative cycles, the integration in the target locus was confirmed by PCR using
486 primers that amplified the whole locus (Supplementary Table 2), producing PCR-fragments
487 that allowed discrimination between the mutant and the wild-type locus (Supplementary Fig.
488 3, 4). Genomic DNA purification was performed following the procedure previously
489 published59.
20
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
490 Chromatin immunoprecipitation and sequencing
491 The ChIP protocol for M. circinelloides was adapted from previous studies in other
492 fungal models22. Briefly, spores from the strains MU842 (expressing Mi12-mCherry) and
493 MU846 (expressing Dsn1-mCherry) were used for independent IP experiments. 107 spores/mL
494 of these strains were germinated in 100 ml of liquid YPG pH 4.5 medium, shaking at 250 rpm
495 at 26ºC for 3 hours. Then, cells were fixed by adding formaldehyde at a final concentration of
496 1% for 20 minutes. The fixation was quenched by adding glycine to a final concentration of
497 135 mM. Cells were pelleted and ground with liquid nitrogen, and 150 mg of ground pellet was
498 resuspended in ChIP lysis Buffer (50 mM HEPES pH 7.4, 150 mM NaCl, 1% Triton X-100,
499 0.1% DOC, and 1 mM EDTA), with a protease inhibitor cocktail. The sample was sonicated
500 using a Bioruptor (Diogenode) with pulses of 30s ON/OFF for 50 cycles to obtain a chromatin
501 fragmentation between 100 and 300 bp.
502 The lysates were clarified by centrifugation at 12,000 xg for 10 min at 4ºC. At this step,
503 100 μL were frozen at -20ºC for Input DNA control. The samples were then divided into 450
504 μL for the antibody immunoprecipitation and 450 μL for the mock control. For IP 25 μL of
505 Red Fluorescent Protein (RFP)-Trap Magnetic Agarose (MA) beads (Chromotek) were added
506 and 25 μL of MA beads (Chromotek) to the binding control. Samples were incubated at 4ºC
507 overnight. After incubation, the beads were washed twice with 1 mL of the low salt wash buffer
508 (20 mM Tris pH 8.0, 150 mM NaCl, 0.1% SDS, 1 % Triton X-100, and 2 mM EDTA), twice
509 with 1 mL of the high salt wash buffer (20 mM Tris pH 8.0, 500 mM NaCl, 0.1% SDS, 1%
510 Triton X-100 and 2 mM EDTA), once with the LiCl wash buffer (10 mM Tris pH 8.0, 1 mM
511 EDTA, 0.25 M LiCl, 1% NP40, and 1% DOC), and finally once with TE (10mM Tris pH 8.0,
512 and 1 mM EDTA). To elute the DNA, 250 μL of TES (50 mM Tris pH 8.0, 10 mM EDTA, 1%
513 SDS) was added to each sample and incubated for 10 min at 65ºC, twice. The
514 immunoprecipitated samples and the Input controls were de-crosslinked by incubating at 65ºC
21
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
515 overnight. The samples were treated with 199 μg of RNase A (Sigma Aldrich) and 190 μg of
516 Proteinase K (Sigma Aldrich) for 2 hours at 50ºC. For DNA purification, 1 volume of
517 phenol:chloroform:isoamyl alcohol (25:24:1) was added, the aqueous phase was recovered,
518 and then 1 volume of chloroform:isoamyl alcohol (24:1) was added. After centrifugation, the
519 aqueous phase was recovered, and DNA was precipitated with 20 μg of glycogen (Thermo
520 Fisher), 1/10 volume of Na-Acetate 3 M (pH 5.2), and 1 volume of ethanol 100%. Pulled DNA
521 from duplicated samples were precipitated together for sequencing. Samples incubated at -20ºC
522 overnight were centrifuged for 10 min at 12,000 xg and the pellets were washed with ethanol
523 70%. Each sample was resuspended in 30 μL of MilliQ water. The quality of the DNA was
524 analyzed by QuBit (Thermo Fisher) and sent to Novogene, which prepared the libraries using
525 TruSeq ChIP Library Prep Kit and sequenced them with Illumina Hiseq 2500 High-Output v4
526 single-end reads of 50 bp.
527 Histone H3-ChIP assays were performed in the same way with the following changes.
528 The protoplasts were obtained as described before60 and then sonicated as above. Native ChIP
529 was performed22 using anti-H3 antibody (Abcam ab1791). The DNA pellet was dissolved in
530 25 μl of MilliQ water. All samples (Input, IP with or without antibodies) were used for PCR.
531 The Input and IP DNA was subsequently used for qPCR using the primers listed in
532 Supplementary Table 2b. Histone H3 enrichment was determined by the percentage input
533 method using the formula:100 푥 2푎푑푗푢푠푡푒푑 퐶푡 퐼푛푝푢푡−푎푑푗푢푠푡푒푑 퐶푡 퐼푃 (ref). The adjusted Ct value is
534 the dilution factor (log2 of dilution factor) subtracted from the Ct value of the Input or IP DNA
535 samples. Three technical replicates were taken for qPCR analysis and standard error of mean
536 was calculated.
537 Sequencing data analysis
22
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
538 Raw reads from Chromatin Immunoprecipitation and RNA sequencing (ChIP- and
539 RNA-seq) were quality-checked with FASTQC v0.11.8 and adapters removed with Trim
540 Galore! v0.6.2 (http://www.bioinformatics.babraham.ac.uk/projects/). Reads were aligned to
541 the Mucor circinelloides f. lusitanicus MU402 reference genome Muccir1_3 (referred to as the
542 M. circinelloides genome throughout the text, available at JGI
543 http://genome.jgi.doe.gov/Muccir1_3/Muccir1_3.home.html), using the Burrows-Wheeler
544 Aligner61 (BWA v0.7.8) algorithm for ChIP- and sRNA-seq reads and STAR62 v2.7.1a for
545 mRNA-seq data. The number of overlapping aligned reads per 25-bp bin was used as a measure
546 of coverage, obtaining normalized bigWig coverage files with deepTools63 v3.2.1
547 bamCoverage function. For RNA-seq data, coverage was normalized to bins per million
548 mapped reads (BPM). For ChIP-seq data, reads were normalized per genomic content (1x
549 normalization). ChIP enrichment peaks were identified by Model-based Analysis of ChIP-seq64
550 (MACS2 v2.1.1) callpeak function and sorted by fold-enrichment and FDR values
551 (Supplementary Table 4). Enriched peaks (fold-enrichment ≥ 1.6, FDR ≤ 5x10-5) were
552 manually curated by analyzing the IP coverage with the Integrative Genomics Viewer65 (IGV
553 v2.4.1) genome browser, ensuring that peaks were present in both Mis12 and Dsn1 IP samples,
554 and absent in either the input and binding controls. The resulting centromeric sequences were
555 analyzed with Multiple Em for Motif Elicitation66 (MEME v5.0.5) to discover conserved DNA
556 motifs on both strands. GC content every 25-bp bin was calculated with bedTools67 v2.28.
557 Normalized coverage, GC content and genetic elements annotation files were rendered with
558 either Gviz68 v1.29.0 for whole-chromosome visualization or deepTools pyGenomeTracks
559 module for close-in genomic views.
560 Transposable element analysis
561 The M. circinelloides genome assembly of strain MU402 was searched for repeats using
562 RepeatScout69 v1.0.5 and RepeatMasker (http://www.repeatmasker.org) v4.0.9. The repeats
23
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
563 located in the pericentromeric regions were aligned using MAFFT70 v7.407 and clustered with
564 CD-HIT71 v4.8.1, generating groups of sequences that shared ≥ 95% identity and ≥ 70%
565 coverage and calculating their p-distance (nucleotide changes per site). A BLASTn search
566 against M. circinelloides genome assembly was conducted using a representative repeated
567 sequence from each cluster to locate similar sequences across the entire genome, obtaining hits
568 with ≥ 80% identity and ≥ 20% coverage (Supplementary Table 5). The pericentromeric repeats
569 were examined with getorf from Emboss72 v6.6.0, obtaining all ORFs ≥ 600 nt. Protein
570 domains in these ORFs translated sequences were predicted by hmmsearch using the Pfam-A
571 v32.0 database. The reverse transcriptase (RVT) domains found in the repeated sequences were
572 aligned with MUSCLE to create a consensus sequence, and this consensus sequence was
573 aligned with 281 RVT domains of well-known non-LTR retrotransposons from Repbase73
574 24.04. A neighbor-joining phylogenetic tree was inferred from this alignment using the JTT
575 substitution model and a bootstrap procedure of 1000 iterations (MEGA v10.0.5). To assess
576 the presence of the retrotransposable elements in other species of Mucoromycotina, a tBLASTn
577 search was launched against the 55 available genomes at JGI (Supplementary Table 6) using
578 the translated sequence of both ORFs combined, recovering hits with ≥ 50% coverage and E-
579 value ≤ 10-5.
580 Data availability
581 ChIP-seq raw data and processed files are deposited at the Gene Expression Omnibus
582 (GEO) repository under GSE132687 accession number. Open access will be granted upon
583 publication and they are provisionally available upon token request. Both mRNA and sRNA-
584 seq data were already available at the Sequence Read Archive (SRA) and can be accessed with
585 the following run accession numbers: SRR1611144 (R7B wild-type strain mRNA),
586 SRR1611151 (ago1 deletion mutant strain mRNA), SRR1611171 (double dcl1 dcl2 deletion
24
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
587 mutant strain mRNA), SRR039123 (R7B wild-type strain sRNA), SRR836082 (ago1 deletion
588 mutant strain sRNA), SRR039128 (double dcl1 dcl2 deletion mutant strain sRNA).
589
590 References
591 1. McKinley, K. L. & Cheeseman, I. M. The molecular basis for centromere identity and
592 function. Nat. Rev. Mol. Cell Biol. 17, 16–29 (2016).
593 2. Cheeseman, I. M. The kinetochore. Cold Spring Harb. Perspect. Biol. 6, a015826
594 (2014).
595 3. Marshall, O. J., Chueh, A. C., Wong, L. H. & Choo, K. H. A. Neocentromeres: new
596 insights into centromere structure, disease development, and karyotype evolution. Am.
597 J. Hum. Genet. 82, 261–82 (2008).
598 4. Earnshaw, W. C. & Migeon, B. R. Three related centromere proteins are absent from
599 the inactive centromere of a stable isodicentric chromosome. Chromosoma 92, 290–6
600 (1985).
601 5. Henikoff, S. The centromere paradox: stable inheritance with rapidly evolving DNA.
602 Science (80-. ). 293, 1098–1102 (2001).
603 6. Clarke, L. & Carbon, J. Isolation of a yeast centromere and construction of functional
604 small circular chromosomes. Nature 287, 504–9 (1980).
605 7. Clarke, L. & Carbon, J. The structure and function of yeast centromeres. Annu. Rev.
606 Genet. 19, 29–55 (1985).
607 8. Friedman, S. & Freitag, M. Centrochromatin of Fungi. in 85–109 (2017).
608 doi:10.1007/978-3-319-58592-5_4
609 9. Malik, H. S. & Henikoff, S. Major evolutionary transitions in centromere complexity.
610 Cell 138, 1067–1082 (2009).
611 10. Drinnenberg, I. A., DeYoung, D., Henikoff, S. & Malik, H. S. Recurrent loss of CenH3
25
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
612 is associated with independent transitions to holocentricity in insects. Elife 3, 717–750
613 (2014).
614 11. Steiner, F. A. & Henikoff, S. Holocentromeres are dispersed point centromeres localized
615 at transcription factor hotspots. Elife 3, (2014).
616 12. Akiyoshi, B. & Gull, K. Discovery of unconventional kinetochores in kinetoplastids.
617 Cell 156, 1247–1258 (2014).
618 13. van Hooff, J. J., Tromer, E., van Wijk, L. M., Snel, B. & Kops, G. J. Evolutionary
619 dynamics of the kinetochore network in eukaryotes as revealed by comparative
620 genomics. EMBO Rep. 18, e201744102 (2017).
621 14. Nicolás, F. E. et al. Molecular tools for carotenogenesis analysis in the mucoral Mucor
622 circinelloides. Methods Mol. Biol. 1852, 221–237 (2018).
623 15. Prakash, H. & Chakrabarti, A. Global Epidemiology of Mucormycosis. J. Fungi 5,
624 (2019).
625 16. Malik, H. S. & Henikoff, S. Phylogenomics of the nucleosome. Nat. Struct. Biol. 10,
626 882–891 (2003).
627 17. Yun, C. S. & Nishida, H. Distribution of introns in fungal histone genes. PLoS One 6,
628 4–7 (2011).
629 18. Osley, M. The regulation of histone synthesis in the cell cycle. Annu. Rev. Biochem. 60,
630 827–861 (1991).
631 19. Luense, L. J. et al. Comprehensive analysis of histone post-translational modifications
632 in mouse and human male germ cells. Epigenetics Chromatin 9, 24 (2016).
633 20. Li, C. H. et al. Sporangiospore size dimorphism is linked to virulence of Mucor
634 circinelloides. PLoS Pathog. 7, e1002086 (2011).
635 21. Smith, K. M., Phatale, P. A., Sullivan, C. M., Pomraning, K. R. & Freitag, M.
636 Heterochromatin is required for normal distribution of Neurospora crassa CenH3. Mol.
26
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
637 Cell. Biol. 31, 2528–42 (2011).
638 22. Yadav, V. et al. RNAi is a critical determinant of centromere evolution in closely related
639 fungi. Proc. Natl. Acad. Sci. 115, 3108–3113 (2018).
640 23. Torres-Martínez, S. & Ruiz-Vázquez, R. M. RNAi pathways in Mucor: A tale of
641 proteins, small RNAs and functional diversity. Fungal Genet. Biol. 90, 44–52 (2016).
642 24. Mizuguchi, G., Xiao, H., Wisniewski, J., Smith, M. M. & Wu, C. Nonhistone Scm3 and
643 histones CenH3-H4 assemble the core of centromere-specific nucleosomes. Cell 129,
644 1153–1164 (2007).
645 25. Thakur, J., Talbert, P. B. & Henikoff, S. Inner kinetochore protein interactions with
646 regional centromeres of fission yeast. Genetics 201, 543–61 (2015).
647 26. Kapoor, S., Zhu, L., Froyd, C., Liu, T. & Rusche, L. N. Regional centromeres in the
648 yeast Candida lusitaniae lack pericentromeric heterochromatin. Proc. Natl. Acad. Sci.
649 112, 12139–12144 (2015).
650 27. Calo, S. et al. Antifungal drug resistance evoked via RNAi-dependent epimutations.
651 Nature 513, 555–558 (2014).
652 28. Caramalho, R. et al. Intrinsic short-tailed azole resistance in mucormycetes is due to an
653 evolutionary conserved aminoacid substitution of the lanosterol 14α-demethylase. Sci.
654 Rep. 7, 3–12 (2017).
655 29. Andrianaki, A. M. et al. Iron restriction inside macrophages regulates pulmonary host
656 defense against Rhizopus species. Nat. Commun. 9, 3333 (2018).
657 30. Pérez-Arques, C. et al. Mucor circinelloides thrives inside the phagosome through an
658 atf-mediated germination pathway. MBio 10, 1–15 (2019).
659 31. Gladfelter, A. S., Hungerbuehler, A. K. & Philippsen, P. Asynchronous nuclear division
660 cycles in multinucleated cells. J. Cell Biol. 172, 347–62 (2006).
661 32. Liu, X., McLeod, I., Anderson, S., Yates, J. R. & He, X. Molecular analysis of
27
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
662 kinetochore architecture in fission yeast. EMBO J. 24, 2919–30 (2005).
663 33. Schotanus, K. et al. Histone modifications rather than the novel regional centromeres of
664 Zymoseptoria tritici distinguish core and accessory chromosomes. Epigenetics
665 Chromatin 8, 41 (2015).
666 34. Kozubowski, L. et al. Ordered kinetochore assembly in the human-pathogenic
667 basidiomycetous yeast Cryptococcus neoformans. MBio 4, (2013).
668 35. Sanyal, K., Baum, M. & Carbon, J. Centromeric DNA sequences in the pathogenic yeast
669 Candida albicans are all different and unique. Proc. Natl. Acad. Sci. 101, 11374–11379
670 (2004).
671 36. Miga, K. H. et al. Centromere reference models for human chromosomes X and Y
672 satellite arrays. Genome Res. 24, 697–707 (2014).
673 37. Chueh, A. C., Northrop, E. L., Brettingham-Moore, K. H., Choo, K. H. A. & Wong, L.
674 H. LINE retrotransposon RNA is an essential structural and functional epigenetic
675 component of a core neocentromeric chromatin. PLoS Genet. 5, e1000354 (2009).
676 38. Carbone, L. et al. Centromere remodeling in Hoolock leuconedys (Hylobatidae) by a
677 new transposable element unique to the gibbons. Genome Biol. Evol. 4, 648–658 (2012).
678 39. de Sotero-Caio, C. G. et al. Centromeric enrichment of LINE-1 retrotransposons and its
679 significance for the chromosome evolution of Phyllostomid bats. Chromosom. Res. 25,
680 313–325 (2017).
681 40. Longo, M. S., Carone, D. M., Green, E. D., O’Neill, M. J. & O’Neill, R. J. Distinct
682 retroelement classes define evolutionary breakpoints demarcating sites of evolutionary
683 novelty. BMC Genomics 10, 334 (2009).
684 41. O’Neill, R. J. W., O’Neill, M. J. & Graves, J. A. M. Undermethylation associated with
685 retroelement activation and chromosome remodelling in an interspecific mammalian
686 hybrid. Nature 393, 68–72 (1998).
28
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
687 42. Chang, C.-H. et al. Islands of retroelements are major components of Drosophila
688 centromeres. PLOS Biol. 17, e3000241 (2019).
689 43. Presting, G. G. Centromeric retrotransposons and centromere function. Curr. Opin.
690 Genet. Dev. 49, 79–84 (2018).
691 44. Klein, S. J. & O’Neill, R. J. Transposable elements: genome innovation, chromosome
692 diversity, and centromere conflict. Chromosom. Res. 26, 5–23 (2018).
693 45. Henikoff, S. & Furuyama, T. The unconventional structure of centromeric nucleosomes.
694 Chromosoma 121, 341–352 (2012).
695 46. Dalal, Y., Wang, H., Lindsay, S. & Henikoff, S. Tetrameric structure of centromeric
696 nucleosomes in interphase Drosophila cells. PLoS Biol. 5, e218 (2007).
697 47. Camahort, R. et al. Cse4 is part of an octameric nucleosome in budding yeast. Mol. Cell
698 35, 794–805 (2009).
699 48. Lochmann, B. & Ivanov, D. Histone H3 localizes to the centromeric DNA in budding
700 yeast. PLoS Genet. 8, e1002739 (2012).
701 49. Blower, M. D., Sullivan, B. A. & Karpen, G. H. Conserved organization of centromeric
702 chromatin in flies and humans. Dev. Cell 2, 319–330 (2002).
703 50. Greaves, I. K., Rangasamy, D., Ridgway, P. & Tremethick, D. J. H2A.Z contributes to
704 the unique 3D structure of the centromere. Proc. Natl. Acad. Sci. 104, 525–530 (2007).
705 51. Yan, H. & Jiang, J. Rice as a model for centromere and heterochromatin research.
706 Chromosom. Res. 15, 77–84 (2007).
707 52. Grigoriev, I. V. et al. MycoCosm portal: gearing up for 1000 fungal genomes. Nucleic
708 Acids Res. 42, D699–D704 (2014).
709 53. Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinformatics 10, 421
710 (2009).
711 54. El-Gebali, S. et al. The Pfam protein families database in 2019. Nucleic Acids Res. 47,
29
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
712 D427–D432 (2019).
713 55. Letunic, I. & Bork, P. Interactive Tree Of Life (iTOL) v4: recent updates and new
714 developments. Nucleic Acids Res. (2019). doi:10.1093/nar/gkz239
715 56. Edgar, R. C. MUSCLE: multiple sequence alignment with high accuracy and high
716 throughput. Nucleic Acids Res. 32, 1792–1797 (2004).
717 57. Jones, D. T., Taylor, W. R. & Thornton, J. M. The rapid generation of mutation data
718 matrices from protein sequences. Comput. Appl. Biosci. 8, 275–82 (1992).
719 58. Tamura, K. et al. MEGA5: Molecular Evolutionary Genetics Analysis using maximum
720 likelihood, evolutionary distance, and maximum parsimony methods. Mol. Biol. Evol.
721 28, 2731–2739 (2011).
722 59. Navarro-Mendoza, M. I. et al. Components of a new gene family of ferroxidases
723 involved in virulence are functionally specialized in fungal dimorphism. Sci. Rep. 8,
724 7660 (2018).
725 60. Nicolás, F. E. et al. Molecular tools for carotenogenesis analysis in the mucoral Mucor
726 circinelloides. Methods in Molecular Biology 1852, (2018).
727 61. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler
728 transform. Bioinformatics 25, 1754–1760 (2009).
729 62. Dobin, A. & Gingeras, T. R. Mapping RNA-seq Reads with STAR. in Current Protocols
730 in Bioinformatics 11.14.1-11.14.19 (John Wiley & Sons, Inc., 2015).
731 doi:10.1002/0471250953.bi1114s51
732 63. Ramírez, F. et al. deepTools2: a next generation web server for deep-sequencing data
733 analysis. Nucleic Acids Res. 44, W160–W165 (2016).
734 64. Zhang, Y. et al. Model-based Analysis of ChIP-Seq (MACS). Genome Biol. 9, R137
735 (2008).
736 65. Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011).
30
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
737 66. Bailey, T. L. et al. MEME SUITE: tools for motif discovery and searching. Nucleic
738 Acids Res. 37, W202–W208 (2009).
739 67. Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing
740 genomic features. Bioinformatics 26, 841–842 (2010).
741 68. Hahne, F. & Ivanek, R. Visualizing Genomic Data Using Gviz and Bioconductor. in
742 335–351 (2016). doi:10.1007/978-1-4939-3578-9_16
743 69. Price, A. L., Jones, N. C. & Pevzner, P. A. De novo identification of repeat families in
744 large genomes. Bioinformatics 21, i351–i358 (2005).
745 70. Katoh, K. & Standley, D. M. MAFFT Multiple Sequence Alignment Software Version
746 7: Improvements in Performance and Usability. Mol. Biol. Evol. 30, 772–780 (2013).
747 71. Fu, L., Niu, B., Zhu, Z., Wu, S. & Li, W. CD-HIT: accelerated for clustering the next-
748 generation sequencing data. Bioinformatics 28, 3150–3152 (2012).
749 72. Rice, P., Longden, I. & Bleasby, A. EMBOSS: the European Molecular Biology Open
750 Software Suite. Trends Genet. 16, 276–7 (2000).
751 73. Bao, W., Kojima, K. K. & Kohany, O. Repbase Update, a database of repetitive elements
752 in eukaryotic genomes. Mob. DNA 6, 11 (2015).
753 Acknowledgments
754 This work was supported by the Ministerio de Economía y Competitividad, Spain (grant
755 number BFU2015-65501-P, cofinanced by FEDER, and RYC-2014-15844) and Ministerio de
756 Ciencia, Innovación y Universidades, Spain (grant number PGC2018-097452-B-I00,
757 cofinanced by FEDER). CPA and MINM were supported by predoctoral fellowships from the
758 Ministerio de Educación, Cultura y Deporte, Spain (grants number FPU14/01983 and
759 FPU14/01832 respectively). We acknowledge financial support from Tata Innovation
760 Fellowship, Department of Biotechnology, Government of India (BT/HRD/35/01/03/2017),
761 DBT-BUILDER programme support at JNCASR (BT/INF22/SP27679/2018) and intramural
31
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
762 support from JNCASR to KS. SP is supported by DST-WOS-A postdoctoral fellowship
763 (SR/WOS-A/LS-207/2018). Supported in part by NIH/NIAID R37 grant AI39115-21 and R01
764 grant AI50113-15 to J.H., and by the CIFAR program Fungal Kingdom: Threats &
765 Opportunities, for which J.H. serves as Co-Director and Fellow. We thank B. Suma at the
766 confocal imaging facility of the Molecular Biology and Genetics Unit, JNCASR for assistance
767 in imaging. Work conducted by the US Department of Energy Joint Genome Institute, a DOE
768 Office of Science User Facility, was supported by the Office of Science of the US Department
769 of Energy under Contract No. DE-AC02-05CH11231. We thank Alexander Idnurm (University
770 of Melbourne, Australia), Luis M. Corrochano (University of Seville, Spain), Timothy Y.
771 James (University of Michigan, USA), Mathew E. Smith (University of Florida, USA), Joseph
772 W. Spatafora (Oregon State University, USA), Jason E. Stajich (University of California-
773 Riverside, USA), and Gregory Bonito (Michigan State University, USA) for sharing fungal
774 genome sequences to analyze the distribution of kinetochore proteins. We also thank Thomas
775 D. Petes (Duke University School of Medicine, USA) and Beth A. Sullivan (Duke University
776 School of Medicine, USA) for critical reading of the manuscript and the members of KS lab,
777 VG lab, and JH lab for valuable discussions during bi-weekly Skype meetings.
778 Author contributions
779 KS, JH, VG, and FEN conceived and supervised the study. MINM and CPA analyzed the
780 distribution of kinetochore proteins in fungi, generated and validated the plasmids and strains,
781 performed the ChIP-seq experiments, identified and defined the retrotransposable elements,
782 and analyzed the mRNA and sRNA-seq data. MINM, CPA and PG analyzed the ChIP-seq data.
783 MINM, CPA, PG and SP characterized the centromeric loci. SP, MINM and CPA did the
784 fluorescence microscopy imaging. SP performed the ChIP-qPCR experiments. IVG, JP and
785 SJM provided the genome assembly and gene annotation. MINM, CPA and SP designed the
32
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
786 figures and wrote the manuscript with the support from KS, JH, and VG. VG, JH, FEN and KS
787 provided funding. All authors revised the manuscript and approved the final version.
788 Figure legends
789 Fig. 1. Distribution of the kinetochore complex across the Mucoromycotina subphylum.
790 Concise kinetochore schematic showing the most conserved protein complexes in eukaryotes
791 is shown on the left corresponding to the kinetochore proteins analyzed in the matrix on the
792 right. The matrix displays the presence or absence of 26 kinetochore proteins across a
793 cladogram of 51 fungal lineages (top) to show the relationships among them. Every fungal
794 phylum is represented and color-coded, differentiating the Mucoromycota into the
795 Glomeromycotina, Mortierellomycotina, and Mucoromycotina subphyla to provide a special
796 emphasis on the latter. Arrows at divergence events in the cladogram mark the hypothetical
797 loss of proteins CENP-A (red) and CENP-C (blue) in these clades.
798 Fig. 2. Inner and outer kinetochore proteins colocalize in M. circinelloides nuclei in the
799 absence of CENP-A and CENP-C. (a) Graphical representation of Mis12, Dsn1, and CENP-
800 T sequence features showing individual Pfam domains PF05859, PF08202, PF15511
801 respectively identified in M. circinelloides. (b) Schematic to represent C-terminal tagging of
802 histone H3 and kinetochore proteins with mCherry (red circles), eGFP (green circles). (c-f)
803 Confocal microscope imaging showing the cellular localization of well-conserved mCherry-
804 tagged outer kinetochore proteins Mis12 (c, e) and Dsn1 (d, f), eGFP-tagged histone H3 (c, d),
805 and eGFP-tagged inner kinetochore CENP-T (e, f) in pre-germinated spores of M.
806 circinelloides strains expressing fluorescent fusion proteins. A calibrated scale (white bar) is
807 provided for size comparison (5 µm). (f) Time-lapsed confocal image displaying the cellular
808 localization of mCherry-tagged Mis12 and eGFP-tagged histone H3 in a germinative tube
809 sprouting from a spore. A discontinuous white perimeter outlines the germinative tube in the
33
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
810 fluorescence images, and a yellow circle indicates a nuclear division event. Cartoon for the
811 zoomed image represents the localization and signal intensity of Mis12-mCherry in the
812 nucleus.
813 Fig. 3. The nine centromeres of M. circinelloides are short, AT-rich, and harbor a
814 conserved DNA motif. (a) Putative kinetochore-binding regions showing color-coded
815 enrichment of immunoprecipitated DNA (IP DNA) from Mis12 and Dsn1 mCherry-tagged
816 strains compared to their corresponding input (Input DNA) and binding controls (Beads only).
817 1 kb flanking sequences from the center of the enrichment peak are shown. Major (M) and
818 minor (m) kinetochore-binding regions are indicated. (b) Scaled ideogram of the nine scaffolds
819 (white) of M. circinelloides genome assembly that contain a kinetochore-binding region (blue),
820 showing the telomeric repeats (red). (c) Size distribution of the nine core centromere (CC)
821 regions. (d) GC content across the nine core centromere (CC) sequences. The median (red line)
822 and standard deviation (black lines) are shown in c and d. (e) 41 bp centromere-specific DNA
823 motif conserved in all nine CC regions and absent in the remainder of the genome. (f) A
824 genomic view of CEN11 illustrating all nine core centromere regions. Kinetochore-binding
825 region enrichment (IP coverage) as the average of both immunoprecipitation signals minus
826 Input and beads only controls (left axis, blue), GC content across the region (right axis, black),
827 and the centromere-specific DNA motif described in e (red) are shown.
828 Fig. 4. M. circinelloides core centromeres are present in large ORF-free pericentric
829 regions having retrotransposable elements regulated by the Dicer-dependent RNAi
830 pathway and bind H3 nucleosomes. (a) Genomic sequences lacking annotated genes (light
831 blue) that flank the CC regions (dark blue) served as the reference points to calculate the length
832 both upstream and downstream. (b) Schematic of the Grem-LINE1 interspersed in the
833 pericentromeric regions. Open Reading Frames (ORF) and protein domains predicted from
834 their coding sequences are shown as colored boxes [Zf, zinc finger (PF00098 and PF16588);
34
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
835 AP, AP endonuclease (PTHR22748); RVT, reverse transcriptase (PF00078); and zf-RVT,
836 zinc-binding in reverse transcriptase (PF13966)]. (c) CEN4 is exemplified by low GC content,
837 lack of genes and transcripts, which are shown in different tracks displaying the kinetochore-
838 binding region enrichment (IP, an average of both IP DNA signals minus Input and beads only
839 controls), annotation of genes (red blocks) and transposable elements (light blue blocks), CEN-
840 specific DNA motif position (vertical line) and direction (arrow), GC content, and
841 transcriptomic data of mRNA (green) and sRNAs (red) in M. circinelloides wild-type, ago1,
842 and double dcl1 dcl2 deletion mutant strains after 48 h of growth in rich media. (d) Genomic
843 location of CEN2 showing the regions studied for histone H3 occupancy as labelled and colored
844 rectangles. ChIP assays were performed using polyclonal antibodies against histone H3.
845 Primers were designed for CC2 region, pericentric regions (1L, 2L, 1R, 2R), flanking ORFs
846 (ORF-L, ORF-R) and a far-CEN control ORF ~2 Mb away from CEN2 (Far-CEN). IP samples
847 were analyzed by real-time PCR using these primers. The y-axis denotes the qPCR value as a
848 percentage of the total chromatin input with standard error mean (SEM), from each region
849 tested. The experiment was repeated three times with similar results.
850 Fig. 5. M. circinelloides possesses a mosaic of point and regional centromeres. A
851 comparative analysis of centromere features of S. cerevisiae (ascomycete), M. circinelloides
852 (mucoromycete), and Cryptococcus neoformans (basiodiomycete). M. circinelloides
853 centromeres are a mosaic with features of point centromeres like shorter kinetochore-bound
854 region (core centromere) and a conserved motif (red dots), combined with features of regional
855 centromeres like large ORF-free pericentromeres dotted with transposable elements that are
856 suppressed by the RNAi machinery.
857 Supplementary Fig. 1. Mucorales and Umbelopsidales lack CENP-A. (a) Schematic of
858 canonical histone H3 and CENP-A shared protein features. (b-d) Multiple protein sequence
859 alignments of the N-tail (b), Loop 1 (c), and C-terminal (d) regions. The scale above the
35
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
860 comparisons indicates the amino acid positions with S. pombe sequence as the reference. Red
861 arrows mark relevant amino acid positions. Amino acids are colored in increasing shades of
862 blue to show conservation within each group. (e) Neighbor-joining phylogenetic tree (JTT
863 model) of the Histone Folding Domain (HFD), showing phylogenetic distance (branch length)
864 and branch support (1000 bootstraps). Branches with < 50% bootstrap support are collapsed.
865 The protein sequences analyzed in b to e are distributed among three groups: well-studied
866 histone H3 (yellow), rare histone H3 (blue), and the predicted CENP-A proteins in this study
867 (red). Asterisks (*) indicate proteins from the Mucoromycotina.
868 Supplementary Fig. 2. M. circinelloides histones H3 and H4 are not centromere-specific
869 binding proteins. (a) Location of all the histone H2A, H2B, H3, and H4-coding genes in the
870 M. circinelloides genome. Genes are represented by arrows showing transcription direction;
871 the coding sequence is depicted as larger red blocks, flanked by smaller blue blocks
872 representing the untranslated regions, and connected by black lines as intronic sequences.
873 Light-gray arrows indicate neighboring non-histone genes and their transcription direction. (b,
874 c) Protein alignment of several well-characterized H3 (b) and H4 (c) histones with M.
875 circinelloides orthologs. A scale indicates the amino acid positions taking S. pombe sequence
876 as the reference. Histone folding domains (HFD) are outlined in a diagram below each
877 alignment, as well as the N-terminal tail. Amino acids are colored in increasing shades of blue
878 and a consensus protein logo is provided to reflect conservation. (d) Confocal microscopy
879 images of M. circinelloides strains expressing eGFP-fluorescent fusion histone proteins Hht4,
880 Hhf1, and Hhf3 in 4-hour pregerminated spores. The fluorescent signal is colored as green. A
881 calibrated scale (white bar) is provided for size comparison (5 µm).
882 Supplementary Fig. 3. Generation of strains expressing fluorescent fusion histones.
883 Diagram of the fluorescent fusion alleles used for integration by homologous recombination
884 (shown as discontinuous crosses) into the wild-type allele of each histone loci: hht4 (a), hhf1 (b),
36
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
885 and hhf3 (c). Arrows in the diagrams mark the annealing regions for specific primers used to
886 confirm the integration by PCR, amplifying both the wild-type (WT) and mutant (MUT)
887 alleles. Gel images below the diagrams show the size-specific PCR fragments. M lanes were
888 loaded with the GeneRuler DNA Ladder Mix (Thermo Fisher) to estimate the fragment sizes.
889 Green regions indicate mCherry and eGFP genes.
890 Supplementary Fig. 4. Generation of mutant strains expressing fluorescent kinetochore
891 proteins. Diagram (above) and agarose gel images (below) showing the integration of the cnpT
892 fusion allele into its wild-type locus (a); and the mis12 (b) and dsn1 (c) fusion alleles (both N-
893 terminal and C-terminal fusions) into the carRP locus. Discontinuous crosses indicate
894 homologous recombination and arrows mark the annealing regions for specific primers used to
895 confirm the integration by PCR, amplifying both the wild-type (WT) and mutant (MUT) alleles
896 shown in the gel images. M lanes were loaded with the GeneRuler DNA Ladder Mix (Thermo
897 Fisher) to estimate the fragment sizes. Red and green regions indicate mCherry and eGFP
898 genes, respectively
899 Supplementary Fig. 5. M. circinelloides has nine centromeres. IP enrichment (coverage x1)
900 of immunoprecipitated DNA (IP DNA) from Mis12 and Dsn1 mCherry-tagged strains
901 compared to their corresponding input (Input DNA) and binding controls (beads only DNA),
902 shown as color-coded tracks across the whole sequence of scaffolds 01-11. Asterisks (*) mark
903 significant peaks in both IP DNA samples that are not present in the controls, indicating the
904 putative centromeric regions.
905 Supplementary Fig. 6. M. circinelloides pericentromeric regions. A genomic view of the
906 pericentromeric regions of all nine centromeres. Each region shows the kinetochore-binding
907 region enrichment (IP, an average of both IP signals minus Input and Beads only controls),
908 annotated genes (red blocks) and transposable elements (light blue blocks), CEN-specific DNA
37
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
909 motif position (black vertical line) and its direction (arrow), GC content, and transcriptomic
910 data of mRNA (green) and sRNAs (red) in M. circinelloides wild-type strain, and ago1 and
911 double dcl1 dcl2 deletion mutants after 48 h of growth in rich media.
912 Supplementary Fig. 7. Centromere-specific non-LTR L1-like retrotransposable elements.
913 (a) Matrix showing pairwise p-distance (% changes per site) across all 44 repetitive elements
914 flanking the centromeres of M. circinelloides. Elements sharing ≥ 95.0% identity are clustered
915 together, generating 10 groups that comprise full-length elements and incomplete sequences
916 (remnants). (b) Neighbor-joining phylogeny (JTT model) of the RVT domain of well-known
917 non-LTR elements. The RVT domain of the centromere-specific element (Grem LINE1)
918 identified in this study is marked inside a blue box. Phylogenetic distance (branch length) and
919 branch support (1000 bootstraps, size-coded circles) are shown. Leaves within the same clade
920 of non-LTR retrotransposons are collapsed, forming a scalene triangle (top and bottom corners
921 show the shortest and longest phylogenetic distance, respectively); except for the L1 clade
922 which contains the Grem LINE-1.
38
bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Figure 1
CENP-A CENP-C
Microtubules sp. rouxii linderi butleri circina crassa fusiger repens padenii trispora poitrasii mucedo africana A. nidulans F. F. anomala G. G. B. cordense allomycis isabellina S. N. A. pombe S. mexicana parasitica umbellata umbellata persicaria coronatus irregularis U. maydis U. verticillata C. C. B. echinulata T. T. elegans B. recurvatus spectabilis vesiculosa H. elegans H. elegans D. H. radiatus H. M. M. M. albicans C. A. P. corymbifera K. R. R. U. U. Z. macrogynus microsporus P. S. punctatus S. S. C. C. C. C. R. G. P.articulosus cerevisiaeS. circinelloides cucurbitarum C. C. heterogamus M. M. P.umbonatus E. intestinalis E. C. C. R. R. H. flammicorona Endogone L. L. blakesleeanus C. neoformans C. A. R. R. Z. Z. C. C. J. M. M. P.
Ndc80 NDC Nuf2 80 Spc25 Spc24 Zwint-1 Knl1
Outer Nsl1 MIS12 Dsn1 Kinetochore Mis12 Nnf1 CENP-T T-W CENP-W S-X CENP-S CENP-X CENP-H H-I-K-M CENP-I CENP-K CENP-M
Inner L-N CENP-L O-P CENP-N
Kinetochore Q-U CENP-O CENP-P C CENP-Q CENP-U CENP-C CENP-A
Phylum Subphylum Order Centromere Microsporidia Zoopagomycota Glomeromycotina
Cryptomycota Mucoromycota Mortierellomycotina Endogonales
Blastocladiomycota Ascomycota Mucoromycotina Umbelopsidales
Chytridiomycota Basidiomycota Mucorales bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Figure 2
a Protein ID: 1432987 DIC Mis12-mCherry H3-eGFP Merged c Mis12Mis12 domain domain 1 25 153 246
Protein ID: 1400423
DIC Dsn1-mCherry H3-eGFP Merged Dsn1Dsn1 domain domain d 1 103 295 397
Protein ID: 1361249
CENP-T domain 1 403 534 DIC CENP-T-eGFP Mis12-mCherry Merged e b hht4 Phht4
mis12 Pzrt1 DIC CENP-T-eGFP Dsn1-mCherry Merged f dsn1 Pzrt1
cnpT PcnpT
DIC Mis12-mCherry g H3-eGFP Merged
t=0
t=1 min
t=3 min
t=5 min
t=6 min
t=7 min
t=8 min bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Figure 3
a b
1 CC1 CC2 CC3 2 3 40 Mis12 IP M 4 Dsn1 IP 5 30 Beads only M 6 Mis12 Input CEN Scaffold 8 20 Dsn1 Input 9 TEL m M m 11 10 m 0 1.0 2.0 3.0 4.0 5.0 6.0 Coverage (1x) Coverage 0 Length (Mb) -1.0 peak 1.0Kb -1.0 peak 1.0Kb -1.0 peak 1.0Kb
CC4 CC5 CC6 c d 100 40
30 M M M
20 s 50 m m CC 10 m
Coverage (1x) Coverage 0 0 500 1000 1500
-1.0 peak 1.0Kb -1.0 peak 1.0Kb -1.0 peak 1.0Kb (%) content GC CC region size (bp) 0 CC8 CC9 CC11 CCs 40 M
(1x) M e 30 M 20 m m 10 m
Coverage 0 -1.0 peak 1.0Kb -1.0 peak 1.0Kb -1.0 peak 1.0Kb f Core centromere region 30 0.60
M
m GC content GC
IP Coverage IPCoverage (1x) 0 0.00 CEN motif
884.0 884.5 885.0 885.5 886.0 886.5 887.0 CC11 (Kb) bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Figure 4
a b
CEN1 CC CEN2 Pericentric region CEN3 CEN4 ORF1 ORF2 CEN5 AP RVT Poly(A) An CEN6 ZFs ZF-RVT CEN8 CEN9 CEN11
-40 -30 -20 -10 0 10 20 30 40 50 60 Pericentric region length (kb) c
CEN4
49kb 25 IP 0 CEN motif Annotation
GC content
mRNA WT sRNA
mRNA Ago- (ago1Δ) sRNA
mRNA Dicer- (dcl1Δ dcl2Δ) sRNA
scaffold_4
d
25 0.25 IP
0 0.20
CEN motif 0.15 Annotation 0.10 % Input% Regions 0.05 analyzed ORF-L 1L 2L CC2 1R 2R ORF-R 0.00 4,690 4,695 4,700 4,705 4,710 4,715 4,720 4,725 4,730 4,735 kb scaffold_2 bioRxiv preprint doi: https://doi.org/10.1101/706580; this version posted July 18, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Figure 5
Regional CEN
C. neoformans
Mosaic CEN
Point CEN M. circinelloides
S. cerevisiae
Core CEN-specific DNA sequence motif Pericentric transposon-rich region