<<

G3: || Early Online, published on December 8, 2015 as doi:10.1534/g3.115.023275

Diversity and divergence of dinoflagellate proteins

Georgi K. Marinov1 and Michael Lynch1

1Department of Biology, Indiana University, Bloomington, IN 47405, United States

Abstract

Histone proteins and the nucleosomal organization of chromatin are near-universal eukaroytic features, with the exception of dinoflagellates. Previous studies have suggested that do not play a major role in the packaging of dinoflagellate genomes, although several genomic and transcriptomic surveys have detected a full set of core histone genes. Here, transcrip- tomic and genomic sequence data from multiple dinoflagellate lineages are analyzed, and the diversity of histone proteins and their variants characterized, with particular focus on their po- tential posttranslational modifications and the conservation of the histone code. In addition, the set of putative epigenetic mark readers and writers, chromatin remodelers and histone chap- erones are examined. Dinoflagellates clearly express the most derived set of histones among all autonomous nuclei, consistent with a combination of relaxation of sequence con- straints imposed by the histone code and the presence of numerous specialized histone vari- ants. The histone code itself appears to have diverged significantly in some of its compo- nents, yet others are conserved, implying conservation of the associated biochemical processes. Specifically, and with major implications for the function of histones in dinoflagellates, the re- sults presented here strongly suggest that transcription through nucleosomal arrays happens in dinoflagellates. Finally, the plausible roles of histones in dinoflagellate nuclei are discussed.

Introduction know that dinoflagellates firmly belong to the , together with apicomplexans and , and that the loss A core feature of eukaryotic biology is the nucleoso- of nucleosomes is a derived feature. But the mystery of mal organization of chromatin. Nucleosomes consist of a hi- dinoflagellate chromatin remains largely unresolved. stone octamer containing two copies of each of the four core Several reports have identified histone-like proteins in histone proteins H2A, H2B, H3 and H4, wrapped around dinoflagellates (Chan et al. 2006; Chan et al. 2007; Wargo ∼147 bp of DNA (Luger et al. 1997). In addition, the and Rizzo 2000; Chudnovsky et al. 2002; Sala-Rovira et linker histone H1 binds to the nucleosome and the linker al. 1991; Wong et al. 2003; Rizzo and Burghardt 1982). DNA between individual nucleosomes. More recently, it was found that dinoflagellates express The major exception from this almost universal organi- -derived nucleoproteins, completely unrelated to hi- zation are dinoflagellates. Dinoflagellates exhibit numerous stones (DVNPs, or Dinoflagellate Viral NucleoProteins), highly unusual features, such as the organization of their mi- which seem to substitute for histones as far as the packaging tochondrial (Waller and Jackson 2009) and (Zhang of DNA is concerned (Gornik et al. 2012). However, multi- et al. 1999; Barbrook and Howe 2000) genomes, but their ple reports have also identified histone genes and low levels nuclei are particularly striking (Rizzo 2003). Dinoflagellate of histone proteins in several . These include stud- chromatin does not exhibit a banding pattern upon nuclease ies of transcriptomes from Lingulodinium (Roy and Morse digestion, it contains little acid-soluble protein (the ratio of 2012), (Bayer et al. 2012), and Alexandrium basic proteins to DNA is ∼10% compared to the typical catenella (Zhang et al. 2014), the draft genome sequence 1:1; Rizzo and Nooden 1972; Herzog and Soyer 1981; ), and of Symbiodinium minutum (Shoguchi et al. 2013), and en- chromosomes exist in a permanently condensed liquid crys- vironmental transcriptomes (Lin et al. 2010). talline state (Rill et al. 1989). Histone proteins are not readily detected in dinoflagellates, and until quite recently These observations suggest that histones do play some they were thought to be completely absent. So unusual role in dinoflagellate biology, but its precise remains is dinoflagellate chromatin that at one time dinoflagellates unclear. A somewhat underappreciated fact is that the were suggested to be “mesokaryotes”, i.e. intermediate be- loss of nucleosomes has far more profound consequences tween prokaryotes and (Dodge 1965). We now than the mere packaging of DNA, as the post-translational

1 © The Author(s) 2013. Published by the Genetics Society of America. Figure 1: Detection of histones and DVNP proteins in dinoflagellate transcriptomic and genomic as- semblies. “Total” refers to the number of unique proteins detected, while “complete” refers to the subset of full-length proteins (i.e., assembled transcripts in which both start and stop codons are present). Symbiodinium minutum and Perkin- sus marinus are colored differently as genome assemblies are available for these species, while only the transcriptomic space has been sampled for all others. The dinotoms are also highlighted separately as they contain a unreduced diatom , meaning that their transcriptomes contains transcripts from two different eukaryotic genomes, only one of which is a dinoflagellate. The cladogram was generated following previously published phylogenies (Orr et al. 2014). modifications (PTMs) of histone proteins and the “histone been identified, densely covering histone tails (Huang et al. code” they constitute (Jenuwein and Allis 2001) play a key 2014), which is one explanation for the extreme conserva- role in most aspects of chromatin biology. These modifica- tion of their sequence across very deeply diverging lineages tions happen primarily (but not only) in the N-terminal of eukaryotes (Waterborg 2012; Postberg et al. 2010; Feng tails of histones and serve as platforms for the recruit- and Jacobsen 2011). ment of specific PTM “reader” -containing proteins In the light of the deep conservation and fundamental (Kouzarides 2007). Hundreds of histone modifications have importance of the histone code, it is of significant interest

2 Figure 2: Distribution of histone protein lengths in dinoflagellates. (A) Histone H2A; (B) Histone H2B; (C) Hi- stone H3; (D) Histone H4. Values for Homo sapiens, melanogaster, , and core histones are shown at the bottom of each panel for comparison.

3 Figure 3: Expression levels of DVNP, linker histone and histone genes in dinoflagellates in TPM (Tran- scripts Per Million). (A) Alexandrium tamarense; from left to right: SRR1296765, SRR1296766, SRR1300221, SRR1300222; (B) carterae; from left to right: SRR1294391, SRR1294392, SRR1294393, SRR1294394, SRR1296757, SRR1296758; (C) spinosum; from left to right: SRR1300306, SRR1300307, SRR1300308; (D) : SRR1296929; (E) Amoebophrya sp.: SRR1296703. See Figures S2–S7 for data for other species.

4 to know the extent to which it is conserved in dinoflagel- suggests that histone H1 is present in all dinoflagellates, lates given that histones are present but are not the major with the failures to detect it being false negatives. constituent of chromatin in these organisms. Such insights A unique representative of the complexity of dinoflagel- can shed light on the functional roles of histone proteins late biology deserves a special mention. The largest number in dinoflagellate biology. In this study, these issues are ad- of histones were identified in Durinskia baltica and Kryp- dressed by carrying out a detailed survey of the sequence toperidinium/Glenodinium foliaceum. These species are of histone proteins, as well as the presence or absence of known as “dinotoms” (Imanian et al. 2010), as they har- chromatin mark writers, readers and erasers in available bor a tertiary diatom endosymbiont (Dodge 1971; Tomas transcriptomic and genomic data from a large number of and Cox 1973; Figueroa et al. 2009), which has not been dinoflagellate species. reduced and retains a large genome. This means that they have both a dinoflagellate nucleus and a diatom one, the lat- ter with conventional nucleosomal organization. Thus, all Results results that follow should be interpreted with this caveat in mind in the case of dinotoms. Histone proteins in dinoflagellates Another potentially confounding factor concerns the pu- rity of the samples studied, many of which were not axenic Their often very large size and existing technological lim- (Table S1). In some cases, the presence of other eukaryotes itations have so far prevented the complete sequencing of is necessary to maintain dinoflagellate cultures. For ex- dinoflagellates genomes, and little is known in detail about ample, contains kleptoplastids of cryptophyte their sequence and organization. Fortunately, in recent origin (Schnepf and Elbr¨achter 1988; Nagai et al. 2008), years transcriptomic data from a large number of dinoflagel- which apparently do not contain a (Lucas and lates has become available, in particular through the efforts Vesk 1990), and are extracted from the Myrionecta of the Marine Microbial Eukaryote Transcriptome Sequenc- rubra (= rubra) (Takishita et al. 2002), which ing Project (MMETSP; Keeling et al. 2014). MMETSP in turn acquires them from cryptophytes (Johnson et al. transcriptome assemblies and translations served as the 2006); Dinophysis is grown together with Myrionecta. The main dataset for this study. In addition, a draft genomic other such example involves marina, which is het- sequence exists for Symbiodinium minutum (Shoguchi et al. erotrophic and is often grown with other eukaryotes as prey 2013) even though it is far from complete, and a genome (Lowe et al. 2011), in this case the diatom Phaeodactylum assembly is available from the NCBI database for Perkin- tricornutum. sus marinus, a member of the early branching, sister to Thus, there is a possibility that histones and other pro- dinoflagellates lineage, the Perkinsids; annotated proteins teins identified in transcriptome assemblies do not belong to from these genome assemblies were also included. Finally, the listed species. Dinoflagellate transcripts are subject to an MMETSP transcriptome for , a photo- trans-splicing, thus in principle, the presence or absence of synthetic relative of apicomplexans was used as an out- splice leaders can be used to identify transcripts of dinoflag- group/control. The species considered and their evolution- ellate origin. However, this is best done by using the splice ary relationships are shown in Figure 1, and the full list of leader to positively select transcripts during library con- datasets can be found in Table S1. In total, the analysis struction (Gavelis et al. 2015), while conventional RNA-seq focused on 40 dinoflagellate species/isolates with transcrip- is highly vulnerable to underrepresentation of the extreme tome assemblies, two species with genome sequences, and 50 ends of transcripts, meaning that the absence of splice the Chromera velia transcriptome. leaders is not a reliable marker for the non-dinoflagellate A combination of hidden markov model (HMM) scans origin of transcripts. and BLASTP searches against translated transcriptomes Despite these caveats, two observations argue against (see the Methods section and figure legends for more de- contamination being a major issue with the analysis pre- tails) was used to identify histone proteins. A similar search sented here: first, histones are found in both axenic and for DVNPs was also carried out. Part of the reality of work- non-axenic cultures (Table S1 and Figure 1), and second, ing with transcriptome assemblies is that proteins are not as will become clear below, the properties of putative di- always completely assembled, which can confound certain noflagellate histones are quite unique, making them unlikely analyses. For these reasons, hits were divided into complete to come from other eukaryotes. and incomplete sequences, and putative misassemblies were filtered out (details in the Methods section). Expression levels of dinoflagellate histones Figure 1 shows the number of fully and incompletely as- sembled core histone proteins, putative linker histones and The expression levels of histones can provide additional in- DVNPs in all species studied. All four core histones were formation about their functional significance in dinoflagel- identified in all dinoflagellates, usually in multiple distinct lates. To this end, transcriptomes were quantified by align- variants, with histone H3 exhibiting a particularly great ing the raw reads back to the assemblies and carrying out diversity. Putative linker histones were also identified in transcript-level quantification using eXpress (Roberts and many species; the breadth of its phylogenetic distribution Pachter 2013). Figure 7 and Figures S2–S7 show the dis-

5 6 tribution of expression values for core histones, linker his- paraphyletic (Talbert et al. 2012), and numerous cases of tones and DVNPs. In all non-dinotom species, DVNPs are loss, gain and replacement of variants, even between closely significantly more highly expressed than histones, although related species, have been documented (Baldi and Becker considerable variations in the relative levels are observed. 2013; Drinnenberg et al. 2014; Wang et al. 2011). While these results should be treated with caution as The patterns emerging from these comparisons are com- it is well known that replication-dependent histone mRNAs plex, with the deep branches of all trees being poorly re- are not polyadenylated in other eukaryotes (Marzluff 2005), solved. This is particularly true for H2A and H2B, which is while the datasets studied were generated after polyA- not surprising as these are in general the most variable core selection, these observations are consistent with histones histones within eukaryotes and exhibit the highest degree being present at relatively low levels and DVNPs being the of innovation in terms of novel histone variants in different main packaging proteins in dinoflagellates. lineages. Within H2A histones, a group of putative H2A.Z vari- Properties of dinoflagellate histones ants is observed (Figure 3), but in addition to it, there Previous reports in individual species have noted the in- are several other very deep branches consisting of proteins creased length of dinoflagellate histones compared to con- that might serve as novel variants with currently unknown ventional core histones. Figure 2 shows the lengths of all functions. One other histone variant can be identified from complete H2A, H2B, H3, and H4 proteins identified in this sequence – H2A.X, which is best known for its role during study. Dinoflagellate histones are indeed frequently elon- DNA damage response (Ismail and Hendzel 2008). H2A.X gated. However, this is not an universal feature, as many is characterized by the presence of a SQ(E/D)Φ phosphory- normal-length histones are also observed, and often the lation motif at the C-terminus of the protein (Talbert et al. same species expresses both long and short histone vari- 2012), the exact sequence of which varies but is generally ants. constant within the broad divisions of eukaryotes. A num- The elongation of dinoflagellate histone proteins is not ber of dinoflagellate histones do contain similar motifs at due to the presence of additional protein domains as only their C-terminus, but a great deal of diversity is observed core histone domains are readily identifiable (see example among them (Table S2): SQEF, SQEY, SQQY, and SQDF in Figure S1), with only two exceptions: a TRAM-LAG1- motifs are all observed. Of note, most of the H2A variants CLN8 domain in a Kryptoperidinium foliaceum H2A, and in genome contain SQ(E/D)Φ motifs, one being a NAD(P)-binding domain in an Alexandrium tamarense SQEI, and eight SQEM; the functional significance of hav- H2A; their functional significance is currently unclear. ing so many putative H2A.X variants is currently unclear. Next, phylogenetic analysis was carried out on histones Excluding putative H2A.Z variants, a few dinoflagellate from dinoflagellates and from a number of other unicellu- H2A proteins cluster closely with conventional H2A, but lar and multicellular species (Figures 3, 4, 5, and 6). Most most of these are from dinotoms and very similar to H2A eukaryotes express multiple variants of the four core his- sequences from the diatom Thalassiosira, suggesting that tones (Talbert et al. 2012); some of the additional variants they are of endosymbiont origin. The majority of dinoflag- are thought to be ancestral to all eukaryotes. In particu- ellate H2A sequences are highly divergent. Similar patterns lar, H2A.Z is often incorporated in nucleosomes surround- are observed for histone H2B (Figure 4), with a dinotom- ing transcription start sites, and specific variants of histone specific cluster, many highly derived sequences from both H3 associate with centromeres. H2A.Z and centromeric dinotoms and other dinoflagellates, and a small number of H3 from several divergent eukaryotes were also included in proteins clustering with conventional core histone H2B. the analysis in order to identify their putative dinoflagel- Deeply diverging sequences also constitute the major- late homologs (Figures 3 and 5). However, it should be ity of dinoflagellate histone H3 proteins (Figure 5). No noted that such functional homologs might not be identi- group of dinoflagellate histones that robustly clusters with fiable from sequence alone, as many histone variants are centromeric H3 sequences from other eukaryotes could be known to be rapidly evolving and/or to be polyphyletic or identified, but given the very rapid evolution of centromeric

Figure 4 (preceding page): Maximum likelihood phylogenetic tree of H2A protein sequences in dinoflagel- lates. The tree was generated using RAxML ((version 8.0.26) under the LG+G model and with 100 bootstrap replicates. Additional H2A sequences from genome and MMETSP assemblies of other were included (the ciliates Parame- cium caudatum and thermophila, the cryptophyte Guillardia theta, the diatom Thalassiosira weissflogii, the discosean amoeba Paramoeba atlantica, another amoebozoan, Stereomyxa ramosa, the bicosoecid Bicosoecid sp., the chlorarachniophyte Bigelowiella natans, the two chromerids Chromera velia and , the Gloeochaete witrockiana, the haptophyte Emiliania huxleyi, the raphidophyte Fibrocapsa japonica) as well as sequences from human, , flies and Arabidopsis. The H2A variants H2A.Z (from multiple eukaryotes), and H2A.X and macroH2A (from Homo sapiens) were also included, and are highlighted in blue. Dinoflagellate sequences are marked in yellow while perkinsid proteins are colored in pink.

7 8 H3 proteins, this is not surprising. Thus, the presence of Conservation and divergence of the histone code in centromeric histones in dinoflagellates cannot be excluded. dinoflagellates Notably, several deeply diverging H3 variants are also ob- served in . As discussed above, the question of the functional impor- tance of dinoflagellate histones is closely related to the ex- tent of conservation and divergence of the dinoflagellate hi- stone code. To address this question, the conservation of Histones H3.1 and H3.3 form a pair of sequence vari- key posttranslationally modified histone residues relative to ants shared between many eukaryotes. H3.1 is deposited other eukaryotes was assessed. Before the results from these during the S phase of the cycle while H3.3 replaces it comparisons are described, a brief overview of the signifi- as a result of transcriptional activity outside of it (Hake cance of these residues is presented. and Allis 2006; Otero et al. 2014). H3.3 and H3.1 are distinguished by a very small number of residues, in par- ticular and classically, the presence at position 31 of a S The histone code (metazoans) or T (plants) in H3.3 versus A in H3.1; some Hundreds of histone modifications have been identified other variations on that theme are also observed in other (Huang et al. 2014), including monoubiquitination, acety- eukaryotes (Talbert et al. 2012). In a few dinoflagellates, lation, mono-, di-, and trimethylation of lysines, mono- including Alexandrium tamarense, some Symbiodinium iso- and symmetric and asymmetric dimethylation of arginines, lates, and Heterocapsa rotundata, pairs of very similar hi- phosphorylation of serines, threonines, and tyrosines, iso- stones distinguished by the presence or absence of a phos- merization of prolines, and others (Figure S9). While the phorylatable at position 31, are observed (Fig- great majority have not been characterized in any detail, ure S8). Although there are only a few such species and the role of a significant number is fairly well understood, their phylogenetic distribution is patchy, these observations which has revealed both the functional importance of the do suggest that such pairs of H3 variants might be present histone code and the strong constraints that it imposes on in dinoflagellates, especially if one of the other distinctions the corresponding residues across eukaryotes. The presence between H3.3 and H3.1 also holds, namely that the former or absence of these residues in dinoflagellate histones can is replication-independent and with polyadenylated mRNAs be highly informative about the conservation of the relevant while the other is replication-dependent and with mRNAs aspects of the histone code and chromatin biology. This ap- lacking polyA tails (Marzluff et al. 2008), and is thus sys- proach has some limitations, as conservation of amino acid tematically underrepresented in polyA-selected RNA-seq li- residues need not be solely due to their PTMs, and does braries. not on its own mean that the relevant modifications are in fact deposited in vivo and have a conserved function. How- ever, the absence of an important modified residue does Histone H4 displays the most striking divergence among mean that the corresponding histone marks are not con- dinoflagellate histones (Figure 6). Aside from the dinotom- served, and that their functionality has been lost (or, pos- specific group, most dinoflagellate H4 proteins form a well sibly, adopted by other residues elsewhere on nucleosomes). defined group of highly derived sequences. In addition, a The key histone marks in eukaryotes and their functions, Symbiodinium-specific deeply divergent variant is observed, as revealed by studies in yeast, mammals and other model and Amoebophrya expresses four even more divergent vari- systems will be briefly described before the state of the hi- ants. Curiously, such variants are also found in the Perkin- stone code in dinoflagellates is discussed. The list is by no sus marinus genome, in addition to the more conventional means exhaustive and the discussion is by necessity sim- H4 variants it contains. Many of these are elongated vari- plified as the complexity of the histone code is immense. ants. As H4 is the most conserved of all histones and The focus will be on a subset of marks associated with also the one with the fewest known variants (Talbert et al. the following processes: transcriptional activation, initia- 2012), these are especially intriguing observations in terms tion, and elongation, the formation of heterochromatin and of their functional implications. other repressive chromatin environments, and the regula-

Figure 5 (preceding page): Maximum likelihood phylogenetic tree of H2B protein sequences in dinoflagel- lates and perkinsids. The tree was generated using RAxML ((version 8.0.26) under the LG+G model and with 100 boot- strap replicates. Additional sequences from genome sequences and MMETSP assemblies of other protists were included (the two chromerids Chromera velia and Vitrella brassicaformis, the ciliates caudatum and Tetrahymena thermophila, the discosean amoeba Paramoeba atlantica, the bicosoecid Cafeteria roenbergensis, the chlorarachniophyte Bigelowiella natans, the chrysophyte Mallomonas, the cryptophyte Cryptomonas curvata, the diatom Thalassiosira weiss- flogii, the glaucophyte Gloeochaete witrockiana, the haptophyte Emiliania huxleyi, the raphidophyte Fibrocapsa japonica, the xanthopyte Vaucheria litorea) as well as sequences from human, yeast, flies and Arabidopsis. Dinoflagellate sequences are highlighted in yellow while perkinsid proteins are marked in pink.

9 10 tion of chromatin dynamics during . activating effect is not their sole function. For example, Repression and heterochromatin. Heterochroma- H3K27ac has been identified as a marker of active en- tinization is a major mechanism for the permanent re- hancer elements in metazoans (Rada-Iglesias et al. 2010; pression of transposable elements and some host genes in Creyghton et al. 2010), H3K56ac plays a role in the ac- eukaryotes; heterochromatin is also often found in telom- tivation of transcription but is also important for histone eric and centromeric regions. The main mechanisms for exchange and nucleosome assembly (Rufiange et al. 2007; establishing and maintaining heterochromatin are deeply Xu et al. 2005; Xie et al. 2009), etc. conserved (including in other alveolates; Liu et al. 2007; In all eukaryotes studied so far, active promoters are Schwope and Chalker 2014) and likely ancestral to all eu- specifically marked by H3K4me3. The exact biochemical karyotes, with histone marks playing a key role. The classic mechanisms in which H3K4me3 is involved have not yet heterochromatin histone mark is the trimethylation of ly- been completely elucidated (Shilatifard 2012), but its deep sine 9 of histone H3 (H3K9me3). H3K9me3 recruits the conservation suggests that it plays a key role in the process heterochromatin protein HP1 (Lachner et al. 2001; Bannis- of initiation. ter et al. 2001), which leads to compaction of chromatin. It In addition, multiple histone marks have been correlated is also bound by H3K9me3 methyltransferases (Tachibana with transcription activation, but the mechanistic details of et al. 2001), which can further methylate neighboring H3K9 their role are not entirely clear. These include the phospho- residues, and has a positive feedback loop relationship with rylation of threonine 6 on histone H3 (H3T6ph), which pre- cytosine DNA methylation (Stancheva 2005). vents the demethylation of H3K4me3 (Metzger et al. 2010), Another mechanism for repression is mediated by H3S28ph, which marks active promoters and transcribed the H3K27me3 mark and involves the action of Polycomb gene bodies (Drobic et al. 2010; Sawicka et al. 2014) proteins, which both deposit and are recruited by it. The and cross-talks negatively with H3K27me3 (Fischle et al. process has been extensively characterized in both plants 2003), and several arginine methylations. H3R2 is known and metazoans (Zheng and Chen 2011; Simon and Kingston to be both symmetrically and asymmetrically methylated, 2013), and also involves the monoubiquitination of histone with asymmetric methylation (H3R2me2a) being mutually H2A (H2AK119ub1; Shilatifard 2006). exclusive with H3K4me3 (Guccione et al. 2007; Kirmizis Additional histone marks associated with repressive et al. 2007) while H3R2me2s correlates positively with it chromatin include H4K20me3 (Balakrishnan and Milavetz (Yuan et al. 2012). Similarly, H4R3me2s is associated with 2010) and H3K64me3 (Daujat et al. 2009). It should be repression (Tsutsui et al. 2013; Dhar et al. 2012), and noted that heterochromatin is not a homogeneous entity, H4R3me2a with activation (Sun et al. 2011). and that different marks or combinations of marks may Transcription elongation. The eukaryote transcrip- characterize distinct types of repressive chromatin. tion cycle involves a complexly choreographed sequence of Transcription activation and initiation. The first nucleosome remodeling events and PTMs on histones and histone modification to be assigned potential functional the polymerase itself. Because nucleosomes are inhibitory significance was histone , which usually exer- to it, the nucleosome barrier has to be overcome for tran- cises a general stimulating effect on transcription by reduc- scription to proceed. This is achieved through a combi- ing the positive charge of histones and loosening histone- nation of histone acetylation and the action of the FACT DNA interactions (Allfrey et al. 1964). Many lysines complex (Reinberg and Sims 2006; Pavri et al. 2006; Zhu on histones are acetylated (Figure S9; the ones discussed et al. 2005), which partially disassembles nucleosomes, al- in some detail here include H3K9ac, H3K14ac H3K18ac, lowing for the polymerase to proceed, and then reassembles H3K23ac, H3K27ac, H3K56ac, H3K64ac, H4K5ac, them. The action of FACT is associated with the tran- H4K8ac, H4K12ac, H4K16ac, H4K59ac, H4K79ac, sient monoubiquitination of histone H2B (H2BK120ub1 in H4K91ac, H2AK4ac, H2AK8ac, H2AK12ac, H2BK5ac, mammals). H2BK12ac, H2BK16ac, and H2BK20ac), but this general The role of H3K36me3 in transcription elongation is

Figure 6 (preceding page): Maximum likelihood phylogenetic tree of H3 protein sequences in dinoflag- ellates and perkinsids. The tree was generated using RAxML ((version 8.0.26) under the LG+G model and with 100 bootstrap replicates. Additional sequences from genome sequences and MMETSP assemblies of other protists were included (the eight H3 variants from the ciliate lemnae, H3 sequences from two other ciliates, and Tetrahymena thermophila, the bicosoecid Bicosoecid sp., the chlorarachniophyte Bigelowiella natans, the two chromerids Chromera velia and Vitrella brassicaformis, the diatoms Thalassiosira weissflogii and Fragilariopsis ker- guelensis, the glaucophyte Gloeochaete witrockiana, the haptophytes Emiliania huxleyi and Prymnesium parvum, the amoebozoans Stereomyxa ramosa and Vexillifera sp., the Bolidomonas heterokont, the cercozoan Minchinia chitonis) as well as sequences from human, yeast, flies and Arabidopsis. Centromeric histone H3 variants from Homo sapiens, Saccharomyces cerevisiae, Arabidopsis, and Tetrahymena thermophila were also included and are highlighted in grey. Dinoflagellate sequences are highlighted in yellow while perkinsid proteins are marked in pink.

11 12 particularly well understood mechanistically (Wagner and Additional marks implicated in the regulation of mi- Carpenter 2012). The acetylation marks deposited during totic chromatin dynamics include H3T3ph (Polioudaki et elongation need to be erased if cryptic transcription from al. 2004; Wang et al. 2010), H3T11ph (Preuss et al. 2003), within the now more open chromatin is to be prevented (Ka- which is also involved in transcriptional activation (Metzger plan et al. 2003). H3K36me3 is deposited by methyltrans- et al. 2008), H3S28ph, H4S1ph and H2AS1ph (Barber et ferases (KMTs) associated with the elongating polymerase, al. 2004), H2AS122ph (Yamagishi et al. 2010), H2AT120ph and then serves as a recruitment mark for histone deacety- (van der Horst et al. 2015), and others. Most of these marks lases, which remove the acetylation marks and close the are conserved between plants and humans although the pre- cycle. However, this is not the only function of H3K36me3, cise patterns of their chromosomal distribution may differ as it has also been shown to play a significant role in the somewhat (Houben et al. 2007; Manzanero et al. 2000). process of splicing in metazoans (Kolasinska-Zwierz et al. Histone marks also play a role in nucleosome assembly 2009; Schwartz et al. 2009). during and independently of replication. H3K56ac, which Another site of modification associated with elonga- occurs in the core histone domain rather than the tail, facil- tion is H3K79 (H3K79me2/3; Nguyen and Zhang 2011). itates histone exchange (Rufiange et al. 2007); other marks H3K79me is also notable for being catalyzed by the Dot1 suggested to play such a role include H4K79ac and H4K91ac KMT (Feng et al. 2002) rather than a SET-domain methyl- (Yang et al. 2011). transferase as all other lysine methylations on histones. An additional mark involved in elongation is H2AY57ph, Conservation and divergence of dinoflagellate his- which appears to be required for the H2BK120 ubiquitina- tone tails tion/deubiquitination cycle (Basnet et al. 2014). In the light of the histone code, many dinoflagellate his- Finally, the isomerization of prolines 30 and 38 in H3 tones present a curious mixture of conserved and divergent (H3P30iso, H3P38iso) can regulate the activity of H3K36 characters (Figure 8 shows H3 tails in cate- methyltransferases (Nelson et al. 2006). natum). The most well known H3 variants are centromeric Mitosis. Chromatin undergoes dramatic transforma- histones, and those usually have highly divergent histone tions during mitosis, as it is duplicated in the S phase and tails (Figure S10). There are many examples of such tails then compacted in the M phase. Histone marks play a key in dinoflagellates, but there are also histones with tails fairly role in these processes (Desvoyes et al. 2014; Doenecke similar to the conventional state, and all sorts of variations 2014), in particular several phosphorylatable residues on between these two extremes. Thus Gymnodinium catena- histones H3, H4 and H2A (Sawicka and Seiser 2012). tum expresses a variant that is a close match to the typical H3S10ph, although it can also occur in other contexts, H3 sequence (CAMPEP_0117531604), and two very divergent is the best characterized such mark, involved in chromo- ones (CAMPEP_0117479984 and CAMPEP_0117494448), one somal condensation during mitosis (Hendzel et al. 1997). or both of which exhibit substitutions at many key residues A key event during this process is the establishment of an of the histone code, including H3K9, H3K27, H3K36 and interaction between the N-terminal tail of histone H4 and H3K79, as well as several insertions within the N-terminal an acidic patch on H2A-H2B dimers on neighboring nucle- tails and the core histone domain. The other variants in osomes, which, however, is prevented by the presence of Gymnodinium show an intermediate level of divergence, H4K16ac; H3S10ph recruits histone deacetylases, which re- with the region between residues 12 and 31 being partic- move acetylation marks from H4 tails, allowing compaction ularly variable. to happen (Wilkins et al. 2014). H3S10ph also has the Similar observations can be made in many other species, effect of excluding HP1, which is ejected from chromatin although because of the certainly incomplete sampling of during the M phase, as HP1 cannot bind to H3 tails con- variants in transcriptome assemblies, such a comprehensive taining both H3K9me3 and H3S10ph (Fischle et al. 2005). set of variations is not always seen.

Figure 7 (preceding page): Maximum likelihood phylogenetic tree of H4 protein sequences in dinoflag- ellates and perkinsids. The tree was generated using RAxML ((version 8.0.26) under the LG+G model and 100 bootstrap replicates. Additional sequences from genome sequences and MMETSP assemblies of other protists (the two ciliates, Paramecium caudatum and Tetrahymena thermophila, the amoebozoan Stereomyxa ramosa, the Bolidomonas heterokont, the chlorarachniophytes Bigelowiella natans and Lotharella globosa, the two chromerids Chromera velia and Vitrella brassicaformis, the cryptophyte Guillardia theta, the diatom Thalassiosira weissflogii, the glaucophyte Gloeochaete witrockiana, the haptophytes Emiliania huxleyi and Prymnesium parvum, the raphidophyte Fibrocapsa japonica, the xan- thopyte Vaucheria litorea, ) were included as well as sequences from human, yeast, flies and Arabidopsis. Dinoflagellate sequences are highlighted in yellow while perkinsid proteins are marked in pink.

13 Figure 8: Multiple sequences alignments of the N-terminal regions of histone H3 sequences from Gymno- dinium catenatum and of core histone H3 proteins from several other eukaryotes.. Sequences were aligned using MUSCLE (Edgar 2004; version 3.8.31) and visualized using JalView (Waterhouse et al. 2009; version 2.8.2).

The histone code in dinoflagellates The results from these comparisons for histones H3, H4, H2A, and H2B are shown in Figures 9, 10, 11 and 12, re- To assess the conservation of the histone code, multiple se- spectively. quence alignments of all variants of each core histone in Before these results are discussed, it should be noted each species and a reference (taken to be the human core that only very divergent H3 histones are found in the as- histones, with histone H3.1 used in the case of H3) were car- sembled Symbiodinium minutum contigs (even the residues ried out. All histone sequences with complete N-terminal that are conserved in one of the annotated variants accord- tails were used for this analysis (i.e. a protein was allowed ing to the criteria used above, are found in the context of to be incomplete at its C-terminus and not just the “com- an otherwise very divergent N-terminal tail). However, the plete” sequences shown in Figure 1 were included). This transcriptomes of other Symbiodinium isolates contain more was done in order to capture all the histone tail diversity conventional histone H3 sequences, a discrepancy that, if for which there is evidence in the data. contamination is to be excluded, is best explained as being Conservation was then scored against the reference se- due to the incompleteness of the Symbiodinium minutum quence as follows: for a given position i in the reference assembly, which is known to capture only a fraction of the protein and a radius r, a precise match of the sequence total genomic sequence (Shoguchi et al. 2013). with coordinates [i − r, i + r] was required to score that Repression and heterochromatin. H3K9, displays residue as conserved, i.e. when r = 0, only a conservation an intriguing pattern of presence and absence in dinoflag- of the residue itself is required, but when r = 1 and r = 2, ellates. It is present in at least one variant in the major- the 3-amino acid and 5-amino acid contexts were consid- ity of species (with the exception of a few transcriptomes, ered. This approach was adopted as histone marks are of- in which only a single variant was assembled, and Amoe- ten deposited and read according to their sequence context. bophrya and australes, in which all three and Thus, conservation of the context makes it more plausible two H3 variants, respectively, do not have it), although its that the modification is also conserved. Of course, there sequence context is often not well conserved. However, most are limitations to such interpretations – residues and their species also have multiple variants without H3K9. Many of context need not be conserved because of conservation of a these belong to the group of variants with extended protein particular histone mark, and the absence of strict conser- sequence and very divergent tails, which are more likely to vation of context does not necessarily mean that the mark have specialized functions such as those of centromeric H3 is not deposited (its reader and writer proteins might have histones in other eukaryotes. Yet quite remarkably, H3K9 evolved and adapted accordingly). Nevertheless, it remains is not absent solely from H3 tails that are highly divergent true that the complete absence of a residue is almost certain – for example, Alexandrium monilatum expresses a 167-aa evidence that the corresponding histone marks are absent histone H3 (the increased length is due to a C-terminal too, and that strong conservation of its context boosts con- extension), the N-terminal portion of which is overall a fidence in the preservation of at least the capacity to deposit close match to human histone H3 with the notable excep- them. tion of H3K9, which is substituted by a methionine (Figure

14 15 S11). Functional homologs of HP1 have been identified in these lysines are known to be redundant with each other the two other major lineages, ciliates (Schwope et (Dion et al. 2005; Martin et al. 2004), and could be lost if al. 2014) and apicomplexans (Flueck et al. 2009), but no constraints on histone sequence are relaxed. clear homologs (defined as proteins with a pair of chromod- The status of conservation of H3K4 is of particular in- omains and a chromo shadow domain) were detected in di- terest given its strong association with active TSSs in other noflagellate transcriptomes. None were found in Chromera eukaryotes. It is found in all dinoflagellates (with the excep- velia or Perkinsus marinus either, but at least in the case tion of the single divergent H3 variant assembled in Brand- of Chromera velia this is most likely a false negative case. todinium nutriculum), usually in most variants, and often in As H3K9 is also subject to numerous other posttranscrip- a well conserved sequence context. Variants without it are tional modifications, the possibility that its conservation is also observed, and as with H3K9, these are not always of for other reasons and that the overall H3K9me3/HP1 hete- the elongated highly divergent varieties, but overall these rochromatin formation mechanism is not conserved cannot observations suggest a retention of the functional impor- be dismissed. tance of H3K4me3 in dinoflagellates. Almost all dinoflagellates possess at least one histone H3 Other, less well characterized residues such as H3R2 and variant with H3K27 (more than half of the variants possess- H3T6 are also fairly well conserved, but it could be that ing H3K27 is the typical condition), although as in the case this is due to constraints on the H3K4 sequence context. of H3K9, its sequence context is not well conserved (because Another methylated arginine, H4R3, displays a remarkable the neighboring H3S28 is usually missing). The number of absence of conservation, being found in few dinoflagellates H2A variants assembled is often low, thus it is not certain outside of dinotoms (but well conserved in Perkinsus and that all are present in the assemblies, but in most species a Chromera). variant with H2AK119 (often ubiquitinated in concert with The conservation status of these marks suggests preser- H3K27me3 deposition) is present. vation of the pivotal role of H3K4me3 and a derived state H4K20 is observed in most dinoflagellate H4 histones for some other components of the histone code associated with the exception of the four variants in Amoebophrya and with the activation and repression of transcription. the two variants in the Symbiodinium minutum genome as- Transcription elongation. H3K36 is observed in the sembly. majority of dinoflagellates (with the possible exception of H3K64 is a remarkably well conserved residue (although brevis among the species with a large number of not always in its sequence context), being present in at assembled H3 variants), suggesting the possible conserva- least one (usually most) H3 variant in all dinoflagellates. tion of its role in transcription elongation. H3K79 is less As H3K64 resides in the core histone domain, this might well conserved although still found in most species. The be due to constraints other than its role in heterochromatin presence/absence pattern of H2BK120 is patchy, but this formation, but its strong conservation suggests that if nu- might well be due to incomplete assemblies as the number cleosomal heterochromatin does form in dinoflagellates, it of observed H2B variants is often low. might be playing a role in the process. The two prolines subject to isomerisation, H3P30 and Overall, the analysis of heterochromatin-associated his- H3P38 are also found in most dinoflagellates but not in all tone mark-bearing residues suggests that while the capacity tails and not always together. to deposit these marks has been retained in most dinoflag- Somewhat surprisingly, the residue implicated in tran- ellates, the constraints on their sequence context seem to scriptional elongation that displays the strongest conserva- have been relaxed. This is possibly due to a significant tion, is H2AY57. reduction of the amount and functional importance of nu- Overall, these observations suggest that the capacity to cleosomal heterochromatin, or even its complete loss. deposit the key histone marks involved in transcriptional Transcription activation and initiation. With the elongation is present, with some divergence, and the pro- exception of H4K12, the majority of acetylated lysines cesses they are involved in might be conserved in dinoflag- in the N-terminal histone tails display poor conservation ellates. across dinoflagellates. This is not entirely surprising as Mitosis. The phosphorylatable residues involved in

Figure 9 (preceding page): Conservation of key posttranscriptionally modified residues in dinoflagellate histone H3 proteins. The radius r refers to the size of the context considered when scoring conservation. When r = 0, only the residue itself is considered; when r = 1, a perfect match to the three-amino acid peptide also including the flanking residues on each side is required; when r = 2, the five-amino acid peptide also including the two flanking residues on each side is considered. The fractions C/N indicate the number of histone H3 proteins C with conserved residues according to the criteria specified by the radius relative to the total number of histone H3 proteins N for each species. All histone H3 sequences with complete N-terminal tails identified in the transcriptomic data are included (the C-terminal portions of the protein were allowed to be incomplete for the purposes of this analysis, but the N factor in the C/N fractions was adjusted accordingly if necessary in cases when gaps in the alignments were due to an incomplete sequence).

16 17 mitosis (H3S10ph, H3T3ph, H3T11ph, H3S28ph, H4S1ph, finger, which binds to a variety of unmodified and mod- H2AS1ph, H2AS122ph, H2AT120ph) display quite poor ified histone tail substrates, WD40 domains, which bind conservation in dinoflagellates. H3T3ph is an exception to trimethylated lysines among other targets, Tudor do- but this could be because of constraints on the neighbor- mains, which bind to both methylated lysines and arginines, ing H3K4. H3S10 is also present in at least one variant PWWP domains, which bind to methylated lysines, and in many species, but is also completely absent in numer- others. ous others. Also remarkable is the absence of conservation ATP-dependent chromatin remodeling complexes, of H4K16, the residue involved in the control of nucleosome which move nucleosomes along DNA, are another important compaction. Given these observations, it is quite likely that component of the eukaryote chromatin toolkit. Four major the H3K9me3S10ph switch does not operate in dinoflagel- families of remodelers – SWI/SNF, ISWI, CHD, and INO80 lates and that histone phosphorylation plays a reduced (if – are known, distinguished by characteristic combinations any) role in mitosis. of protein domains (Clapier and Cairns 2009). Of note, all these residues are best conserved in Oxyrrhis To evaluate the composition of the set of writer, eraser marina, the earliest branching dinoflagellate included in and reader proteins and chromatin remodelers in dinoflag- this study. It is therefore possible that the correspond- ellates, putative homologs were identified by scanning ing processes are most preserved in Oxyrrhis while other datasets for the presence of their characteristic domains, dinoflagellates are in an even more derived state. or combinations of domains (Figure 13). In addition to H3K56, H4K79 and H4K91, the acetylation of which has Perkinsus and Chromera, the same analysis was also carried been implicated in the process of nucleosome assembly, are out on the sequenced and annotated genomes of a diverse very well conserved across dinoflagellates, including at the set of unicellular eukaryotes, in order to compare the num- level of their sequence context. ber of proteins in each group to what is observed in other protozoans. Histone marks writers, erasers and readers, and Remarkably, most dinoflagellates contain a larger num- chromatin remodelers ber of sirtuins and HDACs than most other eukaryotes. Histone marks are deposited and erased by specific While these deacetylases need not be acting on histones (es- and read by proteins containing “reader” domains, recogniz- pecially sirtuins), these observations support the occurence ing particular modifications. Histone acetylation is carried of histone acetylation and deacetylation in dinoflagellates. out by several families of lysine acetyltransferases – KATs, The corresponding HAT enzymes are more difficult to or, in the context of histones, HATs (Marmorstein and Zhou identify. There are multiple families of HATs, and not all 2014; Wang et al. 2008; Thomas and Voss 2007; Doyon of them act specifically on histones. In particular, the and Cˆot´e2004). Acetylation marks are removed by his- GNAT-type (GCN5-Related N-Acetyltransferases) HATs tone deacetylases (HDACs), of which there are also several also acetylate a wide variety of other substrates and are classes (Mariadason 2008; Vaquero 2009), one of which in- not even restricted to eukaryotes (Xie et al. 2014); these cludes members of the sirtuin protein family. Lysine methy- are abundant in dinoflagellates but it is not certain that lation is carried out by SET domain-containing proteins they acetylate histones. For these reasons, the adapter com- (Dillon et al. 2005), with the exception of H3K79 methy- ponents of the chromatin modifying complexes, in which lation and Dot1. Lysine methylation is primarily erased they are usually found, were searched for. However, none through the action of demethylases of the Jumonji (Jmj) were detected in any of the dinoflagellate species examined. family (Takeuchi et al. 2006). Arginine methyltransferases Both KAT11 and MYST HATs were found in dinotoms, in include PRMT5 and CARM1-type proteins (Karkhanis et Perkinsus marinus, and in Alexandrium tamarense among al. 2011; Wysocka et al. 2006). the dinoflagellates; a KAT11 HAT is found in one of the A variety of reader domains are known (Musselman et al. Symbiodinium isolates and in Gymnodinium, and a MYST 2012). These include the bromodomain and the chromod- HAT is found in Oxyrrhis. No other obvious HATs were omain, which bind to acetylated and methylated lysines, re- found, thus the most likely explanation is that dinoflagel- spectively (Sanchez et al. 2014; Blus et al. 2011), the PHD late HATs primarily belong to the GNAT family, and might

Figure 10 (preceding page): Conservation of key posttranscriptionally modified residues in dinoflagellate histone H4 proteins. The radius r refers to the size of the context considered when scoring conservation. When r = 0, only the residue itself is considered; when r = 1, a perfect match to the three-amino acid peptide also including the flanking residues on each side is required; when r = 2, the five-amino acid peptide also including the two flanking residues on each side is considered. The fractions C/N indicate the number of histone H4 proteins C with conserved residues according to the criteria specified by the radius relative to the total number of histone H4 proteins N for each species. All histone H4 sequences with complete N-terminal tails identified in the transcriptomic data are included (the C-terminal portions of the protein were allowed to be incomplete for the purposes of this analysis, but the N factor in the C/N fractions was adjusted accordingly if necessary in cases when gaps in the alignments were due to an incomplete sequence).

18 19 participate in chromatin modifying complexes of derived each subunit in dinotoms is expected (Figure S12), a di- composition. versity of subunits is also observed in other species, as well The number of SET-domain proteins in dinoflagellates as in Perkinsus. Intriguingly, in some cases variant SSRP1 is strikingly large and exceeds anything observed in other subunits also contain HMG boxes, a DNA binding domain, eukaryotes. Not only that, but numerous Dot1 proteins (Figure S13), which is not observed in FACT subunits of are also observed, as is an expansion of the PRMT5 fam- other eukaryotes; its functional significance is currently not ily. Accordingly, the diversity of Jumonji proteins is also clear. large. These are certainly underestimates given that they The ubiquitous presence of the FACT complex through- are based on transcriptomes and not complete genomes, out the dinoflagellate phylogeny suggests that transcription although the extent of actual functional diversification is through nucleosomal arrays does occur in dinoflagellate nu- not yet clear. Of note, these observations are confirmed clei. in the Symbiodinium minutum genome assembly, therefore it is unlikely that they are due to diversity of alternative transcript products in the transcriptome. In contrast to the expansion of lysine and arginine methyltransferases, the number of proteins with classical Histone chaperones “reader” domains is reduced in dinoflagellates, with the exception of the WD40 domain. Proteins with bromod- While the FACT complex disassembles and reassembles omains, chromodomains, PHD fingers, Tudor or PWWP nucleosomes during transcription, other histone chaper- domains are found throughout the dinoflagellate phylogeny, ones act during replication-dependent and replication- but they are fewer in number compared to other protozoans, independent nucleosome assembly and histone exchange with the exception of dinotoms. Dinoflagellates do express (Burgess and Zhang 2013). NAP1 loads H2A-H2B dimers a large number of proteins with WD40 domains; however, (Mosammaparast et al. 2002), while CAF-1 is the the WD40 domain is by no means restricted to the context replication-dependent and Hira the replication-independent of reading chromatin marks, and is found in many other chaperone for H3-H4 dimers (Verreault et al. 1996; Ray- proteins in the cell (Xu and Min 2011). Gallet et al. 2002); both receive the dimers from another Finally, putative chromatin remodelers in were identi- loading factor, ASF1 (English et al. 2006). fied. The number of candidate SWI/SNF ATPases is com- Figure 13 shows the number of detected homologs for parable to that in other protozoans, although clear CHD each factor. CAF-1 is only found in dinotoms and in and ISWI homologs are detected in few species other than Perkinsus, while Hira is detected in few but phylogeneti- dinotoms (Figure 13). cally widely distributed dinoflagellate species. Given that ASF1 is found in almost all dinoflagellates, the absence of The FACT complex detection of Hira can be interpreted as a probable case of widespread false negatives. If, however, CAF-1 is genuinely The FACT complex consists of two subunits, Spt16 and absent in dinoflagellates (while Hira is present), that would SSRP1, and plays a key role in transcription through chro- have very intriguing implications for the role of nucleosomes matinized DNA templates (Orphanides et al. 1998; Rein- in dinoflagellate chromatin (see the Discussion section). berg and Sims 2006). Its presence or absence is therefore highly informative of the role that histones might play in Remarkably, the majority of dinoflagellates express a dinoflagellates. Strikingly, homologs of the components of large number of NAP1 proteins (Figure 13); this is also ob- FACT are detected in almost all dinoflagellates (Figure 13), served in a few other unicellular eukaryotes, but rarely to with an identical domain structure to that of FACT sub- the same extent. One explanation for the expansion of the units in yeast and human (Figure 14), strongly implying NAP1 family is that it might be due to an underlying di- that they indeed constitute a bona fide FACT complex. versification of NAP1 substrates, in the form of different Furthermore, while the existence of multiple variants of H2A/H2B variants and dimers.

Figure 11 (preceding page): Conservation of key posttranscriptionally modified residues in dinoflagellate histone H2A proteins. The radius r refers to the size of the context considered when scoring conservation. When r = 0, only the residue itself is considered; when r = 1, a perfect match to the three-amino acid peptide also including the flanking residues on each side is required; when r = 2, the five-amino acid peptide also including the two flanking residues on each side is considered. The fractions C/N indicate the number of histone H2A proteins C with conserved residues according to the criteria specified by the radius relative to the total number of histone H2A proteins N for each species. All histone H2A sequences with complete N-terminal tails identified in the transcriptomic data are included (the C-terminal portions of the protein were allowed to be incomplete for the purposes of this analysis, but the N factor in the C/N fractions was adjusted accordingly if necessary in cases when gaps in the alignments were due to an incomplete sequence)

20 21 RNA Polymerase II CTD tails variants include putative homologs, or at least func- tional analogs, of well known variants in other eu- The C-terminal domain of the largest subunit of RNA Poly- karyotes, including H2A.Z, H2A.X and possibly even merase II (Rpb1) and its posttranslational modifications H3.1/H3.3. play a crucial role in coordinating numerous processes dur- ing the transcriptional cycle. In almost all eukaryotes (Yang 2. Overall, dinoflagellates express the most divergent hi- and Stiller 2014) the tail consists of multiple copies of a stone proteins among all autonomous eukaryote nu- consensus heptad sequence, YSPTSPS, which contains five clei: their histones exhibit frequent loss of key residues phosphorylation sites and two proline isomerization sites. highly conserved among all other eukaryotes and Specific modifications are deposited on and removed from many variants are elongated. The latter, however, is these repeats during each phase of the transcriptional cy- not a universal feature, as a fairly conventional set of cle, and they serve to recruit various effector and regulatory histones (in terms of protein length, but not necessar- proteins (Buratowski 2009), constituting the so-called CTD ily sequence conservation) is found in many species. code (Eick and Geyer 2013), operating in concert with the histone code. 3. Dinoflagellate histones are expressed at lower levels Rpb1 homologs with CTD tails were identified in most than DVNPs, consistent with the limited role of his- dinoflagellates; curiously, in a few cases more than one ho- tones in chromatin packaging. molog was found, even in non-dinotom species. The heptad repeats are recognizable in most species (Figure 14); how- 4. The histone code exhibits significant divergence from ever, it is only in dinotom Rpb1 proteins that they closely the conventional eukaryote state but some of its key match the consensus sequence (Figure S14B). In all other elements are conserved. These include H3K4 and dinoflagellates, they are highly divergent (example shown in some of the residues involved in transcription elonga- Figure S14A). This can be interpreted as a sign that some of tion. Histone marks associated with heterochromatin the classical functionality of the CTD code has been lost in might also be conserved. The components of the his- dinoflagellates, although divergent CTD repeats are seen in tone code involved in chromatin dynamics during mi- other eukaryotes too. The PTMs of the tails have not been tosis are among the least conserved. directly studied in any such group, which will be necessary 5. Dinoflagellates express members of most protein fam- to clarify the roles they play in transcription. ilies comprising the chromatin mark reader, writer and eraser toolkit. Remarkably, lysine and arginine Discussion methyltransferases and demethylases have even ex- panded in the group. In contrast, the reader proteins This study presents the most comprehensive characteriza- display reduced diversity. tion of dinoflagellate histone proteins carried out so far. 6. There is evidence for the presence of ATPases involved The presence of histones, despite their low abundance in di- in chromatin remodeling in dinoflagellates. noflagellates chromatin, is confirmed and generalized to all species for which genomic or transcriptomic sequence data 7. Histone chaperons are definitely present and the is available. The properties of their sequences are integrated NAP1 family has even expanded. It is possible that with the phylogenetic distribution of key components of the no replication-dependent H3-H4 chaperone is present, histone modification, chromatin remodeling, and transcrip- although that remains to be confirmed. tional machineries. These analyses reveal that: 8. Remarkably, a complete FACT complex is also 1. Core histones are present in all dinoflagellates, typ- present, occasionally in multiple variants, meaning ically in multiple variants. The linker histone H1 is that transcription through nucleosome arrays occurs also detected in most species. Dinoflagellate histone in dinoflagellates.

Figure 12 (preceding page): Conservation of key posttranscriptionally modified residues in dinoflagellate histone H2B proteins. The radius r refers to the size of the context considered when scoring conservation. When r = 0, only the residue itself is considered; when r = 1, a perfect match to the three-amino acid peptide also including the flanking residues on each side is required; when r = 2, the five-amino acid peptide also including the two flanking residues on each side is considered. The fractions C/N indicate the number of histone H2B proteins C with conserved residues according to the criteria specified by the radius relative to the total number of histone H2B proteins N for each species. All histone H2B sequences with complete N-terminal tails identified in the transcriptomic data are included (the C-terminal portions of the protein were allowed to be incomplete for the purposes of this analysis, but the N factor in the C/N fractions was adjusted accordingly if necessary in cases when gaps in the alignments were due to an incomplete sequence).

22 23 9. The RNA Polymerase II CTD tails exhibit high de- How are we to interpret these observations? gree of divergence, making it unclear to what extent the CTD code is conserved. The presence of the FACT complex is key to the piecing together of a working model as it means that transcription through nucleosomes occurs in dinoflagellates. Less clear 10. Finally, Perkinsus, the sister lineage of dinoflagellates, is the context in which it is happening. The permanent has a mostly conventional set of histones, yet some of condensation and liquid crystalline state of dinoflagellate its histone variants exhibit divergence features that chromosomes is expected to not be conductive to transcrip- resemble what is observed in core dinoflagellates. This tion. Thus, models of transcription in dinoflagellates have includes the otherwise most highly conserved and with proposed that most of it happens on decondensed extrachro- very few variants histone H4. It is plausible that mosomal chromatin loops that stick out of the permanently although perkinsid chromosomes are organized into condensed general chromatin mass (Wisecaver and Hack- typical nucleosomal chromatin, the first steps in the ett 2011). If that model is correct, it could be that these process of evolution towards the derived state seen temporary loops associate with nucleosomes, and transcrip- in dinoflagellates might have taken place prior to the tion proceeds in a more or less conventional manner, with separation of the two groups. promoters defined by the presence of H3K4me3 and maybe

Figure 13 (preceding page): Presence and absence of histone mark writer, eraser and reader proteins, chromatin remodeling complexes, histone chaperones, the FACT complex, and RNA Polymerase II CTD repeats in dinoflagellates and representative unicellular eukaryotes. There are multiple different HAT families and homologs from these were searched for separately. Except for the yeast Hat1 and the human PCAF protein, the domains found in GNAT-type acetyltransferases are also found in many acetyltransferase enzymes whose activity has little to do with chromatin, but the catalytic core of GNAT-type HAT complexes is formed by several proteins (Ada2, Ada3, Sgf29, Gcn5 in the case of the SAGA complex), and the adapter proteins (Muratoglu et al. 2003) were searched for in addition to acetyltransferase domains. The KAT11 domain is found in numerous HATs such as the metazoan p300 and CBP and the yeast Rtt109 (Wang et al. 2008). MYST-family HATs (Thomas and Voss 2007) were identified using searches for the MOZ SAS domain, and; in addition, a search for the NuA4 domain, part of the EAF6 component of the NuA4 complex of which MYST HATs are often part of (Doyon and Cˆot´e2004), was carried out. There are four classes of HDACs (Mariadason 2008), one of which, the sirtuins is characterized by the presence of a SIR2 domain (although sirtuin specificity is not restricted to histones). Two classes of lysine methyltransferases are known too – the majority are of the SET-domain type, Dot1 is the exception. Known arginine methylation enzymes include PRMT5– and CARM1–type proteins (Karkhanis et al. 2011; Wysocka et al. 2006); CARM1 is not shown as no CARM1 proteins are found in any of the unicellular species shown here. Aside from the monoamine oxidase LSD1 (Rudolph et al. 2013), lysine demethylases contain a Jumonji domain (JmjC) (Johansson et al. 2014). Bromodomains (Sanchez et al. 2014) and chromodomains (Blus et al. 2011) are readers of lysine acetylation and methylation, respectively. WD40 domains bind to trimethylated lysines among other substrates, Tudor domains can bind to both methylated lysines and arginines, PWWP domains bind to methylated lysines, and PHD fingers can bind to a variety of unmodified and modified histone tail substrates (Musselman et al. 2012). The chromatin remodeling complexes of the SWI/SNF, ISWI, CHD and INO80 families are characterized by the presence in their ATPase subunits of a combination of a SWI2 N and a Helicase C domain (although the domains themselves are individually not necessarily restricted to proteins with such functions), plus additional domains in each of the families. These domains include HAND, SLIDE, SnAC, and DBINO as shown in the figure. In addition, the number of detected proteins containing both a SWI2 N and a Helicase C domains is shown, as well as those containing all three of Chromo, SWI2 N and a Helicase C domains (i.e. putative CHD-family ATPases) and proteins with all three of the SLIDE, SWI2 N and a Helicase C domains (putative ATPases in the ISWI family). SWIB and SWIRM are domains found in some of the noncatalytic components of SWI/SNF chromatin remodeling complexes. The FACT complex consists of two proteins, Spt16 and SSRP1; the latter contains the SSrecog domain. In addition to nucleosome chaperoning during transcription elongation, multiple histone chaperones are involved in replication-coupled and replication-independent nucleosome assembly and histone exchange (Burgess and Zhang 2013); among them, NAP1 loads H2A-H2B dimers (Mosammaparast et al. 2002), while CAF-1 and HIRA transfer H3-H4 dimers (Verreault et al. 1996; Ray-Gallet et al. 2002) during and independent of replication, respectively, which are in turn loaded onto them by ASF1 (English et al. 2006). The figure shows the number of proteins detected containing each of the domains (a threshold of e ≤ 10−8 was applied using HMMER3.0), with the exception of the CTD repeats of the Rpb1 subunit of RNA Polymerase II, which were manually annotated in putative Rpb1 homologs (note that in that case “0” means that CTD repeats are not apparent in the Rpb1 homolog, while “–” refers to absence of detection of an Rpb1 homolog). The results for the organisms labeled in orange are based on available genome assemblies in contrast to those lettered in black, which are derived from transcriptome assemblies.

24 Figure 14: FACT complex subunits and their domain organization in glacialis. (A) Polarella glacialis SPT16 proteins; (B) Polarella glacialis SSRP1 proteins; (C) Saccharomyces cerevisiae SPT16; (D) Saccha- romyces cerevisiae SSRP1 (POB3); (E) Domain color code. An orange “I” in front and/or after the protein indicates that the protein sequence is known to be not represented completely in the transcriptome assembly. even H2A.Z, and a transcription elongation cycle involv- Of course, many unknowns remain, for example, how ing the action of FACT and the deposition of H3K36me3 the formation and identity of chromatin loops is regulated, and some of the other marks usually associated with ac- which histones exactly associate with them, among the nu- tive transcription. The possible absence of replication- merous variants observed, and what the role of the ones (if dependent histone chaperones also fits such a picture, as any) associating with other areas of the genome is. It is they would not be necessary if the primary mode of nucle- tempting and natural to think that the least derived his- osome assembly is related to active transcription outside of tones are the ones forming nucleosomal arrays on loops. the S-phase (however, it is not directly compatible with the This is the most plausible explanation, but it need not possible presence of replication-dependent and replication- be true. Examples of significant innovation in the com- independent H3.1/H3.3 variants). The observation that hi- position of the nucleosome are known – for example, in stone marks involved in chromatin condensation during mi- bdelloid rotifers, a metazoan lineage with otherwise normal tosis are poorly conserved in dinoflagellates is also consis- chromatin, conventional histone H2A has been apparently tent with it. completely replaced by an elongated H2A variant (Van Don- inck et al. 2009). Given the uniqueness and extreme diver-

25 gence of dinoflagellate nuclear organization, the possibility like proteins in dinoflagellates are no less interesting. They of analogous innovations cannot be rejected, including some could well function as much more than mere packaging pro- that are restricted to individual lineages and not common teins and be involved in a completely new “DVNP code”, to all dinoflagellates. analogous to the histone code. Not only that but a large Additional clues might be provided by the other notable number of distinct DVNPs are expressed in all dinoflag- example of nuclei, in which histones do not play a primary ellates; the divisions of functions between them are com- packaging role, sperm cells, a system that has been stud- pletely unknown at present. A separate DVNP code might ied much more extensively. Chromatin undergoes a dra- provide an explanation for the expansion of some portions of matic transformation during spermatogenesis as histones the epigenetic writer and eraser toolkit, as DVNPs are most are largely replaced by protamines and it becomes con- likely methylated by novel, specialized KMT enzymes. Less densed (Rathke et al. 2014), a condition reminiscent to clear in such context is why proteins with classical reader that of dinoflagellate nuclei. However, not all histones are domains are reduced in number. Some possible ways out of removed – recent genome-wide mapping studies have re- this conundrum include the takeover of such functionalities vealed that they remain associated with a small fraction by proteins with other domains (such as WD40) and the of the genome (Hammoud et al. 2009; Erkek et al. 2013; evolution of entirely novel ones. Carone et al. 2014), in particular the promoters of devel- DVNPs and permanent condensation of chromatin could opmental regulators. Of note, it seems that nucleosomal also explain the poor conservation of histone phospho- patterns upon nuclease digestion of sperm chromatin only rylated during the mitotic cycle – dinoflagellate histones become apparent under preparation conditions that stabi- are not involved in a nucleosomal compaction and decom- lize protein-DNA interactions (Carone et al. 2014). Sperm paction cycle during mitosis. Similarly, DVNPs are likely is also the site of expression of many unique histone vari- to be the main component of what is the equivalent of het- ants (Wiedemann et al. 2010; Soboleva et al. 2011; Schenk erochromatin, which, if true, makes it unclear what func- et al. 2013), thought to play a role in both the process of tional roles might be left for histone-based heterochroma- histone replacement and in the final condensed chromatin. tinization; this is the reason why the putative conservation Parallels can be drawn between that system and dinoflag- of heterochromatin marks is the most uncertain among the ellates – variant histones might be associated with specific modifications discussed here. regions of dinoflagellate genomes even within an otherwise An enormous amount remains to be learned about di- permanently condensed chromatin mass. noflagellate chromatin, transcription, and transcriptional and posttranscriptional regulation. The answers to these Specialized histone variants from other organisms can questions will derive from the application of the genomics also have analogs in dinoflagellates, and these are not lim- and tools that have successfully been applied to ited to centromeric histones. The group that might be most the study of chromatin structure and histone modifications informative are kinetoplastids. Dinoflagellates and kineto- in model eukaryote systems. These include the use of tar- have evolved convergently in many aspects of their geted mass-spectrometry analysis to reveal the exact post- biology (Lukes et al. 2009), including the polycistronic or- translational modifications deposited in vivo onto histones ganization of genes and the ubiquity of trans-splicing. Hi- and DVNPs (Tan et al. 2011), and of functional genomic stone marks and variants have been studied in some kine- assays such as ChIP-seq (Johnson et al. 2007), ATAC-seq toplastids and intriguing discoveries have been made. For (Buenrostro et al. 2013) and DNAse-seq (Hesselberth et example, in Trypanosoma brucei novel variants of all four al. 2009), and Hi-C/ChIA-PET (Lieberman-Aiden et al. core histones mark the boundaries of polycistronic tran- 2009; Fullwood et al. 2009) to characterize chromatin struc- scription units (Siegel et al. 2009), and another histone ture, the distribution of histones, histone marks and other H3 variant is associated with (Lowell and Cross chromatin-associated proteins along dinoflagellate genomes, 2004). Analogous variants might be present in dinoflagel- and their three-dimensional organization. A prerequisite for lates. Chromosome dynamics during mitosis in kinetoplas- such studies is the availability of high-quality and reason- tids is also unique and apparently either highly derived or ably complete genome sequences from several species, which very deeply diverging (Akiyoshi and Gull 2013); trypanoso- should be facilitated by the ongoing advances in genome se- matid H3 histones lack H3S10 (Sullivan and Naguleswaran quencing technologies. 2006), which might be related to this fact, and has simi- larities to the poor conservation of histone phosphorylation sites in dinoflagellates. Methods The restriction of histones to certain sections of the Genomic and Transcriptomic Sequence Data genome possibly combined with their diversification and subfunctionalization can explain the relaxation of the con- MMETSP transcriptome datasets and assemblies were straints on their sequence imposed by the histone code. The downloaded on June 19th 2014. Glenodinium foliaceum functional role of DVNPs is highly relevant for clarifying to (Stein 1883) and Kryptoperidinium foliaceum (Linde- what extent that hypothesis is true. The focus of this study mann 1924) are listed separately following the submis- has been on histones, but DVNPs and any other histone- sion labels, even though they are considered synonymous

26 (G´omez2005). Low-quality transcriptome assemblies, fea- -t -X 1000. Transcript-level quantification was then car- turing very low numbers of assembled transcripts, were ried out with eXpress (Roberts and Pachter 2013; version removed. A full list of the samples used is provided in Ta- 1.5.1). TPM (Transcript Per Million) values were used for ble S1. In addition, genome assemblies and annotation for subsequent analysis. Perkinsus marinus (accession number GCF_000006405.1) and Symbiodinium minutum were also used. Additional genome assemblies and annotations were downloaded Acknowledgments from the NCBI (Thecamonas trahens: GCA_000142905.1, Acanthamoeba castellanii: GCF_000313135.1, Monosiga This material is based upon work supported by the National brevicollis: GCF_000002865.2, Paramecium cauda- Science Foundation under Grant No. CNS-0521433. tum: GCA_000715435.1, Capsaspora owczarzaki: GCF_000151315.1, Trichomonas vaginalis: GCF_000002825.2, References Ectocarpus siliculosus: GCA_000310025.1) or from the EnsemblProtists ( falciparum, Toxoplasma Akiyoshi B, Gull K. 2013. Evolutionary cell biology of chro- gondii, Entamoeba histolytica, Chlamydomonas reinhardtii, mosome segregation: insights from trypanosomes. Open pusilla, Tetrahymena thermophila, Guillar- Biol 3(5):130023. dia theta, Emiliania huxleyi, Naegleria gruberi, Dic- Allfrey VG, Faulkner R, Mirsky AE. 1964. Acetylation and tyostelium discoideum, Phytophthora infestans, Cyanid- Methylation of Histones and Their Possible Role in the ioschyzon merolae) and EnsemblFungi (Saccharomyces Regulation of RNA Synthesis. Proc Natl Acad Sci U S cerevisiae, Schizosaccharomyces pombe) databases. A 51:786–794. Balakrishnan L, Milavetz B. 2010. Decoding the histone H4 Sequence Analysis lysine 20 methylation mark. Crit Rev Biochem Mol Biol 45(5):440–452. Histone proteins were identified using a combination of BLASTP searches (using histone sequences from Baldi S, Becker PB. 2013. The variant histone H2A.V Homo sapiens, Saccharomyces cerevisiae, Drosophila of Drosophila – three roles, two guises. Chromosoma melanogaster, Arabidopsis thaliana and Tetrahymena ther- 122(4):245–258. mophila as queries, and an e-value of 10−10 as cutoff) Bannister AJ, Zegerman P, Partridge JF, Miska EA, and HMMER3.0 (Eddy 2011) scans against the Pfam 27.0 Thomas JO, Allshire RC, Kouzarides T. 2001. Selec- database (Finn et al. 2015) (scanning for histone fold and tive recognition of methylated lysine 9 on histone H3 by linker histone domains). the HP1 chromo domain. Nature 410(6824):120–124. For most of the analyses presented, incomplete hits (i.e. Barber CM, Turner FB, Wang Y, Hagstrom K, Taverna partial sequences without both a start and a stop codon) SD, Mollah S, Ueberheide B, Meyer BJ, Hunt DF, Che- were removed. For the analysis of H3 histone tails, proteins ung P, Allis CD. 2004. The enhancement of histone with complete N-termini but incomplete C-termini were in- H4 and H2A serine 1 phosphorylation during mitosis cluded; in addition, H3 sequences were manually examined and S-phase is evolutionarily conserved. Chromosoma and in cases where additional amino acids residues were 112(7):360–371. present in front of an otherwise clearly conventional his- Barbrook AC, Howe CJ. 2000. Minicircular plastid DNA in tone H3 tail, such residues were removed (their most likely the dinoflagellate Amphidinium operculatum. Mol Gen origin is the presence of an earlier start codon in the tran- Genet 263:152–158. script assembly and the most parsimonious explanation is Basnet H, Su XB, Tan Y, Meisenhelder J, Merkurjev D, that they are not part of the actual protein). Clear cases Ohgi KA, Hunter T, Pillus L, Rosenfeld MG. 2014. Ty- of misassembly (such as concatenated copies of histone pro- rosine phosphorylation of histone H2A by CK2 regulates teins in the same sequence) were also removed. transcriptional elongation. Nature 516(7530):267–271. Multiple sequence alignments were carried out using Bayer T, Aranda M, Sunagawa S, Yum LK, Desalvo MK, MUSCLE(Edgar 2004; version 3.8.31) and visualized with Lindquist E, Coffroth MA, Voolstra CR, Medina M. JalView (Waterhouse et al. 2009; version 2.8.2). Phyloge- 2012. Symbiodinium transcriptomes: genome insights netic analysis was carried out using MEGA 6.0 (Tamura et into the dinoflagellate symbionts of reef-building corals. al. 2013) (selection of best substitution model) and RAxML PLoS One 7(4):e35269. (Stamatakis 2014) (version 8.0.26; generation of maximum Blus BJ, Wiggins K, Khorasanizadeh S. 2011. Epigenetic likelihood trees and bootstrap analysis). virtues of chromodomains. Crit Rev Biochem Mol Biol 46(6):507–526. quantification Buenrostro JD, Giresi PG, Zaba LC, Chang HY, Greenleaf WJ. 2013. Transposition of native chromatin for fast Sequencing reads were first mapped (as 2×50bp sequences) and sensitive epigenomic profiling of open chromatin, to the transcriptome assemblies using Bowtie (Langmead et DNA-binding proteins and nucleosome position. Nat al. 2009; version 1.0.1), with the following settings: -v 3 -a Methods 10(12):1213–1218.

27 Buratowski S. 2009. Progression through the RNA poly- 14(2):147–154. merase II CTD cycle. Mol Cell 36(4):541–546. Drinnenberg IA, deYoung D, Henikoff S, Malik HS. 2014. Burgess RJ, Zhang Z. 2013. Histone chaperones in nucleo- Recurrent loss of CenH3 is associated with independent some assembly and human disease. Nat Struct Mol Biol transitions to holocentricity in insects. Elife 3. doi: 20(1):14–22. 10.7554/eLife.03676. Carone BR, Hung JH, Hainer SJ, Chou MT, Carone DM, Drobic B, P´erez-Cadah´ıa B, Yu J, Kung SK, Davie JR. Weng Z, Fazzio TG, Rando OJ. 2014. High-resolution 2010. Promoter chromatin remodeling of immediate- mapping of chromatin packaging in mouse embryonic early genes is mediated through H3 phosphorylation at stem cells and sperm. Dev Cell 30(1):11–22. either serine 28 or 10 by the MSK1 multi-protein com- Chan YH, Kwok AC, Tsang JS, Wong JT. 2006. Alveolata plex. Nucleic Acids Res 38(10):3196–3208. histone-like proteins have different evolutionary origins. Eddy SR. 2011. Accelerated Profile HMM Searches. PLoS J Evol Biol 19(5):1717–1721. Comput Biol 7(10):e1002195. Chan YH, Wong JT. 2007. Concentration-dependent orga- Edgar RC. 2004. MUSCLE: multiple sequence alignment nization of DNA by the dinoflagellate histone-like pro- with high accuracy and high throughput. Nucleic Acids tein HCc3. Nucleic Acids Res 35(8):2573–2583. Res 32(5):1792–1797. Chudnovsky Y, Li JF, Rizzo PJ, Hastings JW, Fagan TF. Eick D, Geyer M. 2013. The RNA polymerase II carboxy- 2002. Cloning, expression, and charactertization of a terminal domain (CTD) code. Chem Rev 113(11):8456– histone-like protein from the marine dinoflagellate Lin- 8490. gulodinium polyedrum (). J Phycol 38:1–9. English CM, Adkins MW, Carson JJ, Churchill ME, Tyler Clapier CR, Cairns BR. 2009. The biology of chromatin JK. 2006. Structural basis for the histone chaperone remodeling complexes. Annu Rev Biochem 78:273–304. activity of Asf1. Cell 127(3):495–508. Creyghton MP, Cheng AW, Welstead GG, Kooistra T, Erkek S, Hisano M, Liang CY, Gill M, Murr R, Dieker Carey BW, Steine EJ, Hanna J, Lodato MA, Framp- J, Sch¨ubeler D, van der Vlag J, Stadler MB, Peters ton GM, Sharp PA, Boyer LA, Young RA, Jaenisch R. AH. 2013. Molecular determinants of nucleosome re- 2010. Histone H3K27ac separates active from poised tention at CpG-rich sequences in mouse spermatozoa. enhancers and predicts developmental state. Proc Natl Nat Struct Mol Biol 20(7):868–875. Acad Sci U S A 107(50):21931–21936. Feng Q, Wang H, Ng HH, Erdjument-Bromage H, Tempst Daujat S, Weiss T, Mohn F, Lange UC, Ziegler-Birling P, Struhl K, Zhang Y. 2002. Methylation of H3-lysine C, Zeissler U, Lappe M, Sch¨ubeler D, Torres-Padilla 79 is mediated by a new family of HMTases without a ME, Schneider R. 2009. H3K64 trimethylation marks SET domain. Curr Biol 12(12):1052–1058. heterochromatin and is dynamically remodeled during Feng S, Jacobsen SE. 2011. Epigenetic modifications in developmental reprogramming. Nat Struct Mol Biol plants: an evolutionary perspective. Curr Opin Plant 16(7):777–781. Biol 14(2):179–186. Desvoyes B, Fern´andez-Marcos M, Sequeira-Mendes J, Figueroa RI, Bravo I, Fraga S, Garc´esE, Llaveria G. 2009. Otero S, Vergara Z, Gutierrez C. 2014. Looking at plant The life history and of Kryptoperidinium foli- cell cycle from the chromatin window. Front Plant Sci aceum, a dinoflagellate with two eukaryotic nuclei. Pro- 5:369. tist 160(2):285–300. Dhar SS, Lee SH, Kan PY, Voigt P, Ma L, Shi X, Finn RD, Bateman A, Clements J, Coggill P, Eberhardt Reinberg D, Lee MG. 2012. Trans-tail regulation of RY, Eddy SR, Heger A, Hetherington K, Holm L, MLL4-catalyzed H3K4 methylation by H4R3 symmetric Mistry J, Sonnhammer EL, Tate J, Punta M. 2014. dimethylation is mediated by a tandem PHD of MLL4. Pfam: the protein families database. Nucleic Acids Res Genes Dev 26(24):2749–2762. 42(Database issue):D222–230. Dillon SC, Zhang X, Trievel RC, Cheng X. 2005. The SET- Fischle W, Tseng BS, Dormann HL, Ueberheide BM, Gar- domain protein superfamily: protein lysine methyltrans- cia BA, Shabanowitz J, Hunt DF, Funabiki H, Allis ferases. Genome Biol 6(8):227. CD. 2005. Regulation of HP1-chromatin binding by Dion MF, Altschuler SJ, Wu LF, Rando OJ. 2005. Genomic histone H3 methylation and phosphorylation. Nature characterization reveals a simple histone H4 acetylation 438(7071):1116–1122. code. Proc Natl Acad Sci U S A 102(15):5501–5506. Fischle W, Wang Y, Allis CD. 2003. Binary switches and Dodge JD. 1971. A dinoflagellate with both a mesokaryotic modification cassettes in histone biology and beyond. and a eukaryotic nucleus: Part 1 fine structure of the Nature 425(6957):475-479. nuclei. Protoplasma 73:145–157. Flueck C, Bartfai R, Volz J, Niederwieser I, Salcedo-Amaya Doenecke D. 2014. Chromatin dynamics from S-phase to AM, Alako BT, Ehlgen F, Ralph SA, Cowman AF, mitosis: contributions of histone modifications. Cell Bozdech Z, Stunnenberg HG, Voss TS. 2009. Plasmod- Tissue Res 356(3):467–475. ium falciparum heterochromatin protein 1 marks ge- Doyon Y, Cˆot´eJ. 2004. The highly conserved and multi- nomic loci linked to phenotypic variation of exported functional NuA4 HAT complex. Curr Opin Genet Dev virulence factors. PLoS Pathog 5(9):e1000569.

28 Fullwood MJ, Liu MH, Pan YF, Liu J, Xu H, Mohamed YB, baltica and Kryptoperidinium foliaceum. PLoS One Orlov YL, Velkov S, Ho A, Mei PH, Chew EG, Huang 5(5):e10711. PY, Welboren WJ, Han Y, Ooi HS, Ariyaratne PN, Vega Ismail IH, Hendzel MJ. 2008. The γ-H2A.X: is it just a sur- VB, Luo Y, Tan PY, Choy PY, Wansa KD, Zhao B, Lim rogate marker of double-strand breaks or much more? KS, Leow SC, Yow JS, Joseph R, Li H, Desai KV, Thom- Environ Mol Mutagen 49(1):73–82. sen JS, Lee YK, Karuturi RK, Herve T, Bourque G, Jenuwein T, Allis CD. 2001. Translating the histone code. Stunnenberg HG, Ruan X, Cacheux-Rataboul V, Sung Science 293(5532):1074–1080. Johansson C, Tumber A, WK, Liu ET, Wei CL, Cheung E, Ruan Y. 2009. An Che K, Cain P, Nowak R, Gileadi C, Oppermann U. oestrogen-receptor-alpha-bound human chromatin inter- 2014. The roles of Jumonji-type oxygenases in human actome. Nature 462(7269):58–64. disease. Epigenomics 6(1):89–120. G´omezF. 2005. A list of free-living dinoflagellate species Johnson DS, Mortazavi A, Myers RM, Wold B. 2007. in the world’s . Acta Bot Croat 64(1):129–212. Genome-wide mapping of in vivo protein-DNA interac- Gavelis GS, White RA, Suttle CA, Keeling PJ, Leander BS. tions. Science 316(5830):1497–1502. 2015. Single-cell transcriptomics using spliced leader PCR: Evidence for multiple losses of in Johnson MD, Tengs T, Oldach D, Stoecker DK. 2006. polykrikoid dinoflagellates. BMC Genomics 16:528. Sequestration, performance, and functional control of Gornik SG, Ford KL, Mulhern TD, Bacic A, McFadden GI, cryptophyte plastids in the ciliate Myrionecta rubra Waller RF. 2012. Loss of nucleosomal DNA condensa- (Ciliophora). J Phycol 42:1235–1246. tion coincides with appearance of a novel nuclear protein Kaplan CD, Laprade L, Winston F. 2003. Transcription in dinoflagellates. Curr Biol 22(24):2303–2312. elongation factors repress transcription initiation from Guccione E, Bassi C, Casadio F, Martinato F, Cesaroni M, cryptic sites. Science 301(5636):1096–1099. Schuchlautz H, L¨uscher B, Amati B. 2007. Methylation Karkhanis V, Hu YJ, Baiocchi RA, Imbalzano AN, Sif S. of histone H3R2 by PRMT6 and H3K4 by an MLL com- 2011. Versatility of PRMT5-induced methylation in plex are mutually exclusive. Nature 449(7164):933–937. growth control and development. Trends Biochem Sci Hake SB, Allis CD. 2006. Histone H3 variants and their 36(12):633–641. potential role in indexing mammalian genomes: the Keeling PJ, Burki F, Wilcox HM, Allam B, Allen EE, “H3 barcode hypothesis”. Proc Natl Acad Sci U S A Amaral-Zettler LA, Armbrust EV, Archibald JM, Bharti 103(17):6428–6435. AK, Bell CJ, Beszteri B, Bidle KD, Cameron CT, Camp- Hammoud SS, Nix DA, Zhang H, Purwar J, Carrell DT, bell L, Caron DA, Cattolico RA, Collier JL, Coyne K, Cairns BR. 2009. Distinctive chromatin in human Davy SK, Deschamps P, Dyhrman ST, Edvardsen B, sperm packages genes for embryo development. Nature Gates RD, Gobler CJ, Greenwood SJ, Guida SM, Ja- 460(7254):473–478. cobi JL, Jakobsen KS, James ER, Jenkins B, John U, Hendzel MJ, Wei Y, Mancini MA, Van Hooser A, Ranalli T, Johnson MD, Juhl AR, Kamp A, Katz LA, Kiene R, Brinkley BR, Bazett-Jones DP, Allis CD. 1997. Mitosis- Kudryavtsev A, Leander BS, Lin S, Lovejoy C, Lynn specific phosphorylation of histone H3 initiates primarily D, Marchetti A, McManus G, Nedelcu AM, Menden- within pericentromeric heterochromatin during G2 and Deuer S, Miceli C, Mock T, Montresor M, Moran MA, spreads in an ordered fashion coincident with mitotic Murray S, Nadathur G, Nagai S, Ngam PB, Palenik chromosome condensation. Chromosoma 106(6):348– B, Pawlowski J, Petroni G, Piganeau G, Posewitz MC, 360. Rengefors K, Romano G, Rumpho ME, Rynearson T, Herzog M, Soyer MO. 1981. Distinctive features of dinoflag- Schilling KB, Schroeder DC, Simpson AG, Slamovits ellate chromatin. Absence of nucleosomes in a prim- CH, Smith DR, Smith GJ, Smith SR, Sosik HM, Stief P, itive species Prorocentrum micans E. Eur J Cell Biol Theriot E, Twary SN, Umale PE, Vaulot D, Wawrik B, 23(2):295–302. Wheeler GL, Wilson WH, Xu Y, Zingone A, Worden AZ. Hesselberth JR, Chen X, Zhang Z, Sabo PJ, Sandstrom R, 2014. The Marine Microbial Eukaryote Transcriptome Reynolds AP, Thurman RE, Neph S, Kuehn MS, Noble Sequencing Project (MMETSP): illuminating the func- WS, Fields S, Stamatoyannopoulos JA. 2009. Global tional diversity of eukaryotic life in the oceans through mapping of protein-DNA interactions in vivo by digital transcriptome sequencing. PLoS Biol 12(6):e1001889. genomic footprinting. Nat Methods 6(4):283–289. Kirmizis A, Santos-Rosa H, Penkett CJ, Singer MA, Ver- Houben A, Demidov D, Caperta AD, Karimi R, Agueci meulen M, Mann M, Bhler J, Green RD, Kouzarides F, Vlasenko L. 2007. Phosphorylation of histone H3 in T. 2007. Arginine methylation at histone H3R2 plants–a dynamic affair. Biochim Biophys Acta 1769(5– controls deposition of H3K4 trimethylation. Nature 6):308–315. 449(7164):928–932. Huang H, Sabari BR, Garcia BA, Allis CD, Zhao Y. 2014. Kolasinska-Zwierz P, Down T, Latorre I, Liu T, Liu XS, SnapShot: histone modifications. Cell 159(2):458–458. Ahringer J. 2009. Differential chromatin marking of in- Imanian B, Pombert JF, Keeling PJ. 2010. The com- trons and expressed exons by H3K36me3. Nat Genet plete plastid genomes of the two ’dinotoms’ Durinskia 41(3):376–381.

29 Kouzarides T. 2007. Chromatin modifications and their visiae. Genetics 167(3):1123–1132. function. Cell 128(4):693–705. Marzluff WF, Wagner EJ, Duronio RJ. 2008. Metabolism Lachner M, O’Carroll D, Rea S, Mechtler K, Jenuwein T. and regulation of canonical histone mRNAs: life without 2001. Methylation of histone H3 lysine 9 creates a bind- a poly(A) tail. Nat Rev Genet 9(11):843–854. ing site for HP1 proteins. Nature 410(6824):116–120. Marzluff WF. 2005. Metazoan replication-dependent his- Langmead B, Trapnell C, Pop M, Salzberg SL. 2009. Ul- tone mRNAs: a distinct set of RNA polymerase II tran- trafast and memory-efficient alignment of short DNA se- scripts. Curr Opin Cell Biol 17(3):274–280. quences to the human genome. Genome Biol 10(3):R25. Metzger E, Imhof A, Patel D, Kahl P, Hoffmeyer K, Lieberman-Aiden E, van Berkum NL, Williams L, Imakaev Friedrichs N, M¨ullerJM, Greschik H, Kirfel J, Ji S, M, Ragoczy T, Telling A, Amit I, Lajoie BR, Sabo PJ, Kunowska N, Beisenherz-Huss C, G¨unther T, Buettner Dorschner MO, Sandstrom R, Bernstein B, Bender MA, R, Sch¨le R. 2010. Phosphorylation of histone H3T6 by Groudine M, Gnirke A, Stamatoyannopoulos J, Mirny PKCβI controls demethylation at histone H3K4. Nature LA, Lander ES, Dekker J. 2009. Comprehensive map- 464(7289):792–796. ping of long-range interactions reveals folding principles Metzger E, Yin N, Wissmann M, Kunowska N, Fischer K, of the human genome. Science 326(5950):289–293. Friedrichs N, Patnaik D, Higgins JM, Potier N, Scheidt- Lin S, Zhang H, Zhuang Y, Tran B, Gill J. 2010. Spliced mann KH, Buettner R, Sch¨uleR. 2008. Phosphorylation leader-based metatranscriptomic analyses lead to recog- of histone H3 at threonine 11 establishes a novel chro- nition of hidden genomic features in dinoflagellates. matin mark for transcriptional regulation. Nat Cell Biol Proc Natl Acad Sci U S A 107(46):20033–20038. 10(1):53-60. Lindemann E. 1924. Der Bau der H¨ullebei Heterocapsa Mosammaparast N, Ewart CS, Pemberton LF. 2002. A role und Kryptoperidinium foliaceum (Stein) n. nom. (Zu- for nucleosome assembly protein 1 in the nuclear trans- gleich eine vorl¨aufigeMitteilund). Botanisches Archiv port of histones H2A and H2B. EMBO J 21(23):6527– 5:114–120. 6538. Liu Y, Taverna SD, Muratore TL, Shabanowitz J, Hunt DF, Muratoglu S, Georgieva S, P´apai G, Scheer E, En¨unl¨u Allis CD. 2007. RNAi-dependent H3K27 methylation is I, Komonyi O, Cserp´anI, Lebedeva L, Nabirochkina required for heterochromatin formation and DNA elim- E, Udvardy A, Tora L, Boros I. 2003. Two different ination in Tetrahymena. Genes Dev 21(12):1530–1545. Drosophila ADA2 homologues are present in distinct Lowe CD, Martin LE, Roberts EC, Watts PC, Wootton GCN5 histone acetyltransferase-containing complexes. EC, Montagnes DJS. 2011. Collection, isolation and cul- Mol Cell Biol 23(1):306–321. turing strategies for . J Res Musselman CA, Lalonde ME, Cˆot´e J, Kutateladze TG. 33(4):569–578. 2012. Perceiving the epigenetic landscape through hi- Lowell JE, Cross GA. 2004. A variant histone H3 is en- stone readers. Nat Struct Mol Biol 19(12):1218–1227. riched at telomeres in Trypanosoma brucei. J Cell Sci Nagai S, Nitshitani G, Tomaru Y, Sakiyama S, Kamiyama 117(Pt 24):5937–5947. T. 2008 by the toxic dinoflagellate Dinophysis Lucas IAN, Vesk M. 1990 The fine structure of two pho- fortii on the ciliate Myrionecta rubra and observation of tosynthetic species of Dinophysis (Dinophysiales, Dino- sequestration of ciliate . J Phycol 44:909– phyceae). J Phycol 26:345–357. 922. Luger K, M¨aderAW, Richmond RK, Sargent DF, Rich- Nelson CJ, Santos-Rosa H, Kouzarides T. 2006. Proline mond TJ. 1997. Crystal structure of the nucleosome core isomerization of histone H3 regulates lysine methylation particle at 2.8 A resolution. Nature 389(6648):251–260. and gene expression. Cell 126(5):905–916. Lukes J, Leander BS, Keeling PJ. 2009. Cascades of conver- Nguyen AT, Zhang Y. 2011. The diverse functions of Dot1 gent evolution: the corresponding evolutionary histories and H3K79 methylation. Genes Dev 25(13):1345–1358. of euglenozoans and dinoflagellates. Proc Natl Acad Sci Orphanides G, LeRoy G, Chang CH, Luse DS, Reinberg D. USA 106 Suppl 1:9963–9970. 1998. FACT, a factor that facilitates transcript elonga- Manzanero S, Arana P, Puertas MJ, Houben A. 2000. The tion through nucleosomes. Cell 92(1):105–116. chromosomal distribution of phosphorylated histone H3 Orr RJ, Murray SA, St¨uken A, Rhodes L, Jakobsen KS. differs between plants and animals at . Chromo- 2014. When naked became armored: an eight-gene phy- soma 109(5):308–317. logeny reveals monophyletic origin of in dinoflag- Mariadason JM. 2008. HDACs and HDAC inhibitors in ellates. PLoS One 7(11):e50004. colon cancer. 3(1):28–37. Otero S, Desvoyes B, Gutierrez C. 2014. Histone H3 dy- Marmorstein R, Zhou MM. 2014. Writers and readers of hi- namics in plant cell cycle and development. Cytogenet stone acetylation: structure, mechanism, and inhibition. Genome Res 143(1-3):114–124. Cold Spring Harb Perspect Biol 6(7):a018762. Pavri R, Zhu B, Li G, Trojer P, Mandal S, Shilatifard A, Martin AM, Pouchnik DJ, Walker JL, Wyrick JJ. 2004. Re- Reinberg D. 2006. Histone H2B monoubiquitination dundant roles for histone H3 N-terminal lysine residues functions cooperatively with FACT to regulate elonga- in subtelomeric gene repression in Saccharomyces cere- tion by RNA polymerase II. Cell 125(4):703–717.

30 Polioudaki H, Markaki Y, Kourmouli N, Dialynas G, Chromosoma 100:510–518. Theodoropoulos PA, Singh PB, Georgatos SD. 2004. Sanchez R, Meslamani J, Zhou MM. 2014. The bromod- Mitotic phosphorylation of histone H3 at threonine 3. omain: from epigenome reader to druggable target. FEBS Lett 560(1–3):39–44. Biochim Biophys Acta 1839(8):676–685. Postberg J, Forcob S, Chang WJ, Lipps HJ. 2010. The Sawicka A, Hartl D, Goiser M, Pusch O, Stocsits RR, Tamir evolutionary history of histone H3 suggests a deep eu- IM, Mechtler K, Seiser C. 2014. H3S28 phosphorylation karyotic root of chromatin modifying mechanisms. BMC is a hallmark of the transcriptional response to cellular Evol Biol 10:259. stress. Genome Res 24(11):1808–1820. Preuss U, Landsberg G, Scheidtmann KH. 2003. Novel Sawicka A, Seiser C. 2012. Histone H3 phosphorylation: a mitosis-specific phosphorylation of histone H3 at Thr11 versatile chromatin modification for different occasions. mediated by Dlk/ZIP kinase. Nucleic Acids Res Biochimie 94:2193–2201. 31(3):878–885. Schenk R, Jenke A, Zilbauer M, Wirth S, Postberg J. 2011. Rada-Iglesias A, Bajpai R, Swigut T, Brugmann SA, Flynn H3.5 is a novel hominid-specific histone H3 variant that RA, Wysocka J. 2010. A unique chromatin signature is specifically expressed in the seminiferous tubules of uncovers early developmental enhancers in humans. Na- human testes. Chromosoma 120(3):275–285. ture 470(7333):279–283. Schnepf E, Elbr¨achter M. 1988 Cryptophycean-like dou- Rathke C, Baarends WM, Awe S, Renkawitz-Pohl R. 2014. ble membrane-bound in the dinoflagellate, Chromatin dynamics during spermiogenesis. Biochim Dinophysis Ehrenb.: evolutionary, phylogenetic and tox- Biophys Acta 1839(3):155–168. icological implications. Bot Acta 101:196–203. Ray-Gallet D, Quivy JP, Scamps C, Martini EM, Lipinski Schwartz S, Meshorer E, Ast G. 2009. Chromatin organiza- M, Almouzni G. 2002. HIRA is critical for a nucleosome tion marks exon-intron structure. Nat Struct Mol Biol assembly pathway independent of DNA synthesis. Mol 16(9):990–995. Cell 9(5):1091–1100. Reinberg D, Sims RJ 3rd. 2006. de FACTo nucleosome Schwope RM, Chalker DL. 2014. Mutations in Pdd1 reveal dynamics. J Biol Chem 281(33):23297–23301. distinct requirements for its chromodomain and chro- Rill RL, Livolant F, Aldrich HC, Davidson MW. 1989. Elec- moshadow domain in directing histone methylation and tron of liquid crystalline DNA: direct ev- heterochromatin elimination. Eukaryot Cell 13(2):190– idence for cholesteric-like organization of DNA in di- 201. noflagellate chromosomes. Chromosoma 98(4):280–286. Shilatifard A. 2006. Chromatin modifications by methyla- Rizzo PJ, Burghardt RC. 1982. Histone-like protein tion and ubiquitination: implications in the regulation and chromatin structure in the wall-less dinoflagellate of gene expression. Annu Rev Biochem 75:243–269. Gymnodinium nelsoni. Biosystems 15(1):27–34. Shilatifard A. 2012. The COMPASS family of histone H3K4 Rizzo PJ, Nooden LD. 1972. Chromosomal proteins in the methylases: mechanisms of regulation in development dinoflagellate alga Gyrodinium cohnii. Science 176:796– and disease pathogenesis. Annu Rev Biochem 81:65–95. 797. Shoguchi E, Shinzato C, Kawashima T, Gyoja F, Mung- Rizzo PJ. 2003. Those amazing dinoflagellate chromo- pakdee S, Koyanagi R, Takeuchi T, Hisata K, Tanaka somes. Cell Res 13:215–217. M, Fujiwara M, Hamada M, Seidi A, Fujie M, Usami T, Roberts A, Pachter L. 2013. Streaming fragment assign- Goto H, Yamasaki S, Arakaki N, Suzuki Y, Sugano S, ment for real-time analysis of sequencing experiments. Toyoda A, Kuroki Y, Fujiyama A, Medina M, Coffroth Nat Methods 10(1):71–73. MA, Bhattacharya D, Satoh N. 2013. Draft assembly Roy S, Morse D. 2012. A full suite of histone and his- of the Symbiodinium minutum nuclear genome reveals tone modifying genes are transcribed in the dinoflagel- dinoflagellate gene structure. Curr Biol 23(15):1399– late Lingulodinium. PLoS One 7(4):e34340. 1408. Rudolph T, Beuch S, Reuter G. 2013. Lysine-specific his- Siegel TN, Hekstra DR, Kemp LE, Figueiredo LM, Lowell tone demethylase LSD1 and the dynamic control of chro- JE, Fenyo D, Wang X, Dewell S, Cross GA. 2009. Four matin. Biol Chem 394(8):1019–1028. histone variants mark the boundaries of polycistronic Rufiange A, Jacques PE, Bhat W, Robert F, Nourani A. transcription units in Trypanosoma brucei. Genes Dev 2007. Genome-wide replication-independent histone H3 23(9):1063–1076. exchange occurs predominantly at promoters and impli- Simon JA, Kingston RE. 2013. Occupying chromatin: Poly- cates H3 K56 acetylation and Asf1. Mol Cell 27(3):393– comb mechanisms for getting to genomic targets, stop- 405. ping transcriptional traffic, and staying put. Mol Cell Sala-Rovira M, Geraud ML, Caput D, Jacques F, Soyer- 49(5):808–824. Gobillard M-O, Vernet G, Herzog M. 1991. Molecu- Soboleva TA, Nekrasov M, Pahwa A, Williams R, Hutt- lar cloning and immunolocalization of two variants of ley GA, Tremethick DJ. 2011. A unique H2A histone the major basic nuclear protein (HCc) from the histone- variant occupies the transcriptional start site of active less eukaryote Crypthecodinium cohnii (Pyrrhophyta). genes. Nat Struct Mol Biol 19(1):25–30.

31 Stamatakis A. 2014. RAxML version 8: a tool for phylo- trastructure. J Phycol 9:304–323. genetic analysis and post-analysis of large phylogenies. Tsutsui T, Fukasawa R, Shinmyouzu K, Nakagawa R, Tobe Bioinformatics 30(9):1312–1313. K, Tanaka A, Ohkuma Y. 2013. Mediator complex re- Stancheva I. 2005. Caught in conspiracy: cooperation be- cruits epigenetic regulators via its two cyclin-dependent tween DNA methylation and histone H3K9 methylation kinase subunits to repress transcription of immune re- in the establishment and maintenance of heterochro- sponse genes. J Biol Chem 288(29):20955–20965. matin. Biochem Cell Biol 83(3):385–395. van der Horst A, Vromans MJ, Bouwman K, van der Waal Stein F. 1883. Die naturgeschichte der arthrodelen Flagel- MS, Hadders MA, Lens SM. 2015. Inter-domain Coop- laten. In: Der Organismus der Infusionstiere III Pt. 2. eration in INCENP Promotes Aurora B Relocation from 1–30. Centromeres to . Cell Rep 12(3):380-387. Sullivan WJ Jr, Naguleswaran A, Angel SO. 2006. Histones Van Doninck K, Mandigo ML, Hur JH, Wang P, Guglielmini and histone modifications in protozoan parasites. Cell J, Milinkovitch MC, Lane WS, Meselson M. 2009. Phy- Microbiol 8(12):1850-1861. logenomics of unusual histone H2A Variants in Bdelloid Sun L, Wang M, Lv Z, Yang N, Liu Y, Bao S, Gong W, rotifers. PLoS Genet 5(3):e1000401. Xu RM. 2011. Structural insights into protein arginine Vaquero A. 2009. The conserved role of sirtuins in chro- symmetric dimethylation by PRMT5. Proc Natl Acad matin regulation. Int J Dev Biol 53(2–3):303–322. Sci U S A 108(51):20538–20543. Verreault A, Kaufman PD, Kobayashi R, Stillman B. 1996. Tachibana M, Sugimoto K, Fukushima T, Shinkai Y. 2001. Nucleosome assembly by a complex of CAF-1 and acety- Set domain-containing protein, G9a, is a novel lysine- lated histones H3/H4. Cell 87(1):95–104. preferring mammalian histone methyltransferase with Wagner EJ, Carpenter PB. 2012. Understanding the lan- hyperactivity and specific selectivity to lysines 9 and 27 guage of Lys36 methylation at histone H3. Nat Rev Mol of histone H3. J Biol Chem 276(27):25309–25317. Cell Biol 13(2):115–126. Takeuchi T, Watanabe Y, Takano-Shimizu T, Kondo S. Waller RF, Jackson CJ. 2009. Dinoflagellate mitochon- 2006. Roles of jumonji and jumonji family genes drial genomes: stretching the rules of molecular biology. in chromatin regulation and development. Dev Dyn Bioessays 31:237–245. 235(9):2449–2459. Wang F, Dai J, Daum JR, Niedzialkowska E, Banerjee B, Takishita K, Koike K, Maruyama T, Ogata T. 2002. Molec- Stukenberg PT, Gorbsky GJ, Higgins JM. 2010. Histone ular evidence for plastid robbery (Kleptoplastidy) in H3 Thr-3 phosphorylation by Haspin positions Aurora B Dinophysis, a dinoflagellate causing diarrhetic shellfish at centromeres in mitosis. Science 330(6001):231–235. poisoning. 153:293–302. Wang G, He Q, Liu F, Cheng Z, Talbert PB, Jin W. 2011. Talbert PB, Ahmad K, Almouzni G, Ausi´oJ, Berger F, Characterization of CENH3 proteins and centromere- Bhalla PL, Bonner WM, Cande WZ, Chadwick BP, associated DNA sequences in diploid and allotetraploid Chan SW, Cross GA, Cui L, Dimitrov SI, Doenecke Brassica species. Chromosoma 120(4):353-365. D, Eirin-Lpez JM, Gorovsky MA, Hake SB, Hamkalo Wang L, Tang Y, Cole PA, Marmorstein R. 2008. Struc- BA, Holec S, Jacobsen SE, Kamieniarz K, Khochbin S, ture and chemistry of the p300/CBP and Rtt109 histone Ladurner AG, Landsman D, Latham JA, Loppin B, Ma- acetyltransferases: implications for histone acetyltrans- lik HS, Marzluff WF, Pehrson JR, Postberg J, Schneider ferase evolution and function. Curr Opin Struct Biol R, Singh MB, Smith MM, Thompson E, Torres-Padilla 18(6):741–747. ME, Tremethick DJ, Turner BM, Waterborg JH, Woll- Wargo MJ, Rizzo PJ. 2000. Characterization of Gymno- mann H, Yelagandula R, Zhu B, Henikoff S. 2012. A dinium mikimotoi (Dinophyceae) nuclei and identifica- unified phylogeny-based nomenclature for histone vari- tion of the major histone-like protein, HGm. J Phycol ants. Epigenetics Chromatin 5:7. 36:584–589. Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. Waterborg JH. 2012. Evolution of histone H3: emergence 2013. MEGA6: Molecular Evolutionary Genetics Anal- of variants and conservation of post-translational modi- ysis Version 6.0. Mol Biol Evol 30(12):2725–2729. fication sites. Biochem Cell Biol 90(1):79–95. Tan M, Luo H, Lee S, Jin F, Yang JS, Montellier E, Bu- Waterhouse AM, Procter JB, Martin DM, Clamp M, Bar- chou T, Cheng Z, Rousseaux S, Rajagopal N, Lu Z, Ye ton GJ. 2009. Jalview Version 2–a multiple sequence Z, Zhu Q, Wysocka J, Ye Y, Khochbin S, Ren B, Zhao Y. alignment editor and analysis workbench. Bioinformat- 2011. Identification of 67 histone marks and histone ly- ics 25(9):1189–1191. sine crotonylation as a new type of histone modification. Wiedemann SM, Mildner SN, B´anisch C, Israel L, Maiser Cell 146(6):1016–1028. A, Matheisl S, Straub T, Merkl R, Leonhardt H, Krem- Thomas T, Voss AK. 2007. The diverse biological roles of mer E, Schermelleh L, Hake SB. 2010. Identification and MYST histone acetyltransferase family proteins. Cell characterization of two novel primate-specific histone H3 Cycle 6(6):696–704. variants, H3.X and H3.Y. J Cell Biol 190(5):777–791. Tomas RN, Cox ER. 1973. Observations on symbiosis of Wilkins BJ, Rall NA, Ostwal Y, Kruitwagen T, Hiragami- Balticum and its intracellular alga. 1. Ul- Hamada K, Winkler M, Barral Y, Fischle W, Neu-

32 mann H. 2014. A cascade of histone modifications II C-terminal domain. Proc Natl Acad Sci U S A induces chromatin condensation in mitosis. Science 111(16):5920–5925. 343(6166):77–80. Yang X, Yu W, Shi L, Sun L, Liang J, Yi X, Li Q, Zhang Wisecaver JH, Hackett JD. 2011. Dinoflagellate genome Y, Yang F, Han X, Zhang D, Yang J, Yao Z, Shang evolution. Annu Rev Microbiol 65:369–387. Y. 2011. HAT4, a Golgi apparatus-anchored B-type hi- Wong JT, New DC, Wong JC, Hung VK. 2003. Histone- stone acetyltransferase, acetylates free histone H4 and like proteins of the dinoflagellate Crypthecodinium cohnii facilitates chromatin assembly. Mol Cell 44(1):39–50. have homologies to bacterial DNA-binding proteins. Eu- Yuan CC, Matthews AG, Jin Y, Chen CF, Chapman BA, karyot Cell 2(3):646–650. Ohsumi TK, Glass KC, Kutateladze TG, Borowsky ML, Wysocka J, Allis CD, Coonrod S. 2006. Histone arginine Struhl K, Oettinger MA. 2012. Histone H3R2 sym- methylation and its dynamic regulation. Front Biosci metric dimethylation and histone H3K4 trimethylation 11:344–355. are tightly correlated in eukaryotic genomes. Cell Rep Xie L, Zeng J, Luo H, Pan W, Xie J. 2014. The roles of 1(2):83–90. bacterial GCN5-related N-acetyltransferases. Crit Rev Zhang S, Sui Z, Chang L, Kang K, Ma J, Kong F, Zhou W, Eukaryot Gene Expr 24(1):77–87. Wang J, Guo L, Geng H, Zhong J, Ma Q. 2014. Tran- Xie W, Song C, Young NL, Sperling AS, Xu F, Sridharan scriptome de novo assembly sequencing and analysis of R, Conway AE, Garcia BA, Plath K, Clark AT, Grun- the toxic dinoflagellate Alexandrium catenella using the stein M. 2009. Histone h3 lysine 56 acetylation is linked Illumina platform. Gene 537(2):285–293. to the core transcriptional network in human embryonic stem cells. Mol Cell 33(4):417–427. Zhang Z, Green BR, Cavalier-Smith T. 1999. Single gene Xu C, Min J. 2011. Structure and function of WD40 do- circles in dinoflagellate chloroplast genomes. Nature main proteins. Protein Cell 2(3):202–214. 400:155–159. Xu F, Zhang K, Grunstein M. 2005. Acetylation in histone Zheng B, Chen X. 2011. Dynamics of histone H3 lysine 27 H3 globular domain regulates gene expression in yeast. trimethylation in plant development. Curr Opin Plant Cell 121(3):375–385. Biol 14(2):123–129. Yamagishi Y, Honda T, Tanno Y, Watanabe Y. 2010. Two Zhu B, Zheng Y, Pham AD, Mandal SS, Erdjument- histone marks establish the inner centromere and chro- Bromage H, Tempst P, Reinberg D. 2005. Monoubiqui- mosome bi-orientation. Science 330(6001):239–243. tination of human histone H2B: the factors involved and Yang C, Stiller JW. 2014. Evolutionary diversity and their roles in HOX gene regulation. Mol Cell 20(4):601– taxon-specific modifications of the RNA polymerase 611.

33