
Histone H3 variants and their potential role in indexing mammalian genomes: The "H3 barcode hypothesis"

Sandra B. Hake and C. David Allis*

Laboratory of Chromatin Biology, The Rockefeller University, Box 78, 1230 York Avenue, New York, NY 10021

This contribution is part of the special series of Inaugural Articles by members of the National Academy of Sciences elected on May 3, 2005.

Contributed by C. David Allis, January 31, 2006

In the history of science, provocative but, at times, controversial ideas have been put forward to explain basic problems that confront and intrigue the scientific community. These hypotheses, although often not correct in every detail, lead to increased discussion that ultimately guides experimental tests of the principal concepts and produce valuable insights into long-standing questions. Here, we present a hypothesis, the "H3 barcode hypothesis." Hopefully, our ideas will evoke critical discussion and new experimental approaches that bear on general topics, such as nuclear architecture, epigenetic memory, and cell-fate choice. Our hypothesis rests on the central concept that mammalian histone H3 variants (H3.1, H3.2, and H3.3), although remarkably similar in amino acid sequence, exhibit distinct posttranslational "signatures" that create different chromosomal domains or territories, which, in turn, influence epigenetic states during cellular differentiation and development. Although we restrict our comments to H3 variants in mammals, we expect that the more general concepts presented here will apply to other histone variant families in organisms that employ them.

histone H3 variants H3.1, H3.2, H3.3 | "barcode hypothesis" | epigenetic memory | cell differentiation

Conflict of interest statement: No conflicts declared.
Abbreviations: LBR, lamin B receptor; PTM, posttranslational modification; RD, replication-dependent; RI, replication-independent.
See accompanying Profile on page 6425.
*To whom correspondence should be addressed. E-mail: [email protected].
© 2006 by The National Academy of Sciences of the USA
INAUGURAL ARTICLE | CELL BIOLOGY
6428–6435 | PNAS | April 25, 2006 | vol. 103 | no. 17 | www.pnas.org/cgi/doi/10.1073/pnas.0600803103

Chromatin and Its Role in Cellular Processes

Every eukaryotic cell contains genetic information in the form of DNA that is compacted to varying degrees in a confined nuclear space. However, DNA is packaged in such a way that enables its readout, replication, and repair in response to cellular needs and external stimuli. This condensation is achieved by an intimate interaction between DNA and histone proteins to form chromatin. The fundamental unit of chromatin is the nucleosome particle, consisting of core histone proteins (H2A, H2B, H3, and H4) around which the DNA is wrapped. Chromatin is often broadly divided into two cytologically distinct fractions: euchromatin, which is generally permissive for transcription, and heterochromatin, which is largely repressive. Two basic varieties of heterochromatin exist, constitutive and facultative; DNA within constitutive heterochromatin is obligately silenced; facultative heterochromatin is silenced only in certain contexts.

Relevant to our proposed "H3 barcode hypothesis" is the extent to which the chromatin fiber is constant or variable. Constancy is provided by the nearly universal nucleosomal packaging theme of histones and DNA in all eukaryotes. Variation is provided by subtle changes in this packaging theme that provide "instructions" as to how the DNA template is to be "read" when needed. Histone proteins are, for example, well known to be extensively modified by a vast array of covalent modifications on "external" (N- and C-terminal tails) as well as "internal" (histone-fold) domains, often leading to complex modification patterns that correlate closely with various states of gene expression or other DNA-templated processes. This staggering number of posttranslational modifications (PTMs) has prompted theories as to how these chemical marks might be translated into meaningful biological responses (1, 2). The "histone code" hypothesis states that a specific histone modification, or combinations thereof, can affect distinct downstream cellular events by altering the structure of chromatin (cis mechanisms) or by generating a binding platform for effector proteins (trans mechanisms). Such effectors specifically recognize particular PTM(s) and initiate events that ultimately lead to downstream events, such as gene activation or silencing. Tests of this hypothesis, as well as extensions of it (3), are gaining experimental support (e.g., refs. 4 and 5), although alternative views have been expressed (6, 7). Despite these uncertainties, emerging evidence underscores elaborate mechanisms for introducing variation, covalent and noncovalent, into the chromatin polymer (reviewed in ref. 8). The challenge remains as to how this variation is converted into meaningful biological readout.

Histone H3 Variants and Their Evolution

With the exception of H4, all core histone proteins have variant counterparts, which often differ in surprisingly few amino acids (reviewed in ref. 9). Histone genes encoding these variants can be classified into three main subtypes on the basis of their expression pattern and genomic organization (10, 11): replication-dependent (RD), replication- and cell cycle phase-independent (RI), and tissue-specific (TS). RI expression of histone genes reinforces the general view that histone proteins evolved to participate actively in DNA-templated processes rather than to serve simply a passive DNA-packaging role (see below). Nowhere is the concept of histone variants better illustrated than with the family of H3 histones.

Most eukaryotes express a centromere-specific H3 variant (Saccharomyces cerevisiae, Cse4; Drosophila, CID; and Homo sapiens, CENP-A) that is evolutionarily well conserved in its globular core region but not in its N-terminal tail (reviewed in ref. 12) and is essential for cell survival because of its fundamental role in centromeric function (13). Interestingly, during evolution, additional genes encoding H3 variants have emerged (Fig. 1A). For example, outside of the centromeric H3 variant, the unicellular yeast S. cerevisiae possesses only H3.3, an H3 variant that is expressed and incorporated into chromatin in an RI fashion and associated in higher eukaryotes with transcriptional activation (see below). Although budding yeast contains well defined "silent" chromatin, several hallmark features of constitutive heterochromatin in higher eukaryotes (e.g., H3 K9 and K27 methylation) have yet to be observed in S. cerevisiae (14). This observation correlates well with the presence of only an H3.3 variant and the streamlined gene-rich composition of the yeast genome. Interestingly, the fission yeast Schizosaccharomyces pombe contains one H3, with characteristics from both H3.3 and H3.2, a finding consistent with this yeast having constitutive heterochromatin more typical of higher organisms (15). Organisms such as plants, flies, frogs, and birds contain, in addition to H3.3, another H3 variant, H3.2, that differs in only four amino acids from H3.3 (Fig. 1B); H3.2 is expressed and incorporated into chromatin in an RD fashion. Only in mammals have two additional H3 variants evolved: H3.1 and a testis-specific H3.1 variant (H3.1t) (Fig. 1A). H3.1 differs from H3.2 in only one amino acid (amino acid 96: cysteine/serine, respectively) and is also expressed in RD fashion, whereas H3.1t is expressed only in testis and has four additional amino acid substitutions when compared with H3.1 (see Fig. 1B). Are these modest changes in primary sequence among H3 variants unimportant, a likely consequence of evolutionary "drift"? Alternatively, the small number of amino acid changes in these H3 variants may lead to unique protein structures and, in turn, to unique nucleosomal architecture and chromosomal domains that might govern H3 variant-specific biological functions (as is the case for centromere-associated H3s) (16). Future studies aimed at determining the x-ray structures of nucleosomes containing different histone variants may provide structural insights into their effects on stability and organization.

Fig. 1. H3 variants in different organisms. (A) Schematic of evolutionary appearance of histone H3 variants. All organisms express a centromere-specific H3 variant (CENP-A, filled blue circle). In addition to the centromeric H3 variant, the following H3 variants are expressed in these organisms: S. cerevisiae contains only H3.3 (blue gradient circle); S. pombe expresses a hybrid H3 protein that contains amino acids characteristic for H3.3 and H3.2; Arabidopsis thaliana, Xenopus laevis, and Drosophila melanogaster (for example) express H3.3 and H3.2 (blue circle with white dots); mammals such as Mus musculus and H. sapiens express H3.3, H3.2, H3.1 (white circle with blue dots), and a testis-specific H3.1t (white circle with blue stripes) variant of unknown function. H3.3 has been associated with euchromatin and transcriptional activation. H3.2 and H3.1 might localize to heterochromatin and are involved in transcriptional silencing. (B) Alignment of human noncentromeric histone H3 variants. Differences in amino acid sequence among human H3.3, H3.2, H3.1, and H3.1t are shown in white boxes. Cysteine residues are highlighted in red (Cys 96 in dark red and Cys 110 in pink). Identical amino acids are shown in gray. TS, tissue-specific. The region where most amino acid differences between the variants are found is underlined as a potential chaperone recognition domain (see text for details), and the chaperones binding to H3 variants are depicted below.

The literature on H3 variants does not contain a universal nomenclature for these variants, and, therefore, we propose to adopt the following convention: histone H3 protein containing S31, A87, I89, and G90 will be called H3.3; H3 with A31, S87, V89, M90, and S96 will be called H3.2; and H3.1 has the sequence of H3.2, with the exception of position 96, where it contains a cysteine. Amino acids 87–90 in H3.3 have been shown to be important for RI incorporation into chromatin (17), and these data suggest that this region might act as a "chaperone recognition domain" where HIRA binds to H3.3 and CAF-1 to H3.1 (see below and ref. 18). It is as yet unknown whether H3.2 binds to a different chaperone and whether amino acid position 96 plays any role in this potential chaperone recognition domain (Fig. 1B).

Elegant experiments have shown that H3.3 is associated with transcriptionally active gene loci and is enriched in covalent modifications associated with gene activation in flies, plants, and humans (17, 19–21). In contrast, in Drosophila and Arabidopsis, H3.2 has been shown to be enriched in marks that are associated with gene silencing (19, 20). These observations suggest that, during evolution, organisms draw not only on different profiles of physiologically relevant PTMs but also on selective employment (recruitment and replacement) of different histone H3 variants, a concept well articulated by Henikoff and colleagues (22). Because H3.1 and H3.2 differ by only a single amino acid, most studies tend to group these variants as one. However, recent results provide evidence that human H3.1, H3.2, and H3.3 differ in both their expression and PTM patterns as follows: H3.3 is enriched in PTMs associated with gene activation (hyperacetylation and dimethylation of K36 and K79), H3.2 is enriched in PTMs associated with gene silencing (K27 di- and trimethylation), and H3.1 is enriched in PTMs associated with gene activation (K14 acetylation) and gene silencing (K9 dimethylation), suggesting that these mammalian H3 variants may, indeed, have separate biological functions (23). These studies underscore a general conclusion: Remarkably similar histone proteins may vary considerably in their expression and PTM profiles. Determining how these differences translate into different biological functions and, notably, whether different functions, indeed, exist for the closely related H3.1 and H3.2 remains a challenge for future research.

The mechanism(s) by which histone variants and their PTMs are transmitted through the cell cycle also remains unsolved. Depending on the precise mechanism of nucleosome assembly at the time of DNA replication, histone variants may provide a bridge for the transmission of epigenetic information from one cell or one sexual generation to the next (18). If, for example, the incorporation of histone variants into replicating chromatin is nonrandom, we envision that the variants may provide potential "backup" for the more labile histone PTMs by playing a role in the establishment of "epigenetic memory." Central to this concept is the general view that H3 variants can impart structural differences to individual nucleosomes, nucleosomal arrays, or higher-order chromatin domains that contain them before PTMs are added (or removed) (24). Below, we present several ideas for how such differences might occur, even though only a small number of amino acid differences exist between H3 variants.

Cysteines of H3 Variants and Their Potential Role in Nuclear Architecture

Well established in the literature, but relatively underappreciated, is the fact that most members of the histone H3 family contain one or more cysteine(s) in their protein core and that this feature is a hallmark property of histone H3; all other histone proteins lack cysteine (Fig. 1B). Cysteine is one of the most rarely used amino acids in nature (1.9% occurrence in proteins) (25), suggesting that it plays a specialized role in the function of proteins that contain it. Equally well established is the fact that cysteines can form disulfide bonds under oxidative conditions and are involved in the homotypic or heterotypic dimerization and oligomerization of proteins. As shown in Fig. 1B, the histone H3 variants (except H3 in S. cerevisiae) contain one cysteine at position 110 that is located in their α2 helix, the region where both H3 proteins are closely apposed in the nucleosome core particle (26). The region immediately surrounding amino acid 110 is important to hold together two H2A–H2B–H3–H4 tetramers, because C110E mutations, for example, destabilized the H3–H3 hydrophobic four-helix bundle tetramer interface in vitro (27). It is not yet clear what role, if any, cysteine 110 plays in this process in vivo. We propose that Cys-110, common in essentially all H3 variants, forms an "intra"-disulfide bond with H3 in the same nucleosome under oxidative conditions, adding stability to the H3–H4 tetramer (Fig. 2A). In support, a crosslinked H3-H3 octamer can still form a nucleosome in vitro (28). No cysteine exists in S. cerevisiae H3, and giving yeast an artificial cysteine in place of its serine at 110 does not appear to have clear phenotypic consequences (29). We note that budding yeast lacks many of the better known heterochromatin marks and related machinery (i.e., K9 methylation in H3, HP1-like homologues, etc.). Thus, the tentative conclusion that cysteine utilization is unimportant based solely on experiments conducted in budding yeast may not be warranted. We look forward to the generation of H3 cysteine mutants in organisms that use more classical heterochromatin.

Fig. 2. Potential usage of H3 variant-specific cysteines 110 and H3.1-specific cysteine 96. (A) H3 cysteine 110 forms a potential intramolecular disulfide bond (light red box) with H3′ cysteine 110 in the same nucleosome (for details, see text). For simplicity, only the H3–H4 tetramer is shown as top view (Left). All mammalian H3 variants contain cysteine 110 and can potentially participate in disulfide bonding. (Right) H3–H4 dimers. (B) H3.1 cysteine 96 potentially forms intermolecular disulfide bonds (dark red box) with H3.1′ cysteines 96 in different nucleosomes, leading to chromatin condensation and heterochromatin generation (for details, see text). (C) H3.1 cysteine 96 is envisioned to potentially form disulfide bonds (dark red box) with cysteine in LBR on the nuclear envelope or with a cysteine in an as yet unknown protein (X?) in the nucleus (for details, see text). We speculate that chromatin containing H3.1 nucleosomes is preferentially located near the nuclear membrane and irreversibly rendered for transcription regardless of PTMs.

Although the extent to which the nucleus contains an oxidizing or reducing environment is not well established, redox-sensing mechanisms appear to play important roles in the nucleus. Certain transcription factors, for example, NF-κB, contain a cysteine that has been shown to participate in intermolecular disulfide formation (30) and must be in a reduced state in order for NF-κB to bind to DNA. Reduction is achieved by the action of molecules that are unique to the nucleus (31). In contrast, other transcription factors have an increased DNA-binding affinity under oxidative conditions (32), lending support to the general notion that physiologically relevant, redox-sensitive mechanisms may occur inside the nucleus.

It is intriguing to revisit earlier literature (33, 34) aimed at determining whether the cysteines in histone H3 variants "sense" changes in the redox state of the nucleus. If so, does the proximity of the two cysteines at the interface between homotypic H3 dimers within each nucleosome play a stabilizing role in the architecture of the chromatin polymer that, in turn, impacts the regulation of gene expression? Roughly 20 years ago, Allfrey and colleagues (35) hypothesized a meaningful difference between euchromatin and heterochromatin, as assayed by accessibility to sulfhydryl reagents, which can form disulfide bonds with exposed cysteines under oxidative conditions. Transcriptionally active regions were labeled preferentially with sulfhydryl-specific reagents, whereas nucleosomes in heterochromatin and nontranscribed regions were not. Moreover, these reagents preferentially bound to the cysteines in chromatin fractions enriched for hyperacetylated H3, suggesting that transcriptional activity "opens" the otherwise more tightly compacted chromatin, exposing the H3 cysteine so that it can be bound by sulfhydryl-reactive molecules (36). These observations correlate well with results showing that exposure of fibroblasts to mercury leads to the accumulation of this metal into euchromatin but not into heterochromatin (37, 38). Enrichment of "active," hyperacetylated chromatin, obtained by virtue of its ability to bind to mercury-containing columns, formed the basis of several intriguing experiments, including fractionation of yeast chromatin with an artificial cysteine at position 110 in place of its natural serine (29).

These data suggest that cysteine 110 in H3 is more accessible to sulfhydryl-reactive reagents in euchromatin and may be more buried in heterochromatin, providing a potential molecular marker, underscoring a physical change in the nature of higher-order chromatin structure that may reflect different physiological states. It remains unclear whether the inaccessibility of cysteine 110 in transcriptionally silent regions is an indirect consequence of chromatin compaction. Alternatively, a more direct effect may be due to disulfide bonding between the cysteines 110 of the two H3s in the same nucleosome that, in turn, compacts nucleosomal and higher-order structures (Fig. 2A). Determining the extent to which H3 "oxidation/reduction" occurs in vivo, if at all, remains a worthwhile challenge for future studies.

Interestingly, two mammalian histone H3 variants, H3.1 and H3.1t, contain an additional cysteine 96 in their protein-core region besides the more highly conserved cysteine 110 discussed above (Figs. 1B and 2A). Because cysteine 96 is likely located on the protein's surface (26), we speculate that it may play an unappreciated role in chromatin compaction and gene silencing by its ability to form disulfide bonds with other H3s in different nucleosomes or with other cysteine-containing proteins under oxidative conditions or in the presence of an as yet undetermined oxidizing molecule(s). Several scenarios for cysteine 96 can be envisioned:

(i) The H3.1-specific histone chaperone CAF-1 (18) may specifically recognize the region containing cysteine 96 in H3.1 as part of a chaperone-specific replacement mechanism that serves to direct H3.1 to target genomic loci (see below and Fig. 1B). In Drosophila, the region of H3.3 that differs most from that of H3.2 (amino acids 87, 89, and 90) is important for RI incorporation (17). These findings suggest that this region of H3 is important for the binding of specialized histone chaperones, and we speculate that cysteine 96 plays a role in this process, thereby distinguishing H3.1 from H3.2 and H3.3. To our knowledge, little is known as to how histone-specific chaperones (e.g., CAF-1 versus HIRA; see ref. 18) recognize the appropriate target histones, nor is it known whether H3.2 is escorted to chromatin by its own unique chaperone (see Fig. 1B).

(ii) Nucleosomes that contain H3.1 might bind to other H3.1-containing nucleosomes through internucleosomal disulfide bonds between cysteines 96. We envision that this event would serve to provide additional stability to higher-order nucleosomal contacts and may provide an explanation for H3.1-mediated condensation of heterochromatin (Fig. 2B). The concept of cysteine 96-mediated disulfide "bridging" suggests that H3.1 might play a unique role in the formation of constitutive heterochromatin that stably represses transcription through the generation of H3.1-containing oligonucleosomes. Here, we note that the formation of H3 dimers (H3.1, H3.2, and H3.3) and H3 oligomers (H3.1 only) occurs, at least in vitro, under oxidative conditions and is inhibited in a reducing environment (S.B.H. and C.D.A., unpublished observations). The in vivo significance of these findings, if any, remains unclear.

(iii) H3.1 might form disulfide bonds with other nuclear cysteine-containing nonhistone proteins (Fig. 2C). One attractive candidate is the nuclear membrane-associated protein lamin B receptor (LBR); other disulfide "partners" ("X") are also possible (Fig. 2C). LBR has been shown to bind distinct heterochromatin-enriched fractions (39). Moreover, Makatsori and coworkers (23) found that LBR-associated purified fractions contain histone H3 enriched in PTMs associated with transcriptional silencing similar to those that we have observed on H3.1. Additionally, LBR binds heterochromatin as a higher oligomer. Interestingly, another study reports the formation of a higher-order complex including H3, H4, LBR, and heterochromatin protein 1 (HP1) (40) that has been found to interact with histone H3 methylated at lysine 9 (41). We have shown that H3.1 is enriched in K9 dimethylation, suggesting that H3.1, but not H3.2 or H3.3, might be the H3 variant that selectively binds HP1 and interacts with LBR at the nuclear envelope. It remains to be determined whether LBR-bound heterochromatin contains only H3.1 and, if so, whether cysteine 96 is important for the establishment of nuclear membrane-associated heterochromatin.

In conclusion, we speculate that the unique cysteines in H3 variants might be important for nucleosomal and chromatin higher-order structures in ways that remain to be determined. In turn, we speculate that these structures determine, directly or indirectly, transcriptional regulatory states and distinct nuclear domains or compartments (see below).

Histone H3 Variants and Epigenetic Memory

During the development of multicellular organisms, cells differentiate by changing their gene expression profiles in response to stimuli or environmental cues. Long after these external stimuli are gone, "cellular memory" mechanisms enable cells to remember their chosen fate over many cell divisions (reviewed in ref. 42). Chromatin has long been suspected to play a major role in these mechanisms, but how an epigenetic memory, defined by networks of inherited sets of expressed and silenced genes, is faithfully transmitted to daughter cells during each S-phase remains unresolved. We favor the general view that histone variants, especially H3.1, H3.2, and H3.3, contribute not only to gene expression and silencing events but also to the maintenance of epigenetic inheritance. In this view, histone PTMs alone cannot explain the establishment of epigenetic memory over several cell divisions. We propose that histone H3 variants contribute to "indexing" the genome into functionally separate domains (e.g., euchromatin, facultative heterochromatin, and constitutive heterochromatin) that, in turn, establish and maintain epigenetic memory for each individual cell type. If correct, one requirement for H3 variants to play a major role in epigenetic inheritance is that nucleosomes contain "homo"-dimers of the same H3 variant, which are deposited by different chaperones (see Fig. 3). In support, Nakatani and colleagues (18) provided evidence that mammalian histone H3 variants H3.1 and H3.3 are incorporated into chromatin by separate chaperones (CAF-1 and HIRA, respectively). Once properly deposited into chromatin, H3 variants must be read by mechanisms that remain unclear but are likely to involve PTMs (see below).

Fig. 3. Epigenetic memory and H3 variants: graphic of different models of epigenetic inheritance (for details, see text). Nucleosomes contain two copies of H3.3 (blue gradient circle), H3.2 (blue circle with white dots), or H3.1 (white circle with blue dots), and H4 (yellow circle). N-terminal tails of H3 variants are posttranslationally modified: H3.3, active PTMs (green flag); H3.2, silencing PTMs (red flag); H3.1, silencing PTMs that differ from those observed on H3.2 (orange flag). Outside of S-phase, H3.3 can be deposited into chromatin in an RI manner [as either H3.3–H4 tetramers (Left) or H3.3–H4 dimers (Right)] to activate gene transcription immediately, as proposed by Henikoff and colleagues (44). The conservative inheritance model proposes that, during replication, H3–H4 tetramers are distributed on daughter strands in a random fashion. (Left) H3 variant-specific chaperones deposit H3–H4 tetramers onto daughter strands to fill in the gaps, distributing H3 variants by potentially sensing adjacent H3 variants on the same daughter strand. (Right) The semiconservative model of replication, as proposed by Tagami et al. (18), is shown. During replication, nucleosomes are separated into two H3–H4 dimers that are distributed equally onto daughter strands. H3 variant-specific chaperones deposit H3.3–H4 dimers (HIRA), H3.1–H4 dimers (CAF-1), and H3.2–H4 dimers (unknown, ?) to histone dimers on the daughter strands forming homogenic nucleosomes.

Different models have been proposed to explain how epigenetic memory can be achieved (reviewed in ref. 43). Henikoff and coworkers (44) recently proposed that histone states are not actively duplicated but are reestablished each cell cycle by active transcription and new deposition of histone variants, in particular H3.3 (Fig. 3, see nonreplicating DNA). Although transcription-coupled histone-variant deposition may function to establish and reestablish active euchromatin, it is unlikely to be the sole means of epigenetic inheritance, because it does not account for the inheritance of silenced chromatin. The timing of the replication of different chromatin states during S-phase might also play an important role in establishing epigenetic memory. It is well known that transcriptionally active chromatin is replicated in early S-phase, whereas heterochromatin is replicated in late S-phase (reviewed in ref. 45). It will also be interesting to determine whether facultative and constitutive heterochromatin replicate at different times during late S-phase, which might coincide with the expression of H3.2 and H3.1 and/or their specific chaperones, therefore providing one regulatory step in achieving epigenetic memory.

Much experimental evidence points toward another model of inheritance, the conservative model. This model suggests that intact parental nucleosomal cores are most likely dispersively segregated to daughter strands (46, 47) (Fig. 3 Left). In this model, the maintenance of epigenetic inheritance is difficult to envision but might be achieved by the replication timing differences between chromatin states or H3 variant-specific chaperones that sense adjacent H3–H4 tetramers on the same daughter strand. Another possibility is that the topology of DNA differs among different H3 variant–H4 tetramers, and, therefore, only specific H3 variant-containing nucleosomal cores are deposited onto the DNA during replication. Although not impossible, little, if any, evidence exists to support these scenarios.

In contrast, the semiconservative inheritance model proposes that nucleosomes are "split" into H3–H4 dimers that are distributed to each daughter strand (Fig. 3 Right). Although controversial (see ref. 43), these findings suggest that H3 and H4 may be deposited into chromatin as a dimeric unit rather than as a tetrameric unit, as has been proposed (48). The semiconservative model suggests that parental (H3–H4)₂ tetramers are dissociated into two H3–H4 dimers during DNA replication and segregated evenly onto daughter DNA strands (18). One consequence of this model is that each daughter strand obtains one posttranslationally modified parental H4–H3 variant dimer, thereby acquiring epigenetic memory. New H3.1–H4 dimers are deposited in chromatin through CAF-1 chaperone (RD-expressed), resulting in (H3.1–H4)₂ tetramers. In contrast, HIRA is believed to deposit newly synthesized H3.3–H4 dimers into chromatin, forming (H3.3–H4)₂ tetramers on the daughter strands.

We speculate that there may be an additional, as yet unidentified, H3.2-specific histone chaperone that deposits only H3.2–H4 dimers (or tetramers). As discussed above, whether serine 96 in H3.2 (as compared with cysteine 96 in H3.1; see Fig. 1B) is an important recognition site for this hypothetical H3.2 chaperone is not known. Regardless of which model is more correct, CAF-1 may recognize both H3.1 and H3.2. Here, the site-specific incorporation of each variant into chromatin would depend on the template variant in the parental H3–H4 dimer, the time when H3.1 and H3.2 are expressed during late S-phase, etc. "Daughter" mononucleosomes would then be completed with the addition of H2A–H2B dimers donated by other chaperones or exchange machinery (reviewed in refs. 49 and 50).
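The difference between the conservative and semiconservative models discussed in this section can be made concrete with a toy simulation. The Python sketch below is illustrative only and rests on loudly simplified assumptions that are not claims from the text: a chromatin fiber is reduced to a list of variant names (one per nucleosome), conservative segregation is modeled as an independent coin flip per tetramer, and the semiconservative case assumes that the newly deposited dimer always matches the parental dimer. The function names and the example fiber are hypothetical.

```python
import random


def conservative(parent, rng):
    """Conservative model (toy): each intact parental (H3-H4)2 tetramer is
    passed at random to one daughter strand; the other strand is left with a
    gap (None) that a chaperone would have to fill."""
    strand_a, strand_b = [], []
    for variant in parent:
        if rng.random() < 0.5:
            strand_a.append(variant)
            strand_b.append(None)
        else:
            strand_a.append(None)
            strand_b.append(variant)
    return strand_a, strand_b


def semiconservative(parent):
    """Semiconservative model (toy): each tetramer splits into two H3-H4
    dimers, one per daughter strand. Assumption: the new dimer deposited next
    to it is of the same variant as the parental (template) dimer."""
    def daughter():
        # Each nucleosome is a pair: (parental dimer, newly deposited dimer).
        return [(variant, variant) for variant in parent]
    return daughter(), daughter()


# Hypothetical parental fiber, one entry per nucleosome.
parent = ["H3.3", "H3.3", "H3.2", "H3.1", "H3.1"]

cons_a, cons_b = conservative(parent, random.Random(0))
semi_a, semi_b = semiconservative(parent)

print("conservative strand A:", cons_a)
print("conservative strand B:", cons_b)
print("semiconservative strand A:", semi_a)
```

In the conservative case every position is informative on only one daughter strand, so the variant at a gap has to be inferred from neighbors or from replication timing, as the text notes. In the semiconservative case every position on both strands retains a parental dimer that could act as a template.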

6432 | www.pnas.org/cgi/doi/10.1073/pnas.0600803103 Hake and Allis

Several chromatin-remodeling complexes, including the SWI/SNF, RSC, and ISWIb complexes, can catalyze the exchange of H2A–H2B dimers between chromatin fragments in an ATP-dependent reaction (51). On the other hand, the H2A.Z variant has recently been shown to be incorporated into chromatin by a specialized ATP-dependent nucleosome remodeling complex, the SWR1 complex, which consists of 13 subunits, including the Swi2/Snf2-related ATPase Swr1 (52). It remains to be seen whether other H2A variants, such as H2A.X, macroH2A, and H2A.Bbd, are also incorporated into chromatin by other specialized chaperones and whether all these H2A variants might then be pairing specifically with different H3 variants in one nucleosome. After completion of the newly assembled chromatin, we envision that appropriate histone-modifying enzymes will add or remove PTMs, maintaining a specific histone code for each particular chromosomal region.

We suggest, then, that H3.1, H3.2, and H3.3 have different biological functions, based on differences in cell and tissue-specific expression patterns and PTMs (23). We favor the general view that histone variants index select chromosomal regions by using selective chromatin-assembly mechanisms of the type described above, regardless of which model of inheritance is actually happening in the cell. Once in place, we envision that variant nucleosomes, marked by different PTMs, influence gene expression and nuclear architecture and, therefore, achieve persistent epigenetic memory over multiple cell generations.

Histone H3 Variants and Cell Lineage Restriction: The H3 Barcode Hypothesis

Adult mammals contain hundreds of cell types distributed among specialized tissues and organs, each with an identical DNA content. Yet, each of these cell types has a unique pattern of gene expression. In simple terms, genes behave in three ways during development: Some genes are subject to lineage-dependent activation events, such as PAX-5, PU-1, E2A, and EBF, leading to the generation of cell-type-specific precursors, in this case, B cell precursors, in the hematopoietic system (53), whereas others undergo lineage-dependent silencing events, such as X-chromosome inactivation and the silencing of embryonic genes such as Oct 4 (54). Lastly, the expression of housekeeping genes is maintained constitutively.

Stem cell and animal cloning (nuclear transfer) experiments hint that much of the molecular basis of tissue-specific gene expression and developmental potential is deeply rooted in the details of chromatin structure and epigenetic mechanisms (55). In addition, the intranuclear “architecture” of chromatin likely has a bearing on its regulation. Transcriptionally inactive genes, for example, reside in a position near the nuclear periphery (56) or interphase centromeres (57), whereas active genes are maintained near the center of nuclei. The nuclear location of genes may therefore affect their transcriptional status, and some evidence suggests that this is a dynamic process involved in cell differentiation (58).

The extent to which H3 variants factor into these events, if at all, is largely unexplored. We propose that histone H3 variants play a major role in cell differentiation and cell lineage restriction, and we put forward a speculative hypothesis, the H3 barcode hypothesis, to explain how this may occur. Our model suggests that mammals have evolved an additional way of regulating their genetic information over many cell generations. We propose that the mammalian genome is indexed by histone H3 variants (Fig. 4A) in a nonrandom fashion that reflects the assembly mechanisms and “personalized” chaperones and exchange factors described above (Fig. 3). We envision that H3.3 is incorporated into transcriptionally active regions, whereas, in contrast, H3.2 is deposited in transcriptionally silent areas that can be reversibly activated, depending on cellular needs (facultative heterochromatin). In our model, H3.1 might then be localized to genes that are constitutively silent or to genomic loci containing little or no apparent protein-coding information, whereas CENP-A is localized to highly specialized centromeres (Fig. 4A). If correct, we envision that this barcoding of genomic DNA with histone H3 variants would be subject to change during stem cell differentiation, when chromatin-remodeling events take place. We further speculate that these remodeling pathways impart a memory to cell lineage-dependent gene expression in light of the epigenetic inheritance models presented above.

In considering the H3 barcode hypothesis, we propose that patterning of histone PTMs would serve to regulate the immediate responses of genes to external stimuli and maintain networks of gene expression or silencing over short developmental time periods (Fig. 4B). Our hypothesis suggests that genes are switched “on” or “off” according to their PTM pattern during one cell cycle. For long-term memory (many cell generations) of a cell’s particular “epigenetic state,” however, we propose that the selective incorporation of histone H3 variants into various chromosomal domains plays a role in establishing gene-expression profiles exhibited by a particular cell type at a particular point in time. In support of this hypothesis, Loppin and coworkers (59) have recently suggested that H3.3 is incorporated by the HIRA chaperone into the chromatin of the male pronucleus in Drosophila, thereby replacing protamines and leading to sperm nucleus decondensation. In the mouse, H3.1 is absent in the male pronucleus, which is largely decondensed (60), a finding also consistent with our hypothesis. In addition, a recent study from Felsenfeld and colleagues showed that, in chicken erythroid cells, exogenous H3.3 expression resulted in increased expression of folate receptor and VEGF-D genes, whereas H3 (H3.2) caused decreased expression of these genes, therefore implying a difference in function for H3 (H3.2) and H3.3. All of the above studies support the general notion that H3.3 is associated with decondensed open chromatin, whereas mammalian H3.1 and chicken H3.2 mark heterochromatin that is in a “closed” state. An important feature of our hypothesis is that chromatin structures change during cell differentiation through the selective incorporation of different histone H3 variants.

By combining all of the above ideas and models, we propose that it should be possible to distinguish cell types by the genomic localization of H3.1, H3.2, and H3.3, producing a pattern or barcode of staining along chromosomal regions much like characteristic band/interband regions of Drosophila polytene chromosomes (Fig. 4A). In this speculative model, chromosomes from cell type A contain H3 variants in different genomic loci than chromosomes from cell type B, because different sets of genes are activated and/or silenced by selective deposition or exchange of appropriate H3 variants. Additionally, we propose that each chromosome in any given cell type should have a different distribution of the H3 variants along its chromosome arms, outside of more constant chromosomal landmarks such as centromeres and telomeres that are also likely marked with their own H3 variant signatures (e.g., CENP-A at centromeric regions). One test of the H3 barcode hypothesis would be to display the different H3 variants with differentially marked or colored tags, asking whether a barcode pattern is revealed that differs from chromosome to chromosome and cell type to cell type. Ultimately, chromatin immunoprecipitation (ChIP) assays, combined with whole-genome microarray and tiling analyses (ChIP by chip; for one example, see ref. 62) will provide a powerful test of these ideas, when the appropriate immunological reagents become available for these highly conserved proteins. As mentioned above, histone genetics in mammalian models presents a challenge for those histone genes that are present in high copy numbers, such as H3.1 and H3.2. However, because H3.3 is encoded by only two genes in mouse and human (H3.3A and

Hake and Allis PNAS | April 25, 2006 | vol. 103 | no. 17 | 6433

Fig. 4. The H3 barcode model to index genomic information and ensure epigenetic memory. (A) Theoretical visualization of H3 variants in two chromosomes (1, 2) of A and B cell types shows different banding patterns (white with blue dots, H3.1; blue with white dots, H3.2; blue, H3.3). This H3 variant barcode differs from chromosome to chromosome and cell type to cell type. In this model, H3.1 localizes to constitutive heterochromatin, H3.2 to facultative heterochromatin, and H3.3 to euchromatin. (B) Graphical combination of the three biological codes: the genetic code, the H3 barcode, and the histone code. DNA contains genetic information in the form of genes (white boxes) that have to be activated or silenced at appropriate times and noncoding regions, such as centromeres, telomeres, and satellites (dotted line). Actively transcribed genes contain H3.3 (blue gradient circle) in their chromatin, whereas silenced genes have H3.2 (blue circle with white dots) incorporated. A majority of DNA does not contain any meaningful genetic information, and some genes are constitutively silent. These genomic regions are indexed by the presence of H3.1 (white circles with blue dots) in the chromatin. The next regulatory step to ensure proper gene expression is the regulation of genes with posttranslational histone modifications (green flag, activation PTMs; red and orange flags, different silencing PTMs). We propose that short-term alterations in gene expression are achieved by the employment of specialized PTMs (e.g., acetylation), but long-term establishment (epigenetic memory) of gene expression involves more stable histone modifications as well as the incorporation of the appropriate histone H3 variants.

H3.3B, each encoding identical H3.3 proteins), histone genetics with this H3 variant may be possible.

In conclusion, we speculate that at least three different biological codes, the genetic code, a PTM histone code, and an H3 barcode (and potentially other histone variant barcodes), may act together to ensure proper gene activation and silencing (Fig. 4B). We favor the view that, at least in mammalian cells, histone H3 variants index the genome as follows: H3.3 is largely, if not exclusively, associated with euchromatin, H3.2 with facultative heterochromatin, and H3.1 with constitutive heterochromatin (although we note that there might be exceptions to this rule). We propose that this barcoding of genomic regions with appropriate H3 variants ensures long-term cellular memory of the transcriptional status of genes that, in addition, can be inherited over many cell generations. On the other hand, we propose that PTMs play an important role in the maintenance of these transcriptional states and are also involved in the regulation of short-term gene expression. In this view, certain PTMs enable a cell to immediately turn specific genes on or off after the cell receives appropriate stimuli, and this switch in gene expression is formally accomplished without the exchange of one H3 variant with another. Outside of gene regulation, PTMs are likely to contribute to at least two other biological processes: histone deposition, through deposition-related PTMs (e.g., acetylation of K5/K8/K12 on H4) (63), and, possibly, the active exchange of histone variants, a mechanism about which not much is known. Taken together, we envision that the selective employment of histone H3 variants, together with their PTM signatures, regulates gene expression by barcoding the genome according to specific functions: H3.3, euchromatin; H3.2, facultative heterochromatin; H3.1, constitutive heterochromatin. However, many questions remain to be answered.

One specific question is how the H3 barcode and the histone code are connected or how different H3 variants become associated with distinct PTMs in the first place. One possibility is that the distinct H3 variants, through their ability to differentially regulate nucleosome stability, control the precise higher-order folding of chromatin that then makes these fibers suitable substrates for the appropriate modifying enzymes. For example,

H3.3-containing nucleosomes may be less stable, thereby keeping chromatin fibers in a somewhat unfolded but precise state. These more open fibers may be the preferred substrates for activating enzymes (such as MLL/Set1, the H3 K4 HMTase). In contrast, H3.1 and H3.2 may generate more stable nucleosomes (in particular H3.1 through disulfide bonds with its cysteine 96) that lead to more compacted or folded chromatin fibers that are the preferred substrates for repressing enzymes (such as Ezh2, the H3 K27 HMTase). Thus, the precise chromatin structure (or fibers) these variants create and also their localization in the nuclear architecture may be, in part, the reason why they are modified in different ways with PTMs. Consistent with the H3 barcode hypothesis, the first layer of chromatin organization (and epigenetic memory) would be dictated by the particular histone variant, whereas the potential actions of a specific modifying enzyme(s) depend, in part, on the unique structure of the chromatin fiber that the variant generates. In addition, DNA-binding transcriptional activators or repressors that recognize unique chromatin structures might recruit the appropriate enzyme(s) and, thereby, prevent inappropriate marks and create the final biological effect. In support, a subpopulation of H3.3 is phosphorylated during mitosis at its unique S31 (64). Also, nucleosomes containing H2A.Z are poor substrates for certain histone-modifying enzymes (65).

Finally, for the H3 barcode to be functional, it must have a cellular reader that interprets or scans the proposed patterns of H3 variant stripes in their entirety (66). Although such a reader(s) has yet to be identified, we suspect that PTMs, carried by the H3 variants, will hold some clues, if indeed such readers exist. We look forward to experimental tests of this hypothesis and extensions of it in the years to come.

We thank all members of the Allis laboratory for insightful discussions. We especially thank E. Bernstein, A. Goldberg, C. Janzen, T. Milne, and J. Wysocka for critical review of the manuscript. Valuable input was also provided by A. Annunziato, B. Strahl, and M. Smith before the submission of this article. This work was supported by National Institutes of Health MERIT Award GM 53512 (to C.D.A.) and The Rockefeller University’s Women and Science Fellowship Program (S.B.H.).

1. Strahl, B. D. & Allis, C. D. (2000) Nature 403, 41–45.
2. Fischle, W., Wang, Y. & Allis, C. D. (2003) Curr. Opin. Cell Biol. 15, 172–183.
3. Fischle, W., Wang, Y. & Allis, C. D. (2003) Nature 425, 475–479.
4. Fischle, W., Tseng, B. S., Dormann, H. L., Ueberheide, B. M., Garcia, B. A., Shabanowitz, J., Hunt, D. F., Funabiki, H. & Allis, C. D. (2005) Nature 438, 1116–1122.
5. Hirota, T., Lipp, J. J., Toh, B. H. & Peters, J. M. (2005) Nature 438, 1176–1180.
6. Schreiber, S. L. & Bernstein, B. E. (2002) Cell 111, 771–778.
7. Kurdistani, S. K. & Grunstein, M. (2003) Nat. Rev. Mol. Cell Biol. 4, 276–284.
8. Iizuka, M. & Smith, M. M. (2003) Curr. Opin. Genet. Dev. 13, 154–160.
9. Pusarla, R. H. & Bhargava, P. (2005) FEBS J. 272, 5149–5168.
10. Isenberg, I. (1979) Annu. Rev. Biochem. 48, 159–191.
11. Doenecke, D., Albig, W., Bode, C., Drabent, B., Franke, K., Gavenis, K. & Witt, O. (1997) Histochem. Cell Biol. 107, 1–10.
12. Smith, M. M. (2002) Curr. Opin. Cell Biol. 14, 279–285.
13. Sullivan, K. F., Hechenberger, M. & Masri, K. (1994) J. Cell Biol. 127, 581–592.
14. Avramova, Z. V. (2002) Plant Physiol. 129, 40–49.
15. Pidoux, A., Mellone, B. & Allshire, R. (2004) Methods 33, 252–259.
16. Black, B. E., Foltz, D. R., Chakravarthy, S., Luger, K., Woods, V. L., Jr., & Cleveland, D. W. (2004) Nature 430, 578–582.
17. Ahmad, K. & Henikoff, S. (2002) Mol. Cell 9, 1191–1200.
18. Tagami, H., Ray-Gallet, D., Almouzni, G. & Nakatani, Y. (2004) Cell 116, 51–61.
19. McKittrick, E., Gafken, P. R., Ahmad, K. & Henikoff, S. (2004) Proc. Natl. Acad. Sci. USA 101, 1525–1530.
20. Johnson, L., Mollah, S., Garcia, B. A., Muratore, T. L., Shabanowitz, J., Hunt, D. F. & Jacobsen, S. E. (2004) Nucleic Acids Res. 32, 6511–6518.
21. Chow, C. M., Georgiou, A., Szutorisz, H., Maia, E. S. A., Pombo, A., Barahona, I., Dargelos, E., Canzonetta, C. & Dillon, N. (2005) EMBO Rep. 6, 354–360.
22. Ahmad, K. & Henikoff, S. (2002) Proc. Natl. Acad. Sci. USA 99, 16477–16484.
23. Hake, S. B., Garcia, B. A., Duncan, E. M., Kauer, M., Dellaire, G., Shabanowitz, J., Bazett-Jones, D. P., Allis, C. D. & Hunt, D. F. (2006) J. Biol. Chem. 281, 559–568.
24. Ramaswamy, A., Bahar, I. & Ioshikhes, I. (2005) Proteins 58, 683–696.
25. Doolittle, R. F. (1989) Trends Biochem. Sci. 14, 244–245.
26. Luger, K., Mader, A. W., Richmond, R. K., Sargent, D. F. & Richmond, T. J. (1997) Nature 389, 251–260.
27. Banks, D. D. & Gloss, L. M. (2004) Protein Sci. 13, 1304–1316.
28. Camerini-Otero, R. D. & Felsenfeld, G. (1977) Proc. Natl. Acad. Sci. USA 74, 5519–5523.
29. Chen, T. A., Smith, M. M., Le, S. Y., Sternglanz, R. & Allfrey, V. G. (1991) J. Biol. Chem. 266, 6489–6498.
30. Matthews, J. R., Wakasugi, N., Virelizier, J. L., Yodoi, J. & Hay, R. T. (1992) Nucleic Acids Res. 20, 3821–3830.
31. Mitomo, K., Nakayama, K., Fujimoto, K., Sun, X., Seki, S. & Yamamoto, K. (1994) Gene 145, 197–203.
32. Galang, C. K. & Hauser, C. A. (1993) Mol. Cell. Biol. 13, 4609–4617.
33. Marsh, W. H., Ord, M. G. & Stocken, L. A. (1964) Biochem. J. 93, 539–544.
34. Ord, M. G. & Stocken, L. A. (1966) Biochem. J. 98, 888–897.
35. Prior, C. P., Cantor, C. R., Johnson, E. M., Littau, V. C. & Allfrey, V. G. (1983) Cell 34, 1033–1042.
36. Johnson, E. M., Sterner, R. & Allfrey, V. G. (1987) J. Biol. Chem. 262, 6943–6946.
37. Bryan, S. E., Lambert, C., Hardy, K. J. & Simons, S. (1974) Science 186, 832–833.
38. Rozalski, M. & Wierzbicki, R. (1983) Biochem. Pharmacol. 32, 2124–2126.
39. Makatsori, D., Kourmouli, N., Polioudaki, H., Shultz, L. D., McLean, K., Theodoropoulos, P. A., Singh, P. B. & Georgatos, S. D. (2004) J. Biol. Chem. 279, 25567–25573.
40. Polioudaki, H., Kourmouli, N., Drosou, V., Bakou, A., Theodoropoulos, P. A., Singh, P. B., Giannakouros, T. & Georgatos, S. D. (2001) EMBO Rep. 2, 920–925.
41. Lachner, M., O’Carroll, D., Rea, S., Mechtler, K. & Jenuwein, T. (2001) Nature 410, 116–120.
42. Ringrose, L. & Paro, R. (2004) Annu. Rev. Genet. 38, 413–443.
43. Annunziato, A. T. (2005) J. Biol. Chem. 280, 12065–12068.
44. Henikoff, S., Furuyama, T. & Ahmad, K. (2004) Trends Genet. 20, 320–326.
45. McNairn, A. J. & Gilbert, D. M. (2003) BioEssays 25, 647–656.
46. Leffak, M. (1988) Biochemistry 27, 686–691.
47. Annunziato, A. T. & Seale, R. L. (1984) Nucleic Acids Res. 12, 6179–6196.
48. Baxevanis, A. D., Godfrey, J. E. & Moudrianakis, E. N. (1991) Biochemistry 30, 8817–8823.
49. Loyola, A. & Almouzni, G. (2004) Biochim. Biophys. Acta 1677, 3–11.
50. Adams, C. R. & Kamakaka, R. T. (1999) Curr. Opin. Genet. Dev. 9, 185–190.
51. Bruno, M., Flaus, A., Stockdale, C., Rencurel, C., Ferreira, H. & Owen-Hughes, T. (2003) Mol. Cell 12, 1599–1606.
52. Mizuguchi, G., Shen, X., Landry, J., Wu, W. H., Sen, S. & Wu, C. (2004) Science 303, 343–348.
53. Singh, H., Medina, K. L. & Pongubala, J. M. (2005) Proc. Natl. Acad. Sci. USA 102, 4949–4953.
54. Workman, J. L. & Kingston, R. E. (1998) Annu. Rev. Biochem. 67, 545–579.
55. Ng, R. K. & Gurdon, J. B. (2005) Cell Cycle 4, 760–763.
56. Andrulis, E. D., Neiman, A. M., Zappulla, D. C. & Sternglanz, R. (1998) Nature 394, 592–595.
57. Maison, C., Bailly, D., Peters, A. H., Quivy, J. P., Roche, D., Taddei, A., Lachner, M., Jenuwein, T. & Almouzni, G. (2002) Nat. Genet. 30, 329–334.
58. Rasmussen, T. P. (2003) Reprod. Biol. Endocrinol. 1, 100.
59. Loppin, B., Bonnefoy, E., Anselme, C., Laurencon, A., Karr, T. L. & Couble, P. (2005) Nature 437, 1386–1390.
60. van der Heijden, G. W., Dieker, J. W., Derijck, A. A., Muller, S., Berden, J. H., Braat, D. D., van der Vlag, J. & de Boer, P. (2005) Mech. Dev. 122, 1008–1022.
61. Jin, C. & Felsenfeld, G. (2006) Proc. Natl. Acad. Sci. USA 103, 574–579.
62. Mito, Y., Henikoff, J. G. & Henikoff, S. (2005) Nat. Genet. 37, 1090–1097.
63. Ma, X. J., Wu, J., Altheim, B. A., Schultz, M. C. & Grunstein, M. (1998) Proc. Natl. Acad. Sci. USA 95, 6693–6698.
64. Hake, S. B., Garcia, B. A., Kauer, M., Baker, S. P., Shabanowitz, J., Hunt, D. F. & Allis, C. D. (2005) Proc. Natl. Acad. Sci. USA 102, 6344–6349.
65. Li, B., Pattenden, S. G., Lee, D., Gutierrez, J., Chen, J., Seidel, C., Gerton, J. & Workman, J. L. (2005) Proc. Natl. Acad. Sci. USA 102, 18385–18390.
66. Henikoff, S. (2005) Proc. Natl. Acad. Sci. USA 102, 5308–5309.
