
technology feature

Sorting out chromatin states
Consistent measuring
Drilling down to function
Table 1: The many states of chromatin (so far)

Making sense of chromatin states

Monya Baker

Researchers find new pieces in the puzzle of regulation.

Inside the cell, DNA is never without an entourage of proteins. Stretches of ~150 base pairs are wrapped around octets of histone proteins to form nucleosomes. These and other DNA-associated proteins make up chromatin, a structure that may be the most complex molecular assembly in the cell[1]. Once considered a straightforward packaging system for unused DNA, chromatin is becoming recognized as a dynamic genome organizer, a scaffold that directs DNA activity.

Shortly after the turn of the century, researchers began cataloging chromatin proteins and their modifications. Now, they are applying computational analysis to these genome-wide studies in an effort to segregate chromatin’s complexity into discrete numbers of chromatin states. These exercises have already revealed regulatory elements across the genome. In time, chromatin-state mapping promises to reveal many secrets of genome function, how cells inherit acquired states, how chromatin directs functions such as […] and RNA processing, and, crucially, how chromatin biology contributes to disease.

From genome-wide lists to states

Scientists have long known that nuclear DNA exists in different conformations. As far back as the days before television, microscopes revealed different types of DNA in the nucleus: the densely staining heterochromatin, also called ‘closed’ chromatin, in which […] are packed securely away from transcription machinery, and the lighter-staining euchromatin, or ‘open’ chromatin, in which accessible DNA allows for active transcription. A closer look at chromatin has revealed considerably more complexity.

Even without considering the DNA sequence, a single nucleosome can exist in trillions of trillions of potential variations. The four basic types of histone proteins can be exchanged with variants and chemically modified in a bewildering variety of ways. Amino acids on histones’ tail-like extensions can be singly methylated, dimethylated, trimethylated, acetylated, phosphorylated, ubiquitinated or otherwise modified. More than a hundred chromatin modifications, or ‘marks’, have already been identified. Together, these epigenetic modifications create patterns that correlate with various functional elements in the genome.
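The scale of ‘trillions of trillions’ is easier to picture with a little arithmetic. The sketch below is only an order-of-magnitude illustration and not the calculation behind the figure quoted above: it assumes, purely for simplicity, that each mark is an independent on/off switch, and the function names are hypothetical.

```python
def presence_absence_patterns(n_marks: int) -> int:
    """Count on/off patterns for n_marks marks treated as independent switches."""
    return 2 ** n_marks


def marks_needed_for(target: int) -> int:
    """Smallest n such that 2**n >= target, using integer arithmetic only."""
    return (target - 1).bit_length()


TRILLION = 10 ** 12
TRILLION_TRILLIONS = TRILLION * TRILLION  # 10**24

# How many independent on/off marks would be enough to exceed 10**24 patterns?
n = marks_needed_for(TRILLION_TRILLIONS)
print(n, presence_absence_patterns(n) >= TRILLION_TRILLIONS)
```

Under these simplified assumptions, a number of on/off marks smaller than the ‘more than a hundred’ already catalogued would suffice. Real nucleosomes are not this simple: histone variants and multi-level modifications such as mono-, di- and trimethylation add further possibilities, so the sketch conveys only why the numbers grow so fast.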

© 2011 Nature America, Inc. All rights reserved.

Table 1 | The many states of chromatin (so far)

Cell source | Marks and proteins surveyed | States or groups identified | Analysis | Reference
Arabidopsis thaliana seedlings | 11 histone marks plus DNA methylation | 4 major states | Heat map and hierarchical clustering | 7
Caenorhabditis elegans embryos and larvae | 28 histone modifications, variants and proteins | 5 groups | Hierarchical clustering | 13
C. elegans at all stages of development | 33 genome-wide maps, mostly histone marks | 3 combinations | Integrative analysis, using chromatin marks to predict function | 14
C. elegans cells, isolated tissues and whole organisms at several developmental stages | 700 datasets that profile transcripts, histone modifications and nucleosomes | 9 or 30 states, depending on model, used to find cell- and tissue-specific regulators and predict expression | Integrative analysis | 15
D. melanogaster at several developmental stages | 22 histone modifications and chromosomal proteins | 15 clusters grouped into 5 states | Cluster and principal component analysis | 16
D. melanogaster cell lines and organisms at several developmental stages | 18 histone marks | 9 or 30 states, depending on the algorithm | Machine learning; combinatorial model considering probability of presence of certain marks | 6
D. melanogaster cell lines | 53 proteins | 5 states | Integrative analysis of genome-wide binding maps | 8
Homo sapiens lymphocytes | 38 histone marks | 51 states, grouped into 5 classes | Multivariate hidden Markov model | 9
Nine H. sapiens cell types | 9 histone marks | 15 states used to identify regulatory elements in the genome | Multivariate hidden Markov model | 10

nature methods | VOL.8 NO.9 | SEPTEMBER 2011 | 717 technology feature

Soon after the discovery of histone-modifying enzymes in the mid-1990s, researchers began probing where on the genome modifications occurred. In these attempts they mainly looked for regions containing a particular mark or perhaps a combination of two or three marks. In one study, for example, researchers identified over 50,000 potential human enhancers by mapping where different trios of marks occurred[2]. More recently, researchers have begun to take another approach: identifying dozens of marks across the genome, computationally finding recurring combinations and grouping these combinations into states.

Figure caption (labels: Nucleus, Chromatin fiber, Nucleosome, DNA; credit: Broad Communications): DNA is wrapped around complexes called nucleosomes. Proteins comprising nucleosomes contain hundreds of different modifications, which together serve to regulate […].

Photo caption: Understanding chromatin states will be harder than finding them, says Gary Karpen at Lawrence Berkeley National Labs. “You can identify combinatorial patterns that are common and then start to think about what they mean biologically. That’s the hard part.”

Algorithms for defining chromatin states from genome-wide datasets are the “most interesting advance” in understanding chromatin modification in the past three years, says Keji Zhao at the US National Institutes of Health. Researchers in his laboratory were one of the first groups to map the genome-wide methylation and acetylation patterns in human histone proteins. Such maps have a variety of applications, he says. They could help match genes with their regulatory elements or assess the differentiation potential of cells.

“Chromatin-state maps can help us out a lot,” says Jason Lieb at the University of North Carolina at Chapel Hill. He and others have developed techniques to assess whether chromatin is in open or closed conformations across the genome and used these to identify cell-specific regulatory elements. Last year, his team identified a genetic polymorphism associated with type-2 diabetes in open chromatin of insulin-producing islet cells[3]. (Commercial kits for evaluating chromatin states at discrete locations are available. EpiQ from Bio-Rad uses quantitative PCR to assess how accessible chromatin is across researcher-selected locations. “A researcher can analyze the chromatin structure of over 100 different genomic loci from a typical EpiQ sample,” says Steven Okino, staff scientist at Bio-Rad. EpiTect ChIP from SA Biosciences uses a similar approach to look at 84 preselected genomic locations.)

C. David Allis and colleagues at the Rockefeller University are studying a histone protein called H3.3, which sometimes takes the place of the standard histone H3 in nucleosomes. His team found, surprisingly, that histone H3.3 seems to have its own set of chaperone proteins that place it, and it alone, into the genome[4,5]. The chaperone proteins for histone H3.3 had not previously been identified as chromatin remodelers, but researchers in other laboratories had implicated these proteins in mental retardation, cell death and pancreatic cancer.

The distribution of histone H3.3, which is often associated with active genes, was also surprising. It occurred in euchromatin as well as in quintessential regions of heterochromatin such as telomeres, the regions at the end of chromosomes. Allis turned to the literature and was intrigued by work that analyzed histone modifications and DNA-binding proteins across the Drosophila melanogaster genome, mapping recurring combinations into nine chromatin states[6]. Histone H3.3 was not a focus of this analysis, says Allis, but the pattern was obvious. The genomic locations of one of the nine states and its associated histone modifications matched what Allis had observed with histone H3.3. “All the clusters they defined as chromatin state 3 tracked very nicely,” says Allis. “It was a near-perfect match.” Work in his and other labs is now well underway to understand pancreatic cancer and other diseases in terms of chromatin biology, says Allis. “An awful lot of work came together remarkably fast, and the genome-wide maps contributed.”

Photo caption (credit: Maria Nemchuk): Mapping more histone marks may identify additional chromatin states and potentially new regulatory principles, says Bradley Bernstein at Massachusetts General Hospital. “Part of the excitement of the states field is looking across more and more marks and asking, ‘where are there things we are missing?’.”

Sorting out chromatin states

Mapping chromatin states is very much a work in progress. Various statistical techniques have been applied so far to datasets cataloguing different marks and proteins from several species, detecting fewer than five states or more than 50 (Table 1). That does not mean that one study is right and another is wrong. “There’s not a magic number of states,” says Lieb. “The whole point of these is just to distill down the data into something that’s interpretable.”

The data appear to be distillable: observed combinations are only a tiny fraction of the total possibilities. A study published in April 2011 examined the occurrence of DNA methylation and 11 histone marks across the Arabidopsis thaliana genome[7]. Though over 4,000 combinations are theoretically possible, only 38 occurred frequently. These could be collapsed even further, says François Roudier of the Institut de Biologie de l’Ecole Normale Supérieure in Paris, who co-led the study. “Clustering analysis indicates that the 38 combinations correspond actually to four main chromatin states with distinct functional properties.”

So far, the repertoire of observed chromatin signatures seems to be limited, not only in plants but also in fruit flies, roundworms and human cells. “It’s encouraging that you have a limited number of chromatin types,” says Bas van Steensel, a chromatin biologist at the Netherlands Cancer Institute. “It makes […] a little more manageable.” Recurrent combinations reflect a redundancy that makes biological sense, says Bradley Bernstein at the Massachusetts General Hospital. “The cell uses it to ensure robust regulation, but the computational biologist can use it to obtain robust annotations of the genome.”

Though algorithms provide a systemic, unbiased approach to find states, scientists themselves decide, roughly, how many states algorithms identify. “If you want to find dozens or hundreds of states, you can, but the datasets as they are now are not comprehensive enough to do such fine-grained classification,” says van Steensel. “The point at this stage is to obtain the big picture of chromatin.” van Steensel and colleagues analyzed 53 proteins to segregate the D. melanogaster genome into five states, which they designated using colors: put simply, yellow designates active housekeeping genes, red designates active tissue-specific genes, blue designates genes covered in the gene-repressing polycomb proteins, green designates proteins also found around […], and black designates nearly two-thirds of silent genes[8].

Photo caption (credit: Tamara Lackey): Chromatin states distill overwhelming amounts of information, says Jason Lieb at the University of North Carolina, Chapel Hill. “It’s bound to be an oversimplification, but it’s also bound to be a useful way to understand genome organization.”

Though flies and roundworms lack DNA methylation, many mapping projects use these species. In addition to the advantages of […], the […] of these species are a fraction the size of the […], providing a better ratio of signal-to-background noise.

Researchers led jointly by Gary Karpen at Lawrence Berkeley National Laboratory and Peter Park at Harvard Medical School looked at 18 histone marks in D. melanogaster and applied different algorithms to the same data to segregate the genome into 9 or 30 states[6]. The more states there are, the more complicated follow-up experiments become, but too few states can be even more confusing, says Karpen. “There is a point at which you lose meaning because you go too low. You’re lumping things together that don’t belong together.”

Manolis Kellis and his postdoc Jason Ernst at the Massachusetts Institute of Technology used 38 histone marks to find 51 states in human lymphocytes, which they grouped into five classes: active intergenic states, large-scale repressed states, promoter-associated states, repetitive states and transcription-associated states[9]. A subsequent study with Bernstein across nine cell types identified cell-specific regions, linking distal regulatory elements to putative target genes and hinting at the functional relevance of disease-associated genetic polymorphisms[10].

Photo caption: Manolis Kellis of MIT says chromatin states reveal sophisticated organization: “You’re only finding a small number of chromatin states compared to the huge plethora of possibilities.”

One of the most important next steps is to figure out which combinations of marks […] biologically distinct states of chromatin, says Kellis. In general, the more marks are examined, the more subtle are the distinctions that can be discerned. Some marks carry more information than others, however, and the functional meaning of some marks changes depending on the presence of other marks, Kellis explains. “In English, just seeing the letter ‘e’ in the middle of the word does not tell you anything about its pronunciation if you don’t look at the context of what other letters are there.” Similarly, some marks may indicate a repressed state when combined with one mark but an active state when combined with another.

To get a better handle on chromatin states, researchers still have to answer several questions, Kellis says: “How many marks do we need to experimentally map in each new cellular condition? Which marks should be prioritized to capture different subsets of states and which are redundant? And given a set of genome-wide maps of individual chromatin marks, how many biologically meaningful chromatin states can be distinguished reliably?”

And the discovery of new histone marks and chromatin proteins will likely reveal new states. “To be sure that you’ve covered all the states, you have to include all the proteins that are representative of the states, and since we don’t know what the states are, it’s always possible we are missing something,” says van Steensel.

Consistent measuring

Validating techniques for mapping marks and chromatin proteins is a challenge. van Steensel’s technique marks DNA that comes into contact with a protein of interest by fusing it with a D. melanogaster protein that methylates the nucleotide adenosine. This can reveal which sequences encounter, even transiently, a variety of chromatin-associated proteins, including those that modify histones or line the […].

Most chromatin mapping studies, however, rely on chromatin immunoprecipitation (ChIP), which uses antibodies to particular histone modifications or DNA-binding proteins to purify associated DNA, which can then be analyzed by sequencing or microarrays. But these antibodies do not always perform as expected. Although genome-wide ChIP studies have been around for several years, biologists must be careful about these datasets, says Karpen. “It’s not clear what you can and cannot trust from the literature.” It is clear, he adds, that researchers need data from several types of control studies to trust the antibodies.

Because these reagents are so crucial, researchers in several laboratories recently banded together to characterize the performance of 246 commercially available antibodies directed to 3 unmodified histones and 57 distinct histone modifications. They evaluated each antibody in three ways: ChIP studies to make sure the antibodies would pull down the desired mark; dot blots (using synthetic peptides with an array of modifications) to assess whether antibodies ever ‘mistook’ one mark for another, and western blots to assess whether antibodies cross-reacted with other cellular components[11]. Success in one assay does not guarantee success in others, says Lieb. “The fact that it works on a western [blot] doesn’t mean that you should stop testing it.”

About a quarter of the antibodies failed tests for specificity. For example, antibodies to a triply methylated lysine on a histone might also bind to singly or doubly methylated versions. In three cases, antibodies were completely specific, but for different modifications than the ones that they had been sold to detect, says Lieb. Antibodies also sometimes pulled down unmodified histones or proteins besides histones. As protein content varies by condition, cell type and species, antibodies that demonstrate exquisite specificity under one set of conditions may perform less well under another. And polyclonal antibodies, which comprise the majority of commercial preparations, can vary considerably between batches. “You have to test every lot,” says Karpen. (One of the corresponding authors, Peter Park, has created a website (http://compbio.med.harvard.edu/antibodies/) where researchers can post test results for antibodies by lot number, which hopefully will save researchers the hassle of replicating control studies.)

Commercial manufacturers are responding to demands for reliable tools for genome-wide ChIP, says John Rosenfeld, who manages epigenetic product development at EMD Millipore. His company, for example, is now adopting the same series of tests that the consortium applied. They are also trying to make sure that researchers always have more than one antibody to use for a particular histone modification. More fundamentally, he says, a growing understanding of the context of histone marks in the cell is changing the antigens used to stimulate antibody production. This typically starts with synthetic peptides containing only a single histone modification, but in actual chromatin, histone modifications generally occur together with others, so manufacturers such as Millipore are adopting techniques to ensure that antibodies can bind the histone modification of interest in the presence of other modifications.

Just as crucial as antibodies for understanding chromatin states are the type and quantity of the cells. “If you mix cell types, you’re in danger because there could be one region that’s marked with one histone modification in one cell type but not another,” explains Lieb. van Steensel is working around this difficulty by genetically engineering flies so that the DNA-marking protein is active only in certain conditions or certain tissues. “Once this is operational, we won’t even have to sort the cells,” he says. Only DNA from the labeled cells will be amplified.

Photo caption: Eventually, says Bas van Steensel of the Netherlands Cancer Institute, chromatin states will show how genome regulation works. “What you want to know is whether the states correlate with function, and then it becomes interesting.”

Much research is done in cultured cells, which can be produced in large quantities. Scientists in Bernstein’s lab and others are developing techniques to conduct ChIP studies with a tenth or less of the normally required numbers of cells. The work, he says, is “nothing glamorous.” It consists of titrating antibodies, optimizing how DNA is fragmented and amplified, and eliminating unnecessary steps[12]. The tedium is paying off. Bernstein and colleagues now can perform ChIP studies on cells derived from tissue samples and clinical biopsies. Preliminary results, he says, show tantalizing differences between cultured cells and ex vivo samples.

Drilling down to function

The hardest work will not be in identifying the chromatin states but in figuring out how they are maintained and how they regulate the genome. “The classical way is you take a piece of functional DNA, and you ask if it helps with the expression of transgenes. You can do that for 20 or 50 genes, but it’s a lot of work. And to really test for functionality, you probably need to do 1,000 genes and the controls,” says Karpen. “We don’t have the tools to do this in a high-throughput way.”

Instead, chromatin maps are opening up the field for researchers who tend to focus in on specific genes, proteins and patterns, says Allis at Rockefeller University, one of the scientists who introduced the idea that combinations of histone marks could have distinct meanings. “It’s staggering what the researchers have learned from the genome-wide methods,” says Allis, “but at the end of the day it’s going to be important to take that information and ask what a particular chromatin state means mechanistically.”
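The counting argument in ‘Sorting out chromatin states’ (DNA methylation plus 11 histone marks, over 4,000 theoretical combinations, only 38 seen frequently) can be sketched in a few lines. This is a toy illustration and not the method of any study cited here: it assumes each genomic bin has already been reduced to a tuple of 0/1 calls, one per mark, and `theoretical_combinations`, `frequent_combinations`, `min_fraction` and the toy bins are hypothetical names and data.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Combo = Tuple[int, ...]  # one 0/1 call per mark for a single genomic bin


def theoretical_combinations(n_marks: int) -> int:
    """Number of presence/absence patterns possible for n_marks binary marks."""
    return 2 ** n_marks


def frequent_combinations(bins: Iterable[Combo], min_fraction: float) -> Dict[Combo, int]:
    """Keep only the combinations that cover at least min_fraction of all bins."""
    counts = Counter(bins)
    total = sum(counts.values())
    return {combo: n for combo, n in counts.items() if n / total >= min_fraction}


# DNA methylation plus 11 histone marks gives 12 binary features.
possible = theoretical_combinations(12)
print(possible)

# Toy genome of ten bins and three marks (made-up values, for illustration only).
toy_bins = [(1, 1, 0)] * 6 + [(0, 0, 1)] * 3 + [(1, 0, 1)]
frequent = frequent_combinations(toy_bins, min_fraction=0.2)
print(frequent)
```

The threshold is the analyst’s choice, which mirrors the point made in the text that scientists themselves decide, roughly, how many states an algorithm reports. Grouping the surviving combinations into a handful of states would need a further step, such as the clustering or hidden Markov models listed in Table 1, which this sketch does not attempt.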

1. van Steensel, B. EMBO J. 30, 1885–1895 (2011).
2. Heintzman, H.D. et al. Nature 459, 108–112 (2009).
3. Gaulton, K.J. et al. Nat. Genet. 42, 255–259 (2010).
4. Goldberg, A.D. et al. Cell 140, 678–691 (2010).
5. Lewis, P.W., Elsaesser, S.J., Noh, K.M., Stadler, S.C. & Allis, C.D. Proc. Natl. Acad. Sci. USA 107, 14075–14080 (2010).
6. Kharchenko, P.V. et al. Nature 471, 480–486 (2010).
7. Roudier, F. et al. EMBO J. 30, 1928–1938 (2011).
8. Filion, G.J. et al. Cell 143, 212–224 (2010).
9. Ernst, J. & Kellis, M. Nat. Biotechnol. 28, 817–825 (2010).
10. Ernst, J. et al. Nature 473, 43–49 (2011).
11. Egelhofer, T.A. et al. Nat. Struct. Mol. Biol. 18, 91–93 (2011).
12. Adli, M., Zhu, J. & Bernstein, B.E. Nat. Methods 7, 615–618 (2010).
13. Liu, T. et al. Genome Res. 21, 227–236 (2011).
14. Gerstein, M.B. et al. Science 330, 1775–1787 (2010).
15. modENCODE Consortium et al. Science 330, 1787–1797 (2010).
16. Riddle, N.C. et al. Genome Res. 21, 147–163 (2011).

Monya Baker is technology editor for Nature and Nature Methods ([email protected]).
