Eur opean Rev iew for Med ical and Pharmacol ogical Sci ences 2015; 19: 2651-2665 The determinations of positioning

Y.-N. CAI 1,2 , J.-Y. LIU 3, R. HU 1, Q.-H. ZHANG 1, Z.-M. DAI 1, Y.-J. QIAO 4,5 , X.-H. DAI 1,2

1School of Information Science and Technology, Sun Yat-sen University, Higher Education Mega Center, Guangzhou, China 2SYSU-CMU Shunde International Joint Research Institute (JRI), Shunde, Guangdong, China 3College of Chinese Language and Culture, Jinan University, Guangzhou, PR China 4School of Economics and Commerce, South China University of Technology, Guangzhou Higher Education Mega Center, Guangzhou, China 5Guangzhou Rural Commercial Bank, Zhujiang New Town, Guangzhou, China

Abstract. – OBJECTIVE: are gree of DNA bending and correlates with many the basic packaging units of , deter - other DNA structural characteristics. minants of nucleosome organization playing a modifications play a role of subtle allocation for major role in genome packaging. Although a nucleosome occupancy. wide variety of nucleosome organization fac - tors have been considered separately across Key Words: the whole or partial human genomic regions, it Nucleosome, Positioning, Chromatin. is unclarified that what the major determinants and their roles in scale are when being put all together. And it is also unknown that what the similarities and differences of determinants be - Introduction tween different genomic features such as genes of different expression levels or genomic Nucleosome is the fundamental packing unit regions with different functions. of chromatin 1 composed of DNA and , MATERIALS AND METHODS: We detected meaning a DNA superhelix of about 200bp, one commonalities and characteristics of nucleo - histone octamer (composed respectively by 2 some positioning determinants in different genes and regions with 1591486 nucleosomes copies of the four fundamental histones: H2A, identified by ourselves in human CD4+ cell. H2B, H3, and H4), and one monomer histone RESULTS: It was found that a distinct linear H1. Histone variants such as H2A.Z and H3.3al - combination of about 20 nucleosome-positioning so can be found, especially in active genes 2. Nu - factors explained nucleosome occupancy for cleosome plays a crucial role in the fundamental each genomic feature. In those linear combina - molecular activities, such as DNA replication 3, tions, 6 DNA sequence attributes (Roll stiffness DNA repair 4, 5-9 , and RNA alterna - and Twist stiffness, CT and AG, CG and shift stiff - 10 ness) and a histone modification (H4R3me2) are tive splicing . The exact position of nucleosome shared. And other factors are varied. Roll stiff - in the genome sequence is recognized to be influ - ness and Twist stiffness are the most important enced by several factors, including DNA se - features. They are dominant, alone explaining quence preferences of nucleosomes themselves 11- 96.61-98.45% of the positioning weight in each 15 , chromatin remodeling factors 16-18 , insulator 19 , genomic feature. The characteristic factors in Alu elements 20,21 , TF 22 , DNA 23-27 , hi - each combination are larger in number, but weak - 28 29,30 er in power. Numerous histone modifications play stone modification , histone variants , nu - 31 a subtle role for nucleosome positioning. cleotide content , physicochemical properties, CONCLUSIONS: The present study provides and so on 32-34 . Researches on nucleosome posi - a more accurate positioning nucleosome-map tion have been carried out on S. cerevisiae 35,36 with higher resolution and a dramatically sim - and C. elegans genomes 37 . However, most studies plified means to predict and understand intrin - on nucleosome positioning factors simply fo - sic nucleosome occupancy in different genom - cused on one certain type respectively , such as ic features in human CD4+ cell. Roll stiffness 38 and Twist stiffness are the two most important DNA methylation , high G+C content (which determinants in all genomic features. They may correlates positively with intrinsic nucleosome dominate because they both determine the de - occupancy), nucleotide content, histone modifi -

Corresponding Author: Yanning Cai, Ph.D and Xianhua Dai, Ph.D; e-mail: [email protected] 2651 Y.-N. Cai, J.-Y. Liu, R. Hu, Q.-H. Zhang, Z.-M. Dai, Y.-J. Qiao, X.-H. Dai cation, or DNA physical properties 32 . Some were cleotide and of G/C, purine (A/G), Y (C/T) and just confined to one certain human genomic re - Keto (G/T), physical properties of 35 types of gion such as exon 39 , promotor 40,41 , or region near single strand dinucleotide, 21 types of histone alternative splicing sites 42-44 . Although each of methylation (H2A.Z included), and 18 types of these factors respectively has a certain effect on histone . nucleosome positioning, it’s better to take them all into account. This study aims to investigate the major de - Nucleosomes’ Positioning-map and terminants and their respective strength under Occupancy scores the interaction of 3 types of factors across dif - Nucleosome positioning is usually dynamic ferent human genomic features, including the and unstable. The exact position of nucleosome genes of five expressed levels (active, median, may shift, vibrate or remove with rest or active middle, low and silent level ) and four genomic changes of cell state or cellular processes of regions (3’ untranslated region (UTR), coding- DNA replication, transcription, and recombina - area, promoter, and intron). Since there are tion. Therefore, the certain position of nucleo - more nucleosomes in intron than other regions , some should be described by “positioning” and for a better clarification, intron is d ivided into 5 “occupancy”. Positioning refers to the times (or types (I-active, I-median, I-middle, I-low, and probability) of the nucleosome appearing at the I-silent) according to the express level of the exact ly same place on the DNA sequence, while genes they belong to. occupancy refers to the times (or probability) of the nucleosome appearing at certain locus on the DNA sequence, as is shown in Figure 1 45 . Methods We use available data from genome-wide studies to rebuild a new position map. In order A feature selection step is performed to iden - to investigate intragenic nucleosome position - tify which sequence features influenc ing nucle - ing , we analyzed available sequencing data osome occupancy or positioning are strongly as - from human CD4 + T-cells 46,47 . Furthermore, sociated with the in vitro nucleosome data of public gene-expression data used in this study Kaplan et al (Nature 2009; 458: 362-366). The are also from human CD4+ T-cells, Schones DE features included: (1) mononucleotide frequen - et al 46-48 provides the distribution counts of short cy (i.e. G+C content); (2) predicted DNA struc - sequence read ing for nucleosomes at 10 bp res - tural characteristics (each calculated from the olution in genome-wide human CD4+ cell, dinucleotide content using a simple linear for - which is nucleosome occupancy score mula 17 ); (3) nucleosome positioning and exclud - (NOScore) at 10 bp resolution, but the exact nu - ing sequences from the literature 10-14 ; and (4) the cleosome locations and NOScore at single-nu - frequency of 4-bp sequences over a 150-bp win - cleotide resolution are not given . So, we have to dow. We used 4-mers instead of 5-mers (as in rebuild the nucleosome map and NOScore at the Kaplan model) were used in order to limit single-nucleotide resolution with the ChIP-seq the number of features, and to obtain inputs that reads data of 21 histone methylation (including correlate independently with nucleosome occu - histone variant H2A.Z) 49 , as well as 18 histone pancy (since each 4-mer occurs more frequently acetylation 50 in human CD4+ cell. than nucleosomes, on average). We identified DANPOS 51 is the most optimal algorithm by 130 features that we deemed to be associated now designed for nucleosome calling and scoring with in vitro nucleosome occupancy across the at single-nucleotide resolution, which is written yeast genome, including representatives of all with python programmer language. It can detect categories (1-4) mentioned above . three categories of nucleosome dynamics, such Based on available data, 97 factors as position shift, fuzziness change, and occupan - (http://diprogb.fli-leibniz.de/ ) are selected for cy change for nucleosomes associated with envi - analyzing the nucleosome-positioning determi - ronmental changes. Besides, it can also be used nants in this study . They belong to 3 categories, to give positioning and NOScore for nucleo - the nucleotide content, the nucleotide physico - somes from a single environmental condition. chemical properties and the histone modification. The latter function was used in our studies. The And the details are as follows: contents of 4 algorithm is also applicable for analyzing histone types of mononucleotide, 16 types of dinu - modifications.

2652 The determinations of nucleosome positioning

Figure 1. Nucleosome occupancy and posi - tioning. Illustration that distinguishes nucleo - some occupancy level and positioning. Spheres represent nucleosomes resting on DNA “lines”. Occupancy is high when all lines contain a sphere. Positioning is high when all spheres that are present are aligned vertically.

The Histone Modification Score ties for nucleosome were calculated by the soft - of Nucleosomes ware DiProGB (The Dinucleotide Properties Similarly, the histone modification (including Genome Browser provided on the website, histone variants) scores of mononucleotide dis - http://diprogb.fli-leibniz.de/ )53 . crimination rate was calculated by the software program danpos 51 and, then, the average nu - The Gene Expression Level and cleotide score of the nucleosome will be the Gene Function Regions modification score of the nucleosome. The gene expression level is obtained from the Su et al 48 . Based on expression level, the 12000 The Nucleotide Content of Nucleosome ref genes are divided into five groups: the active and its Physical Property Score group, the medium group, the middle group, the After the nucleosome position sequences hav e low group, and the silent group. Downloaded been got, the nucleosome DNA sequence (hg18) from the UCSC genome browse, the different can be downloaded from the UCSC genome functional regions of human genome hg18 are browser (http://genome.ucsc.edu/ )52 . Next, the 3’UTR, CDS, intron, and the 1000bp region in data of nucleotide contents and physical proper - the upper TSS of each gene are read as promoter.

Figure 2. The proportional relation be - tween nucleosome numbers and lengths of chromosomes. The regression parame - ters of Nu.num = -6.217*104+3.586*10- 3 Chr. len both pass t-tests (p <0.05), and their regression formula passes f-test (p =1.529*10-15), Adjusted R2=0.9448.

2653 Y.-N. Cai, J.-Y. Liu, R. Hu, Q.-H. Zhang, Z.-M. Dai, Y.-J. Qiao, X.-H. Dai

Figure 3. The clustering analysis of nucleo - some density of chromosomes.

All the nucleosomes respond their sequence posi - positioning factor regression models of corre - tions to different gene expression levels and dif - sponding expression levels and different gene re - ferent functional regions. The nucleosomes are gions are obtained. The regression coefficient of proportionally shown in Figure 3. The nuleo - each factor in the model (hereinafter referred to somes accounts for only 0.006% of all nucleo - as positioning weight ) stands for their contribu - somes in the 5’UTR region , which is abandoned tion to the nucleosome positioning for the group. because of the small proportion . After standardization , the above positioning fac - tors of each equation are detected and the final Statistical Methods and Software weights of positioning factors are comparable di - The possible positioning factors for nucleo - rectly. The heavier the factor is , the greater con - somes are quite huge in number and complex in tribution it will make to the nucleosome position - interrelationship. In order to clearly find out the ing. The plus or minus before the weight shows main positioning factors and their interrelation - its positive or negative role to the nucleosome ship, Lasso (The Least Absolute Shrinkage and positioning: plus indicates the induction of nucle - Selectionator, Tibshirani (1996)) is utilized here osome positioning and positive factors while mi - to make linear regression models which is sim - nus shows rejection of nucleosome positioning plified as much as possible (to make the p value and negative factors. of each final positioning factor less than 0.05). To make a contrast of the specificities of nucle - Lasso is a kind of shrinkage estimation, and its osome positioning factor combinations of each fundamental theory is that under the restrained group, layering sampling of random selection of condition , that the sum of the regression coeffi - the same number of nucleosomes respectively cient absolute value is smaller than a constant, from the 24 chromosomes are analyzed by the the residual sum of squares is minimized, and same way of lasso regression and correspondingly consequently some coefficient shrinkages and the regression factors of each group and their in - some models whose coefficients exactly equal to fluence degrees (the specified data is omitted zero will be produced. In this way the weights of here). The regression factors of contrasting groups some relevant factors can be cut to a maximum indicate that the regression factors are special. extent and therefore allocate greater weight to the most contributing factors, and the over-fitting can be avoided as well. Results By the package glmnet of the software R, the Lasso regression analysis of the above various 9555851 nucleosomes of mononucleotide reso - positioning factors is performed. After each re - lution inside CD4+ cell genome are obtained ( p < gression analysis, the obtained model and its po - 10 -5 ). The number of nucleosome in each chromo - sitioning factors will be f-tested (p<0.01) and t- some is shown in Table I . The numbers of each tested (p<0.05). If there is any positioning factor chromosome are different and so is their in the model that cannot be t tested, that factor proportion in all nucleosomes of human genomes. will be deleted from the model. The rest factors The nucleosomes of chr1-chr12 all respectively are analyzed by lasso regression analyzed again. exceed 4% of all nucleosomes of genomes and The process is repeated until all factors in the their total sum exceeds 70% of the whole. How - model can be tested. Through Lasso regression of ever, taking the length of each chromosome into nucleosome of different expression levels and consideration, it can be found that the correlation different gene regions, finally the nucleosome coefficient of the nucleosome numbers of each

2654 The determinations of nucleosome positioning

Table I. Length of 24 human chromosomes, nucleosome number, their percentage and density.

Chr Chr Length (bp) Nucleosome number (Nn) percentage Density (L/Nn)

1 247249719 783879 8.20 315 2 242951149 823684 8.62 295 3 199501827 680950 7.13 293 4 191273063 623827 6.53 307 5 180857866 605946 6.34 298 6 170899992 579708 6.07 295 7 158821424 513830 5.34 309 8 146274826 498448 5.22 293 9 140273252 385858 4.04 364 10 135374737 454847 4.76 298 11 134452384 461601 4.83 291 12 132349534 448252 4.69 295 13 114142980 323990 3.39 352 14 106368585 299672 3.14 355 15 100338915 274370 2.87 366 16 88827254 266105 2.78 334 17 78774742 259372 2.71 304 18 76117153 265819 2.78 286 19 63811651 168518 1.76 379 20 62435964 220165 2.30 284 21 46944323 117620 1.23 399 22 49691432 119100 1.25 417 X 154913754 378681 3.96 409 Y 57772954 1609 0.02 35906 chromosome and their respective base pair length sity of each chromosome is shown in Table I. We is 0.97, and is statistically significant (a=0.01 ). can found that the density of chrY is only 1/126 Software R is used to make linear regression to 1/ 62 that of other chromosomes and the dif - analysis and detect that they are proportional in ference is quite remarkable. Hierarchical cluster - relation, which is shown in Figure 2. ing (hclust) analysis by software R indicates that Although the nucleosome numbers of chromo - the nucleosome densities of 24 chromosomes are some are all proportional to their respective divided into two types: the type of chrY, the rest lengths, the nucleosome densities of each chro - chromosomes belong to the other type, as is mosome, that is the average base pair number of shown in Figure 3. In the same way, the variance nucleosomes, are different. The nucleosome den - analysis by software R is made to check the nu -

Figure 4. The nucleosome percentage of different gene expression levels (left) and different functional regions (right). 2655 Y.-N. Cai, J.-Y. Liu, R. Hu, Q.-H. Zhang, Z.-M. Dai, Y.-J. Qiao, X.-H. Dai

Table II. 10 most important positioning factors for nucleosomes of each level.

active Roll stiffness, Twist stiffness,H4R3me2, H3K9me3, Slide(Dpc) CG, shift stiffness, H4K20me3, H4K20me1, H2BK5ac medium Roll stiffness, Twist stiffness, H4R3me2, 64Shift, H3K9me3 H2AK5ac, CG, Shift stiffness, H4K20me1, H4K91ac middle Roll stiffness, Twist stiffness, H4R3me2, H3K27me3, 64Shift CG, Shift stiffness, H4K91ac, H4K20me1, H4K20me3 low Roll stiffness, Twist stiffness, 64Shift, H4R3me2, H3R2me2 CG, Shift stiffness, H4K91ac, H4K20me1, H3K4me3 silent Roll stiffness, Twist stiffness, H3K9me3, 64Shift, H4R3me2 H3K36me1, CG, H3K27me2, Shift stiffness, H4K20me1

For nucleosomes of each expression level, 5 most important positive factors and 5 most important negative ones are in the up - per and lower lines, respectively. cleosome density of 24 chromosomes and we can fect on nucleosome positioning, and each factor find that the density of chrY is notably different weight accounts for about 20-80% of the whole from those of other chromosomes at a=0.01 ( p = group weight. Main forces : 1 ≤ weight < 10. The 0), indicating that as the unique chromosome for single main force power lies between the over - male, the nucleosome positioning system of it is whelming main forces and the common factor possibly different from that of other chromo - forces. Common forces : 0.1 ≤ weight < 1. Each somes . C onsequently , its internal gene function common factor force weight accounts for about and expression may also be strikingly different 1% of the whole group weight. Tiny adjusting from those of other chromosomes. forces : weight < 0.1. The factors of this group exert very tiny influence on nucleosome position - Distribution on Genes ing. Each tiny adjusting force weight accounts Through gene projection, 1591486 nucleo - for about 0.1% or even less of the whole group somes for positioning factor analysis are ob - weight. tained. All these nucleosomes are divided into groups based on their gene expression power and Analyzed by Different Gene Expression gene functional regions and the proportion of nu - Levels cleosome of each group accounting for all The five groups of nucleosome positioning genome nucleosomes are obtained as shown in factors of different gene expression power and Figure 4. With the decrease of gene expression their weights are shown in Figure 5. The five power and functional region expression level, the groups of positioning factors vary in number and nucleosome proportion increases . As for the type, ranging from 19 to 23. Among the largest complete no expression power intron, its nucleo - weights of each positioning model group, the some percentage also rises with the decrease of first five positive factors and the first negative its gene expression level. factors are shown in Table II . To get the exact influential factors of nucleo - Analyzed by different gene functional regions some positioning of human CD4+ cell genes the eight groups of nucleosome positioning fac - with different expression levels or different gene tors of different gene functional regions and their regions, the whole explaining ability of each weights are shown in Figure 6. model is sacrificed and the lasso regression has Among the largest weights of each positioning been used repeatedly on each nucleosome group model group, the first five positive factors and to get the final p value of each positioning factor the first negative factors are shown in Table III . smaller than 0.05. To better analyze the positioning factor weights of each model group, we divide the posi - Discussion tioning weights into 4 classes according to their quantity : Overwhelming main forces : weight ≥ The above model groups , share the following 10. The overwhelming main forces exert huge ef - similarities:

2656 The determinations of nucleosome positioning

Figure 5. The histogram of nucleosome positioning factors of different gene expression levels. The figure continued

2657 Y.-N. Cai, J.-Y. Liu, R. Hu, Q.-H. Zhang, Z.-M. Dai, Y.-J. Qiao, X.-H. Dai

Nucleosome positioning factors for active, medium, middle, low, and silent gene, respectively. The right is zoom in of the left, respectively.

Figure 5. (Continued).

In each group, there must be the overwhelm - The above similarities indicate that nucleo - ing main positioning factors of Roll stiffness and some positioning gets great support: the two Twist stiffness, and always there are CT and AG, overwhelming main forces appear only as posi - the two tiny adjusting positive factors and CG tive factors, small in number, but huge in power, and Shift stiffness, the two common and tiny ad - and their total weight accounts for 80% to 95% justing negative factors. 2. Every group is short of each group weight. The common forces main - of main positioning factors. 3. In each group, ly appear as positive factors, seldom as negative there must be the common positioning factors of factors; the tiny adjusting factors are great in H4R3me2. The common positioning factors number, but tiny in function and in total sum, and mainly appear as positive factors, seldom as neg - they are the main composing force of negative ative factors. 4. In each group, the number of the force. Therefore, the positive force composed by tiny adjusting positioning factors is the largest, overwhelming main forces, common forces and accounting for nearly 50% of the total . tiny adjusting forces far outweighs the negative

2658 The determinations of nucleosome positioning

Figure 6. The histogram of nucleosome positioning factors of different genomic regions.

2659 Y.-N. Cai, J.-Y. Liu, R. Hu, Q.-H. Zhang, Z.-M. Dai, Y.-J. Qiao, X.-H. Dai

Figure 6. The histogram of nucleosome positioning factors of different genomic regions. Figure continued

2660 The determinations of nucleosome positioning

Figure 6. (Continued) .

2661 Y.-N. Cai, J.-Y. Liu, R. Hu, Q.-H. Zhang, Z.-M. Dai, Y.-J. Qiao, X.-H. Dai

Nucleosome positioning factors for 3’UTR, CDS, promotor, I_active, I_medium, I_middle, I_low and I_silent regions, respec - tively. The right is zoom in of the left, respectively. Figure 6. (Continued) .

force composed by common forces and tiny ad - Besides the above similarities, there are also justing forces, which makes it easy for nucleo - some differences in each group of positioning somes to be positioned. The contrast analysis of factors. nucleosome positioning factors also demon - 1. Each group of positioning factors is differ - strates in an opposite way that exactness of un - ent in number, ranging from 10 to 24. 2. Except cleosome positioning identified by DANPOS. As the above mentioned a few stable positioning fac - for the lack of main positioning forces in each tors, the rest positioning factors in each group do group, it may be relevant to the limited ranges of not always remain the same. 3. The weights of positioning factors in this study. Further studies the same positioning factors in each group are need to be carried out when conditions are per - different, even opposite in function. Take mitted. H4K20me3 as an example, a factor is positive in

2662 The determinations of nucleosome positioning

Table III. 10 most important positioning factors for nucleosomes of each region.

3’ UTR Roll stiffness, Twist stiffness, H4R3me2, Slide(Dpc), H2AZ Shift stiffness, H4K20me1, H4K20me3 (*) CDS Roll stiffness, Twist stiffness, H4R3me2, Shift,TG Shift stiffness, H4K20me3, H4K20me1, H3K4me1(*) Promotor Roll stiffness, Twist stiffness, H4R3me2, H3K9me1, CA H4K5ac, H4K91ac, H4K20me1, H2BK5ac(*) I_active Roll stiffness, Twist stiffness, H4R3me2, H3K9me3, Slide(Dpc) CG, Shift stiffness, H4K20me1, H4K91ac, H3K4me3 I_medium Roll stiffness, Twist stiffness, H4R3me2, H3K23ac, Shift H3K27me3, CG, Shift stiffness, H4K20me1, GG I_middle Roll stiffness, Twist stiffness, H3K27me3, H4R3me2, H2AK9ac CG, H4K91ac, Shift stiffness, H4K20me1, H4K20me3 I_low Roll stiffness, Twist stiffness, Shift, H4R3me2, H3K27me2 CG, Shift stiffness, H4K20me1, H3K4me3(*) I_silent Roll stiffness, Twist stiffness, H3K9me3, Shift, H4R3me2 H3K36me1, CG, Shift stiffness, H4K20me1, H4K91ac

(*): the negetive factors are less than 5. For nucleosomes of each region, 5 most important positive factors and 5 most important negative ones are in the upper and lower lines, respectively. the medium group but negative in the middle cleosome density in chrY. The nucleosomes will group. 4. In each group, the quantities of nu - influence the gene function, which reminds us to cleotide, nucleotide properties and histone modi - investigate the different functions between the fication are changeable in number. nucleosomes of chrY and those of other chromo - Although each group of positioning factors somes. var ies in number, weight and even in function Through the analysis of nucleosome distribu - detection, the total positive factor sum and their tion of genes, it can be found that the higher the weights are far from being challenged by that gene expression level is, the lower the nucleo - of the negative factors, which is finally able to some proportion will be, and vice versa. This determine the nucleosome positioning. The dif - phenomenon is coherent with our current view - ferences of each positioning factor group indi - point, because the higher the gene expression cate the flexibility of nucleosome positioning, level is, the more frequent the internal molecular which can meet the needs of genes to have dif - activities of transcription and translation that nu - ferent gene expression levels and different gene cleosome involves will be ; the less bindings nu - functions. cleosomes will get, and thus nucleosome as the The analysis of the similarities and differences DNA binding compositions will not appear so of nucleosome positioning well reflects the com - frequently. On the other side, the lower the gene plexity of nucleosome positioning system, as expression level is, the more bindings from nu - well as its principle and flexibility. cleosome DNA will get. DANPOS, the optimal method at present is Through analysis of nucleosome positioning utilized to identify 9555852 nucleosomes of hu - of genes, we found that the leading forces for nu - man CD4+ cell genomes. These nucleosomes are cleosome positioning of human CD4+ cell genes uniform distributed into chromosomes according are Roll stiffness and Twist stiffness, however, to length of chromosomes; however, the nucleo - there is not enough main force grade between the some densities of each chromosome are not even, overwhelming main forces and the common especially the density of chrY is far lower than forces. It might be because that the factors lack - those of other chromosomes, which indicates that ing force grade are not covered in this study and, there must be a certain hidden nucleosome posi - thus, is the result of all groups of positioning fac - tioning system in chrY, which cuts down the nu - tors lacking in the positioning factors this grade.

2663 Y.-N. Cai, J.-Y. Liu, R. Hu, Q.-H. Zhang, Z.-M. Dai, Y.-J. Qiao, X.-H. Dai

To meet the needs of different gene expression ZHAO K, B ROWN M, L IU XS. Nucleosome dynamics levels and different gene functions, nucleosomes define transcriptional enhancers. Nat Genet 2010; of human CD4+ cell genomes have evolved vari - 42: 343-347. ous nucleosome positioning strategies. In these 6) JIN J, B AI L, J OHNSON DS, F ULBRIGHT RM, K IREEVA ML, KASHLEV M, W ANG MD. Synergistic action of RNA strategies, the physical properties of dinucleotide polymerases in overcoming the nucleosomal bar - are the most important ones , always with Roll rier. Nat Struct Mol Biol 2010; 17: 745-752. stiffness and Twist stiffness as the overwhelming 7) SVAREN J, H ORZ W. Regulation of gene expression leading positive factors. The negative positioning by nucleosomes. Curr Opin Genet Dev 1996; 6: factors are always taking dinucleotide CG quanti - 164-170. ty and shift stiffness as leading factors. The his - 8) JIANG C, P UGH BF. Nucleosome positioning and tone modification varies with the change of gene gene regulation: advances through genomics. Nat expression levels and function regions, playing a Rev Genet 2009; 10: 161-172. tiny role of adjusting for nucleosome positioning. 9) TIROSH I, B ARKAI N. Two strategies for gene regula - tion by promoter nucleosomes. Genome Res 2008; 18: 1084-1091. 10) TILGNER H, N IKOLAOU C, A LTHAMMER S, S AMMETH M, Conclusions BEATO M, V ALCARCEL J, G UIGO R. Nucleosome posi - tioning as a determinant of exon recognition. Nat This study provides a higher resolution and Struct Mol Biol 2009; 16: 996-1001. more accurate positioning nucleosome- map and 11) WIDOM J. Role of DNA sequence in nucleosome a dramatically simplified means to predict and stability and dynamics. Q Rev Biophys 2001; 34: understand intrinsic nucleosome occupancy in 269-324. different genomic features in human CD4+ cell. 12) FERNANDEZ AG, A NDERSON JN. Nucleosome posi - tioning determinants. J Mol Biol 2007; 371: 649- Roll stiffness and Twist stiffness are the two most 668. important determinants in all genomic features. 13) CHUNG HR, V INGRON M. Sequence-dependent nu - They may dominate because they both determine cleosome positioning. J Mol Biol 2009; 386: 1411- the degree of DNA bending and correlate with 1422. many other DNA structural characteristics. His - 14) FRASER RM, K ESZENMAN -P EREYRA D, S IMMEN MW, A LLAN tone modifications play a role of subtle allocation J. High-resolution mapping of sequence-directed for nucleosome occupancy. nucleosome positioning on genomic DNA. J Mol Biol 2009; 390: 292-305. ––––––––––––––––– –– 15) TRAVERS A, H IRIART E, C HURCHER M, C ASERTA M, D I Acknowledgements MAURO E. The DNA sequence-dependence of nu - cleosome positioning in vivo and in vitro. J Biomol This paper is supported by National Nature Science Fund of Struct Dyn 2010; 27: 713-724. China, Grant: 61174163. 16) MAYES K, Q IU Z, A LHAZMI A, L ANDRY JW. ATP-depen - –––––––––––––––– –-– –– dent chromatin remodeling complexes as novel Conflict of Interest targets for cancer therapy. Adv Cancer Res 2014; 121: 183-233. The Authors declare that they have no conflict of interests. 17) GUYON JR, N ARLIKAR GJ, S ULLIVAN EK, K INGSTON RE. Stability of a human SWI-SNF remodeled nucleo - Reference somal array. Mol Cell Biol 2001; 21: 1132-1144. 18) BECKER PB, H ORZ W. ATP-dependent nucleosome 1) LUGER K, M ADER AW, R ICHMOND RK, S ARGENT DF, R ICH - remodeling. Annu Rev Biochem 2002; 71: 247- MOND TJ. Crystal structure of the nucleosome core 273. particle at 2.8 A resolution. Nature 1997; 389: 251- 19) FU Y, S INHA M, P ETERSON CL, W ENG Z. The insulator 260. binding protein CTCF positions 20 nucleosomes 2) MALIK HS, H ENIKOFF S. Phylogenomics of the nucleo - around its binding sites across the human some. Nat Struct Biol 2003; 10: 882-891. genome. PLoS Genet 2008; 4: e1000138. 3) EATON ML, G ALANI K, K ANG S, B ELL SP, MacAlpine DM. 20) ENGLANDER EW, H OWARD BH. Nucleosome position - Conserved nucleosome positioning defines replica - ing by human Alu elements in chromatin. J Biol tion origins. Genes Dev 2010; 24: 748-753. Chem 1995; 270: 10091-10096. 4) YING H, E PPS J, W ILLIAMS R, H UTTLEY G. Evidence that 21) TANAKA Y, Y AMASHITA R, S UZUKI Y, N AKAI K. Effects of Alu localized variation in primate sequence divergence elements on global nucleosome positioning in the arises from an influence of nucleosome placement human genome. BMC Genomics 2010; 11: 309. on DNA repair. Mol Biol Evol 2010; 27: 637-649. 22) KIM HD, O’S HEA EK. A quantitative model of tran - 5) HE HH, M EYER CA, S HIN H, B AILEY ST, W EI G, W ANG Q, scription factor-activated gene expression. Nat ZHANG Y, X U K, N I M, L UPIEN M, M IECZKOWSKI P, L IEB JD, Struct Mol Biol 2008; 15: 1192-1198.

2664 The determinations of nucleosome positioning

23) PENNINGS S, A LLAN J, D AVEY CS. DNA methylation, nu - 39) ANDERSSON R, E NROTH S, R ADA -I GLESIAS A, W ADELIUS C, cleosome formation and positioning. Brief Funct KOMOROWSKI J. Nucleosomes are well positioned in Genomic Proteomic 2005; 3: 351-361. exons and carry characteristic histone modifica - 24) CHODAVARAPU RK, F ENG S, B ERNATAVICHUTE YV, C HEN PY, tions. Genome Res 2009; 19: 1732-1741. STROUD H, Y U Y, H ETZEL JA, K UO F, K IM J, C OKUS SJ, 40) SCHMID CD, B UCHER P. ChIP-Seq data reveal nucleo - CASERO D, B ERNAL M, H UIJSER P, C LARK AT, K RAMER U, some architecture of human promoters. Cell 2007; MERCHANT SS, Z HANG X, J ACOBSEN SE, P ELLEGRINI M. 131: 831-832; author reply 832-833. Relationship between nucleosome positioning and 41) DENNIS JH, F AN HY, R EYNOLDS SM, Y UAN G, M ELDRIM DNA methylation. Nature 2010; 466: 388-392. JC, R ICHTER DJ, P ETERSON DG, R ANDO OJ, N OBLE WS, 25) CHOY JS, W EI S, L EE JY, T AN S, C HU S, L EE TH . DNA KINGSTON RE. Independent and complementary methylation increases nucleosome compaction and methods for large-scale structural analysis of mam - rigidity. J Am Chem Soc 2010; 132: 1782-1783. malian chromatin. Genome Res 2007; 17: 928-939. 26) LEE JY, L EE TH. Effects of DNA methylation on the 42) CHEN W, L UO L, Z HANG L. The organization of nucle - structure of nucleosomes. J Am Chem Soc 2012; osomes around splice sites. Nucleic Acids Res 134: 173-175. 2010; 38: 2788-2798. 27) JIMENEZ -U SECHE I, K E J, T IAN Y, S HIM D, H OWELL SC, Q IU 43) TILLO D, K APLAN N, M OORE IK, F ONDUFE -M ITTENDORF Y, X, Y UAN C. DNA methylation regulated nucleosome GOSSETT AJ, F IELD Y, L IEB JD, W IDOM J, S EGAL E, H UGHES dynamics. Sci Rep 2013; 3: 2121. TR. High nucleosome occupancy is encoded at hu - 28) HAYES JJ, C LARK DJ, W OLFFE AP. Histone contributions man regulatory sequences. PLoS One 2010; 5: to the structure of DNA in the nucleosome. Proc e9129. Natl Acad Sci U S A 1991; 88: 6829-6833. 44) VALOUEV A, J OHNSON SM, B OYD SD, S MITH CL, F IRE AZ, 29) GUILLEMETTE B, B ATAILLE AR, G EVRY N, A DAM M, SIDOW A. Determinants of nucleosome organization BLANCHETTE M, R OBERT F, G AUDREAU L. Variant histone in primary human cells. Nature 2011; 474: 516-520. H2A.Z is globally localized to the promoters of inac - 45) ZHANG Z, W IPPO CJ, W AL M, W ARD E, K ORBER P, P UGH tive yeast genes and regulates nucleosome posi - BF. A packing mechanism for nucleosome organiza - tioning. PLoS Biol 2005; 3: e384. tion reconstituted across a eukaryotic genome. Sci - 30) HENIKOFF S. Labile H3.3+H2A.Z nucleosomes mark ence 2011; 332: 977-980. ‘nucleosome-free regions’. Nat Genet 2009; 41: 46) SCHONES DE, C UI K, C UDDAPAH S, R OH TY, B ARSKI A, 865-866. WANG Z, W EI G, Z HAO K. Dynamic regulation of nu - 31) TILLO D, H UGHES TR. G+C content dominates intrinsic cleosome positioning in the human genome. Cell nucleosome occupancy. BMC Bioinformatics 2009; 2008; 132: 887-898. 10: 442. 47) SCHONES DE, Z HAO K. Genome-wide approaches to 32) MIELE V, V AILLANT C, D’A UBENTON -C ARAFA Y, T HERMES C, studying chromatin modifications. Nat Rev Genet GRANGE T. DNA physical properties determine nu - 2008; 9: 179-191. cleosome occupancy from yeast to fly. Nucleic 48) SU AI, W ILTSHIRE T, B ATALOV S, L APP H, C HING KA, B LOCK Acids Res 2008; 36: 3746-3756. D, Z HANG J, S ODEN R, H AYAKAWA M, K REIMAN G, C OOKE 33) DENIZ O, F LORES O, B ATTISTINI F, P EREZ A, S OLER -L OPEZ MP, W ALKER JR, H OGENESCH JB. A gene atlas of the M, O ROZCO M. Physical properties of naked DNA in - mouse and human protein-encoding transcrip - fluence nucleosome positioning and correlate with tomes. Proc Natl Acad Sci U S A 2004; 101: 6062- transcription start and termination sites in yeast. 6067. BMC Genomics 2011; 12: 489. 49) BARSKI A, C UDDAPAH S, C UI K, R OH TY, S CHONES DE, 34) GAN Y, G UAN J, Z HOU S, Z HANG W. Structural features WANG Z, W EI G, C HEPELEV I, Z HAO K. High-resolution based genome-wide characterization and predic - profiling of histone in the human tion of nucleosome organization. BMC Bioinformat - genome. Cell 2007; 129: 823-837. ics 2012; 13: 49. 50) WANG Z, Z ANG C, R OSENFELD JA, S CHONES DE, B ARSKI 35) YUAN GC, L IU YJ, D ION MF, S LACK MD, W U LF, A, C UDDAPAH S, C UI K, R OH TY, P ENG W, Z HANG MQ, ALTSCHULER SJ, R ANDO OJ. Genome-scale identifica - ZHAO K. Combinatorial patterns of histone acetyla - tion of nucleosome positions in S. cerevisiae. Sci - tions and methylations in the human genome. Nat ence 2005; 309: 626-630. Genet 2008; 40: 897-903. 36) LEE W, T ILLO D, B RAY N, M ORSE RH, D AVIS RW, H UGHES 51) CHEN K, X I Y, P AN X, L I Z, K AESTNER K, T YLER J, D ENT S, TR, N ISLOW C. A high-resolution atlas of nucleosome HE X, L I W. DANPOS: dynamic analysis of nucleo - occupancy in yeast. Nat Genet 2007; 39: 1235- some position and occupancy by sequencing. 1244. Genome Res 2013; 23: 341-351. 37) JOHNSON SM, T AN FJ, M CCULLOUGH HL, R IORDAN DP, 52) KENT WJ, S UGNET CW, F UREY TS, R OSKIN KM, P RINGLE FIRE AZ. Flexibility and constraint in the nucleosome TH, Z AHLER AM, H AUSSLER D. The human genome core landscape of Caenorhabditis elegans chro - browser at UCSC. Genome Res 2002; 12: 996- matin. Genome Res 2006; 16: 1505-1516. 1006. 38) CHOI JK, B AE JB, L YU J, K IM TY, K IM YJ. Nucleosome 53) FRIEDEL M, N IKOLAJEWA S, S UHNEL J, W ILHELM T. deposition and DNA methylation at coding region DiProDB: a database for dinucleotide properties. boundaries. Genome Biol 2009; 10: R89. Nucleic Acids Res 2009; 37: D37-40.

2665