Transcriptional Decomposition Reveals Active Chromatin Architectures and Cell Specific Regulatory Interactions
ARTICLE
DOI: 10.1038/s41467-017-02798-1
OPEN

Sarah Rennie1, Maria Dalby1, Lucas van Duin1 & Robin Andersson1

Transcriptional regulation is tightly coupled with chromosomal positioning and three-dimensional chromatin architecture. However, it is unclear what proportion of transcriptional activity is reflecting such organisation, how much can be informed by RNA expression alone and how this impacts disease. Here, we develop a computational transcriptional decomposition approach separating the proportion of expression associated with genome organisation from independent effects not directly related to genomic positioning. We show that positionally attributable expression accounts for a considerable proportion of total levels and is highly informative of topological associating domain activities and organisation, revealing boundaries and chromatin compartments. Furthermore, expression data alone accurately predict individual enhancer–promoter interactions, drawing features from expression strength, stabilities, insulation and distance. We characterise predictions in 76 human cell types, observing extensive sharing of domains, yet highly cell-type-specific enhancer–promoter interactions and strong enrichments in relevant trait-associated variants. Overall, our work demonstrates a close relationship between transcription and chromatin architecture.

1 The Bioinformatics Centre, Department of Biology, University of Copenhagen, Ole Maaloes Vej 5, DK-2200 Copenhagen, Denmark. Correspondence and requests for materials should be addressed to R.A.
(email: [email protected])

NATURE COMMUNICATIONS | (2018) 9:487 | DOI: 10.1038/s41467-017-02798-1 | www.nature.com/naturecommunications

The three-dimensional organisation of a genome within a nucleus is strongly associated with cell-specific transcriptional activity1,2. On a global level, transcriptional activation or repression is often accompanied by nuclear relocation of chromatin in a cell-type-specific manner, forming chromatin compartments3 of coordinated gene transcription4–6 or silencing7. Locally, chromatin forms sub-mega base pair domains of self-contained chromatin proximity, commonly referred to as topologically associating domains (TADs)8,9. TADs frequently encompass interactions between regulatory elements, such as between promoters and enhancers10–13, as well as between co-regulated genes9, which reflects cell-type-restricted transcriptional programmes14. In contrast, ubiquitously expressed promoters are enriched close to domain boundaries8 and co-localise in the nucleus15.

A tight coupling between transcriptional activity and chromosomal positioning is further supported by positional clustering of co-expressed eukaryotic genes16,17, a phenomenon that is preserved across taxa18. In addition, neighbouring gene expression correlation co-evolves and is particularly evident at distances below a mega base pair19. Furthermore, facultative heterochromatin may cover mega base pair regions20 and thus affect several neighbouring genes. These observations are in line with coordinated gene expression and chromatin states within TADs9,21, suggesting that the coupling between gene expression and chromatin architecture is, at least partly, linked to chromosomal positioning.

Genetic disruption or chromosomal rearrangements of TAD boundaries may result in aberrant gene transcription as a consequence of altered regulatory activities22 or regional chromatin states23. While this suggests a key role for chromatin structure in correct transcriptional activity, a general directional cause and effect between chromatin architecture and transcriptional activity is unclear. On one hand, TAD boundaries are to a large extent invariant between cell types8,9 and have been suggested to be mainly formed by architectural proteins11,24–26 independent of transcription15. On the other hand, transcription seems to play a major role in the maintenance of regulatory organisation within TADs15,27. In addition, chromatin compartments are not affected by architectural protein depletion26. Rather, nuclear three-dimensional co-localisation of genes seems to be driven to a large degree by transcriptional activity6,15,28–30.

The strong relationships between transcription, chromatin state, chromosomal positioning and three-dimensional chromatin organisation can be exploited for inference of one feature from another. Compartments of transcriptional activity can be deduced from genome-wide chromatin interaction data3,11 and, inversely, co-expression is indicative of enhancer–promoter (EP) interactions31. In addition, RNA expression was placed among the top-ranked features for predicting EP interactions in a recent machine-learning approach32. However, it is unclear what proportion of expression is associated with chromatin topology and chromosomal positioning and what proportion is reflecting regulatory programmes independent of the former. A way to systematically extract and separate these components from expression data could lead to new insights into chromatin topology and cell-type-specific transcriptional regulation. This is of high importance since genome-wide profiling of chromatin interactions using chromatin conformation capture techniques such as HiC33 is largely intractable due to several limitations including low spatial resolution, high cost and requirements of large abundant cellular material.

In this study, we establish a transcriptional decomposition approach to investigate and model the coupling between transcriptional activity, chromosomal positioning and chromatin architecture. Via modelling of expression similarities [...] expression related to genome architecture can be separated from effects independent of genomic positioning, and that both components are strongly represented with contributions depending on cell type and location. We demonstrate that the positionally dependent (PD) component is highly reflective of chromatin organisation, revealing chromatin compartments and structures of transcriptionally active TADs. We further demonstrate how transcriptional components can be used to infer cell-type-specific chromatin interactions and find informative transcriptional features, including enhancer expression strength, which distinguish long-distance interacting from non-interacting EP pairs. We demonstrate the accuracy of our approach in well-established cell lines and then decompose expression data from 76 human cell types in order to investigate their active chromatin architectures. Our results indicate extensive sharing of expression-associated domain structures across human cells, reminiscent of that observed for active TADs. Promoter-localised, positionally independent (PI) events as well as EP interactions are, on the other hand, highly cell-type specific. Finally, we demonstrate how transcriptional components and predicted EP interactions can be used to gain insights into the genetic consequences of complex diseases. We observe variable enrichments of genetic variants in expression components across cell types and traits and find several EP interactions in which disease-associated SNPs at enhancers may cause aberrant expression of distal genes. Taken together, we observe a tight coupling between transcriptional activity and three-dimensional chromatin architecture and suggest that regulatory topologies, domain structures and their activities may be inferred by RNA expression data and chromosomal position alone.

Results

Transcriptional decomposition of RNA expression data. We posited that a proportion of steady-state RNA expression is reflecting three-dimensional chromatin organisation. We reasoned that a transcription unit (TU) is likely to be more similar in terms of expression to its proximal TUs than to distal loci, which are likely to be associated with different domains of chromatin interactions. Thus, coordination of expression is related between positional and chromatin neighbourhoods (Fig. 1a). However, expression is also influenced by gene-specific transcriptional and post-transcriptional regulatory programmes independent of a TU’s chromosomal position. Therefore, in order to investigate the coupling between transcriptional activity and genome organisation, we need to be able to estimate the proportion of expression from a genomic region that can be explained by its chromosomal position. To this end, we developed a transcriptional decomposition approach (Fig. 1b) to separate the component of expression reflecting an underlying positional relationship between neighbouring genomic regions (PD component) from the expression dictated by TUs’ individual regulatory programs (PI component). Our strategy is based on approximate Bayesian modelling34 and utilises replicate measurements to decompose normalised aggregated RNA expression read counts in genomic bins into two components (PD and PI) and some constant intercept (Methods). The expression associated with each genomic bin was modelled as a sum of the two transcriptional components (random effects model), as illustrated in Fig. 1a. The decomposition of this sum into its respective components (summands) can be interpreted as an optimisation problem over a chromosome to best separate the proportion of total expression that is similar between consecutive genomic bins (PD) from position-independent expression (PI).
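
The idea of splitting binned expression into an intercept, a PD component that is similar between consecutive bins and a PI component that is independent per bin can be illustrated with a small numerical sketch. This is not the approximate Bayesian model used in the study (see Methods); it swaps in a much simpler penalised least-squares fit on log-scale values, purely to show the optimisation view of the decomposition. The function name `decompose` and the penalty weights `lam_pd` and `lam_pi` are hypothetical and chosen for illustration only.

```python
import numpy as np


def decompose(y, lam_pd=10.0, lam_pi=1.0):
    """Illustrative split of binned expression into intercept + PD + PI.

    y       : array of shape (replicates, bins), normalised log expression
              along one chromosome (assumed input layout).
    lam_pd  : hypothetical penalty on differences between consecutive bins.
    lam_pi  : hypothetical penalty on the size of the per-bin PI effect.

    Minimises, over pd_comp and pi_comp,
        sum_i (z_i - pd_i - pi_i)^2
        + lam_pd * sum_i (pd_i - pd_{i-1})^2
        + lam_pi * sum_i pi_i^2
    where z is the replicate-averaged, centred expression.
    """
    y = np.asarray(y, dtype=float)
    mean_per_bin = y.mean(axis=0)          # average over replicates
    intercept = mean_per_bin.mean()        # constant intercept
    z = mean_per_bin - intercept           # centred signal to decompose
    n = z.size

    # First-difference matrix: row i is e_{i+1} - e_i, shape (n-1, n).
    D = np.diff(np.eye(n), axis=0)

    # Eliminating pi_comp analytically leaves a smoothing problem for pd_comp.
    c = lam_pi / (1.0 + lam_pi)
    pd_comp = np.linalg.solve(c * np.eye(n) + lam_pd * (D.T @ D), c * z)

    # Given pd_comp, the optimal per-bin PI effect is a shrunken residual.
    pi_comp = (z - pd_comp) / (1.0 + lam_pi)
    return intercept, pd_comp, pi_comp


if __name__ == "__main__":
    # Toy chromosome: one broad active domain plus one isolated active bin.
    rng = np.random.default_rng(0)
    signal = np.zeros(200)
    signal[40:120] = 3.0     # domain-like, positionally coherent expression
    signal[160] = 5.0        # isolated, position-independent expression
    reps = signal + rng.normal(scale=0.2, size=(3, 200))
    mu, pd_comp, pi_comp = decompose(reps)
    print("intercept:", round(mu, 3))
    print("PD inside domain (bin 80):", round(pd_comp[80], 3))
    print("PI at isolated bin (bin 160):", round(pi_comp[160], 3))
```

In this toy setting the broad domain is mostly absorbed by the smooth PD term, while the isolated bin is mostly assigned to PI. The balance between the two depends entirely on the chosen penalties, which in a Bayesian treatment would instead be governed by estimated variance parameters.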