<<

Concia et al. Genome Biology (2020) 21:104 https://doi.org/10.1186/s13059-020-01998-1

RESEARCH Open Access Wheat chromatin architecture is organized in genome territories and transcription factories Lorenzo Concia1, Alaguraj Veluchamy2, Juan S. Ramirez-Prado1, Azahara Martin-Ramirez3, Ying Huang1, Magali Perez1, Severine Domenichini1, Natalia Y. Rodriguez Granados1, Soonkap Kim2, Thomas Blein1, Susan Duncan4, Clement Pichot1, Deborah Manza-Mianza1, Caroline Juery5, Etienne Paux5, Graham Moore3, Heribert Hirt1,2, Catherine Bergounioux1, Martin Crespi1, Magdy M. Mahfouz2, Abdelhafid Bendahmane1, Chang Liu6, Anthony Hall4, Cécile Raynaud1, David Latrasse1 and Moussa Benhamed1,7*

Abstract Background: Polyploidy is ubiquitous in eukaryotic plant and fungal lineages, and it leads to the co-existence of several copies of similar or related genomes in one nucleus. In plants, polyploidy is considered a major factor in successful domestication. However, polyploidy challenges folding architecture in the nucleus to establish functional structures. Results: We examine the hexaploid wheat nuclear architecture by integrating RNA-seq, ChIP-seq, ATAC-seq, Hi-C, and Hi-ChIP data. Our results highlight the presence of three levels of large-scale spatial organization: the arrangement into genome territories, the diametrical separation between facultative and constitutive heterochromatin, and the organization of RNA polymerase II around transcription factories. We demonstrate the micro-compartmentalization of transcriptionally active genes determined by physical interactions between genes with specific euchromatic histone modifications. Both intra- and interchromosomal RNA polymerase-associated contacts involve multiple genes displaying similar expression levels. Conclusions: Our results provide new insights into the physical chromosome organization of a polyploid genome, as well as on the relationship between epigenetic marks and chromosome conformation to determine a 3D spatial organization of gene expression, a key factor governing gene transcription in polyploids. Keywords: Hi-C, Hi-ChIP, DNA loops, Transcription factories, Genome territories

* Correspondence: [email protected] 1Institute of Plant Sciences Paris of Saclay (IPS2), UMR 9213/UMR1403, CNRS, INRA, Orsay, France 7Institut Universitaire de France (IUF), Paris, France Full list of author information is available at the end of the article

© The Author(s). 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. Concia et al. Genome Biology (2020) 21:104 Page 2 of 20

Background from the same (autopolyploid) or related (allopolyploidy) In eukaryotes, nuclear DNA is folded into chromatin, a species in the same nucleus. How polyploidy affects tightly packed high-order structure whose main compo- chromosome folding architecture and functional nents are DNA and histone proteins. Chromatin con- organization remains to be elucidated. Established poly- formation determines the accessibility of the double ploids often have higher fitness attributes [16], and poly- helix to the transcriptional and replication machineries ploidy is considered a major factor in successful plant [1]. It is tightly regulated in both time and space by domestication [17] as demonstrated by its widespread highly conserved mechanisms such as DNA methylation, presence among cultivated crops, both autopolyploids histone post-translational modifications, alterations in (e.g., potato [Solanum tuberosum], alfalfa [Medicago histone-DNA interactions, and incorporation of histone truncatula], banana [Musa acuminata], and watermelon variants by chromatin remodelers, as well as long non- [Citrullus lanatus]) and allopolyploids (e.g., canola, coding RNA (lncRNA)- and small RNA (sRNA)-related strawberry, cotton, coffee, sugarcane, and wheat) [18]. pathways [2–4]. Polyploidization events trigger extensive epigenetic In light of the strong influence of chromatin conform- and transcriptional alteration of the duplicated or ation on gene expression, the idea of the genome as a lin- merged genomes, accompanied by small- and large-scale ear sequence of nucleotides has been replaced by a conformational changes [18–23]. Such rearrangements dynamic three-dimensional (3D) architecture [5]inwhich are thought to have facilitated the domestication of poly- structural elements such as “loops”, “domains”, “territor- ploid crops by enhancing their adaptive plasticity [17]. ies”,and“factories” are functional components controlling The characterization of 3D chromosome topology in the physical interactions between promoters and distant polyploids crops may help better understand the degree regulatory elements [6]. Accordingly, the nucleus is to which spatial organization contributes to polyploidy thought to operate as an integrated regulatory network [7]. success. Recent development of new methods for the analysis Modern hexaploid wheat (Triticum aestivum L.; 2n = of the genome-wide 3D spatial structure of chromatin, 6x = 42) is a particularly interesting model for analyzing such as Hi-C, Hi-ChIP, and ChlA-PET, has made it pos- chromosome topology, because it is the product of two sible to unveil small- and large-scale genome topology in rounds of interspecific hybridization that occurred at various cell types of metazoan organisms, notably in different evolutionary times (Fig. 1a). The first mammals [8]. These studies revealed the existence of hybridization event occurred 0.36 to 0.50 million years megabase-long chromatin compartments comprising ei- ago and produced a tetraploid species, Triticum turgi- ther open, active chromatin (A compartment) or closed, dum (AABB) [24–27]. This hybridization involved Triti- inactive chromatin (B compartment). Chromatin con- cum urartu (donor of the AA genome) and an unknown formation capture techniques also allowed the descrip- species related to Aegilops speltoides (BB genome) [28]. tion of topologically associating domains (TADs), large Indeed Ae. speltoides cannot be considered as the exclu- chromatin domains (800 kb average length in mammals) sive donor of this genome, but the wheat B genome bringing together contiguous sequences of DNA. Genes might rather have a polyphyletic origin with multiple an- that belong to the same TAD display similar dynamics cestors involved, among which Ae. speltoides. A second of expression, suggesting that physical association is hybridization event between T. turgidum (AABB) and functionally relevant to gene expression control. the diploid species Aegilops tauschii (DD genome) gave The basic organization of plant genomes differs from rise to a hexaploid wheat (AABBDD), the ancestor of the that in animals [9–14]. For instance, plants do not dis- modern bread wheat, about 10,000 years ago [25, 29]. play an apparent A/B compartmentalization as a pre- Since the three ancestors are closely related species des- dominant genome folding feature [14, 15]. cended from a common progenitor, three distinct but Moreover, although TAD-like domains have been highly syntenic subgenomes can be identified (AA, BB, identified in maize, tomato, sorghum, foxtail millet, and and DD) [30]. Compared to tetraploid wheat, modern rice [15], no canonical insulator proteins have been de- hexaploid wheat possesses several agricultural advan- scribed in plants, challenging the classification of these tages, such as increased environmental adaptability, tol- structures [12]. This may be due to the fact that the erance to abiotic stresses (including salinity, acid pH, plant genome structure has a particular transposable and cold), and increased resistance to several pathogens, element content and location relative to genes and opens factors that contribute to its success as a crop [31]. Al- the question of the functional conservation of genome though the genetic determinants of wheat yield and folding in eukaryotes. quality have been extensively investigated [32, 33] and a Another particular feature of plant genomes is the fully annotated reference genome was recently generated high occurrence of polyploidy, which leads to the co- together with tissue-specific and developmental tran- existence of several copies of similar genomes deriving scriptomic co-expression networks [34], the influence of Concia et al. Genome Biology (2020) 21:104 Page 3 of 20

Fig. 1 Large-scale chromatin architecture analysis of hexaploid wheat. a Schematic representation of the relationships between wheat genomes, showing the polyploidization history of hexaploid wheat. b Hi-C contact matrix of the hexaploid wheat genome. c Box plots representing the distribution of the median interaction frequency between 10-Mb bins for each combination of subgenomes (upper panel) and between homoeologous and non-homoeologous of different subgenomes (bottom panel). d Root meristematic cells of T. aestivum cv. Chinese Spring labeled by GISH. The A genome is labeled in magenta, the D genome is labeled in green, and the B genome is not labeled and thus appears in gray; telomeres are labeled in red. (Left panel) metaphase cells showing 14 A chromosomes, 14 B chromosomes, and 14 D chromosomes. (Middle panel) interphase cells. (Right panel) zoom-in of the interphase nucleus indicated by the white box in the middle panel. Scale bar represents 10 μm. e Immunofluorescence detection of H3K27me3 (green) and H3K27me1 (purple) in the isolated nucleus. f Integration of H3K27me3 and Hi-C at the chromosomal level. The heatmap of intrachromosomal interaction frequency of chromosome 4 B (in red, upper panel) is presented together with the read density of H3K27me3 ChIP-seq (in green, lower panel). The genomic coordinates are indicated on the left side of the heatmap for Hi-C and under the read density plot for H3K27me3. The color bar on the right side of the heatmap shows the interaction frequency scale. g Alternative models of large-scale chromatin architecture in the nucleus of wheat Concia et al. Genome Biology (2020) 21:104 Page 4 of 20

chromatin organization on the expression of key traits of L.) (Additional file 1: Supplemental Figure S2). The experi- agricultural interest is still poorly understood. ment revealed again a three-layer hierarchy of chromo- In this study, we present an analysis of hexaploid some interactions identical to wheat, suggesting that this wheat nuclear architecture by the integration of Hi-C, organization is a general feature of polyploid plants. ChIP-seq, RNA-seq, and Hi-ChIP data. Our results high- To further analyze wheat chromatin organization, we light at least three levels of DNA spatial organization: (i) performed nucleus immunostaining with antibodies an arrangement into genome territories, (ii) a diametrical directed against different histone marks. We first labeled separation between facultative and constitutive hetero- constitutive and facultative heterochromatin using chromatin, and (iii) the organization of active chromatin antibodies recognizing H3K27me1 and H3K27me3, DNA around transcription factories, established through respectively (Fig. 1e). H3K27me3 plays a critical role in micro-compartmentalization and physical interactions of the epigenetic silencing of developmentally or stress- these genes carrying euchromatic histone modifications. regulated genes in the context of facultative (reversible) heterochromatin, while H3K27me1 is required for the Results formation and maintenance of constitutive heterochro- Wheat chromatin is organized in genome territories matin, and thus participates in the inhibition of TE To gain insight into wheat chromatin architecture, we expression [11, 37]. used Hi-C, a genome-wide chromatin conformation cap- We observed an opposite and polarized distribution of ture method that detects DNA-DNA physical interactions these two marks throughout the nucleus, which suggests [35] to examine shoot cells of 14-day-old seedlings of T. the presence of subnuclear domains with different types aestivum cv. Chinese Spring. Analysis of the whole- of chromatin. A second immunostaining with an anti- genome interaction matrix (Fig. 1b) revealed three hier- body recognizing H3K9me2, another histone modifica- archical layers of chromosome interactions, from stron- tion associated with constitutive heterochromatin [38, gest to lowest: (i) within chromosomes, (ii) between 39], confirmed this result (Additional file 1: Supplemen- chromosomes of the same subgenome, and (iii) between tal Figure S3, upper panel). We then examined active chromosomes of different subgenomes. This organization chromatin by immunostaining with an antibody directed indicates a non-random spatial distribution of the three against H3K36me3, a euchromatic mark linked to active subgenomes that could mirror the presence of functional transcription [40, 41] (Additional file 1: Supplemental “genome territories.” We quantified these differences by Figure S3, lower panel). Contrary to the previous distri- plotting the distribution of median interaction frequency bution, H3K36me3 was localized in sharp foci scattered between 10-Mb bins for intrachromosomal interactions, across the whole nucleus, suggesting a physical proxim- interchromosomal interactions within each subgenome, ity of clusters of transcriptionally active regions. and interchromosomal interactions between subgenomes To investigate the relationship between chromatin 3D (Additional file 1: Supplemental Figure S1) and between folding and the observed polarization of facultative and pairs of subgenomes (Fig. 1c, middle panel). In addition, constitutive heterochromatin, we integrated H3K27me3 we observed that the A and B genomes interact more fre- ChIP-seq data and Hi-C data at the chromosome-wide quently than A and D or B and D. We independently con- level. Single-chromosome interaction matrixes showed a firmed the presence of subgenome-specific territories strong signal over the main diagonals, consistent with using a genomic in situ hybridization (GISH) experiment distance-dependent decay of interaction frequency, in on root meristematic cells (Fig. 1d). agreement with the previous literature [35] (Fig. 1b, f; Then, we examined the intergenomic interaction fre- Additional file 1: Supplemental Figure S4). The second quency between homeolog and non-homeolog chromo- strong signal was detectable in an antidiagonal direction somes of different subgenomes (Fig. 1c, lower panel), due to extensive interactions between the two arms of finding that homeologs interact more frequently than each chromosome. This conformation is consistent with non-homeolog chromosomes (Mann-Whitney p = 2.18e the Rabl organization previously described in wheat [42] −06; Cliff’s Delta effect size = 0.65). This imbalance was and barley [14]. The comparison with H3K27me3 ChIP- not anticipated and could be the macroscopic manifest- seq data (Fig. 1f) revealed that the peaks of read density ation of specific contacts between homeolog genes, are located primarily in the subtelomeric regions of each whose expression has been reported to be often coordi- chromosome arm. Globally, this explains how the Rabl nated in space and time [36]. Root cells presented the configuration in which subtelomeres are juxtaposed, to- same chromosome interaction pattern (Additional file 1: gether with the clustering of H3K27me3-labeled genes Supplemental Figure S1). in subtelomeric regions, produces the polarized distribu- To verify whether other polyploid plants share the same tion of epigenetic marks and supports the existence of large-scale , we carried out a Hi-C functional compartmentalization within the nucleus of assay on 14-day-old seedlings of rapeseed (Brassica napus bread wheat. Concia et al. Genome Biology (2020) 21:104 Page 5 of 20

The wheat genome displays intergenic condensed spacer negative for genes overlapping with peaks of the active (ICONS) folding structures marks H3K9ac and H3K36me3 than for those without Topologically associating domains (TADs) are defined as these histone marks, while genes carrying the repressive genomic regions containing sequences that interact more mark H3K27me3 displayed the opposite behavior. This frequently with others in the same TAD than with those result highlights that, while genes generally interact in different TADs [43, 44]. In metazoans, the boundaries weakly with their neighboring sequences, those bearing of TADs are enriched in specific proteins involved in their euchromatic histone marks are especially devoid of maintenance, such as the CTCF (CCCTC binding factor) physical contact with the surrounding regions. To deter- transcriptional repressor and cohesin [45–47]. The func- mine if a quantitative relationship exists between the tional conservation in plants of these folding structures re- level of a specific histone modification and the intensity mains unclear. We therefore wanted to investigate the of chromatin physical interactions, we partitioned the possible presence of folding domains in wheat. After nor- genome in ten deciles of insulation index and plotted malizing the Hi-C interaction matrixes for technical and the median read density of each histone mark over the biological biases that could alter the number of reads genes contained in each group (Fig. 3d–f; Add- aligning on each fragment, such as GC content, distance itional file 1: Supplemental Figure S7). We observed that between restriction sites, and mappability with iterative the read density of H3K27me3 gradually increased with correction and eigenvector decomposition (ICE) (Imakaev the insulation index whereas those of H3K9ac and et al. 2012), we applied the “insulation index” technique. H3K36me3 showed a negative correlation supporting The insulation index of a genomic region is calculated as that histone modification may play a major role in chro- the average frequency of interaction with the neighboring matin interaction in wheat. regions within a predefined distance [48]. We found 32, This trend prompted us to examine whether gene ex- 299 folding domains, covering 51% of the genome with an pression is affected by local interactions. We first com- average size of about 225 kbp (Additional file 1:Supple- pared insulation indexes over coding or non-coding mental Figure S5). genes overlapping or non-overlapping with RNA poly- A prominent feature of the insulation index was the merase II ChIP-seq peaks (Fig. 3g and Additional file 1: presence of genes, both protein-coding and non-coding Supplemental Figure S7) and expressed or non- such as microRNA precursors or long non-coding RNA, in expressed genes (Fig. 3h). We found that genes bound to correspondence of local minima and in particular at the RNA polymerase II and producing detectable transcripts boundaries of folding domains (Fig. 2b–e); we subsequently have on average a more negative insulation index. Con- named these structures ICONS, for intergenic condensed sistently with the distribution of histone marks, the fre- spacers. Examining the distribution of the active chromatin quency of interaction of transcribed genes with flanking marks H3K9ac and H3K36me3 and of the facultative het- regions was inversely proportional to their expression erochromatin mark H3K27me3 over ICONS, we found all level given in “transcripts per million” (TPM) (Fig. 3i). three histone marks are enriched at the boundaries, as pre- We then considered the read density of RNA polymerase dicted by the presence of genes at those locations. In con- II ChIP-seq over each of the 10 insulation quantiles de- trast, both transposable elements and DNA methylation on scribed above and found that genes characterized by a CpG and CHG contexts were depleted at the boundaries less negative insulation index are enriched in RNA poly- andenrichedwithinICONS(Additionalfile1: Supplemen- merase II ChIP-seq (Fig. 3j and Additional file 1: Supple- tal Figure S6). The result of an ATAC-seq experiment mental Figure S7). According to these results, the (assay for transposase-accessible chromatin using sequen- frequency of interaction between genes and the sur- cing) [49, 50] revealed that chromatin is highly accessible rounding regions correlates with both their histone at the ICONS boundaries and less within ICONS (Add- marks and their expression level, suggesting that local itional file 1: Supplemental Figure S6). Taken together, architecture is influenced by epigenetic mechanisms act- these features indicate that the wheat genome presents dis- ing in transcriptional regulation. tinctive folding domains (ICONS) rather than canonical topologically associating domains (TADs). Chromatin loops define local-scale functional units of To analyze more in-depth the relationship between wheat genome architecture histone modifications, chromatin accessibility, and chro- The visualization of the interaction matrix revealed the matin interaction, we took into account the average widespread presence of interaction hotspots between insulation index over protein-coding and non-coding genomic bins containing genes, a strong indication of genes by comparing the degree of spatial co-localization gene-to-gene loops (GGLs) (Fig. 2b, d, Fig. 4a). To con- through peaks of each histone mark or ATAC-seq firm the presence of such structures, using the software (Fig. 3a–c, Additional file 1: Supplemental Figure S6 and HOMER (see the “Materials and methods” section), we S7). We discovered that the insulation index is more identified 293,044 loops in shoots (see the “Materials Concia et al. Genome Biology (2020) 21:104 Page 6 of 20

Fig. 2 The wheat genome displays ICONS. a TADtool analysis of a 50-Mb region of chromosome 4D (Chr4D:325000000-375000000). Triangle heatmap of Hi-C interaction frequency (top panel), positions of the identified TADs indicated by black bars (middle panel), and plot of the insulation index (bottom panel). b Zoom-in of the 5-Mb region of Chr4D:345000000-350000000 in a, showing a triangle heatmap of Hi-C interaction frequency (top panel) and insulation index (bottom panel). c Position of genes along the 5-Mb region shown in b. d 2D heatmap showing the interaction frequency in a region (chr3B-1000000–5000000) of the Hi-C map. The dashed circles highlight the hotspots of interaction. Genes are represented by black bars. e Level of insulation index on genes. Median insulation index over genes (red line) and random genomic intervals (blue line) (top panel). Heatmap of insulation index over genes (bottom panel) Concia et al. Genome Biology (2020) 21:104 Page 7 of 20

Fig. 3 Relationship between insulation index, histone marks, and gene expression. a–c Plots displaying the insulation index over genes marked (red line) or not marked (blue line) by H3K9ac (a), H3K36me3 (b), and H3K27me3 (c). d–f Metaplots showing the normalized ChIP-seq read density of H3K9ac (d), H3K36me3 (e), and H3K27me3 (f) median enrichment over the genes categorized from low (first quantile, blue line) to high insulation index (tenth quantile, red line). g Plot displaying the insulation index over genes bound to RNAPII (red line) or not bound (blue line). h Plot displaying the insulation index over genes expressed (red line) or not expressed (blue line). i Plot displaying the insulation index over expressed genes categorized from low expression level (first quartile yellow line) to high expression level (fourth quartile brown line). j Metaplots showing the normalized RNAPII ChIP-seq read density over genes categorized from low (first quantile, blue line) to high (tenth quantile, red line) insulation index. Normalized ChIP-seq read densities along the gene and 2-kb region flanking the TSS or the TES are shown and methods” section), of which 35.6% involved at least ChIP-seq peaks of H3K9ac, H3K27me3, H3K36me3, and one gene and 13.4% between two gene-containing bins. RNA polymerase II on the genes of interest, plus their Overall, 47,763 genes (28.8% of all genes) were associ- transcriptional status (expressed or not) and expression ated with one or more GGLs. The presence of gene-to- level in TPM (Additional file 2: Table S1). We counted gene loops (GGLs) was independently confirmed with a the frequencies of each feature on genes involved in 3C-qPCR assay for four gene pairs spanning a distance GGLs and crossed the frequencies of the first and the between 200 and 400 kb (Additional file 1: Supplemental second gene (see the “Materials and methods” section). Figure S8) [51–53]. For all combinations of features, we then generated a Aiming to understand the role of GGLs in the regula- 2 × 2 contingency table on which we calculated the odds tion of gene expression, we examined the epigenetic ratios (OR) and effect size (Cramer’s V), two statistics marks and transcriptional status of gene pairs associated that express the strength of the association between two with GGLs (Fig. 4a–e). To facilitate the analysis, we categorical variables [54] (Additional file 3: Table S2). compiled a table integrating GGL position, size, strength, A heatmap with the values of log2(odds ratio) is shown and presence or absence of functional marks such as in Fig. 4b. A positive value of log2(odds ratio) means a Concia et al. Genome Biology (2020) 21:104 Page 8 of 20

Fig. 4 Chromatin loops define local-scale functional units of wheat genome architecture. a Integration of Hi-C and ChIP-seq data. A 2D heatmap with the interaction frequency in region chr3B-334500000-337000000 is presented. Black bars represent genes. Dashed circles represent the chromatin loop interactions. b Heatmap presenting the logarithm of odds ratios of all combinations of features of interacting genes (see the

“Results” section). Positive log2(odds ratio) indicates enrichment and negative indicates depletion. c Heatmap presenting the gene expression levels of gene-to-gene loop (GGL) gene pairs. Within each gene pair, each gene was classified into a quartile of expression, and then the number

of gene pairs with each combination of quartiles was counted. d Scatterplot of log2(shoot/root fold change) for pairs of interacting genes associated with a GGL conserved in shoots and roots. Loops (significant interactions) were identified from Hi-C data. e Scatterplots of log2(shoot/ root fold change) for two subsets of loops: weak (first quartile of Z-score) and strong (last quartile of Z-score) positive association between the two marks: when one genes with active marks (H3K36me3, H3K9ac, and RNA mark is present on the first gene, the other marks tend polymerase II) (“active loops”) and the second between to be present on the second gene. non-expressed genes bearing the repressive mark This approach revealed that GGLs occur more frequently H3K27me3 (“repressive loops”). between genes with concordant transcriptional status, We then asked whether the presence of GGLs is pre- whether transcribed or not [log2(odds ratio) = 1.413, Cra- dictive of co-expression between the two genes associ- mer’s V = 0.23]. The marks collectively define two main ated by the loop. To address the question, we considered classes of GGLs: the first occurring between expressed a subset of GGLs involving only expressed genes (TPM > Concia et al. Genome Biology (2020) 21:104 Page 9 of 20

0) and split the partner genes into two series of quartiles expressed, the two genes involved in a GGL tend to be according to their TPM. The count of genes falling into coregulated. While invariant to the linear distance be- each combination of quartiles was used to generate a 2D tween partner genes, the coregulation appears to be heatmap that unveiled a strong preference for the estab- more stringent for stronger GGLs. These findings repre- lishment of GGLs between genes with similar expression sent strong evidence that the local-scale chromatin levels (Fig. 4c), implying a correlation between transcrip- architecture plays a role in transcriptional coregulation. tionally active gene pairs and their 3D spatial proximity. We subsequently wanted to probe whether gene pairs RNA polymerase Il-associated loops organize active connected in GGLs show similar changes in the expres- chromatin into transcription factories sion levels between shoots and roots. Using publicly The results of both Hi-C and RNA polymerase II ChIP- available gene expression data [55], we identified 21,796 seq (Fig. 4a–e and Fig. 3g, j) raise the hypothesis that the differentially expressed (DE) genes between wheat seed- wheat chromatin is organized around transcription fac- ling shoots and roots (p value < 0.01) (see the “Materials tories. We performed immunostaining experiments and methods” section). using an anti-RNA polymerase II antibody and observed We performed a Hi-C experiment on roots, identified RNA polymerase II foci as expected (Fig. 5a). We sought loops with HOMER, and selected those conserved in to further validate the presence of transcription factories both organs and containing DE genes (1659 DE-GGLs). using the recently developed Hi-ChIP protocol, a We then applied a linear regression to the 1659 gene method for sensitive and efficient analysis of protein- pairs using the log2(fold change shoot/root) of the first centric chromosome conformation, with an anti-RNA gene as the predictor and the log2(fold change shoot/ polymerase II antibody [56]. The density of short reads root) of the second gene as the response (Fig. 4d). The aligned on the same fragment, commonly referred to as regression was highly significant (F test p value = 6.6e “dangling ends,” showed the same strong enrichment −98), and the slope coefficient was positive (0.483), over gene bodies as our RNA polymerase II ChIP-seq showing that DE genes in physical contact tend to have datasets (Additional file 1: Supplemental Figure S12). similar changes in the expression levels between shoots The Hi-ChIP datasets confirmed the presence of gen- and roots, a behavior consistent with the existence of omic territories revealed by the Hi-C experiments (Add- coregulation mechanisms. itional file 1: Supplemental Figure S13). In light of the GGL size distribution (Additional file 1: To take full advantage of the higher resolution pro- Supplemental Figure S9), we tested whether the linear vided by Hi-ChIP, we used HOMER to identify intra- distance between interacting genes had an influence on chromosomal RNA polymerase II-associated loops their coregulation and found no differences between (RALs) (Fig. 5b–d and Fig. 6a–c). We found 27,886 and short- and long-range interactions (Additional file 1: 15,887 intrachromosomal RALs in shoots and roots, Supplemental Figure S10). respectively. In addition, we examined whether the correlation be- We explored the relationship between RALs and dif- tween the log2(fold change) of the two partner genes ferential gene expression with a regression analysis like could be predicted by the “loop strength” of the DE- that carried out for DE-GGLs (see above), obtaining bet- GGLs, as measured by the Z value assigned by HOMER ter fitting (R2 = 0.61 versus 0.23) more positive 488 slope based on the ratio of observed to expected reads (see the (0.77 versus 0.48) (Fig. 5d). “Materials and methods” section). Similar to the analysis These indicate that the coregulation between genes in- performed for the expression level in TPM (see above), volved in RNA polymerase II-associated loops is stron- we split the DE-GGLs into two series of quartiles ac- ger than between genes associated to loops independent cording to the strength in shoots and roots and repeated from RNA polymerase II, pointing to a role of chromatin the analysis for each group (defined by a combination of loops in the localized action of RNA polymerase II. quartiles) (Fig. 4e and Additional file 1: Supplemental To determine whether RNAPII-associated interaction Figure S11). The p value and slope of the regression ob- coregulation extends to genes located on different chro- served for DE-GGLs were strongly driven by the stron- mosomes, we identified 51,112 and 72,762 interchromo- gest group in shoots and roots (Fig. 4e). The slope of the somal interactions in shoots and roots, respectively regression calculated using only this subset is 0.90, an al- (Fig. 6). We then tested whether interchromosomal most perfect correlation (slope equal to 1). Overall, we RNAPII-associated interaction, like GGLs, is predictive observed that GGLs occurred predominantly between of co-expression by partitioning them in quantiles ac- genes that share similar epigenetic marks and expression cording to the expression level of the gene pairs involved levels. Based on the co-occurrence of these features, we (see above). The gene counts in each combination of could distinguish two major types of GGLs: active loops quantiles (Fig. 6b) confirmed that genes establishing in- and repressive loops. Furthermore, when differentially terchromosomal interactions associated to RNAPII tend Concia et al. Genome Biology (2020) 21:104 Page 10 of 20

Fig. 5 Wheat chromatin is organized around transcription factories. a Immunofluorescence detection of RNAPII (purple) in the isolated nucleus. b Plot of the chromosome 3B RNAPII Hi-ChIP data. The color is associated with a distance range. c Integration of RNAPII Hi-ChIP, RNAPII ChIP-seq, and RNA-seq data. A 2D heatmap with the interaction frequency in region chr4D-40800000-41300000 is presented. Green bars represent genes. d

Scatterplot of log2(shoot/root fold change) for pairs of interacting genes associated through intrachromosomal RNAPII-associated loops (RALs). Loops (significant interactions) were identified by an RNAPII Hi-ChIP experiment (see the “Results” section). Only the interchromosomal loops conserved between shoots and roots were used in the analysis to have very similar expression levels. We subsequently that 50% have 4 or more partners and 11% have 10 or asked whether differentially expressed genes interacting more partners (Fig. 6d). through these interchromosomal contacts displayed the Taken together, (i) the presence of both intra- and in- same behavior in both shoots and roots. We identified terchromosomal RNAPII-associated contacts, (ii) the 3947 differentially expressed genes associated with con- tendency of these contacts to involve multiple genes, served interactions in shoots and roots and repeated the and (iii) the effects of interaction on gene expression linear regression analysis (Fig. 6c). The outcome con- and coregulation led us to propose a model for T. aesti- firmed a positive correlation between log2(fold change) vum in which RNA polymerase II organizes chromatin of the gene pairs (regression slope of 0.60), consistent topology at a local scale, creating transcription factories with that observed for RNAPII-associated loops. (Fig. 6e). We checked the effect of the RNAPII-associated inter- action strength on the coregulation of interacting gene Discussion pairs by splitting them into two series of quartiles based The study of the chromatin architecture of metazoan ge- on their strength in shoots and roots (see above) and nomes has uncovered a complex organization combining found that the interaction strength does not have a major multiple structural elements, such as chromosome terri- effect (Additional file 1: Supplemental Figure S14). tories, compartments, TADs, and loops [35, 44, 57, 58]. Finally, we counted the number of partners of each Analyses of several plant species, including Arabidopsis, gene involved in RNAPII-associated contacts, finding rice, barley, maize, tomato, sorghum, foxtail millet, and Concia et al. Genome Biology (2020) 21:104 Page 11 of 20

Fig. 6 (See legend on next page.) Concia et al. Genome Biology (2020) 21:104 Page 12 of 20

(See figure on previous page.) Fig. 6 Gene interchromosomal interaction through transcription factories is associated with coregulation. a Integration of RNAPII Hi-ChIP, RNAPII ChIP-seq, and RNA-seq data. Examples of 2D heatmap with the frequency of interchromosomal interactions are presented. Green bars represent genes. b Heatmap presenting the gene expression level of gene pairs involved in RNAPII-associated interactions. Within each gene pair, each gene was classified into a decile of expression, and then the number of gene pairs with each combination of deciles was counted. c Scatterplot

of log2(shoot/root fold change) for pairs of interacting genes associated through interchromosomal RNAPII-associated contacts. Interactions were identified by an RNAPII Hi-ChIP experiment (see the “Results” section). Only interchromosomal contacts conserved in shoots and roots were used in the analysis. d Histogram showing the distribution of genes interacting through RNAPII-associated contacts by numbers of partner genes. e Model of a transcription factory. Each factory could contain several RNA polymerase II molecules. These transcription factories could include many factors involved in transcription, such as coactivators, chromatin remodelers, transcription factors, histone modification enzymes, RNPs, RNA helicases, and splicing and processing factors. Multiple genes can be processed by the same factory (two are shown) cotton, have offered a different scenario that is still Another important finding is the differential frequency poorly understood [14, 15, 59, 60] (see introduction). In of interaction between specific subgenomes: A and B this context, the generation and integration of Hi-C, interact more frequently than do A and D or B and D ChIP-seq, Hi-ChIP, and RNA-seq data presented in this (Fig. 1b, Additional file 1: Supplemental Figure S1). It is study offer a robust insight into the genome topology of tempting to speculate that this imbalance could be related hexaploid wheat, an important crop whose exploration to the evolutionary history of modern wheat: the has traditionally relied on a cytological approach [61]. hybridization event at the origin of the tetraploid T. turgi- The first analysis of the 3D genome organization of a dum, containing the A and B subgenomes, occurred sev- polyploid plant was conducted in cotton and focused mainly eral hundred thousand years before the hybridization on the evolutionary consequences of polyploidization at a between T. turgidum and A. tauschii (carrying the D gen- local scale [60]. In this study, we show with two genome- ome). As a consequence, the A and B subgenomes have wide complementary techniques, such as GISH and Hi-C, co-existed under the severe spatial constraint dictated by that the chromatin of hexaploid wheat is not uniformly dis- the limited nuclear space, especially considering the large tributed across the nucleus but rather occupies the genome size of T. turgidum (12 Gbp) [64]), for a longer subgenome-specific nuclear compartments. This finding is time than either A or B have co-existed with the recently consistent with previous cytological observations indicating added D subgenome. Our observation opens the possibil- that chromosomes of the same subgenome tend to be phys- ity that the A and B subgenomes have established a cer- ically closer than chromosomes of different subgenomes, re- tain degree of physical interaction before subgenome D gardless of whether they are homeologs [61]. Consequently, was introduced, implying that the chromatin architecture we propose that genome territories are the primary level of was conserved during the process of allopolyploidization. chromatin spatial organization in this species. Accordingly, we propose that the genome territories in The retention of multiple homeologs following poly- hexaploid wheat were inherited from its ancestors and ploidy promotes the subfunctionalization of genes, in- maintained after each hybridization event; however, it can- creasing the capacity to adapt to different environments. not be excluded that gradual adjustments occurred during The resulting phenotypic variability from which favor- the course of the following evolution. able traits can be selected could have facilitated the do- In hexaploid wheat and several other species, such as mestication of modern wheat. barley, rye, oat, Drosophila melanogaster, and Saccharo- Despite a considerable degree of synteny among sub- myces cerevisiae, chromosomes are arranged in a Rabl genomes due to the phylogenetic relatedness of the three configuration, where telomeres and centromeres are lo- progenitors [30], wheat chromosomes form meiotic pairs calized at opposite sides of the nucleus [65–68]. Consist- only with the respective homolog and never with their ently, Hi-C interaction maps generated in this study homeologs [62]. This mechanism prevents the independ- show that chromosomes are V-folded and oriented in ent segregation of chromosomes from different genomes the same direction, as apparent from the antidiagonal and results in disomic inheritance. While the Ph1 locus pattern in the interchromosomal maps (Fig. 1b, f). In has been identified as the major regulator of disomic in- addition, we observed a clear polarization of constitutive heritance and molecularly as a duplicated ZIP4 gene heterochromatin (marked by H3K27me1 and H3K9me2) [63], little is known about the underlying mechanisms and facultative heterochromatin (marked by H3K27me3) which facilitate only homolog pairing rather than home- within the nucleus, a combined effect of the Rabl config- olog pairing during the telomere bouquet stage at the uration and the previously reported subtelomeric enrich- start of meiosis. The establishment of genome territories ment in facultative heterochromatin [34]. On a global may be a mechanism that favors the pairing of homologs scale, the polarization of constitutive and facultative het- versus homeologs by creating territorial “boundaries” be- erochromatin could respond to the necessity to subcom- tween the different subgenomes. partmentalize genomic regions with different levels of Concia et al. Genome Biology (2020) 21:104 Page 13 of 20

transcriptional activity and epigenetic marks to localize containing several coregulated genes and possibly en- the enzymatic activities of the transcriptional machinery hancers. Such “transcription factories” have been ob- and epigenetic regulators in a delimited nuclear volume. served previously as subnuclear foci of active RNAPII In the context of a tightly packed structure imposed and regulatory factors, in which several Pol II molecules by the reduced volume of the nucleus, the architecture operate simultaneously [71–73]. We independently con- of chromatin provides a physical framework for the dy- firmed the existence of transcription factories in wheat. namic and coordinated expression of genes in response One possible advantage of this three-dimensional spatial to internal and external cues. In metazoans, chromatin network is that it facilitates the deployment of transcrip- is organized in TADs, regions characterized by preferen- tional regulatory factors over a restricted number of tial contacts between the loci inside the same TAD and “focal points,” resulting in a better coordination of gene lack of interaction with the loci in neighboring TADs. expression inside the nucleus. This could represent an These structures are known to facilitate the establish- evolutionarily selected trait that counteracts the disad- ment of enhancer-promoter contacts and to mediate the vantage of large genome size. regulation of gene expression [69, 70]. Here, we found a widespread presence of TAD-like domains in wheat, Conclusion similar to what has been previously reported in maize, In summary, our findings allow us to propose a model tomato, sorghum, and foxtail [14, 15, 59, 60]. However, in which chromatin architecture in hexaploid wheat is the wheat TAD-like formations displayed plant-specific hierarchically organized around four main topological features, such as (i) depletion of genes within the do- features: (i) genome territories, (ii) constitutive and mains; (ii) enrichment in genes, especially active ones, facultative heterochromatin polarization, (iii) transcrip- and DNA methylation at their borders; and (iii) propen- tion factories, and (iv) ICONS. This sophisticated sity to form loops between genes located at the border. organization is required to balance the size and com- To emphasize these peculiar properties, we hence named plexity of the hexaploid wheat genome. In a more these domains ICONS, for intergenic condensed spacer. general perspective, this study highlights that three- We hypothesize that ICONS promote the formation of dimensional conformation at multiple scales is a key fac- gene-to-gene loops by condensing the intergenic regions, tor governing gene transcription. thus bringing into close proximity the genes located at the ICONS borders. This mechanism could also increase Materials and methods the accessibility of genes to the replication machinery Plant material and other enzymatic activities, including epigenetic regu- The wheat cultivar Chinese Spring and the rapeseed cul- lators, and bury within ICONS the sequences that tivar Westar were used for this study. Seeds were surface should not be accessible, such as repressed genes. In- sterilized with bayrochlore and ethanol, sown on MS deed, we detected the presence of genes carrying agar plates, placed at 4 °C in the dark for 2 days, and H3K27me3 within ICONS but a strong depletion of ac- then grown in a growth chamber for 14 days (long-day tive genes bearing H3K9ac and H3K36me3 (Fig. 3a). In a conditions, 18 °C). large genome such as that of hexaploid wheat, reducing the physical distance between genes could be critical for Genomic in situ hybridization of root meristematic cells improving the efficiency of nuclear metabolism [5]. Fur- The preparation of T. aestivum root metaphase cells and thermore, we observed that genes forming loops tend to subsequent genomic in situ hybridization (GISH) was (i) share the same epigenetic marks; (ii) share epigenetic carried out as described previously [63, 74]. The prepar- marks with the same physiological significance, i.e., ei- ation of root interphase cells was performed as for meta- ther active or repressive (Fig. 5b); (iii) be expressed at phase cells but omitting the nitrous oxide treatment. similar levels (Fig. 5c); and (iv) behave concordantly Triticum urartu and Aegilops tauschii were used as when differentially expressed in two organs (Fig. 5d). probes to label wheat A and wheat D genomes, respect- This evidence strongly suggests that loops bring into ively. Telomere repeat sequence (TRS) probe was ampli- physical proximity genes that have to be targeted by the fied by PCR as described previously [75] and labeled same epigenetic modifiers and, in the case of active with tetramethyl-rhodamine-5-dUTP (Sigma, St. Louis, marks, by the transcription machinery. The presence of MO, USA) by nick translation as described previously interchromosomal loops between coregulated genes, as [76]. T. urartu and Ae. tauschii genomic DNA were la- uncovered in this study, indicates that the same mechan- beled with biotin-16-dUTP and digoxigenin-11-dUTP, ism also could be functioning in trans at very long dis- using the Biotin-Nick Translation Mix and the DIG- tances [36]. Nick Translation Mix, respectively (Sigma) according to In addition, single genes appear to take part in mul- the manufacturer’s instructions. Biotin-labeled probes tiple loops (Fig. 6b), producing clusters of loops were detected with Streptavidin-Cy5 (Thermo Fisher Concia et al. Genome Biology (2020) 21:104 Page 14 of 20

Scientific, Waltham, MA, USA). Digoxigenin-labeled Assay for transposase-accessible chromatin with high- probes were detected with anti-digoxigenin-fluorescein throughput sequencing Fab fragments (Sigma). One hundred micrograms of 14-day-old seedlings was Images were acquired using a Leica DM5500B micro- ground, and the nuclei were isolated with 4 °C using nuclei scope equipped with a Hamamatsu ORCA-FLASH4.0 cam- isolation buffer (0.25 M sucrose, 10 mM Tris-HCl, 10 era and controlled by Leica LAS X software v2.0. Images mM MgCL2, 1% Triton, 5 mM beta-mercaptoethanol) were processed using Fiji (an implementation of ImageJ, a containing proteinase inhibitor cocktail (Roche) and fil- public domain program by W. Rasband available from teredat63µm.Thenucleiwereresuspendedin1×TD http://rsb.info.nih.gov/ii/) and Adobe Photoshop CS4 buffer (Illumina FC-121-1030) and 2.5 µl of Tn5 trans- (Adobe Systems Incorporated, USA) version 11.0 x 64. posase (Illumina FC-121-1030) were added. Transpos- ition reaction was performed at 37 °C for 30 min, and Immunofluorescence DNA was purified using a Qiagen MinElute Kit. DNA Seedlings were fixed in PFA 4% in PHEM (PIPES 60 mM, libraries were amplified for a total of 8 cycles as de- HEPES 25 mM, EGTA 10 mM, MgCl2 2mMpH6.9)under scribed by Buenrostro et al. [49] and Jegu et al. [50]. a vacuum for 20 min. Seedlings were washed for 5 min in PHEM and 5 min in PBS pH 6.9 and chopped on a petri RNA-seq assay dish in PBS supplemented with 0.1% triton (w/v). The mix- Total RNA were extracted from 180 mg of shoots of 12- ture was filtered (50 µm) and centrifugated 10 min at day-old seedlings with the NucleoSpin RNA kit 2000g. The supernatant was carefully removed, and the pel- (Macherey-Nagel), according to the manufacturer’s in- let was washed once with PBS, gently resuspended in 20 µl structions. RNA-seq libraries were prepared using 2 µg PBS, and a drop was placed on a poly-lysine slide and air- of total RNA with the NEBNext® Poly(A) mRNA Mag- dried. Slides were rehydrated with PBS and permeabilized 2 netic Isolation Module and NEBNext® Ultra™ II Direc- times by incubating for 10 min in PBST (PBS, 0.1% tional RNA Library Prep Kit for Illumina (New England Tween20 v/v). The slides were placed in a moist chamber Biolabs), according to manufacturer’s recommendations. and incubated overnight at 4 °C with primary antibody The quality of the libraries was assessed with Agilent anti-H3K9me2 (Millipore 07-441), anti-H3K27me3 (Milli- 2100 Bioanalyzer (Agilent). pore, ref. 07-449), anti-H3K27me1 (Abcam ab195492), anti-H3K36me3 (Abcam, ab9050), or anti-polymerase II RNAPII ChIP assay (Active Motif, 39097) in PBST supplemented with BSA (3% RNAPII ChIP assay was performed with anti-RNA poly- w/v). The slides were washed 5× for 10 min in PBST (at merase II CTD repeat YSPTSPS antibody (Abcam RT) and incubated 1 h at RT in the dark with the secondary ab26721) using the same protocol as in IWGSC, 2018 [34]. antibodies Alexa Fluor 594 goat anti-rabbit (A11037 Invi- trogen), Alexa Fluor 488 goat anti-mouse (A11001 Invitro- Hi-ChIP assay gen), Alexa Fluor 488 goat anti-rabbit (A11034 Invitrogen), For Hi-ChIP experiments, the nuclei were isolated from or Alexa Fluor 594 goat anti-mouse (A11032 Invitrogen), the shoots and roots using the same procedure as in the diluted (1/400 v/v) in PBST, 3% BSA. The slides were in situ Hi-C experiments, and the Hi-ChIP protocol washed 5× for 10 min in PBST and mounted with a drop of from Mumbach et al. 2016 was then applied using the Vectashield with DAPI and were directly imaged on an up- DpnII restriction enzyme (New England Biolabs) and the right microscope (Zeiss Microsystems). anti-polymerase II antibody (Active Motif, 39097). The quality of the libraries was assessed with Agilent 2100 In situ Hi-C assay Bioanalyzer (Agilent), and the libraries were subjected to Fourteen-day-old shoots or roots were used for in situ 2 × 75 bp paired-end high-throughput sequencing by Hi-C. The experiment was carried out according to the NextSeq 500 (Illumina). protocol published by Liu et al. [74] using DpnII enzyme (New England Biolabs) with minor modifications con- Chromosome conformation capture cerning library preparation for which we used NEBNext One gram of 14-day-old shoots was cross-linked in 1% UltraII DNA library preparation kit (New England Bio- (v/v) formaldehyde at room temperature for 20 min. labs). For library amplification, nine PCR cycles were Cross-linked plant material was ground, and the nuclei performed and Hi-C libraries were purified with SPRI were isolated and treated with SDS 0.5% at 62 °C for 5 magnetic beads (Beckman Coulter) and eluted in 20 µl min. SDS was sequestered with 2% Triton X-100. Diges- of nuclease-free water. The quality of the libraries was tions were performed overnight at 37 °C using 150 U of assessed with Agilent 2100 Bioanalyzer (Agilent), and DpnII enzyme (New England Biolabs). Restriction en- the libraries were subjected to 2 × 100 bp paired-end zymes were inactivated by the addition of 1.6% SDS and high-throughput sequencing by HiSeq 4000 (Illumina). incubation at 62 °C for 20 min. SDS was sequestered Concia et al. Genome Biology (2020) 21:104 Page 15 of 20

with 1% Triton X-100. DNA was ligated by incubation Analysis of Hi-C data at 22 °C for 5 h using 4000 U of T4 DNA ligase (Fermen- Raw FASTQ files were preprocessed with Trimmomatic tas). Reverse crosslinking was performed by overnight v0.36 [81] to remove Illumina sequencing adapters. 5′ and treatment at 65 °C. DNA was recovered after Proteinase 3′ endswithaqualityscorebelow5(Phred+33)were K treatment by phenol/chloroform extraction and etha- trimmed, and reads shorter than 20 bp after trimming were nol precipitation. Relative interaction frequencies were discarded. The command line used is “trimmomatic-0.36 PE calculated by quantitative real-time PCR using 15 ng of -phred33 -validatePairs ILLUMINACLIP:TruSeq3- SE.fa:2: DNA. A region uncut by DpnII was used to normalize 30:10 LEADING:5 TRAILING:5 MINLEN:20”. Trimmed the amount of DNA. FASTQ files were processed with the pipeline HiC-Pro v2.9.0 [82] with minor modifications. Briefly, reads were Primer list for 3Cassays aligned to iwgsc_refseq1.0 reference genome assembly Primer name Sequence (available at https://urgi.versailles.inra.fr/download/iwgsc/ TraesCS1B02G226200 AAATTGGCTGCCGATTGGTTCG IWGSC_RefSeq_Assemblies/v1.0/iwgsc_refseqv1.0_all_chro- TraesCS1B02G226300 CTGGGCTAAAAGCCTCACACGTT mosomes.zip. using bowtie2 v2.3.3 [83] with default settings, except for the parameter “--score-min L, -0.6, -0.8”.Forward TraesCS4D02G254100 TGGCGATAATGCAACATTGCAGAA and reverse mapped reads were paired and assigned to TraesCS4D02G254200 ATTTACCTGCAAGGAGAGTTCCCC DpnII restriction fragments. Invalid ligation products (such TraesCS7A02G231500 AAACAAGCTCTTGAGGTGACATCG as dangling ends, fragments ligated on themselves, and liga- TraesCS7A02G231600 CCCAGTTGGTTATTCGGCAGT tions of juxtaposed fragments) were discarded. TraesCS1D02G176100 GACTAGCACCCCCAAGCATTCC A summary of Hi-C mapping statistics is provided TraesCS1D02G176200 AGCTTGGATGCCTGCTTACGG in Additional file 4, Table S3. Valid pairs were used to produce raw interaction matrixes at various resolu- tions. Normalized matrixes were produced by iterative correction using the “ice” utility provided as part of Gene annotation HiC-Pro [84]. Pre-computed .hic files were generated All the analyses involving genes were carried out with the software Juicer Tools and visualized with the using a custom annotation derived from iwgsc_ tool Juicebox v1.9.8 [85]. Interaction frequency box- refseqv1.1 (available at https://urgi.versailles.inra.fr/ plots (Fig. 1c) where created using the packages HiTC download/iwgsc/IWGSC_RefSeq_Annotations/v1.1/ v1.26.0 [86] and ggplot2 v3.1.0 [87] in R v3.4.3 statis- iwgsc_refseqv1.1_genes_2017July06.zip.Briefly,we tical environment [88]. Single-chromosome Hi-C combined high-confidence genes (file IWGSC_v1.1_ interaction heatmaps and H3K27me3 read density HC_20170706.gff3) and a subset of low-confidence tracks (Fig. 1fandAdditionalfile1: Supplemental genes (file IWGSC_v1.1_LC_20170706.gff3) including Figure S4) were produced with HiCPlotter v0.8.1 at only genes overlapping with ChIp-seq peaks of at 1-Mb resolution [89]. least one histone mark (H3K9ac, H3K36me3, H3K27me3) (see below). Genes annotated on “un- anchored scaffolds” (ChrUn) were removed. The final Insulation index and ICONS annotation (file IWGSC_v1.1.noChrUn.gene. HC+LC_ We identified ICONs with the software TADtool over_peaks.gff3) contained 165,277 genes of which v0.77 [90]usingthe“insulation index” method [48]. 105,088 were high-confidence and 60,189 were low- The “insulation index” of a genomic region is defined confidence. as the normalized average frequency of interaction with the neighboring sequences within a pre-defined distance. Regions with an insulation index above a Identification of ncRNAs pre-established threshold are identified as folding Transcript classes were predicted by assessing the domains. coding potential of each gene with Coding Potential In this study, we used ice-normalized matrixes at 25 kb Calculator (CPC) [77] and its improved version CPC2 of resolution with a scoring window of 100 kb and a call- [78], both with default parameters and by comparison ing threshold of 0.4. The histogram of ICONS distribution with the RFAM database 13.0 [79] using cmscan of the by size (Additional file 1: Supplemental Figure S5)was suite infernal v1.1.1 [80]. In case of a hit corresponding plotted with the package ggplot2 v3.1.0 [87] and R. For to structural RNA (tRNA, rRNA, snRNA, or snoRNA), Fig. 3d-f, j, the genome was partitioned in 10 deciles of the transcript was classified as “structural”; then, if CPC 25-kb bins according to their insulation index using the and CPC2 both predicted the transcript as non-coding, function mutate of the package dplyr v0.7.8 in R. Groups we classified it as “non-coding,” otherwise as “coding.” of genes lying in each genomic partition of insulation Concia et al. Genome Biology (2020) 21:104 Page 16 of 20

index (see above) were identified using the bedtools expressed and split in quantiles of expression with R v2.27.1 [91] intersect with the command “for i in {1..10}; using the median value of the three biological replicates. do bedtools intersect -sorted -wb -F 0.5 -a DECILE.$i.bed -b IWGSC_v1. 1.noChrUn.gene. HC+LC_over_peaks.gff3 Differential expression analysis > IWGSC_v1.1 .DECILE.$i.gff3; done”. RNA-seq data relative to seedling shoot and root were obtained from the dataset published by Oono et al. [55]. Paired short reads were preprocessed like ChIP-seq data Analyses of ChIP data (see above) to remove Illumina sequencing adapters. We used the histone marks ChIP-seq datasets from Trimmed paired-end reads were aligned against the IWGSC, 2018 (NCBI SRA accession SRP126222). His- reference genome iwgsc_refseq1.0 using STAR_2.5.4b tone marks and RNAPII ChIP were pre-processed with [96] with the command “STAR --genomeDir /STAR_ Trimmomatic v0.36 [81] to remove Illumina sequencing iwgsc1.0 --sjdbGTFfile /IWGSC_v1.1_HC+LC_ adapters. 5′ and 3′ ends with a quality score below 5 20170706.gtf --sjdbOverhang 100 --outSAMstrandField (Phred+33) were trimmed, and reads shorter than 20 bp intronMotif --quantMode GeneCounts --outSAMtype after trimming were discarded. The command used is BAM SortedByCoordinate -- outFilterIntronMotifs “trimmomatic-0.36.jar PE -phred33 - validatePairs ILLU- RemoveNoncanonical”. Genes differentially expressed (p MINACLIP:TruSeq3-PE.fa:2:30:10 LEADING:5 TRAIL- value < 0.01) were identified from raw read counts with ING:5 MINLEN:20”. Trimmed paired ends reads were DESeq2 v1.22.1 in R. aligned against the reference genome iwgsc_refseq1.0 using bowtie2 v2.3.3 with the setting “--very-sensitive”. Alignments with MAPQ < 10 and duplicated reads were Generation of meta-plots discarded with sambamba v0.6.8. Meta-gene profiles of insulation index and the meta- Normalized short-read density was calculated using the ICONs profiles of epigenetic and genomic features (his- utility bamCoverage from the bioinformatics suite deep- tone marks, RNAPII, ATAC-seq, DNA methylation, TE tools2 3.1.0 [92] with the command line “bamCoverage -p coverage) were generated with deepTools2 v3.1.0 [92] 4 -bs 100 --extendReads -of bigwig --normalizeTo1x computeMatrix and plotted with plotProfile. Shuffled 14547261565 -b INPUT.bam -o OUTPUT.bigwig”.Peaks ICONS and genes were generated with bedtools shuffle. of Chip-seq read density were identified using the software Genes overlapping or non-overlapping with ChIP peaks MACS2 [93] with the command “macs2 callpeak -f BAM of each histone mark and RNAPII (Fig. 3a–c, g; Additional --nomodel --to-large -p 0.01 -broad -g 17e9 --bw 300” for file 1: Supplemental Figure S1 I and S7 B-C and F-G) were histone marks and “macs2 callpeak -f BAMPE --nomodel identified using the tool intersect of the bioinformatics –q 0.001 --broad-cutoff 0.01 -g 17e9 --bw 300” for RNA- suite bedtools v2.27.1 with the command: “bedtools inter- PII. Histone mark peaks with a fold change lower than 3 sect -sorted -wa -a IWGSC_v1.1.noChrUn.gene. HC+LC_ were discarded. over_peaks.gff3 -b PEAK_FILE | sort -k4,4 -u > OVER- LAPPING_GENES.GFF3” and “bedtools intersect - sorted Analyses of ATAC-seq data -v -wa -a IWGSC_v1.1.noChrUn.gene. HC+LC_over_ ATAC-seq paired ends reads were aligned against the peaks.gff3 -b PEAK_FILE | sort -k4,4 -u > NONOVER- ” reference genome iwgsc_refseq1.0 using bowtie2 LAPPING_GENES.GFF3 . For ATAC-seq, only genes v2.3.3 with the setting “--very-sensitive” [83]. containing a peak within 500-bp upstream of TSS were Alignments with MAPQ < 10 were filtered with classified as overlapping. sambamba v0.6.8. Short-read density was calculated DNA methylation data were obtained from Ramirez- using bamCoverage (see above) with the command Gonzalez et al. [36] and are available in the NCBI SRA re- “bamCoverage -bs 10 -- centerReads -b INPUT.BAM pository under accession code SRP133674. Genomic -of bigwig -o OUTPUT.bigwig”. Peaks of the read coverage of transposable elements was calculated using density were identified with the command “macs2 the annotation available at https://urgi.versailles.inra.fr/ callpeak -f BAMPE --nomodel -q 0.001 --broad- cut- download/iwgsc/IWGSC_RefSeq_Annotations/v1.0/iwgsc off 0.01 -g 17e9 --bw 300”. _refseqv1.0_TransposableElements_2017Mar13.gff3.zip.

Identification and analysis of genomic interactions Analyses of RNA-seq data Gene-to-gene loops (GGLs), RNAPII-associated loops Transcript abundance in shoots was quantified directly (RALs), and interchromosomal interactions were identi- from the fastq files with kallisto [94] and converted to fied using HOMER v4.10 [93] at a resolution of 20 kb, gene-level counts (TPM) with the R package tximport 25 kb, and 50 kb, respectively, with p value < 0.05, Z- [95]. Genes were classified as expressed or non- score > 1.5, and FDR < 0.1. Concia et al. Genome Biology (2020) 21:104 Page 17 of 20

For GGLs, the first and second genomic bins of shoot Supplementary information interactions were separately annotated with genes and Supplementary information accompanies this paper at https://doi.org/10. 1186/s13059-020-01998-1. then with ChIP peaks and TPMs using bedtools intersect. Bins not overlapping with genes were Additional file 1. Supplementary figures. discarded, and the first and second bins were rejoined Additional file 2: Table S1. with information about gene to gene loops interaction-wise with the function merge in R. We then (GGLs): position, size, strength, and presence or absence of H3K9ac, transformed the annotated gene pairs into a database H3K27me3, H3K36me3 histone marks, presence or absence of RNA reporting, for each gene, the presence (“1”) or absence polymerase II, transcriptional status and expression level (TPM). “ ” Additional file 3: Table S2. With odds ratios (OR) and effect size ( 0 ) of histone marks or RNAPII, a value of TPM; the (Cramer’s V) for all combinations of features of interacting genes pairs. “ ” “ ” expression status ( expressed or non-expressed ); plus Additional file 4: Table S3. With a summary of Hi-C and HiChIP libraries the distance between the two genes and the Z-score cal- statistics. culated by HOMER. Additional file 5: Review history. Starting from the database, we counted the number of genes with and without each feature and reported Peer review information the aggregated scores in a series of 2 × 2 contingency Kevin Pang was the primary editor of this article and managed its editorial tables with the counts for the first gene in rows and for process and peer review in collaboration with the rest of the editorial team. the second gene in columns. We explored all possible Review history ^ feature combinations and applied a X 2 test to each The review history is available as Additional file 5. 2 × 2 table. Since in a large population such as the one examined here, a statistically significant p value can Authors’ contributions LC, DL, and MB designed the research. DL, NRG, YH, and MP collected the arise from small differences between observed and plant material and performed the Hi-C, Hi-ChIP, ChIP-seq, and RNA-seq ex- expected values [97], we reported the log2 of odds periments. SK prepared the Hi-C libraries. DL, SK, and MP performed the Illu- ratios to interpret the outcome of the test for each mina sequencing. AMR, SDuncan, and SDomenichini contributed to the microscopy, the GISH experiment, and the immune-localization experiment. combination of marks (Additional file 3:TableS2)and LC, TB, CP, DMM, and AV performed the bioinformatics analyses. LC, JSRP, calculated Cramer’s V, a measure of the “effect size.” and MB wrote the manuscript, with support from all authors. EP, SD, GM, HH, GGLs between expressed genes (TPM > 0) were split in CB, MC, MMM, AB, CL, AH, CR, DL, and MB analyzed the data and provided critical feedback. The authors read and approved the final manuscript. twoseriesofquartilesaccordingtotheTPMsofthe two genes, and the counts in each combination of Funding quartiles are indicated in the heatmap in Fig. 4c. The This work was funded by the Agence National de la Recherche ANR (3DWheat project ANR-19-CE20-0001-01) and by the Institut Universitaire de same procedure was followed to produce Fig. 6bby France (IUF). splitting interchromosomal RNAPII-associated interac- tions in ten quantiles of TPM. Availability of data and materials Finally, we selected GGLs, RALs (intrachromosomal), Sequencing data from wheat Hi-C, Hi-ChIP, ATAC-seq, and RNA-seq and RNA polymerase II ChIP-seq, and rapeseed Hi-C are stored and openly available at and interchromosomal RNAPII-associated interactions the GEO under accession GSE133885 [99]. conserved in both shoots and roots. Using bedtools DNA methylation data used for Additional file 1: Supplemental Figure S6 intersect, we annotated with differentially expressed were obtained from Ramirez-Gonzalez et al. [36] and are available in the NCBI SRA repository under accession code SRP133674. genes the first and the second genomic bins of each RNA-seq data from the seedling shoots and roots used for the differential interaction. Bins not overlapping with genes were dis- expression analysis were generated by a previous study [55] and are carded, and the first and second bins were rejoined available in the NCBI SRA repository under accession code DRR003148, DRR003149, and DRR003150 (seedling root) and DRR003154, DRR003155, and interaction-wise with the function merge in R. The linear DRR003156 (seedling shoot). regression between the log2(fold change shoot/root) of the two genes of all pairs was calculated with R, and Ethics approval and consent to participate scatter plots (Figs. 4d, 5a, and 6c) were generated with Not applicable ggplot2 v3.1.0. Quartiles of the distance between the two Competing interests genes (Additional file 1: Supplemental Figure S10) and The authors declare that they have no competing interest quartiles of Z-score in shoots and roots (Fig. 4e, Add- Author details itional file 1: Supplemental Figure S11 and S14) were 1Institute of Plant Sciences Paris of Saclay (IPS2), UMR 9213/UMR1403, CNRS, calculated using the function mutate of the R package INRA, Orsay, France. 2Division of Biological and Environmental Sciences and dplyr v0.7.8 [98], and both regression and scatterplot Engineering, King Abdullah University of Science and Technology, Thuwal 23955-6900, Kingdom of Saudi Arabia. 3John Innes Centre, Norwich Research reproduced for each group. Park, Norwich NR4 7UH, UK. 4Earlham Institute, Norwich Research Park, We counted the number of partners of each gene in Norwich NR4 7UG, UK. 5INRA UMR1095 Genetics, Diversity and interchromosomal RNAPII-associated interactions with Ecophysiology of Cereals, 5 chemin de Beaulieu, 63039 Clermont-Ferrand, France. 6Center for Plant Molecular Biology (ZMBP), University of Tübingen, the R function table and plotted the histogram in Fig. 6d Auf der Morgenstelle 32, 72076 Tübingen, Germany. 7Institut Universitaire de with ggplot2. France (IUF), Paris, France. Concia et al. Genome Biology (2020) 21:104 Page 18 of 20

Received: 30 July 2019 Accepted: 12 March 2020 13(8):1749 Available from: http://www.jstor.org/stable/10.2307/3871316 ?origin=crossref. 21. Madlung A. Remodeling of DNA methylation and phenotypic and transcriptional changes in synthetic Arabidopsis allotetraploids. Plant References Physiol. 2002;129(2):733–46 Available from: http://www.plantphysiol.org/cgi/ 1. Kouzine F, Levens D, Baranello L. DNA topology and transcription. Nucl. doi/10.1104/pp.003095. – 2014;5(3):195 202. 22. Kashkush K, Feldman M, Levy AA. Gene loss, silencing and activation in a 2. Donaldson-Collier MC, Sungalee S, Zufferey M, Tavernari D, Katanayeva N, newly synthesized wheat allotetraploid. Genetics. 2002;160(4):1651–9. Battistello E, et al. EZH2 oncogenic mutations drive epigenetic, 23. Tayalé A, Parisod C. Natural pathways to polyploidy in plants and transcriptional, and structural changes within chromatin domains. Nat consequences for genome reorganization. Cytogenet Genome Res. 2013; Genet. 2019;51(3):517–28 Available from: http://www.nature.com/articles/ 140(2–4):79–96. s41588-018-0338-y. 24. Dvořák J. The relationship between the genome of Triticum urartu and the 3. Ariel F, Romero-Barrios N, Jegu T, Benhamed M, Crespi M. Battles and A and B genomes of Triticum aestivum. Can J Genet Cytol. 1976;18(2):371–7 hijacks: noncoding transcription in plants. Trends Plant Sci. 2015;20(6):362– Available from: http://www.nrcresearchpress.com/doi/10.1139/g76-045. 71. 25. Huang X, Börner A, Röder M, Ganal M. Assessing genetic diversity of wheat 4. Rinn JL, Kertesz M, Wang JK, Squazzo SL, Xu X, Brugmann SA, et al. (Triticum aestivum L.) germplasm using microsatellite markers. Theor Appl Functional demarcation of active and silent chromatin domains in human Genet. 2002;105(5):699–707 Available from: http://link.springer.com/10.1007/ HOX loci by noncoding RNAs. Cell. 2007;129(7):1311–23. s00122-002-0959-4. 5. Misteli T. Beyond the sequence: cellular organization of genome function. 26. Dvorak J, Akhunov ED. Tempos of gene locus deletions and duplications Cell. 2007;128(4):787–800. and their relationship to recombination rate during diploid and polyploid 6. Sutherland H, Bickmore WA. Transcription factories: gene expression in evolution in the Aegilops-Triticum alliance. Genetics. 2005;171(1):323–32 unions? Nat Rev Genet. 2009;10(7):457–66. Available from: http://www.genetics.org/lookup/doi/10.1534/genetics.105. 7. Alt F, Almouzni G, Felsenfeld G, Dekker J. Genome architecture and 041632. expression. Curr Opin Genet Dev. 2013;23(2):79–80. https://doi.org/10.1016/j. 27. Pont C, Salse J. Wheat paleohistory created asymmetrical genomic gde.2012.03.003. evolution. Curr Opin Plant Biol. 2017;36:29–37 Available from: https:// 8. De Graaf CA, Van Steensel B. Chromatin organization: form to function. Curr linkinghub.elsevier.com/retrieve/pii/S1369526616301273. Opin Genet Dev. 2013;23(2):185–90. https://doi.org/10.1016/j.gde.2012.11. 011. 28. Zhang W, Zhang M, Zhu X, Cao Y, Sun Q, Ma G, et al. Molecular cytogenetic 9. Dong P, Tu X, Chu PY, Lü P, Zhu N, Grierson D, et al. Comprehensive and genomic analyses reveal new insights into the origin of the wheat B – mapping of long range interactions reveals folding principles of the human genome. Theor Appl Genet. 2018;131(2):365 75. genome. Mol Plant. 2017;10(12):1497–509. https://doi.org/10.1016/j.molp. 29. Mcfadden ES, Sears ER. The origin of triticum spelta and its free-threshing – 2017.11.005. hexaploid relatives. J Hered. 1946;37(3):81 9 Available from: https:// 10. Feng C-M, Qiu Y, Van Buskirk EK, Yang EJ, Chen M. Light-regulated gene academic.oup.com/jhered/article-lookup/doi/10.1093/oxfordjournals.jhered. repositioning in Arabidopsis. Nat Commun. 2014;5:3027. https://doi.org/10. a105590. 1038/ncomms4027. 30. Bierman A, Botha A-M. A review of genome sequencing in the largest Triticum aestivum – 11. Grob S, Schmid MW, Grossniklaus U. Hi-C analysis in Arabidopsis identifies cereal genome, L. Agric Sci. 2017;08(02):194 207. Available the KNOT, a structure with similarities to the flamenco locus of Drosophila. from: http://www.scirp.org/journal/doi.aspx? https://doi.org/10.4236/as.2017. Mol Cell. 2014;55(5):678–93. https://doi.org/10.1016/j.molcel.2014.07.009. 82014. 12. Rodriguez-Granados NY, Ramirez-Prado JS, Veluchamy A, Latrasse D, 31. Dubcovsky J, Dvorak J. Genome plasticity a key factor. Science. 2007;316: – Raynaud C, Crespi M, et al. Put your 3D glasses on: plant chromatin is on 1862 6. show. J Exp Bot. 2016;67(11):1–17 Available from: http://jxb.oxfordjournals. 32. Wang S, Wong D, Forrest K, Allen A, Chao S, Huang BE, et al. org/lookup/doi/10.1093/jxb/erw168. Characterization of polyploid wheat genomic diversity using a high-density 13. Veluchamy A, Jegu T, Ariel F, Latrasse D, Gayathri Mariappan K, Kim S-KK, 90 000 single nucleotide polymorphism array. Plant Biotechnol J. 2014;12(6): – et al. LHP1 regulates H3K27me3 spreading and shapes the three- 787 96. dimensional conformation of the Arabidopsis genome. Bendahmane M, 33. Jordan KW, Wang S, Lun Y, Gardiner LJ, MacLachlan R, Hucl P, et al. A editor. PLoS One. 2016;11(7):e0158936 Available from: https://dx.plos.org/1 haplotype map of allohexaploid wheat reveals distinct patterns of selection 0.1371/journal.pone.0158936. on homoeologous genomes. Genome Biol. 2015;16(1):1–18. 14. Mascher M, Gundlach H, Himmelbach A, Beier S, Twardziok SO, Wicker T, 34. Appels R, Eversole K, Stein N, Feuillet C, Keller B, Rogers J, et al. Shifting the et al. A chromosome conformation capture ordered sequence of the barley limits in wheat research and breeding using a fully annotated reference genome. Nature. 2017;544(7651):427–33. https://doi.org/10.1038/ genome. Science. 2018;361(6403):eaar7191 Available from: http://www. nature22043. sciencemag.org/lookup/doi/10.1126/science.aar7191. 15. Dong P, Tu X, Chu P-YY, Lü P, Zhu N, Grierson D, et al. 3D chromatin 35. Lieberman-Aiden E, van Berkum NL, Williams L, Imakaev M, Ragoczy T, architecture of large plant genomes determined by local A/B Telling A, et al. Comprehensive mapping of long-range interactions reveals compartments. Mol Plant. 2017;10(12):1497–509 Available from: http://www. folding principles of the human genome. Science. 2009;326(5950):289–93 sciencedirect.com/science/article/pii/S1674205217303398. Available from: http://heinonlinebackup.com/hol-cgi-bin/get_pdf. 16. Baduel P, Bray S, Vallejo-Marin M, Kolář F, Yant L. The “polyploid hop”: cgi?handle=hein.journals/famlq41§ion=14. shifting challenges and opportunities over the evolutionary lifespan of 36. Ramírez-González RH, Borrill P, Lang D, Harrington SA, Brinton J, Venturini L, genome duplications. Front Ecol Evol. 2018;6 Available from: https://www. et al. The transcriptional landscape of polyploid wheat. Science. 2018; frontiersin.org/article/10.3389/fevo.2018.00117/full . Accessed 18 Mar 2020. 361(6403):eaar6089 Available from: http://www.sciencemag.org/lookup/ 17. Salman-Minkov A, Sabath N, Mayrose I. Whole-genome duplication as a key doi/10.1126/science.aar6089. factor in crop domestication. Nat Plants. 2016;2(8):16115 Available from: 37. Jacob Y, Michaels SD. H3K27me1 is E(z) in animals, but not in plants. http://www.nature.com/articles/nplants2016115. . 2009;4(6):366–9 Available from: https://www.ncbi.nlm.nih.gov/ 18. Renny-Byfield S, Wendel JF. Doubling down on genomes: polyploidy and pubmed/19736521. crop plants. Am J Bot. 2014;101(10):1711–25. 38. Roudier F, Ahmed I, Bérard C, Sarazin A, Mary-Huard T, Cortijo S, et al. 19. Pontes O, Neves N, Silva M, Lewis MS, Madlung A, Comai L, et al. Integrative epigenomic mapping defines four main chromatin states in Chromosomal locus rearrangements are a rapid response to formation of Arabidopsis. EMBO J. 2011;30(10):1928–38. the allotetraploid Arabidopsis suecica genome. Proc Natl Acad Sci. 2004; 39. West PT, Li Q, Ji L, Eichten SR, Song J, Vaughn MW, et al. Genomic 101(52):18240–5 Available from: http://www.pnas.org/cgi/doi/10.1073/pnas. distribution of H3K9me2 and DNA methylation in a maize genome. PLoS 0407258102. One. 2014;9(8):1–10. 20. Shaked H, Kashkush K, Ozkan H, Feldman M, Levy AA. Sequence elimination 40. Kouzarides T. Chromatin modifications and their function. Cell. 2007;128(4): and cytosine methylation are rapid and reproducible responses of the 693–705 Available from: https://linkinghub.elsevier.com/retrieve/pii/S009286 genome to wide hybridization and allopolyploidy in wheat. Plant Cell. 2001; 7407001845. Concia et al. Genome Biology (2020) 21:104 Page 19 of 20

41. Wu Y, Kikuchi S, Yan H, Zhang W, Rosenbaum H, Iniguez AL, et al. 60. Wang M, Wang P, Lin M, Ye Z, Li G, Tu L, et al. Evolutionary dynamics of 3D Euchromatic subdomains in rice centromeres are associated with genes and genome architecture following polyploidization in cotton. Nat Plants. 2018; transcription. Plant Cell. 2011;23(11):4054–64 Available from: http://www. 4(2):90–7. https://doi.org/10.1038/s41477-017-0096-3. plantcell.org/lookup/doi/10.1105/tpc.111.090043. 61. Avivi L, Feldman M, Brown M. An ordered arrangement of chromosomes in 42. Santos AP, Shaw P. Interphase chromosomes and the Rabl configuration: the somatic nucleus of common wheat, Triticum aestivum L. Chromosoma. does genome size matter? J Microsc. 2004;214(2):201–6. https://doi.org/10. 1982;86(1):1–16 Available from: http://link.springer.com/10.1007/BF00330726. 1111/j.0022-2720.2004.01324.x. 62. Feldman M, Levy AA. Genome evolution due to allopolyploidization in 43. Rocha PP, Raviram R, Bonneau R, Skok JA. Breaking TADs: insights into wheat. Genetics. 2012;192(3):763–74 Available from: http://www.genetics. hierarchical genome organization. Epigenomics. 2015;7(4):523–6. org/lookup/doi/10.1534/genetics.112.146316. 44. Dixon JR, Selvaraj S, Yue F, Kim A, Li Y, Shen Y, et al. Topological domains in 63. Martín AC, Rey M-D, Shaw P, Moore G. Dual effect of the wheat Ph1 locus mammalian genomes identified by analysis of chromatin interactions. on chromosome synapsis and crossover. Chromosoma. 2017;126(6):669–80 Nature. 2012;485(7398):376–80 Available from: http://www.nature.com/ Available from: http://link.springer.com/10.1007/s00412-017-0630-0. articles/nature11082. 64. Avni R, Nave M, Barad O, Baruch K, Twardziok SO, Gundlach H, et al. Wild 45. Ghirlando R, Felsenfeld G. CTCF: making the right connections. Genes Dev. emmer genome architecture and diversity elucidate wheat evolution and 2016;30(8):881–91. domestication. Science. 2017;357(6346):93–7 Available from: http://www. 46. Hansen AS, Cattoglio C, Darzacq X, Tjian R. Recent evidence that TADs and sciencemag.org/lookup/doi/10.1126/science.aan0032. chromatin loops are dynamic structures. Nucleus. 2018;9(1):20–32. 65. Abranches R, Beven AF, Aragón-Alcaide L, Shaw PJ. Transcription sites are 47. Wutz G, Várnai C, Nagasaka K, Cisneros DA, Stocsits RR, Tang W, et al. not correlated with chromosome territories in wheat nuclei. J Cell Biol. Topologically associating domains and chromatin loops depend on cohesin 1998;143(1):5–12. and are regulated by CTCF, WAPL, and PDS5 proteins. EMBO J. 2017;36(24): 66. Matsunaga S, Katagiri Y, Nagashima Y, Sugiyama T, Hasegawa J, Hayashi K, 3573–99 Available from: http://emboj.embopress.org/content/36/24/3573. et al. New insights into the dynamics of plant cell nuclei and chromosomes. abstract. In: Jeon KWBT-IR of C and MB, editor. International Review of Cell and 48. Crane E, Bian Q, McCord RP, Lajoie BR, Wheeler BS, Ralston EJ, et al. Molecular Biology. Academic Press; 2013. p. 253–301. Available from: https:// -driven remodelling of X chromosome topology during dosage www.sciencedirect.com/science/article/pii/B9780124076952000068?via%3 compensation. Nature. 2015;523(7559):240–4. https://doi.org/10.1038/ Dihub. [cited 2019 Nov 25]. nature14450. 67. Jin QW, Fuchs J, Loidl J. Centromere clustering is a major determinant of 49. Buenrostro JD, Giresi PG, Zaba LC, Chang HY, Greenleaf WJ. Transposition of yeast interphase nuclear organization. J Cell Sci. 2000;113(Pt 1):1903–12 native chromatin for fast and sensitive epigenomic profiling of open Available from: http://www.ncbi.nlm.nih.gov/pubmed/10806101. chromatin, DNA-binding proteins and nucleosome position. Nat Methods. 68. Marshall WF, Dernburg AF, Harmon B, Agard DA, Sedat JW. Specific 2013;10(12):1213–8. https://doi.org/10.1038/nmeth.2688. interactions of chromatin with the nuclear envelope: positional 50. Jégu T, Veluchamy A, Ramirez-Prado JS, Rizzi-Paillet C, Perez M, Lhomme A, determination within the nucleus in Drosophila melanogaster. Mol Biol Cell et al. The Arabidopsis SWI/SNF protein BAF60 mediates seedling growth 1996;7(5):825–842. Available from: http://www.ncbi.nlm.nih.gov/pubmed/ control by modulating DNA accessibility. Genome Biol. 2017;18(1):114 8744953%0Ahttp://www.pubmedcentral.nih.gov/articlerender.fcgi?artid= Available from: https://www.ncbi.nlm.nih.gov/pubmed/28619072. PMC275932. 51. Hagège H, Klous P, Braem C, Splinter E, Dekker J, Cathala G, et al. 69. Gonzalez-Sandoval A, Gasser SM. On TADs and LADs: spatial control over Quantitative analysis of chromosome conformation capture assays (3C- gene expression. Trends Genet. 2016;32(8):485–95 Available from: https:// qPCR). Nat Protoc. 2007;2(7):1722–33. https://doi.org/10.1038/nprot.2007.243. linkinghub.elsevier.com/retrieve/pii/S0168952516300464. 52. Ariel F, Jegu T, Latrasse D, Romero-Barrios N, Christ A, Benhamed M, et al. 70. Göndör A, Ohlsson R. Enhancer functions in three dimensions: beyond the Noncoding transcription by alternative rna polymerases dynamically flat world perspective. F1000Res. 2018;7:681 Available from: https://f1 regulates an auxin-driven chromatin loop. Mol Cell. 2014;55(3):383–96. 000research.com/articles/7-681/v1. 53. Jégu T, Latrasse D, Delarue M, Hirt H, Domenichini S, Ariel F, et al. The 71. Razin SV, Ulianov SV, Ioudinkova ES, Gushchanskaya ES, Gavrilov AA, Iarovaia BAF60 subunit of the SWI/SNF chromatin-remodeling complex directly OV. Domains of α-andβ-globin genes in the context of the structural- controls the formation of a gene loop at FLOWERING LOCUS C in functional organization of the eukaryotic genome. Biochem. 2012;77(13):1409– Arabidopsis. Plant Cell 2014;26(2):538–551. Available from: http://www. 23 Available from: http://link.springer.com/10.1134/S0006297912130019. plantcell.org/content/26/2/538. 72. Edelman LB, Fraser P. Transcription factories: genetic programming in three 54. Kim H-Y. Statistical notes for clinical researchers: chi-squared test and dimensions. Curr Opin Genet Dev. 2012;22(2):110–4 Available from: https:// Fisher’s exact test. Restor Dent Endod. 2017;42(2):152 Available from: https:// linkinghub.elsevier.com/retrieve/pii/S0959437X12000111. www.ncbi.nlm.nih.gov/pubmed/28503482. 73. Schoenfelder S, Clay I, Fraser P. The transcriptional interactome: gene 55. Oono Y, Kobayashi F, Kawahara Y, Yazawa T, Handa H, Itoh T, et al. expression in 3D. Curr Opin Genet Dev. 2010;20(2):127–33 Available from: Characterisation of the wheat (Triticum aestivum L.) transcriptome by de https://linkinghub.elsevier.com/retrieve/pii/S0959437X10000274. novo assembly for the discovery of phosphate starvation-responsive genes: 74. Rey M-D, Moore G, Martín AC. Identification and comparison of individual gene expression in Pi-stressed wheat. BMC Genomics. 2013;14(1):77 chromosomes of three accessions of Hordeum chilense, Hordeum vulgare, Available from: http://bmcgenomics.biomedcentral.com/articles/10.1186/14 and Triticum aestivum by FISH. Genome. 2018;61(6):387–96. https://doi.org/ 71-2164-14-77. 10.1139/gen-2018-0016. 56. Mumbach MR, Rubin AJ, Flynn RA, Dai C, Khavari PA, Greenleaf WJ, 75. Cox AV, Bennett ST, Parokonny AS, Kenton A, Callimassia MA, Bennett MD. et al. HiChIP: efficient and sensitive analysis of protein-directed Comparison of plant telomere locations using a PCR-generated synthetic genome architecture. bioRxiv. 2016:73619. https://doi.org/10.1038/ probe. Ann Bot. 1993;72(3):239–47 Available from: http://www.sciencedirect. nmeth.3999. com/science/article/pii/S0305736483711042. 57. Sexton T, Yaffe E, Kenigsberg E, Bantignies F, Leblanc B, Hoichman M, et al. 76. Cabrera A, Martín A, Barro F. In-situ comparative mapping (ISCM) of Glu-1 Three-dimensional folding and functional organization principles of the loci in Triticum and Hordeum. Chromosom Res. 2002;10(1):49–54. https:// Drosophila genome. Cell. 2012;148(3):458–72 Available from: https:// doi.org/10.1023/A:1014270227360. linkinghub.elsevier.com/retrieve/pii/S0092867412000165. 77. Kong L, Zhang Y, Ye Z-Q, Liu X-Q, Zhao S-Q, Wei L, et al. CPC: assess the 58. Rao SSP, Huntley MH, Durand NC, Stamenova EK, Bochkov ID, Robinson protein-coding potential of transcripts using sequence features and support JT, et al. A 3D map of the human genome at kilobase resolution vector machine. Nucleic Acids Res. 2007;35(web server issue):W345–9 reveals principles of chromatin looping. Cell. 2014;159(7):1665–80 Available from: https://www.ncbi.nlm.nih.gov/pubmed/17631615. Available from: https://linkinghub.elsevier.com/retrieve/pii/S0092867414 78. Kang Y-J, Yang D-C, Kong L, Hou M, Meng Y-Q, Wei L, et al. CPC2: a fast and 014974. accurate coding potential calculator based on sequence intrinsic features. 59. Liu C, Wang C, Wang G, Becker C, Zaidem M, Weigel D. Genome-wide Nucleic Acids Res. 2017;45(W1):W12–6 Available from: https://www.ncbi.nlm. analysis of chromatin packing in Arabidopsis thaliana at single-gene nih.gov/pubmed/28521017. resolution. Genome Res. 2016;26(8):1057–68 Available from: http://genome. 79. Kalvari I, Argasinska J, Quinones-Olvera N, Nawrocki EP, Rivas E, Eddy SR, cshlp.org/lookup/doi/10.1101/gr.204032.116. et al. Rfam 13.0: shifting to a genome-centric resource for non-coding RNA Concia et al. Genome Biology (2020) 21:104 Page 20 of 20

families. Nucleic Acids Res. 2018;46(D1):D335–42 Available from: https:// www.ncbi.nlm.nih.gov/pubmed/29112718. 80. Nawrocki EP, Eddy SR. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics. 2013;29(22):2933–5 Available from: https://www.ncbi.nlm.nih. gov/pubmed/24008419. 81. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30(15):2114–20 Available from: https:// www.ncbi.nlm.nih.gov/pubmed/24695404. 82. Servant N, Varoquaux N, Lajoie BR, Viara E, Chen C-J, Vert J-P, et al. HiC-Pro: an optimized and flexible pipeline for Hi-C data processing. Genome Biol. 2015;16: 259 Available from: https://www.ncbi.nlm.nih.gov/pubmed/26619908. 83. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9(4):357–9 Available from: http://www.nature.com/articles/ nmeth.1923. 84. Imakaev M, Fudenberg G, RP MC, Naumova N, Goloborodko A, Lajoie BR, et al. Iterative correction of Hi-C data reveals hallmarks of chromosome organization. Nature Methods. 2012;9:999–1003. 85. Neva C. Durand, James T. Robinson, Muhammad S. Shamim, Ido Machol, Jill P. Mesirov, Eric S. Lander, Erez Lieberman Aiden, (2016) Juicebox Provides a Visualization System for Hi-C Contact Maps with Unlimited Zoom. Cell Systems 3 (1):99–101. 86. Servant N, Lajoie BR, Nora EP, Giorgetti L, Chen C-J, Heard E, et al. HiTC: exploration of high-throughput “C” experiments. Bioinformatics. 2012;28(21): 2843–4 Available from: https://www.ncbi.nlm.nih.gov/pubmed/22923296. 87. Valero-Mora PM. ggplot2: elegant graphics for data analysis. J Stat Softw. 2010;35 Available from: https://ggplot2.tidyverse.org. Accessed 18 Mar 2020. 88. R Core Team. R: a language and environment for statistical computing. Vienna; 2017. Available from: https://www.r-project.org/. Accessed 18 Mar 2020. 89. Akdemir KC, Chin L. HiCPlotter integrates genomic data with interaction matrices. Genome Biol. 2015;16(1):198 Available from: https://www.ncbi.nlm. nih.gov/pubmed/26392354. 90. Kruse K, Hug CB, Hernández-Rodríguez B, Vaquerizas JM. TADtool: visual parameter identification for TAD-calling algorithms. Bioinformatics. 2016;32:3190–2. 91. Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26(6):841–2 Available from: https:// www.ncbi.nlm.nih.gov/pubmed/20110278. 92. Ramírez F, Ryan DP, Grüning B, Bhardwaj V, Kilpert F, Richter AS, et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 2016;44(W1):W160–5 Available from: https:// www.ncbi.nlm.nih.gov/pubmed/27079975. 93. Zhang Y, Liu T, Meyer CA, Eeckhoute J, Johnson DS, Bernstein BE, et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 2008;9(9):R137 Available from: http://genomebiology.biomedcentral.com/articles/10.1186/ gb-2008-9-9-r137. 94. Bray NL, Pimentel H, Melsted P, Pachter L. Near-optimal probabilistic RNA- seq quantification. Nat Biotechnol. 2016;34(5):525–7. https://doi.org/10.1038/ nbt.3519. 95. Soneson C, Love MI, Robinson MD. Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences. F1000Res. 2015;4: 1521 Available from: https://www.ncbi.nlm.nih.gov/pubmed/26925227. 96. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29(1):15–21 Available from: https://www.ncbi.nlm.nih.gov/pubmed/23104886. 97. Sullivan GM, Feinn R. Using effect size—or why the P value is not enough. J Grad Med Educ. 2012;4(3):279–82 Available from: https://www.ncbi.nlm.nih. gov/pubmed/23997866. 98. Wickham H, Romain F, Henry L, Müller K. RStudio. dplyr: a grammar of data manipulation. R package version 0.8.0.1; 2018. p. 75. Available from: https:// cran.r-project.org/package=dplyr. 99. Concia L, Veluchamy A, Ramirez-Prado JS, Martin Ramirez A, Huang Y, Perez M, Domenichini S, Rodriguez-Granado NY, Kim S, Blein T, Duncan S, Pichot C, Manza-Mianza D, Juery C, Paux E, Moore g HH, Bergounioux C, Crespi M, Mahfouz MM, Bendahmane A, Liu C, Hall A, Raynaud C, Latrasse D, Benhamed M. Wheat chromatin architecture is organized in genome territories and transcription factories. Datasets. Gene Expression Omnibus; 2020. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE133885. Accessed 18 Mar 2020.

Publisher’sNote Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.