Dovetail™ Micro-C™ Assay: Improved read support for topological features Technical Note

Figure 1 – Dovetail™ Micro-C libraries contain proximity ligation properties and are enriched in long-range Product Highlights: informative reads.

Highest resolution Hi-C data Dovetail™ Micro-C (blue) is compared to single (yellow) and multi (orange) RE-based Hi-C generated from 1 Uniform coverage across the genome million GM12878 cells. All libraries were subsampled to Detect nucleosome-to-nucleosome 1 million read-pairs and processed through the Dovetail interactions Genomics QC pipeline. Library characteristics are compared across libraries. Percentages of valid reads (trans + cis > 1 kbp) of the total library, cis reads with insert sizes > 1 kbp, > 20 kbp, and > 1 Mbp are plotted. Library complexity, as percent unique reads at 300 million read pairs, is extrapolated using the preseq library complexity Introduction tool.

Chromatin conformation capture (3C) methods % Total % cis % Unique enable the study of higher-order features of Library Read Pairs of 300M 100 chromatin organization such as chromatin Micro-C loops, topologically associating domains Multi-RE (TADs), and A/B compartments. In particular, 80 Single-RE Hi-C has been widely adopted to assess the frequency of interaction between distant 60 genomic locations up to megabases apart on 40 a genome-wide scale. The resolution at which Percentage the interaction point can be mapped is reliant 20 on two factors associated with the chromatin fragmentation approach: 0

Valid • Distance between fragmentation sites > 1kbp > 20 kbp > 1 Mbp • Uniformity of fragmentation length Complexity While restriction enzymes (RE) are widely used for chromatin fragmentation in Hi-C protocols, activities so generated fragments are of mono- they are not uniformly distributed across the nucleosome length (146bp). The proximity genome thereby generating fragments of ligation portion of the protocol is optimized highly variable length. This non-uniformity to maximize long-range interactions. Our contributes to variable coverage and reduces Dovetail™ Micro-C libraries produced > 95% of achievable chromatin contact resolution. valid read-pairs with 75% unique molecules at 300 million reads. Notably, this is significantly Dovetail Genomics has employed micrococcal higher than single- or multi-RE approaches nuclease (MNase) in our Dovetail™ Micro-C (Figure 1). Moreover, considering all cis- Proximity Ligation Assay. The resulting highly contacts, 92% were > 1 kbp and uniform, short fragments enable nucleosome- 42% > 1 Mbp. These data indicate that the level resolution of chromatin contacts, a library characteristics desired in high quality theoretical resolution maximum. Here, we Hi-C libraries are significantly improved in the compare the ultra-high-resolution chromatin Dovetail™ Micro-C Assay. conformation information generated by Dovetail™ Micro-C to RE-based Hi-C data and Improved calling of topological highlight its nucleosome-resolution capabilities. features Dovetail Micro-C libraries are enriched The ability to detect higher-order features, such for desirable Hi-C properties as chromatin loops, in proximity ligation data is dependent on enriching long-range informative MNase enzyme possesses both sequence- reads to capture chromatin interaction independent endonuclease and exonuclease frequency. The increased chromosome

1 conformation informative reads combined Micro-C based contact matrix contained with ultra-high-resolution improves loop calling 19,084 chromatin loops with 4,334 overlapping compared to RE-based methods (Figure 2A), as with Rao et al. Overlapping loop calls between demonstrated for previously enumerated loops Dovetail Micro-C and Rao et al. are similar in (Rao et al., 2014). Furthermore, de novo loop number to other such comparisons. calling on the GM12878 cell line using Juicer (Durand et al.) demonstrates that the Dovetail

Figure 2 – Dovetail™ Micro-C contact matrices display improved signal-to-noise enabling superior detection of higher-order features of chromatin organization.

A&B) Contact matrix images from Dovetail™ Micro-C, single-, and multi-RE Hi-C are shown at 4 kbp resolution from GM12878 cell lines. Known loops from Rao et al. are outlined with circles and the number of supporting reads from a 800 M total read depth for each library type is indicated. C) Venn diagram comparing Juicer detected loops between Rao et al., 2014 and Dovetail Micro-C.

A. B.

250 298 515 299 1644 990 1502 2246 2421 Micro-C Micro-C

98 150 136 143 486 321 410 514 794 Multi-RE Hi-C Multi-RE Hi-C

48 48 66 72 107 114 62 126 119 Single-RE Hi-C Single-RE Hi-C

Chr11: 1.5Mb-3.5Mb Chr8: 139Mb-142Mb ZMYND11 PRR26 GTPBP4 ADARB2-AS1 DENND3 PTP4A3 MIR4472-1

TUBB8 MIR7641-2 DIP2C MIR5699 LARP4B IDI1 ADARB2 MIR6072 LOC105376351 PTK2 SLC45A4 GPR20 MROH5 MIR1302-7 TSNARE1

C.

Dovetail™ Rao Micro-C 14,750 4,334 3,714

To place an order or for more information. TECH 2 Visit us at www.dovetailgenomics.com or send an email to [email protected] NOTE Dovetail Micro-C uniquely captures In contrast, the single-RE Hi-C dataset for nucleosome positioning the same region exhibits a coverage drop off, likely due to an area of low RE-site density. As Chromatin digested with MNase reveals a a result, chromatin contacts are only mappable genome-wide nucleosome map that is visible in at low resolution. These data demonstrate the Dovetail Micro-C libraries. To highlight this that Dovetail Micro-C data not only identifies quality, a metagene analysis of high occupancy nucleosomes, but also provides more uniformly CTCF sites was performed. The result displays distributed fragmentation sites over other Hi-C coverage periodicity relative to the CTCF approaches. anchor (Figure 3A) in which peaks indicate DNA that is protected by the nucleosome and Dovetail Micro-C data contains troughs represent intervening DNA that is nucleosome-to-nucleosome contacts accessible to MNase digestion. This oscillation occurs at a frequency of ~146 bp (the length The combined genome-wide nucleosome of DNA wrapped around a mono-nucleosome). positioning and ultra-resolution chromosome This feature is unique to Micro-C data as it is topology enabled by Dovetail Micro-C facilitates absent from RE-based approaches. mapping from nucleosome-to-nucleosome chromatin contacts. To demonstrate this The regularly spaced MNase fragmentation feature, Hi-C contacts within 1,000 bp of all sites resulting from the naturally occurring strong CTCF sites were aggregated at a 1 bp nucleosome distribution produces uniform read contact map resolution (Figure 4). This reveals coverage across the genome. To understand both nucleosome phasing around CTCF (as a this further, we focused on an 8 kbp CTCF- 2-D version of Figure 3A) and nucleosome- mediated region on chr1. Micro-C genomic to-nucleosome interactions (observed off the coverage across this region displays both diagonal). This capability is unique to Micro-C uniform sequence distribution and the ability to data as RE-based Hi-C lacks the resolution map nucleosome position (Figure 3B).

Figure 3 – Dovetail™ Micro-C libraries are enriched in nucleosome-protected fragments enabling genome-wide nucleosome resolution.

A) Metagene analysis of relative coverage across ~20,000 high occupancy CTCF regions in GM12878 cells. Dovetail™ Micro-C (blue) is compared to single (yellow) and multi (orange) RE-based Hi-C. For the Micro-C dataset, peaks in coverage represent DNA protected by the nucleosome and troughs are DNA accessible to MNase during chromatin fragmentation. These data are the average of normalized (by read depth) coverage. B) Genomic coverage at a CTFC occupied site in chr1. Libraries were subsampled to 40X coverage and plotted along an 8 kbp region associated with the CTCF occupied , ACAP3. In contrast to RE-based approaches, Micro-C libraries display nucleosome periodicity and fragment length uniformity. Regions of low coverage in the RE-based Hi-C coverage are regions with low RE-site density.

A. B.

chr1: 1.298 Mbp 1.306 Mbp

200 0.00060 Micro-C 1,298,5931,298,625 0 200 Multi-RE 0.00055 0 200 Single-RE 0 0.00050 400 CTCF 0

Genes

Relative Read Coverage ACAP3 SNORD167 0.00045 Micro-C 400 Multi RE CTCF Single RE 0 chr1: 1.15 Mbp 1.55Mbp -1000 CTCF 1000 MIR200B B3GALT6 ACAP3 MXRA8 ANKRD65 ATAD3B SSU72 Base pairs around CTCF binding sites TNFRSF18 UBE2J2 CPTP CCNL2 ATAD3C ATAD3A

To place an order or for more information, TECH 3 visit us at www.dovetailgenomics.com or send an email to [email protected] NOTE Figure 4 – Nucleosome-level resolution of chromatin contacts generated with Dovetail™ Micro-C.

Hi-C contacts within 1,000 bp of all strong CTCF sites were aggregated at a 1 bp contact map resolution for Micro-C (A) and Multi-RE Hi-C (B). This reveals both nucleosome phasing around CTCF (as a 2-D version of Figure 3A) and nucleosome- to-nucleosome interactions (observed off the diagonal) in the Micro-C library that is not present in RE-based Hi-C. The decrease in interaction across the CTCF regions indicate that CTCF acts as an insulator preventing chromatin interactions. This matrix was generated from 1.3 B paired-end Dovetail Micro-C reads from the human cell line GM12878.

A. B. 1000 1000

500 500

0 0

-500 -500 Distance from CT CF sites (bp) Distance from CT CF sites (bp)

-1000 -1000 -1000 -500 0 500 1000 -1000 -500 0 500 1000 Distance from CTCF sites (bp) Distance from CTCF sites (bp) to capture nucleosome interactions. This Summary highlights that Dovetail™ Micro-C can be used to detect and explore higher-order features at Here we showcase Dovetail Micro-C, a nucleosome-level resolution around a genomic proximity-ligation assay with superior genomic region of interest or, if coupled with the resolution of chromatin contacts and higher Dovetail™ HiChIP MNase Assay, for a order structures involved in chromatin factor of interest. topology. Dovetail Micro-C has been optimized to produce the highest signal-to-noise data with both enrichment of long-range informative reads and nucleosome protected fragments. These unique properties enable nucleosome- level resolution of chromatin contacts.

References

Durand, NC et al. (2016) Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst. 3(1): 95–98. doi:10.1016/j.cels.2016.07.002.

Rao, SSP et al. (2014) A 3D map of the at kilobase resolution reveals principles of chromatin looping. Cell. 159: 1665-1680. DOI:https://doi.org/10.1016/j.cell.2014.11.021

4 TECH NOTE