
A parallelized, automated platform enabling individual or sequential ChIP of histone marks and transcription factors

Riccardo Dainese(a,b), Vincent Gardeux(a,b), Gerard Llimos(a,b), Daniel Alpern(a,b), Jia Yuan Jiang(a), Antonio Carlos Alves Meireles-Filho(a,b), and Bart Deplancke(a,b,1)

(a) School of Life Sciences, Institute of Bioengineering, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland; and (b) Swiss Institute of Bioinformatics, CH-1015 Lausanne, Switzerland

Edited by Bing Ren, Ludwig Institute for Cancer Research, La Jolla, CA, and accepted by Editorial Board Member Christopher K. Glass April 16, 2020 (received for review August 6, 2019)

Despite its popularity, chromatin immunoprecipitation followed by sequencing (ChIP-seq) remains a tedious (>2 d), manually intensive, low-sensitivity and low-throughput approach. Here, we combine principles of microengineering, surface chemistry, and molecular biology to address the major limitations of standard ChIP-seq. The resulting technology, FloChIP, automates and miniaturizes ChIP in a beadless fashion while facilitating the downstream library preparation process through on-chip chromatin tagmentation. FloChIP is fast (<2 h), has a wide dynamic range (from 10^6 to 500 cells), is scalable and parallelized, and supports antibody- or sample-multiplexed ChIP on both histone marks and transcription factors. In addition, FloChIP’s interconnected design allows for straightforward chromatin reimmunoprecipitation, which allows this technology to also act as a microfluidic sequential ChIP-seq system. Finally, we ran FloChIP for the transcription factor MEF2A in 32 distinct human lymphoblastoid cell lines, providing insights into the main factors driving collaborative DNA binding of MEF2A and into its role in B cell-specific regulation. Together, our results validate FloChIP as a flexible and reproducible automated solution for individual or sequential ChIP-seq.

microfluidics | epigenetics | ChIP-seq

Significance

Chromatin immunoprecipitation followed by next-generation sequencing (ChIP-seq) has become the most widely used method to infer the genomic binding locations of chromatin-associated proteins such as histones and transcription factors. Since its emergence in 2007, ChIP-seq has been performed through a long sequence of manual steps, with consequent limitations in throughput and sensitivity of the technique. Here, we present a microfluidic implementation of the ChIP procedure, named FloChIP, which greatly streamlines the experimental workflow. We show that FloChIP can perform standard ChIP and sequential ChIP assays in parallel and reproducibly in an automated fashion for histone marks and transcription factors.

Author contributions: R.D. and B.D. designed research; R.D. performed research; R.D., G.L., D.A., J.Y.J., and A.C.A.M.-F. contributed new reagents/analytic tools; R.D. and V.G. analyzed data; and R.D. and B.D. wrote the paper.
The authors declare no competing interest.
This article is a PNAS Direct Submission. B.R. is a guest editor invited by the Editorial Board.
Published under the PNAS license.
Data deposition: All data have been deposited on ArrayExpress: E-MTAB-7179 (MEF2A data), E-MTAB-7150 (decreasing cell number data), E-MTAB-7186 (sequential ChIP data), E-MTAB-8533 (TF benchmarking data), and E-MTAB-8551 (histone mark data).
(1) To whom correspondence may be addressed. Email: [email protected].
This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.1913261117/-/DCSupplemental.
First published May 27, 2020.

The genome-wide distribution and dynamics of protein-DNA interactions constitute a fundamental aspect of gene regulation. Chromatin immunoprecipitation followed by next generation sequencing (ChIP-seq) (1) has become the most widespread technique for mapping protein-DNA interactions genome-wide. ChIP-seq has been successfully applied to dozens of transcription factors (TFs), histone modifications, chromatin modifying complexes, and other chromatin-associated proteins in humans and other model organisms (2). The ENCODE (Encyclopedia of DNA Elements) and modENCODE consortia alone have already performed more than 8,000 ChIP-seq experiments, which have greatly enhanced our collective understanding of how gene regulatory processes are orchestrated in humans as well as several model organisms (3). In addition, ChIP-seq proved to be essential to acquire new insights into genomic organization (4–6) and into the mechanisms underlying genomic variation-driven phenotypic diversity and disease susceptibility (5, 7, 8). More specifically, this assay proved crucial in determining the DNA binding properties of hundreds of TFs (9). Nevertheless, in comparison to other widespread methods based on next generation sequencing (NGS), e.g., RNA-seq (10) and assay for transposase-accessible chromatin using sequencing (ATAC-seq) (11), ChIP-seq lags behind in some key metrics, that is, throughput, sensitivity, and automation, which hinders its wider adoption and reproducibility. For example, while RNA-seq can now be regularly performed on hundreds or thousands of single cells using readily available workflows (12, 13), ChIP-seq has largely remained labor intensive and limited to a few samples per run, each composed of millions of cells. Moreover, while a typical preamplification RNA-seq workflow consists of only three steps (that is, cell lysis, RNA capturing, and reverse transcription), ChIP-seq typically involves several preamplification steps (cross-linking, lysis, fragmentation, IP, end repair, and adapter ligation). Finally, any given RNA transcript is present in each cell in numerous copies, which increases the likelihood of its capture and detection, whereas each locus-specific protein-DNA contact occurs a maximum of two times in a diploid cell. The combination of these idiosyncratic differences, together with the lack of enabling solutions, has thus far prevented the ChIP-seq technology, as opposed to other NGS-based methods, from reaching its full potential in terms of adoption, utility, and biomedical relevance.

In addition to the standard ChIP protocol, a modification of its workflow involving sequential ChIP has been developed to infer the genomic cooccurrence of two distinct (protein) targets. In principle, sequential ChIP consists of performing ChIP twice on the same input chromatin, which leads to a multiplication of the inefficiencies mentioned above. Therefore, not only does sequential ChIP show the same limitations as regular ChIP-seq, but these come in an augmented form due to its consecutive nature. As a result, few studies have so far performed sequential

ChIP followed by NGS (14, 15) (sequential ChIP-seq), and most of the available studies have so far relied on qPCR to validate putative bivalent regions (16–18) (sequential ChIP-qPCR).

In recent years, several attempts have been made to alleviate some of the limitations of the ChIP-seq and sequential ChIP approaches. Gasper et al. (19) and Aldridge et al. (20) addressed the issue of automation by implementing the manual steps of a conventional ChIP-seq workflow on robotic liquid handling systems. However, in these examples, automation came at the expense of sensitivity, since these workflows still require tens of millions of cells per experiment. Van Galen et al. (21) and Chabbert et al. (22) addressed the issue of throughput by barcoding and pooling chromatin samples before IP. Although van Galen et al. proved that their approach led to higher sensitivity (500 cells per ChIP), neither approach is automated, and both are, so far, limited to the detection of histone marks. Ma et al. (23) and Rotem et al. (24) addressed the limitation of sensitivity with two different microfluidic-based strategies. Ma et al. focused on improving the efficiency of the IP step by confining it within microfluidic channels. Although they showed good IP efficiency down to as few as 30 cells, their approach requires impractical antibody-oligo conjugates, is not automated, and was not shown to work for TFs. On the other hand, Rotem et al. achieved the remarkable feat of performing ChIP-seq in a single cell by integrating the concept of chromatin barcoding and pooling into a single droplet-based microfluidic chip. However, even though the barcoding step has, indeed, single-cell resolution, the most critical step (that is, the IP step) is performed manually on 100 cells. As a result, their approach, also shown to work only for histone marks, yielded sparse single-cell data, and thousands of assays are still required to identify specific cell subpopulation signatures. In an effort to simplify the sequential ChIP workflow, Weiner et al. (15) complemented the IP steps with sequential chromatin barcoding, thus achieving a high degree of multiplexing. However, their approach increases the number of experimental steps, which makes it significantly more labor intensive given that the workflow is not automated.

In this work, we aimed to address the major limitations of current ChIP-seq and sequential ChIP solutions (throughput, sensitivity, and automation), by developing a microfluidic strategy that we named FloChIP. We show that high quality and parallel/multiplexed ChIP-seq for histone marks (down to 500 cells) and TFs (100,000 cells) is achieved in less than 2 h through a combination of microvalves, micropillars, flexible surface chemistry, and on-chip chromatin tagmentation. Moreover, by designing an interconnected and modular device, FloChIP enables straightforward re-IP of eluted chromatin, effectively establishing a half-day sequential ChIP pipeline. Finally, we performed FloChIP for the TF MEF2A using chromatin derived from 32 lymphoblastoid cell lines (LCLs). Our data highlight the main drivers of MEF2A collaborative DNA binding and provide insights into MEF2A’s role in the regulatory network underlying lymphoblastoid proliferation.

13828–13838 | PNAS | June 16, 2020 | vol. 117 | no. 24 | www.pnas.org/cgi/doi/10.1073/pnas.1913261117

Results

FloChIP Is Engineered for Automated, Beadless, and Miniaturized ChIP-seq. Conventional ChIP-seq requires hours of manual work and a wide range of consumables provided by a variety of suppliers and has limited sensitivity. The main rationale behind FloChIP’s design was the development of an efficient solution that would address these drawbacks in a convenient and compact manner. The two core elements of FloChIP’s technology are the assembly of a multilayered stack of biomolecules, enabling versatility in antibody pull-down (Fig. 1A), and an engineered pattern of high surface-to-volume micropillars for efficient chromatin capture and washing (Fig. 1B). FloChIP’s surface chemistry is based on strong although noncovalent molecular interactions, enabling the immobilization of an antibody of choice prior to IP. The first layer is obtained by flowing on-chip a concentrated solution of biotinylated BSA, which passively adsorbs to the hydrophobic walls of the microfluidic device. This layer has both an insulating role, preventing nonspecific adsorption of chromatin to the chip walls, and a docking role for the next layer, which is obtained by flowing on-chip a solution containing neutravidin that strongly binds to the biotin groups of the first layer. The third layer is formed by flowing a solution of biotinylated protein A/G, which becomes firmly immobilized by the unsaturated binding sites of the neutravidin layer. Protein A/G is a recombinant protein used in a variety of immunoassays due to its ability to strongly bind to a large number of different antibodies. This ability is retained by FloChIP’s surface functionalization, which thus constitutes a general substrate for antibody pull-down.

Another critical feature of FloChIP’s workflow is the microfluidic tagmentation of immunoprecipitated chromatin. In a previous study, tagmentation of bead-bound chromatin (ChIPmentation) was shown to generally increase cost-effectiveness and sensitivity of the ChIP-seq workflow (25). Here, we built upon this concept and adapted it to obtain an example of ChIPmentation performed directly on chromatin bound to the inner walls of a microfluidic device (Fig. 1A, indexing and elution). Briefly, this is achieved by flowing a Tn5 solution into the device while heating the chip surface to 37 °C, allowing direct on-chip indexing of chromatin-bound DNA. Importantly, microfluidic ChIPmentation streamlines the downstream library preparation workflow and reduces hands-on time (see also below).

For the successful initiation of the multilayered surface functionalization, the only substrate requirement is the hydrophobic surface of the device polymer. Therefore, to maximize the surface-to-volume ratio of our devices, we designed an array of micropillars (Fig. 1B), which repeats multiple times across each IP lane (Fig. 1C). With the goal of visually validating the successful assembly of our multilayered on-chip chemistry and to confirm that every layer is essential to this end, we first sought to immunoprecipitate chromatin derived from a HeLa H2B-mCherry cell line using an anti-H2B antibody. The resulting fluorescence micrographs confirmed that each layer of the molecular species is necessary for successful IP of cellular chromatin (Fig. 1D and SI Appendix, Fig. S1A).

The IP lane (Fig. 1C) is the fundamental unit of the FloChIP architecture, and it can itself be repeated n times, where n is the desired throughput of the device. For our initial tests, we used an eight-lane FloChIP device (Fig. 1E and SI Appendix, Fig. S1B). To gain accurate flow control, automation, and multiplexing, a network of soft microvalves was added to the design; different multiplexing modes can be achieved with the same microfluidic architecture by actuating distinct sets of valves. For instance, we used the name “FloChIP mode 1 - sample multiplex” for the option of coating all IP lanes of the device with one antibody and introducing different samples from dedicated individual inlets (Fig. 2A). In this multiplexing configuration, the number of samples that can be processed is determined by the number of IP lanes of the microfluidic chip, for example, here, eight lanes and thus eight samples. Alternatively, “FloChIP mode 2 - antibody multiplex” provides the option of coating each IP lane with a different antibody and of distributing one sample equally across the whole device (thus probing distinct antibodies in parallel using one sample) (Fig. 2B). In this configuration, one can process as many antibodies as the number of IP lanes present on-chip, here, eight lanes and thus eight antibodies.

FloChIP Reliably Reproduces ENCODE Data across a Wide Range of Input Cells. To evaluate the performance of our FloChIP strategy, we first set out to empirically estimate FloChIP’s dynamic range. To this end, we performed FloChIP in sample multiplex mode with GM12878 lymphoblastoid cells, by functionalizing the whole chip with an anti-H3K27ac antibody and immunoprecipitating different chromatin dilutions (ranging from 1 million to

[Figure 1, panels A–E. Panel A step labels: STEP 1, biotinylated BSA (100 μl at 2 mg/mL, 20 min); STEP 2, neutravidin (100 μl at 1 mg/mL, 20 min); STEP 3, biotinylated protein A/G (100 μl at 1 mg/mL, 20 min); STEP 4, antibody pull-down (0.1–0.5 μg per IP, 20 min); STEP 5, IP, ChIP and washes (3 ng–6 μg of chromatin, 30–60 min); STEP 6, ChIPmentation (0.5 μl of Tn5 per IP, 45 min); STEP 7, indexing and elution (elution at 65 °C, 10 min). Timeline markers: 80 min (end of surface functionalization), 120 min, and 180 min. Panel B: PDMS micropillar array. Panel C: IP lane with chromatin inlet and chromatin outlet. Panel D: with and without biotinylated protein A/G, before and after wash. Panel E: reagent multiplexer. Scale bars: 80 μm and 2 mm.]

Fig. 1. FloChIP’s architecture for miniaturized ChIP-seq. (A) FloChIP’s processing phases in descending chronological order. In the first “surface functionalization phase” (∼80 min), the inner walls are functionalized by sequentially introducing chemical species that firmly interact with both the previous and following layers of functionalization. Also, in chronological order, these species are biotin-BSA, neutravidin, biotin-protein A/G, and antibody. Following functionalization, the IP takes place by flowing sonicated chromatin on-chip in a total time of 30 min to 60 min (depending on the chromatin volume that is introduced), followed by low-salt, high-salt, and LiCl buffer washes. Subsequently, the antibody-bound chromatin is tagmented directly on-chip in order to introduce Illumina-compatible adapters. Finally, the tagmented chromatin is eluted off-chip using an SDS-containing buffer and high temperature. (B) Top-view microscopy picture of a portion containing numerous micropillars. Each portion is itself repeated several times along the length of one IP lane. (C) Top-view schematic of one IP lane. Each IP lane can be repeated n times across a FloChIP device. Flow channels are in blue, and control channels are in red. (D) Fluorescence micrographs showing the requirement for protein A/G in the correct formation of FloChIP’s functionalization. Images were taken at 10× magnification. (E) Top-view schematic of the eight-unit FloChIP device. Flow channels are in blue, and control channels are in red.
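The step durations quoted in Fig. 1A can be added up to see how the workflow time is distributed. The following is a minimal sketch, assuming the per-step minutes printed on the figure panel (20 min for each functionalization layer, 30 min to 60 min for IP including washes, 45 min for ChIPmentation, and 10 min for elution); the variable and function names are illustrative and not part of the original work.

```python
# Step durations in minutes, read from the Fig. 1A panel labels.
# Each entry is (name, shortest, longest); only the IP step is a range,
# because its duration depends on the chromatin volume introduced.
# Assumption: the 30-60 min IP figure already includes the buffer washes.
STEPS = [
    ("biotinylated BSA", 20, 20),
    ("neutravidin", 20, 20),
    ("biotinylated protein A/G", 20, 20),
    ("antibody pull-down", 20, 20),
    ("IP and washes", 30, 60),
    ("ChIPmentation", 45, 45),
    ("indexing and elution", 10, 10),
]


def cumulative(steps):
    """Return (name, earliest_end, latest_end) in minutes for each step."""
    earliest = latest = 0
    timeline = []
    for name, shortest, longest in steps:
        earliest += shortest
        latest += longest
        timeline.append((name, earliest, latest))
    return timeline


timeline = cumulative(STEPS)
for name, earliest, latest in timeline:
    span = str(earliest) if earliest == latest else f"{earliest}-{latest}"
    print(f"{name:<26} finished after {span} min")
```

Under these assumptions the four functionalization layers account for the ∼80 min quoted in the caption, and the remaining steps add between 85 min and 115 min depending on the IP duration.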

500 cells). Despite the observed differences in recovered DNA (SI Appendix, Fig. S2A), we obtained high and stable fold enrichment results across the whole series of dilutions tested (SI Appendix, Fig. S2B). To obtain a genome-wide perspective on its dynamic range, we sequenced FloChIP’s libraries for chromatin samples obtained from 100,000, 50,000, 5,000, and 500 cells. After sequencing, the rate of uniquely mapped reads remained high for all samples (SI Appendix, Fig. S2C), while the fraction of reads falling into peaks (FRiP score) decreased with decreasing input amounts, from over 60% for the largest sample to just above 10% for the smallest (SI Appendix, Fig. S2D). Nevertheless, both locus-specific inspection and genome-wide analysis of the obtained libraries revealed reproducible profiles (Fig. 2C) and characteristic accumulation of reads into regions in proximity of transcription start sites (TSS; Fig. 2D). Moreover, the pairwise correlation of genome-wide read densities demonstrated the high accuracy of our approach, since we uncovered a high correlation between all library pairs (between r = 0.78 and r = 0.96). This included ENCODE-FloChIP pairs, among which the highest correlation was obtained for the 100,000 cell sample, that is, r = 0.88 (Fig. 2E and SI Appendix, Fig. S2E). These values support the robustness of our approach, since they are in line with those from a previous study (3) which assessed ChIP-seq data reproducibility and antibody quality and concluded that correlation values of ∼0.8 between matched conditions are entirely acceptable. In addition, to evaluate individual chip-to-chip variation, we analyzed the correlation between two libraries obtained from 100,000 cells in identical conditions but derived from different FloChIP devices. Again, we obtained high correlation (r = 0.98;

[Figure 2, panels A–J. (A) FloChIP Mode 1, sample multiplex: parallel antibody distribution, Sample 1 to Sample n. (B) FloChIP Mode 2, antibody multiplex: parallel sample distribution, Antibody 1 to Antibody n. (C, F, G) Signal tracks for ENCODE and FloChIP libraries; locus labels: CENPM, BZRAP-AS1, CD74, CLUHP3, ZNF678, GAPDH, RPLP1, RPLP0. (D, H) Normalized tag density from -2000 to +2000 around the TSS. (E, I) ENCODE versus FloChIP correlation plots, axes from 0 to 1. (J) FRiP score, 0% to 100%.]

Fig. 2. FloChIP-based derivation of chromatin landscapes in LCL GM12878. (A) Schematic depiction of FloChIP’s mode 1: sample multiplex. One antibody solution is introduced through the common inlet and distributed equally across all IP lanes. During IP, each IP lane is loaded separately by introducing different samples through the individual inlets. (B) Schematic depiction of FloChIP’s mode 2: antibody multiplex. Each IP lane is functionalized separately by introducing different antibodies through the individual inlets. During IP, one sample is introduced through the common inlet and distributed equally across all IP lanes. (C) H3K27ac profiles at three different genomic loci obtained by FloChIP with decreasing cell numbers (100,000 to 500 cells). For comparison, ENCODE data generated by conventional ChIP-seq are also shown. (D) Normalized read density metaprofiles at 2 kilobases (kb) around TSS for samples of decreasing cell numbers and ENCODE. (E) Normalized pairwise correlation of samples with decreasing cell numbers and ENCODE (see Methods for details). (F) Signal tracks for H3K27ac, H3K4me1, and H3K4me3 profiles obtained by FloChIP with 100,000 cells are shown at three different genomic loci. (G) H3K9me3 signal track comparison between ENCODE and FloChIP. (H) Normalized read density metaprofiles at 2 kb around TSS for H3K4me3, H3K27ac, and H3K4me1. For comparison, ENCODE data generated by conventional ChIP-seq are also shown. (I) Correlation plots between FloChIP (x axis) and ENCODE (y axis) data for all targets tested, that is, H3K27ac, H3K4me1, H3K9me3, and H3K4me3. (J) Comparison in terms of fraction of reads in peaks (FRiP) between FloChIP and ENCODE for histone mark samples.
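Two of the quality measures shown in Fig. 2, the FRiP score and the pairwise correlation of read densities, can be illustrated with a few lines of code. The following is a minimal sketch on toy data, not the authors’ pipeline: the normalization described in their Methods is not reproduced here, and the function names, the half-open peak intervals, and the example reads are all assumptions made for illustration.

```python
import math


def frip(reads, peaks):
    """Fraction of reads in peaks.

    reads: list of (chrom, position) tuples, one per mapped read.
    peaks: list of (chrom, start, end) tuples, treated as half-open
           intervals [start, end).
    """
    if not reads:
        return 0.0
    inside = sum(
        any(chrom == p_chrom and start <= pos < end
            for p_chrom, start, end in peaks)
        for chrom, pos in reads
    )
    return inside / len(reads)


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of per-bin densities."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Toy example: four reads, two of which fall inside a chr1 peak.
toy_reads = [("chr1", 150), ("chr1", 950), ("chr1", 5000), ("chr2", 120)]
toy_peaks = [("chr1", 100, 200), ("chr1", 900, 1000)]
toy_frip = frip(toy_reads, toy_peaks)

# Toy per-bin read densities for two libraries over the same four bins.
toy_r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
```

The linear scan over peaks keeps the sketch short; a real analysis over millions of reads would use an interval index or an established toolkit instead.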

SI Appendix, Fig. S2F), indicating that our system is refractory to batch variability.

Next, we set out to evaluate the reproducibility of our approach with other genomic targets. To this end, by using FloChIP’s mode 2 - antibody multiplex, we performed ChIP-seq of four histone marks in parallel (H3K27ac, H3K4me3, H3K4me1, and H3K9me3) using the same sample, going from chromatin to sequencing-ready libraries, in 1 d. Following qPCR enrichment analysis (SI Appendix, Fig. S2G) and sequencing (SI Appendix, Fig. S2H), we found that the obtained signal tracks closely resemble those of ENCODE (Fig. 2F for H3K27ac, H3K4me1, and H3K4me3; Fig. 2G for H3K9me3). In addition, to evaluate FloChIP’s performance with greater resolution, we determined the extent of genome-wide distribution of reads around the TSS (Fig. 2H) and the overall correlation (see Methods) between FloChIP and ENCODE datasets (26–29). Comparison of signal intensities between the respective datasets confirmed an overall high read density correlation in peaks (H3K4me3: r = 0.85; H3K27ac: r = 0.90; H3K4me1: r = 0.89; H3K9me3: r = 0.87; Fig. 2I). Moreover, comparison in terms of the FRiP score showed that, despite the ChIP cell input for ENCODE being two orders of magnitude greater than that of FloChIP, our technology consistently yields highly enriched libraries, with FRiP scores between 1.1× and 4.1× higher for FloChIP compared to ENCODE (Fig. 2J). Finally, to investigate the probability of cross-talk across FloChIP lanes, we performed FloChIP on H3K27ac and H3K4me3 for human (GM12878 cells) and mouse (mESCs) chromatin, each loaded in adjacent IP lanes. Our results were compared to human ENCODE data for each respective mark as a control, revealing that virtually no such cross-talk could be observed between FloChIP lanes (SI Appendix, Fig. S2I and Table S1). These data show that FloChIP can be used to robustly generate chromatin landscapes for histone marks on the same sample in a parallelized manner and over a wide input range.

FloChIP’s “Sequential IP” Mode Provides Genome-Wide Information on Bivalent Regulatory Regions. Conventional ChIP-seq provides information on the genome-wide localization of one specific protein or histone mark at a time. However, DNA regulatory elements tend to be characterized by much more complex chromatin

states that involve multiple histone marks and collaborating TFs (5, 16, 30). For instance, it has been shown that promoters showing both repressive (H3K27me3) and active (H3K4me3) marks are a characteristic feature in embryonic stem (ES) cells (16, 31). This class of promoters was originally named “bivalent” (16, 31) and is strongly associated with key spatially regulated developmental genes (32). To obtain direct information on the genomic location of bivalent promoters, a variant of the standard ChIP protocol called sequential ChIP was developed (16). Despite the advantage of sequential ChIP over standard ChIP in discerning true bivalency, its manual involvement and laboriousness have thus far prevented widespread usage. To address the technical limitations of the current sequential ChIP workflow, we exploited FloChIP’s intrinsic modularity, highly efficient IP, and multiplexing features to generate an automated and miniaturized sequential ChIP solution (Fig. 3A). Briefly, FloChIP’s “sequential IP” consists of two consecutive IPs taking place in two adjacent IP lanes. The chromatin immobilized and washed in the first IP lane is resuspended by means of a peptide elution strategy (14) and then transferred on-chip to a neighboring IP lane where the second IP is carried out (Fig. 3A), followed by salt washes and tagmentation.

We validated this approach by focusing on bivalent chromatin in embryonic development, given its well-studied role in this context. Specifically, we acquired genome-wide direct cooccupancy profiles for H3K27me3 and H3K4me3 in mESCs in both IP directions, that is, H3K27me3 first followed by H3K4me3 (H3K27me3/H3K4me3) and vice versa. As mentioned above, H3K4me3 and H3K27me3 bivalency was originally attributed to promoters of developmental genes, leading to the hypothesis that a bivalent state maintains genes in a poised state (16). To evaluate the performance of FloChIP’s “sequential IP” mode, we first focused on three distinct regions that have been previously used as proof-of-concept models by Bernstein et al. (16), who illustrated the methylation status difference of these regions by using ChIP-qPCR and sequential ChIP-qPCR. With this approach, they were able to distinguish regions displaying only H3K4me3 (e.g., Tcf4 TSS) and only H3K27me3 (e.g., upstream of Hoxa3) versus those displaying true bivalency (e.g., Irx2 TSS). FloChIP-based genomic profiles (Fig. 3 B and C and SI Appendix, Fig. S3) validated these previous findings (15, 16). We observed that, as in the original study by Bernstein et al., Tcf4 shows high H3K4me3 but low H3K27me3 enrichment, and thus low bivalency. On the other hand, Hoxa3 was mainly marked by H3K27me3, with low H3K4me3 enrichment

[Figure 3, panels A–D. (A) Schematic of the sequential IP: IP #1 (eluted chromatin is stored in FloChIP’s outlets), IP #2 (stored chromatin is flowed into the second set of IP lanes), and elution (bivalent chromatin is collected in the second set of outlets). (B) H3K4me3, H3K27me3, H3K27me3/H3K4me3, and H3K4me3/H3K27me3 signal tracks at Tcf4, Irx2, and Hoxa3. (C) H3K27me3/H3K4me3 and H3K4me3/H3K27me3 signal tracks at Gata3, Gata4, Tal1, Pou5f1, and the Hoxc cluster (Hoxc13, Hoxc10, Hoxc8, Hoxc6, Hoxc4). (D) Correlation plots between FloChIP 27me3/4me3, FloChIP 4me3/27me3, and Co-ChIP rep1 and rep2, axes from 0 to 1.]

Fig. 3. FloChIP’s “sequential IP” mode for the study of factor cooccupancies including bivalent chromatin in mouse ES cells E14TG2a. (A) FloChIP’s sequential IP steps in descending chronological order as applied for H3K4me3-H3K27me3 cooccupancy. Chromatin that is derived from the first IP is collected through a peptide elution strategy (14) into off-chip reservoirs connected to the device. Following collection, the control channels are actuated to isolate the first IP lane from the chromatin, while opening the path to the second IP lane. At this point, the chromatin flows into the second prefunctionalized IP lane. Finally, the bivalent chromatin is eluted again in off-chip reservoirs. (B) Normalized signal tracks for the two individual IP libraries (H3K4me3 and H3K27me3) as well as the corresponding sequential IP samples (H3K27me3/H3K4me3 and H3K4me3/H3K27me3) at three different genomic loci presented in the original study by Bernstein et al. (16). (C) Normalized signal tracks for the two IP samples (H3K27me3/H3K4me3 and H3K4me3/H3K27me3) at different genomic loci reported as bivalent in the Co-ChIP study (15). (D) Normalized correlation plots (see Methods) between FloChIP in sequential ChIP mode in the two orders (i.e., H3K27me3 followed by H3K4me3 and vice versa) and the corresponding Co-ChIP replicates.
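The logic of reading the four tracks in Fig. 3B together (two individual IPs and two sequential IPs in opposite order) can be written down as a simple decision rule. The following is a toy sketch only: the 0.5 cutoff, the function name, and the signal values are hypothetical placeholders chosen for illustration, and the original work evaluates these loci by inspecting normalized tracks rather than by applying a threshold.

```python
def classify_locus(k4, k27, k27_then_k4, k4_then_k27, cutoff=0.5):
    """Label a locus from four normalized signals in the range [0, 1].

    k4, k27      : individual H3K4me3 and H3K27me3 IP signals.
    k27_then_k4  : sequential IP signal, H3K27me3 first, then H3K4me3.
    k4_then_k27  : sequential IP signal, H3K4me3 first, then H3K27me3.
    cutoff       : hypothetical threshold separating high from low signal.
    """
    both_single = k4 >= cutoff and k27 >= cutoff
    both_sequential = k27_then_k4 >= cutoff and k4_then_k27 >= cutoff
    if both_single and both_sequential:
        return "bivalent"
    if k4 >= cutoff and k27 < cutoff:
        return "H3K4me3 only"
    if k27 >= cutoff and k4 < cutoff:
        return "H3K27me3 only"
    return "unclear"


# Placeholder signal values, invented to mirror the three qualitative
# patterns described for Fig. 3B; they are not measurements.
example_loci = {
    "high K4, low K27": (0.9, 0.1, 0.1, 0.1),
    "low K4, high K27": (0.1, 0.8, 0.1, 0.1),
    "all four tracks high": (0.9, 0.8, 0.7, 0.7),
}
labels = {name: classify_locus(*signals) for name, signals in example_loci.items()}
```

Requiring signal in both sequential orders reflects the idea that true cooccupancy should not depend on which mark is pulled down first.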

and consequently low bivalency signal (Fig. 3B). Finally, the TSS of Irx2 showed high bivalency, with all four genomic tracks, two individual and two sequential FloChIPs, showing high coverage. Moreover, the promoter of genes such as Gata3, Gata4, Tal1, and the Hoxc clusters showed pronounced bivalency, as expected for important developmental genes (15, 33) (Fig. 3C). Next to analyzing specific loci, we also validated our data on a broader scale by measuring the number of reads mapping into the TSS of known bivalent promoters (SI Appendix, Table S2) and achieving high correlation (between r = 0.87 and r = 0.93) with the results obtained by Weiner et al. (15) using their Co-ChIP system (Fig. 3D). Taken together, our data indicate that FloChIP’s “sequential IP” mode constitutes a miniaturized, low-input (100,000 cells) and rapid (between 5 h and 6 h) sequential ChIP-seq workflow for the genome-wide analysis of cooccurring binding events.

FloChIP Robustly Profiles TF Binding. As mentioned above, previous attempts at improving either the sensitivity or multiplexing ability of ChIP-seq experiments were successful but mostly, so far, in the context of histone marks (21–24). In fact, performing ChIP on TFs poses additional challenges compared to ChIP on histone marks, such as the fact that 1) TF/DNA interactions are less abundant and weaker than histone mark/DNA interactions and 2) antibodies for TFs normally show lower affinity for their epitopes compared to histone mark antibodies. These challenges translate into the need for more-abundant sample inputs and longer incubation times. Indeed, during FloChIP optimization, we also experienced these challenges, which made FloChIP’s indirect method (that is, with 2- to 4-h antibody/chromatin preincubation in tubes) the only robust way to obtain high-quality TF ChIP results. We benchmarked our system on different TF targets that were already assayed in LCLs by the ENCODE consortium, but using only 100,000 cells per ChIP: C/EBPβ, PU.1, and RUNX3 (SI Appendix, Fig. S4). For RUNX3 and PU.1, we obtained good agreement and correlation between our data and ENCODE’s (RUNX3: average [avg] r = 0.75; PU.1: avg r = 0.72) and high correlations between technical replicates (RUNX3: r = 0.95; PU.1: r = 0.9; SI Appendix, Fig. S4), once again highlighting the reproducibility of our system. For C/EBPβ, results were more ambiguous, since the FloChIP and ENCODE libraries did not correlate well (between r = 0.49 and r = 0.52) (SI Appendix, Fig. S4I). However, we found that the ENCODE replicates themselves did not have a high correlation (r = 0.51), which questions the reliability of this dataset (SI Appendix, Fig. S4I). In contrast, FloChIP replicates still yielded a high correlation (r = 0.97), which suggests that our FloChIP data are at least more consistent than ENCODE’s (SI Appendix, Fig. S4I). In addition, genome-wide quality measures such as FRiP score (SI Appendix, Fig. S4E) and motif enrichment (SI Appendix, Fig. S4H) also revealed high quality and reproducibility of FloChIP data. For example, the PU.1, RUNX3, and C/EBPβ motifs were consistently found to be the top hits in their respective libraries (SI Appendix).

In order to demonstrate the applicability of our approach on TFs, we performed MEF2A ChIP-seq on chromatin derived from distinct LCLs derived from 32 unrelated European individuals whose genomes were sequenced and aligned to the hg19 assembly as part of the 1000 Genomes Project (34) (Fig. 4A). We specifically targeted this TF, given its association with variable chromatin modules that were inferred from histone mark and PU.1 ChIP-seq data from LCLs, as presented in one of our [...]

[...] reflecting homogeneity in DNA yield. This result was obtained without manually adjusting the volume or concentration of the sample prior to IP. We therefore reasoned that FloChIP itself, when used in saturating conditions, provides the additional advantage of equalizing the amount of recovered DNA, which renders the library preparation process straightforward. After sequencing, we confirmed the accumulation of mapped reads in selected genomic loci (based on MEF2A ENCODE data: upstream CCL3, VOPP1, and upstream USP7; Fig. 4C).

Subsequently, we again analyzed the quality of our data using a number of genome-wide measures. First, we called peaks for each sample (Fig. 4D; avg(#peaks) = 8,600) and observed a generally good degree of read density correlation between sample pairs (avg(cor) = 0.84) (SI Appendix, Fig. S5A). Surprisingly, despite detecting strong enrichment of the MEF2-like consensus motif in each individual peak set (Fig. 4E, median(pval) = 1e-10), examining the presence of motifs in peaks revealed that only a small portion of peaks contained the MEF2A motif (Fig. 4F; avg(motif density in peaks) = 15%). Therefore, we set out to explore alternative drivers of the observed MEF2A binding. To this end, we performed de novo motif discovery on each individual peak file and ranked the obtained motifs according to their statistical significance. Interestingly, the MEF2A motif emerged as only the fifth most enriched one, preceded by the motifs for BATF, RUNX3, NFKB, and the IRF:PU.1 dimer (Fig. 4G). This observation is consistent with previous motif-based analyses in LCLs (4, 35) and suggests that MEF2A DNA binding could be largely motif independent and driven by the interaction with other factors (36). Of note is that a previous study aimed at deciphering the Epstein–Barr virus-based mechanism of B cell/lymphoblastoid conversion had also found the same motifs to be strongly enriched in their EBNA3C ChIP-seq dataset (35). Thus, we hypothesized that, in LCLs, MEF2A may play a role in the gene regulatory network underlying B cell proliferation. To test this hypothesis, we first split the observed peaks in unique and overlapping sets based on the presence of one or more of the MEF2A motifs and the three other motifs with the highest enrichment, that is, IRF:PU.1, RUNX3, and BATF (Fig. 4H). We found that the largest set of the resulting Venn diagram consists of peaks containing all four motifs (20% of all peaks) and is enriched for ontology terms linked to immune cell regulation and proliferation (Fig. 4I and SI Appendix, Fig. S5B). In comparison, the second largest set (13%) involved peaks featuring, exclusively, the MEF2A motif and did not show insightful ontology enrichment. These findings indicate that MEF2A may play a role in the gene regulatory network underlying lymphoblastoid cell proliferation and that it does so specifically as part of a larger complex of collaborating TFs.

Next, we examined the impact of genetic variation on MEF2A binding. To achieve this, we exploited our large MEF2A ChIP-seq dataset by considering allele-specific binding (ASB; see Methods) across all 32 samples and found that only a very small fraction (0.7%) of ASBs could be explained by MEF2A motif variation (Fig. 5A and SI Appendix, Fig. S6 A–D). Indeed, only 5 out of 751 ASBs were significantly disrupting or creating an MEF2 motif (considering the entire MEF2 family) at false discovery rate (FDR) < 5% and were concordant in terms of ASB allele effect. Similarly, we found 3 ASBs for BATF, 19 for RUNX, and 2 for IRF (∼4% of all ASBs). In total, we tested 401 mononucleotide human core TF motifs [from HOCOMOCO v11 (37)], revealing only 223 ASBs that were significantly disrupting at least one motif at FDR 5% (∼30%).
This result is previous studies (4). Before sequencing, we verified the IP consistent with our observation that few MEF2A DNA binding quality of each library by qPCR (Fig. 4B). Fold change results events are dependent on MEF2A’s own motif and also aligns indicated consistent, high enrichment across the 32 IP lanes with the notion that the majority of variable TF DNA binding (avg(Fold enrichment) = 70). In addition, we noticed that, dur- events are driven by motif-independent mechanisms (5, 36). ing library preparation, the number of amplification cycles that Interestingly, and as indicated above, we found that a large was required to obtain sufficient DNA amounts for NGS se- portion of motif-affected ASBs are linked to RUNX-family quencing was the same for all 32 samples (17 PCR cycles), binding sites (Fig. 5A). In agreement with this, allelic binding

Dainese et al. PNAS | June 16, 2020 | vol. 117 | no. 24 | 13833

[Figure 4, panels A–I (image not reproduced). (A) Cell lines: GM06985, GM06986, GM07037, GM07048, GM07056, GM07346, GM07357, GM10847, GM10851, GM11830, GM11831, GM11832, GM11894, GM11931, GM11993, GM11994, GM12154, GM12156, GM12249, GM12283, GM12286, GM12287, GM12489, GM12750, GM12761, GM12762, GM12763, GM12776, GM12812, GM12813, GM12814, GM12873. (B) qPCR enrichment; x axis: log2(fold change); average fold enrichment = 70. (C) Signal tracks at CCL3 (chr17:34,417,412–34,418,177), VOPP1 (chr7:55,601,855–55,602,605), and USP7 (chr16:8,984,654–8,986,358). (D) Number of peaks called; x axis: log10(# peaks); average number of peaks = 8,600. (E) MEF2 motif enrichment; x axis: −log(P value); average −log(p-value) = 80. (F) Motif density in peaks; x axis: percent; average percentage of peaks with MEF2 motif = 15%. (G) De novo motif enrichment, −log(pval), for BATF, RUNX3, NFKB, MEF2A, and IRF:PU.1. (H) Peak-associated promoters containing selected TF motifs (MEF2A, BATF, RUNX3, IRF:PU.1). (I) GO analysis of overlapping promoters: positive regulation of interleukin-2 production; T-cell costimulation; regulation of B cell proliferation; lymphocyte costimulation; positive regulation of B cell proliferation; regulation of antigen-mediated signaling pathway; regulation of interleukin-12 production; regulation of interleukin-2 production; regulation of CD4-positive, alpha-beta T cell activation; lymphocyte proliferation.]

Fig. 4. FloChIP’s MEF2A IP on 32 cell lines. (A) List of the 32 cell lines used in this study. (B) The qPCR enrichment for each library at the VOPP1 locus. The average fold enrichment across all libraries is 70, represented as a dotted line. (C) Signal tracks are reported for each library for three different genomic loci. (D) Number of peaks called for each library (8,600 peaks on average, represented by a dotted line). (E) MEF2A motif enrichment for each library (the median P value is 1e-10, represented as a dotted line). (F) Percent of called peaks containing the MEF2A motif (average is 15%, represented as a dotted line). (G) Results of de novo motif search and enrichment of each detected motif across libraries. (H) Venn diagram showing the percentages of MEF2A-bound promoters and the occurrence of other detected motifs. (I) GO enrichment analysis of promoters containing the MEF2A, BATF, RUNX3, and IRF:PU.1 motifs.

cooperativity (ABC) analysis showed that variation in RUNX-like motifs was significantly correlated with MEF2A DNA binding differences (nominal p-val < 5%), pointing to cooperative DNA binding mechanisms between these TFs (Fig. 5B). Indeed, an ABC (see Methods) test revealed that RUNX3 (with 22 single-nucleotide polymorphisms [SNPs] tested) is one of the top significant cobinders with MEF2A (P value = 0.02). MEF2A and MEF2D motifs also appeared in the top seven list, with P values of 0.068 and 0.051, respectively. Of note is that, since the motifs of RUNX factors are very similar to one another, our results cannot specifically distinguish between RUNX1, RUNX2, and RUNX3. Nevertheless, given that only RUNX3 ChIP-seq data exist for LCLs (https://www.encodeproject.org/targets/RUNX3-human/ and this study) and that previous evidence also linked RUNX3 to B cell proliferation pathways (35), we narrowed our analyses to this TF. Interestingly, within the set of RUNX3 ASBs, we detected several cases in which genetic variation from the reference sequence either creates (Fig. 5C) or disrupts a RUNX motif (SI Appendix, Fig. S6E). Accordingly, RUNX motif creation induced MEF2A DNA binding to the alternate allele (Fig. 5 D–F), whereas motif disruption led to MEF2A ChIP signal loss (SI Appendix, Fig. S6 F–H). Taken together, these genetic variation-based results suggest that, within the TF complex governing lymphoblastoid cell proliferation, RUNX3 and MEF2A cooperate, with RUNX3 acting as a key driver of MEF2A DNA binding. In conclusion, our findings demonstrate the value of FloChIP in catalyzing the relatively straightforward acquisition of TF binding data across many genotypes.

[Figure 5, panels A–F (image not reproduced). (A) Proportion of MEF2A ASBs explained by motif disruption (FDR < 5%): Unknown (528, 70%), Others (194, 26%), RUNX (19, 2.5%), MEF2 (5, 0.7%), BATF (3, 0.4%), IRF (2, 0.3%). (B) Allele Binding Cooperativity; x axis: ‘atSNP’ log likelihood ratio (motif difference between ref and alt); y axis: ASB log fold change; MEF2A cor = 0.85, RUNX3 cor = 0.49. (C) RUNX3 motif with the REF (A) and ALT (G) alleles of rs6912511. (D) Allelic imbalance (in HET samples); y axis: number of reads with allele (REF, ALT); binomial p = 6.35E-3. (E) chr6:3184953,3189953 (5 kb scale bar), LINC02525; tracks: ENCODE, homozygous REF, heterozygous, homozygous ALT. (F) MEF2A binding enrichment at 6:3187033:3187823 by genotype (rs6912511 @6:3187454): AA (HOM REF), AG (HET), GG (HOM ALT).]

Fig. 5. Genetic variation-based analyses reveal RUNX3 to be an important mediator of MEF2A DNA binding. (A) Proportion of the 751 identified ASBs that can be explained by motif disruption (at FDR 5%). RUNX, IRF, and BATF sum the results of all of the TF members in their family (because the respective motifs are very close). Only ASBs that are concordant with motif disruption or creation are counted. (B) ABC test for inferring motif disruption of other TFs that are modulating MEF2A DNA binding. The x axis represents the motif score/quality difference (for MEF2A in red, and RUNX3 in blue), while the y axis represents the fold change between the number of reads mapping to the ref vs. the alt allele, summed over all heterozygous samples. This was computed for all 5 identified MEF2A ASBs, and 22 RUNX3 ASBs (at FDR 5%). (C) The rs6912511 SNP significantly disrupts the RUNX3 motif (FDR < 5%). (D) Allelic imbalance highlighted for rs6912511 by summing read counts of each allele over all heterozygous samples. A binomial test yielded a P value of 6.35E-3, which revealed rs6912511 as an ASB. (E) IGV tracks showing MEF2A binding enrichment at the rs6912511 locus. The coverage tracks are from ENCODE (NA12878 MEF2A) and our 32 samples, stratified by rs6912511 genotype, and merged into three tracks (final bigwigs are normalized by reads per kilobase per million mapped reads (RPKM) using deepTools). (F) Boxplots showing the effect of rs6912511 on the three different genotypes (AA, AG, and GG). MEF2A binding enrichment (y axis) was computed from the peak in which rs6912511 was found, and normalized using DESeq2, followed by qqnorm functions in R.

Discussion

Profiling the interactions between proteins and DNA has both fundamental and biomedical value (2, 38, 39), but continues to constitute an important technological challenge for genomics research (2, 3). ChIP-seq allows the probing of protein–DNA interactions on a genome-wide scale, thus achieving high throughput in terms of DNA sequence space coverage. However, in terms of experimental output, ChIP-seq largely remains a low-throughput technique. This is mainly due to the long and manually intensive ChIP-seq pipeline which, although widespread, only offers limited automation, reproducibility, and sensitivity. Valuable efforts have already been devoted to addressing ChIP-seq’s main limitations, but these efforts tended to target only specific issues while skipping others, as outlined in the Introduction (14, 15, 18, 20–25). Orthogonal approaches have recently emerged, such as cleavage under targets and release using nuclease (CUT&RUN) (40) and cleavage under targets and tagmentation (CUT&Tag) (41), that are capable of profiling chromatin in a one-tube format and down to single cells. Rather than relying on solid-state separation of protein/DNA complexes, these approaches exploit fusion proteins (protein A-Mnase and protein A-Tn5, respectively) to selectively digest or tagment the genomic DNA in the proximity of chromatin-bound antibodies. Such alternative strategies hold great potential as sensitive and streamlined techniques for genomic profiling of protein/DNA complexes, even down to the single-cell level. But, as opposed to ChIP-seq, they still need to establish themselves as widely implemented tools, which remains uncertain due to the fact that data correlations tend to still be greater among conventional ChIP-seq tools than between these novel and already established approaches (e.g., ENCODE data) (40, 41) or FloChIP, and they cannot, at this stage, be used to perform sequential IP (SI Appendix, Table S3). To more comprehensively address the limitations of standard ChIP-seq, we developed a microfluidic system that we named FloChIP. In this work, we demonstrate that FloChIP enables both rapid single and sequential IP across a wide input range with high target flexibility and experimental scalability.

We first characterized FloChIP’s dynamic range by targeting H3K27ac in samples containing 500 up to 1 million cells. We show high fold enrichment and good genomic coverage for all tested samples, indicating that FloChIP performs robustly across a wide cell number range. In a second step, we tested different antibodies in what we define as the “antibody multiplex mode.” In this configuration, the different IP lanes of our device are individually functionalized with distinct antibodies, with one

sample being distributed in parallel to all IP units. The geometric layout of the microchannels thereby ensures uniform distribution of the sample (Fig. 1E and SI Appendix, Fig. S1C). After observing good correlations with the benchmarking data (ENCODE) for all tested histone mark targets, we conclude that FloChIP can generate chromatin state landscapes with reliability and flexibility.

After establishing this proof of concept on histone marks, we set out to expand the applicability of FloChIP in two directions: sequential ChIP-seq and ChIP-seq on TFs. For the former, we took advantage of the use of microvalves to compartmentalize distinct sections of the microfluidic device in a controllable manner, as reported previously for other applications (42, 43). More specifically, the use of microvalves allows easy orchestration of the functionalization of adjacent IP lanes with different antibodies, in this case, H3K4me3 and H3K27me3, and transfer of eluted chromatin from one IP lane to the next. With this approach, we recapitulated the landmark qPCR (16) and sequencing (15) results on the genomic distribution of H3K4me3/H3K27me3 bivalency in ES cells. Here, we note that the key advantage of FloChIP lies in turning sequential ChIP assays from a long (>2 d), intensive, and error-prone protocol into a fast (half-day) and automated procedure. We therefore believe that FloChIP could catalyze renewed interest in histone mark bivalency (15, 17), whose molecular function and relevance remain poorly understood, in part, because of the cumbersome nature of the sequential ChIP-seq technique.

Consistent with the general-purpose scope of our microfluidic ChIP-seq system, our final goal was to demonstrate the ability of FloChIP to carry out IP on TF targets. Such capacity would constitute important technical progress, since previous microfluidic or multiplexing implementations of ChIP focused mostly on histone marks (21, 23, 24). To cope with the generally lower affinity of TF antibodies for their targets, we complemented the regular FloChIP procedure with antibody–chromatin preincubation and oscillatory sample loading (44). We first benchmarked FloChIP on three well-studied TFs (PU.1, RUNX3, and C/EBPβ), after which we further validated our platform by completing 32 MEF2A IPs and on-chip tagmentations in less than 1 d. After sequencing, we used the data to investigate the role of MEF2A in the gene regulatory network underlying lymphoblastoid cell proliferation. By integrating genotypic and molecular phenotypic data, we found that MEF2A DNA binding is, in large part, controlled by other TFs such as RUNX3, which, as part of a larger complex of collaborating TFs, seem to coordinate B cell proliferation. Importantly, both sequential ChIP-seq and TF ChIP-seq experiments were carried out on samples of 100,000 cells, therefore constituting a significant improvement in sensitivity for these ChIP-seq variants, which usually require several million cells (3).

Finally, we anticipate that FloChIP can still benefit from further optimization. Despite its ability to automate several steps, such as IP, washes, and tagmentation, FloChIP demands important preparatory hands-on work, for example, wiring the device and interfacing it to the control system. Moreover, even though FloChIP streamlines a significant portion of the ChIP-seq workflow, it remains sensitive to the pre-IP protocol steps and to the choice of specific antigen targets. In other words, similar to standard ChIP-seq, high-affinity antibodies and correctly fragmented chromatin will remain essential requirements for the correct functioning of FloChIP. Finally, while we expect that the implementation of FloChIP will be easily attainable by laboratories with basic microfluidic infrastructure, we aim to also serve the broader scientific community. To do so, we are currently working toward integrating FloChIP in a more compact solution that will allow laboratories without prior microfluidic expertise to adopt it as well. Specifically, we aim to develop a FloChIP benchtop instrument and off-the-shelf FloChIP cartridges, paralleling comparable efforts to, for example, accommodate single-cell transcriptomic analyses (45).

In conclusion, we believe that FloChIP has the potential to empower the community with a practical and reliable IP solution. Its future integration with standardized user-friendly devices will thereby pave the way toward full automation.

Methods

Detailed information can be found in SI Appendix, Supplementary Methods.

Cell Line Information. For all comparisons with ENCODE data, LCL GM12878 was used. For the sequential ChIP experiments, we used mouse ES cells (E14TG2a) in order to enable the comparison with previous datasets (15, 16, 46). For the large-scale MEF2A experiments, we used the following LCLs: GM06985, GM06986, GM06994, GM07037, GM07048, GM07051, GM07056, GM07346, GM07357, GM10847, GM10851, GM11829, GM11830, GM11831, GM11832, GM11840, GM11881, GM11894, GM11918, GM11920, GM11931, GM11992, GM11993, GM11994, GM12005, GM12043, GM12154, GM12156, GM12249, GM12275, GM12282, GM12283, GM12286, GM12287, GM12383, GM12489, GM12750, GM12760, GM12761, GM12762, GM12763, GM12776, GM12812, GM12813, GM12814, GM12815, and GM12873.

Chromatin Preparation. Lymphoblastoid and mouse embryonic cells were harvested, washed once with phosphate-buffered saline (PBS), and resuspended in 1 mL of cross-linking buffer. Cross-linking was quenched, and cells were then washed twice with ice-cold PBS, pelleted, deprived of the supernatant, snap frozen, and stored at −80 °C. The frozen cell pellet was resuspended in ice-cold PBS, lysed, and resuspended in sonication buffer. Nuclei were sonicated on a Covaris E220 machine (see SI Appendix; for representative fragment analyzer output, see SI Appendix, Fig. S7A).

FloChIP. FloChIP devices were fabricated as previously reported (47) (details in SI Appendix, Supplementary Methods).

A FloChIP experiment starts with preloading the control lines with distilled water and activating all valves. Subsequently, all of the reagents required for the surface chemistry are loaded into pipette tips and inserted into the inlets of the microfluidic device. At this stage, all valves are closed, and there is virtually no cross-talk between any of the reagents above. Immediately after completing their insertion into the chip, the tips themselves are connected to solenoid valves for electronic pressure control, and the automated protocol is launched by running the respective script. The protocol entails, in sequential order, the following steps: 20 min of BSA–biotin (100 μL at 2 mg/mL), 30 s of PBS wash, 20 min of Neutravidin (100 μL at 1 mg/mL), 30 s of PBS wash, 20 min of biotin–protein A/G (100 μL at 2 mg/mL), and 30 s of PBS wash. The antibodies used in this study are anti-H3K27ac ab4729, anti-H3K4me3 ab8580, anti-H3K4me1 ab8895, anti-H3K9me3 ab8898, anti-H3K27me3 ab6147, anti-MEF2A sc-17785, anti-RUNX3 sc-101553, anti-PU.1 sc-390405, and anti-C/EBPβ sc-7962. Following antibody loading, chromatin samples are loaded on-chip by opening and closing the respective microvalves. Following IP, salt washes are performed to eliminate nonspecific binding. Subsequently, Tn5 buffer is flowed on-chip to tagment the immunoprecipitated chromatin.

Following Tn5 buffer, sodium dodecyl sulfate (SDS) is loaded on-chip at 65 °C for 10 min in order to elute the antibody-bound chromatin from the device. Of note is that the first elution step in the FloChIP sequential mode is performed through a peptide elution strategy (14) in order to preserve epitopes and the surface chemistry in the adjacent IP lanes. The eluate is independently collected from each IP lane into PCR tubes and de–cross-linked at 65 °C for 4 h. Following de–cross-linking, DNA is purified in Qiagen EB buffer using Qiagen MinElute purification kits (SI Appendix, Fig. S7B).

Species Mixing Experiment. The species mixing experiment was carried out on two different FloChIP devices, one functionalized with anti-H3K27ac antibody and one with anti-H3K4me3 antibody. Mouse and human chromatin were then loaded in adjacent IP units in order to test for contamination across said units. During analysis, we first created a mixed genome, merging the mouse genome mm10 (GRCm38) and the human genome hg38 (GRCh38) of Ensembl release 93. All libraries were then aligned, using STAR (Spliced Transcripts Alignment to a Reference, v. 2.6.1b) (48), to this genome, using default parameters and “--outFilterMultimapNmax 1.” This parameter allows discarding of any multiply mapping reads and thus recovery of reads that map uniquely to the human or the mouse genome. Then, we counted the number of reads per chromosome (here, chromosome + species) using “samtools idxstats” and summed the reads belonging to the same species.
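The per-species read tally described for the species mixing experiment can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' script: it assumes that the contigs of the merged reference carry a species prefix (`human_` and `mouse_` here, a hypothetical naming scheme that the text does not specify) and that the output of `samtools idxstats` has been captured as text. The four tab-separated columns (reference name, sequence length, mapped reads, unmapped reads) are the standard `idxstats` layout; the function names and the toy numbers are invented for the example.

```python
from collections import Counter

# Hypothetical contig prefixes for the merged reference; the actual naming
# used for the mixed mouse/human genome is not given in the text.
SPECIES_PREFIXES = {"human_": "human", "mouse_": "mouse"}


def reads_per_species(idxstats_text, prefixes=SPECIES_PREFIXES):
    """Sum mapped reads per species from `samtools idxstats` output.

    Each line holds four tab-separated fields:
    reference name, sequence length, mapped reads, unmapped reads.
    """
    totals = Counter()
    for line in idxstats_text.splitlines():
        if not line.strip():
            continue
        name, _length, mapped, _unmapped = line.split("\t")
        for prefix, species in prefixes.items():
            if name.startswith(prefix):
                totals[species] += int(mapped)
                break  # a contig belongs to exactly one species
    return dict(totals)


def cross_species_fraction(totals, expected):
    """Fraction of species-assigned reads that map to the unexpected species."""
    assigned = sum(totals.values())
    if assigned == 0:
        return 0.0
    return 1.0 - totals.get(expected, 0) / assigned


if __name__ == "__main__":
    # Toy numbers for illustration only, not data from the study.
    example = "\n".join([
        "human_1\t248956422\t9000\t0",
        "human_2\t242193529\t500\t0",
        "mouse_1\t195471971\t400\t0",
        "mouse_2\t182113224\t100\t0",
        "*\t0\t0\t250",  # unplaced reads; ignored because no prefix matches
    ])
    totals = reads_per_species(example)
    print(totals)
    print(cross_species_fraction(totals, expected="human"))
```

For a lane loaded with human chromatin, a small `cross_species_fraction` would indicate little carry-over from the adjacent mouse lane; the threshold for "small" is left to the reader, since the text does not state one.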

FloChIP Read Mapping and Processing for Histone Marks and TFs. Sequencing reads were mapped to the human (hg38 for all histone mark and TF comparisons with ENCODE data; hg19 for the 32 samples of the 1000 Genomes Project) and mouse (mm10) genomes using STAR (48) with default parameters. We counted the number of reads that are falling into different features (introns, intergenic, CDS, etc.) using RSeQC read_distribution.py script (v.2.6.4) (49). This revealed no major anomalies, as most reads for all libraries mapped to either intergenic or intronic regions, with ENCODE libraries (especially for RUNX3, PU.1, and H3K9me3) tending to have a slightly greater proportion of intronically mapped reads (SI Appendix, Fig. S8). Uniquely mapped reads were used to call peaks using the HOMER (Hypergeometric Optimization of Motif EnRichment) (50) command “findPeaks.pl” with the appropriate flag, that is, “–histone” for histone marks and “–factor” for TFs with the parameter size set to 1,000 in the case of H3K27ac for decreasing cell numbers (SI Appendix, Table S4). FRiP scores were calculated using HOMER’s command “annotatePeaks.pl” on individual peak files, dividing the total number of reads that fall within peaks by the total number of mapped reads. Correlation scatter plots were generated using the “multiBamSummary” command of the deepTools package in “bins” mode, that is, by binning the entire genome and measuring read densities in each bin. Metaprofiles were generated using HOMER’s command “annotatePeaks.pl” in “hist” mode, which automatically normalizes each directory by the total number of mapped reads. Genomic coverage profiles were generated using the IGV (Integrative Genomics Viewer) browser; all profile groups of the same histone mark or TF were normalized by the same factor to allow a reliable comparison. Motif enrichment analysis of known motifs was performed with HOMER’s findMotifsGenome.pl for the following motifs: 1) Mef2a(MADS)/HL1-Mef2a.biotin-ChIP-Seq(GSE21529)/Homer, 2) BATF(bZIP)/Th17-BATF-ChIP-Seq(GSE39756)/Homer, 3) PU.1-IRF(ETS:IRF)/Bcell-PU.1-ChIP-Seq(GSE21512)/Homer, 4) PU.1(ETS)/ThioMac-PU.1-ChIP-Seq(GSE21512)/Homer, 5) RUNX(Runt)/HPC7-Runx1-ChIP-Seq(GSE22178)/Homer, and 6) CEBP(bZIP)/ThioMac-CEBPb-ChIP-Seq(GSE21512)/Homer.

Allele-Specific Binding. For identifying variants subject to Allele-Specific Binding of MEF2A, we downloaded the 1000G genotyping data from the European Bioinformatics Institute (EBI) server (hg19) and removed variants with a minor allele frequency lower than 5%, while restricting our analysis to the 32 samples of interest, which yielded a total of 1,268,985 SNPs. Then, we applied the “ASEReadCounter” tool from GATK v.4.0.4.0, on each of the 32 samples. These results were then merged by summing read counts for every SNP across heterozygous samples. We filtered out all SNPs with a coverage lower than 10 reads, which yielded 6,330 SNPs. Then we opted to keep only SNPs falling into called peaks, which led to 4,554 SNPs that were further analyzed. On these, we performed a binomial test to assess significant allelic imbalance (nominal P value of 5%), yielding 751 potential ASBs (37 at FDR 5%).

TF Motif Disruption Analysis. All downstream analyses were performed using R v. 3.5.0. We analyzed all 751 ASBs to check whether they were significantly impacting TF motifs using atSNP v.1.0 (51) with the BSgenome.Hsapiens.UCSC.hg19 package as genome library and SNPlocs.Hsapiens.dbSNP144.GRCh37 package as SNP library. We tested 401 mononucleotide human core TF motifs that were downloaded from HOCOMOCO v11 (37).

ABC. ABC was assessed for all 751 ASBs, using a linear regression between the motif disruption/creation log likelihood ratio computed by atSNP (see Methods) and the fold change between the ASB ref and alt counts. This analysis was performed for all SNPs significantly disrupting any of the 401 motifs analyzed by atSNP at FDR 5%, and allowed us to find which ASBs were concordant with motif disruption across several tested SNPs.

Data Availability. All data have been deposited on ArrayExpress (E-MTAB-7179, E-MTAB-7150, E-MTAB-7186, E-MTAB-8553, E-MTAB-8551).

ACKNOWLEDGMENTS. This work has been supported by funds from the Swiss National Science Foundation (Grants 31003A_162735 and CRSII3_147684), by SystemsX.ch Special Opportunity Project 2015/323, by institutional support from the Ecole Polytechnique Fédérale de Lausanne, and by the Innosuisse Grant 38445.1 IP-LS.
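As a rough illustration of the allelic-imbalance test described under Allele-Specific Binding, the sketch below computes an exact two-sided binomial P value for the reference and alternate read counts of one SNP (null hypothesis: both alleles are equally likely to be read) and adjusts a list of P values for multiple testing. The Benjamini-Hochberg procedure is an assumption, since the text reports an FDR threshold without naming the correction method. The function names and the example read counts (11 versus 1) are likewise illustrative and are not taken from the authors' code or data.

```python
from math import comb


def binomial_two_sided(ref_reads, alt_reads):
    """Exact two-sided binomial P value under equal allele use (p = 0.5)."""
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0
    k = min(ref_reads, alt_reads)
    # With p = 0.5 the distribution is symmetric, so double the lower tail.
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * lower_tail)


def benjamini_hochberg(p_values):
    """Return BH-adjusted P values in input order (assumed FDR method)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Illustrative counts: 11 reads on one allele, 1 on the other.
    p = binomial_two_sided(11, 1)
    print(f"{p:.3g}")  # 26/4096, about 0.00635
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

With 12 reads split 11 to 1, the exact test gives 26/4096, which is the same magnitude as the P value of 6.35E-3 quoted for rs6912511 in Fig. 5D; whether those are the actual read counts behind the figure is not stated in the text.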

1. D. S. Johnson, A. Mortazavi, R. M. Myers, B. Wold, Genome-wide mapping of in vivo 22. C. D. Chabbert et al., A high-throughput ChIP-Seq for large-scale chromatin studies. protein-DNA interactions. Science 316, 1497–1502 (2007). Mol. Syst. Biol. 11, 777 (2015). 2. T. S. Furey, ChIP-seq and beyond: New and improved methodologies to detect and 23. S. Ma, Y.-P. Hsieh, J. Ma, C. Lu, Low-input and multiplexed microfluidic assay reveals characterize protein-DNA interactions. Nat. Rev. Genet. 13, 840–852 (2012). epigenomic variation across cerebellum and prefrontal cortex. Sci. Adv. 4, eaar8187

(2018).
3. S. G. Landt et al., ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia. Genome Res. 22, 1813–1831 (2012).
4. S. M. Waszak et al., Population variation and genetic control of modular chromatin architecture in humans. Cell 162, 1039–1050 (2015).
5. B. Deplancke, D. Alpern, V. Gardeux, The genetics of transcription factor DNA binding variation. Cell 166, 538–554 (2016).
6. F. Grubert et al., Genetic control of chromatin states in humans involves local and distal chromosomal interactions. Cell 162, 1051–1065 (2015).
7. B. Lehner, Genotype to phenotype: Lessons from model organisms for human genetics. Nat. Rev. Genet. 14, 168–178 (2013).
8. F. W. Albert, L. Kruglyak, The role of regulatory variation in complex traits and disease. Nat. Rev. Genet. 16, 197–212 (2015).
9. S. A. Lambert et al., The human transcription factors. Cell 172, 650–665 (2018).
10. A. A. Kolodziejczyk, J. K. Kim, V. Svensson, J. C. Marioni, S. A. Teichmann, The technology and biology of single-cell RNA sequencing. Mol. Cell 58, 610–620 (2015).
11. J. D. Buenrostro et al., Single-cell chromatin accessibility reveals principles of regulatory variation. Nature 523, 486–490 (2015).
12. A. M. Klein et al., Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell 161, 1187–1201 (2015).
13. E. Z. Macosko et al., Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161, 1202–1214 (2015).
14. S. Kinkley et al., reChIP-seq reveals widespread bivalency of H3K4me3 and H3K27me3 in CD4+ memory T cells. Nat. Commun. 7, 12514 (2016).
15. A. Weiner et al., Co-ChIP enables genome-wide mapping of histone mark co-occurrence at single-molecule resolution. Nat. Biotechnol. 34, 953–961 (2016).
16. B. E. Bernstein et al., A bivalent chromatin structure marks key developmental genes in embryonic stem cells. Cell 125, 315–326 (2006).
17. A. D. Truax, S. F. Greer, ChIP and Re-ChIP assays: Investigating interactions between regulatory proteins, histone modifications, and the DNA sequences to which they bind. Methods Mol. Biol. 809, 175–188 (2012).
18. M. Furlan-Magaril, H. Rincón-Arano, F. Recillas-Targa, Sequential chromatin immunoprecipitation protocol: ChIP-reChIP. Methods Mol. Biol. 543, 253–266 (2009).
19. W. C. Gasper et al., Fully automated high-throughput chromatin immunoprecipitation for ChIP-seq: Identifying ChIP-quality p300 monoclonal antibodies. Sci. Rep. 4, 5152 (2014).
20. S. Aldridge et al., AHT-ChIP-seq: A completely automated robotic protocol for high-throughput chromatin immunoprecipitation. Genome Biol. 14, R124 (2013).
21. P. van Galen et al., A multiplexed system for quantitative comparisons of chromatin landscapes. Mol. Cell 61, 170–180 (2016).
24. A. Rotem et al., Single-cell ChIP-seq reveals cell subpopulations defined by chromatin state. Nat. Biotechnol. 33, 1165–1172 (2015).
25. C. Schmidl, A. F. Rendeiro, N. C. Sheffield, C. Bock, ChIPmentation: Fast, robust, low-input ChIP-seq for histones and transcription factors. Nat. Methods 12, 963–965 (2015).
26. B. Bernstein, H3K27ac ChIP-seq on human GM12878. ENCODE. https://www.encodeproject.org/experiments/ENCSR000AKC/. Accessed 11 May 2020.
27. B. Bernstein, H3K4me3 ChIP-seq on human GM12878. ENCODE. https://www.encodeproject.org/experiments/ENCSR000AKA/. Accessed 11 May 2020.
28. B. Bernstein, H3K4me1 ChIP-seq on human GM12878. ENCODE. https://www.encodeproject.org/experiments/ENCSR000AKF/. Accessed 11 May 2020.
29. B. Bernstein, H3K9me3 ChIP-seq on human GM12878. ENCODE. https://www.encodeproject.org/experiments/ENCSR000AOX/. Accessed 11 May 2020.
30. F. Spitz, E. E. M. Furlong, Transcription factors: From binding to developmental control. Nat. Rev. Genet. 13, 613–626 (2012).
31. G. Pan et al., Whole-genome analysis of histone H3 lysine 4 and lysine 27 methylation in human embryonic stem cells. Cell Stem Cell 1, 299–312 (2007).
32. C. Schertel et al., A large-scale, in vivo transcription factor screen defines bivalent chromatin as a key property of regulatory factors mediating Drosophila wing development. Genome Res. 25, 514–523 (2015).
33. I. Amit, Genome-wide characterization of histone mark co-occurrence at single molecule resolution. Omnibus. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE83833. Accessed 11 May 2020.
34. 1000 Genomes Project Consortium, A map of human genome variation from population-scale sequencing. Nature 467, 1061–1073 (2010).
35. S. Jiang et al., Epstein-Barr virus nuclear antigen 3C binds to BATF/IRF4 or SPI1/IRF4 composite sites and recruits Sin3A to repress CDKN2A. Proc. Natl. Acad. Sci. U.S.A. 111, 421–426 (2014).
36. H. Kilpinen et al., Coordinated effects of sequence variation on DNA binding, chromatin structure, and transcription. Science 342, 744–747 (2013).
37. I. V. Kulakovskiy et al., HOCOMOCO: Towards a complete collection of transcription factor binding models for human and mouse via large-scale ChIP-seq analysis. Nucleic Acids Res. 46, D252–D259 (2018).
38. H. Yan, S. Tian, S. L. Slager, Z. Sun, ChIP-seq in studying epigenetic mechanisms of disease and promoting precision medicine: Progresses and future directions. Epigenomics 8, 1239–1258 (2016).
39. D. L. Northrup, K. Zhao, Application of ChIP-Seq and related techniques to the study of immune function. Immunity 34, 830–842 (2011).

40. P. J. Skene, S. Henikoff, An efficient targeted nuclease strategy for high-resolution mapping of DNA binding sites. eLife 6, e21856 (2017).
41. H. S. Kaya-Okur et al., CUT&Tag for efficient epigenomic profiling of small samples and single cells. Nat. Commun. 10, 1930 (2019).
42. Z. Swank, N. Laohakunakorn, S. J. Maerkl, Cell-free gene-regulatory network engineering with synthetic transcription factors. Proc. Natl. Acad. Sci. U.S.A. 116, 5892–5901 (2019).
43. A. M. Streets et al., Microfluidic single-cell whole-transcriptome sequencing. Proc. Natl. Acad. Sci. U.S.A. 111, 7048–7053 (2014).
44. Z. Cao, C. Chen, B. He, K. Tan, C. Lu, A microfluidic device for epigenomic profiling using 100 cells. Nat. Methods 12, 959–962 (2015).
45. G. X. Y. Zheng et al., Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 14049 (2017).
46. T. S. Mikkelsen et al., Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature 448, 553–560 (2007).
47. A. Isakova et al., SMiLE-seq identifies binding motifs of single and dimeric transcription factors. Nat. Methods 14, 316–322 (2017).
48. A. Dobin et al., STAR: Ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
49. L. Wang, S. Wang, W. Li, RSeQC: Quality control of RNA-seq experiments. Bioinformatics 28, 2184–2185 (2012).
50. S. Heinz et al., Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell 38, 576–589 (2010).
51. C. Zuo, S. Shin, S. Keleş, atSNP: Transcription factor binding affinity testing for regulatory SNP detection. Bioinformatics 31, 3353–3355 (2015).

13838 | www.pnas.org/cgi/doi/10.1073/pnas.1913261117 Dainese et al.