Structure Determination of Genomic Domains by Satisfaction of Spatial Restraints
Total Page:16
File Type:pdf, Size:1020Kb
Chromosome Research DOI 10.1007/s10577-010-9167-2 Structure determination of genomic domains by satisfaction of spatial restraints Davide Baù & Marc A. Marti-Renom # Springer Science+Business Media B.V. 2010 Abstract The three-dimensional (3D) architecture of a Keywords chromosome conformation . chromatin genome is non-random and known to facilitate the structure . integrative modeling . computational spatial colocalization of regulatory elements with the structural biology genes they regulate. Determining the 3D structure of a genome may therefore probe an essential step in Abbreviations characterizing how genes are regulated. Currently, there 1D One-dimensional are several experimental and theoretical approaches that 3D Three-dimensional aim at determining the 3D structure of genomes and 3C Chromosome conformation capture genomic domains; however, approaches integrating 4C Circular chromosome conformation experiments and computation to identify the most likely capture 3D folding of a genome at medium to high resolutions 5C Chromosome conformation capture have not been widely explored. Here, we review carbon copy existing methodologies and propose that the integrative Bp, Kb, Mb Base pairs, kilo base pairs, modeling platform (http://www.integrativemodeling. mega base pairs org), a computational package developed for structur- ChIP Chromatin immunoprecipitation ally characterizing protein assemblies, could be used DamID DNA adenine methyltransferase for integrating diverse experimental data towards the identification determination of the 3D architecture of genomic FISH Fluorescence in situ hybridization domains and entire genomes at unprecedented resolu- IMP Integrative modeling platform tion. Our approach, through the visualization of LMA Ligation-mediated amplification looping interactions between distal regulatory ele- nm Nanometer ments, will allow for the characterization of global NMR Nuclear magnetic resonance chromatin features and their relation to gene expres- PCR Polymerase chain reaction sion. We illustrate our work by outlining the recent RMSD Root mean square deviation determination of the 3D architecture of the α-globin domain in the human genome. Introduction D. Baù : M. A. Marti-Renom (*) Structural Genomics Unit, Bioinformatics and Genomics The three-dimensional (3D) architecture of a genome Department, Centro de Investigación Príncipe Felipe, facilitates the colocalization in space of sequentially Av. Autopista del Saler, 16, 46012 Valencia, Spain distant loci, which is essential for carrying out their e-mail: [email protected] specific functions (Takizawa et al. 2008). The D. Baù, M.A. Marti-Renom determination of the 3D architecture of whole genomes review the modeling of the 3D architecture of the or genomic domains is deemed necessary for character- α-globin domain in chromosome 16 of the human izing how genes are regulated. Unfortunately, existing genome, which has been recently described in detail light microscopy technologies like fluorescence in situ (Baù et al. 2010). This review ends by summarizing hybridization (FISH), although very informative, are not our thoughts on future directions for the structure yet sufficient to provide high-resolution information on determination of genomic domains. chromatin loops and the interactions between distal DNA segments. Therefore, an integrative and general approach for determining the spatial organization of Chromatin interaction maps for 3D modeling chromatin may prove very useful not only for identifying long-range relationships between genes and distant 3C-based methods allow the investigation of the regulatory elements but also for elucidating chromatin overall spatial organization of genomic domains, higher-order folding principles. chromosomes, and entire genomes (Miele and Dekker Previously, chromatin conformation has been 2009). Briefly, 3C-based methods rely on formalde- modeled, for example, using polymer physics hyde cross-linking between spatially close DNA loci (Dekker et al. 2002;Mateos-Langeraketal.2009) through their corresponding bound proteins. Cross- and molecular dynamics (Wedemann and Langowski linked DNA is then digested with a specific restriction 2002). Such methods rely on physics-based approaches enzyme and ligated under very diluted conditions that and only partially leverage the current wealth of favor intramolecular (i.e., the cross-linked fragments) experimental data on chromatin folding; however, they over intermolecular ligation. Cross-linking reversal have proven to be very valuable for understanding and ligation product quantification by the polymerase general features of chromatin fibers, including flexibility, chain reaction (PCR) using locus-specific primers compaction, and unpacking (Dekker 2008;Wachsmuth returns the frequencies at which interactions occur. et al. 2008;LangowskiandHeermann2007;Rosaand Given that the original 3C technique (Dekker et al. Everaers 2008). This special issue of the Chromosome 2002) was only applicable to single pairwise loci at a Research journal includes several such approaches time and within a relatively small DNA region (up to (Fritsch and Langowski 2010;Mirny2010). hundreds of kilo base pairs), there has been a plethora Higher resolution experimental techniques such as of new approaches that expanded 3C. For example, chromosome conformation capture (3C)-based the so-called circular chromosome conformation approaches (Dekker et al. 2002; Lieberman-Aiden et capture (4C) techniques allow for the characterization al. 2009; Dostie et al. 2007; Zhao et al. 2006; Simonis of a genomic domain by a one-to-many loci interac- et al. 2006) have prompted the development of new tion analysis. Those approaches take advantage of the hybrid methods that aim at integrating experiments fact that most of the 3C ligation products are already and computation for identifying the most likely 3D circular or can be easily circularized and then folding of a chromatin domain at medium to high inversely amplified by PCR. Four different laborato- resolutions. The 3C-driven approaches, in combi- ries developed 4C technologies in parallel, which nation with computational modeling, have so far differ in the restriction enzymes used, the step at been capable of generating medium resolution which the circular DNA is formed and the analysis of models of the topological conformation of the the amplified fragments. Such methods include HoxA cluster (Fraser et al. 2009), the α-globin circular 3C (Zhao et al. 2006), 3C-on-chip (Simonis domain (Baù et al. 2010), and the yeast genome et al. 2006), open-ended 3C (Wurtele and Chartrand (Duan et al. 2010). 2006), and “olfactory receptor” 3C (Lomvardas et al. This review starts by describing existing 3C-based 2006). More recently, the 3C carbon copy technology methodologies for 3D determination of genomic (5C) was developed to allow the simultaneous and domains and proposes that the integrative modeling parallel detection of interactions within relatively platform (IMP, http://www.integrativemodeling.org) large genomic domains or even entire chromosomes (Alber et al. 2007a) can be used for integrating such (Dostie and Dekker 2007; Dostie et al. 2006). In 5C, experimental data towards the determination of their the PCR step of 3C is replaced by ligation-mediated 3D architecture at unprecedented resolutions. We then amplification (LMA) followed by the detection of Structure determination of genomic domains ligation products. With LMA, it is possible to use (Duan et al. 2010), using a similar approach, built 3D simultaneously thousands of primers, allowing the models of the entire yeast genome coupling 4C parallel detection of millions of chromatin interac- (Simonis et al. 2006) with massively parallel sequenc- tions. Thus, the generated 5C library is an “amplified” ing. Interaction frequencies calculated by 4C were version of the 3C library that can be analyzed by converted into distances upon the assumption that microarray analysis or deep sequencing. Since 5C polymer packing determines intrachromosomal distan- experiments are designed so that 5C primers anneal ces (Bystricky et al. 2004). In particular, interaction across the 3C ligation products, the specific design of frequencies were translated into nuclear distances by the 5C primers allows interrogating multiple pairwise assigning 130 bp of packed chromatin to a length of loci interactions in an all-against-all fashion (Lajoie et 1 nm. Each chromosome was then represented as a al. 2009). Such experiments are thus suitable for series of beads of 10 Kb which were assigned to the generating complete and extensive loci interaction closest restriction enzyme fragment resulting from the matrices for large genomic regions (up to a few mega 4C experiment. Finally, a 3D model of the entire yeast base pairs). Finally, the Hi-C technology, which genome were constructed by minimizing an objective allows for an unbiased genome-wide analysis, was function that scored the fitting of all bead distances to recently developed to overcome the need for prede- the theoretical distances as derived from the 4C fining a set of target loci to investigate (van Berkum experiments. et al. 2010). Hi-C’s key step is the imprinting of As outlined above, 3C-based experimental data about ligated products (i.e., products of interacting loci) the structure of genomic domains can only be translated with a biotin marker that later on is precipitated with into a 3D model via computational methods. Next, we streptavidin beads. Such a step allows for specifically