Computational Analysis of DNA Interactions to Investigate the Spatial Organization of Chromatin
UNIVERSITY OF CALIFORNIA, SAN DIEGO

Computational Analysis of DNA Interactions to Investigate the Spatial Organization of Chromatin

A dissertation submitted in partial satisfaction of the requirements for the degree Doctor of Philosophy in Chemistry

by

Dario Meluzzi

Committee in charge:
Professor Gaurav Arya, Chair
Professor Ulrich Müller, Co-Chair
Professor Gouri Ghosh
Professor Clifford Kubiak
Professor Douglas Smith

2013

Copyright Dario Meluzzi, 2013
All rights reserved.

The dissertation of Dario Meluzzi is approved, and it is acceptable in quality and form for publication on microfilm and electronically:

Co-Chair
Chair

University of California, San Diego
2013

DEDICATION

To my parents.

TABLE OF CONTENTS

Signature Page
Dedication
Table of Contents
List of Figures
List of Tables
Acknowledgements
Vita
Abstract of the Dissertation

1 Introduction
  1.1 The spatial organization of chromatin
  1.2 Chromatin organization and cellular processes
  1.3 Experimental methods for studying higher-order chromatin organization
  1.4 Hi-C experiments
  1.5 Analysis of data from Hi-C experiments
    1.5.1 Short reads from Hi-C experiments
    1.5.2 Short read alignment
    1.5.3 CPs from aligned reads
    1.5.4 Conformations from CPs
    1.5.5 CPs from polymer simulations
    1.5.6 Distances from FISH experiments
    1.5.7 Validation of recovered conformations
    1.5.8 Analysis of recovered conformations
    1.5.9 Validation of estimated CPs
    1.5.10 Short reads from Hi-C simulations
    1.5.11 Parameters for Hi-C simulations
  1.6 Outline of the following chapters
  1.7 References

2 Recovering ensembles of chromatin conformations from contact probabilities
  2.1 Abstract
  2.2 Introduction
  2.3 Methods
    2.3.1 Coarse-grained polymer model of chromatin
    2.3.2 Generation of conformation ensembles
    2.3.3 Refinement of model parameters
  2.4 Results and Discussion
    2.4.1 Test systems
    2.4.2 Method validation
  2.5 Conclusion
    2.5.1 Appendix: The LMS algorithm
  2.6 Acknowledgments
  2.7 References

3 Efficient estimation of contact probabilities from inter-bead distance distributions of simulated polymer chains
  3.1 Abstract
  3.2 Introduction
  3.3 Methods
    3.3.1 Contact probabilities from inter-bead distance distributions
    3.3.2 The extended generalized lambda distribution
    3.3.3 Method of moments
    3.3.4 Newton-Raphson method to solve for EGLD shape parameters
    3.3.5 Quadtrees to search for initial solutions
    3.3.6 Bead-chain simulations
    3.3.7 Fractional errors in estimated CPs
  3.4 Results
    3.4.1 Quadtrees of initial solutions for EGLD shape parameters
    3.4.2 CPs estimated for free bead-chains
    3.4.3 CPs estimated for restrained bead-chains
    3.4.4 Errors in the estimated CPs
  3.5 Discussion
  3.6 Acknowledgments
  3.7 Appendix
    3.7.1 Non-central moments of the GLD
    3.7.2 Central moments of the GLD
    3.7.3 Jacobian to solve for GLD shape parameter
    3.7.4 Moments of the GBD
    3.7.5 Jacobian to solve for GBD shape parameters
  3.8 References

4 Quantification of cleavage specificity in Hi-C experiments
  4.1 Abstract
  4.2 Introduction
  4.3 Methods
    4.3.1 Alignment of reads
    4.3.2 Distribution of apparent fragment lengths
    4.3.3 Target sites
    4.3.4 Local site distributions
    4.3.5 Estimation of cleavage fractions
    4.3.6 Simulated Hi-C fragments
    4.3.7 Cleavage fractions from experimental Hi-C data sets
    4.3.8 Relating cleavage probabilities to cleavage fractions
  4.4 Results
    4.4.1 Apparent distribution of fragment lengths
    4.4.2 Local distributions of target site instances depend on cleavage fractions
    4.4.3 Estimation of cleavage fractions from Hi-C simulations on a random reference genome
    4.4.4 Estimation of cleavage fractions from simulations on chr19 of mm10
    4.4.5 Cleavage fractions estimated from experimental Hi-C data
  4.5 Discussion
  4.6 Acknowledgments
  4.7 References

5 Biophysics of knotting
  5.1 Abstract
  5.2 Introduction
  5.3 Types of knots
  5.4 Knotting in biophysical systems
  5.5 Probabilities of knotting
  5.6 Features of knotted system
    5.6.1 Size of knots and knotted systems
    5.6.2 Knot localization
    5.6.3 Strength and stability of knotted systems
  5.7 Dynamic processes involving knots
    5.7.1 Knot diffusion
    5.7.2 Electrophoresis
    5.7.3 Unknotting
  5.8 Conclusion
  5.9 Acknowledgments
  5.10 References

6 Computational prediction of efficient splice sites for trans-splicing ribozymes
  6.1 Abstract
  6.2 Introduction
  6.3 Materials and Methods
    6.3.1 Prediction of binding free energy change
    6.3.2 Ribozymes
    6.3.3 Substrates
    6.3.4 Trans-splicing reactions
    6.3.5 Confirmation of specific trans-splicing products
    6.3.6 Trans-tagging assay
    6.3.7 Ribozyme transcription efficiency and 3′-exon loss
  6.4 Results
    6.4.1 Calculation of binding free energy change for trans-splicing ribozymes
    6.4.2 Splice sites chosen on a model mRNA substrate, CAT mRNA
    6.4.3 Experimental trans-splicing efficiencies on CAT mRNA
    6.4.4 Experimental trans-splicing efficiency with unstructured RNA substrates
    6.4.5 Computed energetic contributions to trans-splicing efficiency
    6.4.6 Trans-tagging assay on CAT mRNA
    6.4.7 Possible biases of the trans-tagging assay
  6.5 Discussion
  6.6 Acknowledgments
  6.7 References

7 Concluding remarks
  7.1 Summary
  7.2 Directions for future research
    7.2.1 Conformation ensembles from CPs