Integrative Genomic Analysis of Hnrnp L Splicing Regulation
Total Page:16
File Type:pdf, Size:1020Kb
University of Pennsylvania ScholarlyCommons Publicly Accessible Penn Dissertations 2015 Terrae Incognitae: Integrative Genomic Analysis of Hnrnp L Splicing Regulation Brian Sebastian Cole University of Pennsylvania, [email protected] Follow this and additional works at: https://repository.upenn.edu/edissertations Part of the Allergy and Immunology Commons, Biochemistry Commons, Bioinformatics Commons, Immunology and Infectious Disease Commons, and the Medical Immunology Commons Recommended Citation Cole, Brian Sebastian, "Terrae Incognitae: Integrative Genomic Analysis of Hnrnp L Splicing Regulation" (2015). Publicly Accessible Penn Dissertations. 1664. https://repository.upenn.edu/edissertations/1664 This paper is posted at ScholarlyCommons. https://repository.upenn.edu/edissertations/1664 For more information, please contact [email protected]. Terrae Incognitae: Integrative Genomic Analysis of Hnrnp L Splicing Regulation Abstract Alternative splicing is a critical component of human gene control that generates functional diversity from a limited genome. Defects in alternative splicing are associated with disease in humans. Alternative splicing is regulated developmentally and physiologically by the combinatorial actions of cis- and trans- acting factors, including RNA binding proteins that regulate splicing through sequence-specific interactions with pre-mRNAs. In T cells, the splicing regulator hnRNP L is an essential factor that regulates alternative splicing of physiologically important mRNAs, however the broader physical and functional impact of hnRNP L remains unknown. In this dissertation, I present analysis of hnRNP L-RNA interactions with CLIP-seq, which identifies transcriptome-wide binding sites and uncovers novel functional targets. I then use functional genomics studies to define pre-mRNA processing alterations induced by hnRNP L depletion, chief among which is cassette-type alternative splicing. Finally, I use integrative genomic analysis to identify a direct role for hnRNP L in repression of exon inclusion and an indirect role for enhancing exon inclusion that supports a novel regulatory interplay between hnRNP L and chromatin. In two appendices, I present CLIP-seq studies of two additional RNA binding proteins: the splicing factor CELF2 and the RNA helicase DDX17. In conclusion, I provide comparisons of these three CLIP-seq studies, providing high-level insights into the capabilities and limitations of CLIP-seq. In sum, this dissertation expands our knowledge of hnRNP L splicing control in the context of broader studies of RNA binding proteins in multiple cell types and conditions. Degree Type Dissertation Degree Name Doctor of Philosophy (PhD) Graduate Group Cell & Molecular Biology First Advisor Kristen W. Lynch Keywords Alternative splicing, CLIP-seq, hnRNP L, integrative genomics, RNA-seq, T cells Subject Categories Allergy and Immunology | Biochemistry | Bioinformatics | Immunology and Infectious Disease | Medical Immunology This dissertation is available at ScholarlyCommons: https://repository.upenn.edu/edissertations/1664 TERRAE INCOGNITAE: INTEGRATIVE GENOMIC ANALYSIS OF hnRNP L SPLICING REGULATION Brian Sebastian Cole A DISSERTATION in Cell and Molecular Biology Presented to the Faculties of the University of Pennsylvania in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy 2015 Supervisor of Dissertation ______________ Kristen W. Lynch, Ph.D. Professor of Biochemistry and Biophysics Graduate Group Chairperson _________________ Daniel S. Kessler, Ph.D., Associate Professor of Cell and Developmental Biology Dissertation Committee Russ P. Carstens, M.D., Associate Professor of Medicine Brian D. Gregory, Ph.D., Assistant Professor of Biology Yoseph Barash, Ph.D., Assistant Professor Genetics Stephen A. Liebhaber, M.D., Professor of Genetics and Medicine TERRAE INCOGNITAE: INTEGRATIVE GENOMIC ANALYSIS OF hnRNP L SPLICING REGULATION COPYRIGHT © 2015 Brian Sebastian Cole Permission is granted to copy, distribute, and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with the Invariant Sections being DEDICATION and ACKNOWLEDGEMENTS, with the Front-Cover Texts being TITLE and COPYRIGHT, and with no Back-Cover Texts. To view a copy of this license, visit: https://www.gnu.org/licenses/fdl.html DEDICATION I dedicate this dissertation to my thesis advisor, Kristen Lynch. Her wisdom and strength of character will be with me always. iii ACKNOWLEDGMENT I acknowledge the many who have supported me in my research, both technically and personally. I’d like to thank the legions of open source programmers who have worked for Freedom. Without open source programs, I’d never have been able to take the journey I’ve taken over the past five years. I’d like to thank my family and friends for all their support over the past five years: my mom, my dad, my brother, Leksa, Mike, Ian, Roland, Sheila and Dan, Matt, the Venez Voir crew, Kyle and Andy and everybody at PBG, Rohit and Christina and everybody at WHIT, Craig and Charles over at Penn Health, Jordan and Bren and Luigi and the Data Incubator crew from New York City, Matt and the Beer Club, Mr. Nap the mathiest lawyer who ever drummed sevens and fives, and everybody else who has kept me afloat. iv ABSTRACT TERRAE INCOGNITAE: INTEGRATIVE GENOMIC ANALYSIS OF hnRNP L SPLICING REGULATION Brian Sebastian Cole Kristen W. Lynch, Ph.D. Alternative splicing is a critical component of human gene control that generates functional diversity from a limited genome. Defects in alternative splicing are associated with disease in humans. Alternative splicing is regulated developmentally and physiologically by the combinatorial actions of cis- and trans-acting factors, including RNA binding proteins that regulate splicing through sequence-specific interactions with pre-mRNAs. In T cells, the splicing regulator hnRNP L is an essential factor that regulates alternative splicing of physiologically important mRNAs, however the broader physical and functional impact of hnRNP L remains unknown. In this dissertation, I present analysis of hnRNP L-RNA interactions with CLIP-seq, which identifies transcriptome-wide binding sites and uncovers novel functional targets. I then use functional genomics studies to define pre-mRNA processing alterations induced by hnRNP L depletion, chief among which is cassette-type alternative splicing. Finally, I use integrative genomic analysis to identify a direct role for hnRNP L in repression of exon inclusion and an indirect role for enhancing exon inclusion that supports a novel regulatory interplay between hnRNP L and chromatin. In two appendices, I present CLIP-seq studies of two additional RNA binding proteins: the splicing factor CELF2 and the RNA helicase DDX17. In conclusion, I provide comparisons of these three CLIP-seq studies, providing high-level insights into the capabilities and limitations of CLIP-seq. In v sum, this dissertation expands our knowledge of hnRNP L splicing control in the context of broader studies of RNA binding proteins in multiple cell types and conditions. vi TABLE OF CONTENTS ACKNOWLEDGMENT .............................................................................................................IV ABSTRACT.................................................................................................................................. V LIST OF TABLES ....................................................................................................................... X LIST OF FIGURES.....................................................................................................................XI INTRODUCTION ....................................................................................................................... 1 The spliceosome catalyzes splicing through serial assembly and rearrangements ............. 1 Regulation of splicing is achieved by RNA binding proteins................................................... 2 hnRNP L is an essential regulator of alternative splicing in T cells ........................................ 6 Technologies for transcriptome-wide investigation of RNA binding proteins ..................... 10 Functional studies of splicing factors with next-generation sequencing reveal splicing regulatory activity....................................................................................................................... 11 MATERIALS AND METHODS...............................................................................................14 CLIP-seq read processing and alignment ................................................................................ 14 CLIP-seq binding site definition: peak calling ......................................................................... 17 CLIP-seq motif enrichment analysis ......................................................................................... 21 CLIP-seq peak overlap comparison.......................................................................................... 23 Gene ontology analysis.............................................................................................................. 24 Splicing analysis from RASL-seq data ..................................................................................... 24 Splicing analysis from mRNA-seq data.................................................................................... 25 Stringent union of