Computational Analysis of Transcriptional Responses to the Activin Signal
Dissertation
submitted for the academic degree of Doctor rerum naturalium (Dr. rer. nat.) in the subject of Biophysics
to the Faculty of Life Sciences of Humboldt-Universität zu Berlin
by Dan Shi, M.Sc.

President of Humboldt-Universität zu Berlin: Prof. Dr.-Ing. Dr. Sabine Kunst
Dean of the Faculty of Life Sciences of Humboldt-Universität zu Berlin: Prof. Dr. Bernhard Grimm
Reviewers: Prof. Dr. Dr. h. c. Edda Klipp, Dr. Jana Wolf, Dr. Zhike Zi
Date of the oral examination: 25.08.2020

Abstract

Transforming growth factor-β (TGF-β) signaling pathways play a crucial role in cell proliferation, migration, and apoptosis through the activation of Smad proteins. Research has shown that the biological effects of the TGF-β signaling pathway depend strongly on the cellular context. In this thesis, I aimed to understand how TGF-β signaling can regulate target genes differently, how the TGF-β signal induces different gene expression dynamics, and what role Smad proteins play in differentiating the expression profiles of target genes. I focused on the transcriptional responses to the Nodal/Activin ligand, which is a member of the TGF-β superfamily and a key regulator of early embryonic development. Kinetic models were developed and calibrated with time course data of RNA polymerase II (Pol II) and Smad2 chromatin binding profiles for the target genes. Using the Akaike information criterion (AIC) to evaluate different kinetic models, we discovered that Nodal/Activin signaling regulates target genes via different mechanisms. In the Nodal/Activin-Smad2 signaling pathway, Smad2 plays different regulatory roles on different target genes. We show how Smad2 participates in regulating the transcription or degradation rate of each target gene separately.
Moreover, a set of features that can predict the transcription dynamics of target genes is selected by logistic regression. The approach presented here provides quantitative relationships between transcription factor dynamics and transcriptional responses. This work also provides a general computational framework for studying the transcriptional regulation of other signaling pathways.

Zusammenfassung (Summary)

The signaling pathways of transforming growth factor β (TGF-β) play a decisive role in cell proliferation, migration, and apoptosis through the activation of Smad proteins. Studies have shown that the biological effects of the TGF-β signaling pathway depend strongly on the cellular context. This work aimed to understand how TGF-β signals can regulate target genes differently, how TGF-β signals induce different dynamics of gene expression, and in what way Smad proteins contribute to different expression patterns of TGF-β target genes. The focus of this study is on the transcription-regulatory effects of the Nodal/Activin ligand, which belongs to the TGF-β superfamily and is an important factor in early embryonic development. To analyze these effects, I developed kinetic models and calibrated them with time course data of RNA polymerase II (Pol II) and Smad2 chromatin binding profiles for the target genes. Using the Akaike information criterion (AIC) to evaluate different kinetic models, we found that the Nodal/Activin signaling pathway regulates target genes via different mechanisms. In the Nodal/Activin-Smad2 signaling pathway, Smad2 plays different regulatory roles for different target genes. We show how Smad2 is involved in regulating the transcription or degradation rate of each target gene separately. In addition, a set of features that can predict the transcription dynamics of target genes is selected by logistic regression. The approach presented here provides quantitative relationships between the dynamics of the transcription factor and the transcriptional responses. This work also offers a general mathematical framework for studying the transcriptional regulation of other signaling pathways.

Acknowledgments

This thesis was completed under the guidance of Prof. Dr. Edda Klipp and Dr. Zhike Zi. I am deeply impressed by the rigorous scientific attitude and research spirit of my supervisors. They have given me a great deal of help and inspiration in my studies, research, and life. I thank my colleagues in the lab for their care and help; they provided a united and enthusiastic research environment for my studies. I would like to give special thanks to my family and friends. Without them, I could not have concentrated on scientific research. Their care and help have supported me all the way. This study was supported by a scholarship from the China Scholarship Council (CSC). Without the support of the CSC, this work would not have been possible. Finally, I would like to thank all the experts and professors for carefully reviewing my thesis.

Table of Contents

Abstract
Zusammenfassung
Acknowledgments
List of Figures
List of Tables
1 Background
  1.1 Overview
  1.2 Transcriptional regulation
    1.2.1 The central dogma of molecular biology
    1.2.2 Eukaryotic RNA production
    1.2.3 Post-transcriptional modification
    1.2.4 Eukaryotic mRNA degradation
    1.2.5 Eukaryotic transcription regulation
  1.3 The TGF-β and Nodal/Activin signaling pathways
    1.3.1 The TGF-β superfamily
    1.3.2 TGF-β receptors
    1.3.3 Smad proteins and canonical TGF-β signaling
    1.3.4 Non-canonical TGF-β signaling
    1.3.5 The Nodal/Activin signaling pathway
  1.4 Next-generation sequencing technologies
    1.4.1 Next-generation sequencing
    1.4.2 Overview of the Illumina sequencing method
    1.4.3 RNA sequencing
    1.4.4 ChIP-sequencing
    1.4.5 Next-generation sequencing data
    1.4.6 Bioinformatics for next-generation sequencing
  1.5 Computational modeling of transcription dynamics
2 Aims & Objectives
3 Materials & Methods
  3.1 Datasets
  3.2 Sequencing data analysis
    3.2.1 Data processing overview
    3.2.2 RNA-seq data processing
    3.2.3 ChIP-seq data processing
  3.3 Summary
4 Kinetic modeling of the transcriptional responses to the Activin signal
  4.1 Introduction