Unraveling Transcript-Based Variability of Host Responses to Tuberculosis
Total Page:16
File Type:pdf, Size:1020Kb
Unraveling transcript-based variability of host responses to Tuberculosis Dissertation zur Erlangung des akademischen Grades Doctor of Philosophy (Ph. D.) im Fach Biologie eingereicht an der Lebenswissenschaftlichen Fakultät der Humboldt-Universität zu Berlin von Teresa Domaszewska (M. Sc.) Durchgeführt am Max-Planck-Institut für Infektionsbiologie in der Abteilung Immunologie Präsidentin der Humboldt-Universität zu Berlin Prof. Dr.-Ing. Dr. Sabine Kunst Dekan der Lebenswissenschaftlichen Fakultät Prof. Dr. Bernhard Grimm Gutachter/innen: 1. Stefan H. E. Kaufmann 2. Barbara Broeker 3. Arturo Zychlinsky Tag der mündlichen Prüfung: 10.01.2019 DECLARATION I hereby declare that I completed the doctoral thesis independently based on the stated resources and aids. I have not applied for a doctoral degree elsewhere and do not have a corresponding doctoral degree. I have not submitted the doctoral thesis, or parts of it, to another academic institution and the thesis has not been accepted or rejected. I declare that I have acknowledged the Doctoral Degree Regulations which underlie the procedure of the Faculty of Life Sciences of Humboldt-Universität zu Berlin, as amended on 5 th March 2015. Furthermore, I declare that no collaboration with commercial doctoral degree supervisors took place, and that the principles of Humboldt-Universität zu Berlin for ensuring good academic practice were abided by. The work presented here has been performed under direct supervision of Dr. January Weiner (Max Planck Institute for Infection Biology, Department of Immunology), to whom I am immensely thankful for guiding me during the years of my doctoral studies. TABLE OF CONTENTS LIST OF FIGURES .............................................................................................................................................. 1 LIST OF TABLES ................................................................................................................................................ 4 LIST OF ABBREVIATIONS .............................................................................................................................. 5 ABSTRACT (ENGLISH) ..................................................................................................................................... 6 ABSTRACT (GERMAN) ..................................................................................................................................... 7 1. CHAPTER 1: INTRODUCTION AND GOALS ................................................................................... 9 1.1. TUBERCULOSIS ................................................................................................................................................ 10 1.1.1. Epidemiology of TB ................................................................................................................................ 10 1.1.2. Transmission ............................................................................................................................................. 11 1.1.3. Prevention of TB ...................................................................................................................................... 11 1.1.4. Symptoms and diagnosis of TB .......................................................................................................... 12 1.2. TRANSCRIPTOME STUDIES IN TB ................................................................................................................. 16 1.2.1. RNA expression ........................................................................................................................................ 16 1.2.2. Methods of RNA detection and quantification............................................................................. 17 1.2.3. Whole blood transcriptomic biosignatures .................................................................................. 19 1.2.4. Machine learning in biomarker discovery .................................................................................... 20 1.2.5. Unsupervised Machine Learning – Principal Component Analysis ..................................... 22 1.2.6. Supervised Machine Learning – Random Forest ........................................................................ 23 1.2.7. Approaches to identify diagnostic TB biomarkers in published studies ............................ 24 1.2.8. Approaches to identify prognostic TB biomarkers in cohorts ............................................... 30 1.2.9. Approaches to identify universal TB biomarkers in multi-cohort studies ........................ 31 Insights missing in the multi-cohort studies ...............................................................................................................31 1.3. VARIOUS FACTORS INFLUENCE MTB INFECTION PROGRESS .................................................................... 32 1.3.1. Variability in the Mtb infection outcomes ..................................................................................... 32 1.3.2. Complexity of the immune response to TB .................................................................................... 33 1.3.3. Interferon signaling pathways in TB ............................................................................................... 34 1.4. THE ROLE OF MOUSE MODEL IN UNDERSTANDING HUMAN IMMUNE RESPONSE IN TB ....................... 37 1.4.1. Mouse models of TB ................................................................................................................................ 37 1.4.2. Mouse models have advanced the understanding of human TB .......................................... 38 1.4.3. Murine models of TB: 129S2 and C57BL/6 ................................................................................... 39 1.4.4. Challenges related to the use of animal models .......................................................................... 40 1.5. GENE SET ENRICHMENT ANALYSIS REVEALS THE BIOLOGY BEHIND TRANSCRIPTOMIC PROFILES ... 41 1.6. MOTIVATION ................................................................................................................................................... 44 2. CHAPTER 2: METHODOLOGY ......................................................................................................... 45 2.1. OVERVIEW ....................................................................................................................................................... 46 2.2. DATA ACQUISITION ......................................................................................................................................... 46 2.2.1. Acquisition of publicly available datasets for TB multi-cohort analysis ........................... 47 2.2.2. Acquisition of publicly available sepsis datasets for the validation of methods ............ 49 2.2.3. Acquisition of GEO datasets for the comparison of mouse and human ............................. 49 2.2.4. Mice and Mtb infection ......................................................................................................................... 50 2.2.5. Blood collection and RNA isolation .................................................................................................. 50 2.2.6. Blood microarrays .................................................................................................................................. 51 2.2.7. Acquisition of THP1 data ..................................................................................................................... 51 2.2.8. Macrophage RNA microarrays .......................................................................................................... 51 2.3. DATA NORMALIZATION .................................................................................................................................. 51 2.3.1. Data preprocessing ................................................................................................................................ 51 2.3.2. Data normalization for multi-cohort analysis ............................................................................ 52 2.4. DIFFERENTIAL EXPRESSION CALCULATION ................................................................................................ 53 2.5. GSEA FOR INDIVIDUAL PATIENTS................................................................................................................ 53 2.6. DEFINITION OF IFN TYPE I AND IFN TYPE II MODULES .......................................................................... 54 2.7. IDENTIFICATION OF IFN+ AND IFN- PATIENTS ........................................................................................ 55 2.8. LOGISTIC REGRESSION ................................................................................................................................... 55 2.9. IDENTIFICATION OF CONCORDANT AND DISCORDANT GENES BETWEEN IFN+ AND IFN- TB PATIENTS ................................................................................................................................................................. 55 2.10. CYTOKINE CONCENTRATIONS IN BLOOD OF IFN I+ AND IFN I- INDIVIDUALS ....................................... 55 2.11. CORRELATION BETWEEN IFN STATUS AND DISEASE SEVERITY .............................................................. 56 2.12. MACHINE LEARNING METHODS ...................................................................................................................