Comprehensive Characterization of the Transcriptional Signaling of Human Parturition Through Integrative Analysis of Myometrial Tissues and Cell Lines
Total Page:16
File Type:pdf, Size:1020Kb
COMPREHENSIVE CHARACTERIZATION OF THE TRANSCRIPTIONAL SIGNALING OF HUMAN PARTURITION THROUGH INTEGRATIVE ANALYSIS OF MYOMETRIAL TISSUES AND CELL LINES by ZACHARY STANFIELD Submitted in partial fulfillment of the requirements For the degree of Doctor of Philosophy Systems Biology and Bioinformatics CASE WESTERN RESERVE UNIVERSITY August, 2019 Case Western Reserve University School of Graduate Studies We hereby approve the thesis of ZACHARY STANFIELD candidate for the degree of Doctor of Philosophy1. Committee Chair Dr. Gurkan Bebek Department of Nutrition Committee Member, Adviser Dr. Mehmet Koyutürk Department of Computer Science and Electrical Engineering Committee Member Dr. Sam Mesiano Department of Obstetrics and Gynecology Committee Member Dr. Mark Cameron Department of Population and Quantitative Health Sciences Date of Defense May 10, 2019 1We certify that written approval has been obtained for any proprietary material contained therein. Acknowledgements I would first like to acknowledge my mentor, Dr. Mehmet Koyutürk. I knew from our first meeting that Dr. Koyutürk’s lab would be a great fit for me. He is a highly en- thusiastic and ambitious researcher as well as a relaxed, supportive, and understanding mentor. Dr. Koyutürk gave me a lot of freedom in my research but was always present to provide useful feedback and ideas on how to move forward. This allowed me the opportunity to progress in research directions I personally found both interesting and rewarding. I would also like to mention Dr. Sam Mesiano, who essentially served as my co-mentor throughout the latter part of my PhD. His extensive knowledge and experi- ence in the topic of my thesis research helped me progress quickly and better plan my project. Dr. Mesiano was always willing to offer advice and introduce me to potential collaborators at various research retreats and conferences. The amount of support and belief shown in me by both Dr. Koyutürk and Dr. Mesiano gave me the confidence and drive I needed throughout my PhD career. Graduate school is just as much a lifestyle as a step in one’s career path. Therefore it is important to build relationships with those around you and experience life outside of research. Thankfully I have been able to meet a lot of great people through school events like intramural sports and grad student functions, as well as while exploring Cleveland. Some of these relationships have lasted since the very beginning and will continue long after graduation. I thank each and every one of these individuals for their friendship and the memories we were able to create. Lastly, I would like to thank my family. My parents are extremely supportive of me in every aspect of my life and I am truly grateful for it. My brother also deserves much credit, having participated in countless conversations about every possible aspect of my iii personal and professional life throughout the years. Without my family’s continuous encouragement and guidance I would not be where I am today. iv Table of Contents Acknowledgements iii List of Tables ix List of Figuresx Comprehensive Characterization of the Transcriptional Signaling of Human Parturition through Integrative Analysis of Myometrial Tissues and Cell Lines xx Chapter 1. Introduction1 Overview of Human Parturition1 Preterm Labor2 Processes Controlling Human Parturition2 The Role of Transcriptomic Studies in Parturition Research4 The Role of Experimental Cell Lines in Parturition Research5 Hypotheses and Objectives6 1.6.1. Characterizing Tissue Gene Expression7 1.6.2. Characterizing Cell Line Gene Expression8 1.6.3. Determining the Agreement between Tissue and Cell Lines9 Chapter 2. Materials and Methods 11 Data 11 2.1.1. Tissue Data 11 2.1.2. Cell Line Data 13 Methods 15 v 2.2.1. Raw Data Processing, Calculating Differential Expression, and Study Concatenation 15 2.2.2. Classification 17 2.2.3. Singular Value Decomposition 18 2.2.4. Enrichment Analysis 19 2.2.5. Annotation of Vascular Smooth Muscle Contraction Pathway Diagram 22 2.2.6. Signaling Network Construction 23 2.2.7. Regulatory Network Analysis 25 2.2.8. Comparison of Myometrial Cell Line and Tissue 26 Chapter 3. Characterizing Transcriptional Alterations at the Onset of Labor in Myometrial Tissue 28 Differential Gene Expression in Quiescent and Laboring Myometrium 28 Within-study Classification 31 Cross-study Classification 36 Identification of Latent Transcriptional Regulation Patterns via Singular Value Decomposition 37 Global Identification of Pathways Altered During Parturition 41 Hypothesis-Driven Signature Identification 43 Building a Parturition Signaling Network 49 Summary 52 Chapter 4. Investigating Pathways Controlling Human Parturition in a Myometrial Cell Line 54 Progesterone Signaling 55 vi cAMP Signaling 56 IL-1¯ Signaling 59 Additional Insights from Single Stimulus Comparisons 60 Anti-inflammatory Actions of P4 and cAMP 62 Synergy Between P4 and cAMP 64 Summary 67 Chapter 5. Identifying Transcriptional Similarities of Human Myometrial Tissues and Cell Lines 68 Agreement of Transcriptional Alterations at the Gene, Pathway, and Network Levels 68 Assessing Similarity Using Tissue Similarity Index 71 Summary 74 Chapter 6. Discussion and Conclusions 76 Tissue Analysis 76 Cell Line Analysis 83 Tissue and Cell Line Comparison 86 Conclusions 87 Chapter 7. Suggested Future Research 89 Utilizing Clinical Data 89 Confirmational Studies for Validation of Findings 90 Myometrial Cell-Specific Signaling 91 Linking the Uterine Tissues 91 Appendix A. Supplementary Tables 93 vii Appendix. Complete References 130 viii List of Tables 2.1 Myometrial tissue gene expression studies analyzed in this chapter. Three gene expression datasets collected from publicly available sources or collaborators are shown. Sample groups and names are consistent with the acronyms listed in column three throughout the chapter. Clinical definitions of labor represent how each sample was called phenotypically, based on the patient’s clinical presentation at the time of cesarean section, as chosen by the researchers collecting the samples. 12 ix List of Figures 2.1 Study design and conceptual illustration of the framework for data analysis. A. The human myometrial hTERT-HMA/B cell line, expressing the progesterone receptors (PRs), was treated with progesterone (P4), forskolin (FSK, induces cAMP), and IL-1¯ individually and in all combinations of these three stimuli. Treatment was performed for 24 hours to establish steady-state cellular transcription. RNA sequencing was performed in triplicate on each condition, resulting in expression measurements for at total of 24 samples. B. Stimulation of the targeted pathways was confirmed by examining the effect of different treatment combinations on mRNA expression of IL-8, a commonly used downstream target of IL-1¯ and indicator of inflammatory signaling. IL-8 induced expression is achieved with IL-1¯ treatment. This induced expression is inhibited by P4 in the presence of PR. cAMP inhibits IL-1¯ induced expression of IL-8, and, finally, the strongest inhibition of IL-8 up-regulation is achieved by treating cells with both FSK and P4. These experiments were carried out by a collaborator, Dr. Peyvand Amini, and are shown here to improve clarity. C. Comparisons performed to characterize the signaling carried out by P4, cAMP,and IL-1¯ individually. The results of these comparisons are shown in Figures 4.1-4.4. D. Comparisons performed to examine the interplay between P4 and cAMP in the context of their anti-inflammatory effects. Samples treated by IL-1¯ and different combinations of P4 and FSK were first compared to x vehicle. Subsequently, the genes, transcription factors, and processes identified in these comparisons were compared to each other to understand how P4 and cAMP work together or individually to suppress inflammatory processes. The results of these analyses are shown in Figures 4.5 and 4.6. 14 2.2 Data and methods used in bioinformatics pipeline. Raw gene expression was processed in multiple steps and combined with curated information from public databases to carry out supervised and unsupervised approaches to identify transcriptional signatures, assess their reproducibility, and identify key signaling pathways and interactions involved in parturition. 16 3.1 Venn diagram of differential expression in Warwick, Detroit, and London. Differentially expressed genes between non-labor and in-labor sample groups for each dataset are compared to calculate similarity across studies. 30 3.2 AUCs, ROC Curves, and sample classification frequencies for K-fold cross-validation and cross-study classification tasks on tissue gene expression studies using LASSO and EN. A-C. K-fold cross validation (CV) for each dataset (k = 3 for Warwick, 5 for London and Detroit). D-I. Classification of one dataset with training performed on another dataset. LASSO and EN were implemented using the glmnet package in R. J-R. Corresponding sample class frequencies for the classification tasks shown in panels A-I, specifically using the results xi from the top performing algorithm, based on AUC. For all prediction tasks, the early labor (L-ILEa) and established labor (L-ILEs) samples were grouped together as a single laboring sample group except for sample classification frequency for 5-fold CV on London, where the three sample groups were kept separate in order to assess how the new initial labor group was classified. All results shown were obtained using 100 runs of training and testing. Sample names in panels J-R follow the naming convention in Table 2.1. 33 3.3 Distribution of singular values and examination of genes and samples in the space spanned by the first singular vector from SVD analysis on aggregated gene expression matrix. A. Singular values calculated by performing SVD on the normalized, study-combined expression matrix (12,622 genes by 71 samples). B. Mapping of the top 25 positively and negatively valued genes in the space represented by the first right singular vector (eigen-gene). Bars representing genes are colored based on its differential expression in each of the four differential expression analyses (shown in Fig. 3.1; i.e.