
Data driven approaches to infer the regulatory mechanism shaping and constraining levels of metabolites in metabolic networks

Dissertation for the academic degree of Doctor rerum naturalium in the scientific discipline “Systems Biology” (Systembiologie), submitted in cumulative form to the Faculty of Science (Mathematisch-Naturwissenschaftliche Fakultät) of the Universität Potsdam

by Kevin Schwahn

Defense (Disputation) on 20.12.2018

Supervisors: Prof. Dr. Alisdair R. Fernie, Prof. Dr. Zoran Nikoloski
Reviewers: Prof. Dr. Zoran Nikoloski, Dr. Joachim Kopka, Prof. Dr. Björn Usadel

Published online at the Institutional Repository of the University of Potsdam:
https://nbn-resolving.org/urn:nbn:de:kobv:517-opus4-423240
https://doi.org/10.25932/publishup-42324

Contents

1 Abstract
2 Zusammenfassung
3 Introduction
  3.1 Metabolism and the means to measure its constituting components
    3.1.1 Network representation of metabolites and metabolism
    3.1.2 Metabolomics technologies
  3.2 The transcriptome and its measurement
    3.2.1 Transcriptomic technologies
  3.3 Data reduction and regression approaches for transcriptomic and metabolomic data analysis
  3.4 Thesis outline
4 Observability of plant metabolic networks is reflected in the correlation of metabolic profiles
  4.1 Introduction
  4.2 Materials and Methods
  4.3 Results and Discussion
    4.3.1 Number and position of sensor metabolites in models of plant primary metabolism
    4.3.2 Data profiles of sensor metabolites show stronger correlations than non-sensor metabolites
    4.3.3 Analysis of robustness for the observed sensor/non-sensor patterns
    4.3.4 Test on kinetic model of central carbon metabolism
    4.3.5 Implications of the findings
  4.4 Conclusion
5 Stoichiometric correlation analysis: principles of metabolic functionality from metabolomics data
  5.1 Introduction
  5.2 Materials and Methods
    5.2.1 Description of the approach with the underlying assumptions and principles
    5.2.2 Implementation of SCA
    5.2.3 Models
    5.2.4 Metabolic data profiles
  5.3 Results and Discussion
    5.3.1 Stoichiometric Correlation Analysis with a paradigmatic model of the TCA cycle
    5.3.2 SCA demonstrates differences in the stringent response between E. coli and A. thaliana
    5.3.3 SCA shows that domestication in wheat is associated with loss of regulatory couplings
  5.4 Conclusion
6 Data reduction approaches for dissecting transcriptional effects on metabolism
  6.1 Introduction
  6.2 Materials and Methods
    6.2.1 Data used in the study
    6.2.2 PCA and partial correlation
    6.2.3 Combination of PCA and partial correlations to investigate influence of transcripts on metabolites
    6.2.4 Calculating significant differences with permutation test
    6.2.5 Algorithm Implementation
  6.3 Results
    6.3.1 Two novel methods for categorization of metabolite pairs based on transcriptional effects
    6.3.2 Transcriptional and post-transcriptional control of metabolite associations in E. coli
    6.3.3 Prevailing regulatory effects in S. cerevisiae - comparison with published results
    6.3.4 Transcriptional control of metabolite associations in A. thaliana
  6.4 Discussion
7 Discussion
  7.1 Central metabolites as sufficient study objectives
  7.2 Reaction coupling - network information in metabolomic data
  7.3 Integration of data types - towards the investigation of regulatory effects
  7.4 System-wide investigation of regulatory mechanism on metabolism
8 Appendices
  8.1 Appendix: Observability of plant metabolic networks is reflected in the correlation of metabolic profiles
    8.1.1 Supplemental Figures
    8.1.2 Additional files and tables
  8.2 Appendix: Stoichiometric correlation analysis: principles of metabolic functionality from metabolomics data
    8.2.1 Additional files and tables
  8.3 Appendix: Data reduction approaches for dissecting transcriptional effects on metabolism
    8.3.1 Supplemental Figures
    8.3.2 Additional files and tables
9 Bibliography
10 Acknowledgments
11 Statement of authorship

1 Abstract

Systems biology aims at investigating biological systems in their entirety by gathering and analyzing large-scale data sets about the underlying components. Computational systems biology approaches use these large-scale data sets to create models at different scales and cellular levels, and to generate and test hypotheses about biological processes. However, such approaches inevitably lead to computational challenges due to the high dimensionality of the data and the differences in the dimension of data from different cellular layers.

This thesis focuses on the investigation and development of computational approaches to analyze metabolite profiles in the context of cellular networks, with the goal of determining which aspects of network functionality are reflected in the metabolite levels. With these methods at hand, this thesis aims to answer three questions: (1) how observability of biological systems is manifested in metabolite profiles and whether it can be used for phenotypical comparisons; (2) how to identify couplings of reaction rates from metabolic profiles alone; and (3) which regulatory mechanisms that affect metabolite levels can be distinguished by integrating transcriptomics and metabolomics read-outs.

I showed that sensor metabolites, identified by an approach from observability theory, are more strongly correlated with each other than non-sensors. The greater correlations between sensor metabolites were detected both with publicly available metabolite profiles and with synthetic data simulated from a medium-scale kinetic model.
I demonstrated through robustness analysis that the correlation was due to the position of the sensor metabolites in the network and persisted irrespective of the experimental conditions. Sensor metabolites are therefore potential candidates for phenotypical comparisons between conditions through targeted metabolic analysis.

Furthermore, I demonstrated that the coupling of metabolic reaction rates can be investigated from a purely data-driven perspective, assuming that metabolic reactions can be described by mass action kinetics. Employing metabolite profiles from domesticated and wild wheat and tomato species, I showed that the process of domestication is associated with a loss of regulatory control on the level of reaction rate coupling. I also found that the same metabolic pathways in Arabidopsis thaliana and Escherichia coli exhibit differences in the number of reaction rate couplings.

I designed a novel method for the identification and categorization of transcriptional effects on metabolism by combining data on gene expression and metabolite levels. The approach determines the partial correlation between metabolites while controlling for the principal components of the transcript levels. The principal components contain the majority of the transcriptomic information, which allows the effect of the transcriptional layer to be partialled out from the metabolite profiles. Depending on whether the correlation between metabolites persists upon controlling for the effect of the transcriptional layer, the approach allows us to group metabolite pairs as being associated due to post-transcriptional or transcriptional regulation, respectively. I showed that the classification of metabolite pairs into those that are associated due to transcriptional or post-transcriptional regulation is in agreement with existing literature and findings from a Bayesian inference approach.

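The partial-correlation idea described above can be illustrated with a minimal Python sketch. It is not the implementation used in the thesis: the function names, the choice of two principal components, the fixed correlation threshold of 0.5, and the synthetic data are all assumptions made for illustration. The sketch computes principal component scores of a transcript matrix, regresses them out of two metabolite profiles, and compares the correlation before and after.

```python
import numpy as np


def pc_scores(transcripts, n_components):
    """Sample scores on the leading principal components.

    transcripts: array of shape (n_samples, n_genes).
    """
    centered = transcripts - transcripts.mean(axis=0)
    # Thin SVD of the centered data: columns of u * s are the PC scores.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def residualize(y, controls):
    """Subtract the least-squares fit of the controls (plus intercept) from y."""
    design = np.column_stack([np.ones(len(y)), controls])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ coef


def partial_correlation(x, y, controls):
    """Pearson correlation of x and y after regressing out the controls."""
    return np.corrcoef(residualize(x, controls), residualize(y, controls))[0, 1]


def classify_pair(x, y, controls, threshold=0.5):
    """Label a metabolite pair; the fixed threshold is an illustrative assumption."""
    if abs(np.corrcoef(x, y)[0, 1]) < threshold:
        return "not associated"
    if abs(partial_correlation(x, y, controls)) >= threshold:
        return "post-transcriptional"  # association persists after control
    return "transcriptional"  # association vanishes after control


# Synthetic demo data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n_samples, n_genes = 200, 20
t = rng.normal(size=n_samples)  # latent transcriptional programme
transcripts = np.outer(t, np.ones(n_genes)) + 0.1 * rng.normal(size=(n_samples, n_genes))
pcs = pc_scores(transcripts, n_components=2)

# Metabolites a and b are both driven by the transcriptional programme.
met_a = t + 0.3 * rng.normal(size=n_samples)
met_b = t + 0.3 * rng.normal(size=n_samples)
# Metabolites c and d share a factor z that is independent of the transcripts.
z = rng.normal(size=n_samples)
met_c = z + 0.3 * rng.normal(size=n_samples)
met_d = z + 0.3 * rng.normal(size=n_samples)

label_ab = classify_pair(met_a, met_b, pcs)
label_cd = classify_pair(met_c, met_d, pcs)
print(label_ab, label_cd)
```

The fixed threshold only stands in for a proper significance assessment; the table of contents indicates that the thesis uses a permutation test for this step (Section 6.2.4).
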
The approaches developed, implemented, and investigated in this thesis open novel ways to jointly study metabolomics and transcriptomics data as well as to place metabolic profiles in the network context. The results from these approaches have the potential to provide further insights into the regulatory machinery in a biological system.

2 Zusammenfassung

Systems biology is directed at evaluating biological systems in their entirety. This is done by collecting and analyzing large data sets on the underlying components of these systems. Computational systems biology approaches use these large data sets to build models and to test hypotheses about biological processes at different cellular levels. However, these approaches inevitably lead to computational challenges, because the data are high-dimensional. Furthermore, data obtained from different cellular levels differ in their dimensions.

This doctoral thesis deals with the investigation and development of computational approaches to analyze metabolite profiles in the context of cellular networks and to determine which aspects of network functionality are reflected in the metabolite measurements. The aim of this work is to answer the following questions, taking the methods mentioned into account: (1) How is the observability of biological systems manifested in metabolite profiles, and can these be used for phenotypic comparisons? (2) How can the coupling of reaction rates be identified exclusively from metabolite profiles? (3) Which regulatory mechanisms that affect metabolite levels can be distinguished when transcriptomic and metabolic data are combined?

I was able to show that sensor metabolites, which by a method from “observability theory”