Optimal Experimental Design Applied to Models of Microbial Gene Regulation

by

Nathan Braniff

A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree of Doctor of Philosophy in Applied Mathematics

Waterloo, Ontario, Canada, 2020

© Nathan Braniff 2020

Examining Committee Membership

The following served on the Examining Committee for this thesis. The decision of the Examining Committee is by majority vote.

External Examiner: Julio Banga, Professor, IIM-CSIC

Supervisor: Brian Ingalls, Professor, Applied Mathematics, University of Waterloo

Internal Members: Matt Scott, Associate Professor, Applied Mathematics, University of Waterloo; Kirsten Morris, Professor, Applied Mathematics, University of Waterloo

Internal-External Member: Trevor Charles, Professor, Biology, University of Waterloo

Author's Declaration

This thesis consists of material all of which I authored or co-authored: see Statement of Contributions included in the thesis. This is a true copy of the thesis, including any required final revisions, as accepted by my examiners.

I understand that my thesis may be made electronically available to the public.

Statement of Contributions

Chapters 1, 2, 7, & 8: I am the sole author of this material, under the supervision of Brian Ingalls. This material has not been previously published.

Chapter 3: This material has been previously published as: “New opportunities for optimal design of dynamic experiments in systems and synthetic biology” by Nathan Braniff and Brian Ingalls in Current Opinion in Systems Biology 9 (2018): 42-48. I co-authored this manuscript with Brian Ingalls. My contribution was gathering the candidate references and selecting the initial areas of focus for the article. I also wrote the initial draft of the manuscript, which Brian and I then collaborated on to produce the final text.

Chapter 4: This material has been previously published as: “Component Characterization in a Growth-Dependent Physiological Context: Optimal Experimental Design” by Nathan Braniff, Matt Scott, and Brian Ingalls in Processes 7.1 (2019): 52. I am the primary author of this manuscript, with edits and intellectual contributions from Matt Scott and Brian Ingalls. The choice of project, the mathematical model, and the implementation of the optimal design algorithms were my work, with some guidance and suggestions from Matt and Brian.

Chapter 5: This material has been previously published as: “Optimal experimental design for characterizing gene expression: sample scheduling” by Nathan Braniff, Max Reed, and Brian Ingalls in IFAC-PapersOnLine 51.19 (2018): 48-51. I am the primary author of this manuscript, with edits and intellectual contributions from Max Reed and Brian Ingalls. My contribution was developing and implementing the optimal scheduling method; Max Reed investigated some of the case studies used in demonstrating its utility.

Chapter 6: This material has been previously published as: “Optimal Experimental Design for a Bistable Gene Regulatory Network” by Nathan Braniff, Addison Richards, and Brian Ingalls in IFAC-PapersOnLine 52.26 (2019): 255-261. I am the primary author of this manuscript, with edits and intellectual contributions from Addison Richards and Brian Ingalls. My contribution was developing and implementing the optimal experimental procedure for the bistable motif; Addison Richards implemented a stochastic simulation algorithm for assessing the design methodology and approximations used in this work.

Abstract

Microbial gene expression is a comparatively well understood process, but regulatory interactions between genes can give rise to complicated behaviours. Regulatory networks can exhibit strong context dependence, time-varying interactions and multiple equilibria. The qualitative diagrammatic models often used in biology are not well suited to reasoning about such intricate dynamics. Fortunately, mathematics offers a natural language to model gene regulation because it can quantify the various system inter-dependencies with much greater clarity and precision. This added clarity makes models of microbial gene regulation a valuable tool for studying both natural and synthetic gene regulatory systems. However, models are only as good as the knowledge and assumptions they are built on. Specifically, all models depend on unknown parameters – constants that quantify specific rates and interaction strengths within the regulatory system. In systems biology, parameters are generally fit, rather than measured directly, because their values are contextually dependent on the state of the microbial host. This fitting requires collecting observations of the modeled system. Exactly what is measured, how many times, and under what experimental conditions defines an experimental design. The experimental design is intimately linked to the accuracy of any resulting parameter estimates for a model, but determining what experimental design will be useful for fitting can be difficult. Optimal experimental design (OED) provides a set of statistical techniques that can be used to make design choices that improve parameter estimation accuracy. In this thesis I examine the use of OED methods applied to models of microbial gene regulation. I have specifically focused on optimal design methods that combine asymptotic parametric accuracy objectives, based on the Fisher information matrix, with relaxed formulations of the design optimization problem. I have applied these OED methods to three biological case studies. (1) I have used these methods to implement a multiple-shooting optimal control algorithm for optimal design of dynamic experiments. This algorithm was applied to a novel model of transcriptional regulation that accounts for the microbial host's physiological context. Optimal experiments were derived for estimating sequence-specific regulatory parameters and host-specific physiological parameters. (2) I have used OED methods to formulate an optimal sample scheduling algorithm for dynamic induction experiments. This algorithm was applied to a model of an optogenetic induction system – an important tool for dynamic gene expression studies. The value of sampling schedules within dynamic experiments was examined by comparing optimal and naive schedules. (3) I derived an optimal experimental procedure for fitting a steady-state model of single-cell observations from a bistable regulatory motif. This system included a stochastic model of gene expression, and the OED methods made use of the linear noise approximation to derive

a tractable design algorithm. In addition to these case studies, I also introduce the NLOED software package. The package can perform optimal design and a number of other fitting and diagnostic procedures on both static and dynamic multi-input multi-output models. The package makes use of automatic differentiation for efficient computation, offers a flexible modeling interface, and will make OED more accessible to the wider biological community. Overall, the main contributions of this thesis include: developing novel OED methods for a variety of gene regulatory scenarios, studying optimal experimental design properties for these scenarios, and implementing open-source numerical software for a variety of OED problems in systems biology.

Acknowledgements

I would like to thank my supervisor Brian Ingalls for his generous support and mentorship; I am very lucky to have been able to work with him over these past four years. I would also like to thank Matt Scott for introducing me to bacterial physiology; I very much appreciate the time he took to answer my many questions. I would also like to express appreciation for my lab mates, especially Christian Barna, Julian Smith-Roberge and Kwan Lai, for their good company and the many interesting discussions that we had together. Finally, I would like to especially thank the undergraduate students who I collaborated with, and from whom I learned so much; these folks include: Melissa Prickaerts, Nick Hon, Leah Fulton, Max Reed, Addison Richards, William Forrest, Michael Astwood, Cody Receno, Davey Lee, Keenjal Mistry, Zixuan Liu and Taylor Pearce.

Dedication

This thesis is dedicated to my parents, for all their love and support.

Table of Contents

List of Figures ...... xiii

List of Tables ...... xix

1 Introduction ...... 1
  1.1 A Simple Example ...... 4
  1.2 Thesis Overview ...... 7

2 Background on Optimal Experimental Design ...... 8
  2.1 A Short History ...... 9
  2.2 Mathematical Necessities for Optimal Experimental Design ...... 12
    2.2.1 Components of a Data-generating Model ...... 12
    2.2.2 Definition of an Experimental Design ...... 14
    2.2.3 Maximum Likelihood Estimation ...... 15
    2.2.4 Quantifying the Accuracy of Parameter Estimates ...... 17
    2.2.5 Scalar Objectives for Optimizing Designs ...... 23

3 Applications of OED in Systems Biology ...... 26
    3.0.1 Summary ...... 26
    3.0.2 Introduction ...... 26
    3.0.3 Time-varying control of cellular systems ...... 27
    3.0.4 Model-based design of experiments: dynamic stimuli ...... 31
    3.0.5 Optimal dynamic experiments in practice ...... 33
    3.0.6 Future Directions ...... 34

4 Optimal Experimental Design for a Physiologically-aware Model of Gene Expression ...... 35
  4.1 Summary ...... 35
  4.2 Introduction ...... 36
  4.3 Materials and Methods ...... 39
    4.3.1 Derivation of the Physiological Gene Expression Model ...... 40
    4.3.2 Optimal Experimental Design ...... 47
  4.4 Results ...... 53
    4.4.1 Comparing Lumped and Physiologically-aware Models ...... 53
    4.4.2 Null and Optimal Experimental Designs ...... 54
    4.4.3 Utility of Optimal Designs for Parameter Identification and Prediction ...... 56
  4.5 Discussion ...... 59

5 Optimal sample scheduling for dynamic gene expression experiments ...... 63
  5.1 Summary ...... 63
  5.2 Introduction ...... 63
    5.2.1 Model-based design of experiments ...... 64
  5.3 Methods ...... 65
    5.3.1 Measuring experimental optimality ...... 66
    5.3.2 Optimal time point selection ...... 67
  5.4 Results ...... 68
    5.4.1 Optimal sample scheduling ...... 68
    5.4.2 Improved criteria for selection of dynamic inputs ...... 69
  5.5 Conclusion ...... 70

6 Optimal Experimental Design for a Bistable Gene Regulatory Network ...... 73
  6.1 Summary ...... 73
  6.2 Introduction ...... 74
  6.3 Methods ...... 75
  6.4 Results ...... 83
  6.5 Conclusion ...... 87

7 A Software Package for Optimal Experimental Design in Systems Biology ...... 88
  7.1 Introduction ...... 88
    7.1.1 Past Works ...... 89
    7.1.2 Motivation and Objectives ...... 91
  7.2 Workflow and Definitions ...... 94
    7.2.1 Model Definition ...... 96
    7.2.2 Design Definition ...... 99
  7.3 The Model Class ...... 101
    7.3.1 Creating a Model Object ...... 102
    7.3.2 Function Attributes of the Model Class ...... 105
    7.3.3 User-callable Model Functions ...... 108
  7.4 The Design Class ...... 126
    7.4.1 Creating a Design Object ...... 128
    7.4.2 Design's Automatic Optimization Set-up ...... 137
    7.4.3 User-callable Design Functions ...... 145
  7.5 Examples ...... 149
    7.5.1 Optimal Design for Static Models ...... 149
    7.5.2 Optimal Design for Dynamic Models ...... 162
  7.6 Discussion ...... 172

8 Conclusion ...... 176
  8.1 Summary ...... 176
  8.2 Some Critique ...... 177
  8.3 Future Prospects ...... 181

References ...... 183

APPENDICES ...... 208

A Supplementary Materials: Component Characterization in a Growth-Dependent Physiological Context: Optimal Experimental Design ...... 209
  A.1 Derivation of the Physiological Gene Expression Model ...... 209
    A.1.1 Protein fraction of cell mass ...... 209
    A.1.2 Growth Dependence of the Total RNAP Population ...... 210
    A.1.3 Available RNAP ...... 210
    A.1.4 Transcription Rate ...... 214
    A.1.5 Total Ribosome Population ...... 216
    A.1.6 Translation Rate ...... 216
  A.2 Details of the Multiple-shooting Algorithm and Optimization ...... 219

List of Figures

1.1 A depiction of a linear model (‘true’ trend: blue dashed line, ‘true’ error distributions: blue distributions) being fit with two different experimental designs (less optimal: red, more optimal: green). Feasible fits are shown as solid lines with corresponding colors. ...... 5
1.2 A depiction of a linear model with no intercept (‘true’ trend: blue dashed line, ‘true’ error distributions: blue distributions) being fit with two different experimental designs (less optimal: red, more optimal: green). Examples of feasible fits are shown as solid red and green lines. ...... 6

2.1 A figure comparing the total log-likelihood and score behaviour for multiple realizations of both an optimal design (left) and a sub-optimal design (right). The blue curves correspond to the average log-likelihood (top) and score function (bottom) for the given design – averaged over all possible datasets. The other colors represent the log-likelihood and score functions for specific realizations of the design, each with random observations. ...... 20

3.1 Design and implementation of dynamic biological experiments. Tools for model-based experimental design (left): Systems biologists often seek to improve a model by more accurately constraining its parameters, its predictions, or its structure. Model-based design of experiment (MBDOE) methods identify time-varying perturbation stimuli to achieve these goals via a range of optimization criteria. Tools for dynamic biological experiments (right): Temporal perturbation profiles can be realized with a range of emerging biological tools. Low-cost and open-source hardware systems can implement these dynamic experiments efficiently, resulting in informative experiments and improved model accuracy. ...... 28

4.1 (A) Growth media nutrient quality dictates the growth rate of an E. coli culture. The observed growth rate can be used to predict many physiological parameters of the host, which in turn influence gene expression. (B) By accounting for physiological parameters, we can optimally estimate intrinsic parameters of a genetic construct, which reflect properties of its sequence. These intrinsic parameters can be reused across growth conditions and can be used to guide changes to the construct sequence. ...... 38
4.2 A and B. Steady state protein (A) and mRNA (B) copy number predictions for the lumped and full model across growth rate for two different constant input levels. C and D. Corresponding protein (C) and mRNA (D) concentrations. E. Relative difference in the lumped parameter values, fit at 3 db/hr, compared with the values predicted by the full growth-dependent model. F. Root squared protein concentration error over the dynamic simulation in (G) between the full and lumped model. G. Predicted protein concentrations of the lumped and full model over a dynamic simulation with growth rate of 0.5 db/hr. ...... 55
4.3 An optimal experiment (designed at the true value of θ). Each column depicts one sub-experiment, labelled with optimal doubling rates: µ(1), µ(2) and µ(3) (corresponding to exponential growth rates λ(1), λ(2) and λ(3)). The top row (A–C) depicts the optimal inputs, u(1)(t), u(2)(t) and u(3)(t) (on a log10 scale). The middle row (D–F) shows the system response (both mRNA and protein copy number) for each induction profile. The last row shows the sampling densities, w(i)rna and w(i)prot (in shaded areas, blue and red, respectively), for both protein and mRNA, as well as the rounded sampling schedules, s(i)rna and s(i)prot (depicted by the stem plot). The optimality score of this experiment was −69.6. ...... 57
4.4 A. Results from the true optimal (optimal experiment designed at the true parameter value) and null experiments (circles): optimality score on the horizontal axis; observed generalized variance of the parameter estimate on the vertical axis. B. The same experiments: generalized parameter variance on the horizontal axis; log integral of the squared error on an out-of-sample prediction experiment (inset) on the vertical axis. Optimal experiments designed with erroneous initial parameter guesses are shown as X's. (Linear trend lines also shown.) ...... 58

5.1 Regulated gene expression model: Optogenetic control of a light-inducible promoter activates a transcriptional activator which in turn activates the transcription of a fluorescent protein. ...... 65
5.2 System response to a 20 minute pulse of maximal transcription rate (applied over the shaded interval). The optimal sampling points for each species are indicated. ...... 69
5.3 A. Square wave input centered at 12 hours. B. Relative D-optimality criteria for optimized and uniform sampling schedules as a function of the pulse duration (normalized to their respective minima over the inputs considered). ...... 71
5.4 A. Sinusoidal input. B. Relative D-optimality criteria for optimized and uniform sampling schedules as a function of the pulse duration (normalized to their respective minima over the inputs considered). ...... 72

6.1 (A) An empirical density plot of the simulated species concentration X/Ω for various inputs u, simulated using the SSA with Ω = 90. (B) The bifurcation curve (stable, solid; and unstable, dashed) for the steady state protein concentration x∗ from the deterministic model, with the LNA-derived standard error (shaded). The simulated mean values (open circles) and standard errors (error bars) were computed from the SSA simulation. ...... 76
6.2 (A) Probability that the system is within the basin of attraction of the upper equilibrium point in long-time equilibrium for Ω = {60, 90, 120}; SSA (circles), corresponding logistic fit (curve). The fit values are: c = [−18.1 111.7] for Ω = 90, c = [−24.2 150.3] for Ω = 120 and c = [−12.1 73.8] for Ω = 60. (B) The Hartigan dip statistic computed for various inputs using N = 100, 1000, or 10000 observations. The algorithm was taken from [1]. ...... 80
6.3 (A) Plots of ρ for cc = 0.152, 0.162, 0.172 (c0 = −16.9, −18.0, −19.1), with c1 = 111. (B) Plots of ρ for c1 = 76, 111, 156 with cc = 0.162 (c0 = −12.33, −18.00, −25.30). (C, D) Corresponding Ds-optimal designs. (E, F) Corresponding D-optimal designs. Input values ui correspond to stem locations. Weights ξi correspond to stem heights. Note the short stems at both ends of each feasible input range. ...... 84
6.4 (A) Stable branches of the bifurcation curves generated from ten random parameter sets θ. (B) Ds-optimal sample weights and inputs for the chosen θ parameter sets, with logistic parameters c held constant. (C) Corresponding D-optimal designs. ...... 85
6.5 (A) Histogram for 500 random parameter sets; shown are the frequency of Ds-optimal input values that fall within the bistable region, plotted according to the percentiles of the logistic ρ to which they correspond. (B) Corresponding histogram for D-optimal inputs. ...... 86

7.1 A diagram depicting the archetypal NLOED workflow, including model creation using the Model class and design creation using the Design class. Here x is the vector of model inputs describing the experimental conditions, y are the observation variables, θ are the unknown model parameters, f(.) is the model function and p(.) is the data distribution; see text for further description. ...... 95
7.2 A figure depicting an example experimental scenario, involving a decaying cellular species A. The experiment includes both direct measurement of A and a plate count assessing A's potential for conferring resistance to a selective agent. ...... 97
7.3 A diagram depicting the user-provided arguments, internal function attributes and user-callable functions of the NLOED Model class. ...... 101
7.4 An example of a dataframe containing an experimental design. ...... 110
7.5 An example of the dataframe returned by the evaluate() function, containing parameter accuracy metrics for the given design. ...... 111
7.6 An example dataset generated from the sample() function. ...... 113
7.7 An example of the dataframe output from a call to the fit() function. ...... 116
7.8 Example of the fit() function's graphical output of likelihood profiles and 2D trace projections generated when the Confidence option is set to Profiles. ...... 118
7.9 Example of the fit() function's graphical output of likelihood contour projections generated when the Confidence option is set to Contours. The contour projections are shown along with the profiles and trace projections previously shown in figure 7.8. ...... 119
7.10 Example dataframe returned with a batch call to the fit() function. ...... 120
7.11 Example of an input dataframe used to generate predictions with the predict() function. ...... 122
7.12 Example of the returned dataframe from a call to the predict() function. ...... 123
7.13 An example of the returned dataframe from the predict() function, including prediction and observation intervals as well as sensitivity information. ...... 125
7.14 Caption text. ...... 127
7.15 A depiction of NLOED design optimization problem structure with two discretized inputs and a single observation variable. ...... 139
7.16 A depiction of an NLOED design optimization problem structure with two continuously handled inputs and the LockedWeights constraint activated. ...... 141
7.17 A depiction of NLOED design optimization problem structure with two continuously handled inputs, optimized sampling weights, and a single observation variable. ...... 142
7.18 A depiction of NLOED's design optimization problem structure with the first input dimension handled continuously and the second input dimension handled discretely, and with a single observation variable. ...... 143
7.19 A depiction of NLOED's method for observation selection as part of its optimization structure. On the left, the default method is used, where observation variables y1 and y2 are each given a separate set of weights over the same input structure. On the right, the observ struct has specified that y1 and y2 must be observed together in any given input conditions. ...... 144
7.20 An example of a dataframe containing a relaxed design returned by the relaxed() function. ...... 146
7.21 An example of an exact design dataframe returned by the round() function. ...... 147
7.22 An example dataset for the optogenetic dose-response model, with triplicate measurements of GFP taken at four different light levels. ...... 153
7.23 A print out of the returned 95% Wald confidence bounds for the optogenetic dose-response model fit to the initial dataset. ...... 154
7.24 Mean GFP response (blue line), 95% prediction intervals (blue region) and 95% observation intervals (orange region) for the optogenetic model after fitting to the initial dataset (orange dots). ...... 156
7.25 Output of the relaxed design from the relaxed() function for the optogenetic dose-response model. ...... 157
7.26 Output showing the exact design for the optogenetic dose-response model generated from the round() function. ...... 158
7.27 Expected 95% confidence intervals for the optogenetic dose-response model parameters computed using the evaluate() function after combining the initial and optimal designs. ...... 158
7.28 A plot of the 95% prediction and observation intervals under the asymptotic covariance matrix from the combined initial and optimal designs, computed with the predict() function. ...... 159
7.29 Diagnostic plots generated by the fit() function for the optogenetic dose-response model, including likelihood profiles, trace projections and contour projections. ...... 161
7.30 Output of the 95% parameter confidence intervals generated using the profile likelihood functionality of the fit() function. ...... 161
7.31 Updated 95% Wald confidence intervals for the optogenetic dose-response model computed at the parameter estimates generated from the combined initial and optimal datasets. ...... 161
7.32 Output of the 95% Wald confidence intervals under the initial dataset for the mRNA-protein dynamic model, computed using the evaluate() function. ...... 169
7.33 Prediction of the mean mRNA and protein levels, along with the 95% prediction and observation intervals generated with the predict() function for the mRNA-protein dynamic model. ...... 170
7.34 Output of the optimal exact design generated for the mRNA-protein dynamic model. ...... 171
7.35 Updated 95% confidence intervals for the mRNA-protein model, generated using the evaluate() function and the combined initial and optimal designs. ...... 172
7.36 Updated 95% prediction and observation intervals for the mRNA-protein model, generated using the predict() function under the covariance found using evaluate() applied to the combined initial and optimal designs. ...... 173

A.1 Fit of protein fraction of the cell mass using data from Table 2 of [2]. ...... 210
A.2 Fit of RNAP fraction of protein mass using data from Table 3 of [2]. ...... 211
A.3 Fraction of total RNAP occupied in transcription, as it depends on growth rate. Data from [3], [4] and [5]. ...... 213
A.4 Ribosomal protein fraction of the total protein mass as a function of growth rate. Data from Table 3 of [2]. ...... 217

List of Tables

4.1 A table describing the symbols, nominal values, and feasible ranges of the intrinsic parameters and physiological properties used in the model. ...... 40
4.2 Null (non-optimal) Experimental Designs. ...... 56

5.1 Improvement of Optimal Sampling Relative to Uniform Sampling. ...... 70

7.1 An example of the structure used to represent an experimental design in the NLOED package. Here both the continuous sampling weights and the discrete sample numbers are shown. Relaxed designs use the continuous weights whereas exact designs use the discrete numbers. Here the exact design has a sample size of N = 20. ...... 100

Chapter 1

Introduction

“To consult the statistician after an experiment is finished is often merely to ask him to conduct a post mortem examination. He can perhaps say what the experiment died of.” – Ronald Fisher

Models and data are two sides of the same coin. Without data, models are unmoored imaginations of what might be; without models – even very simple non-mathematical ones – data is not much more than a listing of measurements. Interpreting data is a process of reading between the data points, sometimes literally interpolating and sometimes more figuratively, to glimpse an underlying process that has not yet been understood. In some sense, building models is only a more refined version of visually assessing a trend, but with more precision both in our assumptions and, hopefully, in the predictions and understanding that result. Unlike a scientist's intuition in eye-balling a scatter plot, mathematical models provide researchers with a precise way of writing down assumptions about relationships between different observables. This has made them an appealing tool in dissecting the complex set of interactions that make up biological systems. Especially as the focus in biology has shifted from individual genes and molecules towards interaction and the behaviour of many-component systems, mathematical language has become increasingly valuable for describing relationships and interpreting data.

However, as we increasingly rely on models to encode our assumptions, it is logical that this paradigm will also influence which data we collect. Different assumptions imply different questions which in turn suggest different experimental designs to interrogate nature. As the opening epigraph from Ronald Fisher suggests, data is only valuable if it comes

from a well-designed experiment, as the data's ability to illuminate is determined by the design, not the result. Each experiment is a delicately posed question and our intuition is often a poor guide to the exact consequences of our experimental inquiries. Planning for data collection can be a tedious business that requires careful consideration of what uncertainties need to be hedged against and how our assumptions will color our interpretation of the results. In a field such as systems biology, where a system's behaviour is naturally described with mathematics, mathematical tools can also be used to guide our experimental questions. Optimal experimental design is a sub-discipline of statistics that focuses explicitly on this purpose, using mathematical rigour to guide experimentation for improved results. This is a reversal of the traditional flow of data in biology which, perhaps due to historical or cultural factors, has often moved from the experimentalist to the theorist with little feedback being returned. This lack of feedback is in sharp contrast to disciplines like physics and chemistry with their robust dialogue between theory and experiment. However, with each passing decade biology seems increasingly poised to become more like its quantitative cousins. This will hopefully see mathematical biologists getting their hands dirty putting experimental questions to Mother Nature, and experimental biologists using more precise quantitative thinking to interpret their results. With the popular hype about big data and machine learning, especially in biology, it is often easy to forget that the quality of the data we collect can often be much more important than how much we have; asking the right question can be more useful than asking many.

In this thesis I have focused on applying ideas from the field of optimal experimental design (OED) to models in microbial systems biology. Optimal experimental design has a long tradition within the field of statistics, with early work dating back to Ronald Fisher near the beginning of the last century [6, 7]. While much of the theory for experimental design was developed decades ago, its application and implementation still pose challenges. Each new field brings with it unique models, questions, and experimental hurdles. Systems biology, like physics and chemistry, often relies on (pseudo-)mechanistic models, and practitioners can plan direct experimentation rather than rely on observational studies. However, unlike the fundamental sciences and much more like economics or ecology, the parameters in systems biology models are often themselves crude abstractions of more fundamental processes. Also, these parameters can often only be determined via fitting. This unique blend of mechanistic model structure and designed experiments, with a reliance on statistical fitting, has created significant opportunities for OED; but systems biology also presents great challenges. Systems biology has a comparatively large variety of models, including a wide variety of space and time scales, dimensions, non-linearities and sampling distributions. The experimental systems under study run the gamut from molecular reactions to multi-organism interactions. Despite this complexity, recent works have suggested

that experimental design can significantly improve model calibration in systems biology [8], but there have been few practical demonstrations (see [9, 10] for isolated examples). Therefore, developing specialized OED methods and more accessible tools for the field is part of an ongoing research program.

This thesis will specifically focus on experimental design for parameter estimation in models of microbial gene expression. Gene expression provides an ideal system of study for experimental design for a number of reasons. Gene expression is relatively well understood, with adequate understanding of the biochemical processes involved. This knowledge makes it suitable for a mechanistic modelling approach, as the model structure can be proposed with reasonable confidence. However, as gene expression involves multiple scales, the mechanistic aspects of the model are often combined with pseudo-mechanistic components; phenomenological parameters often lump fine-grained interactions into single constants (i.e. like sigma factor binding, mRNA folding, and initiation stages of transcription and translation). This blend of mechanistic and phenomenological modelling is emblematic of many systems biology models, and this makes gene expression a suitable proving ground for OED techniques. In addition, experimental methods for measuring gene expression are mature, diverse, and widely available. This means many labs may be interested in, and capable of, implementing designs for studying expression, and that there is a fair degree of flexibility in the types of experiments that may be used. This is in contrast to more experimentally constrained systems of study where there may not be as much interest or choice in experimental design.

Applying optimal experimental design to gene expression models may also yield some unique benefits. For example, in gene expression models, parameters are often intrinsically interesting, not just for accurate predictions but also as aspects of study in themselves. When estimated accurately, parameters describing transcription factor, polymerase, and ribosome binding strengths can provide information about low-level DNA/RNA sequence properties. It is also often possible to manipulate these parameters, through mutation or replacement by synthetic sequences, so that the parameters serve to characterize different alterations made to a natural system. Tools for manipulating DNA are now pervasive and building gene expression constructs now plays a central role in many bio-engineering projects. While it has long been a goal to perform model-guided design of biological systems [11, 12], in the past, few models were predictive enough to be used in designing constructs other than as a rough intuitive guide. For example, early synthetic gene expression circuits like Elowitz's repressilator and Gardner's toggle switch were studied mathematically in a post hoc or parallel manner rather than being designed in a fully model-driven fashion [13, 14]. In recent years though, gene expression networks have become one of the first synthetic systems to be wholly designed and implemented

successfully in vivo using automated model-driven design [12]. This work by Nielsen et al. points the way to future research on genetic part characterization and model-guided design for a wider variety of systems. Many of these research programs will depend on having accurate predictive models of component behaviour and accurate parameter estimates. Here there is potentially great opportunity for optimal experimental design to ensure efficient and accurate part characterization and model calibration.

1.1 A Simple Example

The diversity of models and experimental questions that can be asked is very large, even just for models of gene expression. Exactly what experimental designs should be used in each context depends on the experimental goals and the experimenter's assumptions. The assumed model structure encodes many of these assumptions and plays a crucial role in optimal design for parameter estimation. To give some intuition for the experimental design process and its relation to the proposed model, we begin here with a simple linear model:

E(y) = mx + b. (1.1)

The random observation variable y has a conditional mean, E(y), that depends on the experimental input x and the two unknown parameters m and b. Figure 1.1 illustrates a simplified fitting scenario, comparing two different experimental designs, shown with red or green dots respectively for each dataset. Each experimental design has only two unique input points, x1 and x2 (shown as x′i for the red dataset). The true and unknown underlying data-generating process is shown as the blue dashed line. The red dataset has x′1 and x′2 placed close together and the green dataset has x1 and x2 spaced far apart. The true underlying distributions for y, depicted at each input condition, are shown as blue distributions (shown horizontally as the observation noise occurs vertically in the observation variable y). It can be seen somewhat intuitively that in the green dataset, by placing the two experimental inputs far apart, any feasible fit like the example green line is restricted to be somewhat close to the true underlying trend. In the red dataset we can see that several fits, shown as the pair of red lines, may be feasible depending on the exact realization of the random observations. This implies that in the red dataset the model fitting process will be sensitive to outliers, whereas in the green dataset fitting will be more robust. It is also fairly obvious from this simple illustration that our knowledge of the slope and intercept parameters will be much more precise if we fit the model to the green dataset; the green lines lie much closer to the true blue line than the candidate red fits.

4 Figure 1.1: A depiction of a linear model (‘true’ trend: blue dashed line, ‘true’ error distributions: blue distributions) being fit with two different experimental designs (less optimal: red, more optimal: green). Feasible fits are shown as solid lines with corresponding colors.

This in turn means we will get more accurate predictions when we use the green-fit model to extrapolate and interpolate, as the green lines lie closer to the true trend across all x values. The red dataset will yield less accurate estimates of m and b and will also generally produce a model that makes bad predictions. The simplicity of this scenario makes some of the general ideas of optimal experimental design intuitively clear. It is easy to see that the design can improve model fit and robustness to observation noise even if the sample size is held constant. Also, parameter accuracy and prediction accuracy are intrinsically related. For a linear model like the one illustrated above, the optimal design is characterized by always splitting the total sample size over two x values spaced as far apart as possible [15]. However, optimal designs depend heavily on the assumptions encoded in the model they are created for. An optimal design can therefore take a very different structure under alternative assumptions, even for trends that look broadly similar. For example, for a nearly identical process, but this time with a fixed zero intercept, the optimal design is quite different, as shown in Figure 1.2. Here the true trend is again shown as a blue dashed line. As the model has a zero intercept, we have forced the b parameter to be zero. This is an assumption made by the experimenter and, by assuming it is true, observations taken near x = 0 are no longer expected to be useful.

Figure 1.2: A depiction of a linear model with no intercept (‘true’ trend: blue dashed line, ‘true’ error distributions: blue distributions) being fit with two different experimental designs (less optimal: red, more optimal: green). Examples of feasible fits are shown as solid red and green lines.

In fact, we only need to observe a single x value to calibrate the model. If we select the x level near zero, at x′1 as in the red dataset, the feasible fits shown as the red lines can still vary widely depending on the realization of the data. If instead the data is collected at an x level as far from zero as possible, such as x1 in the green design, the estimate of the slope and the resulting feasible fit will improve, as is shown by the green lines representing feasible fits for this case. Our assumption of a zero intercept has reduced the model flexibility and therefore changed what design will give optimal results when we perform the fitting. If we have good prior evidence that b = 0 but we do not assume this constraint, as in the first model, we will need to spread out our observations over two levels of x to achieve a good fit. However, in the end this may be wasted effort, in light of the strong existing evidence for b = 0. Our assumptions determine what choices appear optimal for us in the moment by encoding the information that we have at hand. Conversely, poor experimental designs can result from making unjustified assumptions about the model structure. If, for example, the intercept parameter was actually nonzero but we assumed that b = 0, even the more optimal green design in

Figure 1.2 would not produce a good fit, and worse, we may never know it. The optimal design we compute always depends on the assumptions being made, but its true performance in turn depends on the accuracy of those assumptions. For low-dimensional, linear models with normal error distributions, good designs can be created through intuition and by thinking through the model geometry. However, when the model is nonlinear and the data comes from a complex distribution, this is no longer the case, and tools from optimal experimental design can provide some guidance.
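To make the comparison concrete, the following minimal simulation sketch (my own illustration, not part of the published chapters) fits the intercept model of Equation (1.1) repeatedly under a close-together and a spread-apart design; the ‘true’ slope, intercept, noise level, and input placements are all illustrative assumptions rather than values taken from the figures.

    # A minimal simulation sketch of the two designs discussed above; the
    # 'true' parameter values, noise level, and input placements are all
    # illustrative assumptions, not values taken from the figures.
    import numpy as np

    rng = np.random.default_rng(0)
    m_true, b_true, sigma = 1.0, 0.5, 0.3  # assumed data-generating process

    def estimate_spread(x1, x2, n_rep=3, n_trials=2000):
        """Simulate repeated experiments at two input levels, fit y = m*x + b
        by least squares, and return the standard deviation of the fitted
        (m, b) estimates across the simulated experiments."""
        x = np.repeat([x1, x2], n_rep)
        X = np.column_stack([x, np.ones_like(x)])
        fits = []
        for _ in range(n_trials):
            y = m_true * x + b_true + rng.normal(0.0, sigma, size=x.size)
            est, *_ = np.linalg.lstsq(X, y, rcond=None)
            fits.append(est)
        return np.std(fits, axis=0)

    print("close inputs  (m, b) std:", estimate_spread(0.4, 0.6))
    print("spread inputs (m, b) std:", estimate_spread(0.0, 1.0))

Under these assumptions, the slope estimates from the spread design vary several-fold less across simulated experiments than those from the close design, mirroring the red versus green comparison above.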

1.2 Thesis Overview

This thesis is a synthesis of several related projects; most of the chapters have appeared as journal articles or conference proceedings. The general theme running through each chapter is the design of experiments for studying various aspects of microbial gene expression and regulation. Therefore, in Chapter 2, I give some history and background on the theory of optimal experimental design for parameter estimation. Chapter 3 consists of a review article written by myself and Brian Ingalls detailing some of the new opportunities for optimal experimental design appearing in microbial systems and synthetic biology. This review specifically details the emergence of new experimental techniques in the field and how these techniques can be paired with appropriate experimental design algorithms. Chapter 4 consists of a journal article, co-authored with Brian Ingalls and Matt Scott, describing a physiologically-aware model of gene expression, and an algorithm for generating optimal experimental designs useful for calibrating the model. In this work we handled the optimal design problem using a model predictive control algorithm implemented with multiple shooting. Chapter 5 consists of a shorter conference paper written with Max Reed and Brian Ingalls on the convex structure of optimal sample scheduling problems for dynamic stimuli experiments; this is then applied to a simple dynamic model of gene expression. Chapter 6 consists of another conference paper, written with Addison Richards and Brian Ingalls, studying optimal experimental design for a bi-stable genetic motif observed at single-cell resolution and at steady state. In Chapter 7, I summarize current work on, and the implementation of, the NLOED package – a Python software package for nonlinear model building and optimal experimental design. Work on the package has not yet been published but the software is being used in some current laboratory projects. Finally, in Chapter 8, I conclude by summarizing some common themes and lessons shared across the various works.

Chapter 2

Background on Optimal Experimental Design

Optimal experimental design (OED) is a sub-field of statistics that uses mathematical optimization to guide the selection of observations in order to achieve a given statistical objective [15, 16]. In this chapter I give a brief description of the history and mathematical background of OED. However, before starting on this material, it is useful to briefly clarify the relation of OED to some other closely related sub-fields, and to justify the terminology used in this work.

Optimal experimental design can be considered a subset of the broader statistical sub-field called design of experiments (DOE) [17]. DOE focuses on the general aspects of experimental design, such as creating standardized, easy-to-implement designs with well-balanced performance in fitting many common regression models [18]. DOE also includes other sub-fields that have less of an emphasis on modelling objectives, such as response surface methodology, which deals with optimizing process performance [19]. OED specifically differentiates itself from other areas within DOE with its heavy use of optimization applied to mathematically formalized objectives – especially objectives related to improving the accuracy of parameter estimates and model fitting [15, 16]. Like its parent field, OED has traditionally focused on regression models; however, there has also been a long history of applying the mathematical ideas from OED to nonlinear models found in other scientific disciplines [16, 20]. I generally use the term OED to describe the methods implemented in this thesis, as I feel it is the most accurate term available. Work in this thesis has specifically focused on optimizing experiments using statistical objectives developed within the OED literature. However, within systems biology and bio-engineering, there has emerged another commonly used term, model-based design of experiments (MBDOE),

which is often used in a similar manner to OED [21, 22]. In my readings I have come to understand MBDOE as having a broader set of experimental goals, like the parent field of DOE, but researchers working under the MBDOE heading have focused on the more complex, mechanistic models found in systems biology and bio-engineering. Therefore, when discussing broader statistical objectives and novel methods developed specifically for systems biology, I have also used MBDOE in certain chapters (Chapter 3, for example).

2.1 A Short History

Experiments have been done for hundreds, if not thousands of years. It seems reasonable to assume that experimenters have tweaked and optimized their experimental methods to achieve better results for almost as long. These improvements may have involved an improved apparatus, better measurement devices or techniques, or more practice with the prescribed protocol by the experimenters themselves. These practical aspects of experimentation are perhaps the most important for generating useful results. However, optimal experimental design in this thesis deals with a more narrow and abstract aspect of experimental design. Here, experimental design refers specifically to the mathematical characterization of numerical measurements and their relation to proposed mathematical models of the process. This perspective on experimentation is a comparatively recent innovation, and before describing the mechanics of optimal design, it is useful to situate optimal experimental design in a broader scientific context.

Modern statistical theory, on which OED is constructed, began to be formalized around the end of the 19th and into the early 20th century by the likes of Francis Galton and Karl Pearson [23]. These early practitioners of statistics built upon more theoretical work on probability by the likes of Laplace and Gauss in the 18th century, but unlike their predecessors their interests turned away from abstract games of chance and astronomical observations and towards problems in data interpretation found in daily life [23]. Before the modern field of statistics emerged, the idea of observational errors and random variation in data was largely handled in an ad hoc manner [23]. Methods for handling data were developed independently for specific fields, like astronomy, without a more general theory for how an experiment, data, and models may be connected [23]. However, by the early 20th century, ideas like sampling distributions, model fitting, and parameter estimates were being formalized for increasingly more general situations [23]. It is only by developing these concepts that early statisticians were first able to formalize exactly what constitutes an experiment, from a mathematical perspective, and what formalized objectives may be pursued in doing them. It is in this context that the first works on the design of experiments

began to emerge.

The first modern work on optimal experimental design is generally thought to be Kirstine Smith's 1918 paper on optimal design for polynomial regression models [24]. (Stigler notes that there exists at least one pre-modern work, but it is of more historical rather than scientific interest [25].) Smith's work was in many ways ahead of its time, as many statistical ideas had not yet diffused into general scientific practice. Before optimal experiments could be used in real experiments, the statistical ideas on which they were based needed to mature and reach more widespread adoption. Ronald Fisher, perhaps the most important statistician in history, made ground-breaking contributions to statistical theory, and his work helped catalyze the adoption of statistical methods and experimental design by a much wider scientific audience [26]. Fisher wrote “The Arrangement of Field Experiments” in 1926 [6] and “The Design of Experiments” in 1939 [7], both of which are classic early works on experimental design that explained the statistical aspects of experimental methods to the general scientific community. Fisher also made significant contributions to maximum likelihood estimation and derived the Fisher information matrix, mathematical concepts that are central to OED and this thesis [27]. Fisher's experimental work was influenced by his position at the Rothamsted agricultural research station, where agricultural field trials had been conducted for nearly a half-century at the time of his appointment. This close contact with daily experimental work is evident in Fisher's useful blend of practical and theoretical analysis, and much of the modern tradition of experimental design begins with this early work by Fisher [28].

After Fisher, the design of experiments blossomed into a thriving sub-discipline in its own right. Many standard designs in DOE were developed and analyzed in the decades from 1940-1960, including fractional factorial designs [29], Plackett-Burman designs [30], and orthogonal arrays [31]. George Box, perhaps the most important contributor to experimental design after Fisher, also became active in this period [32]. During this time Box developed response surface methodology (RSM) for designing experiments used to optimize processes in the chemical industry [33]. He also did early work on experimental design for non-linear dynamic models of chemical reactions [20]. This latter work is of special interest as it is the first example of optimal design being applied to a numerically integrated system of nonlinear differential equations, a topic which has become a central theme in optimal design for systems biology and in this thesis. Box's nonlinear work was perhaps, like Smith's founding paper, before its time, as the computational tools that would make it practically useful in the hands of everyday experimenters would not appear for several decades.

The standardized designs developed in the DOE literature in this era were easy to generate and gave good performance in fitting simple linear regression models. However,

they were not necessarily well adapted to more complicated experimental constraints or models. Designs in RSM, too, were focused on a very simple class of models, and were aimed more at finding the optimum operating conditions for a process than at producing improved parameter estimates or predictions. However, in parallel to work in the general DOE and RSM literature, researchers in the 1950's also began to study exactly what made experiments optimal for statistical goals like parameter estimation and prediction. This began as a more detailed mathematical study of classic regression models, but eventually involved more complex tools from convex analysis. This research program had early contributions from Wald [34], Chernoff [35], and Elfving [36, 37]. But some of the most notable contributions were from Jack Kiefer [38]. Kiefer's early works helped frame and address the main problems of the OED field [38, 39], and his seminal paper with Jacob Wolfowitz first described the general equivalence theorem [40]. The equivalence theorem given in Kiefer and Wolfowitz's paper is important for several reasons. First, it showed that designs that were optimal for estimating parameters were also optimal for certain measures of prediction accuracy. Second, it showed the equivalence between certain easily checked conditions on the model and the optimality of a design, providing useful criteria for checking optimality. Third, it provided mathematical tools for constructing numerical algorithms for optimizing designs. Kiefer and other contributors in this era laid the groundwork for generalizing experimental designs to non-standard models and experimental constraints. Their detailed study of design optimization freed experimenters from needing to use standard off-the-shelf designs. Now, instead, methods from OED could provide a tailored experiment for any scenario [41].

After its formative years in the 1950's and 1960's, OED methods have slowly diffused into neighbouring disciplines and application areas. This has been facilitated by the increasing availability of computing power and software needed for optimization [41]. In the field of statistics, optimal design theory and methods have been expanded to increasingly more complex models [42] and data distributions [43]. Ljung and colleagues, working in control theory, have used ideas from optimal design in system identification, and have extended these methods to frequency domain analysis of linear time-invariant systems [44]. Following in Box's footsteps, chemical engineers have continued to explore applications of optimal experimental design applied to complicated nonlinear chemical and biochemical processes [21]. Optimal design has also found applications in pharmaceutical research and drug development, for improving the efficiency of data collection [45, 16].

2.2 Mathematical Necessities for Optimal Experimental Design

Optimal experimental design centres on choosing an ideal set of experimental conditions in which to take observations. However, in order to define what conditions are useful and to be able to pose this as an optimization problem, some mathematical background needs to be developed. In this thesis I have primarily focused on optimizing experiments for fitting model parameters. In this section I mathematically formalize the general definition of a model, with special attention paid to the connection between the model structure and the sampling distribution of observations. I also formalize the notion of an experimental design for a given model, which determines the structure of a dataset that will be used for fitting. After this I give a brief overview of maximum likelihood estimation (MLE) and its asymptotic properties, which provides a flexible framework for fitting a large variety of models. The asymptotic properties of the MLE parameter estimates allow us to quantify the expected variability of the parameter estimates around the true parameter value. These asymptotics also lead to the Fisher information matrix (FIM), which can be used to compute the expected utility of experiments that have not yet been performed. I conclude by discussing objective functions computed from the FIM and by giving a brief introduction to OED-type optimization problems.

It should be noted that the resulting optimal experimental design depends on all of the mathematical components discussed in this chapter. Designs are optimized for a specific model, combined with a specific fitting method, and for specific asymptotic objectives. This means that careful attention should be paid to all model components, estimation methods, and assumptions, as changing any aspect of the overall modelling process can alter the structure of the resulting optimal design.
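As a preview of how the FIM scores a design before any data are collected, the short sketch below (my own illustration, with assumed input placements and noise level) computes the information matrix for the linear model of Section 1.1 under the two candidate designs; its determinant is the D-optimality criterion discussed later in this chapter.

    # Sketch: scoring candidate designs for E(y) = m*x + b with the FIM.
    # For independent Gaussian observations with variance sigma^2, the FIM is
    # (1/sigma^2) * sum_j g(x_j) g(x_j)^T, with sensitivity g(x) = [x, 1].
    import numpy as np

    def fim_linear(x_points, sigma=0.3):
        G = np.column_stack([x_points, np.ones(len(x_points))])
        return G.T @ G / sigma**2

    close  = fim_linear(np.repeat([0.4, 0.6], 3))   # inputs bunched together
    spread = fim_linear(np.repeat([0.0, 1.0], 3))   # inputs spaced apart

    # A larger det(FIM) implies a smaller asymptotic confidence ellipsoid
    # for the estimates of (m, b).
    print("det FIM, close design: ", np.linalg.det(close))
    print("det FIM, spread design:", np.linalg.det(spread))

Under these assumed values, the spread design's determinant is over an order of magnitude larger, consistent with the intuitive comparison in Chapter 1.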

2.2.1 Components of a Data-generating Model

Models are abstractions of a true underlying data-generating process in a given experimental scenario. For the purposes of this thesis, a model can be defined as a mathematical structure that connects the observation variables, Yi – which are measured during an experiment – with the experimental conditions that are known or can be controlled, xj. Here the observation variables are indexed by i, where each i, from 1 to M, corresponds to a unique observation variable (i.e. different measurement types taken of the system). The experimental conditions, xj, are vector-valued, numerical encodings of any of the relevant conditions known or controlled by the experimenter when the observations are taken (i.e.

ambient conditions, treatments, physical or chemical perturbations, etc.). Each dimension of the input vector may be subject to various restrictions, such as some inputs being restricted to a real interval or being drawn from a discrete set. (These constraints can be ignored for the purposes of this exposition; see Chapter 7 for further discussion of how these constraints are handled in practice.) As the system can be observed in multiple sets of conditions in a single experiment, the set of condition vectors is indexed by j, from 1 to N, with a total of N possible experimental condition vectors. As all experimental observations are affected by random errors or uncontrolled variation, the observation variables, Yi, are random variables. The observation distribution, pi(.), of each observation variable is conditional on the experimental vector, xj, and so to be formally correct, each observation type in each unique condition is a unique random variable, Yi,j. (However, it is convenient when referring to all observations of the same type, regardless of the experimental conditions, to suppress the j subscript and use Yi.) With the above definitions, a full model can be formalized as

\[
Y_{i,j} \sim p_i\!\left(y_{i,j} \mid \eta_{i,j} = f_i(x_j, \theta)\right). \tag{2.1}
\]

In this work pi(.) is generally approximated as a parametric distribution, for example the normal, log-normal or Poisson distributions. Observation variables, Yi, are assumed to come from one parametric distribution type, such that each pi(.) always has the same type regardless of the inputs, xj, or parameter values, θ. The support of Yi depends on the distribution type assumed such that Yi can be restricted to the real numbers, positive real numbers or positive integers as the experimental situation dictates. Here ηi,j are the distribution pi(.)'s natural parameters; in this work they are referred to as the sampling statistics of the distribution. For example, the normal distribution would have the mean and variance as its sampling statistics; the Poisson distribution would have the λ rate parameter as its sampling statistic. A repeated observation of the same observation variable and experimental conditions is termed a replicate. There can be a unique number of replicates of each observation and condition, for a total of βi,j in each combination. Each replicate results in a realization \(y_{i,j}^{(k)}\) of an identically distributed random variable Yi,j; here the k subscript indexes the replicates from 1 to βi,j. The deterministic model component fi(xj, θ) describes the relationship between the experimental conditions, xj, and the sampling distribution statistics, ηi,j, and therefore provides the link between the experimental conditions and the random but conditional variability of the observations. It should be stressed that the function fi(.) can take the form of any sufficiently-smooth mathematical or numerical expression; it can therefore be a constant, an algebraic expression, a differential equation solution, the root of an implicit function or the solution to an optimization problem. Finally the functional mapping between the experimental conditions xj and the

sampling statistics ηi,j, is parameterized by the unknown parameter vector θ. In this work it is generally the case that θ needs to be estimated from the data resulting from any designed experiments. Note that in this formulation there is some ambiguity in whether observations of the same type taken at different points in time or space are handled as model input dimensions in xj – by entering the temporal or spatial location directly as an input dimension value – or as separate observation variables Yi – with their own distributions pi(.) and functions fi(.). However for the theoretical exposition of parameter estimation and optimal design this distinction has no direct impact; see Chapter 7 for practical considerations. In this work we assume that the experimenter is confident in postulating the sampling distributions, pi(.), and model structure, fi(.), for each observation variable. This necessarily implies the experimenter has some knowledge about the system, as making choices for these model components imposes significant constraints on the model's flexibility and the resulting optimal design. Information for model specification can come from past data or from mechanistic considerations given knowledge of how the system functions. In this work, it is assumed that the main source of uncertainty for the experimenter is the parameter vector, θ, which needs to be fit to the collected observations. Uncertainty in model structure can be included in a more general optimal design framework [46, 47, 48], however model selection is not a central theme in this work.
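To make the preceding formalism concrete, the following minimal Python sketch (an illustration constructed for this exposition, not code from the thesis) encodes a single observation variable: a deterministic component f that maps an experimental input and a parameter vector to the sampling statistics of a normal observation distribution, and a sampler that realizes an observation Yi,j. The exponential-decay mean and the fixed measurement variance are assumptions made purely for illustration.

```python
import numpy as np

def f(x, theta):
    # Deterministic model component: maps input x and parameters theta to the
    # sampling statistics (mean, variance) of a normal observation distribution.
    # The exponential-decay form is an arbitrary illustrative choice.
    mean = theta[0] * np.exp(-theta[1] * x)
    variance = 0.1 ** 2  # assumed known, constant measurement variance
    return mean, variance

def sample_observation(x, theta, rng):
    # Realize Y_{i,j} ~ p(y | eta = f(x, theta)), with p taken to be normal.
    mean, variance = f(x, theta)
    return rng.normal(mean, np.sqrt(variance))

rng = np.random.default_rng(0)
y = sample_observation(x=1.5, theta=np.array([10.0, 0.8]), rng=rng)
```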

2.2.2 Definition of an Experimental Design

An experimental design specifies the structure of the data that is collected. A design must include two aspects to fully specify an experiment: 1) which experimental conditions are used, xj, and 2) how many replicates, βi,j, are taken of each observation variable at the given condition vector. When discussing the set of input conditions used, it is common to refer to the design's support, which consists of the unique set of input points used in the experiment [15]. Therefore each xj from 1 to N must be unique, and the full set of N vectors constitutes the design's support. At each unique support point, a design must also specify a set of replication counts, βi,j, taken for each observation variable, Yi, for i from 1 to M. The replicate count, βi,j, is integer valued and specifies the number of replicates to be taken from the specific random variable Yi,j. A design D is then fully specified by the list of support points xj ∈ X and the list of replication counts βi,j ∈ B. Note here that βi,j can have a value of zero indicating that no observations of the ith observation variable are taken for the jth experimental condition vector. The total sample size for the experiment, \(N_{Tot}\), is computed as \(N_{Tot} = \sum_{i=1}^{M} \sum_{j=1}^{N} \beta_{i,j}\). A dataset, yD, corresponding to design, D, consists of a listing of the \(N_{Tot}\) observations, \(y_{i,j}^{(k)}\), for all replicates, k from 1 to

βi,j, observation variables, i from 1 to M, and experimental conditions, j from 1 to N. A design determines the structure of the dataset that results from it.
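As a hedged illustration of this bookkeeping (the variable names are invented for the example), a design can be stored as a list of support points together with a matrix of replicate counts, and the total sample size recovered as the double sum over that matrix:

```python
import numpy as np

# Support: N = 3 unique experimental condition vectors (scalar inputs here).
support = [np.array([0.5]), np.array([1.5]), np.array([3.0])]

# Replicate counts beta[i, j]: rows index the M = 2 observation variables,
# columns index the N = 3 support points; a zero means "not measured there".
beta = np.array([[3, 0, 2],
                 [0, 4, 1]])

# Total sample size N_Tot = sum over i and j of beta_{i,j}.
n_total = int(beta.sum())  # 10 observations in the resulting dataset
```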

2.2.3 Maximum Likelihood Estimation

Maximum likelihood is a well-tested and generally applicable method for fitting probabilistic models [49, 16]. The key idea in maximum likelihood estimation is to seek the parameter values that maximize the likelihood of observing the given data [49]. In order to specify a maximum likelihood estimator for a given model, the model must include a sampling distribution, pi(.), as above. In general, it is not always possible to specify an analytic expression for the observation distribution or to know which parametric distribution may best approximate the observations. However it should be noted that even simpler fitting methods, like least-squares fitting, can be seen as a special case of MLE in which the user is (implicitly) assuming normally distributed errors [50]. In order to specify the maximum likelihood estimator, we need to derive an expression for the total data likelihood [49]. The likelihood is directly related to the probability distribution of the data conditioned on the parameter values [49]. Using a single dimensional example, the conditional distribution of a random observation variable Y is p(y|θo); this is the distribution for an individual observation y conditioned on a fixed true value of θo. The likelihood is functionally identical to the conditional distribution of the observation except with a reversal of the nominal input; for a fixed observation value, y, the likelihood L(θ; y) = p(y|θ) is a function of any candidate value of θ. Assuming independence of observations, the total likelihood, \(L_{Tot}(\theta; y_D, D)\), of all replicates, \(y_{i,j}^{(k)}\), in a given experimental dataset with design, D, and observations, yD, can be expressed as

\[
L_{Tot}(\theta; y_D, D) = \prod_{j=1}^{N} \prod_{i=1}^{M} \prod_{k=1}^{\beta_{i,j}} p_i\!\left(y_{i,j}^{(k)} \,\middle|\, \eta_{i,j} = f_i(x_j, \theta)\right). \tag{2.2}
\]

(Note, dependent observations are discussed below.) The design influences the total likelihood by varying both the inputs, xj, and the replicates, βi,j. The parameter vector, θ, that optimizes the likelihood expression is the maximum likelihood estimate, denoted as \(\hat{\theta}\). While the likelihood is straightforward to define, it is almost always advisable to maximize the log-likelihood, \(l_{Tot}(\theta; y_D, D) = \log(L_{Tot}(\theta; y_D, D))\). The log(.) function is monotonically increasing and therefore the maximizing parameter vector for the log-likelihood is identical to that of the likelihood. However the log-likelihood can be numerically simpler

to compute and more stable. Also most of the asymptotic results for the MLE are developed using the log-likelihood because, for independent observations, the logarithm converts the products involved in a joint observation distribution into sums. Expressing the estimator in terms of a summation of random terms leads more naturally to the application of various central limit theorems and the subsequent asymptotic derivations [16]. For brevity in writing out the complete data log-likelihood we express the log-probability of each data point as

\[
l_i(\theta; y_{i,j}^{(k)}, x_j) = \log\!\left(p_i(y_{i,j}^{(k)} \mid \eta_{i,j} = f_i(x_j, \theta))\right). \tag{2.3}
\]

Then the total log-likelihood can be expressed as a summation such that

\[
l_{Tot}(\theta; y_D, D) = \sum_{j=1}^{N} \sum_{i=1}^{M} \sum_{k=1}^{\beta_{i,j}} l_i(\theta; y_{i,j}^{(k)}, x_j). \tag{2.4}
\]

For brevity, the dependence of the total log-likelihood, lTot(θ; yD, D), on the data, yD, and the design, D, is often suppressed because for a given dataset these values are constants. In these cases the total log-likelihood is abbreviated as lTot(θ). The additive nature of each data point within the total log-likelihood has important consequences for the structure of optimal experimental design problems. However, not all observation variables are independent. If we admit dependence between various observations, it will be reflected in the joint distribution of various groups of Yi,j. Dependence between observations means that the total log-likelihood ceases to be additive because the joint probability of two observations does not factor: \(p(Y_{a,b}, Y_{c,d}) \neq p(Y_{a,b})\,p(Y_{c,d})\). In this case the user must specify a joint distribution for any correlated observations that are taken together, and also specify a marginal distribution for each observation taken alone. In addition, analytic multivariate distributions are only available for a subset of the possible observational supports that may be required. For example, expressing correlation between two normal distributions is straightforward, but expressing dependence between a normal and Poisson distribution, or between pairs of Poisson distributions, is not analytically tractable. If we wish to have an analytically tractable log-likelihood (with consequences later for the optimal design computation), there is a choice in either accommodating multiple data types easily, but with presumed independence, or in handling dependent observations with a much more limited set of observation distributions. In this thesis I have generally focused on models where observations can be approximated as independent.
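The following sketch illustrates maximum likelihood fitting for the independent-observation case, reusing the illustrative exponential-decay model from the earlier examples; the data are simulated, and a generic numerical minimizer is applied to the negative total log-likelihood (all specifics here are assumptions made for the example):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def f(xs, theta):
    # Deterministic component: mean of a normal observation, variance known.
    return theta[0] * np.exp(-theta[1] * xs)

def neg_log_likelihood(theta, xs, ys, sigma=0.1):
    # Total log-likelihood is the sum of per-observation log-probabilities
    # (equation 2.4, independence assumed); negated for the minimizer.
    return -np.sum(norm.logpdf(ys, loc=f(xs, theta), scale=sigma))

# Simulate a dataset from an assumed "true" parameter vector.
rng = np.random.default_rng(1)
theta_true = np.array([10.0, 0.8])
xs = np.repeat([0.5, 1.5, 3.0], 4)  # 3 support points, 4 replicates each
ys = f(xs, theta_true) + 0.1 * rng.standard_normal(xs.size)

fit = minimize(neg_log_likelihood, x0=np.array([5.0, 0.5]), args=(xs, ys))
theta_hat = fit.x  # the maximum likelihood estimate
```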

2.2.4 Quantifying the Accuracy of Parameter Estimates

Observations in a given realization of an experimental design are the result of a random process. This implies that the maximum likelihood estimate, \(\hat{\theta}\), is also a random variable. Given the randomness inherent in the observations, it is unlikely that \(\hat{\theta}\) exactly matches the true parameter value, θo. However despite this variability, we want the estimate to be near to the truth on average, and a key goal in evaluating fitting procedures and experimental designs is in quantifying how close to the truth the estimate is likely to be. At a bare minimum, for a large variety of models, it can be proven that the MLE will recover the true parameter value in the limit of large sample sizes. For example this has been proven for nonlinear regression models [51], generalized linear models [52], and models with non-identically distributed data [53]. (These are all model types that are accommodated by the general formalism outlined previously, however the reader is warned that it is possible to specify models for which the MLE has not been proven to be consistent within the above framework.) For the asymptotic results to apply it is important that the model is identifiable [54]. Identifiability requires that for any pair of candidate parameter values, θa and θb, the model yields distinguishable predictions for at least one possible input vector xj and observation variable Yi; there must exist conditions for which \(f_i(x_j, \theta_a) \neq f_i(x_j, \theta_b)\) holds. This implies that at least one possible observation variable will have a distinguishable difference in distribution for alternative parameter values under some conditions achievable in an experiment. (As a simple counter-example, a model with mean f(x, θ) = θ1θ2x is not identifiable, because only the product θ1θ2 affects the predictions.) Many models will conform to this assumption if they are not over-parameterized and a sufficient variety of system measurements can be included in an experimental design. Even in some cases that do not satisfy the identifiability condition, re-parameterization can yield an identifiable model. Given this identifiability assumption and some other mild regularity conditions, the maximum likelihood estimate is both asymptotically non-biased and consistent [55]. For models meeting the required conditions, these properties imply that in the limit of large sample sizes the expected difference between the estimate and true parameters goes to zero, and the MLE estimate converges in probability to the true value (i.e. the probability that the estimate deviates from the true value by more than any fixed margin goes to zero in the limit). The asymptotic non-biasedness and consistency of the estimator suggest that it is at least plausible that the MLE estimate yields reasonable results for identifiable models. However, when dealing with finite sample sizes in real experiments, estimates for non-linear models are almost always biased and can exhibit significant variability in their distribution around the true value [56]. With finite sample sizes, the design of the experiments can have a significant impact on the distribution of the estimate, \(\hat{\theta}\) [16]. The experimental design influences \(\hat{\theta}\)'s distribution by mediating the sensitivity of the estimate to the variability

inherent in the observations. Therefore it is desirable to have a more precise measure of the effect of a design D on the distribution of the estimate about the true value. An ideal measure of accuracy is the expected mean squared error of the estimate,

\[
\mathrm{MSE}(D, \theta_o) = \mathrm{diag}\!\left\{\mathbb{E}_{y_D}\!\left[(\hat{\theta} - \theta_o)^T \cdot (\hat{\theta} - \theta_o) \,\middle|\, D, \theta_o\right]\right\}, \tag{2.5}
\]

(with respect to the data, yD, as distributed under the true parameter vector, θo) which is effectively the second central moment of each parameter estimate about its true value [49]. Here diag{.} returns the diagonal elements of a square matrix, and \(\hat{\theta}\) and θo are presumed to be row vectors. For computing the MSE, it can be useful to decompose it as follows:

\[
\mathrm{MSE}(D, \theta_o) = \mathrm{diag}\!\left\{\mathrm{Cov}(\hat{\theta} \mid D, \theta_o)\right\} + \mathbb{E}_{y_D}\!\left[(\hat{\theta} - \theta_o) \,\middle|\, D, \theta_o\right]^2. \tag{2.6}
\]

The \(\mathrm{Cov}(\hat{\theta} \mid D, \theta_o)\) term is the covariance matrix of the MLE estimate, the second central moment of \(\hat{\theta}\) about its expected value. The estimate's covariance tells us about the raw variability of the estimator, but does not contain any information on its nearness to the true value, θo. The \(\mathbb{E}_{y_D}[(\hat{\theta} - \theta_o) \mid D, \theta_o]\) term is the estimate's bias, denoted as \(\mathrm{Bias}(\hat{\theta} \mid D, \theta_o)\), and measures the expected difference between the estimate and the true parameter vector [49]. (Note, the square in the bias term is applied element-wise.) Unfortunately it is impossible to compute the MSE, covariance, or bias exactly because the distribution of the estimator, \(p(\hat{\theta} \mid D, \theta_o)\), is generally not known or analytically tractable, as it results from a nonlinear transformation of the observation randomness propagated through the fitting process. However it is possible to compute an asymptotic approximation to the MSE, and the bias and covariance terms. To do so, an expansion is taken in the square root of the inverse of the sample size, \(N_{Tot}\); this analysis is complex and tedious and further details can be found elsewhere [57]. In performing the expansion it can be shown that the bias has no effect until after the first order, whereas the covariance expansion contains a non-zero first order term [56, 57]. This means that the variability of the estimator is generally dominant in the overall error compared to the estimator's bias, at least for reasonably large sample sizes. Therefore in this work, like most optimal design works, we focus on approximations for the estimator's covariance. Approximating the covariance matrix of the MLE parameter estimate can be done using the Fisher information matrix. This result is often given as part of the proof of the asymptotic normality of the MLE [50, 16]. The Fisher information matrix for a single observation, Yi, at input, xj, is defined as

\[
\mathcal{I}_i(x_j, \bar{\theta}) = \mathbb{E}_{Y_{i,j}}\!\left[\nabla_\theta\, l_i(\bar{\theta}; y_{i,j}, x_j) \cdot \nabla_\theta\, l_i(\bar{\theta}; y_{i,j}, x_j)^T\right]. \tag{2.7}
\]

Here ∇θ is the parametric gradient operator, which is treated as a column vector. Note that the Fisher information would ideally be evaluated at the true parameter value θo, however this is unknown. Instead, the FIM is generally computed at the estimate, \(\hat{\theta}\) – if initial data is available – or at a nominal value, \(\bar{\theta}\), based on literature or other considerations. With independent observations, the total Fisher information matrix for an entire experiment can be computed as a weighted sum of the individual FIMs at each replicate:

\[
\mathcal{I}_{Tot}(D, \bar{\theta}) = \sum_{j=1}^{N} \sum_{i=1}^{M} \beta_{i,j}\, \mathcal{I}_i(x_j, \bar{\theta}). \tag{2.8}
\]

Here the weights, βi,j, are the number of replicates taken in the experimental conditions corresponding to the individual Fisher information matrix, \(\mathcal{I}_i(x_j, \bar{\theta})\). Via the asymptotic expansion, it can be shown that the covariance of the parameter estimate with the given experimental design is approximated by the matrix inverse of the total FIM [16]:
\[
\mathrm{Cov}(\hat{\theta} \mid D, \bar{\theta}) \approx \mathcal{I}_{Tot}(D, \bar{\theta})^{-1}. \tag{2.9}
\]
To be clear, this relation approximates the estimator's covariance assuming the nominal parameter vector, \(\bar{\theta}\), is near to the true value θo (i.e. this is a local approximation, see additional comments below). The full proof of the FIM's connection to the parametric covariance is tedious and is not reproduced here, however it is possible to gain some intuition for how the FIM is linked to the covariance by observing some of its easily derived properties. By definition the maximum likelihood estimate occurs at the point, \(\hat{\theta}\), that maximizes lTot(θ). It follows that, at the optimum, \(\nabla_\theta l_{Tot}(\hat{\theta}) = 0\). Here the vector-valued function \(\nabla_\theta l_{Tot}(\theta)\) is called the score, and the MLE must occur at a point where all entries in the score are zero (due to first-order optimality conditions) [49]. From the definition of the FIM, we see that it is the matrix of second moments and cross-moments for the score evaluated at our guess of the true parameter vector. It can be further shown asymptotically (at first order) that the expected value of the score evaluated at the true parameter vector, \(\mathbb{E}_{y_D}[\nabla_\theta l_{Tot}(\theta_o)]\), is zero [49], which is consistent with the fact that the MLE is asymptotically non-biased. This expectation implies that the diagonal entries of the FIM are not just the second moments of the score, but the second central moments about its expected value, because
\[
\mathrm{Var}\!\left(\nabla_\theta l_{Tot}(\theta_o)\right) = \mathrm{diag}\!\left\{\mathbb{E}_{y_D}\!\left[\nabla_\theta l_{Tot}(\theta_o) \cdot \nabla_\theta l_{Tot}(\theta_o)^T\right]\right\} - \mathbb{E}_{y_D}\!\left[\nabla_\theta l_{Tot}(\theta_o)\right]^2 = \mathrm{diag}\{\mathcal{I}_{Tot}(D, \theta_o)\} - 0. \tag{2.10}
\]
As the root of the score function fully defines the MLE, we should expect its variability to be strongly connected to the variability of the MLE estimate; in this case the two are inversely related.

Figure 2.1: A figure comparing the total log-likelihood and score behaviour for multiple realizations of both an optimal design (left) and a sub-optimal design (right). The blue curves correspond to the average log-likelihood (top) and score function (bottom) for the given design – averaged over all possible datasets. The other colors represent the log-likelihood and score functions for specific realizations of the design, each with random observations.

If the variability of the score is large over multiple realizations of the experimental design, it implies that the estimate is heavily constrained by the design to fall near its expected true value. This scenario is depicted in Figure 2.1 where, for the optimal design on the left, the intercepts of the individual score functions with the (blue) vertical line through the true parameter value, θo, exhibit large variability while the score functions' roots, the \(\hat{\theta}\)s, cluster around the true value. For the sub-optimal design on the right, the score function's value taken at the true parameter exhibits less variability and the estimates are therefore more variable. Intuition for the FIM can be developed further by expressing the FIM in an alternative form in terms of the log-likelihood's second derivatives. For an individual independent observation it can be shown that [50]
\[
\mathcal{I}_i(x_j, \theta_o) = \mathbb{E}_{Y_{i,j}}\!\left[\nabla_\theta l_i(\theta_o; y_{i,j}, x_j) \cdot \nabla_\theta l_i(\theta_o; y_{i,j}, x_j)^T\right] = -\mathbb{E}_{Y_{i,j}}\!\left[H_\theta\!\left(l_i(\theta_o; y_{i,j}, x_j)\right)\right]. \tag{2.11}
\]

Here Hθ(.) is the Hessian operator with respect to the parameter vector, such that the latter term above is the matrix of expected values of the second parametric derivatives of the

log-likelihood. These two definitions of the FIM are equivalent under mild regularity conditions; to understand why, it is helpful to consider the single dimensional case. For a single observation y and a single parameter θ, the sampling probability is p(y|θ) and the log-likelihood is l(θ; y) = log(p(y|θ)). Under mild regularity conditions, it follows from the identity \(\int p(y|\theta)\,dy = 1\) that \(\frac{d}{d\theta}\int p(y|\theta)\,dy = 0\) and \(\frac{d^2}{d\theta^2}\int p(y|\theta)\,dy = 0\) also hold. It then follows that [58]

 2  2  " d2 # d ! d 2 p(y|θ) p(y|θ) E l(θ; y) = E dθ − E dθ , dθ2 p(y|θ)  p(y|θ) 

2 Z d2 Z d p(y|θ) = p(y|θ)dy − dθ dy, dθ2 p(y|θ) (2.12) Z  d 2 = 0 − l(θ; y) p(y|θ)dy, dθ " #  d 2 = −E l(θ; y) . dθ
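As a concrete one-dimensional instance of this identity (a worked example added here for illustration), consider a Poisson observation Y with rate λ, for which both sides can be evaluated in closed form:

```latex
% Checking E[l''] = -E[(l')^2] for Y ~ Poisson(lambda):
\begin{align*}
l(\lambda; y) &= y \log\lambda - \lambda - \log(y!), \\
\frac{d}{d\lambda}\, l(\lambda; y) &= \frac{y}{\lambda} - 1,
\qquad
\frac{d^2}{d\lambda^2}\, l(\lambda; y) = -\frac{y}{\lambda^2}, \\
\mathbb{E}\!\left[\left(\frac{d}{d\lambda}\, l\right)^{\!2}\right]
  &= \frac{\mathrm{Var}(Y)}{\lambda^2} = \frac{1}{\lambda}
  = -\,\mathbb{E}\!\left[\frac{d^2}{d\lambda^2}\, l\right].
\end{align*}
```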

This result helps us to link the variance of the parameter estimate with the FIM – defined as the variance of the score. To see this, note that while \(\mathbb{E}\!\left[\frac{d^2}{d\theta^2} l(\theta; y)\right]\) is a measure of the curvature of the log-likelihood, it is also the expected slope of the score function. This is shown in Figure 2.1 where, on the left, for the optimal design, the log-likelihood exhibits significant curvature around the optima; therefore the slope of the score, shown below, is very steep. On the right, the sub-optimal design has less curvature around the optima and therefore exhibits shallow slopes in the score below. It follows from the previous paragraph that, for the optimal design, given the diagonal entries of the FIM are large, the variance of the score's intercept with the vertical line at the true parameter value will be large. This also implies that the expected slope of those same score functions will be steep. With steep average slopes, the roots of the score functions will have a smaller variability. As the score's root is the MLE estimate, this implies the MLE has a lower variability in the optimal case. The situation is reversed for the sub-optimal design, where the low variability in the score's value at the true parameter and the score's shallow average slope allow the estimates to vary widely under multiple realizations of the design. These results also hold for multiple parameters and provide some intuition for connecting the FIM to the MLE's variability.
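These relationships can also be checked by brute force. The sketch below (an illustration under the same assumed exponential-decay model as the earlier examples) simulates many realizations of a fixed design, refits the MLE each time, and summarizes the estimator's empirical bias and covariance as in the decomposition of equation (2.6):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def f(xs, theta):
    return theta[0] * np.exp(-theta[1] * xs)

def fit_mle(xs, ys, sigma=0.1):
    nll = lambda th: -np.sum(norm.logpdf(ys, loc=f(xs, th), scale=sigma))
    return minimize(nll, x0=np.array([5.0, 0.5])).x

rng = np.random.default_rng(2)
theta_o = np.array([10.0, 0.8])      # assumed true parameter vector
xs = np.repeat([0.5, 1.5, 3.0], 4)   # a fixed design: 3 supports x 4 replicates

# Repeatedly realize the design and refit to expose the estimator's spread.
estimates = np.array([
    fit_mle(xs, f(xs, theta_o) + 0.1 * rng.standard_normal(xs.size))
    for _ in range(500)
])
bias = estimates.mean(axis=0) - theta_o  # Bias(theta_hat | D, theta_o)
cov = np.cov(estimates, rowvar=False)    # Cov(theta_hat | D, theta_o)
mse = np.diag(cov) + bias ** 2           # empirical version of equation (2.6)
```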

As previously noted, we do not know the true value, θo, at which to take the asymptotic approximation and at which to evaluate the FIM. This means that any asymptotic results

are inherently local to our nominal parameter vector, \(\bar{\theta}\). In practice initial estimates for \(\bar{\theta}\) can come from the available literature or from the MLE fit to some initial data. As the underlying model (i.e. fi(.)) must be sufficiently smooth, we can also expect asymptotic measures to hold approximately for parameter values near to the nominal vector. Other strategies have been proposed for dealing with the local nature of the FIM computed at the nominal parameter value \(\bar{\theta}\). Perhaps the most common approach is to implement optimal design in an iterative fashion, with each batch of optimized experiments adding to the collected data, improving the parameter estimates for the next round [15]. In this case even though the designs are optimized for a local objective, the parameter estimates tend to become more accurate after each iteration. This accuracy is ensured by the increasing sample size, so even if the initial designs were erroneous and sub-optimal, as the estimates improve, so will the optimality of the iterative designs over the course of the experiment. However, some experimental contexts are not suitable for iterative design. In these cases other strategies exist, including Bayesian averaging over a parametric prior distribution or min-max optimization – where the design is optimized for the parameter vector that has the worst estimation performance [15, 59]. In some cases, the local analysis alone can be useful if it is used to qualitatively explore optimal design structure under various parametric scenarios within the plausible parameter space; this has been done in some parts of this thesis. Computing the derivatives and expectation required for the FIM for an arbitrary model may seem daunting. However, this derivation can be done quite simply by observing the following application of the chain rule [16]:

\[
\nabla_\theta\, l_i(\theta; y_{i,j}, x_j) = \nabla_\theta \log\!\left(p_i(y_{i,j} \mid \eta_{i,j} = f_i(x_j, \theta))\right) = \nabla_\theta f_i(x_j, \theta) \cdot \nabla_{\eta_{i,j}} \log\!\left(p_i(y_{i,j} \mid \eta_{i,j})\right). \tag{2.13}
\]

The above decomposition makes use of the fact that the sampling statistics, ηi,j, of the observation distribution, pi(.), are directly computed by the deterministic model component such that ηi,j = fi(xj, θ). The score function can therefore be decomposed into the deterministic model's parametric sensitivity and the derivative of the log-probability of the observation with respect to the sampling statistics ηi,j [16]. As the model sensitivity does not depend on the random observation values, Yi,j, this simplifies the computation of the Fisher information matrix at each observation so that we have [16]

\[
\begin{aligned}
\mathcal{I}_i(x_j, \theta) &= \mathbb{E}_{Y_{i,j}}\!\left[\nabla_\theta f_i(x_j, \theta)\, \nabla_{\eta_{i,j}} \log(p_i(y_{i,j}|\eta_{i,j})) \cdot \nabla_{\eta_{i,j}} \log(p_i(y_{i,j}|\eta_{i,j}))^T\, \nabla_\theta f_i(x_j, \theta)^T\right], \\
&= \nabla_\theta f_i(x_j, \theta)\; \Psi(\eta_{i,j} = f_i(x_j, \theta))\; \nabla_\theta f_i(x_j, \theta)^T. 
\end{aligned} \tag{2.14}
\]

22 Here Ψ(ηi,j=fi(xj, θ)) is defined as

\[
\Psi(\eta_{i,j} = f_i(x_j, \theta)) = \mathbb{E}_{Y_{i,j}}\!\left[\nabla_{\eta_{i,j}} \log(p_i(y_{i,j}|\eta_{i,j})) \cdot \nabla_{\eta_{i,j}} \log(p_i(y_{i,j}|\eta_{i,j}))^T\right]. \tag{2.15}
\]
This expectation is effectively just the Fisher information of a standard parametric distribution, such as the normal, Poisson etc., with respect to its sampling statistics. We follow Fedorov in referring to these matrices as the elemental FIMs of the given distributions [16]. These elemental FIMs have been computed for the vast majority of common one- and two-parameter distributions for which the expectation is analytically tractable. A good reference can be found in Fedorov's recent book [16]. Therefore, computing the Fisher information for the overall experimental design decomposes into computing a parametric sensitivity vector for each observation point, \(\nabla_\theta f_i(x_j, \theta)\), and matrix-multiplying it with the observation distribution's elemental FIM, \(\Psi(\eta_{i,j} = f_i(x_j, \theta))\). The formula for the elemental FIM for common parametric distributions is easily looked up and computed from the predicted values of ηi,j. The individual FIMs for each observation are then summed to produce the total FIM for the experiment. The total Fisher information matrix can be used to quantify the expected parameter estimate covariance for the given design at the nominal parameter values, \(\bar{\theta}\).
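A hedged sketch of this recipe for the running exponential-decay example: for a normal observation with known variance σ², the elemental FIM with respect to the mean is the scalar 1/σ², so each observation's FIM is an outer product of the sensitivity vector scaled by 1/σ², and the total FIM is the replicate-weighted sum of equation (2.8). The finite-difference sensitivities here are a generic stand-in for whatever sensitivity computation a particular model requires.

```python
import numpy as np

def f(x, theta):
    # Deterministic component: mean of a normal observation (variance known).
    return theta[0] * np.exp(-theta[1] * x)

def sensitivity(x, theta, h=1e-6):
    # Central finite-difference parametric sensitivity, grad_theta f(x, theta).
    grad = np.zeros_like(theta)
    for p in range(theta.size):
        step = np.zeros_like(theta)
        step[p] = h
        grad[p] = (f(x, theta + step) - f(x, theta - step)) / (2.0 * h)
    return grad

def fim_single(x, theta, sigma=0.1):
    # I_i(x, theta) = grad f . Psi . grad f^T (equation 2.14); for a normal
    # mean with known variance, the elemental FIM Psi is simply 1 / sigma^2.
    g = sensitivity(x, theta)
    return np.outer(g, g) / sigma ** 2

theta_bar = np.array([10.0, 0.8])
support, replicates = [0.5, 1.5, 3.0], [4, 4, 4]

# Total FIM as the replicate-weighted sum (equation 2.8), then the
# asymptotic covariance approximation via its inverse (equation 2.9).
fim_total = sum(b * fim_single(x, theta_bar) for x, b in zip(support, replicates))
cov_approx = np.linalg.inv(fim_total)
```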

2.2.5 Scalar Objectives for Optimizing Designs

The Fisher information matrix captures approximate information about the expected distribution of the MLE estimate given a specified design. However, other than in the single parameter case, the Fisher information is matrix-valued, which makes it unsuitable for direct use as an optimization objective. In order to construct an optimization problem for selecting an optimal design, we need a scalar function, Ψ(.), of the FIM that can be maximized:
\[
\max_{D}\; \Psi\!\left(\mathcal{I}_{Tot}(D, \bar{\theta})\right). \tag{2.16}
\]
The most common scalar objective in practice is the determinant of the Fisher information matrix [15, 59]. This is equivalent to minimizing the determinant of the expected covariance matrix of the parameter estimates. Optimizing a design subject to this objective is known as finding a D-optimal design [15]. The determinant of a covariance matrix is referred to as the generalized variance of the underlying random vector, and as such the D-optimal design minimizes the generalized variance of the MLE estimate [15]. The D-optimal design is commonly used because it has a number of useful properties, including that it is simple to compute, it can be formulated as a convex objective, and D-optimal designs are invariant under re-parameterization [15, 59]. As such, D-optimal designs are the main designs studied

in this work. Other than D-optimal designs, there exist a number of other scalar design criteria; these are sometimes referred to as the alphabetic optimality criteria because they are generally named using single letters; for example, A-optimal, c-optimal, I-optimal etc. [15].
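For a small numerical illustration of two such criteria applied to the same (made-up) total FIM:

```python
import numpy as np

fim = np.array([[8.0, 1.0],
                [1.0, 2.0]])  # an illustrative 2x2 total Fisher information

d_criterion = np.linalg.slogdet(fim)[1]     # maximized by a D-optimal design
a_criterion = np.trace(np.linalg.inv(fim))  # minimized by an A-optimal design
                                            # (sum of asymptotic variances)
```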

To find an optimal design, D, we seek to find the optimal support points xj ∈ X and optimal replicate counts βi,j ∈ B. Using a scalar objective, the overall optimization problem can be defined as
\[
\begin{aligned}
\max_{D}\;& \Psi\!\left(\mathcal{I}_{Tot}(D, \bar{\theta})\right), \\
\mathcal{I}_{Tot}(D, \bar{\theta}) &= \sum_{j=1}^{N} \sum_{i=1}^{M} \beta_{i,j}\, \mathcal{I}_i(x_j, \bar{\theta}), \\
D &= \{\mathcal{X}, \mathcal{B}\}.
\end{aligned} \tag{2.17}
\]

This optimization is difficult for a number of reasons. It is non-linear in the support points, xj. It is also an integer programming problem because of the integer restrictions on βi,j. Also, the optimal number of support points, N, is not generally known. These computational difficulties are already layered on top of those required to compute the FIM, as the FIM also requires computing sensitivities, which can be difficult for certain models. A common approach for reducing the computational difficulty of optimal design is to relax the integer constraint such that a continuous weighting, ξi,j, over the input-observation pairs is optimized [15]. In this relaxed formulation, each ξi,j replaces the corresponding βi,j; however, the ξi,j take non-negative real values which are constrained so that \(0 < \xi_{i,j} < 1\) and \(\sum_{i=1}^{M} \sum_{j=1}^{N} \xi_{i,j} = 1\). For a relaxed design the replicate weights, ξi,j, are grouped in the weight set such that ξi,j ∈ Z. A relaxed design, DR, is then defined by the support points and weight set such that DR = {X, Z}. The relaxed optimization problem can then be written as
\[
\begin{aligned}
\max_{D_R}\;& \Psi\!\left(\mathcal{I}_{Tot}(D_R, \bar{\theta})\right), \\
\mathcal{I}_{Tot}(D_R, \bar{\theta}) &= \sum_{j=1}^{N} \sum_{i=1}^{M} \xi_{i,j}\, \mathcal{I}_i(x_j, \bar{\theta}), \\
D_R &= \{\mathcal{X}, \mathcal{Z}\}.
\end{aligned} \tag{2.18}
\]
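The relaxed problem can be illustrated by fixing a candidate grid of support points and optimizing the weights over the probability simplex. The sketch below is illustrative only (a generic constrained optimizer applied to the D-optimality log-determinant objective for the running single-observation-variable example); dedicated convex-optimization solvers are more robust in practice.

```python
import numpy as np
from scipy.optimize import minimize

def fim_single(x, theta, sigma=0.1, h=1e-6):
    # FIM of one normal observation with mean theta0 * exp(-theta1 * x) and
    # known variance sigma^2, via finite-difference sensitivities.
    f = lambda th: th[0] * np.exp(-th[1] * x)
    grad = np.array([(f(theta + e) - f(theta - e)) / (2.0 * h)
                     for e in (np.array([h, 0.0]), np.array([0.0, h]))])
    return np.outer(grad, grad) / sigma ** 2

theta_bar = np.array([10.0, 0.8])
candidates = np.linspace(0.1, 5.0, 25)  # fixed grid of candidate supports
fims = [fim_single(x, theta_bar) for x in candidates]

def neg_log_det(xi):
    # D-optimality on the weight-averaged total FIM (equation 2.18), negated.
    total = sum(w * m for w, m in zip(xi, fims))
    sign, logdet = np.linalg.slogdet(total)
    return -logdet if sign > 0 else 1e12  # penalize singular information

n = len(candidates)
result = minimize(
    neg_log_det,
    x0=np.full(n, 1.0 / n),
    method="SLSQP",
    bounds=[(0.0, 1.0)] * n,
    constraints=[{"type": "eq", "fun": lambda xi: np.sum(xi) - 1.0}],
)
xi_opt = result.x  # relaxed weights; mass typically concentrates on few points
```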

An exact design, D, specified using integer replicate allocations, βi,j, can be converted to a relaxed design, DR, by dividing the replicate integers by the total sample size: \(\beta_{i,j}/N_{Tot} = \xi_{i,j}\). The total FIMs, \(\mathcal{I}_{Tot}\), of the two design representations are therefore also related by

the sample size so that \(\mathcal{I}_{Tot}(D, \bar{\theta})/N_{Tot} = \mathcal{I}_{Tot}(D_R, \bar{\theta})\). Given these relations, the relaxed formulation is effectively normalized so that it does not depend on the overall sample size, \(N_{Tot}\). To convert a relaxed design, DR, to an implementable exact design, D, with integer replicate allocations, a sample size is selected and a rounding procedure is then applied to generate the βi,j from the weights ξi,j. Simple rounding or specialized apportionment methods can be used for this conversion [60, 61, 62]; a simple rounding scheme is sketched at the end of this section. Due to the computational advantages, the relaxed form of the design problem is used for optimization throughout this thesis. After solving the full design optimization problem, or the relaxed approximation to it, the researcher will obtain an optimal design with a specified list of support points and replicate allocations. This design will be optimal in a mathematical sense, but only conditionally for the given model structure, nominal parameter values, observation distributions and experimental constraints the researcher has specified. It is important to remember that optimal designs are generated under these strong assumptions. This means that other sources of uncertainty about model structure and alternative sources of variability are not necessarily hedged against in the optimized design. The optimal design can therefore serve as an idealized guide for a given system, however the design should be combined with good experimental judgment when implementing the prescribed measurements in practice.
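As an illustration of the simplest such conversion (naive largest-remainder rounding, not one of the specialized apportionment methods of [60, 61, 62]):

```python
import numpy as np

def round_design(xi, n_total):
    # Convert relaxed weights xi into integer replicate counts beta that sum
    # to n_total: floor first, then hand out the shortfall to the weights
    # with the largest fractional remainders.
    beta = np.floor(xi * n_total).astype(int)
    remainder = xi * n_total - beta
    shortfall = n_total - int(beta.sum())
    for idx in np.argsort(remainder)[::-1][:shortfall]:
        beta[idx] += 1
    return beta

beta = round_design(xi=np.array([0.62, 0.25, 0.13]), n_total=20)
# -> array([12, 5, 3]); the counts sum to the requested 20 samples
```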

Chapter 3

Applications of OED in Systems Biology

Published as “New opportunities for optimal design of dynamic experiments in systems and synthetic biology” by Nathan Braniff, and Brian Ingalls in Current Opinion in Systems Biology 9 (2018): 42-48

3.0.1 Summary

Recently developed dynamic experimental techniques offer new opportunities for the use of model-based experimental design in the construction and refinement of predictive models of cellular behaviour. Specifically, novel optogenetic and microfluidic tools have been made accessible by the distribution of low-cost, automated hardware designs that rely on readily available components and inexpensive construction processes. Experimental design methods can be applied to these platforms to identify time-varying input signals that generate maximally informative system responses. We review these developments and illustrate how the convergence of these approaches facilitates the construction of accurate biological models of both natural and engineered cellular systems.

3.0.2 Introduction

Quantitative characterization of the dynamic (time-varying) behavior of cellular systems is central to systems and synthetic biology. In systems biology, mathematical models are

used to generate testable hypotheses and to provide insight into the behavior of natural systems. In synthetic biology, predictive models are increasingly needed to guide design, in the tradition of more established engineering disciplines. Unfortunately, biological models are often poorly constrained. This is, in part, due to a lack of appropriate data (e.g. time-series), the collection of which has traditionally required specialized commercial equipment or laborious protocols. We review a maturing set of accessible techniques for precise dynamic perturbation and observation of cellular systems, and the low-cost, automated equipment that enables their implementation. These techniques can generate a wide range of dynamic excitations not previously achievable, and thus raise the question: what excitation profiles should be applied? Model-based design of experiment (MBDOE) tools can be used to answer this question. These tools provide a framework in which researchers can formalize their experimental goals and identify time-varying stimuli to accomplish them. We begin this brief review with a survey of recent innovations in experimental devices and protocols that facilitate the generation of broad classes of dynamic stimuli. We then provide an overview of model-based experimental design methods and review examples demonstrating their practical use for dynamic experimental design. Together these advances offer an effective work-flow, shown in Figure 3.1, for the design and implementation of dynamic biological experiments.

3.0.3 Time-varying control of cellular systems

Traditional experimental protocols allow for only a limited selection of temporal perturbation patterns (e.g. steps, sometimes pulses). In contrast, optical inputs and microfluidic delivery enable a wide range of time-varying stimuli. In addition, automated culture systems (turbidostats, chemostats) provide precise environmental control, uncoupling dynamic cellular responses from exogenous effects. For a recent review of developments in this area, see [63]. Below, we highlight works that exemplify the flexibility and precision of these dynamic techniques. We then survey a range of community-developed, automated and accessible hardware for efficient implementation of these experimental approaches.

Experimental tools for dynamic stimuli The range of available light-induction systems is growing rapidly [64, 65]. Optical stimulus tools have become increasingly popular for precision dynamic perturbation experiments [66]. Toettcher et al. [67] used optically-regulated protein-protein interactions to control the translocation dynamics of signaling proteins. They used automated microscopy measurements and light delivery to implement

Figure 3.1: Design and implementation of dynamic biological experiments. Tools for model-based experimental design (left): Systems biologists often seek to improve a model by more accurately constraining its parameters, its predictions, or its structure. Model-based design of experiment (MBDOE) methods identify time-varying perturbation stimuli to achieve these goals via a range of optimization criteria. Tools for dynamic biological experiments (right): Temporal perturbation profiles can be realized with a range of emerging biological tools. Low-cost and open-source hardware systems can implement these dynamic experiments efficiently, resulting in informative experiments and improved model accuracy.

predefined translocation dynamics, to control population heterogeneity, and to implement robust clamping of targeted intracellular species. Milias-Argeitis et al. [68] demonstrated similar results using an optogenetic system and a computerized feedback circuit to control gene expression in real time. This in silico control set-up achieved robust, accurate tracking of reference concentration profiles and was used to mediate growth rates via feedback control of an essential metabolic enzyme [69]. In related work, Melendez et al. employed optogenetic induction in a chemostat system with microfluidic sampling and microscopy to demonstrate precise control of protein expression in a steady state microbial culture [70]. Several groups have made use of the CcaS/CcaR system for optogenetic control in bacteria. This light-responsive cyanobacterial two-component system was cloned into Escherichia coli by Tabor et al. [71]. Olson et al. have since developed the CcaS/CcaR system into a precise tool for gene expression control using parallelized delivery of light to an array of culture tubes, enabling more complex experiments and improved dynamic characterization [72]. They determined the response of the CcaS/CcaR system to a range of light stimulus profiles, and further demonstrated precise and predictive model-based control of the system in validation experiments. In complementary work, Davidson et al. characterized the response of the CcaS/CcaR optogenetic system to pulse width modulations [73]. More recently, Chait et al. [74] combined CcaS/CcaR optogenetic induction with microfluidic control of media contents and single-cell microscopy to achieve closed-loop control of single-cell and population-wide gene expression patterns. Similar optical control strategies have been employed to reveal naturally occurring signaling network behaviour, via a photoactivatable adenylate cyclase stimulating the PKA pathway in yeast [75] and phytochrome-B-PIF activation of Ras/Erk in mammalian cells [76], providing insights into network structure and dynamic response. Microfluidic approaches allow precise temporal manipulation of media contents [63]. Uhlendorf et al. employed a microfluidic chamber with real-time microscopy to implement a computer-in-the-loop controller of gene expression, resulting in tight regulation of variance in protein abundance [77]. A similar apparatus, using a media input system driven by differential hydrostatic pressure, was implemented by Menolascina et al. [78] and used by Fiore et al. [79] to compare the performance of multiple control strategies. Lugagne et al. [80] used microfluidic delivery of chemical inducers and monitoring of single-cell fluorescence to precisely maintain a genetic toggle switch at an unstable equilibrium using closed-loop control of gene expression. Microfluidic control has also been used to decode the behaviour of natural systems, e.g. Msn2 transcription factor signaling dynamics in the Saccharomyces cerevisiae stress response [81, 82]. Together, these recent projects demonstrate robust, accurate performance of closed-loop

and model-based control schemes when implemented through optical or precise chemical inputs. This computational control of cellular systems, termed 'cybergenetics' [83], can serve as a valuable foundation for the development of maximally informative time-varying perturbation experiments. While chemical and optical induction are the most established methods for dynamic control, novel approaches are an area of active research. Tschirhart et al. pioneered an 'electro-genetic' system via a redox-sensitive transcription factor. The system was used to control both cell motility and cell signalling in response to electrical signals [84]. The use of more exotic induction systems, such as acoustic and magnetic actuation, has also been explored [85]. This expanding collection of methods could enable an even wider use of dynamic experimental techniques.

Accessible automated experimental hardware Dynamic experiments require time-varying adjustment of experimental conditions and are thus more labor intensive than traditional dose-response approaches. Efficient protocols can be achieved through the use of automated equipment, which enables precise implementation of temporal perturbation and observation profiles. In recent years these automated tools have become increasingly accessible, due to novel technologies such as 3D printing and modular programmable controllers. Continuous culture systems are attractive tools for dynamic experimentation because they allow for extended experimental runs under consistent, tunable conditions. Toprak et al. [86] provided designs for a flexible user-constructed continuous culture 'morbidostat'. This device, built for the study of evolutionary response to antibiotic treatment, can serve as a chemostat (to maintain constant flow-through rate) or as a turbidostat (to maintain constant growth rate). The complementary design provided by Takahashi et al. [87] describes a low-cost, extensible turbidostat, utilizing 3D-printed parts. A similar design, presented by Matteau et al. [88], can operate as either a turbidostat or a chemostat, and is supported by a graphical user interface (GUI)-enabled custom software package. The optogenetic work of Olson et al. was implemented through a light tube array (LTA), a custom system enabling parallel delivery of time-varying light signals at varied frequencies to 64 liquid cultures [72]. Follow-up designs for a variation of this system, the light plate apparatus, offer a highly flexible tool for optogenetic experiments [89]. The designs for this tool and its control software are open-source. Wang et al. combine ideas from both the LTA and continuous culture systems in a custom-built bioreactor capable of time-varying optogenetic stimulation [90]. In addition to light activation, this system employs photosensor readouts of fluorescent protein activity. Customized microfluidics is becoming increasingly accessible [91, 92, 93]. The Metafluidics

repository [94] contains a wide variety of custom microfluidic designs, and could enable widespread dissemination of tools necessary for dynamic experiments. Open-source software for automating microscopy data collection allows observations to be collected and analysed efficiently [95, 96]. Microfluidics also offers support for single-cell 'omics' data collection, which, when combined with precise dynamic perturbations, could provide improved dynamic characterization of cellular behaviour across the genetic landscape [97, 98].

3.0.4 Model-based design of experiments: dynamic stimuli

Techniques for optimal experimental design were pioneered in the 1950s, notably by Elfving, Kiefer and Box [36, 99, 20]. Much of this work focused on linear factorial models commonly used in statistics [100]. Later work, particularly in the field of systems identification, extended these experimental design techniques to linear dynamic models [44]. These tools are not directly applicable to the nonlinear, (pseudo-)mechanistic models typically employed in systems biology. Such models are better served by model-based design of experiments (MBDOE) [21]. These design approaches, built largely on earlier work in the chemical and bioprocess engineering literature, employ hypothesized models to identify optimal experiments. The results of these experiments then inform model structure and parameterization in an iterative process. In particular, MBDOE methods have been promoted as a means to resolve problematic identifiability issues in systems biology modelling. Through a series of papers, Sethna and colleagues characterized 'sloppiness' in parameter estimation as a common feature of systems biology models (e.g. [101], see also [102]). Sloppiness here is defined by wide variation in the confidence of parameter estimates for a given model. (Formally, sloppiness can be thought of as a large ratio between the largest and smallest eigenvalues of the parameter covariance matrix or the Fisher information matrix.) This influential work suggested that sloppiness is perhaps unavoidable, and thus cast doubt on the promise of achieving parametrically accurate models in systems biology. In response, the research community demonstrated the role of experimental design in ameliorating sloppy parameter fits [103, 8] and articulated the important distinction between sloppiness and identifiability [104]. (Sloppiness is concerned with the relative uncertainty of various parameters, whereas identifiability is concerned with bounding the absolute uncertainty of all parameters of interest.) This more recent work reveals that sloppiness in parameter fits does not necessarily pose challenges for identifiability, especially when optimal experiments have been identified. Methods for model-based experimental design can be classified according to:

i. Design space - experimental design decisions for which the method provides insight. Examples: selection of targets for external control, choice of time-varying perturbations to be implemented, selection of species or outputs to be measured, and scheduling of sampling times.

ii. Goal - model-based optimality objective. Examples: accuracy of parameter estimation, decidability of model structure, or variance of model predictions.

iii. Optimization approach - how the optimality objectives are measured and evaluated. A range of formulations have been employed, derived from both frequentist and Bayesian statistics, as well as systems identification and control theory.

Several reviews of model-based experimental design have appeared recently: Franceschini and Macchietto [21] provide a comprehensive review of MBDOE methods with clear exposition of the main theoretical points; Kreutz and Timmer [105] enumerate the many aspects of an experiment that can be optimized, and discuss both model discrimination and optimal parameterization approaches; Chakrabarty et al. [22] provide a comprehensive survey of the field, describing both Fisher information and Bayesian paradigms; Ryan et al. [106] provide a focused review of Bayesian methods. (Fisher information, a measure based on local parametric sensitivities, is very commonly used in this context. Bayesian approaches rely on fewer assumptions, but are more computationally demanding.) The temporal excitation profiles generated by model-based experimental design techniques yield observations that are optimally useful toward a given modelling goal. To date, most designs have been restricted to step-wise inputs, though arbitrary input profiles have been considered (e.g. control vector parameterizations [107]). Most early treatments of model-based design of dynamic experiments in biology used Fisher information-based optimality criteria with the goal of accurate parameter estimation [108, 109]. (See [110, 111] for workflows.) More recently, Ruess et al. extended these tools to a stochastic setting to address single-cell experiments [112]. These Fisher information-based techniques rely on local analysis centered at a nominal point in parameter space. They are thus of limited use when applied at poorly determined parameter estimates. Global experimental design approaches, including Bayesian techniques (e.g. [113]) and sparse-grid approximations (e.g. [114]), can overcome this limitation, but are typically computationally demanding and can be challenging to implement. Although Fisher information-based techniques have focused primarily on improving model parameterization, experimental design of dynamic stimuli has also been used for model discrimination. The methods of [47] and [115] produce input signals that maximize

the distance between model outputs (defined in terms of the L2 norm and the Kullback-Leibler divergence, respectively). A complementary approach, presented by Apgar et al., involves constructing model-based controllers for candidate models. Model discrimination is then based on the effectiveness of the resulting control inputs. Robust tools for model discrimination are provided in [116, 117]. Tools like the light plate apparatus [89] and automated microfluidics [95, 96] are heralding an era of high-throughput dynamic perturbation experiments. With this equipment, researchers will be able to probe multiple components of a system in parallel or to exhaustively examine the effect of environmental context on the dynamic behavior of cellular systems. Some model-based experimental design techniques have been proposed to address such experiments, e.g. design methods suited to experiments run in parallel ([118, 119]) or in conditions that allow real-time (i.e. online) redesign of experimental inputs ([120, 121]). Further development is needed, in both experimental design methods and experimental techniques, to realize optimal high-throughput experimental approaches.

3.0.5 Optimal dynamic experiments in practice

As methods for dynamic perturbation of cellular systems have caught up to MBDOE theory, examples of their combined utility have appeared. These join a longer history of successful applications of model-based experimental design in bioprocess engineering (e.g. [122, 123, 124]) where temporally controlled inputs are more standard. An early example of model-based experimental design in the context of cellular systems biology is the work of Bandara et al. [9], who used optimal design of time-varying chemical inputs to optimize fluorescence microscopy experiments for a model of PIP3 signaling in fibroblast cells. Optimization allowed them to realize a 60-fold reduction in the mean variance of parameter estimates over non-optimized experiments. They further note that the optimal experiments improved identifiability by reducing correlation between parameter estimates. More recently, Ruess et al. [10] applied optimal Fisher information-based design techniques using a stochastic model to generate optimized inputs for an optogenetic signaling system in yeast. This was coupled with a Bayesian inference algorithm, used to update the stochastic model parameters. This iterative procedure out-performed randomly chosen experimental designs. The predictive power of the resulting model was illustrated by design of novel control inputs that accurately generated prescribed gene expression patterns.

3.0.6 Future Directions

Implementations of cybergenetic in silico control of cellular systems are in their infancy. They are poised to have significant impact on our ability to precisely characterize and manipulate cellular activity. Continued improvement of experimental tools will expand the range of time scales, stimulus targets and cell types for dynamic perturbations [84, 64, 65]. Dissemination of low-cost, open-source designs for the associated hardware can foster widespread adoption [87, 89]. One barrier to generalizing the resulting data and models is the traditional approach of collecting data in relative (i.e. arbitrary) units. The development of standardized unit-full measures [125, 126, 127] will be essential for the community to fully realize the collaborative potential of these new modelling tools. As for experimental design, a current barrier to adoption is simply the wide variety of model-based approaches available in the literature, many of which seem equally well-suited to any given task. Comparison studies, such as [128, 129], especially those that would compare experimental implementations, can serve as valuable 'consumer guides', and may lead to community consensus as to which tools are most effective in which contexts. To complement the expected increase in application of model-based experimental design, continued theoretical work is needed to further address limitations and constraints commonly faced in cell biology, such as model mismatch [130, 131]. Following the precedents set by [9] and [10], further applications of experimental design tools in time-varying perturbation experiments will increase our understanding of dynamic behavior in natural and engineered biological systems. In synthetic biology, tools for automated and data-driven rational system design have begun to emerge [12, 132]. Optimized experiments hold promise for extension of these ideas to complex dynamic cellular behaviours.

Chapter 4

Optimal Experimental Design for a Physiologically-aware Model of Gene Expression

Published as “Component Characterization in a Growth-Dependent Physiolog- ical Context: Optimal Experimental Design” by Nathan Braniff, Matt Scott, and Brian Ingalls in Processes 7.1 (2019): 52.

4.1 Summary

Synthetic biology design challenges have driven the use of mathematical models to characterize genetic components and to explore complex design spaces. Traditional approaches to characterization have largely ignored the effect of strain and growth conditions on the dynamics of synthetic genetic circuits, and have thus confounded intrinsic features of the circuit components with cell-level context effects. We present a model that distinguishes an activated gene's intrinsic kinetics from its physiological context. We then demonstrate an optimal experimental design approach to identify dynamic induction experiments for efficient estimation of the component's intrinsic parameters. Maximally informative experiments are chosen by formulating the design as an optimal control problem; direct multiple-shooting is used to identify the optimum. Our numerical results suggest that the intrinsic parameters of a genetic component can be more accurately estimated using optimal experimental designs, and that the choice of growth rates, sampling schedule and input

profile each play an important role. The proposed approach to coupled component-host modelling can support gene circuit design across a range of physiological conditions.

4.2 Introduction

Proposed applications of synthetic biology demand complex synthetic constructs involving dynamic internal regulation. Novel analytic and experimental approaches will be needed to efficiently navigate the corresponding design space [133, 134]. Model-based design approaches promise to (partially) replace costly experiments with computer simulations. A range of theoretical approaches to automated or computer-assisted design have been published [135, 136, 137, 138, 139, 140, 141, 142]. When supported by reliable characterization of genetic regulatory components, automated design algorithms can greatly increase the efficiency of design. As an example, Nielsen et al. demonstrated efficient automated design of very large genetic logic circuits from a carefully designed and characterized regulatory library [12]. Systematic characterization of genetic components can include generation of standardized data sheets [143], use of standardized relative units [144], and predictive characterization of circuit dynamics [145]. This type of characterization is demanding and resource-intensive; to date it has rarely been accomplished. The resulting knowledge deficit is a major bottleneck to wider use of model-based design. A drawback of standard approaches to characterization is that they fail to distinguish effects due to the host's physiological state from the intrinsic properties of the construct [146]. Here we define intrinsic properties to be those biochemical properties specific to a genetic construct's sequence. The physiological state incorporates, among other features, (i) the DNA quantity and gene copy number, (ii) available RNA polymerases (RNAPs) and ribosomes, and (iii) the cell volume, all of which impact gene expression [147, 2]. Calibration of model parameters without accounting for the cell's physiology results in aggregate parameters that describe lumped effects from both host and component. Such a model cannot be trusted to extrapolate beyond the physiological state in which it was calibrated. Separating host state and component behaviour is a difficult and multivariate problem. Experimentalists cannot directly perturb or even measure many physiological properties. The cell's physiological state can be modulated indirectly by external or internal perturbations including modulation of (i) nutrient sources [2, 148], (ii) antibiotics [148], (iii) gene expression burden [148], (iv) metabolic fluxes [149, 150], as well as temperature, pH and osmolarity. The aggregate effect of each of these perturbations influences growth rate, but predicting host properties from the perturbation or the growth rate is not generally possible. However, for nutrient-limited growth, it has been shown that the exponential growth

rate acts as a summary statistic for the physiological state of an E. coli host cell [2, 147]. Klumpp and Hwa demonstrated how nutrient-limited growth rates can then be used as an aggregate predictor of the physiological effects on gene expression [147]. In [147], Klumpp and Hwa stop short of proposing an explicit dynamic model for coupling physiology to gene expression. They focus primarily on steady-state behaviour. More recent works have developed coupled models of gene expression, host physiology and growth rates [151, 152, 153]. Here, we focus specifically on the use of a coarse-grained model to empirically predict physiological properties from an observed exponential growth rate. Our model distinguishes intrinsic parameters of the genetic construct from extrinsic parameters of the host's physiological state. This distinction allows for extrapolation across physiological conditions and for reuse of the estimated component parameters. We have aimed to keep the parameter set small to maximize both identifiability and interpretability. Noise and nonlinearity make precise estimation of the parameters that characterize biomolecular systems a challenging task [101, 102]. Optimal experimental design (OED) tools offer a means to improve the efficiency of data collection for model calibration [8, 103]. Despite its potential to increase experimental efficiency and the precision of parameter estimates, OED has not seen widespread use in laboratory experiments within systems or synthetic biology. Two notable exceptions are reported by Bandara et al. and Ruess et al.; both groups implemented optimal experimental design for efficient calibration of dynamic biological models [9, 10]. OED techniques are especially promising when coupled with optogenetic or microfluidic techniques, which allow for a broad range of dynamic perturbations in vivo [154]. Here, we employ optimal experimental design algorithms originally demonstrated for chemical and bioprocess engineering applications [155, 156, 157] for the characterization of a genetic component from simulated data. We develop our physiologically-aware model of gene expression for an exponentially growing E. coli population, because this is the only host system for which the necessary data has been collected. We use nutrient quality in exponential phase as a predictable controller of the cell's growth rate and relevant physiological state. We use the model to optimally design dynamic induction experiments across a set of growth rates to simulate efficient estimation of the intrinsic parameters of the genetic component. We adopt an optimal design approach that expresses the experimental design as an optimal control problem, which can be efficiently solved using numerical optimal control methods, such as multiple shooting. We specifically focus on multiple shooting because this method provides efficient solutions for the complex experimental control problems addressed in this work, while also being tractable and easily implemented for the large number of states required in optimizing experimental design objectives. We treat the induction profile, growth rate, and sampling schedule as experimental controls. Their selection is simultaneously optimized

over multiple sub-experiments. This simultaneous optimization of sampling schedule and multiple experimental perturbations, including growth rates, is an improvement over previously published accounts of OED in synthetic biology (although see [107]). Previously published analyses have either assumed constant sampling rates or have chosen sampling schedules in a secondary optimization step [10, 158], both of which are sub-optimal [159]. We use numerical simulations to demonstrate that the optimal experiments outperform intuitively designed experiments: the optimal designs improve both parameter estimation accuracy and out-of-sample prediction accuracy. We further demonstrate that the use of multiple growth rates is important for model identifiability over realistic parameter ranges.

Figure 4.1: (A) Growth media nutrient quality dictates the growth rate of an E. coli culture. The observed growth rate can be used to predict many physiological parameters of the host, which in turn influence gene expression. (B) By accounting for physiological parameters, we can optimally estimate intrinsic parameters of a genetic construct, which reflect properties of its sequence. These intrinsic parameters can be reused across growth conditions and can be used to guide changes to the construct sequence.

4.3 Materials and Methods

In this work we design optimal experiments for calibrating a dynamic model of expression of a genomically-integrated gene, induced by an activating transcription factor (TF). For simplicity, we assume that the controlled induction input is the activating transcription factor copy number, u. (More realistically, the controlled input would be some signal that influences the TF abundance.) Our population-averaged model incorporates both gene-specific intrinsic parameters and parameters that characterize aspects of the host physiology. The physiological parameters of the model are dependent on the steady-state exponential growth rate, λ, controlled via nutrient quality. The empirically observed relations and rationale for the connection between a cell's physiological parameters and its growth rate have been discussed in past works [147]. The model describes mRNA copy number, X_rna, and protein copy number, X_prot:

d/dt (X_rna/V) = α (g/V) · [ (P_a/(ηG)) K_r + (P_a K_rt/(ηG)²) u(t) ] / [ 1 + (P_a/(ηG)) K_r + (K_t/(ηG)) u(t) + (P_a K_rt/(ηG)²) u(t) ]  −  δξ (X_rna/V)

d/dt (X_prot/V) = β (R_f/V)/(K_M + R_f/V) · (X_rna/V)  −  λ (X_prot/V)     (4.1)

The gene's intrinsic characteristics are captured by the intrinsic parameters α, K_r, K_t, K_rt, δ, β and K_M, while expression also depends on the physiological parameters V, g, P_a, G, R_f, and λ, which are growth rate dependent, and η and ξ, which are fixed. The intrinsic parameters characterize each gene's induction behaviour. In contrast, the physiological parameters reflect the state of the host cell (Figure 4.1). These parameters are summarized in Table 4.1, along with nominal values and relevant ranges. Justification for these parameter values is provided in the next section.

Table 4.1: Symbols, nominal values, and feasible ranges of the intrinsic parameters and physiological properties used in the model.

Intrinsic Transcription Parameters
Parameter | Label | Nominal Value | Feasible Range
Promoter Escape Rate | α | 20 min^-1 | [1 − 30]
RNAP-Promoter Binding | K_r | 40 | [10 − 40]
TF-Promoter Binding | K_t | 5 × 10^5 | [2 × 10^3 − 1 × 10^6]
TF-RNAP Interaction | K_rt | 1.09 × 10^9 | [4.02 × 10^5 − 5.93 × 10^10]

Intrinsic mRNA Decay Parameters
mRNA Decay Rate | δ | 2.57 × 10^-4 µm^3 min^-1 | [7.7 × 10^-5 − 7.7 × 10^-4]

Intrinsic Translation Parameters
Max. Initiation Rate | β | 4.0 min^-1 | [1 − 10]
Half-saturating Constant | K_M | 750 µm^-3 | [750 − 1500]

Physiological Property | Label | Value at µ = 0.6 db/hr | Value at µ = 3 db/hr

Physiological Properties of Transcription
Gene Copy Number | g | 1.4 | 5.7
Available RNAP | P_a | 1000 | 4000
Genome-lengths of DNA | G | 1.3 | 4.3

Physiological Properties of mRNA Decay
RNase Concentration | ξ | 900 µm^-3 | 900 µm^-3

Physiological Properties of Translation
Free Ribosomes | R_f | 600 | 7000

General Physiological Properties
Cell Volume | V | 0.4 µm^3 | 2.24 µm^3
Growth Rate | λ | 0.7 × 10^-2 min^-1 | 3.5 × 10^-2 min^-1

4.3.1 Derivation of the Physiological Gene Expression Model

We develop the model for the case of an exponentially growing E. coli population, where growth rate is controlled by nutrient limitation. While some features of the model may

generalize to other cell types, it is only in the case of E. coli that sufficient data has been collected to provide reasonable estimates of functional relations and parameter values. We employ two standard measures of growth rate. The doubling rate µ is the inverse of the doubling time τ_µ. The exponential growth rate λ = ln(2)/τ_µ is used to describe population growth as P_0 e^{λt}, and is thus suitable to use as the dilution rate in a differential equation model. We treat the exponential growth rate, λ, as an independent variable determined by experimental conditions. We also assume the growth rate is constant throughout each experiment, which implies that no nutrient shifts occur.

Cell Volume and Mass, DNA Content and Protein Mass

Prior work in bacterial physiology has established expressions for the physiological parameters V, G, and g in terms of λ. (While each of the parameters is expected to undergo random fluctuations and to vary with the cell cycle, these expressions approximate average values over time.) Cell volume has been shown to scale with growth rate exponentially [160, 161]:

V = V_0 e^{(C+D)λ}.     (4.2)

Here the constant V_0 is the 'initiation volume', measured to be V_0 = 0.28 µm^3 by Si et al. [160]. The parameters C and D represent the time periods required to replicate the chromosome and to septate, respectively. We take C = 40 min and D = 20 min [161]. An expression for the average number of genome equivalent lengths of DNA, G, is given in terms of C, D, and λ by Cooper and Helmstetter [161]:

G = (1/(λC)) (e^{(C+D)λ} − e^{Dλ}).     (4.3)

Individual loci vary in copy number based both on growth rate and on their location relative to the origin of replication. (At faster growth rates there are more copies of genes near the origin because the cell must have multiple rounds of DNA replication underway.) We use the constant l_ori to indicate the gene's position relative to the origin: l_ori = 0 at the origin; l_ori = 1 at the terminus. Bremer and Churchward [162] provide a simple derivation for the average gene copy number, g, of a specific locus as

g = e^{((C+D) − l_ori C)λ}.     (4.4)
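These three scalings are straightforward to evaluate numerically. The sketch below (Python; not part of the thesis implementation, which used CasADi) evaluates equations (4.2)-(4.4). The locus position l_ori = 0.25 is a hypothetical value, chosen here only because it roughly reproduces the copy numbers in Table 4.1.

```python
# A minimal sketch of the growth-rate scalings in Eqs. (4.2)-(4.4).
# Function names are illustrative, not from the thesis code.
import numpy as np

C, D = 40.0, 20.0          # replication and septation periods (min)
V0 = 0.28                  # initiation volume (um^3), Si et al.
L_ORI = 0.25               # relative locus position (hypothetical value)

def cell_volume(lam):
    """Average cell volume V(lambda), Eq. (4.2); lam in 1/min."""
    return V0 * np.exp((C + D) * lam)

def genome_equivalents(lam):
    """Average genome-equivalent lengths of DNA G(lambda), Eq. (4.3)."""
    return (np.exp((C + D) * lam) - np.exp(D * lam)) / (lam * C)

def gene_copy_number(lam, l_ori=L_ORI):
    """Average copy number g(lambda) of a specific locus, Eq. (4.4)."""
    return np.exp(((C + D) - l_ori * C) * lam)

# A doubling rate mu (db/hr) converts to lambda = mu * ln(2) / 60 (1/min).
for mu in (0.6, 3.0):
    lam = mu * np.log(2) / 60.0
    print(mu, cell_volume(lam), genome_equivalents(lam), gene_copy_number(lam))
```

At µ = 0.6 and 3 db/hr this recovers approximately V = 0.4 and 2.24 µm^3 and G = 1.3 and 4.3, matching Table 4.1.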

Total RNA polymerase (RNAP)

We first note that the buoyant density of E. coli has been observed as constant across growth rates: Kubitschek et al. found ρ = 1.09 pg µm^-3 [163] while Basan et al. report ρ = 0.215 pg µm^-3 [164] (average of reported values). We use the dry weight data from [2] with Si et al.'s description of volume [160] to select an intermediate density of ρ = 0.55 pg µm^-3.¹ The average cell mass for a given growth rate is then:

M_Tot = ρV = ρV_0 e^{(C+D)λ}.     (4.5)

This total cell mass M_Tot can be partitioned into fractions of protein and other constituents. We fit the protein fraction of the mass, denoted Φ_pr, to data from [2] with a linear function (supplemental Figure S1, Appendix A):

Φ_pr = κ_pr λ + Φ_pr0     (4.6)

with κ_pr = −6.47 min and Φ_pr0 = 0.65.

The total RNAP fraction of the overall protein mass, which we denote Φ_p, exhibits an approximately linear dependence on growth rate, which we fit to data from [2] (supplemental Figure S2, Appendix A) as

Φ_p = κ_p λ + Φ_p0,     (4.7)

with κ_p = 0.30 min and Φ_p0 = 0.0074. We can then express the total mass of all RNAP protein, M_RNAP, as

M_RNAP = (κ_p λ + Φ_p0) (κ_pr λ + Φ_pr0) ρV_0 e^{(C+D)λ},     (4.8)

where the first factor is the RNAP percentage of protein, the last factor is the cell mass, and the product of the last two factors is the protein mass.

The total number of RNAPs per cell can be determined by dividing M_RNAP by the protein mass per RNAP core enzyme, which we denote m_rnap and is estimated as 6.3 × 10^-7 pg [165].

¹ The compromise value selected in this work does not take into account the effect that the mass of water has on the two alternative density measurements that are cited. This means the reader should treat the proposed value as a rough approximation subject to error.

Available RNAP

Each RNAP can be classified by its state: freely diffusing; weakly DNA bound at a non-specific site; actively transcribing other genes; paused or non-functioning during transcription; or immature [5, 166, 167]. Below, we describe promoter binding in terms of a thermodynamic model that takes into account the observed rapid equilibrium between non-specifically bound and freely diffusing RNAPs [5]. We therefore define the combined pool of free and non-specifically bound RNAPs as the available RNAP pool, with molecular population size P_a. Denoting the fraction of RNAP in this available pool by Φ_a, we estimate, from Bakshi et al. [5] and Stracy et al. [3] (details in the supplement, Section 1.3, Appendix A):

Φ_a = κ_a λ + Φ_a0     (4.9)

with κ_a = −9.3 min and Φ_a0 = 0.59. Using this relation we have an expression for the available RNAP:

P_a = (κ_a λ + Φ_a0)(κ_p λ + Φ_p0)(κ_pr λ + Φ_pr0) (ρV_0/m_rnap) e^{(C+D)λ}.     (4.10)
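As a quick numerical check on equation (4.10), the sketch below composes the fitted linear fractions; the constants are those quoted in this section and the function name is illustrative. At growth rates corresponding to 0.6 and 3 db/hr it recovers roughly the 1000 and 4000 available RNAPs listed in Table 4.1.

```python
# A sketch of the available-RNAP relation, Eq. (4.10).
import numpy as np

C, D, V0 = 40.0, 20.0, 0.28         # min, min, um^3
RHO = 0.55                          # cell density (pg/um^3)
M_RNAP = 6.3e-7                     # protein mass per RNAP core (pg)
KAPPA_PR, PHI_PR0 = -6.47, 0.65     # protein fraction of mass, Eq. (4.6)
KAPPA_P,  PHI_P0  = 0.30, 0.0074    # RNAP fraction of protein, Eq. (4.7)
KAPPA_A,  PHI_A0  = -9.3, 0.59      # available fraction of RNAP, Eq. (4.9)

def available_rnap(lam):
    """Available RNAP copies per cell, P_a(lambda); lam in 1/min."""
    phi_a  = KAPPA_A * lam + PHI_A0
    phi_p  = KAPPA_P * lam + PHI_P0
    phi_pr = KAPPA_PR * lam + PHI_PR0
    return phi_a * phi_p * phi_pr * (RHO * V0 / M_RNAP) * np.exp((C + D) * lam)

print(available_rnap(0.007), available_rnap(0.035))   # approx 1e3 and 4e3
```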

Transcription Rate

We describe the initiation of transcription via a thermodynamic equilibrium model of promoter occupancy involving available RNAPs, transcription factors (TFs), promoter copies, and non-specific binding sites along the genomic DNA [168, 169]. At a given growth rate, we assume there are P_a available RNAP copies and T_a active transcription factor copies diffusing along the genomic DNA, and that the DNA contains N_s non-specific binding sites to which these DNA-binding proteins may weakly attach, and g copies of the regulated promoter of interest. Further, we assume that N_s ≫ P_a, T_a, g and that each binding of an RNAP or a TF to a non-specific site or a promoter is characterized by an associated binding energy: ε_rn and ε_rg for RNAP binding to the non-specific sites and promoters, respectively, and ε_tn and ε_tg for transcription factor binding to non-specific sites and promoters, respectively (all ε are negative [170]). We use these species and site counts to enumerate the possible arrangements of RNAP and TF across the genome, and we use the binding energies to derive Boltzmann weights for each arrangement [168]. We denote the differences between the energy involved in binding the promoter and the background non-specific binding as Δε_t = ε_tg − ε_tn and Δε_r = ε_rg − ε_rn. (Note that ε_tn > ε_tg and ε_rn > ε_rg, so that Δε_t and Δε_r are both negative [170].) We denote

the Boltzmann weights as K_r = e^{−Δε_r/(k_B T)}, K_t = e^{−Δε_t/(k_B T)} and K_rt = e^{−(Δε_r + Δε_t + ε_rt)/(k_B T)} (with k_B Boltzmann's constant and T the temperature in kelvin), where ε_rt is the binding energy between RNA polymerase and transcription factor when both are bound to the same promoter. Then, the equilibrium probability of a single promoter being occupied by an RNAP is (further details in the supplement, Section 1.4, Appendix A):

p_bound = [ (P_a/N_s) K_r + (P_a T_a/N_s²) K_rt ] / [ 1 + (P_a/N_s) K_r + (T_a/N_s) K_t + (P_a T_a/N_s²) K_rt ].     (4.11)

With g promoter copies, and presuming that open complex formation and promoter escape occur at a fixed rate α [171], we have the initiation rate as

Initiation rate = αg · [ (P_a/N_s) K_r + (P_a T_a/N_s²) K_rt ] / [ 1 + (P_a/N_s) K_r + (T_a/N_s) K_t + (P_a T_a/N_s²) K_rt ]     (4.12)
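The sketch below evaluates the initiation rate (4.12) at the nominal Table 4.1 values; the TF copy number T_a = 100 is an arbitrary illustrative input, not a value from the text.

```python
# A sketch of the thermodynamic initiation rate, Eq. (4.12), at the
# nominal parameter values from Table 4.1.
ALPHA, KR, KT, KRT = 20.0, 40.0, 5e5, 1.09e9
ETA = 5e6                            # non-specific sites per genome equivalent

def initiation_rate(Pa, Ta, G, g):
    """Transcription initiation rate (copies/min) for g promoter copies."""
    Ns = ETA * G                     # total non-specific binding sites
    act = (Pa / Ns) * KR + (Pa * Ta / Ns**2) * KRT
    tot = 1.0 + (Pa / Ns) * KR + (Ta / Ns) * KT + (Pa * Ta / Ns**2) * KRT
    return ALPHA * g * act / tot

# Fast growth (Table 4.1): Pa = 4000, G = 4.3, g = 5.7; Ta = 100 TF copies.
print(initiation_rate(4000.0, 100.0, 4.3, 5.7))
```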

We next establish estimates of the parameter values for transcription. To begin, multiplying G by the genomic density of non-specific binding sites, η = 5 × 10^6 sites/genome, yields an estimate of the total number of non-specific binding sites N_s as a function of growth rate [168]. From Heyduk et al., we have the rate of open complex formation α for the phage lambda P_R promoter as 19.2 min^-1 [171]. Assuming P_R exemplifies a strong promoter, we estimate a reasonable range, based on analysis of constitutive promoters and mutants of the P_R sequence, as 1 − 30 min^-1 [172, 173, 174, 171].

The constants K_r and K_t can be interpreted as ratios of dissociation constants for the DNA-binding species (RNAP and TF) binding to non-specific DNA versus the promoter sequence [168]. This provides a convenient method for constraining their feasible values from reported dissociation rates. Expressing them as such yields K_r = K_rnap^ns / K_rnap^prom, K_t = K_tf^ns / K_tf^prom, and K_rt = K_r K_t exp(−ε_rt/(k_B T)). Here K_rnap^ns and K_tf^ns are dissociation constants of the RNAP and TF with respect to non-specific DNA binding, and K_rnap^prom and K_tf^prom are dissociation constants for promoter binding. The non-specific dissociation constant for RNAP, K_rnap^ns, has been observed to be approximately 10000 nM [168]. The promoter-specific dissociation constant, K_rnap^prom, varies from promoter to promoter. For lacP1 it is approximately 550 nM and for T7 it is approximately 3 nM [168]. To represent a relatively weak constitutive leak for an inducible promoter, we expect values nearer to the lacP1 dissociation constant would be reasonable, and so presume K_rnap^prom could range from 250 nM to 1000 nM. Using the above value for K_rnap^ns and the range for K_rnap^prom we estimate

a feasible range for K_r to be between 10 and 40 (unitless). We set K_rnap^prom to be 250 nM and therefore our nominal value for K_r is 40, suggesting a slightly stronger leak than found in uninduced lacP1. (Over the growth rates we consider, we found K_r to be practically unidentifiable in simulation studies for various optimized designs, with high variability in its estimated value. We therefore fixed it to the nominal value for our experimental design and fitting.)

To estimate K_t: Stormo suggests a reasonable range for the promoter-specific dissociation constant, K_tf^prom, of 0.01 nM to 1000 nM [175]. We assume we are working with a relatively strong activator and that K_tf^prom lies in the range 0.01 nM to 5 nM. Stormo also suggests that non-specific binding dissociation constants are between three and six orders of magnitude greater than the specific binding constants. For simplicity, we follow Bintu et al. and let K_tf^ns = 10000 nM [168]. This yields a range for K_t between 2000 and 10^6. We chose K_tf^prom = 0.02 nM and K_t = 5 × 10^5 as nominal values.

As noted, the parameter K_rt can be expressed as K_rt = K_r K_t exp(−ε_rt/(k_B T)), with ε_rt representing the energy involved in the RNAP-TF interaction at the promoter. Few estimates exist for such values. Bintu et al. use demonstrative values of ε_rt/(k_B T) ranging from −3.5 to −4.5. We allow a wider range, from −3 to −5. This allows K_rt to range from 4.02 × 10^5 to 1.61 × 10^10. We take a nominal value of K_rt = 1.09 × 10^9.

mRNA Degradation

The data in [2] indicate that the mRNA decay rate is linear in mRNA copy number, with a growth rate-independent rate constant. In [147], the authors hypothesize that this is due to the maintenance of a constant concentration of RNase E, the primary ribonuclease involved in mRNA decay initiation in E. coli [176]. (RNase E exhibits auto-regulation which appears to keep its concentration constant [177, 178, 179]. Moreover, RNase E appears to be expressed in excess, resulting in insensitivity to small changes in its abundance [180, 179].) Under this assumption, and using mass action, we have a simple model for mRNA decay as follows:

mRNA Decay Rate (copy # per min) = δ (E/V) X_rna     (4.13)

Here E is the copy number of RNase E. Assuming E/V is constant across growth rates (and therefore E ∝ V), we can express the mRNA degradation rate as

mRNA Decay Rate (copy # per min) = δξ X_rna     (4.14)

Here ξ is the constant concentration of relevant degrading enzymes in the host. The parameter δ represents the susceptibility of the mRNA transcript to RNase degradation, and acts as a mass action constant. The half-life of mRNAs can range from 1 − 10 min; Chen et al. suggest the mean mRNA half-life is near 2.5 min [181, 182, 183]. Based on a constant concentration of RNase E of ξ = 900 µm^-3 [184] and taking a nominal half-life of 3 min, we suggest δ ≈ 2.57 × 10^-4 µm^3 min^-1.
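This nominal value is a one-line computation: the first-order decay constant for a 3 min half-life is ln(2)/3 min^-1, which must equal δξ. A quick check:

```python
# A quick check of the nominal delta: with a 3 min half-life and
# xi = 900 per um^3, the decay constant ln(2)/3 must equal delta * xi.
import math

xi = 900.0                     # RNase E concentration (um^-3)
t_half = 3.0                   # nominal mRNA half-life (min)
delta = math.log(2) / t_half / xi
print(delta)                   # approx 2.57e-4 um^3/min, as quoted
```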

Total and Free Ribosome Populations

A linear relation for the fraction, Φr, of protein mass that is composed of ribosomal protein was derived in [148]:

Φ_r = κ_r λ + Φ_r0     (4.15)

Fitting this model to data from Bremer [2] yields estimates of κ_r = 5.48 min and Φ_r0 = 0.030 (see supplemental Figure S4, Appendix A). The total ribosomal protein mass M_Rib is then

M_Rib = (κ_r λ + Φ_r0)(κ_pr λ + Φ_pr0) ρV_0 e^{(C+D)λ}.     (4.16)

Each ribosome has a mass of 2.7 MDa, of which 35% is protein [185]. The individual ribosomal protein mass is therefore m_rib = 1.57 × 10^-6 pg. We then have the number of ribosomes per cell, R_Tot, as

R_Tot = (κ_r λ + Φ_r0)(κ_pr λ + Φ_pr0) (ρV_0/m_rib) e^{(C+D)λ}.     (4.17)

Based on results in Dai et al. [186], we have assumed that about 10% of the ribosomes are inactive. Assuming that these are free, we denote this fraction as Φ_f = 0.1 and have

R_f = Φ_f R_Tot = Φ_f (κ_r λ + Φ_r0)(κ_pr λ + Φ_pr0) (ρV_0/m_rib) e^{(C+D)λ}.     (4.18)

Translation Rate

We employ the Michaelis-Menten model of translation initiation proposed by Borkowski et al. [187]. In terms of the free ribosome concentration, R_f/V:

Translation Rate (copy # per min) = β (R_f/V)/(K_M + R_f/V) X_rna     (4.19)

In this model, the mRNA's ribosome binding site (RBS) is characterized by two constants: β, the maximal translation initiation rate per mRNA, and K_M, a half-saturating constant specific to the given RBS. A detailed justification for this model is provided in supplemental Section 1.6, Appendix A. To estimate β, we note that translation initiation rates on the order of β = 4 min^-1 have been observed for lacA [188]. Additional studies of RBS activity suggest a wide variation in expression levels induced by RBS alterations, ranging over orders of magnitude [189, 190]. We presume that a reasonable range for the maximal translation initiation rate is from 1 min^-1 to 10 min^-1.

Borkowski et al. [187] report values of K_M only in arbitrary units. They observe a range of saturation levels from near-linear behaviour to near-constant saturation [187], from which we can propose a range of K_M values based on the observed free ribosome concentration, which in our model is between 1000 µm^-3 and 3000 µm^-3. We let K_M range between 750 µm^-3 and 1500 µm^-3, with a nominal value of K_M = 750 µm^-3, which is approaching saturation (and hence the near-constant translation efficiency observed by Klumpp and Bremer [147, 2]).
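Assembling these pieces, a minimal sketch of the right-hand side of model (4.1) in concentration form is given below. The physiological quantities are fixed at their fast-growth Table 4.1 values, the constant input of 500 TF copies is illustrative, and none of the names are taken from the thesis implementation.

```python
# A sketch of the right-hand side of model (4.1) in concentration form.
import numpy as np
from scipy.integrate import odeint

def model_rhs(x, t, u, phys, theta):
    """x = [mRNA conc., protein conc.]; u(t) = TF copy-number input."""
    m, p = x
    V, g, Pa, G, Rf, lam, eta, xi = phys
    alpha, Kr, Kt, Krt, delta, beta, KM = theta
    Ta = u(t)
    Ns = eta * G
    act = (Pa / Ns) * Kr + (Pa * Ta / Ns**2) * Krt
    tot = 1.0 + (Pa / Ns) * Kr + (Ta / Ns) * Kt + (Pa * Ta / Ns**2) * Krt
    dm = alpha * (g / V) * act / tot - delta * xi * m
    dp = beta * (Rf / V) / (KM + Rf / V) * m - lam * p
    return np.array([dm, dp])

# Fast growth (Table 4.1) and nominal intrinsic parameters; constant input.
phys = (2.24, 5.7, 4000.0, 4.3, 7000.0, 0.035, 5e6, 900.0)
theta = (20.0, 40.0, 5e5, 1.09e9, 2.57e-4, 4.0, 750.0)
sol = odeint(model_rhs, [0.0, 0.0], np.linspace(0.0, 600.0, 49),
             args=(lambda t: 500.0, phys, theta))
```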

4.3.2 Optimal Experimental Design

To illustrate calibration of the intrinsic parameters of the model, we consider a set of dynamic induction experiments conducted over multiple growth rates, where the experimenter can select (i) the (constant) growth rate, (ii) the time-varying induction profile, and (iii) the sampling schedule. We define an experiment as a set of N = 3 sub-experiments, each with a constant growth rate λ (possibly repeated), organized into the vector λ = [λ^(1), λ^(2), λ^(3)]. (The exponential growth rates, λ^(i), are reported as doubling rates µ^(i) (db/hr) in the results for interpretability.) In each sub-experiment, the population begins at rest and responds to a dynamic induction signal in the form of a time-varying transcription factor copy number, u(t) = [u^(1)(t), u^(2)(t), u^(3)(t)]. (Such an input u^(i)(t) could be implemented by, e.g., a calibrated induction system [72] or closed-loop control of fluorescently tagged TFs [191].) For computational efficiency, we have restricted each u^(i)(t) to be piecewise constant, with 6 constant control values delivered for 100 min each, for a total duration of t_f = 600 min. We have constrained the maximum value of each u^(i)(t) to u_max = 1000 and the minimum to u_min = 0. This wide range was selected to ensure it spans the un-saturated range of the promoter well in excess. For each sub-experiment we also select sampling schedules for both mRNA and protein species. A single sampling schedule for a specific species is defined as a list of points, s_(species) = {p_1, ..., p_l, ..., p_L}. Each p_l = (τ_l, c_l) consists of a time

τ_l ∈ [0, 600] and a positive integer count c_l of the number of samples taken at that time. We have restricted the design such that c_l ∈ {1, 2, 3}. The total number of samples for each species, c_Tot, is constrained to be no more than 12. The sampling schedule for each sub-experiment, S^(i), consists of a schedule for each species: S^(i) = {s^(i)_(rna), s^(i)_(prot)}. These are collected into an overall schedule: S = {S^(1), S^(2), S^(3)}.

Optimization over the sampling schedules as defined above involves integer programming, which poses significant computational challenges. We avoid these by employing a relaxation approach [157, 192, 193]. We treat the sampling schedule as a continuous sampling density such that s^(i)_(species) corresponds to a density w^(i)_(species)(t). Each sub-experiment has sampling density W^(i)(t) = {w^(i)_(rna)(t), w^(i)_(prot)(t)} and the overall sampling schedule is W(t) = [W^(1), W^(2), W^(3)]. These sampling densities are restricted to be piecewise constant over 48 equal intervals (each of length 12.5 min) during each sub-experiment. Each sampling density w^(i)_(species)(t) is upper bounded by w_max = 3/12.5, and the integral of the sampling distribution, ŵ^(i)_(species), is bounded by the maximal number of samples, c_Tot:

ŵ^(i)_(species) = ∫_0^{t_f} w^(i)_(species)(t) dt ≤ c_Tot     (4.20)

The relaxed schedule must be discretized to arrive at an implementable sampling schedule [157]. In practice, the optimal densities are often bang-bang [193] and so can be easily discretized by the sampling interval length (12.5 min) to recover integer sample counts. To account for non-integer values, we applied the Sum-Up-Rounding strategy, a common heuristic in integer programming, to recover integer solutions that approximate the optimal discrete schedule [192]. This discretization strategy effectively rounds the continuous density while respecting the sampling constraints [192]. After generating the sampling counts, c_l, from the sampling densities, we fixed their associated times τ_l to the centre of each of the 12.5 min intervals, as in the sketch below.
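A minimal sketch of Sum-Up-Rounding for a piecewise-constant sampling density follows; the particular density values are illustrative, and the helper name is hypothetical. The rounded counts track the running continuous total, never deviating from it by more than half a sample.

```python
# A sketch of Sum-Up-Rounding [192] applied to a piecewise-constant
# sampling density: integer counts are chosen so the rounded running
# total stays close to the continuous running total.
import numpy as np

def sum_up_rounding(density, dt):
    """density: per-interval density values; dt: interval length (min)."""
    counts, cont, disc = [], 0.0, 0
    for w in density:
        cont += w * dt                        # continuous mass so far
        c = int(np.floor(cont - disc + 0.5))  # keep |cont - disc| <= 0.5
        counts.append(c)
        disc += c
    return np.array(counts)

# 48 intervals of 12.5 min; density bounded by w_max = 3/12.5 = 0.24.
w = np.zeros(48)
w[:3], w[-3:] = 0.12, 0.20
c = sum_up_rounding(w, 12.5)
print(c.sum(), np.nonzero(c)[0])              # 12 samples total
```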

To characterize a genetic construct with the induction experiments as defined above, we seek an experimental design that can accurately estimate the intrinsic parameter set θ. As stated in Section 4.3.1, the parameter K_r proved to be practically unidentifiable under the model specifications described above. Consequently, we fixed it to its nominal value listed in Table 4.1, and took the intrinsic parameter set to be estimated as θ = [α β K_t K_rt δ]. These parameters characterize the regulated promoter, its regulating transcription factor and the downstream gene sequence, independent of growth context. We denote an estimate of the true parameters as θ̂. For our objective, we seek to minimize the determinant of the covariance matrix, det(cov(θ̂)), in what is known as a D-optimal design [22]. The determinant of the covariance matrix is also known as the generalized variance [194]; minimizing it is equivalent to minimizing the volume of the confidence ellipsoid defined by the covariance matrix [105]. Because the model is nonlinear and the true parameter vector is, by definition, unknown, it is not possible to estimate the parameter covariance matrix a priori. Instead of estimating cov(θ̂) directly, we use the Fisher information computed at an initial guess θ_o as a proxy. In linear models (or in the limit of large sample sizes or low signal-to-noise ratio) the inverse of the FIM is asymptotically equivalent to cov(θ̂) [195]:

I(θ, t_f)^{-1} ≈ cov(θ̂)     (4.21)

In this case minimizing det(cov(θ̂)) is equivalent to maximizing the determinant of the Fisher information matrix, det(I). In the non-linear, finite-sample regime in which our experiments are conducted, this relation is admittedly tenuous; the use of an initial guess θ_o also introduces potential errors. However, the process of iterating the local approximation and successively updating θ_o after each experiment has been numerically demonstrated to yield convergence to the true parameter set in some cases [103, 8]. We thus define our objective as Θ_D(I) = det(I), which is generally known as the D-optimality score [105]. (Other options, such as A-, E-, and E-modified optimality, minimize other properties of the covariance [105, 22, 21].)

To determine the optimal set of experiments, we follow an optimal control-based procedure described in [157, 156, 155]. (This approach to OED was demonstrated by application to chemical and bioprocess models. It has seen limited use in systems biology [9] and has not previously been applied to component characterization in a synthetic biology context.) In this approach the control variables include aspects of the experimental design such as the induction profile, the sampling times and the growth rate. To formulate a control problem we must state the objective, the D-optimality score Θ_D(I) = det(I), in terms of the model dynamics. We restate the model in the following generic form:

dX/dt = F(X, θ, λ^(i), u^(i)(t))     (4.22)

Here X = [X_rna, X_prot] and F is the right-hand side of the model expressed in equations (4.1). We determine the local sensitivities of X_rna and X_prot with respect to each parameter θ_j by solving the following system of sensitivity equations:

d/dt (∂X/∂θ_j) = ∂F/∂θ_j + (∂F/∂X)(∂X/∂θ_j)     (4.23)

Because the parameters are dimensioned, their scalings can vary widely, leading to poor conditioning of the Fisher information matrix and associated computations [155]. To rectify

this, we scale the sensitivities by the parameter values (i.e., we use logarithmic sensitivities):

X̄_θj = ∂X/∂log(θ_j) = θ_j (∂X/∂θ_j)     (4.24)

Applying this change of variables yields

dX̄_θj/dt = θ_j (∂F/∂θ_j) + (∂F/∂X) X̄_θj     (4.25)

X(t = 0) = X_SS(θ, λ, u = 0)     (4.26)
X̄_θj(t = 0) = X̄_θj,SS(θ, λ, u = 0)     (4.27)

Because the steady state depends on both the growth rate and the initialized parameter values, these initial conditions are not constants. They therefore appear as nonlinear constraints in the optimization problem.

We denote the sampling variance for observations of a given species by σ²_species. We assume normally distributed errors with variances equal to 5% of the species quantity:

σ²_rna = (0.05) X_rna     (4.28)
σ²_prot = (0.05) X_prot     (4.29)

We further assume no covariance between species measurements. The (scaled) Fisher information matrix, I(θ, t), can then be written for the relaxed problem as a differential equation, as per [157]:

dI_jk(θ, t)/dt = X̄_θj Ω(t) Σ^{-1}(t) X̄_θk,   I_jk(θ, 0) = 0     (4.30)

where the matrices Ω(t) and Σ(t) are the diagonal matrices

Ω(t) = diag( w_n^(rna)(t), w_n^(prot)(t) ),   Σ(t) = diag( σ²_rna X_rna(t), σ²_prot X_prot(t) ).     (4.31)

Here the w_n^(species)(t) are the sampling densities, as defined above. Further details on the integration of the Fisher information over continuous sampling densities can be found in [157, 193]. Because samples are assumed to be independent across time points as well as species, the

Fisher information for the experiment is additive across both. (Here we use the Fisher information for a constant error variance, despite the assumption of non-constant variance implicit in the data-generating model. We found this simpler formulation performed equivalently and was more tractable.) Therefore we can express the cumulative Fisher information for a given sub-experiment from time 0 to t_f as

I^(i)_jk(θ, t_f) = ∫_0^{t_f} X̄^(i)_θj Ω^(i)(t) Σ^{-1}(t) X̄^(i)_θk dt.     (4.32)
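For intuition, the equivalent discrete computation on a time grid is sketched below (a rectangle-rule quadrature of equation (4.32)); the array shapes and names are illustrative. In the actual optimization this integral is instead carried along as an ODE state, as in equation (4.30).

```python
# A sketch of accumulating the Fisher information of Eq. (4.32) by
# quadrature over pre-computed scaled sensitivities on a time grid.
# sens has shape (n_times, n_species, n_params).
import numpy as np

def fisher_information(sens, w_rna, w_prot, var_rna, var_prot, dt):
    """Accumulate I_jk over time for one sub-experiment."""
    n_t, _, n_p = sens.shape
    fim = np.zeros((n_p, n_p))
    for k in range(n_t):
        omega = np.diag([w_rna[k], w_prot[k]])          # sampling densities
        sigma_inv = np.diag([1.0 / var_rna[k], 1.0 / var_prot[k]])
        S = sens[k]                                     # (2, n_p) at time t_k
        fim += S.T @ omega @ sigma_inv @ S * dt         # rectangle rule
    return fim
```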

The Fisher information for the total experiment, I_Tot, is then the sum over all sub-experiments:

I_Tot = ∑_{i=1}^{N} I^(i)(θ, t_f)     (4.33)

The overall objective (D-optimality score) is then

Θ_D(I_Tot) = det( ∑_{i=1}^{N} I^(i) )     (4.34)

The matrix is symmetric, so only the diagonal and lower-triangular elements need to be computed.

The value of Θ_D(I_Tot) can vary over many orders of magnitude for different experiments, and so to improve numerical accuracy we take the logarithm of the determinant as the objective. Finally, because most optimization packages are minimizers, we invert the sign:

min_{u, λ, W}  −ln(Θ_D(I_Tot))     (4.35)

For numerical stability we compute the determinant using QR factorization (details in Section 2 of the supplement, Appendix A).
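A sketch of the QR-based log-determinant follows; since the FIM is symmetric positive semi-definite its determinant is non-negative, so summing ln|R_ii| over the diagonal of the QR factor recovers ln det(I) stably.

```python
# A sketch of a numerically stable log-determinant via QR factorization:
# |det(A)| equals the product of |R_ii|, so ln|det| is a sum of logs.
import numpy as np

def logdet_qr(fim):
    _, r = np.linalg.qr(fim)
    return np.sum(np.log(np.abs(np.diag(r))))

fim = np.array([[4.0, 1.0], [1.0, 3.0]])
print(logdet_qr(fim), np.log(np.linalg.det(fim)))   # both ln(11)
```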

In summary, we formulate our OED optimal control problem as

Objective:
  min_{u, λ, W}  −ln(Θ_D(I_Tot)) = −ln( det( ∑_{i=1}^{N} I^(i)(θ, t_f) ) )

Controls:
  u = u^(1)(t), ..., u^(N)(t)
  λ = λ^(1), ..., λ^(N)
  W = {w^(1)_rna(t), w^(1)_prot(t), ..., w^(N)_rna(t), w^(N)_prot(t)}

Subject to (∀i ∈ {1, ..., N}):
  dX^(i)/dt = F(X^(i), θ, λ^(i), u^(i)(t))
  dX̄^(i)_θj/dt = θ_j ∂F(X^(i), θ, λ^(i), u^(i)(t))/∂θ_j + (∂F(X^(i), θ, λ^(i), u^(i)(t))/∂X) X̄^(i)_θj,  ∀θ_j ∈ θ
  dŵ^(i)_sp/dt = w^(i)_sp(t),  ∀sp ∈ {rna, prot}
  dI^(i)_jk/dt = X̄^(i)_θj Ω^(i)(t) (Σ^(i)(t))^{-1} X̄^(i)_θk,  ∀θ_j, θ_k ∈ θ     (4.36)

With constraints:
  X^(i)(t = 0) = X_SS(θ, λ^(i), u = 0)
  X̄^(i)_θj(t = 0) = X̄^(i)_θj,SS(θ, λ^(i), u = 0),  ∀θ_j ∈ θ
  I^(i)_jk(θ, 0) = 0,  ∀θ_j, θ_k ∈ θ
  ŵ^(i)_sp(0) = 0,  sp ∈ {rna, prot}
  ŵ^(i)_sp(t_f) < c_Tot,  sp ∈ {rna, prot}
  0 < w^(i)_sp(t) < w_max,  sp ∈ {rna, prot}
  0 ≤ u^(i)(t) ≤ u_max
  λ_min < λ^(i) < λ_max

This problem formulation matches that used by Telen et al. and Hoang et al. [157, 155]. We used a multiple shooting algorithm to solve the optimal control problem [156]. We implemented this algorithm in CasADi, a rapid-prototyping optimal control toolbox, using its MATLAB interface [196].

CasADi uses algorithmic differentiation to compute first and second derivatives of both the objective and the constraints. This higher-order information is used for optimization in an interior-point barrier method implemented in the nonlinear programming package IPOPT [197]. Further details of the multiple shooting algorithm and optimization settings can be found in Section 2 of the supplement, Appendix A.
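To indicate the pattern (symbolic variables, automatic derivatives, IPOPT as the NLP solver), a toy example in CasADi's Python interface is sketched below. The objective and constraint are stand-ins; the full multiple-shooting OED transcription is considerably larger.

```python
# A minimal CasADi sketch of the CasADi + IPOPT workflow: symbolic
# decision variables, an objective differentiated automatically, and a
# constrained NLP solved by an interior-point method.
from casadi import MX, nlpsol, vertcat

w = MX.sym('w', 2)                      # e.g. two control parameters
obj = -(w[0] * w[1])                    # stand-in for -ln det(FIM)
g = vertcat(w[0] + w[1])                # stand-in for resource constraints

solver = nlpsol('solver', 'ipopt', {'x': w, 'f': obj, 'g': g})
sol = solver(x0=[0.4, 0.4], lbx=0.0, ubx=1.0, lbg=0.0, ubg=1.0)
print(sol['x'])                         # optimum at w = [0.5, 0.5]
```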

4.4 Results

4.4.1 Comparing Lumped and Physiologically-aware Models

Models of gene expression typically use lumped parameters that confound physiological effects with intrinsic features. Such models are usually calibrated against data collected from cells grown in one particular medium (and so at a single growth rate). To illustrate the consequences of neglecting growth effects, we consider a lumped version of our growth-dependent model (equation (4.1)):

d[X_rna]/dt = A (B + Cu)/(1 + B + (D + C)u) − E [X_rna]

d[X_prot]/dt = F [X_rna] − λ [X_prot],     (4.37)

where the parameters A = αg/V, B = P_a K_r/(ηG), C = P_a K_rt/(ηG)², D = K_t/(ηG), E = δξ, and F = βR_f/(K_M V + R_f) are treated as growth rate-independent constants. (The one exception is the protein decay rate, which is equal to the exponential growth rate.) Of course, this parameter lumping does not impact model behaviour for the growth rate at which the model is calibrated, but accurate extrapolation to other growth rates cannot be assured. This loss of accuracy is illustrated in Figure 4.2. Panels A and B show predictions of the full growth-dependent model and the lumped model, both calibrated at a fast growth rate (3 db/hr). As the growth rate drops, significant deviation in the predicted steady-state copy number of both protein and mRNA is observed. In Figures 4.2C and D we see similar deviation for the steady-state concentrations of these species. (We note that the relative deviations in concentration are comparatively small. Protein concentration is important for, e.g., metabolic enzymes. In contrast, copy number is more important for DNA-binding proteins, which tend to disperse along the DNA rather than the total cell volume [191, 5].) A complementary measure of inaccuracy due to lumping appears in Figure 4.2E, which shows how the lumped

parameter values vary with growth rate when the full model is used to determine their values (in comparison to their values calibrated at the reference fast growth rate of 3 db/hr). The divergence in these parameter sets has a significant effect on the model's dynamic behaviour. Figure 4.2F shows the root mean squared error (RMSE) between the two models' response to the input profile depicted in Figure 4.2G. The difference in dynamic behaviour between the lumped and full model under this dynamic induction, at a slow growth rate of 0.5 db/hr, is shown in Figure 4.2G.
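The drift in the lumped parameters can be reproduced directly from Table 4.1. The sketch below recomputes the lumped transcription parameters B, C and D of equation (4.37) at slow and fast growth; the printed ratios differ substantially from one, which is the source of the extrapolation error seen in Figure 4.2E. The helper name is illustrative.

```python
# A sketch of lumped-parameter drift: B, C, D of Eq. (4.37) recomputed
# from the physiological quantities at slow vs. fast growth (Table 4.1).
ETA = 5e6
KR, KT, KRT = 40.0, 5e5, 1.09e9

def lumped_bcd(Pa, G):
    B = Pa * KR / (ETA * G)
    C = Pa * KRT / (ETA * G) ** 2
    D = KT / (ETA * G)
    return B, C, D

slow = lumped_bcd(Pa=1000.0, G=1.3)     # mu = 0.6 db/hr
fast = lumped_bcd(Pa=4000.0, G=4.3)     # mu = 3 db/hr
print([f / s for s, f in zip(slow, fast)])  # fast-to-slow ratio per parameter
```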

4.4.2 Null and Optimal Experimental Designs

We next explore how experimental design can improve the estimation of the intrinsic parameters of the full growth-dependent model (equation (4.1)). We have selected a null experiment, shown in Table 4.2, to represent a reasonable non-optimized design. It consists of a set of logarithmically (base 10) distributed pulses, delivered at evenly spaced growth rates, with evenly spaced samples taken every half-hour (mRNA and protein samples are taken at the same time). We propose this as a sensible first experiment for fitting a dynamic model. We designed the null experiments assuming limited information on the dynamics of the specific gene expression system. They were selected to involve a reasonably comprehensive set of perturbations and measurements. For comparison, we examine three variants of the null experiment, also shown in Table 4.2, each with a perturbation to either the growth rates, induction profile, or sampling rate of the null. The growth variant is identical to the null case, except it is performed over a narrower range of growth rates. The sampling variant differs from the null by redistributing the samples so that their rate is halved but the sampling number at each point is doubled (with the same total number of samples). The induction variant uses linearly (rather than logarithmically) spaced strengths of induction pulses. Shown with each design is its predicted optimality score (the negative log determinant of the scaled Fisher information matrix computed at the true parameter set: equation 4.35). As expected, each variant provides less information than the null (as measured by D-optimality).

In contrast, Figure 4.3 depicts an optimal experiment for our model. This design was generated by using the true nominal parameter vector as the algorithm's initialization parameters. This is an artificial scenario (the true parameters are not known initially, by definition), but it serves to demonstrate the difference between intuitive designs and optimal designs selected by our method. In Figure 4.3, each column describes a sub-experiment, labeled with the corresponding optimal growth rate. The top row, (A–C), depicts the optimal input profiles; the middle row, (D–F), shows the system response in both mRNA and protein copy number. The last row, (G–I), shows the sampling densities (in the shaded regions) as well as the results of the Sum-Up-Rounding discretization scheme (depicted by the stem plots, where the height is an integer representing the sample count).

Figure 4.2: A and B. Steady-state protein (A) and mRNA (B) copy number predictions for the lumped and full model across growth rate for two different constant input levels. C and D. Corresponding protein (C) and mRNA (D) concentrations. E. Relative difference in the lumped parameter values, fit at 3 db/hr, compared with the values predicted by the full growth-dependent model. F. Root mean squared protein concentration error over the dynamic simulation in (G) between the full and lumped model. G. Predicted protein concentrations of the lumped and full model over a dynamic simulation with growth rate of 0.5 db/hr.

Table 4.2: Null (non-optimal) experimental designs. (The induction patterns and sampling schedules were shown graphically in the original table; only the growth rates and optimality scores are recoverable here.)

Experiment | Growth Rates (db/hr) | Induction Pattern (log10) | Sampling Schedule | Optimality
Null Experiment | {0.6, 1.8, 3} | (graphical) | (graphical) | −63.8
Growth Variant | {2, 2.5, 3} | (graphical) | (graphical) | −61.5
Sampling Variant | {0.6, 1.8, 3} | (graphical) | (graphical) | −62.6
Induction Variant | {0.6, 1.8, 3} | (graphical) | (graphical) | −60.5

The optimality score of this design is −69.6. (In this case, the optimal design selected the maximum growth rate twice. We suspect this is due to higher sensitivities at faster growth. More complex models typically demand observations over a wider range of growth rates.) Comparing the null and optimal designs we see that, while the null experiments have an intuitive appeal because of their wide, even distribution of experimental choices, none achieve a score similar to the optimized design. Further, while null variants exhibit differences in optimality scores, when compared to the optimized experiment these differences are small. This suggests that manually tuning aggregate design measures is less effective than the holistic optimization provided by the OED approach.

4.4.3 Utility of Optimal Designs for Parameter Identification and Prediction

The difference in optimality scores between the null and optimized experiments suggests significant improvements in parameter estimation using the optimized design. However, the statistical theory used to derive the optimality score can only be guaranteed to hold a priori for linear models in the limit of small observation variances and large samples [195]. To validate that our optimality scores correspond to improved parameter estimation accuracy, we used simulated experiments to assess the correlation between theoretical and observed

variability. To generate simulated data, we simulated the model using the nominal true parameters and added normally distributed observation error with variance equal to 5% of the corresponding species count. We used a multiple shooting approach for parameter estimation; obvious outliers were removed (details in Section 2 of the supplement, Appendix A). Fig. 4.4A shows, on the vertical axis, the logarithm of the generalized variance (determinant of the observed covariance matrix) for the collection of parameter estimates

Figure 4.3: An optimal experiment (designed at the true value of θ). Each column depicts one sub-experiment, labelled with the optimal doubling rates µ^(1), µ^(2) and µ^(3) (corresponding to exponential growth rates λ^(1), λ^(2) and λ^(3)). The top row (A–C) depicts the optimal inputs u^(1)(t), u^(2)(t) and u^(3)(t) (on a log10 scale). The middle row (D–F) shows the system response (both mRNA and protein copy number) for each induction profile. The last row shows the sampling densities w^(i)_rna and w^(i)_prot (in shaded areas, blue and red, respectively), for both protein and mRNA, as well as the rounded sampling schedules s^(i)_rna and s^(i)_prot (depicted by the stem plots). The optimality score of this experiment was −69.6.

(each corresponding to an experimental design). This measure of fit variability is computed from the collection of parameter fits to independent simulations of the given experiment. From this set of estimates we computed the observed covariance matrix and the generalized variance. The optimality score of the design is shown on the horizontal axis (lower is better), computed at the true parameter value. Fig. 4.4A shows that the optimality score (objective of the OED approach) and the observed parameter covariance (sampled measure of design 'quality') correlate well. In particular, we note that the optimal design achieves both a better optimality score and a smaller generalized parameter variance when compared to the non-optimal designs. This suggests that the objective function (using the homeostatic FIM) is a useful measure of design quality, despite the nonlinearity and heteroskedasticity of the model. We also note that adopting a uniform sampling schedule or reducing the range of growth rates is expected to result in considerably worse performance, as evidenced by comparison between null variants.

Figure 4.4: A. Results from the true optimal (optimal experiment designed at the true parameter value) and null experiments (circles): optimality score on the horizontal axis; observed generalized variance of the parameter estimate on the vertical axis. B. The same experiments: generalized parameter variance on the horizontal axis; log integral of the squared error on an out-of-sample prediction experiment (inset) on the vertical axis. Optimal experiments designed with erroneous initial parameter guesses are shown as X's. (Linear trend lines also shown.)

In addition to accuracy of parameter estimates, accurate prediction of the system behaviour under out-of-sample conditions is also important for model-based design. For linear

regression models, D-optimal designs that minimize the generalized variance of the parameter estimate are equivalent to designs that minimize the upper bound on prediction variance (General Equivalence Theorem) [40]. This guarantee does not generalize to our dynamic, nonlinear and heteroskedastic model. To verify whether prediction accuracy indeed correlates with D-optimality for our model, we ran a second simulation study. For each of the parameter estimates used in calculating the fit variability for the experimental designs, we simulated the model response for a new dynamic experiment. We chose out-of-sample conditions: growth rates and induction levels not used in fitting. We simulated the model with the true nominal parameter values, and computed the integrated squared error (ISE) between the true and fitted values in each case (thus including error at all time points). Figure 4.4B shows the generalized parameter variance along the horizontal axis and the log ISE on the out-of-sample data set on the vertical axis. The plot shows that the generalized parameter variance correlates reasonably well with prediction accuracy: the optimal design performs better than any of the null designs. This reflects the linear case, in which the D-optimal design sets an upper bound on the expected prediction variance [40].

So far, we have compared designs in the idealized (and artificial) scenario in which OED is applied to the model parametrized by its true values. We relaxed that assumption by first generating five intrinsic parameter sets from a uniform distribution spanning the feasible ranges specified in Table 4.1. We then ran the OED algorithm using models parametrized by these 'perturbed' parameter sets. The optimality scores of these designed experiments, when evaluated against simulated data generated by the true parameter set, are plotted in Figures 4.4A and 4.4B, depicted as X's.

4.5 Discussion

We have proposed a physiologically-aware model of gene expression in E. coli that accounts for the effects of nutrient limitation on the host physiology. The model is more complex than standard models of gene expression, but this comes with several benefits. The model naturally suggests a partitioning of the parameter set into (1) intrinsic parameters that characterize the genetic component and (2) a set of empirically derived parameters characterizing the host's physiological state as a function of nutrient-mediated growth rate. The intrinsic parameters can thus be reused across a range of growth conditions. In particular, because the intrinsic parameters can be linked to sequence properties of the component (e.g. promoter affinities, RBS strength, mRNA stability), they could provide insight as to which aspect of a component's DNA sequence could be altered to achieve a desired effect across a variety of contexts. Future work could attempt to link such intrinsic parameters to

sequence properties directly. In contrast, the extrinsic parameters can be used to predict the host's context based on the observed growth rate in a range of nutrient conditions.

Accurate estimation of the intrinsic parameter set is more difficult than estimation in context-naive models. We have demonstrated that optimal experimental design can mitigate this challenge by identifying maximally informative experiments. We applied a comprehensive OED platform, optimizing over constant growth rate, time-dependent induction profile, and sampling schedule. Many current experimental design approaches in systems biology use general-purpose optimization algorithms that scale poorly to such multivariate and non-linear designs [154, 22]. This often leads to non-optimal selection of certain design variables [159]. To achieve computational efficiency in our optimization tasks, we re-interpreted the problem via optimal control, with growth rates, sampling schedules and induction levels as control variables. This optimal control framing of OED problems has so far been underutilized in systems biology; it allows access to the rich tool-set developed by control and process engineers, including direct multiple-shooting and collocation methods [156, 155, 157]. These methods are ideally suited for use with emerging experimental tools such as optogenetic induction systems and automated culture and microfluidic devices [154]. Our numerical results suggest that the multiple shooting algorithm provides an efficient method to optimize experiments, improving both parameter estimation accuracy and prediction accuracy over unoptimized designs.

The optimal experimental design algorithm presented here could be improved in a number of ways. We used the simplest and most tractable form of the Fisher information. More accurate formulations account for the sample variance's parametric dependence [10] and for the constraints imposed by initial conditions [198]. Additionally, our approach provides a design that may be only locally optimal. Current iterative designs are based on the assumption that the iterated algorithm does not become trapped in some local minimum. Past numerical studies of iterative applications of OED have shown consistent convergence to the true parameter set (global optimum), but there is scope for more rigorous investigation [103, 8]. Improved computational efficiency may allow users to address multiple scenarios and larger models, or even implement online experimental design algorithms that can update the design in real-time [199, 200]. Our algorithm was implemented with CasADi, which provides rapid-prototyping capabilities for control problems, algorithmic differentiation and interfaces to powerful non-linear programming solvers like IPOPT [196, 197]. This tool facilitates rapid implementation, but requires some mathematical expertise. Further packaging of OED tools for use by experimentalists would no doubt increase adoption in systems and synthetic biology.

The experiments proposed in this work are demanding, requiring sampling of multiple species and control of inputs over extended time periods. In practice it is not currently

possible to precisely control the transcription factor count in vivo. It would suffice to map the experimental input (e.g. light, chemical inducer) to the expected average copy number of TF per cell. This could be implemented through precise calibration experiments, achievable with current techniques [72]. Future experiments may be able to implement closed-loop control using real-time measurements to ensure robust tracking of desired inputs. It also may be possible to modify the OED algorithm to account for the effects of input variability (an errors-in-variables model). In this work we have assumed time-series measurements of both mRNA and protein quantities. Emerging experimental tools, combined with assays like RNA-seq [201], will likely enable these complex dynamic experiments in the future [154].

For the model considered here, observing only one species results in a structurally unidentifiable parameter set. Any parameter lumping to alleviate unidentifiability leads to a loss of modularity in characterization. Even with observations of both species, the parameter K_r, which characterizes promoter leak, had to be excluded from the analysis because it was practically unidentifiable over reasonable parameter ranges. In practice it may be possible to ignore promoter leakiness in many cases. However, for strong leaks it is important to identify the parameter K_r; fixing it, as we have, may introduce significant bias in other parameter estimates. In addition to these specific identifiability concerns, the re-usability of parameter estimates is limited by the use of relative units, which are specific to the measurement instrument they are calibrated on. The methods used in this work should ideally be implemented using absolute units, which will allow for comparison between parameter values calibrated on different instruments and in different labs [127, 126, 202].

Our description of physiological state was restricted to a single case: exponentially growing E. coli host cells, with growth rate determined by nutrient quality. Our model formulation was made possible by significant experimental efforts into characterizing precisely this physiological response [2]. Beyond nutrient limitation, several other relevant growth perturbations are of interest to synthetic biologists, including translation-inhibiting antibiotics, gene expression burden and metabolic knockdowns. Phenomenological growth laws and coarse-grained proteome models have been extended to some of these conditions [148, 164, 149]. Recent results suggest that certain metabolic perturbations may affect proteome partitioning (and possibly RNAP and ribosomal fractions) in a manner similar to nutrient limitation [149, 150]. Certain physiological properties, such as the ribosome-to-protein ratio, can also be predictably linked to the growth rate controlled by expression burden or antibiotic dosage [164, 148]. The linear ribosome-to-protein ratios may even generalize across species (Figure S1 of [148]). The details necessary to link these coarse-grained models to the specifics of gene expression, as originally proposed in Klumpp and Hwa [147] (and further elaborated on here), are still lacking. While the nutrient-limited

growth theories have taken decades to build, modern techniques can likely accelerate this host characterization process, allowing generalization across strains, growth inhibition conditions and potentially even microbial species.

Extensions of phenomenological growth theories to heterologous protein expression burden and modification of metabolic fluxes could have significant impact on metabolic engineering. These effects are especially relevant to the emerging sub-discipline of dynamic metabolic engineering, where synthetic gene expression circuits dynamically modulate both fluxes and heterologous enzyme expression [203, 204, 205]. Design of such systems is challenging; the engineered regulatory components modulate expression burden and enzymatic activity, which in turn affect the growth rate, broader cell physiology and the regulatory component behaviour itself [206, 207, 208]. An extended growth theory combined with the optimal experimental design algorithm and gene expression models presented here could be valuable for guiding future work in this area.

We have proposed the coupling of physiologically-aware modelling and OED techniques to address limitations in the accuracy and generalizability of component characterization. Current gene expression models often fail to account for the host or environmental state, and are calibrated with poorly constrained parameter estimates using ad hoc experimental designs. These shortcomings contribute to the limited use of model-based design in synthetic biology. Poor parameter estimates result in lackluster predictive accuracy, and context-naive model predictions cannot generalize, resulting in recalibration for each new design and circumstance, and thus little advantage of model-based design over trial-and-error approaches. Our model has yet to be validated in vivo, but wider use of such coarse-grained, context-aware models and optimal experimental designs will ideally maximize researchers' return on experimental investment, and aid efforts to rationally design gene expression circuits.

Chapter 5

Optimal sample scheduling for dynamic gene expression experiments

Published as "Optimal experimental design for characterizing gene expression: sample scheduling" by Nathan Braniff, Max Reed, and Brian Ingalls in IFAC-PapersOnLine 51.19 (2018): 48-51.

5.1 Summary

Recent developments in experimental methods, such as optogenetic induction and microfluidic culture devices, enable precise time-varying control of gene expression. These tools provide new opportunities for dynamic experiments. However, the complexity of these experiments poses a challenge to traditional experimental design. In this work, we propose a method for optimal sample scheduling and use simulations to compare it with uniform sampling schedules. We show that optimal scheduling improves the informativeness of results toward parameter estimation and identifiability. In addition, we show that uniform sampling may impose obstacles for the application of experimental design methods for the selection of dynamic inputs.

5.2 Introduction

A number of techniques for precise time-varying control of gene expression have been developed in the last decade, including methods of optogenetic induction and microfluidic

delivery of chemical inducers [63]. Accompanying these novel perturbation methods have been increasingly accessible technologies for automating experiments to generate rich time-series measurements of cellular dynamics. Together these tools have enabled precise external control of cellular systems and have enabled the nascent field of 'cybergenetics' [80]. This precise dynamic control unlocks new possibilities for experimentally decoding cellular dynamics.

5.2.1 Model-based design of experiments

Model-based Design of Experiment (MBDOE) methods provide a framework for improving the accuracy of model selection, parameterization and prediction [22]. These tools have been used recently to optimize dynamic input profiles that can then be implemented via optogenetics or time-varying chemical induction [10, 9].

However, optimization problems resulting from the MBDOE approach are frequently unwieldy. Optimality criteria are typically conditioned on a wide array of factors, including model structure, nominal parameter values, measurement targets, sampling times, error model, and the nature and target of experimental perturbations. A general solution thus requires solving a hierarchy of interdependent subproblems over the variable experimental design parameters, while integrating over uncertainty in system behaviour and experimental protocols. However, both theoretical and experimental results suggest that even a sub-optimal design can provide significant improvement in the informativeness of experiments [8, 112, 10].

Given the wide array of features that enter into MBDOE analysis, simplifying assumptions must be made. A commonly employed simplification is to assume continuous observation or temporally uniform sampling schedules of the measured species [101, 9, 102, 8]. The specific subproblem of sample time selection has received relatively little attention, despite early evidence of its value [158]. Erguler et al. have demonstrated that the information content of samples from certain time intervals in a system's response, such as transient or steady-state, differs significantly from one interval to another [102]. They further show that selecting when to sample is a challenge, as it depends on the qualitative dynamics of the system. Recently, Ruess et al. incorporated sample time selection into a full MBDOE framework to select optimal dynamic perturbations in both a theoretical study and an experimental implementation [112, 10]. They first selected optimal time-varying perturbations based on a prescribed uniform sampling schedule and then optimized time points for a chosen perturbation. This approach is robust to uncertainty, but may impose some specific problems on the search for optimal inputs, as we show below.

In this work we propose an efficient method for time point selection for a given dynamic experiment. Our approach is inspired by optimal experimental design in linear statistical models [209]. We further demonstrate that neglecting time point selection when analysing the optimality of dynamic perturbation experiments can introduce spurious complications which have been largely overlooked in past experimental design work.

5.3 Methods

We are pursuing the use of MBDOE methods for characterizing the dynamics of light-induced gene expression. We chose a simple mechanistic model of regulated expression to illustrate our approach. Figure 5.1 depicts the system, in which a light-inducible promoter controls expression of a transcription factor (TF) which in turn activates a fluorescent reporter (GFP).

Figure 5.1: Regulated gene expression model: Optogenetic control of a light-inducible promoter activates a transcriptional activator which in turn activates the transcription of a fluorescent protein.

We model transcription and translation of both proteins explicitly. The transcription rate of the light-induced promoter is assumed to be directly controlled via light induction; no model of light-mediated signaling is included. This formulation is motivated by previous demonstrations of successful model-based control by [72]. We include an explicit maturation step for the fluorescent protein, the importance of which has been demonstrated [210]. The model is governed by the following equations:

$$\frac{dT_r}{dt} = u(t) - \delta_0 T_r, \quad (5.1) \qquad \frac{dG_r}{dt} = \alpha \frac{T^n}{K^n + T^n} - \delta_1 G_r, \quad (5.2)$$
$$\frac{dT}{dt} = T_r - \gamma T, \quad (5.3) \qquad \frac{dG}{dt} = \beta G_r - (\gamma + \nu) G, \quad (5.4)$$
$$\frac{dG_m}{dt} = \nu G - \gamma G_m. \quad (5.5)$$

The state variables are the concentrations of transcription factor mRNA, $T_r$, transcription factor protein, $T$, GFP mRNA, $G_r$, immature (non-fluorescing) GFP protein, $G$, and mature GFP protein, $G_m$. The input $u(t)$ is the expression rate of the light-induced transcription factor gene. Parameters were set to realistic values based on previous dynamic characterization of related systems: $\gamma = 2 \times 10^{-4}$ sec$^{-1}$, $\delta_0 = 6 \times 10^{-3}$ sec$^{-1}$, $\delta_1 = 4.8 \times 10^{-3}$ sec$^{-1}$, $\beta = 0.4$ RiPS (ribosomes per second), $\alpha = 6.7$ PoPS (polymerases per second), $K = 4.17 \times 10^5$, $n = 1.6$, and $\nu = 1.8 \times 10^{-5}$ sec$^{-1}$ [143, 210]. We assume that observations can be made of the mature GFP and both mRNA species.

5.3.1 Measuring experimental optimality

MBDOE techniques begin with specification of an optimality criterion. The quality of parameter estimates can be quantified by an experimentally obtained parameter covariance matrix Σθ. A model-derived parameter covariance matrix can therefore be used as a model- based optimality measure. In general the covariance matrix can be approximated by the inverse of the Fisher information matrix (FIM), Iθ, which in turn can be computed from the sensitivity matrix, Q, and the covariance matrix of the measurements, Σy [22]:

$$\Sigma_\theta = I_\theta^{-1} \quad \text{where} \quad I_\theta = Q^T \Sigma_y^{-1} Q \qquad (5.6)$$

The sensitivity matrix, $Q$, is defined by

$$Q_{j,i} = \frac{d y_q(\theta, u(t_k), t_k)}{d\theta_i} \qquad (5.7)$$

where the index $j$ runs over the model observables $y_q$ and the corresponding sampling times $t_k$.

For efficient comparison, a scalar measure of total uncertainty can be derived from the covariance matrix. Here, we use D-optimality, which is the determinant of the Fisher information matrix; maximizing D-optimality is equivalent to minimizing the volume of the confidence region ellipsoid in parameter space. We show that maximizing D-optimality also results in improvements of both E-optimality and modified E-optimality (ME-optimality) scores. E-optimality is defined as maximizing the smallest eigenvalue of the FIM and can be thought of as gauging the practical identifiability of the model. ME-optimality corresponds to the condition number of the FIM and can be equated with 'sloppiness' as defined by [101]. These additional measures are subject to ongoing discussion in the community as to what constitutes a well-parameterized model [104]. For our system, D-optimality correlates with all three measures listed above and it is easily posed as a convex problem (unlike E-optimality, which requires some reformulation [209]). To predict the FIM for a given experiment we need to specify the measurement covariances and nominal parameter values. To simplify the analysis, we assume each measurement is independent, with error equal to 5% of the dynamic range of the respective species.
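To make these criteria concrete, the following minimal NumPy sketch (not part of the original analysis) evaluates the FIM of equation (5.6) and the three scalar scores discussed above. The sensitivity matrix $Q$ and the variance values here are random stand-ins, not the model's actual sensitivities.

```python
import numpy as np

# Hypothetical stand-ins: a sensitivity matrix Q (one row per sample, one
# column per parameter) and independent measurement variances set from the
# 5%-of-dynamic-range assumption in the text
rng = np.random.default_rng(0)
Q = rng.standard_normal((97, 8))
sigma2 = np.full(97, 0.05**2)

FIM = Q.T @ np.diag(1.0 / sigma2) @ Q        # equation (5.6)
eigvals = np.linalg.eigvalsh(FIM)

d_opt = np.prod(eigvals)                     # D-optimality: det(FIM), maximize
e_opt = eigvals.min()                        # E-optimality: smallest eigenvalue
me_opt = eigvals.max() / eigvals.min()       # ME-optimality: condition number
```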

5.3.2 Optimal time point selection

As Erguler et al. have previously noted, the informativeness of samples that are uniformly distributed in time is unevenly distributed amongst the observations [102]. Thus experimental protocols can be improved by distributing the samples over the available times and observation variables in an optimal manner. This includes the possibility of repeatedly sampling at some time points. To arrive at a discrete optimization problem, we specify a fine mesh of potential sample times over the experimental time-frame. Then, for a given input $u(t)$ and initial condition, we can pose the D-optimal time point selection problem as a maximization problem defined over the rows of the sensitivity matrix $Q$:

$$\underset{\lambda \in \mathbb{Z}_+}{\text{maximize}} \quad \det\!\left(Q^T I_\lambda \Sigma_y^{-1} Q\right) \qquad (5.8)$$
$$\text{subject to} \quad \sum_{l} \lambda_l = N, \qquad (5.9)$$

where $I_\lambda$ is the diagonal matrix with the sample counts $\lambda_l$ on its diagonal and the total number of observations is fixed at $N$. The solution to this optimization problem is an integer-valued vector $\lambda$ that assigns the $N$ total samples among the potential time points and observation variables. Solving the integer version of this problem

is challenging, but in the limiting case where $N$ is much larger than the number of parameters, the solution of the non-integer relaxation approaches the exact solution, and the resulting non-integer vector $\hat{\lambda}$ can be rounded to yield a good approximate solution [209]. The relaxed version is a convex optimization problem because the objective can be written as a convex function of a sum of positive semi-definite matrices. Such convex problems can be solved quickly with general purpose solvers such as CVX, which we used here [211].
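The computations in this chapter used the CVX package in MATLAB; purely for illustration, a roughly analogous sketch of the relaxed problem (5.8)-(5.9) can be written in Python with the CVXPY package. The sensitivity matrix and variances below are hypothetical placeholders.

```python
import cvxpy as cp
import numpy as np

# Hypothetical problem data: one row of Q per candidate (time, observable)
# pair on the fine mesh, with matching measurement variances
rng = np.random.default_rng(0)
Q = rng.standard_normal((200, 8))
sigma2 = np.full(200, 0.05**2)
N = 97                                        # total samples to allocate

lam = cp.Variable(200, nonneg=True)           # relaxed (non-integer) counts
FIM = Q.T @ cp.diag(cp.multiply(lam, 1.0 / sigma2)) @ Q
problem = cp.Problem(cp.Maximize(cp.log_det(FIM)), [cp.sum(lam) == N])
problem.solve()

schedule = np.round(lam.value)                # rounded approximate design
```

Maximizing the log-determinant is equivalent to maximizing the determinant and keeps the problem in a form the convex solver recognizes directly.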

5.4 Results

As demonstrated below, an optimized sampling schedule achieves significant improvement in expected parameter estimation accuracy and model identifiability, as well as a decrease in sloppiness, when compared to a uniform schedule. Furthermore, we show that a uniform schedule generates problematic artifacts in optimality measures for dynamic inputs.

5.4.1 Optimal sample scheduling

We consider the response of system (5.1-5.5) to a single input pulse. We initialize the system at steady-state with respect to a constant low input of 0.05 PoPS. After 20 minutes, the transcription rate is raised to its maximal value of 7.6 PoPS for 20 min before returning to the starting value for the remainder of the experiment. The total experimental time is 8 hours. We compare two sampling schedules. A uniform sampling schedule consists of samples every 15 minutes for GFP mRNA and mature GFP. The transcription factor mRNA relaxes quickly to steady state, so we assign uniform sampling every 2 minutes for only the first hour. This results in a total of N = 97 samples taken during the uniform schedule. We solved the optimization scheme (5.8-5.9) with N = 97 to arrive at an optimal sampling schedule. Figure 5.2 shows the system response and the set of optimal sampling times. The optimal number of replicates taken at each of these time points ranges between 1 and 12. Table 5.1 shows the relative improvements provided by the optimal sampling schedule. Improvements in D-optimality, E-optimality and ME-optimality are all achieved (despite the fact that this method seeks only to improve the D-optimality criterion). The volume of the eight-dimensional parametric confidence region is reduced by over 97%. Furthermore, we determined that only 56 optimally placed samples would be required to achieve a D-optimality result equivalent to that of the 97 uniform samples.

Figure 5.2: System response to a 20 minute pulse of maximal transcription rate (applied over the shaded interval). The optimal sampling points for each species are indicated.

5.4.2 Improved criteria for selection of dynamic inputs

We next demonstrate that uniform sampling schedules can generate spurious effects that may confound optimization over dynamic inputs. Consider the case of system (5.1-5.5) responding to a square wave input rising from u = 0.76 PoPS to u = 7.60 PoPS (covering 90% of the system's dynamic range) and back. We suppose the system is observed over 24 hours. We will consider a range of pulse lengths, each with midpoint at 12 hours (Figure 5.3A). In each case we compare two sampling schedules. A uniform sampling schedule is defined by samples every 15 minutes for the GFP mRNA and the mature GFP, and every two minutes for the TF mRNA. This results in N = 915 samples. Alternatively, we solve the optimization scheme (5.8-5.9) using N = 915 to find the optimized sampling schedule. We consider a range of pulse durations from 4 hours to 20 hours. The value of the D-optimality criterion for each sampling approach is shown in Figure 5.3B. As shown, the uniform sampling strategy suggests that D-optimality is maximized at a pulse duration of 12 hours. However, this peak is absent when applying optimal sampling. This spurious effect occurs because uniform sampling seeks to balance the information provided by dwelling at the high and low input values. Optimal sampling (which allows for re-sampling at any time-point) is under no such constraint. Next, consider the case where system (5.1-5.5) is forced by a sinusoidal input with range u ∈ [0, 7.60] PoPS, delivered over a 72 hour experiment (Figure 5.4A). We will sweep over a range of frequencies. For each choice of input we again compare two sampling

approaches: (i) the uniform sampling schedule, as used in the pulse duration experiments above (resulting in N = 2739 samples); and (ii) the optimized sampling scheme with N = 2739. Figure 5.4B shows the D-optimality value for both sample schedules over a range of input frequencies. Again we see that spurious local maxima are generated by the uniform sampling approach.

Table 5.1: Improvement of Optimal Sampling Relative to Uniform Sampling

    Criteria                            Increase
    D-optimality (square-root¹)         3690%
    E-optimality (Identifiability)      272%

    Criteria                            Decrease
    ME-optimality (Sloppiness)          63.6%
    Confidence Ellipsoid Volume         97.3%
    Parameter Confidence Intervals      from 18.0% to 50.8%
    Samples Required²                   42.3%

¹ The D-optimality percentage change is reported after taking the square root because D-optimality scales with the inverse square of the confidence region volume.
² Samples required for the optimal sampling schedule to achieve a D-optimality result equivalent to that of the pre-specified uniform schedule.

5.5 Conclusion

We demonstrated the successful application of a convex optimization approach for time point selection in dynamic experiments. Further, we demonstrated that adherence to a uniform sampling schedule can produce spurious extrema when assessing the quality of dynamic inputs in experimental design. These artifacts would likely confound the already challenging optimization task of searching over a high-dimensional input space.

The analysis presented here did not account for parameter and experimental uncertainty. Nevertheless, these results suggest that non-uniform sample time distributions are preferable. Further refinement and integration of optimal sampling techniques within the larger MBDOE framework will provide improved experimental protocols and more efficient use of emerging dynamic experimental techniques.

Figure 5.3: A. Square wave input centered at 12 hours. B. Relative D-optimality criteria for optimized and uniform sampling schedules as a function of the pulse duration (normalized to their respective minima over the inputs considered).


Figure 5.4: A. Sinusoidal input. B. Relative D-optimality criteria for optimized and uniform sampling schedules as a function of the input frequency (normalized to their respective minima over the inputs considered).

Chapter 6

Optimal Experimental Design for a Bistable Gene Regulatory Network

Published as “Optimal Experimental Design for a Bistable Gene Regulatory Network” by Nathan Braniff, Addison Richards, and Brian Ingalls in IFAC- PapersOnLine 52.26 (2019): 255-261.

6.1 Summary

Accurate model calibration is essential for model-based design of synthetic gene regulatory networks. Optimal experimental design (OED) techniques can be used to efficiently decrease parameter uncertainty. However, many biological networks of interest exhibit multimodal response functions due to multistability. These models are incompatible with traditional OED approaches, which have been developed for models with mono-modal error distributions. In this work we propose an OED approach for a gene expression model that exhibits bistability via a saddle-node bifurcation with respect to an experimental input. We demonstrate construction of an approximate likelihood and derive the corresponding Fisher information across the monostable and bistable regimes. We use the linear noise approximation for the local error model and apply logistic regression to capture the switching probabilities between the stable equilibria. We then use this Fisher information matrix to generate locally optimal experimental designs for this system. This leads to a simple, qualitative approach to optimal experimental design based on experimental detection of bimodality.

6.2 Introduction

Many synthetic biology projects aim to construct designer gene regulatory circuits. These circuits often involve feedback and nonlinear dynamics, including bifurcations and multistability [212]. Model-based design can be a critical tool in the development of such systems. However, biological models have traditionally suffered from sloppy parameter estimates and poor predictive accuracy, limiting the effectiveness of modeling in biological design [102, 101]. Optimal experimental design (OED) provides tools by which synthetic biologists can improve the efficiency of precise model calibration [154, 22]. However, OED approaches have traditionally been applied to monostable systems, which exhibit monomodal error distributions [10] that are often assumed to be Gaussian [103, 8]. Many synthetic regulatory systems of interest exhibit bifurcations and multistability [212], which can result in multimodal error distributions. Limited attention has been given to these more complex error distributions in the OED literature. Examples have been developed for linear models [213], but they are not directly applicable for modelling biological systems. In this work we develop an OED procedure for characterizing expression of an inducible auto-activating gene that exhibits bistability over a subset of its induction range. We use a stochastic model to account for random jumping between the two equilibria in the multistable regime. This jumping results in a bimodal error model. We use the linear noise approximation (LNA) to form a Gaussian mixture approximating the multimodal likelihood function. We then apply OED to this approximation, using the Fisher information to identify optimal experiments. Specifically, we identify (i) an optimal set of induction input values, and (ii) the fraction of the overall number of steady-state single-cell observations to be taken at each input. The LNA incurs approximation error when applied to multistable systems [214]. However, the LNA is analytically differentiable with respect to both inputs and parameters, and it is computationally tractable, both of which are essential for performing optimization over the experimental space. Past work has investigated parameter estimation procedures for multistable switches [215], suggesting the tracing of hysteresis loops as a means to detect bistable regions. This procedure requires complex experiments with time-varying inputs and careful consideration of the relaxation timescales as part of the experimental design. Our approach uses a more direct strategy for the identification of bistability, focusing exclusively on steady state observations and bimodality detection. Additionally, we provide a simple procedure for avoiding irregularities caused by bifurcations. We find that, for our example system, the optimal experiments follow a consistent qualitative pattern. Further investigation will address the question of whether this pattern generalizes to a simple heuristic procedure for designing optimal experiments for other bistable systems.


6.3 Methods

We employ a simple one-dimensional model to illustrate the method. Equation 6.1 depicts the deterministic rate equation for the mean concentration of the protein product x of an auto-activating gene:

$$\frac{d}{dt}x(t) = \alpha_0 + \alpha \frac{(u + x(t))^n}{K^n + (u + x(t))^n} - x(t). \qquad (6.1)$$

Here, $u$ is the concentration of protein provided via a separate, experimentally inducible source. We assume $u$ and $x$ are functionally identical, but $x$ can be differentiated during measurement, e.g. via a tag or co-expressed reporter. We take the nominal parameter set as $\alpha_0 = 0.5$, $\alpha = 3$, $K^n = 9$, and $n = 3$. The parameters $\alpha_0$, $\alpha$, and $K$ are in arbitrary units of concentration; the time-scale is chosen so the rate of degradation is unity. For this parameterization the deterministic system exhibits bistability for inputs $u$ in the range $u \in [u_L, u_R] = [0.07, 0.22]$, and undergoes saddle-node bifurcations at these points; see Figure 6.1B. We seek the inputs that provide maximally informative observations for the purpose of accurately estimating the parameter vector $\theta = [\alpha_0, \alpha, K, n]$. We assume the experiments generate a noisy set of observations, $D = [y_1, ..., y_N]$, taken at long-time equilibrium at input values $U = [u_1, ..., u_N]$. Further, we assume the observations $y_i$ are measurements of single-cell concentrations (via a high-throughput instrument such as an automated microscope). Below, we derive the distribution of observations $y$ about the mean $x$ and the resulting likelihood function. We use the likelihood to construct the Fisher information matrix (FIM), and use a scalar function of the FIM as our optimization objective, following classical optimal experimental design theory [59]. The primary challenge with the proposed approach is that gene expression is noisy, and at long-time equilibrium there will inevitably be jumping between equilibria when the system is operating in the bistable regime. Thus the observations $y$ will be bimodally distributed for inputs $u \in [u_L, u_R]$. We can simulate this jumping behaviour by using the deterministic rate law in equation (6.1) to construct a corresponding stochastic

model, with reaction propensities as follows:

$$\text{Production:} \quad \Omega \cdot \left( \alpha_0 + \alpha \frac{(u + X/\Omega)^n}{K^n + (u + X/\Omega)^n} \right), \qquad \text{Decay:} \quad X. \qquad (6.2)$$

Here, X is the discrete (random) protein count and Ω is the system size. Simulating this system using the stochastic simulation algorithm (SSA) at a nominal system size of Ω = 90 [216], we see the bimodal distribution for intermediate values of u as shown in Figure 6.1A.
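As an illustration of the simulation just described, the following is a minimal sketch of Gillespie's direct method for the two propensities in equation (6.2), using the nominal parameters (with $K^n = 9$); it is a simplified stand-in for the SSA implementation actually used in this chapter.

```python
import numpy as np

def propensities(X, u, omega, a0=0.5, a=3.0, Kn=9.0, n=3.0):
    # Production and decay propensities of equation (6.2); Kn stores K^n
    x = X / omega
    prod = omega * (a0 + a * (u + x)**n / (Kn + (u + x)**n))
    return prod, float(X)

def ssa_sample(u, omega=90, X0=20, t_end=500.0, seed=0):
    # Gillespie's direct method: exponential waiting times, then a
    # propensity-weighted choice between production (+1) and decay (-1)
    rng = np.random.default_rng(seed)
    t, X = 0.0, X0
    while t < t_end:
        prod, dec = propensities(X, u, omega)
        total = prod + dec
        t += rng.exponential(1.0 / total)
        X += 1 if rng.random() * total < prod else -1
    return X / omega  # a long-time sample of the concentration

# A histogram of e.g. [ssa_sample(0.15, seed=k) for k in range(500)]
# recovers the bimodality seen in Figure 6.1A.
```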

Figure 6.1: (A) An empirical density plot of the simulated species concentration X/Ω for various inputs u, simulated using the SSA with Ω = 90. (B) The bifurcation curve (stable, solid; and unstable, dashed) for the steady state protein concentration x∗ from the deterministic model, with the LNA-derived standard error (shaded). The simulated mean values (open circles) and standard errors (error bars) were computed from the SSA simulation.

The bimodal distribution of the observations $y$ cannot be described by a normally-distributed homoskedastic or heteroskedastic error model around a single deterministic mean. We instead assume $y$ can be modeled by a Gaussian mixture and propose an approximate log-likelihood for the system of the form:

$$\ell(\theta|D, U) = \sum_i \log\left\{ \rho(u_i) \cdot \varphi_T(y_i|u_i, \theta) + [1 - \rho(u_i)] \cdot \varphi_B(y_i|u_i, \theta) \right\} \qquad (6.3)$$

Here $\varphi_T(y_i|u_i, \theta)$ and $\varphi_B(y_i|u_i, \theta)$ are the local distributions of $y_i$ around the top and bottom bifurcation branches, respectively, while $\rho$ represents the probability that a single-cell observation $y_i$ will have originated from $\varphi_T(y_i|u_i, \theta)$ for a given input $u_i$. Thus $[1 - \rho]$ is the probability that the observation originated from $\varphi_B(y_i|u_i, \theta)$.

To define $\varphi_T(y_i|u_i, \theta)$ and $\varphi_B(y_i|u_i, \theta)$, we apply Van Kampen's system size expansion (SSE) to the Master equation, using the ansatz $X = \Omega x + \sqrt{\Omega}\,\zeta$, where $\zeta$ represents the random fluctuations in species concentration around the deterministic mean concentration $x$ [217]. At first order in the SSE, we recover the deterministic rate equation (6.1) for the mean concentration $x$. At steady state, this rate law yields an equation for the steady state mean $x^*$ in terms of the input $u$,

$$0 = g(x, u, \theta) = \alpha_0 + \alpha \frac{(u + x)^n}{K^n + (u + x)^n} - x \qquad (6.4)$$

which has multiple solutions in the bistable region (i.e. for $u \in [u_L, u_R]$). Over that range we denote the lower root (stable fixed point) as $x_B^*$, the middle root (unstable saddle point) as $x_M^*$, and the upper root (stable fixed point) as $x_T^*$. In the monostable region where $u < u_L$, the only root is $x_B^*$; for $u > u_R$, only $x_T^*$ appears. Thus $x_B^*$ and $x_T^*$ define the bottom and top branches of the bifurcation curve, respectively. The stable upper and lower branches are shown in Figure 6.1B along with the means and standard deviations generated from SSA simulations. (To compute the means and standard deviations we used the position relative to the unstable branch to assign each SSA-simulated concentration to either the upper or lower basin of attraction.) The next order in the SSE yields the linear noise approximation (LNA) [217]. From the LNA we have an expression for the local variance of $y$ about $x^*$:

$$\sigma^2(x^*, u, \theta) = \frac{\alpha_0 + \alpha \dfrac{(u + x^*)^n}{K^n + (u + x^*)^n} + x^*}{-2\Omega \left.\dfrac{\partial}{\partial x} g(x, u, \theta)\right|_{x = x^*}}, \qquad (6.5)$$

We evaluated this variance function, $\sigma^2(x^*, u, \theta)$, at the roots $x_B^*$ and $x_T^*$ to compute the local variance around each bifurcation branch [218]. Figure 6.1B shows the reasonable agreement between the SSE and SSA for both the means and the standard deviations, at a system size of $\Omega = 90$. Using the means and variances derived from the SSE, we define the local distribution $\varphi_B(y_i|u_i, \theta)$ such that $\{y_i|\rho = 0\} \sim \mathcal{N}(x_B^*, \sigma^2(x_B^*, u))$ and $\varphi_T(y_i|u_i, \theta)$ such that $\{y_i|\rho = 1\} \sim \mathcal{N}(x_T^*, \sigma^2(x_T^*, u))$.

To describe the branch probability $\rho$, we use an alternate framing of the system dynamics as stochastic diffusion within a bistable potential. Focusing on the observation $y$, define $V(y) = \int_{y_0}^{y} -g(\nu, u, \theta)\, d\nu$ as a potential function for the vector field of the deterministic rate equation. Thus, as a somewhat ad hoc approximation, we can write $dy(t) = -\nabla V(y)\, dt + \sqrt{2\epsilon}\, dW$. Here $\epsilon$ represents the strength of a constant noise source and $dW$ are increments of the Wiener process. This approximation does not correspond directly to either the linear noise approximation (which has a constant noise source at steady state but a monostable potential about the deterministic mean) or the Kramers-Moyal expansion (which has a nonlinear diffusion term). However, it allows us to use Kramers' expression for the escape times between two wells in a potential [219]:

$$\tau_{BT} \approx \frac{2\pi}{\sqrt{V''(x_B^*)\,|V''(x_M^*)|}}\; e^{2(|V(x_M^*) - V(x_B^*)|)/\epsilon_B}, \qquad \tau_{TB} \approx \frac{2\pi}{\sqrt{V''(x_T^*)\,|V''(x_M^*)|}}\; e^{2(|V(x_M^*) - V(x_T^*)|)/\epsilon_T}. \qquad (6.6)$$

Here $\tau_{BT}$ and $\tau_{TB}$ are the expected waiting times for the first crossing of the unstable equilibrium $x_M^*$ from bottom-to-top, or top-to-bottom, respectively, assuming the system is initialized at equilibrium in the corresponding potential basin. Here $\epsilon_B$ and $\epsilon_T$ represent the noise strength within each potential well. Assuming that the well-to-well transition rates are inversely proportional to these expected waiting times, we can write an equilibrium constant for switching between wells as $K_e = \tau_{BT}/\tau_{TB}$. Recall that we defined $\rho$ as the probability of being in the well around $x_T^*$. We can then write $\rho = K_e/(1 + K_e)$, i.e.

$$\rho(u, \theta) \approx \operatorname{logistic}\!\left( \frac{|V(x_M^*) - V(x_B^*)|}{\epsilon_B} - \frac{|V(x_M^*) - V(x_T^*)|}{\epsilon_T} + \frac{1}{2}\log\left(V''(x_T^*)\right) - \frac{1}{2}\log\left(V''(x_B^*)\right) \right), \qquad (6.7)$$

where $\operatorname{logistic}(z) = (1 + e^{-z})^{-1}$. This is essentially the Eyring-Kramers (EK) law for reaction rates [219], where the logistic argument involving the potential is a function of the input $u$. While this ad hoc approximation provides some mathematical insight, we

cannot hope to generalize it directly to higher dimensional systems for which potentials cannot be readily constructed. Moreover, it relies on the assumption of constant noise that is small relative to the depth of the potential well. While these drawbacks make this expression unsuitable for direct application, the functional form of the EK approximation for $\rho(u, \theta)$ suggests a logistic function may provide a satisfactory approximation for the branch probability. As such, we have used the approximation:

$$\rho(u, c_0, c_1) \approx \frac{1}{1 + \exp(-(c_0 + c_1 u))}. \qquad (6.8)$$

Here, the nonlinear expression involving the potential has been replaced with a linear expression in $u$. To assess the accuracy of this approach, we used the SSA to numerically determine the probability that the state lies in the basin of attraction of either the top or bottom stable fixed point after long simulation times; see Figure 6.2A. As shown, the SSA results are fit well by a logistic function over a range of system sizes. These logistic curves are specified by parameters $c_0$, $c_1$, which we organize into the vector $c = [c_0\ c_1]$. These are auxiliary parameters, in the sense that, while having some underlying connection to the mechanistic parameters $\theta$, they are not the direct target of our experiment, and serve primarily to account for variability that would otherwise make the model intractable.
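For illustration, a logistic curve of the form (6.8) can be fit to basin-occupancy frequencies with standard tools. In the sketch below, the tabulated probabilities are hypothetical stand-ins for the SSA-derived values plotted in Figure 6.2A, and the initial guess is taken from the $\Omega = 90$ fit reported there.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(u, c0, c1):
    # Equation (6.8): branch probability as a logistic function of the input
    return 1.0 / (1.0 + np.exp(-(c0 + c1 * u)))

# Hypothetical data: fraction of SSA runs ending in the upper basin,
# tabulated on a grid of inputs
u_grid = np.linspace(0.08, 0.24, 9)
p_upper = np.array([0.02, 0.05, 0.15, 0.35, 0.60, 0.82, 0.93, 0.98, 0.99])

(c0_hat, c1_hat), _ = curve_fit(logistic, u_grid, p_upper, p0=[-18.0, 111.0])
```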

Figure 6.2: (A) Probability that the system is within the basin of attraction of the upper equilibrium point at long-time equilibrium for $\Omega = \{60, 90, 120\}$; SSA (circles), corresponding logistic fit (curve). The fit values are: $c = [-18.1\ 111.7]$ for $\Omega = 90$, $c = [-24.2\ 150.3]$ for $\Omega = 120$ and $c = [-12.1\ 73.8]$ for $\Omega = 60$. (B) The Hartigan dip statistic computed for various inputs using N = 100, 1000, or 10000 observations. The algorithm was taken from [1].

Substituting the expression for $\rho(u, c)$ into (6.3) results in an appropriate mixture likelihood function on the bistable region. However, it cannot be evaluated as written on the monostable branches, where only one of the two distributions is defined. One approach to resolving this issue would be to modify the behaviour of $\rho$ at the bifurcation points in a piecewise fashion. But that would introduce points of non-differentiability (with respect to $\theta$) in $\rho$, which would invalidate use of the Fisher information [220]. As a tractable alternative, we define a guaranteed bistability region, as follows. We presume that preliminary experiments have identified a region of bimodality. This can be achieved by, e.g., a screening experiment using a grid or bisection search with a bimodality statistic, such as the Hartigan Dip Statistic (HDS) [221], shown in Figure 6.2B. (The HDS is a min-max, non-parametric comparison of the observed distribution with the family of unimodal distributions; see [221] for details.) The HDS can detect a subinterval of the bistable region where $\rho$ deviates significantly from 0 or 1. Using an appropriate threshold, this data can be used to define a region of guaranteed bistability $u \in [B_L, B_R] \subset [u_L, u_R]$. For values of $u$ outside $[B_L, B_R]$, we set $\rho$ to either 0 or 1, thus approximating the system as monostable. By imposing the conservative bounds $B_L$ and $B_R$, we have traded precision at the bifurcation points for tractability (and smoothness with respect to the parameters) of the resulting likelihood. To make use of this construction, we restrict the permissible parameter vectors $\theta$ so that the model's bifurcation points fall outside the guaranteed bistable interval: $\Theta \in \{\theta \mid u_L(\theta) < B_L \ \&\ B_R < u_R(\theta)\}$. The likelihood function can then be written as:

$$\ell(\theta, c|D, U) = \sum_i \log\left\{ \rho(u_i, c) \cdot \varphi_T(y_i|u_i, \theta) + [1 - \rho(u_i, c)] \cdot \varphi_B(y_i|u_i, \theta) \right\}, \quad \text{where} \quad \rho(u, c) = \begin{cases} 0 & u \leq B_L \\ \operatorname{logistic}(c_0 + c_1 u) & B_L < u < B_R \\ 1 & B_R \leq u \end{cases} \qquad (6.9)$$
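A minimal sketch of how one term of the likelihood (6.9) could be evaluated is given below; the branch probability and the two branch means and standard deviations are assumed to be precomputed (from the logistic fit and the LNA, respectively).

```python
import numpy as np
from scipy.stats import norm

def mixture_loglik(y, rho, mean_T, sd_T, mean_B, sd_B):
    # One term of equation (6.9); rho comes from the (piecewise) logistic,
    # and the branch means/deviations come from the LNA at the two roots
    return np.log(rho * norm.pdf(y, mean_T, sd_T)
                  + (1.0 - rho) * norm.pdf(y, mean_B, sd_B))
```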

Note that $\rho(u, c)$ depends only on the auxiliary parameters $c$ while the local distributions $\varphi_B$ and $\varphi_T$ depend only on the reaction parameters $\theta$. We will use the Fisher information to quantify experimental optimality and thus identify the optimal set of inputs. The inverse of the Fisher information matrix is equivalent to the asymptotic covariance of the parameter estimates computed using a maximum likelihood estimator [222]. Calculation of the Fisher information for this nonlinear system relies on a nominal parameterization, $\hat{\theta}$, $\hat{c}$. Consequently, our results are only guaranteed to be locally optimal. The initial screening experiments used to define the guaranteed bistability bounds can provide these nominal (preliminary) estimates. The Fisher information for a single sample $u_i$ is

$$I(u_i|\hat{\theta}, \hat{c}) = \mathbb{E}_{y_i}\left[ \nabla_{\theta,c}\, \ell(\theta, c|y_i, u_i)^T\, \nabla_{\theta,c}\, \ell(\theta, c|y_i, u_i) \right] \qquad (6.10)$$

where $y_i$ is the observed steady state protein concentration resulting from input $u_i$. For inputs where the response is approximated as monostable (i.e. $u_i < B_L$ or $B_R < u_i$), the Fisher information has an analytic expression. It can be computed as

$$I(u_i|\hat{\theta}) = \frac{\nabla_\theta x^*(u_i, \hat{\theta})^T\, \nabla_\theta x^*(u_i, \hat{\theta})}{\sigma^2(x^*, u_i, \hat{\theta})} + \frac{1}{2} \frac{\nabla_\theta \sigma^2(x^*, u_i, \hat{\theta})^T\, \nabla_\theta \sigma^2(x^*, u_i, \hat{\theta})}{\left( \sigma^2(x^*, u_i, \hat{\theta}) \right)^2} \qquad (6.11)$$

Here $x^*$ corresponds to $x_B^*$ or $x_T^*$ depending on the input. The sensitivity of the branch mean $x^*$ is

$$\nabla_\theta x^*(u_i, \theta) = \frac{\dfrac{\partial g(x^*, u_i, \theta)}{\partial \theta}}{\left( 1 - \dfrac{\partial g(x^*, u_i, \theta)}{\partial x^*} \right)}, \qquad (6.12)$$

For the variances, we can compute the derivatives analytically as

$$\nabla_\theta \sigma^2(x^*, u_i, \theta) = \frac{\partial \sigma^2(x^*, u_i, \theta)}{\partial \theta} + \frac{\partial \sigma^2(x^*, u_i, \theta)}{\partial x^*}\, \nabla_\theta x^*(u_i, \theta). \qquad (6.13)$$

To evaluate each of equations (6.11)-(6.13), we solve for the roots $x_B^*$ and $x_T^*$ of $g(x^*, u_i, \theta)$ numerically.
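A simple sketch of this root-finding step is shown below, assuming a bracketing grid over a hypothetical concentration range; it returns one root in the monostable regime and three (the middle one unstable) in the bistable regime.

```python
import numpy as np
from scipy.optimize import brentq

def g(x, u, th):
    # Right-hand side of equation (6.4)
    a0, a, K, n = th
    return a0 + a * (u + x)**n / (K**n + (u + x)**n) - x

def steady_states(u, th, x_max=6.0, n_grid=400):
    # Bracket sign changes of g on a grid, then polish each root with brentq
    grid = np.linspace(0.0, x_max, n_grid)
    roots = []
    for lo, hi in zip(grid[:-1], grid[1:]):
        if g(lo, u, th) * g(hi, u, th) < 0:
            roots.append(brentq(g, lo, hi, args=(u, th)))
    return roots
```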

For inputs in the guaranteed bistable region $u \in [B_L, B_R]$, calculation of the Fisher information is more complicated due to the mixed Gaussian response. We computed the sensitivity $\nabla_{\theta,c}\, \ell(\theta, c|y_i, u_i)$ using the algorithmic differentiation capabilities of CasADi [196]. The expectation integral in the Fisher information (equation 6.10) is then computed numerically using MATLAB's globally adaptive quadrature function integral.

In this work we solve for what is known in the OED literature as the approximate optimal design [223]. Using this approach, the optimal design consists of a set of unique inputs $\bar{U} = \{u_1, ..., u_S\}$ and a corresponding set of sampling weights, $\bar{\xi} = \{\xi_1, ..., \xi_S\}$, such that $\sum_i \xi_i = 1$. The weight $\xi_i$ reflects the fraction of the total number of observations that should be made with the given input $u_i$. (The reason these designs are called approximate is that, with a finite number of total samples in the experiment, one may only approximate the optimal weighting unless the solution happens to have relatively simple fractional weights.) The total Fisher information for the experiment is additive over inputs $u$ (assuming uncorrelated observations); therefore for an approximate design the total FIM is a weighted average, with weights $\bar{\xi} = [\xi_1, ..., \xi_S]$:

$$I_{Tot}(\bar{U}, \bar{\xi}|\hat{\theta}, \hat{c}) = \sum_i^S \xi_i\, I(u_i|\hat{\theta}, \hat{c}). \qquad (6.14)$$

Next, we select a scalar function of the total information matrix as the OED objective. We use two different optimality measures. The first is the standard D-optimal objective, defined as the determinant of the total Fisher information, $|I_{Tot}(\bar{U}|\hat{\theta}, \hat{c})|$ [15]. Maximizing this objective is equivalent to minimizing the volume of the asymptotic confidence ellipsoid for the entire parameter set $(\theta, c)$. Because our primary interest is in accurately estimating the reaction parameters $\theta$, we also consider Ds-optimality, which aims to specifically minimize the confidence ellipsoid of $\theta$, without consideration for the accuracy in $c$ [15]. To define Ds-optimality, first recall that the asymptotic covariance matrix, $C$, is equal to the inverse of the total FIM, $I_{Tot}$. We block partition both matrices as follows:

$$I_{Tot} = \begin{bmatrix} I_{\theta,\theta} & I_{\theta,c} \\ I_{\theta,c}^T & I_{c,c} \end{bmatrix}, \qquad C = \begin{bmatrix} C_{\theta,\theta} & C_{\theta,c} \\ C_{\theta,c}^T & C_{c,c} \end{bmatrix}. \qquad (6.15)$$

A Ds-optimal design minimizes the determinant of $C_{\theta,\theta}$, thus minimizing the confidence ellipsoid for $\theta$. This is equivalent to maximizing the determinant [15]:

$$\left| I_{\theta,\theta} - I_{\theta,c}\, I_{c,c}^{-1}\, I_{\theta,c}^T \right| = \frac{|I_{Tot}|}{|I_{c,c}|}. \qquad (6.16)$$

Having defined the D-optimal and Ds-optimal objectives, we can identify optimal experimental designs $\{\xi_i, u_i\}$. To determine the D-optimal designs we use the CVX package

over an adaptively refined grid of candidate $u_i$'s [224]. For the Ds-optimal designs we use CasADi and the IPOPT package for optimization, again using an adaptive grid for $u$ [196, 197].
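For illustration, the two objectives can be evaluated from a given total FIM as in the sketch below; the Schur-complement form implements equation (6.16), and the assumed ordering places the $\theta$ block before the $c$ block.

```python
import numpy as np

def d_objective(I_tot):
    # log-determinant of the full FIM (a numerically stable D-optimal score)
    return np.linalg.slogdet(I_tot)[1]

def ds_objective(I_tot, n_theta):
    # Schur complement of equation (6.16): eliminate the auxiliary logistic
    # parameters c and score only the reaction-parameter block
    I_tt = I_tot[:n_theta, :n_theta]
    I_tc = I_tot[:n_theta, n_theta:]
    I_cc = I_tot[n_theta:, n_theta:]
    schur = I_tt - I_tc @ np.linalg.solve(I_cc, I_tc.T)
    return np.linalg.slogdet(schur)[1]
```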

6.4 Results

We begin by investigating the manner in which the optimal designs depend on the nominal values of the logistic parameters $c$, with $\theta$ fixed at the nominal vector $[0.5, 3, \sqrt[3]{9}, 3]$, and with $\Omega = 90$, $B_L = 0.1$ and $B_R = 0.2$. For this analysis, we reparameterize the function $\rho$ such that $c_0 = -c_c c_1$. The new parameter $c_c$ corresponds to the $u$-value for which $\rho = 0.5$. This allows us to independently set the midpoint and slope of the logistic. Figure 6.3A depicts $\rho$ for $c_1$ fixed and $c_c$ varying; Figure 6.3B shows $\rho$ for varying $c_1$ values with $c_c$ constant. The parameter ranges were chosen centered on the fit values of $c = [-18, 111]$ at $\Omega = 90$ (shown in Figure 6.2A). The corresponding optimal designs are depicted beneath: optimal inputs, $u_i$ (stem locations); optimal weights, $\xi_i$ (stem heights). Both the Ds- and D-optimal designs are shown.

These results reveal that the optimal design in each case involves (i) a modest experimental effort to characterize the monostable response at the extremes of the feasible input range, and (ii) a much heavier sampling within the bistable region. For Ds-optimal experiments, the optimal design clusters the bimodal samples near the mid-point of the logistic function (where $\rho \approx 0.5$). For D-optimal experiments, which aim to characterize the full parameter set $(\theta, c)$, optimal samples consistently track a pair of percentiles of $\rho$ above and below the mid-point.

We next investigated the sensitivity of optimal sample placement to the nominal $\theta$ vector, by holding the logistic parameters constant at $c = [-18.0, 111]$ and randomly selecting ten $\theta$ vectors from a normal distribution with mean $[0.5, 3, \sqrt[3]{9}, 3]$ and standard deviations of 10% of the mean values. To ensure all $\theta$ were consistent with the assumptions used to derive the likelihood, we rejected parameter sets that did not result in deterministic bistability at $u = B_L - \epsilon$ and $u = B_R + \epsilon$, with $\epsilon = 0.05$, as well as rejecting parameter sets that resulted in bistability at $u = 0$ and $u = 0.3$ (the lower and upper bounds of the feasible range for $u$). Figure 6.4A depicts stable branches of the ten random bifurcation curves for the chosen $\theta$ parameter sets. Figures 6.4B and C depict the corresponding Ds-optimal and D-optimal designs, respectively. With the logistic parameters held constant, the optimal samples for both Ds- and D-optimality show a similar placement and weighting, despite the variability in the model parameterization. This finding suggests that, at this system size ($\Omega = 90$), the shape of the logistic function (which characterizes the bimodality)

Figure 6.3: (A) Plots of $\rho$ for $c_c = 0.152, 0.162, 0.172$ ($c_0 = -16.9, -18.0, -19.1$), with $c_1 = 111$. (B) Plots of $\rho$ for $c_1 = 76, 111, 156$ with $c_c = 0.162$ ($c_0 = -12.33, -18.00, -25.30$). (C, D) Corresponding Ds-optimal designs. (E, F) Corresponding D-optimal designs. Input values $u_i$ correspond to stem locations. Weights $\xi_i$ correspond to stem heights. Note the short stems at both ends of each feasible input range.

has a strong effect on the optimal sample placement. This also implies that the preference between the D and Ds objectives will depend predominantly on the experimental plans. If the experimenter will only perform a single experiment, the Ds design yields an optimal estimate for the primary parameters $\theta$. However, if the experimenter plans to perform a series of iterated experiments, refining their parameter estimates and re-optimizing their experimental design after each iteration, the D-optimal design will yield a sequence of improved estimates for both $\theta$ and $c$. As the optimal design appears to depend largely on the values of $c$, the D-optimal objective is ideal for this case.

Referring to Figures 6.3 and 6.4, we note that the Ds-optimal designs tended to sample from a single mid-point near the middle of the bistable region, at a percentile near $\rho = 0.5$. Likewise, the D-optimal designs appear to select a pair of optimal inputs that map to

Figure 6.4: (A) Stable branches of the bifurcation curves generated from ten random parameter sets $\theta$. (B) Ds-optimal sample weights and inputs for the chosen $\theta$ parameter sets, with logistic parameters $c$ held constant. (C) Corresponding D-optimal designs.

relatively consistent and symmetric percentiles on the $\rho$ logistic curve. To further assess this trend, we generated 500 fully randomized parameter sets. Vectors $\theta$ were chosen as above; values of $c_1$ were selected from a uniform distribution over $[91, 141]$, while $c_c$ was selected from a uniform distribution over $[0.125, 0.175]$. This ensured the mid-point $\rho = 0.5$ did not fall too close to the boundaries of the guaranteed bistable region $B_L = 0.1$ and $B_R = 0.2$. Figure 6.5 shows histograms of the Ds- and D-optimal sampling locations in the bistable region across all of the randomly generated parameter sets, where the optimal inputs $u$ have been mapped against percentiles of the logistic $\rho$. The figure shows clear trends: the Ds-optimal input, for use when applying a single round of OED, lies near the 60th percentile of $\rho$; the D-optimal inputs, which are suited to an iterative OED application, occur near the 25th and 85th percentiles of $\rho$. Referring to Figure 6.2, we note that the 60th percentile of $\rho$ is approximately where the HDS peaks, while the 25th and 85th percentiles occur where the HDS rises above the baseline. These results suggest

Figure 6.5: (A) Histogram for 500 random parameter sets; shown are the frequencies of Ds-optimal input values that fall within the bistable region, plotted according to the percentiles of the logistic $\rho$ to which they correspond. (B) Corresponding histogram for D-optimal inputs.

a simple heuristic for performing OED in this bistable system by using the HDS to do a preliminary search for these locations. After these critical values have been located, the experimenter can use the existing data to provide initial estimates of $\hat{\theta}$ and $\hat{c}$ and perform a D- or Ds-optimal design. For a crude approximation of the optimal design, the logistic could be fit using only the peak and bounds of the elevated HDS signal to approximate the percentile locations. These critical HDS locations could also be used to select the inputs $u$ directly. The sampling weights $\xi$ could then be approximated based on an average of weights across many $\theta$ values, as shown in Figure 6.4. As shown in Figures 6.4B and C, $\theta$ appears to have a limited effect on the optimal design once the logistic is known, with the majority of samples being taken in the bimodal input range.

6.5 Conclusion

In this work we applied OED to an inducible and auto-activating gene expression motif that exhibits bistability. We used a stochastic model to derive an approximate likelihood function, based on the LNA, and a logistic approximation for stochastic switching between stable points motivated by the Eyring-Kramers law. We defined the likelihood so as to avoid irregularities caused by the bifurcations. We then used Ds- and D-optimality criteria to determine optimal inputs and sampling proportions for steady state experiments. We showed that, for the given nominal parameter set and system sizes, the logistic switching function strongly influences the optimal input placement, and that optimal inputs roughly correspond to certain percentiles on the logistic curve. Our results suggest that bimodality statistics, like the HDS, may provide a convenient method to perform optimal experiments in the absence of accurate initial estimates for the model parameters. These results can be further investigated by additional simulation studies, to examine a wider range of system sizes and parameters. Additional tests can incorporate maximum likelihood fitting to SSA-generated data. We chose the somewhat artificial one-dimensional system used here because it provided a simple, tractable example, allowing us to more thoroughly illustrate the connection between optimal designs and bistability. Extending our analysis to higher-dimensional systems will reveal whether the simple design heuristic we discovered may be representative of a more general property of OED for bistable systems. An obvious next step is to extend our analysis to a more realistic two-dimensional toggle switch, a direction which we are currently pursuing. Another possible line of inquiry would involve using model and dimensionality reduction techniques to map higher dimensional systems into a one- or two-dimensional observation space. These reduction techniques may be easier to apply at steady state, rather than in a dynamic context, which is an attractive feature of targeting steady state experiments. Finding an appropriate model reduction procedure would ideally make it straightforward to validate and apply our heuristic design rule to a much broader class of systems.

Chapter 7

A Software Package for Optimal Experimental Design in Systems Biology

7.1 Introduction

Models used in systems biology are often non-linear, multi-dimensional, and dynamic. Models of this type are difficult to study and fit to data due to the computational cost of simulating them and the lack of analytical tools available for studying their behaviour. However, due to the nature of the biochemical systems of interest, there is often no recourse to model simplification, and the applied modeler is forced to contend with an unwieldy model in order to capture the behaviour of interest. Fortunately, the increasing availability of novel experimental techniques and high-quality numerical tools can provide some remedy [154]. Calibrating complex models is made more feasible with the availability of richer, more informative datasets. Biology has seen significant advances in experimental techniques, including methods for real-time, single-cell, multivariate measurements in dynamic contexts [154]. Recent decades have also seen the community converge on a number of high-level modelling languages such as Python and MATLAB, as well as the emergence of a large variety of specialized computational tools for working with the numerical models that are common in systems biology [225, 226]. However, as experimental interests shift to studying more complex and dynamic systems-level behaviour in vivo, modelers will likely need to take a more active role in guiding experimentation. Numerical tools are needed that reverse the traditional flow of information, which historically emanated from

experimentalists, who collected the data, and was transferred to modelers, who fit models, made predictions, and theorized about mechanisms. Experimental design tools are valuable for systems-level experimentation, as they can take existing understanding of a system, described mathematically, and generate efficient experimental designs for understanding the system's behaviour. In this way, experimental design tools can help to strengthen the feedback between theory and experiment. By including mathematical modelling within the experimental loop, it is hoped that the community can improve the efficiency of data collection and the accuracy of model calibration for more complex systems [154]. This in turn will increase the reliability of model predictions, making models more useful for generating new hypotheses in natural systems and in engineering synthetic ones. To this end, this chapter describes the implementation of the Non-linear Optimal Experimental Design (NLOED) software package, which seeks to provide convenient OED tools, in an open-source Python module, for experimentalists interested in fitting non-linear systems biology models.

7.1.1 Past Works

There has been growing interest in experimental design methods in systems biology (see, e.g., the works cited in [105, 22, 154]), including the previous works contained in this thesis [159, 227, 228]. All of these works have required a custom implementation of the experimental design algorithms in each case. These studies require many common numerical procedures, including model simulation, fitting, sensitivity analysis, computing Fisher information matrices, and optimization over the design space. There are still only limited examples of studies that implement numerically designed experiments in the laboratory [9, 10], likely because implementing optimal design algorithms is quite involved, especially for groups that are already focusing on the practical aspects of experiments. Purpose-built software like NLOED can provide significant benefit to model builders and experimentalists, by making design and analysis of experiments for systems biology easier to perform, without the need to re-implement algorithms or prior knowledge of experimental design theory. A modular framework combining many of the common elements of experimental design will make it easier to study optimal designs; such study will inform modelers about the feasibility of model calibration and help guide experimentalists to improve the efficiency of data collection.

Interest in experimental design spans a number of related fields, including statistics [229], chemical engineering [21], control theory [44], and pharmacokinetics/dynamics (PKPD) [230], as well as systems biology. Existing software tools have been developed across these various fields. Statistics has a long history of studying optimal design, and many established

software suites such as SAS and R have experimental design algorithms available [231, 232]. R in particular provides access to a wide range of experimental design tools (see the DOE task view page at [231]). The vast majority of these packages are aimed at the traditional static regression models used for empirical modelling in statistics. While these tools are well established and maintained, SAS, R and other statistical languages are less commonly used in systems biology, where MATLAB and Python are the preferred languages for modelling dynamic systems. PKPD researchers have also been interested in optimizing experimental design in the pharmaceutical field, and several notable experimental design packages have been developed within this field. These include the PopED package from Nyberg et al. (available in MATLAB) [233] and PFIM 3.0 from Bazzoli et al. (available in R) [234], among others thoroughly compared in later work by Nyberg et al. [230]. These tools are more focused on dynamic models than those in statistics; however, they place special emphasis on approximating the FIM for mixed models (models that include random coefficients that vary between subjects). This emphasis is warranted because subject-specific effects are a common and unavoidable occurrence in pharmaceutical research, where animal and human subjects are necessary. Computing the FIM for models with these mixed sources of variability is difficult, and these additional sources are not often of interest to systems biologists working with microorganisms. The modelling interests and experimental constraints of microbial systems biology are different enough from those of PKPD practitioners that a separate set of software tools is desirable. Bayesian design methods are of particular interest to those studying non-linear models, including those in microbial systems biology, as they allow modelers to include prior information on parameter uncertainty rather than relying on purely local analysis. An early example of a Bayesian design package was published by Clyde [235]; however, this was released in 1993 in the now defunct XLisp-Stat language. Bayesian design has experienced renewed interest in recent years; of special interest is the aceBayes package by Overstall and Woods [236], which implements Bayesian design optimization for a variety of models. Their very recent work has begun to address dynamic models of biology using approaches similar to those found in aceBayes, and this work will be of considerable interest to systems biology practitioners going forward [237]. Despite these promising developments in Bayesian design software, Bayesian methods can be both theoretically complex and computationally intensive. For pragmatic reasons, a dedicated optimal design package for systems biology, focusing on well established non-Bayesian optimal design methods, is desirable. An accessible package, written in a high-level language and using classic OED techniques, will simplify the application of optimal design in systems biology and encourage wider adoption of design optimization.

7.1.2 Motivation and Objectives

In addition to the challenges faced by other experimental design tools, non-PKPD systems biology poses a range of unique challenges, especially with respect to developing a general-use software package for the field. Biological systems often exhibit strong non-linearity, multi-stability and bifurcations [238]. Interesting dynamic behaviour can occur over a range of timescales, and steady-state relations are often described by implicit functions. Often the systems of interest have relatively small species counts, resulting in significant stochastic effects, and replication can be challenging and take on multiple meanings [239]. Compounding this, biological systems often have a large number of separate species, many of which may be inaccessible to measurement. Data from biological experiments can come from a variety of assay types, and often exhibit non-normal error distributions. In addition, measurements may span multiple physical scales, from individual chemical species counts, to single-cell data, to bulk population measurements, all within the same experiment. As such, a single software tool will likely not address all systems biology applications equally well, but we have identified several key objectives for a systems biology-focused software package like NLOED, which are outlined below:

• Develop the package in an established modelling language for users in systems biology (in this case Python) and make the tool open source and accessible. It is also desirable that the package interface easily with other established numerical libraries (e.g. NumPy, Pandas, Matplotlib) so that users can easily import and export data and results.

• Support optimal design for both static and dynamic nonlinear models, including multi-input and multi-output models. This will allow researchers to study dynamic and steady-state behaviour for a wide variety of systems and experimental protocols within one tool.

• Accommodate non-normal observation distributions (e.g. Poisson, binomial and log-normal) and allow for a variety of observation distributions within the same model. Such heterogeneous distributions are often found in biological experiments with assays such as plate counts and fluorescent measurements.

• Provide not only the design algorithms, but also the fitting, evaluation, simulation and data sampling routines that are useful in an overall workflow for model building and analysis. These features allow the package to serve as a standalone environment for model development with a specific focus on experimental design. These features

also ensure that the asymptotic tools used in design are paired with their appropriate fitting algorithms, decreasing the amount of statistical knowledge the user needs.

In addition to these design objectives, ongoing development in other numerical fields has provided some opportunity to update the experimental design tool set. Optimal experimental design for non-linear models relies heavily on sensitivity analysis in order to compute asymptotic objectives based on the Fisher information matrix. Optimal design also depends significantly on efficient optimization algorithms. We have taken advantage of maturing open-source numerical projects available for sensitivity analysis and optimization. Specifically, while most past software tools have relied on finite difference methods for computing model sensitivities, these can be numerically error-prone and computationally inefficient [240]. Automatic differentiation (AD) provides a novel alternative, in which derivatives are computed through code generation created at run-time by specialized libraries, ensuring rapid computation with minimal numerical error [196]; a short sketch illustrating this follows the list below. AD tools have matured significantly in recent years, with higher-level tools emerging that are easier to use [196]. The NLOED package has been built on top of CasADi, an AD-equipped software tool for rapid prototyping of optimal control problems [196]. CasADi is a natural choice for integration with a systems biology experimental design tool, as it is primarily built for the optimal control community, a field with formalisms and objectives similar to those of many systems biologists. In addition, CasADi provides a convenient interface to specialized third-party open-source optimization frameworks such as IPOPT [197]. Our package therefore uses some of the most cutting-edge numerical tools available, an improvement on past optimal design packages.

While many researchers would benefit from the above objectives, there remains a huge variety of users, models and design objectives that would identify with the umbrella term 'systems biology'. Given that each choice of a numerical method comes with certain limitations, necessary design trade-offs had to be made during software development. Below we outline specific design decisions that, while narrowing the package's use cases, are required to ensure a well-tested and cohesive software tool:

• The package is designed with a programmatic interface, rather than a graphical interface. While written in a high-level language and in an object-oriented style, the NLOED package is designed to be modular so that users can generate a wide variety of experiments and simulation studies with the available code. Designs are not generated in a single out-of-the-box function call, and the user requires some familiarity with the package classes and functions. Helper functions and interfaces can be built on top of the package in future development.

• The package supports deterministic model structures that can be implemented in CasADi's modelling framework; this includes both algebraic and differential equation-based models. The package supports a variety of common observation distribution types, including many from the exponential family.

• The package is primarily focused on local, asymptotic objectives based on the Fisher information matrix. This type of objective benefits most from the use of the available AD and optimization frameworks. While not as robust to parameter uncertainty, optimization is comparatively fast for these objectives, allowing for multiple local analyses in a comparatively short period of time.

• Optimization objectives in the package are primarily dedicated to parameter accuracy, specifically D-optimal and related objectives (see Chapter 2 for details). Model selection objectives have not been implemented in the initial release, although robustness to model uncertainty has been given some attention.

• The package relies on gradient-based non-linear programming via the IPOPT package. This type of optimization also benefits significantly from the AD tools available in CasADi. This approach is computationally efficient and scales well, at the expense of local optima and some sensitivity to starting designs.

• In the initial release we have not implemented multi-shooting or collocation methods, which would allow optimal design to scale to more complex and higher-dimensional dynamic experiments. While powerful, these methods can be highly specialized and require a more complex interface and dedicated code for specifying the optimization problem. However, these methods could be included in future releases, given that CasADi is an ideal tool on which to build related optimal control algorithms.

• In the package we solve a relaxed version of the design optimization problem and then provide the user with rounding methods to generate an exact, implementable design with a specific sample size. This is an optimize-then-discretize approach with respect to the replicate allocation. This approach benefits most from the available AD and optimization tools, and is fast and flexible, at some potential expense to optimality in small-sample-size designs.
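The sketch referred to above uses CasADi to generate an exact parametric sensitivity for a toy response function; it shows only the CasADi AD mechanism, and the response function is a hypothetical example rather than anything from NLOED's internal code.

```python
import casadi as cs

# Sensitivity of a hypothetical response surface to its parameters, computed
# by CasADi's run-time derivative generation rather than finite differences
u = cs.SX.sym('u')
theta = cs.SX.sym('theta', 3)
eta = theta[0] + theta[1] * u**theta[2]        # toy response surface
jac = cs.jacobian(eta, theta)                  # exact symbolic Jacobian
sens = cs.Function('sens', [u, theta], [jac])  # compiled sensitivity function
print(sens(2.0, [0.5, 1.0, 1.5]))              # one row of a sensitivity matrix
```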

To summarize, our design choices have focused on harnessing the underlying AD and optimization tools as effectively as possible to achieve a fast turnaround in modelling and design, at the expense of using local and relaxed approximations. This means users can quickly, and in some cases interactively, design multiple experiments and explore a

range of cases. This is motivated by the philosophy that it is better to explore multiple scenarios quickly and approximately, rather than to exactly optimize the "perfect" objective only once, after significant computational cost and delay. Furthermore, in the author's experience, any single optimal design, while valuable, always embodies trade-offs against competing possibilities. Significant value is often obtained in going through the process of experimental design, as it allows the user to discover qualitative properties of good design structure for their system. The design process also allows the user to explore model identifiability under various scenarios. In pursuit of these qualitative and exploratory goals, a fast turnaround, with flexibility to examine multiple possibilities, is an important attribute. In pursuit of flexibility, the package does not deliver a design in a single function call or through a pre-fixed graphical interface. We have also prioritized flexibility and modularity in the package interface, keeping to the ideal that it is better to show the user how to use a modular set of functions for many things, rather than force a user to solve a single type of problem with a single purpose-built function. In a similar fashion, we have focused on supporting as broad a range of models and distributions as possible, believing it is better to support many models adequately rather than only a niche set of model types with fully optimized numerics. This flexibility will ideally allow the user to explore a wide variety of models of a given system (i.e. steady-state and dynamic behaviour) in the same software environment.

7.2 Workflow and Definitions

The NLOED package is written in Python 3 and can be used in Python scripts or interactively in the interpreter. The package consists primarily of two core classes: the Model class and the Design class. The Model class encodes information about model structure, observation distributions, and model-specific functions such as those used for fitting, simulation and model analysis. The Design class accepts models and other design information and can be used to optimize and output experimental designs. Model equations passed into the Model class are created using CasADi symbolics and the CasADi Function class. Use of CasADi constructs enables much of the auto-generated functionality within the two main classes. Due to this reliance on CasADi, the core package does require some familiarity with model construction in the CasADi framework. However, the modular and object-oriented nature of the core package classes means it is amenable to future extensions, such as wrapping the core classes in more beginner-friendly helper functions or GUI interfaces. Output from both the Model and Design classes is primarily returned as Pandas and Numpy data structures. For example, experimental designs, model simulations and predictions are exported as Pandas dataframes. Data, candidate designs, and other user-provided information are also passed into class functions as dataframes. This makes it easy to read in raw data from Excel and CSV files via Pandas functions like read_csv(), to_csv(), read_excel() and to_excel(). The use of dataframes also makes it easy to plot model predictions and sample data using third-party plotting packages such as Matplotlib, as well as to export output to a variety of formats for use in other tools such as MATLAB and R.
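As a brief illustration of this interoperability, the sketch below shows a typical round trip through CSV files using standard Pandas calls (the file names and column values here are hypothetical; the design and dataset formats are detailed in the following sections):

import pandas as pd

#construct a small design dataframe (hypothetical values)
design = pd.DataFrame({'x1': [0, 1], 'Variable': ['y1', 'y1'], 'Replicates': [2, 3]})
#export the design to CSV for use in the lab or in other tools
design.to_csv('design.csv', index=False)
#read raw observations back in from a CSV file whose columns
#follow NLOED's dataset conventions
dataset = pd.read_csv('data.csv')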

Figure 7.1: A diagram depicting the archetypal NLOED workflow including model creation using the Model class, and design creation using the Design class. Here x is the vector of model inputs describing the experimental conditions, y are the observation variables, θ are the unknown model parameters, f(.) is the model function and p(.) is the data distribution, see text for further description.

The majority of use cases for the package center around a simple archetypal workflow, as shown in Fig. 7.1: 1) model creation: create a model using CasADi symbolics and the NLOED Model class; 2) design creation: pass the Model instance, along with other design specifications, into the Design class constructor to create an optimal design. This simple pattern can be recycled in a variety of ways, e.g. to construct simulation studies, to compare designs across different models, or to generate design variations with a single model. The Model class can also perform a range of other functions, such as fitting and simulation, which are useful in both real experimental work and simulation studies.

7.2.1 Model Definition

Before discussing the programmatic details of model creation in the NLOED package, we first define the mathematical structure NLOED accepts as a suitable model. NLOED models generally follow the model specification given in Chapter 2, which is briefly reviewed here. The model connects the random observation variables, Y_i, with the model input vector, x, which quantifies the experimental conditions. Note that when considering the model declaration the user can ignore the j subscript on the input vectors, x_j, as the number of unique experimental conditions is not addressed until the design phase. Therefore, when not discussing the data or a design, it is often convenient to suppress the jth index on x_j, η_{i,j}, Y_{i,j} and y_{i,j}. Recall that each Y_i is a random variable representing a given type of observation. Each Y_i has a specific parametric distribution, p_i(.), but this distribution's shape is conditional on the model inputs x and the unknown parameter vector θ. Therefore, for each possible observation, Y_i, the connection between input and observation is mediated by two components: 1) the conditional observation distribution, p_i(y_i|η_i), and 2) the deterministic model function η_i = f_i(x, θ). These two components are combined to create the overall model:

$$Y_i \sim p_i(y_i \,|\, \eta_i = f_i(x, \theta)). \tag{7.1}$$

The observation distributions, p_i(.), in NLOED are all parametric distributions; they currently include the normal, Poisson, binomial, log-normal, Bernoulli, exponential and gamma distributions. The deterministic model, f_i(.), maps the experimental conditions x to the sampling statistics, η_i, of the observation distribution p_i(.). The function f_i(.) must be sufficiently smooth and implementable in CasADi symbolics. This permits a wide range of model types, including models based on numerical integration. Recall that the sampling statistics are the natural parameters of the parametric distribution p_i(.). For example, with the normal distribution the sampling statistics are the mean and variance, for the Poisson distribution the sampling statistic is the λ rate parameter, and for the Bernoulli distribution the sampling statistic is the probability of a success. Therefore, in NLOED the model function f_i(.) does not always predict the mean observational response, but rather predicts the appropriate sampling statistics of the random variable, Y_i, assigned to the given observation type. Specifying models in this way allows for much more flexibility in the types of experimental observations NLOED can handle. Each y_i represents a single-dimensional realization of the random observation variable Y_i. NLOED always assumes all observation variables Y_i are independent; this allows NLOED to easily accommodate a variety of distribution types rather than only supporting (possibly correlated) normally distributed data. When creating a model in NLOED the user must specify the vector dimensions of the parameters, θ, and of the model inputs, x, as well as the list of observation variables, Y_i ∈ Y. The user must also indicate the distribution type of p_i(.) for each observation variable, Y_i, and must provide the model function, f_i(.), using CasADi's symbolic tools. The distribution, p_i(.), and the function f_i(.) for each Y_i are passed as a list, so the user can add as many observations as desired.

Figure 7.2: A figure depicting an example experimental scenario, involving a decaying cellular species A. The experiment includes both direct measurement of A and a plate count assessing A’s potential for conferring resistance to a selective agent.

Specifying models in this way provides flexibility in mixing different observation types. As an example, assume synthesis of a biochemical species A has previously been induced in a cell culture and A's intra-cellular concentration is undergoing exponential decay. This scenario is depicted in Figure 7.2. An experimenter wishes to take some replicate measurements of A at any of three possible time points, spaced evenly throughout the first three hours after decay begins. Furthermore, assume that A confers some selection resistance and the experimenter will plate the culture on selective plates at the fourth hour and perform a plate count. Assume the experimenter controls the initial induction level as input x_1. In this proposed experiment there are four observation variables: Y_1, Y_2, and Y_3 are observations of A's concentration at one, two and three hours after induction, and Y_4 is the plate count from the fourth hour. We assume that p_1(.), p_2(.), and p_3(.) are normally distributed with a known fixed variance of σ² = 1, and that p_4(.) is Poisson distributed.

We assume that the mean concentration of A can be modelled as

$$\mu(t) = \alpha x_1 e^{-\gamma t}, \tag{7.2}$$

and that the Poisson rate for the plate count can be modelled as

$$\lambda(t) = \frac{\nu}{\frac{\kappa e^{\gamma t}}{\alpha x_1} + 1}. \tag{7.3}$$

The model can be defined by listing out the model components as:

$$\begin{aligned}
&x = [x_1], \qquad \theta = [\alpha, \gamma, \nu, \kappa], \qquad Y = [Y_1, Y_2, Y_3, Y_4],\\
&p_1(.) = \text{Normal}, \quad \eta_1 = [\mu(t{=}1),\ \sigma^2], \quad f_1(x, \theta) = \begin{bmatrix} \alpha x_1 e^{-\gamma} \\ 1 \end{bmatrix},\\
&p_2(.) = \text{Normal}, \quad \eta_2 = [\mu(t{=}2),\ \sigma^2], \quad f_2(x, \theta) = \begin{bmatrix} \alpha x_1 e^{-2\gamma} \\ 1 \end{bmatrix},\\
&p_3(.) = \text{Normal}, \quad \eta_3 = [\mu(t{=}3),\ \sigma^2], \quad f_3(x, \theta) = \begin{bmatrix} \alpha x_1 e^{-3\gamma} \\ 1 \end{bmatrix},\\
&p_4(.) = \text{Poisson}, \quad \eta_4 = [\lambda(t{=}4)], \quad f_4(x, \theta) = \left[ \frac{\nu}{\frac{\kappa e^{4\gamma}}{\alpha x_1} + 1} \right].
\end{aligned} \tag{7.4}$$

Another possible model structure for this experimental scenario would involve treating the observation time for the species A assay as a second input dimension, x_2. In this scenario there would be only two observation variables: Y_1 is the abundance of species A at any time point before the fourth hour, and Y_2 is the plate count performed at the fourth hour. Using this encoding, the model structure would be

$$\begin{aligned}
&x = [x_1, x_2], \qquad \theta = [\alpha, \gamma, \nu, \kappa], \qquad Y = [Y_1, Y_2],\\
&p_1(.) = \text{Normal}, \quad \eta_1 = [\mu(t{=}x_2),\ \sigma^2], \quad f_1(x, \theta) = \begin{bmatrix} \alpha x_1 e^{-\gamma x_2} \\ 1 \end{bmatrix},\\
&p_2(.) = \text{Poisson}, \quad \eta_2 = [\lambda(t{=}4)], \quad f_2(x, \theta) = \left[ \frac{\nu}{\frac{\kappa e^{4\gamma}}{\alpha x_1} + 1} \right].
\end{aligned} \tag{7.5}$$

This encoding provides greater flexibility in the sampling schedule of A, which may be desirable or problematic depending on the experimental protocol. Choosing the appropriate encoding for a given experimental scenario depends on practical aspects of the protocol and numerical considerations for the optimization. Information and examples provided throughout this chapter will be useful for guiding this decision. Regardless of how the model is encoded, specification of the deterministic model components and the observation distributions fully defines an NLOED model. Receiving this information, the Model class constructor auto-generates a large variety of useful code, including maximum likelihood fitting functions, parametric sensitivity functions, and model prediction and sampling functions, as described below.
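To make the second encoding concrete, the fragment below sketches how f_1 and f_2 from equation 7.5 might be expressed in CasADi symbolics (a sketch only, assuming the parameter ordering θ = [α, γ, ν, κ] from the definition above; full Model construction is described in Section 7.3):

import casadi as cs

#inputs: x[0] is the induction level x1, x[1] is the sampling time x2
x = cs.SX.sym('x', 2)
#parameters: theta = [alpha, gamma, nu, kappa]
theta = cs.SX.sym('theta', 4)

#observation 1: normal, mean alpha*x1*exp(-gamma*x2), fixed variance of 1
mean_y1 = theta[0]*x[0]*cs.exp(-theta[1]*x[1])
func_y1 = cs.Function('y1', [x, theta], [cs.vertcat(mean_y1, 1)])

#observation 2: Poisson plate count at the fixed fourth hour
rate_y2 = theta[2]/(theta[3]*cs.exp(4*theta[1])/(theta[0]*x[0]) + 1)
func_y2 = cs.Function('y2', [x, theta], [rate_y2])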

7.2.2 Design Definition

Designs in NLOED generally follow the design definition outlined in Chapter 2, which is summarized here for convenience. Designs in NLOED must specify two main pieces of information: 1) the set of input conditions (support points), x_j ∈ X, used in the experiment, and 2) the number of replicate observations, β_{i,j} ∈ B, taken of each observation variable, Y_i, in each condition x_j. These two properties are common to most design formalisms [15, 16]. NLOED uses two different types of designs in its workflow: relaxed designs and exact designs (see Chapter 2 for further discussion). In exact designs, each β_{i,j} is an integer and the overall sample size for the experiment, N_{Tot}, is the sum of the replicates over each observation variable and each condition: $N_{Tot} = \sum_{j=1}^{M} \sum_{i=1}^{N} \beta_{i,j}$. Relaxed designs instead use real-valued weights, ξ_{i,j}, between 0 and 1 to represent the replicate allocations to each observation and each condition. The sum of the real-valued weights is 1, and therefore relaxed designs do not have a sample size. Instead the weights represent the approximate fraction of an arbitrary sample size that should be allocated to each input and observation condition, so that N_{Tot} ξ_{i,j} ≈ β_{i,j}. This relation only holds approximately because the weights, ξ_{i,j}, may not share the desired sample size as a common denominator, or may even be irrational. NLOED uses both design types in its workflow because it generates designs using an optimize-then-discretize approach. Initially an optimal relaxed design is solved for, after which the relaxed design is rounded to a discrete exact design with a desired sample size. Optimizing the exact design directly is difficult because, with β_{i,j} restricted to discrete integers, the resulting optimization problem is a nonlinear integer programming problem, which is difficult to solve efficiently [15, 16]. The term relaxed comes from the relaxation of the integer constraint. Relaxed designs can be viewed as a mathematical idealization of an optimal experiment with infinite data, which means they can be difficult to implement with small finite sample sizes [15, 16]. While an approximation, the optimize-then-discretize approach naturally splits the workflow into two phases, providing greater flexibility for the user. Specifically, the user's ultimate goal may be to choose their sample size to achieve certain accuracy objectives as efficiently as possible; they wish to take as few samples as they can but as many as they must. An optimal design will help to improve estimation accuracy; however, the overall sample size is a more important factor. The accuracy of parameter estimates always improves monotonically with increasing sample size, and the structure of the design only determines how much marginal improvement each additional measurement contributes. When optimizing an exact design directly, the user must specify the sample size before they optimize, which means they do not know the structure or the pre-factor utility of the optimal design when the overall size of the experiment is chosen.

Therefore, in direct optimization of the exact design, choosing a minimally sufficient sample size would require multiple runs of a difficult integer programming problem. By optimizing the relaxed design first, the user will achieve an approximately optimal design structure. They can then use efficient rounding procedures to investigate multiple sample sizes and choose the lowest possible experimental burden that achieves their desired accuracy level.

Table 7.1: An example of the structure used to represent an experimental design in the NLOED package. Here both the continuous sampling weights and the discrete sample numbers are shown. Relaxed designs use the continuous weights, whereas exact designs use the discrete numbers. Here the exact design has a sample size of N = 20.

Input      Observation   Sample     Sample
Vector     Variable      Weight     Number
x_j        Y_i           ξ_{i,j}    β_{i,j}
x_1 = 0    Y_1           0.1        2
x_2 = 10   Y_1           0.2        4
x_2 = 10   Y_3           0.1        2
x_2 = 10   Y_4           0.3        6
x_3 = 5    Y_4           0.3        6

When using the NLOED package, relaxed designs are generally hidden from the user within a Design object; however, they can be printed as a dataframe if desired. Exact designs are created from existing Design objects using a specific rounding function. The rounding function returns a dataframe containing an implementable exact design with the desired sample size. When representing design information in a dataframe, NLOED uses a three-column data format. This format consists of 1) a list of input settings, x_j, 2) a list of observation variables, Y_i, indicating which output is measured at each setting, and 3) a list of replicate allocations, either β_{i,j} or ξ_{i,j}. In NLOED's design format, the input list can contain duplicates of support points, x_j, if multiple observation variables are measured at the same point. However, each input-observation pair is unique and corresponds to a single replicate allocation. Table 7.1 shows how NLOED designs are specified for the example model discussed in the previous section, in its initial discrete-time formulation. The first column contains a listing of input conditions at which observations are to be taken. The overall design here only has three unique support points, x_1, x_2, and x_3; however, five input points are shown because x_2 has been repeated three times. The second column specifies which observation variables are to be observed at each of the input points to their left. In the third and fourth columns, two different types of replicate allocations are shown. The third column contains continuous sampling weights, ξ_{i,j}, which sum to one and indicate the fraction of the total sample size to be taken at the input-observation pair specified to the left. The fourth column lists the replicate allocation as integer counts, β_{i,j}, which in this case sum to a total sample size of 20. In any given design only one of the third and fourth columns is needed; which type depends on whether the design is a relaxed design or an exact design, as previously discussed.
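The rounding step itself can be understood with a simple stand-alone sketch. NLOED provides its own rounding function on the Design class; the basic largest-remainder idea behind such a conversion can be illustrated as follows (this is an illustrative reimplementation, not NLOED's internal code):

import numpy as np

def round_weights(weights, n_total):
    """Convert relaxed weights xi into integer replicate counts beta summing to n_total."""
    weights = np.asarray(weights, dtype=float)
    raw = weights * n_total           #ideal, generally non-integer allocations
    beta = np.floor(raw).astype(int)  #start from the floor of each allocation
    #hand out any remaining samples to the largest fractional remainders
    remainder = raw - beta
    for idx in np.argsort(-remainder)[: n_total - beta.sum()]:
        beta[idx] += 1
    return beta

#the weights from Table 7.1 rounded to a sample size of 20
print(round_weights([0.1, 0.2, 0.1, 0.3, 0.3], 20))  #-> [2 4 2 6 6]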

7.3 The Model Class

The Model class encapsulates all of the information about the model structure and error distribution needed within the NLOED package. The class is designed to be a minimal but self-sufficient modelling environment, providing functions for generating predictions, fitting parameters, simulating data, performing diagnostics like confidence region plots, and doing sensitivity analysis. Figure 7.3 gives a general overview of the process for creating and interacting with a Model class instance. The figure is divided into three main sections; green shading marks the parts of the process the user controls, while blue marks the parts that are automated. At the top of the figure, the green section labeled User-Provided Information illustrates the data a user needs to prepare in order to create an NLOED model.

Figure 7.3: A diagram depicting the user-provided arguments, internal function attributes and user-callable functions of the NLOED Model class.

The required information includes the deterministic parts of the model, f_i(x, θ), encoded in CasADi functions. It also includes the assumed observation distributions, p_i(.), for each observation variable, as well as names for the model inputs and parameters. The user provides this information to the Model class constructor, which creates the class instance. Instantiation is normally done in a single line of code and occurs almost instantaneously; however, it hides several automated processes that generate internal function attributes inside the class instance. These automatic processes are labelled as Internally-Generated Function Attributes in Figure 7.3 and are shown in the blue region in the middle of the figure. During instantiation, the Model class uses the CasADi functions, f_i(x, θ), passed by the user to auto-generate a variety of CasADi function attributes. These function attributes include atomic functions to compute the mean and variance of observation variables, the log-likelihood, the parametric sensitivities, the FIM and simulated data. These expressions are first generated symbolically and then encapsulated in callable CasADi functions stored within the Model class instance. The user will rarely interact with them directly; however, they are used internally in many scenarios. These scenarios include when the Model instance is passed to the Design class for optimization, or when the user invokes user-callable functions (discussed below) that use these auto-generated function attributes within their implementation. The final section in Figure 7.3 is labelled User-Callable Functions and it highlights various high-level functions the user can call directly to perform specific tasks using the model. For example, the user can ask for predictions from the model using the predict() function, or use the fit() function to fit the model parameters to a provided dataset. These high-level user-callable functions are designed to provide a simple means to perform common tasks efficiently, while hiding the more mathematical function attributes within the class. Thus the user will generally instantiate a model at the beginning of a script or session and then use that model, and its available high-level functions, to perform various computations such as designing experiments, fitting data and predicting new behaviour. In the following three subsections we go into more detail on 1) how to create a model, 2) the exact role of the auto-generated function attributes, and 3) the usage of the high-level user-callable functions for fitting, predicting, sampling and evaluating designs.

7.3.1 Creating a Model Object

To create a Model class instance, the user first needs to encode the deterministic part of their model in CasADi symbolics. Recall, the deterministic part of the model consists of the mathematical relations, f_i(x, θ), mapping the inputs, x, and parameters, θ, to the sampling statistics, η_i, of the observation variable, Y_i. Listing 1 demonstrates this process for the simple two-input, two-output, four-parameter model given in equation 7.6:

$$\begin{aligned}
&\mu_{Y_1} = \theta_0 + \theta_1 x_1 + \theta_3 x_1 x_2, \qquad \sigma^2_{Y_1} = 0.1, \qquad Y_1 \sim \text{Normal}(\mu_{Y_1},\, \sigma^2_{Y_1}),\\
&\lambda_{Y_2} = \exp(\theta_0 + \theta_2 x_2 + \theta_3 x_1 x_2), \qquad Y_2 \sim \text{Poisson}(\lambda_{Y_2}).
\end{aligned} \tag{7.6}$$

This model consists of a simple linear regression model for Y_1 and a Poisson regression model for Y_2. To begin implementing this model in CasADi symbolics, CasADi is imported on line 1 of Listing 1 and the Model class is imported from the NLOED package on line 2. Lines 4 and 6 declare CasADi symbols for

1 import casadi as cs
2 from nloed import Model
3 #create casadi symbols for the inputs
4 x= cs.SX.sym('x',2)
5 #create casadi symbols for the parameters
6 theta= cs.SX.sym('theta',4)
7 #define y1 sampling statistics; mean and variance
8 mean_y1= theta[0]+ theta[1]*x[0]+ theta[3]*x[0]*x[1]
9 var_y1= 0.1
10 #define y2 sampling statistic; the rate
11 rate_y2= cs.exp(theta[0]+ theta[2]*x[1]+ theta[3]*x[0]*x[1])
12 #create a casadi function for y1 stats
13 eta_y1= cs.vertcat(mean_y1, var_y1)
14 func_y1= cs.Function('y1',[x,theta],[eta_y1])
15 #create a casadi function for y2 stats
16 eta_y2= rate_y2
17 func_y2= cs.Function('y2',[x,theta],[eta_y2])

Listing 1: An example encoding a mathematical model in CasADi symbolics and creating a CasADi function, prior to instantiating a Model instance.

the input vector, x, as the variable x, and the parameter vector, θ, as the variable theta. In lines 8 and 9 the mean, μ_{Y_1}, and variance, σ²_{Y_1}, of the observation variable Y_1 are defined in terms of the inputs and parameters as mean_y1 and var_y1. In line 11 the Poisson rate, λ_{Y_2}, for observation variable Y_2 is likewise defined. In line 13, the sampling statistics vector, η_1, for observation Y_1 is defined as eta_y1. In line 14 a CasADi function, func_y1, mapping inputs and parameters to the sampling statistics for Y_1, is created; this CasADi function implements f_1(x, θ). Likewise, in line 16 the sampling statistic, η_2, for observation variable Y_2 is defined as eta_y2, and in line 17 a CasADi function, func_y2, corresponding to the model function f_2(x, θ), is also created. From the perspective of NLOED, the CasADi functions implementing f_i(x, θ) are a computational black box, and the user can use a wide range of CasADi modelling methods and different building blocks to construct these functions. Regardless of how they are built up, NLOED will auto-generate any required functionality internally. While this regression model is quite straightforward, symbolic construction can become more nuanced for dynamic models, which is covered in the examples in Section 7.5. Having constructed the deterministic parts of the model as CasADi functions, the user can now instantiate a Model object. The general call structure for the Model class constructor is:

Model(observ_list, input_names, param_names, options={})

The first argument, observ_list, is a list of tuples; each tuple corresponds to an independent observation variable. The first entry in each tuple is the CasADi function for the sampling statistics (i.e. η_i = f_i(x, θ)); the second entry is the name of the distribution assigned to that observation (i.e. the type of p_i(.), see Listing 2). The list of observation tuples in observ_list can be extended to accommodate dozens of observation variables, which becomes important for dynamic models with multiple states and time points. The second and third arguments to the Model constructor, input_names and param_names, are lists of strings naming the inputs and parameters according to the order in which they are given to the CasADi functions. Input and parameter names are required in NLOED so that any returned dataframes or graphics will be labelled intelligibly. Note that the names for the observation variables, Y_i, are inherited from their corresponding CasADi function strings, assigned as the first argument when the functions are created. In Listing 1 the names y1 and y2 were assigned in the creation of the CasADi functions func_y1 and func_y2 respectively. These names will be passed into the Model constructor via the CasADi functions when they are added to the observ_list argument. Listing 2 demonstrates the creation of the required Model constructor arguments and the calling of the Model class constructor. In line 19, an observation list, observ_list, is constructed with a tuple for each observation variable, y1 and y2. The first element of each tuple is a CasADi function, func_y1 or func_y2. The second element of each tuple is the assigned distribution; here we pass 'Normal', assigning the normal distribution to Y_1, and 'Poisson', assigning the Poisson distribution to Y_2. The other distribution options include 'Lognormal', 'Bernoulli', 'Binomial', 'Exponential', and 'Gamma'. In lines 21 and 23, names are given in lists for the inputs and parameters. In line 25 the Model object is created with a call to the Model class constructor.

18 #create observation list
19 observ_list=[(func_y1,'Normal'),(func_y2,'Poisson')]
20 #create input name list
21 input_names=['x1','x2']
22 #create parameter name list
23 parameter_names=['Theta0','Theta1','Theta2','Theta3']
24 #create NLOED Model
25 model_object= Model(observ_list, input_names, parameter_names)

Listing 2: An example showing creation of a Model instance in NLOED.

7.3.2 Function Attributes of the Model Class

After instantiating the model, several automatic processes occur to create the previously mentioned function attributes. The CasADi functions the user passes in the observ_list argument implement the deterministic model components (i.e. f_i(x, θ)). Having the deterministic model components expressed as CasADi functions provides a powerful tool for auto-generating new mathematical expressions, because CasADi functions have a dual nature as both symbolic and numeric functions. If we pass numerical values to a CasADi function, it will compute a numerical output. On the other hand, if we pass CasADi symbols to a CasADi function, the returned object is a symbolic expression. This means we can use CasADi functions to generate new symbolic expressions. The expressions can then be algebraically combined, as well as differentiated (via AD), to compute other new quantities symbolically. Any newly generated symbolic expressions can also be encapsulated in a CasADi function so that they can be used to generate numerical results when required. CasADi functions therefore allow an interleaving of symbolic expression building and numeric function generation, which is ideal for constructing the mathematical infrastructure needed by the Model class. During instantiation of a Model object, this process is used to auto-generate and store mathematical function attributes used for experimental design, fitting and other higher-level tasks. These function attributes are available to the user but are not intended to be called directly. However, they may be of use if the user wishes to perform certain model analyses that are not available as a higher-level user-callable function. Below we give a brief overview of the auto-generated function attributes, to give context for the package architecture and its internal capabilities. Note, there is a function attribute for each observation variable, Y_i, and the class fields where they are stored are therefore lists, indexed by the observation variable order. That is to say, each field discussed below is a list of functions indexed by [i], one function for each observation variable.
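This dual numeric/symbolic behaviour is easy to demonstrate with the functions from Listing 1. The short sketch below (assuming func_y1, x and theta from Listing 1 are in scope) evaluates the same CasADi function numerically, and then uses it to build a new derivative function via AD:

#numeric mode: passing numbers returns a numeric result (a casadi DM)
print(func_y1([1.0, 2.0], [0.1, 2.0, 0.4, 1.3]))

#symbolic mode: passing symbols returns a symbolic expression
eta_sym = func_y1(x, theta)
#differentiate the symbolic expression with respect to the parameters (AD)
sens_sym = cs.jacobian(eta_sym, theta)
#encapsulate the new expression in a callable numeric function
sens_func = cs.Function('sens_y1', [x, theta], [sens_sym])
print(sens_func([1.0, 2.0], [0.1, 2.0, 0.4, 1.3]))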

• Sampling statistics The CasADi functions passed by the user implement η_i = f_i(x, θ) from the model definition. The original CasADi functions are therefore useful for predicting the sampling statistics of each observation variable, conditional on the specified input and parameter values. The user-provided CasADi functions are stored as function attributes, one for each observation variable, in the model[i](inputs,parameters) field, for use in other function attributes and user-callable functions.

• Observation Mean and Variance Given the conditional sampling statistics, η_i, of each observation variable, it is also possible to compute the observation Y_i's expected mean, E[Y_i], and variance, Var[Y_i], algebraically from η_i at any given input and parameter value. (Note, the mean and variance are not the same as the sampling statistics for some non-normal distributions.) Function attributes to compute the observation mean and variance for each observation variable, given an input and parameter vector, are auto-generated and stored in the model_mean[i](inputs,parameters) and model_variance[i](inputs,parameters) fields.

• Mean Sensitivity Sensitivity functions for the mean observational responses, E[Y_i], of each observation variable with respect to the parameters are also auto-generated. To do so, we make use of CasADi's AD functionality via the jacobian function. The resulting function attributes return a parametric sensitivity vector; the function attributes are stored in the model_sensitivity[i](inputs,parameters) field.

• Observation Sampling Observations generated by the model are assumed to come from the conditional distribution p_i(y_i|f_i(x, θ)). Sampling from this distribution requires the user-passed CasADi function (i.e. f_i(x, θ)) and the distributional information (i.e. the type of p_i(.)). Given this information, the package can combine the correct distribution-specific random number generation from SciPy or Numpy with the model[i](inputs,parameters) function attributes to create a data sampling function attribute for each observation variable. These function attributes are stored in the observation_sampler[i](inputs,parameters) field. Each of these function attributes generates a single realization, y_i, of the given random observation variable, Y_i, conditioned on the input and parameter values passed.

• Log-likelihood The log-likelihood is needed for performing maximum likelihood fitting and is also useful in calibration diagnostics like profile likelihood intervals and traces (see the fit() function later in this section). The log-likelihood for an individual observation is defined as in Chapter 2, with the indices indicated there, such that

$$l_i(y_i | x, \theta) = \log\left[ p_i(y_i \,|\, \eta_i = f_i(x, \theta)) \right]. \tag{7.7}$$

Using the type of distribution for p_i(.) and the user-passed CasADi function for f_i(x, θ), the NLOED package will auto-generate a function for computing the log-probability of observing a specific observation value, y_i, given specified input and parameter values. The log-likelihood function attribute is stored in the class field named loglik[i](observation,inputs,parameters), with one function for each observation variable.

• Fisher Information Matrix The Fisher information matrix is a key entity used in experimental design. In evaluating a new design, each potential input-observation pair contributes an individual Fisher information matrix to the overall sum for the experiment; see Chapter 2 for a full description. The Fisher information matrix for an individual input-observation pair is defined as

$$I_i(x, \theta) = E_{y_i}\left[ \nabla_\theta l_i(\theta; y_i, x) \cdot \nabla_\theta l_i(\theta; y_i, x)^T \right]. \tag{7.8}$$

The individual matrices, I_i(x, θ), are additive over input-observation pairs due to NLOED's assumption that all observations are independent. However, NLOED uses the chain rule decomposition, discussed in Chapter 2, to separate the FIM computation into a sensitivity vector and a distribution-specific elemental matrix, such that

$$I_i(x, \theta) = \nabla_\theta f_i(x, \theta)^T \; \Psi(\eta_i = f_i(x, \theta)) \; \nabla_\theta f_i(x, \theta). \tag{7.9}$$

Here ∇_θ f_i(x, θ) is the parametric sensitivity of the sampling statistics η_i. The sensitivity vector can be computed using CasADi's automatic differentiation functionality applied to the user-passed model functions, f_i(x, θ). The elemental matrix, Ψ(η_i = f_i(x, θ)), is specific to each distribution [16] and can be computed algebraically from the sampling statistics, η_i. Using this property, the package is able to auto-generate a function that computes the individual Fisher information for a given input-observation pair at a candidate parameter vector. The FIM functions are stored in the fisher_info_matrix[i](inputs,parameters) field. These functions, like the other function attributes, are CasADi functions and are therefore capable of both numeric and symbolic computation. The Model class uses this dual functionality both to compute evaluation metrics for candidate designs numerically and to construct symbolic expressions for the optimization problem solved in the Design class.
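As an illustration of how these FIM attributes combine during design evaluation, the sketch below accumulates the total information for a small relaxed design as a weighted sum of individual input-observation FIMs (a sketch only; the attribute access shown, e.g. model_object.fisher_info_matrix[i], assumes the field naming described above, and the actual Design class performs this assembly internally):

import numpy as np

#a toy relaxed design: (observation index, input values, weight xi)
relaxed_design = [(0, [0.0, 1.0], 0.4), (1, [2.0, -1.0], 0.6)]
param = [0.1, 2.0, 0.4, 1.3]

#the total FIM is the weighted sum of individual FIMs (observations independent)
fim_total = np.zeros((4, 4))
for obs_idx, inputs, weight in relaxed_design:
    fim_i = model_object.fisher_info_matrix[obs_idx](inputs, param)
    fim_total += weight * np.array(fim_i)

#a D-optimal-style scalar objective for the design
print(np.log(np.linalg.det(fim_total)))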

7.3.3 User-callable Model Functions

After instantiating a Model instance, the user will often follow the basic workflow outlined in Figure 7.1 and pass the model object into the Design class constructor to begin optimizing designs for the given system. However, the Model class also provides a number of high-level user-callable functions that offer additional model building, calibration and diagnostic tools. Descriptions and usage examples for these functions are outlined below.

The evaluate() function The NLOED package exists to optimize experimental designs for various models and objectives. In many workflows, it is reasonable to assume the user may wish to evaluate the performance of many designs for a given model using some common quantitative metrics. While the Design class can return the objective for its optimal (relaxed) design, this objective is generally scaled for numerical convenience and is a single number (i.e. the determinant) computed from the total Fisher information matrix. Thus the objective alone may not be interpretable or useful for intuitive comparison. In addition, the user may often wish to evaluate non-optimized designs implemented in existing datasets, or prospective designs created using domain-specific knowledge rather than optimization. Motivated by these considerations, the evaluate() function accepts any candidate design (optimized or user-created) and can return interpretable metrics for evaluating and comparing design performance. The evaluate() function specifically focuses on comparing designs on their prospective parameter calibration accuracy. NLOED generally assumes that all models are identifiable. Under this assumption, the evaluate() function gives approximate metrics for the estimate's distribution, given a design D. Metrics provided by evaluate() include the estimate's MSE, covariance and bias, as well as the FIM (as discussed in Chapter 2). Unfortunately, it is impossible to compute the first three of these quantities exactly. However, these metrics can be approximated, either asymptotically or through simulation, at a candidate point for the true parameter vector (i.e. a guess or estimate). The evaluate() function provides both asymptotic and Monte Carlo methods for computing these metrics for a given design. (Note, these approximations are always local and thus conditional on a nominal parameter estimate, as the true value is never actually known.) First-order asymptotic approximations for the covariance can be computed as the inverse of the Fisher information matrix [16]. As the asymptotic bias is zero at first order, the diagonal of the inverse of the FIM is an asymptotic approximation for the MSE as well (see discussion in Chapter 2 for details) [57]. The Design class performs all optimization using objectives based on the FIM, thus the asymptotic metrics generated by evaluate() are computed with similar assumptions to those used in the optimization. These methods are rapid and computationally efficient, especially with CasADi's AD tools; however, they compromise on accuracy at smaller sample sizes. Without using higher-order methods, which are significantly more complicated [57], approximating the bias generally requires simulation-based approximation via Monte Carlo methods. The evaluate() function therefore also implements a parametric Monte Carlo algorithm for approximating the MSE, covariance and bias terms. The Monte Carlo method may yield greater precision than the asymptotic approach, especially at smaller sample sizes or with highly nonlinear models. In this method, a 'true' nominal parameter vector is used to simulate a large number of datasets corresponding to the candidate design. These datasets are then each fit independently using maximum likelihood. The resulting set of estimated parameter vectors is then used to compute the MSE, covariance and bias empirically, using the nominal parameter vector as a stand-in for the unknown true values. This then provides a local approximation of the parameter accuracy metrics for the given design, conditional on the assumed true vector values. The Monte Carlo approximation is time consuming, and the sample number used may need to be tuned to the design and model being evaluated in order to achieve stable approximations, as the algorithm is necessarily non-deterministic. These considerations make the Monte Carlo approach unsuitable for use in the Design class optimization; however, Monte Carlo metrics can provide useful benchmarks to assess the suitability of asymptotically derived designs at small sample sizes. The Monte Carlo metrics are also valuable for comparing amongst several design candidates, allowing the user to accurately differentiate between subtle trade-offs in bias and variance components of parameter error. All design evaluation methods implemented in evaluate() are data-free, meaning they can be used to evaluate a design regardless of whether real data has been collected. This is because all of the metrics discussed here are computed as expectations, taken with respect to the data. Other comparison and diagnostic metrics, such as profile likelihood-based methods, require real experimental observations and thus are not suitable for comparing existing datasets with prospective designs. Some data-dependent diagnostic tools, like profile likelihoods, are implemented in the fit() function, described later in this section. In order to call the evaluate() function, the user needs to provide a design. Designs in NLOED are contained in dataframes which all follow the same format, with specific naming conventions; an example is shown in Figure 7.4. The first pair of columns each contain model input values and are named according to the input names passed by the user when the model object was instantiated. Here there are two inputs, named x1 and x2, from the model shown in equation 7.6. To the right of the input columns is a column named Variable, which contains the names of the observation variables to be observed at the input setting listed to the left. Here the observation variables are those from the model shown in equation 7.6, named y1 and y2.

Figure 7.4: An example of a dataframe containing an experimental design.

The rightmost column is named Replicates and contains the number of observations to be taken at the given input and observation settings listed to the left.

1 import pandas as pd
2 #define a design
3 design= pd.DataFrame({'x1':[0,-1,2,3]*2,
4 'x2':[1,1,-1,0]*2,
5 'Variable':['y1']*4+['y2']*4,
6 'Replicates':[3,1,2,2]*2})
7 #set nominal parameter values
8 param=[0.1, 2, 0.4, 1.3]
9 #declare specific options
10 eval_opts={'Method':'MonteCarlo',
11 'FIM':True,
12 'Covariance':True,
13 'Bias':True,
14 'MSE':True,
15 'SampleNumber':100}
16 #call the evaluate() function
17 eval_info= model_object.evaluate(design, param, eval_opts)
18 #print the resulting evaluation
19 print(eval_info)

Listing 3: Code example using the evaluate() function from the Model class.

The general call structure for the evaluate() function is:

model_object.evaluate(designs, param, options={})

Here the designs argument is a dataframe containing the candidate design to be evaluated. The param argument contains the nominal parameter vector at which the analysis is to take place. The options argument is optional and can be used to pass a dictionary of key-value pairs to alter the default evaluate() behaviour. An example of a call to the evaluate() function is shown in Listing 3. Lines 3-6 show the creation of a design dataframe, and line 8 defines the nominal parameter values. In lines 10-15 various options are set in an options dictionary. The Method option can take two values, Asymptotic or MonteCarlo, depending on which method is to be used. The evaluate() function by default computes metrics asymptotically, and the Fisher information matrix is always computed asymptotically. The options FIM, Covariance, Bias, and MSE accept booleans depending on whether or not the user wants the specific metric included in the returned dataframe. By default only the covariance matrix is returned. The SampleNumber option accepts an integer value for the number of samples used to generate the Monte Carlo estimates. In line 17 the evaluate() function is called with the previously declared arguments, and in line 19 the returned dataframe is printed.

Figure 7.5: An example of the dataframe returned by the evaluate() function, containing parameter accuracy metrics for the given design.

An example of the multi-index dataframe returned by evaluate() is shown in Figure 7.5. Dataframe columns are grouped by the upper levels FIM, Covariance, Bias, and MSE. Rows are named according to the parameter names passed by the user when the given model was instantiated. The FIM upper level contains columns named for each parameter, mirroring the rows. Each entry in these columns gives the Fisher information matrix values for the model in the given experiment. The Covariance upper level likewise contains columns for each parameter and gives the expected covariance matrix entries. The Bias upper level contains a single column with values, in each row, for the bias of each parameter. The MSE level likewise contains a single column, with the expected MSE for each parameter with respect to the nominal true value in the given design. The evaluate() function does not return Wald confidence intervals for the parameters. (Wald intervals are asymptotic intervals generally computed using the FIM; see [50].)

However, these are easily generated from the square root of the diagonal of the covariance matrix, which yields the standard deviation for each parameter. Other similar metrics, such as the generalized variance and the alphabetic optimality criteria, can also be easily computed from the returned metrics (see Chapter 2 for some discussion). The evaluate() function can also be run in a batch mode, where multiple designs are passed as a list to the designs argument. In this case the returned object is a list of dataframes, each structured like the previous example. The batch mode is useful for evaluating many design variants at once and automating the comparison process.
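For example, approximate 95% Wald intervals can be assembled from the returned covariance block with a few lines of Numpy (a sketch, assuming eval_info is the dataframe returned in Listing 3 with the Covariance option enabled, and param holds the nominal parameter values):

import numpy as np

#extract the covariance block and take the root of its diagonal
cov = eval_info['Covariance'].to_numpy()
std_dev = np.sqrt(np.diag(cov))

#Wald interval half-widths at the 95% level, centered on the nominal values
half_width = 1.96 * std_dev
for name, value, hw in zip(['Theta0','Theta1','Theta2','Theta3'], param, half_width):
    print(name, value - hw, value + hw)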

The sample() function The sample() function is provided in order to generate simulated data from the model. This is useful for generating more detailed diagnostics about specific aspects of a design via simulation studies. Simulation studies involve simulating data for a given experimental design and performing batch fitting and evaluation on the resulting set of fits, to study expected statistical properties of the fitting process. To this end, the sample() function can generate artificial datasets by sampling observations y_i from the observation distribution p_i(y_i|f_i(x, θ)) for each replicate specified in a given design, at a given nominal parameter vector.

1 #define a design
2 design= pd.DataFrame({'x1':[0,-1,2,3]*2,
3 'x2':[1,1,-1,0]*2,
4 'Variable':['y1']*4+['y2']*4,
5 'Replicates':[3,1,2,2]*2})
6 #set nominal parameter values
7 param=[0.1, 2, 0.4, 1.3]
8 #call the sample() function
9 dataset= model_object.sample(design, param, design_replicates=1)
10 #print the resulting dataset
11 print(dataset)

Listing 4: Example using the sample() function from the Model class.

In order to call the sample() function, the user must provide a design in a dataframe, using the previously detailed naming convention (see Figure 7.4). The general call structure for the sample() function is:

model_object.sample(designs, param, design_replicates=1, options={})

The designs argument accepts a dataframe with the format depicted in Figure 7.4. The param argument contains the nominal parameter values at which the samples are to be taken. The design_replicates argument is optional and specifies the number of replicate datasets to generate from the design. The default is a single dataset; however, when doing simulation studies it is useful to be able to generate batches of datasets for a given design and nominal parameter value. The options argument is also optional and accepts a dictionary of key-value pairs that can modify the default behaviour of the sample() function. An example call to sample() is shown in Listing 4. In lines 2-5 an example design dataframe is created, and in line 7 nominal parameter values are listed. In line 9 the sample() function is called and a simulated dataset is created as a dataframe; in line 11 the returned dataframe is printed. The sample() function returns a dataframe containing the simulated dataset. All datasets in NLOED follow the same structure and naming convention; an example dataset is shown in Figure 7.6. The first pair of columns correspond to the model input settings and are named according to the string names passed during model instantiation; here they are x1 and x2, as per the model given in equation 7.6. Following the input columns there is the Variable column, which contains the name of the observation variable, here y1 and y2, as per the model in equation 7.6. Finally, there is the Observation column, which contains the numeric values of the observed samples.

Figure 7.6: An example dataset generated from the sample() function.

The sample() function can also be run in a batch mode for generating datasets from multiple designs at the same time. To run a batch of designs, each design should be added to a list, which is then passed in as the designs argument. This can be coupled with the design_replicates argument to generate a specified number of replicates of each design.

In batch mode the returned object is a list of dataframes, one for each dataset. When replicates are also included, the returned object is a list of lists, where the first dimension indexes the design and the second indexes the replicate number.
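The nesting of the batch output can be navigated as in the following sketch (assuming design1 and design2 are existing design dataframes like the one in Listing 4, and param is a nominal parameter list):

#generate 10 replicate datasets for each of two designs
batch = model_object.sample([design1, design2], param, design_replicates=10)

#the first index selects the design, the second selects the replicate
first_rep_design2 = batch[1][0]
#e.g. loop over all replicates of the first design
for rep, data in enumerate(batch[0]):
    print('design 1, replicate', rep, 'has', len(data), 'observations')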

The fit() function Model fitting is an important part of the workflow when generating model-based optimal experimental designs. Fitting is needed for estimating parameters from real data; data that can be preliminary or the result of optimal experiments. It is also valuable for performing simulation studies before experiments are run, to give users an idea of the true expected utility of their experimental plans. All fitting functionality in the package is provided via the fit() function. The fit() function is also provided as a matter of convenience, as the type of fitting algorithm to be used for a given optimized design is not arbitrary. The Design class performs experimental optimization using various objectives based on the Fisher information matrix. The primary justification for using the FIM is that it provides an asymptotic estimate of the parameter uncertainty that can be achieved after fitting with maximum likelihood. The FIM and other information matrices are asymptotic approximations specific to the model, the guessed parameter values and the given fitting method [16]. This means that the optimized experiment should be paired with its appropriate fitting method for the best results. Using fit() for fitting the user's model ensures the appropriate maximum likelihood fitting algorithm is used in all cases. In order to fit data using the fit() function, the user needs to provide a dataset as an argument. Datasets passed to the fit() function follow the same format as other datasets in NLOED; see Figure 7.6 for an example. The general call structure for the fit() function is shown below.

model_object.fit(datasets, start_param=None, options={})

The datasets argument accepts a dataframe, structured as in Figure 7.6. The start_param argument is optional, but can be used to initialize the optimization at a specific initial parameter point. The options argument is a dictionary with various key-value pairs that can be used to override specific default behaviours of the fitting algorithm, some of which are outlined below. The fit() function solves for the maximum likelihood estimate with a call to the nonlinear programming package IPOPT via the CasADi interface. In order to set up the log-likelihood optimization for the IPOPT call, fit() uses the log-likelihood function attributes that are auto-generated during the Model object's instantiation. The overall likelihood objective is constructed by iterating through the rows of the passed dataset, applying the appropriate log-likelihood function attribute to each input-observation pair, and summing the result. This produces a CasADi symbol for the overall fitting objective (see Chapter 2 for further discussion of maximum likelihood),

$$l_{Tot}(\theta; y_D, D) = \sum_{j=1}^{M} \sum_{i=1}^{N} \sum_{k=1}^{\beta_{i,j}} l_i(y_{i,j}^{(k)}; x_j, \theta). \tag{7.10}$$

As the objective is a symbolic expression, when it is passed to IPOPT via the CasADi interface, any required derivative information for the interior-point algorithm is automatically generated. This ensures the maximum likelihood fitting implemented in fit() converges rapidly, in a few iterations, for most models.
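The construction of this objective can be pictured with the following sketch, which mirrors equation 7.10 by iterating over a dataset's rows and summing the per-observation log-likelihood attributes (illustrative only; the attribute access assumes the loglik field naming from Section 7.3.2, dataset is assumed to be a dataframe like the one in Listing 4, and fit() performs this construction internally):

import casadi as cs

#a fresh symbol for the parameters being estimated
theta_fit = cs.SX.sym('theta_fit', 4)
observ_names = ['y1', 'y2']  #order matches the observ_list from Listing 2

#sum the symbolic log-likelihood over every row (one observation) of the dataset
loglik_total = 0
for _, row in dataset.iterrows():
    i = observ_names.index(row['Variable'])
    inputs = [row['x1'], row['x2']]
    loglik_total += model_object.loglik[i](row['Observation'], inputs, theta_fit)

#wrap the (negated) objective for use with a minimizer such as IPOPT
objective = cs.Function('neg_loglik', [theta_fit], [-loglik_total])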

1 # set specific fitting options 2 fit_opts={'Confidence':'Intervals', 3 'InitParamBounds':[(-1,1),(-1,1),(-1,1),(-1,1)], 4 'InitSearchNumber':7} 5 #call the fit() function 6 fit_info= model_object.fit(dataset, options=fit_opts) 7 #print the fitting information 8 print(fit_info)

Listing 5: Example using the fit() function from the Model class.

A generic call to the fit() function is shown in Listing 5; this code assumes dataset contains an existing dataset like the one shown in Figure 7.6. In lines 2-4, several options for the fit() function are modified. The Confidence option has been changed from its default value of None to Intervals, which signals to the fitting algorithm that it should generate profile likelihood-based confidence intervals for all fitted parameters after fitting [241]. The InitParamBounds option has been set with a list of tuples specifying a range for each parameter over which a coarse fitting pre-search is performed. The InitSearchNumber option specifies the number of evaluations to perform in the InitParamBounds ranges; here we have set it to seven, but its default value is three. The pre-search procedure is performed to ensure the starting parameter vector is reasonable, and the pre-search is a good recourse if a suitable start_param cannot be specified a priori. If a start_param value is specified and the pre-search option InitParamBounds is also passed, the start_param is appended to the pre-search evaluation list, but it may not actually be used as the start value for IPOPT's maximum likelihood optimization, as a more suitable starting value may be found during the pre-search. Note that InitParamBounds are bounds only for the pre-search, not for IPOPT's optimization, and that fit() does not allow bounded or constrained maximum likelihood fitting. The user is expected to use parameter transformations to ensure models are specified properly (examples of this are given in Section 7.5). As maximum likelihood estimates are invariant under parameter transformation, any required re-parameterization should not affect the resulting fits [49]. In line 6 the fit() function is called for the provided dataset, and in line 8 the fitting information is printed to the console. An example of the dataframe returned by the fit() function is shown in Figure 7.7. The returned value is a multi-index dataframe, which is organized by the upper level labels Estimate, Lower, and Upper. The Estimate level contains columns for each parameter, named according to the names passed when the model was instantiated. These columns contain the maximum likelihood estimates for the model parameters. The Lower and Upper levels also contain columns for each parameter, with the corresponding lower and upper bounds for the requested likelihood-based intervals. These columns are only returned if the Confidence option is set to something other than None, such as Intervals in this example. By default a 95% confidence interval is returned.

Figure 7.7: An example of the dataframe output from a call to the fit() function.

The intervals returned via a call to the fit() function are computed using the profile likelihood with the observed data [241, 49]. Profile likelihood confidence intervals are based on asymptotic properties of the log-likelihood ratio, $-2\log[L_{Tot}(\theta_o)/L_{Tot}(\hat{\theta})]$ [49]. Using properties of the likelihood ratio, it can be shown that

$$-2[l_{Tot}(\theta_o) - l_{Tot}(\hat{\theta})] \leq \chi^2_{p,(1-\alpha)}. \tag{7.11}$$

Here the expression on the left of the inequality is equivalent to the log-likelihood ratio. The $l_{Tot}(.)$ function is the overall log-likelihood for the dataset, $\hat{\theta}$ is the MLE parameter estimate, and $\theta_o$ is the unknown true parameter value. The $\chi^2_{p,(1-\alpha)}$ term is the $1-\alpha$ percentile of a Chi-square distribution with p degrees of freedom, where p is the number of dimensions of θ. Effectively, the above inequality states that the log-likelihood ratio between the unknown true parameter vector and the MLE estimate should be less than $\chi^2_{p,(1-\alpha)}$ with probability $1-\alpha$ [50, 242]. To construct intervals for each parameter dimension using the above result, each dimension is profiled. Profiling each dimension allows the confidence intervals for a single dimension to properly account for uncertainty in the other parameters (see [241, 49] for further discussion). Profiling a given dimension, θ_i, involves incrementing its value away from the MLE value, first in an increasing direction and then in a decreasing direction, although the order is arbitrary. At each increment of θ_i the other marginal parameter dimensions are re-optimized, to yield a new conditional MLE for the marginal parameters given the fixed incremented value of θ_i. The resulting parameter vector, including the current value of θ_i and the conditionally optimized marginal parameters, is notated $\bar{\theta}(\theta_i)$. As the value, θ_i, of the profiled dimension is adjusted away from the MLE, the conditional estimate $\bar{\theta}(\theta_i)$ traces out a curve in parameter space; this is known as the profile trace [242]. The likelihood ratio statistic,

$$-2[l_{Tot}(\bar{\theta}(\theta_i)) - l_{Tot}(\hat{\theta})], \tag{7.12}$$

between the MLE, $\hat{\theta}$, and the conditional vector, $\bar{\theta}(\theta_i)$, will grow until it reaches the Chi-square threshold, $\chi^2_{p,(1-\alpha)}$. The values of θ_i, in the positive and negative directions, at which the Chi-squared inequality becomes an equality are considered the bounds of the $1-\alpha$ confidence interval for dimension θ_i. The value of the log-likelihood, $l_{Tot}(\bar{\theta}(\theta_i))$, for each value of θ_i is known as the likelihood profile [105]; its maximum occurs when θ_i is at its MLE value. The profiling procedure can be performed for each parameter dimension in the parameter vector, yielding an interval, trace and profile for every dimension. When the Intervals value is passed as the Confidence option, the fit() function currently uses a bisection search to find the profile likelihood interval boundary points, rather than completing a full incremental profile. Beyond intervals, the fit() function can also generate other useful diagnostic information. The Confidence field can also be set to the value Profiles, to generate graphical plots of the likelihood profiles and 2D projections of the profile traces for the given dataset. Examples of the profiles and trace projections are shown in Figure 7.8. The plots on the diagonal show each parameter's profile likelihood plot, with the log-likelihood values for the profile shown on the y-axis and the value of the profiled parameter, θ_i, on the x-axis. The MLE value of the profiled parameter occurs at the profile maximum. The red dashed line indicates the log-likelihood value at which the Chi-squared threshold is reached (the default is for a 95% interval). The intercepts between the likelihood profile and the threshold line denote the confidence interval end points. In the lower triangular plots, a plane for each pair of the parameters is shown, in which 2D projections of each parameter's profile trace are plotted as lines; the blue line belongs to the parameter in the given column (profile plot above) and the orange line belongs to the parameter in the given row (profile plot to the right). The trace projections are terminated at the confidence interval boundaries, and their intersection marks the MLE estimate. Returning profiles and trace projections is useful because the shape of the log-likelihood profile, when plotted with respect to the corresponding parameter as in Figure 7.8, should be asymptotically quadratic; significant deviations from this trend can indicate

117 Figure 7.8: Example of the fit() functions graphical output of likelihood profiles and 2D trace projections generated when the Confidence option is set to Profiles. poorly identified parameters [49]. Specifically, a blunt peak to the profile can indicate that the current data is insufficient to constraint the parameter values [105]. In addition, the projections of the profile traces for each parameter should be asymptotically ‘X’ shaped, with the MLE estimate at the intersection point [242]. Curvature of the trace projections as they move away from the intersection can indicate significant non-linear effects and the breakdown of the asymptotics. In addition, the profile trace projections should ideally intersect at near right angles, with highly oblique angles indicating significant correlation between parameter pairs and possible deficiencies in the experimental design. The fit() function can also compute likelihood contour projections by setting the Confidence option to Contours. Likelihood contour projections consist of a closed curve surrounding the maximum likelihood estimate in a 2D projection of the parameter space [242]. These curves mark the extreme points that the given pair of parameters can extend out from their MLE value before the log-likelihood ratio increases to the asymptotic Chi- square threshold, assuming the marginal parameters have been conditionally optimized (i.e. profiled) [242]. The profiling algorithm used for the confidence interval computation can be modified to find likelihood contour projections in 2D parameter-pair planes by profiling along non-axial parameter vectors rather than along parameter axes. Just like the profile trace, the algorithm begins at the MLE vector, but rather than selecting an individual parameter, a pair are selected; θi and θj. This pair forms a 2D plane in the parameter

Next, a grid of angles from 0 to $2\pi$ radians is selected; each angle corresponds to a direction vector in the parameter-pair plane emanating from the MLE vector. For each angle in the grid, the parameter pair, $\theta_i$ and $\theta_j$, is incremented along the corresponding direction vector, and the remaining marginal parameters are optimized at each point, yielding a conditional MLE vector $\bar{\theta}(\theta_i, \theta_j)$ given the fixed values of $\theta_i$ and $\theta_j$. When a parameter pair is found along the current direction vector that satisfies the equality,

$$-2[l_{Tot}(\bar{\theta}(\theta_i, \theta_j)) - l_{Tot}(\hat{\theta})] = \chi^2_{p,(1-\alpha)}, \quad (7.13)$$
the position in parameter space is recorded as a member of the $1-\alpha$ profile contour projection for that parameter pair. This is repeated for each angle and each parameter pair to build a set of 2D contour projections. Some interpolation of the recorded contour points is used to generate smooth curves during plotting. When the Confidence option is set to Contours, contour projection plots are added to the profile and trace projection plots previously discussed. An example of the profile and trace plots with added contours is shown in Figure 7.9.

Figure 7.9: Example of the fit() function's graphical output of likelihood contour projections generated when the Confidence option is set to Contours. The contour projections are shown along with the profiles and trace projections previously shown in Figure 7.8.

By computing contour projections for each pair of parameters, we can gain an understanding of which parameter sets are feasible at a given confidence level. The contour projections are asymptotically ellipsoidal, but model non-linearity and weak experimental designs can lead to eccentric shapes.

Computing the contour projections will generally fail if the region is so eccentric that it contains large concave sections in its boundary. Computing contour projections can also be time consuming, especially for large and dynamic models. Currently fit() uses a bisection search along each angle to find the contour points, similar to the search used for interval computation.
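To make the interval search concrete, the sketch below outlines a bisection search for one profile-likelihood interval bound. It is an illustrative re-implementation of the procedure described above, not NLOED's implementation; the loglik function, the theta_mle vector, and the SciPy-based inner optimization are all assumptions.

import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

def profile_bound(loglik, theta_mle, i, direction, alpha=0.05, tol=1e-4):
    """Bisection search for one end of the profile interval of theta_i."""
    p = len(theta_mle)
    threshold = chi2.ppf(1 - alpha, df=p)  # chi-square cutoff, as in Eq. (7.12)
    theta_mle = np.asarray(theta_mle, dtype=float)
    l_mle = loglik(theta_mle)
    free = [k for k in range(p) if k != i]  # indices of the marginal parameters

    def ratio(theta_i):
        # conditional MLE: re-optimize the marginal parameters with theta_i fixed
        def neg_ll(free_vals):
            theta = theta_mle.copy()
            theta[free] = free_vals
            theta[i] = theta_i
            return -loglik(theta)
        res = minimize(neg_ll, theta_mle[free])
        return -2.0 * (-res.fun - l_mle)  # the likelihood ratio statistic

    # step outward from the MLE until the threshold is crossed, then bisect
    lo = hi = theta_mle[i]
    step = max(abs(theta_mle[i]), 1.0) * 0.1
    while ratio(hi) < threshold:
        lo, hi = hi, hi + direction * step
        step *= 2.0
    while abs(hi - lo) > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if ratio(mid) < threshold else (lo, mid)
    return 0.5 * (lo + hi)

Calling profile_bound twice, with direction set to +1 and then -1, yields the two endpoints of the profile likelihood interval for parameter $\theta_i$.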

1  #create three datasets
2  data1, data2, data3 = model_object.sample(design, param, design_replicates=3)
3  #combine the datasets into a single list
4  datasets = [data1, data2, data3]
5  #set up specific options for fitting
6  fit_opts = {'Confidence': 'Intervals'}
7  #call the fitting procedure
8  fit_info = model_object.fit(datasets, options=fit_opts)
9  #print the fitting information
10 print(fit_info)

Listing 6: Example of a batch call to the fit() function of the Model class.

The intervals and graphical diagnostics generated by the fit() function are data-dependent, in that the intervals, profiles and contours all require data to be computed. This is in contrast to the data-free diagnostics provided by the evaluate() function, which are used in the Design class for experiment optimization. The data-free methods involve an expectation and so are suitable to apply before or after data is collected, but they are less useful for generating visual diagnostics and they lack the interpretive richness of the data-dependent methods implemented in the fit() function.

Figure 7.10: Example dataframe returned with a batch call to the fit() function.

The fit() function can be used in batch mode by passing a list of datasets rather than a single dataframe. In this mode each dataset is fit independently, yielding its own parameter estimate. Batch fitting is generally not useful for fitting real experimental data, but it is valuable for fitting collections of simulated datasets in simulation studies. (Note that to fit multiple datasets simultaneously to achieve a single estimate, the user can simply concatenate their respective dataframes vertically and pass them as a single dataset.)

Listing 6 shows an example of a batch call with three datasets. In line 2 three datasets are simulated with a call to sample() and on line 4 they are bundled into a list. On line 6 options are set to return confidence intervals for each fit, and on line 8 the datasets and options are passed to fit(). Line 10 prints the resulting dataframe containing all of the estimates and intervals. Figure 7.10 shows an example of the dataframe returned by a batch call to fit(). Each additional dataset adds an extra row to the returned frame. In simulation studies we are often interested in checking whether the true parameter values fall within the computed intervals at close to the nominal coverage rate, or whether the estimates exhibit a covariance structure similar to the covariance matrix predicted by the FIM. This information can easily be extracted from the returned dataframe. For example, to compute the covariance of all the estimates, the user can use the expression: np.cov(fit_info['Estimate'].to_numpy().T)

To check the fraction of datasets for which a given parameter (named 'Par') falls within its interval, the user can call: sum(fit_info['Estimate','Par'].between(fit_info['Lower','Par'], fit_info['Upper','Par']))/len(fit_info)
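These checks can be combined into a short simulation study. The following sketch is hypothetical: it reuses the model_object, design, and param names from the earlier listings and assumes sample() returns the list of simulated datasets when it is not unpacked.

import numpy as np

#simulate many datasets under the design (one dataset per design replicate)
datasets = model_object.sample(design, param, design_replicates=100)
#batch-fit each dataset, requesting profile likelihood intervals
fit_info = model_object.fit(datasets, options={'Confidence': 'Intervals'})
#empirical covariance of the estimates, for comparison with the FIM prediction
est_cov = np.cov(fit_info['Estimate'].to_numpy().T)
#observed coverage of the interval for parameter 'Par' (nominally 95%)
coverage = sum(fit_info['Estimate', 'Par'].between(
    fit_info['Lower', 'Par'],
    fit_info['Upper', 'Par'])) / len(fit_info)
print(est_cov, coverage)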

The predict() function Generating model predictions is often the final goal of the model building process. However, predictions are also useful during model calibration, both to understand expected model behaviour in various experimental scenarios and to understand how parameter uncertainty and sampling variability propagate to model prediction uncertainty. The predict() function can be used to generate model predictions for specific input settings and parameter values. By default, the predict() function provides predictions for the mean behaviour of the observation variable, $E(Y_i)$. In addition to the mean behaviour, the predict() function can also provide various confidence intervals for the observation variables, $Y_i$, given parameter uncertainty. This uncertainty propagation can be done using the delta method [49] or through Monte Carlo simulation. The predict() function can also generate parametric sensitivities for the mean response.

In order to call the predict() function, the user must first create a dataframe containing the input and observation combinations at which they desire predictions. An example of this dataframe is shown in Figure 7.11 for a two input model with inputs named x1 and x2. The last column is always named Variable and specifies the observation variables for which predictions should be made. Here eight predictions are requested at various input settings, each for one of the two observation variables, y1 and y2.

121 Figure 7.11: Example of an input dataframe used to generate predictions with the predict() function.

1  #define the inputs for predict()
2  predict_inputs = pd.DataFrame({'x1': [-1, 1, -1, 1]*2,
3                                 'x2': [-1, -1, 1, 1]*2,
4                                 'Variable': ['y1']*4 + ['y2']*4})
5  #specify the parameter values
6  params = [0.1, 2, 0.4, 1.3]
7  #call predict()
8  predictions = model_object.predict(predict_inputs, params)
9  #print the predictions
10 print(predictions)

Listing 7: Example of a simple call to the predict() function from the Model class.

The general function call for predict() takes the following format:

model_object.predict(input_struct, param, covariance_matrix=None, options={})

The input_struct argument accepts the previously mentioned dataframe, shown in Figure 7.11, specifying the input and observation variables. The param argument accepts a list containing the parameter values at which the prediction is to be made. The covariance_matrix argument is optional and can be passed if the user has a parameter covariance matrix generated from an existing fit to data or from a theoretical prior; this matrix is used for computing certain prediction intervals. The options argument is a Python dictionary of string-value pairs that can be used to adjust default options in the prediction method; some of the more useful options are highlighted below. An example of a predict() function call is shown in Listing 7. In lines 2-4 we define the input dataframe, specifying the inputs and observations for which we want predictions.

Figure 7.12: Example of the returned dataframe from a call to the predict() function.

In line 6 we set numerical values for the parameters and in line 8 we call the predict() function. In line 10 we print the predictions; the returned object, predictions, is a multi-index dataframe, an example of the printed output of which is shown in Figure 7.12. Here the columns that were passed as part of the input dataframe, specifying the inputs and observation variables, have been grouped under the upper index Inputs. The returned predictions for the mean response are listed under the upper index Prediction in the column named Mean. This column lists the predicted mean observation response, $E(Y_i)$, for each condition.

1  #define a covariance matrix for the parameters
2  cov_mat = np.diag(np.array(params)*0.05)
3  #specify desired option values
4  predict_opts = {'Method': 'MonteCarlo',
5                  'PredictionInterval': True,
6                  'ObservationInterval': True,
7                  'Sensitivity': True}
8  #generate the predictions
9  predictions = model_object.predict(predict_inputs, params,
                                      covariance_matrix=cov_mat,
                                      options=predict_opts)
10 #display the predicted outputs
11 print(predictions)

Listing 8: A more advanced example using the predict() function from the Model class to return prediction intervals and sensitivity information.

A more advanced call to predict() can be used to generate intervals and sensitivity data. Listing 8 gives an example where the user requests both prediction and observation intervals (defined below) via Monte Carlo, as well as sensitivity data.

This additional information is requested using the options dictionary; see lines 4-7. Prediction intervals require the user to pass a parameter covariance matrix, which is specified in line 2 as a diagonal matrix with parameter variances set at 5% of the nominal parameter values. This covariance matrix defines a multivariate normal parametric prior over the parameter vector. The mean of this prior distribution is located at the parameter vector passed via the param argument.

Both prediction intervals and observation intervals are requested in Listing 8. Prediction intervals are percentile-based intervals on the mean of the observation variable, $E(Y_i)$, given uncertainty in the parameters. Observation intervals are percentile-based intervals on the random observation variable, $Y_i$, itself, given both observation variability and, if a prior is provided, parameter uncertainty. For the prediction intervals, $E(Y_i)$ is an expectation with respect to the observation variability of $Y_i$, and therefore the randomness inherent in each observation has been integrated out. However, the mean response, $E(Y_i)$, can still be considered random when parameter estimate uncertainty, in the form of a prior distribution, is considered. To be more precise, let the mean observation response of $Y_i$ as a function of fixed inputs and parameter values be defined as
$$\hat{y}_i(x, \theta) = \int_y y_i\, p(y_i \mid \eta_i = f_i(x, \theta))\, dy_i. \quad (7.14)$$

Let it be emphasized that $\hat{y}_i(x, \theta)$ is a deterministic function of $x$ and $\theta$. For all of the parametric distributions used in NLOED, $\hat{y}_i(x, \theta)$ can be computed algebraically from the statistics $\eta_i$. However, when uncertainty in the parameter vector, $\theta$, is assumed, the parametric uncertainty propagates to the mean response, $\hat{y}_i(x, \theta)$. The distribution of $\hat{y}_i(x, \theta)$ under the parameter uncertainty then depends on the prior distribution, $p(\theta)$. The predict() function defines the prior to be a multivariate normal distribution, centered at the value passed in the param argument and with the covariance matrix passed in the covariance_matrix argument. Prediction intervals are computed such that the interval bounds enclose the true mean response, $\hat{y}_i(x, \theta_o)$, of the true parameter vector, $\theta_o$, with the prescribed confidence probability, by default 95%. This computation obviously assumes the prior correctly reflects the uncertainty about the location of the true parameter vector. Observation intervals, on the other hand, are computed so that their bounds contain realizations, $y_i$, of the observation variable at the prescribed probability, by default 95%. Observation intervals are computed so as to include both sampling variability and, if a covariance matrix has been passed, parameter uncertainty. Mathematically, the distribution of yet-to-be-observed values of the observation variable, $y_i$, can be expressed as
$$p(y_i \mid x) = \int_\theta p_i(y_i \mid f_i(x, \theta))\, p(\theta)\, d\theta. \quad (7.15)$$

The observation interval bounds are determined from the percentiles of this distribution. Without parameter uncertainty this distribution reduces to the observation distribution $p_i(y_i \mid f_i(x, \theta))$. An example of the predictions dataframe returned from Listing 8, containing both types of intervals and sensitivity data, is shown in Figure 7.13. Several additional columns are now included, grouped by upper indices. The upper index Prediction now includes columns Lower and Upper, which mark the lower and upper bounds of the prediction interval for each input-observation combination. A new upper index Observation has been added, containing Lower and Upper bound columns for the 95% observation intervals. Note that under parameter uncertainty the predicted mean response, $E_\theta(\hat{y}_i)$, and the mean observation, $E_{y,\theta}(y_i)$, are the same, and so only the predicted mean is returned (i.e. $E_\theta(\hat{y}_i)$). Sensitivities of the prediction mean (taken at the value passed via the param argument) are shown under the third upper index grouping, named Sensitivities. Here each column is named according to the parameter names the user originally passed to the Model class during instantiation.

Figure 7.13: An example of the returned dataframe from the predict() function, including prediction and observation intervals as well as sensitivity information.

When computing the mean response, prediction intervals, and observation intervals, the user has a choice of method options: Exact, Delta and MonteCarlo. Only one method can be used in a given call to predict(), and all returned information will be computed using the selected method. The Exact method is the default, and it ignores any parameter uncertainty. The Exact method returns $E(Y_i)$ computed algebraically from the statistics, $\eta_i$, evaluated at the nominal parameter values passed. The Exact method cannot compute prediction intervals, as the propagation of a normal prior through a non-linear model cannot be computed exactly, at least for arbitrary models. The observation intervals can be computed exactly, ignoring any parameter uncertainty, by using the cumulative distribution function of the observation probability distribution, $p_i(y_i \mid f_i(x, \theta))$. The Delta method propagates parameter uncertainty using local parametric sensitivities and a normal approximation for the prediction and observation intervals [49].

Using the Delta method, the predict() function can compute the mean, prediction intervals, and observation intervals, but these may be inaccurate for observation distributions that are not well approximated by the normal distribution and for highly nonlinear models. For example, due to the normal approximation in the Delta method, interval bounds can be negative even if the observation variable is strictly positive. The MonteCarlo method uses Monte Carlo sampling from the parameter prior, $p(\theta)$, and the observation distribution, $p_i(y_i \mid f_i(x, \theta))$, to compute the mean response and intervals. The Monte Carlo method is accurate in most scenarios but can be slow, and may require the user to increase the default number of samples via the options dictionary to ensure stable estimates.
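To illustrate the Monte Carlo logic, the sketch below computes prediction and observation intervals for a generic scalar response. It is a minimal re-implementation of the ideas behind Equations (7.14) and (7.15), not NLOED's code; the mean_response and sample_obs functions are assumed user-supplied stand-ins for $\hat{y}_i(x, \theta)$ and the observation distribution.

import numpy as np

def mc_intervals(mean_response, sample_obs, x, theta_hat, cov,
                 n=10000, alpha=0.05, seed=0):
    rng = np.random.default_rng(seed)
    #draw parameter vectors from the multivariate normal prior p(theta)
    thetas = rng.multivariate_normal(theta_hat, cov, size=n)
    #prediction interval: percentiles of the mean response under p(theta)
    means = np.array([mean_response(x, t) for t in thetas])
    pred = np.percentile(means, [100*alpha/2, 100*(1 - alpha/2)])
    #observation interval: also draw from the observation distribution,
    #integrating over theta as in Eq. (7.15)
    obs = np.array([sample_obs(x, t, rng) for t in thetas])
    obs_int = np.percentile(obs, [100*alpha/2, 100*(1 - alpha/2)])
    return pred, obs_int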

7.4 The Design Class

The Design class is used to generate optimal designs in the NLOED package. Design optimization is conditional on the model and the parameter values at which the design is optimized, but it also depends on the experimental constraints encoded in the optimization and on how the optimization problem is structured computationally. Each of these factors can alter the resulting optimal design structure. Therefore it may often be the case that the user will want to create multiple optimal designs under various scenarios and then compare their performance. This motivates encapsulating designs in a specific class, so that multiple designs can be created in a modular fashion by instantiating a Design object for each scenario the user wishes to consider. Another reason for encapsulating designs in a Design object is that the output of any design optimization problem in NLOED is a relaxed design, which requires additional processing to generate an exact design. This processing is not necessarily unique and involves some user choices. The Design class therefore stores relaxed solutions internally and provides users an easy functional interface for generating various exact designs from the relaxed solution.

Figure 7.14 outlines the basic process for creating a Design object. Here, as in Figure 7.3, the green areas indicate user-controlled passing of data or calling of functions. The blue area indicates automatic processes performed during object instantiation. The upper green area is labelled User Provided Information; this area indicates the various data and options the user needs to provide to the Design class constructor for object creation. This information includes which model and parameter values are going to be used, as well as which objective is optimized, how each input variable is to be handled by the optimization algorithm, and choices about sampling flexibility for models with multiple observation variables. Together this information determines the exact nature of the optimization problem and will influence the resulting design considerably. Once this information is properly encoded and passed to the class constructor, instantiation of the Design object commences.

Figure 7.14: Overview of the process for creating a Design object.

Instantiation of the Design class takes only a single line of code to initiate, but as the instantiation process contains a call to IPOPT for optimization, it can take some time to complete. During this process several automated actions are taken within the Design class constructor. These automated processes are labelled Internally-Organized Optimization in Figure 7.14. They are split into two main parts: 1) optimization set-up, and 2) the IPOPT call for optimization that generates the relaxed design. During optimization set-up, the specific information passed during instantiation, especially function attributes within the passed models, is used to construct a CasADi symbol for the optimization problem and its constraints. Once this symbolic structure is prepared, it is passed to IPOPT via CasADi's interface; the optimizer then runs and returns an optimal solution, which is parsed and stored within the resulting Design object.

After optimization completes, the Design object is fully instantiated and the user can interact with it using various user-callable functions. These are shown in the bottom green section, labeled User-Callable Functions in Figure 7.14. These functions can be used to create exact designs from the optimal relaxed archetype, or to compare the performance of various exact designs depending on the user's choice of sample size. Specifically, the round() function can be used to return an exact design as a dataframe in NLOED's default design format.

The exact designs can be used directly with the Model class's user-callable functions like sample() and evaluate() for further analysis, or to guide real experimentation. The following subsections give detailed descriptions of each of the three phases outlined in Figure 7.14, including sample code and call structures where appropriate.

7.4.1 Creating a Design Object

In order to create a Design object the user needs to call the Design constructor. The general call structure for the Design class constructor is:

Design(models, parameters, objective, discrete_inputs=None,
       continuous_inputs=None, observ_groups=None, fixed_design=None,
       options={})

The first three arguments, models, parameters, and objective, are always required. The models argument accepts a Model class object created previously by the user. The parameters argument accepts the nominal parameter values at which the optimal design is computed. The objective argument accepts a string specifying the objective function type; currently this argument only accepts D, for D-optimal designs (maximizing the determinant of the Fisher information matrix). The remaining input arguments, discrete_inputs, continuous_inputs, observ_groups, fixed_design, and options, are optional to varying degrees. The user exerts significant control over the posing of the design problem in specifying these remaining arguments. The discrete_inputs and continuous_inputs arguments control how the inputs to the model are handled. At least one of these two input-related arguments must be passed, as all inputs must be treated as either discrete or continuous. Discrete inputs are dimensions for which the optimization algorithm will only consider discrete levels of the input in the design. Continuous inputs are treated as real-valued and thus can be varied accordingly. The observ_groups argument accepts information about which observation variables must be sampled together. By default the Design class assumes all observations can be replicated with complete flexibility and independence; when this is not possible, observ_groups allows the user to force additional structure on which observation variables are measured together in a given input condition. The fixed_design argument allows the user to pass an existing fixed design to the Design class. This is useful if the user has certain observations that need to be taken regardless of the optimal design (i.e. based on domain-specific knowledge), or to optimize the current design conditionally on past data.

The options argument is optional and accepts a dictionary of key-value pairs to modify default settings of the Design constructor and optimization. To help explain the behaviour of the optional arguments of the Design constructor, assume an existing model object, model_object, has been passed to the Design constructor. Also assume that model_object has four inputs and two observation variables. Let the input names be x1, x2, x3, and x4, and the observation variable names be y1 and y2. In the subsequent paragraphs this example model will be used to explain the usage of the various optional arguments. As the descriptions are quite lengthy, the topics are outlined here for clarity:

• Discrete Inputs Using the discrete_inputs argument to handle all model inputs.

• Continuous Inputs Using the continuous_inputs argument to handle all model inputs.

• Mixed Inputs Using both the discrete_inputs and continuous_inputs arguments to handle different subsets of the model inputs as either discrete or continuous.

• Observation Groups Using the observ_groups argument to force observation variables to be sampled together.

• Fixed Design Aspects Using the fixed_design argument to pass an existing or fixed aspect of the experimental design for conditional optimization.

Discrete Inputs Both the discrete_inputs and continuous_inputs arguments are dictionaries, each with multiple fields, which the user must create prior to passing them to the Design constructor. To assign all four inputs to be discrete, the user can use the code in Listing 9.

1  #discrete_inputs argument creation
2  discrete_dict = {'Inputs': ['x1','x2','x3','x4'],
3                   'Grid': [[-1,-3,-6,3], [0,0,0,0], [4,1,2,-1],
4                            [3,8,-3,7], [9,7,-7,9], [0.1,0.3,-0.2,1]]}
5  #declare parameters
6  param = [2, 3.3, 0.5, 1]
7  #call Design constructor
8  design_object = Design(model_object, param, 'D',
                          discrete_inputs=discrete_dict)

Listing 9: Example of the discrete inputs argument creation and passage, assigning four model inputs to be handled discretely.

The Inputs key must be passed for discrete_inputs; its value is a list of strings naming the inputs to be treated discretely. Here all inputs are listed in the Inputs field of the dictionary, which means all inputs are handled discretely and the continuous_inputs argument can be ignored. When inputs are discretized, the optimization algorithm only considers discrete levels of each input, permuted to create a grid of candidate input points in the discrete input domain. The remaining keys in discrete_inputs determine the layout of the discrete grid for the specified inputs. The user can specify the grid in three ways: 1) with the Grid key, 2) with the Candidates key, and 3) with the Bounds and NumPoints keys. As shown in Listing 9, the user can specify the exact set of grid points using the Grid key, whose value is a list of lists; the outer list contains all of the grid points, and each inner list specifies the input values at one point. Values in the inner lists are assumed to be ordered in the same way the discrete input names were passed via the Inputs key. Passing the exact set of grid points gives the user complete control over which input combinations are considered, but it can be time consuming to construct. In the case where the user requires specific discrete levels of each input to be considered but faces no other limitation, they can use the Candidates key. Here the user again passes a list of lists, but the outer list is the same length as the number of discrete inputs. Each inner list contains the unique candidate values of the corresponding input to be considered. The inner lists can be of different lengths. An example of this type of dictionary is shown below:

#discrete_inputs argument creation
discrete_dict = {'Inputs': ['x1','x2','x3','x4'],
                 'Candidates': [[-1, 0, 1],
                                [1, 5, 10, 15, 20],
                                [.1, .7, 12, 200],
                                [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]]}

The Design constructor will generate all possible permutations of the provided candidate lists and use this permutation set as the grid of potential input points. Finally, in some experiments the user may wish to simply distribute points evenly through some region of the discrete input space. To do so they can use the Bounds and NumPoints keys. When passing the Bounds key, the user gives the lower and upper bounds for each discrete input dimension as a list of tuples. The Design constructor then creates a grid of candidate points in the hyper-rectangle defined by the bounds, with the integer passed via the NumPoints key determining the number of points per dimension. The points within each dimension are distributed equidistantly, and all permutations within the hyper-rectangle are considered. If NumPoints is not passed, the number of values along each dimension defaults to 5. The Bounds and NumPoints keys are useful for quickly generating an equidistant candidate grid when the exact values of the discrete levels do not matter. An example dictionary using the Bounds and NumPoints keys is given below:

#discrete_inputs argument creation
discrete_dict = {'Inputs': ['x1','x2','x3','x4'],
                 'Bounds': [(-1,1), (-1,1), (-1,1), (-1,1)],
                 'NumPoints': 10}

There are a number of reasons the user might consider using discrete inputs. Real experimental inputs to a system, when implemented in the lab, are always restricted to discretely distinguishable levels by measurement accuracy and equipment limitations. Even if an input such as temperature could in theory be resolved down to an infinitesimal scale, an experimental apparatus in the lab can only maintain temperature consistently within a bounded range, determined by measurement accuracy and the apparatus control resolution. Some experimental inputs can only be varied over a few discrete levels (e.g. growth rate on various carbon sources), and some inputs are simply numerical encodings of categorical factors that have no ordering (e.g. background strain, antibiotic type, etc.). In certain models it can be advantageous to discretize an input, despite its fine experimental resolution, either because the model is relatively insensitive to the given input or because greater optimization efficiency can be achieved via discretization. Discrete inputs impose a higher upfront computational cost and create a larger overall optimization problem (with higher memory overhead), but discretization leads to faster iterations of the optimization solver due to increased convexity and a simpler overall problem structure. The problem size and upfront cost scale with the number of discrete candidate points considered in the input space.

As the grid generally grows exponentially with the number of input dimensions, it is difficult to handle high-dimensional problems with all inputs treated discretely.
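To illustrate this growth, the short sketch below expands per-input candidate lists into a full grid, mirroring the permutation behaviour described for the Candidates key; it illustrates the combinatorics only and is not NLOED's internal code.

from itertools import product

#candidate levels for two hypothetical inputs
candidates = {'x1': [-1, 0, 1],
              'x2': [1, 5, 10, 15, 20]}
#every permutation of the candidate levels becomes one grid point
grid = [list(point) for point in product(*candidates.values())]
print(len(grid))  #3 * 5 = 15 points; the count grows exponentially with inputs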

Continuous Inputs In order to treat inputs continuously the user passes the continuous_inputs argument, which accepts a dictionary with a structure somewhat similar to its discrete counterpart, discrete_inputs. An example of a call to the Design constructor with all inputs treated continuously is shown in Listing 10:

1  #continuous_inputs argument creation
2  continuous_dict = {'Inputs': ['x1','x2','x3','x4'],
3                     'Bounds': [(-1,1), (-1,1), (-1,1), (-1,1)],
4                     'Structure': [['x1_lvl1','x2_lvl1','x3_lvl1','x4_lvl1'],
5                                   ['x1_lvl2','x2_lvl2','x3_lvl2','x4_lvl2'],
6                                   ['x1_lvl3','x2_lvl3','x3_lvl3','x4_lvl3']]}
7  #declare parameters
8  param = [2, 3.3, 0.5, 1]
9  #call Design constructor
10 design_object = Design(model_object, param, 'D',
                          continuous_inputs=continuous_dict)

Listing 10: Example of the continuous inputs argument creation and passage, assigning four model inputs to be handled continuously.

Here, the Inputs key maps to a list of input names that are to be treated continuously. In Listing 10, all four inputs are listed in the Inputs field of continuous_dict, and so the discrete_inputs argument can be ignored. Similar to the discrete case, the Bounds key for continuous inputs is specified as a list of tuples containing lower and upper bounds for each input. All continuous inputs must have boundaries specified. These input bounds are necessary to ensure well-posedness of the optimization problem; for certain models the objective value can increase indefinitely as some combination of input dimensions grows. Also, it is generally infeasible or impractical to vary the input values beyond some natural limits within the laboratory. Bounds can be imposed based on limitations of experimental equipment or based on restrictions on the input ranges for which the model assumptions hold. When discrete inputs are used, the candidate grid points are fixed and the optimization selects inputs by adding them to the solution based on their utility.

When using continuous inputs, the input points are themselves treated as optimization variables, and the user must specify how many unique points the optimizer should consider within the bounded input space. The Structure key allows the user to specify how many unique input points to consider and any common dimensional values they must share. The Structure key must be passed with a list of lists as its value. The outer list corresponds to the number of unique input points that will be considered in the continuous input space. Each inner list has the same number of dimensions as the number of continuous inputs. The inner lists each contain a set of string symbols, one for each continuous input dimension; each unique string symbol specifies a unique level for the corresponding continuous input. In the example shown in Listing 10, we have specified three unique points in the continuous input space; each point has complete freedom in each of the four input dimensions. This scenario is encoded with three inner lists, one for each point. The independence of every point and dimension is indicated by the use of a novel symbol string for every list entry (i.e. 'x1_lvl1', 'x2_lvl2', etc.). However, using the Structure key we can also specify more constrained experimental design problems. For example, we may consider six input points, but for all six, x1 can only have one unique value. We may also wish to restrict input x2 to one level in the first three points and another in the latter three points. Inputs x3 and x4 are assumed to be free in all six input points. An example of the continuous_inputs dictionary encoding such restrictions is shown in Listing 11.

#continuous_inputs argument creation
continuous_dict = {'Inputs': ['x1','x2','x3','x4'],
                   'Bounds': [(-1,1), (-1,1), (-1,1), (-1,1)],
                   'Structure': [['x1_lvl','x2_lvl1','x3_lvl1','x4_lvl1'],
                                 ['x1_lvl','x2_lvl1','x3_lvl2','x4_lvl2'],
                                 ['x1_lvl','x2_lvl1','x3_lvl3','x4_lvl3'],
                                 ['x1_lvl','x2_lvl2','x3_lvl4','x4_lvl4'],
                                 ['x1_lvl','x2_lvl2','x3_lvl5','x4_lvl5'],
                                 ['x1_lvl','x2_lvl2','x3_lvl6','x4_lvl6']]}

Listing 11: Example of the continuous_inputs argument using the Structure field to encode candidate points with shared dimensional values.

Note how the first element of each inner list is now the same string, 'x1_lvl', indicating that input x1 has a single value, shared across all points to be optimized. The second element of each inner list is either 'x2_lvl1' (in the first three) or 'x2_lvl2' (in the latter three), indicating that input x2 will only take two unique values after optimization.

Lastly, every element corresponding to inputs x3 and x4 has a unique string, indicating unique levels for each point. The Structure field will often be used as shown in Listing 10, with all points free, the main choice being how many input points to consider. However, the more complicated point structures shown in Listing 11 are required in certain experiments. For example, consider a time series experiment where the initial condition is a model input and all replicates share the same initial condition. In such a case, the initial condition input will need to be restricted to a single value for all replicates, just like x1 above. As another example, the limited capacity of a laboratory, such as the number of incubators, may mean that certain input dimensions, such as temperature, can only take a small number of unique values in each experiment, just as input x2 above only has two levels.

When using continuous inputs, the Design constructor will optimize the placement of the input points and will also attempt to optimize the number of replicates taken at each point (this is also done for discrete inputs). However, for continuous inputs the user may wish to specify that each input point in the structure gets exactly the same number of observations. This option is ideal for optimizing designs with smaller sample sizes. To do so the user can pass the option 'LockWeights' as True in the options argument. This option is only available if all inputs are handled continuously.

Choosing to treat inputs continuously rather than discretely depends on the model and the experimental context. Generally, real-world experimental levels are discrete up to the tolerance of measurement and control, and thus continuous inputs are an approximation. However, it is generally practical to approximate inputs as continuous if the number of discrete levels achievable in the laboratory is large and the experimental control is very fine relative to the model's input sensitivity. Continuous inputs can also be used for computational reasons, as they generally lead to smaller problems that are quick to initialize. However, continuous inputs generally result in an optimization problem that is more nonlinear, leading to more optimizer iterations, more intensive computation on each iteration, and greater chances of getting stuck in local optima. In addition, for continuous inputs the user needs to specify the number of unique input points the algorithm should consider in the design; this can be difficult to set a priori.

Mixed Inputs In certain situations it is desirable to handle some inputs discretely while treating others as continuous. This can be for both experimental and computational considerations. For example, adding some discrete inputs tends to make a design problem more convex and easier to solve, while adding some continuous inputs can make the overall problem smaller with respect to memory requirements.

When dealing with very complex models or input spaces, blending the input types can allow the user to tackle problems that may be difficult to handle with a single input type. Listing 12 shows an example where inputs x1 and x2 have been handled continuously, with restrictions similar to those imposed in Listing 11. Inputs x3 and x4 have been discretized. In this case both the discrete_inputs and continuous_inputs arguments are passed, with the model's four inputs split between the two input structures.

1  #continuous_inputs argument creation
2  continuous_dict = {'Inputs': ['x1','x2'],
3                     'Bounds': [(-1,1), (-1,1)],
4                     'Structure': [['x1_lvl','x2_lvl1'],
5                                   ['x1_lvl','x2_lvl1'],
6                                   ['x1_lvl','x2_lvl1'],
7                                   ['x1_lvl','x2_lvl2'],
8                                   ['x1_lvl','x2_lvl2'],
9                                   ['x1_lvl','x2_lvl2']]}
10 #discrete_inputs argument creation
11 discrete_dict = {'Inputs': ['x3','x4'],
12                  'Candidates': [[-1, -.5, 0, .5, 1],
13                                 [-1, -.5, 0, .5, 1]]}
14 #declare parameters
15 param = [2, 3.3, 0.5, 1]
16 #call Design constructor
17 design_object = Design(model_object, param, 'D',
                          discrete_inputs=discrete_dict,
                          continuous_inputs=continuous_dict)

Listing 12: Example calling the Design constructor using a mixture of discrete and continuous inputs.

Observation Groups The observ_groups argument to the Design constructor is optional. By default each observation variable is handled independently, meaning that during optimization, outputs y1 and y2 could be assigned different replicate quantities even in the same input conditions. In certain settings this is not practical, for example in a costly destructive sampling experiment where a replicate is destroyed on observation. Here the user will likely measure every relevant observation variable they are able to whenever a replicate is destroyed, and it is then useful to assign the relevant observation variables to a group that will be sampled together.

The observ_groups argument accepts a list of lists; each inner list contains the names of observation variables that are measured together in an observation group. For example, if y1 and y2 needed to be measured together in any observation, one could use the example shown in Listing 13.

1  #discrete_inputs argument creation
2  discrete_dict = {'Inputs': ['x1','x2','x3','x4'],
3                   'Bounds': [(-3,3), (-3,3), (-3,3), (-3,3)],
4                   'NumPoints': 10}
5  #declare parameters
6  param = [2, 3.3, 0.5, 1]
7  #observation grouping
8  observ_group_lst = [['y1','y2']]
9  #call Design constructor
10 design_object = Design(model_object, param, 'D',
                          discrete_inputs=discrete_dict,
                          observ_groups=observ_group_lst)

Listing 13: Example of the observ_groups argument being used to group observation variables y1 and y2.

The observ_groups list is constructed in line 8 and is passed in line 10. This forces observations y1 and y2 to be treated as a single unit when the algorithm considers the number of samples to assign to them.

Fixed Design Aspects The fixed_design argument is also optional and can be used to pass fixed design aspects or the design used to collect existing data. The fixed_design argument accepts a dictionary containing an existing design along with a weight indicating what fraction of the overall sample size is dedicated to the fixed design aspects. For example, if the user has already collected 5 observations and plans to collect another 15 with an optimal design, the user would pass the fixed_design argument with a weight of 0.25. A coded example using the fixed_design argument is shown in Listing 14. In lines 6-11, the initial design is declared; in practice this design can come from a previous optimal design or past datasets. In line 13 the initial design is inserted into a dictionary under the key name Design. In the same dictionary we also include the key name Weight, whose value specifies the fraction (between 0 and 1) of the overall sample size dedicated to the initial design. In this case the value of 0.25 is passed as the weight, and so the designed experiment is expected to have a sample size three times that of the initial experiment.

Passing in fixed aspects of an overall experimental effort is important for the optimal design to perform well. Optimal designs created without conditioning on fixed design aspects will ignore the information contained in the fixed aspects, which may lead to some observations in the optimal design being inefficiently placed.

1  #discrete_inputs argument creation
2  discrete_dict = {'Inputs': ['x1','x2','x3','x4'],
3                   'Bounds': [(-3,3), (-3,3), (-3,3), (-3,3)],
4                   'NumPoints': 10}
5  #declare initial design
6  init_design = pd.DataFrame({'x1': [-2,-1,1,2]*2,
7                              'x2': [2,-2,-1,1]*2,
8                              'x3': [1,2,-2,-1]*2,
9                              'x4': [-1,1,2,-2]*2,
10                             'Variable': ['y1']*4 + ['y2']*4,
11                             'Replicates': [3]*8})
12 #create the fixed design dictionary
13 fixed_dict = {'Weight': 0.25, 'Design': init_design}
14 #declare parameters
15 param = [2, 3.3, 0.5, 1]
16 #call Design constructor
17 design_object = Design(model_object, param, 'D',
                          discrete_inputs=discrete_dict,
                          fixed_design=fixed_dict)

Listing 14: Example of a call to the Design constructor with the fixed_design argument used to pass an existing aspect of the overall design.
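As a quick illustration of the Weight field's meaning, the hypothetical computation below recovers the 0.25 weight used above from the example sample counts (5 existing observations, 15 planned):

#fraction of the total sample size already committed to the fixed design
n_fixed, n_new = 5, 15
fixed_weight = n_fixed / (n_fixed + n_new)  #0.25, as used in Listing 14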

7.4.2 Design’s Automatic Optimization Set-up

After the Design constructor is called, instantiation of the Design object begins with the automatic organization of the optimization problem for passage to IPOPT. The main challenge in setting up the optimization problem is creating the CasADi symbol for the overall design objective. In order to create the objective symbol, the optimization variables need to be linked to the total Fisher information matrix for the experiment.

All objectives used by the NLOED package for optimization are computed as simple algebraic functions of the elements of the experiment's Fisher information matrix; computing the total FIM for a design is therefore the main computational hurdle. As observations in NLOED are assumed to be independent, the Fisher information matrix for an experiment can be computed as a weighted sum of individual matrices, one for each observation. For a multi-output model the FIM sum can be written as (see Chapter 2 for details)

$$I_{Tot}(D, \bar{\theta}) = \sum_{j}^{N} \sum_{i}^{M} \beta_{i,j}\, I_i(x_j, \bar{\theta}). \quad (7.16)$$

Here there are $M$ observation variables, $Y_i$, indexed by $i$, and $N$ support points, $x_j$, indexed by $j$. The support of the design consists of the set of all unique input vectors, such that $x_j \in \mathcal{X}$. The FIM at a given input point, $x_j$, for the observation variable $Y_i$ is $I_i(x_j, \bar{\theta})$. The vector $\bar{\theta}$ is the nominal parameter vector at which the design is optimized. Each replicate allocation, $\beta_{i,j}$, corresponds to a support point $x_j$ and an observation variable $Y_i$. Together the replicate allocations for each observation and input condition form the set $\beta_{i,j} \in \mathcal{B}$, which defines the design's replication structure. For an exact design the weights are restricted to be non-negative integers, and they sum to the overall sample size $N_{Tot}$, such that $N_{Tot} = \sum_{i}^{M} \sum_{j}^{N} \beta_{i,j}$. The overall information matrix, $I_{Tot}(D, \bar{\theta})$, is therefore a function of the design, $D$, where the design consists of the support point set, $\mathcal{X}$, and the weight set, $\mathcal{B}$, such that $D = \{\mathcal{X}, \mathcal{B}\}$. The integer constraint makes the above problem very difficult. In the Design class the integer constraint on the replicates is relaxed: the integer replicate allocations, $\beta_{i,j}$, are replaced with real-valued continuous weights, $\xi_{i,j}$, constrained so that $\sum_{i}^{M} \sum_{j}^{N} \xi_{i,j} = 1$. The relaxed design problem can then be written as

$$I_{Tot}(D_R, \bar{\theta}) = \sum_{j}^{N} \sum_{i}^{M} \xi_{i,j}\, I_i(x_j, \bar{\theta}). \quad (7.17)$$

For the relaxed formulation, the weights, $\xi_{i,j}$, form the weight set $\xi_{i,j} \in \mathcal{Z}$, and the relaxed design is defined as $D_R = \{\mathcal{X}, \mathcal{Z}\}$. The problem in the relaxed form remains nonlinear, but it is now possible to pass it to a solver such as IPOPT and expect reasonable solution times for many models of interest. In order to compute the FIM sum for a given experiment, the Design constructor uses the FIM function attributes of the Model object passed to the Design constructor. These function attributes are able to compute $I_i(x_j, \bar{\theta})$ and can therefore compute each individual FIM's symbolic dependence on the candidate support points.

The total FIM sum, listed previously, is generated as a CasADi symbolic function with $\xi_{i,j}$ and $x_j$ as the symbolic inputs. The total FIM is then used to generate a CasADi symbol for the overall optimization objective, $\Psi(I_{Tot}(D_R, \bar{\theta}))$. The resulting objective symbol effectively encodes the entire optimization problem in a CasADi symbolic structure, linking the objective to symbols for each of the optimization variables, $x_j$ and $\xi_{i,j}$. This whole CasADi symbolic structure can then be passed to IPOPT via the CasADi interface, along with any required constraints. The CasADi interface then uses the symbolic structure to auto-generate objective and constraint derivatives for use in IPOPT's interior point solver. Much of the actual optimization is handled automatically by CasADi and IPOPT. NLOED specifically manages the problem formulation, controlling how the overall FIM is computed and how the optimization problem is structured.
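The toy sketch below illustrates this construction for a single observation variable: per-point FIMs are combined into the weighted sum of Equation (7.17), and a D-optimal objective symbol is formed and handed to IPOPT. The fim_of_point helper and its rank-one FIM are invented placeholders standing in for the Model's FIM attributes; this is a schematic of the formulation, not NLOED's source.

import casadi as cs

n_points = 4                    #candidate support points, one scalar input each
xi = cs.SX.sym('xi', n_points)  #relaxed sampling weights
x = cs.SX.sym('x', n_points)    #support point locations

def fim_of_point(xj):
    #placeholder per-point FIM: outer product of an invented sensitivity
    #vector; in NLOED the Model's FIM attributes supply this symbolically
    g = cs.vertcat(1, xj, xj**2)
    return cs.mtimes(g, g.T)

#total FIM as the weighted sum over the support points, as in Eq. (7.17)
fim_total = cs.SX.zeros(3, 3)
for j in range(n_points):
    fim_total += xi[j] * fim_of_point(x[j])

#D-optimal objective: maximize log det(FIM) by minimizing its negative
objective = -cs.log(cs.det(fim_total))
#the relaxed weight constraint (fixed to one when the solver is called)
nlp = {'x': cs.vertcat(xi, x), 'f': objective, 'g': cs.sum1(xi)}
solver = cs.nlpsol('solver', 'ipopt', nlp)
sol = solver(x0=[0.25]*n_points + [0.1, 0.4, 0.6, 0.9],
             lbx=[0]*n_points + [-1]*n_points,
             ubx=[1]*n_points + [1]*n_points,
             lbg=1, ubg=1)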

Figure 7.15: A depiction of NLOED design optimization problem structure with two discretized inputs and a single observation variable.

The input settings the user provides in discrete_inputs and continuous_inputs during instantiation of the Design object significantly influence the optimization problem's computational structure. For example, when only discrete inputs are used, the candidate grid specified by the user effectively becomes the support point set, $x_j \in \mathcal{X}$. As these points are fixed at specific levels in the grid, the input points, $x_j$, are not included as free optimization variables in the optimization problem. Instead, the weights, $\xi_{i,j}$, become the only optimization variables.

This results in a large, sparse, convex optimization problem, because the individual FIMs, $I_i(x_j, \bar{\theta})$, for each candidate grid point, $x_j$, can be pre-computed numerically before calling the solver. The weights, $\xi_{i,j}$, then control the exact convex combination of these matrices that makes up the total FIM. The optimal support points are indicated by the non-zero replicate weights after optimization. Non-zero replicate weights also indicate which observation variables are to be measured. As each FIM is positive semi-definite, the overall problem is convex (at least for the default D-optimal objective) [15]. However, the discrete handling of inputs requires the Design class to compute a FIM for every input grid point, which can be time consuming if there are many inputs and the grid points consist of all permutations of the candidate input levels. Figure 7.15 visually depicts the optimization problem formulated with a discretized two-dimensional input space and a single observation variable. This scenario would correspond to a discrete_inputs dictionary structured as:

discrete_dict = {'Inputs': ['x1','x2'],
                 'Candidates': [[1,2,3], [1,2,3,4]]}

In the figure, three levels of x1 and four levels of x2 are permuted to create a candidate grid over which the sampling weights serve as optimization variables. In summary, the optimizer has flexibility in assigning sampling weights and observation variables and will easily converge to a global optimum, up to numerical precision, but it has no flexibility to adjust the grid of input points. A complementary scenario to the discretized input grid optimization problem can be achieved, in which the weights, $\xi_{i,j}$, are fixed and only the input points, $x_j$, are optimized. This occurs when only continuous inputs are used and the LockWeights option is set to True, so that each input point gets the same fixed weighting. In this case the user controls the number of unique support points, $N$, by specifying the size of the support set using the Structure field of the continuous_inputs argument. Figure 7.16 depicts this optimization set-up for a two input scenario. This would correspond to a continuous_inputs argument structured as:

1  continuous_dict = {'Inputs': ['x1','x2'],
2                     'Bounds': [(0,1), (0,1)],
3                     'Structure': [['x1_lvl1','x2_lvl1'],
4                                   ['x1_lvl2','x2_lvl2'],
5                                   ['x1_lvl3','x2_lvl3'],
6                                   ['x1_lvl4','x2_lvl4']]}

Figure 7.16: A depiction of an NLOED design optimization problem structure with two continuously handled inputs and the LockWeights constraint activated.

In the figure there are four support points free to vary within the bounded domain, each with a fixed, identical sampling weight. If the model is nonlinear in its parameters, then the input points, $x_j$, are nonlinearly related to the optimization objective. This means the optimization becomes a fully nonlinear programming problem, but it tends to be smaller, as the number of optimization dimensions is the product of the input dimension of $x$ and the number of support points, $N$. Here the optimizer has great flexibility in moving the support points around the input space, but designs may be sensitive to the starting locations of the support points, and no flexibility in re-weighting the support points is possible. In this scenario, the only way for the optimizer to replicate a support point is to locate two support points at an identical position, meaning the user will generally need to provide a large number of support points (potentially on the same order as their intended sample size) in order to understand the optimal replication structure. However, the resulting designs derived with locked weights are easily implemented exactly with small sample sizes.

When continuous inputs are used in the default manner, without locked weights, both the weights, $\xi_{i,j}$, and the input points, $x_j$, are treated as optimization variables. This scenario is depicted in Figure 7.17 and would equate to a continuous_inputs argument structured as:

Figure 7.17: A depiction of NLOED design optimization problem structure with two continuously handled inputs, optimized sampling weights, and a single observation variable.

1  continuous_dict = {'Inputs': ['x1','x2'],
2                     'Bounds': [(0,1), (0,1)],
3                     'Structure': [['x1_lvl1','x2_lvl1'],
4                                   ['x1_lvl2','x2_lvl2'],
5                                   ['x1_lvl3','x2_lvl3']]}

This set-up gives the optimizer flexibility in both the location of the support points and the distribution of samples amongst them, as both points and weights are optimization variables. In the figure there are three support points, each with its own sampling weight. The resulting problem is still highly nonlinear in the support points, but the use of the weights means that the user can allot fewer candidate support points and still discover the optimal replication structure. This can lead to a smaller optimization problem, but the resulting design may be more suited to implementation with a large sample size.

Mixing continuous and discrete inputs can lead to complicated structures which are difficult to visualize. In the continuous subspace of the overall input space, a certain number of candidate support points are free to vary during optimization, as specified by the Structure key in the continuous_inputs argument. (Recall these 'points' are technically in a lower-dimensional sub-space of the overall input space, consisting of only the continuously assigned dimensions.) Along the remaining discretely handled input dimensions, a candidate set of levels is available via the discrete grid. An example of this scenario is depicted in Figure 7.18.

Figure 7.18: A depiction of NLOED's design optimization problem structure with the first input dimension handled continuously and the second input dimension handled discretely, and with a single observation variable.

This scenario would correspond to the discrete_inputs and continuous_inputs arguments structured as:

1  #discrete_inputs argument
2  discrete_dict = {'Inputs': ['x2'],
3                   'Candidates': [[1,2,3,4]]}
4  #continuous_inputs argument
5  continuous_dict = {'Inputs': ['x1'],
6                     'Bounds': [(0,1)],
7                     'Structure': [['x1_lvl1'],
8                                   ['x1_lvl2']]}

Here two unique levels of input x1 are available to the optimizer. At each of these two levels, a four-level grid in the second dimension, x2, is available for selection via the sampling weights. The optimal set of support points selected is a function of the location of the continuous-dimension points and of which discrete grid locations receive non-zero weights.

This type of problem formulation allows the user to accommodate specific experimental limitations, for example if x2 can only be set to four discrete levels on the available equipment, and x1 can only be run with at most two unique levels in a given round of experimentation. The mixed formulation can also decrease the non-linearity of a problem, as the input dimensions that are handled continuously are the only ones that are fully nonlinear with respect to the objective.

Figure 7.19: A depiction of NLOED's method for observation selection as part of its optimization structure. On the left, the default method is used, where observation variables y1 and y2 are each given a separate set of weights over the same input structure. On the right, the observ_groups argument has specified that y1 and y2 must be observed together in any given input condition.

For models with multiple observation variables, output selection is performed in a similar manner to discrete input selection. Non-zero weights, $\xi_{i,j}$, are used to decide which outputs are measured. The left side of Figure 7.19 shows the default handling of multiple observation variables for a two input model. In this case each of the outputs, y1 and y2, has its own FIM with its own weight, but both share the same underlying input points, regardless of whether inputs are handled continuously, discretely, or in a mixed fashion. When the user groups outputs together using the observ_groups argument, the FIMs are automatically summed within the group before they are weighted, implying that they are always observed as a group.

This situation is shown on the right in Figure 7.19 for a two output model where both outputs, y1 and y2, have been grouped together.

Based on the inputs provided to the Design constructor, NLOED will implement the appropriate optimization scenario. Each scenario results in a different symbolic structure, which is then passed to IPOPT. Some experimentation may be necessary with a given experimental design problem in order to find a structure which works well in IPOPT and satisfies experimental constraints. After optimization, the relaxed design is parsed from IPOPT's output and stored within the Design object. The user can then view the relaxed design with its continuous weights or generate an exact design using the object's user-callable functions described in the next section. While NLOED provides extensive flexibility in formulating the optimal design problem, not all experimental limitations can be implemented. In these scenarios, the optimal relaxed design can often provide qualitative information about which input conditions provide the most desirable information about the model parameters. The user can then use the Model class's evaluate() function and simulation tools to assess design modifications and find a reasonable design.

7.4.3 User-callable Design Functions

Once the user has instantiated a Design object, they can use it to examine the optimal design structure and to generate exact designs that can be implemented in practice or simulated with the Model class. The user performs these actions by calling functions available within the Design class object. Here we explain the calling procedure for the available functions and discuss some planned extensions.

The relaxed() function The relaxed() function can be used to return the relaxed optimal design as a dataframe. Relaxed designs resemble exact designs; however, instead of a Replicates column containing the integer replicate allocations, $\beta_{i,j}$, they have a Weight column containing the real-valued weights, $\xi_{i,j}$. The sum of the Weight column values will be one, up to the numerical precision of the optimization algorithm. The relaxed() function does not accept any input arguments. An example call is shown in Listing 15.

1  #extract the relaxed design from the design object
2  relaxed_design = design_object.relaxed()
3  #print the relaxed design
4  print(relaxed_design)

Listing 15: An example call to the Design class’s relaxed() function.

Line 2 returns the relaxed design dataframe and in line 4 the dataframe is printed to the output. An example of a printed relaxed design is shown in Figure 7.20.

Figure 7.20: An example of a dataframe containing an relaxed design returned by the relaxed() fucntion.

we can see that there are only four unique support points, with different input values of x1 and x2, each with even weighting across both observation variables y1 and y2. This result is typical for linear models with normal errors and symmetric bounds on the input domain. Returning the relaxed design is not generally necessary but as the relaxed design is used to generate all exact designs via rounding, it can be useful for visualizing the underlying relaxed structure. In addition, depending on how the design optimization problem was structured, the relaxed design may be guaranteed to satisfy certain equivalence theorems [15], and thus can be used to verify a global optima has been achieved. This will only work in special cases, for more information the user may refer to [15].

The round() function In order to create an exact design with a finite sample size, NT ot, the user must discretize the exact design in some manner, converting real-valued PN PM weights, ξi,j to integer valued allocations, βi,j, so that NT ot = j i βi,j. There are a number of rounding methods available, each with various trade-offs [60, 61, 62]. The round() function implements the Adam’s apportionment rounding procedure as a default method for performing this task. Adam’s apportionment has been noted in previous works

146 as having a number of ideal properties for rounding experimental designs [60, 61, 62]. The general call structure for the round() function is;

round(sample_size, options={})

The sample size argument accepts the desired number of sample size for the experiment. The options argument is optional and can be used to override the default behaviour of the rounding algorithm. An example call to the round() function is shown in Listing 16.

1 #set the sample size 2 sample_size= 10 3 #generate the rounded design 4 exact_design= design_object.round(sample_size) 5 #print the resulting exact design 6 print(exact_design)

Listing 16: An example call to the Design class’s round() function.

In line 2, the sample size, NT ot, is set to 10, and in line 4 the exact design is generated with a call to the round() function of the design object. In line 6 the exact design is printed to the console output. An example of a exact design returned by the round() function for the relaxed design shown in Figure 7.20 is shown in Figure 7.21. The round()

Figure 7.21: An example of an exact design dataframe returned by the round() function.

function here is called with a sample size of 10, however, the relaxed design has 8 unique input-observation pairs, each with equal weighting. As an equal weighting of 8 points is not achievable with 10 observations, the rounding procedure will allocate these additional points randomly. As in the above case Adam’s method does not always yield a unique

147 apportionment, in which case currently a random selection is made. In future versions of the package the rounding() function will be extended to allow more rounding methods as well as further analysis in cases where non-unique rounding occurs.

The power() function Often the user’s ultimate goal is to constrain parameter values within reasonable confidence bounds of a prescribed width. Experimental equipment or cost may place some upper bound on the sample size in a given experiment, and in cases where this is quite restrictive the user will use the maximum sample size as the input for the round() function. If the design achieves the desired accuracy than the task is achieved, and if not the experimenter may be forced to iterate over multiple rounds of experimentation. However, in some cases there may not be a restrictive upper bound on the sample size, but rather observations may be costly in either time or resources and the user may prefer smaller sample sizes but has no firm upper limit. In this context the user needs a method to examine trade-offs in sample size and confidence interval width across a range of feasible sample sizes. This type of analysis is similar to power analysis done for traditional regres- sion models in empirical studies for social and medical sciences [243]. Power analysis can demonstrate how confidence intervals or other diagnostic metrics converge as the sample size increases. The power() function is nor currently implemented but its intended role is to allow the user explore rounding of the optimal relaxed design for multiple sample sizes within a range. This process will allow the user to understand performance trade-offs across this range for the given design and to determine at what sample size threshold certain accuracy objectives are expected to be achieved. It is important to note that the same underlying relaxed design is used for each sample size to generate an exact design, however the exact designs will differ in replicate allocation. As the sample size increases the exact design can better approximate the optimal weights in the relaxed design. This generally results in a monotonic increase in performance for exact designs created with larger sample sizes; a larger sample size is almost always better. However, for some relaxed designs, certain integer sample sizes more accurately approximate the relaxed weights than other nearby, even larger, sample sizes. The larger sample sizes will always perform better, but the gains may be marginal relative to the cost for the experimenter. The power() function will provide graphical and quantitative methods to assess these trade-offs.

148 7.5 Examples

In this section, we give a description of the package workflow using specific models. These examples include code snippets and sample output, and specifically focus on how models are encoded in CasADi’s symbolics, and how the experimental constraints are passed to the Design class. These example, among other, will serve as prototypes for first-time users in the package documentation so that those new to the package have several working examples to start from and modify when they seek to implement their own projects.

7.5.1 Optimal Design for Static Models

Here we begin by describing the NLOED package applied to a static model; one that does not require any numerical integration and therefore does not involve ODEs. As a first example we begin with a simple optogenetic dose response curve. Recall from previous chapters that an optogenetic system is one where gene expression can be activated by a the light intensity of a specific color. Here a Hill function is used to describe the expression of GFP as a function of the light intensity such that Lightn GFP = α + α . (7.18) 0 Kn + Lightn

Here α0, α, K and n are the parameters of interest, Light is the single experimental input and GFP is the observed expression of GFP and the single model observable. We assume some past experimental data has been collected for this model but that the user wishes to improve the accuracy of the confidence intervals as efficiently as possible. We assume the user is also confident enough to assert that the distribution of the GFP expression level is normally distributed with a standard deviation of about 5% of the mean expression level. The user begins by starting their preferred Python interface and importing the required packages. Listing 17 shows the the required commands in lines 1-6.

149 1 import numpy as np 2 import pandas as pd 3 import casadi as cs 4 import matplotlib.pyplot as plt 5 from nloed import Model 6 from nloed import Design

Listing 17: Import statements for using the NLOED package, including the Model and Design class as well as other common Python numerical libraries.

Next in Listing 18, the user creates CasADi symbols for the input and parameters. This occurs in lines 8 and 9, note that the names here can be arbitrary. These lines make use of CasADi’s cs.SX.sym() function to create two symbol vectors. The call to cs.SX.sym() for creating the experimental input has a single dimension (second argument omitted). The second call to cs.SX.sym() for the parameters has four dimensions, one for each parameter of interest. In lines 11-14 we define the named parameters used in the model definition to be the exponentiated values of the parameters in the parameters vector. This is a log trans- formation and it ensures that the named parameter values are always positive. This type of transformation is an important consideration when working with bounded parameter ranges. Asymptotic expressions for parameter variability, like the Fisher information ma- trix, are more cumbersome to compute for bounded parameter domains and thus NLOED assumes the user either transforms their parameters to avoid the need for bounding or that the feasible parameter range is so distant from the bounds they can safely be ignored during model calibration. Here we have chosen a simple transformation to avoid the need for bounds. A log transformation can also sometimes improve the numerical performance of the design and fitting algorithms. The named parameters are given for clarity but the user could choose to perform the transformation and model definition together for brevity.

150 7 #define input and parameter symbols 8 inputs= cs.SX.sym('inputs') 9 parameters= cs.SX.sym('parameters',4) 10 #log-transormation of the parameters 11 alpha0= cs.exp(parameters[0]) 12 alpha= cs.exp(parameters[1]) 13 n= cs.exp(parameters[2]) 14 K= cs.exp(parameters[3]) 15 #define the deterministic model for the GFP mean 16 gfp_mean= alpha0+ alpha*inputs**n/(K**n+inputs**n) 17 #assume some hetroskedasticity, std_dev 5% of mean expression level 18 gfp_var=(0.05*gfp_mean)**2 19 #link the deterministic model to the sampling statistics (here normal ,→ mean and variance) 20 gfp_stats= cs.vertcat(gfp_mean, gfp_var) 21 #create a casadi function mapping input and parameters to sampling ,→ statistics (mean and var) 22 gfp_model= cs.Function('GFP',[inputs,parameters],[gfp_stats])

Listing 18: An example of building a CasADi function for the deterministic component of the optogenetic dose-response model, before making a call to the Model constructor.

In line 16, the mean GFP response, gfp mean, is defined in terms of the named parameters and the input. In line 18 the variance of the GFP observations, gfp var, is defined to be the square of 5% of the mean GFP expression. In line 20 the mean and variance are concatenated into a single vector, this is mainly done for clarity and it could be merged into the following line. In line 22, a CasADi function, gfp model, is defined using CasADi’s cs.Function() constructor, this function maps the input and parameters to the GFP sampling statistics. Up to this point the naming of all variables and CasADi symbols was arbitrary, however now the string passed to the Function() constructor (in this case ‘GFP’) will become the name of the observation variable within the NLOED Model. Having created a CasADi symbol for the GFP observation variable, the user can now construct an NLOED Model instance. Listing 19 demonstrates this process. In line 24, the gfp model function is tupled with the label Normal indicating that it describes the sampling statistics of a normal random variable. This tuple is placed in the list observ list. If the model had more observation variables they would also be entered as tuples in the same list, however as the current model has a single observation dimension, the list is a singleton.

151 In lines 26 and 28, names for the input and parameters are given in vectors input names and parameter names respectively. In line 30, the NLOED model constructor is called with the three preceding lists as arguments. The model object variable now contains an instance of an NLOED Model class encoding the dose-response model.

23 # create observation list, add model function with 'Normal' label as ,→ tuple 24 observ_list=[(gfp_model,'Normal')] 25 #create names for inputs 26 input_names=['Light'] 27 #create names for parameters 28 parameter_names=['log_Alpha0','log_Alpha','log_n','log_K'] 29 #instantiate nloed model class 30 model_object= Model(observ_list,input_names,parameter_names)

Listing 19: Code showing the instantiation of an NLOED Model instance for the optoge- netic dose-response model.

We assume the user has some preliminary data describing the GFP-intensity does re- sponse relationship, shown as a dataframne in Figure 7.22. Here there are triplicate obser- vations at four different light intensities 0.1, 3.0, 6.0 and 10.0. (We use this dataset as a stand in for real experimental data, however it was actually generated using the sample() function from the Model class using parameter vector [2, 10, 2, 3] which we pretend we do not know for this analysis.) We assume this initial data is contained in the dataframe init data. The user can perform an initial fit to the preliminary data using the Model class’s fit() function. Listing 20 demonstrates this procedure. In lines 32-33, fit options are set, specifying a range for the parameter pre-search. Here a 74 element grid of ini- tial parameters vectors are distributed over the region in parameter space specified by the bound tuple list. The pre-fitting search evaluates each of these points for a good can- didate starting point for the maximum likelihood optimization. Using the pre-search is ideal when an initial parameter guess is not possible, as in this case. In line 35, the fit() function of the model object is used to fit the model to the data in init data. In line 37, we extract the fit parameter values into a Numpy array, numerically they correspond to [1.87, 12.20, 1.31, 3.72]

152 Figure 7.22: An example dataset for the optogenetic dose-response model, with triplicate measurements of GFP taken at four different light levels.

31 #set options to use a simple initial search 32 fit_options={'InitParamBounds':[(-1,2),(1,3),(-1,2),(-1,2)], 33 'InitSearchNumber':7} 34 #fit the model to the initial data 35 fit_info= model_object.fit(init_data, options=fit_options) 36 #extract the parameter values 37 fit_params= fit_info['Estimate'].to_numpy().flatten()

Listing 20: Code showing the optogenetic dose-response being fit to an initial dataset using the fit() function.

To get a sense for the initial uncertainty in the parameter values we can use the evaluate() function in the Model class to generate the asymptotic covariance matrix for the inital data’s design. Listing 21 shows this process, in line 39-41 we enter the design information for the initial dataset. In lines 43-44, the covariance matrix is requested and the method for computing the matrix is specified in the options dictionary. In line 46 the evaluate() function is called from the model object, using the fit parameters val- ues stored in the fit params array. In lines 48-49, we compute the approximate Wald confidence interval bounds.

153 38 # enter the initial design information 39 init_design= pd.DataFrame({'Light':[.1,3,6,10], 40 'Variable':['GFP']*4, 41 'replicates':[3]*4}) 42 #request the asymptotic covariance matrix 43 eval_options={'Method':'Asymptotic', 44 'Covariance':True} 45 # call evaluate() to compute the asymptotic covariance 46 asymptotic_covariance= ,→ model_object.evaluate(init_design,fit_params,eval_options) 47 #compute the asymptotic upper and lower 95\% bounds 48 asymptotic_lower_bound= fit_params- ,→ 2*np.sqrt(np.diag(asymptotic_covariance)) 49 asymptotic_upper_bound= fit_params+ ,→ 2*np.sqrt(np.diag(asymptotic_covariance))

Listing 21: Code showing the use of the evaluate() function to generate the asymptotic covariance matrix and Wald confidence interval bounds for the initial dataset.

The resulting confidence intervals are printed in Figure 7.23. As we know the ‘true’ values from which the data was generated, we can see the intervals contain the data- generating parameter vector, however the point estimate could certainly be improved. The user, who only has experimental data, would not know the true error, however the width of the confidence intervals is an indicator that accuracy could be improved.

Figure 7.23: A print out of the returned 95% Wald confidence bounds for the optogenetic dose-response model fit to the initial dataset.

The parameter uncertainty captured in the asymptotic covariance matrix can also be used to approximate how much prediction uncertainty there is conditioned on our uncer- tainty in the parameter values and our knowledge of the sampling statistics. Listing 22 demonstrates how this is done using the Model class’s predict() function; in line 51 the

154 asymptotic covariance matrix is converted to a Numpy matrix. In line 53-54, we specify the model predictions that are desired for plotting. We request 100 light levels linearly spaced between 0.1 and 10, all of which are of the GFP observation variable. In lines 56-57, we specify that the predict() function should return both prediction and observation intervals in the options dictionary. In lines 59-62, the predict() function is called at the estimated parameter values.

50 #convert the covariance matrix to a Numpy array 51 covariance_matrix= asymptotic_covariance.to_numpy() 52 #select prediction intputs 53 prediction_inputs= pd.DataFrame({'Light':np.linspace(0.1,10,100), 54 'Variable':['GFP']*100}) 55 #request prediction and observation intervals 56 prediction_options={'PredictionInterval':True, 57 'ObservationInterval':True} 58 #call predict() 59 predictions= model_object.predict(prediction_inputs, 60 fit_params, 61 covariance_matrix= covariance_matrix, 62 options=prediction_options)

Listing 22: Code showing the use of the predict function to generate the mean dose response given the parameter estimates, along with 95% prediction and observation intervals using the asymptotic covariance matrix.

The result of the call to the predict() function is shown in Figure 7.24. Here the pre- dicted mean GFP level at the parameter estimates is shown in dark blue. The blue region surrounding the prediction indicates the asymptotic approximation of the 95% confidence region for the mean GFP response given the parameter uncertainty. Here we can see that a large amount of uncertainty regarding model behaviour is concentrated in the light in- tensity ranges between 0 and 2. This indicates that the constraints the initial data places on the model leaves this region subject to great uncertainty and future experiments will ideally provide better constraints on this region. The orange region indicates the approx- imate 95% bounds on the data, meaning that given uncertainty in both the parameters and sampling error we would expect, approximately, that 95% of the data would fall in this region. Given the analysis of the initial uncertainty in the parameter estimates and the predic-

155 Figure 7.24: Mean GFP response (blue line), 95% prediction intervals (blue region) and 95% observation intervals (orange region) for the optogenetic model after fitting to the initial dataset (orange dots). tion accuracy, the user will likely wish to improve model fit in the next round of planned experiments. Listing 23 shows the process of defining a Design class instance to generate an optimal design for the next set of measurements. Here we choose to treat the light level as continuous as the intensity can be varied to a fine degree in the lab relative the model’s sensitivity over that range. In lines 64-67, we initialize the continuous input options so that we consider four unique light levels, just like in the initial experiment, that are free to be set between the light intensity bounds of 0.1 to 10. We assume these bounds are the limits of the experimental equipment. In line 68, we specify the initial design as having a weight of 0.33. Given the initial dataset contained 12 measurements, this assumes the next round of experiments will make approximately 24 GFP measurements. Here init design contains the design for the initial dataset; it is a dataframe similar to the init data but specifying replicate counts instead of observations. In lines 70-72, the Design class constructor is

156 called and an instance is returned named design object.

63 #set 'Light' as a continuous input with 4 unique levels 64 continuous_inputs={'Inputs':['Light'], 65 'Bounds':[(.01,10)], 66 'Structure':[['x1'],['x2'],['x3'],['x4']]} 67 #set fixed design dictionary with init design and weight 68 fixed_dict={'Weight':0.33,'Design':init_design} 69 # generate the optimal design object 70 design_object= Design(model_object,fit_params,'D', 71 fixed_design= fixed_dict, 72 continuous_inputs= continuous_inputs) 73 #extract the relaxed design structure 74 relaxed_design= design_object.relaxed() 75 #set the sample size to 30 76 sample_size= 24 77 #generate a rounded exact design 78 exact_design= design_object.round(sample_size)

Listing 23: Code showing the process of creating an optimized Design object for the optogenetic dose-response model. Here the relaxed design is also returned and an exact design is generated through rounding.

In line 76, we use the relaxed() function of the Design class to return the relaxed design. This dataframe is shown in Figure 7.25. We can see that the optimal relaxed design consists of four unique light levels with asymmetric weights. It is not surprising that approximately 40% of the sampling weight is concentrated at a light level of 0.66, an input that is within the uncertainty bulge in the prediction interval plot in Figure 7.24. In line 76 of Listing

Figure 7.25: Output of the relaxed design from the relaxed() function for the optogenetic dose-response model.

157 23, we set the sample size to 24 and in line 78 the round() function of the Design class is used to generate an exact design. The exact design is shown in Figure 7.26.

Figure 7.26: Output showing the exact design for the optogenetic dose-response model generated from the round() function.

Before implementing the exact design in the laboratory it is useful to assess what the optimal designs expected utility is when combined with the initial data. This can be done by concatenating the initial and optimal design dataframes as follows:

combined_design= pd.concat([init_design, exact_design], ,→ ignore_index=True)

The combined design dataframe can then analysed using the same code shown in Listings 21 and 22. The resulting expected asymptotic confidence intervals for the combined data are shown in Figure 7.31. Here we see that the intervals are expect to shrink considerably after adding the optimal data. In addition, we can generate prediction and observation

Figure 7.27: Expected 95% confidence intervals for the optogenetic dose-response model parameters computed using the evaluate() function after combining the initial and opti- mal designs. intervals with the expected asymptotic convariance matrix of the combined data in a similar manner to that used for the initial data. The prediction intervals are also expected to shrink considerably after implementing the optimal design, as shown in Figure 7.28 At this point the user would perform the optimal experiment and return with the new data. Assuming the user has imported these observations into a dataframe named

158 Figure 7.28: A plot of the 95% prediction and observation intervals under the asymp- totic covariance matrix from the combined initial and optimal designs, computed with the predict() function. optimal data, a combined dataset can be generated and the model can be re-fit. This process is shown in Listing 24. In line 81 the initial dataset and optimal dataset are combined into a single dataframe. In line 83, an options dictionary is created to request the likelihood contours from the fitting algorithm so that we can visualize the likelihood profiles and 2D contour projections as diagnostics for the final fit. In line 85, the fit() is called to fit the model to the combined dataset. In line 87, the fit parameters are extracted from the from the returned dataframe into a Numpy array. The resulting parameter vector is [1.96, 10.04, 1.93, 2.98].

159 80 #combine the initial and optimal design 81 combined_data= pd.concat([init_data, optimal_data], ignore_index=True) 82 #request contours and use a simple initial search 83 fit_options={'Confidence':'Contours'} 84 #fit the model to the initial data 85 fit_info= model_object.fit(combined_data,start_param= fit_params, ,→ options=fit_options) 86 #extract the parameter values 87 fit_params= fit_info['Estimate'].to_numpy().flatten()

Listing 24: Code showing the concatination of the initial and optimal datasets, and their combined fitting to the optogenetic dose-response model. Fitting is called with a request for likelihood confidence contours for diagnostic purposes.

During fitting the fit() function generates the likelihood profiles, trace projections and contour projections shown in Figure 7.29. Here the profiles (blue curves along the diagonal plots) are nearly parabolic and the profile traces make ‘X’ shapes (blue and orange curves in the lower triangular plots). Some irregularity is detectable in the eccentricity of the 95% contour traces which are not quite elliptical, and some curvature can be seen in the profile traces which deviate from linearity at their endpoints. Overall this diagnostics looks reasonable and the data appears to constrain the model parameters well suggesting the ap- proximations being used to assess parameter accuracy are themselves accurate. Likelihood based intervals are also generated during fitting and are given in the returned dataframe. The likelihood intervals for this example are shown in Figure 7.30. Wald-type intervals can also be generated by using the evaluate() function as shown in Listing 21. The Wald intervals for this example are shown in Figure 7.30. These intervals differ from those shown in Figure 7.27 only in that the parameter estimate has now shifted with the new data; this means the center of the interval moves but their widths are nearly the same under an exponential transformation (needed due to the log-transformed parameters). Comparing the likelihood-based and Wald-type intervals suggests reasonably good agreement in the intervals with the likelihood based intervals being slightly wider and more conservative. This lends support to the argument that the model has been calibrated with reasonable accuracy and that we are in a signal-to-noise regime where the approximations used in the design and diagnostics are accurate. In fact here, as we know the ‘true’ data-generating parameters, we can confirm that the resulting estimate is very near the true value.

160 Figure 7.29: Diagnostic plots generated by the fit() function for the optogenetic dose- response model, including likelihood hood profiles, trace projections and contour projec- tions.

Figure 7.30: Output of the 95% parameter confidence intervals generated using the profile likelihood functionality of the fit() function.

Figure 7.31: Updated 95% Wald confidence intervals for the optogenetic dose-response model computed at the parameter estimates generated from the combined initial and op- timal datasets.

161 7.5.2 Optimal Design for Dynamic Models

Encoding dynamic models in CasADi symbolics is somewhat more complicated than static models. The simplest method for doing so is to encode the numerical procedure itself as a symbolic structure. This has the advantage that the entire algorithm can be differenti- ated both for sensitivity and FIM computation, as well as within the optimizer. However this process can be confusing for first-time users and its scalability is limited to smaller dynamical systems. CasADi also offers external interfaces to third-party integration and sensitivity analysis packages such as CVODES as well as tools for creating differential- algebraic models [196]. In theory these can also be used in NLOED however they currently need further testing as of the date of writing. As a first example of a dynamic system we will address a two-state model of gene expression, including states for both mRNA and protein levels. The dynamic system can be written as d[RNA] α = − δ[RNA], dt 1 + K [Inducer] (7.19) d[Protein] β = − γ[Protein]. dt L 1 + [RNA]

Here expression of mRNA from the promoter is assumed to be controlled by an experimen- tally controlled inducer whose concentration is [Inducer] above. The state variable [RNA] is the mRNA concentration, the presence of which initiates translation of the corresponding protein. The concentration of the protien, [Protein], is the second state variable. Here α, β, K, L, δ and γ are parameters to be estimated. We assume the user is going to perform time series experiments on the proposed model and is looking to design such experiments for improved parameter estimation.

162 1 #create state variable vector 2 states= cs.SX.sym('states',2) 3 #create control input symbol 4 inducer= cs.SX.sym('inducer') 5 #create parameter symbol vector 6 parameters= cs.SX.sym('parameters',6) 7 #log-transformed parameters 8 alpha= cs.exp(parameters[0]) 9 K= cs.exp(parameters[1]) 10 delta= cs.exp(parameters[2]) 11 beta= cs.exp(parameters[3]) 12 L= cs.exp(parameters[4]) 13 gamma= cs.exp(parameters[5]) 14 #create symbolic RHS 15 rhs= cs.vertcat(alpha*inducer/(K+ inducer)- delta*states[0], 16 beta*states[0]/(L+ states[0])- gamma*states[1]) 17 #create casadi RHS function 18 rhs_func= cs.Function('rhs_func',[states,inducer,parameters],[rhs])

Listing 25: Code showing the creation of a CasADi function for the RHS of a two-state ODE model for the mRNA-protein dynamic model.

Encoding a dynamic model in CasADi symbolics for NLOED begins with the same import commands as the static models, see Listing 17. Following this the user begins by creating a symbolic expression and CasADi function for the right-hand side (RHS) of the dynamic system. Listing 25 shows this process. In line 2, symbols are created for the state variables, mRNA and protein concentration. In line 4 a symbol is created for the inducer concentration. Line 6 shows the creation of a symbol vector for the model parameters. In lines 8-13, we perform a similar log transformation as that done in the static model, this ensures non-negativity of the named parameter values and improves numerical performance. In line 15-16, a symbol for the RHS of the system is created, containing symbolic expression for the equations given above. In line 18, a CasADi function is created mapping the current state and inducer levels as well as the parameters to the derivatives expressed by the RHS.

163 19 #time step size 20 dt=1 21 # Create symbolics for RK4 integration, as shown in Casadi examples 22 k1= rhs_func(states, inducer, parameters) 23 k2= rhs_func(states+ dt/2.0*k1, inducer, parameters) 24 k3= rhs_func(states+ dt/2.0*k2, inducer, parameters) 25 k4= rhs_func(states+ dt*k3, inducer, parameters) 26 state_step= states+ dt/6.0*(k1+2*k2+2*k3+ k4) 27 # Create a function to perform one step of the RK integration 28 step_func= cs.Function('step_func',[states, inducer, ,→ parameters],[state_step])

Listing 26: Code showing the creation of a symbolic implementation of a fourth-order Runge-Kutta integrator using the RHS function for the mRNA-protein dynamic model.

Recall that CasADi functions can be used to return numeric values or new symbolic expressions depending on the inputs provided to them. The CasADi function for the RHS can therefore be used to create new symbols for numerical time-stepping of the state vector by a given integration method. In Listing 26, imitating examples given in the CasADi documentation, we implement a fourth-order Runge-Kutta algorithm (RK4) [196]. In line 20, the time step-size is set, and in lines 22-24, the four incremental slopes are computed using the RHS CasADi function. The inputs to the RHS function are symbols: states, inducer, parameters, and therefore the slopes: k1, k2, k3, and k4 are also symbols. In line 26, the incremental slopes are combined to create a symbol for a full time step of the length set in line 20. In line 28, a CasADi function is created mapping the current state, inducer and parameter values to the next state values.

164 29 # create a symbol for the initial inducer level 30 initial_inducer= cs.SX.sym('init_inducer') 31 #define the steady state initial states in terms of the initial inducer 32 init_mrna=(alpha/delta)*initial_inducer/(K+initial_inducer) 33 ini_prot=(beta/gamma)*init_mrna/(L+init_mrna) 34 # zip the initial states into a vector 35 initial_states= cs.vertcat(init_mrna, ini_prot) 36 #create a vector for inducer levels in each control interval 37 inducer_vector= cs.SX.sym('inducer_vec',3) 38 #merge all inducer levels into a single experimental inputs vector 39 inputs= cs.vertcat(initial_inducer,inducer_vector)

Listing 27: Code showing how the initial conditions for the mRNA-protein model’s steady states are encoded symbolically. Composition of the experimental input vector is also illustrated.

The RK4 time-stepping CasADi function, step func(), allows us to build up integrated solution curves of the model system as symbolic expressions. At this point we have a fair degree of flexibility in how the problem will be posed to NLOED via the Model class. We specifically need to choose how the initial conditions and inducer concentration throughout a single time series are encoded as NLOED model inputs, and how the state variables at various time points throughout the series are encoded as model observation variables. An example of our chosen formulation of the problem is shown in Listing 27 and 28. We have opted to have the system start at steady state with respect to a variable input level; in Listing 27, lines 30-35, the initial conditions are implemented symbolically. In line 30, a symbol, initial inducer, is created for the initial inducer level. In lines 32-33 we compute the steady state values of the two state variables in terms of the initial inducer concentration. In line 35, we concatenate the initial state variables into a single vector, initial states, containing the initial states of the system. In line 37, we create a vector, inducer vector, containing the three inducer levels, one for each of the three control intervals over the duration of the experiment (see further discussion below). In line 39 we concatenate the initial inducer and inducer vector to form the overall set of inputs for the designed time-series experiment. This means each experiment has four experimental inputs, the initial inducer concentration to which the system starts in steady state, and three other inducer concentrations applied sequentially through the experimental duration.

165 40 #3 samples per cntrl interval, 3 cntrl intervals 41 #control intervals are 1+2+3=6 steps long: 42 # cntrl_int1 cntrl_int2 cntrl_int3 43 #|-1---2-----3||-1---2-----3||-1---2-----3| 44 #set number of control intervals 45 num_cntrl_intervals=3 46 #define a sample pattern to apply in each control interval 47 sample_pattern=[1,2,3] 48 #lists to store symbols for each sample point, and times of each sample 49 sample_list, times= [],[] 50 # set the initial states and initialize the step counter 51 current_state, step_counter= initial_states,0 52 #loop over control invervals 53 for interval in range(num_cntrl_intervals): 54 # loop over sample pattern 55 for num_stps in sample_pattern: 56 #iterate steps indicated by sample pattern 57 fork in range(num_stps): 58 #propagate the state variables via integration 59 current_state= step_func(current_state, inducer_vector[interval], ,→ parameters) 60 step_counter+=1 61 #save the state symbols and times of each sample 62 sample_list.append(current_state) 63 times.append(step_counter*dt)

Listing 28: Code showing how the symbolic integration is used to implement a specific sampling and observation pattern, and how observation variables for various states and time points are collected.

Having prescribed the model inputs we now need to collect the observation variables (at multiple time points) into an observation list for the Model constructor. To do so we must increment over the time series using the RK4 time stepping, implementing the inducer control scheme and collecting time points through the looping process. This procedure is shown in Listing 28. The current time-stepping algorithm is ideal for a step-wise input in the inducer concentration and we choose to implement three control intervals, meaning that each time course can have three different levels of inducer concentration, occurring

166 sequentially each for the same length of time. In line 45, we therefore set the number of control intervals to three. In our case, the control intervals are assumed to be relatively long compared to the relaxation time of the system, as such we will want the option to select several observation time points within each sampling interval. Here we choose to place candidate observation points at one, two and three time steps after the beginning of the control interval. In line 47 we create a list containing the number of time steps to advance before taking an observation. Both the number of control intervals, and the time steps before observation, can be used to construct the looping structure to iterate over the experiment. In line 49, we create lists to store CasADi symbols for the observed state at various time points as well as the times at which these occur. In line 51, we initialize the system state to the steady state values in the initial state variable and set the step counter to 0. Beginning on line 53, we loop over the three control intervals; each pass through this loop integrates a single control interval over which there is a constant inducer level. On line 57 we loop over the sample pattern array, such that the loop variable num stps will contain the number of integration steps that need to be incremented. This loop effectively iterates over observations within each control interval. As there are three possible observation within each control interval, this loop will repeat three times for each control interval; however, note that the time between observations is not the same, as set by the sample pattern array. In line 57, we enter the time stepping and iterate over the specified number of steps using step func. In the time stepping loop, we apply the RK4 step function in line 59 to the current state vector in current state with the prescribed inducer level specified by inducer vector[interval]. The step counter is also incremented in line 60, to track the number of RK4 steps taken and thus the time elapsed. In line 62 we store the state vector, which at this point corresponds to an observation time point; the time is also computed and recorded in line 63.

167 64 # create list for observation structure 65 observation_list=[] 66 #create list to store response names 67 observation_names, observation_type, observation_times= [], [],[] 68 # loop over samples (time points) 69 fori in range(len(sample_list)): 70 #create a unique name for mrna and prot samples 71 mrna_name= 'mrna_'+'t'+"{0:0=2d}".format(times[i]) 72 prot_name= 'prot_'+'t'+"{0:0=2d}".format(times[i]) 73 #create mean and var tuple for mrna and prot observ. 74 mrna_stats= cs.vertcat(sample_list[i][0], 0.005) 75 prot_stats= cs.vertcat(sample_list[i][1], 0.005) 76 #create casadi function for mrna and prot stats 77 mrna_func= cs.Function(mrna_name,[inputs,parameters],[mrna_stats]) 78 prot_func= cs.Function(prot_name,[inputs,parameters],[prot_stats]) 79 #append the casadi function and distribution type to obs struct 80 observation_list.extend([(mrna_func,'Normal'),(prot_func,'Normal')]) 81 #store observation names, useful for plotting 82 observation_names.extend([mrna_name,prot_name]) 83 #store observation type 84 observation_type.extend(['RNA','Prot']) 85 #store observation time 86 observation_times.extend([times[i]]*2)

Listing 29: Code showing the assembly of the observ list argument for use in the Model class constructor call for the mRNA-protein dynamic model.

After completing the loop structure shown in Listing 28, the result is a list, sample list, containing symbols for the each state variable at the observation time points under con- sideration. It remains for us to construct CasADi functions for each of these symbols, before they can be passed to the Model constructor via an observation list. We also need to specify the distribution type assumed for each state variable; here we assume a nor- mal distribution with constant variance to keep the model simple for illustrative purposes. Listing 29 provides a loop in which the symbols list is used to construct CasADi functions each of which is then stored in an observation list for passage to the Model constructor. In line 65, the observation list list is declared and in lines 67 several other lists are initialized for storing the observation name, type (RNA or protein) and time; these are

168 valuable for plotting and organizing data later. On line 69, the loop for building up the observation list begins; this loop iterates over each element in the sample list containing the observation symbols. In lines 71-72, each state’s observation is given a unique string name, marking its type and time. On lines 74-75, the state observation statistics (mean and variance) are concatenated together. On lines 77-78 we construct CasADi functions mapping experimental inputs (the various inducer levels) and the model parameters to the observation statistics. On line 80, we append the CasADi functions in tuples with the Normal label, to the observation list. In lines 83-86 we also store the auxiliary information for plotting and data export in the previously declared lists.

87 #list the inpit and parameter names 88 input_names=['Init_Inducer','Inducer_1','Inducer_2','Inducer_3'] 89 parameter_names= ,→ ['log_Alpha','log_K','log_Delta','log_Beta','log_L','log_Gamma'] 90 #instantiate the model object 91 model_object= Model(observation_list, input_names, parameter_names)

Listing 30: Code showing the creation of an NLOED Model object for the mRNA-protein dynamic model.

Following the creation of the observation structure in the previous loop, we are ready to create the NLOED model. Shown in Listing 30, in lines 88 and 89 we declare the input and parameter names and on line 91 we create an NLOED model object named model object using the NLOED Model class constructor.

Figure 7.32: Output of the 95% Wald confidence intervals under the initial dataset for the mRNA-protein dynamic model, computed using the evaluate() function.

We can use code similar to code listings 20, 21 and 22 in the static model to for fitting, generating Wald-type confidence intervals and prediction bounds. In this example, we assume an initial dataset with a single observation at each possible time point during a

169 single time course. We assume the time course begins with an initial inducer concentration of 0 and the subsequent three inducer levels are set to 1, 0 and 3 respectively. Fitting is done as in the static model using the model object and the fit() function. Wald intervals are shown in Figure 7.32. The size of these bounds relative to the magnitude of the parameters is reasonably large and could be improved. Prediction and observation intervals can also be

Figure 7.33: Prediction of the mean mRNA and protein levels, along with the 95% pre- diction and observation intervals generated with the predict() function for the mRNA- protein dynamic model. generated using calls to the evaluate() function to generate parameter , and calls to the predict function to generate the mean and interval information. Figure 7.33 depicts the predicted mean response at each observation point in the initial experiment (shown as blue dots). Data from a simulation of the initial experiment is also shown (orange x’s), along with prediction intervals for the mean under parameter uncertainty (blue bands) and observation intervals for the data under both sampling and parameter uncertainty (orange bands).

170 92 #set all inducer inputs as continuous 93 continuous_inputs={'Inputs':['Init_Inducer','Inducer_1','Inducer_2','Inducer_3'], 94 'Bounds':[(.01,5),(.01,5),(.01,5),(.01,5)], 95 'Structure':[['I0','I1','I2','I3']]} 96 #create the fixed design dictionary 97 fixed_dict={'Weight':0.5, 'Design':init_design} 98 # generate the optimal design object 99 design_object= Design(model_object,fit_params,'D', 100 fixed_design= fixed_dict, 101 continuous_inputs= continuous_inputs) 102 #set the sample size to 18 103 sample_size= 18 104 #generate a rounded exact design 105 exact_design= design_object.round(sample_size)

Listing 31: Code showing the instantiation of a Design object for the mRNA-protein model, as well as the generation of an exact design with a sample size of 18.

To generate an optimal experiment we need to instantiate a Design instance; Listing 31 shows this process. In lines 93-95 we specify that all inputs will be treated as continuous, bounded between 0.1 and 5 and that a single inducer profile will be optimized. In line 97, we create a fixed design dictionary and include the initial design (contained in the dataframe init desgin) and weight it at 0.5. This weighting of the initial design implies the optimal design will be implemented with the same number of observations, 18, as the initial data. In lines 99-101, we call the Design constructor and instantiate the design object. in line 103, we set the sample size to 18, and in line 105 we generate an exact design with the round() function.

Figure 7.34: Output of the optimal exact design generated for the mRNA-protein dynamic model.

171 The optimal design generated from this process is shown in Figure 7.34. The resulting design sets an initial inducer concentration of 5 and then drops it to the minimum of 0.01 in the first control interval. In the second control interval, the inducer is increased slightly to 0.022 and in the final interval it is again increased to the maximum of 5. Various protein

Figure 7.35: Updated 95% confidence intervals for the mRNA-protein model, generated using the evaluate() function and the combined initial and optimal designs. and RNA measurements are taken at different times (coded in the Variable column) and replicate numbers. This type of design is possible if multiple biological replicates can be run with a shared inducer level. For example light or temperature induced systems may be suitable for this type of input and replication structure. In these situations multiple cultures can be grown (and thus multiple replicates sampled) in the same inducer sequence within their incubator or light box. If the user instead wishes to optimize an experiment where each replicate can have its own unique inducer course but multiple replicate observations cannot be made at each time point, the user can use the LockWeights option of the Design class and a different continuous input structure. However use of the LockWeights option prohibits time point selection and the user must use a fixed sampling schedule. After implementing the optimal design we can update the Wald intervals, shown in Figure 7.35. Here we see that interval sizes have shrunk. Figure 7.36 shows the new prediction and intervals for the original dataset. Here we see that the prediction intervals (blue bands) have shrunk somewhat from the initial data collection. Further improvements would be expected if the sample size was increased for the optimal design.

7.6 Discussion

The current version of NLOED implements much of the core functionality and many impor- tant supplemental features, however I intend to add a number of additional features before the initial release. Currently more testing and further user-interface design is needed be- fore the NLOED package can support optimal design over model or parameter uncertainty.

172 Figure 7.36: Updated 95% prediction and observation intervals for the mRNA-protein model, generated using the predict() function under the covariance found using evaluate() applied to the combined intitial and optimal designs.

Specifically, in future releases the models argument to the Design class will accept a list of Model instances as along as they have the same input structure. With a list of Model instances passed, the resulting design is optimized for an average objective across all of the models. This model averaging provides designs that can be used to fit any of the candidate models, providing robustness to model uncertainty (but not necessarily addressing model selection). In future releases, the local design optimization at a nominal parameter vector, θ¯, will also be supplemented with a pseudo-Bayesian approach which will allow for design optimization over a weighted average of several candidate parameter points [244]. To ac- complish this the parameters argument to the Design class will accept either mean and covaraiance information for a normal prior, or a list of weighted candidate parameter vec- tors. Much of the numerical structures for these model and parameter uncertainty features have already been implemented but they need further testing and documentation. Also, while the D-optimal objective is widely used and useful, support for Ds-optimal designs is also planned for the initial release as the Ds objective facilitates designs targeting specific parameters sets or even model selection in nested models [15]. The power() function of the Design class remains to be implemented; while not required for optimal design it provides important tools for assessing the sensitivity of confidence intervals to the overall sample

173 size. I also plan to provide greater flexibility and more user feedback within the round() function in the Design class so the user has more guidance and control over the genera- tion of exact designs. Lastly, I hope to improve the numerical stability of model fitting, likelihood profiling and the design optimization algorithms through additional testing and tuning of the default algorithm settings. I also have several secondary objectives for package extensions. These include adding diagnostics based explicitly on the general equivalence theorem, which in special cases can be used as a powerful tool for checking optimality and understanding design structure [40, 15]. I also aim to improve the user experience in implementing dynamic models, including helper functions for implementing the CasADi symbolic integration and for extracting data and design information at various time points. While NLOED already supports a large variety of experimental constraints, in future version I plan to allow the user to place non-linear constraints on continuous inputs as well as the sampling weights. I also hope to integrate other powerful numerical tools available through CasADi’s interfaces, including other optimization, numerical integration and sensitivity analysis packages (i.e. CVODES), and tools for constrained and differential-algebraic systems. Using these tools I would like to expand support and code optimization for models involving implicit functions, optimization and differential-algebraic systems in the future. Optimal experimental design has a long history of research; algorithms for optimizing designs for non-linear, dynamic models dates back to the 1950’s, see work by Box and Lucas [20]. In over a half-century of study, the area has been well researched and the theory, as well as numerical procedures, for optimal design thoroughly explored. In many ways it is a mature field, with few prospects for revolutionary new ideas. However, it remains a chal- lenge to translate the existing body of knowledge into practice for an evolving experimental discipline like systems biology; updated tools that focus on usability and easy-adoption are needed. While systems biology makes use of (psuedo-)mechanistic models and designed experimentation, model parameters are often only accessible via fitting. This situation poses unique challenges as the models can have complicated non-linear structures with a variety of observables, like in more fundamental sciences, but model calibration and testing often require statistical tools more common in econometrics and other medical or social sciences. Systems biology also includes a wide range of practitioners, from experimentalists with non-quantitative backgrounds from many areas of biology, to more quantitative ex- perts from fields like physics, engineering, computer science, and mathematics. Translating OED ideas into a set of tools that is easy to use yet flexible for such a wide range of models and users is difficult, especially as even the quantitative practitioners may lack familiarity with the advanced statistical diagnostics and fitting concepts on which OED relies. The NLOED package attempts to address some of the opportunities and challenges

174 listed above. While other packages exist, NLOED supports a unique blend of models and numerical algorithms specifically suited to systems biology models; with special consider- ation for the diversity of model structures (non-linear, dynamic, multi-input/output) and numerical implementations (numerically integrated and even implicitly defined models). NLOED also incorporates these tools with an updated toolset, making use of automatic differentiation and state-of-the-art open-source optimization via CasADi. This provides a flexible and extensible platform on which to build future capabilities. The package has been written in open-source object-oriented Python with special attentions being paid to interoperability with other third-party packages like Numpy and Pandas. Providing an OED package in Python is ideal as Python blends the open-source licensing and graphical support of languages like R, with the engineering and dynamic systems toolset found in MALAB, while also bringing its own strengths as a fully-functioning programming lan- guage. For user convenience and easy-adoption, NLOED supplements the core design capabilities with a flexible set of supporting functions for fitting, diagnostics and simu- lation – with equal attention paid to optimization as there is to qualitative assessment and modularity. These combined features differentiate NLOED from other software tools, giving it a modern implementation, well suited to systems biology, and good prospects for future extensibility.

175 Chapter 8

Conclusion

“If your experiment needs a statistician, you need a better experiment.” -Ernest Rutherford

The main goal in this thesis has been investigating the use of optimal experimental de- sign for calibrating models of microbial gene expression. I have specifically studied models that have yielded unique challenges or opportunities for the application of OED techniques. In applying OED methods to these various case-studies, I have been able to carefully study how models can be used to guide experimentation. I believe this careful consideration of the role of experimentation in model building is important to the wider goals of systems biology. As models and systems-level thinking are increasingly used to encode our un- derstanding of biological systems, it is important to consider how this scientific paradigm should inform experimentation and data collection. This thesis has been an attempt to contribute to this broader theme, and therein lie the main contributions of this work.

8.1 Summary

The results of this thesis have been largely theoretical. However they may have some impli- cations for practical experiments in future works. In Chapter4 we applied optimal design to a novel physiologically-aware model of gene expression. This work pointed towards po- tential future work in characterizing genetic parts using automated design of experiments. This work also formulated the optimal design problem – for novel models of this type – as an optimal control problem in the CasADi toolbox, which will provide some guidance

176 for researchers seeking to implement optimal design for similar systems in the future. In Chapter5, we specifically discussed sampling schedules, and illustrated an efficient op- timization approach for generating optimized schedules. While not groundbreaking, this work is timely as many researchers have increasingly begun to use automated culturing de- vices to perform time-course experiments on both natural and synthetic systems [154]. As noted in Chapter5, dynamic experiments are complex and in order to achieve an effective experiment, it is important to consider interactions between the applied perturbations and the observation schedule. Using more efficient methods for time-point selection within the overall experimental design procedure can help to ensure important dynamic behaviour is not missed in studying the dynamics of biological systems. Chapter6 focused on steady state experiments with more complex observation distributions. Many gene regulatory networks, such as oscillators, switches and other multi-stable bifurcating systems can have complicated steady state observation distributions for single-celled data. Little work has been done in designing experiments for recovering data from this type of experiment, most likely due to the complexity in formulating the likelihood and the required asymptotic measures of a design. However, steady state experiments can be easier to implement in practice and can be more repeatable than dynamic experiments. Work in Chapter6 helped introduce novel experimental design approaches for this scenario, and will hopefully point the way to better use of steady state experiments, and further research on experimental design for valuable but numerically difficult scenarios. Finally in Chapter7 I presented my recent work on the NLOED software package for experimental design. This package made use of automatic differentiation tools in developing a modelling and optimal design framework tailored to systems biology. The package also includes a number of other tools important for model building and diagnostics. Despite the primarily theoretical focus of this thesis, the NLOED software package will serve to make optimal design more accessible to practicing experimentalists. In line with this thinking, we are also currently working on some laboratory studies using the NLOED package. These ongoing works will serve to stress-test the software package in practical use and, in turn, NLOED will ideally assist in the study of real biological systems.

8.2 Some Critique

The work in this thesis has also revealed some shortcomings of optimal design methodology in a systems biology context. Under ideal conditions, optimal designs can significantly improve parameter estimation performance. However, I have found that a large increase in the overall sample size, or the use of novel observation variables, can sometimes do far more for model calibration than improving the design alone. By this I mean that adding a few measurements of previously non-measured species, or simply replicating a non-optimal but reasonably good experiment, can often improve parameter estimates considerably with little or no design work required. Mathematically this makes sense, because the confidence intervals of the parameter estimates generally shrink with the inverse square root of the sample size. Therefore, while each new sample has diminishing returns, if observations are cheap, enough measurements can always be taken to reach any desired confidence level. The value of additional measurements is significantly amplified when they are made on novel species; in that case the new measurement types provide novel information via the individual FIM matrices they contribute to the experiment's total sum of information. Optimal designs ensure that each new observation is used as effectively as possible, but – when observations are cheap and the starting design is reasonably good – an experimenter may gain more value for their time through replication or learning a new assay than through implementing optimal design.
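The sample-size argument can be made concrete with a toy calculation; the model, design points, and noise level below are invented for illustration. For a scalar-parameter model with known Gaussian noise, the per-observation Fisher information values simply add, and the asymptotic confidence interval shrinks with the inverse square root of the number of replicates:

```python
import numpy as np

# Toy model: y = theta * u + noise, with known sigma. Each observation at
# input u contributes (u / sigma)^2 to the total Fisher information.
sigma = 0.5
u_points = np.array([0.2, 1.0, 2.5])   # a fixed, "reasonably good" design

for n_rep in [1, 4, 16, 64]:
    fim_total = n_rep * np.sum((u_points / sigma) ** 2)   # FIMs add
    ci_half_width = 1.96 / np.sqrt(fim_total)             # asymptotic 95% CI
    print(f'{n_rep:3d} replicates -> CI half-width {ci_half_width:.4f}')
# Quadrupling the replicates halves the interval: diminishing returns,
# but any target precision is reachable if observations are cheap.
```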

I have also often found that intuitive designs are not as mathematically sub-optimal as one might expect. Experimenters naturally spread their observations out in the dimensions they know to be relevant to the system behaviour. This is almost never optimal, but it is a form of thorough hedging against unexpected behaviour, and it often ensures sufficient information for a reasonable parameter estimate. This suggests that, at least for low-dimensional models with simple response patterns, intuition may often be sufficient, as well as more robust than optimal design. In fact, when Bayesian priors are used to incorporate more uncertainty into optimal design algorithms, past works (e.g. see examples in [15]) have shown that such prior information results in a similar spreading of the input points selected by the optimal design algorithm. These results suggest that, in some sense, our experimental intuition is performing some very approximate Bayesian reasoning. In light of this, future work should perhaps focus on creating optimal design methods that better imitate experimental intuition. Allowing more uncertainty to be specified in the design optimization problem will result in designs that more thoroughly explore the input space and come closer to matching our expectations for a good overall design. Some work like this is already taking place; examples include methods that allow Bayesian priors over various modelling uncertainties [236, 157], or that directly include more complex sources of variability in the model specification (i.e. hierarchical models) [230]. The NLOED package made some contribution here by allowing more flexible specification of the observation distribution than has been considered in the past; however, more work is certainly needed. Future work should especially focus on the types of uncertainty and variability most relevant to systems biologists. While accommodating uncertainty directly in the design process has not been emphasized as much in the optimal design literature, work in this direction would be a return to an experimental philosophy emphasized within the wider DOE tradition; see [245].

My work on optimal experimental design and model calibration has also led me to believe that large multi-state dynamic models with many parameters have been somewhat overvalued in systems biology. These models often contain dozens of parameters, and their dynamics are often so complex that they can only be studied through simulation. However, these models are easier to build than the real system is to experiment on, and the simulation algorithms for these systems are much more tractable than the fitting and diagnostic methods. Optimal design for these systems, like fitting and diagnostics, is limited by computational costs. This means that it is much easier to study these models in theory than to connect them to the real world in a meaningful way. While this is a shortcoming of optimal design algorithms, I believe it may also suggest a shift in emphasis: the complexity of the model being studied should be dictated by the fitting, diagnostics, and experimental burden, not just by the available listing of biological interactions.

By focusing on overly complicated models, modellers are forced to use weaker diagnostic tools, such as Wald-type confidence intervals and least-squares fitting. These methods can be crude compared to the Bayesian, likelihood, and simulation-based methods available from statistics – tools available even for non-linear models [50, 242]. The number of observations required to parameterize more complex models also tends to grow super-linearly with the number of free parameters. For models with many states, the more layers of integration between a parameter and the observed state, the more sensitivity information is blunted, yielding insensitive and uninformative observations.

I believe these difficulties emphasize the benefits of experimenting on limiting regimes of larger systems. These limiting regimes could involve applying optimal design to a model's steady states, or to specific subsystems of the overall system in a piece-wise manner. This approach would be complementary to general techniques of model reduction, but perhaps with more explicit emphasis on using experimental design considerations to select an appropriate reduced model. Perhaps a model reduction's utility can, in some sense, be measured by its identifiability under optimal design. Conversely, the experimental design itself sometimes determines how much reduction is feasible before the reduced model can no longer adequately capture the experimental observations; a very simple experiment can accommodate a much simpler model. Thus experimental design, model reduction, and the limiting experimental scenarios targeted by the researcher are all intimately linked. Regardless, by considering more reduced models in more limiting scenarios, more attention can be paid to experimental design and fitting diagnostics, as well as to model assumptions like the observation distributions and the experimental protocols. I believe this will result in better fits, more confidence in model predictions, and more reproducible results. While this comes at the expense of focusing on some limiting behaviour of a system rather than its full dynamics, I believe this may be a useful trade-off in many situations.
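The gap between Wald-type intervals and likelihood-based alternatives is easy to demonstrate. The sketch below compares the two for a one-parameter exponential decay model with simulated data; all values are invented for illustration, and with a single parameter the "profile" reduces to the likelihood itself:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Toy comparison (assumed model and numbers, not thesis results):
# y = exp(-k * t) + Gaussian noise with known sigma.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 4.0, 8)
k_true, sigma = 0.8, 0.1
y = np.exp(-k_true * t) + sigma * rng.normal(size=t.size)

def nll(k):
    # Negative log-likelihood up to an additive constant (sigma known)
    r = y - np.exp(-k * t)
    return 0.5 * np.sum(r ** 2) / sigma ** 2

fit = minimize_scalar(nll, bounds=(0.01, 5.0), method='bounded')
k_hat = fit.x

# Wald interval: quadratic (curvature) approximation at the optimum
h = 1e-4
curv = (nll(k_hat + h) - 2 * nll(k_hat) + nll(k_hat - h)) / h ** 2
wald = (k_hat - 1.96 / np.sqrt(curv), k_hat + 1.96 / np.sqrt(curv))

# Likelihood interval: all k with 2*(nll(k) - nll(k_hat)) <= 3.84 (chi2, 1 dof)
ks = np.linspace(0.1, 2.0, 2000)
inside = ks[2 * (np.vectorize(nll)(ks) - fit.fun) <= 3.84]
profile = (inside.min(), inside.max())

print('Wald 95% CI:      ', wald)
print('Likelihood 95% CI:', profile)
# For strongly nonlinear models the two can differ markedly; the
# likelihood interval respects the actual shape of the likelihood.
```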

Work on my thesis, especially Chapter 4, has also led me to believe that optimal design, and modelling in general, is most valuable when applied to systems with operationally meaningful parameters. Systems biology models are often mechanistic, or at least pseudo-mechanistic, and so parameters may have an interpretation within the physical process of the system; for example, a parameter may describe the rate of interaction between two proteins. However, while the parameter value may tell you the specific rate, it is often not possible with available knowledge to link this rate to the exact amino acid residues responsible for the observed value (at least without considerably more experimental work). This means that even though a parameter may be mechanistically interpretable, it lacks meaning in an operational sense: you do not know what implementable changes to the system will cause changes to the parameter value. However, in some systems, especially gene expression, there is sufficient existing knowledge that parameters can be linked to specific operational perturbations of the system. This means that we know how to, at least hypothetically, perturb the parameter value via an achievable intervention.

In Chapter 4, the physiological model was of particular interest because the parameters provided a meaningful characterization of the genetic sequence and the host. The parameters in this model were not all mechanistic (i.e. the phenomenological model of the proteome), but they all had interpretations, and many could be operationally linked to possible experimental perturbations. For example, parameters governing transcription could be perturbed via changes to the promoter sequence [246], and the parameters for the host's translation capacity could be perturbed via mutation or antibiotic treatments [148]. For systems where parameters can be imbued with this extra operational meaning, accurate parameter estimation becomes more valuable and there are many more uses for a fit model. In this context, parameter estimates serve to characterize a specific configuration of the system, and if we make some of the prescribed perturbations we can use fitting to quantify their magnitude. For example, if we mutate the promoter sequence we would, as a first estimate, expect only the promoter-related parameters to shift upon refitting the model in Chapter 4. Models with this type of parameterization can also be more easily validated. If we have an independent estimate of the direction or magnitude of a given perturbation's effect on a parameter value (e.g. estimates of mRNA-ribosome binding energy under sequence alterations [190]), then upon perturbing the system we would expect to see that same direction or magnitude reflected in the model's parameter estimates after fitting to the new data. If this is not so, then either our model or our biological understanding of the perturbation could be in error, suggesting further refinements. This approach provides a means of stress-testing our models and system understanding.
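The perturb-and-refit logic described above can be expressed in a few lines. The following sketch is hypothetical throughout – the dose-response model, the "promoter mutant", and all numbers are invented – but it captures the check: refit the same model before and after an intervention and ask whether only the operationally linked parameter moved.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical perturb-and-refit check (all values invented).
def dose_response(u, vmax, K):
    return vmax * u / (K + u)

rng = np.random.default_rng(1)
u = np.array([0.1, 0.5, 1.0, 2.0, 5.0, 10.0])

# Baseline strain vs. a promoter mutant: only vmax should shift if our
# operational understanding of the perturbation is correct
y_base = dose_response(u, 10.0, 2.0) + 0.2 * rng.normal(size=u.size)
y_mut  = dose_response(u,  6.0, 2.0) + 0.2 * rng.normal(size=u.size)

p_base, _ = curve_fit(dose_response, u, y_base, p0=[5.0, 1.0])
p_mut, _  = curve_fit(dose_response, u, y_mut,  p0=[5.0, 1.0])
print('baseline (vmax, K):', p_base)
print('mutant   (vmax, K):', p_mut)
# If K also shifts substantially, either the model or our understanding
# of the perturbation is in error, prompting further refinement.
```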

In physics and chemistry, physical parameters can often be estimated somewhat directly because the experimental conditions can be reduced to a platonic simplicity. Unfortunately, living biological systems are inherently irreducible past a certain point, making direct measurements of in vivo parameter values very difficult. However, perturbing parameters and then quantifying the perturbations by fitting provides at least some independent verification of our systems-level models. This also takes advantage of the lower experimental burden in microbiology compared to other disciplines in which experiments may be very costly or impossible (e.g. economics, ecology, the social sciences). Conditions for this type of iterative perturbation and fitting are still somewhat rare in systems biology, but careful consideration by theorists of the meaning of their parameter values, and of the availability of experimental tools to perturb them, could make this multi-faceted approach more common. I also believe careful consideration of experimental design could play a useful role in this approach in the future, especially as it relies on refitting a similar model in multiple scenarios with an explicit emphasis on parameter estimation accuracy.

8.3 Future Prospects

By considering the above critiques carefully, I think it is possible to derive some general suggestions for which biological systems are most likely to benefit from optimal design methods in the future. To summarize, these are:

• Systems in which additional observations are costly or destructive, and in which experimentation is inherently iterative rather than occurring in large, cheap batches.

• Systems in which our intuition is likely to fail. This includes systems with many experimental input or observation dimensions, especially when they are linked by a complicated relationship.

• Systems with non-normal or complex error distributions. This includes multi-stable or quasi-stable systems, especially focusing on single-cell observations. Functional data, where the observation is a real-time function rather than a scalar, are also of interest.

• Systems which are computationally tractable for optimal design and related fitting diagnostics, especially future Bayesian methods. This includes steady state models or dynamic models with few hidden states.

• Systems where the link between parameter values and the system response is non-intuitive. For example, an increase in a parameter value can have an effect on the mean or variance of the response that is directionally conditional on other parameters, or that is possibly non-monotonic. This may include multi-stable systems, and deterministic approximations to stochastic models where parameters can have complex effects on both the observation mean and variance.

• Systems where the model structure is well established for the type of system but individual instances of the system require accurate parameterization. This would include characterizing design variants of synthetic systems.

• Systems with operationally meaningful parameters, that is, parameters with both an operational and a mechanistic/functional interpretation. This is especially common in well-studied areas like gene expression.

While the optimal design methods used in this thesis are not a panacea, I believe even the local methods used here can be of practical use in many other scenarios. I have often gained considerable insight into the implications of a given modelling decision by considering experimental design as part of the model building process. I have gained these insights even in cases where I ran an optimal design algorithm and found that a woefully inadequate design was produced. On investigation, it was rarely the optimal design method that had failed; more often I realized something unexpected about what the model structure or the assumptions implied about the system behaviour. I think even these qualitative insights make experimental design worth pursuing as a form of good modelling hygiene: to shake assumptions loose and get theorists thinking about the lab.

References

[1] Ferenc Mechler, John Hartigan, and Pamela Hartigan. MATLAB code for Hartigan’s dip statistic. http://www.nicprice.net/diptest/, March 2019.

[2] H Bremer and PP Dennis. Modulation of chemical composition and other parameters of the cell at different exponential growth rates. EcoSal Plus, 3(1), 2008.

[3] Mathew Stracy, Christian Lesterlin, Federico Garza De Leon, Stephan Uphoff, Pawel Zawadzki, and Achillefs N Kapanidis. Live-cell superresolution microscopy reveals the organization of RNA polymerase in the bacterial nucleoid. Proceedings of the National Academy of Sciences, 112(32):E4390–E4399, 2015.

[4] Ulrike Endesfelder, Kieran Finan, Seamus J Holden, Peter R Cook, Achillefs N Kapanidis, and Mike Heilemann. Multiscale spatial organization of RNA polymerase in Escherichia coli. Biophysical Journal, 105(1):172–181, 2013.

[5] Somenath Bakshi, Renée M Dalrymple, Wenting Li, Heejun Choi, and James C Weisshaar. Partitioning of RNA polymerase activity in live Escherichia coli from analysis of single-molecule diffusive trajectories. Biophysical Journal, 105(12):2676–2686, 2013.

[6] RA Fisher. The arrangement of field experiments. Journal of the Ministry of Agriculture, 33:503–515, 1926.

[7] Ronald Aylmer Fisher. Design of experiments. Br Med J, 1(3923):554–554, 1936.

[8] David R Hagen, Jacob K White, and Bruce Tidor. Convergence in parameters and predictions using computational experimental design. Interface Focus, 3(4):20130008, 2013.

[9] Samuel Bandara, Johannes P Schlöder, Roland Eils, Hans Georg Bock, and Tobias Meyer. Optimal experimental design for parameter estimation of a cell signaling model. PLoS Computational Biology, 5(11):e1000558, 2009.

[10] Jakob Ruess, Francesca Parise, Andreas Milias-Argeitis, Mustafa Khammash, and John Lygeros. Iterative experiment design guides the characterization of a light- inducible gene expression circuit. Proceedings of the National Academy of Sciences USA, 112(26):8148–8153, 2015.

[11] Jennifer AN Brophy and Christopher A Voigt. Principles of genetic circuit design. Nature Methods, 11(5):508, 2014.

[12] Alec AK Nielsen, Bryan S Der, Jonghyeon Shin, Prashant Vaidyanathan, Vanya Paralanov, Elizabeth A Strychalski, David Ross, Douglas Densmore, and Christopher A Voigt. Genetic circuit design automation. Science, 352(6281):aac7341, 2016.

[13] Michael B Elowitz and Stanislas Leibler. A synthetic oscillatory network of transcriptional regulators. Nature, 403(6767):335–338, 2000.

[14] Timothy S Gardner, Charles R Cantor, and James J Collins. Construction of a genetic toggle switch in Escherichia coli. Nature, 403(6767):339–342, 2000.

[15] Anthony Atkinson and Alexander Donev. Optimum Experimental Designs. Oxford University Press, 1992.

[16] Valerii V Fedorov and Sergei L Leonov. Optimal design for nonlinear response models. CRC Press, 2013.

[17] David Roxbee Cox and Nancy Reid. The theory of the design of experiments. CRC Press, 2000.

[18] Douglas C Montgomery. Design and analysis of experiments. John Wiley & Sons, 2017.

[19] Raymond H Myers, Douglas C Montgomery, and Christine M Anderson-Cook. Response surface methodology: process and product optimization using designed experiments. John Wiley & Sons, 2016.

[20] George EP Box and HL Lucas. Design of experiments in non-linear situations. Biometrika, 46(1/2):77–90, 1959.

[21] Gaia Franceschini and Sandro Macchietto. Model-based design of experiments for parameter precision: State of the art. Chemical Engineering Science, 63(19):4846–4872, 2008.

[22] Ankush Chakrabarty, Gregery T Buzzard, and Ann E Rundell. Model-based design of experiments for cellular processes. Wiley Interdisciplinary Reviews: Systems Biology and Medicine, 5(2):181–203, 2013.

[23] Stephen M Stigler. The history of statistics: The measurement of uncertainty before 1900. Harvard University Press, 1986.

[24] Kirstine Smith. On the standard deviations of adjusted and interpolated values of an observed polynomial function and its constants and the guidance they give towards a proper choice of the distribution of observations. Biometrika, 12(1/2):1–85, 1918.

[25] Stephen M Stigler. Gergonne’s 1815 paper on the design and analysis of polynomial regression experiments. Historia Mathematica, 1(4):431–439, 1974.

[26] C Radhakrishna Rao et al. RA Fisher: The founder of modern statistics. Statistical Science, 7(1):34–48, 1992.

[27] Stephen M Stigler et al. The epic story of maximum likelihood. Statistical Science, 22(4):598–620, 2007.

[28] Joan Fisher Box. RA Fisher and the design of experiments, 1922–1926. The American Statistician, 34(1):1–7, 1980.

[29] Do Jo Finney. The fractional replication of factorial arrangements. Annals of Eugenics, 12(1):291–301, 1943.

[30] Robin L Plackett and J Peter Burman. The design of optimum multifactorial experiments. Biometrika, 33(4):305–325, 1946.

[31] C Radhakrishna Rao. Factorial experiments derivable from combinatorial arrangements of arrays. Supplement to the Journal of the Royal Statistical Society, 9(1):128–139, 1947.

[32] George EP Box. An Accidental Statistician: The Life and Memories of George EP Box. John Wiley & Sons, 2013.

[33] George EP Box and Kenneth B Wilson. On the experimental attainment of optimum conditions. Journal of the Royal Statistical Society: Series B (Methodological), 13(1):1–38, 1951.

[34] Abraham Wald. On the efficient design of statistical investigations. The Annals of Mathematical Statistics, 14(2):134–140, 1943.

[35] Herman Chernoff. Locally optimal designs for estimating parameters. The Annals of Mathematical Statistics, pages 586–602, 1953.

[36] Gustav Elfving et al. Optimum allocation in linear regression theory. The Annals of Mathematical Statistics, 23(2):255–262, 1952.

[37] J Fellman. Gustav Elfving's contribution to the emergence of the optimal experimental design theory. Statistical Science, pages 197–200, 1999.

[38] Henry P Wynn. Jack Kiefer's contributions to experimental design. The Annals of Statistics, 12(2):416–423, 1984.

[39] Jack Kiefer. Optimum experimental designs. Journal of the Royal Statistical Society: Series B (Methodological), 21(2):272–304, 1959.

[40] Jack Kiefer and Jacob Wolfowitz. The equivalence of two extremum problems. Canadian Journal of Mathematics, 12(363-366):234, 1960.

[41] Peter Goos and Bradley Jones. Optimal design of experiments: a case study approach. John Wiley & Sons, 2011.

[42] Lynda V White. An extension of the general equivalence theorem to nonlinear models. Biometrika, 60(2):345–348, 1973.

[43] Kenneth G Russell. Design of Experiments for Generalized Linear Models. CRC Press, 2018.

[44] Lennart Ljung. System Identification: Theory for the User. Prentice Hall PTR, 1999.

[45] Eric A Strömberg, Joakim Nyberg, and Andrew C Hooker. The effect of Fisher information matrix approximation methods in population optimal design calculations. Journal of Pharmacokinetics and Pharmacodynamics, 43(6):609–619, 2016.

[46] Joshua F Apgar, Jared E Toettcher, Drew Endy, Forest M White, and Bruce Tidor. Stimulus design for model selection and validation in cell signaling. PLoS Computational Biology, 4(2):e30, 2008.

[47] Bence Mélykúti, Elias August, Antonis Papachristodoulou, and Hana El-Samad. Discriminating between rival biochemical network models: three approaches to optimal experiment design. BMC Systems Biology, 4(1):38, 2010.

[48] Juliane Liepe, Paul Kirk, Sarah Filippi, Tina Toni, Chris P Barnes, and Michael PH Stumpf. A framework for parameter estimation and model selection from experimental data in systems biology using approximate Bayesian computation. Nature Protocols, 9(2):439, 2014.

[49] Yudi Pawitan. In all likelihood: statistical modelling and inference using likelihood. Oxford University Press, 2001.

[50] George AF Seber and Christopher John Wild. Nonlinear Regression. John Wiley & Sons, Hoboken, New Jersey, 2003.

[51] Chien-Fu Wu. Asymptotic theory of nonlinear least squares estimation. The Annals of Statistics, pages 501–513, 1981.

[52] Ludwig Fahrmeir and Heinz Kaufmann. Consistency and asymptotic normality of the maximum likelihood estimator in generalized linear models. The Annals of Statistics, pages 342–368, 1985.

[53] Bruce Hoadley. Asymptotic properties of maximum likelihood estimators for the independent not identically distributed case. The Annals of mathematical statistics, pages 1977–1991, 1971.

[54] Ernesto San Martín and Fernando Quintana. Consistency and identifiability revisited. Brazilian Journal of Probability and Statistics, pages 99–106, 2002.

[55] Erich L Lehmann and George Casella. Theory of point estimation. Springer Science & Business Media, 2006.

[56] MJ Box. Bias in nonlinear estimation. Journal of the Royal Statistical Society: Series B (Methodological), 33(2):171–190, 1971.

[57] Paul Rilstone, Virendra K Srivastava, and Aman Ullah. The second-order bias and mean squared error of nonlinear estimators. Journal of Econometrics, 75(2):369–395, 1996.

[58] Erich Leo Lehmann. Elements of large-sample theory. Springer Science & Business Media, 2004.

[59] Valerii Fedorov. Optimal experimental design. Wiley Interdisciplinary Reviews: Computational Statistics, 2(5):581–589, 2010.

[60] Friedrich Pukelsheim and Sabine Rieder. Efficient rounding of approximate designs. Biometrika, 79(4):763–770, 1992.

[61] Adalbert Wilhelm. How to obtain efficient exact designs from optimal approximate designs. In COMPSTAT, pages 495–500. Springer, 1996.

[62] Lorens Imhof, Jesús Lopez-Fidalgo, and Wing-Keung Wong. Efficiencies of rounded optimal approximate designs for small samples. Statistica Neerlandica, 55(3):301–318, 2001.

[63] Sebastian M Castillo-Hair, Oleg A Igoshin, and Jeffrey J Tabor. How to train your microbe: methods for dynamically characterizing gene networks. Current Opinion in Microbiology, 24:113–123, 2015.

[64] Nicole A Repina, Alyssa Rosenbloom, Abhirup Mukherjee, David V Schaffer, and Ravi S Kane. At light speed: advances in optogenetic systems for regulating cell signaling and behavior. Annual Review of Chemical and Biomolecular Engineering, 8:13–39, 2017.

[65] Giorgia Guglielmi, Henning Johannes Falk, and Stefano De Renzis. Optogenetic control of protein function: from intracellular processes to tissue morphogenesis. Trends in Cell Biology, 26(11):864–874, 2016.

[66] Evan J Olson and Jeffrey J Tabor. Optogenetic characterization methods overcome key challenges in synthetic and systems biology. Nature Chemical Biology, 10(7):502–511, 2014.

[67] Jared E Toettcher, Delquin Gong, Wendell A Lim, and Orion D Weiner. Light-based feedback for controlling intracellular signaling dynamics. Nature Methods, 8(10):837, 2011.

[68] Andreas Milias-Argeitis, Sean Summers, Jacob Stewart-Ornstein, Ignacio Zuleta, David Pincus, Hana El-Samad, Mustafa Khammash, and John Lygeros. In silico feedback for in vivo regulation of a gene expression circuit. Nature Biotechnology, 29(12):1114–1116, 2011.

[69] Andreas Milias-Argeitis, Marc Rullan, Stephanie K Aoki, Peter Buchmann, and Mustafa Khammash. Automated optogenetic feedback control for precise and robust regulation of gene expression and cell growth. Nature Communications, 7:12546, 2016.

[70] Justin Melendez, Michael Patel, Benjamin L Oakes, Ping Xu, Patrick Morton, and Megan N McClean. Real-time optogenetic control of intracellular protein concentration in microbial cell cultures. Integrative Biology, 6(3):366–372, 2014.

[71] Jeffrey J Tabor, Anselm Levskaya, and Christopher A Voigt. Multichromatic control of gene expression in Escherichia coli. Journal of Molecular Biology, 405(2):315–324, 2011.

[72] Evan J Olson, Lucas A Hartsough, Brian P Landry, Raghav Shroff, and Jeffrey J Tabor. Characterizing bacterial gene circuit dynamics with optically programmed gene expression signals. Nature Methods, 11(4):449–455, 2014.

[73] Eric A Davidson, Amar S Basu, and Travis S Bayer. Programming microbes using pulse width modulation of optical signals. Journal of Molecular Biology, 425(22):4161–4166, 2013.

[74] Remy Chait, Jakob Ruess, Tobias Bergmiller, Gašper Tkačik, and Călin C Guet. Shaping bacterial population behavior through computer-interfaced control of individual cells. Nature Communications, 8(1):1535, 2017.

[75] Jacob Stewart-Ornstein, Susan Chen, Rajat Bhatnagar, Jonathan S Weissman, and Hana El-Samad. Model-guided optogenetic study of PKA signaling in budding yeast. Molecular Biology of the Cell, 28(1):221–227, 2017.

[76] Jared E Toettcher, Orion D Weiner, and Wendell A Lim. Using optogenetics to interrogate the dynamic control of signal transmission by the Ras/Erk module. Cell, 155(6):1422–1434, 2013.

[77] Jannis Uhlendorf, Agnès Miermont, Thierry Delaveau, Gilles Charvin, François Fages, Samuel Bottani, Gregory Batt, and Pascal Hersen. Long-term model predictive control of gene expression at the population and single-cell levels. Proceedings of the National Academy of Sciences USA, 109(35):14271–14276, 2012.

[78] Filippo Menolascina, Gianfranco Fiore, Emanuele Orabona, Luca De Stefano, Mike Ferry, Jeff Hasty, Mario di Bernardo, and Diego di Bernardo. In-vivo real-time control of protein expression from endogenous and synthetic gene networks. PLoS Computational Biology, 10(5):e1003625, 2014.

[79] Gianfranco Fiore, Giansimone Perrino, Mario di Bernardo, and Diego di Bernardo. In vivo real-time control of gene expression: A comparative analysis of feedback control strategies in yeast. ACS Synthetic Biology, 5(2):154–162, 2015.

[80] Jean-Baptiste Lugagne, Sebastián Sosa Carrillo, Melanie Kirch, Agnes Köhler, Gregory Batt, and Pascal Hersen. Balancing a genetic toggle switch by real-time feedback control and periodic forcing. Nature Communications, 8(1):1671, 2017.

[81] Nan Hao and Erin K O'Shea. Signal-dependent dynamics of transcription factor translocation controls gene expression. Nature Structural & Molecular Biology, 19(1):31–39, 2012.

[82] Anders S Hansen and Erin K O’Shea. Promoter decoding of transcription factor dynamics involves a trade-off between noise and control of gene expression. Molecular Systems Biology, 9(1):704, 2013.

[83] Corentin Briat, Ankit Gupta, and Mustafa Khammash. Antithetic integral feedback ensures robust perfect adaptation in noisy biomolecular networks. Cell Systems, 2(1):15–26, 2016.

[84] Tanya Tschirhart, Eunkyoung Kim, Ryan McKay, Hana Ueda, Hsuan-Chen Wu, Alex Eli Pottash, Amin Zargar, Alejandro Negrete, Joseph Shiloach, Gregory F Payne, et al. Electronic control of gene expression and cell behaviour in Escherichia coli through redox signalling. Nature Communications, 8:14030, 2017.

[85] Dan I Piraner, Arash Farhadi, Hunter C Davis, Di Wu, David Maresca, Jerzy O Szablowski, and Mikhail G Shapiro. Going deeper: Biomolecular tools for acoustic and magnetic imaging and control of cellular function. Biochemistry, 56(39):5202–5209, 2017.

[86] Erdal Toprak, Adrian Veres, Sadik Yildiz, Juan M Pedraza, Remy Chait, Johan Paulsson, and Roy Kishony. Building a morbidostat: an automated continuous-culture device for studying bacterial drug resistance under dynamically sustained drug inhibition. Nature Protocols, 8(3):555, 2013.

[87] Chris N Takahashi, Aaron W Miller, Felix Ekness, Maitreya J Dunham, and Eric Klavins. A low cost, customizable turbidostat for use in synthetic circuit characterization. ACS Synthetic Biology, 4(1):32–38, 2014.

[88] Dominick Matteau, Vincent Baby, Stéphane Pelletier, and Sébastien Rodrigue. A small-volume, low-cost, and versatile continuous culture device. PLoS One, 10(7):e0133384, 2015.

[89] Karl P Gerhardt, Evan J Olson, Sebastian M Castillo-Hair, Lucas A Hartsough, Brian P Landry, Felix Ekness, Rayka Yokoo, Eric J Gomez, Prabha Ramakrishnan, Junghae Suh, et al. An open-hardware platform for optogenetics and photobiology. Scientific Reports, 6:35363, 2016.

[90] Hsinkai Wang and Ya-Tang Yang. Mini photobioreactors for in vivo, real time characterization and evolutionary tuning of bacterial optogenetic circuit. ACS Synthetic Biology, 2017.

[91] A Estevez-Torres, A Yamada, and L Wang. An inexpensive and durable epoxy mould for PDMS. http://blogs.rsc.org/chipsandtips/2009/04/22/an-inexpensive-and-durable-epoxy-mould-for-pdms/, 2009. [Accessed: February 8, 2018].

[92] Todd A Duncombe, Augusto M Tentori, and Amy E Herr. Microfluidics: reframing biological enquiry. Nature Reviews Molecular Cell Biology, 16(9):554–567, 2015.

[93] Nicolas Szita, Karen Polizzi, Nicolas Jaccard, and Frank Baganz. Microfluidic approaches for systems and synthetic biology. Current Opinion in Biotechnology, 21(4):517–523, 2010.

[94] David S Kong, Todd A Thorsen, Jonathan Babb, Scott T Wick, Jeremy J Gam, Ron Weiss, and Peter A Carr. Open-source, community-driven microfluidics with Metafluidics. Nature Biotechnology, 35(6):523–529, 2017.

[95] Arthur D Edelstein, Mark A Tsuchida, Nenad Amodaj, Henry Pinkard, Ronald D Vale, and Nico Stuurman. Advanced methods of microscope control using µManager software. Journal of Biological Methods, 1(2), 2014.

[96] Christian Carsten Sachs, Alexander Grünberger, Stefan Helfrich, Christopher Probst, Wolfgang Wiechert, Dietrich Kohlheyer, and Katharina Nöh. Image-based single cell profiling: High-throughput processing of mother machine experiments. PLoS One, 11(9):e0163453, 2016.

[97] Ouriel Caen, Heng Lu, Philippe Nizard, and Valerie Taly. Microfluidics as a strategic player to decipher single-cell omics? Trends in Biotechnology, 35(8):713–727, 2017.

[98] Michael J Lawson, Daniel Camsund, Jimmy Larsson, Özden Baltekin, David Fange, and Johan Elf. In situ genotyping of a pooled strain library after characterizing complex phenotypes. Molecular Systems Biology, 13(10):947, 2017.

[99] Jack Kiefer. On the nonrandomized optimality and randomized nonoptimality of symmetrical designs. The Annals of Mathematical Statistics, pages 675–699, 1958.

[100] Friedrich Pukelsheim. Optimal Design of Experiments. SIAM, 2006.

[101] Ryan N Gutenkunst, Joshua J Waterfall, Fergal P Casey, Kevin S Brown, Christopher R Myers, and James P Sethna. Universally sloppy parameter sensitivities in systems biology models. PLoS Computational Biology, 3(10):e189, 2007.

[102] Kamil Erguler and Michael PH Stumpf. Practical limits for reverse engineering of dynamical systems: a statistical analysis of sensitivity and parameter inferability in systems biology models. Molecular BioSystems, 7(5):1593–1602, 2011.

[103] Joshua F Apgar, David K Witmer, Forest M White, and Bruce Tidor. Sloppy models, parameter uncertainty, and the role of experimental design. Molecular BioSystems, 6(10):1890–1900, 2010.

[104] Oana-Teodora Chis, Alejandro F Villaverde, Julio R Banga, and Eva Balsa-Canto. On the relationship between sloppiness and identifiability. Mathematical Biosciences, 282:147–161, 2016.

[105] Clemens Kreutz and Jens Timmer. Systems biology: experimental design. The FEBS Journal, 276(4):923–942, 2009.

[106] Elizabeth G Ryan, Christopher C Drovandi, James M McGree, and Anthony N Pettitt. A review of modern computational algorithms for Bayesian optimal design. International Statistical Review, 84(1):128–154, 2016.

[107] Eva Balsa-Canto, Antonio A Alonso, and Julio R Banga. Computational procedures for optimal experimental design in biological systems. IET Systems Biology, 2(4):163–172, 2008.

[108] SP Asprey and S Macchietto. Designing robust optimal dynamic experiments. Journal of Process Control, 12(4):545–556, 2002.

[109] Daniel Faller, Ursula Klingmüller, and Jens Timmer. Simulation methods for optimal experimental design in systems biology. Simulation, 79(12):717–725, 2003.

[110] Eva Balsa-Canto, Antonio A Alonso, and Julio R Banga. An iterative identification procedure for dynamic modeling of biochemical networks. BMC Systems Biology, 4(1):11, 2010.

[111] Yunfei Chu and Juergen Hahn. Integrating parameter selection with experimental design under uncertainty for nonlinear dynamic systems. AIChE Journal, 54(9):2310–2320, 2008.

[112] Jakob Ruess, Andreas Milias-Argeitis, and John Lygeros. Designing experiments to understand the variability in biochemical reaction networks. Journal of the Royal Society Interface, 10(88):20130588, 2013.

[113] Juliane Liepe, Sarah Filippi, Michał Komorowski, and Michael PH Stumpf. Maximizing the information content of experiments in systems biology. PLoS Computational Biology, 9(1):e1002888, 2013.

[114] Thembi Mdluli, Gregery T Buzzard, and Ann E Rundell. Efficient optimization of stimuli for model-based design of experiments to resolve dynamical uncertainty. PLoS Computational Biology, 11(9):e1004488, 2015.

[115] Dominik Skanda and Dirk Lebiedz. An optimal experimental design approach to model discrimination in dynamic biochemical systems. Bioinformatics, 26(7):939–945, 2010.

[116] Johannes Stegmaier, Dominik Skanda, and Dirk Lebiedz. Robust optimal design of experiments for model discrimination using an interactive software tool. PLoS One, 8(2):e55723, 2013.

[117] Robert J Flassig and Kai Sundmacher. Optimal design of stimulus experiments for robust discrimination of biochemical reaction networks. Bioinformatics, 28(23):3089–3096, 2012.

[118] Federico Galvanin, Sandro Macchietto, and Fabrizio Bezzo. Model-based design of parallel experiments. Industrial & Engineering Chemistry Research, 46(3):871–882, 2007.

[119] Jason N Bazil, Gregory T Buzzard, and Ann E Rundell. A global parallel model based design of experiments method to minimize model output uncertainty. Bulletin of Mathematical Biology, 74(3):688–716, 2012.

[120] Federico Galvanin, Massimiliano Barolo, Gabriele Pannocchia, and Fabrizio Bezzo. Online model-based redesign of experiments with erratic models: A disturbance estimation approach. Computers & Chemical Engineering, 42:138–151, 2012.

[121] Wei Pan, Filippo Menolascina, and Guy-Bart Stan. Online model selection for synthetic gene networks. In Proceedings of the 2016 IEEE 55th Conference on Decision and Control (CDC), pages 776–782. IEEE, 2016.

[122] Eva Van Derlinden, Kristel Bernaerts, and JF Van Impe. Accurate estimation of cardinal growth temperatures of Escherichia coli from optimal dynamic experiments. International Journal of Food Microbiology, 128(1):89–100, 2008.

[123] Gaia Franceschini and Sandro Macchietto. Novel anticorrelation criteria for design of experiments: Algorithm and application. AIChE Journal, 54(12):3221–3238, 2008.

[124] Michel Brik Ternbach, Christian Bollman, Christian Wandrey, and Ralf Takors. Application of model discriminating experimental design for modeling and development of a fermentative fed-batch L-valine production process. Biotechnology and Bioengineering, 91(3):356–368, 2005.

[125] Jacob Beal. Bridging the gap: a roadmap to breaking the biological design barrier. Frontiers in Bioengineering and Biotechnology, 2, 2014.

[126] Jacob Beal, Traci Haddock-Angelli, Markus Gershater, Kim De Mora, Meagan Lizarazo, Jim Hollenhorst, Randy Rettberg, et al. Reproducibility of fluorescent expression from engineered biological constructs in E. coli. PLoS One, 11(3):e0150182, 2016.

[127] Sebastian M Castillo-Hair, John T Sexton, Brian P Landry, Evan J Olson, Oleg A Igoshin, and Jeffrey J Tabor. FlowCal: A user-friendly, open source software tool for automatically converting flow cytometry data from arbitrary to calibrated units. ACS Synthetic Biology, 5(7):774–780, 2016.

[128] Patrick Weber, Andrei Kramer, Clemens Dingler, and Nicole Radde. Trajectory-oriented Bayesian experiment design versus Fisher A-optimal design: an in depth comparison study. Bioinformatics, 28(18):i535–i541, 2012.

[129] Dries Telen, Filip Logist, Eva Van Derlinden, Ignace Tack, and Jan Van Impe. Optimal experiment design for dynamic bioprocesses: a multi-objective approach. Chemical Engineering Science, 78:82–97, 2012.

[130] Daniel Silk, Paul DW Kirk, Chris P Barnes, Tina Toni, and Michael PH Stumpf. Model selection in systems biology depends on experimental design. PLoS Computational Biology, 10(6):e1003650, 2014.

[131] Andrew White, Malachi Tolman, Howard D Thames, Hubert Rodney Withers, Kathy A Mason, and Mark K Transtrum. The limitations of model-based experimental design and parameter estimation in sloppy systems. PLoS Computational Biology, 12(12):e1005227, 2016.

[132] Nicholas Roehner, Eric M Young, Christopher A Voigt, D Benjamin Gordon, and Douglas Densmore. Double Dutch: A tool for designing combinatorial libraries of biological systems. ACS Synthetic Biology, 5(6):507–517, 2016.

[133] Evan Appleton, Douglas Densmore, Curtis Madsen, and Nicholas Roehner. Needs and opportunities in bio-design automation: four areas for focus. Current Opinion in Chemical Biology, 40:111–118, 2017.

[134] Jacob Beal. Bridging the gap: a roadmap to breaking the biological design barrier. Frontiers in Bioengineering and Biotechnology, 2:87, 2015.

[135] Sarah Guiziou, Federico Ulliana, Violaine Moreau, Michel Leclere, and Jerome Bonnet. An automated design framework for multicellular recombinase logic. ACS Synthetic Biology, 7(5):1406–1412, 2018.

[136] Irene Otero-Muras, David Henriques, and Julio R Banga. SYNBADm: a tool for optimization-based automated design of synthetic gene circuits. Bioinformatics, 32(21):3360–3362, 2016.

[137] Morgan Madec, Francois Pecheux, Yves Gendrault, Elise Rosati, Christophe Lallement, and Jacques Haiech. GeNeDA: An open-source workflow for design automation of gene regulatory networks inspired from microelectronics. Journal of Computational Biology, 23(10):841–855, 2016.

[138] Linh Huynh and Ilias Tagkopoulos. Fast and accurate circuit design automation through hierarchical model switching. ACS Synthetic Biology, 4(8):890–897, 2015.

[139] Guillermo Rodrigo and Alfonso Jaramillo. AutoBioCAD: full biodesign automation of genetic circuits. ACS Synthetic Biology, 2(5):230–236, 2012.

[140] Fusun Yaman, Swapnil Bhatia, Aaron Adler, Douglas Densmore, and Jacob Beal. Automated selection of synthetic biology parts for genetic regulatory networks. ACS Synthetic Biology, 1(8):332–344, 2012.

[141] Jacob Beal, Ron Weiss, Douglas Densmore, Aaron Adler, Evan Appleton, Jonathan Babb, Swapnil Bhatia, Noah Davidsohn, Traci Haddock, Joseph Loyall, et al. An end-to-end workflow for engineering of biological networks from high-level specifications. ACS Synthetic Biology, 1(8):317–331, 2012.

[142] Jacob Beal, Ting Lu, and Ron Weiss. Automatic compilation from high-level biologically-oriented programming language to genetic regulatory networks. PLoS One, 6(8):e22490, 2011.

[143] Barry Canton, Anna Labno, and Drew Endy. Refinement and standardization of synthetic biological parts and devices. Nature Biotechnology, 26(7):787, 2008.

[144] Jason R Kelly, Adam J Rubin, Joseph H Davis, Caroline M Ajo-Franklin, John Cumbers, Michael J Czar, Kim de Mora, Aaron L Glieberman, Dileep D Monie, and Drew Endy. Measuring the activity of BioBrick promoters using an in vivo reference standard. Journal of Biological Engineering, 3(1):4, 2009.

[145] Noah Davidsohn, Jacob Beal, Samira Kiani, Aaron Adler, Fusun Yaman, Yinqing Li, Zhen Xie, and Ron Weiss. Accurate predictions of genetic circuit behavior from part characterization and modular composition. ACS Synthetic Biology, 4(6):673–681, 2014.

[146] Stefano Cardinale and Adam Paul Arkin. Contextualizing context for synthetic biology – identifying causes of failure of synthetic biological systems. Biotechnology Journal, 7(7):856–866, 2012.

[147] Stefan Klumpp, Zhongge Zhang, and Terence Hwa. Growth rate-dependent global effects on gene expression in bacteria. Cell, 139(7):1366–1375, 2009.

[148] Matthew Scott, Carl W Gunderson, Eduard M Mateescu, Zhongge Zhang, and Terence Hwa. Interdependence of cell growth and gene expression: origins and consequences. Science, 330(6007):1099–1102, 2010.

[149] Conghui You, Hiroyuki Okano, Sheng Hui, Zhongge Zhang, Minsu Kim, Carl W Gunderson, Yi-Ping Wang, Peter Lenz, Dalai Yan, and Terence Hwa. Coordination of bacterial proteome with metabolism by cyclic AMP signalling. Nature, 500(7462):301, 2013.

[150] Sheng Hui, Josh M Silverman, Stephen S Chen, David W Erickson, Markus Basan, Jilong Wang, Terence Hwa, and James R Williamson. Quantitative proteomic analysis reveals a simple strategy of global resource allocation in bacteria. Molecular Systems Biology, 11(2):784, 2015.

[151] Andrea Y Weiße, Diego A Oyarzún, Vincent Danos, and Peter S Swain. Mechanistic links between cellular trade-offs, gene expression, and growth. Proceedings of the National Academy of Sciences, page 201416533, 2015.

[152] Javier Carrera, Guillermo Rodrigo, Vijai Singh, Boris Kirov, and Alfonso Jaramillo. Empirical model and in vivo characterization of the bacterial response to synthetic gene expression show that ribosome allocation limits growth rate. Biotechnology Journal, 6(7):773–783, 2011.

[153] Chen Liao, Andrew E Blanchard, and Ting Lu. An integrative circuit–host modelling framework for predicting synthetic gene network behaviours. Nature Microbiology, 2(12):1658, 2017.

[154] Nathan Braniff and Brian Ingalls. New opportunities for optimal design of dynamic experiments in systems and synthetic biology. Current Opinion in Systems Biology, 2018.

[155] MD Hoang, T Barz, VA Merchan, LT Biegler, and H Arellano-Garcia. Simultaneous solution approach to model-based experimental design. AIChE Journal, 59(11):4169–4183, 2013.

[156] Dennis Janka, Stefan Körkel, and Hans Georg Bock. Direct multiple shooting for nonlinear optimum experimental design. In Multiple Shooting and Time Domain Decomposition Methods, pages 115–141. Springer, 2015.

[157] Dries Telen, Dominique Vercammen, Filip Logist, and Jan Van Impe. Robustifying optimal experiment design for nonlinear, dynamic (bio) chemical systems. Computers & Chemical Engineering, 71:415–425, 2014.

[158] Zoltan Kutalik, Kwang-Hyun Cho, and Olaf Wolkenhauer. Optimal sampling time selection for parameter estimation in dynamic pathway modeling. Biosystems, 75(1-3):43–55, 2004.

[159] N Braniff, M Reed, and B Ingalls. Optimal experimental design for characterizing gene expression: sample scheduling. IFAC-PapersOnLine, 51(19):48–51, 2018.

[160] Fangwei Si, Dongyang Li, Sarah E Cox, John T Sauls, Omid Azizi, Cindy Sou, Amy B Schwartz, Michael J Erickstad, Yonggun Jun, Xintian Li, et al. Invariance of initiation mass and predictability of cell size in Escherichia coli. Current Biology, 27(9):1278–1287, 2017.

[161] Stephen Cooper and Charles E Helmstetter. Chromosome replication and the division cycle of Escherichia coli B/r. Journal of Molecular Biology, 31(3):519–540, 1968.

[162] H Bremer and G Churchward. An examination of the Cooper-Helmstetter theory of DNA replication in bacteria and its underlying assumptions. Journal of Theoretical Biology, 69(4):645–654, 1977.

[163] Herbert E Kubitschek, William W Baldwin, Sally J Schroeter, and Rheinhard Graetzer. Independence of buoyant cell density and growth rate in Escherichia coli. Journal of Bacteriology, 158(1):296–299, 1984.

[164] Markus Basan, Manlu Zhu, Xiongfeng Dai, Mya Warren, Daniel Sévin, Yi-Ping Wang, and Terence Hwa. Inflating bacterial cells by increased protein synthesis. Molecular Systems Biology, 11(10):836, 2015.

[165] Robert D Finn, Elena V Orlova, Brent Gowen, Martin Buck, and Marin van Heel. Escherichia coli RNA polymerase core and holoenzyme structures. The EMBO Journal, 19(24):6833–6844, 2000.

[166] Stefan Klumpp and Terence Hwa. Growth-rate-dependent partitioning of RNA polymerases in bacteria. Proceedings of the National Academy of Sciences, 105(51):20245–20250, 2008.

[167] Michael Patrick, Patrick P Dennis, Mans Ehrenberg, and Hans Bremer. Free RNA polymerase in Escherichia coli. Biochimie, 119:80–91, 2015.

[168] Lacramioara Bintu, Nicolas E Buchler, Hernan G Garcia, Ulrich Gerland, Terence Hwa, Jané Kondev, and Rob Phillips. Transcriptional regulation by the numbers: models. Current Opinion in Genetics & Development, 15(2):116–124, 2005.

[169] Mattias Rydenfelt, Robert Sidney Cox III, Hernan Garcia, and Rob Phillips. Statistical mechanical model of coupled transcription from multiple promoters due to transcription factor titration. Physical Review E, 89(1):012702, 2014.

[170] Rob Phillips. Napoleon is in equilibrium. Annu. Rev. Condens. Matter Phys., 6(1):85–111, 2015.

[171] Ewa Heyduk, Konstantin Kuznedelov, Konstantin Severinov, and Tomasz Heyduk. A consensus adenine at position –11 of the nontemplate strand of bacterial promoter is important for nucleation of promoter melting. Journal of Biological Chemistry, 281(18):12362–12369, 2006.

[172] Michael Brunner and Hermann Bujard. Promoter recognition and promoter strength in the Escherichia coli system. The EMBO Journal, 6(10):3139–3144, 1987.

[173] Marko Djordjevic and Ralf Bundschuh. Formation of the open complex by bacterial RNA polymerase—a quantitative model. Biophysical Journal, 94(11):4233–4248, 2008.

[174] Marko Djordjevic. Efficient transcription initiation in bacteria: an interplay of protein–DNA interaction parameters. Integrative Biology, 5(5):796–806, 2013.

[175] Gary D Stormo. Introduction to protein-DNA interactions: structure, thermodynamics, and bioinformatics. Cold Spring Harbor Laboratory Press, 2013.

[176] SR Kushner. Messenger RNA decay. EcoSal Plus, 2(2), 2007.

[177] Chaitanya Jain and Joel G Belasco. RNase E autoregulates its synthesis by controlling the degradation rate of its own mRNA in Escherichia coli: unusual sensitivity of the rne transcript to RNase E activity. Genes & Development, 9(1):84–96, 1995.

[178] Elisabeth A Mudd and Christopher F Higgins. Escherichia coli endoribonuclease RNase E: autoregulation of expression and site-specific cleavage of mRNA. Molecular Microbiology, 9(3):557–568, 1993.

[179] Chaitanya Jain, Atilio Deana, and Joel G Belasco. Consequences of RNase E scarcity in Escherichia coli. Molecular Microbiology, 43(4):1053–1064, 2002.

[180] Maria C Ow, Qi Liu, Bijoy K Mohanty, Margaret E Andrew, Valerie F Maples, and Sidney R Kushner. RNase E levels in Escherichia coli are controlled by a complex regulatory system that involves transcription of the rne gene from three promoters. Molecular Microbiology, 43(1):159–171, 2002.

[181] Huiyi Chen, Katsuyuki Shiroguchi, Hao Ge, and Xiaoliang Sunney Xie. Genome-wide study of mRNA degradation and transcript elongation in Escherichia coli. Molecular Systems Biology, 11(1):781, 2015.

[182] Steen Pedersen, Solvejg Reeh, and James D Friesen. Functional mRNA half lives in E. coli. Molecular and General Genetics MGG, 166(3):329–336, 1978.

[183] Douglas W Selinger, Rini Mukherjee Saxena, Kevin J Cheung, George M Church, and Carsten Rosenow. Global RNA half-life analysis in Escherichia coli reveals positional patterns of transcript degradation. Genome Research, 13(2):216–223, 2003.

[184] George A Mackie. RNase E: at the interface of bacterial RNA processing and decay. Nature Reviews Microbiology, 11(1):45, 2013.

[185] Jeremy M Berg, John L Tymoczko, and Lubert Stryer. Biochemistry. WH Freeman, 5 edition, 2002.

[186] Xiongfeng Dai, Manlu Zhu, Mya Warren, Rohan Balakrishnan, Vadim Patsalo, Hiroyuki Okano, James R Williamson, Kurt Fredrick, Yi-Ping Wang, and Terence Hwa. Reduction of translating ribosomes enables Escherichia coli to maintain elongation rates during slow growth. Nature Microbiology, 2(2):16231, 2017.

[187] Olivier Borkowski, Anne Goelzer, Marc Schaffer, Magali Calabre, Ulrike Mäder, Stéphane Aymerich, Matthieu Jules, and Vincent Fromion. Translation elicits a growth rate-dependent, genome-wide, differential protein production in Bacillus subtilis. Molecular Systems Biology, 12(5):870, 2016.

[188] David Kennell and Howard Riezman. Transcription and translation initiation frequencies of the Escherichia coli lac operon. Journal of Molecular Biology, 114(1):1–21, 1977.

[189] Sang Woo Seo, Jae-Seong Yang, Inhae Kim, Jina Yang, Byung Eun Min, Sanguk Kim, and Gyoo Yeol Jung. Predictive design of mRNA translation initiation region to control prokaryotic translation efficiency. Metabolic Engineering, 15:67–74, 2013.

[190] Howard M Salis, Ethan A Mirsky, and Christopher A Voigt. Automated design of synthetic ribosome binding sites to control protein expression. Nature biotechnology, 27(10):946–950, 2009.

[191] Federico Garza de Leon, Laura Sellars, Mathew Stracy, Stephen JW Busby, and Achillefs N Kapanidis. Tracking low-copy transcription factors in living bacteria: the case of the lac repressor. Biophysical Journal, 112(7):1316–1327, 2017.

[192] Sebastian Sager, Hans Georg Bock, and Moritz Diehl. The integer approximation error in mixed-integer optimal control. Mathematical Programming, 133(1-2):1–23, 2012.

[193] Sebastian Sager. Sampling decisions in optimum experimental design in the light of Pontryagin's maximum principle. SIAM Journal on Control and Optimization, 51(4):3181–3207, 2013.

[194] Samuel S Wilks. Certain generalizations in the analysis of variance. Biometrika, pages 471–494, 1932.

[195] Michele Vallisneri. Use and abuse of the Fisher information matrix in the assessment of gravitational-wave parameter-estimation prospects. Physical Review D, 77(4):042001, 2008.

[196] Joel A E Andersson, Joris Gillis, Greg Horn, James B Rawlings, and Moritz Diehl. CasADi – A software framework for nonlinear optimization and optimal control. Mathematical Programming Computation (In Press), 2018.

[197] Andreas Wächter and Lorenz T Biegler. On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming. Mathematical Programming, 106(1):25–57, 2006.

[198] Irene Bauer, Hans Georg Bock, Stefan Körkel, and Johannes P Schlöder. Numerical methods for optimum experimental design in DAE systems. Journal of Computational and Applied Mathematics, 120(1-2):1–25, 2000.

[199] Federico Galvanin, Massimiliano Barolo, and Fabrizio Bezzo. Online model-based redesign of experiments for parameter estimation in dynamic systems. Industrial & Engineering Chemistry Research, 48(9):4415–4427, 2009.

[200] Lucia Bandiera, Zhaozheng Hou, Varun Kothamachu, Eva Balsa-Canto, Peter Swain, and Filippo Menolascina. On-line optimal input design increases the efficiency and accuracy of the modelling of an inducible synthetic promoter. Processes, 6(9):148, 2018.

[201] Thomas E Gorochowski, Amin Espah Borujeni, Yongjin Park, Alec AK Nielsen, Jing Zhang, Bryan S Der, D Benjamin Gordon, and Christopher A Voigt. Genetic circuit characterization and debugging using RNA-seq. Molecular Systems Biology, 13(11):952, 2017.

[202] Jacob Beal, Traci Haddock-Angelli, Geoff Baldwin, Markus Gershater, Ari Dwijayanti, Marko Storch, Kim de Mora, Meagan Lizarazo, Randy Rettberg, et al. Quantification of bacterial fluorescence using independent calibrants. PLoS One, 13(6):e0199432, 2018.

[203] Naveen Venayak, Nikolaos Anesiadis, William R Cluett, and Radhakrishnan Mahadevan. Engineering metabolism through dynamic control. Current Opinion in Biotechnology, 34:142–152, 2015.

[204] Di Liu, Ahmad A Mannan, Yichao Han, Diego A Oyarzún, and Fuzhong Zhang. Dynamic metabolic control: towards precision engineering of metabolism. Journal of Industrial Microbiology & Biotechnology, pages 1–9, 2018.

[205] Sue Zanne Tan and Kristala LJ Prather. Dynamic pathway regulation: recent advances and methods of construction. Current Opinion in Chemical Biology, 41:28–35, 2017.

[206] Stephanie J Doong, Apoorv Gupta, and Kristala LJ Prather. Layered dynamic regulation for improving metabolic pathway productivity in Escherichia coli. Proceedings of the National Academy of Sciences, pages 2964–2969, 2018.

[207] Apoorv Gupta, Irene M Brockman Reizman, Christopher R Reisch, and Kristala LJ Prather. Dynamic regulation of metabolic flux in engineered bacteria using a pathway-independent quorum-sensing circuit. Nature Biotechnology, 35(3):273, 2017.

[208] Yuki Soma and Taizo Hanai. Self-induced metabolic state switching by a tunable cell density sensor for microbial isopropanol production. Metabolic Engineering, 30:7–15, 2015.

[209] Stephen Boyd and Lieven Vandenberghe. Convex optimization. Cambridge University Press, 2004.

[210] Hidde De Jong, Caroline Ranquet, Delphine Ropers, Corinne Pinel, and Johannes Geiselmann. Experimental and computational validation of models of fluorescent and luminescent reporter genes in bacteria. BMC systems biology, 4(1):55, 2010.

[211] Michael Grant and Stephen Boyd. CVX: Matlab software for disciplined convex programming, version 2.1. http://cvxr.com/cvx, March 2014.

[212] Domitilla Del Vecchio, Aaron J Dy, and Yili Qian. Control theory meets synthetic biology. Journal of The Royal Society Interface, 13(120):20160380, 2016.

[213] Anthony C Atkinson and VV Fedorov. Some history leading to design criteria for Bayesian prediction. In Optimum Design 2000, pages 3–14. Springer, 2001.

[214] Sayuri Katharina Hortsch and Andreas Kremling. Characterization of noise in multistable genetic circuits reveals ways to modulate heterogeneity. PLoS One, 13(3):e0194779, 2018.

[215] Irene Otero-Muras, Pencho Yordanov, and Joerg Stelling. A method for inverse bifurcation of biochemical switches: inferring parameters from dose response curves. BMC Systems Biology, 8(1):114, 2014.

[216] Daniel T Gillespie. Exact stochastic simulation of coupled chemical reactions. The Journal of Physical Chemistry, 81(25):2340–2361, 1977.

[217] Nicolaas Godfried Van Kampen. Stochastic processes in physics and chemistry. Elsevier, 1992.

[218] Ryota Tomioka, Hidenori Kimura, Tetsuya J Kobayashi, and Kazuyuki Aihara. Multivariate analysis of noise in genetic regulatory networks. Journal of Theoretical Biology, 229(4):501–521, 2004.

[219] Nils Berglund. Kramers’ law: Validity, derivations and generalisations. arXiv preprint arXiv:1106.5799, 2011.

[220] Luc Pronzato and Andrej Pázman. Design of experiments in nonlinear models. Springer, 2013.

[221] John A Hartigan, Pamela M Hartigan, et al. The dip test of unimodality. The Annals of Statistics, 13(1):70–84, 1985.

[222] Ian Ford, DM Titterington, and Christos P Kitsos. Recent advances in nonlinear experimental design. Technometrics, 31(1):49–60, 1989.

[223] Luc Pronzato. Optimal experimental design and some related control problems. Automatica, 44(2):303–325, 2008.

[224] Michael Grant and Stephen Boyd. CVX: Matlab software for disciplined convex programming, version 2.1. http://cvxr.com/cvx, March 2019.

[225] Jan Schellenberger, Richard Que, Ronan MT Fleming, Ines Thiele, Jeffrey D Orth, Adam M Feist, Daniel C Zielinski, Aarash Bordbar, Nathan E Lewis, Sorena Rahmanian, et al. Quantitative prediction of cellular metabolism with constraint-based models: the COBRA Toolbox v2.0. Nature Protocols, 6(9):1290, 2011.

[226] Michael Hucka, Frank T Bergmann, Claudine Chaouiya, Andreas Dräger, Stefan Hoops, Sarah M Keating, Matthias König, Nicolas Le Novère, Chris J Myers, Brett G Olivier, et al. The systems biology markup language (SBML): language specification for level 3 version 2 core release 2. Journal of Integrative Bioinformatics, 16(2), 2019.

[227] Nathan Braniff, Matthew Scott, and Brian Ingalls. Component characterization in a growth-dependent physiological context: optimal experimental design. Processes, 7(1):52, 2019.

[228] Nathan Braniff, Addison Richards, and Brian Ingalls. Optimal experimental design for a bistable gene regulatory network. IFAC-PapersOnLine, 52(26):255–261, 2019.

[229] Anthony C Atkinson. The usefulness of optimum experimental designs. Journal of the Royal Statistical Society: Series B (Methodological), 58(1):59–76, 1996.

[230] Joakim Nyberg, Caroline Bazzoli, Kay Ogungbenro, Alexander Aliev, Sergei Leonov, Stephen Duffull, Andrew C Hooker, and France Mentré. Methods and software tools for design evaluation in population pharmacokinetics–pharmacodynamics studies. British Journal of Clinical Pharmacology, 79(1):6–17, 2015.

[231] Ulrike Groemping. CRAN task view: Design of experiments (DoE) & analysis of experimental data, 2020.

[232] Anthony Atkinson, Alexander Donev, Randall Tobias, et al. Optimum experimental designs, with SAS, volume 34. Oxford University Press, 2007.

[233] Joakim Nyberg, Sebastian Ueckert, Eric A Strömberg, Stefanie Hennig, Mats O Karlsson, and Andrew C Hooker. PopED: an extended, parallelized, nonlinear mixed effects models optimal design tool. Computer Methods and Programs in Biomedicine, 108(2):789–805, 2012.

[234] Caroline Bazzoli, Sylvie Retout, and France Mentré. Design evaluation and optimisation in multiple response nonlinear mixed effect models: PFIM 3.0. Computer Methods and Programs in Biomedicine, 98(1):55–65, 2010.

[235] Merlise A Clyde. An object-oriented system for Bayesian nonlinear design using XLISP-STAT. Technical report, University of Minnesota, 1993.

[236] Antony M Overstall and David C Woods. Bayesian design of experiments using approximate coordinate exchange. Technometrics, 59(4):458–470, 2017.

[237] Antony M Overstall, David C Woods, and Ben M Parker. Bayesian optimal design for ordinary differential equation models with application in biological science. Journal of the American Statistical Association, pages 1–16, 2020.

[238] Michel Laurent and Nicolas Kellershohn. Multistability: a major means of differentiation and evolution in biological systems. Trends in Biochemical Sciences, 24(11):418–422, 1999.

[239] Mads Kaern, Timothy C Elston, William J Blake, and James J Collins. Stochasticity in gene expression: from theories to phenotypes. Nature Reviews Genetics, 6(6):451–464, 2005.

[240] Dirk JW De Pauw and Peter A Vanrolleghem. Avoiding the finite difference sensitivity analysis deathtrap by using the complex-step derivative approximation technique. International Congress on Environmental Modelling and Software, 24, 2006.

[241] Clemens Kreutz, Andreas Raue, Daniel Kaschek, and Jens Timmer. Profile likelihood in systems biology. The FEBS Journal, 280(11):2564–2571, 2013.

[242] Douglas M Bates and Donald G Watts. Nonlinear regression analysis and its applications, volume 2. Wiley, New York, 1988.

[243] Jacob Cohen. Statistical power analysis for the behavioral sciences. Academic Press, 2013.

[244] R Schenkendorf, A Kremling, and M Mangold. Optimal experimental design with the sigma point method. IET Systems Biology, 3(1):10–23, 2009.

[245] George EP Box and Norman R Draper. Robust designs. Biometrika, 62(2):347–352, 1975.

[246] Vivek K Mutalik, Joao C Guimaraes, Guillaume Cambray, Colin Lam, Marc Juul Christoffersen, Quynh-Anh Mai, Andrew B Tran, Morgan Paull, Jay D Keasling, Adam P Arkin, et al. Precise and reliable gene expression via standard transcription and translation initiation elements. Nature Methods, 10(4):354–360, 2013.

[247] H Bremer, P Dennis, and M Ehrenberg. Free RNA polymerase and modeling global transcription in Escherichia coli. Biochimie, 85(6):597–609, 2003.

[248] S-T Liang, M Bipatnath, Y-C Xu, S-L Chen, P Dennis, M Ehrenberg, and H Bremer. Activities of constitutive promoters in Escherichia coli. Journal of Molecular Biology, 292(1):19–37, 1999.

[249] Antti Häkkinen and Andre S Ribeiro. Characterizing rate limiting steps in transcription from RNA production times in live cells. Bioinformatics, 32(9):1346–1352, 2015.

[250] Anantha-Barathi Muthukrishnan, Meenakshisundaram Kandhavelu, Jason Lloyd-Price, Fedor Kudasov, Sharif Chowdhury, Olli Yli-Harja, and Andre S Ribeiro. Dynamics of transcription driven by the tetA promoter, one event at a time, in live Escherichia coli cells. Nucleic Acids Research, 40(17):8472–8483, 2012.

[251] Meenakshisundaram Kandhavelu, Henrik Mannerström, Abhishekh Gupta, Antti Häkkinen, Jason Lloyd-Price, Olli Yli-Harja, and Andre S Ribeiro. In vivo kinetics of transcription initiation of the lar promoter in Escherichia coli. Evidence for a sequential mechanism with two rate-limiting steps. BMC Systems Biology, 5(1):149, 2011.

[252] S-T Liang, Y-C Xu, P Dennis, and Hans Bremer. mRNA composition and control of bacterial gene expression. Journal of Bacteriology, 182(11):3037–3044, 2000.

[253] Grzegorz Kudla, Andrew W Murray, David Tollervey, and Joshua B Plotkin. Coding-sequence determinants of gene expression in Escherichia coli. Science, 324(5924):255–258, 2009.

[254] Brian Søgaard Laursen, Hans Peter Sørensen, Kim Kusk Mortensen, and Hans Uffe Sperling-Petersen. Initiation of protein synthesis in bacteria. Microbiology and Molecular Biology Reviews, 69(1):101–123, 2005.

[255] John T Betts. Survey of numerical methods for trajectory optimization. Journal of Guidance, Control, and Dynamics, 21(2):193–207, 1998.

[256] Matthew Kelly. An introduction to trajectory optimization: how to do your own direct collocation. SIAM Review, 59(4):849–904, 2017.

[257] H Diedam and S Sager. Global optimal control with the direct multiple shooting method. Optimal Control Applications and Methods, 39(2):449–470, 2018.

[258] Joel Andersson, Joris Gillis, and Moritz Diehl. CasADi user manual, 2018.

[259] Joel Andersson, Joris Gillis, and Greg Horn. CasADi examples package, 2018.

[260] Alan C Hindmarsh, Peter N Brown, Keith E Grant, Steven L Lee, Radu Serban, Dan E Shumaker, and Carol S Woodward. SUNDIALS: Suite of nonlinear and differential/algebraic equation solvers. ACM Transactions on Mathematical Software (TOMS), 31(3):363–396, 2005.

[261] Hans Georg Bock, Ekaterina Kostina, and Johannes P Schlöder. Numerical methods for parameter estimation in nonlinear differential algebraic equations. GAMM-Mitteilungen, 30(2):376–408, 2007.

[262] Hans Georg Bock, Stefan Körkel, and Johannes P Schlöder. Parameter estimation and optimum experimental design for differential equation models. In Model Based Parameter Estimation, pages 1–30. Springer, 2013.

APPENDICES

Appendix A

Supplementary Materials: Component Characterization in a Growth-Dependent Physiological Context: Optimal Experimental Design

Supplementary materials for the publication “Component Characterization in a Growth-Dependent Physiological Context: Optimal Experimental Design” by Nathan Braniff, Matt Scott, and Brian Ingalls in Processes 7.1 (2019): 52.

A.1 Derivation of the Physiological Gene Expression Model

A.1.1 Protein fraction of cell mass

The total cell mass $M_{Tot}$ can be partitioned into fractions of protein and other constituents. The protein fraction of the cellular mass, $\Phi_{pr}$, can be fit to data from [2] with a linear function:

$$\Phi_{pr} = \kappa_{pr}\lambda + \Phi_{pr0} \tag{A.1}$$

The fit shown in Figure A.1 provides the estimates $\kappa_{pr} = -6.47$ min and $\Phi_{pr0} = 0.65$.

Figure A.1: Fit of protein fraction of the cell mass using data from Table 2 of [2].
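The slope and intercept estimates above come from an ordinary linear least-squares fit. As a minimal illustration of the calculation (in MATLAB, with placeholder data values rather than the measurements from [2]):

% Minimal sketch (MATLAB): recovering slope/intercept estimates like
% kappa_pr and Phi_pr0 by ordinary least squares. The data values here
% are placeholders, not the measurements from [2].
lambda = [0.006 0.010 0.017 0.023 0.033];   % growth rates (1/min), hypothetical
Phi_pr = [0.61  0.59  0.54  0.50  0.44];    % protein mass fractions, hypothetical

A = [lambda(:), ones(numel(lambda),1)];     % design matrix for Phi = kappa*lambda + Phi0
est = A \ Phi_pr(:);                        % least-squares solution
fprintf('kappa_pr = %.2f min, Phi_pr0 = %.2f\n', est(1), est(2));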

A.1.2 Growth Dependence of the Total RNAP Population

The total RNAP fraction of the overall protein mass, $\Phi_p$, exhibits an approximately linear relationship with growth rate:

$$\Phi_p = \kappa_p \lambda + \Phi_{p0}. \tag{A.2}$$

We fit this to data provided in [2], yielding the estimates $\kappa_p = 0.30$ min and $\Phi_{p0} = 0.0074$, shown in Figure A.2.

Figure A.2: Fit of RNAP fraction of protein mass using data from Table 3 of [2].

A.1.3 Available RNAP

The total RNAP population can be partitioned by state: freely diffusing; weakly bound to DNA at a non-specific site; actively transcribing other genes; paused and non-functioning during transcription; or immature [5, 167, 166]. We are interested in those RNAP that can initiate transcription. It has been assumed in past works that transcriptional initiation is proportional to the free fraction [166]. However, our thermodynamic equilibrium model of the promoter accounts for competition between non-specific binding sites and the specific promoter site. We therefore define the combined pool of free and non-specifically bound RNAPs as the available RNAP pool, $P_a$.

We use a coarse-grained partitioning of the total RNAPs, $P_{Tot}$, dividing them into i) the available RNAPs, $P_a$, and ii) the bound RNAPs, $P_b$. We neglect the immature population, as it has been measured to be a small fraction (< 10%) of the total [5]. We can therefore write the total as

$$P_{Tot} \approx P_a + P_b \tag{A.3}$$

The available subgroup, $P_a$, includes those RNAPs freely diffusing in the nucleoid as well as those non-specifically bound to the DNA. These two sub-groups within the available pool have been observed to be in rapid equilibrium [5]. We assume that the bound RNAPs, $P_b$, include all those bound to the DNA that are actively transcribing or paused in transcription. We write

$$P_a = P_{Tot} - P_b = P_{Tot}(1 - \Phi_b) = P_{Tot}\Phi_a, \tag{A.4}$$

where $\Phi_b$ is the fraction of transcription-occupied RNAPs unavailable for initiation of transcription, and $\Phi_a$ is the fraction that are available. The growth-dependent fraction of RNAPs that are available to initiate transcription (non-transcribing) is still poorly understood. Work by Klumpp and Hwa [166], as well as Bremer and colleagues [247, 167], has attempted to describe the partitioning of RNAP into free and occupied fractions across growth rates, without consensus. However, all agree that the concentration of free RNAP increases with growth rate. Recent spatial imaging experiments of fluorescently-tagged RNAP suggest these previous theories may be partially inaccurate; specifically, the new data suggest that much larger fractions of RNAP are busy in transcription, and smaller fractions are non-specifically bound or paused, than previously expected [5, 3].

Current direct measurements of the dependence of $\Phi_b$ or $\Phi_a$ on growth rate are sparse. Bakshi et al. examine only a single growth rate (doubling time of approximately 42 min), at which they estimate the partitioning of RNAP using spatial tracking of tagged molecules [5]. Stracy et al. have since compared RNAP partitioning between growth on minimal and rich media in a spatial tracking study [3]. A plot of these three data points is shown in Figure A.3. It should be noted that the studies use somewhat different experimental methodologies, and future work with multiple growth rates in the same strain and conditions is needed to provide a confident description. Growth rates for strains in Stracy et al. were previously reported in [4]. From the limited available data (Figure A.3) we hypothesize a linear dependence for $\Phi_b$:

$$\Phi_b = \kappa_b \lambda + \Phi_{b0}. \tag{A.5}$$

Fitting yields the estimates $\kappa_b = 9.3$ min and $\Phi_{b0} = 0.41$. Then $\Phi_{a0} = 1 - \Phi_{b0} = 0.59$ and $\kappa_a = -\kappa_b = -9.3$ min. We can then re-write this relationship as

$$\Phi_a = \kappa_a \lambda + \Phi_{a0}. \tag{A.6}$$

Using this relation, we can construct an expression for the available RNAP by multiplying $P_{Tot}$ by $\Phi_a$:

$$P_a = (\kappa_a\lambda + \Phi_{a0})(\kappa_p\lambda + \Phi_{p0})(\kappa_{pr}\lambda + \Phi_{pr0})\,\frac{\rho V_0}{m_{rnap}}\,e^{(C+D)\lambda}. \tag{A.7}$$

Figure A.3: Fraction of total RNAP occupied in transcription, as it depends on growth rate. Data from [3], [4] and [5].
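As an illustration of how equation (A.7) can be evaluated, the following MATLAB sketch computes $P_a$ across growth rates. The fitted slopes and intercepts are those reported in this appendix; the values of $\rho$, $V_0$, $C$, $D$ and $m_{rnap}$ are placeholders, not the calibrated values used in the thesis.

% Minimal sketch (MATLAB): evaluating the available-RNAP expression (A.7)
% across growth rates. rho, V0, C, D, and m_rnap are placeholder values.
kappa_a = -9.3;   Phi_a0 = 0.59;    % available fraction (A.6)
kappa_p =  0.30;  Phi_p0 = 0.0074;  % RNAP fraction of protein (A.2)
kappa_pr = -6.47; Phi_pr0 = 0.65;   % protein fraction of mass (A.1)
rho = 300;        % cell density (fg/um^3), hypothetical
V0  = 0.4;        % volume at zero growth rate (um^3), hypothetical
C = 40; D = 20;   % replication/division periods (min), hypothetical
m_rnap = 6.4e-4;  % mass of one RNAP (fg), hypothetical

lambda = linspace(0.01, 0.03, 5);   % growth rates (1/min)
Pa = (kappa_a*lambda + Phi_a0) .* (kappa_p*lambda + Phi_p0) ...
   .* (kappa_pr*lambda + Phi_pr0) * (rho*V0/m_rnap) .* exp((C+D)*lambda);
disp([lambda(:), Pa(:)])            % available RNAP copies vs. growth rate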

Dividing the expression for the available RNAP by the expression for cell volume yields the concentration of available RNAP, which our model predicts will decrease with growth rate. Past works have observed an increasing transcription rate per gene (particularly moving from slow to moderate growth rates) [147]. This suggests that the free RNAP concentration must therefore be increasing with the growth rate. However, although our model predicts a decreasing RNAP concentration, the transcription rate is predicted to increase (from slow to moderate growth) as the RNAP density along the genomic DNA increases. This is consistent with the fact that DNA-binding proteins like RNAP and transcription factors (TFs) are mostly confined to the nucleoid DNA [5, 191], where they diffuse along the DNA strand. As a result, in our model, total RNAP concentration is less relevant than the RNAP density along the genomic DNA. We hypothesize that this may cause the non-monotonic relation for transcription rate per gene previously observed by Liang et al. for constitutive promoters [248].

A.1.4 Transcription Rate

Following [168, 169], our model of transcription involves interactions between RNA polymerases (RNAPs), transcription factors (TFs), promoter copies, and non-specific binding sites along the genomic DNA. As described in the main text, at each time point we suppose that there are $P_a$ available RNAP copies and $T_a$ active transcription factor copies diffusing along the genomic DNA, and that the DNA contains $N_s$ non-specific binding sites, to which the DNA-binding proteins may weakly attach, and $g$ copies of the regulated promoter of interest. Further, we assume that $N_s \gg P_a, T_a, g$, and that each binding of an RNAP or a TF to a non-specific site or a promoter can be characterized by an associated binding energy: $\epsilon_{rn}$ and $\epsilon_{rg}$ for RNAP binding to the non-specific sites and promoters, respectively, and $\epsilon_{tn}$ and $\epsilon_{tg}$ for transcription factor binding to the non-specific sites and promoters, respectively (all $\epsilon$ are negative [170]). We use these species and site counts to enumerate the possible arrangements of RNAP and TF across the genome, and we use the binding energies to derive Boltzmann weights for each arrangement [168]. This allows us to construct a partition function. For example, if all of the $P_a$ RNAPs and $T_a$ TFs are bound to non-specific sites, the weighted enumeration for this group of possible micro-states (the partition function) can be written as

$$Z(P_a, T_a) = \underbrace{\frac{N_s!}{P_a!\,T_a!\,(N_s - P_a - T_a)!}}_{\#\text{ of micro-states}}\;\underbrace{e^{-P_a\epsilon_{rn}/k_BT}\,e^{-T_a\epsilon_{tn}/k_BT}}_{\text{energetic favorability}} \;\approx\; \frac{N_s^{(P_a+T_a)}}{P_a!\,T_a!}\;e^{-P_a\epsilon_{rn}/k_BT}\,e^{-T_a\epsilon_{tn}/k_BT}, \tag{A.8}$$

where the approximation holds because $N_s$ is large compared with the other quantities [168]. We can then construct the partition function for the total number of arrangements of a single promoter copy ($g = 1$) as

$$Z^{Tot}_{g=1}(P_a, T_a) = \underbrace{Z(P_a, T_a)}_{\text{empty promoter}} + \underbrace{Z(P_a-1, T_a)\,e^{-\epsilon_{rg}/k_BT}}_{\text{RNAP on promoter}} + \underbrace{Z(P_a, T_a-1)\,e^{-\epsilon_{tg}/k_BT}}_{\text{TF on promoter}} + \underbrace{Z(P_a-1, T_a-1)\,e^{-(\epsilon_{rg}+\epsilon_{tg}+\epsilon_{pt})/k_BT}}_{\text{RNAP and TF on promoter}} \tag{A.9}$$

Here $\epsilon_{pt}$ is the interaction energy between RNA polymerase and transcription factor when both are bound to the same promoter. We can use this expression to write the equilibrium probability of the single promoter being occupied by an RNAP, by taking the ratio of the partition functions for the RNAP-bound states to the total partition function:

$$p_{bnd} = \frac{Z^{Bnd}_{1}(P_a,T_a)}{Z^{Tot}_{1}(P_a,T_a)} = \frac{\frac{P_a}{N_s}e^{-\Delta\epsilon_r/k_BT} + \frac{P_aT_a}{N_s^2}e^{-(\Delta\epsilon_r+\Delta\epsilon_t+\epsilon_{pt})/k_BT}}{1 + \frac{P_a}{N_s}e^{-\Delta\epsilon_r/k_BT} + \frac{T_a}{N_s}e^{-\Delta\epsilon_t/k_BT} + \frac{P_aT_a}{N_s^2}e^{-(\Delta\epsilon_r+\Delta\epsilon_t+\epsilon_{pt})/k_BT}} \tag{A.10}$$

Here the $\Delta\epsilon$ values are the differences between the energy involved in binding the promoter and the background non-specific binding: $\Delta\epsilon_t = \epsilon_{tg} - \epsilon_{tn}$ and $\Delta\epsilon_r = \epsilon_{rg} - \epsilon_{rn}$. Note that $\epsilon_{tn} > \epsilon_{tg}$ and $\epsilon_{rn} > \epsilon_{rg}$, so that $\Delta\epsilon_t$ and $\Delta\epsilon_r$ are both negative [170]. Denoting the Boltzmann weights as $K_r = e^{-\Delta\epsilon_r/k_BT}$, $K_t = e^{-\Delta\epsilon_t/k_BT}$ and $K_{rt} = e^{-(\Delta\epsilon_r+\Delta\epsilon_t+\epsilon_{pt})/k_BT}$ yields the following simplified form:

$$p_{bound} = \frac{\frac{P_a}{N_s}K_r + \frac{P_aT_a}{N_s^2}K_{rt}}{1 + \frac{P_a}{N_s}K_r + \frac{T_a}{N_s}K_t + \frac{P_aT_a}{N_s^2}K_{rt}} \tag{A.11}$$

Next, with RNAP bound to a certain fraction of the promoters (or, on average, bound for a certain fraction of the time over the relevant timescale of initiation), we assume that open complex formation and promoter escape occur at a fixed rate $\alpha$, giving

$$\text{Initiation Rate (for } g=1\text{)} = \alpha\,\frac{\frac{P_a}{N_s}K_r + \frac{P_aT_a}{N_s^2}K_{rt}}{1 + \frac{P_a}{N_s}K_r + \frac{T_a}{N_s}K_t + \frac{P_aT_a}{N_s^2}K_{rt}} \tag{A.12}$$
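A minimal numerical sketch of equations (A.11) and (A.12) follows; all copy numbers, binding weights and the rate $\alpha$ are placeholder values chosen for illustration, not the calibrated values from this thesis.

% Minimal sketch (MATLAB): evaluating promoter occupancy p_bound (A.11)
% and the single-promoter initiation rate (A.12). All values are placeholders.
Pa = 1000;  Ta = 50;  Ns = 5e6;  % available RNAP, active TF, non-specific sites
Kr = 500; Kt = 1e4; Krt = 1e7;   % Boltzmann weights of the binding energies
alpha = 0.5;                     % open-complex/escape rate (1/min), hypothetical

num = (Pa/Ns)*Kr + (Pa*Ta/Ns^2)*Krt;
den = 1 + (Pa/Ns)*Kr + (Ta/Ns)*Kt + (Pa*Ta/Ns^2)*Krt;
p_bound = num/den;               % equilibrium probability promoter is RNAP-bound
init_rate = alpha*p_bound;       % transcription initiation rate for g = 1
fprintf('p_bound = %.3f, initiation rate = %.3f /min\n', p_bound, init_rate);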

So far, we have described a single promoter. To address the case of multiple promoters, we would need to account for the cross-correlation between their occupancy by the transcription factor or RNAP. This has little effect at high TF copy numbers (although at low copy numbers the occupancy of one promoter significantly decreases the odds of another being occupied). We assume that the RNAP and TF populations are considerably larger than $g$, which allows us to assume the promoters function approximately independently [169]. In that case, we can scale equation (A.12) to arrive at the initiation rate as a function of $g$:

$$\text{Initiation Rate} = \alpha g\,\frac{\frac{P_a}{N_s}K_r + \frac{P_aT_a}{N_s^2}K_{rt}}{1 + \frac{P_a}{N_s}K_r + \frac{T_a}{N_s}K_t + \frac{P_aT_a}{N_s^2}K_{rt}} \tag{A.13}$$

We assume that the initiation rate is the limiting step in transcription, as elongation rates are generally faster [249, 250, 251]. We can therefore write the overall transcript production rate as follows:

$$\text{Transcript Production Rate} = \alpha g\,\frac{\frac{P_a}{N_s}K_r + \frac{P_aT_a}{N_s^2}K_{rt}}{1 + \frac{P_a}{N_s}K_r + \frac{T_a}{N_s}K_t + \frac{P_aT_a}{N_s^2}K_{rt}}$$

where

$$P_a = (\kappa_a\lambda + \Phi_{a0})(\kappa_p\lambda + \Phi_{p0})(\kappa_{pr}\lambda + \Phi_{pr0})\,\frac{\rho V_0}{m_{rnap}}\,e^{(C+D)\lambda}, \quad N_s = \frac{\eta}{\lambda C}\left(e^{(C+D)\lambda} - e^{D\lambda}\right), \quad g = e^{((C+D) - \ell_{ori}C)\lambda} \tag{A.14}$$

A.1.5 Total Ribosome Population

From [148], we have a linear relation for the fraction of protein mass that is composed of ribosomal protein:

$$\Phi_r = \kappa_r \lambda + \Phi_{r0}. \tag{A.15}$$

Fitting the model to data from [2] yields the estimates $\kappa_r = 5.5$ min and $\Phi_{r0} = 0.030$, as shown in Figure A.4.

A.1.6 Translation Rate

Both Klumpp et al. and Liang et al. note that the translation rate per transcript is roughly constant, and speculate that this could be due to a constant concentration of free ribosomes maintained by regulatory feedback [166, 252]. Dai et al.'s more recent results suggest that the fraction of inactive (non-translating) ribosomes, $\Phi_{inact}$, is constant at the moderate to fast growth rates we consider here [186]. The total ribosome concentration increases with growth rate, because the total ribosome copy number scales faster than the cell volume:

$$\frac{R_{Tot}}{V} = (\kappa_r\lambda + \Phi_{r0})(\kappa_{pr}\lambda + \Phi_{pr0})\,\frac{\rho V_0}{m_{rib}}\,e^{(C+D)\lambda}\left(V_0 e^{(C+D)\lambda}\right)^{-1} = (\kappa_r\lambda + \Phi_{r0})(\kappa_{pr}\lambda + \Phi_{pr0})\,\frac{\rho}{m_{rib}} \tag{A.16}$$

Figure A.4: Ribosomal protein fraction of the total protein mass as a function of growth rate. Data from Table 3 of [2].

Therefore, the inactive ribosome concentration $[R_{inact}]$ also increases, because $\Phi_{inact}$ is constant:

$$\frac{R_{inact}}{V} = \Phi_{inact}\,\frac{R_{Tot}}{V} = \Phi_{inact}\,(\kappa_r\lambda + \Phi_{r0})(\kappa_{pr}\lambda + \Phi_{pr0})\,\frac{\rho}{m_{rib}} \tag{A.17}$$

The inactive ribosomes, $R_{inact}$, are either non-functioning (stalled or assembling), $R_{nf}$, or free, $R_f$, so

$$R_{inact} = R_{nf} + R_f \tag{A.18}$$

Following [166, 252], we presume that the concentration of free ribosomes, $[R_f]$, is constant. Thus the fraction of inactive ribosomes that are free, $\Phi_f$, must scale inversely with the ribosomal fraction of the mass:

$$\Phi_f = \frac{1}{(\kappa_r\lambda + \Phi_{r0})(\kappa_{pr}\lambda + \Phi_{pr0})} \tag{A.19}$$

and therefore the fraction of inactive ribosomes that are non-functioning, $\Phi_{nf}$, is

$$\Phi_{nf} = 1 - \frac{1}{(\kappa_r\lambda + \Phi_{r0})(\kappa_{pr}\lambda + \Phi_{pr0})} \tag{A.20}$$

This then yields an expression for the free ribosome concentration that is constant across growth rates:

$$\frac{R_{free}}{V} = \Phi_f\,\Phi_{inact}\,\frac{R_{Tot}}{V} \tag{A.21}$$

Using a mass-action expression for translation:

$$\text{Translation Rate (in copy \#)} = \beta\,\frac{R_f}{V}\,X_{rna} \tag{A.22}$$

This implies that the translation efficiency per mRNA, $\alpha_p$, is constant:

$$\text{Translation Rate (per mRNA)} = \beta\,\frac{R_f}{V} \tag{A.23}$$

This result conflicts with our assumption of $\Phi_f \approx \Phi_{inact}$, with $R_{nf}$ being negligible. However, in a related study of translation in Bacillus subtilis, Borkowski et al. observe a decreasing translation efficiency (per mRNA) with increasing growth rate, and they infer that this is due to a decreasing free ribosome concentration [187]. The authors use this varying free ribosome concentration to test the mass-action (linear) translation model used by Klumpp et al. [147]. With a varying free ribosome concentration, $R_f/V$, the ratio of efficiencies between two different RBSs is constant if the mass-action (linear) model of Klumpp is used:

$$\text{Ratio of Translation Efficiencies} = \frac{\beta_1\,\frac{R_f}{V}}{\beta_2\,\frac{R_f}{V}} = \frac{\beta_1}{\beta_2} \tag{A.24}$$

However, Borkowski et al. observe that the ratio of translation efficiencies between different transcripts varies across growth rates. This suggests that a different model of translation initiation may be more appropriate, and that the constant translation rate may not be caused by constant free ribosome concentrations (applying the same argument to both B. subtilis and E. coli). Borkowski et al. propose a Michaelis–Menten model of translation initiation in terms of the free ribosome concentration to explain the non-constant translation efficiency ratios [187]:

$$\text{Translation Rate (copy \#)} = \frac{\beta\,\frac{R_f}{V}}{K_M + \frac{R_f}{V}}\,X_{rna} \tag{A.25}$$

with the translation efficiency then expressed as

$$\text{Translation Rate (per mRNA)} = \frac{\beta\,\frac{R_f}{V}}{K_M + \frac{R_f}{V}} \tag{A.26}$$

In this model, the mRNA's RBS is characterized by two constants: $\beta$, the maximal translation initiation rate per mRNA, and $K_M$, a half-saturating constant specific to the given RBS. This model is also justified by the mechanisms of translation initiation, in which the mRNA species may be in limiting quantities and become saturated. This Michaelis–Menten formulation agrees with the observation of constant translation efficiency in Klumpp et al. and Liang et al. [147, 252], under the assumption that the RBS of the gene has a low $K_M$ value and is near saturation over the relevant growth rates. We have also assumed that initiation of translation is the limiting step in protein production, and that it is slower than translation elongation [253, 254].
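The contrast between the mass-action model (A.23) and the Michaelis–Menten model (A.26) can be illustrated with a short MATLAB sketch; the values of $\beta$, $K_M$ and the free ribosome concentrations below are placeholders chosen for illustration.

% Minimal sketch (MATLAB): comparing the mass-action (A.23) and
% Michaelis-Menten (A.26) models of translation efficiency as the free
% ribosome concentration varies. beta and K_M are placeholder values.
beta = 2.0;                    % maximal initiation rate per mRNA (1/min), hypothetical
K_M  = 0.05;                   % half-saturating free-ribosome conc. (uM), hypothetical
Rf_V = linspace(0.01, 0.5, 6); % free ribosome concentrations (uM), hypothetical

eff_linear = beta * Rf_V;                 % mass-action model (A.23)
eff_mm     = beta * Rf_V ./ (K_M + Rf_V); % Michaelis-Menten model (A.26)
disp([Rf_V(:), eff_linear(:), eff_mm(:)])
% For Rf_V >> K_M the MM efficiency saturates near beta, reproducing the
% observed near-constant translation efficiency per mRNA.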

A.2 Details of the Multiple-shooting Algorithm and Optimization

Our OED algorithm can be classified as a direct optimal control approach – following a discretize-then-optimize procedure with respect to the system simulation dynamics [255] – where the system dynamics are implemented numerically, the control variables are discretized and selected, the system response is simulated, and then the controls are adjusted to improve the objective. However, for our dynamic, non-linear system, we found this approach performed poorly if implemented in a naive manner. Multiple shooting and collocation methods provide improvements by discretizing the simulation along with the controls [255, 256]. In multiple shooting specifically, the simulation is partitioned into a series of initial value problems [255]. This process increases the dimensionality of the problem but also improves the problem structure, giving the optimization algorithm more information about how each control contributes to the objective function [257]. This process has been referred to as 'lifting', where the problem is lifted into a higher-dimensional, but more easily navigated, space [257]. The problem structure can be further improved by including derivative information for the objective and constraints, which can be done in a straightforward manner using the algorithmic differentiation tools available in CasADi [196]. Below, we provide a brief overview of the multiple shooting algorithm used in this work for optimal experimental design (OED). Further details on the implementation of similar algorithms can be found in [156, 157, 155]. We will describe the algorithm implementation

in pseudo-code, outlining the use of CasADi's symbolic interface. CasADi uses symbolics (on the front-end) to create mathematical expressions; details of the back-end implementation can be found in [258]. An example of some (pseudo) syntax for the CasADi MATLAB interface is given in Algorithm 1.

Algorithm 1 CasADi Example
1: x = SX.sym('x')
2: y = SX.sym('y')
3: z = x + y
4: f = function('f', {x, y}, {z})
5: f(1, 2)
6: >> 3
7: w1 = SX.sym('w1')
8: w2 = SX.sym('w2')
9: f(w1, w2)
10: >> 'w1+w2'

In this example we define two symbolic variables 'x' and 'y' with 'SX.sym()'. We then create the symbolic expression 'z=x+y', where 'z' now symbolically means 'x+y'. To be able to evaluate that expression on new inputs, we define a function 'f' that maps 'x' and 'y' to the output described by 'z'. Below this, we see that we can use the function 'f' to map specific numbers to a numerical output, but we can also use it to create new symbolic expressions (i.e. 'w1+w2'). These in turn can be used to build layers of symbolics, as we will do in the OED algorithm. The reader is referred to the CasADi manual for further details on the internal functions [258]. We mix use of both the MX and SX symbolic classes, which have different computational properties [258]; however, the reader can ignore this for general understanding (MX symbolics are created like SX symbolics, but with 'MX.sym()').
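For readers who wish to run this snippet directly, the following is an equivalent, self-contained version using the actual CasADi MATLAB API; note that the real constructor is 'Function' (capitalized), a detail the pseudo-code glosses over. It assumes only that CasADi (https://web.casadi.org) is on the MATLAB path.

% Runnable version of Algorithm 1 with the actual CasADi MATLAB syntax.
import casadi.*

x = SX.sym('x');
y = SX.sym('y');
z = x + y;                       % symbolic expression
f = Function('f', {x, y}, {z});  % function mapping (x, y) -> z

disp(full(f(1, 2)))              % numeric evaluation: prints 3

w1 = MX.sym('w1');
w2 = MX.sym('w2');
disp(f(w1, w2))                  % symbolic evaluation: a new expression, w1+w2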

For simplicity, we will restate the system dynamics for a single sub-experiment as follows:

$$\frac{dY}{dt} = G(Y, \theta, \lambda, u(t), W(t)) \tag{A.27}$$

Here the state $Y$ contains $X$ (2 components), the sensitivities $X_{\theta_i}$ for each of the 6 parameters (12 components), $\hat{w}$ (2 components), and the unique elements of $I$ (21 components), for a total of 37 components. The details of the dependence of $Y$ on $\lambda$, $u(t)$ and $W(t)$ can be found in the main text. Our algorithm begins by defining this right-hand side as a symbolic expression in CasADi, via Algorithm 2.

Algorithm 2 RHS function definition
1: Y = SX.sym('Y', 37)                        ▷ Define symbolic variables
2: θ = SX.sym('θ', 6)
3: λ = SX.sym('λ')
4: u = SX.sym('u')
5: w = SX.sym('w', 2)                         ▷ One each for wrna, wprot
6: RHS = G(Y, θ, λ, u, w)                     ▷ Implement G algebraically
7: g = function('g', {Y, θ, λ, u, w}, {RHS})  ▷ Create RHS CasADi function

In defining $G(Y, \theta, \lambda, u, w)$ algebraically (line 6 above, details omitted), we used CasADi's algorithmic differentiation and matrix algebra abilities. This can be done by first defining the RHS for the state variables $X_{rna}$ and $X_{prot}$ symbolically. The sensitivity and FIM RHS functions can then be determined from the corresponding RHS expressions by using the jacobian function, the jtimes function and the matrix product operator, among others.

Using the symbolic function for the overall RHS, $G$, we can construct a symbolic function, $\bar{G}_1$, for a single step of an explicit, fixed-step-size numerical integration scheme. We used a fourth-order Runge–Kutta (RK4) scheme, as described in the CasADi examples [259]. External ODE solvers, like the SUNDIALS suite [260], can be used, but defining an explicit integrator in CasADi has the advantage that the integrator itself can be algorithmically differentiated. This is useful for providing first- and second-order derivative information to the NLP solver. In contrast, the use of conditional statements by variable step-size or implicit solvers generally precludes algorithmic differentiation. The single step of the RK4 integrator is given as shown in Algorithm 3; see [259] for further details.

Algorithm 3 Define RK4 Scheme
8: Define symbolic Yinput                                  ▷ Define an input Y vector
9: k1 = g(Yinput, θ, λ, u, w)                              ▷ Implement the RK4 sub-steps
10: k2 = g(Yinput + ∆t·k1/2, θ, λ, u, w)
11: k3 = g(Yinput + ∆t·k2/2, θ, λ, u, w)
12: k4 = g(Yinput + ∆t·k3, θ, λ, u, w)
13: Youtput = Yinput + ∆t·(k1 + 2k2 + 2k3 + k4)/6          ▷ Create a symbolic expression for the output
14: Ḡ1 = function('Ḡ1', {Yinput, θ, λ, u, w}, {Youtput})   ▷ Create a function mapping Yinput to Youtput

Recall that for a single sub-experiment, the input u(t) is piecewise-constant over six 100 min intervals, W(t) (i.e. wrna(t) and wprot(t)) is piecewise-constant over forty-eight 12.5 min intervals, and the growth rate λ is constant. We label these discretized controls by their corresponding intervals as follows: growth controls, λ^(i), where i ∈ {1, 2, 3} (one per sub-experiment); induction controls, u^(j,i), where j ∈ {1,...,6} (one per induction interval & sub-experiment); and sampling controls, wrna^(k,j,i) and wprot^(k,j,i), where k ∈ {1,...,48} (one per sampling interval, induction interval & sub-experiment). Because the 48 sampling intervals are the shortest of the piecewise-constant intervals, each has constant controls (in λ, u and w) over its duration. We can therefore iterate the single RK4 step function, Ḡ1, to create an integrator, Ḡ, that maps the state at the beginning of a sampling interval to the end, 12.5 min later, with a constant set of controls (see Algorithm 4).

Algorithm 4 Iterate RK4 over the sampling (smallest) control interval
7: Yo = MX.sym('Yo')
8: Yiter = Yo
9: for 12.5/∆t iterations do               ▷ Iterate, advancing ∆t time units each loop
10:   Yiter = Ḡ1(Yiter, θ, λ, u, w)        ▷ Apply Ḡ1 to the state Yiter each loop
11: end for
12: Ḡ = function('Ḡ', {Yo, θ, λ, u, w}, {Yiter})  ▷ Create function; maps interval start, Yo, to end, Yiter
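To make the construction in Algorithms 3 and 4 concrete, here is a minimal, self-contained CasADi MATLAB sketch of the same pattern applied to a toy scalar ODE. The dynamics dy/dt = −u·y, the step size, and the interval length are placeholders, not the model from this thesis.

% Minimal sketch (MATLAB/CasADi): build one RK4 step, then iterate it
% across a 12.5 min interval, as in Algorithms 3 and 4.
import casadi.*

y = MX.sym('y');  u = MX.sym('u');
g = Function('g', {y, u}, {-u*y});        % RHS function, placeholder dynamics

dt = 0.5;                                 % fixed step size (min), hypothetical
k1 = g(y, u);
k2 = g(y + dt*k1/2, u);
k3 = g(y + dt*k2/2, u);
k4 = g(y + dt*k3, u);
G1 = Function('G1', {y, u}, {y + dt*(k1 + 2*k2 + 2*k3 + k4)/6});

yiter = y;
for i = 1:round(12.5/dt)                  % iterate across one sampling interval
    yiter = G1(yiter, u);
end
G = Function('G', {y, u}, {yiter});       % maps interval start to interval end
disp(full(G(1.0, 0.1)))                   % ~exp(-0.1*12.5) = 0.2865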

To determine the D-optimality score, we need to integrate the RHS over the total time (0 to 600 min) for each sub-experiment. The final objective value can be computed from the Fisher information entries (the 17th to 37th components) at the final time, Y(t = tf)_{17...37}, in each of the sub-experiments. To apply multiple shooting, we partitioned each sub-experiment's duration into six shooting intervals. We used intervals of 100 min, corresponding to the six constant-u induction intervals. We treat each of these shooting intervals as a separate initial value problem, with its own initial conditions Yo^(j,i). Algorithm 5 shows how we use Ḡ to propagate these initial conditions through the series of shooting intervals, linking the initial value problems with constraints to enforce continuity. There are seven initial conditions Yo^(j,i) for each sub-experiment, because the final time is treated as a (dummy) initial condition; this increases the sparsity of the problem.

Using the initial conditions, Yo^(j,i), as optimization variables, along with the discretized controls, provides a number of benefits. It gives the NLP solver direct access to the system state at regular intervals throughout the simulation time. This improves the problem structure, as the NLP solver can alter the states directly. The continuity constraints then propagate this information to the control variables. Moreover, these states increase the sparsity of the NLP problem because the coupling of the controls and the objective across the simulation is partitioned by the additional optimization variables. The system dynamics in each shooting interval depend on the controls in the other intervals only via the continuity constraints.

As shown in Algorithm 5, we loop over each sub-experiment, induction/shooting interval and sampling interval, iteratively building up a symbolic expression for the objective function and for the nonlinear constraint functions. At the beginning, we create vectors 'CtrlVec' and 'CnstrnVec' which, as we move through the three nested loops, are filled with symbolic terms for each of the NLP optimization variables and the non-linear constraints, respectively. The elements of 'CtrlVec' are individual symbols representing the controls and the shooting initial conditions, which the NLP solver will optimize. The elements of 'CnstrnVec' contain non-linear symbolic expressions that evaluate to the constraint functions. The vectors 'lbc' and 'ubc' are the lower and upper bounds for the non-linear constraint functions. If we want two symbolic expressions to be equal, we insert an expression for their difference into 'CnstrnVec', and then set both 'lbc' and 'ubc' to zero to enforce equality. The vectors 'lbw' and 'ubw' likewise constrain the 'CtrlVec' optimization variable vector to feasible ranges. We start with 'CtrlVec', 'CnstrnVec', 'lbw', 'ubw', 'lbc' and 'ubc' all empty and fill them as we loop over the problem structure.

At line 21 we create a symbol, λ^(i), for the growth rate control. This is done once for each sub-experiment, at the start of the outer loop. We then add it to the control vector, 'CtrlVec', and constrain its range. At line 26 we create a symbol vector for the initial conditions of the sub-experiment. This too is added to the control vector and constrained to a feasible range. However, at line 31 we insert the additional nonlinear constraint that the initial condition must be at steady state (defined by the CasADi function 'SteadyState(λ^(i), θ)', whose definition is not given and must be provided by the user); we enforce equality in the following lines. At line 36 we create an induction control variable, u^(j,i), and add it to 'CtrlVec', once for each sub-experiment and shooting/induction interval. This also marks the beginning of a shooting interval. In the following lines we constrain the induction to its feasible range. At line 42 we create symbols for the sampling density controls, add them to 'CtrlVec', and then constrain them. At line 47 we use Ḡ to propagate the symbol for the initial condition, Yo^(j-1,i), forward, storing the result in the same variable. (This does not erase the original contents of Yo^(j-1,i) from the start of the shooting interval, as those symbols are stored in 'CtrlVec'.) The inner-most loop calls Ḡ with the corresponding control symbols for the given interval and iteratively advances the state vector. After completion of the inner loop, at line 51, a new shooting initial condition is created and added to 'CtrlVec'. At line 56 we constrain the new initial condition to be equal to the final value of the state vector on the previous shooting interval (the product of the iterated

Algorithm 5 Construct control problem
13: CtrlVec = {}                ▷ Empty vector for OED control symbols
14: lbw = []                    ▷ Empty vector, lower bounds of OED control symbols
15: ubw = []                    ▷ Empty vector, upper bounds of OED control symbols
16: CnstrnVec = {}              ▷ Empty vector for nonlinear constraint symbols
17: lbc = []                    ▷ Empty vector, lower bounds of nonlinear constraints
18: ubc = []                    ▷ Empty vector, upper bounds of nonlinear constraints
19: FIM = 0̄
20: for i = 1:3 do              ▷ Loop over sub-experiments
21:   λ^(i) = MX.sym('λ^(i)')                      ▷ Create λ control, one for each loop
22:   CtrlVec = {CtrlVec, λ^(i)}                   ▷ Add λ^(i) to control vector
23:   lbw = [lbw; λmin]                            ▷ Restrict growth rates to feasible range
24:   ubw = [ubw; λmax]
25:
26:   Yo^(0,i) = MX.sym('Yo^(0,i)', 37)            ▷ Create initial condition state for each sub-experiment
27:   CtrlVec = {CtrlVec, Yo^(0,i)}                ▷ Add it to the control vector
28:   lbw = [lbw; 0̄]
29:   ubw = [ubw; Inf̄]
30:
31:   CnstrnVec = {CnstrnVec, Yo^(0,i) − SteadyState(λ^(i), θ)}  ▷ Constrain IC to be at steady state
32:   lbc = [lbc; 0̄]                               ▷ Bounds for constraint are 0, implying equality
33:   ubc = [ubc; 0̄]
34:
35:   for j = 1:6 do                               ▷ Loop over shooting/induction intervals
36:     u^(j,i) = MX.sym('u^(j,i)')                ▷ Create u control, one per sub-exp. & induction intrvl.
37:     CtrlVec = {CtrlVec, u^(j,i)}               ▷ Add u^(j,i) to control vector
38:     lbw = [lbw; umin]                          ▷ Restrict u to feasible range
39:     ubw = [ubw; umax]
40:
41:     for k = 1:48 do
42:       w^(k,j,i) = MX.sym('w^(k,j,i)', 2)       ▷ Create w for each sub-exp., induction & samp. intrvl.
43:       CtrlVec = {CtrlVec, w^(k,j,i)}           ▷ Add w^(k,j,i) to control vector
44:       lbw = [lbw; 0̄]                           ▷ Restrict w to feasible range
45:       ubw = [ubw; wmax]
46:
47:       Yo^(j−1,i) = Ḡ(Yo^(j−1,i), θ, λ^(i), u^(j,i), w^(k,j,i))  ▷ Advance (symbolic) state vector
48:     end for
49:
50:
51:     Yo^(j,i) = MX.sym('Yo^(j,i)', 37)          ▷ Create new shooting interval IC
52:     CtrlVec = {CtrlVec, Yo^(j,i)}              ▷ Add it to the control vector
53:     lbw = [lbw; 0̄]
54:     ubw = [ubw; Inf̄]
55:
56:     CnstrnVec = {CnstrnVec, Yo^(j,i) − Yo^(j−1,i)}  ▷ Constrain Yo^(j,i) for continuity with Yo^(j−1,i)
57:     lbc = [lbc; 0̄]                             ▷ Bounds for constraints are 0, implying equality
58:     ubc = [ubc; 0̄]
59:   end for
60:   CnstrnVec = {CnstrnVec, cmax − Yo^(6,i)(15..16)}  ▷ Constrain integral of samp. density to the budget cmax
61:   lbc = [lbc; 0̄]                               ▷ Bounds of 0 imply the budget is met with equality
62:   ubc = [ubc; 0̄]
63:
64:   FIM = FIM + Yo^(6,i)(17..37)                 ▷ Sum FIM terms for each sub-exp.
65: end for
66: Objective = −log(det(sym(FIM)))                ▷ Define the overall objective

Ḡ), adding it to 'CnstrnVec'. At the end of the sub-experiment loop, line 60, we enforce the integral constraints on the sampling density. At line 64 we add the FIM entries in the final sub-experiment state vector to the running totals across the sub-experiments. Finally, at line 66, we form the objective expression 'Objective', which contains a (very large) symbolic expression for the objective of the entire experiment.

To improve numerical stability, we compute the determinant of the Fisher information matrix using a QR factorization: $I_{Tot} = QR$. The entries in 'FIM' are the unique values of $I_{Tot}$. The function 'sym()' in Algorithm 5 reforms the complete $I_{Tot}$ from the vector 'FIM'. Because $I_{Tot}$ is positive semi-definite, $\det(I_{Tot}) \geq 0$. Further,

$$|\det(I_{Tot})| = |\det(Q)|\,|\det(R)| \tag{A.28}$$

The factorization is such that $|\det(Q)| = 1$. Because $R$ is an upper triangular matrix, its determinant is the product of its diagonal entries:

$$\det(I_{Tot}) = \prod_m R_{(m,m)} \tag{A.29}$$

and so

$$-\ln(\Theta_D(I_{Tot})) = -\ln(\det(I_{Tot})) = -\sum_m \ln(R_{(m,m)}) \tag{A.30}$$
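As a quick numeric sanity check of equations (A.28)–(A.30), the following plain MATLAB snippet compares the QR-based log-determinant with direct computation on a random positive-definite matrix standing in for the FIM.

% Minimal numeric check (MATLAB) of the QR-based log-determinant.
A = randn(6);  F = A*A' + 1e-3*eye(6);   % placeholder positive-definite "FIM"

[Q, R] = qr(F);
logdet_qr = sum(log(abs(diag(R))));      % log|det(F)|, since |det(Q)| = 1
logdet_direct = log(det(F));             % direct computation, for comparison
fprintf('QR: %.6f, direct: %.6f\n', logdet_qr, logdet_direct);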

The expression in 'Objective' is a mathematical function of the symbols listed in 'CtrlVec'. Likewise, the expressions in 'CnstrnVec' are also mathematical functions of the symbols in 'CtrlVec'. Because these functions are symbolic, they can be differentiated with respect to any (or all) of the entries in 'CtrlVec' (or the parameter vector θ). This property allows CasADi to automatically generate Jacobians and Hessians for the objective and the constraints when passing the problem to IPOPT. Although generation of the derivatives is automated once the symbolic expression is constructed, choosing a problem structure that achieves maximal sparsity is critical: the degree of sparsity in the Hessian and Jacobians makes a significant difference in the computation time. Once the symbolic expressions for the OED problem have been created, passing them to IPOPT is straightforward using CasADi's interface. Algorithm 6 shows the creation of the solver and its call in CasADi's MATLAB interface. Starting the solver requires an initial guess for the control vector. We generate this by simulating one of the null experiments and storing the state variables and controls at the appropriate times to construct the initial guess, 'CtrlVec_o'.

Algorithm 6 Calling IPOPT
67: prob = struct(Objective, CtrlVec, CnstrnVec)   ▷ Package symbol vectors for passage to IPOPT
68: solver = nlpsol('ipopt', prob)                 ▷ Create a solver instance
69: solver(CtrlVec_o, lbw, ubw, lbc, ubc)          ▷ Call solver with initial guess CtrlVec_o; pass upper/lower bounds
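To make the packaging in Algorithm 6 concrete, here is a minimal, runnable CasADi MATLAB example solving a toy NLP with IPOPT. The Rosenbrock objective and single equality constraint are placeholders for our much larger OED objective and constraint vectors; note that the real API uses named struct fields ('x', 'f', 'g') and named bound arguments, which the pseudo-code abbreviates.

% Minimal runnable sketch (MATLAB/CasADi): nlpsol with IPOPT on a toy NLP.
import casadi.*

x = MX.sym('x', 2);
f = (1 - x(1))^2 + 100*(x(2) - x(1)^2)^2;   % placeholder objective (Rosenbrock)
g = x(1) + x(2);                            % placeholder constraint expression

prob = struct('x', x, 'f', f, 'g', g);
solver = nlpsol('solver', 'ipopt', prob);   % create an IPOPT solver instance
sol = solver('x0', [0; 0], 'lbg', 1, 'ubg', 1);  % enforce x1 + x2 = 1
disp(full(sol.x))                           % optimal decision variables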

69: solver( CtrlVeco, lbw, ubw, lbc, ubc) . Call solver with initial guess; CtrlVeco, pass upper/lower bounds parameters are treated as time-constant controls in an optimal control problem [261, 262]. The weights in our fitting algorithm were taken as the inverse of the sampling variances, 2 2 σrna = (0.05)Xrna and σprot = (0.05)Xprot. In our numerical fitting experiments, for each experimental design (null, null variants, optimal and perturbed optimal), we initialized the parameter estimation algorithm to a random parameter vector drawn uniformly from the feasible parameter range. For each experimental design, a small subset of the 30 fittings either did not converge (max number of iterations or other stopping condition was reached) or converged to clearly erroneous estimates (relative error exceeding several orders of mag- nitude ). We removed these outliers before computing covariances. The number of outliers in each design were as follows: null experiment, 3; growth variant, 2; sampling variant, 1; induction variant, 3; true optimal, 0; perturbed optimal, 0 (for all six). Timing: Our OED algorithm normally took between 70 and 400 iterations to converge in IPOPT (depending on the number and range of the parameters). The wall-clock time was on the order of an hour. Our parameter estimating algorithm normally took between 10 and 70 iterations to converge. The wall-clock timing was on the order of 10s of minutes. All experiments were done on a Mac mini machine with a 2.6 GHz Intel Core i5 and 16 GB of RAM.
