Systems Analysis of Stochastic and Population Balance Models for Chemically Reacting Systems
by
Eric Lynn Haseltine
A dissertation submitted in partial fulfillment
of the requirements for the degree of
DOCTOR OF PHILOSOPHY
(Chemical Engineering)
at the
UNIVERSITY OF WISCONSIN–MADISON
2005

© Copyright by Eric Lynn Haseltine 2005
All Rights Reserved
To Lori and Grace, for their love and support
Systems Analysis of Stochastic and Population Balance Models for Chemically Reacting Systems
Eric Lynn Haseltine
Under the supervision of Professor James B. Rawlings
At the University of Wisconsin–Madison
Chemical reaction models present one method of analyzing complex reaction pathways. Most models of chemical reaction networks employ a traditional, deterministic setting. The shortcomings of this traditional framework, namely difficulty in accounting for population heterogeneity and discrete numbers of reactants, motivate the need for more flexible modeling frameworks such as stochastic and cell population balance models. How to efficiently use models to perform systems-level tasks such as parameter estimation and feedback controller design is important in all frameworks. Consequently, this thesis focuses on three main areas:
1. improving the methods used to simulate and perform systems-level tasks using stochastic models,
2. formulating and applying cell population balance models to better account for experimental data, and
3. applying moving-horizon estimation to improve state estimates for nonlinear reaction systems.
For stochastic models, we have derived and implemented techniques that improve simulation efficiency and perform systems-level tasks using these simulations. For discrete stochastic models, these systems-level tasks rely on approximate, biased sensitivities, whereas continuous models (i.e. stochastic differential equations) permit calculation of unbiased sensitivities. Numerous examples illustrate the efficiency of these methods, including an application to modeling of batch crystallization systems.

We have also investigated using cell population balance models to incorporate both intracellular and extracellular levels of information in viral infections. Given experimental images of the focal infection system for vesicular stomatitis virus, we have applied these models to better understand the dynamics of multiple rounds of virus infection and the interferon (antiviral) host response. The model provides estimates of key parameters and suggests that the experimental technique may cause salient features in the data. We have also proposed an efficient and accurate model decomposition that predicts population-level measurements of intracellular and extracellular species.

Finally, we have assessed the capabilities of several state estimators, including moving-horizon estimation (MHE) and the extended Kalman filter (EKF). When multiple optima arise in the estimation problem, the judicious use of constraints and nonlinear optimization as employed by MHE can lead to improved state estimates and closed-loop control performance relative to the EKF. This improvement comes at the price of the computational expense required to solve the MHE optimization.
Acknowledgments
“Whatever you do, work at it with all your heart, as working for the Lord, not for men, since you know that you will receive an inheritance from the Lord as a reward.” -Colossians 3:23-24

I first thank God, creator of heaven and earth, by whose grace I have had the opportunity to complete the work comprising this thesis.

I thank my wife Lori, for her love, patience, and support. I would not have had the courage to aim so high without your encouragement. Also, the years in Madison would not have been as special without your presence. I thank my daughter Grace, who has always been able to make me smile during this past year no matter how far away graduation seemed.

I am grateful to my family: my parents, Doug and Lydia, and my brother, David. Without your support and guidance through the years of my life, I would not be where I am today. I also wish to thank my in-laws, Carl and Linda Rutkowski, in particular for supporting my wife these past five years.

I thank my extended church family at Mad City Church: the Billers, the Thompsons, the Smiths, the Sells, and the Konkols. In particular, I wish to acknowledge Shane and Karen Biller, who have loved, supported, and prayed for my family as if we were part of their own.

There are many people in the chemical engineering department at the University of Wisconsin whom I must also acknowledge. First, I thank my advisor, Jim Rawlings, for giving me great latitude to exercise my creativity and to study interesting problems. I am always amazed by your ability to identify the important problems in a field. It has been a great honor to work with you and learn from you. I am also grateful to John Yin for first listening to my modeling ideas, then making ways for me to collaborate with his group.

I am deeply indebted to Gabriele Pannocchia, who always made time to answer my questions, no matter how trivial. Since imitation is the highest form of flattery, I have tried to be as patient, kind, and understanding to my junior group members as you were to me. I could always count on either reasoning out research problems or taking a break for humor with Aswin Venkat (a.k.a. the British spy). Thank you, Matt Tenny, for your help in the office and the weight room, although perhaps I would have graduated sooner if you had not introduced me to Nethack. Brian Odelson and Daniel Patience always kept me from taking research too seriously, be it rounding everyone up for a game of darts, or getting MJ to drop by for an ice cream break. Thanks also to John Eaton for Octave and Linux support; who would have figured five years ago that I would install Linux on my laptop? It has been a pleasure getting to know Paul Larsen, Murali Rajamani, and especially Ethan Mastny, who listened to almost all of my ideas on stochastic simulation. I also thank former Rawlings group members Jenny Wang, Scott Middlebrooks, and Chris Rao for their help during my first years in the group.

Finally, I have had the great pleasure of getting to know the Yin group over the past year. In particular, I thank Vy Lam for graciously putting up with all of my experimental questions. I am also grateful to Patrick Suthers and Hwijin Kim for their friendship.
ERIC LYNN HASELTINE
University of Wisconsin–Madison
February 2005
Contents
Abstract ii
Acknowledgments v
List of Tables xiii
List of Figures xv
Chapter 1 Introduction 1
Chapter 2 Literature Review 5
2.1 Traditional Deterministic Reaction Models ...... 5
2.2 Systems Level Tasks for Deterministic Models ...... 8
2.2.1 Optimal Control ...... 8
2.2.2 State Estimation ...... 9
2.2.3 Parameter Estimation ...... 10
2.2.4 Sensitivities ...... 13
2.3 Stochastic Reaction Models ...... 15
2.3.1 Monte Carlo Simulation of the Stochastic Model ...... 16
2.3.2 Performing Systems Level Tasks with Stochastic Models ...... 25
2.4 Population Balance Models ...... 26
Chapter 3 Motivation 29
3.1 Current Limitations of Stochastic Models ...... 29
3.1.1 Integration Methods ...... 29
3.1.2 Systems Level Tasks ...... 31
3.2 Current Limitations of Traditional Deterministic Models ...... 33
3.3 Current Limitations of State Estimation Techniques ...... 34
Chapter 4 Approximations for Stochastic Reaction Models 35
4.1 Stochastic Partitioning ...... 35
4.1.1 Slow Reaction Subset ...... 38
4.1.2 Fast Reaction Subset ...... 39
4.1.3 The Combined System ...... 40
4.1.4 The Equilibrium Approximation ...... 40
4.1.5 The Langevin and Deterministic Approximations ...... 41
4.2 Numerical Implementation of the Approximations ...... 44
4.2.1 Simulating the Equilibrium Approximation ...... 46
4.2.2 Simulating the Langevin and Deterministic Approximations: Exact Next Reaction Time ...... 47
4.2.3 Simulating the Langevin and Deterministic Approximations: Approximate Next Reaction Time ...... 49
4.3 Practical Implementation ...... 50
4.4 Examples ...... 50
4.4.1 Enzyme Kinetics ...... 51
4.4.2 Simple Crystallization ...... 52
4.4.3 Intracellular Viral Infection ...... 59
4.5 Critical Analysis of the Stochastic Approximations ...... 62
Chapter 5 Sensitivities for Stochastic Models 69
5.1 The Chemical Master Equation ...... 69
5.2 Sensitivities for Stochastic Systems ...... 70
5.2.1 Approximate Methods for Generating Sensitivities ...... 71
5.2.2 Deterministic Approximation for the Sensitivity ...... 72
5.2.3 Finite Difference Sensitivities ...... 74
5.2.4 Examples ...... 75
5.3 Parameter Estimation With Approximate Sensitivities ...... 79
5.3.1 High-Order Rate Example Revisited ...... 80
5.4 Steady-State Analysis ...... 82
5.4.1 Lattice-Gas Example ...... 83
5.5 Conclusions ...... 83
Chapter 6 Sensitivity Analysis of Discrete Markov Chain Models 87
6.1 Smoothed Perturbation Analysis ...... 89
6.1.1 Coin Flip Example ...... 90
6.1.2 State-Dependent Simulation Example ...... 93
6.2 Smoothing by Integration ...... 97
6.3 Sensitivity Calculation for Stochastic Chemical Kinetics ...... 100
6.4 Conclusions and Future Directions ...... 100
Chapter 7 Sensitivity Analysis of Stochastic Differential Equation Models 103
7.1 The Master Equation ...... 104
7.2 Sensitivity Examples ...... 106
7.2.1 Simple Reversible Reaction ...... 106
7.2.2 Oregonator ...... 107
7.3 Applications of Parametric Sensitivities ...... 109
7.3.1 Parameter Estimation ...... 109
7.3.2 Calculating Steady States ...... 113
7.3.3 Simple Dumbbell Model of a Polymer in Solution ...... 114
7.4 Conclusions ...... 116
Chapter 8 Stochastic Simulation of Particulate Systems 119
8.1 Introduction ...... 119
8.2 Stochastic Chemical Kinetics Overview ...... 121
8.2.1 Stochastic Formulation of Isothermal Chemical Kinetics ...... 121
8.2.2 Extension of the Problem Scope ...... 122
8.2.3 Interpretation of the Simulation Output ...... 123
8.3 Crystallization Model Assumptions ...... 124
8.4 Stochastic Simulation of Batch Crystallization ...... 126
8.4.1 Isothermal Nucleation and Growth ...... 126
8.4.2 Nonisothermal Nucleation and Growth ...... 135
8.4.3 Isothermal Nucleation, Growth, and Agglomeration ...... 138
8.5 Parameter Estimation With Stochastic Models ...... 141
8.5.1 Trust-Region Optimization ...... 142
8.5.2 Finite Difference Sensitivities ...... 142
8.5.3 Parameter Estimation for Isothermal Nucleation, Growth, and Agglomeration ...... 145
8.6 Critical Analysis of Stochastic Simulation as a Modeling Tool ...... 146
8.7 Conclusions ...... 148
Chapter 9 Population Balance Models for Cellular Systems 151
9.1 Population Balance Modeling ...... 151
9.2 Application of the Model to Viral Infections ...... 153
9.2.1 Intracellular Model ...... 153
9.2.2 Extracellular Events ...... 154
9.2.3 Final Model Refinements ...... 154
9.2.4 Model Solution ...... 155
9.3 Application to In Vitro and In Vivo Conditions ...... 156
9.3.1 In Vitro Experiment ...... 156
9.3.2 In Vivo Initial Infection ...... 158
9.3.3 In Vivo Drug Therapy ...... 163
9.4 Future Outlook and Impact ...... 168
Chapter 10 Modeling Virus Dynamics: Focal Infections 171
10.1 Experimental System ...... 172
10.1.1 Modeling the Experiment ...... 173
10.1.2 Modeling the Measurement ...... 173
10.1.3 Analyzing and Modeling the Images ...... 174
10.2 Propagation of VSV on BHK-21 Cells ...... 175
10.2.1 Development of a Reaction-Diffusion Model ...... 176
10.2.2 Analysis of the Model Fit ...... 177
10.3 Propagation of VSV on DBT Cells ...... 179
10.3.1 Refinement of the Reaction-Diffusion Model ...... 180
10.3.2 Discussion ...... 183
10.3.3 Model Prediction: Infection Propagation in the Presence of Interferon Inhibitors ...... 189
10.4 Conclusions ...... 190
10.5 Appendix ...... 193
Chapter 11 Multi-level Dynamics of Viral Infections 195
11.1 Modeling Framework ...... 196
11.2 Examples ...... 197
11.2.1 Initial Infection for a Generic Viral Infection ...... 197
11.2.2 VSV/DBT Focal Infection ...... 201
11.2.3 Model Solution ...... 207
11.3 Conclusions ...... 211
Chapter 12 Moving-Horizon State Estimation 215
12.1 Formulation of the Estimation Problem ...... 217
12.2 Nonlinear Observability ...... 218
12.3 Extended Kalman Filtering ...... 218
12.4 Monte Carlo Filters ...... 220
12.5 Moving-Horizon Estimation ...... 223
12.6 Example 1 ...... 225
12.6.1 Comparison of Results ...... 225
12.6.2 Evaluation of Arrival Cost Strategies ...... 231
12.7 EKF Failure ...... 233
12.7.1 Chemical Reaction Systems ...... 235
12.7.2 Example 2 ...... 237
12.7.3 Example 3 ...... 244
12.7.4 Computational Expense ...... 248
12.8 Conclusions ...... 250
12.9 Appendix ...... 255
12.9.1 Derivation of the MHE Smoothing Formulation ...... 255
12.9.2 Derivation of the MHE Filtering Formulation ...... 256
12.9.3 Equivalence of the Full Information and Least Squares Formulations ...... 257
12.9.4 Evolution of a Nonlinear Probability Density ...... 260
Chapter 13 Closed Loop Performance Using Moving-Horizon Estimation 265
13.1 Regulator ...... 265
13.2 Disturbance Models for Nonlinear Models ...... 266
13.2.1 Plant-model Mismatch: Exothermic CSTR Example ...... 268
13.2.2 Maximum Yield Example ...... 270
13.3 Conclusions ...... 276
Chapter 14 Conclusions 277
Bibliography 281
Vita 293
List of Tables
2.1 Types of cell population models ...... 26
4.1 Model parameters and reaction extents for the enzyme kinetics example ...... 51
4.2 Model parameters and reaction extents for the simple crystallization example ...... 53
4.3 Comparison of time steps for the simple crystallization example ...... 56
4.4 Model parameters and reaction extents for the intracellular viral infection example ...... 59
4.5 Simulation time comparison for the intracellular viral infection example ...... 61
5.1 Parameters for the lattice-gas example...... 83
6.1 Parameters for the coin flip example ...... 92
7.1 Parameter values for the simple reversible reaction ...... 107
7.2 Parameter values for the Oregonator system of reactions ...... 109
7.3 Parameters for the simple dumbbell model ...... 115
7.4 Results for the simple dumbbell model ...... 115
8.1 Nucleation and growth parameters for an isothermal batch crystallizer ...... 127
8.2 Nonisothermal nucleation and growth parameters for a batch crystallizer ...... 136
8.3 Nucleation, growth, and agglomeration parameters for an isothermal, batch crystallizer ...... 140
8.4 Parameters for the parameter estimation example ...... 144
8.5 Estimated parameters ...... 145
9.1 Model parameters for in vitro simulation ...... 157
9.2 Model parameters for in vivo simulation ...... 160
9.3 Comparison of actual and fitted parameter values for in vivo simulation of an initial infection ...... 163
9.4 Additional model parameters for in vivo drug therapy ...... 165
10.1 Parameters used to describe the experimental conditions ...... 173
10.2 Parameter estimates for the VSV/BHK-21 focal infection models ...... 178
10.3 Hessian analysis for the parameter estimates of the original VSV/BHK-21 focal infection model ...... 178
10.4 Hessian analysis for the parameter estimates of the revised VSV/BHK-21 focal infection model ...... 179
10.5 Parameter estimates for the VSV/DBT focal infection models ...... 185
10.6 Hessian analysis for the parameter estimates of the reaction-diffusion VSV/DBT focal infection model ...... 186
10.7 Hessian analysis for the parameter estimates of the first segregated VSV/DBT focal infection model ...... 187
10.8 Hessian analysis for the parameter estimates of the second segregated VSV/DBT focal infection model ...... 188
11.1 Model parameters for the initial infection simulation ...... 198
11.2 Initial conditions and rate constants for the intracellular reactions of the VSV infection of DBT cells ...... 206
11.3 Initial conditions and rate constants for the reactions describing the intracellular host antiviral response of the VSV infection of DBT cells ...... 207
11.4 Extracellular model parameters for the infection of DBT cells by VSV ...... 208
12.1 Sample size required to ensure that the relative mean square error at zero is less than 0.1 ...... 223
12.2 EKF steady-state behavior, no measurement or state noise ...... 238
12.3 EKF steady-state behavior, no measurement or state noise ...... 242
12.4 A priori initial conditions for state estimation ...... 246
12.5 Effects of a priori initial conditions, constraints, and horizon length on state estimation ...... 254
12.6 Comparison of MHE and EKF computational expense ...... 254
13.1 Model Steady States for a Plant with Tc = 300 K, T = 350 K ...... 268
13.2 Maximum yield CSTR parameters ...... 273
List of Figures
2.1 Microscopic volume considered in the equation of continuity for two dimensions ...... 6
2.2 Optimal control seeks to drive the output to set point ...... 9
2.3 Parameter estimation seeks to minimize the deviations between the model prediction and the data ...... 10
2.4 Illustration of the strong law of large numbers given a uniform distribution ...... 17
2.5 Illustration of the central limit theorem given a uniform distribution ...... 19
3.1 Computational time per simulation as a function of nAo...... 30
3.2 Extent of reaction as a function of nAo ...... 31
3.3 Finite difference sensitivity for the stochastic model ...... 32
3.4 Cyclic nature of viral infections ...... 33
4.1 Comparison of the stochastic-equilibrium simulation to exact stochastic simulation ...... 52
4.2 Comparison of approximate tau-leap simulation to exact stochastic simulation ...... 54
4.3 Comparison of approximate stochastic-Langevin simulation to exact stochastic simulation ...... 55
4.4 Comparison of exact stochastic-deterministic simulation to exact stochastic simulation ...... 56
4.5 Comparison of approximate stochastic-deterministic simulation to exact stochastic simulation ...... 57
4.6 Squared error trends for the exact and approximate stochastic-deterministic simulations ...... 58
4.7 Intracellular viral infections: (a) typical and (b) aborted ...... 60
4.8 Evolution of the template probability distribution for the (a) exact stochastic and (b) approximate stochastic-deterministic simulations ...... 62
4.9 Comparisons of the template probability distribution for the exact stochastic and approximate stochastic-deterministic simulations ...... 63
4.10 Comparison of the template mean and standard deviation for exact stochastic, approximate stochastic-deterministic, and deterministic simulations ...... 64
4.11 Comparison of the genome mean and standard deviation for exact stochastic, approximate stochastic-deterministic, and deterministic simulations ...... 64
4.12 Comparison of the structural protein mean and standard deviation for exact stochastic, approximate stochastic-deterministic, and deterministic simulations ...... 65
5.1 Comparison of the exact, approximate, and central finite difference sensitivities for a second-order reaction ...... 76
5.2 Comparison of the exact and approximate sensitivities for the high-order rate example ...... 77
5.3 Relative error of the approximate sensitivity s with respect to the exact sensitivity s as the number of nA,o molecules increases for the high-order rate example ...... 78
5.4 Comparison of the exact, approximate, and finite difference sensitivity for the high-order rate example ...... 78
5.5 Comparison of the (a) parameter estimates per Newton-Raphson iteration and (b) model fit at iteration 20 using the approximate and finite difference sensitivities for the high-order rate example ...... 81
5.6 Results for the lattice-gas model ...... 84
6.1 Mean E[Sn] as a function of the number of coin flips n ...... 92
6.2 Mean sensitivity ∂E[Sn]/∂θ as a function of the number of coin flips n ...... 93
6.3 Comparison of nominal and perturbed path for SPA analysis ...... 94
6.4 SPA analysis of the discrete decision ...... 94
6.5 Illustration of the branching nature of the perturbed path for SPA analysis ...... 96
6.6 Mean E[nk] as a function of the number of decisions k ...... 97
6.7 Mean sensitivity ∂E[nk]/∂θ as a function of the number of decisions k ...... 97
6.8 Comparison of the exact and simulated (a) mean and (b) mean integrated sensitivity for the irreversible reaction 2A → B ...... 99
7.1 Results for the simple reversible reaction re-using the same random numbers ...... 108
7.2 Results for the simple reversible reaction using different random numbers ...... 109
7.3 Results for one trajectory of the Oregonator cyclical reactions ...... 110
7.4 Results for parameter estimation of the simple reversible reaction example ...... 112
7.5 Results for steady-state analysis of the Oregonator reaction example: estimated state per Newton iteration ...... 114
8.1 Method for calculating the population balance from stochastic simulation ...... 125
8.2 Mean of the stochastic solution for an isothermal crystallization with nucleation and growth, 1 simulation, characteristic particle size ∆ = 0.01, system volume V = 1 ...... 127
8.3 Mean of the stochastic solution for an isothermal crystallization with nucleation and growth, average of 100 simulations, characteristic particle size ∆ = 0.01, system volume V = 1 ...... 128
8.4 Average stochastic simulation time based on 10 simulations and V = 1 ...... 128
8.5 Mean of the stochastic solution for an isothermal crystallization with nucleation and growth, average of 100 simulations, characteristic particle size ∆ = 0.1, system volume V = 1 ...... 129
8.6 Deterministic solution by orthogonal collocation for isothermal crystallization with nucleation and growth ...... 133
8.7 Deterministic solution by orthogonal collocation for isothermal crystallization with nucleation and growth, inclusion of the diffusivity term ...... 134
8.8 Total and supersaturated monomer profiles for nonisothermal crystallization ...... 136
8.9 Crystallizer and cooling jacket temperature profiles ...... 137
8.10 Mean of the exact stochastic solution for nonisothermal crystallization with nucleation and growth ...... 137
8.11 Mean of the approximate stochastic solution for nonisothermal crystallization with nucleation and growth, propensity of no reaction a0 = 10 ...... 138
8.12 Deterministic solution by orthogonal collocation for nonisothermal crystallization with nucleation and growth, inclusion of the diffusivity term ...... 139
8.13 Zeroth moment comparisons ...... 139
8.14 First moment comparisons ...... 140
8.15 Mean of the stochastic solution for an isothermal crystallization with nucleation, growth, and agglomeration ...... 141
8.16 Comparison of final model prediction and measurements for the parameter estimation example ...... 145
8.17 Convergence of parameter estimates as a function of the optimization iteration ...... 146
9.1 Fit of a structured, unsegregated model to experimental results ...... 159
9.2 Time evolution of intracellular components and secreted virus for the intracellular model ...... 160
9.3 Fit of a structured, unsegregated model to experimental results ...... 161
9.4 Dynamic in vivo response of the cell population balance to initial infection ...... 162
9.5 Extracellular model fit to dynamic in vivo response of an initial infection ...... 162
9.6 Dynamic in vivo response to initial treatment with inhibitor drugs I1 and I2 ...... 166
9.7 Effect of drug therapy on in vivo steady states ...... 167
10.1 Overview of the experimental system ...... 172
10.2 Measurement model ...... 174
10.3 Comparison of representative experimental images to model fits ...... 175
10.4 Comparison of the initial uninfected cell concentration for the original and revised models ...... 177
10.5 Comparison of representative experimental images to model fits for VSV propagation on DBT cells ...... 180
10.6 Comparison of intracellular production rates of virus and interferon for the segregated model of VSV propagation on DBT cells ...... 184
10.7 Comparison of representative experimental images to model predictions for VSV propagation on DBT cells in the presence of interferon inhibitors ...... 189
10.8 Experimental (averaged) images obtained from the dynamic propagation of VSV on BHK-21 cells ...... 193
10.9 Experimental (averaged) images obtained from the dynamic propagation of VSV on DBT cells ...... 194
11.1 (a) Comparison of the full and decoupled model solutions for the initial infection example. (b) Percent error for the decoupled model solution, assuming the full solution is exact ...... 201
11.2 Schematic of modeled events for the infection of DBT cells by VSV ...... 202
11.3 Detailed schematic of modeled events for the up-regulation of interferon (IFN) genes ...... 203
11.4 Comparison of experimental data, simple segregated model fit, and the developed model ...... 211
11.5 Comparison of total production of virus (VSV) and interferon (IFN) per cell for the simple segregated model and intracellularly-structured, segregated model ...... 212
11.6 Dynamic measurement of mRNA species for the focal infection system ...... 212
12.1 Comparison of potential point estimates (mean and mode) for (a) unimodal and (b) bimodal a posteriori distributions ...... 216
12.2 Example of using the kernel method to estimate the density of samples drawn from a normal distribution ...... 222
12.3 Example of using a histogram to estimate the density of samples drawn from a normal distribution ...... 222
12.4 Extended Kalman filter results ...... 226
12.5 Contours of P(x1|y0, y1) ...... 227
12.6 Clipped extended Kalman filter results ...... 228
12.7 Moving-horizon estimation results ...... 229
12.8 Contours of max_{x0} P(x1, x0|y0, y1) ...... 230
12.9 A posteriori density P(x1|y0, y1) calculated using a Monte Carlo filter with density estimation ...... 230
12.10 Contours of P(x4|y0, ..., y4) ...... 231
12.11 Contours of max_{x1,...,x3} P(x1, ..., x4|y0, ..., y4) with the arrival cost approximated using the smoothing update ...... 232
12.12 Contours of max_{x1,...,x3} P(x1, ..., x4|y0, ..., y4) with the arrival cost approximated as a uniform prior ...... 233
12.13 Contours of max_{x1,...,x9} P(x1, ..., x10|y0, ..., y10) with the arrival cost approximated using the smoothing update ...... 234
12.14 Extended Kalman filter results ...... 239
12.15 Clipped extended Kalman filter results ...... 240
12.16 Moving-horizon estimation results ...... 241
12.17 Extended Kalman filter results ...... 243
12.18 Clipped extended Kalman filter results ...... 244
12.19 Moving-horizon estimation results ...... 245
12.20 Extended Kalman filter results ...... 246
12.21 Moving-horizon estimation results ...... 247
12.22 Extended Kalman filter results ...... 248
12.23 Moving-horizon estimation results ...... 249
12.24 Clipped extended Kalman filter results ...... 250
12.25 Moving-horizon estimation results ...... 251
12.26 Clipped extended Kalman filter results ...... 252
12.27 Moving-horizon estimation results ...... 253
13.1 General diagram of closed-loop control for the model-predictive control framework ...... 266
13.2 Exothermic CSTR diagram ...... 268
13.3 Steady states for the Exothermic CSTR example ...... 269
13.4 Exothermic CSTR feed disturbance ...... 269
13.5 Exothermic CSTR results: rejection of a feed disturbance using an output disturbance model ...... 271
13.6 Exothermic CSTR: Comparison of best nonlinear results to linear MPC results ...... 272
13.7 Maximum yield CSTR ...... 273
13.8 Maximum yield CSTR steady states ...... 274
13.9 Maximum yield CSTR: temporary output disturbance ...... 274
13.10 Maximum yield CSTR results ...... 275
Chapter 1
Introduction
Chemical reaction models present one method of assimilating and interpreting complex reaction pathways. Usually a deterministic framework is employed to model these networks of chemical reactions. This framework assumes that a system evolves in a continuous, well-prescribed manner. Systems-level tasks seek to extract the maximum amount of utility from these models. Most of these tasks, such as parameter estimation and feedback control, can be posed in terms of optimization problems.

For systems containing small numbers of particles, such as intracellular reaction networks, concentrations are not large enough to justify applying the usual smoothly-varying assumption made in deterministic models. Rather, there are a countably finite number of chemical species in the given system. Stochastic reaction models consider such mesoscopic phenomena in terms of discrete, molecular events that, given a cursory examination, occur in a “random” fashion. These stochastic simulations are merely realizations of a deterministically evolving probability distribution. Here, one must use simulation to reconstruct moments of this distribution due to the tremendous size of the probability space. The basis for these models is well established in the literature, but the methods that govern the exact simulation of these models often become computationally expensive to evaluate and hence have great room for improvement. Additionally, relatively little work has been performed in extending systems-level tasks to handle these sorts of models. Consequently, there exists a need to first formulate reasonable analogs of these traditionally deterministic tasks in a stochastic setting, and then propose methods for efficiently performing these tasks.

One of the simplest, yet most intriguing biological organisms is the virus. The virus contains enough genetic information to replicate itself given the machinery of a living host.
So powerful is this strategy that viral infections present one of the most potent threats to human survival and well-being. The Joint United Nations Programme on HIV/AIDS (UNAIDS) estimates that in 2002, 42 million people were living with HIV/AIDS, 5 million people were newly infected with HIV, and 3.1 million people died due to AIDS-related illnesses. The World Health Organization estimates that of the 170 million people currently suffering from hepatitis C, roughly one million will develop cancer of the liver during the next 10 years. In the United States alone, researchers estimate that the 500 million cases of the common cold contracted annually cost $40 billion in health care costs and lost productivity [31]. Hence there is a vital humanitarian and economic interest in systematically understanding how viral infections progress and how this progression can be controlled. Accordingly, researchers have invested significant amounts of time and money towards determining the roles that individual components such as the genome or proteins play in viral infections. As of yet, however, there exists no comprehensive picture that quantitatively incorporates and integrates data on viral infections from multiple levels. Again, models offer one manner of consolidating the vast amount of information contained across these levels, and systems-level tasks provide one method of conveniently extracting information.

This dissertation considers the role of deterministic and stochastic models in assimilating dynamic data. The primary focus is on maximizing the information available from these models as well as applying such models to experimental systems. The remainder of this thesis is organized as follows:

• Chapter 2 reviews literature pertaining to simulation of deterministic and stochastic chemical reaction models and methods for extracting information from these simulations, such as parameter estimation and state estimation. Here, we introduce the sensitivity as a useful quantity for performing systems-level tasks.
• Chapter 3 provides motivation for solving the problems addressed in this thesis.
• Chapters 4 through 8 examine stochastic simulation with an emphasis on stochastic chemical kinetics. We present this material in the following order:
– In Chapter 4, we derive approximations for stochastic chemical kinetics for systems with coupled fast and slow reactions. These approximations lead to simulation strategies that result in drastic reductions of computational expense when compared to exact simulation methods.
– Chapter 5 considers biased approximations for calculating mean sensitivities from simulation for the stochastic chemical kinetics problem, and then applies these sensitivities to calculate steady states and estimate parameters.
– Chapter 6 explains how the discrete nature of the stochastic chemical kinetics formulation makes obtaining unbiased estimates of mean sensitivities difficult, then explores several techniques for calculating these unbiased estimates.
– Chapter 7 considers unbiased estimates for sensitivities of simulations governed by stochastic differential equations. Here, we simply differentiate the continuous sample paths to obtain the desired sensitivities, then use the sensitivities to perform useful tasks.
– Chapter 8 applies some of the stochastic simulation methods developed in previous chapters to solve the batch crystallization population balance. The flexibility of the simulation allows the modeler to focus on modeling the experimental system rather than the numerical methods required to solve the resulting models.
• Chapters 9 through 11 address population balance models for viral infections. We consider the following issues:
– Chapter 9 derives a population balance model incorporating information from both the intracellular and extracellular levels of description. To explore the utility of this model, we compare numerical results from this model to other, simpler models for experimentally relevant conditions.
– Chapter 10 considers modeling of experimental data from the focal infection system. This experimental system provides dynamic image data for multiple rounds of virus infection and antiviral host response. Here, we place an emphasis on determining the minimal level of modeling complexity necessary to adequately describe the experimental data.
– Chapter 11 proposes a decomposition technique for solving population balance models when flow of information is restricted from the extracellular to intracellular level. The goal is to efficiently and accurately solve population balance models while reconstructing population-level dynamics for intracellular and extracellular species.
• Chapters 12 and 13 consider one specific systems-level task, namely state estimation. These chapters focus on the probabilistic formulation of the state estimation problem, in which the goal is to calculate the state estimate that maximizes the a posteriori distribution (the probability of the current state conditioned on all available experimental measurements). We examine the following topics:
– Chapter 12 outlines conditions for generating multiple modes in the a posteriori distribution for some relevant chemically reacting systems. We then construct examples exhibiting such conditions, and compare how several state estimators, namely the extended Kalman filter, moving-horizon estimator, and Monte Carlo filters, perform for these examples.
– Chapter 13 examines how multiple modes in the a posteriori distribution can affect the performance of closed-loop feedback control for different estimators.
• Finally, Chapter 14 presents conclusions, outlines major accomplishments, and discusses potential areas of future work.
Chapter 2
Literature Review
Models for chemical reaction networks usually arise in a traditional, deterministic setting. Given a deterministic model, we can consider performing various systems level tasks such as parameter estimation and control. We can generally pose these tasks in terms of an optimization. In this context, a quantity known as the sensitivity becomes useful for efficient solution of the optimization. The shortcomings of the traditional deterministic framework motivate the need for alternatives that provide a more flexible foundation for chemical reaction modeling. Two such alternatives are stochastic and population balance models. This chapter presents a brief review of the modeling literature for both these subjects and the traditional models.
2.1 Traditional Deterministic Reaction Models
In a deterministic setting, we perform mass balances for the reactants and products of interest using the equation of continuity. Here we define the mass of these species as a function of time (t) and the internal (y) and external (x) characteristics of the system:
\eta(t, z)\,dz = \text{mass of reactants or products}   (2.1)

z = \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \text{external characteristics} \\ \text{internal characteristics} \end{bmatrix}   (2.2)
We now consider an arbitrary, time-varying control volume V (t) spanning a space in z. This volume has a time-varying surface S(t). The normal vector ns points from the surface away from the volume, and the vector vs specifies the velocity of the surface. The vector vz specifies the velocity of material flowing through the volume. Figure 2.1 depicts a low-dimensional representation of this volume. Assuming that V (t) contains a statistically significant amount of mass, the conservation equation for the species contained in V (t) is
\underbrace{\frac{d}{dt} \int_{V(t)} \eta(t, z)\,dz}_{\text{accumulation}} = \underbrace{\int_{V(t)} R_\eta\,dz}_{\text{generation}} - \underbrace{\int_{S(t)} F \cdot n_s\,d\Omega}_{\text{convective + diffusive flux}} + \underbrace{\int_{S(t)} \eta(t, z)(v_s \cdot n_s)\,d\Omega}_{\text{flux due to surface motion}}   (2.3)
Figure 2.1: Microscopic volume considered in the equation of continuity for two dimensions.
in which Rη refers to the production rate of the species η, F is the total flux, and dΩ is the differential change in the surface. Making use of the Leibniz formula permits differentiating the volume integral
\frac{d}{dt} \int_{V(t)} \eta(t, z)\,dz = \int_{V(t)} \frac{\partial \eta(t, z)}{\partial t}\,dz + \int_{S(t)} \eta(t, z)(v_s \cdot n_s)\,d\Omega   (2.4)
Substituting equation (2.4) into equation (2.3) yields
\int_{V(t)} \frac{\partial \eta(t, z)}{\partial t}\,dz = \int_{V(t)} R_\eta\,dz - \int_{S(t)} F \cdot n_s\,d\Omega   (2.5)
Now apply the divergence theorem to the surface integral to obtain
\int_{V(t)} \frac{\partial \eta(t, z)}{\partial t}\,dz = \int_{V(t)} R_\eta\,dz - \int_{V(t)} \nabla \cdot F\,dz   (2.6)
Combining all terms into the same integral yields
\int_{V(t)} \left( \frac{\partial \eta(t, z)}{\partial t} + \nabla \cdot F - R_\eta \right) dz = 0   (2.7)
Since the element V (t) is arbitrary, the argument of the integral must be zero; this result yields the microscopic equation of continuity:
\frac{\partial \eta(t, z)}{\partial t} + \nabla \cdot F = R_\eta   (2.8)
Equation (2.8) is the most general form of our proposed model. Both Bird, Stewart, and Lightfoot [11] and Deen [24] derive this equation without consideration of internal characteristics. We consider a time-varying control element, so our derivation is more akin to that of Deen [24]. Traditionally, one assumes that there are no internal characteristics of interest. Equation (2.8) then further reduces to:

\frac{\partial \eta(t, x)}{\partial t} + \nabla \cdot F = R_\eta   (2.9)

Additionally, we can write the total flux F as the sum of convective and diffusive fluxes
F = η(t, x)vx + f (2.10)
We now assume that the reactor is well-stirred so that neither η nor Rη depend on the external coordinates x. This assumption implies that there is no diffusive flux, i.e. f = 0, which yields

\frac{\partial \eta(t)}{\partial t} + \nabla \cdot (\eta(t) v_x) = R_\eta   (2.11)
Next, we integrate over the time-varying reactor volume V_e:

\int_{V_e} \left( \frac{\partial \eta(t)}{\partial t} + \nabla \cdot (\eta(t) v_x) \right) dx = \int_{V_e} R_\eta\,dx   (2.12)

\int_{V_e} \frac{\partial \eta(t)}{\partial t}\,dx + \int_{V_e} \nabla \cdot (\eta(t) v_x)\,dx = \int_{V_e} R_\eta\,dx   (2.13)

V_e \frac{d\eta}{dt} + \int_{V_e} \nabla \cdot (\eta v_x)\,dx = R_\eta V_e   (2.14)

in which we have dropped the time dependence of η for notational convenience. Applying the divergence theorem to change the volume integral to a surface integral yields

V_e \frac{d\eta}{dt} + \int_{S_e} n_e \cdot (\eta v_x)\,d\Omega_e = R_\eta V_e   (2.15)

in which
• Se is the time-varying surface of the reactor volume Ve,
• dΩe is the differential change in this surface, and
• ne is the normal vector with respect to the surface pointing away from the reactor volume.

Clearly η does not change within the reactor volume. However, changes to the surface as well as influx and outflow of material across the reactor boundary affect η as follows:

\int_{S_e} n_e \cdot (\eta v_x)\,d\Omega_e = \underbrace{\int_{S_{e,1}} n_e \cdot (\eta v_x)\,d\Omega_{e,1}}_{\text{flow across the reactor surface}} + \underbrace{\int_{S_{e,2}} n_e \cdot (\eta v_x)\,d\Omega_{e,2}}_{\text{surface expansion due to reactor volume changes}}   (2.16)

= q\eta - q_f \eta_f + \eta \frac{dV_e}{dt}   (2.17)

in which q and q_f are respectively the effluent and feed volumetric flow rates, and η_f is the concentration of η in the feed. The resulting conservation equation is
V_e \frac{d\eta}{dt} - q_f \eta_f + q\eta + \eta \frac{dV_e}{dt} = R_\eta V_e   (2.18)
\frac{d(\eta V_e)}{dt} = q_f \eta_f - q\eta + R_\eta V_e   (2.19)
Equation (2.19) is commonly associated with continuous stirred-tank reactors (CSTRs). Alternatively, we could have derived the plug-flow reactor (PFR) design equation by starting with equation (2.9) and assuming that the reactor is well mixed in only two external dimensions.
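The balance (2.19) can be integrated with any standard ODE solver. As a minimal sketch (not the dissertation's code), consider a constant-volume CSTR with a single species consumed by an assumed first-order reaction, R_η = −kη; all numerical values here are invented for illustration.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Sketch of the CSTR balance (2.19) with constant volume Ve (so q = qf)
# and an assumed first-order consumption rate R_eta = -k*eta.
k = 0.5      # 1/min, rate constant (illustrative)
q = 1.0      # L/min, feed and effluent flow rate
Ve = 10.0    # L, reactor volume
eta_f = 2.0  # mol/L, feed concentration

def cstr_rhs(t, eta):
    # d(eta)/dt = (q/Ve)*(eta_f - eta) + R_eta for constant Ve
    return [q / Ve * (eta_f - eta[0]) - k * eta[0]]

sol = solve_ivp(cstr_rhs, [0.0, 60.0], [0.0], rtol=1e-8, atol=1e-10)
eta_ss = q * eta_f / (q + k * Ve)  # analytical steady state
print(sol.y[0, -1], eta_ss)        # both approach ~0.333 mol/L
```

At steady state, equation (2.19) with dVe/dt = 0 gives η = qη_f/(q + kVe), which the integration reproduces.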
2.2 Systems Level Tasks for Deterministic Models
Performing systems level tasks such as parameter estimation, model-based feedback control, and process and product design requires a different set of tools than those required for pure simulation. Many systems level tasks are conveniently posed as optimization problems. We briefly review several of these tasks, namely optimal control, state estimation, and parameter estimation, and introduce the sensitivity as a useful quantity for performing these tasks.
2.2.1 Optimal Control
Optimal control consists of minimizing an objective of the form
\min_{u_0, \ldots, u_N} \Phi = \sum_{k=0}^{N} (y_k - y_{sp})^T Q (y_k - y_{sp}) + (u_k - u_{sp})^T R (u_k - u_{sp}) + (\Delta u_k)^T S \Delta u_k   (2.20a)
s.t. xk+1 = F (xk, uk) (2.20b)
yk = h(xk) (2.20c)
\Delta u_k = u_k - u_{k-1}, \quad d(x_k) \geq 0, \quad g(u_k) \geq 0   (2.20d)

in which
• yk is the measurement at time tk;
• uk is the input at time tk;
• xk is the state at time tk;
• F (xk, uk) is the solution to a first-principles model (e.g. equation (2.19)) over the time interval [tk, tk+1);
• ysp and usp are the measurement and input, respectively, at the desired set point;

• Q and R are matrices that penalize deviations of the measurement and input from set point; and

• S is a matrix that penalizes changes in the input.
In general, the optimal control problem considers an infinite number of decisions, i.e. the control horizon N is infinite. As shown in Figure 2.2, the goal of optimization (2.20) is to drive the measurements to their set points. Most control applications sample at discrete times, so we have formulated the model, equation (2.20b), in discrete time as well.
Figure 2.2: Optimal control seeks to drive the output to set point by minimizing deviations of both the output y and the input u from their respective set points.
There is a wealth of control literature that examines the properties of equation (2.20). For example, this formulation does not even guarantee that the controller will drive the outputs to set point. Rather, one must include additional conditions such as enforcing a terminal constraint on each optimization (i.e. yN = ysp) or adding a terminal penalty to the final measurement yN that quantifies the cost-to-go for an infinite horizon. We refer the interested reader to the literature for additional information on this subject [119, 118, 90].
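To make formulation (2.20) concrete, the following sketch solves a small instance directly as a nonlinear program for a scalar linear model. The model, weights, horizon, and set points are invented, and no terminal constraint or cost-to-go penalty is imposed, so this is purely illustrative.

```python
import numpy as np
from scipy.optimize import minimize

# Minimal sketch of objective (2.20) for a scalar linear model
# x_{k+1} = a*x_k + b*u_k, y_k = x_k, optimized over the input sequence.
a, b = 0.9, 0.5
x0, ysp = 5.0, 1.0
usp = (1.0 - a) * ysp / b  # input that holds x at the set point
Q, R, S, N = 1.0, 0.1, 0.01, 20

def objective(u):
    x, uprev, phi = x0, usp, 0.0
    for k in range(N):
        phi += Q*(x - ysp)**2 + R*(u[k] - usp)**2 + S*(u[k] - uprev)**2
        uprev = u[k]
        x = a*x + b*u[k]  # model constraint (2.20b) substituted in
    return phi

res = minimize(objective, np.full(N, usp), method="BFGS")
# Roll the model forward under the optimal inputs.
x = x0
for uk in res.x:
    x = a*x + b*uk
print(round(x, 3))  # the state is driven close to ysp = 1.0
```

In practice one would use a dedicated quadratic-programming or model predictive control solver rather than a generic optimizer, but the structure of the decision problem is the same.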
2.2.2 State Estimation
State estimation poses the problem: given a time course of experimental measurements and a dynamic model of the system, what is the most likely state of the system? This problem is usually formulated probabilistically, that is, we would like to calculate
\hat{x}_{k|k} = \arg\max_{x_k} P(x_k \mid y_0, \ldots, y_k)   (2.21)

in which x_k is the state at time t_k, y_k is the measurement at time t_k, and \hat{x}_{k|k} is the a posteriori state estimate of x at time t_k given all measurements up to time t_k. The nature of the estimator depends greatly on the choice of dynamic model. For linear, unconstrained systems with additive Gaussian noise, the Kalman filter [144] provides a closed-form solution to equation (2.21). For constrained or nonlinear systems, solution of this equation may or may not be tractable. One computationally attractive method for addressing the nonlinear system is the extended Kalman filter, which first linearizes the nonlinear system, then applies the Kalman filter update equations to the linearized system [144]. This technique assumes that the a posteriori distribution is normally distributed (unimodal). Examples of implementations include estimation for the production of silicon/germanium alloy films [93], polymerization reactions [103], and fermentation processes [55]. However, the extended Kalman filter, or EKF, is at best an ad hoc solution to a difficult problem, and hence there exist many barriers to the practical implementation of EKFs (see, for example, Wilson et al. [163]).
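The linearize-then-update recipe can be sketched for a scalar system as follows; the dynamics, measurement map, and noise levels are invented for illustration and are not taken from the references above.

```python
import numpy as np

# Hedged sketch of an extended Kalman filter for the scalar system
# x_{k+1} = f(x_k) + w_k, y_k = h(x_k) + v_k (all values illustrative).
f  = lambda x: x + 0.1 * (1.0 - x**2)   # assumed nonlinear dynamics
fx = lambda x: 1.0 - 0.2 * x            # df/dx
h  = lambda x: x**2                     # assumed measurement map
hx = lambda x: 2.0 * x                  # dh/dx
Qw, Rv = 1e-4, 1e-2                     # process / measurement variances

rng = np.random.default_rng(0)
x_true, x_hat, P = 0.5, 0.2, 1.0
for k in range(200):
    # simulate the plant
    x_true = f(x_true) + rng.normal(0.0, np.sqrt(Qw))
    y = h(x_true) + rng.normal(0.0, np.sqrt(Rv))
    # propagate the estimate and covariance through the linearized model
    F = fx(x_hat)
    x_hat = f(x_hat)
    P = F**2 * P + Qw
    # apply the Kalman update to the linearized measurement
    H = hx(x_hat)
    K = P * H / (H * P * H + Rv)
    x_hat = x_hat + K * (y - h(x_hat))
    P = (1.0 - K * H) * P

print(abs(x_hat - x_true))  # small for this well-behaved example
```

The approximation is exactly the one criticized above: the a posteriori distribution is replaced by a Gaussian characterized by x_hat and P, which can fail badly when that distribution is multimodal.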
2.2.3 Parameter Estimation
Parameter estimation seeks to reconcile model predictions with experimental data, as shown in Figure 2.3. In particular, we would like to maximize the probability of the mean parameter
Figure 2.3: Parameter estimation seeks to minimize the deviations between the model predic- tion (solid line) and the data (points).
set θ given the measurements yk’s
\max_\theta P_{\Theta \mid Y_0, \ldots, Y_N}(\theta \mid y_0, \ldots, y_N)   (2.22)

in which θ and y_k are realizations of the random variables Θ and Y_k, respectively. For convenience, we drop the subscript denoting the random variable unless required for clarity. We assume that the measurements y_k's are generated from an underlying deterministic model
xk+1 = F (xk; θ) (2.23)
yk = h(xk) + vk (2.24)
vk ∼ N (0, Π) ∀k = 0,...,N (2.25) in which
• the state variables xk’s are simply convenient functions of the parameters θ and
• the variables vk's are realizations of the normally distributed random variable ξ ∼ N(0, Π)¹.
Using Bayes’ Theorem to manipulate the joint distribution P (θ|y0,..., yN ) yields
P(\theta \mid y_0, \ldots, y_N) \underbrace{P(y_0, \ldots, y_N)}_{\text{constant}} = P(y_0, \ldots, y_N \mid \theta) P(\theta)   (2.26)
P (θ|y0,..., yN ) ∝ P (y0,..., yN |θ)P (θ) (2.27)
In general, P(θ) is assumed to be a noninformative prior so as not to unduly influence the estimate of the parameters. For the chosen disturbances (i.e. normally distributed), Box and Tiao show that the noninformative prior is the distribution P(θ) = constant [14]. We derive the distribution of P(y_0, \ldots, y_N, θ) from the known distribution P(v_0, \ldots, v_N, θ) in the manner described by Ross [130]. This derivation requires use of the inverse function theorem from calculus [132]. First define the function mapping (v_0, \ldots, v_N, θ) onto (y_0, \ldots, y_N, θ) as

f(v_0, \ldots, v_N, \theta) = \begin{bmatrix} h(x_0(\theta)) + v_0 \\ \vdots \\ h(x_N(\theta)) + v_N \\ \theta \end{bmatrix}   (2.28)
We require that
1. f(v0,..., vN , θ) can be uniquely solved for v0,..., vN and θ in terms of y0,..., yN and θ. This condition is trivially true because
v_k = y_k - h(x_k(\theta)) \quad \forall k = 0, \ldots, N   (2.29a)

\theta = \theta   (2.29b)
2. f(v0,..., vN, θ) has continuous partial derivatives at all points and the determinant of its Jacobian is nonzero. The Jacobian J of equation (2.28) is

J = \frac{\partial f(v_0, \ldots, v_N, \theta)}{\partial z^T} = \begin{bmatrix} I & & & \frac{\partial h(x_0(\theta))}{\partial x_0^T} \frac{\partial x_0}{\partial \theta^T} \\ & \ddots & & \vdots \\ & & I & \frac{\partial h(x_N(\theta))}{\partial x_N^T} \frac{\partial x_N}{\partial \theta^T} \\ & & & I \end{bmatrix}   (2.30)

z = \begin{bmatrix} v_0^T & \ldots & v_N^T & \theta^T \end{bmatrix}^T   (2.31)

¹The notation N(0, Π) refers to a normally distributed random variable with mean 0 and covariance Π.
If h(xk) and xk are at least once continuously differentiable for all k = 0,...,N, then the Jacobian has continuous partial derivatives. Also, J is a block-upper triangular matrix with ones on the diagonal, so its determinant is one (nonzero).
Since these conditions hold, we can calculate the distribution P (y0,..., yN , θ) via
P(y_0, \ldots, y_N, \theta) = \det(J)^{-1} P(v_0, \ldots, v_N, \theta)   (2.32)

= \left( \prod_{k=0}^{N} P_\xi(v_k) \right) P(\theta)   (2.33)
Then the desired conditional is
P(y_0, \ldots, y_N \mid \theta) = \frac{P(y_0, \ldots, y_N, \theta)}{P(\theta)}   (2.34)

= \prod_{k=0}^{N} P_\xi(v_k)   (2.35)

= \prod_{k=0}^{N} P_\xi(y_k - h(x_k(\theta)))   (2.36)
We derive the desired optimization problem next:
\max_\theta P(\theta \mid y_0, \ldots, y_N) \propto \max_\theta \prod_{k=0}^{N} P_\xi(v_k)   (2.37)

= \max_\theta \log \left( \prod_{k=0}^{N} P_\xi(v_k) \right)   (2.38)

= \max_\theta \sum_{k=0}^{N} \log P_\xi(y_k - h(x_k(\theta)))   (2.39)

\propto \min_\theta \sum_{k=0}^{N} \frac{1}{2} (y_k - h(x_k))^T \Pi^{-1} (y_k - h(x_k))   (2.40)
Therefore, this problem is equivalent to the optimization
\min_\theta \Phi = \frac{1}{2} \sum_{k=0}^{N} e_k^T \Pi^{-1} e_k   (2.41a)
ek = yk − h(xk) (2.41b)
xk+1 = F (xk; θ) (2.41c)
We refer the reader to Box and Tiao [14] and Stewart, Caracotsios, and Sørensen [145] for a more detailed account of estimating parameters from data. Their discussion includes, for example, calculation of confidence intervals for estimated parameters.
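As a minimal numerical sketch of optimization (2.41), the following fits a single rate constant of an assumed first-order decay model to synthetic data; the true value, noise level, and sample times are all invented for illustration.

```python
import numpy as np
from scipy.optimize import minimize

# Sketch of (2.41): estimate theta in x_{k+1} = exp(-theta*dt)*x_k,
# i.e. x(t) = x0*exp(-theta*t), from measurements y_k = x_k + v_k.
dt, theta_true, x0 = 0.5, 0.8, 10.0
rng = np.random.default_rng(1)
N = 30
t = dt * np.arange(1, N + 1)
y = x0 * np.exp(-theta_true * t) + rng.normal(0.0, 0.05, N)

def phi(theta):
    # Phi = (1/2) sum_k e_k' Pi^{-1} e_k, with the scalar Pi absorbed
    e = y - x0 * np.exp(-theta[0] * t)
    return 0.5 * np.dot(e, e)

theta_hat = minimize(phi, [0.1], method="BFGS").x[0]
print(round(theta_hat, 2))  # recovers a value near 0.8
```

A production implementation would also report confidence intervals from the Hessian at the optimum, along the lines discussed by Box and Tiao [14].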
2.2.4 Sensitivities
We define the sensitivity s as
s = \frac{\partial x}{\partial \theta^T}   (2.42)

in which x is the state of the system and θ is a vector containing the parameters of interest for the system. This quantity is useful for efficiently performing optimization. In particular, sensitivities provide precise first-order information about the solution of the system, and this first-order information is manipulated to calculate gradients and Hessians that guide the nonlinear optimization routines. For example, consider the nonlinear optimization for parameter estimation, equation (2.41). A strict local solution to this optimization is obtained when the gradient is zero and the Hessian is positive definite. Calculating these quantities yields
\nabla_\theta \Phi = \frac{\partial}{\partial \theta^T} \frac{1}{2} \sum_k e_k^T \Pi^{-1} e_k   (2.43)

= -\sum_k \left( \frac{\partial h(x_k)}{\partial x_k^T} \frac{\partial x_k}{\partial \theta^T} \right)^T \Pi^{-1} e_k   (2.44)

= -\sum_k \left( \frac{\partial h(x_k)}{\partial x_k^T} s_k \right)^T \Pi^{-1} e_k   (2.45)
\nabla_{\theta\theta} \Phi = \frac{\partial}{\partial \theta^T} \nabla_\theta \Phi   (2.46)

= \frac{\partial}{\partial \theta^T} \left( -\sum_k \left( \frac{\partial h(x_k)}{\partial x_k^T} s_k \right)^T \Pi^{-1} e_k \right)   (2.47)

= \sum_k \left( \frac{\partial h(x_k)}{\partial x_k^T} s_k \right)^T \Pi^{-1} \frac{\partial h(x_k)}{\partial x_k^T} s_k - \left( \frac{\partial h(x_k)}{\partial x_k^T} \frac{\partial^2 x_k}{\partial \theta \partial \theta^T} \right)^T \Pi^{-1} e_k   (2.48)
The sensitivity s clearly arises in calculation of both of these quantities. Next, we consider calculation of sensitivities for ordinary differential equations (ODE’s) and differential algebraic equations (DAE’s). This analysis basically summarizes the excellent work presented by Caracotsios et al. [17].
ODE Sensitivities
ODE systems may be written in the following form:

\frac{dx}{dt} = f(x, \theta)   (2.49a)

x(0) = x_0   (2.49b)

Accordingly, we can obtain an expression for the evolution of the sensitivity by differentiating equation (2.49a) with respect to the parameters θ:

\frac{\partial}{\partial \theta^T} \frac{dx}{dt} = \frac{\partial}{\partial \theta^T} f(x, \theta)   (2.50)

\frac{d}{dt} \frac{\partial x}{\partial \theta^T} = \frac{\partial f(x, \theta)}{\partial x^T} \frac{\partial x}{\partial \theta^T} + \frac{\partial f(x, \theta)}{\partial \theta^T} \frac{\partial \theta}{\partial \theta^T}   (2.51)

\frac{ds}{dt} = \frac{\partial f(x, \theta)}{\partial x^T} s + \frac{\partial f(x, \theta)}{\partial \theta^T}   (2.52)

This analysis demonstrates that the evolution equation for the sensitivity is the following ODE system:

\frac{ds}{dt} = \frac{\partial f(x, \theta)}{\partial x^T} s + \frac{\partial f(x, \theta)}{\partial \theta^T}   (2.53a)

s_{i,j}(0) = \begin{cases} 1 & \text{if } x_{0,i} = \theta_j \\ 0 & \text{otherwise} \end{cases}   (2.53b)
Equation (2.53) demonstrates two distinctive features about the evolution equation for the sensitivity: 1. it is linear with respect to s, and
2. it depends only on the current values of s and x. Therefore, we can solve for s by merely integrating equation (2.53) along with the ODE system (2.49).
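For a concrete check of equation (2.53), consider the scalar model dx/dt = −θx with x(0) = x₀ (an invented example), for which the sensitivity is available analytically: s(t) = −t x₀ e^{−θt}.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Sketch of (2.53): integrate the sensitivity alongside the state for
# dx/dt = -theta*x. Here df/dx = -theta, df/dtheta = -x, and s(0) = 0
# because x0 is not a parameter.
theta, x0 = 0.7, 2.0

def rhs(t, z):
    x, s = z
    return [-theta * x,          # state equation (2.49a)
            -theta * s - x]      # sensitivity equation (2.53a)

sol = solve_ivp(rhs, [0.0, 3.0], [x0, 0.0], rtol=1e-9, atol=1e-12)
s_num = sol.y[1, -1]
s_exact = -3.0 * x0 * np.exp(-theta * 3.0)
print(s_num, s_exact)  # both about -0.735
```

The augmented system is integrated once, which is the efficiency argument for sensitivity equations over repeated finite-difference perturbations of θ.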
DAE Sensitivities
DAE systems take the following general form:

0 = g(\dot{x}, x, \theta)   (2.54a)
x(0) = x0 (2.54b)
\dot{x}(0) = \dot{x}_0   (2.54c)

where x is the state of the system, \dot{x} is the first derivative of x, and θ is a vector containing the parameters of interest for the system. Again, we define the sensitivity s by equation (2.42) and differentiate equation (2.54a) with respect to θ to determine an expression for the evolution of the sensitivity:

0 = \frac{\partial}{\partial \theta^T} g(\dot{x}, x, \theta)   (2.55)

0 = \frac{\partial g(\dot{x}, x, \theta)}{\partial \dot{x}^T} \frac{\partial \dot{x}}{\partial \theta^T} + \frac{\partial g(\dot{x}, x, \theta)}{\partial x^T} \frac{\partial x}{\partial \theta^T} + \frac{\partial g(\dot{x}, x, \theta)}{\partial \theta^T} \frac{\partial \theta}{\partial \theta^T}   (2.56)

0 = \frac{\partial g(\dot{x}, x, \theta)}{\partial \dot{x}^T} \frac{d}{dt} \frac{\partial x}{\partial \theta^T} + \frac{\partial g(\dot{x}, x, \theta)}{\partial x^T} \frac{\partial x}{\partial \theta^T} + \frac{\partial g(\dot{x}, x, \theta)}{\partial \theta^T}   (2.57)

0 = \frac{\partial g(\dot{x}, x, \theta)}{\partial \dot{x}^T} \frac{ds}{dt} + \frac{\partial g(\dot{x}, x, \theta)}{\partial x^T} s + \frac{\partial g(\dot{x}, x, \theta)}{\partial \theta^T}   (2.58)

This analysis demonstrates that the evolution equation for the sensitivity of a DAE system yields a linear DAE system:

0 = \frac{\partial g(\dot{x}, x, \theta)}{\partial \dot{x}^T} \dot{s} + \frac{\partial g(\dot{x}, x, \theta)}{\partial x^T} s + \frac{\partial g(\dot{x}, x, \theta)}{\partial \theta^T}   (2.59a)

s_{i,j}(0) = \begin{cases} 1 & \text{if } x_{0,i} = \theta_j \\ 0 & \text{otherwise} \end{cases}   (2.59b)
\dot{s}(0) = \dot{s}_0   (2.59c)
As is the case for the original DAE system (2.54), we must pick a consistent initial condi- tion (i.e. s0 and s˙ 0 must satisfy equation (2.59a)). Again, we find that we can solve for the sensitivities of the system by merely integrating equation (2.59) along with the original DAE system (2.54).
2.3 Stochastic Reaction Models
When dealing with systems containing a countably finite number of molecules, deterministic models make the unrealistic assumptions that
1. mesoscopic phenomena can be treated as continuous events; and
2. identical systems given identical perturbations behave precisely the same.
For example, most models of intracellular kinetics inherently examine a small number of molecules contained within a single cell (the finite number of chromosomes in the nucleus, for example), making the first assumption invalid. Additionally, identical systems given identical perturbations may elicit completely different responses. Stochastic models of chemical kinetics make no such assumptions, and hence offer one alternative to traditional deterministic models. These models have recently received an increased amount of attention from the modeling community (see, for example, [3, 91, 79]).
Stochastic models of chemical kinetics postulate a deterministic evolution equation for the probability of being in a state rather than for the state itself, as is the case in the usual deterministic models. Gillespie outlines the derivation of the evolution equation for this probability distribution in depth [48]. The basis of this derivation depends on the “fundamental hypothesis” of the stochastic formulation of chemical kinetics, which defines the reaction parameter cµ characterizing reaction µ as:

cµ dt = average probability, to first order in dt, that a particular combination of µ reactant molecules will react accordingly in the next time interval dt.

We also define
• hµ as the number of distinct molecular reactant combinations for reaction µ at a given time, and
• aµ(n) dt = hµ cµ dt as the probability, to first order in dt, that a µ reaction will occur in the next time interval dt.
Given this “fundamental hypothesis”, the governing equation for this system is the chemical master equation
\frac{dP(n, t)}{dt} = \sum_{k=1}^{m} a_k(n - \nu_k) P(n - \nu_k, t) - a_k(n) P(n, t)   (2.60)

in which
• n is the state of the system in terms of number of molecules (a p-vector),
• P (n, t) is the probability that the system is in state n at time t,
• ak(n)dt is the probability to order dt that reaction k occurs in the time interval [t, t + dt), and
• νk is the kth column of the stoichiometric matrix ν (a p × m matrix).
Here, we assume that the initial condition P(n, t_0) is known. The solution of equation (2.60) is computationally intractable for all but the simplest systems. Rather, Monte Carlo methods are employed to reconstruct the probability distribution and its statistics (usually the mean and variance). We consider such methods subsequently.
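Before turning to Monte Carlo methods, note that for the very simplest systems equation (2.60) can be solved directly. As an illustrative sketch (with invented rate parameters), a single molecule undergoing A ⇌ B has only two states, so the master equation reduces to a two-state linear ODE solved by a matrix exponential:

```python
import numpy as np
from scipy.linalg import expm

# Two-state master equation for one A <-> B molecule: dP/dt = A P,
# with P = [P(A), P(B)]. Rate parameters are illustrative.
c1, c2 = 1.0, 0.5                 # A -> B and B -> A reaction parameters
A = np.array([[-c1,  c2],
              [ c1, -c2]])        # generator matrix
P0 = np.array([1.0, 0.0])         # start surely in state A
P = expm(A * 10.0) @ P0           # distribution at t = 10
print(P)  # near the stationary distribution [1/3, 2/3]
```

The state space of any nontrivial network grows combinatorially with the number of molecules, which is exactly why such direct solutions are infeasible in general.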
2.3.1 Monte Carlo Simulation of the Stochastic Model
Monte Carlo methods take advantage of the fact that any statistic can be written in terms of a large sample limit of observations, i.e.
\overline{h(n)} \triangleq \int h(n) P(n, t)\,dn = \lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} h(n^i) \approx \frac{1}{N} \sum_{i=1}^{N} h(n^i) \quad \text{for } N \text{ sufficiently large}   (2.61)
in which n^i is the ith Monte Carlo reconstruction of the state n. Accordingly, the desired statistic can be reconstructed to sufficient accuracy given a large enough number of observations. This statement follows as a direct result of the strong law of large numbers, which we state next.

Figure 2.4: Illustration of the strong law of large numbers given a uniform distribution over the interval [0, 1]. As the number of samples increases, the sample mean converges to the true mean of 0.5.
Theorem 2.1 (Strong Law of Large Numbers [130].) Let X_1, X_2, \ldots, X_n be a sequence of independent and identically distributed random variables, each having finite mean E[X_i] = m. Then, with probability 1,

\lim_{n \to \infty} \frac{X_1 + \cdots + X_n}{n} = m   (2.62)
Proof: See Ross for details of the proof [130].

In this case, reconstructions of the desired statistic, i.e. h(n^i), are independent and identically distributed variables according to the common density function given by the chemical master equation (2.60). Therefore, sampling sufficiently many of these h(n^i) gives us the convergence to \overline{h(n)} specified by the strong law of large numbers.

We illustrate the strong law of large numbers with a simple example. Consider a uniform distribution over the interval [0, 1]. This distribution has a finite mean of 0.5. The strong law of large numbers requires the average of samples drawn from this distribution to approach the mean with probability one. Figure 2.4 plots the average as a function of sample size; clearly this value approaches 0.5 as the number of samples increases.

Unfortunately, the strong law of large numbers gives no indication as to the accuracy of the reconstructed statistic given a finite number of samples. An estimate for the degree of accuracy actually arises from the central limit theorem, which we state next.
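The experiment of Figure 2.4 is easy to reproduce; the following sketch (sample counts invented) computes the running mean of uniform samples and watches it approach 0.5.

```python
import numpy as np

# Running mean of uniform [0, 1] samples, as in Figure 2.4: the sample
# mean converges to the true mean of 0.5 as the sample size grows.
rng = np.random.default_rng(2)
x = rng.uniform(0.0, 1.0, 5000)
running_mean = np.cumsum(x) / np.arange(1, 5001)
print(abs(running_mean[99] - 0.5), abs(running_mean[-1] - 0.5))
# the deviation shrinks roughly as O(1/sqrt(N))
```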
Theorem 2.2 (Central Limit Theorem [130].) Let X_1, X_2, \ldots, X_n be a sequence of independent and identically distributed random variables, each having finite mean m and finite variance σ². Then the distribution of

Z_n = \frac{X_1 + \cdots + X_n - nm}{\sigma \sqrt{n}}   (2.63)

tends to the standard normal as n → ∞. That is,

\lim_{n \to \infty} P(Z_n \leq a) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{a} e^{-x^2/2}\,dx
Proof: See Ross for details of the proof [130].

In this case, we now expect the reconstruction of the desired statistic, i.e. h(n), to be normally distributed assuming a large enough finite sample N. Simulating this statistic multiple times (e.g. twenty samples of h(n) reconstructed from N samples each, or 20 × N total samples) permits indirect estimation of standard statistics for h(n) such as confidence intervals. How does one check whether or not the finite sample size N is large enough to justify invocation of the central limit theorem, then? Kreyszig proposes the following rule of thumb for determining this number of samples: if the skewness of the distribution is small, use at least twenty and fifty samples to reconstruct the mean and variance, respectively [75]. We can also reconstruct multiple realizations of the Z_N distribution, then use statistical tests such as the Shapiro-Wilk test to test this distribution for normality [137, 131]. If these tests indicate normality, then we are free to apply the usual statistical inferences for the Z_N distribution and hence obtain some measure of the accuracy of the reconstructed statistic h(n).

We illustrate the central limit theorem using again the uniform density over the range [0, 1]. Figure 2.5 compares the Monte Carlo reconstructed density for Z_N to the standard normal distribution. For N = 1, the reconstructed density of Z_N is obviously not normal; in fact, this plot merely reconstructs the underlying uniform distribution (appropriately shifted). For N = 20, the reconstructed density of Z_N compares favorably to the standard normal.

These statistical theorems, then, ultimately require samples to be drawn exactly from the master equation. For nontrivial examples, direct solution of the master equation is not feasible. Alternatively, one could consider an exact stochastic simulation of the “fundamental hypothesis” as examined by Gillespie [45].
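The Z_N construction and the Shapiro-Wilk check described above can be sketched as follows; the realization counts are invented for illustration.

```python
import numpy as np
from scipy.stats import shapiro

# Build realizations of Z_N, equation (2.63), from uniform [0, 1]
# samples (mean 1/2, variance 1/12) and test them for normality.
rng = np.random.default_rng(3)
m, sigma = 0.5, np.sqrt(1.0 / 12.0)

def z_samples(N, reps=200):
    # reps independent realizations of Z_N
    x = rng.uniform(0.0, 1.0, (reps, N))
    return (x.sum(axis=1) - N * m) / (sigma * np.sqrt(N))

_, p1 = shapiro(z_samples(1))    # Z_1 is just a shifted uniform
_, p20 = shapiro(z_samples(20))  # Z_20 is close to standard normal
print(p1, p20)
```

A vanishingly small p-value for N = 1 rejects normality, mirroring panel (a) of Figure 2.5, while for N = 20 the test typically fails to reject.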
This method examines the joint probability function, P(τ, µ) dτ, that governs when the next reaction occurs and which reaction occurs. Here,
P(\tau, \mu \mid n, t) = a_\mu(n) \exp\left( -\sum_{k=1}^{m} a_k(n) \tau \right)   (2.64)

in which P(τ, µ|n, t) dτ is the probability that the next reaction will occur in the infinitesimal time interval [t + τ, t + τ + dτ) and will be a µ reaction, given that the original state is n at time t. One can then construct numerical algorithms for simulating trajectories obeying the density (2.64). To our knowledge, no one has yet demonstrated the equivalence between the chemical master equation and stochastic simulation. The fact that these two formulas are somehow
equivalent rests solely on the basis that both arise from the “fundamental hypothesis”. This reasoning is tantamount to the logical statement “if A implies B and A implies C, then B implies C and C implies B”. This reasoning is incorrect. Here, we demonstrate that one can derive equations (2.60) and (2.64) from one another.

Figure 2.5: Illustration of the central limit theorem given a uniform distribution over the interval [0, 1]: (a) N = 1 sample and (b) N = 20 samples. Solid line plots the Monte Carlo reconstructed density. Dashed line plots the standard normal distribution.
Theorem 2.3 (Equivalence of the master equation and the next reaction probability density.) Assume that P(N_0, t_0) is known, where

N_0 = \begin{bmatrix} n_0 & n_1 & \ldots \end{bmatrix}   (2.65)
The probability densities generated by the chemical master equation (i.e. equation (2.60)) and the joint density P(τ, µ|n, t) dτ (i.e. equation (2.64)) are identical.
Proof. If these probability densities are indeed equivalent, the evolution equations for these densities must be equivalent. Therefore we can prove this theorem by demonstrating that (1) P (τ, µ|n, t)dτ gives rise to the chemical master equation and (2) the chemical master equation gives rise to P (τ, µ|n, t)dτ.
1. Given P (τ, µ|n, t)dτ, derive the chemical master equation.
We consider propagating the marginal density P(n_j, t) (dropping the conditional argument (N_0, t_0) for convenience) from time t to the future time t + dτ. Noting that the probability of having multiple reactions occur over this time is of order dτ², we have
P(n_j, t + d\tau) = P(n_j, t) \left( 1 - \sum_{k=1}^{m} \lim_{\tau \to 0} P(\tau, k \mid n_j, t)\,d\tau \right)   (2.66)

\qquad + \sum_{k=1}^{m} P(n_j - \nu_k, t) \lim_{\tau \to 0} P(\tau, k \mid n_j - \nu_k, t)\,d\tau + O(d\tau^2)   (2.67)
Manipulating this equation gives rise to the chemical master equation:
\frac{P(n_j, t + d\tau) - P(n_j, t)}{d\tau} = \sum_{k=1}^{m} \left[ -a_k(n_j) P(n_j, t) + P(n_j - \nu_k, t) a_k(n_j - \nu_k) \right] + O(d\tau)   (2.68)

\lim_{d\tau \to 0} \frac{P(n_j, t + d\tau) - P(n_j, t)}{d\tau} = \sum_{k=1}^{m} \left[ -a_k(n_j) P(n_j, t) + P(n_j - \nu_k, t) a_k(n_j - \nu_k) \right]   (2.69)

\frac{dP(n_j, t)}{dt} = \sum_{k=1}^{m} \left[ -a_k(n_j) P(n_j, t) + P(n_j - \nu_k, t) a_k(n_j - \nu_k) \right]   (2.70)
2. Given the chemical master equation, derive P (τ, µ|n, t)dτ. In this case, the master equation (2.60) is known. Given that the system is in state n at time t, we seek to derive the probability that the next reaction will occur at time t+τ and will be reaction µ. This statement is equivalent to specifying
(a) P (n, t) = 1 and
(b) no reactions occur over the interval [t, t + τ).
Accordingly, the master equation reduces to the following form:
\frac{d}{dt'} \begin{bmatrix} P(n, t') \\ P(n + \nu_1, t') \\ \vdots \\ P(n + \nu_m, t') \end{bmatrix} = \begin{bmatrix} -\sum_{k=1}^{m} a_k(n) & 0 & \ldots & 0 \\ a_1(n) & 0 & \ldots & 0 \\ \vdots & \vdots & & \vdots \\ a_m(n) & 0 & \ldots & 0 \end{bmatrix} \begin{bmatrix} P(n, t') \\ P(n + \nu_1, t') \\ \vdots \\ P(n + \nu_m, t') \end{bmatrix}, \quad t \leq t' \leq t + \tau   (2.71)

in which we have now effectively conditioned each P(n, t') on the basis that no reaction occurs over the given interval. Solving for the desired probabilities yields
P(n, t') = \exp\left( -\sum_{k=1}^{m} a_k(n)(t' - t) \right)   (2.72)

P(n + \nu_j, t') = \frac{a_j(n)}{\sum_{k=1}^{m} a_k(n)} \left[ 1 - \exp\left( -\sum_{k=1}^{m} a_k(n)(t' - t) \right) \right], \quad 1 \leq j \leq m   (2.73)
Our strategy now is to first note that P(τ, µ|n, t) dτ consists of the independent probabilities

P(\tau, \mu \mid n, t)\,d\tau = P(\mu \mid n, t) P(\tau \mid n, t)\,d\tau   (2.74)

then solve for these marginal densities as functions of the P(n, t')'s. Conceptually, P(τ|n, t) dτ is the probability that the first reaction occurs in the interval [t + τ, t + τ + dτ). We solve for this quantity by taking advantage of its relationship with P(n, t + τ):
P(\tau \mid n, t)\,d\tau = \sum_{j=1}^{m} \left. \frac{dP(n + \nu_j, t')}{dt'} \right|_{t' = t + \tau} d\tau   (2.75)

= \left. -\frac{dP(n, t')}{dt'} \right|_{t' = t + \tau} d\tau   (2.76)

= \sum_{k=1}^{m} a_k(n) P(n, t + \tau)\,d\tau   (2.77)

= \sum_{k=1}^{m} a_k(n) \exp\left( -\sum_{k=1}^{m} a_k(n) \tau \right) d\tau   (2.78)
As expected, P (τ|n, t)dτ is independent of µ.
Similarly, we express P(µ|n, t) as a function of the P(n + ν_j, t')'s:
P(\mu \mid n, t) = \frac{P(n + \nu_\mu, t')}{\sum_{k=1}^{m} P(n + \nu_k, t')}   (2.79)

= \frac{ \dfrac{a_\mu(n)}{\sum_{k=1}^{m} a_k(n)} \left[ 1 - \exp\left( -\sum_{k=1}^{m} a_k(n)(t' - t) \right) \right] }{ \sum_{j=1}^{m} \dfrac{a_j(n)}{\sum_{k=1}^{m} a_k(n)} \left[ 1 - \exp\left( -\sum_{k=1}^{m} a_k(n)(t' - t) \right) \right] }   (2.80)
= \frac{a_\mu(n)}{\sum_{k=1}^{m} a_k(n)}   (2.81)
As expected, P (µ|n, t) is independent of τ. 22
Combining the two marginal densities, we obtain
P(\tau, \mu \mid n, t)\,d\tau = P(\mu \mid n, t) P(\tau \mid n, t)\,d\tau   (2.82)

= \frac{a_\mu(n)}{\sum_{k=1}^{m} a_k(n)} \sum_{k=1}^{m} a_k(n) \exp\left( -\sum_{k=1}^{m} a_k(n) \tau \right) d\tau   (2.83)

= a_\mu(n) \exp\left( -\sum_{k=1}^{m} a_k(n) \tau \right) d\tau   (2.84)

as claimed.
Theorem 2.4 (Reconstruction of the master equation density from exact simulation.) Assuming conservation of mass and a finite number of reactions, the probability density at a single future time point t reconstructed from Monte Carlo simulations converges to the density governed by the chemical master equation almost surely over the interval [t_0, t]. That is,

P\left\{ \lim_{N \to \infty} P^N(n_i, t \mid N_0, t_0) = P(n_i, t \mid N_0, t_0) \right\} = 1 \quad \forall i = 1, \ldots, n_s   (2.85)
in which
• N is the number of exact Monte Carlo simulations,
• PN (n, t|N 0, t0) is the Monte Carlo reconstruction of the probability density given N exact simulations,
• P (n, t|N 0, t0) is the density governed by the master equation, and
• ns is the total number of possible species.
Proof: We must show that

\[
P\left\{ \psi : \lim_{N \to \infty} n_{i,N}(\psi, t) = n_i(\psi, t) \right\} = 1 \qquad \forall i = 1, \ldots, n_s \tag{2.86}
\]
in which

• N = [n_1 · · · n_{n_s}]^T, and
• ni,N is the Monte Carlo reconstruction of ni given N simulations.
Let ε > 0. We must show that there exists an N such that if m > N,
\[
\left| P\{\psi : n_{i,m}(\psi, t) = n_i(\psi, t)\} - 1 \right| < \epsilon \qquad \forall i = 1, \ldots, n_s \tag{2.87}
\]
The assumption of conservation of mass and a finite number of reactants indicates that n_s is finite. Choose

\[
X_i(\psi, t) = \delta(\psi - n_i, t) \tag{2.88}
\]

in which the random variable ψ is generated by running an exact stochastic simulation until time t. The mean of this random variable is P(n_i, t|N_0, t_0). Theorem 2.3 states that any simulation scheme obeying the next reaction probability density P(τ, µ|n, t) generates exact trajectories from the master equation. Therefore, we can apply the strong law of large numbers, which says that there exists an N_i for each i = 1, …, n_s such that if m > N_i,

\[
\left| P\{\psi : X_{i,m}(\psi, t) = P(n_i, t)\} - 1 \right| \leq \frac{\epsilon}{2} \qquad \forall i = 1, \ldots, n_s \tag{2.89}
\]
Let N = maxi Ni. Then if m > N,
\[
\begin{aligned}
\left| P\{\psi : n_{i,m}(\psi, t) = n_i(\psi, t)\} - 1 \right| &\leq \left| P\{\psi : n_{i,N}(\psi, t) = n_i(\psi, t)\} - 1 \right| \quad \forall i = 1, \ldots, n_s && (2.90) \\
&\leq \frac{\epsilon}{2} \quad \forall i = 1, \ldots, n_s && (2.91) \\
&< \epsilon \quad \forall i = 1, \ldots, n_s && (2.92)
\end{aligned}
\]
Since ε is arbitrary, the proof is complete. In his seminal works, Gillespie proposes two simple and efficient methods for generating exact trajectories obeying the probability function P(τ, µ) [45, 46]. Theorem 2.3 proves that these trajectories obey exactly the chemical master equation (2.60). Gillespie appropriately named these algorithms the direct method and the first reaction method. We summarize these methods in algorithms 1 and 2.
Algorithm 1 Direct Method. Initialize. Set the time, t, equal to zero. Set the number of species n to n0.
1. Calculate:
(a) the reaction rates a_k(n) for k = 1, …, m; and
(b) the total reaction rate, r_tot = \sum_{k=1}^{m} a_k(n).
2. Select two random numbers p_1, p_2 from the uniform distribution (0, 1). Let τ = −log(p_1)/r_tot. Choose j such that

\[
\sum_{k=1}^{j-1} a_k(n) < p_2\, r_{tot} \leq \sum_{k=1}^{j} a_k(n)
\]
3. Let t ← t + τ. Let n ← n + νj. Go to 1.
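The direct method translates almost directly into code. The following is a minimal sketch in Python, not the thesis's implementation; the function name and the reversible dimerization example (the reaction used again in Chapter 3) are illustrative assumptions.

```python
import numpy as np

def gillespie_direct(n0, stoich, propensities, t_final, rng=None):
    """One trajectory of Gillespie's direct method (Algorithm 1).

    n0            initial species counts (length-p array)
    stoich        (m x p) array; row k is the state change when reaction k fires
    propensities  function mapping counts n to the m reaction rates a_k(n)
    """
    rng = np.random.default_rng() if rng is None else rng
    t, n = 0.0, np.array(n0, dtype=np.int64)
    times, states = [t], [n.copy()]
    while True:
        a = propensities(n)
        rtot = a.sum()                       # step 1: rates and total rate
        if rtot <= 0.0:
            break                            # no reaction can fire
        p1, p2 = rng.random(2)               # step 2: two uniform numbers
        tau = -np.log(p1) / rtot             # time to the next reaction
        if t + tau > t_final:
            break
        j = int(np.searchsorted(np.cumsum(a), p2 * rtot))  # which reaction fires
        t += tau                             # step 3: advance time and state
        n += stoich[j]
        times.append(t)
        states.append(n.copy())
    return np.array(times), np.array(states)

# Illustrative system: the reversible dimerization 2A <-> B
nA0 = 20
stoich = np.array([[-2, 1],                  # 2A -> B
                   [ 2, -1]])                # B -> 2A
k1, km1 = 4.0 / (3.0 * nA0), 0.1
prop = lambda n: np.array([0.5 * k1 * n[0] * (n[0] - 1), km1 * n[1]])
times, states = gillespie_direct([nA0, 0], stoich, prop, t_final=10.0)
```

The cumulative-sum search implements the selection of j in step 2; each trajectory conserves the invariant n_A + 2 n_B of this reaction pair.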
Exact algorithms such as the direct method treat microscopic phenomena as discrete, molecular events. For intracellular models, this feature is appealing because of the inherently
Algorithm 2 First Reaction Method. Initialize. Set the time, t, equal to zero. Set the number of species n to n0.
1. Calculate the reaction rates ak(n) for k = 1, . . . , m.
2. Select m random numbers p_1, …, p_m from the uniform distribution (0, 1). Let τ_k = −log(p_k)/a_k(n), k = 1, …, m. Choose j such that j = arg min_k τ_k.
3. Let t ← t + τ_j. Let n ← n + ν_j. Go to 1.

small number of molecules contained within a single cell (the finite number of chromosomes in the nucleus, for example). As models become progressively more complex, however, these algorithms often become computationally expensive. Some recent efforts have focused upon reducing this computational load. He, Zhang, Chen, and Yang employ a deterministic equilibrium assumption on polymerization reaction kinetics [61]. Gibson and Bruck refine the first reaction method, i.e. algorithm 2, to reduce the required number of random numbers, a technique that works best for systems in which some reactions occur much more frequently than others [43]. Rao and Arkin demonstrate how to numerically simulate systems reduced by the quasi-steady-state assumption [113]. This work expands upon ideas by Janssen [69, 70] and Vlad and Pop [157], who first examined the adiabatic elimination of fast relaxing variables in stochastic chemical kinetics. Resat, Wiley, and Dixon address systems with reaction rates varying by several orders of magnitude by applying a probability-weighted Monte Carlo approach, but this method increases error in species fluctuations [126]. Gillespie examines two approximate methods, tau leaping and kα leaping, for accelerating simulations by modeling the selection of "fast" reactions with Poisson distributions [50]. These methods employ explicit, first-order Euler approximations of the next reaction distribution, permitting larger time steps than exact methods by allowing multiple firings of fast reactions per step. In explicit tau leaping, one chooses a fixed time step τ, then increments the state by
\[
n(t + \tau) \approx n(t) + \sum_{k=1}^{m} \nu_k \, \mathcal{P}_k\big(a_k(n(t))\,\tau\big) \tag{2.93}
\]

in which P_k(a_k(n(t))τ) is a Poisson random variable with mean a_k(n(t))τ. In kα leaping, one chooses a particular reaction α to undergo a predetermined number of events k_α, then determines the time τ required for these events to occur by drawing a gamma random variable Γ(a_α(n), k_α). Using this value of τ, one draws Poisson random variables to determine how many events the remaining reactions undergo. A subsequent paper by Gillespie and Petzold discusses the error associated with the tau leaping approximation by using Taylor-series expansion arguments [51]. These conditions specify restrictions on the time increment τ to ensure that the error in the reconstructed mean and variance remains below a user-specified tolerance. However, this error only quantifies the effects of the reaction rate (a_j(n)'s) dependence upon the state n, not the effect of approximating the exact next reaction distribution with a Poisson distribution. Rathinam, Petzold, Cao, and Gillespie later present a first-order implicit version of tau leaping, i.e.
\[
n(t + \tau) \approx n(t) + \sum_{k=1}^{m} \nu_k\, a_k\big(n(t + \tau)\big)\,\tau + \sum_{k=1}^{m} \nu_k \left[ \mathcal{P}_k\big(a_k(n(t))\,\tau\big) - a_k(n(t))\,\tau \right] \tag{2.94}
\]
This method has greater numerical stability than the explicit version [117].
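Both leaping variants are straightforward to implement; the sketch below codes the explicit update (2.93) in Python. It is our own illustration, and the clipping of negative counts at zero is a crude guard of ours, not part of the published algorithm.

```python
import numpy as np

def explicit_tau_leap(n0, stoich, propensities, tau, t_final, rng=None):
    """Explicit tau leaping per equation (2.93): with a fixed step tau, each
    reaction k fires a Poisson(a_k(n(t)) * tau) number of times per step."""
    rng = np.random.default_rng() if rng is None else rng
    n = np.array(n0, dtype=float)
    t = 0.0
    while t < t_final:
        a = propensities(n)
        k = rng.poisson(a * tau)             # firings of each reaction in [t, t+tau)
        n = n + k @ stoich                   # state increment: sum_k nu_k P_k
        n = np.maximum(n, 0.0)               # guard: leaping can overshoot zero
        t += tau
    return n

# Same illustrative dimerization 2A <-> B as before, now with many molecules,
# where exact simulation would be expensive
nA0 = 10000
stoich = np.array([[-2.0, 1.0], [2.0, -1.0]])
k1, km1 = 4.0 / (3.0 * nA0), 0.1
prop = lambda n: np.array([0.5 * k1 * n[0] * max(n[0] - 1.0, 0.0), km1 * n[1]])
n_end = explicit_tau_leap([nA0, 0.0], stoich, prop, tau=0.01, t_final=10.0,
                          rng=np.random.default_rng(7))
```

Because every reaction fires many times per step when counts are large, the cost per unit time is set by the step τ rather than by the number of events.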
2.3.2 Performing Systems Level Tasks with Stochastic Models
Employing kinetic Monte Carlo models for systems level tasks is an area of active research. Raimondeau, Aghalayam, Mhadeshwar, and Vlachos consider sensitivities via finite differences and parameter estimation for kinetic Monte Carlo simulations [105]. Drews, Braatz, and Alkire consider calculating the sensitivity for the mean of multiple Monte Carlo simulations via finite differences, and apply this method to copper electrodeposition to determine which parameter perturbations most significantly affect the measurements [25]. Gallivan and Murray consider model reduction techniques for the chemical master equation [39], then use the reduced models to determine optimal open-loop temperature profiles for epitaxial thin film growth [38]. Lou and Christofides consider control of growth rate and surface roughness in thin film growth [81, 82], employing proportional integral control that uses a kinetic Monte Carlo model to provide information about interactions between outputs and manipulated inputs. This simple form of feedback control does not require an optimization. Laurenzi uses a genetic algorithm to estimate parameters for a model of aggregating blood platelets and neutrophils [78]. Armaou and Kevrekidis employ a coarse time-stepper and a direct stochastic optimization method (Hooke-Jeeves) to determine an optimal control policy for a set of reactions on a catalyst surface [4]. Siettos, Armaou, Makeev, and Kevrekidis use the coarse time stepper to identify the local linearization of the nonlinear stochastic model at a steady state of interest [138]. Given the local linearization of the model, standard linear quadratic control theory is then applied. Armaou, Siettos, and Kevrekidis consider extending this control approach to spatially distributed processes [5]. Finally, Siettos, Maroudas, and Kevrekidis construct bifurcation diagrams for the mean of the stochastic models [139].

2.4 Population Balance Models
Stochastic models of chemical kinetics pose one alternative to traditional deterministic models for modeling intracellular kinetics. Many biological systems of interest, however, consist of populations of cells influencing one another. Here, we consider the dynamic behavior of cell populations undergoing viral infections. Traditionally, mathematical models for viral infections have focused solely on events occurring at either the intracellular or extracellular level. At the intracellular level, kinetic models have been applied to examine the dynamics of how viruses harness host cells to replicate more virus [73, 27, 29, 3], and how drugs targeting specific virus components affect this replication [122, 30]. These models, however, consider only one infection cycle, whereas infections commonly consist of numerous infection cycles. At the extracellular level, researchers have considered how drug therapies affect the dynamics of populations of viruses [164, 62, 98, 13, 100]. These models, though, neglect the fact that these drugs target specific intracellular viral components. To better understand the interplay of intracellular and extracellular events, a different modeling framework is necessary. We propose cell population balances as one such framework. Mathematical models for cell population dynamics may be effectively grouped by two distinctive features: whether or not the model has structure, and whether or not the model has segregations [6]. If a model has structure, then multiple intracellular components affect the dynamics of the cell population. If a model has segregations, then some cellular characteristic can be employed to distinguish among different cells in a population. Table 2.1 summarizes the different combinations of models arising from these features.
In this context, current extra- cellular models are equivalent to unstructured, unsegregated models because the cells in each population (uninfected and infected cells) are assumed indistinguishable from each other.
               Unstructured                           Structured
Unsegregated   Most idealized case: cell population   Multicomponent average cell
               treated as a one-component solute      description
Segregated     Single component, heterogeneous        Multicomponent description of
               individual cells                       cell-to-cell heterogeneity
                                                      (most realistic case)

Table 2.1: Types of cell population models [6]
The derivation of structured, segregated models stems from the equation of continuity. In particular, the derivation is identical to the one presented earlier up to the microscopic equation (2.8), but now considers the effect of various internal segregations upon the population behavior.
Fredrickson, Ramkrishna, and Tsuchiya consider the details of this derivation in their seminal contribution [36]. In recent years, this modeling framework has returned to the literature as researchers strive to adequately reconcile model predictions with the dynamics demonstrated by experimental data [80, 10, 33]. Also, new measurements such as flow cytometry offer the promise of actually differentiating between cells of a given population [1, 67], again implying the need to model distinctions between cells in a given population.
Notation
a_µ(n)       µth reaction rate
c_µ dt       average probability to O(dt) that reaction µ will occur in the next time interval dt
dΩ           differential change in the control surface S(t)
dΩ_e         differential change in the reactor surface S_e
e_k          deviation between the predicted and actual measurement at time t_k
F            total flux of the quantity η(t, z)
f            diffusive contribution to the total flux F
h_µ          number of distinct molecular reactant combinations for reaction µ at a given time
J            Jacobian
m            mean of a probability distribution
N(m, C)      normal distribution with mean m and covariance C
N_0          matrix containing all possible molecular configurations at time t_0
n            vector of the number of molecules for each chemical species
n_i          ith Monte Carlo reconstruction of the vector n
n_e          normal vector pointing from the reactor surface S_e away from the volume V_e
n_s          normal vector pointing from the surface S(t) away from the volume V(t)
n_s          total number of possible species
P            probability
P(m)         random number drawn from the Poisson distribution with mean m
p            random number from the uniform distribution (0, 1)
q            effluent volumetric flow rate
q_f          feed volumetric flow rate
R_η          production rate of the species η
r_tot        sum of reaction rates
S_e          time-varying surface of the reactor volume V_e
S(t)         time-varying surface of the control volume V(t)
s            sensitivity of the state x with respect to the parameters θ
ṡ            first derivative of the sensitivity with respect to time
t            time
t_k          discrete sampling time
V_e          time-varying reactor volume
V(t)         arbitrary, time-varying control volume spanning a space in z
v_k          realization of the variable ξ at time t_k
v_s          velocity vector for the surface S(t)
v_x          x-component of the velocity vector v_z
v_z          velocity vector for material flowing through the volume V(t)
X            random variable
x            external characteristics
x            state
ẋ            first derivative of the state with respect to time
x_k          state at time t_k
Y_k          distribution for the measurement y_k
y            internal characteristics
y_k          measurement at time t_k
Z_N          random variable whose limiting distribution as N → ∞ is the normal distribution
z            internal and external characteristics
Γ            random number drawn from the gamma distribution
δ            Dirac delta function
η(t, z)dz    mass of reactants or products
Θ            distribution for the parameter set θ
θ            parameter set for a given model
µ            one possible reaction in the stochastic kinetics framework
ν            stoichiometric matrix
ξ            N(0, Π)-distributed random variable
Π            covariance matrix for the random variable ξ
σ            standard deviation
τ            time of the next stochastic reaction
φ            objective function
ψ            random variable
Chapter 3
Motivation
The motivation for this work is the current state of stochastic and deterministic methods used to model chemically reacting systems. For example, the rapid growth of biological measurements on the intracellular level (e.g. microarray and proteomic data) will require much more complicated models to adequately assimilate the information these measurements contain. Therefore we seek to improve the current techniques used to evaluate and manipulate stochastic and deterministic models. In this chapter, we examine the current limitations of the existing methods for using stochastic models, traditional deterministic models, and state estimation techniques.
3.1 Current Limitations of Stochastic Models
We see two primary limitations of current methods for handling stochastic models:
1. exact integration methods scale with the number of reaction events, and

2. methods for performing systems level tasks require the use of noisy finite difference techniques.
We illustrate these points next.
3.1.1 Integration Methods
The current options for performing exact simulation of stochastic chemical kinetics are Gillespie's direct and first reaction methods [45, 46], and the next reaction method of Gibson and Bruck [43]. Gibson and Bruck [43] analyze the computational expenditure of these methods, and find that Gillespie's methods at best scale with the number of reaction events, whereas their next reaction method scales with the log of the number of reaction events. To illustrate this point, we consider the simple reaction
\[
2A \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} B, \qquad a(\epsilon) = \frac{1}{2} k_1 n_A (n_A - 1) \tag{3.1}
\]

in which
• k_1 = 4/(3 n_{Ao}) and k_{−1} = 0.1,
• ε is the dimensionless extent of reaction,
• a(ε) is the reaction propensity function,
• nA is the number of A molecules, and
• nAo is the initial number of A molecules.
We consider simulating this system in which there are initially zero B molecules and a variable number of A molecules. For this system, the number of possible reactions scales with nAo. We scale rate constants for reactions with nonlinear rates so that the dimensionless extent of reaction remains constant as the variable nAo changes. Figure 3.1 demonstrates that the computational time for one simulation scales linearly with nAo, as expected.
Figure 3.1: Computational time per simulation as a function of nAo. Line represents the least- squares fit of the data assuming that a simulation with nAo = 0 requires no computational time.
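The scaling in Figure 3.1 is easy to reproduce in spirit: the cost of an exact simulation is proportional to the number of reaction events, which grows roughly linearly with n_Ao once the rate constants are rescaled as above. The sketch below is our own illustration (not the thesis's benchmark code); it counts events rather than wall-clock time to avoid timer noise.

```python
import numpy as np

def count_events(nA0, t_final=5.0, seed=0):
    """Count reaction events for 2A <-> B up to t_final with the direct method;
    the cost of an exact simulation is proportional to this count."""
    rng = np.random.default_rng(seed)
    k1, km1 = 4.0 / (3.0 * nA0), 0.1    # rescaling keeps the extent comparable
    nA, nB = nA0, 0
    t, events = 0.0, 0
    while True:
        a1 = 0.5 * k1 * nA * (nA - 1)
        a2 = km1 * nB
        rtot = a1 + a2
        if rtot <= 0.0:
            break
        t += rng.exponential(1.0 / rtot)
        if t > t_final:
            break
        if rng.random() * rtot < a1:
            nA, nB = nA - 2, nB + 1      # 2A -> B fires
        else:
            nA, nB = nA + 2, nB - 1      # B -> 2A fires
        events += 1
    return events

e_small, e_large = count_events(100), count_events(1000)
```

Increasing the initial number of molecules tenfold increases the event count, and hence the simulation cost, by roughly the same factor.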
The question arises, then, as to the suitability of these methods for simulating intracellular chemistry. As an example, we consider the case of a rapidly growing Escherichia coli (or E. coli) cell. For this circumstance, one E. coli cell contains approximately four molecules of deoxyribonucleic acid (DNA), 1000 molecules of messenger ribonucleic acid (mRNA), and 10^6 proteins [6]. Simulating these conditions with methods that scale with the number of reaction events is clearly acceptable for modeling the DNA and mRNA species, but simulating events at the protein level is not a trivial task. Now consider Figure 3.2, which plots how an intensive variable such as the extent of reaction changes as n_{Ao} increases. This figure demonstrates that, as the number of molecules increases, the extent appears to be converging to a smoothly-varying deterministic trajectory. This simulation exhibits precisely the mathematical result proven by Kurtz: in the thermodynamic limit (n → ∞, V → ∞, n/V = constant), the master equation written for n (number of molecules) collapses to a deterministic equation for c (concentration of molecules) [76]. The appeal of the deterministic equation is that the computational time required for its solution does not scale with the simulated number of molecules. For E. coli, such an approximation may certainly be valid for reactions among proteins, but not for those among DNA. We address this issue further in Chapter 4.
Figure 3.2: Extent of reaction as a function of nAo.
3.1.2 Systems Level Tasks
A secondary issue arising from stochastic models is how to extract information from these models. Currently, most researchers merely integrate these types of models to determine the dynamic behavior of the system given a specific initial condition and inputs. As pointed out previously, this integration is potentially expensive. One recent strategy for obtaining more information from the model involves using finite difference methods to obtain estimates of the model sensitivity [105, 25], then using these sensitivities for parameter estimation and steady-state analysis. For example, we could determine the sensitivity of reaction 3.1 to the forward rate constant k1 by evaluating the central finite difference
\[
s = \frac{\partial n_A}{\partial k_1} \approx \frac{F(k_1 + \delta) - F(k_1 - \delta)}{2\delta} \tag{3.2}
\]

in which
• s is the sensitivity of the state nA with respect to the parameter k1,
• F (x) yields a trajectory from a stochastic model integration given the parameter k1 = x and nAo initial molecules, and
• δ is a perturbation to the parameter k_1.
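The central finite difference (3.2) can be sketched as follows, averaging over many simulations to tame (but not remove) the noise visible in Figure 3.3. The reversible dimerization stand-in for F and all numerical settings are our own illustrative assumptions.

```python
import numpy as np

def mean_nA(k1, nA0=100, t_final=2.0, nruns=200, seed=1):
    """Mean of n_A(t_final) over nruns direct-method runs of 2A <-> B
    (illustrative stand-in for F in equation (3.2))."""
    rng = np.random.default_rng(seed)
    km1 = 0.1
    total = 0
    for _ in range(nruns):
        nA, nB, t = nA0, 0, 0.0
        while True:
            a1, a2 = 0.5 * k1 * nA * (nA - 1), km1 * nB
            rtot = a1 + a2
            if rtot <= 0.0:
                break
            dt = rng.exponential(1.0 / rtot)
            if t + dt > t_final:
                break
            t += dt
            if rng.random() * rtot < a1:
                nA, nB = nA - 2, nB + 1  # 2A -> B fires
            else:
                nA, nB = nA + 2, nB - 1  # B -> 2A fires
        total += nA
    return total / nruns

# Central finite difference, equation (3.2); reusing the same seed for both
# perturbed means partially correlates the simulation noise.
k1_nom = 4.0 / (3.0 * 100)
delta = 0.2 * k1_nom
s = (mean_nA(k1_nom + delta) - mean_nA(k1_nom - delta)) / (2.0 * delta)
# s is negative: a larger forward rate constant consumes more A
```

Even with hundreds of averaged runs, the estimate retains sampling noise, which is precisely the shortcoming discussed above.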
Figure 3.3 plots the perturbed trajectories and the desired sensitivity. At the smaller perturbation of δ = 0.2k_1, the stochastic fluctuations of the simulation dominate, yielding a noisy, poor sensitivity estimate. The larger perturbation of δ = 0.8k_1 yields a smoother sensitivity, but the accuracy of the central finite difference is questionable. There is obviously significant room for improvement in the methods used to calculate this quantity. We consider this issue further in Chapters 5 and 6. Additionally, little work has focused on how best to use information obtained from simulations of stochastic differential equations. Accordingly, we consider sensitivities for these types of models in Chapter 7. Finally, we apply many of the tools developed in these chapters to crystallization systems in Chapter 8.
Figure 3.3: Finite difference sensitivity for the stochastic model: (a) small perturbation (δ = 0.2k_1) and (b) large perturbation (δ = 0.8k_1).

3.2 Current Limitations of Traditional Deterministic Models
We restrict this examination to modeling of viral infections, although the same arguments generally hold for virtually all systems involving populations of cells. Figure 3.4 generalizes the cyclic nature of viral infections. The initiation of a viral infection occurs when the virus is introduced to a host organism. The virus then targets specific uninfected host cells for infection. Once infected, these host cells become in essence "factories" that replicate and secrete the virus. The cycle of infection and virus production then continues. During this infection cycle, uninfected cells may continue to reproduce. This cycle is essentially the one proposed by Nowak and May [98].
Figure 3.4: Cyclic nature of viral infections.
These types of models usually assume that the production rate of virus is directly proportional to the concentration of infected cells. This assumption generally permits reduction of the model to a coupled set of ordinary differential equations (e.g. three ODE's to model the uninfected cell population, the infected cell population, and the virus population). This assumption is a gross simplification; in fact, many modelers have focused entirely on considering the complex chemistry required at the intracellular level to produce viral progeny [73, 27, 29, 3]. A more realistic picture of viral infections consists of a combination of the intracellular and extracellular levels. As described in Chapter 2, cell population balance models offer one means of combining these two levels. Since the literature review uncovered little active research in this area, we seek to explore the utility of the cell population balance in explaining biological phenomena. We believe that refined versions of these models may lead to insights on how to best control viral propagation. We first explore the utility of the cell population balance in a numerical setting in Chapter 9, then investigate whether or not these types of models are useful in explaining actual experimental data in Chapter 10. Finally, we introduce an approximation that significantly reduces the computational expense of solving this class of models in Chapter 11.

3.3 Current Limitations of State Estimation Techniques
It is well established that the Kalman filter is the optimal state estimator for unconstrained, linear systems subject to normally distributed state and measurement noise. Many physical systems, however, exhibit nonlinear dynamics and have states subject to hard constraints, such as nonnegative concentrations or pressures. Hence Kalman filtering is no longer directly applicable. Perhaps the most popular method for estimating the state of nonlinear systems is the extended Kalman filter, which first linearizes the nonlinear system, then applies the Kalman filter update equations to the linearized system [144]. The extended Kalman filter assumes that the a posteriori distribution is normally distributed (unimodal), hence the mean and the mode of the distribution are equivalent. Questions that arise are: how does this strategy perform when multiple modes arise in the a posteriori distribution? Also, are multiple modes even a concern for chemically reacting systems? Finally, can multiple modes in the estimator hinder closed-loop performance? We address the first two of these questions in Chapter 12, and the final question in Chapter 13.
Notation
a(ε)   reaction propensity function
c      concentrations for all reaction species
k_j    rate constant for reaction j
n      number of molecules for all reaction species
n_A    number of molecules for species A
s      sensitivity
δ      finite difference perturbation
ε      extent of reaction
Chapter 4
Approximations for Stochastic Reaction Models 1
Exact methods are available for the simulation of isothermal, well-mixed stochastic chemical kinetics. As increasingly complex physical systems are modeled, however, these methods become difficult to solve because the computational burden scales with the number of reaction events [43]. We address one aspect of this problem: the case in which reacting species fluctuate by different orders of magnitude. We expand upon the idea of a partitioned system [113, 157] and simulation via Gillespie's direct method [45, 46] to construct approximations that reduce the computational burden for simulation of these species. In particular, we partition the system into subsets of "fast" and "slow" reactions. We make various approximations for the "fast" reactions (either invoking an equilibrium approximation, or treating them deterministically or as Langevin equations), and treat the "slow" reactions as stochastic events. Such approximations can significantly reduce computational load while accurately reconstructing at least the first two moments of the probability distribution for each species.

This chapter provides a theoretical background for such approximations and outlines strategies for computing these approximations. First, we examine the theoretical underpinnings of the approximations. Next, we propose numerical algorithms for performing the simulations, review several practical implementation issues, and propose a further approximation. We then consider three motivating examples drawn from the fields of enzyme kinetics, particle technology, and biotechnology that illustrate the accuracy and computational efficiency of these approximations. Finally, we critically examine the technique and present conclusions.
4.1 Stochastic Partitioning
The key ideas are to 1) model the state of the reaction system using extents of reaction as opposed to molecules of species, and 2) partition the state into subsets of "fast" and "slow" reactions. With these two modeling choices, we can exploit the structure of the chemical master equation, the governing equation for the evolution of the system probability density, by
1Portions of this chapter appear in Haseltine and Rawlings [57].

making order of magnitude arguments. We then derive the master equations that govern the "fast" and "slow" reaction subsets. This section outlines these manipulations in greater detail. We model the state of the system, x, using an extent for each irreversible reaction 2. An extent of reaction model is consistent with a molecule balance model since
\[
n = n_0 + \nu^T x \tag{4.1}
\]

in which, assuming that there are m extents of reaction and p chemical species:
• x is the state of the system in terms of extents (an m-vector),
• n is the number of molecules (a p-vector),
• n0 is the initial number of molecules (a p-vector), and
• ν is the stoichiometric matrix (an m × p-matrix).
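Equation (4.1) is just an affine change of coordinates from extents to molecule counts, as the small numerical check below illustrates. The two-reaction system is a hypothetical example of ours.

```python
import numpy as np

# Hypothetical system with m = 2 irreversible reactions and p = 2 species
# (A, B): reaction 1 is 2A -> B, reaction 2 is B -> 2A.
nu = np.array([[-2, 1],        # row k: species changes per firing of reaction k
               [ 2, -1]])
n0 = np.array([100, 0])        # initial molecules of (A, B)
x = np.array([10, 3])          # extents: reaction 1 fired 10 times, reaction 2 fired 3

n = n0 + nu.T @ x              # equation (4.1): n = n0 + nu^T x
# n = [100 - 20 + 6, 0 + 10 - 3] = [86, 7]
```

Note that the mapping from extents to molecule numbers is many-to-one in general, which is one reason the extent description carries more information about the reaction history.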
The upper and lower bounds of x are constrained by the limiting reactant species. We arbitrarily set the initial condition to the origin. Given assumptions outlined by Gillespie [48], the governing equation for this system is the chemical master equation
\[
\frac{dP(x; t)}{dt} = \sum_{k=1}^{m} a_k(x - I_k) P(x - I_k; t) - a_k(x) P(x; t) \tag{4.2}
\]

in which
• P (x; t) is the probability that the system is in state x at time t,
• ak(x)dt is the probability to order dt that reaction k occurs in the time interval [t, t + dt), and
• I_k is the kth column of the (m × m)-identity matrix I.
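When the extent space is small enough to enumerate, equation (4.2) is simply a finite system of linear ODEs and can be integrated directly. The sketch below is our own illustration (not a method developed in this chapter), using a single irreversible reaction A → B so the extent x runs from 0 to N; the mean extent can then be checked against the known first-order result N(1 − e^{−kt}).

```python
import numpy as np

# Build the master-equation generator for A -> B with propensity a(x) = k (N - x),
# where x is the extent of reaction (number of firings so far).
N, k = 20, 0.5
A = np.zeros((N + 1, N + 1))
for x in range(N + 1):
    a = k * (N - x)
    A[x, x] -= a                    # probability leaving extent x
    if x + 1 <= N:
        A[x + 1, x] += a            # probability entering extent x + 1

P = np.zeros(N + 1)
P[0] = 1.0                          # all probability starts at zero extent
t_final, dt = 2.0, 1e-4
for _ in range(int(t_final / dt)):
    P = P + dt * (A @ P)            # explicit Euler step of dP/dt = A P

mean_extent = float(np.arange(N + 1) @ P)
# for this first-order example the exact mean extent is N (1 - exp(-k t))
```

The columns of the generator sum to zero, so total probability is conserved by construction; the cost of this direct approach, of course, explodes combinatorially as species and reactions are added, which is what motivates the approximations of this chapter.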
The structure of I arises for this particular chemical master equation because the reactions are irreversible. Also, we have implicitly conditioned the master equation (4.2) on a specific initial condition, i.e. n_0. Generalizing the analysis presented in this chapter to a distribution of initial conditions (n_{0,1}, …, n_{0,n}) is straightforward due to the relation

\[
P(x \,|\, n_{0,1}, \ldots, n_{0,n}; t) = \sum_{j} P(x \,|\, n_{0,j}; t)\, P(n_{0,j}) \tag{4.3}
\]

and the fact that the values of P(n_{0,j}) are specified in the initial condition. Now we examine the time scale over which the extents of reaction change. We must first determine a relevant time scale so that we can partition the extents into two subsets: those that have small propensity functions (a_k(x)'s) and occur few if any times over the time scale, and those that have large propensity functions and occur numerous times over the given time scale. We designate these subsets of x as the (m − l)-vector y and the l-vector z, respectively. Note that

\[
x = \begin{bmatrix} y \\ z \end{bmatrix} \qquad \text{and} \qquad I = \begin{bmatrix} I^y & 0 \\ 0 & I^z \end{bmatrix} \tag{4.4}
\]

in which I^y and I^z are (m − l × m − l)- and (l × l)-identity matrices, respectively. We also partition the reaction propensities into groups of fast (c_k) and slow (b_j):

\[
\begin{bmatrix}
a_1(y, z; t) \\ \vdots \\ a_{m-l}(y, z; t) \\ a_{m-l+1}(y, z; t) \\ \vdots \\ a_m(y, z; t)
\end{bmatrix}
=
\begin{bmatrix}
b_1(y, z; t) \\ \vdots \\ b_{m-l}(y, z; t) \\ c_1(y, z; t) \\ \vdots \\ c_l(y, z; t)
\end{bmatrix}
\tag{4.5}
\]

2Note that reversible reactions can be modeled as two irreversible reactions.
Equation (4.2) becomes
\[
\begin{aligned}
\frac{dP(y, z; t)}{dt} = &\sum_{j=1}^{m-l} b_j(y - I_j^y, z)\, P(y - I_j^y, z; t) - b_j(y, z)\, P(y, z; t) \\
+ &\sum_{k=1}^{l} c_k(y, z - I_k^z)\, P(y, z - I_k^z; t) - c_k(y, z)\, P(y, z; t)
\end{aligned}
\tag{4.6}
\]
Ultimately, we are interested in determining an approximate governing equation for the evolution of the joint density, P(y, z; t), in regimes where fast reaction extents are much greater than slow reaction extents. Denoting the total extent space as X, we define a subspace X_p ⊂ X for which

\[
c_k(y, z) \gg b_j(y, z) \qquad \forall\, 1 \leq k \leq l, \; 1 \leq j \leq m - l, \; \begin{bmatrix} y \\ z \end{bmatrix} \in X_p \tag{4.7}
\]
By defining the conditional and marginal probabilities over this subspace as

\[
P(y, z; t) = P(z | y; t)\, P(y; t) \qquad \forall \begin{bmatrix} y \\ z \end{bmatrix} \in X_p \tag{4.8}
\]
\[
P(y; t) = \sum_{z} P(y, z; t) \qquad \forall \begin{bmatrix} y \\ z \end{bmatrix} \in X_p \tag{4.9}
\]

we can alternatively derive evolution equations for both the marginal probability of the slow reactions, P(y; t), and the probability of the fast reactions conditioned on the slow reactions, P(z|y; t). Consequently, we then know how the fast and slow reactions evolve over this time scale. Also, this partitioning is similar to that used by Rao and Arkin [113], who partition the master equation by species to treat the quasi-steady-state assumption. We partition by reaction extents to treat fast and slow reactions.
All the manipulations performed in the next two subsections apply only for fast and slow reactions in the partitioned subspace X_p. To simplify the presentation of the results, we drop the implied notation

\[
\forall \begin{bmatrix} y \\ z \end{bmatrix} \in X_p
\]

from all subsequent equations.
4.1.1 Slow Reaction Subset
We first address the subset of slow reaction extents y. From the definition of the marginal density,

\[
P(y; t) = \sum_{z} P(y, z; t) \tag{4.10}
\]

Differentiating equation (4.10) with respect to time yields
\[
\frac{dP(y; t)}{dt} = \sum_{z} \frac{dP(y, z; t)}{dt} \tag{4.11}
\]

Now substitute the master equation (4.6) into equation (4.11) and manipulate to yield
\[
\begin{aligned}
\frac{dP(y; t)}{dt} &= \sum_{z} \Bigg( \sum_{j=1}^{m-l} b_j(y - I_j^y, z)\, P(y - I_j^y, z; t) - b_j(y, z)\, P(y, z; t) \\
&\qquad\quad + \sum_{k=1}^{l} c_k(y, z - I_k^z)\, P(y, z - I_k^z; t) - c_k(y, z)\, P(y, z; t) \Bigg) && (4.12) \\
&= \sum_{z} \sum_{j=1}^{m-l} b_j(y - I_j^y, z)\, P(y - I_j^y, z; t) - b_j(y, z)\, P(y, z; t) \\
&\qquad\quad + \underbrace{\sum_{z} \sum_{k=1}^{l} c_k(y, z - I_k^z)\, P(y, z - I_k^z; t) - c_k(y, z)\, P(y, z; t)}_{0} && (4.13) \\
&= \sum_{z} \sum_{j=1}^{m-l} b_j(y - I_j^y, z)\, P(y - I_j^y, z; t) - b_j(y, z)\, P(y, z; t) && (4.14)
\end{aligned}
\]
Equation (4.14) is exact; we have made no approximations in its derivation. Also, if we rewrite the joint density in terms of the conditional density using the definition
\[
P(y, z; t) = P(z | y; t)\, P(y; t) \tag{4.15}
\]

then one interpretation of this analysis is that the evolution of the marginal P(y; t) depends on the conditional density P(z|y; t). We consider deriving an evolution equation for this conditional density next.

4.1.2 Fast Reaction Subset
We now address the evolution of the probability density for the subset of fast reactions conditioned on the subset of slow reactions, P(z|y; t). For our starting point, we use order of magnitude arguments, i.e. equation (4.7), to approximate the original master equation (4.6) as
\[
\frac{dP(y, z; t)}{dt} \approx \sum_{k=1}^{l} c_k(y, z - I_k^z)\, P(y, z - I_k^z; t) - c_k(y, z)\, P(y, z; t) \tag{4.16}
\]
We define this approximate joint density as PA(y, z; t), and thus its evolution equation is
\[
\frac{dP_A(y, z; t)}{dt} \triangleq \sum_{k=1}^{l} c_k(y, z - I_k^z)\, P_A(y, z - I_k^z; t) - c_k(y, z)\, P_A(y, z; t) \tag{4.17}
\]
Following Rao and Arkin [113], we define the joint density PA(y, z; t) as the product of the desired conditional density PA(z|y; t) and the marginal density PA(y; t):
$$P_A(y, z; t) = P_A(z|y; t) P_A(y; t) \qquad (4.18)$$
Differentiating equation (4.18) with respect to time yields
$$\frac{dP_A(y, z; t)}{dt} = \frac{dP_A(z|y; t)}{dt} P_A(y; t) + \frac{dP_A(y; t)}{dt} P_A(z|y; t) \qquad (4.19)$$
Solving equation (4.19) for the desired conditional derivative yields
$$\frac{dP_A(z|y; t)}{dt} = \frac{1}{P_A(y; t)} \left( \frac{dP_A(y, z; t)}{dt} - \frac{dP_A(y; t)}{dt} P_A(z|y; t) \right) \qquad (4.20)$$
Evaluating the marginal evolution equation by summing equation (4.17) over the fast extents z yields
$$\frac{dP_A(y; t)}{dt} = \sum_z \sum_{k=1}^{l} \left( c_k(y, z - I_k^z) P_A(y, z - I_k^z; t) - c_k(y, z) P_A(y, z; t) \right) \qquad (4.21)$$
$$\phantom{\frac{dP_A(y; t)}{dt}} = 0 \qquad (4.22)$$
Consequently, equation (4.20) reduces to
$$\frac{dP_A(z|y; t)}{dt} = \frac{1}{P_A(y; t)} \sum_{k=1}^{l} \left( c_k(y, z - I_k^z) P_A(y, z - I_k^z; t) - c_k(y, z) P_A(y, z; t) \right) \qquad (4.23)$$
$$\frac{dP_A(z|y; t)}{dt} = \sum_{k=1}^{l} \left( c_k(y, z - I_k^z) P_A(z - I_k^z|y; t) - c_k(y, z) P_A(z|y; t) \right) \qquad (4.24)$$
which is the desired closed-form evolution equation for the conditional density $P_A(z|y; t)$.

4.1.3 The Combined System
For the slow reactions, we approximate the joint density P (y, z; t) as
$$P(y, z; t) \approx P_A(z|y; t) P(y; t) \qquad (4.25)$$
Combining the evolution equations for the slow and fast reaction extents, i.e. equations (4.14) and (4.24) respectively, then yields the following coupled master equations
$$\frac{dP(y; t)}{dt} \approx \sum_{j=1}^{m-l} \left[ \left( \sum_z b_j(y - I_j^y, z) P_A(z|y - I_j^y; t) \right) P(y - I_j^y; t) - \left( \sum_z b_j(y, z) P_A(z|y; t) \right) P(y; t) \right] \qquad (4.26a)$$
$$\frac{dP_A(z|y; t)}{dt} = \sum_{k=1}^{l} \left( c_k(y, z - I_k^z) P_A(z - I_k^z|y; t) - c_k(y, z) P_A(z|y; t) \right) \qquad (4.26b)$$
From these equations, using order-of-magnitude arguments to produce a time-scale separation has clearly had two effects: first, the coupled expressions for the marginal and conditional evolution equations in (4.26) are Markov in nature; and second, the evolution equation for the fast extents conditioned on the slow extents, $P_A(z|y)$, has decoupled from the slow-extent marginal, $P(y)$. Exact solution of the coupled master equations (4.26) is nonetheless at least as difficult as that of the original master equation (4.2), because one must solve an individual master equation of the form of equation (4.26b) for every value of y appearing in the slow marginal equation (4.26a). From a simulation perspective, equation (4.26) is also as difficult to evaluate as the original master equation (4.2) since both of the coupled master equations are discrete and time-varying. However, approximating the fast extents can significantly reduce the computational expense of simulating these coupled equations. Different approximations apply depending on the characteristic relaxation times of the fast and slow extents. Next, we investigate two such approximations: an equilibrium approximation for the case in which the fast extents relax significantly faster than the slow extents, and a Langevin or deterministic approximation for the case in which both fast and slow extents relax at similar rates.
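For reference, the exact-simulation baseline that these approximations aim to accelerate is Gillespie's direct method applied to the unpartitioned master equation. A minimal Python sketch follows; the function and variable names are illustrative, not from this thesis.

```python
import math
import random

def gillespie_direct(x0, stoich, propensities, t_end, seed=None):
    """Gillespie's direct method for the unpartitioned master equation.

    x0           -- initial species counts (list)
    stoich       -- list of state-change vectors, one per reaction
    propensities -- function mapping the state x to the list of a_j(x)
    """
    rng = random.Random(seed)
    x, t = list(x0), 0.0
    while t < t_end:
        a = propensities(x)
        a_tot = sum(a)
        if a_tot == 0.0:
            break                                # no reaction can fire
        t += -math.log(rng.random()) / a_tot     # time to the next reaction
        target, cum, mu = rng.random() * a_tot, 0.0, len(a) - 1
        for j, aj in enumerate(a):               # select which reaction fires
            cum += aj
            if target <= cum:
                mu = j
                break
        x = [xi + s for xi, s in zip(x, stoich[mu])]
    return x
```

For a network such as example (4.27) below, `stoich` would be `[[-1, 1, 0], [1, -1, 0], [0, -1, 1]]` over the counts (nA, nB, nC). Every reaction firing requires drawing two random numbers and recomputing all propensities, which is exactly the cost the partitioned approximations avoid for the fast reactions.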
4.1.4 The Equilibrium Approximation
We first consider the case in which the relaxation time for the fast extents is significantly smaller than the expected time to the first slow reaction. To illustrate this case, we consider the simple example
$$A \underset{k_2}{\overset{k_1}{\rightleftharpoons}} B \overset{k_3}{\longrightarrow} C \qquad (4.27)$$
We denote the extents of reaction for this example as extents 1, 2, and 3, and define the reaction propensities as
$$a_1(x) = k_1 n_A \qquad (4.28a)$$
$$a_2(x) = k_2 n_B \qquad (4.28b)$$
$$a_3(x) = k_3 n_B \qquad (4.28c)$$
If $k_1, k_2 \gg k_3$, then we can partition extents 1 and 2 as the fast extents z, and extent 3 as the slow extent y. Additionally, we would expect the fast extents of reaction to equilibrate (relax) before the expected time to the first slow reaction. Returning to the master equation formalism, this equilibration implies that we should approximate the fast reactions, equation (4.26b), as
$$0 \approx \sum_{k=1}^{l} \left( c_k(y, z - I_k^z) P_A(z - I_k^z|y; t) - c_k(y, z) P_A(z|y; t) \right) \qquad (4.29)$$
The resulting coupled master equations are
$$\frac{dP(y; t)}{dt} \approx \sum_{j=1}^{m-l} \left[ \left( \sum_z b_j(y - I_j^y, z) P_A(z|y - I_j^y; t) \right) P(y - I_j^y; t) - \left( \sum_z b_j(y, z) P_A(z|y; t) \right) P(y; t) \right] \qquad (4.30a)$$
$$0 = \sum_{k=1}^{l} \left( c_k(y, z - I_k^z) P_A(z - I_k^z|y; t) - c_k(y, z) P_A(z|y; t) \right) \qquad (4.30b)$$
This coupled system, equation (4.30), is markedly similar to the governing equations for the slow-scale simulation recently proposed by Cao, Gillespie, and Petzold [16]. Their derivation deviates from ours, however, and the differences deserve some attention. First, Cao, Gillespie, and Petzold [16] partition on the basis of fast and slow species rather than extents, with fast species affected by at least one fast reaction and slow species affected by solely slow reactions. We have chosen to remain in the extent space because it is the extents, not the chemical species, that equilibrate. Also, Cao, Gillespie, and Petzold [16] use the construct of a virtual fast system to arrive at an evolution equation for the slow species (similar to our evolution equation for the slow extent marginal, equation (4.14)), a choice that obviates the need for defining an evolution equation for the conditional density $P(z|y)$. In contrast, we believe that our approach has a much tighter connection to the original master equation because we derived the coupled system, equation (4.30), directly from the original master equation, and because we can obtain an approximate value of the joint density $P(y, z; t)$ through equation (4.25). Also, all approximations arise directly from order-of-magnitude and relaxation-time arguments.
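To make the equilibrium approximation concrete for example (4.27): with the slow extent frozen, the fast pair A ⇌ B conserves $n_A + n_B = N$, so equation (4.30b) reduces to finding the stationary distribution of a one-dimensional birth-death chain in $n_B$, which is binomial with success probability $k_1/(k_1 + k_2)$. The averaged slow propensity is then $\sum_z b_3(y, z) P_A(z|y) = k_3 \mathrm{E}[n_B]$ (taking the B → C propensity as $k_3 n_B$). The following numerical check is illustrative code, not from the thesis; it assumes numpy is available.

```python
import math
import numpy as np

def fast_equilibrium(N, k1, k2):
    """Stationary conditional density P_A(n_B | y) for the fast pair A <-> B,
    with n_A + n_B = N held fixed while the slow extent is frozen."""
    Q = np.zeros((N + 1, N + 1))            # generator over n_B = 0..N
    for nB in range(N + 1):
        if nB < N:
            Q[nB, nB + 1] = k1 * (N - nB)   # A -> B, propensity k1 * n_A
        if nB > 0:
            Q[nB, nB - 1] = k2 * nB         # B -> A, propensity k2 * n_B
        Q[nB, nB] = -Q[nB].sum()
    # solve pi Q = 0 with sum(pi) = 1 via a stacked least-squares system
    A = np.vstack([Q.T, np.ones(N + 1)])
    b = np.zeros(N + 2); b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

N, k1, k2, k3 = 20, 5.0, 3.0, 0.1
pi = fast_equilibrium(N, k1, k2)
p = k1 / (k1 + k2)
binom = np.array([math.comb(N, n) * p**n * (1 - p)**(N - n) for n in range(N + 1)])
b3_bar = k3 * sum(n * pi[n] for n in range(N + 1))   # averaged slow propensity
```

The check `pi ≈ binom` confirms the detailed-balance result, and `b3_bar` is the effective slow propensity that drives equation (4.30a) between fast-subset re-equilibrations.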
4.1.5 The Langevin and Deterministic Approximations
We now consider the case in which both fast and slow extents relax on similar time scales. Revisiting the reaction example (4.27), we consider the case in which $k_1 \gg k_2, k_3$ and $n_{A0} \gg n_{B0}, n_{C0}$, in which the notation $n_{A0}$ refers to the initial number of A molecules. For this example, we partition extent 1 as the fast extent z, and extents 2 and 3 as the slow extents y. Until a significant amount of A has been consumed, we would expect numerous firings of extent 1 interspersed with relatively few firings of extents 2 and 3. Clearly the system never equilibrates; rather, fast and slow reactions fire until the fast extent reaches a similar order of magnitude as one of the slow extents. Note also that, in contrast to the equilibrium approximation, we have introduced the number of molecules into the time-scale argument. For most cases, we expect this time-scale argument to involve large numbers of reacting molecules, but such involvement is not always the case, as demonstrated in the viral infection example presented later in this chapter. Rather, we require that the magnitude of the fast reactions remain large relative to the magnitude of the slow reactions through the expected time of the first slow reaction. Returning to the master equation formalism, this process requires a different approximation for the conditional density $P(z|y)$. We proceed by demonstrating, as outlined by Gardiner [41], how this subset can be approximated using the Langevin approximation. Define the characteristic size of the system to be Ω, and use this size to recast the master equation (4.24) in terms of intensive variables (let z ← z/Ω). Performing a Kramers-Moyal expansion on this master equation results in a system size expansion in Ω. In the limit as z and Ω become large, the discrete master equation (4.26b) can be approximated by its first two differential moments with the continuous Fokker-Planck equation
$$\frac{\partial P_A(z|y; t)}{\partial t} = -\sum_{i=1}^{l} \frac{\partial}{\partial z_i} \left( A_i(y, z) P_A(z|y; t) \right) + \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} \frac{\partial^2}{\partial z_i \partial z_j} \left( \left[ B(y, z)^2 \right]_{ij} P_A(z|y; t) \right) \qquad (4.31)$$
in which (noting that z consists of extents of reaction):
$$A(y, z) = \sum_{i=1}^{l} I_i^z c_i(y, z) = \begin{bmatrix} c_1(y, z) & c_2(y, z) & \cdots & c_l(y, z) \end{bmatrix}^T \qquad (4.32)$$
$$\left[ B(y, z) \right]^2 = \sum_{i=1}^{l} I_i^z (I_i^z)^T c_i(y, z) = \mathrm{diag}\left( c_1(y, z), c_2(y, z), \ldots, c_l(y, z) \right) \qquad (4.33)$$
Here, diag(a, ..., z) defines a matrix with elements a, ..., z on the diagonal. Equation (4.31) has an Itô solution of the form
$$dz_i = A_i(y, z)\,dt + \sum_{j=1}^{l} B_{ij}(y, z)\,dW_j \quad \forall\, 1 \le i \le l \qquad (4.34a)$$
$$dz_i = c_i(y, z)\,dt + \sqrt{c_i(y, z)}\,dW_i \quad \forall\, 1 \le i \le l \qquad (4.34b)$$
in which W is a vector of Wiener processes. Equation (4.34) is the chemical Langevin equation, whose formulation was recently readdressed by Gillespie [49]. Note the difference between equations (4.31) and (4.34): the Fokker-Planck equation (4.31) specifies the distribution of the stochastic process, whereas the stochastic differential equation (4.34) specifies how the trajectories of the state evolve. Also, bear in mind that whether or not a given Ω is large enough to permit truncation of the system size expansion is relative. In this case, Ω is of sufficient magnitude to make this approximation valid for only a subset of the reactions, not the entire system. Combining the evolution equations for the slow and fast reaction extents, i.e. equations (4.26a) and (4.31) respectively, the problem of interest is the coupled set of master equations
$$\frac{dP(y; t)}{dt} \approx \sum_{k=1}^{m-l} \left[ \left( \int_z b_k(y - I_k^y, z') P_A(z'|y - I_k^y; t)\,dz' \right) P(y - I_k^y; t) - \left( \int_z b_k(y, z') P_A(z'|y; t)\,dz' \right) P(y; t) \right] \qquad (4.35a)$$
$$\frac{\partial P_A(z|y; t)}{\partial t} = -\sum_{i=1}^{l} \frac{\partial}{\partial z_i} \left( A_i(y, z) P_A(z|y; t) \right) + \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} \frac{\partial^2}{\partial z_i \partial z_j} \left( \left[ B(y, z)^2 \right]_{ij} P_A(z|y; t) \right) \qquad (4.35b)$$
If we can solve these equations simultaneously, then we in fact have an approximate solution to the original master equation (4.6) due to the definition of the conditional density given by equation (4.25). Note that the solution is approximate because we have used the Fokker-Planck approximation for the master equation of the fast reactions. In the thermodynamic limit ($z \to \infty$, $\Omega \to \infty$, with $z/\Omega$ finite), the intensive variables for the fast subset of reactions (the z's) evolve deterministically [76]. Accordingly, we propose further approximating the Langevin equation (4.34) as
$$dz_i = c_i(y, z)\,dt \quad \forall\, 1 \le i \le l \qquad (4.36)$$
In this case, the coupled master equations (4.35) reduce to
$$\frac{dP(y; t)}{dt} \approx \sum_{k=1}^{m-l} \left( b_k(y - I_k^y, z(t)) P(y - I_k^y; t) - b_k(y, z(t)) P(y; t) \right) \qquad (4.37a)$$
$$dz_i = c_i(y, z)\,dt \quad \forall\, 1 \le i \le l \qquad (4.37b)$$
in which z(t) is the solution to the differential equation (4.36). The benefit of this assumption is that equation (4.36) can be solved rigorously using an ODE solver. Unfortunately for physical systems, the thermodynamic limit is unattainable. However, knowledge of the modeled system can lead to this simplification. If the magnitude of the fluctuations in this term is small compared to the sensitivity of $c_i(y, z)$ to the subset y, then equation (4.36) is a valid approximation. This approximation is also valid if one is primarily concerned with the fluctuations in the small-numbered species as opposed to the large-numbered species, assuming that the extents approximated by equation (4.36) predominantly affect the population size of large-numbered species.

4.2 Numerical Implementation of the Approximations
We now outline procedures for implementing the equilibrium, Langevin, and deterministic approximations presented in the previous section. We propose using simulation to reconstruct moments of the underlying master equation. For the slow reactions, Gillespie [47] outlines a general method for exact stochastic simulation that is applicable to the desired problem, equation (4.26a). This method examines the joint probability function, $P(\tau, \mu)$, that governs when the next reaction occurs and which reaction occurs. We present a brief derivation of this function.

We proceed by noting that the key probabilistic questions are: when will the next reaction occur, and which reaction will it be [45]? To this end, we define
$$b_\mu(y, z; t)\,dt = \begin{cases} \displaystyle\sum_z b_\mu(y, z) P_A(z|y; t)\,dt & \text{equilibrium approximation} \\ \displaystyle\int_z b_\mu(y, z') P_A(z'|y; t)\,dz'\,dt & \text{Langevin or deterministic approximation} \end{cases} \qquad (4.38)$$
in which $b_\mu(y, z; t)\,dt$ is the probability (first order in dt) that reaction µ occurs in the next time interval dt. We express the joint probability $P(\tau, \mu)\,d\tau$ as the product of the independent probabilities
$$P(\tau, \mu)\,d\tau = P_0(\tau) P(\mu)\,d\tau \qquad (4.39)$$
in which
• P0(τ) is the probability that no reaction occurs within [t, t + τ), and
• P (µ)dτ is the probability that reaction µ takes place within [t + τ, t + τ + dτ).
To determine P0(τ), consider the change in this probability over the differential incre- ment in time dt, assuming that probabilities are independent over disjoint periods of time [68]:
$$P_0(\tau + dt) = P_0(\tau) \left( 1 - \sum_{j=1}^{m-l} b_j(y, z; t + \tau)\,dt \right) \qquad (4.40a)$$
$$P_0(\tau + dt) = P_0(\tau) \left( 1 - r_{tot}^y(t + \tau)\,dt \right) \qquad (4.40b)$$
Here, $r_{tot}^y(t)$ is the sum of the reaction rates for subset y at time t. Rearranging equation (4.40a) and taking the limit as $dt \to 0$ yields the differential equation
$$\frac{dP_0(\tau)}{d\tau} = -r_{tot}^y(t + \tau) P_0(\tau) \qquad (4.41)$$
which has solution
$$P_0(\tau) = \exp\left( -\int_t^{t+\tau} r_{tot}^y(t')\,dt' \right) \qquad (4.42)$$
The joint probability function $P(\tau, \mu)$ is therefore:
$$P(\tau, \mu) = b_\mu(y, z; t + \tau) \exp\left( -\int_t^{t+\tau} r_{tot}^y(t')\,dt' \right) \qquad (4.43)$$
We now address our key questions by conditioning the joint probability function P (τ, µ):
$$P(\tau, \mu) = P(\mu|\tau) P(\tau) \qquad (4.44)$$
in which $P(\tau)$ is the probability that a reaction occurs in the differential instant after time $t + \tau$, and $P(\mu|\tau)$ is the probability that this reaction is µ. First note that by definition:
$$P(\tau) = \sum_{\mu=1}^{m-l} P(\tau, \mu) \qquad (4.45)$$
Implicit in this equation is the assumption that a reaction occurs, and hence the probability of no reaction is zero. Then by rearranging equation (4.44) and incorporating equation (4.45), it can be deduced that:
$$P(\mu|\tau) = \frac{P(\tau, \mu)}{\sum_{\mu'=1}^{m-l} P(\tau, \mu')} \qquad (4.46)$$
Equation (4.46) can be solved exactly by employing equation (4.43) to yield:
$$P(\mu|\tau) = \frac{b_\mu(y, z; t + \tau)}{\sum_{j=1}^{m-l} b_j(y, z; t + \tau)} \qquad (4.47)$$
We then solve equation (4.45) by employing equation (4.43):
$$P(\tau) = \sum_{j=1}^{m-l} b_j(y, z; t + \tau) \exp\left( -\int_t^{t+\tau} r_{tot}^y(t')\,dt' \right) \qquad (4.48a)$$
$$P(\tau) = r_{tot}^y(t + \tau) \exp\left( -\int_t^{t+\tau} r_{tot}^y(t')\,dt' \right) \qquad (4.48b)$$
Using Monte Carlo simulation, we obtain realizations of the desired joint probabil- ity function P (τ, µ) by randomly selecting τ and µ from the probability densities defined by equations (4.48b) and (4.47). Such a method is the equivalent of the direct method for hybrid systems. Given two random numbers p1 and p2 uniformly distributed on (0, 1), τ and µ are constrained accordingly:
$$\int_t^{t+\tau} r_{tot}^y(t')\,dt' + \log(p_1) = 0 \qquad (4.49a)$$
$$\sum_{k=1}^{\mu-1} b_k(y, z; t + \tau) < p_2\, r_{tot}^y(t + \tau) \le \sum_{k=1}^{\mu} b_k(y, z; t + \tau) \qquad (4.49b)$$
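The constraints (4.49) can be implemented directly. For the Langevin or deterministic approximations, the slow propensities are time-varying because the fast extents evolve between slow-reaction firings, so the integral in (4.49a) must be accumulated numerically alongside the fast subset. The following sketch is illustrative (the function names and the fixed Euler step size are assumptions, not from the thesis), using a deterministic fast subset as in equation (4.36):

```python
import math
import random

def hybrid_step(slow_props, fast_rhs, z0, t0=0.0, dt=1e-4):
    """Advance to the next slow reaction per constraints (4.49).

    slow_props(z, t) -- list of slow propensities b_j(y, z; t)
    fast_rhs(z, t)   -- dz/dt for the deterministic fast subset
    Returns (t_fire, mu, z) at the slow-reaction firing time.
    """
    p1, p2 = random.random(), random.random()
    target = -math.log(p1)            # integral constraint (4.49a)
    z, t, acc = list(z0), t0, 0.0
    while acc < target:
        b = slow_props(z, t)
        acc += sum(b) * dt            # accumulate the integral of r_tot^y
        dz = fast_rhs(z, t)           # Euler step for the fast extents
        z = [zi + di * dt for zi, di in zip(z, dz)]
        t += dt
    b = slow_props(z, t)
    r_tot = sum(b)
    cum, mu = 0.0, len(b) - 1
    for j, bj in enumerate(b):        # reaction selection per (4.49b)
        cum += bj
        if p2 * r_tot <= cum:
            mu = j
            break
    return t, mu, z
```

With constant propensities and a static fast subset, this reduces (up to the step size dt) to the familiar $\tau = -\log(p_1)/r_{tot}^y$; in practice one would replace the fixed-step Euler loop with an ODE solver with event detection.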
Simulating the different approximations requires slightly different algorithms, which we address next.

4.2.1 Simulating the Equilibrium Approximation
We first address the equilibrium approximation. For this case,
$$b_j(y, z; t) = \sum_z b_j(y, z) P_A(z|y; t) \quad \forall\, 1 \le j \le m - l \qquad (4.50)$$
Additionally, the quantities bj(y, z; t) are actually time invariant between slow reactions. Thus, the integral constraint (4.49a) reduces to the algebraic relation
$$\tau = -\frac{\log(p_1)}{r_{tot}^y(t)} \qquad (4.51)$$
Algorithm 3 Exact solution of the partitioned stochastic system for the equilibrium approximation.

Off-line. Partition the set x of m extents of reaction into fast and slow extents. Determine the partitioned stoichiometric matrices (the $(m - l) \times p$ matrix $\nu^y$ and the $l \times p$ matrix $\nu^z$) and the reaction propensity laws (the $a_k(y, z)$'s). Also, choose a strategy for solving the distribution $P_A(z|y)$ given by equation (4.30) for the fast reactions in the partitioned case.

Initialize. Set the time t equal to zero. Set the number of species n to $n_0$.
1. Solve for the distribution $P_A(z|y)$, denoting all possible combinations of z as $(z^{(0)}, \ldots, z^{(t)})$. Record the initial value of z as $z^{(i)}$.

2. For subset y, calculate

(a) the reaction propensities, $b_j(y, z; t) = \sum_z b_j(y, z) P_A(z|y) \;\forall\, j = 1, \ldots, m - l$, and

(b) the total reaction propensity, $r_{tot}^y = \sum_{j=1}^{m-l} b_j(y, z; t)$.
3. Select three random numbers p1, p2, and p3 from the uniform distribution (0, 1).
4. Choose $z^{(j)}$ from the distribution $P_A(z|y)$ such that
$$\sum_{k=1}^{j-1} P_A(z^{(k)}|y) < p_1 \le \sum_{k=1}^{j} P_A(z^{(k)}|y)$$
Set $\hat{\nu}^z = z^{(j)} - z^{(i)}$.

5. Let $\tau = -\log(p_2)/r_{tot}^y$. Choose j such that
$$\sum_{k=1}^{j-1} b_k(y, z) < p_3\, r_{tot}^y \le \sum_{k=1}^{j} b_k(y, z)$$
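The inner loop of Algorithm 3 (steps 2 through 5 above) can be sketched as follows. The enumeration of fast states, the helper names, and the return convention are illustrative assumptions; in particular, the fast-state distribution is passed in pre-solved, standing in for whatever strategy the off-line phase chose for equation (4.30).

```python
import math
import random

def equilibrium_iteration(z_states, pz, slow_props):
    """One pass through steps 2-5 of the equilibrium-approximation algorithm.

    z_states   -- enumerated fast-extent states z^(k)
    pz         -- conditional probabilities P_A(z^(k)|y), summing to 1
    slow_props -- function z -> list of slow propensities b_j(y, z)
    Returns (tau, mu, z_sampled).
    """
    p1, p2, p3 = (random.random() for _ in range(3))   # step 3
    # Step 2a: averaged slow propensities b_j = sum_z b_j(y,z) P_A(z|y)
    per_state = [slow_props(z) for z in z_states]
    n_slow = len(per_state[0])
    b_bar = [sum(pz[k] * per_state[k][j] for k in range(len(z_states)))
             for j in range(n_slow)]
    r_tot = sum(b_bar)                                 # step 2b
    # Step 4: sample a fast state z^(j) from P_A(z|y)
    cum, z_sampled = 0.0, z_states[-1]
    for zk, w in zip(z_states, pz):
        cum += w
        if p1 <= cum:
            z_sampled = zk
            break
    tau = -math.log(p2) / r_tot                        # step 5, eq. (4.51)
    cum, mu = 0.0, n_slow - 1
    for j, bj in enumerate(b_bar):                     # choose slow reaction
        cum += bj
        if p3 * r_tot <= cum:
            mu = j
            break
    return tau, mu, z_sampled
```

A full simulation would wrap this in a loop that updates the slow extents by firing reaction µ, advances the time by τ, applies the fast-state change $\hat{\nu}^z$, and re-solves $P_A(z|y)$ for the new slow state.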